wangsen / MinerU
Commit cb9c2e76 (Unverified)
Authored Apr 08, 2025 by Xiaomeng Zhao; committed by GitHub on Apr 08, 2025
Merge pull request #2154 from opendatalab/release-1.3.2
Release 1.3.2
Parents: 0ab29cdb, b3ac3ac1
Showing 3 changed files with 4 additions and 3 deletions:
README_zh-CN.md (+2, -1)
magic_pdf/model/batch_analyze.py (+1, -1)
magic_pdf/pdf_parse_union_core_v2.py (+1, -1)
README_zh-CN.md @ cb9c2e76
@@ -47,7 +47,8 @@
 </div>
 </div>
 # Changelog
-- 2025/04/08 1.3.2released, fixing some compatibility issues
+- 2025/04/08 1.3.2 released, fixing some compatibility issues
 - Support Python 3.13
 - Fix errors caused by `transformers 4.51.0`
 - Final adaptation for some outdated Linux systems (such as CentOS 7); continued support in later versions is no longer guaranteed, [installation instructions](https://github.com/opendatalab/MinerU/issues/1004)
magic_pdf/model/batch_analyze.py @ cb9c2e76
@@ -241,7 +241,7 @@ class BatchAnalyze:
 for index, layout_res_item in enumerate(need_ocr_lists_by_lang[lang]):
     ocr_text, ocr_score = ocr_res_list[index]
     layout_res_item['text'] = ocr_text
-    layout_res_item['score'] = float(round(ocr_score, 2))
+    layout_res_item['score'] = float(f"{ocr_score:.3f}")
 total_processed += len(img_crop_list)
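The commit does not state why the score expression changed. One visible effect is that stored scores now keep three decimal places instead of two. The following is a minimal sketch of that difference using plain Python floats; the helper names `score_before` and `score_after` and the sample value are hypothetical and do not appear in MinerU.

```python
def score_before(ocr_score: float) -> float:
    """Expression removed by this commit: round to two decimal places."""
    return float(round(ocr_score, 2))


def score_after(ocr_score: float) -> float:
    """Expression added by this commit: format to three decimals, then parse back."""
    return float(f"{ocr_score:.3f}")


sample = 0.98765  # hypothetical OCR recognition confidence
print(score_before(sample))  # 0.99
print(score_after(sample))   # 0.988
```

Because the new expression goes through a string, the result is always a built-in `float`, whatever numeric type the recognizer returned, provided that type supports the `.3f` format specifier.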
magic_pdf/pdf_parse_union_core_v2.py @ cb9c2e76
@@ -997,7 +997,7 @@ def pdf_parse_union(
 for index, span in enumerate(need_ocr_list):
     ocr_text, ocr_score = ocr_res_list[index]
     span['content'] = ocr_text
-    span['score'] = float(round(ocr_score, 2))
+    span['score'] = float(f"{ocr_score:.3f}")
 # rec_time = time.time() - rec_start
 # logger.info(f'ocr-dynamic-rec time: {round(rec_time, 2)}, total images processed: {len(img_crop_list)}')
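To see the patched loop in isolation, here is a rough, self-contained sketch. The data shapes are inferred only from the tuple unpacking visible in the hunk (each OCR result is a `(text, score)` pair, each span is a dict); the `bbox` key and the sample values are invented for illustration and are not taken from MinerU.

```python
from typing import Any

# Hypothetical stand-ins for what the parser would supply at this point.
need_ocr_list: list[dict[str, Any]] = [
    {"bbox": [0, 0, 10, 10]},
    {"bbox": [0, 12, 10, 22]},
]
ocr_res_list: list[tuple[str, float]] = [("Hello", 0.98765), ("world", 0.5)]

# Same loop body as the patched code: results are matched to spans by index.
for index, span in enumerate(need_ocr_list):
    ocr_text, ocr_score = ocr_res_list[index]
    span["content"] = ocr_text
    span["score"] = float(f"{ocr_score:.3f}")

print(need_ocr_list)
```

The loop assumes `ocr_res_list` has at least as many entries as `need_ocr_list`; a shorter result list would raise `IndexError`.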