wangsen / MinerU
Commit 8f0cc148, authored Jun 11, 2025 by myhloli
refactor: update OCR model configurations to use v5 and enhance language handling
Parent: 4eaa85fd
Showing 3 changed files with 26 additions and 10 deletions (+26 -10)
mineru/model/ocr/paddleocr2pytorch/pytorch_paddle.py (+1 -1)
mineru/model/ocr/paddleocr2pytorch/pytorchocr/utils/resources/arch_config.yaml (+16 -0)
mineru/model/ocr/paddleocr2pytorch/pytorchocr/utils/resources/models_config.yml (+9 -9)
mineru/model/ocr/paddleocr2pytorch/pytorch_paddle.py
@@ -57,7 +57,7 @@ class PytorchPaddleOCR(TextSystem):
         self.lang = kwargs.get('lang', 'ch')
         device = get_device()
-        if device == 'cpu' and self.lang in ['ch', 'ch_server']:
+        if device == 'cpu' and self.lang in ['ch', 'ch_server', 'japan', 'chinese_cht']:
             logger.warning("The current device in use is CPU. To ensure the speed of parsing, the language is automatically switched to ch_lite.")
             self.lang = 'ch_lite'
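The CPU fallback in this hunk can be illustrated in isolation. The following is a minimal sketch, not the project's code: `resolve_lang` and `CPU_FALLBACK_LANGS` are hypothetical names introduced here, and the device is passed in as a plain string rather than obtained from `get_device()`.

```python
# Hypothetical standalone version of the check changed in this commit.
# After the change, 'japan' and 'chinese_cht' join 'ch' and 'ch_server'
# in the set of languages that are downgraded to 'ch_lite' on CPU.
CPU_FALLBACK_LANGS = {"ch", "ch_server", "japan", "chinese_cht"}


def resolve_lang(requested: str, device: str) -> str:
    """Return the language key to use for the given device."""
    if device == "cpu" and requested in CPU_FALLBACK_LANGS:
        # Mirrors the diff: switch to the lite model to keep parsing fast.
        return "ch_lite"
    return requested


if __name__ == "__main__":
    for lang in ("ch", "japan", "chinese_cht", "en"):
        print(lang, "->", resolve_lang(lang, "cpu"))
```

Under these assumptions, a request for `japan` on a non-CPU device is left unchanged, while the same request on CPU resolves to `ch_lite`.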
mineru/model/ocr/paddleocr2pytorch/pytorchocr/utils/resources/arch_config.yaml
@@ -120,6 +120,22 @@ ch_PP-OCRv5_det_infer:
     name: DBHead
     k: 50
+ch_PP-OCRv5_det_server_infer:
+  model_type: det
+  algorithm: DB
+  Transform: null
+  Backbone:
+    name: PPHGNetV2_B4
+    det: True
+  Neck:
+    name: LKPAN
+    out_channels: 256
+    intracl: True
+  Head:
+    name: PFHeadLocal
+    k: 50
+    mode: "large"
 ch_PP-OCRv4_det_server_infer:
   model_type: det
   algorithm: DB
mineru/model/ocr/paddleocr2pytorch/pytorchocr/utils/resources/models_config.yml
 lang:
   ch_lite:
-    det: ch_PP-OCRv3_det_infer.pth
+    det: ch_PP-OCRv5_det_infer.pth
     rec: ch_PP-OCRv5_rec_infer.pth
     dict: ppocrv5_dict.txt
   ch_lite_v4:
-    det: ch_PP-OCRv3_det_infer.pth
+    det: ch_PP-OCRv5_det_infer.pth
     rec: ch_PP-OCRv4_rec_infer.pth
     dict: ppocr_keys_v1.txt
   ch_server:
-    det: ch_PP-OCRv3_det_infer.pth
+    det: ch_PP-OCRv5_det_infer.pth
     rec: ch_PP-OCRv5_rec_server_infer.pth
     dict: ppocrv5_dict.txt
   ch_server_v4:
-    det: ch_PP-OCRv3_det_infer.pth
+    det: ch_PP-OCRv5_det_infer.pth
     rec: ch_PP-OCRv4_rec_server_infer.pth
     dict: ppocr_keys_v1.txt
   ch:
-    det: ch_PP-OCRv3_det_infer.pth
+    det: ch_PP-OCRv5_det_infer.pth
     rec: ch_PP-OCRv4_rec_server_doc_infer.pth
     dict: ppocrv4_doc_dict.txt
   en:
@@ -28,12 +28,12 @@ lang:
     rec: korean_PP-OCRv3_rec_infer.pth
     dict: korean_dict.txt
   japan:
-    det: Multilingual_PP-OCRv3_det_infer.pth
-    rec: japan_PP-OCRv3_rec_infer.pth
+    det: ch_PP-OCRv5_det_infer.pth
+    rec: ch_PP-OCRv5_rec_server_infer.pth
     dict: japan_dict.txt
   chinese_cht:
-    det: Multilingual_PP-OCRv3_det_infer.pth
-    rec: chinese_cht_PP-OCRv3_rec_infer.pth
+    det: ch_PP-OCRv5_det_infer.pth
+    rec: ch_PP-OCRv5_rec_server_infer.pth
     dict: chinese_cht_dict.txt
   ta:
     det: Multilingual_PP-OCRv3_det_infer.pth
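To make the effect of the `models_config.yml` changes easier to see, the sketch below mirrors a few of the post-commit entries as a plain Python dict and reports which files two languages share. It is illustrative only: `MODELS` and `shared_files` are hypothetical names, the dict is hand-copied from the diff rather than loaded from the YAML file, and nothing here reflects how MinerU itself reads the configuration.

```python
# Hand-copied subset of the `lang` mapping as it stands after this commit.
MODELS = {
    "ch_lite": {
        "det": "ch_PP-OCRv5_det_infer.pth",
        "rec": "ch_PP-OCRv5_rec_infer.pth",
        "dict": "ppocrv5_dict.txt",
    },
    "japan": {
        "det": "ch_PP-OCRv5_det_infer.pth",
        "rec": "ch_PP-OCRv5_rec_server_infer.pth",
        "dict": "japan_dict.txt",
    },
    "chinese_cht": {
        "det": "ch_PP-OCRv5_det_infer.pth",
        "rec": "ch_PP-OCRv5_rec_server_infer.pth",
        "dict": "chinese_cht_dict.txt",
    },
}


def shared_files(lang_a: str, lang_b: str) -> set[str]:
    """Return the config keys whose file names are identical for both languages."""
    a, b = MODELS[lang_a], MODELS[lang_b]
    return {key for key in a if a[key] == b.get(key)}


if __name__ == "__main__":
    print(sorted(shared_files("japan", "chinese_cht")))
    print(sorted(shared_files("japan", "ch_lite")))
```

In this subset, `japan` and `chinese_cht` point at the same detection and recognition weights and differ only in their `dict` entry, while all three languages use the same v5 detection file.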