zhougaofeng / magic_pdf
Commit c19bb36e, authored Dec 31, 2024 by zhougaofeng
Update pdf_parse_by_ocr.py
Parent: 23eff75f
Showing 1 changed file with 2 additions and 2 deletions
magic_pdf/pdf_parse_by_ocr.py (+2 -2)
...
@@ -3,7 +3,7 @@ from magic_pdf.data.dataset import PymuDocDataset
 from magic_pdf.pdf_parse_union_core_v2 import pdf_parse_union
-def parse_pdf_by_ocr(ocr_status,config_path,local_image_dir,pdf_bytes,
+def parse_pdf_by_ocr(config_path,local_image_dir,pdf_bytes,
                      model_list,
                      imageWriter,
                      start_page_id=0,
...
@@ -11,7 +11,7 @@ def parse_pdf_by_ocr(ocr_status,config_path,local_image_dir,pdf_bytes,
                      debug_mode=False,
                      ):
     dataset = PymuDocDataset(pdf_bytes)
-    return pdf_parse_union(ocr_status,config_path,local_image_dir,dataset,
+    return pdf_parse_union(config_path,local_image_dir,dataset,
                            model_list,
                            imageWriter,
                            SupportedPdfParseMethod.OCR,
...
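The change drops the leading `ocr_status` positional parameter from `parse_pdf_by_ocr`, so a caller that still passes it positionally would shift every later argument by one slot, possibly without raising an error. A minimal sketch of that effect follows. It uses a hypothetical stand-in, `parse_pdf_by_ocr_stub`, whose parameter list mirrors only the names visible in the diff. It is not the real `magic_pdf` function, and the argument values are placeholders.

```python
# Hypothetical stand-in: mirrors only the parameter names visible in the diff.
# The real function has further parameters (elided in the commit view) and
# delegates to pdf_parse_union; none of that is reproduced here.
def parse_pdf_by_ocr_stub(config_path, local_image_dir, pdf_bytes,
                          model_list, imageWriter,
                          start_page_id=0, debug_mode=False):
    """Return the bound arguments so the binding can be inspected."""
    return {
        "config_path": config_path,
        "local_image_dir": local_image_dir,
        "pdf_bytes": pdf_bytes,
        "model_list": model_list,
        "imageWriter": imageWriter,
        "start_page_id": start_page_id,
        "debug_mode": debug_mode,
    }


# Old-style positional call that still passes ocr_status (True) first.
# No exception is raised; each value simply lands one slot too late.
stale = parse_pdf_by_ocr_stub(True, "config.yaml", "/tmp/images",
                              b"%PDF-", [], None)

# Keyword call matching the new signature: the binding is explicit.
fresh = parse_pdf_by_ocr_stub(
    config_path="config.yaml",      # placeholder value
    local_image_dir="/tmp/images",  # placeholder value
    pdf_bytes=b"%PDF-",             # placeholder value
    model_list=[],
    imageWriter=None,
)

print(stale["config_path"], stale["start_page_id"])
print(fresh["config_path"], fresh["start_page_id"])
```

Calling by keyword is one way to make call sites fail loudly (with a `TypeError` for an unknown `ocr_status=` keyword) rather than silently mis-bind when a signature changes like this.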