wangsen / MinerU
Commit 2bdb5445
authored Mar 14, 2025 by JesseChen1031
update api path and documents
parent 102fe277
Changes: 1 changed file with 11 additions and 11 deletions
projects/web_api/app.py (+11 -11)
...
@@ -64,14 +64,13 @@ def init_writers(
     Initialize writers based on path type

     Args:
-        pdf_path: PDF file path (local path or S3 path)
-        pdf_file: Uploaded PDF file object
+        file_path: file path (local path or S3 path)
+        file: Uploaded file object
         output_path: Output directory path
         output_image_path: Image output directory path

     Returns:
-        Tuple[writer, image_writer, pdf_bytes]: Returns initialized writer tuple and PDF
-        file content
+        Tuple[writer, image_writer, file_bytes]: Returns initialized writer tuple and file content
     """
     file_extension: str = None
     if file_path:
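The `init_writers` docstring above says `file_path` may be a local path or an S3 path, and the function tracks a `file_extension`. A minimal sketch of how such a path might be classified follows. The helper name `classify_path` is hypothetical and not part of `app.py`, and the `s3://bucket/key` form is an assumption about how S3 paths are written:

```python
import os
from typing import Tuple
from urllib.parse import urlparse


def classify_path(file_path: str) -> Tuple[str, str]:
    """Return (storage_kind, file_extension) for a local or S3 path.

    Hypothetical helper, not taken from app.py. It assumes S3 paths
    use the ``s3://bucket/key`` form; anything else is treated as local.
    """
    parsed = urlparse(file_path)
    if parsed.scheme == "s3":
        # For s3://bucket/docs/report.pdf the object key is in parsed.path.
        extension = os.path.splitext(parsed.path)[1]
        return "s3", extension.lower()
    # No s3 scheme: treat the whole string as a local filesystem path.
    extension = os.path.splitext(file_path)[1]
    return "local", extension.lower()
```

The extension is lower-cased so that `report.PDF` and `report.pdf` are handled the same way by later dispatch logic.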
...
@@ -120,7 +119,8 @@ def process_file(
     Process PDF file content

     Args:
-        pdf_bytes: Binary content of PDF file
+        file_bytes: Binary content of file
+        file_extension: file extension
         parse_method: Parse method ('ocr', 'txt', 'auto')
         image_writer: Image writer
...
@@ -170,9 +170,9 @@ def encode_image(image_path: str) -> str:
 @app.post(
-    "/pdf_parse",
+    "/file_parse",
     tags=["projects"],
-    summary="Parse PDF files (supports local files and S3)",
+    summary="Parse files (supports local files and S3)",
 )
 async def file_parse(
     file: UploadFile = None,
...
@@ -190,10 +190,10 @@ async def file_parse(
     to the specified directory.

     Args:
-        pdf_file: The PDF file to be parsed. Must not be specified together with
-            `pdf_path`
-        pdf_path: The path to the PDF file to be parsed. Must not be specified together
-            with `pdf_file`
+        file: The PDF file to be parsed. Must not be specified together with
+            `file_path`
+        file_path: The path to the PDF file to be parsed. Must not be specified together
+            with `file`
         parse_method: Parsing method, can be auto, ocr, or txt. Default is auto. If
             results are not satisfactory, try ocr
         is_json_md_dump: Whether to write parsed data to .json and .md files. Default
...
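The docstring above states that `file` and `file_path` must not be specified together. A minimal sketch of that check, independent of the web framework, is shown below. The function name `check_exclusive_inputs` is hypothetical, and rejecting the case where neither argument is given is an assumption; the diff only documents the "not both" rule:

```python
from typing import Optional


def check_exclusive_inputs(file: Optional[object], file_path: Optional[str]) -> str:
    """Return which input was supplied, or raise ValueError.

    Hypothetical helper, not taken from app.py.
    """
    has_file = file is not None
    has_path = bool(file_path)
    if has_file and has_path:
        # Documented rule: the two inputs are mutually exclusive.
        raise ValueError("Specify either `file` or `file_path`, not both")
    if not has_file and not has_path:
        # Assumption: at least one input is required.
        raise ValueError("One of `file` or `file_path` is required")
    return "file" if has_file else "file_path"
```

In a route handler, a failed check would typically be translated into a 4xx response rather than left as an unhandled exception.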