wangsen / MinerU

Commit 128182f3, authored Jun 13, 2025 by myhloli

feat: update README_zh-CN with command line usage and model download instructions for MinerU

parent f8bf2c14
Showing 1 changed file with 51 additions and 1 deletion

README_zh-CN.md (+51, -1)

@@ -499,6 +499,14 @@ uv pip install -e .[all] -i https://mirrors.aliyun.com/pypi/simple
### Command line

The simplest way to run MinerU from the command line:

```commandline
mineru -p <input_path> -o <output_path>
```

Here `<input_path>` is a local PDF file or directory, and `<output_path>` is the output directory.

To see the other command-line options, run:

```commandline
mineru --help
```
...
@@ -515,7 +523,8 @@ Options:
the file type. txt: Use text extraction
method. ocr: Use OCR method for image-based
PDFs. Without method specified, 'auto' will
be used by default. Adapted only for the
case where the backend is set to "pipeline".
-b, --backend [pipeline|vlm-transformers|vlm-sglang-engine|vlm-sglang-client]
the backend for parsing pdf: pipeline: More
general. vlm-transformers: More general.
...
@@ -553,7 +562,48 @@ Options:
The source of the model repository. Default
is 'huggingface'.
--help Show this message and exit.
```
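MinerU now downloads models automatically. By default, the model files that are needed are downloaded the first time they are loaded at runtime, with huggingface as the model source. If your network cannot reach huggingface, you can switch to the modelscope source as follows:

```commandline
mineru -p <input_path> -o <output_path> --source modelscope
```

Or use an environment variable:

```bash
export MINERU_MODEL_SOURCE=modelscope
mineru -p <input_path> -o <output_path>
```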
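The `export` form stays in effect for every later command in the same shell session. As a minimal sketch, assuming a POSIX shell, the variable can instead be limited to a single invocation by prefixing the command. Here `show_source` is a hypothetical stand-in for `mineru`, so the snippet runs without MinerU installed; it only prints the value a child process would see, falling back to the documented default `huggingface`.

```bash
# Hypothetical stand-in for `mineru`: prints the model source a child process sees.
# 'huggingface' is the default stated in the help text when nothing is set.
show_source='printf "%s\n" "${MINERU_MODEL_SOURCE:-huggingface}"'

# Start from a clean state for the demonstration.
unset MINERU_MODEL_SOURCE

# Prefix assignment: the variable is visible to this one command only.
one_off=$(MINERU_MODEL_SOURCE=modelscope sh -c "$show_source")

# The surrounding shell was not changed, so the default applies again.
after=$(sh -c "$show_source")

echo "one-off: $one_off"
echo "after:   $after"
```

With the real tool, the equivalent one-off form would be `MINERU_MODEL_SOURCE=modelscope mineru -p <input_path> -o <output_path>`.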
To use local model files, first download the models to your machine with this command:

```commandline
$ mineru-models-download --help
Usage: mineru-models-download [OPTIONS]

  Download MinerU model files.

  Supports downloading pipeline or VLM models from ModelScope or HuggingFace.

Options:
  -s, --source [huggingface|modelscope]
                                  The source of the model repository.
  -m, --model_type [pipeline|vlm|all]
                                  The type of the model to download.
  --help                          Show this message and exit.
```

Or download the model files through the interactive command line:

```commandline
mineru-models-download
Please select the model download source: (huggingface, modelscope) [huggingface]:
Please select the model type to download: (pipeline, vlm, all) [all]:
```
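The `-s` and `-m` options listed in the help output can also be passed directly, which may be convenient in scripts. The sketch below only assembles and prints such a command by default. `DRY_RUN` is a hypothetical switch introduced for this example, the chosen values are illustrative, and whether supplying both options skips the interactive prompts is an assumption that the help text does not confirm.

```bash
# Option names come from the --help output; the values chosen here are illustrative.
SOURCE=modelscope      # or: huggingface
MODEL_TYPE=pipeline    # or: vlm, all
DRY_RUN=${DRY_RUN:-1}  # hypothetical switch: 1 prints the command, 0 runs it

# Assemble the download command from the two documented options.
cmd="mineru-models-download -s $SOURCE -m $MODEL_TYPE"

if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $cmd"
else
    # Unquoted on purpose so the string splits into command and arguments.
    $cmd
fi
```

Setting `DRY_RUN=0` before running the snippet would execute the command instead of printing it.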
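After the download completes, the local model path is automatically written to `mineru.json` in the user directory. The next time you run MinerU, you can parse directly with the local model files:

```commandline
mineru -p <input_path> -o <output_path> --source local
```

Or use an environment variable:

```bash
export MINERU_MODEL_SOURCE=local
mineru -p <input_path> -o <output_path>
```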
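Since `--source local` relies on the path recorded in `mineru.json`, a script may want to check for that file first. The following is a rough sketch rather than MinerU behavior: `pick_source` is a hypothetical helper, it assumes "user directory" means `$HOME`, and the fallback to `huggingface` is a choice made for this example based on the documented default.

```bash
# Hypothetical helper: choose a --source value depending on whether mineru.json exists.
# Assumption: the "user directory" is $HOME; pass another directory to override.
pick_source() {
    config_dir="${1:-$HOME}"
    if [ -f "$config_dir/mineru.json" ]; then
        echo local
    else
        echo huggingface   # documented default source
    fi
}

# Print the command that would be used; the placeholders are left as in the README.
echo "mineru -p <input_path> -o <output_path> --source $(pick_source)"
```

The check only confirms that the file exists; it does not validate the model path stored inside it.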
> [!TIP]
...