<ahref="https://github.com/opendatalab/PDF-Extract-Kit">PDF-Extract-Kit: High-Quality PDF Extraction Toolkit</a>🔥🔥🔥
<br>
<br>
🚀<ahref="https://mineru.net/?source=github">Access MinerU Now→✅ Zero-Install Web Version ✅ Full-Featured Desktop Client ✅ Instant API Access; Skip deployment headaches – get all product formats in one click. Developers, dive in!</a>
🚀<ahref="https://mineru.net/?source=github">Access MinerU Now→✅ Zero-Install Web Version ✅ Full-Featured Desktop Client ✅ Instant API Access; Skip deployment headaches – get all product formats in one click. Developers, dive in!</a>
</p>
</p>
...
@@ -398,36 +395,6 @@
...
@@ -398,36 +395,6 @@
</details>
</details>
</details>
</details>
<!-- TABLE OF CONTENT -->
<detailsopen="open">
<summary><h2style="display: inline-block">Table of Contents</h2></summary>
> Linux and macOS systems automatically support CUDA/MPS acceleration after installation. For Windows users who want to use CUDA acceleration,
> please visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to install PyTorch with the appropriate CUDA version.
#### 1.3 Install Full Version (supports sglang acceleration) (requires device with Turing or newer architecture and at least 8GB GPU memory)
If you need to use **sglang to accelerate VLM model inference**, you can choose any of the following methods to install the full version:
- Install using uv or pip:
```bash
uv pip install-U"mineru[all]"
```
- Install from source:
```bash
uv pip install-e .[all]
```
> [!TIP]
> If any exceptions occur during the installation of `sglang`, please refer to the [official sglang documentation](https://docs.sglang.ai/start/install.html) for troubleshooting and solutions, or directly use Docker-based installation.
> For more information about output files, please refer to [Output File Documentation](docs/output_file_en_us.md)
> `mineru[core]` includes all core features except `sglang` acceleration, compatible with Windows / Linux / macOS systems, suitable for most users.
> If you need to use `sglang` acceleration for VLM model inference or install a lightweight client on edge devices, please refer to the documentation [Extension Modules Installation Guide](https://opendatalab.github.io/MinerU/quick_start/extension_modules/).
---
---
### 3. API Calls or Visual Invocation
#### Deploy MinerU using Docker
MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues.
1. Directly invoke using Python API: [Python Invocation Example](demo/demo.py)
You can get the [Docker Deployment Instructions](https://opendatalab.github.io/MinerU/quick_start/docker_deployment/) in the documentation.
2. Invoke using FastAPI:
```bash
mineru-api --host 127.0.0.1 --port 8000
```
Visit http://127.0.0.1:8000/docs in your browser to view the API documentation.
3. Use Gradio WebUI or Gradio API:
```bash
# Using pipeline/vlm-transformers/vlm-sglang-client backend
Access http://127.0.0.1:7860 in your browser to use the Gradio WebUI, or visit http://127.0.0.1:7860/?view=api to use the Gradio API.
> [!TIP]
> Below are some suggestions and notes for using the sglang acceleration mode:
> - The sglang acceleration mode currently supports operation on Turing architecture GPUs with a minimum of 8GB VRAM, but you may encounter VRAM shortages on GPUs with less than 24GB VRAM. You can optimize VRAM usage with the following parameters:
> - If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by setting `--mem-fraction-static 0.5`. If VRAM issues persist, try lowering it further to `0.4` or below.
> - If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode: `--tp-size 2`
> - If you are already successfully using sglang to accelerate VLM inference but wish to further improve inference speed, consider the following parameters:
> - If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode: `--dp-size 2`
> - You can also enable `torch.compile` to accelerate inference speed by about 15%: `--enable-torch-compile`
> - For more information on using sglang parameters, please refer to the [sglang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
> - All sglang-supported parameters can be passed to MinerU via command-line arguments, including those used with the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
> [!TIP]
> - In any case, you can specify visible GPU devices at the start of a command line by adding the `CUDA_VISIBLE_DEVICES` environment variable. For example:
> - This method works for all command-line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.
> - Below are some common `CUDA_VISIBLE_DEVICES` settings:
> ```bash
> CUDA_VISIBLE_DEVICES=1 Only device 1 will be seen
> CUDA_VISIBLE_DEVICES=0,1 Devices 0 and 1 will be visible
> CUDA_VISIBLE_DEVICES="0,1" Same as above, quotation marks are optional
> CUDA_VISIBLE_DEVICES=0,2,3 Devices 0, 2, 3 will be visible; device 1 is masked
> CUDA_VISIBLE_DEVICES="" No GPU will be visible
> ```
> - Below are some possible use cases:
> - If you have multiple GPUs and need to specify GPU 0 and GPU 1 to launch 'sglang-server' in multi-GPU mode, you can use the following command:
> - If you have multiple GPUs and need to launch two `fastapi` services on GPU 0 and GPU 1 respectively, listening on different ports, you can use the following commands:
### 4. Extending MinerU Functionality Through Configuration Files
### Using MinerU
- MinerU is designed to work out-of-the-box, but also supports extending functionality through configuration files. You can create a `mineru.json` file in your home directory and add custom configurations.
- The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`. Alternatively, you can create it by copying the [configuration template file](./mineru.template.json) to your home directory and renaming it to `mineru.json`.
- Below are some available configuration options:
-`latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to the `$` symbol, and can be modified to other symbols or strings as needed.
-`llm-aided-config`: Used to configure related parameters for LLM-assisted heading level detection, compatible with all LLM models supporting the `OpenAI protocol`. It defaults to Alibaba Cloud Qwen's `qwen2.5-32b-instruct` model. You need to configure an API key yourself and set `enable` to `true` to activate this feature.
-`models-dir`: Used to specify local model storage directories. Please specify separate model directories for the `pipeline` and `vlm` backends. After specifying these directories, you can use local models by setting the environment variable `export MINERU_MODEL_SOURCE=local`.
---
You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the [Usage Guide](https://opendatalab.github.io/MinerU/usage/).
- If you encounter any issues during usage, you can first check the [FAQ](docs/FAQ_en_us.md) for solutions.
- If you encounter any issues during usage, you can first check the [FAQ](https://opendatalab.github.io/MinerU/FAQ/) for solutions.
- If your issue remains unresolved, you may also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to interact with an AI assistant, which can address most common problems.
- If your issue remains unresolved, you may also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to interact with an AI assistant, which can address most common problems.
- If you still cannot resolve the issue, you are welcome to join our community via [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to discuss with other users and developers.
- If you still cannot resolve the issue, you are welcome to join our community via [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to discuss with other users and developers.
...
@@ -872,11 +609,11 @@ Currently, some models in this project are trained based on YOLO. However, since
...
@@ -872,11 +609,11 @@ Currently, some models in this project are trained based on YOLO. However, since
# Links
# Links
-[Easy Data Preparation with latest LLMs-based Operators and Pipelines](https://github.com/OpenDCAI/DataFlow)
-[Vis3 (OSS browser based on s3)](https://github.com/opendatalab/Vis3)
-[LabelU (A Lightweight Multi-modal Data Annotation Tool)](https://github.com/opendatalab/labelU)
-[LabelU (A Lightweight Multi-modal Data Annotation Tool)](https://github.com/opendatalab/labelU)
@@ -4,39 +4,39 @@ If your question is not listed, you can also use [DeepWiki](https://deepwiki.com
...
@@ -4,39 +4,39 @@ If your question is not listed, you can also use [DeepWiki](https://deepwiki.com
If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
## 1. Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2
??? question "Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2"
The `libgl` library is missing in Ubuntu 22.04 on WSL2. You can install the `libgl` library with the following command to resolve the issue:
The `libgl` library is missing in Ubuntu 22.04 on WSL2. You can install the `libgl` library with the following command to resolve the issue:
## 2. Error when installing MinerU on CentOS 7 or Ubuntu 18: `ERROR: Failed building wheel for simsimd`
??? question "Error when installing MinerU on CentOS 7 or Ubuntu 18: `ERROR: Failed building wheel for simsimd`"
The new version of albumentations (1.4.21) introduces a dependency on simsimd. Since the pre-built package of simsimd for Linux requires a glibc version greater than or equal to 2.28, this causes installation issues on some Linux distributions released before 2019. You can resolve this issue by using the following command:
The new version of albumentations (1.4.21) introduces a dependency on simsimd. Since the pre-built package of simsimd for Linux requires a glibc version greater than or equal to 2.28, this causes installation issues on some Linux distributions released before 2019. You can resolve this issue by using the following command:
## 3. Missing text information in parsing results when installing and using on Linux systems.
??? question "Missing text information in parsing results when installing and using on Linux systems."
MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, due to missing CJK fonts, some text may be lost during the process of rendering PDFs to images.
MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, due to missing CJK fonts, some text may be lost during the process of rendering PDFs to images.
To solve this problem, you can install the noto font package with the following commands, which are effective on Ubuntu/Debian systems:
To solve this problem, you can install the noto font package with the following commands, which are effective on Ubuntu/Debian systems:
```bash
```bash
sudo apt update
sudo apt update
sudo apt install fonts-noto-core
sudo apt install fonts-noto-core
sudo apt install fonts-noto-cjk
sudo apt install fonts-noto-cjk
fc-cache -fv
fc-cache -fv
```
```
You can also directly use our [Docker deployment](../quick_start/docker_deployment.md) method to build the image, which includes the above font packages by default.
You can also directly use our [Docker deployment](../quick_start/docker_deployment.md) method to build the image, which includes the above font packages by default.