Unverified commit 5685b22b, authored by Sidney233, committed by GitHub

Merge branch 'opendatalab:dev' into dev

parents 77c6b669 7e34fe70
# Documentation:
# https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands
services: services:
mineru-sglang: mineru-sglang-server:
image: mineru-sglang:latest image: mineru-sglang:latest
container_name: mineru-sglang container_name: mineru-sglang-server
restart: always restart: always
profiles: ["sglang-server"]
ports: ports:
- 30000:30000 - 30000:30000
environment: environment:
...@@ -30,3 +29,66 @@ services: ...@@ -30,3 +29,66 @@ services:
- driver: nvidia - driver: nvidia
device_ids: ["0"] device_ids: ["0"]
capabilities: [gpu] capabilities: [gpu]
  mineru-api:
    image: mineru-sglang:latest
    container_name: mineru-api
    restart: always
    profiles: ["api"]
    ports:
      - 8000:8000
    environment:
      MINERU_MODEL_SOURCE: local
    entrypoint: mineru-api
    command:
      --host 0.0.0.0
      --port 8000
      # parameters for sglang-engine
      # --enable-torch-compile  # You can also enable torch.compile to accelerate inference speed by approximately 15%
      # --dp-size 2  # If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode
      # --tp-size 2  # If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode.
      # --mem-fraction-static 0.5  # If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by this parameter, if VRAM issues persist, try lowering it further to `0.4` or below.
    ulimits:
      memlock: -1
      stack: 67108864
    ipc: host
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: [ "0" ]
              capabilities: [ gpu ]
  mineru-gradio:
    image: mineru-sglang:latest
    container_name: mineru-gradio
    restart: always
    profiles: ["gradio"]
    ports:
      - 7860:7860
    environment:
      MINERU_MODEL_SOURCE: local
    entrypoint: mineru-gradio
    command:
      --server-name 0.0.0.0
      --server-port 7860
      --enable-sglang-engine true  # Enable the sglang engine for Gradio
      # --enable-api false  # If you want to disable the API, set this to false
      # --max-convert-pages 20  # If you want to limit the number of pages for conversion, set this to a specific number
      # parameters for sglang-engine
      # --enable-torch-compile  # You can also enable torch.compile to accelerate inference speed by approximately 15%
      # --dp-size 2  # If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode
      # --tp-size 2  # If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode.
      # --mem-fraction-static 0.5  # If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by this parameter, if VRAM issues persist, try lowering it further to `0.4` or below.
    ulimits:
      memlock: -1
      stack: 67108864
    ipc: host
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: [ "0" ]
              capabilities: [ gpu ]
# Frequently Asked Questions
If your question is not listed, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to communicate with the AI assistant, which can solve most common problems.
If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
## 1. Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2
The `libgl` library is missing in Ubuntu 22.04 on WSL2. You can install the `libgl` library with the following command to resolve the issue:
...@@ -21,3 +25,18 @@ pip install -U "mineru[pipeline_old_linux]" ...@@ -21,3 +25,18 @@ pip install -U "mineru[pipeline_old_linux]"
```
Reference: https://github.com/opendatalab/MinerU/issues/1004
## 3. Missing text information in parsing results when installing and using on Linux systems.
MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, due to missing CJK fonts, some text may be lost during the process of rendering PDFs to images.
To solve this problem, you can install the noto font package with the following commands, which are effective on Ubuntu/Debian systems:
```bash
sudo apt update
sudo apt install fonts-noto-core
sudo apt install fonts-noto-cjk
fc-cache -fv
```
You can also directly use our [Docker deployment](../quick_start/docker_deployment.md) method to build the image, which includes the above font packages by default.
Reference: https://github.com/opendatalab/MinerU/issues/2915
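To tell whether a machine is likely affected before reinstalling anything, one rough check is to ask fontconfig whether it knows any font that covers Chinese. The sketch below assumes `fc-list` (from the fontconfig package) is available; `has_cjk_fonts` is a hypothetical helper name, not part of MinerU, and Chinese coverage is used only as a proxy for CJK support in general.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of MinerU): reports whether fontconfig
# knows at least one font that declares Chinese coverage.
# Returns 0 if found, 1 if none, 2 if fontconfig is not installed.
has_cjk_fonts() {
    command -v fc-list >/dev/null 2>&1 || return 2
    # ":lang=zh" is a fontconfig pattern selecting fonts that cover Chinese.
    [ -n "$(fc-list :lang=zh 2>/dev/null)" ]
}

if has_cjk_fonts; then
    echo "CJK-capable fonts found"
else
    echo "No CJK fonts detected (or fontconfig is missing); consider installing fonts-noto-cjk"
fi
```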
<div align="center" xmlns="http://www.w3.org/1999/html">
<!-- logo -->
<p align="center">
<img src="../images/MinerU-logo.png" width="300px" style="vertical-align:middle;">
</p>
</div>
<!-- icon -->
[![stars](https://img.shields.io/github/stars/opendatalab/MinerU.svg)](https://github.com/opendatalab/MinerU)
...@@ -15,22 +15,19 @@
[![Downloads](https://static.pepy.tech/badge/mineru)](https://pepy.tech/project/mineru)
[![Downloads](https://static.pepy.tech/badge/mineru/month)](https://pepy.tech/project/mineru)
[![OpenDataLab](https://img.shields.io/badge/Demo_on_OpenDataLab-blue?logo=&labelColor=white)](https://mineru.net/OpenSourceTools/Extractor?source=github)
[![ModelScope](https://img.shields.io/badge/Demo_on_ModelScope-purple?logo=&labelColor=white)](https://www.modelscope.cn/studios/OpenDataLab/MinerU)
[![HuggingFace](https://img.shields.io/badge/Demo_on_HuggingFace-yellow.svg?logo=&labelColor=white)](https://huggingface.co/spaces/opendatalab/MinerU)
[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/gist/myhloli/3b3a00a4a0a61577b6c30f989092d20d/mineru_demo.ipynb)
[![arXiv](https://img.shields.io/badge/arXiv-2409.18839-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2409.18839)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/opendatalab/MinerU)
<div align="center" xmlns="http://www.w3.org/1999/html">
<a href="https://trendshift.io/repositories/11174" target="_blank"><img src="https://trendshift.io/api/badge/repositories/11174" alt="opendatalab%2FMinerU | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
<!-- hot link -->
<p align="center">
🚀<a href="https://mineru.net/?source=github">MinerU Official Website→✅ Zero-Install Online Version ✅ Full-Featured Client ✅ Developer API Online Access, skip deployment hassles, get all product formats with one click, go fast!</a>
</p>
<!-- join us -->
...@@ -38,28 +35,34 @@
<p align="center">
👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="http://mineru.space/s/V85Yl" target="_blank">WeChat</a>
</p>
</div>
## Project Introduction
MinerU is a tool that converts PDFs into machine-readable formats (e.g., markdown, JSON), allowing for easy extraction into any format.
MinerU was born during the pre-training process of [InternLM](https://github.com/InternLM/InternLM). We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models.
Compared to well-known commercial products domestically and internationally, MinerU is still young. If you encounter any issues or if the results are not as expected, please submit an issue on [GitHub Issues](https://github.com/opendatalab/MinerU/issues) and **attach the relevant PDF**.
![type:video](https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c)
## Key Features
- Remove headers, footers, footnotes, page numbers and other elements to ensure semantic coherence
- Output text in human reading order, suitable for single-column, multi-column and complex layouts
- Retain the original document structure, including titles, paragraphs, lists, etc.
- Extract images, image descriptions, tables, table titles and footnotes
- Automatically identify and convert formulas in documents to LaTeX format
- Automatically identify and convert tables in documents to HTML format
- Automatically detect scanned PDFs and garbled PDFs, and enable OCR functionality
- OCR supports detection and recognition of 84 languages
- Support multiple output formats, such as multimodal and NLP Markdown, reading-order-sorted JSON, and information-rich intermediate formats
- Support multiple visualization results, including layout visualization, span visualization, etc., for efficient confirmation of output effects and quality inspection
- Support pure CPU environment operation, and support GPU(CUDA)/NPU(CANN)/MPS acceleration
- Compatible with Windows, Linux and Mac platforms
\ No newline at end of file
## User Guide
- [Quick Start Guide](./quick_start/index.md)
- [Detailed Usage Instructions](./usage/index.md)
# Known Issues
- Reading order is determined by the model based on the spatial distribution of readable content, and may be out of order in some areas under extremely complex layouts.
- Limited support for vertical text.
- Tables of contents and lists are recognized through rules, and some uncommon list formats may not be recognized.
- Code blocks are not yet supported in the layout model.
- Comic books, art albums, primary school textbooks, and exercises cannot be parsed well.
- Table recognition may result in row/column recognition errors in complex tables.
- OCR recognition may produce inaccurate characters in PDFs of lesser-known languages (e.g., diacritical marks in Latin script, easily confused characters in Arabic script).
- Some formulas may not render correctly in Markdown.
# Deploying MinerU with Docker
MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues.
## Build Docker Image using Dockerfile:
```bash
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/Dockerfile
docker build -t mineru-sglang:latest -f Dockerfile .
```
> [!TIP]
> The [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/china/Dockerfile) uses `lmsysorg/sglang:v0.4.8.post1-cu126` as the base image by default, supporting Turing/Ampere/Ada Lovelace/Hopper platforms.
> If you are using the newer `Blackwell` platform, please modify the base image to `lmsysorg/sglang:v0.4.8.post1-cu128-b200` before executing the build operation.
## Docker Description
MinerU's Docker image uses `lmsysorg/sglang` as the base image, so it includes the `sglang` inference acceleration framework and the necessary dependencies by default. On compatible devices, you can therefore use `sglang` directly to accelerate VLM model inference.
> [!NOTE]
> Requirements for using `sglang` to accelerate VLM model inference:
> - Device must have Turing architecture or later graphics cards with 8GB+ available VRAM.
> - The host machine's graphics driver should support CUDA 12.6 or higher; `Blackwell` platform should support CUDA 12.8 or higher. You can check the driver version using the `nvidia-smi` command.
> - Docker container must have access to the host machine's graphics devices.
>
> If your device doesn't meet the above requirements, you can still use other features of MinerU, but cannot use `sglang` to accelerate VLM model inference, meaning you cannot use the `vlm-sglang-engine` backend or start the `vlm-sglang-server` service.
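A rough way to check the VRAM part of these requirements is to query `nvidia-smi` and compare each card's memory against the 8GB guideline. The sketch below is only an approximation: it reads total rather than currently available memory, it does not check GPU architecture or the supported CUDA version, and `vram_at_least_gib` is a hypothetical helper name introduced for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a memory size in MiB is at least N GiB.
vram_at_least_gib() {
    local mib="$1" gib="$2"
    [ "$mib" -ge $(( gib * 1024 )) ]
}

# Query every GPU, but only if nvidia-smi exists on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total,driver_version \
               --format=csv,noheader,nounits |
    while IFS=, read -r name mem driver; do
        mem="${mem// /}"   # strip the padding nvidia-smi prints after commas
        if vram_at_least_gib "$mem" 8; then
            echo "OK:  $name (${mem} MiB, driver ${driver# })"
        else
            echo "LOW: $name (${mem} MiB) is below the 8GB guideline"
        fi
    done
fi
```

Some cards marketed as 8GB may report slightly less than 8192 MiB, so treat a `LOW` result near the threshold as a prompt to look closer rather than a verdict.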
## Start Docker Container:
```bash
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 -p 7860:7860 -p 8000:8000 \
--ipc=host \
-it mineru-sglang:latest \
/bin/bash
```
After executing this command, you will enter the Docker container's interactive terminal with some ports mapped for potential services. You can directly run MinerU-related commands within the container to use MinerU's features.
You can also directly start MinerU services by replacing `/bin/bash` with service startup commands. For detailed instructions, please refer to the [MinerU Usage Documentation](../usage/index.md).
## Start Services Directly with Docker Compose
We provide a `compose.yaml` file that you can use to quickly start MinerU services.
```bash
# Download compose.yaml file
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
```
- Start `sglang-server` service and connect to `sglang-server` via `vlm-sglang-client` backend:
```bash
docker compose -f compose.yaml --profile sglang-server up -d
# In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<server_ip>:30000
```
- Start API service:
```bash
docker compose -f compose.yaml --profile api up -d
```
Access `http://<server_ip>:8000/docs` in your browser to view the API documentation.
- Start Gradio WebUI service:
```bash
docker compose -f compose.yaml --profile gradio up -d
```
Access `http://<server_ip>:7860` in your browser to use the Gradio WebUI or access `http://<server_ip>:7860/?view=api` to use the Gradio API.
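Compose selects services by the names listed under each service's `profiles:` key, which in the `compose.yaml` shown in this commit are `sglang-server`, `api`, and `gradio`. The sketch below is a hypothetical helper (`compose_up_cmd` is not part of MinerU) that validates a profile name against that list and prints the matching command instead of running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the compose command for one MinerU profile.
# The accepted names mirror the `profiles:` entries in compose.yaml.
compose_up_cmd() {
    case "$1" in
        sglang-server|api|gradio)
            echo "docker compose -f compose.yaml --profile $1 up -d" ;;
        *)
            echo "unknown profile: $1" >&2
            return 1 ;;
    esac
}

compose_up_cmd api   # prints: docker compose -f compose.yaml --profile api up -d
```

If in doubt about which profiles a given file defines, `docker compose -f compose.yaml config --profiles` lists them one per line.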
# MinerU Extension Modules Installation Guide
MinerU supports installing extension modules on demand based on different needs to enhance functionality or support specific model backends.
## Common Scenarios
### Core Functionality Installation
The `core` module is the core dependency of MinerU, containing all functional modules except `sglang`. Installing this module ensures the basic functionality of MinerU works properly.
```bash
uv pip install "mineru[core]"
```
---
### Using `sglang` to Accelerate VLM Model Inference
The `sglang` module provides acceleration support for VLM model inference, suitable for graphics cards with Turing architecture and later (8GB+ VRAM). Installing this module can significantly improve model inference speed.
In the configuration, `all` includes both `core` and `sglang` modules, so `mineru[all]` and `mineru[core,sglang]` are equivalent.
```bash
uv pip install "mineru[all]"
```
> [!TIP]
> If exceptions occur during installation of the complete package including sglang, please refer to the [sglang official documentation](https://docs.sglang.ai/start/install.html) to try to resolve the issue, or directly use the [Docker](./docker_deployment.md) deployment method.
---
### Installing Lightweight Client to Connect to sglang-server
If you need to install a lightweight client on edge devices to connect to `sglang-server`, you can install the basic mineru package, which is very lightweight and suitable for devices with only CPU and network connectivity.
```bash
uv pip install mineru
```
---
### Using Pipeline Backend on Outdated Linux Systems
If your system is too old to satisfy the dependencies of `mineru[core]`, this option provides the minimum needed to run MinerU. It is intended for old systems that cannot be upgraded and only need the pipeline backend.
```bash
uv pip install "mineru[pipeline_old_linux]"
```
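The four scenarios above reduce to a choice of package specifier. As a summary, the sketch below maps a scenario label to the requirement string used in this guide; the function name and the scenario labels (`core`, `sglang`, `client`, `old-linux`) are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a deployment scenario to the pip requirement
# named in this guide. Scenario labels are made up for this example.
mineru_requirement() {
    case "$1" in
        core)      echo "mineru[core]" ;;
        sglang)    echo "mineru[all]" ;;                 # core + sglang
        client)    echo "mineru" ;;                      # lightweight client only
        old-linux) echo "mineru[pipeline_old_linux]" ;;
        *)         echo "unknown scenario: $1" >&2; return 1 ;;
    esac
}

# Dry run: show the install command for the sglang scenario.
echo uv pip install "$(mineru_requirement sglang)"
```

When running the printed command by hand, quote the requirement (for example `"mineru[all]"`) so that shells such as zsh do not try to expand the square brackets.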
# Quick Start
If you encounter any installation issues, please check the [FAQ](../FAQ/index.md) first.
## Online Experience
- Official online demo: the official online version has the same functionality as the client, with a polished interface and rich features; login is required
- [![OpenDataLab](https://img.shields.io/badge/Demo_on_OpenDataLab-blue?logo=&labelColor=white)](https://mineru.net/OpenSourceTools/Extractor?source=github)
- Gradio-based online demo: a WebUI built on Gradio, with a simple interface and only the core parsing functionality; no login required
- [![ModelScope](https://img.shields.io/badge/Demo_on_ModelScope-purple?logo=&labelColor=white)](https://www.modelscope.cn/studios/OpenDataLab/MinerU)
- [![HuggingFace](https://img.shields.io/badge/Demo_on_HuggingFace-yellow.svg?logo=&labelColor=white)](https://huggingface.co/spaces/opendatalab/MinerU)
## Local Deployment
> [!WARNING]
> **Prerequisites - Hardware and Software Environment Support**
>
> To ensure the stability and reliability of the project, we have optimized and tested only specific hardware and software environments during development. This ensures that users can achieve optimal performance and encounter the fewest compatibility issues when deploying and running the project on recommended system configurations.
>
> By concentrating our resources and efforts on mainstream environments, our team can resolve potential bugs more efficiently and develop new features in a timely manner.
>
> In non-mainstream environments, due to the diversity of hardware and software configurations, as well as compatibility issues with third-party dependencies, we cannot guarantee 100% usability of the project. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first, as most issues have corresponding solutions in the FAQ. Additionally, we encourage community feedback on issues so that we can gradually expand our support range.
<table>
<tr>
...@@ -30,9 +31,9 @@ There are three different ways to experience MinerU:
</tr>
<tr>
<td>Operating System</td>
<td>Linux / Windows / macOS</td>
<td>Linux / Windows</td>
<td>Linux / Windows (via WSL2)</td>
</tr>
<tr>
<td>CPU Inference Support</td>
...@@ -41,12 +42,12 @@ There are three different ways to experience MinerU:
</tr>
<tr>
<td>GPU Requirements</td>
<td>Turing architecture and later, 6GB+ VRAM or Apple Silicon</td>
<td colspan="2">Turing architecture and later, 8GB+ VRAM</td>
</tr>
<tr>
<td>Memory Requirements</td>
<td colspan="3">Minimum 16GB+, recommended 32GB+</td>
</tr>
<tr>
<td>Disk Space Requirements</td>
...@@ -56,4 +57,32 @@ There are three different ways to experience MinerU:
<td>Python Version</td>
<td colspan="3">3.10-3.13</td>
</tr>
</table>
\ No newline at end of file
### Install MinerU
#### Install MinerU using pip or uv
```bash
pip install --upgrade pip
pip install uv
uv pip install -U "mineru[core]"
```
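Before installing, it can help to confirm that the interpreter falls within the 3.10-3.13 range listed in the requirements table. The following is a minimal sketch under that assumption; `python_version_supported` is a hypothetical helper, and the check looks only at the major and minor version numbers.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a "major.minor[.patch]" version string
# falls within the 3.10-3.13 range from the requirements table.
python_version_supported() {
    local major="${1%%.*}" rest="${1#*.}"
    local minor="${rest%%.*}"
    [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -le 13 ]
}

# Check the interpreter on PATH, if there is one.
if command -v python3 >/dev/null 2>&1; then
    current="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    if python_version_supported "$current"; then
        echo "Python $current is in the supported range"
    else
        echo "Python $current is outside 3.10-3.13"
    fi
fi
```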
#### Install MinerU from source code
```bash
git clone https://github.com/opendatalab/MinerU.git
cd MinerU
uv pip install -e ".[core]"
```
> [!TIP]
> `mineru[core]` includes all core features except `sglang` acceleration, compatible with Windows / Linux / macOS systems, suitable for most users.
> If you need to use `sglang` acceleration for VLM model inference or install a lightweight client on edge devices, please refer to the documentation [Extension Modules Installation Guide](./extension_modules.md).
---
#### Deploy MinerU using Docker
MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues.
You can get the [Docker Deployment Instructions](./docker_deployment.md) in the documentation.
---
# Local Deployment
## Install MinerU
### Install via pip or uv
```bash
pip install --upgrade pip
pip install uv
uv pip install -U "mineru[core]"
```
### Install from source
```bash
git clone https://github.com/opendatalab/MinerU.git
cd MinerU
uv pip install -e ".[core]"
```
> [!NOTE]
> Linux and macOS systems automatically support CUDA/MPS acceleration after installation. For Windows users who want to use CUDA acceleration,
> please visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to install PyTorch with the appropriate CUDA version.
### Install Full Version (supports sglang acceleration) (requires device with Turing or newer architecture and at least 8GB GPU memory)
If you need to use **sglang to accelerate VLM model inference**, you can choose any of the following methods to install the full version:
- Install using uv or pip:
```bash
uv pip install -U "mineru[all]"
```
- Install from source:
```bash
uv pip install -e ".[all]"
```
> [!TIP]
> If any exceptions occur during the installation of `sglang`, please refer to the [official sglang documentation](https://docs.sglang.ai/start/install.html) for troubleshooting and solutions, or directly use Docker-based installation.
- Build image using Dockerfile:
```bash
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/global/Dockerfile
docker build -t mineru-sglang:latest -f Dockerfile .
```
Start Docker container:
```bash
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
--ipc=host \
mineru-sglang:latest \
mineru-sglang-server --host 0.0.0.0 --port 30000
```
Or start using Docker Compose:
```bash
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
docker compose -f compose.yaml up -d
```
> [!TIP]
> The Dockerfile uses `lmsysorg/sglang:v0.4.8.post1-cu126` as the default base image, which supports the Turing/Ampere/Ada Lovelace/Hopper platforms.
> If you are using the newer Blackwell platform, please change the base image to `lmsysorg/sglang:v0.4.8.post1-cu128-b200`.
### Install client (for connecting to sglang-server on edge devices that require only CPU and network connectivity)
```bash
uv pip install -U mineru
mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<host_ip>:<port>
```
---
\ No newline at end of file
# Online Demo
[![OpenDataLab](https://img.shields.io/badge/Demo_on_OpenDataLab-blue?logo=&labelColor=white)](https://mineru.net/OpenSourceTools/Extractor?source=github)
[![HuggingFace](https://img.shields.io/badge/Demo_on_HuggingFace-yellow.svg?logo=&labelColor=white)](https://huggingface.co/spaces/opendatalab/MinerU)
[![ModelScope](https://img.shields.io/badge/Demo_on_ModelScope-purple?logo=&labelColor=white)](https://www.modelscope.cn/studios/OpenDataLab/MinerU)
\ No newline at end of file
# TODO
- [x] Reading order based on the model
- [x] Recognition of `index` and `list` in the main text
- [x] Table recognition
- [x] Heading Classification
- [ ] Code block recognition in the main text
- [ ] [Chemical formula recognition](../chemical_knowledge_introduction/introduction.pdf)
- [ ] Geometric shape recognition
\ No newline at end of file
# Advanced Command Line Parameters
## SGLang Acceleration Parameter Optimization
### Memory Optimization Parameters
> [!TIP]
> SGLang acceleration mode currently runs on Turing-architecture or later graphics cards with at least 8GB of VRAM, but cards with less than 24GB of VRAM may run out of memory. You can reduce memory usage with the following parameters:
> - If a single graphics card runs out of VRAM, reduce the KV cache size with `--mem-fraction-static 0.5`. If the problem persists, try `0.4` or lower.
> - If you have two or more graphics cards, you can use tensor parallelism (TP) mode to expand the available VRAM: `--tp-size 2`
### Performance Optimization Parameters
> [!TIP]
> If you can already use SGLang normally for accelerated VLM model inference but still want to further improve inference speed, you can try the following parameters:
> - If you have multiple graphics cards, you can use SGLang's multi-card parallel mode to increase throughput: `--dp-size 2`
> - You can also enable `torch.compile` to accelerate inference speed by approximately 15%: `--enable-torch-compile`
### Parameter Passing Instructions
> [!TIP]
> - If you want to learn more about `sglang` parameter usage, please refer to the [SGLang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
> - All officially supported SGLang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
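The rules of thumb above can be condensed into a small decision helper. This is only a sketch: `suggest_sglang_flags` is a hypothetical function, not a MinerU tool, and setting `--dp-size` to the GPU count is an assumption that generalizes the `--dp-size 2` example; when VRAM rather than throughput is the bottleneck, `--tp-size` would be the flag to use instead.

```bash
#!/usr/bin/env bash
# Hypothetical helper: suggest SGLang flags from the GPU count and the
# per-GPU VRAM in GB, following the tips in this section.
suggest_sglang_flags() {
    local gpus="$1" vram_gb="$2" flags=""
    if [ "$gpus" -ge 2 ]; then
        # Several cards: data parallelism for throughput (assumed: one replica per GPU).
        flags="--dp-size $gpus"
    elif [ "$vram_gb" -lt 24 ]; then
        # Single card under 24GB: shrink the KV cache to avoid running out of VRAM.
        flags="--mem-fraction-static 0.5"
    fi
    echo "$flags"
}

# Dry run: print the mineru command for a single 16GB card.
echo mineru -p "<input_path>" -o "<output_path>" -b vlm-sglang-engine $(suggest_sglang_flags 1 16)
```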
## GPU Device Selection and Configuration
### CUDA_VISIBLE_DEVICES Basic Usage
> [!TIP]
> - In any situation, you can specify visible GPU devices by adding the `CUDA_VISIBLE_DEVICES` environment variable at the beginning of the command line. For example:
> ```bash
> CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
> ```
> - This specification method is effective for all command line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.
### Common Device Configuration Examples
> [!TIP]
> - Here are some common `CUDA_VISIBLE_DEVICES` setting examples:
> ```bash
> CUDA_VISIBLE_DEVICES=1       # Only device 1 is visible
> CUDA_VISIBLE_DEVICES=0,1     # Devices 0 and 1 are visible
> CUDA_VISIBLE_DEVICES="0,1"   # Same as above; the quotation marks are optional
> CUDA_VISIBLE_DEVICES=0,2,3   # Devices 0, 2, and 3 are visible; device 1 is masked
> CUDA_VISIBLE_DEVICES=""      # No GPU is visible
> ```
### Practical Application Scenarios
> [!TIP]
> Here are some possible usage scenarios:
> - If you have multiple graphics cards and need to specify cards 0 and 1, using multi-card parallelism to start `sglang-server`, you can use the following command:
> ```bash
> CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
> ```
> - If you have multiple graphics cards and need to start two `fastapi` services on cards 0 and 1, listening on different ports respectively, you can use the following commands:
> ```bash
> # In terminal 1
> CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
> # In terminal 2
> CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
> ```
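The two-terminal pattern above extends naturally to more cards: one `mineru-api` process per GPU, each on its own port. The sketch below only prints the commands so they can be reviewed first; `plan_api_workers` is a hypothetical helper, and the consecutive port numbering is an assumption made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one mineru-api launch command per GPU id,
# assigning consecutive ports starting at base_port. Dry run only.
plan_api_workers() {
    local base_port="$1"; shift
    local i=0 gpu
    for gpu in "$@"; do
        echo "CUDA_VISIBLE_DEVICES=$gpu mineru-api --host 127.0.0.1 --port $(( base_port + i ))"
        i=$(( i + 1 ))
    done
}

plan_api_workers 8000 0 1
# prints:
# CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
# CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
```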
# API Calls or Visual Invocation
1. Directly invoke using Python API: [Python Invocation Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
2. Invoke using FastAPI:
```bash
mineru-api --host 127.0.0.1 --port 8000
```
Visit http://127.0.0.1:8000/docs in your browser to view the API documentation.
3. Use Gradio WebUI or Gradio API:
```bash
# Using pipeline/vlm-transformers/vlm-sglang-client backend
mineru-gradio --server-name 127.0.0.1 --server-port 7860
# Or using vlm-sglang-engine/pipeline backend
mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
```
Access http://127.0.0.1:7860 in your browser to use the Gradio WebUI, or visit http://127.0.0.1:7860/?view=api to use the Gradio API.
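When one of the services above is started from a script, it can be useful to wait until its port accepts connections before opening the documentation page or the WebUI. This is a minimal sketch using only the standard library; `wait_for_service` is a hypothetical helper and is not part of MinerU.

```python
import socket
import time


def wait_for_service(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True  # something is listening on the port
        except OSError:
            time.sleep(0.5)  # not up yet, retry shortly
    return False


# Example (assumes `mineru-api --host 127.0.0.1 --port 8000` was started separately):
# if wait_for_service("127.0.0.1", 8000):
#     print("API docs: http://127.0.0.1:8000/docs")
```

A successful TCP connection only shows that the port is open; it does not prove that models have finished loading.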
> [!TIP]
> - Below are some suggestions and notes for using the sglang acceleration mode:
>   - The sglang acceleration mode currently supports operation on Turing architecture GPUs with a minimum of 8GB VRAM, but you may encounter VRAM shortages on GPUs with less than 24GB VRAM. You can optimize VRAM usage with the following parameters:
>     - If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by setting `--mem-fraction-static 0.5`. If VRAM issues persist, try lowering it further to `0.4` or below.
>     - If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode: `--tp-size 2`
>   - If you are already successfully using sglang to accelerate VLM inference but wish to further improve inference speed, consider the following parameters:
>     - If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode: `--dp-size 2`
>     - You can also enable `torch.compile` to accelerate inference speed by about 15%: `--enable-torch-compile`
>   - For more information on using sglang parameters, please refer to the [sglang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
>   - All sglang-supported parameters can be passed to MinerU via command-line arguments, including those used with the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
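The tuning flags mentioned above can be combined on a single command line. The sketch below assembles a `mineru-sglang-server` command from optional settings; the function name `sglang_server_argv` is hypothetical, and only flags that appear in this document are used.

```python
from typing import Optional


def sglang_server_argv(
    port: int = 30000,
    mem_fraction_static: Optional[float] = None,
    tp_size: Optional[int] = None,
    dp_size: Optional[int] = None,
    torch_compile: bool = False,
) -> list[str]:
    """Build a `mineru-sglang-server` command line from optional tuning settings."""
    argv = ["mineru-sglang-server", "--port", str(port)]
    if mem_fraction_static is not None:
        # Lower values shrink the KV cache and reduce VRAM pressure.
        argv += ["--mem-fraction-static", str(mem_fraction_static)]
    if tp_size is not None:
        argv += ["--tp-size", str(tp_size)]  # tensor parallelism across GPUs
    if dp_size is not None:
        argv += ["--dp-size", str(dp_size)]  # multi-GPU parallel mode for throughput
    if torch_compile:
        argv.append("--enable-torch-compile")
    return argv


print(" ".join(sglang_server_argv(mem_fraction_static=0.5)))
```

Keeping every setting optional means the default call reproduces the plain `mineru-sglang-server --port 30000` command.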
> [!TIP]
> - In any case, you can specify visible GPU devices at the start of a command line by adding the `CUDA_VISIBLE_DEVICES` environment variable. For example:
> ```bash
> CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
> ```
> - This method works for all command-line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.
> - Below are some common `CUDA_VISIBLE_DEVICES` settings:
> ```bash
> CUDA_VISIBLE_DEVICES=1      # Only device 1 will be visible
> CUDA_VISIBLE_DEVICES=0,1    # Devices 0 and 1 will be visible
> CUDA_VISIBLE_DEVICES="0,1"  # Same as above; quotation marks are optional
> CUDA_VISIBLE_DEVICES=0,2,3  # Devices 0, 2, and 3 will be visible; device 1 is masked
> CUDA_VISIBLE_DEVICES=""     # No GPU will be visible
> ```
> - Below are some possible use cases:
> - If you have multiple GPUs and want to launch `sglang-server` on GPU 0 and GPU 1 in multi-GPU mode, you can use the following command:
> ```bash
> CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
> ```
> - If you have multiple GPUs and need to launch two `fastapi` services on GPU 0 and GPU 1 respectively, listening on different ports, you can use the following commands:
> ```bash
> # In terminal 1
> CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
> # In terminal 2
> CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
> ```
---
# Command Line Tools Usage Instructions
## View Help Information
To view help information for the MinerU command line tools, pass the `--help` parameter. The help output of each tool is shown below:
```bash
mineru --help
Usage: mineru [OPTIONS]
Options:
-v, --version Show version and exit
-p, --path PATH Input file path or directory (required)
-o, --output PATH Output directory (required)
-m, --method [auto|txt|ocr] Parsing method: auto (default), txt, ocr (pipeline backend only)
-b, --backend [pipeline|vlm-transformers|vlm-sglang-engine|vlm-sglang-client]
Parsing backend (default: pipeline)
-l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|latin|arabic|east_slavic|cyrillic|devanagari]
Specify document language (improves OCR accuracy, pipeline backend only)
-u, --url TEXT Service address when using sglang-client
-s, --start INTEGER Starting page number for parsing (0-based)
-e, --end INTEGER Ending page number for parsing (0-based)
-f, --formula BOOLEAN Enable formula parsing (default: enabled)
-t, --table BOOLEAN Enable table parsing (default: enabled)
-d, --device TEXT Inference device (e.g., cpu/cuda/cuda:0/npu/mps, pipeline backend only)
--vram INTEGER Maximum GPU VRAM usage per process (GB) (pipeline backend only)
--source [huggingface|modelscope|local]
Model source, default: huggingface
--help Show help information
```
```bash
mineru-api --help
Usage: mineru-api [OPTIONS]
Options:
--host TEXT Server host (default: 127.0.0.1)
--port INTEGER Server port (default: 8000)
--reload Enable auto-reload (development mode)
--help Show this message and exit.
```
```bash
mineru-gradio --help
Usage: mineru-gradio [OPTIONS]
Options:
--enable-example BOOLEAN Enable example files for input. The example
files to be input need to be placed in the
`example` folder within the directory where
the command is currently executed.
--enable-sglang-engine BOOLEAN Enable SgLang engine backend for faster
processing.
--enable-api BOOLEAN Enable gradio API for serving the
application.
--max-convert-pages INTEGER Set the maximum number of pages to convert
from PDF to Markdown.
--server-name TEXT Set the server name for the Gradio app.
--server-port INTEGER Set the server port for the Gradio app.
--latex-delimiters-type [a|b|all]
Set the type of LaTeX delimiters to use in
Markdown rendering: 'a' for type '$', 'b' for
type '()[]', 'all' for both types.
--help Show this message and exit.
```
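Because `-s` and `-e` take 0-based page numbers, a small wrapper can help avoid off-by-one mistakes when users think in 1-based pages. This is a sketch under one assumption that the help text does not confirm: the end page is treated as inclusive. The helper name `mineru_argv` is hypothetical.

```python
def mineru_argv(input_path: str, output_path: str, first_page: int, last_page: int) -> list[str]:
    """Build a `mineru` command for a 1-based page range.

    Assumption: the page given to -e is included in the parsed range.
    """
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return [
        "mineru",
        "-p", input_path,
        "-o", output_path,
        "-s", str(first_page - 1),  # -s is 0-based
        "-e", str(last_page - 1),   # -e is 0-based; assumed inclusive here
    ]


print(" ".join(mineru_argv("doc.pdf", "out", first_page=1, last_page=3)))
```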
## Environment Variables Description
Some parameters of the MinerU command line tools have equivalent environment variables. In general, an environment variable takes priority over the corresponding command line parameter and applies to all command line tools.
- `MINERU_DEVICE_MODE`: Inference device, such as `cpu/cuda/cuda:0/npu/mps`. Only effective for the `pipeline` backend.
- `MINERU_VIRTUAL_VRAM_SIZE`: Maximum GPU VRAM usage per process, in GB. Only effective for the `pipeline` backend.
- `MINERU_MODEL_SOURCE`: Model source, one of `huggingface/modelscope/local`. Defaults to `huggingface`; set it to `modelscope` or `local` to switch.
- `MINERU_TOOLS_CONFIG_JSON`: Path to the configuration file. Defaults to `mineru.json` in the user directory; set it to use a configuration file at another path.
- `MINERU_FORMULA_ENABLE`: Whether formula parsing is enabled. Defaults to `true`; set it to `false` to disable formula parsing.
- `MINERU_TABLE_ENABLE`: Whether table parsing is enabled. Defaults to `true`; set it to `false` to disable table parsing.
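When MinerU is launched from another program, these variables can be collected into the child process environment. The following sketch builds such an environment; the function name `mineru_env` and its parameters are hypothetical, and only variable names from the list above are set.

```python
import os


def mineru_env(device_mode=None, vram_gb=None, formula=None, table=None, base=None):
    """Return a copy of the environment with the requested MinerU variables set."""
    env = dict(os.environ if base is None else base)
    if device_mode is not None:
        env["MINERU_DEVICE_MODE"] = device_mode           # e.g. "cpu" or "cuda:0"
    if vram_gb is not None:
        env["MINERU_VIRTUAL_VRAM_SIZE"] = str(vram_gb)    # per-process limit in GB
    if formula is not None:
        env["MINERU_FORMULA_ENABLE"] = "true" if formula else "false"
    if table is not None:
        env["MINERU_TABLE_ENABLE"] = "true" if table else "false"
    return env


env = mineru_env(device_mode="cuda:0", vram_gb=8, table=False, base={})
print(env)
# To run with these settings: subprocess.run(["mineru", "-p", "<input_path>", "-o", "<output_path>"], env=env)
```

Settings left as `None` are not written, so any value already present in the base environment is preserved.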
# Extending MinerU Functionality Through Configuration Files
- MinerU is designed to work out-of-the-box, but also supports extending functionality through configuration files. You can create a `mineru.json` file in your home directory and add custom configurations.
- The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`. Alternatively, you can create it by copying the [configuration template file](../../mineru.template.json) to your home directory and renaming it to `mineru.json`.
- Below are some available configuration options:
- `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to the `$` symbol, and can be modified to other symbols or strings as needed.
- `llm-aided-config`: Used to configure related parameters for LLM-assisted heading level detection, compatible with all LLM models supporting the `OpenAI protocol`. It defaults to Alibaba Cloud Qwen's `qwen2.5-32b-instruct` model. You need to configure an API key yourself and set `enable` to `true` to activate this feature.
- `models-dir`: Used to specify local model storage directories. Please specify separate model directories for the `pipeline` and `vlm` backends. After specifying these directories, you can use local models by setting the environment variable `export MINERU_MODEL_SOURCE=local`.
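As an illustration of the options above, the sketch below writes a minimal configuration file containing `latex-delimiter-config` and `models-dir`. The key layout follows the configuration template; the function name `write_mineru_config` and the example paths are hypothetical, and whether MinerU accepts a file that omits the template's other keys is an assumption.

```python
import json
from pathlib import Path


def write_mineru_config(pipeline_dir: str, vlm_dir: str, config_path: Path) -> dict:
    """Write a minimal mineru.json; note that an existing file is overwritten."""
    config = {
        "latex-delimiter-config": {
            "display": {"left": "$$", "right": "$$"},
            "inline": {"left": "$", "right": "$"},
        },
        # Separate model directories for the pipeline and vlm backends.
        "models-dir": {"pipeline": pipeline_dir, "vlm": vlm_dir},
    }
    config_path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config


# Example (placeholder paths):
# write_mineru_config("/models/pipeline", "/models/vlm", Path.home() / "mineru.json")
```

Copying the full template file remains the safer starting point when other options such as `llm-aided-config` are needed.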
---
# Using MinerU
## Quick Model Source Configuration
MinerU uses `huggingface` as the default model source. If users cannot access `huggingface` due to network restrictions, they can conveniently switch the model source to `modelscope` through environment variables:
```bash
export MINERU_MODEL_SOURCE=modelscope
```
For more information about model source configuration and custom local model paths, please refer to the [Model Source Documentation](./model_source.md) in the documentation.
---
## Quick Usage via Command Line
MinerU has built-in command line tools that allow users to quickly use MinerU for PDF parsing through the command line:
```bash
# Default parsing using pipeline backend
mineru -p <input_path> -o <output_path>
```
- `<input_path>`: Local PDF/image file or directory
- `<output_path>`: Output directory
> [!NOTE]
> The command line tool will automatically attempt cuda/mps acceleration on Linux and macOS systems. Windows users who need cuda acceleration should visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to select the appropriate command for their cuda version to install acceleration-enabled `torch` and `torchvision`.
> [!TIP]
> For more information about output files, please refer to [Output File Documentation](./output_file.md).
```bash
# Or specify vlm backend for parsing
mineru -p <input_path> -o <output_path> -b vlm-transformers
```
> [!TIP]
> The vlm backend additionally supports `sglang` acceleration. Compared to the `transformers` backend, `sglang` can achieve 20-30x speedup. You can check the installation method for the complete package supporting `sglang` acceleration in the [Extension Modules Installation Guide](../quick_start/extension_modules.md).
If you need to adjust parsing options through custom parameters, you can also check the more detailed [Command Line Tools Usage Instructions](./cli_tools.md) in the documentation.
---
## Advanced Usage via API, WebUI, sglang-client/server
- Direct Python API calls: [Python Usage Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
- FastAPI calls:
  ```bash
  mineru-api --host 127.0.0.1 --port 8000
  ```
  Access http://127.0.0.1:8000/docs in your browser to view the API documentation.
- Start Gradio WebUI visual frontend:
  ```bash
  # Using pipeline/vlm-transformers/vlm-sglang-client backends
  mineru-gradio --server-name 127.0.0.1 --server-port 7860
  # Or using vlm-sglang-engine/pipeline backends (requires sglang environment)
  mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
  ```
  Access http://127.0.0.1:7860 in your browser to use Gradio WebUI or access http://127.0.0.1:7860/?view=api to use the Gradio API.
- Using `sglang-client/server` method:
  ```bash
  # Start sglang server (requires sglang environment)
  mineru-sglang-server --port 30000
  # In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
  mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
  ```
> [!TIP]
> All officially supported sglang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`.
> We have compiled some commonly used parameters and usage methods for `sglang`, which can be found in the documentation [Advanced Command Line Parameters](./advanced_cli_parameters.md).
## Extending MinerU Functionality with Configuration Files
- MinerU is now ready to use out of the box, but also supports extending functionality through configuration files. You can create a `mineru.json` file in your user directory to add custom configurations.
- The `mineru.json` file will be automatically generated when you use the built-in model download command `mineru-models-download`, or you can create it by copying the [configuration template file](https://github.com/opendatalab/MinerU/blob/master/mineru.template.json) to your user directory and renaming it to `mineru.json`.
- Here are some available configuration options:
  - `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to `$` symbol, can be modified to other symbols or strings as needed.
  - `llm-aided-config`: Used to configure parameters for LLM-assisted title hierarchy, compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
  - `models-dir`: Used to specify local model storage directory, please specify model directories for `pipeline` and `vlm` backends separately. After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
# Model Source Documentation
MinerU uses `HuggingFace` and `ModelScope` as model repositories. Users can switch model sources or use local models as needed.
- `HuggingFace` is the default model source, providing excellent loading speed and high stability globally.
- `ModelScope` is the best choice for users in mainland China, providing seamlessly compatible `hf` SDK modules, suitable for users who cannot access HuggingFace.
## Methods to Switch Model Sources
### Switch via Command Line Parameters
Currently, only the `mineru` command line tool supports switching model sources through command line parameters. Other command line tools such as `mineru-api`, `mineru-gradio`, etc., do not support this yet.
```bash
mineru -p <input_path> -o <output_path> --source modelscope
```
### Switch via Environment Variables
You can switch model sources by setting environment variables in any situation. This applies to all command line tools and API calls.
```bash
export MINERU_MODEL_SOURCE=modelscope
```
or
```python
import os
os.environ["MINERU_MODEL_SOURCE"] = "modelscope"
```
>[!TIP]
> Model sources set through environment variables will take effect in the current terminal session until the terminal is closed or the environment variable is modified. They have higher priority than command line parameters - if both command line parameters and environment variables are set, the command line parameters will be ignored.
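The precedence rule in the tip above can be summarized in a few lines. This is only an illustration of the documented behavior, not MinerU's actual implementation; `effective_model_source` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def effective_model_source(cli_source: Optional[str], environ: Mapping[str, str] = os.environ) -> str:
    """Resolve the model source following the documented precedence."""
    env_source = environ.get("MINERU_MODEL_SOURCE")
    if env_source:
        return env_source  # the environment variable wins over --source
    if cli_source:
        return cli_source  # value passed via --source
    return "huggingface"   # documented default


print(effective_model_source("local", {"MINERU_MODEL_SOURCE": "modelscope"}))
```

In practice this means a leftover `export MINERU_MODEL_SOURCE=...` in the current terminal can silently override `--source`, so it is worth checking the variable when a switch appears to have no effect.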
## Using Local Models
### 1. Download Models to Local Storage
```bash
mineru-models-download --help
```
or use the interactive command line tool to select model downloads:
```bash
mineru-models-download
```
>[!TIP]
>- After download completion, the model path will be output in the current terminal window and automatically written to `mineru.json` in the user directory.
>- After downloading models locally, you can freely move the model folder to other locations while updating the model path in `mineru.json`.
>- If you deploy the model folder to another server, please ensure you move the `mineru.json` file to the user directory of the new device and configure the model path correctly.
>- If you need to update model files, you can run the `mineru-models-download` command again. Model updates do not support custom paths currently - if you haven't moved the local model folder, model files will be incrementally updated; if you have moved the model folder, model files will be re-downloaded to the default location and `mineru.json` will be updated.
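After moving a model folder, the path stored in `mineru.json` has to be updated by hand. A small script can do this; the sketch below assumes the `models-dir` layout shown in the configuration template, and `update_models_dir` is a hypothetical helper rather than a MinerU command.

```python
import json
from pathlib import Path


def update_models_dir(config_path: Path, backend: str, new_dir: str) -> dict:
    """Point the given backend's entry in `models-dir` at a new directory."""
    if backend not in ("pipeline", "vlm"):
        raise ValueError("backend must be 'pipeline' or 'vlm'")
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the section if it is missing, then replace only the one entry.
    config.setdefault("models-dir", {})[backend] = new_dir
    config_path.write_text(json.dumps(config, indent=4, ensure_ascii=False), encoding="utf-8")
    return config


# Example (placeholder path):
# update_models_dir(Path.home() / "mineru.json", "vlm", "/data/models/vlm")
```

All other keys in the file are left untouched, so existing settings such as `llm-aided-config` survive the update.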
### 2. Use Local Models for Parsing
```bash
mineru -p <input_path> -o <output_path> --source local
```
or enable through environment variables:
```bash
export MINERU_MODEL_SOURCE=local
mineru -p <input_path> -o <output_path>
```
{
    "bucket_info": {
        "bucket-name-1": ["ak", "sk", "endpoint"],
        "bucket-name-2": ["ak", "sk", "endpoint"]
    },
    "latex-delimiter-config": {
        "display": {
            "left": "$$",
            "right": "$$"
        },
        "inline": {
            "left": "$",
            "right": "$"
        }
    },
    "llm-aided-config": {
        "title_aided": {
            "api_key": "your_api_key",
            "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "model": "qwen2.5-32b-instruct",
            "enable": false
        }
    },
    "models-dir": {
        "pipeline": "",
        "vlm": ""
    },
    "config_version": "1.3.0"
}
# Frequently Asked Questions
If your question is not listed, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to chat with an AI assistant, which can resolve most common problems.
If you still cannot resolve the problem, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to talk with other users and developers.
### 1. `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` on Ubuntu 22.04 under WSL2
Ubuntu 22.04 under WSL2 lacks the `libgl` library. You can resolve this by installing `libgl` with the following command:
```bash
sudo apt-get install libgl1-mesa-glx
```
Reference: https://github.com/opendatalab/MinerU/issues/388
### 2. `ERROR: Failed building wheel for simsimd` when installing MinerU on CentOS 7 or Ubuntu 18
Newer versions of albumentations (1.4.21) introduced a dependency on simsimd. Because the prebuilt simsimd packages for Linux require glibc 2.28 or later, some Linux distributions released before 2019 cannot install it normally. You can install MinerU with the following command instead:
```
pip install -U "mineru[pipeline_old_linux]"
```
Reference: https://github.com/opendatalab/MinerU/issues/1004
### 3. Some text is missing from the parsing results when installing and using MinerU on Linux
Since version 2.0, MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine to resolve the AGPLv3 license issue. On some Linux distributions, missing CJK fonts can cause some text to be lost when PDF pages are rendered to images.
To solve this problem, you can install the Noto font packages with the following commands, which work on Ubuntu/Debian systems:
```bash
sudo apt update
sudo apt install fonts-noto-core
sudo apt install fonts-noto-cjk
fc-cache -fv
```
You can also build the image directly using our [Docker deployment](../quick_start/docker_deployment.md) method; the image includes the font packages above by default.
Reference: https://github.com/opendatalab/MinerU/issues/2915
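To check whether CJK-capable fonts are already present before or after installing the packages, fontconfig's `fc-list` can be queried. The sketch below is an assumption-laden convenience and not part of MinerU: `count_cjk_fonts` is a hypothetical helper, and it uses the `:lang=zh` pattern as a proxy for CJK coverage.

```python
import shutil
import subprocess
from typing import Optional


def count_cjk_fonts() -> Optional[int]:
    """Count installed fonts that cover Chinese, or return None if fc-list is unavailable."""
    if shutil.which("fc-list") is None:
        return None  # fontconfig is not installed, nothing can be queried
    result = subprocess.run(
        ["fc-list", ":lang=zh"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=False,
    )
    # fc-list prints one matching font per line.
    return sum(1 for line in result.stdout.splitlines() if line.strip())


print(count_cjk_fonts())
```

A result of `0` suggests that the font packages are missing or that the font cache has not been refreshed yet.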