Unverified commit 66e616bd, authored by Xiaomeng Zhao and committed by GitHub

Merge pull request #2895 from opendatalab/release-2.1.0

Release 2.1.0
parents 592b659e a4c9a07b
@@ -3,9 +3,7 @@
 ## Project List
 - Projects compatible with version 2.0:
-  - [gradio_app](./gradio_app/README.md): Web application based on Gradio
+  - [multi_gpu_v2](./multi_gpu_v2/README.md): Multi-GPU parallel processing based on LitServe
 - Projects not yet compatible with version 2.0:
-  - [web_api](./web_api/README.md): Web API based on FastAPI
-  - [multi_gpu](./multi_gpu/README.md): Multi-GPU parallel processing based on LitServe
   - [mcp](./mcp/README.md): MCP server based on the official API
@@ -3,9 +3,7 @@
 ## Project List
 - Projects compatible with version 2.0:
-  - [gradio_app](./gradio_app/README_zh-CN.md): Web application based on Gradio
+  - [multi_gpu_v2](./multi_gpu_v2/README_zh.md): Multi-GPU parallel processing based on LitServe
 - Projects not yet compatible with version 2.0:
-  - [web_api](./web_api/README.md): Web API based on FastAPI
-  - [multi_gpu](./multi_gpu/README.md): Multi-GPU parallel processing based on LitServe
   - [mcp](./mcp/README.md): MCP server based on the official API
## Installation
MinerU (>=0.8.0)
> If you already have a functioning MinerU environment, you can skip this step.
>
[Deploy in CPU environment](https://github.com/opendatalab/MinerU?tab=readme-ov-file#quick-cpu-demo)
[Deploy in GPU environment](https://github.com/opendatalab/MinerU?tab=readme-ov-file#using-gpu)
Third-party Software
```bash
pip install gradio gradio-pdf
```
## Start Gradio App
```bash
python app.py
```
## Use Gradio App
Access http://127.0.0.1:7860 in your web browser
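If you prefer scripted access over the browser UI, the `gradio_client` package (installed as a dependency of `gradio`) can drive a running Gradio app. The sketch below is only an assumption: the endpoint name `/predict` and the single-argument call are hypothetical; check the app's "Use via API" page for the actual interface.
```python
# Hedged sketch: call the running app programmatically with gradio_client.
# The api_name and argument list are hypothetical; inspect the app's
# "Use via API" page at http://127.0.0.1:7860 for the actual signature.
from gradio_client import Client

client = Client('http://127.0.0.1:7860')
result = client.predict('demo.pdf', api_name='/predict')
print(result)
```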
## Installation
MinerU (>=0.8.0)
> If you already have a functioning MinerU environment, you can skip this step.
>
[Deploy in CPU environment](https://github.com/opendatalab/MinerU/blob/master/README_zh-CN.md#%E4%BD%BF%E7%94%A8cpu%E5%BF%AB%E9%80%9F%E4%BD%93%E9%AA%8C)
[Deploy in GPU environment](https://github.com/opendatalab/MinerU/blob/master/README_zh-CN.md#%E4%BD%BF%E7%94%A8gpu)
Third-party Software
```bash
pip install gradio gradio-pdf
```
## Start Gradio App
```bash
python app.py
```
## Use Gradio App
Access http://127.0.0.1:7860 in your web browser
magic-pdf[full]>=0.8.0
gradio
gradio-pdf
## Project Overview
This project provides a multi-GPU parallel processing solution based on LitServe. LitServe is a simple and flexible serving engine for AI models, built on FastAPI. It augments FastAPI with features such as batching, streaming, and GPU autoscaling, so you don't have to rebuild a FastAPI server for every model.
## Environment Setup
Set up the required environment with the following command:
```bash
pip install -U "magic-pdf[full]" litserve python-multipart filetype
```
## Quick Start
### 1. Start the Server
The following example shows how to start the server; the settings can be customized:
```python
server = ls.LitServer(
    MinerUAPI(output_dir='/tmp'),  # customizable output directory
    accelerator='cuda',            # enable GPU acceleration
    devices='auto',                # "auto" uses all GPUs
    workers_per_device=1,          # one server instance per GPU
    timeout=False                  # set to False to disable timeouts
)
server.run(port=8000)  # serve on port 8000
```
Start the server:
```bash
python server.py
```
### 2. Start the Client
The following code shows how to use the client; adjust the configuration as needed:
```python
files = ['demo/small_ocr.pdf']  # replace with your file paths; pdf, jpg/jpeg, png, doc, docx, ppt, and pptx are supported
n_jobs = np.clip(len(files), 1, 8)  # number of concurrent threads, capped at 8 here; adjust to your needs
results = Parallel(n_jobs, prefer='threads', verbose=10)(
    delayed(do_parse)(p) for p in files
)
print(results)
```
Start the client:
```bash
python client.py
```
That's it! Your files will be processed in parallel across multiple GPUs automatically! 🍻🍻🍻
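Each entry in `results` is the JSON payload returned by the server (including the `output_dir` produced by `server.py` below) plus the original `file_path` added by the client; files that failed come back as `None`. A minimal sketch for inspecting where each parsed file landed:
```python
# Minimal sketch: report the output location for each successfully parsed file.
for r in results:
    if r is not None:
        print(f"{r['file_path']} -> {r['output_dir']}")
```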
import base64
import requests
import numpy as np
from loguru import logger
from joblib import Parallel, delayed


def to_b64(file_path):
    """Read a file and return its base64-encoded contents."""
    try:
        with open(file_path, 'rb') as f:
            return base64.b64encode(f.read()).decode('utf-8')
    except Exception as e:
        raise Exception(f'File: {file_path} - Info: {e}')


def do_parse(file_path, url='http://127.0.0.1:8000/predict', **kwargs):
    """POST one file to the server; returns the parsed result, or None on failure."""
    try:
        response = requests.post(url, json={
            'file': to_b64(file_path),
            'kwargs': kwargs
        })
        if response.status_code == 200:
            output = response.json()
            output['file_path'] = file_path
            return output
        else:
            raise Exception(response.text)
    except Exception as e:
        logger.error(f'File: {file_path} - Info: {e}')


if __name__ == '__main__':
    files = ['demo/small_ocr.pdf']
    n_jobs = np.clip(len(files), 1, 8)  # one thread per file, capped at 8
    results = Parallel(n_jobs, prefer='threads', verbose=10)(
        delayed(do_parse)(p) for p in files
    )
    print(results)
import os
import uuid
import shutil
import tempfile
import gc
import fitz
import torch
import base64
import filetype
import litserve as ls
from pathlib import Path
from fastapi import HTTPException


class MinerUAPI(ls.LitAPI):
    def __init__(self, output_dir='/tmp'):
        self.output_dir = Path(output_dir)

    def setup(self, device):
        if device.startswith('cuda'):
            os.environ['CUDA_VISIBLE_DEVICES'] = device.split(':')[-1]
            if torch.cuda.device_count() > 1:
                raise RuntimeError("Remove any CUDA actions before setting 'CUDA_VISIBLE_DEVICES'.")

        from magic_pdf.tools.cli import do_parse, convert_file_to_pdf
        from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton

        self.do_parse = do_parse
        self.convert_file_to_pdf = convert_file_to_pdf

        # Warm up both the OCR and non-OCR model pipelines once per worker
        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        print(f'Model initialization complete on {device}!')

    def decode_request(self, request):
        file = request['file']
        file = self.cvt2pdf(file)  # normalize any supported input to PDF bytes
        opts = request.get('kwargs', {})
        opts.setdefault('debug_able', False)
        opts.setdefault('parse_method', 'auto')
        return file, opts

    def predict(self, inputs):
        try:
            pdf_name = str(uuid.uuid4())
            output_dir = self.output_dir.joinpath(pdf_name)
            self.do_parse(self.output_dir, pdf_name, inputs[0], [], **inputs[1])
            return output_dir
        except Exception as e:
            shutil.rmtree(output_dir, ignore_errors=True)
            raise HTTPException(status_code=500, detail=str(e))
        finally:
            self.clean_memory()

    def encode_response(self, response):
        return {'output_dir': response}

    def clean_memory(self):
        # Release GPU memory between requests
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            torch.cuda.ipc_collect()
        gc.collect()

    def cvt2pdf(self, file_base64):
        try:
            temp_dir = Path(tempfile.mkdtemp())
            temp_file = temp_dir.joinpath('tmpfile')
            file_bytes = base64.b64decode(file_base64)
            file_ext = filetype.guess_extension(file_bytes)

            if file_ext in ['pdf', 'jpg', 'png', 'doc', 'docx', 'ppt', 'pptx']:
                if file_ext == 'pdf':
                    return file_bytes
                elif file_ext in ['jpg', 'png']:
                    # Wrap the image in a single-page PDF
                    with fitz.open(stream=file_bytes, filetype=file_ext) as f:
                        return f.convert_to_pdf()
                else:
                    # Office formats go through an external PDF conversion
                    temp_file.write_bytes(file_bytes)
                    self.convert_file_to_pdf(temp_file, temp_dir)
                    return temp_file.with_suffix('.pdf').read_bytes()
            else:
                raise Exception('Unsupported file format')
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))
        finally:
            shutil.rmtree(temp_dir, ignore_errors=True)


if __name__ == '__main__':
    server = ls.LitServer(
        MinerUAPI(output_dir='/tmp'),
        accelerator='cuda',
        devices='auto',
        workers_per_device=1,
        timeout=False
    )
    server.run(port=8000)
# MinerU v2.0 Multi-GPU Server
[简体中文](README_zh.md)
A streamlined multi-GPU server implementation.
## Quick Start
### 1. Install MinerU
```bash
pip install --upgrade pip
pip install uv
uv pip install -U "mineru[core]"
uv pip install litserve aiohttp loguru
```
### 2. Start the Server
```bash
python server.py
```
### 3. Start the Client
```bash
python client.py
```
Now, the PDF files in the [demo](../../demo/) folder will be processed in parallel. Assuming you have 2 GPUs, changing `workers_per_device` to `2` lets 4 PDF files (2 GPUs × 2 workers each) be processed at the same time!
## Customize
### Server
Example showing how to start the server with custom settings:
```python
server = ls.LitServer(
    MinerUAPI(output_dir='/tmp/mineru_output'),
    accelerator='auto',     # You can specify 'cuda'
    devices='auto',         # "auto" uses all available GPUs
    workers_per_device=1,   # One worker instance per GPU
    timeout=False           # Disable timeout for long processing
)
server.run(port=8000, generate_client_file=False)
```
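If you'd rather not occupy every GPU on the machine, LitServe's `devices` parameter also accepts an integer GPU count. A sketch under that assumption (the values are illustrative):
```python
server = ls.LitServer(
    MinerUAPI(output_dir='/tmp/mineru_output'),
    accelerator='cuda',
    devices=2,              # pin the server to two GPUs instead of all of them
    workers_per_device=2,   # 2 GPUs x 2 workers = 4 requests in flight
    timeout=False
)
server.run(port=8000, generate_client_file=False)
```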
### Client
The client supports both synchronous and asynchronous processing:
```python
import asyncio
import aiohttp
from client import mineru_parse_async

async def process_documents():
    async with aiohttp.ClientSession() as session:
        # Basic usage
        result = await mineru_parse_async(session, 'document.pdf')

        # With custom options
        result = await mineru_parse_async(
            session,
            'document.pdf',
            backend='pipeline',
            lang='ch',
            formula_enable=True,
            table_enable=True
        )

# Run async processing
asyncio.run(process_documents())
```
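For one-off synchronous use, a small wrapper can drive a single async call with `asyncio.run`. This helper (`mineru_parse` is a hypothetical name, not part of `client.py`) is a sketch built on the `mineru_parse_async` shown above:
```python
import asyncio
import aiohttp
from client import mineru_parse_async

def mineru_parse(file_path, **kwargs):
    # Hypothetical synchronous convenience wrapper around mineru_parse_async
    async def _run():
        async with aiohttp.ClientSession() as session:
            return await mineru_parse_async(session, file_path, **kwargs)
    return asyncio.run(_run())

result = mineru_parse('document.pdf', backend='pipeline')
```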
### Concurrent Processing
Process multiple files simultaneously:
```python
async def process_multiple_files():
    files = ['doc1.pdf', 'doc2.pdf', 'doc3.pdf']
    async with aiohttp.ClientSession() as session:
        tasks = [mineru_parse_async(session, file) for file in files]
        results = await asyncio.gather(*tasks)
    return results
```
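With many files, an unbounded `asyncio.gather` can queue more requests than the server's workers can absorb. A sketch that caps in-flight requests with `asyncio.Semaphore` (the `max_concurrency` value is an arbitrary example):
```python
import asyncio
import aiohttp
from client import mineru_parse_async

async def process_with_limit(files, max_concurrency=4):
    semaphore = asyncio.Semaphore(max_concurrency)  # cap concurrent requests

    async def parse_one(session, path):
        async with semaphore:
            return await mineru_parse_async(session, path)

    async with aiohttp.ClientSession() as session:
        tasks = [parse_one(session, f) for f in files]
        return await asyncio.gather(*tasks)
```
A sensible cap is the total worker count (number of GPUs × `workers_per_device`), so every worker stays busy without building a long server-side backlog.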
# MinerU v2.0 Multi-GPU Server
[English](README.md)
A streamlined multi-GPU server implementation.
## Quick Start
### 1. Install MinerU
```bash
pip install --upgrade pip
pip install uv
uv pip install -U "mineru[core]"
uv pip install litserve aiohttp loguru
```
### 2. Start the Server
```bash
python server.py
```
### 3. Start the Client
```bash
python client.py
```
Now, the PDF files in the [demo](../../demo/) folder will be processed in parallel. Assuming you have 2 GPUs, changing `workers_per_device` to `2` lets 4 PDF files be processed at the same time!
## Customize
### Server
The following example shows how to start the server with custom settings:
```python
server = ls.LitServer(
    MinerUAPI(output_dir='/tmp/mineru_output'),  # customizable output directory
    accelerator='auto',     # you can specify 'cuda'
    devices='auto',         # "auto" uses all available GPUs
    workers_per_device=1,   # one worker instance per GPU
    timeout=False           # disable timeout for long processing
)
server.run(port=8000, generate_client_file=False)
```
### Client
The client supports both synchronous and asynchronous processing:
```python
import asyncio
import aiohttp
from client import mineru_parse_async

async def process_documents():
    async with aiohttp.ClientSession() as session:
        # Basic usage
        result = await mineru_parse_async(session, 'document.pdf')

        # With custom options
        result = await mineru_parse_async(
            session,
            'document.pdf',
            backend='pipeline',
            lang='ch',
            formula_enable=True,
            table_enable=True
        )

# Run async processing
asyncio.run(process_documents())
```
### Concurrent Processing
Process multiple files simultaneously:
```python
async def process_multiple_files():
    files = ['doc1.pdf', 'doc2.pdf', 'doc3.pdf']
    async with aiohttp.ClientSession() as session:
        tasks = [mineru_parse_async(session, file) for file in files]
        results = await asyncio.gather(*tasks)
    return results
```