"python/sglang/srt/managers/controller/tp_worker.py" did not exist on "64fe311593edee917a28506be8723127d4e938c9"
Unverified Commit bcbbee8c authored by Xiaomeng Zhao's avatar Xiaomeng Zhao Committed by GitHub
Browse files

Merge pull request #2622 from myhloli/dev

Dev
parents 3cc3f754 ced5a7b4
.. toctree::
:maxdepth: 2
user_guide/install
user_guide/usage
user_guide/quick_start
user_guide/tutorial
user_guide/data
user_guide/inference_result
user_guide/pipe_result
Data
=========
.. toctree::
:maxdepth: 2
data/dataset
data/read_api
data/data_reader_writer
data/io
Data Reader Writer
====================
This module reads and writes bytes from different media. You can implement new classes to meet the needs of your own scenarios
if MinerU does not provide a suitable class. Implementing a new class is easy; the only requirement is to inherit from
``DataReader`` or ``DataWriter``.
.. code:: python
class SomeReader(DataReader):
def read(self, path: str) -> bytes:
pass
def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
pass
class SomeWriter(DataWriter):
def write(self, path: str, data: bytes) -> None:
pass
def write_string(self, path: str, data: str) -> None:
pass
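For instance, a minimal in-memory implementation might look like the sketch below. The dictionary-backed ``DictReader``/``DictWriter`` classes are purely illustrative, and the import path of the base classes is assumed to be ``magic_pdf.data.data_reader_writer``.

.. code:: python

    from magic_pdf.data.data_reader_writer import DataReader, DataWriter  # import path assumed

    class DictReader(DataReader):
        """Illustrative reader backed by an in-memory dict."""

        def __init__(self, storage: dict[str, bytes]):
            self._storage = storage

        def read(self, path: str) -> bytes:
            # return the whole object stored under `path`
            return self._storage[path]

        def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
            # return `limit` bytes starting at `offset`; -1 means "until the end"
            data = self._storage[path]
            return data[offset:] if limit == -1 else data[offset:offset + limit]

    class DictWriter(DataWriter):
        """Illustrative writer backed by the same in-memory dict."""

        def __init__(self, storage: dict[str, bytes]):
            self._storage = storage

        def write(self, path: str, data: bytes) -> None:
            # store the bytes under `path`
            self._storage[path] = data

A pair like this can be handy in unit tests where touching the real filesystem or S3 is undesirable.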
Readers may be curious about the difference between :doc:`io` and this section, since the two look very similar at first glance.
:doc:`io` provides the fundamental functions, while this section works at the application level: users can build their own classes to meet
the needs of their applications, and those classes may share the same underlying IO functions. That is why :doc:`io` exists as a separate layer.
Important Classes
-----------------
.. code:: python
class FileBasedDataReader(DataReader):
def __init__(self, parent_dir: str = ''):
pass
class FileBasedDataWriter(DataWriter):
def __init__(self, parent_dir: str = '') -> None:
pass
Class ``FileBasedDataReader`` is initialized with a single parameter, ``parent_dir``. This means that every method provided by ``FileBasedDataReader`` has the following behavior.
Features:
#. When reading from an absolute path, ``parent_dir`` is ignored and the content is read from that path.
#. When reading from a relative path, the path is first joined with ``parent_dir``, and the content is then read from the merged path.
.. note::
``FileBasedDataWriter`` shares the same behavior as ``FileBasedDataReader``.
.. code:: python
class MultiS3Mixin:
def __init__(self, default_prefix: str, s3_configs: list[S3Config]):
pass
class MultiBucketS3DataReader(DataReader, MultiS3Mixin):
pass
All read-related methods provided by class ``MultiBucketS3DataReader`` have the following behavior.
Features:
#. When reading an object via a full S3 path, for example ``s3://test_bucket/test_object``, ``default_prefix`` is ignored.
#. When reading an object via a relative path, the path is first joined with ``default_prefix`` (with ``bucket_name`` trimmed), and the content is then read from the merged path. ``bucket_name`` is the first element obtained after splitting ``default_prefix`` with the delimiter ``/``.
.. note::
``MultiBucketS3DataWriter`` shares the same behavior as ``MultiBucketS3DataReader``.
.. code:: python
class S3DataReader(MultiBucketS3DataReader):
pass
``S3DataReader`` is built on top of ``MultiBucketS3DataReader`` but supports only a single bucket. The same applies to ``S3DataWriter``.
Read Examples
---------------
.. code:: python
import os
from magic_pdf.data.data_reader_writer import *
from magic_pdf.data.data_reader_writer import MultiBucketS3DataReader
from magic_pdf.data.schemas import S3Config
# file based related
file_based_reader1 = FileBasedDataReader('')
## will read file abc
file_based_reader1.read('abc')
file_based_reader2 = FileBasedDataReader('/tmp')
## will read /tmp/abc
file_based_reader2.read('abc')
## will read /tmp/logs/message.txt
file_based_reader2.read('/tmp/logs/message.txt')
# multi-bucket S3 related
bucket = "bucket" # replace with real bucket
ak = "ak" # replace with real access key
sk = "sk" # replace with real secret key
endpoint_url = "endpoint_url" # replace with real endpoint_url
bucket_2 = "bucket_2" # replace with real bucket
ak_2 = "ak_2" # replace with real access key
sk_2 = "sk_2" # replace with real secret key
endpoint_url_2 = "endpoint_url_2" # replace with real endpoint_url
test_prefix = 'test/unittest'
multi_bucket_s3_reader1 = MultiBucketS3DataReader(f"{bucket}/{test_prefix}", [S3Config(
bucket_name=bucket, access_key=ak, secret_key=sk, endpoint_url=endpoint_url
),
S3Config(
bucket_name=bucket_2,
access_key=ak_2,
secret_key=sk_2,
endpoint_url=endpoint_url_2,
)])
## will read s3://{bucket}/{test_prefix}/abc
multi_bucket_s3_reader1.read('abc')
## will read s3://{bucket}/{test_prefix}/efg
multi_bucket_s3_reader1.read(f's3://{bucket}/{test_prefix}/efg')
## will read s3://{bucket_2}/{test_prefix}/abc
multi_bucket_s3_reader1.read(f's3://{bucket_2}/{test_prefix}/abc')
# s3 related
s3_reader1 = S3DataReader(
test_prefix,
bucket,
ak,
sk,
endpoint_url
)
## will read s3://{bucket}/{test_prefix}/abc
s3_reader1.read('abc')
## will read s3://{bucket}/efg
s3_reader1.read(f's3://{bucket}/efg')
Write Examples
---------------
.. code:: python
import os
from magic_pdf.data.data_reader_writer import *
from magic_pdf.data.data_reader_writer import MultiBucketS3DataWriter
from magic_pdf.data.schemas import S3Config
# file based related
file_based_writer1 = FileBasedDataWriter("")
## will write 123 to abc
file_based_writer1.write("abc", "123".encode())
## will write 123 to abc
file_based_writer1.write_string("abc", "123")
file_based_writer2 = FileBasedDataWriter("/tmp")
## will write 123 to /tmp/abc
file_based_writer2.write_string("abc", "123")
## will write 123 to /tmp/logs/message.txt
file_based_writer2.write_string("/tmp/logs/message.txt", "123")
# multi-bucket S3 related
bucket = "bucket" # replace with real bucket
ak = "ak" # replace with real access key
sk = "sk" # replace with real secret key
endpoint_url = "endpoint_url" # replace with real endpoint_url
bucket_2 = "bucket_2" # replace with real bucket
ak_2 = "ak_2" # replace with real access key
sk_2 = "sk_2" # replace with real secret key
endpoint_url_2 = "endpoint_url_2" # replace with real endpoint_url
test_prefix = "test/unittest"
multi_bucket_s3_writer1 = MultiBucketS3DataWriter(
f"{bucket}/{test_prefix}",
[
S3Config(
bucket_name=bucket, access_key=ak, secret_key=sk, endpoint_url=endpoint_url
),
S3Config(
bucket_name=bucket_2,
access_key=ak_2,
secret_key=sk_2,
endpoint_url=endpoint_url_2,
),
],
)
## will write 123 to s3://{bucket}/{test_prefix}/abc
multi_bucket_s3_writer1.write_string("abc", "123")
## will write 123 to s3://{bucket}/{test_prefix}/abc
multi_bucket_s3_writer1.write("abc", "123".encode())
## will write 123 to s3://{bucket}/{test_prefix}/efg
multi_bucket_s3_writer1.write(f"s3://{bucket}/{test_prefix}/efg", "123".encode())
## will write 123 to s3://{bucket_2}/{test_prefix}/abc
multi_bucket_s3_writer1.write(f's3://{bucket_2}/{test_prefix}/abc', '123'.encode())
# s3 related
s3_writer1 = S3DataWriter(test_prefix, bucket, ak, sk, endpoint_url)
## will write 123 to s3://{bucket}/{test_prefix}/abc
s3_writer1.write("abc", "123".encode())
## will write 123 to s3://{bucket}/{test_prefix}/abc
s3_writer1.write_string("abc", "123")
## will write 123 to s3://{bucket}/efg
s3_writer1.write(f"s3://{bucket}/efg", "123".encode())
Check :doc:`../../api/data_reader_writer` for more details
Dataset
===========
Import Classes
-----------------
Dataset
^^^^^^^^
Each PDF or set of images forms one ``Dataset``. PDFs fall into two categories, :ref:`digital_method_section` and :ref:`ocr_method_section`.
Images produce an ``ImageDataset``, a subclass of ``Dataset``, while PDF files produce a ``PymuDocDataset``.
The difference between ``ImageDataset`` and ``PymuDocDataset`` is that ``ImageDataset`` only supports the ``OCR`` parse method,
while ``PymuDocDataset`` supports both ``OCR`` and ``TXT``.
.. note::
In fact, some PDFs are generated from images, which means they cannot support the ``TXT`` method. Currently, it is up to the user to ensure this does not happen.
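A minimal construction sketch, assuming ``PymuDocDataset`` and ``ImageDataset`` accept the raw bytes of the source file (see :doc:`../../api/dataset` for the authoritative signatures):

.. code:: python

    from magic_pdf.data.dataset import ImageDataset, PymuDocDataset

    # a PymuDocDataset built from the raw bytes of a PDF file
    with open("some.pdf", "rb") as f:  # replace with a real pdf file
        pdf_ds = PymuDocDataset(f.read())

    # an ImageDataset built from the raw bytes of an image file
    with open("some.png", "rb") as f:  # replace with a real image file
        img_ds = ImageDataset(f.read())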
PDF Parse Methods
------------------
.. _ocr_method_section:
OCR
^^^^
Extracts characters via ``Optical Character Recognition`` (OCR) technology.
.. _digital_method_section:
TXT
^^^^^^^^
Extracts characters via a third-party library; currently we use ``pymupdf``.
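In practice you rarely pick the parse method by hand; the dataset can classify itself, as in this short sketch (mirroring the usage shown in the quick-start examples):

.. code:: python

    from magic_pdf.config.enums import SupportedPdfParseMethod

    # ds is a PymuDocDataset built as in the Dataset section above
    if ds.classify() == SupportedPdfParseMethod.OCR:
        print("use the OCR parse method")
    else:
        print("use the TXT parse method")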
Check :doc:`../../api/dataset` for more details
IO
===
This module reads and writes bytes from different media. Currently we provide ``S3Reader`` and ``S3Writer`` for AWS S3-compatible media,
and ``HttpReader`` and ``HttpWriter`` for remote HTTP files. You can implement new classes to meet the needs of your own scenarios
if MinerU does not provide a suitable class. Implementing a new class is easy; the only requirement is to inherit from
``IOReader`` or ``IOWriter``.
.. code:: python
class SomeReader(IOReader):
def read(self, path: str) -> bytes:
pass
def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
pass
class SomeWriter(IOWriter):
def write(self, path: str, data: bytes) -> None:
pass
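For example, a local-filesystem implementation might look like the following sketch. The import path of ``IOReader``/``IOWriter`` is assumed here to be ``magic_pdf.data.io.base``; adjust it to wherever the base classes live in your installation.

.. code:: python

    from pathlib import Path

    from magic_pdf.data.io.base import IOReader, IOWriter  # import path assumed

    class LocalReader(IOReader):
        """Illustrative reader that serves bytes from the local filesystem."""

        def read(self, path: str) -> bytes:
            return Path(path).read_bytes()

        def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
            # read `limit` bytes starting at `offset`; -1 means "until the end"
            with open(path, "rb") as f:
                f.seek(offset)
                return f.read() if limit == -1 else f.read(limit)

    class LocalWriter(IOWriter):
        """Illustrative writer that stores bytes on the local filesystem."""

        def write(self, path: str, data: bytes) -> None:
            Path(path).write_bytes(data)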
Check :doc:`../../api/io` for more details
read_api
==========
Reads content from a file or directory to create a ``Dataset``. Currently we provide several functions that cover common scenarios.
If you have a new scenario that is common to most users, you can post it on the official GitHub issues with a detailed description.
It is also easy to implement your own read-related functions.
Important Functions
-------------------
read_jsonl
^^^^^^^^^^^^^^^^
Reads content from a JSONL file, which may be located on the local machine or on remote S3. If you want to know more about JSONL, please go to :doc:`../../additional_notes/glossary`.
.. code:: python
from magic_pdf.data.read_api import *
from magic_pdf.data.data_reader_writer import MultiBucketS3DataReader
from magic_pdf.data.schemas import S3Config
# read jsonl from local machine
datasets = read_jsonl("tt.jsonl", None) # replace with real jsonl file
# read jsonl from remote s3
bucket = "bucket_1" # replace with real s3 bucket
ak = "access_key_1" # replace with real s3 access key
sk = "secret_key_1" # replace with real s3 secret key
endpoint_url = "endpoint_url_1" # replace with real s3 endpoint url
bucket_2 = "bucket_2" # replace with real s3 bucket
ak_2 = "access_key_2" # replace with real s3 access key
sk_2 = "secret_key_2" # replace with real s3 secret key
endpoint_url_2 = "endpoint_url_2" # replace with real s3 endpoint url
s3configs = [
S3Config(
bucket_name=bucket, access_key=ak, secret_key=sk, endpoint_url=endpoint_url
),
S3Config(
bucket_name=bucket_2,
access_key=ak_2,
secret_key=sk_2,
endpoint_url=endpoint_url_2,
),
]
s3_reader = MultiBucketS3DataReader(bucket, s3configs)
datasets = read_jsonl(f"s3://bucket_1/tt.jsonl", s3_reader) # replace with real s3 jsonl file
read_local_pdfs
^^^^^^^^^^^^^^^^^
Read PDFs from a file path or directory.
.. code:: python
from magic_pdf.data.read_api import *
# read pdf path
datasets = read_local_pdfs("tt.pdf")
# read pdfs under directory
datasets = read_local_pdfs("pdfs/")
read_local_images
^^^^^^^^^^^^^^^^^^^
Read images from a file path or directory.
.. code:: python
from magic_pdf.data.read_api import *
# read from image path
datasets = read_local_images("tt.png") # replace with real file path
# read files under the directory whose suffix is in the suffixes list
datasets = read_local_images("images/", suffixes=[".png", ".jpg"]) # replace with real directory
read_local_office
^^^^^^^^^^^^^^^^^^^^
Read MS Office files from a file path or directory.
.. code:: python
from magic_pdf.data.read_api import *
# read from an MS Office file path
datasets = read_local_office("tt.doc") # replace with real file path
# read MS Office files under the directory
datasets = read_local_office("docs/") # replace with real directory
Check :doc:`../../api/read_api` for more details
Inference Result
==================
.. admonition:: Tip
:class: tip
Please first navigate to :doc:`tutorial/pipeline` to get an initial understanding of how the pipeline works; this will help in understanding the content of this section.
The **InferenceResult** class is a container for storing model inference results and implements a series of methods related to these results, such as ``draw_model`` and ``dump_model``.
Check :doc:`../api/model_operators` for more details about **InferenceResult**.
Model Inference Result
-----------------------
Structure Definition
^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: python
from pydantic import BaseModel, Field
from enum import IntEnum
class CategoryType(IntEnum):
title = 0 # Title
plain_text = 1 # Text
abandon = 2 # Includes headers, footers, page numbers, and page annotations
figure = 3 # Image
figure_caption = 4 # Image description
table = 5 # Table
table_caption = 6 # Table description
table_footnote = 7 # Table footnote
isolate_formula = 8 # Block formula
formula_caption = 9 # Formula label
embedding = 13 # Inline formula
isolated = 14 # Block formula
text = 15 # OCR recognition result
class PageInfo(BaseModel):
page_no: int = Field(description="Page number, the first page is 0", ge=0)
height: int = Field(description="Page height", gt=0)
width: int = Field(description="Page width", ge=0)
class ObjectInferenceResult(BaseModel):
category_id: CategoryType = Field(description="Category", ge=0)
poly: list[float] = Field(description="Quadrilateral coordinates, representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively")
score: float = Field(description="Confidence of the inference result")
latex: str | None = Field(description="LaTeX parsing result", default=None)
html: str | None = Field(description="HTML parsing result", default=None)
class PageInferenceResults(BaseModel):
layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results", ge=0)
page_info: PageInfo = Field(description="Page metadata")
Example
^^^^^^^^^^^
.. code:: json
[
{
"layout_dets": [
{
"category_id": 2,
"poly": [
99.1906967163086,
100.3119125366211,
730.3707885742188,
100.3119125366211,
730.3707885742188,
245.81326293945312,
99.1906967163086,
245.81326293945312
],
"score": 0.9999997615814209
}
],
"page_info": {
"page_no": 0,
"height": 2339,
"width": 1654
}
},
{
"layout_dets": [
{
"category_id": 5,
"poly": [
99.13092803955078,
2210.680419921875,
497.3183898925781,
2210.680419921875,
497.3183898925781,
2264.78076171875,
99.13092803955078,
2264.78076171875
],
"score": 0.9999997019767761
}
],
"page_info": {
"page_no": 1,
"height": 2339,
"width": 1654
}
}
]
The format of the poly coordinates is [x0, y0, x1, y1, x2, y2, x3, y3],
representing the coordinates of the top-left, top-right, bottom-right,
and bottom-left points respectively. |Poly Coordinate Diagram|
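For reference, a small illustrative helper that collapses such a ``poly`` into an axis-aligned bounding box:

.. code:: python

    def poly_to_bbox(poly: list[float]) -> list[float]:
        """Convert [x0, y0, x1, y1, x2, y2, x3, y3] corner coordinates
        into an axis-aligned [xmin, ymin, xmax, ymax] box."""
        xs, ys = poly[0::2], poly[1::2]
        return [min(xs), min(ys), max(xs), max(ys)]

    # e.g. the first detection from the example above
    poly_to_bbox([99.19, 100.31, 730.37, 100.31, 730.37, 245.81, 99.19, 245.81])
    # -> [99.19, 100.31, 730.37, 245.81]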
Inference Result
-------------------------
.. code:: python
from magic_pdf.operators.models import InferenceResult
from magic_pdf.data.dataset import Dataset
dataset : Dataset = some_data_set # not real dataset
# The inference results of all pages, ordered by page number, are stored in a list; this list is MinerU's model inference result
model_inference_result: list[PageInferenceResults] = []  # PageInferenceResults as defined in "Structure Definition" above
inference_result = InferenceResult(model_inference_result, dataset)
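Once wrapped, the result can be visualized or persisted through the methods mentioned above. This is a sketch only: the exact signatures of ``draw_model`` and ``dump_model`` are documented in :doc:`../api/model_operators`; here ``draw_model`` is assumed to take an output path and ``dump_model`` a ``DataWriter`` plus a relative file name.

.. code:: python

    from magic_pdf.data.data_reader_writer import FileBasedDataWriter

    writer = FileBasedDataWriter("output")

    # draw the detected layout boxes over the source pages (output path assumed)
    inference_result.draw_model("output/some_model.pdf")

    # dump the raw inference results through a DataWriter (signature assumed)
    inference_result.dump_model(writer, "model.json")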
some_model.pdf
^^^^^^^^^^^^^^^^^^^^
.. figure:: ../_static/image/inference_result.png
.. |Poly Coordinate Diagram| image:: ../_static/image/poly.png
Installation
==============
.. toctree::
:maxdepth: 1
install/install
install/boost_with_cuda
install/download_model_weight_files
install/config
Boost with CUDA
================
If your device supports CUDA and meets the GPU requirements of the
mainline environment, you can use GPU acceleration. Please select the
appropriate guide based on your system:
- :ref:`ubuntu_22_04_lts_section`
- :ref:`windows_10_or_11_section`
.. _ubuntu_22_04_lts_section:
Ubuntu 22.04 LTS
-----------------
1. Check if NVIDIA Drivers Are Installed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: sh
nvidia-smi
If you see information similar to the following, it means that the
NVIDIA drivers are already installed, and you can skip Step 2.
.. note::
``CUDA Version`` should be >= 12.4. If the displayed version number is less than 12.4, please upgrade the driver.
.. code:: text
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07 Driver Version: 572.83 CUDA Version: 12.8 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3060 Ti WDDM | 00000000:01:00.0 On | N/A |
| 0% 51C P8 12W / 200W | 1489MiB / 8192MiB | 5% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
2. Install the Driver
~~~~~~~~~~~~~~~~~~~~~
If no driver is installed, use the following command:
.. code:: sh
sudo apt-get update
sudo apt-get install nvidia-driver-570-server
Install the proprietary driver and restart your computer after
installation.
.. code:: sh
reboot
3. Install Anaconda
~~~~~~~~~~~~~~~~~~~
If Anaconda is already installed, skip this step.
.. code:: sh
wget https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh
bash Anaconda3-2024.06-1-Linux-x86_64.sh
In the final step, enter ``yes``, close the terminal, and reopen it.
4. Create an Environment Using Conda
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Specify a Python version between 3.10 and 3.13.
.. code:: sh
conda create -n mineru 'python=3.12' -y
conda activate mineru
5. Install Applications
~~~~~~~~~~~~~~~~~~~~~~~
.. code:: sh
pip install -U "magic-pdf[full]"
.. admonition:: TIP
:class: tip
After installation, you can check the version of ``magic-pdf`` using the following command:
.. code:: sh
magic-pdf --version
6. Download Models
~~~~~~~~~~~~~~~~~~
Refer to detailed instructions on :doc:`download_model_weight_files`
7. Understand the Location of the Configuration File
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
After completing the `6. Download Models <#6-download-models>`__ step,
the script will automatically generate a ``magic-pdf.json`` file in the
user directory and configure the default model path. You can find the
``magic-pdf.json`` file in your user directory.
.. admonition:: TIP
:class: tip
The user directory for Linux is “/home/username”.
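An abbreviated, illustrative example of what the generated file might contain; the model path below is a placeholder, and the file actually written by the download script is authoritative:

.. code:: json

    {
        "models-dir": "/home/username/models",
        "device-mode": "cpu"
    }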
8. First Run
~~~~~~~~~~~~
Download a sample file from the repository and test it.
.. code:: sh
wget https://github.com/opendatalab/MinerU/raw/master/demo/pdfs/small_ocr.pdf
magic-pdf -p small_ocr.pdf -o ./output
9. Test CUDA Acceleration
~~~~~~~~~~~~~~~~~~~~~~~~~
If your graphics card has at least **8GB** of VRAM, follow these steps
to test CUDA acceleration:
1. Modify the value of ``"device-mode"`` in the ``magic-pdf.json``
configuration file located in your home directory.
.. code:: json
{
"device-mode": "cuda"
}
2. Test CUDA acceleration with the following command:
.. code:: sh
magic-pdf -p small_ocr.pdf -o ./output
.. _windows_10_or_11_section:
Windows 10/11
--------------
1. Install CUDA
~~~~~~~~~~~~~~~~~~~~~~~~~
You need to install a CUDA version that is compatible with torch's requirements. For details, please refer to the `official PyTorch website <https://pytorch.org/get-started/locally/>`__.
- CUDA 11.8 https://developer.nvidia.com/cuda-11-8-0-download-archive
- CUDA 12.4 https://developer.nvidia.com/cuda-12-4-0-download-archive
- CUDA 12.6 https://developer.nvidia.com/cuda-12-6-0-download-archive
- CUDA 12.8 https://developer.nvidia.com/cuda-12-8-0-download-archive
2. Install Anaconda
~~~~~~~~~~~~~~~~~~~
If Anaconda is already installed, you can skip this step.
Download link: https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Windows-x86_64.exe
3. Create an Environment Using Conda
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
::
conda create -n mineru 'python=3.12' -y
conda activate mineru
4. Install Applications
~~~~~~~~~~~~~~~~~~~~~~~
::
pip install -U "magic-pdf[full]"
.. admonition:: Tip
:class: tip
After installation, you can check the version of ``magic-pdf``:
.. code:: bash
magic-pdf --version
5. Download Models
~~~~~~~~~~~~~~~~~~
Refer to detailed instructions on :doc:`download_model_weight_files`
6. Understand the Location of the Configuration File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
After completing the `5. Download Models <#5-download-models>`__ step,
the script will automatically generate a ``magic-pdf.json`` file in the
user directory and configure the default model path. You can find the
``magic-pdf.json`` file in your user directory.
.. admonition:: Tip
:class: tip
The user directory for Windows is “C:/Users/username”.
7. First Run
~~~~~~~~~~~~
Download a sample file from the repository and test it.
.. code:: powershell
wget https://github.com/opendatalab/MinerU/raw/master/demo/pdfs/small_ocr.pdf -O small_ocr.pdf
magic-pdf -p small_ocr.pdf -o ./output
8. Test CUDA Acceleration
~~~~~~~~~~~~~~~~~~~~~~~~~
If your graphics card has at least 8GB of VRAM, follow these steps to
test CUDA-accelerated parsing performance.
1. **Overwrite the installation of torch and torchvision** with builds that support CUDA. (Please select the appropriate index-url based on your CUDA version. For more details, refer to the `PyTorch official website <https://pytorch.org/get-started/locally/>`__.)
.. code:: sh
pip install --force-reinstall torch torchvision --index-url https://download.pytorch.org/whl/cu124
2. **Modify the value of ``"device-mode"``** in the ``magic-pdf.json``
configuration file located in your user directory.
.. code:: json
{
"device-mode": "cuda"
}
3. **Run the following command to test CUDA acceleration**:
::
magic-pdf -p small_ocr.pdf -o ./output
Download Model Weight Files
==============================
Model downloads are divided into initial downloads and updates to the
model directory. Please refer to the corresponding documentation for
instructions on how to proceed.
Initial download of model files
-----------------------------------
1. Download the Model from Hugging Face
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use a Python script to download the model files from Hugging Face:
.. code:: bash
pip install huggingface_hub
wget https://github.com/opendatalab/MinerU/raw/master/scripts/download_models_hf.py -O download_models_hf.py
python download_models_hf.py
The Python script will automatically download the model files and
configure the model directory in the configuration file.
The configuration file can be found in the user directory, with the
filename ``magic-pdf.json``.
How to update models previously downloaded
----------------------------------------------
1. Models downloaded via Hugging Face or Model Scope
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you previously downloaded models via Hugging Face or Model Scope, you
can rerun the Python script used for the initial download. This will
automatically update the model directory to the latest version.
Quick Start
==============
Want to learn how to use MinerU in different scenarios? This page provides examples covering multiple use cases that may match your needs.
.. toctree::
:maxdepth: 1
quick_start/convert_pdf
quick_start/convert_image
quick_start/convert_ms_office
Convert Image
===============
Command Line
^^^^^^^^^^^^^
.. code:: bash

    # make sure the file has the correct suffix
    magic-pdf -p a.png -o output -m auto
API
^^^^^^
.. code:: python
import os
from magic_pdf.data.data_reader_writer import FileBasedDataWriter
from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze
from magic_pdf.data.read_api import read_local_images
# prepare env
local_image_dir, local_md_dir = "output/images", "output"
image_dir = str(os.path.basename(local_image_dir))
os.makedirs(local_image_dir, exist_ok=True)
image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(
local_md_dir
)
# proc
## Create Dataset Instance
input_file = "some_image.jpg" # replace with real image file
input_file_name = input_file.split(".")[0]
ds = read_local_images(input_file)[0]
# ocr mode
ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(
md_writer, f"{input_file_name}.md", image_dir
)
Convert Doc
=============
.. admonition:: Warning
:class: tip
When processing MS-Office files, we first use third-party software to convert the MS-Office files to PDF.
For certain MS-Office files, the quality of the converted PDF files may not be very high, which can affect the quality of the final output.
Command Line
^^^^^^^^^^^^^
.. code:: bash

    # replace with a real MS Office file; MS-DOC, MS-DOCX, MS-PPT and MS-PPTX are currently supported
    magic-pdf -p a.doc -o output -m auto
API
^^^^^^^^
.. code:: python
import os
from magic_pdf.data.data_reader_writer import FileBasedDataWriter, FileBasedDataReader
from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze
from magic_pdf.data.read_api import read_local_office
from magic_pdf.config.enums import SupportedPdfParseMethod
# prepare env
local_image_dir, local_md_dir = "output/images", "output"
image_dir = str(os.path.basename(local_image_dir))
os.makedirs(local_image_dir, exist_ok=True)
image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(
local_md_dir
)
# proc
## Create Dataset Instance
input_file = "some_doc.doc" # replace with real ms-office file, we support MS-DOC, MS-DOCX, MS-PPT, MS-PPTX now
input_file_name = input_file.split(".")[0]
ds = read_local_office(input_file)[0]
## inference
if ds.classify() == SupportedPdfParseMethod.OCR:
ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(
md_writer, f"{input_file_name}.md", image_dir)
else:
ds.apply(doc_analyze, ocr=False).pipe_txt_mode(image_writer).dump_md(
md_writer, f"{input_file_name}.md", image_dir)