Data
=========
.. toctree::
:maxdepth: 2
data/dataset
data/read_api
data/data_reader_writer
data/io
Data Reader Writer
====================
Aims to read or write bytes from different media. You can implement new classes to meet the needs of your own
scenarios if MinerU does not provide a suitable class. Implementing a new class is easy; the only requirement is to inherit from
``DataReader`` or ``DataWriter``.
.. code:: python
class SomeReader(DataReader):
def read(self, path: str) -> bytes:
pass
def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
pass
class SomeWriter(DataWriter):
def write(self, path: str, data: bytes) -> None:
pass
def write_string(self, path: str, data: str) -> None:
pass
Readers may be curious about the difference between :doc:`io` and this section, since the two look very similar at first glance.
:doc:`io` provides fundamental IO functions, while this section thinks more at the application level. Users can build their own classes to meet
the needs of their applications, and those classes may share the same underlying IO functions. That is why we have :doc:`io`.
Important Classes
-----------------
.. code:: python
class FileBasedDataReader(DataReader):
def __init__(self, parent_dir: str = ''):
pass
class FileBasedDataWriter(DataWriter):
def __init__(self, parent_dir: str = '') -> None:
pass
Class ``FileBasedDataReader`` is initialized with a single parameter, ``parent_dir``. Every method that ``FileBasedDataReader`` provides behaves as follows.
Features:
#. reading from an absolute path ignores ``parent_dir`` and reads the content from that file.
#. reading from a relative path first joins the path with ``parent_dir``, then reads the content from the merged path.
.. note::
   ``FileBasedDataWriter`` shares the same behavior as ``FileBasedDataReader``.
.. code:: python
class MultiS3Mixin:
def __init__(self, default_prefix: str, s3_configs: list[S3Config]):
pass
class MultiBucketS3DataReader(DataReader, MultiS3Mixin):
pass
Every read-related method that ``MultiBucketS3DataReader`` provides behaves as follows.
Features:
#. reading an object via a full s3-format path, for example ``s3://test_bucket/test_object``, ignores ``default_prefix``.
#. reading an object via a relative path first joins the path with ``default_prefix``, from which the leading ``bucket_name`` is trimmed to select the bucket, then reads the content. ``bucket_name`` is the first element of the result of splitting ``default_prefix`` with the delimiter ``/``.
.. note::
   ``MultiBucketS3DataWriter`` shares the same behavior as ``MultiBucketS3DataReader``.
.. code:: python
class S3DataReader(MultiBucketS3DataReader):
pass
``S3DataReader`` is built on top of ``MultiBucketS3DataReader`` but supports only a single bucket; the same holds for ``S3DataWriter``.
Read Examples
-------------
.. code:: python
import os
from magic_pdf.data.data_reader_writer import *
from magic_pdf.data.data_reader_writer import MultiBucketS3DataReader
from magic_pdf.data.schemas import S3Config
# file based related
file_based_reader1 = FileBasedDataReader('')
## will read file abc
file_based_reader1.read('abc')
file_based_reader2 = FileBasedDataReader('/tmp')
## will read /tmp/abc
file_based_reader2.read('abc')
## will read /tmp/logs/message.txt
file_based_reader2.read('/tmp/logs/message.txt')
   # multi bucket s3 related
bucket = "bucket" # replace with real bucket
ak = "ak" # replace with real access key
sk = "sk" # replace with real secret key
endpoint_url = "endpoint_url" # replace with real endpoint_url
bucket_2 = "bucket_2" # replace with real bucket
ak_2 = "ak_2" # replace with real access key
sk_2 = "sk_2" # replace with real secret key
endpoint_url_2 = "endpoint_url_2" # replace with real endpoint_url
test_prefix = 'test/unittest'
multi_bucket_s3_reader1 = MultiBucketS3DataReader(f"{bucket}/{test_prefix}", [S3Config(
bucket_name=bucket, access_key=ak, secret_key=sk, endpoint_url=endpoint_url
),
S3Config(
bucket_name=bucket_2,
access_key=ak_2,
secret_key=sk_2,
endpoint_url=endpoint_url_2,
)])
## will read s3://{bucket}/{test_prefix}/abc
multi_bucket_s3_reader1.read('abc')
## will read s3://{bucket}/{test_prefix}/efg
multi_bucket_s3_reader1.read(f's3://{bucket}/{test_prefix}/efg')
## will read s3://{bucket2}/{test_prefix}/abc
multi_bucket_s3_reader1.read(f's3://{bucket_2}/{test_prefix}/abc')
# s3 related
s3_reader1 = S3DataReader(
test_prefix,
bucket,
ak,
sk,
endpoint_url
)
## will read s3://{bucket}/{test_prefix}/abc
s3_reader1.read('abc')
## will read s3://{bucket}/efg
s3_reader1.read(f's3://{bucket}/efg')
Write Examples
---------------
.. code:: python
import os
from magic_pdf.data.data_reader_writer import *
from magic_pdf.data.data_reader_writer import MultiBucketS3DataWriter
from magic_pdf.data.schemas import S3Config
# file based related
file_based_writer1 = FileBasedDataWriter("")
## will write 123 to abc
file_based_writer1.write("abc", "123".encode())
## will write 123 to abc
file_based_writer1.write_string("abc", "123")
file_based_writer2 = FileBasedDataWriter("/tmp")
## will write 123 to /tmp/abc
file_based_writer2.write_string("abc", "123")
## will write 123 to /tmp/logs/message.txt
file_based_writer2.write_string("/tmp/logs/message.txt", "123")
   # multi bucket s3 related
bucket = "bucket" # replace with real bucket
ak = "ak" # replace with real access key
sk = "sk" # replace with real secret key
endpoint_url = "endpoint_url" # replace with real endpoint_url
bucket_2 = "bucket_2" # replace with real bucket
ak_2 = "ak_2" # replace with real access key
sk_2 = "sk_2" # replace with real secret key
endpoint_url_2 = "endpoint_url_2" # replace with real endpoint_url
test_prefix = "test/unittest"
multi_bucket_s3_writer1 = MultiBucketS3DataWriter(
f"{bucket}/{test_prefix}",
[
S3Config(
bucket_name=bucket, access_key=ak, secret_key=sk, endpoint_url=endpoint_url
),
S3Config(
bucket_name=bucket_2,
access_key=ak_2,
secret_key=sk_2,
endpoint_url=endpoint_url_2,
),
],
)
## will write 123 to s3://{bucket}/{test_prefix}/abc
multi_bucket_s3_writer1.write_string("abc", "123")
## will write 123 to s3://{bucket}/{test_prefix}/abc
multi_bucket_s3_writer1.write("abc", "123".encode())
## will write 123 to s3://{bucket}/{test_prefix}/efg
multi_bucket_s3_writer1.write(f"s3://{bucket}/{test_prefix}/efg", "123".encode())
## will write 123 to s3://{bucket_2}/{test_prefix}/abc
multi_bucket_s3_writer1.write(f's3://{bucket_2}/{test_prefix}/abc', '123'.encode())
# s3 related
s3_writer1 = S3DataWriter(test_prefix, bucket, ak, sk, endpoint_url)
## will write 123 to s3://{bucket}/{test_prefix}/abc
s3_writer1.write("abc", "123".encode())
## will write 123 to s3://{bucket}/{test_prefix}/abc
s3_writer1.write_string("abc", "123")
## will write 123 to s3://{bucket}/efg
s3_writer1.write(f"s3://{bucket}/efg", "123".encode())
Check :doc:`../../api/data_reader_writer` for more details
Dataset
===========
Import Classes
-----------------
Dataset
^^^^^^^^
Each PDF or image forms one ``Dataset``. PDFs fall into two categories, :ref:`digital_method_section` and :ref:`ocr_method_section`.
Images yield an ``ImageDataset``, which is a subclass of ``Dataset``, while PDF files yield a ``PymuDocDataset``.
The difference between ``ImageDataset`` and ``PymuDocDataset`` is that ``ImageDataset`` only supports the ``OCR`` parse method,
while ``PymuDocDataset`` supports both ``OCR`` and ``TXT``.
.. note::
   In fact, some PDFs are generated from images, which means they cannot support the ``TXT`` method. Currently, it is up to the user to ensure such files are not parsed with ``TXT``.
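As a minimal sketch of the two dataset types, built via the :doc:`read_api` helpers described later (the file names are placeholders):

.. code:: python

   from magic_pdf.data.read_api import read_local_images, read_local_pdfs

   img_ds = read_local_images("some_image.png")[0]  # ImageDataset, OCR only
   pdf_ds = read_local_pdfs("some_pdf.pdf")[0]      # PymuDocDataset, OCR or TXT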
Pdf Parse Methods
------------------
.. _ocr_method_section:
OCR
^^^^
Extract characters via ``Optical Character Recognition`` techniques.
.. _digital_method_section:
TXT
^^^^^^^^
Extract characters via a third-party library; currently we use ``pymupdf``.
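As a rough illustration of ``TXT``-style extraction, independent of MinerU's own pipeline, ``pymupdf`` can read the embedded text directly (the file name is a placeholder):

.. code:: python

   import fitz  # pymupdf

   doc = fitz.open("some.pdf")  # replace with a real digital-born pdf
   print(doc[0].get_text())     # characters of the first page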
Check :doc:`../../api/dataset` for more details
IO
===
Aims to read or write bytes from different media. Currently we provide ``S3Reader`` and ``S3Writer`` for AWS S3-compatible media,
and ``HttpReader`` and ``HttpWriter`` for remote HTTP files. You can implement new classes to meet the needs of your own scenarios
if MinerU does not provide a suitable class. Implementing a new class is easy; the only requirement is to inherit from
``IOReader`` or ``IOWriter``.
.. code:: python
class SomeReader(IOReader):
def read(self, path: str) -> bytes:
pass
def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
pass
class SomeWriter(IOWriter):
def write(self, path: str, data: bytes) -> None:
pass
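For example, here is a minimal sketch of a custom HTTP reader. It assumes the third-party ``requests`` package, and the import path of ``IOReader`` should be checked against :doc:`../../api/io`:

.. code:: python

   import requests  # third-party dependency assumed by this sketch

   from magic_pdf.data.io.base import IOReader  # import path is an assumption

   class HttpRangeReader(IOReader):
       def read(self, path: str) -> bytes:
           return self.read_at(path)

       def read_at(self, path: str, offset: int = 0, limit: int = -1) -> bytes:
           headers = {}
           if offset > 0 or limit > 0:
               # request only the byte range [offset, offset + limit)
               end = '' if limit == -1 else str(offset + limit - 1)
               headers['Range'] = f'bytes={offset}-{end}'
           resp = requests.get(path, headers=headers)
           resp.raise_for_status()
           return resp.content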
Check :doc:`../../api/io` for more details
read_api
==========
Read content from a file or directory to create a ``Dataset``. Currently we provide several functions that cover common scenarios.
If you have a new scenario that is common to most users, you can post it on the official GitHub issues page with a detailed description.
It is also easy to implement your own read-related functions.
Important Functions
-------------------
read_jsonl
^^^^^^^^^^^^^^^^
Read the content from a jsonl file, which may be located on the local machine or on remote S3. If you want to know more about jsonl, please go to :doc:`../../additional_notes/glossary`.
.. code:: python
from magic_pdf.data.read_api import *
from magic_pdf.data.data_reader_writer import MultiBucketS3DataReader
from magic_pdf.data.schemas import S3Config
# read jsonl from local machine
datasets = read_jsonl("tt.jsonl", None) # replace with real jsonl file
# read jsonl from remote s3
bucket = "bucket_1" # replace with real s3 bucket
ak = "access_key_1" # replace with real s3 access key
sk = "secret_key_1" # replace with real s3 secret key
endpoint_url = "endpoint_url_1" # replace with real s3 endpoint url
bucket_2 = "bucket_2" # replace with real s3 bucket
ak_2 = "access_key_2" # replace with real s3 access key
sk_2 = "secret_key_2" # replace with real s3 secret key
endpoint_url_2 = "endpoint_url_2" # replace with real s3 endpoint url
s3configs = [
S3Config(
bucket_name=bucket, access_key=ak, secret_key=sk, endpoint_url=endpoint_url
),
S3Config(
bucket_name=bucket_2,
access_key=ak_2,
secret_key=sk_2,
endpoint_url=endpoint_url_2,
),
]
s3_reader = MultiBucketS3DataReader(bucket, s3configs)
   datasets = read_jsonl(f"s3://{bucket}/tt.jsonl", s3_reader)  # replace with real s3 jsonl file
read_local_pdfs
^^^^^^^^^^^^^^^^^
Read PDFs from a path or directory.
.. code:: python
from magic_pdf.data.read_api import *
# read pdf path
datasets = read_local_pdfs("tt.pdf")
# read pdfs under directory
datasets = read_local_pdfs("pdfs/")
read_local_images
^^^^^^^^^^^^^^^^^^^
Read images from a path or directory.
.. code:: python
from magic_pdf.data.read_api import *
# read from image path
datasets = read_local_images("tt.png") # replace with real file path
# read files from directory that endswith suffix in suffixes array
datasets = read_local_images("images/", suffixes=[".png", ".jpg"]) # replace with real directory
read_local_office
^^^^^^^^^^^^^^^^^^^^
Read MS-Office files from a path or directory.
.. code:: python
from magic_pdf.data.read_api import *
   # read from ms-office file path
datasets = read_local_office("tt.doc") # replace with real file path
# read files from directory that endswith suffix in suffixes array
datasets = read_local_office("docs/") # replace with real directory
Check :doc:`../../api/read_api` for more details
Inference Result
==================
.. admonition:: Tip
:class: tip
Please first navigate to :doc:`tutorial/pipeline` to get an initial understanding of how the pipeline works; this will help in understanding the content of this section.
The **InferenceResult** class is a container for storing model inference results and implements a series of methods related to these results, such as ``draw_model`` and ``dump_model``.
Check out :doc:`../api/model_operators` for more details about **InferenceResult**.
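As a hedged sketch of those two methods, assuming ``infer_result`` was built as in the "Inference Result" code below and ``md_writer`` is a ``FileBasedDataWriter`` (the output file names are illustrative; check the API reference for exact signatures):

.. code:: python

   # visualize the detected boxes on top of the source pdf
   infer_result.draw_model("some_pdf_model.pdf")

   # persist the raw inference results as json via a DataWriter
   infer_result.dump_model(md_writer, "some_pdf_model.json")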
Model Inference Result
-----------------------
Structure Definition
^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: python
from pydantic import BaseModel, Field
from enum import IntEnum
class CategoryType(IntEnum):
title = 0 # Title
plain_text = 1 # Text
abandon = 2 # Includes headers, footers, page numbers, and page annotations
figure = 3 # Image
figure_caption = 4 # Image description
table = 5 # Table
table_caption = 6 # Table description
table_footnote = 7 # Table footnote
isolate_formula = 8 # Block formula
formula_caption = 9 # Formula label
embedding = 13 # Inline formula
isolated = 14 # Block formula
text = 15 # OCR recognition result
class PageInfo(BaseModel):
page_no: int = Field(description="Page number, the first page is 0", ge=0)
height: int = Field(description="Page height", gt=0)
width: int = Field(description="Page width", ge=0)
class ObjectInferenceResult(BaseModel):
category_id: CategoryType = Field(description="Category", ge=0)
poly: list[float] = Field(description="Quadrilateral coordinates, representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively")
score: float = Field(description="Confidence of the inference result")
latex: str | None = Field(description="LaTeX parsing result", default=None)
html: str | None = Field(description="HTML parsing result", default=None)
class PageInferenceResults(BaseModel):
layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results", ge=0)
page_info: PageInfo = Field(description="Page metadata")
Example
^^^^^^^^^^^
.. code:: json
[
{
"layout_dets": [
{
"category_id": 2,
"poly": [
99.1906967163086,
100.3119125366211,
730.3707885742188,
100.3119125366211,
730.3707885742188,
245.81326293945312,
99.1906967163086,
245.81326293945312
],
"score": 0.9999997615814209
}
],
"page_info": {
"page_no": 0,
"height": 2339,
"width": 1654
}
},
{
"layout_dets": [
{
"category_id": 5,
"poly": [
99.13092803955078,
2210.680419921875,
497.3183898925781,
2210.680419921875,
497.3183898925781,
2264.78076171875,
99.13092803955078,
2264.78076171875
],
"score": 0.9999997019767761
}
],
"page_info": {
"page_no": 1,
"height": 2339,
"width": 1654
}
}
]
The format of the poly coordinates is [x0, y0, x1, y1, x2, y2, x3, y3],
representing the coordinates of the top-left, top-right, bottom-right,
and bottom-left points respectively. |Poly Coordinate Diagram|
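If an axis-aligned box is needed, a small helper (not part of MinerU) can collapse a poly into ``[xmin, ymin, xmax, ymax]``:

.. code:: python

   def poly_to_bbox(poly: list[float]) -> list[float]:
       """Collapse [x0, y0, ..., x3, y3] into [xmin, ymin, xmax, ymax]."""
       xs, ys = poly[0::2], poly[1::2]
       return [min(xs), min(ys), max(xs), max(ys)]

   # e.g. the first detection in the example above
   print(poly_to_bbox([99.19, 100.31, 730.37, 100.31, 730.37, 245.81, 99.19, 245.81]))
   # [99.19, 100.31, 730.37, 245.81]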
Inference Result
-------------------------
.. code:: python
from magic_pdf.operators.models import InferenceResult
from magic_pdf.data.dataset import Dataset
dataset : Dataset = some_data_set # not real dataset
# The inference results of all pages, ordered by page number, are stored in a list as the inference results of MinerU
model_inference_result: list[PageInferenceResults] = []
   inference_result = InferenceResult(model_inference_result, dataset)
some_model.pdf
^^^^^^^^^^^^^^^^^^^^
.. figure:: ../_static/image/inference_result.png
.. |Poly Coordinate Diagram| image:: ../_static/image/poly.png
Installation
==============
.. toctree::
:maxdepth: 1
install/install
   install/boost_with_cuda
install/download_model_weight_files
install/config
Boost With Cuda
================
If your device supports CUDA and meets the GPU requirements of the
mainline environment, you can use GPU acceleration. Please select the
appropriate guide based on your system:
- :ref:`ubuntu_22_04_lts_section`
- :ref:`windows_10_or_11_section`
.. _ubuntu_22_04_lts_section:
Ubuntu 22.04 LTS
-----------------
1. Check if NVIDIA Drivers Are Installed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: sh
nvidia-smi
If you see information similar to the following, it means that the
NVIDIA drivers are already installed, and you can skip Step 2.
.. note::
   ``CUDA Version`` should be >= 12.4. If the displayed version number is less than 12.4, please upgrade the driver.
.. code:: text
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07 Driver Version: 572.83 CUDA Version: 12.8 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3060 Ti WDDM | 00000000:01:00.0 On | N/A |
| 0% 51C P8 12W / 200W | 1489MiB / 8192MiB | 5% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
2. Install the Driver
~~~~~~~~~~~~~~~~~~~~~
If no driver is installed, use the following command:
.. code:: sh
sudo apt-get update
sudo apt-get install nvidia-driver-570-server
Install the proprietary driver and restart your computer after
installation.
.. code:: sh
reboot
3. Install Anaconda
~~~~~~~~~~~~~~~~~~~
If Anaconda is already installed, skip this step.
.. code:: sh
wget https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh
bash Anaconda3-2024.06-1-Linux-x86_64.sh
In the final step, enter ``yes``, close the terminal, and reopen it.
4. Create an Environment Using Conda
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Specify Python version 3.10~3.13.
.. code:: sh
conda create -n mineru 'python=3.12' -y
conda activate mineru
5. Install Applications
~~~~~~~~~~~~~~~~~~~~~~~
.. code:: sh
pip install -U magic-pdf[full]
.. admonition:: TIP
:class: tip
After installation, you can check the version of ``magic-pdf`` using the following command:
.. code:: sh
magic-pdf --version
6. Download Models
~~~~~~~~~~~~~~~~~~
Refer to detailed instructions on :doc:`download_model_weight_files`
7. Understand the Location of the Configuration File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
After completing the `6. Download Models <#6-download-models>`__ step,
the script will automatically generate a ``magic-pdf.json`` file in the
user directory and configure the default model path. You can find the
``magic-pdf.json`` file in your user directory.
.. admonition:: TIP
:class: tip
The user directory for Linux is “/home/username”.
8. First Run
~~~~~~~~~~~~
Download a sample file from the repository and test it.
.. code:: sh
wget https://github.com/opendatalab/MinerU/raw/master/demo/pdfs/small_ocr.pdf
magic-pdf -p small_ocr.pdf -o ./output
9. Test CUDA Acceleration
~~~~~~~~~~~~~~~~~~~~~~~~~
If your graphics card has at least **8GB** of VRAM, follow these steps
to test CUDA acceleration:
1. Modify the value of ``"device-mode"`` in the ``magic-pdf.json``
configuration file located in your home directory.
.. code:: json
{
"device-mode": "cuda"
}
2. Test CUDA acceleration with the following command:
.. code:: sh
magic-pdf -p small_ocr.pdf -o ./output
.. _windows_10_or_11_section:
Windows 10/11
--------------
1. Install CUDA
~~~~~~~~~~~~~~~~~~~~~~~~~
You need to install a CUDA version that is compatible with torch's requirements. For details, please refer to the `official PyTorch website <https://pytorch.org/get-started/locally/>`__.
- CUDA 11.8 https://developer.nvidia.com/cuda-11-8-0-download-archive
- CUDA 12.4 https://developer.nvidia.com/cuda-12-4-0-download-archive
- CUDA 12.6 https://developer.nvidia.com/cuda-12-6-0-download-archive
- CUDA 12.8 https://developer.nvidia.com/cuda-12-8-0-download-archive
2. Install Anaconda
~~~~~~~~~~~~~~~~~~~
If Anaconda is already installed, you can skip this step.
Download link: https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Windows-x86_64.exe
3. Create an Environment Using Conda
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
::
conda create -n mineru 'python=3.12' -y
conda activate mineru
4. Install Applications
~~~~~~~~~~~~~~~~~~~~~~~
::
pip install -U magic-pdf[full]
.. admonition:: Tip
:class: tip
After installation, you can check the version of ``magic-pdf``:
.. code:: bash
magic-pdf --version
5. Download Models
~~~~~~~~~~~~~~~~~~
Refer to detailed instructions on :doc:`download_model_weight_files`
6. Understand the Location of the Configuration File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
After completing the `5. Download Models <#5-download-models>`__ step,
the script will automatically generate a ``magic-pdf.json`` file in the
user directory and configure the default model path. You can find the
``magic-pdf.json`` file in your user directory.
.. admonition:: Tip
:class: tip
The user directory for Windows is “C:/Users/username”.
7. First Run
~~~~~~~~~~~~
Download a sample file from the repository and test it.
.. code:: powershell
wget https://github.com/opendatalab/MinerU/raw/master/demo/pdfs/small_ocr.pdf -O small_ocr.pdf
magic-pdf -p small_ocr.pdf -o ./output
8. Test CUDA Acceleration
~~~~~~~~~~~~~~~~~~~~~~~~~
If your graphics card has at least 8GB of VRAM, follow these steps to
test CUDA-accelerated parsing performance.
1. **Overwrite the installation of torch and torchvision** with builds supporting CUDA. (Please select the appropriate index-url based on your CUDA version; for more details, refer to the `official PyTorch website <https://pytorch.org/get-started/locally/>`__.)
.. code:: sh
pip install --force-reinstall torch torchvision --index-url https://download.pytorch.org/whl/cu124
2. **Modify the value of ``"device-mode"``** in the ``magic-pdf.json``
configuration file located in your user directory.
.. code:: json
{
"device-mode": "cuda"
}
3. **Run the following command to test CUDA acceleration**:
::
magic-pdf -p small_ocr.pdf -o ./output
Config
=========
File **magic-pdf.json** is typically located in the ``${HOME}`` directory on a Linux system, or in the ``C:\Users\{username}`` directory on a Windows system.
.. admonition:: Tip
:class: tip
   You can override the default location of the config file via the following command:
export MINERU_TOOLS_CONFIG_JSON=new_magic_pdf.json
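A minimal sketch of the lookup this implies, with the fallback path mirroring the default location described above (treat the exact resolution logic as an assumption):

.. code:: python

   import os

   # environment variable wins; otherwise fall back to the user directory
   config_path = os.environ.get(
       "MINERU_TOOLS_CONFIG_JSON",
       os.path.join(os.path.expanduser("~"), "magic-pdf.json"),
   )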
magic-pdf.json
----------------
.. code:: json
{
"bucket_info":{
"bucket-name-1":["ak", "sk", "endpoint"],
"bucket-name-2":["ak", "sk", "endpoint"]
},
"models-dir":"/tmp/models",
"layoutreader-model-dir":"/tmp/layoutreader",
"device-mode":"cpu",
"layout-config": {
"model": "doclayout_yolo"
},
"formula-config": {
"mfd_model": "yolo_v8_mfd",
"mfr_model": "unimernet_small",
"enable": true
},
"table-config": {
"model": "rapid_table",
"enable": true,
"max_time": 400
},
"config_version": "1.0.0"
}
bucket_info
^^^^^^^^^^^^^^
Stores the access_key, secret_key, and endpoint of your AWS S3-compatible storage configuration.
Example:
.. code:: text
{
"image_bucket":[{access_key}, {secret_key}, {endpoint}],
"video_bucket":[{access_key}, {secret_key}, {endpoint}]
}
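These entries line up with the ``S3Config`` schema used in the data API docs; a sketch of the mapping (the dict below is illustrative):

.. code:: python

   from magic_pdf.data.schemas import S3Config

   # mirrors the bucket_info entry shown above; replace with real values
   bucket_info = {
       "image_bucket": ["ak", "sk", "endpoint"],
       "video_bucket": ["ak", "sk", "endpoint"],
   }
   s3_configs = [
       S3Config(bucket_name=name, access_key=ak, secret_key=sk, endpoint_url=endpoint)
       for name, (ak, sk, endpoint) in bucket_info.items()
   ]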
models-dir
^^^^^^^^^^^^
Stores the models downloaded from **huggingface** or **modelscope**. You do not need to modify this field if you downloaded the models using the scripts shipped with **MinerU**.
layoutreader-model-dir
^^^^^^^^^^^^^^^^^^^^^^^
Stores the models downloaded from **huggingface** or **modelscope**. You do not need to modify this field if you downloaded the models using the scripts shipped with **MinerU**.
device-mode
^^^^^^^^^^^^^^
This field has two options: **cpu** or **cuda**.
**cpu**: run inference on the CPU.
**cuda**: use CUDA to accelerate inference.
layout-config
^^^^^^^^^^^^^^^
.. code:: json
{
"model": "doclayout_yolo"
}
The layout model cannot be disabled at present.
formula-config
^^^^^^^^^^^^^^^^
.. code:: json
{
"mfd_model": "yolo_v8_mfd",
"mfr_model": "unimernet_small",
"enable": true
}
mfd_model
""""""""""
Specify the formula detection model, options are ['yolo_v8_mfd']
mfr_model
""""""""""
Specify the formula recognition model, options are ['unimernet_small']
Check `UniMERNet <https://github.com/opendatalab/UniMERNet>`_ for more details
enable
""""""""
On-off flag, options are [true, false]. **true** enables formula inference; **false** disables it.
table-config
^^^^^^^^^^^^^^^^
.. code:: json
{
"model": "rapid_table",
"enable": true,
"max_time": 400
}
model
""""""""
Specify the table inference model, options are ['rapid_table']
max_time
"""""""""
Since table recognition is a time-consuming process, we set a timeout period. If the process exceeds this time, the table recognition will be terminated.
enable
"""""""
On-off flag, options are [true, false]. **true** enables table inference; **false** disables it.
config_version
^^^^^^^^^^^^^^^^
The version of the config schema.
.. admonition:: Tip
:class: tip
Check `Config Schema <https://github.com/opendatalab/MinerU/blob/master/magic-pdf.template.json>`_ for the latest details
Download Model Weight Files
==============================
Model downloads are divided into initial downloads and updates to the
model directory. Please refer to the corresponding documentation for
instructions on how to proceed.
Initial download of model files
-------------------------------
1. Download the Model from Hugging Face
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use a Python Script to Download Model Files from Hugging Face
.. code:: bash
pip install huggingface_hub
wget https://github.com/opendatalab/MinerU/raw/master/scripts/download_models_hf.py -O download_models_hf.py
python download_models_hf.py
The Python script will automatically download the model files and
configure the model directory in the configuration file.
The configuration file can be found in the user directory, with the
filename ``magic-pdf.json``.
How to update models previously downloaded
-------------------------------------------
1. Models downloaded via Hugging Face or Model Scope
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you previously downloaded models via Hugging Face or Model Scope, you
can rerun the Python script used for the initial download. This will
automatically update the model directory to the latest version.
Install
===============================================================
If you encounter any installation issues, please first consult the :doc:`../../additional_notes/faq`.
If the parsing results are not as expected, refer to the :doc:`../../additional_notes/known_issues`.
You can also try the `online demo <https://www.modelscope.cn/studios/OpenDataLab/MinerU>`_ without installing anything.
.. admonition:: Warning
:class: tip
**Pre-installation Notice—Hardware and Software Environment Support**
To ensure the stability and reliability of the project, we only optimize
and test for specific hardware and software environments during
development. This ensures that users deploying and running the project
on recommended system configurations will get the best performance with
the fewest compatibility issues.
By focusing resources on the mainline environment, our team can more
efficiently resolve potential bugs and develop new features.
In non-mainline environments, due to the diversity of hardware and
software configurations, as well as third-party dependency compatibility
issues, we cannot guarantee 100% project availability. Therefore, for
users who wish to use this project in non-recommended environments, we
suggest carefully reading the documentation and FAQ first. Most issues
already have corresponding solutions in the FAQ. We also encourage
community feedback to help us gradually expand support.
.. raw:: html
<style>
table, th, td {
border: 1px solid black;
border-collapse: collapse;
}
</style>
<table>
<tr>
<td colspan="3" rowspan="2">Operating System</td>
</tr>
<tr>
<td>Linux after 2019</td>
<td>Windows 10 / 11</td>
<td>macOS 11+</td>
</tr>
<tr>
<td colspan="3">CPU</td>
<td>x86_64 / arm64</td>
   <td>x86_64 (ARM Windows not supported)</td>
<td>x86_64 / arm64</td>
</tr>
<tr>
<td colspan="3">Memory Requirements</td>
<td colspan="3">16GB or more, recommended 32GB+</td>
</tr>
<tr>
<td colspan="3">Storage Requirements</td>
<td colspan="3">20GB or more, with a preference for SSD</td>
</tr>
<tr>
<td colspan="3">Python Version</td>
<td colspan="3">3.10~3.13</td>
</tr>
<tr>
<td colspan="3">Nvidia Driver Version</td>
<td>latest (Proprietary Driver)</td>
<td>latest</td>
<td>None</td>
</tr>
<tr>
<td colspan="3">CUDA Environment</td>
<td colspan="2"><a href="https://pytorch.org/get-started/locally/">Refer to the PyTorch official website</a></td>
<td>None</td>
</tr>
<tr>
<td colspan="3">CANN Environment(NPU support)</td>
<td>8.0+(Ascend 910b)</td>
<td>None</td>
<td>None</td>
</tr>
<tr>
<td rowspan="2">GPU/MPS Hardware Support List</td>
<td colspan="2">GPU VRAM 6GB or more</td>
   <td colspan="2">All GPUs with Tensor Cores produced from Volta (2017) onwards.<br>
   More than 6GB VRAM</td>
<td rowspan="2">Apple silicon</td>
</tr>
</table>
Create an environment
---------------------------
.. code-block:: shell
conda create -n mineru 'python=3.12' -y
conda activate mineru
pip install -U "magic-pdf[full]"
Download model weight files
------------------------------
.. code-block:: shell
pip install huggingface_hub
wget https://github.com/opendatalab/MinerU/raw/master/scripts/download_models_hf.py -O download_models_hf.py
python download_models_hf.py
Install LibreOffice[Optional]
----------------------------------
This section is required for handling the **doc**, **docx**, **ppt**, and **pptx** file types. You can **skip** it if you do not need to process those file types.
Linux/macOS Platform
""""""""""""""""""""""
.. code::
apt-get/yum/brew install libreoffice
Windows Platform
""""""""""""""""""""
.. code::
   install libreoffice
   append "install_dir\LibreOffice\program" to the PATH environment variable
.. tip::
   MinerU is now installed. Check out :doc:`../usage/command_line` to convert your first PDF, **or** read the following sections for more details about installation.
Pipe Result
==============
.. admonition:: Tip
:class: tip
Please first navigate to :doc:`tutorial/pipeline` to get an initial understanding of how the pipeline works; this will help in understanding the content of this section.
The **PipeResult** class is a container for storing pipeline processing results and implements a series of methods related to these results, such as ``draw_layout`` and ``draw_span``.
Check out :doc:`../api/pipe_operators` for more details about **PipeResult**.
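As a hedged sketch of those two methods, assuming ``pipe_result`` was built as in the "Pipeline Result" code below (the output file names are illustrative; check the API reference for exact signatures):

.. code:: python

   # draw the layout boxes of every page onto a pdf for inspection
   pipe_result.draw_layout("some_pdf_layout.pdf")

   # draw the spans of every page onto a pdf for quality control
   pipe_result.draw_span("some_pdf_spans.pdf")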
Structure Definitions
-------------------------------
**some_pdf_middle.json**
+----------------+--------------------------------------------------------------+
| Field Name | Description |
| | |
+================+==============================================================+
| pdf_info | list, each element is a dict representing the parsing result |
| | of each PDF page, see the table below for details |
+----------------+--------------------------------------------------------------+
| \_ | ocr \| txt, used to indicate the mode used in this |
| parse_type | intermediate parsing state |
| | |
+----------------+--------------------------------------------------------------+
| \_version_name | string, indicates the version of magic-pdf used in this |
| | parsing |
| | |
+----------------+--------------------------------------------------------------+
**pdf_info**
Field structure description
+-------------------------+------------------------------------------------------------+
| Field | Description |
| Name | |
+=========================+============================================================+
| preproc_blocks | Intermediate result after PDF preprocessing, not yet |
| | segmented |
+-------------------------+------------------------------------------------------------+
| layout_bboxes | Layout segmentation results, containing layout direction |
| | (vertical, horizontal), and bbox, sorted by reading order |
+-------------------------+------------------------------------------------------------+
| page_idx | Page number, starting from 0 |
| | |
+-------------------------+------------------------------------------------------------+
| page_size | Page width and height |
| | |
+-------------------------+------------------------------------------------------------+
| \_layout_tree | Layout tree structure |
| | |
+-------------------------+------------------------------------------------------------+
| images | list, each element is a dict representing an img_block |
+-------------------------+------------------------------------------------------------+
| tables | list, each element is a dict representing a table_block |
+-------------------------+------------------------------------------------------------+
| interline_equation | list, each element is a dict representing an |
| | interline_equation_block |
| | |
+-------------------------+------------------------------------------------------------+
| discarded_blocks | List, block information returned by the model that needs |
| | to be dropped |
| | |
+-------------------------+------------------------------------------------------------+
| para_blocks | Result after segmenting preproc_blocks |
| | |
+-------------------------+------------------------------------------------------------+
In the above table, ``para_blocks`` is an array of dicts, each dict
representing a block structure. A block can support up to one level of
nesting.
**block**
The outer block is referred to as a first-level block, and the fields in
the first-level block include:
+------------------------+-------------------------------------------------------------+
| Field | Description |
| Name | |
+========================+=============================================================+
| type | Block type (table|image) |
+------------------------+-------------------------------------------------------------+
| bbox | Block bounding box coordinates |
+------------------------+-------------------------------------------------------------+
| blocks | list, each element is a dict representing a second-level |
| | block |
+------------------------+-------------------------------------------------------------+
There are only two types of first-level blocks: “table” and “image”. All
other blocks are second-level blocks.
The fields in a second-level block include:
+----------------------+----------------------------------------------------------------+
| Field | Description |
| Name | |
+======================+================================================================+
| | Block type |
| type | |
+----------------------+----------------------------------------------------------------+
| | Block bounding box coordinates |
| bbox | |
+----------------------+----------------------------------------------------------------+
| | list, each element is a dict representing a line, used to |
| lines | describe the composition of a line of information |
+----------------------+----------------------------------------------------------------+
Detailed explanation of second-level block types
================== ======================
type Description
================== ======================
image_body Main body of the image
image_caption Image description text
table_body Main body of the table
table_caption Table description text
table_footnote Table footnote
text Text block
title Title block
interline_equation Block formula
================== ======================
**line**
The field format of a line is as follows:
+---------------------+----------------------------------------------------------------+
| Field | Description |
| Name | |
+=====================+================================================================+
| | Bounding box coordinates of the line |
| bbox | |
+---------------------+----------------------------------------------------------------+
| spans | list, each element is a dict representing a span, used to |
| | describe the composition of the smallest unit |
+---------------------+----------------------------------------------------------------+
**span**
+---------------------+-----------------------------------------------------------+
| Field | Description |
| Name | |
+=====================+===========================================================+
| bbox | Bounding box coordinates of the span |
+---------------------+-----------------------------------------------------------+
| type | Type of the span |
+---------------------+-----------------------------------------------------------+
| content | Text spans use content, chart spans use img_path to store |
| \| | the actual text or screenshot path information |
| img_path | |
+---------------------+-----------------------------------------------------------+
The types of spans are as follows:
================== ==============
type Description
================== ==============
image Image
table Table
text Text
inline_equation Inline formula
interline_equation Block formula
================== ==============
**Summary**
A span is the smallest storage unit for all elements.
The elements stored within para_blocks are block information.
The block structure is as follows:
First-level block (if any) -> Second-level block -> Line -> Span
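A minimal sketch of walking this hierarchy, assuming ``middle`` is a dict loaded from a ``some_pdf_middle.json`` file (only text-bearing spans are collected):

.. code:: python

   def collect_text(middle: dict) -> list[str]:
       texts = []
       for page in middle["pdf_info"]:
           for block in page["para_blocks"]:
               # first-level table/image blocks nest second-level blocks;
               # other blocks are already second-level
               for sub in block.get("blocks", [block]):
                   for line in sub.get("lines", []):
                       for span in line["spans"]:
                           if "content" in span:
                               texts.append(span["content"])
       return texts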
.. _example-1:
example
^^^^^^^
.. code:: json
{
"pdf_info": [
{
"preproc_blocks": [
{
"type": "text",
"bbox": [
52,
61.956024169921875,
294,
82.99800872802734
],
"lines": [
{
"bbox": [
52,
61.956024169921875,
294,
72.0000228881836
],
"spans": [
{
"bbox": [
54.0,
61.956024169921875,
296.2261657714844,
72.0000228881836
],
"content": "dependent on the service headway and the reliability of the departure ",
"type": "text",
"score": 1.0
}
]
}
]
}
],
"layout_bboxes": [
{
"layout_bbox": [
52,
61,
294,
731
],
"layout_label": "V",
"sub_layout": []
}
],
"page_idx": 0,
"page_size": [
612.0,
792.0
],
"_layout_tree": [],
"images": [],
"tables": [],
"interline_equations": [],
"discarded_blocks": [],
"para_blocks": [
{
"type": "text",
"bbox": [
52,
61.956024169921875,
294,
82.99800872802734
],
"lines": [
{
"bbox": [
52,
61.956024169921875,
294,
72.0000228881836
],
"spans": [
{
"bbox": [
54.0,
61.956024169921875,
296.2261657714844,
72.0000228881836
],
"content": "dependent on the service headway and the reliability of the departure ",
"type": "text",
"score": 1.0
}
]
}
]
}
]
}
],
"_parse_type": "txt",
"_version_name": "0.6.1"
}
Pipeline Result
------------------
.. code:: python
from magic_pdf.pdf_parse_union_core_v2 import pdf_parse_union
from magic_pdf.operators.pipes import PipeResult
from magic_pdf.data.dataset import Dataset
   res = pdf_parse_union(*args, **kwargs)  # placeholder arguments; this block sketches the internal flow
res['_parse_type'] = PARSE_TYPE_OCR
res['_version_name'] = __version__
if 'lang' in kwargs and kwargs['lang'] is not None:
res['lang'] = kwargs['lang']
   dataset: Dataset = some_dataset  # not a real dataset
   pipe_result = PipeResult(res, dataset)
some_pdf_layout.pdf
~~~~~~~~~~~~~~~~~~~
Each page layout consists of one or more boxes. The number at the top
left of each box indicates its sequence number. Additionally, in
``layout.pdf``, different content blocks are highlighted with different
background colors.
.. figure:: ../_static/image/layout_example.png
:alt: layout example
layout example
some_pdf_spans.pdf
~~~~~~~~~~~~~~~~~~
All spans on the page are drawn with different colored line frames
according to the span type. This file can be used for quality control,
allowing for quick identification of issues such as missing text or
unrecognized inline formulas.
.. figure:: ../_static/image/spans_example.png
:alt: spans example
spans example
Quick Start
==============
Want to learn how to use MinerU in different scenarios? This page gives examples covering multiple use cases; pick the one that matches your needs.
.. toctree::
:maxdepth: 1
quick_start/convert_pdf
quick_start/convert_image
quick_start/convert_ms_office
Convert Image
===============
Command Line
^^^^^^^^^^^^^
.. code:: sh

   # make sure the file has the correct suffix
   magic-pdf -p a.png -o output -m auto
API
^^^^^^
.. code:: python
import os
from magic_pdf.data.data_reader_writer import FileBasedDataWriter
from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze
from magic_pdf.data.read_api import read_local_images
# prepare env
local_image_dir, local_md_dir = "output/images", "output"
image_dir = str(os.path.basename(local_image_dir))
os.makedirs(local_image_dir, exist_ok=True)
image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(
local_md_dir
)
# proc
## Create Dataset Instance
input_file = "some_image.jpg" # replace with real image file
input_file_name = input_file.split(".")[0]
ds = read_local_images(input_file)[0]
# ocr mode
ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(
md_writer, f"{input_file_name}.md", image_dir
)
Convert Doc
=============
.. admonition:: Warning
:class: tip
When processing MS-Office files, we first use third-party software to convert the MS-Office files to PDF.
For certain MS-Office files, the quality of the converted PDF files may not be very high, which can affect the quality of the final output.
Command Line
^^^^^^^^^^^^^
.. code:: sh

   # replace with a real ms-office file; we support MS-DOC, MS-DOCX, MS-PPT, MS-PPTX now
   magic-pdf -p a.doc -o output -m auto
API
^^^^^^^^
.. code:: python
import os
from magic_pdf.data.data_reader_writer import FileBasedDataWriter, FileBasedDataReader
from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze
from magic_pdf.data.read_api import read_local_office
from magic_pdf.config.enums import SupportedPdfParseMethod
# prepare env
local_image_dir, local_md_dir = "output/images", "output"
image_dir = str(os.path.basename(local_image_dir))
os.makedirs(local_image_dir, exist_ok=True)
image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(
local_md_dir
)
# proc
## Create Dataset Instance
input_file = "some_doc.doc" # replace with real ms-office file, we support MS-DOC, MS-DOCX, MS-PPT, MS-PPTX now
input_file_name = input_file.split(".")[0]
ds = read_local_office(input_file)[0]
## inference
if ds.classify() == SupportedPdfParseMethod.OCR:
ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(
md_writer, f"{input_file_name}.md", image_dir)
else:
ds.apply(doc_analyze, ocr=False).pipe_txt_mode(image_writer).dump_md(
md_writer, f"{input_file_name}.md", image_dir)
Convert PDF
============
Command Line
^^^^^^^^^^^^^
.. code:: sh

   # make sure the file has the correct suffix
   magic-pdf -p a.pdf -o output -m auto
API
^^^^^^
.. code:: python
import os
from magic_pdf.data.data_reader_writer import FileBasedDataWriter, FileBasedDataReader
from magic_pdf.data.dataset import PymuDocDataset
   from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze
   from magic_pdf.config.enums import SupportedPdfParseMethod
# args
pdf_file_name = "abc.pdf" # replace with the real pdf path
name_without_suff = pdf_file_name.split(".")[0]
# prepare env
local_image_dir, local_md_dir = "output/images", "output"
image_dir = str(os.path.basename(local_image_dir))
os.makedirs(local_image_dir, exist_ok=True)
image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(
local_md_dir
)
# read bytes
reader1 = FileBasedDataReader("")
pdf_bytes = reader1.read(pdf_file_name) # read the pdf content
# proc
## Create Dataset Instance
ds = PymuDocDataset(pdf_bytes)
## inference
if ds.classify() == SupportedPdfParseMethod.OCR:
ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(
md_writer, f"{name_without_suff}.md", image_dir
)
else:
ds.apply(doc_analyze, ocr=False).pipe_txt_mode(image_writer).dump_md(
md_writer, f"{name_without_suff}.md", image_dir
)
Tutorial
===========
From beginning to end, this shows how to use MinerU in a minimal project.
.. toctree::
:maxdepth: 1
tutorial/pipeline
Output File Description
=========================
After executing the ``magic-pdf`` command, in addition to outputting
files related to markdown, several other files unrelated to markdown
will also be generated. These files will be introduced one by one.
some_pdf_layout.pdf
~~~~~~~~~~~~~~~~~~~
Each page layout consists of one or more boxes. The number at the top
left of each box indicates its sequence number. Additionally, in
``layout.pdf``, different content blocks are highlighted with different
background colors.
.. figure:: ../../_static/image/layout_example.png
:alt: layout example
layout example
some_pdf_spans.pdf
~~~~~~~~~~~~~~~~~~
All spans on the page are drawn with different colored line frames
according to the span type. This file can be used for quality control,
allowing for quick identification of issues such as missing text or
unrecognized inline formulas.
.. figure:: ../../_static/image/spans_example.png
:alt: spans example
spans example
some_pdf_model.json
~~~~~~~~~~~~~~~~~~~
Structure Definition
^^^^^^^^^^^^^^^^^^^^
.. code:: python
from pydantic import BaseModel, Field
from enum import IntEnum
class CategoryType(IntEnum):
title = 0 # Title
plain_text = 1 # Text
abandon = 2 # Includes headers, footers, page numbers, and page annotations
figure = 3 # Image
figure_caption = 4 # Image description
table = 5 # Table
table_caption = 6 # Table description
table_footnote = 7 # Table footnote
isolate_formula = 8 # Block formula
formula_caption = 9 # Formula label
embedding = 13 # Inline formula
isolated = 14 # Block formula
text = 15 # OCR recognition result
class PageInfo(BaseModel):
page_no: int = Field(description="Page number, the first page is 0", ge=0)
height: int = Field(description="Page height", gt=0)
width: int = Field(description="Page width", ge=0)
class ObjectInferenceResult(BaseModel):
category_id: CategoryType = Field(description="Category", ge=0)
poly: list[float] = Field(description="Quadrilateral coordinates, representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively")
score: float = Field(description="Confidence of the inference result")
latex: str | None = Field(description="LaTeX parsing result", default=None)
html: str | None = Field(description="HTML parsing result", default=None)
class PageInferenceResults(BaseModel):
layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results", ge=0)
page_info: PageInfo = Field(description="Page metadata")
# The inference results of all pages, ordered by page number, are stored in a list as the inference results of MinerU
inference_result: list[PageInferenceResults] = []
The format of the poly coordinates is [x0, y0, x1, y1, x2, y2, x3, y3],
representing the coordinates of the top-left, top-right, bottom-right,
and bottom-left points respectively. |Poly Coordinate Diagram|
example
^^^^^^^
.. code:: json
[
{
"layout_dets": [
{
"category_id": 2,
"poly": [
99.1906967163086,
100.3119125366211,
730.3707885742188,
100.3119125366211,
730.3707885742188,
245.81326293945312,
99.1906967163086,
245.81326293945312
],
"score": 0.9999997615814209
}
],
"page_info": {
"page_no": 0,
"height": 2339,
"width": 1654
}
},
{
"layout_dets": [
{
"category_id": 5,
"poly": [
99.13092803955078,
2210.680419921875,
497.3183898925781,
2210.680419921875,
497.3183898925781,
2264.78076171875,
99.13092803955078,
2264.78076171875
],
"score": 0.9999997019767761
}
],
"page_info": {
"page_no": 1,
"height": 2339,
"width": 1654
}
}
]
some_pdf_middle.json
~~~~~~~~~~~~~~~~~~~~
+----------------+--------------------------------------------------------------+
| Field Name | Description |
| | |
+================+==============================================================+
| pdf_info | list, each element is a dict representing the parsing result |
| | of each PDF page, see the table below for details |
+----------------+--------------------------------------------------------------+
| \_ | ocr \| txt, used to indicate the mode used in this |
| parse_type | intermediate parsing state |
| | |
+----------------+--------------------------------------------------------------+
| \_version_name | string, indicates the version of magic-pdf used in this |
| | parsing |
| | |
+----------------+--------------------------------------------------------------+
**pdf_info**
Field structure description
+-------------------------+------------------------------------------------------------+
| Field | Description |
| Name | |
+=========================+============================================================+
| preproc_blocks | Intermediate result after PDF preprocessing, not yet |
| | segmented |
+-------------------------+------------------------------------------------------------+
| layout_bboxes | Layout segmentation results, containing layout direction |
| | (vertical, horizontal), and bbox, sorted by reading order |
+-------------------------+------------------------------------------------------------+
| page_idx | Page number, starting from 0 |
| | |
+-------------------------+------------------------------------------------------------+
| page_size | Page width and height |
| | |
+-------------------------+------------------------------------------------------------+
| \_layout_tree | Layout tree structure |
| | |
+-------------------------+------------------------------------------------------------+
| images | list, each element is a dict representing an img_block |
+-------------------------+------------------------------------------------------------+
| tables | list, each element is a dict representing a table_block |
+-------------------------+------------------------------------------------------------+
| interline_equation | list, each element is a dict representing an |
| | interline_equation_block |
| | |
+-------------------------+------------------------------------------------------------+
| discarded_blocks | List, block information returned by the model that needs |
| | to be dropped |
| | |
+-------------------------+------------------------------------------------------------+
| para_blocks | Result after segmenting preproc_blocks |
| | |
+-------------------------+------------------------------------------------------------+
In the above table, ``para_blocks`` is an array of dicts, each dict
representing a block structure. A block can support up to one level of
nesting.
**block**
The outer block is referred to as a first-level block, and the fields in
the first-level block include:
+------------------------+-------------------------------------------------------------+
| Field | Description |
| Name | |
+========================+=============================================================+
| type | Block type (table|image) |
+------------------------+-------------------------------------------------------------+
| bbox | Block bounding box coordinates |
+------------------------+-------------------------------------------------------------+
| blocks | list, each element is a dict representing a second-level |
| | block |
+------------------------+-------------------------------------------------------------+
There are only two types of first-level blocks: “table” and “image”. All
other blocks are second-level blocks.
The fields in a second-level block include:
+----------------------+----------------------------------------------------------------+
| Field | Description |
| Name | |
+======================+================================================================+
| | Block type |
| type | |
+----------------------+----------------------------------------------------------------+
| | Block bounding box coordinates |
| bbox | |
+----------------------+----------------------------------------------------------------+
| | list, each element is a dict representing a line, used to |
| lines | describe the composition of a line of information |
+----------------------+----------------------------------------------------------------+
Detailed explanation of second-level block types
================== ======================
type Description
================== ======================
image_body Main body of the image
image_caption Image description text
table_body Main body of the table
table_caption Table description text
table_footnote Table footnote
text Text block
title Title block
interline_equation Block formula
================== ======================
**line**
The field format of a line is as follows:
+---------------------+----------------------------------------------------------------+
| Field | Description |
| Name | |
+=====================+================================================================+
| | Bounding box coordinates of the line |
| bbox | |
+---------------------+----------------------------------------------------------------+
| spans | list, each element is a dict representing a span, used to |
| | describe the composition of the smallest unit |
+---------------------+----------------------------------------------------------------+
**span**
+---------------------+-----------------------------------------------------------+
| Field | Description |
| Name | |
+=====================+===========================================================+
| bbox | Bounding box coordinates of the span |
+---------------------+-----------------------------------------------------------+
| type | Type of the span |
+---------------------+-----------------------------------------------------------+
| content | Text spans use content, chart spans use img_path to store |
| \| | the actual text or screenshot path information |
| img_path | |
+---------------------+-----------------------------------------------------------+
The types of spans are as follows:
================== ==============
type Description
================== ==============
image Image
table Table
text Text
inline_equation Inline formula
interline_equation Block formula
================== ==============
**Summary**
A span is the smallest storage unit for all elements.
The elements stored within para_blocks are block information.
The block structure is as follows:
First-level block (if any) -> Second-level block -> Line -> Span
.. _example-1:
example
^^^^^^^
.. code:: json
{
"pdf_info": [
{
"preproc_blocks": [
{
"type": "text",
"bbox": [
52,
61.956024169921875,
294,
82.99800872802734
],
"lines": [
{
"bbox": [
52,
61.956024169921875,
294,
72.0000228881836
],
"spans": [
{
"bbox": [
54.0,
61.956024169921875,
296.2261657714844,
72.0000228881836
],
"content": "dependent on the service headway and the reliability of the departure ",
"type": "text",
"score": 1.0
}
]
}
]
}
],
"layout_bboxes": [
{
"layout_bbox": [
52,
61,
294,
731
],
"layout_label": "V",
"sub_layout": []
}
],
"page_idx": 0,
"page_size": [
612.0,
792.0
],
"_layout_tree": [],
"images": [],
"tables": [],
"interline_equations": [],
"discarded_blocks": [],
"para_blocks": [
{
"type": "text",
"bbox": [
52,
61.956024169921875,
294,
82.99800872802734
],
"lines": [
{
"bbox": [
52,
61.956024169921875,
294,
72.0000228881836
],
"spans": [
{
"bbox": [
54.0,
61.956024169921875,
296.2261657714844,
72.0000228881836
],
"content": "dependent on the service headway and the reliability of the departure ",
"type": "text",
"score": 1.0
}
]
}
]
}
]
}
],
"_parse_type": "txt",
"_version_name": "0.6.1"
}
.. |Poly Coordinate Diagram| image:: ../../_static/image/poly.png
Pipeline
==========
Minimal Example
^^^^^^^^^^^^^^^^^
.. code:: python
import os
from magic_pdf.data.data_reader_writer import FileBasedDataWriter, FileBasedDataReader
from magic_pdf.data.dataset import PymuDocDataset
from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze
# args
pdf_file_name = "abc.pdf" # replace with the real pdf path
name_without_suff = pdf_file_name.split(".")[0]
# prepare env
local_image_dir, local_md_dir = "output/images", "output"
image_dir = str(os.path.basename(local_image_dir))
os.makedirs(local_image_dir, exist_ok=True)
image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(
local_md_dir
)
# read bytes
reader1 = FileBasedDataReader("")
pdf_bytes = reader1.read(pdf_file_name) # read the pdf content
# proc
## Create Dataset Instance
ds = PymuDocDataset(pdf_bytes)
ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(md_writer, f"{name_without_suff}.md", image_dir)
Running the above code will result in the following
.. code:: bash
output/
├── abc.md
└── images
Excluding the setup of the environment, such as creating directories and importing dependencies, the actual code snippet for converting pdf to markdown is as follows
.. code:: python
# read bytes
reader1 = FileBasedDataReader("")
pdf_bytes = reader1.read(pdf_file_name) # read the pdf content
# proc
## Create Dataset Instance
ds = PymuDocDataset(pdf_bytes)
ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(md_writer, f"{name_without_suff}.md", image_dir)
``ds.apply(doc_analyze, ocr=True)`` generates an ``InferenceResult`` object. The ``InferenceResult`` object, when executing the ``pipe_ocr_mode`` method, produces a ``PipeResult`` object.
The ``PipeResult`` object, upon executing ``dump_md``, generates a ``markdown`` file at the specified location.
The pipeline execution process is illustrated in the following diagram
.. image:: ../../_static/image/pipeline.drawio.svg
.. raw:: html
<br> </br>
Currently, the process is divided into three stages: data, inference, and processing, which correspond to the ``Dataset``, ``InferenceResult``, and ``PipeResult`` entities in the diagram.
These stages are linked together through methods like ``apply``, ``doc_analyze``, or ``pipe_ocr_mode``.
.. admonition:: Tip
:class: tip
For more detailed information about ``Dataset``, ``InferenceResult``, and ``PipeResult``, please refer to :doc:`../../api/dataset`, :doc:`../../api/model_operators`, :doc:`../../api/pipe_operators`
Pipeline Composition
^^^^^^^^^^^^^^^^^^^^^
.. code:: python
class Dataset(ABC):
@abstractmethod
def apply(self, proc: Callable, *args, **kwargs):
"""Apply callable method which.
Args:
proc (Callable): invoke proc as follows:
proc(self, *args, **kwargs)
Returns:
Any: return the result generated by proc
"""
pass
class InferenceResult(InferenceResultBase):
def apply(self, proc: Callable, *args, **kwargs):
"""Apply callable method which.
Args:
proc (Callable): invoke proc as follows:
proc(inference_result, *args, **kwargs)
Returns:
Any: return the result generated by proc
"""
return proc(copy.deepcopy(self._infer_res), *args, **kwargs)
def pipe_ocr_mode(
self,
imageWriter: DataWriter,
start_page_id=0,
end_page_id=None,
debug_mode=False,
lang=None,
) -> PipeResult:
pass
class PipeResult:
def apply(self, proc: Callable, *args, **kwargs):
"""Apply callable method which.
Args:
proc (Callable): invoke proc as follows:
proc(pipeline_result, *args, **kwargs)
Returns:
Any: return the result generated by proc
"""
return proc(copy.deepcopy(self._pipe_res), *args, **kwargs)
The ``Dataset``, ``InferenceResult``, and ``PipeResult`` classes all have an ``apply`` method, which can be used to chain different stages of the computation.
As shown below, ``MinerU`` provides a set of methods to compose these classes.
.. code:: python
# proc
## Create Dataset Instance
ds = PymuDocDataset(pdf_bytes)
ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(md_writer, f"{name_without_suff}.md", image_dir)
Users can implement their own functions for chaining as needed. For example, a user could use the ``apply`` method to create a function that counts the number of pages in a ``pdf`` file.
.. code:: python
from magic_pdf.data.data_reader_writer import FileBasedDataReader
from magic_pdf.data.dataset import PymuDocDataset
# args
pdf_file_name = "abc.pdf" # replace with the real pdf path
# read bytes
reader1 = FileBasedDataReader("")
pdf_bytes = reader1.read(pdf_file_name) # read the pdf content
# proc
## Create Dataset Instance
ds = PymuDocDataset(pdf_bytes)
   def count_page(ds) -> int:
       return len(ds)
print("page number: ", ds.apply(count_page)) # will output the page count of `abc.pdf`
Usage
========
.. toctree::
:maxdepth: 1
usage/command_line
usage/api
usage/docker