## Overview

After executing the `mineru` command, in addition to the main markdown file output, multiple auxiliary files are generated for debugging, quality inspection, and further processing. These files include:

- **Visual debugging files**: Help users intuitively understand the document parsing process and results
- **Structured data files**: Contain detailed parsing data for secondary development

The following sections provide detailed descriptions of each file's purpose and format.

## Visual Debugging Files

### Layout Analysis File (layout.pdf)

Each page's layout consists of one or more bounding boxes. The number in the top-right corner of each box indicates the reading order. Additionally, different content blocks are highlighted with distinct background colors within the layout.pdf.

### Span Visualization File (spans.pdf)

> [!NOTE]
> Only applicable to the pipeline backend

All spans on the page are drawn with different colored line frames according to the span type. This file can be used for quality control, allowing for quick identification of issues such as missing text or unrecognized inline formulas.
## Structured Data Files

### Model Inference File (model.json)

> [!NOTE]
> Only applicable to the pipeline backend

```python
poly: list[float] = Field(description="Quadrilateral coordinates, representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively")

# The inference results of all pages, ordered by page number, are stored in a list as the inference results of MinerU
inference_result: list[PageInferenceResults] = []
```
The format of the `poly` coordinates is `[x0, y0, x1, y1, x2, y2, x3, y3]`:

- Represents coordinates of top-left, top-right, bottom-right, bottom-left points respectively
- Coordinate origin is at the top-left corner of the page
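For illustration, a `poly` quadrilateral can be collapsed into an axis-aligned bounding box; the helper below is a minimal sketch, not part of MinerU's API:

```python
def poly_to_bbox(poly: list[float]) -> tuple[float, float, float, float]:
    """Collapse [x0, y0, x1, y1, x2, y2, x3, y3] quadrilateral
    coordinates into an axis-aligned (x_min, y_min, x_max, y_max) box."""
    xs = poly[0::2]  # x0, x1, x2, x3
    ys = poly[1::2]  # y0, y1, y2, y3
    return min(xs), min(ys), max(xs), max(ys)

# For an axis-aligned block the four corners already form a rectangle:
print(poly_to_bbox([10.0, 20.0, 110.0, 20.0, 110.0, 70.0, 10.0, 70.0]))
# → (10.0, 20.0, 110.0, 70.0)
```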

#### Sample Data
```json
[
    ...
]
```
### VLM Output Results (model_output.txt)

> [!NOTE]
> Only applicable to VLM backend

This file contains the output of the VLM model, with each page's output separated by `----`. Each page's output consists of text blocks starting with `<|box_start|>` and ending with `<|md_end|>`.

The meaning of each field is as follows:
- `<|box_start|>x0 y0 x1 y1<|box_end|>`

  `x0 y0 x1 y1` are the coordinates of the block's bounding rectangle (its top-left and bottom-right points). The values are based on a normalized page size of 1000x1000.

- `<|ref_start|>type<|ref_end|>`

  `type` indicates the block type. Possible values are:
```json
{
"text":"Text",
"title":"Title",
"image":"Image",
"image_caption":"Image Caption",
"image_footnote":"Image Footnote",
"table":"Table",
"table_caption":"Table Caption",
"table_footnote":"Table Footnote",
"equation":"Interline Equation"
}
```
- `<|md_start|>Markdown content<|md_end|>`

  This field contains the Markdown content of the block. If `type` is `text`, the end of the text may contain the `<|txt_contd|>` tag, indicating that this block can be connected with the following `text` block(s). If `type` is `table`, the content is in `otsl` format and needs to be converted into HTML for rendering in Markdown.
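The tag names above come straight from this format; as a sketch of how one page might be parsed and its boxes rescaled from the normalized 1000x1000 space, the `parse_page` helper below is illustrative, not a MinerU API:

```python
import re

# Matches one block: box coordinates, block type, then Markdown content.
BLOCK_RE = re.compile(
    r"<\|box_start\|>(?P<box>.*?)<\|box_end\|>\s*"
    r"<\|ref_start\|>(?P<type>.*?)<\|ref_end\|>\s*"
    r"<\|md_start\|>(?P<md>.*?)<\|md_end\|>",
    re.DOTALL,
)

def parse_page(page_text: str, width: float, height: float):
    """Extract (type, bbox, md) dicts from one page of model_output.txt,
    scaling the normalized 1000x1000 coordinates back to the page size."""
    blocks = []
    for m in BLOCK_RE.finditer(page_text):
        x0, y0, x1, y1 = (float(v) for v in m.group("box").split())
        blocks.append({
            "type": m.group("type"),
            # rescale from the 1000x1000 normalized space
            "bbox": (x0 / 1000 * width, y0 / 1000 * height,
                     x1 / 1000 * width, y1 / 1000 * height),
            "md": m.group("md"),
        })
    return blocks

# Pages in the file are separated by `----`; one block as an example:
page = ("<|box_start|>100 50 900 120<|box_end|>"
        "<|ref_start|>title<|ref_end|>"
        "<|md_start|># A Title<|md_end|>")
print(parse_page(page, width=612, height=792))
```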
### Content List File (content_list.json)

#### Functionality

This file is a JSON array in which each element is a dict storing one readable content block of the document, in reading order. `content_list` can be viewed as a simplified version of `middle.json`: the content block types are mostly consistent with those in `middle.json`, but complex layout information is removed, leaving a flat structure that is easier for subsequent processing.
#### Content Types

The content has the following types:

| Type | Description |
|:-----|:------------|
| `image` | Image |
| `table` | Table |
| `text` | Text / Title |
| `equation` | Interline formula |

Please note that both titles and body text blocks in `content_list` are uniformly represented using the `text` type.

#### Text Level Identification

Text levels are distinguished through the `text_level` field:

- No `text_level` or `text_level: 0`: Body text
- `text_level: 1`: Level 1 heading
- `text_level: 2`: Level 2 heading
- And so on...

#### Common Fields

All content blocks include a `page_idx` field indicating the page number (starting from 0) where the block resides.
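The flat structure lends itself to simple downstream processing. As an illustration (the `outline` helper is a sketch, not part of MinerU), a heading outline can be recovered from the blocks using `text_level` and `page_idx`:

```python
import json

def outline(content_list: list[dict]) -> list[str]:
    """Return indented outline lines from content_list blocks.
    text_level >= 1 marks a heading; 0 or absent means body text."""
    lines = []
    for block in content_list:
        level = block.get("text_level", 0)
        if block.get("type") == "text" and level >= 1:
            lines.append("  " * (level - 1)
                         + f"- {block['text']} (p. {block['page_idx'] + 1})")
    return lines

# A real run would load the file produced by `mineru`, e.g.:
# content_list = json.load(open("some_pdf_content_list.json", encoding="utf-8"))
sample = [
    {"type": "text", "text": "A Title", "text_level": 1, "page_idx": 0},
    {"type": "text", "text": "Some body text.", "page_idx": 0},
    {"type": "text", "text": "Abstract", "text_level": 2, "page_idx": 0},
]
print("\n".join(outline(sample)))
# → - A Title (p. 1)
#     - Abstract (p. 1)
```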
#### Sample Data
```json
[
    {
        "type": "text",
        "text": "The response of flow duration curves to afforestation",
        "text_level": 1,
        "page_idx": 0
    },
    {
        "type": "text",
        "text": "Received 1 October 2003; revised 22 December 2004; accepted 3 January 2005 ",
        "page_idx": 0
    },
    {
        "type": "text",
        "text": "Abstract ",
        "text_level": 2,
        "page_idx": 0
    },
    ...
]
```
## Summary
The above files constitute MinerU's complete output results. Users can choose appropriate files for subsequent processing based on their needs:
- **Model outputs**: Use raw outputs (model.json, model_output.txt)
- **Debugging and verification**: Use visualization files (layout.pdf, spans.pdf)
- **Content extraction**: Use simplified files (*.md, content_list.json)
- **Secondary development**: Use structured files (middle.json)