Inference Result
==================

.. admonition:: Tip
    :class: tip

    Read :doc:`tutorial/pipeline` first for an initial understanding of how the pipeline works; it makes the content of this section easier to follow.

The **InferenceResult** class is a container for model inference results. It also implements methods that operate on those results, such as ``draw_model`` and ``dump_model``.
See :doc:`../api/model_operators` for more details about **InferenceResult**.


Model Inference Result
-----------------------

Structure Definition
^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: python

    from pydantic import BaseModel, Field
    from enum import IntEnum

    class CategoryType(IntEnum):
        title = 0               # Title
        plain_text = 1          # Text
        abandon = 2             # Includes headers, footers, page numbers, and page annotations
        figure = 3              # Image
        figure_caption = 4      # Image description
        table = 5               # Table
        table_caption = 6       # Table description
        table_footnote = 7      # Table footnote
        isolate_formula = 8     # Block formula
        formula_caption = 9     # Formula label

        embedding = 13          # Inline formula
        isolated = 14           # Block formula
        text = 15               # OCR recognition result


    class PageInfo(BaseModel):
        page_no: int = Field(description="Page number, the first page is 0", ge=0)
        height: int = Field(description="Page height", gt=0)
        width: int = Field(description="Page width", ge=0)

    class ObjectInferenceResult(BaseModel):
        category_id: CategoryType = Field(description="Category", ge=0)
        poly: list[float] = Field(description="Quadrilateral coordinates, representing the coordinates of the top-left, top-right, bottom-right, and bottom-left points respectively")
        score: float = Field(description="Confidence of the inference result")
        latex: str | None = Field(description="LaTeX parsing result", default=None)
        html: str | None = Field(description="HTML parsing result", default=None)

    class PageInferenceResults(BaseModel):
        # A numeric bound such as ge=0 cannot be applied to a list, so it is omitted here
        layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results")
        page_info: PageInfo = Field(description="Page metadata")
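
To illustrate how a numeric ``category_id`` maps back to a readable name, the sketch below repeats the ``CategoryType`` enum from the structure definition using only the standard library. The helper ``category_name`` is hypothetical and not part of MinerU; it only shows one way to handle ids that fall into the unassigned gap (10 to 12).

```python
from enum import IntEnum


class CategoryType(IntEnum):
    # Same values as in the structure definition above.
    title = 0
    plain_text = 1
    abandon = 2
    figure = 3
    figure_caption = 4
    table = 5
    table_caption = 6
    table_footnote = 7
    isolate_formula = 8
    formula_caption = 9
    embedding = 13
    isolated = 14
    text = 15


def category_name(category_id: int) -> str:
    """Return the enum member name, or 'unknown' for ids with no member."""
    try:
        return CategoryType(category_id).name
    except ValueError:
        return "unknown"


print(category_name(5))   # table
print(category_name(11))  # unknown
```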


Example
^^^^^^^^^^^

.. code:: json

    [
        {
            "layout_dets": [
                {
                    "category_id": 2,
                    "poly": [
                        99.1906967163086,
                        100.3119125366211,
                        730.3707885742188,
                        100.3119125366211,
                        730.3707885742188,
                        245.81326293945312,
                        99.1906967163086,
                        245.81326293945312
                    ],
                    "score": 0.9999997615814209
                }
            ],
            "page_info": {
                "page_no": 0,
                "height": 2339,
                "width": 1654
            }
        },
        {
            "layout_dets": [
                {
                    "category_id": 5,
                    "poly": [
                        99.13092803955078,
                        2210.680419921875,
                        497.3183898925781,
                        2210.680419921875,
                        497.3183898925781,
                        2264.78076171875,
                        99.13092803955078,
                        2264.78076171875
                    ],
                    "score": 0.9999997019767761
                }
            ],
            "page_info": {
                "page_no": 1,
                "height": 2339,
                "width": 1654
            }
        }
    ]

The ``poly`` coordinates have the format ``[x0, y0, x1, y1, x2, y2, x3, y3]``:
the (x, y) pairs of the top-left, top-right, bottom-right, and bottom-left
corners, in that order.

|Poly Coordinate Diagram|
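
Because ``poly`` is a flat list of eight numbers, it is often convenient to reduce it to an axis-aligned bounding box. The following is a minimal sketch using only the standard library; ``poly_to_bbox`` is a hypothetical helper, not a MinerU function, and the sample values are rounded for readability.

```python
def poly_to_bbox(poly: list[float]) -> tuple[float, float, float, float]:
    """Convert [x0, y0, x1, y1, x2, y2, x3, y3] to (x_min, y_min, x_max, y_max)."""
    if len(poly) != 8:
        raise ValueError(f"expected 8 values, got {len(poly)}")
    xs = poly[0::2]  # x0, x1, x2, x3
    ys = poly[1::2]  # y0, y1, y2, y3
    return min(xs), min(ys), max(xs), max(ys)


# Corners in the order top-left, top-right, bottom-right, bottom-left.
poly = [99.0, 100.0, 730.0, 100.0, 730.0, 245.0, 99.0, 245.0]
x_min, y_min, x_max, y_max = poly_to_bbox(poly)
print(x_max - x_min, y_max - y_min)  # 631.0 145.0
```

Taking the minimum and maximum, rather than reading the top-left and bottom-right corners directly, keeps the helper correct even if a quadrilateral is slightly rotated.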



Inference Result
-------------------------


.. code:: python

    from magic_pdf.operators.models import InferenceResult
    from magic_pdf.data.dataset import Dataset

    dataset: Dataset = some_data_set  # placeholder, not a real dataset

    # MinerU's inference results: one entry per page, ordered by page number
    model_inference_result: list[PageInferenceResults] = []

    inference_result = InferenceResult(model_inference_result, dataset)
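
The list passed to ``InferenceResult`` is described as ordered by page number. As a hedged sketch of how that ordering could be checked beforehand, the code below works on plain dicts in the JSON shape shown earlier. ``sort_pages`` is a hypothetical helper, not a MinerU API, and the assumption that page numbers run contiguously from 0 is ours rather than something this page states.

```python
def sort_pages(pages: list[dict]) -> list[dict]:
    """Return pages sorted by page_info.page_no, requiring numbers 0..n-1."""
    ordered = sorted(pages, key=lambda p: p["page_info"]["page_no"])
    numbers = [p["page_info"]["page_no"] for p in ordered]
    # Assumption: every page appears exactly once, starting at 0.
    if numbers != list(range(len(ordered))):
        raise ValueError(f"page numbers are not contiguous from 0: {numbers}")
    return ordered


pages = [
    {"layout_dets": [], "page_info": {"page_no": 1, "height": 2339, "width": 1654}},
    {"layout_dets": [], "page_info": {"page_no": 0, "height": 2339, "width": 1654}},
]
print([p["page_info"]["page_no"] for p in sort_pages(pages)])  # [0, 1]
```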



some_model.pdf
^^^^^^^^^^^^^^^^^^^^

.. figure:: ../_static/image/inference_result.png


.. |Poly Coordinate Diagram| image:: ../_static/image/poly.png