# Python Inference for PP-OCR Model Zoo

This article describes how to use the Python inference engine with the PP-OCR model zoo. It covers, in order, text detection, text recognition, the direction classifier, and running all three in series, on both CPU and GPU.


- [Python Inference for PP-OCR Model Zoo](#python-inference-for-pp-ocr-model-zoo)
  - [Text Detection Model Inference](#text-detection-model-inference)
  - [Text Recognition Model Inference](#text-recognition-model-inference)
    - [1. Lightweight Chinese Recognition Model Inference](#1-lightweight-chinese-recognition-model-inference)
    - [2. English Recognition Model Inference](#2-english-recognition-model-inference)
    - [3. Multilingual Model Inference](#3-multilingual-model-inference)
  - [Angle Classification Model Inference](#angle-classification-model-inference)
  - [Text Detection Angle Classification and Recognition Inference Concatenation](#text-detection-angle-classification-and-recognition-inference-concatenation)

<a name="DETECTION_MODEL_INFERENCE"></a>

## Text Detection Model Inference

The default configuration is based on the inference setting of the DB text detection model. For lightweight Chinese detection model inference, you can execute the following commands:

```
# download DB text detection inference model
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar
tar xf ch_PP-OCRv3_det_infer.tar
# run inference
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the result file names are prefixed with `det_res`. An example result is shown below:

![](../imgs_results/det_res_00018069.jpg)

You can use the parameters `limit_type` (passed on the command line as `--det_limit_type`) and `det_limit_side_len` to limit the size of the input image.
`limit_type` accepts either `max` or `min`, and
`det_limit_side_len` is a positive integer, generally set to a multiple of 32, such as 960.

The default setting is `limit_type='max', det_limit_side_len=960`, which means that the longest side of the network input image cannot exceed 960.
If this value is exceeded, the image is resized with its aspect ratio preserved so that the longest side equals `det_limit_side_len`.
Setting `limit_type='min', det_limit_side_len=960` instead means that the shortest side of the image is limited to 960.
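The effect of these two parameters on the input size can be sketched in a few lines of Python. This is an illustration only, not the code in the repository: the function name `limited_size` is hypothetical, and two behaviours are assumptions rather than statements from this document, namely that `min` scales an image up when its shortest side is below the limit, and that each side is finally rounded to a multiple of 32.

```python
def limited_size(height, width, limit_type="max", limit_side_len=960, multiple=32):
    """Return the (height, width) an image would be resized to under the side limit."""
    if limit_type == "max":
        # Shrink only when the longest side exceeds the limit.
        longest = max(height, width)
        ratio = limit_side_len / longest if longest > limit_side_len else 1.0
    elif limit_type == "min":
        # Assumption: enlarge only when the shortest side is below the limit.
        shortest = min(height, width)
        ratio = limit_side_len / shortest if shortest < limit_side_len else 1.0
    else:
        raise ValueError("limit_type must be 'max' or 'min'")

    def snap(side):
        # Assumption: round to the nearest multiple of 32, never below 32.
        return max(int(round(side * ratio / multiple)) * multiple, multiple)

    return snap(height), snap(width)


print(limited_size(1080, 1920))                    # longest side capped at 960
print(limited_size(480, 640, limit_type="min"))    # shortest side raised to 960
```

Both sides are scaled by the same ratio, so the aspect ratio is preserved up to the final rounding step.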

If the input image has a relatively high resolution and you want to run prediction at a larger resolution, set `det_limit_side_len` to the desired value, such as 1216:
```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --det_limit_type=max --det_limit_side_len=1216
```

If you want to use the CPU for prediction, execute the following command:
```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --use_gpu=False
```

<a name="RECOGNITION_MODEL_INFERENCE"></a>

## Text Recognition Model Inference


<a name="LIGHTWEIGHT_RECOGNITION"></a>
### 1. Lightweight Chinese Recognition Model Inference

**Note**: The input shape used by the recognition model of `PP-OCRv3` is `3, 48, 320`. If you use other recognition models, you need to set the parameter `--rec_image_shape` according to the model. In addition, the `rec_algorithm` used by the recognition model of `PP-OCRv3` is `SVTR_LCNet` by default. Note the difference from the original `SVTR`.
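To make the meaning of the shape value concrete, the sketch below parses a `--rec_image_shape` string into channels, height, and width, and estimates the width a text crop would have after being scaled to the target height. The helper names are hypothetical, and the aspect-ratio-preserving resize capped at the maximum width is an assumption about typical recognition preprocessing, not a description of the repository's exact code.

```python
import math


def parse_rec_image_shape(value):
    """Split a shape string such as '3,48,320' into (channels, height, width)."""
    channels, height, width = (int(part) for part in value.split(","))
    return channels, height, width


def resized_width(img_h, img_w, rec_image_shape="3,48,320"):
    """Width of a crop after scaling it to the target height.

    Assumption: the aspect ratio is kept and the width is capped at the
    maximum width given by the shape.
    """
    _, target_h, max_w = parse_rec_image_shape(rec_image_shape)
    return min(max_w, math.ceil(target_h * img_w / img_h))


print(parse_rec_image_shape("3, 48, 320"))
print(resized_width(32, 100))     # a short word fits within the width limit
print(resized_width(32, 1000))    # a very long line is capped at the limit
```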


For lightweight Chinese recognition model inference, you can execute the following commands:

```
# download the lightweight Chinese text recognition inference model
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar
tar xf ch_PP-OCRv3_rec_infer.tar
# run inference
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_10.png" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --rec_image_shape=3,48,320
```

![](../imgs_words_en/word_10.png)

After executing the command, the prediction results (recognized text and score) of the above image will be printed on the screen.

```bash
Predicts of ./doc/imgs_words_en/word_10.png:('PAIN', 0.988671)
```
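If you want to post-process the printed results, a line in this format can be parsed with a small helper. The sketch below is an illustration based only on the sample output shown in this document; `parse_predict_line` is a hypothetical name, and the exact log format may differ between versions.

```python
import ast
import re

# Matches "Predicts of <path>:<tuple or list>", with optional spaces after the colon.
PREDICT_LINE = re.compile(
    r"^\s*Predicts of (?P<path>.+?):\s*(?P<result>[\(\[].*[\)\]])\s*$"
)


def parse_predict_line(line):
    """Return (image_path, label, score), or None if the line does not match."""
    match = PREDICT_LINE.match(line)
    if match is None:
        return None
    # literal_eval safely turns "('PAIN', 0.98)" or "['0', 0.99]" into Python values.
    label, score = ast.literal_eval(match.group("result"))
    return match.group("path"), label, float(score)


line = "Predicts of ./doc/imgs_words_en/word_10.png:('PAIN', 0.988671)"
print(parse_predict_line(line))
```

The pattern accepts both the tuple form and the list form, so the same helper works for a result printed as `['0', 0.9999995]`.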
<a name="2-english-recognition-model-inference"></a>
### 2. English Recognition Model Inference

For English recognition model inference, you can execute the following commands. You need to specify the dictionary path with `--rec_char_dict_path`:

```
# download en model:
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar
tar xf en_PP-OCRv3_rec_infer.tar
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/en/word_1.png" --rec_model_dir="./en_PP-OCRv3_rec_infer/" --rec_char_dict_path="ppocr/utils/en_dict.txt"
```

After executing the command, the prediction result of the above figure is:

```
Predicts of ./doc/imgs_words/en/word_1.png: ('JOINT', 0.998160719871521)
```


<a name="3-multilingual-model-inference"></a>

### 3. Multilingual Model Inference
If you need to run inference with [other language models](./models_list_en.md#Multilingual), you need to specify the dictionary path with `--rec_char_dict_path`. To get correct visualization results, you also need to specify the font path with `--vis_font_path`. Fonts for several less common languages are provided under the `doc/fonts` path. For example, for Korean recognition:

```
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar

python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/fonts/korean.ttf"
```
![](../imgs_words/korean/1.jpg)

After executing the command, the prediction result of the above figure is:

``` text
Predicts of ./doc/imgs_words/korean/1.jpg:('바탕으로', 0.9948904)
```

<a name="ANGLE_CLASS_MODEL_INFERENCE"></a>

## Angle Classification Model Inference

For angle classification model inference, you can execute the following commands:


```
# download text angle class inference model:
wget  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
tar xf ch_ppocr_mobile_v2.0_cls_infer.tar
python3 tools/infer/predict_cls.py --image_dir="./doc/imgs_words_en/word_10.png" --cls_model_dir="ch_ppocr_mobile_v2.0_cls_infer"
```
![](../imgs_words_en/word_10.png)

After executing the command, the prediction results (classification angle and score) of the above image will be printed on the screen.

```
 Predicts of ./doc/imgs_words_en/word_10.png:['0', 0.9999995]
```

<a name="CONCATENATION"></a>
## Text Detection Angle Classification and Recognition Inference Concatenation

**Note**: The input shape used by the recognition model of `PP-OCRv3` is `3, 48, 320`. If you use other recognition models, you need to set the parameter `--rec_image_shape` according to the model. In addition, the `rec_algorithm` used by the recognition model of `PP-OCRv3` is `SVTR_LCNet` by default. Note the difference from the original `SVTR`.

When running prediction, specify the path of a single image or a folder of images with the parameter `image_dir`. The parameter `det_model_dir` specifies the path to the detection inference model, `cls_model_dir` specifies the path to the angle classification inference model, and `rec_model_dir` specifies the path to the recognition inference model. The parameter `use_angle_cls` controls whether the angle classification model is enabled. The parameter `use_mp` specifies whether to use multiple processes for inference, and `total_process_num` specifies the number of processes when multi-processing is enabled. The visualized recognition results are saved to the `./inference_results` folder by default.

```shell
# use direction classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --cls_model_dir="./cls/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=true

# do not use direction classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=false
# use multi-process
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./ch_PP-OCRv3_det_infer/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=false --use_mp=True --total_process_num=6
```
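When these commands are launched from a script, it can help to assemble the argument list in one place. The following sketch builds the same command lines as above; `build_system_command` is a hypothetical helper, and it uses only the flags shown in this section. It prints the command rather than running it.

```python
import shlex


def build_system_command(image_dir, det_model_dir, rec_model_dir,
                         cls_model_dir=None, use_mp=False, total_process_num=1):
    """Assemble the argument list for tools/infer/predict_system.py."""
    cmd = [
        "python3", "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        f"--rec_model_dir={rec_model_dir}",
    ]
    if cls_model_dir is not None:
        # Enable the direction classifier only when a model path is given.
        cmd += [f"--cls_model_dir={cls_model_dir}", "--use_angle_cls=true"]
    else:
        cmd.append("--use_angle_cls=false")
    if use_mp:
        cmd += ["--use_mp=True", f"--total_process_num={total_process_num}"]
    return cmd


cmd = build_system_command(
    "./doc/imgs/00018069.jpg",
    "./ch_PP-OCRv3_det_infer/",
    "./ch_PP-OCRv3_rec_infer/",
    use_mp=True,
    total_process_num=6,
)
print(shlex.join(cmd))
```

To actually run the pipeline, the list can be passed to `subprocess.run(cmd, check=True)` from the root of the repository checkout.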


After executing the command, the recognition result image is as follows:

![](../imgs_results/system_res_00018069_v3.jpg)