# Inference Based on the Python Prediction Engine

An inference model (a model saved by `paddle.jit.save`) is generally a frozen model saved after training is complete, and is mostly used for prediction in deployment.

The models saved during training are checkpoints. A checkpoint stores the parameters of the model and is mostly used to resume training.

Compared with a checkpoint, an inference model additionally stores the structure of the model. Because both the model structure and the model parameters are frozen in the inference model files, it is easier to deploy and better suited to integration with production systems.
For more details, please refer to the document [Classification Framework](https://github.com/PaddlePaddle/PaddleClas/blob/release%2F2.0/docs/zh_CN/extension/paddle_mobile_inference.md).

Next, we first describe how to convert a trained model into an inference model, and then cover inference for text detection, text recognition, and angle classification, as well as running all three together as a pipeline.

- [CONVERT TRAINING MODEL TO INFERENCE MODEL](#CONVERT)
    - [Convert detection model to inference model](#Convert_detection_model)
    - [Convert recognition model to inference model](#Convert_recognition_model)
    - [Convert angle classification model to inference model](#Convert_angle_class_model)


- [TEXT DETECTION MODEL INFERENCE](#DETECTION_MODEL_INFERENCE)
    - [1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE](#LIGHTWEIGHT_DETECTION)
    - [2. DB TEXT DETECTION MODEL INFERENCE](#DB_DETECTION)
    - [3. EAST TEXT DETECTION MODEL INFERENCE](#EAST_DETECTION)
    - [4. SAST TEXT DETECTION MODEL INFERENCE](#SAST_DETECTION)
    - [5. Multilingual model inference](#Multilingual model inference)

- [TEXT RECOGNITION MODEL INFERENCE](#RECOGNITION_MODEL_INFERENCE)
    - [1. LIGHTWEIGHT CHINESE MODEL](#LIGHTWEIGHT_RECOGNITION)
    - [2. CTC-BASED TEXT RECOGNITION MODEL INFERENCE](#CTC-BASED_RECOGNITION)
    - [3. SRN-BASED TEXT RECOGNITION MODEL INFERENCE](#SRN-BASED_RECOGNITION)
    - [4. TEXT RECOGNITION MODEL INFERENCE USING CUSTOM CHARACTERS DICTIONARY](#USING_CUSTOM_CHARACTERS)
    - [5. MULTILINGUAL MODEL INFERENCE](#MULTILINGUAL_MODEL_INFERENCE)

- [ANGLE CLASSIFICATION MODEL INFERENCE](#ANGLE_CLASSIFICATION_MODEL_INFERENCE)
    - [1. ANGLE CLASSIFICATION MODEL INFERENCE](#ANGLE_CLASS_MODEL_INFERENCE)

- [TEXT DETECTION ANGLE CLASSIFICATION AND RECOGNITION INFERENCE CONCATENATION](#CONCATENATION)
    - [1. LIGHTWEIGHT CHINESE MODEL](#LIGHTWEIGHT_CHINESE_MODEL)
    - [2. OTHER MODELS](#OTHER_MODELS)

<a name="CONVERT"></a>
## CONVERT TRAINING MODEL TO INFERENCE MODEL
<a name="Convert_detection_model"></a>
### Convert detection model to inference model

Download the lightweight Chinese detection model:
```
wget -P ./ch_lite/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar && tar xf ./ch_lite/ch_ppocr_mobile_v2.0_det_train.tar -C ./ch_lite/
```

The model above was trained with the DB algorithm using MobileNetV3 as the backbone. To convert the trained model into an inference model, run the following command:
```
# -c Set the training algorithm yml configuration file
# -o Set optional parameters
# Global.pretrained_model parameter Set the training model address to be converted without adding the file suffix .pdmodel, .pdopt or .pdparams.
# Global.load_static_weights needs to be set to False
# Global.save_inference_dir Set the address where the converted model will be saved.

python3 tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model=./ch_lite/ch_ppocr_mobile_v2.0_det_train/best_accuracy Global.load_static_weights=False Global.save_inference_dir=./inference/det_db/
```
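
When several models have to be exported, the command above can also be assembled from Python. The following is a minimal sketch and not part of PaddleOCR: `build_export_command` is a hypothetical helper that only builds the argument list for `tools/export_model.py` from the `-c` and `-o` options shown above.

```python
def build_export_command(config_path, pretrained_model, save_dir):
    """Build the argument list for tools/export_model.py (hypothetical helper)."""
    overrides = {
        # Path without the .pdmodel, .pdopt or .pdparams suffix.
        "Global.pretrained_model": pretrained_model,
        "Global.load_static_weights": "False",
        "Global.save_inference_dir": save_dir,
    }
    cmd = ["python3", "tools/export_model.py", "-c", config_path, "-o"]
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


cmd = build_export_command(
    "configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml",
    "./ch_lite/ch_ppocr_mobile_v2.0_det_train/best_accuracy",
    "./inference/det_db/",
)
print(" ".join(cmd))
# To run the export from the PaddleOCR repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```

Keeping the overrides in a dictionary makes it straightforward to reuse the same helper for the recognition and angle classification models by changing only the three arguments.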

When converting to an inference model, the configuration file used is the same as the configuration file used during training. In addition, you also need to set the `Global.pretrained_model` parameter in the configuration file.
After the conversion is successful, there are three files in the model save directory:
```
inference/det_db/
    ├── inference.pdiparams         # The parameter file of detection inference model
    ├── inference.pdiparams.info    # The parameter information of detection inference model, which can be ignored
    └── inference.pdmodel           # The program file of detection inference model
```
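
A quick way to confirm that an export finished is to check for the three files listed above. This is a minimal sketch under the assumption that the file names are exactly those shown; `missing_inference_files` is a hypothetical helper, not a PaddleOCR function.

```python
from pathlib import Path

EXPECTED_FILES = (
    "inference.pdiparams",
    "inference.pdiparams.info",
    "inference.pdmodel",
)


def missing_inference_files(model_dir):
    """Return the expected inference files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


missing = missing_inference_files("./inference/det_db/")
print("export looks complete" if not missing else f"missing: {missing}")
```

The same check applies to the recognition and angle classification export directories, since they contain files with the same names.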

<a name="Convert_recognition_model"></a>
### Convert recognition model to inference model

Download the lightweight Chinese recognition model:
```
wget -P ./ch_lite/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar && tar xf ./ch_lite/ch_ppocr_mobile_v2.0_rec_train.tar -C ./ch_lite/
```

The recognition model is converted to the inference model in the same way as the detection, as follows:
```
# -c Set the training algorithm yml configuration file
# -o Set optional parameters
# Global.pretrained_model parameter Set the training model address to be converted without adding the file suffix .pdmodel, .pdopt or .pdparams.
# Global.load_static_weights needs to be set to False
# Global.save_inference_dir Set the address where the converted model will be saved.

python3 tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model=./ch_lite/ch_ppocr_mobile_v2.0_rec_train/best_accuracy Global.load_static_weights=False Global.save_inference_dir=./inference/rec_crnn/
```

If you have a model trained on your own dataset with a different dictionary file, please make sure that you modify the `character_dict_path` in the configuration file to your dictionary file path.

After the conversion is successful, there are three files in the model save directory:
```
inference/rec_crnn/
    ├── inference.pdiparams         # The parameter file of recognition inference model
    ├── inference.pdiparams.info    # The parameter information of recognition inference model, which can be ignored
    └── inference.pdmodel           # The program file of recognition model
```

<a name="Convert_angle_class_model"></a>
### Convert angle classification model to inference model

Download the angle classification model:
```
wget -P ./ch_lite/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar && tar xf ./ch_lite/ch_ppocr_mobile_v2.0_cls_train.tar -C ./ch_lite/
```

The angle classification model is converted to the inference model in the same way as the detection, as follows:
```
# -c Set the training algorithm yml configuration file
# -o Set optional parameters
# Global.pretrained_model parameter Set the training model address to be converted without adding the file suffix .pdmodel, .pdopt or .pdparams.
# Global.load_static_weights needs to be set to False
# Global.save_inference_dir Set the address where the converted model will be saved.

python3 tools/export_model.py -c configs/cls/cls_mv3.yml -o Global.pretrained_model=./ch_lite/ch_ppocr_mobile_v2.0_cls_train/best_accuracy Global.load_static_weights=False Global.save_inference_dir=./inference/cls/
```

After the conversion is successful, there are three files in the directory:
```
inference/cls/
    ├── inference.pdiparams         # The parameter file of angle class inference model
    ├── inference.pdiparams.info    # The parameter information of angle class inference model, which can be ignored
    └── inference.pdmodel           # The program file of angle class model
```


<a name="DETECTION_MODEL_INFERENCE"></a>
## TEXT DETECTION MODEL INFERENCE

The following sections cover inference with the lightweight Chinese detection model and with the DB, EAST, and SAST text detection models. The default configuration is based on the inference settings of the DB text detection model.
Because the EAST and DB algorithms are very different, at inference time it is necessary to **adapt to the EAST text detection algorithm by passing in the corresponding parameters**. The same applies to SAST.

<a name="LIGHTWEIGHT_DETECTION"></a>
### 1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE

For lightweight Chinese detection model inference, you can execute the following commands:

```
# download DB text detection inference model
wget  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
tar xf ch_ppocr_mobile_v2.0_det_infer.tar
# predict
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_00018069.jpg)
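
To collect the visualizations after a run, the result folder can be scanned for the `det_res` prefix. This is a small sketch based only on the default folder and prefix stated above; `list_detection_results` is a hypothetical helper.

```python
from pathlib import Path


def list_detection_results(results_dir="./inference_results", prefix="det_res"):
    """Return result files whose names start with the prefix, sorted by path."""
    root = Path(results_dir)
    if not root.is_dir():
        return []
    return sorted(
        path for path in root.iterdir()
        if path.is_file() and path.name.startswith(prefix)
    )


for path in list_detection_results():
    print(path)
```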

You can use the parameters `det_limit_type` and `det_limit_side_len` to limit the size of the input image.
The valid values of `det_limit_type` are [`max`, `min`], and
`det_limit_side_len` is a positive integer, generally set to a multiple of 32, such as 960.

The default setting is `det_limit_type='max', det_limit_side_len=960`, which means that the longest side of the network input image cannot exceed 960.
If this value is exceeded, the image is resized, keeping its aspect ratio, so that the longest side equals `det_limit_side_len`.
With `det_limit_type='min', det_limit_side_len=960`, the shortest side of the image is limited to 960 instead.
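
The resizing rule can be sketched as follows. This is an illustration and not the PaddleOCR implementation: `resized_shape` is a hypothetical helper, the `min` case is read here as enlarging images whose shortest side is below the limit, and rounding each side to the nearest multiple of 32 is an assumption based on the remark that the value is generally a multiple of 32.

```python
def resized_shape(height, width, limit_type="max", limit_side_len=960, multiple=32):
    """Return (new_height, new_width) after applying the side-length limit."""
    if limit_type == "max":
        longest = max(height, width)
        ratio = limit_side_len / longest if longest > limit_side_len else 1.0
    elif limit_type == "min":
        shortest = min(height, width)
        ratio = limit_side_len / shortest if shortest < limit_side_len else 1.0
    else:
        raise ValueError(f"unsupported limit_type: {limit_type!r}")

    def snap(value):
        # Assumption: each side is rounded to the nearest multiple of 32 (at least 32).
        return max(int(round(value * ratio / multiple)) * multiple, multiple)

    return snap(height), snap(width)


print(resized_shape(1080, 1920))                  # (544, 960)
print(resized_shape(480, 640, limit_type="min"))  # (960, 1280)
```

With the default `max` setting a 1080x1920 image is halved so that its longest side is 960, while with `min` a 480x640 image is doubled so that its shortest side reaches 960.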

If the input image has a high resolution and you want to predict at a larger resolution, you can set `det_limit_side_len` to the desired value, such as 1216:
```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/22.jpg" --det_model_dir="./inference/det_db/" --det_limit_type=max --det_limit_side_len=1216
```

If you want to use the CPU for prediction, run the following command:
```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/22.jpg" --det_model_dir="./inference/det_db/" --use_gpu=False
```

<a name="DB_DETECTION"></a>
### 2. DB TEXT DETECTION MODEL INFERENCE

First, convert the model saved during DB text detection training into an inference model. Taking the model based on the ResNet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)), you can use the following command to convert it:

```
python3 tools/export_model.py -c configs/det/det_r50_vd_db.yml -o Global.pretrained_model=./det_r50_vd_db_v2.0_train/best_accuracy Global.load_static_weights=False Global.save_inference_dir=./inference/det_db
```

For DB text detection model inference, you can execute the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_db/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_img_10_db.jpg)

**Note**: Since the ICDAR2015 dataset has only 1,000 training images, mainly of English scenes, the above model performs very poorly on Chinese text images.

<a name="EAST_DETECTION"></a>
### 3. EAST TEXT DETECTION MODEL INFERENCE

First, convert the model saved during EAST text detection training into an inference model. Taking the model based on the ResNet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)), you can use the following command to convert it:

```
python3 tools/export_model.py -c configs/det/det_r50_vd_east.yml -o Global.pretrained_model=./det_r50_vd_east_v2.0_train/best_accuracy Global.load_static_weights=False Global.save_inference_dir=./inference/det_east
```
**For EAST text detection model inference, you need to set the parameter `--det_algorithm="EAST"`**. Run the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_east/" --det_algorithm="EAST"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_img_10_east.jpg)

**Note**: EAST post-processing uses locality-aware NMS, which has two versions: Python and C++. The C++ version is significantly faster than the Python version. Due to a compilation version problem with the C++ NMS, the C++ version is called only in a Python 3.5 environment, and the Python version is called in all other cases.
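
The version rule in the note can be pictured with the sketch below. It only illustrates the described behavior; `select_nms_backend` is a hypothetical name and the actual selection logic inside PaddleOCR may be written differently.

```python
import sys


def select_nms_backend(version_info=sys.version_info):
    """Return "cpp" on Python 3.5 and "python" otherwise (hypothetical helper)."""
    major, minor = version_info[0], version_info[1]
    return "cpp" if (major, minor) == (3, 5) else "python"


print(select_nms_backend())
```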


<a name="SAST_DETECTION"></a>
### 4. SAST TEXT DETECTION MODEL INFERENCE
#### (1). Quadrangle text detection model (ICDAR2015)  
First, convert the model saved during SAST text detection training into an inference model. Taking the model based on the ResNet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)), you can use the following command to convert it:

```
python3 tools/export_model.py -c configs/det/det_r50_vd_sast_icdar15.yml -o Global.pretrained_model=./det_r50_vd_sast_icdar15_v2.0_train/best_accuracy Global.load_static_weights=False Global.save_inference_dir=./inference/det_sast_ic15
```

**For SAST quadrangle text detection model inference, you need to set the parameter `--det_algorithm="SAST"`**. Run the following command:

```
python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_sast_ic15/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_img_10_sast.jpg)

#### (2). Curved text detection model (Total-Text)  
First, convert the model saved during SAST text detection training into an inference model. Taking the model based on the ResNet50_vd backbone network and trained on the Total-Text English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_totaltext_v2.0_train.tar)), you can use the following command to convert it:

```
python3 tools/export_model.py -c configs/det/det_r50_vd_sast_totaltext.yml -o Global.pretrained_model=./det_r50_vd_sast_totaltext_v2.0_train/best_accuracy Global.load_static_weights=False Global.save_inference_dir=./inference/det_sast_tt
```

**For SAST curved text detection model inference, you need to set the parameters `--det_algorithm="SAST"` and `--det_sast_polygon=True`**. Run the following command:

```
python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_sast_polygon=True
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_img623_sast.jpg)

**Note**: SAST post-processing uses locality-aware NMS, which has two versions: Python and C++. The C++ version is significantly faster than the Python version. Due to a compilation version problem with the C++ NMS, the C++ version is called only in a Python 3.5 environment, and the Python version is called in all other cases.

<a name="RECOGNITION_MODEL_INFERENCE"></a>
## TEXT RECOGNITION MODEL INFERENCE

The following sections cover inference with the lightweight Chinese recognition model and with other CTC-based and Attention-based text recognition models. For Chinese text recognition, it is recommended to choose a recognition model based on CTC loss. In practice, models based on Attention loss have also been found to perform worse than those based on CTC loss. In addition, if the character dictionary was modified during training, make sure that you use the same character set during inference. See below for details.


<a name="LIGHTWEIGHT_RECOGNITION"></a>
### 1. LIGHTWEIGHT CHINESE TEXT RECOGNITION MODEL INFERENCE

For lightweight Chinese recognition model inference, you can execute the following commands:

```
# download CRNN text recognition inference model
wget  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_10.png" --rec_model_dir="ch_ppocr_mobile_v2.0_rec_infer"
```

![](../imgs_words_en/word_10.png)

After executing the command, the prediction results (recognized text and score) of the above image will be printed on the screen.

```bash
Predicts of ./doc/imgs_words_en/word_10.png:('PAIN', 0.9897658)
```
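
If the printed result needs to be consumed by another script, a line in the format shown above can be split into path, text, and score. This is a minimal sketch that assumes the output keeps exactly the `Predicts of <path>:(<text>, <score>)` layout; `parse_prediction` is a hypothetical helper.

```python
import ast
import re

# Path, then a colon, then a Python-style tuple or list literal.
LINE_PATTERN = re.compile(r"^\s*Predicts of (?P<path>.+?):(?P<result>[\(\[].*[\)\]])\s*$")


def parse_prediction(line):
    """Split one 'Predicts of <path>:(<text>, <score>)' line into its parts."""
    match = LINE_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    # literal_eval only evaluates literals, so it is safe on untrusted output.
    text, score = ast.literal_eval(match.group("result"))
    return match.group("path"), text, float(score)


print(parse_prediction("Predicts of ./doc/imgs_words_en/word_10.png:('PAIN', 0.9897658)"))
# ('./doc/imgs_words_en/word_10.png', 'PAIN', 0.9897658)
```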

<a name="CTC-BASED_RECOGNITION"></a>
### 2. CTC-BASED TEXT RECOGNITION MODEL INFERENCE

Taking CRNN as an example, this section describes inference with recognition models based on CTC loss. Rosetta and Star-Net are used in a similar way; there is no need to set the recognition algorithm parameter `rec_algorithm`.

First, convert the model saved during CRNN text recognition training into an inference model. Taking the model based on the ResNet34_vd backbone network and trained on MJSynth and SynthText (two synthetic English text recognition datasets) as an example ([model download address](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)), you can convert it as follows:

```
python3 tools/export_model.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml -o Global.pretrained_model=./rec_r34_vd_none_bilstm_ctc_v2.0_train/best_accuracy Global.load_static_weights=False Global.save_inference_dir=./inference/rec_crnn
```

For CRNN text recognition model inference, execute the following commands:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./inference/rec_crnn/" --rec_image_shape="3, 32, 100" --rec_char_type="en"
```

![](../imgs_words_en/word_336.png)

After executing the command, the recognition result of the above image is as follows:

```bash
Predicts of ./doc/imgs_words_en/word_336.png:('super', 0.9999073)
```
**Note**: Since the above model follows the [DTRB](https://arxiv.org/abs/1904.01906) text recognition training and evaluation process, it differs from the training of the lightweight Chinese recognition model in two aspects:

- The image resolution used in training is different: the image resolution used in training the above model is [3,32,100], while during our Chinese model training, in order to ensure the recognition effect of long text, the image resolution used in training is [3, 32, 320]. The default shape parameter of the inference stage is the image resolution used in training phase, that is [3, 32, 320]. Therefore, when running inference of the above English model here, you need to set the shape of the recognition image through the parameter `rec_image_shape`.

- Character list: the experiments in the DTRB paper cover only 26 lowercase English letters and 10 digits, 36 characters in total. All uppercase and lowercase letters are converted to lowercase, and characters not in this list are ignored and treated as spaces. Therefore, no character dictionary file is used here; instead, the dictionary is generated by the code below. The parameter `rec_char_type` must be set during inference, and is specified as "en" for English.

```
self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
dict_character = list(self.character_str)
```
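
The effect of this 36-character list on a label can be sketched as follows. This is an illustration only: `normalize_label` is a hypothetical helper, and simply dropping unsupported characters is one reading of "ignored" in the description above.

```python
CHARACTER_STR = "0123456789abcdefghijklmnopqrstuvwxyz"
# Map every supported character to its index in the dictionary.
CHAR_TO_INDEX = {char: index for index, char in enumerate(CHARACTER_STR)}


def normalize_label(text):
    """Lowercase the text and drop every character outside the 36-character set."""
    return "".join(char for char in text.lower() if char in CHAR_TO_INDEX)


print(normalize_label("Super-8!"))  # super8
```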

<a name="SRN-BASED_RECOGNITION"></a>
### 3. SRN-BASED TEXT RECOGNITION MODEL INFERENCE

The recognition model based on SRN requires the additional recognition algorithm parameter
`--rec_algorithm="SRN"`. In addition, the prediction shape must be consistent with the one used
in training, for example `--rec_image_shape="1, 64, 256"`.

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" \
                                    --rec_model_dir="./inference/srn/" \
                                    --rec_image_shape="1, 64, 256" \
                                    --rec_char_type="en" \
                                    --rec_algorithm="SRN"
```
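
Because a shape mismatch between training and inference is easy to miss, the shape string can be parsed and compared before launching a run. The sketch below is illustrative only; `parse_image_shape` and `TRAINING_SHAPE` are hypothetical names, and the training shape is simply the value quoted above.

```python
def parse_image_shape(value):
    """Convert a string such as "3, 32, 100" into a (channels, height, width) tuple."""
    parts = [int(part) for part in value.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 comma-separated integers, got {value!r}")
    return tuple(parts)


TRAINING_SHAPE = (1, 64, 256)  # shape quoted above for the SRN model
inference_shape = parse_image_shape("1, 64, 256")
if inference_shape != TRAINING_SHAPE:
    raise SystemExit(f"shape mismatch: {inference_shape} != {TRAINING_SHAPE}")
print(inference_shape)  # (1, 64, 256)
```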

<a name="USING_CUSTOM_CHARACTERS"></a>
### 4. TEXT RECOGNITION MODEL INFERENCE USING CUSTOM CHARACTERS DICTIONARY
If the text dictionary was modified during training, you need to specify the dictionary path with `--rec_char_dict_path` and set `rec_char_type=ch` when predicting with the inference model:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_type="ch" --rec_char_dict_path="your text dict path"
```
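
Before running inference with a custom dictionary, it can help to load the file and inspect it. The sketch below assumes the dictionary is a UTF-8 text file with one character per line, which is an assumption about the file layout rather than something stated here; `load_character_dict` is a hypothetical helper.

```python
from pathlib import Path


def load_character_dict(dict_path):
    """Read one character per line, preserving order and skipping empty lines."""
    characters = []
    with Path(dict_path).open(encoding="utf-8") as handle:
        for line in handle:
            char = line.rstrip("\r\n")
            if char:
                characters.append(char)
    return characters


# Example usage (the path is a placeholder for your own dictionary file):
#   characters = load_character_dict("your text dict path")
#   print(len(characters), "characters loaded")
```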

<a name="MULTILINGUAL_MODEL_INFERENCE"></a>
### 5. MULTILINGUAL MODEL INFERENCE
To run inference with models for other languages, you need to specify the dictionary path with `--rec_char_dict_path`. To get correct visualization results, you also need to specify the font path with `--vis_font_path`. Fonts for several less widely used languages are provided by default under the `doc/fonts` path. For example, for Korean recognition:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_type="korean" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/fonts/korean.ttf"
```
![](../imgs_words/korean/1.jpg)

After executing the command, the prediction result of the above figure is:

``` text
Predicts of ./doc/imgs_words/korean/1.jpg:('바탕으로', 0.9948904)
```

<a name="ANGLE_CLASSIFICATION_MODEL_INFERENCE"></a>
## ANGLE CLASSIFICATION MODEL INFERENCE

The following will introduce the angle classification model inference.


<a name="ANGLE_CLASS_MODEL_INFERENCE"></a>
### 1. ANGLE CLASSIFICATION MODEL INFERENCE

For angle classification model inference, you can execute the following commands:

```
python3 tools/infer/predict_cls.py --image_dir="./doc/imgs_words_en/word_10.png" --cls_model_dir="./inference/cls/"
```
```
# download text angle class inference model:
wget  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
tar xf ch_ppocr_mobile_v2.0_cls_infer.tar
python3 tools/infer/predict_cls.py --image_dir="./doc/imgs_words_en/word_10.png" --cls_model_dir="ch_ppocr_mobile_v2.0_cls_infer"
```
![](../imgs_words_en/word_10.png)

After executing the command, the prediction results (classification angle and score) of the above image will be printed on the screen.

```
 Predicts of ./doc/imgs_words_en/word_10.png:['0', 0.9999995]
```

<a name="CONCATENATION"></a>
## TEXT DETECTION ANGLE CLASSIFICATION AND RECOGNITION INFERENCE CONCATENATION

<a name="LIGHTWEIGHT_CHINESE_MODEL"></a>
### 1. LIGHTWEIGHT CHINESE MODEL

When performing prediction, specify the path of a single image or a folder of images with the parameter `image_dir`. The parameter `det_model_dir` specifies the path to the detection inference model, `cls_model_dir` the path to the angle classification inference model, and `rec_model_dir` the path to the recognition inference model. The parameter `use_angle_cls` controls whether the angle classification model is enabled. The parameter `use_mp` specifies whether to use multiple processes for inference, and `total_process_num` specifies the number of processes when multi-process inference is used. The visualized recognition results are saved to the `./inference_results` folder by default.

```shell
# use direction classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true

# do not use direction classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/"

# use multi-process
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=false --use_mp=True --total_process_num=6
```
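
The three variants above differ only in which optional flags are passed, so they can be generated from one function. This is a minimal sketch and not part of PaddleOCR: `build_system_command` is a hypothetical helper that uses only the flags shown in the commands above.

```python
def build_system_command(image_dir, det_model_dir, rec_model_dir,
                         cls_model_dir=None, use_mp=False, total_process_num=1):
    """Build the argument list for tools/infer/predict_system.py (hypothetical helper)."""
    cmd = [
        "python3", "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        f"--rec_model_dir={rec_model_dir}",
    ]
    if cls_model_dir is not None:
        # Enable the angle classifier only when a classifier model is supplied.
        cmd += [f"--cls_model_dir={cls_model_dir}", "--use_angle_cls=true"]
    if use_mp:
        cmd += ["--use_mp=True", f"--total_process_num={total_process_num}"]
    return cmd


with_cls = build_system_command(
    "./doc/imgs/00018069.jpg", "./inference/det_db/", "./inference/rec_crnn/",
    cls_model_dir="./inference/cls/",
)
print(" ".join(with_cls))
# To run it from the PaddleOCR repository root:
#   import subprocess; subprocess.run(with_cls, check=True)
```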

After executing the command, the recognition result image is as follows:

![](../imgs_results/system_res_00018069.jpg)

<a name="OTHER_MODELS"></a>
### 2. OTHER MODELS

If you want to try other detection or recognition algorithms, refer to the text detection model inference and text recognition model inference sections above, and update the corresponding configuration and model.

**Note: due to the limitation of rotation logic of detected box, SAST curved text detection model (using the parameter `det_sast_polygon=True`) is not supported for model combination yet.**

The following command uses the combination of the EAST text detection and STAR-Net text recognition:

```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_east/" --det_algorithm="EAST" --rec_model_dir="./inference/starnet/" --rec_image_shape="3, 32, 100" --rec_char_type="en"
```

After executing the command, the recognition result image is as follows:

![](../imgs_results/img_10_east_starnet.jpg)