# Inference based on the Python prediction engine

The inference model (saved by `paddle.jit.save`) is a frozen model exported after training is complete, and is mostly used for prediction in deployment.

The model saved during training is the checkpoint model. It stores the model parameters and is mostly used to resume training.

Compared with the checkpoint model, the inference model additionally stores the structure of the model. This makes it better suited to deployment: it performs well at prediction time, supports inference acceleration, and is flexible and easy to integrate into production systems. For more details, please refer to the document [Classification Framework](https://github.com/PaddlePaddle/PaddleClas/blob/master/docs/zh_CN/extension/paddle_inference.md).

Next, we first introduce how to convert a trained model into an inference model, and then cover text detection, text recognition, and angle classification inference, as well as how to chain them together, all based on inference models.

- [CONVERT TRAINING MODEL TO INFERENCE MODEL](#CONVERT)
    - [Convert detection model to inference model](#Convert_detection_model)
    - [Convert recognition model to inference model](#Convert_recognition_model)
    - [Convert angle classification model to inference model](#Convert_angle_class_model)


- [TEXT DETECTION MODEL INFERENCE](#DETECTION_MODEL_INFERENCE)
    - [1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE](#LIGHTWEIGHT_DETECTION)
    - [2. DB TEXT DETECTION MODEL INFERENCE](#DB_DETECTION)
    - [3. EAST TEXT DETECTION MODEL INFERENCE](#EAST_DETECTION)
    - [4. SAST TEXT DETECTION MODEL INFERENCE](#SAST_DETECTION)
    - [5. Multilingual model inference](#MULTILINGUAL_MODEL_INFERENCE)

- [TEXT RECOGNITION MODEL INFERENCE](#RECOGNITION_MODEL_INFERENCE)
    - [1. LIGHTWEIGHT CHINESE MODEL](#LIGHTWEIGHT_RECOGNITION)
    - [2. CTC-BASED TEXT RECOGNITION MODEL INFERENCE](#CTC-BASED_RECOGNITION)
    - [3. ATTENTION-BASED TEXT RECOGNITION MODEL INFERENCE](#ATTENTION-BASED_RECOGNITION)
    - [4. SRN-BASED TEXT RECOGNITION MODEL INFERENCE](#SRN-BASED_RECOGNITION)
    - [5. TEXT RECOGNITION MODEL INFERENCE USING CUSTOM CHARACTERS DICTIONARY](#USING_CUSTOM_CHARACTERS)
    - [6. MULTILINGUAL MODEL INFERENCE](#MULTILINGUAL_MODEL_INFERENCE)

- [ANGLE CLASSIFICATION MODEL INFERENCE](#ANGLE_CLASSIFICATION_MODEL_INFERENCE)
    - [1. ANGLE CLASSIFICATION MODEL INFERENCE](#ANGLE_CLASS_MODEL_INFERENCE)

- [TEXT DETECTION ANGLE CLASSIFICATION AND RECOGNITION INFERENCE CONCATENATION](#CONCATENATION)
    - [1. LIGHTWEIGHT CHINESE MODEL](#LIGHTWEIGHT_CHINESE_MODEL)
    - [2. OTHER MODELS](#OTHER_MODELS)

<a name="CONVERT"></a>
## CONVERT TRAINING MODEL TO INFERENCE MODEL
<a name="Convert_detection_model"></a>
### Convert detection model to inference model

Download the lightweight Chinese detection model:
```
wget -P ./ch_lite/ {link} && tar xf ./ch_lite/{file} -C ./ch_lite/
```

The model above is a DB model trained with a MobileNetV3 backbone. To convert the trained model into an inference model, run the following command:
```
# -c  Path to the yml configuration file of the training algorithm. In that file, set Global.checkpoints to the
#     path of the trained model to convert, without the .pdmodel, .pdopt or .pdparams file suffix.
# -o  Directory where the converted model will be saved.

python3 tools/export_model.py -c configs/det/det_mv3_db_v1.1.yml -o ./inference/det_db/
```
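The suffix rule for `Global.checkpoints` is easy to get wrong when a path is copied from a file listing. The helper below is a minimal sketch of that rule only; `checkpoint_prefix` is a hypothetical name and not part of PaddleOCR, and the example path is borrowed from the EAST command later in this document.

```python
# Suffixes named in the comment of the export command above.
CHECKPOINT_SUFFIXES = (".pdmodel", ".pdopt", ".pdparams")


def checkpoint_prefix(path):
    """Strip a trailing checkpoint suffix so the value suits Global.checkpoints."""
    for suffix in CHECKPOINT_SUFFIXES:
        if path.endswith(suffix):
            return path[: -len(suffix)]
    # Paths that already lack a suffix are returned unchanged.
    return path


if __name__ == "__main__":
    print(checkpoint_prefix("./models/det_r50_vd_east/best_accuracy.pdparams"))
```

The returned string is what would be written into the configuration file; the helper does not touch the file itself.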

When converting to an inference model, use the same configuration file that was used for training, and set the `Global.checkpoints` parameter in it.
After a successful conversion, the model save directory contains three files:
```
inference/det_db/
    ├── det.pdiparams         # The parameter file of detection inference model which needs to be renamed to params
    ├── det.pdiparams.info    # The parameter information of detection inference model, which can be ignored
    └── det.pdmodel           # The program file of detection inference model which needs to be renamed to model
```
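A quick check after export can catch an incomplete conversion before inference is attempted. The sketch below assumes the file naming shown in the listing above (`<prefix>.pdiparams` and `<prefix>.pdmodel`) and treats the `.pdiparams.info` file as optional; `missing_inference_files` is a hypothetical helper, not a PaddleOCR function.

```python
from pathlib import Path

# Suffixes taken from the directory listing above; the .pdiparams.info file
# is described as ignorable, so it is not required here.
REQUIRED_SUFFIXES = (".pdiparams", ".pdmodel")


def missing_inference_files(model_dir, prefix):
    """Return the required inference files that are absent from model_dir.

    `prefix` is the file stem used by the export step, e.g. "det".
    """
    root = Path(model_dir)
    expected = [f"{prefix}{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_inference_files("./inference/det_db/", "det")
    print("export looks complete" if not missing else f"missing: {missing}")
```

An empty list means both required files are present in the directory.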

<a name="Convert_recognition_model"></a>
### Convert recognition model to inference model

Download the lightweight Chinese recognition model:
```
wget -P ./ch_lite/ {link} && tar xf ./ch_lite/{file} -C ./ch_lite/
```

The recognition model is converted to an inference model in the same way as the detection model:
```
# -c  Path to the yml configuration file of the training algorithm. In that file, set Global.checkpoints to the
#     path of the trained model to convert, without the .pdmodel, .pdopt or .pdparams file suffix.
# -o  Directory where the converted model will be saved.

python3 tools/export_model.py -c configs/cls/cls_mv3.yml -o ./inference/cls/
```

If you have a model trained on your own dataset with a different dictionary file, please make sure that you modify the `character_dict_path` in the configuration file to your dictionary file path.

After the conversion is successful, there are three files in the model save directory:
```
inference/det_db/
    ├── rec.pdiparams         # The parameter file of recognition inference model which needs to be renamed to params
    ├── rec.pdiparams.info    # The parameter information of recognition inference model, which can be ignored
    └── rec.pdmodel           # The program file of recognition inference model which needs to be renamed to model
```

<a name="Convert_angle_class_model"></a>
### Convert angle classification model to inference model

Download the angle classification model:
```
wget -P ./ch_lite/ {link} && tar xf ./ch_lite/{file} -C ./ch_lite/
```

The angle classification model is converted to an inference model in the same way as the detection model:
```
# -c  Path to the yml configuration file of the training algorithm. In that file, set Global.checkpoints to the
#     path of the trained model to convert, without the .pdmodel, .pdopt or .pdparams file suffix.
# -o  Directory where the converted model will be saved.

python3 tools/export_model.py -c configs/cls/cls_mv3.yml -o ./inference/cls/
```

After the conversion is successful, there are two files in the directory:
```
/inference/cls/
  ├─  model     The saved model file of the inference model
  └─  params    The parameter file of the inference model
```


<a name="DETECTION_MODEL_INFERENCE"></a>
## TEXT DETECTION MODEL INFERENCE

The following sections introduce inference with the lightweight Chinese detection model and with the DB, EAST and SAST text detection models. The default configuration targets the DB text detection model.
Because the other algorithms differ substantially from DB, you need to **adapt to the EAST or SAST text detection algorithm by passing in the corresponding parameters** at inference time.

<a name="LIGHTWEIGHT_DETECTION"></a>
### 1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE

For lightweight Chinese detection model inference, you can execute the following commands:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the result file names are prefixed with 'det_res'. An example result is shown below:

![](../imgs_results/det_res_2.jpg)

The input image size is constrained by the parameters `det_limit_type` and `det_limit_side_len`. With `det_limit_type=max`, the long side is limited to at most `det_limit_side_len`; with `det_limit_type=min`, the short side must be at least `det_limit_side_len`.
When an image does not meet the constraint (long side > `det_limit_side_len` for `max`, or short side < `det_limit_side_len` for `min`), it is scaled proportionally.
The defaults are `det_limit_type='max'` and `det_limit_side_len=960`. If the input image has a high resolution and you want to predict at a larger resolution, you can execute the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --det_limit_type=max --det_limit_side_len=1200
```
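The side-length limit described above can be sketched as a small function. This is an illustration under stated assumptions, not the library's preprocessing code: `limited_size` is a hypothetical name, and only the proportional scaling is modelled, so any further adjustment the real pipeline applies (for example rounding to a network stride) is left out.

```python
def limited_size(height, width, limit_type="max", limit_side_len=960):
    """Return the (height, width) an image is scaled to under the side limit.

    "max" shrinks images whose long side exceeds the limit; "min" enlarges
    images whose short side falls below it. Other images keep their size.
    """
    if limit_type == "max":
        side = max(height, width)
        ratio = limit_side_len / side if side > limit_side_len else 1.0
    elif limit_type == "min":
        side = min(height, width)
        ratio = limit_side_len / side if side < limit_side_len else 1.0
    else:
        raise ValueError(f"unsupported limit_type: {limit_type!r}")
    return round(height * ratio), round(width * ratio)


if __name__ == "__main__":
    print(limited_size(1080, 1920))               # default limit of 960
    print(limited_size(1080, 1920, "max", 1200))  # limit raised as in the command above
```

Under these assumptions, a 1080x1920 image becomes 540x960 with the default limit and 675x1200 with the limit raised to 1200.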

If you want to use the CPU for prediction, execute the following command:
```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --use_gpu=False
```

<a name="DB_DETECTION"></a>
### 2. DB TEXT DETECTION MODEL INFERENCE

First, convert the model saved in the DB text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](link)), you can use the following command to convert:

```
# -c  Path to the yml configuration file of the training algorithm. In that file, set Global.checkpoints to the
#     path of the trained model to convert, without the .pdmodel, .pdopt or .pdparams file suffix.
# -o  Directory where the converted model will be saved.

python3 tools/export_model.py -c configs/det/det_r50_vd_db.yml -o "./inference/det_db"
```

For DB text detection model inference, you can execute the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_db/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_img_10_db.jpg)

**Note**: Since the ICDAR2015 dataset has only 1,000 training images, mainly of English scenes, the above model performs very poorly on Chinese text images.

<a name="EAST_DETECTION"></a>
### 3. EAST TEXT DETECTION MODEL INFERENCE

First, convert the model saved in the EAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](link)), you can use the following command to convert:

```
# -c  Path to the yml configuration file of the training algorithm.
# -o  Configuration overrides: Global.checkpoints is the path of the trained model to convert, without the
#     .pdmodel, .pdopt or .pdparams file suffix; Global.save_inference_dir is where the converted model is saved.

python3 tools/export_model.py -c configs/det/det_r50_vd_east.yml -o Global.checkpoints="./models/det_r50_vd_east/best_accuracy" Global.save_inference_dir="./inference/det_east"
```
**For EAST text detection model inference, you need to set the parameter `--det_algorithm="EAST"`**. Run the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_east/" --det_algorithm="EAST"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_img_10_east.jpg)

**Note**: The locality-aware NMS used in EAST post-processing has two implementations, Python and C++, and the C++ version is noticeably faster. Because of a build compatibility issue with the C++ version, it is called only in a Python 3.5 environment; the Python version is used in all other cases.


<a name="SAST_DETECTION"></a>
### 4. SAST TEXT DETECTION MODEL INFERENCE
#### (1). Quadrangle text detection model (ICDAR2015)  
First, convert the model saved in the SAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](link)), you can use the following command to convert:

```
# -c  Path to the yml configuration file of the training algorithm. In that file, set Global.checkpoints to the
#     path of the trained model to convert, without the .pdmodel, .pdopt or .pdparams file suffix.
# -o  Directory where the converted model will be saved.

python3 tools/export_model.py -c configs/det/det_r50_vd_sast_icdar15.yml -o "./inference/det_sast_ic15"
```

**For SAST quadrangle text detection model inference, you need to set the parameter `--det_algorithm="SAST"`**. Run the following command:

```
python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_sast_ic15/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_img_10_sast.jpg)

#### (2). Curved text detection model (Total-Text)  
First, convert the model saved in the SAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the Total-Text English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_total_text.tar)), you can use the following command to convert:

```
python3 tools/export_model.py -c configs/det/det_r50_vd_sast_totaltext.yml -o Global.checkpoints="./models/sast_r50_vd_total_text/best_accuracy" Global.save_inference_dir="./inference/det_sast_tt"
```

**For SAST curved text detection model inference, you need to set the parameters `--det_algorithm="SAST"` and `--det_sast_polygon=True`**. Run the following command:

```
python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_sast_polygon=True
```

The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

![](../imgs_results/det_res_img623_sast.jpg)

**Note**: The locality-aware NMS used in SAST post-processing has two implementations, Python and C++, and the C++ version is noticeably faster. Because of a build compatibility issue with the C++ version, it is called only in a Python 3.5 environment; the Python version is used in all other cases.

<a name="RECOGNITION_MODEL_INFERENCE"></a>
## TEXT RECOGNITION MODEL INFERENCE

The following sections introduce inference with the lightweight Chinese recognition model and with other CTC-based and attention-based text recognition models. For Chinese text recognition, we recommend a recognition model based on CTC loss; in practice, models based on attention loss have given worse results than CTC-based ones. In addition, if the character dictionary was modified during training, make sure to use the same character set at inference time. See below for details.


<a name="LIGHTWEIGHT_RECOGNITION"></a>
### 1. LIGHTWEIGHT CHINESE TEXT RECOGNITION MODEL INFERENCE

For lightweight Chinese recognition model inference, you can execute the following commands:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/ch/word_4.jpg" --rec_model_dir="./inference/rec_crnn/"
```

![](../imgs_words/ch/word_4.jpg)

After executing the command, the prediction results (recognized text and score) of the above image will be printed on the screen.

Predicts of ./doc/imgs_words/ch/word_4.jpg:['实力活力', 0.89552695]


<a name="CTC-BASED_RECOGNITION"></a>
### 2. CTC-BASED TEXT RECOGNITION MODEL INFERENCE

Taking STAR-Net as an example, we introduce the recognition model inference based on CTC loss. CRNN and Rosetta are used in a similar way, by setting the recognition algorithm parameter `rec_algorithm`.

First, convert the model saved during STAR-Net text recognition training into an inference model. Take as an example the model based on the Resnet34_vd backbone network and trained on MJSynth and SynthText, two synthetic English text recognition datasets ([model download address](link)). It can be converted as follows:

```
# -c  Path to the yml configuration file of the training algorithm. In that file, set Global.checkpoints to the
#     path of the trained model to convert, without the .pdmodel, .pdopt or .pdparams file suffix.
# -o  Directory where the converted model will be saved.

python3 tools/export_model.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o "./inference/starnet"
```

For STAR-Net text recognition model inference, execute the following commands:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./inference/starnet/" --rec_image_shape="3, 32, 100" --rec_char_type="en"
```

<a name="ATTENTION-BASED_RECOGNITION"></a>
### 3. ATTENTION-BASED TEXT RECOGNITION MODEL INFERENCE
![](../imgs_words_en/word_336.png)

After executing the command, the recognition result of the above image is as follows:

Predicts of ./doc/imgs_words_en/word_336.png:['super', 0.9999555]

**Note**: Since the above model follows the [DTRB](https://arxiv.org/abs/1904.01906) text recognition training and evaluation process, it differs from the training of the lightweight Chinese recognition model in two respects:

- Image resolution: the above model was trained at an image resolution of [3, 32, 100], whereas our Chinese model is trained at [3, 32, 320] to preserve recognition quality on long text. The default shape at inference time is the one used to train the Chinese model, [3, 32, 320]. Therefore, when running inference with the above English model, you need to set the shape of the recognition image through the parameter `rec_image_shape`.

- Character list: the experiments in the DTRB paper cover only the 26 lowercase English letters and 10 digits, 36 characters in total. Uppercase characters are converted to lowercase, and characters outside this list are ignored and treated as spaces. No character dictionary file is used here; instead, the dictionary is generated by the code below. The parameter `rec_char_type` therefore needs to be set during inference, with the value "en" for English.

```
self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
dict_character = list(self.character_str)
```
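To make the character list concrete, the sketch below builds a character-to-index table from the same 36-character string and encodes a label with it. It is illustrative only: `encode_label` is a hypothetical helper, dropping out-of-dictionary characters is one reading of "ignored" in the note above, and the index values may differ from the library's (which may, for example, reserve an extra index for the CTC blank symbol).

```python
# Same 36-character set as the snippet above: ten digits, then a-z.
CHARACTER_STR = "0123456789abcdefghijklmnopqrstuvwxyz"
CHAR_TO_INDEX = {char: index for index, char in enumerate(CHARACTER_STR)}


def encode_label(text):
    """Lower-case text and map each in-dictionary character to its index.

    Characters outside the 36-character set are dropped (an assumption).
    """
    return [CHAR_TO_INDEX[c] for c in text.lower() if c in CHAR_TO_INDEX]


if __name__ == "__main__":
    print(len(CHAR_TO_INDEX))      # size of the dictionary
    print(encode_label("Super!"))  # upper case is folded, "!" is dropped
```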

<a name="SRN-BASED_RECOGNITION"></a>
### 4. SRN-BASED TEXT RECOGNITION MODEL INFERENCE

SRN-based recognition models require the input shape at inference to match the one used in training, for example `--rec_image_shape="1, 64, 256"`:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" \
                                    --rec_model_dir="./inference/srn/" \
                                    --rec_image_shape="1, 64, 256" \
                                    --rec_char_type="en"
```
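The `--rec_image_shape` value is passed as a comma-separated string. The sketch below shows one way such a string could be turned into channel, height and width integers; `parse_image_shape` is a hypothetical helper written for illustration and is not the parser used by `predict_rec.py`.

```python
def parse_image_shape(value):
    """Turn a "C, H, W" string such as "1, 64, 256" into a tuple of ints."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values, got {value!r}")
    channels, height, width = (int(part) for part in parts)
    if min(channels, height, width) <= 0:
        raise ValueError(f"dimensions must be positive, got {value!r}")
    return channels, height, width


if __name__ == "__main__":
    print(parse_image_shape("1, 64, 256"))  # SRN shape from the command above
    print(parse_image_shape("3, 32, 100"))  # STAR-Net shape used earlier
```

Validating the string before launching inference gives an early error for a malformed shape instead of a failure inside the model.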


<a name="USING_CUSTOM_CHARACTERS"></a>
### 5. TEXT RECOGNITION MODEL INFERENCE USING CUSTOM CHARACTERS DICTIONARY
If the character dictionary was modified during training, you need to specify the new dictionary path with the parameter `rec_char_dict_path` when predicting with your inference model.

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_type="en" --rec_char_dict_path="your text dict path"
```

<a name="MULTILINGUAL_MODEL_INFERENCE"></a>
### 6. MULTILINGUAL MODEL INFERENCE
To run inference with a model for another language, specify the dictionary path with `--rec_char_dict_path`. To get correct visualization results, also specify the font path with `--vis_font_path`. Fonts for several less common languages are provided under the `doc/` path by default. For example, for Korean recognition:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_type="korean" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/korean.ttf"
```
![](../imgs_words/korean/1.jpg)

After executing the command, the prediction result of the above figure is:

``` text
2020-09-19 16:15:05,076-INFO:      index: [205 206  38  39]
2020-09-19 16:15:05,077-INFO:      word : 바탕으로
2020-09-19 16:15:05,077-INFO:      score: 0.9171358942985535
```
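When results need to be collected programmatically, log lines like the ones above can be parsed. The sketch below assumes exactly the `<timestamp>-INFO: <key>: <value>` layout shown in the sample output; `parse_rec_log` is a hypothetical helper, and the log format may differ between versions.

```python
import re

# Matches "<timestamp>-INFO:   <key> : <value>" as in the sample output above.
LOG_LINE = re.compile(r"-INFO:\s*(index|word|score)\s*:\s*(.*)$")


def parse_rec_log(lines):
    """Collect the index, word and score fields from recognition log lines."""
    result = {}
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that carry none of the three fields
        key, value = match.group(1), match.group(2).strip()
        if key == "index":
            result[key] = [int(token) for token in value.strip("[]").split()]
        elif key == "score":
            result[key] = float(value)
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    sample = [
        "2020-09-19 16:15:05,076-INFO:      index: [205 206  38  39]",
        "2020-09-19 16:15:05,077-INFO:      word : 바탕으로",
        "2020-09-19 16:15:05,077-INFO:      score: 0.9171358942985535",
    ]
    print(parse_rec_log(sample))
```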

<a name="ANGLE_CLASSIFICATION_MODEL_INFERENCE"></a>
## ANGLE CLASSIFICATION MODEL INFERENCE

The following will introduce the angle classification model inference.


<a name="ANGLE_CLASS_MODEL_INFERENCE"></a>
### 1. ANGLE CLASSIFICATION MODEL INFERENCE

For angle classification model inference, you can execute the following commands:

```
python3 tools/infer/predict_cls.py --image_dir="./doc/imgs_words_en/word_10.png" --cls_model_dir="./inference/cls/"
```

![](../imgs_words_en/word_10.png)

After executing the command, the prediction results (classification angle and score) of the above image will be printed on the screen.

```
infer_img: doc/imgs_words_en/word_10.png
     result: ('0', 0.9999995)
```
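A downstream step typically uses the `(label, score)` pair to decide whether a text crop should be turned before recognition. The sketch below illustrates such a decision; it assumes the classifier emits the labels "0" and "180" and that low-score predictions are ignored. Both the label set and the 0.9 threshold are illustrative assumptions, and `should_rotate` is a hypothetical helper.

```python
def should_rotate(label, score, threshold=0.9):
    """Return True when a crop should be turned by 180 degrees.

    Assumes labels "0" and "180"; the threshold value is illustrative.
    """
    return label == "180" and score >= threshold


if __name__ == "__main__":
    # The result shown above: label '0' with a high score, so no rotation.
    print(should_rotate("0", 0.9999995))
```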

<a name="CONCATENATION"></a>
## TEXT DETECTION ANGLE CLASSIFICATION AND RECOGNITION INFERENCE CONCATENATION

<a name="LIGHTWEIGHT_CHINESE_MODEL"></a>
### 1. LIGHTWEIGHT CHINESE MODEL

When performing prediction, specify the path of a single image or a folder of images through the parameter `image_dir`. The parameter `det_model_dir` specifies the path to the detection inference model, `cls_model_dir` the path to the angle classification inference model, and `rec_model_dir` the path to the recognition inference model. The parameter `use_angle_cls` controls whether the angle classification model is enabled. The visualized recognition results are saved to the `./inference_results` folder by default.

```
# use the direction classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true

# do not use the direction classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/"
```
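When the pipeline is launched from a script, the two commands above can be assembled from one function. The sketch below only builds the argument list from the flags shown in this section; `predict_system_command` is a hypothetical helper, and the list still has to be handed to something like `subprocess.run` from the repository root to execute it.

```python
import shlex


def predict_system_command(image_dir, det_model_dir, rec_model_dir,
                           cls_model_dir=None, python="python3"):
    """Build the argv list for tools/infer/predict_system.py.

    Passing cls_model_dir enables the angle classifier; leaving it as None
    reproduces the second command above.
    """
    argv = [
        python, "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
    ]
    if cls_model_dir is not None:
        argv.append(f"--cls_model_dir={cls_model_dir}")
    argv.append(f"--rec_model_dir={rec_model_dir}")
    if cls_model_dir is not None:
        argv.append("--use_angle_cls=true")
    return argv


if __name__ == "__main__":
    # Printed for inspection rather than executed.
    print(shlex.join(predict_system_command(
        "./doc/imgs/2.jpg", "./inference/det_db/", "./inference/rec_crnn/",
        cls_model_dir="./inference/cls/")))
```

Building a list instead of a single string avoids shell quoting problems when paths contain spaces.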

After executing the command, the recognition result image is as follows:

![](../imgs_results/2.jpg)

<a name="OTHER_MODELS"></a>
### 2. OTHER MODELS

If you want to try other detection or recognition algorithms, refer to the text detection and text recognition model inference sections above and update the corresponding configuration and model.

**Note: due to the limitation of rotation logic of detected box, SAST curved text detection model (using the parameter `det_sast_polygon=True`) is not supported for model combination yet.**

The following command uses the combination of the EAST text detection and STAR-Net text recognition:

```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_east/" --det_algorithm="EAST" --rec_model_dir="./inference/starnet/" --rec_image_shape="3, 32, 100" --rec_char_type="en"
```

After executing the command, the recognition result image is as follows:

![](../imgs_results/img_10.jpg)