# paddleocr package

## Get started quickly
### Install package
Install from PyPI
```bash
pip install paddleocr
```

Build your own whl package and install it
```bash
python3 setup.py bdist_wheel
pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
```
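
To confirm that either installation route worked, the installed version can be queried from Python. This is a minimal sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name, not part of paddleocr.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if paddleocr is installed, otherwise None.
print(installed_version('paddleocr'))
```
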
### 1. Use by code

* detection, classification and recognition
```python
from paddleocr import PaddleOCR,draw_ocr
# PaddleOCR supports Chinese, English, French, German, Korean and Japanese.
# Set the `lang` parameter to `ch`, `en`, `french`, `german`, `korean` or `japan`
# to switch to the corresponding language model.
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)


# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

The output is a list; each item contains the bounding box, text, and recognition confidence
```bash
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```
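
Each result line pairs a four-point box with a `[text, confidence]` list, so it can be unpacked without any PaddleOCR-specific helpers. The following is a minimal sketch: `split_result` and its `min_score` cutoff are hypothetical names used for illustration, and the sample data is copied from the output above.

```python
# Sample lines copied from the output above.
sample_result = [
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]],
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]],
    [[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]],
]


def split_result(result, min_score=0.0):
    """Split result lines into parallel lists of boxes, texts and scores.

    Lines whose confidence is below `min_score` are skipped.
    """
    boxes, txts, scores = [], [], []
    for box, (text, score) in result:
        if score < min_score:
            continue
        boxes.append(box)
        txts.append(text)
        scores.append(score)
    return boxes, txts, scores


boxes, txts, scores = split_result(sample_result, min_score=0.95)
print(txts)
```

The three returned lists have the same shape as the `boxes`, `txts` and `scores` variables built by the list comprehensions in the drawing code above.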

Visualization of results

<div align="center">
    <img src="../imgs_results/whl/12_det_rec.jpg" width="800">
</div>

* detection and recognition
```python
from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR(lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path)
for line in result:
    print(line)

# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

The output is a list; each item contains the bounding box, text, and recognition confidence
```bash
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```

Visualization of results

<div align="center">
    <img src="../imgs_results/whl/12_det_rec.jpg" width="800">
</div>

* classification and recognition
```python
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=True)
for line in result:
    print(line)
```

The output is a list; each item contains the recognized text and its confidence
```bash
['PAIN', 0.990372]
```

* only detection
```python
from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path,rec=False)
for line in result:
    print(line)

# draw result
from PIL import Image

image = Image.open(img_path).convert('RGB')
im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

The output is a list; each item contains only the bounding box
```bash
[[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]]
[[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]
[[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]]
......
```

Visualization of results

<div align="center">
    <img src="../imgs_results/whl/12_det.jpg" width="800">
</div>
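
Detection returns each box as four corner points, which may be slightly skewed. When an axis-aligned rectangle is needed (for example, to crop the region), the enclosing rectangle can be computed from the corners. This is a minimal sketch; `box_to_rect` is a hypothetical helper, not a paddleocr function.

```python
def box_to_rect(box):
    """Return (left, top, right, bottom) enclosing a four-point box."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    return min(xs), min(ys), max(xs), max(ys)


# Second box from the sample output above; its corners are slightly skewed.
box = [[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]
rect = box_to_rect(box)
print(rect)  # (820.0, 801.0, 1085.0, 838.0)
```

The tuple order (left, upper, right, lower) matches what `PIL.Image.Image.crop` expects.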

* only recognition
```python
from paddleocr import PaddleOCR
ocr = PaddleOCR(lang='en') # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=False)
for line in result:
    print(line)
```

The output is a list; each item contains the recognized text and its confidence
```bash
['PAIN', 0.990372]
```

* only classification
```python
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True) # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, rec=False, cls=True)
for line in result:
    print(line)
```

The output is a list; each item contains the classification result and its confidence
```bash
['0', 0.99999964]
```
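
The classification label appears to be one of the entries of `label_list` (`['0', '180']` by default), read here as the text angle in degrees. The sketch below shows one way to turn such a result into a rotation decision. `rotation_to_apply` and its `threshold` value are assumptions made for illustration, and the `'180'` inputs are made-up examples rather than real model output.

```python
def rotation_to_apply(cls_result, threshold=0.9):
    """Return the angle (in degrees) to rotate by, or 0 to leave the image as-is.

    A non-zero label is only trusted when its confidence reaches `threshold`.
    """
    label, score = cls_result
    angle = int(label)
    if angle != 0 and score >= threshold:
        return angle
    return 0


print(rotation_to_apply(['0', 0.99999964]))  # 0: result from the output above
print(rotation_to_apply(['180', 0.97]))      # 180: hypothetical confident result
print(rotation_to_apply(['180', 0.60]))      # 0: hypothetical low-confidence result
```
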

### 2. Use by command line

Show help information
```bash
paddleocr -h
```

* detection, classification and recognition
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --use_angle_cls true --cls true --lang en
```

The output is a list; each item contains the bounding box, text, and recognition confidence
```bash
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```

* detection and recognition
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --lang en
```

The output is a list; each item contains the bounding box, text, and recognition confidence
```bash
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```

* classification and recognition
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --cls true --det false --lang en
```

The output is a list; each item contains the recognized text and its confidence
```bash
['PAIN', 0.990372]
```

* only detection
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --rec false
```

The output is a list; each item contains only the bounding box
```bash
[[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]]
[[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]
[[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]]
......
```

* only recognition
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --det false --cls false --lang en
```

The output is a list; each item contains the recognized text and its confidence
```bash
['PAIN', 0.990372]
```

* only classification
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --cls true --det false --rec false
```

The output is a list; each item contains the classification result and its confidence
```bash
['0', 0.99999964]
```
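
The result lines shown in this section are valid Python literals, so captured command-line output could be parsed back into lists with `ast.literal_eval`. This is a sketch under stated assumptions: each result sits on its own line, and any other line (log messages, the `......` elision) is skipped. `parse_output_lines` is a hypothetical helper, and the real CLI output may be formatted differently.

```python
import ast


def parse_output_lines(text):
    """Parse lines that are Python list literals; skip everything else."""
    parsed = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith('['):
            continue
        try:
            parsed.append(ast.literal_eval(line))
        except (ValueError, SyntaxError):
            continue  # not a complete literal, e.g. a truncated line
    return parsed


# Lines copied from the sample outputs above.
captured = """\
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
......
['PAIN', 0.990372]
"""
for item in parse_output_lines(captured):
    print(item)
```
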

## Use custom model
When the built-in models do not meet your needs, you can use your own trained models.
First, follow the first section of [inference_en.md](./inference_en.md) to convert your detection and recognition models to inference models, then use them as follows.

### 1. Use by code

```python
from paddleocr import PaddleOCR,draw_ocr
# The detection and recognition model directories must each contain the model and params files
ocr = PaddleOCR(det_model_dir='{your_det_model_dir}', rec_model_dir='{your_rec_model_dir}', rec_char_dict_path='{your_rec_char_dict_path}', cls_model_dir='{your_cls_model_dir}', use_angle_cls=True)
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

### 2. Use by command line

```bash
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir} --rec_char_dict_path {your_rec_char_dict_path} --cls_model_dir {your_cls_model_dir} --use_angle_cls true --cls true
```
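
The custom model examples above require each model directory to contain the model and params files. Before passing a directory to `PaddleOCR`, it can be checked with a small helper like the one below. This is a sketch only: `missing_model_files` is hypothetical, and it assumes the files are literally named `model` and `params`, which may not hold for every export format.

```python
import os
import tempfile


def missing_model_files(model_dir, required=('model', 'params')):
    """Return the names from `required` that are not files in `model_dir`."""
    return [name for name in required
            if not os.path.isfile(os.path.join(model_dir, name))]


# Demonstrate with a throwaway directory that only has a `model` file.
with tempfile.TemporaryDirectory() as tmp_dir:
    open(os.path.join(tmp_dir, 'model'), 'w').close()
    print(missing_model_files(tmp_dir))  # ['params']
```
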

### Use web images or numpy array as input

1. Web image

Use by code
```python
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory
img_path = 'http://n.sinaimg.cn/ent/transform/w630h933/20171222/o111-fypvuqf1838418.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

# show result
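from io import BytesIO
from urllib.request import urlopen
from PIL import Image
# Image.open cannot read a URL directly, so download the image bytes first
image = Image.open(BytesIO(urlopen(img_path).read())).convert('RGB')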
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
Use by command line
```bash
paddleocr --image_dir http://n.sinaimg.cn/ent/transform/w630h933/20171222/o111-fypvuqf1838418.jpg --use_angle_cls=true
```

2. Numpy array
NumPy arrays are supported as input only when used by code

```python
import cv2
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg'
img = cv2.imread(img_path)
# img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # uncomment if your own trained model supports grayscale images
result = ocr.ocr(img, cls=True)  # pass the numpy array, not the path
for line in result:
    print(line)

# show result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```


## Parameter Description

| Parameter                    | Description                                                                                                                                                                                                                 | Default value                  |
|-------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|
| use_gpu                 | use GPU or not                                                                                                                                                                                                          | TRUE                    |
| gpu_mem                 | GPU memory size used for initialization                                                                                                                                                                                              | 8000M                   |
| image_dir               | Path to the image or image folder to predict on, when used by the command line | |
| det_algorithm           | Type of detection algorithm selected                                                                                                                                                                                                   | DB                      |
| det_model_dir           | The text detection inference model folder. There are two ways to set it: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/det`; 2. the path of an inference model you converted yourself, which must contain the model and params files | None |
| det_max_side_len        | The maximum size of the long side of the image. When the long side exceeds this value, the long side will be resized to this size, and the short side will be scaled proportionally                                                                                                                         | 960                     |
| det_db_thresh           | Binarization threshold value of DB output map                                                                                                                                                                                        | 0.3                     |
| det_db_box_thresh       | The threshold value of the DB output box. Boxes scoring lower than this value are discarded | 0.5 |
| det_db_unclip_ratio     | The expanded ratio of DB output box                                                                                                                                                                                             | 2                       |
| det_east_score_thresh   | Binarization threshold value of EAST output map                                                                                                                                                                                       | 0.8                     |
| det_east_cover_thresh   | The threshold value of the EAST output box. Boxes scoring lower than this value are discarded | 0.1 |
| det_east_nms_thresh     | The NMS threshold value of EAST model output box                                                                                                                                                                                              | 0.2                     |
| rec_algorithm           | Type of recognition algorithm selected                                                                                                                                                                                                | CRNN                    |
| rec_model_dir           | The text recognition inference model folder. There are two ways to set it: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/rec`; 2. the path of an inference model you converted yourself, which must contain the model and params files | None |
| rec_image_shape         | image shape of recognition algorithm                                                                                                                                                                                            | "3,32,320"              |
| rec_char_type           | Character type of recognition algorithm, Chinese (ch) or English (en)                                                                                                                                                                               | ch                      |
| rec_batch_num           | Batch size of images forwarded during recognition | 30 |
| max_text_length         | The maximum text length that the recognition algorithm can recognize                                                                                                                                                                                         | 25                      |
| rec_char_dict_path      | The character dictionary path, which must be changed to your own path when `rec_model_dir` uses mode 2 | ./ppocr/utils/ppocr_keys_v1.txt |
| use_space_char          | Whether to recognize spaces                                                                                                                                                                                                         | TRUE                    |
| drop_score          | Filter the output by score (from the recognition model), and those below this score will not be returned                                                                                                                                                                                                        | 0.5                    |
| use_angle_cls          | Whether to load classification model                                                                                                                                                                                                       | FALSE                    |
| cls_model_dir           | The classification inference model folder. There are two ways to set it: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/cls`; 2. the path of an inference model you converted yourself, which must contain the model and params files | None |
| cls_image_shape         | image shape of classification algorithm                                                                                                                                                                                            | "3,48,192"              |
| label_list         | label list of classification algorithm                                                                                                                                                                                            | ['0','180']           |
| cls_batch_num           | Batch size of images forwarded during classification | 30 |
| enable_mkldnn           | Whether to enable mkldnn                                                                                                                                                                                                       | FALSE                   |
| use_zero_copy_run           | Whether to forward by zero_copy_run                                                                                                                                                                               | FALSE                   |
| lang                    | The language to use. Currently only Chinese (ch), English (en), French (french), German (german), Korean (korean) and Japanese (japan) are supported | ch |
| det                     | Enable detection when the `ppocr.ocr` function executes | TRUE |
| rec                     | Enable recognition when the `ppocr.ocr` function executes | TRUE |
| cls                     | Enable classification when the `ppocr.ocr` function executes (in command line mode, use `use_angle_cls` to control whether classification runs during the forward pass) | FALSE |
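
The `det_max_side_len` row says that an image whose long side exceeds the limit is resized so that the long side equals the limit, with the short side scaled proportionally. The arithmetic can be sketched as follows. `limit_long_side` is a hypothetical helper written only to illustrate that rule; the library's actual preprocessing may apply further adjustments that are not shown here.

```python
def limit_long_side(width, height, max_side_len=960):
    """Scale (width, height) so the longer side is at most `max_side_len`."""
    long_side = max(width, height)
    if long_side <= max_side_len:
        return width, height  # already within the limit, leave unchanged
    ratio = max_side_len / long_side
    return round(width * ratio), round(height * ratio)


print(limit_long_side(1920, 1080))  # (960, 540)
print(limit_long_side(640, 480))    # (640, 480)
```
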