# PaddleOCR Quick Start

[PaddleOCR Quick Start](#paddleocr-quick-start)

* [1. Light Installation](#1-light-installation)
  + [1.1 Install PaddlePaddle2.0](#11-install-paddlepaddle20)
  + [1.2 Install PaddleOCR Whl Package](#12-install-paddleocr-whl-package)
* [2. Easy-to-Use](#2-easy-to-use)
  + [2.1 Use by command line](#21-use-by-command-line)
    - [2.1.1 Chinese and English Model](#211-english-and-chinese-model)
    - [2.1.2 Multi-language Model](#212-multi-language-model)
    - [2.1.3 LayoutParser](#213-layoutparser)
  + [2.2 Use by Code](#22-use-by-code)
    - [2.2.1 Chinese & English Model and Multilingual Model](#221-chinese---english-model-and-multilingual-model)
    - [2.2.2 LayoutParser](#222-layoutparser)

<a name="1-light-installation"></a>

## 1. Light Installation

<a name="11-install-paddlepaddle20"></a>

### 1.1 Install PaddlePaddle2.0

```bash
# If CUDA 9 or CUDA 10 is installed on your machine, install the GPU version:
python3 -m pip install paddlepaddle-gpu==2.0.0 -i https://mirror.baidu.com/pypi/simple

# If your machine only has a CPU, install the CPU version:
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
```

For other software version requirements, refer to the [Installation Document](https://www.paddlepaddle.org.cn/install/quick).

<a name="12-install-paddleocr-whl-package"></a>

### 1.2 Install PaddleOCR Whl Package

```bash
pip install "paddleocr>=2.0.1" # Version 2.0.1 or later is recommended
```

- **For Windows users:** If you get the error `OSError: [WinError 126] The specified module could not be found` when installing Shapely on Windows, try downloading the Shapely whl file [here](http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely).

  Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)

- **For layout analysis users:** Run the following command to install **Layout-Parser**:

  ```bash
  pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
  ```
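
After the installation steps above, it can be useful to confirm that the packages are importable before moving on. The following is a minimal sketch, not an official check: `find_missing` is a hypothetical helper, and the import names (`paddle`, `paddleocr`, `shapely`, `layoutparser`) are assumptions about how the installed packages are exposed to Python.

```python
import importlib.util

# Assumed import names: the pip packages paddlepaddle / paddlepaddle-gpu are
# imported as "paddle"; the others are assumed to match their package names.
REQUIRED_MODULES = ["paddle", "paddleocr", "shapely"]
OPTIONAL_MODULES = ["layoutparser"]  # only needed for layout analysis


def find_missing(modules):
    """Return the names in `modules` that the import system cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", find_missing(REQUIRED_MODULES))
    print("missing optional:", find_missing(OPTIONAL_MODULES))
```

An empty list means every module in that group was found; it does not verify that the GPU build or the models work.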

<a name="2-easy-to-use"></a>

## 2. Easy-to-Use

<a name="21-use-by-command-line"></a>

### 2.1 Use by command line

PaddleOCR provides a series of test images; click xx to download them, then switch to the corresponding directory in the terminal:

```bash
cd /path/to/ppocr_img
```

If you do not use the provided test images, replace the `--image_dir` value in the commands below with the path to your own image.

<a name="211-english-and-chinese-model"></a>

#### 2.1.1 Chinese and English Model

* Detection, direction classification and recognition: set the direction classifier parameter `--use_angle_cls true` to recognize vertical text.

  ```bash
  paddleocr --image_dir ./imgs_en/img_12.jpg --use_angle_cls true --lang en
  ```

  The output is a list; each item contains the bounding box, text and recognition confidence:

  ```bash
  [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
  [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
  [[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
  ......
  ```

* Only detection: set `--rec` to `false`

  ```bash
  paddleocr --image_dir ./imgs_en/img_12.jpg --rec false
  ```

  The output is a list; each item contains only the bounding box:

  ```bash
  [[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]]
  [[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]
  [[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]]
  ......
  ```

* Only recognition: set `--det` to `false`

  ```bash
  paddleocr --image_dir ./imgs_words_en/word_10.png --det false --lang en
  ```

  The output is a list; each item contains the text and recognition confidence:

  ```bash
  ['PAIN', 0.990372]
  ```
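
The three output shapes above (detection plus recognition, detection only, recognition only) can be told apart programmatically. The following is a minimal sketch, not part of PaddleOCR: `parse_line` is a hypothetical helper, and it assumes each result line is printed exactly as shown above, with no log prefix and no `......` placeholder.

```python
import ast


def parse_line(line):
    """Parse one printed result line into a dict with whichever fields are present."""
    item = ast.literal_eval(line.strip())
    # Recognition only: ['PAIN', 0.990372]
    if isinstance(item[0], str):
        return {"box": None, "text": item[0], "confidence": item[1]}
    # Detection only: four [x, y] corner points
    if len(item) == 4:
        return {"box": item, "text": None, "confidence": None}
    # Detection + recognition: [box, [text, confidence]]
    box, (text, confidence) = item
    return {"box": box, "text": text, "confidence": confidence}


print(parse_line("['PAIN', 0.990372]"))
```

`ast.literal_eval` is used instead of `eval` so that only Python literals are accepted.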

For more whl package usage, see [whl package](./whl_en.md).

<a name="212-multi-language-model"></a>

#### 2.1.2 Multi-language Model

PaddleOCR currently supports 80 languages, which can be switched by changing the `--lang` parameter:

```bash
paddleocr --image_dir ./doc/imgs_en/254.jpg --lang=en
```

<div align="center">
    <img src="../imgs_en/254.jpg" width="300" height="600">
    <img src="../imgs_results/multi_lang/img_02.jpg" width="600" height="600">
</div>

The result is a list; each item contains the text, recognition confidence and text box:

```text
[('PHO CAPITAL', 0.95723116), [[66.0, 50.0], [327.0, 44.0], [327.0, 76.0], [67.0, 82.0]]]
[('107 State Street', 0.96311164), [[72.0, 90.0], [451.0, 84.0], [452.0, 116.0], [73.0, 121.0]]]
[('Montpelier Vermont', 0.97389287), [[69.0, 132.0], [501.0, 126.0], [501.0, 158.0], [70.0, 164.0]]]
[('8022256183', 0.99810505), [[71.0, 175.0], [363.0, 170.0], [364.0, 202.0], [72.0, 207.0]]]
[('REG 07-24-201706:59 PM', 0.93537045), [[73.0, 299.0], [653.0, 281.0], [654.0, 318.0], [74.0, 336.0]]]
[('045555', 0.99346405), [[509.0, 331.0], [651.0, 325.0], [652.0, 356.0], [511.0, 362.0]]]
[('CT1', 0.9988654), [[535.0, 367.0], [654.0, 367.0], [654.0, 406.0], [535.0, 406.0]]]
......
```

Commonly used language abbreviations include:

| Language            | Abbreviation |      | Language | Abbreviation |      | Language | Abbreviation |
| ------------------- | ------------ | ---- | -------- | ------------ | ---- | -------- | ------------ |
| Chinese & English   | ch           |      | French   | fr           |      | Japanese | japan        |
| English             | en           |      | German   | german       |      | Korean   | korean       |
| Chinese Traditional | chinese_cht  |      | Italian  | it           |      | Russian  | ru           |
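
The mapping in the table above can be wrapped in a small lookup so that scripts refer to languages by name. This is a sketch under stated assumptions: `LANG_ABBREVIATIONS` and `build_command` are hypothetical names, the dictionary only covers the nine entries listed in the table, and the command is built as a string rather than executed.

```python
import shlex

# Abbreviations copied from the table above.
LANG_ABBREVIATIONS = {
    "Chinese & English": "ch",
    "English": "en",
    "Chinese Traditional": "chinese_cht",
    "French": "fr",
    "German": "german",
    "Italian": "it",
    "Japanese": "japan",
    "Korean": "korean",
    "Russian": "ru",
}


def build_command(image_path, language):
    """Return a paddleocr command line for `image_path` in the given language."""
    abbreviation = LANG_ABBREVIATIONS[language]  # KeyError for unlisted languages
    args = ["paddleocr", "--image_dir", image_path, "--lang", abbreviation]
    # shlex.quote protects paths that contain spaces or shell metacharacters.
    return " ".join(shlex.quote(arg) for arg in args)


print(build_command("./doc/imgs_en/254.jpg", "English"))
# prints: paddleocr --image_dir ./doc/imgs_en/254.jpg --lang en
```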

A list of all languages and their corresponding abbreviations can be found in the [Multi-Language Model Tutorial](./multi_languages_en.md).

<a name="213-layoutparser"></a>

#### 2.1.3 LayoutParser

To use the layout analysis function of PaddleOCR, specify `--type=structure`:

```bash
paddleocr --image_dir=../doc/table/1.png --type=structure
```

- **Results Format**

  PP-Structure returns a list of dicts. An example is as follows:

  ```shell
  [
    {   'type': 'Text',
        'bbox': [34, 432, 345, 462],
        'res': ([[36.0, 437.0, 341.0, 437.0, 341.0, 446.0, 36.0, 447.0], [41.0, 454.0, 125.0, 453.0, 125.0, 459.0, 41.0, 460.0]],
                  [('Tigure-6. The performance of CNN and IPT models using difforen', 0.90060663), ('Tent  ', 0.465441)])
    }
  ]
  ```

  The fields of each dict are described below:

  | Parameter | Description                                                  |
  | --------- | ------------------------------------------------------------ |
  | type      | Type of image area                                           |
  | bbox      | The coordinates of the image area in the original image, in the order [top-left x, top-left y, bottom-right x, bottom-right y] |
  | res       | OCR or table recognition result of the image area.<br> Table: HTML string of the table; <br> OCR: A tuple containing the detection coordinates and recognition results of each single line of text |

- **Parameter Description:**

  | Parameter       | Description                                                  | Default value                                |
  | --------------- | ------------------------------------------------------------ | -------------------------------------------- |
  | output          | The path where Excel files and recognition results are saved | ./output/table                               |
  | table_max_len   | Length to which the long side of the image is resized for the table structure model | 488                                          |
  | table_model_dir | Inference model path of the table structure model            | None                                         |
  | table_char_type | Dictionary path of the table structure model                 | ../ppocr/utils/dict/table_structure_dict.txt |
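
The result format described above can be walked with a few lines of Python. The following is a minimal sketch, assuming the dict layout shown in the example (`type`, `bbox`, `res`) and that `res` is an HTML string for tables and a two-element tuple for OCR regions; `quad_to_points` and `summarize_region` are hypothetical helpers, not PaddleOCR functions.

```python
def quad_to_points(flat):
    """Turn [x1, y1, x2, y2, x3, y3, x4, y4] into four (x, y) pairs."""
    return list(zip(flat[0::2], flat[1::2]))


def summarize_region(region):
    """Summarize one dict from the PP-Structure result list."""
    left, top, right, bottom = region["bbox"]
    summary = {"type": region["type"], "width": right - left, "height": bottom - top}
    res = region["res"]
    if isinstance(res, str):
        # Table regions: `res` is described as an HTML string.
        summary["html"] = res
    else:
        # OCR regions: (detection coordinates, recognition results).
        boxes, texts = res
        summary["lines"] = [
            {"points": quad_to_points(box), "text": text, "confidence": conf}
            for box, (text, conf) in zip(boxes, texts)
        ]
    return summary


# The example region from the results format above.
example = {
    "type": "Text",
    "bbox": [34, 432, 345, 462],
    "res": (
        [[36.0, 437.0, 341.0, 437.0, 341.0, 446.0, 36.0, 447.0],
         [41.0, 454.0, 125.0, 453.0, 125.0, 459.0, 41.0, 460.0]],
        [("Tigure-6. The performance of CNN and IPT models using difforen", 0.90060663),
         ("Tent  ", 0.465441)],
    ),
}
summary = summarize_region(example)
print(summary["width"], summary["height"], len(summary["lines"]))
# prints: 311 30 2
```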

<a name="22-use-by-code"></a>

### 2.2 Use by Code

<a name="221-chinese---english-model-and-multilingual-model"></a>

#### 2.2.1 Chinese & English Model and Multilingual Model

* Detection, angle classification and recognition:

```python
from PIL import Image
from paddleocr import PaddleOCR, draw_ocr

# PaddleOCR supports Chinese, English, French, German, Korean and Japanese.
# Set the parameter `lang` to `ch`, `en`, `fr`, `german`, `korean` or `japan`
# respectively to switch the language model.
ocr = PaddleOCR(use_angle_cls=True, lang='en')  # only needs to run once to download the model and load it into memory
img_path = './imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

# Draw the result
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

The output is a list; each item contains the bounding box, text and recognition confidence:

```bash
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```
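
A common next step is to keep only the confident lines and join them into plain text. The following is a minimal sketch, assuming `result` has exactly the shape printed above (a list of `[box, [text, confidence]]` items); `confident_text` and its default threshold of 0.5 are illustrative choices, not part of PaddleOCR.

```python
def confident_text(result, min_confidence=0.5):
    """Join recognized lines at or above `min_confidence`, ordered top to bottom."""
    kept = [(box, text) for box, (text, conf) in result if conf >= min_confidence]
    # Sort by the top-left corner of each box: y first, then x.
    kept.sort(key=lambda item: (item[0][0][1], item[0][0][0]))
    return "\n".join(text for _, text in kept)


# Sample data copied from the output above.
result = [
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]],
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]],
    [[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]],
]
print(confident_text(result, min_confidence=0.95))
# prints:
# ACKNOWLEDGEMENTS
# contributors whohave been involved in the
```

Sorting by the top-left corner is a simple heuristic; it does not handle multi-column layouts.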

Visualization of results:

<div align="center">
    <img src="../imgs_results/whl/12_det_rec.jpg" width="800">
</div>

<a name="222-layoutparser"></a>

#### 2.2.2 LayoutParser

```python
import os

import cv2
from PIL import Image
from paddleocr import PPStructure, draw_structure_result, save_structure_res

table_engine = PPStructure(show_log=True)

save_folder = './output/table'
img_path = './table/1.png'
img = cv2.imread(img_path)
result = table_engine(img)
# Save the results to save_folder, using the image's base name
save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0])

for line in result:
    line.pop('img')  # drop the 'img' entry before printing
    print(line)

# Draw the result
font_path = './fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_structure_result(image, result, font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
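
The example above derives the name passed to `save_structure_res` with `os.path.basename(img_path).split('.')[0]`, which keeps only the text before the first dot. For file names that contain several dots, `os.path.splitext` may be closer to what is wanted. A small sketch of the difference, where `output_name` is a hypothetical helper and the second path is made up for illustration:

```python
import os


def output_name(img_path):
    """Base name of `img_path` without its final extension."""
    return os.path.splitext(os.path.basename(img_path))[0]


print(output_name("./table/1.png"))                            # 1
print(output_name("./table/scan.v2.png"))                      # scan.v2
print(os.path.basename("./table/scan.v2.png").split(".")[0])   # scan
```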