# PaddleOCR Quick Start

**Note:** This tutorial mainly covers the usage of the PP-OCR series models. For a quick introduction to the document analysis functions, refer to [PP-Structure Quick Start](../../ppstructure/docs/quickstart_en.md).

- [1. Installation](#1-installation)
    - [1.1 Install PaddlePaddle](#11-install-paddlepaddle)
    - [1.2 Install PaddleOCR Whl Package](#12-install-paddleocr-whl-package)
- [2. Easy-to-Use](#2-easy-to-use)
    - [2.1 Use by Command Line](#21-use-by-command-line)
      - [2.1.1 Chinese and English Model](#211-chinese-and-english-model)
      - [2.1.2 Multi-language Model](#212-multi-language-model)
    - [2.2 Use by Code](#22-use-by-code)
      - [2.2.1 Chinese & English Model and Multilingual Model](#221-chinese--english-model-and-multilingual-model)
- [3. Summary](#3-summary)



<a name="1-installation"></a>

## 1. Installation

<a name="11-install-paddlepaddle"></a>

### 1.1 Install PaddlePaddle

> If you do not have a Python environment, please refer to [Environment Preparation](./environment_en.md).

- If you have CUDA 9 or CUDA 10 installed on your machine, run the following command to install the GPU version

  ```bash
  python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
  ```

- If your machine has no available GPU, run the following command to install the CPU version

  ```bash
  python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
  ```

For more software version requirements, refer to the instructions in the [Installation Document](https://www.paddlepaddle.org.cn/install/quick).

<a name="12-install-paddleocr-whl-package"></a>

### 1.2 Install PaddleOCR Whl Package

```bash
pip install "paddleocr>=2.0.1" # version 2.0.1 or later is recommended
```

- **For Windows users:** If you get the error `OSError: [WinError 126] The specified module could not be found` when installing Shapely on Windows, try downloading the Shapely whl file from [here](http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely).

  Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)

- **For layout analysis users:** Run the following command to install **Layout-Parser**

  ```bash
  pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
  ```
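As a quick sanity check after installation, the following minimal sketch reports which packages are still missing from the current Python environment. The helper name `missing_modules` is hypothetical and not part of PaddleOCR; it only assumes that the packages are imported as `paddle`, `paddleocr` and `shapely`.

```python
from importlib.util import find_spec


def missing_modules(names=("paddle", "paddleocr", "shapely")):
    """Return the subset of `names` that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All required modules were found.")
```

The check only looks the modules up and does not import them, so it stays fast and does not load any model.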

<a name="2-easy-to-use"></a>

## 2. Easy-to-Use

<a name="21-use-by-command-line"></a>

### 2.1 Use by Command Line

PaddleOCR provides a set of test images. Click [here](https://paddleocr.bj.bcebos.com/dygraph_v2.1/ppocr_img.zip) to download them, then switch to the corresponding directory in the terminal

```bash
cd /path/to/ppocr_img
```
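If you prefer to fetch the test images from a script instead of the browser, a minimal sketch using only the Python standard library could look like this. The function name `fetch_and_extract` is hypothetical; the URL is the one linked above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL of the test image archive linked above.
PPOCR_IMG_URL = "https://paddleocr.bj.bcebos.com/dygraph_v2.1/ppocr_img.zip"


def fetch_and_extract(url, dest_dir):
    """Download a zip archive from `url` and unpack it into `dest_dir`."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "download.zip"
    urllib.request.urlretrieve(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    archive.unlink()  # remove the archive once it is unpacked
    return dest
```

Calling `fetch_and_extract(PPOCR_IMG_URL, ".")` unpacks the archive into the current directory, after which the `cd` command above applies to the extracted folder.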

If you do not use the provided test images, replace the `--image_dir` parameter in the following commands with the path to your own test image

**Note**: The whl package uses the `PP-OCRv3` model by default, and the recognition model expects an input shape of `3,48,320`. If you use the recognition function, add the parameter `--rec_image_shape 3,48,320`. If you do not use the default `PP-OCRv3` model, you do not need to set this parameter.
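To make the shape string concrete: `3,48,320` is read as channels, height and width. The sketch below parses it and estimates how a cropped text line would be scaled to that height, assuming an aspect-preserving resize followed by padding on the right. The helper names are hypothetical, and the actual preprocessing inside PaddleOCR may differ in detail.

```python
import math


def parse_shape(shape_str):
    """Split a 'C,H,W' string such as '3,48,320' into three integers."""
    channels, height, width = (int(part) for part in shape_str.split(","))
    return channels, height, width


def fit_width(crop_w, crop_h, shape_str="3,48,320"):
    """Return (resized_width, right_padding) for a text crop.

    Assumption: the crop is scaled to the target height while keeping its
    aspect ratio, capped at the target width, then padded on the right.
    """
    _, target_h, max_w = parse_shape(shape_str)
    scaled_w = math.ceil(target_h * crop_w / crop_h)
    resized_w = min(scaled_w, max_w)
    return resized_w, max_w - resized_w


print(parse_shape("3,48,320"))
print(fit_width(100, 50))
```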

<a name="211-chinese-and-english-model"></a>

#### 2.1.1 Chinese and English Model

* Detection, direction classification and recognition: set the parameter `--use_gpu false` to disable the GPU device

  ```bash
  paddleocr --image_dir ./imgs_en/img_12.jpg --use_angle_cls true --lang en --use_gpu false --rec_image_shape 3,48,320
  ```

  The output is a list; each item contains the bounding box, the text and the recognition confidence

  ```bash
  [[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
  [[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
  [[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]
  ......
  ```

* Only detection: set `--rec` to `false`

  ```bash
  paddleocr --image_dir ./imgs_en/img_12.jpg --rec false
  ```

  The output is a list; each item contains only the bounding box

  ```bash
  [[397.0, 802.0], [1092.0, 802.0], [1092.0, 841.0], [397.0, 841.0]]
  [[397.0, 750.0], [1211.0, 750.0], [1211.0, 789.0], [397.0, 789.0]]
  [[397.0, 702.0], [1209.0, 698.0], [1209.0, 734.0], [397.0, 738.0]]
  ......
  ```

* Only recognition: set `--det` to `false`

  ```bash
  paddleocr --image_dir ./imgs_words_en/word_10.png --det false --lang en --rec_image_shape 3,48,320
  ```

  The output is a list; each item contains the text and the recognition confidence

  ```bash
  ['PAIN', 0.9934559464454651]
  ```
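The three output formats above are printed as Python literals, so they can be read back with `ast.literal_eval`. The sketch below normalizes one printed line into a dictionary. It assumes the lines are captured exactly as shown, without any log prefix, and `parse_line` is a hypothetical helper rather than a PaddleOCR function.

```python
import ast


def parse_line(text):
    """Turn one printed result line into {'box', 'text', 'score'}."""
    item = ast.literal_eval(text.strip())
    # Recognition only: ['PAIN', 0.99]
    if isinstance(item[0], str):
        return {"box": None, "text": item[0], "score": item[1]}
    # Detection only: four [x, y] points
    if isinstance(item[0][0], (int, float)):
        return {"box": item, "text": None, "score": None}
    # Detection and recognition: [box, (text, score)]
    box, (text, score) = item
    return {"box": box, "text": text, "score": score}


line = ("[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], "
        "('ACKNOWLEDGEMENTS', 0.9971134662628174)]")
print(parse_line(line)["text"])
```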

If you need to use the 2.0 model, specify the parameter `--version PP-OCR`. paddleocr uses the PP-OCRv3 model by default (`--version PP-OCRv3`). More whl package usage can be found in [whl package](./whl_en.md).
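When the command line is assembled from a script, the rules about `--version` and `--rec_image_shape` can be encoded once. The following is a minimal sketch under the assumptions stated in this section; `build_command` is a hypothetical helper, and it only uses flags that appear in the commands above.

```python
import shlex


def build_command(image_dir, lang, version="PP-OCRv3", det=True, rec=True,
                  use_angle_cls=False, use_gpu=True):
    """Assemble a paddleocr command line as a list of arguments."""
    args = ["paddleocr", "--image_dir", image_dir,
            "--lang", lang, "--version", version]
    if use_angle_cls:
        args += ["--use_angle_cls", "true"]
    if not use_gpu:
        args += ["--use_gpu", "false"]
    if not det:
        args += ["--det", "false"]
    if not rec:
        args += ["--rec", "false"]
    # The recognition input shape is only needed for the default PP-OCRv3 model.
    if rec and version == "PP-OCRv3":
        args += ["--rec_image_shape", "3,48,320"]
    return args


command = build_command("./imgs_words_en/word_10.png", "en", det=False)
print(shlex.join(command))
```

The returned list can be passed directly to `subprocess.run(command)`, which avoids shell quoting issues with image paths.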
<a name="212-multi-language-model"></a>

#### 2.1.2 Multi-language Model

PaddleOCR currently supports 80 languages, which can be switched by modifying the `--lang` parameter. PP-OCRv3 currently provides only Chinese and English models; other multilingual models will be updated over time.

``` bash
paddleocr --image_dir ./doc/imgs_en/254.jpg --lang=en --rec_image_shape 3,48,320
```

<div align="center">
    <img src="../imgs_en/254.jpg" width="300" height="600">
    <img src="../imgs_results/multi_lang/img_02.jpg" width="600" height="600">
</div>

The result is a list; each item contains a text box, the text and the recognition confidence

```text
[[[67.0, 51.0], [327.0, 46.0], [327.0, 74.0], [68.0, 80.0]], ('PHOCAPITAL', 0.9944712519645691)]
[[[72.0, 92.0], [453.0, 84.0], [454.0, 114.0], [73.0, 122.0]], ('107 State Street', 0.9744491577148438)]
[[[69.0, 135.0], [501.0, 125.0], [501.0, 156.0], [70.0, 165.0]], ('Montpelier Vermont', 0.9357033967971802)]
......
```
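Each text box is given as four `[x, y]` corner points and may be slightly rotated, as in the result above. When an axis-aligned rectangle is enough, for example to crop the region, the corners can be reduced to their extremes. This is a minimal sketch; `quad_to_rect` is a hypothetical helper and not part of PaddleOCR.

```python
def quad_to_rect(quad):
    """Return (left, top, right, bottom) enclosing four [x, y] points."""
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return min(xs), min(ys), max(xs), max(ys)


# First text box from the result above.
box = [[67.0, 51.0], [327.0, 46.0], [327.0, 74.0], [68.0, 80.0]]
left, top, right, bottom = quad_to_rect(box)
print(left, top, right, bottom)
print("width:", right - left, "height:", bottom - top)
```

The tuple order matches what `PIL.Image.Image.crop` expects, so `image.crop(quad_to_rect(box))` cuts out the text region.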

Commonly used language abbreviations include

| Language            | Abbreviation |      | Language | Abbreviation |      | Language | Abbreviation |
| ------------------- | ------------ | ---- | -------- | ------------ | ---- | -------- | ------------ |
| Chinese & English   | ch           |      | French   | fr           |      | Japanese | japan        |
| English             | en           |      | German   | german       |      | Korean   | korean       |
| Chinese Traditional | chinese_cht  |      | Italian  | it           |      | Russian  | ru           |

A list of all languages and their corresponding abbreviations can be found in the [Multi-Language Model Tutorial](./multi_languages_en.md).
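In scripts it can be convenient to look up the abbreviation by language name. The mapping below copies only the nine entries from the table above; `LANG_CODES` and `lang_code` are hypothetical names, and the full list in the tutorial is larger.

```python
# Abbreviations copied from the table above.
LANG_CODES = {
    "Chinese & English": "ch",
    "English": "en",
    "Chinese Traditional": "chinese_cht",
    "French": "fr",
    "German": "german",
    "Italian": "it",
    "Japanese": "japan",
    "Korean": "korean",
    "Russian": "ru",
}


def lang_code(name):
    """Return the `--lang` value for a language name (case-insensitive)."""
    lookup = {key.lower(): value for key, value in LANG_CODES.items()}
    try:
        return lookup[name.strip().lower()]
    except KeyError:
        raise ValueError(f"No abbreviation listed for {name!r}") from None


print(lang_code("German"))
```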


<a name="22-use-by-code"></a>

### 2.2 Use by Code

<a name="221-chinese--english-model-and-multilingual-model"></a>

#### 2.2.1 Chinese & English Model and Multilingual Model

* Detection, angle classification and recognition:

```python
from paddleocr import PaddleOCR, draw_ocr

# PaddleOCR supports many languages, including Chinese, English, French, German, Korean and Japanese.
# Set the `lang` parameter to `ch`, `en`, `fr`, `german`, `korean` or `japan`, respectively,
# to switch the language model.
ocr = PaddleOCR(use_angle_cls=True, lang='en')  # only needs to run once to download the model and load it into memory
img_path = './imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)


# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

The output is a list; each item contains the bounding box, the text and the recognition confidence

```bash
[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]
......
```
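A common follow-up step is to turn such a result into plain text. The sketch below drops low-confidence lines and orders the rest from top to bottom. It assumes the flat `[box, (text, score)]` structure shown above; if your installed version wraps the lines in an outer per-image list, pass the inner list instead. `to_text` is a hypothetical helper, and the threshold is an arbitrary choice for illustration.

```python
def to_text(result, min_score=0.5):
    """Join recognized lines in reading order, skipping weak results."""
    kept = []
    for box, (text, score) in result:
        if score >= min_score:
            top = min(y for _, y in box)
            left = min(x for x, _ in box)
            kept.append((top, left, text))
    kept.sort()  # top-to-bottom, then left-to-right
    return "\n".join(text for _, _, text in kept)


# The three lines from the output above, deliberately out of order.
sample = [
    [[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]],
     ('contributors who have been involved in the', 0.9791957139968872)],
    [[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]],
     ('ACKNOWLEDGEMENTS', 0.9971134662628174)],
    [[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]],
     ('We would like to thank all the designers and', 0.9761400818824768)],
]
print(to_text(sample))
```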

Visualization of results

<div align="center">
    <img src="../imgs_results/whl/12_det_rec.jpg" width="800">
</div>

<a name="3-summary"></a>

## 3. Summary

In this section, you have learned how to use the PaddleOCR whl package.

PaddleOCR is a rich and practical OCR tool library that covers the whole process of data production, model training, compression, inference and deployment. Refer to the [tutorials](../../README.md#tutorials) to start your journey with PaddleOCR.