English | [简体中文](README_ch.md)

<p align="center">
 <img src="./doc/PaddleOCR_log.png" align="middle" width = "600"/>
</p>


------------------------------------------------------------------------------------------

<p align="left">
    <a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleOCR?color=ffa"></a>
    <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
    <a href=""><img src="https://img.shields.io/pypi/format/PaddleOCR?color=c77"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/graphs/contributors"><img src="https://img.shields.io/github/contributors/PaddlePaddle/PaddleOCR?color=9ea"></a>
    <a href="https://pypi.org/project/PaddleOCR/"><img src="https://img.shields.io/pypi/dm/PaddleOCR?color=9cf"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleOCR?color=ccf"></a>
</p>

## Introduction

PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them in practice.

## Notice
PaddleOCR supports both the dynamic graph and static graph programming paradigms:
- Dynamic graph: dygraph branch (default), **supported by paddle 2.0.0 ([installation](./doc/doc_en/installation_en.md))**
- Static graph: develop branch

**Recent updates**

- The PaddleOCR R&D team will present the released tools to developers at 20:15 on August 4th: [Live Address](https://live.bilibili.com/21689802).
- 2021.8.3 Released PaddleOCR v2.2, adding a new structured document analysis toolkit, [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README.md), which supports layout analysis and table recognition (one-click export of chart images to Excel files).
- 2021.4.8 Released the end-to-end text recognition algorithm [PGNet](https://www.aaai.org/AAAI21Papers/AAAI-2885.WangP.pdf), published at AAAI 2021 (tutorial [here](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/pgnet_en.md)); released multilingual recognition [models](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/multi_languages_en.md) supporting more than 80 languages; in particular, the performance of the [English recognition model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/models_list_en.md#English) has been optimized.

- 2021.1.21 Updated more than 25 multilingual recognition models ([models list](./doc/doc_en/models_list_en.md)), including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, and Arabic. Models for more languages will continue to be added ([Develop Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048)).
- 2020.12.15 Released a data synthesis tool, [Style-Text](./StyleText/README.md), which makes it easy to synthesize a large number of images similar to the target scene image.
- 2020.11.25 Released a new data annotation tool, [PPOCRLabel](./PPOCRLabel/README.md), which improves labeling efficiency. The labeling results can be used directly to train the PP-OCR system.
- 2020.9.22 Updated the PP-OCR technical article: https://arxiv.org/abs/2009.09941
- [more](./doc/doc_en/update_en.md)

## Features
- PPOCR series of high-quality pre-trained models, with results comparable to commercial products
    - Ultra lightweight ppocr_mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
    - General ppocr_server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
    - Support Chinese, English, and digit recognition, vertical text recognition, and long text recognition
    - Support multi-language recognition: Korean, Japanese, German, French
- Rich toolkits related to OCR
    - Semi-automatic data annotation tool, i.e., PPOCRLabel: support fast and efficient data annotation
    - Data synthesis tool, i.e., Style-Text: easy to synthesize a large number of images which are similar to the target scene image
- Support user-defined training and provide rich inference and deployment solutions
- Support pip installation, easy to use
- Support Linux, Windows, macOS and other systems

## Visualization

<div align="center">
    <img src="doc/imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
    <img src="doc/imgs_results/multi_lang/img_01.jpg" width="800">
    <img src="doc/imgs_results/multi_lang/img_02.jpg" width="800">
</div>

The pictures above are visualizations of the general ppocr_server model. For more results, see [More visualizations](./doc/doc_en/visualization_en.md).

<a name="Community"></a>
## Community
- Scan the QR code below with WeChat to join the official technical exchange group. We look forward to your participation.

<div align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/dygraph/doc/joinus.PNG"  width = "200" height = "200" />
</div>


## Quick Experience

You can quickly try the ultra-lightweight OCR model: [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr)

Mobile demo (based on EasyEdge and Paddle-Lite, supports iOS and Android): [Sign in to the website to obtain the QR code for installing the App](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)

You can also scan the QR code below to install the App (**Android only**):

<div align="center">
<img src="./doc/ocr-android-easyedge.png"  width = "200" height = "200" />
</div>

- [**OCR Quick Start**](./doc/doc_en/quickstart_en.md)

<a name="Supported-Chinese-model-list"></a>


## PP-OCR 2.0 series model list (updated on Dec 15)
**Note**: Compared with [models 1.1](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md), which were trained with the static graph programming paradigm, the 2.0 models are trained with the dynamic graph and achieve similar performance.

| Model introduction                                           | Model name                   | Recommended scene | Detection model                                              | Direction classifier                                         | Recognition model                                            |
| ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| Chinese and English ultra-lightweight OCR model (9.4M)       | ch_ppocr_mobile_v2.0_xx      | Mobile & server   |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar)      |
| Chinese and English general OCR model (143.4M)               | ch_ppocr_server_v2.0_xx      | Server            |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar)  |


For more model downloads (including multiple languages), please refer to [PP-OCR v2.0 series model downloads](./doc/doc_en/models_list_en.md).
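
The inference-model links in the table above follow a regular naming pattern. As a minimal sketch, the URLs could be composed as shown below. The helper name `inference_model_url` is hypothetical and not part of PaddleOCR, and the pattern is inferred only from the two table rows, so it may not hold for other models.

```python
# Illustrative helper, not part of PaddleOCR: composes inference-model
# download URLs that follow the naming pattern visible in the table above.
BASE_URL = "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch"


def inference_model_url(series: str, component: str) -> str:
    """Return the inference-model URL for a series ("mobile" or "server")
    and a component ("det", "cls" or "rec")."""
    if series not in ("mobile", "server"):
        raise ValueError(f"unknown series: {series!r}")
    if component not in ("det", "cls", "rec"):
        raise ValueError(f"unknown component: {component!r}")
    # The table lists the mobile direction classifier for both series.
    if component == "cls":
        series = "mobile"
    return f"{BASE_URL}/ch_ppocr_{series}_v2.0_{component}_infer.tar"


if __name__ == "__main__":
    for part in ("det", "cls", "rec"):
        print(inference_model_url("server", part))
```

The pre-trained archives are deliberately left out, because their suffixes differ between components in the table (`_train.tar` for detection and classification, `_pre.tar` for recognition).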

To request a new language, see [Guideline for new language requests](#language_requests).

## Tutorials
- [Environment Preparation](./doc/doc_en/environment_en.md)
- [Quick Start](./doc/doc_en/quickstart_en.md)
- [PaddleOCR Overview and Installation](./doc/doc_en/paddleOCR_overview_en.md)
- PP-OCR Industrial Application: from Training to Deployment
    - [PP-OCR Model and Configuration](./doc/doc_en/models_and_config_en.md)
        - [PP-OCR Model Download](./doc/doc_en/models_list_en.md)
        - [Yml Configuration](./doc/doc_en/config_en.md)
        - [Python Inference](./doc/doc_en/inference_en.md)
    - [PP-OCR Training](./doc/doc_en/training_en.md)
        - [Text Detection](./doc/doc_en/detection_en.md)
        - [Text Recognition](./doc/doc_en/recognition_en.md)
        - [Direction Classification](./doc/doc_en/angle_class_en.md)
    - Inference and Deployment
        - [Python Inference](./doc/doc_en/inference_en.md)
        - [C++ Inference](./deploy/cpp_infer/readme_en.md)
        - [Serving](./deploy/pdserving/README.md)
        - [Mobile](./deploy/lite/readme_en.md)
        - [Benchmark](./doc/doc_en/benchmark_en.md)  
- [PP-Structure: Information Extraction](./ppstructure/README.md)
    - [Layout Parser](./ppstructure/layout/README.md)
    - [Table Recognition](./ppstructure/table/README.md)
- Academia
    - [Two-stage Algorithm](./doc/doc_en/algorithm_overview_en.md)
    - [PGNet Algorithm](./doc/doc_en/algorithm_overview_en.md)
- Data Annotation and Synthesis
    - [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md)
    - [Data Synthesis Tool: Style-Text](./StyleText/README.md)
    - [Other Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
    - [Other Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
- Datasets
    - [General OCR Datasets (Chinese/English)](./doc/doc_en/datasets_en.md)
    - [Handwritten OCR Datasets (Chinese)](./doc/doc_en/handwritten_datasets_en.md)
    - [Various OCR Datasets (multilingual)](./doc/doc_en/vertical_and_multilingual_datasets_en.md)
- [Visualization](#Visualization)
- [New language requests](#language_requests)
- [FAQ](./doc/doc_en/FAQ_en.md)
- [Community](#Community)
- [References](./doc/doc_en/reference_en.md)
- [License](#LICENSE)
- [Contribution](#CONTRIBUTION)



<a name="PP-OCR-Pipeline"></a>

## PP-OCR Pipeline

<div align="center">
    <img src="./doc/ppocr_framework.png" width="800">
</div>

PP-OCR is a practical ultra-lightweight OCR system. It consists of three main parts: DB text detection [2], detection box rectification, and CRNN text recognition [7]. The system adopts 19 effective strategies from 8 aspects, including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-trained model use, and automatic model pruning and quantization, to optimize and slim down the models of each module. The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English and digit OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941). In addition, the implementation of the FPGM Pruner [8] and PACT quantization [9] is based on [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim).
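
The three-part structure described above can be sketched as a small pipeline. This is only an illustration of the data flow: the `OcrPipeline` class, the toy "image" (a list of text rows), and the three stand-in functions are assumptions made for this example and are not PaddleOCR APIs or models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1): simplified axis-aligned text box
Image = List[str]                # toy "image": a list of text rows, for illustration only


@dataclass
class OcrPipeline:
    detect: Callable[[Image], Sequence[Box]]         # stage 1: find text boxes
    rectify: Callable[[Image], Image]                # stage 2: correct each cropped box
    recognize: Callable[[Image], Tuple[str, float]]  # stage 3: read text, return (text, score)

    def run(self, image: Image) -> List[Tuple[Box, str, float]]:
        results = []
        for box in self.detect(image):
            x0, y0, x1, y1 = box
            crop = [row[x0:x1] for row in image[y0:y1]]
            text, score = self.recognize(self.rectify(crop))
            results.append((box, text, score))
        return results


# Hypothetical stand-ins for the real detection, rectification and recognition models.
def toy_detect(image: Image) -> List[Box]:
    # One box per non-empty row.
    return [(0, y, len(row), y + 1) for y, row in enumerate(image) if row.strip()]


def toy_rectify(crop: Image) -> Image:
    # Trimming whitespace stands in for geometric correction of the crop.
    return [row.strip() for row in crop]


def toy_recognize(crop: Image) -> Tuple[str, float]:
    # The toy crop already is text, so "recognition" just joins its rows.
    return " ".join(crop), 1.0


if __name__ == "__main__":
    pipeline = OcrPipeline(toy_detect, toy_rectify, toy_recognize)
    for box, text, score in pipeline.run(["PaddleOCR", "", "  hello  "]):
        print(box, text, score)
```

Keeping the stages as separate callables mirrors the description above, where each module is optimized and slimmed down independently.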


## Visualization [more](./doc/doc_en/visualization_en.md)
- Chinese OCR model
<div align="center">
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00015504.jpg" width="800">
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/rotate_00052204.jpg" width="800">
</div>

- English OCR model
<div align="center">
    <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
</div>

- Multilingual OCR model
<div align="center">
    <img src="./doc/imgs_results/french_0.jpg" width="800">
    <img src="./doc/imgs_results/korean.jpg" width="800">
</div>


<a name="language_requests"></a>
## Guideline for new language requests

To request support for a new language, submit a PR with the following two files:

1. In the folder [ppocr/utils/dict](./ppocr/utils/dict), submit a dictionary file named `{language}_dict.txt` that contains a list of all characters. See the other files in that folder for the format.

2. In the folder [ppocr/utils/corpus](./ppocr/utils/corpus), submit a corpus file named `{language}_corpus.txt` that contains a list of words in your language. At least 50,000 words per language are probably needed; the more, the better.

If your language has unique elements, please let us know in advance by any means, such as useful links or Wikipedia pages.

For more details, please refer to the [Multilingual OCR Development Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048).
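
As a rough sketch of preparing the two files, the script below derives a character dictionary from a word list and writes both files. The function names are hypothetical, and the one-entry-per-line layout is an assumption; check the existing files in the two folders for the authoritative format.

```python
from pathlib import Path
from typing import Iterable, List

MIN_WORDS = 50000  # minimum corpus size suggested in the guideline above


def build_dict(words: Iterable[str]) -> List[str]:
    """Collect every distinct non-whitespace character, sorted for stable output."""
    return sorted({ch for word in words for ch in word if not ch.isspace()})


def write_language_files(language: str, words: List[str],
                         dict_dir: Path, corpus_dir: Path) -> None:
    """Write `{language}_dict.txt` and `{language}_corpus.txt`.

    Assumed layout: one character (or one word) per line, UTF-8 encoded.
    """
    if len(words) < MIN_WORDS:
        print(f"warning: only {len(words)} words; at least {MIN_WORDS} are suggested")
    dict_dir.mkdir(parents=True, exist_ok=True)
    corpus_dir.mkdir(parents=True, exist_ok=True)
    dict_text = "\n".join(build_dict(words)) + "\n"
    corpus_text = "\n".join(words) + "\n"
    (dict_dir / f"{language}_dict.txt").write_text(dict_text, encoding="utf-8")
    (corpus_dir / f"{language}_corpus.txt").write_text(corpus_text, encoding="utf-8")


if __name__ == "__main__":
    import tempfile

    # Write into a temporary directory so the demo does not touch a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_language_files("demo", ["bonjour", "monde"],
                             root / "dict", root / "corpus")
        print((root / "dict" / "demo_dict.txt").read_text(encoding="utf-8"))
```

In a real PR, the two target directories would be `ppocr/utils/dict` and `ppocr/utils/corpus`, as listed in the steps above.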


<a name="LICENSE"></a>
## License
This project is released under the <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>.

<a name="CONTRIBUTION"></a>
## Contribution
We welcome all contributions to PaddleOCR and greatly appreciate your feedback.

- Many thanks to [Khanh Tran](https://github.com/xxxpsyduck) and [Karl Horky](https://github.com/karlhorky) for contributing and revising the English documentation.
- Many thanks to [zhangxin](https://github.com/ZhangXinNan) for contributing the new visualization function, adding .gitignore, and removing the need to set PYTHONPATH manually.
- Many thanks to [lyl120117](https://github.com/lyl120117) for contributing the code for printing the network structure.
- Thanks [xiangyubo](https://github.com/xiangyubo) for contributing the handwritten Chinese OCR datasets.
- Thanks [authorfu](https://github.com/authorfu) for contributing the Android demo and [xiadeye](https://github.com/xiadeye) for contributing the iOS demo.
- Thanks [BeyondYourself](https://github.com/BeyondYourself) for contributing many great suggestions and simplifying part of the code style.
- Thanks [tangmq](https://gitee.com/tangmq) for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable Restful API services.
- Thanks [lijinhan](https://github.com/lijinhan) for contributing a new way, based on Java Spring Boot, to send requests to the Hubserving deployment.
- Thanks [Mejans](https://github.com/Mejans) for contributing the Occitan corpus and character set.
- Thanks [LKKlein](https://github.com/LKKlein) for contributing a new deployment package written in Golang.
- Thanks [Evezerest](https://github.com/Evezerest), [ninetailskim](https://github.com/ninetailskim), [edencfc](https://github.com/edencfc), [BeyondYourself](https://github.com/BeyondYourself) and [1084667371](https://github.com/1084667371) for contributing a new data annotation tool, i.e., PPOCRLabel.