## Introduction

PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps users train better models and put them into production.

## Features
- Ultra-lightweight Chinese OCR; the total model size is only 8.6M
    - A single model supports recognition of mixed Chinese, English, and digits, as well as vertical text and long text
    - Detection model DB (4.1M) + recognition model CRNN (4.5M)
- Multiple text detection training algorithms: EAST, DB
- Multiple text recognition training algorithms: Rosetta, CRNN, STAR-Net, RARE

## **Try the Ultra-lightweight Chinese OCR Model**

![](./doc/imgs_draw/11.jpg)

The image above shows sample output from the ultra-lightweight Chinese OCR model. For more examples, see [Results](#效果展示) at the end of this document.

#### 1. Environment Setup

First, set up the PaddleOCR runtime environment by following [Quick Installation](./doc/installation.md).

#### 2. Model Download

```
# Create the directories where the models are stored
mkdir inference && cd inference && mkdir det && mkdir rec
# Download the inference model package
wget -P ./inference https://paddleocr.bj.bcebos.com/inference.tar
# Extract the inference model package
tar -xf ./inference/inference.tar
```
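
Where `wget` or `tar` is unavailable, the download and extraction steps can be approximated with the Python standard library. This is a minimal sketch and not part of PaddleOCR: the helper names `download_archive` and `extract_archive` are hypothetical, and where the files end up depends on the archive's internal layout, which is not described here.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the download command above.
INFERENCE_URL = "https://paddleocr.bj.bcebos.com/inference.tar"


def download_archive(url: str, dest_dir: str) -> Path:
    """Download `url` into `dest_dir` (hypothetical helper, similar to `wget -P`)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(target))
    return target


def extract_archive(archive: Path, dest_dir: str = ".") -> list:
    """Extract a tar archive into `dest_dir` (similar to `tar -xf`); return member names."""
    # Use the safer "data" extraction filter when this Python version provides it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive) as tar:
        names = tar.getnames()
        tar.extractall(path=dest_dir, **kwargs)
    return names
```

Calling `extract_archive(download_archive(INFERENCE_URL, "./inference"))` would then roughly correspond to the `wget` and `tar` commands above. Only archives from a trusted source should be extracted this way.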

#### 3. Predicting a Single Image or a Set of Images

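The following commands run text detection and recognition in series. When running prediction, use the `image_dir` parameter to specify the path to a single image or a set of images, `det_model_dir` to specify the path to the detection inference model, and `rec_model_dir` to specify the path to the recognition inference model.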

```
# Set the PYTHONPATH environment variable (run from the repository root)
export PYTHONPATH=.

# Run prediction on the single image specified by image_dir
python tools/infer/predict_system.py --image_dir="/Demo.jpg" --det_model_dir="./inference/det/"  --rec_model_dir="./inference/rec/"

# Run prediction on the set of images specified by image_dir
python tools/infer/predict_system.py --image_dir="/test_imgs/" --det_model_dir="./inference/det/"  --rec_model_dir="./inference/rec/"
```
For more ways to run chained text detection and recognition inference, see [Inference with the Prediction Engine](./doc/inference.md) in the tutorials.
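
The two prediction commands differ only in the value of `image_dir`, so they can be wrapped in a small Python helper when prediction needs to be scripted. This is a sketch under stated assumptions: the script path and the three flags come from the commands above, while the function names `build_predict_command` and `run_prediction` are hypothetical and not part of PaddleOCR.

```python
import os
import subprocess
import sys


def build_predict_command(image_dir,
                          det_model_dir="./inference/det/",
                          rec_model_dir="./inference/rec/"):
    """Return the argument list for tools/infer/predict_system.py."""
    return [
        sys.executable,
        "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        f"--rec_model_dir={rec_model_dir}",
    ]


def run_prediction(image_dir, **model_dirs):
    """Run the script with PYTHONPATH=. (expects the repository root as the working directory)."""
    env = dict(os.environ, PYTHONPATH=".")
    command = build_predict_command(image_dir, **model_dirs)
    return subprocess.run(command, env=env, check=True)
```

For example, `run_prediction("/test_imgs/")` would launch the same prediction as the second command, provided it is called from the repository root with the models already in place.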

## Tutorials
- [Quick Installation](./doc/installation.md)
- [Text Detection Model Training/Evaluation/Prediction](./doc/detection.md)
- [Text Recognition Model Training/Evaluation/Prediction](./doc/recognition.md)
- [Inference with the Prediction Engine](./doc/inference.md)

## Text Detection Algorithms

Text detection algorithms open-sourced in PaddleOCR:
- [x]  [EAST](https://arxiv.org/abs/1704.03155)
- [x]  [DB](https://arxiv.org/abs/1911.08947)
- [ ]  [SAST](https://arxiv.org/abs/1908.05498) (developed by Baidu, coming soon)

On the ICDAR2015 public text detection dataset, the algorithms perform as follows:

|Model|Backbone|Hmean|
|-|-|-|
|[EAST](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)|ResNet50_vd|85.85%|
|[EAST](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)|MobileNetV3|79.08%|
|[DB](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)|ResNet50_vd|83.30%|
|[DB](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)|MobileNetV3|73.00%|
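
Under the usual ICDAR evaluation convention, Hmean is the harmonic mean of detection precision and recall (the F1 score). The sketch below assumes that definition. The function name `hmean` and the sample values are illustrative only and are not taken from PaddleOCR or from the table.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2 * P * R / (P + R)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values only, not results from the table.
print(f"{hmean(0.90, 0.80):.4f}")  # prints 0.8471
```

Because the harmonic mean is pulled toward the smaller of the two values, a detector cannot reach a high Hmean by trading away recall for precision or the reverse.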

To train and use the PaddleOCR text detection algorithms, see [Text Detection Model Training/Evaluation/Prediction](./doc/detection.md) in the tutorials.

## Text Recognition Algorithms

Text recognition algorithms open-sourced in PaddleOCR:
- [x]  [CRNN](https://arxiv.org/abs/1507.05717)
- [x]  [Rosetta](https://arxiv.org/abs/1910.05085)
- [x]  [STAR-Net](http://www.bmva.org/bmvc/2016/papers/paper043/index.html)
- [x]  [RARE](https://arxiv.org/abs/1603.03915v1)
- [ ]  [SRN](https://arxiv.org/abs/2003.12294) (developed by Baidu, coming soon)

Following the text recognition training and evaluation protocol of [DTRB](https://arxiv.org/abs/1904.01906), the models are trained on two synthetic text recognition datasets, MJSynth and SynthText, and evaluated on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. The results are as follows:

|Model|Backbone|ACC|
|-|-|-|
|[Rosetta](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_none_ctc.tar)|Resnet34_vd|80.24%|
|[Rosetta](https://paddleocr.bj.bcebos.com/rec_mv3_none_none_ctc.tar)|MobileNetV3|78.16%|
|[CRNN](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_bilstm_ctc.tar)|Resnet34_vd|82.20%|
|[CRNN](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)|MobileNetV3|79.37%|
|[STAR-Net](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)|Resnet34_vd|83.93%|
|[STAR-Net](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)|MobileNetV3|81.56%|
|[RARE](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|Resnet34_vd|84.90%|
|[RARE](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|MobileNetV3|83.32%|
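
ACC in the table is a recognition accuracy. As a rough sketch, assume a prediction counts as correct only when it matches the ground-truth word exactly after case-insensitive, alphanumeric-only normalization. That convention is common for these benchmarks, but the exact normalization and the averaging across datasets behind the numbers above are not stated here. The helper names `normalize` and `word_accuracy` are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and keep only ASCII letters and digits (assumed normalization)."""
    return re.sub(r"[^0-9a-z]", "", text.lower())


def word_accuracy(predictions, labels) -> float:
    """Fraction of predictions that exactly match their label after normalization."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(normalize(p) == normalize(t) for p, t in zip(predictions, labels))
    return correct / len(labels)
```

With this definition a single wrong character makes the whole word count as an error, which is why the metric is stricter than a per-character score.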

To train and use the PaddleOCR text recognition algorithms, see [Text Recognition Model Training/Evaluation/Prediction](./doc/recognition.md) in the tutorials.

## End-to-End OCR Algorithms
- [ ]  [End2End-PSL](https://arxiv.org/abs/1909.07808) (developed by Baidu, coming soon)

<a name="效果展示"></a>
## Results
![](./doc/imgs_draw/1.jpg)
![](./doc/imgs_draw/4.jpg)
![](./doc/imgs_draw/6.jpg)
![](./doc/imgs_draw/7.jpg)
![](./doc/imgs_draw/9.jpg)
![](./doc/imgs_draw/12.jpg)
![](./doc/imgs_draw/16.jpg)
![](./doc/imgs_draw/22.jpg)


## References
```
1. EAST:
@inproceedings{zhou2017east,
  title={EAST: an efficient and accurate scene text detector},
  author={Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun},
  booktitle={Proceedings of the IEEE conference on Computer Vision and Pattern Recognition},
  pages={5551--5560},
  year={2017}
}

2. DB:
@article{liao2019real,
  title={Real-time Scene Text Detection with Differentiable Binarization},
  author={Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang},
  journal={arXiv preprint arXiv:1911.08947},
  year={2019}
}

3. DTRB:
@inproceedings{baek2019wrong,
  title={What is wrong with scene text recognition model comparisons? dataset and model analysis},
  author={Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={4715--4723},
  year={2019}
}

4. SAST:
@inproceedings{wang2019single,
  title={A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning},
  author={Wang, Pengfei and Zhang, Chengquan and Qi, Fei and Huang, Zuming and En, Mengyi and Han, Junyu and Liu, Jingtuo and Ding, Errui and Shi, Guangming},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia},
  pages={1277--1285},
  year={2019}
}

5. SRN:
@article{yu2020towards,
  title={Towards Accurate Scene Text Recognition with Semantic Reasoning Networks},
  author={Yu, Deli and Li, Xuan and Zhang, Chengquan and Han, Junyu and Liu, Jingtuo and Ding, Errui},
  journal={arXiv preprint arXiv:2003.12294},
  year={2020}
}

6. End2End-PSL:
@inproceedings{sun2019chinese,
  title={Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning},
  author={Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={9086--9095},
  year={2019}
}
```

## License
This project is released under the <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>.

## Version Updates

## Contributing
We warmly welcome code contributions to PaddleOCR and greatly appreciate your feedback.