## Introduction
PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps users train better models and deploy them in real applications.

**Recent updates**
- 2020.5.30: Model inference and training now support Windows, and the display of recognition results has been improved.
- 2020.5.30: Released the open-source general-purpose Chinese OCR model.
- 2020.5.30: Launched an online demo of the ultra-lightweight Chinese OCR model.

## Features
- Ultra-lightweight Chinese OCR with a total model size of only 8.6M
    - A single model recognizes mixed Chinese, English, and digit text, vertical text, and long text
    - Detection model DB (4.1M) + recognition model CRNN (4.5M)
- Multiple text detection training algorithms: EAST, DB
- Multiple text recognition training algorithms: Rosetta, CRNN, STAR-Net, RARE

### Supported Chinese models:

|Model name|Description|Detection model|Recognition model|
|-|-|-|-|
|chinese_db_crnn_mobile|Ultra-lightweight Chinese OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) & [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) & [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|
|chinese_db_crnn_server|General-purpose Chinese OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) & [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) & [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|

Online demo of the ultra-lightweight Chinese OCR model: https://www.paddlepaddle.org.cn/hub/scene/ocr

**You can also follow the tutorial below to quickly try the ultra-lightweight and general-purpose Chinese OCR models.**

## **Trying the ultra-lightweight and general-purpose Chinese OCR models**

![](doc/imgs_results/11.jpg)

The image above shows results from the ultra-lightweight Chinese OCR model. For more examples, see [Results](#效果展示) at the end of this document.

#### 1. Environment setup

First, set up the PaddleOCR runtime environment by following [Quick installation](./doc/installation.md).

#### 2. Download the inference models

#### (1) Download the ultra-lightweight Chinese OCR models
```
mkdir -p inference && cd inference
# Download and extract the detection model of the ultra-lightweight Chinese OCR model
wget https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar && tar xf ch_det_mv3_db_infer.tar
# Download and extract the recognition model of the ultra-lightweight Chinese OCR model
wget https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar && tar xf ch_rec_mv3_crnn_infer.tar
cd ..
```
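#### (2) Download the general-purpose Chinese OCR models
```
# -p keeps mkdir from failing if the inference directory already exists
mkdir -p inference && cd inference
# Download and extract the detection model of the general-purpose Chinese OCR model
wget https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar && tar xf ch_det_r50_vd_db_infer.tar
# Download and extract the recognition model of the general-purpose Chinese OCR model
wget https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar && tar xf ch_rec_r34_vd_crnn_infer.tar
cd ..
```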
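
The download-and-extract steps above can also be scripted in Python, which may help on systems where wget and tar are not available. The following is a minimal sketch using only the standard library. The names `MODELS`, `model_url`, `extract`, and `fetch_and_extract` are illustrative helpers, not part of PaddleOCR, and the URL pattern is simply the one used in the commands above.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL taken from the download commands above.
BASE_URL = "https://paddleocr.bj.bcebos.com/ch_models"

# Inference model archives listed in this README (without the .tar suffix).
MODELS = {
    "mobile": ("ch_det_mv3_db_infer", "ch_rec_mv3_crnn_infer"),
    "server": ("ch_det_r50_vd_db_infer", "ch_rec_r34_vd_crnn_infer"),
}


def model_url(archive_name: str) -> str:
    """Build the download URL for one archive (hypothetical helper)."""
    return f"{BASE_URL}/{archive_name}.tar"


def extract(tar_path: Path, dest: Path) -> None:
    """Unpack a downloaded archive into dest, similar to `tar xf`."""
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest)


def fetch_and_extract(archive_name: str, dest: Path = Path("inference")) -> None:
    """Download one archive into dest and unpack it there."""
    dest.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    tar_path = dest / f"{archive_name}.tar"
    urllib.request.urlretrieve(model_url(archive_name), tar_path)
    extract(tar_path, dest)


# Example (requires network access):
#     for name in MODELS["mobile"]:
#         fetch_and_extract(name)
```

Calling `fetch_and_extract` for both entries of `MODELS["mobile"]` or `MODELS["server"]` should leave the same files under ./inference as the shell commands do.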

#### 3. Predict on a single image or a set of images

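The commands below run text detection and recognition in sequence. When running prediction, use the image_dir parameter to specify the path to a single image or a directory of images, the det_model_dir parameter to specify the path to the detection inference model, and the rec_model_dir parameter to specify the path to the recognition inference model. Visualized recognition results are saved to the ./inference_results folder by default.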

```
# Set the PYTHONPATH environment variable
export PYTHONPATH=.

# Predict on the single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/"

# Predict on the set of images specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/"

# To run prediction on the CPU, set the use_gpu parameter to False
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/" --use_gpu=False
```

To try the general-purpose Chinese OCR model, download the corresponding models as described above and update the related parameters, for example:
```
# Predict on the single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_r50_vd_db/" --rec_model_dir="./inference/ch_rec_r34_vd_crnn/"
```
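
When prediction needs to be launched from another Python program, the command lines above can be assembled programmatically. The sketch below is one possible wrapper. It uses only the flags shown in this README (image_dir, det_model_dir, rec_model_dir, use_gpu), and it assumes the script is started from the repository root. `build_predict_command` and `run_prediction` are hypothetical helper names, not PaddleOCR functions.

```python
import os
import subprocess
import sys
from typing import List


def build_predict_command(image_dir: str, det_model_dir: str,
                          rec_model_dir: str, use_gpu: bool = True) -> List[str]:
    """Assemble the argument list for tools/infer/predict_system.py."""
    cmd = [
        sys.executable, "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        f"--rec_model_dir={rec_model_dir}",
    ]
    if not use_gpu:
        # Only pass the flag when CPU prediction is requested.
        cmd.append("--use_gpu=False")
    return cmd


def run_prediction(cmd: List[str]) -> None:
    """Run the command with PYTHONPATH=. as in the shell example."""
    env = dict(os.environ, PYTHONPATH=".")
    subprocess.run(cmd, check=True, env=env)


# Example (run from the repository root, after downloading the models):
#     run_prediction(build_predict_command(
#         "./doc/imgs/11.jpg",
#         "./inference/ch_det_r50_vd_db/",
#         "./inference/ch_rec_r34_vd_crnn/",
#         use_gpu=False,
#     ))
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.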

For more ways to run text detection and recognition in sequence, see [Inference with the prediction engine](./doc/inference.md) in the documentation.

## Documentation
- [Quick installation](./doc/installation.md)
- [Text detection: training, evaluation, and prediction](./doc/detection.md)
- [Text recognition: training, evaluation, and prediction](./doc/recognition.md)
- [Inference with the prediction engine](./doc/inference.md)

## Text detection algorithms

Text detection algorithms open-sourced in PaddleOCR:
- [x] EAST ([paper](https://arxiv.org/abs/1704.03155))
- [x] DB ([paper](https://arxiv.org/abs/1911.08947))
- [ ] SAST ([paper](https://arxiv.org/abs/1908.05498)) (developed by Baidu, coming soon)

On the ICDAR2015 public text detection dataset, the algorithms perform as follows:

|Model|Backbone|Precision|Recall|Hmean|Download link|
|-|-|-|-|-|-|
|EAST|ResNet50_vd|88.18%|85.51%|86.82%|[download link](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)|
|EAST|MobileNetV3|81.67%|79.83%|80.74%|[download link](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)|
|DB|ResNet50_vd|83.79%|80.65%|82.19%|[download link](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)|
|DB|MobileNetV3|75.92%|73.18%|74.53%|[download link](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)|
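
The Hmean column can be reproduced from the precision and recall columns. The sketch below assumes Hmean is the usual harmonic mean of precision and recall (the F1 score), which is the common convention for ICDAR detection results. The function name `hmean` is illustrative.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported Hmean) taken from the table above.
ROWS = {
    "EAST ResNet50_vd": (88.18, 85.51, 86.82),
    "EAST MobileNetV3": (81.67, 79.83, 80.74),
    "DB ResNet50_vd": (83.79, 80.65, 82.19),
    "DB MobileNetV3": (75.92, 73.18, 74.53),
}

for name, (p, r, reported) in ROWS.items():
    print(f"{name}: computed {hmean(p, r):.2f}, reported {reported:.2f}")
```

The computed values agree with the table to within 0.01. Small differences are expected because the published precision and recall are themselves rounded.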

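* Note: Training and evaluating the DB models above requires the post-processing parameters box_thresh=0.6 and unclip_ratio=1.5. When training with a different dataset or a different model, these two parameters can be tuned for better results.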
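
As a rough illustration of what these two parameters control, the sketch below makes two assumptions that are not stated in this README: that unclip_ratio plays the role of r in the DB paper's offset formula D = A × r / L (A is the polygon area, L its perimeter), and that box_thresh is the minimum mean score a candidate box must reach to be kept. The function names are illustrative and not PaddleOCR's API, and the polygon offsetting step itself is omitted.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1]
        - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def polygon_perimeter(points: List[Point]) -> float:
    """Sum of the edge lengths of the polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def unclip_distance(points: List[Point], unclip_ratio: float = 1.5) -> float:
    """Offset distance D = A * r / L used to expand a shrunk text region."""
    return polygon_area(points) * unclip_ratio / polygon_perimeter(points)


def keep_box(mean_score: float, box_thresh: float = 0.6) -> bool:
    """Assumed filter: keep a box only if its mean score reaches box_thresh."""
    return mean_score >= box_thresh


# A 100 x 20 text region: area 2000, perimeter 240.
rect = [(0.0, 0.0), (100.0, 0.0), (100.0, 20.0), (0.0, 20.0)]
print(unclip_distance(rect))        # offset in pixels for unclip_ratio=1.5
print(keep_box(0.72), keep_box(0.41))
```

Under these assumptions, a larger unclip_ratio grows each detected box further outward, while a larger box_thresh discards more low-confidence boxes.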

For training and using the PaddleOCR text detection algorithms, see [Text detection: training, evaluation, and prediction](./doc/detection.md) in the documentation.

## Text recognition algorithms

Text recognition algorithms open-sourced in PaddleOCR:
- [x] CRNN ([paper](https://arxiv.org/abs/1507.05717))
- [x] Rosetta ([paper](https://arxiv.org/abs/1910.05085))
- [x] STAR-Net ([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
- [x] RARE ([paper](https://arxiv.org/abs/1603.03915v1))
- [ ] SRN ([paper](https://arxiv.org/abs/2003.12294)) (developed by Baidu, coming soon)

Following the [DTRB](https://arxiv.org/abs/1904.01906) training and evaluation protocol for text recognition, the models are trained on the MJSynth and SynthText datasets and evaluated on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. The results are as follows:

|Model|Backbone|Avg accuracy|Model storage name|Download link|
|-|-|-|-|-|
|Rosetta|ResNet34_vd|80.24%|rec_r34_vd_none_none_ctc|[download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_none_ctc.tar)|
|Rosetta|MobileNetV3|78.16%|rec_mv3_none_none_ctc|[download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_none_ctc.tar)|
|CRNN|ResNet34_vd|82.20%|rec_r34_vd_none_bilstm_ctc|[download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_bilstm_ctc.tar)|
|CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)|
|STAR-Net|ResNet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)|
|STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)|
|RARE|ResNet34_vd|84.90%|rec_r34_vd_tps_bilstm_attn|[download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|
|RARE|MobileNetV3|83.32%|rec_mv3_tps_bilstm_attn|[download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|
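
The model storage names in the table appear to follow the pattern rec_<backbone>_<transformation>_<sequence modeling>_<prediction>, which matches the four-stage view of recognition models used in DTRB. This reading is inferred from the table rather than stated in the README. Under that assumption, a name can be decoded as sketched below. `parse_model_name` is a hypothetical helper, and the lookup tables contain only the combinations that appear above.

```python
from typing import Dict

# (transformation, sequence modeling, prediction) -> algorithm, from the table.
ALGORITHMS = {
    ("none", "none", "ctc"): "Rosetta",
    ("none", "bilstm", "ctc"): "CRNN",
    ("tps", "bilstm", "ctc"): "STAR-Net",
    ("tps", "bilstm", "attn"): "RARE",
}

# Backbone abbreviations used in the storage names.
BACKBONES = {"r34_vd": "ResNet34_vd", "mv3": "MobileNetV3"}


def parse_model_name(name: str) -> Dict[str, str]:
    """Split a storage name such as rec_mv3_tps_bilstm_attn into its parts."""
    parts = name.split("_")
    if len(parts) < 5 or parts[0] != "rec":
        raise ValueError(f"unexpected model name: {name!r}")
    # The last three fields are fixed; the backbone may itself contain "_".
    transform, sequence, prediction = parts[-3:]
    backbone = "_".join(parts[1:-3])
    return {
        "backbone": BACKBONES.get(backbone, backbone),
        "transformation": transform,
        "sequence": sequence,
        "prediction": prediction,
        "algorithm": ALGORITHMS.get((transform, sequence, prediction), "unknown"),
    }


print(parse_model_name("rec_r34_vd_tps_bilstm_attn"))
```

Read this way, the four algorithms differ only in whether a TPS transformation and a BiLSTM sequence stage are present, and in whether the prediction stage uses CTC or attention.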

For training and using the PaddleOCR text recognition algorithms, see [Text recognition: training, evaluation, and prediction](./doc/recognition.md) in the documentation.

## End-to-end OCR algorithms
- [ ] [End2End-PSL](https://arxiv.org/abs/1909.07808) (developed by Baidu, coming soon)

<a name="效果展示"></a>
## Ultra-lightweight Chinese OCR results
![](doc/imgs_results/1.jpg)
![](doc/imgs_results/7.jpg)
![](doc/imgs_results/12.jpg)
![](doc/imgs_results/4.jpg)
![](doc/imgs_results/6.jpg)
![](doc/imgs_results/9.jpg)
![](doc/imgs_results/16.png)
![](doc/imgs_results/22.jpg)

## Updates
- 2020.5.30: Model inference and training now support Windows, and the display of recognition results has been improved.
- 2020.5.30: Released the open-source general-purpose Chinese OCR model.
- 2020.5.30: Launched an online demo of the ultra-lightweight Chinese OCR model.

## References
```
1. EAST:
@inproceedings{zhou2017east,
  title={EAST: an efficient and accurate scene text detector},
  author={Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun},
  booktitle={Proceedings of the IEEE conference on Computer Vision and Pattern Recognition},
  pages={5551--5560},
  year={2017}
}

2. DB:
@article{liao2019real,
  title={Real-time Scene Text Detection with Differentiable Binarization},
  author={Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang},
  journal={arXiv preprint arXiv:1911.08947},
  year={2019}
}

3. DTRB:
@inproceedings{baek2019wrong,
  title={What is wrong with scene text recognition model comparisons? dataset and model analysis},
  author={Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={4715--4723},
  year={2019}
}

4. SAST:
@inproceedings{wang2019single,
  title={A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning},
  author={Wang, Pengfei and Zhang, Chengquan and Qi, Fei and Huang, Zuming and En, Mengyi and Han, Junyu and Liu, Jingtuo and Ding, Errui and Shi, Guangming},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia},
  pages={1277--1285},
  year={2019}
}

5. SRN:
@article{yu2020towards,
  title={Towards Accurate Scene Text Recognition with Semantic Reasoning Networks},
  author={Yu, Deli and Li, Xuan and Zhang, Chengquan and Han, Junyu and Liu, Jingtuo and Ding, Errui},
  journal={arXiv preprint arXiv:2003.12294},
  year={2020}
}

6. End2End-PSL:
@inproceedings{sun2019chinese,
  title={Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning},
  author={Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={9086--9095},
  year={2019}
}
```

## License
This project is released under the <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>.

## Contributing
We welcome your code contributions to PaddleOCR and greatly appreciate your feedback.