# PaddleOCR
## Papers
PaddleOCR uses three models, det, rec, and cls, for text detection, text recognition, and text direction classification, respectively.

The det model is mainly based on the DB (Differentiable Binarization) algorithm. Reference paper:

https://arxiv.org/pdf/1911.08947.pdf

The rec model is mainly based on the SVTR algorithm. Reference paper:

https://arxiv.org/pdf/2205.00159.pdf

The cls model uses MobileNetV3 for general-purpose classification. Reference paper:

https://arxiv.org/pdf/1905.02244.pdf

## Model architecture
det:
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/det/dbnet-arc.png)

rec:
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/rec/SVTR-arc.png)

cls:
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/cls/mobilenetv3-arc.png)
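
## Algorithm
The pipeline runs as det -> cls -> rec -> text: the det model locates text regions, the cls model determines the direction of each region, and the rec model converts each region into text.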
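The det -> cls -> rec flow can be sketched in a few lines of Python. This is only an illustration of the control flow, not PaddleOCR's implementation: `run_pipeline`, `OcrResult`, the axis-aligned `Box` type, and the three callables are hypothetical names, and the sketch assumes the classifier only has to undo a 180-degree rotation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # hypothetical axis-aligned box: (x0, y0, x1, y1)


@dataclass
class OcrResult:
    box: Box
    text: str


def run_pipeline(
    image: Sequence[Sequence[int]],
    detect: Callable[[Sequence[Sequence[int]]], List[Box]],
    is_upside_down: Callable[[List[List[int]]], bool],
    recognize: Callable[[List[List[int]]], str],
) -> List[OcrResult]:
    """Run det -> cls -> rec over one image given as rows of pixel values."""
    results = []
    for x0, y0, x1, y1 in detect(image):  # det: find text regions
        crop = [list(row[x0:x1]) for row in image[y0:y1]]
        if is_upside_down(crop):  # cls: undo a 180-degree rotation
            crop = [row[::-1] for row in crop[::-1]]
        results.append(OcrResult((x0, y0, x1, y1), recognize(crop)))  # rec
    return results


# Stub models, for illustration only.
image = [[1, 2, 3, 4], [5, 6, 7, 8]]
out = run_pipeline(
    image,
    detect=lambda img: [(0, 0, 2, 2)],
    is_upside_down=lambda crop: True,
    recognize=lambda crop: str(crop),
)
print(out)
```

Passing the three stages as callables keeps the sketch independent of any particular model format, so the same loop applies whether the stages are Paddle models or ONNX models.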

## Dataset
The [ICDAR 2015](https://rrc.cvc.uab.es/?ch=4&com=downloads) dataset is recommended.

Directory layout of the detection training set:
```
/PaddleOCR/train_data/icdar2015/text_localization/
  └─ icdar_c4_train_imgs/         Training data of icdar dataset
  └─ ch4_test_images/             Testing data of icdar dataset
  └─ train_icdar2015_label.txt    Training annotation of icdar dataset
  └─ test_icdar2015_label.txt     Test annotation of icdar dataset
```
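Before training, it can help to check that the annotation files parse cleanly. The sketch below assumes each line of `train_icdar2015_label.txt` holds an image path, a tab, and a JSON list of objects with `transcription` and `points` keys, which is the layout described in the upstream PaddleOCR documentation. Verify this against your own file. `parse_det_label_line` is a hypothetical helper and the sample line is made up.

```python
import json
from typing import Dict, List, Tuple


def parse_det_label_line(line: str) -> Tuple[str, List[Dict]]:
    """Split one label line into (image path, list of annotations)."""
    image_path, annotation_json = line.rstrip("\n").split("\t", 1)
    annotations = json.loads(annotation_json)
    for ann in annotations:
        # Each annotation is expected to hold a text string and 4 corner points.
        if len(ann["points"]) != 4:
            raise ValueError(
                f"expected 4 points in {image_path}, got {len(ann['points'])}"
            )
    return image_path, annotations


# Made-up sample line, for illustration only.
sample = (
    'icdar_c4_train_imgs/img_1.jpg\t'
    '[{"transcription": "HELLO", "points": [[10, 10], [90, 10], [90, 40], [10, 40]]}]'
)
path, anns = parse_det_label_line(sample)
print(path, anns[0]["transcription"])
```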
Directory layout of the recognition training set:
```
|-train_data
  |-rec
    |- rec_gt_train.txt
    |- train
        |- word_001.png
        |- word_002.jpg
        |- word_003.jpg
        | ...
    |-ic15_data
        |- rec_gt_test.txt
        |- test
            |- word_001.jpg
            |- word_002.jpg
            |- word_003.jpg
            | ...
```
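The recognition label files (`rec_gt_train.txt`, `rec_gt_test.txt`) can be checked in the same spirit. This sketch assumes one sample per line in the form image path, tab, transcription, as in the upstream PaddleOCR documentation. `parse_rec_labels` is a hypothetical helper and the sample content is made up.

```python
from typing import List, Tuple


def parse_rec_labels(text: str) -> List[Tuple[str, str]]:
    """Parse recognition labels: one 'relative/path<TAB>transcription' per line."""
    pairs = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t", 1)
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected '<path>\\t<text>'")
        pairs.append((parts[0], parts[1]))
    return pairs


# Made-up sample content, for illustration only.
sample = "train/word_001.png\tJOINT\ntrain/word_002.jpg\tyourself\n"
pairs = parse_rec_labels(sample)
print(pairs)
```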

## Environment setup
Docker images for training and inference can be pulled from [光源 (SourceFind)](https://www.sourcefind.cn/#/service-details), and the Paddle installation package can be downloaded from the [光合 developer community](https://cancon.hpccube.com:65024/4/main/). The recommended image for PaddleOCR is:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.3.2-centos7.6-dtk-22.10.1-py37-latest
```

## Training

Detection model:
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
```

Recognition model:
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
```
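In the commands above, `-c` selects a YAML config file and `-o` overrides individual values in it using a dotted `Section.key=value` form. The snippet below is a simplified sketch of that override idea, not PaddleOCR's actual argument parser. `apply_overrides` is a hypothetical helper, the `epoch_num` entry is a made-up placeholder, and values are kept as plain strings.

```python
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply 'Section.key=value' overrides to a nested dict in place."""
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})  # create missing sections
        node[leaf] = value  # values stay strings in this sketch
    return config


# Placeholder config standing in for a loaded YAML file.
config = {"Global": {"epoch_num": 500, "pretrained_model": None}}
apply_overrides(
    config,
    ["Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained"],
)
print(config["Global"]["pretrained_model"])
```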

## Testing

Detection model:
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./output/db_mv3/best_accuracy.pdparams
```

Recognition model:
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./output/v3_en_mobile/best_accuracy.pdparams
```

## Testing (ONNX Runtime)

Detection model:
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx --use_onnx=true
```

Recognition model:
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml -o Global.pretrained_model=./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx --use_onnx=true
```

## Inference
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=false --rec_image_shape=3,48,320 --warmup=1
```
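The `--rec_image_shape=3,48,320` argument fixes the input size of the recognition model. As a rough sketch, assuming the value is ordered channels, height, width, and assuming each text crop is scaled to the target height with its aspect ratio preserved and then padded up to the target width, the resized width could be computed as follows. `parse_image_shape` and `resized_width` are hypothetical helpers, and the actual preprocessing in `predict_system.py` may differ.

```python
import math
from typing import Tuple


def parse_image_shape(value: str) -> Tuple[int, int, int]:
    """Parse 'C,H,W' (for example '3,48,320') into integers."""
    channels, height, width = (int(part) for part in value.split(","))
    return channels, height, width


def resized_width(src_w: int, src_h: int, shape: Tuple[int, int, int]) -> int:
    """Width after scaling a crop to the target height, capped at the target width."""
    _, target_h, target_w = shape
    return min(target_w, math.ceil(target_h * src_w / src_h))


shape = parse_image_shape("3,48,320")
print(resized_width(100, 32, shape))   # 150: fits, so the rest would be padding
print(resized_width(1000, 32, shape))  # 320: capped at the target width
```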

## Inference (ONNX Runtime)
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.onnx" --rec_model_dir="./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx" --use_onnx=true --use_angle_cls=true --rec_image_shape=3,48,320 --warmup=1
```

## Results
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/inference_results/08.jpg)
### Performance and accuracy

Detection model test:

| Model | Precision | Recall |
| :------: | :------: | :------: |
| det | 0.7054 | 0.7193 |

Recognition model test:

| Model | Accuracy |
| :------: | :------: |
| rec | 0.6490 |

Detection model test (ONNX Runtime):

| Model | Precision | Recall |
| :------: | :------: | :------: |
| det | 0.5097 | 0.4068 |

Recognition model test (ONNX Runtime):

| Model | Accuracy |
| :------: | :------: |
| rec | 0.6076 |
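
Detection results are often summarized with the harmonic mean of precision and recall (commonly called Hmean or F1). The tables above do not report it, so the sketch below derives it from the listed precision and recall values. `hmean` is a hypothetical helper, and the derived numbers are a convenience, not measured results.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean (F1) of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Detection precision/recall pairs copied from the tables above.
det_scores = {"det": (0.7054, 0.7193), "det (ONNX Runtime)": (0.5097, 0.4068)}
for name, (p, r) in det_scores.items():
    print(f"{name}: hmean = {hmean(p, r):.4f}")
```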

## Application scenarios
### Algorithm category
OCR
### Popular application industries
Industrial manufacturing, finance, transportation, education, and healthcare

## Source repository and issue reporting
https://developer.hpccube.com/codes/modelzoo/paddleocr

## References
* [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)