# PaddleOCR_paddle
## Paper
PaddleOCR uses three models (det, rec, and cls) for text detection, text recognition, and text-direction classification, respectively.

The det model is mainly based on the DB algorithm. Reference paper:

https://arxiv.org/pdf/1911.08947.pdf

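As a rough illustration of the central idea in the DB paper, differentiable binarization replaces a hard threshold with a steep sigmoid, B = 1 / (1 + exp(-k(P - T))), where P is the probability map, T is the learned threshold map, and k is an amplification factor (set to 50 in the paper). The sketch below is a pure-Python illustration with hypothetical function names; it is not the code used in this repository.

```python
import math


def differentiable_binarization(p, t, k=50.0):
    """Approximate binary value for one pixel: 1 / (1 + exp(-k * (p - t)))."""
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def binarize_map(prob_map, thresh_map, k=50.0):
    """Apply the formula element-wise to two equally sized 2D lists."""
    return [
        [differentiable_binarization(p, t, k) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Hypothetical 2x2 maps: pixels well above the threshold approach 1,
# pixels well below it approach 0.
approx_binary = binarize_map([[0.9, 0.1], [0.5, 0.8]], [[0.3, 0.3], [0.5, 0.3]])
```
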
The rec model is mainly based on the SVTR algorithm. Reference paper:

https://arxiv.org/pdf/2205.00159.pdf

The cls model uses MobileNetV3 for general-purpose classification. Reference paper:

https://arxiv.org/pdf/1905.02244.pdf

## Model Architecture
Detection (text localization) model:
<div align=center>
    <img src="./configs/det/dbnet-arc.png"/>
</div>

Recognition model:
<div align=center>
    <img src="./configs/rec/SVTR-arc.png"/>
</div>

Classification model:
<div align=center>
    <img src="./configs/cls/mobilenetv3-arc.png"/>
</div>

## Algorithm Principle

<div align=center>
    <img src="./configs/ocr.png"/>
</div>

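The overall flow (detect text regions, correct their orientation, then recognize the text) can be sketched as follows. This is only an illustration: every callable is a hypothetical stand-in for the det, cls, and rec models, and the `"180"` label and `cls_thresh=0.9` value are assumptions chosen for the example, not values read from this repository.

```python
def run_ocr(image, detect, crop, classify, rotate180, recognize, cls_thresh=0.9):
    """Chain the three stages: det -> cls -> rec. All callables are hypothetical."""
    results = []
    for box in detect(image):                 # det: locate text regions
        patch = crop(image, box)
        label, score = classify(patch)        # cls: is the patch upside down?
        if label == "180" and score > cls_thresh:
            patch = rotate180(patch)
        text, confidence = recognize(patch)   # rec: read the text
        results.append((box, text, confidence))
    return results


# Toy stand-ins that treat a string as the "image", just to exercise the flow.
image = "olleh world"
results = run_ocr(
    image,
    detect=lambda img: [(0, 5), (6, 11)],
    crop=lambda img, box: img[box[0]:box[1]],
    classify=lambda patch: ("180", 0.99) if patch == "olleh" else ("0", 0.99),
    rotate180=lambda patch: patch[::-1],
    recognize=lambda patch: (patch, 1.0),
)
```
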
## Environment Setup
Docker images for training and inference can be pulled from [SourceFind (光源)](https://sourcefind.cn/#/main-page), and the paddle and onnxruntime installation packages can be downloaded from the [Guanghe Developer Community (光合开发者社区)](https://cancon.hpccube.com:65024/4/main/). The recommended image for PaddleOCR is:

```bash
# Pull the recommended image
docker pull image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.5.2-ubuntu20.04-dtk24.04.1-py3.8
# Start the container; replace /mount_dir with the mount path you want inside the container
docker run -d -t --privileged --device=/dev/kfd --device=/dev/dri/ --shm-size 64g --network=host -v `pwd`:/mount_dir -v /opt/hyhal:/opt/hyhal:ro --group-add video --name paddleocr-test image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.5.2-ubuntu20.04-dtk24.04.1-py3.8

docker exec -it paddleocr-test bash

# Install dependencies inside the container
pip3 install -r requirements.txt
pip3 install onnxruntime-1.15.0+das1.1.git739f24d.abi1.dtk2404-cp38-cp38-manylinux_2_31_x86_64.whl
pip3 install numpy==1.23.4

# Download the pretrained models
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/pretrained/MobileNetV3_large_x0_5_pretrained.pdparams
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar
# The recognition training command expects the extracted directory ./pretrain_models/en_PP-OCRv3_rec_train/
tar -xf ./pretrain_models/en_PP-OCRv3_rec_train.tar -C ./pretrain_models/
```

## Dataset
The ICDAR 2015 dataset is recommended: [icdar2015](https://rrc.cvc.uab.es/?ch=4&com=downloads)

### Detection Model Dataset
Prepare the label data using either of the following two methods:
```bash
# Run the commands below from the repository root

# Method 1: download the label files
wget -P ./train_data/icdar2015/text_localization/  https://paddleocr.bj.bcebos.com/dataset/train_icdar2015_label.txt
wget -P ./train_data/icdar2015/text_localization/  https://paddleocr.bj.bcebos.com/dataset/test_icdar2015_label.txt
# Method 2: convert the label files downloaded from the official site into train_icdar2015_label.txt and test_icdar2015_label.txt
python ppocr/utils/gen_label.py --mode="det" --root_path="/path/to/ch4_training_images/"  \
                    --input_path="/path/to/ch4_training_localization_transcription_gt" \
                    --output_label="train_data/icdar2015/text_localization/train_icdar2015_label.txt"
python ppocr/utils/gen_label.py --mode="det" --root_path="/path/to/ch4_test_images/"  \
                    --input_path="/path/to/Challenge4_Test_Task1_GT" \
                    --output_label="train_data/icdar2015/text_localization/test_icdar2015_label.txt"
```
The prepared data directory should look like this:
```
|-train_data/icdar2015/text_localization/
  |- ch4_training_images/         Training data of icdar dataset
  |- ch4_test_images/             Testing data of icdar dataset
  |- train_icdar2015_label.txt    Training annotation of icdar dataset
  |- test_icdar2015_label.txt     Test annotation of icdar dataset
```
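
Each line of a detection label file holds an image path and a JSON list of text boxes, separated by a tab. The following is a minimal parsing sketch, assuming that layout; the sample line and the function name are illustrative and not taken from this repository. In ICDAR 2015, a transcription of `###` conventionally marks an unreadable region, which the sketch exposes as an `ignore` flag.

```python
import json


def parse_det_label_line(line):
    """Split one label line into (image_path, list of box dicts)."""
    image_path, annotation_json = line.rstrip("\n").split("\t", 1)
    boxes = []
    for item in json.loads(annotation_json):
        boxes.append({
            "text": item["transcription"],
            "points": item["points"],            # four [x, y] corner points
            "ignore": item["transcription"] == "###",
        })
    return image_path, boxes


# Illustrative sample line (not real data from the label files).
sample = (
    'ch4_training_images/img_1.jpg\t'
    '[{"transcription": "Genaxis Theatre", '
    '"points": [[377, 117], [463, 117], [465, 130], [378, 130]]}, '
    '{"transcription": "###", '
    '"points": [[10, 10], [20, 10], [20, 20], [10, 20]]}]'
)
path, boxes = parse_det_label_line(sample)
```
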

### Recognition Model Dataset
Prepare the label data using either of the following two methods:
```bash
# Run the commands below from the repository root

# Method 1: download the label files
# Training set labels
wget -P ./train_data/rec  https://paddleocr.bj.bcebos.com/dataset/rec_gt_train.txt
# Test set labels
wget -P ./train_data/rec  https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt

# Method 2: convert the label files downloaded from the official site into rec_gt_train.txt and rec_gt_test.txt
python ppocr/utils/gen_label.py --mode="rec" --input_path="{path/of/train/label}" --output_label="train_data/rec/rec_gt_train.txt"
python ppocr/utils/gen_label.py --mode="rec" --input_path="{path/of/test/label}" --output_label="train_data/rec/rec_gt_test.txt"
```
The prepared data directory should look like this:
```
|-train_data
  |-rec
    |- rec_gt_train.txt
    |- train
        |- word_001.png
        |- word_002.jpg
        |- word_003.jpg
        | ...
    |- rec_gt_test.txt
    |- test
        |- word_001.jpg
        |- word_002.jpg
        |- word_003.jpg
        | ...
```
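
Recognition label files are simpler: each line pairs an image path with its text, separated by a tab. The following is a small loading sketch under that assumption; the function names and sample lines are hypothetical.

```python
def parse_rec_label_line(line):
    """Split one label line into (image_path, text)."""
    image_path, text = line.rstrip("\n").split("\t", 1)
    return image_path, text


def load_rec_labels(lines):
    """Build a {image_path: text} mapping, skipping blank lines."""
    return dict(parse_rec_label_line(line) for line in lines if line.strip())


# Illustrative sample lines (not real data from rec_gt_train.txt).
sample_lines = [
    "train/word_001.png\tJOINT\n",
    "train/word_002.jpg\tyourself\n",
    "\n",
]
labels = load_rec_labels(sample_lines)
```

In practice the lines would come from something like `open("train_data/rec/rec_gt_train.txt", encoding="utf-8")`.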

## Training
> Update the dataset paths in the `yml` files under `configs` to match where your data is actually stored.

### Detection Model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
```
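
As an alternative to editing the `yml` file, values can usually be overridden on the command line with `-o`, the same mechanism the command above uses for `Global.pretrained_model`. The sketch below assembles such a command in Python. The helper name is hypothetical, and the dataset keys (`Train.dataset.data_dir`, `Train.dataset.label_file_list`) are assumed to match the structure of `configs/det/det_mv3_db.yml`, so check them against your copy of the config before relying on them.

```python
def build_train_command(config, overrides, gpus="0,1,2,3"):
    """Return an argv list for distributed training with -o key=value overrides."""
    command = [
        "python3", "-m", "paddle.distributed.launch", "--gpus", gpus,
        "tools/train.py", "-c", config, "-o",
    ]
    command += [f"{key}={value}" for key, value in overrides.items()]
    return command


data_dir = "./train_data/icdar2015/text_localization/"
command = build_train_command(
    "configs/det/det_mv3_db.yml",
    {
        "Global.pretrained_model": "./pretrain_models/MobileNetV3_large_x0_5_pretrained",
        # Assumed key names; verify against the yml file.
        "Train.dataset.data_dir": data_dir,
        "Train.dataset.label_file_list": f"['{data_dir}train_icdar2015_label.txt']",
    },
)
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which avoids shell-quoting issues with the bracketed list value.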

### Recognition Model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
```

### Testing
#### Detection Model
- Paddle
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./output/db_mv3/best_accuracy.pdparams
```

- ONNX Runtime (ort)
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx --use_onnx=true
```

#### Recognition Model
- Paddle
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./output/v3_en_mobile/best_accuracy.pdparams
```
- ONNX Runtime (ort)
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml -o Global.pretrained_model=./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx --use_onnx=true
```

## Inference
### Paddle
```bash
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=true --rec_image_shape=3,48,320 --warmup=1
```

### ONNX Runtime (ort)
```bash
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.onnx" --rec_model_dir="./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx" --use_onnx=true --use_angle_cls=true --rec_image_shape=3,48,320 --warmup=1
```
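
The `--rec_image_shape=3,48,320` option describes the recognition input as channels, height, and width. A common way to fit a text crop to such a shape is to scale it to the target height while keeping its aspect ratio, cap the width, and pad the remainder. The following is a simplified sketch of that idea with a hypothetical helper; the preprocessing actually performed by `predict_system.py` may differ in detail.

```python
import math


def rec_resize_plan(width, height, rec_image_shape=(3, 48, 320)):
    """Return (resized_width, right_padding) for an aspect-preserving resize."""
    _, target_height, max_width = rec_image_shape
    aspect_ratio = width / height
    # Scale to the target height, but never exceed the maximum width.
    resized_width = min(max_width, math.ceil(target_height * aspect_ratio))
    return resized_width, max_width - resized_width


short_crop = rec_resize_plan(100, 32)    # narrow crop: resized, then padded
long_crop = rec_resize_plan(1000, 32)    # very wide crop: squeezed to max width
```
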

## Results
<div align=center>
    <img src="./inference_results/08.jpg"/>
</div>

### Accuracy
#### Paddle
- Detection model test

| Model | Precision | Recall |
| :------: | :------: | :------: |
| det | 0.7054 | 0.7193 |

- Recognition model test

| Model | Acc |
| :------: | :------: |
| rec | 0.6490 |

#### ONNX Runtime (ort)
- Detection model test

| Model | Precision | Recall |
| :------: | :------: | :------: |
| det | 0.5097 | 0.4068 |

- Recognition model test

| Model | Acc |
| :------: | :------: |
| rec | 0.6076 |

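Precision and recall are often summarized by their harmonic mean (also called hmean or F1). The helper below, a sketch with a hypothetical function name, derives it from the detection figures in the tables above. The resulting values are computed here for convenience and are not measured results reported by the source.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Detection precision/recall copied from the tables above.
paddle_det_hmean = hmean(0.7054, 0.7193)
ort_det_hmean = hmean(0.5097, 0.4068)
```

With these inputs the harmonic mean comes out at about 0.712 for Paddle and about 0.452 for ONNX Runtime.
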
## Application Scenarios
### Algorithm Category
OCR

### Key Application Industries
`Manufacturing, finance, transportation, education, healthcare`

## Source Repository and Issue Feedback
- https://developer.sourcefind.cn/codes/modelzoo/paddleocr

## References
- https://github.com/PaddlePaddle/PaddleOCR