# PaddleOCR
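## Paper
The det and cls models use MobileNetV3 backbones, and the rec model uses MobileNetV1Enhance (see the model structure section). For background, refer to the MobileNetV3 paper: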

https://arxiv.org/pdf/1905.02244.pdf
## Model Structure

det:
  └─ Backbone:
      └─ name: MobileNetV3
      └─ scale: 0.5
      └─ model_name: large
  └─ Neck:
      └─ name: DBFPN
      └─ out_channels: 256
  └─ Head:
      └─ name: DBHead

rec:
  └─ Backbone:
      └─ name: MobileNetV1Enhance
      └─ scale: 0.5
      └─ last_conv_stride: [1, 2]
      └─ last_pool_type: avg
  └─ Head:
      └─ name: MultiHead
      └─ head_list:
        └─ CTCHead:
            └─ Neck:
              └─ name: svtr
              └─ dims: 64
              └─ depth: 2
              └─ hidden_dims: 120
              └─ use_guide: True
            └─ Head:
              └─ fc_decay: 0.00001
        └─ SARHead:
            └─ enc_dim: 512
            └─ max_text_length: *max_text_length

cls:
  └─ Backbone:
      └─ name: MobileNetV3
      └─ scale: 0.35
      └─ model_name: small
  └─ Head:
      └─ name: ClsHead
      └─ class_dim: 2
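## Algorithm Principle
The pipeline runs three models in sequence: detection (det) locates text regions, direction classification (cls) corrects the orientation of each region, and recognition (rec) converts each region to text. The cls stage is optional and is controlled by `--use_angle_cls` at inference time.

det -> cls -> rec -> text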
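
The det -> cls -> rec flow can be sketched as a simple composition of three stages. This is a minimal illustration, not PaddleOCR's actual code: `run_ocr`, `detect`, `classify_angle`, `rotate_180`, and `recognize` are hypothetical names, and the toy stand-ins operate on strings so the sketch runs without any model files.

```python
def run_ocr(image, detect, classify_angle, rotate_180, recognize,
            use_angle_cls=True):
    """Chain the stages: det -> cls -> rec -> text."""
    texts = []
    for region in detect(image):                 # det: locate text regions
        if use_angle_cls and classify_angle(region) == 180:
            region = rotate_180(region)          # cls: fix flipped regions
        texts.append(recognize(region))          # rec: region -> text
    return texts


# Toy stand-ins (assumptions for illustration only). The "image" is a list
# of strings; a region stored reversed with a "~" prefix plays the role of
# an upside-down crop.
toy_image = ["hello", "~dlrow"]
detect = lambda img: list(img)
classify_angle = lambda region: 180 if region.startswith("~") else 0
rotate_180 = lambda region: region[1:][::-1]
recognize = lambda region: region.upper()

print(run_ocr(toy_image, detect, classify_angle, rotate_180, recognize))
# -> ['HELLO', 'WORLD']
```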
## Dataset
The ICDAR 2015 dataset is recommended: [icdar2015](https://rrc.cvc.uab.es/?ch=4&com=downloads)

Directory layout of the detection training set:
```
/PaddleOCR/train_data/icdar2015/text_localization/
  └─ icdar_c4_train_imgs/         Training data of icdar dataset
  └─ ch4_test_images/             Testing data of icdar dataset
  └─ train_icdar2015_label.txt    Training annotation of icdar dataset
  └─ test_icdar2015_label.txt     Test annotation of icdar dataset
```
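The annotation files listed above are plain text. As a sketch, and assuming the usual PaddleOCR detection label convention (one image per line: the image path, a tab, then a JSON list of boxes with `transcription` and `points` fields), a line could be parsed as follows. The helper name `parse_det_label_line` and the sample line are made up for illustration.

```python
import json


def parse_det_label_line(line):
    """Split one label line into (image_path, [(text, points), ...])."""
    image_path, annotations = line.rstrip("\n").split("\t", 1)
    boxes = json.loads(annotations)
    return image_path, [(b["transcription"], b["points"]) for b in boxes]


# Hypothetical sample line, not copied from the real dataset.
sample = (
    "icdar_c4_train_imgs/img_1.jpg\t"
    '[{"transcription": "MASA", '
    '"points": [[310, 104], [416, 141], [418, 216], [312, 179]]}]'
)
path, boxes = parse_det_label_line(sample)
print(path, boxes[0][0], len(boxes[0][1]))
```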
Directory layout of the recognition training set:
```
|-train_data
  |-rec
    |- rec_gt_train.txt
    |- train
        |- word_001.png
        |- word_002.jpg
        |- word_003.jpg
        | ...
    |-ic15_data
        |- rec_gt_test.txt
        |- test
            |- word_001.jpg
            |- word_002.jpg
            |- word_003.jpg
            | ...
```
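Before training, it can help to confirm that every image named in the label file actually exists. The sketch below assumes the common PaddleOCR recognition label convention (relative image path, a tab, then the text); `read_rec_labels` and `missing_images` are hypothetical helpers, and the demo builds a throwaway directory with made-up labels instead of touching real data.

```python
import tempfile
from pathlib import Path


def read_rec_labels(label_file):
    """Return [(relative_image_path, text), ...] from a tab-separated file."""
    pairs = []
    for line in Path(label_file).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        rel_path, text = line.split("\t", 1)
        pairs.append((rel_path, text))
    return pairs


def missing_images(data_dir, label_file):
    """List labelled images that are not present under data_dir."""
    data_dir = Path(data_dir)
    return [p for p, _ in read_rec_labels(label_file)
            if not (data_dir / p).is_file()]


# Demo on a temporary copy of the layout shown above (contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    rec_dir = Path(tmp) / "train_data" / "rec"
    (rec_dir / "train").mkdir(parents=True)
    (rec_dir / "train" / "word_001.png").write_bytes(b"")
    label_file = rec_dir / "rec_gt_train.txt"
    label_file.write_text(
        "train/word_001.png\tJOINT\ntrain/word_002.jpg\tyourself\n",
        encoding="utf-8",
    )
    labels = read_rec_labels(label_file)
    missing = missing_images(rec_dir, label_file)

print(labels)
print(missing)
```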
## Environment Setup
Docker images for training and inference can be pulled from [SourceFind (光源)](https://www.sourcefind.cn/#/service-details), and the paddle installation package can be downloaded from the [Guanghe Developer Community (光合开发者社区)](https://cancon.hpccube.com:65024/4/main/). The recommended image for PaddleOCR is:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.3.2-centos7.6-dtk-22.10.1-py37-latest
```
## Training

Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
```
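In the commands above, `-c` selects a YAML config and `-o Global.pretrained_model=...` overrides one key inside it. The idea behind such dotted-key overrides can be sketched as below. This is an illustration of the concept only, not PaddleOCR's implementation: `apply_overrides` is a hypothetical helper, the `epoch_num` value is a placeholder, and values are kept as plain strings.

```python
def apply_overrides(config, overrides):
    """Apply 'Section.key=value' strings to a nested dict in place."""
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        keys = dotted_key.split(".")
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # walk or create nested sections
        node[keys[-1]] = value
    return config


# Placeholder config standing in for the parsed YAML file.
config = {"Global": {"pretrained_model": None, "epoch_num": 500}}
apply_overrides(
    config,
    ["Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained"],
)
print(config["Global"]["pretrained_model"])
```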
## Testing
Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./output/db_mv3/best_accuracy.pdparams
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./output/v3_en_mobile/best_accuracy.pdparams
```
## Testing (ONNX Runtime)
Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx --use_onnx=true
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml -o Global.pretrained_model=./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx --use_onnx=true
```
## Inference
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=false --rec_image_shape=3,48,320 --warmup=1
```
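The `--rec_image_shape=3,48,320` flag describes the recognizer input as channels, height, and width. A common recipe for this kind of recognizer is to scale each text crop to the target height while keeping its aspect ratio, cap the width, and pad the remainder. The sketch below only computes the resulting sizes; `rec_resize_shape` is a hypothetical helper, and PaddleOCR's own preprocessing may differ in details.

```python
import math


def rec_resize_shape(src_h, src_w, rec_image_shape=(3, 48, 320)):
    """Return (target_h, resized_w, pad_w) for one text crop."""
    _, target_h, max_w = rec_image_shape
    ratio = src_w / float(src_h)
    # Keep the aspect ratio, but never exceed the maximum width.
    resized_w = min(max_w, int(math.ceil(target_h * ratio)))
    pad_w = max_w - resized_w  # columns to pad on the right
    return target_h, resized_w, pad_w


print(rec_resize_shape(32, 100))   # short crop: resized, then padded
print(rec_resize_shape(24, 400))   # long crop: clamped to the maximum width
```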
## Inference (ONNX Runtime)
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.onnx" --rec_model_dir="./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx" --use_onnx=true --use_angle_cls=true --rec_image_shape=3,48,320 --warmup=1
```
## Results
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/inference_results/08.jpg)
### Performance and Accuracy Data

Detection model test

| Model | Precision | Recall |
| :------: | :------: | :------: |
| det | 0.7054 | 0.7193 |

Recognition model test

| Model | Acc |
| :------: | :------: |
| rec | 0.6490 |

Detection model test (ONNX Runtime)

| Model | Precision | Recall |
| :------: | :------: | :------: |
| det | 0.5097 | 0.4068 |

Recognition model test (ONNX Runtime)

| Model | Acc |
| :------: | :------: |
| rec | 0.6076 |
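
Detection precision and recall are often summarized by their harmonic mean (F-score). The tables above do not report it, so the sketch below derives it from the listed values purely as arithmetic; these derived numbers are not additional measurements, and `hmean` is a helper defined here for illustration.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


det_paddle = hmean(0.7054, 0.7193)  # detection model test
det_ort = hmean(0.5097, 0.4068)     # detection model test (ONNX Runtime)
print(f"{det_paddle:.3f} {det_ort:.3f}")  # approximately 0.712 and 0.452
```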
## Application Scenarios
### Algorithm Category
OCR
### Popular Application Industries
Industrial manufacturing, finance, transportation, education, healthcare
## Source Repository and Issue Feedback
https://developer.hpccube.com/codes/modelzoo/paddleocr
## References
* [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)