# PaddleOCR
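## Paper
The det, rec, and cls models use MobileNet-family backbones (MobileNetV3 for det and cls; the rec configuration below uses MobileNetV1Enhance). See the MobileNetV3 paper: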
https://arxiv.org/pdf/1905.02244.pdf
## Model Structure

det:
Backbone:
  name: MobileNetV3
  scale: 0.5
  model_name: large
Neck:
  name: DBFPN
  out_channels: 256
Head:
  name: DBHead

rec:
Backbone:
  name: MobileNetV1Enhance
  scale: 0.5
  last_conv_stride: [1, 2]
  last_pool_type: avg
Head:
  name: MultiHead
  head_list:
    - CTCHead:
        Neck:
          name: svtr
          dims: 64
          depth: 2
          hidden_dims: 120
          use_guide: True
        Head:
          fc_decay: 0.00001
    - SARHead:
        enc_dim: 512
        max_text_length: *max_text_length

cls:
Backbone:
  name: MobileNetV3
  scale: 0.35
  model_name: small
Head:
  name: ClsHead
  class_dim: 2
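## Algorithm Principle
The pipeline runs in order: text detection (det), text direction classification (cls), then text recognition (rec), which outputs the text.

det->cls->rec->text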
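
The stage order above can be sketched as follows. This is only an illustration of the data flow, not the code in `tools/infer/predict_system.py`: the `run_pipeline` function, the `OcrResult` type, the box layout, and the three callables are hypothetical stand-ins for the real models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical box layout (x0, y0, x1, y1); real detectors may return polygons.
Box = Tuple[int, int, int, int]


@dataclass
class OcrResult:
    box: Box
    text: str


def run_pipeline(
    image,
    detect: Callable,
    classify: Callable,
    recognize: Callable,
) -> List[OcrResult]:
    """Run det -> cls -> rec on one image and collect the recognized text."""
    results = []
    for box in detect(image):                # det: find text regions
        angle = classify(image, box)         # cls: orientation of each region
        text = recognize(image, box, angle)  # rec: read the (rotated) region
        results.append(OcrResult(box, text))
    return results


if __name__ == "__main__":
    # Stub callables standing in for the three models.
    demo = run_pipeline(
        "image-placeholder",
        detect=lambda img: [(0, 0, 10, 10)],
        classify=lambda img, box: 0,
        recognize=lambda img, box, angle: "hello",
    )
    print(demo)
```

The cls stage is optional in practice: the inference commands later in this README toggle it with `--use_angle_cls`.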
## Dataset
The icdar2015 dataset is recommended: [icdar2015](https://rrc.cvc.uab.es/?ch=4&com=downloads)

File structure of the detection model training set
```
/PaddleOCR/train_data/icdar2015/text_localization/
  └─ icdar_c4_train_imgs/         Training data of icdar dataset
  └─ ch4_test_images/             Testing data of icdar dataset
  └─ train_icdar2015_label.txt    Training annotation of icdar dataset
  └─ test_icdar2015_label.txt     Test annotation of icdar dataset
```
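
A minimal sketch for reading the detection annotation files, assuming the usual PaddleOCR detection label format: one line per image, with the image path and a JSON list of annotations separated by a tab. The function name `parse_det_label_line` and the sample line (path, transcription, coordinates) are made up for illustration.

```python
import json
from typing import Dict, List, Tuple


def parse_det_label_line(line: str) -> Tuple[str, List[Dict]]:
    """Split one label line into (image_path, annotations).

    Assumed format: "<image path>\t<JSON list>", where each JSON entry has
    a "transcription" string and a "points" list of [x, y] corners.
    """
    image_path, annotations = line.rstrip("\n").split("\t", 1)
    return image_path, json.loads(annotations)


if __name__ == "__main__":
    # Illustrative line only; not taken from the real dataset.
    sample = (
        "ch4_test_images/img_1.jpg\t"
        '[{"transcription": "EXIT", '
        '"points": [[10, 20], [110, 20], [110, 60], [10, 60]]}]'
    )
    path, boxes = parse_det_label_line(sample)
    print(path, len(boxes), boxes[0]["transcription"])
```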
File structure of the recognition model training set
```
|-train_data
  |-rec
    |- rec_gt_train.txt
    |- train
        |- word_001.png
        |- word_002.jpg
        |- word_003.jpg
        | ...
    |-ic15_data
        |- rec_gt_test.txt
        |- test
            |- word_001.jpg
            |- word_002.jpg
            |- word_003.jpg
            | ...
```
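
A minimal sketch for reading `rec_gt_train.txt` or `rec_gt_test.txt`, assuming the usual PaddleOCR recognition label format of one `image path<TAB>text` pair per line. The function name `parse_rec_labels` and the sample labels are hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_rec_labels(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (image_path, text) pairs, skipping blank lines.

    Assumed format: "<image path>\t<text>". Only the first tab splits,
    so a label that itself contains a tab is kept intact.
    """
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        image_path, text = line.split("\t", 1)
        pairs.append((image_path, text))
    return pairs


if __name__ == "__main__":
    # Illustrative labels only; not taken from the real dataset.
    sample = ["train/word_001.png\tJOINT\n", "\n", "train/word_002.jpg\tyourself\n"]
    for image_path, text in parse_rec_labels(sample):
        print(image_path, "->", text)
```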
## Environment Setup
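Docker images for training and inference can be pulled from [SourceFind (光源)](https://www.sourcefind.cn/#/service-details), and the paddle installation package can be downloaded from the [Guanghe Developer Community (光合开发者社区)](https://cancon.hpccube.com:65024/4/main/). The recommended image for PaddleOCR is: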
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.3.2-centos7.6-dtk-22.10.1-py37-latest
```
## Training

Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
```
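
The two training commands share one shape: the distributed launcher, a GPU list, the training script, a config passed with `-c`, and overrides passed with `-o`. As a sketch, a small helper can assemble that command line for different configs. The helper name `build_launch_command` is hypothetical; it only uses the flags that appear in the commands above.

```python
import shlex
from typing import Dict, List, Sequence


def build_launch_command(
    script: str,
    config: str,
    overrides: Dict[str, str],
    gpus: Sequence[int] = (0, 1, 2, 3),
) -> List[str]:
    """Assemble the launcher command as an argument list."""
    cmd = [
        "python3", "-m", "paddle.distributed.launch",
        "--gpus", ",".join(str(g) for g in gpus),
        script, "-c", config,
    ]
    if overrides:
        # All overrides follow a single -o flag as key=value pairs.
        cmd.append("-o")
        cmd.extend("{}={}".format(key, value) for key, value in overrides.items())
    return cmd


if __name__ == "__main__":
    cmd = build_launch_command(
        "tools/train.py",
        "configs/det/det_mv3_db.yml",
        {"Global.pretrained_model": "./pretrain_models/MobileNetV3_large_x0_5_pretrained"},
    )
    # Print a shell-safe version; pass `cmd` to subprocess.run to execute it.
    print(" ".join(shlex.quote(part) for part in cmd))
```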
## Evaluation
Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./output/db_mv3/best_accuracy.pdparams
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./output/v3_en_mobile/best_accuracy.pdparams
```
## Evaluation (ONNX Runtime)
Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx --use_onnx=true
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml -o Global.pretrained_model=./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx --use_onnx=true
```
## Inference
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=false --rec_image_shape=3,48,320 --warmup=1
```
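
The `--rec_image_shape=3,48,320` argument reads as channels, height, width. The sketch below parses that string and shows one common way a recognition input could be sized from it, assuming an aspect-ratio-preserving resize to the target height with the width capped at the maximum (and padding for narrower crops). That resize rule is an assumption for illustration, not a statement of what `predict_system.py` does internally, and both function names are hypothetical.

```python
import math
from typing import Tuple


def parse_image_shape(value: str) -> Tuple[int, int, int]:
    """Parse a "C,H,W" string such as "3,48,320" into three integers."""
    channels, height, width = (int(part) for part in value.split(","))
    return channels, height, width


def resized_width(src_w: int, src_h: int, target_h: int, max_w: int) -> int:
    """Width after scaling a crop to target_h while keeping its aspect ratio.

    The result is capped at max_w; narrower crops would be padded up to max_w.
    """
    return min(max_w, math.ceil(target_h * src_w / src_h))


if __name__ == "__main__":
    _, height, width = parse_image_shape("3,48,320")
    # A 200x50 crop scaled to height 48, and a very wide crop that hits the cap.
    print(resized_width(200, 50, height, width))
    print(resized_width(1000, 50, height, width))
```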
## Inference (ONNX Runtime)
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.onnx" --rec_model_dir="./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx" --use_onnx=true --use_angle_cls=true --rec_image_shape=3,48,320 --warmup=1
```
## Results
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/inference_results/08.jpg)
### Performance and Accuracy Data

Detection model evaluation
| Model | Precision | Recall |
| :------: | :------: |:------: |
| det | 0.7054 | 0.7193  |

Recognition model evaluation
| Model | Acc | 
| :------: | :------: |
| rec | 0.6490 | 

Detection model evaluation (ONNX Runtime)
| Model | Precision | Recall |
| :------: | :------: |:------: |
| det | 0.5097 | 0.4068  |

Recognition model evaluation (ONNX Runtime)
| Model | Acc | 
| :------: | :------: |
| rec | 0.6076 | 
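
The detection tables report precision and recall separately. Detection results are often summarized with their harmonic mean (F-score, sometimes called hmean), which is not listed above. As a sketch, it can be derived from the two reported values; the function name `hmean` is chosen here for illustration.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision/recall pairs copied from the detection tables above.
    for name, p, r in [("det", 0.7054, 0.7193), ("det (ONNX Runtime)", 0.5097, 0.4068)]:
        print("{}: hmean = {:.4f}".format(name, hmean(p, r)))
```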
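## Application Scenarios
### Algorithm Category
OCR
### Key Application Industries
Industrial manufacturing, finance, transportation, education, healthcare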
## Source Repository and Issue Feedback
https://developer.hpccube.com/codes/modelzoo/paddleocr
## References
* [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)