# paddleocr_paddle
## Paper
PaddleOCR uses three models, det, rec, and cls, for text detection, text recognition, and text-direction classification, respectively.

The det model is based mainly on the DB algorithm; see the following paper:

https://arxiv.org/pdf/1911.08947.pdf
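
As a rough illustration of the DB idea, the paper replaces hard thresholding with an approximate step function, `B = 1 / (1 + exp(-k * (P - T)))`, where `P` is the probability map, `T` is the learned threshold map, and `k` is an amplification factor (50 in the paper). The sketch below applies it to single pixel values in plain Python. The function name and the sample values are illustrative and are not taken from this repository.

```python
import math


def differentiable_binarization(p: float, t: float, k: float = 50.0) -> float:
    """Approximate binary value for one pixel, following the DB formula.

    p: probability-map value, t: threshold-map value,
    k: amplification factor (k = 50 in the paper).
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


if __name__ == "__main__":
    # Illustrative values only: a pixel above, at, and below its threshold.
    for p, t in [(0.8, 0.3), (0.3, 0.3), (0.1, 0.3)]:
        print(f"P={p}, T={t} -> B={differentiable_binarization(p, t):.4f}")
```

Because the function is smooth, gradients can flow through the binarization step during training, which a hard threshold would not allow.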

The rec model is based mainly on the SVTR algorithm; see the following paper:

https://arxiv.org/pdf/2205.00159.pdf

The cls model uses MobileNetV3 for general-purpose classification; see the following paper:

https://arxiv.org/pdf/1905.02244.pdf
## Model Structure
det:

![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/det/dbnet-arc.png)

rec:

![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/rec/SVTR-arc.png)

cls:

![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/cls/mobilenetv3-arc.png)
## Algorithm Principle
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/ocr.png)
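
The way the three models combine can be sketched as a small orchestration function: detection yields cropped text regions, the optional direction classifier flips crops it judges to be upside down, and recognition reads each crop. All names below (`run_ocr`, the stage callables, the `"180"` label, and the `0.9` threshold) are assumptions chosen for illustration; the sketch is not the code in `tools/infer/predict_system.py`.

```python
from typing import Any, Callable, List, Sequence, Tuple


def run_ocr(
    image: Any,
    detect: Callable[[Any], Sequence[Any]],
    classify: Callable[[Any], Tuple[str, float]],
    recognize: Callable[[Any], Tuple[str, float]],
    rotate_180: Callable[[Any], Any],
    use_angle_cls: bool = True,
    cls_thresh: float = 0.9,
) -> List[Tuple[str, float]]:
    """Run det -> (cls) -> rec over one image and return (text, score) pairs."""
    results = []
    for crop in detect(image):
        if use_angle_cls:
            label, score = classify(crop)
            # Flip only when the classifier is confident the crop is upside down.
            if label == "180" and score > cls_thresh:
                crop = rotate_180(crop)
        results.append(recognize(crop))
    return results


if __name__ == "__main__":
    # Toy stand-ins: the "image" is a list of strings and a reversed
    # string plays the role of an upside-down crop.
    toy_image = ["olleh", "world"]
    print(
        run_ocr(
            toy_image,
            detect=lambda img: img,
            classify=lambda c: ("180", 0.95) if c == "olleh" else ("0", 0.99),
            recognize=lambda c: (c, 1.0),
            rotate_180=lambda c: c[::-1],
        )
    )
```

Passing the stages in as callables keeps the sketch independent of whether the Paddle or the ONNX Runtime backend is used.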
## Environment Setup
Docker images for training and inference can be pulled from [SourceFind (光源)](https://sourcefind.cn/#/main-page), and the paddle and onnxruntime installation packages can be downloaded from the [Guanghe Developer Community (光合开发者社区)](https://cancon.hpccube.com:65024/4/main/). The recommended image for PaddleOCR is:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.5.2-ubuntu20.04-dtk24.04.1-py3.8
# Replace /mount_dir with the directory to mount inside the container
docker run -d -t --privileged --device=/dev/kfd --device=/dev/dri/ --shm-size 64g --network=host -v `pwd`:/mount_dir -v /opt/hyhal:/opt/hyhal:ro --group-add video --name paddleocr-test image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.5.2-ubuntu20.04-dtk24.04.1-py3.8
docker exec -it paddleocr-test bash
pip3 install -r requirements.txt
pip3 install onnxruntime-1.15.0+das1.1.git739f24d.abi1.dtk2404-cp38-cp38-manylinux_2_31_x86_64.whl
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/pretrained/MobileNetV3_large_x0_5_pretrained.pdparams
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar
# Unpack the recognition checkpoint; the training command expects ./pretrain_models/en_PP-OCRv3_rec_train/
tar -xf ./pretrain_models/en_PP-OCRv3_rec_train.tar -C ./pretrain_models/
```
## Dataset
The [ICDAR 2015](https://rrc.cvc.uab.es/?ch=4&com=downloads) dataset is recommended.

Directory structure of the detection-model training set:
```
/PaddleOCR/train_data/icdar2015/text_localization/
  └─ icdar_c4_train_imgs/         Training data of icdar dataset
  └─ ch4_test_images/             Testing data of icdar dataset
  └─ train_icdar2015_label.txt    Training annotation of icdar dataset
  └─ test_icdar2015_label.txt     Test annotation of icdar dataset
```
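
To sanity-check the detection annotation files before training, a small parser can help. The sketch below assumes the label format described in the upstream PaddleOCR documentation: one line per image, with the image path, a tab, and a JSON list of objects that each carry a `transcription` and four `points`. That format is not stated in this README, and the function name and sample line are illustrative.

```python
import json
from typing import Dict, List, Tuple


def parse_det_label_line(line: str) -> Tuple[str, List[Dict]]:
    """Split one detection label line into (image_path, list of text boxes)."""
    image_path, annotation = line.rstrip("\r\n").split("\t", 1)
    boxes = json.loads(annotation)
    for box in boxes:
        # Each text region is assumed to be a quadrilateral (four corner points).
        if len(box["points"]) != 4:
            raise ValueError(f"{image_path}: expected 4 points, got {len(box['points'])}")
    return image_path, boxes


if __name__ == "__main__":
    sample = (
        "icdar_c4_train_imgs/img_1.jpg\t"
        '[{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]]}]'
    )
    path, boxes = parse_det_label_line(sample)
    print(path, [b["transcription"] for b in boxes])
```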
Directory structure of the recognition-model training set:
```
|-train_data
  |-rec
    |- rec_gt_train.txt
    |- train
        |- word_001.png
        |- word_002.jpg
        |- word_003.jpg
        | ...
    |-ic15_data
        |- rec_gt_test.txt
        |- test
            |- word_001.jpg
            |- word_002.jpg
            |- word_003.jpg
            | ...
```
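
The recognition label files (`rec_gt_train.txt`, `rec_gt_test.txt`) can be checked in a similar way. The sketch below assumes the upstream PaddleOCR convention of one sample per line, with the image path and its text separated by a tab. The function name and the sample labels are illustrative and are not taken from this repository.

```python
from typing import Iterable, List, Tuple


def parse_rec_labels(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (image_path, text) pairs from recognition label lines."""
    samples = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip empty lines
        parts = line.split("\t", 1)
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected '<path>\\t<text>'")
        samples.append((parts[0], parts[1]))
    return samples


if __name__ == "__main__":
    demo = ["train/word_001.png\tJOINT\n", "train/word_002.jpg\tyourself\n"]
    for path, text in parse_rec_labels(demo):
        print(path, "->", text)
```

In practice the lines would come from `open("train_data/rec/rec_gt_train.txt", encoding="utf-8")`, and each returned path could then be checked for existence under the dataset root.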
## Training

Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
```
## Testing
### Testing (paddle)
Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./output/db_mv3/best_accuracy.pdparams
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./output/v3_en_mobile/best_accuracy.pdparams
```
### Testing (ort)
Detection model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx --use_onnx=true
```
Recognition model
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml -o Global.pretrained_model=./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx --use_onnx=true
```
## Inference
### Inference (paddle)
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=true --rec_image_shape=3,48,320 --warmup=1
```
### Inference (ort)
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.onnx" --rec_model_dir="./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx" --use_onnx=true --use_angle_cls=true --rec_image_shape=3,48,320 --warmup=1
```
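
The `--rec_image_shape=3,48,320` flag gives the recognizer input as channels, height, and width. A plausible reading, based on the aspect-preserving resize used in upstream PaddleOCR, is that each text crop is scaled to height 48, capped at width 320, and padded on the right. The sketch below only computes the resulting sizes. It assumes a fixed target width, whereas the real predictor may widen the target per batch, and the function name is illustrative.

```python
import math
from typing import Tuple


def rec_resized_size(
    width: int, height: int, rec_image_shape: Tuple[int, int, int] = (3, 48, 320)
) -> Tuple[int, int, int]:
    """Return (resized_width, target_height, right_padding) for one text crop."""
    _, img_h, img_w = rec_image_shape
    ratio = width / height
    # Keep the aspect ratio, but never exceed the target width.
    resized_w = min(img_w, int(math.ceil(img_h * ratio)))
    return resized_w, img_h, img_w - resized_w


if __name__ == "__main__":
    for w, h in [(100, 32), (1000, 50)]:
        print((w, h), "->", rec_resized_size(w, h))
```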
## Result
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/inference_results/08.jpg)
### Accuracy

Detection model test
| Model | Precision | Recall |
| :------: | :------: |:------: |
| det | 0.7054 | 0.7193  |

Recognition model test
| Model | Acc | 
| :------: | :------: |
| rec | 0.6490 | 

Detection model test (ort)
| Model | Precision | Recall |
| :------: | :------: |:------: |
| det | 0.5097 | 0.4068  |

Recognition model test (ort)
| Model | Acc | 
| :------: | :------: |
| rec | 0.6076 | 
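
The detection tables report precision and recall separately. To compare the two backends with a single number, the harmonic mean of the two (often called Hmean or F1) can be computed from the table values. The helper name below is illustrative, and the Hmean is derived here rather than reported by the authors.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall copied from the detection tables above.
    print(f"det (paddle): {hmean(0.7054, 0.7193):.4f}")
    print(f"det (ort):    {hmean(0.5097, 0.4068):.4f}")
```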
## Application Scenarios
### Algorithm Category
OCR
### Key Application Industries
Manufacturing, finance, transportation, education, healthcare
## Source Repository and Issue Feedback
https://developer.hpccube.com/codes/modelzoo/paddleocr
## References
* [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)