# chineseocr_lite
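## Papers
chineseocr_lite uses three models, det, rec, and cls, for text detection, text recognition, and text direction classification, respectively.

The det model is based mainly on the DB algorithm. Reference paper:

https://arxiv.org/pdf/1911.08947.pdf

The rec model is based mainly on the CRNN algorithm. Reference paper:

https://arxiv.org/pdf/1507.05717.pdf

The cls model uses ResNet for general-purpose classification. Reference paper:

https://arxiv.org/pdf/1512.03385.pdf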
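
As a rough illustration of how the three models might be chained, the sketch below runs det, then cls, then rec on each detected region. The function `run_ocr` and all four callables are hypothetical stand-ins rather than this repository's API, and the sketch assumes the cls model reports either 0 or 180 degrees.

```python
from typing import Any, Callable, Iterable, List, Tuple


def run_ocr(
    image: Any,
    detect: Callable[[Any], Iterable[Tuple[Any, Any]]],
    classify_angle: Callable[[Any], int],
    rotate_180: Callable[[Any], Any],
    recognize: Callable[[Any], str],
) -> List[Tuple[Any, str]]:
    """Chain det -> cls -> rec. All four callables are hypothetical stand-ins."""
    results = []
    for box, crop in detect(image):             # det: locate text regions
        if classify_angle(crop) == 180:         # cls: assumed to return 0 or 180
            crop = rotate_180(crop)
        results.append((box, recognize(crop)))  # rec: read the text in the crop
    return results


# Toy stand-ins so the sketch runs without model files: the "image" is a list
# of strings, and a reversed string plays the role of an upside-down crop.
toy_image = ["hello", "dlrow"]
output = run_ocr(
    toy_image,
    detect=lambda img: list(enumerate(img)),
    classify_angle=lambda crop: 180 if crop == "dlrow" else 0,
    rotate_180=lambda crop: crop[::-1],
    recognize=lambda crop: crop.upper(),
)
print(output)  # prints [(0, 'HELLO'), (1, 'WORLD')]
```

In a real run the callables would wrap the det, cls, and rec models listed above.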
## Model Architecture
det:

![image](https://developer.hpccube.com/codes/modelzoo/chineseocr_lite_onnx/-/raw/main/configs/dbnet-arc.png)

rec:

![image](https://developer.hpccube.com/codes/modelzoo/chineseocr_lite_onnx/-/raw/main/configs/crnn-arc.png)

cls:

![image](https://developer.hpccube.com/codes/modelzoo/chineseocr_lite_onnx/-/raw/main/configs/resnet-arc.png)
## Algorithm Principle
![image](https://developer.hpccube.com/codes/modelzoo/chineseocr_lite_onnx/-/raw/main/configs/ocr.png)
## Dataset
The recommended dataset is [icdar2015](https://rrc.cvc.uab.es/?ch=4&com=downloads).

A fast download link for the dataset on SCNet: [icdar2015](http://113.200.138.88:18080/aidatasets/project-dependency/icdar2015/-/tree/new_icdar2015)

Directory layout of the detection training set:
```
/PaddleOCR/train_data/icdar2015/text_localization/
  └─ icdar_c4_train_imgs/         Training data of icdar dataset
  └─ ch4_test_images/             Testing data of icdar dataset
  └─ train_icdar2015_label.txt    Training annotation of icdar dataset
  └─ test_icdar2015_label.txt     Test annotation of icdar dataset
```
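
Before launching an evaluation it can help to confirm that the detection data is laid out as shown above. The following is a minimal sketch using only the standard library; the helper name `missing_entries` is hypothetical, and the entry names are copied from the tree above.

```python
from pathlib import Path

# Entries expected directly under the text_localization directory,
# taken from the directory tree shown above.
EXPECTED_ENTRIES = [
    "icdar_c4_train_imgs",
    "ch4_test_images",
    "train_icdar2015_label.txt",
    "test_icdar2015_label.txt",
]


def missing_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


missing = missing_entries("/PaddleOCR/train_data/icdar2015/text_localization/")
if missing:
    print("Missing entries:", ", ".join(missing))
else:
    print("Detection dataset layout looks complete.")
```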
Directory layout of the recognition training set:
```
|-train_data
  |-rec
    |- rec_gt_train.txt
    |- train
        |- word_001.png
        |- word_002.jpg
        |- word_003.jpg
        | ...
    |-ic15_data
        |- rec_gt_test.txt
        |- test
            |- word_001.jpg
            |- word_002.jpg
            |- word_003.jpg
            | ...
```
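
The label files `rec_gt_train.txt` and `rec_gt_test.txt` pair each image with its transcription. The sketch below assumes the common PaddleOCR convention of one sample per line, with the image path and the text separated by a tab; that format is an assumption here, not something this README states, and `parse_rec_labels` is a hypothetical helper.

```python
def parse_rec_labels(lines):
    """Parse lines of the assumed form '<image path>\t<text>' into (path, text) pairs."""
    pairs = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        path, sep, text = line.partition("\t")
        if not sep:
            raise ValueError(f"line {line_no}: expected '<image path><TAB><text>'")
        pairs.append((path, text))
    return pairs


# Illustrative sample lines (made-up transcriptions, not real dataset content).
sample = ["train/word_001.png\thello\n", "\n", "train/word_002.jpg\tworld\n"]
print(parse_rec_labels(sample))
```

For a real file, pass an open file object, for example `parse_rec_labels(open("train_data/rec/rec_gt_train.txt", encoding="utf-8"))`.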
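## Environment Setup
Docker images for training and inference can be pulled from [光源](https://www.sourcefind.cn/#/service-details), and the paddle installation package used for model testing can be downloaded from the [光合 developer community](https://cancon.hpccube.com:65024/4/main/). The recommended image for chineseocr_lite_onnx is: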
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:ort-lite-1.14.0_migraphx3.1.2-dtk23.04
docker run -d -t --privileged --device=/dev/kfd --device=/dev/dri/ --network=host --group-add video --name chineseocr-test image.sourcefind.cn:5000/dcu/admin/base/custom:ort-lite-1.14.0_migraphx3.1.2-dtk23.04
docker exec -it chineseocr-test bash
source /opt/dtk-23.04/env.sh
cd chineseocr_lite_onnx
pip3 install -r requirements.txt
pip install urllib3==1.23 pyyaml
```
## Training
### Testing
Detection model:
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/eval.py -c configs/det_mv3_db.yml -o Global.pretrained_model=./models/dbnet.onnx
```
Recognition model:
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/eval.py -c configs/ch_PP-OCRv3_rec.yml -o Global.pretrained_model=./models/crnn_lite_lstm.onnx
```
## Inference
```
python3 main.py --img_dir="./images/" --det_model_dir="./models/dbnet.onnx" --rec_model_dir="./models/crnn_lite_lstm.onnx" --cls_model_dir="./models/angle_net.onnx" --use_angle_cls=1 --warmup=1
```
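
To launch the same inference from a script, for example to loop over several image directories, the command can be assembled as an argument list. This is only a sketch: `build_inference_command` is a hypothetical helper, the flags and default paths are copied from the command above, and it is an assumption that `--use_angle_cls=0` and `--warmup=0` turn those options off.

```python
import shlex


def build_inference_command(
    img_dir="./images/",
    det_model="./models/dbnet.onnx",
    rec_model="./models/crnn_lite_lstm.onnx",
    cls_model="./models/angle_net.onnx",
    use_angle_cls=True,
    warmup=True,
):
    """Build the argv list for main.py using the flags shown above."""
    return [
        "python3",
        "main.py",
        f"--img_dir={img_dir}",
        f"--det_model_dir={det_model}",
        f"--rec_model_dir={rec_model}",
        f"--cls_model_dir={cls_model}",
        f"--use_angle_cls={int(use_angle_cls)}",  # assumed: 0 disables the cls step
        f"--warmup={int(warmup)}",                # assumed: 0 skips warm-up
    ]


cmd = build_inference_command()
print(shlex.join(cmd))
# Inside the repository, the command could then be executed with:
#   import subprocess; subprocess.run(cmd, check=True)
```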
## Result
![image](https://developer.hpccube.com/codes/modelzoo/chineseocr_lite_onnx/-/raw/main/dbnet/test.jpg)
### Accuracy

Detection model test results:

| Model | Precision | Recall |
| :------: | :------: | :------: |
| det | 0.6969 | 0.2291 |

Recognition model test results:

| Model | Acc |
| :------: | :------: |
| rec | 0.1160 |
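
Detection results are often summarized by the harmonic mean of precision and recall (F1). The value below is derived from the detection precision and recall reported above and is not a figure reported by the source; `f1_score` is a hypothetical helper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the det model from the detection table.
det_f1 = f1_score(0.6969, 0.2291)
print(f"{det_f1:.4f}")  # prints 0.3448
```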
## Application Scenarios
### Algorithm Category
OCR
### Popular Application Industries
Manufacturing, finance, transportation, education, healthcare
## Source Repository and Issue Feedback
https://developer.hpccube.com/codes/modelzoo/chineseocr_lite_onnx
## References
* [chineseocr_lite](https://github.com/DayBreak-u/chineseocr_lite)