# PaddleOCR

## Paper
PaddleOCR uses three models (det, rec, and cls) for text detection, text recognition, and text direction classification, respectively.

The det model is mainly based on the DB algorithm. Reference paper:

https://arxiv.org/pdf/1911.08947.pdf

The rec model is mainly based on the SVTR algorithm. Reference paper:

https://arxiv.org/pdf/2205.00159.pdf

The cls model uses MobileNetV3 for general classification. Reference paper:

https://arxiv.org/pdf/1905.02244.pdf

## Model Structure
PaddleOCR uses three models, ch_PP-OCRv3_det + ch_ppocr_mobile_v2.0_cls + ch_PP-OCRv3_rec, to recognize text in images.

det:

![image](./Doc/images/dbnet-arc.png)

rec:

![image](./Doc/images/SVTR-arc.png)

cls:

![image](./Doc/images/mobilenetv3-arc.png)
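
The way the three models are chained can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `run_pipeline`, `rotate_180`, the callable signatures, the `"180"` label, and the `cls_thresh` value of 0.9 are all assumptions made for the example, and crops are represented as nested lists rather than real images.

```python
def rotate_180(crop):
    """Rotate a crop (nested list of rows) by 180 degrees."""
    return [row[::-1] for row in crop[::-1]]


def run_pipeline(image, detect, classify, recognize, cls_thresh=0.9):
    """Chain det -> cls -> rec (hypothetical interfaces).

    detect(image)   -> list of (box, crop) pairs
    classify(crop)  -> (angle_label, score), e.g. ("180", 0.97)
    recognize(crop) -> (text, score)
    """
    results = []
    for box, crop in detect(image):
        label, score = classify(crop)
        # Assumed rule: flip the crop only when cls is confident it is upside down.
        if label == "180" and score >= cls_thresh:
            crop = rotate_180(crop)
        text, rec_score = recognize(crop)
        results.append((box, text, rec_score))
    return results
```

In a real run, `detect`, `classify`, and `recognize` would wrap the det, cls, and rec ONNX models respectively.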

## Algorithm Principle

## Environment Setup
### Docker
Pull the image:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:ort1.14.0_migraphx3.0.0-dtk22.10.1
```
Create and start the container, then install the dependencies:
```
docker run --shm-size 16g --network=host --name=paddleocr_ort --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/paddleocr_ort:/home/paddleocr_ort -it <Your Image ID> /bin/bash

# Activate DTK
source /opt/dtk/env.sh
```

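## Dataset
## Inference
### Python Inference
This example uses the PaddleOCR models with the ONNXRuntime inference framework to recognize text in images. Download the model files from https://pan.baidu.com/s/1uGHhimKLb5k5f9xaFmNBwQ (extraction code: ggvz) and save ch_PP-OCRv3_det_infer.onnx, ch_ppocr_mobile_v2.0_cls_infer.onnx, and ch_PP-OCRv3_rec_infer.onnx in the Resource/Models folder. The steps below describe how to run the Python example; see Tutorial_Python.md in the Doc directory for a detailed walkthrough.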
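
Before running the example, it can help to confirm that the three model files are where the example expects them. The following is a minimal sketch, assuming it is run from the project root so that `Resource/Models` resolves correctly; `missing_models` is a hypothetical helper and not part of this repository.

```python
from pathlib import Path

# File names are taken from the text above.
REQUIRED_MODELS = (
    "ch_PP-OCRv3_det_infer.onnx",
    "ch_ppocr_mobile_v2.0_cls_infer.onnx",
    "ch_PP-OCRv3_rec_infer.onnx",
)


def missing_models(model_dir):
    """Return the required model file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_MODELS if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Assumption: the current working directory is the project root.
    missing = missing_models("Resource/Models")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```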
#### Set the Python Environment Variable
```
export PYTHONPATH=/opt/dtk/lib:$PYTHONPATH
```
#### Run the Example
```
# Enter the paddleocr ort project root directory
cd <path_to_paddleocr_ort>

# Enter the example directory
cd Python/

# Install dependencies
pip install -r requirements.txt

# Run the example
python paddleocr.py
```

### C++ Inference
This example uses the PaddleOCR models with the ONNXRuntime inference framework to recognize text in images. Download the model files from https://pan.baidu.com/s/1uGHhimKLb5k5f9xaFmNBwQ (extraction code: ggvz) and save ch_PP-OCRv3_det_infer.onnx, ch_ppocr_mobile_v2.0_cls_infer.onnx, and ch_PP-OCRv3_rec_infer.onnx in the Resource/Models folder. The steps below describe how to run the C++ example; see Tutorial_Cpp.md in the Doc directory for a detailed walkthrough.
#### Build the Project
```
source /opt/dtk/env.sh

# Install the OpenCV dependencies
cd <path_to_paddleocr_ort>
sh ./3rdParty/InstallOpenCVDependences.sh

rbuild build -d depend
```
#### Set Environment Variables
Add the dependency libraries to the LD_LIBRARY_PATH environment variable by adding the following line to ~/.bashrc:
```
export LD_LIBRARY_PATH=<path_to_paddleocr_ort>/depend/lib64/:$LD_LIBRARY_PATH
```
Then run:
```
source ~/.bashrc
```
#### Run the Example
```
# Enter the paddleocr ort project root directory
cd <path_to_paddleocr_ort>

# Enter the build directory
cd build/

# Run the example program
./PaddleOCR
```

## Result
### Python Version
```
[[[[245.0, 9.0], [554.0, 8.0], [554.0, 27.0], [245.0, 28.0]], '人生活的真实写照:善有善报,恶有恶报。', '0.9306996673345566'], [[[9.0, 49.0], [522.0, 50.0], [522.0, 69.0], [9.0, 68.0]], '我们中国人有一句俗语说:“种瓜得瓜,种豆得豆。”而这就是每个', '0.9294075581335253'], [[[84.0, 105.0], [555.0, 104.0], [555.0, 125.0], [85.0, 127.0]], "every man's life: good begets good, and evil leads to evil.", '0.8932319914301237'], [[[28.0, 147.0], [556.0, 146.0], [556.0, 168.0], [28.0, 169.0]], 'melons; if he sows beans, he will reap beans." And this is true of', '0.900923888185131'], [[[0.0, 185.0], [524.0, 188.0], [524.0, 212.0], [0.0, 209.0]], 'We Chinese have a saying:"If a man plants melons, he will reap', '0.9216671202863965'], [[[295.0, 248.0], [553.0, 248.0], [553.0, 264.0], [295.0, 264.0]], '它不仅适用于今生,也适用于来世。', '0.927988795673146'], [[[14.0, 289.0], [554.0, 290.0], [554.0, 307.0], [14.0, 306.0]], '一每一个行为都有一种结果。在我看来,这种想法是全宇宙的道德基础;', '0.88565122719967'], [[[9.0, 330.0], [521.0, 330.0], [521.0, 349.0], [9.0, 349.0]], '假如说过去的日子曾经教给我们一些什么的话,那就是有因必有果一', '0.9162070232052957'], [[[343.0, 388.0], [555.0, 388.0], [555.0, 405.0], [343.0, 405.0]], 'in this world and the next.', '0.8764956444501877'], [[[15.0, 426.0], [554.0, 426.0], [554.0, 448.0], [15.0, 448.0]], 'opinion, is the moral foundation of the universe; it applies equally', '0.9183026262815448'], [[[62.0, 466.0], [556.0, 468.0], [556.0, 492.0], [62.0, 490.0]], 'effect - every action has a consequence. This thought, in my', '0.9308378403304053']]
```
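
Each entry in the output above is a `[box, text, score]` triple, where the box is four corner points and the score is printed as a string. A minimal sketch of post-processing such a result follows; `summarize` and the `min_score` threshold of 0.9 are hypothetical choices for illustration, and the two sample entries are copied from the output above.

```python
def summarize(results, min_score=0.9):
    """Return (text, score) pairs whose confidence is at least min_score."""
    kept = []
    for box, text, score in results:
        score = float(score)  # scores appear as strings in the printed result
        if score >= min_score:
            kept.append((text, score))
    return kept


# Two entries copied from the sample output above.
sample = [
    [[[343.0, 388.0], [555.0, 388.0], [555.0, 405.0], [343.0, 405.0]],
     'in this world and the next.', '0.8764956444501877'],
    [[[15.0, 426.0], [554.0, 426.0], [554.0, 448.0], [15.0, 448.0]],
     'opinion, is the moral foundation of the universe; it applies equally',
     '0.9183026262815448'],
]

for text, score in summarize(sample):
    print(f"{score:.3f}  {text}")
```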

### C++ Version
```
TextBox[0](+padding)[score(0.711119),[x: 293, y: 58], [x: 604, y: 58], [x: 604, y: 79], [x: 293, y: 79]]
...
TextBox[11](+padding)[score(0.605026),[x: 92, y: 554], [x: 610, y: 557], [x: 609, y: 585], [x: 92, y: 582]]
---------- step: drawTextBoxes ----------
---------- step: angleNet getAngles ----------
angle[0][index(1), score(1.000000), time(57.276707ms)]
...
angle[11][index(1), score(0.930842), time(2.952602ms)]
---------- step: crnnNet getTextLine ----------
textLine[0](人生活的真实写照:善有善报,恶有恶报。)
textScores[0]{0.576271 ,0.99956 ,0.999475 ,0.99967 ,0.998779 ,0.999525 ,0.805865 ,0.999865 ,0.988233 ,0.999061 ,0.999581 ,0.999483 ,0.999324 ,0.995648 ,0.561861 ,0.961845 ,0.995993 ,0.998593 ,0.994963}
crnnTime[0](58.019418ms)
...
textLine[11](If the past has taught us anything, it is that every cause brings)
textScores[11]{0.996653 ,0.625094 ,0.97989 ,0.999761 ,0.816289 ,0.99883 ,0.963821 ,0.999222 ,0.999725 ,0.999588 ,0.542554 ,0.998707 ,0.911063 ,0.603935 ,0.99833 ,0.994734 ,0.998606 ,0.999571 ,0.9995 ,0.99971 ,0.983833 ,0.941867 ,0.989647 ,0.999145 ,0.998365 ,0.995752 ,0.999369 ,0.999424 ,0.976135 ,0.998815 ,0.999755 ,0.67898 ,0.999837 ,0.999205 ,0.982815 ,0.991013 ,0.999252 ,0.818822 ,0.996863 ,0.998451 ,0.999198 ,0.812635 ,0.999701 ,0.567811 ,0.999545 ,0.815998 ,0.996471 ,0.998722 ,0.999546 ,0.999121 ,0.999202 ,0.99971 ,0.980306 ,0.999399 ,0.635116 ,0.99954 ,0.998961 ,0.600432 ,0.990555 ,0.999872 ,0.998974 ,0.999687 ,0.56602 ,0.999607 ,0.999343}
crnnTime[11](38.051758ms)

```
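
The `textScores[i]{...}` lines above appear to hold one confidence value per recognized character. As a rough sketch of how such a log could be summarized, the snippet below extracts those lines with a regular expression and averages them per text line. `mean_char_scores` is a hypothetical helper, the averaging is just one possible summary and is not claimed to match the score reported by the Python version, and the log fragment is a shortened made-up sample in the same format.

```python
import re

# Matches lines such as: textScores[0]{0.576271 ,0.99956 ,0.999475}
SCORES_RE = re.compile(r"textScores\[(\d+)\]\{([^}]*)\}")


def mean_char_scores(log_text):
    """Map each text line index to the mean of its listed scores."""
    means = {}
    for index, body in SCORES_RE.findall(log_text):
        scores = [float(s) for s in body.split(",") if s.strip()]
        if scores:
            means[int(index)] = sum(scores) / len(scores)
    return means


# Made-up fragment that mimics the log format shown above.
example_log = """
textLine[0](abc)
textScores[0]{0.5 ,1.0 ,0.75}
crnnTime[0](1.0ms)
"""

print(mean_char_scores(example_log))
```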

## Application Scenarios
### Algorithm Category
`ocr`
### Popular Application Industries
`Industrial manufacturing`, `Finance`, `Transportation`, `Education`, `Healthcare`

## Source Repository and Issue Feedback
https://developer.hpccube.com/codes/modelzoo/paddleocr_onnxruntime

## References

https://github.com/RapidAI/RapidOCR

https://github.com/RapidAI/RapidOcrOnnx