# PaddleOCR
## Paper
PaddleOCR uses three models, det, rec, and cls, to perform text detection, text recognition, and text orientation classification, respectively.

The det model mainly uses the DB algorithm; see the following paper:

https://arxiv.org/pdf/1911.08947.pdf
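
For orientation, the core idea of the DB paper linked above is a differentiable approximation of thresholding: each pixel of the probability map `P` is compared against a learned threshold map `T` through a steep sigmoid, `1 / (1 + exp(-k * (P - T)))`, with the amplification factor `k` set to 50 in the paper. The snippet below is only a plain-Python sketch of that formula on toy nested lists; the function names are illustrative and are not taken from this repository.

```python
import math


def approximate_binarization(p, t, k=50.0):
    """Differentiable binarization of one pixel: 1 / (1 + exp(-k * (p - t)))."""
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def binarize_map(prob_map, thresh_map, k=50.0):
    """Apply the formula element-wise to two equally sized 2D lists."""
    return [
        [approximate_binarization(p, t, k) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Toy 1x3 maps: probabilities above the threshold move toward 1, below toward 0.
approx = binarize_map([[0.9, 0.5, 0.1]], [[0.3, 0.5, 0.3]])
```

Because the sigmoid is steep, values only slightly above or below the threshold already saturate, which is what lets the step behave like hard binarization while staying differentiable during training.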

The rec model mainly uses the SVTR algorithm; see the following paper:

https://arxiv.org/pdf/2205.00159.pdf

The cls model uses MobileNetV3 for general-purpose classification; see the following paper:

https://arxiv.org/pdf/1905.02244.pdf

## Model Structure
PaddleOCR uses three models, ch_PP-OCRv3_det + ch_ppocr_mobile_v2.0_cls + ch_PP-OCRv3_rec, to recognize text in images.

det:

![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/det/dbnet-arc.png)

rec:

![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/rec/SVTR-arc.png)

cls:

![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/configs/cls/mobilenetv3-arc.png)
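
As a rough illustration of how the three models fit together, the sketch below chains a detector, an orientation classifier, and a recognizer. It is a minimal sketch under stated assumptions, not this project's code: `run_ocr`, `crop`, `rotate_180`, and the three callables are hypothetical names, the image is a toy list of pixel rows, and boxes are simplified to axis-aligned rectangles (the sample output in this README uses four-point boxes). The det, cls, rec order is assumed from the step order printed in the C++ log.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # simplified box: (x0, y0, x1, y1)


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle [x0, x1) x [y0, y1) out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def rotate_180(image: Image) -> Image:
    """Rotate a patch by 180 degrees (reverse the rows and each row)."""
    return [row[::-1] for row in image[::-1]]


def run_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    is_upside_down: Callable[[Image], bool],
    recognize: Callable[[Image], Tuple[str, float]],
) -> List[Tuple[Box, str, float]]:
    """det finds boxes, cls fixes orientation, rec reads each patch."""
    results = []
    for box in detect(image):
        patch = crop(image, box)
        if is_upside_down(patch):
            patch = rotate_180(patch)
        text, score = recognize(patch)
        results.append((box, text, score))
    return results
```

In a real deployment each callable would wrap one of the three ONNX models; keeping them as plain callables makes the control flow easy to test with stubs.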

## Algorithm Principles

## Environment Setup
### Docker
Pull the image:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:ort1.14.0_migraphx3.0.0-dtk22.10.1
```
Create and start the container, then install the related dependencies:
```
docker run --shm-size 16g --network=host --name=paddleocr_ort --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/paddleocr_ort:/home/paddleocr_ort -it <Your Image ID> /bin/bash

# Activate DTK
source /opt/dtk/env.sh
```

## Dataset

## Inference
### Python Inference
This example runs the PaddleOCR models on the ONNXRuntime inference framework to recognize text in images. Download the model files from https://pan.baidu.com/s/1uGHhimKLb5k5f9xaFmNBwQ (extraction code: ggvz) and save ch_PP-OCRv3_det_infer.onnx, ch_ppocr_mobile_v2.0_cls_infer.onnx, and ch_PP-OCRv3_rec_infer.onnx in the Resource/Models folder. The steps below show how to run the Python example; for a detailed walkthrough, see Tutorial_Python.md in the Doc directory.
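
Before running either example, it can help to confirm that the three model files are where the README expects them. The helper below is a small optional sketch, not part of the project: `missing_models` is a hypothetical name, and the default path assumes the script is run from the project root.

```python
from pathlib import Path

# File names taken from this README.
REQUIRED_MODELS = (
    "ch_PP-OCRv3_det_infer.onnx",
    "ch_ppocr_mobile_v2.0_cls_infer.onnx",
    "ch_PP-OCRv3_rec_infer.onnx",
)


def missing_models(model_dir="Resource/Models"):
    """Return the required model files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_MODELS if not (root / name).is_file()]
```

Calling `missing_models()` from the project root returns an empty list when all three files are in place, and otherwise lists the names that still need to be downloaded.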
#### Set the Python Environment Variable
```
export PYTHONPATH=/opt/dtk/lib:$PYTHONPATH
```
#### Run the Example
```
# Go to the root directory of the paddleocr ort project
cd <path_to_paddleocr_ort> 

# Go to the example program directory
cd Python/

# Install dependencies
pip install -r requirements.txt

# Run the example
python paddleocr.py
```

### C++ Inference

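This example runs the PaddleOCR models on the ONNXRuntime inference framework to recognize text in images. Download the model files from https://pan.baidu.com/s/1uGHhimKLb5k5f9xaFmNBwQ (extraction code: ggvz) and save ch_PP-OCRv3_det_infer.onnx, ch_ppocr_mobile_v2.0_cls_infer.onnx, and ch_PP-OCRv3_rec_infer.onnx in the Resource/Models folder. The steps below show how to run the C++ example; for a detailed walkthrough, see Tutorial_Cpp.md in the Doc directory.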
#### Build the Project
```
rbuild build -d depend
```
#### Set Environment Variables
Add the dependency libraries to the LD_LIBRARY_PATH environment variable by adding the following line to ~/.bashrc:
```
export LD_LIBRARY_PATH=<path_to_paddleocr_ort>/depend/lib64/:$LD_LIBRARY_PATH
```
Then run:
```
source ~/.bashrc
source /opt/dtk/env.sh
```
#### Run the Example
```
# Go to the root directory of the paddleocr ort project
cd <path_to_paddleocr_ort> 

# Go to the build directory
cd build/

# Run the example program
./PaddleOCR
```

## Results
### Python Version
```
[[[[245.0, 9.0], [554.0, 8.0], [554.0, 27.0], [245.0, 28.0]], '人生活的真实写照:善有善报,恶有恶报。', '0.9306996673345566'], [[[9.0, 49.0], [522.0, 50.0], [522.0, 69.0], [9.0, 68.0]], '我们中国人有一句俗语说:“种瓜得瓜,种豆得豆。”而这就是每个', '0.9294075581335253'], [[[84.0, 105.0], [555.0, 104.0], [555.0, 125.0], [85.0, 127.0]], "every man's life: good begets good, and evil leads to evil.", '0.8932319914301237'], [[[28.0, 147.0], [556.0, 146.0], [556.0, 168.0], [28.0, 169.0]], 'melons; if he sows beans, he will reap beans." And this is true of', '0.900923888185131'], [[[0.0, 185.0], [524.0, 188.0], [524.0, 212.0], [0.0, 209.0]], 'We Chinese have a saying:"If a man plants melons, he will reap', '0.9216671202863965'], [[[295.0, 248.0], [553.0, 248.0], [553.0, 264.0], [295.0, 264.0]], '它不仅适用于今生,也适用于来世。', '0.927988795673146'], [[[14.0, 289.0], [554.0, 290.0], [554.0, 307.0], [14.0, 306.0]], '一每一个行为都有一种结果。在我看来,这种想法是全宇宙的道德基础;', '0.88565122719967'], [[[9.0, 330.0], [521.0, 330.0], [521.0, 349.0], [9.0, 349.0]], '假如说过去的日子曾经教给我们一些什么的话,那就是有因必有果一', '0.9162070232052957'], [[[343.0, 388.0], [555.0, 388.0], [555.0, 405.0], [343.0, 405.0]], 'in this world and the next.', '0.8764956444501877'], [[[15.0, 426.0], [554.0, 426.0], [554.0, 448.0], [15.0, 448.0]], 'opinion, is the moral foundation of the universe; it applies equally', '0.9183026262815448'], [[[62.0, 466.0], [556.0, 468.0], [556.0, 492.0], [62.0, 490.0]], 'effect - every action has a consequence. This thought, in my', '0.9308378403304053']]
```
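
Each entry in the output above has the shape `[box, text, score]`, where `box` is four `[x, y]` corner points and `score` is printed as a string. The sketch below shows one way to post-process such a list, assuming it is available as a Python object; `tidy_results` is a hypothetical helper, and the two sample entries are copied from the output above.

```python
def tidy_results(results, min_score=0.0):
    """Convert scores to float, drop low-confidence lines, sort top to bottom."""
    rows = []
    for box, text, score in results:
        score = float(score)  # scores appear as strings in the sample output
        if score >= min_score:
            rows.append({"box": box, "text": text, "score": score})
    # Order by the topmost y coordinate, then by the leftmost x coordinate.
    rows.sort(key=lambda r: (min(p[1] for p in r["box"]),
                             min(p[0] for p in r["box"])))
    return rows


sample = [
    [[[343.0, 388.0], [555.0, 388.0], [555.0, 405.0], [343.0, 405.0]],
     "in this world and the next.", "0.8764956444501877"],
    [[[295.0, 248.0], [553.0, 248.0], [553.0, 264.0], [295.0, 264.0]],
     "它不仅适用于今生,也适用于来世。", "0.927988795673146"],
]

confident = tidy_results(sample, min_score=0.9)
```

The threshold of 0.9 is only an example value; a suitable cut-off depends on the images being processed.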

### C++ Version
```
TextBox[0](+padding)[score(0.711119),[x: 293, y: 58], [x: 604, y: 58], [x: 604, y: 79], [x: 293, y: 79]]

...

TextBox[11](+padding)[score(0.605026),[x: 92, y: 554], [x: 610, y: 557], [x: 609, y: 585], [x: 92, y: 582]]
---------- step: drawTextBoxes ----------
---------- step: angleNet getAngles ----------
angle[0][index(1), score(1.000000), time(57.276707ms)]

...

angle[11][index(1), score(0.930842), time(2.952602ms)]
---------- step: crnnNet getTextLine ----------
textLine[0](人生活的真实写照:善有善报,恶有恶报。)
textScores[0]{0.576271 ,0.99956 ,0.999475 ,0.99967 ,0.998779 ,0.999525 ,0.805865 ,0.999865 ,0.988233 ,0.999061 ,0.999581 ,0.999483 ,0.999324 ,0.995648 ,0.561861 ,0.961845 ,0.995993 ,0.998593 ,0.994963}
crnnTime[0](58.019418ms)

...

textLine[11](If the past has taught us anything, it is that every cause brings)
textScores[11]{0.996653 ,0.625094 ,0.97989 ,0.999761 ,0.816289 ,0.99883 ,0.963821 ,0.999222 ,0.999725 ,0.999588 ,0.542554 ,0.998707 ,0.911063 ,0.603935 ,0.99833 ,0.994734 ,0.998606 ,0.999571 ,0.9995 ,0.99971 ,0.983833 ,0.941867 ,0.989647 ,0.999145 ,0.998365 ,0.995752 ,0.999369 ,0.999424 ,0.976135 ,0.998815 ,0.999755 ,0.67898 ,0.999837 ,0.999205 ,0.982815 ,0.991013 ,0.999252 ,0.818822 ,0.996863 ,0.998451 ,0.999198 ,0.812635 ,0.999701 ,0.567811 ,0.999545 ,0.815998 ,0.996471 ,0.998722 ,0.999546 ,0.999121 ,0.999202 ,0.99971 ,0.980306 ,0.999399 ,0.635116 ,0.99954 ,0.998961 ,0.600432 ,0.990555 ,0.999872 ,0.998974 ,0.999687 ,0.56602 ,0.999607 ,0.999343}
crnnTime[11](38.051758ms)

```
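
The C++ program prints its results as a log rather than a structured object. If the recognized text is needed downstream, the `textLine[i](...)` lines can be pulled out with a regular expression. The following is a minimal sketch that assumes the log format shown above; `extract_text_lines` is a hypothetical helper, not part of the project.

```python
import re

# Matches lines such as: textLine[11](If the past has taught us anything, ...)
TEXT_LINE = re.compile(r"^textLine\[(\d+)\]\((.*)\)$")


def extract_text_lines(log):
    """Map each text box index to its recognized text."""
    lines = {}
    for raw in log.splitlines():
        match = TEXT_LINE.match(raw.strip())
        if match:
            lines[int(match.group(1))] = match.group(2)
    return lines
```

The pattern is anchored at both ends and uses a greedy match, so text that itself contains parentheses is kept intact up to the final closing parenthesis of the line.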

## Source Repository and Issue Feedback

https://developer.hpccube.com/codes/modelzoo/paddleocr_onnxruntime

## References

https://github.com/RapidAI/RapidOCR

https://github.com/RapidAI/RapidOcrOnnx