# PaddleOCR

## Paper

PaddleOCR uses three models, det, rec, and cls, for text detection, text recognition, and text orientation classification respectively.

The det model is based mainly on the DB algorithm. Reference paper:

https://arxiv.org/pdf/1911.08947.pdf

The rec model is based mainly on the SVTR algorithm. Reference paper:

https://arxiv.org/pdf/2205.00159.pdf

The cls model uses MobileNetV3 for general-purpose classification. Reference paper:

https://arxiv.org/pdf/1905.02244.pdf

## Model Structure

PaddleOCR recognizes text in images with three models: ch_PP-OCRv3_det, ch_ppocr_mobile_v2.0_cls, and ch_PP-OCRv3_rec.

det:

![image](./Doc/images/dbnet-arc.png)

rec:

![image](./Doc/images/SVTR-arc.png)

cls:

![image](./Doc/images/mobilenetv3-arc.png)

## Algorithm Principle

![image](./Doc/images/ocr.png)

## Environment Setup

### Docker (Method 1)

Pull the image:

```
docker pull image.sourcefind.cn:5000/dcu/admin/base/migraphx:4.3.0-ubuntu20.04-dtk24.04.1-py3.10
```

Create and start the container, then install the required dependencies:

```
# Replace <image> with the name or ID of your image
docker run --shm-size 16g --network=host --name=paddleocr_onnxruntime --privileged -v /opt/hyhal:/opt/hyhal --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/paddleocr_onnxruntime:/home/paddleocr_onnxruntime -it <image> /bin/bash

# Activate DTK
source /opt/dtk/env.sh
```

### Dockerfile (Method 2)

```
cd ./docker

docker build --no-cache -t paddleocr_onnxruntime:2.0 .
# Replace <image> with the name or ID of your image
docker run --shm-size 16g --network=host --name=paddleocr_onnxruntime --privileged -v /opt/hyhal:/opt/hyhal --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/paddleocr_onnxruntime:/home/paddleocr_onnxruntime -it <image> /bin/bash
```

## Dataset

None

## Inference

### Python Inference

This example runs the PaddleOCR models on the ONNXRuntime inference framework to recognize text in images. The model files ch_PP-OCRv3_det_infer.onnx, ch_ppocr_mobile_v2.0_cls_infer.onnx, and ch_PP-OCRv3_rec_infer.onnx are stored in the Resource/Models folder. The steps below show how to run the Python example; for a detailed explanation, see Tutorial_Python.md in the Doc directory.

#### Set the Python Environment Variable

```
export PYTHONPATH=/opt/dtk/lib:$PYTHONPATH
```

#### Run the Example

```
# Enter the paddleocr onnxruntime project root
cd <path_to_paddleocr_onnxruntime>

# Enter the example directory
cd Python/

# Install dependencies
pip install -r requirements.txt

# Run the example
python paddleocr.py
```

### C++ Inference

This example runs the PaddleOCR models on the ONNXRuntime inference framework to recognize text in images. The model files ch_PP-OCRv3_det_infer.onnx, ch_ppocr_mobile_v2.0_cls_infer.onnx, and ch_PP-OCRv3_rec_infer.onnx are stored in the Resource/Models folder. The steps below show how to run the C++ example; for a detailed explanation, see Tutorial_Cpp.md in the Doc directory.

#### Build the Project

```
source /opt/dtk/env.sh

# Install the OpenCV dependency
cd <path_to_paddleocr_onnxruntime>
rbuild build -d depend
```

#### Set Environment Variables

Add the dependency libraries to the LD_LIBRARY_PATH environment variable by appending the following line to ~/.bashrc:

```
export LD_LIBRARY_PATH=<path_to_paddleocr_onnxruntime>/depend/lib64/:$LD_LIBRARY_PATH
```

Then run:

```
source ~/.bashrc
```

#### Run the Example

```
# Enter the paddleocr onnxruntime project root
cd <path_to_paddleocr_onnxruntime>

# Enter the build directory
cd build/

# Run the example program
./PaddleOCR
```

## Results

### Python

```
[[[[245.0, 9.0], [554.0, 8.0], [554.0, 27.0], [245.0, 28.0]], '人生活的真实写照:善有善报,恶有恶报。', '0.9306996673345566'],
[[[9.0, 49.0], [522.0, 50.0], [522.0, 69.0], [9.0, 68.0]], '我们中国人有一句俗语说:“种瓜得瓜,种豆得豆。”而这就是每个', '0.9294075581335253'],
[[[84.0, 105.0], [555.0, 104.0], [555.0, 125.0], [85.0, 127.0]], "every man's life: good begets good, and evil leads to evil.", '0.8932319914301237'],
[[[28.0, 147.0], [556.0, 146.0], [556.0, 168.0], [28.0, 169.0]], 'melons; if he sows beans, he will reap beans."
And this is true of', '0.900923888185131'],
[[[0.0, 185.0], [524.0, 188.0], [524.0, 212.0], [0.0, 209.0]], 'We Chinese have a saying:"If a man plants melons, he will reap', '0.9216671202863965'],
[[[295.0, 248.0], [553.0, 248.0], [553.0, 264.0], [295.0, 264.0]], '它不仅适用于今生,也适用于来世。', '0.927988795673146'],
[[[14.0, 289.0], [554.0, 290.0], [554.0, 307.0], [14.0, 306.0]], '一每一个行为都有一种结果。在我看来,这种想法是全宇宙的道德基础;', '0.88565122719967'],
[[[9.0, 330.0], [521.0, 330.0], [521.0, 349.0], [9.0, 349.0]], '假如说过去的日子曾经教给我们一些什么的话,那就是有因必有果一', '0.9162070232052957'],
[[[343.0, 388.0], [555.0, 388.0], [555.0, 405.0], [343.0, 405.0]], 'in this world and the next.', '0.8764956444501877'],
[[[15.0, 426.0], [554.0, 426.0], [554.0, 448.0], [15.0, 448.0]], 'opinion, is the moral foundation of the universe; it applies equally', '0.9183026262815448'],
[[[62.0, 466.0], [556.0, 468.0], [556.0, 492.0], [62.0, 490.0]], 'effect - every action has a consequence. This thought, in my', '0.9308378403304053']]
```

### C++

```
TextBox[0](+padding)[score(0.711119),[x: 293, y: 58], [x: 604, y: 58], [x: 604, y: 79], [x: 293, y: 79]]
...
TextBox[11](+padding)[score(0.605026),[x: 92, y: 554], [x: 610, y: 557], [x: 609, y: 585], [x: 92, y: 582]]
---------- step: drawTextBoxes ----------
---------- step: angleNet getAngles ----------
angle[0][index(1), score(1.000000), time(57.276707ms)]
...
angle[11][index(1), score(0.930842), time(2.952602ms)]
---------- step: crnnNet getTextLine ----------
textLine[0](人生活的真实写照:善有善报,恶有恶报。)
textScores[0]{0.576271 ,0.99956 ,0.999475 ,0.99967 ,0.998779 ,0.999525 ,0.805865 ,0.999865 ,0.988233 ,0.999061 ,0.999581 ,0.999483 ,0.999324 ,0.995648 ,0.561861 ,0.961845 ,0.995993 ,0.998593 ,0.994963}
crnnTime[0](58.019418ms)
...
textLine[11](If the past has taught us anything, it is that every cause brings)
textScores[11]{0.996653 ,0.625094 ,0.97989 ,0.999761 ,0.816289 ,0.99883 ,0.963821 ,0.999222 ,0.999725 ,0.999588 ,0.542554 ,0.998707 ,0.911063 ,0.603935 ,0.99833 ,0.994734 ,0.998606 ,0.999571 ,0.9995 ,0.99971 ,0.983833 ,0.941867 ,0.989647 ,0.999145 ,0.998365 ,0.995752 ,0.999369 ,0.999424 ,0.976135 ,0.998815 ,0.999755 ,0.67898 ,0.999837 ,0.999205 ,0.982815 ,0.991013 ,0.999252 ,0.818822 ,0.996863 ,0.998451 ,0.999198 ,0.812635 ,0.999701 ,0.567811 ,0.999545 ,0.815998 ,0.996471 ,0.998722 ,0.999546 ,0.999121 ,0.999202 ,0.99971 ,0.980306 ,0.999399 ,0.635116 ,0.99954 ,0.998961 ,0.600432 ,0.990555 ,0.999872 ,0.998974 ,0.999687 ,0.56602 ,0.999607 ,0.999343}
crnnTime[11](38.051758ms)
```

### Accuracy

None

## Application Scenarios

### Algorithm Category

`ocr`

### Key Application Industries

`Manufacturing, finance, transportation, education, healthcare`

## Source Repository and Issue Feedback

https://developer.sourcefind.cn/codes/modelzoo/paddleocr_onnxruntime

## References

https://github.com/RapidAI/RapidOCR

https://github.com/RapidAI/RapidOcrOnnx
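
## Example: Reading the Python Results

The Python result listing in this README is a list of `[box, text, score]` entries, where `box` holds four `[x, y]` corner points and the score is printed as a string. The following is a minimal sketch, not code from the repository, of how such a list might be turned into typed records and filtered by confidence. The names `TextLine`, `parse_results`, and `min_score` are hypothetical, and the 0.9 threshold is an arbitrary value chosen for illustration.

```python
from typing import NamedTuple


class TextLine(NamedTuple):
    """One recognized line (hypothetical helper type, not part of paddleocr.py)."""
    box: list    # four [x, y] corner points
    text: str    # recognized text
    score: float # recognition confidence


def parse_results(raw, min_score=0.0):
    """Convert raw [box, text, score_string] entries into TextLine records.

    Entries whose confidence is below min_score are dropped.
    """
    lines = []
    for box, text, score in raw:
        value = float(score)  # the sample output prints scores as strings
        if value >= min_score:
            lines.append(TextLine(box, text, value))
    return lines


# Two entries copied from the Python result listing in this README.
sample = [
    [[[343.0, 388.0], [555.0, 388.0], [555.0, 405.0], [343.0, 405.0]],
     'in this world and the next.', '0.8764956444501877'],
    [[[15.0, 426.0], [554.0, 426.0], [554.0, 448.0], [15.0, 448.0]],
     'opinion, is the moral foundation of the universe; it applies equally',
     '0.9183026262815448'],
]

for line in parse_results(sample, min_score=0.9):
    print(f"{line.score:.3f}  {line.text}")
```

Converting the score to `float` once, at parse time, keeps later comparisons numeric rather than lexicographic; whether a confidence filter is appropriate at all depends on the application.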