README.md

# PaddleOCR
## 论文
det、rec、cls三个模型的backbone基于mobilenetv3，可参考mobilenetv3的相关论文
https://arxiv.org/pdf/1905.02244.pdf
## 模型结构

det:
Backbone:
  name: MobileNetV3
  scale: 0.5
  model_name: large
Neck:
  name: DBFPN
  out_channels: 256
Head:
  name: DBHead

rec:
Backbone:
  name: MobileNetV1Enhance
  scale: 0.5
  last_conv_stride: [1, 2]
  last_pool_type: avg
Head:
  name: MultiHead
  head_list:
    - CTCHead:
        Neck:
          name: svtr
          dims: 64
          depth: 2
          hidden_dims: 120
          use_guide: True
        Head:
          fc_decay: 0.00001
    - SARHead:
        enc_dim: 512
        max_text_length: *max_text_length

cls:
Backbone:
  name: MobileNetV3
  scale: 0.35
  model_name: small
Head:
  name: ClsHead
  class_dim: 2
## 算法原理
det->cls->rec->text
## 数据集
推荐使用icdar2015数据集[icdar2015](https://rrc.cvc.uab.es/?ch=4&com=downloads)。

检测模型训练集文件结构
```
/PaddleOCR/train_data/icdar2015/text_localization/
  └─ icdar_c4_train_imgs/         Training data of icdar dataset
  └─ ch4_test_images/             Testing data of icdar dataset
  └─ train_icdar2015_label.txt    Training annotation of icdar dataset
  └─ test_icdar2015_label.txt     Test annotation of icdar dataset
```
识别模型训练集文件结构
```
|-train_data
  |-rec
    |- rec_gt_train.txt
    |- train
        |- word_001.png
        |- word_002.jpg
        |- word_003.jpg
        | ...
    |-ic15_data
        |- rec_gt_test.txt
        |- test
            |- word_001.jpg
            |- word_002.jpg
            |- word_003.jpg
            | ...
```
## 环境配置
在[光源](https://www.sourcefind.cn/#/service-details)可拉取训练以及推理的docker镜像，在[光合开发者社区](https://cancon.hpccube.com:65024/4/main/)可下载paddle安装包。PaddleOCR推荐的镜像如下：
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.3.2-centos7.6-dtk-22.10.1-py37-latest
```
## 训练

检测模型
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
```
识别模型
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
```
## 测试
检测模型
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./output/db_mv3/best_accuracy.pdparams
```
识别模型
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./output/v3_en_mobile/best_accuracy.pdparams
```
## 测试(ort)
检测模型
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx --use_onnx=true
```
识别模型
```
python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/eval.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml -o Global.pretrained_model=./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx --use_onnx=true
```
## 推理
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/" --rec_model_dir="./ch_PP-OCRv3_rec_infer/" --use_angle_cls=false --rec_image_shape=3,48,320 --warmup=1
```
## 推理(ort)
```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det.onnx" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.onnx" --rec_model_dir="./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec.onnx" --use_onnx=true --use_angle_cls=true --rec_image_shape=3,48,320 --warmup=1
```
## result
![image](https://developer.hpccube.com/codes/modelzoo/paddleocr/-/raw/main/inference_results/08.jpg)
### 性能和准确率数据

检测模型测试
| Model | Precision | Recall |
| :------: | :------: |:------: |
| det | 0.7054 | 0.7193  |

识别模型测试
| Model | Acc | 
| :------: | :------: |
| rec | 0.6490 | 

检测模型测试(ort)
| Model | Precision | Recall |
| :------: | :------: |:------: |
| det | 0.5097 | 0.4068  |

识别模型测试(ort)
| Model | Acc | 
| :------: | :------: |
| rec | 0.6076 | 
## 应用场景
### 算法类别
ocr
### 热点应用行业
工业制造、金融、交通、教育、医疗
## 源码仓库及问题反馈
https://developer.hpccube.com/codes/modelzoo/paddleocr
## 参考
* [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)