Merge branch 'dygraph' into dygraph

26c16324 · Evezerest · GitHub · d9549ce6 · 0b37c118 · 26c16324
Unverified Commit 26c16324 authored Nov 08, 2021 by Evezerest Committed by GitHub Nov 08, 2021
20 changed files
--- a/PPOCRLabel/README.md
+++ b/PPOCRLabel/README.md
@@ -21,12 +21,9 @@ PPOCRLabel is a semi-automatic graphic annotation tool suitable for OCR field, w
  - Click to modify the recognition result.(If you can't change the result, please switch to the system default input method, or switch back to the original input method again)
 - 2020.12.18: Support re-recognition of a single label box (by [ninetailskim](https://github.com/ninetailskim) ), perfect shortcut keys.
-### TODO:
+## 1. Installation
- Lock box mode: For the same scene data, the size and position of the locked detection box can be transferred between different pictures.
-## Installation
+### 1.1 Environment Preparation
-### 1. Environment Preparation
 #### **Install PaddlePaddle 2.0**
@@ -66,7 +63,7 @@ If you getting this error `OSError: [WinError 126] The specified module could no
 Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)
-### 2. Install PPOCRLabel
+### 1.2 Install PPOCRLabel
 #### Windows
@@ -94,9 +91,9 @@ cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
 python3 PPOCRLabel.py
 ```
-## Usage
+## 2. Usage
-### Steps
+### 2.1 Steps
 1. Build and launch using the instructions above.
@@ -140,9 +137,9 @@ python3 PPOCRLabel.py
 |  rec_gt.txt   | The recognition label file, which can be directly used for PPOCR identification model training, is generated after the user clicks on the menu bar "File"-"Export recognition result". |
 |   crop_img    | The recognition data, generated at the same time with *rec_gt.txt* |
-## Explanation
+## 3. Explanation
-### Shortcut keys
+### 3.1 Shortcut keys
 | Shortcut keys            | Description                                      |
 | ------------------------ | ------------------------------------------------ |
@@ -162,31 +159,37 @@ python3 PPOCRLabel.py
 | Ctrl--                   | Zoom out                                         |
 | ↑→↓←                     | Move selected box                                |
-### Built-in Model
+### 3.2 Built-in Model
 - Default model: PPOCRLabel uses the Chinese and English ultra-lightweight OCR model in PaddleOCR by default, supports Chinese, English and number recognition, and multiple language detection.
 - Model language switching: The built-in model language can be changed by clicking "PaddleOCR"-"Choose OCR Model" in the menu bar. Currently supported languages include French, German, Korean, and Japanese.
  For specific model download links, please refer to [PaddleOCR Model List](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md#multilingual-recognition-modelupdating)
- Custom model: The model trained by users can be replaced by modifying PPOCRLabel.py in [PaddleOCR class instantiation](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/PPOCRLabel/PPOCRLabel.py#L110) referring [Custom Model Code](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/whl_en.md#use-custom-model)
+- **Custom Model**: To replace the built-in model with your own inference model, follow [Custom Model Code Usage](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_en/whl_en.md#31-use-by-code) and modify the [instantiation of the PaddleOCR class](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/PPOCRLabel/PPOCRLabel.py#L116) in PPOCRLabel.py:
+  add the `det_model_dir` parameter to `self.ocr = PaddleOCR(use_pdserving=False, use_angle_cls=True, det=True, cls=True, use_gpu=gpu, lang=lang)`
-### Export Label Result
+### 3.3 Export Label Result
 PPOCRLabel supports three ways to export Label.txt
 - Automatically export: After selecting "File - Auto Export Label Mode", the program will automatically write the annotations into Label.txt every time the user confirms an image. If this option is not turned on, it will be automatically exported after detecting that the user has manually checked 5 images.
+  > The automatic export mode is turned off by default.
 - Manual export: Click "File-Export Marking Results" to manually export the label.
 - Close application export
-### Export Partial Recognition Results
+### 3.4 Export Partial Recognition Results
-For some data that are difficult to recognize, the recognition results will not be exported by **unchecking** the corresponding tags in the recognition results checkbox.
+For some data that are difficult to recognize, the recognition results will not be exported by **unchecking** the corresponding tags in the recognition results checkbox. The unchecked recognition result is saved as `True` in the `difficult` variable in the label file `label.txt`.
-*Note: The status of the checkboxes in the recognition results still needs to be saved manually by clicking Save Button.*
+> *Note: The status of the checkboxes in the recognition results still needs to be saved manually by clicking Save Button.*
-### Error message
+### 3.5 Error message
 - If paddleocr is installed with whl, it has a higher priority than calling PaddleOCR class with paddleocr.py, which may cause an exception if whl package is not updated.

--- a/PPOCRLabel/README_ch.md
+++ b/PPOCRLabel/README_ch.md
@@ -21,16 +21,12 @@ PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具，内置P
  - 识别结果更改为单击修改。（如果无法修改，请切换为系统自带输入法，或再次切回原输入法）
 - 2020.12.18： 支持对单个标记框进行重新识别（by [ninetailskim](https://github.com/ninetailskim)），完善快捷键。
-#### 尽请期待
- 锁定框模式：针对同一场景数据，被锁定的检测框的大小与位置能在不同图片之间传递。
 如果您对以上内容感兴趣或对完善工具有不一样的想法，欢迎加入我们的SIG队伍与我们共同开发。可以在[此处](https://github.com/PaddlePaddle/PaddleOCR/issues/1728)完成问卷和前置任务，经过我们确认相关内容后即可正式加入，享受SIG福利，共同为OCR开源事业贡献（特别说明：针对PPOCRLabel的改进也属于PaddleOCR前置任务）
-## 安装
+## 1. 安装
-### 1. 环境搭建
+### 1.1 环境搭建
 #### 安装PaddlePaddle
 ```bash
@@ -67,7 +63,7 @@ pip3 install -r requirements.txt
 注意，windows环境下，建议从[这里](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely)下载shapely安装包完成安装， 直接通过pip安装的shapely库可能出现`[winRrror 126] 找不到指定模块的问题`。
-### 2. 安装PPOCRLabel
+### 1.2 安装PPOCRLabel
 #### Windows
@@ -95,11 +91,9 @@ cd ./PPOCRLabel # 将目录切换到PPOCRLabel文件夹下
 python3 PPOCRLabel.py --lang ch
 ```
+## 2. 使用
+### 2.1 操作步骤
-## 使用
-### 操作步骤
 1. 安装与运行：使用上述命令安装与运行程序。
 2. 打开文件夹：在菜单栏点击 “文件” - "打开目录" 选择待标记图片的文件夹<sup>[1]</sup>.
@@ -130,9 +124,9 @@ python3 PPOCRLabel.py --lang ch
 |  rec_gt.txt   | 识别标签。可直接用于PPOCR识别模型训练。需用户手动点击菜单栏“文件” - "导出识别结果"后产生。 |
 |   crop_img    |   识别数据。按照检测框切割后的图片。与rec_gt.txt同时产生。   |
-## 说明
+## 3. 说明
-### 快捷键
+### 3.1 快捷键
 | 快捷键           | 说明                         |
 | ---------------- | ---------------------------- |
@@ -152,29 +146,35 @@ python3 PPOCRLabel.py --lang ch
 | Ctrl--           | 放大                         |
 | ↑→↓←             | 移动标记框                   |
-### 内置模型
+### 3.2 内置模型
 - 默认模型：PPOCRLabel默认使用PaddleOCR中的中英文超轻量OCR模型，支持中英文与数字识别，多种语言检测。
 - 模型语言切换：用户可通过菜单栏中 "PaddleOCR" - "选择模型" 切换内置模型语言，目前支持的语言包括法文、德文、韩文、日文。具体模型下载链接可参考[PaddleOCR模型列表](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/models_list.md).
- - 自定义模型：用户可根据[自定义模型代码使用](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/whl.md#%E8%87%AA%E5%AE%9A%E4%B9%89%E6%A8%A1%E5%9E%8B)，通过修改PPOCRLabel.py中针对[PaddleOCR类的实例化](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/PPOCRLabel/PPOCRLabel.py#L110)替换成自己训练的模型。
+ - **自定义模型**：如果用户想将内置模型更换为自己的推理模型，可根据[自定义模型代码使用](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/whl.md#%E8%87%AA%E5%AE%9A%E4%B9%89%E6%A8%A1%E5%9E%8B)，通过修改PPOCRLabel.py中针对[PaddleOCR类的实例化](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/PPOCRLabel/PPOCRLabel.py#L116) :
+   `self.ocr = PaddleOCR(use_pdserving=False, use_angle_cls=True, det=True, cls=True, use_gpu=gpu, lang=lang) `，在 `det_model_dir` 中传入  自己的模型即可。 
-### 导出标记结果
+### 3.3 导出标记结果
 PPOCRLabel支持三种导出方式：
 - 自动导出：点击“文件 - 自动导出标记结果”后，用户每确认过一张图片，程序自动将标记结果写入Label.txt中。若未开启此选项，则检测到用户手动确认过5张图片后进行自动导出。
+  > 默认情况下自动导出功能为关闭状态
 - 手动导出：点击“文件 - 导出标记结果”手动导出标记。
 - 关闭应用程序导出
-### 导出部分识别结果
+### 3.4 导出部分识别结果
-针对部分难以识别的数据，通过在识别结果的复选框中**取消勾选**相应的标记，其识别结果不会被导出。
+针对部分难以识别的数据，通过在识别结果的复选框中**取消勾选**相应的标记，其识别结果不会被导出。被取消勾选的识别结果在标记文件 `label.txt` 中的 `difficult` 变量保存为 `True` 。
-*注意：识别结果中的复选框状态仍需用户手动点击确认后才能保留*
+> *注意：识别结果中的复选框状态仍需用户手动点击确认后才能保留*
-### 错误提示
+### 3.5 错误提示
 - 如果同时使用whl包安装了paddleocr，其优先级大于通过paddleocr.py调用PaddleOCR类，whl包未更新时会导致程序异常。
 - PPOCRLabel**不支持对中文文件名**的图片进行自动标注。
@@ -209,6 +209,7 @@ PPOCRLabel支持三种导出方式：
    recRootPath是根据PPOCRLabel标注的数据集划分后的字符识别数据集存放的路径，默认是../train_data/rec
-### 参考资料
+### 4. 参考资料
 1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
--- a/PTDN/docs/install.md
+++ b/PTDN/docs/install.md
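-## Environment Setup
-This tutorial covers setting up the runtime environment for the basic function tests under the PTDN directory.
-Recommended environment: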
- CUDA 10.1
- CUDNN 7.6
- TensorRT 6.1.0.5 / 7.1
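-Installing via a Docker image is recommended. Create the container with the following commands; the current directory is mapped to `/paddle` inside the container.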
-```
-nvidia-docker run --name paddle -it -v $PWD:/paddle paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82 /bin/bash
-cd /paddle
-# Install Paddle built with TensorRT
-pip3.7 install https://paddle-wheel.bj.bcebos.com/with-trt/2.1.3/linux-gpu-cuda10.1-cudnn7-mkl-gcc8.2-trt6-avx/paddlepaddle_gpu-2.1.3.post101-cp37-cp37m-linux_x86_64.whl
-# Install AutoLog
-git clone https://github.com/LDOUBLEV/AutoLog
-cd AutoLog
-pip3.7 install -r requirements.txt
-python3.7 setup.py bdist_wheel
-pip3.7 install ./dist/auto_log-1.0.0-py3-none-any.whl
-# Download the PaddleOCR code
-cd ../
-git clone https://github.com/PaddlePaddle/PaddleOCR
-```
-Install the PaddleOCR dependencies:
-```
-cd PaddleOCR
-pip3.7 install -r requirements.txt
-```
-## FAQ :
-Q. You are using Paddle compiled with TensorRT, but TensorRT dynamic library is not found. Ignore this if TensorRT is not needed.
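-A. This usually means the installed Paddle build includes TensorRT, but the TensorRT inference library cannot be found in the local environment. Download the TensorRT library, extract it, and add it to the LD_LIBRARY_PATH environment variable;
-For example: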
-```
-export LD_LIBRARY_PATH=/usr/local/python3.7.0/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/paddle/package/TensorRT-6.0.1.5/lib
-```
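-Alternatively, the downloaded TensorRT version may not match the TensorRT version that Paddle was compiled against; download a matching TensorRT version.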
--- a/README.md
+++ b/README.md
@@ -119,7 +119,7 @@ For a new language request, please refer to [Guideline for new language_requests
    - [Table Recognition](./ppstructure/table/README.md)
 - Academic Circles
    - [Two-stage Algorithm](./doc/doc_en/algorithm_overview_en.md)
-    - [PGNet Algorithm](./doc/doc_en/algorithm_overview_en.md)
+    - [PGNet Algorithm](./doc/doc_en/pgnet_en.md)
    - [Python Inference](./doc/doc_en/inference_en.md)
 - Data Annotation and Synthesis
    - [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md)

--- a/README_ch.md
+++ b/README_ch.md
@@ -109,15 +109,16 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库，助力
 - [PP-Structure信息提取](./ppstructure/README_ch.md)
    - [版面分析](./ppstructure/layout/README_ch.md)
    - [表格识别](./ppstructure/table/README_ch.md)
+- OCR学术圈
+    - [两阶段模型介绍与下载](./doc/doc_ch/algorithm_overview.md)
+    - [端到端PGNet算法](./doc/doc_ch/pgnet.md)
+    - [基于Python脚本预测引擎推理](./doc/doc_ch/inference.md)
+    - [使用PaddleOCR架构添加新算法](./doc/doc_ch/add_new_algorithm.md)
 - 数据标注与合成
    - [半自动标注工具PPOCRLabel](./PPOCRLabel/README_ch.md)
    - [数据合成工具Style-Text](./StyleText/README_ch.md)
    - [其它数据标注工具](./doc/doc_ch/data_annotation.md)
    - [其它数据合成工具](./doc/doc_ch/data_synthesis.md)
- OCR学术圈
-    - [两阶段模型介绍与下载](./doc/doc_ch/algorithm_overview.md)
-    - [端到端PGNet算法](./doc/doc_ch/pgnet.md)
-    - [基于Python脚本预测引擎推理](./doc/doc_ch/inference.md)
 - 数据集
    - [通用中英文OCR数据集](./doc/doc_ch/datasets.md)
    - [手写中文OCR数据集](./doc/doc_ch/handwritten_datasets.md)

--- a/benchmark/readme.md
+++ b/benchmark/readme.md
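-# PaddleOCR DB/EAST algorithm training benchmark
+# PaddleOCR DB/EAST/PSE algorithm training benchmark
 The files under PaddleOCR/benchmark are used to collect and analyze training logs.
 Training uses the icdar2015 dataset, which contains 1000 training images and 500 test images. The model configuration uses resnet18_vd as the backbone and is trained with batch_size=8 and batch_size=16.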
@@ -28,7 +28,3 @@ det_res18_db_v2.0_sp_bs8_fp32_1
 det_res18_db_v2.0_mp_bs16_fp32_1
 det_res18_db_v2.0_mp_bs8_fp32_1
 ```
--- a/benchmark/run_benchmark_det.sh
+++ b/benchmark/run_benchmark_det.sh
@@ -6,7 +6,7 @@ function _set_params(){
    run_mode=${1:-"sp"}          # sp: single GPU | mp: multi-GPU
    batch_size=${2:-"64"}
    fp_item=${3:-"fp32"}        # fp32|fp16
-    max_iter=${4:-"500"}       # optional; use when the code is modified to stop training early
+    max_iter=${4:-"10"}       # optional; use when the code is modified to stop training early
    model_name=${5:-"model_name"}
    run_log_path=${TRAIN_LOG_DIR:-$(pwd)}  # TRAIN_LOG_DIR is set later by QA
@@ -20,7 +20,7 @@ function _train(){
    echo "Train on ${num_gpu_devices} GPUs"
    echo "current CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES, gpus=$num_gpu_devices, batch_size=$batch_size"
-    train_cmd="-c configs/det/${model_name}.yml -o Train.loader.batch_size_per_card=${batch_size} Global.epoch_num=${max_iter} "   
+    train_cmd="-c configs/det/${model_name}.yml -o Train.loader.batch_size_per_card=${batch_size} Global.epoch_num=${max_iter} Global.eval_batch_step=[0,20000] Global.print_batch_step=2"   
    case ${run_mode} in
      sp) 
        train_cmd="python3.7 tools/train.py "${train_cmd}""
@@ -39,18 +39,24 @@ function _train(){
        echo -e "${model_name}, SUCCESS"
        export job_fail_flag=0
    fi
-    kill -9 `ps -ef|grep 'python3.7'|awk '{print $2}'`
    if [ $run_mode = "mp" -a -d mylog ]; then
        rm ${log_file}
        cp mylog/workerlog.0 ${log_file}
    fi
+}
-    # run log analysis
+function _analysis_log(){
-    analysis_cmd="python3.7 benchmark/analysis.py --filename ${log_file}  --mission_name ${model_name} --run_mode ${mode} --direction_id 0 --keyword 'ips:' --base_batch_size ${batch_szie} --skip_steps 1 --gpu_num ${num_gpu_devices}  --index 1  --model_mode=-1  --ips_unit=samples/sec"
+    analysis_cmd="python3.7 benchmark/analysis.py --filename ${log_file}  --mission_name ${model_name} --run_mode ${run_mode} --direction_id 0 --keyword 'ips:' --base_batch_size ${batch_size} --skip_steps 1 --gpu_num ${num_gpu_devices}  --index 1  --model_mode=-1  --ips_unit=samples/sec"
    eval $analysis_cmd
 }
+function _kill_process(){
+    kill -9 `ps -ef|grep 'python3.7'|awk '{print $2}'`
+}
 _set_params $@
 _train
+_analysis_log
+_kill_process
\ No newline at end of file
--- a/benchmark/run_det.sh
+++ b/benchmark/run_det.sh
@@ -3,11 +3,11 @@
 # 1 Install the dependencies required by the model (note it here if optimization strategies need to be enabled)
 python3.7 -m pip install -r requirements.txt
 # 2 Copy the data and pretrained models required by the model
-wget -c  -p ./tain_data/  https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar && cd train_data  && tar xf icdar2015.tar && cd ../
+wget -P ./train_data/  https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar && cd train_data  && tar xf icdar2015.tar && cd ../
-wget -c -p ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_pretrained.pdparams
+wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_pretrained.pdparams
 # 3 Run in batch (if batching is inconvenient, steps 1 and 2 must be placed in each individual model script)
-model_mode_list=(det_res18_db_v2.0 det_r50_vd_east)
+model_mode_list=(det_res18_db_v2.0 det_r50_vd_east det_r50_vd_pse)
 fp_item_list=(fp32)
 bs_list=(8 16)
 for model_mode in ${model_mode_list[@]}; do
@@ -15,11 +15,11 @@ for model_mode in ${model_mode_list[@]}; do
          for bs_item in ${bs_list[@]}; do
            echo "index is speed, 1gpus, begin, ${model_name}"
            run_mode=sp
-            CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark_det.sh ${run_mode} ${bs_item} ${fp_item} 10 ${model_mode}     #  (5min)
+            CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark_det.sh ${run_mode} ${bs_item} ${fp_item} 2 ${model_mode}     #  (5min)
            sleep 60
            echo "index is speed, 8gpus, run_mode is multi_process, begin, ${model_name}"
            run_mode=mp
-            CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark_det.sh ${run_mode} ${bs_item} ${fp_item} 10 ${model_mode} 
+            CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark_det.sh ${run_mode} ${bs_item} ${fp_item} 2 ${model_mode} 
            sleep 60
            done
      done

--- a/configs/det/ch_PP-OCRv2/ch_PP-OCR_det_distill.yml
+++ b/configs/det/ch_PP-OCRv2/ch_PP-OCR_det_distill.yml
@@ -90,7 +90,7 @@ Optimizer:
 PostProcess:
  name: DistillationDBPostProcess
-  model_name: ["Student", "Student2"]
+  model_name: ["Student"]
  key: head_out
  thresh: 0.3
  box_thresh: 0.6

--- a/deploy/cpp_infer/include/ocr_rec.h
+++ b/deploy/cpp_infer/include/ocr_rec.h
@@ -44,7 +44,8 @@ public:
                          const int &gpu_id, const int &gpu_mem,
                          const int &cpu_math_library_num_threads,
                          const bool &use_mkldnn, const string &label_path,
-                          const bool &use_tensorrt, const std::string &precision) {
+                          const bool &use_tensorrt, const std::string &precision,
+                          const int &rec_batch_num) {
    this->use_gpu_ = use_gpu;
    this->gpu_id_ = gpu_id;
    this->gpu_mem_ = gpu_mem;
@@ -52,6 +53,7 @@ public:
    this->use_mkldnn_ = use_mkldnn;
    this->use_tensorrt_ = use_tensorrt;
    this->precision_ = precision;
+    this->rec_batch_num_ = rec_batch_num;
    this->label_list_ = Utility::ReadDict(label_path);
    this->label_list_.insert(this->label_list_.begin(),
@@ -64,7 +66,7 @@ public:
  // Load Paddle inference model
  void LoadModel(const std::string &model_dir);
-  void Run(cv::Mat &img, std::vector<double> *times);
+  void Run(std::vector<cv::Mat> img_list, std::vector<double> *times);
 private:
  std::shared_ptr<Predictor> predictor_;
@@ -82,10 +84,12 @@ private:
  bool is_scale_ = true;
  bool use_tensorrt_ = false;
  std::string precision_ = "fp32";
+  int rec_batch_num_ = 6;
  // pre-process
  CrnnResizeImg resize_op_;
  Normalize normalize_op_;
-  Permute permute_op_;
+  PermuteBatch permute_op_;
  // post-process
  PostProcessor post_processor_;
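
Note on the `times` argument: `Run` now takes a list of images and, per the `rec_batch_num_` member, may process it in several batches, so each reported stage time has to be a running total rather than a single interval. The sketch below shows one way to accumulate such a total from an explicit zero. `StageTimer` is a hypothetical helper for illustration and is not part of this commit.

```cpp
#include <chrono>
#include <ratio>

// Hypothetical helper: accumulates elapsed wall time over repeated
// start()/stop() pairs, starting from an explicit zero duration.
class StageTimer {
public:
  void start() { begin_ = std::chrono::steady_clock::now(); }

  // Adds the time since the last start() to the running total.
  void stop() { total_ += std::chrono::steady_clock::now() - begin_; }

  // Running total converted to milliseconds.
  double milliseconds() const {
    return std::chrono::duration<double, std::milli>(total_).count();
  }

private:
  std::chrono::steady_clock::time_point begin_{};
  std::chrono::duration<double> total_{0.0};
};
```

One timer per stage (preprocess, inference, postprocess) would be started and stopped inside the batch loop, and `milliseconds()` read once after the loop.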

--- a/deploy/cpp_infer/include/preprocess_op.h
+++ b/deploy/cpp_infer/include/preprocess_op.h
@@ -44,6 +44,11 @@ public:
  virtual void Run(const cv::Mat *im, float *data);
 };
+class PermuteBatch {
+public:
+  virtual void Run(const std::vector<cv::Mat> imgs, float *data);
+};
 class ResizeImgType0 {
 public:
  virtual void Run(const cv::Mat &img, cv::Mat &resize_img, int max_size_len,

--- a/deploy/cpp_infer/include/utility.h
+++ b/deploy/cpp_infer/include/utility.h
@@ -50,6 +50,9 @@ public:
  static cv::Mat GetRotateCropImage(const cv::Mat &srcimage,
                          std::vector<std::vector<int>> box);
+  static std::vector<int> argsort(const std::vector<float>& array);
 };
 } // namespace PaddleOCR
\ No newline at end of file
--- a/deploy/cpp_infer/src/main.cpp
+++ b/deploy/cpp_infer/src/main.cpp
@@ -61,7 +61,7 @@ DEFINE_string(cls_model_dir, "", "Path of cls inference model.");
 DEFINE_double(cls_thresh, 0.9, "Threshold of cls_thresh.");
 // recognition related
 DEFINE_string(rec_model_dir, "", "Path of rec inference model.");
-DEFINE_int32(rec_batch_num, 1, "rec_batch_num.");
+DEFINE_int32(rec_batch_num, 6, "rec_batch_num.");
 DEFINE_string(char_list_file, "../../ppocr/utils/ppocr_keys_v1.txt", "Path of dictionary.");
@@ -146,8 +146,9 @@ int main_rec(std::vector<cv::String> cv_all_img_names) {
    CRNNRecognizer rec(FLAGS_rec_model_dir, FLAGS_use_gpu, FLAGS_gpu_id,
                       FLAGS_gpu_mem, FLAGS_cpu_threads,
                       FLAGS_enable_mkldnn, char_list_file,
-                       FLAGS_use_tensorrt, FLAGS_precision);
+                       FLAGS_use_tensorrt, FLAGS_precision, FLAGS_rec_batch_num);
+    std::vector<cv::Mat> img_list;
    for (int i = 0; i < cv_all_img_names.size(); ++i) {
      LOG(INFO) << "The predict img: " << cv_all_img_names[i];
@@ -156,14 +157,13 @@ int main_rec(std::vector<cv::String> cv_all_img_names) {
        std::cerr << "[ERROR] image read failed! image path: " << cv_all_img_names[i] << endl;
        exit(1);
      }
+      img_list.push_back(srcimg);
+    }
    std::vector<double> rec_times;
-      rec.Run(srcimg, &rec_times);
+    rec.Run(img_list, &rec_times);
    time_info[0] += rec_times[0];
    time_info[1] += rec_times[1];
    time_info[2] += rec_times[2];
-    }
    if (FLAGS_benchmark) {
        AutoLogger autolog("ocr_rec", 
@@ -171,7 +171,7 @@ int main_rec(std::vector<cv::String> cv_all_img_names) {
                           FLAGS_use_tensorrt,
                           FLAGS_enable_mkldnn,
                           FLAGS_cpu_threads,
-                           1, 
+                           FLAGS_rec_batch_num, 
                           "dynamic", 
                           FLAGS_precision, 
                           time_info, 
@@ -209,7 +209,7 @@ int main_system(std::vector<cv::String> cv_all_img_names) {
    CRNNRecognizer rec(FLAGS_rec_model_dir, FLAGS_use_gpu, FLAGS_gpu_id,
                       FLAGS_gpu_mem, FLAGS_cpu_threads,
                       FLAGS_enable_mkldnn, char_list_file,
-                       FLAGS_use_tensorrt, FLAGS_precision);
+                       FLAGS_use_tensorrt, FLAGS_precision, FLAGS_rec_batch_num);
    for (int i = 0; i < cv_all_img_names.size(); ++i) {
      LOG(INFO) << "The predict img: " << cv_all_img_names[i];
@@ -228,19 +228,22 @@ int main_system(std::vector<cv::String> cv_all_img_names) {
      time_info_det[1] += det_times[1];
      time_info_det[2] += det_times[2];
-      cv::Mat crop_img;
+      std::vector<cv::Mat> img_list;
      for (int j = 0; j < boxes.size(); j++) {
+          cv::Mat crop_img;
          crop_img = Utility::GetRotateCropImage(srcimg, boxes[j]);
          if (cls != nullptr) {
              crop_img = cls->Run(crop_img);
          }
-        rec.Run(crop_img, &rec_times);
+          img_list.push_back(crop_img);
+      }
+      rec.Run(img_list, &rec_times);
      time_info_rec[0] += rec_times[0];
      time_info_rec[1] += rec_times[1];
      time_info_rec[2] += rec_times[2];
    }
-    }
    if (FLAGS_benchmark) {
        AutoLogger autolog_det("ocr_det", 
                            FLAGS_use_gpu,
@@ -257,7 +260,7 @@ int main_system(std::vector<cv::String> cv_all_img_names) {
                            FLAGS_use_tensorrt,
                            FLAGS_enable_mkldnn,
                            FLAGS_cpu_threads,
-                            1, 
+                            FLAGS_rec_batch_num, 
                            "dynamic", 
                            FLAGS_precision, 
                            time_info_rec, 
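
Note on `rec_batch_num`: with the default raised to 6 and all crops collected into `img_list` before a single `rec.Run` call, the recognizer is expected to walk the list in chunks of at most `rec_batch_num` images. A minimal sketch of that chunking, assuming half-open index ranges; `batch_ranges` is a hypothetical name used only here.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: splits `count` items into consecutive [begin, end)
// ranges that each hold at most `batch_size` items.
std::vector<std::pair<int, int>> batch_ranges(int count, int batch_size) {
  std::vector<std::pair<int, int>> ranges;
  if (batch_size <= 0) {
    return ranges; // nothing sensible to do with a non-positive batch size
  }
  for (int beg = 0; beg < count; beg += batch_size) {
    // The last range is clipped so it never runs past the end of the list.
    ranges.emplace_back(beg, std::min(count, beg + batch_size));
  }
  return ranges;
}
```

For 14 images and a batch size of 6 this yields three ranges, the last one holding only two images, which is why the final batch can be smaller than `rec_batch_num`.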

--- a/deploy/cpp_infer/src/ocr_rec.cpp
+++ b/deploy/cpp_infer/src/ocr_rec.cpp
@@ -16,27 +16,48 @@
 namespace PaddleOCR {
-void CRNNRecognizer::Run(cv::Mat &img, std::vector<double> *times) {
+void CRNNRecognizer::Run(std::vector<cv::Mat> img_list, std::vector<double> *times) {
-  cv::Mat srcimg;
+    std::chrono::duration<float> preprocess_diff = std::chrono::steady_clock::now() - std::chrono::steady_clock::now();
-  img.copyTo(srcimg);
+    std::chrono::duration<float> inference_diff = std::chrono::steady_clock::now() - std::chrono::steady_clock::now();
-  cv::Mat resize_img;
+    std::chrono::duration<float> postprocess_diff = std::chrono::steady_clock::now() - std::chrono::steady_clock::now();
+    int img_num = img_list.size();
+    std::vector<float> width_list;
+    for (int i = 0; i < img_num; i++) {
+        width_list.push_back(float(img_list[i].cols) / img_list[i].rows);
+    }
+    std::vector<int> indices = Utility::argsort(width_list);
-  float wh_ratio = float(srcimg.cols) / float(srcimg.rows);
+    for (int beg_img_no = 0; beg_img_no < img_num; beg_img_no += this->rec_batch_num_) {
        auto preprocess_start = std::chrono::steady_clock::now();
-  this->resize_op_.Run(srcimg, resize_img, wh_ratio, this->use_tensorrt_);
+        int end_img_no = min(img_num, beg_img_no + this->rec_batch_num_);
+        float max_wh_ratio = 0;
-  this->normalize_op_.Run(&resize_img, this->mean_, this->scale_,
+        for (int ino = beg_img_no; ino < end_img_no; ino ++) {
-                          this->is_scale_);
+            int h = img_list[indices[ino]].rows;
+            int w = img_list[indices[ino]].cols;
-  std::vector<float> input(1 * 3 * resize_img.rows * resize_img.cols, 0.0f);
+            float wh_ratio = w * 1.0 / h;
+            max_wh_ratio = max(max_wh_ratio, wh_ratio);
+        }
+        std::vector<cv::Mat> norm_img_batch;
+        for (int ino = beg_img_no; ino < end_img_no; ino ++) {
+            cv::Mat srcimg;
+            img_list[indices[ino]].copyTo(srcimg);
+            cv::Mat resize_img;
+            this->resize_op_.Run(srcimg, resize_img, max_wh_ratio, this->use_tensorrt_);
+            this->normalize_op_.Run(&resize_img, this->mean_, this->scale_, this->is_scale_);
+            norm_img_batch.push_back(resize_img);
+        }
-  this->permute_op_.Run(&resize_img, input.data());
+        int batch_width = int(ceilf(32 * max_wh_ratio)) - 1;
+        std::vector<float> input(this->rec_batch_num_ * 3 * 32 * batch_width, 0.0f);
+        this->permute_op_.Run(norm_img_batch, input.data());
        auto preprocess_end = std::chrono::steady_clock::now();
+        preprocess_diff += preprocess_end - preprocess_start;
        // Inference.
        auto input_names = this->predictor_->GetInputNames();
        auto input_t = this->predictor_->GetInputHandle(input_names[0]);
-  input_t->Reshape({1, 3, resize_img.rows, resize_img.cols});
+        input_t->Reshape({this->rec_batch_num_, 3, 32, batch_width});
        auto inference_start = std::chrono::steady_clock::now();
        input_t->CopyFromCpu(input.data());
        this->predictor_->Run();
@@ -52,9 +73,11 @@ void CRNNRecognizer::Run(cv::Mat &img, std::vector<double> *times) {
        output_t->CopyToCpu(predict_batch.data());
        auto inference_end = std::chrono::steady_clock::now();
+        inference_diff += inference_end - inference_start;
        // ctc decode
        auto postprocess_start = std::chrono::steady_clock::now();
+        for (int m = 0; m < predict_shape[0]; m++) {
            std::vector<std::string> str_res;
            int argmax_idx;
            int last_index = 0;
@@ -64,11 +87,11 @@ void CRNNRecognizer::Run(cv::Mat &img, std::vector<double> *times) {
            for (int n = 0; n < predict_shape[1]; n++) {
                argmax_idx =
-        int(Utility::argmax(&predict_batch[n * predict_shape[2]],
+                    int(Utility::argmax(&predict_batch[(m * predict_shape[1] + n) * predict_shape[2]],
-                            &predict_batch[(n + 1) * predict_shape[2]]));
+                                        &predict_batch[(m * predict_shape[1] + n + 1) * predict_shape[2]]));
                max_value =
-        float(*std::max_element(&predict_batch[n * predict_shape[2]],
+                    float(*std::max_element(&predict_batch[(m * predict_shape[1] + n) * predict_shape[2]],
-                                &predict_batch[(n + 1) * predict_shape[2]]));
+                                            &predict_batch[(m * predict_shape[1] + n + 1) * predict_shape[2]]));
                if (argmax_idx > 0 && (!(n > 0 && argmax_idx == last_index))) {
                    score += max_value;
@@ -77,21 +100,23 @@ void CRNNRecognizer::Run(cv::Mat &img, std::vector<double> *times) {
                }
                last_index = argmax_idx;
            }
-  auto postprocess_end = std::chrono::steady_clock::now();
            score /= count;
+            if (isnan(score))
+                continue;
            for (int i = 0; i < str_res.size(); i++) {
                std::cout << str_res[i];
            }
            std::cout << "\tscore: " << score << std::endl;
+        }
-  std::chrono::duration<float> preprocess_diff = preprocess_end - preprocess_start;
+        auto postprocess_end = std::chrono::steady_clock::now();
+        postprocess_diff += postprocess_end - postprocess_start;
+    }
    times->push_back(double(preprocess_diff.count() * 1000));
-  std::chrono::duration<float> inference_diff = inference_end - inference_start;
    times->push_back(double(inference_diff.count() * 1000));
-  std::chrono::duration<float> postprocess_diff = postprocess_end - postprocess_start;
    times->push_back(double(postprocess_diff.count() * 1000));
 }
 void CRNNRecognizer::LoadModel(const std::string &model_dir) {
  //   AnalysisConfig config;
  paddle_infer::Config config;
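
Note on the decode loop: the index change from `n * predict_shape[2]` to `(m * predict_shape[1] + n) * predict_shape[2]` treats the output as a flat `[batch, steps, classes]` buffer. The sketch below isolates that indexing in a greedy CTC decode, assuming class 0 is the blank symbol as the `argmax_idx > 0` check in the diff suggests. The function name and the `labels` parameter are illustrative and do not come from the commit.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Greedy CTC decode over logits flattened as [batch, steps, classes].
// Class 0 is treated as blank; labels[k] is the text emitted for class k.
std::vector<std::string> ctc_greedy_decode(const std::vector<float> &logits,
                                           int batch, int steps, int classes,
                                           const std::vector<std::string> &labels) {
  std::vector<std::string> results;
  for (int m = 0; m < batch; ++m) {
    std::string text;
    int last_index = 0;
    for (int n = 0; n < steps; ++n) {
      // Row (m, n) starts at (m * steps + n) * classes in the flat buffer.
      const float *row = logits.data() + (m * steps + n) * classes;
      int argmax_idx =
          static_cast<int>(std::max_element(row, row + classes) - row);
      // Emit only non-blank classes that differ from the previous step.
      if (argmax_idx > 0 && !(n > 0 && argmax_idx == last_index)) {
        text += labels[argmax_idx];
      }
      last_index = argmax_idx;
    }
    results.push_back(text);
  }
  return results;
}
```

Omitting the `m * steps` term would make every sample in the batch decode the rows of the first sample.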

--- a/deploy/cpp_infer/src/preprocess_op.cpp
+++ b/deploy/cpp_infer/src/preprocess_op.cpp
@@ -40,6 +40,17 @@ void Permute::Run(const cv::Mat *im, float *data) {
  }
 }
+void PermuteBatch::Run(const std::vector<cv::Mat> imgs, float *data) {
+    for (int j = 0; j < imgs.size(); j ++){
+        int rh = imgs[j].rows;
+        int rw = imgs[j].cols;
+        int rc = imgs[j].channels();
+        for (int i = 0; i < rc; ++i) {
+            cv::extractChannel(imgs[j], cv::Mat(rh, rw, CV_32FC1, data + (j * rc + i) * rh * rw), i);
+        }
+    }
+}
 void Normalize::Run(cv::Mat *im, const std::vector<float> &mean,
                    const std::vector<float> &scale, const bool is_scale) {
  double e = 1.0;
@@ -95,6 +106,7 @@ void CrnnResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img, float wh_ratio,
  float ratio = float(img.cols) / float(img.rows);
  int resize_w, resize_h;
  if (ceilf(imgH * ratio) > imgW)
    resize_w = imgW;
  else
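
Note on `PermuteBatch::Run`: the destination offset `(j * rc + i) * rh * rw` places channel `i` of image `j` as one contiguous plane, which gives an NCHW layout for the whole batch. The following sketch reproduces the same layout without OpenCV so the offsets can be checked by hand. `ImageHWC` and `pack_nchw` are hypothetical stand-ins, and all images are assumed to share one size.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a float image stored interleaved (HWC).
struct ImageHWC {
  int height;
  int width;
  int channels;
  std::vector<float> pixels; // size = height * width * channels
};

// Copies a batch of equally sized HWC images into one NCHW buffer.
std::vector<float> pack_nchw(const std::vector<ImageHWC> &batch) {
  if (batch.empty()) {
    return {};
  }
  const int h = batch[0].height;
  const int w = batch[0].width;
  const int ch = batch[0].channels;
  std::vector<float> out(batch.size() * ch * h * w, 0.0f);
  for (std::size_t j = 0; j < batch.size(); ++j) {
    for (int c = 0; c < ch; ++c) {
      // Plane for image j, channel c starts at (j * ch + c) * h * w.
      float *plane = out.data() + (j * ch + c) * h * w;
      for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
          plane[y * w + x] = batch[j].pixels[(y * w + x) * ch + c];
        }
      }
    }
  }
  return out;
}
```

The shared-size assumption matters: the offset formula only lines up when every image in the batch has been resized to the same height and width before packing.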

--- a/deploy/cpp_infer/src/utility.cpp
+++ b/deploy/cpp_infer/src/utility.cpp
@@ -147,4 +147,17 @@ cv::Mat Utility::GetRotateCropImage(const cv::Mat &srcimage,
  }
 }
+std::vector<int> Utility::argsort(const std::vector<float>& array)
+{
+    const int array_len(array.size());
+    std::vector<int> array_index(array_len, 0);
+    for (int i = 0; i < array_len; ++i)
+        array_index[i] = i;
+    std::sort(array_index.begin(), array_index.end(),
+        [&array](int pos1, int pos2) {return (array[pos1] < array[pos2]); });
+    return array_index;
+}
 } // namespace PaddleOCR
\ No newline at end of file
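
Note on `Utility::argsort`: it returns the permutation of indices that orders the input ascending, which `CRNNRecognizer::Run` uses to group crops with similar width/height ratios into the same batch. A standalone sketch of the same idea follows. It uses `std::iota` and `std::stable_sort` as a variation on the committed code, so ties keep their original order, and `argsort_sketch` is a name chosen for this note.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Returns the indices that would sort `values` in ascending order.
// std::stable_sort keeps equal elements in their original relative order;
// the committed version uses std::sort, which does not guarantee that.
std::vector<int> argsort_sketch(const std::vector<float> &values) {
  std::vector<int> order(values.size());
  std::iota(order.begin(), order.end(), 0); // 0, 1, 2, ...
  std::stable_sort(order.begin(), order.end(),
                   [&values](int a, int b) { return values[a] < values[b]; });
  return order;
}
```

Because only indices are reordered, the caller can still map each result back to its position in the original image list.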
--- a/deploy/lite/ocr_db_crnn.cc
+++ b/deploy/lite/ocr_db_crnn.cc
@@ -12,12 +12,14 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.
-#include "paddle_api.h" // NOLINT
 #include <chrono>
+#include "paddle_api.h" // NOLINT
+#include "paddle_place.h"
 #include "cls_process.h"
 #include "crnn_process.h"
 #include "db_post_process.h"
+#include "AutoLog/auto_log/lite_autolog.h"
 using namespace paddle::lite_api; // NOLINT
 using namespace std;
@@ -27,7 +29,7 @@ void NeonMeanScale(const float *din, float *dout, int size,
                   const std::vector<float> mean,
                   const std::vector<float> scale) {
  if (mean.size() != 3 || scale.size() != 3) {
-    std::cerr << "[ERROR] mean or scale size must equal to 3\n";
+    std::cerr << "[ERROR] mean or scale size must equal to 3" << std::endl;
    exit(1);
  }
  float32x4_t vmean0 = vdupq_n_f32(mean[0]);
@@ -159,7 +161,8 @@ void RunRecModel(std::vector<std::vector<std::vector<int>>> boxes, cv::Mat img,
                 std::vector<float> &rec_text_score,
                 std::vector<std::string> charactor_dict,
                 std::shared_ptr<PaddlePredictor> predictor_cls,
-                 int use_direction_classify) {
+                 int use_direction_classify,
+                 std::vector<double> *times) {
  std::vector<float> mean = {0.5f, 0.5f, 0.5f};
  std::vector<float> scale = {1 / 0.5f, 1 / 0.5f, 1 / 0.5f};
@@ -226,7 +229,7 @@ void RunRecModel(std::vector<std::vector<std::vector<int>>> boxes, cv::Mat img,
 std::vector<std::vector<std::vector<int>>>
 RunDetModel(std::shared_ptr<PaddlePredictor> predictor, cv::Mat img,
-            std::map<std::string, double> Config) {
+            std::map<std::string, double> Config, std::vector<double> *times) {
  // Read img
  int max_side_len = int(Config["max_side_len"]);
  int det_db_use_dilate = int(Config["det_db_use_dilate"]);
@@ -234,6 +237,7 @@ RunDetModel(std::shared_ptr<PaddlePredictor> predictor, cv::Mat img,
  cv::Mat srcimg;
  img.copyTo(srcimg);
+  auto preprocess_start = std::chrono::steady_clock::now();
  std::vector<float> ratio_hw;
  img = DetResizeImg(img, max_side_len, ratio_hw);
  cv::Mat img_fp;
@@ -248,8 +252,10 @@ RunDetModel(std::shared_ptr<PaddlePredictor> predictor, cv::Mat img,
  std::vector<float> scale = {1 / 0.229f, 1 / 0.224f, 1 / 0.225f};
  const float *dimg = reinterpret_cast<const float *>(img_fp.data);
  NeonMeanScale(dimg, data0, img_fp.rows * img_fp.cols, mean, scale);
+  auto preprocess_end = std::chrono::steady_clock::now();
  // Run predictor
+  auto inference_start = std::chrono::steady_clock::now();
  predictor->Run();
  // Get output and post process
@@ -257,8 +263,10 @@ RunDetModel(std::shared_ptr<PaddlePredictor> predictor, cv::Mat img,
      std::move(predictor->GetOutput(0)));
  auto *outptr = output_tensor->data<float>();
  auto shape_out = output_tensor->shape();
+  auto inference_end = std::chrono::steady_clock::now();
  // Save output
+  auto postprocess_start = std::chrono::steady_clock::now();
  float pred[shape_out[2] * shape_out[3]];
  unsigned char cbuf[shape_out[2] * shape_out[3]];
@@ -287,14 +295,35 @@ RunDetModel(std::shared_ptr<PaddlePredictor> predictor, cv::Mat img,
  std::vector<std::vector<std::vector<int>>> filter_boxes =
      FilterTagDetRes(boxes, ratio_hw[0], ratio_hw[1], srcimg);
+  auto postprocess_end = std::chrono::steady_clock::now();
+  std::chrono::duration<float> preprocess_diff = preprocess_end - preprocess_start;
+  times->push_back(double(preprocess_diff.count() * 1000));
+  std::chrono::duration<float> inference_diff = inference_end - inference_start;
+  times->push_back(double(inference_diff.count() * 1000));
+  std::chrono::duration<float> postprocess_diff = postprocess_end - postprocess_start;
+  times->push_back(double(postprocess_diff.count() * 1000));
  return filter_boxes;
 }
-std::shared_ptr<PaddlePredictor> loadModel(std::string model_file) {
+std::shared_ptr<PaddlePredictor> loadModel(std::string model_file, std::string power_mode, int num_threads) {
  MobileConfig config;
  config.set_model_from_file(model_file);
+  if (power_mode == "LITE_POWER_HIGH"){
+      config.set_power_mode(LITE_POWER_HIGH);
+  } else {
+      if (power_mode == "LITE_POWER_LOW") {
+          config.set_power_mode(LITE_POWER_LOW);
+      } else {
+          std::cerr << "Only support LITE_POWER_HIGH or LITE_POWER_LOW." << std::endl;
+          exit(1);
+      }
+  }
+  config.set_threads(num_threads);
  std::shared_ptr<PaddlePredictor> predictor =
      CreatePaddlePredictor<MobileConfig>(config);
  return predictor;
@@ -354,60 +383,255 @@ std::map<std::string, double> LoadConfigTxt(std::string config_path) {
  return dict;
 }
-int main(int argc, char **argv) {
+void check_params(int argc, char **argv) {
-  if (argc < 5) {
+  if (argc<=1 || (strcmp(argv[1], "det")!=0 && strcmp(argv[1], "rec")!=0 && strcmp(argv[1], "system")!=0)) {
-    std::cerr << "[ERROR] usage: " << argv[0]
+    std::cerr << "Please choose one mode of [det, rec, system] !" << std::endl;
-              << " det_model_file cls_model_file rec_model_file image_path "
-                 "charactor_dict\n";
    exit(1);
  }
-  std::string det_model_file = argv[1];
+  if (strcmp(argv[1], "det") == 0) {
-  std::string rec_model_file = argv[2];
+      if (argc < 10){
-  std::string cls_model_file = argv[3];
+        std::cerr << "[ERROR] usage: " << argv[0]
-  std::string img_path = argv[4];
+                  << " det det_model precision num_threads batchsize power_mode img_dir det_config lite_benchmark_value" << std::endl;
-  std::string dict_path = argv[5];
+        exit(1);
+      }
+  }
-  //// load config from txt file
+  if (strcmp(argv[1], "rec") == 0) {
-  auto Config = LoadConfigTxt("./config.txt");
+      if (argc < 10){
-  int use_direction_classify = int(Config["use_direction_classify"]);
+        std::cerr << "[ERROR] usage: " << argv[0]
+                  << " rec rec_model precision num_threads batchsize power_mode img_dir key_txt lite_benchmark_value" << std::endl;
+        exit(1);
+      }
+  }
+  if (strcmp(argv[1], "system") == 0) {
+      if (argc < 12){
+        std::cerr << "[ERROR] usage: " << argv[0]
+                  << " system det_model rec_model clas_model precision num_threads batchsize power_mode img_dir det_config key_txt lite_benchmark_value" << std::endl;
+        exit(1);
+      }
+  }
+}
+void system(char **argv){
+  std::string det_model_file = argv[2];
+  std::string rec_model_file = argv[3];
+  std::string cls_model_file = argv[4];
+  std::string precision = argv[5];
+  std::string num_threads = argv[6];
+  std::string batchsize = argv[7];
+  std::string power_mode = argv[8];
+  std::string img_dir = argv[9];
+  std::string det_config_path = argv[10];
+  std::string dict_path = argv[11];
+  if (strcmp(argv[5], "FP32") != 0 && strcmp(argv[5], "INT8") != 0) {
+      std::cerr << "Only support FP32 or INT8." << std::endl;
+      exit(1);
+  }
-  auto start = std::chrono::system_clock::now();
+  std::vector<cv::String> cv_all_img_names;
+  cv::glob(img_dir, cv_all_img_names);
-  auto det_predictor = loadModel(det_model_file);
+  //// load config from txt file
-  auto rec_predictor = loadModel(rec_model_file);
+  auto Config = LoadConfigTxt(det_config_path);
-  auto cls_predictor = loadModel(cls_model_file);
+  int use_direction_classify = int(Config["use_direction_classify"]);
  auto charactor_dict = ReadDict(dict_path);
  charactor_dict.insert(charactor_dict.begin(), "#"); // blank char for ctc
  charactor_dict.push_back(" ");
-  cv::Mat srcimg = cv::imread(img_path, cv::IMREAD_COLOR);
+  auto det_predictor = loadModel(det_model_file, power_mode, std::stoi(num_threads));
-  auto boxes = RunDetModel(det_predictor, srcimg, Config);
+  auto rec_predictor = loadModel(rec_model_file, power_mode, std::stoi(num_threads));
+  auto cls_predictor = loadModel(cls_model_file, power_mode, std::stoi(num_threads));
+  for (int i = 0; i < cv_all_img_names.size(); ++i) {
+    std::cout << "The predict img: " << cv_all_img_names[i] << std::endl;
+    cv::Mat srcimg = cv::imread(cv_all_img_names[i], cv::IMREAD_COLOR);
+    if (!srcimg.data) {
+      std::cerr << "[ERROR] image read failed! image path: " << cv_all_img_names[i] << std::endl;
+      exit(1);
+    }
+    std::vector<double> det_times;
+    auto boxes = RunDetModel(det_predictor, srcimg, Config, &det_times);
    std::vector<std::string> rec_text;
    std::vector<float> rec_text_score;
+    std::vector<double> rec_times;
    RunRecModel(boxes, srcimg, rec_predictor, rec_text, rec_text_score,
-              charactor_dict, cls_predictor, use_direction_classify);
+                charactor_dict, cls_predictor, use_direction_classify, &rec_times);
-  auto end = std::chrono::system_clock::now();
+    //// visualization
-  auto duration =
+    auto img_vis = Visualization(srcimg, boxes);
-      std::chrono::duration_cast<std::chrono::microseconds>(end - start);
+    //// print recognized text
+    for (int i = 0; i < rec_text.size(); i++) {
+      std::cout << i << "\t" << rec_text[i] << "\t" << rec_text_score[i]
+                << std::endl;
+    }
+  }
+}
+void det(int argc, char **argv) {
+  std::string det_model_file = argv[2];
+  std::string precision = argv[3];
+  std::string num_threads = argv[4];
+  std::string batchsize = argv[5];
+  std::string power_mode = argv[6];
+  std::string img_dir = argv[7];
+  std::string det_config_path = argv[8];
+  if (strcmp(argv[3], "FP32") != 0 && strcmp(argv[3], "INT8") != 0) {
+      std::cerr << "Only support FP32 or INT8." << std::endl;
+      exit(1);
+  }
+  std::vector<cv::String> cv_all_img_names;
+  cv::glob(img_dir, cv_all_img_names);
+  //// load config from txt file
+  auto Config = LoadConfigTxt(det_config_path);
+  auto det_predictor = loadModel(det_model_file, power_mode, std::stoi(num_threads));
+  std::vector<double> time_info = {0, 0, 0};
+  for (int i = 0; i < cv_all_img_names.size(); ++i) {
+    std::cout << "The predict img: " << cv_all_img_names[i] << std::endl;
+    cv::Mat srcimg = cv::imread(cv_all_img_names[i], cv::IMREAD_COLOR);
+    if (!srcimg.data) {
+      std::cerr << "[ERROR] image read failed! image path: " << cv_all_img_names[i] << std::endl;
+      exit(1);
+    }
+    std::vector<double> times;
+    auto boxes = RunDetModel(det_predictor, srcimg, Config, &times);
    //// visualization
    auto img_vis = Visualization(srcimg, boxes);
+    std::cout << boxes.size() << " bboxes have been detected:" << std::endl;
+    // for (int i=0; i<boxes.size(); i++){
+    //   std::cout << "The " << i << " box:" << std::endl;
+    //   for (int j=0; j<4; j++){
+    //     for (int k=0; k<2; k++){
+    //       std::cout << boxes[i][j][k] << "\t";
+    //     }
+    //   }
+    //   std::cout << std::endl;
+    // }
+    time_info[0] += times[0];
+    time_info[1] += times[1];
+    time_info[2] += times[2];
+  }
+  if (strcmp(argv[9], "True") == 0) {
+    AutoLogger autolog(det_model_file, 
+                       0,
+                       0,
+                       0,
+                       std::stoi(num_threads),
+                       std::stoi(batchsize), 
+                       "dynamic", 
+                       precision, 
+                       power_mode,
+                       time_info, 
+                       cv_all_img_names.size());
+    autolog.report();
+  }
+}
+void rec(int argc, char **argv) {
+  std::string rec_model_file = argv[2];
+  std::string precision = argv[3];
+  std::string num_threads = argv[4];
+  std::string batchsize = argv[5];
+  std::string power_mode = argv[6];
+  std::string img_dir = argv[7];
+  std::string dict_path = argv[8];
+  if (strcmp(argv[3], "FP32") != 0 && strcmp(argv[3], "INT8") != 0) {
+      std::cerr << "Only support FP32 or INT8." << std::endl;
+      exit(1);
+  }
+  std::vector<cv::String> cv_all_img_names;
+  cv::glob(img_dir, cv_all_img_names);
+  auto charactor_dict = ReadDict(dict_path);
+  charactor_dict.insert(charactor_dict.begin(), "#"); // blank char for ctc
+  charactor_dict.push_back(" ");
+  auto rec_predictor = loadModel(rec_model_file, power_mode, std::stoi(num_threads));
+  std::shared_ptr<PaddlePredictor> cls_predictor;
+  std::vector<double> time_info = {0, 0, 0};
+  for (int i = 0; i < cv_all_img_names.size(); ++i) {
+    std::cout << "The predict img: " << cv_all_img_names[i] << std::endl;
+    cv::Mat srcimg = cv::imread(cv_all_img_names[i], cv::IMREAD_COLOR);
+    if (!srcimg.data) {
+      std::cerr << "[ERROR] image read failed! image path: " << cv_all_img_names[i] << std::endl;
+      exit(1);
+    }
+    int width = srcimg.cols;
+    int height = srcimg.rows;
+    std::vector<int> upper_left = {0, 0};
+    std::vector<int> upper_right = {width, 0};
+    std::vector<int> lower_right = {width, height};
+    std::vector<int> lower_left  = {0, height};
+    std::vector<std::vector<int>> box = {upper_left, upper_right, lower_right, lower_left};
+    std::vector<std::vector<std::vector<int>>> boxes = {box};
+    std::vector<std::string> rec_text;
+    std::vector<float> rec_text_score;
+    std::vector<double> times;
+    RunRecModel(boxes, srcimg, rec_predictor, rec_text, rec_text_score,
+                charactor_dict, cls_predictor, 0, &times);
    //// print recognized text
    for (int i = 0; i < rec_text.size(); i++) {
      std::cout << i << "\t" << rec_text[i] << "\t" << rec_text_score[i]
                << std::endl;
    }
+  }
+  // TODO: support autolog
+  if (strcmp(argv[9], "True") == 0) {
+    AutoLogger autolog(rec_model_file, 
+                       0,
+                       0,
+                       0,
+                       std::stoi(num_threads),
+                       std::stoi(batchsize), 
+                       "dynamic", 
+                       precision, 
+                       power_mode,
+                       time_info, 
+                       cv_all_img_names.size());
+    autolog.report();
+  }
+}
+int main(int argc, char **argv) {
+  check_params(argc, argv);
+  std::cout << "mode: " << argv[1] << endl;
-  std::cout << "花费了"
+  if (strcmp(argv[1], "system") == 0) {
-            << double(duration.count()) *
+    system(argv);
-                   std::chrono::microseconds::period::num /
+  }
-                   std::chrono::microseconds::period::den
-            << "秒" << std::endl;
+  if (strcmp(argv[1], "det") == 0) {
+    det(argc, argv);
+  }
+  if (strcmp(argv[1], "rec") == 0) {
+    rec(argc, argv);
+  }
  return 0;
 }
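
The per-stage timing that this change adds to `RunDetModel` (preprocess, inference, postprocess, each pushed into `times` in milliseconds and then summed into `time_info`) can be sketched in isolation. This is a minimal illustration, not code from the commit: `TimeStage` and `Accumulate` are hypothetical helper names, and the sketch assumes each stage can be wrapped in a callable.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of PaddleOCR): runs `stage` and appends its
// wall-clock duration in milliseconds to `times`.
template <typename Stage>
void TimeStage(std::vector<double> *times, Stage &&stage) {
  assert(times != nullptr);
  const auto start = std::chrono::steady_clock::now();
  stage();
  const auto end = std::chrono::steady_clock::now();
  // A duration with a std::milli period reports milliseconds directly,
  // so no manual "* 1000" conversion is needed.
  const std::chrono::duration<double, std::milli> elapsed = end - start;
  times->push_back(elapsed.count());
}

// Hypothetical helper: adds one image's {preprocess, inference, postprocess}
// timings to the running totals, element by element.
void Accumulate(std::vector<double> *totals, const std::vector<double> &times) {
  assert(totals != nullptr && totals->size() == times.size());
  for (std::size_t i = 0; i < times.size(); ++i) {
    (*totals)[i] += times[i];
  }
}
```

A caller would invoke `TimeStage(&times, [&] { predictor->Run(); });` once per stage and then `Accumulate(&time_info, times);` per image. Checking the sizes before summing avoids indexing `times[0..2]` when a stage was skipped.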
--- a/deploy/paddle2onnx/readme.md
+++ b/deploy/paddle2onnx/readme.md
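+# Paddle2ONNX Model Conversion and Prediction
+This chapter describes how to convert a PaddleOCR model to an ONNX model and run prediction with the ONNX engine.
+## 1. Environment Preparation
+Two environments are required: one for Paddle2ONNX model conversion and one for ONNX model prediction.
+### Paddle2ONNX
+Paddle2ONNX converts models from the PaddlePaddle format to the ONNX format. Operator export is currently stable for ONNX Opset 9~11, and some Paddle operators can also be converted to lower ONNX Opset versions.
+For more details, see [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md)
+- Install Paddle2ONNX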
+```
+python3.7 -m pip install paddle2onnx
+```
+- Install ONNX Runtime
+```
+# Version 1.4.0 is recommended; change the version number to suit your environment
+python3.7 -m pip install onnxruntime==1.4.0
+```
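+## 2. Model Conversion
+- Download the Paddle model
+There are two ways to obtain a Paddle static graph model: download an inference model provided by PaddleOCR from the [model_list](../../doc/doc_ch/models_list.md),
+or follow the [model export instructions](../../doc/doc_ch/inference.md#训练模型转inference模型) to convert trained weights into an inference model.
+Taking the ppocr detection model as an example: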
+```
+wget -nc  -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
+cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && cd ..
+```
+- Convert the model
+Use Paddle2ONNX to convert the Paddle static graph model to the ONNX model format:
+```
+paddle2onnx --model_dir=./inference/ch_ppocr_mobile_v2.0_det_infer/ \
+--model_filename=inference.pdmodel \
+--params_filename=inference.pdiparams \
+--save_file=./inference/det_mobile_onnx/model.onnx \
+--opset_version=10 \
+--enable_onnx_checker=True
+```
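+After the command finishes, the ONNX model is saved under `./inference/det_mobile_onnx/`.
+## 3. ONNX Prediction
+Taking the detection model as an example, run the following command to predict with ONNX: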
+```
+python3.7 ../../tools/infer/predict_det.py --use_gpu=False --use_onnx=True \
+--det_model_dir=./inference/det_mobile_onnx/model.onnx \
+--image_dir=../../doc/imgs/1.jpg
+```
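+After the command runs, the predicted detection box coordinates are printed in the terminal, and the visualization result is saved under `./inference_results/`.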
+```
+root INFO: 1.jpg  [[[291, 295], [334, 292], [348, 844], [305, 847]], [[344, 296], [379, 294], [387, 669], [353, 671]]]
+The predict time of ../../doc/imgs/1.jpg: 0.06162881851196289
+The visualized image saved in ./inference_results/det_res_1.jpg
+```
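+* Note: ONNX does not yet support variable-length prediction. Because the input must be resized to a fixed size, the results may differ slightly from those predicted directly with Paddle.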
--- a/deploy/pdserving/README.md
+++ b/deploy/pdserving/README.md
@@ -114,7 +114,7 @@ The recognition model is the same.
    git clone https://github.com/PaddlePaddle/PaddleOCR
    # Enter the working directory  
-    cd PaddleOCR/deploy/pdserver/
+    cd PaddleOCR/deploy/pdserving/
    ```
    The pdserver directory contains the code to start the pipeline service and send prediction requests, including:

--- a/deploy/pdserving/README_CN.md
+++ b/deploy/pdserving/README_CN.md
@@ -112,7 +112,7 @@ python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_mobile_v2.0_rec_in
    git clone https://github.com/PaddlePaddle/PaddleOCR
    # Enter the working directory
-    cd PaddleOCR/deploy/pdserver/
+    cd PaddleOCR/deploy/pdserving/
    ```
    The pdserver directory contains the code to start the pipeline service and send prediction requests, including:
    ```