merge dygraph

Commit ac98415b authored Sep 14, 2021 by WenmuZhou
20 changed files
--- a/README.md
+++ b/README.md
@@ -25,7 +25,7 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools
 **Recent updates**
- PaddleOCR R&D team would like to share the released tools with developers, at 20:15 pm on September 8th, [Live Address](https://live.bilibili.com/21689802).
+- PaddleOCR R&D team would like to share the key points of PP-OCRv2, at 20:15 pm on September 8th, [Live Address](https://live.bilibili.com/21689802).
 - 2021.9.7 release PaddleOCR v2.3, [PP-OCRv2](#PP-OCRv2) is proposed. The inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server in CPU device. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile.
 - 2021.8.3 released PaddleOCR v2.2, add a new structured documents analysis toolkit, i.e., [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README.md), support layout analysis and table recognition (One-key to export chart images to Excel files).
 - 2021.4.8 release end-to-end text recognition algorithm [PGNet](https://www.aaai.org/AAAI21Papers/AAAI-2885.WangP.pdf) which is published in AAAI 2021. Find tutorial [here](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/pgnet_en.md)；release multi language recognition [models](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/multi_languages_en.md), support more than 80 languages recognition; especically, the performance of [English recognition model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/models_list_en.md#English) is Optimized.
@@ -86,7 +86,7 @@ Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Andr
 | Model introduction                                           | Model name                   | Recommended scene | Detection model                                              | Direction classifier                                         | Recognition model                                            |
 | ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| Chinese and English ultra-lightweight PP-OCRv2 model（11.6M） | ch_ppocrv2_xx |Mobile&Server|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/chinese/ch_ppocr_mobile_v2.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/chinese/ch_ppocr_mobile_v2.1_det_distill_train.tar)| [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/chinese/ch_ppocr_mobile_v2.1_rec_train.tar)|
+| Chinese and English ultra-lightweight PP-OCRv2 model（11.6M） |  ch_PP-OCRv2_xx |Mobile&Server|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)| [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/ch/ch_PP-OCRv2_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar)|
 | Chinese and English ultra-lightweight PP-OCR model (9.4M)       | ch_ppocr_mobile_v2.0_xx      | Mobile & server   |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar)      |
 | Chinese and English general PP-OCR model (143.4M)               | ch_ppocr_server_v2.0_xx      | Server            |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_traingit.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar)  |  
@@ -103,13 +103,12 @@ For a new language request, please refer to [Guideline for new language_requests
    - [PP-OCR Model and Configuration](./doc/doc_en/models_and_config_en.md)
        - [PP-OCR Model Download](./doc/doc_en/models_list_en.md)
        - [Yml Configuration](./doc/doc_en/config_en.md)
-        - [Python Inference](./doc/doc_en/inference_en.md)
+        - [Python Inference for PP-OCR Model Library](./doc/doc_en/inference_ppocr_en.md)
    - [PP-OCR Training](./doc/doc_en/training_en.md)
        - [Text Detection](./doc/doc_en/detection_en.md)
        - [Text Recognition](./doc/doc_en/recognition_en.md)
        - [Direction Classification](./doc/doc_en/angle_class_en.md)
    - Inference and Deployment
-        - [Python Inference](./doc/doc_en/inference_en.md)
        - [C++ Inference](./deploy/cpp_infer/readme_en.md)
        - [Serving](./deploy/pdserving/README.md)
        - [Mobile](./deploy/lite/readme_en.md)
@@ -120,6 +119,7 @@ For a new language request, please refer to [Guideline for new language_requests
 - Academic Circles
    - [Two-stage Algorithm](./doc/doc_en/algorithm_overview_en.md)
    - [PGNet Algorithm](./doc/doc_en/algorithm_overview_en.md)
+    - [Python Inference](./doc/doc_en/inference_en.md)
 - Data Annotation and Synthesis
    - [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md)
    - [Data Synthesis Tool: Style-Text](./StyleText/README.md)
@@ -146,7 +146,7 @@ For a new language request, please refer to [Guideline for new language_requests
 [1] PP-OCR is a practical ultra-lightweight OCR system. It is mainly composed of three parts: DB text detection, detection frame correction and CRNN text recognition. The system adopts 19 effective strategies from 8 aspects including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization to optimize and slim down the models of each module (as shown in the green box above). The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English digital OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941).
-[2] On the basis of PP-OCR, PP-OCRv2 is further optimized in five aspects. The detection model adopts CML(Collaborative Mutual Learning) knowledge distillation strategy and CopyPaste data expansion strategy; The recognition model adopts LCNet lightweight backbone network, U-DML knowledge distillation strategy and enhanced CTC loss function improvement (as shown in the red box above), which further improves the inference speed and prediction effect. For more details, please refer to the technical report of PP-OCRv2 (arXiv link is coming soon).
+[2] On the basis of PP-OCR, PP-OCRv2 is further optimized in five aspects. The detection model adopts CML(Collaborative Mutual Learning) knowledge distillation strategy and CopyPaste data expansion strategy. The recognition model adopts LCNet lightweight backbone network, U-DML knowledge distillation strategy and enhanced CTC loss function improvement (as shown in the red box above), which further improves the inference speed and prediction effect. For more details, please refer to the technical report of PP-OCRv2 (arXiv link is coming soon).

--- a/README_ch.md
+++ b/README_ch.md
@@ -81,7 +81,7 @@ PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps
 | Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model |
 | ------------ | --------------- | ----------------|---- | ---------- | -------- |
-| Chinese and English ultra-lightweight PP-OCRv2 model (11.6M) | ch_ppocrv2_xx |Mobile & server|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/chinese/ch_ppocr_mobile_v2.1_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/chinese/ch_ppocr_mobile_v2.1_det_distill_train.tar)| [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/chinese/ch_ppocr_mobile_v2.1_rec_train.tar)|
+| Chinese and English ultra-lightweight PP-OCRv2 model (13.0M) |  ch_PP-OCRv2_xx |Mobile & server|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)| [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar)|
 | Chinese and English ultra-lightweight PP-OCR mobile model (9.4M) | ch_ppocr_mobile_v2.0_xx |Mobile & server|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar)      |
 | Chinese and English general PP-OCR server model (143.4M)   |ch_ppocr_server_v2.0_xx|Server |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar)  |  
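@@ -95,13 +95,12 @@ PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps
    - [PP-OCR Models and Configuration Files](./doc/doc_ch/models_and_config.md)
        - [PP-OCR Model Download](./doc/doc_ch/models_list.md)
        - [Configuration File Contents and Generation](./doc/doc_ch/config.md)
-        - [Quick Start with the Model Library](./doc/doc_ch/inference.md)
+        - [Quick Inference with the PP-OCR Model Library](./doc/doc_ch/inference_ppocr.md)
    - [PP-OCR Model Training](./doc/doc_ch/training.md)
        - [Text Detection](./doc/doc_ch/detection.md)
        - [Text Recognition](./doc/doc_ch/recognition.md)
        - [Direction Classifier](./doc/doc_ch/angle_class.md)
    - PP-OCR Model Inference and Deployment
-        - [Inference with the Python Prediction Engine](./doc/doc_ch/inference.md)
        - [Inference with the C++ Prediction Engine](./deploy/cpp_infer/readme.md)
        - [Serving Deployment](./deploy/pdserving/README_CN.md)
        - [On-Device Deployment](./deploy/lite/readme.md)
@@ -117,6 +116,7 @@ PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps
 - OCR Academic Circle
    - [Two-Stage Model Introduction and Download](./doc/doc_ch/algorithm_overview.md)
    - [End-to-End PGNet Algorithm](./doc/doc_ch/pgnet.md)
+    - [Inference with the Python Prediction Engine](./doc/doc_ch/inference.md)
 - Datasets
    - [General Chinese and English OCR Datasets](./doc/doc_ch/datasets.md)
    - [Handwritten Chinese OCR Datasets](./doc/doc_ch/handwritten_datasets.md)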

--- a/configs/det/ch_ppocr_v2.1/ch_det_lite_train_cml_v2.1.yml
+++ b/configs/det/ch_ppocr_v2.1/ch_det_lite_train_cml_v2.1.yml
@@ -8,7 +8,7 @@ Global:
  # evaluation is run every 2000 iterations after the 3000th iteration
  eval_batch_step: [3000, 2000]
  cal_metric_during_train: False
-  pretrained_model: ./pretrain_models/ch_ppocr_mobile_v2.1_det_distill_train/best_accuracy
+  pretrained_model: ./pretrain_models/ch_PP-OCRv2_det_distill_train/best_accuracy
  checkpoints:
  save_inference_dir:
  use_visualdl: False

--- a/configs/det/ch_ppocr_v2.1/ch_det_lite_train_distill_v2.1.yml
+++ b/configs/det/ch_ppocr_v2.1/ch_det_lite_train_distill_v2.1.yml
--- a/configs/det/ch_ppocr_v2.1/ch_det_lite_train_dml_v2.1.yml
+++ b/configs/det/ch_ppocr_v2.1/ch_det_lite_train_dml_v2.1.yml
--- a/configs/det/ch_ppocr_v2.1/ch_det_mv3_db_v2.1_student.yml
+++ b/configs/det/ch_ppocr_v2.1/ch_det_mv3_db_v2.1_student.yml
--- a/configs/det/det_mv3_db.yml
+++ b/configs/det/det_mv3_db.yml
--- a/configs/det/det_r50_vd_db.yml
+++ b/configs/det/det_r50_vd_db.yml
@@ -98,7 +98,7 @@ Train:
    shuffle: True
    drop_last: False
    batch_size_per_card: 16
-    num_workers: 8
+    num_workers: 4
 Eval:
  dataset:

--- a/configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml
+++ b/configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml
+Global:
+  debug: false
+  use_gpu: true
+  epoch_num: 800
+  log_smooth_window: 20
+  print_batch_step: 10
+  save_model_dir: ./output/rec_mobile_pp-OCRv2
+  save_epoch_step: 3
+  eval_batch_step: [0, 2000]
+  cal_metric_during_train: true
+  pretrained_model:
+  checkpoints:
+  save_inference_dir:
+  use_visualdl: false
+  infer_img: doc/imgs_words/ch/word_1.jpg
+  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
+  character_type: ch
+  max_text_length: 25
+  infer_mode: false
+  use_space_char: true
+  distributed: true
+  save_res_path: ./output/rec/predicts_mobile_pp-OCRv2.txt
+Optimizer:
+  name: Adam
+  beta1: 0.9
+  beta2: 0.999
+  lr:
+    name: Piecewise
+    decay_epochs : [700, 800]
+    values : [0.001, 0.0001]
+    warmup_epoch: 5
+  regularizer:
+    name: L2
+    factor: 2.0e-05
+Architecture:
+  model_type: rec
+  algorithm: CRNN
+  Transform:
+  Backbone:
+    name: MobileNetV1Enhance
+    scale: 0.5
+  Neck:
+    name: SequenceEncoder
+    encoder_type: rnn
+    hidden_size: 64
+  Head:
+    name: CTCHead
+    mid_channels: 96
+    fc_decay: 0.00002
+Loss:
+  name: CTCLoss
+PostProcess:
+  name: CTCLabelDecode
+Metric:
+  name: RecMetric
+  main_indicator: acc
+Train:
+  dataset:
+    name: SimpleDataSet
+    data_dir: ./train_data/
+    label_file_list:
+    - ./train_data/train_list.txt
+    transforms:
+    - DecodeImage:
+        img_mode: BGR
+        channel_first: false
+    - RecAug:
+    - CTCLabelEncode:
+    - RecResizeImg:
+        image_shape: [3, 32, 320]
+    - KeepKeys:
+        keep_keys:
+        - image
+        - label
+        - length
+  loader:
+    shuffle: true
+    batch_size_per_card: 128
+    drop_last: true
+    num_workers: 8
+Eval:
+  dataset:
+    name: SimpleDataSet
+    data_dir: ./train_data
+    label_file_list:
+    - ./train_data/val_list.txt
+    transforms:
+    - DecodeImage:
+        img_mode: BGR
+        channel_first: false
+    - CTCLabelEncode:
+    - RecResizeImg:
+        image_shape: [3, 32, 320]
+    - KeepKeys:
+        keep_keys:
+        - image
+        - label
+        - length
+  loader:
+    shuffle: false
+    drop_last: false
+    batch_size_per_card: 128
+    num_workers: 8
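Review note on the `Optimizer.lr` block in the new `ch_PP-OCRv2_rec.yml`: the schedule is `Piecewise` with `decay_epochs: [700, 800]`, `values: [0.001, 0.0001]`, and `warmup_epoch: 5`. The sketch below shows one plausible reading of those three settings at epoch granularity. The function name `piecewise_lr` and the linear warmup starting from zero are assumptions made for illustration; the training code that actually consumes this config is not part of this diff and may step per iteration rather than per epoch.

```python
def piecewise_lr(epoch, decay_epochs=(700, 800), values=(0.001, 0.0001),
                 warmup_epoch=5):
    """Hypothetical epoch-level view of the Piecewise schedule in the config."""
    # Pick the value of the first boundary that has not been reached yet;
    # once every boundary is passed, stay on the last value.
    target = values[-1]
    for boundary, value in zip(decay_epochs, values):
        if epoch < boundary:
            target = value
            break
    # Assumed warmup: ramp linearly from 0 up to the first value.
    if epoch < warmup_epoch:
        return values[0] * epoch / warmup_epoch
    return target


if __name__ == "__main__":
    for e in (0, 2, 5, 699, 700, 799, 800):
        print(e, piecewise_lr(e))
```

Under this reading the rate stays at 0.001 for most of the 800-epoch run and drops to 0.0001 only for the final 100 epochs.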
--- a/configs/rec/ch_ppocr_v2.1/rec_chinese_lite_train_distillation_v2.1.yml
+++ b/configs/rec/ch_ppocr_v2.1/rec_chinese_lite_train_distillation_v2.1.yml
@@ -4,7 +4,7 @@ Global:
  epoch_num: 800
  log_smooth_window: 20
  print_batch_step: 10
-  save_model_dir: ./output/rec_chinese_lite_distillation_v2.1
+  save_model_dir: ./output/rec_pp-OCRv2_distillation
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
@@ -19,7 +19,7 @@ Global:
  infer_mode: false
  use_space_char: true
  distributed: true
-  save_res_path: ./output/rec/predicts_chinese_lite_distillation_v2.1.txt
+  save_res_path: ./output/rec/predicts_pp-OCRv2_distillation.txt
 Optimizer:
@@ -88,6 +88,7 @@ Loss:
  - DistillationDMLLoss:
      weight: 1.0
      act: "softmax"
+      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
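Review note on `DistillationDMLLoss` with `act: "softmax"` and the newly added `use_log: true`: the diff does not show how the loss class uses these keys. Assuming DML here means a deep-mutual-learning style loss (a symmetric KL divergence between the softmax outputs of the `Student` and `Teacher` heads, with the logarithm taken of the probabilities), a minimal pure-Python sketch looks like this. All function names are hypothetical and this is not the PaddleOCR implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_div(p, q):
    """KL(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))


def dml_loss(student_logits, teacher_logits):
    """Symmetric KL between the two softmax outputs (assumed DML form)."""
    s = softmax(student_logits)
    t = softmax(teacher_logits)
    return 0.5 * (kl_div(s, t) + kl_div(t, s))


if __name__ == "__main__":
    print(dml_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.2]))
```

The loss is zero when both heads agree and grows as their predicted distributions diverge, which is why it can be applied to the pair in `model_name_pairs` without a ground-truth label.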

--- a/configs/rec/rec_r31_sar.yml
+++ b/configs/rec/rec_r31_sar.yml
+Global:
+  use_gpu: true
+  epoch_num: 5
+  log_smooth_window: 20
+  print_batch_step: 20
+  save_model_dir: ./sar_rec
+  save_epoch_step: 1
+  # evaluation is run every 2000 iterations
+  eval_batch_step: [0, 2000]
+  cal_metric_during_train: True
+  pretrained_model:
+  checkpoints: 
+  save_inference_dir:
+  use_visualdl: False
+  infer_img: 
+  # for data or label process
+  character_dict_path: ppocr/utils/dict90.txt
+  character_type: EN_symbol
+  max_text_length: 30
+  infer_mode: False
+  use_space_char: False
+  rm_symbol: True
+  save_res_path: ./output/rec/predicts_sar.txt
+Optimizer:
+  name: Adam
+  beta1: 0.9
+  beta2: 0.999
+  lr:
+    name: Piecewise
+    decay_epochs: [3, 4]
+    values: [0.001, 0.0001, 0.00001] 
+  regularizer:
+    name: 'L2'
+    factor: 0
+Architecture:
+  model_type: rec
+  algorithm: SAR
+  Transform:
+  Backbone:
+    name: ResNet31
+  Head:
+    name: SARHead
+Loss:
+  name: SARLoss
+PostProcess:
+  name: SARLabelDecode
+Metric:
+  name: RecMetric
+Train:
+  dataset:
+    name: SimpleDataSet
+    label_file_list: ['./train_data/train_list.txt']
+    data_dir: ./train_data/
+    ratio_list: 1.0
+    transforms:
+      - DecodeImage: # load image
+          img_mode: BGR
+          channel_first: False
+      - SARLabelEncode: # Class handling label
+      - SARRecResizeImg:
+          image_shape: [3, 48, 48, 160] # h:48 w:[48,160]
+          width_downsample_ratio: 0.25
+      - KeepKeys:
+          keep_keys: ['image', 'label', 'valid_ratio'] # dataloader will return list in this order
+  loader:
+    shuffle: True
+    batch_size_per_card: 64
+    drop_last: True
+    num_workers: 8
+    use_shared_memory: False
+Eval:
+  dataset:
+    name: LMDBDataSet
+    data_dir: ./train_data/data_lmdb_release/evaluation/
+    transforms:
+      - DecodeImage: # load image
+          img_mode: BGR
+          channel_first: False
+      - SARLabelEncode: # Class handling label
+      - SARRecResizeImg:
+          image_shape: [3, 48, 48, 160]
+          width_downsample_ratio: 0.25
+      - KeepKeys:
+          keep_keys: ['image', 'label', 'valid_ratio'] # dataloader will return list in this order
+  loader:
+    shuffle: False
+    drop_last: False
+    batch_size_per_card: 64
+    num_workers: 4
+    use_shared_memory: False
--- a/deploy/cpp_infer/src/main.cpp
+++ b/deploy/cpp_infer/src/main.cpp
@@ -44,7 +44,7 @@ DEFINE_int32(cpu_threads, 10, "Num of threads with CPU.");
 DEFINE_bool(enable_mkldnn, false, "Whether use mkldnn with CPU.");
 DEFINE_bool(use_tensorrt, false, "Whether use tensorrt.");
 DEFINE_string(precision, "fp32", "Precision be one of fp32/fp16/int8");
-DEFINE_bool(benchmark, true, "Whether use benchmark.");
+DEFINE_bool(benchmark, false, "Whether use benchmark.");
 DEFINE_string(save_log_path, "./log_output/", "Save benchmark log path.");
 // detection related
 DEFINE_string(image_dir, "", "Dir of input image.");
@@ -127,9 +127,15 @@ int main_det(std::vector<cv::String> cv_all_img_names) {
 int main_rec(std::vector<cv::String> cv_all_img_names) {
    std::vector<double> time_info = {0, 0, 0};
+    std::string char_list_file = FLAGS_char_list_file;
+    if (FLAGS_benchmark) 
+        char_list_file = FLAGS_char_list_file.substr(6);
+    cout << "label file: " << char_list_file << endl;
    CRNNRecognizer rec(FLAGS_rec_model_dir, FLAGS_use_gpu, FLAGS_gpu_id,
                       FLAGS_gpu_mem, FLAGS_cpu_threads,
-                       FLAGS_enable_mkldnn, FLAGS_char_list_file,
+                       FLAGS_enable_mkldnn, char_list_file,
                       FLAGS_use_tensorrt, FLAGS_precision);
    for (int i = 0; i < cv_all_img_names.size(); ++i) {
@@ -149,11 +155,27 @@ int main_rec(std::vector<cv::String> cv_all_img_names) {
      time_info[2] += rec_times[2];
    }
+    if (FLAGS_benchmark) {
+        AutoLogger autolog("ocr_rec", 
+                           FLAGS_use_gpu,
+                           FLAGS_use_tensorrt,
+                           FLAGS_enable_mkldnn,
+                           FLAGS_cpu_threads,
+                           1, 
+                           "dynamic", 
+                           FLAGS_precision, 
+                           time_info, 
+                           cv_all_img_names.size());
+        autolog.report();
+    }
    return 0;
 }
 int main_system(std::vector<cv::String> cv_all_img_names) {
+    std::vector<double> time_info_det = {0, 0, 0};
+    std::vector<double> time_info_rec = {0, 0, 0};
    DBDetector det(FLAGS_det_model_dir, FLAGS_use_gpu, FLAGS_gpu_id,
                   FLAGS_gpu_mem, FLAGS_cpu_threads, 
                   FLAGS_enable_mkldnn, FLAGS_max_side_len, FLAGS_det_db_thresh,
@@ -169,17 +191,20 @@ int main_system(std::vector<cv::String> cv_all_img_names) {
                           FLAGS_use_tensorrt, FLAGS_precision);
    }
+    std::string char_list_file = FLAGS_char_list_file;
+    if (FLAGS_benchmark) 
+        char_list_file = FLAGS_char_list_file.substr(6);
+    cout << "label file: " << char_list_file << endl;
    CRNNRecognizer rec(FLAGS_rec_model_dir, FLAGS_use_gpu, FLAGS_gpu_id,
                       FLAGS_gpu_mem, FLAGS_cpu_threads,
-                       FLAGS_enable_mkldnn, FLAGS_char_list_file,
+                       FLAGS_enable_mkldnn, char_list_file,
                       FLAGS_use_tensorrt, FLAGS_precision);
-    auto start = std::chrono::system_clock::now();
    for (int i = 0; i < cv_all_img_names.size(); ++i) {
      LOG(INFO) << "The predict img: " << cv_all_img_names[i];
-      cv::Mat srcimg = cv::imread(FLAGS_image_dir, cv::IMREAD_COLOR);
+      cv::Mat srcimg = cv::imread(cv_all_img_names[i], cv::IMREAD_COLOR);
      if (!srcimg.data) {
        std::cerr << "[ERROR] image read failed! image path: " << cv_all_img_names[i] << endl;
        exit(1);
@@ -189,6 +214,9 @@ int main_system(std::vector<cv::String> cv_all_img_names) {
      std::vector<double> rec_times;
      det.Run(srcimg, boxes, &det_times);
+      time_info_det[0] += det_times[0];
+      time_info_det[1] += det_times[1];
+      time_info_det[2] += det_times[2];
      cv::Mat crop_img;
      for (int j = 0; j < boxes.size(); j++) {
@@ -198,18 +226,36 @@ int main_system(std::vector<cv::String> cv_all_img_names) {
          crop_img = cls->Run(crop_img);
        }
        rec.Run(crop_img, &rec_times);
+        time_info_rec[0] += rec_times[0];
+        time_info_rec[1] += rec_times[1];
+        time_info_rec[2] += rec_times[2];
      }
-      auto end = std::chrono::system_clock::now();
-      auto duration =
-          std::chrono::duration_cast<std::chrono::microseconds>(end - start);
-      std::cout << "Cost  "
-                << double(duration.count()) *
-                       std::chrono::microseconds::period::num /
-                       std::chrono::microseconds::period::den
-                << "s" << std::endl;
    }
+    if (FLAGS_benchmark) {
+        AutoLogger autolog_det("ocr_det", 
+                            FLAGS_use_gpu,
+                            FLAGS_use_tensorrt,
+                            FLAGS_enable_mkldnn,
+                            FLAGS_cpu_threads,
+                            1, 
+                            "dynamic", 
+                            FLAGS_precision, 
+                            time_info_det, 
+                            cv_all_img_names.size());
+        AutoLogger autolog_rec("ocr_rec", 
+                            FLAGS_use_gpu,
+                            FLAGS_use_tensorrt,
+                            FLAGS_enable_mkldnn,
+                            FLAGS_cpu_threads,
+                            1, 
+                            "dynamic", 
+                            FLAGS_precision, 
+                            time_info_rec, 
+                            cv_all_img_names.size());
+        autolog_det.report();
+        std::cout << endl;
+        autolog_rec.report();
+    }  
    return 0;
 }
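Review note on `FLAGS_char_list_file.substr(6)` in `main_rec` and `main_system`: when `--benchmark` is set, the first six characters of the dictionary path are dropped unconditionally. In C++, `std::string::substr(pos)` throws `std::out_of_range` when `pos` is greater than the string length, so a short path would abort the run. The Python sketch below mirrors that behaviour with an explicit check. The helper name and the example path with a `../../` prefix are hypothetical; the diff does not say which prefix the benchmark scripts actually pass.

```python
def strip_benchmark_prefix(char_list_file, benchmark, prefix_len=6):
    """Mimic the C++ logic: drop a fixed-length prefix only in benchmark mode."""
    if not benchmark:
        return char_list_file
    if len(char_list_file) < prefix_len:
        # std::string::substr would throw std::out_of_range here.
        raise ValueError(
            "path %r is shorter than %d characters" % (char_list_file, prefix_len))
    return char_list_file[prefix_len:]


if __name__ == "__main__":
    # Hypothetical path; "../../" is exactly six characters long.
    print(strip_benchmark_prefix("../../ppocr/utils/ppocr_keys_v1.txt", True))
```

A prefix-aware check (strip only when the path really starts with the expected prefix) would be more robust than a fixed offset, at the cost of hard-coding that prefix.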

--- a/deploy/pdserving/web_service_det.py
+++ b/deploy/pdserving/web_service_det.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from paddle_serving_server.web_service import WebService, Op
+import logging
+import numpy as np
+import cv2
+import base64
+# from paddle_serving_app.reader import OCRReader
+from ocr_reader import OCRReader, DetResizeForTest
+from paddle_serving_app.reader import Sequential, ResizeByFactor
+from paddle_serving_app.reader import Div, Normalize, Transpose
+from paddle_serving_app.reader import DBPostProcess, FilterBoxes, GetRotateCropImage, SortedBoxes
+_LOGGER = logging.getLogger()
+class DetOp(Op):
+    def init_op(self):
+        self.det_preprocess = Sequential([
+            DetResizeForTest(), Div(255),
+            Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose(
+                (2, 0, 1))
+        ])
+        self.filter_func = FilterBoxes(10, 10)
+        self.post_func = DBPostProcess({
+            "thresh": 0.3,
+            "box_thresh": 0.5,
+            "max_candidates": 1000,
+            "unclip_ratio": 1.5,
+            "min_size": 3
+        })
+    def preprocess(self, input_dicts, data_id, log_id):
+        (_, input_dict), = input_dicts.items()
+        data = base64.b64decode(input_dict["image"].encode('utf8'))
+        self.raw_im = data
+        data = np.frombuffer(data, np.uint8)
+        # Note: class variables(self.var) can only be used in process op mode
+        im = cv2.imdecode(data, cv2.IMREAD_COLOR)
+        self.ori_h, self.ori_w, _ = im.shape
+        det_img = self.det_preprocess(im)
+        _, self.new_h, self.new_w = det_img.shape
+        return {"x": det_img[np.newaxis, :].copy()}, False, None, ""
+    def postprocess(self, input_dicts, fetch_dict, log_id):
+        det_out = fetch_dict["save_infer_model/scale_0.tmp_1"]
+        ratio_list = [
+            float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w
+        ]
+        dt_boxes_list = self.post_func(det_out, [ratio_list])
+        dt_boxes = self.filter_func(dt_boxes_list[0], [self.ori_h, self.ori_w])
+        out_dict = {"dt_boxes": str(dt_boxes)}
+        return out_dict, None, ""
+class OcrService(WebService):
+    def get_pipeline_response(self, read_op):
+        det_op = DetOp(name="det", input_ops=[read_op])
+        return det_op
+uci_service = OcrService(name="ocr")
+uci_service.prepare_pipeline_config("config.yml")
+uci_service.run_service()
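Review note on `DetOp.preprocess`: the op reads `input_dict["image"]`, encodes it as UTF-8, and base64-decodes it, so a client has to send the raw image file bytes as a base64 text string. The stdlib-only sketch below shows that round trip. The helper names are hypothetical, and the byte string stands in for real file contents; how the request is transported to the service is not shown in this diff.

```python
import base64


def encode_image_bytes(raw_bytes):
    """Client side: turn raw image file bytes into a base64 text field."""
    return base64.b64encode(raw_bytes).decode("utf8")


def decode_image_field(input_dict):
    """Server side: same decoding step as in preprocess()."""
    return base64.b64decode(input_dict["image"].encode("utf8"))


if __name__ == "__main__":
    raw = b"\x89PNG\r\n\x1a\n placeholder image bytes"
    payload = {"image": encode_image_bytes(raw)}
    assert decode_image_field(payload) == raw
    print(len(payload["image"]), "base64 characters")
```

The decoded bytes are what the op then converts to a `uint8` array and passes to `cv2.imdecode`.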
--- a/deploy/pdserving/web_service_rec.py
+++ b/deploy/pdserving/web_service_rec.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from paddle_serving_server.web_service import WebService, Op
+import logging
+import numpy as np
+import cv2
+import base64
+# from paddle_serving_app.reader import OCRReader
+from ocr_reader import OCRReader, DetResizeForTest
+from paddle_serving_app.reader import Sequential, ResizeByFactor
+from paddle_serving_app.reader import Div, Normalize, Transpose
+_LOGGER = logging.getLogger()
+class RecOp(Op):
+    def init_op(self):
+        self.ocr_reader = OCRReader(
+            char_dict_path="../../ppocr/utils/ppocr_keys_v1.txt")
+    def preprocess(self, input_dicts, data_id, log_id):
+        (_, input_dict), = input_dicts.items()
+        raw_im = base64.b64decode(input_dict["image"].encode('utf8'))
+        data = np.frombuffer(raw_im, np.uint8)
+        im = cv2.imdecode(data, cv2.IMREAD_COLOR)
+        feed_list = []
+        max_wh_ratio = 0
+        # Many mini-batches; the type of feed_data is list.
+        max_batch_size = 6  # len(dt_boxes)
+        # If max_batch_size is 0, skipping predict stage
+        if max_batch_size == 0:
+            return {}, True, None, ""
+        boxes_size = max_batch_size
+        rem = boxes_size % max_batch_size
+        h, w = im.shape[0:2]
+        wh_ratio = w * 1.0 / h
+        max_wh_ratio = max(max_wh_ratio, wh_ratio)
+        _, w, h = self.ocr_reader.resize_norm_img(im, max_wh_ratio).shape
+        norm_img = self.ocr_reader.resize_norm_img(im, max_batch_size)
+        norm_img = norm_img[np.newaxis, :]
+        feed = {"x": norm_img.copy()}
+        feed_list.append(feed)
+        return feed_list, False, None, ""
+    def postprocess(self, input_dicts, fetch_data, log_id):
+        res_list = []
+        if isinstance(fetch_data, dict):
+            if len(fetch_data) > 0:
+                rec_batch_res = self.ocr_reader.postprocess(
+                    fetch_data, with_score=True)
+                for res in rec_batch_res:
+                    res_list.append(res[0])
+        elif isinstance(fetch_data, list):
+            for one_batch in fetch_data:
+                one_batch_res = self.ocr_reader.postprocess(
+                    one_batch, with_score=True)
+                for res in one_batch_res:
+                    res_list.append(res[0])
+        res = {"res": str(res_list)}
+        return res, None, ""
+class OcrService(WebService):
+    def get_pipeline_response(self, read_op):
+        rec_op = RecOp(name="rec", input_ops=[read_op])
+        return rec_op
+uci_service = OcrService(name="ocr")
+uci_service.prepare_pipeline_config("config.yml")
+uci_service.run_service()
--- a/doc/ic15_location_download.png
+++ b/doc/ic15_location_download.png
--- a/doc/doc_ch/algorithm_overview.md
+++ b/doc/doc_ch/algorithm_overview.md
@@ -49,6 +49,7 @@ List of text recognition algorithms open-sourced in PaddleOCR based on the dynamic graph:
 - [x]  RARE([paper](https://arxiv.org/abs/1603.03915v1))
 - [x]  SRN([paper](https://arxiv.org/abs/2003.12294))
 - [x]  NRTR([paper](https://arxiv.org/abs/1806.00926v2))
+- [x]  SAR([paper](https://arxiv.org/abs/1811.00751v2))
 Following the [DTRB](https://arxiv.org/abs/1904.01906) text recognition training and evaluation protocol, the models are trained on the MJSynth and SynthText text recognition datasets and evaluated on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. The results are as follows:
@@ -64,6 +65,6 @@ List of text recognition algorithms open-sourced in PaddleOCR based on the dynamic graph:
 |RARE|Resnet34_vd|83.6%|rec_r34_vd_tps_bilstm_att |[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_att_v2.0_train.tar)|
 |SRN|Resnet50_vd_fpn| 88.52% | rec_r50fpn_vd_none_srn | [Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r50_vd_srn_train.tar) |
 |NRTR|NRTR_MTB| 84.3% | rec_mtb_nrtr | [Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mtb_nrtr_train.tar) |
+|SAR|Resnet31| 87.2% | rec_r31_sar | [Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_r31_sar_train.tar) |
 For training and using the PaddleOCR text recognition algorithms, please refer to [the text recognition section of the model training/evaluation tutorial](./recognition.md).
--- a/doc/doc_ch/benchmark.md
+++ b/doc/doc_ch/benchmark.md
@@ -12,40 +12,27 @@
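 ## Evaluation metrics
 Notes:
- v1.0 is the DB+CRNN model without optimization strategies; v1.1 is the PP-OCR model with multiple optimization strategies and a direction classifier. slim_v1.1 is the pruned or quantized model.
 - The long side of the detection input image is 960.
- The measured time covers the full pipeline from image input to result output, including image preprocessing and postprocessing.
+- The measured time covers image inference only, excluding image preprocessing and postprocessing.
 - `Intel Xeon 6148` is the server-side CPU model; Intel MKL-DNN acceleration is used in the tests.
 - `Snapdragon 855` is the mobile processing platform model.
-Comparison of model size and overall recognition accuracy across inference models
+Comparison of inference model size and overall recognition accuracy
 | Model name                   | Overall model<br>size \(M\) | Detection model<br>size \(M\) | Direction classifier<br>model size \(M\) | Recognition model<br>size \(M\) | Overall recognition<br>F\-score |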
 |:-:|:-:|:-:|:-:|:-:|:-:|
-| ch\_ppocr\_mobile\_v1\.1 | 8\.1        | 2\.6        | 0\.9           | 4\.6        | 0\.5193      |
+| PP-OCRv2 | 11\.6        | 3\.0        | 0\.9           | 8\.6        | 0\.5224      |
-| ch\_ppocr\_server\_v1\.1 | 155\.1      | 47\.2       | 0\.9           | 107         | 0\.5414      |
+| PP-OCR mobile |   8\.1  | 2\.6        | 0\.9           | 4\.6        | 0\.503       |
-| ch\_ppocr\_mobile\_v1\.0 | 8\.6        | 4\.1        | \-             | 4\.5        | 0\.393       |
+| PP-OCR server | 155\.1  | 47\.2       | 0\.9           | 107         | 0\.570       |
-| ch\_ppocr\_server\_v1\.0 | 203\.8      | 98\.5       | \-             | 105\.3      | 0\.4436      |
-Comparison of inference speed across inference models on a T4 GPU, in ms
-| Model name                   | Overall | Detection | Direction classifier | Recognition |
-|:-:|:-:|:-:|:-:|:-:|
-| ch\_ppocr\_mobile\_v1\.1 | 137 | 35 | 24    | 78  |
-| ch\_ppocr\_server\_v1\.1 | 204 | 39 | 25    | 140 |
-| ch\_ppocr\_mobile\_v1\.0 | 117 | 41 | \-    | 76  |
-| ch\_ppocr\_server\_v1\.0 | 199 | 52 | \-    | 147 |
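-Comparison of inference speed across inference models on CPU, in ms
-| Model name                   | Overall | Detection | Direction classifier | Recognition |
+Comparison of inference model speed on CPU and GPU, in ms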
-|:-:|:-:|:-:|:-:|:-:|
-| ch\_ppocr\_mobile\_v1\.1 | 421  | 164 | 51    | 206 |
-| ch\_ppocr\_mobile\_v1\.0 | 398  | 219 | \-    | 179 |
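-Comparison of model size, overall recognition accuracy, and inference speed on SD 855 between the pruned/quantized model and the original model
+| Model name                   | CPU   | T4 GPU  |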
+|:-:|:-:|:-:|
+| PP-OCRv2 | 330  | 111 |
+| PP-OCR mobile | 356  | 116 |
+| PP-OCR server | 1056  | 200 |
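-| Model name                         | Overall model<br>size \(M\) | Detection model<br>size \(M\) | Direction classifier<br>model size \(M\) | Recognition model<br>size \(M\) | Overall recognition<br>F\-score | SD 855<br>\(ms\) |
+For more inference metrics of the PP-OCR series models, see the [PP-OCR Benchmark](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/doc/doc_ch/benchmark.md)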
-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
-| ch\_ppocr\_mobile\_v1\.1       | 8\.1        | 2\.6        | 0\.9           | 4\.6        | 0\.5193      | 306          |
-| ch\_ppocr\_mobile\_slim\_v1\.1 | 3\.5        | 1\.4        | 0\.5           | 1\.6        | 0\.521       | 268          |
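The CPU latencies in the new speed table can be turned into relative speed gains with a little arithmetic. The sketch below is illustrative only; the helper name `speedup_percent` is hypothetical, and the numbers are the CPU column values for the three models.

```python
def speedup_percent(baseline_ms: float, new_ms: float) -> float:
    """Percentage by which speed rises when latency drops from baseline_ms to new_ms."""
    return (baseline_ms / new_ms - 1.0) * 100.0


# CPU latencies in ms, copied from the speed table.
cpu_ms = {"PP-OCRv2": 330, "PP-OCR mobile": 356, "PP-OCR server": 1056}

for name in ("PP-OCR mobile", "PP-OCR server"):
    gain = speedup_percent(cpu_ms[name], cpu_ms["PP-OCRv2"])
    print(f"PP-OCRv2 vs {name}: {gain:.0f}% faster on CPU")
```

For PP-OCR server this gives 1056 / 330 = 3.2, that is, a 220% speed gain, which agrees with the figure quoted in the README update.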
--- a/doc/doc_ch/detection.md
+++ b/doc/doc_ch/detection.md
@@ -19,15 +19,16 @@
 <a name="11-----"></a>
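 ## 1.1 Data preparation
-The icdar2015 dataset can be downloaded from the [official website](https://rrc.cvc.uab.es/?ch=4&com=downloads); registration is required for the first download.
+The icdar2015 TextLocalization dataset is a text detection dataset containing 1000 training images and 500 test images.
+The icdar2015 dataset can be downloaded from the [official website](https://rrc.cvc.uab.es/?ch=4&com=downloads); registration is required for the first download.
 After registering and logging in, download the parts marked with red boxes in the figure below. Save the content downloaded from `Training Set Images` in the `icdar_c4_train_imgs` folder and the content downloaded from `Test Set Images` in the `ch4_test_images` folder.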
 <p align="center">
- <img src="./doc/datasets/ic15_location_download.png" align="middle" width = "600"/>
+ <img src="../datasets/ic15_location_download.png" align="middle" width = "700"/>
 <p align="center">
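-Extract the downloaded dataset into your working directory, assumed here to be PaddleOCR/train_data/ . In addition, PaddleOCR has consolidated the scattered annotation files into a single annotation file,
+Extract the downloaded dataset into your working directory, assumed here to be PaddleOCR/train_data/. In addition, PaddleOCR has consolidated the scattered annotation files into a single annotation file,
 which you can download with wget.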
 ```shell
 # In the PaddleOCR directory

--- a/doc/doc_ch/environment.md
+++ b/doc/doc_ch/environment.md
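 # Runtime environment setup
+Windows and Mac users are advised to set up the Python environment with Anaconda; Linux users are advised to use Docker.
+If you are already familiar with Python environments, skip directly to step 2 to install PaddlePaddle.
 * [1. Python environment setup](#1)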
  + [1.1 Windows](#1.1)
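@@ -283,7 +286,7 @@ Linux users can run with either Anaconda or Docker. If you are familiar with Docker
 #### 1.3.2 Docker environment configuration
-**Note: the first time you use this image, it is downloaded automatically; please be patient.**
+**Note: the first time you use this image, it is downloaded automatically; please be patient. You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) to find an image that matches your machine.**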
 ```bash
 # Switch to the working directory
@@ -297,8 +300,6 @@ sudo docker run --name ppocr -v $PWD:/paddle --network=host -it paddlepaddle/pad
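 # If you use CUDA10, run the following command to create the container. It sets the docker container's shared memory (shm-size) to 64G; 32G or more is recommended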
 sudo nvidia-docker run --name ppocr -v $PWD:/paddle --shm-size=64G --network=host -it paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82 /bin/bash
-You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) to find an image that matches your machine.
 # Press ctrl+P+Q to exit the docker container; use the following command to re-enter it
 sudo docker container exec -it ppocr /bin/bash
 ```

--- a/doc/doc_ch/knowledge_distillation.md
+++ b/doc/doc_ch/knowledge_distillation.md
@@ -39,7 +39,7 @@ PaddleOCR integrates knowledge distillation algorithms; specifically, it has the following main
 ### 2.1 Recognition configuration file walkthrough
-The configuration file is [rec_chinese_lite_train_distillation_v2.1.yml](../../configs/rec/ch_ppocr_v2.1/rec_chinese_lite_train_distillation_v2.1.yml).
+The configuration file is [ch_PP-OCRv2_rec_distillation.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec_distillation.yml).
 #### 2.1.1 Model structure
@@ -246,6 +246,39 @@ Metric:
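 For the detailed implementation of `DistillationMetric`, see [distillation_metric.py](../../ppocr/metrics/distillation_metric.py#L24).
+#### 2.1.5 Fine-tuning the distilled model
+There are two ways to fine-tune the recognition model obtained by distillation.
+(1) Fine-tuning with knowledge distillation: this is the simpler case. Download the pretrained model, set the pretrained model path and your own data paths in [ch_PP-OCRv2_rec_distillation.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec_distillation.yml), and start fine-tuning.
+(2) Fine-tuning without knowledge distillation: in this case, first extract the student model parameters from the pretrained model, as follows.
+* First, download and extract the pretrained model.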
+```shell
+# Download and extract the pretrained model
+wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar
+tar -xf ch_PP-OCRv2_rec_train.tar
+```
+* Then use Python to extract the student model parameters.
+```python
+import paddle
+# Load the pretrained model
+all_params = paddle.load("ch_PP-OCRv2_rec_train/best_accuracy.pdparams")
+# Inspect the keys of the weight parameters
+print(all_params.keys())
+# Extract the student model weights (keys prefixed with "Student.")
+s_params = {key[len("Student."):]: all_params[key] for key in all_params if key.startswith("Student.")}
+# Inspect the keys of the student model weights
+print(s_params.keys())
+# Save
+paddle.save(s_params, "ch_PP-OCRv2_rec_train/student.pdparams")
+```
+After the conversion, use [ch_PP-OCRv2_rec.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml), set the pretrained model path (to the exported `student.pdparams`) and your own data paths, and start fine-tuning.
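The student-weight extraction step can be tried without Paddle on a plain dictionary. The sketch below is a hedged illustration: the helper `extract_student` and the toy keys are hypothetical, and real checkpoint values would be tensors rather than integers. It matches the prefix with `startswith`, because a substring test such as `"Student." in key` would also accept a key that merely contains the prefix somewhere in the middle, and the slice would then strip the wrong characters.

```python
def extract_student(all_params: dict, prefix: str = "Student.") -> dict:
    """Keep only entries whose key starts with prefix, and strip that prefix."""
    return {
        key[len(prefix):]: value
        for key, value in all_params.items()
        if key.startswith(prefix)
    }


# Hypothetical checkpoint keys; real values would be tensors.
toy = {
    "Student.backbone.conv1.weight": 1,
    "Teacher.backbone.conv1.weight": 2,
    "Teacher.Student.alias": 3,
}
print(extract_student(toy))
```

Only the first entry survives, with its key shortened to `backbone.conv1.weight`, which is the naming a standalone student configuration would expect under this assumption.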
 ### 2.2 Detection configuration file walkthrough
 * coming soon!