Merge branch 'PaddlePaddle:dygraph' into dygraph

f1c771bd · d2623587501 · GitHub · ff88312c · 1a7e64bb · f1c771bd
Unverified Commit f1c771bd authored Nov 11, 2021 by d2623587501 Committed by GitHub Nov 11, 2021
20 changed files
--- a/doc/doc_en/add_new_algorithm_en.md
+++ b/doc/doc_en/add_new_algorithm_en.md
-# Add new algorithm
+# Add New Algorithm

 PaddleOCR decomposes an algorithm into the following parts, and modularizes each part to make it more convenient to develop new algorithms.

@@ -263,7 +263,7 @@ Metric:
  main_indicator: acc
 ```

-## 优化器
+## Optimizer

 The optimizer is used to train the network. The optimizer also contains network regularization and learning rate decay modules. This part is under [ppocr/optimizer](../../ppocr/optimizer). PaddleOCR has built-in
 Commonly used optimizer modules such as `Momentum`, `Adam` and `RMSProp`, common regularization modules such as `Linear`, `Cosine`, `Step` and `Piecewise`, and common learning rate decay modules such as `L1Decay` and `L2Decay`.

--- a/doc/doc_en/algorithm_overview_en.md
+++ b/doc/doc_en/algorithm_overview_en.md
+# Two-stage Algorithm
+
+- [1. Algorithm Introduction](#1-algorithm-introduction)
+  * [1.1 Text Detection Algorithm](#11-text-detection-algorithm)
+  * [1.2 Text Recognition Algorithm](#12-text-recognition-algorithm)
+- [2. Training](#2-training)
+- [3. Inference](#3-inference)
+
 <a name="Algorithm_introduction"></a>
-## Algorithm introduction
+
+## 1. Algorithm Introduction

 This tutorial lists the text detection algorithms and text recognition algorithms supported by PaddleOCR, as well as the models and metrics of each algorithm on **English public datasets**. It is mainly used for algorithm introduction and algorithm performance comparison. For more models on other datasets including Chinese, please refer to [PP-OCR v2.0 models list](./models_list_en.md).

@@ -8,31 +17,32 @@ This tutorial lists the text detection algorithms and text recognition algorithm
 - [2. Text Recognition Algorithm](#TEXTRECOGNITIONALGORITHM)

 <a name="TEXTDETECTIONALGORITHM"></a>
-### 1. Text Detection Algorithm
+
+### 1.1 Text Detection Algorithm

 PaddleOCR open source text detection algorithms list:
- [x]  EAST([paper](https://arxiv.org/abs/1704.03155))
- [x]  DB([paper](https://arxiv.org/abs/1911.08947))
- [x]  SAST([paper](https://arxiv.org/abs/1908.05498))
- [x]  PSE([paper](https://arxiv.org/abs/1903.12473v2))
+- [x]  EAST([paper](https://arxiv.org/abs/1704.03155))[2]
+- [x]  DB([paper](https://arxiv.org/abs/1911.08947))[1]
+- [x]  SAST([paper](https://arxiv.org/abs/1908.05498))[4]
+- [x]  PSENet([paper](https://arxiv.org/abs/1903.12473v2)）

 On the ICDAR2015 dataset, the text detection result is as follows:

-|Model|Backbone|precision|recall|Hmean|Download link|
+|Model|Backbone|Precision|Recall|Hmean|Download link|
 | --- | --- | --- | --- | --- | --- |
-|EAST|ResNet50_vd|85.80%|86.71%|86.25%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
-|EAST|MobileNetV3|79.42%|80.64%|80.03%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_east_v2.0_train.tar)|
-|DB|ResNet50_vd|86.41%|78.72%|82.38%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)|
-|DB|MobileNetV3|77.29%|73.08%|75.12%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar)|
-|SAST|ResNet50_vd|91.39%|83.77%|87.42%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
-|PSE|ResNet50_vd|85.81%|79.53%|82.55%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
-|PSE|MobileNetV3|82.20%|70.48%|75.89%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
+|EAST|ResNet50_vd|85.80%|86.71%|86.25%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
+|EAST|MobileNetV3|79.42%|80.64%|80.03%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_east_v2.0_train.tar)|
+|DB|ResNet50_vd|86.41%|78.72%|82.38%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)|
+|DB|MobileNetV3|77.29%|73.08%|75.12%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar)|
+|SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
+|PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trianed model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
+|PSE|MobileNetV3|82.20%|70.48%|75.89%|[trianed model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|

 On Total-Text dataset, the text detection result is as follows:

-|Model|Backbone|precision|recall|Hmean|Download link|
+|Model|Backbone|Precision|Recall|Hmean|Download link|
 | --- | --- | --- | --- | --- | --- |
-|SAST|ResNet50_vd|89.63%|78.44%|83.66%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_totaltext_v2.0_train.tar)|
+|SAST|ResNet50_vd|89.63%|78.44%|83.66%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_totaltext_v2.0_train.tar)|

 **Note：** Additional data, like icdar2013, icdar2017, COCO-Text, ArT, was added to the model training of SAST. Download English public dataset in organized format used by PaddleOCR from:
 * [Baidu Drive](https://pan.baidu.com/s/12cPnZcVuV1zn5DOd4mqjVw) (download code: 2bpi).
@@ -41,31 +51,41 @@ On Total-Text dataset, the text detection result is as follows:
 For the training guide and use of PaddleOCR text detection algorithms, please refer to the document [Text detection model training/evaluation/prediction](./detection_en.md)

 <a name="TEXTRECOGNITIONALGORITHM"></a>
-### 2. Text Recognition Algorithm
+### 1.2 Text Recognition Algorithm

 PaddleOCR open-source text recognition algorithms list:
- [x]  CRNN([paper](https://arxiv.org/abs/1507.05717))
- [x]  Rosetta([paper](https://arxiv.org/abs/1910.05085))
- [x]  STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
- [x]  RARE([paper](https://arxiv.org/abs/1603.03915v1))
- [x]  SRN([paper](https://arxiv.org/abs/2003.12294))
- [x]  NRTR([paper](https://arxiv.org/abs/1806.00926v2))
+- [x]  CRNN([paper](https://arxiv.org/abs/1507.05717))[7]
+- [x]  Rosetta([paper](https://arxiv.org/abs/1910.05085))[10]
+- [x]  STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))[11]
+- [x]  RARE([paper](https://arxiv.org/abs/1603.03915v1))[12]
+- [x]  SRN([paper](https://arxiv.org/abs/2003.12294))[5]
+- [x]  NRTR([paper](https://arxiv.org/abs/1806.00926v2))[13]
 - [x]  SAR([paper](https://arxiv.org/abs/1811.00751v2))
+- [x] SEED([paper](https://arxiv.org/pdf/2005.10977.pdf))

 Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation result of these above text recognition (using MJSynth and SynthText for training, evaluate on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE) is as follow:

 |Model|Backbone|Avg Accuracy|Module combination|Download link|
 |---|---|---|---|---|
-|Rosetta|Resnet34_vd|80.9%|rec_r34_vd_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_none_ctc_v2.0_train.tar)|
-|Rosetta|MobileNetV3|78.05%|rec_mv3_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_none_ctc_v2.0_train.tar)|
-|CRNN|Resnet34_vd|82.76%|rec_r34_vd_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
-|CRNN|MobileNetV3|79.97%|rec_mv3_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
-|StarNet|Resnet34_vd|84.44%|rec_r34_vd_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
-|StarNet|MobileNetV3|81.42%|rec_mv3_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
-|RARE|MobileNetV3|82.5%|rec_mv3_tps_bilstm_att |[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_att_v2.0_train.tar)|
-|RARE|Resnet34_vd|83.6%|rec_r34_vd_tps_bilstm_att |[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_att_v2.0_train.tar)|
-|SRN|Resnet50_vd_fpn| 88.52% | rec_r50fpn_vd_none_srn |[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r50_vd_srn_train.tar)|
-|NRTR|NRTR_MTB| 84.3% | rec_mtb_nrtr | [Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mtb_nrtr_train.tar) |
-|SAR|Resnet31| 87.2% | rec_r31_sar | [Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_r31_sar_train.tar) |
-
-Please refer to the document for training guide and use of PaddleOCR text recognition algorithms [Text recognition model training/evaluation/prediction](./recognition_en.md)
+|Rosetta|Resnet34_vd|80.9%|rec_r34_vd_none_none_ctc|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_none_ctc_v2.0_train.tar)|
+|Rosetta|MobileNetV3|78.05%|rec_mv3_none_none_ctc|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_none_ctc_v2.0_train.tar)|
+|CRNN|Resnet34_vd|82.76%|rec_r34_vd_none_bilstm_ctc|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
+|CRNN|MobileNetV3|79.97%|rec_mv3_none_bilstm_ctc|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
+|StarNet|Resnet34_vd|84.44%|rec_r34_vd_tps_bilstm_ctc|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
+|StarNet|MobileNetV3|81.42%|rec_mv3_tps_bilstm_ctc|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
+|RARE|MobileNetV3|82.5%|rec_mv3_tps_bilstm_att |[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_att_v2.0_train.tar)|
+|RARE|Resnet34_vd|83.6%|rec_r34_vd_tps_bilstm_att |[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_att_v2.0_train.tar)|
+|SRN|Resnet50_vd_fpn| 88.52% | rec_r50fpn_vd_none_srn |[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r50_vd_srn_train.tar)|
+|NRTR|NRTR_MTB| 84.3% | rec_mtb_nrtr | [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mtb_nrtr_train.tar) |
+|SAR|Resnet31| 87.2% | rec_r31_sar | [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_r31_sar_train.tar) |
+|SEED|Aster_Resnet| 85.2% | rec_resnet_stn_bilstm_att | [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_resnet_stn_bilstm_att.tar) |
+
+Please refer to the document for training guide and use of PaddleOCR 
+
+## 2. Training
+
+For the training guide and use of PaddleOCR text detection algorithms, please refer to the document [Text detection model training/evaluation/prediction](./detection_en.md). For text recognition algorithms, please refer to [Text recognition model training/evaluation/prediction](./recognition_en.md)
+
+## 3. Inference
+
+Except for the PP-OCR series models of the above models, the other models only support inference based on the Python engine. For details, please refer to [Inference based on Python prediction engine](./inference_en.md)
--- a/doc/doc_en/angle_class_en.md
+++ b/doc/doc_en/angle_class_en.md
@@ -15,7 +15,6 @@ The text line image obtained after text detection is sent to the recognition mod
 Example of 0 and 180 degree data samples：

 ![](../imgs_results/angle_class_example.jpg)
-### DATA PREPARATION

 <a name="data-preparation"></a>
 ## 2. Data Preparation
@@ -131,8 +130,6 @@ python3 tools/eval.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/
 <a name="prediction"></a>
 ## 5. Prediction

-### PREDICTION
-
 * Training engine prediction

 Using the model trained by paddleocr, you can quickly get prediction through the following script.

--- a/doc/doc_en/config_en.md
+++ b/doc/doc_en/config_en.md
-# Configuration
+# Configuration 

 - [1. Optional Parameter List](#1-optional-parameter-list)
 - [2. Intorduction to Global Parameters of Configuration File](#2-intorduction-to-global-parameters-of-configuration-file)
@@ -36,10 +36,10 @@ Take rec_chinese_lite_train_v2.0.yml as an example
 |      pretrained_model    |    Set the path of the pre-trained model      |  ./pretrain_models/CRNN/best_accuracy  |  \          |
 |      checkpoints         |    set model parameter path            |       None        |   Used to load parameters after interruption to continue training|
 |      use_visualdl  |    Set whether to enable visualdl for visual log display |          False        |    [Tutorial](https://www.paddlepaddle.org.cn/paddle/visualdl) |
-|      infer_img            |    Set inference image path or folder path     |       ./infer_img | \|
-|      character_dict_path |    Set dictionary path            |  ./ppocr/utils/ppocr_keys_v1.txt  |    If the character_dict_path is None, model can only recognize number and lower letters                 |
+|      infer_img            |    Set inference image path or folder path     |       ./infer_img | \||
+|      character_dict_path |    Set dictionary path            |  ./ppocr/utils/ppocr_keys_v1.txt  | If the character_dict_path is None, model can only recognize number and lower letters |
 |      max_text_length     |    Set the maximum length of text        |       25          |                \                 |
-|      use_space_char     |    Set whether to recognize spaces             |        True      |          Only support in character_type=ch mode                 |
+|      use_space_char     |    Set whether to recognize spaces             |        True      |          \|               |
 |      label_list          |    Set the angle supported by the direction classifier       |    ['0','180']    |     Only valid in angle classifier model |
 |      save_res_path          |    Set the save address of the test model results       |    ./output/det_db/predicts_db.txt    |     Only valid in the text detection model |

@@ -196,21 +196,21 @@ Italian is made up of Latin letters, so after executing the command, you will ge
      epoch_num: 500
      ...
      character_dict_path:  {path/of/dict} # path of dict
-
+   
   Train:
      dataset:
        name: SimpleDataSet
        data_dir: train_data/ # root directory of training data
        label_file_list: ["./train_data/train_list.txt"] # train label path
      ...
-
+   
   Eval:
      dataset:
        name: SimpleDataSet
        data_dir: train_data/ # root directory of val data
        label_file_list: ["./train_data/val_list.txt"] # val label path
      ...
-
+   
   ```


@@ -220,12 +220,12 @@ Currently, the multi-language algorithms supported by PaddleOCR are:
 | :--------: |  :-------:   | :-------:  |   :-------:   |   :-----:   |  :-----:   | :-----:  |
 | rec_chinese_cht_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | chinese traditional  |
 | rec_en_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | English(Case sensitive)   |
-| rec_french_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | French |  
+| rec_french_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | French |
 | rec_ger_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | German   |
 | rec_japan_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | Japanese |
 | rec_korean_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | Korean  |
 | rec_latin_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | Latin  |
-| rec_arabic_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | arabic |  
+| rec_arabic_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | arabic |
 | rec_cyrillic_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | cyrillic   |
 | rec_devanagari_lite_train.yml |  CRNN |   Mobilenet_v3 small 0.5 |  None   |  BiLSTM |  ctc  | devanagari  |


--- a/doc/doc_en/detection_en.md
+++ b/doc/doc_en/detection_en.md
@@ -98,14 +98,14 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml -o   \
 # multi-GPU training
 # Set the GPU ID used by the '--gpus' parameter.
 python3 -m paddle.distributed.launch --gpus '0,1,2,3'  tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
- 
+
 # multi-Node, multi-GPU training
 # Set the IPs of your nodes used by the '--ips' parameter. Set the GPU ID used by the '--gpus' parameter.
 python3 -m paddle.distributed.launch --ips="xx.xx.xx.xx,xx.xx.xx.xx" --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml \
     -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
 ```
-**Note:** For multi-Node multi-GPU training, you need to replace the `ips` value in the preceding command with the address of your machine, and the machines must be able to ping each other. The command for viewing the IP address of the machine is `ifconfig`.
- 
+**Note:** For multi-Node multi-GPU training, you need to replace the `ips` value in the preceding command with the address of your machine, and the machines must be able to ping each other. In addition, it requires activating commands separately on multiple machines when we start the training. The command for viewing the IP address of the machine is `ifconfig`.
+
 If you want to further speed up the training, you can use [automatic mixed precision training](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/01_paddle2.0_introduction/basic_concept/amp_en.html). for single card training, the command is as follows:
 ```
 python3 tools/train.py -c configs/det/det_mv3_db.yml \
@@ -235,6 +235,7 @@ python3 tools/infer/predict_det.py --det_algorithm="EAST" --det_model_dir="./out
 ## 5. FAQ

 Q1: The prediction results of trained model and inference model are inconsistent?
+
 **A**: Most of the problems are caused by the inconsistency of the pre-processing and post-processing parameters during the prediction of the trained model and the pre-processing and post-processing parameters during the prediction of the inference model. Taking the model trained by the det_mv3_db.yml configuration file as an example, the solution to the problem of inconsistent prediction results between the training model and the inference model is as follows:
 - Check whether the [trained model preprocessing](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/configs/det/det_mv3_db.yml#L116) is consistent with the prediction [preprocessing function of the inference model](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/tools/infer/predict_det.py#L42). When the algorithm is evaluated, the input image size will affect the accuracy. In order to be consistent with the paper, the image is resized to [736, 1280] in the training icdar15 configuration file, but there is only a set of default parameters when the inference model predicts, which will be considered To predict the speed problem, the longest side of the image is limited to 960 for resize by default. The preprocessing function of the training model preprocessing and the inference model is located in [ppocr/data/imaug/operators.py](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/ppocr/data/imaug/operators.py#L147)
 - Check whether the [post-processing of the trained model](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/configs/det/det_mv3_db.yml#L51) is consistent with the [post-processing parameters of the inference](https://github.com/PaddlePaddle/PaddleOCR/blob/c1ed243fb68d5d466258243092e56cbae32e2c14/tools/infer/utility.py#L50).
--- a/doc/doc_en/environment_en.md
+++ b/doc/doc_en/environment_en.md
 # Environment Preparation

+Windows and Mac users are recommended to use Anaconda to build a Python environment, and Linux users are recommended to use docker to build a Python environment. If you are familiar with the Python environment, you can skip to step 2 to install PaddlePaddle.
+
 Recommended working environment:
 - PaddlePaddle >= 2.0.0 (2.1.2)
 - python3.7

--- a/doc/doc_en/inference_ppocr_en.md
+++ b/doc/doc_en/inference_ppocr_en.md

-# Python Inference for PP-OCR Model Library
+# Python Inference for PP-OCR Model Zoo

 This article introduces the use of the Python inference engine for the PP-OCR model library. The content is in order of text detection, text recognition, direction classifier and the prediction method of the three in series on the CPU and GPU.


 - [Text Detection Model Inference](#DETECTION_MODEL_INFERENCE)
-
 - [Text Recognition Model Inference](#RECOGNITION_MODEL_INFERENCE)
    - [1. Lightweight Chinese Recognition Model Inference](#LIGHTWEIGHT_RECOGNITION)
    - [2. Multilingaul Model Inference](#MULTILINGUAL_MODEL_INFERENCE)
-    
 - [Angle Classification Model Inference](#ANGLE_CLASS_MODEL_INFERENCE)
-
 - [Text Detection Angle Classification and Recognition Inference Concatenation](#CONCATENATION)

 <a name="DETECTION_MODEL_INFERENCE"></a>
@@ -22,10 +19,10 @@ The default configuration is based on the inference setting of the DB text detec

 ```
 # download DB text detection inference model
-wget  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
-tar xf ch_ppocr_mobile_v2.0_det_infer.tar
-# predict
-python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/"
+wget  https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar
+tar xf ch_PP-OCRv2_det_infer.tar
+# run inference
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./ch_PP-OCRv2_det_infer.tar/"
 ```

 The visual text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with'det_res'. Examples of results are as follows:
@@ -42,12 +39,12 @@ Set as `limit_type='min', det_limit_side_len=960`, it means that the shortest si

 If the resolution of the input picture is relatively large and you want to use a larger resolution prediction, you can set det_limit_side_len to the desired value, such as 1216:
 ```
-python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./inference/det_db/" --det_limit_type=max --det_limit_side_len=1216
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./inference/ch_PP-OCRv2_det_infer/" --det_limit_type=max --det_limit_side_len=1216
 ```

 If you want to use the CPU for prediction, execute the command as follows
 ```
-python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./inference/det_db/" --use_gpu=False
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_dir="./inference/ch_PP-OCRv2_det_infer/"  --use_gpu=False
 ```

 <a name="RECOGNITION_MODEL_INFERENCE"></a>
@@ -62,9 +59,10 @@ For lightweight Chinese recognition model inference, you can execute the followi

 ```
 # download CRNN text recognition inference model
-wget  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
-tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
-python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_10.png" --rec_model_dir="ch_ppocr_mobile_v2.0_rec_infer"
+wget  https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar
+tar xf ch_PP-OCRv2_rec_infer.tar
+# run inference
+python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/ch/word_4.jpg" --rec_model_dir="./ch_PP-OCRv2_rec_infer/"
 ```

 ![](../imgs_words_en/word_10.png)
@@ -78,10 +76,12 @@ Predicts of ./doc/imgs_words_en/word_10.png:('PAIN', 0.9897658)
 <a name="MULTILINGUAL_MODEL_INFERENCE"></a>

 ### 2. Multilingaul Model Inference
-If you need to predict other language models, when using inference model prediction, you need to specify the dictionary path used by `--rec_char_dict_path`. At the same time, in order to get the correct visualization results,
+If you need to predict [other language models](./models_list_en.md#Multilingual), when using inference model prediction, you need to specify the dictionary path used by `--rec_char_dict_path`. At the same time, in order to get the correct visualization results,
 You need to specify the visual font path through `--vis_font_path`. There are small language fonts provided by default under the `doc/fonts` path, such as Korean recognition:

 ```
+wget wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar
+
 python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_type="korean" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/fonts/korean.ttf"
 ```
 ![](../imgs_words/korean/1.jpg)
@@ -120,13 +120,13 @@ When performing prediction, you need to specify the path of a single image or a

 ```shell
 # use direction classifier
-python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true
+python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/ch_PP-OCRv2_det_infer/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/ch_PP-OCRv2_rec_infer/" --use_angle_cls=true

 # not use use direction classifier
-python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/"
+python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/ch_PP-OCRv2_det_infer/" --rec_model_dir="./inference/ch_PP-OCRv2_rec_infer/" --use_angle_cls=false

 # use multi-process
-python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=false --use_mp=True --total_process_num=6
+python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/ch_PP-OCRv2_det_infer/" --rec_model_dir="./inference/ch_PP-OCRv2_rec_infer/" --use_angle_cls=false --use_mp=True --total_process_num=6
 ```



--- a/doc/doc_en/models_list_en.md
+++ b/doc/doc_en/models_list_en.md
-## OCR model list（V2.1, updated on 2021.9.6）
+# OCR Model List（V2.1, updated on 2021.9.6）
 > **Note**
 > 1. Compared with the model v2.0, the 2.1 version of the detection model has a improvement in accuracy, and the 2.1 version of the recognition model is optimized in accuracy and CPU speed.
 > 2. Compared with [models 1.1](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md), which are trained with static graph programming paradigm, models 2.0 are the dynamic graph trained version and achieve close performance.
@@ -6,9 +6,9 @@

 - [1. Text Detection Model](#Detection)
 - [2. Text Recognition Model](#Recognition)
-    - [Chinese Recognition Model](#Chinese)
-    - [English Recognition Model](#English)
-    - [Multilingual Recognition Model](#Multilingual)
+    - [2.1 Chinese Recognition Model](#Chinese)
+    - [2.2 English Recognition Model](#English)
+    - [2.3 Multilingual Recognition Model](#Multilingual)
 - [3. Text Angle Classification Model](#Angle)
 - [4. Paddle-Lite Model](#Paddle-Lite)

@@ -25,26 +25,26 @@ Relationship of the above models is as follows.
 ![](../imgs_en/model_prod_flow_en.png)

 <a name="Detection"></a>
-### 1. Text Detection Model
+## 1. Text Detection Model

 |model name|description|config|model size|download|
 | --- | --- | --- | --- | --- |
-|ch_PP-OCRv2_det_slim|slim quantization with distillation lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_det_cml.yml](../../configs/det/ch_PP-OCRv2/ch_PP-OCR_det_cml.yml)| 3M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_quant_infer.tar)|
-|ch_PP-OCRv2_det|Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_det_cml.yml](../../configs/det/ch_PP-OCRv2/ch_PP-OCR_det_cml.yml)|3M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)|
+|ch_PP-OCRv2_det_slim|[New] slim quantization with distillation lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_det_cml.yml](../../configs/det/ch_PP-OCRv2/ch_PP-OCR_det_cml.yml)| 3M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_quant_infer.tar)|
+|ch_PP-OCRv2_det|[New] Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_det_cml.yml](../../configs/det/ch_PP-OCRv2/ch_PP-OCR_det_cml.yml)|3M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)|
 |ch_ppocr_mobile_slim_v2.0_det|Slim pruned lightweight model, supporting Chinese, English, multilingual text detection|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)|2.6M |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_det_prune_infer.tar)|
 |ch_ppocr_mobile_v2.0_det|Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)|3M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|
 |ch_ppocr_server_v2.0_det|General model, which is larger than the lightweight model, but achieved better performance|[ch_det_res18_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml)|47M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)|

 <a name="Recognition"></a>
-### 2. Text Recognition Model
+## 2. Text Recognition Model

 <a name="Chinese"></a>
-#### Chinese Recognition Model
+### 2.1 Chinese Recognition Model

 |model name|description|config|model size|download|
 | --- | --- | --- | --- | --- |
-|ch_PP-OCRv2_rec_slim|Slim qunatization with distillation lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_rec.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml)| 9M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_train.tar) |
-|ch_PP-OCRv2_rec|Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_rec.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml)|8.5M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar) |
+|ch_PP-OCRv2_rec_slim|[New] Slim qunatization with distillation lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_rec.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml)| 9M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_train.tar) |
+|ch_PP-OCRv2_rec|[New] Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_rec.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml)|8.5M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar) |
 |ch_ppocr_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)| 6M | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_slim_train.tar) |
 |ch_ppocr_mobile_v2.0_rec|Original lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)|5.2M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
 |ch_ppocr_server_v2.0_rec|General model, supporting Chinese, English and number recognition|[rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml)|94.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
@@ -53,7 +53,7 @@ Relationship of the above models is as follows.
 **Note:** The `trained model` is finetuned on the `pre-trained model` with real data and synthsized vertical text data, which achieved better performance in real scene. The `pre-trained model` is directly trained on the full amount of real data and synthsized data, which is more suitable for finetune on your own dataset.

 <a name="English"></a>
-#### English Recognition Model
+### 2.2 English Recognition Model

 |model name|description|config|model size|download|
 | --- | --- | --- | --- | --- |
@@ -61,7 +61,7 @@ Relationship of the above models is as follows.
 |en_number_mobile_v2.0_rec|Original lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)|2.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_train.tar) |

 <a name="Multilingual"></a>
-#### Multilingual Recognition Model（Updating...）
+### 2.3 Multilingual Recognition Model（Updating...）

 |model name| dict file | description|config|model size|download|
 | --- | --- | --- |--- | --- | --- |
@@ -82,7 +82,7 @@ For more supported languages, please refer to : [Multi-language model](./multi_l


 <a name="Angle"></a>
-### 3. Text Angle Classification Model
+## 3. Text Angle Classification Model

 |model name|description|config|model size|download|
 | --- | --- | --- | --- | --- |
@@ -90,7 +90,7 @@ For more supported languages, please refer to : [Multi-language model](./multi_l
 |ch_ppocr_mobile_v2.0_cls|Original model for text angle classification|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|1.38M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |

 <a name="Paddle-Lite"></a>
-### 4. Paddle-Lite Model
+## 4. Paddle-Lite Model
 |Version|Introduction|Model size|Detection model|Text Direction model|Recognition model|Paddle-Lite branch|
 |---|---|---|---|---|---|---|
 |PP-OCRv2|extra-lightweight chinese OCR optimized model|11M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer_opt.nb)|v2.9|

--- a/doc/doc_en/pgnet_en.md
+++ b/doc/doc_en/pgnet_en.md
@@ -24,9 +24,9 @@ The results of detection and recognition are as follows:
 ![](../imgs_results/e2e_res_img293_pgnet.png)
 ![](../imgs_results/e2e_res_img295_pgnet.png)
 ### Performance
-####Test set: Total Text
+#### Test set: Total Text

-####Test environment: NVIDIA Tesla V100-SXM2-16GB
+#### Test environment: NVIDIA Tesla V100-SXM2-16GB
 |PGNetA|det_precision|det_recall|det_f_score|e2e_precision|e2e_recall|e2e_f_score|FPS|download|
 | --- | --- | --- | --- | --- | --- | --- | --- | --- |
 |Paper|85.30|86.80|86.1|-|-|61.7|38.20 (size=640)|-|
@@ -36,7 +36,7 @@ The results of detection and recognition are as follows:

 <a name="Environment_Configuration"></a>
 ## 2. Environment Configuration
-Please refer to [Quick Installation](./installation_en.md) Configure the PaddleOCR running environment.
+Please refer to [Operation Environment Preparation](./environment_en.md) to configure PaddleOCR operating environment first, refer to [PaddleOCR Overview and Project Clone](./paddleOCR_overview_en.md) to clone the project

 <a name="Quick_Use"></a>
 ## 3. Quick Use

--- a/doc/doc_en/quickstart_en.md
+++ b/doc/doc_en/quickstart_en.md
@@ -53,10 +53,10 @@ If you do not use the provided test image, you can replace the following `--imag

 #### 2.1.1 Chinese and English Model

-* Detection, direction classification and recognition: set the direction classifier parameter`--use_angle_cls true` to recognize vertical text.
+* Detection, direction classification and recognition: set the parameter`--use_gpu false` to disable the gpu device

  ```bash
-  paddleocr --image_dir ./imgs_en/img_12.jpg --use_angle_cls true --lang en
+  paddleocr --image_dir ./imgs_en/img_12.jpg --use_angle_cls true --lang en --use_gpu false
  ```

  Output will be a list, each item contains bounding box, text and recognition confidence

--- a/doc/doc_en/training_en.md
+++ b/doc/doc_en/training_en.md
@@ -25,11 +25,10 @@ The PaddleOCR model uses configuration files to manage network training and eval
 For the complete configuration file description, please refer to [Configuration File](./config_en.md)

 <a name="1-basic-concepts"></a>
-# 1. Basic concepts

 ## 2. Basic Concepts

-The following parameters need to be paid attention to when tuning the model:
+In the process of model training, some hyperparameters need to be manually adjusted to help the model obtain the optimal index at the least loss. Different data volumes may require different hyper-parameters. When you want to finetune your own data or tune the model effect, there are several parameter adjustment strategies for reference:

 <a name="11-learning-rate"></a>
 ### 2.1 Learning Rate
@@ -53,7 +52,7 @@ and the learning rate is the same in each stage.
 warmup_epoch means that in the first 5 epochs, the learning rate will gradually increase from 0 to base_lr. For all strategies, please refer to the code [learning_rate.py](../../ppocr/optimizer/learning_rate.py).

 <a name="12-regularization"></a>
-## 1.2 Regularization
+### 2.2 Regularization

 Regularization can effectively avoid algorithm overfitting. PaddleOCR provides L1 and L2 regularization methods.
 L1 and L2 regularization are the most commonly used regularization methods.
@@ -125,7 +124,7 @@ There are several experiences for reference when constructing the data set:

 <a name="3-faq"></a>

-# 3. FAQ
+## 4. FAQ

 **Q**: How to choose a suitable network input shape when training CRNN recognition?

@@ -147,9 +146,10 @@ There are several experiences for reference when constructing the data set:

    A: It is normal for the acc to be 0 at the beginning of the recognition model training, and the indicator will come up after a longer training period.

-
 ***
+
 Click the following links for detailed training tutorial:  
+
 - [text detection model training](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/detection.md)  
 - [text recognition model training](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/recognition.md)  
 - [text direction classification model training](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/angle_class.md)  
--- a/doc/doc_en/whl_en.md
+++ b/doc/doc_en/whl_en.md
@@ -366,4 +366,6 @@ im_show.save('result.jpg')
 | rec                     | Enable recognition when `ppocr.ocr` func exec                                                                                                                                                                                                   | TRUE                    |
 | cls                     | Enable classification when `ppocr.ocr` func exec((Use use_angle_cls in command line mode to control whether to start classification in the forward direction)                                                                                                                                                                                                   | FALSE                    |
 | show_log                     | Whether to print log in det and rec | FALSE                    |
-| type                     | Perform ocr or table structuring, the value is selected in ['ocr','structure']                                                                                                                                                                                             | ocr                    |
\ No newline at end of file
+| type                     | Perform ocr or table structuring, the value is selected in ['ocr','structure']                                                                                                                                                                                             | ocr                    |
+| ocr_version                     | OCR Model version number, the current model support list is as follows: PP-OCRv2 support Chinese detection and recognition model, PP-OCR support Chinese detection, recognition and direction classifier, multilingual recognition model | PP-OCRv2                 |
+| structure_version                     | table structure Model version number, the current model support list is as follows: STRUCTURE support english table structure model | STRUCTURE                 |
--- a/doc/joinus.PNG
+++ b/doc/joinus.PNG
--- a/doc/pr.png
+++ b/doc/pr.png
--- a/paddleocr.py
+++ b/paddleocr.py
@@ -16,6 +16,9 @@ import os
 import sys

 __dir__ = os.path.dirname(__file__)
+
+import paddle
+
 sys.path.append(os.path.join(__dir__, ''))

 import cv2
@@ -29,7 +32,7 @@ from ppocr.utils.logging import get_logger
 logger = get_logger()
 from ppocr.utils.utility import check_and_read_gif, get_image_file_list
 from ppocr.utils.network import maybe_download, download_with_progressbar, is_link, confirm_model_dir_url
-from tools.infer.utility import draw_ocr, str2bool
+from tools.infer.utility import draw_ocr, str2bool, check_gpu
 from ppstructure.utility import init_args, draw_structure_result
 from ppstructure.predict_system import OCRSystem, save_structure_res

@@ -39,130 +42,137 @@ __all__ = [
 ]

 SUPPORT_DET_MODEL = ['DB']
-VERSION = '2.2.1'
+VERSION = '2.3.0.1'
 SUPPORT_REC_MODEL = ['CRNN']
 BASE_DIR = os.path.expanduser("~/.paddleocr/")

-DEFAULT_MODEL_VERSION = '2.0'
+DEFAULT_OCR_MODEL_VERSION = 'PP-OCR'
+DEFAULT_STRUCTURE_MODEL_VERSION = 'STRUCTURE'
 MODEL_URLS = {
-    '2.1': {
-        'det': {
-            'ch': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar',
-            },
-        },
-        'rec': {
-            'ch': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar',
-                'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
-            }
-        }
-    },
-    '2.0': {
-        'det': {
-            'ch': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar',
-            },
-            'en': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_ppocr_mobile_v2.0_det_infer.tar',
+    'OCR': {
+        'PP-OCRv2': {
+            'det': {
+                'ch': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar',
+                },
            },
-            'structure': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar'
+            'rec': {
+                'ch': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar',
+                    'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
+                }
            }
        },
-        'rec': {
-            'ch': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
-            },
-            'en': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/en_dict.txt'
-            },
-            'french': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/french_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/french_dict.txt'
-            },
-            'german': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/german_dict.txt'
-            },
-            'korean': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/korean_dict.txt'
-            },
-            'japan': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/japan_dict.txt'
-            },
-            'chinese_cht': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/chinese_cht_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/chinese_cht_dict.txt'
-            },
-            'ta': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ta_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/ta_dict.txt'
+        DEFAULT_OCR_MODEL_VERSION: {
+            'det': {
+                'ch': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar',
+                },
+                'en': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_ppocr_mobile_v2.0_det_infer.tar',
+                },
+                'structure': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar'
+                }
            },
-            'te': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/te_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/te_dict.txt'
+            'rec': {
+                'ch': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
+                },
+                'en': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/en_dict.txt'
+                },
+                'french': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/french_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/french_dict.txt'
+                },
+                'german': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/german_dict.txt'
+                },
+                'korean': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/korean_dict.txt'
+                },
+                'japan': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/japan_dict.txt'
+                },
+                'chinese_cht': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/chinese_cht_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/chinese_cht_dict.txt'
+                },
+                'ta': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ta_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/ta_dict.txt'
+                },
+                'te': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/te_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/te_dict.txt'
+                },
+                'ka': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ka_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/ka_dict.txt'
+                },
+                'latin': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/latin_ppocr_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/latin_dict.txt'
+                },
+                'arabic': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/arabic_ppocr_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/arabic_dict.txt'
+                },
+                'cyrillic': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/cyrillic_ppocr_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/cyrillic_dict.txt'
+                },
+                'devanagari': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/devanagari_ppocr_mobile_v2.0_rec_infer.tar',
+                    'dict_path': './ppocr/utils/dict/devanagari_dict.txt'
+                },
+                'structure': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar',
+                    'dict_path': 'ppocr/utils/dict/table_dict.txt'
+                }
            },
-            'ka': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ka_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/ka_dict.txt'
+            'cls': {
+                'ch': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar',
+                }
            },
-            'latin': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/latin_ppocr_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/latin_dict.txt'
-            },
-            'arabic': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/arabic_ppocr_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/arabic_dict.txt'
-            },
-            'cyrillic': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/cyrillic_ppocr_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/cyrillic_dict.txt'
-            },
-            'devanagari': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/devanagari_ppocr_mobile_v2.0_rec_infer.tar',
-                'dict_path': './ppocr/utils/dict/devanagari_dict.txt'
-            },
-            'structure': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar',
-                'dict_path': 'ppocr/utils/dict/table_dict.txt'
-            }
-        },
-        'cls': {
-            'ch': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar',
-            }
-        },
-        'table': {
-            'en': {
-                'url':
-                'https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar',
-                'dict_path': 'ppocr/utils/dict/table_structure_dict.txt'
+        }
+    },
+    'STRUCTURE': {
+        DEFAULT_STRUCTURE_MODEL_VERSION: {
+            'table': {
+                'en': {
+                    'url':
+                    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar',
+                    'dict_path': 'ppocr/utils/dict/table_structure_dict.txt'
+                }
            }
        }
    }
@@ -177,7 +187,20 @@ def parse_args(mMain=True):
    parser.add_argument("--det", type=str2bool, default=True)
    parser.add_argument("--rec", type=str2bool, default=True)
    parser.add_argument("--type", type=str, default='ocr')
-    parser.add_argument("--version", type=str, default='2.1')
+    parser.add_argument(
+        "--ocr_version",
+        type=str,
+        default='PP-OCRv2',
+        help='OCR Model version, the current model support list is as follows: '
+        '1. PP-OCRv2 Support Chinese detection and recognition model. '
+        '2. PP-OCR support Chinese detection, recognition and direction classifier and multilingual recognition model.'
+    )
+    parser.add_argument(
+        "--structure_version",
+        type=str,
+        default='STRUCTURE',
+        help='Model version, the current model support list is as follows:'
+        ' 1. STRUCTURE Support en table structure model.')

    for action in parser._actions:
        if action.dest in ['rec_char_dict_path', 'table_char_dict_path']:
@@ -215,9 +238,9 @@ def parse_lang(lang):
        lang = "cyrillic"
    elif lang in devanagari_lang:
        lang = "devanagari"
-    assert lang in MODEL_URLS[DEFAULT_MODEL_VERSION][
+    assert lang in MODEL_URLS['OCR'][DEFAULT_OCR_MODEL_VERSION][
        'rec'], 'param lang must in {}, but got {}'.format(
-            MODEL_URLS[DEFAULT_MODEL_VERSION]['rec'].keys(), lang)
+            MODEL_URLS['OCR'][DEFAULT_OCR_MODEL_VERSION]['rec'].keys(), lang)
    if lang == "ch":
        det_lang = "ch"
    elif lang == 'structure':
@@ -227,33 +250,41 @@ def parse_lang(lang):
    return lang, det_lang


-def get_model_config(version, model_type, lang):
-    if version not in MODEL_URLS:
-        logger.warning('version {} not in {}, use version {} instead'.format(
-            version, MODEL_URLS.keys(), DEFAULT_MODEL_VERSION))
+def get_model_config(type, version, model_type, lang):
+    if type == 'OCR':
+        DEFAULT_MODEL_VERSION = DEFAULT_OCR_MODEL_VERSION
+    elif type == 'STRUCTURE':
+        DEFAULT_MODEL_VERSION = DEFAULT_STRUCTURE_MODEL_VERSION
+    else:
+        raise NotImplementedError
+    model_urls = MODEL_URLS[type]
+    if version not in model_urls:
+        logger.warning('version {} not in {}, auto switch to version {}'.format(
+            version, model_urls.keys(), DEFAULT_MODEL_VERSION))
        version = DEFAULT_MODEL_VERSION
-    if model_type not in MODEL_URLS[version]:
-        if model_type in MODEL_URLS[DEFAULT_MODEL_VERSION]:
+    if model_type not in model_urls[version]:
+        if model_type in model_urls[DEFAULT_MODEL_VERSION]:
            logger.warning(
-                'version {} not support {} models, use version {} instead'.
+                'version {} not support {} models, auto switch to version {}'.
                format(version, model_type, DEFAULT_MODEL_VERSION))
            version = DEFAULT_MODEL_VERSION
        else:
            logger.error('{} models is not support, we only support {}'.format(
-                model_type, MODEL_URLS[DEFAULT_MODEL_VERSION].keys()))
+                model_type, model_urls[DEFAULT_MODEL_VERSION].keys()))
            sys.exit(-1)
-    if lang not in MODEL_URLS[version][model_type]:
-        if lang in MODEL_URLS[DEFAULT_MODEL_VERSION][model_type]:
-            logger.warning('lang {} is not support in {}, use {} instead'.
-                           format(lang, version, DEFAULT_MODEL_VERSION))
+    if lang not in model_urls[version][model_type]:
+        if lang in model_urls[DEFAULT_MODEL_VERSION][model_type]:
+            logger.warning(
+                'lang {} is not support in {}, auto switch to version {}'.
+                format(lang, version, DEFAULT_MODEL_VERSION))
            version = DEFAULT_MODEL_VERSION
        else:
            logger.error(
                'lang {} is not support, we only support {} for {} models'.
-                format(lang, MODEL_URLS[DEFAULT_MODEL_VERSION][model_type].keys(
+                format(lang, model_urls[DEFAULT_MODEL_VERSION][model_type].keys(
                ), model_type))
            sys.exit(-1)
-    return MODEL_URLS[version][model_type][lang]
+    return model_urls[version][model_type][lang]


 class PaddleOCR(predict_system.TextSystem):
@@ -265,23 +296,28 @@ class PaddleOCR(predict_system.TextSystem):
        """
        params = parse_args(mMain=False)
        params.__dict__.update(**kwargs)
+        params.use_gpu = check_gpu(params.use_gpu)
+
        if not params.show_log:
            logger.setLevel(logging.INFO)
        self.use_angle_cls = params.use_angle_cls
        lang, det_lang = parse_lang(params.lang)

        # init model dir
-        det_model_config = get_model_config(params.version, 'det', det_lang)
+        det_model_config = get_model_config('OCR', params.ocr_version, 'det',
+                                            det_lang)
        params.det_model_dir, det_url = confirm_model_dir_url(
            params.det_model_dir,
            os.path.join(BASE_DIR, VERSION, 'ocr', 'det', det_lang),
            det_model_config['url'])
-        rec_model_config = get_model_config(params.version, 'rec', lang)
+        rec_model_config = get_model_config('OCR', params.ocr_version, 'rec',
+                                            lang)
        params.rec_model_dir, rec_url = confirm_model_dir_url(
            params.rec_model_dir,
            os.path.join(BASE_DIR, VERSION, 'ocr', 'rec', lang),
            rec_model_config['url'])
-        cls_model_config = get_model_config(params.version, 'cls', 'ch')
+        cls_model_config = get_model_config('OCR', params.ocr_version, 'cls',
+                                            'ch')
        params.cls_model_dir, cls_url = confirm_model_dir_url(
            params.cls_model_dir,
            os.path.join(BASE_DIR, VERSION, 'ocr', 'cls'),
@@ -362,22 +398,27 @@ class PPStructure(OCRSystem):
    def __init__(self, **kwargs):
        params = parse_args(mMain=False)
        params.__dict__.update(**kwargs)
+        params.use_gpu = check_gpu(params.use_gpu)
+
        if not params.show_log:
            logger.setLevel(logging.INFO)
        lang, det_lang = parse_lang(params.lang)

        # init model dir
-        det_model_config = get_model_config(params.version, 'det', det_lang)
+        det_model_config = get_model_config('OCR', params.ocr_version, 'det',
+                                            det_lang)
        params.det_model_dir, det_url = confirm_model_dir_url(
            params.det_model_dir,
            os.path.join(BASE_DIR, VERSION, 'ocr', 'det', det_lang),
            det_model_config['url'])
-        rec_model_config = get_model_config(params.version, 'rec', lang)
+        rec_model_config = get_model_config('OCR', params.ocr_version, 'rec',
+                                            lang)
        params.rec_model_dir, rec_url = confirm_model_dir_url(
            params.rec_model_dir,
            os.path.join(BASE_DIR, VERSION, 'ocr', 'rec', lang),
            rec_model_config['url'])
-        table_model_config = get_model_config(params.version, 'table', 'en')
+        table_model_config = get_model_config(
+            'STRUCTURE', params.structure_version, 'table', 'en')
        params.table_model_dir, table_url = confirm_model_dir_url(
            params.table_model_dir,
            os.path.join(BASE_DIR, VERSION, 'ocr', 'table'),

--- a/ppocr/modeling/heads/table_att_head.py
+++ b/ppocr/modeling/heads/table_att_head.py
@@ -23,32 +23,40 @@ import numpy as np


 class TableAttentionHead(nn.Layer):
-    def __init__(self, in_channels, hidden_size, loc_type, in_max_len=488, **kwargs):
+    def __init__(self,
+                 in_channels,
+                 hidden_size,
+                 loc_type,
+                 in_max_len=488,
+                 max_text_length=100,
+                 max_elem_length=800,
+                 max_cell_num=500,
+                 **kwargs):
        super(TableAttentionHead, self).__init__()
        self.input_size = in_channels[-1]
        self.hidden_size = hidden_size
        self.elem_num = 30
-        self.max_text_length = 100
-        self.max_elem_length = 500
-        self.max_cell_num = 500
+        self.max_text_length = max_text_length
+        self.max_elem_length = max_elem_length
+        self.max_cell_num = max_cell_num

        self.structure_attention_cell = AttentionGRUCell(
            self.input_size, hidden_size, self.elem_num, use_gru=False)
        self.structure_generator = nn.Linear(hidden_size, self.elem_num)
        self.loc_type = loc_type
        self.in_max_len = in_max_len
-        
+
        if self.loc_type == 1:
            self.loc_generator = nn.Linear(hidden_size, 4)
        else:
            if self.in_max_len == 640:
-                self.loc_fea_trans = nn.Linear(400, self.max_elem_length+1)
+                self.loc_fea_trans = nn.Linear(400, self.max_elem_length + 1)
            elif self.in_max_len == 800:
-                self.loc_fea_trans = nn.Linear(625, self.max_elem_length+1)
+                self.loc_fea_trans = nn.Linear(625, self.max_elem_length + 1)
            else:
-                self.loc_fea_trans = nn.Linear(256, self.max_elem_length+1)
+                self.loc_fea_trans = nn.Linear(256, self.max_elem_length + 1)
            self.loc_generator = nn.Linear(self.input_size + hidden_size, 4)
-            
+
    def _char_to_onehot(self, input_char, onehot_dim):
        input_ont_hot = F.one_hot(input_char, onehot_dim)
        return input_ont_hot
@@ -60,16 +68,16 @@ class TableAttentionHead(nn.Layer):
        if len(fea.shape) == 3:
            pass
        else:
-            last_shape = int(np.prod(fea.shape[2:])) # gry added
+            last_shape = int(np.prod(fea.shape[2:]))  # gry added
            fea = paddle.reshape(fea, [fea.shape[0], fea.shape[1], last_shape])
            fea = fea.transpose([0, 2, 1])  # (NTC)(batch, width, channels)
        batch_size = fea.shape[0]
-        
+
        hidden = paddle.zeros((batch_size, self.hidden_size))
        output_hiddens = []
        if self.training and targets is not None:
            structure = targets[0]
-            for i in range(self.max_elem_length+1):
+            for i in range(self.max_elem_length + 1):
                elem_onehots = self._char_to_onehot(
                    structure[:, i], onehot_dim=self.elem_num)
                (outputs, hidden), alpha = self.structure_attention_cell(
@@ -96,7 +104,7 @@ class TableAttentionHead(nn.Layer):
            alpha = None
            max_elem_length = paddle.to_tensor(self.max_elem_length)
            i = 0
-            while i < max_elem_length+1:
+            while i < max_elem_length + 1:
                elem_onehots = self._char_to_onehot(
                    temp_elem, onehot_dim=self.elem_num)
                (outputs, hidden), alpha = self.structure_attention_cell(
@@ -105,7 +113,7 @@ class TableAttentionHead(nn.Layer):
                structure_probs_step = self.structure_generator(outputs)
                temp_elem = structure_probs_step.argmax(axis=1, dtype="int32")
                i += 1
-                
+
            output = paddle.concat(output_hiddens, axis=1)
            structure_probs = self.structure_generator(output)
            structure_probs = F.softmax(structure_probs)
@@ -119,9 +127,9 @@ class TableAttentionHead(nn.Layer):
                loc_concat = paddle.concat([output, loc_fea], axis=2)
                loc_preds = self.loc_generator(loc_concat)
                loc_preds = F.sigmoid(loc_preds)
-        return {'structure_probs':structure_probs, 'loc_preds':loc_preds}
+        return {'structure_probs': structure_probs, 'loc_preds': loc_preds}
+

-    
 class AttentionGRUCell(nn.Layer):
    def __init__(self, input_size, hidden_size, num_embeddings, use_gru=False):
        super(AttentionGRUCell, self).__init__()

--- a/ppocr/utils/network.py
+++ b/ppocr/utils/network.py
@@ -24,15 +24,17 @@ from ppocr.utils.logging import get_logger
 def download_with_progressbar(url, save_path):
    logger = get_logger()
    response = requests.get(url, stream=True)
-    total_size_in_bytes = int(response.headers.get('content-length', 0))
-    block_size = 1024  # 1 Kibibyte
-    progress_bar = tqdm(total=total_size_in_bytes, unit='iB', unit_scale=True)
-    with open(save_path, 'wb') as file:
-        for data in response.iter_content(block_size):
-            progress_bar.update(len(data))
-            file.write(data)
-    progress_bar.close()
-    if total_size_in_bytes == 0 or progress_bar.n != total_size_in_bytes:
+    if response.status_code == 200:
+        total_size_in_bytes = int(response.headers.get('content-length', 1))
+        block_size = 1024  # 1 Kibibyte
+        progress_bar = tqdm(
+            total=total_size_in_bytes, unit='iB', unit_scale=True)
+        with open(save_path, 'wb') as file:
+            for data in response.iter_content(block_size):
+                progress_bar.update(len(data))
+                file.write(data)
+        progress_bar.close()
+    else:
        logger.error("Something went wrong while downloading models")
        sys.exit(0)

@@ -45,7 +47,7 @@ def maybe_download(model_storage_directory, url):
    if not os.path.exists(
            os.path.join(model_storage_directory, 'inference.pdiparams')
    ) or not os.path.exists(
-        os.path.join(model_storage_directory, 'inference.pdmodel')):
+            os.path.join(model_storage_directory, 'inference.pdmodel')):
        assert url.endswith('.tar'), 'Only supports tar compressed package'
        tmp_path = os.path.join(model_storage_directory, url.split('/')[-1])
        print('download {} to {}'.format(url, tmp_path))

--- a/test_tipc/test_paddle2onnx.sh
+++ b/test_tipc/test_paddle2onnx.sh
@@ -16,26 +16,28 @@ IFS=$'\n'
 lines=(${dataline})

 # parser paddle2onnx
-padlle2onnx_cmd=$(func_parser_value "${lines[1]}")
-infer_model_dir_key=$(func_parser_key "${lines[2]}")
-infer_model_dir_value=$(func_parser_value "${lines[2]}")
-model_filename_key=$(func_parser_key "${lines[3]}")
-model_filename_value=$(func_parser_value "${lines[3]}")
-params_filename_key=$(func_parser_key "${lines[4]}")
-params_filename_value=$(func_parser_value "${lines[4]}")
-save_file_key=$(func_parser_key "${lines[5]}")
-save_file_value=$(func_parser_value "${lines[5]}")
-opset_version_key=$(func_parser_key "${lines[6]}")
-opset_version_value=$(func_parser_value "${lines[6]}")
-enable_onnx_checker_key=$(func_parser_key "${lines[7]}")
-enable_onnx_checker_value=$(func_parser_value "${lines[7]}")
+model_name=$(func_parser_value "${lines[1]}")
+python=$(func_parser_value "${lines[2]}")
+padlle2onnx_cmd=$(func_parser_value "${lines[3]}")
+infer_model_dir_key=$(func_parser_key "${lines[4]}")
+infer_model_dir_value=$(func_parser_value "${lines[4]}")
+model_filename_key=$(func_parser_key "${lines[5]}")
+model_filename_value=$(func_parser_value "${lines[5]}")
+params_filename_key=$(func_parser_key "${lines[6]}")
+params_filename_value=$(func_parser_value "${lines[6]}")
+save_file_key=$(func_parser_key "${lines[7]}")
+save_file_value=$(func_parser_value "${lines[7]}")
+opset_version_key=$(func_parser_key "${lines[8]}")
+opset_version_value=$(func_parser_value "${lines[8]}")
+enable_onnx_checker_key=$(func_parser_key "${lines[9]}")
+enable_onnx_checker_value=$(func_parser_value "${lines[9]}")
 # parser onnx inference 
-inference_py=$(func_parser_value "${lines[8]}")
-use_gpu_key=$(func_parser_key "${lines[9]}")
-use_gpu_value=$(func_parser_value "${lines[9]}")
-det_model_key=$(func_parser_key "${lines[10]}")
-image_dir_key=$(func_parser_key "${lines[11]}")
-image_dir_value=$(func_parser_value "${lines[11]}")
+inference_py=$(func_parser_value "${lines[10]}")
+use_gpu_key=$(func_parser_key "${lines[11]}")
+use_gpu_value=$(func_parser_value "${lines[11]}")
+det_model_key=$(func_parser_key "${lines[12]}")
+image_dir_key=$(func_parser_key "${lines[13]}")
+image_dir_value=$(func_parser_value "${lines[13]}")


 LOG_PATH="./test_tipc/output"

--- a/tools/__init__.py
+++ b/tools/__init__.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+# Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
--- a/tools/infer/utility.py
+++ b/tools/infer/utility.py
@@ -17,7 +17,7 @@ import os
 import sys
 import cv2
 import numpy as np
-import json
+import paddle
 from PIL import Image, ImageDraw, ImageFont
 import math
 from paddle import inference
@@ -601,5 +601,12 @@ def get_rotate_crop_image(img, points):
    return dst_img


+def check_gpu(use_gpu):
+    if use_gpu and not paddle.is_compiled_with_cuda():
+
+        use_gpu = False
+    return use_gpu
+
+
 if __name__ == '__main__':
    pass