Merge remote-tracking branch 'origin/dygraph' into dygraph

41a1b292 · Leif · 9471054e · 3d30899b · 41a1b292 · 41a1b292
Commit 41a1b292 authored Jan 20, 2022 by Leif
20 changed files
--- a/doc/doc_en/enhanced_ctc_loss_en.md
+++ b/doc/doc_en/enhanced_ctc_loss_en.md
+# Enhanced CTC Loss
+In OCR recognition, CRNN is a text recognition algorithm widely applied in the industry. In the training phase, it uses CTCLoss to calculate the network loss. In the inference phase, it uses CTCDecode to obtain the decoding result. Although the CRNN algorithm has been proven to achieve reliable recognition results in real-world applications, users keep demanding higher recognition accuracy. So how can the accuracy of text recognition be improved? Taking CTCLoss as the starting point, this document explores improvements to CTCLoss from three perspectives: Hard Example Mining, Multi-task Learning, and Metric Learning. Based on this exploration, we propose EnhancedCTCLoss, which includes the following 3 components: Focal-CTC Loss, A-CTC Loss, and C-CTC Loss.
+## 1. Focal-CTC Loss
+Focal Loss was proposed in the paper "[Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002)". It was originally designed to address the severe imbalance between positive and negative samples in one-stage object detection. The loss reduces the weight of the many easy negative samples during training, so it can also be understood as a form of hard example mining.
+The form of the loss function is as follows:
+<div align="center"> 
+<img src="./focal_loss_formula.png" width = "600" /> 
+</div>
+Here, y' is the output of the activation function, with a value between 0 and 1. The loss adds a modulation factor (1-y')^&gamma; and a balance factor &alpha; to the original cross-entropy loss. When &alpha; = 1 and y = 1, the comparison between this loss function and the cross-entropy loss is shown in the following figure:
+<div align="center"> 
+<img src="./focal_loss_image.png" width = "600" /> 
+</div>
+As the figure above shows, when &gamma; > 0, the modulation factor (1-y')^&gamma; gives a smaller weight to the loss of easy-to-classify samples, making the network pay more attention to difficult and misclassified samples. The parameter &gamma; controls the rate at which the weight of easy samples decreases. When &gamma; = 0, the loss reduces to the cross-entropy loss. As &gamma; increases, the influence of the modulation factor also increases. The paper's experiments found &gamma; = 2 to work best. The balance factor &alpha; is used to balance the uneven proportions of positive and negative samples. In the paper, &alpha; is set to 0.25.
+For the classic CTC algorithm, suppose that for a feature sequence (f<sub>1</sub>, f<sub>2</sub>, ..., f<sub>t</sub>), the probability that the CTC decoding result equals the label is y'. The probability that the decoding result does not equal the label is then (1-y'). The CTCLoss value and y' are related as follows:
+<div align="center"> 
+<img src="./equation_ctcloss.png" width = "250" /> 
+</div>
+Following the idea of Focal Loss, assigning larger weights to difficult samples and smaller weights to easy samples makes the network focus more on difficult samples and further improves recognition accuracy. Therefore, we propose Focal-CTC Loss, defined as follows:
+<div align="center"> 
+<img src="./equation_focal_ctc.png" width = "500" /> 
+</div>
+In the experiment, &gamma; = 2 and &alpha; = 1. For the implementation, see [rec_ctc_loss.py](../../ppocr/losses/rec_ctc_loss.py).
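The re-weighting described above can be sketched as follows. This is a minimal illustration, assuming the pictured relations are CTCLoss = -log(y') and Focal-CTC = &alpha; · (1-y')^&gamma; · CTCLoss; the function name `focal_ctc` and the sample loss values are hypothetical and are not taken from `rec_ctc_loss.py`.

```python
import math


def focal_ctc(ctc_loss, alpha=1.0, gamma=2.0):
    """Re-weight a per-sample CTC loss in the style of Focal Loss.

    Assumption: CTCLoss = -log(y'), so y' can be recovered as exp(-CTCLoss).
    """
    y_prime = math.exp(-ctc_loss)              # probability of decoding the label
    weight = alpha * (1.0 - y_prime) ** gamma  # small for easy samples, near alpha for hard ones
    return weight * ctc_loss


# Hypothetical per-sample CTC losses: an easy sample and a hard one.
easy, hard = 0.05, 3.0
print("easy:", focal_ctc(easy))
print("hard:", focal_ctc(hard))
```

With &gamma; = 0 and &alpha; = 1 the weight is 1 and the plain CTC loss is returned, which mirrors how Focal Loss reduces to cross-entropy.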
+## 2. A-CTC Loss
+A-CTC Loss is short for CTC Loss + ACE Loss. ACE Loss was proposed in the paper “[Aggregation Cross-Entropy for Sequence Recognition](https://arxiv.org/abs/1904.08364)”. Compared with CTCLoss, ACE Loss has the following two advantages:
+- ACE Loss can handle the recognition of 2-D text, while CTCLoss can only process 1-D text
+- ACE Loss has lower time and space complexity than CTCLoss
+The advantages and disadvantages of OCR recognition algorithms, as summarized in previous work, are shown in the following figure:
+<div align="center">
+<img src="./rec_algo_compare.png" width = "1000" /> 
+</div>
+Although ACELoss does handle 2-D predictions, as shown in the figure above, and has advantages in memory usage and inference speed, in practice we found that using ACELoss alone gives worse recognition results than CTCLoss. Consequently, we combined the two, with CTCLoss as the main loss and ACELoss as an auxiliary supervision loss. This combination achieved better results: on our internal experimental data set, recognition accuracy improved by about 1% compared to using CTCLoss alone.
+A-CTC Loss is defined as follows:
+<div align="center">
+<img src="./equation_a_ctc.png" width = "300" /> 
+</div>
+In the experiment, λ = 0.1. See the ACE loss implementation code: [ace_loss.py](../../ppocr/losses/ace_loss.py)
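As a rough sketch of the combination described above, the snippet below computes an aggregation cross-entropy for one sequence following the formulation in the ACE paper, then adds it to a CTC loss with weight λ. It assumes the pictured definition is CTCLoss + λ · ACELoss; the function names, the blank index 0, and the small clamp value are illustrative choices, and the code in `ace_loss.py` may differ.

```python
import math
from collections import Counter


def ace_loss(probs, label, blank=0):
    """Aggregation cross-entropy for a single sequence.

    probs: T rows of per-class probabilities (each row sums to 1).
    label: list of class indices, without blanks.
    """
    T = len(probs)
    num_classes = len(probs[0])
    # Aggregate the predicted probability of each class over all time steps.
    mean_prob = [sum(row[k] for row in probs) / T for k in range(num_classes)]
    # Target counts: occurrences of each character; the blank takes the remaining steps.
    counts = Counter(label)
    counts[blank] = T - len(label)
    # Cross-entropy between normalized counts and aggregated probabilities
    # (clamped to avoid log(0); the clamp value is an arbitrary choice).
    return -sum((n / T) * math.log(max(mean_prob[k], 1e-10))
                for k, n in counts.items() if n > 0)


def a_ctc(ctc_loss, ace, lam=0.1):
    """CTC as the main loss, ACE as auxiliary supervision (assumed form)."""
    return ctc_loss + lam * ace


# Toy example: 3 time steps, classes {0: blank, 1, 2}, label "1 1".
probs = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
print(a_ctc(ctc_loss=0.5, ace=ace_loss(probs, [1, 1])))
```

Note that ACE only compares character counts, not their order, which is why it is cheap and why it is used here as an auxiliary term alongside CTC.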
+## 3. C-CTC Loss
+C-CTC Loss is short for CTC Loss + Center Loss. Center Loss was proposed in the paper “[A Discriminative Feature Learning Approach for Deep Face Recognition](https://link.springer.com/chapter/10.1007/978-3-319-46478-7_31)”. It was first used in face recognition tasks to increase the distance between classes and reduce the distance within classes. It is an early and widely used algorithm.
+In Chinese OCR, through the analysis of bad cases, we found that a major difficulty is the large number of similar characters, which are easily confused. This led us to ask whether the idea of Metric Learning could be used to increase the class spacing of similar characters and thereby improve recognition accuracy. However, Metric Learning is mainly used in image recognition, where the label of each training sample is a fixed value; OCR recognition is essentially a sequence recognition task, and there is no explicit alignment between features and labels. Therefore, how to combine the two is still a direction worth exploring.
+After trying Arcmargin, Cosmargin, and other methods, we found that Center Loss can further improve recognition accuracy. C-CTC Loss is defined as follows:
+<div align="center">
+<img src="./equation_c_ctc.png" width = "300" /> 
+</div>
+In the experiment, we set λ=0.25. See the center_loss implementation code: [center_loss.py](../../ppocr/losses/center_loss.py)
+It is worth mentioning that in C-CTC Loss, initializing the centers randomly does not bring significant improvement. Our center initialization method is as follows:
+1. Train a network N using the original CTCLoss
+2. From the training set, select the samples that N recognizes completely correctly to form the set G
+3. Send each sample in G through the network, perform the forward pass, and record the correspondence between the input of the last FC layer (i.e. the feature) and the result of the argmax calculation (i.e. the index)
+4. Aggregate the features with the same index and average them to get the initial center of each character
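The final aggregation step of the procedure above can be sketched in plain Python. This is only an illustration of "average the features that share an index"; the function name `init_centers` and the toy data are hypothetical, and the actual extraction is done by `tools/export_center.py`.

```python
from collections import defaultdict


def init_centers(features, indices):
    """Average the features that share the same argmax index.

    features: list of feature vectors (inputs to the last FC layer).
    indices:  list of predicted character indices, one per feature.
    """
    sums = {}
    counts = defaultdict(int)
    for feat, idx in zip(features, indices):
        if idx not in sums:
            sums[idx] = [0.0] * len(feat)
        # Accumulate the feature into the running sum for its character.
        sums[idx] = [s + f for s, f in zip(sums[idx], feat)]
        counts[idx] += 1
    # The center of each character is the mean of its accumulated features.
    return {idx: [s / counts[idx] for s in total] for idx, total in sums.items()}


# Toy data: three 2-D features, two of them predicted as character 7.
feats = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0]]
idxs = [7, 7, 3]
centers = init_centers(feats, idxs)
print(centers)
```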
+Taking the configuration file `configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml` as an example, the center extraction command is as follows:
+```
+python tools/export_center.py -c configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml -o Global.pretrained_model="./output/rec_mobile_pp-OCRv2/best_accuracy"
+```
+After running, `train_center.pkl` will be generated in the main directory of PaddleOCR.
+## 4. Experiment
+For the above three solutions, we conducted training and evaluation on Baidu's internal data set. The experimental results are shown in the following table:
+| Algorithm | Focal-CTC | A-CTC | C-CTC |
+| :-------- | :-------: | :---: | :---: |
+| Gain      | +0.3%     | +0.7% | +1.7% |
+Based on these results, we adopted the C-CTC strategy in PP-OCRv2. Because PP-OCRv2 recognizes 6625 Chinese characters, the character set is relatively large and contains many similar characters, so the C-CTC solution brings a significant improvement on this task. For other OCR recognition tasks, the conclusion may be different. You can try Focal-CTC, A-CTC, C-CTC, and the combined solution EnhancedCTC. We believe they will bring different degrees of improvement.
+The unified combined implementation is in the following file: [rec_enhanced_ctc_loss.py](../../ppocr/losses/rec_enhanced_ctc_loss.py)
\ No newline at end of file
--- a/doc/doc_en/environment_en.md
+++ b/doc/doc_en/environment_en.md
@@ -4,9 +4,9 @@ Windows and Mac users are recommended to use Anaconda to build a Python environm
 Recommended working environment:
 - PaddlePaddle >= 2.0.0 (2.1.2)
- python3.7
+- Python 3.7
- CUDA10.1 / CUDA10.2
+- CUDA 10.1 / CUDA 10.2
- CUDNN 7.6
+- cuDNN 7.6
 * [1. Python Environment Setup](#1)
  + [1.1 Windows](#1.1)
@@ -25,7 +25,7 @@ Recommended working environment:
 #### 1.1.1 Install Anaconda
- Note: To use paddlepaddle you need to install python environment first, here we choose python integrated environment Anaconda toolkit
+- Note: To use PaddlePaddle you need to install python environment first, here we choose python integrated environment Anaconda toolkit
  - Anaconda is a common python package manager
  - After installing Anaconda, you can install the python environment, as well as numpy and other required toolkit environment.
@@ -44,19 +44,19 @@ Recommended working environment:
    <img src="../install/windows/anaconda_install_folder.png" alt="install config" width="500" align=" left"/>
-  - Check conda to add environment variables and ignore the warning that
+  - Check Conda to add environment variables and ignore the warning that
    <img src="../install/windows/anaconda_install_env.png" alt="add conda to path" width="500" align="center"/>
-#### 1.1.2 Opening the terminal and creating the conda environment
+#### 1.1.2 Opening the terminal and creating the Conda environment
 - Open Anaconda Prompt terminal: bottom left Windows Start Menu -> Anaconda3 -> Anaconda Prompt start console
  <img src="../install/windows/anaconda_prompt.png" alt="anaconda download" width="300" align="center"/>
- Create a new conda environment
+- Create a new Conda environment
  ```shell
  # Enter the following command at the command line to create an environment named paddle_env
@@ -70,7 +70,7 @@ Recommended working environment:
  <img src="../install/windows/conda_new_env.png" alt="conda create" width="700" align="center"/>
- To activate the conda environment you just created, enter the following command at the command line.
+- To activate the Conda environment you just created, enter the following command at the command line.
  ```shell
  # Activate the paddle_env environment
@@ -91,7 +91,7 @@ The above anaconda environment and python environment are installed
 #### 1.2.1 Installing Anaconda
- Note: To use paddlepaddle you need to install the python environment first, here we choose the python integrated environment Anaconda toolkit
+- Note: To use PaddlePaddle you need to install the python environment first, here we choose the python integrated environment Anaconda toolkit
  - Anaconda is a common python package manager
  - After installing Anaconda, you can install the python environment, as well as numpy and other required toolkit environment
@@ -108,17 +108,17 @@ The above anaconda environment and python environment are installed
  - Just follow the default settings, it will take a while to install
- It is recommended to install a code editor such as vscode or pycharm
+- It is recommended to install a code editor such as VSCode or PyCharm
-#### 1.2.2 Open a terminal and create a conda environment
+#### 1.2.2 Open a terminal and create a Conda environment
 - Open the terminal
  - Press command and spacebar at the same time, type "terminal" in the focus search, double click to enter terminal
- **Add conda to the environment variables**
+- **Add Conda to the environment variables**
-  - Environment variables are added so that the system can recognize the conda command
+  - Environment variables are added so that the system can recognize the Conda command
  - Open `~/.bash_profile` in the terminal by typing the following command.
@@ -126,7 +126,7 @@ The above anaconda environment and python environment are installed
    vim ~/.bash_profile
    ```
-  - Add conda as an environment variable in `~/.bash_profile`.
+  - Add Conda as an environment variable in `~/.bash_profile`.
    ```shell
    # Press i first to enter edit mode
@@ -156,12 +156,12 @@ The above anaconda environment and python environment are installed
    - When you are done, press `esc` to exit edit mode, then type `:wq!` and enter to save and exit
-  - Verify that the conda command is recognized.
+  - Verify that the Conda command is recognized.
    - Enter `source ~/.bash_profile` in the terminal to update the environment variables
-    - Enter `conda info --envs` in the terminal again, if it shows that there is a base environment, then conda has been added to the environment variables
+    - Enter `conda info --envs` in the terminal again, if it shows that there is a base environment, then Conda has been added to the environment variables
- Create a new conda environment
+- Create a new Conda environment
  ```shell
  # Enter the following command at the command line to create an environment called paddle_env
@@ -175,7 +175,7 @@ The above anaconda environment and python environment are installed
    - <img src="../install/mac/conda_create.png" alt="conda_create" width="600" align="center"/>
- To activate the conda environment you just created, enter the following command at the command line.
+- To activate the Conda environment you just created, enter the following command at the command line.
  ```shell
  # Activate the paddle_env environment
@@ -198,7 +198,7 @@ Linux users can choose to run either Anaconda or Docker. If you are familiar wit
 #### 1.3.1 Anaconda environment configuration
- Note: To use paddlepaddle you need to install the python environment first, here we choose the python integrated environment Anaconda toolkit
+- Note: To use PaddlePaddle you need to install the python environment first, here we choose the python integrated environment Anaconda toolkit
  - Anaconda is a common python package manager
  - After installing Anaconda, you can install the python environment, as well as numpy and other required toolkit environment
@@ -214,9 +214,9 @@ Linux users can choose to run either Anaconda or Docker. If you are familiar wit
  - Select the appropriate version for your operating system
      - Type `uname -m` in the terminal to check the command set used by your system
-  - Download method 1: Download locally, then transfer the installation package to the linux server
+  - Download method 1: Download locally, then transfer the installation package to the Linux server
-  - Download method 2: Directly use linux command line to download
+  - Download method 2: Directly use Linux command line to download
    ```shell
    # First install wget
@@ -277,12 +277,12 @@ Linux users can choose to run either Anaconda or Docker. If you are familiar wit
    - When you are done, press `esc` to exit edit mode, then type `:wq!` and enter to save and exit
-  - Verify that the conda command is recognized.
+  - Verify that the Conda command is recognized.
    - Enter `source ~/.bash_profile` in the terminal to update the environment variables
-    - Enter `conda info --envs` in the terminal again, if it shows that there is a base environment, then conda has been added to the environment variables
+    - Enter `conda info --envs` in the terminal again, if it shows that there is a base environment, then Conda has been added to the environment variables
- Create a new conda environment
+- Create a new Conda environment
  ```shell
  # Enter the following command at the command line to create an environment called paddle_env
@@ -296,7 +296,7 @@ Linux users can choose to run either Anaconda or Docker. If you are familiar wit
    <img src="../install/linux/conda_create.png" alt="conda_create" width="500" align="center"/>
- To activate the conda environment you just created, enter the following command at the command line.
+- To activate the Conda environment you just created, enter the following command at the command line.
  ```shell
  # Activate the paddle_env environment
@@ -335,13 +335,13 @@ sudo docker container exec -it ppocr /bin/bash
 ## 2. Install PaddlePaddle 2.0
- If you have cuda9 or cuda10 installed on your machine, please run the following command to install
+- If you have CUDA 9 or CUDA 10 installed on your machine, please run the following command to install
 ```bash
 python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
 ```
- If you only have cpu on your machine, please run the following command to install
+- If you have no available GPU on your machine, please run the following command to install the CPU version
 ```bash
 python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

--- a/doc/doc_en/inference_en.md
+++ b/doc/doc_en/inference_en.md
@@ -139,7 +139,7 @@ tar xf ch_ppocr_mobile_v2.0_det_infer.tar
 python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/"
 ```
-The visual text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with'det_res'. Examples of results are as follows:
+The visual text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
 ![](../imgs_results/det_res_00018069.jpg)
@@ -244,7 +244,7 @@ The visualized text detection results are saved to the `./inference_results` fol
 <a name="RECOGNITION_MODEL_INFERENCE"></a>
 ## 3. Text Recognition Model Inference
-The following will introduce the lightweight Chinese recognition model inference, other CTC-based and Attention-based text recognition models inference. For Chinese text recognition, it is recommended to choose the recognition model based on CTC loss. In practice, it is also found that the result of the model based on Attention loss is not as good as the one based on CTC loss. In addition, if the characters dictionary is modified during training, make sure that you use the same characters set during inferencing. Please check below for details.
+The following will introduce the lightweight Chinese recognition model inference, other CTC-based and Attention-based text recognition models inference. For Chinese text recognition, it is recommended to choose the recognition model based on CTC loss. In practice, it is also found that the result of the model based on Attention loss is not as good as the one based on CTC loss. In addition, if the characters dictionary is modified during training, make sure that you use the same characters set during inference. Please check below for details.
 <a name="LIGHTWEIGHT_RECOGNITION"></a>

--- a/doc/doc_en/inference_ppocr_en.md
+++ b/doc/doc_en/inference_ppocr_en.md
@@ -7,7 +7,7 @@ This article introduces the use of the Python inference engine for the PP-OCR mo
 - [Text Detection Model Inference](#DETECTION_MODEL_INFERENCE)
 - [Text Recognition Model Inference](#RECOGNITION_MODEL_INFERENCE)
    - [1. Lightweight Chinese Recognition Model Inference](#LIGHTWEIGHT_RECOGNITION)
-    - [2. Multilingaul Model Inference](#MULTILINGUAL_MODEL_INFERENCE)
+    - [2. Multilingual Model Inference](#MULTILINGUAL_MODEL_INFERENCE)
 - [Angle Classification Model Inference](#ANGLE_CLASS_MODEL_INFERENCE)
 - [Text Detection Angle Classification and Recognition Inference Concatenation](#CONCATENATION)
@@ -25,7 +25,7 @@ tar xf ch_PP-OCRv2_det_infer.tar
 python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./ch_PP-OCRv2_det_infer.tar/"
 ```
-The visual text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with'det_res'. Examples of results are as follows:
+The visual text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
 ![](../imgs_results/det_res_00018069.jpg)
@@ -75,7 +75,7 @@ Predicts of ./doc/imgs_words_en/word_10.png:('PAIN', 0.9897658)
 <a name="MULTILINGUAL_MODEL_INFERENCE"></a>
-### 2. Multilingaul Model Inference
+### 2. Multilingual Model Inference
 If you need to predict [other language models](./models_list_en.md#Multilingual), when using inference model prediction, you need to specify the dictionary path used by `--rec_char_dict_path`. At the same time, in order to get the correct visualization results,
 You need to specify the visual font path through `--vis_font_path`. There are small language fonts provided by default under the `doc/fonts` path, such as Korean recognition:

--- a/doc/doc_en/installation_en.md
+++ b/doc/doc_en/installation_en.md
 ## QUICK INSTALLATION
-After testing, paddleocr can run on glibc 2.23. You can also test other glibc versions or install glic 2.23 for the best compatibility.
+After testing, PaddleOCR can run on glibc 2.23. You can also test other glibc versions or install glibc 2.23 for the best compatibility.
 PaddleOCR working environment:
 - PaddlePaddle 2.0.0
- python3.7
+- Python 3.7
 - glibc 2.23
-It is recommended to use the docker provided by us to run PaddleOCR, please refer to the use of docker [link](https://www.runoob.com/docker/docker-tutorial.html/).
+It is recommended to use the docker provided by us to run PaddleOCR. Please refer to the docker tutorial [link](https://www.runoob.com/docker/docker-tutorial.html/).
-*If you want to directly run the prediction code on mac or windows, you can start from step 2.*
+*If you want to directly run the prediction code on Mac or Windows, you can start from step 2.*
-**1. (Recommended) Prepare a docker environment. The first time you use this docker image, it will be downloaded automatically. Please be patient.**
+**1. (Recommended) Prepare a docker environment. For the first time you use this docker image, it will be downloaded automatically. Please be patient.**
 ```
 # Switch to the working directory
 cd /home/Projects
@@ -22,7 +22,7 @@ cd /home/Projects
 sudo docker run --name ppocr -v $PWD:/paddle --network=host -it  paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82  /bin/bash
 ```
-If using CUDA10, please run the following command to create a container.
+With CUDA10, please run the following command to create a container.
 It is recommended to set a shared memory greater than or equal to 32G through the --shm-size parameter:
 ```
 sudo nvidia-docker run --name ppocr -v $PWD:/paddle --shm-size=64G --network=host -it paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82 /bin/bash
@@ -51,11 +51,11 @@ For more software version requirements, please refer to the instructions in [Ins
 # Recommend
 git clone https://github.com/PaddlePaddle/PaddleOCR
-# If you cannot pull successfully due to network problems, you can also choose to use the code hosting on the cloud:
+# If you cannot pull successfully due to network problems, you can switch to the mirror hosted on Gitee:
 git clone https://gitee.com/paddlepaddle/PaddleOCR
-# Note: The cloud-hosting code may not be able to synchronize the update with this GitHub project in real time. There might be a delay of 3-5 days. Please give priority to the recommended method.
+# Note: The mirror on Gitee may lag behind the project on GitHub. There might be a delay of 3-5 days. Please try GitHub first.
 ```
 **4. Install third-party libraries**
@@ -66,6 +66,6 @@ pip3 install -r requirements.txt
 If you get the error `OSError: [WinError 126] The specified module could not be found` when you install shapely on Windows,
-Please try to download Shapely whl file using [http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely](http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely).
+Please try to download Shapely whl file from [http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely](http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely).
 Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)
--- a/doc/doc_en/knowledge_distillation_en.md
+++ b/doc/doc_en/knowledge_distillation_en.md
+<a name="0"></a>
+# Knowledge Distillation
+ [Knowledge Distillation](#0)
+  + [1. Introduction](#1)
+    - [1.1 Introduction to Knowledge Distillation](#11)
+    - [1.2 Introduction to PaddleOCR Knowledge Distillation](#12)
+  + [2. Configuration File Analysis](#2)
+    + [2.1 Recognition Model Configuration File Analysis](#21)
+      - [2.1.1 Model Structure](#211)
+      - [2.1.2 Loss Function](#212)
+      - [2.1.3 Post-processing](#213)
+      - [2.1.4 Metric Calculation](#214)
+      - [2.1.5 Fine-tuning Distillation Model](#215)
+    + [2.2 Detection Model Configuration File Analysis](#22)
+      - [2.2.1 Model Structure](#221)
+      - [2.2.2 Loss Function](#222)
+      - [2.2.3 Post-processing](#223)
+      - [2.2.4 Metric Calculation](#224)
+      - [2.2.5 Fine-tuning Distillation Model](#225)
+<a name="1"></a>
+## 1. Introduction
+<a name="11"></a>
+### 1.1 Introduction to Knowledge Distillation
+In recent years, deep neural networks have proven to be an extremely effective method for solving problems in computer vision and natural language processing.
+By constructing a suitable neural network and training it, the performance metrics of the final network model generally exceed those of traditional algorithms.
+When the amount of data is large enough, increasing the number of parameters by constructing a reasonable network model can significantly improve the performance of the model,
+but this sharply increases the complexity of the model. Large models are more expensive to use in actual scenarios.
+Deep neural networks generally have considerable parameter redundancy. At present, there are several main methods to compress a model and reduce its number of parameters,
+such as pruning, quantization, and knowledge distillation. Knowledge distillation refers to using a teacher model to guide a student model in learning a specific task,
+so that the small model obtains a relatively large performance improvement without any change to its parameter count.
+In addition, a mutual learning training method has also emerged within knowledge distillation.
+The paper [Deep Mutual Learning](https://arxiv.org/abs/1706.00384) pointed out that using two identical models to supervise each other during training can achieve better results than training a single model.
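To make the idea of two models supervising each other concrete, here is a minimal sketch of a mutual-learning term: a symmetric KL divergence between the two models' softmax outputs. The averaging of the two directions and the function names are assumptions made for illustration; in Deep Mutual Learning this kind of term is added to each model's own supervised loss, and PaddleOCR's actual loss classes are configured separately.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_div(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_loss(logits_a, logits_b):
    """Symmetric KL term with which two models supervise each other."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_div(pa, pb) + kl_div(pb, pa))


# Hypothetical logits from two sub-networks for the same input.
print(mutual_loss([1.0, 2.0, 3.0], [1.5, 1.5, 3.0]))
```

The term is zero when both models predict the same distribution and grows as their predictions diverge, so minimizing it pulls the two models toward agreement.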
+<a name="12"></a>
+### 1.2 Introduction to PaddleOCR Knowledge Distillation
+Whether a large model distills a small model, or small models learn from each other and update their parameters,
+the process is essentially mutual supervision between the outputs or feature maps of different models.
+The only differences are (1) whether the model's parameters need to be fixed and (2) whether the model needs to load a pre-trained model.
+When a large model distills a small model, the large model generally needs to load the pre-trained model and fix its parameters.
+When small models distill each other, the small models generally do not load a pre-trained model, and their parameters remain learnable.
+Knowledge distillation is not limited to two models; multiple models can also learn from each other.
+Therefore, the knowledge distillation code framework also needs to support this type of distillation.
+Knowledge distillation is integrated in PaddleOCR. Specifically, it has the following main features:
+- It supports mutual learning between arbitrary networks, and does not require the sub-network structures to be identical or to have pre-trained models. There is also no limit on the number of sub-networks; just add them in the configuration file.
+- It supports arbitrary configuration of the loss function through the configuration file: either a single loss or a combination of multiple losses can be used.
+- It supports all model-related stages for knowledge distillation, such as training, prediction, evaluation, and export, which is convenient for use and deployment.
+Through knowledge distillation, in the common Chinese and English text recognition task, the accuracy of the model can be improved by more than 3%
+without adding any prediction time. Combined with the learning rate adjustment strategy and the model structure fine-tuning strategy,
+the final improvement is more than 5%.
+<a name="2"></a>
+## 2. Configuration File Analysis
+In the process of knowledge distillation training, there is no change in data preprocessing, optimizer, learning rate, and some global attributes.
+The configuration of the model structure, loss function, post-processing, metric calculation and other modules needs to be adjusted.
+The following takes the knowledge distillation configuration files for recognition and detection as examples to analyze the training and configuration of knowledge distillation.
+<a name="21"></a>
+### 2.1 Recognition Model Configuration File Analysis
+The configuration file is in [ch_PP-OCRv2_rec_distillation.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec_distillation.yml).
+<a name="211"></a>
+#### 2.1.1 Model Structure
+In the knowledge distillation task, the model structure configuration is as follows.
+```yaml
+Architecture:
+  model_type: &model_type "rec"    # Model category, recognition, detection, etc.
+  name: DistillationModel          # Structure name, in the distillation task, it is DistillationModel
+  algorithm: Distillation          # Algorithm name
+  Models:                          # Model, including the configuration information of the subnet
+    Teacher:                       # The name of the subnet, it must include at least the `pretrained` and `freeze_params` parameters, and the other parameters are the construction parameters of the subnet
+      pretrained:                  # Whether this sub-network needs to load pre-trained weights
+      freeze_params: false         # Whether to freeze (fix) the parameters
+      return_all_feats: true       # Whether to return all features; if false, only the final output is returned
+      model_type: *model_type      # Model category
+      algorithm: CRNN              # The algorithm name of the sub-network. The remaining parameters of the sub-network are consistent with the general model training configuration
+      Transform:
+      Backbone:
+        name: MobileNetV1Enhance
+        scale: 0.5
+      Neck:
+        name: SequenceEncoder
+        encoder_type: rnn
+        hidden_size: 64
+      Head:
+        name: CTCHead
+        mid_channels: 96
+        fc_decay: 0.00002
+    Student:                       # Another sub-network, here is a distillation example of DML, the two sub-networks have the same structure, and both need to learn parameters
+      pretrained:                  # The following parameters are the same as above
+      freeze_params: false
+      return_all_feats: true
+      model_type: *model_type
+      algorithm: CRNN
+      Transform:
+      Backbone:
+        name: MobileNetV1Enhance
+        scale: 0.5
+      Neck:
+        name: SequenceEncoder
+        encoder_type: rnn
+        hidden_size: 64
+      Head:
+        name: CTCHead
+        mid_channels: 96
+        fc_decay: 0.00002
+```
+If you want to add more sub-networks for training, add the corresponding fields to the configuration file in the same way as `Student` and `Teacher`.
+For example, if you want 3 models to supervise each other and train together, then `Architecture` can be written in the following format.
+```yaml
+Architecture:
+  model_type: &model_type "rec"
+  name: DistillationModel
+  algorithm: Distillation
+  Models:
+    Teacher:
+      pretrained:
+      freeze_params: false
+      return_all_feats: true
+      model_type: *model_type
+      algorithm: CRNN
+      Transform:
+      Backbone:
+        name: MobileNetV1Enhance
+        scale: 0.5
+      Neck:
+        name: SequenceEncoder
+        encoder_type: rnn
+        hidden_size: 64
+      Head:
+        name: CTCHead
+        mid_channels: 96
+        fc_decay: 0.00002
+    Student:
+      pretrained:
+      freeze_params: false
+      return_all_feats: true
+      model_type: *model_type
+      algorithm: CRNN
+      Transform:
+      Backbone:
+        name: MobileNetV1Enhance
+        scale: 0.5
+      Neck:
+        name: SequenceEncoder
+        encoder_type: rnn
+        hidden_size: 64
+      Head:
+        name: CTCHead
+        mid_channels: 96
+        fc_decay: 0.00002
+    Student2:                       # The new sub-network introduced in the knowledge distillation task, the configuration is the same as above
+      pretrained:
+      freeze_params: false
+      return_all_feats: true
+      model_type: *model_type
+      algorithm: CRNN
+      Transform:
+      Backbone:
+        name: MobileNetV1Enhance
+        scale: 0.5
+      Neck:
+        name: SequenceEncoder
+        encoder_type: rnn
+        hidden_size: 64
+      Head:
+        name: CTCHead
+        mid_channels: 96
+        fc_decay: 0.00002
+```
+When the model is finally trained, it contains 3 sub-networks: `Teacher`, `Student`, `Student2`.
+For the specific implementation of the `DistillationModel` class, refer to [distillation_model.py](../../ppocr/modeling/architectures/distillation_model.py).
+The final model output is a dictionary: each key is the name of a sub-network (for example, `Student` and `Teacher`), and each value is the output of that sub-network,
+which can be either a `Tensor` (only the output of the last layer is returned) or a `dict` (intermediate features are returned as well).
+In the recognition task, the output of each sub-network is saved as a `dict` containing the sub-module outputs, so that more loss functions can be added and the distillation method stays extensible.
+Take the recognition model as an example. The output of each sub-network is a `dict` whose keys are `backbone_out`, `neck_out` and `head_out`, and whose values are the tensors of the corresponding modules. For the `Teacher`/`Student` configuration above, the output format of `DistillationModel` is as follows.
+```json
+{
+  "Teacher": {
+    "backbone_out": tensor,
+    "neck_out": tensor,
+    "head_out": tensor,
+  },
+  "Student": {
+    "backbone_out": tensor,
+    "neck_out": tensor,
+    "head_out": tensor,
+  }
+}
+```
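As a rough illustration of this nested output format, the sketch below builds the same two-level dictionary with a toy stand-in for a sub-network. `ToySubNet` and `run_distillation_model` are hypothetical names, and plain Python lists stand in for tensors; this is not the actual `DistillationModel` code.

```python
from typing import Dict, List

# A "tensor" is modelled as a plain list of floats so the sketch runs without Paddle.
Tensor = List[float]


class ToySubNet:
    """Hypothetical stand-in for one sub-network (backbone -> neck -> head)."""

    def __init__(self, scale: float, return_all_feats: bool = True):
        self.scale = scale
        self.return_all_feats = return_all_feats

    def __call__(self, x: Tensor):
        backbone_out = [v * self.scale for v in x]
        neck_out = [v + 1.0 for v in backbone_out]
        head_out = [v * 2.0 for v in neck_out]
        if self.return_all_feats:
            # Keep the intermediate features so losses can pick any of them.
            return {
                "backbone_out": backbone_out,
                "neck_out": neck_out,
                "head_out": head_out,
            }
        return head_out  # only the final output


def run_distillation_model(subnets: Dict[str, ToySubNet], x: Tensor) -> dict:
    """Run every sub-network on the same input and key the results by name."""
    return {name: net(x) for name, net in subnets.items()}


outputs = run_distillation_model(
    {"Teacher": ToySubNet(0.5), "Student": ToySubNet(0.25)}, [2.0, 4.0]
)
student_logits = outputs["Student"]["head_out"]
```

Downstream components pick a sub-network by name and then a tensor by `key`, which is why both levels of the dictionary are keyed by strings.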
+<a name="212"></a>
+#### 2.1.2 Loss Function
+In the knowledge distillation task, the loss function configuration is as follows.
+```yaml
+Loss:
+  name: CombinedLoss                           # Loss function name
+  loss_config_list:                            # List of loss function configuration files, mandatory functions for CombinedLoss
+  - DistillationCTCLoss:                       # CTC loss function based on distillation, inherited from standard CTC loss
+      weight: 1.0                              # The weight of the loss function. In loss_config_list, each loss function must include this field
+      model_name_list: ["Student", "Teacher"]  # For the prediction results of the distillation model, extract the output of these two sub-networks and calculate the CTC loss with gt
+      key: head_out                            # In the sub-network output dict, take the corresponding tensor
+  - DistillationDMLLoss:                       # DML loss function, inherited from the standard DMLLoss
+      weight: 1.0  
+      act: "softmax"                           # Activation function, use it to process the input, can be softmax, sigmoid or None, the default is None
+      model_name_pairs:                        # The subnet name pair used to calculate DML loss. If you want to calculate the DML loss of other subnets, you can continue to add it below the list
+      - ["Student", "Teacher"]
+      key: head_out  
+  - DistillationDistanceLoss:                  # Distilled distance loss function
+      weight: 1.0  
+      mode: "l2"                               # Support l1, l2 or smooth_l1
+      model_name_pairs:                        # Calculate the distance loss of the subnet name pair
+      - ["Student", "Teacher"]
+      key: backbone_out  
+```
+All of the distillation loss functions above inherit from the corresponding standard loss function classes.
+Their main job is to parse the output of the distillation model, find the intermediate nodes (tensors) needed to compute the loss,
+and then call the standard loss function class to compute it.
+Taking the above configuration as an example, the final distillation training loss function contains the following three parts.
+- The final output `head_out` of `Student` and `Teacher` calculates the CTC loss with gt (loss weight equals 1.0). Here, because both sub-networks need to update the parameters, both of them need to calculate the loss with gt.
+- DML loss between `Student` and `Teacher`'s final output `head_out` (loss weight equals 1.0).
+- L2 loss between `Student` and `Teacher`'s backbone network output `backbone_out` (loss weight equals 1.0).
+For more specific implementation of `CombinedLoss`, please refer to: [combined_loss.py](../../ppocr/losses/combined_loss.py#L23).
+For more specific implementations of distillation loss functions such as `DistillationCTCLoss`, please refer to [distillation_loss.py](../../ppocr/losses/distillation_loss.py)
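The pattern described above (look up tensors by sub-network name and `key`, delegate to a base loss, scale by `weight`, then sum) can be sketched in plain Python. All function names here are hypothetical, lists stand in for tensors, and the L1 term on `head_out` is only an illustrative second loss, not the DML loss from the configuration.

```python
from typing import Callable, Dict, List

Tensor = List[float]
Outputs = Dict[str, Dict[str, Tensor]]


def l1_distance(a: Tensor, b: Tensor) -> float:
    """Mean absolute difference."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def l2_distance(a: Tensor, b: Tensor) -> float:
    """Mean squared difference, standing in for the `l2` distance mode."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pairwise_loss(
    outputs: Outputs,
    model_name_pairs: List[List[str]],
    key: str,
    weight: float,
    base_loss: Callable[[Tensor, Tensor], float],
) -> Dict[str, float]:
    """Pick `key` from both sub-networks of each pair and delegate to base_loss."""
    losses = {}
    for first, second in model_name_pairs:
        value = base_loss(outputs[first][key], outputs[second][key])
        losses[f"{first}_{second}_{key}"] = weight * value
    return losses


def combine(loss_dicts: List[Dict[str, float]]) -> Dict[str, float]:
    """Merge the weighted terms and add their sum under the key `loss`."""
    merged: Dict[str, float] = {}
    for losses in loss_dicts:
        merged.update(losses)
    merged["loss"] = sum(merged.values())
    return merged


outputs = {
    "Teacher": {"backbone_out": [1.0, 2.0], "head_out": [1.0, 1.0]},
    "Student": {"backbone_out": [0.0, 4.0], "head_out": [1.0, 3.0]},
}
pairs = [["Student", "Teacher"]]
total = combine([
    pairwise_loss(outputs, pairs, "backbone_out", 1.0, l2_distance),
    pairwise_loss(outputs, pairs, "head_out", 0.5, l1_distance),
])
```

Because each wrapper only needs names and a `key`, adding another pair to `model_name_pairs` adds another term without touching the base loss.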
+<a name="213"></a>
+#### 2.1.3 Post-processing
+In the knowledge distillation task, the post-processing configuration is as follows.
+```yaml
+PostProcess:
+  name: DistillationCTCLabelDecode       # CTC decoding post-processing of distillation tasks, inherited from the standard CTCLabelDecode class
+  model_name: ["Student", "Teacher"]     # For the prediction results of the distillation model, extract the outputs of these two sub-networks and decode them
+  key: head_out                          # Take the corresponding tensor in the subnet output dict
+```
+Taking the above configuration as an example, the CTC decoding outputs of the two sub-networks `Student` and `Teacher` are computed at the same time.
+Here, `model_name` lists the sub-networks to decode, and `key` selects the tensor to decode from each sub-network's output dict.
+For more specific implementation of `DistillationCTCLabelDecode`, please refer to: [rec_postprocess.py](../../ppocr/postprocess/rec_postprocess.py#L128)
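A minimal sketch of decoding several sub-networks at once follows. It assumes each `head_out` has already been reduced to per-frame class indices, with index 0 as the CTC blank; the function names are hypothetical and the real `DistillationCTCLabelDecode` may differ in its details.

```python
from typing import Dict, List


def ctc_greedy_decode(frame_ids: List[int], charset: str, blank: int = 0) -> str:
    """Collapse consecutive repeats, then drop blanks (greedy CTC decoding)."""
    chars = []
    prev = None
    for idx in frame_ids:
        if idx != prev and idx != blank:
            chars.append(charset[idx - 1])  # index 0 is reserved for blank
        prev = idx
    return "".join(chars)


def decode_subnets(
    preds: Dict[str, Dict[str, List[int]]],
    model_name: List[str],
    key: str,
    charset: str,
) -> Dict[str, str]:
    """Decode the tensor stored under `key` for every listed sub-network."""
    return {
        name: ctc_greedy_decode(preds[name][key], charset) for name in model_name
    }


preds = {
    "Student": {"head_out": [1, 1, 0, 1, 2, 2, 0, 3]},
    "Teacher": {"head_out": [0, 1, 0, 2, 2, 3, 3]},
}
decoded = decode_subnets(preds, ["Student", "Teacher"], "head_out", "abc")
```

The result is keyed by sub-network name, so each network's text can be compared with the ground truth separately.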
+<a name="214"></a>
+#### 2.1.4 Metric Calculation
+In the knowledge distillation task, the metric calculation configuration is as follows.
+```yaml
+Metric:
+  name: DistillationMetric         # Metric class for distillation tasks; it wraps the base metric given by base_metric_name
+  base_metric_name: RecMetric      # The base class of indicator calculation. For the output of the model, the indicator will be calculated based on this class
+  main_indicator: acc              # The name of the indicator
+  key: "Student"                   # Select the main_indicator of this subnet as the criterion for saving the best model
+```
+Taking the above configuration as an example, the accuracy metric of the `Student` subnet will be used as the judgment metric for saving the best model.
+At the same time, the accuracy metric of all subnets will be printed out in the log.
+For more specific implementation of `DistillationMetric`, please refer to: [distillation_metric.py](../../ppocr/metrics/distillation_metric.py#L24).
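The selection logic can be sketched as follows: compute the same base metric for every sub-network, report the one named by `key` under `main_indicator`, and report the others under a name prefix. The function names and the prefixing scheme are assumptions made for illustration, and exact-match accuracy stands in for `RecMetric`.

```python
from typing import Dict, List


def exact_match_accuracy(predictions: List[str], labels: List[str]) -> float:
    """Fraction of predictions that equal their label exactly."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


def distillation_metric(
    decoded: Dict[str, List[str]],
    labels: List[str],
    key: str,
    main_indicator: str = "acc",
) -> Dict[str, float]:
    """Report the `key` sub-network unprefixed and the others with a name prefix."""
    report = {}
    for name, predictions in decoded.items():
        value = exact_match_accuracy(predictions, labels)
        if name == key:
            report[main_indicator] = value  # drives best-model selection
        else:
            report[f"{name}_{main_indicator}"] = value
    return report


labels = ["cat", "dog", "bird", "fish"]
decoded = {
    "Student": ["cat", "dog", "bird", "fsh"],
    "Teacher": ["cat", "dog", "bird", "fish"],
}
report = distillation_metric(decoded, labels, key="Student")
```

With `key: "Student"`, a checkpoint is judged by the student's accuracy even when the teacher scores higher, which matches the goal of shipping the student model.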
+<a name="215"></a>
+#### 2.1.5 Fine-tuning Distillation Model
+There are two ways to fine-tune the recognition distillation task.
+1. Fine-tuning based on knowledge distillation: this case is relatively simple. Download the pre-trained model, then configure the pre-trained model path and your own data path in [ch_PP-OCRv2_rec_distillation.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec_distillation.yml) to fine-tune the model.
+2. Fine-tuning without knowledge distillation: in this case, you first need to extract the student model parameters from the pre-trained model. The specific steps are as follows.
+- First download the pre-trained model and unzip it.
+```shell
+wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar
+tar -xf ch_PP-OCRv2_rec_train.tar
+```
+- Then use Python to extract the student model parameters.
+```python
+import paddle
+# Load the pre-trained model
+all_params = paddle.load("ch_PP-OCRv2_rec_train/best_accuracy.pdparams")
+# View the keys of the weight parameter
+print(all_params.keys())
+# Weight extraction of student model
+s_params = {key[len("Student."):]: all_params[key] for key in all_params if key.startswith("Student.")}
+# View the keys of the weight parameters of the student model
+print(s_params.keys())
+# Save weight parameters
+paddle.save(s_params, "ch_PP-OCRv2_rec_train/student.pdparams")
+```
+After the extraction is complete, use [ch_PP-OCRv2_rec.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml) to modify the path of the pre-trained model (the path of the exported `student.pdparams` model) and your own data path to fine-tune the model.
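The extraction step is ordinary prefix filtering on a dictionary, so it can be tried without Paddle. The sketch below uses a hypothetical helper `extract_subnet_params` and made-up parameter names; a prefix match with `startswith` guards against keys that merely contain `Student.` somewhere in the middle.

```python
from typing import Any, Dict


def extract_subnet_params(all_params: Dict[str, Any], subnet: str) -> Dict[str, Any]:
    """Keep only the keys of one sub-network and strip its name prefix."""
    prefix = subnet + "."
    return {
        key[len(prefix):]: value
        for key, value in all_params.items()
        if key.startswith(prefix)
    }


# Toy checkpoint: the key names are illustrative, not taken from a real model.
toy_params = {
    "Teacher.backbone.conv1.weight": 1,
    "Student.backbone.conv1.weight": 2,
    "Student.head.fc.bias": 3,
    "Student2.head.fc.bias": 4,
}
student_params = extract_subnet_params(toy_params, "Student")
```

Including the trailing dot in the prefix keeps `Student2.` keys out of the `Student` result.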
+<a name="22"></a>
+### 2.2 Detection Model Configuration File Analysis
+The configuration file of the detection model distillation is in the ```PaddleOCR/configs/det/ch_PP-OCRv2/``` directory, which contains three distillation configuration files:
+- ```ch_PP-OCRv2_det_cml.yml```: one large model distills two small models, and the two small models also learn from each other
+- ```ch_PP-OCRv2_det_dml.yml```: two student models distill each other
+- ```ch_PP-OCRv2_det_distill.yml```: a large teacher model distills a small student model
+<a name="221"></a>
+#### 2.2.1 Model Structure
+In the knowledge distillation task, the model structure configuration is as follows:
+```yaml
+Architecture:
+  name: DistillationModel          # Structure name, in the distillation task, it is DistillationModel
+  algorithm: Distillation          # Algorithm name
+  Models:                          # Model, including the configuration information of the subnet
+    Student:                       # The name of the subnet, it must include at least the `pretrained` and `freeze_params` parameters, and the other parameters are the construction parameters of the subnet
+      pretrained: ./pretrain_models/MobileNetV3_large_x0_5_pretrained  # Does this sub-network need to load pre-training weights
+      freeze_params: false         # Do you need fixed parameters
+      return_all_feats: false      # Do you need to return all features, if it is False, only the final output is returned
+      model_type: det
+      algorithm: DB
+      Backbone:
+        name: MobileNetV3
+        scale: 0.5
+        model_name: large
+        disable_se: True
+      Neck:
+        name: DBFPN
+        out_channels: 96
+      Head:
+        name: DBHead
+        k: 50
+    Teacher:                      # Another sub-network, here is a distillation example of a large model distill a small model
+      pretrained: ./pretrain_models/ch_ppocr_server_v2.0_det_train/best_accuracy
+      freeze_params: true         # The Teacher model is well-trained and does not need to participate in training
+      return_all_feats: false
+      model_type: det
+      algorithm: DB
+      Transform:
+      Backbone:
+        name: ResNet
+        layers: 18
+      Neck:
+        name: DBFPN
+        out_channels: 256
+      Head:
+        name: DBHead
+        k: 50
+```
+If DML is used, that is, two small models learn from each other, the `Teacher` network structure in the above configuration file needs to be set to the same configuration as the `Student` model.
+Refer to the configuration file [ch_PP-OCRv2_det_dml.yml](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_dml.yml) for details.
+The following describes the parameters of the configuration file [ch_PP-OCRv2_det_cml.yml](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml):
+```yaml
+Architecture:
+  name: DistillationModel  
+  algorithm: Distillation
+  model_type: det
+  Models:
+    Teacher:                         # Teacher model configuration of CML distillation
+      pretrained: ./pretrain_models/ch_ppocr_server_v2.0_det_train/best_accuracy
+      freeze_params: true            # Teacher does not train
+      return_all_feats: false
+      model_type: det
+      algorithm: DB
+      Transform:
+      Backbone:
+        name: ResNet
+        layers: 18
+      Neck:
+        name: DBFPN
+        out_channels: 256
+      Head:
+        name: DBHead
+        k: 50
+    Student:                         # Student model configuration for CML distillation
+      pretrained: ./pretrain_models/MobileNetV3_large_x0_5_pretrained  
+      freeze_params: false
+      return_all_feats: false
+      model_type: det
+      algorithm: DB
+      Backbone:
+        name: MobileNetV3
+        scale: 0.5
+        model_name: large
+        disable_se: True
+      Neck:
+        name: DBFPN
+        out_channels: 96
+      Head:
+        name: DBHead
+        k: 50
+    Student2:                          # Student2 model configuration for CML distillation
+      pretrained: ./pretrain_models/MobileNetV3_large_x0_5_pretrained  
+      freeze_params: false
+      return_all_feats: false
+      model_type: det
+      algorithm: DB
+      Transform:
+      Backbone:
+        name: MobileNetV3
+        scale: 0.5
+        model_name: large
+        disable_se: True
+      Neck:
+        name: DBFPN
+        out_channels: 96
+      Head:
+        name: DBHead
+        k: 50
+```
+For the specific implementation of the `DistillationModel` class, refer to [distillation_model.py](../../ppocr/modeling/architectures/distillation_model.py).
+The final model output is a dictionary: each key is the name of a sub-network (for example, `Student` and `Teacher`), and each value is the output of that sub-network,
+which can be either a `Tensor` (only the output of the last layer is returned) or a `dict` (intermediate features are returned as well).
+In the distillation task, the output of each network is saved as a `dict` containing the sub-module outputs, which makes it easier to add distillation loss functions.
+The keys are `backbone_out`, `neck_out` and `head_out`, and the values are the tensors of the corresponding modules. For the above configuration file, the output format of `DistillationModel` is as follows.
+```json
+{
+  "Teacher": {
+    "backbone_out": tensor,
+    "neck_out": tensor,
+    "head_out": tensor,
+  },
+  "Student": {
+    "backbone_out": tensor,
+    "neck_out": tensor,
+    "head_out": tensor,
+  }
+}
+```
+<a name="222"></a>
+#### 2.2.2 Loss Function
+In the detection knowledge distillation task ```ch_PP-OCRv2_det_distill.yml```, the distillation loss function configuration is as follows.
+```yaml
+Loss:
+  name: CombinedLoss                 # Loss function name
+  loss_config_list:                  # List of loss function configuration files, mandatory functions for CombinedLoss
+  - DistillationDilaDBLoss:          # DB loss function based on distillation, inherited from standard DBloss
+      weight: 1.0                    # The weight of the loss function. In loss_config_list, each loss function must include this field
+      model_name_pairs:              # Extract the output of these two sub-networks and calculate the loss between them
+      - ["Student", "Teacher"]
+      key: maps                      # In the sub-network output dict, take the corresponding tensor
+      balance_loss: true             # The following parameters are the configuration parameters of standard DBloss
+      main_loss_type: DiceLoss
+      alpha: 5
+      beta: 10
+      ohem_ratio: 3
+  - DistillationDBLoss:              # Used to calculate the loss between Student and GT
+      weight: 1.0
+      model_name_list: ["Student"]   # The model name only has Student, which means that the loss between Student and GT is calculated
+      name: DBLoss
+      balance_loss: true
+      main_loss_type: DiceLoss
+      alpha: 5
+      beta: 10
+      ohem_ratio: 3
+```
+Similarly, the distillation loss function configuration of `ch_PP-OCRv2_det_cml.yml` is shown below. Compared with the loss function configuration of `ch_PP-OCRv2_det_distill.yml`, there are three changes:
+```yaml
+Loss:
+  name: CombinedLoss
+  loss_config_list:
+  - DistillationDilaDBLoss:
+      weight: 1.0
+      model_name_pairs:
+      - ["Student", "Teacher"]
+      - ["Student2", "Teacher"]                  # 1. Calculate the loss of two Student and Teacher
+      key: maps
+      balance_loss: true
+      main_loss_type: DiceLoss
+      alpha: 5
+      beta: 10
+      ohem_ratio: 3
+  - DistillationDMLLoss:                         # 2. Add to calculate the loss between two students
+      model_name_pairs:
+      - ["Student", "Student2"]
+      maps_name: "thrink_maps"
+      weight: 1.0
+      # act: None
+      key: maps
+  - DistillationDBLoss:
+      weight: 1.0
+      model_name_list: ["Student", "Student2"]   # 3. Calculate the loss between two students and GT
+      balance_loss: true
+      main_loss_type: DiceLoss
+      alpha: 5
+      beta: 10
+      ohem_ratio: 3
+```
+For more specific implementation of `DistillationDilaDBLoss`, please refer to: [distillation_loss.py](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.4/ppocr/losses/distillation_loss.py#L185).
+For more specific implementations of distillation loss functions such as `DistillationDBLoss`, please refer to: [distillation_loss.py](https://github.com/PaddlePaddle/PaddleOCR/blob/04c44974b13163450dfb6bd2c327863f8a194b3c/ppocr/losses/distillation_loss.py#L148)
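To see which individual loss terms the CML configuration produces, the sketch below mirrors the relevant YAML fields as a Python list and enumerates one term per sub-network pair or per sub-network. `list_loss_terms` is a hypothetical helper written for this illustration, and `"GT"` is just a label for the ground truth.

```python
from typing import Dict, List, Tuple

# The pair and name fields copied from the CML loss configuration above.
cml_loss_config: List[Dict[str, dict]] = [
    {"DistillationDilaDBLoss": {
        "weight": 1.0,
        "model_name_pairs": [["Student", "Teacher"], ["Student2", "Teacher"]],
    }},
    {"DistillationDMLLoss": {
        "weight": 1.0,
        "model_name_pairs": [["Student", "Student2"]],
    }},
    {"DistillationDBLoss": {
        "weight": 1.0,
        "model_name_list": ["Student", "Student2"],
    }},
]


def list_loss_terms(loss_config_list) -> List[Tuple[str, Tuple[str, str]]]:
    """Return one (loss name, (source, target)) entry per loss term."""
    terms = []
    for entry in loss_config_list:
        for loss_name, cfg in entry.items():
            for first, second in cfg.get("model_name_pairs", []):
                terms.append((loss_name, (first, second)))
            for name in cfg.get("model_name_list", []):
                terms.append((loss_name, (name, "GT")))
    return terms


terms = list_loss_terms(cml_loss_config)
```

For this configuration the enumeration yields five terms: two student-teacher terms, one student-student term, and two student-to-ground-truth terms.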
+<a name="223"></a>
+#### 2.2.3 Post-processing
+In the detection knowledge distillation task, the post-processing configuration is as follows.
+```yaml
+PostProcess:
+  name: DistillationDBPostProcess                  # Post-processing of the DB detection distillation task, inherited from the standard DBPostProcess class
+  model_name: ["Student", "Student2", "Teacher"]   # Extract the output of multiple sub-networks and decode them. The network that does not require post-processing is not set in model_name
+  thresh: 0.3
+  box_thresh: 0.6
+  max_candidates: 1000
+  unclip_ratio: 1.5
+```
+Taking the above configuration as an example, the outputs of the three sub-networks `Student`, `Student2` and `Teacher` are post-processed at the same time.
+Because there are multiple inputs, post-processing also returns multiple outputs.
+For a more specific implementation of `DistillationDBPostProcess`, please refer to: [db_postprocess.py](../../ppocr/postprocess/db_postprocess.py#L195)
+<a name="224"></a>
+#### 2.2.4 Metric Calculation
+In the knowledge distillation task, the metric calculation configuration is as follows.
+```yaml
+Metric:
+  name: DistillationMetric
+  base_metric_name: DetMetric
+  main_indicator: hmean
+  key: "Student"
+```
+Because distillation involves multiple networks, the metrics of only one network need to be computed during evaluation.
+Setting the `key` field to `Student` means that only the metrics of the `Student` network are calculated.
+<a name="225"></a>
+#### 2.2.5 Fine-tuning Distillation Model
+There are three ways to fine-tune the detection distillation task:
+- `ch_PP-OCRv2_det_distill.yml`: the teacher model is set to the model provided by PaddleOCR or a large model you have trained.
+- `ch_PP-OCRv2_det_cml.yml`: uses CML distillation. Similarly, the `Teacher` model is set to the model provided by PaddleOCR or a large model you have trained.
+- `ch_PP-OCRv2_det_dml.yml`: uses DML distillation. Mutual distillation of the two `Student` models improves accuracy by about 1.7% on the data set used by PaddleOCR.
+When fine-tuning, set the pre-trained model to be loaded in the `pretrained` parameter of the network structure.
+In terms of accuracy improvement, `cml` > `dml` > `distill`. When the amount of data is insufficient or the accuracy of the teacher model is similar to that of the student, this conclusion may change.
+In addition, since the distillation pre-training model provided by PaddleOCR contains multiple model parameters, if you want to extract the parameters of the student model, you can refer to the following code:
+```sh
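+# Download the parameters of the distillation training model
+wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar
+# Unpack the archive so that the checkpoint path used below exists
+tar -xf ch_PP-OCRv2_det_distill_train.tar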
+```
+```python
+import paddle
+# Load the pre-trained model
+all_params = paddle.load("ch_PP-OCRv2_det_distill_train/best_accuracy.pdparams")
+# View the keys of the weight parameter
+print(all_params.keys())
+# Extract the weights of the student model
+s_params = {key[len("Student."):]: all_params[key] for key in all_params if key.startswith("Student.")}
+# View the keys of the weight parameters of the student model
+print(s_params.keys())
+# Save
+paddle.save(s_params, "ch_PP-OCRv2_det_distill_train/student.pdparams")
+```
+Finally, the parameters of the student model are saved in `ch_PP-OCRv2_det_distill_train/student.pdparams` and can be used to fine-tune the model.
--- a/doc/doc_en/models_en.md
+++ b/doc/doc_en/models_en.md
@@ -7,13 +7,13 @@ This section contains two parts. Firstly, [PP-OCR Model Download](./models_list_
 Let's first understand some basic concepts.
- [Introduction about OCR](#introduction-about-ocr)
+- [Introduction to OCR](#introduction-to-ocr)
  * [Basic Concepts of OCR Detection Model](#basic-concepts-of-ocr-detection-model)
  * [Basic Concepts of OCR Recognition Model](#basic-concepts-of-ocr-recognition-model)
  * [PP-OCR Model](#pp-ocr-model)
-## 1. Introduction about OCR
+## 1. Introduction to OCR
 This section briefly introduces the basic concepts of OCR detection model and recognition model, and introduces PaddleOCR's PP-OCR model.

--- a/doc/doc_en/models_list_en.md
+++ b/doc/doc_en/models_list_en.md
 # OCR Model List（V2.1, updated on 2021.9.6）
 > **Note**
-> 1. Compared with the model v2.0, the 2.1 version of the detection model has a improvement in accuracy, and the 2.1 version of the recognition model is optimized in accuracy and CPU speed.
+> 1. Compared with the model v2.0, the 2.1 version of the detection model has an improvement in accuracy, and the 2.1 version of the recognition model has optimizations in accuracy and CPU speed.
 > 2. Compared with [models 1.1](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md), which are trained with static graph programming paradigm, models 2.0 are the dynamic graph trained version and achieve close performance.
 > 3. All models in this tutorial are all ppocr-series models, for more introduction of algorithms and models based on public dataset, you can refer to [algorithm overview tutorial](./algorithm_overview_en.md).
@@ -18,7 +18,7 @@ The downloadable models provided by PaddleOCR include `inference model`, `traine
 |--- | --- | --- |
 |inference model|inference.pdmodel、inference.pdiparams|Used for inference based on Paddle inference engine，[detail](./inference_en.md)|
 |trained model, pre-trained model|\*.pdparams、\*.pdopt、\*.states |The checkpoints model saved in the training process, which stores the parameters of the model, mostly used for model evaluation and continuous training.|
-|slim model|\*.nb| Model compressed by PaddleSim (a model compression tool using PaddlePaddle), which is suitable for mobile-side deployment scenarios (Paddle-Lite is needed for slim model deployment). |
+|slim model|\*.nb| Model compressed by PaddleSlim (a model compression tool using PaddlePaddle), which is suitable for mobile-side deployment scenarios (Paddle-Lite is needed for slim model deployment). |
 Relationship of the above models is as follows.
@@ -50,7 +50,7 @@ Relationship of the above models is as follows.
 |ch_ppocr_server_v2.0_rec|General model, supporting Chinese, English and number recognition|[rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml)|94.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
-**Note:** The `trained model` is finetuned on the `pre-trained model` with real data and synthsized vertical text data, which achieved better performance in real scene. The `pre-trained model` is directly trained on the full amount of real data and synthsized data, which is more suitable for finetune on your own dataset.
+**Note:** The `trained model` is fine-tuned on the `pre-trained model` with real data and synthesized vertical text data, which achieves better performance in real scenes. The `pre-trained model` is directly trained on the full amount of real data and synthesized data, which makes it more suitable for fine-tuning on your own dataset.
 <a name="English"></a>
 ### 2.2 English Recognition Model

--- a/doc/doc_en/multi_languages_en.md
+++ b/doc/doc_en/multi_languages_en.md
@@ -28,12 +28,12 @@ The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japane
 This document will briefly introduce how to use the multilingual model.
 - [1 Installation](#Install)
-    - [1.1 paddle installation](#paddleinstallation)
+    - [1.1 Paddle installation](#paddleinstallation)
-    - [1.2 paddleocr package installation](#paddleocr_package_install)
+    - [1.2 PaddleOCR package installation](#paddleocr_package_install)
 - [2 Quick Use](#Quick_Use)
    - [2.1 Command line operation](#Command_line_operation)
-    - [2.2 python script running](#python_Script_running)
+    - [2.2 Run with Python script](#python_Script_running)
 - [3 Custom Training](#Custom_Training)
 - [4 Inference and Deployment](#inference)
 - [4 Supported languages and abbreviations](#language_abbreviations)
@@ -42,7 +42,7 @@ This document will briefly introduce how to use the multilingual model.
 ## 1 Installation
 <a name="paddle_install"></a>
-### 1.1 paddle installation
+### 1.1 Paddle installation
 ```
 # cpu
 pip install paddlepaddle
@@ -52,7 +52,7 @@ pip install paddlepaddle-gpu
 ```
 <a name="paddleocr_package_install"></a>
-### 1.2 paddleocr package installation
+### 1.2 PaddleOCR package installation
 pip install
@@ -79,8 +79,8 @@ paddleocr -h
 * Whole image prediction (detection + recognition)
-Paddleocr currently supports 80 languages, which can be switched by modifying the --lang parameter.
+PaddleOCR currently supports 80 languages, which can be specified by the --lang parameter.
-The specific supported [language] (#language_abbreviations) can be viewed in the table.
+The supported languages are listed in the [table](#language_abbreviations).
 ``` bash
 paddleocr --image_dir doc/imgs_en/254.jpg --lang=en
@@ -90,7 +90,7 @@ paddleocr --image_dir doc/imgs_en/254.jpg --lang=en
    <img src="../imgs_results/multi_lang/img_02.jpg" width="600" height="600">
 </div>
-The result is a list, each item contains a text box, text and recognition confidence
+The result is a list. Each item contains a text box, text and recognition confidence
 ```text
 [('PHO CAPITAL', 0.95723116), [[66.0, 50.0], [327.0, 44.0], [327.0, 76.0], [67.0, 82.0]]]
 [('107 State Street', 0.96311164), [[72.0, 90.0], [451.0, 84.0], [452.0, 116.0], [73.0, 121.0]]]
@@ -110,7 +110,7 @@ paddleocr --image_dir doc/imgs_words_en/word_308.png --det false --lang=en
 ![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_words_en/word_308.png)
-The result is a tuple, which returns the recognition result and recognition confidence
+The result is a 2-tuple, which contains the recognition result and recognition confidence
 ```text
 (0.99879867, 'LITTLE')
@@ -122,7 +122,7 @@ The result is a tuple, which returns the recognition result and recognition conf
 paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --rec false
 ```
-The result is a list, each item contains only text boxes
+The result is a list. Each item represents the coordinates of a text box.
 ```
 [[26.0, 457.0], [137.0, 457.0], [137.0, 477.0], [26.0, 477.0]]
@@ -132,9 +132,9 @@ The result is a list, each item contains only text boxes
 ```
 <a name="python_script_running"></a>
-### 2.2 python script running
+### 2.2 Run with Python script
-ppocr also supports running in python scripts for easy embedding in your own code:
+PPOCR is able to run with Python scripts for easy integration with your own code:
 * Whole image prediction (detection + recognition)
@@ -167,12 +167,12 @@ Visualization of results:
 ![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/korean.jpg)
-ppocr also supports direction classification. For more usage methods, please refer to: [whl package instructions](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md).
+PPOCR also supports direction classification. For more detailed usage, please refer to: [whl package instructions](whl_en.md).
 <a name="Custom_training"></a>
 ## 3 Custom training
-ppocr supports using your own data for custom training or finetune, where the recognition model can refer to [French configuration file](../../configs/rec/multi_language/rec_french_lite_train.yml)
+PPOCR supports using your own data for custom training or fine-tuning. For the recognition model, you can refer to the [French configuration file](../../configs/rec/multi_language/rec_french_lite_train.yml)
 Modify the training data path, dictionary and other parameters.
 For specific data preparation and training process, please refer to: [Text Detection](../doc_en/detection_en.md), [Text Recognition](../doc_en/recognition_en.md), more functions such as predictive deployment,
@@ -183,7 +183,7 @@ For functions such as data annotation, you can read the complete [Document Tutor
 ## 4 Inference and Deployment
 In addition to installing the whl package for quick forecasting,
-ppocr also provides a variety of forecasting deployment methods.
+PPOCR also provides a variety of forecasting deployment methods.
 If necessary, you can read related documents:
 - [Python Inference](./inference_en.md)

--- a/doc/doc_en/paddleOCR_overview_en.md
+++ b/doc/doc_en/paddleOCR_overview_en.md
@@ -2,7 +2,7 @@
 ## 1. PaddleOCR Overview
-PaddleOCR contains rich text detection, text recognition and end-to-end algorithms. Combining actual testing and industrial experience, PaddleOCR chooses DB and CRNN as the basic detection and recognition models, and proposes a series of models, named PP-OCR, for industrial applications after a series of optimization strategies. The PP-OCR model is aimed at general scenarios and forms a model library according to different languages. Based on the capabilities of PP-OCR, PaddleOCR releases the PP-Structure tool library for document scene tasks, including two major tasks: layout analysis and table recognition. In order to get through the entire process of industrial landing, PaddleOCR provides large-scale data production tools and a variety of prediction deployment tools to help developers quickly turn ideas into reality.
+PaddleOCR contains rich text detection, text recognition and end-to-end algorithms. Drawing on experience from real-world scenarios and the industry, PaddleOCR chooses DB and CRNN as the basic detection and recognition models, and proposes a series of models, named PP-OCR, for industrial applications after a series of optimization strategies. The PP-OCR model is aimed at general scenarios and forms a model library covering different languages. Based on the capabilities of PP-OCR, PaddleOCR releases the PP-Structure toolkit for document scene tasks, including two major tasks: layout analysis and table recognition. To cover the entire process of industrial deployment, PaddleOCR provides large-scale data production tools and a variety of prediction deployment tools to help developers quickly turn ideas into reality.
 <div align="center">
    <img src="../overview_en.png">
@@ -18,11 +18,11 @@ PaddleOCR contains rich text detection, text recognition and end-to-end algorith
 # Recommend
 git clone https://github.com/PaddlePaddle/PaddleOCR
-# If you cannot pull successfully due to network problems, you can also choose to use the code hosting on the cloud:
+# If you cannot pull successfully due to network problems, you can switch to the mirror hosted on Gitee:
 git clone https://gitee.com/paddlepaddle/PaddleOCR
-# Note: The cloud-hosting code may not be able to synchronize the update with this GitHub project in real time. There might be a delay of 3-5 days. Please give priority to the recommended method.
+# Note: The mirror on Gitee may not stay synchronized with the latest project on GitHub. There might be a delay of 3-5 days. Please try GitHub first.
 ```
 ### **2.2 Install third-party libraries**
@@ -34,6 +34,6 @@ pip3 install -r requirements.txt
 If you get the error `OSError: [WinError 126] The specified module could not be found` when you install shapely on Windows,
-Please try to download Shapely whl file using [http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely](http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely).
+Please try to download Shapely whl file from [http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely](http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely).
 Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)
\ No newline at end of file
--- a/doc/doc_en/pgnet_en.md
+++ b/doc/doc_en/pgnet_en.md
@@ -6,18 +6,18 @@
 <a name="Brief_Introduction"></a>
 ## 1. Brief Introduction
-OCR algorithm can be divided into two-stage algorithm and end-to-end algorithm. The two-stage OCR algorithm is generally divided into two parts, text detection and text recognition algorithm. The text detection algorithm gets the detection box of the text line from the image, and then the recognition algorithm identifies the content of the text box. The end-to-end OCR algorithm can complete text detection and recognition in one algorithm. Its basic idea is to design a model with both detection unit and recognition module, share the CNN features of both and train them together. Because one algorithm can complete character recognition, the end-to-end model is smaller and faster.
+OCR algorithms can be divided into two categories: two-stage algorithms and end-to-end algorithms. A two-stage OCR algorithm generally consists of two parts, a text detection algorithm and a text recognition algorithm. The text detection algorithm locates the box of each text line in the image, and then the recognition algorithm identifies the content of the text box. An end-to-end OCR algorithm combines text detection and recognition in one algorithm. Its basic idea is to design a model with both a detection unit and a recognition module, share the CNN features between them, and train them together. Because a single algorithm completes both detection and recognition, the end-to-end model is smaller and faster.
 ### Introduction Of PGNet Algorithm
-In recent years, the end-to-end OCR algorithm has been well developed, including MaskTextSpotter series, TextSnake, TextDragon, PGNet series and so on. Among these algorithms, PGNet algorithm has the advantages that other algorithms do not
+In recent years, end-to-end OCR algorithms have developed rapidly, including the MaskTextSpotter series, TextSnake, TextDragon, the PGNet series and so on. Among these algorithms, PGNet has several advantages over the others:
- Pgnet loss is designed to guide training, and no character-level annotations is needed
+- PGNet loss is designed to guide training, and no character-level annotation is needed.
- NMS and ROI related operations are not needed, It can accelerate the prediction
+- NMS and ROI related operations are not needed, which accelerates prediction
 - The reading order prediction module is proposed
 - A graph based modification module (GRM) is proposed to further improve the performance of model recognition
 - Higher accuracy and faster prediction speed
-For details of PGNet algorithm, please refer to [paper](https://www.aaai.org/AAAI21Papers/AAAI-2885.WangP.pdf) ,The schematic diagram of the algorithm is as follows:
+For details of PGNet algorithm, please refer to [paper](https://www.aaai.org/AAAI21Papers/AAAI-2885.WangP.pdf). The schematic diagram of the algorithm is as follows:
 ![](../pgnet_framework.png)
-After feature extraction, the input image is sent to four branches: TBO module for text edge offset prediction, TCL module for text centerline prediction, TDO module for text direction offset prediction, and TCC module for text character classification graph prediction.
+After feature extraction, the input image is sent to four branches: TBO module for text edge offset prediction, TCL module for text center-line prediction, TDO module for text direction offset prediction, and TCC module for text character classification graph prediction.
 The output of TBO and TCL can get text detection results after post-processing, and TCL, TDO and TCC are responsible for text recognition.
 The results of detection and recognition are as follows:
@@ -40,7 +40,7 @@ Please refer to [Operation Environment Preparation](./environment_en.md) to conf
 <a name="Quick_Use"></a>
 ## 3. Quick Use
-### inference model download
+### Inference model download
 This section takes the trained end-to-end model as an example to quickly use the model prediction. First, download the trained end-to-end inference model [download address](https://paddleocr.bj.bcebos.com/dygraph_v2.0/pgnet/e2e_server_pgnetA_infer.tar)
 ```
 mkdir inference && cd inference
@@ -131,7 +131,7 @@ python3 tools/train.py -c configs/e2e/e2e_r50_vd_pg.yml -o Optimizer.base_lr=0.0
 ```
 #### Load trained model and continue training
-If you expect to load trained model and continue the training again, you can specify the parameter `Global.checkpoints` as the model path to be loaded.
+If you would like to load a trained model and continue training, set the parameter `Global.checkpoints` to the path of the model to be loaded.
 ```shell
 python3 tools/train.py -c configs/e2e/e2e_r50_vd_pg.yml -o Global.checkpoints=./your/trained/model
 ```

--- a/doc/doc_en/training_en.md
+++ b/doc/doc_en/training_en.md
@@ -12,15 +12,15 @@
 * [4. FAQ](#3-faq)
-This article will introduce the basic concepts that need to be mastered during model training and the tuning methods during training.
+This article will introduce the basic concepts that are necessary for model training and tuning.
-At the same time, it will briefly introduce the components of the PaddleOCR model training data and how to prepare the data finetune model in the vertical scene.
+At the same time, it will briefly introduce the structure of the training data and how to prepare data to fine-tune a model in vertical scenes.
 <a name="1-Yml-Configuration"></a>
 ## 1. Yml Configuration
-The PaddleOCR model uses configuration files to manage network training and evaluation parameters. In the configuration file, you can set the model, optimizer, loss function, and pre- and post-processing parameters of the model. PaddleOCR reads these parameters from the configuration file, and then builds a complete training process to complete the model training. When optimized, the configuration can be completed by modifying the parameters in the configuration file, which is simple to use and convenient to modify.
+PaddleOCR uses configuration files to control network training and evaluation parameters. In the configuration file, you can set the model, optimizer, loss function, and pre- and post-processing parameters of the model. PaddleOCR reads these parameters from the configuration file, and then builds a complete training process to train the model. Fine-tuning can also be done by modifying the parameters in the configuration file, which is simple and convenient.
 For the complete configuration file description, please refer to [Configuration File](./config_en.md)
@@ -28,13 +28,13 @@ For the complete configuration file description, please refer to [Configuration
 ## 2. Basic Concepts
-In the process of model training, some hyperparameters need to be manually adjusted to help the model obtain the optimal index at the least loss. Different data volumes may require different hyper-parameters. When you want to finetune your own data or tune the model effect, there are several parameter adjustment strategies for reference:
+During the model training process, some hyper-parameters can be manually specified to obtain the optimal result at the least cost. Different data volumes may require different hyper-parameters. When you want to fine-tune the model based on your own data, there are several parameter adjustment strategies for reference:
 <a name="11-learning-rate"></a>
 ### 2.1 Learning Rate
-The learning rate is one of the important hyperparameters for training neural networks. It represents the step length of the gradient moving to the optimal solution of the loss function in each iteration.
+The learning rate is one of the most important hyper-parameters for training neural networks. It represents the step length of the gradient moving towards the optimal solution of the loss function in each iteration.
-A variety of learning rate update strategies are provided in PaddleOCR, which can be modified through configuration files, for example:
+A variety of learning rate update strategies are provided by PaddleOCR, which can be specified in configuration files. For example,
 ```
 Optimizer:
@@ -46,16 +46,15 @@ Optimizer:
    warmup_epoch: 5
 ```
-Piecewise stands for piecewise constant attenuation. Different learning rates are specified in different learning stages,
+`Piecewise` stands for piece-wise constant attenuation. Different learning rates are specified in different learning stages, and the learning rate stays the same within each stage.
-and the learning rate is the same in each stage.
-warmup_epoch means that in the first 5 epochs, the learning rate will gradually increase from 0 to base_lr. For all strategies, please refer to the code [learning_rate.py](../../ppocr/optimizer/learning_rate.py).
+`warmup_epoch` means that in the first 5 epochs, the learning rate will be increased gradually from 0 to `base_lr`. For all strategies, please refer to the code [learning_rate.py](../../ppocr/optimizer/learning_rate.py).
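The warm-up and piece-wise behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the code in `learning_rate.py`: the function name `piecewise_lr_with_warmup` and the boundary and value numbers are hypothetical (the hunk above elides the real configuration), and the sketch assumes that warm-up rises linearly from 0 to `base_lr`.

```python
def piecewise_lr_with_warmup(epoch, base_lr, boundaries, values, warmup_epoch):
    """Return the learning rate for a (possibly fractional) epoch.

    boundaries: ascending epochs at which the rate changes (hypothetical).
    values: one rate per stage, so len(values) == len(boundaries) + 1.
    """
    # Warm-up: rise linearly from 0 to base_lr over the first warmup_epoch epochs.
    if warmup_epoch > 0 and epoch < warmup_epoch:
        return base_lr * epoch / warmup_epoch
    # Piece-wise constant: the rate stays the same inside each stage.
    for boundary, value in zip(boundaries, values):
        if epoch < boundary:
            return value
    return values[-1]


# Hypothetical schedule: three stages with a 5-epoch warm-up.
schedule = dict(base_lr=0.001, boundaries=[10, 20],
                values=[0.001, 0.0001, 0.00001], warmup_epoch=5)
for e in (0, 2.5, 5, 15, 30):
    print(e, piecewise_lr_with_warmup(e, **schedule))
```

With these assumed numbers the rate climbs during the first 5 epochs, holds the first stage value until epoch 10, and then drops at each boundary.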
 <a name="12-regularization"></a>
 ### 2.2 Regularization
-Regularization can effectively avoid algorithm overfitting. PaddleOCR provides L1 and L2 regularization methods.
+Regularization can effectively avoid algorithm over-fitting. PaddleOCR provides L1 and L2 regularization methods.
-L1 and L2 regularization are the most commonly used regularization methods.
+L1 and L2 regularization are the most widely used regularization methods.
 L1 regularization adds a regularization term to the objective function to reduce the sum of absolute values of the parameters;
 while in L2 regularization, the purpose of adding a regularization term is to reduce the sum of squared parameters.
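As a rough illustration of the two penalty terms described above, the sketch below computes them for a flat list of parameters. The helper names `l1_penalty` and `l2_penalty` and the `factor` argument are hypothetical and are not PaddleOCR APIs; libraries also differ on constant factors (some halve the L2 term), which this sketch ignores.

```python
def l1_penalty(params, factor):
    """L1 term: factor times the sum of absolute parameter values."""
    return factor * sum(abs(p) for p in params)


def l2_penalty(params, factor):
    """L2 term: factor times the sum of squared parameter values."""
    return factor * sum(p * p for p in params)


# Hypothetical weights and regularization factor.
weights = [1.0, -2.0, 3.0]
data_loss = 0.5  # stand-in for the loss computed on the training data
print(data_loss + l1_penalty(weights, factor=0.1))
print(data_loss + l2_penalty(weights, factor=0.1))
```

Either penalty is added to the data loss, so minimizing the total pushes the parameters towards smaller magnitudes.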
 The configuration method is as follows:
@@ -95,7 +94,7 @@ The current open source models, data sets and magnitudes are as follows:
    - Chinese data set, LSVT street view data set crops the image according to the truth value, and performs position calibration, a total of 30w images. In addition, based on the LSVT corpus, 500w of synthesized data.
    - Small language data set, using different corpora and fonts, respectively generated 100w synthetic data set, and using ICDAR-MLT as the verification set.
-Among them, the public data sets are all open source, users can search and download by themselves, or refer to [Chinese data set](./datasets.md), synthetic data is not open source, users can use open source synthesis tools to synthesize by themselves. Synthesis tools include [text_renderer](https://github.com/Sanster/text_renderer), [SynthText](https://github.com/ankush-me/SynthText), [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator) etc.
+Among them, the public data sets are all open source; users can search for and download them, or refer to [Chinese data set](../doc_ch/datasets.md). Synthetic data is not open source; users can use open source synthesis tools to synthesize it themselves. Synthesis tools include [text_renderer](https://github.com/Sanster/text_renderer), [SynthText](https://github.com/ankush-me/SynthText), [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator) etc.
 <a name="22-vertical-scene"></a>
@@ -129,17 +128,17 @@ There are several experiences for reference when constructing the data set:
 **Q**: How to choose a suitable network input shape when training CRNN recognition?
    A: The height is generally 32. To choose the maximum width, there are two methods:
    (1) Calculate the aspect ratio distribution of the training sample images and choose a maximum aspect ratio that covers 80% of the training samples.
    (2) Count the number of characters in the training samples and choose a maximum character count that covers 80% of them. Then estimate the longest width by taking the aspect ratio of a Chinese character as approximately 1 and that of English as 3:1.
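Method (1) above can be sketched as follows. The function name `width_covering` and the sample sizes are hypothetical, and the nearest-rank rule used to pick the 80% point is an assumption; the answer does not say how the cut-off is computed.

```python
import math


def width_covering(sizes, height=32, coverage=0.8):
    """Estimate an input width that covers `coverage` of the samples.

    sizes: list of (width, height) pairs of the training images.
    """
    # Aspect ratio (width / height) of every training image, ascending.
    ratios = sorted(w / h for w, h in sizes)
    # Nearest-rank percentile: smallest ratio covering `coverage` of samples.
    idx = max(0, math.ceil(coverage * len(ratios)) - 1)
    # Scale the chosen ratio to the fixed network input height.
    return math.ceil(ratios[idx] * height)


# Hypothetical image sizes (width, height).
samples = [(100, 32), (200, 32), (320, 32), (64, 32), (640, 32)]
print(width_covering(samples))
```

Here four of the five samples have an aspect ratio of at most 10, so the sketch suggests a width of 320 for a height of 32; the one very wide image is left to be resized or cropped.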
 **Q**: During the recognition training, the accuracy of the training set has reached 90, but the accuracy of the verification set has been kept at 70, what should I do?
    A: If the training accuracy is 90 while the validation accuracy stays around 70, the model is probably over-fitting. There are two methods to try:
    (1) Add more augmentation methods or increase the [augmentation probability](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppocr/data/imaug/rec_img_aug.py#L341). The default is 0.4.
    (2) Increase the [L2 decay value](https://github.com/PaddlePaddle/PaddleOCR/blob/a501603d54ff5513fc4fc760319472e59da25424/configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml#L47).
 **Q**: When the recognition model is trained, loss can drop normally, but acc is always 0

--- a/doc/doc_en/update_en.md
+++ b/doc/doc_en/update_en.md
@@ -5,7 +5,7 @@
 - 2021.8.3 released PaddleOCR v2.2, add a new structured documents analysis toolkit, i.e., [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README.md), support layout analysis and table recognition (One-key to export chart images to Excel files).
 - 2021.4.8 release end-to-end text recognition algorithm [PGNet](https://www.aaai.org/AAAI21Papers/AAAI-2885.WangP.pdf) which is published in AAAI 2021. Find tutorial [here](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/pgnet_en.md); release multi language recognition [models](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/multi_languages_en.md), support more than 80 languages recognition; especially, the performance of [English recognition model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/models_list_en.md#English) is optimized.
- 2021.1.21 update more than 25+ multilingual recognition models [models list](./doc/doc_en/models_list_en.md), including：English, Chinese, German, French, Japanese，Spanish，Portuguese Russia Arabic and so on.  Models for more languages will continue to be updated [Develop Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048).
+- 2021.1.21 update 25+ multilingual recognition models [models list](./models_list_en.md), including: English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, Arabic and so on. Models for more languages will continue to be updated [Develop Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048).
 - 2020.12.15 update Data synthesis tool, i.e., [Style-Text](../../StyleText/README.md), easy to synthesize a large number of images which are similar to the target scene image.
 - 2020.11.25 Update a new data annotation tool, i.e., [PPOCRLabel](../../PPOCRLabel/README.md), which is helpful to improve the labeling efficiency. Moreover, the labeling results can be used in training of the PP-OCR system directly.
 - 2020.9.22 Update the PP-OCR technical article, https://arxiv.org/abs/2009.09941

--- a/ppstructure/vqa/images/input/zh_val_0.jpg
+++ b/ppstructure/vqa/images/input/zh_val_0.jpg
--- a/ppstructure/vqa/images/input/zh_val_21.jpg
+++ b/ppstructure/vqa/images/input/zh_val_21.jpg
--- a/ppstructure/vqa/images/input/zh_val_40.jpg
+++ b/ppstructure/vqa/images/input/zh_val_40.jpg
--- a/ppstructure/vqa/images/input/zh_val_42.jpg
+++ b/ppstructure/vqa/images/input/zh_val_42.jpg
--- a/ppstructure/vqa/images/result_re/zh_val_21_re.jpg
+++ b/ppstructure/vqa/images/result_re/zh_val_21_re.jpg
--- a/ppstructure/vqa/images/result_re/zh_val_40_re.jpg
+++ b/ppstructure/vqa/images/result_re/zh_val_40_re.jpg
--- a/ppstructure/vqa/images/result_ser/zh_val_0_ser.jpg
+++ b/ppstructure/vqa/images/result_ser/zh_val_0_ser.jpg