Merge remote-tracking branch 'origin/dygraph' into dygraph

Commit 86b90aa9 authored Dec 22, 2021 by Leif
20 changed files
--- a/README.md
+++ b/README.md
@@ -13,7 +13,6 @@ English | [简体中文](README_ch.md)
    <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
    <a href=""><img src="https://img.shields.io/pypi/format/PaddleOCR?color=c77"></a>
-    <a href="https://github.com/PaddlePaddle/PaddleOCR/graphs/contributors"><img src="https://img.shields.io/github/contributors/PaddlePaddle/PaddleOCR?color=9ea"></a>
    <a href="https://pypi.org/project/PaddleOCR/"><img src="https://img.shields.io/pypi/dm/PaddleOCR?color=9cf"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleOCR?color=ccf"></a>
 </p>
@@ -24,7 +23,8 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools
 **Recent updates**
+- 2021.12.21 The OCR open-source online course begins. Lessons start at 8:30 pm every evening and run for ten days. Free registration: https://aistudio.baidu.com/aistudio/course/introduce/25207
+- 2021.12.21 Released PaddleOCR v2.4, adding 1 text detection algorithm (PSENet), 3 text recognition algorithms (NRTR, SEED, SAR), 1 key information extraction algorithm (SDMGR), and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM).
 - The PaddleOCR R&D team would like to share the key points of PP-OCRv2 at 20:15 on September 8th, [Course Address](https://aistudio.baidu.com/aistudio/education/group/info/6758).
 - 2021.9.7 Released PaddleOCR v2.3 and proposed [PP-OCRv2](#PP-OCRv2). On CPU, the inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile.
 - 2021.8.3 Released PaddleOCR v2.2, adding a new structured document analysis toolkit, [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README.md), which supports layout analysis and table recognition (one-key export of chart images to Excel files).
@@ -38,7 +38,11 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools
    - Ultra lightweight PP-OCR mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
    - General PP-OCR server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
    - Support Chinese, English, and digit recognition, vertical text recognition, and long text recognition
-    - Support multi-language recognition: Korean, Japanese, German, French
+    - Support multi-language recognition: about 80 languages, including Korean, Japanese, German, and French
+- Document structure analysis system PP-Structure
+    - Support layout analysis and table recognition (with export to Excel)
+    - Support key information extraction
+    - Support DocVQA
 - Rich toolkits related to the OCR areas
    - Semi-automatic data annotation tool, i.e., PPOCRLabel: support fast and efficient data annotation
    - Data synthesis tool, i.e., Style-Text: easy to synthesize a large number of images which are similar to the target scene image

--- a/README_ch.md
+++ b/README_ch.md
@@ -9,7 +9,6 @@
    <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
    <a href=""><img src="https://img.shields.io/pypi/format/PaddleOCR?color=c77"></a>
-    <a href="https://github.com/PaddlePaddle/PaddleOCR/graphs/contributors"><img src="https://img.shields.io/github/contributors/PaddlePaddle/PaddleOCR?color=9ea"></a>
    <a href="https://pypi.org/project/PaddleOCR/"><img src="https://img.shields.io/pypi/dm/PaddleOCR?color=9cf"></a>
    <a href="https://github.com/PaddlePaddle/PaddleOCR/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleOCR?color=ccf"></a>
 </p>
@@ -20,11 +19,13 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库，助力
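 ## Recent updates
+- 2021.12.21 The "Ten Lectures on OCR" course begins, with online lessons at 8:30 pm every evening starting December 21! Free registration: https://aistudio.baidu.com/aistudio/course/introduce/25207
+- 2021.12.21 Released PaddleOCR v2.4. OCR algorithms: added 1 text detection algorithm (PSENet) and 3 text recognition algorithms (NRTR, SEED, SAR); document structure algorithms: added 1 key information extraction algorithm (SDMGR) and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM).
 - The PaddleOCR R&D team gives an in-depth technical walkthrough of the latest release, at 20:15 on September 8, [course replay](https://aistudio.baidu.com/aistudio/education/group/info/6758).
 - 2021.9.7 Released PaddleOCR v2.3 and [PP-OCRv2](#PP-OCRv2); CPU inference speed is 220% higher than PP-OCR server, and results are 7% better than PP-OCR mobile.
 - 2021.8.3 Released PaddleOCR v2.2, adding the document structure analysis toolkit [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README_ch.md), which supports layout analysis and table recognition (including Excel export).
-> For the full PaddleOCR update timeline, see the [documentation](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/doc/doc_ch/update.md).
+> [More](./doc/doc_ch/update.md)
 ## Features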
@@ -33,11 +34,14 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库，助力
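     - Ultra-lightweight PP-OCR mobile series: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
     - General PPOCR server series: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
     - Supports combined Chinese, English, and digit recognition, vertical text recognition, and long text recognition
-    - Supports multi-language recognition: Korean, Japanese, German, French, etc.
+    - Supports multi-language recognition: about 80 languages, including Korean, Japanese, German, and French
+- PP-Structure document structure analysis system
+    - Supports layout analysis and table recognition (including Excel export)
+    - Supports key information extraction
+    - Supports DocVQA
 - Rich, easy-to-use OCR toolkit components
     - Semi-automatic data annotation tool PPOCRLabel: supports fast and efficient data annotation
     - Data synthesis tool Style-Text: batch-synthesizes large numbers of images similar to the target scene
-    - Document analysis capability PP-Structure: supports layout analysis and table recognition (including Excel export)
 - Supports user-defined training and provides a rich set of inference and deployment options
 - Supports quick installation and use via pip
 - Runs on Linux, Windows, macOS, and other systems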
@@ -56,6 +60,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库，助力
 <div align="center">
 <img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/dygraph/doc/joinus.PNG"  width = "200" height = "200" />
 </div>
 ## Zero-code experience
 - Online demo: try the ultra-lightweight PP-OCR mobile model at https://www.paddlepaddle.org.cn/hub/scene/ocr

--- a/configs/kie/kie_unet_sdmgr.yml
+++ b/configs/kie/kie_unet_sdmgr.yml
+Global:
+  use_gpu: True
+  epoch_num: 60
+  log_smooth_window: 20
+  print_batch_step: 50
+  save_model_dir: ./output/kie_5/
+  save_epoch_step: 50
+  # evaluation is run every 80 iterations after the 0th iteration
+  eval_batch_step: [ 0, 80 ]
+  # 1. If pretrained_model is saved in static mode, such as classification pretrained model
+  #    from static branch, load_static_weights must be set as True.
+  # 2. If you want to finetune the pretrained models we provide in the docs,
+  #    you should set load_static_weights as False.
+  load_static_weights: False
+  cal_metric_during_train: False
+  pretrained_model: 
+  checkpoints:
+  save_inference_dir:
+  use_visualdl: False
+  class_path: ./train_data/wildreceipt/class_list.txt
+  infer_img: ./train_data/wildreceipt/1.txt
+  save_res_path: ./output/sdmgr_kie/predicts_kie.txt
+  img_scale: [ 1024, 512 ]
+Architecture:
+  model_type: kie
+  algorithm: SDMGR
+  Transform:
+  Backbone:
+    name: Kie_backbone
+  Head:
+    name: SDMGRHead
+Loss:
+  name: SDMGRLoss
+Optimizer:
+  name: Adam
+  beta1: 0.9
+  beta2: 0.999
+  lr:
+    name: Piecewise
+    learning_rate: 0.001
+    decay_epochs: [ 60, 80, 100]
+    values: [ 0.001, 0.0001, 0.00001]
+    warmup_epoch: 2
+  regularizer:
+    name: 'L2'
+    factor: 0.00005
+PostProcess:
+  name: None
+Metric:
+  name: KIEMetric
+  main_indicator: hmean
+Train:
+  dataset:
+    name: SimpleDataSet
+    data_dir: ./train_data/wildreceipt/
+    label_file_list: [ './train_data/wildreceipt/wildreceipt_train.txt' ]
+    ratio_list: [ 1.0 ]
+    transforms:
+      - DecodeImage: # load image
+          img_mode: RGB
+          channel_first: False
+      - NormalizeImage:
+          scale: 1
+          mean: [ 123.675, 116.28, 103.53 ]
+          std: [ 58.395, 57.12, 57.375 ]
+          order: 'hwc'
+      - KieLabelEncode: # Class handling label
+          character_dict_path: ./train_data/wildreceipt/dict.txt
+      - KieResize:
+      - ToCHWImage:
+      - KeepKeys:
+          keep_keys: [ 'image', 'relations', 'texts', 'points', 'labels', 'tag', 'shape'] # dataloader will return list in this order
+  loader:
+    shuffle: True
+    drop_last: False
+    batch_size_per_card: 4
+    num_workers: 4
+Eval:
+  dataset:
+    name: SimpleDataSet
+    data_dir: ./train_data/wildreceipt
+    label_file_list:
+      - ./train_data/wildreceipt/wildreceipt_test.txt
+      # - /paddle/data/PaddleOCR/train_data/wildreceipt/1.txt
+    transforms:
+      - DecodeImage: # load image
+          img_mode: RGB
+          channel_first: False
+      - KieLabelEncode: # Class handling label
+          character_dict_path: ./train_data/wildreceipt/dict.txt
+      - KieResize:
+      - NormalizeImage:
+          scale: 1
+          mean: [ 123.675, 116.28, 103.53 ]
+          std: [ 58.395, 57.12, 57.375 ]
+          order: 'hwc'
+      - ToCHWImage:
+      - KeepKeys:
+          keep_keys: [ 'image', 'relations', 'texts', 'points', 'labels', 'tag', 'ori_image', 'ori_boxes', 'shape']
+  loader:
+    shuffle: False
+    drop_last: False
+    batch_size_per_card: 1 # must be 1
+    num_workers: 4
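
The `Optimizer.lr` block in this config pairs a `Piecewise` schedule with `warmup_epoch: 2`. As a rough illustration only, the sketch below computes a per-epoch learning rate under two assumptions that the diff does not confirm: warmup is linear from 0, and `values[i]` applies until `decay_epochs[i]` is reached. The helper name `piecewise_lr` is hypothetical, and PaddleOCR's own scheduler may step per iteration rather than per epoch.

```python
def piecewise_lr(epoch, decay_epochs, values, warmup_epoch=0):
    """Hypothetical helper: learning rate for a given 0-based epoch."""
    if warmup_epoch > 0 and epoch < warmup_epoch:
        # Assumed linear warmup from 0 up to the first value.
        return values[0] * epoch / warmup_epoch
    # values[i] is used while the epoch is below decay_epochs[i].
    for boundary, value in zip(decay_epochs, values):
        if epoch < boundary:
            return value
    # Past the last boundary, keep the last value.
    return values[-1]


# Numbers copied from configs/kie/kie_unet_sdmgr.yml in this commit.
cfg = {
    "decay_epochs": [60, 80, 100],
    "values": [0.001, 0.0001, 0.00001],
    "warmup_epoch": 2,
}

for epoch in (0, 1, 10, 70, 90):
    print(epoch, piecewise_lr(epoch, **cfg))
```

Under this reading, a run limited to `epoch_num: 60` would stay on the first value after warmup, because the first boundary sits at epoch 60.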
--- a/configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml
+++ b/configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml
@@ -28,6 +28,7 @@ Optimizer:
  lr:
    name: Cosine
    learning_rate: 0.001
+    warmup_epoch: 5
  regularizer:
    name: 'L2'
    factor: 0.00004

--- a/configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml
+++ b/configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml
@@ -28,6 +28,7 @@ Optimizer:
  lr:
    name: Cosine
    learning_rate: 0.001
+    warmup_epoch: 5
  regularizer:
    name: 'L2'
    factor: 0.00001

--- a/configs/rec/rec_resnet_stn_bilstm_att.yml
+++ b/configs/rec/rec_resnet_stn_bilstm_att.yml
@@ -75,7 +75,7 @@ Train:
          channel_first: False
      - SEEDLabelEncode: # Class handling label
      - RecResizeImg:
-          character_type: en
+          character_dict_path:
          image_shape: [3, 64, 256]
          padding: False
      - KeepKeys:
@@ -96,7 +96,7 @@ Eval:
          channel_first: False
      - SEEDLabelEncode: # Class handling label
      - RecResizeImg:
-          character_type: en
+          character_dict_path:
          image_shape: [3, 64, 256]
          padding: False
      - KeepKeys:

--- a/deploy/cpp_infer/readme.md
+++ b/deploy/cpp_infer/readme.md
@@ -103,7 +103,7 @@ opencv3/
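 #### 1.2.1 Direct download and installation
-* The [Paddle inference library official website](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0/guides/05_inference_deployment/inference/build_and_install_lib_cn.html) provides Linux inference libraries for different CUDA versions; you can view and select the appropriate version there (*an inference library with paddle version >= 2.0.1 is recommended*).
+* The [Paddle inference library official website](https://paddle-inference.readthedocs.io/en/latest/user_guides/download_lib.html) provides Linux inference libraries for different CUDA versions; you can view and select the appropriate version there (*an inference library with paddle version >= 2.0.1 is recommended*).
 * After downloading, use the following method to uncompress.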
@@ -119,7 +119,7 @@ tar -xf paddle_inference.tgz
 ```shell
 git clone https://github.com/PaddlePaddle/Paddle.git
-git checkout release/2.1
+git checkout develop
 ```
 * After entering the Paddle directory, compile as follows.

--- a/deploy/cpp_infer/readme_en.md
+++ b/deploy/cpp_infer/readme_en.md
@@ -79,7 +79,7 @@ opencv3/
 #### 1.2.1 Direct download and installation
-[Paddle inference library official website](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0/guides/05_inference_deployment/inference/build_and_install_lib_cn.html). You can view and select the appropriate version of the inference library on the official website.
+[Paddle inference library official website](https://paddle-inference.readthedocs.io/en/latest/user_guides/download_lib.html). You can view and select the appropriate version of the inference library on the official website.
 * After downloading, use the following method to uncompress.
@@ -97,7 +97,7 @@ Finally you can see the following files in the folder of `paddle_inference/`.
 ```shell
 git clone https://github.com/PaddlePaddle/Paddle.git
-git checkout release/2.1
+git checkout develop
 ```
 * After entering the Paddle directory, the commands to compile the paddle inference library are as follows.

--- a/deploy/pdserving/README.md
+++ b/deploy/pdserving/README.md
@@ -45,63 +45,67 @@ PaddleOCR operating environment and Paddle Serving operating environment are nee
    ```
 3. Install the client to send requests to the service
-    In [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md) find the client installation package corresponding to the python version.
-    The python3.7 version is recommended here:
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-4. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-   **note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
+```bash
+# Install serving, used to start the service
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
+pip3 install paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
+# For a CUDA 10.1 environment, use the commands below to install paddle-serving-server
+# wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
+# pip3 install paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
+# Install the client, used to send requests to the service
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.7.0-cp37-none-any.whl
+pip3 install paddle_serving_client-0.7.0-cp37-none-any.whl
+# Install serving-app
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.7.0-py3-none-any.whl
+pip3 install paddle_serving_app-0.7.0-py3-none-any.whl
+```
+   **note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Latest_Packages_CN.md).
 <a name="model-conversion"></a>
 ## Model conversion
 When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy.
-Firstly, download the [inference model](https://github.com/PaddlePaddle/PaddleOCR#pp-ocr-20-series-model-listupdate-on-dec-15) of PPOCR
+Firstly, download the [inference model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/README_ch.md#pp-ocr%E7%B3%BB%E5%88%97%E6%A8%A1%E5%9E%8B%E5%88%97%E8%A1%A8%E6%9B%B4%E6%96%B0%E4%B8%AD) of PPOCR
 ```
 # Download and unzip the OCR text detection model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar -O ch_PP-OCRv2_det_infer.tar && tar -xf ch_PP-OCRv2_det_infer.tar
 # Download and unzip the OCR text recognition model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar -O ch_PP-OCRv2_rec_infer.tar &&  tar -xf ch_PP-OCRv2_rec_infer.tar
 ```
 Then, you can use the installed paddle_serving_client tool to convert the inference model to a serving model.
 ```
 #  Detection model conversion
-python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_mobile_v2.0_det_infer/ \
+python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_det_infer/ \
                                         --model_filename inference.pdmodel          \
                                         --params_filename inference.pdiparams       \
-                                         --serving_server ./ppocr_det_mobile_2.0_serving/ \
+                                         --serving_server ./ppocrv2_det_serving/ \
-                                         --serving_client ./ppocr_det_mobile_2.0_client/
+                                         --serving_client ./ppocrv2_det_client/
 #  Recognition model conversion
-python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_mobile_v2.0_rec_infer/ \
+python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
                                         --model_filename inference.pdmodel          \
                                         --params_filename inference.pdiparams       \
-                                         --serving_server ./ppocr_rec_mobile_2.0_serving/  \
+                                         --serving_server ./ppocrv2_rec_serving/  \
-                                         --serving_client ./ppocr_rec_mobile_2.0_client/
+                                         --serving_client ./ppocrv2_rec_client/
 ```
 After the detection model is converted, there will be additional folders of `ppocrv2_det_serving` and `ppocrv2_det_client` in the current folder, with the following format:
 ```
-|- ppocr_det_mobile_2.0_serving/
+|- ppocrv2_det_serving/
-   |- __model__
+  |- __model__  
-   |- __params__
+  |- __params__
-   |- serving_server_conf.prototxt
+  |- serving_server_conf.prototxt  
-   |- serving_server_conf.stream.prototxt
+  |- serving_server_conf.stream.prototxt
-|- ppocr_det_mobile_2.0_client
+|- ppocrv2_det_client
-   |- serving_client_conf.prototxt
+  |- serving_client_conf.prototxt  
-   |- serving_client_conf.stream.prototxt
+  |- serving_client_conf.stream.prototxt
 ```
 The recognition model is the same.
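
The two conversion commands in this README differ only in their directory names. As a convenience sketch, the snippet below assembles the same argument list in Python; `build_convert_cmd` is a hypothetical helper, not part of PaddleServing, and it uses only the flags that appear in the README text. It builds the command without running it.

```python
import sys


def build_convert_cmd(model_dir, server_dir, client_dir, python=sys.executable):
    """Hypothetical helper: argument list for `python -m paddle_serving_client.convert`."""
    return [
        python, "-m", "paddle_serving_client.convert",
        "--dirname", model_dir,
        "--model_filename", "inference.pdmodel",
        "--params_filename", "inference.pdiparams",
        "--serving_server", server_dir,
        "--serving_client", client_dir,
    ]


det_cmd = build_convert_cmd(
    "./ch_PP-OCRv2_det_infer/", "./ppocrv2_det_serving/", "./ppocrv2_det_client/")
rec_cmd = build_convert_cmd(
    "./ch_PP-OCRv2_rec_infer/", "./ppocrv2_rec_serving/", "./ppocrv2_rec_client/")

print(" ".join(det_cmd))
print(" ".join(rec_cmd))

# To actually run a conversion (requires paddle_serving_client to be installed):
# import subprocess
# subprocess.run(det_cmd, check=True)
```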

--- a/deploy/pdserving/README_CN.md
+++ b/deploy/pdserving/README_CN.md
@@ -34,70 +34,66 @@ PaddleOCR提供2种服务部署方式：
 - Prepare the PaddleServing runtime environment as follows
-1. Install serving, used to start the service
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # For other GPU environments, confirm the environment before choosing one of the following commands
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
-2. Install the client, used to send requests to the service
-    In the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md), find the client package matching your Python version; Python 3.7 is recommended here:
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-3. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-    **Note:** To install the latest version of PaddleServing, see this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
+```bash
+# Install serving, used to start the service
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
+pip3 install paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
+# For a CUDA 10.1 environment, use the commands below to install paddle-serving-server
+# wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
+# pip3 install paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
+# Install the client, used to send requests to the service
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.7.0-cp37-none-any.whl
+pip3 install paddle_serving_client-0.7.0-cp37-none-any.whl
+# Install serving-app
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.7.0-py3-none-any.whl
+pip3 install paddle_serving_app-0.7.0-py3-none-any.whl
+```
+**Note:** To install the latest version of PaddleServing, see this [link](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Latest_Packages_CN.md).
 <a name="模型转换"></a>
 ## Model conversion
 When using PaddleServing for service deployment, the saved inference model must be converted into a serving model that is easy to deploy.
-First, download the PPOCR [inference model](https://github.com/PaddlePaddle/PaddleOCR#pp-ocr-20-series-model-listupdate-on-dec-15)
+First, download the PPOCR [inference model](https://github.com/PaddlePaddle/PaddleOCR#pp-ocr-series-model-listupdate-on-september-8th)
-```
+```bash
 # Download and unzip the OCR text detection model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar -O ch_PP-OCRv2_det_infer.tar && tar -xf ch_PP-OCRv2_det_infer.tar
 # Download and unzip the OCR text recognition model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar -O ch_PP-OCRv2_rec_infer.tar &&  tar -xf ch_PP-OCRv2_rec_infer.tar
 ```
 Next, use the installed paddle_serving_client to convert the downloaded inference model into a model format that is easy to deploy on the server.
-```
+```bash
 # Convert the detection model
-python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_mobile_v2.0_det_infer/ \
+python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_det_infer/ \
                                         --model_filename inference.pdmodel          \
                                         --params_filename inference.pdiparams       \
-                                         --serving_server ./ppocr_det_mobile_2.0_serving/ \
+                                         --serving_server ./ppocrv2_det_serving/ \
-                                         --serving_client ./ppocr_det_mobile_2.0_client/
+                                         --serving_client ./ppocrv2_det_client/
 # Convert the recognition model
-python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_mobile_v2.0_rec_infer/ \
+python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
                                         --model_filename inference.pdmodel          \
                                         --params_filename inference.pdiparams       \
-                                         --serving_server ./ppocr_rec_mobile_2.0_serving/  \
+                                         --serving_server ./ppocrv2_rec_serving/  \
-                                         --serving_client ./ppocr_rec_mobile_2.0_client/
+                                         --serving_client ./ppocrv2_rec_client/
 ```
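-After the detection model is converted, the current folder will contain the additional folders `ppocr_det_mobile_2.0_serving` and `ppocr_det_mobile_2.0_client`, with the following format:
+After the detection model is converted, the current folder will contain the additional folders `ppocrv2_det_serving` and `ppocrv2_det_client`, with the following format: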
 ```
-|- ppocr_det_mobile_2.0_serving/
+|- ppocrv2_det_serving/
  |- __model__  
  |- __params__
  |- serving_server_conf.prototxt  
  |- serving_server_conf.stream.prototxt
-|- ppocr_det_mobile_2.0_client
+|- ppocrv2_det_client
  |- serving_client_conf.prototxt  
  |- serving_client_conf.stream.prototxt

--- a/deploy/pdserving/config.yml
+++ b/deploy/pdserving/config.yml
@@ -34,7 +34,7 @@ op:
            client_type: local_predictor
             #det model path
-            model_config: ./ppocr_det_mobile_2.0_serving
+            model_config: ./ppocrv2_det_serving
             #Fetch result list, based on the alias_name of fetch_var in client_config
            fetch_list: ["save_infer_model/scale_0.tmp_1"]
@@ -60,7 +60,7 @@ op:
            client_type: local_predictor
             #rec model path
-            model_config: ./ppocr_rec_mobile_2.0_serving
+            model_config: ./ppocrv2_rec_serving
             #Fetch result list, based on the alias_name of fetch_var in client_config
            fetch_list: ["save_infer_model/scale_0.tmp_1"]  

--- a/deploy/pdserving/web_service.py
+++ b/deploy/pdserving/web_service.py
@@ -54,7 +54,7 @@ class DetOp(Op):
        _, self.new_h, self.new_w = det_img.shape
        return {"x": det_img[np.newaxis, :].copy()}, False, None, ""
-    def postprocess(self, input_dicts, fetch_dict, log_id):
+    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        det_out = fetch_dict["save_infer_model/scale_0.tmp_1"]
        ratio_list = [
            float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w
@@ -129,7 +129,7 @@ class RecOp(Op):
        return feed_list, False, None, ""
-    def postprocess(self, input_dicts, fetch_data, log_id):
+    def postprocess(self, input_dicts, fetch_data, data_id, log_id):
        res_list = []
        if isinstance(fetch_data, dict):
            if len(fetch_data) > 0:

--- a/deploy/pdserving/web_service_det.py
+++ b/deploy/pdserving/web_service_det.py
@@ -54,7 +54,7 @@ class DetOp(Op):
        _, self.new_h, self.new_w = det_img.shape
        return {"x": det_img[np.newaxis, :].copy()}, False, None, ""
-    def postprocess(self, input_dicts, fetch_dict, log_id):
+    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        det_out = fetch_dict["save_infer_model/scale_0.tmp_1"]
        ratio_list = [
            float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w

--- a/deploy/pdserving/web_service_rec.py
+++ b/deploy/pdserving/web_service_rec.py
@@ -56,7 +56,7 @@ class RecOp(Op):
        feed_list.append(feed)
        return feed_list, False, None, ""
-    def postprocess(self, input_dicts, fetch_data, log_id):
+    def postprocess(self, input_dicts, fetch_data, data_id, log_id):
        res_list = []
        if isinstance(fetch_data, dict):
            if len(fetch_data) > 0:
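
The three `web_service*.py` hunks make the same change: `postprocess` gains a `data_id` parameter ahead of `log_id`. The sketch below shows an override written against the new signature. The `Op` class here is a local stand-in so the snippet runs without Paddle Serving installed, and the three-element return value is an assumption that this diff does not show.

```python
class Op:
    """Local stand-in for the Paddle Serving pipeline Op base class (not the real class)."""

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        raise NotImplementedError


class EchoOp(Op):
    """Hypothetical Op that reports which outputs it received."""

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        # data_id is the positional argument added in this commit; an override
        # that still uses the old three-argument form would fail with a
        # TypeError when called with four arguments.
        out_dict = {"fetched": sorted(fetch_dict), "data_id": data_id}
        # Assumed return shape: (output dict, error code, error message).
        return out_dict, None, ""


result = EchoOp().postprocess(
    input_dicts={},
    fetch_dict={"save_infer_model/scale_0.tmp_1": [0.1, 0.9]},
    data_id=7,
    log_id=0,
)
print(result)
```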

--- a/doc/doc_ch/update.md
+++ b/doc/doc_ch/update.md
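 # Updates
+- 2021.12.21 The "Ten Lectures on OCR" course begins, with online lessons at 8:30 pm every evening starting December 21! Free registration: https://aistudio.baidu.com/aistudio/course/introduce/25207
+- 2021.12.21 Released PaddleOCR v2.4. OCR algorithms: added 1 text detection algorithm (PSENet) and 3 text recognition algorithms (NRTR, SEED, SAR); document structure algorithms: added 1 key information extraction algorithm (SDMGR) and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM).
 - 2021.9.7 Released PaddleOCR v2.3 and [PP-OCRv2](#PP-OCRv2); CPU inference speed is 220% higher than PP-OCR server, and results are 7% better than PP-OCR mobile.
 - 2021.8.3 Released PaddleOCR v2.2, adding the document structure analysis toolkit [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README_ch.md), which supports layout analysis and table recognition (including Excel export).
 - 2021.6.29 The [FAQ](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/doc/doc_ch/FAQ.md) gained 5 frequently asked questions, for a total of 248; it is updated every Monday, so stay tuned.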

--- a/doc/doc_en/update_en.md
+++ b/doc/doc_en/update_en.md
 # RECENT UPDATES
+- 2021.12.21 The OCR open-source online course begins. Lessons start at 8:30 pm every evening and run for ten days. Free registration: https://aistudio.baidu.com/aistudio/course/introduce/25207
+- 2021.12.21 Released PaddleOCR v2.4, adding 1 text detection algorithm (PSENet), 3 text recognition algorithms (NRTR, SEED, SAR), 1 key information extraction algorithm (SDMGR), and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM).
 - 2021.9.7 Released PaddleOCR v2.3 and proposed [PP-OCRv2](#PP-OCRv2). The CPU inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile.
 - 2021.8.3 Released PaddleOCR v2.2, adding a new structured document analysis toolkit, [PP-Structure](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.2/ppstructure/README.md), which supports layout analysis and table recognition (one-key export of chart images to Excel files).
 - 2021.4.8 Released the end-to-end text recognition algorithm [PGNet](https://www.aaai.org/AAAI21Papers/AAAI-2885.WangP.pdf), published at AAAI 2021; find the tutorial [here](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/pgnet_en.md). Released multi-language recognition [models](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/multi_languages_en.md) supporting more than 80 languages; in particular, the performance of the [English recognition model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/models_list_en.md#English) is optimized.

--- a/ppocr/data/imaug/label_ops.py
+++ b/ppocr/data/imaug/label_ops.py
@@ -19,6 +19,7 @@ from __future__ import unicode_literals
 import numpy as np
 import string
+from shapely.geometry import LineString, Point, Polygon
 import json
 from ppocr.utils.logging import get_logger
@@ -286,6 +287,168 @@ class E2ELabelEncodeTrain(object):
        return data
+class KieLabelEncode(object):
+    def __init__(self, character_dict_path, norm=10, directed=False, **kwargs):
+        super(KieLabelEncode, self).__init__()
+        self.dict = dict({'': 0})
+        with open(character_dict_path, 'r', encoding='utf-8') as fr:
+            idx = 1
+            for line in fr:
+                char = line.strip()
+                self.dict[char] = idx
+                idx += 1
+        self.norm = norm
+        self.directed = directed
+    def compute_relation(self, boxes):
+        """Compute relation between every two boxes."""
+        x1s, y1s = boxes[:, 0:1], boxes[:, 1:2]
+        x2s, y2s = boxes[:, 4:5], boxes[:, 5:6]
+        ws, hs = x2s - x1s + 1, np.maximum(y2s - y1s + 1, 1)
+        dxs = (x1s[:, 0][None] - x1s) / self.norm
+        dys = (y1s[:, 0][None] - y1s) / self.norm
+        xhhs, xwhs = hs[:, 0][None] / hs, ws[:, 0][None] / hs
+        whs = ws / hs + np.zeros_like(xhhs)
+        relations = np.stack([dxs, dys, whs, xhhs, xwhs], -1)
+        bboxes = np.concatenate([x1s, y1s, x2s, y2s], -1).astype(np.float32)
+        return relations, bboxes
+    def pad_text_indices(self, text_inds):
+        """Pad text index to same length."""
+        max_len = 300
+        recoder_len = max([len(text_ind) for text_ind in text_inds])
+        padded_text_inds = -np.ones((len(text_inds), max_len), np.int32)
+        for idx, text_ind in enumerate(text_inds):
+            padded_text_inds[idx, :len(text_ind)] = np.array(text_ind)
+        return padded_text_inds, recoder_len
+    def list_to_numpy(self, ann_infos):
+        """Convert bboxes, relations, texts and labels to ndarray."""
+        boxes, text_inds = ann_infos['points'], ann_infos['text_inds']
+        boxes = np.array(boxes, np.int32)
+        relations, bboxes = self.compute_relation(boxes)
+        labels = ann_infos.get('labels', None)
+        if labels is not None:
+            labels = np.array(labels, np.int32)
+            edges = ann_infos.get('edges', None)
+            if edges is not None:
+                labels = labels[:, None]
+                edges = np.array(edges)
+                edges = (edges[:, None] == edges[None, :]).astype(np.int32)
+                if self.directed:
+                    edges = (edges & labels == 1).astype(np.int32)
+                np.fill_diagonal(edges, -1)
+                labels = np.concatenate([labels, edges], -1)
+        padded_text_inds, recoder_len = self.pad_text_indices(text_inds)
+        max_num = 300
+        temp_bboxes = np.zeros([max_num, 4])
+        h, _ = bboxes.shape
+        # bboxes has shape (h, 4): fill all four columns, which also works when h < 4
+        temp_bboxes[:h, :] = bboxes
+        temp_relations = np.zeros([max_num, max_num, 5])
+        temp_relations[:h, :h, :] = relations
+        temp_padded_text_inds = np.zeros([max_num, max_num])
+        temp_padded_text_inds[:h, :] = padded_text_inds
+        temp_labels = np.zeros([max_num, max_num])
+        temp_labels[:h, :h + 1] = labels
+        tag = np.array([h, recoder_len])
+        return dict(
+            image=ann_infos['image'],
+            points=temp_bboxes,
+            relations=temp_relations,
+            texts=temp_padded_text_inds,
+            labels=temp_labels,
+            tag=tag)
+    def convert_canonical(self, points_x, points_y):
+        assert len(points_x) == 4
+        assert len(points_y) == 4
+        points = [Point(points_x[i], points_y[i]) for i in range(4)]
+        polygon = Polygon([(p.x, p.y) for p in points])
+        min_x, min_y, _, _ = polygon.bounds
+        points_to_lefttop = [
+            LineString([points[i], Point(min_x, min_y)]) for i in range(4)
+        ]
+        distances = np.array([line.length for line in points_to_lefttop])
+        sort_dist_idx = np.argsort(distances)
+        lefttop_idx = sort_dist_idx[0]
+        if lefttop_idx == 0:
+            point_orders = [0, 1, 2, 3]
+        elif lefttop_idx == 1:
+            point_orders = [1, 2, 3, 0]
+        elif lefttop_idx == 2:
+            point_orders = [2, 3, 0, 1]
+        else:
+            point_orders = [3, 0, 1, 2]
+        sorted_points_x = [points_x[i] for i in point_orders]
+        sorted_points_y = [points_y[j] for j in point_orders]
+        return sorted_points_x, sorted_points_y
+    def sort_vertex(self, points_x, points_y):
+        assert len(points_x) == 4
+        assert len(points_y) == 4
+        x = np.array(points_x)
+        y = np.array(points_y)
+        center_x = np.sum(x) * 0.25
+        center_y = np.sum(y) * 0.25
+        x_arr = np.array(x - center_x)
+        y_arr = np.array(y - center_y)
+        angle = np.arctan2(y_arr, x_arr) * 180.0 / np.pi
+        sort_idx = np.argsort(angle)
+        sorted_points_x, sorted_points_y = [], []
+        for i in range(4):
+            sorted_points_x.append(points_x[sort_idx[i]])
+            sorted_points_y.append(points_y[sort_idx[i]])
+        return self.convert_canonical(sorted_points_x, sorted_points_y)
+    def __call__(self, data):
+        import json
+        label = data['label']
+        annotations = json.loads(label)
+        boxes, texts, text_inds, labels, edges = [], [], [], [], []
+        for ann in annotations:
+            box = ann['points']
+            x_list = [box[i][0] for i in range(4)]
+            y_list = [box[i][1] for i in range(4)]
+            sorted_x_list, sorted_y_list = self.sort_vertex(x_list, y_list)
+            sorted_box = []
+            for x, y in zip(sorted_x_list, sorted_y_list):
+                sorted_box.append(x)
+                sorted_box.append(y)
+            boxes.append(sorted_box)
+            text = ann['transcription']
+            texts.append(text)
+            text_ind = [self.dict[c] for c in text if c in self.dict]
+            text_inds.append(text_ind)
+            labels.append(ann['label'])
+            edges.append(ann.get('edge', 0))
+        ann_infos = dict(
+            image=data['image'],
+            points=boxes,
+            texts=texts,
+            text_inds=text_inds,
+            edges=edges,
+            labels=labels)
+        return self.list_to_numpy(ann_infos)
 class AttnLabelEncode(BaseRecLabelEncode):
    """ Convert between text-label and text-index """
@@ -344,8 +507,12 @@ class SEEDLabelEncode(BaseRecLabelEncode):
            max_text_length, character_dict_path, use_space_char)
    def add_special_char(self, dict_character):
+        self.padding = "padding"
        self.end_str = "eos"
-        dict_character = dict_character + [self.end_str]
+        self.unknown = "unknown"
+        dict_character = dict_character + [
+            self.end_str, self.padding, self.unknown
+        ]
        return dict_character
    def __call__(self, data):
@@ -356,8 +523,8 @@ class SEEDLabelEncode(BaseRecLabelEncode):
        if len(text) >= self.max_text_len:
            return None
        data['length'] = np.array(len(text)) + 1  # include eos
-        text = text + [len(self.character) - 1] * (self.max_text_len - len(text)
+        text = text + [len(self.character) - 3] + [len(self.character) - 2] * (
-                                                   )
+            self.max_text_len - len(text) - 1)
        data['label'] = np.array(text)
        return data
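
Review note: the `SEEDLabelEncode` hunks above append `eos`, `padding`, and `unknown` to the dictionary in that order, so `eos` sits at `len(self.character) - 3` and `padding` at `len(self.character) - 2`. The following is a minimal, plain-Python sketch of the resulting padding rule. The helper name `seed_pad` and its `num_chars` parameter are hypothetical and not part of the commit.

```python
def seed_pad(text_ids, num_chars, max_text_len):
    """Pad encoded text as the updated SEEDLabelEncode hunk appears to do.

    Assumption: the last three dictionary entries are eos, padding, unknown,
    in that order, and num_chars is the full dictionary size.
    """
    if len(text_ids) >= max_text_len:
        return None  # too long, the sample is dropped
    eos = num_chars - 3
    padding = num_chars - 2
    # One eos token, then padding up to max_text_len.
    padded = text_ids + [eos] + [padding] * (max_text_len - len(text_ids) - 1)
    length = len(text_ids) + 1  # the length counts the eos token
    return padded, length


label, length = seed_pad([5, 7, 2], num_chars=40, max_text_len=6)
print(label, length)
```

Before this change the tail was filled with the `eos` index only, so a decoder could not distinguish the end of the text from padding.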

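Review note: `sort_vertex` and `convert_canonical` in the label-encoding diff above order the four box corners by angle around the centroid, then rotate the sequence so that the corner nearest the bounding box's top-left comes first. The sketch below reproduces that idea with the standard library only. It replaces the `shapely` and NumPy calls used in the commit with `math.atan2` and `math.hypot`, so tie-breaking may differ from the original.

```python
import math


def sort_vertex(points_x, points_y):
    """Order 4 corners by angle, starting from the one nearest the top-left."""
    assert len(points_x) == 4 and len(points_y) == 4
    cx = sum(points_x) / 4.0
    cy = sum(points_y) / 4.0
    # Sort corner indices by their angle around the centroid.
    order = sorted(
        range(4),
        key=lambda i: math.atan2(points_y[i] - cy, points_x[i] - cx))
    xs = [points_x[i] for i in order]
    ys = [points_y[i] for i in order]
    # Rotate so the corner closest to (min_x, min_y) comes first.
    min_x, min_y = min(xs), min(ys)
    dists = [math.hypot(x - min_x, y - min_y) for x, y in zip(xs, ys)]
    start = dists.index(min(dists))
    rotation = [(start + k) % 4 for k in range(4)]
    return [xs[i] for i in rotation], [ys[i] for i in rotation]


sorted_x, sorted_y = sort_vertex([10, 0, 10, 0], [0, 0, 5, 5])
print(sorted_x, sorted_y)
```

In image coordinates (y pointing down) the result runs clockwise from the top-left corner.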
--- a/ppocr/data/imaug/operators.py
+++ b/ppocr/data/imaug/operators.py
@@ -111,7 +111,6 @@ class NormalizeImage(object):
        from PIL import Image
        if isinstance(img, Image.Image):
            img = np.array(img)
        assert isinstance(img,
                          np.ndarray), "invalid input 'img' in NormalizeImage"
        data['image'] = (
@@ -367,3 +366,53 @@ class E2EResizeForTest(object):
        ratio_w = resize_w / float(w)
        return im, (ratio_h, ratio_w)
+class KieResize(object):
+    def __init__(self, **kwargs):
+        super(KieResize, self).__init__()
+        img_scale = kwargs['img_scale']
+        self.max_side, self.min_side = img_scale[0], img_scale[1]
+    def __call__(self, data):
+        img = data['image']
+        points = data['points']
+        src_h, src_w, _ = img.shape
+        im_resized, scale_factor, [ratio_h, ratio_w
+                                   ], [new_h, new_w] = self.resize_image(img)
+        resize_points = self.resize_boxes(img, points, scale_factor)
+        data['ori_image'] = img
+        data['ori_boxes'] = points
+        data['points'] = resize_points
+        data['image'] = im_resized
+        data['shape'] = np.array([new_h, new_w])
+        return data
+    def resize_image(self, img):
+        norm_img = np.zeros([1024, 1024, 3], dtype='float32')
+        scale = [512, 1024]
+        h, w = img.shape[:2]
+        max_long_edge = max(scale)
+        max_short_edge = min(scale)
+        scale_factor = min(max_long_edge / max(h, w),
+                           max_short_edge / min(h, w))
+        resize_w, resize_h = int(w * float(scale_factor) + 0.5), int(h * float(
+            scale_factor) + 0.5)
+        max_stride = 32
+        resize_h = (resize_h + max_stride - 1) // max_stride * max_stride
+        resize_w = (resize_w + max_stride - 1) // max_stride * max_stride
+        im = cv2.resize(img, (resize_w, resize_h))
+        new_h, new_w = im.shape[:2]
+        w_scale = new_w / w
+        h_scale = new_h / h
+        scale_factor = np.array(
+            [w_scale, h_scale, w_scale, h_scale], dtype=np.float32)
+        norm_img[:new_h, :new_w, :] = im
+        return norm_img, scale_factor, [h_scale, w_scale], [new_h, new_w]
+    def resize_boxes(self, im, points, scale_factor):
+        points = points * scale_factor
+        img_shape = im.shape[:2]
+        points[:, 0::2] = np.clip(points[:, 0::2], 0, img_shape[1])
+        points[:, 1::2] = np.clip(points[:, 1::2], 0, img_shape[0])
+        return points
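
Review note: `KieResize.resize_image` above scales the image so that its long edge fits 1024 and its short edge fits 512, then rounds both sides up to a multiple of 32 before pasting the result into a 1024x1024 canvas. Note that `max_side` and `min_side` are stored in `__init__`, while `resize_image` uses the hard-coded `scale = [512, 1024]`. The sketch below mirrors the hard-coded path and only computes the sizes, with no OpenCV call. The helper name `kie_target_size` is hypothetical.

```python
def kie_target_size(h, w, scale=(512, 1024), max_stride=32):
    """Return (new_h, new_w, w_scale, h_scale) for an h x w input image."""
    max_long_edge = max(scale)
    max_short_edge = min(scale)
    # Largest factor that keeps both edges within their limits.
    scale_factor = min(max_long_edge / max(h, w), max_short_edge / min(h, w))
    resize_w = int(w * scale_factor + 0.5)
    resize_h = int(h * scale_factor + 0.5)
    # Round each side up to the next multiple of max_stride.
    new_h = (resize_h + max_stride - 1) // max_stride * max_stride
    new_w = (resize_w + max_stride - 1) // max_stride * max_stride
    return new_h, new_w, new_w / w, new_h / h


new_h, new_w, w_scale, h_scale = kie_target_size(600, 800)
print(new_h, new_w, w_scale, h_scale)
```

Because of the stride rounding, the horizontal and vertical factors can differ slightly, which is why the commit keeps a per-axis `scale_factor` array for rescaling the box coordinates.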
--- a/ppocr/losses/__init__.py
+++ b/ppocr/losses/__init__.py
@@ -35,6 +35,7 @@ from .cls_loss import ClsLoss
 # e2e loss
 from .e2e_pg_loss import PGLoss
+from .kie_sdmgr_loss import SDMGRLoss
 # basic loss function
 from .basic_loss import DistanceLoss
@@ -50,7 +51,7 @@ def build_loss(config):
    support_dict = [
        'DBLoss', 'PSELoss', 'EASTLoss', 'SASTLoss', 'CTCLoss', 'ClsLoss',
        'AttentionLoss', 'SRNLoss', 'PGLoss', 'CombinedLoss', 'NRTRLoss',
-        'TableAttentionLoss', 'SARLoss', 'AsterLoss'
+        'TableAttentionLoss', 'SARLoss', 'AsterLoss', 'SDMGRLoss'
    ]
    config = copy.deepcopy(config)
    module_name = config.pop('name')

--- a/ppocr/losses/kie_sdmgr_loss.py
+++ b/ppocr/losses/kie_sdmgr_loss.py
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from paddle import nn
+import paddle
+class SDMGRLoss(nn.Layer):
+    def __init__(self, node_weight=1.0, edge_weight=1.0, ignore=0):
+        super().__init__()
+        self.loss_node = nn.CrossEntropyLoss(ignore_index=ignore)
+        self.loss_edge = nn.CrossEntropyLoss(ignore_index=-1)
+        self.node_weight = node_weight
+        self.edge_weight = edge_weight
+        self.ignore = ignore
+    def pre_process(self, gts, tag):
+        gts, tag = gts.numpy(), tag.numpy().tolist()
+        temp_gts = []
+        batch = len(tag)
+        for i in range(batch):
+            num = tag[i][0]  # tag[i][1] (recoder_len) is not needed here
+            temp_gts.append(
+                paddle.to_tensor(
+                    gts[i, :num, :num + 1], dtype='int64'))
+        return temp_gts
+    def accuracy(self, pred, target, topk=1, thresh=None):
+        """Calculate accuracy according to the prediction and target.
+        Args:
+            pred (paddle.Tensor): The model prediction, shape (N, num_class)
+            target (paddle.Tensor): The target of each prediction, shape (N, )
+            topk (int | tuple[int], optional): If any of the top-``topk``
+                predictions matches the target, the prediction is regarded
+                as correct. Defaults to 1.
+            thresh (float, optional): Score threshold below which predictions
+                would count as incorrect. Currently unused. Defaults to None.
+        Returns:
+            float | tuple[float]: If the input ``topk`` is a single integer,
+                the function will return a single float as accuracy. If
+                ``topk`` is a tuple containing multiple integers, the
+                function will return a tuple containing accuracies of
+                each ``topk`` number.
+        """
+        assert isinstance(topk, (int, tuple))
+        if isinstance(topk, int):
+            topk = (topk, )
+            return_single = True
+        else:
+            return_single = False
+        maxk = max(topk)
+        if pred.shape[0] == 0:
+            accu = [paddle.to_tensor(0.) for _ in range(len(topk))]
+            return accu[0] if return_single else accu
+        pred_value, pred_label = paddle.topk(pred, maxk, axis=1)
+        pred_label = pred_label.transpose(
+            [1, 0])  # transpose to shape (maxk, N)
+        correct = paddle.equal(pred_label,
+                               (target.reshape([1, -1]).expand_as(pred_label)))
+        res = []
+        for k in topk:
+            correct_k = paddle.sum(correct[:k].reshape([-1]).astype('float32'),
+                                   axis=0,
+                                   keepdim=True)
+            res.append(
+                paddle.multiply(correct_k,
+                                paddle.to_tensor(100.0 / pred.shape[0])))
+        return res[0] if return_single else res
+    def forward(self, pred, batch):
+        node_preds, edge_preds = pred
+        gts, tag = batch[4], batch[5]
+        gts = self.pre_process(gts, tag)
+        node_gts, edge_gts = [], []
+        for gt in gts:
+            node_gts.append(gt[:, 0])
+            edge_gts.append(gt[:, 1:].reshape([-1]))
+        node_gts = paddle.concat(node_gts)
+        edge_gts = paddle.concat(edge_gts)
+        node_valids = paddle.nonzero(node_gts != self.ignore).reshape([-1])
+        edge_valids = paddle.nonzero(edge_gts != -1).reshape([-1])
+        loss_node = self.loss_node(node_preds, node_gts)
+        loss_edge = self.loss_edge(edge_preds, edge_gts)
+        loss = self.node_weight * loss_node + self.edge_weight * loss_edge
+        return dict(
+            loss=loss,
+            loss_node=loss_node,
+            loss_edge=loss_edge,
+            acc_node=self.accuracy(
+                paddle.gather(node_preds, node_valids),
+                paddle.gather(node_gts, node_valids)),
+            acc_edge=self.accuracy(
+                paddle.gather(edge_preds, edge_valids),
+                paddle.gather(edge_gts, edge_valids)))
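
Review note: in `SDMGRLoss.forward` above, each ground-truth row holds the node label in column 0 and the edge labels in the remaining columns, and `accuracy` reports top-k accuracy as a percentage. The following plain-Python sketch illustrates both steps on nested lists instead of Paddle tensors. The helper names `split_gt` and `topk_accuracy` are hypothetical and only approximate the tensor code in the commit.

```python
def split_gt(gt_rows):
    """Split rows of [node_label, edge_0, edge_1, ...] into flat lists."""
    node_gts = [row[0] for row in gt_rows]
    edge_gts = [value for row in gt_rows for value in row[1:]]
    return node_gts, edge_gts


def topk_accuracy(scores, targets, k=1):
    """Percentage of samples whose target is among the k highest scores."""
    if not scores:
        return 0.0  # mirrors the empty-prediction branch
    hits = 0
    for row, target in zip(scores, targets):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += int(target in ranked[:k])
    return 100.0 * hits / len(scores)


node_gts, edge_gts = split_gt([[2, -1, 1], [0, 0, -1]])
scores = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
targets = [1, 1, 2]
top1 = topk_accuracy(scores, targets, k=1)
top2 = topk_accuracy(scores, targets, k=2)
print(node_gts, edge_gts, top1, top2)
```

In the commit, entries equal to the ignore value (`self.ignore` for nodes, `-1` for edges) are filtered out with `paddle.nonzero` and `paddle.gather` before the accuracy is computed; the sketch leaves that filtering to the caller.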