Fixed merge conflict in README.md

Commit 112ad00d authored Apr 23, 2022 by Manan Goel
20 changed files
--- a/doc/doc_ch/config.md
+++ b/doc/doc_ch/config.md
@@ -45,18 +45,18 @@
 ### Optimizer ([ppocr/optimizer](../../ppocr/optimizer))
-|         字段             |            用途            |      默认值        |            备注             |
+|         字段             |      用途       |      默认值      |            备注             |
-| :---------------------: |  :---------------------:   | :--------------:  |   :--------------------:   |
+| :---------------------: |:-------------:|:-------------:|   :--------------------:   |
-|      name        |         优化器类名          |  Adam  |  目前支持`Momentum`,`Adam`,`RMSProp`, 见[ppocr/optimizer/optimizer.py](../../ppocr/optimizer/optimizer.py)  |
+|      name        |     优化器类名     |     Adam      |  目前支持`Momentum`,`Adam`,`RMSProp`, 见[ppocr/optimizer/optimizer.py](../../ppocr/optimizer/optimizer.py)  |
-|      beta1           |    设置一阶矩估计的指数衰减率  |       0.9         |               \             |
+|      beta1           | 设置一阶矩估计的指数衰减率 |      0.9      |               \             |
-|      beta2           |    设置二阶矩估计的指数衰减率  |     0.999         |               \             |
+|      beta2           | 设置二阶矩估计的指数衰减率 |     0.999     |               \             |
-|      clip_norm           |    所允许的二范数最大值  |              |               \             |
+|      clip_norm           |  所允许的二范数最大值   |               |               \             |
-|      **lr**                |         设置学习率decay方式       |   -    |       \  |
+|      **lr**                | 设置学习率decay方式  |       -       |       \  |
-|        name    |      学习率decay类名   |         Cosine       | 目前支持`Linear`,`Cosine`,`Step`,`Piecewise`, 见[ppocr/optimizer/learning_rate.py](../../ppocr/optimizer/learning_rate.py) |
+|        name    |  学习率decay类名   |    Cosine     | 目前支持`Linear`,`Cosine`,`Step`,`Piecewise`, 见[ppocr/optimizer/learning_rate.py](../../ppocr/optimizer/learning_rate.py) |
-|        learning_rate      |    基础学习率        |       0.001      |  \        |
+|        learning_rate      |     基础学习率     |     0.001     |  \        |
-|      **regularizer**      |  设置网络正则化方式        |       -      | \        |
+|      **regularizer**      |   设置网络正则化方式   |       -       | \        |
-|        name      |    正则化类名      |       L2     | 目前支持`L1`,`L2`, 见[ppocr/optimizer/regularizer.py](../../ppocr/optimizer/regularizer.py)        |
+|        name      |     正则化类名     |      L2       | 目前支持`L1`,`L2`, 见[ppocr/optimizer/regularizer.py](../../ppocr/optimizer/regularizer.py)        |
-|        factor      |    学习率衰减系数       |       0.00004     |  \        |
+|        factor      |     正则化系数     |       0.00001        |  \        |
 ### Architecture ([ppocr/modeling](../../ppocr/modeling))
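
Note on the `factor` row above: it is now described as the regularization coefficient rather than a learning-rate decay coefficient. The two act on different quantities, which a minimal Python sketch can make concrete. The sketch assumes the common L2 convention (penalty `0.5 * factor * w**2`, hence a gradient term `factor * w`) and a plain cosine schedule. The function names are hypothetical, and this is not PaddleOCR's implementation.

```python
import math

# Defaults as listed in the Optimizer table of this diff.
BASE_LR = 0.001
L2_FACTOR = 0.00001


def cosine_lr(base_lr: float, step: int, total_steps: int) -> float:
    """Cosine decay: the learning rate shrinks from base_lr to 0 (assumed schedule)."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))


def l2_grad_term(weight: float, factor: float) -> float:
    """Extra gradient contributed by an L2 penalty 0.5 * factor * weight**2."""
    return factor * weight


def sgd_step(weight: float, grad: float, step: int, total_steps: int) -> float:
    """One plain SGD update (hypothetical helper).

    The decay schedule scales the step size; the regularizer adds a pull
    toward zero that is proportional to the weight itself.
    """
    lr = cosine_lr(BASE_LR, step, total_steps)
    return weight - lr * (grad + l2_grad_term(weight, L2_FACTOR))
```

With a zero data gradient, `sgd_step` still shrinks the weight slightly, which is the effect `factor` controls; the learning-rate schedule only changes how large each update is.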

--- a/doc/doc_ch/detection.md
+++ b/doc/doc_ch/detection.md
@@ -10,7 +10,10 @@
  * [2.1 启动训练](#21-----)
  * [2.2 断点训练](#22-----)
  * [2.3 更换Backbone 训练](#23---backbone---)
-  * [2.4 知识蒸馏训练](#24---distill---)
+  * [2.4 混合精度训练](#24---amp---)
+  * [2.5 分布式训练](#25---fleet---)
+  * [2.6 知识蒸馏训练](#26---distill---)
+  * [2.7 其他训练环境（Windows/macOS/Linux DCU）](#27---other---)
 - [3. 模型评估与预测](#3--------)
  * [3.1 指标评估](#31-----)
  * [3.2 测试检测效果](#32-------)
@@ -103,9 +106,6 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml \
 python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml \
     -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
-# 多机多卡训练，通过 --ips 参数设置使用的机器IP地址，通过 --gpus 参数设置使用的GPU ID
-python3 -m paddle.distributed.launch --ips="xx.xx.xx.xx,xx.xx.xx.xx" --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml \
-     -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
 ```
 上述指令中，通过-c 选择训练使用configs/det/det_mv3_db.yml配置文件。
@@ -116,15 +116,6 @@ python3 -m paddle.distributed.launch --ips="xx.xx.xx.xx,xx.xx.xx.xx" --gpus '0,1
 python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001
 ```
-**注意:** 采用多机多卡训练时，需要替换上面命令中的ips值为您机器的地址，机器之间需要能够相互ping通。另外，训练时需要在多个机器上分别启动命令。查看机器ip地址的命令为`ifconfig`。
-如果您想进一步加快训练速度，可以使用[自动混合精度训练](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/01_paddle2.0_introduction/basic_concept/amp_cn.html)， 以单机单卡为例，命令如下：
-```shell
-python3 tools/train.py -c configs/det/det_mv3_db.yml \
-     -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained \
-     Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True
- ```
 <a name="22-----"></a>
 ## 2.2 断点训练
@@ -183,15 +174,49 @@ args1: args1
 **注意**：如果要更换网络的其他模块，可以参考[文档](./add_new_algorithm.md)。
+<a name="24---amp---"></a>
+## 2.4 混合精度训练
+如果您想进一步加快训练速度，可以使用[自动混合精度训练](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/01_paddle2.0_introduction/basic_concept/amp_cn.html)， 以单机单卡为例，命令如下：
+```shell
+python3 tools/train.py -c configs/det/det_mv3_db.yml \
+     -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained \
+     Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True
+ ```
+<a name="25---fleet---"></a>
+## 2.5 分布式训练
+多机多卡训练时，通过 `--ips` 参数设置使用的机器IP地址，通过 `--gpus` 参数设置使用的GPU ID：
+```bash
+python3 -m paddle.distributed.launch --ips="xx.xx.xx.xx,xx.xx.xx.xx" --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml \
+     -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+```
-<a name="24---distill---"></a>
+**注意:** 采用多机多卡训练时，需要替换上面命令中的ips值为您机器的地址，机器之间需要能够相互ping通。另外，训练时需要在多个机器上分别启动命令。查看机器ip地址的命令为`ifconfig`。
+<a name="26---distill---"></a>
-## 2.4 知识蒸馏训练
+## 2.6 知识蒸馏训练
 PaddleOCR支持了基于知识蒸馏的检测模型训练过程，更多内容可以参考[知识蒸馏说明文档](./knowledge_distillation.md)。
+**注意：** 知识蒸馏训练目前只支持PP-OCR使用的`DB`和`CRNN`算法。
+<a name="27---other---"></a>
+## 2.7 其他训练环境
+- Windows GPU/CPU
+- macOS
+- Linux DCU
 <a name="3--------"></a>
 # 3. 模型评估与预测
@@ -206,22 +231,22 @@ PaddleOCR计算三个OCR检测相关的指标，分别是：Precision、Recall
 python3 tools/eval.py -c configs/det/det_mv3_db.yml  -o Global.checkpoints="{path/to/weights}/best_accuracy"
 ```
-* 注：`box_thresh`、`unclip_ratio`是DB后处理所需要的参数，在评估EAST模型时不需要设置
 <a name="32-------"></a>
 ## 3.2 测试检测效果
-测试单张图像的检测效果
+测试单张图像的检测效果：
 ```shell
 python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o Global.infer_img="./doc/imgs_en/img_10.jpg" Global.pretrained_model="./output/det_db/best_accuracy"
 ```
-测试DB模型时，调整后处理阈值
+测试DB模型时，调整后处理阈值：
 ```shell
 python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o Global.infer_img="./doc/imgs_en/img_10.jpg" Global.pretrained_model="./output/det_db/best_accuracy"  PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=2.0
 ```
+* 注：`box_thresh`、`unclip_ratio`是DB后处理参数，其他检测模型不支持。
-测试文件夹下所有图像的检测效果
+测试文件夹下所有图像的检测效果：
 ```shell
 python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o Global.infer_img="./doc/imgs_en/" Global.pretrained_model="./output/det_db/best_accuracy"
 ```
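
The commands in this file pass settings such as `Global.use_amp=True` or `PostProcess.box_thresh=0.6` after `-o`. A minimal Python sketch of how such dotted `key=value` overrides can be merged into a nested configuration is given here. The helper names, the starting values, and the literal-parsing rule are assumptions made for illustration; this is not the actual code in `tools/train.py`.

```python
import ast


def parse_value(text):
    """Interpret '1024.0', 'True', etc. as Python literals; keep anything else as a string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config, overrides):
    """Merge dotted 'A.b=value' overrides into a nested dict, in place."""
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Create intermediate sections that the base config lacks.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return config


# Illustrative starting values, standing in for a loaded YAML file.
config = {"Global": {"use_amp": False}, "PostProcess": {"box_thresh": 0.3}}
apply_overrides(config, [
    "Global.use_amp=True",
    "Global.scale_loss=1024.0",
    "PostProcess.unclip_ratio=2.0",
])
print(config)
```

Keys that already exist are replaced and missing ones are created, so the same mechanism covers both the AMP switches and the DB post-processing thresholds.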

--- a/doc/doc_ch/docvqa_datasets.md
+++ b/doc/doc_ch/docvqa_datasets.md
--- a/doc/doc_ch/layout_datasets.md
+++ b/doc/doc_ch/layout_datasets.md
--- a/doc/doc_ch/ocr_book.md
+++ b/doc/doc_ch/ocr_book.md
+# 《动手学OCR》电子书
+特点：
+- 覆盖OCR全栈技术
+- 理论实践相结合
+- Notebook交互式学习
+- 配套教学视频
+[电子书下载]()
+目录：
+![]()
+[notebook教程](../../notebook/notebook_ch/)
+[教学视频](https://aistudio.baidu.com/aistudio/education/group/info/25207)
\ No newline at end of file
--- a/doc/doc_ch/ppocr_introduction.md
+++ b/doc/doc_ch/ppocr_introduction.md
+[English](../doc_en/ppocr_introduction_en.md) | 简体中文
+# PP-OCR
+- [1. 简介](#1)
+- [2. 特性](#2)
+- [3. benchmark](#3)
+- [4. 效果展示](#4)
+- [5. 使用教程](#5)
+    - [5.1 快速体验](#51)
+    - [5.2 模型训练、压缩、推理部署](#52)
+- [6. 模型库](#6)
+<a name="1"></a>
+## 1. 简介
+PP-OCR是PaddleOCR自研的实用的超轻量OCR系统。在实现[前沿算法](algorithm.md)的基础上，考虑精度与速度的平衡，进行**模型瘦身**和**深度优化**，使其尽可能满足产业落地需求。
+PP-OCR是一个两阶段的OCR系统，其中文本检测算法选用[DB](algorithm_det_db.md)，文本识别算法选用[CRNN](algorithm_rec_crnn.md)，并在检测和识别模块之间添加[文本方向分类器](angle_class.md)，以应对不同方向的文本识别。
+PP-OCR系统pipeline如下：
+<div align="center">
+    <img src="../ppocrv2_framework.jpg" width="800">
+</div>
+PP-OCR系统在持续迭代优化，目前已发布PP-OCR和PP-OCRv2两个版本：
+[1] PP-OCR从骨干网络选择和调整、预测头部的设计、数据增强、学习率变换策略、正则化参数选择、预训练模型使用以及模型自动裁剪量化8个方面，采用19个有效策略，对各个模块的模型进行效果调优和瘦身(如绿框所示)，最终得到整体大小为3.5M的超轻量中英文OCR和2.8M的英文数字OCR。更多细节请参考PP-OCR技术方案 https://arxiv.org/abs/2009.09941
+[2] PP-OCRv2在PP-OCR的基础上，进一步在5个方面重点优化，检测模型采用CML协同互学习知识蒸馏策略和CopyPaste数据增广策略；识别模型采用LCNet轻量级骨干网络、UDML 改进知识蒸馏策略和[Enhanced CTC loss](./enhanced_ctc_loss.md)损失函数改进（如上图红框所示），进一步在推理速度和预测效果上取得明显提升。更多细节请参考PP-OCRv2[技术报告](https://arxiv.org/abs/2109.03144)。
+<a name="2"></a>
+## 2. 特性
+- 超轻量PP-OCRv2系列：检测（3.1M）+ 方向分类器（1.4M）+ 识别（8.5M）= 13.0M
+- 超轻量PP-OCR mobile移动端系列：检测（3.0M）+方向分类器（1.4M）+ 识别（5.0M）= 9.4M
+- 通用PP-OCR server系列：检测（47.1M）+方向分类器（1.4M）+ 识别（94.9M）= 143.4M
+- 支持中英文数字组合识别、竖排文本识别、长文本识别
+- 支持多语言识别：韩语、日语、德语、法语等约80种语言
+<a name="3"></a>
+## 3. benchmark
+关于PP-OCR系列模型之间的性能对比，请查看[benchmark](./benchmark.md)文档。
+<a name="4"></a>
+## 4. 效果展示 [more](./visualization.md)
+<details open>
+<summary>PP-OCRv2 中文模型</summary>
+<div align="center">
+      <img src="../imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
+      <img src="../imgs_results/ch_ppocr_mobile_v2.0/00018069.jpg" width="800">
+</div>
+<div align="center">
+    <img src="../imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
+    <img src="../imgs_results/ch_ppocr_mobile_v2.0/rotate_00052204.jpg" width="800">
+</div>
+</details>
+<details open>
+<summary>PP-OCRv2 英文模型</summary>
+<div align="center">
+    <img src="../imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
+</div>
+</details>
+<details open>
+<summary>PP-OCRv2 其他语言模型</summary>
+<div align="center">
+    <img src="../imgs_results/french_0.jpg" width="800">
+    <img src="../imgs_results/korean.jpg" width="800">
+</div>
+</details>
+<a name="5"></a>
+## 5. 使用教程
+<a name="51"></a>
+### 5.1 快速体验
+- 在线网站体验：超轻量PP-OCR mobile模型体验地址：https://www.paddlepaddle.org.cn/hub/scene/ocr
+- 移动端demo体验：[安装包DEMO下载地址](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)(基于EasyEdge和Paddle-Lite, 支持iOS和Android系统)
+- 一行命令快速使用：[快速开始（中英文/多语言）](./quickstart.md)
+<a name="52"></a>
+### 5.2 模型训练、压缩、推理部署
+更多教程，包括模型训练、模型压缩、推理部署等，请参考[文档教程](../../README_ch.md#文档教程)。
+<a name="6"></a>
+## 6. 模型库
+PP-OCR中英文模型列表如下：
+| 模型简介                              | 模型名称                | 推荐场景        | 检测模型                                                     | 方向分类器                                                   | 识别模型                                                     |
+| ------------------------------------- | ----------------------- | --------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
+| 中英文超轻量PP-OCRv2模型（13.0M）     | ch_PP-OCRv2_xx          | 移动端&服务器端 | [推理模型](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar) | [推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [推理模型](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar) |
+| 中英文超轻量PP-OCR mobile模型（9.4M） | ch_ppocr_mobile_v2.0_xx | 移动端&服务器端 | [推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar) | [推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
+| 中英文通用PP-OCR server模型（143.4M） | ch_ppocr_server_v2.0_xx | 服务器端        | [推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) | [推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
+更多模型下载（包括英文数字模型、多语言模型、Paddle-Lite模型等），可以参考[PP-OCR 系列模型下载](./models_list.md)。
\ No newline at end of file
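
The size totals quoted in the feature list of this file can be checked with a few lines of Python. The figures are copied from the list above; the dictionary layout is only an illustration.

```python
# Component sizes (M), copied from the feature list:
# (detection, direction classifier, recognition).
SIZES = {
    "PP-OCRv2": (3.1, 1.4, 8.5),
    "PP-OCR mobile": (3.0, 1.4, 5.0),
    "PP-OCR server": (47.1, 1.4, 94.9),
}

# Round to one decimal place to absorb floating-point noise in the sums.
totals = {name: round(sum(parts), 1) for name, parts in SIZES.items()}
for name, total in totals.items():
    print(f"{name}: {total}M")
```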
--- a/doc/doc_ch/quickstart.md
+++ b/doc/doc_ch/quickstart.md
- [PaddleOCR快速开始](#paddleocr快速开始)
+# PaddleOCR 快速开始
-  - [1. 安装](#1-安装)
-    - [1.1 安装PaddlePaddle](#11-安装paddlepaddle)
+**说明：** 本文主要介绍PaddleOCR wheel包对PP-OCR系列模型的快速使用，如要体验文档分析相关功能，请参考[PP-Structure快速使用教程](../../ppstructure/docs/quickstart.md)。
-    - [1.2 安装PaddleOCR whl包](#12-安装paddleocr-whl包)
-  - [2. 便捷使用](#2-便捷使用)
+- [1. 安装](#1)
-    - [2.1 命令行使用](#21-命令行使用)
+  - [1.1 安装PaddlePaddle](#11)
-      - [2.1.1 中英文模型](#211-中英文模型)
+  - [1.2 安装PaddleOCR whl包](#12)
-      - [2.1.2 多语言模型](#212-多语言模型)
+- [2. 便捷使用](#2)
-      - [2.1.3 版面分析](#213-版面分析)
+  - [2.1 命令行使用](#21)
-    - [2.2 Python脚本使用](#22-python脚本使用)
+      - [2.1.1 中英文模型](#211)
-      - [2.2.1 中英文与多语言使用](#221-中英文与多语言使用)
+      - [2.1.2 多语言模型](#212)
-      - [2.2.2 版面分析](#222-版面分析)
+  - [2.2 Python脚本使用](#22)
-  - [3. 小结](#3-小结)
+      - [2.2.1 中英文与多语言使用](#221)
+- [3. 小结](#3)
-# PaddleOCR快速开始
-<a name="1"></a>
+<a name="1"></a>
 ## 1. 安装
 <a name="11"></a>
 ### 1.1 安装PaddlePaddle
 > 如果您没有基础的Python运行环境，请参考[运行环境准备](./environment.md)。
@@ -39,22 +37,13 @@
 更多的版本需求，请参照[飞桨官网安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
 <a name="12"></a>
 ### 1.2 安装PaddleOCR whl包
 ```bash
 pip install "paddleocr>=2.0.1" # 推荐使用2.0.1+版本
 ```
- 对于Windows环境用户：
+- 对于Windows环境用户：直接通过pip安装的shapely库可能出现`[WinError 126] 找不到指定模块`的问题。建议从[这里](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely)下载shapely安装包完成安装。
-  直接通过pip安装的shapely库可能出现`[winRrror 126] 找不到指定模块的问题`。建议从[这里](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely)下载shapely安装包完成安装，
- 使用**版面分析**功能时，运行以下命令**安装 Layout-Parser**
-  ```bash
-  pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
-  ```
 <a name="2"></a>
@@ -68,7 +57,8 @@ PaddleOCR提供了一系列测试图片，点击[这里](https://paddleocr.bj.bc
 cd /path/to/ppocr_img
 ```
-如果不使用提供的测试图片，可以将下方`--image_dir`参数替换为相应的测试图片路径
+如果不使用提供的测试图片，可以将下方`--image_dir`参数替换为相应的测试图片路径。
 <a name="211"></a>
 #### 2.1.1 中英文模型
@@ -154,60 +144,6 @@ paddleocr --image_dir ./imgs_en/254.jpg --lang=en
 | 繁体中文 | chinese_cht |      | 意大利文 | it     |      | 俄罗斯文 | ru     |
 全部语种及其对应的缩写列表可查看[多语言模型教程](./multi_languages.md)
-<a name="213"></a>
-#### 2.1.3 版面分析
-版面分析是指对文档图片中的文字、标题、列表、图片和表格5类区域进行划分。对于前三类区域，直接使用OCR模型完成对应区域文字检测与识别，并将结果保存在txt中。对于表格类区域，经过表格结构化处理后，表格图片转换为相同表格样式的Excel文件。图片区域会被单独裁剪成图像。
-使用PaddleOCR的版面分析功能，需要指定`--type=structure`
-```bash
-paddleocr --image_dir=./table/1.png --type=structure
-```
- **返回结果说明**
-  PP-Structure的返回结果为一个dict组成的list，示例如下
-  ```shell
-  [{   'type': 'Text',
-        'bbox': [34, 432, 345, 462],
-        'res': ([[36.0, 437.0, 341.0, 437.0, 341.0, 446.0, 36.0, 447.0], [41.0, 454.0, 125.0, 453.0, 125.0, 459.0, 41.0, 460.0]],
-                  [('Tigure-6. The performance of CNN and IPT models using difforen', 0.90060663), ('Tent  ', 0.465441)])
-    }
-  ]
-  ```
-  其中各个字段说明如下
-  | 字段 | 说明                                                         |
-  | ---- | ------------------------------------------------------------ |
-  | type | 图片区域的类型                                               |
-  | bbox | 图片区域的在原图的坐标，分别[左上角x，左上角y，右下角x，右下角y] |
-  | res  | 图片区域的OCR或表格识别结果。<br>表格: 表格的HTML字符串; <br>OCR: 一个包含各个单行文字的检测坐标和识别结果的元组 |
-  运行完成后，每张图片会在`output`字段指定的目录下有一个同名目录，图片里的每个表格会存储为一个excel，图片区域会被裁剪之后保存下来，excel文件和图片名为表格在图片里的坐标。
-  ```
-  /output/table/1/
-    └─ res.txt
-    └─ [454, 360, 824, 658].xlsx  表格识别结果
-    └─ [16, 2, 828, 305].jpg            被裁剪出的图片区域
-    └─ [17, 361, 404, 711].xlsx        表格识别结果
-  ```
- **参数说明**
-  | 字段            | 说明                                     | 默认值                                       |
-  | --------------- | ---------------------------------------- | -------------------------------------------- |
-  | output          | excel和识别结果保存的地址                | ./output/table                               |
-  | table_max_len   | 表格结构模型预测时，图像的长边resize尺度 | 488                                          |
-  | table_model_dir | 表格结构模型 inference 模型地址          | None                                         |
-  | table_char_dict_path | 表格结构模型所用字典地址                 | ../ppocr/utils/dict/table_structure_dict.txt |
-  大部分参数和paddleocr whl包保持一致，见 [whl包文档](./whl.md)
 <a name="22"></a>
@@ -256,35 +192,7 @@ im_show.save('result.jpg')
 <div align="center">
    <img src="../imgs_results/whl/11_det_rec.jpg" width="800">
 </div>
-<a name="222"></a>
-#### 2.2.2 版面分析
-```python
-import os
-import cv2
-from paddleocr import PPStructure,draw_structure_result,save_structure_res
-table_engine = PPStructure(show_log=True)
-save_folder = './output/table'
-img_path = './table/paper-image.jpg'
-img = cv2.imread(img_path)
-result = table_engine(img)
-save_structure_res(result, save_folder,os.path.basename(img_path).split('.')[0])
-for line in result:
-    line.pop('img')
-    print(line)
-from PIL import Image
-font_path = './fonts/simfang.ttf' # PaddleOCR下提供字体包
-image = Image.open(img_path).convert('RGB')
-im_show = draw_structure_result(image, result,font_path=font_path)
-im_show = Image.fromarray(im_show)
-im_show.save('result.jpg')
-```
 <a name="3"></a>
@@ -292,4 +200,4 @@ im_show.save('result.jpg')
 通过本节内容，相信您已经熟练掌握PaddleOCR whl包的使用方法并获得了初步效果。
-PaddleOCR是一套丰富领先实用的OCR工具库，打通数据、模型训练、压缩和推理部署全流程，因此在[下一节](./paddleOCR_overview.md)中我们将首先为您介绍PaddleOCR的全景图，然后克隆PaddleOCR项目，正式开启PaddleOCR的应用之旅。
+PaddleOCR是一套丰富领先实用的OCR工具库，打通数据、模型训练、压缩和推理部署全流程，您可以参考[文档教程](../../README_ch.md#文档教程)，正式开启PaddleOCR的应用之旅。
--- a/doc/doc_ch/table_datasets.md
+++ b/doc/doc_ch/table_datasets.md
--- a/doc/doc_en/algorithm_det_db_en.md
+++ b/doc/doc_en/algorithm_det_db_en.md
+# DB
+- [1. Introduction](#1)
+- [2. Environment](#2)
+- [3. Model Training / Evaluation / Prediction](#3)
+    - [3.1 Training](#3-1)
+    - [3.2 Evaluation](#3-2)
+    - [3.3 Prediction](#3-3)
+- [4. Inference and Deployment](#4)
+    - [4.1 Python Inference](#4-1)
+    - [4.2 C++ Inference](#4-2)
+    - [4.3 Serving](#4-3)
+    - [4.4 More](#4-4)
+- [5. FAQ](#5)
+<a name="1"></a>
+## 1. Introduction
\ No newline at end of file
--- a/doc/doc_en/pgnet_en.md
+++ b/doc/doc_en/pgnet_en.md
--- a/doc/doc_en/algorithm_en.md
+++ b/doc/doc_en/algorithm_en.md
+# Academic Algorithms and Models
+PaddleOCR continuously adds cutting-edge OCR algorithms and models. See the supported models and tutorials in the following list:
+- [text detection algorithms](./algorithm_overview_en.md#11)
+- [text recognition algorithms](./algorithm_overview_en.md#12)
+- [end-to-end algorithms](./algorithm_overview_en.md#2)
+Developers are welcome to contribute more algorithms! Please refer to the [add new algorithm](./add_new_algorithm_en.md) guideline.
\ No newline at end of file
--- a/doc/doc_en/algorithm_overview_en.md
+++ b/doc/doc_en/algorithm_overview_en.md
-# Two-stage Algorithm
+# OCR Algorithms
- [1. Algorithm Introduction](#1-algorithm-introduction)
+- [1. Two-stage Algorithms](#1)
-  * [1.1 Text Detection Algorithm](#11-text-detection-algorithm)
+  * [1.1 Text Detection Algorithms](#11)
-  * [1.2 Text Recognition Algorithm](#12-text-recognition-algorithm)
+  * [1.2 Text Recognition Algorithms](#12)
- [2. Training](#2-training)
+- [2. End-to-end Algorithms](#2)
- [3. Inference](#3-inference)
-<a name="Algorithm_introduction"></a>
-## 1. Algorithm Introduction
+This tutorial lists the OCR algorithms supported by PaddleOCR, as well as the models and metrics of each algorithm on **English public datasets**. It is mainly used for algorithm introduction and algorithm performance comparison. For more models on other datasets including Chinese, please refer to [PP-OCR v2.0 models list](./models_list_en.md).
-This tutorial lists the text detection algorithms and text recognition algorithms supported by PaddleOCR, as well as the models and metrics of each algorithm on **English public datasets**. It is mainly used for algorithm introduction and algorithm performance comparison. For more models on other datasets including Chinese, please refer to [PP-OCR v2.0 models list](./models_list_en.md).
+<a name="1"></a>
+## 1. Two-stage Algorithms
- [1. Text Detection Algorithm](#TEXTDETECTIONALGORITHM)
+<a name="11"></a>
- [2. Text Recognition Algorithm](#TEXTRECOGNITIONALGORITHM)
-<a name="TEXTDETECTIONALGORITHM"></a>
+### 1.1 Text Detection Algorithms
-### 1.1 Text Detection Algorithm
+Supported text detection algorithms (Click the link to get the tutorial):
+- [x]  [DB](./algorithm_det_db_en.md)
-PaddleOCR open source text detection algorithms list:
+- [x]  [EAST](./algorithm_det_east_en.md)
- [x]  EAST([paper](https://arxiv.org/abs/1704.03155))[2]
+- [x]  [SAST](./algorithm_det_sast_en.md)
- [x]  DB([paper](https://arxiv.org/abs/1911.08947))[1]
+- [x]  [PSENet](./algorithm_det_psenet_en.md)
- [x]  SAST([paper](https://arxiv.org/abs/1908.05498))[4]
+- [x]  [FCENet](./algorithm_det_fcenet_en.md)
- [x]  PSENet([paper](https://arxiv.org/abs/1903.12473v2)）
 On the ICDAR2015 dataset, the text detection result is as follows:
@@ -48,20 +45,19 @@ On Total-Text dataset, the text detection result is as follows:
 * [Baidu Drive](https://pan.baidu.com/s/12cPnZcVuV1zn5DOd4mqjVw) (download code: 2bpi).
 * [Google Drive](https://drive.google.com/drive/folders/1ll2-XEVyCQLpJjawLDiRlvo_i4BqHCJe?usp=sharing)
-For the training guide and use of PaddleOCR text detection algorithms, please refer to the document [Text detection model training/evaluation/prediction](./detection_en.md)
-<a name="TEXTRECOGNITIONALGORITHM"></a>
+<a name="12"></a>
-### 1.2 Text Recognition Algorithm
+### 1.2 Text Recognition Algorithms
-PaddleOCR open-source text recognition algorithms list:
+Supported text recognition algorithms (Click the link to get the tutorial):
- [x]  CRNN([paper](https://arxiv.org/abs/1507.05717))[7]
+- [x]  [CRNN](./algorithm_rec_crnn_en.md)
- [x]  Rosetta([paper](https://arxiv.org/abs/1910.05085))[10]
+- [x]  [Rosetta](./algorithm_rec_rosetta_en.md)
- [x]  STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))[11]
+- [x]  [STAR-Net](./algorithm_rec_starnet_en.md)
- [x]  RARE([paper](https://arxiv.org/abs/1603.03915v1))[12]
+- [x]  [RARE](./algorithm_rec_rare_en.md)
- [x]  SRN([paper](https://arxiv.org/abs/2003.12294))[5]
+- [x]  [SRN](./algorithm_rec_srn_en.md)
- [x]  NRTR([paper](https://arxiv.org/abs/1806.00926v2))[13]
+- [x]  [NRTR](./algorithm_rec_nrtr_en.md)
- [x]  SAR([paper](https://arxiv.org/abs/1811.00751v2))
+- [x]  [SAR](./algorithm_rec_sar_en.md)
- [x] SEED([paper](https://arxiv.org/pdf/2005.10977.pdf))
+- [x]  [SEED](./algorithm_rec_seed_en.md)
 Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation result of these above text recognition (using MJSynth and SynthText for training, evaluate on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE) is as follow:
@@ -80,12 +76,10 @@ Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation r
 |SAR|Resnet31| 87.20% | rec_r31_sar | [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_r31_sar_train.tar) |
 |SEED|Aster_Resnet| 85.35% | rec_resnet_stn_bilstm_att | [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_resnet_stn_bilstm_att.tar) |
-Please refer to the document for training guide and use of PaddleOCR
-## 2. Training
-For the training guide and use of PaddleOCR text detection algorithms, please refer to the document [Text detection model training/evaluation/prediction](./detection_en.md). For text recognition algorithms, please refer to [Text recognition model training/evaluation/prediction](./recognition_en.md)
+<a name="2"></a>
-## 3. Inference
+## 2. End-to-end Algorithms
-Except for the PP-OCR series models of the above models, the other models only support inference based on the Python engine. For details, please refer to [Inference based on Python prediction engine](./inference_en.md)
+Supported end-to-end algorithms (Click the link to get the tutorial):
+- [x]  [PGNet](./algorithm_e2e_pgnet_en.md)
--- a/doc/doc_en/android_demo_en.md
+++ b/doc/doc_en/android_demo_en.md
-# Android Demo quick start
-### 1. Install the latest version of Android Studio
-It can be downloaded from https://developer.android.com/studio . This Demo is written by Android Studio version 4.0.
-### 2. Create a new project
-The NDK version 20b is used in the demo test, and the compilation can be successfully supported for version 20 and above.
-If you are a beginner, you can install and test the NDK compilation environment in the following ways.
-File -> New ->New Project  to create  "Native C++" project
-1. Start a new Android Studio project
-   Select Native C++ in the project template, select Paddle OCR/deploy/android_demo path
-   After entering the project, it will be automatically compiled. The first compilation
-   will take a long time. It is recommended to add an agent to speed up the download.
-**Agent add:**
-  Android Studio -> Preferences -> Appearance & Behavior -> System Settings -> HTTP Proxy -> Manual proxy configuration
-![](../demo/proxy.png)
-2. Start compilation
-Click the compile button, connect the phone, and follow the instructions of Android Studio to complete the operation.
-When you see the following picture in Android Studio, the compilation is complete:
-![](../demo/build.png)
-**Tip:** At this time, if the following error message that OpenCV cannot be found appears, please re-click compile,
-exit the project after compiling, and enter again.
-![](../demo/error.png)
-### 3. Send to mobile
-Complete the compilation, click Run, and check the effect on the mobile phone.
-### 4. How to customize the demo picture
-1. Image storage path: android_demo/app/src/main/assets/images
-   Place the custom picture under this path
-2. Configuration file: android_demo/app/src/main/res/values/strings.xml
-   Modify IMAGE_PATH_DEFAULT to a custom picture name
-# Get more support
-Go to [EasyEdge](https://ai.baidu.com/easyedge/app/open_source_demo?referrerUrl=paddlelite) to get more development support:
- Demo APP: You can use your mobile phone to scan the code to install, which is convenient for the mobile terminal to quickly experience text recognition
- SDK: The model is packaged to adapt to different chip hardware and operating system SDKs, including a complete interface to facilitate secondary development
--- a/doc/doc_en/paddleOCR_overview_en.md
+++ b/doc/doc_en/paddleOCR_overview_en.md
-# PaddleOCR Overview and Project Clone
+# Project Clone
-## 1. PaddleOCR Overview
+## 1. Clone PaddleOCR
-PaddleOCR contains rich text detection, text recognition and end-to-end algorithms. With the experience from real world scenarios and the industry, PaddleOCR chooses DB and CRNN as the basic detection and recognition models, and proposes a series of models, named PP-OCR, for industrial applications after a series of optimization strategies. The PP-OCR model is aimed at general scenarios and forms a model library of different languages. Based on the capabilities of PP-OCR, PaddleOCR releases the PP-Structure toolkit for document scene tasks, including two major tasks: layout analysis and table recognition. In order to get through the entire process of industrial landing, PaddleOCR provides large-scale data production tools and a variety of prediction deployment tools to help developers quickly turn ideas into reality.
+```bash
-<div align="center">
-    <img src="../overview_en.png">
-</div>
-## 2. Project Clone
-### **2.1 Clone PaddleOCR repo**
-```
 # Recommend
 git clone https://github.com/PaddlePaddle/PaddleOCR
@@ -25,9 +13,9 @@ git clone https://gitee.com/paddlepaddle/PaddleOCR
 # Note: The mirror on Gitee may not keep in synchronization with the latest project on GitHub. There might be a delay of 3-5 days. Please try GitHub at first.
 ```
-### **2.2 Install third-party libraries**
+## 2. Install third-party libraries
-```
+```bash
 cd PaddleOCR
 pip3 install -r requirements.txt
 ```

--- a/doc/doc_en/config_en.md
+++ b/doc/doc_en/config_en.md
@@ -57,7 +57,7 @@ Take rec_chinese_lite_train_v2.0.yml as an example
 |        learning_rate      |    Set the base learning rate        |       0.001      |  \        |
 |      **regularizer**      |  Set network regularization method        |       -      | \        |
 |        name      |    Regularizer class name      |       L2     |  Currently support`L1`,`L2`, see[ppocr/optimizer/regularizer.py](../../ppocr/optimizer/regularizer.py)        |
-|        factor      |    Learning rate decay coefficient       |       0.00004     |  \        |
+|        factor      |    Regularizer coefficient       |       0.00001     |  \        |
 ### Architecture ([ppocr/modeling](../../ppocr/modeling))

--- a/doc/doc_en/ocr_book_en.md
+++ b/doc/doc_en/ocr_book_en.md
+# E-book: *Dive Into OCR*
\ No newline at end of file
--- a/doc/doc_en/ppocr_introduction_en.md
+++ b/doc/doc_en/ppocr_introduction_en.md
+English | [简体中文](../doc_ch/ppocr_introduction.md)
+# PP-OCR
+- [1. Introduction](#1)
+- [2. Features](#2)
+- [3. Benchmark](#3)
+- [4. Visualization](#4)
+- [5. Tutorial](#5)
+    - [5.1 Quick start](#51)
+    - [5.2 Model training / compression / deployment](#52)
+- [6. Model zoo](#6)
+<a name="1"></a>
+## 1. Introduction
+PP-OCR is a self-developed, practical, ultra-lightweight OCR system. It is slimmed down and optimized on the basis of the reimplemented [academic algorithms](algorithm_en.md), balancing **accuracy** and **speed**.
+PP-OCR is a two-stage OCR system, in which the text detection algorithm is [DB](algorithm_det_db_en.md), and the text recognition algorithm is [CRNN](algorithm_rec_crnn_en.md). Besides, a [text direction classifier](angle_class_en.md) is added between the detection and recognition modules to deal with text in different directions.
+The PP-OCR pipeline is as follows:
+<div align="center">
+    <img src="../ppocrv2_framework.jpg" width="800">
+</div>
+The PP-OCR system is continuously optimized. At present, PP-OCR and PP-OCRv2 have been released:
+[1] PP-OCR adopts 19 effective strategies from 8 aspects, including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-trained model use, and automatic model pruning and quantization, to optimize and slim down the models of each module (as shown in the green box above). The result is an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English and digit OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941).
+[2] On the basis of PP-OCR, PP-OCRv2 is further optimized in five aspects. The detection model adopts the CML (Collaborative Mutual Learning) knowledge distillation strategy and the CopyPaste data augmentation strategy. The recognition model adopts the LCNet lightweight backbone network, the U-DML knowledge distillation strategy, and an enhanced CTC loss function (as shown in the red box above), which further improves inference speed and prediction quality. For more details, please refer to the technical report of PP-OCRv2 (https://arxiv.org/abs/2109.03144).
+<a name="2"></a>
+## 2. Features
+- Ultra lightweight PP-OCRv2 series models: detection (3.1M) + direction classifier (1.4M) + recognition (8.5M) = 13.0M
+- Ultra lightweight PP-OCR mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
+- General PP-OCR server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
+- Support Chinese, English, and digit recognition, vertical text recognition, and long text recognition
+- Support multilingual recognition: about 80 languages, such as Korean, Japanese, German, and French
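As a quick sanity check on the size figures in the list above, the per-component numbers can be summed directly. The snippet below is a small sketch; the dictionary and function names are illustrative and not part of PaddleOCR.

```python
# Component sizes (in M) copied from the feature list above.
MODEL_SIZES_M = {
    "PP-OCRv2": {"detection": 3.1, "direction classifier": 1.4, "recognition": 8.5},
    "PP-OCR mobile": {"detection": 3.0, "direction classifier": 1.4, "recognition": 5.0},
    "PP-OCR server": {"detection": 47.1, "direction classifier": 1.4, "recognition": 94.9},
}


def total_size(components, skip=()):
    """Sum component sizes, optionally leaving some components out."""
    total = sum(size for name, size in components.items() if name not in skip)
    return round(total, 1)  # round away floating-point noise


for series, components in MODEL_SIZES_M.items():
    print(series, total_size(components))

# Detection + recognition only, i.e. without the direction classifier.
v2_without_cls = total_size(MODEL_SIZES_M["PP-OCRv2"], skip=("direction classifier",))
print("PP-OCRv2 without classifier:", v2_without_cls)
```

The three totals agree with the list. The 11.6M shown for the PP-OCRv2 entry in the model zoo table equals detection plus recognition alone (3.1 + 8.5), so that figure appears to exclude the direction classifier.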
+<a name="3"></a>
+## 3. Benchmark
+For the performance comparison between PP-OCR series models, please check the [benchmark](./benchmark_en.md) documentation.
+<a name="4"></a>
+## 4. Visualization [more](./visualization.md)
+<details open>
+<summary>PP-OCRv2 English model</summary>
+<div align="center">
+    <img src="../imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
+</div>
+</details>
+<details open>
+<summary>PP-OCRv2 Chinese model</summary>
+<div align="center">
+      <img src="../imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
+      <img src="../imgs_results/ch_ppocr_mobile_v2.0/00018069.jpg" width="800">
+</div>
+<div align="center">
+    <img src="../imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
+    <img src="../imgs_results/ch_ppocr_mobile_v2.0/rotate_00052204.jpg" width="800">
+</div>
+</details>
+<details open>
+<summary>PP-OCRv2 Multilingual model</summary>
+<div align="center">
+    <img src="../imgs_results/french_0.jpg" width="800">
+    <img src="../imgs_results/korean.jpg" width="800">
+</div>
+</details>
+<a name="5"></a>
+## 5. Tutorial
+<a name="51"></a>
+### 5.1 Quick start
+- Try the ultra-lightweight OCR online: [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr)
+- Mobile demo (based on EasyEdge and Paddle-Lite, supports iOS and Android): [Sign in to the website to obtain the QR code for installing the App](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)
+- Quick use with one line of code: [Quick Start](./quickstart_en.md)
+<a name="52"></a>
+### 5.2 Model training / compression / deployment
+For more tutorials, including model training, model compression, deployment, etc., please refer to [tutorials](../../README.md#Tutorials).
+<a name="6"></a>
+## 6. Model zoo
+## PP-OCR Series Model List (Updated on September 8th)
+| Model introduction                                           | Model name                   | Recommended scene | Detection model                                              | Direction classifier                                         | Recognition model                                            |
+| ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
+| Chinese and English ultra-lightweight PP-OCRv2 model (11.6M) |  ch_PP-OCRv2_xx |Mobile & server|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)| [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar)|
+| Chinese and English ultra-lightweight PP-OCR model (9.4M)       | ch_ppocr_mobile_v2.0_xx      | Mobile & server   |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar)      |
+| Chinese and English general PP-OCR model (143.4M)               | ch_ppocr_server_v2.0_xx      | Server            |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar)  |
+For more model downloads (including multiple languages), please refer to [PP-OCR series model downloads](./models_list_en.md).
+To request support for a new language, please refer to [Guideline for new language requests](../../README.md#language_requests).
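The inference models in the table above are distributed as `.tar` archives. A minimal sketch of fetching and unpacking one with the Python standard library might look like the following. The helper names and the `inference` target directory are assumptions made for illustration, the URL is taken from the table, and the archive is extracted as-is on the assumption that the source is trusted.

```python
import tarfile
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

# Detection inference model URL, copied from the model list above.
DET_URL = "https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar"


def archive_name(url: str) -> str:
    """Return the file name part of a download URL."""
    return PurePosixPath(urlparse(url).path).name


def fetch_and_extract(url: str, dest: str = "inference") -> Path:
    """Download a model archive (if not already present) and unpack it."""
    dest_dir = Path(dest)  # hypothetical target directory
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / archive_name(url)
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest_dir)  # assumes a trusted archive
    return dest_dir


# Usage (requires network access, so it is left commented out):
# fetch_and_extract(DET_URL)
```

The same helper would apply to the direction classifier and recognition archives by passing their URLs from the table.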
--- a/doc/features.png
+++ b/doc/features.png
--- a/doc/features_en.png
+++ b/doc/features_en.png
--- a/doc/joinus.PNG
+++ b/doc/joinus.PNG