Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into fix_vqa

0d7ee968 · WenmuZhou · a323fce6 · a11fbc0f · 0d7ee968 · 0d7ee968
Commit 0d7ee968 authored Jan 05, 2022 by WenmuZhou
20 changed files
--- a/notebook/notebook_ch/如何使用本书.ipynb
+++ b/notebook/notebook_ch/如何使用本书.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": false
+   },
+   "source": [
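+    "# 1. Course Prerequisites\n",
+    "\n",
+    "The OCR models covered in this course are built on deep learning, so this section introduces the related background knowledge, environment setup, project code, and other resources. Readers who are new to deep learning are especially encouraged to work through the material below.\n",
+    "\n",
+    "### 1.1 Prerequisite Knowledge\n",
+    "\n",
+    "The “learning” in deep learning grew out of machine-learning concepts such as neurons, perceptrons, and multilayer neural networks, so familiarity with basic machine-learning algorithms helps a great deal in understanding and applying deep learning. The “deep” refers to the convolution, pooling, and other vector-based mathematical operations used when processing large amounts of information. If you lack the theoretical background for either, you can take Professor Hung-yi Lee's [Linear Algebra](https://aistudio.baidu.com/aistudio/course/introduce/2063) and [Machine Learning](https://aistudio.baidu.com/aistudio/course/introduce/1978) courses.\n",
+    "\n",
+    "To understand deep learning itself, see the beginner course by Bi Ran, Distinguished Architect at Baidu: [Hands-On Deep Learning from Scratch with a Baidu Architect](https://aistudio.baidu.com/aistudio/course/introduce/1297). It covers the history of deep learning and uses a classic case study to introduce all the components of a deep learning project. It is a practice-oriented deep learning course.\n",
+    "\n",
+    "To put the theory into practice, [Python basics](https://aistudio.baidu.com/aistudio/course/introduce/1224) are essential. To reproduce deep learning models quickly, this course uses the PaddlePaddle deep learning framework. If you have already used another framework, the [quick-start guide](https://www.paddlepaddle.org.cn/documentation/docs/zh/practices/quick_start/hello_paddle.html) will get you up to speed with PaddlePaddle.\n",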
+    "\n",
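+    "### 1.2 Basic Environment Setup\n",
+    "\n",
+    "If you want to run the course code locally and have not set up a Python environment before, follow [Environment Setup from Scratch](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/environment.md) to install Anaconda or Docker for your operating system.\n",
+    "\n",
+    "If you have no local resources, you can run the code on the AI Studio training platform, where every project is presented as a Notebook to make learning easier. If you are unfamiliar with Notebook operations, see the [AI Studio project guide](https://ai.baidu.com/ai-doc/AISTUDIO/0k3e2tfzm).\n",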
+    "\n",
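+    "### 1.3 Getting and Running the Code\n",
+    "\n",
+    "This course is built on the PaddleOCR code base. First, clone the complete PaddleOCR project:\n",
+    "\n",
+    "```bash\n",
+    "# Recommended\n",
+    "git clone https://github.com/PaddlePaddle/PaddleOCR\n",
+    "\n",
+    "# If the clone fails because of network problems, you can use the mirror hosted on Gitee instead:\n",
+    "git clone https://gitee.com/paddlepaddle/PaddleOCR\n",
+    "```\n",
+    "\n",
+    "> Note: The code hosted on Gitee may not sync with this GitHub project in real time and can lag by 3 to 5 days, so prefer the recommended method.\n",
+    ">\n",
+    "> If you are unfamiliar with git, you can download a zip archive directly from `Code` on the PaddleOCR home page.\n",
+    "\n",
+    "Then install the third-party dependencies:\n",
+    "\n",
+    "```bash\n",
+    "cd PaddleOCR\n",
+    "pip3 install -r requirements.txt\n",
+    "```\n",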
+    "\n",
+    "\n",
+    "\n",
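+    "### 1.4 Consulting the Documentation\n",
+    "\n",
+    "The [PaddleOCR documentation](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/README_ch.md#%E6%96%87%E6%A1%A3%E6%95%99%E7%A8%8B) (in Chinese) explains in detail how to use PaddleOCR for model application, training, and deployment. The documentation is extensive, and most user questions are answered either there or in the FAQ. The [FAQ (in Chinese)](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/FAQ.md) in particular collects common user questions organized by the stages of a deep learning workflow, so reading it carefully is recommended.\n",
+    "\n",
+    "### 1.5 Getting Help\n",
+    "\n",
+    "If you run into bugs, usability problems, or documentation issues while using PaddleOCR, contact the maintainers through a [GitHub issue](https://github.com/PaddlePaddle/PaddleOCR/issues). Please follow the issue template and provide as much information as possible so that the maintainers can locate the problem quickly. The WeChat group is where PaddleOCR users chat day to day and is better suited to general questions. Besides members of the PaddleOCR team, enthusiastic developers there also answer questions."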
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "py35-paddle1.2.0"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
--- a/ppocr/data/imaug/label_ops.py
+++ b/ppocr/data/imaug/label_ops.py
@@ -345,7 +345,7 @@ class KieLabelEncode(object):
        max_num = 300
        temp_bboxes = np.zeros([max_num, 4])
        h, _ = bboxes.shape
-        temp_bboxes[:h, :h] = bboxes
+        temp_bboxes[:h, :] = bboxes
        temp_relations = np.zeros([max_num, max_num, 5])
        temp_relations[:h, :h, :] = relations
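
The slicing fix above can be checked in isolation. The following is a minimal sketch, assuming only that `bboxes` is an `(h, 4)` NumPy array as the hunk suggests. The helper name `pad_bboxes` is hypothetical and is not part of `KieLabelEncode`.

```python
import numpy as np


def pad_bboxes(bboxes, max_num=300):
    """Pad an (h, 4) array of boxes with zero rows up to (max_num, 4)."""
    h, _ = bboxes.shape
    padded = np.zeros([max_num, 4])
    # Copy all four coordinates of the first h rows. Slicing the columns
    # with [:h] instead requests an (h, min(h, 4)) target, which only
    # matches the (h, 4) source when h >= 4.
    padded[:h, :] = bboxes
    return padded


bboxes = np.array([[0, 0, 10, 10], [5, 5, 20, 20]], dtype=float)  # h = 2
padded = pad_bboxes(bboxes)
print(padded.shape)

# The pre-fix slicing fails for h = 2: a (2, 4) source cannot fill a (2, 2) target.
try:
    np.zeros([300, 4])[:2, :2] = bboxes
except ValueError as err:
    print("old slicing fails:", err)
```

With fewer than four boxes the old expression raises a broadcasting error, while with four or more it happens to behave the same as the fixed one, which may explain why the bug went unnoticed.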

--- a/ppocr/data/imaug/operators.py
+++ b/ppocr/data/imaug/operators.py
@@ -23,7 +23,6 @@ import sys
 import six
 import cv2
 import numpy as np
-import fasttext
 class DecodeImage(object):
@@ -136,6 +135,7 @@ class ToCHWImage(object):
 class Fasttext(object):
    def __init__(self, path="None", **kwargs):
+        import fasttext
        self.fast_model = fasttext.load_model(path)
    def __call__(self, data):
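
Moving the import into `__init__` means the optional dependency is only required when the operator is actually constructed. A minimal sketch of that pattern follows. The class `OptionalOp` and the use of `importlib` are illustrative assumptions, not the code in `operators.py`.

```python
import importlib


class OptionalOp:
    """Hypothetical operator whose dependency is resolved only on construction."""

    def __init__(self, module_name):
        # Deferred import: evaluated when the operator is instantiated,
        # not when the module that defines the class is imported.
        self.backend = importlib.import_module(module_name)


# Defining the class above required no optional package at all.
op = OptionalOp("json")
print(op.backend.dumps({"ok": True}))

try:
    OptionalOp("a_package_that_is_not_installed")
except ImportError as err:
    print("only this operator is affected:", err)
```

The trade-off is that a missing package surfaces later, at construction time, instead of at import time.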

--- a/ppocr/utils/save_load.py
+++ b/ppocr/utils/save_load.py
@@ -138,13 +138,16 @@ def load_pretrained_params(model, path):
    params = paddle.load(path + '.pdparams')
    state_dict = model.state_dict()
    new_state_dict = {}
-    for k1, k2 in zip(state_dict.keys(), params.keys()):
+    for k1 in params.keys():
-        if list(state_dict[k1].shape) == list(params[k2].shape):
+        if k1 not in state_dict.keys():
-            new_state_dict[k1] = params[k2]
+            logger.warning("The pretrained params {} not in model".format(k1))
+        else:
+            if list(state_dict[k1].shape) == list(params[k1].shape):
+                new_state_dict[k1] = params[k1]
            else:
                logger.warning(
                    "The shape of model params {} {} not matched with loaded params {} {} !".
-                format(k1, state_dict[k1].shape, k2, params[k2].shape))
+                    format(k1, state_dict[k1].shape, k1, params[k1].shape))
    model.set_state_dict(new_state_dict)
    logger.info("load pretrain successful from {}".format(path))
    return model
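
The change above replaces positional pairing of the two key lists with lookup by parameter name. The sketch below reproduces that filtering logic on plain dictionaries of NumPy arrays so it can be run without Paddle. The function `filter_params` and the sample parameter names are hypothetical.

```python
import numpy as np


def filter_params(model_state, loaded):
    """Keep loaded entries whose name and shape both match the model."""
    kept, skipped = {}, []
    for name, value in loaded.items():
        if name not in model_state:
            skipped.append((name, "not in model"))
        elif list(model_state[name].shape) != list(value.shape):
            skipped.append((name, "shape mismatch"))
        else:
            kept[name] = value
    return kept, skipped


model_state = {"conv.weight": np.zeros((8, 3)), "fc.weight": np.zeros((10, 8))}
# The loaded checkpoint lists its keys in a different order and has an extra entry.
loaded = {
    "fc.weight": np.ones((5, 8)),
    "conv.weight": np.ones((8, 3)),
    "head.bias": np.ones((5,)),
}
kept, skipped = filter_params(model_state, loaded)
print(sorted(kept))
print(skipped)
```

Pairing the keys with `zip` would have compared `conv.weight` against `fc.weight` here, because the two dictionaries do not share an ordering. Matching by name avoids depending on key order.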

--- a/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
@@ -3,7 +3,7 @@ model_name:PPOCRv2_ocr_det_kl
 python:python3.7
 Global.pretrained_model:null
 Global.save_inference_dir:null
-infer_model:./inference/ch_PP-OCRv2_det_infer/
+infer_model:./inference/ch_PP-OCRv2_det_infer
 infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml -o
 infer_quant:True
 inference:tools/infer/predict_det.py

--- a/test_tipc/configs/ch_ppocr_mobile_V2.0_det_FPGM/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_V2.0_det_FPGM/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:null
 quant_export:null
 fpgm_export:deploy/slim/prune/export_prune_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o 

--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o 
 quant_export:null
 fpgm_export:null

--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:null
 quant_export:null
 fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/rec_chinese_lite_train_v2.0.yml -o 

--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
@@ -13,7 +13,7 @@ inference:tools/infer/predict_rec.py
 --rec_batch_num:1
 --use_tensorrt:False|True
 --precision:int8
--det_model_dir:
+--rec_model_dir:
 --image_dir:./inference/rec_inference
 null:null
 --benchmark:True

--- a/test_tipc/configs/ch_ppocr_server_v2.0_det/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_det/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c test_tipc/configs/ch_ppocr_server_v2.0_det/det_r50_vd_db.yml -o 
 quant_export:null 
 fpgm_export:null

--- a/test_tipc/configs/det_mv3_db_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_mv3_db_v2.0/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c configs/det/det_mv3_db.yml -o 
 quant_export:null
 fpgm_export:null

--- a/test_tipc/configs/det_mv3_east_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_mv3_east_v2.0/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_east_v2.0/det_mv3_east.yml -o 
 quant_export:null 
 fpgm_export:null

--- a/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_pse_v2.0/det_mv3_pse.yml -o 
 quant_export:null 
 fpgm_export:null

--- a/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c configs/det/det_r50_vd_db.yml -o 
 quant_export:null 
 fpgm_export:null
@@ -34,7 +34,7 @@ distill_export:null
 export1:null
 export2:null
 ##
-train_model:./inference/ch_ppocr_server_v2.0_det_train/best_accuracy
+train_model:./inference/det_r50_vd_db_v2.0_train/best_accuracy
 infer_export:tools/export_model.py -c configs/det/det_r50_vd_db.yml -o
 infer_quant:False
 inference:tools/infer/predict_det.py

--- a/test_tipc/configs/det_r50_vd_east_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_vd_east_v2.0/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_east_v2.0/det_r50_vd_east.yml -o 
 quant_export:null 
 fpgm_export:null

--- a/test_tipc/configs/det_r50_vd_pse_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_vd_pse_v2.0/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml -o 
 quant_export:null 
 fpgm_export:null

--- a/test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/det_r50_vd_sast_icdar2015.yml -o 
 quant_export:null
 fpgm_export:null
@@ -43,7 +43,7 @@ inference:tools/infer/predict_det.py
 --cpu_threads:1|6
 --rec_batch_num:1
 --use_tensorrt:False
--precision:fp32|fp16|int8
+--precision:fp32|int8
 --det_model_dir:
 --image_dir:./inference/ch_det_data_50/all-sum-510/
 null:null

--- a/test_tipc/configs/det_r50_vd_sast_totaltext_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_vd_sast_totaltext_v2.0/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_sast_totaltext_v2.0/det_r50_vd_sast_totaltext.yml -o 
 quant_export:null
 fpgm_export:null
@@ -43,7 +43,7 @@ inference:tools/infer/predict_det.py
 --cpu_threads:1|6
 --rec_batch_num:1
 --use_tensorrt:False
--precision:fp32|fp16|int8
+--precision:fp32|int8
 --det_model_dir:
 --image_dir:./inference/ch_det_data_50/all-sum-510/
 null:null

--- a/test_tipc/configs/en_server_pgnetA/train_infer_python.txt
+++ b/test_tipc/configs/en_server_pgnetA/train_infer_python.txt
@@ -26,7 +26,7 @@ null:null
 ##
 ===========================infer_params===========================
 Global.save_inference_dir:./output/
-Global.pretrained_model:
+Global.checkpoints:
 norm_export:tools/export_model.py -c configs/e2e/e2e_r50_vd_pg.yml -o 
 quant_export:null
 fpgm_export:null

--- a/test_tipc/docs/test_inference_js.md
+++ b/test_tipc/docs/test_inference_js.md
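+# Basic Inference Test for the Web
+Web testing is primarily end-to-end (e2e) testing built on Jest-Puppeteer: Puppeteer drives Chrome through the inference flow, and Jest runs the test flow.
+>Puppeteer is a Node library that provides a high-level API for controlling Chromium or Chrome over the DevTools protocol.
+>Jest is a JavaScript testing framework designed to ensure the correctness of any JavaScript code.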
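+#### Environment Setup
+* Install Node, which includes npm (https://nodejs.org/zh-cn/download/)
+* Confirm that Node is installed by running the following on the command line
+```sh
+# If the installed node version is printed, the installation succeeded
+node -v
+```
+* Confirm that npm is installed
+```sh
+# npm is installed together with node, so it usually needs no separate installation
+# If the installed npm version is printed, the installation succeeded
+npm -v
+```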
+#### Usage
+```sh
+# Prepare the web test environment
+bash test_tipc/prepare_js.sh 'js_infer'
+# Run the web inference test
+bash test_tipc/test_inference_js.sh
+```
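+#### Workflow Design
+###### paddlejs prepare
+ 1. Check whether node and npm are installed.
+ 2. Download the test models. The current detection model is ch_PP-OCRv2_det_infer, and the recognition model is ch_PP-OCRv2_rec_infer with input shape [1, 3, 32, 320]. To use different models, place the model files directly in the test_tipc/web/models/ directory.
+  - Text detection model: https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar
+  - Text recognition model: https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar
+  - Text recognition model [1, 3, 32, 320]: https://paddlejs.bj.bcebos.com/models/ch_PP-OCRv2_rec_infer.tar
+  - To keep recognition reasonably accurate, the text recognition model must be exported as a static model with input shape [1, 3, 32, 320].
+ 3. Convert the models: model.pdmodel and model.pdiparams are converted to model.json and chunk.dat (the detection model is saved to test_tipc/web/models/ch_PP-OCRv2/det, and the recognition model to test_tipc/web/models/ch_PP-OCRv2/rec).
+ 4. Install the latest version of the OCR SDK, @paddlejs-models/ocr@latest.
+ 5. Install the test dependencies puppeteer, jest, and jest-puppeteer. Packages that are detected as already installed are not installed again.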
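+ ###### paddlejs infer test
+ 1. Jest runs the server command `python3 -m http.server 9811` to start a local server.
+ 2. The Jest test service starts, drives Chrome through the jest-puppeteer plugin, and loads the @paddlejs-models/ocr script to run inference.
+ 3. The test case compares the text recognized from the original image with the expected text (expect.json). Two criteria determine whether the test passes:
+    * The recognized text is compared with the expected text character by character, and the difference must not exceed **10 characters**;
+    * The text content of each recognized text box is compared with the expected result for similarity, and the similarity must be at least 0.9 (identical text gives a similarity of 1).
+    The test passes only when both criteria are met. A passing run looks like this: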
+ <img width="600" src="https://user-images.githubusercontent.com/43414102/146406599-80b30c66-f2f8-4f57-a68a-007c479ff0f7.png">
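
The two pass criteria described above can be sketched as follows. This is an illustration only: the document does not say how character differences or similarity are computed, so the position-wise difference count and the `difflib.SequenceMatcher` ratio used here are assumptions, and the function names are hypothetical. The actual test is run by Jest in JavaScript.

```python
from difflib import SequenceMatcher

MAX_CHAR_ERRORS = 10   # from the doc: at most 10 differing characters
MIN_SIMILARITY = 0.9   # from the doc: per-box similarity of at least 0.9


def char_errors(actual, expected):
    """Count position-wise differences plus the length difference (assumed metric)."""
    diff = sum(a != b for a, b in zip(actual, expected))
    return diff + abs(len(actual) - len(expected))


def similarity(actual, expected):
    """Similarity in [0, 1]; identical strings give 1.0 (assumed metric)."""
    return SequenceMatcher(None, actual, expected).ratio()


def passes(actual_boxes, expected_boxes):
    """Apply both criteria to lists of per-box recognized strings."""
    if len(actual_boxes) != len(expected_boxes):
        return False
    errors = char_errors("".join(actual_boxes), "".join(expected_boxes))
    boxes_ok = all(
        similarity(a, e) >= MIN_SIMILARITY
        for a, e in zip(actual_boxes, expected_boxes)
    )
    return errors <= MAX_CHAR_ERRORS and boxes_ok


expected = ["hello world", "paddle ocr"]
print(passes(["hello world", "paddle ocr"], expected))
print(passes(["hxxxx world", "paddle ocr"], expected))
```

The second call fails on the per-box similarity check even though only four characters differ, which shows why the two criteria are applied together.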