Commit 4b214948 authored by zhiminzhang0830

Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into new_branch

parents 917606b4 6e607a0f
- [Document Visual Question Answering (DOC-VQA)](#文档视觉问答doc-vqa)
  - [1. Introduction](#1-简介)
  - [2. Performance](#2-性能)
  - [3. Demo](#3-效果演示)
    - [3.1 SER](#31-ser)
    - [3.2 RE](#32-re)
  - [4. Installation](#4-安装)
    - [4.1 Install dependencies](#41-安装依赖)
    - [4.2 Install PaddleOCR (including PP-OCR and VQA)](#42-安装paddleocr包含-pp-ocr-和-vqa)
  - [5. Usage](#5-使用)
    - [5.1 Prepare data and pretrained models](#51-数据和预训练模型准备)
    - [5.2 SER](#52-ser)
    - [5.3 RE](#53-re)
  - [6. References](#6-参考链接)
# Document Visual Question Answering (DOC-VQA)
## 1. Introduction
VQA (Visual Question Answering) asks and answers questions about the content of an image. DOC-VQA is one kind of VQA task, in which the questions target the textual content of document images.
The DOC-VQA algorithms in PP-Structure are developed on top of the PaddleNLP natural language processing library.
......@@ -16,23 +34,23 @@ PP-Structure 里的 DOC-VQA算法基于PaddleNLP自然语言处理算法库进
This project is an open-source implementation of [LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding](https://arxiv.org/pdf/2104.08836.pdf) on Paddle 2.2,
including fine-tuning code on the [XFUND dataset](https://github.com/doc-analysis/XFUND).
## 1 Performance
## 2. Performance
We evaluated the algorithms on the Chinese subset of the [XFUN](https://github.com/doc-analysis/XFUND) dataset; the performance is as follows:
| Model | Task | hmean | Model download |
|:---:|:---:|:---:| :---:|
| LayoutXLM | SER | 0.9038 | [link](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutXLM_xfun_zh.tar) |
| LayoutXLM | RE | 0.7483 | [link](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutXLM_xfun_zh.tar) |
| LayoutLMv2 | SER | 0.8544 | [link](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLMv2_xfun_zh.tar) |
| LayoutLMv2 | RE | 0.6777 | [link](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutLMv2_xfun_zh.tar) |
| LayoutLM | SER | 0.7731 | [link](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLM_xfun_zh.tar) |
## 2. Demo
## 3. Demo
**Note:** The test images are from the XFUN dataset.
### 2.1 SER
### 3.1 SER
![](../../doc/vqa/result_ser/zh_val_0_ser.jpg) | ![](../../doc/vqa/result_ser/zh_val_42_ser.jpg)
---|---
......@@ -45,8 +63,7 @@ PP-Structure 里的 DOC-VQA算法基于PaddleNLP自然语言处理算法库进
The corresponding category and the OCR recognition result are also shown at the top-left corner of each OCR detection box.
### 2.2 RE
### 3.2 RE
![](../../doc/vqa/result_re/zh_val_21_re.jpg) | ![](../../doc/vqa/result_re/zh_val_40_re.jpg)
---|---
......@@ -54,10 +71,9 @@ PP-Structure 里的 DOC-VQA算法基于PaddleNLP自然语言处理算法库进
In the figures, red boxes mark questions and blue boxes mark answers; a green line connects each question to its answer. The corresponding category and the OCR recognition result are also shown at the top-left corner of each OCR detection box.
## 4. Installation
## 3. Installation
### 3.1 Install dependencies
### 4.1 Install dependencies
- **(1) Install PaddlePaddle**
......@@ -73,8 +89,7 @@ python3 -m pip install "paddlepaddle>=2.2" -i https://mirror.baidu.com/pypi/simp
```
For other installation options, please follow the instructions in the [installation guide](https://www.paddlepaddle.org.cn/install/quick).
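After installation, a quick sanity check can confirm that the framework is importable and that basic operators run (a minimal sketch; `paddle.utils.run_check()` exercises the GPU only when a CUDA-enabled build is installed):

```shell
python3 -c "import paddle; print(paddle.__version__); paddle.utils.run_check()"
```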
### 3.2 Install PaddleOCR (including PP-OCR and VQA)
### 4.2 Install PaddleOCR (including PP-OCR and VQA)
- **(1) Quick install of the PaddleOCR whl package with pip (inference only)**
......@@ -99,10 +114,9 @@ git clone https://gitee.com/paddlepaddle/PaddleOCR
python3 -m pip install -r ppstructure/vqa/requirements.txt
```
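To confirm that the VQA-specific dependencies were installed (assuming `paddlenlp` is among the packages listed in `ppstructure/vqa/requirements.txt`), a simple import check is enough:

```shell
python3 -c "import paddlenlp; print(paddlenlp.__version__)"
```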
## 4. Usage
## 5. Usage
### 4.1 Prepare data and pretrained models
### 5.1 Prepare data and pretrained models
If you just want to try the prediction pipeline, you can download the pretrained models we provide, skip training, and run prediction directly.
......@@ -125,7 +139,7 @@ wget https://paddleocr.bj.bcebos.com/dataset/XFUND.tar
python3 ppstructure/vqa/helper/trans_xfun_data.py --ori_gt_path=path/to/json_path --output_path=path/to/save_path
```
### 4.2 SER Task
### 5.2 SER
Before starting training, the following four fields need to be modified:
......@@ -164,7 +178,7 @@ CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/vqa/ser/layoutxlm.yml -o
Use the following command to run the cascaded `OCR engine + SER` prediction:
```shell
CUDA_VISIBLE_DEVICES=0 python3 tools/infer_vqa_token_ser.py -c configs/vqa/ser/layoutxlm.yml -o Architecture.Backbone.checkpoints=PP-Layout_v1.0_ser_pretrained/ Global.infer_img=doc/vqa/input/zh_val_42.jpg
CUDA_VISIBLE_DEVICES=0 python3 tools/infer_vqa_token_ser.py -c configs/vqa/ser/layoutxlm.yml -o Architecture.Backbone.checkpoints=ser_LayoutXLM_xfun_zh/ Global.infer_img=doc/vqa/input/zh_val_42.jpg
```
The visualized prediction images and a prediction text file are saved in the directory configured by the `config.Global.save_res_path` field; the text file is named `infer_results.txt`.
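For example, if `Global.save_res_path` were set to a directory such as `./output/ser/` (a hypothetical value used here only for illustration), the raw predictions could be inspected with:

```shell
head -n 1 ./output/ser/infer_results.txt
```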
......@@ -178,8 +192,7 @@ export CUDA_VISIBLE_DEVICES=0
python3 helper/eval_with_label_end2end.py --gt_json_path XFUND/zh_val/xfun_normalize_val.json --pred_json_path output_res/infer_results.txt
```
### 3.3 RE Task
### 5.3 RE
* Start training
......@@ -219,13 +232,12 @@ CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/vqa/re/layoutxlm.yml -o
Use the following command to run the cascaded `OCR engine + SER + RE` prediction:
```shell
export CUDA_VISIBLE_DEVICES=0
python3 tools/infer_vqa_token_ser_re.py -c configs/vqa/re/layoutxlm.yml -o Architecture.Backbone.checkpoints=PP-Layout_v1.0_re_pretrained/ Global.infer_img=doc/vqa/input/zh_val_21.jpg -c_ser configs/vqa/ser/layoutxlm.yml -o_ser Architecture.Backbone.checkpoints=PP-Layout_v1.0_ser_pretrained/
python3 tools/infer_vqa_token_ser_re.py -c configs/vqa/re/layoutxlm.yml -o Architecture.Backbone.checkpoints=re_LayoutXLM_xfun_zh/ Global.infer_img=doc/vqa/input/zh_val_21.jpg -c_ser configs/vqa/ser/layoutxlm.yml -o_ser Architecture.Backbone.checkpoints=ser_LayoutXLM_xfun_zh/
```
The visualized prediction images and a prediction text file are saved in the directory configured by the `config.Global.save_res_path` field; the text file is named `infer_results.txt`.
## References
## 6. References
- LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding, https://arxiv.org/pdf/2104.08836.pdf
- microsoft/unilm/layoutxlm, https://github.com/microsoft/unilm/tree/master/layoutxlm
......
#!/bin/bash
source test_tipc/common_func.sh
# set env
python=python
export str_tmp=$(echo `pip list|grep paddlepaddle-gpu|awk -F ' ' '{print $2}'`)
export frame_version=${str_tmp%%.post*}
export frame_commit=$(echo `${python} -c "import paddle;print(paddle.version.commit)"`)
# run benchmark sh
# Usage:
# bash run_benchmark_train.sh config.txt params
# or
# bash run_benchmark_train.sh config.txt
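# func_parser_params: given a "key=value" string, echo the value part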
function func_parser_params(){
strs=$1
IFS="="
array=(${strs})
tmp=${array[1]}
echo ${tmp}
}
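# func_sed_params: rewrite line <line> of <filename> (expected format "key:value") in place,
# keeping the key and replacing the value with <param_value>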
function func_sed_params(){
filename=$1
line=$2
param_value=$3
params=`sed -n "${line}p" $filename`
IFS=":"
array=(${params})
key=${array[0]}
value=${array[1]}
new_params="${key}:${param_value}"
IFS=";"
cmd="sed -i '${line}s/.*/${new_params}/' '${filename}'"
eval $cmd
}
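# set_gpu_id: parse a device descriptor such as N1C4 (1 node, 4 processes) and
# echo the comma-separated GPU ids to use, e.g. set_gpu_id N1C4 -> 0,1,2,3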
function set_gpu_id(){
string=$1
_str=${string:1:6}
IFS="C"
arr=(${_str})
M=${arr[0]}
P=${arr[1]}
gn=`expr $P - 1`
gpu_num=`expr $gn / $M`
seq=`seq -s "," 0 $gpu_num`
echo $seq
}
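# get_repo_name: echo the basename of the current working directory (the repository name)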
function get_repo_name(){
IFS=";"
cur_dir=$(pwd)
IFS="/"
arr=(${cur_dir})
echo ${arr[-1]}
}
FILENAME=$1
# work on a copy of FILENAME so the original config file is left untouched
new_filename="./test_tipc/benchmark_train.txt"
cmd=`yes|cp $FILENAME $new_filename`
FILENAME=$new_filename
# MODE must be one of ['benchmark_train']
MODE=$2
PARAMS=$3
# bash test_tipc/benchmark_train.sh test_tipc/configs/det_mv3_db_v2_0/train_benchmark.txt benchmark_train dynamic_bs8_null_DP_N1C1
IFS=$'\n'
# parse params from train_benchmark.txt
dataline=`cat $FILENAME`
# parse params
IFS=$'\n'
lines=(${dataline})
model_name=$(func_parser_value "${lines[1]}")
# get the line number where the train_benchmark_params section starts
line_num=`grep -n "train_benchmark_params" $FILENAME | cut -d ":" -f 1`
# for train log parser
batch_size=$(func_parser_value "${lines[line_num]}")
line_num=`expr $line_num + 1`
fp_items=$(func_parser_value "${lines[line_num]}")
line_num=`expr $line_num + 1`
epoch=$(func_parser_value "${lines[line_num]}")
line_num=`expr $line_num + 1`
profile_option_key=$(func_parser_key "${lines[line_num]}")
profile_option_params=$(func_parser_value "${lines[line_num]}")
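# re-assemble "key:value" so the profiler line can later be written back into the config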
profile_option="${profile_option_key}:${profile_option_params}"
line_num=`expr $line_num + 1`
flags_value=$(func_parser_value "${lines[line_num]}")
# set flags
IFS=";"
flags_list=(${flags_value})
for _flag in ${flags_list[*]}; do
cmd="export ${_flag}"
eval $cmd
done
# set log_name
repo_name=$(get_repo_name )
SAVE_LOG=${BENCHMARK_LOG_DIR:-$(pwd)} # */benchmark_log
mkdir -p "${SAVE_LOG}/benchmark_log/"
status_log="${SAVE_LOG}/benchmark_log/results.log"
# Line numbers (in the copied config file) whose values are replaced below.
line_python=3
line_gpuid=4
line_precision=6
line_epoch=7
line_batchsize=9
line_profile=13
line_eval_py=24
line_export_py=30
func_sed_params "$FILENAME" "${line_eval_py}" "null"
func_sed_params "$FILENAME" "${line_export_py}" "null"
func_sed_params "$FILENAME" "${line_python}" "$python"
# choose batch size / precision / device settings
if [ -z "$PARAMS" ] ;then
# PARAMS not provided: fall back to the defaults from the config file and run DP on N1C4
IFS="|"
batch_size_list=(${batch_size})
fp_items_list=(${fp_items})
device_num_list=(N1C4)
run_mode="DP"
else
# parse params from input: modeltype_bs${bs_item}_${fp_item}_${run_mode}_${device_num}
IFS="_"
params_list=(${PARAMS})
model_type=${params_list[0]}
batch_size=${params_list[1]}
batch_size=`echo ${batch_size} | tr -cd "[0-9]" `
precision=${params_list[2]}
# run_process_type=${params_list[3]}
run_mode=${params_list[3]}
device_num=${params_list[4]}
IFS=";"
if [ ${precision} = "null" ];then
precision="fp32"
fi
fp_items_list=($precision)
batch_size_list=($batch_size)
device_num_list=($device_num)
fi
IFS="|"
for batch_size in ${batch_size_list[*]}; do
for precision in ${fp_items_list[*]}; do
for device_num in ${device_num_list[*]}; do
# sed batchsize and precision
func_sed_params "$FILENAME" "${line_precision}" "$precision"
func_sed_params "$FILENAME" "${line_batchsize}" "$MODE=$batch_size"
func_sed_params "$FILENAME" "${line_epoch}" "$MODE=$epoch"
gpu_id=$(set_gpu_id $device_num)
if [ ${#gpu_id} -le 1 ];then
run_process_type="SingleP"
log_path="$SAVE_LOG/profiling_log"
mkdir -p $log_path
log_name="${repo_name}_${model_name}_bs${batch_size}_${precision}_${run_process_type}_${run_mode}_${device_num}_profiling"
func_sed_params "$FILENAME" "${line_gpuid}" "0" # sed used gpu_id
# set profile_option params
tmp=`sed -i "${line_profile}s/.*/${profile_option}/" "${FILENAME}"`
# run test_train_inference_python.sh
cmd="bash test_tipc/test_train_inference_python.sh ${FILENAME} benchmark_train > ${log_path}/${log_name} 2>&1 "
echo $cmd
eval $cmd
eval "cat ${log_path}/${log_name}"
# without profile
log_path="$SAVE_LOG/train_log"
speed_log_path="$SAVE_LOG/index"
mkdir -p $log_path
mkdir -p $speed_log_path
log_name="${repo_name}_${model_name}_bs${batch_size}_${precision}_${run_process_type}_${run_mode}_${device_num}_log"
speed_log_name="${repo_name}_${model_name}_bs${batch_size}_${precision}_${run_process_type}_${run_mode}_${device_num}_speed"
func_sed_params "$FILENAME" "${line_profile}" "null" # sed profile_id as null
cmd="bash test_tipc/test_train_inference_python.sh ${FILENAME} benchmark_train > ${log_path}/${log_name} 2>&1 "
echo $cmd
job_bt=`date '+%Y%m%d%H%M%S'`
eval $cmd
job_et=`date '+%Y%m%d%H%M%S'`
export model_run_time=$((${job_et}-${job_bt}))
eval "cat ${log_path}/${log_name}"
# parse log
_model_name="${model_name}_bs${batch_size}_${precision}_${run_process_type}_${run_mode}"
cmd="${python} ${BENCHMARK_ROOT}/scripts/analysis.py --filename ${log_path}/${log_name} \
--speed_log_file '${speed_log_path}/${speed_log_name}' \
--model_name ${_model_name} \
--base_batch_size ${batch_size} \
--run_mode ${run_mode} \
--run_process_type ${run_process_type} \
--fp_item ${precision} \
--keyword ips: \
--skip_steps 2 \
--device_num ${device_num} \
--speed_unit samples/s \
--convergence_key loss: "
echo $cmd
eval $cmd
last_status=${PIPESTATUS[0]}
status_check $last_status "${cmd}" "${status_log}"
else
IFS=";"
unset CUDA_VISIBLE_DEVICES # clear any inherited GPU mask so the launcher assigns GPUs
run_process_type="MultiP"
log_path="$SAVE_LOG/train_log"
speed_log_path="$SAVE_LOG/index"
mkdir -p $log_path
mkdir -p $speed_log_path
log_name="${repo_name}_${model_name}_bs${batch_size}_${precision}_${run_process_type}_${run_mode}_${device_num}_log"
speed_log_name="${repo_name}_${model_name}_bs${batch_size}_${precision}_${run_process_type}_${run_mode}_${device_num}_speed"
func_sed_params "$FILENAME" "${line_gpuid}" "$gpu_id" # sed used gpu_id
func_sed_params "$FILENAME" "${line_profile}" "null" # sed --profile_option as null
cmd="bash test_tipc/test_train_inference_python.sh ${FILENAME} benchmark_train > ${log_path}/${log_name} 2>&1 "
echo $cmd
job_bt=`date '+%Y%m%d%H%M%S'`
eval $cmd
job_et=`date '+%Y%m%d%H%M%S'`
export model_run_time=$((${job_et}-${job_bt}))
eval "cat ${log_path}/${log_name}"
# parse log
_model_name="${model_name}_bs${batch_size}_${precision}_${run_process_type}_${run_mode}"
cmd="${python} ${BENCHMARK_ROOT}/scripts/analysis.py --filename ${log_path}/${log_name} \
--speed_log_file '${speed_log_path}/${speed_log_name}' \
--model_name ${_model_name} \
--base_batch_size ${batch_size} \
--run_mode ${run_mode} \
--run_process_type ${run_process_type} \
--fp_item ${precision} \
--keyword ips: \
--skip_steps 2 \
--device_num ${device_num} \
--speed_unit images/s \
--convergence_key loss: "
echo $cmd
eval $cmd
last_status=${PIPESTATUS[0]}
status_check $last_status "${cmd}" "${status_log}"
fi
done
done
done
===========================train_params===========================
model_name:det_mv3_db_v2.0
model_name:det_mv3_db_v2_0
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
......@@ -48,4 +48,10 @@ inference:tools/infer/predict_det.py
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
--benchmark:True
null:null
\ No newline at end of file
null:null
===========================train_benchmark_params==========================
batch_size:8|16
fp_items:fp32|fp16
epoch:2
--profiler_options:batch_range=[10,20];state=GPU;tracer_option=Default;profile_path=model.profile
flags:FLAGS_eager_delete_tensor_gb=0.0;FLAGS_fraction_of_gpu_memory_to_use=0.98;FLAGS_conv_workspace_size_limit=4096
\ No newline at end of file
===========================train_params===========================
model_name:det_r18_db_v2_0
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:null
Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_lite_infer=4
Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
null:null
##
trainer:norm_train
norm_train:tools/train.py -c configs/det/det_res18_db_v2.0.yml -o
quant_export:null
fpgm_export:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:null
null:null
##
===========================infer_params===========================
Global.save_inference_dir:./output/
Global.checkpoints:
norm_export:null
quant_export:null
fpgm_export:null
distill_export:null
export1:null
export2:null
##
train_model:null
infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
--enable_mkldnn:True|False
--cpu_threads:1|6
--rec_batch_num:1
--use_tensorrt:False|True
--precision:fp32|fp16|int8
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
--benchmark:True
null:null
===========================train_benchmark_params==========================
batch_size:8|16
fp_items:fp32|fp16
epoch:2
--profiler_options:batch_range=[10,20];state=GPU;tracer_option=Default;profile_path=model.profile
===========================train_params===========================
model_name:det_r50_vd_east_v2.0
model_name:det_r50_vd_east_v2_0
python:python3.7
gpu_list:0
Global.use_gpu:True|True
......@@ -13,7 +13,7 @@ train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
null:null
##
trainer:norm_train
norm_train:tools/train.py -c test_tipc/configs/det_r50_vd_east_v2.0/det_r50_vd_east.yml -o
norm_train:tools/train.py -c test_tipc/configs/det_r50_vd_east_v2_0/det_r50_vd_east.yml -o
pact_train:null
fpgm_train:null
distill_train:null
......@@ -27,7 +27,7 @@ null:null
===========================infer_params===========================
Global.save_inference_dir:./output/
Global.checkpoints:
norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_east_v2.0/det_r50_vd_east.yml -o
norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_east_v2_0/det_r50_vd_east.yml -o
quant_export:null
fpgm_export:null
distill_export:null
......@@ -35,7 +35,7 @@ export1:null
export2:null
##
train_model:./inference/det_r50_vd_east_v2.0_train/best_accuracy
infer_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_east_v2.0/det_r50_vd_east.yml -o
infer_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_east_v2_0/det_r50_vd_east.yml -o
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
......@@ -49,3 +49,8 @@ inference:tools/infer/predict_det.py
--save_log_path:null
--benchmark:True
--det_algorithm:EAST
===========================train_benchmark_params==========================
batch_size:8
fp_items:fp32|fp16
epoch:2
--profiler_options:batch_range=[10,20];state=GPU;tracer_option=Default;profile_path=model.profile
\ No newline at end of file
===========================train_params===========================
model_name:det_r50_vd_pse_v2.0
model_name:det_r50_vd_pse_v2_0
python:python3.7
gpu_list:0
Global.use_gpu:True|True
......@@ -13,7 +13,7 @@ train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
null:null
##
trainer:norm_train
norm_train:tools/train.py -c test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml -o
norm_train:tools/train.py -c test_tipc/configs/det_r50_vd_pse_v2_0/det_r50_vd_pse.yml -o
pact_train:null
fpgm_train:null
distill_train:null
......@@ -27,7 +27,7 @@ null:null
===========================infer_params===========================
Global.save_inference_dir:./output/
Global.checkpoints:
norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml -o
norm_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_pse_v2_0/det_r50_vd_pse.yml -o
quant_export:null
fpgm_export:null
distill_export:null
......@@ -35,7 +35,7 @@ export1:null
export2:null
##
train_model:./inference/det_r50_vd_pse_v2.0_train/best_accuracy
infer_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_pse_v2.0/det_r50_vd_pse.yml -o
infer_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_pse_v2_0/det_r50_vd_pse.yml -o
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
......@@ -49,3 +49,8 @@ inference:tools/infer/predict_det.py
--save_log_path:null
--benchmark:True
--det_algorithm:PSE
===========================train_benchmark_params==========================
batch_size:8
fp_items:fp32|fp16
epoch:2
--profiler_options:batch_range=[10,20];state=GPU;tracer_option=Default;profile_path=model.profile
# TIPC Benchmark Testing on Linux
This document describes the benchmark test. The main program of the benchmark prediction test is `benchmark_train.sh`, which is used to verify and monitor model training performance.
## 1. Test Workflow
### 1.1 Prepare data and environment
Run `test_tipc/prepare.sh` to prepare the training data and set up the environment.
```shell
# Usage: bash test_tipc/prepare.sh train_benchmark.txt mode
bash test_tipc/prepare.sh test_tipc/configs/det_mv3_db_v2_0/train_benchmark.txt benchmark_train
```
### 1.2 Function test
Run `test_tipc/benchmark_train.sh` to train the model and parse the logs.
```shell
# Usage: bash test_tipc/benchmark_train.sh train_benchmark.txt mode
bash test_tipc/benchmark_train.sh test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt benchmark_train
```
`test_tipc/benchmark_train.sh` can also run just one training configuration, selected by a third argument, for example:
```shell
# Usage: bash test_tipc/benchmark_train.sh train_benchmark.txt mode params
bash test_tipc/benchmark_train.sh test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt benchmark_train dynamic_bs8_fp32_DP_N1C1
```
`dynamic_bs8_fp32_DP_N1C1` is the parameter string passed to `test_tipc/benchmark_train.sh`, in the following format:
`${modeltype}_${batch_size}_${fp_item}_${run_mode}_${device_num}`
It encodes the model type, the batch size, the training precision (e.g. fp32 or fp16), the distributed run mode, and the machine setup used for distributed training, e.g. single machine with a single GPU (N1C1).
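As a rough sketch (mirroring the `IFS="_"` splitting done in `benchmark_train.sh`; the variable names below are only illustrative), the string decomposes like this:

```shell
PARAMS="dynamic_bs8_fp32_DP_N1C1"
IFS="_" read -r model_type batch_size precision run_mode device_num <<< "${PARAMS}"
batch_size=$(echo "${batch_size}" | tr -cd "0-9")   # "bs8" -> "8"
echo "${model_type} ${batch_size} ${precision} ${run_mode} ${device_num}"
# dynamic 8 fp32 DP N1C1
```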
## 2. Log Output
After the run, both the training logs and the parsed logs are saved. With the `test_tipc/configs/det_mv3_db_v2_0/train_benchmark.txt` parameter file, the parsed training log looks like this:
```
{"model_branch": "dygaph", "model_commit": "7c39a1996b19087737c05d883fd346d2f39dbcc0", "model_name": "det_mv3_db_v2_0_bs8_fp32_SingleP_DP", "batch_size": 8, "fp_item": "fp32", "run_process_type": "SingleP", "run_mode": "DP", "convergence_value": "5.413110", "convergence_key": "loss:", "ips": 19.333, "speed_unit": "samples/s", "device_num": "N1C1", "model_run_time": "0", "frame_commit": "8cc09552473b842c651ead3b9848d41827a3dbab", "frame_version": "0.0.0"}
```
The training logs and the parsed results are saved under the benchmark_log directory, organized as follows:
```
train_log/
├── index
│   ├── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_speed
│   └── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C4_speed
├── profiling_log
│   └── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_profiling
└── train_log
├── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_log
└── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C4_log
```
......@@ -20,6 +20,25 @@ model_name=$(func_parser_value "${lines[1]}")
trainer_list=$(func_parser_value "${lines[14]}")
if [ ${MODE} = "benchmark_train" ];then
pip install -r requirements.txt
if [[ ${model_name} =~ "det_mv3_db_v2_0" || ${model_name} =~ "det_r50_vd_east_v2_0" || ${model_name} =~ "det_r50_vd_pse_v2_0" || ${model_name} =~ "det_r18_db_v2_0" ]];then
rm -rf ./train_data/icdar2015
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/pretrained/MobileNetV3_large_x0_5_pretrained.pdparams --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar --no-check-certificate
cd ./train_data/ && tar xf icdar2015.tar && cd ../
fi
if [[ ${model_name} =~ "det_r50_vd_east_v2_0" || ${model_name} =~ "det_r50_vd_pse_v2_0" ]];then
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/pretrained/ResNet50_vd_ssld_pretrained.pdparams --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar --no-check-certificate
cd ./train_data/ && tar xf icdar2015.tar && cd ../
fi
if [[ ${model_name} =~ "det_r18_db_v2_0" ]];then
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/pretrained/ResNet18_vd_pretrained.pdparams --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar --no-check-certificate
cd ./train_data/ && tar xf icdar2015.tar && cd ../
fi
fi
if [ ${MODE} = "lite_train_lite_infer" ];then
# pretrain lite train data
......@@ -52,7 +71,7 @@ if [ ${MODE} = "lite_train_lite_infer" ];then
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/total_text_lite.tar --no-check-certificate
cd ./train_data && tar xf total_text_lite.tar && ln -s total_text_lite total_text && cd ../
fi
if [ ${model_name} == "det_mv3_db_v2.0" ]; then
if [ ${model_name} == "det_mv3_db_v2_0" ]; then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar --no-check-certificate
cd ./inference/ && tar xf det_mv3_db_v2.0_train.tar && cd ../
fi
......@@ -211,7 +230,7 @@ elif [ ${MODE} = "whole_infer" ];then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_totaltext_v2.0_train.tar --no-check-certificate
cd ./inference/ && tar xf det_r50_vd_sast_totaltext_v2.0_train.tar && cd ../
fi
if [ ${model_name} == "det_mv3_db_v2.0" ]; then
if [ ${model_name} == "det_mv3_db_v2_0" ]; then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar --no-check-certificate
cd ./inference/ && tar xf det_mv3_db_v2.0_train.tar && tar xf ch_det_data_50.tar && cd ../
fi
......@@ -223,7 +242,7 @@ elif [ ${MODE} = "whole_infer" ];then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar --no-check-certificate
cd ./inference/ && tar xf det_mv3_pse_v2.0_train.tar && cd ../
fi
if [ ${model_name} == "det_r50_vd_pse_v2.0" ]; then
if [ ${model_name} == "det_r50_vd_pse_v2_0" ]; then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar --no-check-certificate
cd ./inference/ && tar xf det_r50_vd_pse_v2.0_train.tar && cd ../
fi
......@@ -231,7 +250,7 @@ elif [ ${MODE} = "whole_infer" ];then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_east_v2.0_train.tar --no-check-certificate
cd ./inference/ && tar xf det_mv3_east_v2.0_train.tar && cd ../
fi
if [ ${model_name} == "det_r50_vd_east_v2.0" ]; then
if [ ${model_name} == "det_r50_vd_east_v2_0" ]; then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar --no-check-certificate
cd ./inference/ && tar xf det_r50_vd_east_v2.0_train.tar && cd ../
fi
......
......@@ -28,32 +28,32 @@
| DB |ch_ppocr_mobile_v2.0_det_PACT | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | PACT quantization | Paddle Inference: C++ <br> Paddle Serving: Python, C++ <br> Paddle-Lite: <br> (1) ARM CPU(C++) |
| DB |ch_ppocr_mobile_v2.0_det_KL | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | Offline quantization | Paddle Inference: C++ <br> Paddle Serving: Python, C++ <br> Paddle-Lite: <br> (1) ARM CPU(C++) |
| DB |ch_ppocr_server_v2.0_det | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | Paddle Inference: C++ <br> Paddle Serving: Python, C++ |
| DB |ch_PP-OCRv2_det | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | Paddle Inference: C++ <br> Paddle Serving: Python, C++ |
| CRNN |ch_ppocr_mobile_v2.0_rec | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | Paddle Inference: C++ <br> Paddle Serving: Python, C++ <br> Paddle-Lite: <br> (1) ARM CPU(C++) |
| CRNN |ch_ppocr_server_v2.0_rec | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | Paddle Inference: C++ <br> Paddle Serving: Python, C++ |
| CRNN |ch_PP-OCRv2_rec | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | Paddle Inference: C++ <br> Paddle Serving: Python, C++ |
| PP-OCR |ch_ppocr_mobile_v2.0 | Detection + Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | Paddle Inference: C++ <br> Paddle Serving: Python, C++ <br> Paddle-Lite: <br> (1) ARM CPU(C++) |
| PP-OCR |ch_ppocr_server_v2.0 | Detection + Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | Paddle Inference: C++ <br> Paddle Serving: Python, C++ |
|PP-OCRv2|ch_PP-OCRv2 | Detection + Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | Paddle Inference: C++ <br> Paddle Serving: Python, C++ |
| DB |det_mv3_db_v2.0 | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| DB |det_r50_vd_db_v2.0 | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| EAST |det_mv3_east_v2.0 | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| EAST |det_r50_vd_east_v2.0 | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| PSENet |det_mv3_pse_v2.0 | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| PSENet |det_r50_vd_pse_v2.0 | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| SAST |det_r50_vd_sast_totaltext_v2.0 | Detection | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| Rosetta|rec_mv3_none_none_ctc_v2.0 | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| Rosetta|rec_r34_vd_none_none_ctc_v2.0 | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| CRNN |rec_mv3_none_bilstm_ctc_v2.0 | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| CRNN |rec_r34_vd_none_bilstm_ctc_v2.0| Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| StarNet|rec_mv3_tps_bilstm_ctc_v2.0 | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| StarNet|rec_r34_vd_tps_bilstm_ctc_v2.0 | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| RARE |rec_mv3_tps_bilstm_att_v2.0 | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| RARE |rec_r34_vd_tps_bilstm_att_v2.0 | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| SRN |rec_r50fpn_vd_none_srn | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| NRTR |rec_mtb_nrtr | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| SAR |rec_r31_sar | Recognition | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
| PGNet |rec_r34_vd_none_none_ctc_v2.0 | End-to-end | Supported | Multi-machine multi-GPU <br> Mixed precision | - | - |
......
......@@ -3,7 +3,7 @@
The main program for the basic training and prediction function test on Linux is test_train_python.sh. It tests basic Python-based functions such as model training and evaluation, including pruning, quantization, and distillation training.
![](./tipc_train.png)
![](./test_tipc/tipc_train.png)
The test chain is shown in the figure above. The main test items are whether models with shared weights and custom OPs train normally, and whether the slim-related training workflows run correctly.
......@@ -28,23 +28,30 @@ pip3 install -r requirements.txt
- Mode 1: lite_train_lite_infer, trains with a small amount of data to quickly verify that the training-to-prediction pipeline runs end to end; accuracy and speed are not checked;
```
bash test_tipc/test_train_python.sh ./test_tipc/ch_ppocr_mobile_v2.0_det/train_infer_python.txt 'lite_train_lite_infer'
bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python.txt 'lite_train_lite_infer'
```
- Mode 2: whole_train_whole_infer, trains with the full dataset to verify that the training-to-prediction pipeline runs end to end and to check the model's final training accuracy;
```
bash test_tipc/test_train_python.sh ./test_tipc/ch_ppocr_mobile_v2.0_det/train_infer_python.txt 'whole_train_whole_infer'
bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python.txt 'whole_train_whole_infer'
```
Training modes such as quantization and pruning require different configuration files. The test command for quantization-aware training is as follows:
```
bash test_tipc/test_train_python.sh ./test_tipc/ch_ppocr_mobile_v2.0_det/train_infer_python_PACT.txt 'lite_train_lite_infer'
bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python_PACT.txt 'lite_train_lite_infer'
```
Similarly, FPGM pruning is run as follows:
```
bash test_tipc/test_train_python.sh ./test_tipc/ch_ppocr_mobile_v2.0_det/train_infer_python_FPGM.txt 'lite_train_lite_infer'
bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python_FPGM.txt 'lite_train_lite_infer'
```
The configuration files for multi-machine multi-GPU runs are `train_infer_python_fleet.txt`, `train_infer_python_FPGM_fleet.txt`, and `train_infer_python_PACT_fleet.txt`.
Before running, modify `gpu_list:xx.xx.xx.xx,yy.yy.yy.yy;0,1` in the configuration file, replacing `xx.xx.xx.xx` with the actual IP addresses, separated by `,`. Also, unlike single-machine training,
multi-machine multi-GPU training requires running the command separately on each node. Taking multi-machine multi-GPU quantization training as an example, the command is as follows:
```
bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python_PACT_fleet.txt 'lite_train_lite_infer'
```
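For illustration only (the addresses below are placeholders, not real hosts), the edited `gpu_list` line in a fleet config would then look like:

```
gpu_list:10.10.10.1,10.10.10.2;0,1
```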
After running the command, the run logs are saved automatically under the `test_tipc/output` folder. For example, after a run in 'lite_train_lite_infer' mode, the test_tipc/extra_output folder contains the following files:
......
......@@ -35,7 +35,6 @@ use_share_conv_key=$(func_parser_key "${lines[13]}")
use_share_conv_list=$(func_parser_value "${lines[13]}")
run_train_py=$(func_parser_value "${lines[14]}")
LOG_PATH="./test_tipc/extra_output"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_python.log"
......@@ -98,6 +97,8 @@ if [ ${MODE} = "lite_train_lite_infer" ] || [ ${MODE} = "whole_train_whole_infer
cmd="${python} ${run_train_py} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_checkpoints} ${set_autocast} ${set_batchsize} ${set_use_custom_op} ${set_model_type} ${set_use_share_conv} ${set_amp_config}"
elif [ ${#ips} -le 26 ];then # train with multi-gpu
cmd="${python} -m paddle.distributed.launch --gpus=${gpu} ${run_train_py} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_checkpoints} ${set_autocast} ${set_batchsize} ${set_use_custom_op} ${set_model_type} ${set_use_share_conv} ${set_amp_config}"
else
cmd="${python} -m paddle.distributed.launch --ips=${ips} --gpus=${gpu} ${run_train_py} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_checkpoints} ${set_autocast} ${set_batchsize} ${set_use_custom_op} ${set_model_type} ${set_use_share_conv} ${set_amp_config}"
fi
# run train
......
......@@ -4,9 +4,9 @@ python:python3.7
gpu_list:0|0,1
use_gpu:True|True
AMP.use_amp:True|False
epoch:lite_train_lite_infer=20|whole_train_whole_infer=1000
epoch:lite_train_lite_infer=2|whole_train_whole_infer=1000
save_model_dir:./output/
TRAIN.batch_size:lite_train_lite_infer=2|whole_train_whole_infer=4
TRAIN.batch_size:lite_train_lite_infer=1280|whole_train_whole_infer=1280
pretrained_model:null
checkpoints:null
use_custom_relu:False|True
......
===========================train_params===========================
model_name:ch_PPOCRv2_det
python:python3.7
gpu_list:xx.xx.xx.xx,yy.yy.yy.yy;0,1
use_gpu:True
AMP.use_amp:True|False
epoch:lite_train_lite_infer=2|whole_train_whole_infer=1000
save_model_dir:./output/
TRAIN.batch_size:lite_train_lite_infer=1280|whole_train_whole_infer=1280
pretrained_model:null
checkpoints:null
use_custom_relu:False|True
model_type:cls|cls_distill|cls_distill_multiopt
MODEL.siamese:False|True
norm_train:train.py -c mv3_large_x0_5.yml -o prune_train=True
quant_train:False
prune_train:False
......@@ -4,9 +4,9 @@ python:python3.7
gpu_list:0|0,1
use_gpu:True|True
AMP.use_amp:True|False
epoch:lite_train_lite_infer=20|whole_train_whole_infer=1000
epoch:lite_train_lite_infer=2|whole_train_whole_infer=1000
save_model_dir:./output/
TRAIN.batch_size:lite_train_lite_infer=2|whole_train_whole_infer=4
TRAIN.batch_size:lite_train_lite_infer=1280|whole_train_whole_infer=1280
pretrained_model:null
checkpoints:null
use_custom_relu:False|True
......
===========================train_params===========================
model_name:ch_PPOCRv2_det
python:python3.7
gpu_list:xx.xx.xx.xx,yy.yy.yy.yy;0,1
use_gpu:True
AMP.use_amp:True|False
epoch:lite_train_lite_infer=2|whole_train_whole_infer=1000
save_model_dir:./output/
TRAIN.batch_size:lite_train_lite_infer=1280|whole_train_whole_infer=1280
pretrained_model:null
checkpoints:null
use_custom_relu:False|True
model_type:cls|cls_distill|cls_distill_multiopt
MODEL.siamese:False|True
norm_train:train.py -c mv3_large_x0_5.yml -o quant_train=True
quant_train:False
prune_train:False
===========================train_params===========================
model_name:ch_PPOCRv2_det
python:python3.7
gpu_list:xx.xx.xx.xx,yy.yy.yy.yy;0,1
use_gpu:True
AMP.use_amp:True|False
epoch:lite_train_lite_infer=2|whole_train_whole_infer=1000
save_model_dir:./output/
TRAIN.batch_size:lite_train_lite_infer=1280|whole_train_whole_infer=1280
pretrained_model:null
checkpoints:null
use_custom_relu:False|True
model_type:cls|cls_distill|cls_distill_multiopt
MODEL.siamese:False|True
norm_train: train.py -c mv3_large_x0_5.yml -o
quant_train:False
prune_train:False
......@@ -284,7 +284,6 @@ else
set_amp_config=" "
fi
for trainer in ${trainer_list[*]}; do
eval ${env}
flag_quant=False
if [ ${trainer} = ${pact_key} ]; then
run_train=${pact_trainer}
......@@ -344,6 +343,7 @@ else
# run eval
if [ ${eval_py} != "null" ]; then
eval ${env}
set_eval_params1=$(func_set_params "${eval_key1}" "${eval_value1}")
eval_cmd="${python} ${eval_py} ${set_eval_pretrain} ${set_use_gpu} ${set_eval_params1}"
eval $eval_cmd
......
......@@ -28,7 +28,6 @@ from ppocr.modeling.architectures import build_model
from ppocr.postprocess import build_post_process
from ppocr.metrics import build_metric
from ppocr.utils.save_load import load_model
from ppocr.utils.utility import print_dict
import tools.program as program
......