Commit 6cb47e76 authored by LDOUBLEV

delete debug

parents d666de85 c9d7ec85
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"# 1. Course Prerequisites\n",
"\n",
"The OCR model involved in this course is based on deep learning, so its related basic knowledge, environment configuration, project engineering and other materials will be introduced in this section, especially for readers who are not familiar with deep learning. content.\n",
"\n",
"### 1.1 Preliminary Knowledge\n",
"\n",
"The \"learning\" of deep learning has been developed from the content of neurons, perceptrons, and multilayer neural networks in machine learning. Therefore, understanding the basic machine learning algorithms is of great help to the understanding and application of deep learning. The \"deepness\" of deep learning is embodied in a series of vector-based mathematical operations such as convolution and pooling used in the process of processing a large amount of information. If you lack the theoretical foundation of the two, you can learn from teacher Li Hongyi's [Linear Algebra](https://aistudio.baidu.com/aistudio/course/introduce/2063) and [Machine Learning](https://aistudio.baidu.com/aistudio/course/introduce/1978) courses.\n",
"\n",
"For the understanding of deep learning itself, you can refer to the zero-based course of Bai Ran, an outstanding architect of Baidu: [Baidu architects take you hands-on with zero-based practice deep learning](https://aistudio.baidu.com/aistudio/course/introduce/1297), which covers the development history of deep learning and introduces the complete components of deep learning through a classic case. It is a set of practice-oriented deep learning courses.\n",
"\n",
"For the practice of theoretical knowledge, [Python basic knowledge](https://aistudio.baidu.com/aistudio/course/introduce/1224) is essential. At the same time, in order to quickly reproduce the deep learning model, the deep learning framework used in this course For: Flying PaddlePaddle. If you have used other frameworks, you can quickly learn how to use flying paddles through [Quick Start Document](https://www.paddlepaddle.org.cn/documentation/docs/zh/practices/quick_start/hello_paddle.html).\n",
"\n",
"### 1.2 Basic Environment Preparation\n",
"\n",
"If you want to run the code of this course in a local environment and have not built a Python environment before, you can follow the [zero-base operating environment preparation](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/environment.md), install Anaconda or docker environment according to your operating system.\n",
"\n",
"If you don't have local resources, you can run the code through the AI Studio training platform. Each item in it is presented in a notebook, which is convenient for developers to learn. If you are not familiar with the related operations of Notebook, you can refer to [AI Studio Project Description](https://ai.baidu.com/ai-doc/AISTUDIO/0k3e2tfzm).\n",
"\n",
"### 1.3 Get and Run the Code\n",
"\n",
"This course relies on the formation of PaddleOCR's code repository. First, clone the complete project of PaddleOCR:\n",
"\n",
"```bash\n",
"# [recommend]\n",
"git clone https://github.com/PaddlePaddle/PaddleOCR\n",
"\n",
"# If you cannot pull successfully due to network problems, you can also choose to use the hosting on Code Cloud:\n",
"git clone https://gitee.com/paddlepaddle/PaddleOCR\n",
"```\n",
"\n",
"> Note: The code cloud hosted code may not be able to synchronize the update of this github project in real time, there is a delay of 3~5 days, please use the recommended method first.\n",
">\n",
"> If you are not familiar with git operations, you can download the compressed package directly from the `Code` on the homepage of PaddleOCR\n",
"\n",
"Then install third-party libraries:\n",
"\n",
"```bash\n",
"cd PaddleOCR\n",
"pip3 install -r requirements.txt\n",
"```\n",
"\n",
"\n",
"\n",
"### 1.4 Access to Information\n",
"\n",
"[PaddleOCR Usage Document](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/README.md) describes in detail how to use PaddleOCR to complete model application, training and deployment. The document is rich in content, most of the user’s questions are described in the document or FAQ, especially in [FAQ](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.3/doc/doc_en/FAQ_en.md), in accordance with the application process of deep learning, has precipitated the user's common questions, it is recommended that you read it carefully.\n",
"\n",
"### 1.5 Ask for Help\n",
"\n",
"If you encounter BUG, ease of use or documentation related issues while using PaddleOCR, you can contact the official via [Github issue](https://github.com/PaddlePaddle/PaddleOCR/issues), please follow the issue template Provide as much information as possible so that official personnel can quickly locate the problem. At the same time, the WeChat group is the daily communication position for the majority of PaddleOCR users, and it is more suitable for asking some consulting questions. In addition to the PaddleOCR team members, there will also be enthusiastic developers answering your questions."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "py35-paddle1.2.0"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
@@ -105,3 +105,22 @@ def set_seed(seed=1024):
     random.seed(seed)
     np.random.seed(seed)
     paddle.seed(seed)
+
+
+class AverageMeter:
+    def __init__(self):
+        self.reset()
+
+    def reset(self):
+        """reset"""
+        self.val = 0
+        self.avg = 0
+        self.sum = 0
+        self.count = 0
+
+    def update(self, val, n=1):
+        """update"""
+        self.val = val
+        self.sum += val * n
+        self.count += n
+        self.avg = self.sum / self.count
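The new `AverageMeter` maintains a running mean incrementally: each `update(val, n)` call adds `val` weighted by a sample count `n`, so `avg` is always the mean over all samples seen so far. A minimal usage sketch (the import path follows the `from ppocr.utils.utility import print_dict, AverageMeter` line in the diff below; the values are illustrative):

```python
from ppocr.utils.utility import AverageMeter

meter = AverageMeter()
meter.update(0.5, n=2)  # sum = 1.0, count = 2, avg = 0.50
meter.update(1.0, n=2)  # sum = 3.0, count = 4, avg = 0.75
print(meter.val, meter.avg)  # 1.0 0.75 (last value, running mean)
```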
@@ -3,7 +3,7 @@
 On Linux, the main program for the basic training and inference functional test is test_train_python.sh. It can test Python-based model training, evaluation, and other basic functions, including pruning, quantization, and distillation training.
-![](./tipc_train.png)
+![](./test_tipc/tipc_train.png)
 The test chain is shown in the figure above. It mainly checks that models with shared weights and custom OPs train normally, and that the slim-related training workflows run correctly.
@@ -28,23 +28,23 @@ pip3 install -r requirements.txt
 - Mode 1: lite_train_lite_infer, trains with a small amount of data; used to quickly verify that the training-to-inference pipeline runs end to end, without verifying accuracy or speed;
 ```
-bash test_tipc/test_train_python.sh ./test_tipc/ch_ppocr_mobile_v2.0_det/train_infer_python.txt 'lite_train_lite_infer'
+bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python.txt 'lite_train_lite_infer'
 ```
 - Mode 2: whole_train_whole_infer, trains with the full dataset; used to verify the training-to-inference pipeline and the model's final training accuracy;
 ```
-bash test_tipc/test_train_python.sh ./test_tipc/ch_ppocr_mobile_v2.0_det/train_infer_python.txt 'whole_train_whole_infer'
+bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python.txt 'whole_train_whole_infer'
 ```
 To run training modes such as quantization or pruning, different configuration files are needed. The test command for quantization training is:
 ```
-bash test_tipc/test_train_python.sh ./test_tipc/ch_ppocr_mobile_v2.0_det/train_infer_python_PACT.txt 'lite_train_lite_infer'
+bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python_PACT.txt 'lite_train_lite_infer'
 ```
 Similarly, FPGM pruning is run as follows:
 ```
-bash test_tipc/test_train_python.sh ./test_tipc/ch_ppocr_mobile_v2.0_det/train_infer_python_FPGM.txt 'lite_train_lite_infer'
+bash test_tipc/test_train_python.sh ./test_tipc/train_infer_python_FPGM.txt 'lite_train_lite_infer'
 ```
 After running the corresponding command, the run logs are saved automatically under the `test_tipc/output` folder. For example, after running in 'lite_train_lite_infer' mode, the test_tipc/extra_output folder contains the following files:
......
@@ -21,7 +21,7 @@ import sys
 import platform
 import yaml
 import time
-import shutil
+import datetime
 import paddle
 import paddle.distributed as dist
 from tqdm import tqdm
@@ -29,11 +29,10 @@ from argparse import ArgumentParser, RawDescriptionHelpFormatter
 from ppocr.utils.stats import TrainingStats
 from ppocr.utils.save_load import save_model
-from ppocr.utils.utility import print_dict
+from ppocr.utils.utility import print_dict, AverageMeter
 from ppocr.utils.logging import get_logger
 from ppocr.utils import profiler
 from ppocr.data import build_dataloader
-import numpy as np


 class ArgsParser(ArgumentParser):
@@ -48,7 +47,8 @@ class ArgsParser(ArgumentParser):
             '--profiler_options',
             type=str,
             default=None,
-            help='The option of profiler, which should be in format \"key1=value1;key2=value2;key3=value3\".'
+            help='The option of profiler, which should be in format ' \
+                 '\"key1=value1;key2=value2;key3=value3\".'
         )

     def parse_args(self, argv=None):
@@ -99,7 +99,8 @@ def merge_config(config, opts):
             sub_keys = key.split('.')
             assert (
                 sub_keys[0] in config
-            ), "the sub_keys can only be one of global_config: {}, but get: {}, please check your running command".format(
+            ), "the sub_keys can only be one of global_config: {}, but get: " \
+               "{}, please check your running command".format(
                 config.keys(), sub_keys[0])
             cur = config[sub_keys[0]]
             for idx, sub_key in enumerate(sub_keys[1:]):
@@ -160,11 +161,13 @@ def train(config,
         eval_batch_step = eval_batch_step[1]
         if len(valid_dataloader) == 0:
             logger.info(
-                'No Images in eval dataset, evaluation during training will be disabled'
+                'No Images in eval dataset, evaluation during training ' \
+                'will be disabled'
             )
             start_eval_step = 1e111
         logger.info(
-            "During the training process, after the {}th iteration, an evaluation is run every {} iterations".
+            "During the training process, after the {}th iteration, " \
+            "an evaluation is run every {} iterations".
             format(start_eval_step, eval_batch_step))
     save_epoch_step = config['Global']['save_epoch_step']
     save_model_dir = config['Global']['save_model_dir']
@@ -189,10 +192,11 @@ def train(config,
     start_epoch = best_model_dict[
         'start_epoch'] if 'start_epoch' in best_model_dict else 1
-    train_reader_cost = 0.0
-    train_run_cost = 0.0
     total_samples = 0
+    train_reader_cost = 0.0
+    train_batch_cost = 0.0
     reader_start = time.time()
+    eta_meter = AverageMeter()

     max_iter = len(train_dataloader) - 1 if platform.system(
     ) == "Windows" else len(train_dataloader)
@@ -203,7 +207,6 @@ def train(config,
             config, 'Train', device, logger, seed=epoch)
         max_iter = len(train_dataloader) - 1 if platform.system(
         ) == "Windows" else len(train_dataloader)
-
         for idx, batch in enumerate(train_dataloader):
             profiler.add_profiler_step(profiler_options)
             train_reader_cost += time.time() - reader_start
@@ -214,7 +217,6 @@ def train(config,
             if use_srn:
                 model_average = True
-            train_start = time.time()
             # use amp
             if scaler:
                 with paddle.amp.auto_cast():
@@ -242,7 +244,9 @@ def train(config,
             optimizer.step()
             optimizer.clear_grad()
-            train_run_cost += time.time() - train_start
+            train_batch_time = time.time() - reader_start
+            train_batch_cost += train_batch_time
+            eta_meter.update(train_batch_time)
             global_step += 1
             total_samples += len(images)
@@ -273,19 +277,27 @@ def train(config,
                 (global_step > 0 and global_step % print_batch_step == 0) or
                 (idx >= len(train_dataloader) - 1)):
                 logs = train_stats.log()
-                strs = 'epoch: [{}/{}], global_step: {}, {}, avg_reader_cost: {:.5f} s, avg_batch_cost: {:.5f} s, avg_samples: {}, samples/s: {:.5f}'.format(
-                    epoch, epoch_num, global_step, logs, train_reader_cost /
-                    print_batch_step, (train_reader_cost + train_run_cost) /
-                    print_batch_step, total_samples / print_batch_step,
-                    total_samples / (train_reader_cost + train_run_cost))
+
+                eta_sec = ((epoch_num + 1 - epoch) * \
+                    len(train_dataloader) - idx - 1) * eta_meter.avg
+                eta_sec_format = str(datetime.timedelta(seconds=int(eta_sec)))
+                strs = 'epoch: [{}/{}], global_step: {}, {}, avg_reader_cost: ' \
+                       '{:.5f} s, avg_batch_cost: {:.5f} s, avg_samples: {}, ' \
+                       'samples/s: {:.5f}, eta: {}'.format(
+                    epoch, epoch_num, global_step, logs,
+                    train_reader_cost / print_batch_step,
+                    train_batch_cost / print_batch_step,
+                    total_samples / print_batch_step,
+                    total_samples / train_batch_cost, eta_sec_format)
                 logger.info(strs)
-                train_reader_cost = 0.0
-                train_run_cost = 0.0
                 total_samples = 0
+                train_reader_cost = 0.0
+                train_batch_cost = 0.0
             # eval
             if global_step > start_eval_step and \
-                (global_step - start_eval_step) % eval_batch_step == 0 and dist.get_rank() == 0:
+                    (global_step - start_eval_step) % eval_batch_step == 0 \
+                    and dist.get_rank() == 0:
                 if model_average:
                     Model_Average = paddle.incubate.optimizer.ModelAverage(
                         0.15,
......
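In the logging hunk above, the new ETA estimate multiplies the number of remaining batches (the rest of the current epoch plus all batches of the remaining epochs, with `epoch` counted from 1) by the running average batch time tracked by `eta_meter`, and formats the result with `datetime.timedelta`. A standalone sketch of that arithmetic, with illustrative values standing in for the loop variables:

```python
import datetime

epoch, epoch_num = 3, 10   # current epoch (1-indexed) and total epochs
idx = 49                   # index of the current batch within the epoch
batches_per_epoch = 200    # stands in for len(train_dataloader)
avg_batch_time = 0.25      # stands in for eta_meter.avg, seconds per batch

# Batches left in this epoch plus all batches of the remaining epochs.
remaining = (epoch_num + 1 - epoch) * batches_per_epoch - idx - 1
eta_sec = remaining * avg_batch_time
print(str(datetime.timedelta(seconds=int(eta_sec))))  # 0:06:27
```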