Commit 14866fc3 authored by DanielYang, committed by GitHub

Merge pull request #4158 from Evezerest/dygraph

Update joinus.png and documents 
parents 0da240d0 551f4b0d
# MODEL TRAINING
# Model Training
- [1. Basic concepts](#1-basic-concepts)
* [1.1 Learning rate](#11-learning-rate)
* [1.2 Regularization](#12-regularization)
* [1.3 Evaluation indicators](#13-evaluation-indicators-)
- [2. Data and vertical scenes](#2-data-and-vertical-scenes)
* [2.1 Training data](#21-training-data)
* [2.2 Vertical scene](#22-vertical-scene)
* [2.3 Build your own data set](#23-build-your-own-data-set)
* [3. FAQ](#3-faq)
- [1.Yml Configuration ](#1-Yml-Configuration)
- [2. Basic Concepts](#1-basic-concepts)
* [2.1 Learning Rate](#11-learning-rate)
* [2.2 Regularization](#12-regularization)
* [2.3 Evaluation Indicators](#13-evaluation-indicators-)
- [3. Data and Vertical Scenes](#2-data-and-vertical-scenes)
* [3.1 Training Data](#21-training-data)
* [3.2 Vertical Scene](#22-vertical-scene)
* [3.3 Build Your Own Dataset](#23-build-your-own-data-set)
* [4. FAQ](#3-faq)
This article introduces the basic concepts you need to understand for model training, along with the tuning methods used during training.
It also briefly describes the composition of the PaddleOCR training data and how to prepare data to fine-tune models for vertical scenes.
<a name="1-Yml-Configuration"></a>
## 1. Yml Configuration
PaddleOCR uses configuration files to manage the parameters for network training and evaluation. In the configuration file, you can set the model, optimizer, loss function, and the pre- and post-processing parameters. PaddleOCR reads these parameters from the configuration file and builds the complete training pipeline from them. To tune a model, you only need to modify the parameters in the configuration file, which keeps the workflow simple to use and easy to change.
For a complete description of the configuration file, see [Configuration File](./config_en.md).
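As a rough illustration of how a parameter override such as `Global.checkpoints=...` could be merged into a nested configuration, here is a minimal sketch. The `apply_overrides` helper and the sample keys and values are assumptions made for this example; they are not PaddleOCR's actual implementation.

```python
from copy import deepcopy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted-key overrides applied.

    Hypothetical helper: each override looks like "Section.key=value".
    Values are kept as strings here; a real loader would also parse types.
    """
    merged = deepcopy(config)
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        keys = dotted_key.split(".")
        node = merged
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # create missing sections on the way
        node[keys[-1]] = value
    return merged


# Sample configuration with made-up values, for illustration only.
base = {"Global": {"epoch_num": 500}, "Optimizer": {"lr": {"learning_rate": 0.001}}}
updated = apply_overrides(
    base,
    ["Global.checkpoints=./output/latest", "Optimizer.lr.learning_rate=0.0005"],
)
print(updated["Global"])
```

The original dictionary is left untouched, so the same base configuration can be reused across experiments.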
<a name="1-basic-concepts"></a>
# 1. Basic concepts
OCR (Optical Character Recognition) refers to the process of analyzing and recognizing images to obtain text and layout information. It is a typical computer vision task.
It usually consists of two subtasks: text detection and text recognition.
## 2. Basic Concepts
Pay attention to the following parameters when tuning the model:
<a name="11-learning-rate"></a>
## 1.1 Learning rate
### 2.1 Learning Rate
The learning rate is one of the most important hyperparameters for training neural networks. It sets the step size by which the parameters move along the gradient toward the optimum of the loss function in each iteration.
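To make the effect of the step size concrete, the sketch below runs plain gradient descent on a toy loss, f(w) = (w - 3)^2, with three learning rates. The loss function, the rates, and the `gradient_descent` helper are assumptions chosen for illustration and are unrelated to any PaddleOCR optimizer.

```python
def gradient_descent(learning_rate, steps=20, start=0.0):
    """Minimize the toy loss f(w) = (w - 3)^2 and return the final w."""
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)      # derivative of (w - 3)^2
        w -= learning_rate * grad   # step size is set by the learning rate
    return w


small = gradient_descent(0.01)  # too small: still far from the optimum w = 3
good = gradient_descent(0.1)    # reasonable: close to the optimum
large = gradient_descent(1.1)   # too large: each step overshoots and diverges

for name, w in [("small", small), ("good", good), ("large", large)]:
    print(f"{name}: w = {w:.3f}, distance to optimum = {abs(w - 3.0):.3f}")
```

A rate that is too small converges slowly, while one that is too large moves away from the optimum, which is why the update strategy matters.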
PaddleOCR provides a variety of learning rate update strategies, which can be selected through the configuration file, for example:
......@@ -61,7 +69,7 @@ Optimizer:
factor: 2.0e-05
```
<a name="13-evaluation-indicators-"></a>
## 1.3 Evaluation indicators
### 2.3 Evaluation Indicators
(1) Detection stage: Detection is first evaluated by the IoU between each detection box and the labeled box. If the IoU is greater than a given threshold, the detection is judged to be correct. Unlike the boxes used in general object detection, the detection boxes and labeled boxes here are represented by polygons. Detection precision: the percentage of correct detection boxes among all detection boxes, which mainly measures how many of the detections are correct. Detection recall: the percentage of correct detection boxes among all labeled boxes, which mainly measures missed detections.
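The detection metrics described above can be sketched as follows. This is a simplified illustration, not PaddleOCR's evaluation code: it assumes convex polygons (clipped with the Sutherland-Hodgman algorithm), a threshold of 0.5, and a greedy one-to-one matching, and all function names are hypothetical.

```python
def signed_area(points):
    """Shoelace formula: positive for counter-clockwise vertex order."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def cross(a, b, p):
    """Cross product of (b - a) and (p - a); >= 0 means p is left of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def clip_convex(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` against a convex `clipper`."""
    if signed_area(clipper) < 0:
        clipper = clipper[::-1]  # the inside test expects counter-clockwise order
    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        current, output = output, []
        if not current:
            break
        prev = current[-1]
        for cur in current:
            d_prev, d_cur = cross(a, b, prev), cross(a, b, cur)
            if (d_prev >= 0) != (d_cur >= 0):  # the segment crosses the clip edge
                t = d_prev / (d_prev - d_cur)
                output.append((prev[0] + t * (cur[0] - prev[0]),
                               prev[1] + t * (cur[1] - prev[1])))
            if d_cur >= 0:
                output.append(cur)
            prev = cur
    return output


def polygon_iou(poly_a, poly_b):
    """IoU of two convex polygons given as lists of (x, y) vertices."""
    inter = abs(signed_area(clip_convex(poly_a, poly_b)))
    union = abs(signed_area(poly_a)) + abs(signed_area(poly_b)) - inter
    return inter / union if union > 0 else 0.0


def evaluate(detections, labels, threshold=0.5):
    """Greedy one-to-one matching; returns (precision, recall)."""
    matched, correct = set(), 0
    for det in detections:
        for idx, label in enumerate(labels):
            if idx not in matched and polygon_iou(det, label) > threshold:
                matched.add(idx)
                correct += 1
                break
    precision = correct / len(detections) if detections else 0.0
    recall = correct / len(labels) if labels else 0.0
    return precision, recall


labels = [[(0, 0), (10, 0), (10, 10), (0, 10)], [(20, 0), (30, 0), (30, 10), (20, 10)]]
detections = [
    [(1, 0), (11, 0), (11, 10), (1, 10)],      # slightly shifted, still a match
    [(20, 0), (30, 0), (30, 10), (20, 10)],    # exact match
    [(50, 50), (60, 50), (60, 60), (50, 60)],  # spurious detection
]
precision, recall = evaluate(detections, labels)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three detections are correct and both labeled boxes are found, so the spurious box lowers precision but not recall.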
......@@ -71,11 +79,11 @@ Optimizer:
<a name="2-data-and-vertical-scenes"></a>
# 2. Data and vertical scenes
## 3. Data and Vertical Scenes
<a name="21-training-data"></a>
## 2.1 Training data
### 3.1 Training Data
The current open-source models, their datasets, and the dataset sizes are as follows:
......@@ -92,14 +100,14 @@ Among them, the public data sets are all open source, users can search and downl
<a name="22-vertical-scene"></a>
## 2.2 Vertical scene
### 3.2 Vertical Scene
PaddleOCR mainly focuses on general OCR. If you have vertical requirements, you can train your own model with PaddleOCR and vertical data.
If you lack labeled data, or do not want to invest in research and development, we recommend calling the open API directly, which covers some of the more common vertical categories.
<a name="23-build-your-own-data-set"></a>
## 2.3 Build your own data set
### 3.3 Build Your Own Dataset
Here are some lessons learned for reference when building a dataset:
......
doc/joinus.PNG (binary image changed; sizes shown: 209 KB and 191 KB)
# Table Recognition
* [1. Table recognition pipeline](#1)
* [2. Performance](#2)
* [3. Usage](#3)
+ [3.1 Quick start](#31)
+ [3.2 Training](#32)
+ [3.3 Evaluation](#33)
+ [3.4 Prediction](#34)
<a name="1"></a>
## 1. Table recognition pipeline
Table recognition mainly involves three models:
1. Single-line text detection: DB
2. Single-line text recognition: CRNN
......@@ -17,6 +27,8 @@
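3. The coordinates and recognition results of the single-line text are combined with the cell coordinates to produce the recognition result for each cell.
4. The cell recognition results and the table structure are combined to build the HTML string of the table.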
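The last two steps can be sketched roughly as follows. This is only an illustration under simplifying assumptions: boxes are axis-aligned `(x1, y1, x2, y2)` rectangles, a text line belongs to the cell that contains its center, and the table structure is given as rows of cell indices. The helper names are hypothetical and do not come from the PaddleOCR code base.

```python
from html import escape


def center(box):
    """Center point of an axis-aligned box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def contains(box, point):
    """True if `point` lies inside the axis-aligned `box`."""
    x1, y1, x2, y2 = box
    return x1 <= point[0] <= x2 and y1 <= point[1] <= y2


def fill_cells(cell_boxes, text_lines):
    """Assign each recognized text line to the first cell containing its center."""
    cell_texts = [[] for _ in cell_boxes]
    for box, text in text_lines:
        point = center(box)
        for idx, cell in enumerate(cell_boxes):
            if contains(cell, point):
                cell_texts[idx].append(text)
                break
    return [" ".join(parts) for parts in cell_texts]


def build_html(rows, cell_texts):
    """Build an HTML table string; `rows` lists the cell indices of each row."""
    parts = ["<table>"]
    for row in rows:
        cells = "".join(f"<td>{escape(cell_texts[i])}</td>" for i in row)
        parts.append(f"<tr>{cells}</tr>")
    parts.append("</table>")
    return "".join(parts)


# Made-up cell boxes and recognition results for one table row.
cell_boxes = [(0, 0, 50, 20), (50, 0, 100, 20)]
text_lines = [((5, 5, 40, 15), "Name"), ((55, 5, 90, 15), "Score")]
table_html = build_html([[0, 1]], fill_cells(cell_boxes, text_lines))
print(table_html)
```

Matching by center point is one simple choice; an overlap-based or distance-based match would fit the same structure.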
<a name="2"></a>
## 2. Performance
We evaluated the algorithm on the PubTabNet<sup>[1]</sup> evaluation dataset. The results are as follows:
......@@ -26,8 +38,9 @@
| EDD<sup>[2]</sup> | 88.3 |
| Ours | 93.32 |
<a name="3"></a>
## 3. Usage
<a name="31"></a>
### 3.1 Quick start
```python
......@@ -48,7 +61,7 @@ python3 table/predict_table.py --det_model_dir=inference/en_ppocr_mobile_v2.0_ta
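After the run completes, the Excel file for each image is saved to the directory specified by the `output` field.
Note: The model above is a table recognition model trained on the PubLayNet dataset and supports only English scanned documents. To recognize other scenes, train your own models and replace the three fields `det_model_dir`, `rec_model_dir`, and `table_model_dir`.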
<a name="32"></a>
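### 3.2 Training
This section covers only the training of the table structure model. For training the [text detection](../../doc/doc_ch/detection.md) and [text recognition](../../doc/doc_ch/recognition.md) models, refer to the corresponding documents.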
......@@ -75,7 +88,7 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo
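**Note**: `Global.checkpoints` takes priority over `Global.pretrain_weights`. When both parameters are specified, the model specified by `Global.checkpoints` is loaded first; if the model path specified by `Global.checkpoints` is invalid, the model specified by `Global.pretrain_weights` is loaded instead.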
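The priority rule can be pictured with a small sketch. The `select_weights` function is hypothetical and simplifies "the path is invalid" to "the path does not exist on disk"; the real loading logic in PaddleOCR may check paths differently.

```python
import os
import tempfile


def select_weights(checkpoints=None, pretrain_weights=None):
    """Pick which weights to load, preferring checkpoints over pretrained weights.

    Hypothetical helper: returns (source_name, path), or (None, None) if
    neither path is usable.
    """
    if checkpoints and os.path.exists(checkpoints):
        return ("checkpoints", checkpoints)
    if pretrain_weights and os.path.exists(pretrain_weights):
        return ("pretrain_weights", pretrain_weights)
    return (None, None)


with tempfile.TemporaryDirectory() as tmp:
    pretrain = os.path.join(tmp, "pretrain.pdparams")
    open(pretrain, "w").close()

    # Invalid checkpoint path: fall back to the pretrained weights.
    fallback_source, _ = select_weights(os.path.join(tmp, "missing"), pretrain)

    # Both paths valid: the checkpoint wins.
    checkpoint = os.path.join(tmp, "latest.pdparams")
    open(checkpoint, "w").close()
    preferred_source, _ = select_weights(checkpoint, pretrain)

print(fallback_source, preferred_source)
```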
<a name="33"></a>
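### 3.3 Evaluation
Table recognition uses [TEDS (Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src) as the evaluation metric. Before evaluating, export each of the three models in the pipeline as an inference model (we already provide these), and prepare the ground truth (gt) for evaluation. A gt example is shown below: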
......@@ -100,7 +113,7 @@ python3 table/eval_table.py --det_model_dir=path/to/det_model_dir --rec_model_di
```bash
teds: 93.32
```
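For orientation, TEDS normalizes the tree edit distance between the predicted and ground-truth table trees by the size of the larger tree: TEDS = 1 - EditDist / max(|T_a|, |T_b|). The sketch below shows only this normalization step. It assumes the tree edit distance and node counts have already been computed elsewhere, and the `teds_score` function and sample numbers are made up for illustration.

```python
def teds_score(edit_distance, nodes_pred, nodes_gt):
    """TEDS = 1 - EditDist / max(|T_pred|, |T_gt|), a value in [0, 1].

    Hypothetical helper: `edit_distance` is the tree edit distance between the
    two HTML table trees, and the node counts are the sizes of those trees.
    """
    largest = max(nodes_pred, nodes_gt)
    if largest == 0:
        return 1.0  # two empty trees are treated as identical
    return 1.0 - edit_distance / largest


# Made-up example: 2 edits between trees of 25 and 24 nodes.
score = teds_score(edit_distance=2, nodes_pred=25, nodes_gt=24)
print(f"teds: {score * 100:.2f}")
```

A reported value such as `teds: 93.32` is this score expressed as a percentage.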
<a name="34"></a>
### 3.4 Prediction
```python
......