Commit c6a27e0b authored by panhb's avatar panhb
Browse files

first init

parent e4b993b1
Pipeline #2192 canceled with stages
简体中文 | [English](DetAnnoTools_en.md)
# 目标检测标注工具
## 目录
[LabelMe](#LabelMe)
* [使用说明](#使用说明)
* [安装](#LabelMe安装)
* [图片标注过程](#LabelMe图片标注过程)
* [标注格式](#LabelMe标注格式)
* [导出数据格式](#LabelMe导出数据格式)
* [格式转化总结](#格式转化总结)
* [标注文件(json)-->VOC](#标注文件(json)-->VOC数据集)
* [标注文件(json)-->COCO](#标注文件(json)-->COCO数据集)
[LabelImg](#LabelImg)
* [使用说明](#使用说明)
* [LabelImg安装](#LabelImg安装)
* [安装注意事项](#安装注意事项)
* [图片标注过程](#LabelImg图片标注过程)
* [标注格式](#LabelImg标注格式)
* [导出数据格式](#LabelImg导出数据格式)
* [格式转换注意事项](#格式转换注意事项)
## [LabelMe](https://github.com/wkentaro/labelme)
### 使用说明
#### LabelMe安装
具体安装操作请参考[LabelMe官方教程](https://github.com/wkentaro/labelme)中的Installation
<details>
<summary><b> Ubuntu</b></summary>
```
sudo apt-get install labelme
# or
sudo pip3 install labelme
# or install standalone executable from:
# https://github.com/wkentaro/labelme/releases
```
</details>
<details>
<summary><b> macOS</b></summary>
```
brew install pyqt # maybe pyqt5
pip install labelme
# or
brew install wkentaro/labelme/labelme # command line interface
# brew install --cask wkentaro/labelme/labelme # app
# or install standalone executable/app from:
# https://github.com/wkentaro/labelme/releases
```
</details>
推荐使用Anaconda的安装方式
```
conda create –name=labelme python=3
conda activate labelme
pip install pyqt5
pip install labelme
```
#### LabelMe图片标注过程
启动labelme后,选择图片文件或者图片所在文件夹
左侧编辑栏选择`create polygons` 绘制标注区域如下图所示(右击图像区域可以选择不同的标注形状),绘制好区域后按下回车,弹出新的框填入标注区域对应的标签,如:people
左侧菜单栏点击保存,生成`json`形式的**标注文件**
![](https://media3.giphy.com/media/XdnHZgge5eynRK3ATK/giphy.gif?cid=790b7611192e4c0ec2b5e6990b6b0f65623154ffda66b122&rid=giphy.gif&ct=g)
### LabelMe标注格式
#### LabelMe导出数据格式
```
#生成标注文件
png/jpeg/jpg-->labelme标注-->json
```
#### 格式转化总结
```
#标注文件转化为VOC数据集格式
json-->labelme2voc.py-->VOC数据集
#标注文件转化为COCO数据集格式
json-->labelme2coco.py-->COCO数据集
```
#### 标注文件(json)-->VOC数据集
使用[官方给出的labelme2voc.py](https://github.com/wkentaro/labelme/blob/main/examples/bbox_detection/labelme2voc.py)这份脚本
下载该脚本,在命令行中使用
```Te
python labelme2voc.py data_annotated(标注文件所在文件夹) data_dataset_voc(输出文件夹) --labels labels.txt
```
运行后,在指定的输出文件夹中会如下的目录
```
# It generates:
# - data_dataset_voc/JPEGImages
# - data_dataset_voc/Annotations
# - data_dataset_voc/AnnotationsVisualization
```
#### 标注文件(json)-->COCO数据集
使用[PaddleDetection提供的x2coco.py](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/tools/x2coco.py) 将labelme标注的数据转换为COCO数据集形式
```bash
python tools/x2coco.py \
--dataset_type labelme \
--json_input_dir ./labelme_annos/ \
--image_input_dir ./labelme_imgs/ \
--output_dir ./cocome/ \
--train_proportion 0.8 \
--val_proportion 0.2 \
--test_proportion 0.0
```
用户数据集转成COCO数据后目录结构如下(注意数据集中路径名、文件名尽量不要使用中文,避免中文编码问题导致出错):
```
dataset/xxx/
├── annotations
│ ├── train.json # coco数据的标注文件
│ ├── valid.json # coco数据的标注文件
├── images
│ ├── xxx1.jpg
│ ├── xxx2.jpg
│ ├── xxx3.jpg
│ | ...
...
```
## [LabelImg](https://github.com/tzutalin/labelImg)
### 使用说明
#### LabelImg安装
安装操作请参考[LabelImg官方教程](https://github.com/tzutalin/labelImg)
<details>
<summary><b> Ubuntu</b></summary>
```
sudo apt-get install pyqt5-dev-tools
sudo pip3 install -r requirements/requirements-linux-python3.txt
make qt5py3
python3 labelImg.py
python3 labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
```
</details>
<details>
<summary><b>macOS</b></summary>
```
brew install qt # Install qt-5.x.x by Homebrew
brew install libxml2
or using pip
pip3 install pyqt5 lxml # Install qt and lxml by pip
make qt5py3
python3 labelImg.py
python3 labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
```
</details>
推荐使用Anaconda的安装方式
首先下载并进入 [labelImg](https://github.com/tzutalin/labelImg#labelimg) 的目录
```
conda install pyqt=5
conda install -c anaconda lxml
pyrcc5 -o libs/resources.py resources.qrc
python labelImg.py
python labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
```
#### 安装注意事项
以Anaconda安装方式为例,比Labelme配置要麻烦一些
启动方式是通过python运行脚本`python labelImg.py <图片路径>`
#### LabelImg图片标注过程
启动labelImg后,选择图片文件或者图片所在文件夹
左侧编辑栏选择`创建区块` 绘制标注区,在弹出新的框选择对应的标签
左侧菜单栏点击保存,可以选择VOC/YOLO/CreateML三种类型的标注文件
![](https://user-images.githubusercontent.com/34162360/177526022-fd9c63d8-e476-4b63-ae02-76d032bb7656.gif)
### LabelImg标注格式
#### LabelImg导出数据格式
```
#生成标注文件
png/jpeg/jpg-->labelImg标注-->xml/txt/json
```
#### 格式转换注意事项
**PaddleDetection支持VOC或COCO格式的数据**,经LabelImg标注导出后的标注文件,需要修改为**VOC或COCO格式**,调整说明可以参考[准备训练数据](./PrepareDataSet.md#%E5%87%86%E5%A4%87%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE)
[简体中文](DetAnnoTools.md) | English
# Object Detection Annotation Tools
## Concents
[LabelMe](#LabelMe)
* [Instruction](#Instruction-of-LabelMe)
* [Installation](#Installation)
* [Annotation of Images](#Annotation-of-images-in-LabelMe)
* [Annotation Format](#Annotation-Format-of-LabelMe)
* [Export Format](#Export-Format-of-LabelMe)
* [Summary of Format Conversion](#Summary-of-Format-Conversion)
* [Annotation file(json)—>VOC Dataset](#annotation-filejsonvoc-dataset)
* [Annotation file(json)—>COCO Dataset](#annotation-filejsoncoco-dataset)
[LabelImg](#LabelImg)
* [Instruction](#Instruction-of-LabelImg)
* [Installation](#Installation-of-LabelImg)
* [Installation Notes](#Installation-Notes)
* [Annotation of images](#Annotation-of-images-in-LabelImg)
* [Annotation Format](#Annotation-Format-of-LabelImg)
* [Export Format](#Export-Format-of-LabelImg)
* [Notes of Format Conversion](#Notes-of-Format-Conversion)
## [LabelMe](https://github.com/wkentaro/labelme)
### Instruction of LabelMe
#### Installation
Please refer to [The github of LabelMe](https://github.com/wkentaro/labelme) for installation details.
<details>
<summary><b> Ubuntu</b></summary>
```
sudo apt-get install labelme
# or
sudo pip3 install labelme
# or install standalone executable from:
# https://github.com/wkentaro/labelme/releases
```
</details>
<details>
<summary><b> macOS</b></summary>
```
brew install pyqt # maybe pyqt5
pip install labelme
# or
brew install wkentaro/labelme/labelme # command line interface
# brew install --cask wkentaro/labelme/labelme # app
# or install standalone executable/app from:
# https://github.com/wkentaro/labelme/releases
```
</details>
We recommend installing by Anoncanda.
```
conda create –name=labelme python=3
conda activate labelme
pip install pyqt5
pip install labelme
```
#### Annotation of Images in LabelMe
After starting labelme, select an image or an folder with images.
Select `create polygons` in the formula bar. Draw an annotation area as shown in the following GIF. You can right-click on the image to select different shape. When finished, press the Enter/Return key, then fill the corresponding label in the popup box, such as, people.
Click the save button in the formula bar,it will generate an annotation file in json.
![](https://media3.giphy.com/media/XdnHZgge5eynRK3ATK/giphy.gif?cid=790b7611192e4c0ec2b5e6990b6b0f65623154ffda66b122&rid=giphy.gif&ct=g)
### Annotation Format of LabelMe
#### Export Format of LabelMe
```
#generate an annotation file
png/jpeg/jpg-->labelme-->json
```
#### Summary of Format Conversion
```
#convert annotation file to VOC dataset format
json-->labelme2voc.py-->VOC dataset
#convert annotation file to COCO dataset format
json-->labelme2coco.py-->COCO dataset
```
#### Annotation file(json)—>VOC Dataset
Use this script [labelme2voc.py](https://github.com/wkentaro/labelme/blob/main/examples/bbox_detection/labelme2voc.py) in command line.
```Te
python labelme2voc.py data_annotated(annotation folder) data_dataset_voc(output folder) --labels labels.txt
```
Then, it will generate following contents:
```
# It generates:
# - data_dataset_voc/JPEGImages
# - data_dataset_voc/Annotations
# - data_dataset_voc/AnnotationsVisualization
```
#### Annotation file(json)—>COCO Dataset
Convert the data annotated by LabelMe to COCO dataset by the script [x2coco.py](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/tools/x2coco.py) provided by PaddleDetection.
```bash
python tools/x2coco.py \
--dataset_type labelme \
--json_input_dir ./labelme_annos/ \
--image_input_dir ./labelme_imgs/ \
--output_dir ./cocome/ \
--train_proportion 0.8 \
--val_proportion 0.2 \
--test_proportion 0.0
```
After the user dataset is converted to COCO data, the directory structure is as follows (Try to avoid use Chinese for the path name in case of errors caused by Chinese coding problems):
```
dataset/xxx/
├── annotations
│ ├── train.json # Annotation file of coco data
│ ├── valid.json # Annotation file of coco data
├── images
│ ├── xxx1.jpg
│ ├── xxx2.jpg
│ ├── xxx3.jpg
│ | ...
...
```
## [LabelImg](https://github.com/tzutalin/labelImg)
### Instruction
#### Installation of LabelImg
Please refer to [The github of LabelImg](https://github.com/tzutalin/labelImg) for installation details.
<details>
<summary><b> Ubuntu</b></summary>
```
sudo apt-get install pyqt5-dev-tools
sudo pip3 install -r requirements/requirements-linux-python3.txt
make qt5py3
python3 labelImg.py
python3 labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
```
</details>
<details>
<summary><b>macOS</b></summary>
```
brew install qt # Install qt-5.x.x by Homebrew
brew install libxml2
or using pip
pip3 install pyqt5 lxml # Install qt and lxml by pip
make qt5py3
python3 labelImg.py
python3 labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
```
</details>
We recommend installing by Anoncanda.
Download and go to the folder of [labelImg](https://github.com/tzutalin/labelImg#labelimg)
```
conda install pyqt=5
conda install -c anaconda lxml
pyrcc5 -o libs/resources.py resources.qrc
python labelImg.py
python labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
```
#### Installation Notes
Use python scripts to startup LabelImg: `python labelImg.py <IMAGE_PATH>`
#### Annotation of images in LabelImg
After the startup of LabelImg, select an image or a folder with images.
Select `Create RectBox` in the formula bar. Draw an annotation area as shown in the following GIF. When finished, select corresponding label in the popup box. Then save the annotated file in three forms: VOC/YOLO/CreateML.
![](https://user-images.githubusercontent.com/34162360/177526022-fd9c63d8-e476-4b63-ae02-76d032bb7656.gif)
### Annotation Format of LabelImg
#### Export Format of LabelImg
```
#generate annotation files
png/jpeg/jpg-->labelImg-->xml/txt/json
```
#### Notes of Format Conversion
**PaddleDetection supports the format of VOC or COCO.** The annotation file generated by LabelImg needs to be converted by VOC or COCO. You can refer to [PrepareDataSet](./PrepareDataSet.md#%E5%87%86%E5%A4%87%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE).
# 目标检测数据准备
## 目录
- [目标检测数据说明](#目标检测数据说明)
- [准备训练数据](#准备训练数据)
- [VOC数据数据](#VOC数据数据)
- [VOC数据集下载](#VOC数据集下载)
- [VOC数据标注文件介绍](#VOC数据标注文件介绍)
- [COCO数据数据](#COCO数据数据)
- [COCO数据集下载](#COCO数据下载)
- [COCO数据标注文件介绍](#COCO数据标注文件介绍)
- [用户数据准备](#用户数据准备)
- [用户数据转成VOC数据](#用户数据转成VOC数据)
- [用户数据转成COCO数据](#用户数据转成COCO数据)
- [用户数据自定义reader](#用户数据自定义reader)
- [用户数据使用示例](#用户数据使用示例)
- [数据格式转换](#数据格式转换)
- [自定义数据训练](#自定义数据训练)
- [(可选)生成Anchor](#(可选)生成Anchor)
### 目标检测数据说明
目标检测的数据比分类复杂,一张图像中,需要标记出各个目标区域的位置和类别。
一般的目标区域位置用一个矩形框来表示,一般用以下3种方式表达:
| 表达方式 | 说明 |
| :----------------: | :--------------------------------: |
| x1,y1,x2,y2 | (x1,y1)为左上角坐标,(x2,y2)为右下角坐标 |
| x1,y1,w,h | (x1,y1)为左上角坐标,w为目标区域宽度,h为目标区域高度 |
| xc,yc,w,h | (xc,yc)为目标区域中心坐标,w为目标区域宽度,h为目标区域高度 |
常见的目标检测数据集如Pascal VOC采用的`[x1,y1,x2,y2]` 表示物体的bounding box, [COCO](https://cocodataset.org/#format-data)采用的`[x1,y1,w,h]` 表示物体的bounding box.
### 准备训练数据
PaddleDetection默认支持[COCO](http://cocodataset.org)[Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/)[WIDER-FACE](http://shuoyang1213.me/WIDERFACE/) 数据源。
同时还支持自定义数据源,包括:
(1) 自定义数据数据转换成VOC数据;
(2) 自定义数据数据转换成COCO数据;
(3) 自定义新的数据源,增加自定义的reader。
首先进入到`PaddleDetection`根目录下
```
cd PaddleDetection/
ppdet_root=$(pwd)
```
#### VOC数据数据
VOC数据是[Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) 比赛使用的数据。Pascal VOC比赛不仅包含图像分类分类任务,还包含图像目标检测、图像分割等任务,其标注文件中包含多个任务的标注内容。
VOC数据集指的是Pascal VOC比赛使用的数据。用户自定义的VOC数据,xml文件中的非必须字段,请根据实际情况选择是否标注或是否使用默认值。
##### VOC数据集下载
- 通过代码自动化下载VOC数据集,数据集较大,下载需要较长时间
```
# 执行代码自动化下载VOC数据集
python dataset/voc/download_voc.py
```
代码执行完成后VOC数据集文件组织结构为:
```
>>cd dataset/voc/
>>tree
├── create_list.py
├── download_voc.py
├── generic_det_label_list.txt
├── generic_det_label_list_zh.txt
├── label_list.txt
├── VOCdevkit/VOC2007
│ ├── annotations
│ ├── 001789.xml
│ | ...
│ ├── JPEGImages
│ ├── 001789.jpg
│ | ...
│ ├── ImageSets
│ | ...
├── VOCdevkit/VOC2012
│ ├── Annotations
│ ├── 2011_003876.xml
│ | ...
│ ├── JPEGImages
│ ├── 2011_003876.jpg
│ | ...
│ ├── ImageSets
│ | ...
| ...
```
各个文件说明
```
# label_list.txt 是类别名称列表,文件名必须是 label_list.txt。若使用VOC数据集,config文件中use_default_label为true时不需要这个文件
>>cat label_list.txt
aeroplane
bicycle
...
# trainval.txt 是训练数据集文件列表
>>cat trainval.txt
VOCdevkit/VOC2007/JPEGImages/007276.jpg VOCdevkit/VOC2007/Annotations/007276.xml
VOCdevkit/VOC2012/JPEGImages/2011_002612.jpg VOCdevkit/VOC2012/Annotations/2011_002612.xml
...
# test.txt 是测试数据集文件列表
>>cat test.txt
VOCdevkit/VOC2007/JPEGImages/000001.jpg VOCdevkit/VOC2007/Annotations/000001.xml
...
# label_list.txt voc 类别名称列表
>>cat label_list.txt
aeroplane
bicycle
...
```
- 已下载VOC数据集
按照如上数据文件组织结构组织文件即可。
##### VOC数据标注文件介绍
VOC数据是每个图像文件对应一个同名的xml文件,xml文件中标记物体框的坐标和类别等信息。例如图像`2007_002055.jpg`
![](../images/2007_002055.jpg)
图片对应的xml文件内包含对应图片的基本信息,比如文件名、来源、图像尺寸以及图像中包含的物体区域信息和类别信息等。
xml文件中包含以下字段:
- filename,表示图像名称。
- size,表示图像尺寸。包括:图像宽度、图像高度、图像深度。
```
<size>
<width>500</width>
<height>375</height>
<depth>3</depth>
</size>
```
- object字段,表示每个物体。包括:
| 标签 | 说明 |
| :--------: | :-----------: |
| name | 物体类别名称 |
| pose | 关于目标物体姿态描述(非必须字段) |
| truncated | 如果物体的遮挡超过15-20%并且位于边界框之外,请标记为`truncated`(非必须字段) |
| difficult | 难以识别的物体标记为`difficult`(非必须字段) |
| bndbox子标签 | (xmin,ymin) 左上角坐标,(xmax,ymax) 右下角坐标, |
#### COCO数据
COCO数据是[COCO](http://cocodataset.org) 比赛使用的数据。同样的,COCO比赛数也包含多个比赛任务,其标注文件中包含多个任务的标注内容。
COCO数据集指的是COCO比赛使用的数据。用户自定义的COCO数据,json文件中的一些字段,请根据实际情况选择是否标注或是否使用默认值。
##### COCO数据下载
- 通过代码自动化下载COCO数据集,数据集较大,下载需要较长时间
```
# 执行代码自动化下载COCO数据集
python dataset/coco/download_coco.py
```
代码执行完成后COCO数据集文件组织结构为:
```
>>cd dataset/coco/
>>tree
├── annotations
│ ├── instances_train2017.json
│ ├── instances_val2017.json
│ | ...
├── train2017
│ ├── 000000000009.jpg
│ ├── 000000580008.jpg
│ | ...
├── val2017
│ ├── 000000000139.jpg
│ ├── 000000000285.jpg
│ | ...
| ...
```
- 已下载COCO数据集
按照如上数据文件组织结构组织文件即可。
##### COCO数据标注介绍
COCO数据标注是将所有训练图像的标注都存放到一个json文件中。数据以字典嵌套的形式存放。
json文件中包含以下key:
- info,表示标注文件info。
- licenses,表示标注文件licenses。
- images,表示标注文件中图像信息列表,每个元素是一张图像的信息。如下为其中一张图像的信息:
```
{
'license': 3, # license
'file_name': '000000391895.jpg', # file_name
# coco_url
'coco_url': 'http://images.cocodataset.org/train2017/000000391895.jpg',
'height': 360, # image height
'width': 640, # image width
'date_captured': '2013-11-14 11:18:45', # date_captured
# flickr_url
'flickr_url': 'http://farm9.staticflickr.com/8186/8119368305_4e622c8349_z.jpg',
'id': 391895 # image id
}
```
- annotations,表示标注文件中目标物体的标注信息列表,每个元素是一个目标物体的标注信息。如下为其中一个目标物体的标注信息:
```
{
'segmentation': # 物体的分割标注
'area': 2765.1486500000005, # 物体的区域面积
'iscrowd': 0, # iscrowd
'image_id': 558840, # image id
'bbox': [199.84, 200.46, 77.71, 70.88], # bbox [x1,y1,w,h]
'category_id': 58, # category_id
'id': 156 # image id
}
```
```
# 查看COCO标注文件
import json
coco_anno = json.load(open('./annotations/instances_train2017.json'))
# coco_anno.keys
print('\nkeys:', coco_anno.keys())
# 查看类别信息
print('\n物体类别:', coco_anno['categories'])
# 查看一共多少张图
print('\n图像数量:', len(coco_anno['images']))
# 查看一共多少个目标物体
print('\n标注物体数量:', len(coco_anno['annotations']))
# 查看一条目标物体标注信息
print('\n查看一条目标物体标注信息:', coco_anno['annotations'][0])
```
#### 用户数据准备
对于用户数据有3种处理方法:
(1) 将用户数据转成VOC数据(根据需要仅包含物体检测所必须的标签即可)
(2) 将用户数据转成COCO数据(根据需要仅包含物体检测所必须的标签即可)
(3) 自定义一个用户数据的reader(较复杂数据,需要自定义reader)
##### 用户数据转成VOC数据
用户数据集转成VOC数据后目录结构如下(注意数据集中路径名、文件名尽量不要使用中文,避免中文编码问题导致出错):
```
dataset/xxx/
├── annotations
│ ├── xxx1.xml
│ ├── xxx2.xml
│ ├── xxx3.xml
│ | ...
├── images
│ ├── xxx1.jpg
│ ├── xxx2.jpg
│ ├── xxx3.jpg
│ | ...
├── label_list.txt (必须提供,且文件名称必须是label_list.txt )
├── train.txt (训练数据集文件列表, ./images/xxx1.jpg ./annotations/xxx1.xml)
└── valid.txt (测试数据集文件列表)
```
各个文件说明
```
# label_list.txt 是类别名称列表,改文件名必须是这个
>>cat label_list.txt
classname1
classname2
...
# train.txt 是训练数据文件列表
>>cat train.txt
./images/xxx1.jpg ./annotations/xxx1.xml
./images/xxx2.jpg ./annotations/xxx2.xml
...
# valid.txt 是验证数据文件列表
>>cat valid.txt
./images/xxx3.jpg ./annotations/xxx3.xml
...
```
##### 用户数据转成COCO数据
`./tools/`中提供了`x2coco.py`用于将VOC数据集、labelme标注的数据集或cityscape数据集转换为COCO数据,例如:
(1)labelme数据转换为COCO数据:
```bash
python tools/x2coco.py \
--dataset_type labelme \
--json_input_dir ./labelme_annos/ \
--image_input_dir ./labelme_imgs/ \
--output_dir ./cocome/ \
--train_proportion 0.8 \
--val_proportion 0.2 \
--test_proportion 0.0
```
(2)voc数据转换为COCO数据:
```bash
python tools/x2coco.py \
--dataset_type voc \
--voc_anno_dir path/to/VOCdevkit/VOC2007/Annotations/ \
--voc_anno_list path/to/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt \
--voc_label_list dataset/voc/label_list.txt \
--voc_out_name voc_train.json
```
用户数据集转成COCO数据后目录结构如下(注意数据集中路径名、文件名尽量不要使用中文,避免中文编码问题导致出错):
```
dataset/xxx/
├── annotations
│ ├── train.json # coco数据的标注文件
│ ├── valid.json # coco数据的标注文件
├── images
│ ├── xxx1.jpg
│ ├── xxx2.jpg
│ ├── xxx3.jpg
│ | ...
...
```
##### 用户数据自定义reader
如果数据集有新的数据需要添加进PaddleDetection中,您可参考数据处理文档中的[添加新数据源](../advanced_tutorials/READER.md#2.3自定义数据集)文档部分,开发相应代码完成新的数据源支持,同时数据处理具体代码解析等可阅读[数据处理文档](../advanced_tutorials/READER.md)
#### 用户数据使用示例
[Kaggle数据集](https://www.kaggle.com/andrewmvd/road-sign-detection) 比赛数据为例,说明如何准备自定义数据。
Kaggle上的 [road-sign-detection](https://www.kaggle.com/andrewmvd/road-sign-detection) 比赛数据包含877张图像,数据类别4类:crosswalk,speedlimit,stop,trafficlight。
可从Kaggle上下载,也可以从[下载链接](https://paddlemodels.bj.bcebos.com/object_detection/roadsign_voc.tar) 下载。
路标数据集示例图:
![](../images/road554.png)
```
# 下载解压数据
>>cd $(ppdet_root)/dataset
# 下载kaggle数据集并解压,当前文件组织结构如下
├── annotations
│ ├── road0.xml
│ ├── road1.xml
│ ├── road10.xml
│ | ...
├── images
│ ├── road0.jpg
│ ├── road1.jpg
│ ├── road2.jpg
│ | ...
```
#### 数据格式转换
将数据划分为训练集和测试集
```
# 生成 label_list.txt 文件
>>echo -e "speedlimit\ncrosswalk\ntrafficlight\nstop" > label_list.txt
# 生成 train.txt、valid.txt和test.txt列表文件
>>ls images/*.png | shuf > all_image_list.txt
>>awk -F"/" '{print $2}' all_image_list.txt | awk -F".png" '{print $1}' | awk -F"\t" '{print "images/"$1".png annotations/"$1".xml"}' > all_list.txt
# 训练集、验证集、测试集比例分别约80%、10%、10%。
>>head -n 88 all_list.txt > test.txt
>>head -n 176 all_list.txt | tail -n 88 > valid.txt
>>tail -n 701 all_list.txt > train.txt
# 删除不用文件
>>rm -rf all_image_list.txt all_list.txt
最终数据集文件组织结构为:
├── annotations
│ ├── road0.xml
│ ├── road1.xml
│ ├── road10.xml
│ | ...
├── images
│ ├── road0.jpg
│ ├── road1.jpg
│ ├── road2.jpg
│ | ...
├── label_list.txt
├── test.txt
├── train.txt
└── valid.txt
# label_list.txt 是类别名称列表,文件名必须是 label_list.txt
>>cat label_list.txt
crosswalk
speedlimit
stop
trafficlight
# train.txt 是训练数据集文件列表,每一行是一张图像路径和对应标注文件路径,以空格分开。注意这里的路径是数据集文件夹内的相对路径。
>>cat train.txt
./images/road839.png ./annotations/road839.xml
./images/road363.png ./annotations/road363.xml
...
# valid.txt 是验证数据集文件列表,每一行是一张图像路径和对应标注文件路径,以空格分开。注意这里的路径是数据集文件夹内的相对路径。
>>cat valid.txt
./images/road218.png ./annotations/road218.xml
./images/road681.png ./annotations/road681.xml
```
也可以下载准备好的数据[下载链接](https://paddlemodels.bj.bcebos.com/object_detection/roadsign_voc.tar) ,解压到`dataset/roadsign_voc/`文件夹下即可。
准备好数据后,一般的我们要对数据有所了解,比如图像量,图像尺寸,每一类目标区域个数,目标区域大小等。如有必要,还要对数据进行清洗。
roadsign数据集统计:
| 数据 | 图片数量 |
| :--------: | :-----------: |
| train | 701 |
| valid | 176 |
**说明:**
(1)用户数据,建议在训练前仔细检查数据,避免因数据标注格式错误或图像数据不完整造成训练过程中的crash
(2)如果图像尺寸太大的话,在不限制读入数据尺寸情况下,占用内存较多,会造成内存/显存溢出,请合理设置batch_size,可从小到大尝试
#### 自定义数据训练
数据准备完成后,需要修改PaddleDetection中关于Dataset的配置文件,在`configs/datasets`文件夹下。比如roadsign数据集的配置文件如下:
```
metric: VOC # 目前支持COCO, VOC, WiderFace等评估标准
num_classes: 4 # 数据集的类别数,不包含背景类,roadsign数据集为4类,其他数据需要修改为自己的数据类别
TrainDataset:
!VOCDataSet
dataset_dir: dataset/roadsign_voc # 训练集的图片所在文件相对于dataset_dir的路径
anno_path: train.txt # 训练集的标注文件相对于dataset_dir的路径
label_list: label_list.txt # 数据集所在路径,相对于PaddleDetection路径
data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult'] # 控制dataset输出的sample所包含的字段,注意此为训练集Reader独有的且必须配置的字段
EvalDataset:
!VOCDataSet
dataset_dir: dataset/roadsign_voc # 数据集所在路径,相对于PaddleDetection路径
anno_path: valid.txt # 验证集的标注文件相对于dataset_dir的路径
label_list: label_list.txt # 标签文件,相对于dataset_dir的路径
data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
TestDataset:
!ImageFolder
anno_path: label_list.txt # 标注文件所在路径,仅用于读取数据集的类别信息,支持json和txt格式
dataset_dir: dataset/roadsign_voc # 数据集所在路径,若添加了此行,则`anno_path`路径为相对于`dataset_dir`路径,若此行不设置或去掉此行,则为相对于PaddleDetection路径
```
然后在对应模型配置文件中将自定义数据文件路径替换为新路径,以`configs/yolov3/yolov3_mobilenet_v1_roadsign.yml`为例
```
_BASE_: [
'../datasets/roadsign_voc.yml', # 指定为自定义数据集配置路径
'../runtime.yml',
'_base_/optimizer_40e.yml',
'_base_/yolov3_mobilenet_v1.yml',
'_base_/yolov3_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams
weights: output/yolov3_mobilenet_v1_roadsign/model_final
YOLOv3Loss:
ignore_thresh: 0.7
label_smooth: true
```
在PaddleDetection的yml配置文件中,使用`!`直接序列化模块实例(可以是函数,实例等),上述的配置文件均使用Dataset进行了序列化。
配置修改完成后,即可以启动训练评估,命令如下
```
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/yolov3/yolov3_mobilenet_v1_roadsign.yml --eval
```
更详细的命令参考[30分钟快速上手PaddleDetection](../GETTING_STARTED_cn.md)
**注意:**
请运行前自行仔细检查数据集的配置路径,在训练或验证时如果TrainDataset和EvalDataset的路径配置有误,会提示自动下载数据集。若使用自定义数据集,在推理时如果TestDataset路径配置有误,会提示使用默认COCO数据集的类别信息。
### (可选)生成Anchor
在yolo系列模型中,大多数情况下使用默认的anchor设置即可, 你也可以运行`tools/anchor_cluster.py`来得到适用于你的数据集Anchor,使用方法如下:
``` bash
python tools/anchor_cluster.py -c configs/ppyolo/ppyolo.yml -n 9 -s 608 -m v2 -i 1000
```
目前`tools/anchor_cluster.py`支持的主要参数配置如下表所示:
| 参数 | 用途 | 默认值 | 备注 |
|:------:|:------:|:------:|:------:|
| -c/--config | 模型的配置文件 | 无默认值 | 必须指定 |
| -n/--n | 聚类的簇数 | 9 | Anchor的数目 |
| -s/--size | 图片的输入尺寸 | None | 若指定,则使用指定的尺寸,如果不指定, 则尝试从配置文件中读取图片尺寸 |
| -m/--method | 使用的Anchor聚类方法 | v2 | 目前只支持yolov2的聚类算法 |
| -i/--iters | kmeans聚类算法的迭代次数 | 1000 | kmeans算法收敛或者达到迭代次数后终止 |
# How to Prepare Training Data
## Directory
- [How to Prepare Training Data](#how-to-prepare-training-data)
- [Directory](#directory)
- [Description of Object Detection Data](#description-of-object-detection-data)
- [Prepare Training Data](#prepare-training-data)
- [VOC Data](#voc-data)
- [VOC Dataset Download](#voc-dataset-download)
- [Introduction to VOC Data Annotation File](#introduction-to-voc-data-annotation-file)
- [COCO Data](#coco-data)
- [COCO Data Download](#coco-data-download)
- [Description of COCO Data Annotation](#description-of-coco-data-annotation)
- [User Data](#user-data)
- [Convert User Data to VOC Data](#convert-user-data-to-voc-data)
- [Convert User Data to COCO Data](#convert-user-data-to-coco-data)
- [Reader of User Define Data](#reader-of-user-define-data)
- [Example of User Data Conversion](#example-of-user-data-conversion)
### Description of Object Detection Data
The data of object detection is more complex than classification. In an image, it is necessary to mark the position and category of each object.
The general object position is represented by a rectangular box, which is generally expressed in the following three ways
| Expression | Explanation |
| :---------: | :----------------------------------------------------------------------------: |
| x1,y1,x2,y2 | (x1,y1)is the top left coordinate, (x2,y2)is the bottom right coordonate |
| x1,y1,w,h | (x1,y1)is the top left coordinate, w is width of object, h is height of object |
| xc,yc,w,h | (xc,yc)is center of object, w is width of object, h is height of object |
Common object detection datasets such as Pascal VOC, adopting `[x1,y1,x2,y2]` to express the bounding box of object. COCO uses `[x1,y1,w,h]`, [format](https://cocodataset.org/#format-data).
### Prepare Training Data
PaddleDetection is supported [COCO](http://cocodataset.org) and [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) and [WIDER-FACE](http://shuoyang1213.me/WIDERFACE/) datasets by default.
It also supports custom data sources including:
(1) Convert custom data to VOC format;
(2) Convert custom data to COOC format;
(3) Customize a new data source, and add custom reader;
firstly, enter `PaddleDetection` root directory
```
cd PaddleDetection/
ppdet_root=$(pwd)
```
#### VOC Data
VOC data is used in [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) competition. Pascal VOC competition not only contains image classification task, but also contains object detection and object segmentation et al., the annotation file contains the ground truth of multiple tasks.
VOC dataset denotes the data of PAscal VOC competition. when customizeing VOC data, For non mandatory fields in the XML file, please select whether to label or use the default value according to the actual situation.
##### VOC Dataset Download
- Download VOC datasets through code automation. The datasets are large and take a long time to download
```
# Execute code to automatically download VOC dataset
python dataset/voc/download_voc.py
```
After code execution, the VOC dataset file organization structure is:
```
>>cd dataset/voc/
>>tree
├── create_list.py
├── download_voc.py
├── generic_det_label_list.txt
├── generic_det_label_list_zh.txt
├── label_list.txt
├── VOCdevkit/VOC2007
│ ├── annotations
│ ├── 001789.xml
│ | ...
│ ├── JPEGImages
│ ├── 001789.jpg
│ | ...
│ ├── ImageSets
│ | ...
├── VOCdevkit/VOC2012
│ ├── Annotations
│ ├── 2011_003876.xml
│ | ...
│ ├── JPEGImages
│ ├── 2011_003876.jpg
│ | ...
│ ├── ImageSets
│ | ...
| ...
```
Description of each document
```
# label_list.txt is list of classes name,filename must be label_list.txt. If using VOC dataset, when `use_default_label=true` in config file, this file is not required.
>>cat label_list.txt
aeroplane
bicycle
...
# trainval.txt is file list of trainset
>>cat trainval.txt
VOCdevkit/VOC2007/JPEGImages/007276.jpg VOCdevkit/VOC2007/Annotations/007276.xml
VOCdevkit/VOC2012/JPEGImages/2011_002612.jpg VOCdevkit/VOC2012/Annotations/2011_002612.xml
...
# test.txt is file list of testset
>>cat test.txt
VOCdevkit/VOC2007/JPEGImages/000001.jpg VOCdevkit/VOC2007/Annotations/000001.xml
...
# label_list.txt voc list of classes name
>>cat label_list.txt
aeroplane
bicycle
...
```
- If the VOC dataset has been downloaded
You can organize files according to the above data file organization structure.
##### Introduction to VOC Data Annotation File
In VOC dataset, Each image file corresponds to an XML file with the same name, the coordinates and categories of the marked object frame in the XML file, such as `2007_002055.jpg`:
![](../images/2007_002055.jpg)
The XML file corresponding to the image contains the basic information of the corresponding image, such as file name, source, image size, object area information and category information contained in the image.
The XML file contains the following fields:
- filename, indicating the image name.
- size, indicating the image size, including: image width, image height and image depth
```
<size>
<width>500</width>
<height>375</height>
<depth>3</depth>
</size>
```
- object field, indict each object, including:
| Label | Explanation |
| :--------------: | :------------------------------------------------------------------------------------------------------------------------: |
| name | name of object class |
| pose | attitude description of the target object (non required field) |
| truncated | If the occlusion of the object exceeds 15-20% and is outside the bounding box,mark it as `truncated` (non required field) |
| difficult | objects that are difficult to recognize are marked as`difficult` (non required field) |
| bndbox son laebl | (xmin,ymin) top left coordinate, (xmax,ymax) bottom right coordinate |
#### COCO Data
COOC data is used in [COCO](http://cocodataset.org) competition. alike, Coco competition also contains multiple competition tasks, and its annotation file contains the annotation contents of multiple tasks.
The coco dataset refers to the data used in the coco competition. Customizing coco data, some fields in JSON file, please select whether to label or use the default value according to the actual situation.
##### COCO Data Download
- The coco dataset is downloaded automatically through the code. The dataset is large and takes a long time to download
```
# automatically download coco datasets by executing code
python dataset/coco/download_coco.py
```
after code execution, the organization structure of coco dataset file is:
```
>>cd dataset/coco/
>>tree
├── annotations
│ ├── instances_train2017.json
│ ├── instances_val2017.json
│ | ...
├── train2017
│ ├── 000000000009.jpg
│ ├── 000000580008.jpg
│ | ...
├── val2017
│ ├── 000000000139.jpg
│ ├── 000000000285.jpg
│ | ...
| ...
```
- If the coco dataset has been downloaded
The files can be organized according to the above data file organization structure.
##### Description of COCO Data Annotation
Coco data annotation is to store the annotations of all training images in a JSON file. Data is stored in the form of nested dictionaries.
The JSON file contains the following keys:
- info,indicating the annotation file info。
- licenses, indicating the label file licenses。
- images, indicating the list of image information in the annotation file, and each element is the information of an image. The following is the information of one of the images:
```
{
'license': 3, # license
'file_name': '000000391895.jpg', # file_name
# coco_url
'coco_url': 'http://images.cocodataset.org/train2017/000000391895.jpg',
'height': 360, # image height
'width': 640, # image width
'date_captured': '2013-11-14 11:18:45', # date_captured
# flickr_url
'flickr_url': 'http://farm9.staticflickr.com/8186/8119368305_4e622c8349_z.jpg',
'id': 391895 # image id
}
```
- annotations: indicating the annotation information list of the target object in the annotation file. Each element is the annotation information of a target object. The following is the annotation information of one of the target objects:
```
{
'segmentation': # object segmentation annotation
'area': 2765.1486500000005, # object area
'iscrowd': 0, # iscrowd
'image_id': 558840, # image id
'bbox': [199.84, 200.46, 77.71, 70.88], # bbox [x1,y1,w,h]
'category_id': 58, # category_id
'id': 156 # image id
}
```
```
# Viewing coco annotation files
import json
coco_anno = json.load(open('./annotations/instances_train2017.json'))
# coco_anno.keys
print('\nkeys:', coco_anno.keys())
# Viewing categories information
print('\ncategories:', coco_anno['categories'])
# Viewing the number of images
print('\nthe number of images:', len(coco_anno['images']))
# Viewing the number of obejcts
print('\nthe number of annotation:', len(coco_anno['annotations']))
# View object annotation information
print('\nobject annotation information: ', coco_anno['annotations'][0])
```
Coco data is prepared as follows.
`dataset/coco/`Initial document organization
```
>>cd dataset/coco/
>>tree
├── download_coco.py
```
#### User Data
There are three processing methods for user data:
(1) Convert user data into VOC data (only include labels necessary for object detection as required)
(2) Convert user data into coco data (only include labels necessary for object detection as required)
(3) Customize a reader for user data (for complex data, you need to customize the reader)
##### Convert User Data to VOC Data
After the user dataset is converted to VOC data, the directory structure is as follows (note that the path name and file name in the dataset should not use Chinese as far as possible to avoid errors caused by Chinese coding problems):
```
dataset/xxx/
├── annotations
│ ├── xxx1.xml
│ ├── xxx2.xml
│ ├── xxx3.xml
│ | ...
├── images
│ ├── xxx1.jpg
│ ├── xxx2.jpg
│ ├── xxx3.jpg
│ | ...
├── label_list.txt (Must be provided and the file name must be label_list.txt )
├── train.txt (list of trainset ./images/xxx1.jpg ./annotations/xxx1.xml)
└── valid.txt (list of valid file)
```
Description of each document
```
# label_list.txt is a list of category names. The file name must be this
>>cat label_list.txt
classname1
classname2
...
# train.txt is list of trainset
>>cat train.txt
./images/xxx1.jpg ./annotations/xxx1.xml
./images/xxx2.jpg ./annotations/xxx2.xml
...
# valid.txt is list of validset
>>cat valid.txt
./images/xxx3.jpg ./annotations/xxx3.xml
...
```
##### Convert User Data to COCO Data
`x2coco.py` is provided in `./tools/` to convert VOC dataset, labelme labeled dataset or cityscape dataset into coco data, for example:
(1)Conversion of labelme data to coco data:
```bash
python tools/x2coco.py \
--dataset_type labelme \
--json_input_dir ./labelme_annos/ \
--image_input_dir ./labelme_imgs/ \
--output_dir ./cocome/ \
--train_proportion 0.8 \
--val_proportion 0.2 \
--test_proportion 0.0
```
(2)Convert VOC data to coco data:
```bash
python tools/x2coco.py \
--dataset_type voc \
--voc_anno_dir path/to/VOCdevkit/VOC2007/Annotations/ \
--voc_anno_list path/to/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt \
--voc_label_list dataset/voc/label_list.txt \
--voc_out_name voc_train.json
```
After the user dataset is converted to coco data, the directory structure is as follows (note that the path name and file name in the dataset should not use Chinese as far as possible to avoid errors caused by Chinese coding problems):
```
dataset/xxx/
├── annotations
│ ├── train.json # Annotation file of coco data
│ ├── valid.json # Annotation file of coco data
├── images
│ ├── xxx1.jpg
│ ├── xxx2.jpg
│ ├── xxx3.jpg
│ | ...
...
```
##### Reader of User Define Data
If new data in the dataset needs to be added to paddedetection, you can refer to the [add new data source] (../advanced_tutorials/READER.md#2.3_Customizing_Dataset) document section in the data processing document to develop corresponding code to complete the new data source support. At the same time, you can read the [data processing document] (../advanced_tutorials/READER.md) for specific code analysis of data processing
The configuration file for the Dataset exists in the `configs/datasets` folder. For example, the COCO dataset configuration file is as follows:
```
metric: COCO # Currently supports COCO, VOC, OID, Wider Face and other evaluation standards
num_classes: 80 # num_classes: The number of classes in the dataset, excluding background classes
TrainDataset:
!COCODataSet
image_dir: train2017 # The path where the training set image resides relative to the dataset_dir
anno_path: annotations/instances_train2017.json # Path to the annotation file of the training set relative to the dataset_dir
dataset_dir: dataset/coco #The path where the dataset is located relative to the PaddleDetection path
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd'] # Controls the fields contained in the sample output of the dataset, note data_fields are unique to the trainreader and must be configured
EvalDataset:
!COCODataSet
image_dir: val2017 # The path where the images of the validation set reside relative to the dataset_dir
anno_path: annotations/instances_val2017.json # The path to the annotation file of the validation set relative to the dataset_dir
dataset_dir: dataset/coco # The path where the dataset is located relative to the PaddleDetection path
TestDataset:
!ImageFolder
anno_path: dataset/coco/annotations/instances_val2017.json # The path of the annotation file, it is only used to read the category information of the dataset. JSON and TXT formats are supported
dataset_dir: dataset/coco # The path of the dataset, note if this row is added, `anno_path` will be 'dataset_dir/anno_path`, if not set or removed, `anno_path` is `anno_path`
```
In the YML profile for Paddle Detection, use `!`directly serializes module instances (functions, instances, etc.). The above configuration files are serialized using Dataset.
**Note:**
Please carefully check the configuration path of the dataset before running. During training or verification, if the path of TrainDataset or EvalDataset is wrong, it will download the dataset automatically. When using a user-defined dataset, if the TestDataset path is incorrectly configured during inference, the category of the default COCO dataset will be used.
#### Example of User Data Conversion
Take [Kaggle Dataset](https://www.kaggle.com/andrewmvd/road-sign-detection) competition data as an example to illustrate how to prepare custom data. The dataset of Kaggle [road-sign-detection](https://www.kaggle.com/andrewmvd/road-sign-detection) competition contains 877 images, four categories:crosswalk,speedlimit,stop,trafficlight. Available for download from kaggle, also available from [link](https://paddlemodels.bj.bcebos.com/object_detection/roadsign_voc.tar).
Example diagram of road sign dataset:
![](../images/road554.png)
```
# Downing and unziping data
>>cd $(ppdet_root)/dataset
# Download and unzip the kaggle dataset. The current file organization is as follows
├── annotations
│ ├── road0.xml
│ ├── road1.xml
│ ├── road10.xml
│ | ...
├── images
│ ├── road0.jpg
│ ├── road1.jpg
│ ├── road2.jpg
│ | ...
```
The data is divided into training set and test set
```
# Generating label_list.txt
>>echo -e "speedlimit\ncrosswalk\ntrafficlight\nstop" > label_list.txt
# Generating train.txt, valid.txt and test.txt
>>ls images/*.png | shuf > all_image_list.txt
>>awk -F"/" '{print $2}' all_image_list.txt | awk -F".png" '{print $1}' | awk -F"\t" '{print "images/"$1".png annotations/"$1".xml"}' > all_list.txt
# The proportion of training set, verification set and test set is about 80%, 10% and 10% respectively.
>>head -n 88 all_list.txt > test.txt
>>head -n 176 all_list.txt | tail -n 88 > valid.txt
>>tail -n 701 all_list.txt > train.txt
# Deleting unused files
>>rm -rf all_image_list.txt all_list.txt
The organization structure of the final dataset file is:
├── annotations
│ ├── road0.xml
│ ├── road1.xml
│ ├── road10.xml
│ | ...
├── images
│ ├── road0.jpg
│ ├── road1.jpg
│ ├── road2.jpg
│ | ...
├── label_list.txt
├── test.txt
├── train.txt
└── valid.txt
# label_list.txt is list of file name, file name must be label_list.txt
>>cat label_list.txt
crosswalk
speedlimit
stop
trafficlight
# train.txt is the list of training dataset files, and each line is an image path and the corresponding annotation file path, separated by spaces. Note that the path here is a relative path within the dataset folder.
>>cat train.txt
./images/road839.png ./annotations/road839.xml
./images/road363.png ./annotations/road363.xml
...
# valid.txt is the list of validation dataset files. Each line is an image path and the corresponding annotation file path, separated by spaces. Note that the path here is a relative path within the dataset folder.
>>cat valid.txt
./images/road218.png ./annotations/road218.xml
./images/road681.png ./annotations/road681.xml
```
You can also download [the prepared data](https://paddlemodels.bj.bcebos.com/object_detection/roadsign_voc.tar), unzip to `dataset/roadsign_voc/`
After preparing the data, we should generally understand the data, such as image quantity, image size, number of target areas of each type, target area size, etc. If necessary, clean the data.
Roadsign dataset statistics:
| data | number of images |
| :---: | :--------------: |
| train | 701 |
| valid | 176 |
**Explanation:**
(1) For user data, it is recommended to carefully check the data before training to avoid crash during training due to wrong data annotation format or incomplete image data
(2) If the image size is too large, it will occupy more memory without limiting the read data size, which will cause memory / video memory overflow. Please set batch reasonably_ Size, you can try from small to large
# 数据准备
数据对于深度学习开发起到了至关重要的作用,数据采集和标注的质量是提升业务模型效果的重要因素。本文档主要介绍PaddleDetection中如何进行数据准备,包括采集高质量数据方法,覆盖多场景类型,提升模型泛化能力;以及各类任务数据标注工具和方法,并在PaddleDetection下使用
## 数据采集
在深度学习任务的实际落地中,数据采集往往决定了最终模型的效果,对于数据采集的几点建议如下:
### 确定方向
任务类型、数据的类别和目标场景这些因素决定了要收集什么数据,首先需要根据这些因素来确定整体数据收集的工作方向。
### 开源数据集
在实际场景中数据采集成本其实十分高昂,完全靠自己收集在时间和金钱上都有很高的成本,开源数据集是帮助增加训练数据量的重要手段,所以很多时候会考虑加入一些相似任务的开源数据。在使用中请遵守各个开源数据集的license规定的使用条件。
### 增加场景数据
开源数据一般不会覆盖实际使用的的目标场景,用户需要评估开源数据集中已包含的场景和目标场景间的差异,有针对性地补充目标场景数据,尽量让训练和部署数据的场景一致。
### 类别均衡
在采集阶段,也需要尽量保持类别均衡,帮助模型正确学习到目标特征。
## 数据标注及格式说明
| 任务类型 | 数据标注 | 数据格式说明 |
|:--------:| :--------:|:--------:|
| 目标检测 | [文档链接](DetAnnoTools.md) | [文档链接](PrepareDetDataSet.md) |
| 关键点检测 | [文档链接](KeyPointAnnoTools.md) | [文档链接](PrepareKeypointDataSet.md) |
| 多目标跟踪 | [文档链接](MOTAnnoTools.md) | [文档链接](PrepareMOTDataSet.md) |
# Logging
This document talks about how to track metrics and visualize model performance during training. The library currently supports [VisualDL](https://www.paddlepaddle.org.cn/documentation/docs/en/guides/03_VisualDL/visualdl_usage_en.html) and [Weights & Biases](https://docs.wandb.ai).
## VisualDL
Logging to VisualDL is supported only in python >= 3.5. To install VisualDL
```
pip install visualdl
```
PaddleDetection uses a callback to log the training metrics at the end of every step and metrics from the validation step at the end of every epoch. To use VisualDL for visualization, add the `--use_vdl` flag to the training command and `--vdl_log_dir <logs>` to set the directory which stores the records.
For example
```
python tools/train -c config.yml --use_vdl --vdl_log_dir ./logs
```
Another possible way to do this is to add the aforementioned flags to the `config.yml` file.
## Weights & Biases
W&B is a MLOps tool that can be used for experiment tracking, dataset/model versioning, visualizing results and collaborating with colleagues. A W&B logger is integrated directly into PaddleDetection and to use it, first you need to install the wandb sdk and login to your wandb account.
```
pip install wandb
wandb login
```
To use wandb to log metrics while training add the `--use_wandb` flag to the training command and any other arguments for the W&B logger can be provided like this -
```
python tools/train -c config.yml --use_wandb -o wandb-project=MyDetector wandb-entity=MyTeam wandb-save_dir=./logs
```
The arguments to the W&B logger must be proceeded by `-o` and each invidiual argument must contain the prefix "wandb-".
If this is too tedious, an alternative way is to add the arguments to the `config.yml` file under the `wandb` header. For example
```
use_wandb: True
wandb:
project: MyProject
entity: MyTeam
save_dir: ./logs
```
mode: paddle
draw_threshold: 0.5
metric: COCO
use_dynamic_shape: false
arch: YOLOv8
min_subgraph_size: 10
Preprocess:
- interp: 1
keep_ratio: true
target_size:
- 640
- 640
type: Resize
- fill_value:
- 114.0
- 114.0
- 114.0
size:
- 640
- 640
type: Pad
- mean:
- 0.0
- 0.0
- 0.0
norm_type: none
std:
- 1.0
- 1.0
- 1.0
type: NormalizeImage
- type: Permute
label_list:
- person
- bicycle
- car
- motorcycle
- airplane
- bus
- train
- truck
- boat
- traffic light
- fire hydrant
- stop sign
- parking meter
- bench
- bird
- cat
- dog
- horse
- sheep
- cow
- elephant
- bear
- zebra
- giraffe
- backpack
- umbrella
- handbag
- tie
- suitcase
- frisbee
- skis
- snowboard
- sports ball
- kite
- baseball bat
- baseball glove
- skateboard
- surfboard
- tennis racket
- bottle
- wine glass
- cup
- fork
- knife
- spoon
- bowl
- banana
- apple
- sandwich
- orange
- broccoli
- carrot
- hot dog
- pizza
- donut
- cake
- chair
- couch
- potted plant
- bed
- dining table
- toilet
- tv
- laptop
- mouse
- remote
- keyboard
- cell phone
- microwave
- oven
- toaster
- sink
- refrigerator
- book
- clock
- vase
- scissors
- teddy bear
- hair drier
- toothbrush
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import (core, data, engine, modeling, model_zoo, optimizer, metrics,
utils, slim)
try:
from .version import full_version as __version__
from .version import commit as __git_commit__
except ImportError:
import sys
sys.stderr.write("Warning: import ppdet from source directory " \
"without installing, run 'python setup.py install' to " \
"install ppdet firstly\n")
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import config
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import print_function
from __future__ import division
import inspect
import importlib
import re
try:
from docstring_parser import parse as doc_parse
except Exception:
def doc_parse(*args):
pass
try:
from typeguard import check_type
except Exception:
def check_type(*args):
pass
__all__ = ['SchemaValue', 'SchemaDict', 'SharedConfig', 'extract_schema']
class SchemaValue(object):
def __init__(self, name, doc='', type=None):
super(SchemaValue, self).__init__()
self.name = name
self.doc = doc
self.type = type
def set_default(self, value):
self.default = value
def has_default(self):
return hasattr(self, 'default')
class SchemaDict(dict):
def __init__(self, **kwargs):
super(SchemaDict, self).__init__()
self.schema = {}
self.strict = False
self.doc = ""
self.update(kwargs)
def __setitem__(self, key, value):
# XXX also update regular dict to SchemaDict??
if isinstance(value, dict) and key in self and isinstance(self[key],
SchemaDict):
self[key].update(value)
else:
super(SchemaDict, self).__setitem__(key, value)
def __missing__(self, key):
if self.has_default(key):
return self.schema[key].default
elif key in self.schema:
return self.schema[key]
else:
raise KeyError(key)
def copy(self):
newone = SchemaDict()
newone.__dict__.update(self.__dict__)
newone.update(self)
return newone
def set_schema(self, key, value):
assert isinstance(value, SchemaValue)
self.schema[key] = value
def set_strict(self, strict):
self.strict = strict
def has_default(self, key):
return key in self.schema and self.schema[key].has_default()
def is_default(self, key):
if not self.has_default(key):
return False
if hasattr(self[key], '__dict__'):
return True
else:
return key not in self or self[key] == self.schema[key].default
def find_default_keys(self):
return [
k for k in list(self.keys()) + list(self.schema.keys())
if self.is_default(k)
]
def mandatory(self):
return any([k for k in self.schema.keys() if not self.has_default(k)])
def find_missing_keys(self):
missing = [
k for k in self.schema.keys()
if k not in self and not self.has_default(k)
]
placeholders = [k for k in self if self[k] in ('<missing>', '<value>')]
return missing + placeholders
def find_extra_keys(self):
return list(set(self.keys()) - set(self.schema.keys()))
def find_mismatch_keys(self):
mismatch_keys = []
for arg in self.schema.values():
if arg.type is not None:
try:
check_type("{}.{}".format(self.name, arg.name),
self[arg.name], arg.type)
except Exception:
mismatch_keys.append(arg.name)
return mismatch_keys
def validate(self):
missing_keys = self.find_missing_keys()
if missing_keys:
raise ValueError("Missing param for class<{}>: {}".format(
self.name, ", ".join(missing_keys)))
extra_keys = self.find_extra_keys()
if extra_keys and self.strict:
raise ValueError("Extraneous param for class<{}>: {}".format(
self.name, ", ".join(extra_keys)))
mismatch_keys = self.find_mismatch_keys()
if mismatch_keys:
raise TypeError("Wrong param type for class<{}>: {}".format(
self.name, ", ".join(mismatch_keys)))
class SharedConfig(object):
"""
Representation class for `__shared__` annotations, which work as follows:
- if `key` is set for the module in config file, its value will take
precedence
- if `key` is not set for the module but present in the config file, its
value will be used
- otherwise, use the provided `default_value` as fallback
Args:
key: config[key] will be injected
default_value: fallback value
"""
def __init__(self, key, default_value=None):
super(SharedConfig, self).__init__()
self.key = key
self.default_value = default_value
def extract_schema(cls):
"""
Extract schema from a given class
Args:
cls (type): Class from which to extract.
Returns:
schema (SchemaDict): Extracted schema.
"""
ctor = cls.__init__
# python 2 compatibility
if hasattr(inspect, 'getfullargspec'):
argspec = inspect.getfullargspec(ctor)
annotations = argspec.annotations
has_kwargs = argspec.varkw is not None
else:
argspec = inspect.getfullargspec(ctor)
# python 2 type hinting workaround, see pep-3107
# however, since `typeguard` does not support python 2, type checking
# is still python 3 only for now
annotations = getattr(ctor, '__annotations__', {})
has_kwargs = argspec.varkw is not None
names = [arg for arg in argspec.args if arg != 'self']
defaults = argspec.defaults
num_defaults = argspec.defaults is not None and len(argspec.defaults) or 0
num_required = len(names) - num_defaults
docs = cls.__doc__
if docs is None and getattr(cls, '__category__', None) == 'op':
docs = cls.__call__.__doc__
try:
docstring = doc_parse(docs)
except Exception:
docstring = None
if docstring is None:
comments = {}
else:
comments = {}
for p in docstring.params:
match_obj = re.match('^([a-zA-Z_]+[a-zA-Z_0-9]*).*', p.arg_name)
if match_obj is not None:
comments[match_obj.group(1)] = p.description
schema = SchemaDict()
schema.name = cls.__name__
schema.doc = ""
if docs is not None:
start_pos = docs[0] == '\n' and 1 or 0
schema.doc = docs[start_pos:].split("\n")[0].strip()
# XXX handle paddle's weird doc convention
if '**' == schema.doc[:2] and '**' == schema.doc[-2:]:
schema.doc = schema.doc[2:-2].strip()
schema.category = hasattr(cls, '__category__') and getattr(
cls, '__category__') or 'module'
schema.strict = not has_kwargs
schema.pymodule = importlib.import_module(cls.__module__)
schema.inject = getattr(cls, '__inject__', [])
schema.shared = getattr(cls, '__shared__', [])
for idx, name in enumerate(names):
comment = name in comments and comments[name] or name
if name in schema.inject:
type_ = None
else:
type_ = name in annotations and annotations[name] or None
value_schema = SchemaValue(name, comment, type_)
if name in schema.shared:
assert idx >= num_required, "shared config must have default value"
default = defaults[idx - num_required]
value_schema.set_default(SharedConfig(name, default))
elif idx >= num_required:
default = defaults[idx - num_required]
value_schema.set_default(default)
schema.set_schema(name, value_schema)
return schema
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment