Commit c6a27e0b authored by panhb

first init

parent e4b993b1
# yolov8_paddle
# yolov8
## Paper
## Model structure
YOLOv8 is a single-stage object detection algorithm. It builds on YOLOv5 and adds several new improvements that raise both speed and accuracy.<br>
![block.jpg](asserts%2Fblock.jpg)
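## Algorithm principle
YOLOv8 divides the image into grids of different sizes and predicts the object class and bounding box for each grid cell. It relies on a feature pyramid structure and adaptive model scaling to achieve efficient, accurate real-time object detection.
- The backbone and neck replace YOLOv5's C3 structure with the C2f structure, which has richer gradient flow, and use different channel counts for the different model scales, which substantially improves model performance.
- The head changes considerably from YOLOv5: it adopts the now-mainstream decoupled head, which separates the classification and detection branches, and it switches from anchor-based to anchor-free.
- For loss computation, it uses the TaskAlignedAssigner positive-sample assignment strategy and introduces Distribution Focal Loss.
- For training-time data augmentation, it adopts the YOLOX practice of turning off Mosaic augmentation for the last 10 epochs, which effectively improves accuracy.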
![model_framework.png](asserts%2Fmodel_framework.png)
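As a rough illustration of the anchor-free head with Distribution Focal Loss described above, the sketch below decodes one box from per-side bin logits: the distance to each side is the expectation of a softmax distribution over discrete bins, scaled by the feature-map stride. The function names, the 16-bin setting, and the pixel-coordinate anchor point are assumptions made for this example; this is not code from this repository.

```python
import math


def expected_bin(logits):
    """Softmax over the bin logits, then return the expected bin index."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return sum(i * w / total for i, w in enumerate(weights))


def decode_box(anchor_point, side_logits, stride):
    """Turn (left, top, right, bottom) bin logits into an (x1, y1, x2, y2) box.

    anchor_point: (x, y) centre of the grid cell, in input-image pixels.
    side_logits: four lists of logits, one list per box side.
    stride: downsampling factor of the feature map the cell belongs to.
    """
    left, top, right, bottom = (expected_bin(s) * stride for s in side_logits)
    x, y = anchor_point
    return (x - left, y - top, x + right, y + bottom)


if __name__ == "__main__":
    # Hypothetical input: 16 bins per side, almost all probability mass on bin 2.
    sharp = [-20.0] * 16
    sharp[2] = 20.0
    print(decode_box((320.0, 320.0), [sharp] * 4, stride=8))
```

Because the head predicts distances from a grid-cell centre rather than offsets to predefined anchor boxes, no anchor sizes appear anywhere in the decode step.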
## Environment setup
### Docker (Method 1)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/paddlepaddle:2.6.1-py3.10-dtk24.04.3-ubuntu20.04
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=128G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name --network=host imageID /bin/bash
```
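### Anaconda (Method 2)
1. The special deep learning libraries that this project needs for DCU cards can be downloaded and installed from the Guanghe developer community (光合开发者社区): https://developer.hpccube.com/tool/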
```
DTK software stack: dtk24.04.03
python: python3.10
paddlepaddle: 2.6.1
```
Tips: the versions of the DCU-related tools listed above (DTK software stack, Python, Paddle) must match one another exactly.
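A quick way to confirm that an environment matches the Python and PaddlePaddle versions listed above is sketched below. The helper names are made up for this example, and the comparison looks only at the leading numeric components, so a build suffix after `2.6.1` still counts as a match. The DTK version is not checked here.

```python
import re
import sys

# Versions taken from the list above.
EXPECTED = {"python": "3.10", "paddlepaddle": "2.6.1"}


def release_numbers(version):
    """Return the numeric components of a version string as a tuple of ints."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


def matches(installed, expected):
    """True when `installed` starts with the same numeric components as `expected`."""
    want = release_numbers(expected)
    return release_numbers(installed)[:len(want)] == want


if __name__ == "__main__":
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    status = "ok" if matches(python_version, EXPECTED["python"]) else "MISMATCH"
    print("python", python_version, status)
    try:
        import paddle
    except ImportError:
        print("paddlepaddle is not installed")
    else:
        status = "ok" if matches(paddle.__version__, EXPECTED["paddlepaddle"]) else "MISMATCH"
        print("paddlepaddle", paddle.__version__, status)
```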
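## Dataset
COCO2017 (if the dataset has not been downloaded and the network connection is good, the program downloads it online by default)
Fast download center for training data: [SCNet AIDatasets](http://113.200.138.88:18080/aidatasets/). The training data used in this project can be downloaded from [COCO2017](http://113.200.138.88:18080/aidatasets/coco2017)
[Training data](http://images.cocodataset.org/zips/train2017.zip)
[Validation data](http://images.cocodataset.org/zips/val2017.zip)
[Test data](http://images.cocodataset.org/zips/test2017.zip)
[Label data](https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels.zip)
The dataset directory structure is as follows: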
```
├── images
│ ├── train2017
│ ├── val2017
│ ├── test2017
├── labels
│ ├── train2017
│ ├── val2017
├── annotations
│ ├── instances_val2017.json
├── LICENSE
├── README.txt
├── test-dev2017.txt
├── train2017.txt
├── val2017.txt
```
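Before starting a long training run, it can help to check that the dataset root really has the layout shown above. The following is a minimal sketch: the entry list is copied from the tree, while the function name and the command-line handling are assumptions made for this example.

```python
import sys
from pathlib import Path

# Entries copied from the directory tree above.
EXPECTED_ENTRIES = [
    "images/train2017",
    "images/val2017",
    "images/test2017",
    "labels/train2017",
    "labels/val2017",
    "annotations/instances_val2017.json",
    "train2017.txt",
    "val2017.txt",
    "test-dev2017.txt",
]


def missing_entries(dataset_root):
    """Return the expected entries that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    # Usage: python check_dataset.py /path/to/dataset (defaults to the current directory)
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_entries(root)
    if missing:
        print("Missing under", root)
        for entry in missing:
            print("  ", entry)
    else:
        print("Dataset layout looks complete.")
```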
## Training
### Single node, multiple cards
```bash
cd /your_code_path/yolov8_paddle
# Train YOLOv8 with mixed precision. --amp enables mixed-precision training to avoid running out of GPU memory; --eval runs evaluation during training.
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/yolov8/yolov8_s_500e_coco.yml --amp --eval
# If you hit an error that the dataset path cannot be found, change the official default dataset location, datasets_dir: /home/yolov8_pytorch/
vim /root/.config/Ultralytics/settings.yaml
```
### Single node, single card
```bash
cd /your_code_path/yolov8_paddle
python tools/train.py -c configs/yolov8/yolov8_s_500e_coco.yml --eval --amp
```
## Evaluation
```bash
cd /your_code_path/yolov8_paddle
python tools/eval.py -c configs/yolov8/yolov8_s_500e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolov8_s_500e_coco.pdparams
```
## Inference
- Use --infer_img to run inference on a single image, or --infer_dir to run inference on every image in a folder.
```bash
# Run inference on a single image
python tools/infer.py -c configs/yolov8/yolov8_s_500e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolov8_s_500e_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
# Run inference on every image in a folder
python tools/infer.py -c configs/yolov8/yolov8_s_500e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolov8_s_500e_coco.pdparams --infer_dir=demo
```
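To make the relationship between the two options concrete, the sketch below shows one way a script could turn a single-image argument or a folder argument into a list of image paths. It illustrates the idea only and is not the implementation in `tools/infer.py`; the function name, the suffix list, and the rule that the single image takes precedence are all assumptions.

```python
from pathlib import Path

# Assumed set of image suffixes for this sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_images(infer_img=None, infer_dir=None):
    """Return the image paths selected by a single-image or a folder argument."""
    if infer_img:
        path = Path(infer_img)
        if not path.is_file():
            raise FileNotFoundError(f"{path} is not a file")
        return [path]
    if infer_dir:
        folder = Path(infer_dir)
        if not folder.is_dir():
            raise NotADirectoryError(f"{folder} is not a directory")
        # Keep regular files with a known image suffix, in a stable order.
        return sorted(
            p for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    raise ValueError("pass either infer_img or infer_dir")


if __name__ == "__main__":
    for image in collect_images(infer_dir="demo"):
        print(image)
```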
## Result
![result.jpg](asserts%2F000000014439.jpg)
### Accuracy
| Model | AMP (mixed precision) | Box AP |
|:------:|:----------------:|:------:|
| YOLOv8-s | on | |
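## Application scenarios
### Algorithm category
`Object detection`
### Popular application industries
`Finance, transportation, education`
## Source repository and issue feedback
- https://developer.hpccube.com/codes/modelzoo/yolov8_pytorch
## References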
- https://github.com/ultralytics/ultralytics
# ConvNeXt (A ConvNet for the 2020s)
## Model Zoo
### ConvNeXt on COCO
| Network | Input size | Images/GPU | LR schedule | mAP<sup>val</sup><br>0.5:0.95 | mAP<sup>val</sup><br>0.5 | Params(M) | FLOPs(G) | Download | Config |
| :------------- | :------- | :-------: | :------: | :------------: | :---------------------: | :----------------: |:---------: | :------: |:---------------: |
| PP-YOLOE-ConvNeXt-tiny | 640 | 16 | 36e | 44.6 | 63.3 | 33.04 | 13.87 | [Download](https://paddledet.bj.bcebos.com/models/ppyoloe_convnext_tiny_36e_coco.pdparams) | [Config](./ppyoloe_convnext_tiny_36e_coco.yml) |
| YOLOX-ConvNeXt-s | 640 | 8 | 36e | 44.6 | 65.3 | 36.20 | 27.52 | [Download](https://paddledet.bj.bcebos.com/models/yolox_convnext_s_36e_coco.pdparams) | [Config](./yolox_convnext_s_36e_coco.yml) |
| YOLOv5-s ConvNeXt | 640 | 8 | 36e | 42.4 | 65.3 | 34.54 | 17.96 | [Download](https://paddledet.bj.bcebos.com/models/yolov5_convnext_s_36e_coco.pdparams) | [Config](./yolov5_convnext_s_36e_coco.yml) |
## Citations
```
@Article{liu2022convnet,
author = {Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
title = {A ConvNet for the 2020s},
journal = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022},
}
```
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '../ppyoloe/_base_/ppyoloe_crn.yml',
  '../ppyoloe/_base_/ppyoloe_reader.yml',
]
depth_mult: 0.25
width_mult: 0.50
log_iter: 100
snapshot_epoch: 5
weights: output/ppyoloe_convnext_tiny_36e_coco/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/convnext_tiny_22k_224.pdparams
YOLOv3:
  backbone: ConvNeXt
  neck: CustomCSPPAN
  yolo_head: PPYOLOEHead
  post_process: ~
ConvNeXt:
  arch: 'tiny'
  drop_path_rate: 0.4
  layer_scale_init_value: 1.0
  return_idx: [1, 2, 3]
PPYOLOEHead:
  static_assigner_epoch: 12
  nms:
    nms_top_k: 1000
    keep_top_k: 300
    score_threshold: 0.01
    nms_threshold: 0.7
TrainReader:
  batch_size: 16
epoch: 36
LearningRate:
  base_lr: 0.0002
  schedulers:
    - !PiecewiseDecay
      gamma: 0.1
      milestones: [36]
      use_warmup: false
OptimizerBuilder:
  regularizer: false
  optimizer:
    type: AdamW
    weight_decay: 0.0005
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '../yolov5/_base_/yolov5_cspresnet.yml',
  '../yolov5/_base_/yolov5_reader.yml',
]
depth_mult: 0.33
width_mult: 0.50
log_iter: 100
snapshot_epoch: 5
weights: output/yolov5_convnext_s_36e_coco/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/convnext_tiny_22k_224.pdparams
YOLOv5:
  backbone: ConvNeXt
  neck: YOLOCSPPAN
  yolo_head: YOLOv5Head
  post_process: ~
ConvNeXt:
  arch: 'tiny'
  drop_path_rate: 0.4
  layer_scale_init_value: 1.0
  return_idx: [1, 2, 3]
TrainReader:
  batch_size: 8
epoch: 36
LearningRate:
  base_lr: 0.0002
  schedulers:
    - !PiecewiseDecay
      gamma: 0.1
      milestones: [36]
      use_warmup: false
OptimizerBuilder:
  regularizer: false
  optimizer:
    type: AdamW
    weight_decay: 0.0005
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '../yolox/_base_/yolox_cspdarknet.yml',
  '../yolox/_base_/yolox_reader.yml'
]
depth_mult: 0.33
width_mult: 0.50
log_iter: 100
snapshot_epoch: 5
weights: output/yolox_convnext_s_36e_coco/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/convnext_tiny_22k_224.pdparams
YOLOX:
  backbone: ConvNeXt
  neck: YOLOCSPPAN
  head: YOLOXHead
  size_stride: 32
  size_range: [15, 25] # multi-scale range [480*480 ~ 800*800]
ConvNeXt:
  arch: 'tiny'
  drop_path_rate: 0.4
  layer_scale_init_value: 1.0
  return_idx: [1, 2, 3]
TrainReader:
  batch_size: 8
  mosaic_epoch: 30
YOLOXHead:
  l1_epoch: 30
  nms:
    name: MultiClassNMS
    nms_top_k: 10000
    keep_top_k: 1000
    score_threshold: 0.001
    nms_threshold: 0.65
epoch: 36
LearningRate:
  base_lr: 0.0002
  schedulers:
    - !PiecewiseDecay
      gamma: 0.1
      milestones: [36]
      use_warmup: false
OptimizerBuilder:
  regularizer: false
  optimizer:
    type: AdamW
    weight_decay: 0.0005
metric: COCO
num_classes: 80
TrainDataset:
  name: COCODataSet
  image_dir: train2017
  anno_path: annotations/instances_train2017.json
  dataset_dir: /dataset/COCO2017
  data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  name: COCODataSet
  image_dir: val2017
  anno_path: annotations/instances_val2017.json
  dataset_dir: /dataset/COCO2017
TestDataset:
  name: ImageFolder
  anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
  dataset_dir: /dataset/COCO2017 # if set, anno_path will be 'dataset_dir/anno_path'
metric: COCO
num_classes: 80
TrainDataset:
  name: COCODataSet
  image_dir: train2017
  anno_path: annotations/instances_train2017.json
  dataset_dir: dataset/coco
  data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_poly', 'is_crowd']
EvalDataset:
  name: COCODataSet
  image_dir: val2017
  anno_path: annotations/instances_val2017.json
  dataset_dir: dataset/coco
TestDataset:
  name: ImageFolder
  anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
  dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
metric: COCO
num_classes: 365
TrainDataset:
  !COCODataSet
    image_dir: train
    anno_path: annotations/zhiyuan_objv2_train.json
    dataset_dir: dataset/objects365
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  !COCODataSet
    image_dir: val
    anno_path: annotations/zhiyuan_objv2_val.json
    dataset_dir: dataset/objects365
    allow_empty: true
TestDataset:
  !ImageFolder
    anno_path: annotations/zhiyuan_objv2_val.json
    dataset_dir: dataset/objects365/
metric: COCO
num_classes: 601
# Due to the large dataset, training and evaluation are not supported currently
TrainDataset:
  !COCODataSet
    image_dir: train
    anno_path: annotations/train.json
    dataset_dir: dataset/OpenImagesV7
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
# Due to the large dataset, training and evaluation are not supported currently
EvalDataset:
  !COCODataSet
    image_dir: val
    anno_path: annotations/val.json
    dataset_dir: dataset/OpenImagesV7
    allow_empty: true
TestDataset:
  !ImageFolder
    anno_path: label_list.txt
    dataset_dir: dataset/OpenImagesV7
metric: VOC
map_type: integral
num_classes: 4
TrainDataset:
  name: VOCDataSet
  dataset_dir: dataset/roadsign_voc
  anno_path: train.txt
  label_list: label_list.txt
  data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
EvalDataset:
  name: VOCDataSet
  dataset_dir: dataset/roadsign_voc
  anno_path: valid.txt
  label_list: label_list.txt
  data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
TestDataset:
  name: ImageFolder
  anno_path: dataset/roadsign_voc/label_list.txt
metric: COCO
num_classes: 10
TrainDataset:
  !COCODataSet
    image_dir: VisDrone2019-DET-train
    anno_path: train.json
    dataset_dir: dataset/visdrone
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  !COCODataSet
    image_dir: VisDrone2019-DET-val
    anno_path: val.json
    # image_dir: test_dev
    # anno_path: test_dev.json
    dataset_dir: dataset/visdrone
TestDataset:
  !ImageFolder
    anno_path: val.json
    dataset_dir: dataset/visdrone
metric: VOC
map_type: 11point
num_classes: 20
TrainDataset:
  name: VOCDataSet
  dataset_dir: dataset/voc
  anno_path: trainval.txt
  label_list: label_list.txt
  data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
EvalDataset:
  name: VOCDataSet
  dataset_dir: dataset/voc
  anno_path: test.txt
  label_list: label_list.txt
  data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
TestDataset:
  name: ImageFolder
  anno_path: dataset/voc/label_list.txt
# FocalNet (Focal Modulation Networks)
## Model Zoo
### FocalNet on COCO
| Network | Input size | Images/GPU | LR schedule | Inference speed (fps) | mAP<sup>val</sup><br>0.5:0.95 | Download | Config |
| :--------- | :---- | :-------: | :------: | :---------------------: | :----------------: | :-------: |:------: |
| PP-YOLOE+ FocalNet-tiny | 640 | 8 | 36e | - | 46.6 | [Download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_focalnet_tiny_36e_coco.pdparams) | [Config](./ppyoloe_plus_focalnet_tiny_36e_coco.yml) |
## Citations
```
@misc{yang2022focal,
title={Focal Modulation Networks},
author={Jianwei Yang and Chunyuan Li and Xiyang Dai and Jianfeng Gao},
journal={Advances in Neural Information Processing Systems (NeurIPS)},
year={2022}
}
```
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '../ppyoloe/_base_/ppyoloe_plus_crn.yml',
  '../ppyoloe/_base_/ppyoloe_plus_reader.yml',
]
depth_mult: 0.33 # s version
width_mult: 0.50
log_iter: 100
snapshot_epoch: 4
weights: output/ppyoloe_plus_focalnet_tiny_36e_coco/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/focalnet_tiny_lrf_pretrained.pdparams
architecture: PPYOLOE
norm_type: sync_bn
use_ema: true
ema_decay: 0.9998
ema_black_list: ['proj_conv.weight']
custom_black_list: ['reduce_mean']
PPYOLOE:
  backbone: FocalNet
  neck: CustomCSPPAN
  yolo_head: PPYOLOEHead
  post_process: ~
FocalNet:
  arch: 'focalnet_T_224_1k_lrf'
  out_indices: [1, 2, 3]
PPYOLOEHead:
  static_assigner_epoch: 12
  nms:
    nms_top_k: 1000
    keep_top_k: 300
    score_threshold: 0.01
    nms_threshold: 0.7
TrainReader:
  batch_size: 8
epoch: 36
LearningRate:
  base_lr: 0.0001
  schedulers:
    - !PiecewiseDecay
      gamma: 0.1
      milestones: [36]
    - !LinearWarmup
      start_factor: 0.1
      steps: 1000
OptimizerBuilder:
  regularizer: false
  optimizer:
    type: AdamW
    weight_decay: 0.0005