# YOLOv5 Compute Performance Test

## Related Documentation

[Comprehensive Guide to Ultralytics YOLOv5 - Ultralytics YOLOv8 Docs](https://docs.ultralytics.com/yolov5/)

## Model Structure

YOLOv5 is a one-stage object detection algorithm built on a lightweight convolutional neural network. It fuses features from different scales through a feature pyramid structure to achieve efficient and accurate object detection.

![Backbone](Backbone.png)

## Algorithm Principle

YOLOv5 is a one-stage object detector. It divides the image into grids of different sizes and predicts the object classes and bounding boxes in each grid cell, using a feature pyramid structure and adaptive model scaling to achieve efficient, accurate real-time detection.

![Algorithm_principle](Algorithm_principle.png)
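
As a rough illustration of the grid idea described above, the sketch below maps a box center to its grid cell at three feature-map scales. The strides 8, 16, and 32 and the 640-pixel input are the usual YOLOv5 defaults and are assumptions here; `grid_cell` is a hypothetical helper, not a function from the YOLOv5 code base.

```python
# Hypothetical helper for illustration only; not part of the YOLOv5 code base.
def grid_cell(cx, cy, img_size=640, stride=8):
    """Return (col, row, cells_per_side) of the grid cell containing a box center.

    cx and cy are normalized to [0, 1), as in YOLO-format label files.
    """
    cells = img_size // stride              # number of cells along each side
    col = min(int(cx * cells), cells - 1)   # clamp so cx close to 1.0 stays in range
    row = min(int(cy * cells), cells - 1)
    return col, row, cells


if __name__ == "__main__":
    # Assumed strides of the three detection scales (fine to coarse grids).
    for stride in (8, 16, 32):
        col, row, cells = grid_cell(0.5, 0.25, stride=stride)
        print(f"stride {stride:2d}: {cells}x{cells} grid, center in cell ({col}, {row})")
```

A smaller stride gives a finer grid, which is why the fine grid is typically responsible for small objects and the coarse grid for large ones.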

## Environment Setup

### Docker (Method 1)

```
# Pull the PyTorch 1.10.0 / DTK 22.10.1 base image
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-22.10.1-py38-latest

# Start a container; replace the mount path, docker_name, and imageID with your own values
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

# Inside the container, install the Python dependencies
cd /path/workspace/
pip3 install -r requirements.txt
```

### Dockerfile (Method 2)

```
cd ./docker
# Build the image from the Dockerfile in this directory
docker build --no-cache -t yolov5:6.0 .
# Start a container; replace imageID with the image built above (for example yolov5:6.0)
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
```

### Anaconda (Method 3)

1. The special deep learning libraries that DCU cards require for this project can be downloaded and installed from the Guanghe (光合) developer community: https://developer.hpccube.com/tool/

```
DTK software stack: dtk22.10.1
python: python3.8
torch: 1.10
torchvision: 0.10.0
```

Tips: the versions of the DTK software stack, Python, torch, and the other DCU-related tools listed above must match each other exactly.

2. Install the remaining ordinary libraries directly from requirements.txt:

```
pip3 install -r requirements.txt
```
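
Because the versions listed for the Anaconda method have to line up, a quick check after installation may help. The sketch below compares the installed `torch` and `torchvision` versions with the ones listed in this README. The helper names are hypothetical, and the prefix match assumes that DCU builds report versions such as `1.10.0` or `1.10.0+<suffix>`, which is an assumption rather than something this README states.

```python
from importlib import metadata

# Versions listed in this README for the DCU environment.
EXPECTED = {"torch": "1.10", "torchvision": "0.10.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches(installed, expected):
    """True if the installed version equals the expected one or extends it.

    For example '1.10.0+abc' matches '1.10', but '1.100' does not.
    """
    if installed is None:
        return False
    return installed == expected or installed.startswith((expected + ".", expected + "+"))


if __name__ == "__main__":
    for name, expected in EXPECTED.items():
        found = installed_version(name)
        status = "OK" if matches(found, expected) else "MISMATCH"
        print(f"{name}: expected {expected}, found {found} -> {status}")
```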



## Dataset

COCO2017 (if the dataset has not been downloaded and the network connection is good, the program downloads it automatically by default)

[Training data](http://images.cocodataset.org/zips/train2017.zip)

[Validation data](http://images.cocodataset.org/zips/val2017.zip)

[Test data](http://images.cocodataset.org/zips/test2017.zip)

[Label data](https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels.zip)

The dataset directory structure is as follows:

```
├── images
│   ├── train2017
│   ├── val2017
│   └── test2017
├── labels
│   ├── train2017
│   └── val2017
├── annotations
│   └── instances_val2017.json
├── LICENSE
├── README.txt
├── test-dev2017.txt
├── train2017.txt
└── val2017.txt
```
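
Before starting a long training run it can be worth confirming that the layout above is in place. The following is a minimal sketch of such a check; `missing_entries` is a hypothetical helper and the dataset root path in the usage part is a placeholder, not a path defined by this project.

```python
from pathlib import Path

# Entries taken from the directory tree above.
REQUIRED_DIRS = [
    "images/train2017", "images/val2017", "images/test2017",
    "labels/train2017", "labels/val2017",
]
REQUIRED_FILES = [
    "annotations/instances_val2017.json",
    "train2017.txt", "val2017.txt", "test-dev2017.txt",
]


def missing_entries(root):
    """Return the required paths that are absent under the dataset root."""
    root = Path(root)
    missing = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Placeholder path; point it at your own dataset root.
    for entry in missing_entries("../datasets/coco"):
        print("missing:", entry)
```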

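## Training

If you use DTK 22.10.1 or an earlier version, change the batch normalization (BN) configuration before training:

As shown in the figure, set `torch.backends.cudnn.enabled` to `False`.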

![native_bn](native_bn.png)
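
For readers who cannot view the figure, the change amounts to a single assignment. The sketch below wraps it in a hypothetical helper so it can be tried safely even where torch is not installed; where exactly the line belongs in the project code is shown only in the figure, so calling it once before the model is built is an assumption.

```python
def disable_cudnn():
    """Turn off the cuDNN backend flag; return True if torch was available."""
    try:
        import torch
    except ImportError:
        return False  # torch is not installed in this environment
    # Same setting as in the figure: fall back to the native BN implementation.
    torch.backends.cudnn.enabled = False
    return True


if __name__ == "__main__":
    print("cudnn disabled:", disable_cudnn())
```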

### Single Node, Single Card

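```
# Make only card 0 visible to the process
export HIP_VISIBLE_DEVICES=0

# Train yolov5m from scratch (empty --weights) and save the output to yolov5m.log
python3 train.py --batch 32 --data coco.yaml --cfg 'yolov5m.yaml' --weights '' --project 'run/train' --hyp 'data/hyps/hyp.scratch-high.yaml' --epochs 1000 2>&1 | tee yolov5m.log
```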

### Single Node, Multiple Cards

```
# Example with four cards on a single node
export HIP_VISIBLE_DEVICES=0,1,2,3
export HSA_FORCE_FINE_GRAIN_PCIE=1

python3 -m torch.distributed.run --nproc_per_node 4 train.py --batch 128 --data coco.yaml --cfg 'yolov5m.yaml' --weights '' --project 'run/train' --hyp 'data/hyps/hyp.scratch-high.yaml' --device 0,1,2,3 --epochs 1000 2>&1 | tee yolov5m_4.log
```
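
The `--batch` values in this section grow with the number of cards (32, 128, 256). To my understanding of upstream YOLOv5, `--batch` is the total batch size across all cards and must be divisible by the number of processes; the sketch below only illustrates that arithmetic, and `per_card_batch` is a hypothetical helper rather than project code.

```python
def per_card_batch(total_batch, world_size):
    """Split the total batch size evenly across all participating cards."""
    if total_batch % world_size != 0:
        raise ValueError("total batch size must be a multiple of the number of cards")
    return total_batch // world_size


if __name__ == "__main__":
    print(per_card_batch(32, 1))    # single card
    print(per_card_batch(128, 4))   # one node, four cards
    print(per_card_batch(256, 8))   # two nodes, four cards each
```

Under that assumption, all three configurations in this section process the same number of images per card in each step.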

### Multiple Nodes, Multiple Cards

```
# This example uses two nodes with 4 accelerator cards each
# node 1
python3 -m torch.distributed.launch --nproc_per_node 4 --nnodes 2 --node_rank 0 --master_addr "node1" --master_port 34567 train.py --batch 256 --data coco.yaml --weights '' --project 'multi/train' --hyp 'data/hyps/hyp.scratch-high.yaml' --cfg 'yolov5m.yaml' --epochs 1000 2>&1 | tee yolov5m_8.log

# node 2
python3 -m torch.distributed.launch --nproc_per_node 4 --nnodes 2 --node_rank 1 --master_addr "node1" --master_port 34567 train.py --batch 256 --data coco.yaml --weights '' --project 'multi/train' --hyp 'data/hyps/hyp.scratch-high.yaml' --cfg 'yolov5m.yaml' --epochs 1000 2>&1 | tee yolov5m_8.log
```
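
The two commands above are identical except for `--node_rank`. When more nodes are involved, generating the commands can avoid copy-and-paste mistakes. The sketch below is one possible way to do that; `launch_command` is a hypothetical helper, its defaults mirror the values used above, and it spells the option as `--weights` in full.

```python
import shlex


def launch_command(node_rank, nnodes=2, nproc_per_node=4,
                   master_addr="node1", master_port=34567, total_batch=256):
    """Build the launch command for one node as a list of arguments."""
    return [
        "python3", "-m", "torch.distributed.launch",
        "--nproc_per_node", str(nproc_per_node),
        "--nnodes", str(nnodes),
        "--node_rank", str(node_rank),
        "--master_addr", master_addr,
        "--master_port", str(master_port),
        "train.py",
        "--batch", str(total_batch),
        "--data", "coco.yaml",
        "--weights", "",
        "--project", "multi/train",
        "--hyp", "data/hyps/hyp.scratch-high.yaml",
        "--cfg", "yolov5m.yaml",
        "--epochs", "1000",
    ]


if __name__ == "__main__":
    for rank in range(2):
        print(f"# node_rank {rank}")
        # shlex.join quotes the empty --weights value so it survives the shell.
        print(shlex.join(launch_command(rank)))
```

Every node must use the same `--master_addr` and `--master_port`; only the rank changes.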

### Inference

#### Single-Card Inference

```
python3 val.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65 --batch-size 32 --weights yolov5s.pt --device 0
```

#### Multi-Card Inference

```
# Example with four cards
python3 val.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65 --batch-size 128 --weights yolov5s.pt --device 0,1,2,3
```
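
In the commands above, `--conf` is the confidence threshold and `--iou` is the IoU threshold used by non-maximum suppression (NMS). The sketch below shows how the two thresholds interact in a plain greedy NMS. It is a class-agnostic, pure-Python simplification written for illustration and is not the implementation used by `val.py`; the function names and sample boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, conf_thres=0.001, iou_thres=0.65):
    """Greedy NMS over (box, score) pairs; returns the kept detections."""
    # Drop low-confidence boxes, then visit the rest from highest to lowest score.
    candidates = sorted((d for d in detections if d[1] > conf_thres),
                        key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


if __name__ == "__main__":
    # Made-up boxes: the second overlaps the first heavily, the third is separate.
    dets = [((0, 0, 10, 10), 0.9), ((1, 1, 10, 10), 0.8), ((20, 20, 30, 30), 0.7)]
    print(nms(dets))
```

A very low confidence threshold such as 0.001 keeps almost all candidates, which is typical for mAP evaluation because the metric integrates over the whole precision-recall curve.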
#### Result

The inference test here uses the yolov5s model.

![bus](bus.jpg)

## Accuracy Data

|  Model  |    Data type    | mAP@0.5:0.95 | mAP@0.5 |
| :-----: | :-------------: | :----------: | :-----: |
| yolov5n | Mixed precision |     27.9     |  46.8   |
| yolov5s | Mixed precision |     37.2     |  57.1   |
| yolov5m | Mixed precision |     44.3     |  64.1   |
| yolov5l | Mixed precision |      48      |  67.3   |
| yolov5x | Mixed precision |     49.6     |  68.6   |
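
For readers unfamiliar with the two columns: mAP@0.5 is the average precision at an IoU threshold of 0.5, while the COCO-style mAP@0.5:0.95 averages it over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below illustrates only that averaging step; the helper names are hypothetical and the AP values are made up for the example, not measurements from this project.

```python
def coco_iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 used for COCO-style mAP@0.5:0.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def mean_ap(ap_per_threshold):
    """Average the AP values, one per IoU threshold."""
    return sum(ap_per_threshold) / len(ap_per_threshold)


if __name__ == "__main__":
    thresholds = coco_iou_thresholds()
    # Made-up AP values for illustration only; they are not measurements.
    fake_ap = [0.60 - 0.05 * i for i in range(10)]
    for t, ap in zip(thresholds, fake_ap):
        print(f"IoU {t:.2f}: AP {ap:.2f}")
    print("mean over thresholds:", round(mean_ap(fake_ap), 3))
```

Because stricter IoU thresholds are harder to satisfy, mAP@0.5:0.95 is lower than mAP@0.5 for every model in the table.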

## Application Scenarios

### Algorithm Category

Object detection

### Key Application Industries

Video surveillance and security, retail, industrial quality inspection, autonomous driving

## Plotting Loss and Accuracy Curves

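To obtain loss and mAP curves after training for a while, use the provided view_code.py file: write the path you passed to `--project` during training into the file, then run `python3 view_code.py`. The curve images are generated under that path.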
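
The contents of view_code.py are not reproduced here. As an independent sketch of the same idea, the code below reads selected columns from a `results.csv` file and plots them if matplotlib is available. It assumes that the run directory contains a `results.csv` with space-padded column names such as `train/box_loss` and `metrics/mAP_0.5`, as written by upstream YOLOv5; the path and the helper `read_series` are hypothetical.

```python
import csv
import os


def read_series(csv_path, columns):
    """Read selected columns from a results CSV as lists of floats.

    Header names are stripped because the file pads them with spaces.
    """
    series = {name: [] for name in columns}
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = [h.strip() for h in next(reader)]
        index = {name: header.index(name) for name in columns}
        for row in reader:
            if not row:
                continue  # skip empty lines
            for name, i in index.items():
                series[name].append(float(row[i]))
    return series


if __name__ == "__main__":
    # Placeholder path: a run directory under the --project path used for training.
    path = "run/train/exp/results.csv"
    if not os.path.isfile(path):
        print("no results file at", path)
    else:
        data = read_series(path, ["train/box_loss", "metrics/mAP_0.5"])
        try:
            import matplotlib.pyplot as plt
        except ImportError:
            plt = None
        if plt is not None:
            for name, values in data.items():
                plt.figure()
                plt.plot(values)
                plt.title(name)
                plt.xlabel("epoch")
                plt.savefig(name.replace("/", "_") + ".png")
```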

## FAQ: pycocotools Reports Abnormally Low Results

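After training or inference finishes, the results reported by pycocotools are sometimes abnormal: the values are very low even though the results during training were normal, as shown below:

![Incorrect pycocotools result](pycoco_false_result.png)

This is caused by a Python version that is too old. Besides upgrading Python, you can also fix it by modifying the code: in val.py, comment out the code inside the red box at the location shown in the figure below to obtain correct results.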

![pycocotools](pycocotools.png)

## Source Repository and Issue Feedback

https://developer.hpccube.com/codes/modelzoo/yolov5_pytorch

## References

[GitHub - ultralytics/yolov5 at v6.0](https://github.com/ultralytics/yolov5/tree/v6.0)