# YOLOV5 Detector

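YOLOV5 is one of the more widely used detection algorithms in industry, and the official repository provides pretrained models in several versions. This document describes how to build YOLOV5 inference on top of MIGraphX, covering both static inference and dynamic-shape inference. The same inference flow also applies to other versions of the YOLOV5 model.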

## Model Overview

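YOLOV5 is a single-stage object detection algorithm. It builds on YOLOV4 with a number of refinements that improve both speed and accuracy: Mosaic data augmentation, adaptive anchor box computation, and adaptive image scaling at the input; the Focus and CSP structures in the backbone; the FPN+PAN structure in the neck; and, at the output, the GIOU_Loss loss function and DIOU_nms for filtering predicted boxes. The network structure is shown in the figure below.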

<img src=./YOLOV5_01.jpg style="zoom:100%;" align=middle>

## Detector Parameters

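In the samples project, the DetectorYOLOV5 node of Resource/Configuration.xml holds the parameters of the YOLOV5 detector. The values mainly follow the official inference example. The parameters have the following meanings:

- ModelPathDynamic: path to the YOLOV5 dynamic-shape model
- ModelPathStatic: path to the YOLOV5 static model
- ClassNameFile: path to the COCO class-name file
- UseFP16: whether to run inference in FP16 mode
- NumberOfClasses: number of detection classes
- ConfidenceThreshold: confidence threshold, used to decide whether the object in an anchor counts as a positive sample
- NMSThreshold: non-maximum suppression threshold, used to remove duplicate boxes
- ObjectThreshold: objectness threshold, used to decide whether an anchor contains an object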

```
<ModelPathDynamic>"../Resource/Models/yolov5s_Nx3xNxN.onnx"</ModelPathDynamic>
<ModelPathStatic>"../Resource/Models/yolov5s.onnx"</ModelPathStatic>
<ClassNameFile>"../Resource/Models/coco.names"</ClassNameFile>
<UseFP16>0</UseFP16><!-- whether to use FP16 -->
<NumberOfClasses>80</NumberOfClasses><!-- number of classes (excluding the background class), COCO: 80, VOC: 20 -->
<ConfidenceThreshold>0.25</ConfidenceThreshold>
<NMSThreshold>0.5</NMSThreshold>
<ObjectThreshold>0.5</ObjectThreshold>
```
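For illustration, the values above could be held in a plain struct once the XML has been parsed. The sketch below is an assumption rather than the samples project's actual type: `YoloV5Config` and `isValid` are hypothetical names, and only the field meanings and default values are taken from the configuration shown above.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the DetectorYOLOV5 settings; the defaults mirror
// the Configuration.xml excerpt above.
struct YoloV5Config
{
    std::string modelPathDynamic = "../Resource/Models/yolov5s_Nx3xNxN.onnx";
    std::string modelPathStatic  = "../Resource/Models/yolov5s.onnx";
    std::string classNameFile    = "../Resource/Models/coco.names";
    bool  useFP16             = false;
    int   numberOfClasses     = 80;     // background class not counted
    float confidenceThreshold = 0.25f;
    float nmsThreshold        = 0.5f;
    float objectThreshold     = 0.5f;
};

// Basic sanity check: thresholds lie in [0, 1] and the class count is positive.
inline bool isValid(const YoloV5Config &cfg)
{
    auto inUnitRange = [](float v) { return v >= 0.0f && v <= 1.0f; };
    return cfg.numberOfClasses > 0 &&
           inUnitRange(cfg.confidenceThreshold) &&
           inUnitRange(cfg.nmsThreshold) &&
           inUnitRange(cfg.objectThreshold);
}
```

A check of this kind, run right after loading the configuration, would catch a mistyped threshold before any model is loaded.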

## Model Initialization

Model initialization starts by loading the YOLOV5 ONNX model with the parse_onnx() function.

- Static inference: call parse_onnx to parse the static model

```
ErrorCode DetectorYOLOV5::Initialize(InitializationParameterOfDetector initializationParameterOfDetector, bool dynamic)
{
    ...

        // Load the model
        net = migraphx::parse_onnx(modelPath);
        LOG_INFO(stdout,"succeed to load model: %s\n",GetFileName(modelPath).c_str());
    ...

}
```

- Dynamic-shape inference: the maximum input shape of the model must be set; this example uses {1,3,800,800}

```
ErrorCode DetectorYOLOV5::Initialize(InitializationParameterOfDetector initializationParameterOfDetector, bool dynamic)
{
    ...

        migraphx::onnx_options onnx_options;
        onnx_options.map_input_dims["images"]={1,3,800,800}; // maximum input shape
        net = migraphx::parse_onnx(modelPath, onnx_options);

    ...
}
```
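Because the dynamic model is parsed with a maximum input shape, it can help to check a requested shape against that bound before running inference. The helper below is a minimal sketch under that assumption; `fitsWithinMaxShape` is a hypothetical name and is not part of MIGraphX or the samples project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when `shape` has the same rank as `maxShape` and no dimension
// is zero or larger than the corresponding maximum.
inline bool fitsWithinMaxShape(const std::vector<std::size_t> &shape,
                               const std::vector<std::size_t> &maxShape)
{
    if (shape.size() != maxShape.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < shape.size(); ++i)
    {
        if (shape[i] == 0 || shape[i] > maxShape[i])
        {
            return false;
        }
    }
    return true;
}
```

With the maximum shape {1,3,800,800} used in this example, a {1,3,608,608} input passes the check, while a {1,3,1024,1024} input is rejected.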

## Preprocessing

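Before the data is fed to the model, the image goes through the following preprocessing steps:

- Convert the data layout to NCHW

- Normalize pixel values to [0.0, 1.0]

- Resize the input: static inference fixes the input size to relInputShape=[1,3,608,608], while dynamic inference resizes the input image to the configured dynamic size.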

```
ErrorCode DetectorYOLOV5::Detect(const cv::Mat &srcImage, std::vector<std::size_t> &relInputShape, std::vector<ResultOfDetection> &resultsOfDetection, bool dynamic)
{
    ...

    // Preprocess the image and convert it to NCHW layout
    inputSize = cv::Size(relInputShape[3], relInputShape[2]);
    cv::Mat inputBlob;
    cv::dnn::blobFromImage(srcImage,
                    inputBlob,
                    1 / 255.0,            // scale pixel values to [0.0, 1.0]
                    inputSize,            // target size (width, height)
                    cv::Scalar(0, 0, 0),  // no mean subtraction
                    true,                 // swapRB: BGR -> RGB
                    false);               // crop: resize without cropping

    ...
}
```
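To make the preprocessing steps concrete, the sketch below reproduces the layout change, channel swap, and scaling on a raw 8-bit BGR buffer using only the standard library. It deliberately omits resizing, and `hwcBgrToNchwRgb` is a hypothetical helper, not an OpenCV or samples-project function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts an interleaved HWC BGR image (3 bytes per pixel) into a planar
// 1x3xHxW RGB float tensor with values scaled to [0.0, 1.0].
inline std::vector<float> hwcBgrToNchwRgb(const std::vector<std::uint8_t> &bgr,
                                          std::size_t height, std::size_t width)
{
    assert(bgr.size() == height * width * 3);
    const std::size_t planeSize = height * width;
    std::vector<float> blob(3 * planeSize);
    for (std::size_t y = 0; y < height; ++y)
    {
        for (std::size_t x = 0; x < width; ++x)
        {
            const std::size_t pixel = y * width + x;
            for (std::size_t c = 0; c < 3; ++c)
            {
                // Output channel c (R, G, B) reads input channel 2 - c (B, G, R).
                blob[c * planeSize + pixel] = bgr[pixel * 3 + (2 - c)] / 255.0f;
            }
        }
    }
    return blob;
}
```

The returned buffer has the same memory layout as a single-image NCHW blob, so each colour plane is stored contiguously rather than interleaved per pixel.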

## Inference

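Once the image has been preprocessed and the YOLOV5 detection parameters have been set, inference is run with MIGraphX to compute the output of the YOLOV5 model. For static inference, the shape of the input data inputData is the model's fixed input size; for dynamic inference, it is the actual input size.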

```
ErrorCode DetectorYOLOV5::Detect(const cv::Mat &srcImage, std::vector<std::size_t> &relInputShape, std::vector<ResultOfDetection> &resultsOfDetection, bool dynamic)
{
    ...

    // Create the input data
    migraphx::parameter_map inputData;
    if(dynamic)
    {
        inputData[inputName]= migraphx::argument{migraphx::shape(inputShape.type(), relInputShape), (float*)inputBlob.data};
    }
    else
    {
        inputData[inputName]= migraphx::argument{inputShape, (float*)inputBlob.data};
    }

    // Run inference
    std::vector<migraphx::argument> inferenceResults = net.eval(inputData);

    ...

}
```

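The MIGraphX inference result inferenceResults is a std::vector<migraphx::argument>. The YOLOV5 ONNX model has a single output, so result is inferenceResults[0]. It has three dimensions: outputShape.lens()[0]=1 is the batch size, outputShape.lens()[1]=22743 is the number of generated anchors, and outputShape.lens()[2]=85 is the prediction for each anchor. The 85 values split into 4+1+80: the first 4 are the regression parameters of each feature point, which yield the predicted box once adjusted; the 5th indicates whether the feature point contains an object; and the last 80 indicate the class of the object it contains. With this information available, the anchors are filtered in two steps:

- Step 1 filters by objectThreshold: if the objectness score is above the threshold, the anchor is considered to contain an object; otherwise it is considered empty.
- Step 2 filters by confidenceThreshold: for anchors that pass step 1, if the highest confidence score maxClassScore (the best class score multiplied by the objectness score) is above the threshold, the anchor's coordinates and predicted class are extracted; otherwise the anchor is skipped.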
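The figure of 22743 anchors follows from the 608x608 static input if one assumes the usual YOLOV5 layout of three detection heads with strides 8, 16, and 32 and three anchors per grid cell. That layout is an assumption about the exported model rather than something stated in this tutorial, and `countProposals` is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>

// Number of candidate boxes for an input of height x width, assuming three
// heads with strides 8, 16, 32 and three anchors per grid cell.
inline std::size_t countProposals(std::size_t height, std::size_t width)
{
    const std::size_t strides[] = {8, 16, 32};
    const std::size_t anchorsPerCell = 3;
    std::size_t total = 0;
    for (std::size_t stride : strides)
    {
        total += (height / stride) * (width / stride) * anchorsPerCell;
    }
    return total;
}
```

For a 608x608 input this gives 3 x (76x76 + 38x38 + 19x19) = 22743, which matches outputShape.lens()[1].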

```
ErrorCode DetectorYOLOV5::Detect(const cv::Mat &srcImage, std::vector<std::size_t> &relInputShape, std::vector<ResultOfDetection> &resultsOfDetection, bool dynamic)
{

    ...

    // Number of candidate boxes and number of values per box
    int numProposal = outs[0].size[1];
    int numOut = outs[0].size[2];
    // Reshape the output into numProposal rows
    outs[0] = outs[0].reshape(0, numProposal);

    // Containers for the candidates that pass filtering
    std::vector<float> confidences;
    std::vector<cv::Rect> boxes;
    std::vector<int> classIds;
    // Scale factors from the network input size back to the source image
    float ratioh = (float)srcImage.rows / inputSize.height, ratiow = (float)srcImage.cols / inputSize.width;

    // Compute cx, cy, w, h, box_score, class_score
    int n = 0, rowInd = 0;
    float* pdata = (float*)outs[0].data;
    for (n = 0; n < numProposal; n++)
    {
        float boxScores = pdata[4];
        if (boxScores > yolov5Parameter.objectThreshold)
        {
            cv::Mat scores = outs[0].row(rowInd).colRange(5, numOut);
            cv::Point classIdPoint;
            double maxClassScore;
            cv::minMaxLoc(scores, 0, &maxClassScore, 0, &classIdPoint);
            maxClassScore *= boxScores;
            if (maxClassScore > yolov5Parameter.confidenceThreshold)
            {
                const int classIdx = classIdPoint.x;
                float cx = pdata[0] * ratiow;
                float cy = pdata[1] * ratioh;
                float w = pdata[2] * ratiow;
                float h = pdata[3] * ratioh;

                // Convert from center coordinates to the top-left corner
                int left = int(cx - 0.5 * w);
                int top = int(cy - 0.5 * h);

                confidences.push_back((float)maxClassScore);
                boxes.push_back(cv::Rect(left, top, (int)(w), (int)(h)));
                classIds.push_back(classIdx);
            }
        }
        rowInd++;
        pdata += numOut;
    }

    ...
}
```
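The two-stage filter can also be expressed without OpenCV. The following is a minimal sketch over one raw prediction row laid out as [cx, cy, w, h, objectness, class scores...]; `Candidate` and `filterRow` are hypothetical names, and the scaling back to the source image is left out for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Candidate
{
    int classId;
    float score;  // objectness multiplied by the best class score
};

// Applies the objectness and confidence thresholds to one prediction row.
// Returns true and fills `out` only when the row passes both stages.
inline bool filterRow(const std::vector<float> &row, float objectThreshold,
                      float confidenceThreshold, Candidate &out)
{
    assert(row.size() > 5);
    const float objectness = row[4];
    if (objectness <= objectThreshold)
    {
        return false;  // stage 1: no object in this candidate
    }
    const auto best = std::max_element(row.begin() + 5, row.end());
    const float score = (*best) * objectness;
    if (score <= confidenceThreshold)
    {
        return false;  // stage 2: combined score too low
    }
    out.classId = static_cast<int>(best - (row.begin() + 5));
    out.score = score;
    return true;
}
```

Checking objectness first means the class scores are only scanned for rows that might survive, which is the same ordering the loop above uses.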

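To remove overlapping boxes and produce the final YOLOV5 detection result, non-maximum suppression is applied to the anchors that survived filtering, and the detections are then saved to resultsOfDetection.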

```
ErrorCode DetectorYOLOV5::Detect(const cv::Mat &srcImage, std::vector<std::size_t> &relInputShape, std::vector<ResultOfDetection> &resultsOfDetection, bool dynamic)
{

    ...

    // Run non-maximum suppression to remove redundant overlapping boxes
    std::vector<int> indices;
    cv::dnn::NMSBoxes(boxes, confidences, yolov5Parameter.confidenceThreshold, yolov5Parameter.nmsThreshold, indices);
    for (size_t i = 0; i < indices.size(); ++i)
    {
        int idx = indices[i];
        int classID=classIds[idx];
        std::string className=classNames[classID];
        float confidence=confidences[idx];
        cv::Rect box = boxes[idx];

        // Save the coordinates, confidence score, and class ID of each final prediction
        ResultOfDetection result;
        result.boundingBox=box;
        result.confidence=confidence; // confidence
        result.classID=classID; // label
        result.className=className;
        resultsOfDetection.push_back(result);
    }

    ...
}
```
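For readers who want to see what non-maximum suppression does conceptually, here is a minimal greedy NMS sketch using only the standard library. It is an illustration under simplifying assumptions (no score threshold, no per-class handling) and not OpenCV's implementation of cv::dnn::NMSBoxes; `Box`, `iou`, and `greedyNms` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct Box
{
    float x, y, w, h;  // top-left corner, width, height
};

// Intersection over union of two axis-aligned boxes.
inline float iou(const Box &a, const Box &b)
{
    const float x1 = std::max(a.x, b.x);
    const float y1 = std::max(a.y, b.y);
    const float x2 = std::min(a.x + a.w, b.x + b.w);
    const float y2 = std::min(a.y + a.h, b.y + b.h);
    const float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Visits boxes in descending score order and keeps a box only if its IoU with
// every previously kept box is at most nmsThreshold. Returns kept indices.
inline std::vector<int> greedyNms(const std::vector<Box> &boxes,
                                  const std::vector<float> &scores,
                                  float nmsThreshold)
{
    assert(boxes.size() == scores.size());
    std::vector<int> order(boxes.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&scores](int a, int b) { return scores[a] > scores[b]; });

    std::vector<int> kept;
    for (int idx : order)
    {
        bool suppressed = false;
        for (int k : kept)
        {
            if (iou(boxes[idx], boxes[k]) > nmsThreshold)
            {
                suppressed = true;
                break;
            }
        }
        if (!suppressed)
        {
            kept.push_back(idx);
        }
    }
    return kept;
}
```

With an NMS threshold of 0.5, as in the configuration above, two boxes that overlap by roughly two thirds collapse into the higher-scoring one, while a distant box is kept regardless of its score.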