
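# YOLOV7 Detector

This example is a MIGraphX C++ inference tutorial for the YOLOV7 model. It walks through image preprocessing, inference, and postprocessing. Following the steps in this document to run YOLOV7 inference yields object detection results.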

## Model Overview

YOLOV7 is an object detection model in the YOLO series, released in 2022 and proposed in the paper [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors](https://arxiv.org/abs/2207.02696).

<img src="./YOLOV7_01.png" style="zoom:67%;" />

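## Model Initialization

Initialization uses MIGraphX to parse the input model and read the model's input attribute inputAttribute (input name and shape), then compiles the model for GPU inference.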

```c++
ErrorCode DetectorYOLOV7::Initialize(InitializationParameterOfDetector initializationParameterOfDetector)
{
    ...

    // Load the model
    net = migraphx::parse_onnx(modelPath);
    LOG_INFO(stdout,"succeed to load model: %s\n",GetFileName(modelPath).c_str());

    // Get the model input attribute (name and shape of the first input)
    std::unordered_map<std::string, migraphx::shape> inputMap=net.get_parameter_shapes();
    std::pair<std::string, migraphx::shape> inputAttribute=*(inputMap.begin());
    inputName=inputAttribute.first;
    inputShape=inputAttribute.second;
    int N=inputShape.lens()[0];
    int C=inputShape.lens()[1];
    int H=inputShape.lens()[2];
    int W=inputShape.lens()[3];
    inputSize=cv::Size(W,H);

    // Target the GPU
    migraphx::target gpuTarget = migraphx::gpu::target{};

    // Optional FP16 quantization
    if(useFP16)
    {
        migraphx::quantize_fp16(net);
    }

    // Compile the model
    migraphx::compile_options options;
    options.device_id=0; // GPU device to use, device 0 by default
    options.offload_copy=true; // enable offload_copy
    net.compile(gpuTarget,options);
    LOG_INFO(stdout,"succeed to compile model: %s\n",GetFileName(modelPath).c_str());

    ...
}
```

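## Preprocessing

Before the data is fed to the model, the image is preprocessed as follows:

- Convert the data layout to NCHW
- Normalize to [0.0, 1.0]
- Resize the input to the YOLOV7 input size (1, 3, 640, 640)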

```c++
ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
{
    ...

    // Preprocess and convert to NCHW
    cv::Mat inputBlob;
    cv::dnn::blobFromImage(srcImage, // input image
                    inputBlob, // output blob
                    1 / 255.0, // scale factor, 1/255.0 here
                    inputSize, // YOLOV7 input size (640,640)
                    cv::Scalar(0, 0, 0), // mean; no mean subtraction is needed, so it is 0.0
                    true, // swap the R and B channels
                    false); // do not crop
    ...
}
```
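
To make the layout conversion and normalization steps concrete, the following is a minimal sketch that uses only the C++ standard library. It is not how OpenCV implements blobFromImage. The helper name `HwcBgrToChwRgb` is hypothetical, the input is assumed to be an interleaved 8-bit BGR buffer, and resizing is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only.
// Converts an interleaved 8-bit BGR image (HWC layout) into a planar RGB
// float tensor (CHW layout, i.e. NCHW with N = 1) scaled to [0.0, 1.0].
std::vector<float> HwcBgrToChwRgb(const std::vector<std::uint8_t>& hwc,
                                  int height, int width)
{
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    assert(hwc.size() == plane * 3);

    std::vector<float> chw(plane * 3);
    for (std::size_t p = 0; p < plane; ++p)
    {
        for (int c = 0; c < 3; ++c)
        {
            // Source order is B,G,R; destination order is R,G,B.
            const int srcChannel = 2 - c;
            chw[c * plane + p] = hwc[p * 3 + srcChannel] / 255.0f;
        }
    }
    return chw;
}
```

For a 1x2 image whose first pixel is pure blue and second pixel is pure green, the result holds the R plane first (all zeros), then the G plane, then the B plane.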

## Inference

The preprocessed image is fed to the model to obtain the inference results inferenceResults, and the output is then converted from std::vector<migraphx::argument> to cv::Mat.

```c++
ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
{
    ...

    // Create the input data
    std::unordered_map<std::string, migraphx::argument> inputData;
    inputData[inputName]= migraphx::argument{inputShape, (float*)inputBlob.data};

    // Run inference
    std::vector<migraphx::argument> inferenceResults=net.eval(inputData);

    // Get the inference result
    std::vector<cv::Mat> outs;
    migraphx::argument result = inferenceResults[0];

    // Convert to a 3-dimensional cv::Mat
    migraphx::shape outputShape = result.get_shape();
    int shape[]={static_cast<int>(outputShape.lens()[0]),
                 static_cast<int>(outputShape.lens()[1]),
                 static_cast<int>(outputShape.lens()[2])};
    cv::Mat out(3,shape,CV_32F);
    memcpy(out.data,result.data(),sizeof(float)*outputShape.elements());

    outs.push_back(out);

    ...

}
```
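
The memcpy above treats the output as one packed block of floats. Assuming a packed row-major layout of shape (batch, proposals, values per proposal), the offset of any element can be sketched as follows. The helper name `FlatIndex` is hypothetical and is not part of MIGraphX or OpenCV.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration only.
// Offset of element (batch, proposal, field) in a packed row-major tensor
// of shape (batchCount, numProposal, numOut). Assumes no padding or strides.
std::size_t FlatIndex(std::size_t batch, std::size_t proposal, std::size_t field,
                      std::size_t numProposal, std::size_t numOut)
{
    assert(proposal < numProposal && field < numOut);
    return (batch * numProposal + proposal) * numOut + field;
}
```

With 25200 proposals of 85 values each, the objectness score of the first proposal sits at offset 4, and the second proposal starts at offset 85.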

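The raw MIGraphX output needs further processing to produce the YOLOV7 detections. The second dimension of the output result, outputShape.lens()[1], is the number of anchors that YOLOV7 generates for the image being processed. Postprocessing filters the anchors twice. First, the objectness score boxScores of each anchor is compared against objectThreshold, and anchors that do not exceed the threshold are discarded. For each remaining anchor, the highest class probability is multiplied by boxScores to give the anchor's confidence score. Second, anchors whose confidence score exceeds confidenceThreshold are kept, and their coordinates and predicted class are recorded. Finally, the predicted coordinates are mapped back to the original image using the scale factors ratioh and ratiow from preprocessing.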

```c++
ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
{
    ...

    // Number of proposals, numProposal=25200
    int numProposal = outs[0].size[1];
    // Number of predicted values per anchor, numOut=85
    int numOut = outs[0].size[2];
    outs[0] = outs[0].reshape(0, numProposal);

    std::vector<float> confidences;
    std::vector<cv::Rect> boxes;
    std::vector<int> classIds;
    // Scale factors between the original image and the model input size
    float ratioh = (float)srcImage.rows / inputSize.height, ratiow = (float)srcImage.cols / inputSize.width;

    // Each row holds cx, cy, w, h, box_score, class_scores
    int n = 0, rowInd = 0;
    float* pdata = (float*)outs[0].data;
    for (n = 0; n < numProposal; n++)
    {
        // Probability that the anchor contains an object
        float boxScores = pdata[4];

        // First filter: does the anchor contain an object?
        if (boxScores > yolov7Parameter.objectThreshold)
        {
            // The 80 class probabilities predicted for this anchor
            cv::Mat scores = outs[0].row(rowInd).colRange(5, numOut);
            cv::Point classIdPoint;
            double maxClassScore;

            // Highest of the 80 class probabilities and its class ID
            cv::minMaxLoc(scores, 0, &maxClassScore, 0, &classIdPoint);
            maxClassScore *= boxScores;

            // Second filter: does the confidence score exceed the threshold?
            if (maxClassScore > yolov7Parameter.confidenceThreshold)
            {
                const int classIdx = classIdPoint.x;

                // Map the anchor coordinates back to the original image
                float cx = pdata[0] * ratiow;
                float cy = pdata[1] * ratioh;
                float w = pdata[2] * ratiow;
                float h = pdata[3] * ratioh;
                // Top-left corner of the anchor
                int left = int(cx - 0.5 * w);
                int top = int(cy - 0.5 * h);

                confidences.push_back((float)maxClassScore);
                boxes.push_back(cv::Rect(left, top, (int)(w), (int)(h)));
                classIds.push_back(classIdx);
            }
        }
        rowInd++;
        pdata += numOut;
    }

    ...

}
```
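
The two-stage filtering can also be tried without MIGraphX or OpenCV. The following is a minimal sketch over a flat float buffer, assuming each row is laid out as cx, cy, w, h, objectness, then the class scores. The names `Candidate` and `FilterProposals` are hypothetical and do not come from the sample project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for illustration only.
struct Candidate
{
    int left, top, width, height;
    int classId;
    float score;
};

// Hypothetical helper: applies the objectness filter, then the
// confidence filter, and maps surviving boxes to original-image pixels.
std::vector<Candidate> FilterProposals(const std::vector<float>& data,
                                       int numProposal, int numOut,
                                       float objectThreshold,
                                       float confidenceThreshold,
                                       float ratioW, float ratioH)
{
    assert(numOut > 5);
    assert(data.size() == static_cast<std::size_t>(numProposal) * numOut);

    std::vector<Candidate> kept;
    for (int n = 0; n < numProposal; ++n)
    {
        const float* row = data.data() + static_cast<std::size_t>(n) * numOut;
        const float objectness = row[4];
        if (objectness <= objectThreshold) continue; // first filter

        // Find the class with the highest probability.
        int bestClass = 0;
        for (int k = 1; k < numOut - 5; ++k)
        {
            if (row[5 + k] > row[5 + bestClass]) bestClass = k;
        }
        const float score = row[5 + bestClass] * objectness;
        if (score <= confidenceThreshold) continue; // second filter

        // Scale the center-format box and convert it to top-left format.
        const float cx = row[0] * ratioW;
        const float cy = row[1] * ratioH;
        const float w = row[2] * ratioW;
        const float h = row[3] * ratioH;
        kept.push_back({static_cast<int>(cx - 0.5f * w),
                        static_cast<int>(cy - 0.5f * h),
                        static_cast<int>(w), static_cast<int>(h),
                        bestClass, score});
    }
    return kept;
}
```

A proposal with objectness 0.9 and best class probability 0.8 gets a confidence score of about 0.72, so it passes an objectness threshold of 0.5 and a confidence threshold of 0.25. A proposal with objectness 0.6 and best class probability 0.3 scores 0.18 and is dropped by the second filter.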

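To remove overlapping boxes and produce the final YOLOV7 detections, non-maximum suppression is applied to the anchors that passed the filters, and the results are saved to resultsOfDetection.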

```c++
ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
{
    ...

    // Run non-maximum suppression to remove redundant overlapping boxes
    std::vector<int> indices;
    cv::dnn::NMSBoxes(boxes, confidences, yolov7Parameter.confidenceThreshold, yolov7Parameter.nmsThreshold, indices);
    for (size_t i = 0; i < indices.size(); ++i)
    {
        int idx = indices[i];
        int classID=classIds[idx];
        std::string className=classNames[classID];
        float confidence=confidences[idx];
        cv::Rect box = boxes[idx];

        // Save the box, confidence score, and class ID of each final detection
        ResultOfDetection result;
        result.boundingBox=box;
        result.confidence=confidence; // confidence
        result.classID=classID; // label
        result.className=className;
        resultsOfDetection.push_back(result);
    }

    ...

}
```
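
To illustrate what non-maximum suppression does, the following is a minimal sketch of the classic greedy variant written with the standard library only. It is not OpenCV's implementation of NMSBoxes, and it omits the score threshold that NMSBoxes also applies. The names `Box`, `IoU`, and `GreedyNms` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical box type: top-left corner plus size, in pixels.
struct Box { int left, top, width, height; };

// Intersection over union of two axis-aligned boxes.
float IoU(const Box& a, const Box& b)
{
    const int x1 = std::max(a.left, b.left);
    const int y1 = std::max(a.top, b.top);
    const int x2 = std::min(a.left + a.width, b.left + b.width);
    const int y2 = std::min(a.top + a.height, b.top + b.height);
    const float inter = static_cast<float>(std::max(0, x2 - x1)) * std::max(0, y2 - y1);
    const float uni = static_cast<float>(a.width) * a.height
                    + static_cast<float>(b.width) * b.height - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: visit boxes from highest to lowest score and keep a box
// only if it does not overlap an already kept box by more than iouThreshold.
std::vector<int> GreedyNms(const std::vector<Box>& boxes,
                           const std::vector<float>& scores,
                           float iouThreshold)
{
    assert(boxes.size() == scores.size());
    std::vector<int> order(boxes.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return scores[a] > scores[b]; });

    std::vector<int> kept;
    for (int idx : order)
    {
        bool suppressed = false;
        for (int k : kept)
        {
            if (IoU(boxes[idx], boxes[k]) > iouThreshold) { suppressed = true; break; }
        }
        if (!suppressed) kept.push_back(idx);
    }
    return kept;
}
```

Two 10x10 boxes offset by one pixel in each direction have an IoU of 81/119, roughly 0.68, so with a threshold of 0.5 only the higher-scoring one survives, while a distant third box is kept.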