YOLOV7.md

# YOLOV7检测器

## 模型简介

YOLOV7是2022年最新出现的一种YOLO系列目标检测模型，在论文 [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors](https://arxiv.org/abs/2207.02696)中提出。

<img src="../Images/YOLOV7_02.png" style="zoom:67%;" />

本示例采用YOLOv7的官方源码：https://github.com/WongKinYiu/yolov7, 作者提供了多个预训练模型，本示例使用yolov7-tiny.pt预训练模型，将yolov7-tiny.pt预训练模型下载下来后，保存到Pytorch_YOLOV7工程的weights目录。

## 模型转换

官方提供的YOLOV7源码中包含导出onnx模型的程序，通过下面的步骤可以将yolov7-tiny.pt预训练模型转换成onnx格式：

```
# 进入Pytorch_YOLOV7工程根目录
cd <path_to_Pytorch_YOLOV7>

# 安装程序运行的依赖，torch、torchvision需要手动安装
pip install -r requirement.txt

# 转换模型
python export.py --weights ./weight/yolov7-tiny.pt --img-size 640 640 
```

注意：如果需要修改onnx模型的输入大小，可以调整--img-size参数，同时模型输入batch默认为1，若想修改可以通过添加--batch-size设置，程序运行结束后，在当前目录下会生成onnx格式的YOLOV7模型，并将该模型保存到了samples工程中的Resource/Models/Detector/YOLOV7目录中，可以用来MIGraphX加载推理。

## 检测器参数设置

samples工程中的Resource/Configuration.xml文件的DetectorYOLOV7节点表示YOLOV7检测器的参数，相关参数主要依据官方推理示例进行设置。其中包括模型存放路径、类别名称文件、检测类别数量、置信度阈值、非极大值抑制阈值和判断先验框是否有物体阈值。

- ModelPath：yolov7模型存放路径
- ClassNameFile：coco数据集类别文件存放路径
- UseFP16：是否使用FP16推理模式
- NumberOfClasses：检测类别数量
- ConfidenceThreshold：置信度阈值，用于判断anchor内的物体是否为正样本
- NMSThreshold：非极大值抑制阈值，用于消除重复框
- ObjectThreshold：用于判断anchor内部是否有物体

```
<ModelPath>"../Resource/Models/Detector/YOLOV7/yolov7-tiny.onnx"</ModelPath>
<ClassNameFile>"../Resource/Models/Detector/YOLOV7/coco.names"</ClassNameFile>
<UseFP16>0</UseFP16><!--是否使用FP16-->
<NumberOfClasses>80</NumberOfClasses><!--类别数(不包括背景类)，COCO:80,VOC:20-->
<ConfidenceThreshold>0.25</ConfidenceThreshold>
<NMSThreshold>0.5</NMSThreshold>
<ObjectThreshold>0.5</ObjectThreshold>
```

## 模型初始化

模型初始化首先通过parse_onnx()函数加载YOLOV7的onnx模型，并可以通过program的get_parameter_shapes()函数获取网络的输入属性。完成模型加载之后需要使用compile()方法编译模型，编译模式使用migraphx::gpu::target{}设为GPU模式，编译过程主要基于MIGraphX IR完成各种优化。同时如果需要使用低精度量化进行推理，可以使用quantize_fp16()函数实现。

```
ErrorCode DetectorYOLOV7::Initialize(InitializationParameterOfDetector initializationParameterOfDetector)
{
    ...
    
    //模型加载
    net = migraphx::parse_onnx(modelPath);
    LOG_INFO(logFile,"succeed to load model: %s\n",GetFileName(modelPath).c_str());

    // 获取模型输入属性
    std::pair<std::string, migraphx::shape> inputAttribute=*(net.get_parameter_shapes().begin());
    inputName=inputAttribute.first;
    inputShape=inputAttribute.second;
    inputSize=cv::Size(inputShape.lens()[3],inputShape.lens()[2]);// NCHW

    // 设置模型为GPU模式
    migraphx::target gpuTarget = migraphx::gpu::target{};

    // 量化    
    if(useFP16)
    {
        migraphx::quantize_fp16(net);
    }

    // 编译模型
    migraphx::compile_options options;
    options.device_id=0; // 设置GPU设备，默认为0号设备(>=1.2版本中支持)
    options.offload_copy=true; // 设置offload_copy
    net.compile(gpuTarget,options);
    LOG_INFO(logFile,"succeed to compile model: %s\n",GetFileName(modelPath).c_str());

    ...
}
```

## 模型推理

### 预处理

在将数据输入到模型之前，需要对图像做如下预处理操作：

- 转换数据排布为NCHW
- 归一化到[0.0, 1.0]
- 将输入数据的尺寸变换到YOLOV7输入大小（1，3，640，640）

```c++
ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
{
	...
        
    // 预处理并转换为NCHW
    cv::Mat inputBlob;
    blobFromImage(srcImage, //输入数据
                    inputBlob, //输出数据
                    1 / 255.0, //缩放系数，这里为1/255.0
                    inputSize, //YOLOV7输入尺寸(640,640)
                    Scalar(0, 0, 0), // 均值，这里不需要减均值，所以设置为0.0
                    true, //转换RB通道
                    false); 
    ...
}
```

### 前向推理

完成图像预处理以及yolov7目标检测相关参数设置之后开始执行推理，获取migraphx推理结果。

```
ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
{
	...
	
    // 输入数据
    migraphx::parameter_map inputData;
    inputData[inputName]= migraphx::argument{inputShape, (float*)inputBlob.data};

    // 推理
    std::vector<migraphx::argument> inferenceResults=net.eval(inputData);

    // 获取推理结果
    std::vector<cv::Mat> outs;
    migraphx::argument result = inferenceResults[0]; 

    // 转换为cv::Mat
    migraphx::shape outputShape = result.get_shape();
    int shape[]={outputShape.lens()[0],outputShape.lens()[1],outputShape.lens()[2]};
    cv::Mat out(4,shape,CV_32F);
    memcpy(out.data,result.data(),sizeof(float)*outputShape.elements());

    outs.push_back(out);

    ...

}
```

YOLOV7的MIGraphX推理结果inferenceResults是一个std::vector< migraphx::argument >类型，YOLOV7的onnx模型包含一个输出，所以result等于inferenceResults[0]，result包含三个维度：outputShape.lens()[0]=1表示batch信息，outputShape.lens()[1]=25200表示生成anchor数量，outputShape.lens()[2]=85表示对每个anchor的预测信息。同时可将85拆分为4+1+80，前4个参数用于判断每一个特征点的回归参数，回归参数调整后可以获得预测框，第5个参数用于判断每一个特征点是否包含物体，最后80个参数用于判断每一个特征点所包含的物体种类。获取上述信息之后进行anchors筛选，筛选过程分为两个步骤：

- 第一步根据objectThreshold阈值进行筛选，大于该阈值则判断当前anchor内包含物体，小于该阈值则判断无物体
- 第二步根据confidenceThreshold阈值进行筛选，当满足第一步阈值anchor的最大置信度得分maxClassScore大于该阈值，则进一步获取当前anchor内部的物体类别和坐标信息，小于该阈值则不做处理。

```
ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
{
	...
	
    //获取先验框的个数numProposal=25200
    int numProposal = outs[0].size[1];
    //每个anchor的预测信息数量numOut=85
    int numOut = outs[0].size[2];
    outs[0] = outs[0].reshape(0, numProposal);

    std::vector<float> confidence;
    std::vector<Rect> boxes
    std::vector<int> classIds
    //原图尺寸与模型输入尺寸的缩放比例
    float ratioh = (float)srcImage.rows / inputSize.height, ratiow = (float)srcImage.cols / inputSize.width;

   //计算cx,cy,w,h,box_sore,class_sore
    int n = 0, rowInd = 0;
    float* pdata = (float*)outs[0].data;
    for (n = 0; n < numProposal; n++)
    {
    	//获取是否包含物体的概率值
        float boxScores = pdata[4];
        
        //第一次筛选，判断anchor内是否包含物体
        if (boxScores > yolov7Parameter.objectThreshold)
        {
            //获取每个anchor内部预测的80个类别概率信息
            cv::Mat scores = outs[0].row(rowInd).colRange(5, numOut);
            cv::Point classIdPoint;
            double maxClassScore;
            
            //获取80个类别中最大概率值和对应的类别ID
            cv::minMaxLoc(scores, 0, &maxClassScore, 0, &classIdPoint);
            maxClassScore *= boxScores;
            
            //第二次筛选，判断当前anchor的最大置信度得分是否满足阈值
            if (maxClassScore > yolov7Parameter.confidenceThreshold)
            {
                const int classIdx = classIdPoint.x;
                
                //将每个anchor坐标按缩放比例映射到原图
                float cx = pdata[0] * ratiow;
                float cy = pdata[1] * ratioh;
                float w = pdata[2] * ratiow;
                float h = pdata[3] * ratioh;
                //获取anchor的左上角坐标
                int left = int(cx - 0.5 * w);
                int top = int(cy - 0.5 * h);

                confidences.push_back((float)maxClassScore);
                boxes.push_back(cv::Rect(left, top, (int)(w), (int)(h)));
                classIds.push_back(classIdx);
            }
        }
        rowInd++;
        pdata += numOut;
    }

    ...

}
```

为了消除重叠锚框，输出最终的YOLOV7目标检测结果，执行非极大值抑制对筛选之后的anchor进行处理，最后保存检测结果到resultsOfDetection中。

```
ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
{
	...

    //执行non maximum suppression消除冗余重叠boxes
    std::vector<int> indices;
    dnn::NMSBoxes(boxes, confidences, yolov7Parameter.confidenceThreshold, yolov7Parameter.nmsThreshold, indices);
    for (size_t i = 0; i < indices.size(); ++i)
    {
        int idx = indices[i];
        int classID=classIds[idx];
        string className=classNames[classID];
        float confidence=confidences[idx];
        cv::Rect box = boxes[idx];
		
        //保存每个最终预测anchor的坐标值、置信度分数、类别ID
        ResultOfDetection result;
        result.boundingBox=box;
        result.confidence=confidence;// confidence
        result.classID=classID; // label
        result.className=className;
        resultsOfDetection.push_back(result);
    }
    
    ...
    
}
```

## 运行示例

根据samples工程中的README.md构建成功C++ samples后，在build目录下输入如下命令运行该示例：

```
./MIGraphX_Samples 6
```

会在当前目录生成检测结果图像Result.jpg。

<img src="../Images/YOLOV7_03.jpg" alt="YOLOV7_03" style="zoom:67%;" />