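# YOLOV5 Detector

## Model Overview

YOLOV5 is a single-stage object detection algorithm. It builds on YOLOV4 and adds several refinements intended to improve both speed and accuracy: at the input, Mosaic data augmentation, adaptive anchor box computation, and adaptive image scaling; in the backbone, the Focus structure and the CSP structure; in the neck, the FPN+PAN structure; at the output, the GIOU_Loss loss function and DIOU_nms for filtering predicted boxes. The network structure is shown in the figure below.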

<img src="../Images/YOLOV5_01.jpg" alt="YOLOV5_01" style="zoom: 67%;" />
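The GIOU_Loss mentioned in the overview is based on generalized IoU (GIoU). As a minimal sketch, not taken from the YOLOV5 source, the two quantities can be computed as below. The function names `iou` and `giou` and the corner box format `(x1, y1, x2, y2)` are assumptions chosen for illustration. GIoU subtracts from IoU the fraction of the smallest enclosing box that is not covered by the union, and the loss is commonly taken as `1 - GIoU`.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def giou(a, b):
    """Generalized IoU: IoU minus the uncovered fraction of the enclosing box."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Smallest axis-aligned box that encloses both inputs
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    enclose = (ex2 - ex1) * (ey2 - ey1)
    if union <= 0 or enclose <= 0:
        return 0.0
    return inter / union - (enclose - union) / enclose


box_a = (0, 0, 2, 2)
box_b = (1, 1, 3, 3)
print('IoU  = {:.4f}'.format(iou(box_a, box_b)))
print('GIoU = {:.4f}'.format(giou(box_a, box_b)))
print('GIoU loss = {:.4f}'.format(1 - giou(box_a, box_b)))
```

Unlike IoU, GIoU stays informative for boxes that do not overlap: it becomes negative and decreases as the boxes move apart, which is why it can serve as a regression loss.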

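The official YOLOV5 source code is available at https://github.com/ultralytics/yolov5 and provides several variants, including YOLOV5n, YOLOV5s, YOLOV5m, and YOLOV5l. This example uses YOLOV5s to build a MIGraphX inference sample. Download the pretrained YOLOV5s model yolov5s.pt and save it in the weights directory of the Pytorch_YOLOV5 project.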

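## Environment Setup

Before running the YOLOV5 Python example, set up the environment: install torch and torchvision, then install the dependencies the program needs.

1. Install torch and torchvision.

2. Install the program's dependencies.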

```
# Enter the root directory of the Pytorch_YOLOV5 project
cd <path_to_Pytorch_YOLOV5>

# Install the program's dependencies
pip install -r requirement.txt
```

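## Model Conversion

The official YOLOV5 source code includes a script for exporting ONNX models. The following steps convert the pretrained yolov5s.pt model to ONNX format: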

```
# Enter the root directory of the Pytorch_YOLOV5 project
cd <path_to_Pytorch_YOLOV5>

# Convert the model
python export.py --weights ./weights/yolov5s.pt --imgsz 608 608 --include onnx
```

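Note: the export script in the official source code offers more features, such as exporting models with dynamic shapes. Add the relevant arguments as needed.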

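## Model Inference

### Image Preprocessing

Before an image is fed to the model for detection, it must be preprocessed, mainly by resizing and normalizing it. The steps, in the order the code applies them, are:

1. Convert the color order from BGR to RGB and resize the image to the model input size (608×608 here).
2. Convert the data layout from HWC to NCHW, giving an input of shape (1, 3, 608, 608).
3. Normalize the pixel values to [0.0, 1.0].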

```
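# Method of the YOLOv5 class; relies on `import cv2` and `import numpy as np`
def prepare_input(self, image):
    # Remember the original image size so boxes can be mapped back later
    self.img_height, self.img_width = image.shape[:2]
    # OpenCV loads images as BGR; convert to RGB
    input_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize the image to the model input size
    input_img = cv2.resize(input_img, (self.inputWidth, self.inputHeight))
    # Change the layout from HWC to CHW
    input_img = input_img.transpose(2, 0, 1)
    # Add the batch dimension: CHW -> NCHW
    input_img = np.expand_dims(input_img, 0)
    input_img = np.ascontiguousarray(input_img)
    input_img = input_img.astype(np.float32)
    # Normalize to [0.0, 1.0]
    input_img = input_img / 255

    return input_img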
```

The model's inputWidth and inputHeight are obtained by parsing the model with migraphx; see the initialization of the YOLOv5 class:

```
import migraphx


class YOLOv5:
    def __init__(self, path, obj_thres=0.5, conf_thres=0.25, iou_thres=0.5):
        self.objectThreshold = obj_thres
        self.confThreshold = conf_thres
        self.nmsThreshold = iou_thres

        # Load the class names the model can detect (one name per line)
        with open('./weights/coco.names', 'r') as f:
            self.classNames = [line.strip() for line in f]

        # Parse the ONNX model
        self.model = migraphx.parse_onnx(path)

        # Get the name of the model input
        self.inputName = self.model.get_parameter_names()[0]

        # Get the model input size; the shape is laid out as NCHW
        inputShape = self.model.get_parameter_shapes()[self.inputName].lens()
        self.inputHeight = int(inputShape[2])
        self.inputWidth = int(inputShape[3])
```
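The `cv2.resize` call in `prepare_input` stretches the image to the model input size, which changes its aspect ratio. The adaptive image scaling mentioned in the overview instead preserves the aspect ratio and pads the remainder. The following is a minimal sketch of that arithmetic only. The helper names `letterbox_params` and `unletterbox_point` are hypothetical and not part of the sample code, and adopting this approach would also require undoing the scale and padding when mapping boxes back to the original image.

```python
def letterbox_params(img_w, img_h, input_w, input_h):
    """Return (scale, new_w, new_h, pad_left, pad_top) for an aspect-preserving resize."""
    # Use the smaller ratio so the whole image fits inside the model input
    scale = min(input_w / img_w, input_h / img_h)
    new_w = int(round(img_w * scale))
    new_h = int(round(img_h * scale))
    # Split the leftover space evenly; any odd pixel goes to the right/bottom
    pad_left = (input_w - new_w) // 2
    pad_top = (input_h - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def unletterbox_point(x, y, scale, pad_left, pad_top):
    """Map a point from the padded model input back to the original image."""
    return (x - pad_left) / scale, (y - pad_top) / scale


# Example: an 810x1080 (width x height) image and a 608x608 model input
scale, new_w, new_h, pad_left, pad_top = letterbox_params(810, 1080, 608, 608)
print(new_w, new_h, pad_left, pad_top)

# The centre of the model input maps back to the centre of the original image
print(unletterbox_point(304, 304, scale, pad_left, pad_top))
```

With plain resizing, as in the sample, no padding bookkeeping is needed, which keeps `rescale_boxes` simple at the cost of distorting objects in non-square images.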

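### Inference

After preprocessing, inference begins. The model is first compiled with migraphx, then a forward pass on the input data produces the model output result. The detect function calls process_output to post-process result, yielding the box coordinates (referred to as anchors in this document), class confidence scores, and class IDs of the objects in the image.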

```
def detect(self, image):
    # Preprocess the input image
    input_img = self.prepare_input(image)

    # Compile the model (note: this runs on every call to detect)
    self.model.compile(t=migraphx.get_target("gpu"), device_id=0)  # device_id: GPU device to use, device 0 by default
    print("Success to compile")
    # Run inference
    print("Start to inference")
    start = time.time()
    result = self.model.run({self.model.get_parameter_names()[0]: migraphx.argument(input_img)})
    print('net forward time: {:.4f}'.format(time.time() - start))
    # Post-process the model output
    boxes, scores, class_ids = self.process_output(result)

    return boxes, scores, class_ids
```
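The timing in `detect` measures a single `run` call. A single measurement can be noisy, and the first call may include one-time setup cost, so averaging several calls after a warm-up is a common alternative. A minimal sketch, assuming a hypothetical helper `average_latency` and a stand-in workload in place of `self.model.run`:

```python
import time


def average_latency(fn, warmup=2, repeats=10):
    """Call fn() `warmup` times untimed, then return the mean seconds over `repeats` timed calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


# Stand-in workload; in the sample this would wrap the model.run(...) call
def fake_inference():
    return sum(i * i for i in range(10000))


mean_seconds = average_latency(fake_inference)
print('mean forward time: {:.6f}s'.format(mean_seconds))
```

`time.perf_counter` is used here instead of `time.time` because it is a monotonic, high-resolution clock intended for measuring short intervals.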

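Post-processing the MIGraphX output result first filters the generated anchors by the objectness threshold objectThreshold and the confidence threshold confThreshold; this is done in process_output. The coordinates of the remaining anchors then have to be mapped back to positions in the original image, which is done in rescale_boxes.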

```
def process_output(self, output):
    predictions = np.squeeze(output[0])

    # Keep only the anchors whose objectness exceeds the threshold
    obj_conf = predictions[:, 4]
    predictions = predictions[obj_conf > self.objectThreshold]
    obj_conf = obj_conf[obj_conf > self.objectThreshold]

    # Class score = class probability * objectness; keep anchors above the confidence threshold
    predictions[:, 5:] *= obj_conf[:, np.newaxis]
    scores = np.max(predictions[:, 5:], axis=1)
    valid_scores = scores > self.confThreshold
    predictions = predictions[valid_scores]
    scores = scores[valid_scores]

    # Get the class ID with the highest confidence score
    class_ids = np.argmax(predictions[:, 5:], axis=1)

    # Get the box of each object
    boxes = self.extract_boxes(predictions)

    # Apply non-maximum suppression to remove redundant boxes
    indices = cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), self.confThreshold, self.nmsThreshold)
    # NMSBoxes may return an empty tuple when nothing is kept, so normalize to a 1-D integer array
    indices = np.array(indices, dtype=int).reshape(-1)

    return boxes[indices], scores[indices], class_ids[indices]

def rescale_boxes(self, boxes):
    # Scale boxes from the model input size to the original image size
    input_shape = np.array([self.inputWidth, self.inputHeight, self.inputWidth, self.inputHeight])
    boxes = np.divide(boxes, input_shape, dtype=np.float32)
    boxes *= np.array([self.img_width, self.img_height, self.img_width, self.img_height])
    return boxes
```
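To make the filtering steps concrete, the following is a minimal NumPy-only sketch on toy data. It assumes each prediction row is laid out as `[cx, cy, w, h, objectness, class scores...]`, which matches the indexing used in `process_output`. Since `extract_boxes` is not shown in this document, `center_to_topleft` is a hypothetical stand-in for it, and `nms` is a plain greedy illustration of what `cv2.dnn.NMSBoxes` is used for, not its implementation. The 1216x912 image size is also made up for the example.

```python
import numpy as np


def center_to_topleft(xywh):
    """Convert (cx, cy, w, h) rows to (x, y, w, h) with (x, y) the top-left corner."""
    out = xywh.astype(np.float32).copy()
    out[:, 0] -= out[:, 2] / 2
    out[:, 1] -= out[:, 3] / 2
    return out


def nms(boxes, scores, iou_thres):
    """Greedy class-agnostic NMS on (x, y, w, h) boxes; returns kept indices, best score first."""
    x1, y1 = boxes[:, 0], boxes[:, 1]
    x2, y2 = x1 + boxes[:, 2], y1 + boxes[:, 3]
    areas = boxes[:, 2] * boxes[:, 3]
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        iw = np.clip(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]), 0, None)
        ih = np.clip(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]), 0, None)
        inter = iw * ih
        iou = inter / (areas[i] + areas[rest] - inter)
        # Drop every remaining box that overlaps the kept box too much
        order = rest[iou <= iou_thres]
    return keep


# Toy predictions with two classes: [cx, cy, w, h, objectness, cls0, cls1]
preds = np.array([
    [100, 100, 50, 50, 0.9, 0.8, 0.1],  # strong detection, class 0
    [102, 101, 50, 50, 0.8, 0.7, 0.2],  # near-duplicate of the first box
    [300, 200, 40, 80, 0.7, 0.1, 0.9],  # separate object, class 1
    [400, 400, 30, 30, 0.2, 0.9, 0.1],  # low objectness, removed in stage 1
    [500, 100, 30, 30, 0.6, 0.3, 0.3],  # 0.6 * 0.3 = 0.18, removed in stage 2
], dtype=np.float32)

obj_thres, conf_thres, iou_thres = 0.5, 0.25, 0.5

# Stage 1: objectness filter
preds = preds[preds[:, 4] > obj_thres]
# Stage 2: class score = objectness * class probability, then confidence filter
cls_scores = preds[:, 5:] * preds[:, 4:5]
scores = cls_scores.max(axis=1)
mask = scores > conf_thres
preds, cls_scores, scores = preds[mask], cls_scores[mask], scores[mask]
class_ids = cls_scores.argmax(axis=1)

boxes = center_to_topleft(preds[:, :4])
keep = nms(boxes, scores, iou_thres)

# Map from a 608x608 model input to a hypothetical 1216x912 (width x height) image
scale = np.array([1216 / 608, 912 / 608, 1216 / 608, 912 / 608], dtype=np.float32)
final_boxes = boxes[keep] * scale
print(keep, class_ids[keep], final_boxes)
```

Of the five toy rows, one is dropped by the objectness threshold, one by the confidence threshold, and one by NMS as a duplicate, leaving two detections.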

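The boxes, scores, and class_ids returned by detect are then used to visualize the results on the original image by drawing each detected object's position, class, and confidence score, giving the final YOLOV5 detection output.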

```
def draw_detections(self, image, boxes, scores, class_ids):
    for box, score, class_id in zip(boxes, scores, class_ids):
        # (cx, cy) is used as one corner of the rectangle below, with (cx + w, cy + h) as the opposite corner
        cx, cy, w, h = box.astype(int)

        # Draw the bounding box of the detected object
        cv2.rectangle(image, (cx, cy), (cx + w, cy + h), (0, 255, 255), thickness=2)
        # Draw the class name and confidence score above the box
        label = self.classNames[class_id]
        label = f'{label} {int(score * 100)}%'
        labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
        cv2.putText(image, label, (cx, cy - 10), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), thickness=2)
    return image
```

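## Running the Example

1. Install MIGraphX by following the installation instructions in the MIGraphX Tutorial (《MIGraphX教程》), and set PYTHONPATH.

2. Run the example.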

```
# Enter the root directory of the migraphx samples project
cd <path_to_migraphx_samples>

# Enter the example program directory
cd Python/Detector/YOLOV5

# Run the example
python detect_migraphx.py --imgpath ./data/images/bus.jpg --modelpath ./weights/yolov5s.onnx --objectThreshold 0.5 --confThreshold 0.25 --nmsThreshold 0.5
```
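The command above passes five flags. As a minimal sketch of how a script could parse them with `argparse` (an assumption for illustration: the actual parsing code in detect_migraphx.py is not shown in this document, and the defaults below simply mirror the values used in the command):

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the example command."""
    parser = argparse.ArgumentParser(description='YOLOV5 detection with MIGraphX (sketch)')
    parser.add_argument('--imgpath', type=str, default='./data/images/bus.jpg', help='input image path')
    parser.add_argument('--modelpath', type=str, default='./weights/yolov5s.onnx', help='ONNX model path')
    parser.add_argument('--objectThreshold', type=float, default=0.5, help='objectness threshold')
    parser.add_argument('--confThreshold', type=float, default=0.25, help='class confidence threshold')
    parser.add_argument('--nmsThreshold', type=float, default=0.5, help='IoU threshold for NMS')
    return parser


# Parse an explicit argument list so the sketch runs without a real command line
args = build_parser().parse_args(['--confThreshold', '0.3'])
print(args.imgpath, args.objectThreshold, args.confThreshold, args.nmsThreshold)
```

Raising `--confThreshold` keeps fewer, more confident detections, while lowering `--nmsThreshold` suppresses overlapping boxes more aggressively.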

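The input arguments can be changed as needed. When the program finishes, it writes the YOLOV5 detection result image to the current directory.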

<img src="../Images/YOLOV5_03.jpg" alt="YOLOV5_03" style="zoom:67%;" />