Update

Signed-off-by: lijian <lijian6@sugon.com>

Commit 503a4ba9 authored Oct 27, 2023 by lijian6
14 changed files
--- a/Doc/Tutorial_Cpp/RetinaFace.md
+++ b/Doc/Tutorial_Cpp/RetinaFace.md
+# RetinaFace Face Detector
+
+## Model Overview
+
+RetinaFace is a classic face detection model (https://arxiv.org/abs/1905.00641) built on an SSD-style architecture.
+
+![image-20221215140647406](../Images/RetinaFace_01.png)
+
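+This example uses the following open-source implementation: https://github.com/biubug6/Pytorch_Retinaface. The author provides two pretrained models, resnet50 and mobilenet0.25; this example uses mobilenet0.25. Download the mobilenet0.25 pretrained model and save it to the weights directory of the Pytorch_Retinaface project.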
+
+
+
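+## Model Conversion
+
+Before converting the mobilenet0.25 pretrained model to an ONNX file, this example makes the following changes to the author's Python code:
+
+### Modify models/retinaface.py
+
+1. **Change the ClassHead class to the following implementation**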
+
+   ```
+   class ClassHead(nn.Module):
+       def __init__(self,inchannels=512,num_anchors=3):
+           super(ClassHead,self).__init__()
+           self.num_anchors = num_anchors
+           self.conv1x1 = nn.Conv2d(inchannels,self.num_anchors*2,kernel_size=(1,1),stride=1,padding=0)
+   
+       def forward(self,x):
+           out = self.conv1x1(x)
+           
+           return out
+   ```
+
+   The C++ inference code in this example already performs the permute operation, so out.permute(0,2,3,1).contiguous() must be removed here.
+
+2. **Change the BboxHead class to the following implementation**
+
+   ```
+   class BboxHead(nn.Module):
+       def __init__(self,inchannels=512,num_anchors=3):
+           super(BboxHead,self).__init__()
+           self.conv1x1 = nn.Conv2d(inchannels,num_anchors*4,kernel_size=(1,1),stride=1,padding=0)
+   
+       def forward(self,x):
+           out = self.conv1x1(x)
+   
+           return out
+   ```
+
+   As with ClassHead, the permute operation must be removed.
+
+3. **Change the forward method of the RetinaFace class to the following implementation**
+
+   ```
+   def forward(self,inputs):
+           out = self.body(inputs)
+   
+           # FPN
+           fpn = self.fpn(out)
+   
+           # SSH
+           feature1 = self.ssh1(fpn[0])
+           feature2 = self.ssh2(fpn[1])
+           feature3 = self.ssh3(fpn[2])
+           features = [feature1, feature2, feature3]
+   
+           bbox_regressions = [self.BboxHead[i](feature) for i, feature in enumerate(features)]
+           classifications = [self.ClassHead[i](feature) for i, feature in enumerate(features)]
+          
+           output=(bbox_regressions[0],classifications[0],bbox_regressions[1],classifications[1],bbox_regressions[2],classifications[2])
+   
+           return output
+   ```
+
+   This example drops landmark detection, so the landmark branch is removed from forward. The torch.cat calls on bbox_regressions and classifications are removed as well, and output becomes (bbox_regressions[0],classifications[0],bbox_regressions[1],classifications[1],bbox_regressions[2],classifications[2]).
+
+
+
+### Modify data/config.py
+
+In cfg_mnet, change 'pretrain': True, to 'pretrain': False,
+
+
+
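+### Modify convert_to_onnx.py
+
+When exporting the ONNX model, the original output_names must change: either delete the output_names argument of torch.onnx._export() or name each output node explicitly. If output_names is deleted, the exporter assigns auto-generated names to the output nodes. This example deletes the output_names argument and also changes the ONNX file name output_onnx. The modified main block is: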
+
+```
+if __name__ == '__main__':
+    torch.set_grad_enabled(False)
+    cfg = None
+    if args.network == "mobile0.25":
+        cfg = cfg_mnet
+    elif args.network == "resnet50":
+        cfg = cfg_re50
+    # net and model
+    net = RetinaFace(cfg=cfg, phase = 'test')
+    net = load_model(net, args.trained_model, args.cpu)
+    net.eval()
+    print('Finished loading model!')
+    print(net)
+    device = torch.device("cpu" if args.cpu else "cuda")
+    net = net.to(device)
+
+    # ------------------------ export -----------------------------
+    output_onnx = 'mobilenet0.25_Final.onnx'
+    print("==> Exporting model to ONNX format at '{}'".format(output_onnx))
+    input_names = ["input0"]
+    output_names = ["output0"]
+    inputs = torch.randn(1, 3, args.long_side, args.long_side).to(device)
+
+    torch_out = torch.onnx._export(net, inputs, output_onnx, export_params=True, verbose=False,
+                                   input_names=input_names)
+```
+
+Note: to change the model's input size, adjust the args.long_side parameter; the default gives a 640x640 input.
+
+
+
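+With these changes in place, run python convert_to_onnx.py to convert the model. On success, mobilenet0.25_Final.onnx is generated in the current directory and is ready for inference. This example stores the modified project in the Resource/Models/Detector/RetinaFace directory of the samples project; running python convert_to_onnx.py in its Pytorch_Retinaface directory generates the ONNX file directly.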
+
+
+
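+## Detector Parameter Settings
+
+The DetectorRetinaFace node in Resource/Configuration.xml of the samples project holds the RetinaFace detector parameters. They are derived from cfg_mnet in data/config.py of the Pytorch_Retinaface project; the steps below show how.
+
+2. **Set the anchor sizes**
+   min_sizes in cfg_mnet gives the anchor sizes of each priorbox layer. The model has 3 priorbox layers: the first uses anchor sizes 16 and 32, the second 64 and 128, and the third 256 and 512. Note: **the order of priorbox layers in Configuration.xml must match the order of the output nodes in the ONNX file**. Viewing the model in netron (https://netron.app/) shows that nodes 467 and 470 are output first; they belong to the detection layer with the largest feature map, so their anchor sizes are 16 and 32. Nodes 469 and 472 are output last; they belong to the detection layer with the smallest feature map, so their anchor sizes are 256 and 512.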
+
+   ![image-20221215153957174](../Images/RetinaFace_02.png)
+   
+   
+   The corresponding settings in Configuration.xml are therefore:
+   
+   ```
+   <!--number of priorbox layers-->
+   <PriorBoxLayerNumber>3</PriorBoxLayerNumber>
+   
+   <!--min_size values of each priorbox layer-->
+   <MinSize11>16</MinSize11>
+   <MinSize12>32</MinSize12>
+   <MinSize21>64</MinSize21>
+   <MinSize22>128</MinSize22>
+   <MinSize31>256</MinSize31>
+   <MinSize32>512</MinSize32>
+   ```
+   
+3. **Set Flip and Clip**
+   clip is False in cfg_mnet, so the corresponding Clip parameters in Configuration.xml are set to 0. Because each anchor has a single aspect ratio of 1, Flip is also set to 0.
+
+   ```
+   <Flip1>0</Flip1>
+   <Flip2>0</Flip2>
+   <Flip3>0</Flip3>
+   
+   <Clip1>0</Clip1>
+   <Clip2>0</Clip2>
+   <Clip3>0</Clip3>
+   ```
+
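+4. **Set the anchor aspect ratios**
+   RetinaFace only uses anchors with an aspect ratio of 1, so no aspect ratio needs to be set.
+
+5. **Set the step of each priorbox layer**
+   steps in cfg_mnet gives the step (stride) of each priorbox layer, so the three priorbox layers use steps 8, 16, and 32. The corresponding Configuration.xml settings are: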
+
+   ```
+   <!--step of each priorbox layer-->
+   <PriorBoxStepWidth1>8</PriorBoxStepWidth1><!--step width of the first priorbox layer-->
+   <PriorBoxStepWidth2>16</PriorBoxStepWidth2>
+   <PriorBoxStepWidth3>32</PriorBoxStepWidth3>
+   
+   <PriorBoxStepHeight1>8</PriorBoxStepHeight1><!--step height of the first priorbox layer-->
+   <PriorBoxStepHeight2>16</PriorBoxStepHeight2>
+   <PriorBoxStepHeight3>32</PriorBoxStepHeight3>
+   ```
+
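+6. **Set the DetectionOutput layer parameters**
+
+   The model is a face detector with only two classes (background and face), so ClassNumber is 2. The other DetectionOutput parameters can be tuned as needed; this example uses: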
+
+   ```
+   <TopK>400</TopK>
+   <KeepTopK>200</KeepTopK>
+   <NMSThreshold>0.3</NMSThreshold>
+   <ConfidenceThreshold>0.9</ConfidenceThreshold>
+   ```
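+
+The settings above (min sizes, steps, aspect ratio 1) are enough to enumerate the prior boxes. The following is a minimal Python sketch, assuming the same layout as the PriorBox class in Pytorch_Retinaface and a 640x640 input. The function name generate_priors is hypothetical and is not part of the samples project.
+
+```python
+import math
+from itertools import product
+
+
+def generate_priors(image_size, min_sizes, steps):
+    """Return a list of (cx, cy, w, h) priors normalized to [0, 1]."""
+    height, width = image_size
+    priors = []
+    for sizes, step in zip(min_sizes, steps):
+        # Feature map size of this detection layer.
+        fm_h = math.ceil(height / step)
+        fm_w = math.ceil(width / step)
+        for i, j in product(range(fm_h), range(fm_w)):
+            for size in sizes:
+                # Anchor centered on the cell, aspect ratio 1.
+                cx = (j + 0.5) * step / width
+                cy = (i + 0.5) * step / height
+                priors.append((cx, cy, size / width, size / height))
+    return priors
+
+
+priors = generate_priors((640, 640), [[16, 32], [64, 128], [256, 512]], [8, 16, 32])
+print(len(priors))  # 2 * (80*80 + 40*40 + 20*20) = 16800
+```
+
+For the default 640x640 input this gives 80x80, 40x40, and 20x20 feature maps with 2 anchors per cell, 16800 priors in total.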
+
+
+
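+## Preprocessing
+
+Before an image is fed to the model, it must be preprocessed as follows:
+
+1. Subtract the mean. RetinaFace subtracts the mean during training (line 38 of train.py); note that the mean values are in BGR order.
+2. Convert the data layout to NCHW.
+
+
+
+This example implements the preprocessing with OpenCV's cv::dnn::blobFromImage() function: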
+
+```
+ErrorCode DetectorRetinaFace::Detect(const cv::Mat &srcImage,std::vector<ResultOfDetection> &resultsOfDetection)
+{
+	...
+
+    // Preprocess and convert to NCHW
+    cv::Mat inputBlob;
+    blobFromImage(srcImage,   // input image
+                    inputBlob, // output blob
+                    scale, // 1
+                    inputSize, // SSD input size, 640x480 in this example
+                    meanValue,// (104,117,123)
+                    swapRB, // false
+                    false);
+    
+    ...
+ }
+```
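+
+To make the two preprocessing steps concrete, here is a minimal pure-Python sketch of what the call above computes for a single image, leaving out the resize. It assumes the image is a nested list in HWC order with BGR pixels; the helper name preprocess_bgr is hypothetical.
+
+```python
+def preprocess_bgr(image_hwc, mean_bgr=(104.0, 117.0, 123.0), scale=1.0):
+    """Subtract the per-channel mean, scale, and reorder HWC -> CHW."""
+    height = len(image_hwc)
+    width = len(image_hwc[0])
+    channels = len(mean_bgr)
+    return [
+        [[scale * (image_hwc[y][x][c] - mean_bgr[c]) for x in range(width)]
+         for y in range(height)]
+        for c in range(channels)
+    ]
+
+
+# A 1x2 BGR image: two pixels, three channels each.
+image = [[(104, 117, 123), (114, 137, 153)]]
+blob = preprocess_bgr(image)
+print(blob)  # [[[0.0, 10.0]], [[0.0, 20.0]], [[0.0, 30.0]]]
+```
+
+The result is indexed as blob[channel][row][column]; adding a leading batch dimension of 1 gives the NCHW layout the model expects.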
+
+
+
+## Inference
+
+Once the model is converted and the detector parameters are set, inference can be run.
+
+```
+ErrorCode DetectorRetinaFace::Detect(const cv::Mat &srcImage,std::vector<ResultOfDetection> &resultsOfDetection)
+{
+
+    ...
+ 
+    // Input data
+    migraphx::parameter_map inputData;
+    inputData[inputName]= migraphx::argument{inputShape, (float*)inputBlob.data};
+
+    // Run inference
+    std::vector<migraphx::argument> inferenceResults=net.eval(inputData);
+    vector<vector<float>> regressions;
+    vector<vector<float>> classifications;
+    for(int i=0;i<ssdParameter.numberOfPriorBoxLayer;++i) // apply the Permute operation
+    {
+        int numberOfPriorBox=ssdParameter.detectInputChn[i]/(4*(ssdParameter.priorBoxHeight[i] * ssdParameter.priorBoxWidth[i]));
+
+        // BboxHead
+        std::vector<float> regression;
+        migraphx::argument result0  = inferenceResults[2*i]; 
+        result0.visit([&](auto output) { regression.assign(output.begin(), output.end()); });
+        regression=PermuteLayer(regression,ssdParameter.priorBoxWidth[i],ssdParameter.priorBoxHeight[i],numberOfPriorBox*4);
+        regressions.push_back(regression);
+        
+        // ClassHead
+        std::vector<float> classification;
+        migraphx::argument result1  = inferenceResults[2*i+1]; 
+        result1.visit([&](auto output) { classification.assign(output.begin(), output.end()); });
+        classification=PermuteLayer(classification,ssdParameter.priorBoxWidth[i],ssdParameter.priorBoxHeight[i],numberOfPriorBox*ssdParameter.classNum);
+        classifications.push_back(classification);
+    }
+
+    // Post-process the inference results to get the SSD detections
+    GetResult(classifications,regressions,resultsOfDetection);
+
+    // Map back to original image coordinates
+    for(int i=0;i<resultsOfDetection.size();++i)
+    {
+        float ratioOfWidth=(1.0*srcImage.cols)/inputSize.width;
+        float ratioOfHeight=(1.0*srcImage.rows)/inputSize.height;
+
+        resultsOfDetection[i].boundingBox.x*=ratioOfWidth;
+        resultsOfDetection[i].boundingBox.width*=ratioOfWidth;
+        resultsOfDetection[i].boundingBox.y*=ratioOfHeight;
+        resultsOfDetection[i].boundingBox.height*=ratioOfHeight;
+    }
+
+    // Sort by confidence
+    sort(resultsOfDetection.begin(), resultsOfDetection.end(),CompareConfidence);
+
+    return SUCCESS;
+
+}
+```
+
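+1. net.eval(inputData) returns the inference results in the same order as the ONNX outputs; the output node order can be checked with netron. inferenceResults[2 * i] is the BboxHead output of each detection layer, and inferenceResults[2 * i + 1] is the ClassHead output of each detection layer.
+
+1. After PermuteLayer, the data from all detection layers is passed to GetResult() to produce the detections. These are still in model input coordinates; they must be mapped back to the original image coordinates to obtain the final detection results.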
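+
+The permute step that was moved out of the PyTorch heads reorders each output from CHW to HWC, so that all values belonging to one spatial position sit next to each other. The sketch below shows that reordering on a flat buffer in Python. It is an illustration only: the function permute_chw_to_hwc is hypothetical, and the PermuteLayer implementation in the samples project is not shown in this document.
+
+```python
+def permute_chw_to_hwc(data, width, height, channels):
+    """Reorder a flat CHW buffer into a flat HWC buffer."""
+    assert len(data) == channels * height * width
+    out = [0.0] * len(data)
+    for c in range(channels):
+        for y in range(height):
+            for x in range(width):
+                out[(y * width + x) * channels + c] = data[(c * height + y) * width + x]
+    return out
+
+
+# 2 channels, 1 row, 3 columns: channel 0 is [1, 2, 3], channel 1 is [10, 20, 30].
+flat_chw = [1, 2, 3, 10, 20, 30]
+print(permute_chw_to_hwc(flat_chw, width=3, height=1, channels=2))
+# [1, 10, 2, 20, 3, 30]
+```
+
+After the permute, every consecutive group of 4 values in a BboxHead output (or 2 values in a ClassHead output) describes one anchor.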
+
+
+
+## Running the Example
+
+After building the C++ samples following README.md in the samples project, run the example from the build directory with:
+
+```
+./MIGraphX_Samples 2
+```
+
+The detection result image Result.jpg is generated in the current directory.
+
+![image-20221215164140724](../Images/RetinaFace_03.png)
\ No newline at end of file
--- a/Doc/Tutorial_Cpp/YOLOV3.md
+++ b/Doc/Tutorial_Cpp/YOLOV3.md
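+# YOLOV3 Detector
+
+## Model Overview
+
+YOLOV3 is a single-stage detection model proposed by Joseph Redmon and Ali Farhadi in the paper "YOLOv3: An Incremental Improvement". The algorithm first extracts features from the input with a feature extraction network; the backbone evolves from the Darknet19 of YOLOV2 to the deeper Darknet53 and adopts the cross-layer addition (residual connection) from ResNet. It then combines features from different convolutional layers for multi-scale prediction at three resolutions, 13x13, 26x26, and 52x52, which predict large, medium, and small objects respectively. At each resolution the feature map divides the input image into a grid, and each cell predicts B bounding boxes. Each bounding box prediction consists of Location (x, y, w, h), a Confidence Score, and the probabilities of C classes, so the YOLOv3 output layer has B*(5 + C) channels. The YOLOv3 loss likewise has three parts: Location error, Confidence error, and classification error.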
+
+<img src="../Images/YOLOV3_02.jpg" alt="YOLOV3_02" style="zoom:;" />
+
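+This example uses the following open-source implementation: https://github.com/ultralytics/yolov3. Release V9.6.0 provides several pretrained YOLOV3 models: yolov3, yolov3-fixed, yolov3-spp, and yolov3-tiny. This example builds MIGraphX inference on the yolov3-tiny.pt pretrained model; download yolov3-tiny.pt and save it to the weights directory of the Pytorch_YOLOV3 project.
+
+## Model Conversion
+
+The official YOLOV3 source includes a script for exporting ONNX models. The following steps convert yolov3-tiny.pt to ONNX format: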
+
+```
+# Enter the root directory of the Pytorch_YOLOV3 project
+cd <path_to_Pytorch_YOLOV3>
+
+# Set up the environment; torch and torchvision must be installed manually
+pip install -r requirements.txt
+
+# Export the ONNX model
+python export.py --weights yolov3.pt --imgsz 416 416 --include onnx
+```
+
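+Note: the official export script supports additional features, such as exporting models with dynamic shapes; add the relevant arguments as needed.
+
+## Detector Parameter Settings
+
+The DetectorYOLOV3 node in Resource/Configuration.xml of the samples project holds the YOLOV3 detector parameters, which are set mainly according to the official inference example. The parameters are:
+
+- ModelPath: path to the yolov3 model
+- ClassNameFile: path to the COCO class name file
+- UseFP16: whether to use FP16 inference
+- NumberOfClasses: number of detection classes
+- ConfidenceThreshold: confidence threshold, used to decide whether the object in an anchor is a positive sample
+- NMSThreshold: non-maximum suppression threshold, used to remove duplicate boxes
+- ObjectThreshold: threshold used to decide whether an anchor contains an object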
+
+```
+<ModelPath>"../Resource/Models/Detector/YOLOV3/yolov3-tiny.onnx"</ModelPath>
+<ClassNameFile>"../Resource/Models/Detector/YOLOV3/coco.names"</ClassNameFile>
+<UseFP16>0</UseFP16><!--whether to use FP16-->
+<NumberOfClasses>80</NumberOfClasses><!--number of classes (excluding background); COCO:80, VOC:20-->
+<ConfidenceThreshold>0.2</ConfidenceThreshold>
+<NMSThreshold>0.4</NMSThreshold>
+<ObjectThreshold>0.4</ObjectThreshold>
+```
+
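+## Model Initialization
+
+Model initialization first loads the YOLOV3 ONNX model with parse_onnx(); the input attributes of the network can then be read with the program's get_parameter_shapes() function. After loading, the model must be compiled with compile(), using migraphx::gpu::target{} to select GPU mode; compilation applies various optimizations on the MIGraphX IR. For reduced-precision inference, quantize_fp16() can be called before compiling.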
+
+```
+ErrorCode DetectorYOLOV3::Initialize(InitializationParameterOfDetector initializationParameterOfDetector)
+{
+    ...
+    
+    // Load the model
+    net = migraphx::parse_onnx(modelPath);
+    LOG_INFO(logFile,"succeed to load model: %s\n",GetFileName(modelPath).c_str());
+
+    // Get the model input attributes
+    std::pair<std::string, migraphx::shape> inputAttribute=*(net.get_parameter_shapes().begin());
+    inputName=inputAttribute.first;
+    inputShape=inputAttribute.second;
+    inputSize=cv::Size(inputShape.lens()[3],inputShape.lens()[2]);// NCHW
+
+    // Target the GPU
+    migraphx::target gpuTarget = migraphx::gpu::target{};
+
+    // Quantize to FP16 if requested
+    if(useFP16)
+    {
+        migraphx::quantize_fp16(net);
+    }
+
+    // Compile the model
+    migraphx::compile_options options;
+    options.device_id=0; // select the GPU device, device 0 by default (supported in versions >= 1.2)
+    options.offload_copy=true; // enable offload_copy
+    net.compile(gpuTarget,options);
+    LOG_INFO(logFile,"succeed to compile model: %s\n",GetFileName(modelPath).c_str());
+
+    ...
+}
+```
+
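+## Model Inference
+
+### Preprocessing
+
+Before an image is fed to the model, it must be preprocessed as follows:
+
+- Convert the data layout to NCHW
+- Normalize to [0.0, 1.0]
+- Resize the input to the YOLOV3 input size (1, 3, 416, 416)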
+
+```
+ErrorCode DetectorYOLOV3::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+   ...
+
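+    // Preprocess and convert to NCHW
+    cv::Mat inputBlob;
+    blobFromImage(srcImage,   // input image
+                    inputBlob,  // output blob
+                    1 / 255.0,  // scale factor: normalize to [0, 1]
+                    inputSize,  // YOLOV3 input size, 416x416 in this example
+                    Scalar(0, 0, 0),  // no mean subtraction
+                    true,  // swap the R and B channels
+                    false);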
+                    
+    ...
+}
+```
+
+### Forward Inference
+
+After image preprocessing and the YOLOV3 detection parameters are set, inference is run with migraphx to obtain the YOLOV3 model output.
+
+```
+ErrorCode DetectorYOLOV3::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+
+	...
+    // Input data
+    migraphx::parameter_map inputData;
+    inputData[inputName]= migraphx::argument{inputShape, (float*)inputBlob.data};
+
+    // Run inference
+    std::vector<migraphx::argument> inferenceResults = net.eval(inputData);
+
+    // Get the inference result
+    std::vector<cv::Mat> outs;
+    migraphx::argument result = inferenceResults[0];
+
+    // Convert to cv::Mat
+    migraphx::shape outputShape = result.get_shape();
+    int shape[]={outputShape.lens()[0],outputShape.lens()[1],outputShape.lens()[2]};
+    cv::Mat out(3,shape,CV_32F);
+    memcpy(out.data,result.data(),sizeof(float)*outputShape.elements());
+    outs.push_back(out);
+    
+    ...
+}
+```
+
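+The MIGraphX inference result inferenceResults is a std::vector< migraphx::argument >. The YOLOV3 ONNX model has a single output, so result is inferenceResults[0]. result has three dimensions: outputShape.lens()[0]=1 is the batch size, outputShape.lens()[1]=10647 is the number of generated anchors, and outputShape.lens()[2]=85 is the prediction data for each anchor. The 85 values split into 4+1+80: the first 4 are the box regression parameters of each feature point, from which the predicted box is obtained; the 5th indicates whether the feature point contains an object; and the last 80 give the class scores of the object at that feature point. Anchors are then filtered in two steps:
+
+- Step 1 filters by objectThreshold: an anchor whose objectness score exceeds the threshold is treated as containing an object; otherwise it is treated as empty.
+
+- Step 2 filters by confidenceThreshold: if an anchor that passed step 1 has a maximum confidence score maxClassScore above the threshold, its coordinates and predicted class are extracted; otherwise it is discarded.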
+
+```
+ErrorCode DetectorYOLOV3::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+    ...
+    
+    // Number of anchors, numProposal=10647
+    int numProposal = outs[0].size[1];
+    // Number of values predicted per anchor, numOut=85
+    int numOut = outs[0].size[2];
+    // Reshape the output to numProposal rows
+    outs[0] = outs[0].reshape(0, numProposal);
+
+    // Containers for the candidates that pass filtering
+    std::vector<float> confidences;
+    std::vector<cv::Rect> boxes;
+    std::vector<int> classIds;
+    // Scale ratios between the original image and the model input
+    float ratioh = (float)srcImage.rows / inputSize.height, ratiow = (float)srcImage.cols / inputSize.width;
+
+    // Compute cx,cy,w,h,box_score,class_score
+    int n = 0, rowInd = 0;
+    float* pdata = (float*)outs[0].data;
+    for (n = 0; n < numProposal; n++)
+    {
+        // Probability that the current anchor contains an object
+        float boxScores = pdata[4];
+        // First filter: does the anchor contain an object?
+        if (boxScores > yolov3Parameter.objectThreshold)
+        {
+            // The 80 class probabilities predicted for this anchor
+            cv::Mat scores = outs[0].row(rowInd).colRange(5, numOut);
+            cv::Point classIdPoint;
+            double maxClassScore;
+            // Highest class probability and the corresponding class ID
+            cv::minMaxLoc(scores, 0, &maxClassScore, 0, &classIdPoint);
+            maxClassScore *= boxScores;
+            // Second filter: does the anchor's best confidence score reach the threshold?
+            if (maxClassScore > yolov3Parameter.confidenceThreshold)
+            {
+                const int classIdx = classIdPoint.x;
+                // Map the anchor coordinates to the original image using the scale ratios
+                float cx = pdata[0] * ratiow;
+                float cy = pdata[1] * ratioh;
+                float w = pdata[2] * ratiow;
+                float h = pdata[3] * ratioh;
+                // Top-left corner of the box
+                int left = int(cx - 0.5 * w);
+                int top = int(cy - 0.5 * h);
+
+                confidences.push_back((float)maxClassScore);
+                boxes.push_back(cv::Rect(left, top, (int)(w), (int)(h)));
+                classIds.push_back(classIdx);
+            }
+        }
+        rowInd++;
+        pdata += numOut;
+    }
+
+    ...
+}
+```
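+
+The output shape (1, 10647, 85) described above can be checked with a little arithmetic. The sketch below is in Python and assumes three detection scales with strides 8, 16, and 32 and 3 anchors per grid cell, which matches the 52x52, 26x26, and 13x13 feature maps mentioned in the model overview; the function name count_yolo_predictions is hypothetical.
+
+```python
+def count_yolo_predictions(input_size, strides=(8, 16, 32), anchors_per_cell=3):
+    """Total number of predicted boxes for a square input."""
+    return sum(anchors_per_cell * (input_size // s) ** 2 for s in strides)
+
+
+num_classes = 80
+# 4 box values + 1 objectness score + one score per class.
+values_per_prediction = 4 + 1 + num_classes
+
+print(count_yolo_predictions(416), values_per_prediction)  # 10647 85
+```
+
+Under the same assumptions, a different input size only changes the first number: the count scales with the number of grid cells, while the 85 values per prediction depend only on the class count.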
+
+To remove overlapping boxes and produce the final YOLOV3 detections, non-maximum suppression is applied to the filtered anchors, and the results are saved to resultsOfDetection.
+
+```
+ErrorCode DetectorYOLOV3::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{    
+    ...
+    // Run non-maximum suppression to remove redundant overlapping boxes
+    std::vector<int> indices;
+    dnn::NMSBoxes(boxes, confidences, yolov3Parameter.confidenceThreshold, yolov3Parameter.nmsThreshold, indices);
+    for (size_t i = 0; i < indices.size(); ++i)
+    {
+        int idx = indices[i];
+        int classID=classIds[idx];
+        string className=classNames[classID];
+        float confidence=confidences[idx];
+        cv::Rect box = boxes[idx];
+        // Save the coordinates, confidence score, and class ID of each final box
+        ResultOfDetection result;
+        result.boundingBox=box;
+        result.confidence=confidence;// confidence
+        result.classID=classID; // label
+        result.className=className;
+        resultsOfDetection.push_back(result);
+    }
+    ...
+}
+```
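+
+For readers unfamiliar with non-maximum suppression, the following Python sketch shows the greedy procedure in its simplest form: drop low-scoring boxes, then keep boxes in descending score order unless they overlap an already kept box too much. It is a simplified illustration with hypothetical helper names (iou, nms), not the OpenCV implementation; boxes are (left, top, width, height) tuples as in the C++ code.
+
+```python
+def iou(a, b):
+    """Intersection over union of two (left, top, width, height) boxes."""
+    ax, ay, aw, ah = a
+    bx, by, bw, bh = b
+    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
+    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
+    inter = inter_w * inter_h
+    union = aw * ah + bw * bh - inter
+    return inter / union if union > 0 else 0.0
+
+
+def nms(boxes, scores, score_threshold, nms_threshold):
+    """Return the indices of the boxes kept by greedy NMS."""
+    order = sorted(
+        (i for i, s in enumerate(scores) if s > score_threshold),
+        key=lambda i: scores[i],
+        reverse=True,
+    )
+    keep = []
+    for i in order:
+        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in keep):
+            keep.append(i)
+    return keep
+
+
+boxes = [(10, 10, 100, 100), (12, 12, 100, 100), (200, 200, 50, 50)]
+scores = [0.9, 0.8, 0.7]
+print(nms(boxes, scores, 0.2, 0.4))  # [0, 2]
+```
+
+The second box overlaps the first almost completely and is suppressed, while the third box does not overlap anything and is kept.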
+
+## Running the Example
+
+After building the C++ samples following README.md in the samples project, run the example from the build directory with:
+
+```
+./MIGraphX_Samples 4
+```
+
+The detection result image Result.jpg is generated in the current directory.
+
+![YOLOV3_01](../Images/YOLOV3_01.jpg)
--- a/Doc/Tutorial_Cpp/YOLOV5.md
+++ b/Doc/Tutorial_Cpp/YOLOV5.md
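+# YOLOV5 Detector
+
+## Model Overview
+
+YOLOV5 is a single-stage object detection algorithm that adds a number of improvements on top of YOLOV4, raising both speed and accuracy. These include: Mosaic data augmentation, adaptive anchor computation, and adaptive image scaling at the input; the Focus and CSP structures in the backbone; the FPN+PAN structure in the neck; and the GIOU_Loss loss function and DIOU_nms box filtering at the output. The network structure is shown in the figure.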
+
+![YOLOV5_01](../Images/YOLOV5_01.jpg)
+
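+The official YOLOV5 source is at https://github.com/ultralytics/yolov5 and includes several variants such as YOLOV5n, YOLOV5s, YOLOV5m, and YOLOV5l. This example builds the MIGraphX inference sample on YOLOV5s; download the pretrained model yolov5s.pt and save it to the weights directory of the Pytorch_YOLOV5 project.
+
+## Model Conversion
+
+The official YOLOV5 source includes a script for exporting ONNX models. The following steps convert the yolov5s.pt pretrained model to ONNX format: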
+
+```
+# Enter the root directory of the Pytorch_YOLOV5 project
+cd <path_to_Pytorch_YOLOV5>
+
+# Set up the environment; torch and torchvision must be installed manually
+pip install -r requirements.txt
+
+# Convert the model
+python export.py --weights ./weights/yolov5s.pt --imgsz 608 608 --include onnx
+```
+
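+Note: the official export script supports additional features, such as exporting models with dynamic shapes; add the relevant arguments as needed.
+
+## Detector Parameter Settings
+
+The DetectorYOLOV5 node in Resource/Configuration.xml of the samples project holds the YOLOV5 detector parameters, which are set mainly according to the official inference example. The parameters are:
+
+- ModelPath: path to the yolov5 model
+- ClassNameFile: path to the COCO class name file
+- UseFP16: whether to use FP16 inference
+- NumberOfClasses: number of detection classes
+- ConfidenceThreshold: confidence threshold, used to decide whether the object in an anchor is a positive sample
+- NMSThreshold: non-maximum suppression threshold, used to remove duplicate boxes
+- ObjectThreshold: threshold used to decide whether an anchor contains an object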
+
+```
+<ModelPath>"../Resource/Models/Detector/YOLOV5/YOLOV5s.onnx"</ModelPath>
+<ClassNameFile>"../Resource/Models/Detector/YOLOV5/coco.names"</ClassNameFile>
+<UseFP16>0</UseFP16><!--whether to use FP16-->
+<NumberOfClasses>80</NumberOfClasses><!--number of classes (excluding background); COCO:80, VOC:20-->
+<ConfidenceThreshold>0.25</ConfidenceThreshold>
+<NMSThreshold>0.5</NMSThreshold>
+<ObjectThreshold>0.5</ObjectThreshold>
+```
+
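+## Model Initialization
+
+Model initialization first loads the YOLOV5 ONNX model with parse_onnx(); the input attributes of the network can then be read with the program's get_parameter_shapes() function. After loading, the model must be compiled with compile(), using migraphx::gpu::target{} to select GPU mode; compilation applies various optimizations on the MIGraphX IR. For reduced-precision inference, quantize_fp16() can be called before compiling.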
+
+```
+ErrorCode DetectorYOLOV5::Initialize(InitializationParameterOfDetector initializationParameterOfDetector)
+{
+    ...
+    
+    // Load the model
+    net = migraphx::parse_onnx(modelPath);
+    LOG_INFO(logFile,"succeed to load model: %s\n",GetFileName(modelPath).c_str());
+
+    // Get the model input attributes
+    std::pair<std::string, migraphx::shape> inputAttribute=*(net.get_parameter_shapes().begin());
+    inputName=inputAttribute.first;
+    inputShape=inputAttribute.second;
+    inputSize=cv::Size(inputShape.lens()[3],inputShape.lens()[2]);// NCHW
+
+    // Target the GPU
+    migraphx::target gpuTarget = migraphx::gpu::target{};
+
+    // Quantize to FP16 if requested
+    if(useFP16)
+    {
+        migraphx::quantize_fp16(net);
+    }
+
+    // Compile the model
+    migraphx::compile_options options;
+    options.device_id=0; // select the GPU device, device 0 by default (supported in versions >= 1.2)
+    options.offload_copy=true; // enable offload_copy
+    net.compile(gpuTarget,options);
+    LOG_INFO(logFile,"succeed to compile model: %s\n",GetFileName(modelPath).c_str());
+
+    ...
+}
+```
+
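+## Model Inference
+
+### Preprocessing
+
+Before an image is fed to the model, it must be preprocessed as follows:
+
+1. Convert the data layout to NCHW
+2. Normalize to [0.0, 1.0]
+3. Resize the input to the YOLOV5 input size (1, 3, 608, 608)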
+
+```
+ErrorCode DetectorYOLOV5::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{ 
+  ...
+  
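+    // Preprocess and convert to NCHW
+    cv::Mat inputBlob;
+    blobFromImage(srcImage,   // input image
+                    inputBlob,  // output blob
+                    1 / 255.0,  // scale factor: normalize to [0, 1]
+                    inputSize,  // YOLOV5 input size, 608x608 in this example
+                    Scalar(0, 0, 0),  // no mean subtraction
+                    true,  // swap the R and B channels
+                    false);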
+                    
+   ...
+}
+```
+
+### Forward Inference
+
+After image preprocessing and the YOLOV5 detection parameters are set, inference is run with migraphx to obtain the YOLOV5 model output.
+
+```
+ErrorCode DetectorYOLOV5::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+
+	...
+    // Input data
+    migraphx::parameter_map inputData;
+    inputData[inputName]= migraphx::argument{inputShape, (float*)inputBlob.data};
+
+    // Run inference
+    std::vector<migraphx::argument> inferenceResults = net.eval(inputData);
+
+    // Get the inference result
+    std::vector<cv::Mat> outs;
+    migraphx::argument result = inferenceResults[0];
+
+    // Convert to cv::Mat
+    migraphx::shape outputShape = result.get_shape();
+    int shape[]={outputShape.lens()[0],outputShape.lens()[1],outputShape.lens()[2]};
+    cv::Mat out(3,shape,CV_32F);
+    memcpy(out.data,result.data(),sizeof(float)*outputShape.elements());
+    outs.push_back(out);
+    
+    ...
+}
+```
+
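+The MIGraphX inference result inferenceResults is a std::vector< migraphx::argument >. The YOLOV5 ONNX model has a single output, so result is inferenceResults[0]. result has three dimensions: outputShape.lens()[0]=1 is the batch size, outputShape.lens()[1]=22743 is the number of generated anchors, and outputShape.lens()[2]=85 is the prediction data for each anchor. The 85 values split into 4+1+80: the first 4 are the box regression parameters of each feature point, from which the predicted box is obtained; the 5th indicates whether the feature point contains an object; and the last 80 give the class scores of the object at that feature point. Anchors are then filtered in two steps:
+
+- Step 1 filters by objectThreshold: an anchor whose objectness score exceeds the threshold is treated as containing an object; otherwise it is treated as empty.
+- Step 2 filters by confidenceThreshold: if an anchor that passed step 1 has a maximum confidence score maxClassScore above the threshold, its coordinates and predicted class are extracted; otherwise it is discarded.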
+
+```
+ErrorCode DetectorYOLOV5::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+
+	...
+    // Number of anchors, numProposal=22743
+    int numProposal = outs[0].size[1];
+    // Number of values predicted per anchor, numOut=85
+    int numOut = outs[0].size[2];
+    outs[0] = outs[0].reshape(0, numProposal);
+
+    std::vector<float> confidences;
+    std::vector<cv::Rect> boxes;
+    std::vector<int> classIds;
+    // Scale ratios between the original image and the model input
+    float ratioh = (float)srcImage.rows / inputSize.height, ratiow = (float)srcImage.cols / inputSize.width;
+
+    // Compute cx,cy,w,h,box_score,class_score
+    int n = 0, rowInd = 0;
+    float* pdata = (float*)outs[0].data;
+    for (n = 0; n < numProposal; n++)
+    {
+        // Probability that the current anchor contains an object
+        float boxScores = pdata[4];
+
+        // First filter: does the anchor contain an object?
+        if (boxScores > yolov5Parameter.objectThreshold)
+        {
+            // The 80 class probabilities predicted for this anchor
+            cv::Mat scores = outs[0].row(rowInd).colRange(5, numOut);
+            cv::Point classIdPoint;
+            double maxClassScore;
+
+            // Highest class probability and the corresponding class ID
+            cv::minMaxLoc(scores, 0, &maxClassScore, 0, &classIdPoint);
+            maxClassScore *= boxScores;
+
+            // Second filter: does the anchor's best confidence score reach the threshold?
+            if (maxClassScore > yolov5Parameter.confidenceThreshold)
+            {
+                const int classIdx = classIdPoint.x;
+
+                // Map the anchor coordinates to the original image using the scale ratios
+                float cx = pdata[0] * ratiow;
+                float cy = pdata[1] * ratioh;
+                float w = pdata[2] * ratiow;
+                float h = pdata[3] * ratioh;
+                // Top-left corner of the box
+                int left = int(cx - 0.5 * w);
+                int top = int(cy - 0.5 * h);
+
+                confidences.push_back((float)maxClassScore);
+                boxes.push_back(cv::Rect(left, top, (int)(w), (int)(h)));
+                classIds.push_back(classIdx);
+            }
+        }
+        rowInd++;
+        pdata += numOut;
+    }
+	
+	...
+}
+```
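+
+The two-step filter in the loop above can be condensed into a few lines. The Python sketch below mirrors the logic for a single prediction row, assuming the row layout [cx, cy, w, h, objectness, class scores...] described earlier; the function name filter_prediction is hypothetical.
+
+```python
+def filter_prediction(row, object_threshold, confidence_threshold):
+    """Return (class_id, confidence) if the row passes both filters, else None."""
+    objectness = row[4]
+    # Step 1: objectness filter.
+    if objectness <= object_threshold:
+        return None
+    class_scores = row[5:]
+    class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
+    # Step 2: best class score weighted by objectness.
+    confidence = class_scores[class_id] * objectness
+    if confidence <= confidence_threshold:
+        return None
+    return class_id, confidence
+
+
+# A toy row with 3 classes instead of 80.
+row = [304.0, 304.0, 50.0, 80.0, 0.9, 0.1, 0.8, 0.05]
+print(filter_prediction(row, object_threshold=0.5, confidence_threshold=0.25))
+```
+
+Here the row passes both filters with class 1 and a confidence of about 0.72. Checking objectness first skips the per-class search for the large majority of rows, which are background.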
+
+To remove overlapping boxes and produce the final YOLOV5 detections, non-maximum suppression is applied to the filtered anchors, and the results are saved to resultsOfDetection.
+
+```
+ErrorCode DetectorYOLOV5::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+
+	...
+    
+    // Run non-maximum suppression to remove redundant overlapping boxes
+    std::vector<int> indices;
+    dnn::NMSBoxes(boxes, confidences, yolov5Parameter.confidenceThreshold, yolov5Parameter.nmsThreshold, indices);
+    for (size_t i = 0; i < indices.size(); ++i)
+    {
+        int idx = indices[i];
+        int classID=classIds[idx];
+        string className=classNames[classID];
+        float confidence=confidences[idx];
+        cv::Rect box = boxes[idx];
+
+        // Save the coordinates, confidence score, and class ID of each final box
+        ResultOfDetection result;
+        result.boundingBox=box;
+        result.confidence=confidence;// confidence
+        result.classID=classID; // label
+        result.className=className;
+        resultsOfDetection.push_back(result);
+    }
+
+    ...
+}
+```
+
+## Running the Example
+
+After building the C++ samples following README.md in the samples project, run the example from the build directory with:
+
+```
+./MIGraphX_Samples 5
+```
+
+The detection result image Result.jpg is generated in the current directory.
+
+<img src="../Images/YOLOV5_02.jpg" alt="YOLOV5_02" style="zoom:67%;" />
--- a/Doc/Tutorial_Cpp/YOLOV7.md
+++ b/Doc/Tutorial_Cpp/YOLOV7.md
+# YOLOV7 Detector
+
+## Model Overview
+
+YOLOV7 is a YOLO-series object detection model introduced in 2022 in the paper [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors](https://arxiv.org/abs/2207.02696).
+
+<img src="../Images/YOLOV7_02.png" style="zoom:67%;" />
+
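+This example uses the official YOLOv7 source: https://github.com/WongKinYiu/yolov7. The author provides several pretrained models; this example uses yolov7-tiny.pt. Download yolov7-tiny.pt and save it to the weights directory of the Pytorch_YOLOV7 project.
+
+## Model Conversion
+
+The official YOLOV7 source includes a script for exporting ONNX models. The following steps convert the yolov7-tiny.pt pretrained model to ONNX format: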
+
+```
+# Enter the root directory of the Pytorch_YOLOV7 project
+cd <path_to_Pytorch_YOLOV7>
+
+# Install the runtime dependencies; torch and torchvision must be installed manually
+pip install -r requirements.txt
+
+# Convert the model
+python export.py --weights ./weights/yolov7-tiny.pt --img-size 640 640
+```
+
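+Note: to change the input size of the ONNX model, adjust the --img-size argument. The input batch size defaults to 1 and can be changed with --batch-size. When the script finishes, the YOLOV7 model in ONNX format is generated in the current directory; this example saves it to the Resource/Models/Detector/YOLOV7 directory of the samples project, where MIGraphX can load it for inference.
+
+## Detector Parameter Settings
+
+The DetectorYOLOV7 node in Resource/Configuration.xml of the samples project holds the YOLOV7 detector parameters, which are set mainly according to the official inference example. They cover the model path, the class name file, the number of detection classes, the confidence threshold, the non-maximum suppression threshold, and the objectness threshold.
+
+- ModelPath: path to the yolov7 model
+- ClassNameFile: path to the COCO class name file
+- UseFP16: whether to use FP16 inference
+- NumberOfClasses: number of detection classes
+- ConfidenceThreshold: confidence threshold, used to decide whether the object in an anchor is a positive sample
+- NMSThreshold: non-maximum suppression threshold, used to remove duplicate boxes
+- ObjectThreshold: threshold used to decide whether an anchor contains an object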
+
+```
+<ModelPath>"../Resource/Models/Detector/YOLOV7/yolov7-tiny.onnx"</ModelPath>
+<ClassNameFile>"../Resource/Models/Detector/YOLOV7/coco.names"</ClassNameFile>
+<UseFP16>0</UseFP16><!--whether to use FP16-->
+<NumberOfClasses>80</NumberOfClasses><!--number of classes (excluding background); COCO:80, VOC:20-->
+<ConfidenceThreshold>0.25</ConfidenceThreshold>
+<NMSThreshold>0.5</NMSThreshold>
+<ObjectThreshold>0.5</ObjectThreshold>
+```
+
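+## Model Initialization
+
+Model initialization first loads the YOLOV7 ONNX model with parse_onnx(); the input attributes of the network can then be read with the program's get_parameter_shapes() function. After loading, the model must be compiled with compile(), using migraphx::gpu::target{} to select GPU mode; compilation applies various optimizations on the MIGraphX IR. For reduced-precision inference, quantize_fp16() can be called before compiling.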
+
+```
+ErrorCode DetectorYOLOV7::Initialize(InitializationParameterOfDetector initializationParameterOfDetector)
+{
+    ...
+    
+    // Load the model
+    net = migraphx::parse_onnx(modelPath);
+    LOG_INFO(logFile,"succeed to load model: %s\n",GetFileName(modelPath).c_str());
+
+    // Get the model input attributes
+    std::pair<std::string, migraphx::shape> inputAttribute=*(net.get_parameter_shapes().begin());
+    inputName=inputAttribute.first;
+    inputShape=inputAttribute.second;
+    inputSize=cv::Size(inputShape.lens()[3],inputShape.lens()[2]);// NCHW
+
+    // Target the GPU
+    migraphx::target gpuTarget = migraphx::gpu::target{};
+
+    // Quantize to FP16 if requested
+    if(useFP16)
+    {
+        migraphx::quantize_fp16(net);
+    }
+
+    // Compile the model
+    migraphx::compile_options options;
+    options.device_id=0; // select the GPU device, device 0 by default (supported in versions >= 1.2)
+    options.offload_copy=true; // enable offload_copy
+    net.compile(gpuTarget,options);
+    LOG_INFO(logFile,"succeed to compile model: %s\n",GetFileName(modelPath).c_str());
+
+    ...
+}
+```
+
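+## Model Inference
+
+### Preprocessing
+
+Before an image is fed to the model, it must be preprocessed as follows:
+
+- Convert the data layout to NCHW
+- Normalize to [0.0, 1.0]
+- Resize the input to the YOLOV7 input size (1, 3, 640, 640)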
+
+```c++
+ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+	...
+        
+    // Preprocess and convert to NCHW
+    cv::Mat inputBlob;
+    blobFromImage(srcImage, // input image
+                    inputBlob, // output blob
+                    1 / 255.0, // scale factor, 1/255.0 here
+                    inputSize, // YOLOV7 input size (640,640)
+                    Scalar(0, 0, 0), // mean; no mean subtraction is needed, so it is set to 0.0
+                    true, // swap the R and B channels
+                    false);
+    ...
+}
+```
+
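+### Forward Inference
+
+After image preprocessing and the yolov7 detection parameters are set, inference is run and the migraphx inference results are retrieved.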
+
+```
+ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+	...
+	
+    // Input data
+    migraphx::parameter_map inputData;
+    inputData[inputName]= migraphx::argument{inputShape, (float*)inputBlob.data};
+
+    // Run inference
+    std::vector<migraphx::argument> inferenceResults=net.eval(inputData);
+
+    // Get the inference result
+    std::vector<cv::Mat> outs;
+    migraphx::argument result = inferenceResults[0];
+
+    // Convert to cv::Mat (the output has 3 dimensions)
+    migraphx::shape outputShape = result.get_shape();
+    int shape[]={outputShape.lens()[0],outputShape.lens()[1],outputShape.lens()[2]};
+    cv::Mat out(3,shape,CV_32F);
+    memcpy(out.data,result.data(),sizeof(float)*outputShape.elements());
+
+    outs.push_back(out);
+
+    ...
+
+}
+```
+
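+The MIGraphX inference result inferenceResults is a std::vector< migraphx::argument >. The YOLOV7 ONNX model has a single output, so result is inferenceResults[0]. result has three dimensions: outputShape.lens()[0]=1 is the batch size, outputShape.lens()[1]=25200 is the number of generated anchors, and outputShape.lens()[2]=85 is the prediction data for each anchor. The 85 values split into 4+1+80: the first 4 are the box regression parameters of each feature point, from which the predicted box is obtained; the 5th indicates whether the feature point contains an object; and the last 80 give the class scores of the object at that feature point. Anchors are then filtered in two steps:
+
+- Step 1 filters by objectThreshold: an anchor whose objectness score exceeds the threshold is treated as containing an object; otherwise it is treated as empty.
+- Step 2 filters by confidenceThreshold: if an anchor that passed step 1 has a maximum confidence score maxClassScore above the threshold, its predicted class and coordinates are extracted; otherwise it is discarded.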
+
+```
+ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+	...
+	
+    // Number of anchors, numProposal=25200
+    int numProposal = outs[0].size[1];
+    // Number of values predicted per anchor, numOut=85
+    int numOut = outs[0].size[2];
+    outs[0] = outs[0].reshape(0, numProposal);
+
+    std::vector<float> confidences;
+    std::vector<cv::Rect> boxes;
+    std::vector<int> classIds;
+    // Scale ratios between the original image and the model input
+    float ratioh = (float)srcImage.rows / inputSize.height, ratiow = (float)srcImage.cols / inputSize.width;
+
+    // Compute cx,cy,w,h,box_score,class_score
+    int n = 0, rowInd = 0;
+    float* pdata = (float*)outs[0].data;
+    for (n = 0; n < numProposal; n++)
+    {
+    	//获取是否包含物体的概率值
+        float boxScores = pdata[4];
+        
+        //第一次筛选，判断anchor内是否包含物体
+        if (boxScores > yolov7Parameter.objectThreshold)
+        {
+            //获取每个anchor内部预测的80个类别概率信息
+            cv::Mat scores = outs[0].row(rowInd).colRange(5, numOut);
+            cv::Point classIdPoint;
+            double maxClassScore;
+            
+            //获取80个类别中最大概率值和对应的类别ID
+            cv::minMaxLoc(scores, 0, &maxClassScore, 0, &classIdPoint);
+            maxClassScore *= boxScores;
+            
+            //第二次筛选，判断当前anchor的最大置信度得分是否满足阈值
+            if (maxClassScore > yolov7Parameter.confidenceThreshold)
+            {
+                const int classIdx = classIdPoint.x;
+                
+                //将每个anchor坐标按缩放比例映射到原图
+                float cx = pdata[0] * ratiow;
+                float cy = pdata[1] * ratioh;
+                float w = pdata[2] * ratiow;
+                float h = pdata[3] * ratioh;
+                //获取anchor的左上角坐标
+                int left = int(cx - 0.5 * w);
+                int top = int(cy - 0.5 * h);
+
+                confidences.push_back((float)maxClassScore);
+                boxes.push_back(cv::Rect(left, top, (int)(w), (int)(h)));
+                classIds.push_back(classIdx);
+            }
+        }
+        rowInd++;
+        pdata += numOut;
+    }
+
+    ...
+
+}
+```
+
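+To remove overlapping boxes and produce the final YOLOV7 detections, non-maximum suppression is applied to the filtered anchors, and the surviving detections are saved to resultsOfDetection.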
+
+```
+ErrorCode DetectorYOLOV7::Detect(const cv::Mat &srcImage, std::vector<ResultOfDetection> &resultsOfDetection)
+{
+    ...
+
+    // Run non-maximum suppression to remove redundant overlapping boxes
+    std::vector<int> indices;
+    cv::dnn::NMSBoxes(boxes, confidences, yolov7Parameter.confidenceThreshold, yolov7Parameter.nmsThreshold, indices);
+    for (size_t i = 0; i < indices.size(); ++i)
+    {
+        int idx = indices[i];
+        int classID = classIds[idx];
+        std::string className = classNames[classID];
+        float confidence = confidences[idx];
+        cv::Rect box = boxes[idx];
+
+        // Save the box, confidence score and class ID of each final detection
+        ResultOfDetection result;
+        result.boundingBox = box;
+        result.confidence = confidence; // confidence
+        result.classID = classID;       // label
+        result.className = className;
+        resultsOfDetection.push_back(result);
+    }
+
+    ...
+
+}
+```
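
To make the suppression step concrete, here is a minimal greedy NMS sketch in plain Python. It illustrates the general technique only and is not OpenCV's exact implementation; `iou` and `greedy_nms` are hypothetical names, and boxes are assumed to be `(left, top, width, height)` tuples as in the C++ code above.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (left, top, width, height)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold, nms_threshold):
    """Return the indices of the kept boxes, highest score first."""
    # Drop low-score boxes, then visit the rest in descending score order.
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, two heavily overlapping boxes and one distant box with scores 0.9, 0.8 and 0.7 reduce to the first and third box when `nms_threshold` is 0.5.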
+
+## Running the sample
+
+After building the C++ samples as described in the README.md of the samples project, run the sample from the build directory with:
+
+```
+./MIGraphX_Samples 6
+```
+
+The detection result image Result.jpg is written to the current directory.
+
+<img src="../Images/YOLOV7_03.jpg" alt="YOLOV7_03" style="zoom:67%;" />
--- a/Doc/Tutorial_Python/RetinaFace.md
+++ b/Doc/Tutorial_Python/RetinaFace.md
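+# RetinaFace face detector
+
+## Model overview
+
+RetinaFace is a classic face detection model (https://arxiv.org/abs/1905.00641) built on the SSD architecture.
+
+![image-20221215140647406](../Images/RetinaFace_01.png)
+
+This sample uses the following open-source implementation: https://github.com/biubug6/Pytorch_Retinaface. The author provides two pretrained models, resnet50 and mobilenet0.25; this sample uses mobilenet0.25. Download the mobilenet0.25 pretrained model and save it to the weights directory of the Pytorch_Retinaface project.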
+
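+## Model conversion
+
+The mobilenet0.25 pretrained model can be converted to an onnx file with the following steps:
+
+1. Edit data/config.py: in cfg_mnet, change 'pretrain': True, to 'pretrain': False,
+
+2. Run the following commands to convert mobilenet0.25_Final.pth in the weights directory to an onnx file
+
+   ```
+   # Enter the Pytorch_Retinaface project root
+   cd <path_to_Pytorch_Retinaface>
+   
+   # Convert the model
+   python convert_to_onnx.py
+   ```
+
+   Note: to change the model input size, modify the args.long_side parameter; the default is 640x640.
+
+   After a successful conversion, FaceDetector.onnx is generated in the current directory and can be used for MIGraphX inference. The Python/RetinaFace directory of the samples project contains the already-modified code; running python convert_to_onnx.py there generates the onnx file directly.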
+
+## Inference
+
+The Pytorch_Retinaface project ships the original Pytorch inference test script detect.py. Only the part that runs Pytorch inference needs to be replaced with MIGraphX inference; Python/RetinaFace/detect.py in the samples project is the already-converted script. The conversion steps are:
+
+1. Load the model with migraphx
+
+   ```
+   # Load the model
+   model = migraphx.parse_onnx("./FaceDetector.onnx")
+   ```
+
+2. After loading, compile the model with model.compile; device_id selects which device to use
+
+   ```
+   model.compile(t=migraphx.get_target("gpu"),device_id=0)
+   ```
+
+3. After compilation the model can run on input images. The onnx model used here has a 640x640 input, so the image is first resized to 640x640
+
+   ```
+   # Resize to the onnx model input size
+   image_path = "./curve/test.jpg"
+   img_raw = cv2.imread(image_path, cv2.IMREAD_COLOR)
+   img_raw = cv2.resize(img_raw, (640,640))
+   ```
+
+4. Preprocessing stays identical to the author's code and needs no changes
+
+5. The key step is replacing the Pytorch call net(img) with the MIGraphX call migraphx_run(model,args.cpu,img), where migraphx_run is implemented as follows:
+
+   ```
+   def migraphx_run(model,cpu,data_tensor):
+       # Convert the input tensor to numpy
+       if cpu:
+           data_numpy=data_tensor.cpu().numpy()
+           device = torch.device("cpu")
+       else:
+           data_numpy=data_tensor.detach().cpu().numpy()
+           device = torch.device("cuda")
+       
+       # MIGraphX expects a contiguous float32 buffer
+       img_data = np.ascontiguousarray(data_numpy, dtype=np.float32)
+   
+       # Run inference
+       result = model.run({model.get_parameter_names()[0]: migraphx.argument(img_data)})
+   
+       # Convert the results back to tensors
+       result0=torch.from_numpy(np.array(result[0], copy=False)).to(device)
+       result1=torch.from_numpy(np.array(result[1], copy=False)).to(device)
+       result2=torch.from_numpy(np.array(result[2], copy=False)).to(device)
+   
+       return (result0,result1,result2)
+   ```
+
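+   The tensor is first converted to numpy and stored in img_data. The MIGraphX input is then created with {model.get_parameter_names()[0]: migraphx.argument(img_data)} and inference is run with model.run. result holds the returned outputs, which are converted back to tensors with torch.from_numpy and returned. To keep the same format as the Pytorch inference result, pay attention to the output order: MIGraphX returns outputs in the same order as the onnx model, which can be inspected with netron (https://netron.app/):
+
+   ![image-20221215200807445](../Images/RetinaFace_04.png)
+
+   The first output corresponds to loc in the Pytorch result, the second to conf, and the third to landms, so the function returns (result0,result1,result2).
+
+6. After inference succeeds, post-processing is needed to obtain the final detections. Because the output format matches the original Pytorch model, the original post-processing can be reused without changes.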
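
Because the output order described in step 5 matters, a quick shape check can catch a swapped output early. The sketch below computes the shapes one would expect for loc, conf and landms. It assumes the mobilenet0.25 configuration of Pytorch_Retinaface (strides 8, 16, 32 and 2 anchors per location); these values and the helper name are assumptions, not read from the onnx file.

```python
import math


def expected_output_shapes(height=640, width=640, strides=(8, 16, 32), anchors_per_cell=2):
    """Expected (loc, conf, landms) shapes for a single image (batch size 1)."""
    num_priors = sum(
        math.ceil(height / s) * math.ceil(width / s) for s in strides
    ) * anchors_per_cell
    return {
        "loc": (1, num_priors, 4),      # box regression offsets
        "conf": (1, num_priors, 2),     # background / face scores
        "landms": (1, num_priors, 10),  # five (x, y) landmark offsets
    }


if __name__ == "__main__":
    for name, shape in expected_output_shapes().items():
        print(name, shape)
```

Comparing these shapes with `tuple(result0.shape)`, `tuple(result1.shape)` and `tuple(result2.shape)` after `migraphx_run` is one way to confirm the outputs are in loc, conf, landms order.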
+
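+## Running the sample
+
+1. Install MIGraphX following the installation instructions in the MIGraphX Tutorial and set PYTHONPATH
+2. Install Pytorch
+3. Install the dependencies:
+
+```
+# Enter the migraphx samples project root
+cd <path_to_migraphx_samples> 
+
+# Enter the sample directory
+cd Python/RetinaFace
+
+# Install the dependencies
+pip install -r requirements.txt
+```
+
+4. Run the sample:
+
+```
+python detect.py
+```
+
+The detection result image test.jpg is written to the current directory.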
+![image-20221215202711987](../Images/RetinaFace_05.png)
\ No newline at end of file
--- a/Doc/Tutorial_Python/YOLOV3.md
+++ b/Doc/Tutorial_Python/YOLOV3.md
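+# YOLOV3 detector
+
+## Model overview
+
+YOLOV3 is the single-stage detector proposed by Joseph Redmon and Ali Farhadi in "YOLOv3: An Incremental Improvement". It first extracts features from the input with a backbone network; the backbone grows from the Darknet19 of YOLOV2 to the deeper Darknet53 and adopts the cross-layer addition (residual connection) of ResNet. It then combines features from different convolutional layers for multi-scale prediction at three resolutions, 13x13, 26x26 and 52x52, used for large, medium and small objects respectively. Each feature map divides the input image into a grid of cells; every cell predicts B bounding boxes, and every bounding box predicts Location (x, y, w, h), a Confidence Score and the probabilities of C classes, so the YOLOv3 output layer has B*(5 + C) channels. The YOLOv3 loss likewise has three parts: location error, confidence error and classification error.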
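
The channel formula and grid sizes above can be worked through numerically. This is a small sketch under stated assumptions: B=3 boxes per cell, C=80 classes (COCO), a 416x416 input and strides 32, 16 and 8. The helper name is hypothetical.

```python
def yolo_head_layout(input_size=416, strides=(32, 16, 8), boxes_per_cell=3, num_classes=80):
    """Return (channels per output layer, grid size per scale)."""
    channels = boxes_per_cell * (5 + num_classes)  # B * (5 + C)
    grids = [input_size // s for s in strides]     # cells along each side
    return channels, grids


if __name__ == "__main__":
    channels, grids = yolo_head_layout()
    print(channels, grids)
```

With these assumptions each output layer has 3 * (5 + 80) = 255 channels and the grids are 13, 26 and 52 cells per side, matching the three resolutions named in the text.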
+
+<img src="../Images/YOLOV3_02.jpg" alt="YOLOV3_02"  />
+
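+This sample uses the following open-source implementation: https://github.com/ultralytics/yolov3. Release V9.6.0 provides several pretrained YOLOV3 models: yolov3, yolov3-fixed, yolov3-spp and yolov3-tiny. This sample builds MIGraphX inference with the yolov3-tiny.pt pretrained model; download yolov3-tiny.pt and save it to the weights directory of the Pytorch_YOLOV3 project.
+
+## Environment setup
+
+Running the YOLOV3 Python sample requires an environment with torch, torchvision and the dependencies the program needs.
+
+1. Install torch and torchvision
+
+2. Install the program dependencies
+
+```
+# Enter the Pytorch_YOLOV3 project root
+cd <path_to_Pytorch_YOLOV3>
+
+# Install the program dependencies
+pip install -r requirements.txt
+```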
+
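+## Model conversion
+
+The official YOLOV3 source includes a script for exporting onnx models. The yolov3-tiny.pt pretrained model can be converted to onnx format as follows:
+
+```
+# Enter the Pytorch_YOLOV3 project root
+cd <path_to_Pytorch_YOLOV3>
+
+# Convert the model
+python export.py --weights ./weights/yolov3-tiny.pt --imgsz 416 416 --include onnx
+```
+
+Note: the official export script supports more features, such as exporting dynamic-shape models; add the corresponding arguments as needed.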
+
+## Model inference
+
+### Image preprocessing
+
+Before an image is fed to the model it must be preprocessed, mainly by resizing and normalizing it.
+
+1. Convert the data layout to NCHW
+2. Normalize to [0.0, 1.0]
+3. Resize the input data to (1, 3, 416, 416)
+
+```
+def prepare_input(self, image):
+    self.img_height, self.img_width = image.shape[:2]
+    input_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
+    # Resize to the model input size
+    input_img = cv2.resize(input_img, (self.inputWidth, self.inputHeight))
+    # Reorder dimensions HWC -> CHW
+    input_img = input_img.transpose(2, 0, 1)
+    # Add the batch dimension
+    input_img = np.expand_dims(input_img, 0)
+    input_img = np.ascontiguousarray(input_img)
+    input_img = input_img.astype(np.float32)
+    # Normalize to [0.0, 1.0]
+    input_img = input_img / 255
+
+    return input_img
+```
+
+The model input sizes inputWidth and inputHeight are obtained by parsing the model with migraphx; see the initialization of the YOLOv3 class.
+
+```
+class YOLOv3:
+    def __init__(self, path, obj_thres=0.5, conf_thres=0.25, iou_thres=0.5):
+        self.objectThreshold = obj_thres
+        self.confThreshold = conf_thres
+        self.nmsThreshold = iou_thres
+
+        # Load the class names the model can detect
+        self.classNames = list(map(lambda x: x.strip(), open('./weights/coco.names', 'r').readlines()))
+
+        # Parse the inference model
+        self.model = migraphx.parse_onnx(path)
+
+        # Get the model input name
+        self.inputName = self.model.get_parameter_names()[0]
+
+        # Get the model input size
+        inputShape = self.model.get_parameter_shapes()[self.inputName].lens()
+        self.inputHeight = int(inputShape[2])
+        self.inputWidth = int(inputShape[3])
+```
+
+### Inference
+
+After preprocessing, inference starts. The model is first compiled with migraphx, then a forward pass on the input data produces the model output result. The detect function calls process_output to post-process result and obtain the box coordinates, class confidence and class ID of every anchor that contains an object.
+
+```
+def detect(self, image):
+    # Preprocess the input image
+    input_img = self.prepare_input(image)
+
+    # Compile the model
+    self.model.compile(t=migraphx.get_target("gpu"), device_id=0)  # device_id: GPU device to use, device 0 by default
+    print("Success to compile")
+    # Run inference
+    print("Start to inference")
+    start = time.time()
+    result = self.model.run({self.model.get_parameter_names()[0]: migraphx.argument(input_img)})
+    print('net forward time: {:.4f}'.format(time.time() - start))
+    # Post-process the model output
+    boxes, scores, class_ids = self.process_output(result)
+
+    return boxes, scores, class_ids
+```
+
+Post-processing of the migraphx output result first filters the generated anchors by the objectness threshold objectThreshold and the confidence threshold confThreshold; this is done in process_output. Once the coordinates of the remaining anchors are known, they must be mapped back to positions in the original image; this is done in rescale_boxes.
+
+```
+def process_output(self, output):
+    predictions = np.squeeze(output[0])
+
+    # Keep anchors that contain an object
+    obj_conf = predictions[:, 4]
+    predictions = predictions[obj_conf > self.objectThreshold]
+    obj_conf = obj_conf[obj_conf > self.objectThreshold]
+
+    # Keep anchors whose confidence exceeds the threshold
+    predictions[:, 5:] *= obj_conf[:, np.newaxis]
+    scores = np.max(predictions[:, 5:], axis=1)
+    valid_scores = scores > self.confThreshold
+    predictions = predictions[valid_scores]
+    scores = scores[valid_scores]
+
+    # Class ID with the highest confidence score
+    class_ids = np.argmax(predictions[:, 5:], axis=1)
+
+    # Box of each remaining anchor
+    boxes = self.extract_boxes(predictions)
+
+    # Non-maximum suppression; NMSBoxes may return an empty tuple, so normalize to a 1-D index array
+    indices = cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), self.confThreshold, self.nmsThreshold)
+    indices = np.array(indices, dtype=int).reshape(-1)
+
+    return boxes[indices], scores[indices], class_ids[indices]
+
+def rescale_boxes(self, boxes):
+    # Scale boxes from the model input size to the original image size
+    input_shape = np.array([self.inputWidth, self.inputHeight, self.inputWidth, self.inputHeight])
+    boxes = np.divide(boxes, input_shape, dtype=np.float32)
+    boxes *= np.array([self.img_width, self.img_height, self.img_width, self.img_height])
+    return boxes
+```
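
The `extract_boxes` helper called in `process_output` is not shown in this tutorial. Since the drawing code and `cv2.dnn.NMSBoxes` both treat a box as top-left corner plus width and height, one plausible ingredient is a conversion from the model's center format. The function below is a hypothetical sketch of that conversion only, not the sample's actual implementation.

```python
def xywh_center_to_top_left(box):
    """Convert (cx, cy, w, h) with a center origin to (x, y, w, h) with a top-left origin."""
    cx, cy, w, h = box
    return [cx - w / 2.0, cy - h / 2.0, w, h]


if __name__ == "__main__":
    print(xywh_center_to_top_left([100, 100, 40, 20]))
```

A box centered at (100, 100) with size 40x20 becomes a rectangle whose top-left corner is at (80, 90).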
+
+The boxes, scores and class_ids returned by detect are then visualized on the original image by drawing the position, class and confidence score of each detected object, which gives the final YOLOV3 detection output.
+
+```
+def draw_detections(self, image, boxes, scores, class_ids):
+    for box, score, class_id in zip(boxes, scores, class_ids):
+        # (x, y) is the top-left corner of the box
+        x, y, w, h = box.astype(int)
+
+        # Draw the detection box
+        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 255), thickness=2)
+        label = self.classNames[class_id]
+        label = f'{label} {int(score * 100)}%'
+        labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
+        cv2.putText(image, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), thickness=2)
+    return image
+```
+
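+## Running the sample
+
+1. Install MIGraphX following the installation instructions in the MIGraphX Tutorial and set PYTHONPATH
+
+2. Run the sample
+
+```
+# Enter the migraphx samples project root
+cd <path_to_migraphx_samples> 
+
+# Enter the sample directory
+cd Python/Detector/YOLOV3
+
+# Run the sample
+python detect_migraphx.py --imgpath ./data/images/dog.jpg --modelpath ./weights/yolov3-tiny.onnx --objectThreshold 0.4 --confThreshold 0.2 --nmsThreshold 0.4
+```
+
+The input arguments can be changed as needed. When the program finishes, the YOLOV3 detection result image is written to the current directory.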
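
For readers who want to see how such command-line flags are typically wired up, here is a minimal `argparse` sketch. The flag names come from the command above; the default values are assumptions borrowed from the class constructor shown earlier, and `build_parser` is a hypothetical name, so the real detect_migraphx.py may differ.

```python
import argparse


def build_parser():
    """Build a parser for the flags used by the sample command (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="YOLOV3 MIGraphX detection sample (sketch)")
    parser.add_argument("--imgpath", type=str, default="./data/images/dog.jpg", help="input image")
    parser.add_argument("--modelpath", type=str, default="./weights/yolov3-tiny.onnx", help="onnx model")
    parser.add_argument("--objectThreshold", type=float, default=0.5, help="objectness threshold")
    parser.add_argument("--confThreshold", type=float, default=0.25, help="confidence threshold")
    parser.add_argument("--nmsThreshold", type=float, default=0.5, help="NMS IoU threshold")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Lower thresholds keep more candidate boxes, and a lower `--nmsThreshold` suppresses overlapping boxes more aggressively.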
+
+![YOLOV3_03](../Images/YOLOV3_03.jpg)
--- a/Doc/Tutorial_Python/YOLOV5.md
+++ b/Doc/Tutorial_Python/YOLOV5.md
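+# YOLOV5 detector
+
+## Model overview
+
+YOLOV5 is a single-stage object detection algorithm that adds several improvements on top of YOLOV4, substantially improving both speed and accuracy. These include: Mosaic data augmentation, adaptive anchor computation and adaptive image scaling at the input; the Focus and CSP structures in the backbone; the FPN+PAN structure in the neck; and the GIOU_Loss loss function and DIOU_nms box filtering at the output. The network structure is shown in the figure.
+
+<img src="../Images/YOLOV5_01.jpg" alt="YOLOV5_01" style="zoom: 67%;" />
+
+The official YOLOV5 source is at https://github.com/ultralytics/yolov5 and includes several variants such as YOLOV5n, YOLOV5s, YOLOV5m and YOLOV5l. This sample builds the MIGraphX inference example with YOLOV5s; download the pretrained model yolov5s.pt and save it to the weights directory of the Pytorch_YOLOV5 project.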
+
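+## Environment setup
+
+Running the YOLOV5 Python sample requires an environment with torch, torchvision and the dependencies the program needs.
+
+1. Install torch and torchvision
+
+2. Install the program dependencies
+
+```
+# Enter the Pytorch_YOLOV5 project root
+cd <path_to_Pytorch_YOLOV5>
+
+# Install the program dependencies
+pip install -r requirements.txt
+```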
+
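+## Model conversion
+
+The official YOLOV5 source includes a script for exporting onnx models. The yolov5s.pt pretrained model can be converted to onnx format as follows:
+
+```
+# Enter the Pytorch_YOLOV5 project root
+cd <path_to_Pytorch_YOLOV5>
+
+# Convert the model
+python export.py --weights ./weights/yolov5s.pt --imgsz 608 608 --include onnx
+```
+
+Note: the official export script supports more features, such as exporting dynamic-shape models; add the corresponding arguments as needed.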
+
+## Model inference
+
+### Image preprocessing
+
+Before an image is fed to the model it must be preprocessed, mainly by resizing and normalizing it.
+
+1. Convert the data layout to NCHW
+2. Normalize to [0.0, 1.0]
+3. Resize the input data to (1, 3, 608, 608)
+
+```
+def prepare_input(self, image):
+    self.img_height, self.img_width = image.shape[:2]
+    input_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
+    # Resize to the model input size
+    input_img = cv2.resize(input_img, (self.inputWidth, self.inputHeight))
+    # Reorder dimensions HWC -> CHW
+    input_img = input_img.transpose(2, 0, 1)
+    # Add the batch dimension
+    input_img = np.expand_dims(input_img, 0)
+    input_img = np.ascontiguousarray(input_img)
+    input_img = input_img.astype(np.float32)
+    # Normalize to [0.0, 1.0]
+    input_img = input_img / 255
+
+    return input_img
+```
+
+The model input sizes inputWidth and inputHeight are obtained by parsing the model with migraphx; see the initialization of the YOLOv5 class.
+
+```
+class YOLOv5:
+    def __init__(self, path, obj_thres=0.5, conf_thres=0.25, iou_thres=0.5):
+        self.objectThreshold = obj_thres
+        self.confThreshold = conf_thres
+        self.nmsThreshold = iou_thres
+
+        # Load the class names the model can detect
+        self.classNames = list(map(lambda x: x.strip(), open('./weights/coco.names', 'r').readlines()))
+
+        # Parse the inference model
+        self.model = migraphx.parse_onnx(path)
+
+        # Get the model input name
+        self.inputName = self.model.get_parameter_names()[0]
+
+        # Get the model input size
+        inputShape = self.model.get_parameter_shapes()[self.inputName].lens()
+        self.inputHeight = int(inputShape[2])
+        self.inputWidth = int(inputShape[3])
+```
+
+### Inference
+
+After preprocessing, inference starts. The model is first compiled with migraphx, then a forward pass on the input data produces the model output result. The detect function calls process_output to post-process result and obtain the box coordinates, class confidence and class ID of every anchor that contains an object.
+
+```
+def detect(self, image):
+    # Preprocess the input image
+    input_img = self.prepare_input(image)
+
+    # Compile the model
+    self.model.compile(t=migraphx.get_target("gpu"), device_id=0)  # device_id: GPU device to use, device 0 by default
+    print("Success to compile")
+    # Run inference
+    print("Start to inference")
+    start = time.time()
+    result = self.model.run({self.model.get_parameter_names()[0]: migraphx.argument(input_img)})
+    print('net forward time: {:.4f}'.format(time.time() - start))
+    # Post-process the model output
+    boxes, scores, class_ids = self.process_output(result)
+
+    return boxes, scores, class_ids
+```
+
+Post-processing of the migraphx output result first filters the generated anchors by the objectness threshold objectThreshold and the confidence threshold confThreshold; this is done in process_output. Once the coordinates of the remaining anchors are known, they must be mapped back to positions in the original image; this is done in rescale_boxes.
+
+```
+def process_output(self, output):
+    predictions = np.squeeze(output[0])
+
+    # Keep anchors that contain an object
+    obj_conf = predictions[:, 4]
+    predictions = predictions[obj_conf > self.objectThreshold]
+    obj_conf = obj_conf[obj_conf > self.objectThreshold]
+
+    # Keep anchors whose confidence exceeds the threshold
+    predictions[:, 5:] *= obj_conf[:, np.newaxis]
+    scores = np.max(predictions[:, 5:], axis=1)
+    valid_scores = scores > self.confThreshold
+    predictions = predictions[valid_scores]
+    scores = scores[valid_scores]
+
+    # Class ID with the highest confidence score
+    class_ids = np.argmax(predictions[:, 5:], axis=1)
+
+    # Box of each remaining anchor
+    boxes = self.extract_boxes(predictions)
+
+    # Non-maximum suppression; NMSBoxes may return an empty tuple, so normalize to a 1-D index array
+    indices = cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), self.confThreshold, self.nmsThreshold)
+    indices = np.array(indices, dtype=int).reshape(-1)
+
+    return boxes[indices], scores[indices], class_ids[indices]
+
+def rescale_boxes(self, boxes):
+    # Scale boxes from the model input size to the original image size
+    input_shape = np.array([self.inputWidth, self.inputHeight, self.inputWidth, self.inputHeight])
+    boxes = np.divide(boxes, input_shape, dtype=np.float32)
+    boxes *= np.array([self.img_width, self.img_height, self.img_width, self.img_height])
+    return boxes
+```
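
The arithmetic in `rescale_boxes` is easy to verify by hand. The sketch below repeats the same divide-then-multiply mapping for a single box in plain Python; `rescale_box` is a hypothetical helper and the image size in the example is made up for illustration.

```python
def rescale_box(box, input_size, image_size):
    """Map (x, y, w, h) from model-input pixels to original-image pixels."""
    in_w, in_h = input_size
    img_w, img_h = image_size
    x, y, w, h = box
    # Normalize by the model input size, then scale to the original image size.
    return [x / in_w * img_w, y / in_h * img_h, w / in_w * img_w, h / in_h * img_h]


if __name__ == "__main__":
    print(rescale_box([304, 304, 152, 76], (608, 608), (1216, 912)))
```

A box at (304, 304) with size 152x76 in a 608x608 input maps to (608, 456) with size 304x114 in a 1216x912 image, because the horizontal scale is 2 and the vertical scale is 1.5.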
+
+The boxes, scores and class_ids returned by detect are then visualized on the original image by drawing the position, class and confidence score of each detected object, which gives the final YOLOV5 detection output.
+
+```
+def draw_detections(self, image, boxes, scores, class_ids):
+    for box, score, class_id in zip(boxes, scores, class_ids):
+        # (x, y) is the top-left corner of the box
+        x, y, w, h = box.astype(int)
+
+        # Draw the detection box
+        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 255), thickness=2)
+        label = self.classNames[class_id]
+        label = f'{label} {int(score * 100)}%'
+        labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
+        cv2.putText(image, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), thickness=2)
+    return image
+```
+
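+## Running the sample
+
+1. Install MIGraphX following the installation instructions in the MIGraphX Tutorial and set PYTHONPATH
+
+2. Run the sample
+
+```
+# Enter the migraphx samples project root
+cd <path_to_migraphx_samples> 
+
+# Enter the sample directory
+cd Python/Detector/YOLOV5
+
+# Run the sample
+python detect_migraphx.py --imgpath ./data/images/bus.jpg --modelpath ./weights/yolov5s.onnx --objectThreshold 0.5 --confThreshold 0.25 --nmsThreshold 0.5
+```
+
+The input arguments can be changed as needed. When the program finishes, the YOLOV5 detection result image is written to the current directory.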
+
+<img src="../Images/YOLOV5_03.jpg" alt="YOLOV5_03" style="zoom:67%;" />
\ No newline at end of file
--- a/Doc/Tutorial_Python/YOLOV7.md
+++ b/Doc/Tutorial_Python/YOLOV7.md
+# YOLOV7 detector
+
+## Model overview
+
+YOLOV7 is a YOLO-series object detection model introduced in 2022 in the paper [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors](https://arxiv.org/abs/2207.02696).
+
+<img src="../Images/YOLOV7_02.png" alt="YOLOV7_02" style="zoom:67%;" />
+
+This sample uses the official YOLOv7 source: https://github.com/WongKinYiu/yolov7. The author provides several pretrained models; this sample uses yolov7-tiny.pt. Download the yolov7-tiny.pt pretrained model and save it to the weights directory of the Pytorch_YOLOV7 project.
+
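+## Environment setup
+
+Running the YOLOV7 Python sample requires an environment with torch, torchvision and the dependencies the program needs.
+
+1. Install torch and torchvision
+
+2. Install the program dependencies
+
+```
+# Enter the Pytorch_YOLOV7 project root
+cd <path_to_Pytorch_YOLOV7>
+
+# Install the program dependencies
+pip install -r requirements.txt
+```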
+
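+## Model conversion
+
+The yolov7-tiny.pt pretrained model can be converted to onnx format as follows:
+
+```
+# Enter the Pytorch_YOLOV7 project root
+cd <path_to_Pytorch_YOLOV7>
+
+# Convert the model
+python export.py --weights ./weights/yolov7-tiny.pt --img-size 640 640
+```
+
+Note: to change the input size of the onnx model, adjust the --img-size argument. The input batch size defaults to 1 and can be changed by adding --batch-size. When the script finishes, the YOLOV7 model in onnx format is generated in the current directory. This model is also saved in the samples project under Resource/Models/Detector/YOLOV7 and can be loaded by MIGraphX for inference.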
+
+## Model inference
+
+### Image preprocessing
+
+Before an image is fed to the model it must be preprocessed, mainly by resizing and normalizing it.
+
+1. Convert the data layout to NCHW
+2. Normalize to [0.0, 1.0]
+3. Resize the input data to (1, 3, 640, 640)
+
+```python
+def prepare_input(self, image):
+    self.img_height, self.img_width = image.shape[:2]
+    input_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
+    # Resize to the YOLOV7 input size
+    input_img = cv2.resize(input_img, (self.inputWidth, self.inputHeight))
+    # Reorder dimensions HWC -> CHW
+    input_img = input_img.transpose(2, 0, 1)
+    # Add the batch dimension
+    input_img = np.expand_dims(input_img, 0)
+    input_img = np.ascontiguousarray(input_img)
+    input_img = input_img.astype(np.float32)
+    # Normalize to [0.0, 1.0]
+    input_img = input_img / 255
+
+    return input_img
+```
+
+The model input sizes inputWidth and inputHeight are obtained by parsing the model with migraphx; see the initialization of the YOLOv7 class.
+
+```
+class YOLOv7:
+    def __init__(self, path, obj_thres=0.5, conf_thres=0.7, iou_thres=0.5):
+        self.objectThreshold = obj_thres
+        self.confThreshold = conf_thres
+        self.nmsThreshold = iou_thres
+
+        # Load the class names the model can detect
+        self.classNames = list(map(lambda x: x.strip(), open('./weights/coco.names', 'r').readlines()))
+
+        # Parse the inference model
+        self.model = migraphx.parse_onnx(path)
+
+        # Get the model input name
+        self.inputName = self.model.get_parameter_names()[0]
+
+        # Get the model input size
+        inputShape = self.model.get_parameter_shapes()[self.inputName].lens()
+        self.inputHeight = int(inputShape[2])
+        self.inputWidth = int(inputShape[3])
+```
+
+### Inference
+
+After preprocessing, inference starts. The model is first compiled with migraphx, then a forward pass on the input data produces the model output result. The detect function calls process_output to post-process result and obtain the box coordinates, class confidence and class ID of every anchor that contains an object.
+
+```
+def detect(self, image):
+    # Preprocess the input image
+    input_img = self.prepare_input(image)
+
+    # Compile the model
+    self.model.compile(t=migraphx.get_target("gpu"), device_id=0)  # device_id: GPU device to use, device 0 by default
+    print("Success to compile")
+    # Run inference
+    print("Start to inference")
+    start = time.time()
+    result = self.model.run({self.model.get_parameter_names()[0]: migraphx.argument(input_img)})
+    print('net forward time: {:.4f}'.format(time.time() - start))
+    # Post-process the model output
+    boxes, scores, class_ids = self.process_output(result)
+
+    return boxes, scores, class_ids
+```
+
+Post-processing of the migraphx output result first filters the generated anchors by the objectness threshold objectThreshold and the confidence threshold confThreshold; this is done in process_output. Once the coordinates of the remaining anchors are known, they must be mapped back to positions in the original image; this is done in rescale_boxes.
+
+```
+def process_output(self, output):
+    predictions = np.squeeze(output[0])
+
+    # Keep anchors that contain an object
+    obj_conf = predictions[:, 4]
+    predictions = predictions[obj_conf > self.objectThreshold]
+    obj_conf = obj_conf[obj_conf > self.objectThreshold]
+
+    # Keep anchors whose confidence exceeds the threshold
+    predictions[:, 5:] *= obj_conf[:, np.newaxis]
+    scores = np.max(predictions[:, 5:], axis=1)
+    valid_scores = scores > self.confThreshold
+    predictions = predictions[valid_scores]
+    scores = scores[valid_scores]
+
+    # Class ID with the highest confidence score
+    class_ids = np.argmax(predictions[:, 5:], axis=1)
+
+    # Box of each remaining anchor
+    boxes = self.extract_boxes(predictions)
+
+    # Non-maximum suppression; NMSBoxes may return an empty tuple, so normalize to a 1-D index array
+    indices = cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), self.confThreshold, self.nmsThreshold)
+    indices = np.array(indices, dtype=int).reshape(-1)
+
+    return boxes[indices], scores[indices], class_ids[indices]
+
+def rescale_boxes(self, boxes):
+    # Scale boxes from the model input size to the original image size
+    input_shape = np.array([self.inputWidth, self.inputHeight, self.inputWidth, self.inputHeight])
+    boxes = np.divide(boxes, input_shape, dtype=np.float32)
+    boxes *= np.array([self.img_width, self.img_height, self.img_width, self.img_height])
+    return boxes
+```
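
The two-stage filtering in `process_output` can also be followed row by row without NumPy. The sketch below is a plain-Python illustration of the same logic under the assumption that each row is `[cx, cy, w, h, objectness, class scores...]`; `filter_predictions` is a hypothetical name.

```python
def filter_predictions(predictions, object_threshold=0.5, conf_threshold=0.25):
    """Return (box, score, class_id) for every row that passes both thresholds."""
    kept = []
    for row in predictions:
        objectness = row[4]
        # Stage 1: discard anchors unlikely to contain an object.
        if objectness <= object_threshold:
            continue
        class_scores = row[5:]
        class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
        # Stage 2: the final score is the best class score times objectness.
        score = class_scores[class_id] * objectness
        if score > conf_threshold:
            kept.append((row[:4], score, class_id))
    return kept
```

For instance, a row with objectness 0.9 and best class score 0.8 has a final score of about 0.72 and is kept, while a row with objectness 0.6 and best class score 0.3 scores about 0.18 and is dropped by the second stage.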
+
+The boxes, scores and class_ids returned by detect are then visualized on the original image by drawing the position, class and confidence score of each detected object, which gives the final YOLOV7 detection output.
+
+```
+def draw_detections(self, image, boxes, scores, class_ids):
+    for box, score, class_id in zip(boxes, scores, class_ids):
+        # (x, y) is the top-left corner of the box
+        x, y, w, h = box.astype(int)
+
+        # Draw the detection box
+        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 255), thickness=2)
+        label = self.classNames[class_id]
+        label = f'{label} {int(score * 100)}%'
+        labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
+        cv2.putText(image, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), thickness=2)
+    return image
+```
+
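+## Running the sample
+
+1. Install MIGraphX following the installation instructions in the MIGraphX Tutorial and set PYTHONPATH
+
+2. Run the sample
+
+```
+# Enter the migraphx samples project root
+cd <path_to_migraphx_samples> 
+
+# Enter the sample directory
+cd Python/Detector/YOLOV7
+
+# Run the sample
+python detect_migraphx.py --imgpath ./inference/images/bus.jpg --modelpath ./weights/yolov7-tiny.onnx --objectThreshold 0.5 --confThreshold 0.25 --nmsThreshold 0.5
+```
+
+The input arguments can be changed as needed. When the program finishes, the YOLOV7 detection result image is written to the current directory.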
+
+<img src="../Images/YOLOV7_01.jpg" alt="YOLOV7_01" style="zoom:67%;" />
+
--- a/Doc/YOLOV7_01.png
+++ b/Doc/YOLOV7_01.png
--- a/Doc/YoloV7_model.png
+++ b/Doc/YoloV7_model.png
--- a/Doc/YoloV7_suanfa.png
+++ b/Doc/YoloV7_suanfa.png
--- a/Doc/image.gif
+++ b/Doc/image.gif
--- a/README.md
+++ b/README.md
-# Video_Ort
+# YoloV7

-## 目录
- [目录结构](#目录结构)
- [项目介绍](#项目介绍)
- [环境配置](#环境配置)
- [编译运行](#编译运行)
- [参考文档](#参考文档)
- [历史版本](#历史版本)
+## Paper

-## 目录结构
-```
-├── CMakeLists.txt
-├── include
-├── lib
-│   ├── libdecode.so
-│   └── libQueue.so
-├── README.md
-├── Resource
-│   ├── Images
-│   └── Models
-└── src
-    ├── CommonUtils.cpp
-    ├── DetectorYOLOV3.cpp
-    ├── DetectorYOLOV5.cpp
-    ├── DetectorYOLOV7.cpp
-    ├── main.cpp
-    ├── yolov3-tiny.cpp
-    ├── yolov5s.cpp
-    └── yolov7-tiny.cpp
-```
+YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
+
+- https://arxiv.org/pdf/2207.02696.pdf
+
+## Model structure
+
+YOLOV7 is a YOLO-series object detection model introduced in 2022. Its network consists of three parts: input, backbone and head.
+
+<img src="./Doc/YoloV7_model.png" alt="YOLOV7_02" style="zoom:67%;" />
+
+## Algorithm

-## 项目介绍
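+The YOLOv7 authors propose the Extended-ELAN (E-ELAN) structure. E-ELAN keeps a feature aggregation and feature transfer flow similar to ELAN and only changes the computation block, which uses ShuffleNet-like group convolution together with expand and shuffle modules; the features are finally merged in an aggregation module. This approach yields more diverse features while improving the efficiency of parameter computation and utilization.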

-基于CPU解码，onnxruntime框架推理的视频推理范例
+<img src="./Doc/YoloV7_suanfa.png" alt="YOLOV7_suanfa" style="zoom:67%;" />

 ## 环境配置
+### Docker (method 1)

-推荐使用docker方式运行，提供[光源](https://www.sourcefind.cn/#/service-list)拉取的docker镜像
+Pull the image:

-```
+```plaintext
 docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:tvm0.10_ort1.14.0_migraphx3.0.0-dtk23.04
 ```

-## 编译运行
+Create and start the container:
+
+```plaintext
+docker run --shm-size 16g --network=host --name=video_ort --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/video_ort:/home/video_ort -it <Your Image ID> /bin/bash
+```
+
+### Dockerfile (method 2)

-### 编译
+```
+cd ./docker
+docker build --no-cache -t video_ort:test .
+
+docker run --shm-size 16g --network=host --name=video_ort --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/video_ort:/home/video_ort -it <Your Image ID> /bin/bash
+```
+
+## Dataset
+Object detection is run on the provided video files.
+
+## Inference
+
+### Build the project
 ```
 git clone https://developer.hpccube.com/codes/modelzoo/video_ort.git
 cd video_ort
@@ -54,22 +58,37 @@ cmake ../
 make
 ```

-### 运行
-```
-./Video_Ort
+### Run the sample
 ```
-根据提示选择要运行的示例程序，比如执行:
-```
-./Video_Ort 0
+./Video_Onnx
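+# Follow the prompt to choose which sample to run. The decoder-card sample requires the decoder card to be installed and initialized beforehand. For example: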
+
+./Video_Onnx 0
 ```
 运行CPU解码并运行YOLOV3推理示例程序

+## Results
+
+![img](./Doc/image.gif)

-## 参考文档
+### Accuracy
+None

-文档参考Doc目录下说明文档.
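+## Application scenarios
+
+### Algorithm category
+
+`Object detection`
+
+### Key application industries
+
+`Surveillance`, `Transportation`, `Education`, `Chemical industry`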

 ## 源码仓库及问题反馈

-https://developer.hpccube.com/codes/modelzoo/video_ort.git
+http://developer.hpccube.com/codes/modelzoo/video_ort.git
+
+## References
+
+https://github.com/WongKinYiu/yolov7

--- a/docker/Dockerfile
+++ b/docker/Dockerfile
+FROM image.sourcefind.cn:5000/dcu/admin/base/custom:tvm0.10_ort1.14.0_migraphx3.0.0-dtk23.04