Unverified Commit 2b39d7a8 authored by q.yao, committed by GitHub

[Docs] Add zh_cn document of ONNX (#1331)



* add doc-cn of ONNX

* Update docs_zh_CN/deployment/onnxruntime_custom_ops.md
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* update doc of cummax

* fix en doc of softnms

* update heading
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
parent 6f533ff1
## Introduction of onnx module in MMCV (Experimental)

### register_extra_symbolics
Some extra symbolic functions need to be registered before exporting PyTorch model to ONNX.
#### Example
```python
import mmcv
from mmcv.onnx import register_extra_symbolics

opset_version = 11
register_extra_symbolics(opset_version)
```
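Once the extra symbolic functions are registered, the model can be exported as usual with `torch.onnx.export`. Below is a minimal sketch; the toy module and the output file name are only illustrative, not part of MMCV.

```python
import torch

from mmcv.onnx import register_extra_symbolics

opset_version = 11
register_extra_symbolics(opset_version)


class TinyModel(torch.nn.Module):
    """A toy module; in practice, export the model whose ops rely on the patched symbolics."""

    def forward(self, x):
        return torch.nn.functional.interpolate(
            x, scale_factor=2, mode='bilinear', align_corners=False)


model = TinyModel().eval()
dummy_input = torch.rand(1, 3, 32, 32)
torch.onnx.export(model, dummy_input, 'tiny.onnx', opset_version=opset_version)
```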
#### FAQs
- None
## Onnxruntime Custom Ops
<!-- TOC -->
- [Onnxruntime Custom Ops](#onnxruntime-custom-ops)
  - [SoftNMS](#softnms)
    - [Description](#description)
    - [Parameters](#parameters)
    - [Inputs](#inputs)
    - [Outputs](#outputs)
    - [Type Constraints](#type-constraints)
  - [RoIAlign](#roialign)
    - [Description](#description-1)
    - [Parameters](#parameters-1)
    - [Inputs](#inputs-1)
    - [Outputs](#outputs-1)
    - [Type Constraints](#type-constraints-1)
  - [NMS](#nms)
    - [Description](#description-2)
    - [Parameters](#parameters-2)
    - [Inputs](#inputs-2)
    - [Outputs](#outputs-2)
    - [Type Constraints](#type-constraints-2)
  - [grid_sampler](#grid_sampler)
    - [Description](#description-3)
    - [Parameters](#parameters-3)
    - [Inputs](#inputs-3)
    - [Outputs](#outputs-3)
    - [Type Constraints](#type-constraints-3)
  - [CornerPool](#cornerpool)
    - [Description](#description-4)
    - [Parameters](#parameters-4)
    - [Inputs](#inputs-4)
    - [Outputs](#outputs-4)
    - [Type Constraints](#type-constraints-4)
  - [cummax](#cummax)
    - [Description](#description-5)
    - [Parameters](#parameters-5)
    - [Inputs](#inputs-5)
    - [Outputs](#outputs-5)
    - [Type Constraints](#type-constraints-5)
  - [cummin](#cummin)
    - [Description](#description-6)
    - [Parameters](#parameters-6)
    - [Inputs](#inputs-6)
    - [Outputs](#outputs-6)
    - [Type Constraints](#type-constraints-6)
  - [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
    - [Description](#description-7)
    - [Parameters](#parameters-7)
    - [Inputs](#inputs-7)
    - [Outputs](#outputs-7)
    - [Type Constraints](#type-constraints-7)
<!-- TOC -->
### SoftNMS

#### Description
Perform soft NMS on `boxes` with `scores`. Read [Soft-NMS -- Improving Object Detection With One Line of Code](https://arxiv.org/abs/1704.04503) for detail.
#### Parameters
| Type | Parameter | Description |
| ------- | --------------- | -------------------------------------------------------------- |
| `float` | `iou_threshold` | IoU threshold for deciding whether boxes overlap too much. Value range [0, 1]. Default to 0.  |
| `float` | `sigma`         | hyperparameter for the gaussian method                                                         |
| `float` | `min_score`     | score threshold of NMS                                                                         |
| `int` | `method` | method to do the nms, (0: `naive`, 1: `linear`, 2: `gaussian`) |
| `int` | `offset` | `boxes` width or height is (x2 - x1 + offset). (0 or 1) |
#### Inputs
<dl>
<dt><tt>boxes</tt>: T</dt>
<dd>Input boxes. 2-D tensor of shape (N, 4). N is the number of boxes.</dd>
<dt><tt>scores</tt>: T</dt>
<dd>Input scores. 1-D tensor of shape (N, ).</dd>
</dl>
#### Outputs
<dl>
<dt><tt>dets</tt>: T</dt>
<dd>Output boxes and scores. 2-D tensor of shape (num_valid_boxes, 5), [[x1, y1, x2, y2, score], ...]. num_valid_boxes is the number of valid boxes.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>Output indices. 1-D tensor of shape (num_valid_boxes, ).</dd>
</dl>
#### Type Constraints
- T:tensor(float32)
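For reference, here is a small sketch of the eager `mmcv.ops.soft_nms` call that this custom op corresponds to. The keyword names follow the attribute table above; the exact Python signature should be checked against your mmcv version.

```python
import torch
from mmcv.ops import soft_nms

boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [20., 20., 30., 30.]])   # (N, 4)
scores = torch.tensor([0.9, 0.8, 0.7])         # (N, )

# linear decay (method 1); 'gaussian' would correspond to method 2
dets, inds = soft_nms(boxes, scores, iou_threshold=0.3, sigma=0.5,
                      min_score=1e-3, method='linear')
print(dets.shape)   # (num_valid_boxes, 5): [x1, y1, x2, y2, score]
print(inds.shape)   # (num_valid_boxes, )
```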
### RoIAlign

#### Description
Perform RoIAlign on output feature, used in bbox_head of most two-stage detectors.
#### Parameters
| Type | Parameter | Description |
| ------- | ---------------- | ------------------------------------------------------------------------------------------------------------- |
| `int`   | `output_height`  | height of the output roi                                                                                        |
| `int`   | `output_width`   | width of the output roi                                                                                         |
| `float` | `spatial_scale`  | scale factor used to map the input boxes to the feature map                                                     |
| `int`   | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely.                      |
| `str` | `mode` | pooling mode in each bin. `avg` or `max` |
| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
#### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature map; 4-D tensor of shape (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and width of the data.</dd>
<dt><tt>rois</tt>: T</dt>
<dd>RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoIs' coordinates are given in the coordinate system of the input.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>feat</tt>: T</dt>
<dd>RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element feat[r-1] is a pooled feature map corresponding to the r-th RoI RoIs[r-1].</dd>
</dl>
#### Type Constraints
- T:tensor(float32)
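A rough sketch of the corresponding eager computation with the `mmcv.ops.RoIAlign` module; the tensor shapes follow the spec above, while the constructor keywords are assumptions to be checked against your mmcv version.

```python
import torch
from mmcv.ops import RoIAlign

feat = torch.rand(1, 16, 32, 32)               # (N, C, H, W)
rois = torch.tensor([[0., 4., 4., 20., 20.]])  # (num_rois, 5): [batch_index, x1, y1, x2, y2]

roi_layer = RoIAlign(output_size=(7, 7), spatial_scale=1.0,
                     sampling_ratio=0, aligned=True)
pooled = roi_layer(feat, rois)
print(pooled.shape)   # (num_rois, C, output_height, output_width) -> (1, 16, 7, 7)
```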
### NMS

#### Description
Filter out boxes that have a high IoU overlap with previously selected boxes.
#### Parameters
| Type | Parameter | Description |
| ------- | --------------- | ---------------------------------------------------------------------------------------------------------------- |
| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range [0, 1]. Default to 0. |
| `int` | `offset` | 0 or 1, boxes' width or height is (x2 - x1 + offset). |
#### Inputs
<dl>
<dt><tt>bboxes</tt>: T</dt>
<dd>Input boxes. 2-D tensor of shape (num_boxes, 4).</dd>
<dt><tt>scores</tt>: T</dt>
<dd>Input scores. 1-D tensor of shape (num_boxes, ).</dd>
</dl>
#### Outputs
<dl>
<dt><tt>indices</tt>: tensor(int32, Linear)</dt>
<dd>Selected indices. 1-D tensor of shape (num_valid_boxes, ). num_valid_boxes is the number of valid boxes.</dd>
</dl>
#### Type Constraints
- T:tensor(float32)
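A small sketch of the eager `mmcv.ops.nms` call this custom op mirrors; note that the ONNX op itself only returns the selected indices.

```python
import torch
from mmcv.ops import nms

boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [20., 20., 30., 30.]])   # (num_boxes, 4)
scores = torch.tensor([0.9, 0.8, 0.7])         # (num_boxes, )

dets, inds = nms(boxes, scores, iou_threshold=0.5, offset=0)
print(inds)   # indices of the kept boxes, here tensor([0, 2])
```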
### grid_sampler

#### Description
Perform grid sampling from `input` at the pixel locations specified by `grid`.
#### Parameters
| Type | Parameter | Description |
| ----- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `interpolation_mode` | Interpolation mode to calculate output. (0: `bilinear`, 1: `nearest`)                                                                                                                                                                                                              |
| `int` | `padding_mode` | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
| `int` | `align_corners` | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |
#### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, and inH and inW are the height and width of the data.</dd>
<dt><tt>grid</tt>: T</dt>
<dd>Input sampling grid; 4-D tensor of shape (N, outH, outW, 2), where outH and outW are the height and width of the grid and the output.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, C, outH, outW).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
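This custom op mirrors `torch.nn.functional.grid_sample`; a quick eager sketch of the shapes involved (the identity grid used here is only illustrative).

```python
import torch
import torch.nn.functional as F

feat = torch.rand(1, 3, 8, 8)   # (N, C, inH, inW)

# Identity sampling grid in normalized [-1, 1] coordinates, shape (N, outH, outW, 2)
theta = torch.tensor([[[1., 0., 0.],
                       [0., 1., 0.]]])
grid = F.affine_grid(theta, size=(1, 3, 8, 8), align_corners=False)

out = F.grid_sample(feat, grid, mode='bilinear',
                    padding_mode='zeros', align_corners=False)
print(out.shape)   # (N, C, outH, outW) -> (1, 3, 8, 8)
```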
### CornerPool

#### Description
Perform CornerPool on `input` features. Read [CornerNet -- Detecting Objects as Paired Keypoints](https://arxiv.org/abs/1808.01244) for more details.
#### Parameters

| Type  | Parameter | Description                                                       |
| ----- | --------- | ----------------------------------------------------------------- |
| `int` | `mode`    | corner pool mode, (0: `top`, 1: `bottom`, 2: `left`, 3: `right`)  |
#### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input features. 4-D tensor of shape (N, C, H, W). N is the batch size.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output the pooled features. 4-D tensor of shape (N, C, H, W).</dd>
</dl>
#### Type Constraints
- T:tensor(float32)
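A brief sketch using the eager `mmcv.ops.CornerPool` module that this op corresponds to; the string mode here is assumed to map to the integer `mode` attribute above.

```python
import torch
from mmcv.ops import CornerPool

x = torch.rand(2, 8, 16, 16)   # (N, C, H, W)

top_pool = CornerPool('top')   # 'top' corresponds to mode 0 in the table above
y = top_pool(x)
print(y.shape)                 # (N, C, H, W), same shape as the input
```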
### cummax

#### Description
Returns a tuple (`values`, `indices`), where `values` contains the cumulative maximum elements of `input` in the dimension `dim`, and `indices` is the index location of each maximum value found in that dimension. Read [torch.cummax](https://pytorch.org/docs/stable/generated/torch.cummax.html) for more details.
#### Parameters

| Type  | Parameter | Description                            |
| ----- | --------- | -------------------------------------- |
| `int` | `dim`     | the dimension to do the operation over |
#### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>The input tensor, which can have any shape. Empty tensors are also supported.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output the cumulative maximum elements of `input` in the dimension `dim`, with the same shape and type as `input`.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>Output the index location of each cumulative maximum value found in the dimension `dim`, with the same shape as `input`.</dd>
</dl>
#### Type Constraints
- T:tensor(float32)
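The op follows `torch.cummax`; a quick eager check of the two outputs (`torch.cummin`, described next, behaves symmetrically).

```python
import torch

x = torch.tensor([[1., 3., 2.],
                  [4., 0., 5.]])

values, indices = torch.cummax(x, dim=1)
print(values)    # tensor([[1., 3., 3.], [4., 4., 5.]])
print(indices)   # tensor([[0, 1, 1], [0, 0, 2]]), dtype torch.int64
```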
### cummin

#### Description
Returns a tuple (`values`, `indices`), where `values` contains the cumulative minimum elements of `input` in the dimension `dim`, and `indices` is the index location of each minimum value found in that dimension. Read [torch.cummin](https://pytorch.org/docs/stable/generated/torch.cummin.html) for more details.
#### Parameters

| Type  | Parameter | Description                            |
| ----- | --------- | -------------------------------------- |
| `int` | `dim`     | the dimension to do the operation over |
#### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>The input tensor, which can have any shape. Empty tensors are also supported.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output the cumulative minimum elements of `input` in the dimension `dim`, with the same shape and type as `input`.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>Output the index location of each cumulative minimum value found in the dimension `dim`, with the same shape as `input`.</dd>
</dl>
#### Type Constraints
- T:tensor(float32)
### MMCVModulatedDeformConv2d

#### Description
Perform Modulated Deformable Convolution on the input feature map; read [Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline) for details.
#### Parameters

| Type           | Parameter           | Description                                                                             |
| -------------- | ------------------- | --------------------------------------------------------------------------------------- |
| `list of ints` | `stride`            | The stride of the convolving kernel. (sH, sW)                                            |
| `list of ints` | `padding`           | Paddings on both sides of the input. (padH, padW)                                        |
| `list of ints` | `dilation`          | The spacing between kernel elements. (dH, dW)                                            |
| `int` | `deformable_groups` | Groups of deformable offset. |
| `int` | `groups` | Split input into groups. `input_channel` should be divisible by the number of groups. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, and inH and inW are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, deformable_group* 2* kH* kW, outH, outW), where kH and kW are the height and width of the weight, and outH and outW are the height and width of the offset and output.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Input mask; 4-D tensor of shape (N, deformable_group* kH* kW, outH, outW).</dd>
<dt><tt>inputs[3]</tt>: T</dt>
<dd>Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).</dd>
<dt><tt>inputs[4]</tt>: T, optional</dt>
<dd>Input bias; 1-D tensor of shape (output_channel).</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, output_channel, outH, outW).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
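A hedged sketch using `mmcv.ops.ModulatedDeformConv2dPack`, one of the modules that export to this op; it predicts `offset` and `mask` internally with the shapes listed above. The constructor keywords are assumptions following the usual Conv2d-style interface, and an mmcv build with ops is required.

```python
import torch
from mmcv.ops import ModulatedDeformConv2dPack

conv = ModulatedDeformConv2dPack(in_channels=16, out_channels=32,
                                 kernel_size=3, stride=1, padding=1,
                                 deform_groups=1)

x = torch.rand(1, 16, 24, 24)   # (N, C, inH, inW)
y = conv(x)
print(y.shape)                  # (N, output_channel, outH, outW) -> (1, 32, 24, 24)
```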
## Custom operators for ONNX Runtime in MMCV

### Introduction of ONNX Runtime
**ONNX Runtime** is a cross-platform inferencing and training accelerator compatible with many popular ML/DNN frameworks. Check its [github](https://github.com/microsoft/onnxruntime) for more information.
### Introduction of ONNX
**ONNX** stands for **Open Neural Network Exchange**, which acts as an *Intermediate Representation (IR)* for ML/DNN models from many frameworks. Check its [github](https://github.com/onnx/onnx) for more information.
### Why include custom operators for ONNX Runtime in MMCV
- To verify the correctness of exported ONNX models in ONNX Runtime.
- To ease the deployment of ONNX models with custom operators from `mmcv.ops` in ONNX Runtime.
### List of operators for ONNX Runtime supported in MMCV
| Operator | CPU | GPU | MMCV Releases |
| :----------------------------------------------------: | :---: | :---: | :-----------: |
| [SoftNMS](onnxruntime_custom_ops.md#softnms)            | Y     | N     | 1.2.3         |
| [RoIAlign](onnxruntime_custom_ops.md#roialign)          | Y     | N     | 1.2.5         |
| [NMS](onnxruntime_custom_ops.md#nms)                    | Y     | N     | 1.2.7         |
| [grid_sampler](onnxruntime_custom_ops.md#grid_sampler)  | Y     | N     | 1.3.1         |
| [CornerPool](onnxruntime_custom_ops.md#cornerpool)      | Y     | N     | 1.3.4         |
| [cummax](onnxruntime_custom_ops.md#cummax) | Y | N | master |
| [cummin](onnxruntime_custom_ops.md#cummin) | Y | N | master |
### How to build custom operators for ONNX Runtime
*Please note that only the CPU version of **onnxruntime>=1.8.1** on Linux has been tested so far.*
#### Prerequisite
- Clone repository
```bash
git clone https://github.com/open-mmlab/mmcv.git
```

- Download `onnxruntime-linux` from the ONNX Runtime [releases](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1), extract it, expose `ONNXRUNTIME_DIR`, and finally add the lib path to `LD_LIBRARY_PATH` as follows:

```bash
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz

tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
cd onnxruntime-linux-x64-1.8.1
export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
```
#### Build on Linux
```bash
cd mmcv  # to MMCV root directory
MMCV_WITH_OPS=1 MMCV_WITH_ORT=1 python setup.py develop
```
### How to do inference using exported ONNX models with custom operators in ONNX Runtime in Python
Install ONNX Runtime with `pip`
```bash
pip install onnxruntime==1.8.1
```

Inference Demo

```python
import os

import numpy as np
import onnxruntime as ort

from mmcv.ops import get_onnxruntime_op_path

ort_custom_op_path = get_onnxruntime_op_path()
assert os.path.exists(ort_custom_op_path)
session_options = ort.SessionOptions()
session_options.register_custom_ops_library(ort_custom_op_path)
# exported ONNX model with custom operators
onnx_file = 'sample.onnx'
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
sess = ort.InferenceSession(onnx_file, session_options)
onnx_results = sess.run(None, {'input' : input_data})
```
### How to add a new custom operator for ONNX Runtime in MMCV

#### Reminder
- The custom operator is not included in [supported operator list](https://github.com/microsoft/onnxruntime/blob/master/docs/OperatorKernels.md) in ONNX Runtime.
- The custom operator should be able to be exported to ONNX.
#### Main procedures
Take custom operator `soft_nms` for example.
1. Add the header `soft_nms.h` to the ONNX Runtime include directory `mmcv/ops/csrc/onnxruntime/`.
2. Add the source `soft_nms.cpp` to the ONNX Runtime source directory `mmcv/ops/csrc/onnxruntime/cpu/`.
3. Register the `soft_nms` operator in [onnxruntime_register.cpp](../../mmcv/ops/csrc/onnxruntime/cpu/onnxruntime_register.cpp)

    ```c++
    #include "soft_nms.h"

    SoftNmsOp c_SoftNmsOp;

    if (auto status = ortApi->CustomOpDomain_Add(domain, &c_SoftNmsOp)) {
      return status;
    }
    ```

4. Add a unit test in `tests/test_ops/test_onnx.py`; check [here](../../tests/test_ops/test_onnx.py) for examples.
**Finally, welcome to send us PR of adding custom operators for ONNX Runtime in MMCV.** :nerd_face:
### Known Issues
- "RuntimeError: tuple appears in op that does not forward tuples, unsupported kind: `prim::PythonOp`."
  1. Note that `cummax` and `cummin` are generally exportable to ONNX as long as torch >= 1.5.0, since `torch.cummax` is only supported from torch 1.5.0. However, when `cummax` or `cummin` serves as an intermediate component whose outputs are used as inputs to other modules, torch >= 1.7.0 is required; otherwise the above error may arise when running the exported ONNX model with ONNX Runtime. A minimal sketch of this setting is shown below.
  2. Solution: update the torch version to 1.7.0 or higher.
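A minimal sketch (assuming torch >= 1.7.0; the module and file names are only illustrative) of the setting described above, where the tuple returned by `cummax` feeds another op:

```python
import torch

from mmcv.onnx import register_extra_symbolics

register_extra_symbolics(11)


class CumMaxModel(torch.nn.Module):

    def forward(self, x):
        values, indices = torch.cummax(x, dim=1)
        return values + 1, indices   # the tuple output is consumed by a later op


torch.onnx.export(CumMaxModel(), torch.rand(2, 4), 'cummax.onnx',
                  opset_version=11)
```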
### References
- [How to export Pytorch model with custom op to ONNX and run it in ONNX Runtime](https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md)
- [How to add a custom operator/kernel in ONNX Runtime](https://github.com/microsoft/onnxruntime/blob/master/docs/AddingCustomOp.md)
## Introduction of onnx module in MMCV (Experimental)

We welcome anyone interested to help translate the MMCV documentation. If you are interested, please open an issue at [MMCV issue](https://github.com/open-mmlab/mmcv/issues) to claim the document you would like to translate.

### register_extra_symbolics

Some extra symbolic functions need to be registered before exporting a PyTorch model to ONNX.

#### Example
```python
import mmcv
from mmcv.onnx import register_extra_symbolics
opset_version = 11
register_extra_symbolics(opset_version)
```
#### FAQs

- None
## ONNX Runtime Custom Ops

We welcome anyone interested to help translate the MMCV documentation. If you are interested, please open an issue at [MMCV issue](https://github.com/open-mmlab/mmcv/issues) to claim the document you would like to translate.
<!-- TOC -->
- [ONNX Runtime Custom Ops](#onnx-runtime-custom-ops)
  - [SoftNMS](#softnms)
    - [Description](#description)
    - [Parameters](#parameters)
    - [Inputs](#inputs)
    - [Outputs](#outputs)
    - [Type Constraints](#type-constraints)
  - [RoIAlign](#roialign)
    - [Description](#description-1)
    - [Parameters](#parameters-1)
    - [Inputs](#inputs-1)
    - [Outputs](#outputs-1)
    - [Type Constraints](#type-constraints-1)
  - [NMS](#nms)
    - [Description](#description-2)
    - [Parameters](#parameters-2)
    - [Inputs](#inputs-2)
    - [Outputs](#outputs-2)
    - [Type Constraints](#type-constraints-2)
  - [grid_sampler](#grid_sampler)
    - [Description](#description-3)
    - [Parameters](#parameters-3)
    - [Inputs](#inputs-3)
    - [Outputs](#outputs-3)
    - [Type Constraints](#type-constraints-3)
  - [CornerPool](#cornerpool)
    - [Description](#description-4)
    - [Parameters](#parameters-4)
    - [Inputs](#inputs-4)
    - [Outputs](#outputs-4)
    - [Type Constraints](#type-constraints-4)
  - [cummax](#cummax)
    - [Description](#description-5)
    - [Parameters](#parameters-5)
    - [Inputs](#inputs-5)
    - [Outputs](#outputs-5)
    - [Type Constraints](#type-constraints-5)
  - [cummin](#cummin)
    - [Description](#description-6)
    - [Parameters](#parameters-6)
    - [Inputs](#inputs-6)
    - [Outputs](#outputs-6)
    - [Type Constraints](#type-constraints-6)
  - [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
    - [Description](#description-7)
    - [Parameters](#parameters-7)
    - [Inputs](#inputs-7)
    - [Outputs](#outputs-7)
    - [Type Constraints](#type-constraints-7)
<!-- TOC -->
### SoftNMS
#### Description

Perform soft NMS on `boxes` with `scores`. Read [Soft-NMS -- Improving Object Detection With One Line of Code](https://arxiv.org/abs/1704.04503) for details.

#### Parameters

| Type    | Parameter       | Description                                                                                   |
| ------- | --------------- | ---------------------------------------------------------------------------------------------- |
| `float` | `iou_threshold` | IoU threshold for deciding whether boxes overlap too much. Value range [0, 1]. Default to 0.   |
| `float` | `sigma`         | hyperparameter for the gaussian method                                                          |
| `float` | `min_score`     | score threshold of NMS                                                                          |
| `int`   | `method`        | method to do the nms, (0: `naive`, 1: `linear`, 2: `gaussian`)                                  |
| `int`   | `offset`        | `boxes` width or height is (x2 - x1 + offset). (0 or 1)                                         |

#### Inputs

<dl>
<dt><tt>boxes</tt>: T</dt>
<dd>Input boxes. 2-D tensor of shape (N, 4). N is the number of boxes.</dd>
<dt><tt>scores</tt>: T</dt>
<dd>Input scores. 1-D tensor of shape (N, ).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>dets</tt>: T</dt>
<dd>Output boxes and scores. 2-D tensor of shape (num_valid_boxes, 5), [[x1, y1, x2, y2, score], ...]. num_valid_boxes is the number of valid boxes.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>Output indices. 1-D tensor of shape (num_valid_boxes, ).</dd>
</dl>

#### Type Constraints
- T:tensor(float32)
### RoIAlign
#### Description

Perform RoIAlign on the output feature map, typically used in the bbox_head of most two-stage detectors.

#### Parameters

| Type    | Parameter        | Description                                                      |
| ------- | ---------------- | ----------------------------------------------------------------- |
| `int`   | `output_height`  | height of the output roi                                          |
| `int`   | `output_width`   | width of the output roi                                           |
| `float` | `spatial_scale`  | scale factor applied to the input boxes                           |
| `int`   | `sampling_ratio` | sampling ratio of the output. `0` means dense sampling            |
| `str`   | `mode`           | pooling mode in each bin. `avg` or `max`                          |
| `int`   | `aligned`        | if `aligned=1`, pixels are shifted by -0.5 for better alignment   |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature map; 4-D tensor of shape (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and width of the feature map.</dd>
<dt><tt>rois</tt>: T</dt>
<dd>RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoIs' coordinates are given in the coordinate system of the input feature map.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>feat</tt>: T</dt>
<dd>RoI pooled output; 4-D tensor of shape (num_rois, C, output_height, output_width). Each output feature feat[i] corresponds to the input RoI rois[i].</dd>
</dl>

#### Type Constraints
- T:tensor(float32)
### NMS
#### Description

Perform non-maximum suppression on the input boxes according to the IoU threshold.

#### Parameters

| Type    | Parameter       | Description                                                                                   |
| ------- | --------------- | ---------------------------------------------------------------------------------------------- |
| `float` | `iou_threshold` | IoU threshold for deciding whether boxes overlap too much. Value range [0, 1]. Default to 0.   |
| `int`   | `offset`        | `boxes` width or height is (x2 - x1 + offset). (0 or 1)                                         |

#### Inputs

<dl>
<dt><tt>boxes</tt>: T</dt>
<dd>Input boxes. 2-D tensor of shape (N, 4). N is the number of boxes.</dd>
<dt><tt>scores</tt>: T</dt>
<dd>Input scores. 1-D tensor of shape (N, ).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>indices</tt>: tensor(int32, Linear)</dt>
<dd>Indices of the selected boxes. 1-D tensor of shape (num_valid_boxes, ). num_valid_boxes is the number of selected boxes.</dd>
</dl>

#### Type Constraints
- T:tensor(float32)
### grid_sampler
#### Description

Perform grid sampling from `input` at the pixel locations specified by `grid`.

#### Parameters

| Type  | Parameter            | Description                                                                                                                                                                                                                                                                        |
| ----- | -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `interpolation_mode` | Interpolation mode to calculate output. (0: `bilinear`, 1: `nearest`)                                                                                                                                                                                                              |
| `int` | `padding_mode`       | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`)                                                                                                                                                                                                   |
| `int` | `align_corners`      | If `align_corners=1`, the extrema (`-1` and `1`) are treated as referring to the center points of the input's corner pixels. If `align_corners=0`, they are treated as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, and inH and inW are the height and width of the input feature map.</dd>
<dt><tt>grid</tt>: T</dt>
<dd>Input sampling grid; 4-D tensor of shape (N, outH, outW, 2), where outH and outW are the height and width of the output.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, C, outH, outW).</dd>
</dl>

#### Type Constraints
- T:tensor(float32, Linear)
### CornerPool
#### Description

Perform CornerPool on the `input` features. Read [CornerNet -- Detecting Objects as Paired Keypoints](https://arxiv.org/abs/1808.01244) for more details.

#### Parameters

| Type  | Parameter | Description                                                      |
| ----- | --------- | ----------------------------------------------------------------- |
| `int` | `mode`    | corner pool mode, (0: `top`, 1: `bottom`, 2: `left`, 3: `right`)  |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input features; 4-D tensor of shape (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and width of the feature map.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt>: T</dt>
<dd>The pooled features; 4-D tensor of shape (N, C, H, W).</dd>
</dl>

#### Type Constraints
- T:tensor(float32)
### cummax
#### Description

Returns a tuple (`values`, `indices`), where `values` contains the cumulative maximum elements of `input` in the dimension `dim`, and `indices` is the index location of each maximum value found in that dimension. Read [torch.cummax](https://pytorch.org/docs/stable/generated/torch.cummax.html) for more details.

#### Parameters

| Type  | Parameter | Description                            |
| ----- | --------- | -------------------------------------- |
| `int` | `dim`     | the dimension to do the operation over |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>The input tensor, which can have any shape. Empty tensors are also supported.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt>: T</dt>
<dd>The cumulative maximum values of `input` in the dimension `dim`, with the same shape and type as `input`.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>The index location of each maximum value found in the dimension `dim`, with the same shape as `input`.</dd>
</dl>

#### Type Constraints
- T:tensor(float32)
### cummin
#### Description

Returns a tuple (`values`, `indices`), where `values` contains the cumulative minimum elements of `input` in the dimension `dim`, and `indices` is the index location of each minimum value found in that dimension. Read [torch.cummin](https://pytorch.org/docs/stable/generated/torch.cummin.html) for more details.

#### Parameters

| Type  | Parameter | Description                            |
| ----- | --------- | -------------------------------------- |
| `int` | `dim`     | the dimension to do the operation over |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>The input tensor, which can have any shape. Empty tensors are also supported.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt>: T</dt>
<dd>The cumulative minimum values of `input` in the dimension `dim`, with the same shape and type as `input`.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>The index location of each minimum value found in the dimension `dim`, with the same shape as `input`.</dd>
</dl>

#### Type Constraints
- T:tensor(float32)
### MMCVModulatedDeformConv2d
#### Description

Perform Modulated Deformable Convolution on the input feature map. Read [Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline) for more details.

#### Parameters

| Type           | Parameter           | Description                                                                                |
| -------------- | ------------------- | ------------------------------------------------------------------------------------------- |
| `list of ints` | `stride`            | The stride of the convolving kernel. (sH, sW)                                                |
| `list of ints` | `padding`           | Paddings on both sides of the input. (padH, padW)                                            |
| `list of ints` | `dilation`          | The spacing between kernel elements. (dH, dW)                                                |
| `int`          | `deformable_groups` | Groups of deformable offset; usually 1 is enough.                                            |
| `int`          | `groups`            | Split the input into groups; `input_channel` is divided by this number for the computation. |

#### Inputs

<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, and inH and inW are the height and width of the input feature map.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, deformable_group* 2* kH* kW, outH, outW), where kH and kW are the height and width of the weight, and outH and outW are the height and width of the offset and output.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Input mask; 4-D tensor of shape (N, deformable_group* kH* kW, outH, outW).</dd>
<dt><tt>inputs[3]</tt>: T</dt>
<dd>Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).</dd>
<dt><tt>inputs[4]</tt>: T, optional</dt>
<dd>Input bias; 1-D tensor of shape (output_channel).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, output_channel, outH, outW).</dd>
</dl>

#### Type Constraints
- T:tensor(float32, Linear)
## Custom operators for ONNX Runtime in MMCV

We welcome anyone interested to help translate the MMCV documentation. If you are interested, please open an issue at [MMCV issue](https://github.com/open-mmlab/mmcv/issues) to claim the document you would like to translate.

### Introduction of ONNX Runtime

**ONNX Runtime** is a cross-platform inference and training accelerator compatible with many popular ML/DNN frameworks. Check its [github](https://github.com/microsoft/onnxruntime) for more information.

### Introduction of ONNX

**ONNX** stands for **Open Neural Network Exchange**, which acts as an *Intermediate Representation (IR)* for ML/DNN models from many frameworks. Check its [github](https://github.com/onnx/onnx) for more information.

### Why include custom operators for ONNX Runtime in MMCV

- To verify the correctness of exported ONNX models in ONNX Runtime.
- To ease the deployment of ONNX models with custom operators from `mmcv.ops` in ONNX Runtime.

### List of operators for ONNX Runtime supported in MMCV
| Operator                                                                          | CPU   | GPU   | MMCV Releases |
| :-------------------------------------------------------------------------------: | :---: | :---: | :-----------: |
| [SoftNMS](onnxruntime_custom_ops.md#softnms) | Y | N | 1.2.3 |
| [RoIAlign](onnxruntime_custom_ops.md#roialign) | Y | N | 1.2.5 |
| [NMS](onnxruntime_custom_ops.md#nms) | Y | N | 1.2.7 |
| [grid_sampler](onnxruntime_custom_ops.md#grid_sampler) | Y | N | 1.3.1 |
| [CornerPool](onnxruntime_custom_ops.md#cornerpool) | Y | N | 1.3.4 |
| [cummax](onnxruntime_custom_ops.md#cummax) | Y | N | 1.3.4 |
| [cummin](onnxruntime_custom_ops.md#cummin) | Y | N | 1.3.4 |
| [MMCVModulatedDeformConv2d](onnxruntime_custom_ops.md#mmcvmodulateddeformconv2d) | Y | N | 1.3.12 |
### How to build custom operators for ONNX Runtime

*Please note that only the CPU version of **onnxruntime>=1.8.1** on the Linux x86-64 platform has been tested so far.*

#### Prerequisite

- Clone the repository
```bash
git clone https://github.com/open-mmlab/mmcv.git
```
- Download `onnxruntime-linux` from the ONNX Runtime [releases](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1), extract it, create the variable `ONNXRUNTIME_DIR` from its path, and add the lib directory to `LD_LIBRARY_PATH` as follows:
```bash
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
cd onnxruntime-linux-x64-1.8.1
export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
```
#### Build on Linux

```bash
cd mmcv  # to MMCV root directory
MMCV_WITH_OPS=1 MMCV_WITH_ORT=1 python setup.py develop
```
### How to do inference using exported ONNX models with custom operators in ONNX Runtime in Python

Install ONNX Runtime with `pip`
```bash
pip install onnxruntime==1.8.1
```
Inference Demo
```python
import os
import numpy as np
import onnxruntime as ort
from mmcv.ops import get_onnxruntime_op_path
ort_custom_op_path = get_onnxruntime_op_path()
assert os.path.exists(ort_custom_op_path)
session_options = ort.SessionOptions()
session_options.register_custom_ops_library(ort_custom_op_path)
# exported ONNX model with custom operators
onnx_file = 'sample.onnx'
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
sess = ort.InferenceSession(onnx_file, session_options)
onnx_results = sess.run(None, {'input' : input_data})
```
### How to add a new custom operator for ONNX Runtime in MMCV

#### Reminder

- The custom operator is not included in the [supported operator list](https://github.com/microsoft/onnxruntime/blob/master/docs/OperatorKernels.md) of ONNX Runtime.
- The custom operator should be able to be exported to ONNX.

#### Main procedures

Take the custom operator `soft_nms` as an example:

1. Add the header `soft_nms.h` to the ONNX Runtime include directory `mmcv/ops/csrc/onnxruntime/`.
2. Add the source `soft_nms.cpp` to the ONNX Runtime source directory `mmcv/ops/csrc/onnxruntime/cpu/`.
3. Register the `soft_nms` operator in [onnxruntime_register.cpp](../../mmcv/ops/csrc/onnxruntime/cpu/onnxruntime_register.cpp)
```c++
#include "soft_nms.h"
SoftNmsOp c_SoftNmsOp;
if (auto status = ortApi->CustomOpDomain_Add(domain, &c_SoftNmsOp)) {
return status;
}
```
4. Add a unit test in `tests/test_ops/test_onnx.py`; see [here](../../tests/test_ops/test_onnx.py) for examples.

**Finally, you are welcome to contribute custom operators for ONNX Runtime to MMCV.** :nerd_face:

### Known Issues
- "RuntimeError: tuple appears in op that does not forward tuples, unsupported kind: `prim::PythonOp`."
  1. Note that the `cummax` and `cummin` operators were added in torch >= 1.5.0, but they can only be exported correctly with torch >= 1.7.0; otherwise the above error occurs at export time.
  2. Solution: upgrade PyTorch to 1.7.0 or higher.

### References
- [How to export Pytorch model with custom op to ONNX and run it in ONNX Runtime](https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md)
- [How to add a custom operator/kernel in ONNX Runtime](https://github.com/microsoft/onnxruntime/blob/master/docs/AddingCustomOp.md)