## ONNX Runtime Custom Ops

<!-- TOC -->

- [ONNX Runtime Custom Ops](#onnx-runtime-custom-ops)
  - [SoftNMS](#softnms)
    - [Description](#description)
    - [Parameters](#parameters)
    - [Inputs](#inputs)
    - [Outputs](#outputs)
    - [Type Constraints](#type-constraints)
  - [RoIAlign](#roialign)
    - [Description](#description-1)
    - [Parameters](#parameters-1)
    - [Inputs](#inputs-1)
    - [Outputs](#outputs-1)
    - [Type Constraints](#type-constraints-1)
  - [NMS](#nms)
    - [Description](#description-2)
    - [Parameters](#parameters-2)
    - [Inputs](#inputs-2)
    - [Outputs](#outputs-2)
    - [Type Constraints](#type-constraints-2)
  - [grid_sampler](#grid_sampler)
    - [Description](#description-3)
    - [Parameters](#parameters-3)
    - [Inputs](#inputs-3)
    - [Outputs](#outputs-3)
    - [Type Constraints](#type-constraints-3)
  - [CornerPool](#cornerpool)
    - [Description](#description-4)
    - [Parameters](#parameters-4)
    - [Inputs](#inputs-4)
    - [Outputs](#outputs-4)
    - [Type Constraints](#type-constraints-4)
  - [cummax](#cummax)
    - [Description](#description-5)
    - [Parameters](#parameters-5)
    - [Inputs](#inputs-5)
    - [Outputs](#outputs-5)
    - [Type Constraints](#type-constraints-5)
  - [cummin](#cummin)
    - [Description](#description-6)
    - [Parameters](#parameters-6)
    - [Inputs](#inputs-6)
    - [Outputs](#outputs-6)
    - [Type Constraints](#type-constraints-6)
  - [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
    - [Description](#description-7)
    - [Parameters](#parameters-7)
    - [Inputs](#inputs-7)
    - [Outputs](#outputs-7)
    - [Type Constraints](#type-constraints-7)
  - [MMCVDeformConv2d](#mmcvdeformconv2d)
    - [Description](#description-8)
    - [Parameters](#parameters-8)
    - [Inputs](#inputs-8)
    - [Outputs](#outputs-8)
    - [Type Constraints](#type-constraints-8)

<!-- TOC -->

### SoftNMS

#### Description

Perform soft NMS on `boxes` with `scores`. Read [Soft-NMS -- Improving Object Detection With One Line of Code](https://arxiv.org/abs/1704.04503) for details.

#### Parameters

| Type    | Parameter       | Description                                                    |
|---------|-----------------|----------------------------------------------------------------|
| `float` | `iou_threshold` | IoU threshold for NMS                                          |
| `float` | `sigma`         | hyperparameter for the `gaussian` method                       |
| `float` | `min_score`     | score filter threshold                                         |
| `int`   | `method`        | NMS method to use (0: `naive`, 1: `linear`, 2: `gaussian`)     |
| `int`   | `offset`        | 0 or 1. The width or height of a box is (x2 - x1 + offset).    |

#### Inputs

<dl>
<dt><tt>boxes</tt>: T</dt>
<dd>Input boxes. 2-D tensor of shape (N, 4). N is the number of boxes.</dd>
<dt><tt>scores</tt>: T</dt>
<dd>Input scores. 1-D tensor of shape (N, ).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>dets</tt>: T</dt>
<dd>Output boxes and scores. 2-D tensor of shape (num_valid_boxes, 5), [[x1, y1, x2, y2, score], ...]. num_valid_boxes is the number of valid boxes.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>Output indices. 1-D tensor of shape (num_valid_boxes, ).</dd>
</dl>
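
As a rough illustration of the inputs, parameters, and outputs described above, here is a minimal NumPy sketch of Soft-NMS. It is not the ONNX Runtime kernel. The names `pairwise_iou` and `soft_nms_reference`, the default values, the initial `min_score` pre-filter, and the exact decay rules (taken from the Soft-NMS paper) are assumptions made for illustration.

```python
import numpy as np


def pairwise_iou(box, others, offset=0):
    """IoU between one box (4,) and M boxes (M, 4), all as x1, y1, x2, y2."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1 + offset, 0, None) * np.clip(y2 - y1 + offset, 0, None)
    area = (box[2] - box[0] + offset) * (box[3] - box[1] + offset)
    areas = (others[:, 2] - others[:, 0] + offset) * (others[:, 3] - others[:, 1] + offset)
    return inter / (area + areas - inter)


def soft_nms_reference(boxes, scores, iou_threshold=0.3, sigma=0.5,
                       min_score=1e-3, method=1, offset=0):
    """Hypothetical reference: returns (dets, indices) shaped like the op outputs."""
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.array(scores, dtype=np.float32)  # copy, because scores get decayed
    remaining = np.flatnonzero(scores >= min_score)  # assumption: pre-filter low scores
    dets, indices = [], []
    while remaining.size > 0:
        top = remaining[np.argmax(scores[remaining])]
        dets.append(np.append(boxes[top], scores[top]))
        indices.append(top)
        remaining = remaining[remaining != top]
        if remaining.size == 0:
            break
        iou = pairwise_iou(boxes[top], boxes[remaining], offset)
        if method == 1:    # linear decay above the threshold
            weight = np.where(iou > iou_threshold, 1.0 - iou, 1.0)
        elif method == 2:  # gaussian decay, controlled by sigma
            weight = np.exp(-(iou ** 2) / sigma)
        else:              # naive: behaves like hard NMS
            weight = np.where(iou > iou_threshold, 0.0, 1.0)
        scores[remaining] *= weight
        remaining = remaining[scores[remaining] >= min_score]
    dets = np.stack(dets) if dets else np.zeros((0, 5), dtype=np.float32)
    return dets.astype(np.float32), np.asarray(indices, dtype=np.int64)


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
dets, indices = soft_nms_reference(boxes, scores, method=1)
```

In this example the second box overlaps the first heavily, so with the `linear` method its score is decayed and it ends up ranked last instead of being dropped.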

#### Type Constraints

- T:tensor(float32)

### RoIAlign

#### Description

Perform RoIAlign on the output feature map. This op is used in the bbox_head of most two-stage detectors.

#### Parameters

| Type    | Parameter        | Description                                                                                                   |
|---------|------------------|---------------------------------------------------------------------------------------------------------------|
| `int`   | `output_height`  | height of output roi                                                                                          |
| `int`   | `output_width`   | width of output roi                                                                                           |
| `float` | `spatial_scale`  | used to scale the input boxes                                                                                 |
| `int`   | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
| `str`   | `mode`           | pooling mode in each bin. `avg` or `max`                                                                      |
| `int`   | `aligned`        | If `aligned=0`, use the legacy implementation in MMDetection. Otherwise, align the results more precisely.   |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature map; 4-D tensor of shape (N, C, H, W), where N is the batch size, C is the number of channels, H and W are the height and width of the data.</dd>
<dt><tt>rois</tt>: T</dt>
<dd>RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoI coordinates are in the coordinate system of the input.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>feat</tt>: T</dt>
<dd>RoI pooled output; 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element feat[r-1] is a pooled feature map corresponding to the r-th RoI rois[r-1].</dd>
</dl>
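
To make the roles of `spatial_scale`, `sampling_ratio`, and `aligned` more concrete, the sketch below computes where the sampling points of one RoI would fall on the feature map. It follows the commonly used Detectron-style RoIAlign formulation, which is assumed here to match this op. The function name `roi_sample_points` is hypothetical, and the `max(..., 1)` guard on the sampling grid is a defensive addition of this sketch.

```python
import numpy as np


def roi_sample_points(roi, spatial_scale, output_height, output_width,
                      sampling_ratio=0, aligned=1):
    """Return the (ys, xs) sampling coordinates on the feature map for one RoI.

    `roi` is [batch_index, x1, y1, x2, y2] in input-image coordinates.
    """
    _, x1, y1, x2, y2 = (float(v) for v in roi)
    shift = 0.5 if aligned else 0.0  # aligned mode shifts by half a pixel
    start_w = x1 * spatial_scale - shift
    start_h = y1 * spatial_scale - shift
    roi_w = x2 * spatial_scale - shift - start_w
    roi_h = y2 * spatial_scale - shift - start_h
    if not aligned:
        # Legacy behaviour: force RoIs to be at least 1x1.
        roi_w = max(roi_w, 1.0)
        roi_h = max(roi_h, 1.0)
    bin_h = roi_h / output_height
    bin_w = roi_w / output_width
    # sampling_ratio == 0: choose the sample count adaptively from the bin size.
    grid_h = sampling_ratio if sampling_ratio > 0 else max(int(np.ceil(bin_h)), 1)
    grid_w = sampling_ratio if sampling_ratio > 0 else max(int(np.ceil(bin_w)), 1)
    ys = start_h + bin_h * (np.arange(output_height)[:, None]
                            + (np.arange(grid_h) + 0.5) / grid_h)
    xs = start_w + bin_w * (np.arange(output_width)[:, None]
                            + (np.arange(grid_w) + 0.5) / grid_w)
    return ys.ravel(), xs.ravel()


roi = [0, 0, 0, 8, 8]
ys_new, xs_new = roi_sample_points(roi, 0.5, 2, 2, sampling_ratio=2, aligned=1)
ys_old, xs_old = roi_sample_points(roi, 0.5, 2, 2, sampling_ratio=2, aligned=0)
```

Under these assumptions the only difference between the two calls is the half-pixel shift: the `aligned=0` coordinates are the `aligned=1` coordinates plus 0.5. Each output bin would then pool (`avg` or `max`) the bilinearly interpolated values at its sample points.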

#### Type Constraints

- T:tensor(float32)

### NMS

#### Description

Filter out boxes that have a high IoU overlap with previously selected boxes.

#### Parameters

| Type    | Parameter       | Description                                                                                                       |
|---------|-----------------|-------------------------------------------------------------------------------------------------------------------|
| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range [0, 1]. Defaults to 0. |
| `int`   | `offset`        | 0 or 1. The width or height of a box is (x2 - x1 + offset).                                                       |

#### Inputs

<dl>
<dt><tt>bboxes</tt>: T</dt>
<dd>Input boxes. 2-D tensor of shape (num_boxes, 4). num_boxes is the number of input boxes.</dd>
<dt><tt>scores</tt>: T</dt>
<dd>Input scores. 1-D tensor of shape (num_boxes, ).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>indices</tt>: tensor(int32, Linear)</dt>
<dd>Selected indices. 1-D tensor of shape (num_valid_boxes, ). num_valid_boxes is the number of valid boxes.</dd>
</dl>
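
The greedy selection described above can be sketched in NumPy as follows. This is an illustrative reference under stated assumptions, not the shipped kernel: the name `nms_reference`, the default threshold, and the use of `int64` for the returned indices are choices made for this sketch only.

```python
import numpy as np


def nms_reference(bboxes, scores, iou_threshold=0.5, offset=0):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much, repeat."""
    bboxes = np.asarray(bboxes, dtype=np.float32).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float32)
    x1, y1, x2, y2 = bboxes.T
    areas = (x2 - x1 + offset) * (y2 - y1 + offset)
    order = np.argsort(-scores, kind="stable")  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        w = np.clip(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + offset, 0, None)
        h = np.clip(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + offset, 0, None)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]  # survivors for the next round
    # Index dtype chosen for convenience in this sketch.
    return np.asarray(keep, dtype=np.int64)


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
kept = nms_reference(boxes, scores, iou_threshold=0.5)
```

Note that `offset` changes the computed IoU: for the first two boxes it is 81/119 with `offset=0` and 100/142 with `offset=1`, so a threshold of 0.7 keeps the second box in the first case and suppresses it in the second.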

#### Type Constraints

- T:tensor(float32)

### grid_sampler

#### Description

Sample from `input` at the pixel locations given by `grid`.

#### Parameters

| Type  | Parameter            | Description |
|-------|----------------------|-------------|
| `int` | `interpolation_mode` | Interpolation mode to calculate output values. (0: `bilinear`, 1: `nearest`) |
| `int` | `padding_mode`       | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
| `int` | `align_corners`      | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>grid</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, outH, outW, 2), where outH and outW are the height and width of the offset and output.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, C, outH, outW).</dd>
</dl>
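
The effect of `align_corners` is easiest to see in the mapping from normalized grid values in [-1, 1] to pixel coordinates. The sketch below uses the mapping documented for PyTorch's `grid_sample`, which this op is assumed to mirror, and applies it in a small `nearest` sampler with `zeros` padding. The function names are hypothetical, and the `bilinear`, `border`, and `reflection` options are not covered.

```python
import numpy as np


def unnormalize(coord, size, align_corners):
    """Map normalized coordinates in [-1, 1] to pixel coordinates."""
    if align_corners:
        # -1 and 1 refer to the centers of the corner pixels.
        return (coord + 1.0) / 2.0 * (size - 1)
    # -1 and 1 refer to the outer edges of the corner pixels.
    return ((coord + 1.0) * size - 1.0) / 2.0


def grid_sample_nearest(inp, grid, align_corners=0):
    """Nearest-neighbour sampling with zeros padding (illustrative only)."""
    n, c, in_h, in_w = inp.shape
    _, out_h, out_w, _ = grid.shape
    # grid[..., 0] holds x (width) and grid[..., 1] holds y (height).
    x = np.rint(unnormalize(grid[..., 0], in_w, align_corners)).astype(np.int64)
    y = np.rint(unnormalize(grid[..., 1], in_h, align_corners)).astype(np.int64)
    valid = (x >= 0) & (x < in_w) & (y >= 0) & (y < in_h)
    xc = np.clip(x, 0, in_w - 1)
    yc = np.clip(y, 0, in_h - 1)
    out = np.zeros((n, c, out_h, out_w), dtype=inp.dtype)
    for b in range(n):
        sampled = inp[b][:, yc[b], xc[b]]        # shape (C, outH, outW)
        out[b] = np.where(valid[b], sampled, 0)  # zeros outside the input
    return out


inp = np.arange(16, dtype=np.float32).reshape(1, 1, 4, 4)
lin = np.linspace(-1.0, 1.0, 4)
ys, xs = np.meshgrid(lin, lin, indexing="ij")
identity_grid = np.stack([xs, ys], axis=-1)[None].astype(np.float32)
out = grid_sample_nearest(inp, identity_grid, align_corners=1)
```

With `align_corners=1`, an evenly spaced grid from -1 to 1 with as many points as the input has pixels lands exactly on the pixel centers, so the sampler reproduces the input.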

#### Type Constraints

- T:tensor(float32, Linear)

### CornerPool

#### Description

Perform CornerPool on `input` features. Read [CornerNet -- Detecting Objects as Paired Keypoints](https://arxiv.org/abs/1808.01244) for more details.

#### Parameters

| Type  | Parameter | Description                                                      |
|-------|-----------|------------------------------------------------------------------|
| `int` | `mode`    | corner pool mode, (0: `top`, 1: `bottom`, 2: `left`, 3: `right`) |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input features. 4-D tensor of shape (N, C, H, W). N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output the pooled features. 4-D tensor of shape (N, C, H, W).</dd>
</dl>
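
One way to picture the four modes is as a running maximum along one spatial axis, scanned in one of two directions. The NumPy sketch below follows the CornerNet definition (for example, top pooling takes at each position the maximum over that position and everything below it). The name `corner_pool_reference` and the mode-to-axis table are assumptions of this sketch.

```python
import numpy as np

# mode -> (axis to scan, whether to scan in reverse); assumed mapping
_MODES = {
    0: (2, True),   # top: scan the height axis from bottom to top
    1: (2, False),  # bottom: scan the height axis from top to bottom
    2: (3, True),   # left: scan the width axis from right to left
    3: (3, False),  # right: scan the width axis from left to right
}


def corner_pool_reference(x, mode):
    """Running maximum over an (N, C, H, W) array for the given corner pool mode."""
    axis, reverse = _MODES[mode]
    if reverse:
        x = np.flip(x, axis=axis)
    out = np.maximum.accumulate(x, axis=axis)
    if reverse:
        out = np.flip(out, axis=axis)
    return out


feat = np.array([[1.0, 3.0], [2.0, 0.0]], dtype=np.float32).reshape(1, 1, 2, 2)
top = corner_pool_reference(feat, 0)
right = corner_pool_reference(feat, 3)
```

The output keeps the input shape (N, C, H, W), which matches the output description above.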

#### Type Constraints

- T:tensor(float32)

### cummax

#### Description

Returns a tuple (`values`, `indices`), where `values` is the cumulative maximum of the elements of `input` along the dimension `dim` and `indices` is the index location of each maximum value found along `dim`. Read [torch.cummax](https://pytorch.org/docs/stable/generated/torch.cummax.html) for more details.

#### Parameters

| Type  | Parameter | Description                            |
|-------|-----------|----------------------------------------|
| `int` | `dim`     | the dimension to do the operation over |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>The input tensor, which can have any shape. Empty tensors (with no elements) are also supported.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output the cumulative maximum elements of `input` in the dimension `dim`, with the same shape and dtype as `input`.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>Output the index location of each cumulative maximum value found in the dimension `dim`, with the same shape as `input`.</dd>
</dl>
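
A small NumPy sketch of the two outputs may help. The function name `cummax_reference` is hypothetical, and the tie-breaking rule (a later element that equals the running maximum takes over the index) is an assumption of this sketch rather than something stated above.

```python
import numpy as np


def cummax_reference(x, dim):
    """Return (values, indices) of the running maximum of `x` along `dim`."""
    x = np.asarray(x, dtype=np.float32)
    moved = np.moveaxis(x, dim, 0)  # bring `dim` to the front
    values = np.empty_like(moved)
    indices = np.zeros(moved.shape, dtype=np.int64)
    for i in range(moved.shape[0]):  # no iterations for empty input
        if i == 0:
            values[0] = moved[0]
            continue
        newer = moved[i] >= values[i - 1]  # assumption: ties pick the later index
        values[i] = np.where(newer, moved[i], values[i - 1])
        indices[i] = np.where(newer, i, indices[i - 1])
    return np.moveaxis(values, 0, dim), np.moveaxis(indices, 0, dim)


values, indices = cummax_reference([1, 3, 2, 5, 4], dim=0)
values_2d, indices_2d = cummax_reference([[1, 3, 2], [4, 0, 5]], dim=1)
empty_values, empty_indices = cummax_reference(np.zeros((0, 3)), dim=0)
```

Both outputs have the same shape as the input, including the empty case, where the loop body never runs.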

#### Type Constraints

- T:tensor(float32)

### cummin

#### Description

Returns a tuple (`values`, `indices`), where `values` is the cumulative minimum of the elements of `input` along the dimension `dim` and `indices` is the index location of each minimum value found along `dim`. Read [torch.cummin](https://pytorch.org/docs/stable/generated/torch.cummin.html) for more details.

#### Parameters

| Type  | Parameter | Description                            |
|-------|-----------|----------------------------------------|
| `int` | `dim`     | the dimension to do the operation over |

#### Inputs

<dl>
<dt><tt>input</tt>: T</dt>
<dd>The input tensor, which can have any shape. Empty tensors (with no elements) are also supported.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output the cumulative minimum elements of `input` in the dimension `dim`, with the same shape and dtype as `input`.</dd>
<dt><tt>indices</tt>: tensor(int64)</dt>
<dd>Output the index location of each cumulative minimum value found in the dimension `dim`, with the same shape as `input`.</dd>
</dl>

#### Type Constraints

- T:tensor(float32)

### MMCVModulatedDeformConv2d

#### Description

Perform Modulated Deformable Convolution on the input feature. Read [Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline) for details.

#### Parameters

| Type           | Parameter           | Description                                                                           |
|----------------|---------------------|---------------------------------------------------------------------------------------|
| `list of ints` | `stride`            | The stride of the convolving kernel. (sH, sW)                                         |
| `list of ints` | `padding`           | Paddings on both sides of the input. (padH, padW)                                     |
| `list of ints` | `dilation`          | The spacing between kernel elements. (dH, dW)                                         |
| `int`          | `deformable_groups` | Groups of deformable offset.                                                          |
| `int`          | `groups`            | Split input into groups. `input_channel` should be divisible by the number of groups. |

#### Inputs

<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, deformable_group * 2 * kH * kW, outH, outW), where kH and kW are the height and width of the weight, and outH and outW are the height and width of the offset and output.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Input mask; 4-D tensor of shape (N, deformable_group * kH * kW, outH, outW), where kH and kW are the height and width of the weight, and outH and outW are the height and width of the offset and output.</dd>
<dt><tt>inputs[3]</tt>: T</dt>
<dd>Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).</dd>
<dt><tt>inputs[4]</tt>: T, optional</dt>
<dd>Input bias; 1-D tensor of shape (output_channel).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, output_channel, outH, outW).</dd>
</dl>
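
The shapes listed above are tied together by the usual convolution output-size formula. The helper below is a sketch that derives the expected offset, mask, and output shapes from the input and weight shapes. The name `modulated_deform_conv_shapes` is hypothetical, and it assumes the standard rule `out = (in + 2 * pad - dilation * (k - 1) - 1) // stride + 1`.

```python
def modulated_deform_conv_shapes(input_shape, weight_shape, stride, padding,
                                 dilation, deformable_groups=1):
    """Expected tensor shapes for one call, following the descriptions above."""
    n, _, in_h, in_w = input_shape
    out_c, _, k_h, k_w = weight_shape
    # Standard convolution output size (assumed to apply to this op).
    out_h = (in_h + 2 * padding[0] - dilation[0] * (k_h - 1) - 1) // stride[0] + 1
    out_w = (in_w + 2 * padding[1] - dilation[1] * (k_w - 1) - 1) // stride[1] + 1
    return {
        # two offset values (one per spatial axis) for every kernel element
        "offset": (n, deformable_groups * 2 * k_h * k_w, out_h, out_w),
        # one modulation scalar for every kernel element
        "mask": (n, deformable_groups * k_h * k_w, out_h, out_w),
        "output": (n, out_c, out_h, out_w),
    }


shapes = modulated_deform_conv_shapes(
    input_shape=(2, 16, 32, 32),
    weight_shape=(8, 16, 3, 3),
    stride=(2, 2),
    padding=(1, 1),
    dilation=(1, 1),
    deformable_groups=1,
)
```

A check like this can be useful when exporting a model, because a mismatch between the offset or mask channel count and the kernel size is an easy mistake to make.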

#### Type Constraints

- T:tensor(float32, Linear)

### MMCVDeformConv2d

#### Description

Perform Deformable Convolution on the input feature. Read [Deformable Convolutional Network](https://arxiv.org/abs/1703.06211) for details.

#### Parameters

| Type           | Parameter          | Description                                                                                                                                              |
|----------------|--------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| `list of ints` | `stride`           | The stride of the convolving kernel. (sH, sW)                                                                                                            |
| `list of ints` | `padding`          | Paddings on both sides of the input. (padH, padW)                                                                                                        |
| `list of ints` | `dilation`         | The spacing between kernel elements. (dH, dW)                                                                                                            |
| `int`          | `deformable_group` | Groups of deformable offset.                                                                                                                             |
| `int`          | `group`            | Split input into groups. `input_channel` should be divisible by the number of groups.                                                                    |
| `int`          | `im2col_step`      | DeformableConv2d uses im2col to compute the convolution. `im2col_step` is used to split the input and offset, reducing the memory usage of the column buffer. |

#### Inputs

<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, deformable_group * 2 * kH * kW, outH, outW), where kH and kW are the height and width of the weight, and outH and outW are the height and width of the offset and output.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, output_channel, outH, outW).</dd>
</dl>
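
To illustrate how `im2col_step` can reduce memory, the sketch below splits a batch into chunks and estimates the size of the im2col column buffer for one chunk. The helper names are hypothetical. Capping the step at the batch size and requiring the batch size to be divisible by it are assumptions of this sketch, not guarantees about the kernel.

```python
def im2col_chunks(batch_size, im2col_step):
    """Split a batch into [start, end) ranges of at most `im2col_step` samples."""
    step = min(batch_size, im2col_step)  # assumption: the step is capped at N
    if batch_size % step != 0:
        raise ValueError("batch_size must be divisible by the effective im2col_step")
    return [(start, start + step) for start in range(0, batch_size, step)]


def column_buffer_elems(step, channels, k_h, k_w, out_h, out_w):
    """Element count of a column buffer holding `step` samples at once."""
    return channels * k_h * k_w * step * out_h * out_w


chunks = im2col_chunks(batch_size=8, im2col_step=4)
elems_per_chunk = column_buffer_elems(4, channels=16, k_h=3, k_w=3, out_h=16, out_w=16)
elems_whole_batch = column_buffer_elems(8, channels=16, k_h=3, k_w=3, out_h=16, out_w=16)
```

In this model the column buffer grows linearly with the step, so processing the batch in two chunks halves its peak size compared with unfolding all eight samples at once.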

#### Type Constraints

- T:tensor(float32, Linear)