"pcdet/models/backbones_2d/base_bev_backbone.py" did not exist on "3fdecc873991c7696a55eb518225b2bd85cbaac2"
Commit 4353fa59 authored by limm's avatar limm
Browse files

add part code

parents
Pipeline #2807 canceled with stages
## TensorRT Ops
<!-- TOC -->
- [TensorRT Ops](#tensorrt-ops)
- [TRTBatchedNMS](#trtbatchednms)
- [Description](#description)
- [Parameters](#parameters)
- [Inputs](#inputs)
- [Outputs](#outputs)
- [Type Constraints](#type-constraints)
- [grid_sampler](#grid_sampler)
- [Description](#description-1)
- [Parameters](#parameters-1)
- [Inputs](#inputs-1)
- [Outputs](#outputs-1)
- [Type Constraints](#type-constraints-1)
- [MMCVInstanceNormalization](#mmcvinstancenormalization)
- [Description](#description-2)
- [Parameters](#parameters-2)
- [Inputs](#inputs-2)
- [Outputs](#outputs-2)
- [Type Constraints](#type-constraints-2)
- [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
- [Description](#description-3)
- [Parameters](#parameters-3)
- [Inputs](#inputs-3)
- [Outputs](#outputs-3)
- [Type Constraints](#type-constraints-3)
- [MMCVMultiLevelRoiAlign](#mmcvmultilevelroialign)
- [Description](#description-4)
- [Parameters](#parameters-4)
- [Inputs](#inputs-4)
- [Outputs](#outputs-4)
- [Type Constraints](#type-constraints-4)
- [MMCVRoIAlign](#mmcvroialign)
- [Description](#description-5)
- [Parameters](#parameters-5)
- [Inputs](#inputs-5)
- [Outputs](#outputs-5)
- [Type Constraints](#type-constraints-5)
- [ScatterND](#scatternd)
- [Description](#description-6)
- [Parameters](#parameters-6)
- [Inputs](#inputs-6)
- [Outputs](#outputs-6)
- [Type Constraints](#type-constraints-6)
- [TRTBatchedRotatedNMS](#trtbatchedrotatednms)
- [Description](#description-7)
- [Parameters](#parameters-7)
- [Inputs](#inputs-7)
- [Outputs](#outputs-7)
- [Type Constraints](#type-constraints-7)
- [GridPriorsTRT](#gridpriorstrt)
- [Description](#description-8)
- [Parameters](#parameters-8)
- [Inputs](#inputs-8)
- [Outputs](#outputs-8)
- [Type Constraints](#type-constraints-8)
- [ScaledDotProductAttentionTRT](#scaleddotproductattentiontrt)
- [Description](#description-9)
- [Parameters](#parameters-9)
- [Inputs](#inputs-9)
- [Outputs](#outputs-9)
- [Type Constraints](#type-constraints-9)
- [GatherTopk](#gathertopk)
- [Description](#description-10)
- [Parameters](#parameters-10)
- [Inputs](#inputs-10)
- [Outputs](#outputs-10)
- [Type Constraints](#type-constraints-10)
- [MMCVMultiScaleDeformableAttention](#mmcvmultiscaledeformableattention)
- [Description](#description-11)
- [Parameters](#parameters-11)
- [Inputs](#inputs-11)
- [Outputs](#outputs-11)
- [Type Constraints](#type-constraints-11)
<!-- TOC -->
### TRTBatchedNMS
#### Description
Batched NMS with a fixed number of output bounding boxes.
#### Parameters
| Type | Parameter | Description |
| ------- | --------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `background_label_id` | The label ID for the background class. If there is no background class, set it to `-1`. |
| `int` | `num_classes` | The number of classes. |
| `int` | `topK` | The number of bounding boxes to be fed into the NMS step. |
| `int` | `keepTopK` | The number of total bounding boxes to be kept per-image after the NMS step. Should be less than or equal to the `topK` value. |
| `float` | `scoreThreshold` | The scalar threshold for score (low scoring boxes are removed). |
| `float` | `iouThreshold` | The scalar threshold for IoU (new boxes that have high IoU overlap with previously selected boxes are removed). |
| `int` | `isNormalized` | Set to `false` if the box coordinates are not normalized, meaning they are not in the range `[0,1]`. Defaults to `true`. |
| `int` | `clipBoxes` | Forcibly restrict bounding boxes to the normalized range `[0,1]`. Only applicable if `isNormalized` is also `true`. Defaults to `true`. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>boxes; 4-D tensor of shape (N, num_boxes, num_classes, 4), where N is the batch size; `num_boxes` is the number of boxes; `num_classes` is the number of classes, which could be 1 if the boxes are shared between all classes.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>scores; 4-D tensor of shape (N, num_boxes, 1, num_classes). </dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>dets; 3-D tensor of shape (N, valid_num_boxes, 5), `valid_num_boxes` is the number of boxes after NMS. For each row `dets[i,j,:] = [x0, y0, x1, y1, score]`</dd>
<dt><tt>outputs[1]</tt>: tensor(int32, Linear)</dt>
<dd>labels; 2-D tensor of shape (N, valid_num_boxes). </dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
### grid_sampler
#### Description
Perform sampling of `input` at the pixel locations given by `grid`.
#### Parameters
| Type | Parameter | Description |
| ----- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `interpolation_mode` | Interpolation mode to calculate output values. (0: `bilinear` , 1: `nearest`) |
| `int` | `padding_mode` | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
| `int` | `align_corners` | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input grid; 4-D tensor of shape (N, outH, outW, 2) containing normalized sampling locations, where outH and outW are the height and width of the grid and the output. </dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, C, outH, outW).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
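The semantics match PyTorch's `torch.nn.functional.grid_sample`; a minimal reference sketch, with shapes and parameter values chosen purely for illustration:

```python
import torch
import torch.nn.functional as F

feat = torch.rand(1, 3, 8, 8)            # (N, C, inH, inW)
grid = torch.rand(1, 4, 4, 2) * 2 - 1    # (N, outH, outW, 2), normalized to [-1, 1]

# interpolation_mode=0 -> 'bilinear', padding_mode=0 -> 'zeros', align_corners=0 -> False
out = F.grid_sample(feat, grid, mode='bilinear', padding_mode='zeros',
                    align_corners=False)
print(out.shape)  # torch.Size([1, 3, 4, 4])
```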
### MMCVInstanceNormalization
#### Description
Carry out instance normalization as described in the paper https://arxiv.org/abs/1607.08022.
y = scale * (x - mean) / sqrt(variance + epsilon) + B, where mean and variance are computed per instance per channel.
#### Parameters
| Type | Parameter | Description |
| ------- | --------- | -------------------------------------------------------------------- |
| `float` | `epsilon` | The epsilon value to use to avoid division by zero. Default is 1e-05 |
#### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
<dt><tt>scale</tt>: T</dt>
<dd>The input 1-dimensional scale tensor of size C.</dd>
<dt><tt>B</tt>: T</dt>
<dd>The input 1-dimensional bias tensor of size C.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>output</tt>: T</dt>
<dd>The output tensor of the same shape as input.</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
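As a sanity check, the formula above can be reproduced with plain tensor ops and compared against PyTorch; a minimal sketch with illustrative shapes:

```python
import torch
import torch.nn.functional as F

epsilon = 1e-05
x = torch.rand(2, 3, 8, 8)   # (N, C, H, W)
scale = torch.rand(3)        # per-channel scale
B = torch.rand(3)            # per-channel bias

# mean and variance are computed per instance, per channel (over H, W)
mean = x.mean(dim=(2, 3), keepdim=True)
var = x.var(dim=(2, 3), unbiased=False, keepdim=True)
y = scale.view(1, -1, 1, 1) * (x - mean) / torch.sqrt(var + epsilon) + B.view(1, -1, 1, 1)

ref = F.instance_norm(x, weight=scale, bias=B, eps=epsilon)
print(torch.allclose(y, ref, atol=1e-5))  # True
```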
### MMCVModulatedDeformConv2d
#### Description
Perform Modulated Deformable Convolution on input feature. Read [Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline) for detail.
#### Parameters
| Type | Parameter | Description |
| -------------- | ------------------ | ------------------------------------------------------------------------------------- |
| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
| `int` | `deformable_group` | Groups of deformable offset. |
| `int` | `group` | Split input into groups. `input_channel` should be divisible by the number of groups. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, deformable_group * 2 * kH * kW, outH, outW), where kH and kW are the height and width of weight, outH and outW are the height and width of offset and output.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Input mask; 4-D tensor of shape (N, deformable_group * kH * kW, outH, outW), where kH and kW are the height and width of weight, outH and outW are the height and width of offset and output.</dd>
<dt><tt>inputs[3]</tt>: T</dt>
<dd>Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).</dd>
<dt><tt>inputs[4]</tt>: T, optional</dt>
<dd>Input bias; 1-D tensor of shape (output_channel).</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, output_channel, outH, outW).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
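To make the shape requirements above concrete, the expected offset/mask shapes can be derived from the parameters; a small sketch with illustrative values, not tied to any particular model:

```python
# Illustrative values.
N, C, inH, inW = 1, 16, 32, 32
output_channel, kH, kW = 32, 3, 3
stride, padding, dilation = (1, 1), (1, 1), (1, 1)
deformable_group = 1

# Standard convolution output-size formula.
outH = (inH + 2 * padding[0] - dilation[0] * (kH - 1) - 1) // stride[0] + 1
outW = (inW + 2 * padding[1] - dilation[1] * (kW - 1) - 1) // stride[1] + 1

offset_shape = (N, deformable_group * 2 * kH * kW, outH, outW)  # inputs[1]
mask_shape = (N, deformable_group * kH * kW, outH, outW)        # inputs[2]
weight_shape = (output_channel, C, kH, kW)                      # inputs[3]
print(offset_shape, mask_shape, weight_shape)
```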
### MMCVMultiLevelRoiAlign
#### Description
Perform RoIAlign on features from multiple levels. Used in bbox_head of most two-stage detectors.
#### Parameters
| Type | Parameter | Description |
| ---------------- | ------------------ | ------------------------------------------------------------------------------------------------------------- |
| `int` | `output_height` | height of output roi. |
| `int` | `output_width` | width of output roi. |
| `list of floats` | `featmap_strides` | feature map stride of each level. |
| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
| `float` | `roi_scale_factor` | RoIs will be scaled by this factor before RoI Align. |
| `int` | `finest_scale` | Scale threshold of mapping to level 0. Default: 56. |
| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...].</dd>
<dt><tt>inputs[1~]</tt>: T</dt>
<dd>Input feature maps; 4-D tensors of shape (N, C, H, W), where N is the batch size, C is the number of channels, H and W are the height and width of the data.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element output[0][r-1] is a pooled feature map corresponding to the r-th RoI inputs[0][r-1].</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
### MMCVRoIAlign
#### Description
Perform RoIAlign on output feature, used in bbox_head of most two-stage detectors.
#### Parameters
| Type | Parameter | Description |
| ------- | ---------------- | ------------------------------------------------------------------------------------------------------------- |
| `int` | `output_height` | height of output roi |
| `int` | `output_width` | width of output roi |
| `float` | `spatial_scale` | used to scale the input boxes |
| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
| `str` | `mode` | pooling mode in each bin. `avg` or `max` |
| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature map; 4-D tensor of shape (N, C, H, W), where N is the batch size, C is the number of channels, H and W are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoIs' coordinates are the coordinate system of inputs[0].</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element output[0][r-1] is a pooled feature map corresponding to the r-th RoI inputs[1][r-1].</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
### ScatterND
#### Description
ScatterND takes three inputs `data` tensor of rank r >= 1, `indices` tensor of rank q >= 1, and `updates` tensor of rank q + r - indices.shape[-1] - 1. The output of the operation is produced by creating a copy of the input `data`, and then updating its value to values specified by updates at specific index positions specified by `indices`. Its output shape is the same as the shape of `data`. Note that `indices` should not have duplicate entries. That is, two or more updates for the same index-location is not supported.
The `output` is calculated via the following equation:
```python
output = np.copy(data)
update_indices = indices.shape[:-1]
for idx in np.ndindex(update_indices):
    output[indices[idx]] = updates[idx]
```
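A tiny worked example of the update rule above (values are illustrative):

```python
import numpy as np

data = np.zeros((4, 3), dtype=np.float32)       # rank r = 2
indices = np.array([[1], [3]], dtype=np.int32)  # rank q = 2, indices.shape[-1] = 1
updates = np.ones((2, 3), dtype=np.float32)     # rank q + r - indices.shape[-1] - 1 = 2

output = np.copy(data)
for idx in np.ndindex(indices.shape[:-1]):
    # each index vector selects a position along the first indices.shape[-1] dimensions
    output[tuple(indices[idx])] = updates[idx]
print(output)  # rows 1 and 3 are now all ones, the rest stay zero
```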
#### Parameters
None
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Tensor of rank r>=1.</dd>
<dt><tt>inputs[1]</tt>: tensor(int32, Linear)</dt>
<dd>Tensor of rank q>=1.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Tensor of rank q + r - indices_shape[-1] - 1.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Tensor of rank r >= 1.</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear), tensor(int32, Linear)
### TRTBatchedRotatedNMS
#### Description
Batched rotated NMS with a fixed number of output bounding boxes.
#### Parameters
| Type | Parameter | Description |
| ------- | --------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `background_label_id` | The label ID for the background class. If there is no background class, set it to `-1`. |
| `int` | `num_classes` | The number of classes. |
| `int` | `topK` | The number of bounding boxes to be fed into the NMS step. |
| `int` | `keepTopK` | The number of total bounding boxes to be kept per-image after the NMS step. Should be less than or equal to the `topK` value. |
| `float` | `scoreThreshold` | The scalar threshold for score (low scoring boxes are removed). |
| `float` | `iouThreshold` | The scalar threshold for IoU (new boxes that have high IoU overlap with previously selected boxes are removed). |
| `int` | `isNormalized` | Set to `false` if the box coordinates are not normalized, meaning they are not in the range `[0,1]`. Defaults to `true`. |
| `int` | `clipBoxes` | Forcibly restrict bounding boxes to the normalized range `[0,1]`. Only applicable if `isNormalized` is also `true`. Defaults to `true`. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>boxes; 4-D tensor of shape (N, num_boxes, num_classes, 5), where N is the batch size; `num_boxes` is the number of boxes; `num_classes` is the number of classes, which could be 1 if the boxes are shared between all classes.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>scores; 4-D tensor of shape (N, num_boxes, 1, num_classes). </dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>dets; 3-D tensor of shape (N, valid_num_boxes, 6), `valid_num_boxes` is the number of boxes after NMS. For each row `dets[i,j,:] = [x0, y0, width, height, theta, score]`</dd>
<dt><tt>outputs[1]</tt>: tensor(int32, Linear)</dt>
<dd>labels; 2-D tensor of shape (N, valid_num_boxes). </dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
### GridPriorsTRT
#### Description
Generate the anchors for object detection task.
#### Parameters
| Type | Parameter | Description |
| ----- | ---------- | --------------------------------- |
| `int` | `stride_w` | The stride of the feature width. |
| `int` | `stride_h` | The stride of the feature height. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>The base anchors; 2-D tensor with shape [num_base_anchor, 4].</dd>
<dt><tt>inputs[1]</tt>: TAny</dt>
<dd>height provider; 1-D tensor with shape [featmap_height]. The data will never be used.</dd>
<dt><tt>inputs[2]</tt>: TAny</dt>
<dd>width provider; 1-D tensor with shape [featmap_width]. The data will never be used.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>output anchors; 2-D tensor of shape (num_base_anchor*featmap_height*featmap_width, 4).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
- TAny: Any
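Conceptually, the output is every base anchor shifted to every feature-map location; a rough numpy sketch of these semantics (not the plugin implementation):

```python
import numpy as np

stride_w, stride_h = 8, 8
base_anchors = np.array([[-8., -8., 8., 8.]])   # (num_base_anchor, 4), illustrative
featmap_height, featmap_width = 4, 4

shift_x = np.arange(featmap_width) * stride_w
shift_y = np.arange(featmap_height) * stride_h
xx, yy = np.meshgrid(shift_x, shift_y)
shifts = np.stack([xx.ravel(), yy.ravel(), xx.ravel(), yy.ravel()], axis=1)

# every base anchor shifted to every grid location
anchors = (base_anchors[None, :, :] + shifts[:, None, :]).reshape(-1, 4)
print(anchors.shape)  # (num_base_anchor * featmap_height * featmap_width, 4)
```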
### ScaledDotProductAttentionTRT
#### Description
Dot-product attention used to support multi-head attention. Read [Attention Is All You Need](https://arxiv.org/abs/1706.03762?context=cs) for more details.
#### Parameters
None
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>query; 3-D tensor with shape [batch_size, sequence_length, embedding_size].</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>key; 3-D tensor with shape [batch_size, sequence_length, embedding_size].</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>value; 3-D tensor with shape [batch_size, sequence_length, embedding_size].</dd>
<dt><tt>inputs[3]</tt>: T</dt>
<dd>mask; 2-D/3-D tensor with shape [sequence_length, sequence_length] or [batch_size, sequence_length, sequence_length]. optional.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>3-D tensor of shape [batch_size, sequence_length, embedding_size]. `softmax(q@k.T)@v`</dd>
<dt><tt>outputs[1]</tt>: T</dt>
<dd>3-D tensor of shape [batch_size, sequence_length, sequence_length]. `softmax(q@k.T)`</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
### GatherTopk
#### Description
TensorRT 8.2~8.4 may give unexpected results for a multi-index gather such as:
```python
data[batch_index, bbox_index, ...]
```
Read [this](https://github.com/NVIDIA/TensorRT/issues/2299) for more details.
#### Parameters
None
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Tensor to be gathered, with shape (A0, ..., An, G0, C0, ...).</dd>
<dt><tt>inputs[1]</tt>: tensor(int32, Linear)</dt>
<dd>Index tensor with shape (A0, ..., An, G1).</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output tensor with shape (A0, ..., An, G1, C0, ...).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear), tensor(int32, Linear)
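A numpy sketch of the intended batched gather, assuming a single leading batch dimension (A0 = N):

```python
import numpy as np

N, G0, C0, G1 = 2, 10, 4, 3
data = np.random.rand(N, G0, C0).astype(np.float32)          # (A0, G0, C0)
index = np.random.randint(0, G0, (N, G1)).astype(np.int32)   # (A0, G1)

# per-batch gather: output[i, j, :] = data[i, index[i, j], :]
output = np.take_along_axis(data, index[..., None], axis=1)
print(output.shape)  # (2, 3, 4) -> (A0, G1, C0)
```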
### MMCVMultiScaleDeformableAttention
#### Description
Perform attention computation over a small set of key sampling points around a reference point rather than looking over all possible spatial locations. Read [Deformable DETR: Deformable Transformers for End-to-End Object Detection](https://arxiv.org/abs/2010.04159) for detail.
#### Parameters
None
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, S, M, D), where N is the batch size, S is the length of feature maps, M is the number of attention heads, and D is hidden_dim.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input spatial shapes; 2-D tensor of shape (L, 2), where L is the number of feature maps and each row is the spatial shape (H, W) of one feature map.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Input level start index; 1-D tensor of shape (L, ), used to locate the start of each feature level in the flattened input feature tensor.</dd>
<dt><tt>inputs[3]</tt>: T</dt>
<dd>Input sampling locations; 6-D tensor of shape (N, Lq, M, L, P, 2), where Lq is the length of feature maps (encoder) / length of queries (decoder) and P is the number of points.</dd>
<dt><tt>inputs[4]</tt>: T, optional</dt>
<dd>Input attention weights; 5-D tensor of shape (N, Lq, M, L, P).</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 3-D tensor of shape (N, Lq, M*D).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
# How to add test units for backend ops
This tutorial introduces how to add unit tests for backend ops. When you add a custom op under `backend_ops`, you need to add the corresponding unit test. Unit tests for ops are included in `tests/test_ops/test_ops.py`.
## Prerequisite
- `Compile new ops`: After adding a new custom op, you need to recompile the relevant backend; refer to [build.md](../01-how-to-build/build_from_source.md).
## 1. Add the test program test_XXXX()
You can put unit tests for ops in `tests/test_ops/`. Usually, the following template can be used for your custom op.
### example of ops unit test
```python
@pytest.mark.parametrize('backend', [TEST_TENSORRT, TEST_ONNXRT])  # 1.1 backend test class
@pytest.mark.parametrize('pool_h,pool_w,spatial_scale,sampling_ratio',  # 1.2 set parameters of op
                         [(2, 2, 1.0, 2), (4, 4, 2.0, 4)])  # [(# Examples of op test parameters),...]
def test_roi_align(backend,
                   pool_h,  # set parameters of op
                   pool_w,
                   spatial_scale,
                   sampling_ratio,
                   input_list=None,
                   save_dir=None):
    backend.check_env()

    if input_list is None:
        input = torch.rand(1, 1, 16, 16, dtype=torch.float32)  # 1.3 op input data initialization
        single_roi = torch.tensor([[0, 0, 0, 4, 4]], dtype=torch.float32)
    else:
        input = torch.tensor(input_list[0], dtype=torch.float32)
        single_roi = torch.tensor(input_list[1], dtype=torch.float32)

    from mmcv.ops import roi_align

    def wrapped_function(torch_input, torch_rois):  # 1.4 initialize op model to be tested
        return roi_align(torch_input, torch_rois, (pool_w, pool_h),
                         spatial_scale, sampling_ratio, 'avg', True)

    wrapped_model = WrapFunction(wrapped_function).eval()

    with RewriterContext(cfg={}, backend=backend.backend_name, opset=11):  # 1.5 call the backend test class interface
        backend.run_and_validate(
            wrapped_model, [input, single_roi],
            'roi_align',
            input_names=['input', 'rois'],
            output_names=['roi_feat'],
            save_dir=save_dir)
```
### 1.1 backend test class
We provide some functions and classes for different backends, such as `TestOnnxRTExporter`, `TestTensorRTExporter`, `TestNCNNExporter`.
### 1.2 set parameters of op
Set some parameters of the op, such as `pool_h`, `pool_w`, `spatial_scale` and `sampling_ratio` in `roi_align`. You can set multiple groups of parameters to test the op.
### 1.3 op input data initialization
Initialize the required input data.
### 1.4 initialize op model to be tested
The model containing the custom op usually comes in one of two forms:
- `torch model`: A torch model with custom operators. Python code related to the op is required; refer to the `roi_align` unit test.
- `onnx model`: An ONNX model with custom operators, built by calling the ONNX API directly; refer to the `multi_level_roi_align` unit test. A minimal sketch of this approach is shown below.
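For the ONNX-model form, the graph can be assembled directly with `onnx.helper`. Below is a minimal, hypothetical sketch; the op type, domain, attributes and shapes are illustrative, not the actual test code:

```python
import onnx
from onnx import TensorProto, helper

# A single node using a custom-domain op; names and shapes are illustrative.
node = helper.make_node(
    'MMCVRoIAlign',
    inputs=['feat', 'rois'],
    outputs=['roi_feat'],
    domain='mmdeploy',
    output_height=2, output_width=2,
    spatial_scale=1.0, sampling_ratio=2)

graph = helper.make_graph(
    [node], 'test_graph',
    inputs=[
        helper.make_tensor_value_info('feat', TensorProto.FLOAT, [1, 1, 16, 16]),
        helper.make_tensor_value_info('rois', TensorProto.FLOAT, [1, 5])
    ],
    outputs=[
        helper.make_tensor_value_info('roi_feat', TensorProto.FLOAT, [1, 1, 2, 2])
    ])

model = helper.make_model(graph, opset_imports=[helper.make_opsetid('', 11)])
onnx.save(model, 'custom_op.onnx')
```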
### 1.5 call the backend test class interface
Call the backend test class `run_and_validate` to run and verify the result output by the op on the backend.
```python
def run_and_validate(self,
                     model,
                     input_list,
                     model_name='tmp',
                     tolerate_small_mismatch=False,
                     do_constant_folding=True,
                     dynamic_axes=None,
                     output_names=None,
                     input_names=None,
                     expected_result=None,
                     save_dir=None):
```
#### Parameter Description
- `model`: Input model to be tested and it can be torch model or any other backend model.
- `input_list`: List of test data, which is mapped to the order of input_names.
- `model_name`: The name of the model.
- `tolerate_small_mismatch`: Whether to allow small errors in the verification of results.
- `do_constant_folding`: Whether to use constant folding to optimize the model.
- `dynamic_axes`: If you need to use dynamic dimensions, enter the dimension information.
- `output_names`: The node name of the output node.
- `input_names`: The node name of the input node.
- `expected_result`: Expected ground truth values for verification.
- `save_dir`: The folder used to save the output files.
## 2. Test Methods
Use pytest to call the test function to test ops.
```bash
pytest tests/test_ops/test_ops.py::test_XXXX
```
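Standard pytest filters also apply, which can be handy when an op has many parameterized cases; for example:

```bash
# run only the roi_align related tests and print each case
pytest tests/test_ops/test_ops.py -k roi_align -v
```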
# mmdeploy Architecture
This article mainly introduces the functions of each directory of mmdeploy and how it works from model conversion to real inference.
## Take a general look at the directory structure
The entire mmdeploy can be seen as two independent parts: model conversion and SDK.
We introduce the entire repo's directory structure and functions; there is no need to study the source code, just form an impression.
Peripheral directory features:
```bash
$ cd /path/to/mmdeploy
$ tree -L 1
.
├── CMakeLists.txt # Compile custom operator and cmake configuration of SDK
├── configs # Algorithm library configuration for model conversion
├── csrc # SDK and custom operator
├── demo # FFI interface examples in various languages, such as csharp, java, python, etc.
├── docker # docker build
├── mmdeploy # python package for model conversion
├── requirements # python requirements
├── service # Some embedded boards do not support Python, so we use a client/server mode for model conversion; this is the server code
├── tests # unittest
├── third_party # 3rd party dependencies required by SDK and FFI
└── tools # Tools are also the entrance to all functions, such as onnx2xx.py, profiler.py, test.py, etc.
```
It should be clear that:
- Model conversion mainly depends on `tools`, `mmdeploy` and a small part of the `csrc` directory;
- The SDK consists of three directories: `csrc`, `third_party` and `demo`.
## Model Conversion
Here we take ViT of mmpretrain as model example, and take ncnn as inference backend example. Other models and inferences are similar.
Let's take a look at the mmdeploy/mmdeploy directory structure and get an impression:
```bash
.
├── apis # The api used by tools is implemented here, such as onnx2ncnn.py
│   ├── calibration.py # Collect calibration data (dedicated to TensorRT quantization)
│   ├── core # Software infrastructure
│   ├── extract_model.py # Use it to export part of onnx
│   ├── inference.py # Abstract function, which will actually call torch/ncnn specific inference
│   ├── ncnn # ncnn Wrapper
│   └── visualize.py # Still an abstract function, which will actually call torch/ncnn specific inference and visualize
..
├── backend # Backend wrapper
│   ├── base # Because there are multiple backends, there must be an OO design for the base class
│   ├── ncnn # This calls the ncnn python interface for model conversion
│   │   ├── init_plugins.py # Find the path of ncnn custom operators and ncnn tools
│   │   ├── onnx2ncnn.py # Wrap `mmdeploy_onnx2ncnn` into a python interface
│   │   ├── quant.py # Wrap `ncnn2int8` as a python interface
│   │   └── wrapper.py # Wrap pyncnn forward API
..
├── codebase # Algorithm rewriter
│   ├── base # There are multiple algorithms here that we need a bit of OO design
│   ├── mmpretrain # mmpretrain related model rewrite
│   │   ├── deploy # mmpretrain implementation of base abstract task/model/codebase
│   │   └── models # Real model rewrite
│   │       ├── backbones # Rewrites of backbone network parts, such as multiheadattention
│   │       ├── heads # Such as MultiLabelClsHead
│   │       └── necks # Such as GlobalAveragePooling
..
├── core # Software infrastructure of rewrite mechanism
├── mmcv # Rewrite mmcv
├── pytorch # Rewrite pytorch operator for ncnn, such as Gemm
..
```
Each line above needs to be read, don't skip it.
When running `tools/deploy.py` to convert ViT, three things happen:
1. Rewrite of the mmpretrain ViT forward
2. ncnn does not support the `gather` operator, so we customize it and load it with libncnn.so
3. Run the exported ncnn model with real inference, render the output, and make sure the result is correct
### 1. Rewrite `forward`
When exporting ViT to onnx, some operators are generated that ncnn doesn't support perfectly. mmdeploy's solution is to hijack the forward code and change it so that the exported onnx is suitable for ncnn.
For example, rewrite the process of `conv -> shape -> concat_const -> reshape` to `conv -> reshape` to trim off the redundant `shape` and `concat` operators.
All mmpretrain algorithm rewriters are in the `mmdeploy/codebase/mmpretrain/models` directory.
### 2. Custom Operator
Operators customized for ncnn are in the `csrc/mmdeploy/backend_ops/ncnn/` directory, and are loaded together with `libncnn.so` after compilation. In essence they are hotfixes for ncnn, which currently implement these operators:
- topk
- tensorslice
- shape
- gather
- expand
- constantofshape
### 3. Model Conversion and testing
We first use the modified `mmdeploy_onnx2ncnn` to convert the model, then run inference with `pyncnn` and the custom ops.
When encountering a framework such as snpe that does not support Python well, we use the C/S mode: wrap a server with a protocol such as gRPC and forward the real inference output.
For rendering, mmdeploy directly uses the rendering API of the upstream algorithm codebase.
## SDK
After the model conversion is completed, the SDK, written in C++, can be used to run models on different platforms.
Let's take a look at the csrc/mmdeploy directory structure:
```bash
.
├── apis # csharp, java, go, Rust and other FFI interfaces
├── backend_ops # Custom operators for each inference framework
├── CMakeLists.txt
├── codebase # Result types preferred by each algorithm framework, e.g. bbox is widely used for the detection task
├── core # Abstraction of graph, operator, device and so on
├── device # Implementation of CPU/GPU device abstraction
├── execution # Implementation of the execution abstraction
├── graph # Implementation of graph abstraction
├── model # Implement both zip-compressed and uncompressed work directory
├── net # Implementation of net, such as wrap ncnn forward C API
├── preprocess # Implement preprocess
└── utils # OCV tools
```
The essence of the SDK is to design a set of abstractions for the computational graph and to combine **multiple models'**
- preprocess
- inference
- postprocess
while providing FFI in multiple languages at the same time.
# How to get partitioned ONNX models
MMDeploy supports exporting PyTorch models to partitioned onnx models. With this feature, users can define their partition policy and get partitioned onnx models at ease. In this tutorial, we will briefly introduce how to support partitioning a model step by step. In the example, we break the YOLOV3 model into two parts and extract the first part, without the post-processing (such as anchor generation and NMS), into the onnx model.
## Step 1: Mark inputs/outputs
To support the model partition, we need to add `Mark` nodes in the ONNX model. This could be done with mmdeploy's `@mark` decorator. Note that to make the `mark` work, the marking operation should be included in a rewriting function.
At first, we would mark the model input, which could be done by marking the input tensor `img` in the `forward` method of `BaseDetector` class, which is the parent class of all detector classes. Thus we name this marking point as `detector_forward` and mark the inputs as `input`. Since there could be three outputs for detectors such as `Mask RCNN`, the outputs are marked as `dets`, `labels`, and `masks`. The following code shows the idea of adding mark functions and calling the mark functions in the rewrite. For source code, you could refer to [mmdeploy/codebase/mmdet/models/detectors/single_stage.py](https://github.com/open-mmlab/mmdeploy/blob/4fc8828af84281b62be143012cd9f9dafd1e7cc2/mmdeploy/codebase/mmdet/models/detectors/single_stage.py)
```python
from mmdeploy.core import FUNCTION_REWRITER, mark


@mark(
    'detector_forward', inputs=['input'], outputs=['dets', 'labels', 'masks'])
def __forward_impl(self, img, img_metas=None, **kwargs):
    ...


@FUNCTION_REWRITER.register_rewriter(
    'mmdet.models.detectors.base.BaseDetector.forward')
def base_detector__forward(self, img, img_metas=None, **kwargs):
    ...
    # call the mark function
    return __forward_impl(...)
```
Then, we have to mark the output feature of `YOLOV3Head`, which is the input argument `pred_maps` in the `get_bboxes` method of the `YOLOV3Head` class. We could add an internal function to only mark the `pred_maps` inside the [`yolov3_head__get_bboxes`](https://github.com/open-mmlab/mmdeploy/blob/4fc8828af84281b62be143012cd9f9dafd1e7cc2/mmdeploy/codebase/mmdet/models/dense_heads/yolo_head.py#L16) function as follows.
```python
from mmdeploy.core import FUNCTION_REWRITER, mark


@FUNCTION_REWRITER.register_rewriter(
    func_name='mmdet.models.dense_heads.YOLOV3Head.get_bboxes')
def yolov3_head__get_bboxes(self,
                            pred_maps,
                            img_metas,
                            cfg=None,
                            rescale=False,
                            with_nms=True):
    # mark pred_maps
    @mark('yolo_head', inputs=['pred_maps'])
    def __mark_pred_maps(pred_maps):
        return pred_maps

    pred_maps = __mark_pred_maps(pred_maps)
    ...
```
Note that `pred_maps` is a list of `Tensor` and it has three elements. Thus, three `Mark` nodes with op name as `pred_maps.0`, `pred_maps.1`, `pred_maps.2` would be added in the onnx model.
## Step 2: Add partition config
After marking necessary nodes that would be used to split the model, we could add a deployment config file `configs/mmdet/detection/yolov3_partition_onnxruntime_static.py`. If you are not familiar with how to write config, you could check [write_config.md](../02-how-to-run/write_config.md).
In the config file, we need to add `partition_config`. The key part is `partition_cfg`, which contains a list of dicts that designate the start nodes and end nodes of each model segment. Since we only want to keep `YOLOV3` without post-processing, we could set the `start` as `['detector_forward:input']` and `end` as `['yolo_head:input']`. Note that `start` and `end` can have multiple marks.
```python
_base_ = ['./detection_onnxruntime_static.py']

onnx_config = dict(input_shape=[608, 608])

partition_config = dict(
    type='yolov3_partition',  # the partition policy name
    apply_marks=True,  # should always be set to True
    partition_cfg=[
        dict(
            save_file='yolov3.onnx',  # filename to save the partitioned onnx model
            start=['detector_forward:input'],  # [mark_name:input/output, ...]
            end=['yolo_head:input'],  # [mark_name:input/output, ...]
            output_names=[f'pred_maps.{i}' for i in range(3)])  # output names
    ])
```
## Step 3: Get partitioned onnx models
Once we have marks of nodes and the deployment config with `partition_config` being set properly, we could use the [tool](../02-how-to-run/useful_tools.md) `torch2onnx` to export the model to onnx and get the partitioned onnx files.
```shell
python tools/torch2onnx.py \
configs/mmdet/detection/yolov3_partition_onnxruntime_static.py \
../mmdetection/configs/yolo/yolov3_d53_8xb8-ms-608-273e_coco.py \
https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-608_273e_coco/yolov3_d53_mstrain-608_273e_coco_20210518_115020-a2c3acb8.pth \
../mmdetection/demo/demo.jpg \
--work-dir ./work-dirs/mmdet/yolov3/ort/partition
```
After running the script above, we will have the partitioned onnx file `yolov3.onnx` in the `work-dir`. You can use the visualization tool [netron](https://netron.app/) to check the model structure.
With the partitioned onnx file, you could refer to [useful_tools.md](../02-how-to-run/useful_tools.md) to do the following procedures such as `mmdeploy_onnx2ncnn`, `onnx2tensorrt`.
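To double-check the result, the partitioned file can also be loaded with ONNX Runtime and its inputs/outputs inspected; a small sketch, assuming the paths from the example above:

```python
import onnxruntime as ort

sess = ort.InferenceSession(
    './work-dirs/mmdet/yolov3/ort/partition/yolov3.onnx',
    providers=['CPUExecutionProvider'])

# expect the marked input and the three pred_maps outputs
print([i.name for i in sess.get_inputs()])   # e.g. ['input']
print([o.name for o in sess.get_outputs()])  # e.g. ['pred_maps.0', 'pred_maps.1', 'pred_maps.2']
```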
# How to do regression test
This tutorial describes how to do regression tests. The deployment configuration file contains codebase config and inference config.
### 1. Python Environment
```shell
pip install -r requirements/tests.txt
```
If pip throws an exception, try upgrading numpy.
```shell
pip install -U numpy
```
## 2. Usage
```shell
python ./tools/regression_test.py \
--codebase "${CODEBASE_NAME}" \
--backends "${BACKEND}" \
[--models "${MODELS}"] \
--work-dir "${WORK_DIR}" \
--device "${DEVICE}" \
--log-level INFO \
[--performance | -p] \
[--checkpoint-dir "$CHECKPOINT_DIR"]
```
### Description
- `--codebase` : The codebase to test, e.g. `mmdet`. If you want to test multiple codebases, use `mmpretrain mmdet ...`
- `--backends` : The backend to test. By default, all `backend`s would be tested. You can use `onnxruntime tensorrt` to choose several backends. If you also need to test the SDK, you need to configure the `sdk_config` in `tests/regression/${codebase}.yml`.
- `--models` : Specify the model to be tested. All models in `yml` are tested by default. You can also give some model names. For the model name, please refer to the relevant yml configuration file. For example `ResNet SE-ResNet "Mask R-CNN"`. Model name can only contain numbers and letters.
- `--work-dir` : The directory for model conversion results and reports; `../mmdeploy_regression_working_dir` is used by default.
- `--checkpoint-dir`: The path of downloaded torch model, use `../mmdeploy_checkpoints` by default.
- `--device` : device type, use `cuda` by default
- `--log-level` : These options are available:`'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. The default value is `INFO`.
- `-p` or `--performance` : Whether to test precision. If not enabled, only model conversion is tested.
### Notes
For Windows user:
1. To use the `&&` connector in shell commands, you need to download `PowerShell 7 Preview 5+`.
2. If you are using conda env, you may need to change `python3` to `python` in regression_test.py because there is `python3.exe` in `%USERPROFILE%\AppData\Local\Microsoft\WindowsApps` directory.
## Example
1. Test all backends of mmdet and mmpose for **model convert and precision**
```shell
python ./tools/regression_test.py \
--codebase mmdet mmpose \
--work-dir "../mmdeploy_regression_working_dir" \
--device "cuda" \
--log-level INFO \
--performance
```
2. Test **model convert and precision** of some backends of mmdet and mmpose
```shell
python ./tools/regression_test.py \
--codebase mmdet mmpose \
--backends onnxruntime tensorrt \
--work-dir "../mmdeploy_regression_working_dir" \
--device "cuda" \
--log-level INFO \
-p
```
3. Test some backends of mmdet and mmpose, **only test model convert**
```shell
python ./tools/regression_test.py \
--codebase mmdet mmpose \
--backends onnxruntime tensorrt \
--work-dir "../mmdeploy_regression_working_dir" \
--device "cuda" \
--log-level INFO
```
4. Test some models of mmdet and mmpretrain, **only test model convert**
```shell
python ./tools/regression_test.py \
--codebase mmdet mmpretrain \
--models ResNet SE-ResNet "Mask R-CNN" \
--work-dir "../mmdeploy_regression_working_dir" \
--device "cuda" \
--log-level INFO
```
## 3. Regression Test Configuration
### Example and parameter description
```yaml
globals:
  codebase_dir: ../mmocr # codebase path to test
  checkpoint_force_download: False # whether to redownload the model even if it already exists
  images:
    img_densetext_det: &img_densetext_det ../mmocr/demo/demo_densetext_det.jpg
    img_demo_text_det: &img_demo_text_det ../mmocr/demo/demo_text_det.jpg
    img_demo_text_ocr: &img_demo_text_ocr ../mmocr/demo/demo_text_ocr.jpg
    img_demo_text_recog: &img_demo_text_recog ../mmocr/demo/demo_text_recog.jpg
  metric_info: &metric_info
    hmean-iou: # metafile.Results.Metrics
      eval_name: hmean-iou # test.py --metrics args
      metric_key: 0_hmean-iou:hmean # the key name of eval log
      tolerance: 0.1 # tolerated threshold interval
      task_name: Text Detection # the name of metafile.Results.Task
      dataset: ICDAR2015 # the name of metafile.Results.Dataset
    word_acc: # same as hmean-iou, also a kind of metric
      eval_name: acc
      metric_key: 0_word_acc_ignore_case
      tolerance: 0.2
      task_name: Text Recognition
      dataset: IIIT5K
  convert_image_det: &convert_image_det # the image that will be used by detection model convert
    input_img: *img_densetext_det
    test_img: *img_demo_text_det
  convert_image_rec: &convert_image_rec
    input_img: *img_demo_text_recog
    test_img: *img_demo_text_recog
  backend_test: &default_backend_test True # whether test model precision for backend
  sdk: # SDK config
    sdk_detection_dynamic: &sdk_detection_dynamic configs/mmocr/text-detection/text-detection_sdk_dynamic.py
    sdk_recognition_dynamic: &sdk_recognition_dynamic configs/mmocr/text-recognition/text-recognition_sdk_dynamic.py

onnxruntime:
  pipeline_ort_recognition_static_fp32: &pipeline_ort_recognition_static_fp32
    convert_image: *convert_image_rec # the image used by model conversion
    backend_test: *default_backend_test # whether inference on the backend
    sdk_config: *sdk_recognition_dynamic # test SDK or not. If it exists, use a specific SDK config for testing
    deploy_config: configs/mmocr/text-recognition/text-recognition_onnxruntime_static.py # the deploy cfg path to use, based on mmdeploy path

  pipeline_ort_recognition_dynamic_fp32: &pipeline_ort_recognition_dynamic_fp32
    convert_image: *convert_image_rec
    backend_test: *default_backend_test
    sdk_config: *sdk_recognition_dynamic
    deploy_config: configs/mmocr/text-recognition/text-recognition_onnxruntime_dynamic.py

  pipeline_ort_detection_dynamic_fp32: &pipeline_ort_detection_dynamic_fp32
    convert_image: *convert_image_det
    deploy_config: configs/mmocr/text-detection/text-detection_onnxruntime_dynamic.py

tensorrt:
  pipeline_trt_recognition_dynamic_fp16: &pipeline_trt_recognition_dynamic_fp16
    convert_image: *convert_image_rec
    backend_test: *default_backend_test
    sdk_config: *sdk_recognition_dynamic
    deploy_config: configs/mmocr/text-recognition/text-recognition_tensorrt-fp16_dynamic-1x32x32-1x32x640.py

  pipeline_trt_detection_dynamic_fp16: &pipeline_trt_detection_dynamic_fp16
    convert_image: *convert_image_det
    backend_test: *default_backend_test
    sdk_config: *sdk_detection_dynamic
    deploy_config: configs/mmocr/text-detection/text-detection_tensorrt-fp16_dynamic-320x320-2240x2240.py

openvino:
  # same as onnxruntime backend configuration

ncnn:
  # same as onnxruntime backend configuration

pplnn:
  # same as onnxruntime backend configuration

torchscript:
  # same as onnxruntime backend configuration

models:
  - name: crnn # model name
    metafile: configs/textrecog/crnn/metafile.yml # the path of model metafile, based on codebase path
    codebase_model_config_dir: configs/textrecog/crnn # the basepath of `model_configs`, based on codebase path
    model_configs: # the config name to test
      - crnn_academic_dataset.py
    pipelines: # pipeline name
      - *pipeline_ort_recognition_dynamic_fp32

  - name: dbnet
    metafile: configs/textdet/dbnet/metafile.yml
    codebase_model_config_dir: configs/textdet/dbnet
    model_configs:
      - dbnet_r18_fpnc_1200e_icdar2015.py
    pipelines:
      - *pipeline_ort_detection_dynamic_fp32
      - *pipeline_trt_detection_dynamic_fp16

      # special pipeline can be added like this
      - convert_image: xxx
        backend_test: xxx
        sdk_config: xxx
        deploy_config: configs/mmocr/text-detection/xxx
```
## 4. Generated Report
This is an example of mmocr regression test report.
| | Model | Model Config | Task | Checkpoint | Dataset | Backend | Deploy Config | Static or Dynamic | Precision Type | Conversion Result | hmean-iou | word_acc | Test Pass |
| --- | ----- | ---------------------------------------------------------------- | ---------------- | ------------------------------------------------------------------------------------------------------------ | --------- | --------------- | -------------------------------------------------------------------------------------- | ----------------- | -------------- | ----------------- | --------- | -------- | --------- |
| 0 | crnn | ../mmocr/configs/textrecog/crnn/crnn_academic_dataset.py | Text Recognition | ../mmdeploy_checkpoints/mmocr/crnn/crnn_academic-a723a1c5.pth | IIIT5K | Pytorch | - | - | - | - | - | 80.5 | - |
| 1 | crnn | ../mmocr/configs/textrecog/crnn/crnn_academic_dataset.py | Text Recognition | ${WORK_DIR}/mmocr/crnn/onnxruntime/static/crnn_academic-a723a1c5/end2end.onnx | x | onnxruntime | configs/mmocr/text-recognition/text-recognition_onnxruntime_dynamic.py | static | fp32 | True | - | 80.67 | True |
| 2 | crnn | ../mmocr/configs/textrecog/crnn/crnn_academic_dataset.py | Text Recognition | ${WORK_DIR}/mmocr/crnn/onnxruntime/static/crnn_academic-a723a1c5 | x | SDK-onnxruntime | configs/mmocr/text-recognition/text-recognition_sdk_dynamic.py | static | fp32 | True | - | x | False |
| 3 | dbnet | ../mmocr/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py | Text Detection | ../mmdeploy_checkpoints/mmocr/dbnet/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth | ICDAR2015 | Pytorch | - | - | - | - | 0.795 | - | - |
| 4 | dbnet | ../mmocr/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py | Text Detection | ../mmdeploy_checkpoints/mmocr/dbnet/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth | ICDAR | onnxruntime | configs/mmocr/text-detection/text-detection_onnxruntime_dynamic.py | dynamic | fp32 | True | - | - | True |
| 5 | dbnet | ../mmocr/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py | Text Detection | ${WORK_DIR}/mmocr/dbnet/tensorrt/dynamic/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597/end2end.engine | ICDAR | tensorrt | configs/mmocr/text-detection/text-detection_tensorrt-fp16_dynamic-320x320-2240x2240.py | dynamic | fp16 | True | 0.793302 | - | True |
| 6 | dbnet | ../mmocr/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py | Text Detection | ${WORK_DIR}/mmocr/dbnet/tensorrt/dynamic/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597 | ICDAR | SDK-tensorrt | configs/mmocr/text-detection/text-detection_sdk_dynamic.py | dynamic | fp16 | True | 0.795073 | - | True |
## 5. Supported Backends
- [x] ONNX Runtime
- [x] TensorRT
- [x] PPLNN
- [x] ncnn
- [x] OpenVINO
- [x] TorchScript
- [x] SNPE
- [x] MMDeploy SDK
## 6. Supported Codebase and Metrics
| Codebase | Metric | Support |
| ---------- | -------- | ------------------ |
| mmdet | bbox | :heavy_check_mark: |
| | segm | :heavy_check_mark: |
| | PQ | :x: |
| mmpretrain | accuracy | :heavy_check_mark: |
| mmseg | mIoU | :heavy_check_mark: |
| mmpose | AR | :heavy_check_mark: |
| | AP | :heavy_check_mark: |
| mmocr | hmean | :heavy_check_mark: |
| | acc | :heavy_check_mark: |
| mmagic | PSNR | :heavy_check_mark: |
| | SSIM | :heavy_check_mark: |
# How to support new backends
MMDeploy supports a number of backend engines. We welcome the contribution of new backends. In this tutorial, we will introduce the general procedures to support a new backend in MMDeploy.
## Prerequisites
Before contributing the codes, there are some requirements for the new backend that need to be checked:
- The backend must support ONNX as IR.
- If the backend requires model files or weight files other than a ".onnx" file, a conversion tool that converts the ".onnx" file to model files and weight files is required. The tool can be a Python API, a script, or an executable program.
- It is highly recommended that the backend provides a Python interface to load the backend files and inference for validation.
## Support backend conversion
The backends in MMDeploy must support ONNX. The backend loads the ".onnx" file directly, or converts the ".onnx" to its own format using the conversion tool. In this section, we will introduce the steps to support backend conversion.
1. Add a backend constant in `mmdeploy/utils/constants.py` that denotes the name of the backend.
**Example**:
```Python
# mmdeploy/utils/constants.py
class Backend(AdvancedEnum):
    # Take TensorRT as an example
    TENSORRT = 'tensorrt'
```
2. Add a corresponding package (a folder with `__init__.py`) in `mmdeploy/backend/`. For example, `mmdeploy/backend/tensorrt`. In the `__init__.py`, there must be a function named `is_available` which checks if users have installed the backend library. If the check is passed, then the remaining files of the package will be loaded.
**Example**:
```Python
# mmdeploy/backend/tensorrt/__init__.py
import importlib.util


def is_available():
    return importlib.util.find_spec('tensorrt') is not None


if is_available():
    from .utils import from_onnx, load, save
    from .wrapper import TRTWrapper

    __all__ = ['from_onnx', 'save', 'load', 'TRTWrapper']
```
3. Create a config file in `configs/_base_/backends` (e.g., `configs/_base_/backends/tensorrt.py`). If the backend just takes the '.onnx' file as input, the new config can be simple. The config of the backend only consists of one field denoting the name of the backend (which should be the same as the name in `mmdeploy/utils/constants.py`).
**Example**:
```python
backend_config = dict(type='onnxruntime')
```
If the backend requires other files, then the arguments for the conversion from ".onnx" file to backend files should be included in the config file.
**Example:**
```Python
backend_config = dict(
    type='tensorrt',
    common_config=dict(
        fp16_mode=False, max_workspace_size=0))
```
After possessing a base backend config file, you can easily construct a complete deploy config through inheritance. Please refer to our [config tutorial](../02-how-to-run/write_config.md) for more details. Here is an example:
```Python
_base_ = ['../_base_/backends/onnxruntime.py']
codebase_config = dict(type='mmpretrain', task='Classification')
onnx_config = dict(input_shape=None)
```
4. If the backend requires model files or weight files other than a ".onnx" file, create an `onnx2backend.py` file in the corresponding folder (e.g., create `mmdeploy/backend/tensorrt/onnx2tensorrt.py`). Then add a conversion function `onnx2backend` in the file. The function should convert a given ".onnx" file to the required backend files in a given work directory. There are no requirements on other parameters of the function or the implementation details. You can use any tools for conversion. Here are some examples:
**Use Python script:**
```Python
from subprocess import PIPE, run
from typing import Dict, List, Union

import torch


def onnx2openvino(input_info: Dict[str, Union[List[int], torch.Size]],
                  output_names: List[str], onnx_path: str, work_dir: str):
    input_names = ','.join(input_info.keys())
    input_shapes = ','.join(str(list(elem)) for elem in input_info.values())
    output = ','.join(output_names)

    mo_args = f'--input_model="{onnx_path}" '\
              f'--output_dir="{work_dir}" ' \
              f'--output="{output}" ' \
              f'--input="{input_names}" ' \
              f'--input_shape="{input_shapes}" ' \
              f'--disable_fusing '
    command = f'mo.py {mo_args}'
    mo_output = run(command, stdout=PIPE, stderr=PIPE, shell=True, check=True)
```
**Use executable program:**
```Python
from subprocess import call


def onnx2ncnn(onnx_path: str, work_dir: str):
    onnx2ncnn_path = get_onnx2ncnn_path()
    save_param, save_bin = get_output_model_file(onnx_path, work_dir)
    call([onnx2ncnn_path, onnx_path, save_param, save_bin])
```
5. Define APIs in a new package in `mmdeploy/apis`.
**Example:**
```Python
# mmdeploy/apis/ncnn/__init__.py
from mmdeploy.backend.ncnn import is_available

__all__ = ['is_available']

if is_available():
    from mmdeploy.backend.ncnn.onnx2ncnn import (onnx2ncnn,
                                                 get_output_model_file)
    __all__ += ['onnx2ncnn', 'get_output_model_file']
```
Create a backend manager class that derives from `BaseBackendManager` and implement its `to_backend` method.
**Example:**
```Python
@classmethod
def to_backend(cls,
               ir_files: Sequence[str],
               deploy_cfg: Any,
               work_dir: str,
               log_level: int = logging.INFO,
               device: str = 'cpu',
               **kwargs) -> Sequence[str]:
    return ir_files
```
6. Convert the models of OpenMMLab to backends (if necessary) and inference on backend engine. If you find some incompatible operators when testing, you can try to rewrite the original model for the backend following the [rewriter tutorial](support_new_model.md) or add custom operators.
7. Add docstring and unit tests for new code :).
## Support backend inference
Although the backend engines are usually implemented in C/C++, it is convenient for testing and debugging if the backend provides a Python inference interface. We encourage the contributors to support backend inference in the Python interface of MMDeploy. In this section we will introduce the steps to support backend inference.
1. Add a file named `wrapper.py` to the corresponding folder in `mmdeploy/backend/{backend}`. For example, `mmdeploy/backend/tensorrt/wrapper.py`. This module should implement and register a wrapper class that inherits the base class `BaseWrapper` in `mmdeploy/backend/base/base_wrapper.py`.
**Example:**
```Python
from mmdeploy.utils import Backend
from ..base import BACKEND_WRAPPER, BaseWrapper
@BACKEND_WRAPPER.register_module(Backend.TENSORRT.value)
class TRTWrapper(BaseWrapper):
```
2. The wrapper class can initialize the engine in `__init__` function and inference in `forward` function. Note that the `__init__` function must take a parameter `output_names` and pass it to base class to determine the orders of output tensors. The input and output variables of `forward` should be dictionaries denoting the name and value of the tensors.
3. For the convenience of performance testing, the class should define an "execute" function that only calls the inference interface of the backend engine. The `forward` function should call the "execute" function after preprocessing the data.
**Example:**
```Python
from typing import Dict, Optional, Sequence

import onnxruntime as ort
import torch

from mmdeploy.utils import Backend
from mmdeploy.utils.timer import TimeCounter
from ..base import BACKEND_WRAPPER, BaseWrapper


@BACKEND_WRAPPER.register_module(Backend.ONNXRUNTIME.value)
class ORTWrapper(BaseWrapper):

    def __init__(self,
                 onnx_file: str,
                 device: str,
                 output_names: Optional[Sequence[str]] = None):
        # Initialization
        # ...
        super().__init__(output_names)

    def forward(self, inputs: Dict[str,
                                   torch.Tensor]) -> Dict[str, torch.Tensor]:
        # Fetch data
        # ...
        self.__ort_execute(self.io_binding)
        # Postprocess data
        # ...

    @TimeCounter.count_time('onnxruntime')
    def __ort_execute(self, io_binding: ort.IOBinding):
        # Only do the inference
        self.sess.run_with_iobinding(io_binding)
```
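A hypothetical usage sketch of such a wrapper; the file name and tensor names are illustrative, and the wrapper's `__init__` is assumed to create the underlying session as in the real implementation:

```python
import torch

# assuming an exported `end2end.onnx` whose input tensor is named 'input'
wrapper = ORTWrapper('end2end.onnx', device='cpu', output_names=['output'])
outputs = wrapper({'input': torch.rand(1, 3, 224, 224)})
print(outputs['output'].shape)
```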
4. Create a backend manager class that derives from `BaseBackendManager` and implement its `build_wrapper` method.
**Example:**
```Python
@BACKEND_MANAGERS.register('onnxruntime')
class ONNXRuntimeManager(BaseBackendManager):

    @classmethod
    def build_wrapper(cls,
                      backend_files: Sequence[str],
                      device: str = 'cpu',
                      input_names: Optional[Sequence[str]] = None,
                      output_names: Optional[Sequence[str]] = None,
                      deploy_cfg: Optional[Any] = None,
                      **kwargs):
        from .wrapper import ORTWrapper
        return ORTWrapper(
            onnx_file=backend_files[0],
            device=device,
            output_names=output_names)
```
5. Add docstring and unit tests for new code :).
## Support new backends using MMDeploy as a third party
Previous parts show how to add a new backend in MMDeploy, which requires changing its source code. However, if we treat MMDeploy as a third party, the methods above are no longer efficient. To this end, adding a new backend requires us to pre-install another package named `aenum`. We can install it directly through `pip install aenum`.
After installing `aenum` successfully, we can use it to add a new backend through:
```python
from mmdeploy.utils.constants import Backend
from aenum import extend_enum

try:
    Backend.get('backend_name')
except Exception:
    extend_enum(Backend, 'BACKEND', 'backend_name')
```
We can run the codes above before we use the rewrite logic of MMDeploy.
# How to support new models
We provide several tools to support model conversion.
## Function Rewriter
PyTorch neural networks are written in Python, which eases algorithm development, but the use of Python control flow and third-party libraries makes it difficult to export the network to an intermediate representation. We provide a 'monkey patch' tool to rewrite an unsupported function into another one that can be exported. Here is an example:
```python
from mmdeploy.core import FUNCTION_REWRITER


@FUNCTION_REWRITER.register_rewriter(
    func_name='torch.Tensor.repeat', backend='tensorrt')
def repeat_static(input, *size):
    ctx = FUNCTION_REWRITER.get_context()
    origin_func = ctx.origin_func
    if input.dim() == 1 and len(size) == 1:
        return origin_func(input.unsqueeze(0), *([1] + list(size))).squeeze(0)
    else:
        return origin_func(input, *size)
```
It is easy to use the function rewriter. Just add a decorator with arguments:
- `func_name` is the function to override. It can be either a PyTorch function or a custom function. Methods in modules can also be overridden by this tool.
- `backend` is the inference engine. The function will be overridden when the model is exported to this engine. If it is not given, this rewrite will be the default rewrite. The default rewrite will be used if the rewrite of the given backend does not exist.
The arguments are the same as those of the original function. The context `ctx`, obtained via `FUNCTION_REWRITER.get_context()`, provides some useful information such as the deployment config `ctx.cfg` and the original function (which has been overridden) `ctx.origin_func`.
## Module Rewriter
If you want to replace a whole module with another one, we have another rewriter as follows:
```python
import torch.nn as nn

from mmdeploy.core import MODULE_REWRITER


@MODULE_REWRITER.register_rewrite_module(
    'mmagic.models.backbones.sr_backbones.SRCNN', backend='tensorrt')
class SRCNNWrapper(nn.Module):

    def __init__(self,
                 module,
                 cfg,
                 channels=(3, 64, 32, 3),
                 kernel_sizes=(9, 1, 5),
                 upscale_factor=4):
        super(SRCNNWrapper, self).__init__()

        self._module = module

        module.img_upsampler = nn.Upsample(
            scale_factor=module.upscale_factor,
            mode='bilinear',
            align_corners=False)

    def forward(self, *args, **kwargs):
        """Run forward."""
        return self._module(*args, **kwargs)

    def init_weights(self, *args, **kwargs):
        """Initialize weights."""
        return self._module.init_weights(*args, **kwargs)
```
Just like function rewriter, add a decorator with arguments:
- `module_type` the module class to rewrite.
- `backend` is the inference engine. The function will be overridden when the model is exported to this engine. If it is not given, this rewrite will be the default rewrite. The default rewrite will be used if the rewrite of the given backend does not exist.
All instances of the module in the network will be replaced with instances of this new class. The original module and the deployment config will be passed as the first two arguments.
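A minimal sketch of how the replacement takes effect (assuming module rewrites are applied by patching the model before export; the `build_srcnn()` helper below is hypothetical):
```python
import torch

from mmdeploy.core import patch_model

# deploy_cfg and the model are assumed to be prepared elsewhere
deploy_cfg = dict(backend_config=dict(type='tensorrt'))
model = build_srcnn()  # hypothetical helper returning an SRCNN instance

# every SRCNN submodule is replaced with SRCNNWrapper(module, deploy_cfg, ...)
patched_model = patch_model(model, cfg=deploy_cfg, backend='tensorrt')
torch.onnx.export(patched_model, torch.rand(1, 3, 32, 32), 'srcnn.onnx')
```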
## Custom Symbolic
The mappings between PyTorch and ONNX are defined in PyTorch with symbolic functions. A custom symbolic function can help us bypass ONNX nodes that are unsupported by the inference engine.
```python
from torch.onnx import symbolic_helper as sym_help

from mmdeploy.core import SYMBOLIC_REWRITER


@SYMBOLIC_REWRITER.register_symbolic('squeeze', is_pytorch=True)
def squeeze_default(g, self, dim=None):
if dim is None:
dims = []
for i, size in enumerate(self.type().sizes()):
if size == 1:
dims.append(i)
else:
dims = [sym_help._get_const(dim, 'i', 'dim')]
return g.op('Squeeze', self, axes_i=dims)
```
The decorator arguments:
- `func_name` is the name of the function to add a symbolic function to. Use the full path if it is a custom `torch.autograd.Function`, or just the name if it is a PyTorch built-in function.
- `backend` is the inference engine. The symbolic function will be overridden when the model is exported to this engine. If it is not given, this rewrite will be the default rewrite. The default rewrite will be used if the rewrite of the given backend does not exist.
- `is_pytorch` is `True` if the function is a PyTorch built-in function.
- `arg_descriptors` are the descriptors of the symbolic function arguments. They will be fed to `torch.onnx.symbolic_helper._parse_arg`.
Just like the function rewriter, a context `ctx` is available to the rewrite. It provides useful information such as the deployment config `ctx.cfg` and the original function (which has been overridden) `ctx.origin_func`. Note that `ctx.origin_func` can be used only when `is_pytorch==False`.
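For a custom `torch.autograd.Function`, a hedged sketch of registering a symbolic with `is_pytorch=False` and `arg_descriptors` might look like this (the module path `my_project.ops.CustomRelu` is hypothetical):
```python
import torch

from mmdeploy.core import SYMBOLIC_REWRITER


class CustomRelu(torch.autograd.Function):
    """A toy custom Function, assumed to live at my_project.ops.CustomRelu."""

    @staticmethod
    def forward(ctx, x):
        return x.clamp(min=0)


@SYMBOLIC_REWRITER.register_symbolic(
    'my_project.ops.CustomRelu', is_pytorch=False, arg_descriptors=['v'])
def custom_relu__symbolic(g, x):
    # export the custom Function as a standard ONNX Relu node
    return g.op('Relu', x)
```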
# How to test rewritten models
After you create a rewritten model using our [rewriter](support_new_model.md), it's better to write a unit test for the model to validate if the model rewrite would come into effect. Generally, we need to get outputs of the original model and rewritten model, then compare them. The outputs of the original model can be acquired directly by calling the forward function of the model, whereas the way to generate the outputs of the rewritten model depends on the complexity of the rewritten model.
## Test rewritten model with small changes
If the changes to the model are small (e.g., only changing the behavior of one or two variables without introducing side effects), you can construct the input arguments for the rewritten functions/modules, run the model's inference in `RewriterContext` and check the results.
```python
# mmpretrain/models/classifiers/base.py
class BaseClassifier(BaseModule, metaclass=ABCMeta):
def forward(self, img, return_loss=True, **kwargs):
if return_loss:
return self.forward_train(img, **kwargs)
else:
return self.forward_test(img, **kwargs)
from mmdeploy.core import FUNCTION_REWRITER


# Custom rewritten function
@FUNCTION_REWRITER.register_rewriter(
'mmpretrain.models.classifiers.BaseClassifier.forward', backend='default')
def forward_of_base_classifier(self, img, *args, **kwargs):
"""Rewrite `forward` for default backend."""
return self.simple_test(img, {})
```
In the example, we only change the function that `forward` calls. We can test this rewritten function by writing the following test function:
```python
def test_baseclassifier_forward():
    import torch
    from mmdeploy.core import RewriterContext
    from mmpretrain.models.classifiers import BaseClassifier

    input = torch.rand(1)
class DummyClassifier(BaseClassifier):
def __init__(self, init_cfg=None):
super().__init__(init_cfg=init_cfg)
def extract_feat(self, imgs):
pass
def forward_train(self, imgs):
return 'train'
def simple_test(self, img, tmp, **kwargs):
return 'simple_test'
model = DummyClassifier().eval()
model_output = model(input)
with RewriterContext(cfg=dict()), torch.no_grad():
backend_output = model(input)
assert model_output == 'train'
assert backend_output == 'simple_test'
```
In this test function, we construct a derived class of `BaseClassifier` to test if the rewritten model would work in the rewrite context. We get outputs of the original model by directly calling `model(input)` and get the outputs of the rewritten model by calling `model(input)` in `RewriteContext`. Finally, we can check the outputs by asserting their value.
## Test rewritten model with big changes
In the first example, the output is generated in Python. Sometimes we may make big changes to original model functions (e.g., eliminate branch statements to generate correct computing graph). Even if the outputs of a rewritten model running in Python are correct, we cannot assure that the rewritten model can work as expected in the backend. Therefore, we need to test the rewritten model in the backend.
```python
import torch

from mmdeploy.core import FUNCTION_REWRITER
from mmdeploy.utils import is_dynamic_shape


# Custom rewritten function
@FUNCTION_REWRITER.register_rewriter(
    func_name='mmseg.models.segmentors.BaseSegmentor.forward')
def base_segmentor__forward(self, img, img_metas=None, **kwargs):
ctx = FUNCTION_REWRITER.get_context()
if img_metas is None:
img_metas = {}
assert isinstance(img_metas, dict)
assert isinstance(img, torch.Tensor)
deploy_cfg = ctx.cfg
is_dynamic_flag = is_dynamic_shape(deploy_cfg)
img_shape = img.shape[2:]
if not is_dynamic_flag:
img_shape = [int(val) for val in img_shape]
img_metas['img_shape'] = img_shape
return self.simple_test(img, img_metas, **kwargs)
```
The behavior of this rewritten function is complex. We should test it as follows:
```python
def test_basesegmentor_forward():
from mmdeploy.utils.test import (WrapModel, get_model_outputs,
get_rewrite_outputs)
segmentor = get_model()
segmentor.cpu().eval()
# Prepare data
# ...
# Get the outputs of original model
model_inputs = {
'img': [imgs],
'img_metas': [img_metas],
'return_loss': False
}
model_outputs = get_model_outputs(segmentor, 'forward', model_inputs)
# Get the outputs of rewritten model
    wrapped_model = WrapModel(segmentor, 'forward', img_metas=None, return_loss=False)
rewrite_inputs = {'img': imgs}
rewrite_outputs, is_backend_output = get_rewrite_outputs(
wrapped_model=wrapped_model,
model_inputs=rewrite_inputs,
deploy_cfg=deploy_cfg)
if is_backend_output:
# If the backend plugins have been installed, the rewrite outputs are
# generated by backend.
rewrite_outputs = torch.tensor(rewrite_outputs)
model_outputs = torch.tensor(model_outputs)
model_outputs = model_outputs.unsqueeze(0).unsqueeze(0)
assert torch.allclose(rewrite_outputs, model_outputs)
else:
# Otherwise, the outputs are generated by python.
assert rewrite_outputs is not None
```
We provide some utilities to test rewritten functions. At first, you can construct a model and call `get_model_outputs` to get outputs of the original model. Then you can wrap the rewritten function with `WrapModel`, which serves as a partial function, and get the results with `get_rewrite_outputs`. `get_rewrite_outputs` returns two values that indicate the content of outputs and whether the outputs come from the backend. Because we cannot assume that everyone has installed the backend, we should check if the results are generated by a Python or backend engine. The unit test must cover both conditions. Finally, we should compare the original and rewritten outputs, which may be done simply by calling `torch.allclose`.
## Note
To learn the complete usage of the test utilities, please refer to our apis document.
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.header-logo {
background-image: url("../image/mmdeploy-logo.png");
background-size: 150px 60px;
height: 60px;
width: 150px;
}
apis
-------
.. automodule:: mmdeploy.apis
:members:
apis/tensorrt
-------------
.. automodule:: mmdeploy.apis.tensorrt
:members:
apis/onnxruntime
----------------
.. automodule:: mmdeploy.apis.onnxruntime
:members:
apis/ncnn
---------
.. automodule:: mmdeploy.apis.ncnn
:members:
apis/pplnn
----------
.. automodule:: mmdeploy.apis.pplnn
:members:
# Cross compile snpe inference server on Ubuntu 18
MMDeploy provides a prebuilt package. If you want to compile it yourself, or need to modify the `.proto` file, you can refer to this document.
Note that the official gRPC documentation does not fully cover building with the NDK.
## 1. Environment
| Item | Version | Remarks |
| ------------------ | -------------- | --------------------------------------------------------- |
| snpe | 1.59 | 1.60 uses clang-8.0, which may cause compatibility issues |
| host OS | ubuntu18.04 | snpe1.59 specified version |
| NDK | r17c | snpe1.59 specified version |
| gRPC | commit 6f698b5 | - |
| Hardware equipment | qcom888 | qcom chip required |
## 2. Cross compile gRPC with NDK
1. Pull gRPC repo, compile `protoc` and `grpc_cpp_plugin` on host
```bash
# Install dependencies
$ apt-get update && apt-get install -y libssl-dev
# Compile
$ git clone https://github.com/grpc/grpc --recursive=1 --depth=1
$ cd grpc
$ mkdir -p cmake/build
$ pushd cmake/build
$ cmake \
-DCMAKE_BUILD_TYPE=Release \
-DgRPC_INSTALL=ON \
-DgRPC_BUILD_TESTS=OFF \
-DgRPC_SSL_PROVIDER=package \
../..
# Install to host
$ make -j
$ sudo make install
```
2. Download the NDK and cross-compile the static libraries with android aarch64 format
```bash
$ wget https://dl.google.com/android/repository/android-ndk-r17c-linux-x86_64.zip
$ unzip android-ndk-r17c-linux-x86_64.zip
$ export ANDROID_NDK=/path/to/android-ndk-r17c
$ cd /path/to/grpc
$ mkdir -p cmake/build_aarch64 && pushd cmake/build_aarch64
$ cmake ../.. \
-DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK}/build/cmake/android.toolchain.cmake \
-DANDROID_ABI=arm64-v8a \
-DANDROID_PLATFORM=android-26 \
-DANDROID_TOOLCHAIN=clang \
-DANDROID_STL=c++_shared \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=/tmp/android_grpc_install_shared
$ make -j
$ make install
```
3. At this point `/tmp/android_grpc_install_shared` should contain the complete installation files
```bash
$ cd /tmp/android_grpc_install_shared
$ tree -L 1
.
├── bin
├── include
├── lib
└── share
```
## 3. (Optional) Self-test whether NDK gRPC is available
1. Compile the helloworld that comes with gRPC
```bash
$ cd /path/to/grpc/examples/cpp/helloworld/
$ mkdir cmake/build_aarch64 -p && pushd cmake/build_aarch64
$ cmake ../.. \
-DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK}/build/cmake/android.toolchain.cmake \
-DANDROID_ABI=arm64-v8a \
-DANDROID_PLATFORM=android-26 \
-DANDROID_STL=c++_shared \
-DANDROID_TOOLCHAIN=clang \
-DCMAKE_BUILD_TYPE=Release \
-Dabsl_DIR=/tmp/android_grpc_install_shared/lib/cmake/absl \
-DProtobuf_DIR=/tmp/android_grpc_install_shared/lib/cmake/protobuf \
-DgRPC_DIR=/tmp/android_grpc_install_shared/lib/cmake/grpc
$ make -j
$ ls greeter*
greeter_async_client greeter_async_server greeter_callback_server greeter_server
greeter_async_client2 greeter_callback_client greeter_client
```
2. Enable developer mode and USB debugging on your phone, then push the binaries to `/data/local/tmp`
```bash
$ adb push greeter* /data/local/tmp
```
3. `adb shell` into the phone, execute client/server
```bash
/data/local/tmp $ ./greeter_client
Greeter received: Hello world
```
## 4. Cross compile snpe inference server
1. Open the [snpe tools website](https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk/tools) and download version 1.59. Unzip and set environment variables
> Note that snpe >= 1.60 starts using `clang-8.0`, which may cause incompatibility with `libc++_shared.so` on older devices.
```bash
$ export SNPE_ROOT=/path/to/snpe-1.59.0.3230
```
2. Open the snpe server directory in mmdeploy and use the same options as when cross-compiling gRPC
```bash
$ cd /path/to/mmdeploy
$ cd service/snpe/server
$ mkdir -p build && cd build
$ export ANDROID_NDK=/path/to/android-ndk-r17c
$ cmake .. \
-DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK}/build/cmake/android.toolchain.cmake \
-DANDROID_ABI=arm64-v8a \
-DANDROID_PLATFORM=android-26 \
-DANDROID_STL=c++_shared \
-DANDROID_TOOLCHAIN=clang \
-DCMAKE_BUILD_TYPE=Release \
-Dabsl_DIR=/tmp/android_grpc_install_shared/lib/cmake/absl \
-DProtobuf_DIR=/tmp/android_grpc_install_shared/lib/cmake/protobuf \
-DgRPC_DIR=/tmp/android_grpc_install_shared/lib/cmake/grpc
$ make -j
$ file inference_server
inference_server: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /system/bin/linker64, BuildID[sha1]=252aa04e2b982681603dacb74b571be2851176d2, with debug_info, not stripped
```
Finally, you will get `inference_server`; `adb push` it to the device and execute it.
## 5. Regenerate the proto interface
If you have changed `inference.proto`, you need to regenerate the .cpp and .py interfaces
```Shell
$ python3 -m pip install grpcio-tools --user
$ python3 -m grpc_tools.protoc -I./ --python_out=./client/ --grpc_python_out=./client/ inference.proto
$ ln -s `which protoc-gen-grpc`
$ protoc --cpp_out=./ --grpc_out=./ --plugin=protoc-gen-grpc=grpc_cpp_plugin inference.proto
```
## Reference
- snpe tutorial https://developer.qualcomm.com/sites/default/files/docs/snpe/cplus_plus_tutorial.html
- gRPC cross build script https://raw.githubusercontent.com/grpc/grpc/master/test/distrib/cpp/run_distrib_test_cmake_aarch64_cross.sh
- stackoverflow https://stackoverflow.com/questions/54052229/build-grpc-c-for-android-using-ndk-arm-linux-androideabi-clang-compiler
#
# Configuration file for the Sphinx documentation builder.
#
# This file does only contain a selection of the most common options. For a
# full list see the documentation:
# http://www.sphinx-doc.org/en/master/config
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import subprocess
import sys
import pytorch_sphinx_theme
from m2r import MdInclude
from recommonmark.transform import AutoStructify
from sphinx.builders.html import StandaloneHTMLBuilder
sys.path.insert(0, os.path.abspath('../..'))
version_file = '../../mmdeploy/version.py'
with open(version_file, 'r') as f:
exec(compile(f.read(), version_file, 'exec'))
__version__ = locals()['__version__']
# -- Project information -----------------------------------------------------
project = 'mmdeploy'
copyright = '2021-2024, OpenMMLab'
author = 'MMDeploy Authors'
# The short X.Y version
version = __version__
# The full version, including alpha/beta/rc tags
release = __version__
# -- General configuration ---------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'breathe',
'sphinx.ext.autodoc',
'sphinx.ext.napoleon',
'sphinx.ext.viewcode',
'sphinx.ext.autosectionlabel',
'sphinx_markdown_tables',
'myst_parser',
'sphinx_copybutton',
'sphinxcontrib.mermaid'
] # yapf: disable
breathe_default_project = 'mmdeployapi'
breathe_projects = {'mmdeployapi': '../cppapi/docs/xml'}
def generate_doxygen_xml(app):
try:
folder = '../cppapi'
retcode = subprocess.call('cd %s; doxygen' % folder, shell=True)
if retcode < 0:
sys.stderr.write('doxygen terminated by signal %s' % (-retcode))
except Exception as e:
sys.stderr.write('doxygen execution failed: %s' % e)
autodoc_mock_imports = ['tensorrt']
autosectionlabel_prefix_document = True
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = {
'.rst': 'restructuredtext',
'.md': 'markdown',
}
# The master toctree document.
master_doc = 'index'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = 'en'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
# html_theme = 'sphinx_rtd_theme'
html_theme = 'pytorch_sphinx_theme'
html_theme_path = [pytorch_sphinx_theme.get_html_theme_path()]
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
html_theme_options = {
'logo_url': 'https://mmdeploy.readthedocs.io/en/latest/',
'menu': [{
'name': 'GitHub',
'url': 'https://github.com/open-mmlab/mmdeploy'
}],
'menu_lang': 'en'
}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_css_files = ['css/readthedocs.css']
# Enable ::: for my_st
myst_enable_extensions = ['colon_fence']
myst_heading_anchors = 5
# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself. Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}
# -- Options for HTMLHelp output ---------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'mmdeploydoc'
# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'mmdeploy.tex', 'mmdeploy Documentation',
'MMDeploy Contributors', 'manual'),
]
# -- Options for manual page output ------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [(master_doc, 'mmdeploy', 'mmdeploy Documentation', [author], 1)]
# -- Options for Texinfo output ----------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'mmdeploy', 'mmdeploy Documentation', author, 'mmdeploy',
'One line description of project.', 'Miscellaneous'),
]
# -- Options for Epub output -------------------------------------------------
# Bibliographic Dublin Core info.
epub_title = project
# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#
# epub_identifier = ''
# A unique identification for the text.
#
# epub_uid = ''
# A list of files that should not be packed into the epub file.
epub_exclude_files = ['search.html']
# set priority when building html
StandaloneHTMLBuilder.supported_image_types = [
'image/svg+xml', 'image/gif', 'image/png', 'image/jpeg'
]
# -- Extension configuration -------------------------------------------------
# Ignore >>> when copying code
copybutton_prompt_text = r'>>> |\.\.\. '
copybutton_prompt_is_regexp = True
def setup(app):
# Add hook for building doxygen xml when needed
app.connect('builder-inited', generate_doxygen_xml)
app.add_config_value('no_underscore_emphasis', False, 'env')
app.add_config_value('m2r_parse_relative_links', False, 'env')
app.add_config_value('m2r_anonymous_references', False, 'env')
app.add_config_value('m2r_disable_inline_math', False, 'env')
app.add_directive('mdinclude', MdInclude)
app.add_config_value('recommonmark_config', {
'auto_toc_tree_section': 'Contents',
'enable_eval_rst': True,
}, True)
app.add_transform(AutoStructify)
# ONNX export Optimizer
This is a tool to optimize ONNX model when exporting from PyTorch.
## Installation
Build MMDeploy with `torchscript` support:
```shell
export Torch_DIR=$(python -c "import torch;print(torch.utils.cmake_prefix_path + '/Torch')")
cmake \
-DTorch_DIR=${Torch_DIR} \
-DMMDEPLOY_TARGET_BACKENDS="${your_backend};torchscript" \
.. # You can also add other build flags if you need
cmake --build . -- -j$(nproc) && cmake --install .
```
## Usage
```python
# importing model_to_graph__custom_optimizer hijacks torch.onnx.export
# so that custom ONNX passes can run during export
import torch

from mmdeploy.apis.onnx.optimizer import model_to_graph__custom_optimizer  # noqa
from mmdeploy.apis.onnx.passes import optimize_onnx
from mmdeploy.core import RewriterContext
# load your model here
model = create_model()
# export with ONNX Optimizer
x = create_dummy_input()
with RewriterContext({}, onnx_custom_passes=optimize_onnx):
torch.onnx.export(model, x, output_path)
```
The model will be optimized during the export.
You can also define your own optimizer:
```python
# create the optimize callback
def _optimize_onnx(graph, params_dict, torch_out):
from mmdeploy.backend.torchscript import ts_optimizer
ts_optimizer.onnx._jit_pass_onnx_peephole(graph)
return graph, params_dict, torch_out
with RewriterContext({}, onnx_custom_passes=_optimize_onnx):
    # export your model as in the example above
    torch.onnx.export(model, x, output_path)
```
## Frequently Asked Questions
### TensorRT
- "WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected."
Fp16 mode requires a device with full-rate fp16 support.
- "error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] \<= dimensions.d[i]"
When building an `ICudaEngine` from an `INetworkDefinition` that has dynamically resizable inputs, users need to specify at least one optimization profile, which can be set in the deploy config:
```python
backend_config = dict(
common_config=dict(max_workspace_size=1 << 30),
model_inputs=[
dict(
input_shapes=dict(
input=dict(
min_shape=[1, 3, 320, 320],
opt_shape=[1, 3, 800, 1344],
max_shape=[1, 3, 1344, 1344])))
])
```
The input tensor shape should be limited between `min_shape` and `max_shape`.
- "error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS"
TRT 7.2.1 switches to use cuBLASLt (previously it was cuBLAS). cuBLASLt is the default choice for SM version >= 7.0. You may need CUDA-10.2 Patch 1 (released Aug 26, 2020) to resolve some cuBLASLt issues. Another option is to use the new TacticSource API and disable cuBLASLt tactics if you don't want to upgrade.
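If you build the engine yourself, the snippet below is a minimal sketch (using the TensorRT Python API directly, not an MMDeploy option) of how cuBLASLt tactics might be disabled via the TacticSource API:
```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()
# enable only cuBLAS and cuDNN tactics; cuBLASLt is left out of the bit mask
tactic_sources = (1 << int(trt.TacticSource.CUBLAS)) | \
                 (1 << int(trt.TacticSource.CUDNN))
config.set_tactic_sources(tactic_sources)
```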
### Libtorch
- Error: `libtorch/share/cmake/Caffe2/Caffe2Config.cmake:96 (message):Your installed Caffe2 version uses cuDNN but I cannot find the cuDNN libraries. Please set the proper cuDNN prefixes and / or install cuDNN.`
You may `export CUDNN_ROOT=/root/path/to/cudnn` to resolve the build error.
### Windows
- Error: similar like this `OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\cx\miniconda3\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies`
Solution: according to this [post](https://stackoverflow.com/questions/64837376/how-to-efficiently-run-multiple-pytorch-processes-models-at-once-traceback), the issue is on NVIDIA's side and is expected to be fixed in *CUDA release 11.7*. For now, you can use the [fixNvPe.py](https://gist.github.com/cobryan05/7d1fe28dd370e110a372c4d268dcb2e5) script to modify the NVIDIA DLLs in the PyTorch lib directory.
`python fixNvPe.py --input=C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\lib\*.dll`
You can find your pytorch installation path with:
```python
import torch
print(torch.__file__)
```
- enable_language(CUDA) error
```
-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.19044.
-- Found CUDA: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.1 (found version "11.1")
CMake Error at C:/Software/cmake/cmake-3.23.1-windows-x86_64/share/cmake-3.23/Modules/CMakeDetermineCompilerId.cmake:491 (message):
No CUDA toolset found.
Call Stack (most recent call first):
C:/Software/cmake/cmake-3.23.1-windows-x86_64/share/cmake-3.23/Modules/CMakeDetermineCompilerId.cmake:6 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
C:/Software/cmake/cmake-3.23.1-windows-x86_64/share/cmake-3.23/Modules/CMakeDetermineCompilerId.cmake:59 (__determine_compiler_id_test)
C:/Software/cmake/cmake-3.23.1-windows-x86_64/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:339 (CMAKE_DETERMINE_COMPILER_ID)
C:/workspace/mmdeploy-0.6.0-windows-amd64-cuda11.1-tensorrt8.2.3.0/sdk/lib/cmake/MMDeploy/MMDeployConfig.cmake:27 (enable_language)
CMakeLists.txt:5 (find_package)
```
**Cause:** CUDA Toolkit 11.1 was installed before Visual Studio, so the VS plugin was not installed. Or the version of VS is too new, so the installation of the VS plugin was skipped during the installation of the CUDA Toolkit.
**Solution:** This problem can be solved by manually copying the four files in `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\extras\visual_studio_integration\MSBuildExtensions` to `C:\Software\Microsoft Visual Studio\2022\Community\Msbuild\Microsoft\VC\v170\BuildCustomizations`. The specific paths should be adjusted to your actual installation.
### ONNX Runtime
- On Windows, visualizing the model inference result fails with the following error:
```
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Failed to load library, error code: 193
```
**Cause:** In recent Windows systems, there are two `onnxruntime.dll` files under the system path, and they are loaded first, causing a conflict.
```
C:\Windows\SysWOW64\onnxruntime.dll
C:\Windows\System32\onnxruntime.dll
```
**Solution:** Choose one of the following two options
1. Copy the DLL from the lib directory of the downloaded onnxruntime to the directory where `mmdeploy_onnxruntime_ops.dll` is located (it is recommended to use Everything to locate the ops DLL)
2. Rename the two DLLs in the system path so that they cannot be loaded.
### Pip
- pip installed a package but you could not `import` it.
Make sure you are using the pip of your conda environment.
```bash
$ which pip
# /path/to/.local/bin/pip
/path/to/miniconda3/lib/python3.9/site-packages/pip
```
# Get Started
MMDeploy provides useful tools for deploying OpenMMLab models to various platforms and devices.
With their help, you can not only deploy models using our pre-defined pipelines but also customize your own deployment pipeline.
## Introduction
In MMDeploy, the deployment pipeline can be illustrated as a sequence of modules, i.e., Model Converter, MMDeploy Model and Inference SDK.
![deploy-pipeline](https://user-images.githubusercontent.com/4560679/172306700-31b4c922-2f04-42ed-a1d6-c360f2f3048c.png)
### Model Converter
Model Converter aims at converting training models from OpenMMLab into backend models that can be run on target devices.
It is able to transform a PyTorch model into an IR model, i.e., ONNX or TorchScript, as well as convert an IR model into a backend model. By combining them, we can achieve one-click **end-to-end** model deployment.
### MMDeploy Model
MMDeploy Model is the result package exported by Model Converter.
Besides the backend models, it also includes the model meta info, which will be used by the Inference SDK.
### Inference SDK
The Inference SDK is developed in C/C++, wrapping the preprocessing, model forward and postprocessing modules of model inference.
It provides FFIs for languages such as C, C++, Python, C#, Java and so on.
## Prerequisites
In order to do an end-to-end model deployment, MMDeploy requires Python 3.6+ and PyTorch 1.8+.
**Step 0.** Download and install Miniconda from the [official website](https://docs.conda.io/en/latest/miniconda.html).
**Step 1.** Create a conda environment and activate it.
```shell
conda create --name mmdeploy python=3.8 -y
conda activate mmdeploy
```
**Step 2.** Install PyTorch following [official instructions](https://pytorch.org/get-started/locally/), e.g.
On GPU platforms:
```shell
conda install pytorch=={pytorch_version} torchvision=={torchvision_version} cudatoolkit={cudatoolkit_version} -c pytorch -c conda-forge
```
On CPU platforms:
```shell
conda install pytorch=={pytorch_version} torchvision=={torchvision_version} cpuonly -c pytorch
```
```{note}
On GPU platforms, please ensure that {cudatoolkit_version} matches your host CUDA toolkit version. Otherwise, it will probably cause conflicts when deploying models with TensorRT.
```
## Installation
We recommend that users follow our best practices to install MMDeploy.
**Step 0.** Install [MMCV](https://github.com/open-mmlab/mmcv).
```shell
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0rc2"
```
**Step 1.** Install MMDeploy and inference engine
We recommend using the MMDeploy precompiled package as our best practice. Currently, the model converter and SDK inference are provided as PyPI packages, and the SDK C/C++ library is provided [here](https://github.com/open-mmlab/mmdeploy/releases). You can download them according to your target platform and device.
The supported platform and device matrix is presented as follows:
<table>
<thead>
<tr>
<th>OS-Arch</th>
<th>Device</th>
<th>ONNX Runtime</th>
<th>TensorRT</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Linux-x86_64</td>
<td>CPU</td>
<td>Y</td>
<td>N/A</td>
</tr>
<tr>
<td>CUDA</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr>
<td rowspan="2">Windows-x86_64</td>
<td>CPU</td>
<td>Y</td>
<td>N/A</td>
</tr>
<tr>
<td>CUDA</td>
<td>Y</td>
<td>Y</td>
</tr>
</tbody>
</table>
**Note: if the MMDeploy prebuilt package doesn't cover your target platform or device, please [build MMDeploy from source](01-how-to-build/build_from_source.md)**
Taking the latest precompiled package as an example, you can install it as follows:
<details open>
<summary><b>Linux-x86_64</b></summary>
```shell
# 1. install MMDeploy model converter
pip install mmdeploy==1.3.1
# 2. install MMDeploy sdk inference
# install one of the following according to whether you need GPU inference
# 2.1 support onnxruntime
pip install mmdeploy-runtime==1.3.1
# 2.2 support onnxruntime-gpu, tensorrt
pip install mmdeploy-runtime-gpu==1.3.1
# 3. install inference engine
# 3.1 install TensorRT
# !!! If you want to convert a tensorrt model or inference with tensorrt,
# download TensorRT-8.2.3.0 CUDA 11.x tar package from NVIDIA, and extract it to the current directory
pip install TensorRT-8.2.3.0/python/tensorrt-8.2.3.0-cp38-none-linux_x86_64.whl
pip install pycuda
export TENSORRT_DIR=$(pwd)/TensorRT-8.2.3.0
export LD_LIBRARY_PATH=${TENSORRT_DIR}/lib:$LD_LIBRARY_PATH
# !!! Moreover, download cuDNN 8.2.1 CUDA 11.x tar package from NVIDIA, and extract it to the current directory
export CUDNN_DIR=$(pwd)/cuda
export LD_LIBRARY_PATH=$CUDNN_DIR/lib64:$LD_LIBRARY_PATH
# 3.2 install ONNX Runtime
# install one of the following according to whether you need GPU inference
# 3.2.1 onnxruntime
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
export ONNXRUNTIME_DIR=$(pwd)/onnxruntime-linux-x64-1.8.1
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
# 3.2.2 onnxruntime-gpu
pip install onnxruntime-gpu==1.8.1
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-gpu-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-gpu-1.8.1.tgz
export ONNXRUNTIME_DIR=$(pwd)/onnxruntime-linux-x64-gpu-1.8.1
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
```
</details>
<details open>
<summary><b>Windows-x86_64</b></summary>
</details>
Please refer to [this](02-how-to-run/prebuilt_package_windows.md) guide to learn how to use the prebuilt package on Windows.
## Convert Model
After the installation, you can start your model deployment journey by converting a PyTorch model to a backend model with `tools/deploy.py`.
Based on the above settings, we provide an example to convert the Faster R-CNN in [MMDetection](https://github.com/open-mmlab/mmdetection) to TensorRT as below:
```shell
# clone mmdeploy to get the deployment config. `--recursive` is not necessary
git clone -b main https://github.com/open-mmlab/mmdeploy.git
# clone mmdetection repo. We have to use the config file to build PyTorch nn module
git clone -b 3.x https://github.com/open-mmlab/mmdetection.git
cd mmdetection
mim install -v -e .
cd ..
# download Faster R-CNN checkpoint
wget -P checkpoints https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
# run the command to start model conversion
python mmdeploy/tools/deploy.py \
mmdeploy/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
mmdetection/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
mmdetection/demo/demo.jpg \
--work-dir mmdeploy_model/faster-rcnn \
--device cuda \
--dump-info
```
The converted model and its meta info can be found in the path specified by `--work-dir`.
Together they make up the MMDeploy Model that can be fed to the MMDeploy SDK for model inference.
For more details about model conversion, you can read [how_to_convert_model](02-how-to-run/convert_model.md). If you want to customize the conversion pipeline, you can edit the config file by following [this](02-how-to-run/write_config.md) tutorial.
```{tip}
You can convert the above model to an ONNX model and perform ONNX Runtime inference
just by changing 'detection_tensorrt_dynamic-320x320-1344x1344.py' to 'detection_onnxruntime_dynamic.py' and setting '--device' to 'cpu'.
```
## Inference Model
After model conversion, we can perform inference not only by Model Converter but also by Inference SDK.
### Inference by Model Converter
Model Converter provides a unified API named `inference_model` to do the job, making the APIs of all inference backends transparent to users.
Take the previously converted Faster R-CNN TensorRT model as an example:
```python
from mmdeploy.apis import inference_model
result = inference_model(
model_cfg='mmdetection/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py',
deploy_cfg='mmdeploy/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py',
backend_files=['mmdeploy_model/faster-rcnn/end2end.engine'],
img='mmdetection/demo/demo.jpg',
device='cuda:0')
```
```{note}
'backend_files' in this API refers to backend engine file path, which MUST be put in a list, since some inference engines like OpenVINO and ncnn separate the network structure and its weights into two files.
```
### Inference by SDK
You can directly run MMDeploy demo programs in the precompiled package to get inference results.
```shell
wget https://github.com/open-mmlab/mmdeploy/releases/download/v1.3.1/mmdeploy-1.3.1-linux-x86_64-cuda11.8.tar.gz
tar xf mmdeploy-1.3.1-linux-x86_64-cuda11.8.tar.gz
cd mmdeploy-1.3.1-linux-x86_64-cuda11.8
# run python demo
python example/python/object_detection.py cuda ../mmdeploy_model/faster-rcnn ../mmdetection/demo/demo.jpg
# run C/C++ demo
# build the demo according to the README.md in the folder.
./bin/object_detection cuda ../mmdeploy_model/faster-rcnn ../mmdetection/demo/demo.jpg
```
```{note}
In the above commands, the input model is the SDK Model path. It is NOT the engine file path but the path passed to --work-dir, which contains not only the engine files but also meta information such as 'deploy.json' and 'pipeline.json'.
```
In the next section, we provide examples of deploying the converted Faster R-CNN model above with the SDK's different FFIs (Foreign Function Interfaces).
#### Python API
```python
from mmdeploy_runtime import Detector
import cv2
img = cv2.imread('mmdetection/demo/demo.jpg')
# create a detector
detector = Detector(model_path='mmdeploy_model/faster-rcnn', device_name='cuda', device_id=0)
# run the inference
bboxes, labels, _ = detector(img)
# Filter the result according to threshold
indices = [i for i in range(len(bboxes))]
for index, bbox, label_id in zip(indices, bboxes, labels):
[left, top, right, bottom], score = bbox[0:4].astype(int), bbox[4]
if score < 0.3:
continue
cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0))
cv2.imwrite('output_detection.png', img)
```
You can find more examples from [here](https://github.com/open-mmlab/mmdeploy/tree/main/demo/python).
#### C++ API
Using the SDK C++ API should follow the pattern below:
![image](https://user-images.githubusercontent.com/4560679/182554739-7fff57fc-5c84-44ed-b139-4749fae27404.png)
Now let's apply this procedure to the above Faster R-CNN model.
```C++
#include <cstdlib>
#include <opencv2/opencv.hpp>
#include "mmdeploy/detector.hpp"
int main() {
const char* device_name = "cuda";
int device_id = 0;
std::string model_path = "mmdeploy_model/faster-rcnn";
std::string image_path = "mmdetection/demo/demo.jpg";
// 1. load model
mmdeploy::Model model(model_path);
// 2. create predictor
mmdeploy::Detector detector(model, mmdeploy::Device{device_name, device_id});
// 3. read image
cv::Mat img = cv::imread(image_path);
// 4. inference
auto dets = detector.Apply(img);
// 5. deal with the result. Here we choose to visualize it
for (int i = 0; i < dets.size(); ++i) {
const auto& box = dets[i].bbox;
fprintf(stdout, "box %d, left=%.2f, top=%.2f, right=%.2f, bottom=%.2f, label=%d, score=%.4f\n",
i, box.left, box.top, box.right, box.bottom, dets[i].label_id, dets[i].score);
if (dets[i].score < 0.3) {
continue;
}
cv::rectangle(img, cv::Point{(int)box.left, (int)box.top},
cv::Point{(int)box.right, (int)box.bottom}, cv::Scalar{0, 255, 0});
}
cv::imwrite("output_detection.png", img);
return 0;
}
```
When you build this example, add the MMDeploy package to your CMake project as follows. Then pass `-DMMDeploy_DIR` to cmake, indicating the path where `MMDeployConfig.cmake` is located. You can find it in the prebuilt package.
```cmake
find_package(MMDeploy REQUIRED)
target_link_libraries(${name} PRIVATE mmdeploy ${OpenCV_LIBS})
```
For more SDK C++ API usages, please read these [samples](https://github.com/open-mmlab/mmdeploy/tree/main/demo/csrc/cpp).
For the remaining C, C# and Java API usages, please read [C demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo/csrc/c), [C# demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo/csharp) and [Java demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo/java) respectively.
We'll talk about them more in our next release.
#### Accelerate preprocessing (Experimental)
If you want to fuse preprocessing for acceleration, please refer to this [doc](./02-how-to-run/fuse_transform.md).
## Evaluate Model
You can test the performance of the deployed model using `tools/test.py`. For example:
```shell
python ${MMDEPLOY_DIR}/tools/test.py \
${MMDEPLOY_DIR}/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
${MMDET_DIR}/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
--model ${BACKEND_MODEL_FILES} \
--metrics ${METRICS} \
--device cuda:0
```
```{note}
Regarding the --model option: it is the path to the converted engine files when using Model Converter for a performance test; when testing metrics with the Inference SDK, it refers to the directory of the MMDeploy Model.
```
You can read [how to evaluate a model](02-how-to-run/profile_model.md) for more details.
Welcome to MMDeploy's documentation!
====================================
You can switch between Chinese and English documents in the lower-left corner of the layout.
.. toctree::
:maxdepth: 2
:caption: Get Started
get_started.md
.. toctree::
:maxdepth: 1
:caption: Build
01-how-to-build/build_from_source.md
01-how-to-build/build_from_docker.md
01-how-to-build/build_from_script.md
01-how-to-build/cmake_option.md
.. toctree::
:maxdepth: 1
:caption: Run & Test
02-how-to-run/convert_model.md
02-how-to-run/write_config.md
02-how-to-run/profile_model.md
02-how-to-run/quantize_model.md
02-how-to-run/useful_tools.md
.. toctree::
:maxdepth: 1
:caption: SDK Usage
sdk_usage/index.rst
.. toctree::
:maxdepth: 1
:caption: Benchmark
03-benchmark/supported_models.md
03-benchmark/benchmark.md
03-benchmark/benchmark_edge.md
03-benchmark/benchmark_tvm.md
03-benchmark/quantization.md
.. toctree::
:maxdepth: 1
:caption: OpenMMLab Codebase Support
04-supported-codebases/mmpretrain.md
04-supported-codebases/mmdet.md
04-supported-codebases/mmseg.md
04-supported-codebases/mmagic.md
04-supported-codebases/mmocr.md
04-supported-codebases/mmpose.md
04-supported-codebases/mmdet3d.md
04-supported-codebases/mmrotate.md
04-supported-codebases/mmaction2.md
.. toctree::
:maxdepth: 1
:caption: Backend Support
05-supported-backends/ncnn.md
05-supported-backends/onnxruntime.md
05-supported-backends/openvino.md
05-supported-backends/pplnn.md
05-supported-backends/snpe.md
05-supported-backends/tensorrt.md
05-supported-backends/torchscript.md
05-supported-backends/rknn.md
05-supported-backends/tvm.md
05-supported-backends/coreml.md
.. toctree::
:maxdepth: 1
:caption: Custom Ops
06-custom-ops/onnxruntime.md
06-custom-ops/tensorrt.md
06-custom-ops/ncnn.md
.. toctree::
:maxdepth: 1
:caption: Developer Guide
07-developer-guide/architecture.md
07-developer-guide/support_new_model.md
07-developer-guide/support_new_backend.md
07-developer-guide/add_backend_ops_unittest.md
07-developer-guide/test_rewritten_models.md
07-developer-guide/partition_model.md
07-developer-guide/regression_test.md
.. toctree::
:maxdepth: 1
:caption: Experimental feature
experimental/onnx_optimizer.md
.. toctree::
:maxdepth: 1
:caption: Appendix
appendix/cross_build_snpe_service.md
.. toctree::
:maxdepth: 1
:caption: FAQ
faq.md
.. toctree::
:caption: Switch Language
switch_language.md
.. toctree::
:maxdepth: 1
:caption: API Reference
api.rst
Indices and tables
==================
* :ref:`genindex`
* :ref:`search`
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build
if "%1" == "" goto help
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
:end
popd