# MMDetection Deployment
- [MMDetection Deployment](#mmdetection-deployment)
- [Installation](#installation)
- [Install mmdet](#install-mmdet)
- [Install mmdeploy](#install-mmdeploy)
- [Convert model](#convert-model)
- [Model specification](#model-specification)
- [Model inference](#model-inference)
- [Backend model inference](#backend-model-inference)
- [SDK model inference](#sdk-model-inference)
- [Supported models](#supported-models)
- [Reminder](#reminder)
______________________________________________________________________
[MMDetection](https://github.com/open-mmlab/mmdetection) aka `mmdet` is an open source object detection toolbox based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
## Installation
### Install mmdet
Please follow the [installation guide](https://mmdetection.readthedocs.io/en/3.x/get_started.html) to install mmdet.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or a later version**, we encourage you to run the
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as the inference engine `ONNX Runtime`.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)
export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH
```
**Method III:** Build from source
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmdet models to the specified backend models. Its detailed usage can be learned from [here](../02-how-to-run/convert_model.md).
The command below shows an example of converting the `Faster R-CNN` model to an ONNX model that can be inferred by ONNX Runtime.
```shell
cd mmdeploy
# download faster r-cnn model from mmdet model zoo
mim download mmdet --config faster-rcnn_r50_fpn_1x_coco --dest .
# convert mmdet model to onnxruntime model with dynamic shape
python tools/deploy.py \
    configs/mmdet/detection/detection_onnxruntime_dynamic.py \
    faster-rcnn_r50_fpn_1x_coco.py \
    faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    demo/resources/det.jpg \
    --work-dir mmdeploy_models/mmdet/ort \
    --device cpu \
    --show \
    --dump-info
```
It is crucial to specify the correct deployment config during model conversion. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmdet) of all supported backends for mmdetection, under which the config file path follows the pattern:
```
{task}/{task}_{backend}-{precision}_{static | dynamic}_{shape}.py
```
- **{task}:** task in mmdetection.
There are two of them. One is `detection` and the other is `instance-seg`, indicating instance segmentation.
mmdet models like `RetinaNet`, `Faster R-CNN` and `DETR` belong to the `detection` task, while `Mask R-CNN` is one of the `instance-seg` models. You can find more of them in the chapter [Supported models](#supported-models).
**DO REMEMBER TO USE** `detection/detection_*.py` deployment config files when converting detection models and `instance-seg/instance-seg_*.py` when deploying instance segmentation models.
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
Therefore, in the above example, you can also convert `faster r-cnn` to other backend models by changing the deployment config file `detection_onnxruntime_dynamic.py` to [others](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmdet/detection), e.g., converting to tensorrt-fp16 model by `detection_tensorrt-fp16_dynamic-320x320-1344x1344.py`.
```{tip}
When converting mmdet models to tensorrt models, --device should be set to "cuda"
```
## Model specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory specified during conversion, i.e., `mmdeploy_models/mmdet/ort` in the previous example. It includes:
```
mmdeploy_models/mmdet/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmdet/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
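As an optional sanity check before moving on, you can inspect the exported graph with the `onnx` Python package. This is a minimal sketch; the printed tensor names depend on the deploy config used during conversion:
```python
import onnx

# load the exported graph (no inference backend is needed for this check)
onnx_model = onnx.load('mmdeploy_models/mmdet/ort/end2end.onnx')
# print graph-level input and output tensor names
print('inputs :', [inp.name for inp in onnx_model.graph.input])
print('outputs:', [out.name for out in onnx_model.graph.output])
```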
## Model inference
### Backend model inference
Taking the previously converted `end2end.onnx` model as an example, you can use the following code to run inference and visualize the results.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch
deploy_cfg = 'configs/mmdet/detection/detection_onnxruntime_dynamic.py'
model_cfg = './faster-rcnn_r50_fpn_1x_coco.py'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmdet/ort/end2end.onnx']
image = './demo/resources/det.jpg'
# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)
# visualize results
task_processor.visualize(
    image=image,
    model=model,
    result=result[0],
    window_name='visualize',
    output_file='output_detection.png')
```
### SDK model inference
You can also perform SDK model inference as follows:
```python
from mmdeploy_runtime import Detector
import cv2
img = cv2.imread('./demo/resources/det.jpg')
# create a detector
detector = Detector(model_path='./mmdeploy_models/mmdet/ort', device_name='cpu', device_id=0)
# perform inference
bboxes, labels, masks = detector(img)
# visualize inference result
indices = [i for i in range(len(bboxes))]
for index, bbox, label_id in zip(indices, bboxes, labels):
    [left, top, right, bottom], score = bbox[0:4].astype(int), bbox[4]
    if score < 0.3:
        continue
    cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0))
cv2.imwrite('output_detection.png', img)
```
Besides the Python API, the mmdeploy SDK also provides FFIs (Foreign Function Interfaces) for other languages such as C, C++, C#, and Java. You can learn their usage from the [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).
## Supported models
| Model | Task | OnnxRuntime | TensorRT | ncnn | PPLNN | OpenVINO |
| :-----------------------------------------------------------------------------------------------------------------: | :-------------------: | :---------: | :------: | :--: | :---: | :------: |
| [ATSS](https://github.com/open-mmlab/mmdetection/tree/main/configs/atss) | Object Detection | Y | Y | N | N | Y |
| [FCOS](https://github.com/open-mmlab/mmdetection/tree/main/configs/fcos) | Object Detection | Y | Y | Y | N | Y |
| [FoveaBox](https://github.com/open-mmlab/mmdetection/tree/main/configs/foveabox) | Object Detection | Y | N | N | N | Y |
| [FSAF](https://github.com/open-mmlab/mmdetection/tree/main/configs/fsaf) | Object Detection | Y | Y | Y | Y | Y |
| [RetinaNet](https://github.com/open-mmlab/mmdetection/tree/main/configs/retinanet) | Object Detection | Y | Y | Y | Y | Y |
| [SSD](https://github.com/open-mmlab/mmdetection/tree/main/configs/ssd) | Object Detection | Y | Y | Y | N | Y |
| [VFNet](https://github.com/open-mmlab/mmdetection/tree/main/configs/vfnet) | Object Detection | N | N | N | N | Y |
| [YOLOv3](https://github.com/open-mmlab/mmdetection/tree/main/configs/yolo) | Object Detection | Y | Y | Y | N | Y |
| [YOLOX](https://github.com/open-mmlab/mmdetection/tree/main/configs/yolox) | Object Detection | Y | Y | Y | N | Y |
| [Cascade R-CNN](https://github.com/open-mmlab/mmdetection/tree/main/configs/cascade_rcnn) | Object Detection | Y | Y | N | Y | Y |
| [Faster R-CNN](https://github.com/open-mmlab/mmdetection/tree/main/configs/faster_rcnn) | Object Detection | Y | Y | Y | Y | Y |
| [Faster R-CNN + DCN](https://github.com/open-mmlab/mmdetection/tree/main/configs/faster_rcnn) | Object Detection | Y | Y | Y | Y | Y |
| [GFL](https://github.com/open-mmlab/mmdetection/tree/main/configs/gfl) | Object Detection | Y | Y | N | ? | Y |
| [RepPoints](https://github.com/open-mmlab/mmdetection/tree/main/configs/reppoints) | Object Detection | N | Y | N | ? | Y |
| [DETR](https://github.com/open-mmlab/mmdetection/tree/main/configs/detr)[\*](#nobatchinfer) | Object Detection | Y | Y | N | ? | Y |
| [Deformable DETR](https://github.com/open-mmlab/mmdetection/tree/main/configs/deformable_detr)[\*](#nobatchinfer) | Object Detection | Y | Y | N | ? | Y |
| [Conditional DETR](https://github.com/open-mmlab/mmdetection/tree/main/configs/conditional_detr)[\*](#nobatchinfer) | Object Detection | Y | Y | N | ? | Y |
| [DAB-DETR](https://github.com/open-mmlab/mmdetection/tree/main/configs/dab_detr)[\*](#nobatchinfer) | Object Detection | Y | Y | N | ? | Y |
| [DINO](https://github.com/open-mmlab/mmdetection/tree/main/configs/dino)[\*](#nobatchinfer) | Object Detection | Y | Y | N | ? | Y |
| [CenterNet](https://github.com/open-mmlab/mmdetection/tree/main/configs/centernet) | Object Detection | Y | Y | N | ? | Y |
| [RTMDet](https://github.com/open-mmlab/mmdetection/tree/main/configs/rtmdet) | Object Detection | Y | Y | N | ? | Y |
| [Cascade Mask R-CNN](https://github.com/open-mmlab/mmdetection/tree/main/configs/cascade_rcnn) | Instance Segmentation | Y | Y | N | N | Y |
| [HTC](https://github.com/open-mmlab/mmdetection/tree/main/configs/htc) | Instance Segmentation | Y | Y | N | ? | Y |
| [Mask R-CNN](https://github.com/open-mmlab/mmdetection/tree/main/configs/mask_rcnn) | Instance Segmentation | Y | Y | N | N | Y |
| [Swin Transformer](https://github.com/open-mmlab/mmdetection/tree/main/configs/swin) | Instance Segmentation | Y | Y | N | N | Y |
| [SOLO](https://github.com/open-mmlab/mmdetection/tree/main/configs/solo) | Instance Segmentation | Y | N | N | N | Y |
| [SOLOv2](https://github.com/open-mmlab/mmdetection/tree/main/configs/solov2) | Instance Segmentation | Y | N | N | N | Y |
| [CondInst](https://github.com/open-mmlab/mmdetection/tree/main/configs/condinst) | Instance Segmentation | Y | Y | N | N | N |
| [Panoptic FPN](https://github.com/open-mmlab/mmdetection/tree/main/configs/panoptic_fpn) | Panoptic Segmentation | Y | Y | N | N | N |
| [MaskFormer](https://github.com/open-mmlab/mmdetection/tree/main/configs/maskformer) | Panoptic Segmentation | Y | Y | N | N | N |
| [Mask2Former](https://github.com/open-mmlab/mmdetection/tree/main/configs/mask2former)[\*](#mask2former) | Panoptic Segmentation | Y | Y | N | N | N |
## Reminder
- For transformer-based models, we strongly suggest using `TensorRT>=8.4`.
- <i id="mask2former">Mask2Former</i> should use `TensorRT>=8.6.1` for dynamic shape inference.
- <i id="nobatchinfer">DETR-like models</i> do not support multi-batch inference.
# MMDetection3d Deployment
- [MMDetection3d Deployment](#mmdetection3d-deployment)
- [Install mmdet3d](#install-mmdet3d)
- [Convert model](#convert-model)
- [Model inference](#model-inference)
- [Supported models](#supported-models)
______________________________________________________________________
[MMDetection3d](https://github.com/open-mmlab/mmdetection3d) aka `mmdet3d` is an open source object detection toolbox based on PyTorch, towards the next-generation platform for general 3D detection. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
## Install mmdet3d
We can install mmdet3d through [mim](https://github.com/open-mmlab/mim).
For other installation methods, please refer to [here](https://mmdetection3d.readthedocs.io/en/latest/get_started.html#installation)
```bash
python3 -m pip install -U openmim
python3 -m mim install "mmdet3d>=1.1.0"
```
## Convert model
For example, use `tools/deploy.py` to convert `centerpoint` to the onnxruntime format:
```bash
# cd to mmdeploy root directory
# download config and model
mim download mmdet3d --config centerpoint_pillar02_second_secfpn_head-circlenms_8xb4-cyclic-20e_nus-3d --dest .
export MODEL_CONFIG=centerpoint_pillar02_second_secfpn_head-circlenms_8xb4-cyclic-20e_nus-3d.py
export MODEL_PATH=centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus_20220811_031844-191a3822.pth
export TEST_DATA=tests/data/n008-2018-08-01-15-16-36-0400__LIDAR_TOP__1533151612397179.pcd.bin
python3 tools/deploy.py configs/mmdet3d/voxel-detection/voxel-detection_onnxruntime_dynamic.py $MODEL_CONFIG $MODEL_PATH $TEST_DATA --work-dir centerpoint
```
This step generates `end2end.onnx` under the work directory:
```bash
ls -lah centerpoint
..
-rw-rw-r-- 1 rg rg 87M Nov  4 19:48 end2end.onnx
```
## Model inference
At present, the voxelization preprocessing and the postprocessing of mmdet3d are not converted into onnx operations, and the C++ SDK has not yet implemented the voxelization computation.
The caller needs to refer to the corresponding [Python implementation](../../../mmdeploy/codebase/mmdet3d/deploy/voxel_detection_model.py) to complete these steps.
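Until then, backend model inference can be driven from Python, where the task processor handles voxelization. The sketch below assumes the same task-processor workflow shown for the 2D codebases in this document also applies to voxel detection, and reuses the paths from the conversion example above; treat it as a starting point rather than a definitive recipe.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import load_config
import torch

deploy_cfg = 'configs/mmdet3d/voxel-detection/voxel-detection_onnxruntime_dynamic.py'
model_cfg = 'centerpoint_pillar02_second_secfpn_head-circlenms_8xb4-cyclic-20e_nus-3d.py'
device = 'cpu'
backend_model = ['./centerpoint/end2end.onnx']
pcd = 'tests/data/n008-2018-08-01-15-16-36-0400__LIDAR_TOP__1533151612397179.pcd.bin'

# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task processor and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# create_input voxelizes the point cloud on the Python side before
# feeding the backend model
model_inputs, _ = task_processor.create_input(pcd)
# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)
print(result[0])
```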
## Supported models
| model | task | dataset | onnxruntime | openvino | tensorrt\* |
| :------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :-----------------: | :------: | :---------: | :------: | :--------: |
| [centerpoint](https://github.com/open-mmlab/mmdetection3d/blob/main/configs/centerpoint/centerpoint_pillar02_second_secfpn_head-circlenms_8xb4-cyclic-20e_nus-3d.py) | voxel detection | nuScenes | ✔️ | ✔️ | ✔️ |
| [pointpillars](https://github.com/open-mmlab/mmdetection3d/blob/main/configs/pointpillars/pointpillars_hv_secfpn_sbn-all_8xb4-2x_nus-3d.py) | voxel detection | nuScenes | ✔️ | ✔️ | ✔️ |
| [pointpillars](https://github.com/open-mmlab/mmdetection3d/blob/main/configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-3class.py) | voxel detection | KITTI | ✔️ | ✔️ | ✔️ |
| [smoke](https://github.com/open-mmlab/mmdetection3d/blob/main/configs/smoke/smoke_dla34_dlaneck_gn-all_4xb8-6x_kitti-mono3d.py) | monocular detection | KITTI | ✔️ | x | ✔️ |
- Make sure to use TensorRT >= 8.6, which fixes several bugs such as ScatterND failures and dynamic-shape crashes.
# MMOCR Deployment
- [MMOCR Deployment](#mmocr-deployment)
- [Installation](#installation)
- [Install mmocr](#install-mmocr)
- [Install mmdeploy](#install-mmdeploy)
- [Convert model](#convert-model)
- [Convert text detection model](#convert-text-detection-model)
- [Convert text recognition model](#convert-text-recognition-model)
- [Model specification](#model-specification)
- [Model Inference](#model-inference)
- [Backend model inference](#backend-model-inference)
- [SDK model inference](#sdk-model-inference)
- [Text detection SDK model inference](#text-detection-sdk-model-inference)
- [Text Recognition SDK model inference](#text-recognition-sdk-model-inference)
- [Supported models](#supported-models)
- [Reminder](#reminder)
______________________________________________________________________
[MMOCR](https://github.com/open-mmlab/mmocr/tree/main) aka `mmocr` is an open-source toolbox based on PyTorch and mmdetection for text detection, text recognition, and the corresponding downstream tasks including key information extraction. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
## Installation
### Install mmocr
Please follow the [installation guide](https://mmocr.readthedocs.io/en/latest/get_started/install.html) to install mmocr.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or a later version**, we encourage you to run the
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as the inference engine `ONNX Runtime`.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)
export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH
```
**Method III:** Build from source
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmocr models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/02-how-to-run/convert_model.md#usage).
When using `tools/deploy.py`, it is crucial to specify the correct deployment config. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmocr) of all supported backends for mmocr, under which the config file path follows the pattern:
```
{task}/{task}_{backend}-{precision}_{static | dynamic}_{shape}.py
```
- **{task}:** task in mmocr.
MMDeploy supports models of two mmocr tasks: one is `text detection` and the other is `text recognition`.
**DO REMEMBER TO USE** the corresponding deployment config file when converting models of different tasks.
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
In the next two chapters, we will take the `dbnet` model from the `text detection` task and the `crnn` model from the `text recognition` task as examples, showing how to convert them to ONNX models that can be inferred by ONNX Runtime.
### Convert text detection model
```shell
cd mmdeploy
# download dbnet model from mmocr model zoo
mim download mmocr --config dbnet_resnet18_fpnc_1200e_icdar2015 --dest .
# convert mmocr model to onnxruntime model with dynamic shape
python tools/deploy.py \
    configs/mmocr/text-detection/text-detection_onnxruntime_dynamic.py \
    dbnet_resnet18_fpnc_1200e_icdar2015.py \
    dbnet_resnet18_fpnc_1200e_icdar2015_20220825_221614-7c0e94f2.pth \
    demo/resources/text_det.jpg \
    --work-dir mmdeploy_models/mmocr/dbnet/ort \
    --device cpu \
    --show \
    --dump-info
```
### Convert text recognition model
```shell
cd mmdeploy
# download crnn model from mmocr model zoo
mim download mmocr --config crnn_mini-vgg_5e_mj --dest .
# convert mmocr model to onnxruntime model with dynamic shape
python tools/deploy.py \
    configs/mmocr/text-recognition/text-recognition_onnxruntime_dynamic.py \
    crnn_mini-vgg_5e_mj.py \
    crnn_mini-vgg_5e_mj_20220826_224120-8afbedbb.pth \
    demo/resources/text_recog.jpg \
    --work-dir mmdeploy_models/mmocr/crnn/ort \
    --device cpu \
    --show \
    --dump-info
```
You can also convert the above models to other backend models by changing the deployment config file `*_onnxruntime_dynamic.py` to [others](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmocr), e.g., converting `dbnet` to a tensorrt-fp32 model with `text-detection/text-detection_tensorrt_dynamic-320x320-2240x2240.py`.
```{tip}
When converting mmocr models to tensorrt models, --device should be set to "cuda"
```
## Model specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory specified during conversion, i.e., `mmdeploy_models/mmocr/dbnet/ort` in the previous example. It includes:
```
mmdeploy_models/mmocr/dbnet/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmocr/dbnet/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
## Model Inference
### Backend model inference
Taking the previously converted `end2end.onnx` model of `dbnet` as an example, you can use the following code to run inference and visualize the results.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch
deploy_cfg = 'configs/mmocr/text-detection/text-detection_onnxruntime_dynamic.py'
model_cfg = 'dbnet_resnet18_fpnc_1200e_icdar2015.py'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmocr/dbnet/ort/end2end.onnx']
image = './demo/resources/text_det.jpg'
# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)
# visualize results
task_processor.visualize(
    image=image,
    model=model,
    result=result[0],
    window_name='visualize',
    output_file='output_ocr.png')
```
**Tip**:
By mapping `deploy_cfg`, `model_cfg`, `backend_model` and `image` to the corresponding artifacts from the chapter [convert text recognition model](#convert-text-recognition-model), you will get the ONNX Runtime inference results of the `crnn` onnx model, as sketched below.
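Concretely, using the files produced in that chapter, the mapping would look like this (the remaining code stays unchanged):
```python
deploy_cfg = 'configs/mmocr/text-recognition/text-recognition_onnxruntime_dynamic.py'
model_cfg = 'crnn_mini-vgg_5e_mj.py'
backend_model = ['./mmdeploy_models/mmocr/crnn/ort/end2end.onnx']
image = './demo/resources/text_recog.jpg'
```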
### SDK model inference
Given the above SDK models of `dbnet` and `crnn`, you can also perform SDK model inference as follows.
#### Text detection SDK model inference
```python
import cv2
from mmdeploy_runtime import TextDetector
img = cv2.imread('demo/resources/text_det.jpg')
# create text detector
detector = TextDetector(
    model_path='mmdeploy_models/mmocr/dbnet/ort',
    device_name='cpu',
    device_id=0)
# do model inference
bboxes = detector(img)
# draw detected bbox into the input image
if len(bboxes) > 0:
    pts = ((bboxes[:, 0:8] + 0.5).reshape(len(bboxes), -1, 2).astype(int))
    cv2.polylines(img, pts, True, (0, 255, 0), 2)
cv2.imwrite('output_ocr.png', img)
```
#### Text Recognition SDK model inference
```python
import cv2
from mmdeploy_runtime import TextRecognizer
img = cv2.imread('demo/resources/text_recog.jpg')
# create text recognizer
recognizer = TextRecognizer(
    model_path='mmdeploy_models/mmocr/crnn/ort',
    device_name='cpu',
    device_id=0)
# do model inference
texts = recognizer(img)
# print the result
print(texts)
```
Besides the Python API, the mmdeploy SDK also provides FFIs (Foreign Function Interfaces) for other languages such as C, C++, C#, and Java. You can learn their usage from the [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).
## Supported models
| Model | Task | TorchScript | OnnxRuntime | TensorRT | ncnn | PPLNN | OpenVINO |
| :----------------------------------------------------------------------------------- | :--------------- | :---------: | :---------: | :------: | :--: | :---: | :------: |
| [DBNet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet) | text-detection | Y | Y | Y | Y | Y | Y |
| [DBNetpp](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnetpp) | text-detection | N | Y | Y | ? | ? | Y |
| [PSENet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/psenet) | text-detection | Y | Y | Y | Y | N | Y |
| [PANet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet) | text-detection | Y | Y | Y | Y | N | Y |
| [TextSnake](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/textsnake) | text-detection | Y | Y | Y | ? | ? | ? |
| [MaskRCNN](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/maskrcnn) | text-detection | Y | Y | Y | ? | ? | ? |
| [CRNN](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/crnn) | text-recognition | Y | Y | Y | Y | Y | N |
| [SAR](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/sar) | text-recognition | N | Y | Y | N | N | N |
| [SATRN](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/satrn) | text-recognition | Y | Y | Y | N | N | N |
| [ABINet](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/abinet) | text-recognition | Y | Y | Y | ? | ? | ? |
## Reminder
- ABINet on TensorRT requires PyTorch 1.10+ and TensorRT 8.4+.
- SAR uses `valid_ratio` during network inference, which causes a performance drop. When the `valid_ratio` of the test image differs greatly from that of the image used for conversion, the gap widens.
- For the TensorRT backend, users have to choose the right config. For example, CRNN only accepts single-channel input. Here is a recommendation table:
| Model | Config |
| :------- | :--------------------------------------------------------- |
| MaskRCNN | text-detection_mrcnn_tensorrt_dynamic-320x320-2240x2240.py |
| CRNN | text-recognition_tensorrt_dynamic-1x32x32-1x32x640.py |
| SATRN | text-recognition_tensorrt_dynamic-32x32-32x640.py |
| SAR | text-recognition_tensorrt_dynamic-48x64-48x640.py |
| ABINet | text-recognition_tensorrt_static-32x128.py |
# MMPose Deployment
- [MMPose Deployment](#mmpose-deployment)
- [Installation](#installation)
- [Install mmpose](#install-mmpose)
- [Install mmdeploy](#install-mmdeploy)
- [Convert model](#convert-model)
- [Model specification](#model-specification)
- [Model inference](#model-inference)
- [Backend model inference](#backend-model-inference)
- [SDK model inference](#sdk-model-inference)
- [Supported models](#supported-models)
______________________________________________________________________
[MMPose](https://github.com/open-mmlab/mmpose/tree/main) aka `mmpose` is an open-source toolbox for pose estimation based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
## Installation
### Install mmpose
Please follow the [best practice](https://mmpose.readthedocs.io/en/latest/installation.html#best-practices) to install mmpose.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or a later version**, we encourage you to run the
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as the inference engine `ONNX Runtime`.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)
export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH
```
**Method III:** Build from source
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmpose models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/02-how-to-run/convert_model.md#usage).
The command below shows an example of converting the `hrnet` model to an ONNX model that can be inferred by ONNX Runtime.
```shell
cd mmdeploy
# download hrnet model from mmpose model zoo
mim download mmpose --config td-hm_hrnet-w32_8xb64-210e_coco-256x192 --dest .
# convert mmpose model to onnxruntime model with static shape
python tools/deploy.py \
    configs/mmpose/pose-detection_onnxruntime_static.py \
    td-hm_hrnet-w32_8xb64-210e_coco-256x192.py \
    hrnet_w32_coco_256x192-c78dce93_20200708.pth \
    demo/resources/human-pose.jpg \
    --work-dir mmdeploy_models/mmpose/ort \
    --device cpu \
    --show
```
It is crucial to specify the correct deployment config during model conversion. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmpose) of all supported backends for mmpose. The config filename pattern is:
```
pose-detection_{backend}-{precision}_{static | dynamic}_{shape}.py
```
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
Therefore, in the above example, you can also convert `hrnet` to other backend models by changing the deployment config file `pose-detection_onnxruntime_static.py` to [others](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmpose), e.g., converting to tensorrt model by `pose-detection_tensorrt_static-256x192.py`.
```{tip}
When converting mmpose models to tensorrt models, --device should be set to "cuda"
```
## Model specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory specified during conversion, i.e., `mmdeploy_models/mmpose/ort` in the previous example. It includes:
```
mmdeploy_models/mmpose/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmpose/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
## Model inference
### Backend model inference
Taking the previously converted `end2end.onnx` model as an example, you can use the following code to run inference and visualize the results.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch
deploy_cfg = 'configs/mmpose/pose-detection_onnxruntime_static.py'
model_cfg = 'td-hm_hrnet-w32_8xb64-210e_coco-256x192.py'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmpose/ort/end2end.onnx']
image = './demo/resources/human-pose.jpg'
# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)
# visualize results
task_processor.visualize(
    image=image,
    model=model,
    result=result[0],
    window_name='visualize',
    output_file='output_pose.png')
```
### SDK model inference
TODO
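In the meantime, here is a minimal sketch. It assumes `mmdeploy_runtime` exposes a `PoseDetector` that follows the same constructor convention as the other task APIs in this document, and that the SDK `*.json` files have been generated (e.g., by re-running the conversion with `--dump-info`); the output layout is also an assumption.
```python
from mmdeploy_runtime import PoseDetector  # assumed API, mirroring Detector/Classifier above
import cv2

img = cv2.imread('./demo/resources/human-pose.jpg')
# create a pose detector from the SDK model directory
# (assumption: the directory contains the SDK *.json files)
detector = PoseDetector(model_path='./mmdeploy_models/mmpose/ort', device_name='cpu', device_id=0)
# perform inference; the result is assumed to hold per-keypoint (x, y, score)
result = detector(img)
print(result)
```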
## Supported models
| Model | Task | ONNX Runtime | TensorRT | ncnn | PPLNN | OpenVINO |
| :-------------------------------------------------------------------------------------------------------- | :------------ | :----------: | :------: | :--: | :---: | :------: |
| [HRNet](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#hrnet-cvpr-2019) | PoseDetection | Y | Y | Y | N | Y |
| [MSPN](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#mspn-arxiv-2019) | PoseDetection | Y | Y | Y | N | Y |
| [LiteHRNet](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#litehrnet-cvpr-2021) | PoseDetection | Y | Y | Y | N | Y |
| [Hourglass](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/algorithms.html#hourglass-eccv-2016) | PoseDetection | Y | Y | Y | N | Y |
| [SimCC](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/algorithms.html#simcc-eccv-2022) | PoseDetection | Y | Y | Y | N | Y |
| [RTMPose](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmpose) | PoseDetection | Y | Y | Y | N | Y |
| [YoloX-Pose](https://github.com/open-mmlab/mmpose/tree/main/projects/yolox_pose) | PoseDetection | Y | Y | N | N | Y |
| [RTMO](https://github.com/open-mmlab/mmpose/tree/dev-1.x/projects/rtmo) | PoseDetection | Y | Y | N | N | N |
# MMPretrain Deployment
- [MMPretrain Deployment](#mmpretrain-deployment)
- [Installation](#installation)
- [Install mmpretrain](#install-mmpretrain)
- [Install mmdeploy](#install-mmdeploy)
- [Convert model](#convert-model)
- [Model Specification](#model-specification)
- [Model inference](#model-inference)
- [Backend model inference](#backend-model-inference)
- [SDK model inference](#sdk-model-inference)
- [Supported models](#supported-models)
______________________________________________________________________
[MMPretrain](https://github.com/open-mmlab/mmpretrain) aka `mmpretrain` is an open-source image classification toolbox based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com) project.
## Installation
### Install mmpretrain
Please follow this [quick guide](https://github.com/open-mmlab/mmpretrain/tree/main#installation) to install mmpretrain.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or a later version**, we encourage you to run the
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as the inference engine `ONNX Runtime`.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)
export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH
```
**Method III:** Build from source
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmpretrain models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/02-how-to-run/convert_model.md#usage).
The command below shows an example of converting the `resnet18` model to an ONNX model that can be inferred by ONNX Runtime.
```shell
cd mmdeploy
# download resnet18 model from mmpretrain model zoo
mim download mmpretrain --config resnet18_8xb32_in1k --dest .
# convert mmpretrain model to onnxruntime model with dynamic shape
python tools/deploy.py \
    configs/mmpretrain/classification_onnxruntime_dynamic.py \
    resnet18_8xb32_in1k.py \
    resnet18_8xb32_in1k_20210831-fbbb1da6.pth \
    tests/data/tiger.jpeg \
    --work-dir mmdeploy_models/mmpretrain/ort \
    --device cpu \
    --show \
    --dump-info
```
It is crucial to specify the correct deployment config during model conversion. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmpretrain) of all supported backends for mmpretrain. The config filename pattern is:
```
classification_{backend}-{precision}_{static | dynamic}_{shape}.py
```
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml, etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
Therefore, in the above example, you can also convert `resnet18` to other backend models by changing the deployment config file `classification_onnxruntime_dynamic.py` to [others](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmpretrain), e.g., converting to tensorrt-fp16 model by `classification_tensorrt-fp16_dynamic-224x224-224x224.py`.
```{tip}
When converting mmpretrain models to tensorrt models, --device should be set to "cuda"
```
## Model Specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory specified during conversion, i.e., `mmdeploy_models/mmpretrain/ort` in the previous example. It includes:
```
mmdeploy_models/mmpretrain/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmpretrain/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
## Model inference
### Backend model inference
Taking the previously converted `end2end.onnx` model as an example, you can use the following code to run inference with the model.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch
deploy_cfg = 'configs/mmpretrain/classification_onnxruntime_dynamic.py'
model_cfg = './resnet18_8xb32_in1k.py'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmpretrain/ort/end2end.onnx']
image = 'tests/data/tiger.jpeg'
# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)
# visualize results
task_processor.visualize(
    image=image,
    model=model,
    result=result[0],
    window_name='visualize',
    output_file='output_classification.png')
```
### SDK model inference
You can also perform SDK model inference as follows:
```python
from mmdeploy_runtime import Classifier
import cv2
img = cv2.imread('tests/data/tiger.jpeg')
# create a classifier
classifier = Classifier(model_path='./mmdeploy_models/mmpretrain/ort', device_name='cpu', device_id=0)
# perform inference
result = classifier(img)
# show inference result
for label_id, score in result:
    print(label_id, score)
```
Besides the Python API, the mmdeploy SDK also provides FFIs (Foreign Function Interfaces) for other languages such as C, C++, C#, and Java. You can learn their usage from the [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).
## Supported models
| Model | TorchScript | ONNX Runtime | TensorRT | ncnn | PPLNN | OpenVINO |
| :------------------------------------------------------------------------------------------------- | :---------: | :----------: | :------: | :--: | :---: | :------: |
| [ResNet](https://github.com/open-mmlab/mmpretrain/tree/main/configs/resnet) | Y | Y | Y | Y | Y | Y |
| [ResNeXt](https://github.com/open-mmlab/mmpretrain/tree/main/configs/resnext) | Y | Y | Y | Y | Y | Y |
| [SE-ResNet](https://github.com/open-mmlab/mmpretrain/tree/main/configs/seresnet) | Y | Y | Y | Y | Y | Y |
| [MobileNetV2](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobilenet_v2) | Y | Y | Y | Y | Y | Y |
| [MobileNetV3](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobilenet_v3) | Y | Y | Y | Y | ? | Y |
| [ShuffleNetV1](https://github.com/open-mmlab/mmpretrain/tree/main/configs/shufflenet_v1) | Y | Y | Y | Y | Y | Y |
| [ShuffleNetV2](https://github.com/open-mmlab/mmpretrain/tree/main/configs/shufflenet_v2) | Y | Y | Y | Y | Y | Y |
| [VisionTransformer](https://github.com/open-mmlab/mmpretrain/tree/main/configs/vision_transformer) | Y | Y | Y | Y | ? | Y |
| [SwinTransformer](https://github.com/open-mmlab/mmpretrain/tree/main/configs/swin_transformer) | Y | Y | Y | N | ? | Y |
| [MobileOne](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobileone) | Y | Y | Y | Y | ? | Y |
| [EfficientNet](https://github.com/open-mmlab/mmpretrain/tree/main/configs/efficientnet) | Y | Y | Y | N | ? | Y |
| [Conformer](https://github.com/open-mmlab/mmpretrain/tree/main/configs/conformer) | Y | Y | Y | N | ? | Y |
| [EfficientFormer](https://github.com/open-mmlab/mmpretrain/tree/main/configs/efficientformer) | Y | Y | Y | N | ? | Y |
# MMRotate Deployment
- [MMRotate Deployment](#mmrotate-deployment)
- [Installation](#installation)
- [Install mmrotate](#install-mmrotate)
- [Install mmdeploy](#install-mmdeploy)
- [Convert model](#convert-model)
- [Model specification](#model-specification)
- [Model inference](#model-inference)
- [Backend model inference](#backend-model-inference)
- [SDK model inference](#sdk-model-inference)
- [Supported models](#supported-models)
______________________________________________________________________
[MMRotate](https://github.com/open-mmlab/mmrotate) is an open-source toolbox for rotated object detection based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
## Installation
### Install mmrotate
Please follow the [installation guide](https://mmrotate.readthedocs.io/en/1.x/get_started.html) to install mmrotate.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or a later version**, we encourage you to run the
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as the inference engine `ONNX Runtime`.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)
export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH
```
**NOTE**:
- Adding `$(pwd)/build/lib` to `PYTHONPATH` is for importing the mmdeploy SDK Python module `mmdeploy_runtime`, which will be presented in the chapter [SDK model inference](#sdk-model-inference).
- When [inferring the onnx model with ONNX Runtime](#backend-model-inference), the ONNX Runtime library must be discoverable. Thus, we add it to `LD_LIBRARY_PATH`.
**Method III:** Build from source
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/blob/main/tools/deploy.py) to convert mmrotate models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/blob/main/docs/en/02-how-to-run/convert_model.md#usage).
The command below shows an example of converting the `rotated-faster-rcnn` model to an ONNX model that can be inferred by ONNX Runtime.
```shell
cd mmdeploy
# download rotated-faster-rcnn model from mmrotate model zoo
mim download mmrotate --config rotated-faster-rcnn-le90_r50_fpn_1x_dota --dest .
wget https://github.com/open-mmlab/mmrotate/raw/main/demo/dota_demo.jpg
# convert mmrotate model to onnxruntime model with dynamic shape
python tools/deploy.py \
    configs/mmrotate/rotated-detection_onnxruntime_dynamic.py \
    rotated-faster-rcnn-le90_r50_fpn_1x_dota.py \
    rotated_faster_rcnn_r50_fpn_1x_dota_le90-0393aa5c.pth \
    dota_demo.jpg \
    --work-dir mmdeploy_models/mmrotate/ort \
    --device cpu \
    --show \
    --dump-info
```
It is crucial to specify the correct deployment config during model conversion. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmrotate) of all supported backends for mmrotate. The config filename pattern is:
```
rotated-detection_{backend}-{precision}_{static | dynamic}_{shape}.py
```
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
Therefore, in the above example, you can also convert `rotated-faster-rcnn` to other backend models by changing the deployment config file `rotated-detection_onnxruntime_dynamic.py` to [others](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmrotate), e.g., converting to a tensorrt-fp16 model by `rotated-detection_tensorrt-fp16_dynamic-320x320-1024x1024.py`.
```{tip}
When converting mmrotate models to tensorrt models, --device should be set to "cuda"
```
## Model specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory specified during conversion, i.e., `mmdeploy_models/mmrotate/ort` in the previous example. It includes:
```
mmdeploy_models/mmrotate/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmrotate/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
## Model inference
### Backend model inference
Taking the previously converted `end2end.onnx` model as an example, you can use the following code to run inference and visualize the results.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch
deploy_cfg = 'configs/mmrotate/rotated-detection_onnxruntime_dynamic.py'
model_cfg = './rotated-faster-rcnn-le90_r50_fpn_1x_dota.py'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmrotate/ort/end2end.onnx']
image = './dota_demo.jpg'
# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)
# visualize results
task_processor.visualize(
    image=image,
    model=model,
    result=result[0],
    window_name='visualize',
    output_file='./output.png')
```
### SDK model inference
You can also perform SDK model inference as follows:
```python
from mmdeploy_runtime import RotatedDetector
import cv2
import numpy as np
img = cv2.imread('./dota_demo.jpg')
# create a detector
detector = RotatedDetector(model_path='./mmdeploy_models/mmrotate/ort', device_name='cpu', device_id=0)
# perform inference
det = detector(img)
```
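Continuing the snippet, the detections can be drawn onto the image. The sketch below assumes each row of `det` is laid out as `(cx, cy, w, h, angle, score)` with the angle in radians; verify the layout against your SDK version before relying on it.
```python
# draw rotated boxes (assumed row layout: cx, cy, w, h, angle in radians, score)
for cx, cy, w, h, angle, score in det:
    if score < 0.3:
        continue
    # cv2.boxPoints expects the angle in degrees
    pts = cv2.boxPoints(((cx, cy), (w, h), angle * 180 / np.pi))
    cv2.polylines(img, [pts.astype(np.int32)], True, (0, 255, 0), 2)
cv2.imwrite('output_rotated_detection.png', img)
```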
Besides the Python API, the mmdeploy SDK also provides FFIs (Foreign Function Interfaces) for other languages such as C, C++, C#, and Java. You can learn their usage from the [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).
## Supported models
| Model | OnnxRuntime | TensorRT |
| :------------------------------------------------------------------------------------------------ | :---------: | :------: |
| [Rotated RetinaNet](https://github.com/open-mmlab/mmrotate/blob/1.x/configs/rotated_retinanet) | Y | Y |
| [Rotated FasterRCNN](https://github.com/open-mmlab/mmrotate/blob/1.x/configs/rotated_faster_rcnn) | Y | Y |
| [Oriented R-CNN](https://github.com/open-mmlab/mmrotate/blob/1.x/configs/oriented_rcnn) | Y | Y |
| [Gliding Vertex](https://github.com/open-mmlab/mmrotate/blob/1.x/configs/gliding_vertex) | Y | Y |
| [RTMDET-R](https://github.com/open-mmlab/mmrotate/blob/1.x/configs/rotated_rtmdet) | Y | Y |
# MMSegmentation Deployment
- [MMSegmentation Deployment](#mmsegmentation-deployment)
- [Installation](#installation)
- [Install mmseg](#install-mmseg)
- [Install mmdeploy](#install-mmdeploy)
- [Convert model](#convert-model)
- [Model specification](#model-specification)
- [Model inference](#model-inference)
- [Backend model inference](#backend-model-inference)
- [SDK model inference](#sdk-model-inference)
- [Supported models](#supported-models)
- [Reminder](#reminder)
______________________________________________________________________
[MMSegmentation](https://github.com/open-mmlab/mmsegmentation/tree/main) aka `mmseg` is an open source semantic segmentation toolbox based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
## Installation
### Install mmseg
Please follow the [installation guide](https://mmsegmentation.readthedocs.io/en/latest/get_started.html) to install mmseg.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or a later version**, we encourage you to run the
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as the inference engine `ONNX Runtime`.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)
export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH
```
**NOTE**:
- Adding `$(pwd)/build/lib` to `PYTHONPATH` is for importing the mmdeploy SDK Python module `mmdeploy_runtime`, which will be presented in the chapter [SDK model inference](#sdk-model-inference).
- When [inferring the onnx model with ONNX Runtime](#backend-model-inference), the ONNX Runtime library must be discoverable. Thus, we add it to `LD_LIBRARY_PATH`.
**Method III:** Build from source
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmseg models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/02-how-to-run/convert_model.md#usage).
The command below shows an example of converting the `unet` model to an ONNX model that can be inferred by ONNX Runtime.
```shell
cd mmdeploy
# download unet model from mmseg model zoo
mim download mmsegmentation --config unet-s5-d16_fcn_4xb4-160k_cityscapes-512x1024 --dest .
# convert mmseg model to onnxruntime model with dynamic shape
python tools/deploy.py \
    configs/mmseg/segmentation_onnxruntime_dynamic.py \
    unet-s5-d16_fcn_4xb4-160k_cityscapes-512x1024.py \
    fcn_unet_s5-d16_4x4_512x1024_160k_cityscapes_20211210_145204-6860854e.pth \
    demo/resources/cityscapes.png \
    --work-dir mmdeploy_models/mmseg/ort \
    --device cpu \
    --show \
    --dump-info
```
It is crucial to specify the correct deployment config during model conversion. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmseg) of all supported backends for mmsegmentation. The config filename pattern is:
```
segmentation_{backend}-{precision}_{static | dynamic}_{shape}.py
```
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
Therefore, in the above example, you can also convert `unet` to other backend models by changing the deployment config file `segmentation_onnxruntime_dynamic.py` to [others](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmseg), e.g., converting to tensorrt-fp16 model by `segmentation_tensorrt-fp16_dynamic-512x1024-2048x2048.py`.
```{tip}
When converting mmseg models to tensorrt models, --device should be set to "cuda"
```
## Model specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory specified during conversion, i.e., `mmdeploy_models/mmseg/ort` in the previous example. It includes:
```
mmdeploy_models/mmseg/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmseg/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
## Model inference
### Backend model inference
Taking the previously converted `end2end.onnx` model as an example, you can use the following code to run inference and visualize the results.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch
deploy_cfg = 'configs/mmseg/segmentation_onnxruntime_dynamic.py'
model_cfg = './unet-s5-d16_fcn_4xb4-160k_cityscapes-512x1024.py'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmseg/ort/end2end.onnx']
image = './demo/resources/cityscapes.png'
# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)
# visualize results
task_processor.visualize(
    image=image,
    model=model,
    result=result[0],
    window_name='visualize',
    output_file='./output_segmentation.png')
```
### SDK model inference
You can also perform SDK model inference as follows:
```python
from mmdeploy_runtime import Segmentor
import cv2
import numpy as np
img = cv2.imread('./demo/resources/cityscapes.png')
# create a segmentor
segmentor = Segmentor(model_path='./mmdeploy_models/mmseg/ort', device_name='cpu', device_id=0)
# perform inference
seg = segmentor(img)
# visualize inference result
## randomly generate a palette with size 256x3
palette = np.random.randint(0, 256, size=(256, 3))
color_seg = np.zeros((seg.shape[0], seg.shape[1], 3), dtype=np.uint8)
for label, color in enumerate(palette):
    color_seg[seg == label, :] = color
# convert to BGR
color_seg = color_seg[..., ::-1]
img = img * 0.5 + color_seg * 0.5
img = img.astype(np.uint8)
cv2.imwrite('output_segmentation.png', img)
```
Besides the Python API, the mmdeploy SDK also provides FFIs (Foreign Function Interfaces) for other languages such as C, C++, C#, and Java. You can learn their usage from the [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).
## Supported models
| Model | TorchScript | OnnxRuntime | TensorRT | ncnn | PPLNN | OpenVINO |
| :-------------------------------------------------------------------------------------------------------- | :---------: | :---------: | :------: | :--: | :---: | :------: |
| [FCN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fcn) | Y | Y | Y | Y | Y | Y |
| [PSPNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/pspnet)[\*](#static_shape) | Y | Y | Y | Y | Y | Y |
| [DeepLabV3](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3) | Y | Y | Y | Y | Y | Y |
| [DeepLabV3+](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3plus) | Y | Y | Y | Y | Y | Y |
| [Fast-SCNN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fastscnn)[\*](#static_shape) | Y | Y | Y | N | Y | Y |
| [UNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/unet) | Y | Y | Y | Y | Y | Y |
| [ANN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ann)[\*](#static_shape) | Y | Y | Y | N | N | N |
| [APCNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/apcnet) | Y | Y | Y | Y | N | N |
| [BiSeNetV1](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/bisenetv1) | Y | Y | Y | Y | N | Y |
| [BiSeNetV2](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/bisenetv2) | Y | Y | Y | Y | N | Y |
| [CGNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/cgnet) | Y | Y | Y | Y | N | Y |
| [DMNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/dmnet) | ? | Y | N | N | N | N |
| [DNLNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/dnlnet) | ? | Y | Y | Y | N | Y |
| [EMANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/emanet) | Y | Y | Y | N | N | Y |
| [EncNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/encnet) | Y | Y | Y | N | N | Y |
| [ERFNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/erfnet) | Y | Y | Y | Y | N | Y |
| [FastFCN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fastfcn) | Y | Y | Y | Y | N | Y |
| [GCNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/gcnet) | Y | Y | Y | N | N | N |
| [ICNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/icnet)[\*](#static_shape) | Y | Y | Y | N | N | Y |
| [ISANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/isanet)[\*](#static_shape) | N | Y | Y | N | N | Y |
| [NonLocal Net](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/nonlocal_net) | ? | Y | Y | Y | N | Y |
| [OCRNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ocrnet) | Y | Y | Y | Y | N | Y |
| [PointRend](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/point_rend)[\*](#static_shape) | Y | Y | Y | N | N | N |
| [Semantic FPN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/sem_fpn) | Y | Y | Y | Y | N | Y |
| [STDC](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/stdc) | Y | Y | Y | Y | N | Y |
| [UPerNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/upernet)[\*](#static_shape) | N | Y | Y | N | N | N |
| [DANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/danet) | ? | Y | Y | N | N | Y |
| [Segmenter](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/segmenter)[\*](#static_shape) | N | Y | Y | Y | N | Y |
| [SegFormer](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/segformer)[\*](#static_shape) | Y | Y | Y | N | N | Y |
| [SETR](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/setr) | ? | Y | N | N | N | Y |
| [CCNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ccnet) | ? | N | N | N | N | N |
| [PSANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/psanet) | ? | N | N | N | N | N |
| [DPT](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/dpt) | ? | N | N | N | N | N |
## Reminder
- Only `whole` inference mode is supported for all mmseg models.
- <i id="static_shape">PSPNet, Fast-SCNN</i> only support static shape, because [nn.AdaptiveAvgPool2d](https://github.com/open-mmlab/mmsegmentation/blob/0c87f7a0c9099844eff8e90fa3db5b0d0ca02fee/mmseg/models/decode_heads/psp_head.py#L38) is not supported by most inference backends.
- For models that only support static shape, you should use a static-shape deployment config file such as `configs/mmseg/segmentation_tensorrt_static-1024x2048.py`.
- If you prefer the deployed model to generate a probability feature map instead of class ids, put `codebase_config = dict(with_argmax=False)` in the deploy config, as shown in the sketch below.
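A minimal deploy config carrying this option might look like the following (a sketch; the file name and `_base_` path are hypothetical):
```python
# segmentation_onnxruntime_dynamic_prob.py (hypothetical file name)
_base_ = ['./segmentation_onnxruntime_dynamic.py']

# skip the final argmax so the deployed model outputs the probability feature map
codebase_config = dict(with_argmax=False)
```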
# Core ML feature support
MMDeploy supports converting PyTorch models to Core ML and running inference on them.
## Installation
To convert models from mmdet, you need to compile libtorch to support custom operators such as nms (needed only at the conversion stage). For macOS 12 users, please install PyTorch 1.8.0; for macOS 13 users, please install PyTorch 2.0.0+.
```bash
cd ${PYTORCH_DIR}
mkdir build && cd build
cmake .. \
-DCMAKE_BUILD_TYPE=Release \
-DPYTHON_EXECUTABLE=`which python` \
-DCMAKE_INSTALL_PREFIX=install \
-DDISABLE_SVE=ON
make install
```
## Usage
```bash
python tools/deploy.py \
configs/mmdet/detection/detection_coreml_static-800x1344.py \
/mmdetection_dir/configs/retinanet/retinanet_r18_fpn_1x_coco.py \
/checkpoint/retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth \
/mmdetection_dir/demo/demo.jpg \
--work-dir work_dir/retinanet \
--device cpu \
--dump-info
```
# Supported ncnn feature
The ncnn features that are currently usable are as follows:
| feature | windows | linux | mac | android |
| :----------------: | :-----: | :---: | :-: | :-----: |
| fp32 inference | ✔️ | ✔️ | ✔️ | ✔️ |
| int8 model convert | - | ✔️ | ✔️ | - |
| nchw layout | ✔️ | ✔️ | ✔️ | ✔️ |
| Vulkan support | - | ✔️ | ✔️ | ✔️ |
The following features cannot be enabled automatically by mmdeploy; you need to modify the ncnn build options or adjust the runtime parameters in the SDK manually (see the sketch after this list):
- bf16 inference
- nc4hw4 layout
- Profiling per layer
- Turn off NCNN_STRING to reduce .so file size
- Set thread number and CPU affinity
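As an illustration, when working with ncnn's own Python bindings, several of these options are plain fields on `net.opt` (a sketch of ncnn's API, not an mmdeploy interface; field names follow ncnn's C++ `Option` struct):
```python
import ncnn

net = ncnn.Net()
# opt fields mirror ncnn's Option struct: enable bf16 storage and pin threads
net.opt.use_bf16_storage = True
net.opt.num_threads = 4
net.load_param('end2end.param')  # hypothetical model file names
net.load_model('end2end.bin')
```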
# onnxruntime Support
## Introduction of ONNX Runtime
**ONNX Runtime** is a cross-platform inference and training accelerator compatible with many popular ML/DNN frameworks. Check its [github](https://github.com/microsoft/onnxruntime) for more information.
## Installation
*Please note that only **onnxruntime>=1.8.1** on the Linux platform is supported for now.*
### Install ONNX Runtime python package
- CPU Version
```bash
pip install onnxruntime==1.8.1 # if you want to use cpu version
```
- GPU Version
```bash
pip install onnxruntime-gpu==1.8.1 # if you want to use gpu version
```
### Install float16 conversion tool (optional)
If you want to use float16 precision, install the tool by running the following script:
```bash
pip install onnx onnxconverter-common
```
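A typical conversion then looks like this (a sketch using the documented `onnxconverter-common` helper; the file names are placeholders):
```python
import onnx
from onnxconverter_common import float16

# load the exported model and convert tensors to float16 where safe
model = onnx.load('end2end.onnx')
model_fp16 = float16.convert_float_to_float16(model)
onnx.save(model_fp16, 'end2end_fp16.onnx')
```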
## Build custom ops
### Download ONNXRuntime Library
Download `onnxruntime-linux-*.tgz` library from ONNX Runtime [releases](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1), extract it, expose `ONNXRUNTIME_DIR` and finally add the lib path to `LD_LIBRARY_PATH` as below:
- CPU Version
```bash
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
cd onnxruntime-linux-x64-1.8.1
export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
```
- GPU Version
In X64 GPU:
```bash
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-gpu-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-gpu-1.8.1.tgz
cd onnxruntime-linux-x64-gpu-1.8.1
export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
```
In Arm GPU:
```bash
# there is no 1.8.1 package for Arm
wget https://github.com/microsoft/onnxruntime/releases/download/v1.10.0/onnxruntime-linux-aarch64-1.10.0.tgz
tar -zxvf onnxruntime-linux-aarch64-1.10.0.tgz
cd onnxruntime-linux-aarch64-1.10.0
export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
```
You can also go to the [ONNX Runtime releases](https://github.com/microsoft/onnxruntime/releases) page to find the corresponding release package.
### Build on Linux
- CPU Version
```bash
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_DEVICES='cpu' -DMMDEPLOY_TARGET_BACKENDS=ort -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} ..
make -j$(nproc) && make install
```
- GPU Version
```bash
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_DEVICES='cuda' -DMMDEPLOY_TARGET_BACKENDS=ort -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} ..
make -j$(nproc) && make install
```
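To verify that the freshly built custom-op library can be loaded, you can register it in an ONNX Runtime session (a sketch; the library name `libmmdeploy_onnxruntime_ops.so` and its location under `build/lib` are assumptions about the build output):
```python
import onnxruntime as ort

session_options = ort.SessionOptions()
# register the custom-op library built above (path is an assumption)
session_options.register_custom_ops_library('build/lib/libmmdeploy_onnxruntime_ops.so')
session = ort.InferenceSession(
    'end2end.onnx', session_options, providers=['CPUExecutionProvider'])
```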
## How to convert a model
- You could follow the instructions of tutorial [How to convert model](../02-how-to-run/convert_model.md)
## How to add a new custom op
### Reminder
- The custom operator is not included in [supported operator list](https://github.com/microsoft/onnxruntime/blob/master/docs/OperatorKernels.md) in ONNX Runtime.
- The custom operator should be able to be exported to ONNX.
### Main procedures
Take the custom operator `roi_align` as an example.
1. Create a `roi_align` directory under the ONNX Runtime custom-op directory `${MMDEPLOY_DIR}/csrc/backend_ops/onnxruntime/`
2. Add the header and source files to the `roi_align` directory `${MMDEPLOY_DIR}/csrc/backend_ops/onnxruntime/roi_align/`
3. Add a unit test to `tests/test_ops/test_ops.py`
Check [here](../../../tests/test_ops/test_ops.py) for examples.
**Finally, you are welcome to send us a PR adding custom operators for ONNX Runtime in MMDeploy.** :nerd_face:
## References
- [How to export Pytorch model with custom op to ONNX and run it in ONNX Runtime](https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md)
- [How to add a custom operator/kernel in ONNX Runtime](https://onnxruntime.ai/docs/reference/operators/add-custom-op.html)
# OpenVINO Support
This tutorial is based on Linux systems like Ubuntu-18.04.
## Installation
It is recommended to create a virtual environment for the project.
### Install python package
Install [OpenVINO](https://docs.openvino.ai/2022.3/get_started.html). It is recommended to use the installer or install using pip.
Installation example using [pip](https://pypi.org/project/openvino-dev/):
```bash
pip install openvino-dev[onnx]==2022.3.0
```
### Download OpenVINO runtime for SDK (Optional)
If you want to use OpenVINO in the SDK, you need to install OpenVINO following the [install guide](https://docs.openvino.ai/2022.3/openvino_docs_install_guides_installing_openvino_from_archive_linux.html#installing-openvino-runtime).
Take `openvino==2022.3.0` as an example:
```bash
wget https://storage.openvinotoolkit.org/repositories/openvino/packages/2022.3/linux/l_openvino_toolkit_ubuntu20_2022.3.0.9052.9752fafe8eb_x86_64.tgz
tar xzf ./l_openvino_toolkit*.tgz
cd l_openvino*
export InferenceEngine_DIR=$(pwd)/runtime/cmake
bash ./install_dependencies/install_openvino_dependencies.sh
```
### Build mmdeploy SDK with OpenVINO (Optional)
Install MMDeploy following the [instructions](../01-how-to-build/build_from_source.md).
```bash
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_DEVICES='cpu' -DMMDEPLOY_TARGET_BACKENDS=openvino -DInferenceEngine_DIR=${InferenceEngine_DIR} ..
make -j$(nproc) && make install
```
To work with models from [MMDetection](https://mmdetection.readthedocs.io/en/3.x/get_started.html), you may need to install it additionally.
## Usage
You could follow the instructions of tutorial [How to convert model](../02-how-to-run/convert_model.md)
Example:
```bash
python tools/deploy.py \
configs/mmdet/detection/detection_openvino_static-300x300.py \
/mmdetection_dir/mmdetection/configs/ssd/ssd300_coco.py \
/tmp/snapshots/ssd300_coco_20210803_015428-d231a06e.pth \
tests/data/tiger.jpeg \
--work-dir ../deploy_result \
--device cpu \
--log-level INFO
```
## List of supported models exportable to OpenVINO from MMDetection
The table below lists the models that are guaranteed to be exportable to OpenVINO from MMDetection.
| Model name | Config | Dynamic Shape |
| :----------------: | :-----------------------------------------------------------------------: | :-----------: |
| ATSS | `configs/atss/atss_r50_fpn_1x_coco.py` | Y |
| Cascade Mask R-CNN | `configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.py` | Y |
| Cascade R-CNN | `configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.py` | Y |
| Faster R-CNN | `configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py` | Y |
| FCOS | `configs/fcos/fcos_x101_64x4d_fpn_gn-head_mstrain_640-800_4x2_2x_coco.py` | Y |
| FoveaBox           | `configs/foveabox/fovea_r50_fpn_4x4_1x_coco.py`                            | Y             |
| FSAF | `configs/fsaf/fsaf_r50_fpn_1x_coco.py` | Y |
| Mask R-CNN | `configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py` | Y |
| RetinaNet | `configs/retinanet/retinanet_r50_fpn_1x_coco.py` | Y |
| SSD | `configs/ssd/ssd300_coco.py` | Y |
| YOLOv3 | `configs/yolo/yolov3_d53_mstrain-608_273e_coco.py` | Y |
| YOLOX | `configs/yolox/yolox_tiny_8x8_300e_coco.py` | Y |
| Faster R-CNN + DCN | `configs/dcn/faster_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py` | Y |
| VFNet | `configs/vfnet/vfnet_r50_fpn_1x_coco.py` | Y |
Notes:
- Custom operations from OpenVINO use the domain `org.openvinotoolkit`.
- For faster inference in OpenVINO, the RoiAlign operation in the Faster-RCNN, Mask-RCNN, Cascade-RCNN and Cascade-Mask-RCNN models
  is replaced with the [ExperimentalDetectronROIFeatureExtractor](https://docs.openvino.ai/2022.3/openvino_docs_ops_detection_ExperimentalDetectronROIFeatureExtractor_6.html) operation in the ONNX graph.
- Models "VFNet" and "Faster R-CNN + DCN" use the custom "DeformableConv2D" operation.
## Deployment config
With the deployment config, you can specify additional options for the Model Optimizer.
To do this, add the necessary parameters to the `backend_config.mo_options` in the fields `args` (for parameters with values) and `flags` (for flags).
Example:
```python
backend_config = dict(
mo_options=dict(
args=dict({
'--mean_values': [0, 0, 0],
'--scale_values': [255, 255, 255],
'--data_type': 'FP32',
}),
flags=['--disable_fusing'],
)
)
```
Information about the possible parameters for the Model Optimizer can be found in the [documentation](https://docs.openvino.ai/latest/openvino_docs_MO_DG_prepare_model_convert_model_Converting_Model.html).
## Troubleshooting
- ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory
To resolve the missing external dependency on Ubuntu, execute the following command:
```bash
sudo apt-get install libpython3.7
```
# PPLNN Support
MMDeploy supports ppl.nn v0.8.1 and later. This tutorial is based on Linux systems like Ubuntu-18.04.
## Installation
1. Please install [pyppl](https://github.com/openppl-public/ppl.nn) following [install-guide](https://github.com/openppl-public/ppl.nn/blob/master/docs/en/building-from-source.md).
2. Install MMDeploy following the [instructions](../01-how-to-build/build_from_source.md).
## Usage
Example:
```bash
python tools/deploy.py \
configs/mmdet/detection/detection_pplnn_dynamic-800x1344.py \
/mmdetection_dir/mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py \
/tmp/snapshots/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth \
tests/data/tiger.jpeg \
--work-dir ../deploy_result \
--device cuda \
--log-level INFO
```
# Supported RKNN feature
Currently, MMDeploy tests only rk3588 and rv1126 on the Linux platform.
The following features cannot be automatically enabled by mmdeploy and you need to manually modify the configuration in MMDeploy like [here](https://github.com/open-mmlab/mmdeploy/tree/main/configs/_base_/backends/rknn.py).
- target_platform other than default
- quantization settings
- optimization level other than 1
# SNPE feature support
Currently mmdeploy integrates the onnx2dlc model conversion and SDK inference, but the following features are not yet supported:
- GPU_FP16 mode
- DSP/AIP quantization
- Operator internal profiling
- UDO operator
# TensorRT Support
## Installation
### Install TensorRT
Please install TensorRT 8 following the [install guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing).
**Note**:
- `pip Wheel File Installation` is not supported yet in this repo.
- We strongly suggest you install TensorRT through [tar file](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-tar)
- After installation, it is best to add the TensorRT environment variables to bashrc:
```bash
cd ${TENSORRT_DIR} # To TensorRT root directory
echo '# set env for TensorRT' >> ~/.bashrc
echo "export TENSORRT_DIR=${TENSORRT_DIR}" >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$TENSORRT_DIR/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
```
### Build custom ops
Some custom ops are created to support models in OpenMMLab; they can be built as follows:
```bash
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_BACKENDS=trt ..
make -j$(nproc)
```
If you haven't installed TensorRT in the default path, please add the `-DTENSORRT_DIR` flag to CMake.
```bash
cmake -DMMDEPLOY_TARGET_BACKENDS=trt -DTENSORRT_DIR=${TENSORRT_DIR} ..
make -j$(nproc) && make install
```
## Convert model
Please follow the tutorial in [How to convert model](../02-how-to-run/convert_model.md). **Note** that the device must be `cuda` device.
### Int8 Support
Since TensorRT supports INT8 mode, a custom dataset config can be given to calibrate the model. The following is an example for MMDetection:
```python
# calibration_dataset.py
# dataset settings, same format as the codebase in OpenMMLab
dataset_type = 'CalibrationDataset'
data_root = 'calibration/dataset/root'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
val=dict(
type=dataset_type,
ann_file=data_root + 'val_annotations.json',
pipeline=test_pipeline),
test=dict(
type=dataset_type,
ann_file=data_root + 'test_annotations.json',
pipeline=test_pipeline))
evaluation = dict(interval=1, metric='bbox')
```
Convert your model with this calibration dataset:
```bash
python tools/deploy.py \
...
--calib-dataset-cfg calibration_dataset.py
```
If no calibration dataset is given, the model will be calibrated with the dataset from the model config.
## FAQs
- Error `Cannot found TensorRT headers` or `Cannot found TensorRT libs`
Try cmake with flag `-DTENSORRT_DIR`:
```bash
cmake -DBUILD_TENSORRT_OPS=ON -DTENSORRT_DIR=${TENSORRT_DIR} ..
make -j$(nproc)
```
Please make sure there are libs and headers in `${TENSORRT_DIR}`.
- Error `error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]`
There is an input shape limit in the deployment config:
```python
backend_config = dict(
# other configs
model_inputs=[
dict(
input_shapes=dict(
input=dict(
min_shape=[1, 3, 320, 320],
opt_shape=[1, 3, 800, 1344],
max_shape=[1, 3, 1344, 1344])))
])
# other configs
```
The shape of the tensor `input` must be limited between `input_shapes["input"]["min_shape"]` and `input_shapes["input"]["max_shape"]`.
- Error `error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS`
TRT 7.2.1 switches to use cuBLASLt (previously it was cuBLAS). cuBLASLt is the default choice for SM version >= 7.0. However, you may need CUDA-10.2 Patch 1 (Released Aug 26, 2020) to resolve some cuBLASLt issues. Another option is to use the new TacticSource API and disable cuBLASLt tactics if you don't want to upgrade.
Read [this](https://forums.developer.nvidia.com/t/matrixmultiply-failed-on-tensorrt-7-2-1/158187/4) for detail.
- Install mmdeploy on Jetson
We provide a tutorial for getting started on Jetsons [here](../01-how-to-build/jetsons.md).
# TorchScript support
## Introduction of TorchScript
**TorchScript** is a way to create serializable and optimizable models from PyTorch code. Any TorchScript program can be saved from a Python process and loaded in a process where there is no Python dependency. Check the [Introduction to TorchScript](https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html) for more details.
## Build custom ops
### Prerequisite
- Download libtorch from the official website [here](https://pytorch.org/get-started/locally/).
*Please note that only the **pre-cxx11 ABI** and **version 1.8.1+** on the Linux platform are supported for now.*
Previous versions of libtorch can be found through this [issue comment](https://github.com/pytorch/pytorch/issues/40961#issuecomment-1017317786). Taking libtorch 1.8.1+cu111 as an example: download it, extract it, expose `Torch_DIR` and add the lib path to `LD_LIBRARY_PATH` as below:
```bash
wget https://download.pytorch.org/libtorch/cu111/libtorch-shared-with-deps-1.8.1%2Bcu111.zip
unzip libtorch-shared-with-deps-1.8.1+cu111.zip
cd libtorch
export Torch_DIR=$(pwd)
export LD_LIBRARY_PATH=$Torch_DIR/lib:$LD_LIBRARY_PATH
```
Note:
- If you want to save libtorch env variables to bashrc, you could run
```bash
echo '# set env for libtorch' >> ~/.bashrc
echo "export Torch_DIR=${Torch_DIR}" >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$Torch_DIR/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
```
### Build on Linux
```bash
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_BACKENDS=torchscript -DTorch_DIR=${Torch_DIR} ..
make -j$(nproc) && make install
```
## How to convert a model
- You could follow the instructions of tutorial [How to convert model](../02-how-to-run/convert_model.md)
## SDK backend
TorchScript SDK backend may be built by passing `-DMMDEPLOY_TORCHSCRIPT_SDK_BACKEND=ON` to `cmake`.
Notice that `libtorch` is sensitive to C++ ABI versions. On platforms that default to the C++11 ABI (e.g. Ubuntu 16+), one may
pass `-DCMAKE_CXX_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"` to `cmake` to build with the pre-C++11 ABI. In that case, all
dependencies with ABI-sensitive interfaces (e.g. OpenCV) must also be built with the pre-C++11 ABI.
## FAQs
- Error: `projects/thirdparty/libtorch/share/cmake/Caffe2/Caffe2Config.cmake:96 (message):Your installed Caffe2 version uses cuDNN but I cannot find the cuDNN libraries. Please set the proper cuDNN prefixes and / or install cuDNN.`
Exporting `CUDNN_ROOT=/root/path/to/cudnn` may resolve the build error.
# TVM feature support
MMDeploy has integrated TVM for model conversion and SDK. Features include:
- AutoTVM tuner
- Ansor tuner
- Graph Executor runtime
- Virtual machine runtime
# VACC Backend
Requirements:
- cmake 3.10.0+
- gcc/g++ 7.5.0
- llvm 9.0.1
- ubuntu 18.04
## PCIE
### 1. Package
- dkms (>=1.95)
- linux-headers
- dpkg (Ubuntu)
- rpm (CentOS)
- python2
- python3
Check if there is a VACC card: `lspci -d:0100`
1. Requirements
```bash
sudo apt-get install dkms dpkg python2 python3
```
2. Install driver
```bash
sudo dpkg -i vastai-pci_xx.xx.xx.xx_xx.deb
```
3. Verify installation
```bash
dpkg --status vastai-pci-xxx
#output
Package: vastai-pci-dkms
Status: install ok installed
……
Version: xx.xx.xx.xx
Provides: vastai-pci-modules (= xx.xx.xx.xx)
Depends: dkms (>= 1.95)
Description: vastai-pci driver in DKMS format.
lsmod | grep vastai_pci
#output
vastai_pci xxx x
```
4. Upgrade driver
```bash
sudo dpkg -i vastai-pci_dkms_xx.xx.xx.xx_xx.deb
```
5. Uninstall driver
```bash
sudo dpkg -r vastai-pci_dkms_xx.xx.xx.xx_xx
```
### 2. Reboot PCIe
```bash
sudo chmod 666 /dev/kchar:0 && sudo echo reboot > /dev/kchar:0
```
## SDK
### Step 1
```bash
pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0
pip install onnx==1.10.0 tqdm==4.64.1
pip install h5py==3.8.0
pip install decorator==5.1.1 scipy==1.7.3
```
### Step 2
```bash
sudo vi ~/.bashrc
# add the following environment variables to ~/.bashrc, then reload it
export VASTSTREAM_PIPELINE=true
export VACC_IRTEXT_ENABLE=1
export TVM_HOME="/opt/vastai/vaststream/tvm"
export VASTSTREAM_HOME="/opt/vastai/vaststream/vacl"
export LD_LIBRARY_PATH=$TVM_HOME/lib:$VASTSTREAM_HOME/lib
export PYTHONPATH=$TVM_HOME/python:$TVM_HOME/vacc/python:$TVM_HOME/topi/python:${PYTHONPATH}:$VASTSTREAM_HOME/python
source ~/.bashrc
```
## ncnn Ops
<!-- TOC -->
- [ncnn Ops](#ncnn-ops)
- [Expand](#expand)
- [Description](#description)
- [Parameters](#parameters)
- [Inputs](#inputs)
- [Outputs](#outputs)
- [Type Constraints](#type-constraints)
- [Gather](#gather)
- [Description](#description)
- [Parameters](#parameters)
- [Inputs](#inputs)
- [Outputs](#outputs)
- [Type Constraints](#type-constraints)
- [Shape](#shape)
- [Description](#description)
- [Parameters](#parameters)
- [Inputs](#inputs)
- [Outputs](#outputs)
- [Type Constraints](#type-constraints)
- [TopK](#topk)
- [Description](#description)
- [Parameters](#parameters)
- [Inputs](#inputs)
- [Outputs](#outputs)
- [Type Constraints](#type-constraints)
<!-- TOC -->
### Expand
#### Description
Broadcast the input blob following the given shape and the broadcast rule of ncnn.
#### Parameters
Expand has no parameters.
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: ncnn.Mat</dt>
<dd>bottom_blobs[0]; An ncnn.Mat of input data.</dd>
<dt><tt>inputs[1]</tt>: ncnn.Mat</dt>
<dd>bottom_blobs[1]; A 1-dim ncnn.Mat holding a valid target shape.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>top_blob; An ncnn.Mat blob expanded according to the given shape and ncnn's broadcast rule.</dd>
</dl>
#### Type Constraints
- ncnn.Mat: Mat(float32)
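Ignoring ncnn's blob layout, the semantics match numpy broadcasting (a numpy analogy for illustration, not ncnn code):
```python
import numpy as np

data = np.random.rand(3, 1)       # bottom_blobs[0], the input data
shape = np.array([2, 3, 4])       # bottom_blobs[1], the target shape
# Expand broadcasts the input blob to the given shape
out = np.broadcast_to(data, tuple(shape))
print(out.shape)  # (2, 3, 4)
```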
### Gather
#### Description
Given the data and indices blobs, gather entries along the given axis of data, as indexed by indices.
#### Parameters
| Type | Parameter | Description |
| ----- | --------- | -------------------------------------- |
| `int` | `axis` | Which axis to gather on. Default is 0. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: ncnn.Mat</dt>
<dd>bottom_blobs[0]; An ncnn.Mat of input data.</dd>
<dt><tt>inputs[1]</tt>: ncnn.Mat</dt>
<dd>bottom_blobs[1]; A 1-dim ncnn.Mat of indices on the given axis.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>top_blob; An ncnn.Mat blob gathered from the given data and indices blobs.</dd>
</dl>
#### Type Constraints
- ncnn.Mat: Mat(float32)
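The behavior corresponds to numpy's `take` (again a numpy analogy for illustration):
```python
import numpy as np

data = np.arange(12).reshape(3, 4)   # bottom_blobs[0], the input data
indices = np.array([0, 2])           # bottom_blobs[1], indices on axis 0
# gather rows 0 and 2 of data along axis 0
out = np.take(data, indices, axis=0)
print(out.shape)  # (2, 4)
```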
### Shape
#### Description
Get the shape of the ncnn blobs.
#### Parameters
Shape has no parameters.
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: ncnn.Mat</dt>
<dd>bottom_blob; An ncnn.Mat of input data.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>top_blob; 1-D ncnn.Mat of shape (bottom_blob.dims,), where `bottom_blob.dims` is the number of dimensions of the input blob.</dd>
</dl>
#### Type Constraints
- ncnn.Mat: Mat(float32)
### TopK
#### Description
Get the indices and (optionally) the values of the largest or smallest k entries along the given axis. This op maps to the onnx ops `TopK`, `ArgMax` and `ArgMin`.
#### Parameters
| Type | Parameter | Description |
| ----- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `axis`      | The axis along which topk is calculated. Default is -1, indicating the last dimension. |
| `int` | `largest`   | Binary value indicating whether to select the largest or the smallest K values. Default is 1, i.e. select the largest K values. |
| `int` | `sorted`    | Binary value indicating whether to return the topk values in sorted order. If 0, they may be returned in any order. Default is 1, i.e. sorted. |
| `int` | `keep_dims` | Binary value indicating whether to keep the reduced dimension. Default is 1, i.e. each output blob has the same number of dimensions as the input blob. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: ncnn.Mat</dt>
<dd>bottom_blob[0]; An ncnn.Mat of input data.</dd>
<dt><tt>inputs[1] (optional)</tt>: ncnn.Mat</dt>
<dd>bottom_blob[1]; An optional ncnn.Mat holding K for TopK. If this blob does not exist, K is 1.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>top_blob[0]; If outputs has only 1 blob, outputs[0] is the indices blob of topk; if outputs has 2 blobs, outputs[0] is the values blob of topk. This blob is in ncnn.Mat format with the shape of bottom_blob[0] or the reduced shape of bottom_blob[0].</dd>
<dt><tt>outputs[1]</tt>: T</dt>
<dd>top_blob[1] (optional); If outputs has 2 blobs, outputs[1] is the indices blob of topk. This blob is in ncnn.Mat format with the shape of bottom_blob[0] or the reduced shape of bottom_blob[0].</dd>
</dl>
#### Type Constraints
- ncnn.Mat: Mat(float32)
## ONNX Runtime Ops
<!-- TOC -->
- [ONNX Runtime Ops](#onnx-runtime-ops)
- [grid_sampler](#grid_sampler)
- [Description](#description)
- [Parameters](#parameters)
- [Inputs](#inputs)
- [Outputs](#outputs)
- [Type Constraints](#type-constraints)
- [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
- [Description](#description-1)
- [Parameters](#parameters-1)
- [Inputs](#inputs-1)
- [Outputs](#outputs-1)
- [Type Constraints](#type-constraints-1)
- [NMSRotated](#nmsrotated)
- [Description](#description-2)
- [Parameters](#parameters-2)
- [Inputs](#inputs-2)
- [Outputs](#outputs-2)
- [Type Constraints](#type-constraints-2)
- [RoIAlignRotated](#roialignrotated)
- [Description](#description-3)
- [Parameters](#parameters-3)
- [Inputs](#inputs-3)
- [Outputs](#outputs-3)
- [Type Constraints](#type-constraints-3)
- [NMSMatch](#nmsmatch)
- [Description](#description-4)
- [Parameters](#parameters-4)
- [Inputs](#inputs-4)
- [Outputs](#outputs-4)
- [Type Constraints](#type-constraints-4)
<!-- TOC -->
### grid_sampler
#### Description
Perform sampling from `input` at the pixel locations specified by `grid`.
#### Parameters
| Type | Parameter | Description |
| ----- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `interpolation_mode` | Interpolation mode to calculate output values. (0: `bilinear` , 1: `nearest`) |
| `int` | `padding_mode` | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
| `int` | `align_corners` | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |
#### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the numbers of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>grid</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, outH, outW, 2), where outH and outW are the height and width of offset and output. </dd>
</dl>
#### Outputs
<dl>
<dt><tt>output</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, C, outH, outW).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
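The op mirrors PyTorch's `torch.nn.functional.grid_sample` (a PyTorch illustration of the same semantics; shapes follow the spec above):
```python
import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 16, 16)            # input: (N, C, inH, inW)
grid = torch.rand(1, 8, 8, 2) * 2 - 1   # grid: (N, outH, outW, 2), values in [-1, 1]
# interpolation_mode=0 -> 'bilinear', padding_mode=0 -> 'zeros'
y = F.grid_sample(x, grid, mode='bilinear', padding_mode='zeros', align_corners=False)
print(y.shape)  # torch.Size([1, 3, 8, 8])
```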
### MMCVModulatedDeformConv2d
#### Description
Perform Modulated Deformable Convolution on input feature, read [Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline) for detail.
#### Parameters
| Type | Parameter | Description |
| -------------- | ------------------- | ------------------------------------------------------------------------------------- |
| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
| `int` | `deformable_groups` | Groups of deformable offset. |
| `int` | `groups` | Split input into groups. `input_channel` should be divisible by the number of groups. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input offset; 4-D tensor of shape (N, deformable_group* 2* kH* kW, outH, outW), where kH and kW are the height and width of weight, outH and outW are the height and width of offset and output.</dd>
<dt><tt>inputs[2]</tt>: T</dt>
<dd>Input mask; 4-D tensor of shape (N, deformable_group* kH* kW, outH, outW), where kH and kW are the height and width of weight, outH and outW are the height and width of offset and output.</dd>
<dt><tt>inputs[3]</tt>: T</dt>
<dd>Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).</dd>
<dt><tt>inputs[4]</tt>: T, optional</dt>
<dd>Input bias; 1-D tensor of shape (output_channel).</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output feature; 4-D tensor of shape (N, output_channel, outH, outW).</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
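The output spatial size follows the usual convolution arithmetic; a quick sanity check in plain Python (for illustration only):
```python
def conv_out_size(in_size, kernel, stride, padding, dilation):
    # standard convolution output-size formula
    return (in_size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# e.g. a 3x3 kernel with stride 1, padding 1 and dilation 1 keeps a 64x64 input at 64x64
print(conv_out_size(64, 3, 1, 1, 1))  # 64
```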
### NMSRotated
#### Description
Non Max Suppression for rotated bboxes.
#### Parameters
| Type | Parameter | Description |
| ------- | --------------- | -------------------------- |
| `float` | `iou_threshold` | The IoU threshold for NMS. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input boxes; 2-D tensor of shape (N, 5), where N is the number of rotated bboxes.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input scores; 1-D tensor of shape (N, ), where N is the number of rotated bboxes.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output indices; 1-D tensor of shape (K, ), where K is the number of kept bboxes.</dd>
</dl>
#### Type Constraints
- T:tensor(float32, Linear)
### RoIAlignRotated
#### Description
Perform RoIAlignRotated on the output feature map; it is used in the bbox_head of most two-stage rotated object detectors.
#### Parameters
| Type | Parameter | Description |
| ------- | ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
| `int` | `output_height` | height of output roi |
| `int` | `output_width` | width of output roi |
| `float` | `spatial_scale` | used to scale the input boxes |
| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
| `int` | `clockwise` | If True, the angle in each proposal follows a clockwise fashion in image space, otherwise, the angle is counterclockwise. Default: False. |
#### Inputs
<dl>
<dt><tt>input</tt>: T</dt>
<dd>Input feature map; 4D tensor of shape (N, C, H, W), where N is the batch size, C is the numbers of channels, H and W are the height and width of the data.</dd>
<dt><tt>rois</tt>: T</dt>
<dd>RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 6) given as [[batch_index, cx, cy, w, h, theta], ...]. The RoIs' coordinates are the coordinate system of input.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>feat</tt>: T</dt>
<dd>RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element feat[r-1] is a pooled feature map corresponding to the r-th RoI RoIs[r-1].</dd>
</dl>
#### Type Constraints
- T:tensor(float32)
### NMSMatch
#### Description
Non Max Suppression with the suppression box match.
#### Parameters
| Type | Parameter | Description |
| ------- | ----------- | --------------------------------- |
| `float` | `iou_thr` | The IoU threshold for NMSMatch. |
| `float` | `score_thr` | The score threshold for NMSMatch. |
#### Inputs
<dl>
<dt><tt>inputs[0]</tt>: T</dt>
<dd>Input boxes; 3-D tensor of shape (b, N, 4), where b is the batch size, N is the number of boxes and 4 means the coordinate.</dd>
<dt><tt>inputs[1]</tt>: T</dt>
<dd>Input scores; 3-D tensor of shape (b, c, N), where b is the batch size, c is the class size and N is the number of boxes.</dd>
</dl>
#### Outputs
<dl>
<dt><tt>outputs[0]</tt>: T</dt>
<dd>Output matches; 2-D tensor of shape (K, 4), where K is the number of matched boxes and the 4 values of each row are the batch id, class id, selected box index and suppressed box index.</dd>
</dl>
#### Type Constraints
- T:tensor(float32)