**Make sure cmake version >= 3.14.0**. The script below shows how to install cmake 3.20.0. You can find more versions [here](https://cmake.org/install).
<td>Please install conda according to the official <a href="https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html">guide</a>. <br>
Create a conda virtual environment and activate it. <br>
<pre><code>
conda create -n mmdeploy python=3.7 -y
conda activate mmdeploy
</code></pre>
</td>
</tr>
<tr>
<td>PyTorch <br>(>=1.8.0) </td>
<td>
Install PyTorch>=1.8.0 by following the <a href="https://pytorch.org/">official instructions</a>. Be sure the CUDA version PyTorch requires matches that in your host.
</td>
</tr>
<tr>
<td>mmcv </td>
<td>Install mmcv as follows. Refer to the <a href="https://github.com/open-mmlab/mmcv/tree/2.x#installation">guide</a> for details.
<pre><code>
export cu_version=cu111 # cuda 11.1
export torch_version=torch1.8
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0rc2"
</code></pre>
</td>
</tr>
</tbody>
</table>
### Install Dependencies for SDK
You can skip this chapter if you are only interested in the model converter.
<table class="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>OpenCV<br>(>=3.0) </td>
<td>
On Ubuntu >=18.04,
<pre><code>
sudo apt-get install libopencv-dev
</code></pre>
On Ubuntu 16.04, OpenCV has to be built from the source code. Please refer to the <a href="https://docs.opencv.org/3.4/d7/d9f/tutorial_linux_install.html">guide</a>.
</td>
</tr>
<tr>
<td>pplcv </td>
<td>A high-performance image processing library of openPPL.<br>
<b>It is optional and only needed when the <code>cuda</code> platform is required.</b><br>
2. Download the linux prebuilt binary package from <a href="https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1">here</a>. Extract it and export environment variables as below:
1. Login <a href="https://www.nvidia.com/">NVIDIA</a> and download the TensorRT tar file that matches the CPU architecture and CUDA version you are using from <a href="https://developer.nvidia.com/nvidia-tensorrt-download">here</a>. Follow the <a href="https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-tar">guide</a> to install TensorRT. <br>
2. Here is an example of installing TensorRT 8.2 GA Update 2 for Linux x86_64 and CUDA 11.x that you can refer to. First of all, click <a href="https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/8.2.3.0/tars/tensorrt-8.2.3.0.linux.x86_64-gnu.cuda-11.4.cudnn8.2.tar.gz">here</a> to download CUDA 11.x TensorRT 8.2.3.0 and then install it and other dependencies as below:
<pre><code>
cd /the/path/of/tensorrt/tar/gz/file
tar -zxvf TensorRT-8.2.3.0.Linux.x86_64-gnu.cuda-11.4.cudnn8.2.tar.gz
1. Download cuDNN that matches the CPU architecture, CUDA version and TensorRT version you are using from the <a href="https://developer.nvidia.com/rdp/cudnn-archive">cuDNN Archive</a>. <br>
In the above TensorRT installation example, it requires cudnn8.2. Thus, you can download <a href="https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.2.1.32/11.3_06072021/cudnn-11.3-linux-x64-v8.2.1.32.tgz">CUDA 11.x cuDNN 8.2</a><br>
2. Extract the compressed file and set the environment variables
1. Please follow the <a href="https://github.com/openppl-public/ppl.nn/blob/master/docs/en/building-from-source.md">guide</a> to build <code>ppl.nn</code> and install <code>pyppl</code>.<br>
2. Export pplnn's root path to environment variable
2. <b>Optional</b>. If you want to use OpenVINO in MMDeploy SDK, please install and configure it by following the <a href="https://docs.openvino.ai/2021.4/openvino_docs_install_guides_installing_openvino_linux.html#install-openvino">guide</a>.
</td>
</tr>
<tr>
<td>ncnn </td>
<td>1. Download and build ncnn according to its <a href="https://github.com/Tencent/ncnn/wiki/how-to-build">wiki</a>.
Make sure to enable <code>-DNCNN_PYTHON=ON</code> in your build command. <br>
2. Export ncnn's root path to environment variable
1. Download libtorch from <a href="https://pytorch.org/get-started/locally/">here</a>. Please note that only <b>Pre-cxx11 ABI</b> and <b>version 1.8.1+</b> on the Linux platform are supported for now. For previous versions of libtorch, you can find them in this <a href="https://github.com/pytorch/pytorch/issues/40961#issuecomment-1017317786">issue comment</a>. <br>
2. Take Libtorch 1.8.1+cu111 as an example. You can install it like this:
- Some dependencies are optional. Simply running `pip install -e .` will only install the minimum runtime requirements.
To use optional dependencies, install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -e .[optional]`).
Valid keys for the extras field are: `all`, `tests`, `build`, `optional`.
- It is recommended to [install the patch for cuda10](https://developer.nvidia.com/cuda-10.2-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=runfilelocal), otherwise GEMM-related errors may occur when the model runs
### Build SDK and Demo
MMDeploy provides two recipes as shown below for building SDK with ONNXRuntime and TensorRT as inference engines respectively.
You can also enable other inference engines in the same way.
If the [ncnn auto-install script](../../../tools/scripts/build_ubuntu_x64_ncnn.py) is used, protobuf will be installed in mmdeploy-dep/pbinstall in the same directory as mmdeploy.
You can skip this chapter if you are only interested in the model converter.
<tableclass="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>OpenCV<br>(>=3.0) </td>
<td>
<pre><code>
brew install opencv
</code></pre>
</td>
</tr>
</tbody>
</table>
### Install Inference Engines for MMDeploy
Both MMDeploy's model converter and SDK share the same inference engines.
You can select the inference engines you are interested in and install them by following the given commands.
This document focuses on Core ML. The installation of ONNX Runtime, ncnn and TorchScript is similar to that on the Linux platform; please refer to the document [linux-x86_64](linux-x86_64.md) for installation.
The TorchScript model is used as the IR in the conversion process of the Core ML model. In order to support custom operators in some models, such as detection models in mmdet, libtorch needs to be installed.
<table class="docutils">
<thead>
<tr>
<th>NAME</th>
<th>PACKAGE</th>
<th>INSTALLATION</th>
</tr>
</thead>
<tbody>
<tr>
<td>Core ML</td>
<td>coremltools</td>
<td>
<pre><code>
pip install coremltools==6.3
</code></pre>
</td>
</tr>
<tr>
<td>TorchScript</td>
<td>libtorch</td>
<td>
1. Libtorch doesn't provide a prebuilt arm library for macOS, so you need to compile it yourself. Please note that the version of libtorch must be consistent with the version of PyTorch. <br>
2. Take LibTorch 1.9.0 as an example. You can install it like this:
```bash
# You should use `conda install` to install the grpcio in requirements/runtime.txt
conda install grpcio
```
```bash
cd ${MMDEPLOY_DIR}
mim install -v -e .
```
**Note**
- Some dependencies are optional. Simply running `pip install -e .` will only install the minimum runtime requirements.
To use optional dependencies, install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -e .[optional]`).
Valid keys for the extras field are: `all`, `tests`, `build`, `optional`.
### Build SDK and Demo
The following shows an example of building an SDK using Core ML as the inference engine.
After `make install`, the examples will be located in `install/bin`:
```
tree -L 1 install/bin/
.
├── image_classification
├── image_restorer
├── image_segmentation
├── object_detection
├── ocr
├── pose_detection
└── rotated_object_detection
```
### 4) Run the demo
First make sure that `--dump-info` is used during model conversion, so that the `resnet18` directory has the files required by the SDK, such as `pipeline.json`.
Copy the model folder (resnet18), the executable file (image_classification) and the test image (tests/data/tiger.jpeg) to the device.
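If you prefer to drive the same dumped model from Python rather than the prebuilt executable, the SDK Python API can be used as well. Below is a minimal sketch, assuming the `mmdeploy_runtime` package is installed and the paths match the layout above (adjust them to your device):

```python
import cv2
from mmdeploy_runtime import Classifier

# paths below are illustrative: the dumped model directory and a test image
img = cv2.imread('tests/data/tiger.jpeg')
classifier = Classifier(model_path='./resnet18', device_name='cpu', device_id=0)

# the classifier returns (label_id, score) pairs for the input image
for label_id, score in classifier(img):
    print(label_id, score)
```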
2. Install the RKNN python package following the [rknn-toolkit2 doc](https://github.com/rockchip-linux/rknn-toolkit2/tree/master/doc) or the [rknn-toolkit doc](https://github.com/rockchip-linux/rknn-toolkit/tree/master/docs). When installing the rknn python package, it is better to append `--no-deps` after the commands to avoid dependency conflicts. Take the RKNN-Toolkit2 package for example:
3. Install ONNX==1.8.0 before reinstalling MMDeploy from source following the [instructions](../01-how-to-build/build_from_source.md). Note that there are conflicts between the pip dependencies of MMDeploy and RKNN. Here are the suggested package versions for python 3.6:
```
protobuf==3.19.4
onnx==1.8.0
onnxruntime==1.8.0
torch==1.8.0
torchvision==0.9.0
```
4. Install torch and torchvision using conda. For example:
The contents of `common_config` are passed to `rknn.config()`. The contents of `quantization_config` are used to control `rknn.build()`. You may have to modify `target_platform` according to your own device.
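A minimal sketch of how these two blocks typically sit in the deploy config is shown below; the concrete keys and values (such as `target_platform='rk3588'`) are illustrative assumptions and should be adapted to your chip and toolkit version:

```python
# illustrative sketch of an rknn backend config; keys/values are assumptions
backend_config = dict(
    type='rknn',
    common_config=dict(
        target_platform='rk3588',  # forwarded to rknn.config(); change for your chip
        optimization_level=1),
    quantization_config=dict(
        do_quantization=True,      # forwarded to rknn.build()
        dataset=None))
```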
2. For Linux, download the gcc cross compiler. The download link of the compiler from the official user guide of `rknpu2` is deprecated, so you may use another verified [link](https://github.com/Caesar-github/gcc-buildroot-9.3.0-2020.03-x86_64_aarch64-rockchip-linux-gnu). After downloading and unzipping the compiler, open a terminal and set `RKNN_TOOL_CHAIN` and `RKNPU2_DEVICE_DIR` by `export RKNN_TOOL_CHAIN=/path/to/gcc/usr;export RKNPU2_DEVICE_DIR=/path/to/rknpu2/runtime/RK3588`.
3. After the above preparation, run the following commands:
First make sure that `--dump-info` is used during model conversion, so that the working directory has the files required by the SDK, such as `pipeline.json`.
`adb push` the model directory, executable file and .so to the device.
```
./image_classification cpu ./resnet50 ./resnet50/demo.JPEG
...
label: 65, score: 0.95
```
## Troubleshooting
- MMDet models.
YOLOv3 & YOLOX: you may paste the following partition configuration into [detection_rknn-int8_static-320x320.py](https://github.com/open-mmlab/mmdeploy/blob/main/configs/mmdet/detection/detection_rknn-int8_static-320x320.py):
```python
# yolov3, yolox for rknn-toolkit and rknn-toolkit2
partition_config = dict(
    type='rknn',  # the partition policy name
    apply_marks=True,  # should always be set to True
    partition_cfg=[
        dict(
            save_file='model.onnx',  # name to save the partitioned onnx
```
RTMDet: you may paste the following partition configuration into [detection_rknn-int8_static-640x640.py](https://github.com/open-mmlab/mmdeploy/blob/main/configs/mmdet/detection/detection_rknn-int8_static-640x640.py):
```python
# rtmdet for rknn-toolkit and rknn-toolkit2
partition_config = dict(
    type='rknn',  # the partition policy name
    apply_marks=True,  # should always be set to True
    partition_cfg=[
        dict(
            save_file='model.onnx',  # name to save the partitioned onnx
```
RetinaNet & SSD & FSAF with rknn-toolkit2: you may paste the following partition configuration into [detection_rknn-int8_static-320x320.py](https://github.com/open-mmlab/mmdeploy/blob/main/configs/mmdet/detection/detection_rknn-int8_static-320x320.py). Users with rknn-toolkit can directly use the default config.
Take ResNet-18 as an example. First refer to the [documentation to install mmpretrain](https://github.com/open-mmlab/mmpretrain/tree/main) and use `tools/deploy.py` to convert the model.
| ANDROID_STL=c++\_static | In case of NDK environment can not find suitable c++ library |
| MMDEPLOY_SHARED_LIBS=ON | snpe does not provide static library |
[Here](../01-how-to-build/cmake_option.md) is all cmake build option description.
### 3) Run the demo
First make sure that `--dump-info` is used during model conversion, so that the `resnet18` directory has the files required by the SDK, such as `pipeline.json`.
`adb push` the model directory, executable file and .so to the device.
All the commands listed in the following chapters are verified on **Windows 10**.
### Install Toolchains
1. Download and install [Visual Studio 2019](https://visualstudio.microsoft.com)
2. Add the path of `cmake` to the environment variable `PATH`, e.g., "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\Common7\\IDE\\CommonExtensions\\Microsoft\\CMake\\CMake\\bin"
3. Install cuda toolkit if NVIDIA gpu is available. You can refer to the official [guide](https://developer.nvidia.com/cuda-downloads).
### Install Dependencies
#### Install Dependencies for Model Converter
<table class="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>conda </td>
<td> Please install conda according to the official <a href="https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html">guide</a>. <br>
After installation, open <code>anaconda powershell prompt</code> under the Start Menu <b>as the administrator</b>, because: <br>
1. <b>All the commands listed in the following text are verified in anaconda powershell </b><br>
2. <b>As an administrator, you can install the thirdparty libraries to the system path so as to simplify MMDeploy build command</b><br>
Note: if you are familiar with how cmake works, you can also use <code>anaconda powershell prompt</code> as an ordinary user.
</td>
</tr>
<tr>
<td>PyTorch <br>(>=1.8.0) </td>
<td>
Install PyTorch>=1.8.0 by following the <a href="https://pytorch.org/">official instructions</a>. Be sure the CUDA version PyTorch requires matches that in your host.
</td>
</tr>
<tr>
<td>mmcv </td>
<td>Install mmcv as follows. Refer to the <a href="https://github.com/open-mmlab/mmcv/tree/2.x#installation">guide</a> for details.
<pre><code>
$env:cu_version="cu111"
$env:torch_version="torch1.8.0"
pip install -U openmim
mim install "mmcv>=2.0.0rc1"
</code></pre>
</td>
</tr>
</tbody>
</table>
#### Install Dependencies for SDK
You can skip this chapter if you are only interested in the model converter.
<table class="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>OpenCV<br>(>=3.0) </td>
<td>
1. Find and download OpenCV 3+ for windows from <a href="https://github.com/opencv/opencv/releases">here</a>.<br>
2. You can download the prebuilt package and install it to the target directory. Or you can build OpenCV from its source. <br>
3. Find where <code>OpenCVConfig.cmake</code> is located in the installation directory, and export its path to the environment variable <code>PATH</code> like this:
2. Download the windows prebuilt binary package from <a href="https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1">here</a>. Extract it and export environment variables as below:
1. Login <a href="https://www.nvidia.com/">NVIDIA</a> and download the TensorRT zip file that matches the CPU architecture and CUDA version you are using from <a href="https://developer.nvidia.com/nvidia-tensorrt-download">here</a>. Follow the <a href="https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-tar">guide</a> to install TensorRT. <br>
2. Here is an example of installing TensorRT 8.2 GA Update 2 for Windows x86_64 and CUDA 11.x that you can refer to. <br> First of all, click <a href="https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/8.2.3.0/zip/TensorRT-8.2.3.0.Windows10.x86_64.cuda-11.4.cudnn8.2.zip">here</a> to download CUDA 11.x TensorRT 8.2.3.0 and then install it and other dependencies as below:
1. Download cuDNN that matches the CPU architecture, CUDA version and TensorRT version you are using from the <a href="https://developer.nvidia.com/rdp/cudnn-archive">cuDNN Archive</a>. <br>
In the above TensorRT installation example, it requires cudnn8.2. Thus, you can download <a href="https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.2.1.32/11.3_06072021/cudnn-11.3-windows-x64-v8.2.1.32.zip">CUDA 11.x cuDNN 8.2</a><br>
2. Extract the zip file and set the environment variables
- Some dependencies are optional. Simply running `pip install -e .` will only install the minimum runtime requirements.
To use optional dependencies, install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -e .[optional]`).
Valid keys for the extras field are: `all`, `tests`, `build`, `optional`.
#### Build SDK and Demos
MMDeploy provides two recipes as shown below for building SDK with ONNXRuntime and TensorRT as inference engines respectively.
You can also enable other inference engines in the same way.
- cpu + ONNXRuntime
```PowerShell
cd $env:MMDEPLOY_DIR
mkdir build -ErrorAction SilentlyContinue
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -T v142 `
-DMMDEPLOY_BUILD_SDK=ON `
-DMMDEPLOY_BUILD_EXAMPLES=ON `
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON `
-DMMDEPLOY_TARGET_DEVICES="cpu" `
-DMMDEPLOY_TARGET_BACKENDS="ort" `
-DONNXRUNTIME_DIR="$env:ONNXRUNTIME_DIR"
cmake --build . --config Release -- /m
cmake --install . --config Release
```
- cuda + TensorRT
```PowerShell
cd $env:MMDEPLOY_DIR
mkdir build -ErrorAction SilentlyContinue
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -T v142 `
```
1. Release / Debug libraries cannot be mixed. If MMDeploy is built in Release mode, all its dependent thirdparty libraries have to be built in Release mode too, and vice versa.
This tutorial briefly introduces how to export an OpenMMLab model to a specific backend using MMDeploy tools.
Notes:
- Supported backends are [ONNXRuntime](../05-supported-backends/onnxruntime.md), [TensorRT](../05-supported-backends/tensorrt.md), [ncnn](../05-supported-backends/ncnn.md), [PPLNN](../05-supported-backends/pplnn.md), [OpenVINO](../05-supported-backends/openvino.md).
- Supported codebases are [MMPretrain](../04-supported-codebases/mmpretrain.md), [MMDetection](../04-supported-codebases/mmdet.md), [MMSegmentation](../04-supported-codebases/mmseg.md), [MMOCR](../04-supported-codebases/mmocr.md), [MMagic](../04-supported-codebases/mmagic.md).
## How to convert models from Pytorch to other backends
### Prerequisite
1. Install and build your target backend. You could refer to [ONNXRuntime-install](../05-supported-backends/onnxruntime.md), [TensorRT-install](../05-supported-backends/tensorrt.md), [ncnn-install](../05-supported-backends/ncnn.md), [PPLNN-install](../05-supported-backends/pplnn.md), [OpenVINO-install](../05-supported-backends/openvino.md) for more information.
2. Install and build your target codebase. You could refer to [MMPretrain-install](https://mmpretrain.readthedocs.io/en/latest/get_started.html#installation), [MMDetection-install](https://mmdetection.readthedocs.io/en/latest/get_started.html#installation), [MMSegmentation-install](https://mmsegmentation.readthedocs.io/en/latest/get_started.html#installation), [MMOCR-install](https://mmocr.readthedocs.io/en/latest/get_started/install.html#installation-steps), [MMagic-install](https://mmagic.readthedocs.io/en/latest/get_started/install.html#installation).
### Usage
```bash
python ./tools/deploy.py \
    ${DEPLOY_CFG_PATH} \
    ${MODEL_CFG_PATH} \
    ${MODEL_CHECKPOINT_PATH} \
    ${INPUT_IMG} \
    --test-img ${TEST_IMG} \
    --work-dir ${WORK_DIR} \
    --calib-dataset-cfg ${CALIB_DATA_CFG} \
    --device ${DEVICE} \
    --log-level INFO \
    --show \
    --dump-info
```
### Description of all arguments
- `deploy_cfg` : The deployment configuration of mmdeploy for the model, including the type of inference framework, whether to quantize, whether the input shape is dynamic, etc. There may be a reference relationship between configuration files; `mmdeploy/mmpretrain/classification_ncnn_static.py` is an example.
- `model_cfg` : Model configuration for the algorithm library, e.g. `mmpretrain/configs/vision_transformer/vit-base-p32_ft-64xb64_in1k-384.py`, regardless of the path to mmdeploy.
- `checkpoint` : torch model path. It can start with http/https; see the implementation of `mmcv.FileClient` for details.
- `img` : The path to the image or point cloud file used for testing during the model conversion.
- `--test-img` : The path of the image file that is used to test the model. If not specified, it will be set to `None`.
- `--work-dir` : The path of the work directory that is used to save logs and models.
- `--calib-dataset-cfg` : Only valid in int8 mode. The config used for calibration. If not specified, it will be set to `None` and use the "val" dataset in the model config for calibration.
- `--device` : The device used for model conversion. If not specified, it will be set to `cpu`. For trt, use the `cuda:0` format.
- `--log-level` : To set the log level, one of `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
- `--show` : Whether to show detection outputs.
- `--dump-info` : Whether to output information for the SDK.
### How to find the corresponding deployment config of a PyTorch model
1. Find the model's codebase folder in `configs/`. For converting a yolov3 model, you need to check `configs/mmdet` folder.
2. Find the model's task folder in `configs/codebase_folder/`. For a yolov3 model, you need to check `configs/mmdet/detection` folder.
3. Find the deployment config file in `configs/codebase_folder/task_folder/`. For deploying a yolov3 model to the onnx backend, you could use `configs/mmdet/detection/detection_onnxruntime_dynamic.py`.
If the model preprocessing supports fusion, there will be a field named `fuse_transform` in `pipeline.json`. It is the fusion switch, and the default value `false` means off. You need to set this field to `true` to use the fuse option, as sketched after this paragraph.
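A small hedged sketch of flipping that switch programmatically is shown below; the file path is a placeholder for the `pipeline.json` produced by `--dump-info`:

```python
import json

pipeline_path = 'work_dir/resnet18/pipeline.json'  # placeholder path

with open(pipeline_path) as f:
    pipeline = json.load(f)

def enable_fuse(node):
    """Recursively set every `fuse_transform` switch in the pipeline to True."""
    if isinstance(node, dict):
        if 'fuse_transform' in node:
            node['fuse_transform'] = True
        for value in node.values():
            enable_fuse(value)
    elif isinstance(node, list):
        for item in node:
            enable_fuse(item)

enable_fuse(pipeline)

with open(pipeline_path, 'w') as f:
    json.dump(pipeline, f, indent=2)
```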
This tutorial takes `mmdeploy-1.3.1-windows-amd64.zip` and `mmdeploy-1.3.1-windows-amd64-cuda11.8.zip` as examples to show how to use the prebuilt packages. The former supports onnxruntime cpu inference, while the latter supports onnxruntime-gpu and tensorrt inference.
The directory structure of the prebuilt package is as follows, where the `dist` folder is about model converter, and the `sdk` folder is related to model inference.
```
.
├── build_sdk.ps1
├── example
├── include
├── install_opencv.ps1
├── lib
├── README.md
├── set_env.ps1
└── thirdparty
```
## Prerequisite
In order to use the prebuilt package, you need to install some third-party dependent libraries.
1. Follow the [get_started](../get_started.md) documentation to create a virtual python environment and install pytorch, torchvision and mmcv. To use the C interface of the SDK, you need to install [vs2019+](https://visualstudio.microsoft.com/), [OpenCV](https://github.com/opencv/opencv/releases).
:point_right: It is recommended to use `pip` instead of `conda` to install pytorch and torchvision
2. Clone the mmdeploy repository
```bash
git clone -b main https://github.com/open-mmlab/mmdeploy.git
```
:point_right: The main purpose here is to use the configs, so there is no need to compile `mmdeploy`.
3. Install mmpretrain
```bash
git clone -b main https://github.com/open-mmlab/mmpretrain.git
cd mmpretrain
pip install -e .
```
4. Prepare a PyTorch model as our example
Download the pth [resnet18_8xb32_in1k_20210831-fbbb1da6.pth](https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_8xb32_in1k_20210831-fbbb1da6.pth). The corresponding config of the model is [resnet18_8xb32_in1k.py](https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnet/resnet18_8xb32_in1k.py)
After the above work is done, the structure of the current working directory should be:
```
.
|-- mmpretrain
|-- mmdeploy
|-- resnet18_8xb32_in1k_20210831-fbbb1da6.pth
```
### ONNX Runtime
In order to use `ONNX Runtime` backend, you should also do the following steps.
5. Install `mmdeploy` (Model Converter) and `mmdeploy_runtime` (SDK Python API).
```bash
pip install mmdeploy==1.3.1
pip install mmdeploy-runtime==1.3.1
```
:point_right: If you have installed it before, please uninstall it first.
6. Install onnxruntime package
```
pip install onnxruntime==1.8.1
```
7. Download [`onnxruntime`](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1), and add environment variable.
Add the lib directory of onnxruntime to the `PATH`.
:exclamation: Restart powershell to make the environment variables setting take effect. You can check whether the settings are in effect by `echo $env:PATH`.
### TensorRT
In order to use the `TensorRT` backend, you should also do the following steps.
5. Install `mmdeploy` (Model Converter) and `mmdeploy_runtime` (SDK Python API).
```bash
pip install mmdeploy==1.3.1
pip install mmdeploy-runtime-gpu==1.3.1
```
:point_right: If you have installed it before, please uninstall it first.
6. Install TensorRT related package and set environment variables
- CUDA Toolkit 11.8
- TensorRT 8.6.1.6
- cuDNN 8.6.0
Add the runtime libraries of TensorRT and cuDNN to the `PATH`. You can refer to the path setting of onnxruntime. Don't forget to install the Python package of TensorRT.
:exclamation: Restart powershell to make the environment variables setting take effect. You can check whether the settings are in effect by echo `$env:PATH`.
:exclamation: It is recommended to add only one version of the TensorRT/cuDNN runtime libraries to the `PATH`. It is better not to copy the runtime libraries of TensorRT/cuDNN to the cuda directory in `C:\`.
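As a quick, hedged sanity check that the TensorRT Python package is visible from your environment (the expected version string is an assumption based on the setup above):

```python
# verify the TensorRT Python package is importable; the version should match the install
import tensorrt as trt

print(trt.__version__)  # expect something like 8.6.1 for the setup above
```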
You can obtain two model folders after model conversion.
```
.\work_dir\onnx\resnet
.\work_dir\trt\resnet
```
The structure of current working directory:
```
.
|-- mmdeploy-1.3.1-windows-amd64
|-- mmdeploy-1.3.1-windows-amd64-cuda11.8
|-- mmpretrain
|-- mmdeploy
|-- resnet18_8xb32_in1k_20210831-fbbb1da6.pth
`-- work_dir
```
### Backend Inference
:exclamation: It should be emphasized that `inference_model` is not for deployment, but shields the differences of backend inference APIs (`TensorRT`, `ONNX Runtime`, etc.). The main purpose of this API is to check whether the converted model can be inferred normally.
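A minimal sketch of how `inference_model` is typically called is shown below; all paths are placeholders for your own configs and converted model files:

```python
from mmdeploy.apis import inference_model

# all paths below are placeholders for your own files
result = inference_model(
    model_cfg='mmpretrain/configs/resnet/resnet18_8xb32_in1k.py',
    deploy_cfg='configs/mmpretrain/classification_onnxruntime_dynamic.py',
    backend_files=['work_dir/onnx/resnet/end2end.onnx'],
    img='mmpretrain/demo/demo.JPEG',
    device='cpu')
print(result)
```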
After converting a PyTorch model to a backend model, you may evaluate backend models with `tools/test.py`
## Prerequisite
Install MMDeploy according to [get-started](../get_started.md) instructions.
And convert the PyTorch model or ONNX model to the backend model by following the [guide](convert_model.md).
## Usage
```shell
python tools/test.py \
    ${DEPLOY_CFG} \
    ${MODEL_CFG} \
    --model ${BACKEND_MODEL_FILES} \
    [--out ${OUTPUT_PKL_FILE}] \
    [--format-only] \
    [--metrics ${METRICS}] \
    [--show] \
    [--show-dir ${OUTPUT_IMAGE_DIR}] \
    [--show-score-thr ${SHOW_SCORE_THR}] \
    --device ${DEVICE} \
    [--cfg-options ${CFG_OPTIONS}] \
    [--metric-options ${METRIC_OPTIONS}] \
    [--log2file work_dirs/output.txt] \
    [--batch-size ${BATCH_SIZE}] \
    [--speed-test] \
    [--warmup ${WARM_UP}] \
    [--log-interval ${LOG_INTERVAL}]
```
## Description of all arguments
- `deploy_cfg`: The config for deployment.
- `model_cfg`: The config of the model in OpenMMLab codebases.
- `--model`: The backend model file. For example, if we convert a model to TensorRT, we need to pass the model file with the ".engine" suffix.
- `--out`: The path to save output results in pickle format. (The results will be saved only if this argument is given)
- `--format-only`: Whether to format the output results without evaluation. It is useful when you want to format the result to a specific format and submit it to the test server.
- `--metrics`: The metrics to evaluate the model, defined in OpenMMLab codebases, e.g. "segm", "proposal" for COCO in mmdet, "precision", "recall", "f1_score", "support" for single-label datasets in mmpretrain.
- `--show`: Whether to show the evaluation result on the screen.
- `--show-dir`: The directory to save the evaluation result. (The results will be saved only if this argument is given)
- `--show-score-thr`: The threshold determining whether to show detection bounding boxes.
- `--device`: The device that the model runs on. Note that some backends restrict the device. For example, TensorRT must run on cuda.
- `--cfg-options`: Extra or overridden settings that will be merged into the current deploy config.
- `--metric-options`: Custom options for evaluation. The key-value pairs in xxx=yyy format will be kwargs for the dataset.evaluate() function.
- `--log2file`: Log evaluation results (and speed) to file.
- `--batch-size`: The batch size for inference, which would override `samples_per_gpu` in the data config. Default is `1`. Note that not all models support `batch_size>1`.
- `--speed-test`: Whether to activate speed test.
- `--warmup`: Warm up before counting inference time; requires enabling speed-test first.
- `--log-interval`: The interval between each log; requires enabling speed-test first.
\* Other arguments in `tools/test.py` are used for speed test. They are not relevant to evaluation.
The fixed-point model has many advantages over the fp32 model:
- Smaller size: an 8-bit model reduces the file size by 75%
- Benefiting from the smaller model, the cache hit rate is improved and inference is faster
- Chips tend to have corresponding fixed-point acceleration instructions, which are faster and consume less energy (int8 on a common CPU requires only about 10% of the energy)
APK file size and heat generation are key indicators when evaluating a mobile app;
on the server side, quantization means that you can increase the model size in exchange for precision while keeping the same QPS.
## Post training quantization scheme
Taking ncnn backend as an example, the complete workflow is as follows:
<div align="center">
<img src="../_static/image/quant_model.png"/>
</div>
mmdeploy generates a quantization table based on the static graph (onnx) and uses backend tools to convert the fp32 model to fixed point.
It is highly recommended to [verify model precision](profile_model.md) after quantization. [Here](../03-benchmark/quantization.md) are some quantized model test results.
Apart from `deploy.py`, there are other useful tools under the `tools/` directory.
## torch2onnx
This tool can be used to convert PyTorch model from OpenMMLab to ONNX.
### Usage
```bash
python tools/torch2onnx.py \
    ${DEPLOY_CFG} \
    ${MODEL_CFG} \
    ${CHECKPOINT} \
    ${INPUT_IMG} \
    --work-dir ${WORK_DIR} \
    --device cpu \
    --log-level INFO
```
### Description of all arguments
- `deploy_cfg` : The path of the deploy config file in the MMDeploy codebase.
- `model_cfg` : The path of the model config file in the OpenMMLab codebase.
- `checkpoint` : The path of the model checkpoint file.
- `img` : The path of the image file used to convert the model.
- `--work-dir` : Directory to save output ONNX models. Default is `./work-dir`.
- `--device` : The device used for conversion. If not specified, it will be set to `cpu`.
- `--log-level` : To set the log level, one of `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
## extract
ONNX model with `Mark` nodes in it can be partitioned into multiple subgraphs. This tool can be used to extract the subgraph from the ONNX model.
### Usage
```bash
python tools/extract.py \
    ${INPUT_MODEL} \
    ${OUTPUT_MODEL} \
    --start ${PARTITION_START} \
    --end ${PARTITION_END} \
    --log-level INFO
```
### Description of all arguments
- `input_model` : The path of the input ONNX model. The output ONNX model will be extracted from this model.
- `output_model` : The path of the output ONNX model.
- `--start` : The start point of the extracted model in the format `<function_name>:<input/output>`. The `function_name` comes from the decorator `@mark`.
- `--end` : The end point of the extracted model in the format `<function_name>:<input/output>`. The `function_name` comes from the decorator `@mark`.
- `--log-level` : To set the log level, one of `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
### Note
To support the model partition, you need to add Mark nodes in the ONNX model. The Mark node comes from the `@mark` decorator.
For example, if we have marked the `multiclass_nms` as below, we can set `end=multiclass_nms:input` to extract the subgraph before NMS.
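A simplified sketch of such a mark is shown below; the function body is a stub, since the point is only that the `@mark` decorator inserts the Mark nodes that `tools/extract.py` later uses as cut points:

```python
from mmdeploy.core import mark


@mark('multiclass_nms', inputs=['boxes', 'scores'], outputs=['dets', 'labels'])
def multiclass_nms(boxes, scores, *args, **kwargs):
    """Stub wrapper: the decorator wraps the real NMS and adds Mark nodes,
    so `--start multiclass_nms:input` / `--end multiclass_nms:input` work."""
    ...
```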
## onnx2pplnn
This tool helps to convert an `ONNX` model to a `PPLNN` model.
### Usage
```bash
python tools/onnx2pplnn.py \
    ${ONNX_PATH} \
    ${OUTPUT_PATH} \
    --device cuda:0 \
    --opt-shapes [224,224] \
    --log-level INFO
```
### Description of all arguments
- `onnx_path`: The path of the `ONNX` model to convert.
- `output_path`: The converted `PPLNN` algorithm path in json format.
- `device`: The device of the model during conversion.
- `opt-shapes`: Optimal shapes for PPLNN optimization. The shape of each tensor should be wrapped with "[]" or "()", and the shapes of tensors should be separated by ",".
- `--log-level`: To set the log level, one of `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
## onnx2tensorrt
This tool can be used to convert ONNX to TensorRT engine.
### Usage
```bash
python tools/onnx2tensorrt.py \
    ${DEPLOY_CFG} \
    ${ONNX_PATH} \
    ${OUTPUT} \
    --device-id 0 \
    --log-level INFO \
    --calib-file /path/to/file
```
### Description of all arguments
- `deploy_cfg` : The path of the deploy config file in the MMDeploy codebase.
- `onnx_path` : The ONNX model path to convert.
- `output` : The path of the output TensorRT engine.
- `--device-id` : The device index, defaults to `0`.
- `--calib-file` : The calibration data used to calibrate the engine to int8.
- `--log-level` : To set the log level, one of `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
## onnx2ncnn
This tool helps to convert an `ONNX` model to an `ncnn` model.
### Usage
```bash
python tools/onnx2ncnn.py \
    ${ONNX_PATH} \
    ${NCNN_PARAM} \
    ${NCNN_BIN} \
    --log-level INFO
```
### Description of all arguments
- `onnx_path` : The path of the `ONNX` model to convert from.
- `output_param` : The converted `ncnn` param path.
- `output_bin` : The converted `ncnn` bin path.
- `--log-level` : To set the log level, one of `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
## profiler
This tool helps to test latency of models with PyTorch, TensorRT and other backends. Note that the pre- and post-processing is excluded when computing inference latency.
### Usage
```bash
python tools/profiler.py \
    ${DEPLOY_CFG} \
    ${MODEL_CFG} \
    ${IMAGE_DIR} \
    --model ${MODEL} \
    --device ${DEVICE} \
    --shape ${SHAPE} \
    --num-iter ${NUM_ITER} \
    --warmup ${WARMUP} \
    --cfg-options ${CFG_OPTIONS} \
    --batch-size ${BATCH_SIZE} \
    --img-ext ${IMG_EXT}
```
### Description of all arguments
- `deploy_cfg` : The path of the deploy config file in the MMDeploy codebase.
- `model_cfg` : The path of the model config file in the OpenMMLab codebase.
- `image_dir` : The directory of image files used to test the model.
- `--model` : The path of the model to be tested.
- `--shape` : Input shape of the model in `HxW` format, e.g., `800x1344`. If not specified, it would use `input_shape` from the deploy config.
- `--num-iter` : Number of iterations to run inference. Default is `100`.
- `--warmup` : Number of iterations to warm up the machine. Default is `10`.
- `--device` : The device type. If not specified, it will be set to `cuda:0`.
- `--cfg-options` : Optional key-value pairs to be overridden for the model config.
- `--batch-size` : The batch size for test inference. Default is `1`. Note that not all models support `batch_size>1`.
- `--img-ext` : The file extensions for input images from `image_dir`. Defaults to `['.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif']`.
This tutorial describes how to write a config for model conversion and deployment. A deployment config includes `onnx config`, `codebase config`, `backend config`.
<!-- TOC -->
- [How to write config](#how-to-write-config)
  - [1. How to write onnx config](#1-how-to-write-onnx-config)
    - [Description of onnx config arguments](#description-of-onnx-config-arguments)
      - [Example](#example)
    - [If you need to use dynamic axes](#if-you-need-to-use-dynamic-axes)
      - [Example](#example-1)
  - [2. How to write codebase config](#2-how-to-write-codebase-config)
    - [Description of codebase config arguments](#description-of-codebase-config-arguments)
      - [Example](#example-2)
  - [3. How to write backend config](#3-how-to-write-backend-config)
    - [Example](#example-3)
  - [4. A complete example of mmpretrain on TensorRT](#4-a-complete-example-of-mmpretrain-on-tensorrt)
  - [5. The name rules of our deployment config](#5-the-name-rules-of-our-deployment-config)
    - [Example](#example-4)
  - [6. How to write model config](#6-how-to-write-model-config)
<!-- TOC -->
## 1. How to write onnx config
Onnx config to describe how to export a model from pytorch to onnx.
### Description of onnx config arguments
- `type`: Type of config dict. Default is `onnx`.
- `export_params`: If specified, all parameters will be exported. Set this to False if you want to export an untrained model.
- `keep_initializers_as_inputs`: If True, all the initializers (typically corresponding to parameters) in the exported graph will also be added as inputs to the graph. If False, then initializers are not added as inputs to the graph, and only the non-parameter inputs are added as inputs.
- `opset_version`: Opset_version is 11 by default.
- `save_file`: Output onnx file.
- `input_names`: Names to assign to the input nodes of the graph.
- `output_names`: Names to assign to the output nodes of the graph.
- `input_shape`: The height and width of input tensor to the model.
### Example
```python
onnx_config = dict(
type='onnx',
export_params=True,
keep_initializers_as_inputs=False,
opset_version=11,
save_file='end2end.onnx',
input_names=['input'],
output_names=['output'],
input_shape=None)
```
### If you need to use dynamic axes
If the dynamic shape of inputs and outputs is required, you need to add dynamic_axes dict in onnx config.
- `dynamic_axes`: Describe the dimensional information about input and output.
#### Example
```python
dynamic_axes={
    'input': {
        0: 'batch',
        2: 'height',
        3: 'width'
    },
    'dets': {
        0: 'batch',
        1: 'num_dets',
    },
    'labels': {
        0: 'batch',
        1: 'num_dets',
    },
}
```
## 2. How to write codebase config
Codebase config part contains information like codebase type and task type.
### Description of codebase config arguments
- `type`: Model's codebase, including `mmpretrain`, `mmdet`, `mmseg`, `mmocr`, `mmagic`.
- `task`: Model's task type, referring to [List of tasks in all codebases](#list-of-tasks-in-all-codebases).
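For instance, a minimal codebase config for an image classification model from mmpretrain might look like the snippet below (the task name is the one used for classification in mmdeploy's task list):

```python
codebase_config = dict(type='mmpretrain', task='Classification')
```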
## 3. How to write backend config
The backend config is mainly used to specify the backend on which the model runs and provide the information needed when the model runs on the backend, referring to [ONNX Runtime](../05-supported-backends/onnxruntime.md), [TensorRT](../05-supported-backends/tensorrt.md), [ncnn](../05-supported-backends/ncnn.md), [PPLNN](../05-supported-backends/pplnn.md).
- `type`: Model's backend, including `onnxruntime`, `ncnn`, `pplnn`, `tensorrt`, `openvino`.
### Example
```python
backend_config = dict(
    type='tensorrt',
    common_config=dict(
        fp16_mode=False, max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 512, 1024],
                    opt_shape=[1, 3, 1024, 2048],
                    max_shape=[1, 3, 2048, 2048])))
    ])
```
## 4. A complete example of mmpretrain on TensorRT
Here we provide a complete deployment config from mmpretrain on TensorRT.
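A hedged sketch of such a config is shown below; the input shapes are illustrative for a 224x224 classification model and may differ from the configs shipped with mmdeploy:

```python
# illustrative end-to-end deploy config for mmpretrain on TensorRT
codebase_config = dict(type='mmpretrain', task='Classification')

backend_config = dict(
    type='tensorrt',
    common_config=dict(fp16_mode=False, max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 224, 224],
                    opt_shape=[4, 3, 224, 224],
                    max_shape=[64, 3, 224, 224])))
    ])

onnx_config = dict(
    type='onnx',
    export_params=True,
    keep_initializers_as_inputs=False,
    opset_version=11,
    save_file='end2end.onnx',
    input_names=['input'],
    output_names=['output'],
    input_shape=[224, 224],
    dynamic_axes={
        'input': {0: 'batch'},
        'output': {0: 'batch'},
    })
```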
## 5. The name rules of our deployment config
There is a specific naming convention for the filename of deployment config files.
```bash
(task name)_(backend name)_(dynamic or static).py
```
- `task name`: Model's task type.
- `backend name`: Backend's name. Note that if you use the quantization function, you need to indicate the quantization type, e.g. `tensorrt-int8`.
- `dynamic or static`: Dynamic or static export. Note that if the backend needs explicit shape information, you need to add a description of the input size in `height x width` format, e.g. `dynamic-512x1024-2048x2048`, which means that the min input shape is `512x1024` and the max input shape is `2048x2048`.
## 6. How to write model config
According to the model's codebase, write the model config file. The model's config file is used to initialize the model, referring to [MMPretrain](https://github.com/open-mmlab/mmpretrain/blob/main/docs/en/user_guides/config.md), [MMDetection](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/en/user_guides/config.md), [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/en/user_guides/1_config.md), [MMOCR](https://github.com/open-mmlab/mmocr/blob/main/docs/en/user_guides/config.md), [MMagic](https://github.com/open-mmlab/mmagic/blob/main/docs/en/user_guides/config.md).
- We count the average inference performance of 100 images of the dataset.
- Warm up. For ncnn, we warm up 30 iters for all codebases. As for other backends: for classification, we warm up 1010 iters; for other codebases, we warm up 10 iters.
- Input resolution varies for different datasets of different codebases. All inputs are real images except for `mmagic` because the dataset is not large enough.
Users can directly test the speed through [model profiling](../02-how-to-run/profile_model.md). And here is the benchmark in our environment.
Users can directly test the performance through [how_to_evaluate_a_model.md](../02-how-to-run/profile_model.md). And here is the benchmark in our environment.
- As some datasets contain images with various resolutions in codebases like MMDet, the speed benchmark is obtained through static configs in MMDeploy, while the performance benchmark is obtained through dynamic ones.
- Some int8 performance benchmarks of TensorRT require NVIDIA cards with tensor cores; otherwise, the performance drops heavily.
- DBNet uses the interpolation mode `nearest` in the neck of the model, for which TensorRT-7 applies a quite different strategy from PyTorch. To make the repository compatible with TensorRT-7, we rewrite the neck to use the interpolation mode `bilinear`, which improves the final detection performance. To get performance matched with PyTorch, TensorRT-8+ is recommended, in which the interpolation methods are the same as in PyTorch.
- The Mask AP of Mask R-CNN drops by 1% for the backend. The main reason is that the predicted masks are directly interpolated to the original image in PyTorch, while in other backends they are first interpolated to the preprocessed input image of the model and then to the original image.
- MMPose models are tested with `flip_test` explicitly set to `False` in model configs.
- Some models might get low accuracy in fp16 mode. Please adjust the model to avoid value overflow.
Here are the test conclusions of our edge devices. You can directly obtain the results of your own environment with [model profiling](../02-how-to-run/profile_model.md).
1. The ImageNet-1k dataset is too large to test in full, so only part of the dataset is used (8000/50000)
2. Device heating will downclock the frequency, so the time consumption will fluctuate. Reported here are the stable values after running for a period of time, which are closer to actual use.
- `fcn` works fine at 512x1024 size. The Cityscapes dataset uses 1024x2048 resolution, which causes the device to reboot.
## Notes
- We need to manually split the mmdet model into two parts, because
  - In the snpe source code, `onnx_to_ir.py` can only parse onnx input, while `ir_to_dlc.py` does not support the `topk` operator
  - UDO (User Defined Operator) does not work with `snpe-onnx-to-dlc`
- mmagic models
  - `srcnn` requires cubic resize, which snpe does not support
  - `esrgan` converts fine, but loading the model causes the device to reboot
- mmrotate depends on [e2cnn](https://pypi.org/project/e2cnn/); you need to manually install [its Python 3.6 compatible branch](https://github.com/QUVA-Lab/e2cnn)
| [RetinaNet](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/retinanet) | MMDetection | Y | Y | Y | Y | Y | Y | Y | Y |
| [Faster R-CNN](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/faster_rcnn) | MMDetection | Y | Y | Y | Y | Y | Y | Y | N |
| [YOLOv3](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/yolo) | MMDetection | Y | Y | Y | Y | N | Y | Y | Y |
| [YOLOX](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/yolox) | MMDetection | Y | Y | Y | Y | N | Y | N | Y |
| [FCOS](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/fcos) | MMDetection | Y | Y | Y | Y | N | Y | N | N |
| [FSAF](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/fsaf) | MMDetection | Y | Y | Y | Y | Y | Y | N | Y |
| [Mask R-CNN](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/mask_rcnn) | MMDetection | Y | Y | Y | N | N | Y | N | N |
| [SSD](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/ssd)[\*](#note) | MMDetection | Y | Y | Y | Y | N | Y | N | Y |
| [FoveaBox](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/foveabox) | MMDetection | Y | Y | N | N | N | Y | N | N |
| [ATSS](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/atss) | MMDetection | N | Y | Y | N | N | Y | N | N |
| [GFL](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/gfl) | MMDetection | N | Y | Y | N | ? | Y | N | N |
| [Cascade R-CNN](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/cascade_rcnn) | MMDetection | N | Y | Y | N | Y | Y | N | N |
| [Cascade Mask R-CNN](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/cascade_rcnn) | MMDetection | N | Y | Y | N | N | Y | N | N |
| [Swin Transformer](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/swin)[\*](#note) | MMDetection | N | Y | Y | N | N | Y | N | N |
| [VFNet](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/vfnet) | MMDetection | N | N | N | N | N | Y | N | N |
| [RepPoints](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/reppoints) | MMDetection | N | N | Y | N | ? | Y | N | N |
| [DETR](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/detr) | MMDetection | N | Y | Y | N | ? | N | N | N |
| [CenterNet](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/centernet) | MMDetection | N | Y | Y | N | ? | Y | N | N |
| [SOLO](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/solo) | MMDetection | N | Y | N | N | N | Y | N | N |
| [SOLOv2](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/solov2) | MMDetection | N | Y | N | N | N | Y | N | N |
| [ResNet](https://github.com/open-mmlab/mmpretrain/tree/main/configs/resnet) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [ResNeXt](https://github.com/open-mmlab/mmpretrain/tree/main/configs/resnext) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [SE-ResNet](https://github.com/open-mmlab/mmpretrain/tree/main/configs/seresnet) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [MobileNetV2](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobilenet_v2) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [MobileNetV3](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobilenet_v3) | MMPretrain | Y | Y | Y | Y | N | Y | N | N |
| [ShuffleNetV1](https://github.com/open-mmlab/mmpretrain/tree/main/configs/shufflenet_v1) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [ShuffleNetV2](https://github.com/open-mmlab/mmpretrain/tree/main/configs/shufflenet_v2) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [VisionTransformer](https://github.com/open-mmlab/mmpretrain/tree/main/configs/vision_transformer) | MMPretrain | Y | Y | Y | Y | ? | Y | Y | N |
| [SwinTransformer](https://github.com/open-mmlab/mmpretrain/tree/main/configs/swin_transformer) | MMPretrain | Y | Y | Y | N | ? | N | ? | N |
| [MobileOne](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobileone) | MMPretrain | N | Y | Y | N | N | N | N | N |
| [FCN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fcn) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | Y |
| [PSPNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/pspnet)[\*static](#note) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | Y |
| [DeepLabV3](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | N |
| [DeepLabV3+](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3plus) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | N |
| [Fast-SCNN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fastscnn)[\*static](#note) | MMSegmentation | Y | Y | Y | N | Y | Y | N | Y |
| [UNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/unet) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | Y |
| [ANN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ann)[\*](#note) | MMSegmentation | Y | Y | Y | N | N | N | N | N |
| [APCNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/apcnet) | MMSegmentation | Y | Y | Y | Y | N | N | N | Y |
| [BiSeNetV1](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/bisenetv1) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [BiSeNetV2](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/bisenetv2) | MMSegmentation | Y | Y | Y | Y | N | Y | N | N |
| [CGNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/cgnet) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [DMNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/dmnet) | MMSegmentation | ? | Y | N | N | N | N | N | N |
| [DNLNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/dnlnet) | MMSegmentation | ? | Y | Y | Y | N | Y | N | N |
| [EMANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/emanet) | MMSegmentation | Y | Y | Y | N | N | Y | N | N |
| [EncNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/encnet) | MMSegmentation | Y | Y | Y | N | N | Y | N | N |
| [ERFNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/erfnet) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [FastFCN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fastfcn) | MMSegmentation | Y | Y | Y | Y | N | Y | N | N |
| [GCNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/gcnet) | MMSegmentation | Y | Y | Y | N | N | N | N | N |
| [ICNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/icnet)[\*](#note) | MMSegmentation | Y | Y | Y | N | N | Y | N | N |
| [ISANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/isanet)[\*static](#note) | MMSegmentation | N | Y | Y | N | N | Y | N | Y |
| [NonLocal Net](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/nonlocal_net) | MMSegmentation | ? | Y | Y | Y | N | Y | N | N |
| [OCRNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ocrnet) | MMSegmentation | ? | Y | Y | Y | N | Y | N | Y |
| [PointRend](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/point_rend) | MMSegmentation | Y | Y | Y | N | N | Y | N | N |
| [Semantic FPN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/sem_fpn) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [STDC](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/stdc) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [UPerNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/upernet)[\*](#note) | MMSegmentation | ? | Y | Y | N | N | N | N | Y |
| [DANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/danet) | MMSegmentation | ? | Y | Y | N | N | N | N | N |
| [Segmenter](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/segmenter)[\*static](#note) | MMSegmentation | Y | Y | Y | Y | N | Y | N | N |
| [SRCNN](https://github.com/open-mmlab/mmagic/tree/main/configs/srcnn) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [ESRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/esrgan) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [SRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/srgan_resnet) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [SRResNet](https://github.com/open-mmlab/mmagic/tree/main/configs/srgan_resnet) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [Real-ESRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/real_esrgan) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [EDSR](https://github.com/open-mmlab/mmagic/tree/main/configs/edsr) | MMagic | Y | Y | Y | Y | N | Y | N | N |
| [RDN](https://github.com/open-mmlab/mmagic/tree/main/configs/rdn) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [DBNet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet) | MMOCR | Y | Y | Y | Y | Y | Y | Y | N |
| [DBNetpp](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnetpp) | MMOCR | Y | Y | Y | ? | ? | Y | ? | N |
| [PANet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet) | MMOCR | Y | Y | Y | Y | ? | Y | Y | N |
| [PSENet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/psenet) | MMOCR | Y | Y | Y | Y | ? | Y | Y | N |
| [TextSnake](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/textsnake) | MMOCR | Y | Y | Y | Y | ? | ? | ? | N |
| [MaskRCNN](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/maskrcnn) | MMOCR | Y | Y | Y | ? | ? | ? | ? | N |
| [CRNN](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/crnn) | MMOCR | Y | Y | Y | Y | Y | N | N | N |
| [SAR](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/sar) | MMOCR | N | Y | N | N | N | N | N | N |
| [SATRN](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/satrn) | MMOCR | Y | Y | Y | N | N | N | N | N |
| [ABINet](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/abinet) | MMOCR | Y | Y | Y | N | N | N | N | N |
| [HRNet](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#hrnet-cvpr-2019) | MMPose | N | Y | Y | Y | N | Y | N | N |
| [MSPN](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#mspn-arxiv-2019) | MMPose | N | Y | Y | Y | N | Y | N | N |
| [LiteHRNet](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#litehrnet-cvpr-2021) | MMPose | N | Y | Y | N | N | Y | N | N |
| [Hourglass](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#hourglass-eccv-2016) | MMPose | N | Y | Y | Y | N | Y | N | N |
| [SimCC](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/algorithms.html#simcc-eccv-2022) | MMPose | N | Y | Y | Y | N | N | N | N |
| [PointPillars](https://github.com/open-mmlab/mmdetection3d/tree/main/configs/pointpillars) | MMDetection3d | ? | Y | Y | N | N | Y | N | N |
| [CenterPoint (pillar)](https://github.com/open-mmlab/mmdetection3d/tree/main/configs/centerpoint) | MMDetection3d | ? | Y | Y | N | N | Y | N | N |
| [RotatedRetinaNet](https://github.com/open-mmlab/mmrotate/blob/main/configs/rotated_retinanet/README.md) | RotatedDetection | N | Y | Y | N | N | N | N | N |
| [Oriented RCNN](https://github.com/open-mmlab/mmrotate/blob/main/configs/oriented_rcnn/README.md) | RotatedDetection | N | Y | Y | N | N | N | N | N |
| [Gliding Vertex](https://github.com/open-mmlab/mmrotate/blob/main/configs/gliding_vertex/README.md) | RotatedDetection | N | N | Y | N | N | N | N | N |
### Note
- Tag:
- static: This model only supports static export. Please use a `static` deploy config, e.g. $MMDEPLOY_DIR/configs/mmseg/segmentation_tensorrt_static-1024x2048.py.
- SSD: When you convert an SSD model, you need to use a deploy config with a smaller min shape, such as 300x300-512x512 rather than 320x320-1344x1344, for example $MMDEPLOY_DIR/configs/mmdet/detection/detection_tensorrt_dynamic-300x300-512x512.py.
- YOLOX: YOLOX with ncnn only supports static shape.
- Swin Transformer: For TensorRT, only version 8.4+ is supported.
- SAR: Chinese text recognition model is not supported as the protobuf size of ONNX is limited.
[MMAction2](https://github.com/open-mmlab/mmaction2) is an open-source toolbox for video understanding based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com) project.
## Installation
### Install mmaction2
Please follow the [installation guide](https://github.com/open-mmlab/mmaction2/tree/main#installation) to install mmaction2.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or later version**, we encourage you to run
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as inference engine - `ONNX Runtime`.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
```
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmaction2 models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/02-how-to-run/convert_model.md#usage).
When using `tools/deploy.py`, it is crucial to specify the correct deployment config. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmaction) of all supported backends for mmaction2, under which the config file path follows the pattern:
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
- **{2d/3d}:** model type
In the next part, we will take the `tsn` model from the `video recognition` task as an example, showing how to convert it to an onnx model that can be inferred by ONNX Runtime.
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory like `mmdeploy_models/mmaction/tsn/ort` in the previous example. It includes:
```
mmdeploy_models/mmaction/tsn/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmaction/tsn/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
## Model Inference
### Backend model inference
Taking the previously converted `end2end.onnx` model of `tsn` as an example, you can use the following code to run inference and visualize the results.
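Since the original code listing is not included here, the snippet below is a hedged sketch based on mmdeploy's task-processor API; the config paths and the test video are placeholders to adapt to your setup:

```python
import torch
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config

# placeholders: adjust to your own configs, converted model and test video
deploy_cfg = 'configs/mmaction/video-recognition/video-recognition_2d_onnxruntime_static.py'
model_cfg = 'mmaction2/configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py'
device = 'cpu'
backend_model = ['mmdeploy_models/mmaction/tsn/ort/end2end.onnx']
video = 'mmaction2/demo/demo.mp4'

# read the deploy config and model config
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)

# build the task processor and wrap the backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)

# preprocess the input video and run inference
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(video, input_shape)
with torch.no_grad():
    result = model.test_step(model_inputs)
print(result)
```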
Besides python API, mmdeploy SDK also provides other FFI (Foreign Function Interface), such as C, C++, C#, Java and so on. You can learn their usage from [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).
> MMAction2 only supports the C, C++ and Python APIs for now.
[MMagic](https://github.com/open-mmlab/mmagic/tree/main) aka `mmagic` is an open-source image and video editing toolbox based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
## Installation
### Install mmagic
Please follow the [installation guide](https://github.com/open-mmlab/mmagic/tree/main#installation) to install mmagic.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or later version**, we encourage you to run
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as inference engine - `ONNX Runtime`.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
```
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmagic models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/02-how-to-run/convert_model.md#usage).
When using `tools/deploy.py`, it is crucial to specify the correct deployment config. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmagic) of all supported backends for mmagic, under which the config file path follows the pattern:
MMDeploy supports models of one task in mmagic, i.e., `super resolution`. Please refer to chapter [supported models](#supported-models) for task-model organization.
**DO REMEMBER TO USE** the corresponding deployment config file when trying to convert models of different tasks.
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
### Convert super resolution model
The command below shows an example of converting the `ESRGAN` model to an onnx model that can be inferred by ONNX Runtime.
You can also convert the above model to other backend models by changing the deployment config file `*_onnxruntime_dynamic.py` to [others](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmagic), e.g., converting to a tensorrt model by `super-resolution/super-resolution_tensorrt_dynamic-32x32-512x512.py`.
```{tip}
When converting mmagic models to tensorrt models, --device should be set to "cuda"
```
## Model specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory like `mmdeploy_models/mmagic/ort` in the previous example. It includes:
```
mmdeploy_models/mmagic/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmagic/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
## Model inference
### Backend model inference
Taking the previously converted `end2end.onnx` model as an example, you can use the following code to run inference and visualize the results.
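As the original listing is not reproduced here, the following is a hedged sketch using mmdeploy's task-processor API; the config paths, backend file and test image are placeholders:

```python
import torch
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config

# placeholders: adjust to your own configs, converted model and test image
deploy_cfg = 'configs/mmagic/super-resolution/super-resolution_onnxruntime_dynamic.py'
model_cfg = 'mmagic/configs/esrgan/esrgan_psnr-x4c64b23g32_1xb16-1000k_div2k.py'
device = 'cpu'
backend_model = ['mmdeploy_models/mmagic/ort/end2end.onnx']
image = 'mmagic/tests/data/image/face/000001.png'

# read the deploy config and model config
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)

# build the task processor and wrap the backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)

# preprocess the input image and run inference
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
with torch.no_grad():
    result = model.test_step(model_inputs)
print(result)
```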
Besides python API, mmdeploy SDK also provides other FFI (Foreign Function Interface), such as C, C++, C#, Java and so on. You can learn their usage from [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).