# Build for Linux-x86_64
- [Build for Linux-x86_64](#build-for-linux-x86_64)
- [Install Toolchains](#install-toolchains)
- [Install Dependencies](#install-dependencies)
- [Install Dependencies for Model Converter](#install-dependencies-for-model-converter)
- [Install Dependencies for SDK](#install-dependencies-for-sdk)
- [Install Inference Engines for MMDeploy](#install-inference-engines-for-mmdeploy)
- [Build MMDeploy](#build-mmdeploy)
- [Build Model Converter](#build-model-converter)
- [Install Model Converter](#install-model-converter)
- [Build SDK and Demo](#build-sdk-and-demo)
______________________________________________________________________
## Install Toolchains
- cmake
**Make sure cmake version >= 3.14.0**. The script below shows how to install cmake 3.20.0. You can find more versions [here](https://cmake.org/install).
```bash
wget https://github.com/Kitware/CMake/releases/download/v3.20.0/cmake-3.20.0-linux-x86_64.tar.gz
tar -xzvf cmake-3.20.0-linux-x86_64.tar.gz
sudo ln -sf $(pwd)/cmake-3.20.0-linux-x86_64/bin/* /usr/bin/
```
- GCC 7+
MMDeploy requires compilers that support C++17.
```bash
# Add repository if ubuntu < 18.04
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-7
sudo apt-get install g++-7
```
## Install Dependencies
### Install Dependencies for Model Converter
<table class="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>conda </td>
<td>Please install conda according to the official <a href="https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html">guide</a>. <br>
Create a conda virtual environment and activate it. <br>
<pre><code>
conda create -n mmdeploy python=3.7 -y
conda activate mmdeploy
</code></pre>
</td>
</tr>
<tr>
<td>PyTorch <br>(>=1.8.0) </td>
<td>
Install PyTorch>=1.8.0 by following the <a href="https://pytorch.org/">official instructions</a>. Make sure the CUDA version required by PyTorch matches the CUDA version on your host.
<pre><code>
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=11.1 -c pytorch -c conda-forge
</code></pre>
</td>
</tr>
<tr>
<td>mmcv </td>
<td>Install mmcv as follows. Refer to the <a href="https://github.com/open-mmlab/mmcv/tree/2.x#installation">guide</a> for details.
<pre><code>
export cu_version=cu111 # cuda 11.1
export torch_version=torch1.8
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0rc2"
</code></pre>
</td>
</tr>
</tbody>
</table>
### Install Dependencies for SDK
You can skip this chapter if you are only interested in the model converter.
<table class="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>OpenCV<br>(>=3.0) </td>
<td>
On Ubuntu >=18.04,
<pre><code>
sudo apt-get install libopencv-dev
</code></pre>
On Ubuntu 16.04, OpenCV has to be built from the source code. Please refer to the <a href="https://docs.opencv.org/3.4/d7/d9f/tutorial_linux_install.html">guide</a>.
</td>
</tr>
<tr>
<td>pplcv </td>
<td>A high-performance image processing library of openPPL.<br>
<b>It is optional and only needed when the <code>cuda</code> platform is targeted.</b><br>
<pre><code>
git clone https://github.com/openppl-public/ppl.cv.git
cd ppl.cv
export PPLCV_DIR=$(pwd)
git checkout tags/v0.7.0 -b v0.7.0
./build.sh cuda
</code></pre>
</td>
</tr>
</tbody>
</table>
### Install Inference Engines for MMDeploy
MMDeploy's model converter and SDK share the same inference engines.
Select the inference engines you are interested in and install them by following the commands below.
<table class="docutils">
<thead>
<tr>
<th>NAME</th>
<th>PACKAGE</th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>ONNXRuntime</td>
<td>onnxruntime<br>(>=1.8.1) </td>
<td>
1. Install python package
<pre><code>pip install onnxruntime==1.8.1</code></pre>
2. Download the linux prebuilt binary package from <a href="https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1">here</a>. Extract it and export environment variables as below:
<pre><code>
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
cd onnxruntime-linux-x64-1.8.1
export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
</code></pre>
</td>
</tr>
<tr>
<td rowspan="2">TensorRT<br> </td>
<td>TensorRT <br> </td>
<td>
1. Log in to <a href="https://www.nvidia.com/">NVIDIA</a> and download the TensorRT tar file that matches the CPU architecture and CUDA version you are using from <a href="https://developer.nvidia.com/nvidia-tensorrt-download">here</a>. Follow the <a href="https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-tar">guide</a> to install TensorRT. <br>
2. Here is an example of installing TensorRT 8.2 GA Update 2 for Linux x86_64 and CUDA 11.x. First, click <a href="https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/8.2.3.0/tars/tensorrt-8.2.3.0.linux.x86_64-gnu.cuda-11.4.cudnn8.2.tar.gz">here</a> to download CUDA 11.x TensorRT 8.2.3.0, then install it and the other dependencies as below:
<pre><code>
cd /the/path/of/tensorrt/tar/gz/file
tar -zxvf TensorRT-8.2.3.0.Linux.x86_64-gnu.cuda-11.4.cudnn8.2.tar.gz
pip install TensorRT-8.2.3.0/python/tensorrt-8.2.3.0-cp37-none-linux_x86_64.whl
export TENSORRT_DIR=$(pwd)/TensorRT-8.2.3.0
export LD_LIBRARY_PATH=$TENSORRT_DIR/lib:$LD_LIBRARY_PATH
pip install pycuda
</code></pre>
</td>
</tr>
<tr>
<td>cuDNN </td>
<td>
1. Download cuDNN that matches the CPU architecture, CUDA version and TensorRT version you are using from the <a href="https://developer.nvidia.com/rdp/cudnn-archive">cuDNN Archive</a>. <br>
The TensorRT installation example above requires cuDNN 8.2, so download <a href="https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.2.1.32/11.3_06072021/cudnn-11.3-linux-x64-v8.2.1.32.tgz">CUDA 11.x cuDNN 8.2</a>.<br>
2. Extract the compressed file and set the environment variables
<pre><code>
cd /the/path/of/cudnn/tgz/file
tar -zxvf cudnn-11.3-linux-x64-v8.2.1.32.tgz
export CUDNN_DIR=$(pwd)/cuda
export LD_LIBRARY_PATH=$CUDNN_DIR/lib64:$LD_LIBRARY_PATH
</code></pre>
</td>
</tr>
<tr>
<td>PPL.NN</td>
<td>ppl.nn </td>
<td>
1. Please follow the <a href="https://github.com/openppl-public/ppl.nn/blob/master/docs/en/building-from-source.md">guide</a> to build <code>ppl.nn</code> and install <code>pyppl</code>.<br>
2. Export pplnn's root path to an environment variable
<pre><code>
cd ppl.nn
export PPLNN_DIR=$(pwd)
</code></pre>
</td>
</tr>
<tr>
<td>OpenVINO</td>
<td>openvino </td>
<td>1. Install <a href="https://docs.openvino.ai/2021.4/get_started.html">OpenVINO</a> package
<pre><code>
pip install openvino-dev
</code></pre>
2. <b>Optional</b>. If you want to use OpenVINO in MMDeploy SDK, please install and configure it by following the <a href="https://docs.openvino.ai/2021.4/openvino_docs_install_guides_installing_openvino_linux.html#install-openvino">guide</a>.
</td>
</tr>
<tr>
<td>ncnn </td>
<td>ncnn </td>
<td>1. Download and build ncnn according to its <a href="https://github.com/Tencent/ncnn/wiki/how-to-build">wiki</a>.
Make sure to enable <code>-DNCNN_PYTHON=ON</code> in your build command. <br>
2. Export ncnn's root path to an environment variable
<pre><code>
cd ncnn
export NCNN_DIR=$(pwd)
export LD_LIBRARY_PATH=${NCNN_DIR}/build/install/lib/:$LD_LIBRARY_PATH
</code></pre>
3. Install pyncnn
<pre><code>
cd ${NCNN_DIR}/python
pip install -e .
</code></pre>
</td>
</tr>
<tr>
<td>TorchScript</td>
<td>libtorch</td>
<td>
1. Download libtorch from <a href="https://pytorch.org/get-started/locally/">here</a>. Please note that only the <b>Pre-cxx11 ABI</b> and <b>version 1.8.1+</b> on the Linux platform are currently supported. For previous versions of libtorch, see this <a href="https://github.com/pytorch/pytorch/issues/40961#issuecomment-1017317786">issue comment</a>. <br>
2. Take Libtorch1.8.1+cu111 as an example. You can install it like this:
<pre><code>
wget https://download.pytorch.org/libtorch/cu111/libtorch-shared-with-deps-1.8.1%2Bcu111.zip
unzip libtorch-shared-with-deps-1.8.1+cu111.zip
cd libtorch
export Torch_DIR=$(pwd)
export LD_LIBRARY_PATH=$Torch_DIR/lib:$LD_LIBRARY_PATH
</code></pre>
</td>
</tr>
<tr>
<td>Ascend</td>
<td>CANN</td>
<td>
1. Install CANN following the <a href="https://www.hiascend.com/document/detail/en/CANNCommunityEdition/51RC1alphaX/softwareinstall/instg/atlasdeploy_03_0002.html">official guide</a>.<br>
2. Set up the environment
<pre><code>
export ASCEND_TOOLKIT_HOME="/usr/local/Ascend/ascend-toolkit/latest"
</code></pre>
</td>
</tr>
<tr>
<td>TVM</td>
<td>TVM</td>
<td>
1. Install TVM following the <a href="https://tvm.apache.org/docs/install/from_source.html">official guide</a>.<br>
2. Set up the environment
<pre><code>
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${TVM_HOME}/build
export PYTHONPATH=${TVM_HOME}/python:${PYTHONPATH}
</code></pre>
</td>
</tr>
</tbody>
</table>
Note: <br>
If you want to make the above environment variables permanent, you could add them to <code>~/.bashrc</code>. Take the ONNXRuntime for example,
```bash
echo '# set env for onnxruntime' >> ~/.bashrc
echo "export ONNXRUNTIME_DIR=${ONNXRUNTIME_DIR}" >> ~/.bashrc
echo "export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH" >> ~/.bashrc
source ~/.bashrc
```
## Build MMDeploy
```bash
cd /the/root/path/of/MMDeploy
export MMDEPLOY_DIR=$(pwd)
```
### Build Model Converter
If any of the ONNXRuntime, TensorRT, ncnn or libtorch inference engines is selected, you have to build the corresponding custom ops.
- **ONNXRuntime** Custom Ops
```bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake -DCMAKE_CXX_COMPILER=g++-7 -DMMDEPLOY_TARGET_BACKENDS=ort -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} ..
make -j$(nproc) && make install
```
- **TensorRT** Custom Ops
```bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake -DCMAKE_CXX_COMPILER=g++-7 -DMMDEPLOY_TARGET_BACKENDS=trt -DTENSORRT_DIR=${TENSORRT_DIR} -DCUDNN_DIR=${CUDNN_DIR} ..
make -j$(nproc) && make install
```
- **ncnn** Custom Ops
```bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake -DCMAKE_CXX_COMPILER=g++-7 -DMMDEPLOY_TARGET_BACKENDS=ncnn -Dncnn_DIR=${NCNN_DIR}/build/install/lib/cmake/ncnn ..
make -j$(nproc) && make install
```
- **TorchScript** Custom Ops
```bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake -DCMAKE_CXX_COMPILER=g++-7 -DMMDEPLOY_TARGET_BACKENDS=torchscript -DTorch_DIR=${Torch_DIR} ..
make -j$(nproc) && make install
```
Please check the [cmake build options](cmake_option.md).
### Install Model Converter
```bash
cd ${MMDEPLOY_DIR}
mim install -e .
```
**Note**
- Some dependencies are optional. Simply running `pip install -e .` will only install the minimum runtime requirements.
To use optional dependencies, install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -e .[optional]`).
Valid keys for the extras field are: `all`, `tests`, `build`, `optional`.
- If you are using CUDA 10, it is recommended to [install the patch for CUDA 10.2](https://developer.nvidia.com/cuda-10.2-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=runfilelocal), otherwise GEMM-related errors may occur when the model runs
### Build SDK and Demo
MMDeploy provides the two recipes below for building the SDK with ONNXRuntime and TensorRT as inference engines respectively.
Other engines can be enabled in a similar way by adjusting the cmake options.
- cpu + ONNXRuntime
```Bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake .. \
-DCMAKE_CXX_COMPILER=g++-7 \
-DMMDEPLOY_BUILD_SDK=ON \
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
-DMMDEPLOY_TARGET_DEVICES=cpu \
-DMMDEPLOY_TARGET_BACKENDS=ort \
-DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR}
make -j$(nproc) && make install
```
- cuda + TensorRT
```Bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake .. \
-DCMAKE_CXX_COMPILER=g++-7 \
-DMMDEPLOY_BUILD_SDK=ON \
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
-DMMDEPLOY_TARGET_DEVICES="cuda;cpu" \
-DMMDEPLOY_TARGET_BACKENDS=trt \
-Dpplcv_DIR=${PPLCV_DIR}/cuda-build/install/lib/cmake/ppl \
-DTENSORRT_DIR=${TENSORRT_DIR} \
-DCUDNN_DIR=${CUDNN_DIR}
make -j$(nproc) && make install
```
- pplnn
```Bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake .. \
-DCMAKE_CXX_COMPILER=g++-7 \
-DMMDEPLOY_BUILD_SDK=ON \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
-DMMDEPLOY_TARGET_DEVICES="cuda;cpu" \
-DMMDEPLOY_TARGET_BACKENDS=pplnn \
-Dpplcv_DIR=${PPLCV_DIR}/cuda-build/install/lib/cmake/ppl \
-Dpplnn_DIR=${PPLNN_DIR}/pplnn-build/install/lib/cmake/ppl
make -j$(nproc) && make install
```
- cuda + TensorRT + onnxruntime + openvino + ncnn
If the [ncnn auto-install script](../../../tools/scripts/build_ubuntu_x64_ncnn.py) is used, protobuf will be installed in mmdeploy-dep/pbinstall in the same directory as mmdeploy.
```Bash
export PROTO_DIR=/path/to/mmdeploy-dep/pbinstall
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake .. \
-DCMAKE_CXX_COMPILER=g++-7 \
-DMMDEPLOY_BUILD_SDK=ON \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
-DMMDEPLOY_TARGET_DEVICES="cuda;cpu" \
-DMMDEPLOY_TARGET_BACKENDS="trt;ort;ncnn;openvino" \
-Dpplcv_DIR=${PPLCV_DIR}/cuda-build/install/lib/cmake/ppl \
-DTENSORRT_DIR=${TENSORRT_DIR} \
-DCUDNN_DIR=${CUDNN_DIR} \
-DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} \
-DInferenceEngine_DIR=${OPENVINO_DIR}/runtime/cmake \
-Dncnn_DIR=${NCNN_DIR}/build/install/lib/cmake/ncnn \
-DProtobuf_LIBRARIES=${PROTO_DIR}/lib/libprotobuf.so \
-DProtobuf_PROTOC_EXECUTABLE=${PROTO_DIR}/bin/protoc \
-DProtobuf_INCLUDE_DIR=${PROTO_DIR}/include
make -j$(nproc) && make install
```
# Build for macOS-arm64
- [Build for macOS-arm64](#build-for-macos-arm64)
- [Install Toolchains](#install-toolchains)
- [Install Dependencies](#install-dependencies)
- [Install Dependencies for Model Converter](#install-dependencies-for-model-converter)
- [Install Dependencies for SDK](#install-dependencies-for-sdk)
- [Install Inference Engines for MMDeploy](#install-inference-engines-for-mmdeploy)
- [Build MMDeploy](#build-mmdeploy)
- [Build Model Converter](#build-model-converter)
- [Install Model Converter](#install-model-converter)
- [Build SDK and Demo](#build-sdk-and-demo)
## Install Toolchains
- cmake
```
brew install cmake
```
- clang
Install Xcode or the Command Line Tools
```
xcode-select --install
```
## Install Dependencies
### Install Dependencies for Model Converter
Please refer to [get_started](../get_started.md) to install conda.
```bash
# install pytorch & mmcv
conda install pytorch==1.9.0 torchvision==0.10.0 -c pytorch
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0rc2"
```
### Install Dependencies for SDK
You can skip this chapter if you are only interested in the model converter.
<table class="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>OpenCV<br>(>=3.0) </td>
<td>
<pre><code>
brew install opencv
</code></pre>
</td>
</tr>
</tbody>
</table>
### Install Inference Engines for MMDeploy
MMDeploy's model converter and SDK share the same inference engines.
Select the inference engines you are interested in and install them by following the commands below.
This document focuses on Core ML. Installing ONNX Runtime, ncnn and TorchScript is similar to the Linux platform; please refer to [linux-x86_64](linux-x86_64.md).
TorchScript is used as the IR when converting Core ML models. To support the custom operators in some models, such as the detection models in mmdet, libtorch needs to be installed.
<table class="docutils">
<thead>
<tr>
<th>NAME</th>
<th>PACKAGE</th>
<th>INSTALLATION</th>
</tr>
</thead>
<tbody>
<tr>
<td>Core ML</td>
<td>coremltools</td>
<td>
<pre><code>
pip install coremltools==6.3
</code></pre>
</td>
</tr>
<tr>
<td>TorchScript</td>
<td>libtorch</td>
<td>
1. Libtorch doesn't provide a prebuilt arm64 library for macOS, so you need to compile it yourself. Please note that the libtorch version must be consistent with your PyTorch version. <br>
2. Take libtorch 1.9.0 as an example. You can install it like this:
<pre><code>
git clone -b v1.9.0 --recursive https://github.com/pytorch/pytorch.git
cd pytorch
mkdir build && cd build
cmake .. \
-DCMAKE_BUILD_TYPE=Release \
-DPYTHON_EXECUTABLE=`which python` \
-DCMAKE_INSTALL_PREFIX=install \
-DDISABLE_SVE=ON # older PyTorch versions such as 1.9.0 need the DISABLE_SVE option
make -j4 && make install
export Torch_DIR=$(pwd)/install/share/cmake/Torch
</code></pre>
</td>
</tr>
</tbody>
</table>
## Build MMDeploy
```bash
cd /the/root/path/of/MMDeploy
export MMDEPLOY_DIR=$(pwd)
```
### Build Model Converter
- **Core ML**
Core ML uses TorchScript as its IR. To convert models in some codebases, such as mmdet, you need to compile the TorchScript custom operators.
- **torchscript** custom operators
```bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_BACKENDS=coreml -DTorch_DIR=${Torch_DIR} ..
make -j4 && make install
```
Please check the [cmake build options](cmake_option.md).
### Install Model Converter
```bash
# You should use `conda install` to install the grpcio listed in requirements/runtime.txt
conda install grpcio
```
```bash
cd ${MMDEPLOY_DIR}
mim install -v -e .
```
**Note**
- Some dependencies are optional. Simply running `pip install -e .` will only install the minimum runtime requirements.
To use optional dependencies, install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -e .[optional]`).
Valid keys for the extras field are: `all`, `tests`, `build`, `optional`.
### Build SDK and Demo
The following shows an example of building an SDK using Core ML as the inference engine.
- cpu + Core ML
```Bash
cd ${MMDEPLOY_DIR}
mkdir -p build && cd build
cmake .. \
-DMMDEPLOY_BUILD_SDK=ON \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
-DMMDEPLOY_TARGET_DEVICES=cpu \
-DMMDEPLOY_TARGET_BACKENDS=coreml \
-DTorch_DIR=${Torch_DIR}
make -j4 && make install
```
# Build for RISC-V
MMDeploy chooses ncnn as the inference backend on the RISC-V platform. The deployment process consists of two steps:
1. Model conversion: convert the PyTorch model to the ncnn model on the host, then upload the converted model to the device.
2. Model deployment: cross-compile ncnn and MMDeploy on the host, then upload the executables to the device for inference.
## 1. Model conversion
a) Install MMDeploy
You can refer to the [build document](./linux-x86_64.md) to install the ncnn inference engine and MMDeploy.
b) Convert model
```bash
export MODEL_CONFIG=/path/to/mmpretrain/configs/resnet/resnet18_8xb32_in1k.py
export MODEL_PATH=https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_8xb32_in1k_20210831-fbbb1da6.pth
# Convert the model
cd /path/to/mmdeploy
python tools/deploy.py \
configs/mmpretrain/classification_ncnn_static.py \
$MODEL_CONFIG \
$MODEL_PATH \
tests/data/tiger.jpeg \
--work-dir resnet18 \
--device cpu \
--dump-info
```
## 2. Model deployment
a) Download the compiler toolchain and set environment
```bash
# download Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.2.6-20220516.tar.gz
# https://occ.t-head.cn/community/download?id=4046947553902661632
tar xf Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.2.6-20220516.tar.gz
export RISCV_ROOT_PATH=`realpath Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.2.6`
```
b) Compile ncnn & opencv
```bash
# ncnn
# refer to https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-allwinner-d1
# opencv
git clone https://github.com/opencv/opencv.git
cd opencv
mkdir build_riscv && cd build_riscv
cmake .. \
-DCMAKE_TOOLCHAIN_FILE=/path/to/mmdeploy/cmake/toolchains/riscv64-unknown-linux-gnu.cmake \
-DCMAKE_INSTALL_PREFIX=install \
-DBUILD_PERF_TESTS=OFF \
-DBUILD_SHARED_LIBS=OFF \
-DBUILD_TESTS=OFF \
-DCMAKE_BUILD_TYPE=Release
make -j$(nproc) && make install
```
c) Compile mmdeploy SDK & demo
```bash
cd /path/to/mmdeploy
mkdir build_riscv && cd build_riscv
cmake .. \
-DCMAKE_TOOLCHAIN_FILE=../cmake/toolchains/riscv64-unknown-linux-gnu.cmake \
-DMMDEPLOY_BUILD_SDK=ON \
-DMMDEPLOY_SHARED_LIBS=OFF \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
-DMMDEPLOY_TARGET_DEVICES="cpu" \
-DMMDEPLOY_TARGET_BACKENDS="ncnn" \
-Dncnn_DIR=${ncnn_DIR}/build-c906/install/lib/cmake/ncnn/ \
-DMMDEPLOY_CODEBASES=all \
-DOpenCV_DIR=${OpenCV_DIR}/build_riscv/install/lib/cmake/opencv4
make -j$(nproc) && make install
```
After `make install`, the examples are located in `install/bin`:
```
tree -L 1 install/bin/
.
├── image_classification
├── image_restorer
├── image_segmentation
├── object_detection
├── ocr
├── pose_detection
└── rotated_object_detection
```
## 3. Run the demo
First make sure that `--dump-info` was used during model conversion, so that the `resnet18` directory contains the files required by the SDK, such as `pipeline.json`.
Copy the model folder (`resnet18`), the executable (`image_classification`) and the test image (`tests/data/tiger.jpeg`) to the device.
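Before copying, you can sanity-check on the host that the work directory actually contains the SDK files. A minimal sketch, assuming `pipeline.json` and `deploy.json` are the files `--dump-info` produces; the helper name is our own:

```python
import os

def missing_sdk_files(work_dir, required=("pipeline.json", "deploy.json")):
    """Return the required SDK files that are absent from work_dir."""
    return [f for f in required if not os.path.isfile(os.path.join(work_dir, f))]

# After a successful conversion with --dump-info,
# missing_sdk_files("resnet18") should return an empty list.
```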
```bash
./image_classification cpu ./resnet18 tiger.jpeg
```
# Build for RKNN
This tutorial is based on Ubuntu-18.04 and Rockchip NPU `rk3588`. For different NPU devices, you may have to use different rknn packages.
Below is a table describing the relationship:
| Device | Python Package | c/c++ SDK |
| -------------------- | ---------------------------------------------------------------- | -------------------------------------------------- |
| RK1808/RK1806 | [rknn-toolkit](https://github.com/rockchip-linux/rknn-toolkit) | [rknpu](https://github.com/rockchip-linux/rknpu) |
| RV1109/RV1126 | [rknn-toolkit](https://github.com/rockchip-linux/rknn-toolkit) | [rknpu](https://github.com/rockchip-linux/rknpu) |
| RK3566/RK3568/RK3588 | [rknn-toolkit2](https://github.com/rockchip-linux/rknn-toolkit2) | [rknpu2](https://github.com/rockchip-linux/rknpu2) |
| RV1103/RV1106 | [rknn-toolkit2](https://github.com/rockchip-linux/rknn-toolkit2) | [rknpu2](https://github.com/rockchip-linux/rknpu2) |
## Installation
It is recommended to create a virtual environment for the project.
1. Get RKNN-Toolkit2 or RKNN-Toolkit through git. Take RKNN-Toolkit2 as an example:
```
git clone git@github.com:rockchip-linux/rknn-toolkit2.git
```
2. Install the RKNN python package following the [rknn-toolkit2 doc](https://github.com/rockchip-linux/rknn-toolkit2/tree/master/doc) or the [rknn-toolkit doc](https://github.com/rockchip-linux/rknn-toolkit/tree/master/docs). When installing the rknn python package, it is better to append `--no-deps` to the commands to avoid dependency conflicts. Take the RKNN-Toolkit2 package as an example:
```
pip install packages/rknn_toolkit2-1.4.0_22dcfef4-cp36-cp36m-linux_x86_64.whl --no-deps
```
3. Install onnx==1.8.0 before reinstalling MMDeploy from source following the [instructions](../01-how-to-build/build_from_source.md). Note that there are conflicts between the pip dependencies of MMDeploy and RKNN. Here are the suggested package versions for Python 3.6:
```
protobuf==3.19.4
onnx==1.8.0
onnxruntime==1.8.0
torch==1.8.0
torchvision==0.9.0
```
4. Install torch and torchvision using conda. For example:
```
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=11.1 -c pytorch -c conda-forge
```
To work with models from [MMPretrain](https://mmpretrain.readthedocs.io/en/latest/get_started.html), you may need to install it additionally.
## Usage
Example:
```bash
python tools/deploy.py \
configs/mmpretrain/classification_rknn-fp16_static-224x224.py \
/mmpretrain_dir/configs/resnet/resnet50_8xb32_in1k.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth \
/mmpretrain_dir/demo/demo.JPEG \
--work-dir ../resnet50 \
--device cpu
```
## Deployment config
With the deployment config, you can modify the `backend_config` for your preference. An example `backend_config` for mmpretrain is shown below:
```python
backend_config = dict(
type='rknn',
common_config=dict(
mean_values=None,
std_values=None,
target_platform='rk3588',
optimization_level=3),
quantization_config=dict(do_quantization=False, dataset=None),
input_size_list=[[3, 224, 224]])
```
The contents of `common_config` are passed to `rknn.config()`. The contents of `quantization_config` are used to control `rknn.build()`. You may have to modify `target_platform` for your own device.
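For instance, to let the device normalize inputs via `rknn.config()` instead of doing it in Python, you could set `mean_values`/`std_values` in `common_config`. A hedged sketch; the ImageNet-style values below are only an assumption and may not fit your model:

```python
# Sketch: backend_config with on-device normalization; the mean/std
# values are the common ImageNet ones and are assumptions only.
backend_config = dict(
    type='rknn',
    common_config=dict(
        mean_values=[[123.675, 116.28, 103.53]],  # forwarded to rknn.config()
        std_values=[[58.395, 57.12, 57.375]],
        target_platform='rk3588',
        optimization_level=3),
    quantization_config=dict(do_quantization=False, dataset=None),
    input_size_list=[[3, 224, 224]])
```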
## Build SDK with Rockchip NPU
### Build SDK with RKNPU2
1. Get rknpu2 through git:
```
git clone git@github.com:rockchip-linux/rknpu2.git
```
2. For Linux, download the gcc cross compiler. The download link in the official `rknpu2` user guide is deprecated; you may use another verified [link](https://github.com/Caesar-github/gcc-buildroot-9.3.0-2020.03-x86_64_aarch64-rockchip-linux-gnu). After downloading and unzipping the compiler, open a terminal and set `RKNN_TOOL_CHAIN` and `RKNPU2_DEVICE_DIR`: `export RKNN_TOOL_CHAIN=/path/to/gcc/usr; export RKNPU2_DEVICE_DIR=/path/to/rknpu2/runtime/RK3588`.
3. After the above preparation, run the following commands:
```shell
cd /path/to/mmdeploy
mkdir -p build && rm -rf build/CM* && cd build
export LD_LIBRARY_PATH=$RKNN_TOOL_CHAIN/lib64:$LD_LIBRARY_PATH
cmake \
-DCMAKE_TOOLCHAIN_FILE=/path/to/mmdeploy/cmake/toolchains/rknpu2-linux-gnu.cmake \
-DMMDEPLOY_BUILD_SDK=ON \
-DCMAKE_BUILD_TYPE=Debug \
-DOpenCV_DIR=${RKNPU2_DEVICE_DIR}/../../examples/3rdparty/opencv/opencv-linux-aarch64/share/OpenCV \
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
-DMMDEPLOY_TARGET_DEVICES="cpu" \
-DMMDEPLOY_TARGET_BACKENDS="rknn" \
-DMMDEPLOY_CODEBASES=all \
-DMMDEPLOY_BUILD_TEST=ON \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
..
make && make install
```
## Run the demo with SDK
First make sure that `--dump-info` was used during model conversion, so that the working directory contains the files required by the SDK, such as `pipeline.json`.
Use `adb push` to copy the model directory, the executable and the shared libraries to the device.
```bash
cd /path/to/mmdeploy
adb push resnet50 /data/local/tmp/resnet50
adb push /mmpretrain_dir/demo/demo.JPEG /data/local/tmp/resnet50/demo.JPEG
cd build
adb push lib /data/local/tmp/lib
adb push bin/image_classification /data/local/tmp/image_classification
```
Set up environment variable and execute the sample.
```bash
adb shell
cd /data/local/tmp
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/data/local/tmp/lib
./image_classification cpu ./resnet50 ./resnet50/demo.JPEG
..
label: 65, score: 0.95
```
## Troubleshooting
- MMDet models.
YOLOV3 & YOLOX: you may paste the following partition configuration into [detection_rknn_static-320x320.py](https://github.com/open-mmlab/mmdeploy/blob/main/configs/mmdet/detection/detection_rknn-int8_static-320x320.py):
```python
# yolov3, yolox for rknn-toolkit and rknn-toolkit2
partition_config = dict(
type='rknn', # the partition policy name
apply_marks=True, # should always be set to True
partition_cfg=[
dict(
save_file='model.onnx', # name to save the partitioned onnx
start=['detector_forward:input'], # [mark_name:input, ...]
end=['yolo_head:input'], # [mark_name:output, ...]
output_names=[f'pred_maps.{i}' for i in range(3)]) # output names
])
```
RTMDet: you may paste the following partition configuration into [detection_rknn-int8_static-640x640.py](https://github.com/open-mmlab/mmdeploy/blob/main/configs/mmdet/detection/detection_rknn-int8_static-640x640.py):
```python
# rtmdet for rknn-toolkit and rknn-toolkit2
partition_config = dict(
type='rknn', # the partition policy name
apply_marks=True, # should always be set to True
partition_cfg=[
dict(
save_file='model.onnx', # name to save the partitioned onnx
start=['detector_forward:input'], # [mark_name:input, ...]
end=['rtmdet_head:output'], # [mark_name:output, ...]
output_names=[f'pred_maps.{i}' for i in range(6)]) # output names
])
```
For RetinaNet, SSD and FSAF with rknn-toolkit2, you may paste the following partition configuration into [detection_rknn_static-320x320.py](https://github.com/open-mmlab/mmdeploy/blob/main/configs/mmdet/detection/detection_rknn-int8_static-320x320.py). Users of rknn-toolkit can use the default config directly.
```python
# retinanet, ssd for rknn-toolkit2
partition_config = dict(
type='rknn', # the partition policy name
apply_marks=True,
partition_cfg=[
dict(
save_file='model.onnx',
start='detector_forward:input',
end=['BaseDenseHead:output'],
output_names=[f'BaseDenseHead.cls.{i}' for i in range(5)] +
[f'BaseDenseHead.loc.{i}' for i in range(5)])
])
```
- The SDK only supports int8 rknn models, which requires `do_quantization=True` when converting models.
- Latency problems.
  For devices running RKNPU, such as rv1126, set `pre_compile=True` in `quantization_config` when converting models; otherwise the latency may not meet your needs.
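Putting these notes together, a quantization setup for an int8 SDK deployment on an RKNPU device might look like the sketch below; `dataset.txt` is a placeholder for your calibration image list, not a file the tooling provides:

```python
# Sketch: int8 quantization settings for an SDK deployment;
# 'dataset.txt' is a placeholder calibration file listing image paths.
quantization_config = dict(
    do_quantization=True,   # the SDK only supports int8 rknn models
    dataset='dataset.txt',  # calibration data required for int8
    pre_compile=True)       # for RKNPU devices such as rv1126
```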
# Build for SNPE
MMDeploy supports the snpe backend through a simple Client/Server mode.
This mode:
1. separates the `model convert` and `inference` environments:
   - inference-irrelevant work is done on the host;
   - you get the real running results of the gpu/npu instead of cpu-simulated values.
2. covers cost-sensitive devices: armv7/risc-v/mips chips meet product requirements but often have limited Python support;
3. simplifies the mmdeploy installation steps: if you only want to convert an snpe model and test it, you don't need to compile the .whl package.
## 1. Run inference server
Download the prebuilt snpe inference server package, `adb push` it to the phone and execute.
Note that **the phone must have a qcom chip**.
```bash
$ wget https://media.githubusercontent.com/media/tpoisonooo/mmdeploy_snpe_testdata/main/snpe-inference-server-1.59.tar.gz
...
$ sudo apt install adb
$ adb push snpe-inference-server-1.59.tar.gz /data/local/tmp/
# decompress and execute
$ adb shell
venus:/ $ cd /data/local/tmp
130|venus:/data/local/tmp $ tar xvf snpe-inference-server-1.59.tar.gz
...
130|venus:/data/local/tmp $ source export1.59.sh
130|venus:/data/local/tmp $ ./inference_server
...
Server listening on [::]:60000
```
At this point the inference service should print all the ipv6 and ipv4 addresses of the device and listen on the port.
Tips:
- If `adb devices` cannot find the device, it may be because:
  - some cheap cables can only charge and cannot transmit data;
  - or "developer mode" is not enabled on the phone.
- If you need to compile the binary yourself, please refer to [NDK Cross Compiling snpe Inference Service](../appendix/cross_build_snpe_service.md)
- If a `segmentation fault` occurs when listening on a port, it may be because the port number is already occupied; use another port.
## 2. Build mmdeploy
### 1) Environment
| Matters | Version | Remarks |
| ------- | ------------------ | ---------------------- |
| host OS | ubuntu18.04 x86_64 | snpe specified version |
| Python | **3.6.0** | snpe specified version |
### 2) Installation
Download [snpe-1.59 from the official website](https://developer.qualcomm.com/qfile/69652/snpe-1.59.0.zip)
```bash
$ unzip snpe-1.59.0.zip
$ export SNPE_ROOT=${PWD}/snpe-1.59.0.3230
$ cd /path/to/mmdeploy
$ export PYTHONPATH=${PWD}/service/snpe/client:${SNPE_ROOT}/lib/python:${PYTHONPATH}
$ export LD_LIBRARY_PATH=${SNPE_ROOT}/lib/x86_64-linux-clang:${LD_LIBRARY_PATH}
$ export PATH=${SNPE_ROOT}/bin/x86_64-linux-clang:${PATH}
$ python3 -m pip install -e .
```
## 3. Test the model
Take ResNet-18 as an example. First, refer to the [documentation to install mmpretrain](https://github.com/open-mmlab/mmpretrain/tree/main), then use `tools/deploy.py` to convert the model.
```bash
$ export MODEL_CONFIG=/path/to/mmpretrain/configs/resnet/resnet18_8xb16_cifar10.py
$ export MODEL_PATH=https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth
# Convert the model
$ cd /path/to/mmdeploy
$ python3 tools/deploy.py configs/mmpretrain/classification_snpe_static.py $MODEL_CONFIG $MODEL_PATH /path/to/test.png --work-dir resnet18 --device cpu --uri 10.0.0.1\:60000 --dump-info
# Test
$ python3 tools/test.py configs/mmpretrain/classification_snpe_static.py $MODEL_CONFIG --model resnet18/end2end.dlc --metrics accuracy precision f1_score recall --uri 10.0.0.1\:60000
```
Note that `--uri` is required to specify the IP and port of the snpe inference service; both IPv4 and IPv6 addresses can be used.
## 4. Build SDK with Android NDK
If you also need to compile the mmdeploy SDK with the Android NDK, please continue reading.
### 1) Download the NDK and OpenCV packages and set up the environment
```bash
# Download android OCV
$ export OPENCV_VERSION=4.5.4
$ wget https://github.com/opencv/opencv/releases/download/${OPENCV_VERSION}/opencv-${OPENCV_VERSION}-android-sdk.zip
$ unzip opencv-${OPENCV_VERSION}-android-sdk.zip
$ export ANDROID_OCV_ROOT=`realpath opencv-${OPENCV_VERSION}-android-sdk`
# Download ndk r23b
$ wget https://dl.google.com/android/repository/android-ndk-r23b-linux.zip
$ unzip android-ndk-r23b-linux.zip
$ export ANDROID_NDK_ROOT=`realpath android-ndk-r23b`
```
### 2) Compile mmdeploy SDK and demo
```bash
$ cd /path/to/mmdeploy
$ mkdir build && cd build
$ cmake .. \
-DMMDEPLOY_BUILD_SDK=ON \
-DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_ROOT}/build/cmake/android.toolchain.cmake \
-DMMDEPLOY_TARGET_BACKENDS=snpe \
-DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-30 \
-DANDROID_STL=c++_static \
-DOpenCV_DIR=${ANDROID_OCV_ROOT}/sdk/native/jni/abi-arm64-v8a \
-DMMDEPLOY_BUILD_EXAMPLES=ON
$ make && make install
$ tree ./bin
./bin
├── image_classification
├── image_restorer
├── image_segmentation
├── mmdeploy_onnx2ncnn
├── object_detection
├── ocr
├── pose_detection
└── rotated_object_detection
```
| Options | Description |
| ----------------------------- | ------------------------------------------------------------ |
| CMAKE_TOOLCHAIN_FILE          | Loads NDK parameters, mainly used to select the compiler      |
| MMDEPLOY_TARGET_BACKENDS=snpe | Inference backend                                             |
| ANDROID_STL=c++\_static       | In case the NDK environment cannot find a suitable C++ library |
| MMDEPLOY_SHARED_LIBS=ON       | snpe does not provide a static library                        |
[Here](../01-how-to-build/cmake_option.md) is the description of all cmake build options.
### 3) Run the demo
First make sure that `--dump-info` is used during model conversion, so that the `resnet18` directory contains the files required by the SDK, such as `pipeline.json`.
Use `adb push` to send the model directory, executable file and `.so` libraries to the device.
```bash
$ cd /path/to/mmdeploy
$ adb push resnet18 /data/local/tmp
$ adb push tests/data/tiger.jpeg /data/local/tmp/resnet18/
$ cd /path/to/install/
$ adb push lib /data/local/tmp
$ adb push bin/image_classification /data/local/tmp/resnet18/
```
Set up environment variables and execute the sample.
```bash
$ adb push /path/to/mmpretrain/demo/demo.JPEG /data/local/tmp
$ adb shell
venus:/ $ cd /data/local/tmp/resnet18
venus:/data/local/tmp/resnet18 $ export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/data/local/tmp/lib
venus:/data/local/tmp/resnet18 $ ./image_classification cpu ./ tiger.jpeg
..
label: 3, score: 0.3214
```
# Build for Windows
- [Build for Windows](#build-for-windows)
- [Build From Source](#build-from-source)
- [Install Toolchains](#install-toolchains)
- [Install Dependencies](#install-dependencies)
- [Install Dependencies for Model Converter](#install-dependencies-for-model-converter)
- [Install Dependencies for SDK](#install-dependencies-for-sdk)
- [Install Inference Engines for MMDeploy](#install-inference-engines-for-mmdeploy)
- [Build MMDeploy](#build-mmdeploy)
- [Build Model Converter](#build-model-converter)
- [Install Model Converter](#install-model-converter)
- [Build SDK and Demos](#build-sdk-and-demos)
- [Note](#note)
______________________________________________________________________
## Build From Source
All the commands listed in the following chapters are verified on **Windows 10**.
### Install Toolchains
1. Download and install [Visual Studio 2019](https://visualstudio.microsoft.com)
2. Add the path of `cmake` to the environment variable `PATH`, i.e., "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\Common7\\IDE\\CommonExtensions\\Microsoft\\CMake\\CMake\\bin"
3. Install the CUDA toolkit if an NVIDIA GPU is available. You can refer to the official [guide](https://developer.nvidia.com/cuda-downloads).
### Install Dependencies
#### Install Dependencies for Model Converter
<table class="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>conda </td>
<td> Please install conda according to the official <a href="https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html">guide</a>. <br>
After installation, open <code>anaconda powershell prompt</code> under the Start Menu <b>as the administrator</b>, because: <br>
1. <b>All the commands listed in the following text are verified in anaconda powershell </b><br>
2. <b>As an administrator, you can install the third-party libraries to the system path so as to simplify the MMDeploy build command</b><br>
Note: if you are familiar with how cmake works, you can also use <code>anaconda powershell prompt</code> as an ordinary user.
</td>
</tr>
<tr>
<td>PyTorch <br>(>=1.8.0) </td>
<td>
Install PyTorch>=1.8.0 by following the <a href="https://pytorch.org/">official instructions</a>. Be sure the CUDA version PyTorch requires matches that in your host.
<pre><code>
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
</code></pre>
</td>
</tr>
<tr>
<td>mmcv </td>
<td>Install mmcv as follows. Refer to the <a href="https://github.com/open-mmlab/mmcv/tree/2.x#installation">guide</a> for details.
<pre><code>
$env:cu_version="cu111"
$env:torch_version="torch1.8.0"
pip install -U openmim
mim install "mmcv>=2.0.0rc1"
</code></pre>
</td>
</tr>
</tbody>
</table>
#### Install Dependencies for SDK
You can skip this chapter if you are only interested in the model converter.
<table class="docutils">
<thead>
<tr>
<th>NAME </th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>OpenCV<br>(>=3.0) </td>
<td>
1. Find and download OpenCV 3+ for windows from <a href="https://github.com/opencv/opencv/releases">here</a>.<br>
2. You can download the prebuilt package and install it to the target directory. Or you can build OpenCV from its source. <br>
3. Find where <code>OpenCVConfig.cmake</code> is located in the installation directory, and add its path to the environment variable <code>PATH</code> like this,
<pre><code>$env:path = "\the\path\where\OpenCVConfig.cmake\locates;" + "$env:path"</code></pre>
</td>
</tr>
<tr>
<td>pplcv </td>
<td>A high-performance image processing library of openPPL.<br>
<b>It is optional, and only needed if the <code>cuda</code> platform is required.</b><br>
<pre><code>
git clone https://github.com/openppl-public/ppl.cv.git
cd ppl.cv
git checkout tags/v0.7.0 -b v0.7.0
$env:PPLCV_DIR = "$pwd"
mkdir pplcv-build
cd pplcv-build
cmake .. -G "Visual Studio 16 2019" -T v142 -A x64 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=install -DHPCC_USE_CUDA=ON -DPPLCV_USE_MSVC_STATIC_RUNTIME=OFF
cmake --build . --config Release -- /m
cmake --install . --config Release
cd ../..
</code></pre>
</td>
</tr>
</tbody>
</table>
#### Install Inference Engines for MMDeploy
Both MMDeploy's model converter and SDK share the same inference engines.
You can select your interested inference engines and do the installation by following the given commands.
**Currently, MMDeploy has only verified ONNXRuntime and TensorRT on the Windows platform.**
The remaining engines will be supported in the future.
<table class="docutils">
<thead>
<tr>
<th>NAME</th>
<th>PACKAGE</th>
<th>INSTALLATION </th>
</tr>
</thead>
<tbody>
<tr>
<td>ONNXRuntime</td>
<td>onnxruntime<br>(>=1.8.1) </td>
<td>
1. Install python package
<pre><code>pip install onnxruntime==1.8.1</code></pre>
2. Download the windows prebuilt binary package from <a href="https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1">here</a>. Extract it and export environment variables as below:
<pre><code>
Invoke-WebRequest -Uri https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-win-x64-1.8.1.zip -OutFile onnxruntime-win-x64-1.8.1.zip
Expand-Archive onnxruntime-win-x64-1.8.1.zip .
$env:ONNXRUNTIME_DIR = "$pwd\onnxruntime-win-x64-1.8.1"
$env:path = "$env:ONNXRUNTIME_DIR\lib;" + $env:path
</code></pre>
</td>
</tr>
<tr>
<td rowspan="2">TensorRT<br> </td>
<td>TensorRT <br> </td>
<td>
1. Login <a href="https://www.nvidia.com/">NVIDIA</a> and download the TensorRT tar file that matches the CPU architecture and CUDA version you are using from <a href="https://developer.nvidia.com/nvidia-tensorrt-download">here</a>. Follow the <a href="https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-tar">guide</a> to install TensorRT. <br>
2. Here is an example of installing TensorRT 8.2 GA Update 2 for Windows x86_64 and CUDA 11.x that you can refer to. <br> First of all, click <a href="https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/8.2.3.0/zip/TensorRT-8.2.3.0.Windows10.x86_64.cuda-11.4.cudnn8.2.zip">here</a> to download CUDA 11.x TensorRT 8.2.3.0, then install it and the other dependencies as below:
<pre><code>
cd \the\path\of\tensorrt\zip\file
Expand-Archive TensorRT-8.2.3.0.Windows10.x86_64.cuda-11.4.cudnn8.2.zip .
$env:TENSORRT_DIR = "$pwd\TensorRT-8.2.3.0"
$env:path = "$env:TENSORRT_DIR\lib;" + $env:path
pip install $env:TENSORRT_DIR\python\tensorrt-8.2.3.0-cp37-none-win_amd64.whl
pip install pycuda
</code></pre>
</td>
</tr>
<tr>
<td>cuDNN </td>
<td>
1. Download cuDNN that matches the CPU architecture, CUDA version and TensorRT version you are using from <a href="https://developer.nvidia.com/rdp/cudnn-archive"> cuDNN Archive</a>. <br>
In the above TensorRT's installation example, it requires cudnn8.2. Thus, you can download <a href="https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.2.1.32/11.3_06072021/cudnn-11.3-windows-x64-v8.2.1.32.zip">CUDA 11.x cuDNN 8.2</a><br>
2. Extract the zip file and set the environment variables
<pre><code>
cd \the\path\of\cudnn\zip\file
Expand-Archive cudnn-11.3-windows-x64-v8.2.1.32.zip .
$env:CUDNN_DIR="$pwd\cuda"
$env:path = "$env:CUDNN_DIR\bin;" + $env:path
</code></pre>
</td>
</tr>
<tr>
<td>PPL.NN</td>
<td>ppl.nn </td>
<td>TODO </td>
</tr>
<tr>
<td>OpenVINO</td>
<td>openvino </td>
<td>TODO </td>
</tr>
<tr>
<td>ncnn </td>
<td>ncnn </td>
<td>1. Download <a href="https://github.com/google/protobuf/archive/v3.11.2.zip">protobuf-3.11.2</a><br>
2. Compile protobuf
<pre><code>cd &lt;protobuf-dir&gt;
mkdir build
cd build
cmake -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=%cd%/install -Dprotobuf_BUILD_TESTS=OFF -Dprotobuf_MSVC_STATIC_RUNTIME=OFF ../cmake
cmake --build . --config Release -j 2
cmake --build . --config Release --target install</code></pre>
3. Download and build ncnn
<pre><code>git clone --recursive https://github.com/Tencent/ncnn.git
cd &lt;ncnn-dir&gt;
mkdir -p ncnn_build
cd ncnn_build
cmake -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=%cd%/install -Dprotobuf_DIR=<protobuf-dir>/build/install/cmake -DNCNN_VULKAN=OFF ..
cmake --build . --config Release -j 2
cmake --build . --config Release --target install
</code></pre> </td>
</tr>
</tbody>
</table>
### Build MMDeploy
```powershell
cd \the\root\path\of\MMDeploy
$env:MMDEPLOY_DIR="$pwd"
```
#### Build Model Converter
If one of the inference engines among ONNXRuntime, TensorRT and ncnn is selected, you have to build the corresponding custom ops.
- **ONNXRuntime** Custom Ops
```powershell
mkdir build -ErrorAction SilentlyContinue
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -T v142 -DMMDEPLOY_TARGET_BACKENDS="ort" -DONNXRUNTIME_DIR="$env:ONNXRUNTIME_DIR"
cmake --build . --config Release -- /m
cmake --install . --config Release
```
- **TensorRT** Custom Ops
```powershell
mkdir build -ErrorAction SilentlyContinue
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -T v142 -DMMDEPLOY_TARGET_BACKENDS="trt" -DTENSORRT_DIR="$env:TENSORRT_DIR" -DCUDNN_DIR="$env:CUDNN_DIR"
cmake --build . --config Release -- /m
cmake --install . --config Release
```
- **ncnn** Custom Ops
```powershell
mkdir build -ErrorAction SilentlyContinue
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -T v142 `
-DMMDEPLOY_TARGET_BACKENDS="ncnn" `
-Dncnn_DIR="<ncnn-dir>/ncnn_build/install/lib/cmake/ncnn" `
-Dprotobuf_DIR="<protobuf-dir>/build/install/cmake" `
-DProtobuf_LIBRARIES="<protobuf-dir>/build/install/lib" `
-DProtobuf_INCLUDE_DIR="<protobuf-dir>/build/install/include"
cmake --build . --config Release -- /m
cmake --install . --config Release
```
Please check the [cmake build options](cmake_option.md).
#### Install Model Converter
```powershell
cd $env:MMDEPLOY_DIR
pip install -e .
```
**Note**
- Some dependencies are optional. Simply running `pip install -e .` will only install the minimum runtime requirements.
To use optional dependencies, install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -e .[optional]`).
Valid keys for the extras field are: `all`, `tests`, `build`, `optional`.
#### Build SDK and Demos
MMDeploy provides two recipes as shown below for building SDK with ONNXRuntime and TensorRT as inference engines respectively.
You can also activate other engines by adapting these recipes.
- cpu + ONNXRuntime
```PowerShell
cd $env:MMDEPLOY_DIR
mkdir build -ErrorAction SilentlyContinue
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -T v142 `
-DMMDEPLOY_BUILD_SDK=ON `
-DMMDEPLOY_BUILD_EXAMPLES=ON `
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON `
-DMMDEPLOY_TARGET_DEVICES="cpu" `
-DMMDEPLOY_TARGET_BACKENDS="ort" `
-DONNXRUNTIME_DIR="$env:ONNXRUNTIME_DIR"
cmake --build . --config Release -- /m
cmake --install . --config Release
```
- cuda + TensorRT
```PowerShell
cd $env:MMDEPLOY_DIR
mkdir build -ErrorAction SilentlyContinue
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -T v142 `
-DMMDEPLOY_BUILD_SDK=ON `
-DMMDEPLOY_BUILD_EXAMPLES=ON `
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON `
-DMMDEPLOY_TARGET_DEVICES="cuda" `
-DMMDEPLOY_TARGET_BACKENDS="trt" `
-Dpplcv_DIR="$env:PPLCV_DIR/pplcv-build/install/lib/cmake/ppl" `
-DTENSORRT_DIR="$env:TENSORRT_DIR" `
-DCUDNN_DIR="$env:CUDNN_DIR"
cmake --build . --config Release -- /m
cmake --install . --config Release
```
- cpu + ncnn
```PowerShell
cd $env:MMDEPLOY_DIR
mkdir build -ErrorAction SilentlyContinue
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -T v142 `
-DMMDEPLOY_BUILD_SDK=ON `
-DMMDEPLOY_BUILD_EXAMPLES=ON `
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON `
-DMMDEPLOY_TARGET_DEVICES="cpu" `
-DMMDEPLOY_TARGET_BACKENDS="ncnn" `
-Dncnn_DIR="<ncnn-dir>/ncnn_build/install/lib/cmake/ncnn" `
-Dprotobuf_DIR="<protobuf-dir>/build/install/cmake" `
-DProtobuf_LIBRARIES="<protobuf-dir>/build/install/lib" `
-DProtobuf_INCLUDE_DIR="<protobuf-dir>/build/install/include"
cmake --build . --config Release -- /m
cmake --install . --config Release
```
### Note
1. Release / Debug libraries cannot be mixed. If MMDeploy is built in Release mode, all its dependent third-party libraries have to be built in Release mode too, and vice versa.
# How to convert model
This tutorial briefly introduces how to export an OpenMMLab model to a specific backend using MMDeploy tools.
Notes:
- Supported backends are [ONNXRuntime](../05-supported-backends/onnxruntime.md), [TensorRT](../05-supported-backends/tensorrt.md), [ncnn](../05-supported-backends/ncnn.md), [PPLNN](../05-supported-backends/pplnn.md), [OpenVINO](../05-supported-backends/openvino.md).
- Supported codebases are [MMPretrain](../04-supported-codebases/mmpretrain.md), [MMDetection](../04-supported-codebases/mmdet.md), [MMSegmentation](../04-supported-codebases/mmseg.md), [MMOCR](../04-supported-codebases/mmocr.md), [MMagic](../04-supported-codebases/mmagic.md).
## How to convert models from Pytorch to other backends
### Prerequisite
1. Install and build your target backend. You could refer to [ONNXRuntime-install](../05-supported-backends/onnxruntime.md), [TensorRT-install](../05-supported-backends/tensorrt.md), [ncnn-install](../05-supported-backends/ncnn.md), [PPLNN-install](../05-supported-backends/pplnn.md), [OpenVINO-install](../05-supported-backends/openvino.md) for more information.
2. Install and build your target codebase. You could refer to [MMPretrain-install](https://mmpretrain.readthedocs.io/en/latest/get_started.html#installation), [MMDetection-install](https://mmdetection.readthedocs.io/en/latest/get_started.html#installation), [MMSegmentation-install](https://mmsegmentation.readthedocs.io/en/latest/get_started.html#installation), [MMOCR-install](https://mmocr.readthedocs.io/en/latest/get_started/install.html#installation-steps), [MMagic-install](https://mmagic.readthedocs.io/en/latest/get_started/install.html#installation).
### Usage
```bash
python ./tools/deploy.py \
${DEPLOY_CFG_PATH} \
${MODEL_CFG_PATH} \
${MODEL_CHECKPOINT_PATH} \
${INPUT_IMG} \
--test-img ${TEST_IMG} \
--work-dir ${WORK_DIR} \
--calib-dataset-cfg ${CALIB_DATA_CFG} \
--device ${DEVICE} \
--log-level INFO \
--show \
--dump-info
```
### Description of all arguments
- `deploy_cfg` : The deployment configuration of mmdeploy for the model, including the type of inference framework, whether to quantize, whether the input shape is dynamic, etc. There may be reference relationships between configuration files; `mmdeploy/mmpretrain/classification_ncnn_static.py` is an example.
- `model_cfg` : The model configuration of the algorithm library, e.g. `mmpretrain/configs/vision_transformer/vit-base-p32_ft-64xb64_in1k-384.py`, independent of mmdeploy's path.
- `checkpoint` : torch model path. It can start with http/https, see the implementation of `mmcv.FileClient` for details.
- `img` : The path to the image or point cloud file used for testing during the model conversion.
- `--test-img` : The path of the image file that is used to test the model. If not specified, it will be set to `None`.
- `--work-dir` : The path of the work directory that is used to save logs and models.
- `--calib-dataset-cfg` : Only valid in int8 mode. The config used for calibration. If not specified, it will be set to `None` and use the "val" dataset in the model config for calibration.
- `--device` : The device used for model conversion. If not specified, it will be set to `cpu`. For TensorRT, use the `cuda:0` format.
- `--log-level` : The log level, chosen from `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
- `--show` : Whether to show detection outputs.
- `--dump-info` : Whether to output information for SDK.
### How to find the corresponding deployment config of a PyTorch model
1. Find the model's codebase folder in `configs/`. For converting a yolov3 model, you need to check `configs/mmdet` folder.
2. Find the model's task folder in `configs/codebase_folder/`. For a yolov3 model, you need to check `configs/mmdet/detection` folder.
3. Find the deployment config file in `configs/codebase_folder/task_folder/`. For deploying a yolov3 model to the onnx backend, you could use `configs/mmdet/detection/detection_onnxruntime_dynamic.py`.
### Example
```bash
python ./tools/deploy.py \
configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
$PATH_TO_MMDET/configs/yolo/yolov3_d53_8xb8-ms-608-273e_coco.py \
$PATH_TO_MMDET/checkpoints/yolo/yolov3_d53_mstrain-608_273e_coco_20210518_115020-a2c3acb8.pth \
$PATH_TO_MMDET/demo/demo.jpg \
--work-dir work_dir \
--show \
--device cuda:0
```
## How to evaluate the exported models
You can evaluate the exported model by referring to [how_to_evaluate_a_model](profile_model.md).
## List of supported models exportable to other backends
Refer to [Support model list](../03-benchmark/supported_models.md)
# Fuse Transform (Experimental)
MMDeploy provides the ability to fuse transforms for acceleration in some cases.
When running inference with the SDK, one can edit `pipeline.json` to turn on the fuse option.
To bring transform fusion to MMDeploy, you can refer to the use of CVFusion.
## 1. Use CVFusion
There are two ways to use CVFusion: use the pre-generated kernel code, or generate the code yourself.
A) Use pre-generated kernel code
i) Download the kernel code below, unzip it and copy the `csrc` folder to the mmdeploy root folder.
[elena_kernel-20220823.tar.gz](https://github.com/open-mmlab/mmdeploy/files/9399795/elena_kernel-20220823.tar.gz)
ii) Add the option `-DMMDEPLOY_ELENA_FUSION=ON` when compiling MMDeploy.
B) Generate kernel code by yourself
i) Compile CVFusion
```bash
$ git clone --recursive https://github.com/OpenComputeLab/CVFusion.git
$ cd CVFusion
$ bash build.sh
# add OpFuse to PATH
$ export PATH=`pwd`/build/examples/MMDeploy:$PATH
```
ii) Download algorithm codebase
```bash
$ tree -L 1 .
├── mmdeploy
├── mmpretrain
├── mmdetection
├── mmsegmentation
├── ...
```
iii) Generate kernel code
```bash
python tools/elena/extract_transform.py ..
# The generated code will be saved to csrc/preprocess/elena/{cpu_kernel}/{cuda_kernel}
```
iv) Add the option `-DMMDEPLOY_ELENA_FUSION=ON` when compiling MMDeploy.
## 2. Model conversion
Add the `--dump-info` argument when converting a model; this will generate the files that the SDK needs.
```bash
$ export MODEL_CONFIG=/path/to/mmpretrain/configs/resnet/resnet18_8xb32_in1k.py
$ export MODEL_PATH=https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_8xb32_in1k_20210831-fbbb1da6.pth
$ python tools/deploy.py \
configs/mmpretrain/classification_onnxruntime_static.py \
$MODEL_CONFIG \
$MODEL_PATH \
tests/data/tiger.jpeg \
--work-dir resnet18 \
--device cpu \
--dump-info
```
## 3. Model Inference
If the model's preprocessing supports fusion, there will be a field named `fuse_transform` in `pipeline.json`. It represents the fusion switch, and the default value `false` stands for off. You need to set this field to `true` to use the fuse option.
# How to use prebuilt package on Windows10
- [How to use prebuilt package on Windows10](#how-to-use-prebuilt-package-on-windows10)
- [Prerequisite](#prerequisite)
- [ONNX Runtime](#onnx-runtime)
- [TensorRT](#tensorrt)
- [Model Convert](#model-convert)
- [ONNX Runtime Example](#onnx-runtime-example)
- [TensorRT Example](#tensorrt-example)
- [Model Inference](#model-inference)
- [Backend Inference](#backend-inference)
- [ONNXRuntime](#onnxruntime)
- [TensorRT](#tensorrt-1)
- [Python SDK](#python-sdk)
- [ONNXRuntime](#onnxruntime-1)
- [TensorRT](#tensorrt-2)
- [C SDK](#c-sdk)
- [ONNXRuntime](#onnxruntime-2)
- [TensorRT](#tensorrt-3)
- [Troubleshooting](#troubleshooting)
______________________________________________________________________
This tutorial takes `mmdeploy-1.3.1-windows-amd64.zip` and `mmdeploy-1.3.1-windows-amd64-cuda11.8.zip` as examples to show how to use the prebuilt packages. The former supports onnxruntime cpu inference, while the latter supports onnxruntime-gpu and tensorrt inference.
The directory structure of the prebuilt package is as follows, where the `dist` folder is about model converter, and the `sdk` folder is related to model inference.
```
.
├── build_sdk.ps1
├── example
├── include
├── install_opencv.ps1
├── lib
├── README.md
├── set_env.ps1
└── thirdparty
```
## Prerequisite
In order to use the prebuilt package, you need to install some third-party dependencies.
1. Follow the [get_started](../get_started.md) documentation to create a virtual python environment and install pytorch, torchvision and mmcv. To use the C interface of the SDK, you need to install [vs2019+](https://visualstudio.microsoft.com/), [OpenCV](https://github.com/opencv/opencv/releases).
:point_right: It is recommended to use `pip` instead of `conda` to install pytorch and torchvision
2. Clone the mmdeploy repository
```bash
git clone -b main https://github.com/open-mmlab/mmdeploy.git
```
:point_right: The main purpose here is to use the configs, so there is no need to compile `mmdeploy`.
3. Install mmpretrain
```bash
git clone -b main https://github.com/open-mmlab/mmpretrain.git
cd mmpretrain
pip install -e .
```
4. Prepare a PyTorch model as our example
Download the pth [resnet18_8xb32_in1k_20210831-fbbb1da6.pth](https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_8xb32_in1k_20210831-fbbb1da6.pth). The corresponding config of the model is [resnet18_8xb32_in1k.py](https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnet/resnet18_8xb32_in1k.py)
After the above work is done, the structure of the current working directory should be:
```
.
|-- mmpretrain
|-- mmdeploy
|-- resnet18_8xb32_in1k_20210831-fbbb1da6.pth
```
### ONNX Runtime
In order to use `ONNX Runtime` backend, you should also do the following steps.
5. Install `mmdeploy` (Model Converter) and `mmdeploy_runtime` (SDK Python API).
```bash
pip install mmdeploy==1.3.1
pip install mmdeploy-runtime==1.3.1
```
:point_right: If you have installed it before, please uninstall it first.
6. Install onnxruntime package
```
pip install onnxruntime==1.8.1
```
7. Download [`onnxruntime`](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1), and add environment variable.
As shown in the figure, add the lib directory of onnxruntime to the `PATH`.
![sys-path](https://user-images.githubusercontent.com/16019484/181463801-1d7814a8-b256-46e9-86f2-c08de0bc150b.png)
:exclamation: Restart powershell to make the environment variable settings take effect. You can check whether the settings are in effect by `echo $env:PATH`.
8. Download the SDK C/C++ library `mmdeploy-1.3.1-windows-amd64.zip`
### TensorRT
In order to use `TensorRT` backend, you should also do the following steps.
5. Install `mmdeploy` (Model Converter) and `mmdeploy_runtime` (SDK Python API).
```bash
pip install mmdeploy==1.3.1
pip install mmdeploy-runtime-gpu==1.3.1
```
:point_right: If you have installed it before, please uninstall it first.
6. Install TensorRT related package and set environment variables
- CUDA Toolkit 11.8
- TensorRT 8.6.1.6
- cuDNN 8.6.0
Add the runtime libraries of TensorRT and cuDNN to the `PATH`. You can refer to the path setting of onnxruntime. Don't forget to install python package of TensorRT.
:exclamation: Restart powershell to make the environment variable settings take effect. You can check whether the settings are in effect by `echo $env:PATH`.
:exclamation: It is recommended to add only one version of the TensorRT/cuDNN runtime libraries to the `PATH`. It is better not to copy the runtime libraries of TensorRT/cuDNN to the cuda directory in `C:\`.
7. Install pycuda by `pip install pycuda`
8. Download the SDK C/C++ library `mmdeploy-1.3.1-windows-amd64-cuda11.8.zip`
## Model Convert
### ONNX Runtime Example
The following describes how to use the prebuilt package to do model conversion based on the previously downloaded pth.
After preparation work, the structure of the current working directory should be:
```
..
|-- mmdeploy-1.3.1-windows-amd64
|-- mmpretrain
|-- mmdeploy
`-- resnet18_8xb32_in1k_20210831-fbbb1da6.pth
```
Model conversion can be performed like below:
```python
from mmdeploy.apis import torch2onnx
from mmdeploy.backend.sdk.export_info import export2SDK
img = 'mmpretrain/demo/demo.JPEG'
work_dir = 'work_dir/onnx/resnet'
save_file = 'end2end.onnx'
deploy_cfg = 'mmdeploy/configs/mmpretrain/classification_onnxruntime_dynamic.py'
model_cfg = 'mmpretrain/configs/resnet/resnet18_8xb32_in1k.py'
model_checkpoint = 'resnet18_8xb32_in1k_20210831-fbbb1da6.pth'
device = 'cpu'
# 1. convert model to onnx
torch2onnx(img, work_dir, save_file, deploy_cfg, model_cfg,
model_checkpoint, device)
# 2. extract pipeline info for sdk use (dump-info)
export2SDK(deploy_cfg, model_cfg, work_dir, pth=model_checkpoint, device=device)
```
The structure of the converted model directory:
```bash
.\work_dir\
`-- onnx
`-- resnet
|-- deploy.json
|-- detail.json
|-- end2end.onnx
`-- pipeline.json
```
### TensorRT Example
The following describes how to use the prebuilt package to do model conversion based on the previously downloaded pth.
After installation of mmdeploy-tensorrt prebuilt package, the structure of the current working directory should be:
```
..
|-- mmdeploy-1.3.1-windows-amd64-cuda11.8
|-- mmpretrain
|-- mmdeploy
`-- resnet18_8xb32_in1k_20210831-fbbb1da6.pth
```
Model conversion can be performed like below:
```python
from mmdeploy.apis import torch2onnx
from mmdeploy.apis.tensorrt import onnx2tensorrt
from mmdeploy.backend.sdk.export_info import export2SDK
import os
img = 'mmpretrain/demo/demo.JPEG'
work_dir = 'work_dir/trt/resnet'
save_file = 'end2end.onnx'
deploy_cfg = 'mmdeploy/configs/mmpretrain/classification_tensorrt_static-224x224.py'
model_cfg = 'mmpretrain/configs/resnet/resnet18_8xb32_in1k.py'
model_checkpoint = 'resnet18_8xb32_in1k_20210831-fbbb1da6.pth'
device = 'cpu'
# 1. convert model to IR(onnx)
torch2onnx(img, work_dir, save_file, deploy_cfg, model_cfg,
model_checkpoint, device)
# 2. convert IR to tensorrt
onnx_model = os.path.join(work_dir, save_file)
save_file = 'end2end.engine'
model_id = 0
device = 'cuda'
onnx2tensorrt(work_dir, save_file, model_id, deploy_cfg, onnx_model, device)
# 3. extract pipeline info for sdk use (dump-info)
export2SDK(deploy_cfg, model_cfg, work_dir, pth=model_checkpoint, device=device)
```
The structure of the converted model directory:
```
.\work_dir\
`-- trt
`-- resnet
|-- deploy.json
|-- detail.json
|-- end2end.engine
|-- end2end.onnx
`-- pipeline.json
```
## Model Inference
You can obtain two model folders after model conversion.
```
.\work_dir\onnx\resnet
.\work_dir\trt\resnet
```
The structure of current working directory:
```
.
|-- mmdeploy-1.3.1-windows-amd64
|-- mmdeploy-1.3.1-windows-amd64-cuda11.8
|-- mmpretrain
|-- mmdeploy
|-- resnet18_8xb32_in1k_20210831-fbbb1da6.pth
`-- work_dir
```
### Backend Inference
:exclamation: It should be emphasized that `inference_model` is not for deployment; it abstracts away the differences between backend inference APIs (`TensorRT`, `ONNX Runtime`, etc.). The main purpose of this API is to check whether the converted model can run inference normally.
#### ONNXRuntime
```python
from mmdeploy.apis import inference_model
model_cfg = 'mmpretrain/configs/resnet/resnet18_8xb32_in1k.py'
deploy_cfg = 'mmdeploy/configs/mmpretrain/classification_onnxruntime_dynamic.py'
backend_files = ['work_dir/onnx/resnet/end2end.onnx']
img = 'mmpretrain/demo/demo.JPEG'
device = 'cpu'
result = inference_model(model_cfg, deploy_cfg, backend_files, img, device)
```
#### TensorRT
```python
from mmdeploy.apis import inference_model
model_cfg = 'mmpretrain/configs/resnet/resnet18_8xb32_in1k.py'
deploy_cfg = 'mmdeploy/configs/mmpretrain/classification_tensorrt_static-224x224.py'
backend_files = ['work_dir/trt/resnet/end2end.engine']
img = 'mmpretrain/demo/demo.JPEG'
device = 'cuda'
result = inference_model(model_cfg, deploy_cfg, backend_files, img, device)
```
### Python SDK
The following describes how to use the SDK's Python API for inference.
#### ONNXRuntime
```bash
python .\mmdeploy\demo\python\image_classification.py cpu .\work_dir\onnx\resnet\ .\mmpretrain\demo\demo.JPEG
```
#### TensorRT
```bash
python .\mmdeploy\demo\python\image_classification.py cuda .\work_dir\trt\resnet\ .\mmpretrain\demo\demo.JPEG
```
### C SDK
The following describes how to use the SDK's C API for inference.
#### ONNXRuntime
1. Add environment variables
Refer to the README.md in sdk folder
2. Build examples
Refer to the README.md in sdk folder
3. Inference:
It is recommended to use `CMD` here.
Under `mmdeploy-1.3.1-windows-amd64\\example\\cpp\\build\\Release` directory:
```
.\image_classification.exe cpu C:\workspace\work_dir\onnx\resnet\ C:\workspace\mmpretrain\demo\demo.JPEG
```
#### TensorRT
1. Add environment variables
Refer to the README.md in sdk folder
2. Build examples
Refer to the README.md in sdk folder
3. Inference
It is recommended to use `CMD` here.
Under `mmdeploy-1.3.1-windows-amd64-cuda11.8\\example\\cpp\\build\\Release` directory
```
.\image_classification.exe cuda C:\workspace\work_dir\trt\resnet C:\workspace\mmpretrain\demo\demo.JPEG
```
## Troubleshooting
If you encounter problems, please refer to the [FAQ](../faq.md).
# How to evaluate model
After converting a PyTorch model to a backend model, you may evaluate backend models with `tools/test.py`
## Prerequisite
Install MMDeploy according to the [get-started](../get_started.md) instructions, and convert the PyTorch model or ONNX model to a backend model by following the [guide](convert_model.md).
## Usage
```shell
python tools/test.py \
${DEPLOY_CFG} \
${MODEL_CFG} \
--model ${BACKEND_MODEL_FILES} \
[--out ${OUTPUT_PKL_FILE}] \
[--format-only] \
[--metrics ${METRICS}] \
[--show] \
[--show-dir ${OUTPUT_IMAGE_DIR}] \
[--show-score-thr ${SHOW_SCORE_THR}] \
--device ${DEVICE} \
[--cfg-options ${CFG_OPTIONS}] \
[--metric-options ${METRIC_OPTIONS}] \
[--log2file work_dirs/output.txt] \
[--batch-size ${BATCH_SIZE}] \
[--speed-test] \
[--warmup ${WARM_UP}] \
[--log-interval ${LOG_INTERVAL}]
```
## Description of all arguments
- `deploy_cfg`: The config for deployment.
- `model_cfg`: The config of the model in OpenMMLab codebases.
- `--model`: The backend model file. For example, if we convert a model to TensorRT, we need to pass the model file with ".engine" suffix.
- `--out`: The path to save output results in pickle format. (The results will be saved only if this argument is given)
- `--format-only`: Whether to format the output results without evaluation. It is useful when you want to format the result to a specific format and submit it to the test server.
- `--metrics`: The metrics to evaluate the model defined in OpenMMLab codebases. e.g. "segm", "proposal" for COCO in mmdet, "precision", "recall", "f1_score", "support" for single label dataset in mmpretrain.
- `--show`: Whether to show the evaluation result on the screen.
- `--show-dir`: The directory to save the evaluation result. (The results will be saved only if this argument is given)
- `--show-score-thr`: The threshold determining whether to show detection bounding boxes.
- `--device`: The device that the model runs on. Note that some backends restrict the device. For example, TensorRT must run on cuda.
- `--cfg-options`: Extra or overridden settings that will be merged into the current deploy config.
- `--metric-options`: Custom options for evaluation. Key-value pairs in `xxx=yyy` format will be passed as kwargs to the `dataset.evaluate()` function.
- `--log2file`: Log evaluation results (and speed) to a file.
- `--batch-size`: the batch size for inference, which would override `samples_per_gpu` in data config. Default is `1`. Note that not all models support `batch_size>1`.
- `--speed-test`: Whether to activate speed test.
- `--warmup`: Number of warm-up iterations before timing inference; requires `--speed-test`.
- `--log-interval`: The interval between each log; requires `--speed-test`.
\* Other arguments in `tools/test.py` are used for the speed test. They are unrelated to evaluation.
## Example
```shell
python tools/test.py \
configs/mmpretrain/classification_onnxruntime_static.py \
${MMPRETRAIN_DIR}/configs/resnet/resnet50_b32x8_imagenet.py \
--model model.onnx \
--out out.pkl \
--device cpu \
--speed-test
```
## Note
- The performance of each model in [OpenMMLab](https://openmmlab.com/) codebases can be found in the document of each codebase.
# Quantize model
## Why quantization?
The fixed-point model has many advantages over the fp32 model:
- Smaller size: an 8-bit model reduces the file size by 75%
- Thanks to the smaller model, the cache hit rate improves and inference is faster
- Chips tend to have dedicated fixed-point acceleration instructions, which are faster and consume less energy (int8 on a common CPU needs only about 10% of the energy of fp32)
File size and heat generation are key indicators when evaluating a mobile APP.
On the server side, quantization means you can keep the same QPS while serving a larger, more precise model.
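The 75% figure follows directly from per-weight storage: fp32 takes 4 bytes, int8 takes 1. A quick sanity check (the parameter count below is roughly ResNet-18's and is used only for illustration):

```python
# Per-weight storage cost: fp32 = 4 bytes, int8 = 1 byte.
def model_size_bytes(num_params: int, bytes_per_param: int) -> int:
    return num_params * bytes_per_param

params = 11_689_512  # roughly ResNet-18's parameter count (illustrative)
fp32_size = model_size_bytes(params, 4)
int8_size = model_size_bytes(params, 1)
reduction = 1 - int8_size / fp32_size
print(f"fp32: {fp32_size / 1e6:.1f} MB, int8: {int8_size / 1e6:.1f} MB, "
      f"saved: {reduction:.0%}")
```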
## Post training quantization scheme
Taking ncnn backend as an example, the complete workflow is as follows:
<div align="center">
<img src="../_static/image/quant_model.png"/>
</div>
MMDeploy generates a quantization table from the static graph (ONNX) and uses backend tools to convert the fp32 model to fixed point.
MMDeploy currently supports PTQ with the ncnn backend.
## How to convert model
[After mmdeploy installation](../01-how-to-build/build_from_source.md), install ppq
```bash
git clone https://github.com/openppl-public/ppq.git
cd ppq
pip install -r requirements.txt
python3 setup.py install
```
Back in mmdeploy, enable quantization by passing the `--quant` option to `tools/deploy.py`.
```bash
cd /path/to/mmdeploy
export MODEL_CONFIG=/path/to/mmpretrain/configs/resnet/resnet18_8xb32_in1k.py
export MODEL_PATH=https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_8xb32_in1k_20210831-fbbb1da6.pth
# get some imagenet sample images
git clone https://github.com/nihui/imagenet-sample-images --depth=1
# quantize
python3 tools/deploy.py configs/mmpretrain/classification_ncnn-int8_static.py ${MODEL_CONFIG} ${MODEL_PATH} /path/to/self-test.png --work-dir work_dir --device cpu --quant --quant-image-dir /path/to/imagenet-sample-images
...
```
Description of the quantization parameters:
| Parameter | Meaning |
| :---------------: | :--------------------------------------------------------------: |
| --quant | Enable quantization, the default value is False |
| --quant-image-dir | Calibrate dataset, use Validation Set in MODEL_CONFIG by default |
## Custom calibration dataset
The calibration set is used to compute quantization-layer parameters. Some DFQ (Data-Free Quantization) methods do not even require a dataset.
- Create a folder and just put some images in it (no directory structure, no negative examples, no special filename format)
- The images should come from a real deployment scenario, otherwise accuracy will drop
- Do not quantize the model with the test dataset
| Type | Train dataset | Validation dataset | Test dataset | Calibration dataset |
| ----- | ------------- | ------------------ | ------------- | ------------------- |
| Usage | QAT | PTQ | Test accuracy | PTQ |
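Following the rules above, a calibration folder can be assembled by randomly sampling images from a real-scenario source. The helper below is an illustrative sketch (not an MMDeploy utility) using only the standard library:

```python
import random
import shutil
from pathlib import Path

def sample_calibration_images(src_dir: str, dst_dir: str, num: int, seed: int = 0) -> int:
    """Copy a random sample of images from src_dir into a flat dst_dir."""
    exts = {'.jpg', '.jpeg', '.png', '.bmp'}
    images = [p for p in Path(src_dir).rglob('*') if p.suffix.lower() in exts]
    random.Random(seed).shuffle(images)
    picked = images[:num]
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for p in picked:
        # Flat copy: the calibration set needs no directory structure.
        shutil.copy(p, dst / p.name)
    return len(picked)
```

For example, `sample_calibration_images('imagenet-sample-images', 'calib-images', 200)` would produce a flat folder of 200 real images (the folder names here are hypothetical).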
It is highly recommended to [verify model precision](profile_model.md) after quantization. [Here](../03-benchmark/quantization.md) are some quantization test results.
# Useful Tools
Apart from `deploy.py`, there are other useful tools under the `tools/` directory.
## torch2onnx
This tool can be used to convert PyTorch model from OpenMMLab to ONNX.
### Usage
```bash
python tools/torch2onnx.py \
${DEPLOY_CFG} \
${MODEL_CFG} \
${CHECKPOINT} \
${INPUT_IMG} \
--work-dir ${WORK_DIR} \
--device cpu \
--log-level INFO
```
### Description of all arguments
- `deploy_cfg` : The path of the deploy config file in MMDeploy codebase.
- `model_cfg` : The path of model config file in OpenMMLab codebase.
- `checkpoint` : The path of the model checkpoint file.
- `img` : The path of the image file used to convert the model.
- `--work-dir` : Directory to save the output ONNX models. Default is `./work-dir`.
- `--device` : The device used for conversion. If not specified, it will be set to `cpu`.
- `--log-level` : The log level, chosen from `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
## extract
An ONNX model with `Mark` nodes in it can be partitioned into multiple subgraphs. This tool can be used to extract a subgraph from such an ONNX model.
### Usage
```bash
python tools/extract.py \
${INPUT_MODEL} \
${OUTPUT_MODEL} \
--start ${PARTITION_START} \
--end ${PARTITION_END} \
--log-level INFO
```
### Description of all arguments
- `input_model` : The path of input ONNX model. The output ONNX model will be extracted from this model.
- `output_model` : The path of output ONNX model.
- `--start` : The start point of extracted model with format `<function_name>:<input/output>`. The `function_name` comes from the decorator `@mark`.
- `--end` : The end point of extracted model with format `<function_name>:<input/output>`. The `function_name` comes from the decorator `@mark`.
- `--log-level` : The log level, chosen from `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
### Note
To support the model partition, you need to add Mark nodes in the ONNX model. The Mark node comes from the `@mark` decorator.
For example, if we have marked the `multiclass_nms` as below, we can set `end=multiclass_nms:input` to extract the subgraph before NMS.
```python
@mark('multiclass_nms', inputs=['boxes', 'scores'], outputs=['dets', 'labels'])
def multiclass_nms(*args, **kwargs):
"""Wrapper function for `_multiclass_nms`."""
```
## onnx2pplnn
This tool helps to convert an `ONNX` model to a `PPLNN` model.
### Usage
```bash
python tools/onnx2pplnn.py \
${ONNX_PATH} \
${OUTPUT_PATH} \
--device cuda:0 \
--opt-shapes [224,224] \
--log-level INFO
```
### Description of all arguments
- `onnx_path`: The path of the `ONNX` model to convert.
- `output_path`: The converted `PPLNN` algorithm path in json format.
- `device`: The device of the model during conversion.
- `opt-shapes`: Optimal shapes for PPLNN optimization. Each tensor's shape should be wrapped with "[]" or "()", and the shapes of different tensors should be separated by ",".
- `--log-level`: The log level, chosen from `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
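The shape-string format described above (each shape wrapped with "[]" or "()", shapes separated by ",") can be parsed as sketched below; this is an illustrative parser, not the one shipped with the tool:

```python
import re

def parse_opt_shapes(text: str) -> list:
    """Parse a shape string like "[1,3,224,224],(1,3,256,256)" into lists of ints."""
    groups = re.findall(r'[\[(]([^\])]*)[\])]', text)
    return [[int(dim) for dim in group.split(',')] for group in groups]

print(parse_opt_shapes('[1,3,224,224],(1,3,256,256)'))
# → [[1, 3, 224, 224], [1, 3, 256, 256]]
```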
## onnx2tensorrt
This tool can be used to convert ONNX to TensorRT engine.
### Usage
```bash
python tools/onnx2tensorrt.py \
${DEPLOY_CFG} \
${ONNX_PATH} \
${OUTPUT} \
--device-id 0 \
--log-level INFO \
--calib-file /path/to/file
```
### Description of all arguments
- `deploy_cfg` : The path of the deploy config file in MMDeploy codebase.
- `onnx_path` : The ONNX model path to convert.
- `output` : The path of output TensorRT engine.
- `--device-id` : The device index, default to `0`.
- `--calib-file` : The calibration data used to calibrate engine to int8.
- `--log-level` : The log level, chosen from `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
## onnx2ncnn
This tool helps to convert an `ONNX` model to an `ncnn` model.
### Usage
```bash
python tools/onnx2ncnn.py \
${ONNX_PATH} \
${NCNN_PARAM} \
${NCNN_BIN} \
--log-level INFO
```
### Description of all arguments
- `onnx_path` : The path of the `ONNX` model to convert from.
- `output_param` : The converted `ncnn` param path.
- `output_bin` : The converted `ncnn` bin path.
- `--log-level` : The log level, chosen from `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
## profiler
This tool helps to test latency of models with PyTorch, TensorRT and other backends. Note that the pre- and post-processing is excluded when computing inference latency.
### Usage
```bash
python tools/profiler.py \
${DEPLOY_CFG} \
${MODEL_CFG} \
${IMAGE_DIR} \
--model ${MODEL} \
--device ${DEVICE} \
--shape ${SHAPE} \
--num-iter ${NUM_ITER} \
--warmup ${WARMUP} \
--cfg-options ${CFG_OPTIONS} \
--batch-size ${BATCH_SIZE} \
--img-ext ${IMG_EXT}
```
### Description of all arguments
- `deploy_cfg` : The path of the deploy config file in MMDeploy codebase.
- `model_cfg` : The path of model config file in OpenMMLab codebase.
- `image_dir` : The directory of image files used to test the model.
- `--model` : The path of the model to be tested.
- `--shape` : Input shape of the model by `HxW`, e.g., `800x1344`. If not specified, it would use `input_shape` from deploy config.
- `--num-iter` : Number of iterations to run inference. Default is `100`.
- `--warmup` : Number of iterations to warm up the machine. Default is `10`.
- `--device` : The device type. If not specified, it will be set to `cuda:0`.
- `--cfg-options` : Optional key-value pairs to override in the model config.
- `--batch-size`: the batch size for test inference. Default is `1`. Note that not all models support `batch_size>1`.
- `--img-ext`: the file extensions for input images from `image_dir`. Defaults to `['.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif']`.
### Example:
```shell
python tools/profiler.py \
configs/mmpretrain/classification_tensorrt_dynamic-224x224-224x224.py \
../mmpretrain/configs/resnet/resnet18_8xb32_in1k.py \
../mmpretrain/demo/ \
--model work-dirs/mmpretrain/resnet/trt/end2end.engine \
--device cuda \
--shape 224x224 \
--num-iter 100 \
--warmup 10 \
--batch-size 1
```
And the output looks like this:
```text
----- Settings:
+------------+---------+
| batch size | 1 |
| shape | 224x224 |
| iterations | 100 |
| warmup | 10 |
+------------+---------+
----- Results:
+--------+------------+---------+
| Stats | Latency/ms | FPS |
+--------+------------+---------+
| Mean | 1.535 | 651.656 |
| Median | 1.665 | 600.569 |
| Min | 1.308 | 764.341 |
| Max | 1.689 | 591.983 |
+--------+------------+---------+
```
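The Mean/Median/Min/Max rows of the Results table are plain statistics over the per-iteration latencies; given raw timings, they can be recomputed with the standard library (a sketch of what the table reports, not the profiler's own code):

```python
import statistics

def summarize(latencies_ms, batch_size=1):
    """Build (stat, latency/ms, FPS) rows like profiler.py's Results table."""
    stats = [('Mean', statistics.mean(latencies_ms)),
             ('Median', statistics.median(latencies_ms)),
             ('Min', min(latencies_ms)),
             ('Max', max(latencies_ms))]
    # FPS is derived from each latency, so the Max-latency row has the lowest FPS.
    return [(name, round(ms, 3), round(batch_size * 1000.0 / ms, 3))
            for name, ms in stats]

for row in summarize([1.4, 1.5, 1.6, 1.7]):
    print(row)
```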
## generate_md_table
This tool can be used to generate the supported-backends markdown table.
### Usage
```shell
python tools/generate_md_table.py \
${YML_FILE} \
${OUTPUT} \
--backends ${BACKENDS}
```
### Description of all arguments
- `yml_file`: input yml config path
- `output`: output markdown file path
- `--backends`: output backends list. If not specified, it will be set to `onnxruntime tensorrt torchscript pplnn openvino ncnn`.
### Example:
Generate the backends markdown table from mmocr.yml:
```shell
python tools/generate_md_table.py tests/regression/mmocr.yml tests/regression/mmocr.md --backends onnxruntime tensorrt torchscript pplnn openvino ncnn
```
And the output looks like this:
| model | task | onnxruntime | tensorrt | torchscript | pplnn | openvino | ncnn |
| :----------------------------------------------------------------------------------- | :-------------- | :---------: | :------: | :---------: | :---: | :------: | :--: |
| [DBNet](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/dbnet) | TextDetection | Y | Y | Y | Y | Y | Y |
| [DBNetpp](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/dbnetpp) | TextDetection | Y | Y | N | N | Y | Y |
| [PANet](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/panet) | TextDetection | Y | Y | Y | Y | Y | Y |
| [PSENet](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/psenet) | TextDetection | Y | Y | Y | Y | Y | Y |
| [TextSnake](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/textsnake) | TextDetection | Y | Y | Y | N | N | N |
| [MaskRCNN](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/maskrcnn) | TextDetection | Y | Y | Y | N | N | N |
| [CRNN](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/crnn) | TextRecognition | Y | Y | Y | Y | N | Y |
| [SAR](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/sar) | TextRecognition | Y | N | Y | N | N | N |
| [SATRN](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/satrn) | TextRecognition | Y | Y | Y | N | N | N |
| [ABINet](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/abinet) | TextRecognition | Y | Y | Y | N | N | N |
# How to write config
This tutorial describes how to write a config for model conversion and deployment. A deployment config includes `onnx config`, `codebase config`, `backend config`.
<!-- TOC -->
- [How to write config](#how-to-write-config)
- [1. How to write onnx config](#1-how-to-write-onnx-config)
- [Description of onnx config arguments](#description-of-onnx-config-arguments)
- [Example](#example)
- [If you need to use dynamic axes](#if-you-need-to-use-dynamic-axes)
- [Example](#example-1)
- [2. How to write codebase config](#2-how-to-write-codebase-config)
- [Description of codebase config arguments](#description-of-codebase-config-arguments)
- [Example](#example-2)
- [3. How to write backend config](#3-how-to-write-backend-config)
- [Example](#example-3)
- [4. A complete example of mmpretrain on TensorRT](#4-a-complete-example-of-mmpretrain-on-tensorrt)
- [5. The name rules of our deployment config](#5-the-name-rules-of-our-deployment-config)
- [Example](#example-4)
- [6. How to write model config](#6-how-to-write-model-config)
<!-- TOC -->
## 1. How to write onnx config
The onnx config describes how to export a model from PyTorch to ONNX.
### Description of onnx config arguments
- `type`: Type of config dict. Default is `onnx`.
- `export_params`: If specified, all parameters will be exported. Set this to False if you want to export an untrained model.
- `keep_initializers_as_inputs`: If True, all the initializers (typically corresponding to parameters) in the exported graph will also be added as inputs to the graph. If False, then initializers are not added as inputs to the graph, and only the non-parameter inputs are added as inputs.
- `opset_version`: The ONNX opset version; 11 by default.
- `save_file`: Output onnx file.
- `input_names`: Names to assign to the input nodes of the graph.
- `output_names`: Names to assign to the output nodes of the graph.
- `input_shape`: The height and width of input tensor to the model.
### Example
```python
onnx_config = dict(
type='onnx',
export_params=True,
keep_initializers_as_inputs=False,
opset_version=11,
save_file='end2end.onnx',
input_names=['input'],
output_names=['output'],
input_shape=None)
```
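Most of these fields map one-to-one onto `torch.onnx.export` arguments. The sketch below shows that mapping without running an export; it assumes (as the config above suggests) that `save_file` and `input_shape` are consumed by the deployment pipeline itself rather than passed through:

```python
def onnx_export_kwargs(onnx_config: dict) -> dict:
    """Translate onnx_config fields into torch.onnx.export keyword arguments."""
    passthrough = ('export_params', 'keep_initializers_as_inputs',
                   'opset_version', 'input_names', 'output_names')
    kwargs = {k: onnx_config[k] for k in passthrough if k in onnx_config}
    if onnx_config.get('dynamic_axes'):
        kwargs['dynamic_axes'] = onnx_config['dynamic_axes']
    return kwargs

onnx_config = dict(
    type='onnx', export_params=True, keep_initializers_as_inputs=False,
    opset_version=11, save_file='end2end.onnx',
    input_names=['input'], output_names=['output'], input_shape=None)
print(onnx_export_kwargs(onnx_config))
# usage sketch: torch.onnx.export(model, dummy_input,
#                                 onnx_config['save_file'], **kwargs)
```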
### If you need to use dynamic axes
If dynamic shapes for the inputs and outputs are required, you need to add a `dynamic_axes` dict to the onnx config.
- `dynamic_axes`: Describe the dimensional information about input and output.
#### Example
```python
dynamic_axes={
'input': {
0: 'batch',
2: 'height',
3: 'width'
},
'dets': {
0: 'batch',
1: 'num_dets',
},
'labels': {
0: 'batch',
1: 'num_dets',
},
}
```
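Each top-level key of `dynamic_axes` must be one of the declared `input_names`/`output_names`, and each inner key is a dimension index of that tensor. A small sanity check (an illustrative helper; the `output` name and the rank values are assumptions for the example):

```python
def check_dynamic_axes(dynamic_axes, input_names, output_names, ndims):
    """Verify each dynamic axis refers to a declared tensor and a valid dimension."""
    known = set(input_names) | set(output_names)
    for name, axes in dynamic_axes.items():
        if name not in known:
            raise ValueError(f'unknown tensor name: {name}')
        for axis in axes:
            if not 0 <= axis < ndims[name]:
                raise ValueError(f'{name}: axis {axis} out of range')
    return True

dynamic_axes = {
    'input': {0: 'batch', 2: 'height', 3: 'width'},
    'output': {0: 'batch'},
}
print(check_dynamic_axes(dynamic_axes, ['input'], ['output'],
                         ndims={'input': 4, 'output': 2}))
# → True
```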
## 2. How to write codebase config
The codebase config contains information such as the codebase type and task type.
### Description of codebase config arguments
- `type`: Model's codebase, including `mmpretrain`, `mmdet`, `mmseg`, `mmocr`, `mmagic`.
- `task`: Model's task type, referring to [List of tasks in all codebases](#list-of-tasks-in-all-codebases).
#### Example
```python
codebase_config = dict(type='mmpretrain', task='Classification')
```
## 3. How to write backend config
The backend config is mainly used to specify the backend on which the model runs and to provide the information needed when the model runs on that backend, referring to [ONNX Runtime](../05-supported-backends/onnxruntime.md), [TensorRT](../05-supported-backends/tensorrt.md), [ncnn](../05-supported-backends/ncnn.md), [PPLNN](../05-supported-backends/pplnn.md).
- `type`: Model's backend, including `onnxruntime`, `ncnn`, `pplnn`, `tensorrt`, `openvino`.
### Example
```python
backend_config = dict(
type='tensorrt',
common_config=dict(
fp16_mode=False, max_workspace_size=1 << 30),
model_inputs=[
dict(
input_shapes=dict(
input=dict(
min_shape=[1, 3, 512, 1024],
opt_shape=[1, 3, 1024, 2048],
max_shape=[1, 3, 2048, 2048])))
])
```
## 4. A complete example of mmpretrain on TensorRT
Here we provide a complete deployment config of mmpretrain on TensorRT.
```python
codebase_config = dict(type='mmpretrain', task='Classification')
backend_config = dict(
type='tensorrt',
common_config=dict(
fp16_mode=False,
max_workspace_size=1 << 30),
model_inputs=[
dict(
input_shapes=dict(
input=dict(
min_shape=[1, 3, 224, 224],
opt_shape=[4, 3, 224, 224],
max_shape=[64, 3, 224, 224])))])
onnx_config = dict(
type='onnx',
dynamic_axes={
'input': {
0: 'batch',
2: 'height',
3: 'width'
},
'output': {
0: 'batch'
}
},
export_params=True,
keep_initializers_as_inputs=False,
opset_version=11,
save_file='end2end.onnx',
input_names=['input'],
output_names=['output'],
input_shape=[224, 224])
```
## 5. The name rules of our deployment config
There is a specific naming convention for the filename of deployment config files.
```bash
(task name)_(backend name)_(dynamic or static).py
```
- `task name`: Model's task type.
- `backend name`: Backend's name. Note if you use the quantization function, you need to indicate the quantization type. Just like `tensorrt-int8`.
- `dynamic or static`: Dynamic or static export. Note if the backend needs explicit shape information, you need to add a description of input size with `height x width` format. Just like `dynamic-512x1024-2048x2048`, it means that the min input shape is `512x1024` and the max input shape is `2048x2048`.
### Example
```bash
detection_tensorrt-int8_dynamic-320x320-1344x1344.py
```
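The naming rule above can be read mechanically; the parser below is an illustrative sketch of the convention, not an MMDeploy utility:

```python
import re

def parse_deploy_cfg_name(filename: str) -> dict:
    """Split '(task)_(backend)_(dynamic|static)[-HxW[-HxW]].py' into its parts."""
    stem = filename[:-3] if filename.endswith('.py') else filename
    task, backend, shape = stem.split('_', 2)
    m = re.fullmatch(r'(dynamic|static)(?:-(\d+x\d+))?(?:-(\d+x\d+))?', shape)
    return {'task': task, 'backend': backend, 'export': m.group(1),
            'min_shape': m.group(2), 'max_shape': m.group(3)}

print(parse_deploy_cfg_name('detection_tensorrt-int8_dynamic-320x320-1344x1344.py'))
```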
## 6. How to write model config
Write the model config file according to the model's codebase. The model config is used to initialize the model; refer to [MMPretrain](https://github.com/open-mmlab/mmpretrain/blob/main/docs/en/user_guides/config.md), [MMDetection](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/en/user_guides/config.md), [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/en/user_guides/1_config.md), [MMOCR](https://github.com/open-mmlab/mmocr/blob/main/docs/en/user_guides/config.md), [MMagic](https://github.com/open-mmlab/mmagic/blob/main/docs/en/user_guides/config.md).
# Benchmark
## Backends
CPU: ncnn, ONNXRuntime, OpenVINO
GPU: ncnn, TensorRT, PPLNN
## Latency benchmark
### Platform
- Ubuntu 18.04
- ncnn 20211208
- Cuda 11.3
- TensorRT 7.2.3.4
- Docker 20.10.8
- NVIDIA Tesla T4 Tensor Core GPU for TensorRT
### Other settings
- Static graph
- Batch size 1
- Synchronize devices after each inference.
- We count the average inference performance of 100 images of the dataset.
- Warm up. For ncnn, we warm up 30 iters for all codebases. As for other backends: for classification, we warm up 1010 iters; for other codebases, we warm up 10 iters.
- Input resolution varies for different datasets of different codebases. All inputs are real images except for `mmagic` because the dataset is not large enough.
Users can directly test the speed through [model profiling](../02-how-to-run/profile_model.md). Here is the benchmark in our environment.
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="2">mmpretrain</th>
<th align="center" colspan="5">TensorRT(ms)</th>
<th align="center" colspan="2">PPLNN(ms)</th>
<th align="center" colspan="2">ncnn(ms)</th>
<th align="center" colspan="1">Ascend(ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" colspan="1" rowspan="2">model</td>
<td align="center" colspan="1" rowspan="2">spatial</td>
<td align="center" colspan="3">T4</td>
<td align="center" colspan="2">JetsonNano2GB</td>
<td align="center" colspan="1">Jetson TX2</td>
<td align="center" colspan="1">T4</td>
<td align="center" colspan="1">SnapDragon888</td>
<td align="center" colspan="1">Adreno660</td>
<td align="center" colspan="1">Ascend310</td>
</tr>
<tr>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">int8</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnet/resnet50_8xb32_in1k.py"> ResNet </a></td>
<td align="center">224x224</td>
<td align="center">2.97</td>
<td align="center">1.26</td>
<td align="center">1.21</td>
<td align="center">59.32</td>
<td align="center">30.54</td>
<td align="center">24.13</td>
<td align="center">1.30</td>
<td align="center">33.91</td>
<td align="center">25.93</td>
<td align="center">2.49</td>
</tr>
<tr>
<td align="center"> <a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnext/resnext50-32x4d_8xb32_in1k.py"> ResNeXt </a></td>
<td align="center">224x224</td>
<td align="center">4.31</td>
<td align="center">1.42</td>
<td align="center">1.37</td>
<td align="center">88.10</td>
<td align="center">49.18</td>
<td align="center">37.45</td>
<td align="center">1.36</td>
<td align="center">133.44</td>
<td align="center">69.38</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"> <a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/seresnet/seresnet50_8xb32_in1k.py"> SE-ResNet </a></td>
<td align="center">224x224</td>
<td align="center">3.41</td>
<td align="center">1.66</td>
<td align="center">1.51</td>
<td align="center">74.59</td>
<td align="center">48.78</td>
<td align="center">29.62</td>
<td align="center">1.91</td>
<td align="center">107.84</td>
<td align="center">80.85</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py"> ShuffleNetV2 </a></td>
<td align="center">224x224</td>
<td align="center">1.37</td>
<td align="center">1.19</td>
<td align="center">1.13</td>
<td align="center">15.26</td>
<td align="center">10.23</td>
<td align="center">7.37</td>
<td align="center">4.69</td>
<td align="center">9.55</td>
<td align="center">10.66</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="2">mmdet part1</th>
<th align="center" colspan="4">TensorRT(ms)</th>
<th align="center" colspan="1">PPLNN(ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="2" colspan="1">model</td>
<td align="center" rowspan="2" colspan="1">spatial</td>
<td align="center" colspan="3">T4</td>
<td align="center" colspan="1">Jetson TX2</td>
<td align="center" colspan="1">T4</td>
</tr>
<tr>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">int8</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/main/configs/yolo/yolov3_d53_320_273e_coco.py">YOLOv3</a></td>
<td align="center">320x320</td>
<td align="center">14.76</td>
<td align="center">24.92</td>
<td align="center">24.92</td>
<td align="center">-</td>
<td align="center">18.07</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/main/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py">SSD-Lite</a></td>
<td align="center">320x320</td>
<td align="center">8.84</td>
<td align="center">9.21</td>
<td align="center">8.04</td>
<td align="center">1.28</td>
<td align="center">19.72</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/main/configs/retinanet/retinanet_r50_fpn_1x_coco.py">RetinaNet</a></td>
<td align="center">800x1344</td>
<td align="center">97.09</td>
<td align="center">25.79</td>
<td align="center">16.88</td>
<td align="center">780.48</td>
<td align="center">38.34</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/main/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py">FCOS</a></td>
<td align="center">800x1344</td>
<td align="center">84.06</td>
<td align="center">23.15</td>
<td align="center">17.68</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/main/configs/fsaf/fsaf_r50_fpn_1x_coco.py">FSAF</a></td>
<td align="center">800x1344</td>
<td align="center">82.96</td>
<td align="center">21.02</td>
<td align="center">13.50</td>
<td align="center">-</td>
<td align="center">30.41</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/main/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py">Faster R-CNN</a></td>
<td align="center">800x1344</td>
<td align="center">88.08</td>
<td align="center">26.52</td>
<td align="center">19.14</td>
<td align="center">733.81</td>
<td align="center">65.40</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py">Mask R-CNN</a></td>
<td align="center">800x1344</td>
<td align="center">104.83</td>
<td align="center">58.27</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">86.80</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table>
<thead>
<tr>
<th align="center" colspan="2">mmdet part2</th>
<th align="center" colspan="2">ncnn</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="2">model</td>
<td align="center" rowspan="2">spatial</td>
<td align="center" colspan="1">SnapDragon888</td>
<td align="center" colspan="1">Adreno660</td>
</tr>
<tr>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo/yolov3_mobilenetv2_mstrain-416_300e_coco.py">MobileNetv2-YOLOv3</a></td>
<td align="center">320x320</td>
<td align="center">48.57</td>
<td align="center">66.55</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py">SSD-Lite</a></td>
<td align="center">320x320</td>
<td align="center">44.91</td>
<td align="center">66.19</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox/yolox_tiny_8x8_300e_coco.py">YOLOX</a></td>
<td align="center">416x416</td>
<td align="center">111.60</td>
<td align="center">134.50</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="2">mmagic</th>
<th align="center" colspan="4">TensorRT(ms)</th>
<th align="center" colspan="1">PPLNN(ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="2">model</td>
<td align="center" rowspan="2">spatial</td>
<td align="center" colspan="3">T4</td>
<td align="center" colspan="1">Jetson TX2</td>
<td align="center" colspan="1">T4</td>
</tr>
<tr>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">int8</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/esrgan/esrgan_psnr-x4c64b23g32_1xb16-1000k_div2k.py">ESRGAN</a></td>
<td align="center">32x32</td>
<td align="center">12.64</td>
<td align="center">12.42</td>
<td align="center">12.45</td>
<td align="center">-</td>
<td align="center">7.67</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/srcnn/srcnn_x4k915_1xb16-1000k_div2k.py">SRCNN</a></td>
<td align="center">32x32</td>
<td align="center">0.70</td>
<td align="center">0.35</td>
<td align="center">0.26</td>
<td align="center">58.86</td>
<td align="center">0.56</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="2">mmocr</th>
<th align="center" colspan="3">TensorRT(ms)</th>
<th align="center" colspan="1">PPLNN(ms)</th>
<th align="center" colspan="2">ncnn(ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="2">model</td>
<td align="center" rowspan="2">spatial</td>
<td align="center" colspan="3">T4</td>
<td align="center" colspan="1">T4</td>
<td align="center" colspan="1">SnapDragon888</td>
<td align="center" colspan="1">Adreno660</td>
</tr>
<tr>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">int8</td>
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py">DBNet</a></td>
<td align="center">640x640</td>
<td align="center">10.70</td>
<td align="center">5.62</td>
<td align="center">5.00</td>
<td align="center">34.84</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/crnn/crnn_mini-vgg_5e_mj.py">CRNN</a></td>
<td align="center">32x32</td>
<td align="center">1.93 </td>
<td align="center">1.40</td>
<td align="center">1.36</td>
<td align="center">-</td>
<td align="center">10.57</td>
<td align="center">20.00</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="2">mmseg</th>
<th align="center" colspan="4">TensorRT(ms)</th>
<th align="center" colspan="1">PPLNN(ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="2">model</td>
<td align="center" rowspan="2">spatial</td>
<td align="center" colspan="3">T4</td>
<td align="center" colspan="1">Jetson TX2</td>
<td align="center" colspan="1">T4</td>
</tr>
<tr>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">int8</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp16</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fcn/fcn_r50-d8_4xb2-40k_cityscapes-512x1024.py">FCN</a></td>
<td align="center">512x1024</td>
<td align="center">128.42</td>
<td align="center">23.97</td>
<td align="center">18.13</td>
<td align="center">1682.54</td>
<td align="center">27.00</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/pspnet/pspnet_r50-d8_4xb2-80k_cityscapes-512x1024.py">PSPNet</a></td>
<td align="center">1x3x512x1024</td>
<td align="center">119.77</td>
<td align="center">24.10</td>
<td align="center">16.33</td>
<td align="center">1586.19</td>
<td align="center">27.26</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3/deeplabv3_r50-d8_4xb2-80k_cityscapes-512x1024.py">DeepLabV3</a></td>
<td align="center">512x1024</td>
<td align="center">226.75</td>
<td align="center">31.80</td>
<td align="center">19.85</td>
<td align="center">-</td>
<td align="center">36.01</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3plus/deeplabv3plus_r50-d8_4xb2-80k_cityscapes-512x1024.py">DeepLabV3+</a></td>
<td align="center">512x1024</td>
<td align="center">151.25</td>
<td align="center">47.03</td>
<td align="center">50.38</td>
<td align="center">2534.96</td>
<td align="center">34.80</td>
</tr>
</tbody>
</table>
</div>

## Performance benchmark

Users can reproduce these numbers by following [profile_model.md](../02-how-to-run/profile_model.md). The benchmark below was measured in our environment.

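As a sketch of how such measurements are taken, MMDeploy ships a `tools/profiler.py` script for timing a converted model. The deploy config, model config, image directory, and engine path below are placeholders chosen for illustration; check profile_model.md for the authoritative flags in your version:

```shell
# Profile a TensorRT engine exported from ResNet-18 (all paths are placeholders).
python tools/profiler.py \
    configs/mmpretrain/classification_tensorrt_static-224x224.py \
    ${MMPRETRAIN_DIR}/configs/resnet/resnet18_8xb32_in1k.py \
    ${TEST_IMAGE_DIR} \
    --model work_dir/end2end.engine \
    --device cuda \
    --shape 224x224 \
    --num-iter 100 \
    --warmup 10
```

The script reports per-iteration latency statistics after the warmup iterations are discarded.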
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="2">mmpretrain</th>
<th align="center">PyTorch</th>
<th align="center">TorchScript</th>
<th align="center">ONNX Runtime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">model</td>
<td align="center">metric</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnet/resnet18_8xb32_in1k.py">ResNet-18</a></td>
<td align="center">top-1</td>
<td align="center">69.90</td>
<td align="center">69.90</td>
<td align="center">69.88</td>
<td align="center">69.88</td>
<td align="center">69.86</td>
<td align="center">69.86</td>
<td align="center">69.86</td>
<td align="center">69.91</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">89.43</td>
<td align="center">89.43</td>
<td align="center">89.34</td>
<td align="center">89.34</td>
<td align="center">89.33</td>
<td align="center">89.38</td>
<td align="center">89.34</td>
<td align="center">89.43</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnext/resnext50-32x4d_8xb32_in1k.py">ResNeXt-50</a></td>
<td align="center">top-1</td>
<td align="center">77.90</td>
<td align="center">77.90</td>
<td align="center">77.90</td>
<td align="center">77.90</td>
<td align="center">-</td>
<td align="center">77.78</td>
<td align="center">77.89</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">93.66</td>
<td align="center">93.66</td>
<td align="center">93.66</td>
<td align="center">93.66</td>
<td align="center">-</td>
<td align="center">93.64</td>
<td align="center">93.65</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/seresnet/seresnext50-32x4d_8xb32_in1k.py">SE-ResNet-50</a></td>
<td align="center">top-1</td>
<td align="center">77.74</td>
<td align="center">77.74</td>
<td align="center">77.74</td>
<td align="center">77.74</td>
<td align="center">77.75</td>
<td align="center">77.63</td>
<td align="center">77.73</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">93.84</td>
<td align="center">93.84</td>
<td align="center">93.84</td>
<td align="center">93.84</td>
<td align="center">93.83</td>
<td align="center">93.72</td>
<td align="center">93.84</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/shufflenet_v1/shufflenet-v1-1x_16xb64_in1k.py">ShuffleNetV1 1.0x</a></td>
<td align="center">top-1</td>
<td align="center">68.13</td>
<td align="center">68.13</td>
<td align="center">68.13</td>
<td align="center">68.13</td>
<td align="center">68.13</td>
<td align="center">67.71</td>
<td align="center">68.11</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">87.81</td>
<td align="center">87.81</td>
<td align="center">87.81</td>
<td align="center">87.81</td>
<td align="center">87.81</td>
<td align="center">87.58</td>
<td align="center">87.80</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py">ShuffleNetV2 1.0x</a></td>
<td align="center">top-1</td>
<td align="center">69.55</td>
<td align="center">69.55</td>
<td align="center">69.55</td>
<td align="center">69.55</td>
<td align="center">69.54</td>
<td align="center">69.10</td>
<td align="center">69.54</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">88.92</td>
<td align="center">88.92</td>
<td align="center">88.92</td>
<td align="center">88.92</td>
<td align="center">88.91</td>
<td align="center">88.58</td>
<td align="center">88.92</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py">MobileNet V2</a></td>
<td align="center">top-1</td>
<td align="center">71.86</td>
<td align="center">71.86</td>
<td align="center">71.86</td>
<td align="center">71.86</td>
<td align="center">71.87</td>
<td align="center">70.91</td>
<td align="center">71.84</td>
<td align="center">71.87</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">90.42</td>
<td align="center">90.42</td>
<td align="center">90.42</td>
<td align="center">90.42</td>
<td align="center">90.40</td>
<td align="center">89.85</td>
<td align="center">90.41</td>
<td align="center">90.42</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/vision_transformer/vit-base-p16_ft-64xb64_in1k-384.py">Vision Transformer</a></td>
<td align="center">top-1</td>
<td align="center">85.43</td>
<td align="center">85.43</td>
<td align="center">-</td>
<td align="center">85.43</td>
<td align="center">85.42</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">85.43</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">97.77</td>
<td align="center">97.77</td>
<td align="center">-</td>
<td align="center">97.77</td>
<td align="center">97.76</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">97.77</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/swin_transformer/swin-tiny_16xb64_in1k.py">Swin Transformer</a></td>
<td align="center">top-1</td>
<td align="center">81.18</td>
<td align="center">81.18</td>
<td align="center">81.18</td>
<td align="center">81.18</td>
<td align="center">81.18</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">95.61</td>
<td align="center">95.61</td>
<td align="center">95.61</td>
<td align="center">95.61</td>
<td align="center">95.61</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpretrain/blob/main/configs/efficientformer/efficientformer-l1_8xb128_in1k.py">EfficientFormer</a></td>
<td align="center">top-1</td>
<td align="center">80.46</td>
<td align="center">80.45</td>
<td align="center">80.46</td>
<td align="center">80.46</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">94.99</td>
<td align="center">94.98</td>
<td align="center">94.99</td>
<td align="center">94.99</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="4">mmdet</th>
<th align="center">PyTorch</th>
<th align="center">TorchScript</th>
<th align="center">ONNX Runtime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
<th align="center">OpenVINO</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">model</td>
<td align="center">task</td>
<td align="center">dataset</td>
<td align="center">metric</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo/yolov3_d53_320_273e_coco.py">YOLOV3</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">33.7</td>
<td align="center">33.7</td>
<td align="center">-</td>
<td align="center">33.5</td>
<td align="center">33.5</td>
<td align="center">33.5</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/ssd/ssd300_coco.py">SSD</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">25.5</td>
<td align="center">25.5</td>
<td align="center">-</td>
<td align="center">25.5</td>
<td align="center">25.5</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/retinanet/retinanet_r50_fpn_1x_coco.py">RetinaNet</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">36.5</td>
<td align="center">36.4</td>
<td align="center">-</td>
<td align="center">36.4</td>
<td align="center">36.4</td>
<td align="center">36.3</td>
<td align="center">36.5</td>
<td align="center">36.4</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py">FCOS</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">36.6</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">36.6</td>
<td align="center">36.5</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/fsaf/fsaf_r50_fpn_1x_coco.py">FSAF</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">37.4</td>
<td align="center">37.4</td>
<td align="center">-</td>
<td align="center">37.4</td>
<td align="center">37.4</td>
<td align="center">37.2</td>
<td align="center">37.4</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/3.x/configs/centernet/centernet_r18_8xb16-crop512-140e_coco.py">CenterNet</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">25.9</td>
<td align="center">26.0</td>
<td align="center">26.0</td>
<td align="center">26.0</td>
<td align="center">25.8</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox/yolox_s_8x8_300e_coco.py">YOLOX</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">40.5</td>
<td align="center">40.3</td>
<td align="center">-</td>
<td align="center">40.3</td>
<td align="center">40.3</td>
<td align="center">29.3</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py">Faster R-CNN</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">37.4</td>
<td align="center">37.3</td>
<td align="center">-</td>
<td align="center">37.3</td>
<td align="center">37.3</td>
<td align="center">37.1</td>
<td align="center">37.3</td>
<td align="center">37.2</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/atss/atss_r50_fpn_1x_coco.py">ATSS</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">39.4</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">39.4</td>
<td align="center">39.4</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco.py">Cascade R-CNN</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">40.4</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">40.4</td>
<td align="center">40.4</td>
<td align="center">-</td>
<td align="center">40.4</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/gfl/gfl_r50_fpn_1x_coco.py">GFL</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">40.2</td>
<td align="center">-</td>
<td align="center">40.2</td>
<td align="center">40.2</td>
<td align="center">40.0</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/reppoints/reppoints_moment_r50_fpn_1x_coco.py">RepPoints</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">37.0</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">36.9</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/detr/detr_r50_8x2_150e_coco.py">DETR</a></td>
<td align="center">Object Detection</td>
<td align="center">COCO2017</td>
<td align="center">box AP</td>
<td align="center">40.1</td>
<td align="center">40.1</td>
<td align="center">-</td>
<td align="center">40.1</td>
<td align="center">40.1</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py">Mask R-CNN</a></td>
<td align="center" rowspan="2">Instance Segmentation</td>
<td align="center" rowspan="2">COCO2017</td>
<td align="center">box AP</td>
<td align="center">38.2</td>
<td align="center">38.1</td>
<td align="center">-</td>
<td align="center">38.1</td>
<td align="center">38.1</td>
<td align="center">-</td>
<td align="center">38.0</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">mask AP</td>
<td align="center">34.7</td>
<td align="center">34.7</td>
<td align="center">-</td>
<td align="center">33.7</td>
<td align="center">33.7</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmdetection/blob/master/configs/swin/mask_rcnn_swin-t-p4-w7_fpn_1x_coco.py">Swin-Transformer</a></td>
<td align="center" rowspan="2">Instance Segmentation</td>
<td align="center" rowspan="2">COCO2017</td>
<td align="center">box AP</td>
<td align="center">42.7</td>
<td align="center">-</td>
<td align="center">42.7</td>
<td align="center">42.5</td>
<td align="center">37.7</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">mask AP</td>
<td align="center">39.3</td>
<td align="center">-</td>
<td align="center">39.3</td>
<td align="center">39.3</td>
<td align="center">35.4</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/3.x/configs/solo/solo_r50_fpn_1x_coco.py">SOLO</a></td>
<td align="center">Instance Segmentation</td>
<td align="center">COCO2017</td>
<td align="center">mask AP</td>
<td align="center">33.1</td>
<td align="center">-</td>
<td align="center">32.7</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">32.7</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/3.x/configs/solov2/solov2_r50_fpn_1x_coco.py">SOLOv2</a></td>
<td align="center">Instance Segmentation</td>
<td align="center">COCO2017</td>
<td align="center">mask AP</td>
<td align="center">34.8</td>
<td align="center">-</td>
<td align="center">34.5</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">34.5</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="4">mmagic</th>
<th align="center">PyTorch</th>
<th align="center">TorchScript</th>
<th align="center">ONNX Runtime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">model</td>
<td align="center">task</td>
<td align="center">dataset</td>
<td align="center">metric</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/srcnn/srcnn_x4k915_1xb16-1000k_div2k.py">SRCNN</a></td>
<td align="center" rowspan="2">Super Resolution</td>
<td align="center" rowspan="2">Set5</td>
<td align="center">PSNR</td>
<td align="center">28.4316</td>
<td align="center">28.4120</td>
<td align="center">28.4323</td>
<td align="center">28.4323</td>
<td align="center">28.4286</td>
<td align="center">28.1995</td>
<td align="center">28.4311</td>
</tr>
<tr>
<td align="center">SSIM</td>
<td align="center">0.8099</td>
<td align="center">0.8106</td>
<td align="center">0.8097</td>
<td align="center">0.8097</td>
<td align="center">0.8096</td>
<td align="center">0.7934</td>
<td align="center">0.8096</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/esrgan/esrgan_x4c64b23g32_1xb16-400k_div2k.py">ESRGAN</a></td>
<td align="center" rowspan="2">Super Resolution</td>
<td align="center" rowspan="2">Set5</td>
<td align="center">PSNR</td>
<td align="center">28.2700</td>
<td align="center">28.2619</td>
<td align="center">28.2592</td>
<td align="center">28.2592</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">28.2624</td>
</tr>
<tr>
<td align="center">SSIM</td>
<td align="center">0.7778</td>
<td align="center">0.7784</td>
<td align="center">0.7764</td>
<td align="center">0.7774</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">0.7765</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/esrgan/esrgan_psnr-x4c64b23g32_1xb16-1000k_div2k.py">ESRGAN-PSNR</a></td>
<td align="center" rowspan="2">Super Resolution</td>
<td align="center" rowspan="2">Set5</td>
<td align="center">PSNR</td>
<td align="center">30.6428</td>
<td align="center">30.6306</td>
<td align="center">30.6444</td>
<td align="center">30.6430</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">27.0426</td>
</tr>
<tr>
<td align="center">SSIM</td>
<td align="center">0.8559</td>
<td align="center">0.8565</td>
<td align="center">0.8558</td>
<td align="center">0.8558</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">0.8557</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/srgan_resnet/srgan_x4c64b16_1xb16-1000k_div2k.py">SRGAN</a></td>
<td align="center" rowspan="2">Super Resolution</td>
<td align="center" rowspan="2">Set5</td>
<td align="center">PSNR</td>
<td align="center">27.9499</td>
<td align="center">27.9252</td>
<td align="center">27.9408</td>
<td align="center">27.9408</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">27.9388</td>
</tr>
<tr>
<td align="center">SSIM</td>
<td align="center">0.7846</td>
<td align="center">0.7851</td>
<td align="center">0.7839</td>
<td align="center">0.7839</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">0.7839</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/srgan_resnet/msrresnet_x4c64b16_1xb16-1000k_div2k.py">SRResNet</a></td>
<td align="center" rowspan="2">Super Resolution</td>
<td align="center" rowspan="2">Set5</td>
<td align="center">PSNR</td>
<td align="center">30.2252</td>
<td align="center">30.2069</td>
<td align="center">30.2300</td>
<td align="center">30.2300</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">30.2294</td>
</tr>
<tr>
<td align="center">SSIM</td>
<td align="center">0.8491</td>
<td align="center">0.8497</td>
<td align="center">0.8488</td>
<td align="center">0.8488</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">0.8488</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/real_esrgan/realesrnet_c64b23g32_4xb12-lr2e-4-1000k_df2k-ost.py">Real-ESRNet</a></td>
<td align="center" rowspan="2">Super Resolution</td>
<td align="center" rowspan="2">Set5</td>
<td align="center">PSNR</td>
<td align="center">28.0297</td>
<td align="center">-</td>
<td align="center">27.7016</td>
<td align="center">27.7016</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">27.7049</td>
</tr>
<tr>
<td align="center">SSIM</td>
<td align="center">0.8236</td>
<td align="center">-</td>
<td align="center">0.8122</td>
<td align="center">0.8122</td>
<td align="center"> - </td>
<td align="center"> - </td>
<td align="center">0.8123</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmagic/blob/main/configs/edsr/edsr_x4c64b16_1xb16-300k_div2k.py">EDSR</a></td>
<td align="center" rowspan="2">Super Resolution</td>
<td align="center" rowspan="2">Set5</td>
<td align="center">PSNR</td>
<td align="center">30.2223</td>
<td align="center">30.2192</td>
<td align="center">30.2214</td>
<td align="center">30.2214</td>
<td align="center">30.2211</td>
<td align="center">30.1383</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">SSIM</td>
<td align="center">0.8500</td>
<td align="center">0.8507</td>
<td align="center">0.8497</td>
<td align="center">0.8497</td>
<td align="center">0.8497</td>
<td align="center">0.8469</td>
<td align="center"> - </td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="4">mmocr</th>
<th align="center">PyTorch</th>
<th align="center">TorchScript</th>
<th align="center">ONNX Runtime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">OpenVINO</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">model</td>
<td align="center">task</td>
<td align="center">dataset</td>
<td align="center">metric</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center" rowspan="3"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py">DBNet*</a></td>
<td align="center" rowspan="3">TextDetection</td>
<td align="center" rowspan="3">ICDAR2015</td>
<td align="center">recall</td>
<td align="center">0.7310</td>
<td align="center">0.7308</td>
<td align="center">0.7304</td>
<td align="center">0.7198</td>
<td align="center">0.7179</td>
<td align="center">0.7111</td>
<td align="center">0.7304</td>
<td align="center">0.7309</td>
</tr>
<tr>
<td align="center">precision</td>
<td align="center">0.8714</td>
<td align="center">0.8718</td>
<td align="center">0.8714</td>
<td align="center">0.8677</td>
<td align="center">0.8674</td>
<td align="center">0.8688</td>
<td align="center">0.8718</td>
<td align="center">0.8714</td>
</tr>
<tr>
<td align="center">hmean</td>
<td align="center">0.7950</td>
<td align="center">0.7949</td>
<td align="center">0.7950</td>
<td align="center">0.7868</td>
<td align="center">0.7856</td>
<td align="center">0.7821</td>
<td align="center">0.7949</td>
<td align="center">0.7950</td>
</tr>
<tr>
<td align="center" rowspan="3"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnetpp/dbnetpp_resnet50_fpnc_1200e_icdar2015.py">DBNetpp</a></td>
<td align="center" rowspan="3">TextDetection</td>
<td align="center" rowspan="3">ICDAR2015</td>
<td align="center">recall</td>
<td align="center">0.8209</td>
<td align="center">0.8209</td>
<td align="center">0.8209</td>
<td align="center">0.8199</td>
<td align="center">0.8204</td>
<td align="center">0.8204</td>
<td align="center">-</td>
<td align="center">0.8209</td>
</tr>
<tr>
<td align="center">precision</td>
<td align="center">0.9079</td>
<td align="center">0.9079</td>
<td align="center">0.9079</td>
<td align="center">0.9117</td>
<td align="center">0.9117</td>
<td align="center">0.9142</td>
<td align="center">-</td>
<td align="center">0.9079</td>
</tr>
<tr>
<td align="center">hmean</td>
<td align="center">0.8622</td>
<td align="center">0.8622</td>
<td align="center">0.8622</td>
<td align="center">0.8634</td>
<td align="center">0.8637</td>
<td align="center">0.8648</td>
<td align="center">-</td>
<td align="center">0.8622</td>
</tr>
<tr>
<td align="center" rowspan="3"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/psenet/psenet_resnet50_fpnf_600e_icdar2015.py">PSENet</a></td>
<td align="center" rowspan="3">TextDetection</td>
<td align="center" rowspan="3">ICDAR2015</td>
<td align="center">recall</td>
<td align="center">0.7526</td>
<td align="center">0.7526</td>
<td align="center">0.7526</td>
<td align="center">0.7526</td>
<td align="center">0.7520</td>
<td align="center">0.7496</td>
<td align="center">-</td>
<td align="center">0.7526</td>
</tr>
<tr>
<td align="center">precision</td>
<td align="center">0.8669</td>
<td align="center">0.8669</td>
<td align="center">0.8669</td>
<td align="center">0.8669</td>
<td align="center">0.8668</td>
<td align="center">0.8550</td>
<td align="center">-</td>
<td align="center">0.8669</td>
</tr>
<tr>
<td align="center">hmean</td>
<td align="center">0.8057</td>
<td align="center">0.8057</td>
<td align="center">0.8057</td>
<td align="center">0.8057</td>
<td align="center">0.8054</td>
<td align="center">0.7989</td>
<td align="center">-</td>
<td align="center">0.8057</td>
</tr>
<tr>
<td align="center" rowspan="3"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_resnet18_fpem-ffm_600e_icdar2015.py">PANet</a></td>
<td align="center" rowspan="3">TextDetection</td>
<td align="center" rowspan="3">ICDAR2015</td>
<td align="center">recall</td>
<td align="center">0.7401</td>
<td align="center">0.7401</td>
<td align="center">0.7401</td>
<td align="center">0.7357</td>
<td align="center">0.7366</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">0.7401</td>
</tr>
<tr>
<td align="center">precision</td>
<td align="center">0.8601</td>
<td align="center">0.8601</td>
<td align="center">0.8601</td>
<td align="center">0.8570</td>
<td align="center">0.8586</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">0.8601</td>
</tr>
<tr>
<td align="center">hmean</td>
<td align="center">0.7955</td>
<td align="center">0.7955</td>
<td align="center">0.7955</td>
<td align="center">0.7917</td>
<td align="center">0.7930</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">0.7955</td>
</tr>
<tr>
<td align="center" rowspan="3"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/textsnake/textsnake_resnet50_fpn-unet_1200e_ctw1500.py">TextSnake</a></td>
<td align="center" rowspan="3">TextDetection</td>
<td align="center" rowspan="3">CTW1500</td>
<td align="center">recall</td>
<td align="center">0.8052</td>
<td align="center">0.8052</td>
<td align="center">0.8052</td>
<td align="center">0.8055</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">precision</td>
<td align="center">0.8535</td>
<td align="center">0.8535</td>
<td align="center">0.8535</td>
<td align="center">0.8538</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">hmean</td>
<td align="center">0.8286</td>
<td align="center">0.8286</td>
<td align="center">0.8286</td>
<td align="center">0.8290</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="3"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_icdar2015.py">MaskRCNN</a></td>
<td align="center" rowspan="3">TextDetection</td>
<td align="center" rowspan="3">ICDAR2015</td>
<td align="center">recall</td>
<td align="center">0.7766</td>
<td align="center">0.7766</td>
<td align="center">0.7766</td>
<td align="center">0.7766</td>
<td align="center">0.7761</td>
<td align="center">0.7670</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">precision</td>
<td align="center">0.8644</td>
<td align="center">0.8644</td>
<td align="center">0.8644</td>
<td align="center">0.8644</td>
<td align="center">0.8630</td>
<td align="center">0.8705</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">hmean</td>
<td align="center">0.8182</td>
<td align="center">0.8182</td>
<td align="center">0.8182</td>
<td align="center">0.8182</td>
<td align="center">0.8172</td>
<td align="center">0.8155</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/crnn/crnn_mini-vgg_5e_mj.py">CRNN</a></td>
<td align="center">TextRecognition</td>
<td align="center">IIIT5K</td>
<td align="center">acc</td>
<td align="center">0.8067</td>
<td align="center">0.8067</td>
<td align="center">0.8067</td>
<td align="center">0.8067</td>
<td align="center">0.8063</td>
<td align="center">0.8067</td>
<td align="center">0.8067</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/sar/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real.py">SAR</a></td>
<td align="center">TextRecognition</td>
<td align="center">IIIT5K</td>
<td align="center">acc</td>
<td align="center">0.9517</td>
<td align="center">-</td>
<td align="center">0.9287</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/satrn/satrn_shallow-small_5e_st_mj.py">SATRN</a></td>
<td align="center">TextRecognition</td>
<td align="center">IIIT5K</td>
<td align="center">acc</td>
<td align="center">0.9470</td>
<td align="center">0.9487</td>
<td align="center">0.9487</td>
<td align="center">0.9487</td>
<td align="center">0.9483</td>
<td align="center">0.9483</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/abinet/abinet_20e_st-an_mj.py">ABINet</a></td>
<td align="center">TextRecognition</td>
<td align="center">IIIT5K</td>
<td align="center">acc</td>
<td align="center">0.9603</td>
<td align="center">0.9563</td>
<td align="center">0.9563</td>
<td align="center">0.9573</td>
<td align="center">0.9507</td>
<td align="center">0.9510</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="3">mmseg</th>
<th align="center">Pytorch</th>
<th align="center">TorchScript</th>
<th align="center">ONNXRuntime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">model</td>
<td align="center">dataset</td>
<td align="center">metric</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fcn/fcn_r50-d8_4xb2-40k_cityscapes-512x1024.py">FCN</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">72.25</td>
<td align="center">72.36</td>
<td align="center">-</td>
<td align="center">72.36</td>
<td align="center">72.35</td>
<td align="center">74.19</td>
<td align="center">72.35</td>
<td align="center">72.35</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/pspnet/pspnet_r50-d8_4xb2-80k_cityscapes-512x1024.py">PSPNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">78.55</td>
<td align="center">78.66</td>
<td align="center">-</td>
<td align="center">78.26</td>
<td align="center">78.24</td>
<td align="center">77.97</td>
<td align="center">78.09</td>
<td align="center">78.67</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3/deeplabv3_r50-d8_4xb2-40k_cityscapes-512x1024.py">deeplabv3</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">79.09</td>
<td align="center">79.12</td>
<td align="center">-</td>
<td align="center">79.12</td>
<td align="center">79.12</td>
<td align="center">78.96</td>
<td align="center">79.12</td>
<td align="center">79.06</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3plus/deeplabv3plus_r50-d8_4xb2-40k_cityscapes-512x1024.py">deeplabv3+</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">79.61</td>
<td align="center">79.60</td>
<td align="center">-</td>
<td align="center">79.60</td>
<td align="center">79.60</td>
<td align="center">79.43</td>
<td align="center">79.60</td>
<td align="center">79.51</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fastscnn/fast_scnn_8xb4-160k_cityscapes-512x1024.py">Fast-SCNN</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">70.96</td>
<td align="center">70.96</td>
<td align="center">-</td>
<td align="center">70.93</td>
<td align="center">70.92</td>
<td align="center">66.00</td>
<td align="center">70.92</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/unet/unet-s5-d16_fcn_4xb4-160k_cityscapes-512x1024.py">UNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">69.10</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">69.10</td>
<td align="center">69.10</td>
<td align="center">68.95</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ann/ann_r50-d8_4xb2-40k_cityscapes-512x1024.py">ANN</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">77.40</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">77.32</td>
<td align="center">77.32</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/apcnet/apcnet_r50-d8_4xb2-40k_cityscapes-512x1024.py">APCNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">77.40</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">77.32</td>
<td align="center">77.32</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/bisenetv1/bisenetv1_r18-d32_4xb4-160k_cityscapes-1024x1024.py">BiSeNetV1</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">74.44</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">74.44</td>
<td align="center">74.43</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/bisenetv2/bisenetv2_fcn_4xb4-160k_cityscapes-1024x1024.py">BiSeNetV2</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">73.21</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">73.21</td>
<td align="center">73.21</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/cgnet/cgnet_fcn_4xb8-60k_cityscapes-512x1024.py">CGNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">68.25</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">68.27</td>
<td align="center">68.27</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/emanet/emanet_r50-d8_4xb2-80k_cityscapes-512x1024.py">EMANet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">77.59</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">77.59</td>
<td align="center">77.6</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/encnet/encnet_r50-d8_4xb2-40k_cityscapes-512x1024.py">EncNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">75.67</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">75.66</td>
<td align="center">75.66</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/erfnet/erfnet_fcn_4xb4-160k_cityscapes-512x1024.py">ERFNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">71.08</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">71.08</td>
<td align="center">71.07</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fastfcn/fastfcn_r50-d32_jpu_aspp_4xb2-80k_cityscapes-512x1024.py">FastFCN</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">79.12</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">79.12</td>
<td align="center">79.12</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/gcnet/gcnet_r50-d8_4xb2-40k_cityscapes-512x1024.py">GCNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">77.69</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">77.69</td>
<td align="center">77.69</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/icnet/icnet_r18-d8_4xb2-80k_cityscapes-832x832.py">ICNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">76.29</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">76.36</td>
<td align="center">76.36</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/isanet/isanet_r50-d8_4xb2-40k_cityscapes-512x1024.py">ISANet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">78.49</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">78.49</td>
<td align="center">78.49</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ocrnet/ocrnet_hr18s_4xb2-40k_cityscapes-512x1024.py">OCRNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">74.30</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">73.66</td>
<td align="center">73.67</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/point_rend/pointrend_r50_4xb2-80k_cityscapes-512x1024.py">PointRend</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">76.47</td>
<td align="center">76.47</td>
<td align="center">-</td>
<td align="center">76.41</td>
<td align="center">76.42</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/sem_fpn/fpn_r50_4xb2-80k_cityscapes-512x1024.py">Semantic FPN</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">74.52</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">74.52</td>
<td align="center">74.52</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/stdc/stdc1_in1k-pre_4xb12-80k_cityscapes-512x1024.py">STDC</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">75.10</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">75.10</td>
<td align="center">75.10</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/stdc/stdc2_in1k-pre_4xb12-80k_cityscapes-512x1024.py">STDC</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">77.17</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">77.17</td>
<td align="center">77.17</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/main/configs/upernet/upernet_r50_4xb2-40k_cityscapes-512x1024.py">UPerNet</a></td>
<td align="center">Cityscapes</td>
<td align="center">mIoU</td>
<td align="center">77.10</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">77.19</td>
<td align="center">77.18</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/blob/main/configs/segmenter/segmenter_vit-s_fcn_8xb1-160k_ade20k-512x512.py">Segmenter</a></td>
<td align="center">ADE20K</td>
<td align="center">mIoU</td>
<td align="center">44.32</td>
<td align="center">44.29</td>
<td align="center">44.29</td>
<td align="center">44.29</td>
<td align="center">43.34</td>
<td align="center">43.35</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="4">mmpose</th>
<th align="center">Pytorch</th>
<th align="center">ONNXRuntime</th>
<th align="center" colspan="2">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">OpenVINO</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">model</td>
<td align="center">task</td>
<td align="center">dataset</td>
<td align="center">metric</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp16</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpose/blob/main/configs/body_2d_keypoint/topdown_heatmap/coco/td-hm_hrnet-w48_8xb32-210e_coco-256x192.py">HRNet</a></td>
<td align="center" rowspan="2">Pose Detection</td>
<td align="center" rowspan="2">COCO</td>
<td align="center">AP</td>
<td align="center">0.748</td>
<td align="center">0.748</td>
<td align="center">0.748</td>
<td align="center">0.748</td>
<td align="center">-</td>
<td align="center">0.748</td>
</tr>
<tr>
<td align="center">AR</td>
<td align="center">0.802</td>
<td align="center">0.802</td>
<td align="center">0.802</td>
<td align="center">0.802</td>
<td align="center">-</td>
<td align="center">0.802</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpose/blob/main/configs/body_2d_keypoint/topdown_heatmap/coco/td-hm_litehrnet-30_8xb64-210e_coco-256x192.py">LiteHRNet</a></td>
<td align="center" rowspan="2">Pose Detection</td>
<td align="center" rowspan="2">COCO</td>
<td align="center">AP</td>
<td align="center">0.663</td>
<td align="center">0.663</td>
<td align="center">0.663</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">0.663</td>
</tr>
<tr>
<td align="center">AR</td>
<td align="center">0.728</td>
<td align="center">0.728</td>
<td align="center">0.728</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">0.728</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpose/blob/main/configs/body_2d_keypoint/topdown_heatmap/coco/td-hm_4xmspn50_8xb32-210e_coco-256x192.py">MSPN</a></td>
<td align="center" rowspan="2">Pose Detection</td>
<td align="center" rowspan="2">COCO</td>
<td align="center">AP</td>
<td align="center">0.762</td>
<td align="center">0.762</td>
<td align="center">0.762</td>
<td align="center">0.762</td>
<td align="center">-</td>
<td align="center">0.762</td>
</tr>
<tr>
<td align="center">AR</td>
<td align="center">0.825</td>
<td align="center">0.825</td>
<td align="center">0.825</td>
<td align="center">0.825</td>
<td align="center">-</td>
<td align="center">0.825</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpose/blob/main/configs/body_2d_keypoint/topdown_heatmap/coco/td-hm_hourglass52_8xb32-210e_coco-256x256.py">Hourglass</a></td>
<td align="center" rowspan="2">Pose Detection</td>
<td align="center" rowspan="2">COCO</td>
<td align="center">AP</td>
<td align="center">0.717</td>
<td align="center">0.717</td>
<td align="center">0.717</td>
<td align="center">0.717</td>
<td align="center">-</td>
<td align="center">0.717</td>
</tr>
<tr>
<td align="center">AR</td>
<td align="center">0.774</td>
<td align="center">0.774</td>
<td align="center">0.774</td>
<td align="center">0.774</td>
<td align="center">-</td>
<td align="center">0.774</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmpose/blob/main/configs/body_2d_keypoint/simcc/coco/simcc_mobilenetv2_wo-deconv-8xb64-210e_coco-256x192.py">SimCC</a></td>
<td align="center" rowspan="2">Pose Detection</td>
<td align="center" rowspan="2">COCO</td>
<td align="center">AP</td>
<td align="center">0.607</td>
<td align="center">-</td>
<td align="center">0.608</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">AR</td>
<td align="center">0.668</td>
<td align="center">-</td>
<td align="center">0.672</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="4">mmrotate</th>
<th align="center">Pytorch</th>
<th align="center">ONNXRuntime</th>
<th align="center" colspan="2">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">OpenVINO</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">model</td>
<td align="center">task</td>
<td align="center">dataset</td>
<td align="center">metrics</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp16</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmrotate/tree/main/configs/rotated_retinanet/rotated-retinanet-hbox-oc_r50_fpn_1x_dota.py">RotatedRetinaNet</a></td>
<td align="center">Rotated Detection</td>
<td align="center">DOTA-v1.0</td>
<td align="center">mAP</td>
<td align="center">0.698</td>
<td align="center">0.698</td>
<td align="center">0.698</td>
<td align="center">0.697</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmrotate/tree/main/configs/oriented_rcnn/oriented-rcnn-le90_r50_fpn_1x_dota.py">Oriented RCNN</a></td>
<td align="center">Rotated Detection</td>
<td align="center">DOTA-v1.0</td>
<td align="center">mAP</td>
<td align="center">0.756</td>
<td align="center">0.756</td>
<td align="center">0.758</td>
<td align="center">0.730</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmrotate/blob/main/configs/gliding_vertex/gliding-vertex-rbox_r50_fpn_1x_dota.py">GlidingVertex</a></td>
<td align="center">Rotated Detection</td>
<td align="center">DOTA-v1.0</td>
<td align="center">mAP</td>
<td align="center">0.732</td>
<td align="center">-</td>
<td align="center">0.733</td>
<td align="center">0.731</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmrotate/blob/main/configs/roi_trans/roi-trans-le90_r50_fpn_1x_dota.py">RoI Transformer</a></td>
<td align="center">Rotated Detection</td>
<td align="center">DOTA-v1.0</td>
<td align="center">mAP</td>
<td align="center">0.761</td>
<td align="center">-</td>
<td align="center">0.758</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
</div>
<div style="margin-left: 25px;">
<table class="docutils">
<thead>
<tr>
<th align="center" colspan="4">mmaction2</th>
<th align="center">Pytorch</th>
<th align="center">ONNXRuntime</th>
<th align="center" colspan="2">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">OpenVINO</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">model</td>
<td align="center">task</td>
<td align="center">dataset</td>
<td align="center">metrics</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp32</td>
<td align="center">fp16</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmaction2/blob/main/configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py">TSN</a></td>
<td align="center" rowspan="2">Recognition</td>
<td align="center" rowspan="2">Kinetics-400</td>
<td align="center">top-1</td>
<td align="center">69.71</td>
<td align="center">-</td>
<td align="center">69.71</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">88.75</td>
<td align="center">-</td>
<td align="center">88.75</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmaction2/blob/main/configs/recognition/slowfast/slowfast_r50_8xb8-4x16x1-256e_kinetics400-rgb.py">SlowFast</a></td>
<td align="center" rowspan="2">Recognition</td>
<td align="center" rowspan="2">Kinetics-400</td>
<td align="center">top-1</td>
<td align="center">74.45</td>
<td align="center">-</td>
<td align="center">75.62</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
<td align="center">91.55</td>
<td align="center">-</td>
<td align="center">92.10</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
</div>
## Notes
- Since some datasets, e.g. those used by MMDet, contain images of various resolutions, the speed benchmark is obtained with static configs in MMDeploy, while the accuracy benchmark is obtained with dynamic ones.
- Some TensorRT int8 benchmarks require an NVIDIA GPU with Tensor Cores; otherwise performance drops heavily.
- DBNet uses the interpolate mode `nearest` in its neck, for which TensorRT-7 applies a strategy quite different from PyTorch's. To keep the repository compatible with TensorRT-7, we rewrite the neck to use the interpolate mode `bilinear`, which improves final detection performance. To match PyTorch's performance exactly, TensorRT-8+ is recommended, as its interpolate methods are identical to PyTorch's.
- The mask AP of Mask R-CNN drops by about 1% on the backends. The main reason is that in PyTorch the predicted masks are interpolated directly to the original image, while in the other backends they are first interpolated to the model's preprocessed input size and only then to the original image.
- MMPose models are tested with `flip_test` explicitly set to `False` in model configs.
- Some models might get low accuracy in fp16 mode. Please adjust the model to avoid value overflow.
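The fp16 overflow caveat above can be seen with plain Python: IEEE-754 half precision tops out at 65504, so any intermediate value beyond that range becomes infinite. The helper below is an illustrative sketch (using the stdlib `struct` half-precision format), not part of MMDeploy.

```python
import struct

def to_fp16(x):
    """Round-trip a Python float through IEEE-754 half precision ('e' format)."""
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except OverflowError:
        # struct refuses finite values outside the fp16 range; in real
        # fp16 hardware arithmetic they saturate to infinity.
        return float('inf') if x > 0 else float('-inf')

print(to_fp16(1234.5))   # 1234.0 — fp16 keeps only ~3 decimal digits
print(to_fp16(70000.0))  # inf — fp16 max is 65504, the value overflows
```

Layers that produce large unnormalized activations are the usual culprits, which is why rescaling or keeping a few layers in fp32 can restore accuracy.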
# Test on embedded device
Here are the benchmark results on our edge devices. You can obtain results for your own environment with [model profiling](../02-how-to-run/profile_model.md).
## Software and hardware environment
- host OS: Ubuntu 18.04
- backend: SNPE-1.59
- device: Mi 11 (Qualcomm Snapdragon 888)
## mmpretrain
| model | dataset | spatial | fp32 top-1 (%) | snpe gpu hybrid fp32 top-1 (%) | latency (ms) |
| :----------------------------------------------------------------------------------------------------------------------: | :---------: | :-----: | :------------: | :----------------------------: | :----------: |
| [ShuffleNetV2](https://github.com/open-mmlab/mmpretrain/blob/main/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py) | ImageNet-1k | 224x224 | 69.55 | 69.83\* | 20±7 |
| [MobilenetV2](https://github.com/open-mmlab/mmpretrain/blob/main/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | ImageNet-1k | 224x224 | 71.86 | 72.14\* | 15±6 |
Tips:
1. The full ImageNet-1k dataset is too large to test, so only a subset (8000 of 50000 images) is used.
2. Device heating throttles the clock frequency, so latency fluctuates in practice. The values above are the stable ones after the device has run for a while, which is closer to real-world usage.
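The `mean±spread` latency figures in the table can be produced with a few lines of Python; the per-frame latencies below are made-up illustration values, collected (hypothetically) after a warm-up period once the device clock has settled.

```python
import statistics

# Hypothetical per-frame latencies in ms, recorded after warm-up.
latencies = [18.9, 21.2, 27.4, 15.1, 19.8, 22.6, 16.3, 20.7]

mean = statistics.mean(latencies)
spread = statistics.stdev(latencies)  # sample standard deviation
print(f"{mean:.0f}±{spread:.0f} ms")  # prints: 20±4 ms
```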
## mmocr detection
| model | dataset | spatial | fp32 hmean | snpe gpu hybrid hmean | latency(ms) |
| :--------------------------------------------------------------------------------------------------------------------: | :-------: | :------: | :--------: | :-------------------: | :---------: |
| [PANet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_resnet18_fpem-ffm_600e_icdar2015.py) | ICDAR2015 | 1312x736 | 0.795 | 0.785 @thr=0.9 | 3100±100 |
## mmpose
| model | dataset | spatial | snpe hybrid AR@IoU=0.50 | snpe hybrid AP@IoU=0.50 | latency(ms) |
| :---------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :--------: | :-----: | :---------------------: | :---------------------: | :---------: |
| [pose_hrnet_w32](https://github.com/open-mmlab/mmpose/blob/main/configs/animal_2d_keypoint/topdown_heatmap/animalpose/td-hm_hrnet-w32_8xb64-210e_animalpose-256x256.py) | Animalpose | 256x256 | 0.997 | 0.989 | 630±50 |
Tips:
- `pose_hrnet` is tested on AnimalPose's test split instead of the val split.
## mmseg
| model | dataset | spatial | mIoU | latency(ms) |
| :------------------------------------------------------------------------------------------------------------------: | :--------: | :------: | :---: | :---------: |
| [fcn](https://github.com/open-mmlab/mmsegmentation/blob/main/configs/fcn/fcn_r18-d8_4xb2-80k_cityscapes-512x1024.py) | Cityscapes | 512x1024 | 71.11 | 4915±500 |
Tips:
- `fcn` works fine at 512x1024. The native Cityscapes resolution of 1024x2048 causes the device to reboot.
## Notes
- We need to manually split the mmdet model into two parts, because
  - in the snpe source code, `onnx_to_ir.py` can only parse ONNX input, while `ir_to_dlc.py` does not support the `topk` operator
  - UDO (User Defined Operator) does not work with `snpe-onnx-to-dlc`
- mmagic models
  - `srcnn` requires cubic resize, which snpe does not support
  - `esrgan` converts fine, but loading the model causes the device to reboot
- mmrotate depends on [e2cnn](https://pypi.org/project/e2cnn/); install [its Python 3.6 compatible branch](https://github.com/QUVA-Lab/e2cnn) manually
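The model-splitting workaround in the first note boils down to cutting the graph just before the first operator the backend cannot run, and keeping the remainder for the host. The sketch below illustrates only that idea; the op names and the `UNSUPPORTED` set are hypothetical, not a real mmdet graph or the snpe converter's logic.

```python
# Ops the (hypothetical) target backend cannot handle, e.g. TopK for snpe.
UNSUPPORTED = {"TopK"}

def split_at_unsupported(nodes):
    """Split a topologically sorted op list before the first unsupported op.

    Returns (device_part, host_part): everything up to the cut runs on the
    accelerator, the rest falls back to the host.
    """
    for i, op in enumerate(nodes):
        if op in UNSUPPORTED:
            return nodes[:i], nodes[i:]
    return nodes, []

graph = ["Conv", "Relu", "Conv", "TopK", "Gather", "Concat"]
on_device, on_host = split_at_unsupported(graph)
print(on_device)  # ['Conv', 'Relu', 'Conv']
print(on_host)    # ['TopK', 'Gather', 'Concat']
```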
# Test on TVM
## Supported Models
| Model | Codebase | Model config |
| :---------------- | :------------- | :-------------------------------------------------------------------------------------: |
| RetinaNet | MMDetection | [config](https://github.com/open-mmlab/mmdetection/tree/main/configs/retinanet) |
| Faster R-CNN | MMDetection | [config](https://github.com/open-mmlab/mmdetection/tree/main/configs/faster_rcnn) |
| YOLOv3 | MMDetection | [config](https://github.com/open-mmlab/mmdetection/tree/main/configs/yolo) |
| YOLOX | MMDetection | [config](https://github.com/open-mmlab/mmdetection/tree/main/configs/yolox) |
| Mask R-CNN | MMDetection | [config](https://github.com/open-mmlab/mmdetection/tree/main/configs/mask_rcnn) |
| SSD | MMDetection | [config](https://github.com/open-mmlab/mmdetection/tree/main/configs/ssd) |
| ResNet | MMPretrain | [config](https://github.com/open-mmlab/mmpretrain/tree/main/configs/resnet) |
| ResNeXt | MMPretrain | [config](https://github.com/open-mmlab/mmpretrain/tree/main/configs/resnext) |
| SE-ResNet | MMPretrain | [config](https://github.com/open-mmlab/mmpretrain/tree/main/configs/seresnet) |
| MobileNetV2 | MMPretrain | [config](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobilenet_v2) |
| ShuffleNetV1 | MMPretrain | [config](https://github.com/open-mmlab/mmpretrain/tree/main/configs/shufflenet_v1) |
| ShuffleNetV2 | MMPretrain | [config](https://github.com/open-mmlab/mmpretrain/tree/main/configs/shufflenet_v2) |
| VisionTransformer | MMPretrain | [config](https://github.com/open-mmlab/mmpretrain/tree/main/configs/vision_transformer) |
| FCN | MMSegmentation | [config](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fcn) |
| PSPNet | MMSegmentation | [config](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/pspnet) |
| DeepLabV3 | MMSegmentation | [config](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3) |
| DeepLabV3+ | MMSegmentation | [config](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3plus) |
| UNet | MMSegmentation | [config](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/unet) |
The table above lists the models we have tested. Models not in the table may still be convertible. Please have a try.
## Test
- Ubuntu 20.04
- tvm 0.9.0
| mmpretrain | metric | PyTorch | TVM |
| :-----------------------------------------------------------------------------------------------------------------------: | :----: | :-----: | :---: |
| [ResNet-18](https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnet/resnet18_8xb32_in1k.py) | top-1 | 69.90 | 69.90 |
| [ResNeXt-50](https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnext/resnext50-32x4d_8xb32_in1k.py) | top-1 | 77.90 | 77.90 |
| [ShuffleNet V2](https://github.com/open-mmlab/mmpretrain/blob/main/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py) | top-1 | 69.55 | 69.55 |
| [MobileNet V2](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | top-1 | 71.86 | 71.86 |
<!-- | [Vision Transformer](https://github.com/open-mmlab/mmpretrain/blob/main/configs/vision_transformer/vit-base-p16_ft-64xb64_in1k-384.py) | top-1 | 85.43 | 84.01 | -->
| mmdet(\*) | metric | PyTorch | TVM |
| :-----------------------------------------------------------------------------------: | :----: | :-----: | :--: |
| [SSD](https://github.com/open-mmlab/mmdetection/tree/main/configs/ssd/ssd300_coco.py) | box AP | 25.5 | 25.5 |
\*: We only test SSD, since dynamic shape is not supported for now.
| mmseg | metric | PyTorch | TVM |
| :---------------------------------------------------------------------------------------------------------------------------: | :----: | :-----: | :---: |
| [FCN](https://github.com/open-mmlab/mmsegmentation/blob/main/configs/fcn/fcn_r50-d8_4xb2-40k_cityscapes-512x1024.py) | mIoU | 72.25 | 72.36 |
| [PSPNet](https://github.com/open-mmlab/mmsegmentation/blob/main/configs/pspnet/pspnet_r50-d8_4xb2-80k_cityscapes-512x1024.py) | mIoU | 78.55 | 77.90 |
# Quantization test result
Currently, MMDeploy supports ncnn quantization.
## Quantize with ncnn
### mmpretrain
| model | dataset | fp32 top-1 (%) | int8 top-1 (%) |
| :------------------------------------------------------------------------------------------------------------------: | :---------: | :------------: | :------------: |
| [ResNet-18](https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnet/resnet18_8xb16_cifar10.py) | Cifar10 | 94.82 | 94.83 |
| [ResNeXt-32x4d-50](https://github.com/open-mmlab/mmpretrain/blob/main/configs/resnext/resnext50-32x4d_8xb32_in1k.py) | ImageNet-1k | 77.90 | 78.20\* |
| [MobileNet V2](https://github.com/open-mmlab/mmpretrain/blob/main/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | ImageNet-1k | 71.86 | 71.43\* |
| [HRNet-W18\*](https://github.com/open-mmlab/mmpretrain/blob/main/configs/hrnet/hrnet-w18_4xb32_in1k.py) | ImageNet-1k | 76.75 | 76.25\* |
Note:
- Because ImageNet-1k is large and ncnn has not yet released a Vulkan int8 version, only part of the test set (4000/50000) is used.
- Accuracy may change after quantization; a shift of less than 1% for classification models is normal.
### OCR detection
| model | dataset | fp32 hmean | int8 hmean |
| :-------------------------------------------------------------------------------------------------------------------------------: | :-------: | :--------: | :------------: |
| [PANet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_resnet18_fpem-ffm_600e_icdar2015.py) | ICDAR2015 | 0.795 | 0.792 @thr=0.9 |
| [TextSnake](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/textsnake/textsnake_resnet50_fpn-unet_1200e_ctw1500.py) | CTW1500 | 0.817 | 0.818 |
Note: [mmocr](https://github.com/open-mmlab/mmocr) uses `shapely` to compute IoU, which results in a slight difference in accuracy.
### Pose detection
| model | dataset | fp32 AP | int8 AP |
| :----------------------------------------------------------------------------------------------------------------------------------------------------: | :------: | :-----: | :-----: |
| [Hourglass](https://github.com/open-mmlab/mmpose/blob/main/configs/body_2d_keypoint/topdown_heatmap/coco/td-hm_hourglass52_8xb32-210e_coco-256x256.py) | COCO2017 | 0.717 | 0.713 |
Note: MMPose models are tested with `flip_test` explicitly set to `False` in model configs.
## Supported models
The table below lists the models that are guaranteed to be exportable to other backends.
| Model config | Codebase | TorchScript | OnnxRuntime | TensorRT | ncnn | PPLNN | OpenVINO | Ascend | RKNN |
| :------------------------------------------------------------------------------------------------------- | :--------------- | :---------: | :---------: | :------: | :--: | :---: | :------: | :----: | :--: |
| [RetinaNet](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/retinanet) | MMDetection | Y | Y | Y | Y | Y | Y | Y | Y |
| [Faster R-CNN](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/faster_rcnn) | MMDetection | Y | Y | Y | Y | Y | Y | Y | N |
| [YOLOv3](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/yolo) | MMDetection | Y | Y | Y | Y | N | Y | Y | Y |
| [YOLOX](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/yolox) | MMDetection | Y | Y | Y | Y | N | Y | N | Y |
| [FCOS](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/fcos) | MMDetection | Y | Y | Y | Y | N | Y | N | N |
| [FSAF](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/fsaf) | MMDetection | Y | Y | Y | Y | Y | Y | N | Y |
| [Mask R-CNN](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/mask_rcnn) | MMDetection | Y | Y | Y | N | N | Y | N | N |
| [SSD](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/ssd)[\*](#note) | MMDetection | Y | Y | Y | Y | N | Y | N | Y |
| [FoveaBox](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/foveabox) | MMDetection | Y | Y | N | N | N | Y | N | N |
| [ATSS](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/atss) | MMDetection | N | Y | Y | N | N | Y | N | N |
| [GFL](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/gfl) | MMDetection | N | Y | Y | N | ? | Y | N | N |
| [Cascade R-CNN](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/cascade_rcnn) | MMDetection | N | Y | Y | N | Y | Y | N | N |
| [Cascade Mask R-CNN](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/cascade_rcnn) | MMDetection | N | Y | Y | N | N | Y | N | N |
| [Swin Transformer](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/swin)[\*](#note) | MMDetection | N | Y | Y | N | N | Y | N | N |
| [VFNet](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/vfnet) | MMDetection | N | N | N | N | N | Y | N | N |
| [RepPoints](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/reppoints) | MMDetection | N | N | Y | N | ? | Y | N | N |
| [DETR](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/detr) | MMDetection | N | Y | Y | N | ? | N | N | N |
| [CenterNet](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/centernet) | MMDetection | N | Y | Y | N | ? | Y | N | N |
| [SOLO](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/solo) | MMDetection | N | Y | N | N | N | Y | N | N |
| [SOLOv2](https://github.com/open-mmlab/mmdetection/tree/3.x/configs/solov2) | MMDetection | N | Y | N | N | N | Y | N | N |
| [ResNet](https://github.com/open-mmlab/mmpretrain/tree/main/configs/resnet) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [ResNeXt](https://github.com/open-mmlab/mmpretrain/tree/main/configs/resnext) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [SE-ResNet](https://github.com/open-mmlab/mmpretrain/tree/main/configs/seresnet) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [MobileNetV2](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobilenet_v2) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [MobileNetV3](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobilenet_v3) | MMPretrain | Y | Y | Y | Y | N | Y | N | N |
| [ShuffleNetV1](https://github.com/open-mmlab/mmpretrain/tree/main/configs/shufflenet_v1) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [ShuffleNetV2](https://github.com/open-mmlab/mmpretrain/tree/main/configs/shufflenet_v2) | MMPretrain | Y | Y | Y | Y | Y | Y | Y | Y |
| [VisionTransformer](https://github.com/open-mmlab/mmpretrain/tree/main/configs/vision_transformer) | MMPretrain | Y | Y | Y | Y | ? | Y | Y | N |
| [SwinTransformer](https://github.com/open-mmlab/mmpretrain/tree/main/configs/swin_transformer) | MMPretrain | Y | Y | Y | N | ? | N | ? | N |
| [MobileOne](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mobileone) | MMPretrain | N | Y | Y | N | N | N | N | N |
| [FCN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fcn) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | Y |
| [PSPNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/pspnet)[\*static](#note) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | Y |
| [DeepLabV3](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | N |
| [DeepLabV3+](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3plus) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | N |
| [Fast-SCNN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fastscnn)[\*static](#note) | MMSegmentation | Y | Y | Y | N | Y | Y | N | Y |
| [UNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/unet) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | Y |
| [ANN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ann)[\*](#note) | MMSegmentation | Y | Y | Y | N | N | N | N | N |
| [APCNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/apcnet) | MMSegmentation | Y | Y | Y | Y | N | N | N | Y |
| [BiSeNetV1](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/bisenetv1) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [BiSeNetV2](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/bisenetv2) | MMSegmentation | Y | Y | Y | Y | N | Y | N | N |
| [CGNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/cgnet) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [DMNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/dmnet) | MMSegmentation | ? | Y | N | N | N | N | N | N |
| [DNLNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/dnlnet) | MMSegmentation | ? | Y | Y | Y | N | Y | N | N |
| [EMANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/emanet) | MMSegmentation | Y | Y | Y | N | N | Y | N | N |
| [EncNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/encnet) | MMSegmentation | Y | Y | Y | N | N | Y | N | N |
| [ERFNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/erfnet) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [FastFCN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/fastfcn) | MMSegmentation | Y | Y | Y | Y | N | Y | N | N |
| [GCNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/gcnet) | MMSegmentation | Y | Y | Y | N | N | N | N | N |
| [ICNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/icnet)[\*](#note) | MMSegmentation | Y | Y | Y | N | N | Y | N | N |
| [ISANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/isanet)[\*static](#note) | MMSegmentation | N | Y | Y | N | N | Y | N | Y |
| [NonLocal Net](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/nonlocal_net) | MMSegmentation | ? | Y | Y | Y | N | Y | N | N |
| [OCRNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/ocrnet) | MMSegmentation | ? | Y | Y | Y | N | Y | N | Y |
| [PointRend](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/point_rend) | MMSegmentation | Y | Y | Y | N | N | Y | N | N |
| [Semantic FPN](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/sem_fpn) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [STDC](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/stdc) | MMSegmentation | Y | Y | Y | Y | N | Y | N | Y |
| [UPerNet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/upernet)[\*](#note) | MMSegmentation | ? | Y | Y | N | N | N | N | Y |
| [DANet](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/danet) | MMSegmentation | ? | Y | Y | N | N | N | N | N |
| [Segmenter](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/segmenter) [\*static](#note) | MMSegmentation | Y | Y | Y | Y | N | Y | N | N |
| [SRCNN](https://github.com/open-mmlab/mmagic/tree/main/configs/srcnn) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [ESRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/esrgan) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [SRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/srgan_resnet) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [SRResNet](https://github.com/open-mmlab/mmagic/tree/main/configs/srgan_resnet) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [Real-ESRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/real_esrgan) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [EDSR](https://github.com/open-mmlab/mmagic/tree/main/configs/edsr) | MMagic | Y | Y | Y | Y | N | Y | N | N |
| [RDN](https://github.com/open-mmlab/mmagic/tree/main/configs/rdn) | MMagic | Y | Y | Y | Y | Y | Y | N | N |
| [DBNet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet) | MMOCR | Y | Y | Y | Y | Y | Y | Y | N |
| [DBNetpp](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnetpp) | MMOCR | Y | Y | Y | ? | ? | Y | ? | N |
| [PANet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet) | MMOCR | Y | Y | Y | Y | ? | Y | Y | N |
| [PSENet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/psenet) | MMOCR | Y | Y | Y | Y | ? | Y | Y | N |
| [TextSnake](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/textsnake) | MMOCR | Y | Y | Y | Y | ? | ? | ? | N |
| [MaskRCNN](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/maskrcnn) | MMOCR | Y | Y | Y | ? | ? | ? | ? | N |
| [CRNN](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/crnn) | MMOCR | Y | Y | Y | Y | Y | N | N | N |
| [SAR](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/sar) | MMOCR | N | Y | N | N | N | N | N | N |
| [SATRN](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/satrn) | MMOCR | Y | Y | Y | N | N | N | N | N |
| [ABINet](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/abinet) | MMOCR | Y | Y | Y | N | N | N | N | N |
| [HRNet](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#hrnet-cvpr-2019) | MMPose | N | Y | Y | Y | N | Y | N | N |
| [MSPN](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#mspn-arxiv-2019) | MMPose | N | Y | Y | Y | N | Y | N | N |
| [LiteHRNet](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#litehrnet-cvpr-2021) | MMPose | N | Y | Y | N | N | Y | N | N |
| [Hourglass](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/backbones.html#hourglass-eccv-2016) | MMPose | N | Y | Y | Y | N | Y | N | N |
| [SimCC](https://mmpose.readthedocs.io/en/latest/model_zoo_papers/algorithms.html#simcc-eccv-2022) | MMPose | N | Y | Y | Y | N | N | N | N |
| [PointPillars](https://github.com/open-mmlab/mmdetection3d/tree/main/configs/pointpillars) | MMDetection3d | ? | Y | Y | N | N | Y | N | N |
| [CenterPoint (pillar)](https://github.com/open-mmlab/mmdetection3d/tree/main/configs/centerpoint) | MMDetection3d | ? | Y | Y | N | N | Y | N | N |
| [RotatedRetinaNet](https://github.com/open-mmlab/mmrotate/blob/main/configs/rotated_retinanet/README.md) | RotatedDetection | N | Y | Y | N | N | N | N | N |
| [Oriented RCNN](https://github.com/open-mmlab/mmrotate/blob/main/configs/oriented_rcnn/README.md) | RotatedDetection | N | Y | Y | N | N | N | N | N |
| [Gliding Vertex](https://github.com/open-mmlab/mmrotate/blob/main/configs/gliding_vertex/README.md) | RotatedDetection | N | N | Y | N | N | N | N | N |
### Note
- Tag:
  - static: This model only supports static export. Please use a `static` deploy config, such as `$MMDEPLOY_DIR/configs/mmseg/segmentation_tensorrt_static-1024x2048.py`.
- SSD: When converting an SSD model, use a deploy config with a smaller shape range, such as 300x300-512x512 rather than 320x320-1344x1344, for example `$MMDEPLOY_DIR/configs/mmdet/detection/detection_tensorrt_dynamic-300x300-512x512.py`.
- YOLOX: YOLOX with ncnn only supports static shape.
- Swin Transformer: For TensorRT, only version 8.4+ is supported.
- SAR: The Chinese text recognition model is not supported because the protobuf size of ONNX is limited.
# MMAction2 Deployment
- [MMAction2 Deployment](#mmaction2-deployment)
- [Installation](#installation)
- [Install mmaction2](#install-mmaction2)
- [Install mmdeploy](#install-mmdeploy)
- [Convert model](#convert-model)
- [Convert video recognition model](#convert-video-recognition-model)
- [Model specification](#model-specification)
- [Model Inference](#model-inference)
- [Backend model inference](#backend-model-inference)
- [SDK model inference](#sdk-model-inference)
- [Video recognition SDK model inference](#video-recognition-sdk-model-inference)
- [Supported models](#supported-models)
______________________________________________________________________
[MMAction2](https://github.com/open-mmlab/mmaction2) is an open-source toolbox for video understanding based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com) project.
## Installation
### Install mmaction2
Please follow the [installation guide](https://github.com/open-mmlab/mmaction2/tree/main#installation) to install mmaction2.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or later version**, we encourage you to run
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as the `ONNX Runtime` inference engine.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)
export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH
```
**Method III:** Build from source
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmaction2 models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/02-how-to-run/convert_model.md#usage).
When using `tools/deploy.py`, it is crucial to specify the correct deployment config. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmaction) of all supported backends for mmaction2, under which the config file path follows the pattern:
```
{task}/{task}_{backend}-{precision}_{static | dynamic}_{shape}.py
```
where:
- **{task}:** task in mmaction2.
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16 or int8. When empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
- **{2d/3d}:** model type
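The naming pattern above can be made concrete with a small helper. This is a hypothetical illustration, not part of mmdeploy; it also omits the optional model-type field, and real config names may join some fields slightly differently.

```python
def deploy_config_path(task, backend, shape_mode, precision="", shape=""):
    """Compose a deployment config path following the documented pattern:
    {task}/{task}_{backend}-{precision}_{static | dynamic}_{shape}.py
    """
    precision_part = f"-{precision}" if precision else ""  # empty precision means fp32
    shape_part = f"_{shape}" if shape else ""
    return f"{task}/{task}_{backend}{precision_part}_{shape_mode}{shape_part}.py"

print(deploy_config_path("video-recognition", "onnxruntime", "static"))
# video-recognition/video-recognition_onnxruntime_static.py
```

Reading a config path in reverse the same way tells you at a glance which backend, precision, and shape mode a conversion used.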
In the next part, we will take the `tsn` model from the `video recognition` task as an example, showing how to convert it to an ONNX model that can be inferred by ONNX Runtime.
### Convert video recognition model
```shell
cd mmdeploy
# download tsn model from mmaction2 model zoo
mim download mmaction2 --config tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb --dest .
# convert mmaction2 model to an onnxruntime model with static shape
python tools/deploy.py \
configs/mmaction/video-recognition/video-recognition_2d_onnxruntime_static.py \
tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb \
tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb_20220906-cd10898e.pth \
tests/data/arm_wrestling.mp4 \
--work-dir mmdeploy_models/mmaction/tsn/ort \
--device cpu \
--show \
--dump-info
```
## Model specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory, `mmdeploy_models/mmaction/tsn/ort` in the previous example. It includes:
```
mmdeploy_models/mmaction/tsn/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmaction/tsn/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
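A quick way to verify that a conversion produced a complete SDK model package is to check for the files listed above. The helper below is a hypothetical sanity check based on that directory listing; the mmdeploy SDK itself reads `deploy.json` and `pipeline.json` when it loads the model.

```python
from pathlib import Path

def missing_sdk_files(model_dir):
    """Return which of the expected SDK-model files are absent from model_dir.

    Returns an empty list when the package looks complete.
    """
    expected = ("deploy.json", "detail.json", "end2end.onnx", "pipeline.json")
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]

# e.g. missing_sdk_files('mmdeploy_models/mmaction/tsn/ort')
```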
## Model Inference
### Backend model inference
Take the previously converted `end2end.onnx` model of `tsn` as an example; you can use the following code to run inference and print the top-5 results.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import numpy as np
import torch
deploy_cfg = 'configs/mmaction/video-recognition/video-recognition_2d_onnxruntime_static.py'
model_cfg = 'tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmaction2/tsn/ort/end2end.onnx']
image = 'tests/data/arm_wrestling.mp4'
# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
# do model inference
with torch.no_grad():
result = model.test_step(model_inputs)
# show top5-results
pred_scores = result[0].pred_scores.item.tolist()
top_index = np.argsort(pred_scores)[::-1]
for i in range(5):
index = top_index[i]
print(index, pred_scores[index])
```
### SDK model inference
Given the above SDK model of `tsn`, you can also perform SDK model inference as follows.
#### Video recognition SDK model inference
```python
from mmdeploy_runtime import VideoRecognizer
import cv2
# refer to demo/python/video_recognition.py
# def SampleFrames(cap, clip_len, frame_interval, num_clips):
# ...
cap = cv2.VideoCapture('tests/data/arm_wrestling.mp4')
clips, info = SampleFrames(cap, 1, 1, 25)
# create a recognizer
recognizer = VideoRecognizer(model_path='./mmdeploy_models/mmaction/tsn/ort', device_name='cpu', device_id=0)
# perform inference
result = recognizer(clips, info)
# show inference result
for label_id, score in result:
print(label_id, score)
```
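The `SampleFrames` helper referenced in the comments above lives in `demo/python/video_recognition.py`. As a rough illustration of what such uniform test-time sampling computes, here is a simplified sketch of the index selection only (the real helper also decodes the frames with OpenCV, and its exact index placement may differ):

```python
def sample_clip_indices(total_frames, clip_len, frame_interval, num_clips):
    """Compute frame indices for uniform test-time clip sampling (simplified sketch)."""
    clip_span = clip_len * frame_interval
    # evenly space the clip start positions over the video
    avg_interval = max((total_frames - clip_span + 1) // num_clips, 1)
    clips = []
    for i in range(num_clips):
        start = min(i * avg_interval, max(total_frames - clip_span, 0))
        clips.append([start + j * frame_interval for j in range(clip_len)])
    return clips

# 25 single-frame clips, matching the recognizer example above
print(len(sample_clip_indices(250, 1, 1, 25)))  # 25
```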
Besides python API, mmdeploy SDK also provides other FFI (Foreign Function Interface), such as C, C++, C#, Java and so on. You can learn their usage from [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).
> MMAction2 only supports the C, C++ and Python APIs for now.
## Supported models
| Model | TorchScript | ONNX Runtime | TensorRT | ncnn | PPLNN | OpenVINO |
| :----------------------------------------------------------------------------------------- | :---------: | :----------: | :------: | :--: | :---: | :------: |
| [TSN](https://github.com/open-mmlab/mmaction2/tree/main/configs/recognition/tsn) | Y | Y | Y | N | N | N |
| [SlowFast](https://github.com/open-mmlab/mmaction2/tree/main/configs/recognition/slowfast) | Y | Y | Y | N | N | N |
| [TSM](https://github.com/open-mmlab/mmaction2/tree/main/configs/recognition/tsm) | Y | Y | Y | N | N | N |
| [X3D](https://github.com/open-mmlab/mmaction2/tree/main/configs/recognition/x3d) | Y | Y | Y | N | N | N |
# MMagic Deployment
- [MMagic Deployment](#mmagic-deployment)
- [Installation](#installation)
- [Install mmagic](#install-mmagic)
- [Install mmdeploy](#install-mmdeploy)
- [Convert model](#convert-model)
- [Convert super resolution model](#convert-super-resolution-model)
- [Model specification](#model-specification)
- [Model inference](#model-inference)
- [Backend model inference](#backend-model-inference)
- [SDK model inference](#sdk-model-inference)
- [Supported models](#supported-models)
______________________________________________________________________
[MMagic](https://github.com/open-mmlab/mmagic/tree/main) aka `mmagic` is an open-source image and video editing toolbox based on PyTorch. It is a part of the [OpenMMLab](https://openmmlab.com/) project.
## Installation
### Install mmagic
Please follow the [installation guide](https://github.com/open-mmlab/mmagic/tree/main#installation) to install mmagic.
### Install mmdeploy
There are several methods to install mmdeploy, among which you can choose an appropriate one according to your target platform and device.
**Method I:** Install precompiled package
You can refer to [get_started](https://mmdeploy.readthedocs.io/en/latest/get_started.html#installation)
**Method II:** Build using scripts
If your target platform is **Ubuntu 18.04 or later version**, we encourage you to run
[scripts](../01-how-to-build/build_from_script.md). For example, the following commands install mmdeploy as well as the `ONNX Runtime` inference engine.
```shell
git clone --recursive -b main https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)
export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH
```
**Method III:** Build from source
If neither **I** nor **II** meets your requirements, [building mmdeploy from source](../01-how-to-build/build_from_source.md) is the last option.
## Convert model
You can use [tools/deploy.py](https://github.com/open-mmlab/mmdeploy/tree/main/tools/deploy.py) to convert mmagic models to the specified backend models. Its detailed usage can be learned from [here](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/02-how-to-run/convert_model.md#usage).
When using `tools/deploy.py`, it is crucial to specify the correct deployment config. We've already provided builtin deployment config [files](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmagic) of all supported backends for mmagic, under which the config file path follows the pattern:
```
{task}/{task}_{backend}-{precision}_{static | dynamic}_{shape}.py
```
- **{task}:** task in mmagic.
MMDeploy supports models of one task in mmagic, i.e., `super resolution`. Please refer to chapter [supported models](#supported-models) for task-model organization.
**DO REMEMBER TO USE** the corresponding deployment config file when trying to convert models of different tasks.
- **{backend}:** inference backend, such as onnxruntime, tensorrt, pplnn, ncnn, openvino, coreml etc.
- **{precision}:** fp16, int8. When it's empty, it means fp32
- **{static | dynamic}:** static shape or dynamic shape
- **{shape}:** input shape or shape range of a model
### Convert super resolution model
The command below shows an example of converting the `ESRGAN` model to an ONNX model that can be inferred by ONNX Runtime.
```shell
cd mmdeploy
# download esrgan model from mmagic model zoo
mim download mmagic --config esrgan_psnr-x4c64b23g32_1xb16-1000k_div2k --dest .
# convert esrgan model to onnxruntime model with dynamic shape
python tools/deploy.py \
configs/mmagic/super-resolution/super-resolution_onnxruntime_dynamic.py \
esrgan_psnr-x4c64b23g32_1xb16-1000k_div2k.py \
esrgan_psnr_x4c64b23g32_1x16_1000k_div2k_20200420-bf5c993c.pth \
demo/resources/face.png \
--work-dir mmdeploy_models/mmagic/ort \
--device cpu \
--show \
--dump-info
```
You can also convert the above model to other backend models by changing the deployment config file `*_onnxruntime_dynamic.py` to [others](https://github.com/open-mmlab/mmdeploy/tree/main/configs/mmagic), e.g., converting to a TensorRT model with `super-resolution/super-resolution_tensorrt_dynamic-32x32-512x512.py`.
```{tip}
When converting mmagic models to tensorrt models, --device should be set to "cuda"
```
## Model specification
Before moving on to the model inference chapter, let's learn more about the converted model structure, which is very important for model inference.
The converted model is located in the working directory, `mmdeploy_models/mmagic/ort` in the previous example. It includes:
```
mmdeploy_models/mmagic/ort
├── deploy.json
├── detail.json
├── end2end.onnx
└── pipeline.json
```
in which,
- **end2end.onnx**: backend model which can be inferred by ONNX Runtime
- \***.json**: the necessary information for mmdeploy SDK
The whole package **mmdeploy_models/mmagic/ort** is defined as **mmdeploy SDK model**, i.e., **mmdeploy SDK model** includes both backend model and inference meta information.
## Model inference
### Backend model inference
Take the previously converted `end2end.onnx` model as an example; you can use the following code to run inference and visualize the results.
```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch
deploy_cfg = 'configs/mmagic/super-resolution/super-resolution_onnxruntime_dynamic.py'
model_cfg = 'esrgan_psnr-x4c64b23g32_1xb16-1000k_div2k.py'
device = 'cpu'
backend_model = ['./mmdeploy_models/mmagic/ort/end2end.onnx']
image = './demo/resources/face.png'
# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)
# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)
# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)
# do model inference
with torch.no_grad():
result = model.test_step(model_inputs)
# visualize results
task_processor.visualize(
image=image,
model=model,
result=result[0],
window_name='visualize',
output_file='output_restorer.bmp')
```
### SDK model inference
You can also perform SDK model inference as follows.
```python
from mmdeploy_runtime import Restorer
import cv2
img = cv2.imread('./demo/resources/face.png')
# create a restorer
restorer = Restorer(model_path='./mmdeploy_models/mmagic/ort', device_name='cpu', device_id=0)
# perform inference
result = restorer(img)
# visualize inference result
# convert to BGR
result = result[..., ::-1]
cv2.imwrite('output_restorer.bmp', result)
```
Besides python API, mmdeploy SDK also provides other FFI (Foreign Function Interface), such as C, C++, C#, Java and so on. You can learn their usage from [demos](https://github.com/open-mmlab/mmdeploy/tree/main/demo).
## Supported models
| Model | Task | ONNX Runtime | TensorRT | ncnn | PPLNN | OpenVINO |
| :-------------------------------------------------------------------------------- | :--------------- | :----------: | :------: | :--: | :---: | :------: |
| [SRCNN](https://github.com/open-mmlab/mmagic/tree/main/configs/srcnn) | super-resolution | Y | Y | Y | Y | Y |
| [ESRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/esrgan) | super-resolution | Y | Y | Y | Y | Y |
| [ESRGAN-PSNR](https://github.com/open-mmlab/mmagic/tree/main/configs/esrgan) | super-resolution | Y | Y | Y | Y | Y |
| [SRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/srgan_resnet) | super-resolution | Y | Y | Y | Y | Y |
| [SRResNet](https://github.com/open-mmlab/mmagic/tree/main/configs/srgan_resnet) | super-resolution | Y | Y | Y | Y | Y |
| [Real-ESRGAN](https://github.com/open-mmlab/mmagic/tree/main/configs/real_esrgan) | super-resolution | Y | Y | Y | Y | Y |
| [EDSR](https://github.com/open-mmlab/mmagic/tree/main/configs/edsr) | super-resolution | Y | Y | Y | N | Y |
| [RDN](https://github.com/open-mmlab/mmagic/tree/main/configs/rdn) | super-resolution | Y | Y | Y | Y | Y |