- *Please note that this feature is experimental and may change in the future. We strongly suggest that users always try the latest master branch.*
- The custom operator is not included in [supported operator list](https://github.com/microsoft/onnxruntime/blob/master/docs/OperatorKernels.md) in ONNX Runtime.
- The custom operator should be able to be exported to ONNX.
1. Add header `soft_nms.h` to ONNX Runtime include directory `mmcv/ops/csrc/onnxruntime/`
2. Add source `soft_nms.cpp` to ONNX Runtime source directory `mmcv/ops/csrc/onnxruntime/cpu/`
3. Register `soft_nms` operator in [onnxruntime_register.cpp](../../../mmcv/ops/csrc/onnxruntime/cpu/onnxruntime_register.cpp)
```c++
#include "soft_nms.h"
...
...
```
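As a quick sanity check from Python, you can register the compiled custom-ops library with an ONNX Runtime session and run an exported model. This is only a sketch: `model.onnx` is a hypothetical model exported with the custom operator, and it assumes your mmcv installation exposes `get_onnxruntime_op_path` pointing to the built library.

```python
import numpy as np
import onnxruntime as ort
from mmcv.ops import get_onnxruntime_op_path  # path of the compiled custom-ops library

session_options = ort.SessionOptions()
# make the custom operators (e.g. soft_nms) visible to ONNX Runtime
session_options.register_custom_ops_library(get_onnxruntime_op_path())

# 'model.onnx' is a hypothetical model that uses the custom operator
sess = ort.InferenceSession('model.onnx', session_options)
input_name = sess.get_inputs()[0].name
outputs = sess.run(None, {input_name: np.random.rand(1, 3, 224, 224).astype(np.float32)})
```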
### References
- [How to export Pytorch model with custom op to ONNX and run it in ONNX Runtime](https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md)
- [How to add a custom operator/kernel in ONNX Runtime](https://onnxruntime.ai/docs/reference/operators/add-custom-op.html)
ScatterND takes three inputs `data` tensor of rank r >= 1, `indices` tensor of rank q >= 1, and `updates` tensor of rank q + r - indices.shape\[-1\] - 1. The output of the operation is produced by creating a copy of the input `data`, and then updating its value to values specified by updates at specific index positions specified by `indices`. Its output shape is the same as the shape of `data`. Note that `indices` should not have duplicate entries. That is, two or more updates for the same index-location is not supported.
The `output` is calculated via the following equation:
...
...
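The equation itself is omitted in this excerpt; as an illustration only, a minimal NumPy sketch of the update rule described above (assuming `indices` contains no duplicate entries) could look like this:

```python
import numpy as np

def scatter_nd(data, indices, updates):
    # output starts as a copy of data; its shape equals data.shape
    output = np.copy(data)
    # each index vector along the last axis of `indices` addresses a slice of `output`
    for idx in np.ndindex(indices.shape[:-1]):
        output[tuple(indices[idx])] = updates[idx]
    return output

data = np.zeros((4, 4), dtype=np.float32)
indices = np.array([[1], [3]])                # q = 2, indices.shape[-1] = 1
updates = np.ones((2, 4), dtype=np.float32)   # rank q + r - indices.shape[-1] - 1 = 2
print(scatter_nd(data, indices, updates))
```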
Filter out boxes that have high IoU overlap with previously selected boxes or low scores.
| `int` | `center_point_box` | 0 - the box data is supplied as \[y1, x1, y2, x2\], 1 - the box data is supplied as \[x_center, y_center, width, height\]. |
| `int` | `max_output_boxes_per_class` | The maximum number of boxes to be selected per batch per class. Defaults to 0, in which case the number of output boxes equals the number of input boxes. |
| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range \[0, 1\]. Defaults to 0. |
| `float` | `score_threshold` | The threshold for deciding when to remove boxes based on score. |
| `int` | `offset` | 0 or 1, boxes' width or height is (x2 - x1 + offset). |
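To illustrate how these attributes interact (a simplified sketch, not the actual kernel), greedy filtering with the `offset` convention for box width/height might look as follows:

```python
import numpy as np

def simple_nms(boxes, scores, iou_threshold=0.5, score_threshold=0.0, offset=0):
    # boxes: (N, 4) as [y1, x1, y2, x2] when center_point_box == 0
    keep = []
    order = np.argsort(-scores)
    order = order[scores[order] > score_threshold]
    while order.size > 0:
        i = order[0]
        keep.append(i)
        y1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        x1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        y2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        x2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(y2 - y1 + offset, 0, None) * np.clip(x2 - x1 + offset, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0] + offset) * (boxes[i, 3] - boxes[i, 1] + offset)
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0] + offset) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1] + offset)
        iou = inter / (area_i + area_o - inter)
        # drop boxes that overlap too much with the box just selected
        order = order[1:][iou <= iou_threshold]
    return np.array(keep)
```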
For more detailed information on installing TensorRT using tar, please refer to [NVIDIA's website](https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-721/install-guide/index.html#installing-tar).
- Install cuDNN
Install cuDNN 8 following [NVIDIA's website](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-tar).
#### Build on Linux
```bash
...
...
```
Below are the main steps:
**Take RoIAlign plugin `roi_align` for example.**
1. Add header `trt_roi_align.hpp` to TensorRT include directory `mmcv/ops/csrc/tensorrt/`
2. Add source `trt_roi_align.cpp` to TensorRT source directory `mmcv/ops/csrc/tensorrt/plugins/`
3. Add cuda kernel `trt_roi_align_kernel.cu` to TensorRT source directory `mmcv/ops/csrc/tensorrt/plugins/`
4. Register `roi_align` plugin in [trt_plugin.cpp](https://github.com/open-mmlab/mmcv/blob/master/mmcv/ops/csrc/tensorrt/plugins/trt_plugin.cpp)
```c++
...
...
```
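After the plugin library is rebuilt, a quick way to verify the registration from Python is to list the TensorRT plugin creators. This is only a sketch: it assumes mmcv exposes `load_tensorrt_plugin` for loading the compiled plugin library, and the exact creator name of the RoIAlign plugin may differ in your version.

```python
import tensorrt as trt
from mmcv.tensorrt import load_tensorrt_plugin  # loads the compiled mmcv TensorRT plugin library

load_tensorrt_plugin()
registry = trt.get_plugin_registry()
creators = [c.name for c in registry.plugin_creator_list]
print(creators)  # the RoIAlign plugin should appear here, e.g. something like 'MMCVRoiAlign'
```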
#### Reminders
- *Please note that this feature is experimental and may change in the future. We strongly suggest that users always try the latest master branch.*
- Some of the [custom ops](https://mmcv.readthedocs.io/en/latest/ops.html) in `mmcv` have CUDA implementations, which can be used as a reference.
We list some common issues faced by many users and their corresponding solutions here.
Feel free to enrich the list if you find any frequent issues and have ways to help others solve them.
### Installation
- KeyError: "xxx: 'yyy is not in the zzz registry'"
The registry mechanism will be triggered only when the file where the module is defined is imported.
So you need to import that file somewhere. One common fix is shown in the snippet below. More details can be found at [KeyError: "MaskRCNN: 'RefineRoIHead is not in the models registry'"](https://github.com/open-mmlab/mmdetection/issues/5974).
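If the module lives in your own file, one way to make sure that file gets imported is the `custom_imports` field supported by MMCV configs; the module path below is only an example and must be adapted to your project:

```python
# in your config file; replace the path with the module that defines `RefineRoIHead`
custom_imports = dict(
    imports=['my_project.models.refine_roi_head'],
    allow_failed_imports=False)
```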
- "No module named 'mmcv.ops'"; "No module named 'mmcv.\_ext'"
1. Uninstall existing mmcv in the environment using `pip uninstall mmcv`
2. Install mmcv-full following the [installation instruction](https://mmcv.readthedocs.io/en/latest/get_started/installation.html) or [Build MMCV from source](https://mmcv.readthedocs.io/en/latest/get_started/build.html)
- "invalid device function" or "no kernel image is available for execution"
1. Check the CUDA compute capability of your GPU (see the snippet after this list)
2. Run `python mmdet/utils/collect_env.py` to check whether PyTorch, torchvision, and MMCV are built for the correct GPU architecture. You may need to set `TORCH_CUDA_ARCH_LIST` and reinstall MMCV. The compatibility issue can happen with old GPUs, e.g., a Tesla K80 (3.7) on Colab.
3. Check whether the running environment is the same as the one mmcv/mmdet was compiled for. For example, you may compile mmcv with CUDA 10.0 but run it in a CUDA 9.0 environment.
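For step 1, the compute capability can be queried directly from PyTorch; a small sketch using standard PyTorch APIs:

```python
import torch

# compute capability of the current GPU, e.g. (3, 7) for a Tesla K80
print(torch.cuda.get_device_capability(0))
# CUDA architectures the installed PyTorch build was compiled for (recent PyTorch versions)
print(torch.cuda.get_arch_list())
```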
- "undefined symbol" or "cannot open xxx.so"
1. If those symbols are CUDA/C++ symbols (e.g., libcudart.so or GLIBCXX), check
whether the CUDA/GCC runtimes are the same as those used for compiling mmcv
2. If those symbols are PyTorch symbols (e.g., symbols containing caffe, aten, and TH), check whether the PyTorch version is the same as that used for compiling mmcv
3. Run `python mmdet/utils/collect_env.py` to check whether PyTorch, torchvision, and MMCV are built by and running on the same environment
- "RuntimeError: CUDA error: invalid configuration argument"
This error may be caused by the poor performance of GPU. Try to decrease the value of [THREADS_PER_BLOCK](https://github.com/open-mmlab/mmcv/blob/cac22f8cf5a904477e3b5461b1cc36856c2793da/mmcv/ops/csrc/common_cuda_helper.hpp#L10)
and recompile mmcv.
- "RuntimeError: nms is not compiled with GPU support"
This error is because your CUDA environment is not installed correctly.
You may try to re-install your CUDA environment and then delete the build/ folder before re-compiling mmcv.
- "Segmentation fault"
1. Check your GCC version and use GCC >= 5.4. This is usually caused by incompatibility between PyTorch and the environment (e.g., GCC \< 4.9 for PyTorch). We also recommend users avoid GCC 5.5 because many users report that GCC 5.5 causes a "segmentation fault"; simply changing it to GCC 5.4 usually solves the problem
2. Check whether PyTorch is correctly installed and can use CUDA ops, e.g., run `python -c 'import torch; print(torch.cuda.is_available())'` in your terminal and see whether it outputs `True`
3. If PyTorch is correctly installed, check whether MMCV is correctly installed. If MMCV is correctly installed, then there will be no issue with the following command
```shell
python -c 'import mmcv; import mmcv.ops'
```
4. If MMCV and PyTorch are correctly installed, you can use `ipdb` to set breakpoints or directly add `print` statements to debug and see which part leads to the segmentation fault
- "libtorch_cuda_cu.so: cannot open shared object file"
`mmcv-full` depends on this shared object but it cannot be found. You can check whether the object exists in `~/miniconda3/envs/{environment-name}/lib/python3.7/site-packages/torch/lib` (see the snippet below) or try to re-install PyTorch.
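On Linux, a quick way to check whether the shared object ships with your PyTorch installation is to list the libraries bundled with the `torch` package (a sketch; the exact library layout depends on the PyTorch build):

```python
import os
import torch

lib_dir = os.path.join(os.path.dirname(torch.__file__), 'lib')
print(sorted(f for f in os.listdir(lib_dir) if f.startswith('libtorch_cuda')))
```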
- "fatal error C1189: #error: -- unsupported Microsoft Visual Studio version!"
If you are building mmcv-full on Windows and the version of CUDA is 9.2, you will probably encounter the error `"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include\crt/host_config.h(133): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions 2012, 2013, 2015 and 2017 are supported!"`, in which case you can use a lower version of Microsoft Visual Studio like vs2017.
- "error: member "torch::jit::detail::ModulePolicy::all_slots" may not be initialized"
If your version of PyTorch is 1.5.0 and you are building mmcv-full on Windows, you will probably encounter the error `- torch/csrc/jit/api/module.h(474): error: member "torch::jit::detail::ModulePolicy::all_slots" may not be initialized`. The way to solve the error is to replace all the `static constexpr bool all_slots = false;` with `static bool all_slots = false;` in the file [torch/csrc/jit/api/module.h](https://github.com/pytorch/pytorch/blob/v1.5.0/torch/csrc/jit/api/module.h). More details can be found at [member "torch::jit::detail::AttributePolicy::all_slots" may not be initialized](https://github.com/pytorch/pytorch/issues/39394).
- "error: a member with an in-class initializer must be const"
If your version of PyTorch is 1.6.0 and you are building mmcv-full on Windows, you will probably encounter the error `"- torch/include\torch/csrc/jit/api/module.h(483): error: a member with an in-class initializer must be const"`. The way to solve the error is to replace all the `CONSTEXPR_EXCEPT_WIN_CUDA ` with `const` at `torch/include\torch/csrc/jit/api/module.h`. More details can be found at [Ninja: build stopped: subcommand failed](https://github.com/open-mmlab/mmcv/issues/575).
- "error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized"
If your version of PyTorch is 1.7.0 and you are building mmcv-full on Windows, you will probably encounter the error `torch/include\torch/csrc/jit/ir/ir.h(1347): error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized`. The way to solve the error needs to modify several local files of PyTorch:
- delete `static constexpr Symbol Kind = ::c10::prim::profile;` and `static constexpr Symbol Kind = ::c10::prim::profile_optional;` at `torch/include\torch/csrc/jit/ir/ir.h`
- replace `explicit operator type&() { return *(this->value); }` with `explicit operator type&() { return *((type*)this->value); }` at `torch\include\pybind11\cast.h`
- replace all the `CONSTEXPR_EXCEPT_WIN_CUDA` with `const` at `torch/include\torch/csrc/jit/api/module.h`
More details can be found at [Ensure default extra_compile_args](https://github.com/pytorch/pytorch/pull/45956).
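If you would rather apply these replacements with a script than by hand, a rough sketch for the `module.h` edit is shown below; the path is only an example and must be adapted to your environment, and you should back up the files first.

```python
from pathlib import Path

# example path; point this at torch/include in your own environment
module_h = Path(r'C:\path\to\env\Lib\site-packages\torch\include\torch\csrc\jit\api\module.h')

text = module_h.read_text()
# replace all CONSTEXPR_EXCEPT_WIN_CUDA with const, as described above
module_h.write_text(text.replace('CONSTEXPR_EXCEPT_WIN_CUDA', 'const'))
```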
- Compatibility issue between MMCV and MMDetection; "ConvWS is already registered in conv layer"
Please install the correct version of MMCV for the version of your MMDetection following the [installation instruction](https://mmdetection.readthedocs.io/en/latest/get_started.html#installation).
### Usage
- "RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one"
1. This error indicates that your module has parameters that were not used in producing loss. This phenomenon may be caused by running different branches in your code in DDP mode (see the sketch below). More details at [Expected to have finished reduction in the prior iteration before starting a new one](https://github.com/pytorch/pytorch/issues/55582).
2. You can set `find_unused_parameters = True` in the config to solve the above problem, or find those unused parameters manually
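A minimal sketch of how unused parameters arise (a hypothetical module; when `use_branch_b` is `False`, the parameters of `branch_b` never receive gradients and DDP raises the error above unless `find_unused_parameters=True` is set):

```python
import torch.nn as nn

class TwoBranchHead(nn.Module):
    def __init__(self, use_branch_b=False):
        super().__init__()
        self.branch_a = nn.Linear(16, 4)
        self.branch_b = nn.Linear(16, 4)  # registered but possibly never used in forward
        self.use_branch_b = use_branch_b

    def forward(self, x):
        if self.use_branch_b:
            return self.branch_b(x)
        return self.branch_a(x)  # branch_b's parameters stay unused -> DDP complains
```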
- "RuntimeError: Trying to backward through the graph a second time"
`GradientCumulativeOptimizerHook` and `OptimizerHook` are both set, which causes `loss.backward()` to be called twice, so a `RuntimeError` is raised. Only one of them should be used (see the config sketch below). More details at [Trying to backward through the graph a second time](https://github.com/open-mmlab/mmcv/issues/1379).
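A hedged config sketch of the fix, keeping only `GradientCumulativeOptimizerHook` (field names follow the usual MMCV conventions; adapt the values to your setup):

```python
# configure a single optimizer hook; do not also set a plain OptimizerHook
optimizer_config = dict(
    type='GradientCumulativeOptimizerHook',
    cumulative_iters=4,  # example value: accumulate gradients over 4 iterations
    grad_clip=None)
```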
You should know how to set up environment variables, especially `Path`, on Windows.
We only tested PyTorch version >= 1.6.0.
4. Prepare MMCV source code
```shell
git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
```
5. Install required Python packages
```shell
pip3 install -r requirements/runtime.txt
```
6. It is recommended to install `ninja` to speed up the compilation
```bash
pip install -r requirements/optional.txt
```
#### Build and install MMCV
...
...
MMCV can be built in three ways:
1. Lite version (without ops)
In this way, no custom ops are compiled and mmcv is a pure python package.
2. Full version (CPU ops)
Module `ops` will be compiled as a pytorch extension, but only x86 code will be compiled. The compiled ops can be executed on CPU only.
3. Full version (CUDA ops)
Both the x86 and CUDA code of the `ops` module will be compiled. The compiled version can run on both CPU and CUDA-enabled GPUs (if implemented).
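After installation, one simple way to tell which variant you ended up with is to try importing the compiled ops (a sketch; the exact behaviour may vary across mmcv versions):

```python
# lite version: the import below fails; full version (CPU or CUDA ops): it succeeds
try:
    from mmcv.ops import get_compiler_version, get_compiling_cuda_version
    print(get_compiler_version())
    print(get_compiling_cuda_version())  # CUDA version the ops were compiled with, if any
except ImportError:
    print('mmcv is installed without compiled ops (lite version)')
```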
...
...
##### Option 2: Build MMCV (full version with CPU)
1. Finish the above common steps
2. Set up environment variables
```shell
$env:MMCV_WITH_OPS = 1
$env:MAX_JOBS = 8 # based on your available number of CPU cores and amount of memory
```
3. Follow the build steps of the lite version
```shell
# activate environment
...
...
pip list
```
##### Option 3: Build MMCV (full version with CUDA)
1. Finish the above common steps
2. Make sure `CUDA_PATH` or `CUDA_HOME` is already set in `envs` via `ls env:`; the desired output is shown below:
```none
(base) PS C:\Users\WRH> ls env:
...
...
```
```shell
$env:CUDA_HOME = $env:CUDA_PATH_V10_2 # if CUDA_PATH_V10_2 is in envs:
```
3. Set CUDA target arch
```shell
# Suppose you are using GTX 1080, which is of capability 6.1
...
...
```
```{note}
Check the compute capability of your GPU from [here](https://developer.nvidia.com/cuda-gpus).
```
4. Launch the compilation the same way as for the CPU version
```shell
$env:MMCV_WITH_OPS = 1
...
...
```
```{note}
If you are compiling against PyTorch 1.6.0, you might meet some errors from PyTorch.
```
If you meet issues when running or compiling mmcv, we list some common issues in the [Frequently Asked Questions](../faq.html).
## \[Optional\] Build MMCV on IPU machine
Firstly, you need to apply for an IPU cloud machine, see [here](https://www.graphcore.ai/ipus-in-the-cloud).
### Option 1: Docker
1. Pull docker
```shell
docker pull graphcore/pytorch
```
2. Build MMCV under the same Python environment
### Option 2: Install from SDK
1. Build MMCV
2. Use pip to install the SDK according to the [IPU PyTorch document](https://docs.graphcore.ai/projects/poptorch-user-guide/en/latest/installation.html). Also, you need to apply to Graphcore for the machine and SDK.
- **mmcv-full**: comprehensive, with full features and various CUDA ops out of the box. It takes longer to build.
- **mmcv**: lite, without CUDA ops but with all other features, similar to mmcv\<1.0.0. It is useful when you do not need those CUDA ops.
```{warning}
Do not install both versions in the same environment, otherwise you may encounter errors like `ModuleNotFound`. You need to uninstall one before installing the other. `Installing the full version is highly recommended if CUDA is available`.
```
...
...
a. Install the full version.
Before installing mmcv-full, make sure that PyTorch has been successfully installed following the [official guide](https://pytorch.org/).
We provide pre-built mmcv packages (recommended) with different PyTorch and CUDA versions to simplify the building for **Linux and Windows systems**. In addition, you can run [check_installation.py](.dev_scripts/check_installation.py) to check the installation of mmcv-full after running the installation commands.
i. Install the latest version.
The rule for installing the latest `mmcv-full` is as follows:
For more details, please refer to the following tables.
```{note}
The pre-built packages provided above do not include all versions of mmcv-full; you can click on the corresponding links to see the supported versions. For example, if you click [cu102-torch1.8.0](https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html), you can see that `cu102-torch1.8.0` only provides mmcv-full 1.3.0 and above. In addition, we no longer provide `mmcv-full` pre-built packages compiled with `PyTorch 1.3 & 1.4` since v1.3.17. You can find previous versions compiled with PyTorch 1.3 & 1.4 [here](./previous_versions.md). The compatibility is still ensured in our CI, but we will drop support for PyTorch 1.3 & 1.4 next year.
```
```{note}
mmcv-full does not provide pre-built packages for `cu102-torch1.11` and `cu92-torch*` on Windows.
```
MMCV implements [registry](https://github.com/open-mmlab/mmcv/blob/master/mmcv/utils/registry.py) to manage different modules that share similar functionalities, e.g., backbones, heads, and necks, in detectors.
Most projects in OpenMMLab use registry to manage modules of datasets and models, such as [MMDetection](https://github.com/open-mmlab/mmdetection), [MMDetection3D](https://github.com/open-mmlab/mmdetection3d), [MMClassification](https://github.com/open-mmlab/mmclassification), [MMEditing](https://github.com/open-mmlab/mmediting), etc.
```{note}
In v1.5.1 and later, the Registry supports registering functions and calling them.
```
### What is registry
In MMCV, registry can be regarded as a mapping that maps a class or function to a string.
These classes or functions contained by a single registry usually have similar APIs but implement different algorithms or support different datasets.
With the registry, users can find the class or function through its corresponding string, and instantiate the corresponding module or call the function to obtain the result according to needs.
One typical example is the config systems in most OpenMMLab projects, which use the registry to create hooks, runners, models, and datasets, through configs.
The API reference could be found [here](https://mmcv.readthedocs.io/en/latest/api.html?highlight=registry#mmcv.utils.Registry).
...
...
To manage your modules in the codebase by `Registry`, there are three steps as below.
2. Create a registry.
3. Use this registry to manage the modules.
The `build_func` argument of `Registry` customizes how to instantiate the class instance or how to call the function to obtain the result; the default one is `build_from_cfg`, implemented [here](https://mmcv.readthedocs.io/en/latest/api.html?highlight=registry#mmcv.utils.build_from_cfg).
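For reference, a minimal sketch of passing a custom `build_func`; here it simply delegates to the default `build_from_cfg`, while a real one could add extra logic before or after:

```python
from mmcv.utils import Registry, build_from_cfg

def build_converter(cfg, registry, default_args=None):
    # extra logic could go here, e.g. validating cfg or injecting default arguments
    return build_from_cfg(cfg, registry, default_args)

CONVERTERS = Registry('converter', build_func=build_converter)
```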
### A Simple Example
...
...
In the package, we first create a file to implement builders, named `converters/builder.py`.
```python
from mmcv.utils import Registry
# create a registry for converters
CONVERTERS = Registry('converters')
```
Then we can implement different converters, which can be classes or functions, in the package. For example, implement `Converter1` in `converters/converter1.py` and `converter2` in `converters/converter2.py`.
```python
# converter1.py
from .builder import CONVERTERS

# use the registry to manage the module
@CONVERTERS.register_module()
class Converter1(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
```
```python
# converter2.py
from .builder import CONVERTERS
from .converter1 import Converter1

# use the registry to manage the module
@CONVERTERS.register_module()
def converter2(a, b):
    return Converter1(a, b)
```
The key step to use the registry for managing the modules is to register the implemented module into the registry `CONVERTERS` through
`@CONVERTERS.register_module()` when you are creating the module. In this way, a mapping between a string and the class (function) is built and maintained by `CONVERTERS` as below
```python
'Converter1' -> <class 'Converter1'>
'converter2' -> <function 'converter2'>
```
```{note}
The registry mechanism will be triggered only when the file where the module is located is imported.
So you need to import that file somewhere. More details can be found at https://github.com/open-mmlab/mmdetection/issues/5974.
```
If the module is successfully registered, you can use this converter through configs as
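For example (a sketch following the converters defined above; the argument values are placeholders):

```python
converter1_cfg = dict(type='Converter1', a=1, b=2)
converter2_cfg = dict(type='converter2', a=3, b=4)
# build() instantiates the registered class from the config
converter1 = CONVERTERS.build(converter1_cfg)
# for a registered function, build() calls it and returns the result
result = CONVERTERS.build(converter2_cfg)
```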
The runner class is designed to manage the training. It eases the training process.
### EpochBasedRunner
As its name indicates, workflow in `EpochBasedRunner` should be set based on epochs. For example, \[('train', 2), ('val', 1)\] means running 2 epochs for training and 1 epoch for validation, iteratively. And each epoch may contain multiple iterations. Currently, MMDetection uses `EpochBasedRunner` by default.
Different from `EpochBasedRunner`, workflow in `IterBasedRunner` should be set based on iterations. For example, \[('train', 2), ('val', 1)\] means running 2 iters for training and 1 iter for validation, iteratively. Currently, MMSegmentation uses `IterBasedRunner` by default.
Let's take `EpochBasedRunner` for example and go a little bit into details about setting workflow:
- Say we only want to put train in the workflow, then we can set: workflow = \[('train', 1)\]. The runner will only execute train iteratively in this case.
- Say we want to put both train and val in the workflow, then we can set: workflow = \[('train', 3), ('val', 1)\]. The runner will first execute train for 3 epochs and then switch to val mode and execute val for 1 epoch. The workflow will be repeated until the current epoch hits max_epochs.
- Workflow is highly flexible. Therefore, you can set workflow = \[('val', 1), ('train', 1)\] if you would like the runner to validate first and train after.
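For instance, a minimal config sketch of the train/val workflow described above (field names follow the usual MMCV/MMDetection config conventions):

```python
# run 3 training epochs followed by 1 validation epoch, repeated until max_epochs is reached
runner = dict(type='EpochBasedRunner', max_epochs=12)
workflow = [('train', 3), ('val', 1)]
```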
The code we demonstrated above is already in `train.py` in MM repositories. Simply modify the corresponding keys in the configuration files and the script will execute the expected workflow automatically.