diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index a60cd994305bc8b548c71951c9b57d544c1ec21d..2fdf8a2d23cff3f69ea753466370b6dc3c719686 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,258 +1,71 @@
## Contributing to OpenMMLab
-Welcome to the MMCV community! We are committed to building a cutting-edge computer vision foundational library, and all kinds of contributions are welcome, including but not limited to
+All kinds of contributions are welcome, including but not limited to the following.
-**Fix bug**
+- Fix typo or bugs
+- Add documentation or translate the documentation into other languages
+- Add new features and components
-You can directly post a Pull Request to fix typos in code or documents
+### Workflow
-The steps to fix the bug of code implementation are as follows.
+1. fork and pull the latest OpenMMLab repository
+2. checkout a new branch (do not use master branch for PRs)
+3. commit your changes
+4. create a PR
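+
+A minimal shell sketch of this workflow (assumptions: `{username}` and the branch name `my-feature` are placeholders to replace):
+
+```shell
+git clone git@github.com:{username}/mmcv.git
+cd mmcv
+git remote add upstream git@github.com:open-mmlab/mmcv.git
+git checkout -b my-feature   # do not use the master branch for PRs
+# ...make and commit your changes...
+git commit -am "Describe your change"
+git push -u origin my-feature   # then open a PR on GitHub
+```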
-1. If the modification involves significant changes, you should create an issue first and describe the error and how to trigger the bug. Other developers will discuss it with you and propose a proper solution.
-
-2. Post a pull request after fixing the bug and adding the corresponding unit tests.
-
-**New Feature or Enhancement**
-
-1. If the modification involves significant changes, you should create an issue to discuss a proper design with our developers.
-2. Post a Pull Request after implementing the new feature or enhancement and add the corresponding unit tests.
-
-**Document**
-
-You can directly post a pull request to fix documents. If you want to add a document, you should first create an issue to check if it is reasonable.
-
-### Pull Request Workflow
-
-If you're not familiar with Pull Requests, don't worry! The following guidance will tell you how to create a Pull Request step by step. If you want to dive into the development workflow of Pull Requests, you can refer to the [official documents](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)
-
-#### 1. Fork and clone
-
-If you are posting a pull request for the first time, you should fork the OpenMMLab repositories by clicking the **Fork** button in the top right corner of the GitHub page, and the forked repositories will appear under your GitHub profile.
-
-
-
-Then, you can clone the repositories to local:
-
-```shell
-git clone git@github.com:{username}/mmcv.git
+```{note}
+If you plan to add some new features that involve large changes, it is encouraged to open an issue for discussion first.
```
+### Code style
-After that, you should add the official repository as the upstream repository
+#### Python
-```bash
-git remote add upstream git@github.com:open-mmlab/mmcv
-```
+We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style.
-Check whether remote repository has been added successfully by `git remote -v`
+We use the following tools for linting and formatting:
-```bash
-origin git@github.com:{username}/mmcv.git (fetch)
-origin git@github.com:{username}/mmcv.git (push)
-upstream git@github.com:open-mmlab/mmcv (fetch)
-upstream git@github.com:open-mmlab/mmcv (push)
-```
+- [flake8](http://flake8.pycqa.org/en/latest/): A wrapper around some linter tools.
+- [yapf](https://github.com/google/yapf): A formatter for Python files.
+- [isort](https://github.com/timothycrosley/isort): A Python utility to sort imports.
+- [markdownlint](https://github.com/markdownlint/markdownlint): A linter to check markdown files and flag style issues.
+- [docformatter](https://github.com/myint/docformatter): A formatter for docstrings.
-> Here's a brief introduction to origin and upstream. When we use "git clone", we create an "origin" remote by default, which points to the repository cloned from. As for "upstream", we add it ourselves to point to the target repository. Of course, if you don't like the name "upstream", you could name it as you wish. Usually, we'll push the code to "origin". If the pushed code conflicts with the latest code in official("upstream"), we should pull the latest code from upstream to resolve the conflicts, and then push to "origin" again. The posted Pull Request will be updated automatically.
+Style configurations of yapf and isort can be found in [setup.cfg](./setup.cfg).
-#### 2. Configure pre-commit
+We use a [pre-commit hook](https://pre-commit.com/) that, on every commit, checks and formats code with `flake8`, `yapf`, and `isort`, checks `trailing whitespaces` and `markdown files`,
+fixes `end-of-files`, `double-quoted-strings`, `python-encoding-pragma`, and `mixed-line-ending`, and sorts `requirements.txt` automatically.
+The config for a pre-commit hook is stored in [.pre-commit-config](./.pre-commit-config.yaml).
-You should configure [pre-commit](https://pre-commit.com/#intro) in the local development environment to make sure the code style matches that of OpenMMLab. **Note**: The following code should be executed under the MMCV directory.
+After you clone the repository, you will need to install and initialize the pre-commit hook.
```shell
pip install -U pre-commit
-pre-commit install
-```
-
-Check that pre-commit is configured successfully, and install the hooks defined in `.pre-commit-config.yaml`.
-
-```shell
-pre-commit run --all-files
-```
-
-
-
-
-
-If the installation process is interrupted, you can repeatedly run `pre-commit run ... ` to continue the installation.
-
-If the code does not conform to the code style specification, pre-commit will raise a warning and fix some of the errors automatically.
-
-
-
-If we want to commit our code bypassing the pre-commit hook, we can use the `--no-verify` option (**only for a temporary commit**).
-
-```shell
-git commit -m "xxx" --no-verify
-```
-
-#### 3. Create a development branch
-
-After configuring the pre-commit, we should create a branch based on the master branch to develop the new feature or fix the bug. The proposed branch name is `username/pr_name`
-
-```shell
-git checkout -b yhc/refactor_contributing_doc
-```
-
-In subsequent development, if the master branch of the local repository is behind the master branch of "upstream", we need to pull the upstream for synchronization, and then execute the above command:
-
-```shell
-git pull upstream master
-```
-
-#### 4. Commit the code and pass the unit test
-
-- MMCV introduces mypy to do static type checking to increase the robustness of the code. Therefore, we need to add Type Hints to our code and pass the mypy check. If you are not familiar with Type Hints, you can refer to [this tutorial](https://docs.python.org/3/library/typing.html).
-
-- The committed code should pass through the unit test
-
- ```shell
- # Pass all unit tests
- pytest tests
-
- # Pass the unit test of runner
- pytest tests/test_runner/test_runner.py
- ```
-
- If the unit test fails for lack of dependencies, you can install the dependencies referring to the [guidance](#unit-test)
-
-- If the documents are modified/added, we should check the rendering result referring to [guidance](#document-rendering)
-
-#### 5. Push the code to remote
-
-We could push the local commits to remote after passing the unit tests and pre-commit checks. You can associate the local branch with the remote branch by adding the `-u` option.
-
-```shell
-git push -u origin {branch_name}
-```
-
-This will allow you to use the `git push` command to push code directly next time, without having to specify a branch or the remote repository.
-
-#### 6. Create a Pull Request
-
-(1) Create a pull request in GitHub's Pull request interface
-
-
-
-(2) Modify the PR description according to the guidelines so that other developers can better understand your changes
-
-
-
-Find more details about Pull Request description in [pull request guidelines](#pr-specs).
-
-**note**
-
-(a) The Pull Request description should contain the reason for the change, the content of the change, and the impact of the change, and be associated with the relevant Issue (see [documentation](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue))
-
-(b) If it is your first contribution, please sign the CLA
-
-
-
-(c) Check whether the Pull Request passes the CI
-
-
-
-MMCV will run unit tests for the posted Pull Request on different platforms (Linux, Windows, Mac), based on different versions of Python, PyTorch, and CUDA, to make sure the code is correct. We can see the specific test information by clicking `Details` in the above image so that we can modify the code.
-
-(3) If the Pull Request passes the CI, then you can wait for the review from other developers. You'll modify the code based on the reviewer's comments, and repeat the steps [4](#4-commit-the-code-and-pass-the-unit-test)-[5](#5-push-the-code-to-remote) until all reviewers approve it. Then, we will merge it ASAP.
-
-
-
-#### 7. Resolve conflicts
-
-If your local branch conflicts with the latest master branch of "upstream", you'll need to resolve the conflicts. There are two ways to do this:
-
-```shell
-git fetch --all --prune
-git rebase upstream/master
```
-or
-
-```shell
-git fetch --all --prune
-git merge upstream/master
-```
-
-If you are very good at handling conflicts, then you can use rebase to resolve conflicts, as this will keep your commit logs tidy. If you are not familiar with `rebase`, then you can use `merge` to resolve conflicts.
-
-### Guidance
-
-#### Unit test
-
-If you cannot run the unit tests of some modules because dependencies are missing, such as the [video](https://github.com/open-mmlab/mmcv/tree/master/mmcv/video) module, you can try to install the following dependencies:
+From the repository folder, run:
```shell
-# Linux
-sudo apt-get update -y
-sudo apt-get install -y libturbojpeg
-sudo apt-get install -y ffmpeg
-
-# Windows
-conda install ffmpeg
+pre-commit install
```
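+
+To check that the hooks are configured successfully, you can run them once against all files:
+
+```shell
+pre-commit run --all-files
+```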
-We should also make sure the committed code does not decrease the unit test coverage. We could run the following command to check the coverage:
+If you encounter an issue when installing markdownlint, try the following steps to install Ruby:
```shell
-python -m coverage run -m pytest /path/to/test_file
-python -m coverage html
-# check file in htmlcov/index.html
-```
-
-#### Document rendering
+# install rvm
+curl -L https://get.rvm.io | bash -s -- --autolibs=read-fail
+[[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm"
+rvm autolibs disable
-If the documents are modified/added, we should check the rendering result. We could install the dependencies and run the following command to render the documents and check the results:
-
-```shell
-pip install -r requirements/docs.txt
-cd docs/zh_cn/
-# or docs/en
-make html
-# check file in ./docs/zh_cn/_build/html/index.html
+# install ruby
+rvm install 2.7.1
```
-### Code style
+Or refer to [this repo](https://github.com/innerlee/setup) and run [`zzruby.sh`](https://github.com/innerlee/setup/blob/master/zzruby.sh) according to its instructions.
-#### Python
+After this, the code linters and formatters will be enforced on every commit.
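+
+If you need to bypass the hooks for a temporary commit, `git commit` accepts the `--no-verify` option (the code you finally push must still pass the checks):
+
+```shell
+git commit -m "xxx" --no-verify
+```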
-We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style.
-
-We use the following tools for linting and formatting:
-
-- [flake8](https://github.com/PyCQA/flake8): A wrapper around some linter tools.
-- [isort](https://github.com/timothycrosley/isort): A Python utility to sort imports.
-- [yapf](https://github.com/google/yapf): A formatter for Python files.
-- [codespell](https://github.com/codespell-project/codespell): A Python utility to fix common misspellings in text files.
-- [mdformat](https://github.com/executablebooks/mdformat): Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files.
-- [docformatter](https://github.com/myint/docformatter): A formatter to format docstring.
-
-Style configurations of yapf and isort can be found in [setup.cfg](./setup.cfg).
-
-We use [pre-commit hook](https://pre-commit.com/) that checks and formats for `flake8`, `yapf`, `isort`, `trailing whitespaces`, `markdown files`,
-fixes `end-of-files`, `double-quoted-strings`, `python-encoding-pragma`, `mixed-line-ending`, sorts `requirements.txt` automatically on every commit.
-The config for a pre-commit hook is stored in [.pre-commit-config](./.pre-commit-config.yaml).
+> Before you create a PR, make sure that your code lints and is formatted by yapf.
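+
+A quick local pass with the same tools before opening the PR (a sketch; the `mmcv/` target directory is an assumption, adjust it to the paths you touched):
+
+```shell
+flake8 mmcv/
+isort mmcv/
+yapf -r -i mmcv/
+```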
#### C++ and CUDA
We follow the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html).
-
-### PR Specs
-
-1. Use [pre-commit](https://pre-commit.com) hook to avoid issues of code style
-
-2. One short-time branch should be matched with only one PR
-
-3. Accomplish one detailed change in one PR. Avoid large PRs
-
- - Bad: Support Faster R-CNN
- - Acceptable: Add a box head to Faster R-CNN
- - Good: Add a parameter to box head to support custom conv-layer number
-
-4. Provide clear and meaningful commit messages
-
-5. Provide clear and meaningful PR description
-
- - Task name should be clarified in title. The general format is: \[Prefix\] Short description of the PR (Suffix)
- - Prefix: add new feature \[Feature\], fix bug \[Fix\], related to documents \[Docs\], in developing \[WIP\] (which will not be reviewed temporarily)
- - Introduce main changes, results and influences on other modules in short description
- - Associate related issues and pull requests with a milestone
diff --git a/CONTRIBUTING_zh-CN.md b/CONTRIBUTING_zh-CN.md
deleted file mode 100644
index 00622031dd567957829f38d0425d3d23741c8f2f..0000000000000000000000000000000000000000
--- a/CONTRIBUTING_zh-CN.md
+++ /dev/null
@@ -1,274 +0,0 @@
-## 贡献代码
-
-欢迎加入 MMCV 社区,我们致力于打造最前沿的计算机视觉基础库,我们欢迎任何类型的贡献,包括但不限于
-
-**修复错误**
-
-修复代码实现错误的步骤如下:
-
-1. 如果提交的代码改动较大,建议先提交 issue,并正确描述 issue 的现象、原因和复现方式,讨论后确认修复方案。
-2. 修复错误并补充相应的单元测试,提交拉取请求。
-
-**新增功能或组件**
-
-1. 如果新功能或模块涉及较大的代码改动,建议先提交 issue,确认功能的必要性。
-2. 实现新增功能并添加单元测试,提交拉取请求。
-
-**文档补充**
-
-修复文档可以直接提交拉取请求
-
-添加文档或将文档翻译成其他语言步骤如下
-
-1. 提交 issue,确认添加文档的必要性。
-2. 添加文档,提交拉取请求。
-
-### 拉取请求工作流
-
-如果你对拉取请求不了解,没关系,接下来的内容将会从零开始,一步一步地指引你如何创建一个拉取请求。如果你想深入了解拉取请求的开发模式,可以参考 github [官方文档](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)
-
-#### 1. 复刻仓库
-
-当你第一次提交拉取请求时,先复刻 OpenMMLab 原代码库,点击 GitHub 页面右上角的 **Fork** 按钮,复刻后的代码库将会出现在你的 GitHub 个人主页下。
-
-
-
-将代码克隆到本地
-
-```shell
-git clone git@github.com:{username}/mmcv.git
-```
-
-添加原代码库为上游代码库
-
-```bash
-git remote add upstream git@github.com:open-mmlab/mmcv
-```
-
-检查 remote 是否添加成功,在终端输入 `git remote -v`
-
-```bash
-origin git@github.com:{username}/mmcv.git (fetch)
-origin git@github.com:{username}/mmcv.git (push)
-upstream git@github.com:open-mmlab/mmcv (fetch)
-upstream git@github.com:open-mmlab/mmcv (push)
-```
-
-> 这里对 origin 和 upstream 进行一个简单的介绍,当我们使用 git clone 来克隆代码时,会默认创建一个 origin 的 remote,它指向我们克隆的代码库地址,而 upstream 则是我们自己添加的,用来指向原始代码库地址。当然如果你不喜欢他叫 upstream,也可以自己修改,比如叫 open-mmlab。我们通常向 origin 提交代码(即 fork 下来的远程仓库),然后向 upstream 提交一个 pull request。如果提交的代码和最新的代码发生冲突,再从 upstream 拉取最新的代码,和本地分支解决冲突,再提交到 origin。
-
-#### 2. 配置 pre-commit
-
-在本地开发环境中,我们使用 [pre-commit](https://pre-commit.com/#intro) 来检查代码风格,以确保代码风格的统一。在提交代码前,需要先安装 pre-commit(需要在 MMCV 目录下执行):
-
-```shell
-pip install -U pre-commit
-pre-commit install
-```
-
-检查 pre-commit 是否配置成功,并安装 `.pre-commit-config.yaml` 中的钩子:
-
-```shell
-pre-commit run --all-files
-```
-
-
-
-
-
-> 如果你是中国用户,由于网络原因,可能会出现安装失败的情况,这时可以使用国内源
-
-> pre-commit install -c .pre-commit-config-zh-cn.yaml
-
-> pre-commit run --all-files -c .pre-commit-config-zh-cn.yaml
-
-如果安装过程被中断,可以重复执行 `pre-commit run ...` 继续安装。
-
-如果提交的代码不符合代码风格规范,pre-commit 会发出警告,并自动修复部分错误。
-
-
-
-如果我们想临时绕开 pre-commit 的检查提交一次代码,可以在 `git commit` 时加上 `--no-verify`(需要保证最后推送至远程仓库的代码能够通过 pre-commit 检查)。
-
-```shell
-git commit -m "xxx" --no-verify
-```
-
-#### 3. 创建开发分支
-
-安装完 pre-commit 之后,我们需要基于 master 创建开发分支,建议的分支命名规则为 `username/pr_name`。
-
-```shell
-git checkout -b yhc/refactor_contributing_doc
-```
-
-在后续的开发中,如果本地仓库的 master 分支落后于 upstream 的 master 分支,我们需要先拉取 upstream 的代码进行同步,再执行上面的命令
-
-```shell
-git pull upstream master
-```
-
-#### 4. 提交代码并在本地通过单元测试
-
-- MMCV 引入了 mypy 来做静态类型检查,以增加代码的鲁棒性。因此我们在提交代码时,需要补充 Type Hints。具体规则可以参考[教程](https://zhuanlan.zhihu.com/p/519335398)。
-
-- 提交的代码同样需要通过单元测试
-
- ```shell
- # 通过全量单元测试
- pytest tests
-
- # 我们需要保证提交的代码能够通过修改模块的单元测试,以 runner 为例
- pytest tests/test_runner/test_runner.py
- ```
-
- 如果你由于缺少依赖无法运行修改模块的单元测试,可以参考[指引-单元测试](#单元测试)
-
-- 如果修改/添加了文档,参考[指引](#文档渲染)确认文档渲染正常。
-
-#### 5. 推送代码到远程
-
-代码通过单元测试和 pre-commit 检查后,将代码推送到远程仓库,如果是第一次推送,可以在 `git push` 后加上 `-u` 参数以关联远程分支
-
-```shell
-git push -u origin {branch_name}
-```
-
-这样下次就可以直接使用 `git push` 命令推送代码了,而无需指定分支和远程仓库。
-
-#### 6. 提交拉取请求(PR)
-
-(1) 在 GitHub 的 Pull request 界面创建拉取请求
-
-
-(2) 根据指引修改 PR 描述,以便于其他开发者更好地理解你的修改
-
-
-
-描述规范详见[拉取请求规范](#拉取请求规范)
-
-
-
-**注意事项**
-
-(a) PR 描述应该包含修改理由、修改内容以及修改后带来的影响,并关联相关 Issue(具体方式见[文档](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue))
-
-(b) 如果是第一次为 OpenMMLab 做贡献,需要签署 CLA
-
-
-
-(c) 检查提交的 PR 是否通过 CI(集成测试)
-
-
-
-MMCV 会在不同的平台(Linux、Window、Mac),基于不同版本的 Python、PyTorch、CUDA 对提交的代码进行单元测试,以保证代码的正确性,如果有任何一个没有通过,我们可点击上图中的 `Details` 来查看具体的测试信息,以便于我们修改代码。
-
-(3) 如果 PR 通过了 CI,那么就可以等待其他开发者的 review,并根据 reviewer 的意见,修改代码,并重复 [4](#4-提交代码并在本地通过单元测试)-[5](#5-推送代码到远程) 步骤,直到 reviewer 同意合入 PR。
-
-
-
-所有 reviewer 同意合入 PR 后,我们会尽快将 PR 合并到主分支。
-
-#### 7. 解决冲突
-
-随着时间的推移,我们的代码库会不断更新,这时候,如果你的 PR 与主分支存在冲突,你需要解决冲突,解决冲突的方式有两种:
-
-```shell
-git fetch --all --prune
-git rebase upstream/master
-```
-
-或者
-
-```shell
-git fetch --all --prune
-git merge upstream/master
-```
-
-如果你非常善于处理冲突,那么可以使用 rebase 的方式来解决冲突,因为这能够保证你的 commit log 的整洁。如果你不太熟悉 `rebase` 的使用,那么可以使用 `merge` 的方式来解决冲突。
-
-### 指引
-
-#### 单元测试
-
-如果你无法正常执行部分模块的单元测试,例如 [video](https://github.com/open-mmlab/mmcv/tree/master/mmcv/video) 模块,可能是你的当前环境没有安装以下依赖
-
-```shell
-# Linux
-sudo apt-get update -y
-sudo apt-get install -y libturbojpeg
-sudo apt-get install -y ffmpeg
-
-# Windows
-conda install ffmpeg
-```
-
-在提交修复代码错误或新增特性的拉取请求时,我们应该尽可能的让单元测试覆盖所有提交的代码,计算单元测试覆盖率的方法如下
-
-```shell
-python -m coverage run -m pytest /path/to/test_file
-python -m coverage html
-# check file in htmlcov/index.html
-```
-
-#### 文档渲染
-
-在提交修复代码错误或新增特性的拉取请求时,可能会需要修改/新增模块的 docstring。我们需要确认渲染后的文档样式是正确的。
-本地生成渲染后的文档的方法如下
-
-```shell
-pip install -r requirements/docs.txt
-cd docs/zh_cn/
-# or docs/en
-make html
-# check file in ./docs/zh_cn/_build/html/index.html
-```
-
-### 代码风格
-
-#### Python
-
-[PEP8](https://www.python.org/dev/peps/pep-0008/) 作为 OpenMMLab 算法库首选的代码规范,我们使用以下工具检查和格式化代码
-
-- [flake8](https://github.com/PyCQA/flake8): Python 官方发布的代码规范检查工具,是多个检查工具的封装
-- [isort](https://github.com/timothycrosley/isort): 自动调整模块导入顺序的工具
-- [yapf](https://github.com/google/yapf): Google 发布的代码规范检查工具
-- [codespell](https://github.com/codespell-project/codespell): 检查单词拼写是否有误
-- [mdformat](https://github.com/executablebooks/mdformat): 检查 markdown 文件的工具
-- [docformatter](https://github.com/myint/docformatter): 格式化 docstring 的工具
-
-yapf 和 isort 的配置可以在 [setup.cfg](./setup.cfg) 找到
-
-通过配置 [pre-commit hook](https://pre-commit.com/) ,我们可以在提交代码时自动检查和格式化 `flake8`、`yapf`、`isort`、`trailing whitespaces`、`markdown files`,
-修复 `end-of-files`、`double-quoted-strings`、`python-encoding-pragma`、`mixed-line-ending`,调整 `requirements.txt` 的包顺序。
-pre-commit 钩子的配置可以在 [.pre-commit-config](./.pre-commit-config.yaml) 找到。
-
-pre-commit 具体的安装使用方式见[拉取请求](#2-配置-pre-commit)。
-
-更具体的规范请参考 [OpenMMLab 代码规范](code_style.md)。
-
-#### C++ and CUDA
-
-C++ 和 CUDA 的代码规范遵从 [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html)
-
-### 拉取请求规范
-
-1. 使用 [pre-commit hook](https://pre-commit.com),尽量减少代码风格相关问题
-
-2. 一个`拉取请求`对应一个短期分支
-
-3. 粒度要细,一个`拉取请求`只做一件事情,避免超大的`拉取请求`
-
- - Bad:实现 Faster R-CNN
- - Acceptable:给 Faster R-CNN 添加一个 box head
- - Good:给 box head 增加一个参数来支持自定义的 conv 层数
-
-4. 每次 Commit 时需要提供清晰且有意义的 commit 信息
-
-5. 提供清晰且有意义的`拉取请求`描述
-
- - 标题写明白任务名称,一般格式:\[Prefix\] Short description of the pull request (Suffix)
- - prefix: 新增功能 \[Feature\], 修 bug \[Fix\], 文档相关 \[Docs\], 开发中 \[WIP\] (暂时不会被review)
- - 描述里介绍`拉取请求`的主要修改内容,结果,以及对其他部分的影响, 参考`拉取请求`模板
- - 关联相关的`议题` (issue) 和其他`拉取请求`
-
-6. 如果引入了其他三方库,或借鉴了三方库的代码,请确认他们的许可证和 mmcv 兼容,并在借鉴的代码上补充 `This code is inspired from http://`
diff --git a/Dockerfile b/Dockerfile
new file mode 100644
index 0000000000000000000000000000000000000000..e163b312ca5b45dac195232979fa31024ff55ef2
--- /dev/null
+++ b/Dockerfile
@@ -0,0 +1,7 @@
+FROM python:3.7
+
+WORKDIR /mmcv
+
+COPY . /mmcv
+
+RUN pip install -e .
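+
+# Usage sketch (assumptions: built from the repository root; the "mmcv" tag is arbitrary):
+#   docker build -t mmcv .
+#   docker run -it mmcv python -c 'import mmcv; print(mmcv.__version__)'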
diff --git a/LICENSES.md b/LICENSES.md
index 3cdeddf6ff1d09ed8e2d9042f2d930e20599a0b1..9bb0c8cafa72033f503fd3f46b98d30dcfd75c29 100644
--- a/LICENSES.md
+++ b/LICENSES.md
@@ -2,10 +2,7 @@
In this file, we list the operations with other licenses instead of Apache 2.0. Users should be careful about adopting these operations in any commercial matters.
-| Operation | Files | License |
-| :--------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------: | :------------: |
-| upfirdn2d | [mmcv/ops/csrc/pytorch/cuda/upfirdn2d_kernel.cu](https://github.com/open-mmlab/mmcv/tree/2.x/mmcv/ops/csrc/pytorch/cuda/upfirdn2d_kernel.cu) | NVIDIA License |
-| fused_leaky_relu | [mmcv/ops/csrc/pytorch/cuda/fused_bias_leakyrelu_cuda.cu](https://github.com/open-mmlab/mmcv/tree/2.x/mmcv/ops/csrc/pytorch/cuda/fused_bias_leakyrelu_cuda.cu) | NVIDIA License |
-| bias_act | [mmcv/ops/csrc/pytorch/cuda/bias_act_cuda.cu](https://github.com/open-mmlab/mmcv/tree/2.x/mmcv/ops/csrc/pytorch/cuda/bias_act_cuda.cu) | NVIDIA License |
-| filtered_lrelu | [mmcv/ops/csrc/pytorch/cuda/filtered_lrelu.cu](https://github.com/open-mmlab/mmcv/tree/2.x/mmcv/ops/csrc/pytorch/cuda/filtered_lrelu.cu) | NVIDIA License |
-| conv2d_gradfix | [mmcv/ops/conv2d_gradfix.py](https://github.com/open-mmlab/mmcv/tree/2.x/mmcv/ops/conv2d_gradfix.py) | NVIDIA License |
+| Operation | Files | License |
+| :--------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------: | :------------: |
+| upfirdn2d | [mmcv/ops/csrc/pytorch/cuda/upfirdn2d_kernel.cu](https://github.com/open-mmlab/mmcv/blob/master/mmcv/ops/csrc/pytorch/cuda/upfirdn2d_kernel.cu) | NVIDIA License |
+| fused_leaky_relu | [mmcv/ops/csrc/pytorch/cuda/fused_bias_leakyrelu_cuda.cu](https://github.com/open-mmlab/mmcv/blob/master/mmcv/ops/csrc/pytorch/cuda/fused_bias_leakyrelu_cuda.cu) | NVIDIA License |
diff --git a/MANIFEST.in b/MANIFEST.in
index 622635caa1ec01f78d95c684b87658df87c63b38..65f232e070d43ce40d0fd425201e3b140b5af551 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,6 +1,5 @@
include requirements/runtime.txt
+include mmcv/model_zoo/open_mmlab.json mmcv/model_zoo/deprecated.json mmcv/model_zoo/mmcls.json
include mmcv/ops/csrc/common/cuda/*.cuh mmcv/ops/csrc/common/cuda/*.hpp mmcv/ops/csrc/common/*.hpp
include mmcv/ops/csrc/pytorch/*.cpp mmcv/ops/csrc/pytorch/cuda/*.cu mmcv/ops/csrc/pytorch/cuda/*.cpp mmcv/ops/csrc/pytorch/cpu/*.cpp
include mmcv/ops/csrc/parrots/*.h mmcv/ops/csrc/parrots/*.cpp
-include mmcv/ops/csrc/pytorch/mps/*.mm mmcv/ops/csrc/common/mps/*.h mmcv/ops/csrc/common/mps/*.mm
-recursive-include mmcv/ops/csrc/ *.h *.hpp *.cpp *.cuh *.cu *.mm
diff --git a/README.md b/README.md
index 098cf65f012e3cde8342e467c2391d3b303226c6..9b64100479f8f8030f1736173aa6ee3e25be8f8a 100644
--- a/README.md
+++ b/README.md
@@ -7,7 +7,7 @@ MMCV是计算机视觉研究的基础库,主要提供以下功能:图像处
+ Python 3.7、3.8、3.9
### 1、使用pip方式安装
-mmcv whl包下载目录:[https://cancon.hpccube.com:65024/4/main/mmcv/dtk23.04](https://cancon.hpccube.com:65024/4/main/mmcv/dtk23.04),选择对应的pytorch版本和python版本下载对应mmcv的whl包
+mmcv whl包下载目录:[https://cancon.hpccube.com:65024/4/main/mmcv](https://cancon.hpccube.com:65024/4/main/mmcv),选择对应的pytorch版本和python版本下载对应mmcv的whl包
```shell
pip install mmcv* (下载的mmcv的whl包)
```
@@ -18,7 +18,7 @@ pip install mmcv* (下载的mmcv的whl包)
1. 基于光源pytorch基础镜像环境:镜像下载地址:[https://sourcefind.cn/#/image/dcu/pytorch](https://sourcefind.cn/#/image/dcu/pytorch),根据pytorch、python、dtk及系统下载对应的镜像版本。
-2. 基于现有python环境:安装pytorch,pytorch whl包下载目录:[https://cancon.hpccube.com:65024/4/main/pytorch/dtk23.04](https://cancon.hpccube.com:65024/4/main/pytorch/dtk23.04),根据python、dtk版本,下载对应pytorch的whl包。安装命令如下:
+2. 基于现有python环境:安装pytorch,pytorch whl包下载目录:[https://cancon.hpccube.com:65024/4/main/pytorch/dtk24.04.1](https://cancon.hpccube.com:65024/4/main/pytorch/dtk24.04.1),根据python、dtk版本,下载对应pytorch的whl包。安装命令如下:
```shell
pip install torch* (下载的torch的whl包)
pip install setuptools==59.5.0 wheel
@@ -32,11 +32,17 @@ git clone https://developer.hpccube.com/codes/aicomponent/mmcv # 根据编译需
- 提供2种源码编译方式(进入mmcv目录):
```
1. 编译whl包并安装
-MMCV_WITH_OPS=1 ROCM_HOME=${ROCM_PATH} python3 setup.py -v bdist_wheel
+MMCV_WITH_OPS=1 python3 setup.py -v bdist_wheel
pip install dist/mmcv*
2. 源码编译安装
-MMCV_WITH_OPS=1 ROCM_HOME=${ROCM_PATH} python3 setup.py install
+MMCV_WITH_OPS=1 python3 setup.py install
+```
+3. 测试验证
+```
+cd test
+pytest -s ./test_arraymisc.py
+pytest -s ./test_ops
```
#### 注意事项
+ 若使用pip install下载安装过慢,可添加pypi清华源:-i https://pypi.tuna.tsinghua.edu.cn/simple/
@@ -52,3 +58,4 @@ MMCV_WITH_OPS=1 ROCM_HOME=${ROCM_PATH} python3 setup.py install
- [README_ORIGIN](README_ORIGIN.md)
- [README_zh-CN](README_zh-CN.md)
- [https://github.com/open-mmlab/mmcv](https://github.com/open-mmlab/mmcv)
+
diff --git a/README_ORIGIN.md b/README_ORIGIN.md
index 25d290f3dac27c8f0e87b0256ed8b0964d5bbcc9..e9e3f8efaf86059c8e7bef3fec73513b69e31442 100644
--- a/README_ORIGIN.md
+++ b/README_ORIGIN.md
@@ -1,119 +1,204 @@
-
-
-
-
+
-[](https://mmcv.readthedocs.io/en/2.x/)
-[](https://mmcv.readthedocs.io/en/2.x/get_started/installation.html)
-[](https://pypi.org/project/mmcv/)
-[](https://pytorch.org/get-started/previous-versions/)
-[](https://developer.nvidia.com/cuda-downloads)
-[](https://pypi.org/project/mmcv)
-[](https://github.com/open-mmlab/mmcv/actions)
-[](https://codecov.io/gh/open-mmlab/mmcv)
-[](https://github.com/open-mmlab/mmcv/blob/master/LICENSE)
+[](https://pypi.org/project/mmcv/) [](https://pypi.org/project/mmcv) [](https://github.com/open-mmlab/mmcv/actions) [](https://codecov.io/gh/open-mmlab/mmcv) [](https://github.com/open-mmlab/mmcv/blob/master/LICENSE)
English | [简体中文](README_zh-CN.md)
## Introduction
-MMCV is a foundational library for computer vision research and it provides the following functionalities:
+MMCV is a foundational library for computer vision research and supports many
+research projects as below:
-- [Image/Video processing](https://mmcv.readthedocs.io/en/2.x/understand_mmcv/data_process.html)
-- [Image and annotation visualization](https://mmcv.readthedocs.io/en/2.x/understand_mmcv/visualization.html)
-- [Image transformation](https://mmcv.readthedocs.io/en/2.x/understand_mmcv/data_transform.html)
-- [Various CNN architectures](https://mmcv.readthedocs.io/en/2.x/understand_mmcv/cnn.html)
-- [High-quality implementation of common CPU and CUDA ops](https://mmcv.readthedocs.io/en/2.x/understand_mmcv/ops.html)
+- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab foundational library for computer vision.
+- [MIM](https://github.com/open-mmlab/mim): MIM Installs OpenMMLab Packages.
+- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark.
+- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab detection toolbox and benchmark.
+- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab's next-generation platform for general 3D object detection.
+- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab semantic segmentation toolbox and benchmark.
+- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab's next-generation action understanding toolbox and benchmark.
+- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab video perception toolbox and benchmark.
+- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab pose estimation toolbox and benchmark.
+- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox.
+- [MMOCR](https://github.com/open-mmlab/mmocr): A Comprehensive Toolbox for Text Detection, Recognition and Understanding.
+- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab image and video generative models toolbox.
+- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab optical flow toolbox and benchmark.
+- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab FewShot Learning Toolbox and Benchmark.
-It supports the following systems:
+It provides the following functionalities.
-- Linux
-- Windows
-- macOS
+- Universal IO APIs
+- Image/Video processing
+- Image and annotation visualization
+- Useful utilities (progress bar, timer, ...)
+- PyTorch runner with hooking mechanism
+- Various CNN architectures
+- High-quality implementation of common CUDA ops
-See the [documentation](http://mmcv.readthedocs.io/en/2.x) for more features and usage.
+See the [documentation](http://mmcv.readthedocs.io/en/latest) for more features and usage.
-Note: MMCV requires Python 3.7+.
+Note: MMCV requires Python 3.6+.
## Installation
There are two versions of MMCV:
-- **mmcv**: comprehensive, with full features and various CUDA ops out of the box. It takes longer time to build.
-- **mmcv-lite**: lite, without CUDA ops but all other features, similar to mmcv\<1.0.0. It is useful when you do not need those CUDA ops.
+- **mmcv-full**: comprehensive, with full features and various CUDA ops out of the box. It takes longer to build.
+- **mmcv**: lite, without CUDA ops but all other features, similar to mmcv<1.0.0. It is useful when you do not need those CUDA ops.
**Note**: Do not install both versions in the same environment, otherwise you may encounter errors like `ModuleNotFound`. You need to uninstall one before installing the other. `Installing the full version is highly recommended if CUDA is available`.
-### Install mmcv
+a. Install the full version.
+
+Before installing mmcv-full, make sure that PyTorch has been successfully installed following the [official guide](https://pytorch.org/).
-Before installing mmcv, make sure that PyTorch has been successfully installed following the [PyTorch official installation guide](https://github.com/pytorch/pytorch#installation). For apple silicon users, please use PyTorch 1.13+.
+We provide pre-built mmcv packages (recommended) with different PyTorch and CUDA versions to simplify the building. In addition, you can run [check_installation.py](.dev_scripts/check_installation.py) to check the installation of mmcv-full after running the installation commands.
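+
+For example, after running the installation commands below, the check script can be invoked from the repository root (a sketch using the script path linked above):
+
+```shell
+python .dev_scripts/check_installation.py
+```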
-The command to install mmcv:
+i. Install the latest version.
-```bash
-pip install -U openmim
-mim install "mmcv>=2.0.0rc1"
+The rule for installing the latest ``mmcv-full`` is as follows:
+
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
```
-If you need to specify the version of mmcv, you can use the following command:
+Please replace ``{cu_version}`` and ``{torch_version}`` in the URL with your desired versions. For example,
+to install the latest ``mmcv-full`` with ``CUDA 11.1`` and ``PyTorch 1.9.0``, use the following command:
-```bash
-mim install mmcv==2.0.0rc3
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
```
-If you find that the above installation command does not use a pre-built package ending with `.whl` but a source package ending with `.tar.gz`, you may not have a pre-build package corresponding to the PyTorch or CUDA or mmcv version, in which case you can [build mmcv from source](https://mmcv.readthedocs.io/en/2.x/get_started/build.html).
+**Note**: mmcv-full is only compiled on PyTorch 1.x.0 because the compatibility usually holds between 1.x.0 and 1.x.1. If your PyTorch version is 1.x.1, you can install mmcv-full compiled with PyTorch 1.x.0 and it usually works well. For example, if your PyTorch version is 1.8.1 and CUDA version is 11.1, you can use the following command to install mmcv-full.
-
-Installation log using pre-built packages
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html
+```
-Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
-Collecting mmcv
-Downloading https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/mmcv-2.0.0rc3-cp38-cp38-manylinux1_x86_64.whl
+For more details, please refer to the following tables and delete ``=={mmcv_version}``.
-
+ii. Install a specified version.
-
-Installation log using source packages
+The rule for installing a specified ``mmcv-full`` is as follows:
-Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
-Collecting mmcv==2.0.0rc3
-Downloading mmcv-2.0.0rc3.tar.gz
+```shell
+pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
+```
+
+First of all, please refer to the Releases and replace ``{mmcv_version}`` with a specified one, e.g. ``1.3.9``.
+Then replace ``{cu_version}`` and ``{torch_version}`` in the URL with your desired versions. For example,
+to install ``mmcv-full==1.3.9`` with ``CUDA 11.1`` and ``PyTorch 1.9.0``, use the following command:
-
+```shell
+pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
+```
-For more installation methods, please refer to the [Installation documentation](https://mmcv.readthedocs.io/en/2.x/get_started/installation.html).
+For more details, please refer to the following table.
+
+
+Each non-empty cell below is a supported `{cu_version}/{torch_version}` pair; plug it into the command template above (`pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html`).
+
+| CUDA | torch1.10 | torch1.9 | torch1.8 | torch1.7 | torch1.6 | torch1.5 |
+| :--: | :-------: | :------: | :------: | :------: | :------: | :------: |
+| 11.3 | `cu113/torch1.10.0` | | | | | |
+| 11.1 | `cu111/torch1.10.0` | `cu111/torch1.9.0` | `cu111/torch1.8.0` | | | |
+| 11.0 | | | | `cu110/torch1.7.0` | | |
+| 10.2 | `cu102/torch1.10.0` | `cu102/torch1.9.0` | `cu102/torch1.8.0` | `cu102/torch1.7.0` | `cu102/torch1.6.0` | `cu102/torch1.5.0` |
+| 10.1 | | | `cu101/torch1.8.0` | `cu101/torch1.7.0` | `cu101/torch1.6.0` | `cu101/torch1.5.0` |
+| 9.2 | | | | `cu92/torch1.7.0` | `cu92/torch1.6.0` | `cu92/torch1.5.0` |
+| cpu | `cpu/torch1.10.0` | `cpu/torch1.9.0` | `cpu/torch1.8.0` | `cpu/torch1.7.0` | `cpu/torch1.6.0` | `cpu/torch1.5.0` |
+
+**Note**: The pre-built packages provided above do not include all versions of mmcv-full; you can click on the corresponding links to see the supported versions. For example, you can click [cu102-torch1.8.0](https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html) and see that `cu102-torch1.8.0` only provides mmcv-full 1.3.0 and above. In addition, we no longer provide `mmcv-full` pre-built packages compiled with `PyTorch 1.3 & 1.4` since v1.3.17. You can find previous versions compiled with PyTorch 1.3 & 1.4 [here](./docs/get_started/previous_versions.md). The compatibility is still ensured in our CI, but we will discard support for PyTorch 1.3 & 1.4 next year.
+
+Another way is to compile locally by running
+
+```shell
+pip install mmcv-full
+```
-### Install mmcv-lite
+Note that local compilation may take up to 10 minutes.
-If you need to use PyTorch-related modules, make sure PyTorch has been successfully installed in your environment by referring to the [PyTorch official installation guide](https://github.com/pytorch/pytorch#installation).
+b. Install the lite version.
-```bash
-pip install -U openmim
-mim install "mmcv-lite>=2.0.0rc1"
+```shell
+pip install mmcv
```
+c. Install the full version with custom operators for onnxruntime
+
+- Check [here](docs/deployment/onnxruntime_op.md) for detailed instruction.
+
+If you would like to build MMCV from source, please refer to the [guide](https://mmcv.readthedocs.io/en/latest/get_started/build.html).
+
## FAQ
If you face some installation issues, CUDA related issues or RuntimeErrors,
-you may first refer to this [Frequently Asked Questions](https://mmcv.readthedocs.io/en/2.x/faq.html).
-
-If you face installation problems or runtime issues, you may first refer to this [Frequently Asked Questions](https://mmcv.readthedocs.io/en/2.x/faq.html) to see if there is a solution. If the problem is still not solved, feel free to open an [issue](https://github.com/open-mmlab/mmcv/issues).
+you may first refer to this [Frequently Asked Questions](https://mmcv.readthedocs.io/en/latest/faq.html).
## Citation
@@ -135,27 +220,3 @@ We appreciate all contributions to improve MMCV. Please refer to [CONTRIBUTING.m
## License
MMCV is released under the Apache 2.0 license, while some specific operations in this library are with other licenses. Please refer to [LICENSES.md](LICENSES.md) for the careful check, if you are using our code for commercial matters.
-
-## Projects in OpenMMLab
-
-- [MMEngine](https://github.com/open-mmlab/mmengine): OpenMMLab foundational library for training deep learning models.
-- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab foundational library for computer vision.
-- [MIM](https://github.com/open-mmlab/mim): MIM installs OpenMMLab packages.
-- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark.
-- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab detection toolbox and benchmark.
-- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab's next-generation platform for general 3D object detection.
-- [MMRotate](https://github.com/open-mmlab/mmrotate): OpenMMLab rotated object detection toolbox and benchmark.
-- [MMYOLO](https://github.com/open-mmlab/mmyolo): OpenMMLab YOLO series toolbox and benchmark.
-- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab semantic segmentation toolbox and benchmark.
-- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab text detection, recognition, and understanding toolbox.
-- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab pose estimation toolbox and benchmark.
-- [MMHuman3D](https://github.com/open-mmlab/mmhuman3d): OpenMMLab 3D human parametric model toolbox and benchmark.
-- [MMSelfSup](https://github.com/open-mmlab/mmselfsup): OpenMMLab self-supervised learning toolbox and benchmark.
-- [MMRazor](https://github.com/open-mmlab/mmrazor): OpenMMLab model compression toolbox and benchmark.
-- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab fewshot learning toolbox and benchmark.
-- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab's next-generation action understanding toolbox and benchmark.
-- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab video perception toolbox and benchmark.
-- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab optical flow toolbox and benchmark.
-- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox.
-- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab image and video generative models toolbox.
-- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab model deployment framework.
diff --git a/README_zh-CN.md b/README_zh-CN.md
index d9a81ebf58c7e5578e7b43d9803cd9a2b69bdd9b..e3288ee31403d02c6d4c2c9335aff556c2c3d23c 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -1,116 +1,200 @@
-
-
-
-
+
-[](https://mmcv.readthedocs.io/zh_CN/2.x/)
-[](https://mmcv.readthedocs.io/zh_CN/2.x/get_started/installation.html)
-[](https://pypi.org/project/mmcv/)
-[](https://pytorch.org/get-started/previous-versions/)
-[](https://developer.nvidia.com/cuda-downloads)
-[](https://pypi.org/project/mmcv)
-[](https://github.com/open-mmlab/mmcv/actions)
-[](https://codecov.io/gh/open-mmlab/mmcv)
-[](https://github.com/open-mmlab/mmcv/blob/master/LICENSE)
+[](https://pypi.org/project/mmcv/) [](https://pypi.org/project/mmcv) [](https://github.com/open-mmlab/mmcv/actions) [](https://codecov.io/gh/open-mmlab/mmcv) [](https://github.com/open-mmlab/mmcv/blob/master/LICENSE)
[English](README.md) | 简体中文
## 简介
-MMCV 是一个面向计算机视觉的基础库,它提供了以下功能:
+MMCV 是一个面向计算机视觉的基础库,它支持了很多开源项目,例如:
-- [图像和视频处理](https://mmcv.readthedocs.io/zh_CN/2.x/understand_mmcv/data_process.html)
-- [图像和标注结果可视化](https://mmcv.readthedocs.io/zh_CN/2.x/understand_mmcv/visualization.html)
-- [图像变换](https://mmcv.readthedocs.io/zh_CN/2.x/understand_mmcv/data_transform.html)
-- [多种 CNN 网络结构](https://mmcv.readthedocs.io/zh_CN/2.x/understand_mmcv/cnn.html)
-- [高质量实现的常见 CUDA 算子](https://mmcv.readthedocs.io/zh_CN/2.x/understand_mmcv/ops.html)
+- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab 计算机视觉基础库
+- [MIM](https://github.com/open-mmlab/mim): OpenMMLab 项目、算法、模型的统一入口
+- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab 图像分类工具箱与测试基准
+- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab 检测工具箱与测试基准
+- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab 新一代通用3D目标检测平台
+- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab 语义分割工具箱与测试基准
+- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab 新一代视频理解工具箱与测试基准
+- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab 一体化视频目标感知平台
+- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab 姿态估计工具箱与测试基准
+- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab 图像视频编辑工具箱
+- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab 全流程文字检测识别理解工具包
+- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab 新一代生成模型工具箱
+- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab 光流估计工具箱与测试基准
+- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab 少样本学习工具箱与测试基准
-MMCV 支持多种平台,包括:
+MMCV 提供了如下众多功能:
-- Linux
-- Windows
-- macOS
+- 通用的 IO 接口
+- 图像和视频处理
+- 图像和标注结果可视化
+- 常用小工具(进度条,计时器等)
+- 基于 PyTorch 的通用训练框架
+- 多种 CNN 网络结构
+- 高质量实现的常见 CUDA 算子
-如想了解更多特性和使用,请参考[文档](http://mmcv.readthedocs.io/zh_CN/2.x)。
+如想了解更多特性和使用,请参考[文档](http://mmcv.readthedocs.io/en/latest)。
-提示: MMCV 需要 Python 3.7 以上版本。
+提示: MMCV 需要 Python 3.6 以上版本。
## 安装
MMCV 有两个版本:
-- **mmcv**: 完整版,包含所有的特性以及丰富的开箱即用的 CUDA 算子。注意完整版本可能需要更长时间来编译。
-- **mmcv-lite**: 精简版,不包含 CUDA 算子但包含其余所有特性和功能,类似 MMCV 1.0 之前的版本。如果你不需要使用 CUDA 算子的话,精简版可以作为一个考虑选项。
+- **mmcv-full**: 完整版,包含所有的特性以及丰富的开箱即用的 CUDA 算子。注意完整版本可能需要更长时间来编译。
+- **mmcv**: 精简版,不包含 CUDA 算子但包含其余所有特性和功能,类似 MMCV 1.0 之前的版本。如果你不需要使用 CUDA 算子的话,精简版可以作为一个考虑选项。
+
+**注意**: 请不要在同一个环境中安装两个版本,否则可能会遇到类似 `ModuleNotFound` 的错误。在安装一个版本之前,需要先卸载另一个。`如果CUDA可用,强烈推荐安装mmcv-full`。
-**注意**: 请不要在同一个环境中安装两个版本,否则可能会遇到类似 `ModuleNotFound` 的错误。在安装一个版本之前,需要先卸载另一个。`如果 CUDA 可用,强烈推荐安装 mmcv`。
+a. 安装完整版
-### 安装 mmcv
+在安装 mmcv-full 之前,请确保 PyTorch 已经成功安装在环境中,可以参考 PyTorch 官方[文档](https://pytorch.org/)。
-在安装 mmcv 之前,请确保 PyTorch 已经成功安装在环境中,可以参考 [PyTorch 官方安装文档](https://github.com/pytorch/pytorch#installation)。如果你使用的是搭载 apple silicon 的 mac 设备,请安装 PyTorch 1.13+ 的版本。
+我们提供了不同 PyTorch 和 CUDA 版本的 mmcv-full 预编译包,可以大大简化用户安装编译过程。强烈推荐通过预编译包来安装。另外,安装完成后可以运行 [check_installation.py](.dev_scripts/check_installation.py) 脚本检查 mmcv-full 是否安装成功。
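+
+例如,安装完成后可以在仓库根目录运行上文链接中的检查脚本(示例写法,脚本路径即上文所述):
+
+```shell
+python .dev_scripts/check_installation.py
+```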
-安装 mmcv 的命令如下:
+i. 安装最新版本
-```bash
-pip install -U openmim
-mim install "mmcv>=2.0.0rc1"
+如下是安装最新版 ``mmcv-full`` 的命令
+
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
```
-如果需要指定 mmcv 的版本,可以使用以下命令
+请将链接中的 ``{cu_version}`` 和 ``{torch_version}`` 根据自身需求替换成实际的版本号,例如想安装和 ``CUDA 11.1``、``PyTorch 1.9.0`` 兼容的最新版 ``mmcv-full``,使用如下替换过的命令
-```bash
-mim install mmcv==2.0.0rc3
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
```
-如果发现上述的安装命令没有使用预编译包(以 `.whl` 结尾)而是使用源码包(以 `.tar.gz` 结尾)安装,则有可能是我们没有提供和当前环境的 PyTorch 版本、CUDA 版本相匹配的 mmcv 预编译包,此时,你可以[源码安装 mmcv](https://mmcv.readthedocs.io/zh_CN/2.x/get_started/build.html)。
+**注意**: PyTorch 在 1.x.0 和 1.x.1 之间通常是兼容的,故 mmcv-full 只提供 1.x.0 的编译包。如果你的 PyTorch 版本是 1.x.1,你可以放心地安装在 1.x.0 版本编译的 mmcv-full。例如,如果你的 PyTorch 版本是 1.8.1、CUDA 版本是 11.1,你可以使用以下命令安装 mmcv-full。
+
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html
+```
-
-使用预编译包的安装日志
+如果想知道更多 CUDA 和 PyTorch 版本的命令,可以参考下面的表格,将链接中的 ``=={mmcv_version}`` 删去即可。
-Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
-Collecting mmcv
-Downloading https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/mmcv-2.0.0rc3-cp38-cp38-manylinux1_x86_64.whl
+ii. 安装特定的版本
-
+如下是安装特定版本 ``mmcv-full`` 的命令
-
-使用源码包的安装日志
+```shell
+pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
+```
-Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
-Collecting mmcv==2.0.0rc3
-Downloading mmcv-2.0.0rc3.tar.gz
+首先请参考版本发布信息找到想要安装的版本号,将 ``{mmcv_version}`` 替换成该版本号,例如 ``1.3.9``。
+然后将链接中的 ``{cu_version}`` 和 ``{torch_version}`` 根据自身需求替换成实际的版本号,例如想安装和 ``CUDA 11.1``、``PyTorch 1.9.0`` 兼容的 ``mmcv-full`` 1.3.9 版本,使用如下替换过的命令
-
+```shell
+pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
+```
-更多安装方式请参考[安装文档](https://mmcv.readthedocs.io/zh_CN/2.x/get_started/installation.html)。
+对于更多的 PyTorch 和 CUDA 版本组合,请参考下表:
+
+
+下表中每个非空单元格都是受支持的 `{cu_version}/{torch_version}` 组合,将其代入上文的命令模板(`pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html`)即可。
+
+| CUDA | torch1.10 | torch1.9 | torch1.8 | torch1.7 | torch1.6 | torch1.5 |
+| :--: | :-------: | :------: | :------: | :------: | :------: | :------: |
+| 11.3 | `cu113/torch1.10.0` | | | | | |
+| 11.1 | `cu111/torch1.10.0` | `cu111/torch1.9.0` | `cu111/torch1.8.0` | | | |
+| 11.0 | | | | `cu110/torch1.7.0` | | |
+| 10.2 | `cu102/torch1.10.0` | `cu102/torch1.9.0` | `cu102/torch1.8.0` | `cu102/torch1.7.0` | `cu102/torch1.6.0` | `cu102/torch1.5.0` |
+| 10.1 | | | `cu101/torch1.8.0` | `cu101/torch1.7.0` | `cu101/torch1.6.0` | `cu101/torch1.5.0` |
+| 9.2 | | | | `cu92/torch1.7.0` | `cu92/torch1.6.0` | `cu92/torch1.5.0` |
+| cpu | `cpu/torch1.10.0` | `cpu/torch1.9.0` | `cpu/torch1.8.0` | `cpu/torch1.7.0` | `cpu/torch1.6.0` | `cpu/torch1.5.0` |
+
+**注意**:以上提供的预编译包并不囊括所有的 mmcv-full 版本,你可以点击对应链接查看支持的版本。例如,点击 [cu102-torch1.8.0](https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html),可以看到 `cu102-torch1.8.0` 只提供了 1.3.0 及以上的 mmcv-full 版本。另外,从 `mmcv v1.3.17` 开始,我们不再提供 `PyTorch 1.3 & 1.4` 对应的 mmcv-full 预编译包。你可以在[这里](./docs_zh_CN/get_started/previous_versions.md)找到 `PyTorch 1.3 & 1.4` 对应的预编译包。虽然我们不再提供 `PyTorch 1.3 & 1.4` 对应的预编译包,但是我们依然在 CI 中保证对它们的兼容持续到下一年。
+
+除了使用预编译包之外,另一种方式是在本地进行编译,直接运行下述命令
+
+```shell
+pip install mmcv-full
+```
-### 安装 mmcv-lite
+但注意本地编译可能会耗时 10 分钟以上。
-如果你需要使用和 PyTorch 相关的模块,请确保 PyTorch 已经成功安装在环境中,可以参考 [PyTorch 官方安装文档](https://github.com/pytorch/pytorch#installation)。
+b. 安装精简版
-```bash
-pip install -U openmim
-mim install "mmcv-lite>=2.0.0rc1"
+```shell
+pip install mmcv
```
+c. 安装完整版并且编译 onnxruntime 的自定义算子
+
+- 详细的指南请查看 [这里](docs/deployment/onnxruntime_op.md)。
+
+如果想从源码编译 MMCV,请参考[该文档](https://mmcv.readthedocs.io/en/latest/get_started/build.html)。
+
## FAQ
-如果你遇到了安装问题或者运行时问题,请查看[问题解决页面](https://mmcv.readthedocs.io/zh_CN/2.x/faq.html)是否已有解决方案。如果问题仍然没有解决,欢迎提 [issue](https://github.com/open-mmlab/mmcv/issues)。
+如果你遇到了安装问题,CUDA 相关的问题或者 RuntimeErrors,可以首先参考[问题解决页面](https://mmcv.readthedocs.io/en/latest/faq.html) 看是否已经有解决方案。
## 贡献指南
@@ -119,37 +203,12 @@ mim install "mmcv-lite>=2.0.0rc1"
## 许可证
`MMCV` 目前以 Apache 2.0 的许可证发布,但是其中有一部分功能并不是使用的 Apache2.0 许可证,我们在 [许可证](LICENSES.md) 中详细地列出了这些功能以及他们对应的许可证,如果您正在从事盈利性活动,请谨慎参考此文档。
-
-## OpenMMLab 的其他项目
-
-- [MMEngine](https://github.com/open-mmlab/mmengine): OpenMMLab 深度学习模型训练基础库
-- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab 计算机视觉基础库
-- [MIM](https://github.com/open-mmlab/mim): MIM 是 OpenMMlab 项目、算法、模型的统一入口
-- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab 图像分类工具箱
-- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab 目标检测工具箱
-- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab 新一代通用 3D 目标检测平台
-- [MMRotate](https://github.com/open-mmlab/mmrotate): OpenMMLab 旋转框检测工具箱与测试基准
-- [MMYOLO](https://github.com/open-mmlab/mmyolo): OpenMMLab YOLO 系列工具箱与测试基准
-- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab 语义分割工具箱
-- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab 全流程文字检测识别理解工具箱
-- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab 姿态估计工具箱
-- [MMHuman3D](https://github.com/open-mmlab/mmhuman3d): OpenMMLab 人体参数化模型工具箱与测试基准
-- [MMSelfSup](https://github.com/open-mmlab/mmselfsup): OpenMMLab 自监督学习工具箱与测试基准
-- [MMRazor](https://github.com/open-mmlab/mmrazor): OpenMMLab 模型压缩工具箱与测试基准
-- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab 少样本学习工具箱与测试基准
-- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab 新一代视频理解工具箱
-- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab 一体化视频目标感知平台
-- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab 光流估计工具箱与测试基准
-- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab 图像视频编辑工具箱
-- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab 图片视频生成模型工具箱
-- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab 模型部署框架
-
## 欢迎加入 OpenMMLab 社区
-扫描下方的二维码可关注 OpenMMLab 团队的 [知乎官方账号](https://www.zhihu.com/people/openmmlab),加入 OpenMMLab 团队的 [官方交流 QQ 群](https://jq.qq.com/?_wv=1027&k=K0QI8ByU),或添加微信小助手”OpenMMLabwx“加入官方交流微信群。
+扫描下方的二维码可关注 OpenMMLab 团队的 [知乎官方账号](https://www.zhihu.com/people/openmmlab),加入 OpenMMLab 团队的 [官方交流 QQ 群](https://jq.qq.com/?_wv=1027&k=GJP18SjI)
我们会在 OpenMMLab 社区为大家
diff --git a/TERMINOLOGY.md b/TERMINOLOGY.md
index 07411b7774c2ed713f472c1287b98b871c7f4d02..61941e3306c7dc2c0f7b0e181248cac841571a7a 100644
--- a/TERMINOLOGY.md
+++ b/TERMINOLOGY.md
@@ -4,27 +4,27 @@ This document is used as a reference for English-Chinese terminology translation
该文档用作中英文翻译对照参考。
-| English | 中文 |
-| :---------------: | :----------: |
-| annotation | 标注 |
-| backbone | 主干网络 |
-| benchmark | 基准测试 |
-| checkpoint | 模型权重文件 |
-| classifier | 分类器 |
-| cls_head | 分类头 |
-| decoder | 解码器 |
-| detector | 检测器 |
-| encoder | 编码器 |
-| finetune | 微调 |
-| ground truth | 真实标签 |
-| hook | 钩子 |
-| localizer | 定位器 |
-| neck | 模型颈部 |
-| pipeline | 流水线 |
-| recognizer | 识别器 |
-| register | 注册器 |
-| schedule | 调整 |
-| scheduler | 调度器 |
-| segmentor | 分割器 |
-| tensor | 张量 |
-| training schedule | 训练策略 |
+| English | 中文 |
+| :-----: | :---: |
+| annotation | 标注 |
+| backbone | 主干网络 |
+| benchmark | 基准测试 |
+| checkpoint | 模型权重文件 |
+| classifier | 分类器 |
+| cls_head | 分类头 |
+| decoder | 解码器 |
+| detector | 检测器 |
+| encoder | 编码器 |
+| finetune | 微调 |
+| ground truth | 真实标签 |
+| hook | 钩子 |
+| localizer | 定位器 |
+| neck | 模型颈部 |
+| pipeline | 流水线 |
+| recognizer | 识别器 |
+| register | 注册器 |
+| schedule | 调整 |
+| scheduler | 调度器 |
+| segmentor | 分割器 |
+| tensor | 张量 |
+| training schedule | 训练策略 |
diff --git a/docker/README.md b/docker/README.md
deleted file mode 100644
index 60d5c9de5da8faa7e0ae7e0def19a4320a2a7a5e..0000000000000000000000000000000000000000
--- a/docker/README.md
+++ /dev/null
@@ -1,70 +0,0 @@
-# Docker images
-
-There are two `Dockerfile` files to build docker images, one to build an image with the mmcv pre-built package and the other with the mmcv development environment.
-
-```text
-.
-|-- README.md
-|-- dev # build with mmcv development environment
-| `-- Dockerfile
-`-- release # build with mmcv pre-built package
- `-- Dockerfile
-```
-
-## Build docker images
-
-### Build with mmcv pre-built package
-
-Build with local repository
-
-```bash
-git clone https://github.com/open-mmlab/mmcv.git && cd mmcv
-docker build -t mmcv -f docker/release/Dockerfile .
-```
-
-Or build with remote repository
-
-```bash
-docker build -t mmcv https://github.com/open-mmlab/mmcv.git#master:docker/release
-```
-
-The [Dockerfile](release/Dockerfile) installs the latest released version of mmcv by default, but you can specify an mmcv version to install the expected one.
-
-```bash
-docker image build -t mmcv -f docker/release/Dockerfile --build-arg MMCV=2.0.0rc1 .
-```
-
-If you also want to use other versions of PyTorch and CUDA, you can also pass them when building docker images.
-
-An example to build an image with PyTorch 1.9.0 and CUDA 11.1:
-
-```bash
-docker build -t mmcv -f docker/release/Dockerfile \
- --build-arg PYTORCH=1.9.0 \
- --build-arg CUDA=11.1 \
- --build-arg CUDNN=8 \
- --build-arg MMCV=2.0.0rc1 .
-```
-
-More available versions of PyTorch and CUDA can be found at [dockerhub/pytorch](https://hub.docker.com/r/pytorch/pytorch/tags).
-
-### Build with mmcv development environment
-
-If you want to build a docker image with the mmcv development environment, you can use the following command
-
-```bash
-git clone https://github.com/open-mmlab/mmcv.git && cd mmcv
-docker build -t mmcv -f docker/dev/Dockerfile --build-arg CUDA_ARCH=7.5 .
-```
-
-Note that `CUDA_ARCH` is the compute capability of your GPU, which you can find at [Compute Capability](https://developer.nvidia.com/cuda-gpus#compute).
-
-The building process may take 10 minutes or more.
-
-## Run images
-
-```bash
-docker run --gpus all --shm-size=8g -it mmcv
-```
-
-See [docker run](https://docs.docker.com/engine/reference/commandline/run/) for more usage options.
diff --git a/docker/dev/Dockerfile b/docker/dev/Dockerfile
deleted file mode 100644
index a4d9e23fcfaa6e1af104aaa0e9cbb2a348b3cd34..0000000000000000000000000000000000000000
--- a/docker/dev/Dockerfile
+++ /dev/null
@@ -1,31 +0,0 @@
-ARG PYTORCH="1.8.1"
-ARG CUDA="10.2"
-ARG CUDNN="7"
-
-FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel
-
-# To fix GPG key error when running apt-get update
-RUN rm /etc/apt/sources.list.d/cuda.list \
- && rm /etc/apt/sources.list.d/nvidia-ml.list \
- && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub \
- && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub
-
-# Install git and system dependencies for opencv-python
-RUN apt-get update && apt-get install -y git \
- && apt-get update && apt-get install -y libgl1 libglib2.0-0
-
-# Install system dependencies for unit tests
-RUN apt-get install -y ffmpeg libturbojpeg \
- && apt-get clean \
- && rm -rf /var/lib/apt/lists/*
-
-# build mmcv from source with develop mode
-ARG HTTPS_PROXY=""
-ENV https_proxy=${HTTPS_PROXY}
-ENV FORCE_CUDA="1"
-ARG CUDA_ARCH=""
-ENV TORCH_CUDA_ARCH_LIST=${CUDA_ARCH}
-RUN git clone https://github.com/open-mmlab/mmcv.git /mmcv
-WORKDIR /mmcv
-RUN git checkout 2.x && git rev-parse --short HEAD
-RUN pip install --no-cache-dir -e .[all] -v && pip install pre-commit && pre-commit install
diff --git a/docker/release/Dockerfile b/docker/release/Dockerfile
deleted file mode 100644
index d5e25e9eb70a87ab1c47a629cc6ed9706ade83c6..0000000000000000000000000000000000000000
--- a/docker/release/Dockerfile
+++ /dev/null
@@ -1,23 +0,0 @@
-ARG PYTORCH="1.8.1"
-ARG CUDA="10.2"
-ARG CUDNN="7"
-
-FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel
-
-# To fix GPG key error when running apt-get update
-RUN rm /etc/apt/sources.list.d/cuda.list \
- && rm /etc/apt/sources.list.d/nvidia-ml.list \
- && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub \
- && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub
-
-# Install system dependencies for opencv-python
-RUN apt-get update && apt-get install -y libgl1 libglib2.0-0 \
- && apt-get clean \
- && rm -rf /var/lib/apt/lists/*
-
-# Install mmcv
-ARG MMCV=""
-RUN if [ "${MMCV}" = "" ]; then pip install -U openmim && mim install 'mmcv>=2.0.0rc1'; else pip install -U openmim && mim install mmcv==${MMCV}; fi
-
-# Verify the installation
-RUN python -c 'import mmcv;print(mmcv.__version__)'
diff --git a/docs/en/Makefile b/docs/Makefile
similarity index 100%
rename from docs/en/Makefile
rename to docs/Makefile
diff --git a/docs/en/_static/community/1.png b/docs/_static/community/1.png
similarity index 100%
rename from docs/en/_static/community/1.png
rename to docs/_static/community/1.png
diff --git a/docs/en/_static/community/2.png b/docs/_static/community/2.png
similarity index 100%
rename from docs/en/_static/community/2.png
rename to docs/_static/community/2.png
diff --git a/docs/en/_static/community/3.png b/docs/_static/community/3.png
similarity index 100%
rename from docs/en/_static/community/3.png
rename to docs/_static/community/3.png
diff --git a/docs/en/_static/css/readthedocs.css b/docs/_static/css/readthedocs.css
similarity index 75%
rename from docs/en/_static/css/readthedocs.css
rename to docs/_static/css/readthedocs.css
index 9e3a567d5f78aedb606600bb3111034a1003b362..3f425fc1e5344d7d159c71aa94b5e385767d5b37 100644
--- a/docs/en/_static/css/readthedocs.css
+++ b/docs/_static/css/readthedocs.css
@@ -4,7 +4,3 @@
height: 40px;
width: 85px;
}
-
-table.colwidths-auto td {
- width: 50%
-}
diff --git a/docs/en/_static/flow_img2toimg1.png b/docs/_static/flow_img2toimg1.png
similarity index 100%
rename from docs/en/_static/flow_img2toimg1.png
rename to docs/_static/flow_img2toimg1.png
diff --git a/docs/en/_static/flow_raw_images.png b/docs/_static/flow_raw_images.png
similarity index 100%
rename from docs/en/_static/flow_raw_images.png
rename to docs/_static/flow_raw_images.png
diff --git a/docs/en/_static/flow_visualization.png b/docs/_static/flow_visualization.png
similarity index 100%
rename from docs/en/_static/flow_visualization.png
rename to docs/_static/flow_visualization.png
diff --git a/docs/en/_static/flow_warp.png b/docs/_static/flow_warp.png
similarity index 100%
rename from docs/en/_static/flow_warp.png
rename to docs/_static/flow_warp.png
diff --git a/docs/en/_static/flow_warp_diff.png b/docs/_static/flow_warp_diff.png
similarity index 100%
rename from docs/en/_static/flow_warp_diff.png
rename to docs/_static/flow_warp_diff.png
diff --git a/docs/en/_static/image/mmcv-logo.png b/docs/_static/image/mmcv-logo.png
similarity index 100%
rename from docs/en/_static/image/mmcv-logo.png
rename to docs/_static/image/mmcv-logo.png
diff --git a/docs/en/_static/parallel_progress.gif b/docs/_static/parallel_progress.gif
similarity index 100%
rename from docs/en/_static/parallel_progress.gif
rename to docs/_static/parallel_progress.gif
diff --git a/docs/en/_static/parallel_progress.png b/docs/_static/parallel_progress.png
similarity index 100%
rename from docs/en/_static/parallel_progress.png
rename to docs/_static/parallel_progress.png
diff --git a/docs/en/_static/progress.gif b/docs/_static/progress.gif
similarity index 100%
rename from docs/en/_static/progress.gif
rename to docs/_static/progress.gif
diff --git a/docs/en/_static/progress.png b/docs/_static/progress.png
similarity index 100%
rename from docs/en/_static/progress.png
rename to docs/_static/progress.png
diff --git a/docs/_static/qq_group_qrcode.jpg b/docs/_static/qq_group_qrcode.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..7c6b04f561da283ae622f4219ea9b8cabf8f301a
Binary files /dev/null and b/docs/_static/qq_group_qrcode.jpg differ
diff --git a/docs/_static/zhihu_qrcode.jpg b/docs/_static/zhihu_qrcode.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..c745fb027f06564d41794e9a40069b06c34e2bb5
Binary files /dev/null and b/docs/_static/zhihu_qrcode.jpg differ
diff --git a/docs/api.rst b/docs/api.rst
new file mode 100644
index 0000000000000000000000000000000000000000..8ca9118c3b033f1b7311ec3c1533ce9c93fa1aa2
--- /dev/null
+++ b/docs/api.rst
@@ -0,0 +1,44 @@
+fileio
+-------
+.. automodule:: mmcv.fileio
+ :members:
+
+image
+------
+.. automodule:: mmcv.image
+ :members:
+
+video
+------
+.. automodule:: mmcv.video
+ :members:
+
+arraymisc
+---------
+.. automodule:: mmcv.arraymisc
+ :members:
+
+visualization
+--------------
+.. automodule:: mmcv.visualization
+ :members:
+
+utils
+-----
+.. automodule:: mmcv.utils
+ :members:
+
+cnn
+----
+.. automodule:: mmcv.cnn
+ :members:
+
+runner
+------
+.. automodule:: mmcv.runner
+ :members:
+
+ops
+------
+.. automodule:: mmcv.ops
+ :members:
diff --git a/docs/community/contributing.md b/docs/community/contributing.md
new file mode 120000
index 0000000000000000000000000000000000000000..f939e75f21a8badb5c40f527abd0e098fe9bc472
--- /dev/null
+++ b/docs/community/contributing.md
@@ -0,0 +1 @@
+../../CONTRIBUTING.md
\ No newline at end of file
diff --git a/docs/community/pr.md b/docs/community/pr.md
new file mode 100644
index 0000000000000000000000000000000000000000..77bdbf77080577d48ca734ffeb45d12269a166e4
--- /dev/null
+++ b/docs/community/pr.md
@@ -0,0 +1,94 @@
+## Pull Request (PR)
+
+### What is PR
+
+`PR` is the abbreviation of `Pull Request`. Here's the definition of `PR` in the [official document](https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests) of GitHub.
+
+> Pull requests let you tell others about changes you've pushed to a branch in a repository on GitHub. Once a pull request is opened, you can discuss and review the potential changes with collaborators and add follow-up commits before your changes are merged into the base branch.
+
+### Basic Workflow
+
+1. Get the most recent codebase
+2. Checkout a new branch from the master branch
+3. Commit your changes
+4. Push your changes and create a PR
+5. Discuss and review your code
+6. Merge your branch to the master branch
+
+### Procedures in detail
+
+1. Get the most recent codebase
+ + When you work on your first PR
+ - Fork the OpenMMLab repository: click the **fork** button at the top right corner of the GitHub page
+ 
+
+ - Clone forked repository to local
+ ```bash
+ git clone git@github.com:XXX/mmcv.git
+ ```
+
+ - Add source repository to upstream
+ ```bash
+ git remote add upstream git@github.com:open-mmlab/mmcv
+ ```
+
+ + After your first PR
+ - Check out the master branch of the local repository and pull the latest master branch of the source repository
+ ```bash
+ git checkout master
+ git pull upstream master
+ ```
+
+2. Checkout a new branch from the master branch
+ ```bash
+ git checkout -b branchname
+ ```
+
+```{tip}
+To keep the commit history clear, we strongly recommend you check out the master branch before creating a new branch.
+```
+
+3. Commit your changes
+ ```bash
+ # coding
+ git add [files]
+ git commit -m 'messages'
+ ```
+
+4. Push your changes to the forked repository and create a PR
+ + Push the branch to your forked remote repository
+ ```bash
+ git push origin branchname
+ ```
+
+ + Create a PR
+ 
+
+ + Revise the PR message template to describe your motivation and the modifications made in this PR. You can also link the related issue to the PR manually in the PR message (for more information, check out the [official guidance](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue)).
+
+5. Discuss and review your code
+ + After creating a pull request, you can ask a specific person to review the changes you've proposed
+ 
+
+ + Modify your codes according to reviewers' suggestions and then push your changes
+
+6. Merge your branch to the master branch and delete the branch
+ ```bash
+ git branch -d branchname # delete local branch
+ git push origin --delete branchname # delete remote branch
+ ```
+
+### PR Specs
+
+1. Use [pre-commit](https://pre-commit.com) hook to avoid issues of code style
+2. One short-lived branch should be matched with only one PR
+3. Accomplish one focused change in one PR. Avoid large PRs
+ >- Bad: Support Faster R-CNN
+ >- Acceptable: Add a box head to Faster R-CNN
+ >- Good: Add a parameter to box head to support custom conv-layer number
+4. Provide clear and meaningful commit messages
+5. Provide a clear and meaningful PR description
+ >- The task name should be clarified in the title. The general format is: [Prefix] Short description of the PR (Suffix); see the hypothetical examples after this list
+ >- Prefix: new feature [Feature], bug fix [Fix], documentation [Docs], work in progress [WIP] (which will not be reviewed for the time being)
+ >- Briefly introduce the main changes, results, and influence on other modules in the description
+ >- Associate related issues and pull requests with a milestone
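+
+A few purely hypothetical titles that follow this format:
+
+```text
+[Fix] Fix the device mismatch in soft_nms
+[Feature] Support the avg pooling mode in RoIAlign
+[Docs] Add a deployment tutorial for custom operators (WIP)
+```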
diff --git a/docs/en/compatibility.md b/docs/compatibility.md
similarity index 100%
rename from docs/en/compatibility.md
rename to docs/compatibility.md
diff --git a/docs/zh_cn/conf.py b/docs/conf.py
similarity index 62%
rename from docs/zh_cn/conf.py
rename to docs/conf.py
index 7bfb9c23a726bb917761c725472d307e6d1d865a..bea4706cf0430220087b77847f5a07cd24c9b31f 100644
--- a/docs/zh_cn/conf.py
+++ b/docs/conf.py
@@ -15,19 +15,21 @@ import os
import sys
import pytorch_sphinx_theme
+from m2r import MdInclude
+from recommonmark.transform import AutoStructify
from sphinx.builders.html import StandaloneHTMLBuilder
-sys.path.insert(0, os.path.abspath('../..'))
+sys.path.insert(0, os.path.abspath('..'))
-version_file = '../../mmcv/version.py'
-with open(version_file) as f:
+version_file = '../mmcv/version.py'
+with open(version_file, 'r') as f:
exec(compile(f.read(), version_file, 'exec'))
__version__ = locals()['__version__']
# -- Project information -----------------------------------------------------
project = 'mmcv'
-copyright = '2018-2022, OpenMMLab'
+copyright = '2018-2021, OpenMMLab'
author = 'MMCV Authors'
# The short X.Y version
@@ -47,8 +49,6 @@ release = __version__
extensions = [
'sphinx.ext.autodoc',
- 'sphinx.ext.autosummary',
- 'sphinx.ext.intersphinx',
'sphinx.ext.napoleon',
'sphinx.ext.viewcode',
'sphinx.ext.autosectionlabel',
@@ -57,18 +57,6 @@ extensions = [
'sphinx_copybutton',
] # yapf: disable
-myst_heading_anchors = 4
-
-myst_enable_extensions = ['colon_fence']
-
-# Configuration for intersphinx
-intersphinx_mapping = {
- 'python': ('https://docs.python.org/3', None),
- 'numpy': ('https://numpy.org/doc/stable', None),
- 'torch': ('https://pytorch.org/docs/stable/', None),
- 'mmengine': ('https://mmengine.readthedocs.io/en/latest', None),
-}
-
autodoc_mock_imports = ['mmcv._ext', 'mmcv.utils.ext_loader', 'torchvision']
autosectionlabel_prefix_document = True
@@ -91,7 +79,7 @@ master_doc = 'index'
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
-language = 'zh_CN'
+language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
@@ -120,9 +108,92 @@ html_theme_options = {
'name': 'GitHub',
'url': 'https://github.com/open-mmlab/mmcv'
},
- ],
- # Specify the language of shared menu
- 'menu_lang': 'cn',
+ {
+ 'name':
+ 'Docs',
+ 'children': [
+ {
+ 'name': 'MMCV',
+ 'url': 'https://mmcv.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MIM',
+ 'url': 'https://openmim.readthedocs.io/en/latest/'
+ },
+ {
+ 'name': 'MMAction2',
+ 'url': 'https://mmaction2.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMClassification',
+ 'url':
+ 'https://mmclassification.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMDetection',
+ 'url': 'https://mmdetection.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMDetection3D',
+ 'url': 'https://mmdetection3d.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMEditing',
+ 'url': 'https://mmediting.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMGeneration',
+ 'url': 'https://mmgeneration.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMOCR',
+ 'url': 'https://mmocr.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMPose',
+ 'url': 'https://mmpose.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMSegmentation',
+ 'url': 'https://mmsegmentation.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMTracking',
+ 'url': 'https://mmtracking.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMFlow',
+ 'url': 'https://mmflow.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMFewShot',
+ 'url': 'https://mmfewshot.readthedocs.io/en/latest/',
+ },
+ ]
+ },
+ {
+ 'name':
+ 'OpenMMLab',
+ 'children': [
+ {
+ 'name': 'Homepage',
+ 'url': 'https://openmmlab.com/'
+ },
+ {
+ 'name': 'GitHub',
+ 'url': 'https://github.com/open-mmlab/'
+ },
+ {
+ 'name': 'Twitter',
+ 'url': 'https://twitter.com/OpenMMLab'
+ },
+ {
+ 'name': 'Zhihu',
+ 'url': 'https://zhihu.com/people/openmmlab'
+ },
+ ]
+ },
+ ]
}
# Add any paths that contain custom static files (such as style sheets) here,
@@ -215,3 +286,16 @@ StandaloneHTMLBuilder.supported_image_types = [
# Ignore >>> when copying code
copybutton_prompt_text = r'>>> |\.\.\. '
copybutton_prompt_is_regexp = True
+
+
+def setup(app):
+ app.add_config_value('no_underscore_emphasis', False, 'env')
+ app.add_config_value('m2r_parse_relative_links', False, 'env')
+ app.add_config_value('m2r_anonymous_references', False, 'env')
+ app.add_config_value('m2r_disable_inline_math', False, 'env')
+ app.add_directive('mdinclude', MdInclude)
+ app.add_config_value('recommonmark_config', {
+ 'auto_toc_tree_section': 'Contents',
+ 'enable_eval_rst': True,
+ }, True)
+ app.add_transform(AutoStructify)
diff --git a/docs/en/deployment/mmcv_ops_definition.md b/docs/deployment/mmcv_ops_definition.md
similarity index 80%
rename from docs/en/deployment/mmcv_ops_definition.md
rename to docs/deployment/mmcv_ops_definition.md
index d7eabb33fd41855116ed975d4e48daea81e4d74d..5696316be5b1fb9234faab74cd83ad579655724e 100644
--- a/docs/en/deployment/mmcv_ops_definition.md
+++ b/docs/deployment/mmcv_ops_definition.md
@@ -1,10 +1,7 @@
-# MMCV Operators
-
-To make custom operators in MMCV more standard, precise definitions of each operator are listed in this document.
+# Definition of custom operators in MMCV
-
-- [MMCV Operators](#mmcv-operators)
+- [Definition of custom operators in MMCV](#definition-of-custom-operators-in-mmcv)
- [MMCVBorderAlign](#mmcvborderalign)
- [Description](#description)
- [Parameters](#parameters)
@@ -83,26 +80,25 @@ To make custom operators in MMCV more standard, precise definitions of each oper
- [Inputs](#inputs-12)
- [Outputs](#outputs-12)
- [Type Constraints](#type-constraints-12)
- - [grid_sampler\*](#grid_sampler)
+- [torch](#torch)
+ - [grid_sampler](#grid_sampler)
- [Description](#description-13)
- [Parameters](#parameters-13)
- [Inputs](#inputs-13)
- [Outputs](#outputs-13)
- [Type Constraints](#type-constraints-13)
- - [cummax\*](#cummax)
+ - [cummax](#cummax)
- [Description](#description-14)
- [Parameters](#parameters-14)
- [Inputs](#inputs-14)
- [Outputs](#outputs-14)
- [Type Constraints](#type-constraints-14)
- - [cummin\*](#cummin)
+ - [cummin](#cummin)
- [Description](#description-15)
- [Parameters](#parameters-15)
- [Inputs](#inputs-15)
- [Outputs](#outputs-15)
- [Type Constraints](#type-constraints-15)
- - [Reminders](#reminders)
-
## MMCVBorderAlign
@@ -122,9 +118,9 @@ Read [BorderDet: Border Feature for Dense Object Detection](https://arxiv.org/abs
### Parameters
-| Type | Parameter | Description |
-| ----- | ----------- | ----------------------------------------------------------------------------------- |
-| `int` | `pool_size` | number of positions sampled over the boxes' borders(e.g. top, bottom, left, right). |
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
+| `int`   | `pool_size`     | number of positions sampled over the boxes' borders (e.g. top, bottom, left, right). |
### Inputs
@@ -156,11 +152,11 @@ Read [CARAFE: Content-Aware ReAssembly of FEatures](https://arxiv.org/abs/1905.0
### Parameters
-| Type | Parameter | Description |
-| ------- | -------------- | --------------------------------------------- |
-| `int` | `kernel_size` | reassemble kernel size, should be odd integer |
-| `int` | `group_size` | reassemble group size |
-| `float` | `scale_factor` | upsample ratio(>=1) |
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
+| `int`   | `kernel_size`   | reassemble kernel size, should be an odd integer |
+| `int` | `group_size` | reassemble group size |
+| `float` | `scale_factor` | upsample ratio(>=1) |
### Inputs
@@ -191,7 +187,8 @@ Read [CCNet: Criss-Cross Attention for SemanticSegmentation](https://arxiv.org/p
### Parameters
-None
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
### Inputs
@@ -222,7 +219,8 @@ Read [CCNet: Criss-Cross Attention for SemanticSegmentation](https://arxiv.org/p
### Parameters
-None
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
### Inputs
@@ -244,6 +242,7 @@ None
- T:tensor(float32)
+
## MMCVCornerPool
### Description
@@ -252,9 +251,9 @@ Perform CornerPool on `input` features. Read [CornerNet -- Detecting Objects as
### Parameters
-| Type | Parameter | Description |
-| ----- | --------- | ---------------------------------------------------------------- |
-| `int` | `mode` | corner pool mode, (0: `top`, 1: `bottom`, 2: `left`, 3: `right`) |
+| Type | Parameter | Description |
+| ------- | --------------- | ---------------------------------------------------------------- |
+| `int` | `mode` | corner pool mode, (0: `top`, 1: `bottom`, 2: `left`, 3: `right`) |
### Inputs
@@ -284,15 +283,15 @@ Read [Deformable Convolutional Networks](https://arxiv.org/pdf/1703.06211.pdf) f
### Parameters
-| Type | Parameter | Description |
-| -------------- | ------------------- | ----------------------------------------------------------------------------------------------------------------- |
-| `list of ints` | `stride` | The stride of the convolving kernel, (sH, sW). Defaults to `(1, 1)`. |
-| `list of ints` | `padding` | Paddings on both sides of the input, (padH, padW). Defaults to `(0, 0)`. |
-| `list of ints` | `dilation` | The spacing between kernel elements (dH, dW). Defaults to `(1, 1)`. |
-| `int` | `groups` | Split input into groups. `input_channel` should be divisible by the number of groups. Defaults to `1`. |
-| `int` | `deformable_groups` | Groups of deformable offset. Defaults to `1`. |
-| `int` | `bias` | Whether to add a learnable bias to the output. `0` stands for `False` and `1` stands for `True`. Defaults to `0`. |
-| `int` | `im2col_step` | Groups of deformable offset. Defaults to `32`. |
+| Type | Parameter | Description |
+| -------------- | ------------------ | ------------------------------------------------------------------------------------- |
+| `list of ints` | `stride` | The stride of the convolving kernel, (sH, sW). Defaults to `(1, 1)`. |
+| `list of ints` | `padding` | Paddings on both sides of the input, (padH, padW). Defaults to `(0, 0)`. |
+| `list of ints` | `dilation` | The spacing between kernel elements (dH, dW). Defaults to `(1, 1)`. |
+| `int` | `groups` | Split input into groups. `input_channel` should be divisible by the number of groups. Defaults to `1`.|
+| `int` | `deformable_groups` | Groups of deformable offset. Defaults to `1`. |
+| `int` | `bias` | Whether to add a learnable bias to the output. `0` stands for `False` and `1` stands for `True`. Defaults to `0`. |
+| `int` | `im2col_step` | Groups of deformable offset. Defaults to `32`. |
### Inputs
@@ -324,11 +323,11 @@ Perform Modulated Deformable Convolution on input feature, read [Deformable Conv
### Parameters
-| Type | Parameter | Description |
-| -------------- | ------------------- | ------------------------------------------------------------------------------------- |
-| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
-| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
-| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
+| Type | Parameter | Description |
+| -------------- | ------------------ | ------------------------------------------------------------------------------------- |
+| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
+| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
+| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
| `int` | `deformable_groups` | Groups of deformable offset. |
| `int` | `groups` | Split input into groups. `input_channel` should be divisible by the number of groups. |
@@ -366,13 +365,13 @@ Deformable roi pooling layer
### Parameters
-| Type | Parameter | Description |
-| ------- | ---------------- | ------------------------------------------------------------------------------------------------------------- |
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
| `int` | `output_height` | height of output roi |
| `int` | `output_width` | width of output roi |
| `float` | `spatial_scale` | used to scale the input boxes |
| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
-| `float` | `gamma` | gamma |
+| `float` | `gamma` | gamma |
### Inputs
@@ -405,10 +404,10 @@ Read [Pixel Recurrent Neural Networks](https://arxiv.org/abs/1601.06759) for mor
### Parameters
-| Type | Parameter | Description |
-| -------------- | --------- | -------------------------------------------------------------------------------- |
-| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW). **Only support stride=1 in mmcv** |
-| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW). Defaults to `(0, 0)`. |
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
+| `list of ints` | `stride`  | The stride of the convolving kernel. (sH, sW). **Only stride=1 is supported in mmcv** |
+| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW). Defaults to `(0, 0)`. |
### Inputs
@@ -444,10 +443,10 @@ Read [PSANet: Point-wise Spatial Attention Network for Scene Parsing](https://hs
### Parameters
-| Type | Parameter | Description |
-| -------------- | ----------- | -------------------------------------------- |
-| `int` | `psa_type` | `0` means collect and `1` means `distribute` |
-| `list of ints` | `mask_size` | The size of mask |
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
+| `int` | `psa_type` | `0` means collect and `1` means `distribute` |
+| `list of ints` | `mask_size` | The size of mask |
### Inputs
@@ -479,9 +478,9 @@ Note this definition is slightly different with [onnx: NonMaxSuppression](https:
| Type | Parameter | Description |
| ------- | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
-| `int` | `center_point_box` | 0 - the box data is supplied as \[y1, x1, y2, x2\], 1-the box data is supplied as \[x_center, y_center, width, height\]. |
+| `int` | `center_point_box` | 0 - the box data is supplied as [y1, x1, y2, x2], 1-the box data is supplied as [x_center, y_center, width, height]. |
| `int` | `max_output_boxes_per_class` | The maximum number of boxes to be selected per batch per class. Default to 0, number of output boxes equal to number of input boxes. |
-| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range \[0, 1\]. Default to 0. |
+| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range [0, 1]. Default to 0. |
| `float` | `score_threshold` | The threshold for deciding when to remove boxes based on score. |
| `int` | `offset` | 0 or 1, boxes' width or height is (x2 - x1 + offset). |
@@ -544,6 +543,7 @@ Perform RoIAlign on output feature, used in bbox_head of most two-stage detector
- T:tensor(float32)
+
## MMCVRoIAlignRotated
### Description
@@ -552,15 +552,15 @@ Perform RoI align pooling for rotated proposals
### Parameters
-| Type | Parameter | Description |
-| ------- | ---------------- | ------------------------------------------------------------------------------------------------------------- |
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
| `int` | `output_height` | height of output roi |
| `int` | `output_width` | width of output roi |
| `float` | `spatial_scale` | used to scale the input boxes |
| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
| `str` | `mode` | pooling mode in each bin. `avg` or `max` |
| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
-| `int` | `clockwise` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
+| `int` | `clockwise` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
### Inputs
@@ -581,7 +581,9 @@ Perform RoI align pooling for rotated proposals
- T:tensor(float32)
-## grid_sampler\*
+# torch
+
+## grid_sampler
### Description
@@ -617,7 +619,7 @@ Check [torch.nn.functional.grid_sample](https://pytorch.org/docs/stable/generate
- T:tensor(float32, Linear)
-## cummax\*
+## cummax
### Description
@@ -625,9 +627,9 @@ Returns a tuple (`values`, `indices`) where `values` is the cumulative maximum e
### Parameters
-| Type | Parameter | Description |
-| ----- | --------- | -------------------------------------- |
-| `int` | `dim` | the dimension to do the operation over |
+| Type | Parameter | Description |
+| ------- | --------------- | ---------------------------------------------------------------- |
+| `int` | `dim` | the dimension to do the operation over |
### Inputs
@@ -649,7 +651,7 @@ Returns a tuple (`values`, `indices`) where `values` is the cumulative maximum e
- T:tensor(float32)
-## cummin\*
+## cummin
### Description
@@ -657,9 +659,9 @@ Returns a tuple (`values`, `indices`) where `values` is the cumulative minimum e
### Parameters
-| Type | Parameter | Description |
-| ----- | --------- | -------------------------------------- |
-| `int` | `dim` | the dimension to do the operation over |
+| Type | Parameter | Description |
+| ------- | --------------- | ---------------------------------------------------------------- |
+| `int` | `dim` | the dimension to do the operation over |
### Inputs
@@ -680,7 +682,3 @@ Returns a tuple (`values`, `indices`) where `values` is the cumulative minimum e
### Type Constraints
- T:tensor(float32)
-
-## Reminders
-
-- Operators endwith `*` are defined in Torch and are included here for the conversion to ONNX.
diff --git a/docs/deployment/onnx.md b/docs/deployment/onnx.md
new file mode 100644
index 0000000000000000000000000000000000000000..be6c59c5c5dbe3d17d62f4c01c79df35afb19d6d
--- /dev/null
+++ b/docs/deployment/onnx.md
@@ -0,0 +1,19 @@
+## Introduction of onnx module in MMCV (Experimental)
+
+### register_extra_symbolics
+
+Some extra symbolic functions need to be registered before exporting a PyTorch model to ONNX.
+
+#### Example
+
+```python
+import mmcv
+from mmcv.onnx import register_extra_symbolics
+
+opset_version = 11
+register_extra_symbolics(opset_version)
+```
+
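+The registration must happen before `torch.onnx.export` is called. Below is a minimal sketch of the export step, where `model` is a hypothetical ready-to-export `torch.nn.Module` and the input shape is chosen arbitrarily:
+
+```python
+import torch
+
+dummy_input = torch.randn(1, 3, 224, 224)
+# Export after the extra symbolic functions have been registered above.
+torch.onnx.export(model, dummy_input, 'sample.onnx', opset_version=opset_version)
+```
+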
+#### FAQs
+
+- None
diff --git a/docs/deployment/onnxruntime_custom_ops.md b/docs/deployment/onnxruntime_custom_ops.md
new file mode 100644
index 0000000000000000000000000000000000000000..baaa576f6d789f0eb53b4005dec537de5e06e700
--- /dev/null
+++ b/docs/deployment/onnxruntime_custom_ops.md
@@ -0,0 +1,378 @@
+## Onnxruntime Custom Ops
+
+
+
+- [Onnxruntime Custom Ops](#onnxruntime-custom-ops)
+ - [SoftNMS](#softnms)
+ - [Description](#description)
+ - [Parameters](#parameters)
+ - [Inputs](#inputs)
+ - [Outputs](#outputs)
+ - [Type Constraints](#type-constraints)
+ - [RoIAlign](#roialign)
+ - [Description](#description-1)
+ - [Parameters](#parameters-1)
+ - [Inputs](#inputs-1)
+ - [Outputs](#outputs-1)
+ - [Type Constraints](#type-constraints-1)
+ - [NMS](#nms)
+ - [Description](#description-2)
+ - [Parameters](#parameters-2)
+ - [Inputs](#inputs-2)
+ - [Outputs](#outputs-2)
+ - [Type Constraints](#type-constraints-2)
+ - [grid_sampler](#grid_sampler)
+ - [Description](#description-3)
+ - [Parameters](#parameters-3)
+ - [Inputs](#inputs-3)
+ - [Outputs](#outputs-3)
+ - [Type Constraints](#type-constraints-3)
+ - [CornerPool](#cornerpool)
+ - [Description](#description-4)
+ - [Parameters](#parameters-4)
+ - [Inputs](#inputs-4)
+ - [Outputs](#outputs-4)
+ - [Type Constraints](#type-constraints-4)
+ - [cummax](#cummax)
+ - [Description](#description-5)
+ - [Parameters](#parameters-5)
+ - [Inputs](#inputs-5)
+ - [Outputs](#outputs-5)
+ - [Type Constraints](#type-constraints-5)
+ - [cummin](#cummin)
+ - [Description](#description-6)
+ - [Parameters](#parameters-6)
+ - [Inputs](#inputs-6)
+ - [Outputs](#outputs-6)
+ - [Type Constraints](#type-constraints-6)
+ - [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
+ - [Description](#description-7)
+ - [Parameters](#parameters-7)
+ - [Inputs](#inputs-7)
+ - [Outputs](#outputs-7)
+ - [Type Constraints](#type-constraints-7)
+ - [MMCVDeformConv2d](#mmcvdeformconv2d)
+ - [Description](#description-8)
+ - [Parameters](#parameters-8)
+ - [Inputs](#inputs-8)
+ - [Outputs](#outputs-8)
+ - [Type Constraints](#type-constraints-8)
+
+
+
+### SoftNMS
+
+#### Description
+
+Perform soft NMS on `boxes` with `scores`. Read [Soft-NMS -- Improving Object Detection With One Line of Code](https://arxiv.org/abs/1704.04503) for detail.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ------- | --------------- | -------------------------------------------------------------- |
+| `float` | `iou_threshold` | IoU threshold for NMS |
+| `float` | `sigma` | hyperparameter for gaussian method |
+| `float` | `min_score` | score filter threshold |
+| `int` | `method` | method to do the nms, (0: `naive`, 1: `linear`, 2: `gaussian`) |
+| `int` | `offset` | `boxes` width or height is (x2 - x1 + offset). (0 or 1) |
+
+#### Inputs
+
+
+boxes : T
+Input boxes. 2-D tensor of shape (N, 4). N is the number of boxes.
+scores : T
+Input scores. 1-D tensor of shape (N, ).
+
+
+#### Outputs
+
+
+dets : T
+Output boxes and scores. 2-D tensor of shape (num_valid_boxes, 5), [[x1, y1, x2, y2, score], ...]. num_valid_boxes is the number of valid boxes.
+indices : tensor(int64)
+Output indices. 1-D tensor of shape (num_valid_boxes, ).
+
+
+#### Type Constraints
+
+- T:tensor(float32)
+
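+A minimal sketch of the matching Python-side op, assuming mmcv is built with ops; the argument names follow `mmcv.ops.soft_nms` as I understand them and should be double-checked against your mmcv version:
+
+```python
+import torch
+from mmcv.ops import soft_nms
+
+boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]])
+scores = torch.tensor([0.9, 0.8])
+# dets has shape (num_kept, 5) as [x1, y1, x2, y2, score]; inds indexes the kept boxes.
+dets, inds = soft_nms(boxes, scores, iou_threshold=0.3, sigma=0.5, method='linear')
+```
+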
+### RoIAlign
+
+#### Description
+
+Perform RoIAlign on output feature, used in bbox_head of most two-stage detectors.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ------- | ---------------- | ------------------------------------------------------------------------------------------------------------- |
+| `int` | `output_height` | height of output roi |
+| `int` | `output_width` | width of output roi |
+| `float` | `spatial_scale` | used to scale the input boxes |
+| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
+| `str` | `mode` | pooling mode in each bin. `avg` or `max` |
+| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
+
+#### Inputs
+
+
+input : T
+Input feature map; 4D tensor of shape (N, C, H, W), where N is the batch size, C is the numbers of channels, H and W are the height and width of the data.
+rois : T
+RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoIs' coordinates are the coordinate system of input.
+
+
+#### Outputs
+
+
+feat : T
+RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element feat[r-1] is a pooled feature map corresponding to the r-th RoI RoIs[r-1].
+
+
+#### Type Constraints
+
+- T:tensor(float32)
+
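+A short sketch of the corresponding Python-side module, assuming `mmcv.ops.RoIAlign` with the constructor shown here (verify against your mmcv version):
+
+```python
+import torch
+from mmcv.ops import RoIAlign
+
+feat = torch.randn(1, 256, 32, 32)
+rois = torch.tensor([[0., 4., 4., 20., 20.]])  # [batch_index, x1, y1, x2, y2]
+roi_align = RoIAlign(output_size=(7, 7), spatial_scale=1.0, sampling_ratio=0)
+out = roi_align(feat, rois)  # (1, 256, 7, 7)
+```
+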
+### NMS
+
+#### Description
+
+Filter out boxes that have high IoU overlap with previously selected boxes.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ------- | --------------- | ---------------------------------------------------------------------------------------------------------------- |
+| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range [0, 1]. Default to 0. |
+| `int` | `offset` | 0 or 1, boxes' width or height is (x2 - x1 + offset). |
+
+#### Inputs
+
+
+bboxes : T
+Input boxes. 2-D tensor of shape (num_boxes, 4). num_boxes is the number of input boxes.
+scores : T
+Input scores. 1-D tensor of shape (num_boxes, ).
+
+
+#### Outputs
+
+
+indices : tensor(int32, Linear)
+Selected indices. 1-D tensor of shape (num_valid_boxes, ). num_valid_boxes is the number of valid boxes.
+
+
+#### Type Constraints
+
+- T:tensor(float32)
+
+### grid_sampler
+
+#### Description
+
+Sample from `input` at the pixel locations specified by `grid`.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ----- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `int` | `interpolation_mode` | Interpolation mode to calculate output values. (0: `bilinear` , 1: `nearest`) |
+| `int` | `padding_mode` | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
+| `int` | `align_corners` | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |
+
+#### Inputs
+
+
+input : T
+Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the numbers of channels, inH and inW are the height and width of the data.
+grid : T
+Input offset; 4-D tensor of shape (N, outH, outW, 2), where outH and outW is the height and width of offset and output.
+
+
+#### Outputs
+
+
+output : T
+Output feature; 4-D tensor of shape (N, C, outH, outW).
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
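+This op mirrors `torch.nn.functional.grid_sample`. A small sketch of the sampling semantics (all values chosen purely for illustration):
+
+```python
+import torch
+import torch.nn.functional as F
+
+feat = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)
+# Build an identity grid in normalized [-1, 1] coordinates; grid[..., 0] is x, grid[..., 1] is y.
+ys, xs = torch.meshgrid(torch.linspace(-1, 1, 4), torch.linspace(-1, 1, 4))
+grid = torch.stack((xs, ys), dim=-1).unsqueeze(0)  # (1, 4, 4, 2)
+# With align_corners=1 (True), the identity grid reproduces the input exactly.
+out = F.grid_sample(feat, grid, mode='bilinear', align_corners=True)
+```
+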
+### CornerPool
+
+#### Description
+
+Perform CornerPool on `input` features. Read [CornerNet -- Detecting Objects as Paired Keypoints](https://arxiv.org/abs/1808.01244) for more details.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ----- | --------- | ---------------------------------------------------------------- |
+| `int` | `mode` | corner pool mode, (0: `top`, 1: `bottom`, 2: `left`, 3: `right`) |
+
+#### Inputs
+
+
+input : T
+Input features. 4-D tensor of shape (N, C, H, W). N is the batch size.
+
+
+#### Outputs
+
+
+output : T
+Output the pooled features. 4-D tensor of shape (N, C, H, W).
+
+
+#### Type Constraints
+
+- T:tensor(float32)
+
+### cummax
+
+#### Description
+
+Returns a tuple (`values`, `indices`) where `values` is the cumulative maximum elements of `input` in the dimension `dim`. And `indices` is the index location of each maximum value found in the dimension `dim`. Read [torch.cummax](https://pytorch.org/docs/stable/generated/torch.cummax.html) for more details.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ----- | --------- | -------------------------------------- |
+| `int` | `dim` | the dimension to do the operation over |
+
+#### Inputs
+
+
+input : T
+The input tensor, with arbitrary shape. An empty tensor is also supported.
+
+
+#### Outputs
+
+
+output : T
+Output the cumulative maximum elements of `input` in the dimension `dim`, with the same shape and dtype as `input`.
+indices : tensor(int64)
+Output the index location of each cumulative maximum value found in the dimension `dim`, with the same shape as `input`.
+
+
+#### Type Constraints
+
+- T:tensor(float32)
+
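+A quick illustration of the semantics via `torch.cummax` (assuming torch >= 1.5):
+
+```python
+import torch
+
+x = torch.tensor([1., 3., 2., 5., 4.])
+values, indices = torch.cummax(x, dim=0)
+# values  -> tensor([1., 3., 3., 5., 5.])
+# indices -> tensor([0, 1, 1, 3, 3])
+```
+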
+### cummin
+
+#### Description
+
+Returns a tuple (`values`, `indices`) where `values` is the cumulative minimum elements of `input` in the dimension `dim`. And `indices` is the index location of each minimum value found in the dimension `dim`. Read [torch.cummin](https://pytorch.org/docs/stable/generated/torch.cummin.html) for more details.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ----- | --------- | -------------------------------------- |
+| `int` | `dim` | the dimension to do the operation over |
+
+#### Inputs
+
+
+input : T
+The input tensor, with arbitrary shape. An empty tensor is also supported.
+
+
+#### Outputs
+
+
+output : T
+Output the cumulative minimum elements of `input` in the dimension `dim`, with the same shape and dtype as `input`.
+indices : tensor(int64)
+Output the index location of each cumulative minimum value found in the dimension `dim`, with the same shape as `input`.
+
+
+#### Type Constraints
+
+- T:tensor(float32)
+
+### MMCVModulatedDeformConv2d
+
+#### Description
+
+Perform Modulated Deformable Convolution on input feature, read [Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline) for detail.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| -------------- | ------------------- | ------------------------------------------------------------------------------------- |
+| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
+| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
+| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
+| `int` | `deformable_groups` | Groups of deformable offset. |
+| `int` | `groups` | Split input into groups. `input_channel` should be divisible by the number of groups. |
+
+#### Inputs
+
+
+inputs[0] : T
+Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.
+inputs[1] : T
+Input offset; 4-D tensor of shape (N, deformable_group* 2* kH* kW, outH, outW), where kH and kW is the height and width of weight, outH and outW is the height and width of offset and output.
+inputs[2] : T
+Input mask; 4-D tensor of shape (N, deformable_group* kH* kW, outH, outW), where kH and kW is the height and width of weight, outH and outW is the height and width of offset and output.
+inputs[3] : T
+Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).
+inputs[4] : T, optional
+Input bias; 1-D tensor of shape (output_channel).
+
+
+#### Outputs
+
+
+outputs[0] : T
+Output feature; 4-D tensor of shape (N, output_channel, outH, outW).
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
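+A hedged sketch of the Python-side module this op is typically exported from, assuming `mmcv.ops.ModulatedDeformConv2dPack` (which predicts the offset and mask with an internal conv; verify the class name and signature against your mmcv version):
+
+```python
+import torch
+from mmcv.ops import ModulatedDeformConv2dPack
+
+conv = ModulatedDeformConv2dPack(3, 8, kernel_size=3, padding=1)
+x = torch.randn(1, 3, 32, 32)
+out = conv(x)  # (1, 8, 32, 32)
+```
+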
+### MMCVDeformConv2d
+
+#### Description
+
+Perform Deformable Convolution on input feature, read [Deformable Convolutional Network](https://arxiv.org/abs/1703.06211) for detail.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| -------------- | ------------------ | --------------------------------------------------------------------------------------------------------------------------------- |
+| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
+| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
+| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
+| `int` | `deformable_group` | Groups of deformable offset. |
+| `int` | `group` | Split input into groups. `input_channel` should be divisible by the number of groups. |
+| `int` | `im2col_step` | DeformableConv2d use im2col to compute convolution. im2col_step is used to split input and offset, reduce memory usage of column. |
+
+#### Inputs
+
+
+inputs[0] : T
+Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the numbers of channels, inH and inW are the height and width of the data.
+inputs[1] : T
+Input offset; 4-D tensor of shape (N, deformable_group* 2* kH* kW, outH, outW), where kH and kW is the height and width of weight, outH and outW is the height and width of offset and output.
+inputs[2] : T
+Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).
+
+
+#### Outputs
+
+
+outputs[0] : T
+Output feature; 4-D tensor of shape (N, output_channel, outH, outW).
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
diff --git a/docs/deployment/onnxruntime_op.md b/docs/deployment/onnxruntime_op.md
new file mode 100644
index 0000000000000000000000000000000000000000..f17b32a0647e2f25b1736580f385e7ae1fcb8163
--- /dev/null
+++ b/docs/deployment/onnxruntime_op.md
@@ -0,0 +1,126 @@
+## Custom operators for ONNX Runtime in MMCV
+
+### Introduction of ONNX Runtime
+
+**ONNX Runtime** is a cross-platform inferencing and training accelerator compatible with many popular ML/DNN frameworks. Check its [GitHub repository](https://github.com/microsoft/onnxruntime) for more information.
+
+### Introduction of ONNX
+
+**ONNX** stands for **Open Neural Network Exchange**, which acts as an *Intermediate Representation (IR)* for ML/DNN models from many frameworks. Check its [GitHub repository](https://github.com/onnx/onnx) for more information.
+
+### Why include custom operators for ONNX Runtime in MMCV
+
+- To verify the correctness of exported ONNX models in ONNX Runtime.
+- To ease the deployment of ONNX models with custom operators from `mmcv.ops` in ONNX Runtime.
+
+### List of operators for ONNX Runtime supported in MMCV
+
+| Operator | CPU | GPU | MMCV Releases |
+| :----------------------------------------------------: | :---: | :---: | :-----------: |
+| [SoftNMS](onnxruntime_custom_ops.md#softnms) | Y | N | 1.2.3 |
+| [RoIAlign](onnxruntime_custom_ops.md#roialign) | Y | N | 1.2.5 |
+| [NMS](onnxruntime_custom_ops.md#nms) | Y | N | 1.2.7 |
+| [grid_sampler](onnxruntime_custom_ops.md#grid_sampler) | Y | N | 1.3.1 |
+| [CornerPool](onnxruntime_custom_ops.md#cornerpool) | Y | N | 1.3.4 |
+| [cummax](onnxruntime_custom_ops.md#cummax) | Y | N | master |
+| [cummin](onnxruntime_custom_ops.md#cummin) | Y | N | master |
+
+### How to build custom operators for ONNX Runtime
+
+*Please note that only the CPU version of **onnxruntime>=1.8.1** on the Linux platform has been tested so far.*
+
+#### Prerequisite
+
+- Clone repository
+
+```bash
+git clone https://github.com/open-mmlab/mmcv.git
+```
+
+- Download `onnxruntime-linux` from ONNX Runtime [releases](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1), extract it, expose `ONNXRUNTIME_DIR` and finally add the lib path to `LD_LIBRARY_PATH` as below:
+
+```bash
+wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
+
+tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
+cd onnxruntime-linux-x64-1.8.1
+export ONNXRUNTIME_DIR=$(pwd)
+export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
+```
+
+#### Build on Linux
+
+```bash
+cd mmcv ## to MMCV root directory
+MMCV_WITH_OPS=1 MMCV_WITH_ORT=1 python setup.py develop
+```
+
+### How to do inference using exported ONNX models with custom operators in ONNX Runtime in python
+
+Install ONNX Runtime with `pip`
+
+```bash
+pip install onnxruntime==1.8.1
+```
+
+Inference Demo
+
+```python
+import os
+
+import numpy as np
+import onnxruntime as ort
+
+from mmcv.ops import get_onnxruntime_op_path
+
+ort_custom_op_path = get_onnxruntime_op_path()
+assert os.path.exists(ort_custom_op_path)
+session_options = ort.SessionOptions()
+session_options.register_custom_ops_library(ort_custom_op_path)
+## exported ONNX model with custom operators
+onnx_file = 'sample.onnx'
+input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
+sess = ort.InferenceSession(onnx_file, session_options)
+onnx_results = sess.run(None, {'input' : input_data})
+```
+
+### How to add a new custom operator for ONNX Runtime in MMCV
+
+#### Reminder
+
+- The custom operator is not included in [supported operator list](https://github.com/microsoft/onnxruntime/blob/master/docs/OperatorKernels.md) in ONNX Runtime.
+- The custom operator should be able to be exported to ONNX.
+
+#### Main procedures
+
+Take the custom operator `soft_nms` as an example.
+
+1. Add the header `soft_nms.h` to the ONNX Runtime include directory `mmcv/ops/csrc/onnxruntime/`
+2. Add the source `soft_nms.cpp` to the ONNX Runtime source directory `mmcv/ops/csrc/onnxruntime/cpu/`
+3. Register `soft_nms` operator in [onnxruntime_register.cpp](../../mmcv/ops/csrc/onnxruntime/cpu/onnxruntime_register.cpp)
+
+ ```c++
+ #include "soft_nms.h"
+
+ // A static instance of the custom op to be registered.
+ SoftNmsOp c_SoftNmsOp;
+
+ // Inside the registration function: add the op to the custom op domain, aborting on failure.
+ if (auto status = ortApi->CustomOpDomain_Add(domain, &c_SoftNmsOp)) {
+   return status;
+ }
+ ```
+
+4. Add unit test into `tests/test_ops/test_onnx.py`
+ Check [here](../../tests/test_ops/test_onnx.py) for examples.
+
+**Finally, you are welcome to send us a PR adding custom operators for ONNX Runtime in MMCV.** :nerd_face:
+
+### Known Issues
+
+- "RuntimeError: tuple appears in op that does not forward tuples, unsupported kind: `prim::PythonOp`."
+ 1. Note that generally `cummax` or `cummin` is exportable to ONNX as long as the torch version >= 1.5.0, since `torch.cummax` is only supported with torch >= 1.5.0. But when `cummax` or `cummin` serves as an intermediate component whose outputs are used as inputs to other modules, the torch version must be >= 1.7.0. Otherwise the above error may arise when running the exported ONNX model with ONNX Runtime.
+ 2. Solution: update the torch version to 1.7.0 or higher.
+
+### References
+
+- [How to export Pytorch model with custom op to ONNX and run it in ONNX Runtime](https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md)
+- [How to add a custom operator/kernel in ONNX Runtime](https://github.com/microsoft/onnxruntime/blob/master/docs/AddingCustomOp.md)
diff --git a/docs/deployment/tensorrt_custom_ops.md b/docs/deployment/tensorrt_custom_ops.md
new file mode 100644
index 0000000000000000000000000000000000000000..be47e355be6316295ca18f12450630e9fe6d3854
--- /dev/null
+++ b/docs/deployment/tensorrt_custom_ops.md
@@ -0,0 +1,395 @@
+## TensorRT Custom Ops
+
+
+
+- [TensorRT Custom Ops](#tensorrt-custom-ops)
+ - [MMCVRoIAlign](#mmcvroialign)
+ - [Description](#description)
+ - [Parameters](#parameters)
+ - [Inputs](#inputs)
+ - [Outputs](#outputs)
+ - [Type Constraints](#type-constraints)
+ - [ScatterND](#scatternd)
+ - [Description](#description-1)
+ - [Parameters](#parameters-1)
+ - [Inputs](#inputs-1)
+ - [Outputs](#outputs-1)
+ - [Type Constraints](#type-constraints-1)
+ - [NonMaxSuppression](#nonmaxsuppression)
+ - [Description](#description-2)
+ - [Parameters](#parameters-2)
+ - [Inputs](#inputs-2)
+ - [Outputs](#outputs-2)
+ - [Type Constraints](#type-constraints-2)
+ - [MMCVDeformConv2d](#mmcvdeformconv2d)
+ - [Description](#description-3)
+ - [Parameters](#parameters-3)
+ - [Inputs](#inputs-3)
+ - [Outputs](#outputs-3)
+ - [Type Constraints](#type-constraints-3)
+ - [grid_sampler](#grid_sampler)
+ - [Description](#description-4)
+ - [Parameters](#parameters-4)
+ - [Inputs](#inputs-4)
+ - [Outputs](#outputs-4)
+ - [Type Constraints](#type-constraints-4)
+ - [cummax](#cummax)
+ - [Description](#description-5)
+ - [Parameters](#parameters-5)
+ - [Inputs](#inputs-5)
+ - [Outputs](#outputs-5)
+ - [Type Constraints](#type-constraints-5)
+ - [cummin](#cummin)
+ - [Description](#description-6)
+ - [Parameters](#parameters-6)
+ - [Inputs](#inputs-6)
+ - [Outputs](#outputs-6)
+ - [Type Constraints](#type-constraints-6)
+ - [MMCVInstanceNormalization](#mmcvinstancenormalization)
+ - [Description](#description-7)
+ - [Parameters](#parameters-7)
+ - [Inputs](#inputs-7)
+ - [Outputs](#outputs-7)
+ - [Type Constraints](#type-constraints-7)
+ - [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
+ - [Description](#description-8)
+ - [Parameters](#parameters-8)
+ - [Inputs](#inputs-8)
+ - [Outputs](#outputs-8)
+ - [Type Constraints](#type-constraints-8)
+
+
+
+### MMCVRoIAlign
+
+#### Description
+
+Perform RoIAlign on output feature, used in bbox_head of most two-stage detectors.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ------- | ---------------- | ------------------------------------------------------------------------------------------------------------- |
+| `int` | `output_height` | height of output roi |
+| `int` | `output_width` | width of output roi |
+| `float` | `spatial_scale` | used to scale the input boxes |
+| `int` | `sampling_ratio` | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
+| `str` | `mode` | pooling mode in each bin. `avg` or `max` |
+| `int` | `aligned` | If `aligned=0`, use the legacy implementation in MMDetection. Else, align the results more perfectly. |
+
+#### Inputs
+
+
+inputs[0] : T
+Input feature map; 4D tensor of shape (N, C, H, W), where N is the batch size, C is the numbers of channels, H and W are the height and width of the data.
+inputs[1] : T
+RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoIs' coordinates are the coordinate system of inputs[0].
+
+
+#### Outputs
+
+
+outputs[0] : T
+RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element output[0][r-1] is a pooled feature map corresponding to the r-th RoI inputs[1][r-1].
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
+### ScatterND
+
+#### Description
+
+ScatterND takes three inputs: a `data` tensor of rank r >= 1, an `indices` tensor of rank q >= 1, and an `updates` tensor of rank q + r - indices.shape[-1] - 1. The output of the operation is produced by creating a copy of the input `data`, and then updating its values at the index positions specified by `indices` to the values specified by `updates`. Its output shape is the same as the shape of `data`. Note that `indices` should not have duplicate entries. That is, two or more updates for the same index-location are not supported.
+
+The `output` is calculated via the following equation:
+
+```python
+import numpy as np
+
+# `data`, `indices`, and `updates` are the three inputs described below.
+output = np.copy(data)
+update_indices = indices.shape[:-1]
+for idx in np.ndindex(update_indices):
+    output[indices[idx]] = updates[idx]
+```
+
+#### Parameters
+
+None
+
+#### Inputs
+
+
+inputs[0] : T
+Tensor of rank r>=1.
+
+inputs[1] : tensor(int32, Linear)
+Tensor of rank q>=1.
+
+inputs[2] : T
+Tensor of rank q + r - indices_shape[-1] - 1.
+
+
+#### Outputs
+
+
+outputs[0] : T
+Tensor of rank r >= 1.
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear), tensor(int32, Linear)
+
+### NonMaxSuppression
+
+#### Description
+
+Filter out boxes that have high IoU overlap with previously selected boxes or a low score. Output the indices of valid boxes. Indices of invalid boxes will be filled with -1.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ------- | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
+| `int` | `center_point_box` | 0 - the box data is supplied as [y1, x1, y2, x2], 1-the box data is supplied as [x_center, y_center, width, height]. |
+| `int` | `max_output_boxes_per_class` | The maximum number of boxes to be selected per batch per class. Default to 0, number of output boxes equal to number of input boxes. |
+| `float` | `iou_threshold` | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range [0, 1]. Default to 0. |
+| `float` | `score_threshold` | The threshold for deciding when to remove boxes based on score. |
+| `int` | `offset` | 0 or 1, boxes' width or height is (x2 - x1 + offset). |
+
+#### Inputs
+
+
+inputs[0] : T
+Input boxes. 3-D tensor of shape (num_batches, spatial_dimension, 4).
+inputs[1] : T
+Input scores. 3-D tensor of shape (num_batches, num_classes, spatial_dimension).
+
+
+#### Outputs
+
+
+outputs[0] : tensor(int32, Linear)
+Selected indices. 2-D tensor of shape (num_selected_indices, 3) as [[batch_index, class_index, box_index], ...].
+num_selected_indices=num_batches* num_classes* min(max_output_boxes_per_class, spatial_dimension).
+All invalid indices will be filled with -1.
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
+### MMCVDeformConv2d
+
+#### Description
+
+Perform Deformable Convolution on input feature, read [Deformable Convolutional Network](https://arxiv.org/abs/1703.06211) for detail.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| -------------- | ------------------ | --------------------------------------------------------------------------------------------------------------------------------- |
+| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
+| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
+| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
+| `int` | `deformable_group` | Groups of deformable offset. |
+| `int` | `group` | Split input into groups. `input_channel` should be divisible by the number of groups. |
+| `int` | `im2col_step` | DeformableConv2d use im2col to compute convolution. im2col_step is used to split input and offset, reduce memory usage of column. |
+
+#### Inputs
+
+
+inputs[0] : T
+Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, and inH and inW are the height and width of the data.
+inputs[1] : T
+Input offset; 4-D tensor of shape (N, deformable_group * 2 * kH * kW, outH, outW), where kH and kW are the height and width of the weight, and outH and outW are the height and width of the offset and output.
+inputs[2] : T
+Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).
+
+
+#### Outputs
+
+
+outputs[0] : T
+Output feature; 4-D tensor of shape (N, output_channel, outH, outW).
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
+### grid_sampler
+
+#### Description
+
+Samples values from `input` at the pixel locations specified by `grid`.
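+
+The semantics mirror those of PyTorch's `torch.nn.functional.grid_sample` with normalized coordinates in [-1, 1]. A minimal PyTorch sketch (the identity grid below is purely illustrative):
+
+```python
+import torch
+import torch.nn.functional as F
+
+x = torch.arange(16.).reshape(1, 1, 4, 4)
+# identity grid in normalized [-1, 1] coordinates, shape (N, outH, outW, 2)
+lin = torch.linspace(-1, 1, 4)
+ys = lin.view(4, 1).expand(4, 4)  # y varies along the height axis
+xs = lin.view(1, 4).expand(4, 4)  # x varies along the width axis
+grid = torch.stack((xs, ys), dim=-1).unsqueeze(0)
+out = F.grid_sample(x, grid, mode='bilinear', padding_mode='zeros',
+                    align_corners=True)
+assert torch.allclose(out, x)  # the identity grid reproduces the input
+```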
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ----- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `int` | `interpolation_mode` | Interpolation mode to calculate output values. (0: `bilinear` , 1: `nearest`) |
+| `int` | `padding_mode` | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
+| `int` | `align_corners` | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |
+
+#### Inputs
+
+
+inputs[0] : T
+Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, and inH and inW are the height and width of the data.
+inputs[1] : T
+Input grid; 4-D tensor of shape (N, outH, outW, 2), where outH and outW are the height and width of the output.
+
+
+#### Outputs
+
+
+outputs[0] : T
+Output feature; 4-D tensor of shape (N, C, outH, outW).
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
+### cummax
+
+#### Description
+
+Returns a namedtuple (`values`, `indices`) where `values` is the cumulative maximum of the elements of `input` along the dimension `dim`, and `indices` is the index location of each maximum value found along that dimension.
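+
+The behavior matches that of `torch.cummax`; a minimal PyTorch example:
+
+```python
+import torch
+
+x = torch.tensor([1., 3., 2., 4.])
+values, indices = torch.cummax(x, dim=0)
+# values:  tensor([1., 3., 3., 4.])
+# indices: tensor([0, 1, 1, 3])
+```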
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ----- | --------- | --------------------------------------- |
+| `int` | `dim` | The dimension to do the operation over. |
+
+#### Inputs
+
+
+inputs[0] : T
+The input tensor.
+
+
+#### Outputs
+
+
+outputs[0] : T
+Output values.
+outputs[1] : tensor(int32, Linear)
+Output indices.
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
+### cummin
+
+#### Description
+
+Returns a namedtuple (`values`, `indices`) where `values` is the cumulative minimum of the elements of `input` along the dimension `dim`, and `indices` is the index location of each minimum value found along that dimension.
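+
+The behavior matches that of `torch.cummin`; a minimal PyTorch example:
+
+```python
+import torch
+
+x = torch.tensor([3., 1., 2., 0.])
+values, indices = torch.cummin(x, dim=0)
+# values:  tensor([3., 1., 1., 0.])
+# indices: tensor([0, 1, 1, 3])
+```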
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ----- | --------- | --------------------------------------- |
+| `int` | `dim` | The dimension to do the operation over. |
+
+#### Inputs
+
+
+inputs[0] : T
+The input tensor.
+
+
+#### Outputs
+
+
+outputs[0] : T
+Output values.
+outputs[1] : tensor(int32, Linear)
+Output indices.
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
+### MMCVInstanceNormalization
+
+#### Description
+
+Carries out instance normalization as described in the paper [Instance Normalization: The Missing Ingredient for Fast Stylization](https://arxiv.org/abs/1607.08022).
+
+`y = scale * (x - mean) / sqrt(variance + epsilon) + B`, where the mean and variance are computed per instance, per channel.
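+
+A minimal NumPy sketch of this computation (not the plugin implementation), assuming a 4-D (N, C, H, W) input:
+
+```python
+import numpy as np
+
+def instance_norm(x, scale, B, eps=1e-5):
+    # statistics are computed per instance, per channel, i.e. over (H, W)
+    mean = x.mean(axis=(2, 3), keepdims=True)
+    variance = x.var(axis=(2, 3), keepdims=True)
+    return (scale[None, :, None, None] * (x - mean)
+            / np.sqrt(variance + eps) + B[None, :, None, None])
+```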
+
+#### Parameters
+
+| Type | Parameter | Description |
+| ------- | --------- | -------------------------------------------------------------------- |
+| `float` | `epsilon` | The epsilon value to use to avoid division by zero. Default is 1e-05 |
+
+#### Inputs
+
+
+input : T
+Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.
+scale : T
+The input 1-dimensional scale tensor of size C.
+B : T
+The input 1-dimensional bias tensor of size C.
+
+
+#### Outputs
+
+
+output : T
+The output tensor of the same shape as input.
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
+
+### MMCVModulatedDeformConv2d
+
+#### Description
+
+Performs modulated deformable convolution on the input feature map. See [Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline) for details.
+
+#### Parameters
+
+| Type | Parameter | Description |
+| -------------- | ------------------ | ------------------------------------------------------------------------------------- |
+| `list of ints` | `stride` | The stride of the convolving kernel. (sH, sW) |
+| `list of ints` | `padding` | Paddings on both sides of the input. (padH, padW) |
+| `list of ints` | `dilation` | The spacing between kernel elements. (dH, dW) |
+| `int` | `deformable_group` | Groups of deformable offset. |
+| `int` | `group` | Split input into groups. `input_channel` should be divisible by the number of groups. |
+
+#### Inputs
+
+
+inputs[0] : T
+Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, and inH and inW are the height and width of the data.
+inputs[1] : T
+Input offset; 4-D tensor of shape (N, deformable_group * 2 * kH * kW, outH, outW), where kH and kW are the height and width of the weight, and outH and outW are the height and width of the offset and output.
+inputs[2] : T
+Input mask; 4-D tensor of shape (N, deformable_group * kH * kW, outH, outW), where kH and kW are the height and width of the weight, and outH and outW are the height and width of the mask and output.
+inputs[3] : T
+Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).
+inputs[4] : T, optional
+Input bias; 1-D tensor of shape (output_channel).
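+
+A quick shape check in PyTorch, assuming a 3x3 kernel with stride=1, padding=1 and dilation=1 (which preserves the spatial size); all values are illustrative:
+
+```python
+import torch
+
+N, C, inH, inW = 1, 16, 32, 32
+kH = kW = 3
+deformable_group, output_channel = 1, 32
+outH, outW = inH, inW  # stride=1, padding=1, dilation=1 keeps the size
+
+feature = torch.randn(N, C, inH, inW)
+offset = torch.randn(N, deformable_group * 2 * kH * kW, outH, outW)
+mask = torch.sigmoid(torch.randn(N, deformable_group * kH * kW, outH, outW))
+weight = torch.randn(output_channel, C, kH, kW)
+bias = torch.randn(output_channel)
+```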
+
+
+#### Outputs
+
+
+outputs[0] : T
+Output feature; 4-D tensor of shape (N, output_channel, outH, outW).
+
+
+#### Type Constraints
+
+- T:tensor(float32, Linear)
diff --git a/docs/deployment/tensorrt_plugin.md b/docs/deployment/tensorrt_plugin.md
new file mode 100644
index 0000000000000000000000000000000000000000..cd8924e33e5183516dcc86d5dc5b2fd786a54f87
--- /dev/null
+++ b/docs/deployment/tensorrt_plugin.md
@@ -0,0 +1,178 @@
+## TensorRT Plugins for custom operators in MMCV (Experimental)
+
+
+
+- [TensorRT Plugins for custom operators in MMCV (Experimental)](#tensorrt-plugins-for-custom-operators-in-mmcv-experimental)
+ - [Introduction](#introduction)
+ - [List of TensorRT plugins supported in MMCV](#list-of-tensorrt-plugins-supported-in-mmcv)
+ - [How to build TensorRT plugins in MMCV](#how-to-build-tensorrt-plugins-in-mmcv)
+ - [Prerequisite](#prerequisite)
+ - [Build on Linux](#build-on-linux)
+ - [Create TensorRT engine and run inference in python](#create-tensorrt-engine-and-run-inference-in-python)
+ - [How to add a TensorRT plugin for custom op in MMCV](#how-to-add-a-tensorrt-plugin-for-custom-op-in-mmcv)
+ - [Main procedures](#main-procedures)
+ - [Reminders](#reminders)
+ - [Known Issues](#known-issues)
+ - [References](#references)
+
+
+
+### Introduction
+
+**NVIDIA TensorRT** is a software development kit (SDK) for high-performance inference of deep learning models. It includes a deep learning inference optimizer and a runtime that delivers low latency and high throughput for deep learning inference applications. Please check its [developer's website](https://developer.nvidia.com/tensorrt) for more information.
+To ease the deployment of trained models with custom operators from `mmcv.ops` using TensorRT, a series of TensorRT plugins are included in MMCV.
+
+### List of TensorRT plugins supported in MMCV
+
+| ONNX Operator | TensorRT Plugin | MMCV Releases |
+| :-----------------------: | :-----------------------------------------------------------------------------: | :-----------: |
+| MMCVRoiAlign | [MMCVRoiAlign](./tensorrt_custom_ops.md#mmcvroialign) | 1.2.6 |
+| ScatterND | [ScatterND](./tensorrt_custom_ops.md#scatternd) | 1.2.6 |
+| NonMaxSuppression | [NonMaxSuppression](./tensorrt_custom_ops.md#nonmaxsuppression) | 1.3.0 |
+| MMCVDeformConv2d | [MMCVDeformConv2d](./tensorrt_custom_ops.md#mmcvdeformconv2d) | 1.3.0 |
+| grid_sampler | [grid_sampler](./tensorrt_custom_ops.md#grid-sampler) | 1.3.1 |
+| cummax | [cummax](./tensorrt_custom_ops.md#cummax) | 1.3.5 |
+| cummin | [cummin](./tensorrt_custom_ops.md#cummin) | 1.3.5 |
+| MMCVInstanceNormalization | [MMCVInstanceNormalization](./tensorrt_custom_ops.md#mmcvinstancenormalization) | 1.3.5 |
+| MMCVModulatedDeformConv2d | [MMCVModulatedDeformConv2d](./tensorrt_custom_ops.md#mmcvmodulateddeformconv2d) | master |
+
+Notes
+
+- All plugins listed above are developed on TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0
+
+### How to build TensorRT plugins in MMCV
+
+#### Prerequisite
+
+- Clone repository
+
+```bash
+git clone https://github.com/open-mmlab/mmcv.git
+```
+
+- Install TensorRT
+
+Download the corresponding TensorRT build from [NVIDIA Developer Zone](https://developer.nvidia.com/nvidia-tensorrt-download).
+
+For example, for Ubuntu 16.04 on x86-64 with cuda-10.2, the downloaded file is `TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz`.
+
+Then, install as below:
+
+```bash
+cd ~/Downloads
+tar -xvzf TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz
+export TENSORRT_DIR=`pwd`/TensorRT-7.2.1.6
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$TENSORRT_DIR/lib
+```
+
+Install the Python packages tensorrt, graphsurgeon, and onnx-graphsurgeon:
+
+```bash
+# pick the tensorrt wheel matching your Python version (cp37 = Python 3.7)
+pip install $TENSORRT_DIR/python/tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl
+pip install $TENSORRT_DIR/onnx_graphsurgeon/onnx_graphsurgeon-0.2.6-py2.py3-none-any.whl
+pip install $TENSORRT_DIR/graphsurgeon/graphsurgeon-0.4.5-py2.py3-none-any.whl
+```
+
+For more detailed information on installing TensorRT using the tar file, please refer to [NVIDIA's website](https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-721/install-guide/index.html#installing-tar).
+
+#### Build on Linux
+
+```bash
+cd mmcv  # to the MMCV root directory
+MMCV_WITH_OPS=1 MMCV_WITH_TRT=1 pip install -e .
+```
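+
+As a quick sanity check (assuming the build above succeeded), you can verify that the plugins can be loaded:
+
+```python
+from mmcv.tensorrt import is_tensorrt_plugin_loaded
+
+print(is_tensorrt_plugin_loaded())  # should print True after a successful build
+```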
+
+### Create TensorRT engine and run inference in python
+
+Here is an example.
+
+```python
+import torch
+import onnx
+
+from mmcv.tensorrt import (TRTWrapper, onnx2trt, save_trt_engine,
+                           is_tensorrt_plugin_loaded)
+
+assert is_tensorrt_plugin_loaded(), 'Requires compiling TensorRT plugins in mmcv'
+
+onnx_file = 'sample.onnx'
+trt_file = 'sample.trt'
+onnx_model = onnx.load(onnx_file)
+
+# Model input
+inputs = torch.rand(1, 3, 224, 224).cuda()
+# Model input shape info as [min_shape, opt_shape, max_shape]
+opt_shape_dict = {
+    'input': [list(inputs.shape),
+              list(inputs.shape),
+              list(inputs.shape)]
+}
+
+# Create TensorRT engine
+max_workspace_size = 1 << 30
+trt_engine = onnx2trt(
+    onnx_model,
+    opt_shape_dict,
+    max_workspace_size=max_workspace_size)
+
+# Save TensorRT engine
+save_trt_engine(trt_engine, trt_file)
+
+# Run inference with TensorRT
+trt_model = TRTWrapper(trt_file, ['input'], ['output'])
+
+with torch.no_grad():
+    trt_outputs = trt_model({'input': inputs})
+    output = trt_outputs['output']
+```
+
+### How to add a TensorRT plugin for custom op in MMCV
+
+#### Main procedures
+
+Below are the main steps:
+
+1. Add a C++ header file
+2. Add a C++ source file
+3. Add a CUDA kernel file
+4. Register plugin in `trt_plugin.cpp`
+5. Add unit test in `tests/test_ops/test_tensorrt.py`
+
+**Take the RoIAlign plugin `roi_align` as an example.**
+
+1. Add the header `trt_roi_align.hpp` to the TensorRT include directory `mmcv/ops/csrc/tensorrt/`
+2. Add the source `trt_roi_align.cpp` to the TensorRT source directory `mmcv/ops/csrc/tensorrt/plugins/`
+3. Add the CUDA kernel `trt_roi_align_kernel.cu` to the TensorRT source directory `mmcv/ops/csrc/tensorrt/plugins/`
+4. Register `roi_align` plugin in [trt_plugin.cpp](https://github.com/open-mmlab/mmcv/blob/master/mmcv/ops/csrc/tensorrt/plugins/trt_plugin.cpp)
+
+ ```c++
+ #include "trt_plugin.hpp"
+
+ #include "trt_roi_align.hpp"
+
+ REGISTER_TENSORRT_PLUGIN(RoIAlignPluginDynamicCreator);
+
+ extern "C" {
+ bool initLibMMCVInferPlugins() { return true; }
+ } // extern "C"
+ ```
+
+5. Add unit test into `tests/test_ops/test_tensorrt.py`
+ Check [here](https://github.com/open-mmlab/mmcv/blob/master/tests/test_ops/test_tensorrt.py) for examples.
+
+#### Reminders
+
+- Some of the [custom ops](https://mmcv.readthedocs.io/en/latest/ops.html) in `mmcv` already have CUDA implementations, which can be used as references.
+
+### Known Issues
+
+- None
+
+### References
+
+- [Developer guide of Nvidia TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html)
+- [TensorRT Open Source Software](https://github.com/NVIDIA/TensorRT)
+- [onnx-tensorrt](https://github.com/onnx/onnx-tensorrt)
+- [TensorRT python API](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/index.html)
+- [TensorRT c++ plugin API](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_plugin.html)
diff --git a/docs/en/_static/version.json b/docs/en/_static/version.json
deleted file mode 100644
index 7ee4965d36ed96f63f484137921d156d19cc40da..0000000000000000000000000000000000000000
--- a/docs/en/_static/version.json
+++ /dev/null
@@ -1,575 +0,0 @@
-{
- "Linux": [
- {
- "cuda": "11.7",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "11.6",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "11.6",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.5",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.0",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "9.2",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "9.2",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- }
- ],
- "Windows": [
- {
- "cuda": "11.7",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "11.6",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "11.6",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.5",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- }
- ],
- "macOS": [
- {
- "cuda": "cpu",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "mps",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- }
- ]
-}
diff --git a/docs/en/_templates/classtemplate.rst b/docs/en/_templates/classtemplate.rst
deleted file mode 100644
index 4f74842394ec9807fb1ae2d8f05a8a57e9a2e24c..0000000000000000000000000000000000000000
--- a/docs/en/_templates/classtemplate.rst
+++ /dev/null
@@ -1,14 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-.. currentmodule:: {{ module }}
-
-
-{{ name | underline}}
-
-.. autoclass:: {{ name }}
- :members:
-
-
-..
- autogenerated from source/_templates/classtemplate.rst
- note it does not have :inherited-members:
diff --git a/docs/en/api/arraymisc.rst b/docs/en/api/arraymisc.rst
deleted file mode 100644
index 28975eb76e94994c50d2fe52b8f34c7ce533e788..0000000000000000000000000000000000000000
--- a/docs/en/api/arraymisc.rst
+++ /dev/null
@@ -1,19 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.arraymisc
-===================================
-
-.. contents:: mmcv.arraymisc
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.arraymisc
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- quantize
- dequantize
diff --git a/docs/en/api/cnn.rst b/docs/en/api/cnn.rst
deleted file mode 100644
index 022191f179fdbe3b1644abbb96ffdc92e4e37e06..0000000000000000000000000000000000000000
--- a/docs/en/api/cnn.rst
+++ /dev/null
@@ -1,71 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.cnn
-===================================
-
-.. contents:: mmcv.cnn
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.cnn
-
-Module
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- ContextBlock
- Conv2d
- Conv3d
- ConvAWS2d
- ConvModule
- ConvTranspose2d
- ConvTranspose3d
- ConvWS2d
- DepthwiseSeparableConvModule
- GeneralizedAttention
- HSigmoid
- HSwish
- LayerScale
- Linear
- MaxPool2d
- MaxPool3d
- NonLocal1d
- NonLocal2d
- NonLocal3d
- Scale
- Swish
- Conv2dRFSearchOp
-
-Build Function
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- build_activation_layer
- build_conv_layer
- build_norm_layer
- build_padding_layer
- build_plugin_layer
- build_upsample_layer
-
-Miscellaneous
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- fuse_conv_bn
- conv_ws_2d
- is_norm
- make_res_layer
- make_vgg_layer
- get_model_complexity_info
diff --git a/docs/en/api/image.rst b/docs/en/api/image.rst
deleted file mode 100644
index 3b93484952cd0c45b9d103088b0677f93fe5615d..0000000000000000000000000000000000000000
--- a/docs/en/api/image.rst
+++ /dev/null
@@ -1,100 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.image
-===================================
-
-.. contents:: mmcv.image
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.image
-
-IO
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- imfrombytes
- imread
- imwrite
- use_backend
-
-Color Space
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- bgr2gray
- bgr2hls
- bgr2hsv
- bgr2rgb
- bgr2ycbcr
- gray2bgr
- gray2rgb
- hls2bgr
- hsv2bgr
- imconvert
- rgb2bgr
- rgb2gray
- rgb2ycbcr
- ycbcr2bgr
- ycbcr2rgb
-
-Geometric
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- cutout
- imcrop
- imflip
- impad
- impad_to_multiple
- imrescale
- imresize
- imresize_like
- imresize_to_multiple
- imrotate
- imshear
- imtranslate
- rescale_size
-
-Photometric
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- adjust_brightness
- adjust_color
- adjust_contrast
- adjust_hue
- adjust_lighting
- adjust_sharpness
- auto_contrast
- clahe
- imdenormalize
- imequalize
- iminvert
- imnormalize
- lut_transform
- posterize
- solarize
-
-Miscellaneous
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- tensor2imgs
diff --git a/docs/en/api/ops.rst b/docs/en/api/ops.rst
deleted file mode 100644
index b0290457bfa0c08f14d7fe346efccb33f388bdae..0000000000000000000000000000000000000000
--- a/docs/en/api/ops.rst
+++ /dev/null
@@ -1,135 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.ops
-===================================
-
-.. contents:: mmcv.ops
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.ops
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- BorderAlign
- CARAFE
- CARAFENaive
- CARAFEPack
- Conv2d
- ConvTranspose2d
- CornerPool
- Correlation
- CrissCrossAttention
- DeformConv2d
- DeformConv2dPack
- DeformRoIPool
- DeformRoIPoolPack
- DynamicScatter
- FusedBiasLeakyReLU
- GroupAll
- Linear
- MaskedConv2d
- MaxPool2d
- ModulatedDeformConv2d
- ModulatedDeformConv2dPack
- ModulatedDeformRoIPoolPack
- MultiScaleDeformableAttention
- PSAMask
- PointsSampler
- PrRoIPool
- QueryAndGroup
- RiRoIAlignRotated
- RoIAlign
- RoIAlignRotated
- RoIAwarePool3d
- RoIPointPool3d
- RoIPool
- SAConv2d
- SigmoidFocalLoss
- SimpleRoIAlign
- SoftmaxFocalLoss
- SparseConv2d
- SparseConv3d
- SparseConvTensor
- SparseConvTranspose2d
- SparseConvTranspose3d
- SparseInverseConv2d
- SparseInverseConv3d
- SparseMaxPool2d
- SparseMaxPool3d
- SparseModule
- SparseSequential
- SubMConv2d
- SubMConv3d
- SyncBatchNorm
- TINShift
- Voxelization
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- active_rotated_filter
- assign_score_withk
- ball_query
- batched_nms
- bbox_overlaps
- border_align
- box_iou_rotated
- boxes_iou3d
- boxes_iou_bev
- boxes_overlap_bev
- carafe
- carafe_naive
- chamfer_distance
- contour_expand
- convex_giou
- convex_iou
- deform_conv2d
- deform_roi_pool
- diff_iou_rotated_2d
- diff_iou_rotated_3d
- dynamic_scatter
- furthest_point_sample
- furthest_point_sample_with_dist
- fused_bias_leakyrelu
- gather_points
- grouping_operation
- knn
- masked_conv2d
- min_area_polygons
- modulated_deform_conv2d
- nms
- nms3d
- nms3d_normal
- nms_bev
- nms_match
- nms_normal_bev
- nms_rotated
- pixel_group
- point_sample
- points_in_boxes_all
- points_in_boxes_cpu
- points_in_boxes_part
- points_in_polygons
- prroi_pool
- rel_roi_point_to_rel_img_point
- riroi_align_rotated
- roi_align
- roi_align_rotated
- roi_pool
- rotated_feature_align
- scatter_nd
- sigmoid_focal_loss
- soft_nms
- softmax_focal_loss
- three_interpolate
- three_nn
- tin_shift
- upfirdn2d
- voxelization
diff --git a/docs/en/api/transforms.rst b/docs/en/api/transforms.rst
deleted file mode 100644
index b080133d6b7736398b855174c325169b8af92aae..0000000000000000000000000000000000000000
--- a/docs/en/api/transforms.rst
+++ /dev/null
@@ -1,60 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.transforms
-===================================
-
-.. currentmodule:: mmcv.transforms
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- BaseTransform
- TestTimeAug
-
-Loading
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- LoadAnnotations
- LoadImageFromFile
-
-Processing
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- CenterCrop
- MultiScaleFlipAug
- Normalize
- Pad
- RandomChoiceResize
- RandomFlip
- RandomGrayscale
- RandomResize
- Resize
- ToTensor
- ImageToTensor
-
-Wrapper
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- Compose
- KeyMapper
- RandomApply
- RandomChoice
- TransformBroadcaster
diff --git a/docs/en/api/utils.rst b/docs/en/api/utils.rst
deleted file mode 100644
index f2ff4c2a3872bc9ae0c2942debac5e5b523bd071..0000000000000000000000000000000000000000
--- a/docs/en/api/utils.rst
+++ /dev/null
@@ -1,23 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.utils
-===================================
-
-.. contents:: mmcv.utils
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.utils
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- IS_CUDA_AVAILABLE
- IS_MLU_AVAILABLE
- IS_MPS_AVAILABLE
- collect_env
- jit
- skip_no_elena
diff --git a/docs/en/api/video.rst b/docs/en/api/video.rst
deleted file mode 100644
index a6ebca0eb73afcf3f3f11aae8520e2782a310f13..0000000000000000000000000000000000000000
--- a/docs/en/api/video.rst
+++ /dev/null
@@ -1,56 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.video
-===================================
-
-.. contents:: mmcv.video
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.video
-
-IO
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- VideoReader
- Cache
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- frames2video
-
-Optical Flow
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- dequantize_flow
- flow_from_bytes
- flow_warp
- flowread
- flowwrite
- quantize_flow
- sparse_flow_from_bytes
-
-Video Processing
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- concat_video
- convert_video
- cut_video
- resize_video
diff --git a/docs/en/api/visualization.rst b/docs/en/api/visualization.rst
deleted file mode 100644
index 8f43ef27a441dcd9001a352cf18e97f8e615676d..0000000000000000000000000000000000000000
--- a/docs/en/api/visualization.rst
+++ /dev/null
@@ -1,50 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.visualization
-===================================
-
-.. contents:: mmcv.visualization
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.visualization
-
-Color
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- Color
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- color_val
-
-Image
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- imshow
- imshow_bboxes
- imshow_det_bboxes
-
-Optical Flow
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- flow2rgb
- flowshow
- make_color_wheel
diff --git a/docs/en/community/contributing.md b/docs/en/community/contributing.md
deleted file mode 100644
index e339935778d8d1aa0d1d150fb2cd5b63b27773bc..0000000000000000000000000000000000000000
--- a/docs/en/community/contributing.md
+++ /dev/null
@@ -1,267 +0,0 @@
-## Contributing to OpenMMLab
-
-Welcome to the MMCV community, we are committed to building a cutting-edge computer vision foundational library and all kinds of contributions are welcomed, including but not limited to
-
-**Fix bug**
-
-You can directly post a Pull Request to fix typo in code or documents
-
-The steps to fix the bug of code implementation are as follows.
-
-1. If the modification involve significant changes, you should create an issue first and describe the error information and how to trigger the bug. Other developers will discuss with you and propose an proper solution.
-
-2. Posting a pull request after fixing the bug and adding corresponding unit test.
-
-**New Feature or Enhancement**
-
-1. If the modification involve significant changes, you should create an issue to discuss with our developers to propose an proper design.
-2. Post a Pull Request after implementing the new feature or enhancement and add corresponding unit test.
-
-**Document**
-
-You can directly post a pull request to fix documents. If you want to add a document, you should first create an issue to check if it is reasonable.
-
-### Pull Request Workflow
-
-If you're not familiar with Pull Request, don't worry! The following guidance will tell you how to create a Pull Request step by step. If you want to dive into the develop mode of Pull Request, you can refer to the [official documents](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)
-
-#### 1. Fork and clone
-
-If you are posting a pull request for the first time, you should fork the OpenMMLab repositories by clicking the **Fork** button in the top right corner of the GitHub page, and the forked repositories will appear under your GitHub profile.
-
-
-
-Then, you can clone the repositories to local:
-
-```shell
-git clone git@github.com:{username}/mmcv.git
-```
-
-After that, you should ddd official repository as the upstream repository
-
-```bash
-git remote add upstream git@github.com:open-mmlab/mmcv
-```
-
-Check whether remote repository has been added successfully by `git remote -v`
-
-```bash
-origin git@github.com:{username}/mmcv.git (fetch)
-origin git@github.com:{username}/mmcv.git (push)
-upstream git@github.com:open-mmlab/mmcv (fetch)
-upstream git@github.com:open-mmlab/mmcv (push)
-```
-
-```{note}
-Here's a brief introduction to origin and upstream. When we use "git clone", we create an "origin" remote by default, which points to the repository cloned from. As for "upstream", we add it ourselves to point to the target repository. Of course, if you don't like the name "upstream", you could name it as you wish. Usually, we'll push the code to "origin". If the pushed code conflicts with the latest code in official("upstream"), we should pull the latest code from upstream to resolve the conflicts, and then push to "origin" again. The posted Pull Request will be updated automatically.
-```
-
-#### 2. Configure pre-commit
-
-You should configure [pre-commit](https://pre-commit.com/#intro) in the local development environment to make sure the code style matches that of OpenMMLab. **Note**: The following code should be executed under the MMCV directory.
-
-```shell
-pip install -U pre-commit
-pre-commit install
-```
-
-Check that pre-commit is configured successfully, and install the hooks defined in `.pre-commit-config.yaml`.
-
-```shell
-pre-commit run --all-files
-```
-
-
-
-
-
-```{note}
-Chinese users may fail to download the pre-commit hooks due to the network issue. In this case, you could download these hooks from gitee by setting the .pre-commit-config-zh-cn.yaml
-
-pre-commit install -c .pre-commit-config-zh-cn.yaml
-pre-commit run --all-files -c .pre-commit-config-zh-cn.yaml
-```
-
-If the installation process is interrupted, you can repeatedly run `pre-commit run ... ` to continue the installation.
-
-If the code does not conform to the code style specification, pre-commit will raise a warning and fixes some of the errors automatically.
-
-
-
-If we want to commit our code bypassing the pre-commit hook, we can use the `--no-verify` option(**only for temporarily commit**.
-
-```shell
-git commit -m "xxx" --no-verify
-```
-
-#### 3. Create a development branch
-
-After configuring the pre-commit, we should create a branch based on the master branch to develop the new feature or fix the bug. The proposed branch name is `username/pr_name`
-
-```shell
-git checkout -b yhc/refactor_contributing_doc
-```
-
-In subsequent development, if the master branch of the local repository is behind the master branch of "upstream", we need to pull the upstream for synchronization, and then execute the above command:
-
-```shell
-git pull upstream master
-```
-
-#### 4. Commit the code and pass the unit test
-
-- MMCV introduces mypy to do static type checking to increase the robustness of the code. Therefore, we need to add Type Hints to our code and pass the mypy check. If you are not familiar with Type Hints, you can refer to [this tutorial](https://docs.python.org/3/library/typing.html).
-
-- The committed code should pass through the unit test
-
- ```shell
- # Pass all unit tests
- pytest tests
-
- # Pass the unit test of runner
- pytest tests/test_runner/test_runner.py
- ```
-
- If the unit test fails for lack of dependencies, you can install the dependencies referring to the [guidance](#unit-test)
-
-- If the documents are modified/added, we should check the rendering result referring to [guidance](#document-rendering)
-
-#### 5. Push the code to remote
-
-We could push the local commits to remote after passing through the check of unit test and pre-commit. You can associate the local branch with remote branch by adding `-u` option.
-
-```shell
-git push -u origin {branch_name}
-```
-
-This will allow you to use the `git push` command to push code directly next time, without having to specify a branch or the remote repository.
-
-#### 6. Create a Pull Request
-
-(1) Create a pull request in GitHub's Pull request interface
-
-
-
-(2) Modify the PR description according to the guidelines so that other developers can better understand your changes
-
-
-
-Find more details about Pull Request description in [pull request guidelines](#pr-specs).
-
-**note**
-
-(a) The Pull Request description should contain the reason for the change, the content of the change, and the impact of the change, and be associated with the relevant Issue (see [documentation](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue)
-
-(b) If it is your first contribution, please sign the CLA
-
-
-
-(c) Check whether the Pull Request pass through the CI
-
-
-
-MMCV will run unit test for the posted Pull Request on different platforms (Linux, Window, Mac), based on different versions of Python, PyTorch, CUDA to make sure the code is correct. We can see the specific test information by clicking `Details` in the above image so that we can modify the code.
-
-(3) If the Pull Request passes the CI, then you can wait for the review from other developers. You'll modify the code based on the reviewer's comments, and repeat the steps [4](#4-commit-the-code-and-pass-the-unit-test)-[5](#5-push-the-code-to-remote) until all reviewers approve it. Then, we will merge it ASAP.
-
-
-
-#### 7. Resolve conflicts
-
-If your local branch conflicts with the latest master branch of "upstream", you'll need to resolove them. There are two ways to do this:
-
-```shell
-git fetch --all --prune
-git rebase upstream/master
-```
-
-or
-
-```shell
-git fetch --all --prune
-git merge upstream/master
-```
-
-If you are very good at handling conflicts, then you can use rebase to resolve conflicts, as this will keep your commit logs tidy. If you are not familiar with `rebase`, then you can use `merge` to resolve conflicts.
-
-### Guidance
-
-#### Unit test
-
-If you cannot run the unit test of some modules for lacking of some dependencies, such as [video](https://github.com/open-mmlab/mmcv/tree/master/mmcv/video) module, you can try to install the following dependencies:
-
-```shell
-# Linux
-sudo apt-get update -y
-sudo apt-get install -y libturbojpeg
-sudo apt-get install -y ffmpeg
-
-# Windows
-conda install ffmpeg
-```
-
-We should also make sure the committed code will not decrease the coverage of unit test, we could run the following command to check the coverage of unit test:
-
-```shell
-python -m coverage run -m pytest /path/to/test_file
-python -m coverage html
-# check file in htmlcov/index.html
-```
-
-#### Document rendering
-
-If the documents are modified/added, we should check the rendering result. We could install the dependencies and run the following command to render the documents and check the results:
-
-```shell
-pip install -r requirements/docs.txt
-cd docs/zh_cn/
-# or docs/en
-make html
-# check file in ./docs/zh_cn/_build/html/index.html
-```
-
-### Code style
-
-#### Python
-
-We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style.
-
-We use the following tools for linting and formatting:
-
-- [flake8](https://github.com/PyCQA/flake8): A wrapper around some linter tools.
-- [isort](https://github.com/timothycrosley/isort): A Python utility to sort imports.
-- [yapf](https://github.com/google/yapf): A formatter for Python files.
-- [codespell](https://github.com/codespell-project/codespell): A Python utility to fix common misspellings in text files.
-- [mdformat](https://github.com/executablebooks/mdformat): Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files.
-- [docformatter](https://github.com/myint/docformatter): A formatter to format docstring.
-
-Style configurations of yapf and isort can be found in [setup.cfg](./setup.cfg).
-
-We use [pre-commit hook](https://pre-commit.com/) that checks and formats for `flake8`, `yapf`, `isort`, `trailing whitespaces`, `markdown files`,
-fixes `end-of-files`, `double-quoted-strings`, `python-encoding-pragma`, `mixed-line-ending`, sorts `requirments.txt` automatically on every commit.
-The config for a pre-commit hook is stored in [.pre-commit-config](./.pre-commit-config.yaml).
-
-#### C++ and CUDA
-
-We follow the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html).
-
-### PR Specs
-
-1. Use [pre-commit](https://pre-commit.com) hook to avoid issues of code style
-
-2. One short-time branch should be matched with only one PR
-
-3. Accomplish a detailed change in one PR. Avoid large PR
-
- - Bad: Support Faster R-CNN
- - Acceptable: Add a box head to Faster R-CNN
- - Good: Add a parameter to box head to support custom conv-layer number
-
-4. Provide clear and significant commit message
-
-5. Provide clear and meaningful PR description
-
- - Task name should be clarified in title. The general format is: \[Prefix\] Short description of the PR (Suffix)
- - Prefix: add new feature \[Feature\], fix bug \[Fix\], related to documents \[Docs\], in developing \[WIP\] (which will not be reviewed temporarily)
- - Introduce main changes, results and influences on other modules in short description
- - Associate related issues and pull requests with a milestone
diff --git a/docs/en/community/pr.md b/docs/en/community/pr.md
deleted file mode 100644
index 1bdd90f2bc41867e5c17403690f6a35cfe2c07b7..0000000000000000000000000000000000000000
--- a/docs/en/community/pr.md
+++ /dev/null
@@ -1,3 +0,0 @@
-## Pull Request (PR)
-
-Content has been migrated to [contributing guidance](contributing.md).
diff --git a/docs/en/docutils.conf b/docs/en/docutils.conf
deleted file mode 100644
index 0c00c84688701117f231fd0c8ec295fb747b7d8f..0000000000000000000000000000000000000000
--- a/docs/en/docutils.conf
+++ /dev/null
@@ -1,2 +0,0 @@
-[html writers]
-table_style: colwidths-auto
diff --git a/docs/en/faq.md b/docs/en/faq.md
deleted file mode 100644
index 02d31c233a9ff66d5e8f3f288b5d5f64e5c5298c..0000000000000000000000000000000000000000
--- a/docs/en/faq.md
+++ /dev/null
@@ -1,93 +0,0 @@
-## Frequently Asked Questions
-
-We list some common troubles faced by many users and their corresponding solutions here.
-Feel free to enrich the list if you find any frequent issues and have ways to help others to solve them.
-
-### Installation
-
-- KeyError: "xxx: 'yyy is not in the zzz registry'"
-
- The registry mechanism will be triggered only when the file of the module is imported.
- So you need to import that file somewhere. More details can be found at [KeyError: "MaskRCNN: 'RefineRoIHead is not in the models registry'"](https://github.com/open-mmlab/mmdetection/issues/5974).
-
-- "No module named 'mmcv.ops'"; "No module named 'mmcv.\_ext'"
-
- 1. Uninstall existing mmcv in the environment using `pip uninstall mmcv`
- 2. Install mmcv-full following the [installation instruction](https://mmcv.readthedocs.io/en/latest/get_started/installation.html) or [Build MMCV from source](https://mmcv.readthedocs.io/en/latest/get_started/build.html)
-
-- "invalid device function" or "no kernel image is available for execution"
-
- 1. Check the CUDA compute capability of you GPU
- 2. Run `python mmdet/utils/collect_env.py` to check whether PyTorch, torchvision, and MMCV are built for the correct GPU architecture. You may need to set `TORCH_CUDA_ARCH_LIST` to reinstall MMCV. The compatibility issue could happen when using old GPUS, e.g., Tesla K80 (3.7) on colab.
- 3. Check whether the running environment is the same as that when mmcv/mmdet is compiled. For example, you may compile mmcv using CUDA 10.0 bug run it on CUDA9.0 environments
-
-- "undefined symbol" or "cannot open xxx.so"
-
- 1. If those symbols are CUDA/C++ symbols (e.g., libcudart.so or GLIBCXX), check
- whether the CUDA/GCC runtimes are the same as those used for compiling mmcv
- 2. If those symbols are Pytorch symbols (e.g., symbols containing caffe, aten, and TH), check whether the Pytorch version is the same as that used for compiling mmcv
- 3. Run `python mmdet/utils/collect_env.py` to check whether PyTorch, torchvision, and MMCV are built by and running on the same environment
-
-- "RuntimeError: CUDA error: invalid configuration argument"
-
- This error may be caused by the poor performance of GPU. Try to decrease the value of [THREADS_PER_BLOCK](https://github.com/open-mmlab/mmcv/blob/cac22f8cf5a904477e3b5461b1cc36856c2793da/mmcv/ops/csrc/common_cuda_helper.hpp#L10)
- and recompile mmcv.
-
-- "RuntimeError: nms is not compiled with GPU support"
-
- This error is because your CUDA environment is not installed correctly.
- You may try to re-install your CUDA environment and then delete the build/ folder before re-compile mmcv.
-
-- "Segmentation fault"
-
- 1. Check your GCC version and use GCC >= 5.4. This usually caused by the incompatibility between PyTorch and the environment (e.g., GCC \< 4.9 for PyTorch). We also recommend the users to avoid using GCC 5.5 because many feedbacks report that GCC 5.5 will cause "segmentation fault" and simply changing it to GCC 5.4 could solve the problem
- 2. Check whether PyTorch is correctly installed and could use CUDA op, e.g. type the following command in your terminal and see whether they could correctly output results
- ```shell
- python -c 'import torch; print(torch.cuda.is_available())'
- ```
- 3. If PyTorch is correctly installed, check whether MMCV is correctly installed. If MMCV is correctly installed, then there will be no issue of the command
- ```shell
- python -c 'import mmcv; import mmcv.ops'
- ```
- 4. If MMCV and PyTorch are correctly installed, you can use `ipdb` to set breakpoints or directly add `print` to debug and see which part leads the `segmentation fault`
-
-- "libtorch_cuda_cu.so: cannot open shared object file"
-
- `mmcv-full` depends on the share object but it can not be found. We can check whether the object exists in `~/miniconda3/envs/{environment-name}/lib/python3.7/site-packages/torch/lib` or try to re-install the PyTorch.
-
-- "fatal error C1189: #error: -- unsupported Microsoft Visual Studio version!"
-
- If you are building mmcv-full on Windows and the version of CUDA is 9.2, you will probably encounter the error `"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include\crt/host_config.h(133): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions 2012, 2013, 2015 and 2017 are supported!"`, in which case you can use a lower version of Microsoft Visual Studio like vs2017.
-
-- "error: member "torch::jit::detail::ModulePolicy::all_slots" may not be initialized"
-
- If your version of PyTorch is 1.5.0 and you are building mmcv-full on Windows, you will probably encounter the error `- torch/csrc/jit/api/module.h(474): error: member "torch::jit::detail::ModulePolicy::all_slots" may not be initialized`. The way to solve the error is to replace all the `static constexpr bool all_slots = false;` with `static bool all_slots = false;` at this file `https://github.com/pytorch/pytorch/blob/v1.5.0/torch/csrc/jit/api/module.h`. More details can be found at [member "torch::jit::detail::AttributePolicy::all_slots" may not be initialized](https://github.com/pytorch/pytorch/issues/39394).
-
-- "error: a member with an in-class initializer must be const"
-
- If your version of PyTorch is 1.6.0 and you are building mmcv-full on Windows, you will probably encounter the error `"- torch/include\torch/csrc/jit/api/module.h(483): error: a member with an in-class initializer must be const"`. The way to solve the error is to replace all the `CONSTEXPR_EXCEPT_WIN_CUDA ` with `const` at `torch/include\torch/csrc/jit/api/module.h`. More details can be found at [Ninja: build stopped: subcommand failed](https://github.com/open-mmlab/mmcv/issues/575).
-
-- "error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized"
-
- If your version of PyTorch is 1.7.0 and you are building mmcv-full on Windows, you will probably encounter the error `torch/include\torch/csrc/jit/ir/ir.h(1347): error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized`. The way to solve the error needs to modify several local files of PyTorch:
-
- - delete `static constexpr Symbol Kind = ::c10::prim::profile;` and `tatic constexpr Symbol Kind = ::c10::prim::profile_optional;` at `torch/include\torch/csrc/jit/ir/ir.h`
- - replace `explicit operator type&() { return *(this->value); }` with `explicit operator type&() { return *((type*)this->value); }` at `torch\include\pybind11\cast.h`
- - replace all the `CONSTEXPR_EXCEPT_WIN_CUDA` with `const` at `torch/include\torch/csrc/jit/api/module.h`
-
- More details can be found at [Ensure default extra_compile_args](https://github.com/pytorch/pytorch/pull/45956).
-
-- Compatibility issue between MMCV and MMDetection; "ConvWS is already registered in conv layer"
-
- Please install the correct version of MMCV for the version of your MMDetection following the [installation instruction](https://mmdetection.readthedocs.io/en/latest/get_started.html#installation).
-
-### Usage
-
-- "RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one"
-
- 1. This error indicates that your module has parameters that were not used in producing loss. This phenomenon may be caused by running different branches in your code in DDP mode. More datails at [Expected to have finished reduction in the prior iteration before starting a new one](https://github.com/pytorch/pytorch/issues/55582).
- 2. You can set ` find_unused_parameters = True` in the config to solve the above problems or find those unused parameters manually
-
-- "RuntimeError: Trying to backward through the graph a second time"
-
- `GradientCumulativeOptimizerHook` and `OptimizerHook` are both set which causes the `loss.backward()` to be called twice so `RuntimeError` was raised. We can only use one of these. More datails at [Trying to backward through the graph a second time](https://github.com/open-mmlab/mmcv/issues/1379).
diff --git a/docs/en/get_started/build.md b/docs/en/get_started/build.md
deleted file mode 100644
index e3d48ec7cf486edece6ea9e622937b08602f5e6e..0000000000000000000000000000000000000000
--- a/docs/en/get_started/build.md
+++ /dev/null
@@ -1,292 +0,0 @@
-## Build MMCV from source
-
-### Build mmcv
-
-Before installing mmcv, make sure that PyTorch has been successfully installed following the [PyTorch official installation guide](https://pytorch.org/get-started/locally/#start-locally). This can be verified using the following command
-
-```bash
-python -c 'import torch;print(torch.__version__)'
-```
-
-If version information is output, then PyTorch is installed.
-
-```{note}
-If you would like to use `opencv-python-headless` instead of `opencv-python`,
-e.g., in a minimum container environment or servers without GUI,
-you can first install it before installing MMCV to skip the installation of `opencv-python`.
-```
-
-#### Build on Linux
-
-1. Clone the repo
-
- ```bash
- git clone https://github.com/open-mmlab/mmcv.git
- cd mmcv
- ```
-
-2. Install `ninja` and `psutil` to speed up the compilation
-
- ```bash
- pip install -r requirements/optional.txt
- ```
-
-3. Check the nvcc version (requires 9.2+. Skip if no GPU available.)
-
- ```bash
- nvcc --version
- ```
-
- If the above command outputs the following message, it means that the nvcc setting is OK, otherwise you need to set CUDA_HOME.
-
- ```
- nvcc: NVIDIA (R) Cuda compiler driver
- Copyright (c) 2005-2020 NVIDIA Corporation
- Built on Mon_Nov_30_19:08:53_PST_2020
- Cuda compilation tools, release 11.2, V11.2.67
- Build cuda_11.2.r11.2/compiler.29373293_0
- ```
-
- :::{note}
- If you want to support ROCm, you can refer to [AMD ROCm](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html) to install ROCm.
- :::
-
-4. Check the gcc version (requires 5.4+)
-
- ```bash
- gcc --version
- ```
-
-5. Start building (takes 10+ min)
-
- ```bash
- pip install -e . -v
- ```
-
-6. Validate the installation
-
- ```bash
- python .dev_scripts/check_installation.py
- ```
-
- If no error is reported by the above command, the installation is successful. If there is an error reported, please check [Frequently Asked Questions](../faq.md) to see if there is already a solution.
-
- If no solution is found, please feel free to open an [issue](https://github.com/open-mmlab/mmcv/issues).
-
-#### Build on macOS
-
-```{note}
-If you are using a mac with apple silicon chip, install the PyTorch 1.13+, otherwise you will encounter the problem in [issues#2218](https://github.com/open-mmlab/mmcv/issues/2218).
-```
-
-1. Clone the repo
-
- ```bash
- git clone https://github.com/open-mmlab/mmcv.git
- cd mmcv
- ```
-
-2. Install `ninja` and `psutil` to speed up the compilation
-
- ```bash
- pip install -r requirements/optional.txt
- ```
-
-3. Start building
-
- ```bash
- MMCV_WITH_OPS=1 pip install -e .
- ```
-
-4. Validate the installation
-
- ```bash
- python .dev_scripts/check_installation.py
- ```
-
- If no error is reported by the above command, the installation is successful. If there is an error reported, please check [Frequently Asked Questions](../faq.md) to see if there is already a solution.
-
- If no solution is found, please feel free to open an [issue](https://github.com/open-mmlab/mmcv/issues).
-
-#### Build on Windows
-
-Building MMCV on Windows is a bit more complicated than that on Linux.
-The following instructions show how to get this accomplished.
-
-##### Prerequisite
-
-The following software is required for building MMCV on windows.
-Install them first.
-
-- [Git](https://git-scm.com/download/win)
- - During installation, tick **add git to Path**.
-- [Visual Studio Community 2019](https://visualstudio.microsoft.com)
- - A compiler for C++ and CUDA codes.
-- [Miniconda](https://docs.conda.io/en/latest/miniconda.html)
- - Official distributions of Python should work too.
-- [CUDA 10.2](https://developer.nvidia.com/cuda-10.2-download-archive)
- - Not required for building CPU version.
- - Customize the installation if necessary. As a recommendation, skip the driver installation if a newer version is already installed.
-
-```{note}
-You should know how to set up environment variables, especially `Path`, on Windows. The following instruction relies heavily on this skill.
-```
-
-##### Common steps
-
-1. Launch Anaconda prompt from Windows Start menu
-
- Do not use raw `cmd.exe` s instruction is based on PowerShell syntax.
-
-2. Create a new conda environment
-
- ```powershell
- (base) PS C:\Users\xxx> conda create --name mmcv python=3.7
- (base) PS C:\Users\xxx> conda activate mmcv # make sure to activate environment before any operation
- ```
-
-3. Install PyTorch. Choose a version based on your need.
-
- ```powershell
- # CUDA version
- (mmcv) PS C:\Users\xxx> conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
- # CPU version
- (mmcv) PS C:\Users\xxx> conda install install pytorch torchvision cpuonly -c pytorch
- ```
-
-4. Clone the repo
-
- ```powershell
- (mmcv) PS C:\Users\xxx> git clone https://github.com/open-mmlab/mmcv.git
- (mmcv) PS C:\Users\xxx\mmcv> cd mmcv
- ```
-
-5. Install `ninja` and `psutil` to speed up the compilation
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> pip install -r requirements/optional.txt
- ```
-
-6. Set up MSVC compiler
-
- Set Environment variable, add `C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\Hostx86\x64` to `PATH`, so that `cl.exe` will be available in prompt, as shown below.
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> cl
- Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29111 for x64
- Copyright (C) Microsoft Corporation. All rights reserved.
-
- usage: cl [ option... ] filename... [ / link linkoption... ]
- ```
-
- For compatibility, we use the x86-hosted and x64-targeted compiler. note `Hostx86\x64` in the path.
-
- You may want to change the system language to English because pytorch will parse text output from `cl.exe` to check its version. However only utf-8 is recognized. Navigate to Control Panel -> Region -> Administrative -> Language for Non-Unicode programs and change it to English.
-
-##### Build and install MMCV
-
-mmcv can be built in two ways:
-
-1. Full version (CPU ops)
-
- Module `ops` will be compiled as a pytorch extension, but only x86 code will be compiled. The compiled ops can be executed on CPU only.
-
-2. Full version (CUDA ops)
-
- Both x86 and CUDA codes of `ops` module will be compiled. The compiled version can be run on both CPU and CUDA-enabled GPU (if implemented).
-
-###### CPU version
-
-Build and install
-
-```powershell
-(mmcv) PS C:\Users\xxx\mmcv> python setup.py build_ext
-(mmcv) PS C:\Users\xxx\mmcv> python setup.py develop
-```
-
-###### GPU version
-
-1. Make sure `CUDA_PATH` or `CUDA_HOME` is already set in the environment variables. You can check this via `ls env:`; the desired output is shown below:
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> ls env:
-
- Name Value
- ---- -----
- CUDA_PATH C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
- CUDA_PATH_V10_1 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
- CUDA_PATH_V10_2 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
- ```
-
- This should already be done by the CUDA installer. If not, or if you have multiple versions of the CUDA toolkit installed, set it with
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> $env:CUDA_HOME = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2"
- # OR
- (mmcv) PS C:\Users\xxx\mmcv> $env:CUDA_HOME = $env:CUDA_PATH_V10_2 # if CUDA_PATH_V10_2 is in envs:
- ```
-
-2. Set CUDA target arch
-
- ```powershell
- # Here you need to change to the target architecture corresponding to your GPU
- (mmcv) PS C:\Users\xxx\mmcv> $env:TORCH_CUDA_ARCH_LIST="7.5"
- ```
-
- :::{note}
- Check the compute capability of your GPU from [here](https://developer.nvidia.com/cuda-gpus).
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> &"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\extras\demo_suite\deviceQuery.exe"
- Device 0: "NVIDIA GeForce GTX 1660 SUPER"
- CUDA Driver Version / Runtime Version 11.7 / 11.1
- CUDA Capability Major/Minor version number: 7.5
- ```
-
- The `7.5` above indicates the target architecture. Note that you need to replace `v10.2` with your CUDA version in the above command.
- :::
-
-3. Build and install
-
- ```powershell
- # build
- python setup.py build_ext # if success, cl will be launched to compile ops
- # install
- python setup.py develop
- ```
-
- ```{note}
- If you are compiling against PyTorch 1.6.0, you might meet some errors from PyTorch as described in [this issue](https://github.com/pytorch/pytorch/issues/42467). Follow [this pull request](https://github.com/pytorch/pytorch/pull/43380/files) to modify the source code in your local PyTorch installation.
- ```
-
-##### Validate installation
-
-```powershell
-(mmcv) PS C:\Users\xxx\mmcv> python .dev_scripts/check_installation.py
-```
-
-If no error is reported by the above command, the installation is successful. If there is an error reported, please check [Frequently Asked Questions](../faq.md) to see if there is already a solution.
-If no solution is found, please feel free to open an [issue](https://github.com/open-mmlab/mmcv/issues).
-
-### Build mmcv-lite
-
-If you need to use PyTorch-related modules, make sure PyTorch has been successfully installed in your environment by referring to the [PyTorch official installation guide](https://github.com/pytorch/pytorch#installation).
-
-1. Clone the repo
-
- ```bash
- git clone https://github.com/open-mmlab/mmcv.git
- cd mmcv
- ```
-
-2. Start building
-
- ```bash
- MMCV_WITH_OPS=0 pip install -e . -v
- ```
-
-3. Validate installation
-
- ```bash
- python -c 'import mmcv;print(mmcv.__version__)'
- ```
diff --git a/docs/en/get_started/installation.md b/docs/en/get_started/installation.md
deleted file mode 100644
index 12bad000a171c0adf5be01dc7f53a94a5933070d..0000000000000000000000000000000000000000
--- a/docs/en/get_started/installation.md
+++ /dev/null
@@ -1,348 +0,0 @@
-## Installation
-
-There are two versions of MMCV:
-
-- **mmcv**: comprehensive, with full features and various CUDA ops out of the box. It takes a longer time to build.
-- **mmcv-lite**: lite, without CUDA ops but all other features, similar to mmcv\<1.0.0. It is useful when you do not need those CUDA ops.
-
-```{warning}
-Do not install both versions in the same environment, otherwise you may encounter errors like `ModuleNotFound`. You need to uninstall one before installing the other. Installing the full version is highly recommended if CUDA is available.
-```
-
-### Install mmcv
-
-Before installing mmcv, make sure that PyTorch has been successfully installed following the [PyTorch official installation guide](https://pytorch.org/get-started/locally/#start-locally). This can be verified using the following command
-
-```bash
-python -c 'import torch;print(torch.__version__)'
-```
-
-If version information is output, then PyTorch is installed.
-
-#### Install with mim (recommended)
-
-[mim](https://github.com/open-mmlab/mim) is the package management tool for the OpenMMLab projects, which makes it easy to install mmcv
-
-```bash
-pip install -U openmim
-mim install "mmcv>=2.0.0rc1"
-```
-
-If you find that the above installation command does not use a pre-built package ending with `.whl` but a source package ending with `.tar.gz`, you may not have a pre-built package matching your PyTorch, CUDA, or mmcv version, in which case you can [build mmcv from source](build.md).
-
-
-**Installation log using pre-built packages**
-
-Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
-Collecting mmcv
-Downloading https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/mmcv-2.0.0rc3-cp38-cp38-manylinux1_x86_64.whl
-
-**Installation log using source packages**
-
-Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
-Collecting mmcv==2.0.0rc3
-Downloading mmcv-2.0.0rc3.tar.gz
-
-To install a specific version of mmcv, for example, mmcv version 2.0.0rc3, you can use the following command
-
-```bash
-mim install mmcv==2.0.0rc3
-```
-
-:::{note}
-If you would like to use `opencv-python-headless` instead of `opencv-python`,
-e.g., in a minimum container environment or servers without GUI,
-you can first install it before installing MMCV to skip the installation of `opencv-python`.
-
-Alternatively, if it takes too long to install a dependency library, you can specify the PyPI source
-
-```bash
-mim install "mmcv>=2.0.0rc3" -i https://pypi.tuna.tsinghua.edu.cn/simple
-```
-
-:::
-
-You can run [check_installation.py](https://github.com/open-mmlab/mmcv/blob/2.x/.dev_scripts/check_installation.py) to check the installation of mmcv after running the installation commands.
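-
-If you prefer a quick manual check instead, a minimal sketch (assuming the full version with compiled ops was installed) is to import one of the ops directly:
-
-```python
-import mmcv
-from mmcv.ops import nms  # the import fails if the compiled extension is missing
-
-print(mmcv.__version__)
-```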
-
-#### Install with pip
-
-Use the following command to check the version of CUDA and PyTorch
-
-```bash
-python -c 'import torch;print(torch.__version__);print(torch.version.cuda)'
-```
-
-Select the appropriate installation command depending on the type of system, CUDA version, PyTorch version, and MMCV version
-
-*(An interactive table for selecting the appropriate installation command is embedded here in the rendered documentation.)*
-
-If you do not find a corresponding version in the dropdown box above, you probably do not have a pre-built package matching your PyTorch, CUDA, or mmcv version, in which case you can [build mmcv from source](build.md).
-
-:::{note}
-mmcv is only compiled on PyTorch 1.x.0 because the compatibility
-usually holds between 1.x.0 and 1.x.1. If your PyTorch version is 1.x.1, you
-can install mmcv compiled with PyTorch 1.x.0 and it usually works well.
-For example, if your PyTorch version is 1.8.1, you can feel free to choose 1.8.x.
-:::
-
-:::{note}
-If you would like to use `opencv-python-headless` instead of `opencv-python`,
-e.g., in a minimum container environment or servers without GUI,
-you can first install it before installing MMCV to skip the installation of `opencv-python`.
-
-Alternatively, if it takes too long to install a dependency library, you can specify the PyPI source
-
-```bash
-mim install "mmcv>=2.0.0rc1" -i https://pypi.tuna.tsinghua.edu.cn/simple
-```
-
-:::
-
-You can run [check_installation.py](https://github.com/open-mmlab/mmcv/blob/2.x/.dev_scripts/check_installation.py) to check the installation of mmcv after running the installation commands.
-
-#### Using mmcv with Docker
-
-Build with local repository
-
-```bash
-git clone https://github.com/open-mmlab/mmcv.git && cd mmcv
-docker build -t mmcv -f docker/release/Dockerfile .
-```
-
-Or build with remote repository
-
-```bash
-docker build -t mmcv https://github.com/open-mmlab/mmcv.git#2.x:docker/release
-```
-
-The [Dockerfile](release/Dockerfile) installs the latest released version of mmcv by default, but you can pass the `MMCV` build argument to install a specific version.
-
-```bash
-docker image build -t mmcv -f docker/release/Dockerfile --build-arg MMCV=2.0.0rc1 .
-```
-
-If you want to use other versions of PyTorch and CUDA, you can also pass them as build arguments when building the docker image.
-
-An example to build an image with PyTorch 1.11 and CUDA 11.3.
-
-```bash
-docker build -t mmcv -f docker/release/Dockerfile \
- --build-arg PYTORCH=1.11.0 \
- --build-arg CUDA=11.3 \
- --build-arg CUDNN=8 \
- --build-arg MMCV=2.0.0rc1 .
-```
-
-More available versions of PyTorch and CUDA can be found at [dockerhub/pytorch](https://hub.docker.com/r/pytorch/pytorch/tags).
-
-### Install mmcv-lite
-
-If you need to use PyTorch-related modules, make sure PyTorch has been successfully installed in your environment by referring to the [PyTorch official installation guide](https://github.com/pytorch/pytorch#installation).
-
-```bash
-pip install mmcv-lite
-```
diff --git a/docs/en/get_started/introduction.md b/docs/en/get_started/introduction.md
deleted file mode 100644
index 461fcc725bbcf4a84296e95789303b64e7b2e9c5..0000000000000000000000000000000000000000
--- a/docs/en/get_started/introduction.md
+++ /dev/null
@@ -1,36 +0,0 @@
-## Introduction
-
-MMCV is a foundational library for computer vision research and provides the following functionalities.
-
-- [Image/Video processing](../understand_mmcv/data_process.md)
-- [Image and annotation visualization](../understand_mmcv/visualization.md)
-- [Image transformation](../understand_mmcv/data_transform.md)
-- [Various CNN architectures](../understand_mmcv/cnn.md)
-- [High-quality implementation of common CUDA ops](../understand_mmcv/ops.md)
-
-It supports the following systems:
-
-- Linux
-- Windows
-- macOS
-
-It supports many research projects as below:
-
-- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark.
-- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab detection toolbox and benchmark.
-- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab's next-generation platform for general 3D object detection.
-- [MMRotate](https://github.com/open-mmlab/mmrotate): OpenMMLab rotated object detection toolbox and benchmark.
-- [MMYOLO](https://github.com/open-mmlab/mmyolo): OpenMMLab YOLO series toolbox and benchmark.
-- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab semantic segmentation toolbox and benchmark.
-- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab text detection, recognition, and understanding toolbox.
-- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab pose estimation toolbox and benchmark.
-- [MMHuman3D](https://github.com/open-mmlab/mmhuman3d): OpenMMLab 3D human parametric model toolbox and benchmark.
-- [MMSelfSup](https://github.com/open-mmlab/mmselfsup): OpenMMLab self-supervised learning toolbox and benchmark.
-- [MMRazor](https://github.com/open-mmlab/mmrazor): OpenMMLab model compression toolbox and benchmark.
-- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab fewshot learning toolbox and benchmark.
-- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab's next-generation action understanding toolbox and benchmark.
-- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab video perception toolbox and benchmark.
-- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab optical flow toolbox and benchmark.
-- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox.
-- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab image and video generative models toolbox.
-- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab model deployment framework.
diff --git a/docs/en/switch_language.md b/docs/en/switch_language.md
deleted file mode 100644
index 9dc7b34b4fac6a972abedd8c2b0b80d03441d2b9..0000000000000000000000000000000000000000
--- a/docs/en/switch_language.md
+++ /dev/null
@@ -1,3 +0,0 @@
-## English
-
-## 简体中文
diff --git a/docs/en/understand_mmcv/cnn.md b/docs/en/understand_mmcv/cnn.md
deleted file mode 100644
index 2c42f25d9d5c5b2886c420bbab4461272cf02b21..0000000000000000000000000000000000000000
--- a/docs/en/understand_mmcv/cnn.md
+++ /dev/null
@@ -1,120 +0,0 @@
-## CNN
-
-We provide some building bricks for CNNs, including layer building, module bundles and weight initialization.
-
-### Layer building
-
-We may need to try different layers of the same type when running experiments,
-but do not want to modify the code from time to time.
-Here we provide some layer building methods to construct layers from a dict,
-which can be written in configs or specified via command line arguments.
-
-#### Usage
-
-The simplest example is
-
-```python
-from mmcv.cnn import build_conv_layer
-
-cfg = dict(type='Conv3d')
-layer = build_conv_layer(cfg, in_channels=3, out_channels=8, kernel_size=3)
-```
-
-- `build_conv_layer`: Supported types are Conv1d, Conv2d, Conv3d, Conv (alias for Conv2d).
-- `build_norm_layer`: Supported types are BN1d, BN2d, BN3d, BN (alias for BN2d), SyncBN, GN, LN, IN1d, IN2d, IN3d, IN (alias for IN2d).
-- `build_activation_layer`: Supported types are ReLU, LeakyReLU, PReLU, RReLU, ReLU6, ELU, Sigmoid, Tanh, GELU.
-- `build_upsample_layer`: Supported types are nearest, bilinear, deconv, pixel_shuffle.
-- `build_padding_layer`: Supported types are zero, reflect, replicate.
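-
-For example, a minimal sketch of two of the other builders (assuming the same dict-based configs): `build_norm_layer` additionally needs the number of features and returns a `(name, layer)` pair, while `build_activation_layer` returns the layer directly.
-
-```python
-from mmcv.cnn import build_activation_layer, build_norm_layer
-
-# build_norm_layer returns (name, layer); the name encodes the norm type
-name, norm = build_norm_layer(dict(type='BN'), num_features=8)
-act = build_activation_layer(dict(type='ReLU'))
-```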
-
-#### Extension
-
-We also allow extending the building methods with custom layers and operators.
-
-1. Write and register your own module.
-
- ```python
- from mmengine.registry import MODELS
-
- @MODELS.register_module()
- class MyUpsample:
-
- def __init__(self, scale_factor):
- pass
-
- def forward(self, x):
- pass
- ```
-
-2. Import `MyUpsample` somewhere (e.g., in `__init__.py`) and then use it.
-
- ```python
- from mmcv.cnn import build_upsample_layer
-
- cfg = dict(type='MyUpsample', scale_factor=2)
- layer = build_upsample_layer(cfg)
- ```
-
-### Module bundles
-
-We also provide common module bundles to facilitate the network construction.
-`ConvModule` is a bundle of convolution, normalization and activation layers,
-please refer to the [api](api.html#mmcv.cnn.ConvModule) for details.
-
-```python
-from mmcv.cnn import ConvModule
-
-# conv + bn + relu
-conv = ConvModule(3, 8, 2, norm_cfg=dict(type='BN'))
-# conv + gn + relu
-conv = ConvModule(3, 8, 2, norm_cfg=dict(type='GN', num_groups=2))
-# conv + relu
-conv = ConvModule(3, 8, 2)
-# conv
-conv = ConvModule(3, 8, 2, act_cfg=None)
-# conv + leaky relu
-conv = ConvModule(3, 8, 3, padding=1, act_cfg=dict(type='LeakyReLU'))
-# bn + conv + relu
-conv = ConvModule(
- 3, 8, 2, norm_cfg=dict(type='BN'), order=('norm', 'conv', 'act'))
-```
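-
-Once constructed, the bundle behaves like any other `nn.Module`; a minimal usage sketch:
-
-```python
-import torch
-from mmcv.cnn import ConvModule
-
-conv = ConvModule(3, 8, 2, norm_cfg=dict(type='BN'))
-out = conv(torch.rand(1, 3, 32, 32))  # conv -> bn -> relu applied in order
-print(out.shape)  # torch.Size([1, 8, 31, 31])
-```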
-
-### Model Zoo
-
-Besides torchvision pre-trained models, we also provide pre-trained models of the following CNNs:
-
-- VGG Caffe
-- ResNet Caffe
-- ResNeXt
-- ResNet with Group Normalization
-- ResNet with Group Normalization and Weight Standardization
-- HRNetV2
-- Res2Net
-- RegNet
-
-#### Model URLs in JSON
-
-The model zoo links in MMCV are managed by JSON files.
-Each JSON file consists of key-value pairs of model names and their URLs or paths.
-An example JSON file looks like:
-
-```json
-{
- "model_a": "https://example.com/models/model_a_9e5bac.pth",
- "model_b": "pretrain/model_b_ab3ef2c.pth"
-}
-```
-
-The default links of the pre-trained models hosted on OpenMMLab AWS can be found [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/model_zoo/open_mmlab.json).
-
-You may override default links by putting `open-mmlab.json` under `MMCV_HOME`. If `MMCV_HOME` is not found in your environment, `~/.cache/mmcv` will be used by default. You may use your own path with `export MMCV_HOME=/your/path`.
-
-The external JSON files will be merged into the default one. If the same key is present in both the external JSON and the default JSON, the external one will be used.
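-
-As a hypothetical illustration of this merge rule (plain dictionaries standing in for the parsed JSON files):
-
-```python
-# entries shipped with mmcv vs. entries from your own open-mmlab.json
-default = {'model_a': 'https://example.com/models/model_a_9e5bac.pth',
-           'model_b': 'pretrain/model_b_ab3ef2c.pth'}
-external = {'model_b': '/my/own/model_b.pth'}
-
-merged = {**default, **external}  # the external value wins for 'model_b'
-```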
-
-#### Load Checkpoint
-
-The following types are supported for `filename` of `mmcv.load_checkpoint()`.
-
-- filepath: The filepath of the checkpoint.
-- `http://xxx` and `https://xxx`: The link to download the checkpoint. The `SHA256` postfix should be contained in the filename.
-- `torchvision://xxx`: The model links in `torchvision.models`. Please refer to [torchvision](https://pytorch.org/docs/stable/torchvision/models.html) for details.
-- `open-mmlab://xxx`: The model links or filepath provided in default and additional json files.
diff --git a/docs/en/understand_mmcv/data_transform.md b/docs/en/understand_mmcv/data_transform.md
deleted file mode 100644
index 64c3af980eab0b07d7a298cee2c41465803911f8..0000000000000000000000000000000000000000
--- a/docs/en/understand_mmcv/data_transform.md
+++ /dev/null
@@ -1,341 +0,0 @@
-# Data Transformation
-
-In the OpenMMLab algorithm library, dataset construction and data preparation are decoupled. Usually, the construction of the dataset only parses the dataset and records the basic information of each sample, while the data preparation is a series of data transformations including data loading, preprocessing, formatting, and other operations performed according to the basic information of the sample.
-
-## Design of data transformation
-
-In MMCV, we use various callable data transformation classes to manipulate data. These data transformation classes can accept several configuration parameters for the instantiation and then process the input data dictionary by `__call__` method. All data transformation methods accept a dictionary as the input and produce the output as a dictionary as well. A simple example is as follows:
-
-```python
->>> import numpy as np
->>> from mmcv.transforms import Resize
->>>
->>> transform = Resize(scale=(224, 224))
->>> data_dict = {'img': np.random.rand(256, 256, 3)}
->>> data_dict = transform(data_dict)
->>> print(data_dict['img'].shape)
-(224, 224, 3)
-```
-
-The data transformation class reads some fields of the input dictionary and may add or update some fields. The keys of these fields are mostly fixed. For example, `Resize` will always read fields such as `"img"` in the input dictionary. More information about the conventions for input and output fields could be found in the documentation of the corresponding class.
-
-```{note}
-By convention, the image shape passed as an **initialization parameter** to data transformations (such as Resize, Pad) is in (width, height) order. In the dictionary returned by the data transformation, image-related shapes such as `img_shape`, `ori_shape`, and `pad_shape` are in (height, width) order.
-```
-
-MMCV provides a unified base class called `BaseTransform` for all data transformation classes:
-
-```python
-class BaseTransform(metaclass=ABCMeta):
-
- def __call__(self, results: dict) -> dict:
-
- return self.transform(results)
-
- @abstractmethod
- def transform(self, results: dict) -> dict:
- pass
-```
-
-All data transformation classes must inherit `BaseTransform` and implement the `transform` method. Both the input and output of the `transform` method are a dictionary. In the **Custom data transformation class** section, we will describe how to implement a data transformation class in more detail.
-
-## Data pipeline
-
-As mentioned above, the inputs and outputs of all data transformations are dictionaries. Moreover, according to the \[Convention on Datasets\] (TODO) in OpenMMLab, the basic information of each sample in the dataset is also a dictionary. This way, we can connect all data transformation operations end to end and combine them into a data pipeline. This pipeline inputs the information dictionary of the samples in the dataset and outputs the information dictionary after a series of processing.
-
-Taking the classification task as an example, we show a typical data pipeline in the figure below. For each sample, the information stored in the dataset is a dictionary, as shown on the far left in the figure. After each data transformation operation represented by the blue block, a new field (marked in green) will be added to the data dictionary or an existing field (marked in orange) will be updated.
-
-*(Figure: a typical data pipeline for a classification task.)*
-
-The data pipeline is a list of several data transformation configuration dictionaries in the configuration file. Each dataset needs to set the parameter `pipeline` to define the data preparation operations the dataset needs to perform. The configuration of the above data pipeline in the configuration file is as follows:
-
-```python
-pipeline = [
- dict(type='LoadImageFromFile'),
- dict(type='Resize', size=256, keep_ratio=True),
- dict(type='CenterCrop', crop_size=224),
- dict(type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375]),
- dict(type='ClsFormatBundle')
-]
-
-dataset = dict(
- ...
- pipeline=pipeline,
- ...
-)
-```
-
-## Common data transformation classes
-
-The commonly used data transformation classes can be roughly divided into data loading, data preprocessing and augmentation, and data formatting. In MMCV, we provide some commonly used classes as follows:
-
-### Data loading
-
-To support the loading of large-scale datasets, data is usually not loaded when `Dataset` is initialized. Only the corresponding path is loaded. Therefore, it is necessary to load specific data in the data pipeline.
-
-| Class | Feature |
-| :-------------------------: | :--------------------------------------------: |
-| [`LoadImageFromFile`](TODO) | Load from file path |
-| [`LoadAnnotations`](TODO) | Load and organize the annotations (bbox, etc.) |
-
-### Data preprocessing and augmentation
-
-Data preprocessing and augmentation usually involve transforming the image itself, such as cropping, padding, scaling, etc.
-
-| Class | Feature |
-| :------------------------------: | :----------------------------------------------------: |
-| [`Pad`](TODO) | Padding |
-| [`CenterCrop`](TODO) | Center crop |
-| [`Normalize`](TODO) | Image normalization |
-| [`Resize`](TODO) | Resize to the specified size or ratio |
-| [`RandomResize`](TODO) | Scale the image randomly within the specified range |
-| [`RandomMultiscaleResize`](TODO) | Scale the image to a random size from multiple options |
-| [`RandomGrayscale`](TODO) | Random grayscale |
-| [`RandomFlip`](TODO) | Random flip |
-| [`MultiScaleFlipAug`](TODO) | Support scaling and flipping during the testing |
-
-### Data formatting
-
-Data formatting operations are type conversions performed on the data.
-
-| Class | Feature |
-| :---------------------: | :------------------------------------------: |
-| [`ToTensor`](TODO) | Convert the specified data to `torch.Tensor` |
-| [`ImageToTensor`](TODO) | Convert the image to `torch.Tensor` |
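-
-For instance, `ToTensor` converts the arrays under the given keys (a short sketch following the conventions above):
-
-```python
->>> import numpy as np
->>> from mmcv.transforms import ToTensor
->>>
->>> transform = ToTensor(keys=['img'])
->>> results = transform({'img': np.random.rand(224, 224, 3)})
->>> type(results['img'])
-<class 'torch.Tensor'>
-```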
-
-## Customize data transformation classes
-
-To implement a new data transformation class, you must inherit `BaseTransform` and implement the `transform` method. Here, we use a simple flip transform (`MyFlip`) as an example:
-
-```python
-import random
-import mmcv
-from mmcv.transforms import BaseTransform, TRANSFORMS
-
-@TRANSFORMS.register_module()
-class MyFlip(BaseTransform):
- def __init__(self, direction: str):
- super().__init__()
- self.direction = direction
-
- def transform(self, results: dict) -> dict:
- img = results['img']
- results['img'] = mmcv.imflip(img, direction=self.direction)
- return results
-```
-
-Now, we can instantiate `MyFlip` as a callable object to handle our data dictionary.
-
-```python
-import numpy as np
-
-transform = MyFlip(direction='horizontal')
-data_dict = {'img': np.random.rand(224, 224, 3)}
-data_dict = transform(data_dict)
-processed_img = data_dict['img']
-```
-
-Alternatively, use `MyFlip` transform in the `pipeline` of the config file.
-
-```python
-pipeline = [
- ...
- dict(type='MyFlip', direction='horizontal'),
- ...
-]
-```
-
-It should be noted that if you want to use `MyFlip` in a configuration file, you must ensure that the file containing the `MyFlip` class can be imported at runtime.
-
-## Transform wrapper
-
-Transform wrappers are a special class of data transformations. They do not operate on images, labels or other information in the data dictionary by themselves. Instead, they enhance the behavior of the data transformations wrapped inside them.
-
-### KeyMapper
-
-`KeyMapper` is used to map fields in the data dictionary. For example, image processing transforms usually get their values from the `"img"` field in the data dictionary. But sometimes we want these transforms to handle images in other fields in the data dictionary, such as the `"gt_img"` field.
-
-When used with the registry and configuration file, the `KeyMapper` wrapper should be used as follows:
-
-```python
-pipeline = [
- ...
- dict(type='KeyMapper',
- mapping={
- 'img': 'gt_img', # map "gt_img" to "img"
- 'mask': ..., # The "mask" field in the raw data is not used. That is, for wrapped data transformations, the "mask" field is not included in the data
- },
- auto_remap=True, # remap "img" back to "gt_img" after the transformation
- transforms=[
- # only need to specify "img" in `RandomFlip`
- dict(type='RandomFlip'),
- ])
- ...
-]
-```
-
-With `KeyMapper`, we don't need to consider various possible input field names in the `transform` method when we implement the data transformation class. We only need to deal with the default fields.
-
-### RandomChoice and RandomApply
-
-`RandomChoice` is used to randomly select a data transformation pipeline from the given choices. With this wrapper, we can easily implement some data augmentation functions, such as AutoAugment.
-
-In configuration file, you can use `RandomChoice` as follows:
-
-```python
-pipeline = [
- ...
- dict(type='RandomChoice',
- transforms=[
- [
- dict(type='Posterize', bits=4),
- dict(type='Rotate', angle=30.)
- ], # the first combo option
- [
- dict(type='Equalize'),
- dict(type='Rotate', angle=30)
- ], # the second combo option
- ],
- prob=[0.4, 0.6] # the prob of each combo
- )
- ...
-]
-```
-
-`RandomApply` is used to randomly perform a combination of data transformations with a specified probability. For example:
-
-```python
-pipeline = [
- ...
- dict(type='RandomApply',
- transforms=[dict(type='Rotate', angle=30.)],
- prob=0.3) # perform the transformation with prob as 0.3
- ...
-]
-```
-
-### TransformBroadcaster
-
-Usually, a data transformation class only reads the target of an operation from one field. While we can also use `KeyMapper` to change the fields read, there is no way to apply transformations to the data of multiple fields at once. To achieve this, we need to use the multi-target extension wrapper `TransformBroadcaster`.
-
-`TransformBroadcaster` has two uses, one is to apply data transformation to multiple specified fields, and the other is to apply data transformation to a group of targets under a field.
-
-1. Apply to multiple fields
-
- Suppose we need to apply a data transformation to images in two fields `"lq"` (low-quality) and `"gt"` (ground-truth).
-
- ```python
- pipeline = [
- dict(type='TransformBroadcaster',
- # apply to the "lq" and "gt" fields respectively, and set the "img" field to both
- mapping={'img': ['lq', 'gt']},
- # remap the "img" field back to the original field after the transformation
- auto_remap=True,
- # whether to share random variables in the transformation of each target
- # more details are given in the section on random variable sharing below
- share_random_params=True,
- transforms=[
- # only need to manipulate the "img" field in the `RandomFlip` class
- dict(type='RandomFlip'),
- ])
- ]
- ```
-
- In the `mapping` setting of the multi-target extension, we can also use `...` to ignore the specified original field. As shown in the following example, the wrapped `RandomCrop` will crop the image in the field `"img"` and update the size of the cropped image if the field `"img_shape"` exists. If we want to do the same random cropping for both image fields `"lq"` and `"gt"` at the same time but update the `"img_shape"` field only once, we can do it as in the example:
-
- ```python
- pipeline = [
- dict(type='TransformBroadcaster',
- mapping={
- 'img': ['lq', 'gt'],
- 'img_shape': ['img_shape', ...],
- },
- # remap the "img" and "img_shape" fields back to their original fields after the transformation
- auto_remap=True,
- # whether to share random variables in the transformation of each target
- # more details are given in the section on random variable sharing below
- share_random_params=True,
- transforms=[
- # "img" and "img_shape" fields are manipulated in the `RandomCrop` class
- # if "img_shape" is missing, only operate on "img"
- dict(type='RandomCrop'),
- ])
- ]
- ```
-
-2. Apply to a group of targets under one field
-
- Suppose we need to apply a data transformation to the `"images"` field, which is a list of images.
-
- ```python
- pipeline = [
- dict(type='TransformBroadcaster',
- # map each image under the "images" field to the "img" field
- mapping={'img': 'images'},
- # remap the images under the "img" field back to the list in the "images" field after the transformation
- auto_remap=True,
- # whether to share random variables in the transformation of each target
- share_random_params=True,
- transforms=[
- # in the `RandomFlip` transformation class, we only need to manipulate the "img" field
- dict(type='RandomFlip'),
- ])
- ]
- ```
-
-#### Decorator `cache_randomness`
-
-In `TransformBroadcaster`, we provide the `share_random_params` option to support sharing random states across multiple data transformations. For example, in a super-resolution task, we want to apply **the same** random transformations **simultaneously** to the low-resolution image and the original image. If we use this function in a custom data transformation class, we need to mark which random variables support sharing in the class. This can be achieved with the decorator `cache_randomness`.
-
-Taking `MyFlip` from the above example, we want to perform flipping randomly with a certain probability:
-
-```python
-from mmcv.transforms.utils import cache_randomness
-
-@TRANSFORMS.register_module()
-class MyRandomFlip(BaseTransform):
- def __init__(self, prob: float, direction: str):
- super().__init__()
- self.prob = prob
- self.direction = direction
-
- @cache_randomness # label the output of the method as a shareable random variable
- def do_flip(self):
- flip = random.random() < self.prob  # flip with probability `prob`
- return flip
-
- def transform(self, results: dict) -> dict:
- img = results['img']
- if self.do_flip():
- results['img'] = mmcv.imflip(img, direction=self.direction)
- return results
-```
-
-In the above example, we decorate the `do_flip` method with `cache_randomness`, marking the method's return value `flip` as a random variable that supports sharing. Therefore, when `TransformBroadcaster` applies the transformation to multiple targets, the value of this variable remains the same.
-
-#### Decorator `avoid_cache_randomness`
-
-In some cases, we cannot separate the generation of random variables in a data transformation into a class method. For example, a module from a third-party library used in the transformation may encapsulate the random logic internally, making it impossible to extract it as a class method. Such data transformations cannot mark shareable random variables with the `cache_randomness` decorator, and thus cannot share random variables when broadcast to multiple targets.
-
-To avoid misuse of such data transformations in multi-object extensions, we provide another decorator, `avoid_cache_randomness`, to mark such data transformations:
-
-```python
-from mmcv.transforms.utils import avoid_cache_randomness
-
-@TRANSFORMS.register_module()
-@avoid_cache_randomness
-class MyRandomTransform(BaseTransform):
-
- def transform(self, results: dict) -> dict:
- ...
-```
-
-Data transformation classes marked with `avoid_cache_randomness` will throw an exception when their instance is wrapped by `TransformBroadcaster` and the parameter `share_random_params` is set to True. This reminds the user not to use it in this way.
-
-There are a few things to keep in mind when using `avoid_cache_randomness`:
-
-1. `avoid_cache_randomness` is only used to decorate data transformation classes (subclasses of `BaseTransform`) and cannot be used to decorate other general classes, class methods, or functions.
-2. When a data transformation decorated with `avoid_cache_randomness` is used as a base class, its subclasses **will not inherit** this feature. If a subclass is still unable to share random variables, `avoid_cache_randomness` should be applied again.
-3. A data transformation needs to be decorated with `avoid_cache_randomness` only when it is random and cannot share its random parameters. Data transformations without randomness require no decoration.
diff --git a/docs/en/understand_mmcv/ops.md b/docs/en/understand_mmcv/ops.md
deleted file mode 100644
index e60f77c77234a99ac22c2d7b950389ad3aec9835..0000000000000000000000000000000000000000
--- a/docs/en/understand_mmcv/ops.md
+++ /dev/null
@@ -1,66 +0,0 @@
-## ops
-
-We implement common ops used in detection, segmentation, etc.
-
-| Device | CPU | CUDA | MLU | MPS | Ascend |
-| ---------------------------- | --- | ---- | --- | --- | ------ |
-| ActiveRotatedFilter | √ | √ | | | |
-| AssignScoreWithK | | √ | | | |
-| BallQuery | | √ | | | |
-| BBoxOverlaps | | √ | √ | √ | √ |
-| BorderAlign | | √ | | | |
-| BoxIouRotated | √ | √ | | | |
-| BoxIouQuadri | √ | √ | | | |
-| CARAFE | | √ | √ | | |
-| ChamferDistance | | √ | | | |
-| CrissCrossAttention | | √ | | | |
-| ContourExpand | √ | | | | |
-| ConvexIoU | | √ | | | |
-| CornerPool | | √ | | | |
-| Correlation | | √ | | | |
-| Deformable Convolution v1/v2 | √ | √ | | | √ |
-| Deformable RoIPool | | √ | √ | | √ |
-| DiffIoURotated | | √ | | | |
-| DynamicScatter | | √ | | | |
-| FurthestPointSample | | √ | | | |
-| FurthestPointSampleWithDist | | √ | | | |
-| FusedBiasLeakyrelu | | √ | | | √ |
-| GatherPoints | | √ | | | √ |
-| GroupPoints | | √ | | | |
-| Iou3d | | √ | √ | | |
-| KNN | | √ | | | |
-| MaskedConv | | √ | √ | | √ |
-| MergeCells | | √ | | | |
-| MinAreaPolygon | | √ | | | |
-| ModulatedDeformConv2d | √ | √ | | | √ |
-| MultiScaleDeformableAttn | | √ | √ | | |
-| NMS | √ | √ | √ | | √ |
-| NMSRotated | √ | √ | | | √ |
-| NMSQuadri | √ | √ | | | |
-| PixelGroup | √ | | | | |
-| PointsInBoxes | √ | √ | | | |
-| PointsInPolygons | | √ | | | |
-| PSAMask | √ | √ | √ | | √ |
-| RotatedFeatureAlign | √ | √ | | | |
-| RoIPointPool3d | | √ | √ | | |
-| RoIPool | | √ | √ | | √ |
-| RoIAlignRotated | √ | √ | √ | | |
-| RiRoIAlignRotated | | √ | | | |
-| RoIAlign | √ | √ | √ | | |
-| RoIAwarePool3d | | √ | √ | | |
-| SAConv2d | | √ | | | |
-| SigmoidFocalLoss | | √ | √ | | √ |
-| SoftmaxFocalLoss | | √ | | | √ |
-| SoftNMS | | √ | | | |
-| Sparse Convolution | | √ | | | |
-| Synchronized BatchNorm | | √ | | | |
-| ThreeInterpolate | | √ | | | |
-| ThreeNN | | √ | √ | | |
-| TINShift | | √ | √ | | |
-| UpFirDn2d | | √ | | | |
-| Voxelization | √ | √ | | | √ |
-| PrRoIPool | | √ | | | |
-| BezierAlign | √ | √ | | | |
-| BiasAct | | √ | | | |
-| FilteredLrelu | | √ | | | |
-| Conv2dGradfix | | √ | | | |
diff --git a/docs/faq.md b/docs/faq.md
new file mode 100644
index 0000000000000000000000000000000000000000..ab0dd135f946c63f6dc3d08e2b6ca2f6837c7437
--- /dev/null
+++ b/docs/faq.md
@@ -0,0 +1,42 @@
+## Frequently Asked Questions
+
+We list some common troubles faced by many users and their corresponding solutions here.
+Feel free to enrich the list if you find any frequent issues and have ways to help others to solve them.
+
+- Compatibility issue between MMCV and MMDetection; "ConvWS is already registered in conv layer"
+
+ Please install the correct version of MMCV for your version of MMDetection following the instructions above.
+
+- "No module named 'mmcv.ops'"; "No module named 'mmcv._ext'".
+
+ 1. Uninstall existing mmcv in the environment using `pip uninstall mmcv`.
+ 2. Install mmcv-full following the instruction above.
+
+- "invalid device function" or "no kernel image is available for execution".
+
+ 1. Check the CUDA compute capability of your GPU.
+ 2. Run `python mmdet/utils/collect_env.py` to check whether PyTorch, torchvision,
+ and MMCV are built for the correct GPU architecture.
+ You may need to set `TORCH_CUDA_ARCH_LIST` to reinstall MMCV.
+ The compatibility issue could happen when using old GPUs, e.g., Tesla K80 (3.7) on Colab.
+ 3. Check whether the running environment is the same as that when mmcv/mmdet is compiled.
+ For example, you may compile mmcv using CUDA 10.0 but run it on a CUDA 9.0 environment.
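+
+ As a quick check, PyTorch's own introspection shows both sides of a potential mismatch (a minimal sketch; `get_arch_list` requires a recent PyTorch):
+
+ ```python
+ import torch
+
+ print(torch.cuda.get_device_capability(0))  # compute capability of your GPU, e.g. (3, 7)
+ print(torch.cuda.get_arch_list())           # architectures this PyTorch build supports
+ ```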
+
+- "undefined symbol" or "cannot open xxx.so".
+
+ 1. If those symbols are CUDA/C++ symbols (e.g., libcudart.so or GLIBCXX), check
+ whether the CUDA/GCC runtimes are the same as those used for compiling mmcv.
+ 2. If those symbols are PyTorch symbols (e.g., symbols containing caffe, aten, and TH), check whether
+ the PyTorch version is the same as that used for compiling mmcv.
+ 3. Run `python mmdet/utils/collect_env.py` to check whether PyTorch, torchvision,
+ and MMCV are built by and running on the same environment.
+
+- "RuntimeError: CUDA error: invalid configuration argument".
+
+ This error may occur on GPUs with limited resources. Try decreasing the value of [THREADS_PER_BLOCK](https://github.com/open-mmlab/mmcv/blob/cac22f8cf5a904477e3b5461b1cc36856c2793da/mmcv/ops/csrc/common_cuda_helper.hpp#L10)
+ and recompile mmcv.
+
+- "RuntimeError: nms is not compiled with GPU support".
+
+ This error is because your CUDA environment is not installed correctly.
+ You may try to re-install your CUDA environment and then delete the build/ folder before re-compiling mmcv.
diff --git a/docs/get_started/build.md b/docs/get_started/build.md
new file mode 100644
index 0000000000000000000000000000000000000000..758a83a4fb84398c9e192df37f7778a736109813
--- /dev/null
+++ b/docs/get_started/build.md
@@ -0,0 +1,234 @@
+## Build MMCV from source
+
+### Build on Linux or macOS
+
+After cloning the repo with
+
+```bash
+git clone https://github.com/open-mmlab/mmcv.git
+cd mmcv
+```
+
+You can either
+
+- install the lite version
+
+ ```bash
+ pip install -e .
+ ```
+
+- install the full version
+
+ ```bash
+ MMCV_WITH_OPS=1 pip install -e .
+ ```
+
+If you are on macOS, add the following environment variables before the install command.
+
+```bash
+CC=clang CXX=clang++ CFLAGS='-stdlib=libc++'
+```
+
+e.g.,
+
+```bash
+CC=clang CXX=clang++ CFLAGS='-stdlib=libc++' MMCV_WITH_OPS=1 pip install -e .
+```
+
+```{note}
+If you would like to use `opencv-python-headless` instead of `opencv-python`,
+e.g., in a minimum container environment or servers without GUI,
+you can first install it before installing MMCV to skip the installation of `opencv-python`.
+```
+
+### Build on Windows
+
+Building MMCV on Windows is a bit more complicated than on Linux.
+The following instructions show how to get this accomplished.
+
+#### Prerequisite
+
+The following software is required for building MMCV on Windows.
+Install them first.
+
+- [Git](https://git-scm.com/download/win)
+ - During installation, tick **add git to Path**.
+- [Visual Studio Community 2019](https://visualstudio.microsoft.com)
+ - A compiler for C++ and CUDA codes.
+- [Miniconda](https://docs.conda.io/en/latest/miniconda.html)
+ - Official distributions of Python should work too.
+- [CUDA 10.2](https://developer.nvidia.com/cuda-10.2-download-archive)
+ - Not required for building CPU version.
+ - Customize the installation if necessary. As a recommendation, skip the driver installation if a newer version is already installed.
+
+```{note}
+You should know how to set up environment variables, especially `Path`, on Windows. The following instruction relies heavily on this skill.
+```
+
+#### Setup Python Environment
+
+1. Launch Anaconda prompt from Windows Start menu
+
+ Do not use raw `cmd.exe` as this instruction is based on PowerShell syntax.
+
+1. Create a new conda environment
+
+ ```shell
+ conda create --name mmcv python=3.7 # 3.6, 3.7, 3.8 should work too as tested
+ conda activate mmcv # make sure to activate environment before any operation
+ ```
+
+1. Install PyTorch. Choose a version based on your need.
+
+ ```shell
+ conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
+ ```
+
+ We only tested PyTorch version >= 1.6.0.
+
+1. Prepare MMCV source code
+
+ ```shell
+ git clone https://github.com/open-mmlab/mmcv.git
+ cd mmcv
+ ```
+
+1. Install required Python packages
+
+ ```shell
+ pip3 install -r requirements.txt
+ ```
+
+#### Build and install MMCV
+
+MMCV can be built in three ways:
+
+1. Lite version (without ops)
+
+ In this way, no custom ops are compiled and mmcv is a pure Python package.
+
+1. Full version (CPU ops)
+
+ Module `ops` will be compiled as a PyTorch extension, but only x86 code will be compiled. The compiled ops can be executed on CPU only.
+
+1. Full version (CUDA ops)
+
+ Both the x86 and CUDA code of the `ops` module will be compiled. The compiled version can be run on both CPU and CUDA-enabled GPU (if implemented).
+
+##### Common steps
+
+1. Set up MSVC compiler
+
+ Set the environment variable: add `C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\Hostx86\x64` to `PATH` so that `cl.exe` is available in the prompt, as shown below.
+
+ ```none
+ (base) PS C:\Users\xxx> cl
+ Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29111 for x64
+ Copyright (C) Microsoft Corporation. All rights reserved.
+
+ usage: cl [ option... ] filename... [ / link linkoption... ]
+ ```
+
+ For compatibility, we use the x86-hosted and x64-targeted compiler; note `Hostx86\x64` in the path.
+
+ You may want to change the system language to English because PyTorch parses text output from `cl.exe` to check its version, and only UTF-8 output is recognized. Navigate to Control Panel -> Region -> Administrative -> Language for Non-Unicode programs and change it to English.
+
+##### Option 1: Build MMCV (lite version)
+
+After finishing the above common steps, launch the Anaconda shell from the Start menu and issue the following commands:
+
+```shell
+# activate environment
+conda activate mmcv
+# change directory
+cd mmcv
+# install
+python setup.py develop
+# check
+pip list
+```
+
+##### Option 2: Build MMCV (full version with CPU)
+
+1. Finish the above common steps
+1. Set up environment variables
+
+ ```shell
+ $env:MMCV_WITH_OPS = 1
+ $env:MAX_JOBS = 8 # based on your available number of CPU cores and amount of memory
+ ```
+
+1. Follow the build steps of the lite version
+
+ ```shell
+ # activate environment
+ conda activate mmcv
+ # change directory
+ cd mmcv
+ # build
+ python setup.py build_ext # if success, cl will be launched to compile ops
+ # install
+ python setup.py develop
+ # check
+ pip list
+ ```
+
+##### Option 3: Build MMCV (full version with CUDA)
+
+1. Finish the above common steps
+1. Make sure `CUDA_PATH` or `CUDA_HOME` is already set in the environment variables. You can check this via `ls env:`; the desired output is shown below:
+
+ ```none
+ (base) PS C:\Users\WRH> ls env:
+
+ Name Value
+ ---- -----
+ <... omit some lines ...>
+ CUDA_PATH C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
+ CUDA_PATH_V10_1 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
+ CUDA_PATH_V10_2 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
+ <... omit some lines ...>
+ ```
+
+ This should already be done by the CUDA installer. If not, or if you have multiple versions of the CUDA toolkit installed, set it with
+
+ ```shell
+ $env:CUDA_HOME = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2"
+ # OR
+ $env:CUDA_HOME = $env:CUDA_PATH_V10_2 # if CUDA_PATH_V10_2 is in envs:
+ ```
+
+1. Set CUDA target arch
+
+ ```shell
+ # Suppose you are using GTX 1080, which is of capability 6.1
+ $env:TORCH_CUDA_ARCH_LIST="6.1"
+ # OR build all supported arch, will be slow
+ $env:TORCH_CUDA_ARCH_LIST="3.5 3.7 5.0 5.2 6.0 6.1 7.0 7.5"
+ ```
+
+```{note}
+Check the compute capability of your GPU from [here](https://developer.nvidia.com/cuda-gpus).
+```
+
+1. Launch the compilation the same way as for the CPU version
+
+ ```shell
+ $env:MMCV_WITH_OPS = 1
+ $env:MAX_JOBS = 8 # based on available number of CPU cores and amount of memory
+ # activate environment
+ conda activate mmcv
+ # change directory
+ cd mmcv
+ # build
+ python setup.py build_ext # if success, cl will be launched to compile ops
+ # install
+ python setup.py develop
+ # check
+ pip list
+ ```
+
+```{note}
+If you are compiling against PyTorch 1.6.0, you might meet some errors from PyTorch as described in [this issue](https://github.com/pytorch/pytorch/issues/42467). Follow [this pull request](https://github.com/pytorch/pytorch/pull/43380/files) to modify the source code in your local PyTorch installation.
+```
+
+If you meet issues when running or compiling mmcv, we list some common issues in [Frequently Asked Questions](../faq.html).
diff --git a/docs/get_started/installation.md b/docs/get_started/installation.md
new file mode 100644
index 0000000000000000000000000000000000000000..0c64ea825cad548f21c2b41a9538f9447b7431b8
--- /dev/null
+++ b/docs/get_started/installation.md
@@ -0,0 +1,162 @@
+## Installation
+
+There are two versions of MMCV:
+
+- **mmcv-full**: comprehensive, with full features and various CUDA ops out of the box. It takes a longer time to build.
+- **mmcv**: lite, without CUDA ops but all other features, similar to mmcv<1.0.0. It is useful when you do not need those CUDA ops.
+
+```{warning}
+Do not install both versions in the same environment, otherwise you may encounter errors like `ModuleNotFound`. You need to uninstall one before installing the other. Installing the full version is highly recommended if CUDA is available.
+```
+
+a. Install the full version.
+
+Before installing mmcv-full, make sure that PyTorch has been successfully installed following the [official guide](https://pytorch.org/).
+
+We provide pre-built mmcv packages (recommended) with different PyTorch and CUDA versions to simplify the building. In addition, you can run [check_installation.py](.dev_scripts/check_installation.py) to check the installation of mmcv-full after running the installation commands.
+
+i. Install the latest version.
+
+The rule for installing the latest ``mmcv-full`` is as follows:
+
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
+```
+
+Please replace ``{cu_version}`` and ``{torch_version}`` in the url with your desired ones. For example,
+to install the latest ``mmcv-full`` with ``CUDA 11.1`` and ``PyTorch 1.9.0``, use the following command:
+
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
+```
+
+For more details, please refer to the following tables and delete ``=={mmcv_version}``.
+
+ii. Install a specified version.
+
+The rule for installing a specified ``mmcv-full`` is as follows:
+
+```shell
+pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
+```
+
+First of all, please refer to the Releases and replace ``{mmcv_version}`` with a specified one, e.g. ``1.3.9``.
+Then replace ``{cu_version}`` and ``{torch_version}`` in the url with your desired versions. For example,
+to install ``mmcv-full==1.3.9`` with ``CUDA 11.1`` and ``PyTorch 1.9.0``, use the following command:
+
+```shell
+pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
+```
+
+```{note}
+mmcv-full is only compiled on PyTorch 1.x.0 because the compatibility
+usually holds between 1.x.0 and 1.x.1. If your PyTorch version is 1.x.1, you
+can install mmcv-full compiled with PyTorch 1.x.0 and it usually works well.
+For example, if your PyTorch version is 1.8.1 and CUDA version is 11.1, you
+can use the following command to install mmcv-full.
+
+`pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html`
+```
+
+For more details, please refer to the following table.
+
+
+Each ✓ in the table below marks an available combination. The corresponding command is `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html` with the matching tags, e.g. `cu102` and `torch1.8.0`.
+
+| CUDA           | torch 1.10 | torch 1.9 | torch 1.8 | torch 1.7 | torch 1.6 | torch 1.5 |
+| -------------- | :--------: | :-------: | :-------: | :-------: | :-------: | :-------: |
+| 11.3 (`cu113`) |     ✓      |           |           |           |           |           |
+| 11.1 (`cu111`) |     ✓      |     ✓     |     ✓     |           |           |           |
+| 11.0 (`cu110`) |            |           |           |     ✓     |           |           |
+| 10.2 (`cu102`) |     ✓      |     ✓     |     ✓     |     ✓     |     ✓     |     ✓     |
+| 10.1 (`cu101`) |            |           |     ✓     |     ✓     |     ✓     |     ✓     |
+| 9.2 (`cu92`)   |            |           |           |     ✓     |     ✓     |     ✓     |
+| cpu (`cpu`)    |     ✓      |     ✓     |     ✓     |     ✓     |     ✓     |     ✓     |
+
+```{note}
+The pre-built packages provided above do not include all versions of mmcv-full; you can click on the corresponding links to see the supported versions. For example, if you click [cu102-torch1.8.0](https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html), you can see that `cu102-torch1.8.0` only provides mmcv-full 1.3.0 and above. In addition, we no longer provide `mmcv-full` pre-built packages compiled with `PyTorch 1.3 & 1.4` since v1.3.17. You can find previous versions that were compiled with PyTorch 1.3 & 1.4 [here](./docs/get_started/previous_versions.md). The compatibility is still ensured in our CI, but we will discard the support of PyTorch 1.3 & 1.4 next year.
+```
+
+Another way is to compile locally by running
+
+```shell
+pip install mmcv-full
+```
+
+Note that the local compiling may take up to 10 mins.
+
+b. Install the lite version.
+
+```shell
+pip install mmcv
+```
+
+c. Install full version with custom operators for onnxruntime
+
+- Check [here](https://mmcv.readthedocs.io/en/latest/deployment/onnxruntime_custom_ops.html) for detailed instruction.
+
+If you would like to build MMCV from source, please refer to the [guide](https://mmcv.readthedocs.io/en/latest/get_started/build.html).
diff --git a/docs/get_started/introduction.md b/docs/get_started/introduction.md
new file mode 100644
index 0000000000000000000000000000000000000000..4ffb59d2d57cd24c23dd5d9fb0558ab1d66a06a8
--- /dev/null
+++ b/docs/get_started/introduction.md
@@ -0,0 +1,29 @@
+## Introduction
+
+MMCV is a foundational library for computer vision research and supports many
+research projects as below:
+
+- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark.
+- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab detection toolbox and benchmark.
+- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab's next-generation platform for general 3D object detection.
+- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab semantic segmentation toolbox and benchmark.
+- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab's next-generation action understanding toolbox and benchmark.
+- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab video perception toolbox and benchmark.
+- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab pose estimation toolbox and benchmark.
+- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox.
+- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab text detection, recognition and understanding toolbox.
+- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab image and video generative models toolbox.
+
+It provides the following functionalities.
+
+- Universal IO APIs
+- Image/Video processing
+- Image and annotation visualization
+- Useful utilities (progress bar, timer, ...)
+- PyTorch runner with hooking mechanism
+- Various CNN architectures
+- High-quality implementation of common CUDA ops
+
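+As a small taste of these utilities, a minimal sketch (`demo.jpg` stands for any local image file):
+
+```python
+import mmcv
+
+img = mmcv.imread('demo.jpg')          # universal image IO
+img = mmcv.imresize(img, (256, 256))   # image processing
+mmcv.imwrite(img, 'demo_resized.jpg')
+```
+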
+```{note}
+MMCV requires Python 3.6+.
+```
diff --git a/docs/en/get_started/previous_versions.md b/docs/get_started/previous_versions.md
similarity index 93%
rename from docs/en/get_started/previous_versions.md
rename to docs/get_started/previous_versions.md
index a9c3717667fec3e8f338c319413aa6ad639dc6d3..c91180d2203dc5cf21c4dccbc4b4e20891879795 100644
--- a/docs/en/get_started/previous_versions.md
+++ b/docs/get_started/previous_versions.md
@@ -4,7 +4,7 @@ We no longer provide `mmcv-full` packages compiled under lower versions of `PyTo
### PyTorch 1.4
-| 1.0.0 \<= mmcv_version \<= 1.2.1
+| 1.0.0 <= mmcv_version <= 1.2.1
#### CUDA 10.1
@@ -26,7 +26,7 @@ pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dis
### PyTorch v1.3
-| 1.0.0 \<= mmcv_version \<= 1.3.16
+| 1.0.0 <= mmcv_version <= 1.3.16
#### CUDA 10.1
diff --git a/docs/en/index.rst b/docs/index.rst
similarity index 71%
rename from docs/en/index.rst
rename to docs/index.rst
index dee2c37507fb77df42fef5e51fe501214c13d7ce..6019f107a842107f5e38989df313ca7cc7fe9f9c 100644
--- a/docs/en/index.rst
+++ b/docs/index.rst
@@ -15,23 +15,27 @@ You can switch between Chinese and English documents in the lower-left corner of
:maxdepth: 2
:caption: Understand MMCV
+ understand_mmcv/config.md
+ understand_mmcv/registry.md
+ understand_mmcv/runner.md
+ understand_mmcv/io.md
understand_mmcv/data_process.md
- understand_mmcv/data_transform.md
understand_mmcv/visualization.md
understand_mmcv/cnn.md
understand_mmcv/ops.md
+ understand_mmcv/utils.md
.. toctree::
:maxdepth: 2
:caption: Deployment
+ deployment/onnx.md
+ deployment/onnxruntime_op.md
+ deployment/onnxruntime_custom_ops.md
+ deployment/tensorrt_plugin.md
+ deployment/tensorrt_custom_ops.md
deployment/mmcv_ops_definition.md
-.. toctree::
- :caption: Switch Language
-
- switch_language.md
-
.. toctree::
:maxdepth: 2
:caption: Compatibility
@@ -39,6 +43,8 @@ You can switch between Chinese and English documents in the lower-left corner of
compatibility.md
.. toctree::
+ :maxdepth: 2
+ :caption: FAQ
faq.md
@@ -50,17 +56,10 @@ You can switch between Chinese and English documents in the lower-left corner of
community/pr.md
.. toctree::
- :maxdepth: 1
+ :maxdepth: 2
:caption: API Reference
- mmcv.image
- mmcv.video
- mmcv.visualization
- mmcv.cnn
- mmcv.ops
- mmcv.transforms
- mmcv.arraymisc
- mmcv.utils
+ api.rst
Indices and tables
==================
diff --git a/docs/en/make.bat b/docs/make.bat
similarity index 100%
rename from docs/en/make.bat
rename to docs/make.bat
diff --git a/docs/en/mmcv-logo.png b/docs/mmcv-logo.png
similarity index 100%
rename from docs/en/mmcv-logo.png
rename to docs/mmcv-logo.png
diff --git a/docs/understand_mmcv/cnn.md b/docs/understand_mmcv/cnn.md
new file mode 100644
index 0000000000000000000000000000000000000000..749cb951131efe5c9ec4c59ef05b90243913df68
--- /dev/null
+++ b/docs/understand_mmcv/cnn.md
@@ -0,0 +1,538 @@
+## CNN
+
+We provide some building bricks for CNNs, including layer building, module bundles and weight initialization.
+
+### Layer building
+
+We may need to try different layers of the same type when running experiments,
+but do not want to modify the code from time to time.
+Here we provide some layer building methods to construct layers from a dict,
+which can be written in configs or specified via command line arguments.
+
+#### Usage
+
+The simplest example is
+
+```python
+from mmcv.cnn import build_conv_layer
+
+cfg = dict(type='Conv3d')
+layer = build_conv_layer(cfg, in_channels=3, out_channels=8, kernel_size=3)
+```
+
+- `build_conv_layer`: Supported types are Conv1d, Conv2d, Conv3d, Conv (alias for Conv2d).
+- `build_norm_layer`: Supported types are BN1d, BN2d, BN3d, BN (alias for BN2d), SyncBN, GN, LN, IN1d, IN2d, IN3d, IN (alias for IN2d).
+- `build_activation_layer`: Supported types are ReLU, LeakyReLU, PReLU, RReLU, ReLU6, ELU, Sigmoid, Tanh, GELU.
+- `build_upsample_layer`: Supported types are nearest, bilinear, deconv, pixel_shuffle.
+- `build_padding_layer`: Supported types are zero, reflect, replicate.
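+
+For instance, a norm layer and an activation layer can be built from configs the same way. A small sketch; note that `build_norm_layer` returns a `(name, layer)` tuple, where the name is derived from the norm type and an optional postfix:
+
+```python
+from mmcv.cnn import build_activation_layer, build_norm_layer
+
+# returns a (name, layer) tuple, e.g. ('bn', nn.BatchNorm2d(8))
+name, bn = build_norm_layer(dict(type='BN'), num_features=8)
+act = build_activation_layer(dict(type='ReLU'))
+```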
+
+#### Extension
+
+We also allow extending the building methods with custom layers and operators.
+
+1. Write and register your own module.
+
+ ```python
+ from mmcv.cnn import UPSAMPLE_LAYERS
+
+ @UPSAMPLE_LAYERS.register_module()
+ class MyUpsample:
+
+ def __init__(self, scale_factor):
+ pass
+
+ def forward(self, x):
+ pass
+ ```
+
+2. Import `MyUpsample` somewhere (e.g., in `__init__.py`) and then use it.
+
+ ```python
+ cfg = dict(type='MyUpsample', scale_factor=2)
+ layer = build_upsample_layer(cfg)
+ ```
+
+### Module bundles
+
+We also provide common module bundles to facilitate network construction.
+`ConvModule` is a bundle of convolution, normalization and activation layers;
+please refer to the [api](api.html#mmcv.cnn.ConvModule) for details.
+
+```python
+from mmcv.cnn import ConvModule
+
+# conv + bn + relu
+conv = ConvModule(3, 8, 2, norm_cfg=dict(type='BN'))
+# conv + gn + relu
+conv = ConvModule(3, 8, 2, norm_cfg=dict(type='GN', num_groups=2))
+# conv + relu
+conv = ConvModule(3, 8, 2)
+# conv
+conv = ConvModule(3, 8, 2, act_cfg=None)
+# conv + leaky relu
+conv = ConvModule(3, 8, 3, padding=1, act_cfg=dict(type='LeakyReLU'))
+# bn + conv + relu
+conv = ConvModule(
+ 3, 8, 2, norm_cfg=dict(type='BN'), order=('norm', 'conv', 'act'))
+```
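+
+Since these bundles are ordinary PyTorch modules, a quick forward pass is an easy way to sanity-check a configuration (a minimal sketch):
+
+```python
+import torch
+from mmcv.cnn import ConvModule
+
+conv = ConvModule(3, 8, 3, padding=1, norm_cfg=dict(type='BN'))
+x = torch.rand(1, 3, 32, 32)
+out = conv(x)  # shape: (1, 8, 32, 32)
+```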
+
+### Weight initialization
+
+> Implementation details are available at [mmcv/cnn/utils/weight_init.py](../../mmcv/cnn/utils/weight_init.py)
+
+During training, a proper initialization strategy can speed up training or
+lead to higher performance. In MMCV, we provide some commonly used methods
+for initializing modules like `nn.Conv2d`, as well as high-level APIs for
+initializing models containing one or more modules.
+
+#### Initialization functions
+
+Initialize an `nn.Module` such as `nn.Conv2d` or `nn.Linear` in a functional way.
+
+We provide the following initialization methods.
+
+- constant_init
+
+ Initialize module parameters with constant values.
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import constant_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # constant_init(module, val, bias=0)
+ >>> constant_init(conv1, 1, 0)
+ >>> conv1.weight
+ ```
+
+- xavier_init
+
+ Initialize module parameters with values according to the method
+ described in [Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010)](http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf)
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import xavier_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # xavier_init(module, gain=1, bias=0, distribution='normal')
+ >>> xavier_init(conv1, distribution='normal')
+ ```
+
+- normal_init
+
+ Initialize module parameters with the values drawn from a normal distribution.
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import normal_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # normal_init(module, mean=0, std=1, bias=0)
+ >>> normal_init(conv1, std=0.01, bias=0)
+ ```
+
+- uniform_init
+
+ Initialize module parameters with values drawn from a uniform distribution.
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import uniform_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # uniform_init(module, a=0, b=1, bias=0)
+ >>> uniform_init(conv1, a=0, b=1)
+ ```
+
+- kaiming_init
+
+ Initialize module parameters with the values according to the method
+ described in [Delving deep into rectifiers: Surpassing human-level
+ performance on ImageNet classification - He, K. et al. (2015)](https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf)
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import kaiming_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # kaiming_init(module, a=0, mode='fan_out', nonlinearity='relu', bias=0, distribution='normal')
+ >>> kaiming_init(conv1)
+ ```
+
+- caffe2_xavier_init
+
+  Initialize module parameters with the Xavier initialization as implemented in Caffe2, which corresponds to `kaiming_uniform_` in PyTorch.
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import caffe2_xavier_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # caffe2_xavier_init(module, bias=0)
+ >>> caffe2_xavier_init(conv1)
+ ```
+
+- bias_init_with_prob
+
+ Initialize conv/fc bias value according to a given probability, as proposed in [Focal Loss for Dense Object Detection](https://arxiv.org/pdf/1708.02002.pdf).
+
+ ```python
+ >>> from mmcv.cnn import bias_init_with_prob
+ >>> # bias_init_with_prob is proposed in Focal Loss
+ >>> bias = bias_init_with_prob(0.01)
+ >>> bias
+ -4.59511985013459
+ ```
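+
+The value above can also be derived by hand: `bias_init_with_prob` solves `sigmoid(bias) = p` for the bias, i.e. `bias = -log((1 - p) / p)`. A quick check with the standard library:
+
+```python
+import math
+
+p = 0.01
+bias = -math.log((1 - p) / p)  # solves sigmoid(bias) = p
+print(bias)  # about -4.5951
+```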
+
+#### Initializers and configs
+
+On the basis of the initialization methods, we define the corresponding initialization classes and register them to `INITIALIZERS`, so we can
+use the configuration to initialize the model.
+
+We provide the following initialization classes.
+
+- ConstantInit
+- XavierInit
+- NormalInit
+- UniformInit
+- KaimingInit
+- Caffe2XavierInit
+- PretrainedInit
+
+Let us introduce the usage of `initialize` in detail.
+
+1. Initialize model by `layer` key
+
+   If we only define `layer`, it just initializes the layers listed in the `layer` key.
+
+   NOTE: The value of the `layer` key is the class name of a PyTorch layer with `weight` and `bias` attributes, so the `MultiheadAttention` layer is not supported.
+
+- Define the `layer` key for initializing modules with the same configuration.
+
+ ```python
+ import torch.nn as nn
+ from mmcv.cnn import initialize
+
+ class FooNet(nn.Module):
+ def __init__(self):
+ super().__init__()
+ self.feat = nn.Conv1d(3, 1, 3)
+ self.reg = nn.Conv2d(3, 3, 3)
+ self.cls = nn.Linear(1, 2)
+
+ model = FooNet()
+ init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d', 'Linear'], val=1)
+ # initialize whole module with same configuration
+ initialize(model, init_cfg)
+ # model.feat.weight
+ # Parameter containing:
+ # tensor([[[1., 1., 1.],
+ # [1., 1., 1.],
+ # [1., 1., 1.]]], requires_grad=True)
+ ```
+
+- Define the `layer` key for initializing layers with different configurations.
+
+ ```python
+ import torch.nn as nn
+ from mmcv.cnn.utils import initialize
+
+ class FooNet(nn.Module):
+ def __init__(self):
+ super().__init__()
+ self.feat = nn.Conv1d(3, 1, 3)
+ self.reg = nn.Conv2d(3, 3, 3)
+ self.cls = nn.Linear(1,2)
+
+ model = FooNet()
+ init_cfg = [dict(type='Constant', layer='Conv1d', val=1),
+ dict(type='Constant', layer='Conv2d', val=2),
+ dict(type='Constant', layer='Linear', val=3)]
+ # nn.Conv1d will be initialized with dict(type='Constant', val=1)
+ # nn.Conv2d will be initialized with dict(type='Constant', val=2)
+ # nn.Linear will be initialized with dict(type='Constant', val=3)
+ initialize(model, init_cfg)
+ # model.reg.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+ ```
+
+2. Initialize model by `override` key
+
+- When initializing some specific part by its attribute name, we can use the `override` key, whose value will override the corresponding value in `init_cfg`.
+
+ ```python
+ import torch.nn as nn
+ from mmcv.cnn import initialize
+
+ class FooNet(nn.Module):
+ def __init__(self):
+ super().__init__()
+ self.feat = nn.Conv1d(3, 1, 3)
+ self.reg = nn.Conv2d(3, 3, 3)
+ self.cls = nn.Sequential(nn.Conv1d(3, 1, 3), nn.Linear(1,2))
+
+  # if we would like to initialize the model's weights as 1 and its bias as 2,
+  # but the weights in `reg` as 3 and its bias as 4, we can use the override key
+ model = FooNet()
+ init_cfg = dict(type='Constant', layer=['Conv1d','Conv2d'], val=1, bias=2,
+ override=dict(type='Constant', name='reg', val=3, bias=4))
+ # self.feat and self.cls will be initialized with dict(type='Constant', val=1, bias=2)
+ # The module called 'reg' will be initialized with dict(type='Constant', val=3, bias=4)
+ initialize(model, init_cfg)
+ # model.reg.weight
+ # Parameter containing:
+ # tensor([[[[3., 3., 3.],
+ # [3., 3., 3.],
+ # [3., 3., 3.]],
+ # ...,
+ # [[3., 3., 3.],
+ # [3., 3., 3.],
+ # [3., 3., 3.]]]], requires_grad=True)
+ ```
+
+- If `layer` is None in `init_cfg`, only the sub-module whose name is in `override` will be initialized, and the type and other arguments in `override` can be omitted.
+
+ ```python
+ model = FooNet()
+ init_cfg = dict(type='Constant', val=1, bias=2, override=dict(name='reg'))
+  # self.feat and self.cls will be initialized by PyTorch
+ # The module called 'reg' will be initialized with dict(type='Constant', val=1, bias=2)
+ initialize(model, init_cfg)
+ # model.reg.weight
+ # Parameter containing:
+ # tensor([[[[1., 1., 1.],
+ # [1., 1., 1.],
+ # [1., 1., 1.]],
+ # ...,
+ # [[1., 1., 1.],
+ # [1., 1., 1.],
+ # [1., 1., 1.]]]], requires_grad=True)
+ ```
+
+- If we define neither the `layer` key nor the `override` key, nothing will be initialized.
+
+- Invalid usage
+
+ ```python
+  # It is invalid when `override` does not contain the `name` key
+ init_cfg = dict(type='Constant', layer=['Conv1d','Conv2d'],
+ val=1, bias=2,
+ override=dict(type='Constant', val=3, bias=4))
+
+  # It is also invalid when `override` contains `name` and other args but lacks `type`
+ init_cfg = dict(type='Constant', layer=['Conv1d','Conv2d'],
+ val=1, bias=2,
+ override=dict(name='reg', val=3, bias=4))
+ ```
+
+3. Initialize model with the pretrained model
+
+ ```python
+ import torch.nn as nn
+ import torchvision.models as models
+ from mmcv.cnn import initialize
+
+ # initialize model with pretrained model
+ model = models.resnet50()
+ # model.conv1.weight
+ # Parameter containing:
+ # tensor([[[[-6.7435e-03, -2.3531e-02, -9.0143e-03, ..., -2.1245e-03,
+ # -1.8077e-03, 3.0338e-03],
+ # [-1.2603e-02, -2.7831e-02, 2.3187e-02, ..., -1.5793e-02,
+ # 1.1655e-02, 4.5889e-03],
+ # [-3.7916e-02, 1.2014e-02, 1.3815e-02, ..., -4.2651e-03,
+ # 1.7314e-02, -9.9998e-03],
+ # ...,
+
+ init_cfg = dict(type='Pretrained',
+ checkpoint='torchvision://resnet50')
+ initialize(model, init_cfg)
+ # model.conv1.weight
+ # Parameter containing:
+ # tensor([[[[ 1.3335e-02, 1.4664e-02, -1.5351e-02, ..., -4.0896e-02,
+ # -4.3034e-02, -7.0755e-02],
+ # [ 4.1205e-03, 5.8477e-03, 1.4948e-02, ..., 2.2060e-03,
+ # -2.0912e-02, -3.8517e-02],
+ # [ 2.2331e-02, 2.3595e-02, 1.6120e-02, ..., 1.0281e-01,
+ # 6.2641e-02, 5.1977e-02],
+ # ...,
+
+ # initialize weights of a sub-module with the specific part of a pretrained model by using 'prefix'
+ model = models.resnet50()
+ url = 'http://download.openmmlab.com/mmdetection/v2.0/retinanet/'\
+ 'retinanet_r50_fpn_1x_coco/'\
+ 'retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth'
+ init_cfg = dict(type='Pretrained',
+ checkpoint=url, prefix='backbone.')
+ initialize(model, init_cfg)
+ ```
+
+4. Initialize model inherited from BaseModule, Sequential, ModuleList
+
+   `BaseModule` is inherited from `torch.nn.Module`, and the only difference between them is that `BaseModule` implements `init_weights()`.
+
+ `Sequential` is inherited from `BaseModule` and `torch.nn.Sequential`.
+
+ `ModuleList` is inherited from `BaseModule` and `torch.nn.ModuleList`.
+
+   ```python
+ import torch.nn as nn
+ from mmcv.runner import BaseModule, Sequential, ModuleList
+
+ class FooConv1d(BaseModule):
+
+ def __init__(self, init_cfg=None):
+ super().__init__(init_cfg)
+ self.conv1d = nn.Conv1d(4, 1, 4)
+
+ def forward(self, x):
+ return self.conv1d(x)
+
+ class FooConv2d(BaseModule):
+
+ def __init__(self, init_cfg=None):
+ super().__init__(init_cfg)
+ self.conv2d = nn.Conv2d(3, 1, 3)
+
+ def forward(self, x):
+ return self.conv2d(x)
+
+ # BaseModule
+ init_cfg = dict(type='Constant', layer='Conv1d', val=0., bias=1.)
+ model = FooConv1d(init_cfg)
+ model.init_weights()
+ # model.conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+
+ # Sequential
+ init_cfg1 = dict(type='Constant', layer='Conv1d', val=0., bias=1.)
+ init_cfg2 = dict(type='Constant', layer='Conv2d', val=2., bias=3.)
+ model1 = FooConv1d(init_cfg1)
+ model2 = FooConv2d(init_cfg2)
+ seq_model = Sequential(model1, model2)
+ seq_model.init_weights()
+ # seq_model[0].conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+ # seq_model[1].conv2d.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+
+ # inner init_cfg has higher priority
+ model1 = FooConv1d(init_cfg1)
+ model2 = FooConv2d(init_cfg2)
+ init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=4., bias=5.)
+ seq_model = Sequential(model1, model2, init_cfg=init_cfg)
+ seq_model.init_weights()
+ # seq_model[0].conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+ # seq_model[1].conv2d.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+
+ # ModuleList
+ model1 = FooConv1d(init_cfg1)
+ model2 = FooConv2d(init_cfg2)
+ modellist = ModuleList([model1, model2])
+ modellist.init_weights()
+ # modellist[0].conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+ # modellist[1].conv2d.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+
+ # inner init_cfg has higher priority
+ model1 = FooConv1d(init_cfg1)
+ model2 = FooConv2d(init_cfg2)
+ init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=4., bias=5.)
+ modellist = ModuleList([model1, model2], init_cfg=init_cfg)
+ modellist.init_weights()
+ # modellist[0].conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+ # modellist[1].conv2d.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+   ```
+
+### Model Zoo
+
+Besides the torchvision pre-trained models, we also provide pre-trained models of the following CNNs:
+
+- VGG Caffe
+- ResNet Caffe
+- ResNeXt
+- ResNet with Group Normalization
+- ResNet with Group Normalization and Weight Standardization
+- HRNetV2
+- Res2Net
+- RegNet
+
+#### Model URLs in JSON
+
+The model zoo links in MMCV are managed by JSON files.
+Each file consists of key-value pairs of model names and their URLs or paths.
+An example JSON file looks like this:
+
+```json
+{
+ "model_a": "https://example.com/models/model_a_9e5bac.pth",
+ "model_b": "pretrain/model_b_ab3ef2c.pth"
+}
+```
+
+The default links of the pre-trained models hosted on OpenMMLab AWS can be found [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/model_zoo/open_mmlab.json).
+
+You may override the default links by putting `open-mmlab.json` under `MMCV_HOME`. If `MMCV_HOME` is not found in the environment, `~/.cache/mmcv` will be used by default. You may `export MMCV_HOME=/your/path` to use your own path.
+
+The external JSON files will be merged into the default one. If the same key is present in both the external and default JSON files, the external one will be used.
+
+#### Load Checkpoint
+
+The following types are supported for the `filename` argument of `mmcv.load_checkpoint()`.
+
+- filepath: The filepath of the checkpoint.
+- `http://xxx` and `https://xxx`: The link to download the checkpoint. The `SHA256` postfix should be contained in the filename.
+- `torchvision://xxx`: The model links in `torchvision.models`. Please refer to [torchvision](https://pytorch.org/docs/stable/torchvision/models.html) for details.
+- `open-mmlab://xxx`: The model links or filepath provided in default and additional json files.
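+
+For example, loading torchvision weights into a model might look like the following (a sketch using `load_checkpoint` from `mmcv.runner`):
+
+```python
+import torchvision.models as models
+from mmcv.runner import load_checkpoint
+
+model = models.resnet50()
+# returns the loaded checkpoint dict; strict=False tolerates missing/unexpected keys
+checkpoint = load_checkpoint(model, 'torchvision://resnet50', strict=False)
+```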
diff --git a/docs/understand_mmcv/config.md b/docs/understand_mmcv/config.md
new file mode 100644
index 0000000000000000000000000000000000000000..d0b669b8516c0281000a88c1bd41aac731dc8326
--- /dev/null
+++ b/docs/understand_mmcv/config.md
@@ -0,0 +1,200 @@
+## Config
+
+The `Config` class is used for manipulating configs and config files. It supports
+loading configs from multiple file formats, including **python**, **json** and **yaml**.
+It provides dict-like APIs to get and set values.
+
+Here is an example of the config file `test.py`.
+
+```python
+a = 1
+b = dict(b1=[0, 1, 2], b2=None)
+c = (1, 2)
+d = 'string'
+```
+
+To load and use configs
+
+```python
+>>> from mmcv import Config
+>>> cfg = Config.fromfile('test.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b1=[0, 1, 2], b2=None),
+... c=(1, 2),
+... d='string')
+```
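+
+Since `Config` behaves like a dict, values can be read and written through both attribute and item access:
+
+```python
+>>> cfg.a
+1
+>>> cfg['b']['b1']
+[0, 1, 2]
+>>> cfg.d = 'another string'  # values can be set the same way
+>>> cfg.d
+'another string'
+```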
+
+For all format configs, some predefined variables are supported. Each occurrence of `{{ var }}` will be replaced with the variable's real value.
+
+Currently, it supports four predefined variables:
+
+- `{{ fileDirname }}` - the currently opened file's dirname, e.g. /home/your-username/your-project/folder
+- `{{ fileBasename }}` - the currently opened file's basename, e.g. file.ext
+- `{{ fileBasenameNoExtension }}` - the currently opened file's basename with no file extension, e.g. file
+- `{{ fileExtname }}` - the currently opened file's extension, e.g. .ext
+
+These variable names are borrowed from [VS Code](https://code.visualstudio.com/docs/editor/variables-reference).
+
+Here is an example of a config with predefined variables.
+
+`config_a.py`
+
+```python
+a = 1
+b = './work_dir/{{ fileBasenameNoExtension }}'
+c = '{{ fileExtname }}'
+```
+
+```python
+>>> cfg = Config.fromfile('./config_a.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b='./work_dir/config_a',
+... c='.py')
+```
+
+For all format configs, inheritance is supported. To reuse fields in other config files,
+specify `_base_='./config_a.py'` or a list of configs `_base_=['./config_a.py', './config_b.py']`.
+Here are 4 examples of config inheritance.
+
+`config_a.py`
+
+```python
+a = 1
+b = dict(b1=[0, 1, 2], b2=None)
+```
+
+### Inherit from base config without overlapped keys
+
+`config_b.py`
+
+```python
+_base_ = './config_a.py'
+c = (1, 2)
+d = 'string'
+```
+
+```python
+>>> cfg = Config.fromfile('./config_b.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b1=[0, 1, 2], b2=None),
+... c=(1, 2),
+... d='string')
+```
+
+New fields in `config_b.py` are combined with the old fields in `config_a.py`.
+
+### Inherit from base config with overlapped keys
+
+`config_c.py`
+
+```python
+_base_ = './config_a.py'
+b = dict(b2=1)
+c = (1, 2)
+```
+
+```python
+>>> cfg = Config.fromfile('./config_c.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b1=[0, 1, 2], b2=1),
+... c=(1, 2))
+```
+
+`b.b2=None` in `config_a` is replaced with `b.b2=1` in `config_c.py`.
+
+### Inherit from base config with ignored fields
+
+`config_d.py`
+
+```python
+_base_ = './config_a.py'
+b = dict(_delete_=True, b2=None, b3=0.1)
+c = (1, 2)
+```
+
+```python
+>>> cfg = Config.fromfile('./config_d.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b2=None, b3=0.1),
+... c=(1, 2))
+```
+
+You may also set `_delete_=True` to ignore some fields in the base configs. All old keys `b1, b2` in `b` are replaced with the new keys `b2, b3`.
+
+### Inherit from multiple base configs (the base configs should not contain the same keys)
+
+`config_e.py`
+
+```python
+c = (1, 2)
+d = 'string'
+```
+
+`config_f.py`
+
+```python
+_base_ = ['./config_a.py', './config_e.py']
+```
+
+```python
+>>> cfg = Config.fromfile('./config_f.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b1=[0, 1, 2], b2=None),
+... c=(1, 2),
+... d='string')
+```
+
+### Reference variables from base
+
+You can reference variables defined in the base config using the following syntax.
+
+`base.py`
+
+```python
+item1 = 'a'
+item2 = dict(item3 = 'b')
+```
+
+`config_g.py`
+
+```python
+_base_ = ['./base.py']
+item = dict(a = {{ _base_.item1 }}, b = {{ _base_.item2.item3 }})
+```
+
+```python
+>>> cfg = Config.fromfile('./config_g.py')
+>>> print(cfg.pretty_text)
+item1 = 'a'
+item2 = dict(item3='b')
+item = dict(a='a', b='b')
+```
+
+### Add deprecation information in configs
+
+Deprecation information can be added in a config file, which will trigger a `UserWarning` when this config file is loaded.
+
+`deprecated_cfg.py`
+
+```python
+_base_ = 'expected_cfg.py'
+
+_deprecation_ = dict(
+ expected = 'expected_cfg.py', # optional to show expected config path in the warning information
+ reference = 'url to related PR' # optional to show reference link in the warning information
+)
+```
+
+```python
+>>> cfg = Config.fromfile('./deprecated_cfg.py')
+
+UserWarning: The config file deprecated_cfg.py will be deprecated in the future. Please use expected_cfg.py instead. More information can be found at https://github.com/open-mmlab/mmcv/pull/1275
+```
diff --git a/docs/en/understand_mmcv/data_process.md b/docs/understand_mmcv/data_process.md
similarity index 90%
rename from docs/en/understand_mmcv/data_process.md
rename to docs/understand_mmcv/data_process.md
index 167928f88528ee6b682a559582a1584c369a5d39..79e9281b6c88c907e6edfc6d03f73930b2cd51ef 100644
--- a/docs/en/understand_mmcv/data_process.md
+++ b/docs/understand_mmcv/data_process.md
@@ -2,7 +2,7 @@
### Image
-This module provides some image processing methods, which requires `opencv` to be installed first.
+This module provides some image processing methods, which requires `opencv` to be installed.
#### Read/Write/Show
@@ -118,7 +118,7 @@ mmcv.imflip(img, direction='vertical')
#### Crop
-`imcrop` can crop the image with one or more regions. Each region is represented by the upper left and lower right coordinates as (x1, y1, x2, y2).
+`imcrop` can crop the image with one or some regions, represented as (x1, y1, x2, y2).
```python
import mmcv
@@ -135,12 +135,12 @@ bboxes = np.array([[10, 10, 100, 120], [0, 0, 50, 50]])
patches = mmcv.imcrop(img, bboxes)
# crop two regions, and rescale the patches by 1.2x
-patches = mmcv.imcrop(img, bboxes, scale=1.2)
+patches = mmcv.imcrop(img, bboxes, scale_ratio=1.2)
```
#### Padding
-There are two methods, `impad` and `impad_to_multiple`, to pad an image to the
+There are two methods `impad` and `impad_to_multiple` to pad an image to the
specific size with given values.
```python
@@ -150,14 +150,14 @@ img = mmcv.imread('tests/data/color.jpg')
img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=0)
# pad the image to (1000, 1200) with different values for three channels.
-img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=(100, 50, 200))
+img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=[100, 50, 200])
# pad the image on left, right, top, bottom borders with all zeros
img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=0)
# pad the image on left, right, top, bottom borders with different values
# for three channels.
-img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=(100, 50, 200))
+img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=[100, 50, 200])
# pad an image so that each edge is a multiple of some value.
img_ = mmcv.impad_to_multiple(img, 32)
@@ -165,7 +165,7 @@ img_ = mmcv.impad_to_multiple(img, 32)
### Video
-This module provides the following functionalities:
+This module provides the following functionalities.
- A `VideoReader` class with friendly apis to read and convert videos.
- Some methods for editing (cut, concat, resize) videos.
@@ -232,7 +232,7 @@ mmcv.resize_video('test.mp4', 'resized2.mp4', ratio=2)
- IO
- Visualization
-- Flow warping
+- Flow warping
We provide two options to dump optical flow files: uncompressed and compressed.
The uncompressed way just dumps the floating numbers to a binary file. It is
@@ -265,12 +265,12 @@ mmcv.flowshow(flow)

-3. Flow warping
+3. Flow warping
```python
img1 = mmcv.imread('img1.jpg')
flow = mmcv.flowread('flow.flo')
-warped_img2 = mmcv.flow_warp(img1, flow)
+warped_img2 = mmcv.flow_warp(img1, flow)
```
img1 (left) and img2 (right)
@@ -281,6 +281,6 @@ optical flow (img2 -> img1)

-warped image and difference with ground truth
+warped image and difference with ground truth
-
+
diff --git a/docs/understand_mmcv/io.md b/docs/understand_mmcv/io.md
new file mode 100644
index 0000000000000000000000000000000000000000..f6c28dd425cb0bcc54ca5d92a3a3849103f47e2a
--- /dev/null
+++ b/docs/understand_mmcv/io.md
@@ -0,0 +1,247 @@
+## File IO
+
+This module provides two universal APIs to load and dump files of different formats.
+
+```{note}
+Since v1.3.16, the IO modules support loading (dumping) data from (to) different backends, respectively. More details are in PR [#1330](https://github.com/open-mmlab/mmcv/pull/1330).
+```
+
+### Load and dump data
+
+`mmcv` provides a universal API for loading and dumping data; the currently
+supported formats are json, yaml and pickle.
+
+#### Load from disk or dump to disk
+
+```python
+import mmcv
+
+# load data from a file
+data = mmcv.load('test.json')
+data = mmcv.load('test.yaml')
+data = mmcv.load('test.pkl')
+# load data from a file-like object
+with open('test.json', 'r') as f:
+ data = mmcv.load(f, file_format='json')
+
+# dump data to a string
+json_str = mmcv.dump(data, file_format='json')
+
+# dump data to a file with a filename (infer format from file extension)
+mmcv.dump(data, 'out.pkl')
+
+# dump data to a file with a file-like object
+with open('test.yaml', 'w') as f:
+    mmcv.dump(data, f, file_format='yaml')
+```
+
+#### Load from other backends or dump to other backends
+
+```python
+import mmcv
+
+# load data from a file
+data = mmcv.load('s3://bucket-name/test.json')
+data = mmcv.load('s3://bucket-name/test.yaml')
+data = mmcv.load('s3://bucket-name/test.pkl')
+
+# dump data to a file with a filename (infer format from file extension)
+mmcv.dump(data, 's3://bucket-name/out.pkl')
+```
+
+It is also very convenient to extend the API to support more file formats.
+All you need to do is to write a file handler inherited from `BaseFileHandler`
+and register it with one or several file formats.
+
+You need to implement at least 3 methods.
+
+```python
+import mmcv
+
+# To register multiple file formats, a list can be used as the argument.
+# @mmcv.register_handler(['txt', 'log'])
+@mmcv.register_handler('txt')
+class TxtHandler1(mmcv.BaseFileHandler):
+
+ def load_from_fileobj(self, file):
+ return file.read()
+
+ def dump_to_fileobj(self, obj, file):
+ file.write(str(obj))
+
+ def dump_to_str(self, obj, **kwargs):
+ return str(obj)
+```
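+
+Once the handler above is registered, it is picked up automatically from the file extension, so the generic `mmcv.load` and `mmcv.dump` work out of the box:
+
+```python
+import mmcv
+
+mmcv.dump('hello world', 'demo.txt')  # handled by TxtHandler1.dump_to_fileobj
+print(mmcv.load('demo.txt'))          # 'hello world'
+```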
+
+Here is an example of `PickleHandler`.
+
+```python
+import pickle
+
+class PickleHandler(mmcv.BaseFileHandler):
+
+ def load_from_fileobj(self, file, **kwargs):
+ return pickle.load(file, **kwargs)
+
+ def load_from_path(self, filepath, **kwargs):
+ return super(PickleHandler, self).load_from_path(
+ filepath, mode='rb', **kwargs)
+
+ def dump_to_str(self, obj, **kwargs):
+ kwargs.setdefault('protocol', 2)
+ return pickle.dumps(obj, **kwargs)
+
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ kwargs.setdefault('protocol', 2)
+ pickle.dump(obj, file, **kwargs)
+
+ def dump_to_path(self, obj, filepath, **kwargs):
+ super(PickleHandler, self).dump_to_path(
+ obj, filepath, mode='wb', **kwargs)
+```
+
+### Load a text file as a list or dict
+
+For example `a.txt` is a text file with 5 lines.
+
+```
+a
+b
+c
+d
+e
+```
+
+#### Load from disk
+
+Use `list_from_file` to load the list from `a.txt`.
+
+```python
+>>> mmcv.list_from_file('a.txt')
+['a', 'b', 'c', 'd', 'e']
+>>> mmcv.list_from_file('a.txt', offset=2)
+['c', 'd', 'e']
+>>> mmcv.list_from_file('a.txt', max_num=2)
+['a', 'b']
+>>> mmcv.list_from_file('a.txt', prefix='/mnt/')
+['/mnt/a', '/mnt/b', '/mnt/c', '/mnt/d', '/mnt/e']
+```
+
+For example `b.txt` is a text file with 3 lines.
+
+```
+1 cat
+2 dog cow
+3 panda
+```
+
+Then use `dict_from_file` to load the dict from `b.txt`.
+
+```python
+>>> mmcv.dict_from_file('b.txt')
+{'1': 'cat', '2': ['dog', 'cow'], '3': 'panda'}
+>>> mmcv.dict_from_file('b.txt', key_type=int)
+{1: 'cat', 2: ['dog', 'cow'], 3: 'panda'}
+```
+
+#### Load from other backends
+
+Use `list_from_file` to load the list from `s3://bucket-name/a.txt`.
+
+```python
+>>> mmcv.list_from_file('s3://bucket-name/a.txt')
+['a', 'b', 'c', 'd', 'e']
+>>> mmcv.list_from_file('s3://bucket-name/a.txt', offset=2)
+['c', 'd', 'e']
+>>> mmcv.list_from_file('s3://bucket-name/a.txt', max_num=2)
+['a', 'b']
+>>> mmcv.list_from_file('s3://bucket-name/a.txt', prefix='/mnt/')
+['/mnt/a', '/mnt/b', '/mnt/c', '/mnt/d', '/mnt/e']
+```
+
+Use `dict_from_file` to load the dict from `s3://bucket-name/b.txt`.
+
+```python
+>>> mmcv.dict_from_file('s3://bucket-name/b.txt')
+{'1': 'cat', '2': ['dog', 'cow'], '3': 'panda'}
+>>> mmcv.dict_from_file('s3://bucket-name/b.txt', key_type=int)
+{1: 'cat', 2: ['dog', 'cow'], 3: 'panda'}
+```
+
+### Load and dump checkpoints
+
+#### Load checkpoints from disk or save to disk
+
+We can read the checkpoints from disk or save to disk in the following way.
+
+```python
+import torch
+
+filepath1 = '/path/of/your/checkpoint1.pth'
+filepath2 = '/path/of/your/checkpoint2.pth'
+# read from filepath1
+checkpoint = torch.load(filepath1)
+# save to filepath2
+torch.save(checkpoint, filepath2)
+```
+
+MMCV provides many backends. `HardDiskBackend` is one of them and we can use it to read or save checkpoints.
+
+```python
+import io
+from mmcv.fileio.file_client import HardDiskBackend
+
+disk_backend = HardDiskBackend()
+with io.BytesIO(disk_backend.get(filepath1)) as buffer:
+ checkpoint = torch.load(buffer)
+with io.BytesIO() as buffer:
+    torch.save(checkpoint, buffer)
+    disk_backend.put(buffer.getvalue(), filepath2)
+```
+
+If we want to implement an interface which automatically selects the corresponding
+backend based on the file path, we can use the `FileClient`.
+For example, suppose we want to implement two methods for reading and saving checkpoints,
+which need to support different types of file paths: disk paths, network paths or others.
+
+```python
+import io
+
+import torch
+from mmcv.fileio.file_client import FileClient
+
+def load_checkpoint(path):
+    # infer the proper backend (disk, http, ...) from the path
+    file_client = FileClient.infer_client(uri=path)
+    with io.BytesIO(file_client.get(path)) as buffer:
+        checkpoint = torch.load(buffer)
+    return checkpoint
+
+def save_checkpoint(checkpoint, path):
+    file_client = FileClient.infer_client(uri=path)
+    with io.BytesIO() as buffer:
+        torch.save(checkpoint, buffer)
+        file_client.put(buffer.getvalue(), path)
+
+checkpoint = load_checkpoint(filepath1)
+save_checkpoint(checkpoint, filepath2)
+```
+
+#### Load checkpoints from the Internet
+
+```{note}
+Currently, it only supports reading checkpoints from the Internet, and does not support saving checkpoints to the Internet.
+```
+
+```python
+import io
+import torch
+from mmcv.fileio.file_client import HTTPBackend, FileClient
+
+filepath = 'http://path/of/your/checkpoint.pth'
+checkpoint = torch.utils.model_zoo.load_url(filepath)
+
+http_backend = HTTPBackend()
+with io.BytesIO(http_backend.get(filepath)) as buffer:
+ checkpoint = torch.load(buffer)
+
+file_client = FileClient.infer_client(uri=filepath)
+with io.BytesIO(file_client.get(filepath)) as buffer:
+ checkpoint = torch.load(buffer)
+```
diff --git a/docs/understand_mmcv/ops.md b/docs/understand_mmcv/ops.md
new file mode 100644
index 0000000000000000000000000000000000000000..2729e441c1318ca2850c21bf72df428910657f31
--- /dev/null
+++ b/docs/understand_mmcv/ops.md
@@ -0,0 +1,37 @@
+## CUDA ops
+
+We implement common CUDA ops used in detection, segmentation, etc.
+
+- AssignScoreWithK
+- BallQuery
+- BBoxOverlaps
+- CARAFE
+- CrissCrossAttention
+- ContextBlock
+- CornerPool
+- Deformable Convolution v1/v2
+- Deformable RoIPool
+- DynamicScatter
+- GatherPoints
+- FurthestPointSample
+- FurthestPointSampleWithDist
+- GeneralizedAttention
+- GroupPoints
+- KNN
+- MaskedConv
+- NMS
+- PSAMask
+- RoIPointPool3d
+- RoIPool
+- RoIAlign
+- RoIAwarePool3d
+- SimpleRoIAlign
+- SigmoidFocalLoss
+- SoftmaxFocalLoss
+- SoftNMS
+- Synchronized BatchNorm
+- Voxelization
+- ThreeInterpolate
+- ThreeNN
+- Weight standardization
+- Correlation
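+
+These ops are exposed under the `mmcv.ops` namespace. As a small illustration, `nms` can be called on plain tensors (a sketch; it runs on CPU tensors as well):
+
+```python
+import torch
+from mmcv.ops import nms
+
+boxes = torch.tensor([[0., 0., 10., 10.],
+                      [1., 1., 11., 11.],
+                      [20., 20., 30., 30.]])
+scores = torch.tensor([0.9, 0.8, 0.7])
+# dets are the kept boxes with scores appended; keep holds their indices
+dets, keep = nms(boxes, scores, iou_threshold=0.5)
+```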
diff --git a/docs/understand_mmcv/registry.md b/docs/understand_mmcv/registry.md
new file mode 100644
index 0000000000000000000000000000000000000000..2cf10819fea6ac81645cc127c6b7aea54af19d5f
--- /dev/null
+++ b/docs/understand_mmcv/registry.md
@@ -0,0 +1,155 @@
+## Registry
+
+MMCV implements [registry](https://github.com/open-mmlab/mmcv/blob/master/mmcv/utils/registry.py) to manage different modules that share similar functionalities, e.g., backbones, heads, and necks, in detectors.
+Most projects in OpenMMLab use registry to manage modules of datasets and models, such as [MMDetection](https://github.com/open-mmlab/mmdetection), [MMDetection3D](https://github.com/open-mmlab/mmdetection3d), [MMClassification](https://github.com/open-mmlab/mmclassification), [MMEditing](https://github.com/open-mmlab/mmediting), etc.
+
+### What is registry
+
+In MMCV, a registry can be regarded as a mapping from strings to classes.
+The classes contained in a single registry usually have similar APIs but implement different algorithms or support different datasets.
+With the registry, users can look up and instantiate a class through its corresponding string, and use the instantiated module as they want.
+One typical example is the config system in most OpenMMLab projects, which uses the registry to create hooks, runners, models, and datasets through configs.
+The API reference can be found [here](https://mmcv.readthedocs.io/en/latest/api.html?highlight=registry#mmcv.utils.Registry).
+
+To manage your modules in the codebase with `Registry`, there are three steps as below.
+
+1. Create a build method (optional, in most cases you can just use the default one).
+2. Create a registry.
+3. Use this registry to manage the modules.
+
+The `build_func` argument of `Registry` customizes how class instances are built; the default is `build_from_cfg`, implemented [here](https://mmcv.readthedocs.io/en/latest/api.html?highlight=registry#mmcv.utils.build_from_cfg).
+
+### A Simple Example
+
+Here we show a simple example of using registry to manage modules in a package.
+You can find more practical examples in OpenMMLab projects.
+
+Assume we want to implement a series of dataset converters that convert different formats of data to the expected format.
+We create a directory as a package named `converters`.
+In the package, we first create a file to implement builders, named `converters/builder.py`, as below
+
+```python
+from mmcv.utils import Registry
+# create a registry for converters
+CONVERTERS = Registry('converter')
+```
+
+Then we can implement different converters in the package. For example, implement `Converter1` in `converters/converter1.py`
+
+```python
+
+from .builder import CONVERTERS
+
+# use the registry to manage the module
+@CONVERTERS.register_module()
+class Converter1(object):
+ def __init__(self, a, b):
+ self.a = a
+ self.b = b
+```
+
+The key step in using the registry to manage modules is to register the implemented module into the registry `CONVERTERS` through
+`@CONVERTERS.register_module()` when you are creating the module. In this way, a mapping between a string and the class is built and maintained by `CONVERTERS`, as below
+
+```python
+'Converter1' -> <class 'Converter1'>
+```
+
+If the module is successfully registered, you can use this converter through configs as
+
+```python
+converter_cfg = dict(type='Converter1', a=a_value, b=b_value)
+converter = CONVERTERS.build(converter_cfg)
+```
+
+### Customize Build Function
+
+Suppose we would like to customize how converters are built; we can implement a customized `build_func` and pass it into the registry.
+
+```python
+from mmcv.utils import Registry
+
+# create a build function
+def build_converter(cfg, registry, *args, **kwargs):
+ cfg_ = cfg.copy()
+ converter_type = cfg_.pop('type')
+ if converter_type not in registry:
+ raise KeyError(f'Unrecognized converter type {converter_type}')
+ else:
+ converter_cls = registry.get(converter_type)
+
+ converter = converter_cls(*args, **kwargs, **cfg_)
+ return converter
+
+# create a registry for converters and pass ``build_converter`` function
+CONVERTERS = Registry('converter', build_func=build_converter)
+```
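+
+Building then works exactly as before; the custom `build_converter` is invoked under the hood (assuming `Converter1` from the earlier example has been registered to this registry):
+
+```python
+converter = CONVERTERS.build(dict(type='Converter1', a=1, b=2))
+```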
+
+```{note}
+In this example, we demonstrate how to use the `build_func` argument to customize the way to build a class instance.
+The functionality is similar to the default `build_from_cfg`, which is sufficient in most cases.
+`build_model_from_cfg` is also implemented to build PyTorch modules in `nn.Sequential`; you may use it directly instead of implementing your own.
+```
+
+### Hierarchical Registry
+
+You can also build modules from more than one OpenMMLab framework, e.g., you can use all backbones in [MMClassification](https://github.com/open-mmlab/mmclassification) for object detectors in [MMDetection](https://github.com/open-mmlab/mmdetection), or combine an object detection model in [MMDetection](https://github.com/open-mmlab/mmdetection) with a semantic segmentation model in [MMSegmentation](https://github.com/open-mmlab/mmsegmentation).
+
+All `MODELS` registries of downstream codebases are child registries of MMCV's `MODELS` registry.
+Basically, there are two ways to build a module from a child or sibling registry.
+
+1. Build from child registries.
+
+ For example:
+
+ In MMDetection we define:
+
+ ```python
+   import torch.nn as nn
+   from mmcv.utils import Registry
+ from mmcv.cnn import MODELS as MMCV_MODELS
+ MODELS = Registry('model', parent=MMCV_MODELS)
+
+ @MODELS.register_module()
+ class NetA(nn.Module):
+ def forward(self, x):
+ return x
+ ```
+
+ In MMClassification we define:
+
+ ```python
+   import torch.nn as nn
+   from mmcv.utils import Registry
+ from mmcv.cnn import MODELS as MMCV_MODELS
+ MODELS = Registry('model', parent=MMCV_MODELS)
+
+ @MODELS.register_module()
+ class NetB(nn.Module):
+ def forward(self, x):
+ return x + 1
+ ```
+
+   We can build the two nets in either MMDetection or MMClassification by:
+
+ ```python
+ from mmdet.models import MODELS
+ net_a = MODELS.build(cfg=dict(type='NetA'))
+ net_b = MODELS.build(cfg=dict(type='mmcls.NetB'))
+ ```
+
+ or
+
+ ```python
+ from mmcls.models import MODELS
+ net_a = MODELS.build(cfg=dict(type='mmdet.NetA'))
+ net_b = MODELS.build(cfg=dict(type='NetB'))
+ ```
+
+2. Build from parent registry.
+
+ The shared `MODELS` registry in MMCV is the parent registry for all downstream codebases (root registry):
+
+ ```python
+ from mmcv.cnn import MODELS as MMCV_MODELS
+ net_a = MMCV_MODELS.build(cfg=dict(type='mmdet.NetA'))
+ net_b = MMCV_MODELS.build(cfg=dict(type='mmcls.NetB'))
+ ```
diff --git a/docs/understand_mmcv/runner.md b/docs/understand_mmcv/runner.md
new file mode 100644
index 0000000000000000000000000000000000000000..2e6e3868335d92f94e98441a5c7ec6d0b92a960b
--- /dev/null
+++ b/docs/understand_mmcv/runner.md
@@ -0,0 +1,163 @@
+## Runner
+
+The runner class is designed to manage the training process. It eases training with less code demanded from users while staying flexible and configurable. The main features are as follows:
+
+- Support `EpochBasedRunner` and `IterBasedRunner` for different scenarios. Implementing customized runners is also allowed to meet customized needs.
+- Support customized workflow to allow switching between different modes while training. Currently, supported modes are train and val.
+- Enable extensibility through various hooks, including hooks defined in MMCV and customized ones.
+
+### EpochBasedRunner
+
+As its name indicates, workflow in `EpochBasedRunner` should be set based on epochs. For example, [('train', 2), ('val', 1)] means running 2 epochs for training and 1 epoch for validation, iteratively. And each epoch may contain multiple iterations. Currently, MMDetection uses `EpochBasedRunner` by default.
+
+Let's take a look at its core logic:
+
+```python
+# the condition to stop training
+while curr_epoch < max_epochs:
+ # traverse the workflow.
+ # e.g. workflow = [('train', 2), ('val', 1)]
+ for i, flow in enumerate(workflow):
+ # mode(e.g. train) determines which function to run
+ mode, epochs = flow
+ # epoch_runner will be either self.train() or self.val()
+ epoch_runner = getattr(self, mode)
+ # execute the corresponding function
+ for _ in range(epochs):
+ epoch_runner(data_loaders[i], **kwargs)
+```
+
+Currently, we support 2 modes: train and val. Let's take a train function for example and have a look at its core logic:
+
+```python
+# Currently, epoch_runner could be either train or val
+def train(self, data_loader, **kwargs):
+ # traverse the dataset and get batch data for 1 epoch
+ for i, data_batch in enumerate(data_loader):
+        # it will execute all before_train_iter functions in the registered hooks. You may want to watch out for the order.
+ self.call_hook('before_train_iter')
+ # set train_mode as False in val function
+ self.run_iter(data_batch, train_mode=True, **kwargs)
+ self.call_hook('after_train_iter')
+ self.call_hook('after_train_epoch')
+```
+
+### IterBasedRunner
+
+Different from `EpochBasedRunner`, workflow in `IterBasedRunner` should be set based on iterations. For example, [('train', 2), ('val', 1)] means running 2 iters for training and 1 iter for validation, iteratively. Currently, MMSegmentation uses `IterBasedRunner` by default.
+
+Let's take a look at its core logic:
+
+```python
+# Although we set the workflow by iters here, we might also need epoch information in some use cases. That can be provided by IterLoader.
+iter_loaders = [IterLoader(x) for x in data_loaders]
+# the condition to stop training
+while curr_iter < max_iters:
+ # traverse the workflow.
+ # e.g. workflow = [('train', 2), ('val', 1)]
+ for i, flow in enumerate(workflow):
+ # mode(e.g. train) determines which function to run
+ mode, iters = flow
+ # iter_runner will be either self.train() or self.val()
+ iter_runner = getattr(self, mode)
+ # execute the corresponding function
+ for _ in range(iters):
+ iter_runner(iter_loaders[i], **kwargs)
+```
+
+Currently, we support 2 modes: train and val. Let's take a val function for example and have a look at its core logic:
+
+```python
+# Currently, iter_runner could be either train or val
+def val(self, data_loader, **kwargs):
+ # get batch data for 1 iter
+ data_batch = next(data_loader)
+    # it will execute all before_val_iter functions in the registered hooks. You may want to watch out for the order.
+ self.call_hook('before_val_iter')
+ outputs = self.model.val_step(data_batch, self.optimizer, **kwargs)
+ self.outputs = outputs
+ self.call_hook('after_val_iter')
+```
+
+Other than the basic functionalities explained above, `EpochBasedRunner` and `IterBasedRunner` provide methods such as `resume`, `save_checkpoint` and `register_hook`. In case you are not familiar with the term hook mentioned earlier: essentially, a hook is a mechanism that alters or augments code behavior through a predefined API. It allows users to have their own code called under certain circumstances, which makes code extensible in a non-intrusive manner. (A dedicated tutorial is coming soon.)
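+
+As a taste of the mechanism, a minimal custom hook might look like this (a sketch; the hook name is hypothetical):
+
+```python
+from mmcv.runner import HOOKS, Hook
+
+@HOOKS.register_module()
+class PrintEpochHook(Hook):
+    """A toy hook that prints the epoch index after each training epoch."""
+
+    def after_train_epoch(self, runner):
+        print(f'finished epoch {runner.epoch}')
+```
+
+It could then be enabled through the config, e.g. `custom_hooks = [dict(type='PrintEpochHook')]`, as shown in step (3) below.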
+
+### A Simple Example
+
+We will walk you through the usage of runner with a classification task. The following code contains only the essential steps for demonstration purposes; these steps are necessary for any training task.
+
+**(1) Initialize dataloader, model, optimizer, etc.**
+
+```python
+# initialize model
+model = ...
+# initialize optimizer, typically, we set: cfg.optimizer = dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001)
+optimizer = build_optimizer(model, cfg.optimizer)
+# initialize the dataloader corresponding to the workflow (train/val)
+data_loaders = [
+ build_dataloader(
+ ds,
+ cfg.data.samples_per_gpu,
+ cfg.data.workers_per_gpu,
+ ...) for ds in dataset
+ ]
+```
+
+**(2) Initialize runner**
+
+```python
+runner = build_runner(
+ # cfg.runner is typically set as:
+ # runner = dict(type='EpochBasedRunner', max_epochs=200)
+ cfg.runner,
+ default_args=dict(
+ model=model,
+ batch_processor=None,
+ optimizer=optimizer,
+ logger=logger))
+```
+
+**(3) Register training hooks and customized hooks.**
+
+```python
+# register default hooks necessary for training
+runner.register_training_hooks(
+ # configs of learning rate, it is typically set as:
+ # lr_config = dict(policy='step', step=[100, 150])
+ cfg.lr_config,
+ # configuration of optimizer, e.g. grad_clip
+ optimizer_config,
+ # configuration of saving checkpoints, it is typically set as:
+    # checkpoint_config = dict(interval=1), saving checkpoints every epoch
+ cfg.checkpoint_config,
+ # configuration of logs
+ cfg.log_config,
+ ...)
+
+# register customized hooks
+# say we want to enable ema, then we could set custom_hooks=[dict(type='EMAHook')]
+if cfg.get('custom_hooks', None):
+    for hook_cfg in cfg.custom_hooks:
+ hook_cfg = hook_cfg.copy()
+ priority = hook_cfg.pop('priority', 'NORMAL')
+ hook = build_from_cfg(hook_cfg, HOOKS)
+ runner.register_hook(hook, priority=priority)
+```
+
+Then, we can use `resume` or `load_checkpoint` to load existing weights.
+
+**(4) Start training**
+
+```python
+# workflow is typically set as: workflow = [('train', 1)]
+# here the training begins.
+runner.run(data_loaders, cfg.workflow)
+```
+
+Let's take `EpochBasedRunner` as an example and go a little into the details of setting the workflow:
+
+- Say we only want to put train in the workflow; then we can set workflow = [('train', 1)]. The runner will only execute train iteratively in this case.
+- Say we want to put both train and val in the workflow; then we can set workflow = [('train', 3), ('val', 1)]. The runner will first execute train for 3 epochs and then switch to val mode and execute val for 1 epoch. The workflow will be repeated until the current epoch hits max_epochs.
+- The workflow is highly flexible. Therefore, you can set workflow = [('val', 1), ('train', 1)] if you would like the runner to validate first and train afterwards.
+
+The code we demonstrated above is already in `train.py` in MM repositories. Simply modify the corresponding keys in the configuration files and the script will execute the expected workflow automatically.
diff --git a/docs/understand_mmcv/utils.md b/docs/understand_mmcv/utils.md
new file mode 100644
index 0000000000000000000000000000000000000000..5d5e0adf9bc9fff34affa9c75a7d8fb3a937f650
--- /dev/null
+++ b/docs/understand_mmcv/utils.md
@@ -0,0 +1,74 @@
+## Utils
+
+### ProgressBar
+
+If you want to apply a method to a list of items and track the progress, `track_progress`
+is a good choice. It will display a progress bar showing the progress and ETA.
+
+```python
+import mmcv
+
+def func(item):
+ # do something
+ pass
+
+tasks = [item_1, item_2, ..., item_n]
+
+mmcv.track_progress(func, tasks)
+```
+
+The output is like the following.
+
+
+
+There is another method `track_parallel_progress`, which wraps multiprocessing and
+progress visualization.
+
+```python
+mmcv.track_parallel_progress(func, tasks, 8) # 8 workers
+```
+
+
+
+If you want to iterate or enumerate a list of items and track the progress, `track_iter_progress`
+is a good choice. It will display a progress bar showing the progress and ETA.
+
+```python
+import mmcv
+
+tasks = [item_1, item_2, ..., item_n]
+
+for task in mmcv.track_iter_progress(tasks):
+ # do something like print
+ print(task)
+
+for i, task in enumerate(mmcv.track_iter_progress(tasks)):
+ # do something like print
+ print(i)
+ print(task)
+```
+
+### Timer
+
+It is convenient to compute the runtime of a code block with `Timer`.
+
+```python
+import time
+
+import mmcv
+
+with mmcv.Timer():
+ # simulate some code block
+ time.sleep(1)
+```
+
+Or try `since_start()` and `since_last_check()`. The former returns the runtime
+since the timer started, and the latter returns the time since the last check.
+
+```python
+timer = mmcv.Timer()
+# code block 1 here
+print(timer.since_start())
+# code block 2 here
+print(timer.since_last_check())
+print(timer.since_start())
+```
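+
+To make the difference concrete, here is a sketch with explicit sleeps (in mmcv's `Timer`, `since_start()` also refreshes the internal check point):
+
+```python
+import time
+
+import mmcv
+
+timer = mmcv.Timer()
+time.sleep(0.5)
+print(timer.since_start())       # ~0.5, measured from creation
+time.sleep(0.5)
+print(timer.since_last_check())  # ~0.5, measured from the previous check
+print(timer.since_start())       # ~1.0, measured from creation
+```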
diff --git a/docs/en/understand_mmcv/visualization.md b/docs/understand_mmcv/visualization.md
similarity index 100%
rename from docs/en/understand_mmcv/visualization.md
rename to docs/understand_mmcv/visualization.md
diff --git a/docs/zh_cn/_static/version.json b/docs/zh_cn/_static/version.json
deleted file mode 100644
index 7ee4965d36ed96f63f484137921d156d19cc40da..0000000000000000000000000000000000000000
--- a/docs/zh_cn/_static/version.json
+++ /dev/null
@@ -1,575 +0,0 @@
-{
- "Linux": [
- {
- "cuda": "11.7",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "11.6",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "11.6",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.5",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.0",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "9.2",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "9.2",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- }
- ],
- "Windows": [
- {
- "cuda": "11.7",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "11.6",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "11.6",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.5",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.3",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "11.1",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "10.2",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "10.1",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2",
- "2.0.0rc1"
- ]
- }
- ],
- "macOS": [
- {
- "cuda": "cpu",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "mps",
- "torch": "1.13.x",
- "mmcv": [
- "2.0.0rc3"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.12.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.11.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.10.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.9.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.8.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.7.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- },
- {
- "cuda": "cpu",
- "torch": "1.6.x",
- "mmcv": [
- "2.0.0rc3",
- "2.0.0rc2"
- ]
- }
- ]
-}
diff --git a/docs/zh_cn/_templates/classtemplate.rst b/docs/zh_cn/_templates/classtemplate.rst
deleted file mode 100644
index 4f74842394ec9807fb1ae2d8f05a8a57e9a2e24c..0000000000000000000000000000000000000000
--- a/docs/zh_cn/_templates/classtemplate.rst
+++ /dev/null
@@ -1,14 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-.. currentmodule:: {{ module }}
-
-
-{{ name | underline}}
-
-.. autoclass:: {{ name }}
- :members:
-
-
-..
- autogenerated from source/_templates/classtemplate.rst
- note it does not have :inherited-members:
diff --git a/docs/zh_cn/api/arraymisc.rst b/docs/zh_cn/api/arraymisc.rst
deleted file mode 100644
index 28975eb76e94994c50d2fe52b8f34c7ce533e788..0000000000000000000000000000000000000000
--- a/docs/zh_cn/api/arraymisc.rst
+++ /dev/null
@@ -1,19 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.arraymisc
-===================================
-
-.. contents:: mmcv.arraymisc
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.arraymisc
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- quantize
- dequantize
diff --git a/docs/zh_cn/api/cnn.rst b/docs/zh_cn/api/cnn.rst
deleted file mode 100644
index 022191f179fdbe3b1644abbb96ffdc92e4e37e06..0000000000000000000000000000000000000000
--- a/docs/zh_cn/api/cnn.rst
+++ /dev/null
@@ -1,71 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.cnn
-===================================
-
-.. contents:: mmcv.cnn
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.cnn
-
-Module
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- ContextBlock
- Conv2d
- Conv3d
- ConvAWS2d
- ConvModule
- ConvTranspose2d
- ConvTranspose3d
- ConvWS2d
- DepthwiseSeparableConvModule
- GeneralizedAttention
- HSigmoid
- HSwish
- LayerScale
- Linear
- MaxPool2d
- MaxPool3d
- NonLocal1d
- NonLocal2d
- NonLocal3d
- Scale
- Swish
- Conv2dRFSearchOp
-
-Build Function
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- build_activation_layer
- build_conv_layer
- build_norm_layer
- build_padding_layer
- build_plugin_layer
- build_upsample_layer
-
-Miscellaneous
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- fuse_conv_bn
- conv_ws_2d
- is_norm
- make_res_layer
- make_vgg_layer
- get_model_complexity_info
diff --git a/docs/zh_cn/api/image.rst b/docs/zh_cn/api/image.rst
deleted file mode 100644
index 3b93484952cd0c45b9d103088b0677f93fe5615d..0000000000000000000000000000000000000000
--- a/docs/zh_cn/api/image.rst
+++ /dev/null
@@ -1,100 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.image
-===================================
-
-.. contents:: mmcv.image
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.image
-
-IO
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- imfrombytes
- imread
- imwrite
- use_backend
-
-Color Space
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- bgr2gray
- bgr2hls
- bgr2hsv
- bgr2rgb
- bgr2ycbcr
- gray2bgr
- gray2rgb
- hls2bgr
- hsv2bgr
- imconvert
- rgb2bgr
- rgb2gray
- rgb2ycbcr
- ycbcr2bgr
- ycbcr2rgb
-
-Geometric
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- cutout
- imcrop
- imflip
- impad
- impad_to_multiple
- imrescale
- imresize
- imresize_like
- imresize_to_multiple
- imrotate
- imshear
- imtranslate
- rescale_size
-
-Photometric
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- adjust_brightness
- adjust_color
- adjust_contrast
- adjust_hue
- adjust_lighting
- adjust_sharpness
- auto_contrast
- clahe
- imdenormalize
- imequalize
- iminvert
- imnormalize
- lut_transform
- posterize
- solarize
-
-Miscellaneous
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- tensor2imgs
diff --git a/docs/zh_cn/api/ops.rst b/docs/zh_cn/api/ops.rst
deleted file mode 100644
index b0290457bfa0c08f14d7fe346efccb33f388bdae..0000000000000000000000000000000000000000
--- a/docs/zh_cn/api/ops.rst
+++ /dev/null
@@ -1,135 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.ops
-===================================
-
-.. contents:: mmcv.ops
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.ops
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- BorderAlign
- CARAFE
- CARAFENaive
- CARAFEPack
- Conv2d
- ConvTranspose2d
- CornerPool
- Correlation
- CrissCrossAttention
- DeformConv2d
- DeformConv2dPack
- DeformRoIPool
- DeformRoIPoolPack
- DynamicScatter
- FusedBiasLeakyReLU
- GroupAll
- Linear
- MaskedConv2d
- MaxPool2d
- ModulatedDeformConv2d
- ModulatedDeformConv2dPack
- ModulatedDeformRoIPoolPack
- MultiScaleDeformableAttention
- PSAMask
- PointsSampler
- PrRoIPool
- QueryAndGroup
- RiRoIAlignRotated
- RoIAlign
- RoIAlignRotated
- RoIAwarePool3d
- RoIPointPool3d
- RoIPool
- SAConv2d
- SigmoidFocalLoss
- SimpleRoIAlign
- SoftmaxFocalLoss
- SparseConv2d
- SparseConv3d
- SparseConvTensor
- SparseConvTranspose2d
- SparseConvTranspose3d
- SparseInverseConv2d
- SparseInverseConv3d
- SparseMaxPool2d
- SparseMaxPool3d
- SparseModule
- SparseSequential
- SubMConv2d
- SubMConv3d
- SyncBatchNorm
- TINShift
- Voxelization
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- active_rotated_filter
- assign_score_withk
- ball_query
- batched_nms
- bbox_overlaps
- border_align
- box_iou_rotated
- boxes_iou3d
- boxes_iou_bev
- boxes_overlap_bev
- carafe
- carafe_naive
- chamfer_distance
- contour_expand
- convex_giou
- convex_iou
- deform_conv2d
- deform_roi_pool
- diff_iou_rotated_2d
- diff_iou_rotated_3d
- dynamic_scatter
- furthest_point_sample
- furthest_point_sample_with_dist
- fused_bias_leakyrelu
- gather_points
- grouping_operation
- knn
- masked_conv2d
- min_area_polygons
- modulated_deform_conv2d
- nms
- nms3d
- nms3d_normal
- nms_bev
- nms_match
- nms_normal_bev
- nms_rotated
- pixel_group
- point_sample
- points_in_boxes_all
- points_in_boxes_cpu
- points_in_boxes_part
- points_in_polygons
- prroi_pool
- rel_roi_point_to_rel_img_point
- riroi_align_rotated
- roi_align
- roi_align_rotated
- roi_pool
- rotated_feature_align
- scatter_nd
- sigmoid_focal_loss
- soft_nms
- softmax_focal_loss
- three_interpolate
- three_nn
- tin_shift
- upfirdn2d
- voxelization
diff --git a/docs/zh_cn/api/transforms.rst b/docs/zh_cn/api/transforms.rst
deleted file mode 100644
index b080133d6b7736398b855174c325169b8af92aae..0000000000000000000000000000000000000000
--- a/docs/zh_cn/api/transforms.rst
+++ /dev/null
@@ -1,60 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.transforms
-===================================
-
-.. currentmodule:: mmcv.transforms
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- BaseTransform
- TestTimeAug
-
-Loading
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- LoadAnnotations
- LoadImageFromFile
-
-Processing
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- CenterCrop
- MultiScaleFlipAug
- Normalize
- Pad
- RandomChoiceResize
- RandomFlip
- RandomGrayscale
- RandomResize
- Resize
- ToTensor
- ImageToTensor
-
-Wrapper
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- Compose
- KeyMapper
- RandomApply
- RandomChoice
- TransformBroadcaster
diff --git a/docs/zh_cn/api/utils.rst b/docs/zh_cn/api/utils.rst
deleted file mode 100644
index f2ff4c2a3872bc9ae0c2942debac5e5b523bd071..0000000000000000000000000000000000000000
--- a/docs/zh_cn/api/utils.rst
+++ /dev/null
@@ -1,23 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.utils
-===================================
-
-.. contents:: mmcv.utils
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.utils
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- IS_CUDA_AVAILABLE
- IS_MLU_AVAILABLE
- IS_MPS_AVAILABLE
- collect_env
- jit
- skip_no_elena
diff --git a/docs/zh_cn/api/video.rst b/docs/zh_cn/api/video.rst
deleted file mode 100644
index a6ebca0eb73afcf3f3f11aae8520e2782a310f13..0000000000000000000000000000000000000000
--- a/docs/zh_cn/api/video.rst
+++ /dev/null
@@ -1,56 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.video
-===================================
-
-.. contents:: mmcv.video
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.video
-
-IO
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- VideoReader
- Cache
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- frames2video
-
-Optical Flow
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- dequantize_flow
- flow_from_bytes
- flow_warp
- flowread
- flowwrite
- quantize_flow
- sparse_flow_from_bytes
-
-Video Processing
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- concat_video
- convert_video
- cut_video
- resize_video
diff --git a/docs/zh_cn/api/visualization.rst b/docs/zh_cn/api/visualization.rst
deleted file mode 100644
index 8f43ef27a441dcd9001a352cf18e97f8e615676d..0000000000000000000000000000000000000000
--- a/docs/zh_cn/api/visualization.rst
+++ /dev/null
@@ -1,50 +0,0 @@
-.. role:: hidden
- :class: hidden-section
-
-mmcv.visualization
-===================================
-
-.. contents:: mmcv.visualization
- :depth: 2
- :local:
- :backlinks: top
-
-.. currentmodule:: mmcv.visualization
-
-Color
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
- :template: classtemplate.rst
-
- Color
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- color_val
-
-Image
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- imshow
- imshow_bboxes
- imshow_det_bboxes
-
-Optical Flow
-----------------
-
-.. autosummary::
- :toctree: generated
- :nosignatures:
-
- flow2rgb
- flowshow
- make_color_wheel
diff --git a/docs/zh_cn/community/code_style.md b/docs/zh_cn/community/code_style.md
deleted file mode 100644
index 8ddb87c2391e07b848aa073287cc2a230da8c3ec..0000000000000000000000000000000000000000
--- a/docs/zh_cn/community/code_style.md
+++ /dev/null
@@ -1,609 +0,0 @@
-## Code Style
-
-### Style Standards
-
-#### PEP 8 — the official Python style guide
-
-[The official Python style guide](https://www.python.org/dev/peps/pep-0008/) covers the following topics, among others:
-
-- Code layout: conventions for blank lines, line breaks, and imports. A common question it answers: when a statement is too long for one line, where may it break?
-
-- Expressions: conventions for whitespace inside Python expressions.
-
-- Trailing commas: when a list is too long for one line and is written item-per-line as below, a trailing comma after the last item is recommended, since it simplifies appending items and keeps version-control diffs small.
-
- ```python
- # Correct:
- FILES = ['setup.cfg', 'tox.ini']
- # Correct:
- FILES = [
- 'setup.cfg',
- 'tox.ini',
- ]
- # Wrong:
- FILES = ['setup.cfg', 'tox.ini',]
- # Wrong:
- FILES = [
- 'setup.cfg',
- 'tox.ini'
- ]
- ```
-
-- Conventions for naming, comments, and type annotations, which we introduce in detail in later sections.
-
- "A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important." PEP 8 -- Style Guide for Python Code
-
-:::{note}
-PEP 8 is not absolute: consistency within a project takes precedence over PEP 8 itself. Every OpenMMLab project configures its style tooling in setup.cfg; please follow those settings. One example from PEP 8:
-
-```python
-# Correct:
-hypot2 = x*x + y*y
-# Wrong:
-hypot2 = x * x + y * y
-```
-
-This rule is meant to signal operator precedence, but OpenMMLab projects usually do not enable yapf's `ARITHMETIC_PRECEDENCE_INDICATION` option, so the formatter will not enforce the recommended style; the project settings prevail.
-:::
-
-#### The Google style guide
-
-[Google's style guide](https://google.github.io/styleguide/pyguide.html) includes a chapter on Python. Compared with PEP 8 it is considerably more detailed, and it consists of two parts: language rules and style rules.
-
-The language rules analyze the pros and cons of many Python language features, such as exceptions, lambda expressions, list comprehensions, and metaclasses, and give guidance on when to use them.
-
-The style rules stay close to PEP 8: most build directly on it, with some more detailed conventions, e.g. on function length, TODO comments, and access to file and socket objects.
-
-We recommend the guide as a reference rather than a hard requirement. First, it carries some Python 2 compatibility baggage, e.g. it asks that every class without a base class explicitly inherit from object, which is unnecessary in a Python-3-only codebase; follow this project's conventions instead. Second, as framework-level open-source software, OpenMMLab projects, and MMCV in particular, need not shy away from advanced techniques, although before using one you should seriously consider whether it is really necessary and seek broad review from other developers.
-
-One rule worth special attention concerns imports: the guide requires local packages to be imported by full path, with every module on its own line. This is usually unnecessary and does not match the project's current practice, so we adopt the following convention instead:
-
-```python
-# Correct
-from mmcv.cnn.bricks import (Conv2d, build_norm_layer, DropPath, MaxPool2d,
- Linear)
-from ..utils import ext_loader
-
-# Wrong
-from mmcv.cnn.bricks import Conv2d, build_norm_layer, DropPath, MaxPool2d, \
-    Linear  # use parentheses for line continuation, not backslashes
-from ...utils import is_str  # go up at most one level; deeper relative imports obscure the package structure
-```
-
-OpenMMLab projects format code automatically with the pre-commit tool; see [Contributing](./contributing.md#code-style) for details.
-
-### Naming Conventions
-
-#### Why naming matters
-
-Good names are the foundation of readable code. Basic naming conventions constrain how each kind of identifier is written, so a reader can tell from the name alone whether something is a class, a local variable, a global variable, and so on. Truly good naming further requires the author to understand clearly what a variable does and to express that well, so that the name alone conveys its meaning, and often the purpose of the surrounding code.
-
-#### Basic conventions
-
-| Type                      | Public           | Internal           |
-| ------------------------- | ---------------- | ------------------ |
-| Module                    | lower_with_under | \_lower_with_under |
-| Package                   | lower_with_under |                    |
-| Class                     | CapWords         | \_CapWords         |
-| Exception                 | CapWordsError    |                    |
-| Function (method)         | lower_with_under | \_lower_with_under |
-| Function/method parameter | lower_with_under |                    |
-| Global/class constant     | CAPS_WITH_UNDER  | \_CAPS_WITH_UNDER  |
-| Global/class variable     | lower_with_under | \_lower_with_under |
-| Variable                  | lower_with_under | \_lower_with_under |
-| Local variable            | lower_with_under |                    |
-
-Notes:
-
-- Avoid names that clash with reserved words; in the rare unavoidable case, append a trailing underscore, e.g. class\_.
-- Avoid overly terse names, apart from well-established conventions such as the loop variable i, the file handle f, and the exception e.
-- A variable that will never be used can be named \_, which linters will then ignore.
-
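-To make the table concrete, here is a minimal illustrative sketch; the module and every name in it are hypothetical, chosen only to exercise one row of the table each:
-
-```python
-"""image_loader: a hypothetical module, named in lower_with_under style."""
-
-MAX_RETRIES = 3  # public module-level constant: CAPS_WITH_UNDER
-_DEFAULT_TIMEOUT = 5  # internal constant: leading underscore
-
-
-class ImageLoaderError(Exception):  # exception: CapWords with an Error suffix
-    pass
-
-
-class ImageLoader:  # public class: CapWords
-
-    def load_image(self, file_path):  # public method and parameter: lower_with_under
-        num_retries = 0  # local variable: lower_with_under
-        return file_path, num_retries
-
-    def _check_cache(self):  # internal helper: leading underscore
-        pass
-```
-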
-#### Naming tips
-
-A good variable name satisfies three requirements:
-
-1. accurate and unambiguous
-2. neither too long nor too short
-3. consistent across the codebase
-
-```python
-# Wrong
-class Masks(metaclass=ABCMeta):  # the name shows neither that this is a base class nor whether these are instance or semantic masks
- pass
-
-# Correct
-class BaseInstanceMasks(metaclass=ABCMeta):
- pass
-
-# Wrong: variables with the same meaning in different places should share one name
-def __init__(self, inplanes, planes):
- pass
-
-def __init__(self, in_channels, out_channels):
- pass
-```
-
-Common patterns for function names:
-
-- verb + object: crop_img, init_weights
-- object + verb: imread, bbox_flip
-
-Mind the order of the function name and its parameters, so that the subject comes first and reads naturally:
-
-- check_keys_exist(key, container)
-- check_keys_contain(container, key)
-
-Avoid unconventional abbreviations in favor of the established ones, e.g. nb -> num_blocks, in_nc -> in_channels.
-
-### Docstring Conventions
-
-#### Why write docstrings
-
-A docstring is a detailed description of a class's or function's behavior and API. It serves two purposes: it helps other developers understand the code so they can debug and reuse it, and it is used to auto-generate the API reference on Read the Docs for community users who do not read the source.
-
-#### How to write docstrings
-
-Unlike a comment, a well-formed docstring follows a strict format so that the Python interpreter and Sphinx can parse it; see [PEP 257](https://www.python.org/dev/peps/pep-0257/) for the detailed conventions. Below we introduce the standard format of each kind of docstring by example, following the [Google style](https://zh-google-styleguide.readthedocs.io/en/latest/google-python-styleguide/python_style_rules/#comments).
-
-1. Module docstrings
-
-   The style guide recommends a docstring for every module (i.e. every Python file), but since most OpenMMLab code does not have one yet, module docstrings are not mandatory.
-
- ```python
- """A one line summary of the module or program, terminated by a period.
-
- Leave one blank line. The rest of this docstring should contain an
- overall description of the module or program. Optionally, it may also
- contain a brief description of exported classes and functions and/or usage
- examples.
-
- Typical usage example:
-
- foo = ClassFoo()
- bar = foo.FunctionBar()
- """
- ```
-
-2. Class docstrings
-
-   Class docstrings are the kind we write most often. Here, by OpenMMLab convention, we deviate from the Google style: instead of describing class attributes in an Attributes section, we describe the parameters of `__init__` in an Args section.
-
-   In Args, follow the format `parameter (type): Description.` to document each parameter's type and purpose. A parameter accepting several types can be written as `(float or str)`, and one that may be None as `(int, optional)`.
-
- ```python
- class BaseRunner(metaclass=ABCMeta):
- """The base class of Runner, a training helper for PyTorch.
-
- All subclasses should implement the following APIs:
-
- - ``run()``
- - ``train()``
- - ``val()``
- - ``save_checkpoint()``
-
- Args:
- model (:obj:`torch.nn.Module`): The model to be run.
- batch_processor (callable, optional): A callable method that process
- a data batch. The interface of this method should be
- ``batch_processor(model, data, train_mode) -> dict``.
- Defaults to None.
- optimizer (dict or :obj:`torch.optim.Optimizer`, optional): It can be
- either an optimizer (in most cases) or a dict of optimizers
- (in models that requires more than one optimizer, e.g., GAN).
- Defaults to None.
- work_dir (str, optional): The working directory to save checkpoints
- and logs. Defaults to None.
- logger (:obj:`logging.Logger`): Logger used during training.
- Defaults to None. (The default value is just for backward
- compatibility)
- meta (dict, optional): A dict records some import information such as
- environment info and seed, which will be logged in logger hook.
- Defaults to None.
- max_epochs (int, optional): Total training epochs. Defaults to None.
- max_iters (int, optional): Total training iterations. Defaults to None.
- """
-
- def __init__(self,
- model,
- batch_processor=None,
- optimizer=None,
- work_dir=None,
- logger=None,
- meta=None,
- max_iters=None,
- max_epochs=None):
- ...
- ```
-
-   In addition, the main class of an algorithm implementation should link to the original paper. If the code is adapted from another open-source implementation, add "modified from"; if it is copied verbatim from another codebase, add "copied from" and mind the source's license. Mathematical formulas can be added via .. math:: where necessary.
-
- ```python
-   # Adapted from a reference implementation
- # This func is modified from `detectron2
- # `_.
-
-   # Copied code
- # This code was copied from the `ubelt
- # library`_.
-
-   # Citing the paper & adding formulas
- class LabelSmoothLoss(nn.Module):
- r"""Initializer for the label smoothed cross entropy loss.
-
- Refers to `Rethinking the Inception Architecture for Computer Vision
- `_.
-
- This decreases gap between output scores and encourages generalization.
- Labels provided to forward can be one-hot like vectors (NxC) or class
- indices (Nx1).
- And this accepts linear combination of one-hot like labels from mixup or
- cutmix except multi-label task.
-
- Args:
- label_smooth_val (float): The degree of label smoothing.
- num_classes (int, optional): Number of classes. Defaults to None.
- mode (str): Refers to notes, Options are "original", "classy_vision",
- "multi_label". Defaults to "classy_vision".
- reduction (str): The method used to reduce the loss.
- Options are "none", "mean" and "sum". Defaults to 'mean'.
- loss_weight (float): Weight of the loss. Defaults to 1.0.
-
- Note:
- if the ``mode`` is "original", this will use the same label smooth
- method as the original paper as:
-
- .. math::
- (1-\epsilon)\delta_{k, y} + \frac{\epsilon}{K}
-
- where :math:`\epsilon` is the ``label_smooth_val``, :math:`K` is
- the ``num_classes`` and :math:`\delta_{k,y}` is Dirac delta,
- which equals 1 for k=y and 0 otherwise.
-
- if the ``mode`` is "classy_vision", this will use the same label
- smooth method as the `facebookresearch/ClassyVision
- `_ repo as:
-
- .. math::
- \frac{\delta_{k, y} + \epsilon/K}{1+\epsilon}
-
- if the ``mode`` is "multi_label", this will accept labels from
- multi-label task and smoothing them as:
-
- .. math::
- (1-2\epsilon)\delta_{k, y} + \epsilon
- ```
-
-```{note}
-Note that \`\`here\`\`, \`here\`, and "here" do not mean the same thing.
-
-In reStructuredText syntax, \`\`here\`\` marks a code fragment, \`here\` marks italics, and "here" has no special meaning and is generally used for string literals. The meaning of \`here\` differs from Markdown, so take extra care.
-There is also the more formal :obj:\`type\` notation for classes; given its verbosity it is not required and is generally used only for uncommon types.
-```
-
-3. Method (function) docstrings
-
-   A function docstring has essentially the same structure as a class docstring, plus documentation of the return value. For complex functions and classes, add an Examples section; for parameters that need longer remarks, add a Note section.
-
-   For a class or function that is nontrivial to use, a good example helps users grasp its usage far faster than long prose and parameter docs. Ideally the examples can be run directly in an interactive Python session and show the corresponding outputs. When there are several examples, short comments can both describe and visually separate them.
-
- ```python
- def import_modules_from_strings(imports, allow_failed_imports=False):
- """Import modules from the given list of strings.
-
- Args:
- imports (list | str | None): The given module names to be imported.
- allow_failed_imports (bool): If True, the failed imports will return
- None. Otherwise, an ImportError is raise. Defaults to False.
-
- Returns:
- List[module] | module | None: The imported modules.
- All these three lines in docstring will be compiled into the same
- line in readthedocs.
-
- Examples:
- >>> osp, sys = import_modules_from_strings(
- ... ['os.path', 'sys'])
- >>> import os.path as osp_
- >>> import sys as sys_
- >>> assert osp == osp_
- >>> assert sys == sys_
- """
- ...
- ```
-
-   If a function's interface changed in some version, state that in the docstring, adding a Note or Warning where necessary, for example:
-
- ```python
- class CheckpointHook(Hook):
- """Save checkpoints periodically.
-
- Args:
- out_dir (str, optional): The root directory to save checkpoints. If
- not specified, ``runner.work_dir`` will be used by default. If
- specified, the ``out_dir`` will be the concatenation of
- ``out_dir`` and the last level directory of ``runner.work_dir``.
- Defaults to None. `Changed in version 1.3.15.`
- file_client_args (dict, optional): Arguments to instantiate a
- FileClient. See :class:`mmcv.fileio.FileClient` for details.
- Defaults to None. `New in version 1.3.15.`
-
- Warning:
- Before v1.3.15, the ``out_dir`` argument indicates the path where the
- checkpoint is stored. However, in v1.3.15 and later, ``out_dir``
- indicates the root directory and the final path to save checkpoint is
- the concatenation of out_dir and the last level directory of
- ``runner.work_dir``. Suppose the value of ``out_dir`` is
- "/path/of/A" and the value of ``runner.work_dir`` is "/path/of/B",
- then the final path will be "/path/of/A/B".
- ```
-
-   If a parameter or return value is a dict whose fields need to be spelled out, use the following format:
-
- ```python
- def func(x):
- r"""
- Args:
- x (None): A dict with 2 keys, ``padded_targets``, and ``targets``.
-
- - ``targets`` (list[Tensor]): A list of tensors.
- Each tensor has the shape of :math:`(T_i)`. Each
- element is the index of a character.
- - ``padded_targets`` (Tensor): A tensor of shape :math:`(N)`.
- Each item is the length of a word.
-
- Returns:
- dict: A dict with 2 keys, ``padded_targets``, and ``targets``.
-
- - ``targets`` (list[Tensor]): A list of tensors.
- Each tensor has the shape of :math:`(T_i)`. Each
- element is the index of a character.
- - ``padded_targets`` (Tensor): A tensor of shape :math:`(N)`.
- Each item is the length of a word.
- """
- return x
- ```
-
-```{important}
-To generate the Read the Docs pages, docstrings must be written in reStructuredText; otherwise they will render incorrectly. Before opening a PR, it is best to build and preview the rendered docs.
-Syntax references:
-
-- [reStructuredText Primer - Sphinx documentation](https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#)
-- [Example Google Style Python Docstrings ‒ napoleon 0.7 documentation](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html#example-google)
-```
-
-### Comment Conventions
-
-#### Why write comments
-
-For an open-source project, collaboration within the team and with the community is essential, so sensible comments matter: uncommented code can become hard to understand even for its author within a few months, adding reading and maintenance cost.
-
-#### How to write comments
-
-The most comment-worthy parts of code are the tricky ones. If you will have to explain it at the next code review, comment it now. Complicated operations get a few lines of comments before they commence; non-obvious ones get a comment at the end of the line.
-— Google style guide
-
-```python
-# We use a weighted dictionary search to find out where i is in
-# the array. We extrapolate position based on the largest num
-# in the array and the array size and then do binary search to
-# get the exact number.
-if i & (i-1) == 0: # True if i is 0 or a power of 2.
-```
-
-To improve legibility, a trailing comment should start at least 2 spaces away from the code. On the other hand, never merely describe the code: assume the reader knows Python better than you do and only lacks the knowledge of what your code is trying to achieve.
-— Google style guide
-
-```python
-# Wrong:
-# Now go through the b array and make sure whenever i occurs
-# the next element is i+1
-
-# Wrong:
-if i & (i-1) == 0: # True if i bitwise and i-1 is 0.
-```
-
-Markdown syntax may be used inside comments, since developers are generally familiar with it and it aids communication, e.g. single backticks for code and variables (take care not to confuse this with the reStructuredText syntax used in docstrings)
-
-```python
-# `_reversed_padding_repeated_twice` is the padding to be passed to
-# `F.pad` if needed (e.g., for non-zero padding types that are
-# implemented as two ops: padding + conv). `F.pad` accepts paddings in
-# reverse order than the dimension.
-self._reversed_padding_repeated_twice = _reverse_repeat_tuple(self.padding, 2)
-```
-
-#### Comment examples
-
-1. From `mmcv/utils/registry.py`: for nontrivial control flow, a comment spells out the priority order.
-
- ```python
- # self.build_func will be set with the following priority:
- # 1. build_func
- # 2. parent.build_func
- # 3. build_from_cfg
- if build_func is None:
- if parent is not None:
- self.build_func = parent.build_func
- else:
- self.build_func = build_from_cfg
- else:
- self.build_func = build_func
- ```
-
-2. From `mmcv/runner/checkpoint.py`: for special handling in a bug fix, link the related issue so others can learn the background.
-
- ```python
- def _save_ckpt(checkpoint, file):
- # The 1.6 release of PyTorch switched torch.save to use a new
- # zipfile-based file format. It will cause RuntimeError when a
- # checkpoint was saved in high version (PyTorch version>=1.6.0) but
- # loaded in low version (PyTorch version<1.6.0). More details at
- # https://github.com/open-mmlab/mmpose/issues/904
- if digit_version(TORCH_VERSION) >= digit_version('1.6.0'):
- torch.save(checkpoint, file, _use_new_zipfile_serialization=False)
- else:
- torch.save(checkpoint, file)
- ```
-
-### Type Annotations
-
-#### Why annotate types
-
-Type annotations constrain or hint at the types of a function's variables. They guard the code's safety, improve readability, and help avoid type-related errors.
-Python does not enforce types; annotations are only hints, which your IDE typically parses to provide hints when you call the annotated code. There are also type-checking tools that use the annotations to flag potential problems and reduce bugs.
-Note that we usually do not need to annotate every function in a module:
-
-1. Public APIs should be annotated.
-2. Weigh safety and clarity against flexibility before annotating.
-3. Annotate code that is prone to type-related errors.
-4. Annotate code that is hard to understand.
-5. Annotate code whose types have stabilized. For mature code, annotating every function rarely costs much flexibility.
-
-#### How to write type annotations
-
-1. Function/method annotations; self and cls are usually left unannotated.
-
- ```python
- from typing import Optional, List, Tuple
-
-   # everything on one line
- def my_method(self, first_var: int) -> int:
- pass
-
-   # continued on a new line
- def my_method(
- self, first_var: int,
- second_var: float) -> Tuple[MyLongType1, MyLongType1, MyLongType1]:
- pass
-
-   # each part on its own line (depends on line width; best combined with yapf's automatic formatting)
- def my_method(
- self, first_var: int, second_var: float
- ) -> Tuple[MyLongType1, MyLongType1, MyLongType1]:
- pass
-
-   # referencing a type that is not yet defined
- class MyClass:
- def __init__(self,
- stack: List["MyClass"]) -> None:
- pass
- ```
-
-   Note: the types in annotations can be Python built-ins, custom classes, or the wrapper types provided by Python's typing module. Some common annotations:
-
- ```python
-   # numeric types
-   from numbers import Number
-
-   # optional type: the parameter may be None
-   from typing import Optional
-   def foo(var: Optional[int] = None):
-       pass
-
-   # union type: several types are accepted
-   from typing import Union
-   def foo(var: Union[float, str]):
-       pass
-
-   from typing import Sequence  # sequence types
-   from typing import Iterable  # iterable types
-   from typing import Any  # any type
-   from typing import Callable  # callable types
-
-   from typing import List, Dict  # generic list and dict
-   from typing import Tuple  # tuples have a special form
-   # Although list, tuple and dict are themselves generic since Python 3.9,
-   # we still use List, Tuple and Dict in annotations to support earlier versions.
-   # Also, prefer Sequence & Iterable & Mapping when annotating parameters;
-   # List, Tuple and Dict are mainly for return-value annotations.
-   # See https://docs.python.org/3/library/typing.html#typing.List
- ```
-
-2. Variable annotations, generally used when the type is hard to infer
-
- ```python
-   # Recommended: annotated assignment
-   a: Foo = SomeUndecoratedFunction()
-   a: List[int] = [1, 2, 3]  # List takes a single type parameter; use Union for mixed element types
-   b: Tuple[int, int] = (1, 2)  # fixed length 2
-   c: Tuple[int, ...] = (1, 2, 3)  # variable length
-   d: Dict[str, int] = {'a': 1, 'b': 2}
-
-   # Not recommended: end-of-line type comments
-   # Although this style appears in the Google guide, it exists only for
-   # Python 2.7 compatibility; since we support Python 3 only, do not use it.
-   a = SomeUndecoratedFunction()  # type: Foo
-   a = [1, 2, 3]  # type: List[int]
-   b = (1, 2, 3)  # type: Tuple[int, ...]
-   c = (1, "2", 3.5)  # type: Tuple[int, Text, float]
- ```
-
-3. Generics
-
-   We saw above that typing provides generic list and dict types; can we define similar generics ourselves?
-
- ```python
- from typing import TypeVar, Generic
-
- KT = TypeVar('KT')
- VT = TypeVar('VT')
-
- class Mapping(Generic[KT, VT]):
- def __init__(self, data: Dict[KT, VT]):
- self._data = data
-
- def __getitem__(self, key: KT) -> VT:
- return self._data[key]
- ```
-
-   This defines a mapping class with generic type parameters, used as follows:
-
- ```python
- mapping = Mapping[str, float]({'a': 0.5})
-   value: float = mapping['a']
- ```
-
-   We can also use TypeVar in a function signature to tie several types together:
-
- ```python
- from typing import TypeVar, List
-
- T = TypeVar('T') # Can be anything
- A = TypeVar('A', str, bytes) # Must be str or bytes
-
-
- def repeat(x: T, n: int) -> List[T]:
- """Return a list containing n references to x."""
- return [x]*n
-
-
- def longest(x: A, y: A) -> A:
- """Return the longest of two strings."""
- return x if len(x) >= len(y) else y
- ```
-
-For more on type annotations, see [typing](https://docs.python.org/3/library/typing.html).
-
-#### Type checkers
-
-[mypy](https://mypy.readthedocs.io/en/stable/) is a static type checker for Python. Based on your annotations, mypy checks that arguments, assignments, and so on match the declared types, catching potential bugs before runtime.
-
-For example, given the following script test.py:
-
-```python
-def foo(var: int) -> float:
- return float(var)
-
-a: str = foo('2.0')
-b: int = foo('3.0') # type: ignore
-```
-
-Running mypy test.py produces the report below: line 4 has two type errors, one in the function call and one in the assignment of the return value. Line 5 contains the same two errors, but they are suppressed by type: ignore; only a few special cases warrant such suppression.
-
-```
-test.py:4: error: Incompatible types in assignment (expression has type "float", variable has type "str")
-test.py:4: error: Argument 1 to "foo" has incompatible type "str"; expected "int"
-Found 2 errors in 1 file (checked 1 source file)
-```
diff --git a/docs/zh_cn/community/contributing.md b/docs/zh_cn/community/contributing.md
deleted file mode 100644
index e3aa781a5a31d14025dc4613e596195ede266bb7..0000000000000000000000000000000000000000
--- a/docs/zh_cn/community/contributing.md
+++ /dev/null
@@ -1,278 +0,0 @@
-## Contributing
-
-Welcome to the MMCV community. We are committed to building a cutting-edge computer vision foundational library, and contributions of every kind are welcome, including but not limited to
-
-**Bug fixes**
-
-The steps for fixing a bug in the code are as follows:
-
-1. If the change is substantial, it is best to open an issue first that correctly describes the symptom, the cause, and how to reproduce the bug, and to agree on a fix in the discussion.
-2. Fix the bug, add the corresponding unit tests, and open a pull request.
-
-**New features or components**
-
-1. If the new feature or module involves a large code change, it is best to open an issue first to confirm that the feature is necessary.
-2. Implement the feature, add unit tests, and open a pull request.
-
-**Documentation**
-
-Documentation fixes can be submitted directly as a pull request.
-
-To add documentation or translate it into another language:
-
-1. Open an issue to confirm that the addition is necessary.
-2. Add the documentation and open a pull request.
-
-### Pull Request Workflow
-
-If you are new to pull requests, don't worry: the following walks you through creating one step by step, starting from zero. To dig deeper into the pull request development model, see the GitHub [official documentation](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)
-
-#### 1. Fork the repository
-
-When you submit a pull request for the first time, fork the original OpenMMLab repository by clicking the **Fork** button in the top-right corner of the GitHub page; the fork will appear under your GitHub profile.
-
-Clone the repository to your machine
-
-```shell
-git clone git@github.com:{username}/mmcv.git
-```
-
-Add the original repository as the upstream remote
-
-```bash
-git remote add upstream git@github.com:open-mmlab/mmcv
-```
-
-Check that the remote was added successfully by running `git remote -v` in a terminal
-
-```bash
-origin git@github.com:{username}/mmcv.git (fetch)
-origin git@github.com:{username}/mmcv.git (push)
-upstream git@github.com:open-mmlab/mmcv (fetch)
-upstream git@github.com:open-mmlab/mmcv (push)
-```
-
-```{note}
-A quick note on origin and upstream: git clone creates a remote named origin by default, pointing at the repository you cloned, whereas upstream is added by us and points at the original repository. You can pick another name if you dislike upstream, e.g. open-mmlab. We usually push code to origin (the forked remote) and then open a pull request against upstream. If the submitted code conflicts with the latest upstream code, pull the latest code from upstream, resolve the conflicts on your local branch, and push to origin again.
-```
-
-#### 2. Configure pre-commit
-
-In the local development environment we use [pre-commit](https://pre-commit.com/#intro) to check code style and keep it uniform. Before committing, install pre-commit (run inside the MMCV directory):
-
-```shell
-pip install -U pre-commit
-pre-commit install
-```
-
-Check that pre-commit is configured correctly and install the hooks from `.pre-commit-config.yaml`:
-
-```shell
-pre-commit run --all-files
-```
-
-```{note}
-If you are in China, installation may fail for network reasons; in that case you can use a domestic mirror configuration
-
-pre-commit install -c .pre-commit-config-zh-cn.yaml
-
-pre-commit run --all-files -c .pre-commit-config-zh-cn.yaml
-```
-
-If installation is interrupted, rerun `pre-commit run ...` to resume it.
-
-If the committed code violates the style rules, pre-commit will raise a warning and automatically fix some of the errors.
-
-If you want to bypass the pre-commit check temporarily for one commit, add `--no-verify` to `git commit` (the code finally pushed to the remote must still pass the pre-commit check).
-
-```shell
-git commit -m "xxx" --no-verify
-```
-
-#### 3. Create a development branch
-
-After installing pre-commit, create a development branch based on master; the suggested naming scheme is `username/pr_name`.
-
-```shell
-git checkout -b yhc/refactor_contributing_doc
-```
-
-Later on, if the local master branch falls behind the upstream master, first pull the upstream code to synchronize, then run the command above again
-
-```shell
-git pull upstream master
-```
-
-#### 4. Commit the code and pass the unit tests locally
-
-- MMCV introduces mypy for static type checking to make the code more robust, so commits should come with type hints. See this [tutorial](https://zhuanlan.zhihu.com/p/519335398) for the concrete rules.
-
-- The committed code also needs to pass the unit tests
-
- ```shell
-  # run the full test suite
- pytest tests
-
-  # the commit must pass the unit tests of the modified module, e.g. for runner
- pytest tests/test_runner/test_runner.py
- ```
-
-  If missing dependencies prevent you from running the unit tests of the modified module, see [Guidance - unit tests](#unit-tests)
-
-- If documentation was modified or added, follow the [guidance](#rendering-documentation) to confirm it renders correctly.
-
-#### 5. Push the code to the remote
-
-Once the code passes the unit tests and the pre-commit check, push it to the remote repository. For the first push you can add `-u` to `git push` to associate the remote branch
-
-```shell
-git push -u origin {branch_name}
-```
-
-After that, a bare `git push` suffices to push code, without specifying a branch or remote.
-
-#### 6. Open a pull request (PR)
-
-(1) Create the pull request on GitHub's Pull request page
-
-(2) Revise the PR description according to the template so that other developers can understand your change
-
-See the [pull request conventions](#pull-request-conventions) for the description format
-
-**Notes**
-
-(a) The PR description should explain the motivation, the content of the change, and its impact, and link the related issues (see the [documentation](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue))
-
-(b) If it is your first contribution to OpenMMLab, you need to sign the CLA
-
-(c) Check whether the PR passes CI (the integration tests)
-
-MMCV runs the unit tests on several platforms (Linux, Windows, macOS) against different versions of Python, PyTorch, and CUDA to guarantee the code's correctness. If any job fails, click `Details` in the check list to inspect the test logs and fix the code accordingly.
-
-(3) Once the PR passes CI, wait for reviews from other developers, address the reviewers' comments by repeating steps [4](#4-commit-the-code-and-pass-the-unit-tests-locally)-[5](#5-push-the-code-to-the-remote), and iterate until the reviewers approve the PR.
-
-After all reviewers approve, we will merge the PR into the main branch as soon as possible.
-
-#### 7. Resolve conflicts
-
-As the codebase keeps moving, your PR may come into conflict with the main branch, in which case the conflicts must be resolved. There are two ways to do this:
-
-```shell
-git fetch --all --prune
-git rebase upstream/master
-```
-
-or
-
-```shell
-git fetch --all --prune
-git merge upstream/master
-```
-
-If you are comfortable resolving conflicts, use rebase, which keeps your commit log clean. If you are not familiar with `rebase`, resolve the conflicts with `merge` instead.
-
-### Guidance
-
-#### Unit tests
-
-If you cannot run the unit tests of some modules, e.g. the [video](https://github.com/open-mmlab/mmcv/tree/master/mmcv/video) module, your environment may be missing the following dependencies
-
-```shell
-# Linux
-sudo apt-get update -y
-sudo apt-get install -y libturbojpeg
-sudo apt-get install -y ffmpeg
-
-# Windows
-conda install ffmpeg
-```
-
-When opening a pull request that fixes a bug or adds a feature, the unit tests should cover as much of the submitted code as possible. Coverage can be computed as follows
-
-```shell
-python -m coverage run -m pytest /path/to/test_file
-python -m coverage html
-# check file in htmlcov/index.html
-```
-
-#### Rendering documentation
-
-A pull request that fixes a bug or adds a feature may need to modify or add docstrings, and we must confirm that the rendered documentation looks right.
-The rendered docs can be generated locally as follows
-
-```shell
-pip install -r requirements/docs.txt
-cd docs/zh_cn/
-# or docs/en
-make html
-# check file in ./docs/zh_cn/_build/html/index.html
-```
-
-### Code style
-
-#### Python
-
-We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style for OpenMMLab libraries and use the following tools to lint and format the code
-
-- [flake8](https://github.com/PyCQA/flake8): the linter released by the Python community, a wrapper around several checking tools
-- [isort](https://github.com/timothycrosley/isort): a tool that automatically sorts module imports
-- [yapf](https://github.com/google/yapf): a code formatter released by Google
-- [codespell](https://github.com/codespell-project/codespell): checks for common spelling mistakes
-- [mdformat](https://github.com/executablebooks/mdformat): a checker for markdown files
-- [docformatter](https://github.com/myint/docformatter): a docstring formatter
-
-The configuration of yapf and isort lives in [setup.cfg](./setup.cfg)
-
-Through the [pre-commit hook](https://pre-commit.com/), committing automatically checks and formats `flake8`, `yapf`, `isort`, `trailing whitespaces`, and `markdown files`,
-fixes `end-of-files`, `double-quoted-strings`, `python-encoding-pragma`, and `mixed-line-ending`, and sorts the packages in `requirements.txt`.
-The pre-commit hook configuration lives in [.pre-commit-config](./.pre-commit-config.yaml).
-
-See [the pull request workflow](#2-configure-pre-commit) for how to install and use pre-commit.
-
-For more detailed conventions, see the [OpenMMLab code style](code_style.md).
-
-#### C++ and CUDA
-
-C++ and CUDA code follows the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html)
-
-### Pull request conventions
-
-1. Use the [pre-commit hook](https://pre-commit.com) to minimize style issues
-
-2. One pull request corresponds to one short-lived branch
-
-3. Keep pull requests fine-grained: one PR does one thing; avoid huge PRs
-
-   - Bad: implement Faster R-CNN
-   - Acceptable: add a box head to Faster R-CNN
-   - Good: add a parameter to the box head to support a configurable number of conv layers
-
-4. Provide a clear and meaningful commit message for every commit
-
-5. Provide a clear and meaningful pull request description
-
-   - The title states the task, typically formatted as: \[Prefix\] Short description of the pull request (Suffix)
-   - prefix: new feature \[Feature\], bug fix \[Fix\], documentation \[Docs\], work in progress \[WIP\] (will not be reviewed for now)
-   - The description covers the main changes, the results, and the impact on other parts of the project; refer to the pull request template
-   - Link the related issues and other pull requests
-
-6. If you introduce a third-party library or borrow code from one, confirm that its license is compatible with mmcv, and add `This code is inspired from http://` above the borrowed code
diff --git a/docs/zh_cn/community/pr.md b/docs/zh_cn/community/pr.md
deleted file mode 100644
index 427fdf9e4965e404970c761676e7edd29e7b2e56..0000000000000000000000000000000000000000
--- a/docs/zh_cn/community/pr.md
+++ /dev/null
@@ -1,3 +0,0 @@
-## Pull Request
-
-The content of this document has been migrated to the [contributing guide](contributing.md).
diff --git a/docs/zh_cn/docutils.conf b/docs/zh_cn/docutils.conf
deleted file mode 100644
index 0c00c84688701117f231fd0c8ec295fb747b7d8f..0000000000000000000000000000000000000000
--- a/docs/zh_cn/docutils.conf
+++ /dev/null
@@ -1,2 +0,0 @@
-[html writers]
-table_style: colwidths-auto
diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md
deleted file mode 100644
index 6cfb100c631b101fa0cff0650105a3cc7d735e7b..0000000000000000000000000000000000000000
--- a/docs/zh_cn/faq.md
+++ /dev/null
@@ -1,91 +0,0 @@
-## Frequently Asked Questions
-
-Here we list common problems and their solutions. If you hit another common problem and know a solution that could help others, feel free to enrich this list.
-
-### Installation
-
-- KeyError: "xxx: 'yyy is not in the zzz registry'"
-
-  The registry mechanism is only triggered when the file containing the module is imported, so you need to import that file somewhere, as sketched below. See [KeyError: "MaskRCNN: 'RefineRoIHead is not in the models registry'"](https://github.com/open-mmlab/mmdetection/issues/5974) for details.
-
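-  A minimal sketch of the fix; the module path, registry, and class here are hypothetical, and an mmdet-2.x-style `HEADS` registry is assumed, standing in for whatever your project defines:
-
-  ```python
-  # my_project/models/refine_roi_head.py (hypothetical)
-  from mmdet.models import HEADS  # assumption: an mmdet-style registry
-
-  @HEADS.register_module()
-  class RefineRoIHead:
-      pass
-
-  # The decorator above only runs when this file is imported, so expose it
-  # in my_project/models/__init__.py ...
-  #     from .refine_roi_head import RefineRoIHead
-  # ... or let the config import it for you:
-  #     custom_imports = dict(imports=['my_project.models.refine_roi_head'],
-  #                           allow_failed_imports=False)
-  ```
-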
-- "No module named 'mmcv.ops'"; "No module named 'mmcv.\_ext'"
-
- 1. 使用 `pip uninstall mmcv` 卸载您环境中的 mmcv
- 2. 参考 [installation instruction](https://mmcv.readthedocs.io/en/latest/get_started/installation.html) 或者 [Build MMCV from source](https://mmcv.readthedocs.io/en/latest/get_started/build.html) 安装 mmcv-full
-
-- "invalid device function" 或者 "no kernel image is available for execution"
-
- 1. 检查 GPU 的 CUDA 计算能力
- 2. 运行 `python mmdet/utils/collect_env.py` 来检查 PyTorch、torchvision 和 MMCV 是否是针对正确的 GPU 架构构建的,您可能需要去设置 `TORCH_CUDA_ARCH_LIST` 来重新安装 MMCV。兼容性问题可能会出现在使用旧版的 GPUs,如:colab 上的 Tesla K80 (3.7)
- 3. 检查运行环境是否和 mmcv/mmdet 编译时的环境相同。例如,您可能使用 CUDA 10.0 编译 mmcv,但在 CUDA 9.0 的环境中运行它
-
-- "undefined symbol" 或者 "cannot open xxx.so"
-
- 1. 如果符号和 CUDA/C++ 相关(例如:libcudart.so 或者 GLIBCXX),请检查 CUDA/GCC 运行时的版本是否和编译 mmcv 的一致
- 2. 如果符号和 PyTorch 相关(例如:符号包含 caffe、aten 和 TH),请检查 PyTorch 运行时的版本是否和编译 mmcv 的一致
- 3. 运行 `python mmdet/utils/collect_env.py` 以检查 PyTorch、torchvision 和 MMCV 构建和运行的环境是否相同
-
-- "RuntimeError: CUDA error: invalid configuration argument"
-
- 这个错误可能是由于您的 GPU 性能不佳造成的。尝试降低 [THREADS_PER_BLOCK](https://github.com/open-mmlab/mmcv/blob/cac22f8cf5a904477e3b5461b1cc36856c2793da/mmcv/ops/csrc/common_cuda_helper.hpp#L10)
- 的值并重新编译 mmcv。
-
-- "RuntimeError: nms is not compiled with GPU support"
-
- 这个错误是由于您的 CUDA 环境没有正确安装。
- 您可以尝试重新安装您的 CUDA 环境,然后删除 mmcv/build 文件夹并重新编译 mmcv。
-
-- "Segmentation fault"
-
- 1. 检查 GCC 的版本,通常是因为 PyTorch 版本与 GCC 版本不匹配 (例如 GCC \< 4.9 ),我们推荐用户使用 GCC 5.4,我们也不推荐使用 GCC 5.5, 因为有反馈 GCC 5.5 会导致 "segmentation fault" 并且切换到 GCC 5.4 就可以解决问题
- 2. 检查是否正确安装 CUDA 版本的 PyTorc。输入以下命令并检查是否返回 True
- ```shell
- python -c 'import torch; print(torch.cuda.is_available())'
- ```
- 3. 如果 `torch` 安装成功,那么检查 MMCV 是否安装成功。输入以下命令,如果没有报错说明 mmcv-full 安装成。
- ```shell
- python -c 'import mmcv; import mmcv.ops'
- ```
- 4. 如果 MMCV 与 PyTorch 都安装成功了,则可以使用 `ipdb` 设置断点或者使用 `print` 函数,分析是哪一部分的代码导致了 `segmentation fault`
-
-- "libtorch_cuda_cu.so: cannot open shared object file"
-
- `mmcv-full` 依赖 `libtorch_cuda_cu.so` 文件,但程序运行时没能找到该文件。我们可以检查该文件是否存在 `~/miniconda3/envs/{environment-name}/lib/python3.7/site-packages/torch/lib` 也可以尝试重装 PyTorch。
-
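-  A quick, environment-agnostic way to locate that directory (a sketch; it only inspects your local torch install):
-
-  ```python
-  import os
-  import torch
-
-  # The directory that should contain libtorch_cuda_cu.so
-  lib_dir = os.path.join(os.path.dirname(torch.__file__), 'lib')
-  print(lib_dir)
-  print(sorted(f for f in os.listdir(lib_dir) if f.startswith('libtorch')))
-  ```
-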
-- "fatal error C1189: #error: -- unsupported Microsoft Visual Studio version!"
-
- 如果您在 Windows 上编译 mmcv-full 并且 CUDA 的版本是 9.2,您很可能会遇到这个问题 `"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include\crt/host_config.h(133): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions 2012, 2013, 2015 and 2017 are supported!"`,您可以尝试使用低版本的 Microsoft Visual Studio,例如 vs2017。
-
-- "error: member "torch::jit::detail::ModulePolicy::all_slots" may not be initialized"
-
- 如果您在 Windows 上编译 mmcv-full 并且 PyTorch 的版本是 1.5.0,您很可能会遇到这个问题 `- torch/csrc/jit/api/module.h(474): error: member "torch::jit::detail::ModulePolicy::all_slots" may not be initialized`。解决这个问题的方法是将 `torch/csrc/jit/api/module.h` 文件中所有 `static constexpr bool all_slots = false;` 替换为 `static bool all_slots = false;`。更多细节可以查看 [member "torch::jit::detail::AttributePolicy::all_slots" may not be initialized](https://github.com/pytorch/pytorch/issues/39394)。
-
-- "error: a member with an in-class initializer must be const"
-
- 如果您在 Windows 上编译 mmcv-full 并且 PyTorch 的版本是 1.6.0,您很可能会遇到这个问题 `"- torch/include\torch/csrc/jit/api/module.h(483): error: a member with an in-class initializer must be const"`. 解决这个问题的方法是将 `torch/include\torch/csrc/jit/api/module.h` 文件中的所有 `CONSTEXPR_EXCEPT_WIN_CUDA ` 替换为 `const`。更多细节可以查看 [Ninja: build stopped: subcommand failed](https://github.com/open-mmlab/mmcv/issues/575)。
-
-- "error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized"
-
- 如果您在 Windows 上编译 mmcv-full 并且 PyTorch 的版本是 1.7.0,您很可能会遇到这个问题 `torch/include\torch/csrc/jit/ir/ir.h(1347): error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized`. 解决这个问题的方法是修改 PyTorch 中的几个文件:
-
- - 删除 `torch/include\torch/csrc/jit/ir/ir.h` 文件中的 `static constexpr Symbol Kind = ::c10::prim::profile;` 和 `tatic constexpr Symbol Kind = ::c10::prim::profile_optional;`
- - 将 `torch\include\pybind11\cast.h` 文件中的 `explicit operator type&() { return *(this->value); }` 替换为 `explicit operator type&() { return *((type*)this->value); }`
- - 将 `torch/include\torch/csrc/jit/api/module.h` 文件中的 所有 `CONSTEXPR_EXCEPT_WIN_CUDA` 替换为 `const`
-
- 更多细节可以查看 [Ensure default extra_compile_args](https://github.com/pytorch/pytorch/pull/45956)。
-
-- MMCV 和 MMDetection 的兼容性问题;"ConvWS is already registered in conv layer"
-
- 请参考 [installation instruction](https://mmdetection.readthedocs.io/en/latest/get_started.html#installation) 为您的 MMDetection 版本安装正确版本的 MMCV。
-
-### Usage
-
-- "RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one"
-
-  1. This error occurs when some parameters do not take part in computing the loss, typically because the code contains branches and some branches never contribute to the loss. See [Expected to have finished reduction in the prior iteration before starting a new one](https://github.com/pytorch/pytorch/issues/55582) for details.
-  2. Either set `find_unused_parameters` to `True` in DDP, or find the unused parameters by hand; both options are sketched after this list.
-
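-  A minimal sketch of the two options in plain PyTorch; it assumes an already initialized distributed process group, and `nn.Linear` merely stands in for your model:
-
-  ```python
-  import torch.nn as nn
-  from torch.nn.parallel import DistributedDataParallel as DDP
-
-  model = nn.Linear(4, 2).cuda()
-
-  # Option 1: let DDP tolerate parameters that receive no gradient
-  ddp_model = DDP(model, find_unused_parameters=True)
-
-  # Option 2 (cheaper at runtime): locate the offenders yourself.
-  # After loss.backward(), any parameter whose .grad is still None
-  # did not participate in the loss:
-  #     for name, param in model.named_parameters():
-  #         if param.grad is None:
-  #             print(name)
-  ```
-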
-- "RuntimeError: Trying to backward through the graph a second time"
-
- 不能同时设置 `GradientCumulativeOptimizerHook` 和 `OptimizerHook`,这会导致 `loss.backward()` 被调用两次,于是程序抛出 `RuntimeError`。我们只需设置其中的一个。更多细节见 [Trying to backward through the graph a second time](https://github.com/open-mmlab/mmcv/issues/1379)。
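-
-  A sketch of the relevant config fragment (hook names as used by mmcv 1.x-style runners):
-
-  ```python
-  # Correct: accumulate gradients over 4 iterations
-  optimizer_config = dict(
-      type='GradientCumulativeOptimizerHook', cumulative_iters=4)
-
-  # Also correct: the plain optimizer hook
-  # optimizer_config = dict(type='OptimizerHook', grad_clip=None)
-
-  # Wrong: registering both hooks calls loss.backward() twice per iteration
-  ```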
diff --git a/docs/zh_cn/get_started/article.md b/docs/zh_cn/get_started/article.md
deleted file mode 100644
index 96768502cedb607d58ea2dc8d17b3dd8b9af20b2..0000000000000000000000000000000000000000
--- a/docs/zh_cn/get_started/article.md
+++ /dev/null
@@ -1,63 +0,0 @@
-## Article Roundup
-
-This page collects some of the explainer articles from [OpenMMLab](https://www.zhihu.com/people/openmmlab) (more articles and videos are listed in [OpenMMLabCourse](https://github.com/open-mmlab/OpenMMLabCourse)). If you would like to recommend an article, not necessarily one published by OpenMMLab, your own posts included, you are very welcome to add it here via a [Pull Request](../community/pr.md).
-
-### MMCV articles
-
-#### Framework deep dives
-
-- [MMCV 核心组件分析(一):整体概述](https://zhuanlan.zhihu.com/p/336081587)
-- [MMCV 核心组件分析(二):FileHandler](https://zhuanlan.zhihu.com/p/336097883)
-- [MMCV 核心组件分析(三): FileClient](https://zhuanlan.zhihu.com/p/339190576)
-- [MMCV 核心组件分析(四): Config](https://zhuanlan.zhihu.com/p/346203167)
-- [MMCV 核心组件分析(五): Registry](https://zhuanlan.zhihu.com/p/355271993)
-- [MMCV 核心组件分析(六): Hook](https://zhuanlan.zhihu.com/p/355272220)
-- [MMCV 核心组件分析(七): Runner](https://zhuanlan.zhihu.com/p/355272459)
-- [MMCV Hook 食用指南](https://zhuanlan.zhihu.com/p/448600739)
-- [PyTorch & MMCV Dispatcher 机制解析](https://zhuanlan.zhihu.com/p/451671838)
-
-#### Tooling
-
-- [训练可视化工具哪款是你的菜?MMCV一行代码随你挑](https://zhuanlan.zhihu.com/p/387078211)
-
-#### Installation guides
-
-- [久等了!Windows 平台 MMCV 的预编译包终于来了!](https://zhuanlan.zhihu.com/p/441653536)
-- [Windows 环境从零安装 mmcv-full](https://zhuanlan.zhihu.com/p/434491590)
-
-#### Zhihu Q&A
-
-- [深度学习科研,如何高效进行代码和实验管理?](https://www.zhihu.com/question/269707221/answer/2480772257)
-- [深度学习方面的科研工作中的实验代码有什么规范和写作技巧?如何妥善管理实验数据?](https://www.zhihu.com/question/268193800/answer/2586000037)
-
-### Articles on downstream libraries
-
-- [MMDetection](https://mmdetection.readthedocs.io/zh_CN/latest/article.html)
-
-### PyTorch deep dives
-
-- [PyTorch1.11 亮点一览:TorchData、functorch、DDP 静态图](https://zhuanlan.zhihu.com/p/486222256)
-- [PyTorch1.12 亮点一览:DataPipe + TorchArrow 新的数据加载与处理范式](https://zhuanlan.zhihu.com/p/537868554)
-- [PyTorch 源码解读之 nn.Module:核心网络模块接口详解](https://zhuanlan.zhihu.com/p/340453841)
-- [PyTorch 源码解读之 torch.autograd:梯度计算详解](https://zhuanlan.zhihu.com/p/321449610)
-- [PyTorch 源码解读之 torch.utils.data:解析数据处理全流程](https://zhuanlan.zhihu.com/p/337850513)
-- [PyTorch 源码解读之 torch.optim:优化算法接口详解](https://zhuanlan.zhihu.com/p/346205754)
-- [PyTorch 源码解读之 DP & DDP:模型并行和分布式训练解析](https://zhuanlan.zhihu.com/p/343951042)
-- [PyTorch 源码解读之 BN & SyncBN:BN 与 多卡同步 BN 详解](https://zhuanlan.zhihu.com/p/337732517)
-- [PyTorch 源码解读之 torch.cuda.amp: 自动混合精度详解](https://zhuanlan.zhihu.com/p/348554267)
-- [PyTorch 源码解读之 cpp_extension:揭秘 C++/CUDA 算子实现和调用全流程](https://zhuanlan.zhihu.com/p/348555597)
-- [PyTorch 源码解读之即时编译篇](https://zhuanlan.zhihu.com/p/361101354)
-- [PyTorch 源码解读之分布式训练了解一下?](https://zhuanlan.zhihu.com/p/361314953)
-- [PyTorch 源码解读之 torch.serialization & torch.hub](https://zhuanlan.zhihu.com/p/364239544)
-
-### Miscellaneous
-
-- [困扰我 48 小时的深拷贝,今天终于...](https://zhuanlan.zhihu.com/p/470892209)
-- [拿什么拯救我的 4G 显卡](https://zhuanlan.zhihu.com/p/430123077)
-- [是谁偷偷动了我的 logger](https://zhuanlan.zhihu.com/p/481383590)
-- [三句话,让 logger 言听计从](https://zhuanlan.zhihu.com/p/487524917)
-- [Logging 不为人知的二三事](https://zhuanlan.zhihu.com/p/502610682)
-- [Type Hints 入门教程,让代码更加规范整洁](https://zhuanlan.zhihu.com/p/519335398)
-- [手把手教你如何高效地在 MMCV 中贡献算子](https://zhuanlan.zhihu.com/p/464492627)
-- [OpenMMLab 支持 IPU 训练芯片](https://zhuanlan.zhihu.com/p/517527926)
-- [基于 MMCV 走上开源大佬之路?](https://zhuanlan.zhihu.com/p/391144979)
diff --git a/docs/zh_cn/get_started/build.md b/docs/zh_cn/get_started/build.md
deleted file mode 100644
index 95f611bc2e0e616f83de448567d404c2e420981a..0000000000000000000000000000000000000000
--- a/docs/zh_cn/get_started/build.md
+++ /dev/null
@@ -1,300 +0,0 @@
-## Build MMCV from Source
-
-### Build mmcv
-
-Before building mmcv, make sure PyTorch is installed in the environment, following the [official PyTorch installation guide](https://pytorch.org/get-started/locally/#start-locally). This can be verified with
-
-```bash
-python -c 'import torch;print(torch.__version__)'
-```
-
-:::{note}
-
-- If cloning the repository is slow, you can clone with the following command instead (note: the mmcv on gitee is not guaranteed to match the one on github, as it is synchronized only once a day)
-
-```bash
-git clone https://gitee.com/open-mmlab/mmcv.git
-```
-
-- If you plan to use `opencv-python-headless` rather than `opencv-python`, e.g. in a minimal container or on a server without a GUI, install `opencv-python-headless` first, and installing mmcv's dependencies will then skip `opencv-python`.
-
-- If installing dependencies during the build takes too long, you can [configure a PyPI mirror](https://mirrors.tuna.tsinghua.edu.cn/help/pypi/)
-
-```bash
-pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
-```
-
-:::
-
-#### Build mmcv on Linux
-
-| TODO: video tutorial
-
-1. Clone the repository
-
- ```bash
- git clone https://github.com/open-mmlab/mmcv.git
- cd mmcv
- ```
-
-2. Install `ninja` and `psutil` to speed up compilation
-
- ```bash
- pip install -r requirements/optional.txt
- ```
-
-3. Check the nvcc version (9.2 or later is required; skip this if you have no GPU)
-
- ```bash
- nvcc --version
- ```
-
-   If the command prints something like the following, nvcc is set up correctly; otherwise you need to set CUDA_HOME
-
- ```
- nvcc: NVIDIA (R) Cuda compiler driver
- Copyright (c) 2005-2020 NVIDIA Corporation
- Built on Mon_Nov_30_19:08:53_PST_2020
- Cuda compilation tools, release 11.2, V11.2.67
- Build cuda_11.2.r11.2/compiler.29373293_0
- ```
-
- :::{note}
-   For ROCm support, install ROCm following [AMD ROCm](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html).
- :::
-
-4. Check the gcc version (**5.4** or later is required)
-
- ```bash
- gcc --version
- ```
-
-5. Start building (roughly 10 minutes)
-
- ```bash
- pip install -e . -v
- ```
-
-6. Verify the installation
-
- ```bash
- python .dev_scripts/check_installation.py
- ```
-
-   If the command raises no error, the installation succeeded. If it does, check whether the [FAQ page](../faq.md) already has a solution.
-
-   If not, feel free to open an [issue](https://github.com/open-mmlab/mmcv/issues).
-
-#### Build mmcv on macOS
-
-| TODO: video tutorial
-
-```{note}
-If you are on an Apple silicon Mac, install PyTorch 1.13+; otherwise you will run into the problems described in [issues#2218](https://github.com/open-mmlab/mmcv/issues/2218).
-```
-
-1. Clone the repository
-
- ```bash
- git clone https://github.com/open-mmlab/mmcv.git
- cd mmcv
- ```
-
-2. Install `ninja` and `psutil` to speed up compilation
-
- ```bash
- pip install -r requirements/optional.txt
- ```
-
-3. Start building
-
- ```bash
- pip install -e .
- ```
-
-4. Verify the installation
-
- ```bash
- python .dev_scripts/check_installation.py
- ```
-
-   If the command raises no error, the installation succeeded. If it does, check whether the [FAQ page](../faq.md) already has a solution.
-
-   If not, feel free to open an [issue](https://github.com/open-mmlab/mmcv/issues).
-
-#### Build mmcv on Windows
-
-| TODO: video tutorial
-
-Building mmcv on Windows is more involved than on Linux; this section walks through it step by step.
-
-##### Prerequisites
-
-Install the following dependencies first:
-
-- [Git](https://git-scm.com/download/win): during installation, select **add git to Path**
-- [Visual Studio Community 2019](https://visualstudio.microsoft.com): to compile the C++ and CUDA code
-- [Miniconda](https://docs.conda.io/en/latest/miniconda.html): the package manager
-- [CUDA 10.2](https://developer.nvidia.com/cuda-10.2-download-archive): not needed for a CPU-only build. When installing CUDA, customize the installation as needed, and if a newer GPU driver is already installed, consider deselecting the bundled driver
-
-```{note}
-If you are unsure how to install these dependencies, see [Installing mmcv on Windows from scratch](https://zhuanlan.zhihu.com/p/434491590).
-You also need to know how to set environment variables on Windows, in particular "PATH"; this is needed throughout the installation below.
-```
-
-##### Common steps
-
-1. Launch the Anaconda prompt from the Windows menu
-
-   As the Miniconda installer suggests, do not use the raw `cmd.exe` or `powershell.exe`. The prompt comes in two flavors, one based on PowerShell and one based on the traditional `cmd.exe`; the instructions below use the PowerShell one
-
-2. Create a new conda environment
-
- ```powershell
- (base) PS C:\Users\xxx> conda create --name mmcv python=3.7
-   (base) PS C:\Users\xxx> conda activate mmcv  # activate the environment before doing anything else
- ```
-
-3. Install PyTorch, with or without CUDA support as needed
-
- ```powershell
- # CUDA version
- (mmcv) PS C:\Users\xxx> conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
- # CPU version
-   (mmcv) PS C:\Users\xxx> conda install pytorch torchvision cpuonly -c pytorch
- ```
-
-4. Clone the repository
-
- ```powershell
- (mmcv) PS C:\Users\xxx> git clone https://github.com/open-mmlab/mmcv.git
- (mmcv) PS C:\Users\xxx> cd mmcv
- ```
-
-5. Install `ninja` and `psutil` to speed up compilation
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> pip install -r requirements/optional.txt
- ```
-
-6. Set up the MSVC compiler
-
-   Set the environment variable: add `C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\Hostx86\x64` to `PATH`, so that `cl.exe` can run from the prompt, as shown below.
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> cl
- Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29111 for x64
- Copyright (C) Microsoft Corporation. All rights reserved.
-
- usage: cl [ option... ] filename... [ / link linkoption... ]
- ```
-
-   For compatibility we use the x86-hosted, x64-targeted toolchain, i.e. the `Hostx86\x64` directory in the path.
-
-   Because PyTorch parses the output of `cl.exe` to check its version and only recognizes utf-8, you may need to switch the system language to English: Control Panel -> Region -> Administrative -> Language for non-Unicode programs.
-
-##### Build and install mmcv
-
-mmcv comes in two flavors:
-
-- a version with CPU operators only
-
-  Only the CPU operators are compiled (x86 only), and the resulting build runs in CPU-only mode
-
-- a version with both CPU and CUDA operators
-
-  Both the x86 and the CUDA code of the `ops` module are compiled, and the resulting build can run the operators on the GPU
-
-###### CPU version
-
-Build and install
-
-```powershell
-(mmcv) PS C:\Users\xxx\mmcv> python setup.py build_ext  # if successful, cl will be launched to compile the operators
-(mmcv) PS C:\Users\xxx\mmcv> python setup.py develop  # install
-```
-
-###### GPU version
-
-1. Check that `CUDA_PATH` or `CUDA_HOME` is already present in `envs`
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> ls env:
-
- Name Value
- ---- -----
- CUDA_PATH C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
- CUDA_PATH_V10_1 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
- CUDA_PATH_V10_2 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
- ```
-
-   If not, you can set it as follows
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> $env:CUDA_HOME = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2"
-   # or
-   (mmcv) PS C:\Users\xxx\mmcv> $env:CUDA_HOME = $env:CUDA_PATH_V10_2  # CUDA_PATH_V10_2 is already in the environment
- ```
-
-2. Set the target CUDA architecture
-
- ```powershell
-   # change this to the compute capability of your GPU
- (mmcv) PS C:\Users\xxx\mmcv> $env:TORCH_CUDA_ARCH_LIST="7.5"
- ```
-
- :::{note}
-   You can look up your GPU's compute capability at [cuda-gpus](https://developer.nvidia.com/cuda-gpus), or query it with the deviceQuery.exe tool shipped with CUDA
-
- ```powershell
- (mmcv) PS C:\Users\xxx\mmcv> &"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\extras\demo_suite\deviceQuery.exe"
- Device 0: "NVIDIA GeForce GTX 1660 SUPER"
- CUDA Driver Version / Runtime Version 11.7 / 11.1
- CUDA Capability Major/Minor version number: 7.5
- ```
-
-   The 7.5 above is the target architecture. Note: replace v10.2 in the command with your CUDA version.
- :::
-
-3. Build and install
-
- ```powershell
-   (mmcv) PS C:\Users\xxx\mmcv> python setup.py build_ext  # if successful, cl will be launched to compile the operators
-   (mmcv) PS C:\Users\xxx\mmcv> python setup.py develop  # install
- ```
-
- ```{note}
-   If your PyTorch version is 1.6.0, you may hit the errors mentioned in this [issue](https://github.com/pytorch/pytorch/issues/42467); you can patch your local PyTorch source following this [pull request](https://github.com/pytorch/pytorch/pull/43380/files)
- ```
-
-##### Verify the installation
-
-```powershell
-(mmcv) PS C:\Users\xxx\mmcv> python .dev_scripts/check_installation.py
-```
-
-If the command raises no error, the installation succeeded. If it does, check whether the [FAQ page](../faq.md) already has a solution.
-If not, feel free to open an [issue](https://github.com/open-mmlab/mmcv/issues).
-
-### Build mmcv-lite
-
-If you need the PyTorch-related modules, make sure PyTorch is installed in the environment, following the [official PyTorch installation guide](https://pytorch.org/get-started/locally/#start-locally).
-
-1. Clone the repository
-
- ```bash
- git clone https://github.com/open-mmlab/mmcv.git
- cd mmcv
- ```
-
-2. Start building
-
- ```bash
- MMCV_WITH_OPS=0 pip install -e . -v
- ```
-
-3. Verify the installation
-
- ```bash
- python -c 'import mmcv;print(mmcv.__version__)'
- ```
diff --git a/docs/zh_cn/get_started/installation.md b/docs/zh_cn/get_started/installation.md
deleted file mode 100644
index 54cdbd9f3ab9c2694e78013f5b3a5841730c54a5..0000000000000000000000000000000000000000
--- a/docs/zh_cn/get_started/installation.md
+++ /dev/null
@@ -1,369 +0,0 @@
-## Install MMCV
-
-MMCV comes in two versions:
-
-- **mmcv**: the full version, with all features and rich, ready-to-use CPU and CUDA operators. Note that the full version may take longer to build.
-- **mmcv-lite**: the lite version, without the CPU and CUDA operators but with every other feature, similar to MMCV before 1.0. Consider the lite version if you do not need the operators.
-
-```{warning}
-Do not install both versions in the same environment; otherwise you may hit errors like `ModuleNotFound`. Uninstall one before installing the other. `Installing the full mmcv is strongly recommended if CUDA is available`.
-```
-
-### Install mmcv
-
-Before installing mmcv, make sure PyTorch is installed in the environment, following the [official PyTorch installation guide](https://pytorch.org/get-started/locally/#start-locally). This can be verified with
-
-```bash
-python -c 'import torch;print(torch.__version__)'
-```
-
-If it prints a version number, PyTorch is installed.
-
-#### Install with mim (recommended)
-
-[mim](https://github.com/open-mmlab/mim) is the package manager for OpenMMLab projects and makes installing mmcv easy.
-
-```bash
-pip install -U openmim
-mim install "mmcv>=2.0.0rc1"
-```
-
-If the command above installs a source distribution (ending in `.tar.gz`) rather than a pre-built wheel (ending in `.whl`), we probably provide no pre-built package matching your PyTorch and CUDA versions, in which case you can [build mmcv from source](build.md).
-
-
-**Installation log when a pre-built wheel is used**
-
-```
-Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
-Collecting mmcv
-Downloading https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/mmcv-2.0.0rc3-cp38-cp38-manylinux1_x86_64.whl
-```
-
-**Installation log when a source distribution is used**
-
-```
-Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
-Collecting mmcv==2.0.0rc3
-Downloading mmcv-2.0.0rc3.tar.gz
-```
-
-To install a specific version of mmcv, e.g. 2.0.0rc3, run
-
-```bash
-mim install mmcv==2.0.0rc3
-```
-
-:::{note}
-If you plan to use `opencv-python-headless` rather than `opencv-python`, e.g. in a minimal container or on a server without a GUI, install `opencv-python-headless` first, and installing mmcv's dependencies will then skip `opencv-python`.
-
-Also, if installing the dependencies takes too long, you can specify a PyPI mirror
-
-```bash
-mim install "mmcv>=2.0.0rc1" -i https://pypi.tuna.tsinghua.edu.cn/simple
-```
-
-:::
-
-After installation, run the [check_installation.py](https://github.com/open-mmlab/mmcv/blob/2.x/.dev_scripts/check_installation.py) script to verify that mmcv was installed correctly.
-
-#### Install with pip
-
-Check your CUDA and PyTorch versions with
-
-```bash
-python -c 'import torch;print(torch.__version__);print(torch.version.cuda)'
-```
-
-Choose the matching installation command for your system type, CUDA version, PyTorch version, and MMCV version:
-
-*(An interactive version selector is rendered here on the documentation page.)*
-
-If the dropdown above does not list your combination, there is probably no pre-built package for your PyTorch, CUDA, or mmcv version, in which case you can [build mmcv from source](build.md).
-
-:::{note}
-PyTorch is usually compatible between 1.x.0 and 1.x.1, so mmcv only provides pre-built packages for 1.x.0. If your PyTorch is 1.x.1, you can safely install the mmcv built against 1.x.0; e.g. for PyTorch 1.8.1, simply pick 1.8.x.
-:::
-
-:::{note}
-If you plan to use `opencv-python-headless` rather than `opencv-python`, e.g. in a minimal container or on a server without a GUI, install `opencv-python-headless` first, and installing mmcv's dependencies will then skip `opencv-python`.
-
-Also, if installing the dependencies takes too long, you can specify a PyPI mirror
-
-```bash
-pip install "mmcv>=2.0.0rc1" -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html -i https://pypi.tuna.tsinghua.edu.cn/simple
-```
-
-:::
-
-After installation, run the [check_installation.py](https://github.com/open-mmlab/mmcv/blob/2.x/.dev_scripts/check_installation.py) script to verify that mmcv was installed correctly.
-
-#### Use a docker image
-
-Clone the repository and build the image
-
-```bash
-git clone https://github.com/open-mmlab/mmcv.git && cd mmcv
-docker build -t mmcv -f docker/release/Dockerfile .
-```
-
-Or build the image directly with
-
-```bash
-docker build -t mmcv https://github.com/open-mmlab/mmcv.git#2.x:docker/release
-```
-
-The [Dockerfile](release/Dockerfile) installs the latest mmcv by default; to pin a version, use
-
-```bash
-docker image build -t mmcv -f docker/release/Dockerfile --build-arg MMCV=2.0.0rc1 .
-```
-
-To use another PyTorch or CUDA version, specify them when building the image.
-
-For example, for PyTorch 1.11 and CUDA 11.3
-
-```bash
-docker build -t mmcv -f docker/release/Dockerfile \
- --build-arg PYTORCH=1.11.0 \
- --build-arg CUDA=11.3 \
- --build-arg CUDNN=8 \
- --build-arg MMCV=2.0.0rc1 .
-```
-
-More PyTorch and CUDA images are listed at [dockerhub/pytorch](https://hub.docker.com/r/pytorch/pytorch/tags).
-
-### Install mmcv-lite
-
-If you need the PyTorch-related modules, make sure PyTorch is installed in the environment, following the [official PyTorch installation guide](https://pytorch.org/get-started/locally/#start-locally).
-
-```bash
-pip install mmcv-lite
-```
diff --git a/docs/zh_cn/get_started/introduction.md b/docs/zh_cn/get_started/introduction.md
deleted file mode 100644
index 4c735b94d3db71e484d04794fb5509cabbed68a9..0000000000000000000000000000000000000000
--- a/docs/zh_cn/get_started/introduction.md
+++ /dev/null
@@ -1,36 +0,0 @@
-## Introducing MMCV
-
-MMCV is a foundational library for computer vision. It provides the following functionality (a short usage sketch follows the list):
-
-- [image and video processing](../understand_mmcv/data_process.md)
-- [image and annotation visualization](../understand_mmcv/visualization.md)
-- [image transformations](../understand_mmcv/data_transform.md)
-- [a collection of CNN building blocks](../understand_mmcv/cnn.md)
-- [high-quality implementations of common CUDA ops](../understand_mmcv/ops.md)
-
-MMCV 支持多种平台,包括:
-
-- Linux
-- Windows
-- macOS
-
-它支持的 OpenMMLab 项目:
-
-- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab 图像分类工具箱
-- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab 目标检测工具箱
-- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab 新一代通用 3D 目标检测平台
-- [MMRotate](https://github.com/open-mmlab/mmrotate): OpenMMLab 旋转框检测工具箱与测试基准
-- [MMYOLO](https://github.com/open-mmlab/mmyolo): OpenMMLab YOLO 系列工具箱与测试基准
-- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab 语义分割工具箱
-- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab 全流程文字检测识别理解工具箱
-- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab 姿态估计工具箱
-- [MMHuman3D](https://github.com/open-mmlab/mmhuman3d): OpenMMLab 人体参数化模型工具箱与测试基准
-- [MMSelfSup](https://github.com/open-mmlab/mmselfsup): OpenMMLab 自监督学习工具箱与测试基准
-- [MMRazor](https://github.com/open-mmlab/mmrazor): OpenMMLab 模型压缩工具箱与测试基准
-- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab 少样本学习工具箱与测试基准
-- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab 新一代视频理解工具箱
-- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab 一体化视频目标感知平台
-- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab 光流估计工具箱与测试基准
-- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab 图像视频编辑工具箱
-- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab 图片视频生成模型工具箱
-- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab 模型部署框架
diff --git a/docs/zh_cn/switch_language.md b/docs/zh_cn/switch_language.md
deleted file mode 100644
index e4ac4b229ad520f142243f3a918748c542e9989f..0000000000000000000000000000000000000000
--- a/docs/zh_cn/switch_language.md
+++ /dev/null
@@ -1,3 +0,0 @@
-## English
-
-## 简体中文
diff --git a/docs/zh_cn/understand_mmcv/cnn.md b/docs/zh_cn/understand_mmcv/cnn.md
deleted file mode 100644
index 1f910419b3c212faed2ec6926fa316600a846232..0000000000000000000000000000000000000000
--- a/docs/zh_cn/understand_mmcv/cnn.md
+++ /dev/null
@@ -1,114 +0,0 @@
-## 卷积神经网络
-
-我们为卷积神经网络提供了一些构建模块,包括层构建、模块组件和权重初始化。
-
-### 网络层的构建
-
-在运行实验时,我们可能需要尝试同属一种类型但不同配置的层,但又不希望每次都修改代码。于是我们提供一些层构建方法,可以从字典构建层,字典可以在配置文件中配置,也可以通过命令行参数指定。
-
-#### 用法
-
-一个简单的例子:
-
-```python
-from mmcv.cnn import build_conv_layer
-
-cfg = dict(type='Conv3d')
-layer = build_conv_layer(cfg, in_channels=3, out_channels=8, kernel_size=3)
-```
-
-- `build_conv_layer`: 支持的类型包括 Conv1d、Conv2d、Conv3d、Conv (Conv是Conv2d的别名)
-- `build_norm_layer`: 支持的类型包括 BN1d、BN2d、BN3d、BN(BN 是 BN2d 的别名)、SyncBN、GN、LN、IN1d、IN2d、IN3d、IN(IN 是 IN2d 的别名)
-- `build_activation_layer`:支持的类型包括 ReLU、LeakyReLU、PReLU、RReLU、ReLU6、ELU、Sigmoid、Tanh、GELU
-- `build_upsample_layer`: 支持的类型包括 nearest、bilinear、deconv、pixel_shuffle
-- `build_padding_layer`: 支持的类型包括 zero、reflect、replicate
-
-#### 拓展
-
-我们还允许自定义层和算子来扩展构建方法。
-
-1. 编写和注册自己的模块:
-
- ```python
- from mmengine.registry import MODELS
-
- @MODELS.register_module()
- class MyUpsample:
-
- def __init__(self, scale_factor):
- pass
-
- def forward(self, x):
- pass
- ```
-
-2. 在某处导入 `MyUpsample` (例如 `__init__.py` )然后使用它:
-
- ```python
- from mmcv.cnn import build_upsample_layer
-
- cfg = dict(type='MyUpsample', scale_factor=2)
- layer = build_upsample_layer(cfg)
- ```
-
-### 模块组件
-
-我们还提供了常用的模块组件,以方便网络构建。
-卷积组件 `ConvModule` 由 convolution、normalization以及activation layers 组成,更多细节请参考 [ConvModule api](api.html#mmcv.cnn.ConvModule)。
-
-```python
-from mmcv.cnn import ConvModule
-
-# conv + bn + relu
-conv = ConvModule(3, 8, 2, norm_cfg=dict(type='BN'))
-# conv + gn + relu
-conv = ConvModule(3, 8, 2, norm_cfg=dict(type='GN', num_groups=2))
-# conv + relu
-conv = ConvModule(3, 8, 2)
-# conv
-conv = ConvModule(3, 8, 2, act_cfg=None)
-# conv + leaky relu
-conv = ConvModule(3, 8, 3, padding=1, act_cfg=dict(type='LeakyReLU'))
-# bn + conv + relu
-conv = ConvModule(
- 3, 8, 2, norm_cfg=dict(type='BN'), order=('norm', 'conv', 'act'))
-```
-
-### Model Zoo
-
-除了`torchvision`的预训练模型,我们还提供以下 CNN 的预训练模型:
-
-- VGG Caffe
-- ResNet Caffe
-- ResNeXt
-- ResNet with Group Normalization
-- ResNet with Group Normalization and Weight Standardization
-- HRNetV2
-- Res2Net
-- RegNet
-
-#### Model URLs in JSON
-
-MMCV 中的 Model Zoo 链接由 JSON 文件管理。JSON 文件由模型名称及其 url 或 path 的键值对组成,一个 JSON 文件的内容可能如下:
-
-```json
-{
- "model_a": "https://example.com/models/model_a_9e5bac.pth",
- "model_b": "pretrain/model_b_ab3ef2c.pth"
-}
-```
-
-可以在[此处](https://github.com/open-mmlab/mmcv/blob/master/mmcv/model_zoo/open_mmlab.json)找到托管在 OpenMMLab AWS 上的预训练模型的默认链接。
-
-你可以通过将 `open-mmlab.json` 放在 `MMCV_HOME`下来覆盖默认链接,如果在环境中找不到`MMCV_HOME`,则默认使用 `~/.cache/mmcv`。当然你也可以使用命令 `export MMCV_HOME=/your/path`来设置自己的路径。
-
-外部的json文件将被合并为默认文件,如果相同的键出现在外部`json`和默认`json`中,则将使用外部`json`。
-
-#### Load Checkpoint
-
-`mmcv.load_checkpoint()`的参数`filename`支持以下类型(列表后附使用示意):
-
-- filepath: `checkpoint`路径
-- `http://xxx` 和 `https://xxx`: 下载 checkpoint 的链接,文件名中必须包含`SHA256`后缀
-- `torchvision://xxx`: `torchvision.models`中的模型链接,更多细节参考 [torchvision](https://pytorch.org/docs/stable/torchvision/models.html)
-- `open-mmlab://xxx`: 默认和其他 json 文件中提供的模型链接或文件路径
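-
-下面给出一个简单的使用示意(仅为示意:假设环境中已安装 torchvision,并以 `mmcv.runner.load_checkpoint` 为入口,具体以实际版本的接口为准):
-
-```python
-import torchvision
-from mmcv.runner import load_checkpoint
-
-model = torchvision.models.resnet50()
-# 使用 torchvision:// 前缀从 torchvision 的模型库加载权重(需要网络可用)
-load_checkpoint(model, 'torchvision://resnet50')
-# 也可以传入本地路径或 http(s) 链接,例如上文 JSON 示例中的路径
-# load_checkpoint(model, 'pretrain/model_b_ab3ef2c.pth')
-```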
diff --git a/docs/zh_cn/understand_mmcv/data_transform.md b/docs/zh_cn/understand_mmcv/data_transform.md
deleted file mode 100644
index 47d16e1b5279cdcdf8700876d3d94e152b3181a0..0000000000000000000000000000000000000000
--- a/docs/zh_cn/understand_mmcv/data_transform.md
+++ /dev/null
@@ -1,341 +0,0 @@
-# 数据变换
-
-在 OpenMMLab 算法库中,数据集的构建和数据的准备是相互解耦的。通常,数据集的构建只对数据集进行解析,记录每个样本的基本信息;而数据的准备则是通过一系列的数据变换,根据样本的基本信息进行数据加载、预处理、格式化等操作。
-
-## 数据变换的设计
-
-在 MMCV 中,我们使用各种可调用的数据变换类来进行数据的操作。这些数据变换类可以接受若干配置参数进行实例化,之后通过调用的方式对输入的数据字典进行处理。同时,我们约定所有数据变换都接受一个字典作为输入,并将处理后的数据输出为一个字典。一个简单的例子如下:
-
-```python
->>> import numpy as np
->>> from mmcv.transforms import Resize
->>>
->>> transform = Resize(scale=(224, 224))
->>> data_dict = {'img': np.random.rand(256, 256, 3)}
->>> data_dict = transform(data_dict)
->>> print(data_dict['img'].shape)
-(224, 224, 3)
-```
-
-数据变换类会读取输入字典的某些字段,并且可能添加、或者更新某些字段。这些字段的键大部分情况下是固定的,如 `Resize` 会固定地读取输入字典中的 `"img"` 等字段。我们可以在对应类的文档中了解对输入输出字段的约定。
-
-```{note}
-默认情况下,在需要图像尺寸作为**初始化参数**的数据变换 (如Resize, Pad) 中,图像尺寸的顺序均为 (width, height)。在数据变换**返回的字典**中,图像相关的尺寸, 如 `img_shape`、`ori_shape`、`pad_shape` 等,均为 (height, width)。
-```
-
-MMCV 为所有的数据变换类提供了一个统一的基类 (`BaseTransform`):
-
-```python
-class BaseTransform(metaclass=ABCMeta):
-
- def __call__(self, results: dict) -> dict:
-
- return self.transform(results)
-
- @abstractmethod
- def transform(self, results: dict) -> dict:
- pass
-```
-
-所有的数据变换类都需要继承 `BaseTransform`,并实现 `transform` 方法。`transform` 方法的输入和输出均为一个字典。在**自定义数据变换类**一节中,我们会更详细地介绍如何实现一个数据变换类。
-
-## 数据流水线
-
-如上所述,所有数据变换的输入和输出都是一个字典,而且根据 OpenMMLab 中 [有关数据集的约定](TODO),数据集中每个样本的基本信息都是一个字典。这样一来,我们可以将所有的数据变换操作首尾相接,组合成为一条数据流水线(data pipeline),输入数据集中样本的信息字典,输出完成一系列处理后的信息字典。
-
-以分类任务为例,我们在下图展示了一个典型的数据流水线。对每个样本,数据集中保存的基本信息是一个如图中最左侧所示的字典,之后每经过一个由蓝色块代表的数据变换操作,数据字典中都会加入新的字段(标记为绿色)或更新现有的字段(标记为橙色)。
-
-
-
-
-
-在配置文件中,数据流水线是一个若干数据变换配置字典组成的列表,每个数据集都需要设置参数 `pipeline` 来定义该数据集需要进行的数据准备操作。如上数据流水线在配置文件中的配置如下:
-
-```python
-pipeline = [
- dict(type='LoadImageFromFile'),
- dict(type='Resize', size=256, keep_ratio=True),
- dict(type='CenterCrop', crop_size=224),
- dict(type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375]),
- dict(type='ClsFormatBundle')
-]
-
-dataset = dict(
- ...
- pipeline=pipeline,
- ...
-)
-```
-
-## 常用的数据变换类
-
-按照功能,常用的数据变换类可以大致分为数据加载、数据预处理与增强、数据格式化。在 MMCV 中,我们提供了一些常用的数据变换类如下:
-
-### 数据加载
-
-为了支持大规模数据集的加载,通常在 `Dataset` 初始化时不加载数据,只加载相应的路径。因此需要在数据流水线中进行具体数据的加载。
-
-| class | 功能 |
-| :-------------------------: | :---------------------------------------: |
-| [`LoadImageFromFile`](TODO) | 根据路径加载图像 |
-| [`LoadAnnotations`](TODO) | 加载和组织标注信息,如 bbox、语义分割图等 |
-
-### 数据预处理及增强
-
-数据预处理和增强通常是对图像本身进行变换,如裁剪、填充、缩放等。
-
-| class | 功能 |
-| :------------------------------: | :--------------------------------: |
-| [`Pad`](TODO) | 填充图像边缘 |
-| [`CenterCrop`](TODO) | 居中裁剪 |
-| [`Normalize`](TODO) | 对图像进行归一化 |
-| [`Resize`](TODO) | 按照指定尺寸或比例缩放图像 |
-| [`RandomResize`](TODO) | 缩放图像至指定范围的随机尺寸 |
-| [`RandomMultiscaleResize`](TODO) | 缩放图像至多个尺寸中的随机一个尺寸 |
-| [`RandomGrayscale`](TODO) | 随机灰度化 |
-| [`RandomFlip`](TODO) | 图像随机翻转 |
-| [`MultiScaleFlipAug`](TODO) | 支持缩放和翻转的测试时数据增强 |
-
-### 数据格式化
-
-数据格式化操作通常是对数据进行的类型转换。
-
-| class | 功能 |
-| :---------------------: | :-------------------------------: |
-| [`ToTensor`](TODO) | 将指定的数据转换为 `torch.Tensor` |
-| [`ImageToTensor`](TODO) | 将图像转换为 `torch.Tensor` |
-
-## 自定义数据变换类
-
-要实现一个新的数据变换类,需要继承 `BaseTransform`,并实现 `transform` 方法。这里,我们使用一个简单的翻转变换(`MyFlip`)作为示例:
-
-```python
-import random
-import mmcv
-from mmcv.transforms import BaseTransform, TRANSFORMS
-
-@TRANSFORMS.register_module()
-class MyFlip(BaseTransform):
- def __init__(self, direction: str):
- super().__init__()
- self.direction = direction
-
- def transform(self, results: dict) -> dict:
- img = results['img']
- results['img'] = mmcv.imflip(img, direction=self.direction)
- return results
-```
-
-从而,我们可以实例化一个 `MyFlip` 对象,并将之作为一个可调用对象,来处理我们的数据字典。
-
-```python
-import numpy as np
-
-transform = MyFlip(direction='horizontal')
-data_dict = {'img': np.random.rand(224, 224, 3)}
-data_dict = transform(data_dict)
-processed_img = data_dict['img']
-```
-
-又或者,在配置文件的 pipeline 中使用 `MyFlip` 变换
-
-```python
-pipeline = [
- ...
- dict(type='MyFlip', direction='horizontal'),
- ...
-]
-```
-
-需要注意的是,如需在配置文件中使用,需要保证 `MyFlip` 类所在的文件在运行时能够被导入。
-
-## 变换包装
-
-变换包装是一种特殊的数据变换类,它们本身并不操作数据字典中的图像、标签等信息,而是对其中定义的数据变换的行为进行增强。
-
-### 字段映射(KeyMapper)
-
-字段映射包装(`KeyMapper`)用于对数据字典中的字段进行映射。例如,一般的图像处理变换都从数据字典中的 `"img"` 字段获得值。但有些时候,我们希望这些变换处理数据字典中其他字段中的图像,比如 `"gt_img"` 字段。
-
-如果配合注册器和配置文件使用的话,在配置文件中数据集的 `pipeline` 中如下例使用字段映射包装:
-
-```python
-pipeline = [
- ...
- dict(type='KeyMapper',
- mapping={
- 'img': 'gt_img', # 将 "gt_img" 字段映射至 "img" 字段
- 'mask': ..., # 不使用原始数据中的 "mask" 字段。即对于被包装的数据变换,数据中不包含 "mask" 字段
- },
- auto_remap=True, # 在完成变换后,将 "img" 重映射回 "gt_img" 字段
- transforms=[
- # 在 `RandomFlip` 变换类中,我们只需要操作 "img" 字段即可
- dict(type='RandomFlip'),
- ])
- ...
-]
-```
-
-利用字段映射包装,我们在实现数据变换类时,不需要考虑在 `transform` 方法中考虑各种可能的输入字段名,只需要处理默认的字段即可。
-
-### 随机选择(RandomChoice)和随机执行(RandomApply)
-
-随机选择包装(`RandomChoice`)用于从一系列数据变换组合中随机应用一个数据变换组合。利用这一包装,我们可以简单地实现一些数据增强功能,比如 AutoAugment。
-
-如果配合注册器和配置文件使用的话,在配置文件中数据集的 `pipeline` 中如下例使用随机选择包装:
-
-```python
-pipeline = [
- ...
- dict(type='RandomChoice',
- transforms=[
- [
- dict(type='Posterize', bits=4),
- dict(type='Rotate', angle=30.)
- ], # 第一种随机变化组合
- [
- dict(type='Equalize'),
- dict(type='Rotate', angle=30)
- ], # 第二种随机变换组合
- ],
- prob=[0.4, 0.6] # 两种随机变换组合各自的选用概率
- )
- ...
-]
-```
-
-随机执行包装(`RandomApply`)用于以指定概率随机执行数据变换组合。例如:
-
-```python
-pipeline = [
- ...
- dict(type='RandomApply',
- transforms=[dict(type='Rotate', angle=30.)],
- prob=0.3) # 以 0.3 的概率执行被包装的数据变换
- ...
-]
-```
-
-### 多目标扩展(TransformBroadcaster)
-
-通常,一个数据变换类只会从一个固定的字段读取操作目标。虽然我们也可以使用 `KeyMapper` 来改变读取的字段,但无法将变换一次性应用于多个字段的数据。为了实现这一功能,我们需要借助多目标扩展包装(`TransformBroadcaster`)。
-
-多目标扩展包装(`TransformBroadcaster`)有两个用法,一是将数据变换作用于指定的多个字段,二是将数据变换作用于某个字段下的一组目标中。
-
-1. 应用于多个字段
-
- 假设我们需要将数据变换应用于 `"lq"` (low-quality) 和 `"gt"` (ground-truth) 两个字段中的图像上。
-
- ```python
- pipeline = [
- dict(type='TransformBroadcaster',
-           # 分别应用于 "lq" 和 "gt" 两个字段,并将二者映射至 "img" 字段
- mapping={'img': ['lq', 'gt']},
- # 在完成变换后,将 "img" 字段重映射回原先的字段
- auto_remap=True,
- # 是否在对各目标的变换中共享随机变量
-           # 更多介绍参见后续章节(随机变量共享)
- share_random_params=True,
- transforms=[
- # 在 `RandomFlip` 变换类中,我们只需要操作 "img" 字段即可
- dict(type='RandomFlip'),
- ])
- ]
- ```
-
- 在多目标扩展的 `mapping` 设置中,我们同样可以使用 `...` 来忽略指定的原始字段。如以下例子中,被包裹的 `RandomCrop` 会对字段 `"img"` 中的图像进行裁剪,并且在字段 `"img_shape"` 存在时更新剪裁后的图像大小。如果我们希望同时对两个图像字段 `"lq"` 和 `"gt"` 进行相同的随机裁剪,但只更新一次 `"img_shape"` 字段,可以通过例子中的方式实现:
-
- ```python
- pipeline = [
- dict(type='TransformBroadcaster',
- mapping={
- 'img': ['lq', 'gt'],
- 'img_shape': ['img_shape', ...],
- },
- # 在完成变换后,将 "img" 和 "img_shape" 字段重映射回原先的字段
- auto_remap=True,
- # 是否在对各目标的变换中共享随机变量
-           # 更多介绍参见后续章节(随机变量共享)
- share_random_params=True,
- transforms=[
- # `RandomCrop` 类中会操作 "img" 和 "img_shape" 字段。若 "img_shape" 空缺,
- # 则只操作 "img"
- dict(type='RandomCrop'),
- ])
- ]
- ```
-
-2. 应用于一个字段的一组目标
-
- 假设我们需要将数据变换应用于 `"images"` 字段,该字段为一个图像组成的 list。
-
- ```python
- pipeline = [
- dict(type='TransformBroadcaster',
- # 将 "images" 字段下的每张图片映射至 "img" 字段
- mapping={'img': 'images'},
- # 在完成变换后,将 "img" 字段下的图片重映射回 "images" 字段的列表中
- auto_remap=True,
- # 是否在对各目标的变换中共享随机变量
- share_random_params=True,
- transforms=[
- # 在 `RandomFlip` 变换类中,我们只需要操作 "img" 字段即可
- dict(type='RandomFlip'),
- ])
- ]
- ```
-
-#### 装饰器 `cache_randomness`
-
-在 `TransformBroadcaster` 中,我们提供了 `share_random_params` 选项来支持在多次数据变换中共享随机状态。例如,在超分辨率任务中,我们希望将随机变换**同步**作用于低分辨率图像和原始图像。如果我们希望在自定义的数据变换类中使用这一功能,需要在类中标注哪些随机变量是支持共享的。这可以通过装饰器 `cache_randomness` 来实现。
-
-以上文中的 `MyFlip` 为例,我们希望以一定的概率随机执行翻转:
-
-```python
-from mmcv.transforms.utils import cache_randomness
-
-@TRANSFORMS.register_module()
-class MyRandomFlip(BaseTransform):
- def __init__(self, prob: float, direction: str):
- super().__init__()
- self.prob = prob
- self.direction = direction
-
- @cache_randomness # 标注该方法的输出为可共享的随机变量
- def do_flip(self):
-        flip = random.random() < self.prob  # 以 prob 的概率返回 True
- return flip
-
- def transform(self, results: dict) -> dict:
- img = results['img']
- if self.do_flip():
- results['img'] = mmcv.imflip(img, direction=self.direction)
- return results
-```
-
-在上面的例子中,我们用`cache_randomness` 装饰 `do_flip`方法,即将该方法返回值 `flip` 标注为一个支持共享的随机变量。进而,在 `TransformBroadcaster` 对多个目标的变换中,这一变量的值都会保持一致。
-
-#### 装饰器 `avoid_cache_randomness`
-
-在一些情况下,我们无法将数据变换中产生随机变量的过程单独放在类方法中。例如数据变换中使用的来自第三方库的模块,这些模块将随机变量相关的部分封装在了内部,导致无法将其抽出为数据变换的类方法。这样的数据变换无法通过装饰器 `cache_randomness` 标注支持共享的随机变量,进而无法在多目标扩展时共享随机变量。
-
-为了避免在多目标扩展中误用此类数据变换,我们提供了另一个装饰器 `avoid_cache_randomness`,用来对此类数据变换进行标记:
-
-```python
-from mmcv.transforms.utils import avoid_cache_randomness
-
-@TRANSFORMS.register_module()
-@avoid_cache_randomness
-class MyRandomTransform(BaseTransform):
-
- def transform(self, results: dict) -> dict:
- ...
-```
-
-用 `avoid_cache_randomness` 标记的数据变换类,当其实例被 `TransformBroadcaster` 包装且将参数 `share_random_params` 设置为 True 时,会抛出异常,以此提醒用户不能这样使用。
-
-在使用 `avoid_cache_randomness` 时需要注意以下几点:
-
-1. `avoid_cache_randomness` 只用于装饰数据变换类(BaseTransform 的子类),而不能用于装饰其他一般的类、类方法或函数
-2. 被 `avoid_cache_randomness` 修饰的数据变换作为基类时,其子类将**不会继承**这一特性。如果子类仍无法共享随机变量,则应再次使用 `avoid_cache_randomness` 修饰
-3. 只有当一个数据变换具有随机性,且无法共享随机参数时,才需要以 `avoid_cache_randomness` 修饰。无随机性的数据变换不需要修饰
diff --git a/docs/zh_cn/understand_mmcv/ops.md b/docs/zh_cn/understand_mmcv/ops.md
deleted file mode 100644
index 11b885d37c6ed19dcd295650cae73e018b465f92..0000000000000000000000000000000000000000
--- a/docs/zh_cn/understand_mmcv/ops.md
+++ /dev/null
@@ -1,66 +0,0 @@
-## 算子
-
-MMCV 提供了检测、分割等任务中常用的算子
-
-| Device | CPU | CUDA | MLU | MPS | Ascend |
-| ---------------------------- | --- | ---- | --- | --- | ------ |
-| ActiveRotatedFilter | √ | √ | | | |
-| AssignScoreWithK | | √ | | | |
-| BallQuery | | √ | | | |
-| BBoxOverlaps | | √ | √ | √ | √ |
-| BorderAlign | | √ | | | |
-| BoxIouRotated | √ | √ | | | |
-| BoxIouQuadri | √ | √ | | | |
-| CARAFE | | √ | √ | | |
-| ChamferDistance | | √ | | | |
-| CrissCrossAttention | | √ | | | |
-| ContourExpand | √ | | | | |
-| ConvexIoU | | √ | | | |
-| CornerPool | | √ | | | |
-| Correlation | | √ | | | |
-| Deformable Convolution v1/v2 | √ | √ | | | √ |
-| Deformable RoIPool | | √ | √ | | √ |
-| DiffIoURotated | | √ | | | |
-| DynamicScatter | | √ | | | |
-| FurthestPointSample | | √ | | | |
-| FurthestPointSampleWithDist | | √ | | | |
-| FusedBiasLeakyrelu | | √ | | | √ |
-| GatherPoints | | √ | | | √ |
-| GroupPoints | | √ | | | |
-| Iou3d | | √ | √ | | |
-| KNN | | √ | | | |
-| MaskedConv | | √ | √ | | √ |
-| MergeCells | | √ | | | |
-| MinAreaPolygon | | √ | | | |
-| ModulatedDeformConv2d | √ | √ | | | √ |
-| MultiScaleDeformableAttn | | √ | √ | | |
-| NMS | √ | √ | √ | | √ |
-| NMSRotated | √ | √ | | | √ |
-| NMSQuadri | √ | √ | | | |
-| PixelGroup | √ | | | | |
-| PointsInBoxes | √ | √ | | | |
-| PointsInPolygons | | √ | | | |
-| PSAMask | √ | √ | √ | | √ |
-| RotatedFeatureAlign | √ | √ | | | |
-| RoIPointPool3d | | √ | √ | | |
-| RoIPool | | √ | √ | | √ |
-| RoIAlignRotated | √ | √ | √ | | |
-| RiRoIAlignRotated | | √ | | | |
-| RoIAlign | √ | √ | √ | | |
-| RoIAwarePool3d | | √ | √ | | |
-| SAConv2d | | √ | | | |
-| SigmoidFocalLoss | | √ | √ | | √ |
-| SoftmaxFocalLoss | | √ | | | √ |
-| SoftNMS | | √ | | | |
-| Sparse Convolution | | √ | | | |
-| Synchronized BatchNorm | | √ | | | |
-| ThreeInterpolate | | √ | | | |
-| ThreeNN | | √ | √ | | |
-| TINShift | | √ | √ | | |
-| UpFirDn2d | | √ | | | |
-| Voxelization | √ | √ | | | √ |
-| PrRoIPool | | √ | | | |
-| BezierAlign | √ | √ | | | |
-| BiasAct | | √ | | | |
-| FilteredLrelu | | √ | | | |
-| Conv2dGradfix | | √ | | | |
diff --git a/docs/zh_cn/Makefile b/docs_zh_CN/Makefile
similarity index 100%
rename from docs/zh_cn/Makefile
rename to docs_zh_CN/Makefile
diff --git a/docs/zh_cn/_static/css/readthedocs.css b/docs_zh_CN/_static/css/readthedocs.css
similarity index 75%
rename from docs/zh_cn/_static/css/readthedocs.css
rename to docs_zh_CN/_static/css/readthedocs.css
index 9e3a567d5f78aedb606600bb3111034a1003b362..3f425fc1e5344d7d159c71aa94b5e385767d5b37 100644
--- a/docs/zh_cn/_static/css/readthedocs.css
+++ b/docs_zh_CN/_static/css/readthedocs.css
@@ -4,7 +4,3 @@
height: 40px;
width: 85px;
}
-
-table.colwidths-auto td {
- width: 50%
-}
diff --git a/docs/zh_cn/_static/image/mmcv-logo.png b/docs_zh_CN/_static/image/mmcv-logo.png
similarity index 100%
rename from docs/zh_cn/_static/image/mmcv-logo.png
rename to docs_zh_CN/_static/image/mmcv-logo.png
diff --git a/docs_zh_CN/api.rst b/docs_zh_CN/api.rst
new file mode 100644
index 0000000000000000000000000000000000000000..8ca9118c3b033f1b7311ec3c1533ce9c93fa1aa2
--- /dev/null
+++ b/docs_zh_CN/api.rst
@@ -0,0 +1,44 @@
+fileio
+-------
+.. automodule:: mmcv.fileio
+ :members:
+
+image
+------
+.. automodule:: mmcv.image
+ :members:
+
+video
+------
+.. automodule:: mmcv.video
+ :members:
+
+arraymisc
+---------
+.. automodule:: mmcv.arraymisc
+ :members:
+
+visualization
+--------------
+.. automodule:: mmcv.visualization
+ :members:
+
+utils
+-----
+.. automodule:: mmcv.utils
+ :members:
+
+cnn
+----
+.. automodule:: mmcv.cnn
+ :members:
+
+runner
+------
+.. automodule:: mmcv.runner
+ :members:
+
+ops
+------
+.. automodule:: mmcv.ops
+ :members:
diff --git a/docs_zh_CN/community/contributing.md b/docs_zh_CN/community/contributing.md
new file mode 100644
index 0000000000000000000000000000000000000000..30bac8738bee8db306287c6b245b3115464e64da
--- /dev/null
+++ b/docs_zh_CN/community/contributing.md
@@ -0,0 +1,69 @@
+## 贡献代码
+
+欢迎任何类型的贡献,包括但不限于
+
+- 修改拼写错误或代码错误
+- 添加文档或将文档翻译成其他语言
+- 添加新功能和新组件
+
+### 工作流
+
+详细工作流见[拉取请求](pr.md),简要步骤如下:
+
+1. 复刻并拉取最新的 OpenMMLab 算法库
+2. 创建新的分支(不建议使用主分支提拉取请求)
+3. 提交你的修改
+4. 创建拉取请求
+
+```{note}
+如果你计划添加新功能并且该功能包含比较大的改动,建议先开 issue 讨论
+```
+
+### 代码风格
+
+#### Python
+
+我们采用 [PEP8](https://www.python.org/dev/peps/pep-0008/) 作为 OpenMMLab 算法库首选的代码规范,并使用以下工具对代码进行检查和格式化
+
+- [flake8](http://flake8.pycqa.org/en/latest/): 代码规范检查工具,是多个检查工具的封装
+- [yapf](https://github.com/google/yapf): Google 发布的代码格式化工具
+- [isort](https://github.com/timothycrosley/isort): 自动调整模块导入顺序的工具
+- [markdownlint](https://github.com/markdownlint/markdownlint): 检查 markdown 文件的工具
+- [docformatter](https://github.com/myint/docformatter): 格式化 docstring 的工具
+
+yapf 和 isort 的配置可以在 [setup.cfg](./setup.cfg) 找到
+
+通过配置 [pre-commit hook](https://pre-commit.com/),我们可以在提交代码时自动检查和格式化 `flake8`、`yapf`、`isort`、`trailing whitespaces`、`markdown files`,
+修复 `end-of-files`、`double-quoted-strings`、`python-encoding-pragma`、`mixed-line-ending`,调整 `requirements.txt` 的包顺序。
+pre-commit 钩子的配置可以在 [.pre-commit-config](./.pre-commit-config.yaml) 找到。
+
+在克隆算法库后,你需要安装并初始化 pre-commit 钩子
+
+```shell
+pip install -U pre-commit
+```
+
+然后切换到算法库根目录,执行
+
+```shell
+pre-commit install
+```
+
+如果安装 markdownlint 遇到了问题,可以尝试使用以下的步骤安装 ruby
+
+```shell
+# install rvm
+curl -L https://get.rvm.io | bash -s -- --autolibs=read-fail
+[[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm"
+rvm autolibs disable
+
+# install ruby
+rvm install 2.7.1
+```
+
+或者参考 [这个代码库](https://github.com/innerlee/setup) 和 [`zzruby.sh`](https://github.com/innerlee/setup/blob/master/zzruby.sh)。
+
+至此,每一次 commit 修改都会触发 pre-commit 检查代码格式。
+
+>提交拉取请求前,请确保你的代码符合 yapf 的格式
+
+#### C++ and CUDA
+
+C++ 和 CUDA 的代码规范遵从 [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html)
diff --git a/docs_zh_CN/community/pr.md b/docs_zh_CN/community/pr.md
new file mode 100644
index 0000000000000000000000000000000000000000..219e01dd747827adedddd922310624f97ff10672
--- /dev/null
+++ b/docs_zh_CN/community/pr.md
@@ -0,0 +1,90 @@
+## 拉取请求
+
+### 什么是拉取请求?
+
+`拉取请求` (Pull Request), [GitHub 官方文档](https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests)定义如下。
+
+> 拉取请求是一种通知机制。你修改了他人的代码,将你的修改通知原作者,希望他合并你的修改。
+
+### 基本的工作流:
+
+1. 获取最新的代码库
+2. 从主分支创建最新的分支进行开发
+3. 提交修改
+4. 推送你的修改并创建一个`拉取请求`
+5. 讨论、审核代码
+6. 将开发分支合并到主分支
+
+### 具体步骤
+
+1. 获取最新的代码库
+ + 当你第一次提 PR 时
+ - 复刻 OpenMMLab 原代码库,点击 GitHub 页面右上角的 **Fork** 按钮即可
+ 
+
+ - 克隆复刻的代码库到本地
+ ```bash
+ git clone git@github.com:XXX/mmcv.git
+ ```
+
+ - 添加原代码库为上游代码库
+ ```bash
+ git remote add upstream git@github.com:open-mmlab/mmcv
+ ```
+ + 从第二个 PR 起
+ - 检出本地代码库的主分支,然后从最新的原代码库的主分支拉取更新
+ ```bash
+ git checkout master
+ git pull upstream master
+ ```
+
+2. 从主分支创建一个新的开发分支
+ ```bash
+ git checkout -b branchname
+ ```
+ 注意:为了保证提交历史清晰可读,我们强烈推荐您先检出主分支 (master),再创建新的分支。
+
+3. 提交你的修改
+ ```bash
+ # coding
+ git add [files]
+ git commit -m 'messages'
+ ```
+
+4. 推送你的修改到复刻的代码库,并创建一个`拉取请求`
+ + 推送当前分支到远端复刻的代码库
+ ```bash
+ git push origin branchname
+ ```
+
+ + 创建一个`拉取请求`
+ 
+
+ + 修改`拉取请求`信息模板,描述修改原因和修改内容。还可以在 PR 描述中,手动关联到相关的`议题` (issue),(更多细节,请参考[官方文档](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue))。
+
+5. 讨论并评审你的代码
+ + 创建`拉取请求`时,可以关联给相关人员进行评审
+ 
+
+ + 根据评审人员的意见修改代码,并推送修改
+
+6. `拉取请求`合并之后删除该分支
+```bash
+git branch -d branchname # delete local branch
+git push origin --delete branchname # delete remote branch
+```
+
+### PR 规范
+
+1. 使用 [pre-commit hook](https://pre-commit.com),尽量减少代码风格相关问题
+2. 一个PR对应一个短期分支
+3. 粒度要细,一个PR只做一件事情,避免超大的PR
+ >- Bad:实现Faster R-CNN
+ >- Acceptable:给 Faster R-CNN 添加一个 box head
+ >- Good:给 box head 增加一个参数来支持自定义的 conv 层数
+4. 每次 Commit 时需要提供清晰且有意义 commit 信息
+5. 提供清晰且有意义的`拉取请求`描述
+ >- 标题写明白任务名称,一般格式:[Prefix] Short description of the pull request (Suffix)
+ >- prefix: 新增功能 [Feature], 修 bug [Fix], 文档相关 [Docs], 开发中 [WIP] (暂时不会被review)
+ >- 描述里介绍`拉取请求`的主要修改内容,结果,以及对其他部分的影响, 参考`拉取请求`模板
+ >- 关联相关的`议题` (issue) 和其他`拉取请求`
diff --git a/docs/zh_cn/compatibility.md b/docs_zh_CN/compatibility.md
similarity index 100%
rename from docs/zh_cn/compatibility.md
rename to docs_zh_CN/compatibility.md
diff --git a/docs/en/conf.py b/docs_zh_CN/conf.py
similarity index 61%
rename from docs/en/conf.py
rename to docs_zh_CN/conf.py
index 471bd225adeede01787a236ac0d370d0056b960a..e0c65d0eeca3bc99ef827b3fa36fc903422e8832 100644
--- a/docs/en/conf.py
+++ b/docs_zh_CN/conf.py
@@ -15,19 +15,21 @@ import os
import sys
import pytorch_sphinx_theme
+from m2r import MdInclude
+from recommonmark.transform import AutoStructify
from sphinx.builders.html import StandaloneHTMLBuilder
-sys.path.insert(0, os.path.abspath('../..'))
+sys.path.insert(0, os.path.abspath('..'))
-version_file = '../../mmcv/version.py'
-with open(version_file) as f:
+version_file = '../mmcv/version.py'
+with open(version_file, 'r') as f:
exec(compile(f.read(), version_file, 'exec'))
__version__ = locals()['__version__']
# -- Project information -----------------------------------------------------
project = 'mmcv'
-copyright = '2018-2022, OpenMMLab'
+copyright = '2018-2021, OpenMMLab'
author = 'MMCV Authors'
# The short X.Y version
@@ -47,28 +49,16 @@ release = __version__
extensions = [
'sphinx.ext.autodoc',
- 'sphinx.ext.autosummary',
- 'sphinx.ext.intersphinx',
'sphinx.ext.napoleon',
'sphinx.ext.viewcode',
+ 'sphinx.ext.autosectionlabel',
'sphinx_markdown_tables',
'myst_parser',
'sphinx_copybutton',
] # yapf: disable
-myst_heading_anchors = 4
-
-myst_enable_extensions = ['colon_fence']
-
-# Configuration for intersphinx
-intersphinx_mapping = {
- 'python': ('https://docs.python.org/3', None),
- 'numpy': ('https://numpy.org/doc/stable', None),
- 'torch': ('https://pytorch.org/docs/stable/', None),
- 'mmengine': ('https://mmengine.readthedocs.io/en/latest', None),
-}
-
autodoc_mock_imports = ['mmcv._ext', 'mmcv.utils.ext_loader', 'torchvision']
+autosectionlabel_prefix_document = True
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
@@ -89,7 +79,7 @@ master_doc = 'index'
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
-language = None
+language = 'zh_CN'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
@@ -118,9 +108,94 @@ html_theme_options = {
'name': 'GitHub',
'url': 'https://github.com/open-mmlab/mmcv'
},
- ],
- # Specify the language of shared menu
- 'menu_lang': 'en',
+ {
+ 'name':
+ '文档',
+ 'children': [
+ {
+ 'name': 'MMCV',
+ 'url': 'https://mmcv.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MIM',
+ 'url': 'https://openmim.readthedocs.io/en/latest/'
+ },
+ {
+ 'name': 'MMAction2',
+ 'url': 'https://mmaction2.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMClassification',
+ 'url':
+ 'https://mmclassification.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMDetection',
+ 'url': 'https://mmdetection.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMDetection3D',
+ 'url':
+ 'https://mmdetection3d.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMEditing',
+ 'url': 'https://mmediting.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMGeneration',
+ 'url': 'https://mmgeneration.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMOCR',
+ 'url': 'https://mmocr.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMPose',
+ 'url': 'https://mmpose.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMSegmentation',
+ 'url':
+ 'https://mmsegmentation.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMTracking',
+ 'url': 'https://mmtracking.readthedocs.io/zh_CN/latest/',
+ },
+ {
+ 'name': 'MMFlow',
+ 'url': 'https://mmflow.readthedocs.io/en/latest/',
+ },
+ {
+ 'name': 'MMFewShot',
+ 'url': 'https://mmfewshot.readthedocs.io/zh_CN/latest/',
+ },
+ ]
+ },
+ {
+ 'name':
+ 'OpenMMLab',
+ 'children': [
+ {
+ 'name': '主页',
+ 'url': 'https://openmmlab.com/'
+ },
+ {
+ 'name': 'GitHub',
+ 'url': 'https://github.com/open-mmlab/'
+ },
+ {
+ 'name': '推特',
+ 'url': 'https://twitter.com/OpenMMLab'
+ },
+ {
+ 'name': '知乎',
+ 'url': 'https://zhihu.com/people/openmmlab'
+ },
+ ]
+ },
+ ]
}
# Add any paths that contain custom static files (such as style sheets) here,
@@ -213,3 +288,16 @@ StandaloneHTMLBuilder.supported_image_types = [
# Ignore >>> when copying code
copybutton_prompt_text = r'>>> |\.\.\. '
copybutton_prompt_is_regexp = True
+
+
+def setup(app):
+ app.add_config_value('no_underscore_emphasis', False, 'env')
+ app.add_config_value('m2r_parse_relative_links', False, 'env')
+ app.add_config_value('m2r_anonymous_references', False, 'env')
+ app.add_config_value('m2r_disable_inline_math', False, 'env')
+ app.add_directive('mdinclude', MdInclude)
+ app.add_config_value('recommonmark_config', {
+ 'auto_toc_tree_section': 'Contents',
+ 'enable_eval_rst': True,
+ }, True)
+ app.add_transform(AutoStructify)
diff --git a/docs_zh_CN/deployment/onnx.md b/docs_zh_CN/deployment/onnx.md
new file mode 100644
index 0000000000000000000000000000000000000000..c4e00417f2345748bd0df717a365d441cb6e25e5
--- /dev/null
+++ b/docs_zh_CN/deployment/onnx.md
@@ -0,0 +1,19 @@
+## MMCV中ONNX模块简介 (实验性)
+
+### register_extra_symbolics
+
+在将PyTorch模型导出成ONNX时,需要注册额外的符号函数
+
+#### 范例
+
+```python
+import mmcv
+from mmcv.onnx import register_extra_symbolics
+
+opset_version = 11
+register_extra_symbolics(opset_version)
+```
+
+#### 常见问题
+
+- 无
diff --git a/docs_zh_CN/deployment/onnxruntime_custom_ops.md b/docs_zh_CN/deployment/onnxruntime_custom_ops.md
new file mode 100644
index 0000000000000000000000000000000000000000..594aefb4ba4566aeda990ee5f42512f5e2be1917
--- /dev/null
+++ b/docs_zh_CN/deployment/onnxruntime_custom_ops.md
@@ -0,0 +1,333 @@
+## ONNX Runtime自定义算子
+
+
+
+- [ONNX Runtime自定义算子](#onnx-runtime自定义算子)
+ - [SoftNMS](#softnms)
+ - [描述](#描述)
+ - [模型参数](#模型参数)
+ - [输入](#输入)
+ - [输出](#输出)
+ - [类型约束](#类型约束)
+ - [RoIAlign](#roialign)
+ - [描述](#描述-1)
+ - [模型参数](#模型参数-1)
+ - [输入](#输入-1)
+ - [输出](#输出-1)
+ - [类型约束](#类型约束-1)
+ - [NMS](#nms)
+ - [描述](#描述-2)
+ - [模型参数](#模型参数-2)
+ - [输入](#输入-2)
+ - [输出](#输出-2)
+ - [类型约束](#类型约束-2)
+ - [grid_sampler](#grid_sampler)
+ - [描述](#描述-3)
+ - [模型参数](#模型参数-3)
+ - [输入](#输入-3)
+ - [输出](#输出-3)
+ - [类型约束](#类型约束-3)
+ - [CornerPool](#cornerpool)
+ - [描述](#描述-4)
+ - [模型参数](#模型参数-4)
+ - [输入](#输入-4)
+ - [输出](#输出-4)
+ - [类型约束](#类型约束-4)
+ - [cummax](#cummax)
+ - [描述](#描述-5)
+ - [模型参数](#模型参数-5)
+ - [输入](#输入-5)
+ - [输出](#输出-5)
+ - [类型约束](#类型约束-5)
+ - [cummin](#cummin)
+ - [描述](#描述-6)
+ - [模型参数](#模型参数-6)
+ - [输入](#输入-6)
+ - [输出](#输出-6)
+ - [类型约束](#类型约束-6)
+ - [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
+ - [描述](#描述-7)
+ - [模型参数](#模型参数-7)
+ - [输入](#输入-7)
+ - [输出](#输出-7)
+ - [类型约束](#类型约束-7)
+
+
+
+### SoftNMS
+
+#### 描述
+
+根据`scores`计算`boxes`的soft NMS。 请阅读[Soft-NMS -- Improving Object Detection With One Line of Code](https://arxiv.org/abs/1704.04503)了解细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ------- | --------------- | ------------------------------------------------------- |
+| `float` | `iou_threshold` | 用来判断候选框重合度的阈值,取值范围[0, 1]。默认值为0 |
+| `float` | `sigma` | 高斯方法的超参数 |
+| `float` | `min_score` | NMS的score阈值 |
+| `int` | `method` | NMS的计算方式, (0: `naive`, 1: `linear`, 2: `gaussian`) |
+| `int` | `offset` | 用来计算候选框的宽高(x2 - x1 + offset)。可选值0或1 |
+
+#### 输入
+
+
+boxes : T
+输入候选框。形状为(N, 4)的二维张量,N为候选框数量。
+scores : T
+输入得分。形状为(N, )的一维张量。
+
+
+#### 输出
+
+
+dets : T
+输出的检测框与得分。形状为(num_valid_boxes, 5)的二维张量,内容为[[x1, y1, x2, y2, score], ...]。num_valid_boxes是合法的检测框数量。
+indices : tensor(int64)
+输出序号。形状为(num_valid_boxes, )的一维张量。
+
+
+#### 类型约束
+
+- T:tensor(float32)
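+
+作为补充,下面给出一段调用示意,展示上述模型参数与 `mmcv.ops.soft_nms` 的对应关系(仅为示意,参数名以实际版本的接口为准):
+
+```python
+import torch
+from mmcv.ops import soft_nms
+
+boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]])
+scores = torch.tensor([0.9, 0.8])
+# iou_threshold、sigma、min_score、method、offset 与上表中的模型参数一一对应
+dets, inds = soft_nms(boxes, scores, iou_threshold=0.3, sigma=0.5,
+                      min_score=0.05, method='linear', offset=0)
+```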
+
+### RoIAlign
+
+#### 描述
+
+在特征图上计算RoIAlign,通常在双阶段目标检测模型的bbox_head中使用
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ------- | ---------------- | ------------------------------------------------------- |
+| `int` | `output_height` | roi特征的输出高度 |
+| `int` | `output_width` | roi特征的输出宽度 |
+| `float` | `spatial_scale` | 输入检测框的缩放系数 |
+| `int` | `sampling_ratio` | 输出的采样率。`0`表示使用密集采样 |
+| `str` | `mode` | 池化方式。 `avg`或`max` |
+| `int` | `aligned` | 如果`aligned=1`,则像素会进行-0.5的偏移以达到更好的对齐 |
+
+#### 输入
+
+
+input : T
+输入特征图;形状为(N, C, H, W)的四维张量,其中N为batch大小,C为输入通道数,H和W为输入特征图的高和宽。
+rois : T
+需要进行池化的感兴趣区域;形状为(num_rois, 5)的二维张量,内容为[[batch_index, x1, y1, x2, y2], ...]。rois的坐标为输入特征图的坐标系。
+
+
+#### 输出
+
+
+feat : T
+池化的输出;形状为(num_rois, C, output_height, output_width)的四维张量。每个输出特征feat[i]都与输入感兴趣区域rois[i]一一对应。
+
+
+#### 类型约束
+
+- T:tensor(float32)
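+
+下面给出一段基于 `mmcv.ops.roi_align` 的调用示意(仅为示意,参数顺序以实际版本的接口为准):
+
+```python
+import torch
+from mmcv.ops import roi_align
+
+feat = torch.rand(1, 16, 32, 32)               # (N, C, H, W)
+rois = torch.tensor([[0., 4., 4., 20., 20.]])  # [batch_index, x1, y1, x2, y2]
+# 依次传入 output_size、spatial_scale、sampling_ratio、mode、aligned
+out = roi_align(feat, rois, (7, 7), 1.0, 0, 'avg', True)
+assert out.shape == (1, 16, 7, 7)  # (num_rois, C, output_height, output_width)
+```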
+
+### NMS
+
+#### 描述
+
+根据IoU阈值对候选框进行非极大值抑制。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ------- | --------------- | ----------------------------------------------------- |
+| `float` | `iou_threshold` | 用来判断候选框重合度的阈值,取值范围[0, 1]。默认值为0 |
+| `int` | `offset` | 用来计算候选框的宽高(x2 - x1 + offset)。可选值0或1 |
+
+#### 输入
+
+
+boxes : T
+输入候选框。形状为(N, 4)的二维张量,N为候选框数量。
+scores : T
+输入得分。形状为(N, )的一维张量。
+
+
+#### 输出
+
+
+indices : tensor(int32, Linear)
+被选中的候选框索引。形状为(num_valid_boxes, )的一维张量,num_valid_boxes表示被选上的候选框数量。
+
+
+#### 类型约束
+
+- T:tensor(float32)
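+
+下面给出一段基于 `mmcv.ops.nms` 的调用示意(仅为示意):
+
+```python
+import torch
+from mmcv.ops import nms
+
+boxes = torch.tensor([[0., 0., 10., 10.],
+                      [0., 1., 10., 11.],
+                      [20., 20., 30., 30.]])
+scores = torch.tensor([0.9, 0.8, 0.7])
+# dets 为保留的检测框及得分 (num_valid_boxes, 5),inds 即上文输出中的索引
+dets, inds = nms(boxes, scores, iou_threshold=0.5)
+```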
+
+### grid_sampler
+
+#### 描述
+
+根据`grid`的像素位置对`input`进行网格采样。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ----- | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `int` | `interpolation_mode` | 计算输出使用的插值模式。(0: `bilinear` , 1: `nearest`) |
+| `int` | `padding_mode` | 边缘填充模式。(0: `zeros`, 1: `border`, 2: `reflection`) |
+| `int` | `align_corners` | 如果`align_corners=1`,则极值(`-1`和`1`)会被当做输入边缘像素的中心点。如果`align_corners=0`,则它们会被看做是边缘像素的边缘点,减小分辨率对采样的影响 |
+
+#### 输入
+
+
+input : T
+输入特征;形状为(N, C, inH, inW)的四维张量,其中N为batch大小,C为输入通道数,inH和inW为输入特征图的高和宽。
+grid : T
+输入网格;形状为(N, outH, outW, 2)的四维张量,outH和outW为输出的高和宽。
+
+
+#### 输出
+
+
+output : T
+输出特征;形状为(N, C, outH, outW)的四维张量。
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
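+
+该算子通常由 `torch.nn.functional.grid_sample` 导出得到,下面用 PyTorch 演示其输入输出形状(仅为示意):
+
+```python
+import torch
+import torch.nn.functional as F
+
+inp = torch.rand(1, 3, 8, 8)  # (N, C, inH, inW)
+# 用恒等仿射变换构造形状为 (N, outH, outW, 2) 的采样网格
+theta = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]])
+grid = F.affine_grid(theta, size=(1, 3, 4, 4), align_corners=False)
+out = F.grid_sample(inp, grid, mode='bilinear',
+                    padding_mode='zeros', align_corners=False)
+assert out.shape == (1, 3, 4, 4)  # (N, C, outH, outW)
+```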
+
+### CornerPool
+
+#### 描述
+
+对`input`计算CornerPool。请阅读[CornerNet -- Detecting Objects as Paired Keypoints](https://arxiv.org/abs/1808.01244)了解更多细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ----- | ------ | -------------------------------------------------------- |
+| `int` | `mode` | 池化模式。(0: `top`, 1: `bottom`, 2: `left`, 3: `right`) |
+
+#### 输入
+
+
+input : T
+输入特征;形状为(N, C, H, W)的四维张量,其中N为batch大小,C为输入通道数,H和W为输入特征图的高和宽。
+
+
+#### 输出
+
+
+output : T
+输出特征;形状为(N, C, H, W)的四维张量。
+
+
+#### 类型约束
+
+- T:tensor(float32)
+
+### cummax
+
+#### 描述
+
+返回一个元组(`values`, `indices`),其中`values`为`input`第`dim`维的累计最大值,`indices`为第`dim`维最大值位置。请阅读[torch.cummax](https://pytorch.org/docs/stable/generated/torch.cummax.html)了解更多细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ----- | ------ | ------------------ |
+| `int` | `dim` | 进行累计计算的维度 |
+
+#### 输入
+
+
+input : T
+输入张量;可以是任意形状;也支持空 Tensor
+
+
+#### 输出
+
+
+output : T
+`input`第`dim`维的累计最大值,形状与`input`相同。类型和`input`一致
+indices : tensor(int64)
+第`dim`维最大值位置,形状与`input`相同。
+
+
+#### 类型约束
+
+- T:tensor(float32)
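+
+其行为与 `torch.cummax` 一致,下面给出一个简单的数值示例:
+
+```python
+import torch
+
+x = torch.tensor([1., 3., 2., 5., 4.])
+values, indices = torch.cummax(x, dim=0)
+# values  -> [1., 3., 3., 5., 5.]
+# indices -> [0, 1, 1, 3, 3]
+# 下文的 cummin 用法与之类似
+```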
+
+### cummin
+
+#### 描述
+
+返回一个元组(`values`, `indices`),其中`values`为`input`第`dim`维的累计最小值,`indices`为第`dim`维最小值位置。请阅读[torch.cummin](https://pytorch.org/docs/stable/generated/torch.cummin.html)了解更多细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ----- | ------ | ------------------ |
+| `int` | `dim` | 进行累计计算的维度 |
+
+#### 输入
+
+
+input : T
+输入张量;可以是任意形状;也支持空Tensor
+
+
+#### 输出
+
+
+output : T
+`input`第`dim`维的累计最小值,形状与`input`相同。类型和`input`一致
+indices : tensor(int64)
+第`dim`维最小值位置,形状与`input`相同。
+
+
+#### 类型约束
+
+- T:tensor(float32)
+
+### MMCVModulatedDeformConv2d
+
+#### 描述
+
+在输入特征上计算Modulated Deformable Convolution,请阅读[Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline)了解更多细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| -------------- | ------------------- | ------------------------------------------------------------- |
+| `list of ints` | `stride` | 卷积的步长 (sH, sW) |
+| `list of ints` | `padding` | 输入特征填充大小 (padH, padW) |
+| `list of ints` | `dilation` | 卷积核各元素间隔 (dH, dW) |
+| `int` | `deformable_groups` | 可变偏移量的分组,通常置为 1 即可 |
+| `int` | `groups` | 卷积分组数,`input_channel`会根据这个值被分为数个分组进行计算 |
+
+#### 输入
+
+
+inputs[0] : T
+输入特征;形状为(N, C, inH, inW)的四维张量,其中N为batch大小,C为输入通道数,inH和inW为输入特征图的高和宽。
+inputs[1] : T
+输入偏移量;形状为(N, deformable_group* 2* kH* kW, outH, outW)的四维张量,kH和kW为卷积核的高和宽,outH和outW为输出特征图的高和宽。
+inputs[2] : T
+输入掩码;形状为(N, deformable_group* kH* kW, outH, outW)的四维张量。
+inputs[3] : T
+输入权重;形状为(output_channel, input_channel, kH, kW)的四维张量。
+inputs[4] : T, optional
+输入偏置;形状为(output_channel)的一维张量。
+
+
+#### 输出
+
+
+outputs[0] : T
+输出特征;形状为(N, output_channel, outH, outW)的四维张量。
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
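+
+该算子对应 `mmcv.ops` 中的 Modulated Deformable Convolution。下面给出一段调用示意(仅为示意:假设有可用的 GPU,并使用会在内部自动生成 offset 与 mask 的 `ModulatedDeformConv2dPack`):
+
+```python
+import torch
+from mmcv.ops import ModulatedDeformConv2dPack
+
+conv = ModulatedDeformConv2dPack(3, 8, kernel_size=3, padding=1).cuda()
+x = torch.rand(1, 3, 16, 16).cuda()
+out = conv(x)  # (1, 8, 16, 16)
+```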
diff --git a/docs_zh_CN/deployment/onnxruntime_op.md b/docs_zh_CN/deployment/onnxruntime_op.md
new file mode 100644
index 0000000000000000000000000000000000000000..3898aa164fd019b635890243d03de316d2f36127
--- /dev/null
+++ b/docs_zh_CN/deployment/onnxruntime_op.md
@@ -0,0 +1,127 @@
+## MMCV中的ONNX Runtime自定义算子
+
+### ONNX Runtime介绍
+
+**ONNX Runtime**是一个跨平台的推理与训练加速器,适配许多常用的机器学习/深度神经网络框架。请访问[github](https://github.com/microsoft/onnxruntime)了解更多信息。
+
+### ONNX介绍
+
+**ONNX**是**Open Neural Network Exchange**的缩写,是许多机器学习/深度神经网络框架使用的*中间表示(IR)*。请访问[github](https://github.com/onnx/onnx)了解更多信息。
+
+### 为什么要在MMCV中添加ONNX自定义算子?
+
+- 为了验证ONNX模型在ONNX Runtime下的推理的正确性。
+- 为了方便使用了`mmcv.ops`自定义算子的模型的部署工作。
+
+### MMCV已支持的算子
+
+| 算子 | CPU | GPU | MMCV版本 |
+| :------------------------------------------------------------------------------: | :---: | :---: | :------: |
+| [SoftNMS](onnxruntime_custom_ops.md#softnms) | Y | N | 1.2.3 |
+| [RoIAlign](onnxruntime_custom_ops.md#roialign) | Y | N | 1.2.5 |
+| [NMS](onnxruntime_custom_ops.md#nms) | Y | N | 1.2.7 |
+| [grid_sampler](onnxruntime_custom_ops.md#grid_sampler) | Y | N | 1.3.1 |
+| [CornerPool](onnxruntime_custom_ops.md#cornerpool) | Y | N | 1.3.4 |
+| [cummax](onnxruntime_custom_ops.md#cummax) | Y | N | 1.3.4 |
+| [cummin](onnxruntime_custom_ops.md#cummin) | Y | N | 1.3.4 |
+| [MMCVModulatedDeformConv2d](onnxruntime_custom_ops.md#mmcvmodulateddeformconv2d) | Y | N | 1.3.12 |
+
+### 如何编译ONNX Runtime自定义算子?
+
+*请注意我们仅在**onnxruntime>=1.8.1**的Linux x86-64 cpu平台上进行过测试*
+
+#### 准备工作
+
+- 克隆代码仓库
+
+```bash
+git clone https://github.com/open-mmlab/mmcv.git
+```
+
+- 从ONNX Runtime下载`onnxruntime-linux`:[releases](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1),解压缩,根据路径创建变量`ONNXRUNTIME_DIR`并把路径下的lib目录添加到`LD_LIBRARY_PATH`,步骤如下:
+
+```bash
+wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
+
+tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
+cd onnxruntime-linux-x64-1.8.1
+export ONNXRUNTIME_DIR=$(pwd)
+export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
+```
+
+#### Linux系统下编译
+
+```bash
+cd mmcv ## to MMCV root directory
+MMCV_WITH_OPS=1 MMCV_WITH_ORT=1 python setup.py develop
+```
+
+### 如何在python下使用ONNX Runtime对导出的ONNX模型做推理
+
+使用`pip`安装ONNX Runtime
+
+```bash
+pip install onnxruntime==1.8.1
+```
+
+推理范例
+
+```python
+import os
+
+import numpy as np
+import onnxruntime as ort
+
+from mmcv.ops import get_onnxruntime_op_path
+
+ort_custom_op_path = get_onnxruntime_op_path()
+assert os.path.exists(ort_custom_op_path)
+session_options = ort.SessionOptions()
+session_options.register_custom_ops_library(ort_custom_op_path)
+## exported ONNX model with custom operators
+onnx_file = 'sample.onnx'
+input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
+sess = ort.InferenceSession(onnx_file, session_options)
+onnx_results = sess.run(None, {'input' : input_data})
+```
+
+### 如何为MMCV添加ONNX Runtime的自定义算子
+
+#### 开发前提醒
+
+- 确认该算子未包含在 ONNX Runtime 的[已实现算子列表](https://github.com/microsoft/onnxruntime/blob/master/docs/OperatorKernels.md)中。
+- 确保该自定义算子可以被ONNX导出。
+
+#### 添加方法
+
+以`soft_nms`为例:
+
+1. 在ONNX Runtime头文件目录`mmcv/ops/csrc/onnxruntime/`下添加头文件`soft_nms.h`
+2. 在ONNX Runtime源码目录`mmcv/ops/csrc/onnxruntime/cpu/`下添加算子实现`soft_nms.cpp`
+3. 在[onnxruntime_register.cpp](../../mmcv/ops/csrc/onnxruntime/cpu/onnxruntime_register.cpp)中注册实现的算子`soft_nms`
+
+ ```c++
+ #include "soft_nms.h"
+
+ SoftNmsOp c_SoftNmsOp;
+
+ if (auto status = ortApi->CustomOpDomain_Add(domain, &c_SoftNmsOp)) {
+ return status;
+ }
+ ```
+
+4. 在 `tests/test_ops/test_onnx.py` 中添加单元测试,可以参考[这里](../../tests/test_ops/test_onnx.py)。
+
+**最后,欢迎为MMCV添加ONNX Runtime自定义算子** :nerd_face:
+
+### 已知问题
+
+- "RuntimeError: tuple appears in op that does not forward tuples, unsupported kind: `prim::PythonOp`."
+  1. 请注意 `cummax` 和 `cummin` 算子是在 torch >= 1.5.0 时被添加的,但它们需要 torch >= 1.7.0 才能被正确导出,否则会在导出时发生上面的错误。
+ 2. 解决方法:升级PyTorch到1.7.0以上版本
+
+### 引用
+
+- [How to export Pytorch model with custom op to ONNX and run it in ONNX Runtime](https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md)
+- [How to add a custom operator/kernel in ONNX Runtime](https://github.com/microsoft/onnxruntime/blob/master/docs/AddingCustomOp.md)
diff --git a/docs_zh_CN/deployment/tensorrt_custom_ops.md b/docs_zh_CN/deployment/tensorrt_custom_ops.md
new file mode 100644
index 0000000000000000000000000000000000000000..123f2889bf18aa549c327ea70f3ba974b45e48f5
--- /dev/null
+++ b/docs_zh_CN/deployment/tensorrt_custom_ops.md
@@ -0,0 +1,391 @@
+## TensorRT自定义算子
+
+
+
+- [TensorRT自定义算子](#tensorrt自定义算子)
+ - [MMCVRoIAlign](#mmcvroialign)
+ - [描述](#描述)
+ - [模型参数](#模型参数)
+ - [输入](#输入)
+ - [输出](#输出)
+ - [类型约束](#类型约束)
+ - [ScatterND](#scatternd)
+ - [描述](#描述-1)
+ - [模型参数](#模型参数-1)
+ - [输入](#输入-1)
+ - [输出](#输出-1)
+ - [类型约束](#类型约束-1)
+ - [NonMaxSuppression](#nonmaxsuppression)
+ - [描述](#描述-2)
+ - [模型参数](#模型参数-2)
+ - [输入](#输入-2)
+ - [输出](#输出-2)
+ - [类型约束](#类型约束-2)
+ - [MMCVDeformConv2d](#mmcvdeformconv2d)
+ - [描述](#描述-3)
+ - [模型参数](#模型参数-3)
+ - [输入](#输入-3)
+ - [输出](#输出-3)
+ - [类型约束](#类型约束-3)
+ - [grid_sampler](#grid_sampler)
+ - [描述](#描述-4)
+ - [模型参数](#模型参数-4)
+ - [输入](#输入-4)
+ - [输出](#输出-4)
+ - [类型约束](#类型约束-4)
+ - [cummax](#cummax)
+ - [描述](#描述-5)
+ - [模型参数](#模型参数-5)
+ - [输入](#输入-5)
+ - [输出](#输出-5)
+ - [类型约束](#类型约束-5)
+ - [cummin](#cummin)
+ - [描述](#描述-6)
+ - [模型参数](#模型参数-6)
+ - [输入](#输入-6)
+ - [输出](#输出-6)
+ - [类型约束](#类型约束-6)
+ - [MMCVInstanceNormalization](#mmcvinstancenormalization)
+ - [描述](#描述-7)
+ - [模型参数](#模型参数-7)
+ - [输入](#输入-7)
+ - [输出](#输出-7)
+ - [类型约束](#类型约束-7)
+ - [MMCVModulatedDeformConv2d](#mmcvmodulateddeformconv2d)
+ - [描述](#描述-8)
+ - [模型参数](#模型参数-8)
+ - [输入](#输入-8)
+ - [输出](#输出-8)
+ - [类型约束](#类型约束-8)
+
+
+
+### MMCVRoIAlign
+
+#### 描述
+
+在特征图上计算RoIAlign,在多数双阶段目标检测模型的bbox_head中使用
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ------- | ---------------- | ------------------------------------------------------- |
+| `int` | `output_height` | roi特征的输出高度 |
+| `int` | `output_width` | roi特征的输出宽度 |
+| `float` | `spatial_scale` | 输入检测框的缩放系数 |
+| `int` | `sampling_ratio` | 输出的采样率。`0`表示使用密集采样 |
+| `str` | `mode` | 池化方式。 `avg`或`max` |
+| `int` | `aligned` | 如果`aligned=1`,则像素会进行-0.5的偏移以达到更好的对齐 |
+
+#### 输入
+
+
+inputs[0] : T
+输入特征图;形状为(N, C, H, W)的四维张量,其中N为batch大小,C为输入通道数,H和W为输入特征图的高和宽。
+inputs[1] : T
+需要进行池化的感兴趣区域;形状为(num_rois, 5)的二维张量,内容为[[batch_index, x1, y1, x2, y2], ...]。rois的坐标为输入特征图的坐标系。
+
+
+#### 输出
+
+
+outputs[0] : T
+池化的输出;形状为(num_rois, C, output_height, output_width)的四维张量。每个输出特征feat[i]都与输入感兴趣区域rois[i]一一对应。
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
+
+### ScatterND
+
+#### 描述
+
+ScatterND接收三个输入,分别为秩为r >= 1的`data`,秩为q >= 1的`indices`以及秩为 q + r - indices.shape[-1] - 1 的`updates`。输出的计算方式为:首先创建一个`data`的拷贝,然后根据`indices`的值使用`updates`对拷贝的`data`进行更新。注意`indices`中不应该存在相同的条目,也就是说对同一个位置进行一次以上的更新是不允许的。
+
+输出的计算方式可以参考如下代码:
+
+```python
+ output = np.copy(data)
+ update_indices = indices.shape[:-1]
+ for idx in np.ndindex(update_indices):
+ output[indices[idx]] = updates[idx]
+```
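+
+为帮助理解上述秩的约定,这里补充一个具体的数值例子(r=1、q=2,仅为示意):
+
+```python
+import numpy as np
+
+data = np.zeros(4, dtype=np.float32)            # r = 1
+indices = np.array([[1], [3]], dtype=np.int64)  # q = 2, indices.shape[-1] = 1
+updates = np.array([9., 7.], dtype=np.float32)  # 秩为 q + r - indices.shape[-1] - 1 = 1
+
+output = data.copy()
+for idx in np.ndindex(indices.shape[:-1]):
+    output[tuple(indices[idx])] = updates[idx]
+# output -> [0., 9., 0., 7.]
+```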
+
+#### 模型参数
+
+无
+
+#### 输入
+
+
+inputs[0] : T
+秩为r >= 1的输入`data`
+
+inputs[1] : tensor(int32, Linear)
+秩为q >= 1的输入`indices`
+
+inputs[2] : T
+秩为 q + r - indices.shape[-1] - 1 的输入`updates`
+
+
+#### 输出
+
+
+outputs[0] : T
+秩为r >= 1的输出张量
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear), tensor(int32, Linear)
+
+### NonMaxSuppression
+
+#### 描述
+
+根据IoU阈值对候选框进行非极大值抑制。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ------- | ---------------------------- | ---------------------------------------------------------------------------------------- |
+| `int` | `center_point_box` | 0 - 候选框的格式为 [y1, x1, y2, x2];1 - 候选框的格式为 [x_center, y_center, width, height] |
+| `int` | `max_output_boxes_per_class` | 每一类最大的输出检测框个数。默认为0,输出检测框个数等于输入候选框数 |
+| `float` | `iou_threshold` | 用来判断候选框重合度的阈值,取值范围[0, 1]。默认值为0 |
+| `float` | `score_threshold` | 用来判断候选框是否合法的阈值 |
+| `int` | `offset` | 检测框长宽计算方式为(x2 - x1 + offset),可选值0或1 |
+
+#### 输入
+
+
+inputs[0] : T
+输入候选框。形状为(num_batches, spatial_dimension, 4)的三维张量
+inputs[1] : T
+输入得分。形状为(num_batches, num_classes, spatial_dimension)的三维张量
+
+
+#### 输出
+
+
+outputs[0] : tensor(int32, Linear)
+被选中的候选框索引。形状为(num_selected_indices, 3)的二维张量。每一行内容为[batch_index, class_index, box_index]。
+其中 num_selected_indices=num_batches* num_classes* min(max_output_boxes_per_class, spatial_dimension)。
+所有未被选中的候选框索引都会被填充为-1
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
+
+### MMCVDeformConv2d
+
+#### 描述
+
+在输入特征上计算Deformable Convolution,请阅读[Deformable Convolutional Network](https://arxiv.org/abs/1703.06211)了解更多细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| -------------- | ------------------ | --------------------------------------------------------------------------------------------- |
+| `list of ints` | `stride` | 卷积的步长 (sH, sW) |
+| `list of ints` | `padding` | 输入特征填充大小 (padH, padW) |
+| `list of ints` | `dilation` | 卷积核各元素间隔 (dH, dW) |
+| `int` | `deformable_group` | 可变偏移量的分组 |
+| `int` | `group` | 卷积分组数,`input_channel`会根据这个值被分为数个分组进行计算 |
+| `int` | `im2col_step` | 可变卷积使用im2col计算卷积。输入与偏移量会以im2col_step为步长分块计算,减少临时空间的使用量。 |
+
+#### 输入
+
+
+inputs[0] : T
+输入特征;形状为(N, C, inH, inW)的四维张量,其中N为batch大小,C为输入通道数,inH和inW为输入特征图的高和宽
+inputs[1] : T
+输入偏移量;形状为(N, deformable_group* 2* kH* kW, outH, outW)的四维张量,kH和kW为卷积核的高和宽,outH和outW为输出特征图的高和宽
+inputs[2] : T
+输入权重;形状为(output_channel, input_channel, kH, kW)的四维张量
+
+
+#### 输出
+
+
+outputs[0] : T
+输出特征;形状为(N, output_channel, outH, outW)的四维张量
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
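+
+该算子对应 `mmcv.ops` 中的 Deformable Convolution。下面给出一段调用示意(仅为示意:假设有可用的 GPU,并使用会在内部自动生成 offset 的 `DeformConv2dPack`):
+
+```python
+import torch
+from mmcv.ops import DeformConv2dPack
+
+conv = DeformConv2dPack(3, 8, kernel_size=3, padding=1).cuda()
+x = torch.rand(1, 3, 16, 16).cuda()
+out = conv(x)  # (1, 8, 16, 16)
+```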
+
+### grid_sampler
+
+#### 描述
+
+根据`grid`的像素位置对`input`进行网格采样。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ----- | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `int` | `interpolation_mode` | 计算输出使用的插值模式。(0: `bilinear` , 1: `nearest`) |
+| `int` | `padding_mode` | 边缘填充模式。(0: `zeros`, 1: `border`, 2: `reflection`) |
+| `int` | `align_corners` | 如果`align_corners=1`,则极值(`-1`和`1`)会被当做输入边缘像素的中心点。如果`align_corners=0`,则它们会被看做是边缘像素的边缘点,减小分辨率对采样的影响 |
+
+#### 输入
+
+
+inputs[0] : T
+输入特征;形状为(N, C, inH, inW)的四维张量,其中N为batch大小,C为输入通道数,inH和inW为输入特征图的高和宽
+inputs[1] : T
+输入网格;形状为(N, outH, outW, 2)的四维张量,outH和outW为输出的高和宽
+
+
+#### 输出
+
+
+outputs[0] : T
+输出特征;形状为(N, C, outH, outW)的四维张量
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
+
+### cummax
+
+#### 描述
+
+返回一个元组(`values`, `indices`),其中`values`为`input`第`dim`维的累计最大值,`indices`为第`dim`维最大值位置。请阅读[torch.cummax](https://pytorch.org/docs/stable/generated/torch.cummax.html)了解更多细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ----- | ------ | ------------------ |
+| `int` | `dim` | 进行累计计算的维度 |
+
+#### 输入
+
+
+inputs[0] : T
+输入张量;可以是任意形状
+
+
+#### 输出
+
+
+outputs[0] : T
+`input`第`dim`维的累计最大值,形状与`input`相同。类型和`input`一致
+outputs[1] : (int32, Linear)
+第`dim`维最大值位置,形状与`input`相同
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
+
+### cummin
+
+#### 描述
+
+返回一个元组(`values`, `indices`),其中`values`为`input`第`dim`维的累计最小值,`indices`为第`dim`维最小值位置。请阅读[torch.cummin](https://pytorch.org/docs/stable/generated/torch.cummin.html)了解更多细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ----- | ------ | ------------------ |
+| `int` | `dim` | 进行累计计算的维度 |
+
+#### 输入
+
+
+inputs[0] : T
+输入张量;可以是任意形状
+
+
+#### 输出
+
+
+outputs[0] : T
+`input`第`dim`维的累计最小值,形状与`input`相同。类型和`input`一致
+outputs[1] : (int32, Linear)
+第`dim`维最小值位置,形状与`input`相同
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
+
+### MMCVInstanceNormalization
+
+#### 描述
+
+对特征计算instance normalization,请阅读[Instance Normalization: The Missing Ingredient for Fast Stylization](https://arxiv.org/abs/1607.08022)了解更多详细信息。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| ------- | --------- | ---------------------------- |
+| `float` | `epsilon` | 用来避免除0错误。默认为1e-05 |
+
+#### 输入
+
+
+inputs[0] : T
+输入特征。形状为(N, C, H, W)的四维张量,其中N为batch大小,C为输入通道数,H和W为输入特征图的高和宽
+inputs[1] : T
+输入缩放系数。形状为(C,)的一维张量
+inputs[2] : T
+输入偏移量。形状为(C,)的一维张量
+
+
+#### 输出
+
+
+outputs[0] : T
+输出特征。形状为(N, C, H, W)的四维张量
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
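+
+该算子的语义与 `torch.nn.InstanceNorm2d` 一致,下面用 PyTorch 演示各输入的形状(仅为示意):
+
+```python
+import torch
+
+m = torch.nn.InstanceNorm2d(3, eps=1e-5, affine=True)
+x = torch.rand(1, 3, 8, 8)
+out = m(x)  # 输出形状不变:(N, C, H, W)
+# 缩放系数与偏移量分别对应 m.weight、m.bias,形状均为 (C,)
+```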
+
+### MMCVModulatedDeformConv2d
+
+#### 描述
+
+在输入特征上计算Modulated Deformable Convolution,请阅读[Deformable ConvNets v2: More Deformable, Better Results](https://arxiv.org/abs/1811.11168?from=timeline)了解更多细节。
+
+#### 模型参数
+
+| 类型 | 参数名 | 描述 |
+| -------------- | ------------------- | ------------------------------------------------------------- |
+| `list of ints` | `stride` | 卷积的步长 (sH, sW) |
+| `list of ints` | `padding` | 输入特征填充大小 (padH, padW) |
+| `list of ints` | `dilation` | 卷积核各元素间隔 (dH, dW) |
+| `int` | `deformable_groups` | 可变偏移量的分组,通常置为 1 即可 |
+| `int` | `groups` | 卷积分组数,`input_channel`会根据这个值被分为数个分组进行计算 |
+
+#### 输入
+
+
+inputs[0] : T
+输入特征;形状为(N, C, inH, inW)的四维张量,其中N为batch大小,C为输入通道数,inH和inW为输入特征图的高和宽
+inputs[1] : T
+输入偏移量;形状为(N, deformable_group* 2* kH* kW, outH, outW)的四维张量,kH和kW为卷积核的高和宽,outH和outW为输出特征图的高和宽
+inputs[2] : T
+输入掩码;形状为(N, deformable_group* kH* kW, outH, outW)的四维张量
+inputs[3] : T
+输入权重;形状为(output_channel, input_channel, kH, kW)的四维张量
+inputs[4] : T, optional
+输入偏置;形状为(output_channel)的一维张量
+
+
+#### 输出
+
+
+outputs[0] : T
+输出特征;形状为(N, output_channel, outH, outW)的四维张量
+
+
+#### 类型约束
+
+- T:tensor(float32, Linear)
diff --git a/docs_zh_CN/deployment/tensorrt_plugin.md b/docs_zh_CN/deployment/tensorrt_plugin.md
new file mode 100644
index 0000000000000000000000000000000000000000..0f385b8e032fac3267a838367b53d26880a693c9
--- /dev/null
+++ b/docs_zh_CN/deployment/tensorrt_plugin.md
@@ -0,0 +1,177 @@
+## MMCV中的TensorRT自定义算子 (实验性)
+
+
+
+- [MMCV中的TensorRT自定义算子 (实验性)](#mmcv中的tensorrt自定义算子-实验性)
+ - [介绍](#介绍)
+ - [MMCV中的TensorRT插件列表](#mmcv中的tensorrt插件列表)
+ - [如何编译MMCV中的TensorRT插件](#如何编译mmcv中的tensorrt插件)
+ - [准备](#准备)
+ - [在Linux上编译](#在linux上编译)
+ - [创建TensorRT推理引擎并在python下进行推理](#创建tensorrt推理引擎并在python下进行推理)
+ - [如何在MMCV中添加新的TensorRT自定义算子](#如何在mmcv中添加新的tensorrt自定义算子)
+ - [主要流程](#主要流程)
+ - [注意](#注意)
+ - [已知问题](#已知问题)
+ - [引用](#引用)
+
+
+
+### 介绍
+
+**NVIDIA TensorRT**是一个为深度学习模型高性能推理准备的软件开发工具(SDK)。它包括深度学习推理优化器和运行时,可为深度学习推理应用提供低延迟和高吞吐量。请访问[developer's website](https://developer.nvidia.com/tensorrt)了解更多信息。
+为了简化TensorRT部署带有MMCV自定义算子的模型的流程,MMCV中添加了一系列TensorRT插件。
+
+### MMCV中的TensorRT插件列表
+
+| ONNX算子 | TensorRT插件 | MMCV版本 |
+| :-----------------------: | :-----------------------------------------------------------------------------: | :------: |
+| MMCVRoiAlign | [MMCVRoiAlign](./tensorrt_custom_ops.md#mmcvroialign) | 1.2.6 |
+| ScatterND | [ScatterND](./tensorrt_custom_ops.md#scatternd) | 1.2.6 |
+| NonMaxSuppression | [NonMaxSuppression](./tensorrt_custom_ops.md#nonmaxsuppression) | 1.3.0 |
+| MMCVDeformConv2d | [MMCVDeformConv2d](./tensorrt_custom_ops.md#mmcvdeformconv2d) | 1.3.0 |
+| grid_sampler | [grid_sampler](./tensorrt_custom_ops.md#grid_sampler) | 1.3.1 |
+| cummax | [cummax](./tensorrt_custom_ops.md#cummax) | 1.3.5 |
+| cummin | [cummin](./tensorrt_custom_ops.md#cummin) | 1.3.5 |
+| MMCVInstanceNormalization | [MMCVInstanceNormalization](./tensorrt_custom_ops.md#mmcvinstancenormalization) | 1.3.5 |
+| MMCVModulatedDeformConv2d | [MMCVModulatedDeformConv2d](./tensorrt_custom_ops.md#mmcvmodulateddeformconv2d) | master |
+
+注意
+
+- 以上所有算子均在 TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0 环境下开发。
+
+### 如何编译MMCV中的TensorRT插件
+
+#### 准备
+
+- 克隆代码仓库
+
+```bash
+git clone https://github.com/open-mmlab/mmcv.git
+```
+
+- 安装TensorRT
+
+从 [NVIDIA Developer Zone](https://developer.nvidia.com/nvidia-tensorrt-download) 下载合适的TensorRT版本。
+
+比如,对安装了cuda-10.2的x86-64的Ubuntu 16.04,下载文件为`TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz`.
+
+然后使用下面方式安装并配置环境
+
+```bash
+cd ~/Downloads
+tar -xvzf TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz
+export TENSORRT_DIR=`pwd`/TensorRT-7.2.1.6
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$TENSORRT_DIR/lib
+```
+
+安装python依赖: tensorrt, graphsurgeon, onnx-graphsurgeon
+
+```bash
+pip install $TENSORRT_DIR/python/tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl
+pip install $TENSORRT_DIR/onnx_graphsurgeon/onnx_graphsurgeon-0.2.6-py2.py3-none-any.whl
+pip install $TENSORRT_DIR/graphsurgeon/graphsurgeon-0.4.5-py2.py3-none-any.whl
+```
+
+想了解更多通过 tar 包安装 TensorRT 的信息,请访问 [NVIDIA 官网](https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-721/install-guide/index.html#installing-tar)。
+
+#### 在Linux上编译
+
+```bash
+cd mmcv ## to MMCV root directory
+MMCV_WITH_OPS=1 MMCV_WITH_TRT=1 pip install -e .
+```
+
+### 创建TensorRT推理引擎并在python下进行推理
+
+范例如下:
+
+```python
+import torch
+import onnx
+
+from mmcv.tensorrt import (TRTWrapper, onnx2trt, save_trt_engine,
+ is_tensorrt_plugin_loaded)
+
+assert is_tensorrt_plugin_loaded(), 'Requires to compile TensorRT plugins in mmcv'
+
+onnx_file = 'sample.onnx'
+trt_file = 'sample.trt'
+onnx_model = onnx.load(onnx_file)
+
+## Model input
+inputs = torch.rand(1, 3, 224, 224).cuda()
+## Model input shape info
+opt_shape_dict = {
+ 'input': [list(inputs.shape),
+ list(inputs.shape),
+ list(inputs.shape)]
+}
+
+## Create TensorRT engine
+max_workspace_size = 1 << 30
+trt_engine = onnx2trt(
+ onnx_model,
+ opt_shape_dict,
+ max_workspace_size=max_workspace_size)
+
+## Save TensorRT engine
+save_trt_engine(trt_engine, trt_file)
+
+## Run inference with TensorRT
+trt_model = TRTWrapper(trt_file, ['input'], ['output'])
+
+with torch.no_grad():
+ trt_outputs = trt_model({'input': inputs})
+ output = trt_outputs['output']
+
+```
+
+### 如何在MMCV中添加新的TensorRT自定义算子
+
+#### 主要流程
+
+下面是主要的步骤:
+
+1. 添加c++头文件
+2. 添加c++源文件
+3. 添加cuda kernel文件
+4. 在`trt_plugin.cpp`中注册插件
+5. 在`tests/test_ops/test_tensorrt.py`中添加单元测试
+
+**以RoIAlign算子插件`roi_align`举例。**
+
+1. 在TensorRT包含目录`mmcv/ops/csrc/tensorrt/`中添加头文件`trt_roi_align.hpp`
+2. 在TensorRT源码目录`mmcv/ops/csrc/tensorrt/plugins/`中添加源文件`trt_roi_align.cpp`
+3. 在TensorRT源码目录`mmcv/ops/csrc/tensorrt/plugins/`中添加cuda kernel文件`trt_roi_align_kernel.cu`
+4. 在[trt_plugin.cpp](https://github.com/open-mmlab/mmcv/blob/master/mmcv/ops/csrc/tensorrt/plugins/trt_plugin.cpp)中注册`roi_align`插件
+
+ ```c++
+ #include "trt_plugin.hpp"
+
+ #include "trt_roi_align.hpp"
+
+ REGISTER_TENSORRT_PLUGIN(RoIAlignPluginDynamicCreator);
+
+ extern "C" {
+ bool initLibMMCVInferPlugins() { return true; }
+ } // extern "C"
+ ```
+
+5. 在`tests/test_ops/test_tensorrt.py`中添加单元测试
+
+#### 注意
+
+- 部分MMCV中的自定义算子存在对应的cuda实现,在进行TensorRT插件开发的时候可以参考。
+
+### 已知问题
+
+- 无
+
+### 引用
+
+- [Developer guide of Nvidia TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html)
+- [TensorRT Open Source Software](https://github.com/NVIDIA/TensorRT)
+- [onnx-tensorrt](https://github.com/onnx/onnx-tensorrt)
+- [TensorRT python API](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/index.html)
+- [TensorRT c++ plugin API](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_plugin.html)
diff --git a/docs_zh_CN/faq.md b/docs_zh_CN/faq.md
new file mode 100644
index 0000000000000000000000000000000000000000..e5d6395720e9e210771e10256efb926a0da5f4fa
--- /dev/null
+++ b/docs_zh_CN/faq.md
@@ -0,0 +1,37 @@
+## 常见问题
+
+在这里我们列出了用户经常遇到的问题以及对应的解决方法。如果您遇到了其他常见的问题,并且知道可以帮到大家的解决办法,
+欢迎随时丰富这个列表。
+
+- MMCV 和 MMDetection 的兼容性问题;"ConvWS is already registered in conv layer"
+
+  请参考安装说明,为您的 MMDetection 版本安装正确版本的 MMCV。
+
+- "No module named 'mmcv.ops'"; "No module named 'mmcv._ext'"
+
+ 1. 使用 `pip uninstall mmcv` 卸载您环境中的 mmcv
+ 2. 按照上述说明安装 mmcv-full
+
+- "invalid device function" 或者 "no kernel image is available for execution"
+
+ 1. 检查 GPU 的 CUDA 计算能力
+  2. 运行 `python mmdet/utils/collect_env.py` 来检查 PyTorch、torchvision 和 MMCV 是否是针对正确的 GPU 架构构建的。
+     您可能需要设置 `TORCH_CUDA_ARCH_LIST` 来重新安装 MMCV。
+     兼容性问题可能出现在使用旧版本 GPU 时,如:colab 上的 Tesla K80 (3.7)
+ 3. 检查运行环境是否和 mmcv/mmdet 编译时的环境相同。例如,您可能使用 CUDA 10.0 编译 mmcv,但在 CUDA 9.0 的环境中运行它
+
+- "undefined symbol" 或者 "cannot open xxx.so"。
+
+ 1. 如果符号和 CUDA/C++ 相关(例如:libcudart.so 或者 GLIBCXX),请检查 CUDA/GCC 运行时的版本是否和编译 mmcv 的一致
+ 2. 如果符号和 PyTorch 相关(例如:符号包含 caffe、aten 和 TH),请检查 PyTorch 运行时的版本是否和编译 mmcv 的一致
+ 3. 运行 `python mmdet/utils/collect_env.py` 以检查 PyTorch、torchvision 和 MMCV 构建和运行的环境是否相同
+
+- "RuntimeError: CUDA error: invalid configuration argument"。
+
+ 这个错误可能是由于您的 GPU 性能不佳造成的。尝试降低[THREADS_PER_BLOCK](https://github.com/open-mmlab/mmcv/blob/cac22f8cf5a904477e3b5461b1cc36856c2793da/mmcv/ops/csrc/common_cuda_helper.hpp#L10)
+ 的值并重新编译 mmcv。
+
+- "RuntimeError: nms is not compiled with GPU support"。
+
+ 这个错误是由于您的 CUDA 环境没有正确安装。
+ 您可以尝试重新安装您的 CUDA 环境,然后删除 mmcv/build 文件夹并重新编译 mmcv。
diff --git a/docs_zh_CN/get_started/build.md b/docs_zh_CN/get_started/build.md
new file mode 100644
index 0000000000000000000000000000000000000000..77fb86e9cf5c805bdca5fdaff6f22768cbfe8d3e
--- /dev/null
+++ b/docs_zh_CN/get_started/build.md
@@ -0,0 +1,222 @@
+## 从源码编译 MMCV
+
+### 在 Linux 或者 macOS 上编译 MMCV
+
+克隆算法库
+
+```bash
+git clone https://github.com/open-mmlab/mmcv.git
+cd mmcv
+```
+
+你可以安装 lite 版本
+
+```bash
+pip install -e .
+```
+
+也可以安装 full 版本
+
+```bash
+MMCV_WITH_OPS=1 pip install -e .
+```
+
+如果是在 macOS 上编译,则需要在安装命令前添加一些环境变量
+
+```bash
+CC=clang CXX=clang++ CFLAGS='-stdlib=libc++'
+```
+
+例如
+
+```bash
+CC=clang CXX=clang++ CFLAGS='-stdlib=libc++' MMCV_WITH_OPS=1 pip install -e .
+```
+
+```{note}
+如果你打算使用 `opencv-python-headless` 而不是 `opencv-python`,例如在一个很小的容器环境或者没有图形用户界面的服务器中,你可以先安装 `opencv-python-headless`,这样在安装 mmcv 依赖的过程中会跳过 `opencv-python`
+```
+
+### 在 Windows 上编译 MMCV
+
+在 Windows 上编译 MMCV 比 Linux 复杂,本节将一步步介绍如何在 Windows 上编译 MMCV。
+
+#### 依赖项
+
+请首先安装以下的依赖项:
+
+- [Git](https://git-scm.com/download/win):安装期间,请选择 **add git to Path**
+- [Visual Studio Community 2019](https://visualstudio.microsoft.com):用于编译 C++ 和 CUDA 代码
+- [Miniconda](https://docs.conda.io/en/latest/miniconda.html):包管理工具
+- [CUDA 10.2](https://developer.nvidia.com/cuda-10.2-download-archive):如果只需要 CPU 版本可以不安装 CUDA,安装CUDA时,可根据需要进行自定义安装。如果已经安装新版本的显卡驱动,建议取消驱动程序的安装
+
+```{note}
+您需要知道如何在 Windows 上设置环境变量,尤其是 `PATH` 的设置,以下安装过程都会用到。
+```
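+
+例如,在 PowerShell 中可以这样将某个目录临时加入 `PATH`(目录仅为示例):
+
+```shell
+$env:PATH = "C:\some\dir;" + $env:PATH
+```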
+
+#### 设置 Python 环境
+
+1. 从 Windows 菜单启动 Anaconda 命令行
+
+```{note}
+如 Miniconda 安装程序所建议的,不要使用原始的 `cmd.exe` 或是 `powershell.exe`。Anaconda 命令行有两个版本,一个基于 PowerShell,一个基于传统的 `cmd.exe`。请注意,以下说明均基于 PowerShell 版本。
+```
+
+2. 创建一个新的 Conda 环境
+
+ ```shell
+ conda create --name mmcv python=3.7 # 经测试,3.6, 3.7, 3.8 也能通过
+ conda activate mmcv # 确保做任何操作前先激活环境
+ ```
+
+3. 安装 PyTorch 时,可以根据需要安装支持 CUDA 或不支持 CUDA 的版本
+
+ ```shell
+ # CUDA version
+ conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
+ # CPU version
+ conda install pytorch torchvision cpuonly -c pytorch
+ ```
+
+4. 准备 MMCV 源代码
+
+ ```shell
+ git clone https://github.com/open-mmlab/mmcv.git
+ cd mmcv
+ ```
+
+5. 安装所需 Python 依赖包
+
+ ```shell
+ pip3 install -r requirements.txt
+ ```
+
+#### 编译与安装 MMCV
+
+MMCV 有三种安装的模式:
+
+1. Lite 版本(不包含算子)
+
+   这种模式下不会编译任何算子,此时的 mmcv 是纯 Python 包
+
+2. Full 版本(只包含 CPU 算子)
+
+   只编译 CPU 算子(仅编译 x86 代码),编译出的版本只能在纯 CPU 环境下运行
+
+3. Full 版本(既包含 CPU 算子,又包含 CUDA 算子)
+
+   同时编译 `ops` 模块的 x86 与 CUDA 代码,编译出的版本可以在 GPU 上调用 CUDA 算子
+
+##### 通用步骤
+
+1. 设置 MSVC 编译器
+
+ 设置环境变量。添加 `C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\Hostx86\x64` 到 `PATH`,则 `cl.exe` 可以在命令行中运行,如下所示。
+
+ ```none
+ (base) PS C:\Users\xxx> cl
+ Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29111 for x64
+ Copyright (C) Microsoft Corporation. All rights reserved.
+
+ usage: cl [ option... ] filename... [ / link linkoption... ]
+ ```
+
+   为了兼容性,我们使用 x86-hosted 以及 x64-targeted 版本,即路径中的 `Hostx86\x64`。
+
+   因为 PyTorch 会解析 `cl.exe` 的输出以检查其版本,而且只识别 utf-8 编码,所以你可能需要将系统语言更改为英语:控制面板 -> 地区 -> 管理 -> 更改非 Unicode 程序使用的语言。
+
+##### 安装方式一:Lite version(不包含算子)
+
+在完成上述的通用步骤后,从菜单打开 Anaconda 命令行,输入以下命令
+
+```shell
+# 激活环境
+conda activate mmcv
+# 切换到 mmcv 根目录
+cd mmcv
+# 安装
+python setup.py develop
+# 检查是否安装成功
+pip list
+```
+
+##### 安装方式二:Full version(只编译 CPU 算子)
+
+1. 完成上述的通用步骤
+
+2. 设置环境变量
+
+ ```shell
+ $env:MMCV_WITH_OPS = 1
+ $env:MAX_JOBS = 8 # 根据你可用CPU以及内存量进行设置
+ ```
+
+3. 编译安装
+
+ ```shell
+ conda activate mmcv # 激活环境
+ cd mmcv # 改变路径
+ python setup.py build_ext # 如果成功, cl 将被启动用于编译算子
+ python setup.py develop # 安装
+ pip list # 检查是否安装成功
+ ```
+
+##### 安装方式三:Full version(既编译 CPU 算子又编译 CUDA 算子)
+
+1. 完成上述的通用步骤
+
+2. 设置环境变量
+
+ ```shell
+ $env:MMCV_WITH_OPS = 1
+ $env:MAX_JOBS = 8 # 根据你可用CPU以及内存量进行设置
+ ```
+
+3. 检查 `CUDA_PATH` 或者 `CUDA_HOME` 环境变量是否已经存在于 `envs` 之中
+
+ ```none
+ (base) PS C:\Users\WRH> ls env:
+
+ Name Value
+ ---- -----
+ CUDA_PATH C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
+ CUDA_PATH_V10_1 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
+ CUDA_PATH_V10_2 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
+ ```
+
+ 如果没有,你可以按照下面的步骤设置
+
+ ```shell
+ $env:CUDA_HOME = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2"
+ # 或者
+ $env:CUDA_HOME = $env:CUDA_PATH_V10_2 # CUDA_PATH_V10_2 已经在环境变量中
+ ```
+
+4. 设置 CUDA 的目标架构
+
+ ```shell
+ $env:TORCH_CUDA_ARCH_LIST="6.1" # 支持 GTX 1080
+ # 或者用所有支持的版本,但可能会变得很慢
+ $env:TORCH_CUDA_ARCH_LIST="3.5 3.7 5.0 5.2 6.0 6.1 7.0 7.5"
+ ```
+
+```{note}
+我们可以在[这里](https://developer.nvidia.com/cuda-gpus)查看 GPU 的计算能力
+```
+
+5. 编译安装
+
+ ```shell
+ $env:MMCV_WITH_OPS = 1
+ $env:MAX_JOBS = 8 # 根据你可用CPU以及内存量进行设置
+ conda activate mmcv # 激活环境
+ cd mmcv # 改变路径
+ python setup.py build_ext # 如果成功, cl 将被启动用于编译算子
+ python setup.py develop # 安装
+ pip list # 检查是否安装成功
+ ```
+
+```{note}
+如果你的 PyTorch 版本是 1.6.0,你可能会遇到这个 [issue](https://github.com/pytorch/pytorch/issues/42467) 提到的一些错误,此时可以参考这个 [pull request](https://github.com/pytorch/pytorch/pull/43380/files) 修改本地环境的 PyTorch 源代码
+```
+
+如果编译安装 mmcv 的过程中遇到了问题,你也许可以在[常见问题](../faq.html)中找到解决方法
diff --git a/docs_zh_CN/get_started/installation.md b/docs_zh_CN/get_started/installation.md
new file mode 100644
index 0000000000000000000000000000000000000000..20e8cd59545fefb833b35195c1df7b4d3736b281
--- /dev/null
+++ b/docs_zh_CN/get_started/installation.md
@@ -0,0 +1,158 @@
+## 安装 MMCV
+
+MMCV 有两个版本:
+
+- **mmcv-full**: 完整版,包含所有的特性以及丰富的开箱即用的 CUDA 算子。注意完整版本可能需要更长时间来编译。
+- **mmcv**: 精简版,不包含 CUDA 算子但包含其余所有特性和功能,类似 MMCV 1.0 之前的版本。如果你不需要使用 CUDA 算子的话,精简版可以作为一个考虑选项。
+
+```{warning}
+请不要在同一个环境中同时安装两个版本,否则可能会遇到类似 `ModuleNotFound` 的错误。在安装一个版本之前,需要先卸载另一个。如果 CUDA 可用,强烈推荐安装 mmcv-full。
+```
+
+a. 安装完整版
+
+在安装 mmcv-full 之前,请确保 PyTorch 已经成功安装在环境中,可以参考 PyTorch 官方[文档](https://pytorch.org/)。
+
+我们提供了不同 PyTorch 和 CUDA 版本的 mmcv-full 预编译包,可以大大简化用户安装编译的过程。强烈推荐通过预编译包来安装。另外,安装完成后可以运行 [check_installation.py](https://github.com/open-mmlab/mmcv/blob/master/.dev_scripts/check_installation.py) 脚本检查 mmcv-full 是否安装成功。
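+
+例如,可以按如下方式获取并运行该检查脚本(URL 为示意,请以仓库实际路径为准):
+
+```shell
+wget https://raw.githubusercontent.com/open-mmlab/mmcv/master/.dev_scripts/check_installation.py
+python check_installation.py
+```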
+
+i. 安装最新版本
+
+如下是安装最新版 ``mmcv-full`` 的命令
+
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
+```
+
+请将链接中的 ``{cu_version}`` 和 ``{torch_version}`` 根据自身需求替换成实际的版本号,例如想安装和 ``CUDA 11.1``、``PyTorch 1.9.0`` 兼容的最新版 ``mmcv-full``,使用如下替换过的命令
+
+```shell
+pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
+```
+
+```{note}
+PyTorch 在 1.x.0 和 1.x.1 之间通常是兼容的,故 mmcv-full 只提供 1.x.0 的编译包。如果你的 PyTorch 版本是 1.x.1,你可以放心地安装在 1.x.0 版本编译的 mmcv-full。例如,如果你的 PyTorch 版本是 1.8.1、CUDA 版本是 11.1,你可以使用以下命令安装 mmcv-full。
+
+`pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html`
+```
+
+如果想知道更多 CUDA 和 PyTorch 版本的命令,可以参考下面的表格,将链接中的 ``=={mmcv_version}`` 删去即可。
+
+ii. 安装特定的版本
+
+如下是安装特定版本 ``mmcv-full`` 的命令
+
+```shell
+pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
+```
+
+首先请参考版本发布信息找到想要安装的版本号,将 ``{mmcv_version}`` 替换成该版本号,例如 ``1.3.9``。
+然后将链接中的 ``{cu_version}`` 和 ``{torch_version}`` 根据自身需求替换成实际的版本号,例如想安装和 ``CUDA 11.1``、``PyTorch 1.9.0`` 兼容的 ``mmcv-full`` 1.3.9 版本,使用如下替换过的命令
+
+```shell
+pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
+```
+
+对于更多的 PyTorch 和 CUDA 版本组合,请参考下表:
+
+| CUDA | torch 1.10 | torch 1.9 | torch 1.8 | torch 1.7 | torch 1.6 | torch 1.5 |
+| ---- | ---------- | --------- | --------- | --------- | --------- | --------- |
+| 11.3 | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10.0/index.html` |  |  |  |  |  |
+| 11.1 | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html` |  |  |  |
+| 11.0 |  |  |  | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.0/index.html` |  |  |
+| 10.2 | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.10.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.9.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.7.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.6.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.5.0/index.html` |
+| 10.1 |  |  | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.5.0/index.html` |
+| 9.2 |  |  |  | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu92/torch1.7.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu92/torch1.6.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cu92/torch1.5.0/index.html` |
+| cpu | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.10.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.9.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.8.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.7.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.6.0/index.html` | `pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.5.0/index.html` |
+
+```{note}
+以上提供的预编译包并不囊括所有的 mmcv-full 版本,我们可以点击对应链接查看支持的版本。例如,点击 [cu102-torch1.8.0](https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html),可以看到 `cu102-torch1.8.0` 只提供了 1.3.0 及以上的 mmcv-full 版本。另外,从 `mmcv v1.3.17` 开始,我们不再提供 `PyTorch 1.3 & 1.4` 对应的 mmcv-full 预编译包。你可以在[这里](./previous_versions.md)找到 `PyTorch 1.3 & 1.4` 对应的预编译包。虽然我们不再提供 `PyTorch 1.3 & 1.4` 对应的预编译包,但是我们依然在 CI 中保证对它们的兼容性会持续到下一年。
+```
+
+除了使用预编译包之外,另一种方式是在本地进行编译,直接运行下述命令
+
+```shell
+pip install mmcv-full
+```
+
+但注意本地编译可能会耗时 10 分钟以上。
+
+b. 安装精简版
+
+```shell
+pip install mmcv
+```
+
+c. 安装完整版并且编译 onnxruntime 的自定义算子
+
+- 详细的指南请查看 [这里](https://mmcv.readthedocs.io/zh_CN/latest/deployment/onnxruntime_custom_ops.html)。
+
+如果想从源码编译 MMCV,请参考[该文档](https://mmcv.readthedocs.io/zh_CN/latest/get_started/build.html)。
diff --git a/docs_zh_CN/get_started/introduction.md b/docs_zh_CN/get_started/introduction.md
new file mode 100644
index 0000000000000000000000000000000000000000..0082ae88a6a94fb09c76d9a821121ceb58b901a5
--- /dev/null
+++ b/docs_zh_CN/get_started/introduction.md
@@ -0,0 +1,30 @@
+## 介绍 MMCV
+
+MMCV 是一个面向计算机视觉的基础库,它支持了很多开源项目,例如:
+
+- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab 图像分类工具箱
+- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab 目标检测工具箱
+- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab 新一代通用 3D 目标检测平台
+- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab 语义分割工具箱
+- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab 新一代视频理解工具箱
+- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab 一体化视频目标感知平台
+- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab 姿态估计工具箱
+- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab 图像视频编辑工具箱
+- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab 全流程文字检测识别理解工具包
+- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab 图片视频生成模型工具箱
+
+MMCV 提供了如下众多功能:
+
+- 通用的 IO 接口
+- 图像和视频处理
+- 图像和标注结果可视化
+- 常用小工具(进度条,计时器等)
+- 基于 PyTorch 的通用训练框架
+- 多种 CNN 网络结构
+- 高质量实现的常见 CUDA 算子
+
+如想了解更多特性和使用,请参考[文档](https://mmcv.readthedocs.io/zh_CN/latest)。
+
+```{note}
+MMCV 需要 Python 3.6 以上版本。
+```
diff --git a/docs/zh_cn/get_started/previous_versions.md b/docs_zh_CN/get_started/previous_versions.md
similarity index 93%
rename from docs/zh_cn/get_started/previous_versions.md
rename to docs_zh_CN/get_started/previous_versions.md
index d543818752b51985169d4489bd46708725ce422d..56679d48181290768f33d0da866b7399ca63e710 100644
--- a/docs/zh_cn/get_started/previous_versions.md
+++ b/docs_zh_CN/get_started/previous_versions.md
@@ -1,10 +1,11 @@
+
## 其他版本的 PyTorch
我们不再提供在较低的 `PyTorch` 版本下编译的 `mmcv-full` 包,但为了您的方便,您可以在下面找到它们。
### PyTorch 1.4
-| 1.0.0 \<= mmcv_version \<= 1.2.1
+| 1.0.0 <= mmcv_version <= 1.2.1
#### CUDA 10.1
@@ -26,7 +27,7 @@ pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dis
### PyTorch v1.3
-| 1.0.0 \<= mmcv_version \<= 1.3.16
+| 1.0.0 <= mmcv_version <= 1.3.16
#### CUDA 10.1
diff --git a/docs/zh_cn/index.rst b/docs_zh_CN/index.rst
similarity index 65%
rename from docs/zh_cn/index.rst
rename to docs_zh_CN/index.rst
index 98cf08890618e699c7ac4731093818a07e862362..b6d00a534e6d2ccc00d2350ecb412ba66613dec3 100644
--- a/docs/zh_cn/index.rst
+++ b/docs_zh_CN/index.rst
@@ -10,22 +10,30 @@
get_started/introduction.md
get_started/installation.md
get_started/build.md
- get_started/article.md
.. toctree::
:maxdepth: 2
:caption: 深入理解 MMCV
+ understand_mmcv/config.md
+ understand_mmcv/registry.md
+ understand_mmcv/runner.md
+ understand_mmcv/io.md
understand_mmcv/data_process.md
- understand_mmcv/data_transform.md
understand_mmcv/visualization.md
understand_mmcv/cnn.md
understand_mmcv/ops.md
+ understand_mmcv/utils.md
.. toctree::
- :caption: 语言切换
+ :maxdepth: 2
+ :caption: 部署
- switch_language.md
+ deployment/onnx.md
+ deployment/onnxruntime_op.md
+ deployment/onnxruntime_custom_ops.md
+ deployment/tensorrt_plugin.md
+ deployment/tensorrt_custom_ops.md
.. toctree::
:maxdepth: 2
@@ -34,6 +42,8 @@
compatibility.md
.. toctree::
+ :maxdepth: 2
+ :caption: 常见问题
faq.md
@@ -43,20 +53,12 @@
community/contributing.md
community/pr.md
- community/code_style.md
.. toctree::
- :maxdepth: 1
+ :maxdepth: 2
:caption: API 文档
- mmcv.image
- mmcv.video
- mmcv.visualization
- mmcv.cnn
- mmcv.ops
- mmcv.transforms
- mmcv.arraymisc
- mmcv.utils
+ api.rst
Indices and tables
diff --git a/docs/zh_cn/make.bat b/docs_zh_CN/make.bat
similarity index 100%
rename from docs/zh_cn/make.bat
rename to docs_zh_CN/make.bat
diff --git a/docs/zh_cn/mmcv-logo.png b/docs_zh_CN/mmcv-logo.png
similarity index 100%
rename from docs/zh_cn/mmcv-logo.png
rename to docs_zh_CN/mmcv-logo.png
diff --git a/docs_zh_CN/understand_mmcv/cnn.md b/docs_zh_CN/understand_mmcv/cnn.md
new file mode 100644
index 0000000000000000000000000000000000000000..9027cf38dc48cbe342a48c3f4e658d629d2e0974
--- /dev/null
+++ b/docs_zh_CN/understand_mmcv/cnn.md
@@ -0,0 +1,525 @@
+## 卷积神经网络
+
+我们为卷积神经网络提供了一些构建模块,包括层构建、模块组件和权重初始化。
+
+### 网络层的构建
+
+在运行实验时,我们可能需要尝试同属一种类型但不同配置的层,但又不希望每次都修改代码。于是我们提供一些层构建方法,可以从字典构建层,字典可以在配置文件中配置,也可以通过命令行参数指定。
+
+#### 用法
+
+一个简单的例子:
+
+```python
+cfg = dict(type='Conv3d')
+layer = build_conv_layer(cfg, in_channels=3, out_channels=8, kernel_size=3)
+```
+
+- `build_conv_layer`: 支持的类型包括 Conv1d、Conv2d、Conv3d、Conv(Conv 是 Conv2d 的别名)
+- `build_norm_layer`: 支持的类型包括 BN1d、BN2d、BN3d、BN(BN 是 BN2d 的别名)、SyncBN、GN、LN、IN1d、IN2d、IN3d、IN(IN 是 IN2d 的别名)
+- `build_activation_layer`:支持的类型包括 ReLU、LeakyReLU、PReLU、RReLU、ReLU6、ELU、Sigmoid、Tanh、GELU
+- `build_upsample_layer`: 支持的类型包括 nearest、bilinear、deconv、pixel_shuffle
+- `build_padding_layer`: 支持的类型包括 zero、reflect、replicate
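+
+下面给出一个基于上述接口的简单示例(仅作演示,配置可按需调整):
+
+```python
+from mmcv.cnn import build_activation_layer, build_norm_layer
+
+# build_norm_layer 返回 (name, layer) 二元组,name 由类型缩写和 postfix 组成
+name, norm = build_norm_layer(dict(type='BN'), num_features=8)
+act = build_activation_layer(dict(type='ReLU'))
+```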
+
+#### 拓展
+
+我们还允许自定义层和算子来扩展构建方法。
+
+1. 编写和注册自己的模块:
+
+ ```python
+ from mmcv.cnn import UPSAMPLE_LAYERS
+
+ @UPSAMPLE_LAYERS.register_module()
+ class MyUpsample:
+
+ def __init__(self, scale_factor):
+ pass
+
+ def forward(self, x):
+ pass
+ ```
+
+2. 在某处导入 `MyUpsample` (例如 `__init__.py` )然后使用它:
+
+ ```python
+ cfg = dict(type='MyUpsample', scale_factor=2)
+ layer = build_upsample_layer(cfg)
+ ```
+
+### 模块组件
+
+我们还提供了常用的模块组件,以方便网络构建。
+卷积组件 `ConvModule` 由 convolution、normalization 以及 activation layers 组成,更多细节请参考 [ConvModule api](api.html#mmcv.cnn.ConvModule)。
+
+```python
+# conv + bn + relu
+conv = ConvModule(3, 8, 2, norm_cfg=dict(type='BN'))
+# conv + gn + relu
+conv = ConvModule(3, 8, 2, norm_cfg=dict(type='GN', num_groups=2))
+# conv + relu
+conv = ConvModule(3, 8, 2)
+# conv
+conv = ConvModule(3, 8, 2, act_cfg=None)
+# conv + leaky relu
+conv = ConvModule(3, 8, 3, padding=1, act_cfg=dict(type='LeakyReLU'))
+# bn + conv + relu
+conv = ConvModule(
+ 3, 8, 2, norm_cfg=dict(type='BN'), order=('norm', 'conv', 'act'))
+```
+
+### Weight initialization
+
+> 实现细节可以在 [mmcv/cnn/utils/weight_init.py](../../mmcv/cnn/utils/weight_init.py)中找到
+
+在训练过程中,适当的初始化策略有利于加快训练速度或者获得更高的性能。 在MMCV中,我们提供了一些常用的方法来初始化模块,比如 `nn.Conv2d` 模块。当然,我们也提供了一些高级API,可用于初始化包含一个或多个模块的模型。
+
+#### Initialization functions
+
+以函数的方式初始化 `nn.Module` ,例如 `nn.Conv2d` 、 `nn.Linear` 等。
+
+我们提供以下初始化方法,
+
+- constant_init
+
+ 使用给定常量值初始化模型参数
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import constant_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # constant_init(module, val, bias=0)
+ >>> constant_init(conv1, 1, 0)
+ >>> conv1.weight
+ ```
+
+- xavier_init
+
+ 按照 [Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010)](http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf) 描述的方法初始化模型参数
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import xavier_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # xavier_init(module, gain=1, bias=0, distribution='normal')
+ >>> xavier_init(conv1, distribution='normal')
+ ```
+
+- normal_init
+
+ 使用正态分布(高斯分布)初始化模型参数
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import normal_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # normal_init(module, mean=0, std=1, bias=0)
+ >>> normal_init(conv1, std=0.01, bias=0)
+ ```
+
+- uniform_init
+
+ 使用均匀分布初始化模型参数
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import uniform_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # uniform_init(module, a=0, b=1, bias=0)
+ >>> uniform_init(conv1, a=0, b=1)
+ ```
+
+- kaiming_init
+
+ 按照 [Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015)](https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf) 描述的方法来初始化模型参数。
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import kaiming_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # kaiming_init(module, a=0, mode='fan_out', nonlinearity='relu', bias=0, distribution='normal')
+ >>> kaiming_init(conv1)
+ ```
+
+- caffe2_xavier_init
+
+ caffe2中实现的 `xavier initialization`,对应于 PyTorch中的 `kaiming_uniform_`
+
+ ```python
+ >>> import torch.nn as nn
+ >>> from mmcv.cnn import caffe2_xavier_init
+ >>> conv1 = nn.Conv2d(3, 3, 1)
+ >>> # caffe2_xavier_init(module, bias=0)
+ >>> caffe2_xavier_init(conv1)
+ ```
+
+- bias_init_with_prob
+
+ 根据给定的概率初始化 `conv/fc`, 这在 [Focal Loss for Dense Object Detection](https://arxiv.org/pdf/1708.02002.pdf) 提出。
+
+ ```python
+ >>> from mmcv.cnn import bias_init_with_prob
+ >>> # bias_init_with_prob is proposed in Focal Loss
+ >>> bias = bias_init_with_prob(0.01)
+ >>> bias
+ -4.59511985013459
+ ```
+
+#### Initializers and configs
+
+在初始化方法的基础上,我们定义了相应的初始化类,并将它们注册到 `INITIALIZERS` 中,这样我们就可以使用 `config` 配置来初始化模型了。
+
+我们提供以下初始化类:
+
+- ConstantInit
+- XavierInit
+- NormalInit
+- UniformInit
+- KaimingInit
+- Caffe2XavierInit
+- PretrainedInit
+
+接下来详细介绍 `initialize` 的使用方法
+
+1. 通过关键字 `layer` 来初始化模型
+
+ 如果我们只定义了关键字 `layer` ,那么只初始化 `layer` 中包含的层。
+
+ 注意: 关键字 `layer` 支持的模块是带有 weights 和 bias 属性的 PyTorch 模块,所以不支持 `MultiheadAttention layer`
+
+- 定义关键字 `layer` 列表并使用相同配置初始化模块
+
+ ```python
+ import torch.nn as nn
+ from mmcv.cnn import initialize
+
+ class FooNet(nn.Module):
+ def __init__(self):
+ super().__init__()
+ self.feat = nn.Conv1d(3, 1, 3)
+ self.reg = nn.Conv2d(3, 3, 3)
+ self.cls = nn.Linear(1, 2)
+
+ model = FooNet()
+ init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d', 'Linear'], val=1)
+ # 使用相同的配置初始化整个模块
+ initialize(model, init_cfg)
+ # model.feat.weight
+ # Parameter containing:
+ # tensor([[[1., 1., 1.],
+ # [1., 1., 1.],
+ # [1., 1., 1.]]], requires_grad=True)
+ ```
+
+- 定义关键字 `layer` 用于初始化不同配置的层
+
+ ```python
+ import torch.nn as nn
+ from mmcv.cnn.utils import initialize
+
+ class FooNet(nn.Module):
+ def __init__(self):
+ super().__init__()
+ self.feat = nn.Conv1d(3, 1, 3)
+ self.reg = nn.Conv2d(3, 3, 3)
+ self.cls = nn.Linear(1,2)
+
+ model = FooNet()
+ init_cfg = [dict(type='Constant', layer='Conv1d', val=1),
+ dict(type='Constant', layer='Conv2d', val=2),
+ dict(type='Constant', layer='Linear', val=3)]
+ # nn.Conv1d 使用 dict(type='Constant', val=1) 初始化
+ # nn.Conv2d 使用 dict(type='Constant', val=2) 初始化
+ # nn.Linear 使用 dict(type='Constant', val=3) 初始化
+ initialize(model, init_cfg)
+ # model.reg.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+ ```
+
+2. 定义关键字`override`初始化模型
+
+- 当用属性名初始化某个特定部分时,我们可以使用关键字 `override`,关键字 `override` 对应的值会替代 init_cfg 中相应的值
+
+ ```python
+ import torch.nn as nn
+ from mmcv.cnn import initialize
+
+ class FooNet(nn.Module):
+ def __init__(self):
+ super().__init__()
+ self.feat = nn.Conv1d(3, 1, 3)
+ self.reg = nn.Conv2d(3, 3, 3)
+ self.cls = nn.Sequential(nn.Conv1d(3, 1, 3), nn.Linear(1,2))
+
+ # 如果我们想将模型的权重初始化为 1,将偏差初始化为 2
+    # 但希望 `reg` 中的权重为 3,偏差为 4,则我们可以使用关键字 override
+
+ model = FooNet()
+ init_cfg = dict(type='Constant', layer=['Conv1d','Conv2d'], val=1, bias=2,
+ override=dict(type='Constant', name='reg', val=3, bias=4))
+    # 使用 dict(type='Constant', val=1, bias=2) 来初始化 self.feat 和 self.cls
+    # 使用 dict(type='Constant', val=3, bias=4) 来初始化 'reg' 模块
+ initialize(model, init_cfg)
+ # model.reg.weight
+ # Parameter containing:
+ # tensor([[[[3., 3., 3.],
+ # [3., 3., 3.],
+ # [3., 3., 3.]],
+ # ...,
+ # [[3., 3., 3.],
+ # [3., 3., 3.],
+ # [3., 3., 3.]]]], requires_grad=True)
+ ```
+
+- 如果 init_cfg 中的关键字 `layer` 为 None,则只初始化关键字 override 中指定的子模块,此时 override 中的 type 和其他参数可以省略
+
+ ```python
+ model = FooNet()
+ init_cfg = dict(type='Constant', val=1, bias=2, override=dict(name='reg'))
+    # self.feat 和 self.cls 使用 PyTorch 默认的初始化
+ # 将使用 dict(type='Constant', val=1, bias=2) 初始化名为 'reg' 的模块
+ initialize(model, init_cfg)
+ # model.reg.weight
+ # Parameter containing:
+ # tensor([[[[1., 1., 1.],
+ # [1., 1., 1.],
+ # [1., 1., 1.]],
+ # ...,
+ # [[1., 1., 1.],
+ # [1., 1., 1.],
+ # [1., 1., 1.]]]], requires_grad=True)
+ ```
+
+- 如果我们没有定义关键字`layer`或`override` , 将不会初始化任何东西
+
+- 关键字`override`的无效用法
+
+ ```python
+ # 没有重写任何子模块
+ init_cfg = dict(type='Constant', layer=['Conv1d','Conv2d'],
+ val=1, bias=2,
+ override=dict(type='Constant', val=3, bias=4))
+
+ # 没有指定type,即便有其他参数,也是无效的。
+ init_cfg = dict(type='Constant', layer=['Conv1d','Conv2d'],
+ val=1, bias=2,
+ override=dict(name='reg', val=3, bias=4))
+ ```
+
+3. 用预训练模型初始化
+
+ ```python
+ import torch.nn as nn
+ import torchvision.models as models
+ from mmcv.cnn import initialize
+
+ # 使用预训练模型来初始化
+ model = models.resnet50()
+ # model.conv1.weight
+ # Parameter containing:
+ # tensor([[[[-6.7435e-03, -2.3531e-02, -9.0143e-03, ..., -2.1245e-03,
+ # -1.8077e-03, 3.0338e-03],
+ # [-1.2603e-02, -2.7831e-02, 2.3187e-02, ..., -1.5793e-02,
+ # 1.1655e-02, 4.5889e-03],
+ # [-3.7916e-02, 1.2014e-02, 1.3815e-02, ..., -4.2651e-03,
+ # 1.7314e-02, -9.9998e-03],
+ # ...,
+
+ init_cfg = dict(type='Pretrained',
+ checkpoint='torchvision://resnet50')
+ initialize(model, init_cfg)
+ # model.conv1.weight
+ # Parameter containing:
+ # tensor([[[[ 1.3335e-02, 1.4664e-02, -1.5351e-02, ..., -4.0896e-02,
+ # -4.3034e-02, -7.0755e-02],
+ # [ 4.1205e-03, 5.8477e-03, 1.4948e-02, ..., 2.2060e-03,
+ # -2.0912e-02, -3.8517e-02],
+ # [ 2.2331e-02, 2.3595e-02, 1.6120e-02, ..., 1.0281e-01,
+ # 6.2641e-02, 5.1977e-02],
+ # ...,
+
+ # 使用关键字'prefix'用预训练模型的特定部分来初始化子模块权重
+ model = models.resnet50()
+ url = 'http://download.openmmlab.com/mmdetection/v2.0/retinanet/'\
+ 'retinanet_r50_fpn_1x_coco/'\
+ 'retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth'
+ init_cfg = dict(type='Pretrained',
+ checkpoint=url, prefix='backbone.')
+ initialize(model, init_cfg)
+ ```
+
+4. 初始化继承自BaseModule、Sequential、ModuleList的模型
+
+   `BaseModule` 继承自 `torch.nn.Module`,它们之间唯一的不同是 `BaseModule` 实现了 `init_weights`
+
+ `Sequential` 继承自 `BaseModule` 和 `torch.nn.Sequential`
+
+ `ModuleList` 继承自 `BaseModule` 和 `torch.nn.ModuleList`
+
+   ```python
+ import torch.nn as nn
+ from mmcv.runner import BaseModule, Sequential, ModuleList
+
+ class FooConv1d(BaseModule):
+
+ def __init__(self, init_cfg=None):
+ super().__init__(init_cfg)
+ self.conv1d = nn.Conv1d(4, 1, 4)
+
+ def forward(self, x):
+ return self.conv1d(x)
+
+ class FooConv2d(BaseModule):
+
+ def __init__(self, init_cfg=None):
+ super().__init__(init_cfg)
+ self.conv2d = nn.Conv2d(3, 1, 3)
+
+ def forward(self, x):
+ return self.conv2d(x)
+
+ # BaseModule
+ init_cfg = dict(type='Constant', layer='Conv1d', val=0., bias=1.)
+ model = FooConv1d(init_cfg)
+ model.init_weights()
+ # model.conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+
+ # Sequential
+ init_cfg1 = dict(type='Constant', layer='Conv1d', val=0., bias=1.)
+ init_cfg2 = dict(type='Constant', layer='Conv2d', val=2., bias=3.)
+ model1 = FooConv1d(init_cfg1)
+ model2 = FooConv2d(init_cfg2)
+ seq_model = Sequential(model1, model2)
+ seq_model.init_weights()
+ # seq_model[0].conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+ # seq_model[1].conv2d.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+
+ # inner init_cfg has higher priority
+ model1 = FooConv1d(init_cfg1)
+ model2 = FooConv2d(init_cfg2)
+ init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=4., bias=5.)
+ seq_model = Sequential(model1, model2, init_cfg=init_cfg)
+ seq_model.init_weights()
+ # seq_model[0].conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+ # seq_model[1].conv2d.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+
+ # ModuleList
+ model1 = FooConv1d(init_cfg1)
+ model2 = FooConv2d(init_cfg2)
+ modellist = ModuleList([model1, model2])
+ modellist.init_weights()
+ # modellist[0].conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+ # modellist[1].conv2d.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+
+ # inner init_cfg has higher priority
+ model1 = FooConv1d(init_cfg1)
+ model2 = FooConv2d(init_cfg2)
+ init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=4., bias=5.)
+ modellist = ModuleList([model1, model2], init_cfg=init_cfg)
+ modellist.init_weights()
+ # modellist[0].conv1d.weight
+ # Parameter containing:
+ # tensor([[[0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.],
+ # [0., 0., 0., 0.]]], requires_grad=True)
+ # modellist[1].conv2d.weight
+ # Parameter containing:
+ # tensor([[[[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]],
+ # ...,
+ # [[2., 2., 2.],
+ # [2., 2., 2.],
+ # [2., 2., 2.]]]], requires_grad=True)
+   ```
+
+### Model Zoo
+
+除了`torchvision`的预训练模型,我们还提供以下 CNN 的预训练模型:
+
+- VGG Caffe
+- ResNet Caffe
+- ResNeXt
+- ResNet with Group Normalization
+- ResNet with Group Normalization and Weight Standardization
+- HRNetV2
+- Res2Net
+- RegNet
+
+#### Model URLs in JSON
+
+MMCV 中的 Model Zoo 链接由 JSON 文件管理。JSON 文件由模型名称及其 url 或 path 的键值对组成,一个 JSON 文件的内容可能类似于:
+
+```json
+{
+ "model_a": "https://example.com/models/model_a_9e5bac.pth",
+ "model_b": "pretrain/model_b_ab3ef2c.pth"
+}
+```
+
+可以在[此处](https://github.com/open-mmlab/mmcv/blob/master/mmcv/model_zoo/open_mmlab.json)找到托管在 OpenMMLab AWS 上的预训练模型的默认链接。
+
+你可以通过将 `open-mmlab.json` 放在 `MMCV_HOME`下来覆盖默认链接,如果在环境中找不到`MMCV_HOME`,则默认使用 `~/.cache/mmcv`。当然你也可以使用命令 `export MMCV_HOME=/your/path`来设置自己的路径。
+
+外部的 JSON 文件将被合并进默认的 JSON 文件中,如果相同的键同时出现在外部 JSON 和默认 JSON 中,则使用外部 JSON 中的值。
+
+#### Load Checkpoint
+
+`mmcv.load_checkpoint()`的参数`filename`支持以下类型:
+
+- filepath: `checkpoint`路径
+- `http://xxx` 和 `https://xxx`: 下载 checkpoint 的链接,文件名中必须包含 `SHA256` 后缀
+- `torchvision://xxx`: `torchvision.models`中的模型链接,更多细节参考 [torchvision](https://pytorch.org/docs/stable/torchvision/models.html)
+- `open-mmlab://xxx`: 默认和其他 json 文件中提供的模型链接或文件路径
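+
+下面是一个加载 `torchvision` 预训练权重的简单示例(这里假设通过 `mmcv.runner.load_checkpoint` 调用,实际入口以安装版本为准):
+
+```python
+import torchvision.models as models
+from mmcv.runner import load_checkpoint
+
+model = models.resnet50()
+# 从 torchvision 的模型库下载并加载 ResNet-50 的预训练权重
+checkpoint = load_checkpoint(model, 'torchvision://resnet50')
+```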
diff --git a/docs_zh_CN/understand_mmcv/config.md b/docs_zh_CN/understand_mmcv/config.md
new file mode 100644
index 0000000000000000000000000000000000000000..c6da308833ebb3e1588d7dfb5ba66cc90fb5ee42
--- /dev/null
+++ b/docs_zh_CN/understand_mmcv/config.md
@@ -0,0 +1,176 @@
+## 配置
+
+`Config` 类用于操作配置文件,它支持从多种文件格式中加载配置,包括 **python**, **json** 和 **yaml**。
+它提供了类似字典对象的接口来获取和设置值。
+
+以配置文件 `test.py` 为例
+
+```python
+a = 1
+b = dict(b1=[0, 1, 2], b2=None)
+c = (1, 2)
+d = 'string'
+```
+
+加载与使用配置文件
+
+```python
+>>> cfg = Config.fromfile('test.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b1=[0, 1, 2], b2=None),
+... c=(1, 2),
+... d='string')
+```
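+
+`Config` 对象支持字典和属性两种访问方式,下面是一个简单的示意(基于上面的 `test.py`):
+
+```python
+>>> cfg.a
+1
+>>> cfg['b']['b1']
+[0, 1, 2]
+>>> cfg.d = 'new string'  # 也可以直接赋值修改配置项
+```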
+
+对于所有格式的配置文件,都支持一些预定义变量。它会将 `{{ var }}` 替换为实际值。
+
+目前支持以下四个预定义变量:
+
+`{{ fileDirname }}` - 当前打开文件的目录名,例如 /home/your-username/your-project/folder
+
+`{{ fileBasename }}` - 当前打开文件的文件名,例如 file.ext
+
+`{{ fileBasenameNoExtension }}` - 当前打开文件不包含扩展名的文件名,例如 file
+
+`{{ fileExtname }}` - 当前打开文件的扩展名,例如 .ext
+
+这些变量名引用自 [VS Code](https://code.visualstudio.com/docs/editor/variables-reference)。
+
+这里是一个带有预定义变量的配置文件的例子。
+
+`config_a.py`
+```python
+a = 1
+b = './work_dir/{{ fileBasenameNoExtension }}'
+c = '{{ fileExtname }}'
+```
+
+```python
+>>> cfg = Config.fromfile('./config_a.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b='./work_dir/config_a',
+... c='.py')
+```
+
+对于所有格式的配置文件, 都支持继承。为了重用其他配置文件的字段,
+需要指定 `_base_='./config_a.py'` 或者一个包含配置文件的列表 `_base_=['./config_a.py', './config_b.py']`。
+
+这里有 4 个配置继承关系的例子。
+
+`config_a.py` 作为基类配置文件
+
+```python
+a = 1
+b = dict(b1=[0, 1, 2], b2=None)
+```
+
+### 不含重复键值对从基类配置文件继承
+
+`config_b.py`
+
+```python
+_base_ = './config_a.py'
+c = (1, 2)
+d = 'string'
+```
+
+```python
+>>> cfg = Config.fromfile('./config_b.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b1=[0, 1, 2], b2=None),
+... c=(1, 2),
+... d='string')
+```
+
+`config_b.py` 中的新字段会与 `config_a.py` 中的旧字段合并。
+
+### 含重复键值对从基类配置文件继承
+
+`config_c.py`
+
+```python
+_base_ = './config_a.py'
+b = dict(b2=1)
+c = (1, 2)
+```
+
+```python
+>>> cfg = Config.fromfile('./config_c.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b1=[0, 1, 2], b2=1),
+... c=(1, 2))
+```
+
+基类配置文件 `config_a.py` 中的 `b.b2=None` 被配置文件 `config_c.py` 中的 `b.b2=1` 替代。
+
+### 从具有忽略字段的配置文件继承
+
+`config_d.py`
+
+```python
+_base_ = './config_a.py'
+b = dict(_delete_=True, b2=None, b3=0.1)
+c = (1, 2)
+```
+
+```python
+>>> cfg = Config.fromfile('./config_d.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b2=None, b3=0.1),
+... c=(1, 2))
+```
+
+您还可以设置 `_delete_=True` 来忽略基类配置文件中的某些字段。`b` 中的旧键 `b1, b2` 将会被新键 `b2, b3` 所取代。
+
+### 从多个基类配置文件继承(基类配置文件不应包含相同的键)
+
+`config_e.py`
+
+```python
+c = (1, 2)
+d = 'string'
+```
+
+`config_f.py`
+
+```python
+_base_ = ['./config_a.py', './config_e.py']
+```
+
+```python
+>>> cfg = Config.fromfile('./config_f.py')
+>>> print(cfg)
+>>> dict(a=1,
+... b=dict(b1=[0, 1, 2], b2=None),
+... c=(1, 2),
+... d='string')
+```
+
+### 从基类引用变量
+
+您可以使用以下语法引用在基类中定义的变量。
+
+`base.py`
+
+```python
+item1 = 'a'
+item2 = dict(item3 = 'b')
+```
+
+`config_g.py`
+
+```python
+_base_ = ['./base.py']
+item = dict(a = {{ _base_.item1 }}, b = {{ _base_.item2.item3 }})
+```
+
+```python
+>>> cfg = Config.fromfile('./config_g.py')
+>>> print(cfg.pretty_text)
+item1 = 'a'
+item2 = dict(item3='b')
+item = dict(a='a', b='b')
+```
diff --git a/docs/zh_cn/understand_mmcv/data_process.md b/docs_zh_CN/understand_mmcv/data_process.md
similarity index 93%
rename from docs/zh_cn/understand_mmcv/data_process.md
rename to docs_zh_CN/understand_mmcv/data_process.md
index 7e0afd1e690b51d43d6e5b88cfa198dee32eb3d2..0885fe03353738d42b4503c9dddf4ec70883c5bb 100644
--- a/docs/zh_cn/understand_mmcv/data_process.md
+++ b/docs_zh_CN/understand_mmcv/data_process.md
@@ -130,7 +130,7 @@ bboxes = np.array([[10, 10, 100, 120], [0, 0, 50, 50]])
patches = mmcv.imcrop(img, bboxes)
# 裁剪两个区域并且缩放区域1.2倍
-patches = mmcv.imcrop(img, bboxes, scale=1.2)
+patches = mmcv.imcrop(img, bboxes, scale_ratio=1.2)
```
#### 填充
@@ -144,13 +144,13 @@ img = mmcv.imread('tests/data/color.jpg')
img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=0)
# 用给定值分别填充图像的3个通道至 (1000, 1200)
-img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=(100, 50, 200))
+img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=[100, 50, 200])
# 用给定值填充图像的左、右、上、下四条边
img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=0)
# 用3个值分别填充图像的左、右、上、下四条边的3个通道
-img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=(100, 50, 200))
+img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=[100, 50, 200])
# 将图像的四条边填充至能够被给定值整除
img_ = mmcv.impad_to_multiple(img, 32)
@@ -252,24 +252,24 @@ flow = mmcv.flowread('compressed.jpg', quantize=True, concat_axis=1)
mmcv.flowshow(flow)
```
-
+
-1. 流变换
+3. 流变换
```python
img1 = mmcv.imread('img1.jpg')
flow = mmcv.flowread('flow.flo')
-warped_img2 = mmcv.flow_warp(img1, flow)
+warpped_img2 = mmcv.flow_warp(img1, flow)
```
img1 (左) and img2 (右)
-
+
光流 (img2 -> img1)
-
+
变换后的图像和真实图像的差异
-
+
diff --git a/docs_zh_CN/understand_mmcv/io.md b/docs_zh_CN/understand_mmcv/io.md
new file mode 100644
index 0000000000000000000000000000000000000000..0e5002f828f5489ee0447d65501de78e20d3f093
--- /dev/null
+++ b/docs_zh_CN/understand_mmcv/io.md
@@ -0,0 +1,240 @@
+## 文件输入输出
+
+文件输入输出模块提供了两个通用的 API 接口用于读取和保存不同格式的文件。
+
+```{note}
+在 v1.3.16 及之后的版本中,IO 模块支持从不同后端读取数据,也支持将数据保存至不同后端。更多细节请访问 PR [#1330](https://github.com/open-mmlab/mmcv/pull/1330)。
+```
+
+### 读取和保存数据
+
+`mmcv` 提供了一个通用的 api 用于读取和保存数据,目前支持的格式有 json、yaml 和 pickle。
+
+#### 从硬盘读取数据或者将数据保存至硬盘
+
+```python
+import mmcv
+
+# 从文件中读取数据
+data = mmcv.load('test.json')
+data = mmcv.load('test.yaml')
+data = mmcv.load('test.pkl')
+# 从文件对象中读取数据
+with open('test.json', 'r') as f:
+ data = mmcv.load(f, file_format='json')
+
+# 将数据序列化为字符串
+json_str = mmcv.dump(data, file_format='json')
+
+# 将数据保存至文件 (根据文件名后缀反推文件类型)
+mmcv.dump(data, 'out.pkl')
+
+# 将数据保存至文件对象
+with open('test.yaml', 'w') as f:
+ data = mmcv.dump(data, f, file_format='yaml')
+```
+
+#### 从其他后端加载或者保存至其他后端
+
+```python
+import mmcv
+
+# 从 s3 文件读取数据
+data = mmcv.load('s3://bucket-name/test.json')
+data = mmcv.load('s3://bucket-name/test.yaml')
+data = mmcv.load('s3://bucket-name/test.pkl')
+
+# 将数据保存至 s3 文件 (根据文件名后缀反推文件类型)
+mmcv.dump(data, 's3://bucket-name/out.pkl')
+```
+
+我们提供了易于拓展的方式以支持更多的文件格式。我们只需要创建一个继承自 `BaseFileHandler` 的
+文件句柄类并将其注册到 `mmcv` 中即可。句柄类至少需要重写三个方法。
+
+```python
+import mmcv
+
+# 支持为文件句柄类注册多个文件格式
+# @mmcv.register_handler(['txt', 'log'])
+@mmcv.register_handler('txt')
+class TxtHandler1(mmcv.BaseFileHandler):
+
+ def load_from_fileobj(self, file):
+ return file.read()
+
+ def dump_to_fileobj(self, obj, file):
+ file.write(str(obj))
+
+ def dump_to_str(self, obj, **kwargs):
+ return str(obj)
+```
+
+以 `PickleHandler` 为例
+
+```python
+import pickle
+
+class PickleHandler(mmcv.BaseFileHandler):
+
+ def load_from_fileobj(self, file, **kwargs):
+ return pickle.load(file, **kwargs)
+
+ def load_from_path(self, filepath, **kwargs):
+ return super(PickleHandler, self).load_from_path(
+ filepath, mode='rb', **kwargs)
+
+ def dump_to_str(self, obj, **kwargs):
+ kwargs.setdefault('protocol', 2)
+ return pickle.dumps(obj, **kwargs)
+
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ kwargs.setdefault('protocol', 2)
+ pickle.dump(obj, file, **kwargs)
+
+ def dump_to_path(self, obj, filepath, **kwargs):
+ super(PickleHandler, self).dump_to_path(
+ obj, filepath, mode='wb', **kwargs)
+```
+
+### 读取文件并返回列表或字典
+
+例如, `a.txt` 是文本文件,一共有5行内容。
+
+```
+a
+b
+c
+d
+e
+```
+
+#### 从硬盘读取
+
+使用 `list_from_file` 读取 `a.txt`
+
+```python
+>>> mmcv.list_from_file('a.txt')
+['a', 'b', 'c', 'd', 'e']
+>>> mmcv.list_from_file('a.txt', offset=2)
+['c', 'd', 'e']
+>>> mmcv.list_from_file('a.txt', max_num=2)
+['a', 'b']
+>>> mmcv.list_from_file('a.txt', prefix='/mnt/')
+['/mnt/a', '/mnt/b', '/mnt/c', '/mnt/d', '/mnt/e']
+```
+
+同样, `b.txt` 也是文本文件,一共有3行内容
+
+```
+1 cat
+2 dog cow
+3 panda
+```
+
+使用 `dict_from_file` 读取 `b.txt`
+
+```python
+>>> mmcv.dict_from_file('b.txt')
+{'1': 'cat', '2': ['dog', 'cow'], '3': 'panda'}
+>>> mmcv.dict_from_file('b.txt', key_type=int)
+{1: 'cat', 2: ['dog', 'cow'], 3: 'panda'}
+```
+
+#### 从其他后端读取
+
+使用 `list_from_file` 读取 `s3://bucket-name/a.txt`
+
+```python
+>>> mmcv.list_from_file('s3://bucket-name/a.txt')
+['a', 'b', 'c', 'd', 'e']
+>>> mmcv.list_from_file('s3://bucket-name/a.txt', offset=2)
+['c', 'd', 'e']
+>>> mmcv.list_from_file('s3://bucket-name/a.txt', max_num=2)
+['a', 'b']
+>>> mmcv.list_from_file('s3://bucket-name/a.txt', prefix='/mnt/')
+['/mnt/a', '/mnt/b', '/mnt/c', '/mnt/d', '/mnt/e']
+```
+
+使用 `dict_from_file` 读取 `b.txt`
+
+```python
+>>> mmcv.dict_from_file('s3://bucket-name/b.txt')
+{'1': 'cat', '2': ['dog', 'cow'], '3': 'panda'}
+>>> mmcv.dict_from_file('s3://bucket-name/b.txt', key_type=int)
+{1: 'cat', 2: ['dog', 'cow'], 3: 'panda'}
+```
+
+### 读取和保存权重文件
+
+#### 从硬盘读取权重文件或者将权重文件保存至硬盘
+
+我们可以通过下面的方式从磁盘读取权重文件或者将权重文件保存至磁盘
+
+```python
+import torch
+
+filepath1 = '/path/of/your/checkpoint1.pth'
+filepath2 = '/path/of/your/checkpoint2.pth'
+# 从 filepath1 读取权重文件
+checkpoint = torch.load(filepath1)
+# 将权重文件保存至 filepath2
+torch.save(checkpoint, filepath2)
+```
+
+MMCV 提供了很多后端,`HardDiskBackend` 是其中一个,我们可以通过它来读取或者保存权重文件。
+
+```python
+import io
+
+import torch
+from mmcv.fileio.file_client import HardDiskBackend
+
+disk_backend = HardDiskBackend()
+with io.BytesIO(disk_backend.get(filepath1)) as buffer:
+    checkpoint = torch.load(buffer)
+with io.BytesIO() as buffer:
+    # 先将权重序列化到内存缓冲区,再写入磁盘
+    torch.save(checkpoint, buffer)
+    disk_backend.put(buffer.getvalue(), filepath2)
+```
+
+如果我们想在接口中实现根据文件路径自动选择对应的后端,我们可以使用 `FileClient`。
+例如,我们想实现两个方法,分别是读取权重以及保存权重,它们需支持不同类型的文件路径,可以是磁盘路径,也可以是网络路径或者其他路径。
+
+```python
+from mmcv.fileio.file_client import FileClient
+
+def load_checkpoint(path):
+    file_client = FileClient.infer_client(uri=path)
+    with io.BytesIO(file_client.get(path)) as buffer:
+        checkpoint = torch.load(buffer)
+    return checkpoint
+
+def save_checkpoint(checkpoint, path):
+    file_client = FileClient.infer_client(uri=path)
+    with io.BytesIO() as buffer:
+        torch.save(checkpoint, buffer)
+        file_client.put(buffer.getvalue(), path)
+
+checkpoint = load_checkpoint(filepath1)
+save_checkpoint(checkpoint, filepath2)
+```
+
+#### 从网络远端读取权重文件
+
+```{note}
+目前只支持从网络远端读取权重文件,暂不支持将权重文件写入网络远端
+```
+
+```python
+import io
+import torch
+from mmcv.fileio.file_client import HTTPBackend, FileClient
+
+filepath = 'http://path/of/your/checkpoint.pth'
+checkpoint = torch.utils.model_zoo.load_url(filepath)
+
+http_backend = HTTPBackend()
+with io.BytesIO(http_backend.get(filepath)) as buffer:
+ checkpoint = torch.load(buffer)
+
+file_client = FileClient.infer_client(uri=filepath)
+with io.BytesIO(file_client.get(filepath)) as buffer:
+ checkpoint = torch.load(buffer)
+```
diff --git a/docs_zh_CN/understand_mmcv/ops.md b/docs_zh_CN/understand_mmcv/ops.md
new file mode 100644
index 0000000000000000000000000000000000000000..a45bb14862ad0ec05d5fa4d66954ac1465bb668c
--- /dev/null
+++ b/docs_zh_CN/understand_mmcv/ops.md
@@ -0,0 +1,36 @@
+## CUDA 算子
+
+MMCV 提供了检测、分割等任务中常用的 CUDA 算子
+
+- AssignScoreWithK
+- BallQuery
+- BBoxOverlaps
+- CARAFE
+- CrissCrossAttention
+- ContextBlock
+- CornerPool
+- Deformable Convolution v1/v2
+- Deformable RoIPool
+- DynamicScatter
+- GatherPoints
+- FurthestPointSample
+- FurthestPointSampleWithDist
+- GeneralizedAttention
+- KNN
+- MaskedConv
+- NMS
+- PSAMask
+- RoIPointPool3d
+- RoIPool
+- RoIAlign
+- RoIAwarePool3d
+- SimpleRoIAlign
+- SigmoidFocalLoss
+- SoftmaxFocalLoss
+- SoftNMS
+- Synchronized BatchNorm
+- Voxelization
+- ThreeInterpolate
+- ThreeNN
+- Weight standardization
+- Correlation
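+
+这些算子可以直接从 `mmcv.ops` 导入使用。下面以 NMS 为例给出一个最小调用示意(数据仅作演示,需安装包含算子的 mmcv-full):
+
+```python
+import torch
+from mmcv.ops import nms
+
+boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]])
+scores = torch.tensor([0.9, 0.8])
+# dets 为拼接了分数的检测框,inds 为被保留框的索引
+dets, inds = nms(boxes, scores, iou_threshold=0.5)
+```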
diff --git a/docs_zh_CN/understand_mmcv/registry.md b/docs_zh_CN/understand_mmcv/registry.md
new file mode 100644
index 0000000000000000000000000000000000000000..3afd0ab66e8e9787280ce54cdfb807e2acf60827
--- /dev/null
+++ b/docs_zh_CN/understand_mmcv/registry.md
@@ -0,0 +1,149 @@
+## 注册器
+MMCV 使用 [注册器](https://github.com/open-mmlab/mmcv/blob/master/mmcv/utils/registry.py) 来管理具有相似功能的不同模块, 例如, 检测器中的主干网络、头部、和模型颈部。
+在 OpenMMLab 家族中的绝大部分开源项目使用注册器去管理数据集和模型的模块,例如 [MMDetection](https://github.com/open-mmlab/mmdetection), [MMDetection3D](https://github.com/open-mmlab/mmdetection3d), [MMClassification](https://github.com/open-mmlab/mmclassification), [MMEditing](https://github.com/open-mmlab/mmediting) 等。
+
+### 什么是注册器
+在MMCV中,注册器可以看作类到字符串的映射。
+一个注册器中的类通常有相似的接口,但是可以实现不同的算法或支持不同的数据集。
+借助注册器,用户可以根据字符串查找到对应的类,并按需实例化相应模块。
+一个典型的案例是 OpenMMLab 中大部分开源项目的配置系统:它们通过配置文件,借助注册器创建钩子、执行器、模型和数据集。
+可以在[这里](https://mmcv.readthedocs.io/en/latest/api.html?highlight=registry#mmcv.utils.Registry)找到注册器接口使用文档。
+
+使用 `registry`(注册器)管理代码库中的模型,需要以下三个步骤。
+
+1. 创建一个构建方法(可选,在大多数情况下您可以只使用默认方法)
+2. 创建注册器
+3. 使用此注册器来管理模块
+
+`Registry`(注册器)的参数 `build_func`(构建函数)用来自定义如何实例化类,默认使用[这里](https://mmcv.readthedocs.io/en/latest/api.html?highlight=registry#mmcv.utils.build_from_cfg)实现的 `build_from_cfg`。
+
+### 一个简单的例子
+
+这里是一个使用注册器管理包中模块的简单示例。您可以在 OpenMMLab 开源项目中找到更多实例。
+
+假设我们要实现一系列数据集转换器(Dataset Converter),用于将不同格式的数据转换为标准数据格式。我们先创建一个名为 `converters` 的目录作为包,在包中创建一个文件来实现构建器(builder),命名为 `converters/builder.py`,如下所示
+
+```python
+from mmcv.utils import Registry
+# 创建转换器(converter)的注册器(registry)
+CONVERTERS = Registry('converter')
+```
+
+然后我们在包中可以实现不同的转换器(converter)。例如,在 `converters/converter1.py` 中实现 `Converter1`。
+
+```python
+from .builder import CONVERTERS
+
+# 使用注册器管理模块
+@CONVERTERS.register_module()
+class Converter1(object):
+ def __init__(self, a, b):
+ self.a = a
+ self.b = b
+```
+
+使用注册器管理模块的关键步骤是,将实现的模块注册到注册表 `CONVERTERS` 中。通过 `@CONVERTERS.register_module()` 装饰所实现的模块,`CONVERTERS` 就可以构建并维护字符串与类之间的映射,如下所示:
+
+```python
+'Converter1' -> <class 'Converter1'>
+```
+
+如果模块被成功注册了,你可以通过配置文件使用这个转换器(converter),如下所示:
+
+```python
+converter_cfg = dict(type='Converter1', a=a_value, b=b_value)
+converter = CONVERTERS.build(converter_cfg)
+```
+
+### 自定义构建函数
+
+假设我们想自定义 `converters` 的构建流程,我们可以实现一个自定义的 `build_func` (构建函数)并将其传递到注册器中。
+
+```python
+from mmcv.utils import Registry
+
+# 创建一个构建函数
+def build_converter(cfg, registry, *args, **kwargs):
+ cfg_ = cfg.copy()
+ converter_type = cfg_.pop('type')
+ if converter_type not in registry:
+ raise KeyError(f'Unrecognized converter type {converter_type}')
+ else:
+ converter_cls = registry.get(converter_type)
+
+ converter = converter_cls(*args, **kwargs, **cfg_)
+ return converter
+
+# 创建一个用于转换器(converter)的注册器,并将 `build_converter` 作为构建函数传入
+CONVERTERS = Registry('converter', build_func=build_converter)
+```
+
+```{note}
+注:在这个例子中,我们演示了如何使用参数 `build_func` 自定义构建类实例的方法,其功能类似于默认的 `build_from_cfg`。在大多数情况下,使用默认方法就足够了。
+```
+
+`build_model_from_cfg` 也实现了将 PyTorch 模块构建成 `nn.Sequential` 的功能,可以直接使用,如下例所示。
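+
+例如,向 `build_model_from_cfg` 传入配置列表时,会按顺序构建各模块并组合成 `nn.Sequential`(一个最小示意,假设 `NetA`、`NetB` 均已注册到 `MODELS`):
+
+```python
+from mmcv.cnn import MODELS, build_model_from_cfg
+
+# 传入列表 -> 返回 nn.Sequential(NetA(), NetB())
+model = build_model_from_cfg(
+    [dict(type='NetA'), dict(type='NetB')], MODELS)
+```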
+
+### 注册器层结构
+
+你也可以从多个 OpenMMLab 开源框架中构建模块,例如,你可以把所有 [MMClassification](https://github.com/open-mmlab/mmclassification) 中的主干网络(backbone)用到 [MMDetection](https://github.com/open-mmlab/mmdetection) 的目标检测中,你也可以融合 [MMDetection](https://github.com/open-mmlab/mmdetection) 中的目标检测模型 和 [MMSegmentation](https://github.com/open-mmlab/mmsegmentation) 语义分割模型。
+
+下游代码库中所有 `MODELS` 注册器都是MMCV `MODELS` 注册器的子注册器。基本上,使用以下两种方法从子注册器或相邻兄弟注册器构建模块。
+
+1. 从子注册器中构建
+
+ 例如:
+
+ 我们在 MMDetection 中定义:
+
+ ```python
+   import torch.nn as nn
+   from mmcv.utils import Registry
+ from mmcv.cnn import MODELS as MMCV_MODELS
+ MODELS = Registry('model', parent=MMCV_MODELS)
+
+ @MODELS.register_module()
+ class NetA(nn.Module):
+ def forward(self, x):
+ return x
+ ```
+
+ 我们在 MMClassification 中定义:
+
+ ```python
+   import torch.nn as nn
+   from mmcv.utils import Registry
+ from mmcv.cnn import MODELS as MMCV_MODELS
+ MODELS = Registry('model', parent=MMCV_MODELS)
+
+ @MODELS.register_module()
+ class NetB(nn.Module):
+ def forward(self, x):
+ return x + 1
+ ```
+
+ 我们可以通过以下代码在 MMDetection 或 MMClassification 中构建两个网络:
+
+ ```python
+ from mmdet.models import MODELS
+ net_a = MODELS.build(cfg=dict(type='NetA'))
+ net_b = MODELS.build(cfg=dict(type='mmcls.NetB'))
+ ```
+
+ 或
+
+ ```python
+ from mmcls.models import MODELS
+ net_a = MODELS.build(cfg=dict(type='mmdet.NetA'))
+ net_b = MODELS.build(cfg=dict(type='NetB'))
+ ```
+
+2. 从父注册器中构建
+
+ MMCV中的共享`MODELS`注册器是所有下游代码库的父注册器(根注册器):
+
+ ```python
+ from mmcv.cnn import MODELS as MMCV_MODELS
+ net_a = MMCV_MODELS.build(cfg=dict(type='mmdet.NetA'))
+ net_b = MMCV_MODELS.build(cfg=dict(type='mmcls.NetB'))
+ ```
diff --git a/docs_zh_CN/understand_mmcv/runner.md b/docs_zh_CN/understand_mmcv/runner.md
new file mode 100644
index 0000000000000000000000000000000000000000..203a5dcacfd709772dce8c411a25bb8a623e0dd7
--- /dev/null
+++ b/docs_zh_CN/understand_mmcv/runner.md
@@ -0,0 +1,155 @@
+## 执行器
+
+执行器模块负责模型训练过程调度,主要目的是让用户使用更少的代码以及灵活可配置方式开启训练。其具备如下核心特性:
+
+- 支持以 `EpochBasedRunner` 和 `IterBasedRunner` 为单位的迭代模式以满足不同场景
+- 支持定制工作流以满足训练过程中各状态自由切换,目前支持训练和验证两个工作流。工作流可以简单理解为一个完整的训练和验证迭代过程。
+- 配合各类默认和自定义 Hook,对外提供了灵活扩展能力
+
+### EpochBasedRunner
+
+顾名思义,`EpochBasedRunner` 是指以 epoch 为周期的工作流,例如设置 workflow = [('train', 2), ('val', 1)] 表示循环迭代地训练 2 个 epoch,然后验证 1 个 epoch。MMDetection 目标检测框架默认采用的是 `EpochBasedRunner`。
+
+其抽象逻辑如下所示:
+
+```python
+# 训练终止条件
+while curr_epoch < max_epochs:
+ # 遍历用户设置的工作流,例如 workflow = [('train', 2),('val', 1)]
+ for i, flow in enumerate(workflow):
+ # mode 是工作流函数,例如 train, epochs 是迭代次数
+ mode, epochs = flow
+ # 要么调用 self.train(),要么调用 self.val()
+ epoch_runner = getattr(self, mode)
+ # 运行对应工作流函数
+ for _ in range(epochs):
+ epoch_runner(data_loaders[i], **kwargs)
+```
+
+目前支持训练和验证两个工作流,以训练函数为例,其抽象逻辑是:
+
+```python
+# epoch_runner 目前可以是 train 或者 val
+def train(self, data_loader, **kwargs):
+ # 遍历 dataset,共返回一个 epoch 的 batch 数据
+ for i, data_batch in enumerate(data_loader):
+ self.call_hook('before_train_iter')
+ # 验证时候 train_mode=False
+ self.run_iter(data_batch, train_mode=True, **kwargs)
+ self.call_hook('after_train_iter')
+ self.call_hook('after_train_epoch')
+```
+
+### IterBasedRunner
+不同于 `EpochBasedRunner`,`IterBasedRunner` 是指以 iter 为周期的工作流,例如设置 workflow = [('train', 2), ('val', 1)] 表示循环迭代地训练 2 个 iter,然后验证 1 个 iter。MMSegmentation 语义分割框架默认采用的是 `IterBasedRunner`。
+
+其抽象逻辑如下所示:
+
+```python
+# 虽然是 iter 单位,但是某些场合需要 epoch 信息,由 IterLoader 提供
+iter_loaders = [IterLoader(x) for x in data_loaders]
+# 训练终止条件
+while curr_iter < max_iters:
+ # 遍历用户设置的工作流,例如 workflow = [('train', 2), ('val', 1)]
+ for i, flow in enumerate(workflow):
+ # mode 是工作流函数,例如 train, iters 是迭代次数
+ mode, iters = flow
+ # 要么调用 self.train(),要么调用 self.val()
+ iter_runner = getattr(self, mode)
+ # 运行对应工作流函数
+ for _ in range(iters):
+ iter_runner(iter_loaders[i], **kwargs)
+```
+
+目前支持训练和验证两个工作流,以验证函数为例,其抽象逻辑是:
+
+```python
+# iter_runner 目前可以是 train 或者 val
+def val(self, data_loader, **kwargs):
+ # 获取 batch 数据,用于一次迭代
+ data_batch = next(data_loader)
+ self.call_hook('before_val_iter')
+ outputs = self.model.val_step(data_batch, self.optimizer, **kwargs)
+ self.outputs = outputs
+ self.call_hook('after_val_iter')
+```
+
+除了上述基础功能外,`EpochBasedRunner` 和 `IterBasedRunner` 还提供了 resume、save_checkpoint 以及注册 hook 等功能。
+
+### 一个简单例子
+以最常用的分类任务为例详细说明 `runner` 的使用方法。 开启任何一个训练任务,都需要包括如下步骤:
+
+**(1) dataloader、model 和优化器等类初始化**
+
+```python
+# 模型类初始化
+model=...
+# 优化器类初始化,典型值 cfg.optimizer = dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001)
+optimizer = build_optimizer(model, cfg.optimizer)
+# 工作流对应的 dataloader 初始化
+data_loaders = [
+ build_dataloader(
+ ds,
+ cfg.data.samples_per_gpu,
+ cfg.data.workers_per_gpu,
+ ...) for ds in dataset
+ ]
+```
+
+**(2) runner 类初始化**
+
+```python
+runner = build_runner(
+ # cfg.runner 典型配置为
+ # runner = dict(type='EpochBasedRunner', max_epochs=200)
+ cfg.runner,
+ default_args=dict(
+ model=model,
+ batch_processor=None,
+ optimizer=optimizer,
+ logger=logger))
+```
+
+**(3) 注册默认训练所必须的 hook,和用户自定义 hook**
+
+```python
+# 注册定制必需的 hook
+runner.register_training_hooks(
+ # lr相关配置,典型为
+ # lr_config = dict(policy='step', step=[100, 150])
+ cfg.lr_config,
+ # 优化相关配置,例如 grad_clip 等
+ optimizer_config,
+ # 权重保存相关配置,典型为
+ # checkpoint_config = dict(interval=1),每个单位都保存权重
+ cfg.checkpoint_config,
+ # 日志相关配置
+ cfg.log_config,
+ ...)
+
+# 注册用户自定义 hook
+# 例如想使用 ema 功能,则可以设置 custom_hooks=[dict(type='EMAHook')]
+if cfg.get('custom_hooks', None):
+ custom_hooks = cfg.custom_hooks
+ for hook_cfg in cfg.custom_hooks:
+ hook_cfg = hook_cfg.copy()
+ priority = hook_cfg.pop('priority', 'NORMAL')
+ hook = build_from_cfg(hook_cfg, HOOKS)
+ runner.register_hook(hook, priority=priority)
+```
+
+然后可以调用 resume 或者 load_checkpoint 来加载已有权重,示意如下。
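+
+下面是一个恢复训练或加载权重的最小示意(路径仅为示例):
+
+```python
+# 从上次保存的检查点恢复训练,同时恢复优化器状态和迭代进度
+runner.resume('./work_dir/latest.pth')
+# 或者只加载模型权重
+runner.load_checkpoint('./work_dir/epoch_10.pth')
+```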
+
+**(4) 开启训练流**
+
+```python
+# workflow 典型为 workflow = [('train', 1)]
+# 此时就真正开启了训练
+runner.run(data_loaders, cfg.workflow)
+```
+
+关于 workflow 设置,以 `EpochBasedRunner` 为例,详情如下:
+
+- 假设只想运行训练工作流,则可以设置 workflow = [('train', 1)],表示只进行迭代训练
+- 假设想运行训练和验证工作流,则可以设置 workflow = [('train', 3), ('val', 1)],表示先训练 3 个 epoch ,然后切换到 val 工作流,运行 1 个 epoch,然后循环,直到训练 epoch 次数达到指定值
+- 工作流设置还可以自由定制,例如你可以先验证再训练 workflow = [('val', 1), ('train', 1)]
+
+上述代码都已经封装到了各个代码库的 train.py 中,用户只需要设置相应的配置即可,上述流程会自动运行。
diff --git a/docs_zh_CN/understand_mmcv/utils.md b/docs_zh_CN/understand_mmcv/utils.md
new file mode 100644
index 0000000000000000000000000000000000000000..746c560039759df3e6f76ae665e63812ed3c9ed6
--- /dev/null
+++ b/docs_zh_CN/understand_mmcv/utils.md
@@ -0,0 +1,69 @@
+## 辅助函数
+
+### 进度条
+
+如果你想跟踪函数批处理任务的进度,可以使用 `track_progress` 。它能以进度条的形式展示任务的完成情况以及剩余任务所需的时间(内部实现为for循环)。
+
+```python
+import mmcv
+
+def func(item):
+ # 执行相关操作
+ pass
+
+tasks = [item_1, item_2, ..., item_n]
+
+mmcv.track_progress(func, tasks)
+```
+
+效果如下(终端中会显示动态的进度条)
+
+如果你想可视化多进程任务的进度,你可以使用 `track_parallel_progress`。
+
+```python
+mmcv.track_parallel_progress(func, tasks, 8) # 8 workers
+```
+
+如果你想要迭代或枚举数据列表并可视化进度,你可以使用 `track_iter_progress` 。
+
+```python
+import mmcv
+
+tasks = [item_1, item_2, ..., item_n]
+
+for task in mmcv.track_iter_progress(tasks):
+ # do something like print
+ print(task)
+
+for i, task in enumerate(mmcv.track_iter_progress(tasks)):
+ # do something like print
+ print(i)
+ print(task)
+```
+
+### 计时器
+
+mmcv提供的 `Timer` 可以很方便地计算代码块的执行时间。
+
+```python
+import time
+
+with mmcv.Timer():
+ # simulate some code block
+ time.sleep(1)
+```
+
+你也可以使用 `since_start()` 和 `since_last_check()`。前者返回计时器启动后的运行时长,后者返回最近一次查看计时器后的运行时长。
+
+```python
+timer = mmcv.Timer()
+# code block 1 here
+print(timer.since_start())
+# code block 2 here
+print(timer.since_last_check())
+print(timer.since_start())
+```
diff --git a/docs/zh_cn/understand_mmcv/visualization.md b/docs_zh_CN/understand_mmcv/visualization.md
similarity index 100%
rename from docs/zh_cn/understand_mmcv/visualization.md
rename to docs_zh_CN/understand_mmcv/visualization.md
diff --git a/examples/train.py b/examples/train.py
new file mode 100644
index 0000000000000000000000000000000000000000..2dbdfee40f049f55e07d7be1427fdd2da784a9f4
--- /dev/null
+++ b/examples/train.py
@@ -0,0 +1,84 @@
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+import torch.optim as optim
+import torchvision.transforms as transforms
+from torch.utils.data import DataLoader
+from torchvision.datasets import CIFAR10
+
+from mmcv.parallel import MMDataParallel
+from mmcv.runner import EpochBasedRunner
+from mmcv.utils import get_logger
+
+
+class Model(nn.Module):
+
+ def __init__(self):
+ super(Model, self).__init__()
+ self.conv1 = nn.Conv2d(3, 6, 5)
+ self.pool = nn.MaxPool2d(2, 2)
+ self.conv2 = nn.Conv2d(6, 16, 5)
+ self.fc1 = nn.Linear(16 * 5 * 5, 120)
+ self.fc2 = nn.Linear(120, 84)
+ self.fc3 = nn.Linear(84, 10)
+ self.loss_fn = nn.CrossEntropyLoss()
+
+ def forward(self, x):
+ x = self.pool(F.relu(self.conv1(x)))
+ x = self.pool(F.relu(self.conv2(x)))
+ x = x.view(-1, 16 * 5 * 5)
+ x = F.relu(self.fc1(x))
+ x = F.relu(self.fc2(x))
+ x = self.fc3(x)
+ return x
+
+ def train_step(self, data, optimizer):
+ images, labels = data
+ predicts = self(images) # -> self.__call__() -> self.forward()
+ loss = self.loss_fn(predicts, labels)
+ return {'loss': loss}
+
+
+if __name__ == '__main__':
+ model = Model()
+ if torch.cuda.is_available():
+ # only use gpu:0 to train
+ # Solved issue https://github.com/open-mmlab/mmcv/issues/1470
+ model = MMDataParallel(model.cuda(), device_ids=[0])
+
+ # dataset and dataloader
+ transform = transforms.Compose([
+ transforms.ToTensor(),
+ transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
+ ])
+ trainset = CIFAR10(
+ root='data', train=True, download=True, transform=transform)
+ trainloader = DataLoader(
+ trainset, batch_size=128, shuffle=True, num_workers=2)
+
+ optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
+ logger = get_logger('mmcv')
+ # runner is a scheduler to manage the training
+ runner = EpochBasedRunner(
+ model,
+ optimizer=optimizer,
+ work_dir='./work_dir',
+ logger=logger,
+ max_epochs=4)
+
+ # learning rate scheduler config
+ lr_config = dict(policy='step', step=[2, 3])
+ # configuration of optimizer
+ optimizer_config = dict(grad_clip=None)
+ # configuration of saving checkpoints periodically
+ checkpoint_config = dict(interval=1)
+ # save log periodically and multiple hooks can be used simultaneously
+ log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
+ # register hooks to runner and those hooks will be invoked automatically
+ runner.register_training_hooks(
+ lr_config=lr_config,
+ optimizer_config=optimizer_config,
+ checkpoint_config=checkpoint_config,
+ log_config=log_config)
+
+ runner.run([trainloader], [('train', 1)])
diff --git a/mmcv/__init__.py b/mmcv/__init__.py
index 2410ea555e905acb450792a427596764e16f62d3..210a2989138380559f23045b568d0fbbeb918c03 100644
--- a/mmcv/__init__.py
+++ b/mmcv/__init__.py
@@ -1,13 +1,15 @@
# Copyright (c) OpenMMLab. All rights reserved.
# flake8: noqa
from .arraymisc import *
+from .fileio import *
from .image import *
-from .transforms import *
+from .utils import *
from .version import *
from .video import *
from .visualization import *
# The following modules are not imported to this level, so mmcv may be used
# without PyTorch.
+# - runner
+# - parallel
# - op
-# - utils
diff --git a/mmcv/arraymisc/quantization.py b/mmcv/arraymisc/quantization.py
index 6182710d51787061304cfc7304ec97d565822536..8e47a3545780cf071a1ef8195efb0b7b662c8186 100644
--- a/mmcv/arraymisc/quantization.py
+++ b/mmcv/arraymisc/quantization.py
@@ -1,20 +1,14 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Union
-
import numpy as np
-def quantize(arr: np.ndarray,
- min_val: Union[int, float],
- max_val: Union[int, float],
- levels: int,
- dtype=np.int64) -> tuple:
+def quantize(arr, min_val, max_val, levels, dtype=np.int64):
"""Quantize an array of (-inf, inf) to [0, levels-1].
Args:
arr (ndarray): Input array.
- min_val (int or float): Minimum value to be clipped.
- max_val (int or float): Maximum value to be clipped.
+ min_val (scalar): Minimum value to be clipped.
+ max_val (scalar): Maximum value to be clipped.
levels (int): Quantization levels.
dtype (np.type): The type of the quantized array.
@@ -35,17 +29,13 @@ def quantize(arr: np.ndarray,
return quantized_arr
-def dequantize(arr: np.ndarray,
- min_val: Union[int, float],
- max_val: Union[int, float],
- levels: int,
- dtype=np.float64) -> tuple:
+def dequantize(arr, min_val, max_val, levels, dtype=np.float64):
"""Dequantize an array.
Args:
arr (ndarray): Input array.
- min_val (int or float): Minimum value to be clipped.
- max_val (int or float): Maximum value to be clipped.
+ min_val (scalar): Minimum value to be clipped.
+ max_val (scalar): Maximum value to be clipped.
levels (int): Quantization levels.
dtype (np.type): The type of the dequantized array.
diff --git a/mmcv/cnn/__init__.py b/mmcv/cnn/__init__.py
index 10e7e027e4da544f42a6a4fe3400d9413a57e081..7246c897430f0cc7ce12719ad8608824fc734446 100644
--- a/mmcv/cnn/__init__.py
+++ b/mmcv/cnn/__init__.py
@@ -1,7 +1,9 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .alexnet import AlexNet
# yapf: disable
-from .bricks import (ContextBlock, Conv2d, Conv3d, ConvAWS2d, ConvModule,
+from .bricks import (ACTIVATION_LAYERS, CONV_LAYERS, NORM_LAYERS,
+ PADDING_LAYERS, PLUGIN_LAYERS, UPSAMPLE_LAYERS,
+ ContextBlock, Conv2d, Conv3d, ConvAWS2d, ConvModule,
ConvTranspose2d, ConvTranspose3d, ConvWS2d,
DepthwiseSeparableConvModule, GeneralizedAttention,
HSigmoid, HSwish, Linear, MaxPool2d, MaxPool3d,
@@ -9,20 +11,31 @@ from .bricks import (ContextBlock, Conv2d, Conv3d, ConvAWS2d, ConvModule,
build_activation_layer, build_conv_layer,
build_norm_layer, build_padding_layer, build_plugin_layer,
build_upsample_layer, conv_ws_2d, is_norm)
+from .builder import MODELS, build_model_from_cfg
# yapf: enable
from .resnet import ResNet, make_res_layer
-from .rfsearch import Conv2dRFSearchOp, RFSearchHook
-from .utils import fuse_conv_bn, get_model_complexity_info
+from .utils import (INITIALIZERS, Caffe2XavierInit, ConstantInit, KaimingInit,
+ NormalInit, PretrainedInit, TruncNormalInit, UniformInit,
+ XavierInit, bias_init_with_prob, caffe2_xavier_init,
+ constant_init, fuse_conv_bn, get_model_complexity_info,
+ initialize, kaiming_init, normal_init, trunc_normal_init,
+ uniform_init, xavier_init)
from .vgg import VGG, make_vgg_layer
__all__ = [
'AlexNet', 'VGG', 'make_vgg_layer', 'ResNet', 'make_res_layer',
- 'ConvModule', 'build_activation_layer', 'build_conv_layer',
- 'build_norm_layer', 'build_padding_layer', 'build_upsample_layer',
- 'build_plugin_layer', 'is_norm', 'NonLocal1d', 'NonLocal2d', 'NonLocal3d',
- 'ContextBlock', 'HSigmoid', 'Swish', 'HSwish', 'GeneralizedAttention',
- 'Scale', 'conv_ws_2d', 'ConvAWS2d', 'ConvWS2d',
- 'DepthwiseSeparableConvModule', 'Linear', 'Conv2d', 'ConvTranspose2d',
- 'MaxPool2d', 'ConvTranspose3d', 'MaxPool3d', 'Conv3d', 'fuse_conv_bn',
- 'get_model_complexity_info', 'Conv2dRFSearchOp', 'RFSearchHook'
+ 'constant_init', 'xavier_init', 'normal_init', 'trunc_normal_init',
+ 'uniform_init', 'kaiming_init', 'caffe2_xavier_init',
+ 'bias_init_with_prob', 'ConvModule', 'build_activation_layer',
+ 'build_conv_layer', 'build_norm_layer', 'build_padding_layer',
+ 'build_upsample_layer', 'build_plugin_layer', 'is_norm', 'NonLocal1d',
+ 'NonLocal2d', 'NonLocal3d', 'ContextBlock', 'HSigmoid', 'Swish', 'HSwish',
+ 'GeneralizedAttention', 'ACTIVATION_LAYERS', 'CONV_LAYERS', 'NORM_LAYERS',
+ 'PADDING_LAYERS', 'UPSAMPLE_LAYERS', 'PLUGIN_LAYERS', 'Scale',
+ 'get_model_complexity_info', 'conv_ws_2d', 'ConvAWS2d', 'ConvWS2d',
+ 'fuse_conv_bn', 'DepthwiseSeparableConvModule', 'Linear', 'Conv2d',
+ 'ConvTranspose2d', 'MaxPool2d', 'ConvTranspose3d', 'MaxPool3d', 'Conv3d',
+ 'initialize', 'INITIALIZERS', 'ConstantInit', 'XavierInit', 'NormalInit',
+ 'TruncNormalInit', 'UniformInit', 'KaimingInit', 'PretrainedInit',
+ 'Caffe2XavierInit', 'MODELS', 'build_model_from_cfg'
]
diff --git a/mmcv/cnn/alexnet.py b/mmcv/cnn/alexnet.py
index 309be24b66049c86837c67d24ee0e790e6396abc..89e36b8c7851f895d9ae7f07149f0e707456aab0 100644
--- a/mmcv/cnn/alexnet.py
+++ b/mmcv/cnn/alexnet.py
@@ -1,10 +1,7 @@
# Copyright (c) OpenMMLab. All rights reserved.
import logging
-from typing import Optional
-import torch
import torch.nn as nn
-from mmengine.runner import load_checkpoint
class AlexNet(nn.Module):
@@ -14,8 +11,8 @@ class AlexNet(nn.Module):
num_classes (int): number of classes for classification.
"""
- def __init__(self, num_classes: int = -1):
- super().__init__()
+ def __init__(self, num_classes=-1):
+ super(AlexNet, self).__init__()
self.num_classes = num_classes
self.features = nn.Sequential(
nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
@@ -43,9 +40,10 @@ class AlexNet(nn.Module):
nn.Linear(4096, num_classes),
)
- def init_weights(self, pretrained: Optional[str] = None) -> None:
+ def init_weights(self, pretrained=None):
if isinstance(pretrained, str):
logger = logging.getLogger()
+ from ..runner import load_checkpoint
load_checkpoint(self, pretrained, strict=False, logger=logger)
elif pretrained is None:
# use default initializer
@@ -53,7 +51,7 @@ class AlexNet(nn.Module):
else:
raise TypeError('pretrained must be a str or None')
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
x = self.features(x)
if self.num_classes > 0:
diff --git a/mmcv/cnn/bricks/__init__.py b/mmcv/cnn/bricks/__init__.py
index 6c74986953bf1a23a246c92c51fd14e033b6d682..0f33124ed23fc6f27119a37bcb5ab004d3572be0 100644
--- a/mmcv/cnn/bricks/__init__.py
+++ b/mmcv/cnn/bricks/__init__.py
@@ -14,7 +14,9 @@ from .non_local import NonLocal1d, NonLocal2d, NonLocal3d
from .norm import build_norm_layer, is_norm
from .padding import build_padding_layer
from .plugin import build_plugin_layer
-from .scale import LayerScale, Scale
+from .registry import (ACTIVATION_LAYERS, CONV_LAYERS, NORM_LAYERS,
+ PADDING_LAYERS, PLUGIN_LAYERS, UPSAMPLE_LAYERS)
+from .scale import Scale
from .swish import Swish
from .upsample import build_upsample_layer
from .wrappers import (Conv2d, Conv3d, ConvTranspose2d, ConvTranspose3d,
@@ -25,8 +27,9 @@ __all__ = [
'build_norm_layer', 'build_padding_layer', 'build_upsample_layer',
'build_plugin_layer', 'is_norm', 'HSigmoid', 'HSwish', 'NonLocal1d',
'NonLocal2d', 'NonLocal3d', 'ContextBlock', 'GeneralizedAttention',
- 'Scale', 'ConvAWS2d', 'ConvWS2d', 'conv_ws_2d',
- 'DepthwiseSeparableConvModule', 'Swish', 'Linear', 'Conv2dAdaptivePadding',
- 'Conv2d', 'ConvTranspose2d', 'MaxPool2d', 'ConvTranspose3d', 'MaxPool3d',
- 'Conv3d', 'Dropout', 'DropPath', 'LayerScale'
+ 'ACTIVATION_LAYERS', 'CONV_LAYERS', 'NORM_LAYERS', 'PADDING_LAYERS',
+ 'UPSAMPLE_LAYERS', 'PLUGIN_LAYERS', 'Scale', 'ConvAWS2d', 'ConvWS2d',
+ 'conv_ws_2d', 'DepthwiseSeparableConvModule', 'Swish', 'Linear',
+ 'Conv2dAdaptivePadding', 'Conv2d', 'ConvTranspose2d', 'MaxPool2d',
+ 'ConvTranspose3d', 'MaxPool3d', 'Conv3d', 'Dropout', 'DropPath'
]
diff --git a/mmcv/cnn/bricks/activation.py b/mmcv/cnn/bricks/activation.py
index ae99714b940913c946fa169883584ea193f645ea..79f1988386cbf09a4a13e2c5a72222e22bcc6f7f 100644
--- a/mmcv/cnn/bricks/activation.py
+++ b/mmcv/cnn/bricks/activation.py
@@ -1,41 +1,20 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Dict
-
import torch
import torch.nn as nn
import torch.nn.functional as F
-from mmengine.registry import MODELS
-from mmengine.utils import digit_version
-from mmengine.utils.dl_utils import TORCH_VERSION
+
+from mmcv.utils import TORCH_VERSION, build_from_cfg, digit_version
+from .registry import ACTIVATION_LAYERS
for module in [
nn.ReLU, nn.LeakyReLU, nn.PReLU, nn.RReLU, nn.ReLU6, nn.ELU,
nn.Sigmoid, nn.Tanh
]:
- MODELS.register_module(module=module)
-
-if digit_version(torch.__version__) >= digit_version('1.7.0'):
- MODELS.register_module(module=nn.SiLU, name='SiLU')
-else:
-
- class SiLU(nn.Module):
- """Sigmoid Weighted Liner Unit."""
+ ACTIVATION_LAYERS.register_module(module=module)
- def __init__(self, inplace=False):
- super().__init__()
- self.inplace = inplace
- def forward(self, inputs) -> torch.Tensor:
- if self.inplace:
- return inputs.mul_(torch.sigmoid(inputs))
- else:
- return inputs * torch.sigmoid(inputs)
-
- MODELS.register_module(module=SiLU, name='SiLU')
-
-
-@MODELS.register_module(name='Clip')
-@MODELS.register_module()
+@ACTIVATION_LAYERS.register_module(name='Clip')
+@ACTIVATION_LAYERS.register_module()
class Clamp(nn.Module):
"""Clamp activation layer.
@@ -49,12 +28,12 @@ class Clamp(nn.Module):
Default to 1.
"""
- def __init__(self, min: float = -1., max: float = 1.):
- super().__init__()
+ def __init__(self, min=-1., max=1.):
+ super(Clamp, self).__init__()
self.min = min
self.max = max
- def forward(self, x) -> torch.Tensor:
+ def forward(self, x):
"""Forward function.
Args:
@@ -88,27 +67,26 @@ class GELU(nn.Module):
>>> output = m(input)
"""
- def forward(self, input: torch.Tensor) -> torch.Tensor:
+ def forward(self, input):
return F.gelu(input)
if (TORCH_VERSION == 'parrots'
or digit_version(TORCH_VERSION) < digit_version('1.4')):
- MODELS.register_module(module=GELU)
+ ACTIVATION_LAYERS.register_module(module=GELU)
else:
- MODELS.register_module(module=nn.GELU)
+ ACTIVATION_LAYERS.register_module(module=nn.GELU)
-def build_activation_layer(cfg: Dict) -> nn.Module:
+def build_activation_layer(cfg):
"""Build activation layer.
Args:
cfg (dict): The activation layer config, which should contain:
-
- type (str): Layer type.
- layer args: Args needed to instantiate an activation layer.
Returns:
nn.Module: Created activation layer.
"""
- return MODELS.build(cfg)
+ return build_from_cfg(cfg, ACTIVATION_LAYERS)
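A short usage sketch of the registry-backed builder restored above; the `type` keys name entries registered into `ACTIVATION_LAYERS` (e.g. `ReLU` from the loop at the top of the file, `Clip` from the decorator on `Clamp`):

```python
import torch

from mmcv.cnn import build_activation_layer

relu = build_activation_layer(dict(type='ReLU', inplace=True))
clamp = build_activation_layer(dict(type='Clip', min=0.0, max=6.0))

x = torch.randn(2, 3)
y = clamp(relu(x))  # every element now lies in [0, 6]
```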
diff --git a/mmcv/cnn/bricks/context_block.py b/mmcv/cnn/bricks/context_block.py
index 1e78df8648b779124091a8595282aad7a8d0d305..d60fdb904c749ce3b251510dff3cc63cea70d42e 100644
--- a/mmcv/cnn/bricks/context_block.py
+++ b/mmcv/cnn/bricks/context_block.py
@@ -1,20 +1,19 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Union
-
import torch
-from mmengine.model import constant_init, kaiming_init
-from mmengine.registry import MODELS
from torch import nn
+from ..utils import constant_init, kaiming_init
+from .registry import PLUGIN_LAYERS
+
-def last_zero_init(m: Union[nn.Module, nn.Sequential]) -> None:
+def last_zero_init(m):
if isinstance(m, nn.Sequential):
constant_init(m[-1], val=0)
else:
constant_init(m, val=0)
-@MODELS.register_module()
+@PLUGIN_LAYERS.register_module()
class ContextBlock(nn.Module):
"""ContextBlock module in GCNet.
@@ -35,11 +34,11 @@ class ContextBlock(nn.Module):
_abbr_ = 'context_block'
def __init__(self,
- in_channels: int,
- ratio: float,
- pooling_type: str = 'att',
- fusion_types: tuple = ('channel_add', )):
- super().__init__()
+ in_channels,
+ ratio,
+ pooling_type='att',
+ fusion_types=('channel_add', )):
+ super(ContextBlock, self).__init__()
assert pooling_type in ['avg', 'att']
assert isinstance(fusion_types, (list, tuple))
valid_fusion_types = ['channel_add', 'channel_mul']
@@ -83,7 +82,7 @@ class ContextBlock(nn.Module):
if self.channel_mul_conv is not None:
last_zero_init(self.channel_mul_conv)
- def spatial_pool(self, x: torch.Tensor) -> torch.Tensor:
+ def spatial_pool(self, x):
batch, channel, height, width = x.size()
if self.pooling_type == 'att':
input_x = x
@@ -109,7 +108,7 @@ class ContextBlock(nn.Module):
return context
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
# [N, C, 1, 1]
context = self.spatial_pool(x)
diff --git a/mmcv/cnn/bricks/conv.py b/mmcv/cnn/bricks/conv.py
index ace744e039b644c2e3bb643de2ba89a438d299af..cf54491997a48ac3e7fadc4183ab7bf3e831024c 100644
--- a/mmcv/cnn/bricks/conv.py
+++ b/mmcv/cnn/bricks/conv.py
@@ -1,16 +1,15 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Dict, Optional
-
-from mmengine.registry import MODELS
from torch import nn
-MODELS.register_module('Conv1d', module=nn.Conv1d)
-MODELS.register_module('Conv2d', module=nn.Conv2d)
-MODELS.register_module('Conv3d', module=nn.Conv3d)
-MODELS.register_module('Conv', module=nn.Conv2d)
+from .registry import CONV_LAYERS
+
+CONV_LAYERS.register_module('Conv1d', module=nn.Conv1d)
+CONV_LAYERS.register_module('Conv2d', module=nn.Conv2d)
+CONV_LAYERS.register_module('Conv3d', module=nn.Conv3d)
+CONV_LAYERS.register_module('Conv', module=nn.Conv2d)
-def build_conv_layer(cfg: Optional[Dict], *args, **kwargs) -> nn.Module:
+def build_conv_layer(cfg, *args, **kwargs):
"""Build convolution layer.
Args:
@@ -35,15 +34,11 @@ def build_conv_layer(cfg: Optional[Dict], *args, **kwargs) -> nn.Module:
cfg_ = cfg.copy()
layer_type = cfg_.pop('type')
+ if layer_type not in CONV_LAYERS:
+ raise KeyError(f'Unrecognized conv type {layer_type}')
+ else:
+ conv_layer = CONV_LAYERS.get(layer_type)
- # Switch registry to the target scope. If `conv_layer` cannot be found
- # in the registry, fallback to search `conv_layer` in the
- # mmengine.MODELS.
- with MODELS.switch_scope_and_registry(None) as registry:
- conv_layer = registry.get(layer_type)
- if conv_layer is None:
- raise KeyError(f'Cannot find {conv_layer} in registry under scope '
- f'name {registry.scope}')
layer = conv_layer(*args, **kwargs, **cfg_)
return layer
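Usage sketch for the restored `build_conv_layer`: positional arguments after `cfg` are forwarded to the chosen layer's constructor (in the full function body, not shown in this hunk, a `None` cfg falls back to a plain `Conv2d`):

```python
import torch

from mmcv.cnn import build_conv_layer

# Equivalent to nn.Conv2d(3, 8, kernel_size=3, padding=1).
conv = build_conv_layer(dict(type='Conv2d'), 3, 8, 3, padding=1)
out = conv(torch.randn(1, 3, 32, 32))  # -> shape (1, 8, 32, 32)
```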
diff --git a/mmcv/cnn/bricks/conv2d_adaptive_padding.py b/mmcv/cnn/bricks/conv2d_adaptive_padding.py
index 0ac9949e4830c64161036b519594685f7dae72c2..b45e758ac6cf8dfb0382d072fe09125bc7e9b888 100644
--- a/mmcv/cnn/bricks/conv2d_adaptive_padding.py
+++ b/mmcv/cnn/bricks/conv2d_adaptive_padding.py
@@ -1,14 +1,13 @@
# Copyright (c) OpenMMLab. All rights reserved.
import math
-from typing import Tuple, Union
-import torch
-from mmengine.registry import MODELS
from torch import nn
from torch.nn import functional as F
+from .registry import CONV_LAYERS
-@MODELS.register_module()
+
+@CONV_LAYERS.register_module()
class Conv2dAdaptivePadding(nn.Conv2d):
"""Implementation of 2D convolution in tensorflow with `padding` as "same",
which applies padding to input (if needed) so that input image gets fully
@@ -32,18 +31,18 @@ class Conv2dAdaptivePadding(nn.Conv2d):
"""
def __init__(self,
- in_channels: int,
- out_channels: int,
- kernel_size: Union[int, Tuple[int, int]],
- stride: Union[int, Tuple[int, int]] = 1,
- padding: Union[int, Tuple[int, int]] = 0,
- dilation: Union[int, Tuple[int, int]] = 1,
- groups: int = 1,
- bias: bool = True):
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ bias=True):
super().__init__(in_channels, out_channels, kernel_size, stride, 0,
dilation, groups, bias)
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
img_h, img_w = x.size()[-2:]
kernel_h, kernel_w = self.weight.size()[-2:]
stride_h, stride_w = self.stride
diff --git a/mmcv/cnn/bricks/conv_module.py b/mmcv/cnn/bricks/conv_module.py
index 1f8e160517f7d62c07a4b10317da1a5718805209..4f19f1d0cf4448179272ac53536e7ccf5fd860a3 100644
--- a/mmcv/cnn/bricks/conv_module.py
+++ b/mmcv/cnn/bricks/conv_module.py
@@ -1,20 +1,18 @@
# Copyright (c) OpenMMLab. All rights reserved.
import warnings
-from typing import Dict, Optional, Tuple, Union
-import torch
import torch.nn as nn
-from mmengine.model import constant_init, kaiming_init
-from mmengine.registry import MODELS
-from mmengine.utils.dl_utils.parrots_wrapper import _BatchNorm, _InstanceNorm
+from mmcv.utils import _BatchNorm, _InstanceNorm
+from ..utils import constant_init, kaiming_init
from .activation import build_activation_layer
from .conv import build_conv_layer
from .norm import build_norm_layer
from .padding import build_padding_layer
+from .registry import PLUGIN_LAYERS
-@MODELS.register_module()
+@PLUGIN_LAYERS.register_module()
class ConvModule(nn.Module):
"""A conv block that bundles conv/norm/activation layers.
@@ -70,22 +68,22 @@ class ConvModule(nn.Module):
_abbr_ = 'conv_block'
def __init__(self,
- in_channels: int,
- out_channels: int,
- kernel_size: Union[int, Tuple[int, int]],
- stride: Union[int, Tuple[int, int]] = 1,
- padding: Union[int, Tuple[int, int]] = 0,
- dilation: Union[int, Tuple[int, int]] = 1,
- groups: int = 1,
- bias: Union[bool, str] = 'auto',
- conv_cfg: Optional[Dict] = None,
- norm_cfg: Optional[Dict] = None,
- act_cfg: Optional[Dict] = dict(type='ReLU'),
- inplace: bool = True,
- with_spectral_norm: bool = False,
- padding_mode: str = 'zeros',
- order: tuple = ('conv', 'norm', 'act')):
- super().__init__()
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ bias='auto',
+ conv_cfg=None,
+ norm_cfg=None,
+ act_cfg=dict(type='ReLU'),
+ inplace=True,
+ with_spectral_norm=False,
+ padding_mode='zeros',
+ order=('conv', 'norm', 'act')):
+ super(ConvModule, self).__init__()
assert conv_cfg is None or isinstance(conv_cfg, dict)
assert norm_cfg is None or isinstance(norm_cfg, dict)
assert act_cfg is None or isinstance(act_cfg, dict)
@@ -98,7 +96,7 @@ class ConvModule(nn.Module):
self.with_explicit_padding = padding_mode not in official_padding_mode
self.order = order
assert isinstance(self.order, tuple) and len(self.order) == 3
- assert set(order) == {'conv', 'norm', 'act'}
+ assert set(order) == set(['conv', 'norm', 'act'])
self.with_norm = norm_cfg is not None
self.with_activation = act_cfg is not None
@@ -145,22 +143,21 @@ class ConvModule(nn.Module):
norm_channels = out_channels
else:
norm_channels = in_channels
- self.norm_name, norm = build_norm_layer(
- norm_cfg, norm_channels) # type: ignore
+ self.norm_name, norm = build_norm_layer(norm_cfg, norm_channels)
self.add_module(self.norm_name, norm)
if self.with_bias:
if isinstance(norm, (_BatchNorm, _InstanceNorm)):
warnings.warn(
'Unnecessary conv bias before batch/instance norm')
else:
- self.norm_name = None # type: ignore
+ self.norm_name = None
# build activation layer
if self.with_activation:
- act_cfg_ = act_cfg.copy() # type: ignore
+ act_cfg_ = act_cfg.copy()
# nn.Tanh has no 'inplace' argument
if act_cfg_['type'] not in [
- 'Tanh', 'PReLU', 'Sigmoid', 'HSigmoid', 'Swish', 'GELU'
+ 'Tanh', 'PReLU', 'Sigmoid', 'HSigmoid', 'Swish'
]:
act_cfg_.setdefault('inplace', inplace)
self.activate = build_activation_layer(act_cfg_)
@@ -196,10 +193,7 @@ class ConvModule(nn.Module):
if self.with_norm:
constant_init(self.norm, 1, bias=0)
- def forward(self,
- x: torch.Tensor,
- activate: bool = True,
- norm: bool = True) -> torch.Tensor:
+ def forward(self, x, activate=True, norm=True):
for layer in self.order:
if layer == 'conv':
if self.with_explicit_padding:
diff --git a/mmcv/cnn/bricks/conv_ws.py b/mmcv/cnn/bricks/conv_ws.py
index 261f5c1aa9aa9b80891e6330e6d576c3a8ce3e5d..a3941e27874993418b3b5708d5a7485f175ff9c8 100644
--- a/mmcv/cnn/bricks/conv_ws.py
+++ b/mmcv/cnn/bricks/conv_ws.py
@@ -1,21 +1,19 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from collections import OrderedDict
-from typing import Dict, List, Optional, Tuple, Union
-
import torch
import torch.nn as nn
import torch.nn.functional as F
-from mmengine.registry import MODELS
+
+from .registry import CONV_LAYERS
-def conv_ws_2d(input: torch.Tensor,
- weight: torch.Tensor,
- bias: Optional[torch.Tensor] = None,
- stride: Union[int, Tuple[int, int]] = 1,
- padding: Union[int, Tuple[int, int]] = 0,
- dilation: Union[int, Tuple[int, int]] = 1,
- groups: int = 1,
- eps: float = 1e-5) -> torch.Tensor:
+def conv_ws_2d(input,
+ weight,
+ bias=None,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ eps=1e-5):
c_in = weight.size(0)
weight_flat = weight.view(c_in, -1)
mean = weight_flat.mean(dim=1, keepdim=True).view(c_in, 1, 1, 1)
@@ -24,20 +22,20 @@ def conv_ws_2d(input: torch.Tensor,
return F.conv2d(input, weight, bias, stride, padding, dilation, groups)
-@MODELS.register_module('ConvWS')
+@CONV_LAYERS.register_module('ConvWS')
class ConvWS2d(nn.Conv2d):
def __init__(self,
- in_channels: int,
- out_channels: int,
- kernel_size: Union[int, Tuple[int, int]],
- stride: Union[int, Tuple[int, int]] = 1,
- padding: Union[int, Tuple[int, int]] = 0,
- dilation: Union[int, Tuple[int, int]] = 1,
- groups: int = 1,
- bias: bool = True,
- eps: float = 1e-5):
- super().__init__(
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ bias=True,
+ eps=1e-5):
+ super(ConvWS2d, self).__init__(
in_channels,
out_channels,
kernel_size,
@@ -48,12 +46,12 @@ class ConvWS2d(nn.Conv2d):
bias=bias)
self.eps = eps
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
return conv_ws_2d(x, self.weight, self.bias, self.stride, self.padding,
self.dilation, self.groups, self.eps)
-@MODELS.register_module(name='ConvAWS')
+@CONV_LAYERS.register_module(name='ConvAWS')
class ConvAWS2d(nn.Conv2d):
"""AWS (Adaptive Weight Standardization)
@@ -78,14 +76,14 @@ class ConvAWS2d(nn.Conv2d):
"""
def __init__(self,
- in_channels: int,
- out_channels: int,
- kernel_size: Union[int, Tuple[int, int]],
- stride: Union[int, Tuple[int, int]] = 1,
- padding: Union[int, Tuple[int, int]] = 0,
- dilation: Union[int, Tuple[int, int]] = 1,
- groups: int = 1,
- bias: bool = True):
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ bias=True):
super().__init__(
in_channels,
out_channels,
@@ -100,7 +98,7 @@ class ConvAWS2d(nn.Conv2d):
self.register_buffer('weight_beta',
torch.zeros(self.out_channels, 1, 1, 1))
- def _get_weight(self, weight: torch.Tensor) -> torch.Tensor:
+ def _get_weight(self, weight):
weight_flat = weight.view(weight.size(0), -1)
mean = weight_flat.mean(dim=1).view(-1, 1, 1, 1)
std = torch.sqrt(weight_flat.var(dim=1) + 1e-5).view(-1, 1, 1, 1)
@@ -108,16 +106,13 @@ class ConvAWS2d(nn.Conv2d):
weight = self.weight_gamma * weight + self.weight_beta
return weight
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
weight = self._get_weight(self.weight)
return F.conv2d(x, weight, self.bias, self.stride, self.padding,
self.dilation, self.groups)
- def _load_from_state_dict(self, state_dict: OrderedDict, prefix: str,
- local_metadata: Dict, strict: bool,
- missing_keys: List[str],
- unexpected_keys: List[str],
- error_msgs: List[str]) -> None:
+ def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,
+ missing_keys, unexpected_keys, error_msgs):
"""Override default load function.
AWS overrides the function _load_from_state_dict to recover
@@ -129,7 +124,7 @@ class ConvAWS2d(nn.Conv2d):
"""
self.weight_gamma.data.fill_(-1)
- local_missing_keys: List = []
+ local_missing_keys = []
super()._load_from_state_dict(state_dict, prefix, local_metadata,
strict, local_missing_keys,
unexpected_keys, error_msgs)
diff --git a/mmcv/cnn/bricks/depthwise_separable_conv_module.py b/mmcv/cnn/bricks/depthwise_separable_conv_module.py
index cf1fe4cad3812007573211fa2bede28b23822122..722d5d8d71f75486e2db3008907c4eadfca41d63 100644
--- a/mmcv/cnn/bricks/depthwise_separable_conv_module.py
+++ b/mmcv/cnn/bricks/depthwise_separable_conv_module.py
@@ -1,7 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Dict, Optional, Tuple, Union
-
-import torch
import torch.nn as nn
from .conv_module import ConvModule
@@ -49,27 +46,27 @@ class DepthwiseSeparableConvModule(nn.Module):
"""
def __init__(self,
- in_channels: int,
- out_channels: int,
- kernel_size: Union[int, Tuple[int, int]],
- stride: Union[int, Tuple[int, int]] = 1,
- padding: Union[int, Tuple[int, int]] = 0,
- dilation: Union[int, Tuple[int, int]] = 1,
- norm_cfg: Optional[Dict] = None,
- act_cfg: Dict = dict(type='ReLU'),
- dw_norm_cfg: Union[Dict, str] = 'default',
- dw_act_cfg: Union[Dict, str] = 'default',
- pw_norm_cfg: Union[Dict, str] = 'default',
- pw_act_cfg: Union[Dict, str] = 'default',
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ norm_cfg=None,
+ act_cfg=dict(type='ReLU'),
+ dw_norm_cfg='default',
+ dw_act_cfg='default',
+ pw_norm_cfg='default',
+ pw_act_cfg='default',
**kwargs):
- super().__init__()
+ super(DepthwiseSeparableConvModule, self).__init__()
assert 'groups' not in kwargs, 'groups should not be specified'
# if norm/activation config of depthwise/pointwise ConvModule is not
# specified, use default config.
- dw_norm_cfg = dw_norm_cfg if dw_norm_cfg != 'default' else norm_cfg # type: ignore # noqa E501
+ dw_norm_cfg = dw_norm_cfg if dw_norm_cfg != 'default' else norm_cfg
dw_act_cfg = dw_act_cfg if dw_act_cfg != 'default' else act_cfg
- pw_norm_cfg = pw_norm_cfg if pw_norm_cfg != 'default' else norm_cfg # type: ignore # noqa E501
+ pw_norm_cfg = pw_norm_cfg if pw_norm_cfg != 'default' else norm_cfg
pw_act_cfg = pw_act_cfg if pw_act_cfg != 'default' else act_cfg
# depthwise convolution
@@ -81,19 +78,19 @@ class DepthwiseSeparableConvModule(nn.Module):
padding=padding,
dilation=dilation,
groups=in_channels,
- norm_cfg=dw_norm_cfg, # type: ignore
- act_cfg=dw_act_cfg, # type: ignore
+ norm_cfg=dw_norm_cfg,
+ act_cfg=dw_act_cfg,
**kwargs)
self.pointwise_conv = ConvModule(
in_channels,
out_channels,
1,
- norm_cfg=pw_norm_cfg, # type: ignore
- act_cfg=pw_act_cfg, # type: ignore
+ norm_cfg=pw_norm_cfg,
+ act_cfg=pw_act_cfg,
**kwargs)
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
x = self.depthwise_conv(x)
x = self.pointwise_conv(x)
return x
diff --git a/mmcv/cnn/bricks/drop.py b/mmcv/cnn/bricks/drop.py
index fe82a2560515858341836de3fa563ed4db3a3e14..b0a026654ac2e3b994eb7a5248ca9faa277f8985 100644
--- a/mmcv/cnn/bricks/drop.py
+++ b/mmcv/cnn/bricks/drop.py
@@ -1,14 +1,12 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Any, Dict, Optional
-
import torch
import torch.nn as nn
-from mmengine.registry import MODELS
+
+from mmcv import build_from_cfg
+from .registry import DROPOUT_LAYERS
-def drop_path(x: torch.Tensor,
- drop_prob: float = 0.,
- training: bool = False) -> torch.Tensor:
+def drop_path(x, drop_prob=0., training=False):
"""Drop paths (Stochastic Depth) per sample (when applied in main path of
residual blocks).
@@ -26,7 +24,7 @@ def drop_path(x: torch.Tensor,
return output
-@MODELS.register_module()
+@DROPOUT_LAYERS.register_module()
class DropPath(nn.Module):
"""Drop paths (Stochastic Depth) per sample (when applied in main path of
residual blocks).
@@ -38,15 +36,15 @@ class DropPath(nn.Module):
drop_prob (float): Probability of the path to be zeroed. Default: 0.1
"""
- def __init__(self, drop_prob: float = 0.1):
- super().__init__()
+ def __init__(self, drop_prob=0.1):
+ super(DropPath, self).__init__()
self.drop_prob = drop_prob
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
return drop_path(x, self.drop_prob, self.training)
-@MODELS.register_module()
+@DROPOUT_LAYERS.register_module()
class Dropout(nn.Dropout):
"""A wrapper for ``torch.nn.Dropout``, We rename the ``p`` of
``torch.nn.Dropout`` to ``drop_prob`` so as to be consistent with
@@ -58,10 +56,10 @@ class Dropout(nn.Dropout):
inplace (bool): Do the operation inplace or not. Default: False.
"""
- def __init__(self, drop_prob: float = 0.5, inplace: bool = False):
+ def __init__(self, drop_prob=0.5, inplace=False):
super().__init__(p=drop_prob, inplace=inplace)
-def build_dropout(cfg: Dict, default_args: Optional[Dict] = None) -> Any:
+def build_dropout(cfg, default_args=None):
"""Builder for drop out layers."""
- return MODELS.build(cfg, default_args=default_args)
+ return build_from_cfg(cfg, DROPOUT_LAYERS, default_args)
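For reference, a sketch of how the restored `build_dropout` is used; `DropPath` reduces to the identity in `eval()` mode:

```python
import torch

from mmcv.cnn.bricks.drop import build_dropout

drop = build_dropout(dict(type='DropPath', drop_prob=0.2))

x = torch.randn(4, 16)
drop.train()
y = drop(x)  # paths are randomly zeroed, kept paths are rescaled
drop.eval()
assert torch.equal(drop(x), x)  # no-op at inference time
```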
diff --git a/mmcv/cnn/bricks/generalized_attention.py b/mmcv/cnn/bricks/generalized_attention.py
index ab20467f63876be8655bb7a54568eaa7dc74ba72..988d9adf2f289ef223bd1c680a5ae1d3387f0269 100644
--- a/mmcv/cnn/bricks/generalized_attention.py
+++ b/mmcv/cnn/bricks/generalized_attention.py
@@ -5,16 +5,17 @@ import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
-from mmengine.model import kaiming_init
-from mmengine.registry import MODELS
+from ..utils import kaiming_init
+from .registry import PLUGIN_LAYERS
-@MODELS.register_module()
+
+@PLUGIN_LAYERS.register_module()
class GeneralizedAttention(nn.Module):
"""GeneralizedAttention module.
See 'An Empirical Study of Spatial Attention Mechanisms in Deep Networks'
(https://arxiv.org/abs/1904.05873) for details.
Args:
in_channels (int): Channels of the input feature map.
@@ -44,16 +45,16 @@ class GeneralizedAttention(nn.Module):
_abbr_ = 'gen_attention_block'
def __init__(self,
- in_channels: int,
- spatial_range: int = -1,
- num_heads: int = 9,
- position_embedding_dim: int = -1,
- position_magnitude: int = 1,
- kv_stride: int = 2,
- q_stride: int = 1,
- attention_type: str = '1111'):
+ in_channels,
+ spatial_range=-1,
+ num_heads=9,
+ position_embedding_dim=-1,
+ position_magnitude=1,
+ kv_stride=2,
+ q_stride=1,
+ attention_type='1111'):
- super().__init__()
+ super(GeneralizedAttention, self).__init__()
# hard range means local range for non-local operation
self.position_embedding_dim = (
@@ -130,7 +131,7 @@ class GeneralizedAttention(nn.Module):
max_len_kv = int((max_len - 1.0) / self.kv_stride + 1)
local_constraint_map = np.ones(
(max_len, max_len, max_len_kv, max_len_kv), dtype=int)
for iy in range(max_len):
for ix in range(max_len):
local_constraint_map[
@@ -212,7 +213,7 @@ class GeneralizedAttention(nn.Module):
return embedding_x, embedding_y
- def forward(self, x_input: torch.Tensor) -> torch.Tensor:
+ def forward(self, x_input):
num_heads = self.num_heads
# use empirical_attention
@@ -350,7 +351,7 @@ class GeneralizedAttention(nn.Module):
repeat(n, 1, 1, 1)
position_feat_x_reshape = position_feat_x.\
- view(n, num_heads, w * w_kv, self.qk_embed_dim)
+ view(n, num_heads, w*w_kv, self.qk_embed_dim)
position_feat_y_reshape = position_feat_y.\
view(n, num_heads, h * h_kv, self.qk_embed_dim)
diff --git a/mmcv/cnn/bricks/hsigmoid.py b/mmcv/cnn/bricks/hsigmoid.py
index 423e0aad9ae154cf651d289327bc19da940cf449..30b1a3d6580cf0360710426fbea1f05acdf07b4b 100644
--- a/mmcv/cnn/bricks/hsigmoid.py
+++ b/mmcv/cnn/bricks/hsigmoid.py
@@ -1,24 +1,18 @@
# Copyright (c) OpenMMLab. All rights reserved.
-import warnings
-
-import torch
import torch.nn as nn
-from mmengine.registry import MODELS
+
+from .registry import ACTIVATION_LAYERS
-@MODELS.register_module()
+@ACTIVATION_LAYERS.register_module()
class HSigmoid(nn.Module):
"""Hard Sigmoid Module. Apply the hard sigmoid function:
Hsigmoid(x) = min(max((x + bias) / divisor, min_value), max_value)
- Default: Hsigmoid(x) = min(max((x + 3) / 6, 0), 1)
-
- Note:
- In MMCV v1.4.4, we modified the default value of args to align with
- PyTorch official.
+ Default: Hsigmoid(x) = min(max((x + 1) / 2, 0), 1)
Args:
- bias (float): Bias of the input feature map. Default: 3.0.
- divisor (float): Divisor of the input feature map. Default: 6.0.
+ bias (float): Bias of the input feature map. Default: 1.0.
+ divisor (float): Divisor of the input feature map. Default: 2.0.
min_value (float): Lower bound value. Default: 0.0.
max_value (float): Upper bound value. Default: 1.0.
@@ -26,25 +20,15 @@ class HSigmoid(nn.Module):
Tensor: The output tensor.
"""
- def __init__(self,
- bias: float = 3.0,
- divisor: float = 6.0,
- min_value: float = 0.0,
- max_value: float = 1.0):
- super().__init__()
- warnings.warn(
- 'In MMCV v1.4.4, we modified the default value of args to align '
- 'with PyTorch official. Previous Implementation: '
- 'Hsigmoid(x) = min(max((x + 1) / 2, 0), 1). '
- 'Current Implementation: '
- 'Hsigmoid(x) = min(max((x + 3) / 6, 0), 1).')
+ def __init__(self, bias=1.0, divisor=2.0, min_value=0.0, max_value=1.0):
+ super(HSigmoid, self).__init__()
self.bias = bias
self.divisor = divisor
assert self.divisor != 0
self.min_value = min_value
self.max_value = max_value
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
x = (x + self.bias) / self.divisor
return x.clamp_(self.min_value, self.max_value)
diff --git a/mmcv/cnn/bricks/hswish.py b/mmcv/cnn/bricks/hswish.py
index 6b6dd006d424bd39a3f99ceefda816408309d71c..7e0c090ff037c99ee6c5c84c4592e87beae02208 100644
--- a/mmcv/cnn/bricks/hswish.py
+++ b/mmcv/cnn/bricks/hswish.py
@@ -1,11 +1,10 @@
# Copyright (c) OpenMMLab. All rights reserved.
-import torch
import torch.nn as nn
-from mmengine.registry import MODELS
-from mmengine.utils import digit_version
-from mmengine.utils.dl_utils import TORCH_VERSION
+from .registry import ACTIVATION_LAYERS
+
+@ACTIVATION_LAYERS.register_module()
class HSwish(nn.Module):
"""Hard Swish Module.
@@ -22,18 +21,9 @@ class HSwish(nn.Module):
Tensor: The output tensor.
"""
- def __init__(self, inplace: bool = False):
- super().__init__()
+ def __init__(self, inplace=False):
+ super(HSwish, self).__init__()
self.act = nn.ReLU6(inplace)
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
return x * self.act(x + 3) / 6
-
-
-if (TORCH_VERSION == 'parrots'
- or digit_version(TORCH_VERSION) < digit_version('1.7')):
- # Hardswish is not supported when PyTorch version < 1.6.
- # And Hardswish in PyTorch 1.6 does not support inplace.
- MODELS.register_module(module=HSwish)
-else:
- MODELS.register_module(module=nn.Hardswish, name='HSwish')
diff --git a/mmcv/cnn/bricks/non_local.py b/mmcv/cnn/bricks/non_local.py
index 8dd4465cd62fcb07ec1bc3410ebd272f427ec6b1..92d00155ef275c1201ea66bba30470a1785cc5d7 100644
--- a/mmcv/cnn/bricks/non_local.py
+++ b/mmcv/cnn/bricks/non_local.py
@@ -1,13 +1,12 @@
# Copyright (c) OpenMMLab. All rights reserved.
from abc import ABCMeta
-from typing import Dict, Optional
import torch
import torch.nn as nn
-from mmengine.model import constant_init, normal_init
-from mmengine.registry import MODELS
+from ..utils import constant_init, normal_init
from .conv_module import ConvModule
+from .registry import PLUGIN_LAYERS
class _NonLocalNd(nn.Module, metaclass=ABCMeta):
@@ -34,14 +33,14 @@ class _NonLocalNd(nn.Module, metaclass=ABCMeta):
"""
def __init__(self,
- in_channels: int,
- reduction: int = 2,
- use_scale: bool = True,
- conv_cfg: Optional[Dict] = None,
- norm_cfg: Optional[Dict] = None,
- mode: str = 'embedded_gaussian',
+ in_channels,
+ reduction=2,
+ use_scale=True,
+ conv_cfg=None,
+ norm_cfg=None,
+ mode='embedded_gaussian',
**kwargs):
- super().__init__()
+ super(_NonLocalNd, self).__init__()
self.in_channels = in_channels
self.reduction = reduction
self.use_scale = use_scale
@@ -62,7 +61,7 @@ class _NonLocalNd(nn.Module, metaclass=ABCMeta):
self.inter_channels,
kernel_size=1,
conv_cfg=conv_cfg,
- act_cfg=None) # type: ignore
+ act_cfg=None)
self.conv_out = ConvModule(
self.inter_channels,
self.in_channels,
@@ -97,7 +96,7 @@ class _NonLocalNd(nn.Module, metaclass=ABCMeta):
self.init_weights(**kwargs)
- def init_weights(self, std: float = 0.01, zeros_init: bool = True) -> None:
+ def init_weights(self, std=0.01, zeros_init=True):
if self.mode != 'gaussian':
for m in [self.g, self.theta, self.phi]:
normal_init(m.conv, std=std)
@@ -114,8 +113,7 @@ class _NonLocalNd(nn.Module, metaclass=ABCMeta):
else:
normal_init(self.conv_out.norm, std=std)
- def gaussian(self, theta_x: torch.Tensor,
- phi_x: torch.Tensor) -> torch.Tensor:
+ def gaussian(self, theta_x, phi_x):
# NonLocal1d pairwise_weight: [N, H, H]
# NonLocal2d pairwise_weight: [N, HxW, HxW]
# NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
@@ -123,8 +121,7 @@ class _NonLocalNd(nn.Module, metaclass=ABCMeta):
pairwise_weight = pairwise_weight.softmax(dim=-1)
return pairwise_weight
- def embedded_gaussian(self, theta_x: torch.Tensor,
- phi_x: torch.Tensor) -> torch.Tensor:
+ def embedded_gaussian(self, theta_x, phi_x):
# NonLocal1d pairwise_weight: [N, H, H]
# NonLocal2d pairwise_weight: [N, HxW, HxW]
# NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
@@ -135,8 +132,7 @@ class _NonLocalNd(nn.Module, metaclass=ABCMeta):
pairwise_weight = pairwise_weight.softmax(dim=-1)
return pairwise_weight
- def dot_product(self, theta_x: torch.Tensor,
- phi_x: torch.Tensor) -> torch.Tensor:
+ def dot_product(self, theta_x, phi_x):
# NonLocal1d pairwise_weight: [N, H, H]
# NonLocal2d pairwise_weight: [N, HxW, HxW]
# NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
@@ -144,8 +140,7 @@ class _NonLocalNd(nn.Module, metaclass=ABCMeta):
pairwise_weight /= pairwise_weight.shape[-1]
return pairwise_weight
- def concatenation(self, theta_x: torch.Tensor,
- phi_x: torch.Tensor) -> torch.Tensor:
+ def concatenation(self, theta_x, phi_x):
# NonLocal1d pairwise_weight: [N, H, H]
# NonLocal2d pairwise_weight: [N, HxW, HxW]
# NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
@@ -162,7 +157,7 @@ class _NonLocalNd(nn.Module, metaclass=ABCMeta):
return pairwise_weight
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
# Assume `reduction = 1`, then `inter_channels = C`
# or `inter_channels = C` when `mode="gaussian"`
@@ -229,11 +224,12 @@ class NonLocal1d(_NonLocalNd):
"""
def __init__(self,
- in_channels: int,
- sub_sample: bool = False,
- conv_cfg: Dict = dict(type='Conv1d'),
+ in_channels,
+ sub_sample=False,
+ conv_cfg=dict(type='Conv1d'),
**kwargs):
- super().__init__(in_channels, conv_cfg=conv_cfg, **kwargs)
+ super(NonLocal1d, self).__init__(
+ in_channels, conv_cfg=conv_cfg, **kwargs)
self.sub_sample = sub_sample
@@ -246,7 +242,7 @@ class NonLocal1d(_NonLocalNd):
self.phi = max_pool_layer
-@MODELS.register_module()
+@PLUGIN_LAYERS.register_module()
class NonLocal2d(_NonLocalNd):
"""2D Non-local module.
@@ -262,11 +258,12 @@ class NonLocal2d(_NonLocalNd):
_abbr_ = 'nonlocal_block'
def __init__(self,
- in_channels: int,
- sub_sample: bool = False,
- conv_cfg: Dict = dict(type='Conv2d'),
+ in_channels,
+ sub_sample=False,
+ conv_cfg=dict(type='Conv2d'),
**kwargs):
- super().__init__(in_channels, conv_cfg=conv_cfg, **kwargs)
+ super(NonLocal2d, self).__init__(
+ in_channels, conv_cfg=conv_cfg, **kwargs)
self.sub_sample = sub_sample
@@ -292,11 +289,12 @@ class NonLocal3d(_NonLocalNd):
"""
def __init__(self,
- in_channels: int,
- sub_sample: bool = False,
- conv_cfg: Dict = dict(type='Conv3d'),
+ in_channels,
+ sub_sample=False,
+ conv_cfg=dict(type='Conv3d'),
**kwargs):
- super().__init__(in_channels, conv_cfg=conv_cfg, **kwargs)
+ super(NonLocal3d, self).__init__(
+ in_channels, conv_cfg=conv_cfg, **kwargs)
self.sub_sample = sub_sample
if sub_sample:
diff --git a/mmcv/cnn/bricks/norm.py b/mmcv/cnn/bricks/norm.py
index 2fff684af04286cf688bc4e8e61157426307f5e9..cfb326bdb8ced3ec17ab5c3203cb6d6784ff2e78 100644
--- a/mmcv/cnn/bricks/norm.py
+++ b/mmcv/cnn/bricks/norm.py
@@ -1,24 +1,23 @@
# Copyright (c) OpenMMLab. All rights reserved.
import inspect
-from typing import Dict, Tuple, Union
import torch.nn as nn
-from mmengine.registry import MODELS
-from mmengine.utils import is_tuple_of
-from mmengine.utils.dl_utils.parrots_wrapper import (SyncBatchNorm, _BatchNorm,
- _InstanceNorm)
-
-MODELS.register_module('BN', module=nn.BatchNorm2d)
-MODELS.register_module('BN1d', module=nn.BatchNorm1d)
-MODELS.register_module('BN2d', module=nn.BatchNorm2d)
-MODELS.register_module('BN3d', module=nn.BatchNorm3d)
-MODELS.register_module('SyncBN', module=SyncBatchNorm)
-MODELS.register_module('GN', module=nn.GroupNorm)
-MODELS.register_module('LN', module=nn.LayerNorm)
-MODELS.register_module('IN', module=nn.InstanceNorm2d)
-MODELS.register_module('IN1d', module=nn.InstanceNorm1d)
-MODELS.register_module('IN2d', module=nn.InstanceNorm2d)
-MODELS.register_module('IN3d', module=nn.InstanceNorm3d)
+
+from mmcv.utils import is_tuple_of
+from mmcv.utils.parrots_wrapper import SyncBatchNorm, _BatchNorm, _InstanceNorm
+from .registry import NORM_LAYERS
+
+NORM_LAYERS.register_module('BN', module=nn.BatchNorm2d)
+NORM_LAYERS.register_module('BN1d', module=nn.BatchNorm1d)
+NORM_LAYERS.register_module('BN2d', module=nn.BatchNorm2d)
+NORM_LAYERS.register_module('BN3d', module=nn.BatchNorm3d)
+NORM_LAYERS.register_module('SyncBN', module=SyncBatchNorm)
+NORM_LAYERS.register_module('GN', module=nn.GroupNorm)
+NORM_LAYERS.register_module('LN', module=nn.LayerNorm)
+NORM_LAYERS.register_module('IN', module=nn.InstanceNorm2d)
+NORM_LAYERS.register_module('IN1d', module=nn.InstanceNorm1d)
+NORM_LAYERS.register_module('IN2d', module=nn.InstanceNorm2d)
+NORM_LAYERS.register_module('IN3d', module=nn.InstanceNorm3d)
def infer_abbr(class_type):
@@ -70,9 +69,7 @@ def infer_abbr(class_type):
return 'norm_layer'
-def build_norm_layer(cfg: Dict,
- num_features: int,
- postfix: Union[int, str] = '') -> Tuple[str, nn.Module]:
+def build_norm_layer(cfg, num_features, postfix=''):
"""Build normalization layer.
Args:
@@ -86,9 +83,9 @@ def build_norm_layer(cfg: Dict,
to create named layer.
Returns:
- tuple[str, nn.Module]: The first element is the layer name consisting
- of abbreviation and postfix, e.g., bn1, gn. The second element is the
- created norm layer.
+ (str, nn.Module): The first element is the layer name consisting of
+ abbreviation and postfix, e.g., bn1, gn. The second element is the
+ created norm layer.
"""
if not isinstance(cfg, dict):
raise TypeError('cfg must be a dict')
@@ -97,15 +94,10 @@ def build_norm_layer(cfg: Dict,
cfg_ = cfg.copy()
layer_type = cfg_.pop('type')
+ if layer_type not in NORM_LAYERS:
+ raise KeyError(f'Unrecognized norm type {layer_type}')
- # Switch registry to the target scope. If `norm_layer` cannot be found
- # in the registry, fallback to search `norm_layer` in the
- # mmengine.MODELS.
- with MODELS.switch_scope_and_registry(None) as registry:
- norm_layer = registry.get(layer_type)
- if norm_layer is None:
- raise KeyError(f'Cannot find {norm_layer} in registry under scope '
- f'name {registry.scope}')
+ norm_layer = NORM_LAYERS.get(layer_type)
abbr = infer_abbr(norm_layer)
assert isinstance(postfix, (int, str))
@@ -127,8 +119,7 @@ def build_norm_layer(cfg: Dict,
return name, layer
-def is_norm(layer: nn.Module,
- exclude: Union[type, tuple, None] = None) -> bool:
+def is_norm(layer, exclude=None):
"""Check if a layer is a normalization layer.
Args:
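Usage sketch for the restored `build_norm_layer`, which returns a `(name, module)` pair as described in the docstring above:

```python
from mmcv.cnn import build_norm_layer

# name is the inferred abbreviation plus the postfix, e.g. 'bn1'.
name, bn = build_norm_layer(dict(type='BN', requires_grad=True), 64, postfix=1)
assert name == 'bn1'
assert bn.num_features == 64
```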
diff --git a/mmcv/cnn/bricks/padding.py b/mmcv/cnn/bricks/padding.py
index 4135a190d65170a5762ccc7a201439e276a9a5ab..e4ac6b28a1789bd551c613a7d3e7b622433ac7ec 100644
--- a/mmcv/cnn/bricks/padding.py
+++ b/mmcv/cnn/bricks/padding.py
@@ -1,19 +1,18 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Dict
-
import torch.nn as nn
-from mmengine.registry import MODELS
-MODELS.register_module('zero', module=nn.ZeroPad2d)
-MODELS.register_module('reflect', module=nn.ReflectionPad2d)
-MODELS.register_module('replicate', module=nn.ReplicationPad2d)
+from .registry import PADDING_LAYERS
+
+PADDING_LAYERS.register_module('zero', module=nn.ZeroPad2d)
+PADDING_LAYERS.register_module('reflect', module=nn.ReflectionPad2d)
+PADDING_LAYERS.register_module('replicate', module=nn.ReplicationPad2d)
-def build_padding_layer(cfg: Dict, *args, **kwargs) -> nn.Module:
+def build_padding_layer(cfg, *args, **kwargs):
"""Build padding layer.
Args:
cfg (dict): The padding layer config, which should contain:
- type (str): Layer type.
- layer args: Args needed to instantiate a padding layer.
@@ -27,15 +26,11 @@ def build_padding_layer(cfg: Dict, *args, **kwargs) -> nn.Module:
cfg_ = cfg.copy()
padding_type = cfg_.pop('type')
+ if padding_type not in PADDING_LAYERS:
+ raise KeyError(f'Unrecognized padding type {padding_type}.')
+ else:
+ padding_layer = PADDING_LAYERS.get(padding_type)
- # Switch registry to the target scope. If `padding_layer` cannot be found
- # in the registry, fallback to search `padding_layer` in the
- # mmengine.MODELS.
- with MODELS.switch_scope_and_registry(None) as registry:
- padding_layer = registry.get(padding_type)
- if padding_layer is None:
- raise KeyError(f'Cannot find {padding_layer} in registry under scope '
- f'name {registry.scope}')
layer = padding_layer(*args, **kwargs, **cfg_)
return layer
diff --git a/mmcv/cnn/bricks/plugin.py b/mmcv/cnn/bricks/plugin.py
index 83ba3737abe683648ddd0a49b7143330d0480b8d..07c010d4053174dd41107aa654ea67e82b46a25c 100644
--- a/mmcv/cnn/bricks/plugin.py
+++ b/mmcv/cnn/bricks/plugin.py
@@ -1,18 +1,15 @@
-# Copyright (c) OpenMMLab. All rights reserved.
import inspect
import platform
-from typing import Dict, Tuple, Union
-import torch.nn as nn
-from mmengine.registry import MODELS
+from .registry import PLUGIN_LAYERS
if platform.system() == 'Windows':
- import regex as re # type: ignore
+ import regex as re
else:
- import re # type: ignore
+ import re
-def infer_abbr(class_type: type) -> str:
+def infer_abbr(class_type):
"""Infer abbreviation from the class name.
This method will infer the abbreviation to map class types to
@@ -50,27 +47,25 @@ def infer_abbr(class_type: type) -> str:
raise TypeError(
f'class_type must be a type, but got {type(class_type)}')
if hasattr(class_type, '_abbr_'):
- return class_type._abbr_ # type: ignore
+ return class_type._abbr_
else:
return camel2snack(class_type.__name__)
-def build_plugin_layer(cfg: Dict,
- postfix: Union[int, str] = '',
- **kwargs) -> Tuple[str, nn.Module]:
+def build_plugin_layer(cfg, postfix='', **kwargs):
"""Build plugin layer.
Args:
- cfg (dict): cfg should contain:
-
- - type (str): identify plugin layer type.
- - layer args: args needed to instantiate a plugin layer.
+ cfg (dict): cfg should contain:
+ type (str): identify plugin layer type.
+ layer args: args needed to instantiate a plugin layer.
postfix (int, str): appended into norm abbreviation to
create named layer. Default: ''.
Returns:
- tuple[str, nn.Module]: The first one is the concatenation of
- abbreviation and postfix. The second is the created plugin layer.
+ tuple[str, nn.Module]:
+ name (str): abbreviation + postfix
+ layer (nn.Module): created plugin layer
"""
if not isinstance(cfg, dict):
raise TypeError('cfg must be a dict')
@@ -79,15 +74,10 @@ def build_plugin_layer(cfg: Dict,
cfg_ = cfg.copy()
layer_type = cfg_.pop('type')
+ if layer_type not in PLUGIN_LAYERS:
+ raise KeyError(f'Unrecognized plugin type {layer_type}')
- # Switch registry to the target scope. If `plugin_layer` cannot be found
- # in the registry, fallback to search `plugin_layer` in the
- # mmengine.MODELS.
- with MODELS.switch_scope_and_registry(None) as registry:
- plugin_layer = registry.get(layer_type)
- if plugin_layer is None:
- raise KeyError(f'Cannot find {plugin_layer} in registry under scope '
- f'name {registry.scope}')
+ plugin_layer = PLUGIN_LAYERS.get(layer_type)
abbr = infer_abbr(plugin_layer)
assert isinstance(postfix, (int, str))
diff --git a/mmcv/cnn/bricks/registry.py b/mmcv/cnn/bricks/registry.py
new file mode 100644
index 0000000000000000000000000000000000000000..c29279776dd523e706b6af8f9b9de700bed05ba7
--- /dev/null
+++ b/mmcv/cnn/bricks/registry.py
@@ -0,0 +1,16 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+from mmcv.utils import Registry
+
+CONV_LAYERS = Registry('conv layer')
+NORM_LAYERS = Registry('norm layer')
+ACTIVATION_LAYERS = Registry('activation layer')
+PADDING_LAYERS = Registry('padding layer')
+UPSAMPLE_LAYERS = Registry('upsample layer')
+PLUGIN_LAYERS = Registry('plugin layer')
+
+DROPOUT_LAYERS = Registry('dropout layer')
+POSITIONAL_ENCODING = Registry('position encoding')
+ATTENTION = Registry('attention')
+FEEDFORWARD_NETWORK = Registry('feed-forward network')
+TRANSFORMER_LAYER = Registry('transformer layer')
+TRANSFORMER_LAYER_SEQUENCE = Registry('transformer layer sequence')
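To illustrate the pattern these registries support, here is a hypothetical `ScaledReLU` (not part of this diff) registered and then built from a config dict:

```python
import torch.nn as nn

from mmcv.cnn.bricks.registry import ACTIVATION_LAYERS
from mmcv.utils import build_from_cfg


@ACTIVATION_LAYERS.register_module()
class ScaledReLU(nn.Module):
    """Toy activation used only to demonstrate registration."""

    def __init__(self, scale=2.0):
        super(ScaledReLU, self).__init__()
        self.scale = scale
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.scale * self.relu(x)


# The 'type' key selects the registered class; remaining keys are kwargs.
act = build_from_cfg(dict(type='ScaledReLU', scale=0.5), ACTIVATION_LAYERS)
```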
diff --git a/mmcv/cnn/bricks/scale.py b/mmcv/cnn/bricks/scale.py
index a47379898f75117e5ca2176d9a5f225f563d7b1e..c905fffcc8bf998d18d94f927591963c428025e2 100644
--- a/mmcv/cnn/bricks/scale.py
+++ b/mmcv/cnn/bricks/scale.py
@@ -13,45 +13,9 @@ class Scale(nn.Module):
scale (float): Initial value of scale factor. Default: 1.0
"""
- def __init__(self, scale: float = 1.0):
- super().__init__()
+ def __init__(self, scale=1.0):
+ super(Scale, self).__init__()
self.scale = nn.Parameter(torch.tensor(scale, dtype=torch.float))
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
return x * self.scale
-
-
-class LayerScale(nn.Module):
- """LayerScale layer.
-
- Args:
- dim (int): Dimension of input features.
- inplace (bool): Whether performs operation in-place.
- Default: `False`.
- data_format (str): The input data format, could be 'channels_last'
- or 'channels_first', representing (B, C, H, W) and
- (B, N, C) format data respectively. Default: 'channels_last'.
- scale (float): Initial value of scale factor. Default: 1.0
- """
-
- def __init__(self,
- dim: int,
- inplace: bool = False,
- data_format: str = 'channels_last',
- scale: float = 1e-5):
- super().__init__()
- assert data_format in ('channels_last', 'channels_first'), \
- "'data_format' could only be channels_last or channels_first."
- self.inplace = inplace
- self.data_format = data_format
- self.weight = nn.Parameter(torch.ones(dim) * scale)
-
- def forward(self, x) -> torch.Tensor:
- if self.data_format == 'channels_first':
- shape = tuple((1, -1, *(1 for _ in range(x.dim() - 2))))
- else:
- shape = tuple((*(1 for _ in range(x.dim() - 1)), -1))
- if self.inplace:
- return x.mul_(self.weight.view(*shape))
- else:
- return x * self.weight.view(*shape)
diff --git a/mmcv/cnn/bricks/swish.py b/mmcv/cnn/bricks/swish.py
index 75ad75b9d73f11375ed63491d9e29efd6f43f143..e2ca8ed7b749413f011ae54aac0cab27e6f0b51f 100644
--- a/mmcv/cnn/bricks/swish.py
+++ b/mmcv/cnn/bricks/swish.py
@@ -1,10 +1,11 @@
# Copyright (c) OpenMMLab. All rights reserved.
import torch
import torch.nn as nn
-from mmengine.registry import MODELS
+from .registry import ACTIVATION_LAYERS
-@MODELS.register_module()
+
+@ACTIVATION_LAYERS.register_module()
class Swish(nn.Module):
"""Swish Module.
@@ -18,7 +19,7 @@ class Swish(nn.Module):
"""
def __init__(self):
- super().__init__()
+ super(Swish, self).__init__()
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
return x * torch.sigmoid(x)
diff --git a/mmcv/cnn/bricks/transformer.py b/mmcv/cnn/bricks/transformer.py
index f83b9a6977bf821985cb4c2f78de84fcf103fffb..ed32688af40c0744289d07cd991b17a0dcb1c29f 100644
--- a/mmcv/cnn/bricks/transformer.py
+++ b/mmcv/cnn/bricks/transformer.py
@@ -1,26 +1,21 @@
# Copyright (c) OpenMMLab. All rights reserved.
import copy
-import math
import warnings
-from typing import Sequence
import torch
import torch.nn as nn
-import torch.nn.functional as F
-from mmengine.config import ConfigDict
-from mmengine.model import BaseModule, ModuleList, Sequential
-from mmengine.registry import MODELS
-from mmengine.utils import deprecated_api_warning, to_2tuple
-
-from mmcv.cnn import (Linear, build_activation_layer, build_conv_layer,
- build_norm_layer)
+
+from mmcv import ConfigDict, deprecated_api_warning
+from mmcv.cnn import Linear, build_activation_layer, build_norm_layer
+from mmcv.runner.base_module import BaseModule, ModuleList, Sequential
+from mmcv.utils import build_from_cfg
from .drop import build_dropout
-from .scale import LayerScale
+from .registry import (ATTENTION, FEEDFORWARD_NETWORK, POSITIONAL_ENCODING,
+ TRANSFORMER_LAYER, TRANSFORMER_LAYER_SEQUENCE)
# Avoid BC-breaking of importing MultiScaleDeformableAttention from this file
try:
- from mmcv.ops.multi_scale_deform_attn import \
- MultiScaleDeformableAttention # noqa F401
+ from mmcv.ops.multi_scale_deform_attn import MultiScaleDeformableAttention # noqa F401
warnings.warn(
ImportWarning(
'``MultiScaleDeformableAttention`` has been moved to '
@@ -32,379 +27,35 @@ try:
except ImportError:
warnings.warn('Fail to import ``MultiScaleDeformableAttention`` from '
'``mmcv.ops.multi_scale_deform_attn``, '
- 'You should install ``mmcv`` rather than ``mmcv-lite`` '
- 'if you need this module. ')
+ 'You should install ``mmcv-full`` if you need this module. ')
def build_positional_encoding(cfg, default_args=None):
"""Builder for Position Encoding."""
- return MODELS.build(cfg, default_args=default_args)
+ return build_from_cfg(cfg, POSITIONAL_ENCODING, default_args)
def build_attention(cfg, default_args=None):
"""Builder for attention."""
- return MODELS.build(cfg, default_args=default_args)
+ return build_from_cfg(cfg, ATTENTION, default_args)
def build_feedforward_network(cfg, default_args=None):
"""Builder for feed-forward network (FFN)."""
- return MODELS.build(cfg, default_args=default_args)
+ return build_from_cfg(cfg, FEEDFORWARD_NETWORK, default_args)
def build_transformer_layer(cfg, default_args=None):
"""Builder for transformer layer."""
- return MODELS.build(cfg, default_args=default_args)
+ return build_from_cfg(cfg, TRANSFORMER_LAYER, default_args)
def build_transformer_layer_sequence(cfg, default_args=None):
"""Builder for transformer encoder and transformer decoder."""
- return MODELS.build(cfg, default_args=default_args)
-
-
-class AdaptivePadding(nn.Module):
- """Applies padding adaptively to the input.
-
- This module can make input get fully covered by filter
- you specified. It support two modes "same" and "corner". The
- "same" mode is same with "SAME" padding mode in TensorFlow, pad
- zero around input. The "corner" mode would pad zero
- to bottom right.
-
- Args:
- kernel_size (int | tuple): Size of the kernel. Default: 1.
- stride (int | tuple): Stride of the filter. Default: 1.
- dilation (int | tuple): Spacing between kernel elements.
- Default: 1.
- padding (str): Support "same" and "corner", "corner" mode
- would pad zero to bottom right, and "same" mode would
- pad zero around input. Default: "corner".
-
- Example:
- >>> kernel_size = 16
- >>> stride = 16
- >>> dilation = 1
- >>> input = torch.rand(1, 1, 15, 17)
- >>> adap_pad = AdaptivePadding(
- >>> kernel_size=kernel_size,
- >>> stride=stride,
- >>> dilation=dilation,
- >>> padding="corner")
- >>> out = adap_pad(input)
- >>> assert (out.shape[2], out.shape[3]) == (16, 32)
- >>> input = torch.rand(1, 1, 16, 17)
- >>> out = adap_pad(input)
- >>> assert (out.shape[2], out.shape[3]) == (16, 32)
- """
-
- def __init__(self, kernel_size=1, stride=1, dilation=1, padding='corner'):
- super().__init__()
- assert padding in ('same', 'corner')
-
- kernel_size = to_2tuple(kernel_size)
- stride = to_2tuple(stride)
- dilation = to_2tuple(dilation)
-
- self.padding = padding
- self.kernel_size = kernel_size
- self.stride = stride
- self.dilation = dilation
-
- def get_pad_shape(self, input_shape):
- """Calculate the padding size of input.
-
- Args:
- input_shape (:obj:`torch.Size`): arrange as (H, W).
-
- Returns:
- Tuple[int]: The padding size along the
- original H and W directions
- """
- input_h, input_w = input_shape
- kernel_h, kernel_w = self.kernel_size
- stride_h, stride_w = self.stride
- output_h = math.ceil(input_h / stride_h)
- output_w = math.ceil(input_w / stride_w)
- pad_h = max((output_h - 1) * stride_h +
- (kernel_h - 1) * self.dilation[0] + 1 - input_h, 0)
- pad_w = max((output_w - 1) * stride_w +
- (kernel_w - 1) * self.dilation[1] + 1 - input_w, 0)
- return pad_h, pad_w
-
- def forward(self, x):
- """Add padding to `x`
-
- Args:
- x (Tensor): Input tensor has shape (B, C, H, W).
-
- Returns:
- Tensor: The tensor with adaptive padding
- """
- pad_h, pad_w = self.get_pad_shape(x.size()[-2:])
- if pad_h > 0 or pad_w > 0:
- if self.padding == 'corner':
- x = F.pad(x, [0, pad_w, 0, pad_h])
- elif self.padding == 'same':
- x = F.pad(x, [
- pad_w // 2, pad_w - pad_w // 2, pad_h // 2,
- pad_h - pad_h // 2
- ])
- return x
+ return build_from_cfg(cfg, TRANSFORMER_LAYER_SEQUENCE, default_args)
-class PatchEmbed(BaseModule):
- """Image to Patch Embedding.
-
- We use a conv layer to implement PatchEmbed.
-
- Args:
- in_channels (int): The num of input channels. Default: 3
- embed_dims (int): The dimensions of embedding. Default: 768
- conv_type (str): The type of convolution
- to generate patch embedding. Default: "Conv2d".
- kernel_size (int): The kernel_size of embedding conv. Default: 16.
- stride (int): The slide stride of embedding conv.
- Default: 16.
- padding (int | tuple | string): The padding length of
- embedding conv. When it is a string, it means the mode
- of adaptive padding, support "same" and "corner" now.
- Default: "corner".
- dilation (int): The dilation rate of embedding conv. Default: 1.
- bias (bool): Bias of embed conv. Default: True.
- norm_cfg (dict, optional): Config dict for normalization layer.
- Default: None.
- input_size (int | tuple | None): The size of input, which will be
- used to calculate the out size. Only works when `dynamic_size`
- is False. Default: None.
- init_cfg (`mmcv.ConfigDict`, optional): The Config for initialization.
- Default: None.
- """
-
- def __init__(self,
- in_channels=3,
- embed_dims=768,
- conv_type='Conv2d',
- kernel_size=16,
- stride=16,
- padding='corner',
- dilation=1,
- bias=True,
- norm_cfg=None,
- input_size=None,
- init_cfg=None):
- super().__init__(init_cfg=init_cfg)
-
- self.embed_dims = embed_dims
- if stride is None:
- stride = kernel_size
-
- kernel_size = to_2tuple(kernel_size)
- stride = to_2tuple(stride)
- dilation = to_2tuple(dilation)
-
- if isinstance(padding, str):
- self.adaptive_padding = AdaptivePadding(
- kernel_size=kernel_size,
- stride=stride,
- dilation=dilation,
- padding=padding)
- # disable the padding of conv
- padding = 0
- else:
- self.adaptive_padding = None
- padding = to_2tuple(padding)
-
- self.projection = build_conv_layer(
- dict(type=conv_type),
- in_channels=in_channels,
- out_channels=embed_dims,
- kernel_size=kernel_size,
- stride=stride,
- padding=padding,
- dilation=dilation,
- bias=bias)
-
- if norm_cfg is not None:
- self.norm = build_norm_layer(norm_cfg, embed_dims)[1]
- else:
- self.norm = None
-
- if input_size:
- input_size = to_2tuple(input_size)
- # `init_out_size` would be used outside to
- # calculate the num_patches
- # e.g. when `use_abs_pos_embed` outside
- self.init_input_size = input_size
- if self.adaptive_padding:
- pad_h, pad_w = self.adaptive_padding.get_pad_shape(input_size)
- input_h, input_w = input_size
- input_h = input_h + pad_h
- input_w = input_w + pad_w
- input_size = (input_h, input_w)
-
- # https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
- h_out = (input_size[0] + 2 * padding[0] - dilation[0] *
- (kernel_size[0] - 1) - 1) // stride[0] + 1
- w_out = (input_size[1] + 2 * padding[1] - dilation[1] *
- (kernel_size[1] - 1) - 1) // stride[1] + 1
- self.init_out_size = (h_out, w_out)
- else:
- self.init_input_size = None
- self.init_out_size = None
-
- def forward(self, x):
- """
- Args:
- x (Tensor): Has shape (B, C, H, W). In most case, C is 3.
-
- Returns:
- tuple: Contains merged results and its spatial shape.
-
- - x (Tensor): Has shape (B, out_h * out_w, embed_dims)
- - out_size (tuple[int]): Spatial shape of x, arrange as
- (out_h, out_w).
- """
-
- if self.adaptive_padding:
- x = self.adaptive_padding(x)
-
- x = self.projection(x)
- out_size = (x.shape[2], x.shape[3])
- x = x.flatten(2).transpose(1, 2)
- if self.norm is not None:
- x = self.norm(x)
- return x, out_size
-
-
-class PatchMerging(BaseModule):
- """Merge patch feature map.
-
- This layer groups feature map by kernel_size, and applies norm and linear
- layers to the grouped feature map ((used in Swin Transformer)).
- Our implementation uses `nn.Unfold` to
- merge patches, which is about 25% faster than the original
- implementation. However, we need to modify pretrained
- models for compatibility.
-
- Args:
-        in_channels (int): The number of input channels.
-        out_channels (int): The number of output channels.
- kernel_size (int | tuple, optional): the kernel size in the unfold
- layer. Defaults to 2.
- stride (int | tuple, optional): the stride of the sliding blocks in the
- unfold layer. Default: None. (Would be set as `kernel_size`)
-        padding (int | tuple | string): The padding length of the
-            unfold layer. When it is a string, it means the mode of
-            adaptive padding, which supports "same" and "corner" now.
-            Default: "corner".
- dilation (int | tuple, optional): dilation parameter in the unfold
- layer. Default: 1.
- bias (bool, optional): Whether to add bias in linear layer or not.
-            Default: False.
- norm_cfg (dict, optional): Config dict for normalization layer.
- Default: dict(type='LN').
- init_cfg (dict, optional): The extra config for initialization.
- Default: None.
- """
-
- def __init__(self,
- in_channels,
- out_channels,
- kernel_size=2,
- stride=None,
- padding='corner',
- dilation=1,
- bias=False,
- norm_cfg=dict(type='LN'),
- init_cfg=None):
- super().__init__(init_cfg=init_cfg)
- self.in_channels = in_channels
- self.out_channels = out_channels
-        if stride is None:
-            stride = kernel_size
-
- kernel_size = to_2tuple(kernel_size)
- stride = to_2tuple(stride)
- dilation = to_2tuple(dilation)
-
- if isinstance(padding, str):
- self.adaptive_padding = AdaptivePadding(
- kernel_size=kernel_size,
- stride=stride,
- dilation=dilation,
- padding=padding)
- # disable the padding of unfold
- padding = 0
- else:
- self.adaptive_padding = None
-
- padding = to_2tuple(padding)
- self.sampler = nn.Unfold(
- kernel_size=kernel_size,
- dilation=dilation,
- padding=padding,
- stride=stride)
-
- sample_dim = kernel_size[0] * kernel_size[1] * in_channels
-
- if norm_cfg is not None:
- self.norm = build_norm_layer(norm_cfg, sample_dim)[1]
- else:
- self.norm = None
-
- self.reduction = nn.Linear(sample_dim, out_channels, bias=bias)
-
- def forward(self, x, input_size):
- """
- Args:
- x (Tensor): Has shape (B, H*W, C_in).
-            input_size (tuple[int]): The spatial shape of x, arranged as (H, W).
- Default: None.
-
- Returns:
- tuple: Contains merged results and its spatial shape.
-
- - x (Tensor): Has shape (B, Merged_H * Merged_W, C_out)
-            - out_size (tuple[int]): Spatial shape of x, arranged as
- (Merged_H, Merged_W).
- """
- B, L, C = x.shape
-        assert isinstance(input_size, Sequence), \
-            f'Expect input_size to be a `Sequence`, but got {input_size}'
-
- H, W = input_size
- assert L == H * W, 'input feature has wrong size'
-
- x = x.view(B, H, W, C).permute([0, 3, 1, 2]) # B, C, H, W
-
- if self.adaptive_padding:
- x = self.adaptive_padding(x)
- H, W = x.shape[-2:]
-
- # Use nn.Unfold to merge patch. About 25% faster than original method,
- # but need to modify pretrained model for compatibility
-        # if kernel_size=2 and stride=2, x should have shape (B, 4*C, H/2*W/2)
- x = self.sampler(x)
-
- out_h = (H + 2 * self.sampler.padding[0] - self.sampler.dilation[0] *
- (self.sampler.kernel_size[0] - 1) -
- 1) // self.sampler.stride[0] + 1
- out_w = (W + 2 * self.sampler.padding[1] - self.sampler.dilation[1] *
- (self.sampler.kernel_size[1] - 1) -
- 1) // self.sampler.stride[1] + 1
-
- output_size = (out_h, out_w)
- x = x.transpose(1, 2) # B, H/2*W/2, 4*C
- x = self.norm(x) if self.norm else x
- x = self.reduction(x)
- return x, output_size
-
-
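For orientation, the unfold-based merging above reduces to the following plain-PyTorch sketch for ``kernel_size=2, stride=2`` (shapes are illustrative; the `Linear` here is freshly initialized, unlike the registered module):

```python
import torch
import torch.nn as nn

B, H, W, C = 2, 8, 8, 96
x = torch.randn(B, H * W, C).view(B, H, W, C).permute(0, 3, 1, 2)  # B, C, H, W
x = nn.Unfold(kernel_size=2, stride=2)(x)     # B, 4*C, (H/2)*(W/2)
x = x.transpose(1, 2)                         # B, (H/2)*(W/2), 4*C
x = nn.Linear(4 * C, 2 * C, bias=False)(x)    # channel reduction
print(x.shape)                                # torch.Size([2, 16, 192])
```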
-@MODELS.register_module()
+@ATTENTION.register_module()
class MultiheadAttention(BaseModule):
"""A wrapper for ``torch.nn.MultiheadAttention``.
@@ -436,13 +87,12 @@ class MultiheadAttention(BaseModule):
init_cfg=None,
batch_first=False,
**kwargs):
- super().__init__(init_cfg)
+ super(MultiheadAttention, self).__init__(init_cfg)
if 'dropout' in kwargs:
- warnings.warn(
- 'The arguments `dropout` in MultiheadAttention '
- 'has been deprecated, now you can separately '
- 'set `attn_drop`(float), proj_drop(float), '
- 'and `dropout_layer`(dict) ', DeprecationWarning)
+            warnings.warn('The argument `dropout` in MultiheadAttention '
+                          'has been deprecated; now you can separately '
+                          'set `attn_drop`(float), `proj_drop`(float), '
+                          'and `dropout_layer`(dict) ')
attn_drop = kwargs['dropout']
dropout_layer['drop_prob'] = kwargs.pop('dropout')
@@ -504,9 +154,9 @@ class MultiheadAttention(BaseModule):
Returns:
Tensor: forwarded results with shape
- [num_queries, bs, embed_dims]
- if self.batch_first is False, else
-            [bs, num_queries, embed_dims].
+ [num_queries, bs, embed_dims]
+ if self.batch_first is False, else
+                [bs, num_queries, embed_dims].
"""
if key is None:
@@ -552,7 +202,7 @@ class MultiheadAttention(BaseModule):
return identity + self.dropout_layer(self.proj_drop(out))
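For reference, the underlying ``torch.nn.MultiheadAttention`` uses the ``(num_queries, bs, embed_dims)`` layout that the wrapper documents; a plain-PyTorch sketch (standard PyTorch API, not the mmcv class itself):

```python
import torch
import torch.nn as nn

embed_dims, num_heads = 256, 8
attn = nn.MultiheadAttention(embed_dims, num_heads)
query = torch.randn(100, 2, embed_dims)  # (num_queries, bs, embed_dims)
out, _ = attn(query, query, query)       # self-attention
print(out.shape)                         # torch.Size([100, 2, 256])
```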
-@MODELS.register_module()
+@FEEDFORWARD_NETWORK.register_module()
class FFN(BaseModule):
"""Implements feed-forward networks (FFNs) with identity connection.
@@ -573,8 +223,6 @@ class FFN(BaseModule):
when adding the shortcut.
init_cfg (obj:`mmcv.ConfigDict`): The Config for initialization.
Default: None.
- layer_scale_init_value (float): Initial value of scale factor in
- LayerScale. Default: 1.0
"""
@deprecated_api_warning(
@@ -592,21 +240,23 @@ class FFN(BaseModule):
dropout_layer=None,
add_identity=True,
init_cfg=None,
- layer_scale_init_value=0.):
- super().__init__(init_cfg)
+ **kwargs):
+ super(FFN, self).__init__(init_cfg)
assert num_fcs >= 2, 'num_fcs should be no less ' \
f'than 2. got {num_fcs}.'
self.embed_dims = embed_dims
self.feedforward_channels = feedforward_channels
self.num_fcs = num_fcs
+ self.act_cfg = act_cfg
+ self.activate = build_activation_layer(act_cfg)
layers = []
in_channels = embed_dims
for _ in range(num_fcs - 1):
layers.append(
Sequential(
- Linear(in_channels, feedforward_channels),
- build_activation_layer(act_cfg), nn.Dropout(ffn_drop)))
+ Linear(in_channels, feedforward_channels), self.activate,
+ nn.Dropout(ffn_drop)))
in_channels = feedforward_channels
layers.append(Linear(feedforward_channels, embed_dims))
layers.append(nn.Dropout(ffn_drop))
@@ -615,11 +265,6 @@ class FFN(BaseModule):
dropout_layer) if dropout_layer else torch.nn.Identity()
self.add_identity = add_identity
- if layer_scale_init_value > 0:
- self.gamma2 = LayerScale(embed_dims, scale=layer_scale_init_value)
- else:
- self.gamma2 = nn.Identity()
-
@deprecated_api_warning({'residual': 'identity'}, cls_name='FFN')
def forward(self, x, identity=None):
"""Forward function for `FFN`.
@@ -627,7 +272,6 @@ class FFN(BaseModule):
The function would add x to the output tensor if residue is None.
"""
out = self.layers(x)
- out = self.gamma2(out)
if not self.add_identity:
return self.dropout_layer(out)
if identity is None:
@@ -635,7 +279,7 @@ class FFN(BaseModule):
return identity + self.dropout_layer(out)
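Functionally, the default ``num_fcs=2`` configuration above reduces to this plain-PyTorch sketch (hyperparameters are illustrative):

```python
import torch
import torch.nn as nn

embed_dims, feedforward_channels = 256, 1024
ffn = nn.Sequential(
    nn.Linear(embed_dims, feedforward_channels), nn.ReLU(inplace=True),
    nn.Dropout(0.1), nn.Linear(feedforward_channels, embed_dims),
    nn.Dropout(0.1))
x = torch.randn(2, 100, embed_dims)
out = x + ffn(x)  # the add_identity=True behaviour
print(out.shape)  # torch.Size([2, 100, 256])
```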
-@MODELS.register_module()
+@TRANSFORMER_LAYER.register_module()
class BaseTransformerLayer(BaseModule):
"""Base `TransformerLayer` for vision transformer.
@@ -698,15 +342,15 @@ class BaseTransformerLayer(BaseModule):
            f'The argument `{ori_name}` in BaseTransformerLayer '
            f'has been deprecated; now you should set `{new_name}` '
f'and other FFN related arguments '
- f'to a dict named `ffn_cfgs`. ', DeprecationWarning)
+ f'to a dict named `ffn_cfgs`. ')
ffn_cfgs[new_name] = kwargs[ori_name]
- super().__init__(init_cfg)
+ super(BaseTransformerLayer, self).__init__(init_cfg)
self.batch_first = batch_first
- assert set(operation_order) & {
- 'self_attn', 'norm', 'ffn', 'cross_attn'} == \
+ assert set(operation_order) & set(
+ ['self_attn', 'norm', 'ffn', 'cross_attn']) == \
set(operation_order), f'The operation_order of' \
f' {self.__class__.__name__} should ' \
            f'contain all four operation types ' \
@@ -753,7 +397,7 @@ class BaseTransformerLayer(BaseModule):
assert len(ffn_cfgs) == num_ffns
for ffn_index in range(num_ffns):
if 'embed_dims' not in ffn_cfgs[ffn_index]:
- ffn_cfgs[ffn_index]['embed_dims'] = self.embed_dims
+                ffn_cfgs[ffn_index]['embed_dims'] = self.embed_dims
else:
assert ffn_cfgs[ffn_index]['embed_dims'] == self.embed_dims
self.ffns.append(
@@ -866,7 +510,7 @@ class BaseTransformerLayer(BaseModule):
return query
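An illustrative layer config in the ``attn_cfgs``/``ffn_cfgs``/``operation_order`` convention used above (values are examples, not defaults from the source):

```python
# A DETR-style encoder layer: every entry in operation_order must be one
# of 'self_attn', 'norm', 'ffn', 'cross_attn' (checked by the assert above).
encoder_layer_cfg = dict(
    type='BaseTransformerLayer',
    attn_cfgs=dict(type='MultiheadAttention', embed_dims=256, num_heads=8),
    ffn_cfgs=dict(embed_dims=256, feedforward_channels=1024),
    operation_order=('self_attn', 'norm', 'ffn', 'norm'))
```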
-@MODELS.register_module()
+@TRANSFORMER_LAYER_SEQUENCE.register_module()
class TransformerLayerSequence(BaseModule):
"""Base class for TransformerEncoder and TransformerDecoder in vision
transformer.
@@ -887,7 +531,7 @@ class TransformerLayerSequence(BaseModule):
"""
def __init__(self, transformerlayers=None, num_layers=None, init_cfg=None):
- super().__init__(init_cfg)
+ super(TransformerLayerSequence, self).__init__(init_cfg)
if isinstance(transformerlayers, dict):
transformerlayers = [
copy.deepcopy(transformerlayers) for _ in range(num_layers)
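The dict branch above simply replicates a single layer config ``num_layers`` times; a minimal sketch of that behaviour:

```python
import copy

num_layers = 6
layer_cfg = dict(type='BaseTransformerLayer')  # illustrative stub
layer_cfgs = [copy.deepcopy(layer_cfg) for _ in range(num_layers)]
assert len(layer_cfgs) == num_layers and layer_cfgs[0] is not layer_cfgs[1]
```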
diff --git a/mmcv/cnn/bricks/upsample.py b/mmcv/cnn/bricks/upsample.py
index d91689a1c8e16b97c0b4d76092c246e84c84256a..a1a353767d0ce8518f0d7289bed10dba0178ed12 100644
--- a/mmcv/cnn/bricks/upsample.py
+++ b/mmcv/cnn/bricks/upsample.py
@@ -1,17 +1,15 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Dict
-
-import torch
import torch.nn as nn
import torch.nn.functional as F
-from mmengine.model import xavier_init
-from mmengine.registry import MODELS
-MODELS.register_module('nearest', module=nn.Upsample)
-MODELS.register_module('bilinear', module=nn.Upsample)
+from ..utils import xavier_init
+from .registry import UPSAMPLE_LAYERS
+
+UPSAMPLE_LAYERS.register_module('nearest', module=nn.Upsample)
+UPSAMPLE_LAYERS.register_module('bilinear', module=nn.Upsample)
-@MODELS.register_module(name='pixel_shuffle')
+@UPSAMPLE_LAYERS.register_module(name='pixel_shuffle')
class PixelShufflePack(nn.Module):
"""Pixel Shuffle upsample layer.
@@ -26,9 +24,9 @@ class PixelShufflePack(nn.Module):
channels.
"""
- def __init__(self, in_channels: int, out_channels: int, scale_factor: int,
- upsample_kernel: int):
- super().__init__()
+ def __init__(self, in_channels, out_channels, scale_factor,
+ upsample_kernel):
+ super(PixelShufflePack, self).__init__()
self.in_channels = in_channels
self.out_channels = out_channels
self.scale_factor = scale_factor
@@ -43,13 +41,13 @@ class PixelShufflePack(nn.Module):
def init_weights(self):
xavier_init(self.upsample_conv, distribution='uniform')
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
x = self.upsample_conv(x)
x = F.pixel_shuffle(x, self.scale_factor)
return x
-def build_upsample_layer(cfg: Dict, *args, **kwargs) -> nn.Module:
+def build_upsample_layer(cfg, *args, **kwargs):
"""Build upsample layer.
Args:
@@ -57,7 +55,7 @@ def build_upsample_layer(cfg: Dict, *args, **kwargs) -> nn.Module:
- type (str): Layer type.
- scale_factor (int): Upsample ratio, which is not applicable to
- deconv.
+ deconv.
     - layer args: Args needed to instantiate an upsample layer.
     args (argument list): Arguments passed to the ``__init__``
         method of the corresponding upsample layer.
@@ -75,15 +73,11 @@ def build_upsample_layer(cfg: Dict, *args, **kwargs) -> nn.Module:
cfg_ = cfg.copy()
layer_type = cfg_.pop('type')
+ if layer_type not in UPSAMPLE_LAYERS:
+ raise KeyError(f'Unrecognized upsample type {layer_type}')
+ else:
+ upsample = UPSAMPLE_LAYERS.get(layer_type)
- # Switch registry to the target scope. If `upsample` cannot be found
- # in the registry, fallback to search `upsample` in the
- # mmengine.MODELS.
- with MODELS.switch_scope_and_registry(None) as registry:
- upsample = registry.get(layer_type)
- if upsample is None:
- raise KeyError(f'Cannot find {upsample} in registry under scope '
- f'name {registry.scope}')
if upsample is nn.Upsample:
cfg_['mode'] = layer_type
layer = upsample(*args, **kwargs, **cfg_)
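A usage sketch for ``build_upsample_layer`` with the registry wiring shown above (assumes an mmcv 1.x install where ``mmcv.cnn`` exports it):

```python
from mmcv.cnn import build_upsample_layer

# 'nearest'/'bilinear' resolve to nn.Upsample; cfg['mode'] is set from type.
up = build_upsample_layer(dict(type='nearest', scale_factor=2))

# 'pixel_shuffle' resolves to PixelShufflePack.
ps = build_upsample_layer(
    dict(type='pixel_shuffle', in_channels=64, out_channels=64,
         scale_factor=2, upsample_kernel=3))
```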
diff --git a/mmcv/cnn/bricks/wrappers.py b/mmcv/cnn/bricks/wrappers.py
index 07eb04ee324c713291f834f5020c2943a48d9358..8aebf67bf52355a513f21756ee74fe510902d075 100644
--- a/mmcv/cnn/bricks/wrappers.py
+++ b/mmcv/cnn/bricks/wrappers.py
@@ -9,9 +9,10 @@ import math
import torch
import torch.nn as nn
-from mmengine.registry import MODELS
from torch.nn.modules.utils import _pair, _triple
+from .registry import CONV_LAYERS, UPSAMPLE_LAYERS
+
if torch.__version__ == 'parrots':
TORCH_VERSION = torch.__version__
else:
@@ -20,27 +21,27 @@ else:
TORCH_VERSION = tuple(int(x) for x in torch.__version__.split('.')[:2])
-def obsolete_torch_version(torch_version, version_threshold) -> bool:
+def obsolete_torch_version(torch_version, version_threshold):
return torch_version == 'parrots' or torch_version <= version_threshold
class NewEmptyTensorOp(torch.autograd.Function):
@staticmethod
- def forward(ctx, x: torch.Tensor, new_shape: tuple) -> torch.Tensor:
+ def forward(ctx, x, new_shape):
ctx.shape = x.shape
return x.new_empty(new_shape)
@staticmethod
- def backward(ctx, grad: torch.Tensor) -> tuple:
+ def backward(ctx, grad):
shape = ctx.shape
return NewEmptyTensorOp.apply(grad, shape), None
-@MODELS.register_module('Conv', force=True)
+@CONV_LAYERS.register_module('Conv', force=True)
class Conv2d(nn.Conv2d):
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
out_shape = [x.shape[0], self.out_channels]
for i, k, p, s, d in zip(x.shape[-2:], self.kernel_size,
@@ -58,10 +59,10 @@ class Conv2d(nn.Conv2d):
return super().forward(x)
-@MODELS.register_module('Conv3d', force=True)
+@CONV_LAYERS.register_module('Conv3d', force=True)
class Conv3d(nn.Conv3d):
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
out_shape = [x.shape[0], self.out_channels]
for i, k, p, s, d in zip(x.shape[-3:], self.kernel_size,
@@ -79,11 +80,12 @@ class Conv3d(nn.Conv3d):
return super().forward(x)
-@MODELS.register_module()
-@MODELS.register_module('deconv')
+@CONV_LAYERS.register_module()
+@CONV_LAYERS.register_module('deconv')
+@UPSAMPLE_LAYERS.register_module('deconv', force=True)
class ConvTranspose2d(nn.ConvTranspose2d):
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
out_shape = [x.shape[0], self.out_channels]
for i, k, p, s, d, op in zip(x.shape[-2:], self.kernel_size,
@@ -101,11 +103,12 @@ class ConvTranspose2d(nn.ConvTranspose2d):
return super().forward(x)
-@MODELS.register_module()
-@MODELS.register_module('deconv3d')
+@CONV_LAYERS.register_module()
+@CONV_LAYERS.register_module('deconv3d')
+@UPSAMPLE_LAYERS.register_module('deconv3d', force=True)
class ConvTranspose3d(nn.ConvTranspose3d):
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
out_shape = [x.shape[0], self.out_channels]
for i, k, p, s, d, op in zip(x.shape[-3:], self.kernel_size,
@@ -125,7 +128,7 @@ class ConvTranspose3d(nn.ConvTranspose3d):
class MaxPool2d(nn.MaxPool2d):
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
# PyTorch 1.9 does not support empty tensor inference yet
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 9)):
out_shape = list(x.shape[:2])
@@ -143,7 +146,7 @@ class MaxPool2d(nn.MaxPool2d):
class MaxPool3d(nn.MaxPool3d):
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
# PyTorch 1.9 does not support empty tensor inference yet
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 9)):
out_shape = list(x.shape[:2])
@@ -162,7 +165,7 @@ class MaxPool3d(nn.MaxPool3d):
class Linear(torch.nn.Linear):
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
# empty tensor forward of Linear layer is supported in Pytorch 1.6
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 5)):
out_shape = [x.shape[0], self.out_features]
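All the empty-tensor branches above compute the output shape analytically instead of running the op; a standalone check of that arithmetic (plain Python, values illustrative):

```python
def wrapper_out_hw(in_hw, kernel, padding, stride, dilation):
    # Mirrors the zip(...) loop in the Conv2d/Conv3d wrappers above.
    return [(i + 2 * p - (d * (k - 1) + 1)) // s + 1
            for i, k, p, s, d in zip(in_hw, kernel, padding, stride, dilation)]

# A padded 3x3 conv keeps spatial dims, so a zero-batch input would yield
# an empty tensor of shape [0, out_channels, 32, 32]:
print(wrapper_out_hw((32, 32), (3, 3), (1, 1), (1, 1), (1, 1)))  # [32, 32]
```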
diff --git a/mmcv/cnn/builder.py b/mmcv/cnn/builder.py
new file mode 100644
index 0000000000000000000000000000000000000000..7567316c566bd3aca6d8f65a84b00e9e890948a7
--- /dev/null
+++ b/mmcv/cnn/builder.py
@@ -0,0 +1,30 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+from ..runner import Sequential
+from ..utils import Registry, build_from_cfg
+
+
+def build_model_from_cfg(cfg, registry, default_args=None):
+ """Build a PyTorch model from config dict(s). Different from
+ ``build_from_cfg``, if cfg is a list, a ``nn.Sequential`` will be built.
+
+ Args:
+        cfg (dict, list[dict]): The config of modules; it is either a config
+            dict or a list of config dicts. If cfg is a list, the built
+            modules will be wrapped with ``nn.Sequential``.
+ registry (:obj:`Registry`): A registry the module belongs to.
+ default_args (dict, optional): Default arguments to build the module.
+ Defaults to None.
+
+ Returns:
+ nn.Module: A built nn module.
+ """
+ if isinstance(cfg, list):
+ modules = [
+ build_from_cfg(cfg_, registry, default_args) for cfg_ in cfg
+ ]
+ return Sequential(*modules)
+ else:
+ return build_from_cfg(cfg, registry, default_args)
+
+
+MODELS = Registry('model', build_func=build_model_from_cfg)
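A self-contained sketch of the list-to-``Sequential`` behaviour, using a toy registry instead of the ``MODELS`` registry defined above (``ToyConv`` is hypothetical):

```python
import torch.nn as nn
from mmcv.utils import Registry, build_from_cfg

TOY = Registry('toy')

@TOY.register_module()
class ToyConv(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return self.conv(x)

cfg = [dict(type='ToyConv', channels=8), dict(type='ToyConv', channels=8)]
# A list of configs yields one module per entry, wrapped in a Sequential,
# which is what build_model_from_cfg does above.
model = nn.Sequential(*[build_from_cfg(c, TOY) for c in cfg])
```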
diff --git a/mmcv/cnn/resnet.py b/mmcv/cnn/resnet.py
index 8fc6abf6ac60b982a8c7998e0545bc55f9ceee78..1cb3ac057ee2d52c46fc94685b5d4e698aad8d5f 100644
--- a/mmcv/cnn/resnet.py
+++ b/mmcv/cnn/resnet.py
@@ -1,18 +1,13 @@
# Copyright (c) OpenMMLab. All rights reserved.
import logging
-from typing import Optional, Sequence, Tuple, Union
import torch.nn as nn
import torch.utils.checkpoint as cp
-from mmengine.model import constant_init, kaiming_init
-from mmengine.runner import load_checkpoint
-from torch import Tensor
+from .utils import constant_init, kaiming_init
-def conv3x3(in_planes: int,
- out_planes: int,
- stride: int = 1,
- dilation: int = 1):
+
+def conv3x3(in_planes, out_planes, stride=1, dilation=1):
"""3x3 convolution with padding."""
return nn.Conv2d(
in_planes,
@@ -28,14 +23,14 @@ class BasicBlock(nn.Module):
expansion = 1
def __init__(self,
- inplanes: int,
- planes: int,
- stride: int = 1,
- dilation: int = 1,
- downsample: Optional[nn.Module] = None,
- style: str = 'pytorch',
- with_cp: bool = False):
- super().__init__()
+ inplanes,
+ planes,
+ stride=1,
+ dilation=1,
+ downsample=None,
+ style='pytorch',
+ with_cp=False):
+ super(BasicBlock, self).__init__()
assert style in ['pytorch', 'caffe']
self.conv1 = conv3x3(inplanes, planes, stride, dilation)
self.bn1 = nn.BatchNorm2d(planes)
@@ -47,7 +42,7 @@ class BasicBlock(nn.Module):
self.dilation = dilation
assert not with_cp
- def forward(self, x: Tensor) -> Tensor:
+ def forward(self, x):
residual = x
out = self.conv1(x)
@@ -70,19 +65,19 @@ class Bottleneck(nn.Module):
expansion = 4
def __init__(self,
- inplanes: int,
- planes: int,
- stride: int = 1,
- dilation: int = 1,
- downsample: Optional[nn.Module] = None,
- style: str = 'pytorch',
- with_cp: bool = False):
+ inplanes,
+ planes,
+ stride=1,
+ dilation=1,
+ downsample=None,
+ style='pytorch',
+ with_cp=False):
"""Bottleneck block.
If style is "pytorch", the stride-two layer is the 3x3 conv layer, if
it is "caffe", the stride-two layer is the first 1x1 conv layer.
"""
- super().__init__()
+ super(Bottleneck, self).__init__()
assert style in ['pytorch', 'caffe']
if style == 'pytorch':
conv1_stride = 1
@@ -112,7 +107,7 @@ class Bottleneck(nn.Module):
self.dilation = dilation
self.with_cp = with_cp
- def forward(self, x: Tensor) -> Tensor:
+ def forward(self, x):
def _inner_forward(x):
residual = x
@@ -145,14 +140,14 @@ class Bottleneck(nn.Module):
return out
-def make_res_layer(block: nn.Module,
- inplanes: int,
- planes: int,
- blocks: int,
- stride: int = 1,
- dilation: int = 1,
- style: str = 'pytorch',
- with_cp: bool = False) -> nn.Module:
+def make_res_layer(block,
+ inplanes,
+ planes,
+ blocks,
+ stride=1,
+ dilation=1,
+ style='pytorch',
+ with_cp=False):
downsample = None
if stride != 1 or inplanes != planes * block.expansion:
downsample = nn.Sequential(
@@ -213,22 +208,22 @@ class ResNet(nn.Module):
}
def __init__(self,
- depth: int,
- num_stages: int = 4,
- strides: Sequence[int] = (1, 2, 2, 2),
- dilations: Sequence[int] = (1, 1, 1, 1),
- out_indices: Sequence[int] = (0, 1, 2, 3),
- style: str = 'pytorch',
- frozen_stages: int = -1,
- bn_eval: bool = True,
- bn_frozen: bool = False,
- with_cp: bool = False):
- super().__init__()
+ depth,
+ num_stages=4,
+ strides=(1, 2, 2, 2),
+ dilations=(1, 1, 1, 1),
+ out_indices=(0, 1, 2, 3),
+ style='pytorch',
+ frozen_stages=-1,
+ bn_eval=True,
+ bn_frozen=False,
+ with_cp=False):
+ super(ResNet, self).__init__()
if depth not in self.arch_settings:
raise KeyError(f'invalid depth {depth} for resnet')
assert num_stages >= 1 and num_stages <= 4
block, stage_blocks = self.arch_settings[depth]
- stage_blocks = stage_blocks[:num_stages] # type: ignore
+ stage_blocks = stage_blocks[:num_stages]
assert len(strides) == len(dilations) == num_stages
assert max(out_indices) < num_stages
@@ -239,7 +234,7 @@ class ResNet(nn.Module):
self.bn_frozen = bn_frozen
self.with_cp = with_cp
- self.inplanes: int = 64
+ self.inplanes = 64
self.conv1 = nn.Conv2d(
3, 64, kernel_size=7, stride=2, padding=3, bias=False)
self.bn1 = nn.BatchNorm2d(64)
@@ -260,17 +255,17 @@ class ResNet(nn.Module):
dilation=dilation,
style=self.style,
with_cp=with_cp)
- self.inplanes = planes * block.expansion # type: ignore
+ self.inplanes = planes * block.expansion
layer_name = f'layer{i + 1}'
self.add_module(layer_name, res_layer)
self.res_layers.append(layer_name)
- self.feat_dim = block.expansion * 64 * 2**( # type: ignore
- len(stage_blocks) - 1)
+ self.feat_dim = block.expansion * 64 * 2**(len(stage_blocks) - 1)
- def init_weights(self, pretrained: Optional[str] = None) -> None:
+ def init_weights(self, pretrained=None):
if isinstance(pretrained, str):
logger = logging.getLogger()
+ from ..runner import load_checkpoint
load_checkpoint(self, pretrained, strict=False, logger=logger)
elif pretrained is None:
for m in self.modules():
@@ -281,7 +276,7 @@ class ResNet(nn.Module):
else:
raise TypeError('pretrained must be a str or None')
- def forward(self, x: Tensor) -> Union[Tensor, Tuple[Tensor]]:
+ def forward(self, x):
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
@@ -297,8 +292,8 @@ class ResNet(nn.Module):
else:
return tuple(outs)
- def train(self, mode: bool = True) -> None:
- super().train(mode)
+ def train(self, mode=True):
+ super(ResNet, self).train(mode)
if self.bn_eval:
for m in self.modules():
if isinstance(m, nn.BatchNorm2d):
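A usage sketch for the backbone above (assumes an mmcv 1.x install exposing ``mmcv.cnn.ResNet``; the input size is illustrative):

```python
import torch
from mmcv.cnn import ResNet

model = ResNet(depth=50, out_indices=(0, 1, 2, 3))
model.init_weights()  # random init when `pretrained` is None
feats = model(torch.randn(1, 3, 224, 224))
print([f.shape for f in feats])  # four stage outputs at strides 4/8/16/32
```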
diff --git a/mmcv/cnn/rfsearch/__init__.py b/mmcv/cnn/rfsearch/__init__.py
deleted file mode 100644
index 04d45725dc40a15c086f21fc5ce73373318c578e..0000000000000000000000000000000000000000
--- a/mmcv/cnn/rfsearch/__init__.py
+++ /dev/null
@@ -1,5 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-from .operator import BaseConvRFSearchOp, Conv2dRFSearchOp
-from .search import RFSearchHook
-
-__all__ = ['BaseConvRFSearchOp', 'Conv2dRFSearchOp', 'RFSearchHook']
diff --git a/mmcv/cnn/rfsearch/operator.py b/mmcv/cnn/rfsearch/operator.py
deleted file mode 100644
index 2fa45abb0a282954cd5e06503596141c9a314de4..0000000000000000000000000000000000000000
--- a/mmcv/cnn/rfsearch/operator.py
+++ /dev/null
@@ -1,169 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-import copy
-
-import numpy as np
-import torch
-import torch.nn as nn
-from mmengine.logging import print_log
-from mmengine.model import BaseModule
-from torch import Tensor
-
-from .utils import expand_rates, get_single_padding
-
-
-class BaseConvRFSearchOp(BaseModule):
- """Based class of ConvRFSearchOp.
-
- Args:
-        op_layer (nn.Module): PyTorch module, e.g., Conv2d
- global_config (dict): config dict.
- """
-
- def __init__(self, op_layer: nn.Module, global_config: dict):
- super().__init__()
- self.op_layer = op_layer
- self.global_config = global_config
-
- def normlize(self, weights: nn.Parameter) -> nn.Parameter:
- """Normalize weights.
-
- Args:
- weights (nn.Parameter): Weights to be normalized.
-
- Returns:
-            nn.Parameter: Normalized weights.
- """
- abs_weights = torch.abs(weights)
- normalized_weights = abs_weights / torch.sum(abs_weights)
- return normalized_weights
-
-
-class Conv2dRFSearchOp(BaseConvRFSearchOp):
- """Enable Conv2d with receptive field searching ability.
-
- Args:
-        op_layer (nn.Module): PyTorch module, e.g., Conv2d
- global_config (dict): config dict. Defaults to None.
- By default this must include:
-
- - "init_alphas": The value for initializing weights of each branch.
- - "num_branches": The controller of the size of
- search space (the number of branches).
- - "exp_rate": The controller of the sparsity of search space.
- - "mmin": The minimum dilation rate.
- - "mmax": The maximum dilation rate.
-
- Extra keys may exist, but are used by RFSearchHook, e.g., "step",
- "max_step", "search_interval", and "skip_layer".
- verbose (bool): Determines whether to print rf-next
- related logging messages.
- Defaults to True.
- """
-
- def __init__(self,
- op_layer: nn.Module,
- global_config: dict,
- verbose: bool = True):
- super().__init__(op_layer, global_config)
- assert global_config is not None, 'global_config is None'
- self.num_branches = global_config['num_branches']
- assert self.num_branches in [2, 3]
- self.verbose = verbose
- init_dilation = op_layer.dilation
- self.dilation_rates = expand_rates(init_dilation, global_config)
- if self.op_layer.kernel_size[
- 0] == 1 or self.op_layer.kernel_size[0] % 2 == 0:
- self.dilation_rates = [(op_layer.dilation[0], r[1])
- for r in self.dilation_rates]
- if self.op_layer.kernel_size[
- 1] == 1 or self.op_layer.kernel_size[1] % 2 == 0:
- self.dilation_rates = [(r[0], op_layer.dilation[1])
- for r in self.dilation_rates]
-
- self.branch_weights = nn.Parameter(torch.Tensor(self.num_branches))
- if self.verbose:
- print_log(f'Expand as {self.dilation_rates}', 'current')
- nn.init.constant_(self.branch_weights, global_config['init_alphas'])
-
- def forward(self, input: Tensor) -> Tensor:
- norm_w = self.normlize(self.branch_weights[:len(self.dilation_rates)])
- if len(self.dilation_rates) == 1:
- outputs = [
- nn.functional.conv2d(
- input,
- weight=self.op_layer.weight,
- bias=self.op_layer.bias,
- stride=self.op_layer.stride,
- padding=self.get_padding(self.dilation_rates[0]),
- dilation=self.dilation_rates[0],
- groups=self.op_layer.groups,
- )
- ]
- else:
- outputs = [
- nn.functional.conv2d(
- input,
- weight=self.op_layer.weight,
- bias=self.op_layer.bias,
- stride=self.op_layer.stride,
- padding=self.get_padding(r),
- dilation=r,
- groups=self.op_layer.groups,
- ) * norm_w[i] for i, r in enumerate(self.dilation_rates)
- ]
- output = outputs[0]
- for i in range(1, len(self.dilation_rates)):
- output += outputs[i]
- return output
-
- def estimate_rates(self) -> None:
- """Estimate new dilation rate based on trained branch_weights."""
- norm_w = self.normlize(self.branch_weights[:len(self.dilation_rates)])
- if self.verbose:
- print_log(
- 'Estimate dilation {} with weight {}.'.format(
- self.dilation_rates,
- norm_w.detach().cpu().numpy().tolist()), 'current')
-
- sum0, sum1, w_sum = 0, 0, 0
- for i in range(len(self.dilation_rates)):
- sum0 += norm_w[i].item() * self.dilation_rates[i][0]
- sum1 += norm_w[i].item() * self.dilation_rates[i][1]
- w_sum += norm_w[i].item()
- estimated = [
- np.clip(
- int(round(sum0 / w_sum)), self.global_config['mmin'],
- self.global_config['mmax']).item(),
- np.clip(
- int(round(sum1 / w_sum)), self.global_config['mmin'],
- self.global_config['mmax']).item()
- ]
- self.op_layer.dilation = tuple(estimated)
- self.op_layer.padding = self.get_padding(self.op_layer.dilation)
- self.dilation_rates = [tuple(estimated)]
- if self.verbose:
- print_log(f'Estimate as {tuple(estimated)}', 'current')
-
- def expand_rates(self) -> None:
- """Expand dilation rate."""
- dilation = self.op_layer.dilation
- dilation_rates = expand_rates(dilation, self.global_config)
- if self.op_layer.kernel_size[
- 0] == 1 or self.op_layer.kernel_size[0] % 2 == 0:
- dilation_rates = [(dilation[0], r[1]) for r in dilation_rates]
- if self.op_layer.kernel_size[
- 1] == 1 or self.op_layer.kernel_size[1] % 2 == 0:
- dilation_rates = [(r[0], dilation[1]) for r in dilation_rates]
-
- self.dilation_rates = copy.deepcopy(dilation_rates)
- if self.verbose:
- print_log(f'Expand as {self.dilation_rates}', 'current')
- nn.init.constant_(self.branch_weights,
- self.global_config['init_alphas'])
-
- def get_padding(self, dilation) -> tuple:
- padding = (get_single_padding(self.op_layer.kernel_size[0],
- self.op_layer.stride[0], dilation[0]),
- get_single_padding(self.op_layer.kernel_size[1],
- self.op_layer.stride[1], dilation[1]))
- return padding
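The branch mixing in ``forward`` above is a convex combination with normalized absolute weights; a tiny numeric sketch:

```python
import torch

w = torch.tensor([0.2, -0.5, 0.3])               # raw branch weights
norm_w = torch.abs(w) / torch.sum(torch.abs(w))  # as in `normlize` above
print(norm_w)  # tensor([0.2000, 0.5000, 0.3000]); outputs of the dilated
               # convolutions are summed with these coefficients
```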
diff --git a/mmcv/cnn/rfsearch/search.py b/mmcv/cnn/rfsearch/search.py
deleted file mode 100644
index f4add4b23afd1585fd434931e27dc92187ba1f6f..0000000000000000000000000000000000000000
--- a/mmcv/cnn/rfsearch/search.py
+++ /dev/null
@@ -1,239 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-import os
-from typing import Dict, Optional
-
-import mmengine
-import torch # noqa
-import torch.nn as nn
-from mmengine.hooks import Hook
-from mmengine.logging import print_log
-from mmengine.registry import HOOKS
-
-from .operator import BaseConvRFSearchOp, Conv2dRFSearchOp # noqa
-from .utils import get_single_padding, write_to_json
-
-
-@HOOKS.register_module()
-class RFSearchHook(Hook):
- """Rcecptive field search via dilation rates.
-
-    Please refer to `RF-Next: Efficient Receptive Field
-    Search for Convolutional Neural Networks
-    <https://arxiv.org/abs/2206.06637>`_ for more details.
-
-
- Args:
- mode (str, optional): It can be set to the following types:
- 'search', 'fixed_single_branch', or 'fixed_multi_branch'.
- Defaults to 'search'.
- config (Dict, optional): config dict of search.
- By default this config contains "search",
- and config["search"] must include:
-
- - "step": recording the current searching step.
- - "max_step": The maximum number of searching steps
- to update the structures.
- - "search_interval": The interval (epoch/iteration)
- between two updates.
- - "exp_rate": The controller of the sparsity of search space.
- - "init_alphas": The value for initializing weights of each branch.
- - "mmin": The minimum dilation rate.
- - "mmax": The maximum dilation rate.
- - "num_branches": The controller of the size of
- search space (the number of branches).
- - "skip_layer": The modules in skip_layer will be ignored
- during the receptive field search.
- rfstructure_file (str, optional): Path to load searched receptive
- fields of the model. Defaults to None.
- by_epoch (bool, optional): Determine to perform step by epoch or
- by iteration. If set to True, it will step by epoch. Otherwise, by
- iteration. Defaults to True.
- verbose (bool): Determines whether to print rf-next related logging
- messages. Defaults to True.
- """
-
- def __init__(self,
- mode: str = 'search',
- config: Dict = {},
- rfstructure_file: Optional[str] = None,
- by_epoch: bool = True,
- verbose: bool = True):
- assert mode in ['search', 'fixed_single_branch', 'fixed_multi_branch']
- assert config is not None
- self.config = config
- self.config['structure'] = {}
- self.verbose = verbose
- if rfstructure_file is not None:
- rfstructure = mmengine.load(rfstructure_file)['structure']
- self.config['structure'] = rfstructure
- self.mode = mode
- self.num_branches = self.config['search']['num_branches']
- self.by_epoch = by_epoch
-
- def init_model(self, model: nn.Module):
- """init model with search ability.
-
- Args:
- model (nn.Module): pytorch model
-
- Raises:
-            NotImplementedError: only three modes are supported:
- search/fixed_single_branch/fixed_multi_branch
- """
- if self.verbose:
- print_log('RFSearch init begin.', 'current')
- if self.mode == 'search':
- if self.config['structure']:
- self.set_model(model, search_op='Conv2d')
- self.wrap_model(model, search_op='Conv2d')
- elif self.mode == 'fixed_single_branch':
- self.set_model(model, search_op='Conv2d')
- elif self.mode == 'fixed_multi_branch':
- self.set_model(model, search_op='Conv2d')
- self.wrap_model(model, search_op='Conv2d')
- else:
- raise NotImplementedError
- if self.verbose:
- print_log('RFSearch init end.', 'current')
-
- def after_train_epoch(self, runner):
- """Performs a dilation searching step after one training epoch."""
- if self.by_epoch and self.mode == 'search':
- self.step(runner.model, runner.work_dir)
-
- def after_train_iter(self, runner, batch_idx, data_batch, outputs):
- """Performs a dilation searching step after one training iteration."""
- if not self.by_epoch and self.mode == 'search':
- self.step(runner.model, runner.work_dir)
-
- def step(self, model: nn.Module, work_dir: str) -> None:
- """Performs a dilation searching step.
-
- Args:
- model (nn.Module): pytorch model
- work_dir (str): Directory to save the searching results.
- """
- self.config['search']['step'] += 1
- if (self.config['search']['step']
- ) % self.config['search']['search_interval'] == 0 and (self.config[
- 'search']['step']) < self.config['search']['max_step']:
- self.estimate_and_expand(model)
- for name, module in model.named_modules():
- if isinstance(module, BaseConvRFSearchOp):
- self.config['structure'][name] = module.op_layer.dilation
-
- write_to_json(
- self.config,
- os.path.join(
- work_dir,
- 'local_search_config_step%d.json' %
- self.config['search']['step'],
- ),
- )
-
- def estimate_and_expand(self, model: nn.Module) -> None:
- """estimate and search for RFConvOp.
-
- Args:
- model (nn.Module): pytorch model
- """
- for module in model.modules():
- if isinstance(module, BaseConvRFSearchOp):
- module.estimate_rates()
- module.expand_rates()
-
- def wrap_model(self,
- model: nn.Module,
- search_op: str = 'Conv2d',
- prefix: str = '') -> None:
- """wrap model to support searchable conv op.
-
- Args:
- model (nn.Module): pytorch model
- search_op (str): The module that uses RF search.
- Defaults to 'Conv2d'.
- init_rates (int, optional): Set to other initial dilation rates.
- Defaults to None.
- prefix (str): Prefix for function recursion. Defaults to ''.
- """
- op = 'torch.nn.' + search_op
- for name, module in model.named_children():
- if prefix == '':
- fullname = 'module.' + name
- else:
- fullname = prefix + '.' + name
- if self.config['search']['skip_layer'] is not None:
- if any(layer in fullname
- for layer in self.config['search']['skip_layer']):
- continue
- if isinstance(module, eval(op)):
- if 1 < module.kernel_size[0] and \
- 0 != module.kernel_size[0] % 2 or \
- 1 < module.kernel_size[1] and \
- 0 != module.kernel_size[1] % 2:
- moduleWrap = eval(search_op + 'RFSearchOp')(
- module, self.config['search'], self.verbose)
- moduleWrap = moduleWrap.to(module.weight.device)
- if self.verbose:
- print_log(
- 'Wrap model %s to %s.' %
- (str(module), str(moduleWrap)), 'current')
- setattr(model, name, moduleWrap)
- elif not isinstance(module, BaseConvRFSearchOp):
- self.wrap_model(module, search_op, fullname)
-
- def set_model(self,
- model: nn.Module,
- search_op: str = 'Conv2d',
- init_rates: Optional[int] = None,
- prefix: str = '') -> None:
- """set model based on config.
-
- Args:
- model (nn.Module): pytorch model
- config (Dict): config file
- search_op (str): The module that uses RF search.
- Defaults to 'Conv2d'.
- init_rates (int, optional): Set to other initial dilation rates.
- Defaults to None.
- prefix (str): Prefix for function recursion. Defaults to ''.
- """
- op = 'torch.nn.' + search_op
- for name, module in model.named_children():
- if prefix == '':
- fullname = 'module.' + name
- else:
- fullname = prefix + '.' + name
- if self.config['search']['skip_layer'] is not None:
- if any(layer in fullname
- for layer in self.config['search']['skip_layer']):
- continue
- if isinstance(module, eval(op)):
- if 1 < module.kernel_size[0] and \
- 0 != module.kernel_size[0] % 2 or \
- 1 < module.kernel_size[1] and \
- 0 != module.kernel_size[1] % 2:
- if isinstance(self.config['structure'][fullname], int):
- self.config['structure'][fullname] = [
- self.config['structure'][fullname],
- self.config['structure'][fullname]
- ]
- module.dilation = (
- self.config['structure'][fullname][0],
- self.config['structure'][fullname][1],
- )
- module.padding = (
- get_single_padding(
- module.kernel_size[0], module.stride[0],
- self.config['structure'][fullname][0]),
- get_single_padding(
- module.kernel_size[1], module.stride[1],
- self.config['structure'][fullname][1]))
- setattr(model, name, module)
- if self.verbose:
- print_log(
- 'Set module %s dilation as: [%d %d]' %
- (fullname, module.dilation[0], module.dilation[1]),
- 'current')
- elif not isinstance(module, BaseConvRFSearchOp):
- self.set_model(module, search_op, init_rates, fullname)
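For reference, an illustrative hook configuration using the keys documented above (values are examples only, not recommendations from the source):

```python
rfsearch_hook_cfg = dict(
    type='RFSearchHook',
    mode='search',
    rfstructure_file=None,
    by_epoch=True,
    verbose=True,
    config=dict(
        search=dict(
            step=0, max_step=11, search_interval=1, exp_rate=0.5,
            init_alphas=0.01, mmin=1, mmax=24, num_branches=2,
            skip_layer=['stem', 'layer1'])))
```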
diff --git a/mmcv/cnn/rfsearch/utils.py b/mmcv/cnn/rfsearch/utils.py
deleted file mode 100644
index 4c8168e343d6bded761390f1be9a38b58727badf..0000000000000000000000000000000000000000
--- a/mmcv/cnn/rfsearch/utils.py
+++ /dev/null
@@ -1,68 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-import mmengine
-import numpy as np
-
-
-def write_to_json(config: dict, filename: str):
- """save config to json file.
-
- Args:
- config (dict): Config to be saved.
- filename (str): Path to save config.
- """
-
- with open(filename, 'w', encoding='utf-8') as f:
- mmengine.dump(config, f, file_format='json')
-
-
-def expand_rates(dilation: tuple, config: dict) -> list:
- """expand dilation rate according to config.
-
- Args:
- dilation (int): _description_
- config (dict): config dict
-
- Returns:
- list: list of expanded dilation rates
- """
- exp_rate = config['exp_rate']
-
- large_rates = []
- small_rates = []
- for _ in range(config['num_branches'] // 2):
- large_rates.append(
- tuple([
- np.clip(
- int(round((1 + exp_rate) * dilation[0])), config['mmin'],
- config['mmax']).item(),
- np.clip(
- int(round((1 + exp_rate) * dilation[1])), config['mmin'],
- config['mmax']).item()
- ]))
- small_rates.append(
- tuple([
- np.clip(
- int(round((1 - exp_rate) * dilation[0])), config['mmin'],
- config['mmax']).item(),
- np.clip(
- int(round((1 - exp_rate) * dilation[1])), config['mmin'],
- config['mmax']).item()
- ]))
-
- small_rates.reverse()
-
- if config['num_branches'] % 2 == 0:
- rate_list = small_rates + large_rates
- else:
- rate_list = small_rates + [dilation] + large_rates
-
- unique_rate_list = list(set(rate_list))
- unique_rate_list.sort(key=rate_list.index)
- return unique_rate_list
-
-
-def get_single_padding(kernel_size: int,
- stride: int = 1,
- dilation: int = 1) -> int:
- padding = ((stride - 1) + dilation * (kernel_size - 1)) // 2
- return padding
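A quick standalone check of the padding rule above; for odd kernels at stride 1 it preserves spatial resolution:

```python
def get_single_padding(kernel_size, stride=1, dilation=1):
    return ((stride - 1) + dilation * (kernel_size - 1)) // 2

print(get_single_padding(3))        # -> 1: ordinary 3x3 conv, same size
print(get_single_padding(3, 1, 2))  # -> 2: dilated 3x3 conv, same size
```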
diff --git a/mmcv/cnn/utils/__init__.py b/mmcv/cnn/utils/__init__.py
index cdec9399f6544a90de6ac4238a60b05b8888c907..a263e31c1e3977712827ca229bbc04910b4e928e 100644
--- a/mmcv/cnn/utils/__init__.py
+++ b/mmcv/cnn/utils/__init__.py
@@ -1,5 +1,19 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .flops_counter import get_model_complexity_info
from .fuse_conv_bn import fuse_conv_bn
+from .sync_bn import revert_sync_batchnorm
+from .weight_init import (INITIALIZERS, Caffe2XavierInit, ConstantInit,
+ KaimingInit, NormalInit, PretrainedInit,
+ TruncNormalInit, UniformInit, XavierInit,
+ bias_init_with_prob, caffe2_xavier_init,
+ constant_init, initialize, kaiming_init, normal_init,
+ trunc_normal_init, uniform_init, xavier_init)
-__all__ = ['get_model_complexity_info', 'fuse_conv_bn']
+__all__ = [
+ 'get_model_complexity_info', 'bias_init_with_prob', 'caffe2_xavier_init',
+ 'constant_init', 'kaiming_init', 'normal_init', 'trunc_normal_init',
+ 'uniform_init', 'xavier_init', 'fuse_conv_bn', 'initialize',
+ 'INITIALIZERS', 'ConstantInit', 'XavierInit', 'NormalInit',
+ 'TruncNormalInit', 'UniformInit', 'KaimingInit', 'PretrainedInit',
+ 'Caffe2XavierInit', 'revert_sync_batchnorm'
+]
diff --git a/mmcv/cnn/utils/flops_counter.py b/mmcv/cnn/utils/flops_counter.py
index b09edbcdff063c5a8276bafdd8d69b440539108e..dceeb398bfc8a562d406136028381326ef55e0dc 100644
--- a/mmcv/cnn/utils/flops_counter.py
+++ b/mmcv/cnn/utils/flops_counter.py
@@ -24,25 +24,22 @@
# SOFTWARE.
import sys
-import warnings
from functools import partial
-from typing import Any, Callable, Dict, Optional, TextIO, Tuple
import numpy as np
import torch
import torch.nn as nn
-from mmcv.cnn.bricks import (Conv2d, Conv3d, ConvTranspose2d, Linear,
- MaxPool2d, MaxPool3d)
+import mmcv
-def get_model_complexity_info(model: nn.Module,
- input_shape: tuple,
- print_per_layer_stat: bool = True,
- as_strings: bool = True,
- input_constructor: Optional[Callable] = None,
- flush: bool = False,
- ost: TextIO = sys.stdout) -> tuple:
+def get_model_complexity_info(model,
+ input_shape,
+ print_per_layer_stat=True,
+ as_strings=True,
+ input_constructor=None,
+ flush=False,
+ ost=sys.stdout):
"""Get complexity information of a model.
This method can calculate FLOPs and parameter counts of a model with
@@ -51,16 +48,16 @@ def get_model_complexity_info(model: nn.Module,
Supported layers are listed as below:
- Convolutions: ``nn.Conv1d``, ``nn.Conv2d``, ``nn.Conv3d``.
- - Activations: ``nn.ReLU``, ``nn.PReLU``, ``nn.ELU``,
- ``nn.LeakyReLU``, ``nn.ReLU6``.
+ - Activations: ``nn.ReLU``, ``nn.PReLU``, ``nn.ELU``, ``nn.LeakyReLU``,
+ ``nn.ReLU6``.
- Poolings: ``nn.MaxPool1d``, ``nn.MaxPool2d``, ``nn.MaxPool3d``,
- ``nn.AvgPool1d``, ``nn.AvgPool2d``, ``nn.AvgPool3d``,
- ``nn.AdaptiveMaxPool1d``, ``nn.AdaptiveMaxPool2d``,
- ``nn.AdaptiveMaxPool3d``, ``nn.AdaptiveAvgPool1d``,
- ``nn.AdaptiveAvgPool2d``, ``nn.AdaptiveAvgPool3d``.
+ ``nn.AvgPool1d``, ``nn.AvgPool2d``, ``nn.AvgPool3d``,
+ ``nn.AdaptiveMaxPool1d``, ``nn.AdaptiveMaxPool2d``,
+ ``nn.AdaptiveMaxPool3d``, ``nn.AdaptiveAvgPool1d``,
+ ``nn.AdaptiveAvgPool2d``, ``nn.AdaptiveAvgPool3d``.
- BatchNorms: ``nn.BatchNorm1d``, ``nn.BatchNorm2d``,
- ``nn.BatchNorm3d``, ``nn.GroupNorm``, ``nn.InstanceNorm1d``,
- ``InstanceNorm2d``, ``InstanceNorm3d``, ``nn.LayerNorm``.
+ ``nn.BatchNorm3d``, ``nn.GroupNorm``, ``nn.InstanceNorm1d``,
+ ``InstanceNorm2d``, ``InstanceNorm3d``, ``nn.LayerNorm``.
- Linear: ``nn.Linear``.
- Deconvolution: ``nn.ConvTranspose2d``.
- Upsample: ``nn.Upsample``.
@@ -81,8 +78,8 @@ def get_model_complexity_info(model: nn.Module,
Returns:
tuple[float | str]: If ``as_strings`` is set to True, it will return
- FLOPs and parameter counts in a string format. otherwise, it will
- return those in a float number format.
+        FLOPs and parameter counts in a string format. Otherwise, it will
+        return those in a float number format.
"""
assert type(input_shape) is tuple
assert len(input_shape) >= 1
@@ -118,9 +115,7 @@ def get_model_complexity_info(model: nn.Module,
return flops_count, params_count
-def flops_to_string(flops: float,
- units: Optional[str] = 'GFLOPs',
- precision: int = 2) -> str:
+def flops_to_string(flops, units='GFLOPs', precision=2):
"""Convert FLOPs number into a string.
Note that Here we take a multiply-add counts as one FLOP.
@@ -163,9 +158,7 @@ def flops_to_string(flops: float,
return str(flops) + ' FLOPs'
-def params_to_string(num_params: float,
- units: Optional[str] = None,
- precision: int = 2) -> str:
+def params_to_string(num_params, units=None, precision=2):
"""Convert parameter number into a string.
Args:
@@ -202,13 +195,13 @@ def params_to_string(num_params: float,
return str(num_params)
-def print_model_with_flops(model: nn.Module,
- total_flops: float,
- total_params: float,
- units: Optional[str] = 'GFLOPs',
- precision: int = 3,
- ost: TextIO = sys.stdout,
- flush: bool = False) -> None:
+def print_model_with_flops(model,
+ total_flops,
+ total_params,
+ units='GFLOPs',
+ precision=3,
+ ost=sys.stdout,
+ flush=False):
"""Print a model with FLOPs for each layer.
Args:
@@ -283,10 +276,10 @@ def print_model_with_flops(model: nn.Module,
return ', '.join([
params_to_string(
accumulated_num_params, units='M', precision=precision),
- f'{accumulated_num_params / total_params:.3%} Params',
+ '{:.3%} Params'.format(accumulated_num_params / total_params),
flops_to_string(
accumulated_flops_cost, units=units, precision=precision),
- f'{accumulated_flops_cost / total_flops:.3%} FLOPs',
+ '{:.3%} FLOPs'.format(accumulated_flops_cost / total_flops),
self.original_extra_repr()
])
@@ -311,7 +304,7 @@ def print_model_with_flops(model: nn.Module,
model.apply(del_extra_repr)
-def get_model_parameters_number(model: nn.Module) -> float:
+def get_model_parameters_number(model):
"""Calculate parameter number of a model.
Args:
@@ -324,16 +317,16 @@ def get_model_parameters_number(model: nn.Module) -> float:
return num_params
-def add_flops_counting_methods(net_main_module: nn.Module) -> nn.Module:
+def add_flops_counting_methods(net_main_module):
# adding additional methods to the existing module object,
# this is done this way so that each function has access to self object
- net_main_module.start_flops_count = start_flops_count.__get__( # type: ignore # noqa E501
+ net_main_module.start_flops_count = start_flops_count.__get__(
net_main_module)
- net_main_module.stop_flops_count = stop_flops_count.__get__( # type: ignore # noqa E501
+ net_main_module.stop_flops_count = stop_flops_count.__get__(
net_main_module)
- net_main_module.reset_flops_count = reset_flops_count.__get__( # type: ignore # noqa E501
+ net_main_module.reset_flops_count = reset_flops_count.__get__(
net_main_module)
- net_main_module.compute_average_flops_cost = compute_average_flops_cost.__get__( # type: ignore # noqa E501
+ net_main_module.compute_average_flops_cost = compute_average_flops_cost.__get__( # noqa: E501
net_main_module)
net_main_module.reset_flops_count()
@@ -341,7 +334,7 @@ def add_flops_counting_methods(net_main_module: nn.Module) -> nn.Module:
return net_main_module
-def compute_average_flops_cost(self) -> Tuple[float, float]:
+def compute_average_flops_cost(self):
"""Compute average FLOPs cost.
A method to compute average FLOPs cost, which will be available after
@@ -359,7 +352,7 @@ def compute_average_flops_cost(self) -> Tuple[float, float]:
return flops_sum / batches_count, params_sum
-def start_flops_count(self) -> None:
+def start_flops_count(self):
"""Activate the computation of mean flops consumption per image.
A method to activate the computation of mean flops consumption per image.
@@ -368,7 +361,7 @@ def start_flops_count(self) -> None:
"""
add_batch_counter_hook_function(self)
- def add_flops_counter_hook_function(module: nn.Module) -> None:
+ def add_flops_counter_hook_function(module):
if is_supported_instance(module):
if hasattr(module, '__flops_handle__'):
return
@@ -382,7 +375,7 @@ def start_flops_count(self) -> None:
self.apply(partial(add_flops_counter_hook_function))
-def stop_flops_count(self) -> None:
+def stop_flops_count(self):
"""Stop computing the mean flops consumption per image.
A method to stop computing the mean flops consumption per image, which will
@@ -393,7 +386,7 @@ def stop_flops_count(self) -> None:
self.apply(remove_flops_counter_hook_function)
-def reset_flops_count(self) -> None:
+def reset_flops_count(self):
"""Reset statistics computed so far.
    A method to reset computed statistics, which will be available after
@@ -404,13 +397,11 @@ def reset_flops_count(self) -> None:
# ---- Internal functions
-def empty_flops_counter_hook(module: nn.Module, input: tuple,
- output: Any) -> None:
+def empty_flops_counter_hook(module, input, output):
module.__flops__ += 0
-def upsample_flops_counter_hook(module: nn.Module, input: tuple,
- output: torch.Tensor) -> None:
+def upsample_flops_counter_hook(module, input, output):
output_size = output[0]
batch_size = output_size.shape[0]
output_elements_count = batch_size
@@ -419,38 +410,39 @@ def upsample_flops_counter_hook(module: nn.Module, input: tuple,
module.__flops__ += int(output_elements_count)
-def relu_flops_counter_hook(module: nn.Module, input: tuple,
- output: torch.Tensor) -> None:
+def relu_flops_counter_hook(module, input, output):
active_elements_count = output.numel()
module.__flops__ += int(active_elements_count)
-def linear_flops_counter_hook(module: nn.Module, input: tuple,
- output: torch.Tensor) -> None:
+def linear_flops_counter_hook(module, input, output):
+ input = input[0]
output_last_dim = output.shape[
-1] # pytorch checks dimensions, so here we don't care much
- module.__flops__ += int(np.prod(input[0].shape) * output_last_dim)
+ module.__flops__ += int(np.prod(input.shape) * output_last_dim)
-def pool_flops_counter_hook(module: nn.Module, input: tuple,
- output: torch.Tensor) -> None:
- module.__flops__ += int(np.prod(input[0].shape))
+def pool_flops_counter_hook(module, input, output):
+ input = input[0]
+ module.__flops__ += int(np.prod(input.shape))
-def norm_flops_counter_hook(module: nn.Module, input: tuple,
- output: torch.Tensor) -> None:
- batch_flops = np.prod(input[0].shape)
+def norm_flops_counter_hook(module, input, output):
+ input = input[0]
+
+ batch_flops = np.prod(input.shape)
if (getattr(module, 'affine', False)
or getattr(module, 'elementwise_affine', False)):
batch_flops *= 2
module.__flops__ += int(batch_flops)
-def deconv_flops_counter_hook(conv_module: nn.Module, input: tuple,
- output: torch.Tensor) -> None:
+def deconv_flops_counter_hook(conv_module, input, output):
# Can have multiple inputs, getting the first one
- batch_size = input[0].shape[0]
- input_height, input_width = input[0].shape[2:]
+ input = input[0]
+
+ batch_size = input.shape[0]
+ input_height, input_width = input.shape[2:]
kernel_height, kernel_width = conv_module.kernel_size
in_channels = conv_module.in_channels
@@ -466,16 +458,17 @@ def deconv_flops_counter_hook(conv_module: nn.Module, input: tuple,
bias_flops = 0
if conv_module.bias is not None:
output_height, output_width = output.shape[2:]
- bias_flops = out_channels * batch_size * output_height * output_width
+        bias_flops = out_channels * batch_size * output_height * output_width
overall_flops = overall_conv_flops + bias_flops
conv_module.__flops__ += int(overall_flops)
-def conv_flops_counter_hook(conv_module: nn.Module, input: tuple,
- output: torch.Tensor) -> None:
+def conv_flops_counter_hook(conv_module, input, output):
# Can have multiple inputs, getting the first one
- batch_size = input[0].shape[0]
+ input = input[0]
+
+ batch_size = input.shape[0]
output_dims = list(output.shape[2:])
kernel_dims = list(conv_module.kernel_size)
@@ -502,23 +495,25 @@ def conv_flops_counter_hook(conv_module: nn.Module, input: tuple,
conv_module.__flops__ += int(overall_flops)
-def batch_counter_hook(module: nn.Module, input: tuple, output: Any) -> None:
+def batch_counter_hook(module, input, output):
batch_size = 1
if len(input) > 0:
# Can have multiple inputs, getting the first one
- batch_size = len(input[0])
+ input = input[0]
+ batch_size = len(input)
else:
- warnings.warn('No positional inputs found for a module, '
- 'assuming batch size is 1.')
+        print('Warning! No positional inputs found for a module, '
+              'assuming batch size is 1.')
module.__batch_counter__ += batch_size
-def add_batch_counter_variables_or_reset(module: nn.Module) -> None:
+def add_batch_counter_variables_or_reset(module):
module.__batch_counter__ = 0
-def add_batch_counter_hook_function(module: nn.Module) -> None:
+def add_batch_counter_hook_function(module):
if hasattr(module, '__batch_counter_handle__'):
return
@@ -526,43 +521,43 @@ def add_batch_counter_hook_function(module: nn.Module) -> None:
module.__batch_counter_handle__ = handle
-def remove_batch_counter_hook_function(module: nn.Module) -> None:
+def remove_batch_counter_hook_function(module):
if hasattr(module, '__batch_counter_handle__'):
module.__batch_counter_handle__.remove()
del module.__batch_counter_handle__
-def add_flops_counter_variable_or_reset(module: nn.Module) -> None:
+def add_flops_counter_variable_or_reset(module):
if is_supported_instance(module):
if hasattr(module, '__flops__') or hasattr(module, '__params__'):
- warnings.warn('variables __flops__ or __params__ are already '
- 'defined for the module' + type(module).__name__ +
- ' ptflops can affect your code!')
+            print('Warning: variables __flops__ or __params__ are already '
+                  'defined for the module ' + type(module).__name__ +
+                  '. ptflops can affect your code!')
module.__flops__ = 0
module.__params__ = get_model_parameters_number(module)
-def is_supported_instance(module: nn.Module) -> bool:
+def is_supported_instance(module):
if type(module) in get_modules_mapping():
return True
return False
-def remove_flops_counter_hook_function(module: nn.Module) -> None:
+def remove_flops_counter_hook_function(module):
if is_supported_instance(module):
if hasattr(module, '__flops_handle__'):
module.__flops_handle__.remove()
del module.__flops_handle__
-def get_modules_mapping() -> Dict:
+def get_modules_mapping():
return {
# convolutions
nn.Conv1d: conv_flops_counter_hook,
nn.Conv2d: conv_flops_counter_hook,
- Conv2d: conv_flops_counter_hook,
+ mmcv.cnn.bricks.Conv2d: conv_flops_counter_hook,
nn.Conv3d: conv_flops_counter_hook,
- Conv3d: conv_flops_counter_hook,
+ mmcv.cnn.bricks.Conv3d: conv_flops_counter_hook,
# activations
nn.ReLU: relu_flops_counter_hook,
nn.PReLU: relu_flops_counter_hook,
@@ -574,9 +569,9 @@ def get_modules_mapping() -> Dict:
nn.AvgPool1d: pool_flops_counter_hook,
nn.AvgPool2d: pool_flops_counter_hook,
nn.MaxPool2d: pool_flops_counter_hook,
- MaxPool2d: pool_flops_counter_hook,
+ mmcv.cnn.bricks.MaxPool2d: pool_flops_counter_hook,
nn.MaxPool3d: pool_flops_counter_hook,
- MaxPool3d: pool_flops_counter_hook,
+ mmcv.cnn.bricks.MaxPool3d: pool_flops_counter_hook,
nn.AvgPool3d: pool_flops_counter_hook,
nn.AdaptiveMaxPool1d: pool_flops_counter_hook,
nn.AdaptiveAvgPool1d: pool_flops_counter_hook,
@@ -595,10 +590,10 @@ def get_modules_mapping() -> Dict:
nn.LayerNorm: norm_flops_counter_hook,
# FC
nn.Linear: linear_flops_counter_hook,
- Linear: linear_flops_counter_hook,
+ mmcv.cnn.bricks.Linear: linear_flops_counter_hook,
# Upscale
nn.Upsample: upsample_flops_counter_hook,
# Deconvolution
nn.ConvTranspose2d: deconv_flops_counter_hook,
- ConvTranspose2d: deconv_flops_counter_hook,
+ mmcv.cnn.bricks.ConvTranspose2d: deconv_flops_counter_hook,
}
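A usage sketch for the counter (assumes mmcv 1.x exporting ``get_model_complexity_info`` from ``mmcv.cnn``; the toy model is illustrative):

```python
import torch.nn as nn
from mmcv.cnn import get_model_complexity_info

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
flops, params = get_model_complexity_info(
    model, (3, 224, 224), as_strings=True, print_per_layer_stat=False)
print(flops, params)  # e.g. '0.01 GFLOPs' and '224'
```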
diff --git a/mmcv/cnn/utils/fuse_conv_bn.py b/mmcv/cnn/utils/fuse_conv_bn.py
index 6ccaab3bf1eb3ce615bad910d6dc45a467bb1fe4..cb7076f80bf37f7931185bf0293ffcc1ce19c8ef 100644
--- a/mmcv/cnn/utils/fuse_conv_bn.py
+++ b/mmcv/cnn/utils/fuse_conv_bn.py
@@ -3,7 +3,7 @@ import torch
import torch.nn as nn
-def _fuse_conv_bn(conv: nn.Module, bn: nn.Module) -> nn.Module:
+def _fuse_conv_bn(conv, bn):
"""Fuse conv and bn into one module.
Args:
@@ -24,7 +24,7 @@ def _fuse_conv_bn(conv: nn.Module, bn: nn.Module) -> nn.Module:
return conv
-def fuse_conv_bn(module: nn.Module) -> nn.Module:
+def fuse_conv_bn(module):
"""Recursively fuse conv and bn in a module.
    During inference, the functionality of batch norm layers is turned off
diff --git a/mmcv/cnn/utils/sync_bn.py b/mmcv/cnn/utils/sync_bn.py
new file mode 100644
index 0000000000000000000000000000000000000000..8a79ff4a4f8dc70cf931fa319287682d4189e1a2
--- /dev/null
+++ b/mmcv/cnn/utils/sync_bn.py
@@ -0,0 +1,59 @@
+import torch
+
+import mmcv
+
+
+class _BatchNormXd(torch.nn.modules.batchnorm._BatchNorm):
+ """A general BatchNorm layer without input dimension check.
+
+ Reproduced from @kapily's work:
+ (https://github.com/pytorch/pytorch/issues/41081#issuecomment-783961547)
+    The only difference between BatchNorm1d, BatchNorm2d, BatchNorm3d, etc.
+    is `_check_input_dim`, which is designed for tensor sanity checks.
+ The check has been bypassed in this class for the convenience of converting
+ SyncBatchNorm.
+ """
+
+ def _check_input_dim(self, input):
+ return
+
+
+def revert_sync_batchnorm(module):
+ """Helper function to convert all `SyncBatchNorm` (SyncBN) and
+ `mmcv.ops.sync_bn.SyncBatchNorm`(MMSyncBN) layers in the model to
+ `BatchNormXd` layers.
+
+ Adapted from @kapily's work:
+ (https://github.com/pytorch/pytorch/issues/41081#issuecomment-783961547)
+
+ Args:
+ module (nn.Module): The module containing `SyncBatchNorm` layers.
+
+ Returns:
+ module_output: The converted module with `BatchNormXd` layers.
+ """
+ module_output = module
+ module_checklist = [torch.nn.modules.batchnorm.SyncBatchNorm]
+ if hasattr(mmcv, 'ops'):
+ module_checklist.append(mmcv.ops.SyncBatchNorm)
+ if isinstance(module, tuple(module_checklist)):
+ module_output = _BatchNormXd(module.num_features, module.eps,
+ module.momentum, module.affine,
+ module.track_running_stats)
+ if module.affine:
+ # no_grad() may not be needed here but
+ # just to be consistent with `convert_sync_batchnorm()`
+ with torch.no_grad():
+ module_output.weight = module.weight
+ module_output.bias = module.bias
+ module_output.running_mean = module.running_mean
+ module_output.running_var = module.running_var
+ module_output.num_batches_tracked = module.num_batches_tracked
+ module_output.training = module.training
+ # qconfig exists in quantized models
+ if hasattr(module, 'qconfig'):
+ module_output.qconfig = module.qconfig
+ for name, child in module.named_children():
+ module_output.add_module(name, revert_sync_batchnorm(child))
+ del module
+ return module_output
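A usage sketch: SyncBN layers are swapped for the shape-agnostic ``_BatchNormXd`` so the model runs without a distributed process group (the toy model is illustrative):

```python
import torch
import torch.nn as nn
from mmcv.cnn import revert_sync_batchnorm

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.SyncBatchNorm(8))
model = revert_sync_batchnorm(model)    # SyncBatchNorm -> _BatchNormXd
out = model(torch.randn(2, 3, 16, 16))  # runs on CPU, no process group
```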
diff --git a/mmcv/cnn/utils/weight_init.py b/mmcv/cnn/utils/weight_init.py
new file mode 100644
index 0000000000000000000000000000000000000000..e1ac999e2470048ef05b3243b0d8b6959586785f
--- /dev/null
+++ b/mmcv/cnn/utils/weight_init.py
@@ -0,0 +1,684 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import copy
+import math
+import warnings
+
+import numpy as np
+import torch
+import torch.nn as nn
+from torch import Tensor
+
+from mmcv.utils import Registry, build_from_cfg, get_logger, print_log
+
+INITIALIZERS = Registry('initializer')
+
+
+def update_init_info(module, init_info):
+ """Update the `_params_init_info` in the module if the value of parameters
+ are changed.
+
+ Args:
+ module (:obj:`nn.Module`): The module of PyTorch with a user-defined
+ attribute `_params_init_info` which records the initialization
+ information.
+ init_info (str): The string that describes the initialization.
+ """
+ assert hasattr(
+ module,
+ '_params_init_info'), f'Can not find `_params_init_info` in {module}'
+ for name, param in module.named_parameters():
+
+ assert param in module._params_init_info, (
+ f'Found a new :obj:`Parameter` '
+ f'named `{name}` during the execution of the '
+ f'`init_weights` of '
+ f'`{module.__class__.__name__}`. '
+ f'Please do not add or '
+ f'replace parameters during the execution of '
+ f'the `init_weights`.')
+
+ # Check whether the parameter has been changed during the
+ # execution of `init_weights` of the module
+ mean_value = param.data.mean()
+ if module._params_init_info[param]['tmp_mean_value'] != mean_value:
+ module._params_init_info[param]['init_info'] = init_info
+ module._params_init_info[param]['tmp_mean_value'] = mean_value
+
+
+def constant_init(module, val, bias=0):
+ if hasattr(module, 'weight') and module.weight is not None:
+ nn.init.constant_(module.weight, val)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def xavier_init(module, gain=1, bias=0, distribution='normal'):
+ assert distribution in ['uniform', 'normal']
+ if hasattr(module, 'weight') and module.weight is not None:
+ if distribution == 'uniform':
+ nn.init.xavier_uniform_(module.weight, gain=gain)
+ else:
+ nn.init.xavier_normal_(module.weight, gain=gain)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def normal_init(module, mean=0, std=1, bias=0):
+ if hasattr(module, 'weight') and module.weight is not None:
+ nn.init.normal_(module.weight, mean, std)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def trunc_normal_init(module: nn.Module,
+ mean: float = 0,
+ std: float = 1,
+ a: float = -2,
+ b: float = 2,
+ bias: float = 0) -> None:
+ if hasattr(module, 'weight') and module.weight is not None:
+ trunc_normal_(module.weight, mean, std, a, b) # type: ignore
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias) # type: ignore
+
+
+def uniform_init(module, a=0, b=1, bias=0):
+ if hasattr(module, 'weight') and module.weight is not None:
+ nn.init.uniform_(module.weight, a, b)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def kaiming_init(module,
+ a=0,
+ mode='fan_out',
+ nonlinearity='relu',
+ bias=0,
+ distribution='normal'):
+ assert distribution in ['uniform', 'normal']
+ if hasattr(module, 'weight') and module.weight is not None:
+ if distribution == 'uniform':
+ nn.init.kaiming_uniform_(
+ module.weight, a=a, mode=mode, nonlinearity=nonlinearity)
+ else:
+ nn.init.kaiming_normal_(
+ module.weight, a=a, mode=mode, nonlinearity=nonlinearity)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def caffe2_xavier_init(module, bias=0):
+ # `XavierFill` in Caffe2 corresponds to `kaiming_uniform_` in PyTorch
+ # Acknowledgment to FAIR's internal code
+ kaiming_init(
+ module,
+ a=1,
+ mode='fan_in',
+ nonlinearity='leaky_relu',
+ bias=bias,
+ distribution='uniform')
+
+
+def bias_init_with_prob(prior_prob):
+ """initialize conv/fc bias value according to a given probability value."""
+ bias_init = float(-np.log((1 - prior_prob) / prior_prob))
+ return bias_init
+
+
+def _get_bases_name(m):
+ return [b.__name__ for b in m.__class__.__bases__]
+
+
+class BaseInit(object):
+
+ def __init__(self, *, bias=0, bias_prob=None, layer=None):
+ self.wholemodule = False
+ if not isinstance(bias, (int, float)):
+ raise TypeError(f'bias must be a number, but got a {type(bias)}')
+
+ if bias_prob is not None:
+ if not isinstance(bias_prob, float):
+ raise TypeError(f'bias_prob type must be float, \
+ but got {type(bias_prob)}')
+
+ if layer is not None:
+ if not isinstance(layer, (str, list)):
+ raise TypeError(f'layer must be a str or a list of str, \
+ but got a {type(layer)}')
+ else:
+ layer = []
+
+ if bias_prob is not None:
+ self.bias = bias_init_with_prob(bias_prob)
+ else:
+ self.bias = bias
+ self.layer = [layer] if isinstance(layer, str) else layer
+
+ def _get_init_info(self):
+ info = f'{self.__class__.__name__}, bias={self.bias}'
+ return info
+
+
+@INITIALIZERS.register_module(name='Constant')
+class ConstantInit(BaseInit):
+ """Initialize module parameters with constant values.
+
+ Args:
+ val (int | float): the value to fill the weights in the module with
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+ """
+
+ def __init__(self, val, **kwargs):
+ super().__init__(**kwargs)
+ self.val = val
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ constant_init(m, self.val, self.bias)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ constant_init(m, self.val, self.bias)
+
+ module.apply(init)
+ if hasattr(module, '_params_init_info'):
+ update_init_info(module, init_info=self._get_init_info())
+
+ def _get_init_info(self):
+ info = f'{self.__class__.__name__}: val={self.val}, bias={self.bias}'
+ return info
+
+
+@INITIALIZERS.register_module(name='Xavier')
+class XavierInit(BaseInit):
+ r"""Initialize module parameters with values according to the method
+ described in `Understanding the difficulty of training deep feedforward
+ neural networks - Glorot, X. & Bengio, Y. (2010).
+ <http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf>`_
+
+ Args:
+ gain (int | float): an optional scaling factor. Defaults to 1.
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ distribution (str): distribution either be ``'normal'``
+ or ``'uniform'``. Defaults to ``'normal'``.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+ """
+
+ def __init__(self, gain=1, distribution='normal', **kwargs):
+ super().__init__(**kwargs)
+ self.gain = gain
+ self.distribution = distribution
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ xavier_init(m, self.gain, self.bias, self.distribution)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ xavier_init(m, self.gain, self.bias, self.distribution)
+
+ module.apply(init)
+ if hasattr(module, '_params_init_info'):
+ update_init_info(module, init_info=self._get_init_info())
+
+ def _get_init_info(self):
+ info = f'{self.__class__.__name__}: gain={self.gain}, ' \
+ f'distribution={self.distribution}, bias={self.bias}'
+ return info
+
+
+@INITIALIZERS.register_module(name='Normal')
+class NormalInit(BaseInit):
+ r"""Initialize module parameters with the values drawn from the normal
+ distribution :math:`\mathcal{N}(\text{mean}, \text{std}^2)`.
+
+ Args:
+ mean (int | float): the mean of the normal distribution. Defaults to 0.
+ std (int | float): the standard deviation of the normal distribution.
+ Defaults to 1.
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+
+ """
+
+ def __init__(self, mean=0, std=1, **kwargs):
+ super().__init__(**kwargs)
+ self.mean = mean
+ self.std = std
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ normal_init(m, self.mean, self.std, self.bias)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ normal_init(m, self.mean, self.std, self.bias)
+
+ module.apply(init)
+ if hasattr(module, '_params_init_info'):
+ update_init_info(module, init_info=self._get_init_info())
+
+ def _get_init_info(self):
+ info = f'{self.__class__.__name__}: mean={self.mean},' \
+ f' std={self.std}, bias={self.bias}'
+ return info
+
+
+@INITIALIZERS.register_module(name='TruncNormal')
+class TruncNormalInit(BaseInit):
+ r"""Initialize module parameters with the values drawn from the normal
+ distribution :math:`\mathcal{N}(\text{mean}, \text{std}^2)` with values
+ outside :math:`[a, b]` redrawn until they are within the bounds.
+
+ Args:
+ mean (float): the mean of the normal distribution. Defaults to 0.
+ std (float): the standard deviation of the normal distribution.
+ Defaults to 1.
+ a (float): the minimum cutoff value. Defaults to -2.
+ b (float): the maximum cutoff value. Defaults to 2.
+ bias (float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+
+ """
+
+ def __init__(self,
+ mean: float = 0,
+ std: float = 1,
+ a: float = -2,
+ b: float = 2,
+ **kwargs) -> None:
+ super().__init__(**kwargs)
+ self.mean = mean
+ self.std = std
+ self.a = a
+ self.b = b
+
+ def __call__(self, module: nn.Module) -> None:
+
+ def init(m):
+ if self.wholemodule:
+ trunc_normal_init(m, self.mean, self.std, self.a, self.b,
+ self.bias)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ trunc_normal_init(m, self.mean, self.std, self.a, self.b,
+ self.bias)
+
+ module.apply(init)
+ if hasattr(module, '_params_init_info'):
+ update_init_info(module, init_info=self._get_init_info())
+
+ def _get_init_info(self):
+ info = f'{self.__class__.__name__}: a={self.a}, b={self.b},' \
+ f' mean={self.mean}, std={self.std}, bias={self.bias}'
+ return info
+
+
+@INITIALIZERS.register_module(name='Uniform')
+class UniformInit(BaseInit):
+ r"""Initialize module parameters with values drawn from the uniform
+ distribution :math:`\mathcal{U}(a, b)`.
+
+ Args:
+ a (int | float): the lower bound of the uniform distribution.
+ Defaults to 0.
+ b (int | float): the upper bound of the uniform distribution.
+ Defaults to 1.
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+ """
+
+ def __init__(self, a=0, b=1, **kwargs):
+ super().__init__(**kwargs)
+ self.a = a
+ self.b = b
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ uniform_init(m, self.a, self.b, self.bias)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ uniform_init(m, self.a, self.b, self.bias)
+
+ module.apply(init)
+ if hasattr(module, '_params_init_info'):
+ update_init_info(module, init_info=self._get_init_info())
+
+ def _get_init_info(self):
+ info = f'{self.__class__.__name__}: a={self.a},' \
+ f' b={self.b}, bias={self.bias}'
+ return info
+
+
+@INITIALIZERS.register_module(name='Kaiming')
+class KaimingInit(BaseInit):
+ r"""Initialize module parameters with the values according to the method
+ described in `Delving deep into rectifiers: Surpassing human-level
+ performance on ImageNet classification - He, K. et al. (2015).
+ <https://arxiv.org/abs/1502.01852>`_
+
+ Args:
+ a (int | float): the negative slope of the rectifier used after this
+ layer (only used with ``'leaky_relu'``). Defaults to 0.
+ mode (str): either ``'fan_in'`` or ``'fan_out'``. Choosing
+ ``'fan_in'`` preserves the magnitude of the variance of the weights
+ in the forward pass. Choosing ``'fan_out'`` preserves the
+ magnitudes in the backwards pass. Defaults to ``'fan_out'``.
+ nonlinearity (str): the non-linear function (`nn.functional` name),
+ recommended to use only with ``'relu'`` or ``'leaky_relu'``.
+ Defaults to ``'relu'``.
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ distribution (str): distribution either be ``'normal'`` or
+ ``'uniform'``. Defaults to ``'normal'``.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+ """
+
+ def __init__(self,
+ a=0,
+ mode='fan_out',
+ nonlinearity='relu',
+ distribution='normal',
+ **kwargs):
+ super().__init__(**kwargs)
+ self.a = a
+ self.mode = mode
+ self.nonlinearity = nonlinearity
+ self.distribution = distribution
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ kaiming_init(m, self.a, self.mode, self.nonlinearity,
+ self.bias, self.distribution)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ kaiming_init(m, self.a, self.mode, self.nonlinearity,
+ self.bias, self.distribution)
+
+ module.apply(init)
+ if hasattr(module, '_params_init_info'):
+ update_init_info(module, init_info=self._get_init_info())
+
+ def _get_init_info(self):
+ info = f'{self.__class__.__name__}: a={self.a}, mode={self.mode}, ' \
+ f'nonlinearity={self.nonlinearity}, ' \
+ f'distribution={self.distribution}, bias={self.bias}'
+ return info
+
+
+@INITIALIZERS.register_module(name='Caffe2Xavier')
+class Caffe2XavierInit(KaimingInit):
+ # `XavierFill` in Caffe2 corresponds to `kaiming_uniform_` in PyTorch
+ # Acknowledgment to FAIR's internal code
+ def __init__(self, **kwargs):
+ super().__init__(
+ a=1,
+ mode='fan_in',
+ nonlinearity='leaky_relu',
+ distribution='uniform',
+ **kwargs)
+
+ def __call__(self, module):
+ super().__call__(module)
+
+
+@INITIALIZERS.register_module(name='Pretrained')
+class PretrainedInit(object):
+ """Initialize module by loading a pretrained model.
+
+ Args:
+ checkpoint (str): the checkpoint file of the pretrained model to be
+ loaded.
+ prefix (str, optional): the prefix of a sub-module in the pretrained
+ model. It is for loading a part of the pretrained model to
+ initialize. For example, if we would like to only load the
+ backbone of a detector model, we can set ``prefix='backbone.'``.
+ Defaults to None.
+ map_location (str): map tensors into proper locations.
+ """
+
+ def __init__(self, checkpoint, prefix=None, map_location=None):
+ self.checkpoint = checkpoint
+ self.prefix = prefix
+ self.map_location = map_location
+
+ def __call__(self, module):
+ from mmcv.runner import (_load_checkpoint_with_prefix, load_checkpoint,
+ load_state_dict)
+ logger = get_logger('mmcv')
+ if self.prefix is None:
+ print_log(f'load model from: {self.checkpoint}', logger=logger)
+ load_checkpoint(
+ module,
+ self.checkpoint,
+ map_location=self.map_location,
+ strict=False,
+ logger=logger)
+ else:
+ print_log(
+ f'load {self.prefix} in model from: {self.checkpoint}',
+ logger=logger)
+ state_dict = _load_checkpoint_with_prefix(
+ self.prefix, self.checkpoint, map_location=self.map_location)
+ load_state_dict(module, state_dict, strict=False, logger=logger)
+
+ if hasattr(module, '_params_init_info'):
+ update_init_info(module, init_info=self._get_init_info())
+
+ def _get_init_info(self):
+ info = f'{self.__class__.__name__}: load from {self.checkpoint}'
+ return info
+
+
+def _initialize(module, cfg, wholemodule=False):
+ func = build_from_cfg(cfg, INITIALIZERS)
+ # The `wholemodule` flag is for the override mode: an override has no
+ # `layer` key, so the initializer assigns init values to the whole
+ # module with the name given in the override.
+ func.wholemodule = wholemodule
+ func(module)
+
+
+def _initialize_override(module, override, cfg):
+ if not isinstance(override, (dict, list)):
+ raise TypeError(f'override must be a dict or a list of dict, \
+ but got {type(override)}')
+
+ override = [override] if isinstance(override, dict) else override
+
+ for override_ in override:
+
+ cp_override = copy.deepcopy(override_)
+ name = cp_override.pop('name', None)
+ if name is None:
+ raise ValueError('`override` must contain the key "name", '
+ f'but got {cp_override}')
+ # if override only has name key, it means use args in init_cfg
+ if not cp_override:
+ cp_override.update(cfg)
+ # if override has name key and other args except type key, it will
+ # raise error
+ elif 'type' not in cp_override.keys():
+ raise ValueError(
+ f'`override` needs a "type" key, but got {cp_override}')
+
+ if hasattr(module, name):
+ _initialize(getattr(module, name), cp_override, wholemodule=True)
+ else:
+ raise RuntimeError(f'module has no attribute {name}, '
+ f'but init_cfg is {cp_override}.')
+
+
+def initialize(module, init_cfg):
+ """Initialize a module.
+
+ Args:
+ module (``torch.nn.Module``): the module to be initialized.
+ init_cfg (dict | list[dict]): initialization configuration dict to
+ define the initializer. OpenMMLab has implemented 8 initializers,
+ including ``Constant``, ``Xavier``, ``Normal``, ``TruncNormal``,
+ ``Uniform``, ``Kaiming``, ``Caffe2Xavier``, and ``Pretrained``.
+ Example:
+ >>> module = nn.Linear(2, 3, bias=True)
+ >>> init_cfg = dict(type='Constant', layer='Linear', val=1, bias=2)
+ >>> initialize(module, init_cfg)
+
+ >>> module = nn.Sequential(nn.Conv1d(3, 1, 3), nn.Linear(1,2))
+ >>> # define key ``'layer'`` for initializing layer with different
+ >>> # configuration
+ >>> init_cfg = [dict(type='Constant', layer='Conv1d', val=1),
+ dict(type='Constant', layer='Linear', val=2)]
+ >>> initialize(module, init_cfg)
+
+ >>> # define key``'override'`` to initialize some specific part in
+ >>> # module
+ >>> class FooNet(nn.Module):
+ >>> def __init__(self):
+ >>> super().__init__()
+ >>> self.feat = nn.Conv2d(3, 16, 3)
+ >>> self.reg = nn.Conv2d(16, 10, 3)
+ >>> self.cls = nn.Conv2d(16, 5, 3)
+ >>> model = FooNet()
+ >>> init_cfg = dict(type='Constant', val=1, bias=2, layer='Conv2d',
+ >>> override=dict(type='Constant', name='reg', val=3, bias=4))
+ >>> initialize(model, init_cfg)
+
+ >>> model = ResNet(depth=50)
+ >>> # Initialize weights with the pretrained model.
+ >>> init_cfg = dict(type='Pretrained',
+ checkpoint='torchvision://resnet50')
+ >>> initialize(model, init_cfg)
+
+ >>> # Initialize weights of a sub-module with the specific part of
+ >>> # a pretrained model by using "prefix".
+ >>> url = 'http://download.openmmlab.com/mmdetection/v2.0/retinanet/'\
+ >>> 'retinanet_r50_fpn_1x_coco/'\
+ >>> 'retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth'
+ >>> init_cfg = dict(type='Pretrained',
+ checkpoint=url, prefix='backbone.')
+ """
+ if not isinstance(init_cfg, (dict, list)):
+ raise TypeError(f'init_cfg must be a dict or a list of dict, \
+ but got {type(init_cfg)}')
+
+ if isinstance(init_cfg, dict):
+ init_cfg = [init_cfg]
+
+ for cfg in init_cfg:
+ # should deeply copy the original config because cfg may be used by
+ # other modules, e.g., one init_cfg shared by multiple bottleneck
+ # blocks, the expected cfg will be changed after pop and will change
+ # the initialization behavior of other modules
+ cp_cfg = copy.deepcopy(cfg)
+ override = cp_cfg.pop('override', None)
+ _initialize(module, cp_cfg)
+
+ if override is not None:
+ cp_cfg.pop('layer', None)
+ _initialize_override(module, override, cp_cfg)
+ else:
+ # All attributes in module have same initialization.
+ pass
+
+
+def _no_grad_trunc_normal_(tensor: Tensor, mean: float, std: float, a: float,
+ b: float) -> Tensor:
+ # Method based on
+ # https://people.sc.fsu.edu/~jburkardt/presentations/truncated_normal.pdf
+ # Modified from
+ # https://github.com/pytorch/pytorch/blob/master/torch/nn/init.py
+ def norm_cdf(x):
+ # Computes standard normal cumulative distribution function
+ return (1. + math.erf(x / math.sqrt(2.))) / 2.
+
+ if (mean < a - 2 * std) or (mean > b + 2 * std):
+ warnings.warn(
+ 'mean is more than 2 std from [a, b] in nn.init.trunc_normal_. '
+ 'The distribution of values may be incorrect.',
+ stacklevel=2)
+
+ with torch.no_grad():
+ # Values are generated by using a truncated uniform distribution and
+ # then using the inverse CDF for the normal distribution.
+ # Get upper and lower cdf values
+ lower = norm_cdf((a - mean) / std)
+ upper = norm_cdf((b - mean) / std)
+
+ # Uniformly fill tensor with values from [lower, upper], then translate
+ # to [2lower-1, 2upper-1].
+ tensor.uniform_(2 * lower - 1, 2 * upper - 1)
+
+ # Use inverse cdf transform for normal distribution to get truncated
+ # standard normal
+ tensor.erfinv_()
+
+ # Transform to proper mean, std
+ tensor.mul_(std * math.sqrt(2.))
+ tensor.add_(mean)
+
+ # Clamp to ensure it's in the proper range
+ tensor.clamp_(min=a, max=b)
+ return tensor
+
+
+def trunc_normal_(tensor: Tensor,
+ mean: float = 0.,
+ std: float = 1.,
+ a: float = -2.,
+ b: float = 2.) -> Tensor:
+ r"""Fills the input Tensor with values drawn from a truncated
+ normal distribution. The values are effectively drawn from the
+ normal distribution :math:`\mathcal{N}(\text{mean}, \text{std}^2)`
+ with values outside :math:`[a, b]` redrawn until they are within
+ the bounds. The method used for generating the random values works
+ best when :math:`a \leq \text{mean} \leq b`.
+
+ Modified from
+ https://github.com/pytorch/pytorch/blob/master/torch/nn/init.py
+
+ Args:
+ tensor (``torch.Tensor``): an n-dimensional `torch.Tensor`.
+ mean (float): the mean of the normal distribution.
+ std (float): the standard deviation of the normal distribution.
+ a (float): the minimum cutoff value.
+ b (float): the maximum cutoff value.
+ """
+ return _no_grad_trunc_normal_(tensor, mean, std, a, b)
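To illustrate the construction above: because samples come from the inverse CDF of a truncated uniform, every value is guaranteed to land in `[a, b]`. A small self-check using the function defined in this file:

```python
import torch

from mmcv.cnn.utils.weight_init import trunc_normal_

w = torch.empty(256, 256)
trunc_normal_(w, mean=0., std=0.02, a=-0.04, b=0.04)

# Hard guarantee from the inverse-CDF construction (plus the final clamp).
assert w.min() >= -0.04 and w.max() <= 0.04
# The empirical std is close to 0.02 since the cutoffs sit at +/-2 std.
print(float(w.mean()), float(w.std()))
```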
diff --git a/mmcv/cnn/vgg.py b/mmcv/cnn/vgg.py
index a7f3116062c3943bb85fd7540b23a31918622a24..8778b649561a45a9652b1a15a26c2d171e58f3e1 100644
--- a/mmcv/cnn/vgg.py
+++ b/mmcv/cnn/vgg.py
@@ -1,14 +1,12 @@
# Copyright (c) OpenMMLab. All rights reserved.
import logging
-from typing import List, Optional, Sequence, Tuple, Union
import torch.nn as nn
-from mmengine.model import constant_init, kaiming_init, normal_init
-from mmengine.runner import load_checkpoint
-from torch import Tensor
+from .utils import constant_init, kaiming_init, normal_init
-def conv3x3(in_planes: int, out_planes: int, dilation: int = 1) -> nn.Module:
+
+def conv3x3(in_planes, out_planes, dilation=1):
"""3x3 convolution with padding."""
return nn.Conv2d(
in_planes,
@@ -18,12 +16,12 @@ def conv3x3(in_planes: int, out_planes: int, dilation: int = 1) -> nn.Module:
dilation=dilation)
-def make_vgg_layer(inplanes: int,
- planes: int,
- num_blocks: int,
- dilation: int = 1,
- with_bn: bool = False,
- ceil_mode: bool = False) -> List[nn.Module]:
+def make_vgg_layer(inplanes,
+ planes,
+ num_blocks,
+ dilation=1,
+ with_bn=False,
+ ceil_mode=False):
layers = []
for _ in range(num_blocks):
layers.append(conv3x3(inplanes, planes, dilation))
@@ -61,18 +59,18 @@ class VGG(nn.Module):
}
def __init__(self,
- depth: int,
- with_bn: bool = False,
- num_classes: int = -1,
- num_stages: int = 5,
- dilations: Sequence[int] = (1, 1, 1, 1, 1),
- out_indices: Sequence[int] = (0, 1, 2, 3, 4),
- frozen_stages: int = -1,
- bn_eval: bool = True,
- bn_frozen: bool = False,
- ceil_mode: bool = False,
- with_last_pool: bool = True):
- super().__init__()
+ depth,
+ with_bn=False,
+ num_classes=-1,
+ num_stages=5,
+ dilations=(1, 1, 1, 1, 1),
+ out_indices=(0, 1, 2, 3, 4),
+ frozen_stages=-1,
+ bn_eval=True,
+ bn_frozen=False,
+ ceil_mode=False,
+ with_last_pool=True):
+ super(VGG, self).__init__()
if depth not in self.arch_settings:
raise KeyError(f'invalid depth {depth} for vgg')
assert num_stages >= 1 and num_stages <= 5
@@ -124,9 +122,10 @@ class VGG(nn.Module):
nn.Linear(4096, num_classes),
)
- def init_weights(self, pretrained: Optional[str] = None) -> None:
+ def init_weights(self, pretrained=None):
if isinstance(pretrained, str):
logger = logging.getLogger()
+ from ..runner import load_checkpoint
load_checkpoint(self, pretrained, strict=False, logger=logger)
elif pretrained is None:
for m in self.modules():
@@ -139,7 +138,7 @@ class VGG(nn.Module):
else:
raise TypeError('pretrained must be a str or None')
- def forward(self, x: Tensor) -> Union[Tensor, Tuple[Tensor, ...]]:
+ def forward(self, x):
outs = []
vgg_layers = getattr(self, self.module_name)
for i in range(len(self.stage_blocks)):
@@ -157,8 +156,8 @@ class VGG(nn.Module):
else:
return tuple(outs)
- def train(self, mode: bool = True) -> None:
- super().train(mode)
+ def train(self, mode=True):
+ super(VGG, self).train(mode)
if self.bn_eval:
for m in self.modules():
if isinstance(m, nn.BatchNorm2d):
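For reference, the refactor above removes the type annotations but keeps the backbone behavior: `num_classes=-1` disables the classifier head and `out_indices` selects which stage outputs are returned. A minimal sketch, assuming `VGG` stays exported from `mmcv.cnn`:

```python
import torch

from mmcv.cnn import VGG  # assumed to remain a public export

# VGG-16 backbone returning the feature maps of the last two stages.
model = VGG(depth=16, num_classes=-1, out_indices=(3, 4))
model.init_weights()  # pretrained=None -> kaiming/constant/normal init

feats = model(torch.randn(1, 3, 224, 224))
print([f.shape for f in feats])
```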
diff --git a/mmcv/engine/__init__.py b/mmcv/engine/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..3193b7f664e19ce2458d81c836597fa22e4bb082
--- /dev/null
+++ b/mmcv/engine/__init__.py
@@ -0,0 +1,8 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+from .test import (collect_results_cpu, collect_results_gpu, multi_gpu_test,
+ single_gpu_test)
+
+__all__ = [
+ 'collect_results_cpu', 'collect_results_gpu', 'multi_gpu_test',
+ 'single_gpu_test'
+]
diff --git a/mmcv/engine/test.py b/mmcv/engine/test.py
new file mode 100644
index 0000000000000000000000000000000000000000..f236b1cda2f39517bda3e4cce9badc19c6cbf190
--- /dev/null
+++ b/mmcv/engine/test.py
@@ -0,0 +1,202 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import os.path as osp
+import pickle
+import shutil
+import tempfile
+import time
+
+import torch
+import torch.distributed as dist
+
+import mmcv
+from mmcv.runner import get_dist_info
+
+
+def single_gpu_test(model, data_loader):
+ """Test model with a single gpu.
+
+ This method tests the model with a single gpu and displays a progress bar.
+
+ Args:
+ model (nn.Module): Model to be tested.
+ data_loader (DataLoader): PyTorch data loader.
+
+ Returns:
+ list: The prediction results.
+ """
+ model.eval()
+ results = []
+ dataset = data_loader.dataset
+ prog_bar = mmcv.ProgressBar(len(dataset))
+ for data in data_loader:
+ with torch.no_grad():
+ result = model(return_loss=False, **data)
+ results.extend(result)
+
+ # Assume result has the same length as batch_size
+ # refer to https://github.com/open-mmlab/mmcv/issues/985
+ batch_size = len(result)
+ for _ in range(batch_size):
+ prog_bar.update()
+ return results
+
+
+def multi_gpu_test(model, data_loader, tmpdir=None, gpu_collect=False):
+ """Test model with multiple gpus.
+
+ This method tests the model with multiple gpus and collects the results
+ under two different modes: gpu and cpu modes. By setting
+ ``gpu_collect=True``, it encodes results to gpu tensors and uses gpu
+ communication for results collection. In cpu mode it saves the results on
+ different gpus to ``tmpdir`` and collects them by the rank 0 worker.
+
+ Args:
+ model (nn.Module): Model to be tested.
+ data_loader (DataLoader): PyTorch data loader.
+ tmpdir (str): Path of directory to save the temporary results from
+ different gpus under cpu mode.
+ gpu_collect (bool): Option to use either gpu or cpu to collect results.
+
+ Returns:
+ list: The prediction results.
+ """
+ model.eval()
+ results = []
+ dataset = data_loader.dataset
+ rank, world_size = get_dist_info()
+ if rank == 0:
+ prog_bar = mmcv.ProgressBar(len(dataset))
+ time.sleep(2) # This line can prevent the deadlock problem in some cases.
+ for i, data in enumerate(data_loader):
+ with torch.no_grad():
+ result = model(return_loss=False, **data)
+ results.extend(result)
+
+ if rank == 0:
+ batch_size = len(result)
+ batch_size_all = batch_size * world_size
+ if batch_size_all + prog_bar.completed > len(dataset):
+ batch_size_all = len(dataset) - prog_bar.completed
+ for _ in range(batch_size_all):
+ prog_bar.update()
+
+ # collect results from all ranks
+ if gpu_collect:
+ results = collect_results_gpu(results, len(dataset))
+ else:
+ results = collect_results_cpu(results, len(dataset), tmpdir)
+ return results
+
+
+def collect_results_cpu(result_part, size, tmpdir=None):
+ """Collect results under cpu mode.
+
+ In cpu mode, this function will save the results on different gpus to
+ ``tmpdir`` and collect them by the rank 0 worker.
+
+ Args:
+ result_part (list): Result list containing result parts
+ to be collected.
+ size (int): Size of the results, commonly equal to length of
+ the results.
+ tmpdir (str | None): temporary directory to store the collected
+ results. If set to None, a random temporary directory will be
+ created for it.
+
+ Returns:
+ list: The collected results.
+ """
+ rank, world_size = get_dist_info()
+ # create a tmp dir if it is not specified
+ if tmpdir is None:
+ MAX_LEN = 512
+ # 32 is whitespace
+ dir_tensor = torch.full((MAX_LEN, ),
+ 32,
+ dtype=torch.uint8,
+ device='cuda')
+ if rank == 0:
+ mmcv.mkdir_or_exist('.dist_test')
+ tmpdir = tempfile.mkdtemp(dir='.dist_test')
+ tmpdir = torch.tensor(
+ bytearray(tmpdir.encode()), dtype=torch.uint8, device='cuda')
+ dir_tensor[:len(tmpdir)] = tmpdir
+ dist.broadcast(dir_tensor, 0)
+ tmpdir = dir_tensor.cpu().numpy().tobytes().decode().rstrip()
+ else:
+ mmcv.mkdir_or_exist(tmpdir)
+ # dump the part result to the dir
+ mmcv.dump(result_part, osp.join(tmpdir, f'part_{rank}.pkl'))
+ dist.barrier()
+ # collect all parts
+ if rank != 0:
+ return None
+ else:
+ # load results of all parts from tmp dir
+ part_list = []
+ for i in range(world_size):
+ part_file = osp.join(tmpdir, f'part_{i}.pkl')
+ part_result = mmcv.load(part_file)
+ # When data is severely insufficient, an empty part_result
+ # on a certain gpu could make the overall outputs empty.
+ if part_result:
+ part_list.append(part_result)
+ # sort the results
+ ordered_results = []
+ for res in zip(*part_list):
+ ordered_results.extend(list(res))
+ # the dataloader may pad some samples
+ ordered_results = ordered_results[:size]
+ # remove tmp dir
+ shutil.rmtree(tmpdir)
+ return ordered_results
+
+
+def collect_results_gpu(result_part, size):
+ """Collect results under gpu mode.
+
+ In gpu mode, this function will encode results to gpu tensors and use gpu
+ communication for results collection.
+
+ Args:
+ result_part (list): Result list containing result parts
+ to be collected.
+ size (int): Size of the results, commonly equal to length of
+ the results.
+
+ Returns:
+ list: The collected results.
+ """
+ rank, world_size = get_dist_info()
+ # dump result part to tensor with pickle
+ part_tensor = torch.tensor(
+ bytearray(pickle.dumps(result_part)), dtype=torch.uint8, device='cuda')
+ # gather all result part tensor shape
+ shape_tensor = torch.tensor(part_tensor.shape, device='cuda')
+ shape_list = [shape_tensor.clone() for _ in range(world_size)]
+ dist.all_gather(shape_list, shape_tensor)
+ # padding result part tensor to max length
+ shape_max = torch.tensor(shape_list).max()
+ part_send = torch.zeros(shape_max, dtype=torch.uint8, device='cuda')
+ part_send[:shape_tensor[0]] = part_tensor
+ part_recv_list = [
+ part_tensor.new_zeros(shape_max) for _ in range(world_size)
+ ]
+ # gather all result part
+ dist.all_gather(part_recv_list, part_send)
+
+ if rank == 0:
+ part_list = []
+ for recv, shape in zip(part_recv_list, shape_list):
+ part_result = pickle.loads(recv[:shape[0]].cpu().numpy().tobytes())
+ # When data is severely insufficient, an empty part_result
+ # on a certain gpu could make the overall outputs empty.
+ if part_result:
+ part_list.append(part_result)
+ # sort the results
+ ordered_results = []
+ for res in zip(*part_list):
+ ordered_results.extend(list(res))
+ # the dataloader may pad some samples
+ ordered_results = ordered_results[:size]
+ return ordered_results
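The core trick in `collect_results_gpu` above deserves a spelled-out example: arbitrary Python results are pickled into a `uint8` tensor so that `dist.all_gather`, which only moves tensors, can transport them; receivers strip the padding and unpickle. A self-contained sketch of that round trip (on CPU, without a process group):

```python
import pickle

import torch

part = ['any', {'python': 'object'}]

# Serialize to a byte tensor, as collect_results_gpu does before all_gather.
buf = torch.tensor(bytearray(pickle.dumps(part)), dtype=torch.uint8)

# After gathering, each rank slices the tensor back to its true length
# (dropping the zero padding) and unpickles the raw bytes.
restored = pickle.loads(buf.numpy().tobytes())
assert restored == part
```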
diff --git a/mmcv/fileio/__init__.py b/mmcv/fileio/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..2051b85f7e59bff7bdbaa131849ce8cd31f059a4
--- /dev/null
+++ b/mmcv/fileio/__init__.py
@@ -0,0 +1,11 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+from .file_client import BaseStorageBackend, FileClient
+from .handlers import BaseFileHandler, JsonHandler, PickleHandler, YamlHandler
+from .io import dump, load, register_handler
+from .parse import dict_from_file, list_from_file
+
+__all__ = [
+ 'BaseStorageBackend', 'FileClient', 'load', 'dump', 'register_handler',
+ 'BaseFileHandler', 'JsonHandler', 'PickleHandler', 'YamlHandler',
+ 'list_from_file', 'dict_from_file'
+]
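As a usage note for the API surface above: `load` and `dump` dispatch on the file suffix to the registered handlers, which is also how `engine/test.py` above persists result parts. A minimal sketch:

```python
import mmcv

# The handler (json / yaml / pickle) is chosen from the file suffix.
mmcv.dump({'a': 1, 'b': [2, 3]}, 'tmp.json')
data = mmcv.load('tmp.json')
assert data == {'a': 1, 'b': [2, 3]}
```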
diff --git a/mmcv/fileio/file_client.py b/mmcv/fileio/file_client.py
new file mode 100644
index 0000000000000000000000000000000000000000..b2d622868cdd006dc7446bcde0dc54731c17116a
--- /dev/null
+++ b/mmcv/fileio/file_client.py
@@ -0,0 +1,1148 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import inspect
+import os
+import os.path as osp
+import re
+import tempfile
+import warnings
+from abc import ABCMeta, abstractmethod
+from contextlib import contextmanager
+from pathlib import Path
+from typing import Iterable, Iterator, Optional, Tuple, Union
+from urllib.request import urlopen
+
+import mmcv
+from mmcv.utils.misc import has_method
+from mmcv.utils.path import is_filepath
+
+
+class BaseStorageBackend(metaclass=ABCMeta):
+ """Abstract class of storage backends.
+
+ All backends need to implement two APIs: ``get()`` and ``get_text()``.
+ ``get()`` reads the file as a byte stream and ``get_text()`` reads the file
+ as text.
+ """
+
+ # a flag to indicate whether the backend can create a symlink for a file
+ _allow_symlink = False
+
+ @property
+ def name(self):
+ return self.__class__.__name__
+
+ @property
+ def allow_symlink(self):
+ return self._allow_symlink
+
+ @abstractmethod
+ def get(self, filepath):
+ pass
+
+ @abstractmethod
+ def get_text(self, filepath):
+ pass
+
+
+class CephBackend(BaseStorageBackend):
+ """Ceph storage backend (for internal use).
+
+ Args:
+ path_mapping (dict|None): path mapping dict from local path to Petrel
+ path. When ``path_mapping={'src': 'dst'}``, ``src`` in ``filepath``
+ will be replaced by ``dst``. Default: None.
+
+ .. warning::
+ :class:`mmcv.fileio.file_client.CephBackend` will be deprecated,
+ please use :class:`mmcv.fileio.file_client.PetrelBackend` instead.
+ """
+
+ def __init__(self, path_mapping=None):
+ try:
+ import ceph
+ except ImportError:
+ raise ImportError('Please install ceph to enable CephBackend.')
+
+ warnings.warn(
+ 'CephBackend will be deprecated, please use PetrelBackend instead')
+ self._client = ceph.S3Client()
+ assert isinstance(path_mapping, dict) or path_mapping is None
+ self.path_mapping = path_mapping
+
+ def get(self, filepath):
+ filepath = str(filepath)
+ if self.path_mapping is not None:
+ for k, v in self.path_mapping.items():
+ filepath = filepath.replace(k, v)
+ value = self._client.Get(filepath)
+ value_buf = memoryview(value)
+ return value_buf
+
+ def get_text(self, filepath, encoding=None):
+ raise NotImplementedError
+
+
+class PetrelBackend(BaseStorageBackend):
+ """Petrel storage backend (for internal use).
+
+ PetrelBackend supports reading and writing data to multiple clusters.
+ If the file path contains the cluster name, PetrelBackend will read data
+ from specified cluster or write data to it. Otherwise, PetrelBackend will
+ access the default cluster.
+
+ Args:
+ path_mapping (dict, optional): Path mapping dict from local path to
+ Petrel path. When ``path_mapping={'src': 'dst'}``, ``src`` in
+ ``filepath`` will be replaced by ``dst``. Default: None.
+ enable_mc (bool, optional): Whether to enable memcached support.
+ Default: True.
+
+ Examples:
+ >>> filepath1 = 's3://path/of/file'
+ >>> filepath2 = 'cluster-name:s3://path/of/file'
+ >>> client = PetrelBackend()
+ >>> client.get(filepath1) # get data from default cluster
+ >>> client.get(filepath2) # get data from 'cluster-name' cluster
+ """
+
+ def __init__(self,
+ path_mapping: Optional[dict] = None,
+ enable_mc: bool = True):
+ try:
+ from petrel_client import client
+ except ImportError:
+ raise ImportError('Please install petrel_client to enable '
+ 'PetrelBackend.')
+
+ self._client = client.Client(enable_mc=enable_mc)
+ assert isinstance(path_mapping, dict) or path_mapping is None
+ self.path_mapping = path_mapping
+
+ def _map_path(self, filepath: Union[str, Path]) -> str:
+ """Map ``filepath`` to a string path whose prefix will be replaced by
+ :attr:`self.path_mapping`.
+
+ Args:
+ filepath (str): Path to be mapped.
+ """
+ filepath = str(filepath)
+ if self.path_mapping is not None:
+ for k, v in self.path_mapping.items():
+ filepath = filepath.replace(k, v)
+ return filepath
+
+ def _format_path(self, filepath: str) -> str:
+ """Convert a ``filepath`` to standard format of petrel oss.
+
+ If the ``filepath`` is concatenated by ``os.path.join``, in a Windows
+ environment, the ``filepath`` will be the format of
+ 's3://bucket_name\\image.jpg'. By invoking :meth:`_format_path`, the
+ above ``filepath`` will be converted to 's3://bucket_name/image.jpg'.
+
+ Args:
+ filepath (str): Path to be formatted.
+ """
+ return re.sub(r'\\+', '/', filepath)
+
+ def get(self, filepath: Union[str, Path]) -> memoryview:
+ """Read data from a given ``filepath`` with 'rb' mode.
+
+ Args:
+ filepath (str or Path): Path to read data.
+
+ Returns:
+ memoryview: A memory view of expected bytes object to avoid
+ copying. The memoryview object can be converted to bytes by
+ ``value_buf.tobytes()``.
+ """
+ filepath = self._map_path(filepath)
+ filepath = self._format_path(filepath)
+ value = self._client.Get(filepath)
+ value_buf = memoryview(value)
+ return value_buf
+
+ def get_text(self,
+ filepath: Union[str, Path],
+ encoding: str = 'utf-8') -> str:
+ """Read data from a given ``filepath`` with 'r' mode.
+
+ Args:
+ filepath (str or Path): Path to read data.
+ encoding (str): The encoding format used to open the ``filepath``.
+ Default: 'utf-8'.
+
+ Returns:
+ str: Expected text reading from ``filepath``.
+ """
+ return str(self.get(filepath), encoding=encoding)
+
+ def put(self, obj: bytes, filepath: Union[str, Path]) -> None:
+ """Save data to a given ``filepath``.
+
+ Args:
+ obj (bytes): Data to be saved.
+ filepath (str or Path): Path to write data.
+ """
+ filepath = self._map_path(filepath)
+ filepath = self._format_path(filepath)
+ self._client.put(filepath, obj)
+
+ def put_text(self,
+ obj: str,
+ filepath: Union[str, Path],
+ encoding: str = 'utf-8') -> None:
+ """Save data to a given ``filepath``.
+
+ Args:
+ obj (str): Data to be written.
+ filepath (str or Path): Path to write data.
+ encoding (str): The encoding format used to encode the ``obj``.
+ Default: 'utf-8'.
+ """
+ self.put(bytes(obj, encoding=encoding), filepath)
+
+ def remove(self, filepath: Union[str, Path]) -> None:
+ """Remove a file.
+
+ Args:
+ filepath (str or Path): Path to be removed.
+ """
+ if not has_method(self._client, 'delete'):
+ raise NotImplementedError(
+ ('The current version of Petrel Python SDK does not support '
+ 'the `delete` method, please use a higher version or dev'
+ ' branch instead.'))
+
+ filepath = self._map_path(filepath)
+ filepath = self._format_path(filepath)
+ self._client.delete(filepath)
+
+ def exists(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path exists.
+
+ Args:
+ filepath (str or Path): Path to be checked whether exists.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` exists, ``False`` otherwise.
+ """
+ if not (has_method(self._client, 'contains')
+ and has_method(self._client, 'isdir')):
+ raise NotImplementedError(
+ ('The current version of Petrel Python SDK does not support '
+ 'the `contains` and `isdir` methods, please use a higher '
+ 'version or dev branch instead.'))
+
+ filepath = self._map_path(filepath)
+ filepath = self._format_path(filepath)
+ return self._client.contains(filepath) or self._client.isdir(filepath)
+
+ def isdir(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path is a directory.
+
+ Args:
+ filepath (str or Path): Path to be checked whether it is a
+ directory.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` points to a directory,
+ ``False`` otherwise.
+ """
+ if not has_method(self._client, 'isdir'):
+ raise NotImplementedError(
+ ('The current version of Petrel Python SDK does not support '
+ 'the `isdir` method, please use a higher version or dev'
+ ' branch instead.'))
+
+ filepath = self._map_path(filepath)
+ filepath = self._format_path(filepath)
+ return self._client.isdir(filepath)
+
+ def isfile(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path is a file.
+
+ Args:
+ filepath (str or Path): Path to be checked whether it is a file.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` points to a file, ``False``
+ otherwise.
+ """
+ if not has_method(self._client, 'contains'):
+ raise NotImplementedError(
+ ('The current version of Petrel Python SDK does not support '
+ 'the `contains` method, please use a higher version or '
+ 'dev branch instead.'))
+
+ filepath = self._map_path(filepath)
+ filepath = self._format_path(filepath)
+ return self._client.contains(filepath)
+
+ def join_path(self, filepath: Union[str, Path],
+ *filepaths: Union[str, Path]) -> str:
+ """Concatenate all file paths.
+
+ Args:
+ filepath (str or Path): Path to be concatenated.
+
+ Returns:
+ str: The result after concatenation.
+ """
+ filepath = self._format_path(self._map_path(filepath))
+ if filepath.endswith('/'):
+ filepath = filepath[:-1]
+ formatted_paths = [filepath]
+ for path in filepaths:
+ formatted_paths.append(self._format_path(self._map_path(path)))
+ return '/'.join(formatted_paths)
+
+ @contextmanager
+ def get_local_path(self, filepath: Union[str, Path]) -> Iterable[str]:
+ """Download a file from ``filepath`` and return a temporary path.
+
+ ``get_local_path`` is decorated by :meth:`contextlib.contextmanager`. It
+ can be called with the ``with`` statement, and when exiting the
+ ``with`` statement, the temporary path will be released.
+
+ Args:
+ filepath (str | Path): Download a file from ``filepath``.
+
+ Examples:
+ >>> client = PetrelBackend()
+ >>> # After exiting from the ``with`` clause,
+ >>> # the path will be removed
+ >>> with client.get_local_path('s3://path/of/your/file') as path:
+ ... # do something here
+
+ Yields:
+ Iterable[str]: Only yield one temporary path.
+ """
+ filepath = self._map_path(filepath)
+ filepath = self._format_path(filepath)
+ assert self.isfile(filepath)
+ try:
+ f = tempfile.NamedTemporaryFile(delete=False)
+ f.write(self.get(filepath))
+ f.close()
+ yield f.name
+ finally:
+ os.remove(f.name)
+
+ def list_dir_or_file(self,
+ dir_path: Union[str, Path],
+ list_dir: bool = True,
+ list_file: bool = True,
+ suffix: Optional[Union[str, Tuple[str]]] = None,
+ recursive: bool = False) -> Iterator[str]:
+ """Scan a directory to find the interested directories or files in
+ arbitrary order.
+
+ Note:
+ Petrel has no concept of directories but it simulates the directory
+ hierarchy in the filesystem through public prefixes. In addition,
+ if the returned path ends with '/', it means the path is a public
+ prefix which is a logical directory.
+
+ Note:
+ :meth:`list_dir_or_file` returns the path relative to ``dir_path``.
+ In addition, the returned path of a directory will not contain the
+ suffix '/', which is consistent with other backends.
+
+ Args:
+ dir_path (str | Path): Path of the directory.
+ list_dir (bool): List the directories. Default: True.
+ list_file (bool): List the path of files. Default: True.
+ suffix (str or tuple[str], optional): File suffix
+ that we are interested in. Default: None.
+ recursive (bool): If set to True, recursively scan the
+ directory. Default: False.
+
+ Yields:
+ Iterable[str]: A relative path to ``dir_path``.
+ """
+ if not has_method(self._client, 'list'):
+ raise NotImplementedError(
+ ('The current version of Petrel Python SDK does not support '
+ 'the `list` method, please use a higher version or dev'
+ ' branch instead.'))
+
+ dir_path = self._map_path(dir_path)
+ dir_path = self._format_path(dir_path)
+ if list_dir and suffix is not None:
+ raise TypeError(
+ '`list_dir` should be False when `suffix` is not None')
+
+ if (suffix is not None) and not isinstance(suffix, (str, tuple)):
+ raise TypeError('`suffix` must be a string or tuple of strings')
+
+ # Petrel's simulated directory hierarchy assumes that directory paths
+ # should end with `/`
+ if not dir_path.endswith('/'):
+ dir_path += '/'
+
+ root = dir_path
+
+ def _list_dir_or_file(dir_path, list_dir, list_file, suffix,
+ recursive):
+ for path in self._client.list(dir_path):
+ # the `self.isdir` is not used here to determine whether path
+ # is a directory, because `self.isdir` relies on
+ # `self._client.list`
+ if path.endswith('/'): # a directory path
+ next_dir_path = self.join_path(dir_path, path)
+ if list_dir:
+ # get the relative path and exclude the last
+ # character '/'
+ rel_dir = next_dir_path[len(root):-1]
+ yield rel_dir
+ if recursive:
+ yield from _list_dir_or_file(next_dir_path, list_dir,
+ list_file, suffix,
+ recursive)
+ else: # a file path
+ absolute_path = self.join_path(dir_path, path)
+ rel_path = absolute_path[len(root):]
+ if (suffix is None
+ or rel_path.endswith(suffix)) and list_file:
+ yield rel_path
+
+ return _list_dir_or_file(dir_path, list_dir, list_file, suffix,
+ recursive)
+
+
+class MemcachedBackend(BaseStorageBackend):
+ """Memcached storage backend.
+
+ Attributes:
+ server_list_cfg (str): Config file for memcached server list.
+ client_cfg (str): Config file for memcached client.
+ sys_path (str | None): Additional path to be appended to `sys.path`.
+ Default: None.
+ """
+
+ def __init__(self, server_list_cfg, client_cfg, sys_path=None):
+ if sys_path is not None:
+ import sys
+ sys.path.append(sys_path)
+ try:
+ import mc
+ except ImportError:
+ raise ImportError(
+ 'Please install memcached to enable MemcachedBackend.')
+
+ self.server_list_cfg = server_list_cfg
+ self.client_cfg = client_cfg
+ self._client = mc.MemcachedClient.GetInstance(self.server_list_cfg,
+ self.client_cfg)
+ # mc.pyvector serves as a pointer which points to a memory cache
+ self._mc_buffer = mc.pyvector()
+
+ def get(self, filepath):
+ filepath = str(filepath)
+ import mc
+ self._client.Get(filepath, self._mc_buffer)
+ value_buf = mc.ConvertBuffer(self._mc_buffer)
+ return value_buf
+
+ def get_text(self, filepath, encoding=None):
+ raise NotImplementedError
+
+
+class LmdbBackend(BaseStorageBackend):
+ """Lmdb storage backend.
+
+ Args:
+ db_path (str): Lmdb database path.
+ readonly (bool, optional): Lmdb environment parameter. If True,
+ disallow any write operations. Default: True.
+ lock (bool, optional): Lmdb environment parameter. If False, when
+ concurrent access occurs, do not lock the database. Default: False.
+ readahead (bool, optional): Lmdb environment parameter. If False,
+ disable the OS filesystem readahead mechanism, which may improve
+ random read performance when a database is larger than RAM.
+ Default: False.
+
+ Attributes:
+ db_path (str): Lmdb database path.
+ """
+
+ def __init__(self,
+ db_path,
+ readonly=True,
+ lock=False,
+ readahead=False,
+ **kwargs):
+ try:
+ import lmdb
+ except ImportError:
+ raise ImportError('Please install lmdb to enable LmdbBackend.')
+
+ self.db_path = str(db_path)
+ self._client = lmdb.open(
+ self.db_path,
+ readonly=readonly,
+ lock=lock,
+ readahead=readahead,
+ **kwargs)
+
+ def get(self, filepath):
+ """Get values according to the filepath.
+
+ Args:
+ filepath (str | obj:`Path`): Here, filepath is the lmdb key.
+ """
+ filepath = str(filepath)
+ with self._client.begin(write=False) as txn:
+ value_buf = txn.get(filepath.encode('ascii'))
+ return value_buf
+
+ def get_text(self, filepath, encoding=None):
+ raise NotImplementedError
+
+
+class HardDiskBackend(BaseStorageBackend):
+ """Raw hard disks storage backend."""
+
+ _allow_symlink = True
+
+ def get(self, filepath: Union[str, Path]) -> bytes:
+ """Read data from a given ``filepath`` with 'rb' mode.
+
+ Args:
+ filepath (str or Path): Path to read data.
+
+ Returns:
+ bytes: Expected bytes object.
+ """
+ with open(filepath, 'rb') as f:
+ value_buf = f.read()
+ return value_buf
+
+ def get_text(self,
+ filepath: Union[str, Path],
+ encoding: str = 'utf-8') -> str:
+ """Read data from a given ``filepath`` with 'r' mode.
+
+ Args:
+ filepath (str or Path): Path to read data.
+ encoding (str): The encoding format used to open the ``filepath``.
+ Default: 'utf-8'.
+
+ Returns:
+ str: Expected text reading from ``filepath``.
+ """
+ with open(filepath, 'r', encoding=encoding) as f:
+ value_buf = f.read()
+ return value_buf
+
+ def put(self, obj: bytes, filepath: Union[str, Path]) -> None:
+ """Write data to a given ``filepath`` with 'wb' mode.
+
+ Note:
+ ``put`` will create a directory if the directory of ``filepath``
+ does not exist.
+
+ Args:
+ obj (bytes): Data to be written.
+ filepath (str or Path): Path to write data.
+ """
+ mmcv.mkdir_or_exist(osp.dirname(filepath))
+ with open(filepath, 'wb') as f:
+ f.write(obj)
+
+ def put_text(self,
+ obj: str,
+ filepath: Union[str, Path],
+ encoding: str = 'utf-8') -> None:
+ """Write data to a given ``filepath`` with 'w' mode.
+
+ Note:
+ ``put_text`` will create a directory if the directory of
+ ``filepath`` does not exist.
+
+ Args:
+ obj (str): Data to be written.
+ filepath (str or Path): Path to write data.
+ encoding (str): The encoding format used to open the ``filepath``.
+ Default: 'utf-8'.
+ """
+ mmcv.mkdir_or_exist(osp.dirname(filepath))
+ with open(filepath, 'w', encoding=encoding) as f:
+ f.write(obj)
+
+ def remove(self, filepath: Union[str, Path]) -> None:
+ """Remove a file.
+
+ Args:
+ filepath (str or Path): Path to be removed.
+ """
+ os.remove(filepath)
+
+ def exists(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path exists.
+
+ Args:
+ filepath (str or Path): Path to be checked whether exists.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` exists, ``False`` otherwise.
+ """
+ return osp.exists(filepath)
+
+ def isdir(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path is a directory.
+
+ Args:
+ filepath (str or Path): Path to be checked whether it is a
+ directory.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` points to a directory,
+ ``False`` otherwise.
+ """
+ return osp.isdir(filepath)
+
+ def isfile(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path is a file.
+
+ Args:
+ filepath (str or Path): Path to be checked whether it is a file.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` points to a file, ``False``
+ otherwise.
+ """
+ return osp.isfile(filepath)
+
+ def join_path(self, filepath: Union[str, Path],
+ *filepaths: Union[str, Path]) -> str:
+ """Concatenate all file paths.
+
+ Join one or more filepath components intelligently. The return value
+ is the concatenation of filepath and any members of *filepaths.
+
+ Args:
+ filepath (str or Path): Path to be concatenated.
+
+ Returns:
+ str: The result of concatenation.
+ """
+ return osp.join(filepath, *filepaths)
+
+ @contextmanager
+ def get_local_path(
+ self, filepath: Union[str, Path]) -> Iterable[Union[str, Path]]:
+ """Only for unified API and do nothing."""
+ yield filepath
+
+ def list_dir_or_file(self,
+ dir_path: Union[str, Path],
+ list_dir: bool = True,
+ list_file: bool = True,
+ suffix: Optional[Union[str, Tuple[str]]] = None,
+ recursive: bool = False) -> Iterator[str]:
+ """Scan a directory to find the interested directories or files in
+ arbitrary order.
+
+ Note:
+ :meth:`list_dir_or_file` returns the path relative to ``dir_path``.
+
+ Args:
+ dir_path (str | Path): Path of the directory.
+ list_dir (bool): List the directories. Default: True.
+ list_file (bool): List the path of files. Default: True.
+ suffix (str or tuple[str], optional): File suffix
+ that we are interested in. Default: None.
+ recursive (bool): If set to True, recursively scan the
+ directory. Default: False.
+
+ Yields:
+ Iterable[str]: A relative path to ``dir_path``.
+ """
+ if list_dir and suffix is not None:
+ raise TypeError('`suffix` should be None when `list_dir` is True')
+
+ if (suffix is not None) and not isinstance(suffix, (str, tuple)):
+ raise TypeError('`suffix` must be a string or tuple of strings')
+
+ root = dir_path
+
+ def _list_dir_or_file(dir_path, list_dir, list_file, suffix,
+ recursive):
+ for entry in os.scandir(dir_path):
+ if not entry.name.startswith('.') and entry.is_file():
+ rel_path = osp.relpath(entry.path, root)
+ if (suffix is None
+ or rel_path.endswith(suffix)) and list_file:
+ yield rel_path
+ elif osp.isdir(entry.path):
+ if list_dir:
+ rel_dir = osp.relpath(entry.path, root)
+ yield rel_dir
+ if recursive:
+ yield from _list_dir_or_file(entry.path, list_dir,
+ list_file, suffix,
+ recursive)
+
+ return _list_dir_or_file(dir_path, list_dir, list_file, suffix,
+ recursive)
+
+
+class HTTPBackend(BaseStorageBackend):
+ """HTTP and HTTPS storage bachend."""
+
+ def get(self, filepath):
+ value_buf = urlopen(filepath).read()
+ return value_buf
+
+ def get_text(self, filepath, encoding='utf-8'):
+ value_buf = urlopen(filepath).read()
+ return value_buf.decode(encoding)
+
+ @contextmanager
+ def get_local_path(self, filepath: str) -> Iterable[str]:
+ """Download a file from ``filepath``.
+
+ ``get_local_path`` is decorated by :meth:`contextlib.contextmanager`. It
+ can be called with the ``with`` statement, and when exiting the
+ ``with`` statement, the temporary path will be released.
+
+ Args:
+ filepath (str): Download a file from ``filepath``.
+
+ Examples:
+ >>> client = HTTPBackend()
+ >>> # After exiting from the ``with`` clause,
+ >>> # the path will be removed
+ >>> with client.get_local_path('http://path/of/your/file') as path:
+ ... # do something here
+ """
+ try:
+ f = tempfile.NamedTemporaryFile(delete=False)
+ f.write(self.get(filepath))
+ f.close()
+ yield f.name
+ finally:
+ os.remove(f.name)
+
+
+class FileClient:
+ """A general file client to access files in different backends.
+
+ The client loads a file or text in a specified backend from its path
+ and returns it as a binary or text file. There are two ways to choose a
+ backend: the name of the backend and the prefix of the path. Although
+ both of them can be used to choose a storage backend, ``backend`` has a
+ higher priority, that is, if both are set, the storage backend will be
+ chosen by the ``backend`` argument. If both are ``None``, the disk
+ backend will be chosen. Note that it can also register other backend
+ accessors with a given name, prefixes, and backend class. In addition,
+ we use the singleton pattern to avoid repeated object creation: if the
+ arguments are the same, the same object will be returned.
+
+ Args:
+ backend (str, optional): The storage backend type. Options are "disk",
+ "ceph", "memcached", "lmdb", "http" and "petrel". Default: None.
+ prefix (str, optional): The prefix of the registered storage backend.
+ Options are "s3", "http", "https". Default: None.
+
+ Examples:
+ >>> # only set backend
+ >>> file_client = FileClient(backend='petrel')
+ >>> # only set prefix
+ >>> file_client = FileClient(prefix='s3')
+ >>> # set both backend and prefix but use backend to choose client
+ >>> file_client = FileClient(backend='petrel', prefix='s3')
+ >>> # if the arguments are the same, the same object is returned
+ >>> file_client1 = FileClient(backend='petrel')
+ >>> file_client1 is file_client
+ True
+
+ Attributes:
+ client (:obj:`BaseStorageBackend`): The backend object.
+ """
+
+ _backends = {
+ 'disk': HardDiskBackend,
+ 'ceph': CephBackend,
+ 'memcached': MemcachedBackend,
+ 'lmdb': LmdbBackend,
+ 'petrel': PetrelBackend,
+ 'http': HTTPBackend,
+ }
+ # This collection is used to record the overridden backends, and when a
+ # backend appears in the collection, the singleton pattern is disabled for
+ # that backend, because if the singleton pattern is used, then the object
+ # returned will be the backend before overwriting
+ _overridden_backends = set()
+ _prefix_to_backends = {
+ 's3': PetrelBackend,
+ 'http': HTTPBackend,
+ 'https': HTTPBackend,
+ }
+ _overridden_prefixes = set()
+
+ _instances = {}
+
+ def __new__(cls, backend=None, prefix=None, **kwargs):
+ if backend is None and prefix is None:
+ backend = 'disk'
+ if backend is not None and backend not in cls._backends:
+ raise ValueError(
+ f'Backend {backend} is not supported. Currently supported ones'
+ f' are {list(cls._backends.keys())}')
+ if prefix is not None and prefix not in cls._prefix_to_backends:
+ raise ValueError(
+ f'prefix {prefix} is not supported. Currently supported ones '
+ f'are {list(cls._prefix_to_backends.keys())}')
+
+ # concatenate the arguments to a unique key for determining whether
+ # objects with the same arguments were created
+ arg_key = f'{backend}:{prefix}'
+ for key, value in kwargs.items():
+ arg_key += f':{key}:{value}'
+
+ # if a backend was overridden, it will create a new object
+ if (arg_key in cls._instances
+ and backend not in cls._overridden_backends
+ and prefix not in cls._overridden_prefixes):
+ _instance = cls._instances[arg_key]
+ else:
+ # create a new object and put it into _instances
+ _instance = super().__new__(cls)
+ if backend is not None:
+ _instance.client = cls._backends[backend](**kwargs)
+ else:
+ _instance.client = cls._prefix_to_backends[prefix](**kwargs)
+
+ cls._instances[arg_key] = _instance
+
+ return _instance
+
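+ # An illustrative note on the flow above: clients created with identical
+ # arguments share one object, e.g.
+ #   >>> FileClient(backend='disk') is FileClient(backend='disk')
+ #   True
+ # whereas after ``FileClient.register_backend('disk', SomeBackend,
+ # force=True)`` (``SomeBackend`` being a hypothetical subclass of
+ # ``BaseStorageBackend``), 'disk' enters ``_overridden_backends`` and the
+ # next call constructs a fresh object.
+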
+ @property
+ def name(self):
+ return self.client.name
+
+ @property
+ def allow_symlink(self):
+ return self.client.allow_symlink
+
+ @staticmethod
+ def parse_uri_prefix(uri: Union[str, Path]) -> Optional[str]:
+ """Parse the prefix of a uri.
+
+ Args:
+ uri (str | Path): Uri to be parsed that contains the file prefix.
+
+ Examples:
+ >>> FileClient.parse_uri_prefix('s3://path/of/your/file')
+ 's3'
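+ >>> # a cluster-prefixed petrel path (hypothetical) also yields 's3'
+ >>> FileClient.parse_uri_prefix('clusterName:s3://path/of/your/file')
+ 's3'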
+
+ Returns:
+ str | None: Return the prefix of uri if the uri contains '://'
+ else ``None``.
+ """
+ assert is_filepath(uri)
+ uri = str(uri)
+ if '://' not in uri:
+ return None
+ else:
+ prefix, _ = uri.split('://')
+ # In the case of PetrelBackend, the prefix may contain the cluster
+ # name, e.g. clusterName:s3
+ if ':' in prefix:
+ _, prefix = prefix.split(':')
+ return prefix
+
+ @classmethod
+ def infer_client(cls,
+ file_client_args: Optional[dict] = None,
+ uri: Optional[Union[str, Path]] = None) -> 'FileClient':
+ """Infer a suitable file client based on the URI and arguments.
+
+ Args:
+ file_client_args (dict, optional): Arguments to instantiate a
+ FileClient. Default: None.
+ uri (str | Path, optional): Uri to be parsed that contains the file
+ prefix. Default: None.
+
+ Examples:
+ >>> uri = 's3://path/of/your/file'
+ >>> file_client = FileClient.infer_client(uri=uri)
+ >>> file_client_args = {'backend': 'petrel'}
+ >>> file_client = FileClient.infer_client(file_client_args)
+
+ Returns:
+ FileClient: Instantiated FileClient object.
+ """
+ assert file_client_args is not None or uri is not None
+ if file_client_args is None:
+ file_prefix = cls.parse_uri_prefix(uri) # type: ignore
+ return cls(prefix=file_prefix)
+ else:
+ return cls(**file_client_args)
+
+ @classmethod
+ def _register_backend(cls, name, backend, force=False, prefixes=None):
+ if not isinstance(name, str):
+ raise TypeError('the backend name should be a string, '
+ f'but got {type(name)}')
+ if not inspect.isclass(backend):
+ raise TypeError(
+ f'backend should be a class but got {type(backend)}')
+ if not issubclass(backend, BaseStorageBackend):
+ raise TypeError(
+ f'backend {backend} is not a subclass of BaseStorageBackend')
+ if not force and name in cls._backends:
+ raise KeyError(
+ f'{name} is already registered as a storage backend, '
+ 'add "force=True" if you want to override it')
+
+ if name in cls._backends and force:
+ cls._overridden_backends.add(name)
+ cls._backends[name] = backend
+
+ if prefixes is not None:
+ if isinstance(prefixes, str):
+ prefixes = [prefixes]
+ else:
+ assert isinstance(prefixes, (list, tuple))
+ for prefix in prefixes:
+ if prefix not in cls._prefix_to_backends:
+ cls._prefix_to_backends[prefix] = backend
+ elif (prefix in cls._prefix_to_backends) and force:
+ cls._overridden_prefixes.add(prefix)
+ cls._prefix_to_backends[prefix] = backend
+ else:
+ raise KeyError(
+ f'{prefix} is already registered as a storage backend,'
+ ' add "force=True" if you want to override it')
+
+ @classmethod
+ def register_backend(cls, name, backend=None, force=False, prefixes=None):
+ """Register a backend to FileClient.
+
+ This method can be used as a normal class method or a decorator.
+
+ .. code-block:: python
+
+ class NewBackend(BaseStorageBackend):
+
+ def get(self, filepath):
+ return filepath
+
+ def get_text(self, filepath):
+ return filepath
+
+ FileClient.register_backend('new', NewBackend)
+
+ or
+
+ .. code-block:: python
+
+ @FileClient.register_backend('new')
+ class NewBackend(BaseStorageBackend):
+
+ def get(self, filepath):
+ return filepath
+
+ def get_text(self, filepath):
+ return filepath
+
+ Args:
+ name (str): The name of the registered backend.
+ backend (class, optional): The backend class to be registered,
+ which must be a subclass of :class:`BaseStorageBackend`.
+ When this method is used as a decorator, backend is None.
+ Defaults to None.
+ force (bool, optional): Whether to override the backend if the name
+ has already been registered. Defaults to False.
+ prefixes (str or list[str] or tuple[str], optional): The prefixes
+ of the registered storage backend. Default: None.
+ `New in version 1.3.15.`
+ """
+ if backend is not None:
+ cls._register_backend(
+ name, backend, force=force, prefixes=prefixes)
+ return
+
+ def _register(backend_cls):
+ cls._register_backend(
+ name, backend_cls, force=force, prefixes=prefixes)
+ return backend_cls
+
+ return _register
+
+ def get(self, filepath: Union[str, Path]) -> Union[bytes, memoryview]:
+ """Read data from a given ``filepath`` with 'rb' mode.
+
+ Note:
+ There are two types of return values for ``get``: one is ``bytes``
+ and the other is ``memoryview``. The advantage of using memoryview
+ is that you can avoid copying, and if you want to convert it to
+ ``bytes``, you can use ``.tobytes()``.
+
+ Args:
+ filepath (str or Path): Path to read data.
+
+ Returns:
+ bytes | memoryview: Expected bytes object or a memory view of the
+ bytes object.
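+
+ Examples:
+ >>> # an illustrative sketch; the path is a placeholder
+ >>> file_client = FileClient(prefix='s3')
+ >>> img_bytes = file_client.get('s3://path/of/your/file')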
+ """
+ return self.client.get(filepath)
+
+ def get_text(self, filepath: Union[str, Path], encoding='utf-8') -> str:
+ """Read data from a given ``filepath`` with 'r' mode.
+
+ Args:
+ filepath (str or Path): Path to read data.
+ encoding (str): The encoding format used to open the ``filepath``.
+ Default: 'utf-8'.
+
+ Returns:
+ str: Expected text reading from ``filepath``.
+ """
+ return self.client.get_text(filepath, encoding)
+
+ def put(self, obj: bytes, filepath: Union[str, Path]) -> None:
+ """Write data to a given ``filepath`` with 'wb' mode.
+
+ Note:
+ ``put`` should create a directory if the directory of ``filepath``
+ does not exist.
+
+ Args:
+ obj (bytes): Data to be written.
+ filepath (str or Path): Path to write data.
+ """
+ self.client.put(obj, filepath)
+
+ def put_text(self, obj: str, filepath: Union[str, Path]) -> None:
+ """Write data to a given ``filepath`` with 'w' mode.
+
+ Note:
+ ``put_text`` should create a directory if the directory of
+ ``filepath`` does not exist.
+
+ Args:
+ obj (str): Data to be written.
+ filepath (str or Path): Path to write data.
+ """
+ self.client.put_text(obj, filepath)
+
+ def remove(self, filepath: Union[str, Path]) -> None:
+ """Remove a file.
+
+ Args:
+ filepath (str or Path): Path to be removed.
+ """
+ self.client.remove(filepath)
+
+ def exists(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path exists.
+
+ Args:
+ filepath (str or Path): Path to be checked whether it exists.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` exists, ``False`` otherwise.
+ """
+ return self.client.exists(filepath)
+
+ def isdir(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path is a directory.
+
+ Args:
+ filepath (str or Path): Path to be checked whether it is a
+ directory.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` points to a directory,
+ ``False`` otherwise.
+ """
+ return self.client.isdir(filepath)
+
+ def isfile(self, filepath: Union[str, Path]) -> bool:
+ """Check whether a file path is a file.
+
+ Args:
+ filepath (str or Path): Path to be checked whether it is a file.
+
+ Returns:
+ bool: Return ``True`` if ``filepath`` points to a file, ``False``
+ otherwise.
+ """
+ return self.client.isfile(filepath)
+
+ def join_path(self, filepath: Union[str, Path],
+ *filepaths: Union[str, Path]) -> str:
+ """Concatenate all file paths.
+
+ Join one or more filepath components intelligently. The return value
+ is the concatenation of filepath and any members of *filepaths.
+
+ Args:
+ filepath (str or Path): Path to be concatenated.
+ *filepaths (str or Path): Other paths to be concatenated.
+
+ Returns:
+ str: The result of concatenation.
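+
+ Examples:
+ >>> # illustrative placeholders; the exact result depends on the
+ >>> # backend's ``join_path`` implementation
+ >>> file_client = FileClient(prefix='s3')
+ >>> file_client.join_path('s3://path/of/dir', 'another/path')
+ 's3://path/of/dir/another/path'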
+ """
+ return self.client.join_path(filepath, *filepaths)
+
+ @contextmanager
+ def get_local_path(self, filepath: Union[str, Path]) -> Iterable[str]:
+ """Download data from ``filepath`` and write the data to local path.
+
+ ``get_local_path`` is decorated by :meth:`contextlib.contextmanager`. It
+ can be called with a ``with`` statement, and when exiting the ``with``
+ statement, the temporary path will be released.
+
+ Note:
+ If ``filepath`` is a local path, it will be returned directly.
+
+ .. warning::
+ ``get_local_path`` is an experimental interface that may change in
+ the future.
+
+ Args:
+ filepath (str or Path): Path of the file to be read.
+
+ Examples:
+ >>> file_client = FileClient(prefix='s3')
+ >>> with file_client.get_local_path('s3://bucket/abc.jpg') as path:
+ ... # do something here
+
+ Yields:
+ Iterable[str]: Only yields one path.
+ """
+ with self.client.get_local_path(str(filepath)) as local_path:
+ yield local_path
+
+ def list_dir_or_file(self,
+ dir_path: Union[str, Path],
+ list_dir: bool = True,
+ list_file: bool = True,
+ suffix: Optional[Union[str, Tuple[str]]] = None,
+ recursive: bool = False) -> Iterator[str]:
+ """Scan a directory to find the directories or files of interest in
+ arbitrary order.
+
+ Note:
+ :meth:`list_dir_or_file` returns the path relative to ``dir_path``.
+
+ Args:
+ dir_path (str | Path): Path of the directory.
+ list_dir (bool): List the directories. Default: True.
+ list_file (bool): List the path of files. Default: True.
+ suffix (str or tuple[str], optional): File suffix
+ that we are interested in. Default: None.
+ recursive (bool): If set to True, recursively scan the
+ directory. Default: False.
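+
+ Examples:
+ >>> # illustrative usage; '/path/of/dir' is a placeholder
+ >>> file_client = FileClient(backend='disk')
+ >>> for file_path in file_client.list_dir_or_file('/path/of/dir'):
+ ...     print(file_path)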
+
+ Yields:
+ Iterable[str]: A relative path to ``dir_path``.
+ """
+ yield from self.client.list_dir_or_file(dir_path, list_dir, list_file,
+ suffix, recursive)
diff --git a/mmcv/fileio/handlers/__init__.py b/mmcv/fileio/handlers/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..aa24d91972837b8756b225f4879bac20436eb72a
--- /dev/null
+++ b/mmcv/fileio/handlers/__init__.py
@@ -0,0 +1,7 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+from .base import BaseFileHandler
+from .json_handler import JsonHandler
+from .pickle_handler import PickleHandler
+from .yaml_handler import YamlHandler
+
+__all__ = ['BaseFileHandler', 'JsonHandler', 'PickleHandler', 'YamlHandler']
diff --git a/mmcv/fileio/handlers/base.py b/mmcv/fileio/handlers/base.py
new file mode 100644
index 0000000000000000000000000000000000000000..288878bc57282fbb2f12b32290152ca8e9d3cab0
--- /dev/null
+++ b/mmcv/fileio/handlers/base.py
@@ -0,0 +1,30 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+from abc import ABCMeta, abstractmethod
+
+
+class BaseFileHandler(metaclass=ABCMeta):
+ # `str_like` is a flag to indicate whether the type of the file object is
+ # str-like or bytes-like. Pickle only processes bytes-like objects while
+ # json only processes str-like objects. If it is a str-like object,
+ # `StringIO` will be used to process the buffer.
+ str_like = True
+
+ @abstractmethod
+ def load_from_fileobj(self, file, **kwargs):
+ pass
+
+ @abstractmethod
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ pass
+
+ @abstractmethod
+ def dump_to_str(self, obj, **kwargs):
+ pass
+
+ def load_from_path(self, filepath, mode='r', **kwargs):
+ with open(filepath, mode) as f:
+ return self.load_from_fileobj(f, **kwargs)
+
+ def dump_to_path(self, obj, filepath, mode='w', **kwargs):
+ with open(filepath, mode) as f:
+ self.dump_to_fileobj(obj, f, **kwargs)
diff --git a/mmcv/fileio/handlers/json_handler.py b/mmcv/fileio/handlers/json_handler.py
new file mode 100644
index 0000000000000000000000000000000000000000..18d4f15f74139d20adff18b20be5529c592a66b6
--- /dev/null
+++ b/mmcv/fileio/handlers/json_handler.py
@@ -0,0 +1,36 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import json
+
+import numpy as np
+
+from .base import BaseFileHandler
+
+
+def set_default(obj):
+ """Set default json values for non-serializable values.
+
+ It helps convert ``set``, ``range`` and ``np.ndarray`` data types to list.
+ It also converts ``np.generic`` (including ``np.int32``, ``np.float32``,
+ etc.) into plain numbers of python built-in types.
+ """
+ if isinstance(obj, (set, range)):
+ return list(obj)
+ elif isinstance(obj, np.ndarray):
+ return obj.tolist()
+ elif isinstance(obj, np.generic):
+ return obj.item()
+ raise TypeError(f'{type(obj)} is unsupported for json dump')
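+
+# For illustration only: ``set_default`` lets ``json.dumps`` serialize numpy
+# values, e.g. json.dumps({'a': np.float32(1.5)}, default=set_default)
+# returns '{"a": 1.5}'.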
+
+
+class JsonHandler(BaseFileHandler):
+
+ def load_from_fileobj(self, file):
+ return json.load(file)
+
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ kwargs.setdefault('default', set_default)
+ json.dump(obj, file, **kwargs)
+
+ def dump_to_str(self, obj, **kwargs):
+ kwargs.setdefault('default', set_default)
+ return json.dumps(obj, **kwargs)
diff --git a/mmcv/fileio/handlers/pickle_handler.py b/mmcv/fileio/handlers/pickle_handler.py
new file mode 100644
index 0000000000000000000000000000000000000000..b37c79bed4ef9fd8913715e62dbe3fc5cafdc3aa
--- /dev/null
+++ b/mmcv/fileio/handlers/pickle_handler.py
@@ -0,0 +1,28 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import pickle
+
+from .base import BaseFileHandler
+
+
+class PickleHandler(BaseFileHandler):
+
+ str_like = False
+
+ def load_from_fileobj(self, file, **kwargs):
+ return pickle.load(file, **kwargs)
+
+ def load_from_path(self, filepath, **kwargs):
+ return super(PickleHandler, self).load_from_path(
+ filepath, mode='rb', **kwargs)
+
+ def dump_to_str(self, obj, **kwargs):
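+ # keep protocol 2 as the default (presumably for compatibility with
+ # Python 2 readers)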
+ kwargs.setdefault('protocol', 2)
+ return pickle.dumps(obj, **kwargs)
+
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ kwargs.setdefault('protocol', 2)
+ pickle.dump(obj, file, **kwargs)
+
+ def dump_to_path(self, obj, filepath, **kwargs):
+ super(PickleHandler, self).dump_to_path(
+ obj, filepath, mode='wb', **kwargs)
diff --git a/mmcv/fileio/handlers/yaml_handler.py b/mmcv/fileio/handlers/yaml_handler.py
new file mode 100644
index 0000000000000000000000000000000000000000..c5aa2eea1e8c76f8baf753d1c8c959dee665e543
--- /dev/null
+++ b/mmcv/fileio/handlers/yaml_handler.py
@@ -0,0 +1,24 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import yaml
+
+try:
+ from yaml import CLoader as Loader, CDumper as Dumper
+except ImportError:
+ from yaml import Loader, Dumper
+
+from .base import BaseFileHandler # isort:skip
+
+
+class YamlHandler(BaseFileHandler):
+
+ def load_from_fileobj(self, file, **kwargs):
+ kwargs.setdefault('Loader', Loader)
+ return yaml.load(file, **kwargs)
+
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ kwargs.setdefault('Dumper', Dumper)
+ yaml.dump(obj, file, **kwargs)
+
+ def dump_to_str(self, obj, **kwargs):
+ kwargs.setdefault('Dumper', Dumper)
+ return yaml.dump(obj, **kwargs)
diff --git a/mmcv/fileio/io.py b/mmcv/fileio/io.py
new file mode 100644
index 0000000000000000000000000000000000000000..aaefde58aa3ea5b58f86249ce7e1c40c186eb8dd
--- /dev/null
+++ b/mmcv/fileio/io.py
@@ -0,0 +1,151 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+from io import BytesIO, StringIO
+from pathlib import Path
+
+from ..utils import is_list_of, is_str
+from .file_client import FileClient
+from .handlers import BaseFileHandler, JsonHandler, PickleHandler, YamlHandler
+
+file_handlers = {
+ 'json': JsonHandler(),
+ 'yaml': YamlHandler(),
+ 'yml': YamlHandler(),
+ 'pickle': PickleHandler(),
+ 'pkl': PickleHandler()
+}
+
+
+def load(file, file_format=None, file_client_args=None, **kwargs):
+ """Load data from json/yaml/pickle files.
+
+ This method provides a unified api for loading data from serialized files.
+
+ Note:
+ In v1.3.16 and later, ``load`` supports loading data from serialized
+ files that can be stored in different backends.
+
+ Args:
+ file (str or :obj:`Path` or file-like object): Filename or a file-like
+ object.
+ file_format (str, optional): If not specified, the file format will be
+ inferred from the file extension, otherwise use the specified one.
+ Currently supported formats include "json", "yaml/yml" and
+ "pickle/pkl".
+ file_client_args (dict, optional): Arguments to instantiate a
+ FileClient. See :class:`mmcv.fileio.FileClient` for details.
+ Default: None.
+
+ Examples:
+ >>> load('/path/of/your/file') # file is stored on disk
+ >>> load('https://path/of/your/file') # file is stored on the Internet
+ >>> load('s3://path/of/your/file') # file is stored in petrel
+
+ Returns:
+ The content from the file.
+ """
+ if isinstance(file, Path):
+ file = str(file)
+ if file_format is None and is_str(file):
+ file_format = file.split('.')[-1]
+ if file_format not in file_handlers:
+ raise TypeError(f'Unsupported format: {file_format}')
+
+ handler = file_handlers[file_format]
+ if is_str(file):
+ file_client = FileClient.infer_client(file_client_args, file)
+ if handler.str_like:
+ with StringIO(file_client.get_text(file)) as f:
+ obj = handler.load_from_fileobj(f, **kwargs)
+ else:
+ with BytesIO(file_client.get(file)) as f:
+ obj = handler.load_from_fileobj(f, **kwargs)
+ elif hasattr(file, 'read'):
+ obj = handler.load_from_fileobj(file, **kwargs)
+ else:
+ raise TypeError('"file" must be a filepath str or a file-object')
+ return obj
+
+
+def dump(obj, file=None, file_format=None, file_client_args=None, **kwargs):
+ """Dump data to json/yaml/pickle strings or files.
+
+ This method provides a unified api for dumping data as strings or to files,
+ and also supports custom arguments for each file format.
+
+ Note:
+ In v1.3.16 and later, ``dump`` supports dumping data as strings or to
+ files which can be saved to different backends.
+
+ Args:
+ obj (any): The python object to be dumped.
+ file (str or :obj:`Path` or file-like object, optional): If not
+ specified, then the object is dumped to a str, otherwise to a file
+ specified by the filename or file-like object.
+ file_format (str, optional): Same as :func:`load`.
+ file_client_args (dict, optional): Arguments to instantiate a
+ FileClient. See :class:`mmcv.fileio.FileClient` for details.
+ Default: None.
+
+ Examples:
+ >>> dump('hello world', '/path/of/your/file') # disk
+ >>> dump('hello world', 's3://path/of/your/file') # ceph or petrel
+
+ Returns:
+ str or None: If ``file`` is None, returns the dumped string;
+ otherwise returns None.
+ """
+ if isinstance(file, Path):
+ file = str(file)
+ if file_format is None:
+ if is_str(file):
+ file_format = file.split('.')[-1]
+ elif file is None:
+ raise ValueError(
+ 'file_format must be specified since file is None')
+ if file_format not in file_handlers:
+ raise TypeError(f'Unsupported format: {file_format}')
+
+ handler = file_handlers[file_format]
+ if file is None:
+ return handler.dump_to_str(obj, **kwargs)
+ elif is_str(file):
+ file_client = FileClient.infer_client(file_client_args, file)
+ if handler.str_like:
+ with StringIO() as f:
+ handler.dump_to_fileobj(obj, f, **kwargs)
+ file_client.put_text(f.getvalue(), file)
+ else:
+ with BytesIO() as f:
+ handler.dump_to_fileobj(obj, f, **kwargs)
+ file_client.put(f.getvalue(), file)
+ elif hasattr(file, 'write'):
+ handler.dump_to_fileobj(obj, file, **kwargs)
+ else:
+ raise TypeError('"file" must be a filename str or a file-object')
+
+
+def _register_handler(handler, file_formats):
+ """Register a handler for some file extensions.
+
+ Args:
+ handler (:obj:`BaseFileHandler`): Handler to be registered.
+ file_formats (str or list[str]): File formats to be handled by this
+ handler.
+ """
+ if not isinstance(handler, BaseFileHandler):
+ raise TypeError(
+ f'handler must be a child of BaseFileHandler, not {type(handler)}')
+ if isinstance(file_formats, str):
+ file_formats = [file_formats]
+ if not is_list_of(file_formats, str):
+ raise TypeError('file_formats must be a str or a list of str')
+ for ext in file_formats:
+ file_handlers[ext] = handler
+
+
+def register_handler(file_formats, **kwargs):
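+ """Register a handler for some file extensions.
+
+ A decorator version of :func:`_register_handler`; the decorated class
+ is instantiated with ``**kwargs`` and registered for ``file_formats``.
+
+ Args:
+ file_formats (str or list[str]): File formats to be handled by the
+ decorated handler class.
+ """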
+
+ def wrap(cls):
+ _register_handler(cls(**kwargs), file_formats)
+ return cls
+
+ return wrap
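+
+
+# A minimal usage sketch (illustrative, not part of this module): register a
+# hypothetical plain-text handler so that ``load``/``dump`` understand the
+# ``.txt`` extension:
+#
+#     @register_handler('txt')
+#     class TxtHandler(BaseFileHandler):
+#
+#         def load_from_fileobj(self, file, **kwargs):
+#             return file.read()
+#
+#         def dump_to_fileobj(self, obj, file, **kwargs):
+#             file.write(str(obj))
+#
+#         def dump_to_str(self, obj, **kwargs):
+#             return str(obj)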
diff --git a/mmcv/fileio/parse.py b/mmcv/fileio/parse.py
new file mode 100644
index 0000000000000000000000000000000000000000..f60f0d611b8d75692221d0edd7dc993b0a6445c9
--- /dev/null
+++ b/mmcv/fileio/parse.py
@@ -0,0 +1,97 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+
+from io import StringIO
+
+from .file_client import FileClient
+
+
+def list_from_file(filename,
+ prefix='',
+ offset=0,
+ max_num=0,
+ encoding='utf-8',
+ file_client_args=None):
+ """Load a text file and parse the content as a list of strings.
+
+ Note:
+ In v1.3.16 and later, ``list_from_file`` supports loading a text file
+ which can be stored in different backends and parsing the content as
+ a list of strings.
+
+ Args:
+ filename (str): Filename.
+ prefix (str): The prefix to be inserted at the beginning of each item.
+ offset (int): The offset of lines.
+ max_num (int): The maximum number of lines to be read,
+ zero and negative values mean no limitation.
+ encoding (str): Encoding used to open the file. Default: 'utf-8'.
+ file_client_args (dict, optional): Arguments to instantiate a
+ FileClient. See :class:`mmcv.fileio.FileClient` for details.
+ Default: None.
+
+ Examples:
+ >>> list_from_file('/path/of/your/file') # disk
+ ['hello', 'world']
+ >>> list_from_file('s3://path/of/your/file') # ceph or petrel
+ ['hello', 'world']
+
+ Returns:
+ list[str]: A list of strings.
+ """
+ cnt = 0
+ item_list = []
+ file_client = FileClient.infer_client(file_client_args, filename)
+ with StringIO(file_client.get_text(filename, encoding)) as f:
+ for _ in range(offset):
+ f.readline()
+ for line in f:
+ if 0 < max_num <= cnt:
+ break
+ item_list.append(prefix + line.rstrip('\n\r'))
+ cnt += 1
+ return item_list
+
+
+def dict_from_file(filename,
+ key_type=str,
+ encoding='utf-8',
+ file_client_args=None):
+ """Load a text file and parse the content as a dict.
+
+ Each line of the text file should contain two or more columns split by
+ whitespaces or tabs. The first column will be parsed as dict keys, and
+ the following columns will be parsed as dict values.
+
+ Note:
+ In v1.3.16 and later, ``dict_from_file`` supports loading a text file
+ which can be stored in different backends and parsing the content as
+ a dict.
+
+ Args:
+ filename (str): Filename.
+ key_type (type): Type of the dict keys. ``str`` is used by default and
+ type conversion will be performed if specified.
+ encoding (str): Encoding used to open the file. Default: 'utf-8'.
+ file_client_args (dict, optional): Arguments to instantiate a
+ FileClient. See :class:`mmcv.fileio.FileClient` for details.
+ Default: None.
+
+ Examples:
+ >>> dict_from_file('/path/of/your/file') # disk
+ {'key1': 'value1', 'key2': 'value2'}
+ >>> dict_from_file('s3://path/of/your/file') # ceph or petrel
+ {'key1': 'value1', 'key2': 'value2'}
+
+ Returns:
+ dict: The parsed contents.
+ """
+ mapping = {}
+ file_client = FileClient.infer_client(file_client_args, filename)
+ with StringIO(file_client.get_text(filename, encoding)) as f:
+ for line in f:
+ items = line.rstrip('\n').split()
+ assert len(items) >= 2
+ key = key_type(items[0])
+ val = items[1:] if len(items) > 2 else items[1]
+ mapping[key] = val
+ return mapping
diff --git a/mmcv/image/__init__.py b/mmcv/image/__init__.py
index 92ecec4046a6f5ee25b4ea07215ed7c7c810dcfa..d0051d609d3de4e7562e3fe638335c66617c4d91 100644
--- a/mmcv/image/__init__.py
+++ b/mmcv/image/__init__.py
@@ -9,10 +9,10 @@ from .geometric import (cutout, imcrop, imflip, imflip_, impad,
from .io import imfrombytes, imread, imwrite, supported_backends, use_backend
from .misc import tensor2imgs
from .photometric import (adjust_brightness, adjust_color, adjust_contrast,
- adjust_hue, adjust_lighting, adjust_sharpness,
- auto_contrast, clahe, imdenormalize, imequalize,
- iminvert, imnormalize, imnormalize_, lut_transform,
- posterize, solarize)
+ adjust_lighting, adjust_sharpness, auto_contrast,
+ clahe, imdenormalize, imequalize, iminvert,
+ imnormalize, imnormalize_, lut_transform, posterize,
+ solarize)
__all__ = [
'bgr2gray', 'bgr2hls', 'bgr2hsv', 'bgr2rgb', 'gray2bgr', 'gray2rgb',
@@ -24,6 +24,5 @@ __all__ = [
'solarize', 'rgb2ycbcr', 'bgr2ycbcr', 'ycbcr2rgb', 'ycbcr2bgr',
'tensor2imgs', 'imshear', 'imtranslate', 'adjust_color', 'imequalize',
'adjust_brightness', 'adjust_contrast', 'lut_transform', 'clahe',
- 'adjust_sharpness', 'auto_contrast', 'cutout', 'adjust_lighting',
- 'adjust_hue'
+ 'adjust_sharpness', 'auto_contrast', 'cutout', 'adjust_lighting'
]
diff --git a/mmcv/image/colorspace.py b/mmcv/image/colorspace.py
index 08f9952408c8e0bb38b17c10e2089e900ed418c2..814533952fdfda23d67cb6a3073692d8c1156add 100644
--- a/mmcv/image/colorspace.py
+++ b/mmcv/image/colorspace.py
@@ -1,11 +1,9 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Callable, Union
-
import cv2
import numpy as np
-def imconvert(img: np.ndarray, src: str, dst: str) -> np.ndarray:
+def imconvert(img, src, dst):
"""Convert an image from the src colorspace to dst colorspace.
Args:
@@ -21,7 +19,7 @@ def imconvert(img: np.ndarray, src: str, dst: str) -> np.ndarray:
return out_img
-def bgr2gray(img: np.ndarray, keepdim: bool = False) -> np.ndarray:
+def bgr2gray(img, keepdim=False):
"""Convert a BGR image to grayscale image.
Args:
@@ -38,7 +36,7 @@ def bgr2gray(img: np.ndarray, keepdim: bool = False) -> np.ndarray:
return out_img
-def rgb2gray(img: np.ndarray, keepdim: bool = False) -> np.ndarray:
+def rgb2gray(img, keepdim=False):
"""Convert a RGB image to grayscale image.
Args:
@@ -55,7 +53,7 @@ def rgb2gray(img: np.ndarray, keepdim: bool = False) -> np.ndarray:
return out_img
-def gray2bgr(img: np.ndarray) -> np.ndarray:
+def gray2bgr(img):
"""Convert a grayscale image to BGR image.
Args:
@@ -69,7 +67,7 @@ def gray2bgr(img: np.ndarray) -> np.ndarray:
return out_img
-def gray2rgb(img: np.ndarray) -> np.ndarray:
+def gray2rgb(img):
"""Convert a grayscale image to RGB image.
Args:
@@ -83,7 +81,7 @@ def gray2rgb(img: np.ndarray) -> np.ndarray:
return out_img
-def _convert_input_type_range(img: np.ndarray) -> np.ndarray:
+def _convert_input_type_range(img):
"""Convert the type and range of the input image.
It converts the input image to np.float32 type and range of [0, 1].
@@ -111,8 +109,7 @@ def _convert_input_type_range(img: np.ndarray) -> np.ndarray:
return img
-def _convert_output_type_range(
- img: np.ndarray, dst_type: Union[np.uint8, np.float32]) -> np.ndarray:
+def _convert_output_type_range(img, dst_type):
"""Convert the type and range of the image according to dst_type.
It converts the image to desired type and range. If `dst_type` is np.uint8,
@@ -143,7 +140,7 @@ def _convert_output_type_range(
return img.astype(dst_type)
-def rgb2ycbcr(img: np.ndarray, y_only: bool = False) -> np.ndarray:
+def rgb2ycbcr(img, y_only=False):
"""Convert a RGB image to YCbCr image.
This function produces the same results as Matlab's `rgb2ycbcr` function.
@@ -163,7 +160,7 @@ def rgb2ycbcr(img: np.ndarray, y_only: bool = False) -> np.ndarray:
Returns:
ndarray: The converted YCbCr image. The output image has the same type
- and range as input image.
+ and range as input image.
"""
img_type = img.dtype
img = _convert_input_type_range(img)
@@ -177,7 +174,7 @@ def rgb2ycbcr(img: np.ndarray, y_only: bool = False) -> np.ndarray:
return out_img
-def bgr2ycbcr(img: np.ndarray, y_only: bool = False) -> np.ndarray:
+def bgr2ycbcr(img, y_only=False):
"""Convert a BGR image to YCbCr image.
The bgr version of rgb2ycbcr.
@@ -197,7 +194,7 @@ def bgr2ycbcr(img: np.ndarray, y_only: bool = False) -> np.ndarray:
Returns:
ndarray: The converted YCbCr image. The output image has the same type
- and range as input image.
+ and range as input image.
"""
img_type = img.dtype
img = _convert_input_type_range(img)
@@ -211,7 +208,7 @@ def bgr2ycbcr(img: np.ndarray, y_only: bool = False) -> np.ndarray:
return out_img
-def ycbcr2rgb(img: np.ndarray) -> np.ndarray:
+def ycbcr2rgb(img):
"""Convert a YCbCr image to RGB image.
This function produces the same results as Matlab's ycbcr2rgb function.
@@ -230,7 +227,7 @@ def ycbcr2rgb(img: np.ndarray) -> np.ndarray:
Returns:
ndarray: The converted RGB image. The output image has the same type
- and range as input image.
+ and range as input image.
"""
img_type = img.dtype
img = _convert_input_type_range(img) * 255
@@ -243,7 +240,7 @@ def ycbcr2rgb(img: np.ndarray) -> np.ndarray:
return out_img
-def ycbcr2bgr(img: np.ndarray) -> np.ndarray:
+def ycbcr2bgr(img):
"""Convert a YCbCr image to BGR image.
The bgr version of ycbcr2rgb.
@@ -262,7 +259,7 @@ def ycbcr2bgr(img: np.ndarray) -> np.ndarray:
Returns:
ndarray: The converted BGR image. The output image has the same type
- and range as input image.
+ and range as input image.
"""
img_type = img.dtype
img = _convert_input_type_range(img) * 255
@@ -275,11 +272,11 @@ def ycbcr2bgr(img: np.ndarray) -> np.ndarray:
return out_img
-def convert_color_factory(src: str, dst: str) -> Callable:
+def convert_color_factory(src, dst):
code = getattr(cv2, f'COLOR_{src.upper()}2{dst.upper()}')
- def convert_color(img: np.ndarray) -> np.ndarray:
+ def convert_color(img):
out_img = cv2.cvtColor(img, code)
return out_img
diff --git a/mmcv/image/geometric.py b/mmcv/image/geometric.py
index f35299bf9ae2d91dd05d6778b5c9e192fce93d02..cf97c201cb4e43796c911919d03fb26a07ed817d 100644
--- a/mmcv/image/geometric.py
+++ b/mmcv/image/geometric.py
@@ -1,11 +1,10 @@
# Copyright (c) OpenMMLab. All rights reserved.
import numbers
-from typing import List, Optional, Tuple, Union, no_type_check
import cv2
import numpy as np
-from mmengine.utils import to_2tuple
+from ..utils import to_2tuple
from .io import imread_backend
try:
@@ -14,10 +13,7 @@ except ImportError:
Image = None
-def _scale_size(
- size: Tuple[int, int],
- scale: Union[float, int, tuple],
-) -> Tuple[int, int]:
+def _scale_size(size, scale):
"""Rescale a size by a ratio.
Args:
@@ -41,47 +37,23 @@ cv2_interp_codes = {
'lanczos': cv2.INTER_LANCZOS4
}
-cv2_border_modes = {
- 'constant': cv2.BORDER_CONSTANT,
- 'replicate': cv2.BORDER_REPLICATE,
- 'reflect': cv2.BORDER_REFLECT,
- 'wrap': cv2.BORDER_WRAP,
- 'reflect_101': cv2.BORDER_REFLECT_101,
- 'transparent': cv2.BORDER_TRANSPARENT,
- 'isolated': cv2.BORDER_ISOLATED
-}
-
-# Pillow >=v9.1.0 use a slightly different naming scheme for filters.
-# Set pillow_interp_codes according to the naming scheme used.
if Image is not None:
- if hasattr(Image, 'Resampling'):
- pillow_interp_codes = {
- 'nearest': Image.Resampling.NEAREST,
- 'bilinear': Image.Resampling.BILINEAR,
- 'bicubic': Image.Resampling.BICUBIC,
- 'box': Image.Resampling.BOX,
- 'lanczos': Image.Resampling.LANCZOS,
- 'hamming': Image.Resampling.HAMMING
- }
- else:
- pillow_interp_codes = {
- 'nearest': Image.NEAREST,
- 'bilinear': Image.BILINEAR,
- 'bicubic': Image.BICUBIC,
- 'box': Image.BOX,
- 'lanczos': Image.LANCZOS,
- 'hamming': Image.HAMMING
- }
-
-
-def imresize(
- img: np.ndarray,
- size: Tuple[int, int],
- return_scale: bool = False,
- interpolation: str = 'bilinear',
- out: Optional[np.ndarray] = None,
- backend: Optional[str] = None
-) -> Union[Tuple[np.ndarray, float, float], np.ndarray]:
+ pillow_interp_codes = {
+ 'nearest': Image.NEAREST,
+ 'bilinear': Image.BILINEAR,
+ 'bicubic': Image.BICUBIC,
+ 'box': Image.BOX,
+ 'lanczos': Image.LANCZOS,
+ 'hamming': Image.HAMMING
+ }
+
+
+def imresize(img,
+ size,
+ return_scale=False,
+ interpolation='bilinear',
+ out=None,
+ backend=None):
"""Resize image to a given size.
Args:
@@ -98,7 +70,7 @@ def imresize(
Returns:
tuple | ndarray: (`resized_img`, `w_scale`, `h_scale`) or
- `resized_img`.
+ `resized_img`.
"""
h, w = img.shape[:2]
if backend is None:
@@ -123,18 +95,15 @@ def imresize(
return resized_img, w_scale, h_scale
-@no_type_check
-def imresize_to_multiple(
- img: np.ndarray,
- divisor: Union[int, Tuple[int, int]],
- size: Union[int, Tuple[int, int], None] = None,
- scale_factor: Union[float, Tuple[float, float], None] = None,
- keep_ratio: bool = False,
- return_scale: bool = False,
- interpolation: str = 'bilinear',
- out: Optional[np.ndarray] = None,
- backend: Optional[str] = None
-) -> Union[Tuple[np.ndarray, float, float], np.ndarray]:
+def imresize_to_multiple(img,
+ divisor,
+ size=None,
+ scale_factor=None,
+ keep_ratio=False,
+ return_scale=False,
+ interpolation='bilinear',
+ out=None,
+ backend=None):
"""Resize image according to a given size or scale factor and then rounds
up the resized or rescaled image size to the nearest value that can be
divided by the divisor.
@@ -161,7 +130,7 @@ def imresize_to_multiple(
Returns:
tuple | ndarray: (`resized_img`, `w_scale`, `h_scale`) or
- `resized_img`.
+ `resized_img`.
"""
h, w = img.shape[:2]
if size is not None and scale_factor is not None:
@@ -176,7 +145,7 @@ def imresize_to_multiple(
size = _scale_size((w, h), scale_factor)
divisor = to_2tuple(divisor)
- size = tuple(int(np.ceil(s / d)) * d for s, d in zip(size, divisor))
+ size = tuple([int(np.ceil(s / d)) * d for s, d in zip(size, divisor)])
resized_img, w_scale, h_scale = imresize(
img,
size,
@@ -190,13 +159,11 @@ def imresize_to_multiple(
return resized_img
-def imresize_like(
- img: np.ndarray,
- dst_img: np.ndarray,
- return_scale: bool = False,
- interpolation: str = 'bilinear',
- backend: Optional[str] = None
-) -> Union[Tuple[np.ndarray, float, float], np.ndarray]:
+def imresize_like(img,
+ dst_img,
+ return_scale=False,
+ interpolation='bilinear',
+ backend=None):
"""Resize image to the same size of a given image.
Args:
@@ -208,15 +175,13 @@ def imresize_like(
Returns:
tuple or ndarray: (`resized_img`, `w_scale`, `h_scale`) or
- `resized_img`.
+ `resized_img`.
"""
h, w = dst_img.shape[:2]
return imresize(img, (w, h), return_scale, interpolation, backend=backend)
-def rescale_size(old_size: tuple,
- scale: Union[float, int, tuple],
- return_scale: bool = False) -> tuple:
+def rescale_size(old_size, scale, return_scale=False):
"""Calculate the new size to be rescaled to.
Args:
@@ -253,13 +218,11 @@ def rescale_size(old_size: tuple,
return new_size
-def imrescale(
- img: np.ndarray,
- scale: Union[float, Tuple[int, int]],
- return_scale: bool = False,
- interpolation: str = 'bilinear',
- backend: Optional[str] = None
-) -> Union[np.ndarray, Tuple[np.ndarray, float]]:
+def imrescale(img,
+ scale,
+ return_scale=False,
+ interpolation='bilinear',
+ backend=None):
"""Resize image while keeping the aspect ratio.
Args:
@@ -286,7 +249,7 @@ def imrescale(
return rescaled_img
-def imflip(img: np.ndarray, direction: str = 'horizontal') -> np.ndarray:
+def imflip(img, direction='horizontal'):
"""Flip an image horizontally or vertically.
Args:
@@ -306,7 +269,7 @@ def imflip(img: np.ndarray, direction: str = 'horizontal') -> np.ndarray:
return np.flip(img, axis=(0, 1))
-def imflip_(img: np.ndarray, direction: str = 'horizontal') -> np.ndarray:
+def imflip_(img, direction='horizontal'):
"""Inplace flip an image horizontally or vertically.
Args:
@@ -326,33 +289,30 @@ def imflip_(img: np.ndarray, direction: str = 'horizontal') -> np.ndarray:
return cv2.flip(img, -1, img)
-def imrotate(img: np.ndarray,
- angle: float,
- center: Optional[Tuple[float, float]] = None,
- scale: float = 1.0,
- border_value: int = 0,
- interpolation: str = 'bilinear',
- auto_bound: bool = False,
- border_mode: str = 'constant') -> np.ndarray:
+def imrotate(img,
+ angle,
+ center=None,
+ scale=1.0,
+ border_value=0,
+ interpolation='bilinear',
+ auto_bound=False):
"""Rotate an image.
Args:
- img (np.ndarray): Image to be rotated.
+ img (ndarray): Image to be rotated.
angle (float): Rotation angle in degrees, positive values mean
clockwise rotation.
center (tuple[float], optional): Center point (w, h) of the rotation in
the source image. If not specified, the center of the image will be
used.
scale (float): Isotropic scale factor.
- border_value (int): Border value used in case of a constant border.
- Defaults to 0.
+ border_value (int): Border value.
interpolation (str): Same as :func:`resize`.
auto_bound (bool): Whether to adjust the image size to cover the whole
rotated image.
- border_mode (str): Pixel extrapolation method. Defaults to 'constant'.
Returns:
- np.ndarray: The rotated image.
+ ndarray: The rotated image.
"""
if center is not None and auto_bound:
raise ValueError('`auto_bound` conflicts with `center`')
@@ -375,12 +335,11 @@ def imrotate(img: np.ndarray,
img,
matrix, (w, h),
flags=cv2_interp_codes[interpolation],
- borderMode=cv2_border_modes[border_mode],
borderValue=border_value)
return rotated
-def bbox_clip(bboxes: np.ndarray, img_shape: Tuple[int, int]) -> np.ndarray:
+def bbox_clip(bboxes, img_shape):
"""Clip bboxes to fit the image shape.
Args:
@@ -398,9 +357,7 @@ def bbox_clip(bboxes: np.ndarray, img_shape: Tuple[int, int]) -> np.ndarray:
return clipped_bboxes
-def bbox_scaling(bboxes: np.ndarray,
- scale: float,
- clip_shape: Optional[Tuple[int, int]] = None) -> np.ndarray:
+def bbox_scaling(bboxes, scale, clip_shape=None):
"""Scaling bboxes w.r.t the box center.
Args:
@@ -426,12 +383,7 @@ def bbox_scaling(bboxes: np.ndarray,
return scaled_bboxes
-def imcrop(
- img: np.ndarray,
- bboxes: np.ndarray,
- scale: float = 1.0,
- pad_fill: Union[float, list, None] = None
-) -> Union[np.ndarray, List[np.ndarray]]:
+def imcrop(img, bboxes, scale=1.0, pad_fill=None):
"""Crop image patches.
3 steps: scale the bboxes -> clip bboxes -> crop and pad.
@@ -440,7 +392,7 @@ def imcrop(
img (ndarray): Image to be cropped.
bboxes (ndarray): Shape (k, 4) or (4, ), location of cropped bboxes.
scale (float, optional): Scale ratio of bboxes, the default value
- 1.0 means no scaling.
+ 1.0 means no padding.
pad_fill (Number | list[Number]): Value to be filled for padding.
Default: None, which means no padding.
@@ -464,12 +416,10 @@ def imcrop(
patch = img[y1:y2 + 1, x1:x2 + 1, ...]
else:
_x1, _y1, _x2, _y2 = tuple(scaled_bboxes[i, :])
- patch_h = _y2 - _y1 + 1
- patch_w = _x2 - _x1 + 1
if chn == 1:
- patch_shape = (patch_h, patch_w)
+ patch_shape = (_y2 - _y1 + 1, _x2 - _x1 + 1)
else:
- patch_shape = (patch_h, patch_w, chn) # type: ignore
+ patch_shape = (_y2 - _y1 + 1, _x2 - _x1 + 1, chn)
patch = np.array(
pad_fill, dtype=img.dtype) * np.ones(
patch_shape, dtype=img.dtype)
@@ -487,12 +437,12 @@ def imcrop(
return patches
-def impad(img: np.ndarray,
+def impad(img,
*,
- shape: Optional[Tuple[int, int]] = None,
- padding: Union[int, tuple, None] = None,
- pad_val: Union[float, List] = 0,
- padding_mode: str = 'constant') -> np.ndarray:
+ shape=None,
+ padding=None,
+ pad_val=0,
+ padding_mode='constant'):
"""Pad the given image to a certain shape or pad on all sides with
specified padding mode and padding value.
@@ -512,16 +462,16 @@ def impad(img: np.ndarray,
reflect or symmetric. Default: constant.
- constant: pads with a constant value, this value is specified
- with pad_val.
+ with pad_val.
- edge: pads with the last value at the edge of the image.
- - reflect: pads with reflection of image without repeating the last
- value on the edge. For example, padding [1, 2, 3, 4] with 2
- elements on both sides in reflect mode will result in
- [3, 2, 1, 2, 3, 4, 3, 2].
- - symmetric: pads with reflection of image repeating the last value
- on the edge. For example, padding [1, 2, 3, 4] with 2 elements on
- both sides in symmetric mode will result in
- [2, 1, 1, 2, 3, 4, 4, 3]
+ - reflect: pads with reflection of image without repeating the
+ last value on the edge. For example, padding [1, 2, 3, 4]
+ with 2 elements on both sides in reflect mode will result
+ in [3, 2, 1, 2, 3, 4, 3, 2].
+ - symmetric: pads with reflection of image repeating the last
+ value on the edge. For example, padding [1, 2, 3, 4] with
+ 2 elements on both sides in symmetric mode will result in
+ [2, 1, 1, 2, 3, 4, 4, 3]
Returns:
ndarray: The padded image.
@@ -529,9 +479,7 @@ def impad(img: np.ndarray,
assert (shape is not None) ^ (padding is not None)
if shape is not None:
- width = max(shape[1] - img.shape[1], 0)
- height = max(shape[0] - img.shape[0], 0)
- padding = (0, 0, width, height)
+ padding = (0, 0, shape[1] - img.shape[1], shape[0] - img.shape[0])
# check pad_val
if isinstance(pad_val, tuple):
@@ -571,9 +519,7 @@ def impad(img: np.ndarray,
return img
-def impad_to_multiple(img: np.ndarray,
- divisor: int,
- pad_val: Union[float, List] = 0) -> np.ndarray:
+def impad_to_multiple(img, divisor, pad_val=0):
"""Pad an image to ensure each edge to be multiple to some number.
Args:
@@ -589,9 +535,7 @@ def impad_to_multiple(img: np.ndarray,
return impad(img, shape=(pad_h, pad_w), pad_val=pad_val)
-def cutout(img: np.ndarray,
- shape: Union[int, Tuple[int, int]],
- pad_val: Union[int, float, tuple] = 0) -> np.ndarray:
+def cutout(img, shape, pad_val=0):
"""Randomly cut out a rectangle from the original img.
Args:
@@ -635,7 +579,7 @@ def cutout(img: np.ndarray,
if img.ndim == 2:
patch_shape = (y2 - y1, x2 - x1)
else:
- patch_shape = (y2 - y1, x2 - x1, channels) # type: ignore
+ patch_shape = (y2 - y1, x2 - x1, channels)
img_cutout = img.copy()
patch = np.array(
@@ -646,8 +590,7 @@ def cutout(img: np.ndarray,
return img_cutout
-def _get_shear_matrix(magnitude: Union[int, float],
- direction: str = 'horizontal') -> np.ndarray:
+def _get_shear_matrix(magnitude, direction='horizontal'):
"""Generate the shear matrix for transformation.
Args:
@@ -665,11 +608,11 @@ def _get_shear_matrix(magnitude: Union[int, float],
return shear_matrix
-def imshear(img: np.ndarray,
- magnitude: Union[int, float],
- direction: str = 'horizontal',
- border_value: Union[int, Tuple[int, int]] = 0,
- interpolation: str = 'bilinear') -> np.ndarray:
+def imshear(img,
+ magnitude,
+ direction='horizontal',
+ border_value=0,
+ interpolation='bilinear'):
"""Shear an image.
Args:
@@ -693,7 +636,7 @@ def imshear(img: np.ndarray,
elif img.ndim == 3:
channels = img.shape[-1]
if isinstance(border_value, int):
- border_value = tuple([border_value] * channels) # type: ignore
+ border_value = tuple([border_value] * channels)
elif isinstance(border_value, tuple):
assert len(border_value) == channels, \
'Expected the num of elements in tuple equals the channels' \
@@ -711,13 +654,12 @@ def imshear(img: np.ndarray,
# greater than 3 (e.g. shearing masks whose channels large
# than 3) will raise TypeError in `cv2.warpAffine`.
# Here simply slice the first 3 values in `border_value`.
- borderValue=border_value[:3], # type: ignore
+ borderValue=border_value[:3],
flags=cv2_interp_codes[interpolation])
return sheared
-def _get_translate_matrix(offset: Union[int, float],
- direction: str = 'horizontal') -> np.ndarray:
+def _get_translate_matrix(offset, direction='horizontal'):
"""Generate the translate matrix.
Args:
@@ -735,11 +677,11 @@ def _get_translate_matrix(offset: Union[int, float],
return translate_matrix
-def imtranslate(img: np.ndarray,
- offset: Union[int, float],
- direction: str = 'horizontal',
- border_value: Union[int, tuple] = 0,
- interpolation: str = 'bilinear') -> np.ndarray:
+def imtranslate(img,
+ offset,
+ direction='horizontal',
+ border_value=0,
+ interpolation='bilinear'):
"""Translate an image.
Args:
diff --git a/mmcv/image/io.py b/mmcv/image/io.py
index e10d443da6554865afc98cb2441a0cc8eddf0e16..d47aaa845256e4e991582a939733c45d62a4de38 100644
--- a/mmcv/image/io.py
+++ b/mmcv/image/io.py
@@ -1,16 +1,14 @@
# Copyright (c) OpenMMLab. All rights reserved.
import io
import os.path as osp
-import warnings
from pathlib import Path
-from typing import Optional, Union
import cv2
-import mmengine.fileio as fileio
import numpy as np
from cv2 import (IMREAD_COLOR, IMREAD_GRAYSCALE, IMREAD_IGNORE_ORIENTATION,
IMREAD_UNCHANGED)
-from mmengine.utils import is_filepath, is_str
+
+from mmcv.utils import check_file_exist, is_str, mkdir_or_exist
try:
from turbojpeg import TJCS_RGB, TJPF_BGR, TJPF_GRAY, TurboJPEG
@@ -42,7 +40,7 @@ imread_flags = {
imread_backend = 'cv2'
-def use_backend(backend: str) -> None:
+def use_backend(backend):
"""Select a backend for image decoding.
Args:
@@ -68,7 +66,7 @@ def use_backend(backend: str) -> None:
raise ImportError('`tifffile` is not installed')
-def _jpegflag(flag: str = 'color', channel_order: str = 'bgr'):
+def _jpegflag(flag='color', channel_order='bgr'):
channel_order = channel_order.lower()
if channel_order not in ['rgb', 'bgr']:
raise ValueError('channel order must be either "rgb" or "bgr"')
@@ -84,9 +82,7 @@ def _jpegflag(flag: str = 'color', channel_order: str = 'bgr'):
raise ValueError('flag must be "color" or "grayscale"')
-def _pillow2array(img,
- flag: str = 'color',
- channel_order: str = 'bgr') -> np.ndarray:
+def _pillow2array(img, flag='color', channel_order='bgr'):
"""Convert a pillow image to numpy array.
Args:
@@ -141,13 +137,7 @@ def _pillow2array(img,
return array
-def imread(img_or_path: Union[np.ndarray, str, Path],
- flag: str = 'color',
- channel_order: str = 'bgr',
- backend: Optional[str] = None,
- file_client_args: Optional[dict] = None,
- *,
- backend_args: Optional[dict] = None) -> np.ndarray:
+def imread(img_or_path, flag='color', channel_order='bgr', backend=None):
"""Read an image.
Args:
@@ -167,117 +157,78 @@ def imread(img_or_path: Union[np.ndarray, str, Path],
`cv2`, `pillow`, `turbojpeg`, `tifffile`, `None`.
If backend is None, the global imread_backend specified by
``mmcv.use_backend()`` will be used. Default: None.
- file_client_args (dict, optional): Arguments to instantiate a
- FileClient. See :class:`mmengine.fileio.FileClient` for details.
- Default: None. It will be deprecated in future. Please use
- ``backend_args`` instead.
- Deprecated in version 2.0.0rc4.
- backend_args (dict, optional): Instantiates the corresponding file
- backend. It may contain `backend` key to specify the file
- backend. If it contains, the file backend corresponding to this
- value will be used and initialized with the remaining values,
- otherwise the corresponding file backend will be selected
- based on the prefix of the file path. Defaults to None.
- New in version 2.0.0rc4.
Returns:
ndarray: Loaded image array.
-
- Examples:
- >>> import mmcv
- >>> img_path = '/path/to/img.jpg'
- >>> img = mmcv.imread(img_path)
- >>> img = mmcv.imread(img_path, flag='color', channel_order='rgb',
- ... backend='cv2')
- >>> img = mmcv.imread(img_path, flag='color', channel_order='bgr',
- ... backend='pillow')
- >>> s3_img_path = 's3://bucket/img.jpg'
- >>> # infer the file backend by the prefix s3
- >>> img = mmcv.imread(s3_img_path)
- >>> # manually set the file backend petrel
- >>> img = mmcv.imread(s3_img_path, backend_args={
- ... 'backend': 'petrel'})
- >>> http_img_path = 'http://path/to/img.jpg'
- >>> img = mmcv.imread(http_img_path)
- >>> img = mmcv.imread(http_img_path, backend_args={
- ... 'backend': 'http'})
"""
- if file_client_args is not None:
- warnings.warn(
- '"file_client_args" will be deprecated in future. '
- 'Please use "backend_args" instead', DeprecationWarning)
- if backend_args is not None:
- raise ValueError(
- '"file_client_args" and "backend_args" cannot be set at the '
- 'same time.')
+ if backend is None:
+ backend = imread_backend
+ if backend not in supported_backends:
+ raise ValueError(f'backend: {backend} is not supported. Supported '
+ "backends are 'cv2', 'turbojpeg', 'pillow'")
if isinstance(img_or_path, Path):
img_or_path = str(img_or_path)
if isinstance(img_or_path, np.ndarray):
return img_or_path
elif is_str(img_or_path):
- if file_client_args is not None:
- file_client = fileio.FileClient.infer_client(
- file_client_args, img_or_path)
- img_bytes = file_client.get(img_or_path)
+ check_file_exist(img_or_path,
+ f'img file does not exist: {img_or_path}')
+ if backend == 'turbojpeg':
+ with open(img_or_path, 'rb') as in_file:
+ img = jpeg.decode(in_file.read(),
+ _jpegflag(flag, channel_order))
+ if img.shape[-1] == 1:
+ img = img[:, :, 0]
+ return img
+ elif backend == 'pillow':
+ img = Image.open(img_or_path)
+ img = _pillow2array(img, flag, channel_order)
+ return img
+ elif backend == 'tifffile':
+ img = tifffile.imread(img_or_path)
+ return img
else:
- img_bytes = fileio.get(img_or_path, backend_args=backend_args)
- return imfrombytes(img_bytes, flag, channel_order, backend)
+ flag = imread_flags[flag] if is_str(flag) else flag
+ img = cv2.imread(img_or_path, flag)
+ if flag == IMREAD_COLOR and channel_order == 'rgb':
+ cv2.cvtColor(img, cv2.COLOR_BGR2RGB, img)
+ return img
else:
raise TypeError('"img" must be a numpy array or a str or '
'a pathlib.Path object')
-def imfrombytes(content: bytes,
- flag: str = 'color',
- channel_order: str = 'bgr',
- backend: Optional[str] = None) -> np.ndarray:
+def imfrombytes(content, flag='color', channel_order='bgr', backend=None):
"""Read an image from bytes.
Args:
content (bytes): Image bytes got from files or other streams.
flag (str): Same as :func:`imread`.
- channel_order (str): The channel order of the output, candidates
- are 'bgr' and 'rgb'. Default to 'bgr'.
backend (str | None): The image decoding backend type. Options are
- `cv2`, `pillow`, `turbojpeg`, `tifffile`, `None`. If backend is
- None, the global imread_backend specified by ``mmcv.use_backend()``
- will be used. Default: None.
+ `cv2`, `pillow`, `turbojpeg`, `None`. If backend is None, the
+ global imread_backend specified by ``mmcv.use_backend()`` will be
+ used. Default: None.
Returns:
ndarray: Loaded image array.
-
- Examples:
- >>> img_path = '/path/to/img.jpg'
- >>> with open(img_path, 'rb') as f:
- >>> img_buff = f.read()
- >>> img = mmcv.imfrombytes(img_buff)
- >>> img = mmcv.imfrombytes(img_buff, flag='color', channel_order='rgb')
- >>> img = mmcv.imfrombytes(img_buff, backend='pillow')
- >>> img = mmcv.imfrombytes(img_buff, backend='cv2')
"""
if backend is None:
backend = imread_backend
if backend not in supported_backends:
- raise ValueError(
- f'backend: {backend} is not supported. Supported '
- "backends are 'cv2', 'turbojpeg', 'pillow', 'tifffile'")
+ raise ValueError(f'backend: {backend} is not supported. Supported '
+ "backends are 'cv2', 'turbojpeg', 'pillow'")
if backend == 'turbojpeg':
- img = jpeg.decode( # type: ignore
- content, _jpegflag(flag, channel_order))
+ img = jpeg.decode(content, _jpegflag(flag, channel_order))
if img.shape[-1] == 1:
img = img[:, :, 0]
return img
elif backend == 'pillow':
- with io.BytesIO(content) as buff:
- img = Image.open(buff)
- img = _pillow2array(img, flag, channel_order)
- return img
- elif backend == 'tifffile':
- with io.BytesIO(content) as buff:
- img = tifffile.imread(buff)
+ buff = io.BytesIO(content)
+ img = Image.open(buff)
+ img = _pillow2array(img, flag, channel_order)
return img
else:
img_np = np.frombuffer(content, np.uint8)
@@ -288,77 +239,20 @@ def imfrombytes(content: bytes,
return img
-def imwrite(img: np.ndarray,
- file_path: str,
- params: Optional[list] = None,
- auto_mkdir: Optional[bool] = None,
- file_client_args: Optional[dict] = None,
- *,
- backend_args: Optional[dict] = None) -> bool:
+def imwrite(img, file_path, params=None, auto_mkdir=True):
"""Write image to file.
- Warning:
- The parameter `auto_mkdir` will be deprecated in the future and every
- file clients will make directory automatically.
-
Args:
img (ndarray): Image array to be written.
file_path (str): Image file path.
params (None or list): Same as opencv :func:`imwrite` interface.
auto_mkdir (bool): If the parent folder of `file_path` does not exist,
- whether to create it automatically. It will be deprecated.
- file_client_args (dict, optional): Arguments to instantiate a
- FileClient. See :class:`mmengine.fileio.FileClient` for details.
- Default: None. It will be deprecated in future. Please use
- ``backend_args`` instead.
- Deprecated in version 2.0.0rc4.
- backend_args (dict, optional): Instantiates the corresponding file
- backend. It may contain `backend` key to specify the file
- backend. If it contains, the file backend corresponding to this
- value will be used and initialized with the remaining values,
- otherwise the corresponding file backend will be selected
- based on the prefix of the file path. Defaults to None.
- New in version 2.0.0rc4.
+ whether to create it automatically.
Returns:
bool: Successful or not.
-
- Examples:
- >>> # write to hard disk client
- >>> ret = mmcv.imwrite(img, '/path/to/img.jpg')
- >>> # infer the file backend by the prefix s3
- >>> ret = mmcv.imwrite(img, 's3://bucket/img.jpg')
- >>> # manually set the file backend petrel
- >>> ret = mmcv.imwrite(img, 's3://bucket/img.jpg', backend_args={
- ... 'backend': 'petrel'})
"""
- if file_client_args is not None:
- warnings.warn(
- '"file_client_args" will be deprecated in future. '
- 'Please use "backend_args" instead', DeprecationWarning)
- if backend_args is not None:
- raise ValueError(
- '"file_client_args" and "backend_args" cannot be set at the '
- 'same time.')
-
- assert is_filepath(file_path)
- file_path = str(file_path)
- if auto_mkdir is not None:
- warnings.warn(
- 'The parameter `auto_mkdir` will be deprecated in the future and '
- 'every file clients will make directory automatically.')
-
- img_ext = osp.splitext(file_path)[-1]
- # Encode image according to image suffix.
- # For example, if image path is '/path/your/img.jpg', the encode
- # format is '.jpg'.
- flag, img_buff = cv2.imencode(img_ext, img, params)
-
- if file_client_args is not None:
- file_client = fileio.FileClient.infer_client(file_client_args,
- file_path)
- file_client.put(img_buff.tobytes(), file_path)
- else:
- fileio.put(img_buff.tobytes(), file_path, backend_args=backend_args)
-
- return flag
+ if auto_mkdir:
+ dir_name = osp.abspath(osp.dirname(file_path))
+ mkdir_or_exist(dir_name)
+ return cv2.imwrite(file_path, img, params)
diff --git a/mmcv/image/misc.py b/mmcv/image/misc.py
index e923cad4e5f7d210640ee51291a48d82c3b84c32..dfc4a9c6e4c073a672a9a94a06bf0bf2a418c228 100644
--- a/mmcv/image/misc.py
+++ b/mmcv/image/misc.py
@@ -1,6 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Optional
-
import numpy as np
import mmcv
@@ -11,24 +9,18 @@ except ImportError:
torch = None
-def tensor2imgs(tensor,
- mean: Optional[tuple] = None,
- std: Optional[tuple] = None,
- to_rgb: bool = True) -> list:
- """Convert tensor to 3-channel images or 1-channel gray images.
+def tensor2imgs(tensor, mean=(0, 0, 0), std=(1, 1, 1), to_rgb=True):
+ """Convert tensor to 3-channel images.
Args:
tensor (torch.Tensor): Tensor that contains multiple images, shape (
- N, C, H, W). :math:`C` can be either 3 or 1.
- mean (tuple[float], optional): Mean of images. If None,
- (0, 0, 0) will be used for tensor with 3-channel,
- while (0, ) for tensor with 1-channel. Defaults to None.
- std (tuple[float], optional): Standard deviation of images. If None,
- (1, 1, 1) will be used for tensor with 3-channel,
- while (1, ) for tensor with 1-channel. Defaults to None.
+ N, C, H, W).
+ mean (tuple[float], optional): Mean of images. Defaults to (0, 0, 0).
+ std (tuple[float], optional): Standard deviation of images.
+ Defaults to (1, 1, 1).
to_rgb (bool, optional): Whether the tensor was converted to RGB
format in the first place. If so, convert it back to BGR.
- For the tensor with 1 channel, it must be False. Defaults to True.
+ Defaults to True.
Returns:
list[np.ndarray]: A list that contains multiple images.
@@ -37,14 +29,8 @@ def tensor2imgs(tensor,
if torch is None:
raise RuntimeError('pytorch is not installed')
assert torch.is_tensor(tensor) and tensor.ndim == 4
- channels = tensor.size(1)
- assert channels in [1, 3]
- if mean is None:
- mean = (0, ) * channels
- if std is None:
- std = (1, ) * channels
- assert (channels == len(mean) == len(std) == 3) or \
- (channels == len(mean) == len(std) == 1 and not to_rgb)
+ assert len(mean) == 3
+ assert len(std) == 3
num_imgs = tensor.size(0)
mean = np.array(mean, dtype=np.float32)
diff --git a/mmcv/image/photometric.py b/mmcv/image/photometric.py
index 12cbb90822564bf14cd5176cc3c5532220db40da..5085d012019c0cbf56f66f421a378278c1a058ae 100644
--- a/mmcv/image/photometric.py
+++ b/mmcv/image/photometric.py
@@ -1,14 +1,9 @@
# Copyright (c) OpenMMLab. All rights reserved.
-import warnings
-from typing import Optional
-
import cv2
import numpy as np
-from mmengine.utils import is_tuple_of
-from PIL import Image, ImageEnhance
+from ..utils import is_tuple_of
from .colorspace import bgr2gray, gray2bgr
-from .io import imread_backend
def imnormalize(img, mean, std, to_rgb=True):
@@ -102,7 +97,7 @@ def posterize(img, bits):
return img
-def adjust_color(img, alpha=1, beta=None, gamma=0, backend=None):
+def adjust_color(img, alpha=1, beta=None, gamma=0):
r"""It blends the source image and its gray image:
.. math::
@@ -115,41 +110,22 @@ def adjust_color(img, alpha=1, beta=None, gamma=0, backend=None):
If None, it's assigned the value (1 - `alpha`).
gamma (int | float): Scalar added to each sum.
Same as :func:`cv2.addWeighted`. Default 0.
- backend (str | None): The image processing backend type. Options are
- `cv2`, `pillow`, `None`. If backend is None, the global
- ``imread_backend`` specified by ``mmcv.use_backend()`` will be
- used. Defaults to None.
Returns:
ndarray: Colored image which has the same size and dtype as input.
"""
- if backend is None:
- backend = imread_backend
- if backend not in ['cv2', 'pillow']:
- raise ValueError(f'backend: {backend} is not supported.'
- f"Supported backends are 'cv2', 'pillow'")
-
- if backend == 'pillow':
- assert img.dtype == np.uint8, 'Pillow backend only support uint8 type'
- warnings.warn("Only use 'alpha' for pillow backend.")
- # Image.fromarray defaultly supports RGB, not BGR.
- pil_image = Image.fromarray(img[..., ::-1], mode='RGB')
- enhancer = ImageEnhance.Color(pil_image)
- pil_image = enhancer.enhance(alpha)
- return np.array(pil_image, dtype=img.dtype)[..., ::-1]
- else:
- gray_img = bgr2gray(img)
- gray_img = np.tile(gray_img[..., None], [1, 1, 3])
- if beta is None:
- beta = 1 - alpha
- colored_img = cv2.addWeighted(img, alpha, gray_img, beta, gamma)
- if not colored_img.dtype == np.uint8:
- # Note when the dtype of `img` is not the default `np.uint8`
- # (e.g. np.float32), the value in `colored_img` got from cv2
- # is not guaranteed to be in range [0, 255], so here clip
- # is needed.
- colored_img = np.clip(colored_img, 0, 255)
- return colored_img.astype(img.dtype)
+ gray_img = bgr2gray(img)
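+    # tile the single gray channel to 3 channels so the weighted blend
+    # below operates on arrays of the same shape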
+ gray_img = np.tile(gray_img[..., None], [1, 1, 3])
+ if beta is None:
+ beta = 1 - alpha
+ colored_img = cv2.addWeighted(img, alpha, gray_img, beta, gamma)
+ if not colored_img.dtype == np.uint8:
+ # Note when the dtype of `img` is not the default `np.uint8`
+ # (e.g. np.float32), the value in `colored_img` got from cv2
+ # is not guaranteed to be in range [0, 255], so here clip
+ # is needed.
+ colored_img = np.clip(colored_img, 0, 255)
+ return colored_img
def imequalize(img):
@@ -197,7 +173,7 @@ def imequalize(img):
return equalized_img.astype(img.dtype)
-def adjust_brightness(img, factor=1., backend=None):
+def adjust_brightness(img, factor=1.):
"""Adjust image brightness.
This function controls the brightness of an image. An
@@ -214,40 +190,22 @@ def adjust_brightness(img, factor=1., backend=None):
Factor 1.0 returns the original image, lower
factors mean less color (brightness, contrast,
etc), and higher values more. Default 1.
- backend (str | None): The image processing backend type. Options are
- `cv2`, `pillow`, `None`. If backend is None, the global
- ``imread_backend`` specified by ``mmcv.use_backend()`` will be
- used. Defaults to None.
Returns:
ndarray: The brightened image.
"""
- if backend is None:
- backend = imread_backend
- if backend not in ['cv2', 'pillow']:
- raise ValueError(f'backend: {backend} is not supported.'
- f"Supported backends are 'cv2', 'pillow'")
-
- if backend == 'pillow':
- assert img.dtype == np.uint8, 'Pillow backend only support uint8 type'
- # Image.fromarray defaultly supports RGB, not BGR.
- pil_image = Image.fromarray(img[..., ::-1], mode='RGB')
- enhancer = ImageEnhance.Brightness(pil_image)
- pil_image = enhancer.enhance(factor)
- return np.array(pil_image, dtype=img.dtype)[..., ::-1]
- else:
- degenerated = np.zeros_like(img)
- # Note manually convert the dtype to np.float32, to
- # achieve as close results as PIL.ImageEnhance.Brightness.
- # Set beta=1-factor, and gamma=0
- brightened_img = cv2.addWeighted(
- img.astype(np.float32), factor, degenerated.astype(np.float32),
- 1 - factor, 0)
- brightened_img = np.clip(brightened_img, 0, 255)
- return brightened_img.astype(img.dtype)
-
-
-def adjust_contrast(img, factor=1., backend=None):
+ degenerated = np.zeros_like(img)
+ # Note manually convert the dtype to np.float32, to
+ # achieve as close results as PIL.ImageEnhance.Brightness.
+ # Set beta=1-factor, and gamma=0
+ brightened_img = cv2.addWeighted(
+ img.astype(np.float32), factor, degenerated.astype(np.float32),
+ 1 - factor, 0)
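+    # clip to the valid [0, 255] range before casting back to the
+    # original dtype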
+ brightened_img = np.clip(brightened_img, 0, 255)
+ return brightened_img.astype(img.dtype)
+
+
+def adjust_contrast(img, factor=1.):
"""Adjust image contrast.
This function controls the contrast of an image. An
@@ -261,38 +219,20 @@ def adjust_contrast(img, factor=1., backend=None):
Args:
img (ndarray): Image to be contrasted. BGR order.
factor (float): Same as :func:`mmcv.adjust_brightness`.
- backend (str | None): The image processing backend type. Options are
- `cv2`, `pillow`, `None`. If backend is None, the global
- ``imread_backend`` specified by ``mmcv.use_backend()`` will be
- used. Defaults to None.
Returns:
ndarray: The contrasted image.
"""
- if backend is None:
- backend = imread_backend
- if backend not in ['cv2', 'pillow']:
- raise ValueError(f'backend: {backend} is not supported.'
- f"Supported backends are 'cv2', 'pillow'")
-
- if backend == 'pillow':
- assert img.dtype == np.uint8, 'Pillow backend only support uint8 type'
- # Image.fromarray defaultly supports RGB, not BGR.
- pil_image = Image.fromarray(img[..., ::-1], mode='RGB')
- enhancer = ImageEnhance.Contrast(pil_image)
- pil_image = enhancer.enhance(factor)
- return np.array(pil_image, dtype=img.dtype)[..., ::-1]
- else:
- gray_img = bgr2gray(img)
- hist = np.histogram(gray_img, 256, (0, 255))[0]
- mean = round(np.sum(gray_img) / np.sum(hist))
- degenerated = (np.ones_like(img[..., 0]) * mean).astype(img.dtype)
- degenerated = gray2bgr(degenerated)
- contrasted_img = cv2.addWeighted(
- img.astype(np.float32), factor, degenerated.astype(np.float32),
- 1 - factor, 0)
- contrasted_img = np.clip(contrasted_img, 0, 255)
- return contrasted_img.astype(img.dtype)
+ gray_img = bgr2gray(img)
+ hist = np.histogram(gray_img, 256, (0, 255))[0]
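+    # np.sum(hist) is the total pixel count, so this is the mean gray level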
+ mean = round(np.sum(gray_img) / np.sum(hist))
+ degenerated = (np.ones_like(img[..., 0]) * mean).astype(img.dtype)
+ degenerated = gray2bgr(degenerated)
+ contrasted_img = cv2.addWeighted(
+ img.astype(np.float32), factor, degenerated.astype(np.float32),
+ 1 - factor, 0)
+ contrasted_img = np.clip(contrasted_img, 0, 255)
+ return contrasted_img.astype(img.dtype)
def auto_contrast(img, cutoff=0):
@@ -486,76 +426,3 @@ def clahe(img, clip_limit=40.0, tile_grid_size=(8, 8)):
clahe = cv2.createCLAHE(clip_limit, tile_grid_size)
return clahe.apply(np.array(img, dtype=np.uint8))
-
-
-def adjust_hue(img: np.ndarray,
- hue_factor: float,
- backend: Optional[str] = None) -> np.ndarray:
- """Adjust hue of an image.
-
- The image hue is adjusted by converting the image to HSV and cyclically
- shifting the intensities in the hue channel (H). The image is then
- converted back to original image mode.
-
- `hue_factor` is the amount of shift in H channel and must be in the
- interval `[-0.5, 0.5]`.
-
- Modified from
- https://github.com/pytorch/vision/blob/main/torchvision/
- transforms/functional.py
-
- Args:
- img (ndarray): Image to be adjusted.
- hue_factor (float): How much to shift the hue channel. Should be in
- [-0.5, 0.5]. 0.5 and -0.5 give complete reversal of hue channel in
- HSV space in positive and negative direction respectively.
- 0 means no shift. Therefore, both -0.5 and 0.5 will give an image
- with complementary colors while 0 gives the original image.
- backend (str | None): The image processing backend type. Options are
- `cv2`, `pillow`, `None`. If backend is None, the global
- ``imread_backend`` specified by ``mmcv.use_backend()`` will be
- used. Defaults to None.
-
- Returns:
- ndarray: Hue adjusted image.
- """
- if backend is None:
- backend = imread_backend
- if backend not in ['cv2', 'pillow']:
- raise ValueError(f'backend: {backend} is not supported.'
- f"Supported backends are 'cv2', 'pillow'")
-
- if not (-0.5 <= hue_factor <= 0.5):
- raise ValueError(f'hue_factor:{hue_factor} is not in [-0.5, 0.5].')
- if not (isinstance(img, np.ndarray) and (img.ndim in {2, 3})):
- raise TypeError('img should be ndarray with dim=[2 or 3].')
-
- if backend == 'pillow':
- assert img.dtype == np.uint8, 'Pillow backend only support uint8 type'
- # Image.fromarray defaultly supports RGB, not BGR.
- pil_image = Image.fromarray(img[..., ::-1], mode='RGB')
- input_mode = pil_image.mode
- if input_mode in {'L', '1', 'I', 'F'}:
- return pil_image
-
- h, s, v = pil_image.convert('HSV').split()
-
- np_h = np.array(h, dtype=np.uint8)
- # uint8 addition take cares of rotation across boundaries
- with np.errstate(over='ignore'):
- np_h += np.uint8(hue_factor * 255)
- h = Image.fromarray(np_h, 'L')
-
- pil_image = Image.merge('HSV', (h, s, v)).convert(input_mode)
- return np.array(pil_image, dtype=img.dtype)[..., ::-1]
- else:
- dtype = img.dtype
- img = img.astype(np.uint8)
- hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV_FULL)
- h, s, v = cv2.split(hsv_img)
- h = h.astype(np.uint8)
- # uint8 addition take cares of rotation across boundaries
- with np.errstate(over='ignore'):
- h += np.uint8(hue_factor * 255)
- hsv_img = cv2.merge([h, s, v])
- return cv2.cvtColor(hsv_img, cv2.COLOR_HSV2BGR_FULL).astype(dtype)
diff --git a/mmcv/model_zoo/deprecated.json b/mmcv/model_zoo/deprecated.json
new file mode 100644
index 0000000000000000000000000000000000000000..25cf6f28caecc22a77e3136fefa6b8dfc0e6cb5b
--- /dev/null
+++ b/mmcv/model_zoo/deprecated.json
@@ -0,0 +1,6 @@
+{
+ "resnet50_caffe": "detectron/resnet50_caffe",
+ "resnet50_caffe_bgr": "detectron2/resnet50_caffe_bgr",
+ "resnet101_caffe": "detectron/resnet101_caffe",
+ "resnet101_caffe_bgr": "detectron2/resnet101_caffe_bgr"
+}
diff --git a/mmcv/model_zoo/mmcls.json b/mmcv/model_zoo/mmcls.json
new file mode 100644
index 0000000000000000000000000000000000000000..c073a41d0aeb44ee0243f97ecc3558de538f9300
--- /dev/null
+++ b/mmcv/model_zoo/mmcls.json
@@ -0,0 +1,59 @@
+{
+ "vgg11": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg11_batch256_imagenet_20210208-4271cd6c.pth",
+ "vgg13": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg13_batch256_imagenet_20210208-4d1d6080.pth",
+ "vgg16": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg16_batch256_imagenet_20210208-db26f1a5.pth",
+ "vgg19": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg19_batch256_imagenet_20210208-e6920e4a.pth",
+ "vgg11_bn": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg11_bn_batch256_imagenet_20210207-f244902c.pth",
+ "vgg13_bn": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg13_bn_batch256_imagenet_20210207-1a8b7864.pth",
+ "vgg16_bn": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg16_bn_batch256_imagenet_20210208-7e55cd29.pth",
+ "vgg19_bn": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg19_bn_batch256_imagenet_20210208-da620c4f.pth",
+ "resnet18": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_8xb32_in1k_20210831-fbbb1da6.pth",
+ "resnet34": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet34_8xb32_in1k_20210831-f257d4e6.pth",
+ "resnet50": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth",
+ "resnet101": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet101_8xb32_in1k_20210831-539c63f8.pth",
+ "resnet152": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet152_8xb32_in1k_20210901-4d7582fa.pth",
+ "resnet50_v1d": "https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d50_b32x8_imagenet_20210531-db14775a.pth",
+ "resnet101_v1d": "https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d101_b32x8_imagenet_20210531-6e13bcd3.pth",
+ "resnet152_v1d": "https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d152_b32x8_imagenet_20210531-278cf22a.pth",
+ "resnext50_32x4d": "https://download.openmmlab.com/mmclassification/v0/resnext/resnext50_32x4d_b32x8_imagenet_20210429-56066e27.pth",
+ "resnext101_32x4d": "https://download.openmmlab.com/mmclassification/v0/resnext/resnext101_32x4d_b32x8_imagenet_20210506-e0fa3dd5.pth",
+ "resnext101_32x8d": "https://download.openmmlab.com/mmclassification/v0/resnext/resnext101_32x8d_b32x8_imagenet_20210506-23a247d5.pth",
+ "resnext152_32x4d": "https://download.openmmlab.com/mmclassification/v0/resnext/resnext152_32x4d_b32x8_imagenet_20210524-927787be.pth",
+ "se-resnet50": "https://download.openmmlab.com/mmclassification/v0/se-resnet/se-resnet50_batch256_imagenet_20200804-ae206104.pth",
+ "se-resnet101": "https://download.openmmlab.com/mmclassification/v0/se-resnet/se-resnet101_batch256_imagenet_20200804-ba5b51d4.pth",
+ "resnest50": "https://download.openmmlab.com/mmclassification/v0/resnest/resnest50_imagenet_converted-1ebf0afe.pth",
+ "resnest101": "https://download.openmmlab.com/mmclassification/v0/resnest/resnest101_imagenet_converted-032caa52.pth",
+ "resnest200": "https://download.openmmlab.com/mmclassification/v0/resnest/resnest200_imagenet_converted-581a60f2.pth",
+ "resnest269": "https://download.openmmlab.com/mmclassification/v0/resnest/resnest269_imagenet_converted-59930960.pth",
+ "shufflenet_v1": "https://download.openmmlab.com/mmclassification/v0/shufflenet_v1/shufflenet_v1_batch1024_imagenet_20200804-5d6cec73.pth",
+ "shufflenet_v2": "https://download.openmmlab.com/mmclassification/v0/shufflenet_v2/shufflenet_v2_batch1024_imagenet_20200812-5bf4721e.pth",
+ "mobilenet_v2": "https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth",
+ "mobilenet_v3_small": "https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth",
+ "mobilenet_v3_large": "https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_large-3ea3c186.pth",
+ "repvgg_A0": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A0_3rdparty_4xb64-coslr-120e_in1k_20210909-883ab98c.pth",
+ "repvgg_A1": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A1_3rdparty_4xb64-coslr-120e_in1k_20210909-24003a24.pth",
+ "repvgg_A2": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A2_3rdparty_4xb64-coslr-120e_in1k_20210909-97d7695a.pth",
+ "repvgg_B0": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B0_3rdparty_4xb64-coslr-120e_in1k_20210909-446375f4.pth",
+ "repvgg_B1": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1_3rdparty_4xb64-coslr-120e_in1k_20210909-750cdf67.pth",
+ "repvgg_B1g2": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1g2_3rdparty_4xb64-coslr-120e_in1k_20210909-344f6422.pth",
+ "repvgg_B1g4": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1g4_3rdparty_4xb64-coslr-120e_in1k_20210909-d4c1a642.pth",
+ "repvgg_B2": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B2_3rdparty_4xb64-coslr-120e_in1k_20210909-bd6b937c.pth",
+ "repvgg_B2g4": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B2g4_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-7b7955f0.pth",
+ "repvgg_B3": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B3_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-dda968bf.pth",
+ "repvgg_B3g4": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B3g4_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-4e54846a.pth",
+ "repvgg_D2se": "https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-D2se_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-cf3139b7.pth",
+ "res2net101_w26": "https://download.openmmlab.com/mmclassification/v0/res2net/res2net101-w26-s4_3rdparty_8xb32_in1k_20210927-870b6c36.pth",
+ "res2net50_w14": "https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w14-s8_3rdparty_8xb32_in1k_20210927-bc967bf1.pth",
+ "res2net50_w26": "https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w26-s8_3rdparty_8xb32_in1k_20210927-f547a94b.pth",
+ "swin_tiny": "https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925-66df6be6.pth",
+ "swin_small": "https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219-7f9d988b.pth",
+ "swin_base": "https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window7_224_22kto1k-f967f799.pth",
+ "swin_large": "https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_large_patch4_window7_224_22kto1k-5f0996db.pth",
+ "t2t_vit_t_14": "https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_3rdparty_8xb64_in1k_20210928-b7c09b62.pth",
+ "t2t_vit_t_19": "https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_3rdparty_8xb64_in1k_20210928-7f1478d5.pth",
+ "t2t_vit_t_24": "https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_3rdparty_8xb64_in1k_20210928-fe95a61b.pth",
+ "tnt_small": "https://download.openmmlab.com/mmclassification/v0/tnt/tnt-small-p16_3rdparty_in1k_20210903-c56ee7df.pth",
+ "vit_base_p16": "https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-98e8652b.pth",
+ "vit_base_p32": "https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p32_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-9cea8599.pth",
+ "vit_large_p16": "https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-large-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-b20ba619.pth"
+}
diff --git a/mmcv/model_zoo/open_mmlab.json b/mmcv/model_zoo/open_mmlab.json
new file mode 100644
index 0000000000000000000000000000000000000000..8311db4feef92faa0841c697d75efbee8430c3a0
--- /dev/null
+++ b/mmcv/model_zoo/open_mmlab.json
@@ -0,0 +1,50 @@
+{
+ "vgg16_caffe": "https://download.openmmlab.com/pretrain/third_party/vgg16_caffe-292e1171.pth",
+ "detectron/resnet50_caffe": "https://download.openmmlab.com/pretrain/third_party/resnet50_caffe-788b5fa3.pth",
+ "detectron2/resnet50_caffe": "https://download.openmmlab.com/pretrain/third_party/resnet50_msra-5891d200.pth",
+ "detectron/resnet101_caffe": "https://download.openmmlab.com/pretrain/third_party/resnet101_caffe-3ad79236.pth",
+ "detectron2/resnet101_caffe": "https://download.openmmlab.com/pretrain/third_party/resnet101_msra-6cc46731.pth",
+ "detectron2/resnext101_32x8d": "https://download.openmmlab.com/pretrain/third_party/resnext101_32x8d-1516f1aa.pth",
+ "resnext50_32x4d": "https://download.openmmlab.com/pretrain/third_party/resnext50-32x4d-0ab1a123.pth",
+ "resnext101_32x4d": "https://download.openmmlab.com/pretrain/third_party/resnext101_32x4d-a5af3160.pth",
+ "resnext101_64x4d": "https://download.openmmlab.com/pretrain/third_party/resnext101_64x4d-ee2c6f71.pth",
+ "contrib/resnet50_gn": "https://download.openmmlab.com/pretrain/third_party/resnet50_gn_thangvubk-ad1730dd.pth",
+ "detectron/resnet50_gn": "https://download.openmmlab.com/pretrain/third_party/resnet50_gn-9186a21c.pth",
+ "detectron/resnet101_gn": "https://download.openmmlab.com/pretrain/third_party/resnet101_gn-cac0ab98.pth",
+ "jhu/resnet50_gn_ws": "https://download.openmmlab.com/pretrain/third_party/resnet50_gn_ws-15beedd8.pth",
+ "jhu/resnet101_gn_ws": "https://download.openmmlab.com/pretrain/third_party/resnet101_gn_ws-3e3c308c.pth",
+ "jhu/resnext50_32x4d_gn_ws": "https://download.openmmlab.com/pretrain/third_party/resnext50_32x4d_gn_ws-0d87ac85.pth",
+ "jhu/resnext101_32x4d_gn_ws": "https://download.openmmlab.com/pretrain/third_party/resnext101_32x4d_gn_ws-34ac1a9e.pth",
+ "jhu/resnext50_32x4d_gn": "https://download.openmmlab.com/pretrain/third_party/resnext50_32x4d_gn-c7e8b754.pth",
+ "jhu/resnext101_32x4d_gn": "https://download.openmmlab.com/pretrain/third_party/resnext101_32x4d_gn-ac3bb84e.pth",
+ "msra/hrnetv2_w18_small": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w18_small-b5a04e21.pth",
+ "msra/hrnetv2_w18": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w18-00eb2006.pth",
+ "msra/hrnetv2_w32": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w32-dc9eeb4f.pth",
+ "msra/hrnetv2_w40": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w40-ed0b031c.pth",
+ "msra/hrnetv2_w48": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w48-d2186c55.pth",
+ "bninception_caffe": "https://download.openmmlab.com/pretrain/third_party/bn_inception_caffe-ed2e8665.pth",
+ "kin400/i3d_r50_f32s2_k400": "https://download.openmmlab.com/pretrain/third_party/i3d_r50_f32s2_k400-2c57e077.pth",
+ "kin400/nl3d_r50_f32s2_k400": "https://download.openmmlab.com/pretrain/third_party/nl3d_r50_f32s2_k400-fa7e7caa.pth",
+ "res2net101_v1d_26w_4s": "https://download.openmmlab.com/pretrain/third_party/res2net101_v1d_26w_4s_mmdetv2-f0a600f9.pth",
+ "regnetx_400mf": "https://download.openmmlab.com/pretrain/third_party/regnetx_400mf-a5b10d96.pth",
+ "regnetx_800mf": "https://download.openmmlab.com/pretrain/third_party/regnetx_800mf-1f4be4c7.pth",
+ "regnetx_1.6gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_1.6gf-5791c176.pth",
+ "regnetx_3.2gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_3.2gf-c2599b0f.pth",
+ "regnetx_4.0gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_4.0gf-a88f671e.pth",
+ "regnetx_6.4gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_6.4gf-006af45d.pth",
+ "regnetx_8.0gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_8.0gf-3c68abe7.pth",
+ "regnetx_12gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_12gf-4c2a3350.pth",
+ "resnet18_v1c": "https://download.openmmlab.com/pretrain/third_party/resnet18_v1c-b5776b93.pth",
+ "resnet50_v1c": "https://download.openmmlab.com/pretrain/third_party/resnet50_v1c-2cccc1ad.pth",
+ "resnet101_v1c": "https://download.openmmlab.com/pretrain/third_party/resnet101_v1c-e67eebb6.pth",
+ "mmedit/vgg16": "https://download.openmmlab.com/mmediting/third_party/vgg_state_dict.pth",
+ "mmedit/res34_en_nomixup": "https://download.openmmlab.com/mmediting/third_party/model_best_resnet34_En_nomixup.pth",
+ "mmedit/mobilenet_v2": "https://download.openmmlab.com/mmediting/third_party/mobilenet_v2.pth",
+ "contrib/mobilenet_v3_large": "https://download.openmmlab.com/pretrain/third_party/mobilenet_v3_large-bc2c3fd3.pth",
+ "contrib/mobilenet_v3_small": "https://download.openmmlab.com/pretrain/third_party/mobilenet_v3_small-47085aa1.pth",
+ "resnest50": "https://download.openmmlab.com/pretrain/third_party/resnest50_d2-7497a55b.pth",
+ "resnest101": "https://download.openmmlab.com/pretrain/third_party/resnest101_d2-f3b931b2.pth",
+ "resnest200": "https://download.openmmlab.com/pretrain/third_party/resnest200_d2-ca88e41f.pth",
+ "darknet53": "https://download.openmmlab.com/pretrain/third_party/darknet53-a628ea1b.pth",
+ "mmdet/mobilenet_v2": "https://download.openmmlab.com/mmdetection/v2.0/third_party/mobilenet_v2_batch256_imagenet-ff34753d.pth"
+}
diff --git a/mmcv/onnx/__init__.py b/mmcv/onnx/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..0d7eb5b0db770144ac6676bd1c7e80d7d2eb7e02
--- /dev/null
+++ b/mmcv/onnx/__init__.py
@@ -0,0 +1,5 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+from .info import is_custom_op_loaded
+from .symbolic import register_extra_symbolics
+
+__all__ = ['register_extra_symbolics', 'is_custom_op_loaded']
diff --git a/mmcv/onnx/info.py b/mmcv/onnx/info.py
new file mode 100644
index 0000000000000000000000000000000000000000..e599973689245ff7c279bed0640842a9f0891750
--- /dev/null
+++ b/mmcv/onnx/info.py
@@ -0,0 +1,21 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import os
+
+import torch
+
+
+def is_custom_op_loaded():
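+    """Check whether custom ops for ONNX inference backends are available.
+
+    Returns True if the mmcv TensorRT plugin is loaded, if the compiled
+    onnxruntime op library exists on disk, or when running on parrots.
+    """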
+ flag = False
+ try:
+ from ..tensorrt import is_tensorrt_plugin_loaded
+ flag = is_tensorrt_plugin_loaded()
+ except (ImportError, ModuleNotFoundError):
+ pass
+ if not flag:
+ try:
+ from ..ops import get_onnxruntime_op_path
+ ort_lib_path = get_onnxruntime_op_path()
+ flag = os.path.exists(ort_lib_path)
+ except (ImportError, ModuleNotFoundError):
+ pass
+ return flag or torch.__version__ == 'parrots'
diff --git a/mmcv/onnx/onnx_utils/__init__.py b/mmcv/onnx/onnx_utils/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..ef101fec61e72abc0eb90266d453b5b22331378d
--- /dev/null
+++ b/mmcv/onnx/onnx_utils/__init__.py
@@ -0,0 +1 @@
+# Copyright (c) OpenMMLab. All rights reserved.
diff --git a/mmcv/onnx/onnx_utils/symbolic_helper.py b/mmcv/onnx/onnx_utils/symbolic_helper.py
new file mode 100644
index 0000000000000000000000000000000000000000..a9a31eb4aeb24b6057acf9d4c352ee7e940377dd
--- /dev/null
+++ b/mmcv/onnx/onnx_utils/symbolic_helper.py
@@ -0,0 +1,331 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+"""Modified from https://github.com/pytorch/pytorch."""
+import warnings
+from functools import wraps
+from sys import maxsize
+
+import torch
+import torch.onnx
+# This import monkey-patches graph manipulation methods on Graph, used for the
+# ONNX symbolics
+import torch.onnx.utils
+from torch._C import ListType
+
+# ---------------------------------------------------------------------------------
+# Helper functions
+# ---------------------------------------------------------------------------------
+
+# Save some builtins as locals, because we'll shadow them below
+_sum = sum
+
+
+def _parse_arg(value, desc):
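+    # `desc` is a short descriptor of the expected argument type: 'v'
+    # (raw graph value), 'i'/'f'/'b'/'s'/'t' (int/float/bool/str/tensor
+    # constants), 'is'/'fs' (int/float lists) or 'none'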
+ if desc == 'none':
+ return value
+ if desc == 'v' or not _is_value(value):
+ return value
+ if value.node().mustBeNone():
+ return None
+ if value.node().kind() == 'onnx::Constant':
+ tval = value.node()['value']
+ if desc == 'i':
+ return int(tval)
+ elif desc == 'f':
+ return float(tval)
+ elif desc == 'b':
+ return bool(tval)
+ elif desc == 's':
+ return str(tval)
+ elif desc == 't':
+ return tval
+ elif desc == 'is':
+ return [int(v) for v in tval]
+ elif desc == 'fs':
+ return [float(v) for v in tval]
+ else:
+ raise RuntimeError(
+                "ONNX symbolic doesn't know how to interpret Constant node")
+ elif value.node().kind() == 'prim::ListConstruct':
+ if desc == 'is':
+ for v in value.node().inputs():
+ if v.node().kind() != 'onnx::Constant':
+ raise RuntimeError(
+ "Failed to export an ONNX attribute '" +
+ v.node().kind() +
+ "', since it's not constant, please try to make "
+ 'things (e.g., kernel size) static if possible')
+ return [int(v.node()['value']) for v in value.node().inputs()]
+ else:
+            raise RuntimeError("ONNX symbolic doesn't know how to "
+                               'interpret ListConstruct node')
+
+ raise RuntimeError('Unexpected node type: {}'.format(value.node().kind()))
+
+
+def _maybe_get_const(value, desc):
+ if _is_value(value) and value.node().kind() == 'onnx::Constant':
+ return _parse_arg(value, desc)
+ return value
+
+
+def _maybe_get_scalar(value):
+ value_t = _maybe_get_const(value, 't')
+ if isinstance(value_t, torch.Tensor) and value_t.shape == ():
+ return value_t
+ return value
+
+
+def _get_const(value, desc, arg_name):
+ if _is_value(value) and value.node().kind() not in ('onnx::Constant',
+ 'prim::Constant'):
+ raise RuntimeError('ONNX symbolic expected a constant'
+ ' value of the {} argument, got `{}`'.format(
+ arg_name, value))
+ return _parse_arg(value, desc)
+
+
+def _unpack_list(list_value):
+ list_node = list_value.node()
+ assert list_node.kind() == 'prim::ListConstruct'
+ return list(list_node.inputs())
+
+
+# Check if list_value is output from prim::ListConstruct
+# This is usually called before _unpack_list to ensure the list can be
+# unpacked.
+def _is_packed_list(list_value):
+ return _is_value(
+ list_value) and list_value.node().kind() == 'prim::ListConstruct'
+
+
+def parse_args(*arg_descriptors):
+
+ def decorator(fn):
+ fn._arg_descriptors = arg_descriptors
+
+ def wrapper(g, *args):
+ # some args may be optional, so the length may be smaller
+ assert len(arg_descriptors) >= len(args)
+ args = [
+ _parse_arg(arg, arg_desc)
+ for arg, arg_desc in zip(args, arg_descriptors)
+ ]
+ return fn(g, *args)
+
+ # In Python 2 functools.wraps chokes on partially applied functions, so
+ # we need this as a workaround
+ try:
+ wrapper = wraps(fn)(wrapper)
+ except Exception:
+ pass
+ return wrapper
+
+ return decorator
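+
+
+# Illustrative usage of `parse_args` (a sketch mirroring the symbolics
+# that use it below): the descriptors declare one graph value and one
+# constant int, which `wrapper` parses before calling the function body:
+#
+#     @parse_args('v', 'i')
+#     def my_symbolic(g, input, dim):
+#         ...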
+
+
+def _scalar(x):
+ """Convert a scalar tensor into a Python value."""
+ assert x.numel() == 1
+ return x.item()
+
+
+def _if_scalar_type_as(g, self, tensor):
+ """Convert self into the same type of tensor, as necessary."""
+ if isinstance(self, torch._C.Value):
+ return self
+
+ scalar_type = tensor.type().scalarType()
+ if scalar_type:
+ ty = scalar_type.lower()
+ return getattr(self, ty)()
+
+ return self
+
+
+def _is_none(x):
+ return x.node().mustBeNone()
+
+
+def _is_value(x):
+ return isinstance(x, torch._C.Value)
+
+
+def _is_tensor_list(x):
+ return x.type().isSubtypeOf(ListType.ofTensors())
+
+
+def _unimplemented(op, msg):
+ warnings.warn('ONNX export failed on ' + op + ' because ' + msg +
+ ' not supported')
+
+
+def _try_get_scalar_type(*args):
+ for arg in args:
+ try:
+ return arg.type().scalarType()
+ except RuntimeError:
+ pass
+ return None
+
+
+def _topk_helper(g, input, k, dim, largest=True, sorted=False, out=None):
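+    # ONNX TopK expects `k` as a 1-D int64 tensor, so a Python int is
+    # wrapped in a Constant and a graph value is reshaped to shape [1]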
+ if out is not None:
+ _unimplemented('TopK', 'Out parameter is not supported')
+ if not _is_value(k):
+ k = g.op('Constant', value_t=torch.tensor([k], dtype=torch.int64))
+ else:
+ k = g.op('Reshape', k, g.op('Constant', value_t=torch.tensor([1])))
+ return g.op(
+ 'TopK',
+ input,
+ k,
+ axis_i=dim,
+ largest_i=largest,
+ sorted_i=sorted,
+ outputs=2)
+
+
+def _slice_helper(g,
+ input,
+ axes,
+ starts,
+ ends,
+ steps=None,
+ dynamic_slice=False):
+ # TODO(ruobing): add support for opset<10
+ from torch.onnx.symbolic_opset10 import _slice
+ return _slice(g, input, axes, starts, ends, steps, dynamic_slice)
+
+
+def _unsqueeze_helper(g, input, dim):
+ from torch.onnx.symbolic_opset9 import unsqueeze
+ return unsqueeze(g, input, dim)
+
+
+def _interpolate_size_to_scales(g, input, output_size, dim):
+ output_size = _maybe_get_const(output_size, 'is')
+ if _is_value(output_size):
+ offset = 2
+ offsets = g.op(
+ 'Constant', value_t=torch.ones(offset, dtype=torch.float32))
+ dividend = g.op(
+ 'Cast', output_size, to_i=cast_pytorch_to_onnx['Float'])
+ divisor = _slice_helper(
+ g, g.op('Shape', input), axes=[0], ends=[maxsize], starts=[offset])
+ divisor = g.op('Cast', divisor, to_i=cast_pytorch_to_onnx['Float'])
+ scale_dims = g.op('Div', dividend, divisor)
+ scales = g.op('Concat', offsets, scale_dims, axis_i=0)
+ else:
+ scales_constant = [
+ 1. if i < 2 else float(output_size[-(dim - i)]) /
+ float(input.type().sizes()[-(dim - i)]) for i in range(0, dim)
+ ]
+ scales = g.op(
+ 'Constant',
+ value_t=torch.tensor(scales_constant, dtype=torch.float32))
+ return scales
+
+
+def _interpolate_get_scales_if_available(g, scales):
+ if len(scales) == 0:
+ return None
+    # scales[0] is NoneType in PyTorch == 1.5.1
+    # scales[0] is TensorType with sizes = [] in PyTorch == 1.6.0
+    # scales[0] is ListType in PyTorch == 1.7.0
+    # scales[0] is TensorType with sizes = [2] in PyTorch == 1.8.0
+ scale_desc = 'fs' if scales[0].type().kind() == 'ListType' or (
+ scales[0].type().kind() == 'TensorType' and
+ (sum(scales[0].type().sizes()) > 1)) else 'f'
+ available_scales = _maybe_get_const(
+ scales[0], scale_desc) != -1 and not _is_none(scales[0])
+
+ if not available_scales:
+ return None
+
+ offsets = g.op('Constant', value_t=torch.ones(2, dtype=torch.float32))
+ if scale_desc == 'fs':
+ scales_list = g.op(
+ 'Constant',
+ value_t=torch.tensor(_maybe_get_const(scales[0], scale_desc)))
+ # modify to support PyTorch==1.7.0
+ # https://github.com/pytorch/pytorch/blob/75ee5756715e7161314ce037474843b68f69fc04/torch/onnx/symbolic_helper.py#L375 # noqa: E501
+ scales = g.op('Concat', offsets, scales_list, axis_i=0)
+ else:
+ # for PyTorch < 1.7.0
+ scales_list = []
+ for scale in scales:
+ unsqueezed_scale = _unsqueeze_helper(g, scale, 0)
+ # ONNX only supports float for the scales. double -> float.
+ unsqueezed_scale = g.op(
+ 'Cast', unsqueezed_scale, to_i=cast_pytorch_to_onnx['Float'])
+ scales_list.append(unsqueezed_scale)
+ scales = g.op('Concat', offsets, *scales_list, axis_i=0)
+ return scales
+
+
+def _get_interpolate_attributes(g, mode, args):
+ if mode == 'nearest':
+ align_corners = None
+ scales = args[0:]
+ else:
+ align_corners = args[0]
+ scales = args[1:]
+ scales = _interpolate_get_scales_if_available(g, scales)
+ return scales, align_corners
+
+
+def _interpolate_get_scales(g, scale_factor, dim):
+ offsets = g.op('Constant', value_t=torch.ones(2, dtype=torch.float32))
+ if isinstance(scale_factor.type(), torch._C.ListType):
+ return g.op('Concat', offsets, scale_factor, axis_i=0)
+ else:
+ scale_factor = _unsqueeze_helper(g, scale_factor, 0)
+ scale_factor = g.op(
+ 'Cast', scale_factor, to_i=cast_pytorch_to_onnx['Float'])
+ scales = [scale_factor for i in range(dim - 2)]
+ scale_factor = g.op('Concat', offsets, *scales, axis_i=0)
+ return scale_factor
+
+
+def _size_helper(g, self, dim):
+ full_shape = g.op('Shape', self)
+ from torch.onnx.symbolic_opset9 import select
+ return select(g, full_shape, g.op('Constant', value_t=torch.tensor([0])),
+ dim)
+
+
+def _avgpool_helper(tuple_fn, padding, kernel_size, stride, divisor_override,
+ name):
+ if divisor_override and divisor_override.node().kind() != 'prim::Constant':
+ return _unimplemented(name, 'divisor_override')
+ if not stride:
+ stride = kernel_size
+ padding = tuple(tuple_fn(padding))
+ return padding
+
+
+# Metaprogram symbolics for each ATen native specialized cast operator.
+# For example, we specify a function named `_cast_uint8_t` that
+# instantiates an ONNX cast node with `to` attribute 'UINT8'.
+#
+# TODO: remove these once we support Type's in the JIT IR and we can once again
+# use the unified toType operator
+cast_pytorch_to_onnx = {
+ 'Byte': torch.onnx.TensorProtoDataType.UINT8,
+ 'Char': torch.onnx.TensorProtoDataType.INT8,
+ 'Double': torch.onnx.TensorProtoDataType.DOUBLE,
+ 'Float': torch.onnx.TensorProtoDataType.FLOAT,
+ 'Half': torch.onnx.TensorProtoDataType.FLOAT16,
+ 'Int': torch.onnx.TensorProtoDataType.INT32,
+ 'Long': torch.onnx.TensorProtoDataType.INT64,
+ 'Short': torch.onnx.TensorProtoDataType.INT16,
+ 'Bool': torch.onnx.TensorProtoDataType.BOOL,
+ 'ComplexFloat': torch.onnx.TensorProtoDataType.COMPLEX64,
+ 'ComplexDouble': torch.onnx.TensorProtoDataType.COMPLEX128,
+ 'Undefined': torch.onnx.TensorProtoDataType.UNDEFINED,
+}
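+
+# Mapping from a torch ScalarType index to the corresponding ONNX data
+# type. Restored here because `symbolic.py` references
+# `sym_help.scalar_type_to_onnx`; the ordering below is assumed to follow
+# torch's ScalarType enum, as in the upstream PyTorch helper.
+scalar_type_to_onnx = [
+    cast_pytorch_to_onnx['Byte'],
+    cast_pytorch_to_onnx['Char'],
+    cast_pytorch_to_onnx['Short'],
+    cast_pytorch_to_onnx['Int'],
+    cast_pytorch_to_onnx['Long'],
+    cast_pytorch_to_onnx['Half'],
+    cast_pytorch_to_onnx['Float'],
+    cast_pytorch_to_onnx['Double'],
+    cast_pytorch_to_onnx['Undefined'],
+    cast_pytorch_to_onnx['ComplexFloat'],
+    cast_pytorch_to_onnx['ComplexDouble'],
+    cast_pytorch_to_onnx['Bool'],
+]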
+
+# Global set to store the list of quantized operators in the network.
+# This is currently only used in the conversion of quantized ops from PT
+# -> C2 via ONNX.
+_quantized_ops = set()
diff --git a/mmcv/onnx/symbolic.py b/mmcv/onnx/symbolic.py
new file mode 100644
index 0000000000000000000000000000000000000000..94cc1c620d01c4fa062cc4576fcb591f90923a65
--- /dev/null
+++ b/mmcv/onnx/symbolic.py
@@ -0,0 +1,496 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+"""Modified from https://github.com/pytorch/pytorch."""
+import os
+
+import numpy as np
+import torch
+from torch.nn.modules.utils import _pair, _single, _triple
+from torch.onnx.symbolic_helper import parse_args
+from torch.onnx.symbolic_registry import register_op
+
+from .onnx_utils import symbolic_helper as sym_help
+
+
+def _interpolate(name, dim, interpolate_mode):
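+    # Factory building an ONNX `Resize` symbolic for one upsample op;
+    # `dim` is the full input rank (e.g. 4 for the 2-D NCHW variants)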
+
+ def symbolic_fn(g, input, output_size, *args):
+ scales, align_corners = sym_help._get_interpolate_attributes(
+ g, interpolate_mode, args)
+ align_corners = sym_help._maybe_get_scalar(align_corners)
+ transformation_mode = 'asymmetric' \
+ if interpolate_mode == 'nearest' \
+ else 'align_corners' if align_corners else 'pytorch_half_pixel'
+ empty_tensor = g.op(
+ 'Constant', value_t=torch.tensor([], dtype=torch.float32))
+
+ if scales is None:
+ if 'ONNX_BACKEND' in os.environ and os.environ[
+ 'ONNX_BACKEND'] == 'TensorRT':
+ input_size = input.type().sizes()
+ # slice the first two dim
+ input_size = input_size[:2]
+ # convert output_size to int type
+ output_size = sym_help._maybe_get_const(output_size, 'is')
+ input_size.extend(output_size)
+ output_size = g.op(
+ 'Constant',
+ value_t=torch.tensor(input_size, dtype=torch.int64))
+ else:
+ input_size = g.op('Shape', input)
+ input_size_beg = sym_help._slice_helper(
+ g, input_size, axes=[0], ends=[2], starts=[0])
+ output_size = g.op(
+ 'Cast',
+ output_size,
+ to_i=sym_help.cast_pytorch_to_onnx['Long'])
+ output_size = g.op(
+ 'Concat', input_size_beg, output_size, axis_i=0)
+ scales = g.op(
+ 'Constant', value_t=torch.tensor([], dtype=torch.float32))
+ return g.op(
+ 'Resize',
+ input,
+ empty_tensor,
+ # roi only takes effect with
+ # coordinate_transformation_mode="tf_crop_and_resize"
+ scales, # scales is not needed since we are sending out_size
+ output_size,
+ coordinate_transformation_mode_s=transformation_mode,
+ cubic_coeff_a_f=-0.75, # only valid when mode="cubic"
+ mode_s=interpolate_mode, # nearest, linear, or cubic
+ nearest_mode_s='floor') # only valid when mode="nearest"
+ else:
+ return g.op(
+ 'Resize',
+ input,
+ empty_tensor,
+ # roi only takes effect with
+ # coordinate_transformation_mode="tf_crop_and_resize"
+                scales,  # resize according to the scale factors above
+ coordinate_transformation_mode_s=transformation_mode,
+ cubic_coeff_a_f=-0.75, # only valid when mode="cubic"
+ mode_s=interpolate_mode, # nearest, linear, or cubic
+ nearest_mode_s='floor') # only valid when mode="nearest"
+
+ return symbolic_fn
+
+
+upsample_nearest1d = _interpolate('upsample_nearest1d', 3, 'nearest')
+upsample_nearest2d = _interpolate('upsample_nearest2d', 4, 'nearest')
+upsample_nearest3d = _interpolate('upsample_nearest3d', 5, 'nearest')
+upsample_linear1d = _interpolate('upsample_linear1d', 3, 'linear')
+upsample_bilinear2d = _interpolate('upsample_bilinear2d', 4, 'linear')
+upsample_trilinear3d = _interpolate('upsample_trilinear3d', 5, 'linear')
+upsample_bicubic2d = _interpolate('upsample_bicubic2d', 4, 'cubic')
+
+
+@parse_args('v', 'v', 'i', 'i', 'i', 'none')
+def topk(g, self, k, dim, largest, sorted, out=None):
+ return sym_help._topk_helper(
+ g, self, k, dim, largest=largest, sorted=sorted, out=out)
+
+
+def masked_select(g, self, mask):
+ from torch.onnx.symbolic_opset9 import expand_as, nonzero
+ index = nonzero(g, expand_as(g, mask, self))
+ return g.op('GatherND', self, index)
+
+
+def _prepare_onnx_paddings(g, dim, pad):
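+    # Convert PyTorch-style paddings (last dimension first, interleaved
+    # begin/end pairs) to the ONNX layout: all begins first, then all
+    # ends, ordered from the first dimension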
+ pad_len = torch.onnx.symbolic_opset9.size(
+ g, pad, g.op('Constant', value_t=torch.tensor([0])))
+ # Set extension = [0] * (dim * 2 - len(pad))
+ extension = g.op(
+ 'Sub',
+ g.op('Mul',
+ g.op('Constant', value_t=torch.tensor(dim, dtype=torch.int64)),
+ g.op('Constant', value_t=torch.tensor(2, dtype=torch.int64))),
+ pad_len)
+ pad = g.op('Cast', pad, to_i=sym_help.cast_pytorch_to_onnx['Long'])
+ paddings = g.op(
+ 'Concat',
+ pad,
+ g.op(
+ 'ConstantOfShape',
+ extension,
+ value_t=torch.tensor([0], dtype=torch.int64)),
+ axis_i=0)
+ paddings = g.op('Reshape', paddings,
+ g.op('Constant', value_t=torch.tensor([-1, 2])))
+ paddings = g.op(
+ 'Transpose',
+ torch.onnx.symbolic_opset10.flip(g, paddings, [0]),
+ perm_i=[1, 0])
+ paddings = g.op('Reshape', paddings,
+ g.op('Constant', value_t=torch.tensor([-1])))
+ padding_c = g.op(
+ 'Cast', paddings, to_i=sym_help.cast_pytorch_to_onnx['Long'])
+ return padding_c
+
+
+def constant_pad_nd(g, input, padding, value=None):
+ mode = 'constant'
+ value = sym_help._maybe_get_scalar(value)
+ value = sym_help._if_scalar_type_as(g, value, input)
+ pad = _prepare_onnx_paddings(g, input.type().dim(), padding)
+ return g.op('Pad', input, pad, value, mode_s=mode)
+
+
+def reflection_pad(g, input, padding):
+ mode = 'reflect'
+ paddings = _prepare_onnx_paddings(g, input.type().dim(), padding)
+ return g.op('Pad', input, paddings, mode_s=mode)
+
+
+reflection_pad1d = reflection_pad
+reflection_pad2d = reflection_pad
+reflection_pad3d = reflection_pad
+
+
+def _avg_pool(name, tuple_fn):
+
+ @parse_args('v', 'is', 'is', 'is', 'i', 'i', 'none')
+ def symbolic_fn(g,
+ input,
+ kernel_size,
+ stride,
+ padding,
+ ceil_mode,
+ count_include_pad,
+ divisor_override=None):
+ padding = sym_help._avgpool_helper(tuple_fn, padding, kernel_size,
+ stride, divisor_override, name)
+ if not stride:
+ stride = kernel_size
+ if count_include_pad:
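+            # pad explicitly with a constant Pad op and zero the pooling
+            # op's own padding, so padded entries count in the average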
+ input = g.op(
+ 'Pad',
+ input,
+ g.op(
+ 'Constant',
+ value_t=torch.tensor(((0, ) * 2 + padding) * 2)),
+ mode_s='constant')
+ padding = (0, ) * len(padding)
+ output = g.op(
+ 'AveragePool',
+ input,
+ kernel_shape_i=tuple_fn(kernel_size),
+ strides_i=tuple_fn(stride),
+ pads_i=padding * 2,
+ ceil_mode_i=ceil_mode)
+ return output
+
+ return symbolic_fn
+
+
+avg_pool1d = _avg_pool('avg_pool1d', _single)
+avg_pool2d = _avg_pool('avg_pool2d', _pair)
+avg_pool3d = _avg_pool('avg_pool3d', _triple)
+
+
+def _get_im2col_indices_along_dim(g, input_d, kernel_size_d, dilation_d,
+ padding_d, stride_d):
+ # Input is always 4-D (N, C, H, W)
+ # Calculate indices of sliding blocks along spatial dimension
+ # Slide kernel over input each dim d:
+ # each dimension d ranges from 0 to
+ # input[d]+2xpadding[d]-dilation[d]x(kernel_size[d]-1)
+ # with steps = stride
+
+ blocks_d = g.op('Add', input_d,
+ g.op('Constant', value_t=torch.tensor(padding_d * 2)))
+ blocks_d = g.op(
+ 'Sub', blocks_d,
+ g.op(
+ 'Constant',
+ value_t=torch.tensor(dilation_d * (kernel_size_d - 1))))
+
+ # Stride kernel over input and find starting indices along dim d
+ blocks_d_indices = g.op('Range', g.op('Constant', value_t=torch.tensor(0)),
+ blocks_d,
+ g.op('Constant', value_t=torch.tensor(stride_d)))
+
+ # Apply dilation on kernel and find its indices along dim d
+ kernel_grid = np.arange(0, kernel_size_d * dilation_d, dilation_d)
+ kernel_grid = g.op('Constant', value_t=torch.tensor([kernel_grid]))
+
+    # Broadcast and add kernel starting positions (indices) with
+ # kernel_grid along dim d, to get block indices along dim d
+ blocks_d_indices = g.op(
+ 'Unsqueeze', blocks_d_indices, axes_i=[0]) # Reshape to [1, -1]
+ kernel_mask = g.op('Reshape', kernel_grid,
+ g.op('Constant', value_t=torch.tensor([-1, 1])))
+ block_mask = g.op('Add', blocks_d_indices, kernel_mask)
+
+ return block_mask
+
+
+def _get_im2col_padded_input(g, input, padding_h, padding_w):
+ # Input is always 4-D tensor (N, C, H, W)
+ # Padding tensor has the following format: (padding_h, padding_w)
+ # Reshape the padding to follow ONNX format:
+ # (dim1_begin, dim2_begin,...,dim1_end, dim2_end,...)
+ pad = g.op(
+ 'Constant', value_t=torch.LongTensor([0, 0, padding_h, padding_w] * 2))
+ return g.op('Pad', input, pad)
+
+
+def _get_im2col_output_shape(g, input, kernel_h, kernel_w):
+ batch_dim = size(g, input, g.op('Constant', value_t=torch.tensor(0)))
+ channel_dim = size(g, input, g.op('Constant', value_t=torch.tensor(1)))
+ channel_unfolded = g.op(
+ 'Mul', channel_dim,
+ g.op('Constant', value_t=torch.tensor(kernel_h * kernel_w)))
+
+ return g.op(
+ 'Concat',
+ g.op('Unsqueeze', batch_dim, axes_i=[0]),
+ g.op('Unsqueeze', channel_unfolded, axes_i=[0]),
+ g.op('Constant', value_t=torch.tensor([-1])),
+ axis_i=0)
+
+
+def size(g, self, dim=None):
+ if dim is None:
+ return g.op('Shape', self)
+ return sym_help._size_helper(g, self, dim)
+
+
+@parse_args('v', 'is', 'is', 'is', 'is')
+def im2col(g, input, kernel_size, dilation, padding, stride):
+ # Input is always 4-D tensor (N, C, H, W)
+ # All other args are int[2]
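+    # e.g. an (N, C, H, W) input with kernel (kh, kw) unfolds to a tensor
+    # of shape (N, C * kh * kw, L), with L the number of sliding positions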
+
+ input_h = size(g, input, g.op('Constant', value_t=torch.tensor(2)))
+ input_w = size(g, input, g.op('Constant', value_t=torch.tensor(3)))
+
+ stride_h, stride_w = stride[0], stride[1]
+ padding_h, padding_w = padding[0], padding[1]
+ dilation_h, dilation_w = dilation[0], dilation[1]
+ kernel_h, kernel_w = kernel_size[0], kernel_size[1]
+
+ blocks_row_indices = _get_im2col_indices_along_dim(g, input_h, kernel_h,
+ dilation_h, padding_h,
+ stride_h)
+ blocks_col_indices = _get_im2col_indices_along_dim(g, input_w, kernel_w,
+ dilation_w, padding_w,
+ stride_w)
+
+ output_shape = _get_im2col_output_shape(g, input, kernel_h, kernel_w)
+ padded_input = _get_im2col_padded_input(g, input, padding_h, padding_w)
+
+ output = g.op('Gather', padded_input, blocks_row_indices, axis_i=2)
+ output = g.op('Gather', output, blocks_col_indices, axis_i=4)
+ output = g.op('Transpose', output, perm_i=[0, 1, 2, 4, 3, 5])
+ return g.op('Reshape', output, output_shape)
+
+
+@parse_args('v', 'i')
+def one_hot(g, self, num_classes):
+ values = g.op('Constant', value_t=torch.LongTensor([0, 1]))
+ depth = g.op('Constant', value_t=torch.LongTensor([num_classes]))
+ return g.op('OneHot', self, depth, values, axis_i=-1)
+
+
+@parse_args('v', 'i', 'none')
+def softmax(g, input, dim, dtype=None):
+ input_dim = input.type().dim()
+ if input_dim:
+ # TODO: remove this as onnx opset 11 spec allows negative axes
+ if dim < 0:
+ dim = input_dim + dim
+ if input_dim == dim + 1:
+ softmax = g.op('Softmax', input, axis_i=dim)
+ if dtype and dtype.node().kind() != 'prim::Constant':
+ parsed_dtype = sym_help._get_const(dtype, 'i', 'dtype')
+ softmax = g.op(
+ 'Cast',
+ softmax,
+ to_i=sym_help.scalar_type_to_onnx[parsed_dtype])
+ return softmax
+
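+    # fallback: `dim` is not the last axis, so softmax is composed from
+    # primitives, subtracting the max first for numerical stability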
+ max_value = g.op('ReduceMax', input, axes_i=[dim], keepdims_i=1)
+ input = g.op('Sub', input, max_value)
+ exp = g.op('Exp', input)
+ sum = g.op('ReduceSum', exp, axes_i=[dim])
+ softmax = g.op('Div', exp, sum)
+ if dtype and dtype.node().kind() != 'prim::Constant':
+ parsed_dtype = sym_help._get_const(dtype, 'i', 'dtype')
+ softmax = g.op(
+ 'Cast', softmax, to_i=sym_help.scalar_type_to_onnx[parsed_dtype])
+ return softmax
+
+
+def _adaptive_pool(name, type, tuple_fn, fn=None):
+
+ @parse_args('v', 'is')
+ def symbolic_fn(g, input, output_size):
+ if output_size == [1] * len(output_size) and type == 'AveragePool':
+ return g.op('GlobalAveragePool', input)
+ if not input.isCompleteTensor():
+ if output_size == [1] * len(output_size):
+ return g.op('GlobalMaxPool', input), None
+ raise NotImplementedError(
+                '[Adaptive pool]: input size not accessible')
+ dim = input.type().sizes()[2:]
+ if output_size == [1] * len(output_size) and type == 'MaxPool':
+ return g.op('GlobalMaxPool', input), None
+
+ # compute stride = floor(input_size / output_size)
+ s = [int(dim[i] / output_size[i]) for i in range(0, len(dim))]
+
+ # compute kernel_size = input_size - (output_size - 1) * stride
+ k = [dim[i] - (output_size[i] - 1) * s[i] for i in range(0, len(dim))]
+
+ # call max_poolxd_with_indices to get indices in the output
+ if type == 'MaxPool':
+ return fn(g, input, k, k, (0, ) * len(dim), (1, ) * len(dim),
+ False)
+ output = g.op(
+ type,
+ input,
+ kernel_shape_i=tuple_fn(k),
+ strides_i=tuple_fn(s),
+ ceil_mode_i=False)
+ return output
+
+ return symbolic_fn
+
+
+adaptive_avg_pool1d = _adaptive_pool('adaptive_avg_pool1d', 'AveragePool',
+ _single)
+adaptive_avg_pool2d = _adaptive_pool('adaptive_avg_pool2d', 'AveragePool',
+ _pair)
+adaptive_avg_pool3d = _adaptive_pool('adaptive_avg_pool3d', 'AveragePool',
+ _triple)
+
+
+def new_full(g,
+ self,
+ size,
+ fill_value,
+ dtype,
+ layout,
+ device,
+ pin_memory=False):
+ from torch.onnx.symbolic_opset9 import full
+ if dtype is None and self.isCompleteTensor():
+ dtype = self.type().scalarType()
+ dtype = sym_help.scalar_type_to_onnx.index(
+ sym_help.cast_pytorch_to_onnx[dtype])
+ return full(g, size, fill_value, dtype, layout, device, pin_memory)
+
+
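+# The symbolics below emit custom `mmcv::` nodes; exported models are
+# only runnable by backends that load the mmcv custom-op plugins (see
+# `is_custom_op_loaded`).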
+@parse_args('v', 'v', 'i', 'i', 'i')
+def grid_sampler(g,
+ input,
+ grid,
+ interpolation_mode,
+ padding_mode,
+ align_corners=False):
+ return g.op(
+ 'mmcv::grid_sampler',
+ input,
+ grid,
+ interpolation_mode_i=interpolation_mode,
+ padding_mode_i=padding_mode,
+ align_corners_i=align_corners)
+
+
+@parse_args('v', 'i')
+def cummax(g, input, dim):
+ return g.op('mmcv::cummax', input, dim_i=dim, outputs=2)
+
+
+@parse_args('v', 'i')
+def cummin(g, input, dim):
+ return g.op('mmcv::cummin', input, dim_i=dim, outputs=2)
+
+
+@parse_args('v', 'v', 'is')
+def roll(g, input, shifts, dims):
+ from torch.onnx.symbolic_opset9 import squeeze
+ from packaging import version
+ input_shape = g.op('Shape', input)
+
+ need_flatten = len(dims) == 0
+ # If dims is not specified, the tensor will be flattened before
+ # rolling and then restored to the original shape.
+ if need_flatten:
+ resize_shape = input_shape
+ input = g.op('Reshape', input,
+ g.op('Constant', value_t=torch.LongTensor([1, -1])))
+ input_shape = g.op('Shape', input)
+ dims = [1]
+
+ for index, dim in enumerate(dims):
+ end_size = sym_help._slice_helper(
+ g, input_shape, axes=[0], ends=[dim + 1], starts=[dim])
+ shift_size = sym_help._slice_helper(
+ g, shifts, axes=[0], ends=[index + 1], starts=[index])
+ slice_size = g.op('Sub', end_size, shift_size)
+
+        # Cannot use the Mod op here because TensorRT does not support it
+ div_size = g.op('Div', slice_size, end_size)
+ slice_size = g.op('Sub', slice_size, g.op('Mul', end_size, div_size))
+
+ if version.parse(torch.__version__) >= version.parse('1.7.0'):
+ # add dim=0 for pytorch 1.9.0
+ end_size = squeeze(g, end_size, 0)
+ slice_size = squeeze(g, slice_size, 0)
+ else:
+ end_size = g.op('Squeeze', end_size)
+ slice_size = g.op('Squeeze', slice_size)
+ dim = torch.LongTensor([dim])
+
+ input_slice0 = sym_help._slice_helper(
+ g,
+ input,
+ axes=dim,
+ starts=torch.LongTensor([0]),
+ ends=slice_size,
+ dynamic_slice=True)
+ input_slice1 = sym_help._slice_helper(
+ g,
+ input,
+ axes=dim,
+ ends=end_size,
+ starts=slice_size,
+ dynamic_slice=True)
+
+ input = g.op('Concat', input_slice1, input_slice0, axis_i=dim)
+
+ if need_flatten:
+ input = g.op('Reshape', input, resize_shape)
+
+ return input
+
+
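+# Typical usage (illustrative): call this once before `torch.onnx.export`
+# so the symbolics above replace the defaults for the chosen opset, e.g.
+#
+#     from mmcv.onnx import register_extra_symbolics
+#     register_extra_symbolics(opset=11)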
+def register_extra_symbolics(opset=11):
+ register_op('one_hot', one_hot, '', opset)
+ register_op('im2col', im2col, '', opset)
+ register_op('topk', topk, '', opset)
+ register_op('softmax', softmax, '', opset)
+ register_op('constant_pad_nd', constant_pad_nd, '', opset)
+ register_op('reflection_pad1d', reflection_pad1d, '', opset)
+ register_op('reflection_pad2d', reflection_pad2d, '', opset)
+ register_op('reflection_pad3d', reflection_pad3d, '', opset)
+ register_op('avg_pool1d', avg_pool1d, '', opset)
+ register_op('avg_pool2d', avg_pool2d, '', opset)
+ register_op('avg_pool3d', avg_pool3d, '', opset)
+ register_op('adaptive_avg_pool1d', adaptive_avg_pool1d, '', opset)
+ register_op('adaptive_avg_pool2d', adaptive_avg_pool2d, '', opset)
+ register_op('adaptive_avg_pool3d', adaptive_avg_pool3d, '', opset)
+ register_op('masked_select', masked_select, '', opset)
+ register_op('upsample_nearest1d', upsample_nearest1d, '', opset)
+ register_op('upsample_nearest2d', upsample_nearest2d, '', opset)
+ register_op('upsample_nearest3d', upsample_nearest3d, '', opset)
+ register_op('upsample_linear1d', upsample_linear1d, '', opset)
+ register_op('upsample_bilinear2d', upsample_bilinear2d, '', opset)
+ register_op('upsample_trilinear3d', upsample_trilinear3d, '', opset)
+ register_op('upsample_bicubic2d', upsample_bicubic2d, '', opset)
+ register_op('new_full', new_full, '', opset)
+ register_op('grid_sampler', grid_sampler, '', opset)
+ register_op('cummax', cummax, '', opset)
+ register_op('cummin', cummin, '', opset)
+ register_op('roll', roll, '', opset)
diff --git a/mmcv/ops/__init__.py b/mmcv/ops/__init__.py
old mode 100755
new mode 100644
index cffbd23fd465af95d5e6b1e089a8023e447b30fa..999e090a458ee148ceca0649f1e3806a40e909bd
--- a/mmcv/ops/__init__.py
+++ b/mmcv/ops/__init__.py
@@ -1,19 +1,12 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from .active_rotated_filter import active_rotated_filter
from .assign_score_withk import assign_score_withk
from .ball_query import ball_query
from .bbox import bbox_overlaps
-from .bezier_align import BezierAlign, bezier_align
-from .bias_act import bias_act
from .border_align import BorderAlign, border_align
-from .box_iou_quadri import box_iou_quadri
from .box_iou_rotated import box_iou_rotated
from .carafe import CARAFE, CARAFENaive, CARAFEPack, carafe, carafe_naive
from .cc_attention import CrissCrossAttention
-from .chamfer_distance import chamfer_distance
from .contour_expand import contour_expand
-from .conv2d_gradfix import conv2d, conv_transpose2d
-from .convex_iou import convex_giou, convex_iou
from .corner_pool import CornerPool
from .correlation import Correlation
from .deform_conv import DeformConv2d, DeformConv2dPack, deform_conv2d
@@ -23,8 +16,6 @@ from .deprecated_wrappers import Conv2d_deprecated as Conv2d
from .deprecated_wrappers import ConvTranspose2d_deprecated as ConvTranspose2d
from .deprecated_wrappers import Linear_deprecated as Linear
from .deprecated_wrappers import MaxPool2d_deprecated as MaxPool2d
-from .diff_iou_rotated import diff_iou_rotated_2d, diff_iou_rotated_3d
-from .filtered_lrelu import filtered_lrelu
from .focal_loss import (SigmoidFocalLoss, SoftmaxFocalLoss,
sigmoid_focal_loss, softmax_focal_loss)
from .furthest_point_sample import (furthest_point_sample,
@@ -32,46 +23,35 @@ from .furthest_point_sample import (furthest_point_sample,
from .fused_bias_leakyrelu import FusedBiasLeakyReLU, fused_bias_leakyrelu
from .gather_points import gather_points
from .group_points import GroupAll, QueryAndGroup, grouping_operation
-from .info import get_compiler_version, get_compiling_cuda_version
-from .iou3d import (boxes_iou3d, boxes_iou_bev, boxes_overlap_bev, nms3d,
- nms3d_normal, nms_bev, nms_normal_bev)
+from .info import (get_compiler_version, get_compiling_cuda_version,
+ get_onnxruntime_op_path)
+from .iou3d import boxes_iou_bev, nms_bev, nms_normal_bev
from .knn import knn
from .masked_conv import MaskedConv2d, masked_conv2d
-from .min_area_polygons import min_area_polygons
from .modulated_deform_conv import (ModulatedDeformConv2d,
ModulatedDeformConv2dPack,
modulated_deform_conv2d)
from .multi_scale_deform_attn import MultiScaleDeformableAttention
-from .nms import batched_nms, nms, nms_match, nms_quadri, nms_rotated, soft_nms
+from .nms import batched_nms, nms, nms_match, nms_rotated, soft_nms
from .pixel_group import pixel_group
from .point_sample import (SimpleRoIAlign, point_sample,
rel_roi_point_to_rel_img_point)
from .points_in_boxes import (points_in_boxes_all, points_in_boxes_cpu,
points_in_boxes_part)
-from .points_in_polygons import points_in_polygons
from .points_sampler import PointsSampler
-from .prroi_pool import PrRoIPool, prroi_pool
from .psa_mask import PSAMask
-from .riroi_align_rotated import RiRoIAlignRotated, riroi_align_rotated
from .roi_align import RoIAlign, roi_align
from .roi_align_rotated import RoIAlignRotated, roi_align_rotated
from .roi_pool import RoIPool, roi_pool
from .roiaware_pool3d import RoIAwarePool3d
from .roipoint_pool3d import RoIPointPool3d
-from .rotated_feature_align import rotated_feature_align
from .saconv import SAConv2d
from .scatter_points import DynamicScatter, dynamic_scatter
-from .sparse_conv import (SparseConv2d, SparseConv3d, SparseConvTranspose2d,
- SparseConvTranspose3d, SparseInverseConv2d,
- SparseInverseConv3d, SubMConv2d, SubMConv3d)
-from .sparse_modules import SparseModule, SparseSequential
-from .sparse_pool import SparseMaxPool2d, SparseMaxPool3d
-from .sparse_structure import SparseConvTensor, scatter_nd
from .sync_bn import SyncBatchNorm
from .three_interpolate import three_interpolate
from .three_nn import three_nn
from .tin_shift import TINShift, tin_shift
-from .upfirdn2d import filter2d, upfirdn2d, upsample2d
+from .upfirdn2d import upfirdn2d
from .voxelize import Voxelization, voxelization
__all__ = [
@@ -80,32 +60,22 @@ __all__ = [
'deform_conv2d', 'DeformRoIPool', 'DeformRoIPoolPack',
'ModulatedDeformRoIPoolPack', 'deform_roi_pool', 'SigmoidFocalLoss',
'SoftmaxFocalLoss', 'sigmoid_focal_loss', 'softmax_focal_loss',
- 'get_compiler_version', 'get_compiling_cuda_version', 'MaskedConv2d',
- 'masked_conv2d', 'ModulatedDeformConv2d', 'ModulatedDeformConv2dPack',
+ 'get_compiler_version', 'get_compiling_cuda_version',
+ 'get_onnxruntime_op_path', 'MaskedConv2d', 'masked_conv2d',
+ 'ModulatedDeformConv2d', 'ModulatedDeformConv2dPack',
'modulated_deform_conv2d', 'batched_nms', 'nms', 'soft_nms', 'nms_match',
'RoIAlign', 'roi_align', 'RoIPool', 'roi_pool', 'SyncBatchNorm', 'Conv2d',
'ConvTranspose2d', 'Linear', 'MaxPool2d', 'CrissCrossAttention', 'PSAMask',
'point_sample', 'rel_roi_point_to_rel_img_point', 'SimpleRoIAlign',
'SAConv2d', 'TINShift', 'tin_shift', 'assign_score_withk',
- 'box_iou_rotated', 'box_iou_quadri', 'RoIPointPool3d', 'nms_rotated',
- 'knn', 'ball_query', 'upfirdn2d', 'FusedBiasLeakyReLU',
- 'fused_bias_leakyrelu', 'rotated_feature_align', 'RiRoIAlignRotated',
- 'riroi_align_rotated', 'RoIAlignRotated', 'roi_align_rotated',
- 'pixel_group', 'QueryAndGroup', 'GroupAll', 'grouping_operation',
- 'contour_expand', 'three_nn', 'three_interpolate',
- 'MultiScaleDeformableAttention', 'BorderAlign', 'border_align',
- 'gather_points', 'furthest_point_sample', 'nms_quadri',
+ 'box_iou_rotated', 'RoIPointPool3d', 'nms_rotated', 'knn', 'ball_query',
+ 'upfirdn2d', 'FusedBiasLeakyReLU', 'fused_bias_leakyrelu',
+ 'RoIAlignRotated', 'roi_align_rotated', 'pixel_group', 'QueryAndGroup',
+ 'GroupAll', 'grouping_operation', 'contour_expand', 'three_nn',
+ 'three_interpolate', 'MultiScaleDeformableAttention', 'BorderAlign',
+ 'border_align', 'gather_points', 'furthest_point_sample',
'furthest_point_sample_with_dist', 'PointsSampler', 'Correlation',
- 'boxes_iou3d', 'boxes_iou_bev', 'boxes_overlap_bev', 'nms_bev',
- 'nms_normal_bev', 'nms3d', 'nms3d_normal', 'Voxelization', 'voxelization',
- 'dynamic_scatter', 'DynamicScatter', 'RoIAwarePool3d', 'SparseConv2d',
- 'SparseConv3d', 'SparseConvTranspose2d', 'SparseConvTranspose3d',
- 'SparseInverseConv2d', 'SparseInverseConv3d', 'SubMConv2d', 'SubMConv3d',
- 'SparseModule', 'SparseSequential', 'SparseMaxPool2d', 'SparseMaxPool3d',
- 'SparseConvTensor', 'scatter_nd', 'points_in_boxes_part',
- 'points_in_boxes_cpu', 'points_in_boxes_all', 'points_in_polygons',
- 'min_area_polygons', 'active_rotated_filter', 'convex_iou', 'convex_giou',
- 'diff_iou_rotated_2d', 'diff_iou_rotated_3d', 'chamfer_distance',
- 'PrRoIPool', 'prroi_pool', 'bias_act', 'filtered_lrelu', 'conv2d',
- 'conv_transpose2d', 'filter2d', 'upsample2d', 'BezierAlign', 'bezier_align'
+ 'boxes_iou_bev', 'nms_bev', 'nms_normal_bev', 'Voxelization',
+ 'voxelization', 'dynamic_scatter', 'DynamicScatter', 'RoIAwarePool3d',
+ 'points_in_boxes_part', 'points_in_boxes_cpu', 'points_in_boxes_all'
]
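
For context, a minimal sketch of the build-info helpers whose export this hunk restores (`get_onnxruntime_op_path`) or keeps. It assumes an installed build of the patched mmcv with its compiled `_ext` extension:

```python
# Querying the build helpers re-exported from mmcv.ops above.
from mmcv.ops import (get_compiler_version, get_compiling_cuda_version,
                      get_onnxruntime_op_path)

print(get_compiler_version())        # compiler used to build '_ext'
print(get_compiling_cuda_version())  # CUDA version '_ext' was compiled with
print(get_onnxruntime_op_path())     # ORT custom-op library path, '' if absent
```
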
diff --git a/mmcv/ops/active_rotated_filter.py b/mmcv/ops/active_rotated_filter.py
deleted file mode 100644
index b8ba43dd41cca14e0d74b4ba7dd8316da2ba4abe..0000000000000000000000000000000000000000
--- a/mmcv/ops/active_rotated_filter.py
+++ /dev/null
@@ -1,64 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Tuple
-
-import torch
-from torch.autograd import Function
-from torch.autograd.function import once_differentiable
-
-from ..utils import ext_loader
-
-ext_module = ext_loader.load_ext(
- '_ext',
- ['active_rotated_filter_forward', 'active_rotated_filter_backward'])
-
-
-class ActiveRotatedFilterFunction(Function):
- """Encoding the orientation information and generating orientation-
- sensitive features.
-
- The details are described in the paper `Align Deep Features for Oriented
- Object Detection <https://arxiv.org/abs/2008.09397>`_.
- """
-
- @staticmethod
- def forward(ctx, input: torch.Tensor,
- indices: torch.Tensor) -> torch.Tensor:
- """
- Args:
- input (torch.Tensor): Input features with shape
- [num_output_planes, num_input_planes, num_orientations, H, W].
- indices (torch.Tensor): Indices with shape
- [num_orientations, H, W, num_rotations].
-
- Returns:
- torch.Tensor: Refined features with shape [num_output_planes *
- num_rotations, num_input_planes * num_orientations, H, W].
- """
- ctx.save_for_backward(input, indices)
- op, ip, o, h, w = input.size()
- o, h, w, r = indices.size()
- output = input.new_zeros((op * r, ip * o, h, w))
- ext_module.active_rotated_filter_forward(input, indices, output)
-
- return output
-
- @staticmethod
- @once_differentiable
- def backward(ctx, grad_out: torch.Tensor) -> Tuple[torch.Tensor, None]:
- """
- Args:
- grad_out (torch.Tensor): The gradient of output features
- with shape [num_output_planes * num_rotations,
- num_input_planes * num_orientations, H, W].
-
- Returns:
- torch.Tensor: The gradient of input features with shape
- [num_output_planes, num_input_planes, num_orientations, H, W].
- """
- input, indices = ctx.saved_tensors
- grad_in = torch.zeros_like(input)
- ext_module.active_rotated_filter_backward(grad_out, indices, grad_in)
- return grad_in, None
-
-
-active_rotated_filter = ActiveRotatedFilterFunction.apply
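
For reference, the shape contract of the op removed above, as a hypothetical sketch: the random `indices` stand in for the rotation-index tables real models precompute, and it assumes an mmcv build that still ships the op, with CUDA available.

```python
import torch
from mmcv.ops import active_rotated_filter  # only in builds that ship this op

out_p, in_p, orient, h, w, rot = 32, 16, 8, 3, 3, 8
feats = torch.randn(out_p, in_p, orient, h, w).cuda()
# Placeholder indices; real ones encode the orientation remapping.
indices = torch.randint(0, orient * h * w, (orient, h, w, rot),
                        dtype=torch.int32).cuda()
refined = active_rotated_filter(feats, indices)
assert refined.shape == (out_p * rot, in_p * orient, h, w)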
diff --git a/mmcv/ops/assign_score_withk.py b/mmcv/ops/assign_score_withk.py
index deca0892bddc52b51e9d2543a9e893f0bd67ebdb..4906adaa2cffd1b46912fbe7d4f87ef2f9fa0012 100644
--- a/mmcv/ops/assign_score_withk.py
+++ b/mmcv/ops/assign_score_withk.py
@@ -1,6 +1,3 @@
-from typing import Tuple
-
-import torch
from torch.autograd import Function
from ..utils import ext_loader
@@ -30,11 +27,11 @@ class AssignScoreWithK(Function):
@staticmethod
def forward(ctx,
- scores: torch.Tensor,
- point_features: torch.Tensor,
- center_features: torch.Tensor,
- knn_idx: torch.Tensor,
- aggregate: str = 'sum') -> torch.Tensor:
+ scores,
+ point_features,
+ center_features,
+ knn_idx,
+ aggregate='sum'):
"""
Args:
scores (torch.Tensor): (B, npoint, K, M), predicted scores to
@@ -81,20 +78,15 @@ class AssignScoreWithK(Function):
return output
@staticmethod
- def backward(
- ctx, grad_out: torch.Tensor
- ) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, None, None]:
+ def backward(ctx, grad_out):
"""
Args:
grad_out (torch.Tensor): (B, out_dim, npoint, K)
Returns:
- tuple[torch.Tensor]: A tuple contains five elements. The first one
- is the gradient of ``scores`` whose shape is (B, npoint, K, M). The
- second is the gradient of ``point_features`` whose shape is
- (B, N, M, out_dim). The third is the gradient of
- ``center_features`` with the shape of (B, N, M, out_dim). The last
- two are ``None``.
+ grad_scores (torch.Tensor): (B, npoint, K, M)
+ grad_point_features (torch.Tensor): (B, N, M, out_dim)
+ grad_center_features (torch.Tensor): (B, N, M, out_dim)
"""
_, point_features, center_features, scores, knn_idx = ctx.saved_tensors
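
The docstring shapes above translate into the following call sketch (illustrative sizes, assuming a CUDA build of `_ext`):

```python
import torch
from mmcv.ops import assign_score_withk

B, N, npoint, K, M, out_dim = 2, 64, 16, 8, 4, 32
scores = torch.randn(B, npoint, K, M).cuda()
point_feats = torch.randn(B, N, M, out_dim).cuda()
center_feats = torch.randn(B, N, M, out_dim).cuda()
knn_idx = torch.randint(0, N, (B, npoint, K)).long().cuda()
out = assign_score_withk(scores, point_feats, center_feats, knn_idx, 'sum')
assert out.shape == (B, out_dim, npoint, K)
```
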
diff --git a/mmcv/ops/ball_query.py b/mmcv/ops/ball_query.py
index a89b36b52b1cce8ab90274418a4d1346796d971c..d0466847c6e5c1239e359a0397568413ebc1504a 100644
--- a/mmcv/ops/ball_query.py
+++ b/mmcv/ops/ball_query.py
@@ -1,86 +1,54 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Optional, Tuple
-
import torch
from torch.autograd import Function
from ..utils import ext_loader
-ext_module = ext_loader.load_ext(
- '_ext', ['ball_query_forward', 'stack_ball_query_forward'])
+ext_module = ext_loader.load_ext('_ext', ['ball_query_forward'])
class BallQuery(Function):
"""Find nearby points in spherical space."""
@staticmethod
- def forward(
- ctx,
- min_radius: float,
- max_radius: float,
- sample_num: int,
- xyz: torch.Tensor,
- center_xyz: torch.Tensor,
- xyz_batch_cnt: Optional[torch.Tensor] = None,
- center_xyz_batch_cnt: Optional[torch.Tensor] = None
- ) -> torch.Tensor:
+ def forward(ctx, min_radius: float, max_radius: float, sample_num: int,
+ xyz: torch.Tensor, center_xyz: torch.Tensor) -> torch.Tensor:
"""
Args:
min_radius (float): minimum radius of the balls.
max_radius (float): maximum radius of the balls.
sample_num (int): maximum number of features in the balls.
- xyz (torch.Tensor): (B, N, 3) xyz coordinates of the features,
- or stacked input (N1 + N2 ..., 3).
- center_xyz (torch.Tensor): (B, npoint, 3) centers of the ball
- query, or stacked input (M1 + M2 ..., 3).
- xyz_batch_cnt: (batch_size): Stacked input xyz coordinate counts in
- each batch, just like (N1, N2, ...). Defaults to None.
- New in version 1.7.0.
- center_xyz_batch_cnt: (batch_size): Stacked center coordinate
- counts in each batch, just like (M1, M2, ...). Defaults to None.
- New in version 1.7.0.
+ xyz (Tensor): (B, N, 3) xyz coordinates of the features.
+ center_xyz (Tensor): (B, npoint, 3) centers of the ball query.
Returns:
- torch.Tensor: (B, npoint, nsample) tensor with the indices of the
- features that form the query balls.
+ Tensor: (B, npoint, nsample) tensor with the indices of
+ the features that form the query balls.
"""
assert center_xyz.is_contiguous()
assert xyz.is_contiguous()
assert min_radius < max_radius
- if xyz_batch_cnt is not None and center_xyz_batch_cnt is not None:
- assert xyz_batch_cnt.dtype == torch.int
- assert center_xyz_batch_cnt.dtype == torch.int
- idx = center_xyz.new_zeros((center_xyz.shape[0], sample_num),
- dtype=torch.int32)
- ext_module.stack_ball_query_forward(
- center_xyz,
- center_xyz_batch_cnt,
- xyz,
- xyz_batch_cnt,
- idx,
- max_radius=max_radius,
- nsample=sample_num,
- )
- else:
- B, N, _ = xyz.size()
- npoint = center_xyz.size(1)
- idx = xyz.new_zeros(B, npoint, sample_num, dtype=torch.int32)
- ext_module.ball_query_forward(
- center_xyz,
- xyz,
- idx,
- b=B,
- n=N,
- m=npoint,
- min_radius=min_radius,
- max_radius=max_radius,
- nsample=sample_num)
+
+ B, N, _ = xyz.size()
+ npoint = center_xyz.size(1)
+ idx = xyz.new_zeros(B, npoint, sample_num, dtype=torch.int)
+
+ ext_module.ball_query_forward(
+ center_xyz,
+ xyz,
+ idx,
+ b=B,
+ n=N,
+ m=npoint,
+ min_radius=min_radius,
+ max_radius=max_radius,
+ nsample=sample_num)
if torch.__version__ != 'parrots':
ctx.mark_non_differentiable(idx)
return idx
@staticmethod
- def backward(ctx, a=None) -> Tuple[None, None, None, None]:
+ def backward(ctx, a=None):
return None, None, None, None
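
After this hunk only the batched (B, N, 3) interface remains; a usage sketch with illustrative sizes follows (the kernel is CUDA-only):

```python
import torch
from mmcv.ops import ball_query

xyz = torch.randn(2, 256, 3).cuda()        # (B, N, 3) points
center_xyz = torch.randn(2, 32, 3).cuda()  # (B, npoint, 3) query centers
idx = ball_query(0.0, 0.4, 16, xyz, center_xyz)  # up to 16 neighbors each
assert idx.shape == (2, 32, 16) and idx.dtype == torch.int32
```
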
diff --git a/mmcv/ops/bbox.py b/mmcv/ops/bbox.py
index 4ba93d6b2268cae28555447805f5084aa6616226..0c4d58b6c91f652933974f519acd3403a833e906 100644
--- a/mmcv/ops/bbox.py
+++ b/mmcv/ops/bbox.py
@@ -1,57 +1,10 @@
# Copyright (c) OpenMMLab. All rights reserved.
-import torch
-
from ..utils import ext_loader
ext_module = ext_loader.load_ext('_ext', ['bbox_overlaps'])
-def _bbox_overlaps_cpu(bboxes1: torch.Tensor,
- bboxes2: torch.Tensor,
- mode: str = 'iou',
- aligned: bool = False,
- offset: int = 0) -> torch.Tensor:
- assert mode in ['iou', 'iof']
-
- if aligned:
- lt = torch.max(bboxes1[:, :2], bboxes2[:, :2]) # [rows, 2]
- rb = torch.min(bboxes1[:, 2:], bboxes2[:, 2:]) # [rows, 2]
-
- wh = (rb - lt + offset).clamp(min=0) # [rows, 2]
- overlap = wh[:, 0] * wh[:, 1]
- area1 = (bboxes1[:, 2] - bboxes1[:, 0] + offset) * (
- bboxes1[:, 3] - bboxes1[:, 1] + offset)
-
- if mode == 'iou':
- area2 = (bboxes2[:, 2] - bboxes2[:, 0] + offset) * (
- bboxes2[:, 3] - bboxes2[:, 1] + offset)
- ious = overlap / (area1 + area2 - overlap)
- else:
- ious = overlap / area1
- else:
- lt = torch.max(bboxes1[:, None, :2], bboxes2[:, :2]) # [rows, cols, 2]
- rb = torch.min(bboxes1[:, None, 2:], bboxes2[:, 2:]) # [rows, cols, 2]
-
- wh = (rb - lt + offset).clamp(min=0) # [rows, cols, 2]
- overlap = wh[:, :, 0] * wh[:, :, 1]
- area1 = (bboxes1[:, 2] - bboxes1[:, 0] + offset) * (
- bboxes1[:, 3] - bboxes1[:, 1] + offset)
-
- if mode == 'iou':
- area2 = (bboxes2[:, 2] - bboxes2[:, 0] + offset) * (
- bboxes2[:, 3] - bboxes2[:, 1] + offset)
- ious = overlap / (area1[:, None] + area2 - overlap)
- else:
- ious = overlap / (area1[:, None])
-
- return ious
-
-
-def bbox_overlaps(bboxes1: torch.Tensor,
- bboxes2: torch.Tensor,
- mode: str = 'iou',
- aligned: bool = False,
- offset: int = 0) -> torch.Tensor:
+def bbox_overlaps(bboxes1, bboxes2, mode='iou', aligned=False, offset=0):
"""Calculate overlap between two set of bboxes.
If ``aligned`` is ``False``, then calculate the ious between each bbox
@@ -59,16 +12,14 @@ def bbox_overlaps(bboxes1: torch.Tensor,
bboxes1 and bboxes2.
Args:
- bboxes1 (torch.Tensor): shape (m, 4) in <x1, y1, x2, y2> format
- or empty.
- bboxes2 (torch.Tensor): shape (n, 4) in <x1, y1, x2, y2> format
- or empty. If aligned is ``True``, then m and n must be equal.
+ bboxes1 (Tensor): shape (m, 4) in <x1, y1, x2, y2> format or empty.
+ bboxes2 (Tensor): shape (n, 4) in <x1, y1, x2, y2> format or empty.
+ If aligned is ``True``, then m and n must be equal.
mode (str): "iou" (intersection over union) or iof (intersection over
foreground).
Returns:
- torch.Tensor: Return the ious between boxes. If ``aligned`` is
- ``False``, the shape of ious is (m, n) else (m, 1).
+ ious(Tensor): shape (m, n) if aligned == False else shape (m, 1)
Example:
>>> bboxes1 = torch.FloatTensor([
@@ -106,17 +57,16 @@ def bbox_overlaps(bboxes1: torch.Tensor,
rows = bboxes1.size(0)
cols = bboxes2.size(0)
-
if aligned:
assert rows == cols
- ious = bboxes1.new_zeros(rows)
- else:
- ious = bboxes1.new_zeros((rows, cols))
if rows * cols == 0:
- return ious
+ return bboxes1.new(rows, 1) if aligned else bboxes1.new(rows, cols)
+ if aligned:
+ ious = bboxes1.new_zeros(rows)
+ else:
+ ious = bboxes1.new_zeros((rows, cols))
ext_module.bbox_overlaps(
bboxes1, bboxes2, ious, mode=mode_flag, aligned=aligned, offset=offset)
-
return ious
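
A usage sketch for the reverted `bbox_overlaps`. Since the Python CPU fallback (`_bbox_overlaps_cpu`) is deleted above, this assumes the compiled extension handles the tensors' device; CUDA tensors are shown to be safe:

```python
import torch
from mmcv.ops import bbox_overlaps

bboxes1 = torch.tensor([[0., 0., 10., 10.], [10., 10., 20., 20.]]).cuda()
bboxes2 = torch.tensor([[0., 0., 10., 20.], [0., 10., 10., 19.]]).cuda()
ious = bbox_overlaps(bboxes1, bboxes2)                    # (2, 2) pairwise
ious_row = bbox_overlaps(bboxes1, bboxes2, aligned=True)  # one IoU per row
```
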
diff --git a/mmcv/ops/bezier_align.py b/mmcv/ops/bezier_align.py
deleted file mode 100644
index 6db7f5c8d8567b4c6ad5df2eb77f6cf60a4f0bb6..0000000000000000000000000000000000000000
--- a/mmcv/ops/bezier_align.py
+++ /dev/null
@@ -1,137 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Tuple, Union
-
-import torch
-import torch.nn as nn
-from torch.autograd import Function
-from torch.autograd.function import once_differentiable
-from torch.nn.modules.utils import _pair
-
-from ..utils import ext_loader
-
-ext_module = ext_loader.load_ext(
- '_ext', ['bezier_align_forward', 'bezier_align_backward'])
-
-
-class BezierAlignFunction(Function):
-
- @staticmethod
- def forward(ctx,
- input: torch.Tensor,
- beziers: torch.Tensor,
- output_size: Union[int, Tuple[int, int]],
- spatial_scale: Union[int, float] = 1.0,
- sampling_ratio: int = 0,
- aligned: bool = True) -> torch.Tensor:
- ctx.output_size = _pair(output_size)
- ctx.spatial_scale = spatial_scale
- ctx.input_shape = input.size()
- ctx.sampling_ratio = sampling_ratio
- ctx.aligned = aligned
-
- assert beziers.size(1) == 17
- output_shape = (beziers.size(0), input.size(1), ctx.output_size[0],
- ctx.output_size[1])
- output = input.new_zeros(output_shape)
- ext_module.bezier_align_forward(
- input,
- beziers,
- output,
- aligned_height=ctx.output_size[0],
- aligned_width=ctx.output_size[1],
- spatial_scale=ctx.spatial_scale,
- sampling_ratio=ctx.sampling_ratio,
- aligned=ctx.aligned)
-
- ctx.save_for_backward(beziers)
- return output
-
- @staticmethod
- @once_differentiable
- def backward(ctx, grad_output: torch.Tensor):
- beziers = ctx.saved_tensors[0]
- grad_input = grad_output.new_zeros(ctx.input_shape)
- grad_output = grad_output.contiguous()
- ext_module.bezier_align_backward(
- grad_output,
- beziers,
- grad_input,
- aligned_height=ctx.output_size[0],
- aligned_width=ctx.output_size[1],
- spatial_scale=ctx.spatial_scale,
- sampling_ratio=ctx.sampling_ratio,
- aligned=ctx.aligned)
- return grad_input, None, None, None, None, None
-
-
-bezier_align = BezierAlignFunction.apply
-
-
-class BezierAlign(nn.Module):
- """Bezier align pooling layer.
-
- Args:
- output_size (tuple): h, w
- spatial_scale (float): scale the input boxes by this number
- sampling_ratio (int): number of inputs samples to take for each
- output sample. 0 to take samples densely for current models.
- aligned (bool): if False, use the legacy implementation in
- MMDetection. If True, align the results more perfectly.
-
- Note:
- The implementation of BezierAlign is modified from
- https://github.com/aim-uofa/AdelaiDet
-
- The meaning of aligned=True:
-
- Given a continuous coordinate c, its two neighboring pixel
- indices (in our pixel model) are computed by floor(c - 0.5) and
- ceil(c - 0.5). For example, c=1.3 has pixel neighbors with discrete
- indices [0] and [1] (which are sampled from the underlying signal
- at continuous coordinates 0.5 and 1.5). But the original roi_align
- (aligned=False) does not subtract the 0.5 when computing
- neighboring pixel indices and therefore it uses pixels with a
- slightly incorrect alignment (relative to our pixel model) when
- performing bilinear interpolation.
-
- With `aligned=True`,
- we first appropriately scale the ROI and then shift it by -0.5
- prior to calling roi_align. This produces the correct neighbors;
-
- The difference does not make a difference to the model's
- performance if ROIAlign is used together with conv layers.
- """
-
- def __init__(
- self,
- output_size: Tuple,
- spatial_scale: Union[int, float],
- sampling_ratio: int,
- aligned: bool = True,
- ) -> None:
- super().__init__()
-
- self.output_size = _pair(output_size)
- self.spatial_scale = float(spatial_scale)
- self.sampling_ratio = int(sampling_ratio)
- self.aligned = aligned
-
- def forward(self, input: torch.Tensor,
- beziers: torch.Tensor) -> torch.Tensor:
- """BezierAlign forward.
-
- Args:
- input (Tensor): input features.
- beziers (Tensor): Bezier control points for alignment, shape (N, 17).
- """
- return bezier_align(input, beziers, self.output_size,
- self.spatial_scale, self.sampling_ratio,
- self.aligned)
-
- def __repr__(self):
- s = self.__class__.__name__
- s += f'(output_size={self.output_size}, '
- s += f'spatial_scale={self.spatial_scale}, '
- s += f'sampling_ratio={self.sampling_ratio}, '
- s += f'aligned={self.aligned})'
- return s
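
A hypothetical usage sketch of the removed `BezierAlign`, for builds that still ship it. Reading the (N, 17) assertion as one batch-index column plus 8 control points (x, y) is my assumption, not something this patch states:

```python
import torch
from mmcv.ops import BezierAlign  # only in builds that ship this op

align = BezierAlign(output_size=(7, 7), spatial_scale=1.0, sampling_ratio=0)
feats = torch.randn(1, 16, 32, 32)
# Column 0: batch index (assumed); columns 1-16: 8 Bezier control points.
beziers = torch.cat([torch.zeros(4, 1), torch.rand(4, 16) * 32], dim=1)
pooled = align(feats, beziers)  # (4, 16, 7, 7); use CUDA tensors if the
                                # build has no CPU kernel for this op
```
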
diff --git a/mmcv/ops/bias_act.py b/mmcv/ops/bias_act.py
deleted file mode 100644
index 3dfa55743e0a0a6e8ad408c5937d9097cce6ea7d..0000000000000000000000000000000000000000
--- a/mmcv/ops/bias_act.py
+++ /dev/null
@@ -1,375 +0,0 @@
-# Modified from
-# https://github.com/NVlabs/stylegan3/blob/main/torch_utils/ops/bias_act.py
-
-# Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-#
-# NVIDIA CORPORATION and its licensors retain all intellectual property
-# and proprietary rights in and to this software, related documentation
-# and any modifications thereto. Any use, reproduction, disclosure or
-# distribution of this software and related documentation without an express
-# license agreement from NVIDIA CORPORATION is strictly prohibited.
-
-# source: https://github.com/open-mmlab/mmediting/blob/dev-1.x/mmedit/models/editors/stylegan3/stylegan3_ops/ops/bias_act.py # noqa
-"""Custom PyTorch ops for efficient bias and activation."""
-
-from typing import Any, Dict, Optional, Union
-
-import numpy as np
-import torch
-
-from ..utils import ext_loader
-
-ext_module = ext_loader.load_ext('_ext', ['bias_act'])
-
-
-class EasyDict(dict):
- """Convenience class that behaves like a dict but allows access with the
- attribute syntax."""
-
- def __getattr__(self, name: str) -> Any:
- try:
- return self[name]
- except KeyError:
- raise AttributeError(name)
-
- def __setattr__(self, name: str, value: Any) -> None:
- self[name] = value
-
- def __delattr__(self, name: str) -> None:
- del self[name]
-
-
-activation_funcs = {
- 'linear':
- EasyDict(
- func=lambda x, **_: x,
- def_alpha=0,
- def_gain=1,
- cuda_idx=1,
- ref='',
- has_2nd_grad=False),
- 'relu':
- EasyDict(
- func=lambda x, **_: torch.nn.functional.relu(x),
- def_alpha=0,
- def_gain=np.sqrt(2),
- cuda_idx=2,
- ref='y',
- has_2nd_grad=False),
- 'lrelu':
- EasyDict(
- func=lambda x, alpha, **_: torch.nn.functional.leaky_relu(x, alpha),
- def_alpha=0.2,
- def_gain=np.sqrt(2),
- cuda_idx=3,
- ref='y',
- has_2nd_grad=False),
- 'tanh':
- EasyDict(
- func=lambda x, **_: torch.tanh(x),
- def_alpha=0,
- def_gain=1,
- cuda_idx=4,
- ref='y',
- has_2nd_grad=True),
- 'sigmoid':
- EasyDict(
- func=lambda x, **_: torch.sigmoid(x),
- def_alpha=0,
- def_gain=1,
- cuda_idx=5,
- ref='y',
- has_2nd_grad=True),
- 'elu':
- EasyDict(
- func=lambda x, **_: torch.nn.functional.elu(x),
- def_alpha=0,
- def_gain=1,
- cuda_idx=6,
- ref='y',
- has_2nd_grad=True),
- 'selu':
- EasyDict(
- func=lambda x, **_: torch.nn.functional.selu(x),
- def_alpha=0,
- def_gain=1,
- cuda_idx=7,
- ref='y',
- has_2nd_grad=True),
- 'softplus':
- EasyDict(
- func=lambda x, **_: torch.nn.functional.softplus(x),
- def_alpha=0,
- def_gain=1,
- cuda_idx=8,
- ref='y',
- has_2nd_grad=True),
- 'swish':
- EasyDict(
- func=lambda x, **_: torch.sigmoid(x) * x,
- def_alpha=0,
- def_gain=np.sqrt(2),
- cuda_idx=9,
- ref='x',
- has_2nd_grad=True),
-}
-
-_null_tensor = torch.empty([0])
-
-
-def bias_act(input: torch.Tensor,
- bias: Optional[torch.Tensor] = None,
- dim: int = 1,
- act: str = 'linear',
- alpha: Optional[Union[float, int]] = None,
- gain: Optional[float] = None,
- clamp: Optional[float] = None,
- use_custom_op: bool = True):
- r"""Fused bias and activation function.
-
- Adds `bias` to activation tensor `input`, and evaluates activation
- function `act`, and scales the result by `gain`. Each of the steps is
- optional.
-
- In most cases, the fused op is considerably more efficient than performing
- the same calculation using standard PyTorch ops. It supports first and
- second order gradients, but not third order gradients.
-
- Args:
- input (torch.Tensor): Input activation tensor. Can be of any shape.
- bias (torch.Tensor): Bias vector, or `None` to disable.
- Must be a 1D tensor of the same type as `input`. The shape must
- be known, and it must match the dimension of `input` corresponding
- to `dim`. Defaults to None.
- dim (int): The dimension in `input` corresponding to the elements of
- `bias`. The value of `dim` is ignored if `b` is not specified.
- Defaults to 1.
- act (str): Name of the activation function to evaluate, or `"linear"`
- to disable. Can be e.g. "relu", "lrelu", "tanh", "sigmoid",
- "swish", etc. See `activation_funcs` for a full list. `None` is not
- allowed. Defaults to `linear`.
- alpha (float or int): Shape parameter for the activation
- function, or `None` to use the default. Defaults to None.
- gain (float): Scaling factor for the output tensor, or `None`
- to use default. See `activation_funcs` for the default scaling of
- each activation function. If unsure, consider specifying 1.
- Defaults to None.
- clamp (float): Clamp the output values to `[-clamp, +clamp]`,
- or `None` to disable the clamping (default). Defaults to None.
- use_custom_op (bool): Whether to use customized op.
- Defaults to True.
-
- Returns:
- torch.Tensor: Tensor of the same shape and datatype as `input`.
- """
- assert isinstance(input, torch.Tensor)
- if use_custom_op and input.is_cuda:
- return _bias_act_cuda(
- dim=dim, act=act, alpha=alpha, gain=gain,
- clamp=clamp).apply(input, bias)
- return _bias_act_ref(
- input=input,
- bias=bias,
- dim=dim,
- act=act,
- alpha=alpha,
- gain=gain,
- clamp=clamp)
-
-
-def _bias_act_ref(input: torch.Tensor,
- bias: Optional[torch.Tensor] = None,
- dim: int = 1,
- act: str = 'linear',
- alpha: Optional[Union[float, int]] = None,
- gain: Optional[float] = None,
- clamp: Optional[float] = None):
- """Slow reference implementation of `bias_act()` using standard PyTorch
- ops.
-
- Adds `bias` to activation tensor `input`, and evaluates activation
- function `act`, and scales the result by `gain`. Each of the steps is
- optional.
-
- In most cases, the fused op is considerably more efficient than performing
- the same calculation using standard PyTorch ops. It supports first and
- second order gradients, but not third order gradients.
-
- Args:
- input (torch.Tensor): Input activation tensor. Can be of any shape.
- bias (torch.Tensor): Bias vector, or `None` to disable.
- Must be a 1D tensor of the same type as `input`. The shape must
- be known, and it must match the dimension of `input` corresponding
- to `dim`. Defaults to None.
- dim (int): The dimension in `input` corresponding to the elements of
- `bias`. The value of `dim` is ignored if `b` is not specified.
- Defaults to 1.
- act (str): Name of the activation function to evaluate, or `"linear"`
- to disable. Can be e.g. "relu", "lrelu", "tanh", "sigmoid",
- "swish", etc. See `activation_funcs` for a full list. `None` is not
- allowed. Defaults to `linear`.
- alpha (float or int): Shape parameter for the activation
- function, or `None` to use the default. Defaults to None.
- gain (float): Scaling factor for the output tensor, or `None`
- to use default. See `activation_funcs` for the default scaling of
- each activation function. If unsure, consider specifying 1.
- Defaults to None.
- clamp (float): Clamp the output values to
- `[-clamp, +clamp]`, or `None` to disable the clamping (default).
- Defaults to None.
-
- Returns:
- torch.Tensor: Tensor of the same shape and datatype as `input`.
- """
- assert isinstance(input, torch.Tensor)
- assert clamp is None or clamp >= 0
- spec = activation_funcs[act]
- alpha = float(alpha if alpha is not None else spec.def_alpha)
- gain = float(gain if gain is not None else spec.def_gain)
- clamp = float(clamp if clamp is not None else -1)
-
- # Add bias.
- if bias is not None:
- assert isinstance(bias, torch.Tensor) and bias.ndim == 1
- assert 0 <= dim < input.ndim
- assert bias.shape[0] == input.shape[dim]
- input = input + bias.reshape(
- [-1 if i == dim else 1 for i in range(input.ndim)])
-
- # Evaluate activation function.
- alpha = float(alpha)
- output = spec.func(input, alpha=alpha)
-
- # Scale by gain.
- gain = float(gain)
- if gain != 1:
- output = output * gain
-
- # Clamp.
- if clamp >= 0:
- # pylint: disable=invalid-unary-operand-type
- output = output.clamp(-clamp, clamp)
- return output
-
-
-_bias_act_cuda_cache: Dict = dict()
-
-
-def _bias_act_cuda(dim: int = 1,
- act: str = 'linear',
- alpha: Optional[Union[float, int]] = None,
- gain: Optional[float] = None,
- clamp: Optional[float] = None):
- """"Fast CUDA implementation of `bias_act()` using custom ops.
-
- Args:
- dim (int): The dimension in `x` corresponding to the elements of `b`.
- The value of `dim` is ignored if `b` is not specified.
- Defaults to 1.
- act (str): Name of the activation function to evaluate, or `"linear"`
- to disable. Can be e.g. "relu", "lrelu", "tanh", "sigmoid",
- "swish", etc. See `activation_funcs` for a full list. `None` is not
- allowed. Defaults to `linear`.
- alpha (float | int): Shape parameter for the activation
- function, or `None` to use the default. Defaults to None.
- gain (float): Scaling factor for the output tensor, or `None`
- to use default. See `activation_funcs` for the default scaling of
- each activation function. If unsure, consider specifying 1.
- Defaults to None.
- clamp (float): Clamp the output values to `[-clamp, +clamp]`,
- or `None` to disable the clamping (default). Defaults to None.
-
- Returns:
- torch.Tensor: Tensor of the same shape and datatype as `x`.
- """
- # Parse arguments.
- assert clamp is None or clamp >= 0
- spec = activation_funcs[act]
- alpha = float(alpha if alpha is not None else spec.def_alpha)
- gain = float(gain if gain is not None else spec.def_gain)
- clamp = float(clamp if clamp is not None else -1)
-
- # Lookup from cache.
- key = (dim, act, alpha, gain, clamp)
- if key in _bias_act_cuda_cache:
- return _bias_act_cuda_cache[key]
-
- # Forward op.
- class BiasActCuda(torch.autograd.Function):
-
- @staticmethod
- def forward(ctx, x, b): # pylint: disable=arguments-differ
- ctx.memory_format = torch.channels_last if x.ndim > 2 and x.stride(
- 1) == 1 else torch.contiguous_format
- x = x.contiguous(memory_format=ctx.memory_format)
- b = b.contiguous() if b is not None else _null_tensor.to(x.device)
- y = x
- if act != 'linear' or gain != 1 or clamp >= 0 or (
- b is not _null_tensor.to(x.device)):
- y = ext_module.bias_act(x, b, _null_tensor.to(x.device),
- _null_tensor.to(x.device),
- _null_tensor.to(x.device), 0, dim,
- spec.cuda_idx, alpha, gain, clamp)
- ctx.save_for_backward(
- x if 'x' in spec.ref or spec.has_2nd_grad else _null_tensor.to(
- x.device), b if 'x' in spec.ref or spec.has_2nd_grad else
- _null_tensor.to(x.device),
- y if 'y' in spec.ref else _null_tensor.to(x.device))
- return y
-
- @staticmethod
- def backward(ctx, dy): # pylint: disable=arguments-differ
- dy = dy.contiguous(memory_format=ctx.memory_format)
- x, b, y = ctx.saved_tensors
- dx = None
- db = None
-
- if ctx.needs_input_grad[0] or ctx.needs_input_grad[1]:
- dx = dy
- if act != 'linear' or gain != 1 or clamp >= 0:
- dx = BiasActCudaGrad.apply(dy, x, b, y)
-
- if ctx.needs_input_grad[1]:
- db = dx.sum([i for i in range(dx.ndim) if i != dim])
-
- return dx, db
-
- # Backward op.
- class BiasActCudaGrad(torch.autograd.Function):
-
- @staticmethod
- def forward(ctx, dy, x, b, y): # pylint: disable=arguments-differ
- ctx.memory_format = torch.channels_last if dy.ndim > 2 and (
- dy.stride(1) == 1) else torch.contiguous_format
- dx = ext_module.bias_act(dy, b, x, y, _null_tensor.to(x.device), 1,
- dim, spec.cuda_idx, alpha, gain, clamp)
- ctx.save_for_backward(
- dy if spec.has_2nd_grad else _null_tensor.to(x.device), x, b,
- y)
- return dx
-
- @staticmethod
- def backward(ctx, d_dx): # pylint: disable=arguments-differ
- d_dx = d_dx.contiguous(memory_format=ctx.memory_format)
- dy, x, b, y = ctx.saved_tensors
- d_dy = None
- d_x = None
- d_b = None
- d_y = None
-
- if ctx.needs_input_grad[0]:
- d_dy = BiasActCudaGrad.apply(d_dx, x, b, y)
-
- if spec.has_2nd_grad and (ctx.needs_input_grad[1]
- or ctx.needs_input_grad[2]):
- d_x = ext_module.bias_act(d_dx, b, x, y, dy, 2, dim,
- spec.cuda_idx, alpha, gain, clamp)
-
- if spec.has_2nd_grad and ctx.needs_input_grad[2]:
- d_b = d_x.sum([i for i in range(d_x.ndim) if i != dim])
-
- return d_dy, d_x, d_b, d_y
-
- # Add to cache.
- _bias_act_cuda_cache[key] = BiasActCuda
- return BiasActCuda
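
The removed fused op has a straightforward eager-mode equivalent. A pure-PyTorch sketch of what `bias_act` computes for `act='lrelu'`, mirroring the reference path `_bias_act_ref` (add bias, activate with the default alpha, scale by the default gain):

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 8, 16, 16)
b = torch.randn(8)  # one bias per channel (dim=1)
y = F.leaky_relu(x + b.reshape(1, -1, 1, 1), negative_slope=0.2)
y = y * (2 ** 0.5)  # def_gain for 'lrelu' in activation_funcs
```
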
diff --git a/mmcv/ops/border_align.py b/mmcv/ops/border_align.py
index c09501b962cfce10b1da87e6b651d61911eb8406..ff305be328e9b0a15e1bbb5e6b41beb940f55c81 100644
--- a/mmcv/ops/border_align.py
+++ b/mmcv/ops/border_align.py
@@ -2,8 +2,6 @@
# modified from
# https://github.com/Megvii-BaseDetection/cvpods/blob/master/cvpods/layers/border_align.py
-from typing import Tuple
-
import torch
import torch.nn as nn
from torch.autograd import Function
@@ -23,8 +21,7 @@ class BorderAlignFunction(Function):
'mmcv::MMCVBorderAlign', input, boxes, pool_size_i=pool_size)
@staticmethod
- def forward(ctx, input: torch.Tensor, boxes: torch.Tensor,
- pool_size: int) -> torch.Tensor:
+ def forward(ctx, input, boxes, pool_size):
ctx.pool_size = pool_size
ctx.input_shape = input.size()
@@ -48,8 +45,7 @@ class BorderAlignFunction(Function):
@staticmethod
@once_differentiable
- def backward(ctx,
- grad_output: torch.Tensor) -> Tuple[torch.Tensor, None, None]:
+ def backward(ctx, grad_output):
boxes, argmax_idx = ctx.saved_tensors
grad_input = grad_output.new_zeros(ctx.input_shape)
# complex head architecture may cause grad_output uncontiguous
@@ -76,25 +72,24 @@ class BorderAlign(nn.Module):
For each border line (e.g. top, left, bottom or right) of each box,
border_align does the following:
-
- 1. uniformly samples ``pool_size`` +1 positions on this line, involving
- the start and end points.
- 2. the corresponding features on these points are computed by bilinear
- interpolation.
- 3. max pooling over all the ``pool_size`` +1 positions are used for
- computing pooled feature.
+ 1. uniformly samples `pool_size`+1 positions on this line, involving \
+ the start and end points.
+ 2. the corresponding features on these points are computed by \
+ bilinear interpolation.
+ 3. max pooling over all the `pool_size`+1 positions are used for \
+ computing pooled feature.
Args:
pool_size (int): number of positions sampled over the boxes' borders
(e.g. top, bottom, left, right).
+
"""
- def __init__(self, pool_size: int):
- super().__init__()
+ def __init__(self, pool_size):
+ super(BorderAlign, self).__init__()
self.pool_size = pool_size
- def forward(self, input: torch.Tensor,
- boxes: torch.Tensor) -> torch.Tensor:
+ def forward(self, input, boxes):
"""
Args:
input: Features with shape [N,4C,H,W]. Channels ranged in [0,C),
@@ -103,8 +98,8 @@ class BorderAlign(nn.Module):
boxes: Boxes with shape [N,H*W,4]. Coordinate format (x1,y1,x2,y2).
Returns:
- torch.Tensor: Pooled features with shape [N,C,H*W,4]. The order is
- (top,left,bottom,right) for the last dimension.
+ Tensor: Pooled features with shape [N,C,H*W,4]. The order is
+ (top,left,bottom,right) for the last dimension.
"""
return border_align(input, boxes, self.pool_size)
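
A call sketch for `border_align` with illustrative sizes (the kernel is CUDA-only); channels are laid out as four groups of C for the top/left/bottom/right borders:

```python
import torch
from mmcv.ops import BorderAlign

C, H, W = 8, 10, 10
feats = torch.randn(1, 4 * C, H, W).cuda()
xy1 = torch.rand(1, H * W, 2) * 4
boxes = torch.cat([xy1, xy1 + 4], dim=-1).cuda()  # (x1, y1, x2, y2) per cell
pooled = BorderAlign(pool_size=4)(feats, boxes)   # (1, C, H*W, 4)
```
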
diff --git a/mmcv/ops/box_iou_quadri.py b/mmcv/ops/box_iou_quadri.py
deleted file mode 100644
index 89747fdf1f03e0491351f876385ba3c1369ebaf7..0000000000000000000000000000000000000000
--- a/mmcv/ops/box_iou_quadri.py
+++ /dev/null
@@ -1,49 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-import torch
-
-from ..utils import ext_loader
-
-ext_module = ext_loader.load_ext('_ext', ['box_iou_quadri'])
-
-
-def box_iou_quadri(bboxes1: torch.Tensor,
- bboxes2: torch.Tensor,
- mode: str = 'iou',
- aligned: bool = False) -> torch.Tensor:
- """Return intersection-over-union (Jaccard index) of boxes.
-
- Both sets of boxes are expected to be in
- (x1, y1, ..., x4, y4) format.
-
- If ``aligned`` is ``False``, then calculate the ious between each bbox
- of bboxes1 and bboxes2, otherwise the ious between each aligned pair of
- bboxes1 and bboxes2.
-
- Args:
- bboxes1 (torch.Tensor): quadrilateral bboxes 1. It has shape (N, 8),
- indicating (x1, y1, ..., x4, y4) for each row.
- bboxes2 (torch.Tensor): quadrilateral bboxes 2. It has shape (M, 8),
- indicating (x1, y1, ..., x4, y4) for each row.
- mode (str): "iou" (intersection over union) or iof (intersection over
- foreground).
-
- Returns:
- torch.Tensor: Return the ious between boxes. If ``aligned`` is
- ``False``, the shape of ious is (N, M) else (N,).
- """
- assert mode in ['iou', 'iof']
- mode_dict = {'iou': 0, 'iof': 1}
- mode_flag = mode_dict[mode]
- rows = bboxes1.size(0)
- cols = bboxes2.size(0)
- if aligned:
- ious = bboxes1.new_zeros(rows)
- else:
- ious = bboxes1.new_zeros(rows * cols)
- bboxes1 = bboxes1.contiguous()
- bboxes2 = bboxes2.contiguous()
- ext_module.box_iou_quadri(
- bboxes1, bboxes2, ious, mode_flag=mode_flag, aligned=aligned)
- if not aligned:
- ious = ious.view(rows, cols)
- return ious
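
A hypothetical sketch of the removed `box_iou_quadri` for builds that ship it; two axis-aligned 4x4 squares offset by 2 make the expected IoU easy to check by hand (overlap 8, union 24, IoU 1/3):

```python
import torch
from mmcv.ops import box_iou_quadri  # only in builds that ship this op

q1 = torch.tensor([[0., 0., 4., 0., 4., 4., 0., 4.]])  # 4x4 square
q2 = torch.tensor([[2., 0., 6., 0., 6., 4., 2., 4.]])  # shifted right by 2
ious = box_iou_quadri(q1, q2)  # (1, 1), approximately 1/3
```
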
diff --git a/mmcv/ops/box_iou_rotated.py b/mmcv/ops/box_iou_rotated.py
index 2443af27c92146ed4328e8f94b1415c7e72c542b..2d78015e9c2a9e7a52859b4e18f84a9aa63481a0 100644
--- a/mmcv/ops/box_iou_rotated.py
+++ b/mmcv/ops/box_iou_rotated.py
@@ -1,16 +1,10 @@
# Copyright (c) OpenMMLab. All rights reserved.
-import torch
-
from ..utils import ext_loader
ext_module = ext_loader.load_ext('_ext', ['box_iou_rotated'])
-def box_iou_rotated(bboxes1: torch.Tensor,
- bboxes2: torch.Tensor,
- mode: str = 'iou',
- aligned: bool = False,
- clockwise: bool = True) -> torch.Tensor:
+def box_iou_rotated(bboxes1, bboxes2, mode='iou', aligned=False):
"""Return intersection-over-union (Jaccard index) of boxes.
Both sets of boxes are expected to be in
@@ -20,110 +14,18 @@ def box_iou_rotated(bboxes1: torch.Tensor,
of bboxes1 and bboxes2, otherwise the ious between each aligned pair of
bboxes1 and bboxes2.
- .. note::
- The operator assumes:
-
- 1) The positive direction along x axis is left -> right.
-
- 2) The positive direction along y axis is top -> down.
-
- 3) The w border is in parallel with x axis when angle = 0.
-
- However, there are 2 opposite definitions of the positive angular
- direction, clockwise (CW) and counter-clockwise (CCW). MMCV supports
- both definitions and uses CW by default.
-
- Please set ``clockwise=False`` if you are using the CCW definition.
-
- The coordinate system when ``clockwise`` is ``True`` (default)
-
- .. code-block:: none
-
- 0-------------------> x (0 rad)
- | A-------------B
- | | |
- | | box h
- | | angle=0 |
- | D------w------C
- v
- y (pi/2 rad)
-
- In such coordination system the rotation matrix is
-
- .. math::
- \\begin{pmatrix}
- \\cos\\alpha & -\\sin\\alpha \\\\
- \\sin\\alpha & \\cos\\alpha
- \\end{pmatrix}
-
- The coordinates of the corner point A can be calculated as:
-
- .. math::
- P_A=
- \\begin{pmatrix} x_A \\\\ y_A\\end{pmatrix}
- =
- \\begin{pmatrix} x_{center} \\\\ y_{center}\\end{pmatrix} +
- \\begin{pmatrix}\\cos\\alpha & -\\sin\\alpha \\\\
- \\sin\\alpha & \\cos\\alpha\\end{pmatrix}
- \\begin{pmatrix} -0.5w \\\\ -0.5h\\end{pmatrix} \\\\
- =
- \\begin{pmatrix} x_{center}-0.5w\\cos\\alpha+0.5h\\sin\\alpha
- \\\\
- y_{center}-0.5w\\sin\\alpha-0.5h\\cos\\alpha\\end{pmatrix}
-
-
- The coordinate system when ``clockwise`` is ``False``
-
- .. code-block:: none
-
- 0-------------------> x (0 rad)
- | A-------------B
- | | |
- | | box h
- | | angle=0 |
- | D------w------C
- v
- y (-pi/2 rad)
-
- In such coordination system the rotation matrix is
-
- .. math::
- \\begin{pmatrix}
- \\cos\\alpha & \\sin\\alpha \\\\
- -\\sin\\alpha & \\cos\\alpha
- \\end{pmatrix}
-
- The coordinates of the corner point A can be calculated as:
-
- .. math::
- P_A=
- \\begin{pmatrix} x_A \\\\ y_A\\end{pmatrix}
- =
- \\begin{pmatrix} x_{center} \\\\ y_{center}\\end{pmatrix} +
- \\begin{pmatrix}\\cos\\alpha & \\sin\\alpha \\\\
- -\\sin\\alpha & \\cos\\alpha\\end{pmatrix}
- \\begin{pmatrix} -0.5w \\\\ -0.5h\\end{pmatrix} \\\\
- =
- \\begin{pmatrix} x_{center}-0.5w\\cos\\alpha-0.5h\\sin\\alpha
- \\\\
- y_{center}+0.5w\\sin\\alpha-0.5h\\cos\\alpha\\end{pmatrix}
-
- Args:
- boxes1 (torch.Tensor): rotated bboxes 1. It has shape (N, 5),
- indicating (x, y, w, h, theta) for each row. Note that theta is in
- radian.
- boxes2 (torch.Tensor): rotated bboxes 2. It has shape (M, 5),
- indicating (x, y, w, h, theta) for each row. Note that theta is in
- radian.
+ Arguments:
+ bboxes1 (Tensor): rotated bboxes 1. \
+ It has shape (N, 5), indicating (x, y, w, h, theta) for each row.
+ Note that theta is in radian.
+ bboxes2 (Tensor): rotated bboxes 2. \
+ It has shape (M, 5), indicating (x, y, w, h, theta) for each row.
+ Note that theta is in radian.
mode (str): "iou" (intersection over union) or iof (intersection over
foreground).
- clockwise (bool): flag indicating whether the positive angular
- orientation is clockwise. default True.
- `New in version 1.4.3.`
Returns:
- torch.Tensor: Return the ious between boxes. If ``aligned`` is
- ``False``, the shape of ious is (N, M) else (N,).
+ ious(Tensor): shape (N, M) if aligned == False else shape (N,)
"""
assert mode in ['iou', 'iof']
mode_dict = {'iou': 0, 'iof': 1}
@@ -133,12 +35,7 @@ def box_iou_rotated(bboxes1: torch.Tensor,
if aligned:
ious = bboxes1.new_zeros(rows)
else:
- ious = bboxes1.new_zeros(rows * cols)
- if not clockwise:
- flip_mat = bboxes1.new_ones(bboxes1.shape[-1])
- flip_mat[-1] = -1
- bboxes1 = bboxes1 * flip_mat
- bboxes2 = bboxes2 * flip_mat
+ ious = bboxes1.new_zeros((rows * cols))
bboxes1 = bboxes1.contiguous()
bboxes2 = bboxes2.contiguous()
ext_module.box_iou_rotated(
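
A usage sketch for the reverted `box_iou_rotated`; boxes are (x, y, w, h, theta) with theta in radians, and this op also ships a CPU kernel:

```python
import torch
from mmcv.ops import box_iou_rotated

r1 = torch.tensor([[10., 10., 8., 4., 0.0]])
r2 = torch.tensor([[10., 10., 8., 4., 0.5]])
ious = box_iou_rotated(r1, r2)                        # (1, 1) pairwise
ious_aligned = box_iou_rotated(r1, r2, aligned=True)  # (1,)
```
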
diff --git a/mmcv/ops/carafe.py b/mmcv/ops/carafe.py
index f7e79c275e2bea62ce7e08fb6e6e4629c7565600..5154cb3abfccfbbe0a1b2daa67018dbf80aaf6d2 100644
--- a/mmcv/ops/carafe.py
+++ b/mmcv/ops/carafe.py
@@ -1,15 +1,11 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Tuple
-
import torch
import torch.nn as nn
import torch.nn.functional as F
-from mmengine.model import normal_init, xavier_init
-from mmengine.registry import MODELS
-from torch import Tensor
from torch.autograd import Function
from torch.nn.modules.module import Module
+from ..cnn import UPSAMPLE_LAYERS, normal_init, xavier_init
from ..utils import ext_loader
ext_module = ext_loader.load_ext('_ext', [
@@ -21,8 +17,7 @@ ext_module = ext_loader.load_ext('_ext', [
class CARAFENaiveFunction(Function):
@staticmethod
- def symbolic(g, features: Tensor, masks: Tensor, kernel_size: int,
- group_size: int, scale_factor: int) -> Tensor:
+ def symbolic(g, features, masks, kernel_size, group_size, scale_factor):
return g.op(
'mmcv::MMCVCARAFENaive',
features,
@@ -32,8 +27,7 @@ class CARAFENaiveFunction(Function):
scale_factor_f=scale_factor)
@staticmethod
- def forward(ctx, features: Tensor, masks: Tensor, kernel_size: int,
- group_size: int, scale_factor: int) -> Tensor:
+ def forward(ctx, features, masks, kernel_size, group_size, scale_factor):
assert scale_factor >= 1
assert masks.size(1) == kernel_size * kernel_size * group_size
assert masks.size(-1) == features.size(-1) * scale_factor
@@ -56,15 +50,12 @@ class CARAFENaiveFunction(Function):
group_size=group_size,
scale_factor=scale_factor)
- if features.requires_grad or masks.requires_grad or \
- torch.__version__ == 'parrots':
+ if features.requires_grad or masks.requires_grad:
ctx.save_for_backward(features, masks)
return output
@staticmethod
- def backward(
- ctx,
- grad_output: Tensor) -> Tuple[Tensor, Tensor, None, None, None]:
+ def backward(ctx, grad_output):
assert grad_output.is_cuda
features, masks = ctx.saved_tensors
@@ -92,8 +83,8 @@ carafe_naive = CARAFENaiveFunction.apply
class CARAFENaive(Module):
- def __init__(self, kernel_size: int, group_size: int, scale_factor: int):
- super().__init__()
+ def __init__(self, kernel_size, group_size, scale_factor):
+ super(CARAFENaive, self).__init__()
assert isinstance(kernel_size, int) and isinstance(
group_size, int) and isinstance(scale_factor, int)
@@ -101,7 +92,7 @@ class CARAFENaive(Module):
self.group_size = group_size
self.scale_factor = scale_factor
- def forward(self, features: Tensor, masks: Tensor) -> Tensor:
+ def forward(self, features, masks):
return carafe_naive(features, masks, self.kernel_size, self.group_size,
self.scale_factor)
@@ -109,8 +100,7 @@ class CARAFENaive(Module):
class CARAFEFunction(Function):
@staticmethod
- def symbolic(g, features: Tensor, masks: Tensor, kernel_size: int,
- group_size: int, scale_factor: int) -> Tensor:
+ def symbolic(g, features, masks, kernel_size, group_size, scale_factor):
return g.op(
'mmcv::MMCVCARAFE',
features,
@@ -120,8 +110,7 @@ class CARAFEFunction(Function):
scale_factor_f=scale_factor)
@staticmethod
- def forward(ctx, features: Tensor, masks: Tensor, kernel_size: int,
- group_size: int, scale_factor: int) -> Tensor:
+ def forward(ctx, features, masks, kernel_size, group_size, scale_factor):
assert scale_factor >= 1
assert masks.size(1) == kernel_size * kernel_size * group_size
assert masks.size(-1) == features.size(-1) * scale_factor
@@ -150,15 +139,14 @@ class CARAFEFunction(Function):
group_size=group_size,
scale_factor=scale_factor)
- if features.requires_grad or masks.requires_grad or \
- torch.__version__ == 'parrots':
+ if features.requires_grad or masks.requires_grad:
ctx.save_for_backward(features, masks, rfeatures)
return output
@staticmethod
- def backward(
- ctx,
- grad_output: Tensor) -> Tuple[Tensor, Tensor, None, None, None]:
+ def backward(ctx, grad_output):
+ assert grad_output.is_cuda
+
features, masks, rfeatures = ctx.saved_tensors
kernel_size = ctx.kernel_size
group_size = ctx.group_size
@@ -192,8 +180,7 @@ carafe = CARAFEFunction.apply
class CARAFE(Module):
""" CARAFE: Content-Aware ReAssembly of FEatures
- Please refer to `CARAFE: Content-Aware ReAssembly of FEatures
- <https://arxiv.org/abs/1905.02188>`_ for more details.
+ Please refer to https://arxiv.org/abs/1905.02188 for more details.
Args:
kernel_size (int): reassemble kernel size
@@ -204,8 +191,8 @@ class CARAFE(Module):
upsampled feature map
"""
- def __init__(self, kernel_size: int, group_size: int, scale_factor: int):
- super().__init__()
+ def __init__(self, kernel_size, group_size, scale_factor):
+ super(CARAFE, self).__init__()
assert isinstance(kernel_size, int) and isinstance(
group_size, int) and isinstance(scale_factor, int)
@@ -213,19 +200,19 @@ class CARAFE(Module):
self.group_size = group_size
self.scale_factor = scale_factor
- def forward(self, features: Tensor, masks: Tensor) -> Tensor:
+ def forward(self, features, masks):
return carafe(features, masks, self.kernel_size, self.group_size,
self.scale_factor)
-@MODELS.register_module(name='carafe')
+@UPSAMPLE_LAYERS.register_module(name='carafe')
class CARAFEPack(nn.Module):
"""A unified package of CARAFE upsampler that contains: 1) channel
compressor 2) content encoder 3) CARAFE op.
Official implementation of ICCV 2019 paper
- `CARAFE: Content-Aware ReAssembly of FEatures
- <https://arxiv.org/abs/1905.02188>`_.
+ CARAFE: Content-Aware ReAssembly of FEatures
+ Please refer to https://arxiv.org/abs/1905.02188 for more details.
Args:
channels (int): input feature channels
@@ -241,14 +228,14 @@ class CARAFEPack(nn.Module):
"""
def __init__(self,
- channels: int,
- scale_factor: int,
- up_kernel: int = 5,
- up_group: int = 1,
- encoder_kernel: int = 3,
- encoder_dilation: int = 1,
- compressed_channels: int = 64):
- super().__init__()
+ channels,
+ scale_factor,
+ up_kernel=5,
+ up_group=1,
+ encoder_kernel=3,
+ encoder_dilation=1,
+ compressed_channels=64):
+ super(CARAFEPack, self).__init__()
self.channels = channels
self.scale_factor = scale_factor
self.up_kernel = up_kernel
@@ -274,7 +261,7 @@ class CARAFEPack(nn.Module):
xavier_init(m, distribution='uniform')
normal_init(self.content_encoder, std=0.001)
- def kernel_normalizer(self, mask: Tensor) -> Tensor:
+ def kernel_normalizer(self, mask):
mask = F.pixel_shuffle(mask, self.scale_factor)
n, mask_c, h, w = mask.size()
# use float division explicitly,
@@ -287,11 +274,11 @@ class CARAFEPack(nn.Module):
return mask
- def feature_reassemble(self, x: Tensor, mask: Tensor) -> Tensor:
+ def feature_reassemble(self, x, mask):
x = carafe(x, mask, self.up_kernel, self.up_group, self.scale_factor)
return x
- def forward(self, x: Tensor) -> Tensor:
+ def forward(self, x):
compressed_x = self.channel_compressor(x)
mask = self.content_encoder(compressed_x)
mask = self.kernel_normalizer(mask)
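
A usage sketch for the `CARAFEPack` upsampler whose registration this hunk moves back to `UPSAMPLE_LAYERS` (illustrative sizes; the CARAFE kernels are CUDA-only):

```python
import torch
from mmcv.ops import CARAFEPack

up = CARAFEPack(channels=64, scale_factor=2).cuda()
x = torch.randn(2, 64, 16, 16).cuda()
y = up(x)  # (2, 64, 32, 32): per-location masks, then feature reassembly
```
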
diff --git a/mmcv/ops/cc_attention.py b/mmcv/ops/cc_attention.py
index efde7b703c8c50ecf5aa604e756422f0be488759..ff8dd4c56849d504d265346316e2f8abb0a66598 100644
--- a/mmcv/ops/cc_attention.py
+++ b/mmcv/ops/cc_attention.py
@@ -2,12 +2,11 @@
import torch
import torch.nn as nn
import torch.nn.functional as F
-from mmengine.registry import MODELS
-from mmcv.cnn import Scale
+from mmcv.cnn import PLUGIN_LAYERS, Scale
-def NEG_INF_DIAG(n: int, device: torch.device) -> torch.Tensor:
+def NEG_INF_DIAG(n, device):
"""Returns a diagonal matrix of size [n, n].
The diagonal are all "-inf". This is for avoiding calculating the
@@ -16,7 +15,7 @@ def NEG_INF_DIAG(n: int, device: torch.device) -> torch.Tensor:
return torch.diag(torch.tensor(float('-inf')).to(device).repeat(n), 0)
-@MODELS.register_module()
+@PLUGIN_LAYERS.register_module()
class CrissCrossAttention(nn.Module):
"""Criss-Cross Attention Module.
@@ -42,7 +41,7 @@ class CrissCrossAttention(nn.Module):
in_channels (int): Channels of the input feature map.
"""
- def __init__(self, in_channels: int) -> None:
+ def __init__(self, in_channels):
super().__init__()
self.query_conv = nn.Conv2d(in_channels, in_channels // 8, 1)
self.key_conv = nn.Conv2d(in_channels, in_channels // 8, 1)
@@ -50,15 +49,14 @@ class CrissCrossAttention(nn.Module):
self.gamma = Scale(0.)
self.in_channels = in_channels
- def forward(self, x: torch.Tensor) -> torch.Tensor:
+ def forward(self, x):
"""forward function of Criss-Cross Attention.
Args:
- x (torch.Tensor): Input feature with the shape of
- (batch_size, in_channels, height, width).
-
+ x (Tensor): Input feature. \
+ shape (batch_size, in_channels, height, width)
Returns:
- torch.Tensor: Output of the layer, with the shape of
+ Tensor: Output of the layer, with shape of \
(batch_size, in_channels, height, width)
"""
B, C, H, W = x.size()
@@ -79,7 +77,7 @@ class CrissCrossAttention(nn.Module):
return out
- def __repr__(self) -> str:
+ def __repr__(self):
s = self.__class__.__name__
s += f'(in_channels={self.in_channels})'
return s
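
`CrissCrossAttention` in this reverted form is plain PyTorch (no custom kernel), so a smoke test runs on CPU:

```python
import torch
from mmcv.ops import CrissCrossAttention

attn = CrissCrossAttention(in_channels=32)
x = torch.randn(2, 32, 9, 9)
out = attn(x)
assert out.shape == x.shape  # (2, 32, 9, 9)
```
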
diff --git a/mmcv/ops/chamfer_distance.py b/mmcv/ops/chamfer_distance.py
deleted file mode 100644
index 1f908a5bbc2655de6233cd6ddfa140ee783079ba..0000000000000000000000000000000000000000
--- a/mmcv/ops/chamfer_distance.py
+++ /dev/null
@@ -1,93 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Sequence, Tuple
-
-import torch
-from torch import Tensor
-from torch.autograd import Function
-from torch.autograd.function import once_differentiable
-
-from ..utils import ext_loader
-
-ext_module = ext_loader.load_ext(
- '_ext', ['chamfer_distance_forward', 'chamfer_distance_backward'])
-
-
-class ChamferDistanceFunction(Function):
- """This is an implementation of the 2D Chamfer Distance.
-
- It has been used in the paper `Oriented RepPoints for Aerial Object
- Detection (CVPR 2022) <https://arxiv.org/abs/2105.11111>`_.
- """
-
- @staticmethod
- def forward(ctx, xyz1: Tensor, xyz2: Tensor) -> Sequence[Tensor]:
- """
- Args:
- xyz1 (Tensor): Point set with shape (B, N, 2).
- xyz2 (Tensor): Point set with shape (B, N, 2).
-
- Returns:
- Sequence[Tensor]:
-
- - dist1 (Tensor): Chamfer distance (xyz1 to xyz2) with
- shape (B, N).
- - dist2 (Tensor): Chamfer distance (xyz2 to xyz1) with
- shape (B, N).
- - idx1 (Tensor): Index of chamfer distance (xyz1 to xyz2)
- with shape (B, N), which is used to compute the gradient.
- - idx2 (Tensor): Index of chamfer distance (xyz2 to xyz1)
- with shape (B, N), which is used to compute the gradient.
- """
- batch_size, n, _ = xyz1.size()
- _, m, _ = xyz2.size()
- device = xyz1.device
- xyz1 = xyz1.contiguous()
- xyz2 = xyz2.contiguous()
-
- dist1 = torch.zeros(batch_size, n).to(device)
- dist2 = torch.zeros(batch_size, m).to(device)
- idx1 = torch.zeros(batch_size, n).type(torch.IntTensor).to(device)
- idx2 = torch.zeros(batch_size, m).type(torch.IntTensor).to(device)
-
- ext_module.chamfer_distance_forward(xyz1, xyz2, dist1, dist2, idx1,
- idx2)
- ctx.save_for_backward(xyz1, xyz2, idx1, idx2)
- return dist1, dist2, idx1, idx2
-
- @staticmethod
- @once_differentiable
- def backward(ctx,
- grad_dist1: Tensor,
- grad_dist2: Tensor,
- grad_idx1=None,
- grad_idx2=None) -> Tuple[Tensor, Tensor]:
- """
-
- Args:
- grad_dist1 (Tensor): Gradient of chamfer distance
- (xyz1 to xyz2) with shape (B, N).
- grad_dist2 (Tensor): Gradient of chamfer distance
- (xyz2 to xyz1) with shape (B, N).
-
- Returns:
- Tuple[Tensor, Tensor]:
-
- - grad_xyz1 (Tensor): Gradient of the point set with shape \
- (B, N, 2).
- - grad_xyz2 (Tensor): Gradient of the point set with shape \
- (B, N, 2).
- """
- xyz1, xyz2, idx1, idx2 = ctx.saved_tensors
- device = grad_dist1.device
- grad_dist1 = grad_dist1.contiguous()
- grad_dist2 = grad_dist2.contiguous()
- grad_xyz1 = torch.zeros(xyz1.size()).to(device)
- grad_xyz2 = torch.zeros(xyz2.size()).to(device)
-
- ext_module.chamfer_distance_backward(xyz1, xyz2, idx1, idx2,
- grad_dist1, grad_dist2, grad_xyz1,
- grad_xyz2)
- return grad_xyz1, grad_xyz2
-
-
-chamfer_distance = ChamferDistanceFunction.apply
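
A rough pure-PyTorch stand-in for the removed op (the CUDA kernel tracks squared point-to-point distances, hence the squaring below); this is a sketch, not the removed implementation:

```python
import torch

xyz1 = torch.randn(2, 100, 2)
xyz2 = torch.randn(2, 80, 2)
d = torch.cdist(xyz1, xyz2) ** 2  # (B, N, M) squared distances
dist1, idx1 = d.min(dim=2)        # xyz1 -> xyz2, shape (B, N)
dist2, idx2 = d.min(dim=1)        # xyz2 -> xyz1, shape (B, M)
```
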
diff --git a/mmcv/ops/contour_expand.py b/mmcv/ops/contour_expand.py
index 7184609ad9b64d421c17fdfe4a1a0dbeb62d64c8..ea1111e1768b5f27e118bf7dbc0d9c70a7afd6d7 100644
--- a/mmcv/ops/contour_expand.py
+++ b/mmcv/ops/contour_expand.py
@@ -1,6 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Union
-
import numpy as np
import torch
@@ -9,22 +7,21 @@ from ..utils import ext_loader
ext_module = ext_loader.load_ext('_ext', ['contour_expand'])
-def contour_expand(kernel_mask: Union[np.array, torch.Tensor],
- internal_kernel_label: Union[np.array, torch.Tensor],
- min_kernel_area: int, kernel_num: int) -> list:
+def contour_expand(kernel_mask, internal_kernel_label, min_kernel_area,
+ kernel_num):
"""Expand kernel contours so that foreground pixels are assigned into
instances.
- Args:
- kernel_mask (np.array or torch.Tensor): The instance kernel mask with
+ Arguments:
+ kernel_mask (np.array or Tensor): The instance kernel mask with
size hxw.
- internal_kernel_label (np.array or torch.Tensor): The instance internal
+ internal_kernel_label (np.array or Tensor): The instance internal
kernel label with size hxw.
min_kernel_area (int): The minimum kernel area.
kernel_num (int): The instance kernel number.
Returns:
- list: The instance index map with size hxw.
+ label (list): The instance index map with size hxw.
"""
assert isinstance(kernel_mask, (torch.Tensor, np.ndarray))
assert isinstance(internal_kernel_label, (torch.Tensor, np.ndarray))
@@ -45,7 +42,7 @@ def contour_expand(kernel_mask: Union[np.array, torch.Tensor],
internal_kernel_label,
min_kernel_area=min_kernel_area,
kernel_num=kernel_num)
- label = label.tolist() # type: ignore
+ label = label.tolist()
else:
label = ext_module.contour_expand(kernel_mask, internal_kernel_label,
min_kernel_area, kernel_num)
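
A usage sketch for `contour_expand` with a toy 5x5 mask (illustrative values; the op runs on CPU but still needs the compiled `_ext` extension):

```python
import numpy as np
from mmcv.ops import contour_expand

kernel_mask = np.zeros((5, 5), dtype=np.uint8)
kernel_mask[1:4, 1:4] = 1                 # full instance region
internal_label = np.zeros((5, 5), dtype=np.int32)
internal_label[2, 2] = 1                  # shrunk internal kernel, label 1
label = contour_expand(kernel_mask, internal_label,
                       min_kernel_area=0, kernel_num=2)
# `label` is a 5x5 nested list; the masked pixels now carry label 1
```
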
diff --git a/mmcv/ops/conv2d_gradfix.py b/mmcv/ops/conv2d_gradfix.py
deleted file mode 100644
index 9d4ef6e1920881fb524f4d8076bc33a926f998ab..0000000000000000000000000000000000000000
--- a/mmcv/ops/conv2d_gradfix.py
+++ /dev/null
@@ -1,346 +0,0 @@
-# Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-#
-# NVIDIA CORPORATION and its licensors retain all intellectual property
-# and proprietary rights in and to this software, related documentation
-# and any modifications thereto. Any use, reproduction, disclosure or
-# distribution of this software and related documentation without an express
-# license agreement from NVIDIA CORPORATION is strictly prohibited.
-
-# source: https://github.com/NVlabs/stylegan3/blob/main/torch_utils/ops/conv2d_gradfix.py # noqa
-"""Custom replacement for `torch.nn.functional.conv2d` that supports
-arbitrarily high order gradients with zero performance penalty."""
-
-import contextlib
-import warnings
-from typing import Dict, Optional, Tuple, Union
-
-import torch
-from mmengine.utils import digit_version
-
-enabled = True
-weight_gradients_disabled = False
-
-
-@contextlib.contextmanager
-def no_weight_gradients(disable=True):
- global weight_gradients_disabled
- old = weight_gradients_disabled
- if disable:
- weight_gradients_disabled = True
- yield
- weight_gradients_disabled = old
-
-
-def conv2d(input: torch.Tensor,
- weight: torch.Tensor,
- bias: Optional[torch.Tensor] = None,
- stride: Union[int, Tuple[int, ...]] = 1,
- padding: Union[int, Tuple[int, ...]] = 0,
- dilation: Union[int, Tuple[int, ...]] = 1,
- groups: int = 1):
- flag = True
- if digit_version(torch.__version__) >= digit_version('1.10.0'):
- warnings.warn('Since '
- 'aten:cudnn_convolution_backward_weight is '
- f'not supported in torch=={torch.__version__},'
- ' rolling back to `torch.nn.functional.conv2d`')
- flag = False
- if _should_use_custom_op(input) and flag:
- return _conv2d_gradfix(
- transpose=False,
- weight_shape=weight.shape,
- stride=stride,
- padding=padding,
- output_padding=0,
- dilation=dilation,
- groups=groups).apply(input, weight, bias)
- return torch.nn.functional.conv2d(
- input=input,
- weight=weight,
- bias=bias,
- stride=stride,
- padding=padding,
- dilation=dilation,
- groups=groups)
-
-
-def conv_transpose2d(input: torch.Tensor,
- weight: torch.Tensor,
- bias: Optional[torch.Tensor] = None,
- stride: Union[int, Tuple[int, ...]] = 1,
- padding: Union[int, Tuple[int, ...]] = 0,
- output_padding: Union[int, Tuple[int, ...]] = 0,
- groups: int = 1,
- dilation: Union[int, Tuple[int, ...]] = 1):
- if _should_use_custom_op(input):
- return _conv2d_gradfix(
- transpose=True,
- weight_shape=weight.shape,
- stride=stride,
- padding=padding,
- output_padding=output_padding,
- groups=groups,
- dilation=dilation).apply(input, weight, bias)
- return torch.nn.functional.conv_transpose2d(
- input=input,
- weight=weight,
- bias=bias,
- stride=stride,
- padding=padding,
- output_padding=output_padding,
- groups=groups,
- dilation=dilation)
-
-
-def _should_use_custom_op(input):
- assert isinstance(input, torch.Tensor)
- if (not enabled) or (not torch.backends.cudnn.enabled):
- return False
- if input.device.type != 'cuda':
- return False
- return True
-
-
-def _to_tuple(x, ndim):
- xs = tuple(x) if isinstance(x, (tuple, list)) else (x, ) * ndim
- assert len(xs) == ndim
- assert all(isinstance(x, int) for x in xs)
- return xs
-
-
-_conv2d_gradfix_cache: Dict = dict()
-_null_tensor = torch.empty([0])
-
-
-def _conv2d_gradfix(
- transpose: bool,
- weight_shape: Tuple[int, ...],
- stride: Union[int, Tuple[int, ...]],
- padding: Union[int, Tuple[int, ...]],
- output_padding: Union[int, Tuple[int, ...]],
- dilation: Union[int, Tuple[int, ...]],
- groups: int,
-):
- # Parse arguments.
- ndim = 2
- weight_shape = tuple(weight_shape)
- stride = _to_tuple(stride, ndim)
- padding = _to_tuple(padding, ndim)
- output_padding = _to_tuple(output_padding, ndim)
- dilation = _to_tuple(dilation, ndim)
-
- # Lookup from cache.
- key = (transpose, weight_shape, stride, padding, output_padding, dilation,
- groups)
- if key in _conv2d_gradfix_cache:
- return _conv2d_gradfix_cache[key]
-
- # Validate arguments.
-
- assert groups >= 1
- assert len(weight_shape) == ndim + 2
- assert all(stride[i] >= 1 for i in range(ndim)) # type: ignore
- assert all(padding[i] >= 0 for i in range(ndim)) # type: ignore
- assert all(dilation[i] >= 0 for i in range(ndim)) # type: ignore
- if not transpose:
- assert all(output_padding[i] == 0 for i in range(ndim)) # type: ignore
- else: # transpose
- for i in range(ndim):
- assert 0 <= output_padding[i] < max( # type: ignore
- stride[i], # type: ignore
- dilation[i]) # type: ignore
-
- # Helpers.
- common_kwargs = dict(
- stride=stride, padding=padding, dilation=dilation, groups=groups)
-
- def calc_output_padding(input_shape, output_shape):
- if transpose:
- return [0, 0]
- return [
- input_shape[i + 2] - (output_shape[i + 2] - 1) * stride[i] -
- (1 - 2 * padding[i]) - dilation[i] * (weight_shape[i + 2] - 1)
- for i in range(ndim)
- ]
-
- # Forward & backward.
- class Conv2d(torch.autograd.Function):
-
- @staticmethod
- def forward(ctx, input, weight, bias):
- assert weight.shape == weight_shape
- ctx.save_for_backward(
- input if weight.requires_grad else _null_tensor,
- weight if input.requires_grad else _null_tensor,
- )
- ctx.input_shape = input.shape
-
- # Simple 1x1 convolution => cuBLAS (only on Volta, not on Ampere).
- if weight_shape[2:] == stride == dilation == (
- 1, 1) and padding == (
- 0, 0) and torch.cuda.get_device_capability(
- input.device) < (8, 0):
- a = weight.reshape(groups, weight_shape[0] // groups,
- weight_shape[1])
- b = input.reshape(input.shape[0], groups,
- input.shape[1] // groups, -1)
- c = (a.transpose(1, 2) if transpose else a) @ b.permute(
- 1, 2, 0, 3).flatten(2)
- c = c.reshape(-1, input.shape[0],
- *input.shape[2:]).transpose(0, 1)
- c = c if bias is None else c + bias.unsqueeze(0).unsqueeze(
- 2).unsqueeze(3)
- return c.contiguous(
- memory_format=(torch.channels_last if input.stride(1) ==
- 1 else torch.contiguous_format))
-
- # General case => cuDNN.
- if transpose:
- return torch.nn.functional.conv_transpose2d(
- input=input,
- weight=weight,
- bias=bias,
- output_padding=output_padding,
- **common_kwargs)
- return torch.nn.functional.conv2d(
- input=input, weight=weight, bias=bias, **common_kwargs)
-
- @staticmethod
- def backward(ctx, grad_output):
- input, weight = ctx.saved_tensors
- input_shape = ctx.input_shape
- grad_input = None
- grad_weight = None
- grad_bias = None
-
- if ctx.needs_input_grad[0]:
- p = calc_output_padding(
- input_shape=input_shape, output_shape=grad_output.shape)
- op = _conv2d_gradfix(
- transpose=(not transpose),
- weight_shape=weight_shape,
- output_padding=p,
- **common_kwargs)
- grad_input = op.apply(grad_output, weight, None)
- assert grad_input.shape == input_shape
-
- if ctx.needs_input_grad[1] and not weight_gradients_disabled:
- grad_weight = Conv2dGradWeight.apply(grad_output, input)
- assert grad_weight.shape == weight_shape
-
- if ctx.needs_input_grad[2]:
- grad_bias = grad_output.sum([0, 2, 3])
-
- return grad_input, grad_weight, grad_bias
-
- # Gradient with respect to the weights.
- class Conv2dGradWeight(torch.autograd.Function):
-
- @staticmethod
- def forward(ctx, grad_output, input):
- ctx.save_for_backward(
- grad_output if input.requires_grad else _null_tensor,
- input if grad_output.requires_grad else _null_tensor,
- )
- ctx.grad_output_shape = grad_output.shape
- ctx.input_shape = input.shape
-
- # Simple 1x1 convolution => cuBLAS (on both Volta and Ampere).
- if weight_shape[2:] == stride == dilation == (
- 1, 1) and padding == (0, 0):
- a = grad_output.reshape(grad_output.shape[0], groups,
- grad_output.shape[1] // groups,
- -1).permute(1, 2, 0, 3).flatten(2)
- b = input.reshape(input.shape[0], groups,
- input.shape[1] // groups,
- -1).permute(1, 2, 0, 3).flatten(2)
- c = (b @ a.transpose(1, 2) if transpose else
- a @ b.transpose(1, 2)).reshape(weight_shape)
- return c.contiguous(
- memory_format=(torch.channels_last if input.stride(1) ==
- 1 else torch.contiguous_format))
-
- # PyTorch consolidated convolution backward API in PR:
- # https://github.com/pytorch/pytorch/commit/3dc3651e0ee3623f669c3a2c096408dbc476d122 # noqa: E501
- # Enhance the code referring to the discussion:
- # https://github.com/pytorch/pytorch/issues/74437
- if digit_version(torch.__version__) >= digit_version('1.11.0'):
- empty_weight = torch.tensor(
- 0.0, dtype=input.dtype,
- device=input.device).expand(weight_shape)
- output_padding = calc_output_padding(input.shape,
- grad_output.shape)
- return torch.ops.aten.convolution_backward(
- grad_output,
- input,
- empty_weight,
- None,
- stride=stride,
- dilation=dilation,
- transposed=transpose,
- padding=padding,
- groups=groups,
- output_padding=output_padding,
- output_mask=[0, 1, 0])[1]
- else:
- is_rocm_pytorch = False
- try:
- from torch.utils.cpp_extension import ROCM_HOME
- is_rocm_pytorch = True if ((torch.version.hip is not None) and
- (ROCM_HOME is not None)) else False
- except ImportError:
- pass
- name = ''
- flags = []
- if is_rocm_pytorch:
- name = ('aten::miopen_convolution_transpose_backward_weight'
- if transpose else
- 'aten::miopen_convolution_backward_weight')
- flags = [
- torch.backends.cudnn.benchmark,
- torch.backends.cudnn.deterministic
- ]
- else:
- # General case => cuDNN.
- name = ('aten::cudnn_convolution_transpose_backward_weight'
- if transpose else
- 'aten::cudnn_convolution_backward_weight')
- flags = [
- torch.backends.cudnn.benchmark,
- torch.backends.cudnn.deterministic,
- torch.backends.cudnn.allow_tf32
- ]
- return torch._C._jit_get_operation(name)(weight_shape,
- grad_output, input,
- padding, stride,
- dilation, groups,
- *flags)
-
- @staticmethod
- def backward(ctx, grad2_grad_weight):
- grad_output, input = ctx.saved_tensors
- grad_output_shape = ctx.grad_output_shape
- input_shape = ctx.input_shape
- grad2_grad_output = None
- grad2_input = None
-
- if ctx.needs_input_grad[0]:
- grad2_grad_output = Conv2d.apply(input, grad2_grad_weight,
- None)
- assert grad2_grad_output.shape == grad_output_shape
-
- if ctx.needs_input_grad[1]:
- p = calc_output_padding(
- input_shape=input_shape, output_shape=grad_output_shape)
- op = _conv2d_gradfix(
- transpose=(not transpose),
- weight_shape=weight_shape,
- output_padding=p,
- **common_kwargs)
- grad2_input = op.apply(grad_output, grad2_grad_weight, None)
- assert grad2_input.shape == input_shape
-
- return grad2_grad_output, grad2_input
-
- _conv2d_gradfix_cache[key] = Conv2d
- return Conv2d
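
For reference, the wrappers removed above were drop-in replacements for `torch.nn.functional.conv2d` / `conv_transpose2d` that keep second-order gradients working on the custom-op path. A minimal usage sketch (assuming a CUDA build and the `mmcv.ops.conv2d_gradfix` module path of the affected releases):

```python
import torch
from mmcv.ops.conv2d_gradfix import conv2d  # assumption: module path as released

x = torch.randn(1, 3, 8, 8, device='cuda', requires_grad=True)
w = torch.randn(4, 3, 3, 3, device='cuda', requires_grad=True)
y = conv2d(x, w, bias=None, stride=1, padding=1)

# Double backward: build the first-order gradient with a graph, then
# differentiate through it again.
(gx,) = torch.autograd.grad(y.sum(), x, create_graph=True)
gx.pow(2).sum().backward()
```
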
diff --git a/mmcv/ops/convex_iou.py b/mmcv/ops/convex_iou.py
deleted file mode 100644
index 50050363ac5b08cfa8f86dd186ab7087fac6f48a..0000000000000000000000000000000000000000
--- a/mmcv/ops/convex_iou.py
+++ /dev/null
@@ -1,52 +0,0 @@
-# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Tuple
-
-import torch
-
-from ..utils import ext_loader
-
-ext_module = ext_loader.load_ext('_ext', ['convex_iou', 'convex_giou'])
-
-
-def convex_giou(pointsets: torch.Tensor,
- polygons: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
- """Return generalized intersection-over-union (Jaccard index) between point
- sets and polygons.
-
- Args:
- pointsets (torch.Tensor): It has shape (N, 18),
- indicating (x1, y1, x2, y2, ..., x9, y9) for each row.
- polygons (torch.Tensor): It has shape (N, 8),
- indicating (x1, y1, x2, y2, x3, y3, x4, y4) for each row.
-
- Returns:
- tuple[torch.Tensor, torch.Tensor]: The first element is the gious
- between point sets and polygons with the shape (N,). The second
- element is the gradient of point sets with the shape (N, 18).
- """
- output = pointsets.new_zeros((pointsets.size(0), 19))
- ext_module.convex_giou(pointsets, polygons, output)
- convex_giou = output[:, -1]
- points_grad = output[:, 0:-1]
- return convex_giou, points_grad
-
-
-def convex_iou(pointsets: torch.Tensor,
- polygons: torch.Tensor) -> torch.Tensor:
- """Return intersection-over-union (Jaccard index) between point sets and
- polygons.
-
- Args:
- pointsets (torch.Tensor): It has shape (N, 18),
- indicating (x1, y1, x2, y2, ..., x9, y9) for each row.
- polygons (torch.Tensor): It has shape (K, 8),
- indicating (x1, y1, x2, y2, x3, y3, x4, y4) for each row.
-
- Returns:
- torch.Tensor: Return the ious between point sets and polygons with the
- shape (N, K).
- """
- N, K = pointsets.size(0), polygons.size(0)
- ious = pointsets.new_zeros((N, K))
- ext_module.convex_iou(pointsets, polygons, ious)
- return ious
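
A minimal usage sketch of the two removed ops, following the shapes in the docstrings above (assumes a CUDA build of mmcv):

```python
import torch
from mmcv.ops import convex_iou, convex_giou

pointsets = torch.rand(4, 18, device='cuda')  # (N, 18): 9 (x, y) points per row
polygons = torch.rand(5, 8, device='cuda')    # (K, 8): 4 corners per row
ious = convex_iou(pointsets, polygons)        # (N, K)

aligned = torch.rand(4, 8, device='cuda')     # convex_giou pairs rows one-to-one
gious, grad = convex_giou(pointsets, aligned)  # (N,), (N, 18)
```
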
diff --git a/mmcv/ops/corner_pool.py b/mmcv/ops/corner_pool.py
index f18e92dcf89478dd148ef8e1ae482c61969057aa..a33d798b43d405e4c86bee4cd6389be21ca9c637 100644
--- a/mmcv/ops/corner_pool.py
+++ b/mmcv/ops/corner_pool.py
@@ -1,37 +1,101 @@
# Copyright (c) OpenMMLab. All rights reserved.
import torch
-from torch import Tensor, nn
-from mmengine.utils import digit_version
+from torch import nn
+from torch.autograd import Function
+
+from ..utils import ext_loader
+
+ext_module = ext_loader.load_ext('_ext', [
+ 'top_pool_forward', 'top_pool_backward', 'bottom_pool_forward',
+ 'bottom_pool_backward', 'left_pool_forward', 'left_pool_backward',
+ 'right_pool_forward', 'right_pool_backward'
+])
+
_mode_dict = {'top': 0, 'bottom': 1, 'left': 2, 'right': 3}
-def _corner_pool(x: Tensor, dim: int, flip: bool) -> Tensor:
- size = x.size(dim)
- output = x.clone()
+class TopPoolFunction(Function):
- ind = 1
- while ind < size:
- if flip:
- cur_start = 0
- cur_len = size - ind
- next_start = ind
- next_len = size - ind
- else:
- cur_start = ind
- cur_len = size - ind
- next_start = 0
- next_len = size - ind
+ @staticmethod
+ def symbolic(g, input):
+ output = g.op(
+ 'mmcv::MMCVCornerPool', input, mode_i=int(_mode_dict['top']))
+ return output
+
+ @staticmethod
+ def forward(ctx, input):
+ output = ext_module.top_pool_forward(input)
+ ctx.save_for_backward(input)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ input, = ctx.saved_tensors
+ output = ext_module.top_pool_backward(input, grad_output)
+ return output
+
+
+class BottomPoolFunction(Function):
+
+ @staticmethod
+ def symbolic(g, input):
+ output = g.op(
+ 'mmcv::MMCVCornerPool', input, mode_i=int(_mode_dict['bottom']))
+ return output
+
+ @staticmethod
+ def forward(ctx, input):
+ output = ext_module.bottom_pool_forward(input)
+ ctx.save_for_backward(input)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ input, = ctx.saved_tensors
+ output = ext_module.bottom_pool_backward(input, grad_output)
+ return output
- # max_temp should be cloned for backward computation
- max_temp = output.narrow(dim, cur_start, cur_len).clone()
- cur_temp = output.narrow(dim, cur_start, cur_len)
- next_temp = output.narrow(dim, next_start, next_len)
- cur_temp[...] = torch.where(max_temp > next_temp, max_temp, next_temp)
+class LeftPoolFunction(Function):
- ind = ind << 1
+ @staticmethod
+ def symbolic(g, input):
+ output = g.op(
+ 'mmcv::MMCVCornerPool', input, mode_i=int(_mode_dict['left']))
+ return output
- return output
+ @staticmethod
+ def forward(ctx, input):
+ output = ext_module.left_pool_forward(input)
+ ctx.save_for_backward(input)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ input, = ctx.saved_tensors
+ output = ext_module.left_pool_backward(input, grad_output)
+ return output
+
+
+class RightPoolFunction(Function):
+
+ @staticmethod
+ def symbolic(g, input):
+ output = g.op(
+ 'mmcv::MMCVCornerPool', input, mode_i=int(_mode_dict['right']))
+ return output
+
+ @staticmethod
+ def forward(ctx, input):
+ output = ext_module.right_pool_forward(input)
+ ctx.save_for_backward(input)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ input, = ctx.saved_tensors
+ output = ext_module.right_pool_backward(input, grad_output)
+ return output
class CornerPool(nn.Module):
@@ -40,13 +104,11 @@ class CornerPool(nn.Module):
Corner Pooling is a new type of pooling layer that helps a
convolutional network better localize corners of bounding boxes.
- Please refer to `CornerNet: Detecting Objects as Paired Keypoints
- <https://arxiv.org/abs/1808.01244>`_ for more details.
-
+ Please refer to https://arxiv.org/abs/1808.01244 for more details.
Code is modified from https://github.com/princeton-vl/CornerNet-Lite.
Args:
- mode (str): Pooling orientation for the pooling layer
+ mode(str): Pooling orientation for the pooling layer
- 'bottom': Bottom Pooling
- 'left': Left Pooling
@@ -57,6 +119,13 @@ class CornerPool(nn.Module):
Feature map after pooling.
"""
+ pool_functions = {
+ 'bottom': BottomPoolFunction,
+ 'left': LeftPoolFunction,
+ 'right': RightPoolFunction,
+ 'top': TopPoolFunction,
+ }
+
cummax_dim_flip = {
'bottom': (2, False),
'left': (3, True),
@@ -64,13 +133,23 @@ class CornerPool(nn.Module):
'top': (2, True),
}
- def __init__(self, mode: str):
- super().__init__()
- assert mode in self.cummax_dim_flip
+ def __init__(self, mode):
+ super(CornerPool, self).__init__()
+ assert mode in self.pool_functions
self.mode = mode
+ self.corner_pool = self.pool_functions[mode]
+
+ def forward(self, x):
+ if torch.__version__ != 'parrots' and torch.__version__ >= '1.5.0':
+ if torch.onnx.is_in_onnx_export():
+ assert torch.__version__ >= '1.7.0', \
+ 'When `cummax` serves as an intermediate component whose '\
+ 'outputs is used as inputs for another modules, it\'s '\
+ 'expected that pytorch version must be >= 1.7.0, '\
+ 'otherwise Error appears like: `RuntimeError: tuple '\
+ 'appears in op that does not forward tuples, unsupported '\
+ 'kind: prim::PythonOp`.'
- def forward(self, x: Tensor) -> Tensor:
- if torch.__version__ != 'parrots' and digit_version(torch.__version__) >= digit_version('1.5.0'):
dim, flip = self.cummax_dim_flip[self.mode]
if flip:
x = x.flip(dim)
@@ -79,5 +158,4 @@ class CornerPool(nn.Module):
pool_tensor = pool_tensor.flip(dim)
return pool_tensor
else:
- dim, flip = self.cummax_dim_flip[self.mode]
- return _corner_pool(x, dim, flip)
+ return self.corner_pool.apply(x)
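
On the cummax path (PyTorch >= 1.5, non-ONNX), every pooling mode reduces to a cumulative max along one spatial axis with an optional flip. A sketch of the equivalence for the 'top' mode, which maps to `(dim=2, flip=True)`:

```python
import torch
from mmcv.ops import CornerPool

x = torch.rand(2, 8, 16, 16)
y = CornerPool('top')(x)

# 'top' pooling: flip along H, running max, flip back.
y_ref = x.flip(2).cummax(dim=2).values.flip(2)
assert torch.allclose(y, y_ref)
```
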
diff --git a/mmcv/ops/correlation.py b/mmcv/ops/correlation.py
index 319b7646782637e9ebaac4ef07b82d1f460031b5..3d0b79c301b29915dfaf4d2b1846c59be73127d3 100644
--- a/mmcv/ops/correlation.py
+++ b/mmcv/ops/correlation.py
@@ -1,6 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
-from typing import Tuple
-
import torch
from torch import Tensor, nn
from torch.autograd import Function
@@ -17,14 +15,14 @@ class CorrelationFunction(Function):
@staticmethod
def forward(ctx,
- input1: Tensor,
- input2: Tensor,
- kernel_size: int = 1,
- max_displacement: int = 1,
- stride: int = 1,
- padding: int = 1,
- dilation: int = 1,
- dilation_patch: int = 1) -> Tensor:
+ input1,
+ input2,
+ kernel_size=1,
+ max_displacement=1,
+ stride=1,
+ padding=1,
+ dilation=1,
+ dilation_patch=1):
ctx.save_for_backward(input1, input2)
@@ -62,9 +60,7 @@ class CorrelationFunction(Function):
@staticmethod
@once_differentiable
- def backward(
- ctx, grad_output: Tensor
- ) -> Tuple[Tensor, Tensor, None, None, None, None, None, None]:
+ def backward(ctx, grad_output):
input1, input2 = ctx.saved_tensors
kH, kW = ctx.kernel_size
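
For orientation, a usage sketch of the module this `Function` backs (the output layout, `(N, patch_size, patch_size, H_out, W_out)` with `patch_size = 2 * max_displacement + 1`, is taken from the `Correlation` docstring and should be treated as an assumption here):

```python
import torch
from mmcv.ops import Correlation

corr = Correlation(max_displacement=3)         # patch_size = 7
a = torch.randn(1, 16, 32, 32, device='cuda')
b = torch.randn(1, 16, 32, 32, device='cuda')
out = corr(a, b)                               # (1, 7, 7, 32, 32)
```
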
diff --git a/mmcv/ops/csrc/README.md b/mmcv/ops/csrc/README.md
index 8fcc6eb1a3260148aa7448470967684f8c9f0365..3bc02004017a0d607131b4de168b320c3beed23c 100644
--- a/mmcv/ops/csrc/README.md
+++ b/mmcv/ops/csrc/README.md
@@ -13,150 +13,158 @@ This folder contains all non-python code for MMCV custom ops. Please follow the
│ ├── pytorch_cpp_helper.hpp
│ ├── pytorch_cuda_helper.hpp
│ ├── pytorch_device_registry.hpp
-│ ├── cuda
-│ │ ├── common_cuda_helper.hpp
-│ │ ├── parrots_cudawarpfunction.cuh
-│ │ ├── ...
-│ │ └── ops_cuda_kernel.cuh
-| ├── mps
-│ │ ├── MPSLibrary.h
-│ │ ├── ...
-│ │ └── MPSUtils.h
-| ├── mlu
-│ │ └── ...
-| └── utils
-│ │ └── ...
+│ └── cuda
+│ ├── common_cuda_helper.hpp
+│ ├── parrots_cudawarpfunction.cuh
+│ ├── ...
+│ └── ops_cuda_kernel.cuh
+├── onnxruntime
+│ ├── onnxruntime_register.h
+│ ├── onnxruntime_session_options_config_keys.h
+│ ├── ort_mmcv_utils.h
+│ ├── ...
+│ ├── onnx_ops.h
+│ └── cpu
+│ ├── onnxruntime_register.cpp
+│ ├── ...
+│ └── onnx_ops_impl.cpp
├── parrots
│ ├── ...
│ ├── ops.cpp
│ ├── ops_parrots.cpp
│ └── ops_pytorch.h
-└── pytorch
- ├── info.cpp
- ├── pybind.cpp
- ├── ...
- ├── ops.cpp
- ├── cuda
- │ ├── ...
- │ └── ops_cuda.cu
- ├── cpu
- │ ├── ...
- │ └── ops.cpp
- ├── mps
- │ ├── ...
- | └── op_mps.mm
- └── mlu
- ├── ...
- └── op_mlu.cpp
+├── pytorch
+│ ├── info.cpp
+│ ├── pybind.cpp
+│ ├── ...
+│ ├── ops.cpp
+│ ├── cuda
+│ │ ├── ...
+│ │ └── ops_cuda.cu
+│ └── cpu
+│ ├── ...
+│ └── ops.cpp
+└── tensorrt
+ ├── trt_cuda_helper.cuh
+ ├── trt_plugin_helper.hpp
+ ├── trt_plugin.hpp
+ ├── trt_serialize.hpp
+ ├── ...
+ ├── trt_ops.hpp
+ └── plugins
+ ├── trt_cuda_helper.cu
+ ├── trt_plugin.cpp
+ ├── ...
+ ├── trt_ops.cpp
+ └── trt_ops_kernel.cu
```
## Components
- `common`: This directory contains all tools and shared codes.
- `cuda`: The CUDA kernels that can be shared by all backends. **HIP** kernels are also here since they have similar syntax.
- - `mps`: The tools used to support MPS ops. **NOTE** that MPS support is **experimental**.
- - `mlu`: The MLU kernels used to support [Cambricon](https://www.cambricon.com/) device.
- - `utils`: The kernels and utils of spconv.
+- `onnxruntime`: **ONNX Runtime** support for custom ops.
+ - `cpu`: CPU implementation of supported ops.
- `parrots`: **Parrots** is a deep learning framework for model training and inference. Parrots custom ops are placed in this directory.
- `pytorch`: **PyTorch** custom ops are supported by binding C++ to Python with **pybind11**. The ops implementation and binding codes are placed in this directory.
- `cuda`: This directory contains the CUDA kernel launchers, which feed tensor memory pointers to the CUDA kernels in `common/cuda`. The launchers provide the C++ interface to the CUDA implementation of the corresponding custom ops.
- `cpu`: This directory contains the CPU implementations of the corresponding custom ops.
- - `mlu`: This directory contains the launchers of the MLU kernels.
- - `mps`: MPS ops implementation and launchers.
+- `tensorrt`: **TensorRT** support for custom ops.
+ - `plugins`: This directory contains the implementation of the supported custom ops. Some ops might also use the shared CUDA kernels in `common/cuda`.
## How to add new PyTorch ops?
1. (Optional) Add a shared kernel in `common` to support special hardware platforms.
- ```c++
- // src/common/cuda/new_ops_cuda_kernel.cuh
-
- template <typename T>
- __global__ void new_ops_forward_cuda_kernel(const T* input, T* output, ...) {
- // forward here
- }
-
- ```
-
- Add the CUDA kernel launcher in `pytorch/cuda`.
-
- ```c++
- // src/pytorch/cuda
- #include <new_ops_cuda_kernel.cuh>
-
- void NewOpsForwardCUDAKernelLauncher(Tensor input, Tensor output, ...){
- // initialize
- at::cuda::CUDAGuard device_guard(input.device());
- cudaStream_t stream = at::cuda::getCurrentCUDAStream();
- ...
- AT_DISPATCH_FLOATING_TYPES_AND_HALF(
- input.scalar_type(), "new_ops_forward_cuda_kernel", ([&] {
- new_ops_forward_cuda_kernel<scalar_t>
- <<<GET_BLOCKS(output_size), THREADS_PER_BLOCK, 0, stream>>>(
- input.data_ptr<scalar_t>(), output.data_ptr<scalar_t>(), ...);
- }));
- AT_CUDA_CHECK(cudaGetLastError());
- }
- ```
+ ```c++
+ // src/common/cuda/new_ops_cuda_kernel.cuh
+
+ template <typename T>
+ __global__ void new_ops_forward_cuda_kernel(const T* input, T* output, ...) {
+ // forward here
+ }
+
+ ```
+
+ Add the CUDA kernel launcher in `pytorch/cuda`.
+
+ ```c++
+ // src/pytorch/cuda
+ #include <new_ops_cuda_kernel.cuh>
+
+ void NewOpsForwardCUDAKernelLauncher(Tensor input, Tensor output, ...){
+ // initialize
+ at::cuda::CUDAGuard device_guard(input.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+ ...
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ input.scalar_type(), "new_ops_forward_cuda_kernel", ([&] {
+ new_ops_forward_cuda_kernel<scalar_t>
+ <<<GET_BLOCKS(output_size), THREADS_PER_BLOCK, 0, stream>>>(
+ input.data_ptr<scalar_t>(), output.data_ptr<scalar_t>(), ...);
+ }));
+ AT_CUDA_CHECK(cudaGetLastError());
+ }
+ ```
2. Register implementation for different devices.
- ```c++
- // src/pytorch/cuda/cudabind.cpp
- ...
+ ```c++
+ // src/pytorch/cuda/cudabind.cpp
+ ...
- Tensor new_ops_forward_cuda(Tensor input, Tensor output, ...){
- // implement cuda forward here
- // use `NewOpsForwardCUDAKernelLauncher` here
- }
- // declare interface here.
- Tensor new_ops_forward_impl(Tensor input, Tensor output, ...);
- // register the implementation for given device (CUDA here).
- REGISTER_DEVICE_IMPL(new_ops_forward_impl, CUDA, new_ops_forward_cuda);
- ```
+ Tensor new_ops_forward_cuda(Tensor input, Tensor output, ...){
+ // implement cuda forward here
+ // use `NewOpsForwardCUDAKernelLauncher` here
+ }
+ // declare interface here.
+ Tensor new_ops_forward_impl(Tensor input, Tensor output, ...);
+ // register the implementation for given device (CUDA here).
+ REGISTER_DEVICE_IMPL(new_ops_forward_impl, CUDA, new_ops_forward_cuda);
+ ```
3. Add the ops implementation in the `pytorch` directory. Select different implementations according to device type.
- ```c++
- // src/pytorch/new_ops.cpp
- Tensor new_ops_forward_impl(Tensor input, Tensor output, ...){
- // dispatch the implementation according to the device type of input.
- DISPATCH_DEVICE_IMPL(new_ops_forward_impl, input, output, ...);
- }
- ...
+ ```c++
+ // src/pytorch/new_ops.cpp
+ Tensor new_ops_forward_impl(Tensor input, Tensor output, ...){
+ // dispatch the implementation according to the device type of input.
+ DISPATCH_DEVICE_IMPL(new_ops_forward_impl, input, output, ...);
+ }
+ ...
- Tensor new_ops_forward(Tensor input, Tensor output, ...){
- return new_ops_forward_impl(input, output, ...);
- }
- ```
+ Tensor new_ops_forward(Tensor input, Tensor output, ...){
+ return new_ops_forward_impl(input, output, ...);
+ }
+ ```
4. Bind the implementation in `pytorch/pybind.cpp`
- ```c++
- // src/pytorch/pybind.cpp
+ ```c++
+ // src/pytorch/pybind.cpp
- ...
+ ...
- Tensor new_ops_forward(Tensor input, Tensor output, ...);
+ Tensor new_ops_forward(Tensor input, Tensor output, ...);
- ...
+ ...
- // bind with pybind11
- m.def("new_ops_forward", &new_ops_forward, "new_ops_forward",
- py::arg("input"), py::arg("output"), ...);
+ // bind with pybind11
+ m.def("new_ops_forward", &new_ops_forward, "new_ops_forward",
+ py::arg("input"), py::arg("output"), ...);
- ...
+ ...
- ```
+ ```
5. Build MMCV again. Enjoy the new ops in Python.
- ```python
- from ..utils import ext_loader
- ext_module = ext_loader.load_ext('_ext', ['new_ops_forward'])
+ ```python
+ from ..utils import ext_loader
+ ext_module = ext_loader.load_ext('_ext', ['new_ops_forward'])
- ...
+ ...
- ext_module.new_ops_forward(input, output, ...)
+ ext_module.new_ops_forward(input, output, ...)
- ```
+ ```
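
The register/dispatch pair in steps 2 and 3 is easiest to see in a toy Python analogue (a sketch only; the real mechanism is the C++ `REGISTER_DEVICE_IMPL` / `DISPATCH_DEVICE_IMPL` macro pair):

```python
# One interface name, several per-device implementations, selected by the
# device type of the first tensor argument.
_impls = {}

def register_device_impl(op_name, device_type, fn):
    _impls[(op_name, device_type)] = fn

def dispatch_device_impl(op_name, tensor, *args, **kwargs):
    return _impls[(op_name, tensor.device.type)](tensor, *args, **kwargs)

register_device_impl('new_ops_forward', 'cuda', lambda inp, out: out.copy_(inp))
```
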
diff --git a/mmcv/ops/csrc/common/box_iou_rotated_utils.hpp b/mmcv/ops/csrc/common/box_iou_rotated_utils.hpp
index a8453eaa8d3638394df8a0b169d8df01dfc27a11..67190dc10eb245bb2bea23133ac984cd1c5a4888 100644
--- a/mmcv/ops/csrc/common/box_iou_rotated_utils.hpp
+++ b/mmcv/ops/csrc/common/box_iou_rotated_utils.hpp
@@ -220,10 +220,6 @@ HOST_DEVICE_INLINE int convex_hull_graham(const Point<T> (&p)[24],
return temp > 0;
}
});
- // compute distance to origin after sort, since the points are now different.
- for (int i = 0; i < num_in; i++) {
- dist[i] = dot_2d<T>(q[i], q[i]);
- }
#endif
// Step 4:
@@ -270,17 +266,6 @@ HOST_DEVICE_INLINE int convex_hull_graham(const Point (&p)[24],
return m;
}
-template <typename T>
-HOST_DEVICE_INLINE T quadri_box_area(const Point<T> (&q)[4]) {
- T area = 0;
-#pragma unroll
- for (int i = 1; i < 3; i++) {
- area += fabs(cross_2d<T>(q[i] - q[0], q[i + 1] - q[0]));
- }
-
- return area / 2.0;
-}
-
template <typename T>
HOST_DEVICE_INLINE T polygon_area(const Point<T> (&q)[24], const int& m) {
if (m <= 2) {
@@ -319,25 +304,6 @@ HOST_DEVICE_INLINE T rotated_boxes_intersection(const RotatedBox& box1,
return polygon_area<T>(orderedPts, num_convex);
}
-template <typename T>
-HOST_DEVICE_INLINE T quadri_boxes_intersection(const Point<T> (&pts1)[4],
- const Point<T> (&pts2)[4]) {
- // There are up to 4 x 4 + 4 + 4 = 24 intersections (including dups) returned
- // from rotated_rect_intersection_pts
- Point<T> intersectPts[24], orderedPts[24];
-
- int num = get_intersection_points<T>(pts1, pts2, intersectPts);
-
- if (num <= 2) {
- return 0.0;
- }
-
- // Convex Hull to order the intersection points in clockwise order and find
- // the contour area.
- int num_convex = convex_hull_graham<T>(intersectPts, num, orderedPts, true);
- return polygon_area<T>(orderedPts, num_convex);
-}
-
} // namespace
template <typename T>
@@ -375,52 +341,3 @@ HOST_DEVICE_INLINE T single_box_iou_rotated(T const* const box1_raw,
const T iou = intersection / baseS;
return iou;
}
-
-template <typename T>
-HOST_DEVICE_INLINE T single_box_iou_quadri(T const* const pts1_raw,
- T const* const pts2_raw,
- const int mode_flag) {
- // shift center to the middle point to achieve higher precision in result
- Point<T> pts1[4], pts2[4];
-
- auto center_shift_x =
- (pts1_raw[0] + pts2_raw[0] + pts1_raw[2] + pts2_raw[2] + pts1_raw[4] +
- pts2_raw[4] + pts1_raw[6] + pts2_raw[6]) /
- 8.0;
- auto center_shift_y =
- (pts1_raw[1] + pts2_raw[1] + pts1_raw[3] + pts2_raw[3] + pts1_raw[5] +
- pts2_raw[5] + pts1_raw[7] + pts2_raw[7]) /
- 8.0;
- pts1[0].x = pts1_raw[0] - center_shift_x;
- pts1[0].y = pts1_raw[1] - center_shift_y;
- pts1[1].x = pts1_raw[2] - center_shift_x;
- pts1[1].y = pts1_raw[3] - center_shift_y;
- pts1[2].x = pts1_raw[4] - center_shift_x;
- pts1[2].y = pts1_raw[5] - center_shift_y;
- pts1[3].x = pts1_raw[6] - center_shift_x;
- pts1[3].y = pts1_raw[7] - center_shift_y;
- pts2[0].x = pts2_raw[0] - center_shift_x;
- pts2[0].y = pts2_raw[1] - center_shift_y;
- pts2[1].x = pts2_raw[2] - center_shift_x;
- pts2[1].y = pts2_raw[3] - center_shift_y;
- pts2[2].x = pts2_raw[4] - center_shift_x;
- pts2[2].y = pts2_raw[5] - center_shift_y;
- pts2[3].x = pts2_raw[6] - center_shift_x;
- pts2[3].y = pts2_raw[7] - center_shift_y;
-
- const T area1 = quadri_box_area<T>(pts1);
- const T area2 = quadri_box_area<T>(pts2);
- if (area1 < 1e-14 || area2 < 1e-14) {
- return 0.f;
- }
-
- const T intersection = quadri_boxes_intersection<T>(pts1, pts2);
- T baseS = 1.0;
- if (mode_flag == 0) {
- baseS = (area1 + area2 - intersection);
- } else if (mode_flag == 1) {
- baseS = area1;
- }
- const T iou = intersection / baseS;
- return iou;
-}
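
The removed `quadri_box_area` is a shoelace-style fan triangulation from the first corner; the same arithmetic in plain Python (a reference sketch, not part of the library):

```python
def quadri_box_area(pts):
    """pts: four (x, y) corners in order; area from two fan triangles."""
    area = 0.0
    for i in (1, 2):
        ax, ay = pts[i][0] - pts[0][0], pts[i][1] - pts[0][1]
        bx, by = pts[i + 1][0] - pts[0][0], pts[i + 1][1] - pts[0][1]
        area += abs(ax * by - ay * bx)  # |cross_2d| of the two edge vectors
    return area / 2.0
```
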
diff --git a/mmcv/ops/csrc/common/cuda/active_rotated_filter_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/active_rotated_filter_cuda_kernel.cuh
deleted file mode 100644
index 36e41107ebd52d3cf5e9a71cffe6eddeed4f0765..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/active_rotated_filter_cuda_kernel.cuh
+++ /dev/null
@@ -1,59 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved.
-// Modified from
-// https://github.com/csuhan/s2anet/blob/master/mmdet/ops/orn/src/cuda/ActiveRotatingFilter_cuda.cu
-#ifndef ACTIVE_ROTATED_FILTER_CUDA_KERNEL_CUH
-#define ACTIVE_ROTATED_FILTER_CUDA_KERNEL_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-
-template <typename scalar_t>
-__global__ void active_rotated_filter_forward_cuda_kernel(
- const int nthreads, const scalar_t* weight_data, const int* indices_data,
- const int num_input_planes, const int num_output_planes,
- const int num_orientations, const int num_rotations, const int nEntry,
- scalar_t* output_data) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- int l = index % nEntry;
- int j = (index / nEntry) % num_input_planes;
- int i = index / nEntry / num_input_planes;
- int k;
- scalar_t val = *(weight_data + index);
- for (k = 0; k < num_rotations; k++) {
- int idx = (int)(*(indices_data + l * num_rotations + k)) - 1;
- scalar_t* target = output_data +
- i * (num_rotations * num_input_planes * nEntry) +
- k * (num_input_planes * nEntry) + j * (nEntry) + idx;
- *target = val;
- }
- }
-}
-
-template <typename scalar_t>
-__global__ void active_rotated_filter_backward_cuda_kernel(
- const int nthreads, const scalar_t* gradWeight_data,
- const int* indices_data, const int num_input_planes,
- const int num_output_planes, const int num_orientations,
- const int num_rotations, const int nEntry, scalar_t* weight_data) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- int l = index % nEntry;
- int j = (index / nEntry) % num_input_planes;
- int i = index / nEntry / num_input_planes;
- int k;
- scalar_t* val = weight_data + index;
- *val = 0;
- scalar_t tmp = 0;
- for (k = 0; k < num_rotations; k++) {
- int idx = (int)(*(indices_data + l * num_rotations + k)) - 1;
- scalar_t target =
- *(gradWeight_data + i * (num_rotations * num_input_planes * nEntry) +
- k * (num_input_planes * nEntry) + j * (nEntry) + idx);
- tmp = tmp + target;
- }
- *val = tmp;
- }
-}
-#endif // ACTIVE_ROTATED_FILTER_CUDA_KERNEL_CUH
diff --git a/mmcv/ops/csrc/common/cuda/assign_score_withk_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/assign_score_withk_cuda_kernel.cuh
index 9f9250844b9ceeca0df0377640c3d28e3f61cecc..056d12334b555bbbf14253382736bd6329805559 100644
--- a/mmcv/ops/csrc/common/cuda/assign_score_withk_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/assign_score_withk_cuda_kernel.cuh
@@ -22,34 +22,34 @@ __global__ void assign_score_withk_forward_cuda_kernel(
const int O, const int aggregate, const T* points, const T* centers,
const T* scores, const int64_t* knn_idx, T* output) {
// ----- parallel loop for B, N1, K and O ---------
- CUDA_1D_KERNEL_LOOP(i, B * O * N1 * K) {
- // ------- loop for M ----------
- const int b = (int)(i / (O * N1 * K));
- const int o = (int)(i % (O * N1 * K) / (N1 * K));
- const int n = (int)(i % (N1 * K) / K);
- const int k = (int)(i % K);
- const int cn = (int)knn_idx[b * K * N1 + n * K +
- 0]; // The first neighbor is the center point
- const int kn = (int)knn_idx[b * K * N1 + n * K + k];
- if (kn >= N0 ||
- kn < 0) { // if index overflows, it is out of the neighborhood range
- return;
- }
- assert(b < B);
- assert(kn < N0);
- assert(cn < N0);
- assert(o < O);
- assert(n < N1);
- const int out_idx = b * N1 * O * K + o * N1 * K + n * K + k;
- T val = output[out_idx];
- for (int m = 0; m < M; m++) {
- val += points[b * N0 * M * O + kn * M * O + m * O + o] *
- scores[b * N1 * K * M + n * K * M + k * M + m] -
- centers[b * N0 * M * O + cn * M * O + m * O + o] *
- scores[b * N1 * K * M + n * K * M + k * M + m];
- }
- output[out_idx] = val;
+ long i = blockIdx.x * blockDim.x + threadIdx.x;
+ if (i >= B * N1 * K * O) return;
+ // ------- loop for M ----------
+ const int b = (int)(i / (O * N1 * K));
+ const int o = (int)(i % (O * N1 * K) / (N1 * K));
+ const int n = (int)(i % (N1 * K) / K);
+ const int k = (int)(i % K);
+ const int cn = (int)knn_idx[b * K * N1 + n * K +
+ 0]; // The first neighbor is the center point
+ const int kn = (int)knn_idx[b * K * N1 + n * K + k];
+ if (kn >= N0 ||
+ kn < 0) { // if index overflows, it is out of the neighborhood range
+ return;
+ }
+ assert(b < B);
+ assert(kn < N0);
+ assert(cn < N0);
+ assert(o < O);
+ assert(n < N1);
+ const int out_idx = b * N1 * O * K + o * N1 * K + n * K + k;
+ T val = output[out_idx];
+ for (int m = 0; m < M; m++) {
+ val += points[b * N0 * M * O + kn * M * O + m * O + o] *
+ scores[b * N1 * K * M + n * K * M + k * M + m] -
+ centers[b * N0 * M * O + cn * M * O + m * O + o] *
+ scores[b * N1 * K * M + n * K * M + k * M + m];
}
+ output[out_idx] = val;
}
template <typename T>
@@ -58,27 +58,27 @@ __global__ void assign_score_withk_points_backward_cuda_kernel(
const int O, const int aggregate, const T* grad_out, const T* scores,
const int64_t* knn_idx, T* grad_points, T* grad_centers) {
// ----- parallel loop for B, M, O ---------
- CUDA_1D_KERNEL_LOOP(i, B * M * O) {
- int b = (int)(i / (M * O));
- int m = (int)(i % (M * O) / O);
- int o = (int)(i % O);
+ long i = blockIdx.x * blockDim.x + threadIdx.x;
+ if (i >= B * M * O) return;
+ int b = (int)(i / (M * O));
+ int m = (int)(i % (M * O) / O);
+ int o = (int)(i % O);
- // ----- loop for N,K ---------
- for (int n = 0; n < N; n++) {
- for (int k = 0; k < K; k++) {
- int kn = knn_idx[b * N * K + n * K + k];
- int cn = knn_idx[b * N * K + n * K + 0];
- if (kn >= N0 || kn < 0) { // if index overflows, it is out of the
- // neighborhood range
- continue;
- }
- atomicAdd(grad_points + b * N0 * M * O + kn * M * O + m * O + o,
- scores[b * N * K * M + n * K * M + k * M + m] *
- grad_out[b * O * N * K + o * N * K + n * K + k]);
- atomicAdd(grad_centers + b * N0 * M * O + cn * M * O + m * O + o,
- -scores[b * N * K * M + n * K * M + k * M + m] *
- grad_out[b * O * N * K + o * N * K + n * K + k]);
+ // ----- loop for N,K ---------
+ for (int n = 0; n < N; n++) {
+ for (int k = 0; k < K; k++) {
+ int kn = knn_idx[b * N * K + n * K + k];
+ int cn = knn_idx[b * N * K + n * K + 0];
+ if (kn >= N0 ||
+ kn < 0) { // if index overflows, it is out of the neighborhood range
+ continue;
}
+ atomicAdd(grad_points + b * N0 * M * O + kn * M * O + m * O + o,
+ scores[b * N * K * M + n * K * M + k * M + m] *
+ grad_out[b * O * N * K + o * N * K + n * K + k]);
+ atomicAdd(grad_centers + b * N0 * M * O + cn * M * O + m * O + o,
+ -scores[b * N * K * M + n * K * M + k * M + m] *
+ grad_out[b * O * N * K + o * N * K + n * K + k]);
}
}
}
@@ -89,28 +89,28 @@ __global__ void assign_score_withk_scores_backward_cuda_kernel(
const int O, const int aggregate, const T* grad_out, const T* points,
const T* centers, const int64_t* knn_idx, T* grad_scores) {
// ----- parallel loop for B, N, K, M ---------
- CUDA_1D_KERNEL_LOOP(i, B * N * K * M) {
- const int b = (int)(i / (N * M * K));
- const int n = (int)(i % (N * M * K) / M / K);
- const int k = (int)(i % (M * K) / M);
- const int m = (int)(i % M);
- const int cn = knn_idx[b * N * K + n * K + 0];
- const int kn = knn_idx[b * N * K + n * K + k];
- if (kn >= N0 ||
- kn < 0) { // if index overflows, it is out of the neighborhood range
- return;
- }
+ long i = blockIdx.x * blockDim.x + threadIdx.x;
+ if (i >= B * N * K * M) return;
+ const int b = (int)(i / (N * M * K));
+ const int n = (int)(i % (N * M * K) / M / K);
+ const int k = (int)(i % (M * K) / M);
+ const int m = (int)(i % M);
+ const int cn = knn_idx[b * N * K + n * K + 0];
+ const int kn = knn_idx[b * N * K + n * K + k];
+ if (kn >= N0 ||
+ kn < 0) { // if index overflows, it is out of the neighborhood range
+ return;
+ }
- // -------------- loop for O ------------------------
- const int out_idx = b * N * K * M + n * K * M + k * M + m;
- T val = grad_scores[out_idx];
- for (int o = 0; o < O; o++) {
- val += (points[b * N0 * M * O + kn * M * O + m * O + o] -
- centers[b * N0 * M * O + cn * M * O + m * O + o]) *
- grad_out[b * O * N * K + o * N * K + n * K + k];
- }
- grad_scores[out_idx] = val;
+ // -------------- loop for O ------------------------
+ const int out_idx = b * N * K * M + n * K * M + k * M + m;
+ T val = grad_scores[out_idx];
+ for (int o = 0; o < O; o++) {
+ val += (points[b * N0 * M * O + kn * M * O + m * O + o] -
+ centers[b * N0 * M * O + cn * M * O + m * O + o]) *
+ grad_out[b * O * N * K + o * N * K + n * K + k];
}
+ grad_scores[out_idx] = val;
}
#endif // ASSIGN_SCORE_WITHK_CUDA_KERNEL_CUH
diff --git a/mmcv/ops/csrc/common/cuda/ball_query_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/ball_query_cuda_kernel.cuh
index 632b5c4940b33a9d8d839fa3f3b92e7b6a2bd29e..ba2af01b5e4c67ec8498ac167e26a5116d853b62 100644
--- a/mmcv/ops/csrc/common/cuda/ball_query_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/ball_query_cuda_kernel.cuh
@@ -21,36 +21,35 @@ __global__ void ball_query_forward_cuda_kernel(int b, int n, int m,
// output:
// idx: (B, M, nsample)
int bs_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(pt_idx, m) {
- if (bs_idx >= b) return;
+ int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
+ if (bs_idx >= b || pt_idx >= m) return;
- new_xyz += bs_idx * m * 3 + pt_idx * 3;
- xyz += bs_idx * n * 3;
- idx += bs_idx * m * nsample + pt_idx * nsample;
+ new_xyz += bs_idx * m * 3 + pt_idx * 3;
+ xyz += bs_idx * n * 3;
+ idx += bs_idx * m * nsample + pt_idx * nsample;
- float max_radius2 = max_radius * max_radius;
- float min_radius2 = min_radius * min_radius;
- T new_x = new_xyz[0];
- T new_y = new_xyz[1];
- T new_z = new_xyz[2];
+ float max_radius2 = max_radius * max_radius;
+ float min_radius2 = min_radius * min_radius;
+ T new_x = new_xyz[0];
+ T new_y = new_xyz[1];
+ T new_z = new_xyz[2];
- int cnt = 0;
- for (int k = 0; k < n; ++k) {
- T x = xyz[k * 3 + 0];
- T y = xyz[k * 3 + 1];
- T z = xyz[k * 3 + 2];
- T d2 = (new_x - x) * (new_x - x) + (new_y - y) * (new_y - y) +
- (new_z - z) * (new_z - z);
- if (d2 == 0 || (d2 >= min_radius2 && d2 < max_radius2)) {
- if (cnt == 0) {
- for (int l = 0; l < nsample; ++l) {
- idx[l] = k;
- }
+ int cnt = 0;
+ for (int k = 0; k < n; ++k) {
+ T x = xyz[k * 3 + 0];
+ T y = xyz[k * 3 + 1];
+ T z = xyz[k * 3 + 2];
+ T d2 = (new_x - x) * (new_x - x) + (new_y - y) * (new_y - y) +
+ (new_z - z) * (new_z - z);
+ if (d2 == 0 || (d2 >= min_radius2 && d2 < max_radius2)) {
+ if (cnt == 0) {
+ for (int l = 0; l < nsample; ++l) {
+ idx[l] = k;
}
- idx[cnt] = k;
- ++cnt;
- if (cnt >= nsample) break;
}
+ idx[cnt] = k;
+ ++cnt;
+ if (cnt >= nsample) break;
}
}
}
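
The kernel's semantics are easier to read as a dense reference. A torch sketch for a single batch (an assumption for brevity; note how the kernel seeds all `nsample` slots with the first hit, then overwrites them in order):

```python
import torch

def ball_query_ref(new_xyz, xyz, min_radius, max_radius, nsample):
    d2 = torch.cdist(new_xyz, xyz).pow(2)                      # (M, N)
    ok = (d2 == 0) | ((d2 >= min_radius ** 2) & (d2 < max_radius ** 2))
    idx = new_xyz.new_zeros(new_xyz.size(0), nsample, dtype=torch.long)
    for m in range(new_xyz.size(0)):
        hits = ok[m].nonzero(as_tuple=False).flatten()[:nsample]
        if hits.numel() > 0:
            idx[m].fill_(hits[0])          # pad every slot with the first hit
            idx[m, :hits.numel()] = hits   # then fill found neighbors in order
    return idx
```
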
diff --git a/mmcv/ops/csrc/common/cuda/bbox_overlaps_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/bbox_overlaps_cuda_kernel.cuh
index 15bd91eca629895d3a99dde3fe6614036ca31dc9..249c9e85009d00af2bee5380a0013135f36c303b 100644
--- a/mmcv/ops/csrc/common/cuda/bbox_overlaps_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/bbox_overlaps_cuda_kernel.cuh
@@ -8,27 +8,6 @@
#include "pytorch_cuda_helper.hpp"
#endif
-template <typename T>
-__device__ __forceinline__ void load_bbox(const T* bbox, const int base, T& x1,
- T& y1, T& x2, T& y2) {
- x1 = bbox[base];
- y1 = bbox[base + 1];
- x2 = bbox[base + 2];
- y2 = bbox[base + 3];
-}
-
-template <>
-__device__ __forceinline__ void load_bbox<float>(const float* bbox,
- const int base, float& x1,
- float& y1, float& x2,
- float& y2) {
- const float4 bbox_offset = reinterpret_cast<const float4*>(bbox + base)[0];
- x1 = bbox_offset.x;
- y1 = bbox_offset.y;
- x2 = bbox_offset.z;
- y2 = bbox_offset.w;
-}
-
template <typename T>
__global__ void bbox_overlaps_cuda_kernel(const T* bbox1, const T* bbox2,
T* ious, const int num_bbox1,
@@ -37,111 +16,69 @@ __global__ void bbox_overlaps_cuda_kernel(const T* bbox1, const T* bbox2,
const int offset) {
if (aligned) {
CUDA_1D_KERNEL_LOOP(index, num_bbox1) {
- const int b1 = index;
- const int b2 = index;
-
- const int base1 = b1 << 2; // b1 * 4
- T b1_x1, b1_y1, b1_x2, b1_y2;
- load_bbox<T>(bbox1, base1, b1_x1, b1_y1, b1_x2, b1_y2);
- const T b1_area = (b1_x2 - b1_x1 + offset) * (b1_y2 - b1_y1 + offset);
-
- const int base2 = b2 << 2; // b2 * 4
- T b2_x1, b2_y1, b2_x2, b2_y2;
- load_bbox<T>(bbox2, base2, b2_x1, b2_y1, b2_x2, b2_y2);
- const T b2_area = (b2_x2 - b2_x1 + offset) * (b2_y2 - b2_y1 + offset);
-
- const T left = fmaxf(b1_x1, b2_x1), right = fminf(b1_x2, b2_x2);
- const T top = fmaxf(b1_y1, b2_y1), bottom = fminf(b1_y2, b2_y2);
- const T width = fmaxf(right - left + offset, 0.f);
- const T height = fmaxf(bottom - top + offset, 0.f);
- const T interS = width * height;
-
- const T baseS =
- fmaxf(mode == 0 ? b1_area + b2_area - interS : b1_area, T(offset));
+ int b1 = index;
+ int b2 = index;
+
+ int base1 = b1 * 4;
+ T b1_x1 = bbox1[base1];
+ T b1_y1 = bbox1[base1 + 1];
+ T b1_x2 = bbox1[base1 + 2];
+ T b1_y2 = bbox1[base1 + 3];
+ T b1_area = (b1_x2 - b1_x1 + offset) * (b1_y2 - b1_y1 + offset);
+
+ int base2 = b2 * 4;
+ T b2_x1 = bbox2[base2];
+ T b2_y1 = bbox2[base2 + 1];
+ T b2_x2 = bbox2[base2 + 2];
+ T b2_y2 = bbox2[base2 + 3];
+ T b2_area = (b2_x2 - b2_x1 + offset) * (b2_y2 - b2_y1 + offset);
+
+ T left = fmaxf(b1_x1, b2_x1), right = fminf(b1_x2, b2_x2);
+ T top = fmaxf(b1_y1, b2_y1), bottom = fminf(b1_y2, b2_y2);
+ T width = fmaxf(right - left + offset, 0.f);
+ T height = fmaxf(bottom - top + offset, 0.f);
+ T interS = width * height;
+ T baseS = 1.0;
+ if (mode == 0) {
+ baseS = fmaxf(b1_area + b2_area - interS, T(offset));
+ } else if (mode == 1) {
+ baseS = fmaxf(b1_area, T(offset));
+ }
ious[index] = interS / baseS;
}
} else {
CUDA_1D_KERNEL_LOOP(index, num_bbox1 * num_bbox2) {
- const int b1 = index / num_bbox2;
- const int b2 = index % num_bbox2;
-
- const int base1 = b1 << 2; // b1 * 4
- T b1_x1, b1_y1, b1_x2, b1_y2;
- load_bbox<T>(bbox1, base1, b1_x1, b1_y1, b1_x2, b1_y2);
- const T b1_area = (b1_x2 - b1_x1 + offset) * (b1_y2 - b1_y1 + offset);
-
- const int base2 = b2 << 2; // b2 * 4
- T b2_x1, b2_y1, b2_x2, b2_y2;
- load_bbox<T>(bbox2, base2, b2_x1, b2_y1, b2_x2, b2_y2);
- const T b2_area = (b2_x2 - b2_x1 + offset) * (b2_y2 - b2_y1 + offset);
-
- const T left = fmaxf(b1_x1, b2_x1), right = fminf(b1_x2, b2_x2);
- const T top = fmaxf(b1_y1, b2_y1), bottom = fminf(b1_y2, b2_y2);
- const T width = fmaxf(right - left + offset, 0.f);
- const T height = fmaxf(bottom - top + offset, 0.f);
- const T interS = width * height;
-
- const T baseS =
- fmaxf(mode == 0 ? b1_area + b2_area - interS : b1_area, T(offset));
+ int b1 = index / num_bbox2;
+ int b2 = index % num_bbox2;
+
+ int base1 = b1 * 4;
+ T b1_x1 = bbox1[base1];
+ T b1_y1 = bbox1[base1 + 1];
+ T b1_x2 = bbox1[base1 + 2];
+ T b1_y2 = bbox1[base1 + 3];
+ T b1_area = (b1_x2 - b1_x1 + offset) * (b1_y2 - b1_y1 + offset);
+
+ int base2 = b2 * 4;
+ T b2_x1 = bbox2[base2];
+ T b2_y1 = bbox2[base2 + 1];
+ T b2_x2 = bbox2[base2 + 2];
+ T b2_y2 = bbox2[base2 + 3];
+ T b2_area = (b2_x2 - b2_x1 + offset) * (b2_y2 - b2_y1 + offset);
+
+ T left = fmaxf(b1_x1, b2_x1), right = fminf(b1_x2, b2_x2);
+ T top = fmaxf(b1_y1, b2_y1), bottom = fminf(b1_y2, b2_y2);
+ T width = fmaxf(right - left + offset, 0.f);
+ T height = fmaxf(bottom - top + offset, 0.f);
+ T interS = width * height;
+ T baseS = 1.0;
+ if (mode == 0) {
+ baseS = fmaxf(b1_area + b2_area - interS, T(offset));
+ } else if (mode == 1) {
+ baseS = fmaxf(b1_area, T(offset));
+ }
ious[index] = interS / baseS;
}
}
}
-#if __CUDA_ARCH__ >= 530
-__device__ __forceinline__ __half __half_area(const __half x1, const __half y1,
- const __half x2, const __half y2,
- const __half offset) {
- const __half half_w = __hadd(__hsub(x2, x1), offset);
- const __half half_h = __hadd(__hsub(y2, y1), offset);
- return __hmul(half_w, half_h);
-}
-
-__device__ __forceinline__ __half __half_max(const __half a, const __half b) {
- return __hge(a, b) ? a : b;
-}
-
-__device__ __forceinline__ __half __half_min(const __half a, const __half b) {
- return __hle(a, b) ? a : b;
-}
-
-// fp16 won't provide much increase when aligned==true. It is useful when
-// aligned==false, which would give you ~40% bonus.
-__device__ void bbox_overlaps_cuda_kernel_half(
- const __half* bbox1, const __half* bbox2, __half* ious, const int num_bbox1,
- const int num_bbox2, const int mode, const bool aligned, const int offset) {
- const int num_output = aligned ? num_bbox1 : num_bbox1 * num_bbox2;
- const __half h_offset = __int2half_rn(offset);
- CUDA_1D_KERNEL_LOOP(index, num_output) {
- const int b1 = aligned ? index : index / num_bbox2;
- const int b2 = aligned ? index : index % num_bbox2;
-
- const int base1 = b1 << 2;
- __half b1_x1, b1_y1, b1_x2, b1_y2;
- load_bbox<__half>(bbox1, base1, b1_x1, b1_y1, b1_x2, b1_y2);
- const __half b1_area = __half_area(b1_x1, b1_y1, b1_x2, b1_y2, h_offset);
-
- const int base2 = b2 << 2;
- __half b2_x1, b2_y1, b2_x2, b2_y2;
- load_bbox<__half>(bbox2, base2, b2_x1, b2_y1, b2_x2, b2_y2);
- const __half b2_area = __half_area(b2_x1, b2_y1, b2_x2, b2_y2, h_offset);
-
- const __half left = __half_max(b1_x1, b2_x1),
- right = __half_min(b1_x2, b2_x2);
- const __half top = __half_max(b1_y1, b2_y1),
- bottom = __half_min(b1_y2, b2_y2);
- const __half width =
- __half_max(__hadd(__hsub(right, left), h_offset), __float2half(0.f));
- const __half height =
- __half_max(__hadd(__hsub(bottom, top), h_offset), __float2half(0.f));
- const __half interS = __hmul(width, height);
-
- const __half baseS = __half_max(
- mode == 0 ? __hsub(__hadd(b1_area, b2_area), interS) : b1_area,
- h_offset);
- ious[index] = __hdiv(interS, baseS);
- }
-}
-#endif // __CUDA_ARCH__ >= 530
-
#endif // BBOX_OVERLAPS_CUDA_KERNEL_CUH
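
A torch reference for the aligned branch, including the `offset` convention (0 or 1 added to widths and heights); a sketch for reading the kernel, not a replacement for it:

```python
import torch

def bbox_overlaps_aligned_ref(b1, b2, mode=0, offset=0):
    area1 = (b1[:, 2] - b1[:, 0] + offset) * (b1[:, 3] - b1[:, 1] + offset)
    area2 = (b2[:, 2] - b2[:, 0] + offset) * (b2[:, 3] - b2[:, 1] + offset)
    w = (torch.min(b1[:, 2], b2[:, 2])
         - torch.max(b1[:, 0], b2[:, 0]) + offset).clamp(min=0)
    h = (torch.min(b1[:, 3], b2[:, 3])
         - torch.max(b1[:, 1], b2[:, 1]) + offset).clamp(min=0)
    inter = w * h
    base = area1 + area2 - inter if mode == 0 else area1  # mode 0: IoU, 1: IoF
    return inter / base.clamp(min=max(offset, 1e-6))
```
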
diff --git a/mmcv/ops/csrc/common/cuda/bezier_align_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/bezier_align_cuda_kernel.cuh
deleted file mode 100644
index 537610416e16aae8979d0843972e090d127b0d43..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/bezier_align_cuda_kernel.cuh
+++ /dev/null
@@ -1,230 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved
-// Modified from
-// https://github.com/aim-uofa/AdelaiDet/blob/master/adet/layers/csrc/BezierAlign/BezierAlign_cuda.cu
-#ifndef BEZIER_ALIGN_CUDA_KERNEL_CUH
-#define BEZIER_ALIGN_CUDA_KERNEL_CUH
-
-#include <float.h>
-#ifdef MMCV_WITH_TRT
-#include "common_cuda_helper.hpp"
-#else // MMCV_WITH_TRT
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else // MMCV_USE_PARROTS
-#include "pytorch_cuda_helper.hpp"
-#endif // MMCV_USE_PARROTS
-#endif // MMCV_WITH_TRT
-
-template <typename T>
-__device__ T bezier_curve(const T p0, const T p1, const T p2, const T p3,
- const T u) {
- return ((1. - u) * (1. - u) * (1. - u) * p0 +
- 3. * u * (1. - u) * (1. - u) * p1 + 3. * u * u * (1. - u) * p2 +
- u * u * u * p3);
-}
-
-template <typename T>
-__global__ void bezier_align_forward_cuda_kernel(
- const int nthreads,
- const T *bottom_data, // inputs
- const T *bottom_rois, // bottom rois contains the bezier curve
- T *top_data, // outputs
- const int pooled_height, const int pooled_width, const T spatial_scale,
- const int sampling_ratio, bool aligned, const int channels,
- const int height, const int width) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- // (n, c, ph, pw) is an element in the pooled output
- int pw = index % pooled_width;
- int ph = (index / pooled_width) % pooled_height;
- int c = (index / pooled_width / pooled_height) % channels;
- int n = index / pooled_width / pooled_height / channels;
-
- // beziers have size Nx(1+8*2) = Nx17
- const T *offset_bottom_rois = bottom_rois + n * 17;
- int roi_batch_ind = offset_bottom_rois[0];
-
- // Do not use rounding; this implementation detail is critical
- T offset = aligned ? (T)0.5 : (T)0.0;
-
- // TODO: avoid this by using parallel annotation, for good
- T p0_x = offset_bottom_rois[1] * spatial_scale;
- T p0_y = offset_bottom_rois[2] * spatial_scale;
- T p1_x = offset_bottom_rois[3] * spatial_scale;
- T p1_y = offset_bottom_rois[4] * spatial_scale;
- T p2_x = offset_bottom_rois[5] * spatial_scale;
- T p2_y = offset_bottom_rois[6] * spatial_scale;
- T p3_x = offset_bottom_rois[7] * spatial_scale;
- T p3_y = offset_bottom_rois[8] * spatial_scale;
- T p4_x = offset_bottom_rois[15] * spatial_scale;
- T p4_y = offset_bottom_rois[16] * spatial_scale;
- T p5_x = offset_bottom_rois[13] * spatial_scale;
- T p5_y = offset_bottom_rois[14] * spatial_scale;
- T p6_x = offset_bottom_rois[11] * spatial_scale;
- T p6_y = offset_bottom_rois[12] * spatial_scale;
- T p7_x = offset_bottom_rois[9] * spatial_scale;
- T p7_y = offset_bottom_rois[10] * spatial_scale;
-
- // compute the coords
- const T u = pw / static_cast<T>(pooled_width);
- const T v = ph / static_cast<T>(pooled_height);
- const T x0 = bezier_curve(p0_x, p1_x, p2_x, p3_x, u);
- const T y0 = bezier_curve(p0_y, p1_y, p2_y, p3_y, u);
- const T x1 = bezier_curve(p4_x, p5_x, p6_x, p7_x, u);
- const T y1 = bezier_curve(p4_y, p5_y, p6_y, p7_y, u);
- const T x_center = x1 * v + x0 * (1. - v) - offset;
- const T y_center = y1 * v + y0 * (1. - v) - offset;
-
- T roi_width = max(abs(p0_x - p3_x), abs(p4_x - p7_x));
- T roi_height = max(abs(p0_y - p3_y), abs(p4_y - p7_y));
- if (!aligned) { // for backward-compatibility only
- roi_width = max(roi_width, (T)1.);
- roi_height = max(roi_height, (T)1.);
- }
- T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
- T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);
-
- const T *offset_bottom_data =
- bottom_data + (roi_batch_ind * channels + c) * height * width;
-
- // We use roi_bin_grid to sample the grid and mimic integral
- int roi_bin_grid_h = (sampling_ratio > 0)
- ? sampling_ratio
- : ceil(roi_height / pooled_height); // e.g., = 2
- int roi_bin_grid_w =
- (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);
-
- // We do average (integral) pooling inside a bin
- // When the grid is empty, output zeros == 0/1, instead of NaN.
- const T count = max(roi_bin_grid_h * roi_bin_grid_w, 1); // e.g. = 4
-
- T output_val = 0.;
- for (int iy = 0; iy < roi_bin_grid_h; iy++) // e.g., iy = 0, 1
- {
- const T y = y_center - (T)0.5 * bin_size_h +
- static_cast<T>(iy + .5f) * bin_size_h /
- static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5
- for (int ix = 0; ix < roi_bin_grid_w; ix++) {
- const T x = x_center - (T)0.5 * bin_size_w +
- static_cast<T>(ix + .5f) * bin_size_w /
- static_cast<T>(roi_bin_grid_w);
-
- T val = bilinear_interpolate(offset_bottom_data, height, width, y, x,
- index);
- output_val += val;
- }
- }
- output_val /= count;
-
- top_data[index] = output_val;
- }
-}
-
-template <typename T>
-__global__ void bezier_align_backward_cuda_kernel(
- const int nthreads, const T *top_diff, const T *bottom_rois, T *bottom_diff,
- const int pooled_height, const int pooled_width, const T spatial_scale,
- const int sampling_ratio, bool aligned, const int channels,
- const int height, const int width) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- // (n, c, ph, pw) is an element in the pooled output
- int pw = index % pooled_width;
- int ph = (index / pooled_width) % pooled_height;
- int c = (index / pooled_width / pooled_height) % channels;
- int n = index / pooled_width / pooled_height / channels;
-
- // beziers have size Nx(1+8*2) = Nx17
- const T *offset_bottom_rois = bottom_rois + n * 17;
- int roi_batch_ind = offset_bottom_rois[0];
-
- // Do not use rounding; this implementation detail is critical
- T offset = aligned ? (T)0.5 : (T)0.0;
- T p0_x = offset_bottom_rois[1] * spatial_scale;
- T p0_y = offset_bottom_rois[2] * spatial_scale;
- T p1_x = offset_bottom_rois[3] * spatial_scale;
- T p1_y = offset_bottom_rois[4] * spatial_scale;
- T p2_x = offset_bottom_rois[5] * spatial_scale;
- T p2_y = offset_bottom_rois[6] * spatial_scale;
- T p3_x = offset_bottom_rois[7] * spatial_scale;
- T p3_y = offset_bottom_rois[8] * spatial_scale;
- T p4_x = offset_bottom_rois[15] * spatial_scale;
- T p4_y = offset_bottom_rois[16] * spatial_scale;
- T p5_x = offset_bottom_rois[13] * spatial_scale;
- T p5_y = offset_bottom_rois[14] * spatial_scale;
- T p6_x = offset_bottom_rois[11] * spatial_scale;
- T p6_y = offset_bottom_rois[12] * spatial_scale;
- T p7_x = offset_bottom_rois[9] * spatial_scale;
- T p7_y = offset_bottom_rois[10] * spatial_scale;
-
- // compute the coords
- const T u = pw / static_cast<T>(pooled_width);
- const T v = ph / static_cast<T>(pooled_height);
- const T x0 = bezier_curve(p0_x, p1_x, p2_x, p3_x, u);
- const T y0 = bezier_curve(p0_y, p1_y, p2_y, p3_y, u);
- const T x1 = bezier_curve(p4_x, p5_x, p6_x, p7_x, u);
- const T y1 = bezier_curve(p4_y, p5_y, p6_y, p7_y, u);
- const T x_center = x1 * v + x0 * (1. - v) - offset;
- const T y_center = y1 * v + y0 * (1. - v) - offset;
-
- T roi_width = max(abs(p0_x - p3_x), abs(p4_x - p7_x));
- T roi_height = max(abs(p0_y - p3_y), abs(p4_y - p7_y));
- if (!aligned) { // for backward-compatibility only
- roi_width = max(roi_width, (T)1.);
- roi_height = max(roi_height, (T)1.);
- }
- T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
- T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);
-
- T *offset_bottom_diff =
- bottom_diff + (roi_batch_ind * channels + c) * height * width;
-
- int top_offset = (n * channels + c) * pooled_height * pooled_width;
- const T *offset_top_diff = top_diff + top_offset;
- const T top_diff_this_bin = offset_top_diff[ph * pooled_width + pw];
-
- // We use roi_bin_grid to sample the grid and mimic integral
- int roi_bin_grid_h = (sampling_ratio > 0)
- ? sampling_ratio
- : ceil(roi_height / pooled_height); // e.g., = 2
- int roi_bin_grid_w =
- (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);
-
- // We do average (integral) pooling inside a bin
- const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4
-
- for (int iy = 0; iy < roi_bin_grid_h; iy++) // e.g., iy = 0, 1
- {
- const T y = y_center - (T)0.5 * bin_size_h +
- static_cast<T>(iy + .5f) * bin_size_h /
- static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5
- for (int ix = 0; ix < roi_bin_grid_w; ix++) {
- const T x = x_center - (T)0.5 * bin_size_w +
- static_cast<T>(ix + .5f) * bin_size_w /
- static_cast<T>(roi_bin_grid_w);
-
- T w1, w2, w3, w4;
- int x_low, x_high, y_low, y_high;
-
- bilinear_interpolate_gradient(height, width, y, x, w1, w2, w3, w4,
- x_low, x_high, y_low, y_high, index);
-
- T g1 = top_diff_this_bin * w1 / count;
- T g2 = top_diff_this_bin * w2 / count;
- T g3 = top_diff_this_bin * w3 / count;
- T g4 = top_diff_this_bin * w4 / count;
-
- if (x_low >= 0 && x_high >= 0 && y_low >= 0 && y_high >= 0) {
- atomicAdd(offset_bottom_diff + y_low * width + x_low,
- static_cast<T>(g1));
- atomicAdd(offset_bottom_diff + y_low * width + x_high,
- static_cast<T>(g2));
- atomicAdd(offset_bottom_diff + y_high * width + x_low,
- static_cast<T>(g3));
- atomicAdd(offset_bottom_diff + y_high * width + x_high,
- static_cast<T>(g4));
- } // if
- } // ix
- } // iy
- } // CUDA_1D_KERNEL_LOOP
-} // BezierAlignBackward
-
-#endif // BEZIER_ALIGN_CUDA_KERNEL_CUH
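
Both removed kernels sample along a cubic Bezier curve per pooled coordinate; the polynomial itself in Python (a reference sketch):

```python
def bezier_curve(p0, p1, p2, p3, u):
    """Cubic Bezier: Bernstein-weighted blend of four control values."""
    return ((1 - u) ** 3 * p0
            + 3 * u * (1 - u) ** 2 * p1
            + 3 * u ** 2 * (1 - u) * p2
            + u ** 3 * p3)
```
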
diff --git a/mmcv/ops/csrc/common/cuda/box_iou_quadri_cuda.cuh b/mmcv/ops/csrc/common/cuda/box_iou_quadri_cuda.cuh
deleted file mode 100644
index cf8ad5e1a324de3a11c8fc8af28a8d559a661ed6..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/box_iou_quadri_cuda.cuh
+++ /dev/null
@@ -1,91 +0,0 @@
-// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
-#ifndef BOX_IOU_QUADRI_CUDA_CUH
-#define BOX_IOU_QUADRI_CUDA_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-#include "box_iou_rotated_utils.hpp"
-
-// 2D block with 32 * 16 = 512 threads per block
-const int BLOCK_DIM_X = 32;
-const int BLOCK_DIM_Y = 16;
-
-inline int divideUP(const int x, const int y) { return (((x) + (y)-1) / (y)); }
-
-template <typename T>
-__global__ void box_iou_quadri_cuda_kernel(
- const int n_boxes1, const int n_boxes2, const T* dev_boxes1,
- const T* dev_boxes2, T* dev_ious, const int mode_flag, const bool aligned) {
- if (aligned) {
- CUDA_1D_KERNEL_LOOP(index, n_boxes1) {
- int b1 = index;
- int b2 = index;
-
- int base1 = b1 * 8;
-
- float block_boxes1[8];
- float block_boxes2[8];
-
- block_boxes1[0] = dev_boxes1[base1 + 0];
- block_boxes1[1] = dev_boxes1[base1 + 1];
- block_boxes1[2] = dev_boxes1[base1 + 2];
- block_boxes1[3] = dev_boxes1[base1 + 3];
- block_boxes1[4] = dev_boxes1[base1 + 4];
- block_boxes1[5] = dev_boxes1[base1 + 5];
- block_boxes1[6] = dev_boxes1[base1 + 6];
- block_boxes1[7] = dev_boxes1[base1 + 7];
-
- int base2 = b2 * 8;
-
- block_boxes2[0] = dev_boxes2[base2 + 0];
- block_boxes2[1] = dev_boxes2[base2 + 1];
- block_boxes2[2] = dev_boxes2[base2 + 2];
- block_boxes2[3] = dev_boxes2[base2 + 3];
- block_boxes2[4] = dev_boxes2[base2 + 4];
- block_boxes2[5] = dev_boxes2[base2 + 5];
- block_boxes2[6] = dev_boxes2[base2 + 6];
- block_boxes2[7] = dev_boxes2[base2 + 7];
-
- dev_ious[index] =
- single_box_iou_quadri<T>(block_boxes1, block_boxes2, mode_flag);
- }
- } else {
- CUDA_1D_KERNEL_LOOP(index, n_boxes1 * n_boxes2) {
- int b1 = index / n_boxes2;
- int b2 = index % n_boxes2;
-
- int base1 = b1 * 8;
-
- float block_boxes1[8];
- float block_boxes2[8];
-
- block_boxes1[0] = dev_boxes1[base1 + 0];
- block_boxes1[1] = dev_boxes1[base1 + 1];
- block_boxes1[2] = dev_boxes1[base1 + 2];
- block_boxes1[3] = dev_boxes1[base1 + 3];
- block_boxes1[4] = dev_boxes1[base1 + 4];
- block_boxes1[5] = dev_boxes1[base1 + 5];
- block_boxes1[6] = dev_boxes1[base1 + 6];
- block_boxes1[7] = dev_boxes1[base1 + 7];
-
- int base2 = b2 * 8;
-
- block_boxes2[0] = dev_boxes2[base2 + 0];
- block_boxes2[1] = dev_boxes2[base2 + 1];
- block_boxes2[2] = dev_boxes2[base2 + 2];
- block_boxes2[3] = dev_boxes2[base2 + 3];
- block_boxes2[4] = dev_boxes2[base2 + 4];
- block_boxes2[5] = dev_boxes2[base2 + 5];
- block_boxes2[6] = dev_boxes2[base2 + 6];
- block_boxes2[7] = dev_boxes2[base2 + 7];
-
- dev_ious[index] =
- single_box_iou_quadri<T>(block_boxes1, block_boxes2, mode_flag);
- }
- }
-}
-
-#endif
diff --git a/mmcv/ops/csrc/common/cuda/carafe_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/carafe_cuda_kernel.cuh
index 20fd617ff7b7a318c5b02c8ab0aec2e5bb03483b..07beeda57f70389d067e16b549b1a6042780a624 100644
--- a/mmcv/ops/csrc/common/cuda/carafe_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/carafe_cuda_kernel.cuh
@@ -8,7 +8,7 @@
#include "pytorch_cuda_helper.hpp"
#endif
-#ifdef MMCV_WITH_HIP
+#ifdef HIP_DIFF
#define WARP_SIZE 64
#else
#define WARP_SIZE 32
@@ -29,22 +29,22 @@ __device__ inline int Loc2Index(const int n, const int c, const int h,
int index = w + (h + (c + n * channel_num) * height) * width;
return index;
}
-#ifndef MMCV_WITH_HIP
+#ifndef HIP_DIFF
/* TODO: move this to a common place */
template <typename scalar_t>
-__device__ inline scalar_t min(scalar_t a, scalar_t b) {
+__device__ inline scalar_t mmcv_min(scalar_t a, scalar_t b) {
return a < b ? a : b;
}
template <typename scalar_t>
-__device__ inline scalar_t max(scalar_t a, scalar_t b) {
+__device__ inline scalar_t mmcv_max(scalar_t a, scalar_t b) {
return a > b ? a : b;
}
#endif
template <typename scalar_t>
__device__ __forceinline__ scalar_t warpReduceSum(scalar_t val) {
for (int offset = WARP_SIZE / 2; offset > 0; offset /= 2)
-#ifdef MMCV_WITH_HIP
+#ifdef HIP_DIFF
val += __shfl_down(val, offset);
#else
val += __shfl_down_sync(FULL_MASK, val, offset);
@@ -55,11 +55,11 @@ __device__ __forceinline__ scalar_t warpReduceSum(scalar_t val) {
template <>
__device__ __forceinline__ phalf warpReduceSum(phalf val) {
for (int offset = WARP_SIZE / 2; offset > 0; offset /= 2)
-#ifdef MMCV_WITH_HIP
- __PHALF(val) += __shfl_down(val, offset);
+#ifdef HIP_DIFF
+ __PHALF(val) += __shfl_down(FULL_MASK, val, offset);
#else
__PHALF(val) +=
- __shfl_down_sync(FULL_MASK, __PHALF(val).operator __half(), offset);
+ __shfl_down_sync(FULL_MASK, static_cast<__half>(__PHALF(val)), offset);
#endif
return val;
}
@@ -316,7 +316,7 @@ __global__ void CARAFEBackward_Mask(const int num_kernels,
output_val += top_diff[top_id] * bottom_data[bottom_id];
}
}
-#ifdef MMCV_WITH_HIP
+#ifdef HIP_DIFF
__syncthreads();
#else
__syncwarp();
diff --git a/mmcv/ops/csrc/common/cuda/chamfer_distance_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/chamfer_distance_cuda_kernel.cuh
deleted file mode 100644
index 89feea4a546a5093967f26393ca6be3b9fe6ae05..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/chamfer_distance_cuda_kernel.cuh
+++ /dev/null
@@ -1,101 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved.
-// Modified from
-// https://github.com/chrdiller/pyTorchChamferDistance/blob/master/chamfer_distance/chamfer_distance.cu
-#ifndef CHAMFER_DISTANCE_CUDA_KERNEL_CUH
-#define CHAMFER_DISTANCE_CUDA_KERNEL_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-
-#define MAX_SHARED_SCALAR_T 6144 // 49152 / 8 = 6144
-
-template <typename scalar_t>
-__global__ void chamfer_distance_forward_cuda_kernel(int b, int n,
- const scalar_t* xyz, int m,
- const scalar_t* xyz2,
- scalar_t* result,
- int* result_i) {
- __shared__ scalar_t buf[MAX_SHARED_SCALAR_T];
- for (int i = blockIdx.x; i < b; i += gridDim.x) {
- for (int k2 = 0; k2 < m; k2 += THREADS_PER_BLOCK) {
- int end_k = min(m, k2 + THREADS_PER_BLOCK) - k2;
- for (int j = threadIdx.x; j < end_k * 2; j += blockDim.x) {
- buf[j] = xyz2[(i * m + k2) * 2 + j];
- }
- __syncthreads();
- for (int j = threadIdx.x; j < n; j += blockDim.x * gridDim.y) {
- scalar_t x1 = xyz[(i * n + j) * 2 + 0];
- scalar_t y1 = xyz[(i * n + j) * 2 + 1];
- int best_i = 0;
- scalar_t best = 1e10;
- int end_ka = end_k & (~2);
- if (end_ka == THREADS_PER_BLOCK) {
- for (int k = 0; k < THREADS_PER_BLOCK; k += 4) {
-#pragma unroll
- for (int j = 0; j < 4; ++j) {
- scalar_t x2 = buf[(k + j) * 2] - x1;
- scalar_t y2 = buf[(k + j) * 2 + 1] - y1;
- scalar_t d = x2 * x2 + y2 * y2;
- if (d < best) {
- best = d;
- best_i = k + k2 + j;
- }
- }
- }
- } else {
- for (int k = 0; k < end_ka; k += 4) {
-#pragma unroll
- for (int j = 0; j < 4; ++j) {
- scalar_t x2 = buf[(k + j) * 2] - x1;
- scalar_t y2 = buf[(k + j) * 2 + 1] - y1;
- scalar_t d = x2 * x2 + y2 * y2;
- if (d < best) {
- best = d;
- best_i = k + k2 + j;
- }
- }
- }
- }
- for (int k = end_ka; k < end_k; k++) {
- scalar_t x2 = buf[k * 2 + 0] - x1;
- scalar_t y2 = buf[k * 2 + 1] - y1;
- scalar_t d = x2 * x2 + y2 * y2;
- if (k == 0 || d < best) {
- best = d;
- best_i = k + k2;
- }
- }
- if (k2 == 0 || result[(i * n + j)] > best) {
- result[(i * n + j)] = best;
- result_i[(i * n + j)] = best_i;
- }
- }
- __syncthreads();
- }
- }
-}
-
-template <typename scalar_t>
-__global__ void chamfer_distance_backward_cuda_kernel(
- int b, int n, const scalar_t* xyz1, int m, const scalar_t* xyz2,
- const scalar_t* grad_dist1, const int* idx1, scalar_t* grad_xyz1,
- scalar_t* grad_xyz2) {
- for (int i = blockIdx.x; i < b; i += gridDim.x) {
- for (int j = threadIdx.x; j < n; j += blockDim.x * gridDim.y) {
- scalar_t x1 = xyz1[(i * n + j) * 2 + 0];
- scalar_t y1 = xyz1[(i * n + j) * 2 + 1];
- int j2 = idx1[i * n + j];
- scalar_t x2 = xyz2[(i * m + j2) * 2 + 0];
- scalar_t y2 = xyz2[(i * m + j2) * 2 + 1];
- scalar_t g = grad_dist1[i * n + j] * 2;
- atomicAdd(&(grad_xyz1[(i * n + j) * 2 + 0]), g * (x1 - x2));
- atomicAdd(&(grad_xyz1[(i * n + j) * 2 + 1]), g * (y1 - y2));
- atomicAdd(&(grad_xyz2[(i * m + j2) * 2 + 0]), -(g * (x1 - x2)));
- atomicAdd(&(grad_xyz2[(i * m + j2) * 2 + 1]), -(g * (y1 - y2)));
- }
- }
-}
-#endif // CHAMFER_DISTANCE_CUDA_KERNEL_CUH
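
The deleted header computed the 2-D Chamfer distance: for each point of one set, the squared distance to its nearest neighbour in the other set, with the shared-memory buffer merely tiling the inner loop. A naive single-batch reference of the forward pass, useful for sanity-checking any replacement (names here are illustrative, not from the file):

```cuda
// O(n * m) reference: result[j] = min_k ||xyz[j] - xyz2[k]||^2 and
// result_i[j] is the index k attaining that minimum.
void chamfer_forward_ref(int n, const float* xyz, int m, const float* xyz2,
                         float* result, int* result_i) {
  for (int j = 0; j < n; ++j) {
    float best = 1e10f;
    int best_i = 0;
    for (int k = 0; k < m; ++k) {
      const float dx = xyz[j * 2 + 0] - xyz2[k * 2 + 0];
      const float dy = xyz[j * 2 + 1] - xyz2[k * 2 + 1];
      const float d = dx * dx + dy * dy;
      if (d < best) {
        best = d;
        best_i = k;
      }
    }
    result[j] = best;
    result_i[j] = best_i;
  }
}
```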
diff --git a/mmcv/ops/csrc/common/cuda/common_cuda_helper.hpp b/mmcv/ops/csrc/common/cuda/common_cuda_helper.hpp
index b12aa9a26a2cc162fd89f68ccc97e17749090a41..dc5df1730ee20f7f97c5cbf14c7f8da849820feb 100644
--- a/mmcv/ops/csrc/common/cuda/common_cuda_helper.hpp
+++ b/mmcv/ops/csrc/common/cuda/common_cuda_helper.hpp
@@ -7,20 +7,12 @@
for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); \
i += blockDim.x * gridDim.x)
-#define CUDA_2D_KERNEL_LOOP(i, n, j, m) \
- for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); \
- i += blockDim.x * gridDim.x) \
- for (size_t j = blockIdx.y * blockDim.y + threadIdx.y; j < (m); \
- j += blockDim.y * gridDim.y)
-
-#define CUDA_2D_KERNEL_BLOCK_LOOP(i, n, j, m) \
- for (size_t i = blockIdx.x; i < (n); i += gridDim.x) \
- for (size_t j = blockIdx.y; j < (m); j += gridDim.y)
-
#define THREADS_PER_BLOCK 512
-inline int GET_BLOCKS(const int N, const int num_threads = THREADS_PER_BLOCK) {
- int optimal_block_num = (N + num_threads - 1) / num_threads;
+#define DIVUP(m, n) ((m) / (n) + ((m) % (n) > 0))
+
+inline int GET_BLOCKS(const int N) {
+ int optimal_block_num = (N + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
int max_block_num = 4096;
return min(optimal_block_num, max_block_num);
}
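
`DIVUP` and the fixed-thread-count `GET_BLOCKS` both compute a ceiling division; `GET_BLOCKS` additionally clamps the grid at 4096 blocks, which is why kernels pair it with the grid-stride `CUDA_1D_KERNEL_LOOP`. A quick check of the arithmetic:

```cuda
#include <cassert>

#define DIVUP(m, n) ((m) / (n) + ((m) % (n) > 0))

int main() {
  assert(DIVUP(10, 4) == 3);   // 10 elements, 4 per block -> 3 blocks
  assert(DIVUP(12, 4) == 3);   // exact multiples are not rounded up
  assert(DIVUP(1, 512) == 1);  // at least one block for any work
  return 0;
}
```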
diff --git a/mmcv/ops/csrc/common/cuda/convex_iou_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/convex_iou_cuda_kernel.cuh
deleted file mode 100644
index 2af96f7963ec347486ced942a5ef7cc4f187db8b..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/convex_iou_cuda_kernel.cuh
+++ /dev/null
@@ -1,831 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved
-#ifndef CONVEX_IOU_CUDA_KERNEL_CUH
-#define CONVEX_IOU_CUDA_KERNEL_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-
-#define MAXN 100
-#define NMAX 512
-__device__ const double EPS = 1E-8;
-
-__device__ inline int sig(double d) { return (d > EPS) - (d < -EPS); }
-
-struct Point {
- double x, y;
- __device__ Point() {}
- __device__ Point(double x, double y) : x(x), y(y) {}
-};
-
-__device__ inline bool point_same(Point& a, Point& b) {
- return sig(a.x - b.x) == 0 && sig(a.y - b.y) == 0;
-}
-
-__device__ inline void swap1(Point* a, Point* b) {
- Point temp;
- temp.x = a->x;
- temp.y = a->y;
-
- a->x = b->x;
- a->y = b->y;
-
- b->x = temp.x;
- b->y = temp.y;
-}
-
-__device__ inline void reverse1(Point* a, const int n) {
- for (int i = 0; i < (n - 1) / 2.0; i++) {
- Point* j = &(a[i]);
- Point* k = &(a[n - 1 - i]);
- swap1(j, k);
- }
-}
-
-__device__ inline double cross(Point o, Point a, Point b) {
- return (a.x - o.x) * (b.y - o.y) - (b.x - o.x) * (a.y - o.y);
-}
-
-__device__ inline double dis(Point a, Point b) {
- return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y);
-}
-__device__ inline double area(Point* ps, int n) {
- ps[n] = ps[0];
- double res = 0;
- for (int i = 0; i < n; i++) {
- res += ps[i].x * ps[i + 1].y - ps[i].y * ps[i + 1].x;
- }
- return res / 2.0;
-}
-__device__ inline double polygon_area_grad(Point* ps, int n,
- int* polygon_to_pred_index,
- int n_pred, double* grad_C) {
- ps[n] = ps[0];
- double partion_grad[4 * 30 + 2];
- double res = 0;
- for (int i = 0; i < n; i++) {
- res += ps[i].x * ps[i + 1].y - ps[i].y * ps[i + 1].x;
- partion_grad[i * 4 + 2] = ps[i + 1].y;
- partion_grad[i * 4 + 3] = -ps[i + 1].x;
- if (i != n - 1) {
- partion_grad[i * 4 + 4] = -ps[i].y;
- partion_grad[i * 4 + 5] = ps[i].x;
- } else {
- partion_grad[0] = -ps[i].y;
- partion_grad[1] = ps[i].x;
- }
- }
- for (int i = 0; i < n; i++) {
- for (int j = 0; j < n_pred; j++) {
- if (i == polygon_to_pred_index[j]) {
- grad_C[2 * polygon_to_pred_index[j + n_pred]] =
- (partion_grad[i * 4] + partion_grad[i * 4 + 2]) / 2;
- break;
- }
- }
- for (int j = 0; j < n_pred; j++) {
- if (i == polygon_to_pred_index[j]) {
- grad_C[2 * polygon_to_pred_index[j + n_pred] + 1] =
- (partion_grad[i * 4 + 1] + partion_grad[i * 4 + 1 + 2]) / 2;
- break;
- }
- }
- }
-
- return res / 2.0;
-}
-
-__device__ inline int lineCross(Point a, Point b, Point c, Point d, Point& p,
- double* cut_grad, int m, int n, int i) {
- double s1, s2;
- double s2_s1_2;
- double ds1_dxc, ds1_dyc, ds2_dxd, ds2_dyd;
- double dxp_dxc, dxp_dyc, dxp_dxd, dxp_dyd, dyp_dxc, dyp_dyc, dyp_dxd, dyp_dyd;
- s1 = cross(a, b, c);
- s2 = cross(a, b, d);
-
- ds1_dxc = -(b.y - a.y);
- ds1_dyc = b.x - a.x;
- ds2_dxd = ds1_dxc;
- ds2_dyd = ds1_dyc;
- s2_s1_2 = (s2 - s1) * (s2 - s1);
-
- if (sig(s1) == 0 && sig(s2) == 0) return 2;
- if (sig(s2 - s1) == 0) return 0;
-
- dxp_dxc =
- ((s2 - d.x * ds1_dxc) * (s2 - s1) - (c.x * s2 - d.x * s1) * (-ds1_dxc)) /
- (s2_s1_2);
- dxp_dyc =
- ((0 - d.x * ds1_dyc) * (s2 - s1) - (c.x * s2 - d.x * s1) * (-ds1_dyc)) /
- (s2_s1_2);
- dxp_dxd =
- ((c.x * ds2_dxd - s1) * (s2 - s1) - (c.x * s2 - d.x * s1) * (ds2_dxd)) /
- (s2_s1_2);
- dxp_dyd =
- ((c.x * ds2_dyd - 0) * (s2 - s1) - (c.x * s2 - d.x * s1) * (ds2_dyd)) /
- (s2_s1_2);
-
- dyp_dxc =
- ((0 - d.y * ds1_dxc) * (s2 - s1) - (c.y * s2 - d.y * s1) * (-ds1_dxc)) /
- (s2_s1_2);
- dyp_dyc =
- ((s2 - d.y * ds1_dyc) * (s2 - s1) - (c.y * s2 - d.y * s1) * (-ds1_dyc)) /
- (s2_s1_2);
- dyp_dxd =
- ((c.y * ds2_dxd - 0) * (s2 - s1) - (c.y * s2 - d.y * s1) * (ds2_dxd)) /
- (s2_s1_2);
- dyp_dyd =
- ((c.y * ds2_dyd - s1) * (s2 - s1) - (c.y * s2 - d.y * s1) * (ds2_dyd)) /
- (s2_s1_2);
-
- p.x = (c.x * s2 - d.x * s1) / (s2 - s1);
- p.y = (c.y * s2 - d.y * s1) / (s2 - s1);
- if (i == n - 1) {
- cut_grad[4 * n * m + 4 * i] = dxp_dxc; // + dyp_dxc;
- cut_grad[4 * n * m + 4 * i + 1] = dyp_dxc;
- cut_grad[4 * n * m + 4 * i + 2] = dxp_dyc; // + dyp_dyc;
- cut_grad[4 * n * m + 4 * i + 3] = dyp_dyc;
- cut_grad[4 * n * m + 0] = dxp_dxd; // + dyp_dxd;
- cut_grad[4 * n * m + 1] = dyp_dxd;
- cut_grad[4 * n * m + 2] = dxp_dyd; // + dyp_dyd;
- cut_grad[4 * n * m + 3] = dyp_dyd;
- } else {
- cut_grad[4 * n * m + 4 * i] = dxp_dxc; // + dyp_dxc;
- cut_grad[4 * n * m + 4 * i + 1] = dyp_dxc;
- cut_grad[4 * n * m + 4 * i + 2] = dxp_dyc; // + dyp_dyc;
- cut_grad[4 * n * m + 4 * i + 3] = dyp_dyc;
- cut_grad[4 * n * m + 4 * (i + 1)] = dxp_dxd; // + dyp_dxd;
- cut_grad[4 * n * m + 4 * (i + 1) + 1] = dyp_dxd;
- cut_grad[4 * n * m + 4 * (i + 1) + 2] = dxp_dyd; // + dyp_dyd;
- cut_grad[4 * n * m + 4 * (i + 1) + 3] = dyp_dyd;
- }
-
- return 1;
-}
-__device__ inline void polygon_cut(Point* p, int& n, Point a, Point b,
- double* cut_grad) {
- Point pp[MAXN];
- double ccur_grad[MAXN] = {};
- int m = 0;
- p[n] = p[0];
- int k = n;
- for (int i = 0; i < n; i++) {
- if (sig(cross(a, b, p[i])) > 0) {
- pp[m] = p[i];
- ccur_grad[4 * n * m + 4 * i] = 1.0;
- ccur_grad[4 * n * m + 4 * i + 3] = 1.0;
- m++;
- }
- if (sig(cross(a, b, p[i])) != sig(cross(a, b, p[i + 1]))) {
- lineCross(a, b, p[i], p[i + 1], pp[m], ccur_grad, m, n, i);
- m++;
- }
- }
-
- n = 0;
- for (int i = 0; i < m; i++) {
- if (!i || !(point_same(pp[i], pp[i - 1]))) {
- p[n] = pp[i];
- for (int j = 0; j < 4 * k; j++) {
- cut_grad[4 * k * n + j] = ccur_grad[4 * k * i + j];
- }
- n++;
- }
- }
-
- while (n > 1 && point_same(p[n - 1], p[0])) n--;
-}
-
-__device__ inline double intersectArea(Point a, Point b, Point c, Point d,
- double* grad_AB, int order,
- int convex_n) {
- Point o(0, 0);
- int res_flag = 0;
- int s1 = sig(cross(o, a, b));
- int s2 = sig(cross(o, c, d));
- if (s1 == 0 || s2 == 0) return 0.0;
- if (s1 == -1) {
- Point* i = &a;
- Point* j = &b;
- swap1(i, j);
- res_flag = 1;
- }
- if (s2 == -1) {
- Point* i = &c;
- Point* j = &d;
- swap1(i, j);
- }
- Point p[10] = {o, a, b};
- int n = 3, n0 = 3, n1, n2, n3;
- double cut_grad1[MAXN] = {};
- double cut_grad2[MAXN] = {};
- double cut_grad3[MAXN] = {};
- double p1_p_grad[10][10] = {};
- double p2_p1_grad[10][10] = {};
- double p3_p2_grad[10][10] = {};
-
- double p3_p1_grad[10][10] = {};
- double p3_p_grad[10][10] = {};
-
- // 1
- polygon_cut(p, n, o, c, cut_grad1);
- n1 = n;
- for (int i = 0; i < n; i++) {
- for (int j = 0; j < 4 * n0; j++) {
- if (!(j % 2)) {
- p1_p_grad[2 * i][j / 2] = cut_grad1[4 * n0 * i + j];
- } else {
- p1_p_grad[2 * i + 1][j / 2] = cut_grad1[4 * n0 * i + j];
- }
- }
- }
-
- // 2
- polygon_cut(p, n, c, d, cut_grad2);
- n2 = n;
- for (int i = 0; i < n; i++) {
- for (int j = 0; j < 4 * n1; j++) {
- if (!(j % 2)) {
- p2_p1_grad[2 * i][j / 2] = cut_grad2[4 * n1 * i + j];
- } else {
- p2_p1_grad[2 * i + 1][j / 2] = cut_grad2[4 * n1 * i + j];
- }
- }
- }
- // 3
- polygon_cut(p, n, d, o, cut_grad3);
- n3 = n;
- for (int i = 0; i < n; i++) {
- for (int j = 0; j < 4 * n2; j++) {
- if (!(j % 2)) {
- p3_p2_grad[2 * i][j / 2] = cut_grad3[4 * n2 * i + j];
- } else {
- p3_p2_grad[2 * i + 1][j / 2] = cut_grad3[4 * n2 * i + j];
- }
- }
- }
-
- // mul
- // p3_p2(n3 * n2) * p2_p1(n2 * n1) = p3_p1 (n3 * n1)
- for (int i = 0; i < 2 * n3; i++) {
- for (int j = 0; j < 2 * n1; j++) {
- double sum = 0.0;
- for (int m = 0; m < 2 * n2; m++) {
- sum = sum + p3_p2_grad[i][m] * p2_p1_grad[m][j];
- }
- p3_p1_grad[i][j] = sum;
- }
- }
-
- // p3_p1 (n3 * n1) * p1_p (n1 * n0) = p3_p (n3 * n0)
- for (int i = 0; i < 2 * n3; i++) {
- for (int j = 0; j < 2 * n0; j++) {
- double sum = 0.0;
- for (int m = 0; m < 2 * n1; m++) {
- sum = sum + p3_p1_grad[i][m] * p1_p_grad[m][j];
- }
- p3_p_grad[i][j] = sum;
- }
- }
-
- // calculate S_grad
- int polygon_index_box_index[20];
- double grad_polygon[20];
- double S_grad[6];
-
- for (int i = 0; i < n3; i++) {
- polygon_index_box_index[i] = i;
- polygon_index_box_index[i + n3] = i;
- }
-
- double res =
- polygon_area_grad(p, n3, polygon_index_box_index, n3, grad_polygon);
-
- if (s1 * s2 == -1) {
- for (int j = 0; j < 2 * 3; j++) {
- double sum = 0.0;
- for (int m = 0; m < 2 * n3; m++) {
- sum = sum - grad_polygon[m] * p3_p_grad[m][j];
- }
- S_grad[j] = sum;
- }
-
- if (order != convex_n - 1) {
- if (res_flag) {
- grad_AB[2 * order] += S_grad[4];
- grad_AB[2 * order + 1] += S_grad[5];
- grad_AB[2 * order + 2] += S_grad[2];
- grad_AB[2 * order + 3] += S_grad[3];
-
- } else {
- grad_AB[2 * order] += S_grad[2];
- grad_AB[2 * order + 1] += S_grad[3];
- grad_AB[2 * order + 2] += S_grad[4];
- grad_AB[2 * order + 3] += S_grad[5];
- }
- } else {
- if (res_flag) {
- grad_AB[2 * order] += S_grad[4];
- grad_AB[2 * order + 1] += S_grad[5];
- grad_AB[0] += S_grad[2];
- grad_AB[1] += S_grad[3];
-
- } else {
- grad_AB[2 * order] += S_grad[2];
- grad_AB[2 * order + 1] += S_grad[3];
- grad_AB[0] += S_grad[4];
- grad_AB[1] += S_grad[5];
- }
- }
- res = -res;
- } else {
- for (int j = 0; j < 2 * 3; j++) {
- double sum = 0.0;
- for (int m = 0; m < 2 * n3; m++) {
- sum = sum + grad_polygon[m] * p3_p_grad[m][j];
- }
- S_grad[j] = sum;
- }
-
- if (order != convex_n - 1) {
- if (res_flag) {
- grad_AB[2 * order] += S_grad[4];
- grad_AB[2 * order + 1] += S_grad[5];
- grad_AB[2 * order + 2] += S_grad[2];
- grad_AB[2 * order + 3] += S_grad[3];
- } else {
- grad_AB[2 * order] += S_grad[2];
- grad_AB[2 * order + 1] += S_grad[3];
- grad_AB[2 * order + 2] += S_grad[4];
- grad_AB[2 * order + 3] += S_grad[5];
- }
- } else {
- if (res_flag) {
- grad_AB[2 * order] += S_grad[4];
- grad_AB[2 * order + 1] += S_grad[5];
- grad_AB[0] += S_grad[2];
- grad_AB[1] += S_grad[3];
- } else {
- grad_AB[2 * order] += S_grad[2];
- grad_AB[2 * order + 1] += S_grad[3];
- grad_AB[0] += S_grad[4];
- grad_AB[1] += S_grad[5];
- }
- }
- }
- return res;
-}
-
-__device__ inline double intersectAreaO(Point* ps1, int n1, Point* ps2, int n2,
- double* grad_AB) {
- if (area(ps1, n1) < 0) reverse1(ps1, n1);
- if (area(ps2, n2) < 0) reverse1(ps2, n2);
- ps1[n1] = ps1[0];
- ps2[n2] = ps2[0];
- double res = 0;
- for (int i = 0; i < n1; i++) {
- for (int j = 0; j < n2; j++) {
- res +=
- intersectArea(ps1[i], ps1[i + 1], ps2[j], ps2[j + 1], grad_AB, i, n1);
- }
- }
- return res;
-}
-
-__device__ inline void Jarvis(Point* in_poly, int& n_poly) {
- Point p_max, p_k;
- int max_index, k_index;
- int Stack[NMAX] = {}, top1, top2;
- double sign;
- Point right_point[10], left_point[10];
-
- for (int i = 0; i < n_poly; i++) {
- if (in_poly[i].y < in_poly[0].y ||
- in_poly[i].y == in_poly[0].y && in_poly[i].x < in_poly[0].x) {
- Point* j = &(in_poly[0]);
- Point* k = &(in_poly[i]);
- swap1(j, k);
- }
- if (i == 0) {
- p_max = in_poly[0];
- max_index = 0;
- }
- if (in_poly[i].y > p_max.y ||
- in_poly[i].y == p_max.y && in_poly[i].x > p_max.x) {
- p_max = in_poly[i];
- max_index = i;
- }
- }
-
- if (max_index == 0) {
- max_index = 1;
- p_max = in_poly[max_index];
- }
-
- k_index = 0, Stack[0] = 0, top1 = 0;
- while (k_index != max_index) {
- p_k = p_max;
- k_index = max_index;
- for (int i = 1; i < n_poly; i++) {
- sign = cross(in_poly[Stack[top1]], in_poly[i], p_k);
- if ((sign > 0) || ((sign == 0) && (dis(in_poly[Stack[top1]], in_poly[i]) >
- dis(in_poly[Stack[top1]], p_k)))) {
- p_k = in_poly[i];
- k_index = i;
- }
- }
- top1++;
- Stack[top1] = k_index;
- }
- for (int i = 0; i <= top1; i++) right_point[i] = in_poly[Stack[i]];
-
- k_index = 0, Stack[0] = 0, top2 = 0;
-
- while (k_index != max_index) {
- p_k = p_max;
- k_index = max_index;
- for (int i = 1; i < n_poly; i++) {
- sign = cross(in_poly[Stack[top2]], in_poly[i], p_k);
- if ((sign < 0) || (sign == 0) && (dis(in_poly[Stack[top2]], in_poly[i]) >
- dis(in_poly[Stack[top2]], p_k))) {
- p_k = in_poly[i];
- k_index = i;
- }
- }
- top2++;
- Stack[top2] = k_index;
- }
- for (int i = top2 - 1; i >= 0; i--) left_point[i] = in_poly[Stack[i]];
-
- for (int i = 0; i < top1 + top2; i++) {
- if (i <= top1) {
- in_poly[i] = right_point[i];
- } else {
- in_poly[i] = left_point[top2 - (i - top1)];
- }
- }
- n_poly = top1 + top2;
-}
-
-__device__ inline double intersectAreaPoly(Point* ps1, int n1, Point* ps2,
- int n2, double* grad_C) {
- Point polygon[MAXN];
- int n = n1 + n2, n_poly = 0;
- for (int i = 0; i < n1; i++) {
- for (int j = 0; j < n - n1; j++) {
- if (point_same(ps1[i], ps2[j])) {
- for (int k = j; k < n - n1 - 1; k++) {
- ps2[k] = ps2[k + 1];
- }
- n2--;
- break;
- }
- }
- }
- n_poly = n1 + n2;
- for (int i = 0; i < n_poly; i++) {
- if (i < n1) {
- polygon[i] = ps1[i];
- } else {
- polygon[i] = ps2[i - n1];
- }
- }
-
- Jarvis(polygon, n_poly);
-
- int polygon_to_pred_index[18] = {-1, -1, -1, -1, -1, -1, -1, -1, -1,
- -1, -1, -1, -1, -1, -1, -1, -1, -1};
- int n_pred = 0;
- for (int i = 0; i < n_poly; i++) {
- for (int j = 0; j < n1; j++) {
- if (polygon[i].x == ps1[j].x && polygon[i].y == ps1[j].y) {
- polygon_to_pred_index[n_pred] = i;
- polygon_to_pred_index[n_pred + n1] = j;
- n_pred += 1;
- break;
- }
- }
- }
- if (n_pred == 0) {
- double polygon_area = fabs(area(polygon, n_poly));
- for (int i = 0; i < 18; i++) {
- grad_C[i] = 0.0;
- }
- return polygon_area;
- } else {
- double polygon_area =
- polygon_area_grad(polygon, n_poly, polygon_to_pred_index, n1, grad_C);
- if (polygon_area < 0) {
- for (int i = 0; i < 18; i++) {
- grad_C[i] = -grad_C[i];
- }
- }
- return fabs(polygon_area);
- }
-}
-
-// find the convex hull and record the mapping from hull vertices to inputs
-__device__ inline void Jarvis_and_index(Point* in_poly, int& n_poly,
- int* points_to_convex_ind) {
- int n_input = n_poly;
- Point input_poly[20];
- for (int i = 0; i < n_input; i++) {
- input_poly[i].x = in_poly[i].x;
- input_poly[i].y = in_poly[i].y;
- }
- Point p_max, p_k;
- int max_index, k_index;
- int Stack[20], top1, top2;
- double sign;
- Point right_point[10], left_point[10];
-
- for (int i = 0; i < n_poly; i++) {
- if (in_poly[i].y < in_poly[0].y ||
- in_poly[i].y == in_poly[0].y && in_poly[i].x < in_poly[0].x) {
- Point* j = &(in_poly[0]);
- Point* k = &(in_poly[i]);
- swap1(j, k);
- }
- if (i == 0) {
- p_max = in_poly[0];
- max_index = 0;
- }
- if (in_poly[i].y > p_max.y ||
- in_poly[i].y == p_max.y && in_poly[i].x > p_max.x) {
- p_max = in_poly[i];
- max_index = i;
- }
- }
- if (max_index == 0) {
- max_index = 1;
- p_max = in_poly[max_index];
- }
-
- k_index = 0, Stack[0] = 0, top1 = 0;
- while (k_index != max_index) {
- p_k = p_max;
- k_index = max_index;
- for (int i = 1; i < n_poly; i++) {
- sign = cross(in_poly[Stack[top1]], in_poly[i], p_k);
- if ((sign > 0) || ((sign == 0) && (dis(in_poly[Stack[top1]], in_poly[i]) >
- dis(in_poly[Stack[top1]], p_k)))) {
- p_k = in_poly[i];
- k_index = i;
- }
- }
- top1++;
- Stack[top1] = k_index;
- }
- for (int i = 0; i <= top1; i++) {
- right_point[i] = in_poly[Stack[i]];
- }
-
- k_index = 0, Stack[0] = 0, top2 = 0;
-
- while (k_index != max_index) {
- p_k = p_max;
- k_index = max_index;
- for (int i = 1; i < n_poly; i++) {
- sign = cross(in_poly[Stack[top2]], in_poly[i], p_k);
- if ((sign < 0) || (sign == 0) && (dis(in_poly[Stack[top2]], in_poly[i]) >
- dis(in_poly[Stack[top2]], p_k))) {
- p_k = in_poly[i];
- k_index = i;
- }
- }
- top2++;
- Stack[top2] = k_index;
- }
-
- for (int i = top2 - 1; i >= 0; i--) {
- left_point[i] = in_poly[Stack[i]];
- }
-
- for (int i = 0; i < top1 + top2; i++) {
- if (i <= top1) {
- in_poly[i] = right_point[i];
- } else {
- in_poly[i] = left_point[top2 - (i - top1)];
- }
- }
- n_poly = top1 + top2;
- for (int i = 0; i < n_poly; i++) {
- for (int j = 0; j < n_input; j++) {
- if (point_same(in_poly[i], input_poly[j])) {
- points_to_convex_ind[i] = j;
- break;
- }
- }
- }
-}
-
-template <typename T>
-__device__ inline float devrIoU(T const* const p, T const* const q,
- T* point_grad, const int idx) {
- Point ps1[MAXN], ps2[MAXN];
-
- Point convex[MAXN];
- for (int i = 0; i < 9; i++) {
- convex[i].x = (double)p[i * 2];
- convex[i].y = (double)p[i * 2 + 1];
- }
- int n_convex = 9;
- int points_to_convex_ind[9] = {-1, -1, -1, -1, -1, -1, -1, -1, -1};
- Jarvis_and_index(convex, n_convex, points_to_convex_ind);
-
- int n1 = n_convex;
- int n2 = 4;
-
- for (int i = 0; i < n1; i++) {
- ps1[i].x = (double)convex[i].x;
- ps1[i].y = (double)convex[i].y;
- }
-
- for (int i = 0; i < n2; i++) {
- ps2[i].x = (double)q[i * 2];
- ps2[i].y = (double)q[i * 2 + 1];
- }
-
- int polygon_index_box_index[18];
- for (int i = 0; i < n1; i++) {
- polygon_index_box_index[i] = i;
- polygon_index_box_index[i + n1] = i;
- }
-
- double grad_A[18] = {};
- double grad_AB[18] = {};
- double grad_C[18] = {};
-
- double inter_area = intersectAreaO(ps1, n1, ps2, n2, grad_AB);
- double S_pred =
- polygon_area_grad(ps1, n1, polygon_index_box_index, n1, grad_A);
- if (S_pred < 0) {
- for (int i = 0; i < n_convex * 2; i++) {
- grad_A[i] = -grad_A[i];
- }
- }
- double union_area = fabs(S_pred) + fabs(area(ps2, n2)) - inter_area;
-
- double iou = inter_area / union_area;
- double polygon_area = intersectAreaPoly(ps1, n1, ps2, n2, grad_C);
-
- // printf("%d:live\n", idx);
- double rot_giou = iou - (polygon_area - union_area) / polygon_area;
-
- float grad_point_temp[18] = {};
-
- for (int i = 0; i < n_convex; i++) {
- int grad_point = points_to_convex_ind[i];
- grad_point_temp[2 * grad_point] =
- (float)((union_area + inter_area) / (union_area * union_area) *
- grad_AB[2 * i] -
- iou / union_area * grad_A[2 * i] -
- 1 / polygon_area * (grad_AB[2 * i] - grad_A[2 * i]) -
- (union_area) / polygon_area / polygon_area * grad_C[2 * i]);
- grad_point_temp[2 * grad_point + 1] =
- (float)((union_area + inter_area) / (union_area * union_area) *
- grad_AB[2 * i + 1] -
- iou / union_area * grad_A[2 * i + 1] -
- 1 / polygon_area * (grad_AB[2 * i + 1] - grad_A[2 * i + 1]) -
- (union_area) / polygon_area / polygon_area * grad_C[2 * i + 1]);
- }
-
- for (int i = 0; i < 9; i++) {
- point_grad[2 * i] = grad_point_temp[2 * i];
- point_grad[2 * i + 1] = grad_point_temp[2 * i + 1];
- }
- return (float)rot_giou;
-}
-
-template <typename T>
-__global__ void convex_giou_cuda_kernel(const int ex_n_boxes,
- const int gt_n_boxes, const T* ex_boxes,
- const T* gt_boxes, T* point_grad) {
- CUDA_1D_KERNEL_LOOP(index, ex_n_boxes) {
- const T* cur_box = ex_boxes + index * 18;
- const T* cur_gt_box = gt_boxes + index * 8;
- T* cur_grad = point_grad + index * 19;
- T giou = devrIoU(cur_box, cur_gt_box, cur_grad, threadIdx.x);
- cur_grad[18] = giou;
- }
-}
-
-__device__ inline int lineCross(Point a, Point b, Point c, Point d, Point& p) {
- double s1, s2;
- s1 = cross(a, b, c);
- s2 = cross(a, b, d);
- if (sig(s1) == 0 && sig(s2) == 0) return 2;
- if (sig(s2 - s1) == 0) return 0;
- p.x = (c.x * s2 - d.x * s1) / (s2 - s1);
- p.y = (c.y * s2 - d.y * s1) / (s2 - s1);
- return 1;
-}
-
-__device__ inline void polygon_cut(Point* p, int& n, Point a, Point b) {
- Point pp[MAXN];
- int m = 0;
- p[n] = p[0];
- for (int i = 0; i < n; i++) {
- if (sig(cross(a, b, p[i])) > 0) {
- pp[m] = p[i];
- m++;
- }
- if (sig(cross(a, b, p[i])) != sig(cross(a, b, p[i + 1]))) {
- lineCross(a, b, p[i], p[i + 1], pp[m]);
- m++;
- }
- }
- n = 0;
- for (int i = 0; i < m; i++) {
- if (!i || !(point_same(pp[i], pp[i - 1]))) {
- p[n] = pp[i];
- n++;
- }
- }
-
- while (n > 1 && point_same(p[n - 1], p[0])) n--;
-}
-
-__device__ inline double intersectArea(Point a, Point b, Point c, Point d) {
- Point o(0, 0);
- int s1 = sig(cross(o, a, b));
- int s2 = sig(cross(o, c, d));
- if (s1 == 0 || s2 == 0) return 0.0;
- if (s1 == -1) {
- Point* i = &a;
- Point* j = &b;
- swap1(i, j);
- }
- if (s2 == -1) {
- Point* i = &c;
- Point* j = &d;
- swap1(i, j);
- }
- Point p[10] = {o, a, b};
- int n = 3;
-
- polygon_cut(p, n, o, c);
- polygon_cut(p, n, c, d);
- polygon_cut(p, n, d, o);
- double res = area(p, n);
- if (s1 * s2 == -1) res = -res;
- return res;
-}
-__device__ inline double intersectAreaO(Point* ps1, int n1, Point* ps2,
- int n2) {
- if (area(ps1, n1) < 0) reverse1(ps1, n1);
- if (area(ps2, n2) < 0) reverse1(ps2, n2);
- ps1[n1] = ps1[0];
- ps2[n2] = ps2[0];
- double res = 0;
- for (int i = 0; i < n1; i++) {
- for (int j = 0; j < n2; j++) {
- res += intersectArea(ps1[i], ps1[i + 1], ps2[j], ps2[j + 1]);
- }
- }
- return res;
-}
-
-template <typename T>
-__device__ inline float devrIoU(T const* const p, T const* const q) {
- Point ps1[MAXN], ps2[MAXN];
- Point convex[MAXN];
- for (int i = 0; i < 9; i++) {
- convex[i].x = (double)p[i * 2];
- convex[i].y = (double)p[i * 2 + 1];
- }
- int n_convex = 9;
- int points_to_convex_ind[9] = {-1, -1, -1, -1, -1, -1, -1, -1, -1};
- Jarvis_and_index(convex, n_convex, points_to_convex_ind);
- int n1 = n_convex;
- for (int i = 0; i < n1; i++) {
- ps1[i].x = (double)convex[i].x;
- ps1[i].y = (double)convex[i].y;
- }
- int n2 = 4;
- for (int i = 0; i < n2; i++) {
- ps2[i].x = (double)q[i * 2];
- ps2[i].y = (double)q[i * 2 + 1];
- }
- double inter_area = intersectAreaO(ps1, n1, ps2, n2);
- double S_pred = area(ps1, n1);
- double union_area = fabs(S_pred) + fabs(area(ps2, n2)) - inter_area;
- double iou = inter_area / union_area;
- return (float)iou;
-}
-
-template <typename T>
-__global__ void convex_iou_cuda_kernel(const int ex_n_boxes,
- const int gt_n_boxes, const T* ex_boxes,
- const T* gt_boxes, T* iou) {
- CUDA_1D_KERNEL_LOOP(index, ex_n_boxes) {
- const T* cur_box = ex_boxes + index * 18;
- for (int i = 0; i < gt_n_boxes; i++) {
- iou[index * gt_n_boxes + i] = devrIoU(cur_box, gt_boxes + i * 8);
- }
- }
-}
-#endif // CONVEX_IOU_CUDA_KERNEL_CUH
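
Several helpers in the deleted header (`area`, `polygon_area_grad`) rest on the shoelace formula. The core computation in isolation, as a sketch (the original duplicates `ps[0]` at `ps[n]` instead of wrapping with a modulo):

```cuda
// Signed polygon area by the shoelace formula: positive for
// counter-clockwise vertex order, negative for clockwise.
__host__ __device__ inline double shoelace_area(const double* xs,
                                                const double* ys, int n) {
  double res = 0.0;
  for (int i = 0; i < n; ++i) {
    const int j = (i + 1) % n;  // wrap around to the first vertex
    res += xs[i] * ys[j] - ys[i] * xs[j];
  }
  return res / 2.0;
}
```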
diff --git a/mmcv/ops/csrc/common/cuda/correlation_cuda.cuh b/mmcv/ops/csrc/common/cuda/correlation_cuda.cuh
index f910561ec309cd50fd6d4da131ab36cdf3ca963a..75ea4add72f597c88c8cdf511a7d2fd04727735b 100644
--- a/mmcv/ops/csrc/common/cuda/correlation_cuda.cuh
+++ b/mmcv/ops/csrc/common/cuda/correlation_cuda.cuh
@@ -29,25 +29,21 @@ using namespace torch;
#define TensorAcc5R PackedTensorAccessor32<scalar_t, 5, RestrictPtrTraits, int32_t>
#define WITHIN_BOUNDS(x, y, H, W) (x >= 0 && x < H && y >= 0 && y < W)
-#define WARP_SIZE 32
-#define FULL_MASK 0xffffffff
+#define THREADS_FORWARD 32
+#define THREADS_BACKWARD 16
template <typename scalar_t>
__global__ void correlation_forward_cuda_kernel(
const TensorAcc4R rInput1, const TensorAcc4R rInput2, TensorAcc5R output,
int kH, int kW, int patchH, int patchW, int padH, int padW, int dilationH,
- int dilationW, int dilation_patchH, int dilation_patchW, int dH, int dW,
- int oH, int oW) {
+ int dilationW, int dilation_patchH, int dilation_patchW, int dH, int dW) {
const int iH = rInput1.size(1);
const int iW = rInput1.size(2);
const int C = rInput1.size(3);
const int n = blockIdx.x;
- const int h = blockIdx.y * blockDim.y + threadIdx.y;
- const int w = blockIdx.z * blockDim.z + threadIdx.z;
-
- if (h >= oH || w >= oW) return;
-
+ const int h = blockIdx.y;
+ const int w = blockIdx.z;
const int thread = threadIdx.x;
const int start_i = -padH + h * dH;
@@ -56,37 +52,40 @@ __global__ void correlation_forward_cuda_kernel(
const int patchRadH = dilation_patchH * (patchH - 1) / 2;
const int patchRadW = dilation_patchW * (patchW - 1) / 2;
+ __shared__ scalar_t prod_sum[THREADS_FORWARD];
+
for (int ph = 0; ph < patchH; ++ph) {
int ph_dilated = ph * dilation_patchH - patchRadH;
for (int pw = 0; pw < patchW; ++pw) {
int pw_dilated = pw * dilation_patchW - patchRadW;
- scalar_t prod_sum = 0.0f;
+ prod_sum[thread] = 0;
for (int i = 0; i < kH; ++i) {
int i1 = start_i + i * dilationH;
int i2 = i1 + ph_dilated;
- if (WITHIN_BOUNDS(i1, i2, iH, iH)) {
- for (int j = 0; j < kW; ++j) {
- int j1 = start_j + j * dilationW;
- int j2 = j1 + pw_dilated;
- if (WITHIN_BOUNDS(j1, j2, iW, iW)) {
- for (int c = thread; c < C; c += WARP_SIZE) {
- scalar_t v1 = rInput1[n][i1][j1][c];
- scalar_t v2 = rInput2[n][i2][j2][c];
- prod_sum += v1 * v2;
- }
+ if
+ WITHIN_BOUNDS(i1, i2, iH, iH) {
+ for (int j = 0; j < kW; ++j) {
+ int j1 = start_j + j * dilationW;
+ int j2 = j1 + pw_dilated;
+ if
+ WITHIN_BOUNDS(j1, j2, iW, iW) {
+ for (int c = thread; c < C; c += THREADS_FORWARD) {
+ scalar_t v1 = rInput1[n][i1][j1][c];
+ scalar_t v2 = rInput2[n][i2][j2][c];
+ prod_sum[thread] += v1 * v2;
+ }
+ }
}
}
- }
}
// accumulate
- for (int offset = 16; offset > 0; offset /= 2)
-#ifdef MMCV_WITH_HIP
- prod_sum += __shfl_down(float(prod_sum), offset);
-#else
- prod_sum += __shfl_down_sync(FULL_MASK, float(prod_sum), offset);
-#endif
+ __syncthreads();
if (thread == 0) {
- output[n][ph][pw][h][w] = prod_sum;
+ scalar_t reduce_sum = 0;
+ for (int index = 0; index < THREADS_FORWARD; ++index) {
+ reduce_sum += prod_sum[index];
+ }
+ output[n][ph][pw][h][w] = reduce_sum;
}
}
}
@@ -98,10 +97,9 @@ __global__ void correlation_backward_cuda_kernel_input1(
TensorAcc4R grad_input1, const int kH, const int kW, const int patchH,
const int patchW, const int padH, const int padW, const int dilationH,
const int dilationW, const int dilation_patchH, const int dilation_patchW,
- const int dH, const int dW) {
- const int iH = input2.size(1);
- const int iW = input2.size(2);
- const int C = input2.size(3);
+ const int dH, const int dW, const int batch) {
+ const int iH = input2.size(2);
+ const int iW = input2.size(3);
const int H = grad_output.size(3);
const int W = grad_output.size(4);
@@ -109,53 +107,54 @@ __global__ void correlation_backward_cuda_kernel_input1(
const int patchRadH = (patchH - 1) / 2;
const int patchRadW = (patchW - 1) / 2;
- const int n = blockIdx.x;
+ const int n = batch;
+ const int c = blockIdx.x;
const int h = blockIdx.y;
const int w = blockIdx.z;
+ const int ph_off = threadIdx.x;
+ const int pw_off = threadIdx.y;
const int h_2 = h + padH;
const int w_2 = w + padW;
const int min_h = h_2 - kH * dilationH;
const int min_w = w_2 - kW * dilationW;
- extern __shared__ __align__(sizeof(4)) unsigned char grad_cache_char[];
-  scalar_t *grad_cache = reinterpret_cast<scalar_t *>(grad_cache_char);
- for (int i = threadIdx.x; i < patchH * patchW; i += blockDim.x) {
- const int ph = i / patchW;
- const int pw = i % patchW;
+ __shared__ scalar_t prod_sum[THREADS_BACKWARD][THREADS_BACKWARD];
+ prod_sum[ph_off][pw_off] = 0;
+
+ for (int ph = ph_off; ph < patchH; ph += THREADS_BACKWARD) {
int i1 = h + dilation_patchH * (ph - patchRadH);
- int j1 = w + dilation_patchW * (pw - patchRadW);
-
- if (WITHIN_BOUNDS(i1, j1, iH, iW)) {
- scalar_t grad_val = 0.0f;
- for (int h_3 = h_2; h_3 > min_h; h_3 -= dilationH) {
- int i2 = (h_3) / dH;
- if (i2 * dH != h_3) continue;
- for (int w_3 = w_2; w_3 > min_w; w_3 -= dilationW) {
- int j2 = (w_3) / dW;
- if (j2 * dW != w_3) continue;
- if (WITHIN_BOUNDS(i2, j2, H, W)) {
- grad_val += grad_output[n][ph][pw][i2][j2];
+ for (int pw = pw_off; pw < patchW; pw += THREADS_BACKWARD) {
+ int j1 = w + dilation_patchW * (pw - patchRadW);
+ if (WITHIN_BOUNDS(i1, j1, iH, iW)) {
+ scalar_t val = input2[n][c][i1][j1];
+ for (int h_3 = h_2; h_3 > min_h; h_3 -= dilationH) {
+ int i2 = (h_3) / dH;
+ if (i2 * dH != h_3) continue;
+ for (int w_3 = w_2; w_3 > min_w; w_3 -= dilationW) {
+ int j2 = (w_3) / dW;
+ if (j2 * dW != w_3) continue;
+ if
+ WITHIN_BOUNDS(i2, j2, H, W) {
+ prod_sum[ph_off][pw_off] +=
+ grad_output[n][ph][pw][i2][j2] * val;
+ }
}
}
}
- grad_cache[i] = grad_val;
}
}
+
__syncthreads();
- for (int c = threadIdx.x; c < C; c += blockDim.x) {
- scalar_t grad_input_val = 0.0f;
- for (int ph = 0; ph < patchH; ++ph) {
- int i1 = h + dilation_patchH * (ph - patchRadH);
- for (int pw = 0; pw < patchW; ++pw) {
- int j1 = w + dilation_patchW * (pw - patchRadW);
- if (WITHIN_BOUNDS(i1, j1, iH, iW)) {
- grad_input_val += input2[n][i1][j1][c] * grad_cache[ph * patchW + pw];
- }
+ if (ph_off == 0 && pw_off == 0) {
+ scalar_t reduce_sum = 0;
+ for (int ph = 0; ph < THREADS_BACKWARD; ++ph) {
+ for (int pw = 0; pw < THREADS_BACKWARD; ++pw) {
+ reduce_sum += prod_sum[ph][pw];
}
}
- grad_input1[n][c][h][w] = grad_input_val;
+ grad_input1[n][c][h][w] = reduce_sum;
}
}
@@ -164,10 +163,9 @@ __global__ void correlation_backward_cuda_kernel_input2(
const TensorAcc5R grad_output, const TensorAcc4R input1,
TensorAcc4R grad_input2, int kH, int kW, int patchH, int patchW, int padH,
int padW, int dilationH, int dilationW, int dilation_patchH,
- int dilation_patchW, int dH, int dW) {
- const int iH = input1.size(1);
- const int iW = input1.size(2);
- const int C = input1.size(3);
+ int dilation_patchW, int dH, int dW, int batch) {
+ const int iH = input1.size(2);
+ const int iW = input1.size(3);
const int patchRadH = (patchH - 1) / 2;
const int patchRadW = (patchW - 1) / 2;
@@ -178,54 +176,56 @@ __global__ void correlation_backward_cuda_kernel_input2(
const int dilatedKH = kH * dilationH;
const int dilatedKW = kW * dilationW;
- const int n = blockIdx.x;
+ const int n = batch;
+ const int c = blockIdx.x;
const int h = blockIdx.y;
const int w = blockIdx.z;
+ const int ph_off = threadIdx.x;
+ const int pw_off = threadIdx.y;
- extern __shared__ __align__(sizeof(4)) unsigned char grad_cache_char[];
-  scalar_t *grad_cache = reinterpret_cast<scalar_t *>(grad_cache_char);
- for (int i = threadIdx.x; i < patchH * patchW; i += blockDim.x) {
- const int ph = i / patchW;
- const int pw = i % patchW;
+ __shared__ scalar_t prod_sum[THREADS_BACKWARD][THREADS_BACKWARD];
+ prod_sum[ph_off][pw_off] = 0;
+
+ for (int ph = ph_off; ph < patchH; ph += THREADS_BACKWARD) {
int i1 = h - dilation_patchH * (ph - patchRadH);
- int j1 = w - dilation_patchW * (pw - patchRadW);
-
- if (WITHIN_BOUNDS(i1, j1, iH, iW)) {
- scalar_t grad_val = 0.0f;
-
- const int h_2 = i1 + padH;
- const int w_2 = j1 + padW;
- const int min_h = h_2 - dilatedKH;
- const int min_w = w_2 - dilatedKW;
-
- for (int h_3 = h_2; h_3 > min_h; h_3 -= dilationH) {
- int i2 = (h_3) / dH;
- if (i2 * dH != h_3) continue;
- for (int w_3 = w_2; w_3 > min_w; w_3 -= dilationW) {
- int j2 = (w_3) / dW;
- if (j2 * dW != w_3) continue;
- if (WITHIN_BOUNDS(i2, j2, H, W)) {
- grad_val += grad_output[n][ph][pw][i2][j2];
+ for (int pw = pw_off; pw < patchW; pw += THREADS_BACKWARD) {
+ int j1 = w - dilation_patchW * (pw - patchRadW);
+ if
+ WITHIN_BOUNDS(i1, j1, iH, iW) {
+ scalar_t val = input1[n][c][i1][j1];
+
+ const int h_2 = i1 + padH;
+ const int w_2 = j1 + padW;
+ const int min_h = h_2 - dilatedKH;
+ const int min_w = w_2 - dilatedKW;
+
+ for (int h_3 = h_2; h_3 > min_h; h_3 -= dilationH) {
+ int i2 = (h_3) / dH;
+ if (i2 * dH != h_3) continue;
+ for (int w_3 = w_2; w_3 > min_w; w_3 -= dilationW) {
+ int j2 = (w_3) / dW;
+ if (j2 * dW != w_3) continue;
+ if
+ WITHIN_BOUNDS(i2, j2, H, W) {
+ prod_sum[ph_off][pw_off] +=
+ grad_output[n][ph][pw][i2][j2] * val;
+ }
+ }
}
}
- }
- grad_cache[i] = grad_val;
}
}
+
__syncthreads();
- for (int c = threadIdx.x; c < C; c += blockDim.x) {
- scalar_t grad_input_val = 0.0f;
- for (int ph = 0; ph < patchH; ++ph) {
- int i1 = h - dilation_patchH * (ph - patchRadH);
- for (int pw = 0; pw < patchW; ++pw) {
- int j1 = w - dilation_patchW * (pw - patchRadW);
- if (WITHIN_BOUNDS(i1, j1, iH, iW)) {
- grad_input_val += input1[n][i1][j1][c] * grad_cache[ph * patchW + pw];
- }
+ if (ph_off == 0 && pw_off == 0) {
+ scalar_t reduce_sum = 0;
+ for (int ph = 0; ph < THREADS_BACKWARD; ++ph) {
+ for (int pw = 0; pw < THREADS_BACKWARD; ++pw) {
+ reduce_sum += prod_sum[ph][pw];
}
}
- grad_input2[n][c][h][w] = grad_input_val;
+ grad_input2[n][c][h][w] = reduce_sum;
}
}
#endif
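
The rewritten correlation kernels replace the warp shuffle with a shared-memory reduction: every thread writes a partial sum, and after `__syncthreads()` thread 0 folds the partials. The pattern in isolation (a sketch assuming a single 32-thread block dimension, as `THREADS_FORWARD` implies):

```cuda
#define THREADS 32

// Block-level sum via shared memory: one partial per thread, one
// barrier, then a serial fold by thread 0.
__global__ void block_sum(const float* in, int n, float* out) {
  __shared__ float partial[THREADS];
  float acc = 0.f;
  for (int i = threadIdx.x; i < n; i += THREADS) acc += in[i];
  partial[threadIdx.x] = acc;
  __syncthreads();
  if (threadIdx.x == 0) {
    float s = 0.f;
    for (int i = 0; i < THREADS; ++i) s += partial[i];
    *out = s;
  }
}
```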
diff --git a/mmcv/ops/csrc/common/cuda/diff_iou_rotated_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/diff_iou_rotated_cuda_kernel.cuh
deleted file mode 100644
index 053977a3011692b22a5dce6050fcfec4797f092c..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/diff_iou_rotated_cuda_kernel.cuh
+++ /dev/null
@@ -1,137 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved
-// Adapted from
-// https://github.com/lilanxiao/Rotated_IoU/cuda_op/sort_vert_kernel.cu # noqa
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-
-#define MAX_NUM_VERT_IDX 9
-#define INTERSECTION_OFFSET 8
-#define EPSILON 1e-8
-
-inline int opt_n_thread(int work_size) {
-  const int pow_2 = std::log(static_cast<double>(work_size)) / std::log(2.0);
- return max(min(1 << pow_2, THREADS_PER_BLOCK), 1);
-}
-
-/*
-compare normalized vertices (vertices centered around (0, 0));
-return true if vertex1 < vertex2.
-order: minimum at the x-axis, increasing in the anti-clockwise direction
-*/
-__device__ bool compare_vertices(float x1, float y1, float x2, float y2) {
- if (fabs(x1 - x2) < EPSILON && fabs(y2 - y1) < EPSILON)
- return false; // if equal, return false
-
- if (y1 > 0 && y2 < 0) return true;
- if (y1 < 0 && y2 > 0) return false;
-
- float n1 = x1 * x1 + y1 * y1 + EPSILON;
- float n2 = x2 * x2 + y2 * y2 + EPSILON;
- float diff = fabs(x1) * x1 / n1 - fabs(x2) * x2 / n2;
-
- if (y1 > 0 && y2 > 0) {
- if (diff > EPSILON)
- return true;
- else
- return false;
- }
- if (y1 < 0 && y2 < 0) {
- if (diff < EPSILON)
- return true;
- else
- return false;
- }
- return false;
-}
-
-__global__ void diff_iou_rotated_sort_vertices_forward_cuda_kernel(
- int b, int n, int m, const float *__restrict__ vertices,
- const bool *__restrict__ mask, const int *__restrict__ num_valid,
- int *__restrict__ idx) {
- int batch_idx = blockIdx.x;
- vertices += batch_idx * n * m * 2;
- mask += batch_idx * n * m;
- num_valid += batch_idx * n;
- idx += batch_idx * n * MAX_NUM_VERT_IDX;
-
- int index = threadIdx.x; // index of polygon
- int stride = blockDim.x;
- for (int i = index; i < n; i += stride) {
- int pad; // index of arbitrary invalid intersection point (not box corner!)
- for (int j = INTERSECTION_OFFSET; j < m; ++j) {
- if (!mask[i * m + j]) {
- pad = j;
- break;
- }
- }
- if (num_valid[i] < 3) {
- // not enough vertices, take an invalid intersection point
- // (zero padding)
- for (int j = 0; j < MAX_NUM_VERT_IDX; ++j) {
- idx[i * MAX_NUM_VERT_IDX + j] = pad;
- }
- } else {
- // sort the valid vertices
- // note the number of valid vertices is known
- // note: check that num_valid[i] < MAX_NUM_VERT_IDX
- for (int j = 0; j < num_valid[i]; ++j) {
- // initialize with a "big" value
- float x_min = 1;
- float y_min = -EPSILON;
- int i_take = 0;
- int i2;
- float x2, y2;
- if (j != 0) {
- i2 = idx[i * MAX_NUM_VERT_IDX + j - 1];
- x2 = vertices[i * m * 2 + i2 * 2 + 0];
- y2 = vertices[i * m * 2 + i2 * 2 + 1];
- }
- for (int k = 0; k < m; ++k) {
- float x = vertices[i * m * 2 + k * 2 + 0];
- float y = vertices[i * m * 2 + k * 2 + 1];
- if (mask[i * m + k] && compare_vertices(x, y, x_min, y_min)) {
- if ((j == 0) || (j != 0 && compare_vertices(x2, y2, x, y))) {
- x_min = x;
- y_min = y;
- i_take = k;
- }
- }
- }
- idx[i * MAX_NUM_VERT_IDX + j] = i_take;
- }
- // duplicate the first idx
- idx[i * MAX_NUM_VERT_IDX + num_valid[i]] = idx[i * MAX_NUM_VERT_IDX + 0];
-
- // pad zeros
- for (int j = num_valid[i] + 1; j < MAX_NUM_VERT_IDX; ++j) {
- idx[i * MAX_NUM_VERT_IDX + j] = pad;
- }
-
-    // corner case: the two boxes are exactly the same. idx then contains
-    // duplicate elements, which breaks the shoelace formula. By definition,
-    // the duplicates can only appear in the first 8 positions (they are
-    // "corners in box", not "intersection of edges").
- if (num_valid[i] == 8) {
- int counter = 0;
- for (int j = 0; j < 4; ++j) {
- int check = idx[i * MAX_NUM_VERT_IDX + j];
- for (int k = 4; k < INTERSECTION_OFFSET; ++k) {
- if (idx[i * MAX_NUM_VERT_IDX + k] == check) counter++;
- }
- }
- if (counter == 4) {
- idx[i * MAX_NUM_VERT_IDX + 4] = idx[i * MAX_NUM_VERT_IDX + 0];
- for (int j = 5; j < MAX_NUM_VERT_IDX; ++j) {
- idx[i * MAX_NUM_VERT_IDX + j] = pad;
- }
- }
- }
-
- // TODO: still might need to cover some other corner cases :(
- }
- }
-}
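
`compare_vertices` in the deleted kernel orders points counter-clockwise around the origin without calling `atan2`, using `|x|·x / (x² + y²)` as a monotone angle surrogate within each half-plane. An `atan2`-based reference comparator that realizes the same ordering (illustrative only):

```cuda
#include <cmath>

// Reference ordering: angle from the positive x-axis, increasing
// counter-clockwise, remapped from (-pi, pi] to [0, 2*pi).
bool vertex_less(float x1, float y1, float x2, float y2) {
  double a1 = atan2((double)y1, (double)x1);
  double a2 = atan2((double)y2, (double)x2);
  if (a1 < 0) a1 += 2.0 * M_PI;
  if (a2 < 0) a2 += 2.0 * M_PI;
  return a1 < a2;
}
```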
diff --git a/mmcv/ops/csrc/common/cuda/gather_points_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/gather_points_cuda_kernel.cuh
index 6d932434cba245833e661b8c7e140601940bc35b..c8fc61546acbce55c59abe8371590bba2e610442 100644
--- a/mmcv/ops/csrc/common/cuda/gather_points_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/gather_points_cuda_kernel.cuh
@@ -22,14 +22,13 @@ __global__ void gather_points_forward_cuda_kernel(int b, int c, int n, int m,
int bs_idx = blockIdx.z;
int c_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(pt_idx, m) {
- if (bs_idx >= b || c_idx >= c) return;
-
- out += bs_idx * c * m + c_idx * m + pt_idx;
- idx += bs_idx * m + pt_idx;
- points += bs_idx * c * n + c_idx * n;
- out[0] = points[idx[0]];
- }
+ int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
+ if (bs_idx >= b || c_idx >= c || pt_idx >= m) return;
+
+ out += bs_idx * c * m + c_idx * m + pt_idx;
+ idx += bs_idx * m + pt_idx;
+ points += bs_idx * c * n + c_idx * n;
+ out[0] = points[idx[0]];
}
template <typename T>
@@ -44,15 +43,14 @@ __global__ void gather_points_backward_cuda_kernel(int b, int c, int n, int m,
int bs_idx = blockIdx.z;
int c_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(pt_idx, m) {
- if (bs_idx >= b || c_idx >= c) return;
+ int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
+ if (bs_idx >= b || c_idx >= c || pt_idx >= m) return;
- grad_out += bs_idx * c * m + c_idx * m + pt_idx;
- idx += bs_idx * m + pt_idx;
- grad_points += bs_idx * c * n + c_idx * n;
+ grad_out += bs_idx * c * m + c_idx * m + pt_idx;
+ idx += bs_idx * m + pt_idx;
+ grad_points += bs_idx * c * n + c_idx * n;
- atomicAdd(grad_points + idx[0], grad_out[0]);
- }
+ atomicAdd(grad_points + idx[0], grad_out[0]);
}
#endif // GATHER_POINTS_CUDA_KERNEL_CUH
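
Since the kernel no longer loops over points, the grid must cover every (batch, channel, point) triple. A hypothetical host-side launcher matching the new indexing (assumes the header above is included; the kernel's trailing parameters are inferred and may differ):

```cuda
// blockIdx.z walks batches, blockIdx.y walks channels, and the
// x-dimension tiles the m gathered points.
void launch_gather_points(int b, int c, int n, int m, const float* points,
                          const int* idx, float* out, cudaStream_t stream) {
  dim3 blocks(DIVUP(m, THREADS_PER_BLOCK), c, b);
  dim3 threads(THREADS_PER_BLOCK);
  gather_points_forward_cuda_kernel<float>
      <<<blocks, threads, 0, stream>>>(b, c, n, m, points, idx, out);
}
```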
diff --git a/mmcv/ops/csrc/common/cuda/group_points_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/group_points_cuda_kernel.cuh
index dfad66fc16d8759f614d7f36fa961673976b1d95..9cfc2dc865152769d55d4062b7f6bad25e9c70e8 100644
--- a/mmcv/ops/csrc/common/cuda/group_points_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/group_points_cuda_kernel.cuh
@@ -22,19 +22,18 @@ __global__ void group_points_forward_cuda_kernel(int b, int c, int n,
// out: (B, C, npoints, nsample)
int bs_idx = blockIdx.z;
int c_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(index, npoints * nsample) {
- if (bs_idx >= b || c_idx >= c) return;
+ int index = blockIdx.x * blockDim.x + threadIdx.x;
+ int pt_idx = index / nsample;
+ if (bs_idx >= b || c_idx >= c || pt_idx >= npoints) return;
- int pt_idx = index / nsample;
- int sample_idx = index % nsample;
+ int sample_idx = index % nsample;
- idx += bs_idx * npoints * nsample + pt_idx * nsample + sample_idx;
- int in_idx = bs_idx * c * n + c_idx * n + idx[0];
- int out_idx = bs_idx * c * npoints * nsample + c_idx * npoints * nsample +
- pt_idx * nsample + sample_idx;
+ idx += bs_idx * npoints * nsample + pt_idx * nsample + sample_idx;
+ int in_idx = bs_idx * c * n + c_idx * n + idx[0];
+ int out_idx = bs_idx * c * npoints * nsample + c_idx * npoints * nsample +
+ pt_idx * nsample + sample_idx;
- out[out_idx] = points[in_idx];
- }
+ out[out_idx] = points[in_idx];
}
template <typename T>
@@ -49,17 +48,16 @@ __global__ void group_points_backward_cuda_kernel(int b, int c, int n,
// grad_points: (B, C, N)
int bs_idx = blockIdx.z;
int c_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(index, npoints * nsample) {
- int pt_idx = index / nsample;
- if (bs_idx >= b || c_idx >= c) return;
+ int index = blockIdx.x * blockDim.x + threadIdx.x;
+ int pt_idx = index / nsample;
+ if (bs_idx >= b || c_idx >= c || pt_idx >= npoints) return;
- int sample_idx = index % nsample;
- grad_out += bs_idx * c * npoints * nsample + c_idx * npoints * nsample +
- pt_idx * nsample + sample_idx;
- idx += bs_idx * npoints * nsample + pt_idx * nsample + sample_idx;
+ int sample_idx = index % nsample;
+ grad_out += bs_idx * c * npoints * nsample + c_idx * npoints * nsample +
+ pt_idx * nsample + sample_idx;
+ idx += bs_idx * npoints * nsample + pt_idx * nsample + sample_idx;
- atomicAdd(grad_points + bs_idx * c * n + c_idx * n + idx[0], grad_out[0]);
- }
+ atomicAdd(grad_points + bs_idx * c * n + c_idx * n + idx[0], grad_out[0]);
}
#endif // GROUP_POINTS_CUDA_KERNEL_CUH
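
The backward kernels above accumulate with `atomicAdd` because several (point, sample) pairs may gather from the same source element; a plain store would lose all but one contribution. The scatter-add pattern in isolation (an illustrative kernel, not from the file):

```cuda
// Race-free gradient scatter: idx may contain repeated targets, so
// contributions must be accumulated atomically.
__global__ void scatter_add(const float* grad_out, const int* idx, int total,
                            float* grad_points) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= total) return;
  atomicAdd(grad_points + idx[i], grad_out[i]);
}
```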
diff --git a/mmcv/ops/csrc/common/cuda/iou3d_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/iou3d_cuda_kernel.cuh
index 9ebdcad15eee05a9f412ef34eb12d3553874a4dc..4e261cbd0cf1d69973eab34f32ab2a334d6a13a6 100644
--- a/mmcv/ops/csrc/common/cuda/iou3d_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/iou3d_cuda_kernel.cuh
@@ -50,17 +50,21 @@ __device__ int check_rect_cross(const Point &p1, const Point &p2,
}
__device__ inline int check_in_box2d(const float *box, const Point &p) {
- // params: box (7) [x, y, z, dx, dy, dz, heading]
- const float MARGIN = 1e-2;
-
- float center_x = box[0], center_y = box[1];
- // rotate the point in the opposite direction of box
- float angle_cos = cos(-box[6]), angle_sin = sin(-box[6]);
- float rot_x = (p.x - center_x) * angle_cos + (p.y - center_y) * (-angle_sin);
- float rot_y = (p.x - center_x) * angle_sin + (p.y - center_y) * angle_cos;
-
- return (fabs(rot_x) < box[3] / 2 + MARGIN &&
- fabs(rot_y) < box[4] / 2 + MARGIN);
+ // params: box (5) [x1, y1, x2, y2, angle]
+ const float MARGIN = 1e-5;
+
+ float center_x = (box[0] + box[2]) / 2;
+ float center_y = (box[1] + box[3]) / 2;
+ float angle_cos = cos(-box[4]),
+ angle_sin =
+ sin(-box[4]); // rotate the point in the opposite direction of box
+ float rot_x =
+ (p.x - center_x) * angle_cos - (p.y - center_y) * angle_sin + center_x;
+ float rot_y =
+ (p.x - center_x) * angle_sin + (p.y - center_y) * angle_cos + center_y;
+
+ return (rot_x > box[0] - MARGIN && rot_x < box[2] + MARGIN &&
+ rot_y > box[1] - MARGIN && rot_y < box[3] + MARGIN);
}
__device__ inline int intersection(const Point &p1, const Point &p0,
@@ -112,19 +116,16 @@ __device__ inline int point_cmp(const Point &a, const Point &b,
}
__device__ inline float box_overlap(const float *box_a, const float *box_b) {
- // params box_a: [x, y, z, dx, dy, dz, heading]
- // params box_b: [x, y, z, dx, dy, dz, heading]
+ // params: box_a (5) [x1, y1, x2, y2, angle]
+ // params: box_b (5) [x1, y1, x2, y2, angle]
- float a_angle = box_a[6], b_angle = box_b[6];
- float a_dx_half = box_a[3] / 2, b_dx_half = box_b[3] / 2,
- a_dy_half = box_a[4] / 2, b_dy_half = box_b[4] / 2;
- float a_x1 = box_a[0] - a_dx_half, a_y1 = box_a[1] - a_dy_half;
- float a_x2 = box_a[0] + a_dx_half, a_y2 = box_a[1] + a_dy_half;
- float b_x1 = box_b[0] - b_dx_half, b_y1 = box_b[1] - b_dy_half;
- float b_x2 = box_b[0] + b_dx_half, b_y2 = box_b[1] + b_dy_half;
+ float a_x1 = box_a[0], a_y1 = box_a[1], a_x2 = box_a[2], a_y2 = box_a[3],
+ a_angle = box_a[4];
+ float b_x1 = box_b[0], b_y1 = box_b[1], b_x2 = box_b[2], b_y2 = box_b[3],
+ b_angle = box_b[4];
- Point center_a(box_a[0], box_a[1]);
- Point center_b(box_b[0], box_b[1]);
+ Point center_a((a_x1 + a_x2) / 2, (a_y1 + a_y2) / 2);
+ Point center_b((b_x1 + b_x2) / 2, (b_y1 + b_y2) / 2);
Point box_a_corners[5];
box_a_corners[0].set(a_x1, a_y1);
@@ -208,10 +209,10 @@ __device__ inline float box_overlap(const float *box_a, const float *box_b) {
}
__device__ inline float iou_bev(const float *box_a, const float *box_b) {
- // params box_a: [x, y, z, dx, dy, dz, heading]
- // params box_b: [x, y, z, dx, dy, dz, heading]
- float sa = box_a[3] * box_a[4];
- float sb = box_b[3] * box_b[4];
+ // params: box_a (5) [x1, y1, x2, y2, angle]
+ // params: box_b (5) [x1, y1, x2, y2, angle]
+ float sa = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1]);
+ float sb = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1]);
float s_overlap = box_overlap(box_a, box_b);
return s_overlap / fmaxf(sa + sb - s_overlap, EPS);
}
@@ -219,148 +220,149 @@ __device__ inline float iou_bev(const float *box_a, const float *box_b) {
__global__ void iou3d_boxes_overlap_bev_forward_cuda_kernel(
const int num_a, const float *boxes_a, const int num_b,
const float *boxes_b, float *ans_overlap) {
- // params boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]
- // params boxes_b: (M, 7) [x, y, z, dx, dy, dz, heading]
- CUDA_2D_KERNEL_LOOP(b_idx, num_b, a_idx, num_a) {
- if (a_idx >= num_a || b_idx >= num_b) {
- return;
- }
+ const int a_idx = blockIdx.y * THREADS_PER_BLOCK + threadIdx.y;
+ const int b_idx = blockIdx.x * THREADS_PER_BLOCK + threadIdx.x;
- const float *cur_box_a = boxes_a + a_idx * 7;
- const float *cur_box_b = boxes_b + b_idx * 7;
- float cur_overlap = box_overlap(cur_box_a, cur_box_b);
- ans_overlap[a_idx * num_b + b_idx] = cur_overlap;
+ if (a_idx >= num_a || b_idx >= num_b) {
+ return;
}
+ const float *cur_box_a = boxes_a + a_idx * 5;
+ const float *cur_box_b = boxes_b + b_idx * 5;
+ float s_overlap = box_overlap(cur_box_a, cur_box_b);
+ ans_overlap[a_idx * num_b + b_idx] = s_overlap;
}
-__global__ void iou3d_nms3d_forward_cuda_kernel(const int boxes_num,
- const float nms_overlap_thresh,
- const float *boxes,
- unsigned long long *mask) {
- // params: boxes (N, 7) [x, y, z, dx, dy, dz, heading]
+__global__ void iou3d_boxes_iou_bev_forward_cuda_kernel(const int num_a,
+ const float *boxes_a,
+ const int num_b,
+ const float *boxes_b,
+ float *ans_iou) {
+ const int a_idx = blockIdx.y * THREADS_PER_BLOCK + threadIdx.y;
+ const int b_idx = blockIdx.x * THREADS_PER_BLOCK + threadIdx.x;
+
+ if (a_idx >= num_a || b_idx >= num_b) {
+ return;
+ }
+
+ const float *cur_box_a = boxes_a + a_idx * 5;
+ const float *cur_box_b = boxes_b + b_idx * 5;
+ float cur_iou_bev = iou_bev(cur_box_a, cur_box_b);
+ ans_iou[a_idx * num_b + b_idx] = cur_iou_bev;
+}
+
+__global__ void nms_forward_cuda_kernel(const int boxes_num,
+ const float nms_overlap_thresh,
+ const float *boxes,
+ unsigned long long *mask) {
+ // params: boxes (N, 5) [x1, y1, x2, y2, ry]
// params: mask (N, N/THREADS_PER_BLOCK_NMS)
- const int blocks =
- (boxes_num + THREADS_PER_BLOCK_NMS - 1) / THREADS_PER_BLOCK_NMS;
- CUDA_2D_KERNEL_BLOCK_LOOP(col_start, blocks, row_start, blocks) {
- // if (row_start > col_start) return;
-
- const int row_size = fminf(boxes_num - row_start * THREADS_PER_BLOCK_NMS,
- THREADS_PER_BLOCK_NMS);
- const int col_size = fminf(boxes_num - col_start * THREADS_PER_BLOCK_NMS,
- THREADS_PER_BLOCK_NMS);
-
- __shared__ float block_boxes[THREADS_PER_BLOCK_NMS * 7];
-
- if (threadIdx.x < col_size) {
- block_boxes[threadIdx.x * 7 + 0] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 0];
- block_boxes[threadIdx.x * 7 + 1] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 1];
- block_boxes[threadIdx.x * 7 + 2] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 2];
- block_boxes[threadIdx.x * 7 + 3] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 3];
- block_boxes[threadIdx.x * 7 + 4] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 4];
- block_boxes[threadIdx.x * 7 + 5] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 5];
- block_boxes[threadIdx.x * 7 + 6] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 6];
- }
- __syncthreads();
- if (threadIdx.x < row_size) {
- const int cur_box_idx = THREADS_PER_BLOCK_NMS * row_start + threadIdx.x;
- const float *cur_box = boxes + cur_box_idx * 7;
+ const int row_start = blockIdx.y;
+ const int col_start = blockIdx.x;
+
+ // if (row_start > col_start) return;
+
+ const int row_size = fminf(boxes_num - row_start * THREADS_PER_BLOCK_NMS,
+ THREADS_PER_BLOCK_NMS);
+ const int col_size = fminf(boxes_num - col_start * THREADS_PER_BLOCK_NMS,
+ THREADS_PER_BLOCK_NMS);
+
+ __shared__ float block_boxes[THREADS_PER_BLOCK_NMS * 5];
+
+ if (threadIdx.x < col_size) {
+ block_boxes[threadIdx.x * 5 + 0] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 0];
+ block_boxes[threadIdx.x * 5 + 1] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 1];
+ block_boxes[threadIdx.x * 5 + 2] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 2];
+ block_boxes[threadIdx.x * 5 + 3] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 3];
+ block_boxes[threadIdx.x * 5 + 4] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 4];
+ }
+ __syncthreads();
- int i = 0;
- unsigned long long t = 0;
- int start = 0;
- if (row_start == col_start) {
- start = threadIdx.x + 1;
- }
- for (i = start; i < col_size; i++) {
- if (iou_bev(cur_box, block_boxes + i * 7) > nms_overlap_thresh) {
- t |= 1ULL << i;
- }
+ if (threadIdx.x < row_size) {
+ const int cur_box_idx = THREADS_PER_BLOCK_NMS * row_start + threadIdx.x;
+ const float *cur_box = boxes + cur_box_idx * 5;
+
+ int i = 0;
+ unsigned long long t = 0;
+ int start = 0;
+ if (row_start == col_start) {
+ start = threadIdx.x + 1;
+ }
+ for (i = start; i < col_size; i++) {
+ if (iou_bev(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {
+ t |= 1ULL << i;
}
- const int col_blocks =
- (boxes_num + THREADS_PER_BLOCK_NMS - 1) / THREADS_PER_BLOCK_NMS;
- mask[cur_box_idx * col_blocks + col_start] = t;
}
+ const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);
+ mask[cur_box_idx * col_blocks + col_start] = t;
}
}
__device__ inline float iou_normal(float const *const a, float const *const b) {
- // params: a: [x, y, z, dx, dy, dz, heading]
- // params: b: [x, y, z, dx, dy, dz, heading]
-
- float left = fmaxf(a[0] - a[3] / 2, b[0] - b[3] / 2),
- right = fminf(a[0] + a[3] / 2, b[0] + b[3] / 2);
- float top = fmaxf(a[1] - a[4] / 2, b[1] - b[4] / 2),
- bottom = fminf(a[1] + a[4] / 2, b[1] + b[4] / 2);
+ float left = fmaxf(a[0], b[0]), right = fminf(a[2], b[2]);
+ float top = fmaxf(a[1], b[1]), bottom = fminf(a[3], b[3]);
float width = fmaxf(right - left, 0.f), height = fmaxf(bottom - top, 0.f);
float interS = width * height;
- float Sa = a[3] * a[4];
- float Sb = b[3] * b[4];
+ float Sa = (a[2] - a[0]) * (a[3] - a[1]);
+ float Sb = (b[2] - b[0]) * (b[3] - b[1]);
return interS / fmaxf(Sa + Sb - interS, EPS);
}
-__global__ void iou3d_nms3d_normal_forward_cuda_kernel(
- const int boxes_num, const float nms_overlap_thresh, const float *boxes,
- unsigned long long *mask) {
- // params: boxes (N, 7) [x, y, z, dx, dy, dz, heading]
+__global__ void nms_normal_forward_cuda_kernel(const int boxes_num,
+ const float nms_overlap_thresh,
+ const float *boxes,
+ unsigned long long *mask) {
+ // params: boxes (N, 5) [x1, y1, x2, y2, ry]
// params: mask (N, N/THREADS_PER_BLOCK_NMS)
- const int blocks =
- (boxes_num + THREADS_PER_BLOCK_NMS - 1) / THREADS_PER_BLOCK_NMS;
- CUDA_2D_KERNEL_BLOCK_LOOP(col_start, blocks, row_start, blocks) {
- // if (row_start > col_start) return;
-
- const int row_size = fminf(boxes_num - row_start * THREADS_PER_BLOCK_NMS,
- THREADS_PER_BLOCK_NMS);
- const int col_size = fminf(boxes_num - col_start * THREADS_PER_BLOCK_NMS,
- THREADS_PER_BLOCK_NMS);
-
- __shared__ float block_boxes[THREADS_PER_BLOCK_NMS * 7];
-
- if (threadIdx.x < col_size) {
- block_boxes[threadIdx.x * 7 + 0] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 0];
- block_boxes[threadIdx.x * 7 + 1] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 1];
- block_boxes[threadIdx.x * 7 + 2] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 2];
- block_boxes[threadIdx.x * 7 + 3] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 3];
- block_boxes[threadIdx.x * 7 + 4] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 4];
- block_boxes[threadIdx.x * 7 + 5] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 5];
- block_boxes[threadIdx.x * 7 + 6] =
- boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 6];
- }
- __syncthreads();
+ const int row_start = blockIdx.y;
+ const int col_start = blockIdx.x;
+
+ // if (row_start > col_start) return;
+
+ const int row_size = fminf(boxes_num - row_start * THREADS_PER_BLOCK_NMS,
+ THREADS_PER_BLOCK_NMS);
+ const int col_size = fminf(boxes_num - col_start * THREADS_PER_BLOCK_NMS,
+ THREADS_PER_BLOCK_NMS);
+
+ __shared__ float block_boxes[THREADS_PER_BLOCK_NMS * 5];
+
+ if (threadIdx.x < col_size) {
+ block_boxes[threadIdx.x * 5 + 0] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 0];
+ block_boxes[threadIdx.x * 5 + 1] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 1];
+ block_boxes[threadIdx.x * 5 + 2] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 2];
+ block_boxes[threadIdx.x * 5 + 3] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 3];
+ block_boxes[threadIdx.x * 5 + 4] =
+ boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 5 + 4];
+ }
+ __syncthreads();
- if (threadIdx.x < row_size) {
- const int cur_box_idx = THREADS_PER_BLOCK_NMS * row_start + threadIdx.x;
- const float *cur_box = boxes + cur_box_idx * 7;
+ if (threadIdx.x < row_size) {
+ const int cur_box_idx = THREADS_PER_BLOCK_NMS * row_start + threadIdx.x;
+ const float *cur_box = boxes + cur_box_idx * 5;
- int i = 0;
- unsigned long long t = 0;
- int start = 0;
- if (row_start == col_start) {
- start = threadIdx.x + 1;
- }
- for (i = start; i < col_size; i++) {
- if (iou_normal(cur_box, block_boxes + i * 7) > nms_overlap_thresh) {
- t |= 1ULL << i;
- }
+ int i = 0;
+ unsigned long long t = 0;
+ int start = 0;
+ if (row_start == col_start) {
+ start = threadIdx.x + 1;
+ }
+ for (i = start; i < col_size; i++) {
+ if (iou_normal(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {
+ t |= 1ULL << i;
}
- const int col_blocks =
- (boxes_num + THREADS_PER_BLOCK_NMS - 1) / THREADS_PER_BLOCK_NMS;
- mask[cur_box_idx * col_blocks + col_start] = t;
}
+ const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);
+ mask[cur_box_idx * col_blocks + col_start] = t;
}
}
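
For orientation, here is a hedged sketch of how a bitmask NMS kernel like the one above is typically launched: the grid is a square of 64-box tiles, so each row of `mask` holds `DIVUP(boxes_num, THREADS_PER_BLOCK_NMS)` 64-bit words. The wrapper name is illustrative, and `THREADS_PER_BLOCK_NMS` is assumed to be 64 (the bit width of `unsigned long long`).

```cuda
// Illustrative launch for the bitmask NMS kernel above (not part of the
// patch). Each (col, row) block compares one 64-box tile against another.
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK_NMS 64  // assumed: sizeof(unsigned long long) * 8
#define DIVUP(m, n) (((m) + (n)-1) / (n))

void launch_nms_normal(const float *boxes_dev, unsigned long long *mask_dev,
                       const int boxes_num, const float thresh) {
  const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);
  dim3 blocks(col_blocks, col_blocks);  // x: column tile, y: row tile
  dim3 threads(THREADS_PER_BLOCK_NMS);  // one thread per box in the row tile
  nms_normal_forward_cuda_kernel<<<blocks, threads>>>(boxes_num, thresh,
                                                      boxes_dev, mask_dev);
}
```
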
diff --git a/mmcv/ops/csrc/common/cuda/knn_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/knn_cuda_kernel.cuh
index 3cf52bb90eb27d02b28c52069c760c8a38f83f08..3181aa65cddf129e9e97dde97ceb97923b75c135 100644
--- a/mmcv/ops/csrc/common/cuda/knn_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/knn_cuda_kernel.cuh
@@ -51,42 +51,41 @@ __global__ void knn_forward_cuda_kernel(int b, int n, int m, int nsample,
const T *xyz, const T *new_xyz,
int *__restrict__ idx, T *dist2) {
int bs_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(pt_idx, m) {
- if (bs_idx >= b) return;
+ int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
+ if (bs_idx >= b || pt_idx >= m) return;
- new_xyz += bs_idx * m * 3 + pt_idx * 3;
- xyz += bs_idx * n * 3;
- idx += bs_idx * m * nsample + pt_idx * nsample;
- dist2 += bs_idx * m * nsample + pt_idx * nsample;
+ new_xyz += bs_idx * m * 3 + pt_idx * 3;
+ xyz += bs_idx * n * 3;
+ idx += bs_idx * m * nsample + pt_idx * nsample;
+ dist2 += bs_idx * m * nsample + pt_idx * nsample;
- T new_x = new_xyz[0];
- T new_y = new_xyz[1];
- T new_z = new_xyz[2];
+ T new_x = new_xyz[0];
+ T new_y = new_xyz[1];
+ T new_z = new_xyz[2];
- float best_dist[100];
- int best_idx[100];
- for (int i = 0; i < nsample; i++) {
- best_dist[i] = 1e10;
- best_idx[i] = 0;
- }
- for (int i = 0; i < n; i++) {
- T x = xyz[i * 3 + 0];
- T y = xyz[i * 3 + 1];
- T z = xyz[i * 3 + 2];
- T d2 = (new_x - x) * (new_x - x) + (new_y - y) * (new_y - y) +
- (new_z - z) * (new_z - z);
- if (d2 < best_dist[0]) {
- best_dist[0] = d2;
- best_idx[0] = i;
- reheap(best_dist, best_idx, nsample);
- }
- }
- heap_sort(best_dist, best_idx, nsample);
- for (int i = 0; i < nsample; i++) {
- idx[i] = best_idx[i];
- dist2[i] = best_dist[i];
+ float best_dist[100];
+ int best_idx[100];
+ for (int i = 0; i < nsample; i++) {
+ best_dist[i] = 1e10;
+ best_idx[i] = 0;
+ }
+ for (int i = 0; i < n; i++) {
+ T x = xyz[i * 3 + 0];
+ T y = xyz[i * 3 + 1];
+ T z = xyz[i * 3 + 2];
+ T d2 = (new_x - x) * (new_x - x) + (new_y - y) * (new_y - y) +
+ (new_z - z) * (new_z - z);
+ if (d2 < best_dist[0]) {
+ best_dist[0] = d2;
+ best_idx[0] = i;
+ reheap(best_dist, best_idx, nsample);
}
}
+ heap_sort(best_dist, best_idx, nsample);
+ for (int i = 0; i < nsample; i++) {
+ idx[i] = best_idx[i];
+ dist2[i] = best_dist[i];
+ }
}
#endif // KNN_CUDA_KERNEL_CUH
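
The kernel keeps the current `nsample` best distances in a max-heap: `best_dist[0]` is always the worst kept distance, so a candidate only needs to beat the root before `reheap` restores the heap property. Below is a minimal sketch of the sift-down that `reheap` is assumed to perform (the real helper is defined elsewhere in this header).

```cuda
// Sketch of a max-heap sift-down matching the kernel's usage: after the
// root (current worst distance) is overwritten with a closer candidate,
// push it down until the heap property holds again.
__device__ void reheap_sketch(float *dist, int *idx, int k) {
  int root = 0;
  while (true) {
    int largest = root;
    const int l = 2 * root + 1, r = 2 * root + 2;
    if (l < k && dist[l] > dist[largest]) largest = l;
    if (r < k && dist[r] > dist[largest]) largest = r;
    if (largest == root) break;
    // swap with the larger child and continue downwards
    const float td = dist[root];
    dist[root] = dist[largest];
    dist[largest] = td;
    const int ti = idx[root];
    idx[root] = idx[largest];
    idx[largest] = ti;
    root = largest;
  }
}
```
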
diff --git a/mmcv/ops/csrc/common/cuda/min_area_polygons_cuda.cuh b/mmcv/ops/csrc/common/cuda/min_area_polygons_cuda.cuh
deleted file mode 100644
index df56e743669c3426f6abb113e4209d0cc60f2baf..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/min_area_polygons_cuda.cuh
+++ /dev/null
@@ -1,300 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved
-#ifndef MIN_AREA_POLYGONS_CUDA_KERNEL_CUH
-#define MIN_AREA_POLYGONS_CUDA_KERNEL_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-
-#define MAXN 20
-__device__ const float PI = 3.1415926;
-
-struct Point {
- float x, y;
- __device__ Point() {}
- __device__ Point(float x, float y) : x(x), y(y) {}
-};
-
-__device__ inline void swap1(Point *a, Point *b) {
- Point temp;
- temp.x = a->x;
- temp.y = a->y;
-
- a->x = b->x;
- a->y = b->y;
-
- b->x = temp.x;
- b->y = temp.y;
-}
-__device__ inline float cross(Point o, Point a, Point b) {
- return (a.x - o.x) * (b.y - o.y) - (b.x - o.x) * (a.y - o.y);
-}
-
-__device__ inline float dis(Point a, Point b) {
- return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y);
-}
-__device__ inline void minBoundingRect(Point *ps, int n_points, float *minbox) {
- float convex_points[2][MAXN];
- for (int j = 0; j < n_points; j++) {
- convex_points[0][j] = ps[j].x;
- }
- for (int j = 0; j < n_points; j++) {
- convex_points[1][j] = ps[j].y;
- }
-
- Point edges[MAXN];
- float edges_angles[MAXN];
- float unique_angles[MAXN];
- int n_edges = n_points - 1;
- int n_unique = 0;
- int unique_flag = 0;
-
- for (int i = 0; i < n_edges; i++) {
- edges[i].x = ps[i + 1].x - ps[i].x;
- edges[i].y = ps[i + 1].y - ps[i].y;
- }
- for (int i = 0; i < n_edges; i++) {
- edges_angles[i] = atan2((double)edges[i].y, (double)edges[i].x);
- if (edges_angles[i] >= 0) {
- edges_angles[i] = fmod((double)edges_angles[i], (double)PI / 2);
- } else {
- edges_angles[i] =
- edges_angles[i] - (int)(edges_angles[i] / (PI / 2) - 1) * (PI / 2);
- }
- }
- unique_angles[0] = edges_angles[0];
- n_unique += 1;
- for (int i = 1; i < n_edges; i++) {
- for (int j = 0; j < n_unique; j++) {
- if (edges_angles[i] == unique_angles[j]) {
- unique_flag += 1;
- }
- }
- if (unique_flag == 0) {
- unique_angles[n_unique] = edges_angles[i];
- n_unique += 1;
- unique_flag = 0;
- } else {
- unique_flag = 0;
- }
- }
-
- float minarea = 1e12;
- for (int i = 0; i < n_unique; i++) {
- float R[2][2];
- float rot_points[2][MAXN];
- R[0][0] = cos(unique_angles[i]);
- R[0][1] = sin(unique_angles[i]);
- R[1][0] = -sin(unique_angles[i]);
- R[1][1] = cos(unique_angles[i]);
- // R x Points
- for (int m = 0; m < 2; m++) {
- for (int n = 0; n < n_points; n++) {
- float sum = 0.0;
- for (int k = 0; k < 2; k++) {
- sum = sum + R[m][k] * convex_points[k][n];
- }
- rot_points[m][n] = sum;
- }
- }
-
- // xmin;
- float xmin, ymin, xmax, ymax;
- xmin = 1e12;
- for (int j = 0; j < n_points; j++) {
- if (isinf(rot_points[0][j]) || isnan(rot_points[0][j])) {
- continue;
- } else {
- if (rot_points[0][j] < xmin) {
- xmin = rot_points[0][j];
- }
- }
- }
- // ymin
- ymin = 1e12;
- for (int j = 0; j < n_points; j++) {
- if (isinf(rot_points[1][j]) || isnan(rot_points[1][j])) {
- continue;
- } else {
- if (rot_points[1][j] < ymin) {
- ymin = rot_points[1][j];
- }
- }
- }
- // xmax
- xmax = -1e12;
- for (int j = 0; j < n_points; j++) {
- if (isinf(rot_points[0][j]) || isnan(rot_points[0][j])) {
- continue;
- } else {
- if (rot_points[0][j] > xmax) {
- xmax = rot_points[0][j];
- }
- }
- }
- // ymax
- ymax = -1e12;
- for (int j = 0; j < n_points; j++) {
- if (isinf(rot_points[1][j]) || isnan(rot_points[1][j])) {
- continue;
- } else {
- if (rot_points[1][j] > ymax) {
- ymax = rot_points[1][j];
- }
- }
- }
- float area = (xmax - xmin) * (ymax - ymin);
- if (area < minarea) {
- minarea = area;
- minbox[0] = unique_angles[i];
- minbox[1] = xmin;
- minbox[2] = ymin;
- minbox[3] = xmax;
- minbox[4] = ymax;
- }
- }
-}
-
-// convex_find
-__device__ inline void Jarvis(Point *in_poly, int &n_poly) {
- int n_input = n_poly;
- Point input_poly[20];
- for (int i = 0; i < n_input; i++) {
- input_poly[i].x = in_poly[i].x;
- input_poly[i].y = in_poly[i].y;
- }
- Point p_max, p_k;
- int max_index, k_index;
- int Stack[20], top1, top2;
- // float sign;
- double sign;
- Point right_point[10], left_point[10];
-
- for (int i = 0; i < n_poly; i++) {
- if (in_poly[i].y < in_poly[0].y ||
- in_poly[i].y == in_poly[0].y && in_poly[i].x < in_poly[0].x) {
- Point *j = &(in_poly[0]);
- Point *k = &(in_poly[i]);
- swap1(j, k);
- }
- if (i == 0) {
- p_max = in_poly[0];
- max_index = 0;
- }
- if (in_poly[i].y > p_max.y ||
- in_poly[i].y == p_max.y && in_poly[i].x > p_max.x) {
- p_max = in_poly[i];
- max_index = i;
- }
- }
- if (max_index == 0) {
- max_index = 1;
- p_max = in_poly[max_index];
- }
-
- k_index = 0, Stack[0] = 0, top1 = 0;
- while (k_index != max_index) {
- p_k = p_max;
- k_index = max_index;
- for (int i = 1; i < n_poly; i++) {
- sign = cross(in_poly[Stack[top1]], in_poly[i], p_k);
- if ((sign > 0) || ((sign == 0) && (dis(in_poly[Stack[top1]], in_poly[i]) >
- dis(in_poly[Stack[top1]], p_k)))) {
- p_k = in_poly[i];
- k_index = i;
- }
- }
- top1++;
- Stack[top1] = k_index;
- }
-
- for (int i = 0; i <= top1; i++) {
- right_point[i] = in_poly[Stack[i]];
- }
-
- k_index = 0, Stack[0] = 0, top2 = 0;
-
- while (k_index != max_index) {
- p_k = p_max;
- k_index = max_index;
- for (int i = 1; i < n_poly; i++) {
- sign = cross(in_poly[Stack[top2]], in_poly[i], p_k);
- if ((sign < 0) || (sign == 0) && (dis(in_poly[Stack[top2]], in_poly[i]) >
- dis(in_poly[Stack[top2]], p_k))) {
- p_k = in_poly[i];
- k_index = i;
- }
- }
- top2++;
- Stack[top2] = k_index;
- }
-
- for (int i = top2 - 1; i >= 0; i--) {
- left_point[i] = in_poly[Stack[i]];
- }
-
- for (int i = 0; i < top1 + top2; i++) {
- if (i <= top1) {
- in_poly[i] = right_point[i];
- } else {
- in_poly[i] = left_point[top2 - (i - top1)];
- }
- }
- n_poly = top1 + top2;
-}
-
-template <typename T>
-__device__ inline void Findminbox(T const *const p, T *minpoints) {
- Point ps1[MAXN];
- Point convex[MAXN];
- for (int i = 0; i < 9; i++) {
- convex[i].x = p[i * 2];
- convex[i].y = p[i * 2 + 1];
- }
- int n_convex = 9;
- Jarvis(convex, n_convex);
- int n1 = n_convex;
- for (int i = 0; i < n1; i++) {
- ps1[i].x = convex[i].x;
- ps1[i].y = convex[i].y;
- }
- ps1[n1].x = convex[0].x;
- ps1[n1].y = convex[0].y;
-
- float minbbox[5] = {0};
- minBoundingRect(ps1, n1 + 1, minbbox);
- float angle = minbbox[0];
- float xmin = minbbox[1];
- float ymin = minbbox[2];
- float xmax = minbbox[3];
- float ymax = minbbox[4];
- float R[2][2];
-
- R[0][0] = cos(angle);
- R[0][1] = sin(angle);
- R[1][0] = -sin(angle);
- R[1][1] = cos(angle);
-
- minpoints[0] = xmax * R[0][0] + ymin * R[1][0];
- minpoints[1] = xmax * R[0][1] + ymin * R[1][1];
- minpoints[2] = xmin * R[0][0] + ymin * R[1][0];
- minpoints[3] = xmin * R[0][1] + ymin * R[1][1];
- minpoints[4] = xmin * R[0][0] + ymax * R[1][0];
- minpoints[5] = xmin * R[0][1] + ymax * R[1][1];
- minpoints[6] = xmax * R[0][0] + ymax * R[1][0];
- minpoints[7] = xmax * R[0][1] + ymax * R[1][1];
-}
-
-template <typename T>
-__global__ void min_area_polygons_cuda_kernel(const int ex_n_boxes,
- const T *ex_boxes, T *minbox) {
- CUDA_1D_KERNEL_LOOP(index, ex_n_boxes) {
- const T *cur_box = ex_boxes + index * 18;
- T *cur_min_box = minbox + index * 8;
- Findminbox(cur_box, cur_min_box);
- }
-}
-
-#endif // MIN_AREA_POLYGONS_CUDA_KERNEL_CUH
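
For reference, the deleted `minBoundingRect` enumerates each unique edge angle $\theta$ of the convex hull, rotates the hull points, and keeps the angle whose axis-aligned extent has the smallest area:

```latex
R(\theta) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix},
\qquad
\begin{pmatrix} x'_j \\ y'_j \end{pmatrix} = R(\theta)\begin{pmatrix} x_j \\ y_j \end{pmatrix},
\qquad
\mathrm{area}(\theta) = \bigl(\max_j x'_j - \min_j x'_j\bigr)\,\bigl(\max_j y'_j - \min_j y'_j\bigr)
```

`Findminbox` then maps the winning rotated-frame corners back through $R(\theta)^{\top}$ to produce the eight output coordinates.
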
diff --git a/mmcv/ops/csrc/common/cuda/ms_deform_attn_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/ms_deform_attn_cuda_kernel.cuh
index 12225ffdb3b1691ad9edabcd1663109f67ef1a6f..aff1ea26fafb6574060797d24131b8540594716d 100644
--- a/mmcv/ops/csrc/common/cuda/ms_deform_attn_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/ms_deform_attn_cuda_kernel.cuh
@@ -14,6 +14,11 @@
#include "common_cuda_helper.hpp"
#include "pytorch_cuda_helper.hpp"
+const int CUDA_NUM_THREADS = 1024;
+inline int GET_BLOCKS(const int N, const int num_threads) {
+ return (N + num_threads - 1) / num_threads;
+}
+
template <typename scalar_t>
__device__ scalar_t ms_deform_attn_im2col_bilinear(
const scalar_t *&bottom_data, const int &height, const int &width,
@@ -262,11 +267,10 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1(
const int channels, const int num_levels, const int num_query,
const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
scalar_t *grad_attn_weight) {
- __shared__ scalar_t cache_grad_sampling_loc[blockSize * 2];
- __shared__ scalar_t cache_grad_attn_weight[blockSize];
- unsigned int tid = threadIdx.x;
- const int qid_stride = num_heads * channels;
CUDA_1D_KERNEL_LOOP(index, n) {
+ __shared__ scalar_t cache_grad_sampling_loc[blockSize * 2];
+ __shared__ scalar_t cache_grad_attn_weight[blockSize];
+ unsigned int tid = threadIdx.x;
int _temp = index;
const int c_col = _temp % channels;
_temp /= channels;
@@ -281,11 +285,11 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1(
int data_weight_ptr = sampling_index * num_levels * num_point;
int data_loc_w_ptr = data_weight_ptr << 1;
const int grad_sampling_ptr = data_weight_ptr;
- scalar_t *grad_sampling_loc_out =
- grad_sampling_loc + (grad_sampling_ptr << 1);
- scalar_t *grad_attn_weight_out = grad_attn_weight + grad_sampling_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
const int grad_weight_stride = 1;
const int grad_loc_stride = 2;
+ const int qid_stride = num_heads * channels;
const int data_value_ptr_init_offset = b_col * spatial_size * qid_stride;
for (int l_col = 0; l_col < num_levels; ++l_col) {
@@ -322,23 +326,23 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1(
_grad_h = cache_grad_sampling_loc[1],
_grad_a = cache_grad_attn_weight[0];
int sid = 2;
- for (unsigned int _tid = 1; _tid < blockSize; ++_tid) {
+ for (unsigned int tid = 1; tid < blockSize; ++tid) {
_grad_w += cache_grad_sampling_loc[sid];
_grad_h += cache_grad_sampling_loc[sid + 1];
- _grad_a += cache_grad_attn_weight[_tid];
+ _grad_a += cache_grad_attn_weight[tid];
sid += 2;
}
- *grad_sampling_loc_out = _grad_w;
- *(grad_sampling_loc_out + 1) = _grad_h;
- *grad_attn_weight_out = _grad_a;
+ *grad_sampling_loc = _grad_w;
+ *(grad_sampling_loc + 1) = _grad_h;
+ *grad_attn_weight = _grad_a;
}
__syncthreads();
data_weight_ptr += 1;
data_loc_w_ptr += 2;
- grad_attn_weight_out += grad_weight_stride;
- grad_sampling_loc_out += grad_loc_stride;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
}
}
}
@@ -353,10 +357,10 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2(
const int channels, const int num_levels, const int num_query,
const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
scalar_t *grad_attn_weight) {
- __shared__ scalar_t cache_grad_sampling_loc[blockSize * 2];
- __shared__ scalar_t cache_grad_attn_weight[blockSize];
- unsigned int tid = threadIdx.x;
CUDA_1D_KERNEL_LOOP(index, n) {
+ __shared__ scalar_t cache_grad_sampling_loc[blockSize * 2];
+ __shared__ scalar_t cache_grad_attn_weight[blockSize];
+ unsigned int tid = threadIdx.x;
int _temp = index;
const int c_col = _temp % channels;
_temp /= channels;
@@ -371,9 +375,8 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2(
int data_weight_ptr = sampling_index * num_levels * num_point;
int data_loc_w_ptr = data_weight_ptr << 1;
const int grad_sampling_ptr = data_weight_ptr;
- scalar_t *grad_sampling_loc_out =
- grad_sampling_loc + (grad_sampling_ptr << 1);
- scalar_t *grad_attn_weight_out = grad_attn_weight + grad_sampling_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
const int grad_weight_stride = 1;
const int grad_loc_stride = 2;
const int qid_stride = num_heads * channels;
@@ -422,16 +425,16 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2(
}
if (tid == 0) {
- *grad_sampling_loc_out = cache_grad_sampling_loc[0];
- *(grad_sampling_loc_out + 1) = cache_grad_sampling_loc[1];
- *grad_attn_weight_out = cache_grad_attn_weight[0];
+ *grad_sampling_loc = cache_grad_sampling_loc[0];
+ *(grad_sampling_loc + 1) = cache_grad_sampling_loc[1];
+ *grad_attn_weight = cache_grad_attn_weight[0];
}
__syncthreads();
data_weight_ptr += 1;
data_loc_w_ptr += 2;
- grad_attn_weight_out += grad_weight_stride;
- grad_sampling_loc_out += grad_loc_stride;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
}
}
}
@@ -446,11 +449,11 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v1(
const int channels, const int num_levels, const int num_query,
const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
scalar_t *grad_attn_weight) {
- extern __shared__ int _s[];
- scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
- scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
- unsigned int tid = threadIdx.x;
CUDA_1D_KERNEL_LOOP(index, n) {
+ extern __shared__ int _s[];
+ scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
+ scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
+ unsigned int tid = threadIdx.x;
int _temp = index;
const int c_col = _temp % channels;
_temp /= channels;
@@ -465,9 +468,8 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v1(
int data_weight_ptr = sampling_index * num_levels * num_point;
int data_loc_w_ptr = data_weight_ptr << 1;
const int grad_sampling_ptr = data_weight_ptr;
- scalar_t *grad_sampling_loc_out =
- grad_sampling_loc + (grad_sampling_ptr << 1);
- scalar_t *grad_attn_weight_out = grad_attn_weight + grad_sampling_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
const int grad_weight_stride = 1;
const int grad_loc_stride = 2;
const int qid_stride = num_heads * channels;
@@ -507,23 +509,23 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v1(
_grad_h = cache_grad_sampling_loc[1],
_grad_a = cache_grad_attn_weight[0];
int sid = 2;
- for (unsigned int _tid = 1; _tid < blockDim.x; ++_tid) {
+ for (unsigned int tid = 1; tid < blockDim.x; ++tid) {
_grad_w += cache_grad_sampling_loc[sid];
_grad_h += cache_grad_sampling_loc[sid + 1];
- _grad_a += cache_grad_attn_weight[_tid];
+ _grad_a += cache_grad_attn_weight[tid];
sid += 2;
}
- *grad_sampling_loc_out = _grad_w;
- *(grad_sampling_loc_out + 1) = _grad_h;
- *grad_attn_weight_out = _grad_a;
+ *grad_sampling_loc = _grad_w;
+ *(grad_sampling_loc + 1) = _grad_h;
+ *grad_attn_weight = _grad_a;
}
__syncthreads();
data_weight_ptr += 1;
data_loc_w_ptr += 2;
- grad_attn_weight_out += grad_weight_stride;
- grad_sampling_loc_out += grad_loc_stride;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
}
}
}
@@ -538,11 +540,11 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v2(
const int channels, const int num_levels, const int num_query,
const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
scalar_t *grad_attn_weight) {
- extern __shared__ int _s[];
- scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
- scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
- unsigned int tid = threadIdx.x;
CUDA_1D_KERNEL_LOOP(index, n) {
+ extern __shared__ int _s[];
+ scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
+ scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
+ unsigned int tid = threadIdx.x;
int _temp = index;
const int c_col = _temp % channels;
_temp /= channels;
@@ -557,9 +559,8 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v2(
int data_weight_ptr = sampling_index * num_levels * num_point;
int data_loc_w_ptr = data_weight_ptr << 1;
const int grad_sampling_ptr = data_weight_ptr;
- scalar_t *grad_sampling_loc_out =
- grad_sampling_loc + (grad_sampling_ptr << 1);
- scalar_t *grad_attn_weight_out = grad_attn_weight + grad_sampling_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
const int grad_weight_stride = 1;
const int grad_loc_stride = 2;
const int qid_stride = num_heads * channels;
@@ -617,16 +618,16 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v2(
}
if (tid == 0) {
- *grad_sampling_loc_out = cache_grad_sampling_loc[0];
- *(grad_sampling_loc_out + 1) = cache_grad_sampling_loc[1];
- *grad_attn_weight_out = cache_grad_attn_weight[0];
+ *grad_sampling_loc = cache_grad_sampling_loc[0];
+ *(grad_sampling_loc + 1) = cache_grad_sampling_loc[1];
+ *grad_attn_weight = cache_grad_attn_weight[0];
}
__syncthreads();
data_weight_ptr += 1;
data_loc_w_ptr += 2;
- grad_attn_weight_out += grad_weight_stride;
- grad_sampling_loc_out += grad_loc_stride;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
}
}
}
@@ -641,11 +642,11 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v2_multi_blocks(
const int channels, const int num_levels, const int num_query,
const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
scalar_t *grad_attn_weight) {
- extern __shared__ int _s[];
- scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
- scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
- unsigned int tid = threadIdx.x;
CUDA_1D_KERNEL_LOOP(index, n) {
+ extern __shared__ int _s[];
+ scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
+ scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
+ unsigned int tid = threadIdx.x;
int _temp = index;
const int c_col = _temp % channels;
_temp /= channels;
@@ -660,9 +661,8 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v2_multi_blocks(
int data_weight_ptr = sampling_index * num_levels * num_point;
int data_loc_w_ptr = data_weight_ptr << 1;
const int grad_sampling_ptr = data_weight_ptr;
- scalar_t *grad_sampling_loc_out =
- grad_sampling_loc + (grad_sampling_ptr << 1);
- scalar_t *grad_attn_weight_out = grad_attn_weight + grad_sampling_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
const int grad_weight_stride = 1;
const int grad_loc_stride = 2;
const int qid_stride = num_heads * channels;
@@ -720,16 +720,16 @@ __global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v2_multi_blocks(
}
if (tid == 0) {
- atomicAdd(grad_sampling_loc_out, cache_grad_sampling_loc[0]);
- atomicAdd(grad_sampling_loc_out + 1, cache_grad_sampling_loc[1]);
- atomicAdd(grad_attn_weight_out, cache_grad_attn_weight[0]);
+ atomicAdd(grad_sampling_loc, cache_grad_sampling_loc[0]);
+ atomicAdd(grad_sampling_loc + 1, cache_grad_sampling_loc[1]);
+ atomicAdd(grad_attn_weight, cache_grad_attn_weight[0]);
}
__syncthreads();
data_weight_ptr += 1;
data_loc_w_ptr += 2;
- grad_attn_weight_out += grad_weight_stride;
- grad_sampling_loc_out += grad_loc_stride;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
}
}
}
@@ -759,9 +759,8 @@ __global__ void ms_deformable_col2im_gpu_kernel_gm(
int data_weight_ptr = sampling_index * num_levels * num_point;
int data_loc_w_ptr = data_weight_ptr << 1;
const int grad_sampling_ptr = data_weight_ptr;
- scalar_t *grad_sampling_loc_out =
- grad_sampling_loc + (grad_sampling_ptr << 1);
- scalar_t *grad_attn_weight_out = grad_attn_weight + grad_sampling_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
const int grad_weight_stride = 1;
const int grad_loc_stride = 2;
const int qid_stride = num_heads * channels;
@@ -788,12 +787,12 @@ __global__ void ms_deformable_col2im_gpu_kernel_gm(
ms_deform_attn_col2im_bilinear_gm(
data_value_ptr, spatial_h, spatial_w, num_heads, channels, h_im,
w_im, m_col, c_col, top_grad, weight, grad_value_ptr,
- grad_sampling_loc_out, grad_attn_weight_out);
+ grad_sampling_loc, grad_attn_weight);
}
data_weight_ptr += 1;
data_loc_w_ptr += 2;
- grad_attn_weight_out += grad_weight_stride;
- grad_sampling_loc_out += grad_loc_stride;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
}
}
}
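
The `CUDA_NUM_THREADS`/`GET_BLOCKS` pair added at the top of this file is a plain ceiling-division launch helper. A hedged usage sketch with a dummy kernel follows (the kernel itself is not part of the patch):

```cuda
// One thread per element, CUDA_NUM_THREADS threads per block, and enough
// blocks to cover all n elements; the last block may be partially idle.
__global__ void fill_ones(const int n, float *out) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = 1.f;
}

void launch_fill_ones(float *out_dev, const int n) {
  fill_ones<<<GET_BLOCKS(n, CUDA_NUM_THREADS), CUDA_NUM_THREADS>>>(n, out_dev);
}
```
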
diff --git a/mmcv/ops/csrc/common/cuda/nms_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/nms_cuda_kernel.cuh
index 281d9f0b409f54260a81a79ad96ab09fde9580ce..40a2f462202cb06e7230ad3f1e17474e93ddc4cb 100644
--- a/mmcv/ops/csrc/common/cuda/nms_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/nms_cuda_kernel.cuh
@@ -27,91 +27,48 @@ __device__ inline bool devIoU(float const *const a, float const *const b,
return interS > threshold * (Sa + Sb - interS);
}
-__global__ static void nms_cuda(const int n_boxes, const float iou_threshold,
- const int offset, const float *dev_boxes,
- unsigned long long *dev_mask) {
- int blocks = (n_boxes + threadsPerBlock - 1) / threadsPerBlock;
- CUDA_2D_KERNEL_BLOCK_LOOP(col_start, blocks, row_start, blocks) {
- const int tid = threadIdx.x;
-
- if (row_start > col_start) return;
-
- const int row_size =
- fminf(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
- const int col_size =
- fminf(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
-
- __shared__ float block_boxes[threadsPerBlock * 4];
- if (tid < col_size) {
- block_boxes[tid * 4 + 0] =
- dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 0];
- block_boxes[tid * 4 + 1] =
- dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 1];
- block_boxes[tid * 4 + 2] =
- dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 2];
- block_boxes[tid * 4 + 3] =
- dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 3];
- }
- __syncthreads();
-
- if (tid < row_size) {
- const int cur_box_idx = threadsPerBlock * row_start + tid;
- const float *cur_box = dev_boxes + cur_box_idx * 4;
- int i = 0;
- unsigned long long int t = 0;
- int start = 0;
- if (row_start == col_start) {
- start = tid + 1;
- }
- for (i = start; i < col_size; i++) {
- if (devIoU(cur_box, block_boxes + i * 4, offset, iou_threshold)) {
- t |= 1ULL << i;
- }
- }
- dev_mask[cur_box_idx * gridDim.y + col_start] = t;
- }
- }
-}
-
-__global__ static void gather_keep_from_mask(bool *keep,
- const unsigned long long *dev_mask,
- const int n_boxes) {
- const int col_blocks = (n_boxes + threadsPerBlock - 1) / threadsPerBlock;
+__global__ void nms_cuda(const int n_boxes, const float iou_threshold,
+ const int offset, const float *dev_boxes,
+ unsigned long long *dev_mask) {
+ const int row_start = blockIdx.y;
+ const int col_start = blockIdx.x;
const int tid = threadIdx.x;
- // mark the bboxes which have been removed.
- extern __shared__ unsigned long long removed[];
+ if (row_start > col_start) return;
- // initialize removed.
- for (int i = tid; i < col_blocks; i += blockDim.x) {
- removed[i] = 0;
+ const int row_size =
+ fminf(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
+ const int col_size =
+ fminf(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
+
+ __shared__ float block_boxes[threadsPerBlock * 4];
+ if (tid < col_size) {
+ block_boxes[tid * 4 + 0] =
+ dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 0];
+ block_boxes[tid * 4 + 1] =
+ dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 1];
+ block_boxes[tid * 4 + 2] =
+ dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 2];
+ block_boxes[tid * 4 + 3] =
+ dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 3];
}
__syncthreads();
- for (int nblock = 0; nblock < col_blocks; ++nblock) {
- auto removed_val = removed[nblock];
- __syncthreads();
- const int i_offset = nblock * threadsPerBlock;
-#pragma unroll
- for (int inblock = 0; inblock < threadsPerBlock; ++inblock) {
- const int i = i_offset + inblock;
- if (i >= n_boxes) break;
- // select a candidate, check if it should be kept.
- if (!(removed_val & (1ULL << inblock))) {
- if (tid == 0) {
- // mark the output.
- keep[i] = true;
- }
- auto p = dev_mask + i * col_blocks;
- // remove all bboxes which overlap the candidate.
- for (int j = tid; j < col_blocks; j += blockDim.x) {
- if (j >= nblock) removed[j] |= p[j];
- }
- __syncthreads();
- removed_val = removed[nblock];
+ if (tid < row_size) {
+ const int cur_box_idx = threadsPerBlock * row_start + tid;
+ const float *cur_box = dev_boxes + cur_box_idx * 4;
+ int i = 0;
+ unsigned long long int t = 0;
+ int start = 0;
+ if (row_start == col_start) {
+ start = tid + 1;
+ }
+ for (i = start; i < col_size; i++) {
+ if (devIoU(cur_box, block_boxes + i * 4, offset, iou_threshold)) {
+ t |= 1ULL << i;
}
}
+ dev_mask[cur_box_idx * gridDim.y + col_start] = t;
}
}
-
#endif // NMS_CUDA_KERNEL_CUH
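
This hunk drops the device-side `gather_keep_from_mask`, so the suppression bitmask has to be reduced elsewhere. Below is a hedged host-side sketch of the classic sequential reduction it replaces (function name and signature are illustrative):

```cuda
#include <vector>

// Walk boxes in score order: a box is kept iff no earlier kept box has
// marked it; keeping it ORs its suppression row into `removed`.
std::vector<int> gather_keep(const unsigned long long *mask,
                             const int n_boxes, const int threads_per_block) {
  const int col_blocks =
      (n_boxes + threads_per_block - 1) / threads_per_block;
  std::vector<unsigned long long> removed(col_blocks, 0ULL);
  std::vector<int> keep;
  for (int i = 0; i < n_boxes; ++i) {
    const int nblock = i / threads_per_block;
    const int inblock = i % threads_per_block;
    if (!(removed[nblock] & (1ULL << inblock))) {
      keep.push_back(i);
      const unsigned long long *p =
          mask + static_cast<long long>(i) * col_blocks;
      for (int j = nblock; j < col_blocks; ++j) removed[j] |= p[j];
    }
  }
  return keep;
}
```
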
diff --git a/mmcv/ops/csrc/common/cuda/nms_quadri_cuda.cuh b/mmcv/ops/csrc/common/cuda/nms_quadri_cuda.cuh
deleted file mode 100644
index bba3b8258f6b8798b9d1a651bfda29c48bb5376a..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/nms_quadri_cuda.cuh
+++ /dev/null
@@ -1,141 +0,0 @@
-// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
-#ifndef NMS_QUADRI_CUDA_CUH
-#define NMS_QUADRI_CUDA_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-#include "box_iou_rotated_utils.hpp"
-
-__host__ __device__ inline int divideUP(const int x, const int y) {
- return (((x) + (y)-1) / (y));
-}
-
-namespace {
-int const threadsPerBlock = sizeof(unsigned long long) * 8;
-}
-
-template <typename T>
-__global__ void nms_quadri_cuda_kernel(const int n_boxes,
- const float iou_threshold,
- const T* dev_boxes,
- unsigned long long* dev_mask,
- const int multi_label) {
- if (multi_label == 1) {
- const int row_start = blockIdx.y;
- const int col_start = blockIdx.x;
-
- // if (row_start > col_start) return;
-
- const int row_size =
- min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
- const int col_size =
- min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
-
- // Compared to nms_cuda_kernel, where each box is represented with 4 values
- // (x1, y1, x2, y2), each rotated box is represented with 8 values
- // (x1, y1, ..., x4, y4) here.
- __shared__ T block_boxes[threadsPerBlock * 8];
- if (threadIdx.x < col_size) {
- block_boxes[threadIdx.x * 8 + 0] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 9 + 0];
- block_boxes[threadIdx.x * 8 + 1] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 9 + 1];
- block_boxes[threadIdx.x * 8 + 2] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 9 + 2];
- block_boxes[threadIdx.x * 8 + 3] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 9 + 3];
- block_boxes[threadIdx.x * 8 + 4] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 9 + 4];
- block_boxes[threadIdx.x * 8 + 5] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 9 + 5];
- block_boxes[threadIdx.x * 8 + 6] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 9 + 6];
- block_boxes[threadIdx.x * 8 + 7] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 9 + 7];
- }
- __syncthreads();
-
- if (threadIdx.x < row_size) {
- const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;
- const T* cur_box = dev_boxes + cur_box_idx * 9;
- int i = 0;
- unsigned long long t = 0;
- int start = 0;
- if (row_start == col_start) {
- start = threadIdx.x + 1;
- }
- for (i = start; i < col_size; i++) {
- // Instead of devIoU used by original horizontal nms, here
- // we use the single_box_iou_quadri function from
- // box_iou_rotated_utils.h
- if (single_box_iou_quadri(cur_box, block_boxes + i * 8, 0) >
- iou_threshold) {
- t |= 1ULL << i;
- }
- }
- const int col_blocks = divideUP(n_boxes, threadsPerBlock);
- dev_mask[cur_box_idx * col_blocks + col_start] = t;
- }
- } else {
- const int row_start = blockIdx.y;
- const int col_start = blockIdx.x;
-
- // if (row_start > col_start) return;
-
- const int row_size =
- min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
- const int col_size =
- min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
-
- // Compared to nms_cuda_kernel, where each box is represented with 4 values
- // (x1, y1, x2, y2), each rotated box is represented with 8 values
- // (x1, y1, ..., x4, y4) here.
- __shared__ T block_boxes[threadsPerBlock * 8];
- if (threadIdx.x < col_size) {
- block_boxes[threadIdx.x * 8 + 0] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 8 + 0];
- block_boxes[threadIdx.x * 8 + 1] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 8 + 1];
- block_boxes[threadIdx.x * 8 + 2] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 8 + 2];
- block_boxes[threadIdx.x * 8 + 3] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 8 + 3];
- block_boxes[threadIdx.x * 8 + 4] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 8 + 4];
- block_boxes[threadIdx.x * 8 + 5] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 8 + 5];
- block_boxes[threadIdx.x * 8 + 6] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 8 + 6];
- block_boxes[threadIdx.x * 8 + 7] =
- dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 8 + 7];
- }
- __syncthreads();
-
- if (threadIdx.x < row_size) {
- const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;
- const T* cur_box = dev_boxes + cur_box_idx * 8;
- int i = 0;
- unsigned long long t = 0;
- int start = 0;
- if (row_start == col_start) {
- start = threadIdx.x + 1;
- }
- for (i = start; i < col_size; i++) {
- // Instead of devIoU used by original horizontal nms, here
- // we use the single_box_iou_quadri function from
- // box_iou_rotated_utils.h
- if (single_box_iou_quadri(cur_box, block_boxes + i * 8, 0) >
- iou_threshold) {
- t |= 1ULL << i;
- }
- }
- const int col_blocks = divideUP(n_boxes, threadsPerBlock);
- dev_mask[cur_box_idx * col_blocks + col_start] = t;
- }
- }
-}
-
-#endif
diff --git a/mmcv/ops/csrc/common/cuda/nms_rotated_cuda.cuh b/mmcv/ops/csrc/common/cuda/nms_rotated_cuda.cuh
index 747327afb83900177dd4721f1b0ba99153f658d7..80bed9681f748390999a2963bd3448570b0dbf6a 100644
--- a/mmcv/ops/csrc/common/cuda/nms_rotated_cuda.cuh
+++ b/mmcv/ops/csrc/common/cuda/nms_rotated_cuda.cuh
@@ -43,16 +43,18 @@ __global__ void nms_rotated_cuda_kernel(const int n_boxes,
// (x_center, y_center, width, height, angle_degrees) here.
-  __shared__ T block_boxes[threadsPerBlock * 5];
+  __shared__ T block_boxes[threadsPerBlock * 6];
if (threadIdx.x < col_size) {
- block_boxes[threadIdx.x * 5 + 0] =
+ block_boxes[threadIdx.x * 6 + 0] =
dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 0];
- block_boxes[threadIdx.x * 5 + 1] =
+ block_boxes[threadIdx.x * 6 + 1] =
dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 1];
- block_boxes[threadIdx.x * 5 + 2] =
+ block_boxes[threadIdx.x * 6 + 2] =
dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 2];
- block_boxes[threadIdx.x * 5 + 3] =
+ block_boxes[threadIdx.x * 6 + 3] =
dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 3];
- block_boxes[threadIdx.x * 5 + 4] =
+ block_boxes[threadIdx.x * 6 + 4] =
dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 4];
+ block_boxes[threadIdx.x * 6 + 5] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 5];
}
__syncthreads();
@@ -69,7 +71,7 @@ __global__ void nms_rotated_cuda_kernel(const int n_boxes,
// Instead of devIoU used by original horizontal nms, here
// we use the single_box_iou_rotated function from
// box_iou_rotated_utils.h
- if (single_box_iou_rotated(cur_box, block_boxes + i * 5, 0) >
+ if (single_box_iou_rotated(cur_box, block_boxes + i * 6, 0) >
iou_threshold) {
t |= 1ULL << i;
}
diff --git a/mmcv/ops/csrc/common/cuda/points_in_boxes_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/points_in_boxes_cuda_kernel.cuh
index 342362079a5ce3dde6d19532b3014872f4373330..12182cc3704eaacd1da838ce357c2677ad029eaa 100644
--- a/mmcv/ops/csrc/common/cuda/points_in_boxes_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/points_in_boxes_cuda_kernel.cuh
@@ -45,21 +45,20 @@ __global__ void points_in_boxes_part_forward_cuda_kernel(
// (B, npoints), default -1
int bs_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(pt_idx, pts_num) {
- if (bs_idx >= batch_size) return;
+ int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
+ if (bs_idx >= batch_size || pt_idx >= pts_num) return;
- boxes += bs_idx * boxes_num * 7;
- pts += bs_idx * pts_num * 3 + pt_idx * 3;
- box_idx_of_points += bs_idx * pts_num + pt_idx;
+ boxes += bs_idx * boxes_num * 7;
+ pts += bs_idx * pts_num * 3 + pt_idx * 3;
+ box_idx_of_points += bs_idx * pts_num + pt_idx;
- T local_x = 0, local_y = 0;
- int cur_in_flag = 0;
- for (int k = 0; k < boxes_num; k++) {
- cur_in_flag = check_pt_in_box3d(pts, boxes + k * 7, local_x, local_y);
- if (cur_in_flag) {
- box_idx_of_points[0] = k;
- break;
- }
+ T local_x = 0, local_y = 0;
+ int cur_in_flag = 0;
+ for (int k = 0; k < boxes_num; k++) {
+ cur_in_flag = check_pt_in_box3d(pts, boxes + k * 7, local_x, local_y);
+ if (cur_in_flag) {
+ box_idx_of_points[0] = k;
+ break;
}
}
}
@@ -74,20 +73,19 @@ __global__ void points_in_boxes_all_forward_cuda_kernel(
// (B, npoints), default -1
int bs_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(pt_idx, pts_num) {
- if (bs_idx >= batch_size) return;
+ int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
+ if (bs_idx >= batch_size || pt_idx >= pts_num) return;
- boxes += bs_idx * boxes_num * 7;
- pts += bs_idx * pts_num * 3 + pt_idx * 3;
- box_idx_of_points += bs_idx * pts_num * boxes_num + pt_idx * boxes_num;
+ boxes += bs_idx * boxes_num * 7;
+ pts += bs_idx * pts_num * 3 + pt_idx * 3;
+ box_idx_of_points += bs_idx * pts_num * boxes_num + pt_idx * boxes_num;
- T local_x = 0, local_y = 0;
- for (int k = 0; k < boxes_num; k++) {
- const int cur_in_flag =
- check_pt_in_box3d(pts, boxes + k * 7, local_x, local_y);
- if (cur_in_flag) {
- box_idx_of_points[k] = 1;
- }
+ T local_x = 0, local_y = 0;
+ for (int k = 0; k < boxes_num; k++) {
+ const int cur_in_flag =
+ check_pt_in_box3d(pts, boxes + k * 7, local_x, local_y);
+ if (cur_in_flag) {
+ box_idx_of_points[k] = 1;
}
}
}
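
Both kernels above now index points with `blockIdx.x * blockDim.x + threadIdx.x` and the batch with `blockIdx.y`, which implies a 2D grid at launch. A hedged sketch follows; the block size and the exact parameter list are assumptions for illustration.

```cuda
// Matching 2D launch for the rewritten kernels: grid.x tiles the points,
// grid.y indexes the batch.
#define THREADS_PER_BLOCK 256  // assumed, typical block size

template <typename T>
__global__ void points_in_boxes_part_forward_cuda_kernel(
    int batch_size, int boxes_num, int pts_num, const T *boxes, const T *pts,
    int *box_idx_of_points);  // assumed prototype, matching the hunk above

void launch_points_in_boxes_part(int batch_size, int boxes_num, int pts_num,
                                 const float *boxes, const float *pts,
                                 int *box_idx_of_points) {
  dim3 blocks((pts_num + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK,
              batch_size);
  dim3 threads(THREADS_PER_BLOCK);
  points_in_boxes_part_forward_cuda_kernel<float><<<blocks, threads>>>(
      batch_size, boxes_num, pts_num, boxes, pts, box_idx_of_points);
}
```
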
diff --git a/mmcv/ops/csrc/common/cuda/points_in_polygons_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/points_in_polygons_cuda_kernel.cuh
deleted file mode 100644
index a0769d75a29ce8d7eac00931d6f51caa292b2693..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/points_in_polygons_cuda_kernel.cuh
+++ /dev/null
@@ -1,79 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved
-#ifndef POINTS_IN_POLYGONS_CUDA_KERNEL_CUH
-#define POINTS_IN_POLYGONS_CUDA_KERNEL_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-
-struct point {
- float x, y;
-};
-
-template <typename scalar_t>
-__global__ void points_in_polygons_forward_cuda_kernel(
- const int nthreads, const scalar_t *vertex1, const scalar_t *vertex2,
- const int rows, const int cols, scalar_t *inside_flag) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- int row = index / cols;
- int col = index % cols;
-
- const scalar_t *offset_vertex1 = vertex1 + row * 2;
- const scalar_t *offset_vertex2 = vertex2 + col * 8;
-
- point point_[1];
- point polygon[4];
-
- point_[0].x = offset_vertex1[0];
- point_[0].y = offset_vertex1[1];
-
- polygon[0].x = offset_vertex2[0];
- polygon[0].y = offset_vertex2[1];
- polygon[1].x = offset_vertex2[2];
- polygon[1].y = offset_vertex2[3];
- polygon[2].x = offset_vertex2[4];
- polygon[2].y = offset_vertex2[5];
- polygon[3].x = offset_vertex2[6];
- polygon[3].y = offset_vertex2[7];
-
- int nCross = 0;
- int i, j;
- float sx, sy, tx, ty, px, py, x;
- for (i = 0, j = 3; i < 4; j = i, i++) {
- sx = polygon[i].x;
- sy = polygon[i].y;
- tx = polygon[j].x;
- ty = polygon[j].y;
-
- px = point_[0].x;
- py = point_[0].y;
-
- if (py < min(sy, ty)) continue;
- if (py > max(sy, ty)) continue;
-
- if ((sx == px && sy == py) || (tx == px && ty == py)) {
- break;
- } else {
- if ((sy < py && ty >= py) || (sy >= py && ty < py)) {
- x = sx + (py - sy) * (tx - sx) / (ty - sy);
- if (x == px) {
- break;
- }
- if (x > px) {
- nCross++;
- }
- }
- }
- }
- if (nCross % 2 == 1) {
- inside_flag[index] = 1.0;
- } else {
- inside_flag[index] = 0.0;
- }
- return;
- }
-}
-
-#endif // POINTS_IN_POLYGONS_CUDA_KERNEL_CUH
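
The deleted kernel decides containment with the even-odd (ray-crossing) rule: a horizontal ray towards +x crosses the polygon boundary an odd number of times iff the point is inside. Below is a minimal host-side sketch of the same test for a 4-vertex polygon; unlike the kernel's early `break`, boundary hits are simply treated as inside here.

```cuda
// Even-odd point-in-polygon test mirroring the deleted kernel's loop.
// poly holds 4 vertices as (x, y) pairs, in order.
bool point_in_quad(const float px, const float py, const float poly[8]) {
  int crossings = 0;
  for (int i = 0, j = 3; i < 4; j = i, ++i) {
    const float sx = poly[i * 2], sy = poly[i * 2 + 1];
    const float tx = poly[j * 2], ty = poly[j * 2 + 1];
    if ((sx == px && sy == py) || (tx == px && ty == py)) return true;
    if ((sy < py && ty >= py) || (sy >= py && ty < py)) {
      // x-coordinate where the edge crosses the ray's height
      const float x = sx + (py - sy) * (tx - sx) / (ty - sy);
      if (x == px) return true;  // point lies on the edge
      if (x > px) ++crossings;   // crossing strictly to the right
    }
  }
  return crossings % 2 == 1;
}
```
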
diff --git a/mmcv/ops/csrc/common/cuda/prroi_pool_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/prroi_pool_cuda_kernel.cuh
deleted file mode 100644
index e2f5a11b8dd6058f8d2fd288fc943dc235b39c37..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/prroi_pool_cuda_kernel.cuh
+++ /dev/null
@@ -1,381 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved
-// Modified from
-// https://github.com/vacancy/PreciseRoIPooling/blob/master/src/prroi_pooling_gpu_impl.cu
-// Distributed under terms of the MIT license.
-#ifndef PRROI_POOL_CUDA_KERNEL_CUH
-#define PRROI_POOL_CUDA_KERNEL_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-
-template <typename T>
-__device__ static __forceinline__ T PrRoIPoolingGetData(const T *data,
- const int h,
- const int w,
- const int height,
- const int width) {
- bool overflow = (h < 0) || (w < 0) || (h >= height) || (w >= width);
- T retVal = overflow ? 0.0f : data[h * width + w];
- return retVal;
-}
-
-template <typename T>
-__device__ static __forceinline__ T PrRoIPoolingGetCoeff(T dh, T dw) {
- return (1.0f - abs(dh)) * (1.0f - abs(dw));
-}
-
-template <typename T>
-__device__ static __forceinline__ T PrRoIPoolingSingleCoorIntegral(T s, T t,
- T c1, T c2) {
- return 0.5 * (t * t - s * s) * (c2 - c1) + (t - s) * c1;
-}
-
-template <typename T>
-__device__ static T PrRoIPoolingInterpolation(const T *data, const T h,
- const T w, const int height,
- const int width) {
- T retVal = 0.0f;
- int h1 = floorf(h);
- int w1 = floorf(w);
- retVal += PrRoIPoolingGetData(data, h1, w1, height, width) *
- PrRoIPoolingGetCoeff(h - T(h1), w - T(w1));
- h1 = floorf(h) + 1;
- w1 = floorf(w);
- retVal += PrRoIPoolingGetData(data, h1, w1, height, width) *
- PrRoIPoolingGetCoeff(h - T(h1), w - T(w1));
- h1 = floorf(h);
- w1 = floorf(w) + 1;
- retVal += PrRoIPoolingGetData(data, h1, w1, height, width) *
- PrRoIPoolingGetCoeff(h - T(h1), w - T(w1));
- h1 = floorf(h) + 1;
- w1 = floorf(w) + 1;
- retVal += PrRoIPoolingGetData(data, h1, w1, height, width) *
- PrRoIPoolingGetCoeff(h - T(h1), w - T(w1));
- return retVal;
-}
-
-template <typename T>
-__device__ static T PrRoIPoolingMatCalculation(const T *this_data,
- const int s_h, const int s_w,
- const int e_h, const int e_w,
- const T y0, const T x0,
- const T y1, const T x1,
- const int h0, const int w0) {
- T alpha, beta, lim_alpha, lim_beta, tmp;
- T sum_out = 0;
-
- alpha = x0 - T(s_w);
- beta = y0 - T(s_h);
- lim_alpha = x1 - T(s_w);
- lim_beta = y1 - T(s_h);
- tmp = (lim_alpha - 0.5f * lim_alpha * lim_alpha - alpha +
- 0.5f * alpha * alpha) *
- (lim_beta - 0.5f * lim_beta * lim_beta - beta + 0.5f * beta * beta);
- sum_out += PrRoIPoolingGetData(this_data, s_h, s_w, h0, w0) * tmp;
-
- alpha = T(e_w) - x1;
- lim_alpha = T(e_w) - x0;
- tmp = (lim_alpha - 0.5f * lim_alpha * lim_alpha - alpha +
- 0.5f * alpha * alpha) *
- (lim_beta - 0.5f * lim_beta * lim_beta - beta + 0.5f * beta * beta);
- sum_out += PrRoIPoolingGetData(this_data, s_h, e_w, h0, w0) * tmp;
-
- alpha = x0 - T(s_w);
- beta = T(e_h) - y1;
- lim_alpha = x1 - T(s_w);
- lim_beta = T(e_h) - y0;
- tmp = (lim_alpha - 0.5f * lim_alpha * lim_alpha - alpha +
- 0.5f * alpha * alpha) *
- (lim_beta - 0.5f * lim_beta * lim_beta - beta + 0.5f * beta * beta);
- sum_out += PrRoIPoolingGetData(this_data, e_h, s_w, h0, w0) * tmp;
-
- alpha = T(e_w) - x1;
- lim_alpha = T(e_w) - x0;
- tmp = (lim_alpha - 0.5f * lim_alpha * lim_alpha - alpha +
- 0.5f * alpha * alpha) *
- (lim_beta - 0.5f * lim_beta * lim_beta - beta + 0.5f * beta * beta);
- sum_out += PrRoIPoolingGetData(this_data, e_h, e_w, h0, w0) * tmp;
-
- return sum_out;
-}
-
-template <typename T>
-__device__ static void PrRoIPoolingDistributeDiff(T *diff, const T top_diff,
- const int h, const int w,
- const int height,
- const int width,
- const T coeff) {
- bool overflow = (h < 0) || (w < 0) || (h >= height) || (w >= width);
- if (!overflow) atomicAdd(diff + h * width + w, top_diff * coeff);
-}
-
-template <typename T>
-__device__ static void PrRoIPoolingMatDistributeDiff(
- T *diff, const T top_diff, const int s_h, const int s_w, const int e_h,
- const int e_w, const T y0, const T x0, const T y1, const T x1, const int h0,
- const int w0) {
- T alpha, beta, lim_alpha, lim_beta, tmp;
-
- alpha = x0 - T(s_w);
- beta = y0 - T(s_h);
- lim_alpha = x1 - T(s_w);
- lim_beta = y1 - T(s_h);
- tmp = (lim_alpha - 0.5f * lim_alpha * lim_alpha - alpha +
- 0.5f * alpha * alpha) *
- (lim_beta - 0.5f * lim_beta * lim_beta - beta + 0.5f * beta * beta);
- PrRoIPoolingDistributeDiff(diff, top_diff, s_h, s_w, h0, w0, tmp);
-
- alpha = T(e_w) - x1;
- lim_alpha = T(e_w) - x0;
- tmp = (lim_alpha - 0.5f * lim_alpha * lim_alpha - alpha +
- 0.5f * alpha * alpha) *
- (lim_beta - 0.5f * lim_beta * lim_beta - beta + 0.5f * beta * beta);
- PrRoIPoolingDistributeDiff(diff, top_diff, s_h, e_w, h0, w0, tmp);
-
- alpha = x0 - T(s_w);
- beta = T(e_h) - y1;
- lim_alpha = x1 - T(s_w);
- lim_beta = T(e_h) - y0;
- tmp = (lim_alpha - 0.5f * lim_alpha * lim_alpha - alpha +
- 0.5f * alpha * alpha) *
- (lim_beta - 0.5f * lim_beta * lim_beta - beta + 0.5f * beta * beta);
- PrRoIPoolingDistributeDiff(diff, top_diff, e_h, s_w, h0, w0, tmp);
-
- alpha = T(e_w) - x1;
- lim_alpha = T(e_w) - x0;
- tmp = (lim_alpha - 0.5f * lim_alpha * lim_alpha - alpha +
- 0.5f * alpha * alpha) *
- (lim_beta - 0.5f * lim_beta * lim_beta - beta + 0.5f * beta * beta);
- PrRoIPoolingDistributeDiff(diff, top_diff, e_h, e_w, h0, w0, tmp);
-}
-
-template <typename T>
-__global__ void prroi_pool_forward_cuda_kernel(
- const int nthreads, const T *input, const T *rois, T *output,
- const int pooled_height, const int pooled_width, const T spatial_scale,
- const int channels, const int height, const int width) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- // (n, c, ph, pw) is an element in the pooled output
- int pw = index % pooled_width;
- int ph = (index / pooled_width) % pooled_height;
- int c = (index / pooled_width / pooled_height) % channels;
- int n = index / pooled_width / pooled_height / channels;
-
- const T *offset_rois = rois + n * 5;
- int roi_batch_ind = offset_rois[0];
-
- T roi_x1 = offset_rois[1] * spatial_scale;
- T roi_y1 = offset_rois[2] * spatial_scale;
- T roi_x2 = offset_rois[3] * spatial_scale;
- T roi_y2 = offset_rois[4] * spatial_scale;
-
- T roi_width = max(roi_x2 - roi_x1, ((T)0.0));
- T roi_height = max(roi_y2 - roi_y1, ((T)0.0));
- T bin_size_h = roi_height / static_cast<T>(pooled_height);
- T bin_size_w = roi_width / static_cast<T>(pooled_width);
-
- const T *this_data =
- input + (roi_batch_ind * channels + c) * height * width;
- T *this_out = output + index;
-
- T bin_x1 = roi_x1 + bin_size_w * pw;
- T bin_y1 = roi_y1 + bin_size_h * ph;
- T bin_x2 = bin_x1 + bin_size_w;
- T bin_y2 = bin_y1 + bin_size_h;
-
- T bin_size = max(T(0.0), bin_size_w * bin_size_h);
- if (bin_size == 0) {
- *this_out = 0;
- continue;
- }
-
- T sum_out = 0;
-
- int start_x, start_y, end_x, end_y;
-
- start_x = floorf(bin_x1);
- end_x = ceilf(bin_x2);
- start_y = floorf(bin_y1);
- end_y = ceilf(bin_y2);
-
- for (int bin_x = start_x; bin_x < end_x; ++bin_x)
- for (int bin_y = start_y; bin_y < end_y; ++bin_y)
- sum_out += PrRoIPoolingMatCalculation(
- this_data, bin_y, bin_x, bin_y + 1, bin_x + 1,
- max(bin_y1, T(bin_y)), max(bin_x1, T(bin_x)),
- min(bin_y2, T(bin_y) + 1.0f), min(bin_x2, T(bin_x + 1.0f)), height,
- width);
- *this_out = sum_out / bin_size;
- }
-}
-
-template <typename T>
-__global__ void prroi_pool_backward_cuda_kernel(
- const int nthreads, const T *grad_output, const T *rois, T *grad_input,
- const int pooled_height, const int pooled_width, const T spatial_scale,
- const int channels, const int height, const int width) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- // (n, c, ph, pw) is an element in the pooled output
- int pw = index % pooled_width;
- int ph = (index / pooled_width) % pooled_height;
- int c = (index / pooled_width / pooled_height) % channels;
- int n = index / pooled_width / pooled_height / channels;
- auto rois_cur = rois + n * 5;
-
- int roi_batch_ind = rois_cur[0];
- T roi_x1 = rois_cur[1] * spatial_scale;
- T roi_y1 = rois_cur[2] * spatial_scale;
- T roi_x2 = rois_cur[3] * spatial_scale;
- T roi_y2 = rois_cur[4] * spatial_scale;
-
- T roi_width = max(roi_x2 - roi_x1, (T)0);
- T roi_height = max(roi_y2 - roi_y1, (T)0);
- T bin_size_h = roi_height / static_cast<T>(pooled_height);
- T bin_size_w = roi_width / static_cast<T>(pooled_width);
-
- const T *this_out_grad = grad_output + index;
- T *this_data_grad =
- grad_input + (roi_batch_ind * channels + c) * height * width;
-
- T bin_x1 = roi_x1 + bin_size_w * pw;
- T bin_y1 = roi_y1 + bin_size_h * ph;
- T bin_x2 = bin_x1 + bin_size_w;
- T bin_y2 = bin_y1 + bin_size_h;
-
- T bin_size = max(T(0.0), bin_size_w * bin_size_h);
-
- T sum_out = bin_size == T(0) ? T(0) : *this_out_grad / bin_size;
-
- int start_x, start_y, end_x, end_y;
-
- start_x = floorf(bin_x1);
- end_x = ceilf(bin_x2);
- start_y = floorf(bin_y1);
- end_y = ceilf(bin_y2);
-
- for (int bin_x = start_x; bin_x < end_x; ++bin_x)
- for (int bin_y = start_y; bin_y < end_y; ++bin_y)
- PrRoIPoolingMatDistributeDiff(
- this_data_grad, sum_out, bin_y, bin_x, bin_y + 1, bin_x + 1,
- max(bin_y1, T(bin_y)), max(bin_x1, T(bin_x)),
- min(bin_y2, T(bin_y) + 1.0f), min(bin_x2, T(bin_x + 1.0f)), height,
- width);
- }
-}
-
-template <typename T>
-__global__ void prroi_pool_coor_backward_cuda_kernel(
- const int nthreads, const T *output, const T *grad_output, const T *input,
- const T *rois, T *grad_rois, const int pooled_height,
- const int pooled_width, const T spatial_scale, const int channels,
- const int height, const int width) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- // (n, c, ph, pw) is an element in the pooled output
- int pw = index % pooled_width;
- int ph = (index / pooled_width) % pooled_height;
- int c = (index / pooled_width / pooled_height) % channels;
- int n = index / pooled_width / pooled_height / channels;
- auto rois_cur = rois + n * 5;
-
- int roi_batch_ind = rois_cur[0];
- T roi_x1 = rois_cur[1] * spatial_scale;
- T roi_y1 = rois_cur[2] * spatial_scale;
- T roi_x2 = rois_cur[3] * spatial_scale;
- T roi_y2 = rois_cur[4] * spatial_scale;
-
- T roi_width = max(roi_x2 - roi_x1, (T)0);
- T roi_height = max(roi_y2 - roi_y1, (T)0);
- T bin_size_h = roi_height / static_cast<T>(pooled_height);
- T bin_size_w = roi_width / static_cast<T>(pooled_width);
-
- const T output_grad_val = grad_output[index];
- const T *this_input_data =
- input + (roi_batch_ind * channels + c) * height * width;
- const T output_val = output[index];
- T *this_rois_grad = grad_rois + n * 5;
-
- T bin_x1 = roi_x1 + bin_size_w * pw;
- T bin_y1 = roi_y1 + bin_size_h * ph;
- T bin_x2 = bin_x1 + bin_size_w;
- T bin_y2 = bin_y1 + bin_size_h;
-
- T bin_size = max(T(0.0), bin_size_w * bin_size_h);
-
- T sum_out = bin_size == T(0) ? T(0) : output_grad_val / bin_size;
-
- // WARNING: to be discussed
- if (sum_out == 0) continue;
-
- int start_x, start_y, end_x, end_y;
-
- start_x = floorf(bin_x1);
- end_x = ceilf(bin_x2);
- start_y = floorf(bin_y1);
- end_y = ceilf(bin_y2);
-
- T grad_x1_y = 0, grad_x2_y = 0, grad_x_y1 = 0, grad_x_y2 = 0;
- for (int bin_y = start_y; bin_y < end_y; ++bin_y) {
- grad_x1_y += PrRoIPoolingSingleCoorIntegral(
- max(bin_y1, T(bin_y)) - bin_y, min(bin_y2, T(bin_y + 1)) - bin_y,
- PrRoIPoolingInterpolation(this_input_data, float(bin_y), bin_x1,
- height, width),
- PrRoIPoolingInterpolation(this_input_data, float(bin_y + 1), bin_x1,
- height, width));
-
- grad_x2_y += PrRoIPoolingSingleCoorIntegral(
- max(bin_y1, T(bin_y)) - bin_y, min(bin_y2, T(bin_y + 1)) - bin_y,
- PrRoIPoolingInterpolation(this_input_data, float(bin_y), bin_x2,
- height, width),
- PrRoIPoolingInterpolation(this_input_data, float(bin_y + 1), bin_x2,
- height, width));
- }
-
- for (int bin_x = start_x; bin_x < end_x; ++bin_x) {
- grad_x_y1 += PrRoIPoolingSingleCoorIntegral(
- max(bin_x1, T(bin_x)) - bin_x, min(bin_x2, T(bin_x + 1)) - bin_x,
- PrRoIPoolingInterpolation(this_input_data, bin_y1, float(bin_x),
- height, width),
- PrRoIPoolingInterpolation(this_input_data, bin_y1, float(bin_x + 1),
- height, width));
-
- grad_x_y2 += PrRoIPoolingSingleCoorIntegral(
- max(bin_x1, T(bin_x)) - bin_x, min(bin_x2, T(bin_x + 1)) - bin_x,
- PrRoIPoolingInterpolation(this_input_data, bin_y2, float(bin_x),
- height, width),
- PrRoIPoolingInterpolation(this_input_data, bin_y2, float(bin_x + 1),
- height, width));
- }
-
- T partial_x1 = -grad_x1_y + (bin_y2 - bin_y1) * output_val;
- T partial_y1 = -grad_x_y1 + (bin_x2 - bin_x1) * output_val;
- T partial_x2 = grad_x2_y - (bin_y2 - bin_y1) * output_val;
- T partial_y2 = grad_x_y2 - (bin_x2 - bin_x1) * output_val;
-
- partial_x1 = partial_x1 / bin_size * spatial_scale;
- partial_x2 = partial_x2 / bin_size * spatial_scale;
- partial_y1 = partial_y1 / bin_size * spatial_scale;
- partial_y2 = partial_y2 / bin_size * spatial_scale;
-
- // (index, x1, y1, x2, y2)
- this_rois_grad[0] = 0;
- atomicAdd(this_rois_grad + 1,
- (partial_x1 * (1.0f - T(pw) / pooled_width) +
- partial_x2 * (1.0f - T(pw + 1) / pooled_width)) *
- output_grad_val);
- atomicAdd(this_rois_grad + 2,
- (partial_y1 * (1.0f - T(ph) / pooled_height) +
- partial_y2 * (1.0f - T(ph + 1) / pooled_height)) *
- output_grad_val);
- atomicAdd(this_rois_grad + 3, (partial_x2 * T(pw + 1) / pooled_width +
- partial_x1 * T(pw) / pooled_width) *
- output_grad_val);
- atomicAdd(this_rois_grad + 4, (partial_y2 * T(ph + 1) / pooled_height +
- partial_y1 * T(ph) / pooled_height) *
- output_grad_val);
- }
-}
-
-#endif // ROI_POOL_CUDA_KERNEL_CUH
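
`PrRoIPoolingSingleCoorIntegral(s, t, c1, c2)` in the deleted file is the exact integral of the linear interpolation between two unit-spaced samples: with value $c_1$ at offset 0 and $c_2$ at offset 1, the interpolant at $u$ is $c_1 + u(c_2 - c_1)$, and integrating over $[s, t] \subseteq [0, 1]$ gives

```latex
\int_s^t \bigl(c_1 + u\,(c_2 - c_1)\bigr)\,du
  = (t - s)\,c_1 + \tfrac{1}{2}\,(t^2 - s^2)\,(c_2 - c_1)
```

which is exactly `0.5 * (t * t - s * s) * (c2 - c1) + (t - s) * c1` above; summing these exact per-cell integrals is what makes Precise RoI Pooling continuous in the bin coordinates.
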
diff --git a/mmcv/ops/csrc/common/cuda/riroi_align_rotated_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/riroi_align_rotated_cuda_kernel.cuh
deleted file mode 100644
index 4383d9e82cce97362f53cf799b8dfa30c7b4cd02..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/riroi_align_rotated_cuda_kernel.cuh
+++ /dev/null
@@ -1,242 +0,0 @@
-// Modified from
-// https://github.com/csuhan/ReDet/blob/master/mmdet/ops/riroi_align/src/riroi_align_kernel.cu
-#ifndef RIROI_ALIGN_ROTATED_CUDA_KERNEL_CUH
-#define RIROI_ALIGN_ROTATED_CUDA_KERNEL_CUH
-
-#include <float.h>
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else // MMCV_USE_PARROTS
-#include "pytorch_cuda_helper.hpp"
-#endif // MMCV_USE_PARROTS
-
-/*** Forward ***/
-template <typename scalar_t>
-__global__ void riroi_align_rotated_forward_cuda_kernel(
- const int nthreads, const scalar_t *bottom_data,
- const scalar_t *bottom_rois, const scalar_t spatial_scale,
- const int num_samples, const bool clockwise, const int channels,
- const int height, const int width, const int pooled_height,
- const int pooled_width, const int num_orientations, scalar_t *top_data) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- // (n, c, ph, pw) is an element in the pooled output
- int pw = index % pooled_width;
- int ph = (index / pooled_width) % pooled_height;
- int o = (index / pooled_width / pooled_height) % num_orientations;
- int c =
- (index / pooled_width / pooled_height / num_orientations) % channels;
- int n = index / pooled_width / pooled_height / num_orientations / channels;
-
- const scalar_t *offset_bottom_rois = bottom_rois + n * 6;
- int roi_batch_ind = offset_bottom_rois[0];
-
- // Do not use rounding; this implementation detail is critical
- scalar_t roi_center_w = offset_bottom_rois[1] * spatial_scale;
- scalar_t roi_center_h = offset_bottom_rois[2] * spatial_scale;
- scalar_t roi_width = offset_bottom_rois[3] * spatial_scale;
- scalar_t roi_height = offset_bottom_rois[4] * spatial_scale;
- // scalar_t theta = offset_bottom_rois[5] * M_PI / 180.0;
- scalar_t theta = offset_bottom_rois[5];
- // Force malformed ROIs to be 1x1
- roi_width = max(roi_width, (scalar_t)1.);
- roi_height = max(roi_height, (scalar_t)1.);
- scalar_t bin_size_h = static_cast<scalar_t>(roi_height) /
- static_cast<scalar_t>(pooled_height);
- scalar_t bin_size_w =
- static_cast<scalar_t>(roi_width) / static_cast<scalar_t>(pooled_width);
-
- // find aligned index
- scalar_t ind_float = theta * num_orientations / (2 * M_PI);
- int ind = floorf(ind_float);
- scalar_t l_var = ind_float - (scalar_t)ind;
- scalar_t r_var = 1.0 - l_var;
- // correct start channel
- ind = (ind + num_orientations) % num_orientations;
- // rotated channel
- int ind_rot = (o - ind + num_orientations) % num_orientations;
- int ind_rot_plus = (ind_rot + 1 + num_orientations) % num_orientations;
- const scalar_t *offset_bottom_data =
- bottom_data + (roi_batch_ind * channels * num_orientations +
- c * num_orientations + ind_rot) *
- height * width;
-
- const scalar_t *offset_bottom_data_plus =
- bottom_data + (roi_batch_ind * channels * num_orientations +
- c * num_orientations + ind_rot_plus) *
- height * width;
- // We use roi_bin_grid to sample the grid and mimic integral
- int roi_bin_grid_h = (num_samples > 0)
- ? num_samples
- : ceilf(roi_height / pooled_height); // e.g., = 2
- int roi_bin_grid_w =
- (num_samples > 0) ? num_samples : ceilf(roi_width / pooled_width);
-
- // roi_start_h and roi_start_w are computed wrt the center of RoI (x, y).
- // Appropriate translation needs to be applied after.
- if (clockwise) {
- theta = -theta; // If clockwise, the angle needs to be reversed.
- }
- scalar_t roi_start_h = -roi_height / 2.0;
- scalar_t roi_start_w = -roi_width / 2.0;
- scalar_t cosscalar_theta = cos(theta);
- scalar_t sinscalar_theta = sin(theta);
-
- // We do average (integral) pooling inside a bin
- const scalar_t count = max(roi_bin_grid_h * roi_bin_grid_w, 1); // e.g. = 4
-
- scalar_t output_val = 0.;
- for (int iy = 0; iy < roi_bin_grid_h; iy++) { // e.g., iy = 0, 1
- const scalar_t yy =
- roi_start_h + ph * bin_size_h +
- static_cast<scalar_t>(iy + .5f) * bin_size_h /
- static_cast<scalar_t>(roi_bin_grid_h);  // e.g., 0.5, 1.5
- for (int ix = 0; ix < roi_bin_grid_w; ix++) {
- const scalar_t xx = roi_start_w + pw * bin_size_w +
- static_cast<scalar_t>(ix + .5f) * bin_size_w /
- static_cast<scalar_t>(roi_bin_grid_w);
-
- // Rotate by theta (counterclockwise) around the center and translate
- scalar_t y = yy * cosscalar_theta - xx * sinscalar_theta + roi_center_h;
- scalar_t x = yy * sinscalar_theta + xx * cosscalar_theta + roi_center_w;
-
- scalar_t val = bilinear_interpolate<scalar_t>(
- offset_bottom_data, height, width, y, x, index);
- scalar_t val_plus = bilinear_interpolate<scalar_t>(
- offset_bottom_data_plus, height, width, y, x, index);
- output_val += r_var * val + l_var * val_plus;
- }
- }
- output_val /= count;
-
- top_data[index] = output_val;
- }
-}
-
-/*** Backward ***/
-template <typename scalar_t>
-__global__ void riroi_align_rotated_backward_cuda_kernel(
- const int nthreads, const scalar_t *top_diff, const scalar_t *bottom_rois,
- const scalar_t spatial_scale, const int num_samples, const bool clockwise,
- const int channels, const int height, const int width,
- const int pooled_height, const int pooled_width, const int num_orientations,
- scalar_t *bottom_diff) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- // (n, c, o, ph, pw) is an element in the pooled output
- int pw = index % pooled_width;
- int ph = (index / pooled_width) % pooled_height;
- int o = (index / pooled_width / pooled_height) % num_orientations;
- int c =
- (index / pooled_width / pooled_height / num_orientations) % channels;
- int n = index / pooled_width / pooled_height / num_orientations / channels;
-
- const scalar_t *offset_bottom_rois = bottom_rois + n * 6;
- int roi_batch_ind = offset_bottom_rois[0];
-
- // Do not round
- scalar_t roi_center_w = offset_bottom_rois[1] * spatial_scale;
- scalar_t roi_center_h = offset_bottom_rois[2] * spatial_scale;
- scalar_t roi_width = offset_bottom_rois[3] * spatial_scale;
- scalar_t roi_height = offset_bottom_rois[4] * spatial_scale;
- // scalar_t theta = offset_bottom_rois[5] * M_PI / 180.0;
- scalar_t theta = offset_bottom_rois[5];
- // Force malformed ROIs to be 1x1
- roi_width = max(roi_width, (scalar_t)1.);
- roi_height = max(roi_height, (scalar_t)1.);
-
- scalar_t bin_size_h = static_cast<scalar_t>(roi_height) /
- static_cast<scalar_t>(pooled_height);
- scalar_t bin_size_w =
- static_cast<scalar_t>(roi_width) / static_cast<scalar_t>(pooled_width);
-
- // find aligned index
- scalar_t ind_float = theta * num_orientations / (2 * M_PI);
- int ind = floorf(ind_float);
- scalar_t l_var = ind_float - (scalar_t)ind;
- scalar_t r_var = 1.0 - l_var;
- // correct start channel
- ind = (ind + num_orientations) % num_orientations;
- // rotated channel
- int ind_rot = (o - ind + num_orientations) % num_orientations;
- int ind_rot_plus = (ind_rot + 1 + num_orientations) % num_orientations;
- scalar_t *offset_bottom_diff =
- bottom_diff + (roi_batch_ind * channels * num_orientations +
- c * num_orientations + ind_rot) *
- height * width;
- scalar_t *offset_bottom_diff_plus =
- bottom_diff + (roi_batch_ind * channels * num_orientations +
- c * num_orientations + ind_rot_plus) *
- height * width;
- int top_offset =
- (n * channels * num_orientations + c * num_orientations + o) *
- pooled_height * pooled_width;
- const scalar_t *offset_top_diff = top_diff + top_offset;
- const scalar_t top_diff_this_bin = offset_top_diff[ph * pooled_width + pw];
-
- // We use roi_bin_grid to sample the grid and mimic integral
- int roi_bin_grid_h = (num_samples > 0)
- ? num_samples
- : ceilf(roi_height / pooled_height); // e.g., = 2
- int roi_bin_grid_w =
- (num_samples > 0) ? num_samples : ceilf(roi_width / pooled_width);
-
- // roi_start_h and roi_start_w are computed wrt the center of RoI (x, y).
- // Appropriate translation needs to be applied after.
- if (clockwise) {
- theta = -theta; // If clockwise, the angle needs to be reversed.
- }
- scalar_t roi_start_h = -roi_height / 2.0;
- scalar_t roi_start_w = -roi_width / 2.0;
- scalar_t cosTheta = cos(theta);
- scalar_t sinTheta = sin(theta);
-
- // We do average (integral) pooling inside a bin
- const scalar_t count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4
-
- for (int iy = 0; iy < roi_bin_grid_h; iy++) { // e.g., iy = 0, 1
- const scalar_t yy =
- roi_start_h + ph * bin_size_h +
- static_cast<scalar_t>(iy + .5f) * bin_size_h /
- static_cast<scalar_t>(roi_bin_grid_h); // e.g., 0.5, 1.5
- for (int ix = 0; ix < roi_bin_grid_w; ix++) {
- const scalar_t xx = roi_start_w + pw * bin_size_w +
- static_cast<scalar_t>(ix + .5f) * bin_size_w /
- static_cast<scalar_t>(roi_bin_grid_w);
-
- // Rotate by theta around the center and translate
- scalar_t y = yy * cosTheta - xx * sinTheta + roi_center_h;
- scalar_t x = yy * sinTheta + xx * cosTheta + roi_center_w;
-
- scalar_t w1, w2, w3, w4;
- int x_low, x_high, y_low, y_high;
-
- bilinear_interpolate_gradient<scalar_t>(height, width, y, x, w1, w2, w3,
- w4, x_low, x_high, y_low,
- y_high, index);
-
- scalar_t g1 = top_diff_this_bin * w1 / count;
- scalar_t g2 = top_diff_this_bin * w2 / count;
- scalar_t g3 = top_diff_this_bin * w3 / count;
- scalar_t g4 = top_diff_this_bin * w4 / count;
-
- if (x_low >= 0 && x_high >= 0 && y_low >= 0 && y_high >= 0) {
- atomicAdd(offset_bottom_diff + y_low * width + x_low, g1 * r_var);
- atomicAdd(offset_bottom_diff + y_low * width + x_high, g2 * r_var);
- atomicAdd(offset_bottom_diff + y_high * width + x_low, g3 * r_var);
- atomicAdd(offset_bottom_diff + y_high * width + x_high, g4 * r_var);
-
- atomicAdd(offset_bottom_diff_plus + y_low * width + x_low,
- g1 * l_var);
- atomicAdd(offset_bottom_diff_plus + y_low * width + x_high,
- g2 * l_var);
- atomicAdd(offset_bottom_diff_plus + y_high * width + x_low,
- g3 * l_var);
- atomicAdd(offset_bottom_diff_plus + y_high * width + x_high,
- g4 * l_var);
-
- } // if
- } // ix
- } // iy
- } // CUDA_1D_KERNEL_LOOP
-} // RiRoIAlignBackward
-
-#endif // RIROI_ALIGN_ROTATED_CUDA_KERNEL_CUH
diff --git a/mmcv/ops/csrc/common/cuda/roi_align_rotated_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/roi_align_rotated_cuda_kernel.cuh
index 8274dc50c709630c4ee456efd543aa1265049b41..33571f29674f53674415afe1bb4cc3ea0d8a9865 100644
--- a/mmcv/ops/csrc/common/cuda/roi_align_rotated_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/roi_align_rotated_cuda_kernel.cuh
@@ -20,7 +20,7 @@ template <typename scalar_t>
__global__ void roi_align_rotated_forward_cuda_kernel(
const int nthreads, const scalar_t *bottom_data,
const scalar_t *bottom_rois, const scalar_t spatial_scale,
- const int sampling_ratio, const bool aligned, const bool clockwise,
+ const int sample_num, const bool aligned, const bool clockwise,
const int channels, const int height, const int width,
const int pooled_height, const int pooled_width, scalar_t *top_data) {
CUDA_1D_KERNEL_LOOP(index, nthreads) {
@@ -58,11 +58,11 @@ __global__ void roi_align_rotated_forward_cuda_kernel(
bottom_data + (roi_batch_ind * channels + c) * height * width;
// We use roi_bin_grid to sample the grid and mimic integral
- int roi_bin_grid_h = (sampling_ratio > 0)
- ? sampling_ratio
+ int roi_bin_grid_h = (sample_num > 0)
+ ? sample_num
: ceilf(roi_height / pooled_height); // e.g., = 2
int roi_bin_grid_w =
- (sampling_ratio > 0) ? sampling_ratio : ceilf(roi_width / pooled_width);
+ (sample_num > 0) ? sample_num : ceilf(roi_width / pooled_width);
// roi_start_h and roi_start_w are computed wrt the center of RoI (x, y).
// Appropriate translation needs to be applied after.
@@ -104,7 +104,7 @@ __global__ void roi_align_rotated_forward_cuda_kernel(
template <typename scalar_t>
__global__ void roi_align_rotated_backward_cuda_kernel(
const int nthreads, const scalar_t *top_diff, const scalar_t *bottom_rois,
- const scalar_t spatial_scale, const int sampling_ratio, const bool aligned,
+ const scalar_t spatial_scale, const int sample_num, const bool aligned,
const bool clockwise, const int channels, const int height, const int width,
const int pooled_height, const int pooled_width, scalar_t *bottom_diff) {
CUDA_1D_KERNEL_LOOP(index, nthreads) {
@@ -146,11 +146,11 @@ __global__ void roi_align_rotated_backward_cuda_kernel(
const scalar_t top_diff_this_bin = offset_top_diff[ph * pooled_width + pw];
// We use roi_bin_grid to sample the grid and mimic integral
- int roi_bin_grid_h = (sampling_ratio > 0)
- ? sampling_ratio
+ int roi_bin_grid_h = (sample_num > 0)
+ ? sample_num
: ceilf(roi_height / pooled_height); // e.g., = 2
int roi_bin_grid_w =
- (sampling_ratio > 0) ? sampling_ratio : ceilf(roi_width / pooled_width);
+ (sample_num > 0) ? sample_num : ceilf(roi_width / pooled_width);
// roi_start_h and roi_start_w are computed wrt the center of RoI (x, y).
// Appropriate translation needs to be applied after.
diff --git a/mmcv/ops/csrc/common/cuda/roiaware_pool3d_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/roiaware_pool3d_cuda_kernel.cuh
index fc0aacf1435f8715fae92de535bf01bac07ac39a..3b95dc79080323a0b7d1d6bba06a3a46b04a3f05 100644
--- a/mmcv/ops/csrc/common/cuda/roiaware_pool3d_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/roiaware_pool3d_cuda_kernel.cuh
@@ -44,38 +44,37 @@ __global__ void generate_pts_mask_for_box3d(int boxes_num, int pts_num,
// coordinate params pts: (npoints, 3) [x, y, z] params pts_mask: (N,
// npoints): -1 means the point is not in this box; otherwise the (x_idxs,
// y_idxs, z_idxs) voxel coordinates are encoded bitwise
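+ // one thread per (point, box) pair: the x grid dim indexes points, blockIdx.y indexes boxes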
+ int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
int box_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(pt_idx, pts_num) {
- if (box_idx >= boxes_num) return;
+ if (pt_idx >= pts_num || box_idx >= boxes_num) return;
- pts += pt_idx * 3;
- rois += box_idx * 7;
- pts_mask += box_idx * pts_num + pt_idx;
+ pts += pt_idx * 3;
+ rois += box_idx * 7;
+ pts_mask += box_idx * pts_num + pt_idx;
- T local_x = 0, local_y = 0;
- int cur_in_flag = check_pt_in_box3d<T>(pts, rois, local_x, local_y);
+ T local_x = 0, local_y = 0;
+ int cur_in_flag = check_pt_in_box3d<T>(pts, rois, local_x, local_y);
- pts_mask[0] = -1;
- if (cur_in_flag > 0) {
- T local_z = pts[2] - rois[2];
- T x_size = rois[3], y_size = rois[4], z_size = rois[5];
+ pts_mask[0] = -1;
+ if (cur_in_flag > 0) {
+ T local_z = pts[2] - rois[2];
+ T x_size = rois[3], y_size = rois[4], z_size = rois[5];
- T x_res = x_size / out_x;
- T y_res = y_size / out_y;
- T z_res = z_size / out_z;
+ T x_res = x_size / out_x;
+ T y_res = y_size / out_y;
+ T z_res = z_size / out_z;
- unsigned int x_idx = int((local_x + x_size / 2) / x_res);
- unsigned int y_idx = int((local_y + y_size / 2) / y_res);
- unsigned int z_idx = int(local_z / z_res);
+ unsigned int x_idx = int((local_x + x_size / 2) / x_res);
+ unsigned int y_idx = int((local_y + y_size / 2) / y_res);
+ unsigned int z_idx = int(local_z / z_res);
- x_idx = min(max(x_idx, 0), out_x - 1);
- y_idx = min(max(y_idx, 0), out_y - 1);
- z_idx = min(max(z_idx, 0), out_z - 1);
+ x_idx = min(max(x_idx, 0), out_x - 1);
+ y_idx = min(max(y_idx, 0), out_y - 1);
+ z_idx = min(max(z_idx, 0), out_z - 1);
- unsigned int idx_encoding = (x_idx << 16) + (y_idx << 8) + z_idx;
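+ // pack the voxel coords into one int: x_idx in bits 16+, y_idx in bits 8-15, z_idx in bits 0-7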
+ unsigned int idx_encoding = (x_idx << 16) + (y_idx << 8) + z_idx;
- pts_mask[0] = idx_encoding;
- }
+ pts_mask[0] = idx_encoding;
}
}
@@ -87,24 +86,26 @@ __global__ void collect_inside_pts_for_box3d(int boxes_num, int pts_num,
T *pts_idx_of_voxels) {
// params pts_mask: (N, npoints) 0 or 1
// params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel)
- CUDA_1D_KERNEL_LOOP(box_idx, boxes_num) {
- int max_num_pts = max_pts_each_voxel - 1; // index 0 is the counter
- pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel;
-
- for (int k = 0; k < pts_num; k++) {
- if (pts_mask[box_idx * pts_num + k] != -1) {
- unsigned int idx_encoding = pts_mask[box_idx * pts_num + k];
- unsigned int x_idx = (idx_encoding >> 16) & 0xFF;
- unsigned int y_idx = (idx_encoding >> 8) & 0xFF;
- unsigned int z_idx = idx_encoding & 0xFF;
- unsigned int base_offset = x_idx * out_y * out_z * max_pts_each_voxel +
- y_idx * out_z * max_pts_each_voxel +
- z_idx * max_pts_each_voxel;
- unsigned int cnt = pts_idx_of_voxels[base_offset];
- if (cnt < max_num_pts) {
- pts_idx_of_voxels[base_offset + cnt + 1] = k;
- pts_idx_of_voxels[base_offset]++;
- }
+
+ int box_idx = blockIdx.x * blockDim.x + threadIdx.x;
+ if (box_idx >= boxes_num) return;
+
+ int max_num_pts = max_pts_each_voxel - 1; // index 0 is the counter
+ pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel;
+
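+ // scan every point of this box and append its index to the voxel it falls in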
+ for (int k = 0; k < pts_num; k++) {
+ if (pts_mask[box_idx * pts_num + k] != -1) {
+ unsigned int idx_encoding = pts_mask[box_idx * pts_num + k];
+ unsigned int x_idx = (idx_encoding >> 16) & 0xFF;
+ unsigned int y_idx = (idx_encoding >> 8) & 0xFF;
+ unsigned int z_idx = idx_encoding & 0xFF;
+ unsigned int base_offset = x_idx * out_y * out_z * max_pts_each_voxel +
+ y_idx * out_z * max_pts_each_voxel +
+ z_idx * max_pts_each_voxel;
+ unsigned int cnt = pts_idx_of_voxels[base_offset];
+ if (cnt < max_num_pts) {
+ pts_idx_of_voxels[base_offset + cnt + 1] = k;
+ pts_idx_of_voxels[base_offset]++;
}
}
}
@@ -123,38 +124,39 @@ __global__ void roiaware_maxpool3d(int boxes_num, int pts_num, int channels,
int box_idx = blockIdx.z;
int channel_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(voxel_idx_flat, out_x * out_y * out_z) {
- int x_idx = voxel_idx_flat / (out_y * out_z);
- int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;
- int z_idx = voxel_idx_flat % out_z;
- if (box_idx >= boxes_num || channel_idx >= channels) return;
-
- int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;
- pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel +
- offset_base * max_pts_each_voxel;
- pooled_features += box_idx * out_x * out_y * out_z * channels +
- offset_base * channels + channel_idx;
- argmax += box_idx * out_x * out_y * out_z * channels +
- offset_base * channels + channel_idx;
-
- int argmax_idx = -1;
- float max_val = -1e50;
-
- int total_pts = pts_idx_of_voxels[0];
-
- for (int k = 1; k <= total_pts; k++) {
- if (pts_feature[pts_idx_of_voxels[k] * channels + channel_idx] >
- max_val) {
- max_val = pts_feature[pts_idx_of_voxels[k] * channels + channel_idx];
- argmax_idx = pts_idx_of_voxels[k];
- }
+ int voxel_idx_flat = blockIdx.x * blockDim.x + threadIdx.x;
+
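+ // decompose the flat voxel index into (x, y, z) voxel coordinates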
+ int x_idx = voxel_idx_flat / (out_y * out_z);
+ int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;
+ int z_idx = voxel_idx_flat % out_z;
+ if (box_idx >= boxes_num || channel_idx >= channels || x_idx >= out_x ||
+ y_idx >= out_y || z_idx >= out_z)
+ return;
+
+ int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;
+ pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel +
+ offset_base * max_pts_each_voxel;
+ pooled_features += box_idx * out_x * out_y * out_z * channels +
+ offset_base * channels + channel_idx;
+ argmax += box_idx * out_x * out_y * out_z * channels +
+ offset_base * channels + channel_idx;
+
+ int argmax_idx = -1;
+ float max_val = -1e50;
+
+ int total_pts = pts_idx_of_voxels[0];
+
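+ // slot 0 stores the point count; slots 1..total_pts store the point indices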
+ for (int k = 1; k <= total_pts; k++) {
+ if (pts_feature[pts_idx_of_voxels[k] * channels + channel_idx] > max_val) {
+ max_val = pts_feature[pts_idx_of_voxels[k] * channels + channel_idx];
+ argmax_idx = pts_idx_of_voxels[k];
}
+ }
- if (argmax_idx != -1) {
- pooled_features[0] = max_val;
- }
- argmax[0] = argmax_idx;
+ if (argmax_idx != -1) {
+ pooled_features[0] = max_val;
}
+ argmax[0] = argmax_idx;
}
template <typename T>
@@ -170,28 +172,30 @@ __global__ void roiaware_avgpool3d(int boxes_num, int pts_num, int channels,
int box_idx = blockIdx.z;
int channel_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(voxel_idx_flat, out_x * out_y * out_z) {
- int x_idx = voxel_idx_flat / (out_y * out_z);
- int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;
- int z_idx = voxel_idx_flat % out_z;
- if (box_idx >= boxes_num || channel_idx >= channels) return;
-
- int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;
- pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel +
- offset_base * max_pts_each_voxel;
- pooled_features += box_idx * out_x * out_y * out_z * channels +
- offset_base * channels + channel_idx;
-
- float sum_val = 0;
- int total_pts = pts_idx_of_voxels[0];
-
- for (int k = 1; k <= total_pts; k++) {
- sum_val += pts_feature[pts_idx_of_voxels[k] * channels + channel_idx];
- }
+ int voxel_idx_flat = blockIdx.x * blockDim.x + threadIdx.x;
+
+ int x_idx = voxel_idx_flat / (out_y * out_z);
+ int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;
+ int z_idx = voxel_idx_flat % out_z;
+ if (box_idx >= boxes_num || channel_idx >= channels || x_idx >= out_x ||
+ y_idx >= out_y || z_idx >= out_z)
+ return;
+
+ int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;
+ pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel +
+ offset_base * max_pts_each_voxel;
+ pooled_features += box_idx * out_x * out_y * out_z * channels +
+ offset_base * channels + channel_idx;
+
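+ // average the features of all points collected in this voxel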
+ float sum_val = 0;
+ int total_pts = pts_idx_of_voxels[0];
+
+ for (int k = 1; k <= total_pts; k++) {
+ sum_val += pts_feature[pts_idx_of_voxels[k] * channels + channel_idx];
+ }
- if (total_pts > 0) {
- pooled_features[0] = sum_val / total_pts;
- }
+ if (total_pts > 0) {
+ pooled_features[0] = sum_val / total_pts;
}
}
@@ -206,22 +210,24 @@ __global__ void roiaware_maxpool3d_backward(int boxes_num, int channels,
int box_idx = blockIdx.z;
int channel_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(voxel_idx_flat, out_x * out_y * out_z) {
- int x_idx = voxel_idx_flat / (out_y * out_z);
- int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;
- int z_idx = voxel_idx_flat % out_z;
- if (box_idx >= boxes_num || channel_idx >= channels) return;
-
- int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;
- argmax += box_idx * out_x * out_y * out_z * channels +
+ int voxel_idx_flat = blockIdx.x * blockDim.x + threadIdx.x;
+
+ int x_idx = voxel_idx_flat / (out_y * out_z);
+ int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;
+ int z_idx = voxel_idx_flat % out_z;
+ if (box_idx >= boxes_num || channel_idx >= channels || x_idx >= out_x ||
+ y_idx >= out_y || z_idx >= out_z)
+ return;
+
+ int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;
+ argmax += box_idx * out_x * out_y * out_z * channels +
+ offset_base * channels + channel_idx;
+ grad_out += box_idx * out_x * out_y * out_z * channels +
offset_base * channels + channel_idx;
- grad_out += box_idx * out_x * out_y * out_z * channels +
- offset_base * channels + channel_idx;
- if (argmax[0] == -1) return;
+ if (argmax[0] == -1) return;
- atomicAdd(grad_in + argmax[0] * channels + channel_idx, grad_out[0] * 1);
- }
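+ // route the gradient back to the argmax point; atomicAdd since several voxels may pick the same point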
+ atomicAdd(grad_in + argmax[0] * channels + channel_idx, grad_out[0] * 1);
}
template <typename T>
@@ -236,24 +242,26 @@ __global__ void roiaware_avgpool3d_backward(int boxes_num, int channels,
int box_idx = blockIdx.z;
int channel_idx = blockIdx.y;
- CUDA_1D_KERNEL_LOOP(voxel_idx_flat, out_x * out_y * out_z) {
- int x_idx = voxel_idx_flat / (out_y * out_z);
- int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;
- int z_idx = voxel_idx_flat % out_z;
- if (box_idx >= boxes_num || channel_idx >= channels) return;
-
- int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;
- pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel +
- offset_base * max_pts_each_voxel;
- grad_out += box_idx * out_x * out_y * out_z * channels +
- offset_base * channels + channel_idx;
-
- int total_pts = pts_idx_of_voxels[0];
- float cur_grad = 1 / fmaxf(float(total_pts), 1.0);
- for (int k = 1; k <= total_pts; k++) {
- atomicAdd(grad_in + pts_idx_of_voxels[k] * channels + channel_idx,
- grad_out[0] * cur_grad);
- }
+ int voxel_idx_flat = blockIdx.x * blockDim.x + threadIdx.x;
+
+ int x_idx = voxel_idx_flat / (out_y * out_z);
+ int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;
+ int z_idx = voxel_idx_flat % out_z;
+ if (box_idx >= boxes_num || channel_idx >= channels || x_idx >= out_x ||
+ y_idx >= out_y || z_idx >= out_z)
+ return;
+
+ int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;
+ pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel +
+ offset_base * max_pts_each_voxel;
+ grad_out += box_idx * out_x * out_y * out_z * channels +
+ offset_base * channels + channel_idx;
+
+ int total_pts = pts_idx_of_voxels[0];
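+ // each contributing point receives an equal share of the voxel gradient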
+ float cur_grad = 1 / fmaxf(float(total_pts), 1.0);
+ for (int k = 1; k <= total_pts; k++) {
+ atomicAdd(grad_in + pts_idx_of_voxels[k] * channels + channel_idx,
+ grad_out[0] * cur_grad);
}
}
diff --git a/mmcv/ops/csrc/common/cuda/roipoint_pool3d_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/roipoint_pool3d_cuda_kernel.cuh
index 545f6ffa09d4a6cae49f1f1e68c191c1fd54de68..7597719e69098ca4942c803e9853556daaa3b375 100644
--- a/mmcv/ops/csrc/common/cuda/roipoint_pool3d_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/roipoint_pool3d_cuda_kernel.cuh
@@ -42,23 +42,23 @@ __global__ void assign_pts_to_box3d(int batch_size, int pts_num, int boxes_num,
// params boxes3d: (B, M, 7)
// params pts_assign: (B, N, M): idx of the corresponding box3d, -1 means
// background points
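+ // one thread per (point, box, batch) triple: x dim -> points, blockIdx.y -> boxes, blockIdx.z -> batch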
+ int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
int box_idx = blockIdx.y;
int bs_idx = blockIdx.z;
- CUDA_1D_KERNEL_LOOP(pt_idx, pts_num) {
- if (box_idx >= boxes_num || bs_idx >= batch_size) return;
- int assign_idx =
- bs_idx * pts_num * boxes_num + pt_idx * boxes_num + box_idx;
- pts_assign[assign_idx] = 0;
+ if (pt_idx >= pts_num || box_idx >= boxes_num || bs_idx >= batch_size) {
+ return;
+ }
+ int assign_idx = bs_idx * pts_num * boxes_num + pt_idx * boxes_num + box_idx;
+ pts_assign[assign_idx] = 0;
- int box_offset = bs_idx * boxes_num * 7 + box_idx * 7;
- int pt_offset = bs_idx * pts_num * 3 + pt_idx * 3;
+ int box_offset = bs_idx * boxes_num * 7 + box_idx * 7;
+ int pt_offset = bs_idx * pts_num * 3 + pt_idx * 3;
- T local_x = 0, local_y = 0;
- int cur_in_flag = check_pt_in_box3d<T>(xyz + pt_offset, boxes3d + box_offset,
- local_x, local_y);
- pts_assign[assign_idx] = cur_in_flag;
- }
+ T local_x = 0, local_y = 0;
+ int cur_in_flag = check_pt_in_box3d<T>(xyz + pt_offset, boxes3d + box_offset,
+ local_x, local_y);
+ pts_assign[assign_idx] = cur_in_flag;
}
__global__ void get_pooled_idx(int batch_size, int pts_num, int boxes_num,
@@ -69,32 +69,35 @@ __global__ void get_pooled_idx(int batch_size, int pts_num, int boxes_num,
// params pts_assign: (B, N)
// params pts_idx: (B, M, 512)
// params pooled_empty_flag: (B, M)
- CUDA_1D_KERNEL_LOOP(boxes_idx, boxes_num) {
- int bs_idx = blockIdx.y;
-
- int cnt = 0;
- for (int k = 0; k < pts_num; k++) {
- if (pts_assign[bs_idx * pts_num * boxes_num + k * boxes_num +
- boxes_idx]) {
- if (cnt < sampled_pts_num) {
- pts_idx[bs_idx * boxes_num * sampled_pts_num +
- boxes_idx * sampled_pts_num + cnt] = k;
- cnt++;
- } else
- break;
- }
+
+ int boxes_idx = blockIdx.x * blockDim.x + threadIdx.x;
+ if (boxes_idx >= boxes_num) {
+ return;
+ }
+
+ int bs_idx = blockIdx.y;
+
+ int cnt = 0;
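+ // collect up to sampled_pts_num indices of points that fall inside this box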
+ for (int k = 0; k < pts_num; k++) {
+ if (pts_assign[bs_idx * pts_num * boxes_num + k * boxes_num + boxes_idx]) {
+ if (cnt < sampled_pts_num) {
+ pts_idx[bs_idx * boxes_num * sampled_pts_num +
+ boxes_idx * sampled_pts_num + cnt] = k;
+ cnt++;
+ } else
+ break;
}
+ }
- if (cnt == 0) {
- pooled_empty_flag[bs_idx * boxes_num + boxes_idx] = 1;
- } else if (cnt < sampled_pts_num) {
- // duplicate same points for sampling
- for (int k = cnt; k < sampled_pts_num; k++) {
- int duplicate_idx = k % cnt;
- int base_offset =
- bs_idx * boxes_num * sampled_pts_num + boxes_idx * sampled_pts_num;
- pts_idx[base_offset + k] = pts_idx[base_offset + duplicate_idx];
- }
+ if (cnt == 0) {
+ pooled_empty_flag[bs_idx * boxes_num + boxes_idx] = 1;
+ } else if (cnt < sampled_pts_num) {
+ // duplicate same points for sampling
+ for (int k = cnt; k < sampled_pts_num; k++) {
+ int duplicate_idx = k % cnt;
+ int base_offset =
+ bs_idx * boxes_num * sampled_pts_num + boxes_idx * sampled_pts_num;
+ pts_idx[base_offset + k] = pts_idx[base_offset + duplicate_idx];
}
}
}
@@ -109,26 +112,33 @@ __global__ void roipoint_pool3d_forward(
// params pts_feature: (B, N, C)
// params pooled_features: (B, M, 512, 3+C)
// params pooled_empty_flag: (B, M)
+
+ int sample_pt_idx = blockIdx.x * blockDim.x + threadIdx.x;
int box_idx = blockIdx.y;
int bs_idx = blockIdx.z;
- CUDA_1D_KERNEL_LOOP(sample_pt_idx, sampled_pts_num) {
- if (box_idx >= boxes_num || bs_idx >= batch_size) return;
- if (pooled_empty_flag[bs_idx * boxes_num + box_idx]) return;
-
- int temp_idx = bs_idx * boxes_num * sampled_pts_num +
- box_idx * sampled_pts_num + sample_pt_idx;
- int src_pt_idx = pts_idx[temp_idx];
- int dst_feature_offset = temp_idx * (3 + feature_in_len);
-
- for (int j = 0; j < 3; j++)
- pooled_features[dst_feature_offset + j] =
- xyz[bs_idx * pts_num * 3 + src_pt_idx * 3 + j];
-
- int src_feature_offset =
- bs_idx * pts_num * feature_in_len + src_pt_idx * feature_in_len;
- memcpy(pooled_features + dst_feature_offset + 3,
- pts_feature + src_feature_offset, feature_in_len * sizeof(T));
+
+ if (sample_pt_idx >= sampled_pts_num || box_idx >= boxes_num ||
+ bs_idx >= batch_size) {
+ return;
+ }
+
+ if (pooled_empty_flag[bs_idx * boxes_num + box_idx]) {
+ return;
}
+
+ int temp_idx = bs_idx * boxes_num * sampled_pts_num +
+ box_idx * sampled_pts_num + sample_pt_idx;
+ int src_pt_idx = pts_idx[temp_idx];
+ int dst_feature_offset = temp_idx * (3 + feature_in_len);
+
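+ // output layout per sampled point: 3 xyz coordinates followed by the C-dim features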
+ for (int j = 0; j < 3; j++)
+ pooled_features[dst_feature_offset + j] =
+ xyz[bs_idx * pts_num * 3 + src_pt_idx * 3 + j];
+
+ int src_feature_offset =
+ bs_idx * pts_num * feature_in_len + src_pt_idx * feature_in_len;
+ memcpy(pooled_features + dst_feature_offset + 3,
+ pts_feature + src_feature_offset, feature_in_len * sizeof(T));
}
#endif // ROIPOINT_POOL3D_CUDA_KERNEL_CUH
diff --git a/mmcv/ops/csrc/common/cuda/rotated_feature_align_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/rotated_feature_align_cuda_kernel.cuh
deleted file mode 100644
index ffcc658ccb1f5e3059c0428159bc2e80fbeee3d4..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/rotated_feature_align_cuda_kernel.cuh
+++ /dev/null
@@ -1,129 +0,0 @@
-// Copyright (c) OpenMMLab. All rights reserved.
-// Modified from
-// https://github.com/SJTU-Thinklab-Det/r3det-on-mmdetection/blob/master/mmdet/ops/fr/src/feature_refine_kernel.cu
-#ifndef ROTATED_FEATURE_ALIGN_CUDA_KERNEL_CUH
-#define ROTATED_FEATURE_ALIGN_CUDA_KERNEL_CUH
-
-#ifdef MMCV_USE_PARROTS
-#include "parrots_cuda_helper.hpp"
-#else
-#include "pytorch_cuda_helper.hpp"
-#endif
-
-template <typename scalar_t>
-__global__ void rotated_feature_align_forward_kernel(
- const int nthreads, const int points, const scalar_t* bottom_data,
- const scalar_t* best_bboxes, const scalar_t spatial_scale,
- const int channels, const int height, const int width, scalar_t* top_data) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- int w = index % width;
- int h = (index / width) % height;
- int c = (index / width / height) % channels;
- int n = index / width / height / channels;
-
- const scalar_t* bbox_offset =
- best_bboxes + ((n * height + h) * width + w) * 5;
- scalar_t roi_y = bbox_offset[0] * spatial_scale;
- scalar_t roi_x = bbox_offset[1] * spatial_scale;
-
- scalar_t px[5] = {roi_x, 0, 0, 0, 0};
- scalar_t py[5] = {roi_y, 0, 0, 0, 0};
-
- if (points > 1) {
- scalar_t roi_w = bbox_offset[2] * spatial_scale;
- scalar_t roi_h = bbox_offset[3] * spatial_scale;
- scalar_t roi_a = bbox_offset[4];
-
- scalar_t w_2 = roi_w / 2, h_2 = roi_h / 2;
- scalar_t cosa = cosf(roi_a), sina = sinf(roi_a);
- scalar_t wx = cosa * w_2, wy = sina * w_2;
- scalar_t hx = -sina * h_2, hy = cosa * h_2;
-
- px[1] = roi_x + wx + hx;
- py[1] = roi_y + wy + hy;
- px[2] = roi_x - wx + hx;
- py[2] = roi_y - wy + hy;
- px[3] = roi_x - wx - hx;
- py[3] = roi_y - wy - hy;
- px[4] = roi_x + wx - hx;
- py[4] = roi_y + wy - hy;
- }
-
- const scalar_t* offset_bottom_data =
- bottom_data + (n * channels + c) * height * width;
-
- scalar_t output_val = bottom_data[index];
- for (int i = 0; i < points; i++) {
- output_val += bilinear_interpolate<scalar_t>(offset_bottom_data, height,
- width, py[i], px[i], i);
- }
- top_data[index] = output_val;
- }
-}
-
-template <typename scalar_t>
-__global__ void rotated_feature_align_backward_kernel(
- const int nthreads, const int points, const scalar_t* top_diff,
- const scalar_t* best_bboxes, const scalar_t spatial_scale,
- const int channels, const int height, const int width,
- scalar_t* bottom_diff) {
- CUDA_1D_KERNEL_LOOP(index, nthreads) {
- int w = index % width;
- int h = (index / width) % height;
- int c = (index / width / height) % channels;
- int n = index / width / height / channels;
-
- const scalar_t* bbox_offset =
- best_bboxes + ((n * height + h) * width + w) * 5;
- scalar_t roi_y = bbox_offset[0] * spatial_scale;
- scalar_t roi_x = bbox_offset[1] * spatial_scale;
-
- scalar_t px[5] = {roi_x, 0, 0, 0, 0};
- scalar_t py[5] = {roi_y, 0, 0, 0, 0};
-
- if (points > 1) {
- scalar_t roi_w = bbox_offset[2] * spatial_scale;
- scalar_t roi_h = bbox_offset[3] * spatial_scale;
- scalar_t roi_a = bbox_offset[4];
-
- scalar_t w_2 = roi_w / 2, h_2 = roi_h / 2;
- scalar_t cosa = cosf(roi_a), sina = sinf(roi_a);
- scalar_t wx = cosa * w_2, wy = sina * w_2;
- scalar_t hx = -sina * h_2, hy = cosa * h_2;
-
- px[1] = roi_x + wx + hx;
- py[1] = roi_y + wy + hy;
- px[2] = roi_x - wx + hx;
- py[2] = roi_y - wy + hy;
- px[3] = roi_x - wx - hx;
- py[3] = roi_y - wy - hy;
- px[4] = roi_x + wx - hx;
- py[4] = roi_y + wy - hy;
- }
-
- scalar_t* offset_bottom_diff =
- bottom_diff + (n * channels + c) * height * width;
- scalar_t value_top_diff = top_diff[index];
-
- atomicAdd(bottom_diff + index, value_top_diff);
- for (int i = 0; i < points; i++) {
- scalar_t w1, w2, w3, w4;
- int x_low, x_high, y_low, y_high;
-
- bilinear_interpolate_gradient<scalar_t>(height, width, py[i], px[i], w1,
- w2, w3, w4, x_low, x_high, y_low,
- y_high, i);
- scalar_t g1 = value_top_diff * w1;
- scalar_t g2 = value_top_diff * w2;
- scalar_t g3 = value_top_diff * w3;
- scalar_t g4 = value_top_diff * w4;
- if (x_low >= 0 && x_high >= 0 && y_low >= 0 && y_high >= 0) {
- atomicAdd(offset_bottom_diff + y_low * width + x_low, g1);
- atomicAdd(offset_bottom_diff + y_low * width + x_high, g2);
- atomicAdd(offset_bottom_diff + y_high * width + x_low, g3);
- atomicAdd(offset_bottom_diff + y_high * width + x_high, g4);
- }
- }
- }
-}
-#endif // ROTATED_FEATURE_ALIGN_CUDA_KERNEL_CUH
diff --git a/mmcv/ops/csrc/common/cuda/scatter_points_cuda_kernel.cuh b/mmcv/ops/csrc/common/cuda/scatter_points_cuda_kernel.cuh
index af5b9f67b12060ae5dfa52738dba52c8fe674105..7f9c40202fd4a6b4a43e4359e50c68cdb77d335f 100644
--- a/mmcv/ops/csrc/common/cuda/scatter_points_cuda_kernel.cuh
+++ b/mmcv/ops/csrc/common/cuda/scatter_points_cuda_kernel.cuh
@@ -34,7 +34,7 @@ __device__ __forceinline__ static void reduceMax(double *address, double val) {
}
// get rid of meaningless warnings when compiling host code
-#ifdef MMCV_WITH_HIP
+#ifdef HIP_DIFF
__device__ __forceinline__ static void reduceAdd(float *address, float val) {
atomicAdd(address, val);
}
@@ -86,7 +86,7 @@ __device__ __forceinline__ static void reduceAdd(double *address, double val) {
#endif
}
#endif // __CUDA_ARCH__
-#endif // MMCV_WITH_HIP
+#endif // HIP_DIFF
template <typename T, typename T_int>
__global__ void feats_reduce_kernel(
diff --git a/mmcv/ops/csrc/common/cuda/spconv/indice.cuh b/mmcv/ops/csrc/common/cuda/spconv/indice.cuh
deleted file mode 100644
index 5ef0009a10f8effeb447e398cff5103b400056de..0000000000000000000000000000000000000000
--- a/mmcv/ops/csrc/common/cuda/spconv/indice.cuh
+++ /dev/null
@@ -1,236 +0,0 @@
-// Copyright 2019 Yan Yan
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-
-#ifndef INDICE_CU_H_
-#define INDICE_CU_H_
-#include
-#include
-
-#include
-
-template <typename Index, typename IndexGrid, unsigned NDim,
- int KernelMaxVolume = 256>
-__global__ void prepareIndicePairsKernel(
- tv::TensorView<const Index> indicesIn, tv::TensorView<Index> indicesOut,
- tv::TensorView<IndexGrid> gridsOut, tv::TensorView<Index> indicePairs,
- tv::TensorView<Index> indiceNum, tv::TensorView<Index> indicePairUnique,
- const tv::SimpleVector<Index, NDim> kernelSize,
- const tv::SimpleVector<Index, NDim> stride,
- const tv::SimpleVector<Index, NDim> padding,
- const tv::SimpleVector<Index, NDim> dilation,
- const tv::SimpleVector<Index, NDim> outSpatialShape) {
- auto numActIn = indicesIn.dim(0);
- Index spatialVolume = 1;
-#pragma unroll
- for (int i = 0; i < NDim; ++i) {
- spatialVolume *= outSpatialShape[i];
- }
- Index kernelVolume = 1;
-#pragma unroll
- for (int i = 0; i < NDim; ++i) {
- kernelVolume *= kernelSize[i];
- }
- Index numValidPoints = 0;
- Index validPoints[KernelMaxVolume * (NDim + 1)];
- Index *pointPtr = nullptr;
- auto indicePairsDim2 = indicePairs.dim(2);
- Index index;
- for (int ix : tv::KernelLoopX<int>(numActIn)) {
- numValidPoints = getValidOutPos<Index, NDim>(
- indicesIn.data() + ix * (NDim + 1) + 1, kernelSize.data(),
- stride.data(), padding.data(), dilation.data(), outSpatialShape.data(),
- validPoints);
- for (Index i = 0; i < numValidPoints; ++i) {
- pointPtr = validPoints + i * (NDim + 1);
- auto offset = pointPtr[NDim];
- auto oldNum = atomicAdd(indiceNum.data() + offset, Index(1));
- indicePairs(offset, 0, oldNum) = ix;
- index = tv::rowArrayIdx<Index, NDim>(pointPtr, outSpatialShape.data()) +
- spatialVolume * indicesIn(ix, 0);
- indicePairs(offset, 1, oldNum) = index;
- indicePairUnique[offset * indicePairsDim2 + oldNum] = index;
- }
- }
-}
-
-template <typename Index, typename IndexGrid, unsigned NDim,
- int KernelMaxVolume = 256>
-__global__ void prepareDeConvIndicePairsKernel(
- tv::TensorView<const Index> indicesIn, tv::TensorView<Index> indicesOut,
- tv::TensorView<IndexGrid> gridsOut, tv::TensorView<Index> indicePairs,
- tv::TensorView<Index> indiceNum, tv::TensorView<Index> indicePairUnique,
- const tv::SimpleVector<Index, NDim> kernelSize,
- const tv::SimpleVector<Index, NDim> stride,
- const tv::SimpleVector<Index, NDim> padding,
- const tv::SimpleVector<Index, NDim> dilation,
- const tv::SimpleVector<Index, NDim> outSpatialShape) {
- auto numActIn = indicesIn.dim(0);
- Index spatialVolume = 1;
-#pragma unroll
- for (int i = 0; i < NDim; ++i) {
- spatialVolume *= outSpatialShape[i];
- }
- Index kernelVolume = 1;
-#pragma unroll
- for (int i = 0; i < NDim; ++i) {
- kernelVolume *= kernelSize[i];
- }
- Index numValidPoints = 0;
- Index validPoints[KernelMaxVolume * (NDim + 1)];
- Index *pointPtr = nullptr;
- auto indicePairsDim2 = indicePairs.dim(2);
- Index index;
- for (int ix : tv::KernelLoopX<int>(numActIn)) {
- numValidPoints = getValidOutPosTranspose<Index, NDim>(
- indicesIn.data() + ix * (NDim + 1) + 1, kernelSize.data(),
- stride.data(), padding.data(), dilation.data(), outSpatialShape.data(),
- validPoints);
- for (Index i = 0; i < numValidPoints; ++i) {
- pointPtr = validPoints + i * (NDim + 1);
- auto offset = pointPtr[NDim];
- auto oldNum = atomicAdd(indiceNum.data() + offset, Index(1));
- indicePairs(offset, 0, oldNum) = ix;
- index = tv::rowArrayIdx<Index, NDim>(pointPtr, outSpatialShape.data()) +
- spatialVolume * indicesIn(ix, 0);
- indicePairs(offset, 1, oldNum) = index;
- indicePairUnique[offset * indicePairsDim2 + oldNum] = index;
- }
- }
-}
-
-template <typename Index, typename IndexGrid, unsigned NDim>
-__global__ void assignGridAndIndiceOutKernel(
- tv::TensorView<Index> indicesOut, tv::TensorView<IndexGrid> gridsOut,
- int numAct, tv::TensorView<Index> indicePairs,
- tv::TensorView<Index> indicePairUnique,
- const tv::SimpleVector<Index, NDim> outSpatialShape, int batchSize) {
- Index index;
- auto indicesOutPtr = indicesOut.data();
- for (int ix : tv::KernelLoopX<int>(numAct)) {
- index = indicePairUnique[ix];
- gridsOut[index] = ix;
- index = tv::rowArrayIdxInv<Index, NDim>(
- index, indicesOutPtr + ix * (NDim + 1) + 1, outSpatialShape.data());
- indicesOut[ix * (NDim + 1)] = index % batchSize;
- }
-}
-
-template <typename Index, typename IndexGrid, unsigned NDim>
-__global__ void assignIndicePairsKernel(
- tv::TensorView<Index> indicesOut, tv::TensorView<IndexGrid> gridsOut,
- int numActIn, tv::TensorView<Index> indicePairs,
- tv::TensorView<Index> indicePairUnique,
- const tv::SimpleVector<Index, NDim> outSpatialShape) {
- Index index;
- int kernelVolume = indicePairs.dim(0);
- for (int ix : tv::KernelLoopX<int>(numActIn)) {
- for (int i = 0; i < kernelVolume; ++i) {
- index = indicePairs(i, 1, ix);
- if (index > -1) {
- indicePairs(i, 1, ix) = gridsOut[index];
- }
- }
- }
-}
-
-template <typename Index, typename IndexGrid, unsigned NDim>
-__global__ void prepareSubMGridKernel(
- tv::TensorView<const Index> indicesIn, tv::TensorView<IndexGrid> gridsOut,
- const tv::SimpleVector<Index, NDim> outSpatialShape) {
- auto numActIn = indicesIn.dim(0);
- Index spatialVolume = 1;
-#pragma unroll
- for (int i = 0; i < NDim; ++i) {
- spatialVolume *= outSpatialShape[i];
- }
- Index index = 0;
- for (int ix : tv::KernelLoopX<int>(numActIn)) {
- index = tv::rowArrayIdx<Index, NDim>(indicesIn.data() + ix * (NDim + 1) + 1,
- outSpatialShape.data()) +
- spatialVolume * indicesIn(ix, 0);
- gridsOut[index] = ix;
- }
-}
-
-template <typename Index, typename IndexGrid, unsigned NDim,
- int KernelMaxVolume = 256>
-__global__ void getSubMIndicePairsKernel(
- tv::TensorView<const Index> indicesIn, tv::TensorView<IndexGrid> gridsOut,
- tv::TensorView<Index> indicePairs, tv::TensorView<Index> indiceNum,
- const tv::SimpleVector<Index, NDim> kernelSize,
- const tv::SimpleVector<Index, NDim> stride,
- const tv::SimpleVector<Index, NDim> padding,
- const tv::SimpleVector<Index, NDim> dilation,
- const tv::SimpleVector<Index, NDim> outSpatialShape) {
- auto numActIn = indicesIn.dim(0);
- Index spatialVolume = 1;
-#pragma unroll
- for (int i = 0; i < NDim; ++i) {
- spatialVolume *= outSpatialShape[i];
- }
- Index numValidPoints = 0;
- Index validPoints[KernelMaxVolume * (NDim + 1)];
- Index *pointPtr = nullptr;
- Index index = 0;
- for (int ix : tv::KernelLoopX