## Convolutional Neural Networks
We provide some building blocks for convolutional neural networks, including layer building, module bundles, and weight initialization.
### Building layers
When running experiments, we may need to try layers of the same type but with different configurations, without modifying the code each time. We therefore provide some layer-building methods that construct layers from a dict, which can be written in a config file or specified via command-line arguments.
#### Usage
A simple example:
```python
cfg = dict(type='Conv3d')
layer = build_conv_layer(cfg, in_channels=3, out_channels=8, kernel_size=3)
```
- `build_conv_layer`: supported types are Conv1d, Conv2d, Conv3d and Conv (Conv is an alias for Conv2d)
- `build_norm_layer`: supported types are BN1d, BN2d, BN3d, BN (alias for BN2d), SyncBN, GN, LN, IN1d, IN2d, IN3d and IN (alias for IN2d)
- `build_activation_layer`: supported types are ReLU, LeakyReLU, PReLU, RReLU, ReLU6, ELU, Sigmoid, Tanh and GELU
- `build_upsample_layer`: supported types are nearest, bilinear, deconv and pixel_shuffle
- `build_padding_layer`: supported types are zero, reflect and replicate
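All of these build methods share one dispatch pattern: pop the `type` key from the cfg dict, look the corresponding class up, and instantiate it with the remaining keys plus any positional arguments. A minimal pure-Python sketch of that pattern (`CONV_LAYERS`, `register_conv` and `FakeConv2d` here are illustrative stand-ins, not mmcv's real registry):

```python
# Hypothetical stand-in for mmcv's internal registry; the real
# build_conv_layer consults a registry populated with nn.Conv1d/2d/3d.
CONV_LAYERS = {}

def register_conv(name):
    def _wrap(cls):
        CONV_LAYERS[name] = cls
        return cls
    return _wrap

@register_conv('Conv2d')
class FakeConv2d:

    def __init__(self, in_channels, out_channels, kernel_size):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size

def build_conv_layer(cfg, *args, **kwargs):
    # pop 'type', look the class up, and forward the remaining keys
    cfg_ = dict(cfg)
    layer_type = cfg_.pop('type')
    if layer_type not in CONV_LAYERS:
        raise KeyError(f'Unrecognized layer type {layer_type}')
    return CONV_LAYERS[layer_type](*args, **kwargs, **cfg_)

layer = build_conv_layer(dict(type='Conv2d'), 3, 8, kernel_size=3)
```

Because the cfg is copied before `type` is popped, the same cfg dict can be reused to build several layers.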
#### Extension
We also allow extending the build methods with custom layers and operators.
1. Write and register your own module:
```python
from mmcv.cnn import UPSAMPLE_LAYERS

@UPSAMPLE_LAYERS.register_module()
class MyUpsample:

    def __init__(self, scale_factor):
        pass

    def forward(self, x):
        pass
```
2. Import `MyUpsample` somewhere (e.g. in `__init__.py`) and then use it:
```python
cfg = dict(type='MyUpsample', scale_factor=2)
layer = build_upsample_layer(cfg)
```
### Module bundles
We also provide common module bundles to facilitate network construction.
`ConvModule` is a bundle of convolution, normalization and activation layers; please refer to the [ConvModule api](api.html#mmcv.cnn.ConvModule) for more details.
```python
# conv + bn + relu
conv = ConvModule(3, 8, 2, norm_cfg=dict(type='BN'))
# conv + gn + relu
conv = ConvModule(3, 8, 2, norm_cfg=dict(type='GN', num_groups=2))
# conv + relu
conv = ConvModule(3, 8, 2)
# conv
conv = ConvModule(3, 8, 2, act_cfg=None)
# conv + leaky relu
conv = ConvModule(3, 8, 3, padding=1, act_cfg=dict(type='LeakyReLU'))
# bn + conv + relu
conv = ConvModule(
    3, 8, 2, norm_cfg=dict(type='BN'), order=('norm', 'conv', 'act'))
```
### Weight initialization
> Implementation details can be found in [mmcv/cnn/utils/weight_init.py](../../mmcv/cnn/utils/weight_init.py)
During training, a proper initialization strategy is beneficial for speeding up training or obtaining higher performance. In MMCV, we provide some commonly used methods for initializing modules like `nn.Conv2d`, as well as high-level APIs for initializing models containing one or more modules.
#### Initialization functions
Initialize an `nn.Module`, such as `nn.Conv2d` or `nn.Linear`, in a functional way.
We provide the following initialization methods.
- constant_init
Initialize module parameters with constant values.
```python
>>> import torch.nn as nn
>>> from mmcv.cnn import constant_init
>>> conv1 = nn.Conv2d(3, 3, 1)
>>> # constant_init(module, val, bias=0)
>>> constant_init(conv1, 1, 0)
>>> conv1.weight
```
- xavier_init
Initialize module parameters as described in [Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010)](http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf).
```python
>>> import torch.nn as nn
>>> from mmcv.cnn import xavier_init
>>> conv1 = nn.Conv2d(3, 3, 1)
>>> # xavier_init(module, gain=1, bias=0, distribution='normal')
>>> xavier_init(conv1, distribution='normal')
```
- normal_init
Initialize module parameters with values drawn from a normal (Gaussian) distribution.
```python
>>> import torch.nn as nn
>>> from mmcv.cnn import normal_init
>>> conv1 = nn.Conv2d(3, 3, 1)
>>> # normal_init(module, mean=0, std=1, bias=0)
>>> normal_init(conv1, std=0.01, bias=0)
```
- uniform_init
Initialize module parameters with values drawn from a uniform distribution.
```python
>>> import torch.nn as nn
>>> from mmcv.cnn import uniform_init
>>> conv1 = nn.Conv2d(3, 3, 1)
>>> # uniform_init(module, a=0, b=1, bias=0)
>>> uniform_init(conv1, a=0, b=1)
```
- kaiming_init
Initialize module parameters as described in [Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015)](https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf).
```python
>>> import torch.nn as nn
>>> from mmcv.cnn import kaiming_init
>>> conv1 = nn.Conv2d(3, 3, 1)
>>> # kaiming_init(module, a=0, mode='fan_out', nonlinearity='relu', bias=0, distribution='normal')
>>> kaiming_init(conv1)
```
- caffe2_xavier_init
The `xavier initialization` implemented in Caffe2, which corresponds to `kaiming_uniform_` in PyTorch.
```python
>>> import torch.nn as nn
>>> from mmcv.cnn import caffe2_xavier_init
>>> conv1 = nn.Conv2d(3, 3, 1)
>>> # caffe2_xavier_init(module, bias=0)
>>> caffe2_xavier_init(conv1)
```
- bias_init_with_prob
Initialize the bias of a conv/fc layer according to a given probability, as proposed in [Focal Loss for Dense Object Detection](https://arxiv.org/pdf/1708.02002.pdf).
```python
>>> from mmcv.cnn import bias_init_with_prob
>>> # bias_init_with_prob is proposed in Focal Loss
>>> bias = bias_init_with_prob(0.01)
>>> bias
-4.59511985013459
```
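The value returned above follows from solving `sigmoid(b) = p` for the bias `b`, which gives `b = -log((1 - p) / p)`. A self-contained re-derivation of the computation:

```python
import math

def bias_init_with_prob(prior_prob):
    # bias b such that sigmoid(b) equals the given prior probability:
    # sigmoid(b) = p  =>  b = -log((1 - p) / p)
    return float(-math.log((1 - prior_prob) / prior_prob))

bias = bias_init_with_prob(0.01)
print(bias)  # ≈ -4.59511985013459
```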
#### Initializers and configs
On top of the initialization functions, we define the corresponding initialization classes and register them in `INITIALIZERS`, so that models can be initialized from a `config`.
We provide the following initialization classes:
- ConstantInit
- XavierInit
- NormalInit
- UniformInit
- KaimingInit
- Caffe2XavierInit
- PretrainedInit
Next we describe the usage of `initialize` in detail.
1. Initialize a model with the `layer` key
If we only define the `layer` key, only the layers listed in `layer` are initialized.
Note: the modules supported by the `layer` key are PyTorch modules with `weight` and `bias` attributes, so the `MultiheadAttention` layer is not supported.
- Define a list for the `layer` key and initialize those modules with the same configuration.
```python
import torch.nn as nn
from mmcv.cnn import initialize

class FooNet(nn.Module):

    def __init__(self):
        super().__init__()
        self.feat = nn.Conv1d(3, 1, 3)
        self.reg = nn.Conv2d(3, 3, 3)
        self.cls = nn.Linear(1, 2)

model = FooNet()
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d', 'Linear'], val=1)
# initialize the whole module with the same configuration
initialize(model, init_cfg)
# model.feat.weight
# Parameter containing:
# tensor([[[1., 1., 1.],
#          [1., 1., 1.],
#          [1., 1., 1.]]], requires_grad=True)
```
- Define the `layer` key to initialize layers with different configurations.
```python
import torch.nn as nn
from mmcv.cnn.utils import initialize

class FooNet(nn.Module):

    def __init__(self):
        super().__init__()
        self.feat = nn.Conv1d(3, 1, 3)
        self.reg = nn.Conv2d(3, 3, 3)
        self.cls = nn.Linear(1, 2)

model = FooNet()
init_cfg = [dict(type='Constant', layer='Conv1d', val=1),
            dict(type='Constant', layer='Conv2d', val=2),
            dict(type='Constant', layer='Linear', val=3)]
# nn.Conv1d is initialized with dict(type='Constant', val=1)
# nn.Conv2d is initialized with dict(type='Constant', val=2)
# nn.Linear is initialized with dict(type='Constant', val=3)
initialize(model, init_cfg)
# model.reg.weight
# Parameter containing:
# tensor([[[[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]],
#          ...,
#          [[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]]]], requires_grad=True)
```
2. Initialize a model with the `override` key
- When initializing a specific part of the model by its attribute name, we can use the `override` key; the values inside `override` replace the corresponding values in init_cfg.
```python
import torch.nn as nn
from mmcv.cnn import initialize

class FooNet(nn.Module):

    def __init__(self):
        super().__init__()
        self.feat = nn.Conv1d(3, 1, 3)
        self.reg = nn.Conv2d(3, 3, 3)
        self.cls = nn.Sequential(nn.Conv1d(3, 1, 3), nn.Linear(1, 2))

# if we would like to initialize the model's weights to 1 and its biases to 2,
# but want the weights in `reg` to be 3 and the biases to be 4, we can use the
# override key
model = FooNet()
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=1, bias=2,
                override=dict(type='Constant', name='reg', val=3, bias=4))
# self.feat and self.cls are initialized with dict(type='Constant', val=1, bias=2)
# the module called 'reg' is initialized with dict(type='Constant', val=3, bias=4)
initialize(model, init_cfg)
# model.reg.weight
# Parameter containing:
# tensor([[[[3., 3., 3.],
#           [3., 3., 3.],
#           [3., 3., 3.]],
#          ...,
#          [[3., 3., 3.],
#           [3., 3., 3.],
#           [3., 3., 3.]]]], requires_grad=True)
```
- If the `layer` key in init_cfg is None, only the sub-modules named in `override` are initialized, and the type and other arguments can be omitted inside `override`.
```python
model = FooNet()
init_cfg = dict(type='Constant', val=1, bias=2, override=dict(name='reg'))
# self.feat and self.cls are initialized with PyTorch's default initialization
# the module called 'reg' is initialized with dict(type='Constant', val=1, bias=2)
initialize(model, init_cfg)
# model.reg.weight
# Parameter containing:
# tensor([[[[1., 1., 1.],
#           [1., 1., 1.],
#           [1., 1., 1.]],
#          ...,
#          [[1., 1., 1.],
#           [1., 1., 1.],
#           [1., 1., 1.]]]], requires_grad=True)
```
- If we define neither the `layer` key nor the `override` key, nothing will be initialized.
- Invalid usages of the `override` key:
```python
# no sub-module is overridden because `name` is missing inside override
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'],
                val=1, bias=2,
                override=dict(type='Constant', val=3, bias=4))

# `type` is not specified inside override, so it is invalid even though other
# arguments are given
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'],
                val=1, bias=2,
                override=dict(name='reg', val=3, bias=4))
```
3. Initialize a model with a pretrained model
```python
import torch.nn as nn
import torchvision.models as models
from mmcv.cnn import initialize

# initialize the model with a pretrained model
model = models.resnet50()
# model.conv1.weight
# Parameter containing:
# tensor([[[[-6.7435e-03, -2.3531e-02, -9.0143e-03,  ..., -2.1245e-03,
#            -1.8077e-03,  3.0338e-03],
#           [-1.2603e-02, -2.7831e-02,  2.3187e-02,  ..., -1.5793e-02,
#             1.1655e-02,  4.5889e-03],
#           [-3.7916e-02,  1.2014e-02,  1.3815e-02,  ..., -4.2651e-03,
#             1.7314e-02, -9.9998e-03],
#           ...,
init_cfg = dict(type='Pretrained',
                checkpoint='torchvision://resnet50')
initialize(model, init_cfg)
# model.conv1.weight
# Parameter containing:
# tensor([[[[ 1.3335e-02,  1.4664e-02, -1.5351e-02,  ..., -4.0896e-02,
#            -4.3034e-02, -7.0755e-02],
#           [ 4.1205e-03,  5.8477e-03,  1.4948e-02,  ...,  2.2060e-03,
#            -2.0912e-02, -3.8517e-02],
#           [ 2.2331e-02,  2.3595e-02,  1.6120e-02,  ...,  1.0281e-01,
#             6.2641e-02,  5.1977e-02],
#           ...,

# initialize the weights of a sub-module with a specific part of a pretrained
# model by using the 'prefix' key
model = models.resnet50()
url = 'http://download.openmmlab.com/mmdetection/v2.0/retinanet/'\
      'retinanet_r50_fpn_1x_coco/'\
      'retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth'
init_cfg = dict(type='Pretrained',
                checkpoint=url, prefix='backbone.')
initialize(model, init_cfg)
```
4. Initialize models inherited from BaseModule, Sequential, ModuleList
`BaseModule` is inherited from `torch.nn.Module`; the only difference between them is that `BaseModule` implements `init_weights`.
`Sequential` is inherited from `BaseModule` and `torch.nn.Sequential`.
`ModuleList` is inherited from `BaseModule` and `torch.nn.ModuleList`.
```python
import torch.nn as nn
from mmcv.runner import BaseModule, Sequential, ModuleList

class FooConv1d(BaseModule):

    def __init__(self, init_cfg=None):
        super().__init__(init_cfg)
        self.conv1d = nn.Conv1d(4, 1, 4)

    def forward(self, x):
        return self.conv1d(x)

class FooConv2d(BaseModule):

    def __init__(self, init_cfg=None):
        super().__init__(init_cfg)
        self.conv2d = nn.Conv2d(3, 1, 3)

    def forward(self, x):
        return self.conv2d(x)
# BaseModule
init_cfg = dict(type='Constant', layer='Conv1d', val=0., bias=1.)
model = FooConv1d(init_cfg)
model.init_weights()
# model.conv1d.weight
# Parameter containing:
# tensor([[[0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.]]], requires_grad=True)
# Sequential
init_cfg1 = dict(type='Constant', layer='Conv1d', val=0., bias=1.)
init_cfg2 = dict(type='Constant', layer='Conv2d', val=2., bias=3.)
model1 = FooConv1d(init_cfg1)
model2 = FooConv2d(init_cfg2)
seq_model = Sequential(model1, model2)
seq_model.init_weights()
# seq_model[0].conv1d.weight
# Parameter containing:
# tensor([[[0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.]]], requires_grad=True)
# seq_model[1].conv2d.weight
# Parameter containing:
# tensor([[[[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]],
#          ...,
#          [[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]]]], requires_grad=True)

# inner init_cfg has higher priority
model1 = FooConv1d(init_cfg1)
model2 = FooConv2d(init_cfg2)
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=4., bias=5.)
seq_model = Sequential(model1, model2, init_cfg=init_cfg)
seq_model.init_weights()
# seq_model[0].conv1d.weight
# Parameter containing:
# tensor([[[0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.]]], requires_grad=True)
# seq_model[1].conv2d.weight
# Parameter containing:
# tensor([[[[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]],
#          ...,
#          [[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]]]], requires_grad=True)
# ModuleList
model1 = FooConv1d(init_cfg1)
model2 = FooConv2d(init_cfg2)
modellist = ModuleList([model1, model2])
modellist.init_weights()
# modellist[0].conv1d.weight
# Parameter containing:
# tensor([[[0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.]]], requires_grad=True)
# modellist[1].conv2d.weight
# Parameter containing:
# tensor([[[[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]],
#          ...,
#          [[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]]]], requires_grad=True)

# inner init_cfg has higher priority
model1 = FooConv1d(init_cfg1)
model2 = FooConv2d(init_cfg2)
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=4., bias=5.)
modellist = ModuleList([model1, model2], init_cfg=init_cfg)
modellist.init_weights()
# modellist[0].conv1d.weight
# Parameter containing:
# tensor([[[0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.],
#          [0., 0., 0., 0.]]], requires_grad=True)
# modellist[1].conv2d.weight
# Parameter containing:
# tensor([[[[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]],
#          ...,
#          [[2., 2., 2.],
#           [2., 2., 2.],
#           [2., 2., 2.]]]], requires_grad=True)
```
### Model Zoo
Besides the pre-trained models of `torchvision`, we also provide pre-trained models of the following CNNs:
- VGG Caffe
- ResNet Caffe
- ResNeXt
- ResNet with Group Normalization
- ResNet with Group Normalization and Weight Standardization
- HRNetV2
- Res2Net
- RegNet
#### Model URLs in JSON
The model zoo links in MMCV are managed by JSON files. Each json file consists of key-value pairs of model name and its url or path. An example json file may look like:
```json
{
"model_a": "https://example.com/models/model_a_9e5bac.pth",
"model_b": "pretrain/model_b_ab3ef2c.pth"
}
```
The default links of the pre-trained models hosted on OpenMMLab AWS can be found [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/model_zoo/open_mmlab.json).
You may override the default links by putting `open-mmlab.json` under `MMCV_HOME`. If `MMCV_HOME` is not found in your environment, `~/.cache/mmcv` is used by default. You may set your own path with `export MMCV_HOME=/your/path`.
The external json files will be merged into the default one: if the same key appears in both the external json and the default json, the external one is used.
#### Load Checkpoint
The following types are supported for the `filename` argument of `mmcv.load_checkpoint()`:
- filepath: the path of the checkpoint
- `http://xxx` and `https://xxx`: links to download the checkpoint; the `SHA256` postfix should be contained in the filename
- `torchvision://xxx`: model links in `torchvision.models`; refer to [torchvision](https://pytorch.org/docs/stable/torchvision/models.html) for details
- `open-mmlab://xxx`: model links or file paths provided in the default and additional json files
## Config
The `Config` class is used for manipulating config files. It supports loading configs from multiple file formats, including **python**, **json** and **yaml**.
It provides dict-like APIs to get and set values.
Take the config file `test.py` as an example:
```python
a = 1
b = dict(b1=[0, 1, 2], b2=None)
c = (1, 2)
d = 'string'
```
Load and use the config file:
```python
>>> cfg = Config.fromfile('test.py')
>>> print(cfg)
>>> dict(a=1,
... b=dict(b1=[0, 1, 2], b2=None),
... c=(1, 2),
... d='string')
```
Predefined variables are supported for all file formats: `{{ var }}` is substituted by its actual value.
Currently, the following four predefined variables are supported:
`{{ fileDirname }}` - the current opened file's dirname, e.g. /home/your-username/your-project/folder
`{{ fileBasename }}` - the current opened file's basename, e.g. file.ext
`{{ fileBasenameNoExtension }}` - the current opened file's basename with no file extension, e.g. file
`{{ fileExtname }}` - the current opened file's extension, e.g. .ext
These variable names are referenced from [VS Code](https://code.visualstudio.com/docs/editor/variables-reference).
Here is an example of a config file with predefined variables.
`config_a.py`
```python
a = 1
b = './work_dir/{{ fileBasenameNoExtension }}'
c = '{{ fileExtname }}'
```
```python
>>> cfg = Config.fromfile('./config_a.py')
>>> print(cfg)
>>> dict(a=1,
... b='./work_dir/config_a',
... c='.py')
```
Inheritance is supported for all file formats. To reuse fields from other config files,
specify `_base_='./config_a.py'` or a list of config files `_base_=['./config_a.py', './config_b.py']`.
Here are 4 examples of config inheritance.
`config_a.py` is used as the base config:
```python
a = 1
b = dict(b1=[0, 1, 2], b2=None)
```
### Inherit from a base config without overlapping keys
`config_b.py`
```python
_base_ = './config_a.py'
c = (1, 2)
d = 'string'
```
```python
>>> cfg = Config.fromfile('./config_b.py')
>>> print(cfg)
>>> dict(a=1,
... b=dict(b1=[0, 1, 2], b2=None),
... c=(1, 2),
... d='string')
```
The new fields in `config_b.py` are appended to the old fields of `config_a.py`.
### Inherit from a base config with overlapping keys
`config_c.py`
```python
_base_ = './config_a.py'
b = dict(b2=1)
c = (1, 2)
```
```python
>>> cfg = Config.fromfile('./config_c.py')
>>> print(cfg)
>>> dict(a=1,
... b=dict(b1=[0, 1, 2], b2=1),
... c=(1, 2))
```
`b.b2=None` in the base config `config_a.py` is replaced with `b.b2=1` in `config_c.py`.
### Inherit from a base config with ignored fields
`config_d.py`
```python
_base_ = './config_a.py'
b = dict(_delete_=True, b2=None, b3=0.1)
c = (1, 2)
```
```python
>>> cfg = Config.fromfile('./config_d.py')
>>> print(cfg)
>>> dict(a=1,
... b=dict(b2=None, b3=0.1),
... c=(1, 2))
```
You may also set `_delete_=True` to ignore some fields in the base config. All the old keys `b1, b2, b3` in `b` are replaced with the new keys `b2, b3`.
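The merge rules above can be sketched as a recursive dict merge in which a dict value carrying `_delete_=True` replaces the base value instead of being merged into it. A simplified stand-alone sketch (not Config's actual implementation):

```python
DELETE_KEY = '_delete_'

def merge_into_base(base, new):
    # recursively merge ``new`` into ``base``; a dict value with
    # ``_delete_=True`` discards the base value instead of merging
    merged = dict(base)
    for key, value in new.items():
        if isinstance(value, dict):
            value = dict(value)
            if value.pop(DELETE_KEY, False) or not isinstance(merged.get(key), dict):
                merged[key] = value
            else:
                merged[key] = merge_into_base(merged[key], value)
        else:
            merged[key] = value
    return merged

base = dict(a=1, b=dict(b1=[0, 1, 2], b2=None))
# overlapping key: b.b2 is overridden, b.b1 is kept
cfg_c = merge_into_base(base, dict(b=dict(b2=1), c=(1, 2)))
# _delete_=True: the whole b dict from the base is ignored
cfg_d = merge_into_base(base, dict(b=dict(_delete_=True, b2=None, b3=0.1), c=(1, 2)))
```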
### Inherit from multiple base configs (the base configs should not contain the same keys)
`config_e.py`
```python
c = (1, 2)
d = 'string'
```
`config_f.py`
```python
_base_ = ['./config_a.py', './config_e.py']
```
```python
>>> cfg = Config.fromfile('./config_f.py')
>>> print(cfg)
>>> dict(a=1,
... b=dict(b1=[0, 1, 2], b2=None),
... c=(1, 2),
... d='string')
```
### Reference variables from the base config
You can reference variables defined in a base config with the following grammar.
`base.py`
```python
item1 = 'a'
item2 = dict(item3 = 'b')
```
`config_g.py`
```python
_base_ = ['./base.py']
item = dict(a = {{ _base_.item1 }}, b = {{ _base_.item2.item3 }})
```
```python
>>> cfg = Config.fromfile('./config_g.py')
>>> print(cfg.pretty_text)
item1 = 'a'
item2 = dict(item3='b')
item = dict(a='a', b='b')
```
```python
bboxes = np.array([[10, 10, 100, 120], [0, 0, 50, 50]])
patches = mmcv.imcrop(img, bboxes)
# crop two regions and rescale the regions by 1.2x
patches = mmcv.imcrop(img, bboxes, scale=1.2)
```
#### Padding
```python
img = mmcv.imread('tests/data/color.jpg')
img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=0)
# pad the 3 channels of the image to (1000, 1200) with the given values respectively
img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=(100, 50, 200))
# pad the left, right, top, bottom borders of the image with the given value
img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=0)
# pad the 3 channels of the left, right, top, bottom borders with 3 values respectively
img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=(100, 50, 200))
# pad the image so that each edge is a multiple of the given value
img_ = mmcv.impad_to_multiple(img, 32)
```
```python
flow = mmcv.flowread('compressed.jpg', quantize=True, concat_axis=1)
mmcv.flowshow(flow)
```
![progress](../../en/_static/flow_visualization.png)
3. Flow warping
```python
img1 = mmcv.imread('img1.jpg')
flow = mmcv.flowread('flow.flo')
warped_img2 = mmcv.flow_warp(img1, flow)
```
img1 (left) and img2 (right)
![raw images](../../en/_static/flow_raw_images.png)
optical flow (img2 -> img1)
![optical flow](../../en/_static/flow_img2toimg1.png)
diff between the warped image and the ground-truth image
![warped image](../../en/_static/flow_warp_diff.png)
## File IO
This module provides two universal APIs to load and dump files of different formats.
```{note}
Since v1.3.16, the IO modules support loading (dumping) data from (to) different backends. More details are in PR [#1330](https://github.com/open-mmlab/mmcv/pull/1330).
```
### Load and dump data
`mmcv` provides a universal api for loading and dumping data; the currently supported formats are json, yaml and pickle.
#### Load from disk or dump to disk
```python
import mmcv

# load data from a file
data = mmcv.load('test.json')
data = mmcv.load('test.yaml')
data = mmcv.load('test.pkl')
# load data from a file-like object
with open('test.json', 'r') as f:
    data = mmcv.load(f, file_format='json')
# dump data to a string
json_str = mmcv.dump(data, file_format='json')
# dump data to a file with a filename (infer the format from the file extension)
mmcv.dump(data, 'out.pkl')
# dump data to a file with a file-like object
with open('test.yaml', 'w') as f:
    data = mmcv.dump(data, f, file_format='yaml')
```
#### Load from other backends or dump to other backends
```python
import mmcv

# load data from s3 files
data = mmcv.load('s3://bucket-name/test.json')
data = mmcv.load('s3://bucket-name/test.yaml')
data = mmcv.load('s3://bucket-name/test.pkl')
# dump data to s3 files (infer the format from the file extension)
mmcv.dump(data, 's3://bucket-name/out.pkl')
```
It is also easy to extend the api to support more file formats. All you need to do is to write a file handler inherited from `BaseFileHandler` and register it to `mmcv`. The handler needs to implement at least three methods.
```python
import mmcv

# to register multiple file formats, a list can be used as the argument
# @mmcv.register_handler(['txt', 'log'])
@mmcv.register_handler('txt')
class TxtHandler1(mmcv.BaseFileHandler):

    def load_from_fileobj(self, file):
        return file.read()

    def dump_to_fileobj(self, obj, file):
        file.write(str(obj))

    def dump_to_str(self, obj, **kwargs):
        return str(obj)
```
Take `PickleHandler` as an example:
```python
import pickle

class PickleHandler(mmcv.BaseFileHandler):

    def load_from_fileobj(self, file, **kwargs):
        return pickle.load(file, **kwargs)

    def load_from_path(self, filepath, **kwargs):
        return super(PickleHandler, self).load_from_path(
            filepath, mode='rb', **kwargs)

    def dump_to_str(self, obj, **kwargs):
        kwargs.setdefault('protocol', 2)
        return pickle.dumps(obj, **kwargs)

    def dump_to_fileobj(self, obj, file, **kwargs):
        kwargs.setdefault('protocol', 2)
        pickle.dump(obj, file, **kwargs)

    def dump_to_path(self, obj, filepath, **kwargs):
        super(PickleHandler, self).dump_to_path(
            obj, filepath, mode='wb', **kwargs)
```
### Load a text file as a list or dict
For example, `a.txt` is a text file with 5 lines:
```
a
b
c
d
e
```
#### Load from disk
Use `list_from_file` to load `a.txt`:
```python
>>> mmcv.list_from_file('a.txt')
['a', 'b', 'c', 'd', 'e']
>>> mmcv.list_from_file('a.txt', offset=2)
['c', 'd', 'e']
>>> mmcv.list_from_file('a.txt', max_num=2)
['a', 'b']
>>> mmcv.list_from_file('a.txt', prefix='/mnt/')
['/mnt/a', '/mnt/b', '/mnt/c', '/mnt/d', '/mnt/e']
```
Similarly, `b.txt` is a text file with 3 lines:
```
1 cat
2 dog cow
3 panda
```
Use `dict_from_file` to load `b.txt`:
```python
>>> mmcv.dict_from_file('b.txt')
{'1': 'cat', '2': ['dog', 'cow'], '3': 'panda'}
>>> mmcv.dict_from_file('b.txt', key_type=int)
{1: 'cat', 2: ['dog', 'cow'], 3: 'panda'}
```
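`dict_from_file` splits each line on whitespace, takes the first token as the key, and the rest as the value (a list when more than one token remains). A sketch of that parsing rule over in-memory lines (the helper name here is made up):

```python
def dict_from_lines(lines, key_type=str):
    # first whitespace-separated token is the key; the remainder is the value,
    # returned as a list when there is more than one remaining token
    mapping = {}
    for line in lines:
        items = line.rstrip('\n').split()
        mapping[key_type(items[0])] = items[1:] if len(items) > 2 else items[1]
    return mapping

result = dict_from_lines(['1 cat', '2 dog cow', '3 panda'], key_type=int)
```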
#### Load from other backends
Use `list_from_file` to load `s3://bucket-name/a.txt`:
```python
>>> mmcv.list_from_file('s3://bucket-name/a.txt')
['a', 'b', 'c', 'd', 'e']
>>> mmcv.list_from_file('s3://bucket-name/a.txt', offset=2)
['c', 'd', 'e']
>>> mmcv.list_from_file('s3://bucket-name/a.txt', max_num=2)
['a', 'b']
>>> mmcv.list_from_file('s3://bucket-name/a.txt', prefix='/mnt/')
['/mnt/a', '/mnt/b', '/mnt/c', '/mnt/d', '/mnt/e']
```
Use `dict_from_file` to load `s3://bucket-name/b.txt`:
```python
>>> mmcv.dict_from_file('s3://bucket-name/b.txt')
{'1': 'cat', '2': ['dog', 'cow'], '3': 'panda'}
>>> mmcv.dict_from_file('s3://bucket-name/b.txt', key_type=int)
{1: 'cat', 2: ['dog', 'cow'], 3: 'panda'}
```
### Load and dump checkpoints
#### Load checkpoints from disk or dump to disk
We can read checkpoints from disk or save them to disk in the following way.
```python
import torch

filepath1 = '/path/of/your/checkpoint1.pth'
filepath2 = '/path/of/your/checkpoint2.pth'
# read the checkpoint from filepath1
checkpoint = torch.load(filepath1)
# save the checkpoint to filepath2
torch.save(checkpoint, filepath2)
```
MMCV provides many backends; `HardDiskBackend` is one of them, and we can use it to read or save checkpoints.
```python
import io
from mmcv.fileio.file_client import HardDiskBackend

disk_backend = HardDiskBackend()
with io.BytesIO(disk_backend.get(filepath1)) as buffer:
    checkpoint = torch.load(buffer)
with io.BytesIO() as buffer:
    torch.save(checkpoint, buffer)
    disk_backend.put(buffer.getvalue(), filepath2)
```
If we want an interface that automatically selects the corresponding backend based on the file path, we can use `FileClient`.
For example, we want to implement two methods for reading and saving checkpoints, and they need to support different types of file paths: disk paths, network paths, or other paths.
```python
from mmcv.fileio.file_client import FileClient

def load_checkpoint(path):
    file_client = FileClient.infer_client(uri=path)
    with io.BytesIO(file_client.get(path)) as buffer:
        checkpoint = torch.load(buffer)
    return checkpoint

def save_checkpoint(checkpoint, path):
    file_client = FileClient.infer_client(uri=path)
    with io.BytesIO() as buffer:
        torch.save(checkpoint, buffer)
        file_client.put(buffer.getvalue(), path)

checkpoint = load_checkpoint(filepath1)
save_checkpoint(checkpoint, filepath2)
```
#### Load checkpoints from the Internet
```{note}
Currently, only reading checkpoints from the Internet is supported; saving checkpoints to the Internet is not supported yet.
```
```python
import io
import torch
from mmcv.fileio.file_client import HTTPBackend, FileClient

filepath = 'http://path/of/your/checkpoint.pth'
checkpoint = torch.utils.model_zoo.load_url(filepath)
http_backend = HTTPBackend()
with io.BytesIO(http_backend.get(filepath)) as buffer:
    checkpoint = torch.load(buffer)

file_client = FileClient.infer_client(uri=filepath)
with io.BytesIO(file_client.get(filepath)) as buffer:
    checkpoint = torch.load(buffer)
```
## CUDA Operators
MMCV provides CUDA operators commonly used in tasks such as detection and segmentation:
- AssignScoreWithK
- BallQuery
- BBoxOverlaps
- CARAFE
- CrissCrossAttention
- ContextBlock
- CornerPool
- Deformable Convolution v1/v2
- Deformable RoIPool
- DynamicScatter
- GatherPoints
- FurthestPointSample
- FurthestPointSampleWithDist
- GeneralizedAttention
- KNN
- MaskedConv
- NMS
- PSAMask
- RoIPointPool3d
- RoIPool
- RoIAlign
- RoIAwarePool3d
- SimpleRoIAlign
- SigmoidFocalLoss
- SoftmaxFocalLoss
- SoftNMS
- Synchronized BatchNorm
- Voxelization
- ThreeInterpolate
- ThreeNN
- Weight standardization
- Correlation
## Registry
MMCV implements [registry](https://github.com/open-mmlab/mmcv/blob/master/mmcv/utils/registry.py) to manage different modules that share similar functionalities, e.g., backbones, heads, and necks in detectors.
Most projects in the OpenMMLab family use registry to manage modules of datasets and models, such as [MMDetection](https://github.com/open-mmlab/mmdetection), [MMDetection3D](https://github.com/open-mmlab/mmdetection3d), [MMClassification](https://github.com/open-mmlab/mmclassification), [MMEditing](https://github.com/open-mmlab/mmediting), etc.
### What is a registry
In MMCV, a registry can be regarded as a mapping between classes and strings.
The classes contained by a single registry usually have similar APIs but implement different algorithms or support different datasets.
With the registry, users can look up a class through its corresponding string and instantiate the module according to their needs.
A typical case is the config systems of most OpenMMLab projects, which use registries to create hooks, runners, models, and datasets from config files.
The documentation of the registry API can be found [here](https://mmcv.readthedocs.io/en/latest/api.html?highlight=registry#mmcv.utils.Registry).
To manage modules in a codebase with a `registry`, there are three steps.
1. Create a build method (optional; in most cases you can just use the default one)
2. Create a registry
3. Use this registry to manage the modules
The `build_func` argument of `Registry` customizes how to instantiate a class; the default is `build_from_cfg`, implemented [here](https://mmcv.readthedocs.io/en/latest/api.html?highlight=registry#mmcv.utils.build_from_cfg).
### A simple example
Here is a simple example of using a registry to manage modules in a package. You can find more practical examples in OpenMMLab projects.
Assume we want to implement a series of Dataset Converters for converting different formats of data to the expected data format. We create a directory named `converters` as a package. In the package, we first create a file to implement the builder, named `converters/builder.py`, as below:
```python
from mmcv.utils import Registry
# create a registry for converters
CONVERTERS = Registry('converter')
```
Then we can implement different converters in the package. For example, implement `Converter1` in `converters/converter1.py`:
```python
from .builder import CONVERTERS

# use the registry to manage the module
@CONVERTERS.register_module()
class Converter1(object):

    def __init__(self, a, b):
        self.a = a
        self.b = b
```
The key step of using the registry to manage modules is registering the implemented module into `CONVERTERS`. By decorating the module with `@CONVERTERS.register_module()`, the mapping between strings and classes is built and maintained by `CONVERTERS`, as below:
```python
'Converter1' -> <class 'Converter1'>
```
If the module is successfully registered, you can use this converter through configs as below:
```python
converter_cfg = dict(type='Converter1', a=a_value, b=b_value)
converter = CONVERTERS.build(converter_cfg)
```
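The whole mechanism above fits in a few dozen lines of plain Python. The following stripped-down sketch of a registry plus the default build function is illustrative only, not mmcv's actual code:

```python
class Registry:
    # minimal sketch of a registry: a named string -> class mapping

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self):
        def _register(cls):
            self._module_dict[cls.__name__] = cls
            return cls
        return _register

    def get(self, key):
        return self._module_dict.get(key)

    def build(self, cfg):
        return build_from_cfg(cfg, self)

def build_from_cfg(cfg, registry):
    # default build function: pop 'type', look the class up, and
    # instantiate it with the remaining keys as keyword arguments
    args = dict(cfg)
    obj_type = args.pop('type')
    obj_cls = registry.get(obj_type)
    if obj_cls is None:
        raise KeyError(f'{obj_type} is not in the {registry.name} registry')
    return obj_cls(**args)

CONVERTERS = Registry('converter')

@CONVERTERS.register_module()
class Converter1:

    def __init__(self, a, b):
        self.a = a
        self.b = b

converter = CONVERTERS.build(dict(type='Converter1', a=1, b=2))
```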
### Customize the build function
Suppose we would like to customize how `converters` are built; we can implement a custom `build_func` and pass it into the registry.
```python
from mmcv.utils import Registry

# create a build function
def build_converter(cfg, registry, *args, **kwargs):
    cfg_ = cfg.copy()
    converter_type = cfg_.pop('type')
    if converter_type not in registry:
        raise KeyError(f'Unrecognized converter type {converter_type}')
    else:
        converter_cls = registry.get(converter_type)
    converter = converter_cls(*args, **kwargs, **cfg_)
    return converter

# create a registry for converters and pass the ``build_converter`` function
CONVERTERS = Registry('converter', build_func=build_converter)
```
```{note}
In this example, we demonstrate how to use the `build_func` argument to customize the way a class instance is built.
The functionality is similar to the default `build_from_cfg`. In most cases, the default one is sufficient.
```
`build_model_from_cfg` is also implemented to build PyTorch modules in `nn.Sequential`; you may use it directly.
### Hierarchical registry
You can also build modules across OpenMMLab frameworks. For example, you can use all the backbones in [MMClassification](https://github.com/open-mmlab/mmclassification) for object detectors in [MMDetection](https://github.com/open-mmlab/mmdetection), and you can combine object detection models in [MMDetection](https://github.com/open-mmlab/mmdetection) with semantic segmentation models in [MMSegmentation](https://github.com/open-mmlab/mmsegmentation).
All the `MODELS` registries of downstream codebases are children of MMCV's `MODELS` registry. Basically, there are two ways to build a module from a child or sibling registry.
1. Build from a child registry
For example:
In MMDetection we define:
```python
import torch.nn as nn
from mmcv.utils import Registry
from mmcv.cnn import MODELS as MMCV_MODELS

MODELS = Registry('model', parent=MMCV_MODELS)

@MODELS.register_module()
class NetA(nn.Module):

    def forward(self, x):
        return x
```
In MMClassification we define:
```python
import torch.nn as nn
from mmcv.utils import Registry
from mmcv.cnn import MODELS as MMCV_MODELS

MODELS = Registry('model', parent=MMCV_MODELS)

@MODELS.register_module()
class NetB(nn.Module):

    def forward(self, x):
        return x + 1
```
We can build the two networks in either MMDetection or MMClassification by:
```python
from mmdet.models import MODELS
net_a = MODELS.build(cfg=dict(type='NetA'))
net_b = MODELS.build(cfg=dict(type='mmcls.NetB'))
```
```python
from mmcls.models import MODELS
net_a = MODELS.build(cfg=dict(type='mmdet.NetA'))
net_b = MODELS.build(cfg=dict(type='NetB'))
```
2. Build from the parent registry
The shared `MODELS` registry in MMCV is the parent (root) registry of all downstream codebases:
```python
from mmcv.cnn import MODELS as MMCV_MODELS
net_a = MMCV_MODELS.build(cfg=dict(type='mmdet.NetA'))
net_b = MMCV_MODELS.build(cfg=dict(type='mmcls.NetB'))
```
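The scope resolution behind `mmdet.NetA`-style types can be sketched as: walk up to the root registry, then descend into the child registry named by the prefix. A toy stand-alone model of this behaviour (`ScopedRegistry` is hypothetical, not mmcv's class):

```python
class ScopedRegistry:
    # toy sketch of parent/child registries with 'scope.Name' lookups

    def __init__(self, name, scope=None, parent=None):
        self.name = name
        self.scope = scope
        self.parent = parent
        self.children = {}
        self._module_dict = {}
        if parent is not None:
            parent.children[scope] = self

    def register_module(self):
        def _register(cls):
            self._module_dict[cls.__name__] = cls
            return cls
        return _register

    def build(self, cfg):
        args = dict(cfg)
        obj_type = args.pop('type')
        if '.' in obj_type:
            # scoped type: climb to the root, then pick the named child
            scope, obj_type = obj_type.split('.', 1)
            root = self
            while root.parent is not None:
                root = root.parent
            registry = root.children[scope]
        else:
            registry = self
        return registry._module_dict[obj_type](**args)

MODELS = ScopedRegistry('model')
MMDET_MODELS = ScopedRegistry('model', scope='mmdet', parent=MODELS)
MMCLS_MODELS = ScopedRegistry('model', scope='mmcls', parent=MODELS)

@MMDET_MODELS.register_module()
class NetA:
    pass

@MMCLS_MODELS.register_module()
class NetB:
    pass

# sibling lookups resolve through the shared root registry
net_a = MMCLS_MODELS.build(dict(type='mmdet.NetA'))
net_b = MMDET_MODELS.build(dict(type='mmcls.NetB'))
```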
## Runner
The runner module is in charge of scheduling the training process. Its main purpose is to let users start training with less code while keeping the process flexible and configurable. Its core features are:
- Supports both `EpochBasedRunner` and `IterBasedRunner` iteration modes to cover different scenarios
- Supports customized workflows to switch freely between states during training; currently the train and val workflows are supported. A workflow can be understood as one complete round of training and validation.
- Provides flexible extensibility through various default and custom hooks
### EpochBasedRunner
As its name indicates, `EpochBasedRunner` runs workflows in units of epochs. For example, setting workflow = [('train', 2), ('val', 1)] means iteratively training 2 epochs and then validating 1 epoch. The MMDetection object detection framework uses `EpochBasedRunner` by default.
Its abstract logic is as follows:
```python
# the training termination condition
while curr_epoch < max_epochs:
    # iterate over the user-defined workflow, e.g. workflow = [('train', 2), ('val', 1)]
    for i, flow in enumerate(workflow):
        # mode is the name of a workflow function, e.g. train; epochs is the number of epochs to run it
        mode, epochs = flow
        # either self.train() or self.val() is fetched
        epoch_runner = getattr(self, mode)
        # run the corresponding workflow function
        for _ in range(epochs):
            epoch_runner(data_loaders[i], **kwargs)
```
Currently the train and val workflows are supported. Taking the train function as an example, its abstract logic is:
```python
# epoch_runner can currently be train or val
def train(self, data_loader, **kwargs):
    # iterate over the dataset, yielding one epoch of batches
    for i, data_batch in enumerate(data_loader):
        self.call_hook('before_train_iter')
        # train_mode=False during validation
        self.run_iter(data_batch, train_mode=True, **kwargs)
        self.call_hook('after_train_iter')
    self.call_hook('after_train_epoch')
```
### IterBasedRunner
Different from `EpochBasedRunner`, `IterBasedRunner` runs workflows in units of iterations. For example, setting workflow = [('train', 2), ('val', 1)] means iteratively training 2 iterations and then validating 1 iteration. The MMSegmentation semantic segmentation framework uses `IterBasedRunner` by default.
Its abstract logic is as follows:
```python
# although the unit is the iteration, some scenarios still need epoch
# information, which is provided by IterLoader
iter_loaders = [IterLoader(x) for x in data_loaders]
# the training termination condition
while curr_iter < max_iters:
    # iterate over the user-defined workflow, e.g. workflow = [('train', 2), ('val', 1)]
    for i, flow in enumerate(workflow):
        # mode is the name of a workflow function, e.g. train; iters is the number of iterations to run it
        mode, iters = flow
        # either self.train() or self.val() is fetched
        iter_runner = getattr(self, mode)
        # run the corresponding workflow function
        for _ in range(iters):
            iter_runner(iter_loaders[i], **kwargs)
```
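The `IterLoader` mentioned above can be sketched as a thin wrapper that restarts its dataloader when exhausted and counts the completed epochs (a simplified stand-in for mmcv's internal class):

```python
class IterLoader:
    # wrap a dataloader so next() never raises StopIteration: restart the
    # underlying iterator when exhausted and count the completed epochs

    def __init__(self, dataloader):
        self._dataloader = dataloader
        self._iter = iter(dataloader)
        self._epoch = 0

    @property
    def epoch(self):
        return self._epoch

    def __next__(self):
        try:
            return next(self._iter)
        except StopIteration:
            self._epoch += 1
            self._iter = iter(self._dataloader)
            return next(self._iter)

loader = IterLoader([1, 2, 3])
batches = [next(loader) for _ in range(5)]  # wraps around after 3 items
```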
Currently the train and val workflows are supported. Taking the val function as an example, its abstract logic is:
```python
# iter_runner can currently be train or val
def val(self, data_loader, **kwargs):
    # get one batch of data for one iteration
    data_batch = next(data_loader)
    self.call_hook('before_val_iter')
    outputs = self.model.val_step(data_batch, self.optimizer, **kwargs)
    self.outputs = outputs
    self.call_hook('after_val_iter')
```
Besides the basic functionalities above, `EpochBasedRunner` and `IterBasedRunner` also provide resume, save_checkpoint and hook registration.
### A simple example
Let's take the most common classification task as an example to describe the usage of `runner` in detail. Starting any training task requires the following steps:
**(1) Initialize the dataloader, model, optimizer, etc.**
```python
# initialize the model
model = ...
# initialize the optimizer; a typical config is cfg.optimizer = dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001)
optimizer = build_optimizer(model, cfg.optimizer)
# initialize the dataloader of each workflow
data_loaders = [
    build_dataloader(
        ds,
        cfg.data.samples_per_gpu,
        cfg.data.workers_per_gpu,
        ...) for ds in dataset
]
```
**(2) Initialize the runner**
```python
runner = build_runner(
# cfg.runner 典型配置为
# runner = dict(type='EpochBasedRunner', max_epochs=200)
cfg.runner,
default_args=dict(
model=model,
batch_processor=None,
optimizer=optimizer,
logger=logger))
```
**(3) Register the hooks required by default training, as well as custom hooks**
```python
# register the hooks required by training
runner.register_training_hooks(
    # lr-related config, typically
    # lr_config = dict(policy='step', step=[100, 150])
    cfg.lr_config,
    # optimization-related config, e.g. grad_clip
    optimizer_config,
    # checkpoint-related config, typically
    # checkpoint_config = dict(interval=1), i.e. save a checkpoint every unit
    cfg.checkpoint_config,
    # logging-related config
    cfg.log_config,
    ...)

# register custom hooks
# e.g. to enable EMA, set custom_hooks=[dict(type='EMAHook')]
if cfg.get('custom_hooks', None):
    custom_hooks = cfg.custom_hooks
    for hook_cfg in cfg.custom_hooks:
        hook_cfg = hook_cfg.copy()
        priority = hook_cfg.pop('priority', 'NORMAL')
        hook = build_from_cfg(hook_cfg, HOOKS)
        runner.register_hook(hook, priority=priority)
```
Then resume or load_checkpoint can be called to load weights.
**(4) Start the training flow**
```python
# a typical workflow is workflow = [('train', 1)]
# this is where the training actually starts
runner.run(data_loaders, cfg.workflow)
```
Regarding the workflow setting, taking `EpochBasedRunner` as an example:
- If you only want to run the train workflow, set workflow = [('train', 1)]: only iterative training is performed
- If you want to run both train and val workflows, set workflow = [('train', 3), ('val', 1)]: train 3 epochs, then switch to the val workflow for 1 epoch, and repeat until the number of training epochs reaches the specified value
- The workflow setting can also be freely customized; for example, you can validate first and then train with workflow = [('val', 1), ('train', 1)]
The code above has already been wrapped into train.py of each codebase; users only need to set the corresponding configs and the whole pipeline runs automatically.
## Utility functions
### Progress bar
If you want to apply a method to a list of items and track the progress, `track_progress` is a good choice. It displays a progress bar showing how many tasks have finished and the estimated time for the remaining ones (implemented internally as a for-loop).
```python
import mmcv

def func(item):
    # do something
    pass

tasks = [item_1, item_2, ..., item_n]
mmcv.track_progress(func, tasks)
```
The output looks like:
![progress](../../docs/_static/progress.*)
If you want to visualize the progress of tasks executed by multiple workers, you can use `track_parallel_progress`:
```python
mmcv.track_parallel_progress(func, tasks, 8) # 8 workers
```
![progress](../../docs/_static/parallel_progress.*)
If you want to iterate or enumerate over a list of items while visualizing the progress, use `track_iter_progress`.
```python
import mmcv
tasks = [item_1, item_2, ..., item_n]
for task in mmcv.track_iter_progress(tasks):
# do something like print
print(task)
for i, task in enumerate(mmcv.track_iter_progress(tasks)):
# do something like print
print(i)
print(task)
```
### Timer
The `Timer` provided by mmcv makes it easy to measure the execution time of a code block.
```python
import time
import mmcv
with mmcv.Timer():
# simulate some code block
time.sleep(1)
```
You can also use `since_start()` and `since_last_check()`: the former returns the running time since the timer started, and the latter returns the time elapsed since the last check.
```python
timer = mmcv.Timer()
# code block 1 here
print(timer.since_start())
# code block 2 here
print(timer.since_last_check())
print(timer.since_start())
```
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchvision.datasets import CIFAR10
from mmcv.parallel import MMDataParallel
from mmcv.runner import EpochBasedRunner
from mmcv.utils import get_logger
class Model(nn.Module):
def __init__(self):
super(Model, self).__init__()
self.conv1 = nn.Conv2d(3, 6, 5)
self.pool = nn.MaxPool2d(2, 2)
self.conv2 = nn.Conv2d(6, 16, 5)
self.fc1 = nn.Linear(16 * 5 * 5, 120)
self.fc2 = nn.Linear(120, 84)
self.fc3 = nn.Linear(84, 10)
self.loss_fn = nn.CrossEntropyLoss()
def forward(self, x):
x = self.pool(F.relu(self.conv1(x)))
x = self.pool(F.relu(self.conv2(x)))
x = x.view(-1, 16 * 5 * 5)
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
def train_step(self, data, optimizer):
images, labels = data
predicts = self(images) # -> self.__call__() -> self.forward()
loss = self.loss_fn(predicts, labels)
return {'loss': loss}
if __name__ == '__main__':
model = Model()
if torch.cuda.is_available():
# only use gpu:0 to train
# Solved issue https://github.com/open-mmlab/mmcv/issues/1470
model = MMDataParallel(model.cuda(), device_ids=[0])
# dataset and dataloader
transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
trainset = CIFAR10(
root='data', train=True, download=True, transform=transform)
trainloader = DataLoader(
trainset, batch_size=128, shuffle=True, num_workers=2)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
logger = get_logger('mmcv')
# runner is a scheduler to manage the training
runner = EpochBasedRunner(
model,
optimizer=optimizer,
work_dir='./work_dir',
logger=logger,
max_epochs=4)
# learning rate scheduler config
lr_config = dict(policy='step', step=[2, 3])
# configuration of optimizer
optimizer_config = dict(grad_clip=None)
# configuration of saving checkpoints periodically
checkpoint_config = dict(interval=1)
# save log periodically and multiple hooks can be used simultaneously
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
# register hooks to runner and those hooks will be invoked automatically
runner.register_training_hooks(
lr_config=lr_config,
optimizer_config=optimizer_config,
checkpoint_config=checkpoint_config,
log_config=log_config)
runner.run([trainloader], [('train', 1)])
# Copyright (c) OpenMMLab. All rights reserved.
# flake8: noqa
from .arraymisc import *
from .fileio import *
from .image import *
from .transforms import *
from .utils import *
from .version import *
from .video import *
from .visualization import *
# The following modules are not imported to this level, so mmcv may be used
# without PyTorch.
# - runner
# - parallel
# - op
# - utils
# Copyright (c) OpenMMLab. All rights reserved.
import numpy as np


def quantize(arr, min_val, max_val, levels, dtype=np.int64):
    """Quantize an array of (-inf, inf) to [0, levels-1].

    Args:
        arr (ndarray): Input array.
        min_val (scalar): Minimum value to be clipped.
        max_val (scalar): Maximum value to be clipped.
        levels (int): Quantization levels.
        dtype (np.type): The type of the quantized array.
    ...
    return quantized_arr


def dequantize(arr, min_val, max_val, levels, dtype=np.float64):
    """Dequantize an array.

    Args:
        arr (ndarray): Input array.
        min_val (scalar): Minimum value to be clipped.
        max_val (scalar): Maximum value to be clipped.
        levels (int): Quantization levels.
        dtype (np.type): The type of the dequantized array.
    ...
# Copyright (c) OpenMMLab. All rights reserved.
from .alexnet import AlexNet
# yapf: disable
from .bricks import (ACTIVATION_LAYERS, CONV_LAYERS, NORM_LAYERS,
                     PADDING_LAYERS, PLUGIN_LAYERS, UPSAMPLE_LAYERS,
                     ContextBlock, Conv2d, Conv3d, ConvAWS2d, ConvModule,
                     ConvTranspose2d, ConvTranspose3d, ConvWS2d,
                     DepthwiseSeparableConvModule, GeneralizedAttention,
                     HSigmoid, HSwish, Linear, MaxPool2d, MaxPool3d,
                     ...
                     build_activation_layer, build_conv_layer,
                     build_norm_layer, build_padding_layer, build_plugin_layer,
                     build_upsample_layer, conv_ws_2d, is_norm)
from .builder import MODELS, build_model_from_cfg
# yapf: enable
from .resnet import ResNet, make_res_layer
from .utils import (INITIALIZERS, Caffe2XavierInit, ConstantInit, KaimingInit,
                    NormalInit, PretrainedInit, TruncNormalInit, UniformInit,
                    XavierInit, bias_init_with_prob, caffe2_xavier_init,
                    constant_init, fuse_conv_bn, get_model_complexity_info,
                    initialize, kaiming_init, normal_init, trunc_normal_init,
                    uniform_init, xavier_init)
from .vgg import VGG, make_vgg_layer

__all__ = [
    'AlexNet', 'VGG', 'make_vgg_layer', 'ResNet', 'make_res_layer',
    'constant_init', 'xavier_init', 'normal_init', 'trunc_normal_init',
    'uniform_init', 'kaiming_init', 'caffe2_xavier_init',
    'bias_init_with_prob', 'ConvModule', 'build_activation_layer',
    'build_conv_layer', 'build_norm_layer', 'build_padding_layer',
    'build_upsample_layer', 'build_plugin_layer', 'is_norm', 'NonLocal1d',
    'NonLocal2d', 'NonLocal3d', 'ContextBlock', 'HSigmoid', 'Swish', 'HSwish',
    'GeneralizedAttention', 'ACTIVATION_LAYERS', 'CONV_LAYERS', 'NORM_LAYERS',
    'PADDING_LAYERS', 'UPSAMPLE_LAYERS', 'PLUGIN_LAYERS', 'Scale',
    'get_model_complexity_info', 'conv_ws_2d', 'ConvAWS2d', 'ConvWS2d',
    'fuse_conv_bn', 'DepthwiseSeparableConvModule', 'Linear', 'Conv2d',
    'ConvTranspose2d', 'MaxPool2d', 'ConvTranspose3d', 'MaxPool3d', 'Conv3d',
    'initialize', 'INITIALIZERS', 'ConstantInit', 'XavierInit', 'NormalInit',
    'TruncNormalInit', 'UniformInit', 'KaimingInit', 'PretrainedInit',
    'Caffe2XavierInit', 'MODELS', 'build_model_from_cfg'
]
# Copyright (c) OpenMMLab. All rights reserved.
import logging

import torch.nn as nn


class AlexNet(nn.Module):
    ...
        num_classes (int): number of classes for classification.
    """

    def __init__(self, num_classes=-1):
        super(AlexNet, self).__init__()
        self.num_classes = num_classes
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            ...
            nn.Linear(4096, num_classes),
        )

    def init_weights(self, pretrained=None):
        if isinstance(pretrained, str):
            logger = logging.getLogger()
            from ..runner import load_checkpoint
            load_checkpoint(self, pretrained, strict=False, logger=logger)
        elif pretrained is None:
            # use default initializer
            ...
        else:
            raise TypeError('pretrained must be a str or None')

    def forward(self, x):
        x = self.features(x)
        if self.num_classes > 0:
            ...
...
from .non_local import NonLocal1d, NonLocal2d, NonLocal3d
from .norm import build_norm_layer, is_norm
from .padding import build_padding_layer
from .plugin import build_plugin_layer
from .registry import (ACTIVATION_LAYERS, CONV_LAYERS, NORM_LAYERS,
                       PADDING_LAYERS, PLUGIN_LAYERS, UPSAMPLE_LAYERS)
from .scale import Scale
from .swish import Swish
from .upsample import build_upsample_layer
from .wrappers import (Conv2d, Conv3d, ConvTranspose2d, ConvTranspose3d,
                       ...)

__all__ = [
    ...
    'build_norm_layer', 'build_padding_layer', 'build_upsample_layer',
    'build_plugin_layer', 'is_norm', 'HSigmoid', 'HSwish', 'NonLocal1d',
    'NonLocal2d', 'NonLocal3d', 'ContextBlock', 'GeneralizedAttention',
    'ACTIVATION_LAYERS', 'CONV_LAYERS', 'NORM_LAYERS', 'PADDING_LAYERS',
    'UPSAMPLE_LAYERS', 'PLUGIN_LAYERS', 'Scale', 'ConvAWS2d', 'ConvWS2d',
    'conv_ws_2d', 'DepthwiseSeparableConvModule', 'Swish', 'Linear',
    'Conv2dAdaptivePadding', 'Conv2d', 'ConvTranspose2d', 'MaxPool2d',
    'ConvTranspose3d', 'MaxPool3d', 'Conv3d', 'Dropout', 'DropPath'
]
# Copyright (c) OpenMMLab. All rights reserved.
import torch.nn as nn
import torch.nn.functional as F

from mmcv.utils import TORCH_VERSION, build_from_cfg, digit_version
from .registry import ACTIVATION_LAYERS

for module in [
        nn.ReLU, nn.LeakyReLU, nn.PReLU, nn.RReLU, nn.ReLU6, nn.ELU,
        nn.Sigmoid, nn.Tanh
]:
    ACTIVATION_LAYERS.register_module(module=module)


@ACTIVATION_LAYERS.register_module(name='Clip')
@ACTIVATION_LAYERS.register_module()
class Clamp(nn.Module):
    """Clamp activation layer.
    ...
            Default to 1.
    """

    def __init__(self, min=-1., max=1.):
        super(Clamp, self).__init__()
        self.min = min
        self.max = max

    def forward(self, x):
        """Forward function.

        Args:
        ...


class GELU(nn.Module):
    ...
        >>> output = m(input)
    """

    def forward(self, input):
        return F.gelu(input)


if (TORCH_VERSION == 'parrots'
        or digit_version(TORCH_VERSION) < digit_version('1.4')):
    ACTIVATION_LAYERS.register_module(module=GELU)
else:
    ACTIVATION_LAYERS.register_module(module=nn.GELU)


def build_activation_layer(cfg):
    """Build activation layer.

    Args:
        cfg (dict): The activation layer config, which should contain:
            - type (str): Layer type.
            - layer args: Args needed to instantiate an activation layer.

    Returns:
        nn.Module: Created activation layer.
    """
    return build_from_cfg(cfg, ACTIVATION_LAYERS)
# Copyright (c) OpenMMLab. All rights reserved.
import torch
from torch import nn

from ..utils import constant_init, kaiming_init
from .registry import PLUGIN_LAYERS


def last_zero_init(m):
    if isinstance(m, nn.Sequential):
        constant_init(m[-1], val=0)
    else:
        constant_init(m, val=0)


@PLUGIN_LAYERS.register_module()
class ContextBlock(nn.Module):
    """ContextBlock module in GCNet.
    ...
    _abbr_ = 'context_block'

    def __init__(self,
                 in_channels,
                 ratio,
                 pooling_type='att',
                 fusion_types=('channel_add', )):
        super(ContextBlock, self).__init__()
        assert pooling_type in ['avg', 'att']
        assert isinstance(fusion_types, (list, tuple))
        valid_fusion_types = ['channel_add', 'channel_mul']
        ...
        if self.channel_mul_conv is not None:
            last_zero_init(self.channel_mul_conv)

    def spatial_pool(self, x):
        batch, channel, height, width = x.size()
        if self.pooling_type == 'att':
            input_x = x
            ...
        return context

    def forward(self, x):
        # [N, C, 1, 1]
        context = self.spatial_pool(x)
        ...
# Copyright (c) OpenMMLab. All rights reserved.
from torch import nn

from .registry import CONV_LAYERS

CONV_LAYERS.register_module('Conv1d', module=nn.Conv1d)
CONV_LAYERS.register_module('Conv2d', module=nn.Conv2d)
CONV_LAYERS.register_module('Conv3d', module=nn.Conv3d)
CONV_LAYERS.register_module('Conv', module=nn.Conv2d)


def build_conv_layer(cfg, *args, **kwargs):
    """Build convolution layer.

    Args:
    ...
    cfg_ = cfg.copy()
    layer_type = cfg_.pop('type')
    if layer_type not in CONV_LAYERS:
        raise KeyError(f'Unrecognized conv type {layer_type}')
    else:
        conv_layer = CONV_LAYERS.get(layer_type)
    layer = conv_layer(*args, **kwargs, **cfg_)
    return layer
# Copyright (c) OpenMMLab. All rights reserved.
import math

from torch import nn
from torch.nn import functional as F

from .registry import CONV_LAYERS


@CONV_LAYERS.register_module()
class Conv2dAdaptivePadding(nn.Conv2d):
    """Implementation of 2D convolution in tensorflow with `padding` as "same",
    which applies padding to input (if needed) so that input image gets fully
    ...
    """

    def __init__(self,
                 in_channels,
                 out_channels,
                 kernel_size,
                 stride=1,
                 padding=0,
                 dilation=1,
                 groups=1,
                 bias=True):
        super().__init__(in_channels, out_channels, kernel_size, stride, 0,
                         dilation, groups, bias)

    def forward(self, x):
        img_h, img_w = x.size()[-2:]
        kernel_h, kernel_w = self.weight.size()[-2:]
        stride_h, stride_w = self.stride
        ...
# Copyright (c) OpenMMLab. All rights reserved.
import warnings

import torch.nn as nn

from mmcv.utils import _BatchNorm, _InstanceNorm
from ..utils import constant_init, kaiming_init
from .activation import build_activation_layer
from .conv import build_conv_layer
from .norm import build_norm_layer
from .padding import build_padding_layer
from .registry import PLUGIN_LAYERS


@PLUGIN_LAYERS.register_module()
class ConvModule(nn.Module):
    """A conv block that bundles conv/norm/activation layers.
    ...
    _abbr_ = 'conv_block'

    def __init__(self,
                 in_channels,
                 out_channels,
                 kernel_size,
                 stride=1,
                 padding=0,
                 dilation=1,
                 groups=1,
                 bias='auto',
                 conv_cfg=None,
                 norm_cfg=None,
                 act_cfg=dict(type='ReLU'),
                 inplace=True,
                 with_spectral_norm=False,
                 padding_mode='zeros',
                 order=('conv', 'norm', 'act')):
        super(ConvModule, self).__init__()
        assert conv_cfg is None or isinstance(conv_cfg, dict)
        assert norm_cfg is None or isinstance(norm_cfg, dict)
        assert act_cfg is None or isinstance(act_cfg, dict)
        ...
        self.with_explicit_padding = padding_mode not in official_padding_mode
        self.order = order
        assert isinstance(self.order, tuple) and len(self.order) == 3
        assert set(order) == {'conv', 'norm', 'act'}
        self.with_norm = norm_cfg is not None
        self.with_activation = act_cfg is not None
        ...
            norm_channels = out_channels
        else:
            norm_channels = in_channels
        self.norm_name, norm = build_norm_layer(norm_cfg, norm_channels)
        self.add_module(self.norm_name, norm)
        if self.with_bias:
            if isinstance(norm, (_BatchNorm, _InstanceNorm)):
                warnings.warn(
                    'Unnecessary conv bias before batch/instance norm')
        else:
            self.norm_name = None

        # build activation layer
        if self.with_activation:
            act_cfg_ = act_cfg.copy()
            # nn.Tanh has no 'inplace' argument
            if act_cfg_['type'] not in [
                    'Tanh', 'PReLU', 'Sigmoid', 'HSigmoid', 'Swish'
            ]:
                act_cfg_.setdefault('inplace', inplace)
            self.activate = build_activation_layer(act_cfg_)
        ...
        if self.with_norm:
            constant_init(self.norm, 1, bias=0)

    def forward(self, x, activate=True, norm=True):
        for layer in self.order:
            if layer == 'conv':
                if self.with_explicit_padding:
                    ...