Commit 0fd8347d authored by unknown's avatar unknown
Browse files

添加mmclassification-0.24.1代码,删除mmclassification-speed-benchmark

parent cc567e9e
# Visualization
<!-- TOC -->
- [Pipeline Visualization](#pipeline-visualization)
- [Learning Rate Schedule Visualization](#learning-rate-schedule-visualization)
- [Class Activation Map Visualization](#class-activation-map-visualization)
- [FAQs](#faqs)
<!-- TOC -->
## Pipeline Visualization
```bash
python tools/visualizations/vis_pipeline.py \
${CONFIG_FILE} \
[--output-dir ${OUTPUT_DIR}] \
[--phase ${DATASET_PHASE}] \
[--number ${BUNBER_IMAGES_DISPLAY}] \
[--skip-type ${SKIP_TRANSFORM_TYPE}] \
[--mode ${DISPLAY_MODE}] \
[--show] \
[--adaptive] \
[--min-edge-length ${MIN_EDGE_LENGTH}] \
[--max-edge-length ${MAX_EDGE_LENGTH}] \
[--bgr2rgb] \
[--window-size ${WINDOW_SIZE}] \
[--cfg-options ${CFG_OPTIONS}]
```
**Description of all arguments**
- `config` : The path of a model config file.
- `--output-dir`: The output path for visualized images. If not specified, it will be set to `''`, which means not to save.
- `--phase`: Phase of visualizing dataset,must be one of `[train, val, test]`. If not specified, it will be set to `train`.
- `--number`: The number of samples to visualized. If not specified, display all images in the dataset.
- `--skip-type`: The pipelines to be skipped. If not specified, it will be set to `['ToTensor', 'Normalize', 'ImageToTensor', 'Collect']`.
- `--mode`: The display mode, can be one of `[original, pipeline, concat]`. If not specified, it will be set to `concat`.
- `--show`: If set, display pictures in pop-up windows.
- `--adaptive`: If set, adaptively resize images for better visualization.
- `--min-edge-length`: The minimum edge length, used when `--adaptive` is set. When any side of the picture is smaller than `${MIN_EDGE_LENGTH}`, the picture will be enlarged while keeping the aspect ratio unchanged, and the short side will be aligned to `${MIN_EDGE_LENGTH}`. If not specified, it will be set to 200.
- `--max-edge-length`: The maximum edge length, used when `--adaptive` is set. When any side of the picture is larger than `${MAX_EDGE_LENGTH}`, the picture will be reduced while keeping the aspect ratio unchanged, and the long side will be aligned to `${MAX_EDGE_LENGTH}`. If not specified, it will be set to 1000.
- `--bgr2rgb`: If set, flip the color channel order of images.
- `--window-size`: The shape of the display window. If not specified, it will be set to `12*7`. If used, it must be in the format `'W*H'`.
- `--cfg-options` : Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/en/latest/tutorials/config.html).
```{note}
1. If the `--mode` is not specified, it will be set to `concat` as default, get the pictures stitched together by original pictures and transformed pictures; if the `--mode` is set to `original`, get the original pictures; if the `--mode` is set to `transformed`, get the transformed pictures; if the `--mode` is set to `pipeline`, get all the intermediate images through the pipeline.
2. When `--adaptive` option is set, images that are too large or too small will be automatically adjusted, you can use `--min-edge-length` and `--max-edge-length` to set the adjust size.
```
**Examples**
1. In **'original'** mode, visualize 100 original pictures in the `CIFAR100` validation set, then display and save them in the `./tmp` folder:
```shell
python ./tools/visualizations/vis_pipeline.py configs/resnet/resnet50_8xb16_cifar100.py --phase val --output-dir tmp --mode original --number 100 --show --adaptive --bgr2rgb
```
<div align=center><img src="https://user-images.githubusercontent.com/18586273/146117528-1ec2d918-57f8-4ae4-8ca3-a8d31b602f64.jpg" style=" width: auto; height: 40%; "></div>
2. In **'transformed'** mode, visualize all the transformed pictures of the `ImageNet` training set and display them in pop-up windows:
```shell
python ./tools/visualizations/vis_pipeline.py ./configs/resnet/resnet50_8xb32_in1k.py --show --mode transformed
```
<div align=center><img src="https://user-images.githubusercontent.com/18586273/146117553-8006a4ba-e2fa-4f53-99bc-42a4b06e413f.jpg" style=" width: auto; height: 40%; "></div>
3. In **'concat'** mode, visualize 10 pairs of origin and transformed images for comparison in the `ImageNet` train set and save them in the `./tmp` folder:
```shell
python ./tools/visualizations/vis_pipeline.py configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py --phase train --output-dir tmp --number 10 --adaptive
```
<div align=center><img src="https://user-images.githubusercontent.com/18586273/146128259-0a369991-7716-411d-8c27-c6863e6d76ea.JPEG" style=" width: auto; height: 40%; "></div>
4. In **'pipeline'** mode, visualize all the intermediate pictures in the `ImageNet` train set through the pipeline:
```shell
python ./tools/visualizations/vis_pipeline.py configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py --phase train --adaptive --mode pipeline --show
```
<div align=center><img src="https://user-images.githubusercontent.com/18586273/146128201-eb97c2aa-a615-4a81-a649-38db1c315d0e.JPEG" style=" width: auto; height: 40%; "></div>
## Learning Rate Schedule Visualization
```bash
python tools/visualizations/vis_lr.py \
${CONFIG_FILE} \
--dataset-size ${DATASET_SIZE} \
--ngpus ${NUM_GPUs}
--save-path ${SAVE_PATH} \
--title ${TITLE} \
--style ${STYLE} \
--window-size ${WINDOW_SIZE}
--cfg-options
```
**Description of all arguments**
- `config` : The path of a model config file.
- `dataset-size` : The size of the datasets. If set,`build_dataset` will be skipped and `${DATASET_SIZE}` will be used as the size. Default to use the function `build_dataset`.
- `ngpus` : The number of GPUs used in training, default to be 1.
- `save-path` : The learning rate curve plot save path, default not to save.
- `title` : Title of figure. If not set, default to be config file name.
- `style` : Style of plt. If not set, default to be `whitegrid`.
- `window-size`: The shape of the display window. If not specified, it will be set to `12*7`. If used, it must be in the format `'W*H'`.
- `cfg-options` : Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/en/latest/tutorials/config.html).
```{note}
Loading annotations maybe consume much time, you can directly specify the size of the dataset with `dataset-size` to save time.
```
**Examples**
```bash
python tools/visualizations/vis_lr.py configs/resnet/resnet50_b16x8_cifar100.py
```
<div align=center><img src="../_static/image/tools/visualization/lr_schedule1.png" style=" width: auto; height: 40%; "></div>
When using ImageNet, directly specify the size of ImageNet, as below:
```bash
python tools/visualizations/vis_lr.py configs/repvgg/repvgg-B3g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py --dataset-size 1281167 --ngpus 4 --save-path ./repvgg-B3g4_4xb64-lr.jpg
```
<div align=center><img src="../_static/image/tools/visualization/lr_schedule2.png" style=" width: auto; height: 40%; "></div>
## Class Activation Map Visualization
MMClassification provides `tools\visualizations\vis_cam.py` tool to visualize class activation map. Please use `pip install "grad-cam>=1.3.6"` command to install [pytorch-grad-cam](https://github.com/jacobgil/pytorch-grad-cam).
The supported methods are as follows:
| Method | What it does |
| ------------ | ---------------------------------------------------------------------------------------------------------------------------- |
| GradCAM | Weight the 2D activations by the average gradient |
| GradCAM++ | Like GradCAM but uses second order gradients |
| XGradCAM | Like GradCAM but scale the gradients by the normalized activations |
| EigenCAM | Takes the first principle component of the 2D Activations (no class discrimination, but seems to give great results) |
| EigenGradCAM | Like EigenCAM but with class discrimination: First principle component of Activations\*Grad. Looks like GradCAM, but cleaner |
| LayerCAM | Spatially weight the activations by positive gradients. Works better especially in lower layers |
**Command**
```bash
python tools/visualizations/vis_cam.py \
${IMG} \
${CONFIG_FILE} \
${CHECKPOINT} \
[--target-layers ${TARGET-LAYERS}] \
[--preview-model] \
[--method ${METHOD}] \
[--target-category ${TARGET-CATEGORY}] \
[--save-path ${SAVE_PATH}] \
[--vit-like] \
[--num-extra-tokens ${NUM-EXTRA-TOKENS}]
[--aug_smooth] \
[--eigen_smooth] \
[--device ${DEVICE}] \
[--cfg-options ${CFG-OPTIONS}]
```
**Description of all arguments**
- `img` : The target picture path.
- `config` : The path of the model config file.
- `checkpoint` : The path of the checkpoint.
- `--target-layers` : The target layers to get activation maps, one or more network layers can be specified. If not set, use the norm layer of the last block.
- `--preview-model` : Whether to print all network layer names in the model.
- `--method` : Visualization method, supports `GradCAM`, `GradCAM++`, `XGradCAM`, `EigenCAM`, `EigenGradCAM`, `LayerCAM`, which is case insensitive. Defaults to `GradCAM`.
- `--target-category` : Target category, if not set, use the category detected by the given model.
- `--save-path` : The path to save the CAM visualization image. If not set, the CAM image will not be saved.
- `--vit-like` : Whether the network is ViT-like network.
- `--num-extra-tokens` : The number of extra tokens in ViT-like backbones. If not set, use num_extra_tokens the backbone.
- `--aug_smooth` : Whether to use TTA(Test Time Augment) to get CAM.
- `--eigen_smooth` : Whether to use the principal component to reduce noise.
- `--device` : The computing device used. Default to 'cpu'.
- `--cfg-options` : Modifications to the configuration file, refer to [Tutorial 1: Learn about Configs](https://mmclassification.readthedocs.io/en/latest/tutorials/config.html).
```{note}
The argument `--preview-model` can view all network layers names in the given model. It will be helpful if you know nothing about the model layers when setting `--target-layers`.
```
**Examples(CNN)**
Here are some examples of `target-layers` in ResNet-50, which can be any module or layer:
- `'backbone.layer4'` means the output of the forth ResLayer.
- `'backbone.layer4.2'` means the output of the third BottleNeck block in the forth ResLayer.
- `'backbone.layer4.2.conv1'` means the output of the `conv1` layer in above BottleNeck block.
```{note}
For `ModuleList` or `Sequential`, you can also use the index to specify which sub-module is the target layer.
For example, the `backbone.layer4[-1]` is the same as `backbone.layer4.2` since `layer4` is a `Sequential` with three sub-modules.
```
1. Use different methods to visualize CAM for `ResNet50`, the `target-category` is the predicted result by the given checkpoint, using the default `target-layers`.
```shell
python tools/visualizations/vis_cam.py \
demo/bird.JPEG \
configs/resnet/resnet50_8xb32_in1k.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth \
--method GradCAM
# GradCAM++, XGradCAM, EigenCAM, EigenGradCAM, LayerCAM
```
| Image | GradCAM | GradCAM++ | EigenGradCAM | LayerCAM |
| ------------------------------------ | --------------------------------------- | ----------------------------------------- | -------------------------------------------- | ---------------------------------------- |
| <div align=center><img src='https://user-images.githubusercontent.com/18586273/144429496-628d3fb3-1f6e-41ff-aa5c-1b08c60c32a9.JPEG' height="auto" width="160" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/147065002-f1c86516-38b2-47ba-90c1-e00b49556c70.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/147065119-82581fa1-3414-4d6c-a849-804e1503c74b.jpg' height="auto" width="150"></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/147065096-75a6a2c1-6c57-4789-ad64-ebe5e38765f4.jpg' height="auto" width="150"></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/147065129-814d20fb-98be-4106-8c5e-420adcc85295.jpg' height="auto" width="150"></div> |
2. Use different `target-category` to get CAM from the same picture. In `ImageNet` dataset, the category 238 is 'Greater Swiss Mountain dog', the category 281 is 'tabby, tabby cat'.
```shell
python tools/visualizations/vis_cam.py \
demo/cat-dog.png configs/resnet/resnet50_8xb32_in1k.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth \
--target-layers 'backbone.layer4.2' \
--method GradCAM \
--target-category 238
# --target-category 281
```
| Category | Image | GradCAM | XGradCAM | LayerCAM |
| -------- | ---------------------------------------------- | ------------------------------------------------ | ------------------------------------------------- | ------------------------------------------------- |
| Dog | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144429526-f27f4cce-89b9-4117-bfe6-55c2ca7eaba6.png' height="auto" width="165" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144433562-968a57bc-17d9-413e-810e-f91e334d648a.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144433853-319f3a8f-95f2-446d-b84f-3028daca5378.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144433937-daef5a69-fd70-428f-98a3-5e7747f4bb88.jpg' height="auto" width="150" ></div> |
| Cat | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144429526-f27f4cce-89b9-4117-bfe6-55c2ca7eaba6.png' height="auto" width="165" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144434518-867ae32a-1cb5-4dbd-b1b9-5e375e94ea48.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144434603-0a2fd9ec-c02e-4e6c-a17b-64c234808c56.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144434623-b4432cc2-c663-4b97-aed3-583d9d3743e6.jpg' height="auto" width="150" ></div> |
3. Use `--eigen-smooth` and `--aug-smooth` to improve visual effects.
```shell
python tools/visualizations/vis_cam.py \
demo/dog.jpg \
configs/mobilenet_v3/mobilenet-v3-large_8xb32_in1k.py \
https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_large-3ea3c186.pth \
--target-layers 'backbone.layer16' \
--method LayerCAM \
--eigen-smooth --aug-smooth
```
| Image | LayerCAM | eigen-smooth | aug-smooth | eigen&aug |
| ------------------------------------ | --------------------------------------- | ------------------------------------------- | ----------------------------------------- | ----------------------------------------- |
| <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557492-98ac5ce0-61f9-4da9-8ea7-396d0b6a20fa.jpg' height="auto" width="160"></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557541-a4cf7d86-7267-46f9-937c-6f657ea661b4.jpg' height="auto" width="145" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557547-2731b53e-e997-4dd2-a092-64739cc91959.jpg' height="auto" width="145" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557545-8189524a-eb92-4cce-bf6a-760cab4a8065.jpg' height="auto" width="145" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557548-c1e3f3ec-3c96-43d4-874a-3b33cd3351c5.jpg' height="auto" width="145" ></div> |
**Examples(Transformer)**
Here are some examples:
- `'backbone.norm3'` for Swin-Transformer;
- `'backbone.layers[-1].ln1'` for ViT;
For ViT-like networks, such as ViT, T2T-ViT and Swin-Transformer, the features are flattened. And for drawing the CAM, we need to specify the `--vit-like` argument to reshape the features into square feature maps.
Besides the flattened features, some ViT-like networks also add extra tokens like the class token in ViT and T2T-ViT, and the distillation token in DeiT. In these networks, the final classification is done on the tokens computed in the last attention block, and therefore, the classification score will not be affected by other features and the gradient of the classification score with respect to them, will be zero. Therefore, you shouldn't use the output of the last attention block as the target layer in these networks.
To exclude these extra tokens, we need know the number of extra tokens. Almost all transformer-based backbones in MMClassification have the `num_extra_tokens` attribute. If you want to use this tool in a new or third-party network that don't have the `num_extra_tokens` attribute, please specify it the `--num-extra-tokens` argument.
1. Visualize CAM for `Swin Transformer`, using default `target-layers`:
```shell
python tools/visualizations/vis_cam.py \
demo/bird.JPEG \
configs/swin_transformer/swin-tiny_16xb64_in1k.py \
https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925-66df6be6.pth \
--vit-like
```
2. Visualize CAM for `Vision Transformer(ViT)`:
```shell
python tools/visualizations/vis_cam.py \
demo/bird.JPEG \
configs/vision_transformer/vit-base-p16_ft-64xb64_in1k-384.py \
https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-98e8652b.pth \
--vit-like \
--target-layers 'backbone.layers[-1].ln1'
```
3. Visualize CAM for `T2T-ViT`:
```shell
python tools/visualizations/vis_cam.py \
demo/bird.JPEG \
configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py \
https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_3rdparty_8xb64_in1k_20210928-b7c09b62.pth \
--vit-like \
--target-layers 'backbone.encoder[-1].ln1'
```
| Image | ResNet50 | ViT | Swin | T2T-ViT |
| --------------------------------------- | ------------------------------------------ | -------------------------------------- | --------------------------------------- | ------------------------------------------ |
| <div align=center><img src='https://user-images.githubusercontent.com/18586273/144429496-628d3fb3-1f6e-41ff-aa5c-1b08c60c32a9.JPEG' height="auto" width="165" ></div> | <div align=center><img src=https://user-images.githubusercontent.com/18586273/144431491-a2e19fe3-5c12-4404-b2af-a9552f5a95d9.jpg height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144436218-245a11de-6234-4852-9c08-ff5069f6a739.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144436168-01b0e565-442c-4e1e-910c-17c62cff7cd3.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144436198-51dbfbda-c48d-48cc-ae06-1a923d19b6f6.jpg' height="auto" width="150" ></div> |
## FAQs
- None
This source diff could not be displayed because it is too large. You can view the blob instead.
This source diff could not be displayed because it is too large. You can view the blob instead.
# Tutorial 1: Learn about Configs
MMClassification mainly uses python files as configs. The design of our configuration file system integrates modularity and inheritance, facilitating users to conduct various experiments. All configuration files are placed in the `configs` folder, which mainly contains the primitive configuration folder of `_base_` and many algorithm folders such as `resnet`, `swin_transformer`, `vision_transformer`, etc.
If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config.
<!-- TOC -->
- [Config File and Checkpoint Naming Convention](#config-file-and-checkpoint-naming-convention)
- [Config File Structure](#config-file-structure)
- [Inherit and Modify Config File](#inherit-and-modify-config-file)
- [Use intermediate variables in configs](#use-intermediate-variables-in-configs)
- [Ignore some fields in the base configs](#ignore-some-fields-in-the-base-configs)
- [Use some fields in the base configs](#use-some-fields-in-the-base-configs)
- [Modify config through script arguments](#modify-config-through-script-arguments)
- [Import user-defined modules](#import-user-defined-modules)
- [FAQ](#faq)
<!-- TOC -->
## Config File and Checkpoint Naming Convention
We follow the below convention to name config files. Contributors are advised to follow the same style. The config file names are divided into four parts: algorithm info, module information, training information and data information. Logically, different parts are concatenated by underscores `'_'`, and words in the same part are concatenated by dashes `'-'`.
```
{algorithm info}_{module info}_{training info}_{data info}.py
```
- `algorithm info`:algorithm information, model name and neural network architecture, such as resnet, etc.;
- `module info`: module information is used to represent some special neck, head and pretrain information;
- `training info`:Training information, some training schedule, including batch size, lr schedule, data augment and the like;
- `data info`:Data information, dataset name, input size and so on, such as imagenet, cifar, etc.;
### Algorithm information
The main algorithm name and the corresponding branch architecture information. E.g:
- `resnet50`
- `mobilenet-v3-large`
- `vit-small-patch32` : `patch32` represents the size of the partition in `ViT` algorithm;
- `seresnext101-32x4d` : `SeResNet101` network structure, `32x4d` means that `groups` and `width_per_group` are 32 and 4 respectively in `Bottleneck`;
### Module information
Some special `neck`, `head` and `pretrain` information. In classification tasks, `pretrain` information is the most commonly used:
- `in21k-pre` : pre-trained on ImageNet21k;
- `in21k-pre-3rd-party` : pre-trained on ImageNet21k and the checkpoint is converted from a third-party repository;
### Training information
Training schedule, including training type, `batch size`, `lr schedule`, data augment, special loss functions and so on:
- format `{gpu x batch_per_gpu}`, such as `8xb32`
Training type (mainly seen in the transformer network, such as the `ViT` algorithm, which is usually divided into two training type: pre-training and fine-tuning):
- `ft` : configuration file for fine-tuning
- `pt` : configuration file for pretraining
Training recipe. Usually, only the part that is different from the original paper will be marked. These methods will be arranged in the order `{pipeline aug}-{train aug}-{loss trick}-{scheduler}-{epochs}`.
- `coslr-200e` : use cosine scheduler to train 200 epochs
- `autoaug-mixup-lbs-coslr-50e` : use `autoaug`, `mixup`, `label smooth`, `cosine scheduler` to train 50 epochs
### Data information
- `in1k` : `ImageNet1k` dataset, default to use the input image size of 224x224;
- `in21k` : `ImageNet21k` dataset, also called `ImageNet22k` dataset, default to use the input image size of 224x224;
- `in1k-384px` : Indicates that the input image size is 384x384;
- `cifar100`
### Config File Name Example
```
repvgg-D2se_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py
```
- `repvgg-D2se`: Algorithm information
- `repvgg`: The main algorithm.
- `D2se`: The architecture.
- `deploy`: Module information, means the backbone is in the deploy state.
- `4xb64-autoaug-lbs-mixup-coslr-200e`: Training information.
- `4xb64`: Use 4 GPUs and the size of batches per GPU is 64.
- `autoaug`: Use `AutoAugment` in training pipeline.
- `lbs`: Use label smoothing loss.
- `mixup`: Use `mixup` training augment method.
- `coslr`: Use cosine learning rate scheduler.
- `200e`: Train the model for 200 epochs.
- `in1k`: Dataset information. The config is for `ImageNet1k` dataset and the input size is `224x224`.
```{note}
Some configuration files currently do not follow this naming convention, and related files will be updated in the near future.
```
### Checkpoint Naming Convention
The naming of the weight mainly includes the configuration file name, date and hash value.
```
{config_name}_{date}-{hash}.pth
```
## Config File Structure
There are four kinds of basic component file in the `configs/_base_` folders, namely:
- [models](https://github.com/open-mmlab/mmclassification/tree/master/configs/_base_/models)
- [datasets](https://github.com/open-mmlab/mmclassification/tree/master/configs/_base_/datasets)
- [schedules](https://github.com/open-mmlab/mmclassification/tree/master/configs/_base_/schedules)
- [runtime](https://github.com/open-mmlab/mmclassification/blob/master/configs/_base_/default_runtime.py)
You can easily build your own training config file by inherit some base config files. And the configs that are composed by components from `_base_` are called _primitive_.
For easy understanding, we use [ResNet50 primitive config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet50_8xb32_in1k.py) as a example and comment the meaning of each line. For more detaile, please refer to the API documentation.
```python
_base_ = [
'../_base_/models/resnet50.py', # model
'../_base_/datasets/imagenet_bs32.py', # data
'../_base_/schedules/imagenet_bs256.py', # training schedule
'../_base_/default_runtime.py' # runtime setting
]
```
The four parts are explained separately below, and the above-mentioned ResNet50 primitive config are also used as an example.
### model
The parameter `"model"` is a python dictionary in the configuration file, which mainly includes information such as network structure and loss function:
- `type` : Classifier name, MMCls supports `ImageClassifier`, refer to [API documentation](https://mmclassification.readthedocs.io/en/latest/api/models.html#classifier).
- `backbone` : Backbone configs, refer to [API documentation](https://mmclassification.readthedocs.io/en/latest/api/models.html#backbones) for available options.
- `neck` :Neck network name, MMCls supports `GlobalAveragePooling`, please refer to [API documentation](https://mmclassification.readthedocs.io/en/latest/api/models.html#necks).
- `head`: Head network name, MMCls supports single-label and multi-label classification head networks, available options refer to [API documentation](https://mmclassification.readthedocs.io/en/latest/api/models.html#heads).
- `loss`: Loss function type, supports `CrossEntropyLoss`, [`LabelSmoothLoss`](https://github.com/open-mmlab/mmclassification/blob/master/configs/_base_/models/resnet50_label_smooth.py) etc., For available options, refer to [API documentation](https://mmclassification.readthedocs.io/en/latest/api/models.html#losses).
- `train_cfg` :Training augment config, MMCls supports [`mixup`](https://github.com/open-mmlab/mmclassification/blob/master/configs/_base_/models/resnet50_mixup.py), [`cutmix`](https://github.com/open-mmlab/mmclassification/blob/master/configs/_base_/models/resnet50_cutmix.py) and other augments.
```{note}
The 'type' in the configuration file is not a constructed parameter, but a class name.
```
```python
model = dict(
type='ImageClassifier', # Classifier name
backbone=dict(
type='ResNet', # Backbones name
depth=50, # depth of backbone, ResNet has options of 18, 34, 50, 101, 152.
num_stages=4, # number of stages,The feature maps generated by these states are used as the input for the subsequent neck and head.
out_indices=(3, ), # The output index of the output feature maps.
frozen_stages=-1, # the stage to be frozen, '-1' means not be forzen
style='pytorch'), # The style of backbone, 'pytorch' means that stride 2 layers are in 3x3 conv, 'caffe' means stride 2 layers are in 1x1 convs.
neck=dict(type='GlobalAveragePooling'), # neck network name
head=dict(
type='LinearClsHead', # linear classification head,
num_classes=1000, # The number of output categories, consistent with the number of categories in the dataset
in_channels=2048, # The number of input channels, consistent with the output channel of the neck
loss=dict(type='CrossEntropyLoss', loss_weight=1.0), # Loss function configuration information
topk=(1, 5), # Evaluation index, Top-k accuracy rate, here is the accuracy rate of top1 and top5
))
```
### data
The parameter `"data"` is a python dictionary in the configuration file, which mainly includes information to construct dataloader:
- `samples_per_gpu` : the BatchSize of each GPU when building the dataloader
- `workers_per_gpu` : the number of threads per GPU when building dataloader
- `train | val | test` : config to construct dataset
- `type`: Dataset name, MMCls supports `ImageNet`, `Cifar` etc., refer to [API documentation](https://mmclassification.readthedocs.io/en/latest/api/datasets.html)
- `data_prefix` : Dataset root directory
- `pipeline` : Data processing pipeline, refer to related tutorial [CUSTOM DATA PIPELINES](https://mmclassification.readthedocs.io/en/latest/tutorials/data_pipeline.html)
The parameter `evaluation` is also a dictionary, which is the configuration information of `evaluation hook`, mainly including evaluation interval, evaluation index, etc..
```python
# dataset settings
dataset_type = 'ImageNet' # dataset name,
img_norm_cfg = dict( # Image normalization config to normalize the input images
mean=[123.675, 116.28, 103.53], # Mean values used to pre-training the pre-trained backbone models
std=[58.395, 57.12, 57.375], # Standard variance used to pre-training the pre-trained backbone models
to_rgb=True) # Whether to invert the color channel, rgb2bgr or bgr2rgb.
# train data pipeline
train_pipeline = [
dict(type='LoadImageFromFile'), # First pipeline to load images from file path
dict(type='RandomResizedCrop', size=224), # RandomResizedCrop
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'), # Randomly flip the picture horizontally with a probability of 0.5
dict(type='Normalize', **img_norm_cfg), # normalization
dict(type='ImageToTensor', keys=['img']), # convert image from numpy into torch.Tensor
dict(type='ToTensor', keys=['gt_label']), # convert gt_label into torch.Tensor
dict(type='Collect', keys=['img', 'gt_label']) # Pipeline that decides which keys in the data should be passed to the detector
]
# test data pipeline
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', size=(256, -1)),
dict(type='CenterCrop', crop_size=224),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']) # do not pass gt_label while testing
]
data = dict(
samples_per_gpu=32, # Batch size of a single GPU
workers_per_gpu=2, # Worker to pre-fetch data for each single GPU
train=dict( # Train dataset config
train=dict( # train data config
type=dataset_type, # dataset name
data_prefix='data/imagenet/train', # Dataset root, when ann_file does not exist, the category information is automatically obtained from the root folder
pipeline=train_pipeline), # train data pipeline
val=dict( # val data config
type=dataset_type,
data_prefix='data/imagenet/val',
ann_file='data/imagenet/meta/val.txt', # ann_file existes, the category information is obtained from file
pipeline=test_pipeline),
test=dict( # test data config
type=dataset_type,
data_prefix='data/imagenet/val',
ann_file='data/imagenet/meta/val.txt',
pipeline=test_pipeline))
evaluation = dict( # The config to build the evaluation hook, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/evaluation/eval_hooks.py#L7 for more details.
interval=1, # Evaluation interval
metric='accuracy') # Metrics used during evaluation
```
### training schedule
Mainly include optimizer settings, `optimizer hook` settings, learning rate schedule and `runner` settings:
- `optimizer`: optimizer setting , support all optimizers in `pytorch`, refer to related [mmcv](https://mmcv.readthedocs.io/en/latest/_modules/mmcv/runner/optimizer/default_constructor.html#DefaultOptimizerConstructor) documentation.
- `optimizer_config`: `optimizer hook` configuration file, such as setting gradient limit, refer to related [mmcv](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py#L8) code.
- `lr_config`: Learning rate scheduler, supports "CosineAnnealing", "Step", "Cyclic", etc. refer to related [mmcv](https://mmcv.readthedocs.io/en/latest/_modules/mmcv/runner/hooks/lr_updater.html#LrUpdaterHook) documentation for more options.
- `runner`: For `runner`, please refer to `mmcv` for [`runner`](https://mmcv.readthedocs.io/en/latest/understand_mmcv/runner.html) introduction document.
```python
# he configuration file used to build the optimizer, support all optimizers in PyTorch.
optimizer = dict(type='SGD', # Optimizer type
lr=0.1, # Learning rate of optimizers, see detail usages of the parameters in the documentation of PyTorch
momentum=0.9, # Momentum
weight_decay=0.0001) # Weight decay of SGD
# Config used to build the optimizer hook, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py#L8 for implementation details.
optimizer_config = dict(grad_clip=None) # Most of the methods do not use gradient clip
# Learning rate scheduler config used to register LrUpdater hook
lr_config = dict(policy='step', # The policy of scheduler, also support CosineAnnealing, Cyclic, etc. Refer to details of supported LrUpdater from https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py#L9.
step=[30, 60, 90]) # Steps to decay the learning rate
runner = dict(type='EpochBasedRunner', # Type of runner to use (i.e. IterBasedRunner or EpochBasedRunner)
max_epochs=100) # Runner that runs the workflow in total max_epochs. For IterBasedRunner use `max_iters`
```
### runtime setting
This part mainly includes saving the checkpoint strategy, log configuration, training parameters, breakpoint weight path, working directory, etc..
```python
# Config to set the checkpoint hook, Refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/checkpoint.py for implementation.
checkpoint_config = dict(interval=1) # The save interval is 1
# config to register logger hook
log_config = dict(
interval=100, # Interval to print the log
hooks=[
dict(type='TextLoggerHook'), # The Tensorboard logger is also supported
# dict(type='TensorboardLoggerHook')
])
dist_params = dict(backend='nccl') # Parameters to setup distributed training, the port can also be set.
log_level = 'INFO' # The output level of the log.
resume_from = None # Resume checkpoints from a given path, the training will be resumed from the epoch when the checkpoint's is saved.
workflow = [('train', 1)] # Workflow for runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once.
work_dir = 'work_dir' # Directory to save the model checkpoints and logs for the current experiments.
```
## Inherit and Modify Config File
For easy understanding, we recommend contributors to inherit from existing methods.
For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum of inheritance level is 3.
For example, if your config file is based on ResNet with some other modification, you can first inherit the basic ResNet structure, dataset and other training setting by specifying `_base_ ='./resnet50_8xb32_in1k.py'` (The path relative to your config file), and then modify the necessary parameters in the config file. A more specific example, now we want to use almost all configs in `configs/resnet/resnet50_8xb32_in1k.py`, but change the number of training epochs from 100 to 300, modify when to decay the learning rate, and modify the dataset path, you can create a new config file `configs/resnet/resnet50_8xb32-300e_in1k.py` with content as below:
```python
_base_ = './resnet50_8xb32_in1k.py'
runner = dict(max_epochs=300)
lr_config = dict(step=[150, 200, 250])
data = dict(
train=dict(data_prefix='mydata/imagenet/train'),
val=dict(data_prefix='mydata/imagenet/train', ),
test=dict(data_prefix='mydata/imagenet/train', )
)
```
### Use intermediate variables in configs
Some intermediate variables are used in the configuration file. The intermediate variables make the configuration file clearer and easier to modify.
For example, `train_pipeline` / `test_pipeline` is the intermediate variable of the data pipeline. We first need to define `train_pipeline` / `test_pipeline`, and then pass them to `data`. If you want to modify the size of the input image during training and testing, you need to modify the intermediate variables of `train_pipeline` / `test_pipeline`.
```python
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='RandomResizedCrop', size=384, backend='pillow',),
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='ToTensor', keys=['gt_label']),
dict(type='Collect', keys=['img', 'gt_label'])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', size=384, backend='pillow'),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
]
data = dict(
train=dict(pipeline=train_pipeline),
val=dict(pipeline=test_pipeline),
test=dict(pipeline=test_pipeline))
```
### Ignore some fields in the base configs
Sometimes, you need to set `_delete_=True` to ignore some domain content in the basic configuration file. You can refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#inherit-from-base-config-with-ignored-fields) for more instructions.
The following is an example. If you want to use cosine schedule in the above ResNet50 case, just using inheritance and directly modify it will report `get unexcepected keyword'step'` error, because the `'step'` field of the basic config in `lr_config` domain information is reserved, and you need to add `_delete_ =True` to ignore the content of `lr_config` related fields in the basic configuration file:
```python
_base_ = '../../configs/resnet/resnet50_8xb32_in1k.py'
lr_config = dict(
_delete_=True,
policy='CosineAnnealing',
min_lr=0,
warmup='linear',
by_epoch=True,
warmup_iters=5,
warmup_ratio=0.1
)
```
### Use some fields in the base configs
Sometimes, you may refer to some fields in the `_base_` config, so as to avoid duplication of definitions. You can refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#reference-variables-from-base) for some more instructions.
The following is an example of using auto augment in the training data preprocessing pipeline, refer to [`configs/_base_/datasets/imagenet_bs64_autoaug.py`](https://github.com/open-mmlab/mmclassification/blob/master/configs/_base_/datasets/imagenet_bs64_autoaug.py). When defining `train_pipeline`, just add the definition file name of auto augment to `_base_`, and then use `{{_base_.auto_increasing_policies}}` to reference the variables:
```python
_base_ = ['./pipelines/auto_aug.py']
# dataset settings
dataset_type = 'ImageNet'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='RandomResizedCrop', size=224),
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
dict(type='AutoAugment', policies={{_base_.auto_increasing_policies}}),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='ToTensor', keys=['gt_label']),
dict(type='Collect', keys=['img', 'gt_label'])
]
test_pipeline = [...]
data = dict(
samples_per_gpu=64,
workers_per_gpu=2,
train=dict(..., pipeline=train_pipeline),
val=dict(..., pipeline=test_pipeline))
evaluation = dict(interval=1, metric='accuracy')
```
## Modify config through script arguments
When users use the script "tools/train.py" or "tools/test.py" to submit tasks or use some other tools, they can directly modify the content of the configuration file used by specifying the `--cfg-options` parameter.
- Update config keys of dict chains.
The config options can be specified following the order of the dict keys in the original config.
For example, `--cfg-options model.backbone.norm_eval=False` changes the all BN modules in model backbones to `train` mode.
- Update keys inside a list of configs.
Some config dicts are composed as a list in your config. For example, the training pipeline `data.train.pipeline` is normally a list
e.g. `[dict(type='LoadImageFromFile'), dict(type='TopDownRandomFlip', flip_prob=0.5), ...]`. If you want to change `'flip_prob=0.5'` to `'flip_prob=0.0'` in the pipeline,
you may specify `--cfg-options data.train.pipeline.1.flip_prob=0.0`.
- Update values of list/tuples.
If the value to be updated is a list or a tuple. For example, the config file normally sets `workflow=[('train', 1)]`. If you want to
change this key, you may specify `--cfg-options workflow="[(train,1),(val,1)]"`. Note that the quotation mark " is necessary to
support list/tuple data types, and that **NO** white space is allowed inside the quotation marks in the specified value.
## Import user-defined modules
```{note}
This part may only be used when using MMClassification as a third party library to build your own project, and beginners can skip it.
```
After studying the follow-up tutorials [ADDING NEW DATASET](https://mmclassification.readthedocs.io/en/latest/tutorials/new_dataset.html), [CUSTOM DATA PIPELINES](https://mmclassification.readthedocs.io/en/latest/tutorials/data_pipeline.html), [ADDING NEW MODULES](https://mmclassification.readthedocs.io/en/latest/tutorials/new_modules.html). You may use MMClassification to complete your project and create new classes of datasets, models, data enhancements, etc. in the project. In order to streamline the code, you can use MMClassification as a third-party library, you just need to keep your own extra code and import your own custom module in the configuration files. For examples, you may refer to [OpenMMLab Algorithm Competition Project](https://github.com/zhangrui-wolf/openmmlab-competition-2021) .
Add the following code to your own configuration files:
```python
custom_imports = dict(
imports=['your_dataset_class',
'your_transforme_class',
'your_model_class',
'your_module_class'],
allow_failed_imports=False)
```
## FAQ
- None
# Tutorial 4: Custom Data Pipelines
## Design of Data pipelines
Following typical conventions, we use `Dataset` and `DataLoader` for data loading
with multiple workers. Indexing `Dataset` returns a dict of data items corresponding to
the arguments of models forward method.
The data preparation pipeline and the dataset is decomposed. Usually a dataset
defines how to process the annotations and a data pipeline defines all the steps to prepare a data dict.
A pipeline consists of a sequence of operations. Each operation takes a dict as input and also output a dict for the next transform.
The operations are categorized into data loading, pre-processing and formatting.
Here is an pipeline example for ResNet-50 training on ImageNet.
```python
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='RandomResizedCrop', size=224),
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='ToTensor', keys=['gt_label']),
dict(type='Collect', keys=['img', 'gt_label'])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', size=256),
dict(type='CenterCrop', crop_size=224),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
]
```
For each operation, we list the related dict fields that are added/updated/removed.
At the end of the pipeline, we use `Collect` to only retain the necessary items for forward computation.
### Data loading
`LoadImageFromFile`
- add: img, img_shape, ori_shape
By default, `LoadImageFromFile` loads images from disk but it may lead to IO bottleneck for efficient small models.
Various backends are supported by mmcv to accelerate this process. For example, if the training machines have setup
[memcached](https://memcached.org/), we can revise the config as follows.
```
memcached_root = '/mnt/xxx/memcached_client/'
train_pipeline = [
dict(
type='LoadImageFromFile',
file_client_args=dict(
backend='memcached',
server_list_cfg=osp.join(memcached_root, 'server_list.conf'),
client_cfg=osp.join(memcached_root, 'client.conf'))),
]
```
More supported backends can be found in [mmcv.fileio.FileClient](https://github.com/open-mmlab/mmcv/blob/master/mmcv/fileio/file_client.py).
### Pre-processing
`Resize`
- add: scale, scale_idx, pad_shape, scale_factor, keep_ratio
- update: img, img_shape
`RandomFlip`
- add: flip, flip_direction
- update: img
`RandomCrop`
- update: img, pad_shape
`Normalize`
- add: img_norm_cfg
- update: img
### Formatting
`ToTensor`
- update: specified by `keys`.
`ImageToTensor`
- update: specified by `keys`.
`Collect`
- remove: all other keys except for those specified by `keys`
For more information about other data transformation classes, please refer to [Data Transformations](../api/transforms.rst)
## Extend and use custom pipelines
1. Write a new pipeline in any file, e.g., `my_pipeline.py`, and place it in
the folder `mmcls/datasets/pipelines/`. The pipeline class needs to override
the `__call__` method which takes a dict as input and returns a dict.
```python
from mmcls.datasets import PIPELINES
@PIPELINES.register_module()
class MyTransform(object):
def __call__(self, results):
# apply transforms on results['img']
return results
```
2. Import the new class in `mmcls/datasets/pipelines/__init__.py`.
```python
...
from .my_pipeline import MyTransform
__all__ = [
..., 'MyTransform'
]
```
3. Use it in config files.
```python
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='RandomResizedCrop', size=224),
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
dict(type='MyTransform'),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='ToTensor', keys=['gt_label']),
dict(type='Collect', keys=['img', 'gt_label'])
]
```
## Pipeline visualization
After designing data pipelines, you can use the [visualization tools](../tools/visualization.md) to view the performance.
# Tutorial 2: Fine-tune Models
Classification models pre-trained on the ImageNet dataset have been demonstrated to be effective for other datasets and other downstream tasks.
This tutorial provides instructions for users to use the models provided in the [Model Zoo](../model_zoo.md) for other datasets to obtain better performance.
There are two steps to fine-tune a model on a new dataset.
- Add support for the new dataset following [Tutorial 3: Customize Dataset](new_dataset.md).
- Modify the configs as will be discussed in this tutorial.
Assume we have a ResNet-50 model pre-trained on the ImageNet-2012 dataset and want
to take the fine-tuning on the CIFAR-10 dataset, we need to modify five parts in the
config.
## Inherit base configs
At first, create a new config file
`configs/tutorial/resnet50_finetune_cifar.py` to store our configs. Of course,
the path can be customized by yourself.
To reuse the common parts among different configs, we support inheriting
configs from multiple existing configs. To fine-tune a ResNet-50 model, the new
config needs to inherit `configs/_base_/models/resnet50.py` to build the basic
structure of the model. To use the CIFAR-10 dataset, the new config can also
simply inherit `configs/_base_/datasets/cifar10_bs16.py`. For runtime settings such as
training schedules, the new config needs to inherit
`configs/_base_/default_runtime.py`.
To inherit all above configs, put the following code at the config file.
```python
_base_ = [
'../_base_/models/resnet50.py',
'../_base_/datasets/cifar10_bs16.py', '../_base_/default_runtime.py'
]
```
Besides, you can also choose to write the whole contents rather than use inheritance,
like [`configs/lenet/lenet5_mnist.py`](https://github.com/open-mmlab/mmclassification/blob/master/configs/lenet/lenet5_mnist.py).
## Modify model
When fine-tuning a model, usually we want to load the pre-trained backbone
weights and train a new classification head.
To load the pre-trained backbone, we need to change the initialization config
of the backbone and use `Pretrained` initialization function. Besides, in the
`init_cfg`, we use `prefix='backbone'` to tell the initialization
function to remove the prefix of keys in the checkpoint, for example, it will
change `backbone.conv1` to `conv1`. And here we use an online checkpoint, it
will be downloaded during training, you can also download the model manually
and use a local path.
And then we need to modify the head according to the class numbers of the new
datasets by just changing `num_classes` in the head.
```python
model = dict(
backbone=dict(
init_cfg=dict(
type='Pretrained',
checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
prefix='backbone',
)),
head=dict(num_classes=10),
)
```
```{tip}
Here we only need to set the part of configs we want to modify, because the
inherited configs will be merged and get the entire configs.
```
Sometimes, we want to freeze the first several layers' parameters of the
backbone, that will help the network to keep ability to extract low-level
information learnt from pre-trained model. In MMClassification, you can simply
specify how many layers to freeze by `frozen_stages` argument. For example, to
freeze the first two layers' parameters, just use the following config:
```python
model = dict(
backbone=dict(
frozen_stages=2,
init_cfg=dict(
type='Pretrained',
checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
prefix='backbone',
)),
head=dict(num_classes=10),
)
```
```{note}
Not all backbones support the `frozen_stages` argument by now. Please check
[the docs](https://mmclassification.readthedocs.io/en/latest/api/models.html#backbones)
to confirm if your backbone supports it.
```
## Modify dataset
When fine-tuning on a new dataset, usually we need to modify some dataset
configs. Here, we need to modify the pipeline to resize the image from 32 to
224 to fit the input size of the model pre-trained on ImageNet, and some other
configs.
```python
img_norm_cfg = dict(
mean=[125.307, 122.961, 113.8575],
std=[51.5865, 50.847, 51.255],
to_rgb=False,
)
train_pipeline = [
dict(type='RandomCrop', size=32, padding=4),
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
dict(type='Resize', size=224),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='ToTensor', keys=['gt_label']),
dict(type='Collect', keys=['img', 'gt_label']),
]
test_pipeline = [
dict(type='Resize', size=224),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
]
data = dict(
train=dict(pipeline=train_pipeline),
val=dict(pipeline=test_pipeline),
test=dict(pipeline=test_pipeline),
)
```
## Modify training schedule
The fine-tuning hyper parameters vary from the default schedule. It usually
requires smaller learning rate and less training epochs.
```python
# lr is set for a batch size of 128
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[15])
runner = dict(type='EpochBasedRunner', max_epochs=200)
log_config = dict(interval=100)
```
## Start Training
Now, we have finished the fine-tuning config file as following:
```python
_base_ = [
'../_base_/models/resnet50.py',
'../_base_/datasets/cifar10_bs16.py', '../_base_/default_runtime.py'
]
# Model config
model = dict(
backbone=dict(
frozen_stages=2,
init_cfg=dict(
type='Pretrained',
checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
prefix='backbone',
)),
head=dict(num_classes=10),
)
# Dataset config
img_norm_cfg = dict(
mean=[125.307, 122.961, 113.8575],
std=[51.5865, 50.847, 51.255],
to_rgb=False,
)
train_pipeline = [
dict(type='RandomCrop', size=32, padding=4),
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
dict(type='Resize', size=224),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='ToTensor', keys=['gt_label']),
dict(type='Collect', keys=['img', 'gt_label']),
]
test_pipeline = [
dict(type='Resize', size=224),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
]
data = dict(
train=dict(pipeline=train_pipeline),
val=dict(pipeline=test_pipeline),
test=dict(pipeline=test_pipeline),
)
# Training schedule config
# lr is set for a batch size of 128
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[15])
runner = dict(type='EpochBasedRunner', max_epochs=200)
log_config = dict(interval=100)
```
Here we use 8 GPUs on your computer to train the model with the following
command:
```shell
bash tools/dist_train.sh configs/tutorial/resnet50_finetune_cifar.py 8
```
Also, you can use only one GPU to train the model with the following command:
```shell
python tools/train.py configs/tutorial/resnet50_finetune_cifar.py
```
But wait, an important config need to be changed if using one GPU. We need to
change the dataset config as following:
```python
data = dict(
samples_per_gpu=128,
train=dict(pipeline=train_pipeline),
val=dict(pipeline=test_pipeline),
test=dict(pipeline=test_pipeline),
)
```
It's because our training schedule is for a batch size of 128. If using 8 GPUs,
just use `samples_per_gpu=16` config in the base config file, and the total batch
size will be 128. But if using one GPU, you need to change it to 128 manually to
match the training schedule.
# Tutorial 3: Customize Dataset
We support many common public datasets for image classification task, you can find them in
[this page](https://mmclassification.readthedocs.io/en/latest/api/datasets.html).
In this section, we demonstrate how to [use your own dataset](#use-your-own-dataset)
and [use dataset wrapper](#use-dataset-wrapper).
## Use your own dataset
### Reorganize dataset to existing format
The simplest way to use your own dataset is to convert it to existing dataset formats.
For multi-class classification task, we recommend to use the format of
[`CustomDataset`](https://mmclassification.readthedocs.io/en/latest/api/datasets.html#mmcls.datasets.CustomDataset).
The `CustomDataset` supports two kinds of format:
1. An annotation file is provided, and each line indicates a sample image.
The sample images can be organized in any structure, like:
```
train/
├── folder_1
│ ├── xxx.png
│ ├── xxy.png
│ └── ...
├── 123.png
├── nsdf3.png
└── ...
```
And an annotation file records all paths of samples and corresponding
category index. The first column is the image path relative to the folder
(in this example, `train`) and the second column is the index of category:
```
folder_1/xxx.png 0
folder_1/xxy.png 1
123.png 1
nsdf3.png 2
...
```
```{note}
The value of the category indices should fall in range `[0, num_classes - 1]`.
```
2. The sample images are arranged in the special structure:
```
train/
├── cat
│ ├── xxx.png
│ ├── xxy.png
│ └── ...
│ └── xxz.png
├── bird
│ ├── bird1.png
│ ├── bird2.png
│ └── ...
└── dog
├── 123.png
├── nsdf3.png
├── ...
└── asd932_.png
```
In this case, you don't need provide annotation file, and all images in the directory `cat` will be
recognized as samples of `cat`.
Usually, we will split the whole dataset to three sub datasets: `train`, `val`
and `test` for training, validation and test. And **every** sub dataset should
be organized as one of the above structures.
For example, the whole dataset is as below (using the first structure):
```
mmclassification
└── data
└── my_dataset
├── meta
│ ├── train.txt
│ ├── val.txt
│ └── test.txt
├── train
├── val
└── test
```
And in your config file, you can modify the `data` field as below:
```python
...
dataset_type = 'CustomDataset'
classes = ['cat', 'bird', 'dog'] # The category names of your dataset
data = dict(
train=dict(
type=dataset_type,
data_prefix='data/my_dataset/train',
ann_file='data/my_dataset/meta/train.txt',
classes=classes,
pipeline=train_pipeline
),
val=dict(
type=dataset_type,
data_prefix='data/my_dataset/val',
ann_file='data/my_dataset/meta/val.txt',
classes=classes,
pipeline=test_pipeline
),
test=dict(
type=dataset_type,
data_prefix='data/my_dataset/test',
ann_file='data/my_dataset/meta/test.txt',
classes=classes,
pipeline=test_pipeline
)
)
...
```
### Create a new dataset class
You can write a new dataset class inherited from `BaseDataset`, and overwrite `load_annotations(self)`,
like [CIFAR10](https://github.com/open-mmlab/mmclassification/blob/master/mmcls/datasets/cifar.py) and
[CustomDataset](https://github.com/open-mmlab/mmclassification/blob/master/mmcls/datasets/custom.py).
Typically, this function returns a list, where each sample is a dict, containing necessary data information,
e.g., `img` and `gt_label`.
Assume we are going to implement a `Filelist` dataset, which takes filelists for both training and testing.
The format of annotation list is as follows:
```
000001.jpg 0
000002.jpg 1
```
We can create a new dataset in `mmcls/datasets/filelist.py` to load the data.
```python
import mmcv
import numpy as np
from .builder import DATASETS
from .base_dataset import BaseDataset
@DATASETS.register_module()
class Filelist(BaseDataset):
def load_annotations(self):
assert isinstance(self.ann_file, str)
data_infos = []
with open(self.ann_file) as f:
samples = [x.strip().split(' ') for x in f.readlines()]
for filename, gt_label in samples:
info = {'img_prefix': self.data_prefix}
info['img_info'] = {'filename': filename}
info['gt_label'] = np.array(gt_label, dtype=np.int64)
data_infos.append(info)
return data_infos
```
And add this dataset class in `mmcls/datasets/__init__.py`
```python
from .base_dataset import BaseDataset
...
from .filelist import Filelist
__all__ = [
'BaseDataset', ... ,'Filelist'
]
```
Then in the config, to use `Filelist` you can modify the config as the following
```python
train = dict(
type='Filelist',
ann_file='image_list.txt',
pipeline=train_pipeline
)
```
## Use dataset wrapper
The dataset wrapper is a kind of class to change the behavior of dataset class, such as repeat the dataset or
re-balance the samples of different categories.
### Repeat dataset
We use `RepeatDataset` as wrapper to repeat the dataset. For example, suppose the original dataset is
`Dataset_A`, to repeat it, the config looks like the following
```python
data = dict(
train = dict(
type='RepeatDataset',
times=N,
dataset=dict( # This is the original config of Dataset_A
type='Dataset_A',
...
pipeline=train_pipeline
)
)
...
)
```
### Class balanced dataset
We use `ClassBalancedDataset` as wrapper to repeat the dataset based on category frequency. The dataset to
repeat needs to implement method `get_cat_ids(idx)` to support `ClassBalancedDataset`. For example, to repeat
`Dataset_A` with `oversample_thr=1e-3`, the config looks like the following
```python
data = dict(
train = dict(
type='ClassBalancedDataset',
oversample_thr=1e-3,
dataset=dict( # This is the original config of Dataset_A
type='Dataset_A',
...
pipeline=train_pipeline
)
)
...
)
```
You may refer to [API reference](https://mmclassification.readthedocs.io/en/latest/api/datasets.html#mmcls.datasets.ClassBalancedDataset) for details.
# Tutorial 5: Adding New Modules
## Develop new components
We basically categorize model components into 3 types.
- backbone: usually an feature extraction network, e.g., ResNet, MobileNet.
- neck: the component between backbones and heads, e.g., GlobalAveragePooling.
- head: the component for specific tasks, e.g., classification or regression.
### Add new backbones
Here we show how to develop new components with an example of ResNet_CIFAR.
As the input size of CIFAR is 32x32, this backbone replaces the `kernel_size=7, stride=2` to `kernel_size=3, stride=1` and remove the MaxPooling after stem, to avoid forwarding small feature maps to residual blocks.
It inherits from ResNet and only modifies the stem layers.
1. Create a new file `mmcls/models/backbones/resnet_cifar.py`.
```python
import torch.nn as nn
from ..builder import BACKBONES
from .resnet import ResNet
@BACKBONES.register_module()
class ResNet_CIFAR(ResNet):
"""ResNet backbone for CIFAR.
short description of the backbone
Args:
depth(int): Network depth, from {18, 34, 50, 101, 152}.
...
"""
def __init__(self, depth, deep_stem, **kwargs):
# call ResNet init
super(ResNet_CIFAR, self).__init__(depth, deep_stem=deep_stem, **kwargs)
# other specific initialization
assert not self.deep_stem, 'ResNet_CIFAR do not support deep_stem'
def _make_stem_layer(self, in_channels, base_channels):
# override ResNet method to modify the network structure
self.conv1 = build_conv_layer(
self.conv_cfg,
in_channels,
base_channels,
kernel_size=3,
stride=1,
padding=1,
bias=False)
self.norm1_name, norm1 = build_norm_layer(
self.norm_cfg, base_channels, postfix=1)
self.add_module(self.norm1_name, norm1)
self.relu = nn.ReLU(inplace=True)
def forward(self, x): # should return a tuple
pass # implementation is ignored
def init_weights(self, pretrained=None):
pass # override ResNet init_weights if necessary
def train(self, mode=True):
pass # override ResNet train if necessary
```
2. Import the module in `mmcls/models/backbones/__init__.py`.
```python
...
from .resnet_cifar import ResNet_CIFAR
__all__ = [
..., 'ResNet_CIFAR'
]
```
3. Use it in your config file.
```python
model = dict(
...
backbone=dict(
type='ResNet_CIFAR',
depth=18,
other_arg=xxx),
...
```
### Add new necks
Here we take `GlobalAveragePooling` as an example. It is a very simple neck without any arguments.
To add a new neck, we mainly implement the `forward` function, which applies some operation on the output from backbone and forward the results to head.
1. Create a new file in `mmcls/models/necks/gap.py`.
```python
import torch.nn as nn
from ..builder import NECKS
@NECKS.register_module()
class GlobalAveragePooling(nn.Module):
def __init__(self):
self.gap = nn.AdaptiveAvgPool2d((1, 1))
def forward(self, inputs):
# we regard inputs as tensor for simplicity
outs = self.gap(inputs)
outs = outs.view(inputs.size(0), -1)
return outs
```
2. Import the module in `mmcls/models/necks/__init__.py`.
```python
...
from .gap import GlobalAveragePooling
__all__ = [
..., 'GlobalAveragePooling'
]
```
3. Modify the config file.
```python
model = dict(
neck=dict(type='GlobalAveragePooling'),
)
```
### Add new heads
Here we show how to develop a new head with the example of `LinearClsHead` as the following.
To implement a new head, basically we need to implement `forward_train`, which takes the feature maps from necks or backbones as input and compute loss based on ground-truth labels.
1. Create a new file in `mmcls/models/heads/linear_head.py`.
```python
from ..builder import HEADS
from .cls_head import ClsHead
@HEADS.register_module()
class LinearClsHead(ClsHead):
def __init__(self,
num_classes,
in_channels,
loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
topk=(1, )):
super(LinearClsHead, self).__init__(loss=loss, topk=topk)
self.in_channels = in_channels
self.num_classes = num_classes
if self.num_classes <= 0:
raise ValueError(
f'num_classes={num_classes} must be a positive integer')
self._init_layers()
def _init_layers(self):
self.fc = nn.Linear(self.in_channels, self.num_classes)
def init_weights(self):
normal_init(self.fc, mean=0, std=0.01, bias=0)
def forward_train(self, x, gt_label):
cls_score = self.fc(x)
losses = self.loss(cls_score, gt_label)
return losses
```
2. Import the module in `mmcls/models/heads/__init__.py`.
```python
...
from .linear_head import LinearClsHead
__all__ = [
..., 'LinearClsHead'
]
```
3. Modify the config file.
Together with the added GlobalAveragePooling neck, an entire config for a model is as follows.
```python
model = dict(
type='ImageClassifier',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(3, ),
style='pytorch'),
neck=dict(type='GlobalAveragePooling'),
head=dict(
type='LinearClsHead',
num_classes=1000,
in_channels=2048,
loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
topk=(1, 5),
))
```
### Add new loss
To add a new loss function, we mainly implement the `forward` function in the loss module.
In addition, it is helpful to leverage the decorator `weighted_loss` to weight the loss for each element.
Assuming that we want to mimic a probabilistic distribution generated from another classification model, we implement a L1Loss to fulfil the purpose as below.
1. Create a new file in `mmcls/models/losses/l1_loss.py`.
```python
import torch
import torch.nn as nn
from ..builder import LOSSES
from .utils import weighted_loss
@weighted_loss
def l1_loss(pred, target):
assert pred.size() == target.size() and target.numel() > 0
loss = torch.abs(pred - target)
return loss
@LOSSES.register_module()
class L1Loss(nn.Module):
def __init__(self, reduction='mean', loss_weight=1.0):
super(L1Loss, self).__init__()
self.reduction = reduction
self.loss_weight = loss_weight
def forward(self,
pred,
target,
weight=None,
avg_factor=None,
reduction_override=None):
assert reduction_override in (None, 'none', 'mean', 'sum')
reduction = (
reduction_override if reduction_override else self.reduction)
loss = self.loss_weight * l1_loss(
pred, target, weight, reduction=reduction, avg_factor=avg_factor)
return loss
```
2. Import the module in `mmcls/models/losses/__init__.py`.
```python
...
from .l1_loss import L1Loss, l1_loss
__all__ = [
..., 'L1Loss', 'l1_loss'
]
```
3. Modify loss field in the config.
```python
loss=dict(type='L1Loss', loss_weight=1.0))
```
# Tutorial 7: Customize Runtime Settings
In this tutorial, we will introduce some methods about how to customize workflow and hooks when running your own settings for the project.
<!-- TOC -->
- [Customize Workflow](#customize-workflow)
- [Hooks](#hooks)
- [Default training hooks](#default-training-hooks)
- [Use other implemented hooks](#use-other-implemented-hooks)
- [Customize self-implemented hooks](#customize-self-implemented-hooks)
- [FAQ](#faq)
<!-- TOC -->
## Customize Workflow
Workflow is a list of (phase, duration) to specify the running order and duration. The meaning of "duration" depends on the runner's type.
For example, we use epoch-based runner by default, and the "duration" means how many epochs the phase to be executed in a cycle. Usually,
we only want to execute training phase, just use the following config.
```python
workflow = [('train', 1)]
```
Sometimes we may want to check some metrics (e.g. loss, accuracy) about the model on the validate set.
In such case, we can set the workflow as
```python
[('train', 1), ('val', 1)]
```
so that 1 epoch for training and 1 epoch for validation will be run iteratively.
By default, we recommend using **`EvalHook`** to do evaluation after the training epoch, but you can still use `val` workflow as an alternative.
```{note}
1. The parameters of model will not be updated during the val epoch.
2. Keyword `max_epochs` in the config only controls the number of training epochs and will not affect the validation workflow.
3. Workflows `[('train', 1), ('val', 1)]` and `[('train', 1)]` will not change the behavior of `EvalHook` because `EvalHook` is called by `after_train_epoch` and validation workflow only affect hooks that are called through `after_val_epoch`.
Therefore, the only difference between `[('train', 1), ('val', 1)]` and ``[('train', 1)]`` is that the runner will calculate losses on the validation set after each training epoch.
```
## Hooks
The hook mechanism is widely used in the OpenMMLab open-source algorithm library. Combined with the `Runner`, the entire life cycle of the training process can be managed easily. You can learn more about the hook through [related article](https://www.calltutors.com/blog/what-is-hook/).
Hooks only work after being registered into the runner. At present, hooks are mainly divided into two categories:
- default training hooks
The default training hooks are registered by the runner by default. Generally, they are hooks for some basic functions, and have a certain priority, you don't need to modify the priority.
- custom hooks
The custom hooks are registered through `custom_hooks`. Generally, they are hooks with enhanced functions. The priority needs to be specified in the configuration file. If you do not specify the priority of the hook, it will be set to 'NORMAL' by default.
**Priority list**
| Level | Value |
| :-------------: | :---: |
| HIGHEST | 0 |
| VERY_HIGH | 10 |
| HIGH | 30 |
| ABOVE_NORMAL | 40 |
| NORMAL(default) | 50 |
| BELOW_NORMAL | 60 |
| LOW | 70 |
| VERY_LOW | 90 |
| LOWEST | 100 |
The priority determines the execution order of the hooks. Before training, the log will print out the execution order of the hooks at each stage to facilitate debugging.
### default training hooks
Some common hooks are not registered through `custom_hooks`, they are
| Hooks | Priority |
| :-------------------: | :---------------: |
| `LrUpdaterHook` | VERY_HIGH (10) |
| `MomentumUpdaterHook` | HIGH (30) |
| `OptimizerHook` | ABOVE_NORMAL (40) |
| `CheckpointHook` | NORMAL (50) |
| `IterTimerHook` | LOW (70) |
| `EvalHook` | LOW (70) |
| `LoggerHook(s)` | VERY_LOW (90) |
`OptimizerHook`, `MomentumUpdaterHook` and `LrUpdaterHook` have been introduced in [sehedule strategy](./schedule.md).
`IterTimerHook` is used to record elapsed time and does not support modification.
Here we reveal how to customize `CheckpointHook`, `LoggerHooks`, and `EvalHook`.
#### CheckpointHook
The MMCV runner will use `checkpoint_config` to initialize [`CheckpointHook`](https://github.com/open-mmlab/mmcv/blob/9ecd6b0d5ff9d2172c49a182eaa669e9f27bb8e7/mmcv/runner/hooks/checkpoint.py).
```python
checkpoint_config = dict(interval=1)
```
We could set `max_keep_ckpts` to save only a small number of checkpoints or decide whether to store state dict of optimizer by `save_optimizer`.
More details of the arguments are [here](https://mmcv.readthedocs.io/en/latest/api.html#mmcv.runner.CheckpointHook)
#### LoggerHooks
The `log_config` wraps multiple logger hooks and enables to set intervals. Now MMCV supports `TextLoggerHook`, `WandbLoggerHook`, `MlflowLoggerHook`, `NeptuneLoggerHook`, `DvcliveLoggerHook` and `TensorboardLoggerHook`.
The detailed usages can be found in the [doc](https://mmcv.readthedocs.io/en/latest/api.html#mmcv.runner.LoggerHook).
```python
log_config = dict(
interval=50,
hooks=[
dict(type='TextLoggerHook'),
dict(type='TensorboardLoggerHook')
])
```
#### EvalHook
The config of `evaluation` will be used to initialize the [`EvalHook`](https://github.com/open-mmlab/mmclassification/blob/master/mmcls/core/evaluation/eval_hooks.py).
The `EvalHook` has some reserved keys, such as `interval`, `save_best` and `start`, and the other arguments such as `metrics` will be passed to the `dataset.evaluate()`
```python
evaluation = dict(interval=1, metric='accuracy', metric_options={'topk': (1, )})
```
You can save the model weight when the best verification result is obtained by modifying the parameter `save_best`:
```python
# "auto" means automatically select the metrics to compare.
# You can also use a specific key like "accuracy_top-1".
evaluation = dict(interval=1, save_best="auto", metric='accuracy', metric_options={'topk': (1, )})
```
When running some large experiments, you can skip the validation step at the beginning of training by modifying the parameter `start` as below:
```python
evaluation = dict(interval=1, start=200, metric='accuracy', metric_options={'topk': (1, )})
```
This indicates that, before the 200th epoch, evaluations would not be executed. Since the 200th epoch, evaluations would be executed after the training process.
```{note}
In the default configuration files of MMClassification, the evaluation field is generally placed in the datasets configs.
```
### Use other implemented hooks
Some hooks have been already implemented in MMCV and MMClassification, they are:
- [EMAHook](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/ema.py)
- [SyncBuffersHook](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/sync_buffer.py)
- [EmptyCacheHook](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/memory.py)
- [ProfilerHook](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/profiler.py)
- ......
If the hook is already implemented in MMCV, you can directly modify the config to use the hook as below
```python
mmcv_hooks = [
dict(type='MMCVHook', a=a_value, b=b_value, priority='NORMAL')
]
```
such as using `EMAHook`, interval is 100 iters:
```python
custom_hooks = [
dict(type='EMAHook', interval=100, priority='HIGH')
]
```
## Customize self-implemented hooks
### 1. Implement a new hook
Here we give an example of creating a new hook in MMClassification and using it in training.
```python
from mmcv.runner import HOOKS, Hook
@HOOKS.register_module()
class MyHook(Hook):
def __init__(self, a, b):
pass
def before_run(self, runner):
pass
def after_run(self, runner):
pass
def before_epoch(self, runner):
pass
def after_epoch(self, runner):
pass
def before_iter(self, runner):
pass
def after_iter(self, runner):
pass
```
Depending on the functionality of the hook, the users need to specify what the hook will do at each stage of the training in `before_run`, `after_run`, `before_epoch`, `after_epoch`, `before_iter`, and `after_iter`.
### 2. Register the new hook
Then we need to make `MyHook` imported. Assuming the file is in `mmcls/core/utils/my_hook.py` there are two ways to do that:
- Modify `mmcls/core/utils/__init__.py` to import it.
The newly defined module should be imported in `mmcls/core/utils/__init__.py` so that the registry will
find the new module and add it:
```python
from .my_hook import MyHook
```
- Use `custom_imports` in the config to manually import it
```python
custom_imports = dict(imports=['mmcls.core.utils.my_hook'], allow_failed_imports=False)
```
### 3. Modify the config
```python
custom_hooks = [
dict(type='MyHook', a=a_value, b=b_value)
]
```
You can also set the priority of the hook as below:
```python
custom_hooks = [
dict(type='MyHook', a=a_value, b=b_value, priority='ABOVE_NORMAL')
]
```
By default, the hook's priority is set as `NORMAL` during registration.
## FAQ
### 1. `resume_from` and `load_from` and `init_cfg.Pretrained`
- `load_from` : only imports model weights, which is mainly used to load pre-trained or trained models;
- `resume_from` : not only import model weights, but also optimizer information, current epoch information, mainly used to continue training from the checkpoint.
- `init_cfg.Pretrained` : Load weights during weight initialization, and you can specify which module to load. This is usually used when fine-tuning a model, refer to [Tutorial 2: Fine-tune Models](./finetune.md).
# Tutorial 6: Customize Schedule
In this tutorial, we will introduce some methods about how to construct optimizers, customize learning rate and momentum schedules, parameter-wise finely configuration, gradient clipping, gradient accumulation, and customize self-implemented methods for the project.
<!-- TOC -->
- [Customize optimizer supported by PyTorch](#customize-optimizer-supported-by-pytorch)
- [Customize learning rate schedules](#customize-learning-rate-schedules)
- [Learning rate decay](#learning-rate-decay)
- [Warmup strategy](#warmup-strategy)
- [Customize momentum schedules](#customize-momentum-schedules)
- [Parameter-wise finely configuration](#parameter-wise-finely-configuration)
- [Gradient clipping and gradient accumulation](#gradient-clipping-and-gradient-accumulation)
- [Gradient clipping](#gradient-clipping)
- [Gradient accumulation](#gradient-accumulation)
- [Customize self-implemented methods](#customize-self-implemented-methods)
- [Customize self-implemented optimizer](#customize-self-implemented-optimizer)
- [Customize optimizer constructor](#customize-optimizer-constructor)
<!-- TOC -->
## Customize optimizer supported by PyTorch
We already support to use all the optimizers implemented by PyTorch, and to use and modify them, please change the `optimizer` field of config files.
For example, if you want to use `SGD`, the modification could be as the following.
```python
optimizer = dict(type='SGD', lr=0.0003, weight_decay=0.0001)
```
To modify the learning rate of the model, just modify the `lr` in the config of optimizer.
You can also directly set other arguments according to the [API doc](https://pytorch.org/docs/stable/optim.html?highlight=optim#module-torch.optim) of PyTorch.
For example, if you want to use `Adam` with the setting like `torch.optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, amsgrad=False)` in PyTorch,
the config should looks like.
```python
optimizer = dict(type='Adam', lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, amsgrad=False)
```
## Customize learning rate schedules
### Learning rate decay
Learning rate decay is widely used to improve performance. And to use learning rate decay, please set the `lr_confg` field in config files.
For example, we use step policy as the default learning rate decay policy of ResNet, and the config is:
```python
lr_config = dict(policy='step', step=[100, 150])
```
Then during training, the program will call [`StepLRHook`](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/lr_updater.py#L153) periodically to update the learning rate.
We also support many other learning rate schedules [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py), such as `CosineAnnealing` and `Poly` schedule. Here are some examples
- ConsineAnnealing schedule:
```python
lr_config = dict(
policy='CosineAnnealing',
warmup='linear',
warmup_iters=1000,
warmup_ratio=1.0 / 10,
min_lr_ratio=1e-5)
```
- Poly schedule:
```python
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
```
### Warmup strategy
In the early stage, training is easy to be volatile, and warmup is a technique
to reduce volatility. With warmup, the learning rate will increase gradually
from a minor value to the expected value.
In MMClassification, we use `lr_config` to configure the warmup strategy, the main parameters are as follows:
- `warmup`: The warmup curve type. Please choose one from 'constant', 'linear', 'exp' and `None`, and `None` means disable warmup.
- `warmup_by_epoch` : if warmup by epoch or not, default to be True, if set to be False, warmup by iter.
- `warmup_iters` : the number of warm-up iterations, when `warmup_by_epoch=True`, the unit is epoch; when `warmup_by_epoch=False`, the unit is the number of iterations (iter).
- `warmup_ratio` : warm-up initial learning rate will calculate as `lr = lr * warmup_ratio`
Here are some examples
1. linear & warmup by iter
```python
lr_config = dict(
policy='CosineAnnealing',
by_epoch=False,
min_lr_ratio=1e-2,
warmup='linear',
warmup_ratio=1e-3,
warmup_iters=20 * 1252,
warmup_by_epoch=False)
```
2. exp & warmup by epoch
```python
lr_config = dict(
policy='CosineAnnealing',
min_lr=0,
warmup='exp',
warmup_iters=5,
warmup_ratio=0.1,
warmup_by_epoch=True)
```
```{tip}
After completing your configuration file,you could use [learning rate visualization tool](https://mmclassification.readthedocs.io/en/latest/tools/visualization.html#learning-rate-schedule-visualization) to draw the corresponding learning rate adjustment curve.
```
## Customize momentum schedules
We support the momentum scheduler to modify the model's momentum according to learning rate, which could make the model converge in a faster way.
Momentum scheduler is usually used with LR scheduler, for example, the following config is used to accelerate convergence.
For more details, please refer to the implementation of [CyclicLrUpdater](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/lr_updater.py#L327)
and [CyclicMomentumUpdater](https://github.com/open-mmlab/mmcv/blob/f48241a65aebfe07db122e9db320c31b685dc674/mmcv/runner/hooks/momentum_updater.py#L130).
Here is an example
```python
lr_config = dict(
policy='cyclic',
target_ratio=(10, 1e-4),
cyclic_times=1,
step_ratio_up=0.4,
)
momentum_config = dict(
policy='cyclic',
target_ratio=(0.85 / 0.95, 1),
cyclic_times=1,
step_ratio_up=0.4,
)
```
## Parameter-wise finely configuration
Some models may have some parameter-specific settings for optimization, for example, no weight decay to the BatchNorm layer or using different learning rates for different network layers.
To finely configuration them, we can use the `paramwise_cfg` option in `optimizer`.
We provide some examples here and more usages refer to [DefaultOptimizerConstructor](https://mmcv.readthedocs.io/en/latest/_modules/mmcv/runner/optimizer/default_constructor.html#DefaultOptimizerConstructor).
- Using specified options
The `DefaultOptimizerConstructor` provides options including `bias_lr_mult`, `bias_decay_mult`, `norm_decay_mult`, `dwconv_decay_mult`, `dcn_offset_lr_mult` and `bypass_duplicate` to configure special optimizer behaviors of bias, normalization, depth-wise convolution, deformable convolution and duplicated parameter. E.g:
1. No weight decay to the BatchNorm layer
```python
optimizer = dict(
type='SGD',
lr=0.8,
weight_decay=1e-4,
paramwise_cfg=dict(norm_decay_mult=0.))
```
- Using `custom_keys` dict
MMClassification can use `custom_keys` to specify different parameters to use different learning rates or weight decays, for example:
1. No weight decay for specific parameters
```python
paramwise_cfg = dict(
custom_keys={
'backbone.cls_token': dict(decay_mult=0.0),
'backbone.pos_embed': dict(decay_mult=0.0)
})
optimizer = dict(
type='SGD',
lr=0.8,
weight_decay=1e-4,
paramwise_cfg=paramwise_cfg)
```
2. Using a smaller learning rate and a weight decay for the backbone layers
```python
optimizer = dict(
type='SGD',
lr=0.8,
weight_decay=1e-4,
# 'lr' for backbone and 'weight_decay' are 0.1 * lr and 0.9 * weight_decay
paramwise_cfg=dict(
custom_keys={'backbone': dict(lr_mult=0.1, decay_mult=0.9)}))
```
## Gradient clipping and gradient accumulation
Besides the basic function of PyTorch optimizers, we also provide some enhancement functions, such as gradient clipping, gradient accumulation, etc., refer to [MMCV](https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py).
### Gradient clipping
During the training process, the loss function may get close to a cliffy region and cause gradient explosion. And gradient clipping is helpful to stabilize the training process. More introduction can be found in [this page](https://paperswithcode.com/method/gradient-clipping).
Currently we support `grad_clip` option in `optimizer_config`, and the arguments refer to [PyTorch Documentation](https://pytorch.org/docs/stable/generated/torch.nn.utils.clip_grad_norm_.html).
Here is an example:
```python
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# norm_type: type of the used p-norm, here norm_type is 2.
```
When inheriting from base and modifying configs, if `grad_clip=None` in base, `_delete_=True` is needed. For more details about `_delete_` you can refer to [TUTORIAL 1: LEARN ABOUT CONFIGS](https://mmclassification.readthedocs.io/en/latest/tutorials/config.html#ignore-some-fields-in-the-base-configs). For example,
```python
_base_ = [./_base_/schedules/imagenet_bs256_coslr.py]
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2), _delete_=True, type='OptimizerHook')
# you can ignore type if type is 'OptimizerHook', otherwise you must add "type='xxxxxOptimizerHook'" here
```
### Gradient accumulation
When computing resources are lacking, the batch size can only be set to a small value, which may affect the performance of models. Gradient accumulation can be used to solve this problem.
Here is an example:
```python
data = dict(samples_per_gpu=64)
optimizer_config = dict(type="GradientCumulativeOptimizerHook", cumulative_iters=4)
```
Indicates that during training, back-propagation is performed every 4 iters. And the above is equivalent to:
```python
data = dict(samples_per_gpu=256)
optimizer_config = dict(type="OptimizerHook")
```
```{note}
When the optimizer hook type is not specified in `optimizer_config`, `OptimizerHook` is used by default.
```
## Customize self-implemented methods
In academic research and industrial practice, it may be necessary to use optimization methods not implemented by MMClassification, and you can add them through the following methods.
```{note}
This part will modify the MMClassification source code or add code to the MMClassification framework, beginners can skip it.
```
### Customize self-implemented optimizer
#### 1. Define a new optimizer
A customized optimizer could be defined as below.
Assume you want to add an optimizer named `MyOptimizer`, which has arguments `a`, `b`, and `c`.
You need to create a new directory named `mmcls/core/optimizer`.
And then implement the new optimizer in a file, e.g., in `mmcls/core/optimizer/my_optimizer.py`:
```python
from mmcv.runner import OPTIMIZERS
from torch.optim import Optimizer
@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):
def __init__(self, a, b, c):
```
#### 2. Add the optimizer to registry
To find the above module defined above, this module should be imported into the main namespace at first. There are two ways to achieve it.
- Modify `mmcls/core/optimizer/__init__.py` to import it into `optimizer` package, and then modify `mmcls/core/__init__.py` to import the new `optimizer` package.
Create the `mmcls/core/optimizer` folder and the `mmcls/core/optimizer/__init__.py` file if they don't exist. The newly defined module should be imported in `mmcls/core/optimizer/__init__.py` and `mmcls/core/__init__.py` so that the registry will find the new module and add it:
```python
# In mmcls/core/optimizer/__init__.py
from .my_optimizer import MyOptimizer # MyOptimizer maybe other class name
__all__ = ['MyOptimizer']
```
```python
# In mmcls/core/__init__.py
...
from .optimizer import * # noqa: F401, F403
```
- Use `custom_imports` in the config to manually import it
```python
custom_imports = dict(imports=['mmcls.core.optimizer.my_optimizer'], allow_failed_imports=False)
```
The module `mmcls.core.optimizer.my_optimizer` will be imported at the beginning of the program and the class `MyOptimizer` is then automatically registered.
Note that only the package containing the class `MyOptimizer` should be imported. `mmcls.core.optimizer.my_optimizer.MyOptimizer` **cannot** be imported directly.
#### 3. Specify the optimizer in the config file
Then you can use `MyOptimizer` in `optimizer` field of config files.
In the configs, the optimizers are defined by the field `optimizer` like the following:
```python
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
```
To use your own optimizer, the field can be changed to
```python
optimizer = dict(type='MyOptimizer', a=a_value, b=b_value, c=c_value)
```
### Customize optimizer constructor
Some models may have some parameter-specific settings for optimization, e.g. weight decay for BatchNorm layers.
Although our `DefaultOptimizerConstructor` is powerful, it may still not cover your need. If that, you can do those fine-grained parameter tuning through customizing optimizer constructor.
```python
from mmcv.runner.optimizer import OPTIMIZER_BUILDERS
@OPTIMIZER_BUILDERS.register_module()
class MyOptimizerConstructor:
def __init__(self, optimizer_cfg, paramwise_cfg=None):
pass
def __call__(self, model):
... # Construct your optimzier here.
return my_optimizer
```
The default optimizer constructor is implemented [here](https://github.com/open-mmlab/mmcv/blob/9ecd6b0d5ff9d2172c49a182eaa669e9f27bb8e7/mmcv/runner/optimizer/default_constructor.py#L11), which could also serve as a template for new optimizer constructor.
.header-logo {
background-image: url("../image/mmcls-logo.png");
background-size: 204px 40px;
height: 40px;
width: 204px;
}
pre {
white-space: pre;
}
article.pytorch-article section code {
padding: .2em .4em;
background-color: #f3f4f7;
border-radius: 5px;
}
/* Disable the change in tables */
article.pytorch-article section table code {
padding: unset;
background-color: unset;
border-radius: unset;
}
table.autosummary td {
width: 50%
}
# 参与贡献 OpenMMLab
欢迎任何类型的贡献,包括但不限于
- 修改拼写错误或代码错误
- 添加文档或将文档翻译成其他语言
- 添加新功能和新组件
## 工作流程
1. fork 并 pull 最新的 OpenMMLab 仓库 (MMClassification)
2. 签出到一个新分支(不要使用 master 分支提交 PR)
3. 进行修改并提交至 fork 出的自己的远程仓库
4. 在我们的仓库中创建一个 PR
```{note}
如果你计划添加一些新的功能,并引入大量改动,请尽量首先创建一个 issue 来进行讨论。
```
## 代码风格
### Python
我们采用 [PEP8](https://www.python.org/dev/peps/pep-0008/) 作为统一的代码风格。
我们使用下列工具来进行代码风格检查与格式化:
- [flake8](https://github.com/PyCQA/flake8): Python 官方发布的代码规范检查工具,是多个检查工具的封装
- [isort](https://github.com/timothycrosley/isort): 自动调整模块导入顺序的工具
- [yapf](https://github.com/google/yapf): 一个 Python 文件的格式化工具。
- [codespell](https://github.com/codespell-project/codespell): 检查单词拼写是否有误
- [mdformat](https://github.com/executablebooks/mdformat): 检查 markdown 文件的工具
- [docformatter](https://github.com/myint/docformatter): 一个 docstring 格式化工具。
yapf 和 isort 的格式设置位于 [setup.cfg](https://github.com/open-mmlab/mmclassification/blob/master/setup.cfg)
我们使用 [pre-commit hook](https://pre-commit.com/) 来保证每次提交时自动进行代
码检查和格式化,启用的功能包括 `flake8`, `yapf`, `isort`, `trailing whitespaces`, `markdown files`, 修复 `end-of-files`, `double-quoted-strings`,
`python-encoding-pragma`, `mixed-line-ending`, 对 `requirments.txt`的排序等。
pre-commit hook 的配置文件位于 [.pre-commit-config](https://github.com/open-mmlab/mmclassification/blob/master/.pre-commit-config.yaml)
在你克隆仓库后,你需要按照如下步骤安装并初始化 pre-commit hook。
```shell
pip install -U pre-commit
```
在仓库文件夹中执行
```shell
pre-commit install
```
在此之后,每次提交,代码规范检查和格式化工具都将被强制执行。
```{important}
在创建 PR 之前,请确保你的代码完成了代码规范检查,并经过了 yapf 的格式化。
```
### C++ 和 CUDA
C++ 和 CUDA 的代码规范遵从 [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html)
# 0.x 相关兼容性问题
## MMClassification 0.20.1
### MMCV 兼容性
在 Twins 骨干网络中,我们使用了 MMCV 提供的 `PatchEmbed` 模块,该模块是在 MMCV 1.4.2 版本加入的,因此我们需要将 MMCV 依赖版本升至 1.4.2。
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import subprocess
import sys
import pytorch_sphinx_theme
from sphinx.builders.html import StandaloneHTMLBuilder
sys.path.insert(0, os.path.abspath('../..'))
# -- Project information -----------------------------------------------------
project = 'MMClassification'
copyright = '2020, OpenMMLab'
author = 'MMClassification Authors'
# The full version, including alpha/beta/rc tags
version_file = '../../mmcls/version.py'
def get_version():
with open(version_file, 'r') as f:
exec(compile(f.read(), version_file, 'exec'))
return locals()['__version__']
release = get_version()
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.autosummary',
'sphinx.ext.intersphinx',
'sphinx.ext.napoleon',
'sphinx.ext.viewcode',
'myst_parser',
'sphinx_copybutton',
]
autodoc_mock_imports = ['mmcv._ext', 'matplotlib']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = {
'.rst': 'restructuredtext',
'.md': 'markdown',
}
language = 'zh_CN'
# The master toctree document.
master_doc = 'index'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'pytorch_sphinx_theme'
html_theme_path = [pytorch_sphinx_theme.get_html_theme_path()]
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
html_theme_options = {
'logo_url':
'https://mmclassification.readthedocs.io/zh_CN/latest/',
'menu': [
{
'name': 'GitHub',
'url': 'https://github.com/open-mmlab/mmclassification'
},
{
'name':
'Colab 教程',
'children': [
{
'name':
'用命令行工具训练和推理',
'url':
'https://colab.research.google.com/github/'
'open-mmlab/mmclassification/blob/master/docs/zh_CN/'
'tutorials/MMClassification_tools_cn.ipynb',
},
{
'name':
'用 Python API 训练和推理',
'url':
'https://colab.research.google.com/github/'
'open-mmlab/mmclassification/blob/master/docs/zh_CN/'
'tutorials/MMClassification_python_cn.ipynb',
},
]
},
],
# Specify the language of shared menu
'menu_lang':
'cn',
}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_css_files = ['css/readthedocs.css']
html_js_files = ['js/custom.js']
# -- Options for HTMLHelp output ---------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'mmclsdoc'
# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'mmcls.tex', 'MMClassification Documentation', author,
'manual'),
]
# -- Options for manual page output ------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [(master_doc, 'mmcls', 'MMClassification Documentation', [author],
1)]
# -- Options for Texinfo output ----------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'mmcls', 'MMClassification Documentation', author, 'mmcls',
'OpenMMLab image classification toolbox and benchmark.', 'Miscellaneous'),
]
# -- Options for Epub output -------------------------------------------------
# Bibliographic Dublin Core info.
epub_title = project
# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#
# epub_identifier = ''
# A unique identification for the text.
#
# epub_uid = ''
# A list of files that should not be packed into the epub file.
epub_exclude_files = ['search.html']
# set priority when building html
StandaloneHTMLBuilder.supported_image_types = [
'image/svg+xml', 'image/gif', 'image/png', 'image/jpeg'
]
# -- Extension configuration -------------------------------------------------
# Ignore >>> when copying code
copybutton_prompt_text = r'>>> |\.\.\. '
copybutton_prompt_is_regexp = True
# Auto-generated header anchors
myst_heading_anchors = 3
# Configuration for intersphinx
intersphinx_mapping = {
'python': ('https://docs.python.org/3', None),
'numpy': ('https://numpy.org/doc/stable', None),
'torch': ('https://pytorch.org/docs/stable/', None),
'mmcv': ('https://mmcv.readthedocs.io/zh_CN/latest/', None),
}
def builder_inited_handler(app):
subprocess.run(['./stat.py'])
def setup(app):
app.add_config_value('no_underscore_emphasis', False, 'env')
app.connect('builder-inited', builder_inited_handler)
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment