.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.engine
mmpretrain.engine
===================================
This package includes runtime components such as hooks, runners, optimizers and loops. These components are useful for
classification tasks but are not yet supported by MMEngine.
.. note::
Some components may be moved to MMEngine in the future.
.. contents:: mmpretrain.engine
:depth: 2
:local:
:backlinks: top
.. module:: mmpretrain.engine.hooks
Hooks
------------------
.. autosummary::
:toctree: generated
:nosignatures:
ClassNumCheckHook
PreciseBNHook
VisualizationHook
PrepareProtoBeforeValLoopHook
SetAdaptiveMarginsHook
EMAHook
SimSiamHook
DenseCLHook
SwAVHook
.. module:: mmpretrain.engine.optimizers
Optimizers
------------------
.. autosummary::
:toctree: generated
:nosignatures:
Lamb
LARS
LearningRateDecayOptimWrapperConstructor
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.evaluation
mmpretrain.evaluation
===================================
This package includes metrics and evaluators for classification tasks.
.. contents:: mmpretrain.evaluation
:depth: 1
:local:
:backlinks: top
Single Label Metric
----------------------
.. autosummary::
:toctree: generated
:nosignatures:
Accuracy
SingleLabelMetric
ConfusionMatrix
Multi Label Metric
----------------------
.. autosummary::
:toctree: generated
:nosignatures:
AveragePrecision
MultiLabelMetric
VOCAveragePrecision
VOCMultiLabelMetric
Retrieval Metric
----------------------
.. autosummary::
:toctree: generated
:nosignatures:
:template: classtemplate.rst
RetrievalRecall
RetrievalAveragePrecision
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.models
mmpretrain.models
===================================
The ``models`` package contains several sub-packages for addressing the different components of a model.
- :mod:`~mmpretrain.models.classifiers`: The top-level module which defines the whole process of a classification model.
- :mod:`~mmpretrain.models.selfsup`: The top-level module which defines the whole process of a self-supervised learning model.
- :mod:`~mmpretrain.models.retrievers`: The top-level module which defines the whole process of a retrieval model.
- :mod:`~mmpretrain.models.backbones`: Usually a feature extraction network, e.g., ResNet, MobileNet.
- :mod:`~mmpretrain.models.necks`: The component between backbones and heads, e.g., GlobalAveragePooling.
- :mod:`~mmpretrain.models.heads`: The component for specific tasks.
- :mod:`~mmpretrain.models.losses`: Loss functions.
- :mod:`~mmpretrain.models.peft`: The PEFT (Parameter-Efficient Fine-Tuning) module, e.g. LoRAModel.
- :mod:`~mmpretrain.models.utils`: Some helper functions and common components used in various networks.
- :mod:`~mmpretrain.models.utils.data_preprocessor`: The component before model to preprocess the inputs, e.g., ClsDataPreprocessor.
- :ref:`components`: Common components used in various networks.
- :ref:`helpers`: Helper functions.
Build Functions
---------------
.. autosummary::
:toctree: generated
:nosignatures:
build_classifier
build_backbone
build_neck
build_head
build_loss
.. module:: mmpretrain.models.classifiers
Classifiers
------------------
.. autosummary::
:toctree: generated
:nosignatures:
BaseClassifier
ImageClassifier
TimmClassifier
HuggingFaceClassifier
.. module:: mmpretrain.models.selfsup
Self-supervised Algorithms
--------------------------
.. _selfsup_algorithms:
.. autosummary::
:toctree: generated
:nosignatures:
BaseSelfSupervisor
BEiT
BYOL
BarlowTwins
CAE
DenseCL
EVA
iTPN
MAE
MILAN
MaskFeat
MixMIM
MoCo
MoCoV3
SimCLR
SimMIM
SimSiam
SparK
SwAV
.. _selfsup_backbones:
Some of the above algorithms modify the backbone module to adapt to extra inputs
like ``mask``. Here is a list of these **modified backbone** modules.
.. autosummary::
:toctree: generated
:nosignatures:
BEiTPretrainViT
CAEPretrainViT
iTPNHiViT
MAEHiViT
MAEViT
MILANViT
MaskFeatViT
MixMIMPretrainTransformer
MoCoV3ViT
SimMIMSwinTransformer
.. _target_generators:
Some self-supervised algorithms need an external **target generator** to
generate the optimization target. Here is a list of target generators.
.. autosummary::
:toctree: generated
:nosignatures:
VQKD
DALLEEncoder
HOGGenerator
CLIPGenerator
.. module:: mmpretrain.models.retrievers
Retrievers
------------------
.. autosummary::
:toctree: generated
:nosignatures:
BaseRetriever
ImageToImageRetriever
.. module:: mmpretrain.models.multimodal
Multi-Modality Algorithms
--------------------------
.. autosummary::
:toctree: generated
:nosignatures:
Blip2Caption
Blip2Retrieval
Blip2VQA
BlipCaption
BlipGrounding
BlipNLVR
BlipRetrieval
BlipVQA
Flamingo
OFA
MiniGPT4
Llava
Otter
.. module:: mmpretrain.models.backbones
Backbones
------------------
.. autosummary::
:toctree: generated
:nosignatures:
AlexNet
BEiTViT
CSPDarkNet
CSPNet
CSPResNeXt
CSPResNet
Conformer
ConvMixer
ConvNeXt
DaViT
DeiT3
DenseNet
DistilledVisionTransformer
EdgeNeXt
EfficientFormer
EfficientNet
EfficientNetV2
HiViT
HRNet
HorNet
InceptionV3
LeNet5
LeViT
MViT
MlpMixer
MobileNetV2
MobileNetV3
MobileOne
MobileViT
PCPVT
PoolFormer
PyramidVig
RegNet
RepLKNet
RepMLPNet
RepVGG
Res2Net
ResNeSt
ResNeXt
ResNet
ResNetV1c
ResNetV1d
ResNet_CIFAR
RevVisionTransformer
SEResNeXt
SEResNet
SVT
ShuffleNetV1
ShuffleNetV2
SparseResNet
SparseConvNeXt
SwinTransformer
SwinTransformerV2
T2T_ViT
TIMMBackbone
TNT
VAN
VGG
Vig
VisionTransformer
ViTSAM
XCiT
ViTEVA02
.. module:: mmpretrain.models.necks
Necks
------------------
.. autosummary::
:toctree: generated
:nosignatures:
BEiTV2Neck
CAENeck
ClsBatchNormNeck
DenseCLNeck
GeneralizedMeanPooling
GlobalAveragePooling
HRFuseScales
LinearNeck
MAEPretrainDecoder
MILANPretrainDecoder
MixMIMPretrainDecoder
MoCoV2Neck
NonLinearNeck
SimMIMLinearDecoder
SwAVNeck
iTPNPretrainDecoder
SparKLightDecoder
.. module:: mmpretrain.models.heads
Heads
------------------
.. autosummary::
:toctree: generated
:nosignatures:
ArcFaceClsHead
BEiTV1Head
BEiTV2Head
CAEHead
CSRAClsHead
ClsHead
ConformerHead
ContrastiveHead
DeiTClsHead
EfficientFormerClsHead
LatentCrossCorrelationHead
LatentPredictHead
LeViTClsHead
LinearClsHead
MAEPretrainHead
MIMHead
MixMIMPretrainHead
MoCoV3Head
MultiLabelClsHead
MultiLabelLinearClsHead
MultiTaskHead
SimMIMHead
StackedLinearClsHead
SwAVHead
VigClsHead
VisionTransformerClsHead
iTPNClipHead
SparKPretrainHead
.. module:: mmpretrain.models.losses
Losses
------------------
.. autosummary::
:toctree: generated
:nosignatures:
AsymmetricLoss
CAELoss
CosineSimilarityLoss
CrossCorrelationLoss
CrossEntropyLoss
FocalLoss
LabelSmoothLoss
PixelReconstructionLoss
SeesawLoss
SwAVLoss
.. module:: mmpretrain.models.peft
PEFT
------------------
.. autosummary::
:toctree: generated
:nosignatures:
LoRAModel
.. module:: mmpretrain.models.utils
models.utils
------------
This package includes some helper functions and common components used in various networks.
.. _components:
Common Components
^^^^^^^^^^^^^^^^^
.. autosummary::
:toctree: generated
:nosignatures:
ConditionalPositionEncoding
CosineEMA
HybridEmbed
InvertedResidual
LayerScale
MultiheadAttention
PatchEmbed
PatchMerging
SELayer
ShiftWindowMSA
WindowMSA
WindowMSAV2
.. _helpers:
Helper Functions
^^^^^^^^^^^^^^^^
.. autosummary::
:toctree: generated
:nosignatures:
channel_shuffle
is_tracing
make_divisible
resize_pos_embed
resize_relative_position_bias_table
to_ntuple
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.structures
mmpretrain.structures
===================================
This package includes basic data structures.
DataSample
-------------
.. autoclass:: DataSample
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.utils
mmpretrain.utils
===================================
This package includes some useful helper functions for development.
.. autosummary::
:toctree: generated
:nosignatures:
collect_env
register_all_modules
load_json_log
track_on_main_process
get_ori_model
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.visualization
mmpretrain.visualization
===================================
This package includes visualizer and some helper functions for visualization.
Visualizer
-------------
.. autoclass:: UniversalVisualizer
:members:
# flake8: noqa
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import subprocess
import sys
import pytorch_sphinx_theme
from sphinx.builders.html import StandaloneHTMLBuilder
sys.path.insert(0, os.path.abspath('../../'))
# -- Project information -----------------------------------------------------
project = 'MMPretrain'
copyright = '2020, OpenMMLab'
author = 'MMPretrain Authors'
# The full version, including alpha/beta/rc tags
version_file = '../../mmpretrain/version.py'
def get_version():
    with open(version_file, 'r') as f:
        exec(compile(f.read(), version_file, 'exec'))
    return locals()['__version__']
release = get_version()
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.autosummary',
'sphinx.ext.intersphinx',
'sphinx.ext.napoleon',
'sphinx.ext.viewcode',
'myst_parser',
'sphinx_copybutton',
'sphinx_tabs.tabs',
'notfound.extension',
'sphinxcontrib.jquery',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = {
'.rst': 'restructuredtext',
'.md': 'markdown',
}
language = 'en'
# The master toctree document.
root_doc = 'index'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'pytorch_sphinx_theme'
html_theme_path = [pytorch_sphinx_theme.get_html_theme_path()]
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
# yapf: disable
html_theme_options = {
'menu': [
{
'name': 'GitHub',
'url': 'https://github.com/open-mmlab/mmpretrain'
},
{
'name': 'Colab Tutorials',
'children': [
{'name': 'Train and inference with shell commands',
'url': 'https://colab.research.google.com/github/mzr1996/mmpretrain-tutorial/blob/master/1.x/MMPretrain_tools.ipynb'},
{'name': 'Train and inference with Python APIs',
'url': 'https://colab.research.google.com/github/mzr1996/mmpretrain-tutorial/blob/master/1.x/MMPretrain_python.ipynb'},
]
},
{
'name': 'Version',
'children': [
{'name': 'MMPreTrain 0.x',
'url': 'https://mmpretrain.readthedocs.io/en/0.x/',
'description': '0.x branch'},
{'name': 'MMPreTrain 1.x',
'url': 'https://mmpretrain.readthedocs.io/en/latest/',
'description': 'Main branch'},
],
}
],
# Specify the language of shared menu
'menu_lang': 'en',
# Disable the default edit on GitHub
'default_edit_on_github': False,
}
# yapf: enable
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_css_files = [
'https://cdn.datatables.net/v/bs4/dt-1.12.1/datatables.min.css',
'css/readthedocs.css'
]
html_js_files = [
'https://cdn.datatables.net/v/bs4/dt-1.12.1/datatables.min.js',
'js/custom.js'
]
# -- Options for HTMLHelp output ---------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'mmpretraindoc'
# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(root_doc, 'mmpretrain.tex', 'MMPretrain Documentation', author, 'manual'),
]
# -- Options for manual page output ------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [(root_doc, 'mmpretrain', 'MMPretrain Documentation', [author], 1)]
# -- Options for Texinfo output ----------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(root_doc, 'mmpretrain', 'MMPretrain Documentation', author, 'mmpretrain',
'OpenMMLab pre-training toolbox and benchmark.', 'Miscellaneous'),
]
# -- Options for Epub output -------------------------------------------------
# Bibliographic Dublin Core info.
epub_title = project
# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#
# epub_identifier = ''
# A unique identification for the text.
#
# epub_uid = ''
# A list of files that should not be packed into the epub file.
epub_exclude_files = ['search.html']
# set priority when building html
StandaloneHTMLBuilder.supported_image_types = [
'image/svg+xml', 'image/gif', 'image/png', 'image/jpeg'
]
# -- Extension configuration -------------------------------------------------
# Ignore >>> when copying code
copybutton_prompt_text = r'>>> |\.\.\. '
copybutton_prompt_is_regexp = True
# Auto-generated header anchors
myst_heading_anchors = 3
# Enable "colon_fence" extension of myst.
myst_enable_extensions = ['colon_fence', 'dollarmath']
# Configuration for intersphinx
intersphinx_mapping = {
'python': ('https://docs.python.org/3', None),
'numpy': ('https://numpy.org/doc/stable', None),
'torch': ('https://pytorch.org/docs/stable/', None),
'mmcv': ('https://mmcv.readthedocs.io/en/2.x/', None),
'mmengine': ('https://mmengine.readthedocs.io/en/latest/', None),
'transformers':
('https://huggingface.co/docs/transformers/main/en/', None),
}
napoleon_custom_sections = [
# Custom sections for data elements.
('Meta fields', 'params_style'),
('Data fields', 'params_style'),
]
# Disable docstring inheritance
autodoc_inherit_docstrings = False
# Mock some imports during generate API docs.
autodoc_mock_imports = ['rich', 'attr', 'einops', 'mat4py']
# Disable displaying type annotations, these can be very verbose
autodoc_typehints = 'none'
# The not found page
notfound_template = '404.html'
def builder_inited_handler(app):
    if subprocess.run(['./stat.py']).returncode != 0:
        raise RuntimeError('Failed to run the script `stat.py`.')


def setup(app):
    app.connect('builder-inited', builder_inited_handler)
# NPU (HUAWEI Ascend)
## Usage
### General Usage
Please refer to the [building documentation of MMCV](https://mmcv.readthedocs.io/en/latest/get_started/build.html#build-mmcv-full-on-ascend-npu-machine) to install MMCV and [MMEngine](https://mmengine.readthedocs.io/en/latest/get_started/installation.html#build-from-source) on NPU devices.
To train a model with 8 NPUs, run the following command:
```shell
bash ./tools/dist_train.sh configs/resnet/resnet50_8xb32_in1k.py 8
```
You can also train the model with a single NPU:
```shell
python ./tools/train.py configs/resnet/resnet50_8xb32_in1k.py
```
## Model Results
| Model | Top-1 (%) | Top-5 (%) | Config | Download |
| :---------------------------------------------------------: | :-------: | :-------: | :----------------------------------------------------------: | :-------------------------------------------------------------: |
| [ResNet-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/README.md) | 76.40 | 93.21 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/resnet50_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnet50_8xb32_in1k.log) |
| [ResNeXt-50-32x4d](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnext/README.md) | 77.48 | 93.75 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnext/resnext50-32x4d_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnext50-32x4d_8xb32_in1k.log) |
| [HRNet-W18](https://github.com/open-mmlab/mmclassification/blob/master/configs/hrnet/README.md) | 77.06 | 93.57 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/hrnet/hrnet-w18_4xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/hrnet-w18_4xb32_in1k.log) |
| [ResNetV1D-152](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/README.md) | 79.41 | 94.48 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/resnetv1d152_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnetv1d152_8xb32_in1k.log) |
| [SE-ResNet-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/seresnet/README.md) | 77.65 | 93.74 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/seresnet/seresnet50_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/seresnet50_8xb32_in1k.log) |
| [ShuffleNetV2 1.0x](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/shufflenet_v2/README.md) | 69.52 | 88.79 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/shufflenet-v2-1x_16xb64_in1k.log) |
| [MobileNetV2](https://github.com/open-mmlab/mmclassification/tree/1.x/configs/mobilenet_v2) | 71.74 | 90.28 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/mobilenet-v2_8xb32_in1k.log) |
| [MobileNetV3-Small](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v3/README.md) | 67.09 | 87.17 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v3/mobilenet-v3-small_8xb128_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/mobilenet-v3-small.log) |
| [\*CSPResNeXt50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/cspnet/README.md) | 77.25 | 93.46 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/cspnet/cspresnext50_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/cspresnext50_8xb32_in1k.log) |
| [\*EfficientNet-B4](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/efficientnet/README.md) | 75.73 | 92.91 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/efficientnet/efficientnet-b4_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/efficientnet-b4_8xb32_in1k.log) |
| [\*\*DenseNet121](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/densenet/README.md) | 72.53 | 90.85 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/densenet/densenet121_4xb256_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/densenet121_4xb256_in1k.log) |
**Notes:**
- Unless otherwise marked, the results on the NPU are almost the same as the FP32 results on the GPU.
- (\*) The results of these models are lower than those reported in the corresponding model READMEs, mainly
  because the README results are evaluated directly with weights converted from timm, while the results here
  come from retraining with the MMClassification configs. Training the same configs on GPUs gives results
  consistent with those on the NPU.
- (\*\*) The accuracy of this model is slightly lower because the config is designed for 4 devices while we
  ran it on 8. Users can adjust the hyperparameters to get the best accuracy.
**All above models are provided by Huawei Ascend group.**
[html writers]
table_style: colwidths-auto
# Prerequisites
In this section we demonstrate how to prepare an environment with PyTorch.
MMPretrain works on Linux, Windows and macOS. It requires Python 3.7+, CUDA 10.2+ and PyTorch 1.8+.
```{note}
If you are experienced with PyTorch and have already installed it, just skip this part and jump to the [next section](#installation). Otherwise, you can follow these steps for the preparation.
```
**Step 1.** Download and install Miniconda from the [official website](https://docs.conda.io/en/latest/miniconda.html).
**Step 2.** Create a conda environment and activate it.
```shell
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
```
**Step 3.** Install PyTorch following [official instructions](https://pytorch.org/get-started/locally/), e.g.
On GPU platforms:
```shell
conda install pytorch torchvision -c pytorch
```
```{warning}
This command will automatically install the latest versions of PyTorch and cudatoolkit. Please check whether they match your environment.
```
On CPU platforms:
```shell
conda install pytorch torchvision cpuonly -c pytorch
```
# Installation
## Best Practices
According to your needs, we support two install modes:
- [Install from source (Recommended)](#install-from-source): You want to develop your own network or new features based on the MMPretrain framework, for example, adding new datasets or new backbones. In this mode, you can use all the tools we provide.
- [Install as a Python package](#install-as-a-python-package): You just want to call MMPretrain's APIs or import MMPretrain's modules in your project.
### Install from source
In this case, install mmpretrain from source:
```shell
git clone https://github.com/open-mmlab/mmpretrain.git
cd mmpretrain
pip install -U openmim && mim install -e .
```
```{note}
`"-e"` means installing a project in editable mode, thus any local modifications made to the code will take effect without reinstallation.
```
### Install as a Python package
Just install with mim.
```shell
pip install -U openmim && mim install "mmpretrain>=1.0.0rc8"
```
```{note}
`mim` is a light-weight command-line tool to set up an appropriate environment for OpenMMLab repositories according to your PyTorch and CUDA versions. It also has some useful functions for deep-learning experiments.
```
## Install multi-modality support (Optional)
The multi-modality models in MMPretrain require extra dependencies. To install them, you
can add `[multimodal]` during the installation. For example:
```shell
# Install from source
mim install -e ".[multimodal]"
# Install as a Python package
mim install "mmpretrain[multimodal]>=1.0.0rc8"
```
## Verify the installation
To verify whether MMPretrain is installed correctly, we provide sample code to run an inference demo.
Option (a). If you installed mmpretrain from source, run the following command:
```shell
python demo/image_demo.py demo/demo.JPEG resnet18_8xb32_in1k --device cpu
```
You will see the output result dict, including `pred_label`, `pred_score` and `pred_class`, in your terminal.
Option (b). If you installed mmpretrain as a Python package, open your Python interpreter and copy and paste the following code.
```python
from mmpretrain import get_model, inference_model
model = get_model('resnet18_8xb32_in1k', device='cpu') # or device='cuda:0'
inference_model(model, 'demo/demo.JPEG')
```
You will see a dict printed, including the predicted label, score and category name.
```{note}
The `resnet18_8xb32_in1k` is the model name, and you can use [`mmpretrain.list_models`](mmpretrain.apis.list_models) to
explore all models, or search them on the [Model Zoo Summary](./modelzoo_statistics.md)
```
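To get a feel for this workflow, the sketch below mimics how a wildcard search over model names could behave, using a tiny hypothetical registry; the real `mmpretrain.list_models` matches patterns against the full model zoo, so the registry contents here are assumptions for illustration only.

```python
from fnmatch import fnmatch

# Hypothetical miniature registry; the real registry is far larger.
_REGISTRY = [
    'resnet18_8xb32_in1k',
    'resnet50_8xb32_in1k',
    'vit-base-p16_32xb128-mae_in1k',
]

def list_models(pattern=None):
    """Return registry names containing the given pattern (all names if None)."""
    if pattern is None:
        return sorted(_REGISTRY)
    return sorted(name for name in _REGISTRY if fnmatch(name, f'*{pattern}*'))

print(list_models('resnet'))  # → ['resnet18_8xb32_in1k', 'resnet50_8xb32_in1k']
```

With the installed package, calling `list_models('resnet')` in the same way filters the real model zoo by name.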
## Customize Installation
### CUDA versions
When installing PyTorch, you need to specify the version of CUDA. If you are
unsure which one to choose, follow our recommendations:
- For Ampere-based NVIDIA GPUs, such as GeForce 30 series and NVIDIA A100, CUDA 11 is a must.
- For older NVIDIA GPUs, CUDA 11 is backward compatible, but CUDA 10.2 offers better compatibility and is more lightweight.
Please make sure the GPU driver satisfies the minimum version requirements. See [this table](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions__table-cuda-toolkit-driver-versions) for more information.
```{note}
Installing CUDA runtime libraries is enough if you follow our best practices,
because no CUDA code will be compiled locally. However if you hope to compile
MMCV from source or develop other CUDA operators, you need to install the
complete CUDA toolkit from NVIDIA's [website](https://developer.nvidia.com/cuda-downloads),
and its version should match the CUDA version of PyTorch, i.e., the cudatoolkit
version specified in the `conda install` command.
```
### Install on CPU-only platforms
MMPretrain can be built for CPU-only environments. In CPU mode, you can train, test, and run inference with a model.
### Install on Google Colab
See [the Colab tutorial](https://colab.research.google.com/github/mzr1996/mmclassification-tutorial/blob/master/1.x/MMClassification_tools.ipynb).
### Using MMPretrain with Docker
We provide a [Dockerfile](https://github.com/open-mmlab/mmpretrain/blob/main/docker/Dockerfile)
to build an image. Ensure your [docker version](https://docs.docker.com/engine/install/) is 19.03 or higher.
```shell
# build an image with PyTorch 1.12.1, CUDA 11.3
# If you prefer other versions, just modify the Dockerfile
docker build -t mmpretrain docker/
```
Run it with
```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmpretrain/data mmpretrain
```
## Troubleshooting
If you have any issues during the installation, please first check the [FAQ](./notes/faq.md) page.
You may [open an issue](https://github.com/open-mmlab/mmpretrain/issues/new/choose)
on GitHub if no solution is found.
Welcome to MMPretrain's documentation!
============================================
MMPretrain is a newly upgraded open-source framework for pre-training.
It sets out to provide multiple powerful pre-trained backbones and
supports different pre-training strategies. MMPretrain originated from the
famous open-source projects
`MMClassification <https://github.com/open-mmlab/mmclassification/tree/1.x>`_
and `MMSelfSup <https://github.com/open-mmlab/mmselfsup>`_, and is developed
with many exciting new features. Pre-training is currently essential for
vision recognition, and with rich and strong pre-trained models, we can
improve various downstream vision tasks.
Our primary objective for the codebase is to become an easily accessible and
user-friendly library and to streamline research and engineering. We
detail the properties and design of MMPretrain across the different sections.
Hands-on Roadmap of MMPretrain
-------------------------------
To help users quickly utilize MMPretrain, we recommend following the hands-on
roadmap we have created for the library:
- For users who want to try MMPretrain, we suggest reading the GetStarted_
section for the environment setup.
- For basic usage, we refer users to UserGuides_ for utilizing various
algorithms to obtain the pre-trained models and evaluate their performance
in downstream tasks.
- For those who wish to customize their own algorithms, we provide
AdvancedGuides_ that include hints and rules for modifying code.
- To find your desired pre-trained models, users can check the ModelZoo_,
  which features a summary of various backbones and pre-training methods and
  an introduction of different algorithms.
- Additionally, we provide Analysis_ and Visualization_ tools to help
diagnose algorithms.
- Besides, if you have any other questions or concerns, please refer to the
Notes_ section for potential answers.
We always welcome *PRs* and *Issues* for the betterment of MMPretrain.
.. _GetStarted:
.. toctree::
:maxdepth: 1
:caption: Get Started
get_started.md
.. _UserGuides:
.. toctree::
:maxdepth: 1
:caption: User Guides
user_guides/config.md
user_guides/dataset_prepare.md
user_guides/inference.md
user_guides/train.md
user_guides/test.md
user_guides/downstream.md
.. _AdvancedGuides:
.. toctree::
:maxdepth: 1
:caption: Advanced Guides
advanced_guides/datasets.md
advanced_guides/pipeline.md
advanced_guides/modules.md
advanced_guides/schedule.md
advanced_guides/runtime.md
advanced_guides/evaluation.md
advanced_guides/convention.md
.. _ModelZoo:
.. toctree::
:maxdepth: 1
:caption: Model Zoo
:glob:
modelzoo_statistics.md
papers/*
.. _Visualization:
.. toctree::
:maxdepth: 1
:caption: Visualization
useful_tools/dataset_visualization.md
useful_tools/scheduler_visualization.md
useful_tools/cam_visualization.md
useful_tools/t-sne_visualization.md
.. _Analysis:
.. toctree::
:maxdepth: 1
:caption: Analysis Tools
useful_tools/print_config.md
useful_tools/verify_dataset.md
useful_tools/log_result_analysis.md
useful_tools/complexity_analysis.md
useful_tools/confusion_matrix.md
useful_tools/shape_bias.md
.. toctree::
:maxdepth: 1
:caption: Deployment
useful_tools/model_serving.md
.. toctree::
:maxdepth: 1
:caption: Migration
migration.md
.. toctree::
:maxdepth: 1
:caption: API Reference
mmpretrain.apis <api/apis>
mmpretrain.engine <api/engine>
mmpretrain.datasets <api/datasets>
Data Process <api/data_process>
mmpretrain.models <api/models>
mmpretrain.structures <api/structures>
mmpretrain.visualization <api/visualization>
mmpretrain.evaluation <api/evaluation>
mmpretrain.utils <api/utils>
.. _Notes:
.. toctree::
:maxdepth: 1
:caption: Notes
notes/contribution_guide.md
notes/projects.md
notes/changelog.md
notes/faq.md
notes/pretrain_custom_dataset.md
notes/finetune_custom_dataset.md
.. toctree::
:maxdepth: 1
:caption: Device Support
device/npu.md
Indices and tables
==================
* :ref:`genindex`
* :ref:`search`
# Migration
We introduce some modifications in MMPretrain 1.x, and some of them are BC-breaking. To migrate your projects from **MMClassification 0.x** or **MMSelfSup 0.x** smoothly, please read this tutorial.
- [Migration](#migration)
- [New dependencies](#new-dependencies)
- [General change of config](#general-change-of-config)
- [Schedule settings](#schedule-settings)
- [Runtime settings](#runtime-settings)
- [Other changes](#other-changes)
- [Migration from MMClassification 0.x](#migration-from-mmclassification-0x)
- [Config files](#config-files)
- [Model settings](#model-settings)
- [Data settings](#data-settings)
- [Packages](#packages)
- [`mmpretrain.apis`](#mmpretrainapis)
- [`mmpretrain.core`](#mmpretraincore)
- [`mmpretrain.datasets`](#mmpretraindatasets)
- [`mmpretrain.models`](#mmpretrainmodels)
- [`mmpretrain.utils`](#mmpretrainutils)
- [Migration from MMSelfSup 0.x](#migration-from-mmselfsup-0x)
- [Config](#config)
- [Dataset settings](#dataset-settings)
- [Model settings](#model-settings-1)
- [Package](#package)
## New dependencies
```{warning}
MMPretrain 1.x has new package dependencies, and a new environment should be created for MMPretrain 1.x even if you already have a working MMClassification 0.x or MMSelfSup 0.x environment. Please refer to the [installation tutorial](./get_started.md) for the required packages, or install them manually.
```
1. [MMEngine](https://github.com/open-mmlab/mmengine): MMEngine is the core of the OpenMMLab 2.0 architecture,
   and we have split many components unrelated to computer vision from MMCV to MMEngine.
2. [MMCV](https://github.com/open-mmlab/mmcv): The computer vision package of OpenMMLab. This is not a new
dependency, but it should be upgraded to version `2.0.0rc1` or above.
3. [rich](https://github.com/Textualize/rich): A terminal formatting package, and we use it to enhance some
outputs in the terminal.
4. [einops](https://github.com/arogozhnikov/einops): Operators for Einstein notations.
## General change of config
In this section, we introduce the general differences between the old versions (**MMClassification 0.x** or **MMSelfSup 0.x**) and **MMPretrain 1.x**.
### Schedule settings
| MMCls or MMSelfSup 0.x | MMPretrain 1.x | Remark |
| ---------------------- | --------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| optimizer_config | / | It has been **removed**. |
| / | optim_wrapper | The `optim_wrapper` provides a common interface for updating parameters. |
| lr_config | param_scheduler | The `param_scheduler` is a list to set learning rate or other parameters, which is more flexible. |
| runner | train_cfg | The loop setting (`EpochBasedTrainLoop`, `IterBasedTrainLoop`) in `train_cfg` controls the work flow of the algorithm training. |
Changes in **`optimizer`** and **`optimizer_config`**:
- Now we use `optim_wrapper` field to specify all configurations related to optimization process. The
`optimizer` has become a subfield of `optim_wrapper`.
- The `paramwise_cfg` field is also a subfield of `optim_wrapper`, instead of `optimizer`.
- The `optimizer_config` field has been removed, and all its configurations have been moved to `optim_wrapper`.
- The `grad_clip` field has been renamed to `clip_grad`.
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
optimizer = dict(
type='AdamW',
lr=0.0015,
weight_decay=0.3,
paramwise_cfg = dict(
norm_decay_mult=0.0,
bias_decay_mult=0.0,
))
optimizer_config = dict(grad_clip=dict(max_norm=1.0))
```
</td>
<tr>
<td>New</td>
<td>
```python
optim_wrapper = dict(
optimizer=dict(type='AdamW', lr=0.0015, weight_decay=0.3),
paramwise_cfg = dict(
norm_decay_mult=0.0,
bias_decay_mult=0.0,
),
clip_grad=dict(max_norm=1.0),
)
```
</td>
</tr>
</table>
Changes in **`lr_config`**:
- The `lr_config` field has been removed and replaced by the new `param_scheduler`.
- The `warmup` related arguments have also been removed since we use a combination of schedulers to implement this
functionality.
The new scheduler combination mechanism is highly flexible and enables the design of various learning rate/momentum curves.
For more details, see the {external+mmengine:doc}`parameter schedulers tutorial <tutorials/param_scheduler>`.
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
lr_config = dict(
policy='CosineAnnealing',
min_lr=0,
warmup='linear',
warmup_iters=5,
warmup_ratio=0.01,
warmup_by_epoch=True)
```
</td>
<tr>
<td>New</td>
<td>
```python
param_scheduler = [
# warmup
dict(
type='LinearLR',
start_factor=0.01,
by_epoch=True,
end=5,
    # Update the learning rate after every iteration.
convert_to_iter_based=True),
# main learning rate scheduler
dict(type='CosineAnnealingLR', by_epoch=True, begin=5),
]
```
</td>
</tr>
</table>
Changes in **`runner`**:
Most of the configurations that were originally in the `runner` field have been moved to `train_cfg`, `val_cfg`, and `test_cfg`.
These fields are used to configure the loop for training, validation, and testing.
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
runner = dict(type='EpochBasedRunner', max_epochs=100)
```
</td>
<tr>
<td>New</td>
<td>
```python
# The `val_interval` is the original `evaluation.interval`.
train_cfg = dict(by_epoch=True, max_epochs=100, val_interval=1)
val_cfg = dict() # Use the default validation loop.
test_cfg = dict() # Use the default test loop.
```
</td>
</tr>
</table>
In OpenMMLab 2.0, we introduced `Loop` to control the behaviors in training, validation and testing. As a result, the functionalities of `Runner` have also been changed.
More details can be found in the {external+mmengine:doc}`MMEngine tutorials <design/runner>`.
### Runtime settings
Changes in **`checkpoint_config`** and **`log_config`**:
The `checkpoint_config` has been moved to `default_hooks.checkpoint`, and `log_config` has been moved to
`default_hooks.logger`. Additionally, many hook settings that were previously included in the script code have
been moved to the `default_hooks` field in the runtime configuration.
```python
default_hooks = dict(
    # record the time of every iteration.
timer=dict(type='IterTimerHook'),
# print log every 100 iterations.
logger=dict(type='LoggerHook', interval=100),
# enable the parameter scheduler.
param_scheduler=dict(type='ParamSchedulerHook'),
# save checkpoint per epoch, and automatically save the best checkpoint.
checkpoint=dict(type='CheckpointHook', interval=1, save_best='auto'),
    # set sampler seed in distributed environment.
sampler_seed=dict(type='DistSamplerSeedHook'),
# validation results visualization, set True to enable it.
visualization=dict(type='VisualizationHook', enable=False),
)
```
In OpenMMLab 2.0, we have split the original logger into a logger and a visualizer. The logger is used to record
information, while the visualizer is used to display the logs in different backends, such as the terminal,
TensorBoard, and WandB.
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
log_config = dict(
interval=100,
hooks=[
dict(type='TextLoggerHook'),
dict(type='TensorboardLoggerHook'),
])
```
</td>
<tr>
<td>New</td>
<td>
```python
default_hooks = dict(
...
logger=dict(type='LoggerHook', interval=100),
)
visualizer = dict(
type='UniversalVisualizer',
vis_backends=[dict(type='LocalVisBackend'), dict(type='TensorboardVisBackend')],
)
```
</td>
</tr>
</table>
Changes in **`load_from`** and **`resume_from`**:
- The `resume_from` field has been removed; use `resume` and `load_from` instead:
  - If `resume=True` and `load_from` is not None, resume training from the checkpoint in `load_from`.
  - If `resume=True` and `load_from` is None, try to resume from the latest checkpoint in the work directory.
  - If `resume=False` and `load_from` is not None, only load the checkpoint, without resuming training.
  - If `resume=False` and `load_from` is None, neither load a checkpoint nor resume training.
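For instance, the common auto-resume setting can be sketched as a minimal config fragment (the values are illustrative, not from the original):

```python
# Resume training from the latest checkpoint in the work directory, if any.
resume = True
load_from = None

# In contrast, to fine-tune from a pretrained checkpoint without resuming:
# resume = False
# load_from = 'path/to/checkpoint.pth'  # hypothetical path
```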
Changes in **`dist_params`**: The `dist_params` field has become a subfield of `env_cfg` now.
Additionally, some new configurations have been added to `env_cfg`.
```python
env_cfg = dict(
# whether to enable cudnn benchmark
cudnn_benchmark=False,
# set multi process parameters
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
# set distributed parameters
dist_cfg=dict(backend='nccl'),
)
```
Changes in **`workflow`**: `workflow` related functionalities are removed.
New field **`visualizer`**: The visualizer is a new design in OpenMMLab 2.0 architecture. The runner uses an
instance of the visualizer to handle result and log visualization, as well as to save to different backends.
For more information, please refer to the {external+mmengine:doc}`MMEngine tutorial <advanced_tutorials/visualization>`.
```python
visualizer = dict(
type='UniversalVisualizer',
vis_backends=[
dict(type='LocalVisBackend'),
# Uncomment the below line to save the log and visualization results to TensorBoard.
# dict(type='TensorboardVisBackend')
]
)
```
New field **`default_scope`**: The starting point to search modules for all registries. The `default_scope` in MMPretrain is `mmpretrain`. See {external+mmengine:doc}`the registry tutorial <advanced_tutorials/registry>` for more details.
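In a config file, this is typically a single top-level assignment (a minimal sketch):

```python
# Resolve registry names under the `mmpretrain` scope first.
default_scope = 'mmpretrain'
```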
### Other changes
We moved the definitions of all registries in different packages to the `mmpretrain.registry` package.
## Migration from MMClassification 0.x
### Config files
In MMPretrain 1.x, we refactored the structure of the configuration files, so the original config files are no longer usable.
In this section, we introduce all changes to the configuration files. We assume you are already familiar with
the [config files](./user_guides/config.md).
#### Model settings
No changes in the `model.backbone`, `model.neck` and `model.head` fields.
Changes in **`model.train_cfg`**:
- `BatchMixup` is renamed to [`Mixup`](mmpretrain.models.utils.batch_augments.Mixup).
- `BatchCutMix` is renamed to [`CutMix`](mmpretrain.models.utils.batch_augments.CutMix).
- `BatchResizeMix` is renamed to [`ResizeMix`](mmpretrain.models.utils.batch_augments.ResizeMix).
- The `prob` argument is removed from all augmentation settings, and you can use the `probs` field in `train_cfg` to
  specify the probability of each augmentation. If no `probs` field is set, one augmentation is chosen randomly with equal probability.
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
model = dict(
...
    train_cfg=dict(augments=[
        dict(type='BatchMixup', alpha=0.8, num_classes=1000, prob=0.5),
        dict(type='BatchCutMix', alpha=1.0, num_classes=1000, prob=0.5),
    ]),
)
```
</td>
<tr>
<td>New</td>
<td>
```python
model = dict(
...
    train_cfg=dict(augments=[
        dict(type='Mixup', alpha=0.8),
        dict(type='CutMix', alpha=1.0),
    ]),
)
```
</td>
</tr>
</table>
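As a sketch of the `probs` field described above (the classifier type and probability values are assumptions for illustration, not from the original):

```python
# Assign an explicit probability to each batch augmentation. Without `probs`,
# one augmentation is chosen uniformly at random for each batch.
model = dict(
    type='ImageClassifier',  # assumed classifier type for illustration
    train_cfg=dict(
        augments=[
            dict(type='Mixup', alpha=0.8),
            dict(type='CutMix', alpha=1.0),
        ],
        probs=[0.3, 0.7],  # illustrative values
    ),
)
```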
#### Data settings
Changes in **`data`**:
- The original `data` field is split into `train_dataloader`, `val_dataloader` and
  `test_dataloader`. This allows configuring them at a fine-grained level. For example,
  you can specify different samplers and batch sizes for training and testing.
- The `samples_per_gpu` is renamed to `batch_size`.
- The `workers_per_gpu` is renamed to `num_workers`.
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
data = dict(
samples_per_gpu=32,
workers_per_gpu=2,
train=dict(...),
val=dict(...),
test=dict(...),
)
```
</td>
<tr>
<td>New</td>
<td>
```python
train_dataloader = dict(
batch_size=32,
num_workers=2,
dataset=dict(...),
sampler=dict(type='DefaultSampler', shuffle=True) # necessary
)
val_dataloader = dict(
batch_size=32,
num_workers=2,
dataset=dict(...),
sampler=dict(type='DefaultSampler', shuffle=False) # necessary
)
test_dataloader = val_dataloader
```
</td>
</tr>
</table>
Changes in **`pipeline`**:
- The original formatting transforms **`ToTensor`**, **`ImageToTensor`** and **`Collect`** are combined into [`PackInputs`](mmpretrain.datasets.transforms.PackInputs).
- We don't recommend doing **`Normalize`** in the dataset pipeline. Please remove it from pipelines and set it in the `data_preprocessor` field instead.
- The argument `flip_prob` in [**`RandomFlip`**](mmcv.transforms.RandomFlip) is renamed to `prob`.
- The argument `size` in [**`RandomCrop`**](mmpretrain.datasets.transforms.RandomCrop) is renamed to `crop_size`.
- The argument `size` in [**`RandomResizedCrop`**](mmpretrain.datasets.transforms.RandomResizedCrop) is renamed to `scale`.
- The argument `size` in [**`Resize`**](mmcv.transforms.Resize) is renamed to `scale`, and `Resize` no longer supports sizes like `(256, -1)`; please use [`ResizeEdge`](mmpretrain.datasets.transforms.ResizeEdge) instead.
- The argument `policies` in [**`AutoAugment`**](mmpretrain.datasets.transforms.AutoAugment) and [**`RandAugment`**](mmpretrain.datasets.transforms.RandAugment) supports using a string to specify preset policies: `AutoAugment` supports "imagenet" and `RandAugment` supports "timm_increasing".
- **`RandomResizedCrop`** and **`CenterCrop`** no longer support `efficientnet_style`; please use [`EfficientNetRandomCrop`](mmpretrain.datasets.transforms.EfficientNetRandomCrop) and [`EfficientNetCenterCrop`](mmpretrain.datasets.transforms.EfficientNetCenterCrop) instead.
```{note}
We have moved some work from the data transforms to the data preprocessor, such as normalization. See [the documentation](mmpretrain.models.utils.data_preprocessor) for
more details.
```
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='RandomResizedCrop', size=224),
dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='ToTensor', keys=['gt_label']),
dict(type='Collect', keys=['img', 'gt_label'])
]
```
</td>
<tr>
<td>New</td>
<td>
```python
data_preprocessor = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='RandomResizedCrop', scale=224),
dict(type='RandomFlip', prob=0.5, direction='horizontal'),
dict(type='PackInputs'),
]
```
</td>
</tr>
</table>
Changes in **`evaluation`**:
- The **`evaluation`** field is split into `val_evaluator` and `test_evaluator`, and it no longer supports the `interval` and `save_best` arguments.
  The `interval` is moved to `train_cfg.val_interval` (see [the schedule settings](./user_guides/config.md#schedule-settings)), and the `save_best`
  is moved to `default_hooks.checkpoint.save_best` (see [the runtime settings](./user_guides/config.md#runtime-settings)).
- The 'accuracy' metric is renamed to [`Accuracy`](mmpretrain.evaluation.Accuracy).
- The 'precision', 'recall', 'f1-score' and 'support' metrics are combined into [`SingleLabelMetric`](mmpretrain.evaluation.SingleLabelMetric); use the `items` argument to specify which metrics to calculate.
- The 'mAP' metric is renamed to [`AveragePrecision`](mmpretrain.evaluation.AveragePrecision).
- The 'CP', 'CR', 'CF1', 'OP', 'OR' and 'OF1' metrics are combined into [`MultiLabelMetric`](mmpretrain.evaluation.MultiLabelMetric); use the `items` and `average` arguments to specify which metrics to calculate.
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
evaluation = dict(
interval=1,
metric='accuracy',
metric_options=dict(topk=(1, 5))
)
```
</td>
<tr>
<td>New</td>
<td>
```python
val_evaluator = dict(type='Accuracy', topk=(1, 5))
test_evaluator = val_evaluator
```
</td>
</tr>
<tr>
<td>Original</td>
<td>
```python
evaluation = dict(
interval=1,
metric=['mAP', 'CP', 'OP', 'CR', 'OR', 'CF1', 'OF1'],
metric_options=dict(thr=0.5),
)
```
</td>
<tr>
<td>New</td>
<td>
```python
val_evaluator = [
dict(type='AveragePrecision'),
dict(type='MultiLabelMetric',
items=['precision', 'recall', 'f1-score'],
average='both',
thr=0.5),
]
test_evaluator = val_evaluator
```
</td>
</tr>
</table>
### Packages
#### `mmpretrain.apis`
The documentation can be found [here](mmpretrain.apis).
| Function | Changes |
| :------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------- |
| `init_model` | No changes |
| `inference_model`    | No changes, but we recommend using [`mmpretrain.ImageClassificationInferencer`](mmpretrain.apis.ImageClassificationInferencer) instead.            |
| `train_model` | Removed, use `runner.train` to train. |
| `multi_gpu_test` | Removed, use `runner.test` to test. |
| `single_gpu_test` | Removed, use `runner.test` to test. |
| `show_result_pyplot` | Removed, use [`mmpretrain.ImageClassificationInferencer`](mmpretrain.apis.ImageClassificationInferencer) to inference model and show the result. |
| `set_random_seed` | Removed, use `mmengine.runner.set_random_seed`. |
| `init_random_seed` | Removed, use `mmengine.dist.sync_random_seed`. |
#### `mmpretrain.core`
The `mmpretrain.core` package is renamed to [`mmpretrain.engine`](mmpretrain.engine).
| Sub package | Changes |
| :-------------: | :-------------------------------------------------------------------------------------------------------------------------------- |
| `evaluation` | Removed, use the metrics in [`mmpretrain.evaluation`](mmpretrain.evaluation). |
| `hook` | Moved to [`mmpretrain.engine.hooks`](mmpretrain.engine.hooks) |
| `optimizers` | Moved to [`mmpretrain.engine.optimizers`](mmpretrain.engine.optimizers) |
| `utils` | Removed, the distributed environment related functions can be found in the [`mmengine.dist`](api/dist) package. |
| `visualization` | Removed, the related functionalities are implemented in [`mmengine.visualization.Visualizer`](mmengine.visualization.Visualizer). |
The `MMClsWandbHook` in the `hooks` package is awaiting implementation.
The `CosineAnnealingCooldownLrUpdaterHook` in the `hooks` package is removed; the same functionality is supported by
a combination of parameter schedulers, see [the tutorial](./advanced_guides/schedule.md).
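For instance, a cosine schedule followed by a constant cooldown phase might be sketched as follows (the epoch counts, `eta_min` and `factor` values are illustrative assumptions):

```python
param_scheduler = [
    # cosine annealing for the first 90 epochs
    dict(type='CosineAnnealingLR', by_epoch=True, begin=0, end=90, eta_min=1e-5),
    # hold a small constant learning rate for a 10-epoch cooldown
    dict(type='ConstantLR', by_epoch=True, factor=0.1, begin=90, end=100),
]
```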
#### `mmpretrain.datasets`
The documentation can be found [here](mmpretrain.datasets).
| Dataset class | Changes |
| :---------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------- |
| [`CustomDataset`](mmpretrain.datasets.CustomDataset)                                      | Added the `data_root` argument as the common prefix of `data_prefix` and `ann_file`, and supports loading unlabeled data. |
| [`ImageNet`](mmpretrain.datasets.ImageNet) | Same as `CustomDataset`. |
| [`ImageNet21k`](mmpretrain.datasets.ImageNet21k) | Same as `CustomDataset`. |
| [`CIFAR10`](mmpretrain.datasets.CIFAR10) & [`CIFAR100`](mmpretrain.datasets.CIFAR100) | The `test_mode` argument is a required argument now. |
| [`MNIST`](mmpretrain.datasets.MNIST) & [`FashionMNIST`](mmpretrain.datasets.FashionMNIST) | The `test_mode` argument is a required argument now. |
| [`VOC`](mmpretrain.datasets.VOC) | Requires `data_root`, `image_set_path` and `test_mode` now. |
| [`CUB`](mmpretrain.datasets.CUB) | Requires `data_root` and `test_mode` now. |
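For example, the new `data_root` prefix behavior might look like this (the paths are hypothetical):

```python
# `ann_file` and `data_prefix` are now resolved relative to `data_root`.
train_dataset = dict(
    type='CustomDataset',
    data_root='data/my_dataset',   # hypothetical path
    ann_file='meta/train.txt',     # resolved as data/my_dataset/meta/train.txt
    data_prefix='train/',          # resolved as data/my_dataset/train/
)
```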
The `mmpretrain.datasets.pipelines` is renamed to `mmpretrain.datasets.transforms`.
| Transform class | Changes |
| :-----------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `LoadImageFromFile` | Removed, use [`mmcv.transforms.LoadImageFromFile`](mmcv.transforms.LoadImageFromFile). |
| `RandomFlip` | Removed, use [`mmcv.transforms.RandomFlip`](mmcv.transforms.RandomFlip). The argument `flip_prob` is renamed to `prob`. |
| `RandomCrop` | The argument `size` is renamed to `crop_size`. |
| `RandomResizedCrop`             | The argument `size` is renamed to `scale`, and the argument `scale` is renamed to `crop_ratio_range`. No longer supports `efficientnet_style`; use [`EfficientNetRandomCrop`](mmpretrain.datasets.transforms.EfficientNetRandomCrop). |
| `CenterCrop`                    | Removed, use [`mmcv.transforms.CenterCrop`](mmcv.transforms.CenterCrop). No longer supports `efficientnet_style`; use [`EfficientNetCenterCrop`](mmpretrain.datasets.transforms.EfficientNetCenterCrop). |
| `Resize`                        | Removed, use [`mmcv.transforms.Resize`](mmcv.transforms.Resize). The argument `size` is renamed to `scale`. No longer supports sizes like `(256, -1)`; use [`ResizeEdge`](mmpretrain.datasets.transforms.ResizeEdge). |
| `AutoAugment` & `RandAugment`   | The argument `policies` supports using a string to specify preset policies. |
| `Compose` | Removed, use [`mmcv.transforms.Compose`](mmcv.transforms.Compose). |
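As a sketch of the string-preset usage described above (the extra `RandAugment` arguments are assumptions for illustration):

```python
train_pipeline = [
    dict(type='LoadImageFromFile'),
    # use a preset policy set by name instead of a list of policy dicts
    dict(type='RandAugment', policies='timm_increasing', num_policies=2,
         magnitude_level=9),
    dict(type='PackInputs'),
]
```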
#### `mmpretrain.models`
The documentation can be found [here](mmpretrain.models). The interfaces of all **backbones**, **necks** and **losses** are unchanged.
Changes in [`ImageClassifier`](mmpretrain.models.classifiers.ImageClassifier):
| Method of classifiers | Changes |
| :-------------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `extract_feat` | No changes |
| `forward` | Now only accepts three arguments: `inputs`, `data_samples` and `mode`. See [the documentation](mmpretrain.models.classifiers.ImageClassifier.forward) for more details. |
| `forward_train` | Replaced by `loss`. |
| `simple_test` | Replaced by `predict`. |
| `train_step` | The `optimizer` argument is replaced by `optim_wrapper` and it accepts [`OptimWrapper`](mmengine.optim.OptimWrapper). |
| `val_step`            | The original `val_step` was the same as `train_step`; now it calls `predict`.                                                                                             |
| `test_step` | New method, and it's the same as `val_step`. |
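The new `mode` switch of `forward` can be illustrated with a toy stand-in (this is not the real `ImageClassifier`, just a sketch of the calling convention):

```python
class ToyClassifier:
    """A stand-in mimicking the new single-entry forward interface."""

    def forward(self, inputs, data_samples=None, mode='tensor'):
        if mode == 'loss':
            # training: return a dict of losses (dummy value here)
            return {'loss': 0.0}
        if mode == 'predict':
            # inference: return data samples carrying predictions
            return data_samples
        # mode == 'tensor': return raw network outputs
        return inputs

model = ToyClassifier()
losses = model.forward(inputs=[0.1, 0.2], mode='loss')
```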
Changes in [heads](mmpretrain.models.heads):
| Method of heads | Changes |
| :-------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------- |
| `pre_logits` | No changes |
| `forward_train` | Replaced by `loss`. |
| `simple_test` | Replaced by `predict`. |
| `loss`          | It accepts `data_samples` instead of `gt_labels` to calculate loss. The `data_samples` argument is a list of [`DataSample`](mmpretrain.structures.DataSample).          |
| `forward`       | New method; it returns the output of the classification head without any post-processing such as softmax or sigmoid.                                    |
#### `mmpretrain.utils`
| Function | Changes |
| :--------------------------: | :-------------------------------------------------------------------------------------------------------------- |
| `collect_env` | No changes |
| `get_root_logger` | Removed, use [`mmengine.logging.MMLogger.get_current_instance`](mmengine.logging.MMLogger.get_current_instance) |
| `load_json_log` | The output format changed. |
| `setup_multi_processes` | Removed, use [`mmengine.utils.dl_utils.set_multi_processing`](mmengine.utils.dl_utils.set_multi_processing). |
| `wrap_non_distributed_model` | Removed, we auto wrap the model in the runner. |
| `wrap_distributed_model` | Removed, we auto wrap the model in the runner. |
| `auto_select_device` | Removed, we auto select the device in the runner. |
## Migration from MMSelfSup 0.x
### Config
This section illustrates the changes to our config files in the `_base_` folder, which include three parts:
- Datasets: `configs/_base_/datasets`
- Models: `configs/_base_/models`
- Schedules: `configs/_base_/schedules`
#### Dataset settings
In **MMSelfSup 0.x**, we used the `data` key to summarize all information, such as `samples_per_gpu`, `train`, `val`, etc.
In **MMPretrain 1.x**, we use separate `train_dataloader` and `val_dataloader` keys to summarize the corresponding information, and the `data` key has been **removed**.
<table class="docutils">
<tr>
<td>Original</td>
<td>
```python
data = dict(
samples_per_gpu=32, # total 32*8(gpu)=256
workers_per_gpu=4,
train=dict(
type=dataset_type,
data_source=dict(
type=data_source,
data_prefix='data/imagenet/train',
ann_file='data/imagenet/meta/train.txt',
),
num_views=[1, 1],
pipelines=[train_pipeline1, train_pipeline2],
prefetch=prefetch,
),
val=...)
```
</td>
<tr>
<td>New</td>
<td>
```python
train_dataloader = dict(
batch_size=32,
num_workers=4,
persistent_workers=True,
sampler=dict(type='DefaultSampler', shuffle=True),
collate_fn=dict(type='default_collate'),
dataset=dict(
type=dataset_type,
data_root=data_root,
ann_file='meta/train.txt',
data_prefix=dict(img_path='train/'),
pipeline=train_pipeline))
val_dataloader = ...
```
</td>
</tr>
</table>
Besides, we **removed** the `data_source` key to keep the pipeline format consistent with other OpenMMLab projects. Please refer to [Config](user_guides/config.md) for more details.
Changes in **`pipeline`**:
Take the MAE `pipeline` as an example:
```python
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='RandomResizedCrop',
scale=224,
crop_ratio_range=(0.2, 1.0),
backend='pillow',
interpolation='bicubic'),
dict(type='RandomFlip', prob=0.5),
dict(type='PackInputs')
]
```
#### Model settings
In the model configs, there are two main differences from MMSelfSup 0.x.
1. There is a new key called `data_preprocessor`, which is responsible for preprocessing the data, like normalization, channel conversion, etc. For example:
```python
model = dict(
type='MAE',
data_preprocessor=dict(
mean=[127.5, 127.5, 127.5],
std=[127.5, 127.5, 127.5],
bgr_to_rgb=True),
backbone=...,
neck=...,
head=...,
init_cfg=...)
```
2. There is a new `loss` key in `head` in MMPretrain 1.x, which determines the loss function of the algorithm. For example:
```python
model = dict(
type='MAE',
backbone=...,
neck=...,
head=dict(
type='MAEPretrainHead',
norm_pix=True,
patch_size=16,
loss=dict(type='MAEReconstructionLoss')),
init_cfg=...)
```
### Package
The table below records the general modifications of the folders and files.
| MMSelfSup 0.x | MMPretrain 1.x | Remark |
| ------------------------ | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| apis | apis | The high level APIs are updated. |
| core                     | engine              | The `core` folder has been renamed to `engine`, which includes `hooks` and `optimizers`. ([API link](mmpretrain.engine))                                        |
| datasets                 | datasets            | The datasets are implemented for different datasets, such as ImageNet and Places205. ([API link](mmpretrain.datasets))                                          |
| datasets/data_sources    | /                   | The `data_sources` has been **removed** and the directory of `datasets` is now consistent with other OpenMMLab projects.                                        |
| datasets/pipelines | datasets/transforms | The `pipelines` folder has been renamed to `transforms`. ([API link](mmpretrain.datasets.transforms)) |
| / | evaluation | The `evaluation` is created for some evaluation functions or classes. ([API link](mmpretrain.evaluation)) |
| models/algorithms | selfsup | The algorithms are moved to `selfsup` folder. ([API link](mmpretrain.models.selfsup)) |
| models/backbones | selfsup | The re-implemented backbones are moved to corresponding self-supervised learning algorithm `.py` files. ([API link](mmpretrain.models.selfsup)) |
| models/target_generators | selfsup | The target generators are moved to corresponding self-supervised learning algorithm `.py` files. ([API link](mmpretrain.models.selfsup)) |
| / | models/losses | The `losses` folder is created to provide different loss implementations, which is from `heads`. ([API link](mmpretrain.models.losses)) |
| / | structures | The `structures` folder is for the implementation of data structures. In MMPretrain, we implement a new data structure, `DataSample`, to pass and receive data throughout the training/val process. ([API link](mmpretrain.structures)) |
| / | visualization | The `visualization` folder contains the visualizer, which is responsible for some visualization tasks like visualizing data augmentation. ([API link](mmpretrain.visualization)) |
# Changelog (MMPreTrain)
## v1.2.0(04/01/2024)
### New Features
- [Feature] Support LLaVA 1.5 ([#1853](https://github.com/open-mmlab/mmpretrain/pull/1853))
- [Feature] Implement of RAM with a gradio interface. ([#1802](https://github.com/open-mmlab/mmpretrain/pull/1802))
### Bug Fix
- [Fix] Fix resize mix argument bug.
## v1.1.0(12/10/2023)
### New Features
- [Feature] Implement of Zero-Shot CLIP Classifier ([#1737](https://github.com/open-mmlab/mmpretrain/pull/1737))
- [Feature] Add minigpt4 gradio demo and training script. ([#1758](https://github.com/open-mmlab/mmpretrain/pull/1758))
### Improvements
- [Config] New Version of config Adapting MobileNet Algorithm ([#1774](https://github.com/open-mmlab/mmpretrain/pull/1774))
- [Config] Support DINO self-supervised learning in project ([#1756](https://github.com/open-mmlab/mmpretrain/pull/1756))
- [Config] New Version of config Adapting Swin Transformer Algorithm ([#1780](https://github.com/open-mmlab/mmpretrain/pull/1780))
- [Enhance] Add iTPN Supports for Non-three channel image ([#1735](https://github.com/open-mmlab/mmpretrain/pull/1735))
- [Docs] Update dataset download script from opendatalab to openXlab ([#1765](https://github.com/open-mmlab/mmpretrain/pull/1765))
- [Docs] Update COCO-Retrieval dataset docs. ([#1806](https://github.com/open-mmlab/mmpretrain/pull/1806))
### Bug Fix
- Update `train.py` to compat with new config.
- Update OFA module to compat with the latest huggingface.
- Fix pipeline bug in ImageRetrievalInferencer.
## v1.0.2(15/08/2023)
### New Features
- Add MFF ([#1725](https://github.com/open-mmlab/mmpretrain/pull/1725))
- Support training of BLIP2 ([#1700](https://github.com/open-mmlab/mmpretrain/pull/1700))
### Improvements
- New Version of config Adapting MAE Algorithm ([#1750](https://github.com/open-mmlab/mmpretrain/pull/1750))
- New Version of config Adapting ConvNeXt Algorithm ([#1760](https://github.com/open-mmlab/mmpretrain/pull/1760))
- New version of config adapting BeitV2 Algorithm ([#1755](https://github.com/open-mmlab/mmpretrain/pull/1755))
- Update `dataset_prepare.md` ([#1732](https://github.com/open-mmlab/mmpretrain/pull/1732))
- New Version of `config` Adapting Vision Transformer Algorithm ([#1727](https://github.com/open-mmlab/mmpretrain/pull/1727))
- Support Infographic VQA dataset and ANLS metric. ([#1667](https://github.com/open-mmlab/mmpretrain/pull/1667))
- Support IconQA dataset. ([#1670](https://github.com/open-mmlab/mmpretrain/pull/1670))
- Fix typo MIMHIVIT to MAEHiViT ([#1749](https://github.com/open-mmlab/mmpretrain/pull/1749))
## v1.0.1(28/07/2023)
### Improvements
- Add init_cfg with type='pretrained' to downstream tasks ([#1717](https://github.com/open-mmlab/mmpretrain/pull/1717))
- Set 'is_init' in some multimodal methods ([#1718](https://github.com/open-mmlab/mmpretrain/pull/1718))
- Adapt test cases on Ascend NPU ([#1728](https://github.com/open-mmlab/mmpretrain/pull/1728))
- Add GPU Acceleration for Apple silicon Mac ([#1699](https://github.com/open-mmlab/mmpretrain/pull/1699))
- BEiT refactor ([#1705](https://github.com/open-mmlab/mmpretrain/pull/1705))
### Bug Fixes
- Fix dict update in minigpt4. ([#1709](https://github.com/open-mmlab/mmpretrain/pull/1709))
- Fix nested predict for multi-task prediction ([#1716](https://github.com/open-mmlab/mmpretrain/pull/1716))
- Fix the issue #1711 "GaussianBlur doesn't work" ([#1722](https://github.com/open-mmlab/mmpretrain/pull/1722))
- Just to correct a typo of 'target' ([#1655](https://github.com/open-mmlab/mmpretrain/pull/1655))
- Fix freeze without cls_token in vit ([#1693](https://github.com/open-mmlab/mmpretrain/pull/1693))
- Fix RandomCrop bug ([#1706](https://github.com/open-mmlab/mmpretrain/pull/1706))
### Docs Update
- Fix spelling ([#1689](https://github.com/open-mmlab/mmpretrain/pull/1689))
## v1.0.0(04/07/2023)
### Highlights
- Support inference of more **multi-modal** algorithms, such as **LLaVA**, **MiniGPT-4**, **Otter**, etc.
- Support around **10 multi-modal datasets**!
- Add **iTPN**, **SparK** self-supervised learning algorithms.
- Provide examples of [New Config](https://github.com/open-mmlab/mmpretrain/tree/main/mmpretrain/configs/) and [DeepSpeed/FSDP](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mae/benchmarks/).
### New Features
- Transfer shape-bias tool from mmselfsup ([#1658](https://github.com/open-mmlab/mmpretrain/pull/1685))
- Download dataset by using MIM&OpenDataLab ([#1630](https://github.com/open-mmlab/mmpretrain/pull/1630))
- Support New Configs ([#1639](https://github.com/open-mmlab/mmpretrain/pull/1639), [#1647](https://github.com/open-mmlab/mmpretrain/pull/1647), [#1665](https://github.com/open-mmlab/mmpretrain/pull/1665))
- Support Flickr30k Retrieval dataset ([#1625](https://github.com/open-mmlab/mmpretrain/pull/1625))
- Support SparK ([#1531](https://github.com/open-mmlab/mmpretrain/pull/1531))
- Support LLaVA ([#1652](https://github.com/open-mmlab/mmpretrain/pull/1652))
- Support Otter ([#1651](https://github.com/open-mmlab/mmpretrain/pull/1651))
- Support MiniGPT-4 ([#1642](https://github.com/open-mmlab/mmpretrain/pull/1642))
- Add support for VizWiz dataset ([#1636](https://github.com/open-mmlab/mmpretrain/pull/1636))
- Add support for the VSR dataset ([#1634](https://github.com/open-mmlab/mmpretrain/pull/1634))
- Add InternImage Classification project ([#1569](https://github.com/open-mmlab/mmpretrain/pull/1569))
- Support OCR-VQA dataset ([#1621](https://github.com/open-mmlab/mmpretrain/pull/1621))
- Support OK-VQA dataset ([#1615](https://github.com/open-mmlab/mmpretrain/pull/1615))
- Support TextVQA dataset ([#1569](https://github.com/open-mmlab/mmpretrain/pull/1569))
- Support iTPN and HiViT ([#1584](https://github.com/open-mmlab/mmpretrain/pull/1584))
- Add retrieval mAP metric ([#1552](https://github.com/open-mmlab/mmpretrain/pull/1552))
- Support NoCap dataset based on BLIP. ([#1582](https://github.com/open-mmlab/mmpretrain/pull/1582))
- Add GQA dataset ([#1585](https://github.com/open-mmlab/mmpretrain/pull/1585))
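The retrieval mAP metric added above averages precision over the ranks at which relevant items are retrieved. A minimal stand-alone sketch of average precision for a single query, not mmpretrain's actual `RetrievalAveragePrecision` implementation (which operates on prediction tensors):

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of 0/1 flags ordered by retrieval rank,
    where 1 marks a relevant item.
    """
    hits = 0
    precisions = []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at this rank
    return sum(precisions) / max(hits, 1)

# Relevant items at ranks 1 and 3: AP = (1/1 + 2/3) / 2 = 5/6
ap = average_precision([1, 0, 1, 0])
```

The mAP reported by the metric is this quantity averaged over all queries.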
### Improvements
- Update fsdp vit-huge and vit-large config ([#1675](https://github.com/open-mmlab/mmpretrain/pull/1675))
- Support deepspeed with flexible runner ([#1673](https://github.com/open-mmlab/mmpretrain/pull/1673))
- Update Otter and LLaVA docs and config. ([#1653](https://github.com/open-mmlab/mmpretrain/pull/1653))
- Add `image_only` param to ScienceQA ([#1613](https://github.com/open-mmlab/mmpretrain/pull/1613))
- Support using "split" to specify the training/validation set ([#1535](https://github.com/open-mmlab/mmpretrain/pull/1535))
### Bug Fixes
- Refactor `_prepare_pos_embed` in ViT ([#1656](https://github.com/open-mmlab/mmpretrain/pull/1656), [#1679](https://github.com/open-mmlab/mmpretrain/pull/1679))
- Freeze pre norm in vision transformer ([#1672](https://github.com/open-mmlab/mmpretrain/pull/1672))
- Fix bug loading IN1k dataset ([#1641](https://github.com/open-mmlab/mmpretrain/pull/1641))
- Fix sam bug ([#1633](https://github.com/open-mmlab/mmpretrain/pull/1633))
- Fixed circular import error for new transform ([#1609](https://github.com/open-mmlab/mmpretrain/pull/1609))
- Update torchvision transform wrapper ([#1595](https://github.com/open-mmlab/mmpretrain/pull/1595))
- Set default out_type in CAM visualization ([#1586](https://github.com/open-mmlab/mmpretrain/pull/1586))
### Docs Update
- Fix spelling ([#1681](https://github.com/open-mmlab/mmpretrain/pull/1681))
- Fix doc typos ([#1671](https://github.com/open-mmlab/mmpretrain/pull/1671), [#1644](https://github.com/open-mmlab/mmpretrain/pull/1644), [#1629](https://github.com/open-mmlab/mmpretrain/pull/1629))
- Add t-SNE visualization doc ([#1555](https://github.com/open-mmlab/mmpretrain/pull/1555))
## v1.0.0rc8(22/05/2023)
### Highlights
- Support multiple multi-modal algorithms and inferencers. You can explore these features with the [gradio demo](https://github.com/open-mmlab/mmpretrain/tree/main/projects/gradio_demo)!
- Add EVA-02, DINOv2, ViT-SAM and GLIP backbones.
- Register torchvision transforms into MMPretrain, so you can now easily integrate torchvision's data augmentations in MMPretrain.
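Once registered, a torchvision transform can be referenced directly from a data pipeline config. A hypothetical fragment; the `torchvision/` type prefix and parameter names here are assumptions based on the wrapper PR, so check [#1265](https://github.com/open-mmlab/mmpretrain/pull/1265) for the exact registered names:

```python
# Hypothetical pipeline fragment mixing mmpretrain and torchvision
# transforms; the 'torchvision/' prefix is an assumed convention.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='torchvision/RandomResizedCrop', size=224),
    dict(type='torchvision/RandomHorizontalFlip', p=0.5),
    dict(type='PackInputs'),
]
```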
### New Features
- Support Chinese CLIP. ([#1576](https://github.com/open-mmlab/mmpretrain/pull/1576))
- Add ScienceQA Metrics ([#1577](https://github.com/open-mmlab/mmpretrain/pull/1577))
- Support multiple multi-modal algorithms and inferencers. ([#1561](https://github.com/open-mmlab/mmpretrain/pull/1561))
- Add EVA-02 backbone ([#1450](https://github.com/open-mmlab/mmpretrain/pull/1450))
- Support DINOv2 backbone ([#1522](https://github.com/open-mmlab/mmpretrain/pull/1522))
- Support some downstream classification datasets. ([#1467](https://github.com/open-mmlab/mmpretrain/pull/1467))
- Support GLIP ([#1308](https://github.com/open-mmlab/mmpretrain/pull/1308))
- Register torchvision transforms into mmpretrain ([#1265](https://github.com/open-mmlab/mmpretrain/pull/1265))
- Add ViT of SAM ([#1476](https://github.com/open-mmlab/mmpretrain/pull/1476))
### Improvements
- [Refactor] Support freezing channel reduction and add a layer-decay function ([#1490](https://github.com/open-mmlab/mmpretrain/pull/1490))
- [Refactor] Support resizing pos_embed while loading ckpt and format output ([#1488](https://github.com/open-mmlab/mmpretrain/pull/1488))
### Bug Fixes
- Fix scienceqa ([#1581](https://github.com/open-mmlab/mmpretrain/pull/1581))
- Fix config of beit ([#1528](https://github.com/open-mmlab/mmpretrain/pull/1528))
- Fix incorrect stage freezing in the RIFormer model ([#1573](https://github.com/open-mmlab/mmpretrain/pull/1573))
- Fix ddp bugs caused by `out_type`. ([#1570](https://github.com/open-mmlab/mmpretrain/pull/1570))
- Fix multi-task-head loss potential bug ([#1530](https://github.com/open-mmlab/mmpretrain/pull/1530))
- Support bce loss without batch augmentations ([#1525](https://github.com/open-mmlab/mmpretrain/pull/1525))
- Fix clip generator init bug ([#1518](https://github.com/open-mmlab/mmpretrain/pull/1518))
- Fix the bug in binary cross entropy loss ([#1499](https://github.com/open-mmlab/mmpretrain/pull/1499))
### Docs Update
- Update PoolFormer citation to CVPR version ([#1505](https://github.com/open-mmlab/mmpretrain/pull/1505))
- Refine Inference Doc ([#1489](https://github.com/open-mmlab/mmpretrain/pull/1489))
- Add doc for usage of confusion matrix ([#1513](https://github.com/open-mmlab/mmpretrain/pull/1513))
- Update MMagic link ([#1517](https://github.com/open-mmlab/mmpretrain/pull/1517))
- Fix example_project README ([#1575](https://github.com/open-mmlab/mmpretrain/pull/1575))
- Add NPU support page ([#1481](https://github.com/open-mmlab/mmpretrain/pull/1481))
- Remove outdated description in the train config docs ([#1473](https://github.com/open-mmlab/mmpretrain/pull/1473))
- Fix typo in MultiLabelDataset docstring ([#1483](https://github.com/open-mmlab/mmpretrain/pull/1483))
## v1.0.0rc7(07/04/2023)
### Highlights
- Integrated Self-supervised learning algorithms from **MMSelfSup**, such as **MAE**, **BEiT**, etc.
- Support **RIFormer**, a simple but effective vision backbone that removes the token mixer.
- Support **LeViT**, **XCiT**, **ViG** and **ConvNeXt-V2** backbones.
- Add t-SNE visualization.
- Refactor dataset pipeline visualization.
- Support confusion matrix calculation and plot.
### New Features
- Support RIFormer. ([#1453](https://github.com/open-mmlab/mmpretrain/pull/1453))
- Support XCiT Backbone. ([#1305](https://github.com/open-mmlab/mmclassification/pull/1305))
- Support calculating and plotting the confusion matrix. ([#1287](https://github.com/open-mmlab/mmclassification/pull/1287))
- Support RetrievalRecall metric & add ArcFace config ([#1316](https://github.com/open-mmlab/mmclassification/pull/1316))
- Add `ImageClassificationInferencer`. ([#1261](https://github.com/open-mmlab/mmclassification/pull/1261))
- Support InShop Dataset (Image Retrieval). ([#1019](https://github.com/open-mmlab/mmclassification/pull/1019))
- Support LeViT backbone. ([#1238](https://github.com/open-mmlab/mmclassification/pull/1238))
- Support ViG backbone. ([#1304](https://github.com/open-mmlab/mmclassification/pull/1304))
- Support ConvNeXt-V2 backbone. ([#1294](https://github.com/open-mmlab/mmclassification/pull/1294))
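The confusion matrix support above counts, for each (ground-truth, prediction) pair, how often it occurs. A dependency-free sketch of the underlying computation; mmpretrain's `ConfusionMatrix` metric works on prediction tensors, this just illustrates the idea:

```python
def confusion_matrix(gt_labels, pred_labels, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for gt, pred in zip(gt_labels, pred_labels):
        matrix[gt][pred] += 1
    return matrix

# Diagonal entries are correct predictions; off-diagonal entries
# show which classes get confused with which.
cm = confusion_matrix([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3)
```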
### Improvements
- Use PyTorch official `scaled_dot_product_attention` to accelerate `MultiheadAttention`. ([#1434](https://github.com/open-mmlab/mmpretrain/pull/1434))
- Add ln to vit avg_featmap output ([#1447](https://github.com/open-mmlab/mmpretrain/pull/1447))
- Update analysis tools and documentations. ([#1359](https://github.com/open-mmlab/mmclassification/pull/1359))
- Unify the `--out` and `--dump` in `tools/test.py`. ([#1307](https://github.com/open-mmlab/mmclassification/pull/1307))
- Enable toggling whether GeM pooling is trainable. ([#1246](https://github.com/open-mmlab/mmclassification/pull/1246))
- Update registries of mmcls. ([#1306](https://github.com/open-mmlab/mmclassification/pull/1306))
- Add metafile fill and validation tools. ([#1297](https://github.com/open-mmlab/mmclassification/pull/1297))
- Remove useless EfficientnetV2 config files. ([#1300](https://github.com/open-mmlab/mmclassification/pull/1300))
### Bug Fixes
- Fix precise bn hook ([#1466](https://github.com/open-mmlab/mmpretrain/pull/1466))
- Fix retrieval multi gpu bug ([#1319](https://github.com/open-mmlab/mmclassification/pull/1319))
- Fix error repvgg-deploy base config path. ([#1357](https://github.com/open-mmlab/mmclassification/pull/1357))
- Fix bug in test tools. ([#1309](https://github.com/open-mmlab/mmclassification/pull/1309))
### Docs Update
- Translate some tools tutorials to Chinese. ([#1321](https://github.com/open-mmlab/mmclassification/pull/1321))
- Add Chinese translation for runtime.md. ([#1313](https://github.com/open-mmlab/mmclassification/pull/1313))
# Changelog (MMClassification)
## v1.0.0rc5(30/12/2022)
### Highlights
- Support EVA, RevViT, EfficientnetV2, CLIP, TinyViT and MixMIM backbones.
- Reproduce the training accuracy of ConvNeXt and RepVGG.
- Support multi-task training and testing.
- Support Test-time Augmentation.
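Test-time augmentation runs the model on several augmented views of the same image and merges the predictions. A minimal illustration of the merging step only, independent of how the `--tta` option is actually implemented in mmpretrain:

```python
def merge_tta_scores(per_view_scores):
    """Average class scores across augmented views of one sample."""
    num_views = len(per_view_scores)
    num_classes = len(per_view_scores[0])
    return [
        sum(view[c] for view in per_view_scores) / num_views
        for c in range(num_classes)
    ]

# Two augmented views disagree slightly; averaging smooths them out.
scores = merge_tta_scores([[0.9, 0.1], [0.7, 0.3]])
```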
### New Features
- [Feature] Add EfficientnetV2 Backbone. ([#1253](https://github.com/open-mmlab/mmclassification/pull/1253))
- [Feature] Support TTA and add `--tta` in `tools/test.py`. ([#1161](https://github.com/open-mmlab/mmclassification/pull/1161))
- [Feature] Support Multi-task. ([#1229](https://github.com/open-mmlab/mmclassification/pull/1229))
- [Feature] Add clip backbone. ([#1258](https://github.com/open-mmlab/mmclassification/pull/1258))
- [Feature] Add mixmim backbone with checkpoints. ([#1224](https://github.com/open-mmlab/mmclassification/pull/1224))
- [Feature] Add TinyViT for dev-1.x. ([#1042](https://github.com/open-mmlab/mmclassification/pull/1042))
- [Feature] Add some scripts for development. ([#1257](https://github.com/open-mmlab/mmclassification/pull/1257))
- [Feature] Support EVA. ([#1239](https://github.com/open-mmlab/mmclassification/pull/1239))
- [Feature] Implementation of RevViT. ([#1127](https://github.com/open-mmlab/mmclassification/pull/1127))
### Improvements
- [Reproduce] Reproduce RepVGG Training Accuracy. ([#1264](https://github.com/open-mmlab/mmclassification/pull/1264))
- [Enhance] Support ConvNeXt More Weights. ([#1240](https://github.com/open-mmlab/mmclassification/pull/1240))
- [Reproduce] Update ConvNeXt config files. ([#1256](https://github.com/open-mmlab/mmclassification/pull/1256))
- [CI] Update CI to test PyTorch 1.13.0. ([#1260](https://github.com/open-mmlab/mmclassification/pull/1260))
- [Project] Add ACCV workshop 1st Solution. ([#1245](https://github.com/open-mmlab/mmclassification/pull/1245))
- [Project] Add Example project. ([#1254](https://github.com/open-mmlab/mmclassification/pull/1254))
### Bug Fixes
- [Fix] Fix imports in transforms. ([#1255](https://github.com/open-mmlab/mmclassification/pull/1255))
- [Fix] Fix CAM visualization. ([#1248](https://github.com/open-mmlab/mmclassification/pull/1248))
- [Fix] Fix the requirements and lazy register mmpretrain models. ([#1275](https://github.com/open-mmlab/mmclassification/pull/1275))
## v1.0.0rc4(06/12/2022)
### Highlights
- Upgrade API to get pre-defined models of MMClassification. See [#1236](https://github.com/open-mmlab/mmclassification/pull/1236) for more details.
- Refactor BEiT backbone and support v1/v2 inference. See [#1144](https://github.com/open-mmlab/mmclassification/pull/1144).
### New Features
- Support getting model from the name defined in the model-index file. ([#1236](https://github.com/open-mmlab/mmclassification/pull/1236))
### Improvements
- Support evaluate on both EMA and non-EMA models. ([#1204](https://github.com/open-mmlab/mmclassification/pull/1204))
- Refactor BEiT backbone and support v1/v2 inference. ([#1144](https://github.com/open-mmlab/mmclassification/pull/1144))
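Evaluating on both EMA and non-EMA models compares the exponentially averaged weights against the raw ones. The EMA update rule itself is simple; a scalar sketch (the real hook applies this element-wise to model parameters):

```python
def ema_update(ema_value, new_value, momentum=0.9):
    """Exponential moving average: keep most of the old average,
    mix in a little of the new value."""
    return momentum * ema_value + (1 - momentum) * new_value

# The average slowly tracks a constant signal of 1.0.
ema = 0.0
for step_value in [1.0, 1.0, 1.0]:
    ema = ema_update(ema, step_value, momentum=0.9)
```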
### Bug Fixes
- Fix `reparameterize_model.py` doesn't save meta info. ([#1221](https://github.com/open-mmlab/mmclassification/pull/1221))
- Fix dict update in BEiT. ([#1234](https://github.com/open-mmlab/mmclassification/pull/1234))
### Docs Update
- Update install tutorial. ([#1223](https://github.com/open-mmlab/mmclassification/pull/1223))
- Update MobileNetv2 & MobileNetv3 readme. ([#1222](https://github.com/open-mmlab/mmclassification/pull/1222))
- Add version selection in the banner. ([#1217](https://github.com/open-mmlab/mmclassification/pull/1217))
## v1.0.0rc3(21/11/2022)
### Highlights
- Add **Switch Recipe** Hook. We can now modify the training pipeline, mixup and loss settings during training, see [#1101](https://github.com/open-mmlab/mmclassification/pull/1101).
- Add **TIMM and HuggingFace** wrappers. Now you can train/use models in TIMM/HuggingFace directly, see [#1102](https://github.com/open-mmlab/mmclassification/pull/1102).
- Support **retrieval tasks**, see [#1055](https://github.com/open-mmlab/mmclassification/pull/1055).
- Reproduce **MobileOne** training accuracy. See [#1191](https://github.com/open-mmlab/mmclassification/pull/1191)
### New Features
- Add checkpoints from EfficientNets NoisyStudent & L2. ([#1122](https://github.com/open-mmlab/mmclassification/pull/1122))
- Migrate CSRA head to 1.x. ([#1177](https://github.com/open-mmlab/mmclassification/pull/1177))
- Support RepLKnet backbone. ([#1129](https://github.com/open-mmlab/mmclassification/pull/1129))
- Add Switch Recipe Hook. ([#1101](https://github.com/open-mmlab/mmclassification/pull/1101))
- Add adan optimizer. ([#1180](https://github.com/open-mmlab/mmclassification/pull/1180))
- Support DaViT. ([#1105](https://github.com/open-mmlab/mmclassification/pull/1105))
- Support Activation Checkpointing for ConvNeXt. ([#1153](https://github.com/open-mmlab/mmclassification/pull/1153))
- Add TIMM and HuggingFace wrappers to build classifiers from them directly. ([#1102](https://github.com/open-mmlab/mmclassification/pull/1102))
- Add reduction for neck ([#978](https://github.com/open-mmlab/mmclassification/pull/978))
- Support HorNet Backbone for dev1.x. ([#1094](https://github.com/open-mmlab/mmclassification/pull/1094))
- Add arcface head. ([#926](https://github.com/open-mmlab/mmclassification/pull/926))
- Add Base Retriever and Image2Image Retriever for retrieval tasks. ([#1055](https://github.com/open-mmlab/mmclassification/pull/1055))
- Support MobileViT backbone. ([#1068](https://github.com/open-mmlab/mmclassification/pull/1068))
### Improvements
- [Enhance] Enhance ArcFaceClsHead. ([#1181](https://github.com/open-mmlab/mmclassification/pull/1181))
- [Refactor] Refactor to use new fileio API in MMEngine. ([#1176](https://github.com/open-mmlab/mmclassification/pull/1176))
- [Enhance] Reproduce mobileone training accuracy. ([#1191](https://github.com/open-mmlab/mmclassification/pull/1191))
- [Enhance] Add info about deleted params in SwinV2. ([#1142](https://github.com/open-mmlab/mmclassification/pull/1142))
- [Enhance] Add more mobilenetv3 pretrains. ([#1154](https://github.com/open-mmlab/mmclassification/pull/1154))
- [Enhancement] RepVGG for YOLOX-PAI for dev-1.x. ([#1126](https://github.com/open-mmlab/mmclassification/pull/1126))
- [Improve] Speed up data preprocessor. ([#1064](https://github.com/open-mmlab/mmclassification/pull/1064))
### Bug Fixes
- Fix the torchserve. ([#1143](https://github.com/open-mmlab/mmclassification/pull/1143))
- Fix configs due to api refactor of `num_classes`. ([#1184](https://github.com/open-mmlab/mmclassification/pull/1184))
- Update mmpretrain2torchserve. ([#1189](https://github.com/open-mmlab/mmclassification/pull/1189))
- Fix for `inference_model` cannot get classes information in checkpoint. ([#1093](https://github.com/open-mmlab/mmclassification/pull/1093))
### Docs Update
- Add not-found page extension. ([#1207](https://github.com/open-mmlab/mmclassification/pull/1207))
- Update visualization doc. ([#1160](https://github.com/open-mmlab/mmclassification/pull/1160))
- Support sort and search the Model Summary table. ([#1100](https://github.com/open-mmlab/mmclassification/pull/1100))
- Improve the ResNet model page. ([#1118](https://github.com/open-mmlab/mmclassification/pull/1118))
- Update the ConvNeXt README. ([#1156](https://github.com/open-mmlab/mmclassification/pull/1156))
- Fix the installation docs link in README. ([#1164](https://github.com/open-mmlab/mmclassification/pull/1164))
- Improve ViT and MobileViT model pages. ([#1155](https://github.com/open-mmlab/mmclassification/pull/1155))
- Improve Swin doc and add Tabs extension. ([#1145](https://github.com/open-mmlab/mmclassification/pull/1145))
- Add MMEval projects link in README. ([#1162](https://github.com/open-mmlab/mmclassification/pull/1162))
- Add runtime configuration docs. ([#1128](https://github.com/open-mmlab/mmclassification/pull/1128))
- Add custom evaluation docs ([#1130](https://github.com/open-mmlab/mmclassification/pull/1130))
- Add custom pipeline docs. ([#1124](https://github.com/open-mmlab/mmclassification/pull/1124))
- Add MMYOLO projects link in MMCLS 1.x. ([#1117](https://github.com/open-mmlab/mmclassification/pull/1117))
## v1.0.0rc2(12/10/2022)
### New Features
- [Feature] Support DeiT3. ([#1065](https://github.com/open-mmlab/mmclassification/pull/1065))
### Improvements
- [Enhance] Update `analyze_results.py` for dev-1.x. ([#1071](https://github.com/open-mmlab/mmclassification/pull/1071))
- [Enhance] Get scores from inference api. ([#1070](https://github.com/open-mmlab/mmclassification/pull/1070))
### Bug Fixes
- [Fix] Update requirements. ([#1083](https://github.com/open-mmlab/mmclassification/pull/1083))
### Docs Update
- [Docs] Add 1x docs schedule. ([#1015](https://github.com/open-mmlab/mmclassification/pull/1015))
## v1.0.0rc1(30/9/2022)
### New Features
- Support MViT for MMCLS 1.x ([#1023](https://github.com/open-mmlab/mmclassification/pull/1023))
- Add ViT huge architecture. ([#1049](https://github.com/open-mmlab/mmclassification/pull/1049))
- Support EdgeNeXt for dev-1.x. ([#1037](https://github.com/open-mmlab/mmclassification/pull/1037))
- Support Swin Transformer V2 for MMCLS 1.x. ([#1029](https://github.com/open-mmlab/mmclassification/pull/1029))
- Add efficientformer Backbone for MMCls 1.x. ([#1031](https://github.com/open-mmlab/mmclassification/pull/1031))
- Add MobileOne Backbone For MMCls 1.x. ([#1030](https://github.com/open-mmlab/mmclassification/pull/1030))
- Support BEiT Transformer layer. ([#919](https://github.com/open-mmlab/mmclassification/pull/919))
### Improvements
- [Refactor] Fix visualization tools. ([#1045](https://github.com/open-mmlab/mmclassification/pull/1045))
- [Improve] Update benchmark scripts ([#1028](https://github.com/open-mmlab/mmclassification/pull/1028))
- [Improve] Update tools to enable `pin_memory` and `persistent_workers` by default. ([#1024](https://github.com/open-mmlab/mmclassification/pull/1024))
- [CI] Update circle-ci and github workflow. ([#1018](https://github.com/open-mmlab/mmclassification/pull/1018))
### Bug Fixes
- Fix verify dataset tool in 1.x. ([#1062](https://github.com/open-mmlab/mmclassification/pull/1062))
- Fix `loss_weight` in `LabelSmoothLoss`. ([#1058](https://github.com/open-mmlab/mmclassification/pull/1058))
- Fix the output position of Swin-Transformer. ([#947](https://github.com/open-mmlab/mmclassification/pull/947))
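The `loss_weight` fix concerns `LabelSmoothLoss`, which redistributes a small probability mass `eps` from the target class across all classes before computing cross-entropy, then scales the loss by `loss_weight`. A sketch of the target-smoothing step only; smoothing conventions vary (some spread `eps` over `K-1` off-classes instead), and this is not the fixed mmclassification code:

```python
def smooth_one_hot(label, num_classes, eps=0.1):
    """Smoothed target: roughly 1 - eps on the true class,
    eps spread uniformly over all classes."""
    off_value = eps / num_classes
    target = [off_value] * num_classes
    target[label] += 1.0 - eps
    return target

# Smoothed targets still sum to 1.
t = smooth_one_hot(1, num_classes=4, eps=0.1)
```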
### Docs Update
- Auto generate model summary table. ([#1010](https://github.com/open-mmlab/mmclassification/pull/1010))
- Refactor new modules tutorial. ([#998](https://github.com/open-mmlab/mmclassification/pull/998))
## v1.0.0rc0(31/8/2022)
MMClassification 1.0.0rc0 is the first version of MMClassification 1.x, a part of the OpenMMLab 2.0 projects.
Built upon the new [training engine](https://github.com/open-mmlab/mmengine), MMClassification 1.x unifies the interfaces of dataset, models, evaluation, and visualization.
And there are some BC-breaking changes. Please check [the migration tutorial](https://mmclassification.readthedocs.io/en/1.x/migration.html) for more details.
## v0.23.1(2/6/2022)
### New Features
- Dedicated MMClsWandbHook for MMClassification (Weights and Biases Integration) ([#764](https://github.com/open-mmlab/mmclassification/pull/764))
### Improvements
- Use mdformat instead of markdownlint to format markdown. ([#844](https://github.com/open-mmlab/mmclassification/pull/844))
### Bug Fixes
- Fix wrong `--local_rank`.
### Docs Update
- Update install tutorials. ([#854](https://github.com/open-mmlab/mmclassification/pull/854))
- Fix wrong link in README. ([#835](https://github.com/open-mmlab/mmclassification/pull/835))
## v0.23.0(1/5/2022)
### New Features
- Support DenseNet. ([#750](https://github.com/open-mmlab/mmclassification/pull/750))
- Support VAN. ([#739](https://github.com/open-mmlab/mmclassification/pull/739))
### Improvements
- Support training on IPU and add fine-tuning configs of ViT. ([#723](https://github.com/open-mmlab/mmclassification/pull/723))
### Docs Update
- New style API reference, and easier to use! Welcome [view it](https://mmclassification.readthedocs.io/en/master/api/models.html). ([#774](https://github.com/open-mmlab/mmclassification/pull/774))
## v0.22.1(15/4/2022)
### New Features
- [Feature] Support resize relative position embedding in `SwinTransformer`. ([#749](https://github.com/open-mmlab/mmclassification/pull/749))
- [Feature] Add PoolFormer backbone and checkpoints. ([#746](https://github.com/open-mmlab/mmclassification/pull/746))
### Improvements
- [Enhance] Improve CPE performance by reducing memory copies. ([#762](https://github.com/open-mmlab/mmclassification/pull/762))
- [Enhance] Add extra dataloader settings in configs. ([#752](https://github.com/open-mmlab/mmclassification/pull/752))
## v0.22.0(30/3/2022)
### Highlights
- Support a series of CSP Networks, such as CSP-ResNet, CSP-ResNeXt and CSP-DarkNet.
- A new `CustomDataset` class to help you build your own dataset!
- Support ConvMixer, RepMLP and a new dataset, CUB.
### New Features
- [Feature] Add CSPNet backbone and checkpoints. ([#735](https://github.com/open-mmlab/mmclassification/pull/735))
- [Feature] Add `CustomDataset`. ([#738](https://github.com/open-mmlab/mmclassification/pull/738))
- [Feature] Assign different seeds to different ranks. ([#744](https://github.com/open-mmlab/mmclassification/pull/744))
- [Feature] Support ConvMixer. ([#716](https://github.com/open-mmlab/mmclassification/pull/716))
- [Feature] Our `dist_train` & `dist_test` tools support distributed training on multiple machines. ([#734](https://github.com/open-mmlab/mmclassification/pull/734))
- [Feature] Add RepMLP backbone and checkpoints. ([#709](https://github.com/open-mmlab/mmclassification/pull/709))
- [Feature] Support CUB dataset. ([#703](https://github.com/open-mmlab/mmclassification/pull/703))
- [Feature] Support ResizeMix. ([#676](https://github.com/open-mmlab/mmclassification/pull/676))
### Improvements
- [Enhance] Use `--a-b` instead of `--a_b` in arguments. ([#754](https://github.com/open-mmlab/mmclassification/pull/754))
- [Enhance] Add `get_cat_ids` and `get_gt_labels` to KFoldDataset. ([#721](https://github.com/open-mmlab/mmclassification/pull/721))
- [Enhance] Set torch seed in `worker_init_fn`. ([#733](https://github.com/open-mmlab/mmclassification/pull/733))
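The `--a-b` flag rename works because argparse maps dashes in long option names to underscores on the parsed namespace, so `--cfg-options` still lands in `args.cfg_options` and downstream code needs no changes. A quick demonstration with a hypothetical flag:

```python
import argparse

parser = argparse.ArgumentParser()
# Dashes in the flag name become underscores in the attribute name.
parser.add_argument('--cfg-options', default=None)
args = parser.parse_args(['--cfg-options', 'a=1'])
# args.cfg_options now holds 'a=1'
```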
### Bug Fixes
- [Fix] Fix the discontiguous output feature map of ConvNeXt. ([#743](https://github.com/open-mmlab/mmclassification/pull/743))
### Docs Update
- [Docs] Add brief installation steps in README for copy&paste. ([#755](https://github.com/open-mmlab/mmclassification/pull/755))
- [Docs] Fix logo URL link from mmocr to mmpretrain. ([#732](https://github.com/open-mmlab/mmclassification/pull/732))
## v0.21.0(04/03/2022)
### Highlights
- Support ResNetV1c and Wide-ResNet, and provide pre-trained models.
- Support dynamic input shape for ViT-based algorithms. Now our ViT, DeiT, Swin-Transformer and T2T-ViT support forwarding with any input shape.
- Reproduce training results of DeiT. Our DeiT-T and DeiT-S achieve higher accuracy compared with the official weights.
### New Features
- Add ResNetV1c. ([#692](https://github.com/open-mmlab/mmclassification/pull/692))
- Support Wide-ResNet. ([#715](https://github.com/open-mmlab/mmclassification/pull/715))
- Support gem pooling ([#677](https://github.com/open-mmlab/mmclassification/pull/677))
### Improvements
- Reproduce training results of DeiT. ([#711](https://github.com/open-mmlab/mmclassification/pull/711))
- Add ConvNeXt pretrain models on ImageNet-1k. ([#707](https://github.com/open-mmlab/mmclassification/pull/707))
- Support dynamic input shape for ViT-based algorithms. ([#706](https://github.com/open-mmlab/mmclassification/pull/706))
- Add `evaluate` function for ConcatDataset. ([#650](https://github.com/open-mmlab/mmclassification/pull/650))
- Enhance vis-pipeline tool. ([#604](https://github.com/open-mmlab/mmclassification/pull/604))
- Return exit code 1 if a script fails. ([#694](https://github.com/open-mmlab/mmclassification/pull/694))
- Use PyTorch official `one_hot` to implement `convert_to_one_hot`. ([#696](https://github.com/open-mmlab/mmclassification/pull/696))
- Add a new pre-commit-hook to automatically add a copyright. ([#710](https://github.com/open-mmlab/mmclassification/pull/710))
- Add deprecation message for deploy tools. ([#697](https://github.com/open-mmlab/mmclassification/pull/697))
- Upgrade isort pre-commit hooks. ([#687](https://github.com/open-mmlab/mmclassification/pull/687))
- Use `--gpu-id` instead of `--gpu-ids` in non-distributed multi-gpu training/testing. ([#688](https://github.com/open-mmlab/mmclassification/pull/688))
- Remove deprecation. ([#633](https://github.com/open-mmlab/mmclassification/pull/633))
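Switching `convert_to_one_hot` to PyTorch's official `one_hot` keeps behavior identical while dropping custom code. For reference, the operation itself is just index-to-indicator expansion; a dependency-free sketch:

```python
def one_hot(labels, num_classes):
    """Pure-Python equivalent of torch.nn.functional.one_hot:
    each label index becomes a 0/1 indicator row."""
    return [
        [1 if c == label else 0 for c in range(num_classes)]
        for label in labels
    ]

encoded = one_hot([0, 2], num_classes=3)
```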
### Bug Fixes
- Fix Conformer forward with irregular input size. ([#686](https://github.com/open-mmlab/mmclassification/pull/686))
- Add `dist.barrier` to fix a bug in directory checking. ([#666](https://github.com/open-mmlab/mmclassification/pull/666))
## v0.20.1(07/02/2022)
### Bug Fixes
- Fix the MMCV dependency version.
## v0.20.0(30/01/2022)
### Highlights
- Support K-fold cross-validation. The tutorial will be released later.
- Support HRNet, ConvNeXt, Twins and EfficientNet.
- Support model conversion from PyTorch to Core-ML via a conversion tool.
### New Features
- Support K-fold cross-validation. ([#563](https://github.com/open-mmlab/mmclassification/pull/563))
- Support HRNet and add pre-trained models. ([#660](https://github.com/open-mmlab/mmclassification/pull/660))
- Support ConvNeXt and add pre-trained models. ([#670](https://github.com/open-mmlab/mmclassification/pull/670))
- Support Twins and add pre-trained models. ([#642](https://github.com/open-mmlab/mmclassification/pull/642))
- Support EfficientNet and add pre-trained models. ([#649](https://github.com/open-mmlab/mmclassification/pull/649))
- Support `features_only` option in `TIMMBackbone`. ([#668](https://github.com/open-mmlab/mmclassification/pull/668))
- Add conversion script from pytorch to Core-ML model. ([#597](https://github.com/open-mmlab/mmclassification/pull/597))
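K-fold cross-validation partitions the dataset indices into K folds and trains K times, each time holding one fold out for validation. A dependency-free sketch of the split; `KFoldDataset` wraps this idea around a real dataset:

```python
def kfold_split(num_samples, num_folds, fold):
    """Return (train_indices, val_indices) for one fold."""
    val = list(range(num_samples))[fold::num_folds]  # every num_folds-th sample
    val_set = set(val)
    train = [i for i in range(num_samples) if i not in val_set]
    return train, val

# Fold 0 of 5 on 10 samples: indices 0 and 5 are held out.
train_idx, val_idx = kfold_split(10, num_folds=5, fold=0)
```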
### Improvements
- New-style CPU training and inference. ([#674](https://github.com/open-mmlab/mmclassification/pull/674))
- Add setup multi-processing both in train and test. ([#671](https://github.com/open-mmlab/mmclassification/pull/671))
- Rewrite channel split operation in ShufflenetV2. ([#632](https://github.com/open-mmlab/mmclassification/pull/632))
- Deprecate the support for "python setup.py test". ([#646](https://github.com/open-mmlab/mmclassification/pull/646))
- Support single-label, softmax, and custom eps options in asymmetric loss. ([#609](https://github.com/open-mmlab/mmclassification/pull/609))
- Save class names in best checkpoint created by evaluation hook. ([#641](https://github.com/open-mmlab/mmclassification/pull/641))
### Bug Fixes
- Fix potential unexpected behaviors if `metric_options` is not specified in multi-label evaluation. ([#647](https://github.com/open-mmlab/mmclassification/pull/647))
- Fix API changes in `pytorch-grad-cam>=1.3.7`. ([#656](https://github.com/open-mmlab/mmclassification/pull/656))
- Fix bug which breaks `cal_train_time` in `analyze_logs.py`. ([#662](https://github.com/open-mmlab/mmclassification/pull/662))
### Docs Update
- Update README in configs according to OpenMMLab standard. ([#672](https://github.com/open-mmlab/mmclassification/pull/672))
- Update installation guide and README. ([#624](https://github.com/open-mmlab/mmclassification/pull/624))
## v0.19.0(31/12/2021)
### Highlights
- The feature extraction function has been enhanced. See [#593](https://github.com/open-mmlab/mmclassification/pull/593) for more details.
- Provide the high-acc ResNet-50 training settings from [*ResNet strikes back*](https://arxiv.org/abs/2110.00476).
- Reproduce the training accuracy of T2T-ViT & RegNetX, and provide self-training checkpoints.
- Support DeiT & Conformer backbone and checkpoints.
- Provide a CAM visualization tool based on [pytorch-grad-cam](https://github.com/jacobgil/pytorch-grad-cam), and detailed [user guide](https://mmclassification.readthedocs.io/en/latest/tools/visualization.html#class-activation-map-visualization)!
### New Features
- Support Precise BN. ([#401](https://github.com/open-mmlab/mmclassification/pull/401))
- Add CAM visualization tool. ([#577](https://github.com/open-mmlab/mmclassification/pull/577))
- Support Repeated Augmentation and a sampler registry. ([#588](https://github.com/open-mmlab/mmclassification/pull/588))
- Add DeiT backbone and checkpoints. ([#576](https://github.com/open-mmlab/mmclassification/pull/576))
- Support LAMB optimizer. ([#591](https://github.com/open-mmlab/mmclassification/pull/591))
- Implement the conformer backbone. ([#494](https://github.com/open-mmlab/mmclassification/pull/494))
- Add a freeze function for the Swin Transformer model. ([#574](https://github.com/open-mmlab/mmclassification/pull/574))
- Support gradient checkpointing in Swin Transformer to save memory. ([#557](https://github.com/open-mmlab/mmclassification/pull/557))
### Improvements
- [Reproduction] Reproduce RegNetX training accuracy. ([#587](https://github.com/open-mmlab/mmclassification/pull/587))
- [Reproduction] Reproduce training results of T2T-ViT. ([#610](https://github.com/open-mmlab/mmclassification/pull/610))
- [Enhance] Provide high-acc training settings of ResNet. ([#572](https://github.com/open-mmlab/mmclassification/pull/572))
- [Enhance] Set a random seed when the user does not set a seed. ([#554](https://github.com/open-mmlab/mmclassification/pull/554))
- [Enhance] Added `NumClassCheckHook` and unit tests. ([#559](https://github.com/open-mmlab/mmclassification/pull/559))
- [Enhance] Enhance feature extraction function. ([#593](https://github.com/open-mmlab/mmclassification/pull/593))
- [Enhance] Improve efficiency of precision, recall, f1_score and support. ([#595](https://github.com/open-mmlab/mmclassification/pull/595))
- [Enhance] Improve accuracy calculation performance. ([#592](https://github.com/open-mmlab/mmclassification/pull/592))
- [Refactor] Refactor `analysis_log.py`. ([#529](https://github.com/open-mmlab/mmclassification/pull/529))
- [Refactor] Use new API of matplotlib to handle blocking input in visualization. ([#568](https://github.com/open-mmlab/mmclassification/pull/568))
- [CI] Cancel previous runs that are not completed. ([#583](https://github.com/open-mmlab/mmclassification/pull/583))
- [CI] Skip build CI if only configs or docs are modified. ([#575](https://github.com/open-mmlab/mmclassification/pull/575))
### Bug Fixes
- Fix test sampler bug. ([#611](https://github.com/open-mmlab/mmclassification/pull/611))
- Try to create a symbolic link, otherwise copy. ([#580](https://github.com/open-mmlab/mmclassification/pull/580))
- Fix a bug for multiple output in swin transformer. ([#571](https://github.com/open-mmlab/mmclassification/pull/571))
### Docs Update
- Update mmcv, torch, cuda version in Dockerfile and docs. ([#594](https://github.com/open-mmlab/mmclassification/pull/594))
- Add analysis&misc docs. ([#525](https://github.com/open-mmlab/mmclassification/pull/525))
- Fix docs build dependency. ([#584](https://github.com/open-mmlab/mmclassification/pull/584))
## v0.18.0(30/11/2021)
### Highlights
- Support MLP-Mixer backbone and provide pre-trained checkpoints.
- Add a tool to visualize the learning rate curve of the training phase. Welcome to use with the [tutorial](https://mmclassification.readthedocs.io/en/latest/tools/visualization.html#learning-rate-schedule-visualization)!
### New Features
- Add MLP Mixer Backbone. ([#528](https://github.com/open-mmlab/mmclassification/pull/528), [#539](https://github.com/open-mmlab/mmclassification/pull/539))
- Support positive weights in BCE. ([#516](https://github.com/open-mmlab/mmclassification/pull/516))
- Add a tool to visualize learning rate in each iterations. ([#498](https://github.com/open-mmlab/mmclassification/pull/498))
### Improvements
- Use CircleCI to do unit tests. ([#567](https://github.com/open-mmlab/mmclassification/pull/567))
- Focal loss for single label tasks. ([#548](https://github.com/open-mmlab/mmclassification/pull/548))
- Remove useless `import_modules_from_string`. ([#544](https://github.com/open-mmlab/mmclassification/pull/544))
- Rename config files according to the config name standard. ([#508](https://github.com/open-mmlab/mmclassification/pull/508))
- Use `reset_classifier` to remove head of timm backbones. ([#534](https://github.com/open-mmlab/mmclassification/pull/534))
- Support passing arguments to loss from head. ([#523](https://github.com/open-mmlab/mmclassification/pull/523))
- Refactor `Resize` transform and add `Pad` transform. ([#506](https://github.com/open-mmlab/mmclassification/pull/506))
- Update mmcv dependency version. ([#509](https://github.com/open-mmlab/mmclassification/pull/509))
### Bug Fixes
- Fix bug when using `ClassBalancedDataset`. ([#555](https://github.com/open-mmlab/mmclassification/pull/555))
- Fix a bug when using iter-based runner with 'val' workflow. ([#542](https://github.com/open-mmlab/mmclassification/pull/542))
- Fix interpolation method checking in `Resize`. ([#547](https://github.com/open-mmlab/mmclassification/pull/547))
- Fix a bug when loading checkpoints in a multi-GPU environment. ([#527](https://github.com/open-mmlab/mmclassification/pull/527))
- Fix an error on indexing scalar metrics in `analyze_result.py`. ([#518](https://github.com/open-mmlab/mmclassification/pull/518))
- Fix wrong condition judgment in `analyze_logs.py` and prevent empty curve. ([#510](https://github.com/open-mmlab/mmclassification/pull/510))
### Docs Update
- Fix vit config and model broken links. ([#564](https://github.com/open-mmlab/mmclassification/pull/564))
- Add abstract and image for every paper. ([#546](https://github.com/open-mmlab/mmclassification/pull/546))
- Add mmflow and mim in banner and readme. ([#543](https://github.com/open-mmlab/mmclassification/pull/543))
- Add schedule and runtime tutorial docs. ([#499](https://github.com/open-mmlab/mmclassification/pull/499))
- Add the top-5 acc in ResNet-CIFAR README. ([#531](https://github.com/open-mmlab/mmclassification/pull/531))
- Fix TOC of `visualization.md` and add example images. ([#513](https://github.com/open-mmlab/mmclassification/pull/513))
- Use docs link of other projects and add MMCV docs. ([#511](https://github.com/open-mmlab/mmclassification/pull/511))
## v0.17.0(29/10/2021)
### Highlights
- Support Tokens-to-Token ViT backbone and Res2Net backbone. Welcome to use!
- Support ImageNet21k dataset.
- Add a pipeline visualization tool. Try it with the [tutorials](https://mmclassification.readthedocs.io/en/latest/tools/visualization.html#pipeline-visualization)!
### New Features
- Add Tokens-to-Token ViT backbone and converted checkpoints. ([#467](https://github.com/open-mmlab/mmclassification/pull/467))
- Add Res2Net backbone and converted weights. ([#465](https://github.com/open-mmlab/mmclassification/pull/465))
- Support ImageNet21k dataset. ([#461](https://github.com/open-mmlab/mmclassification/pull/461))
- Support seesaw loss. ([#500](https://github.com/open-mmlab/mmclassification/pull/500))
- Add a pipeline visualization tool. ([#406](https://github.com/open-mmlab/mmclassification/pull/406))
- Add a tool to find broken files. ([#482](https://github.com/open-mmlab/mmclassification/pull/482))
- Add a tool to test TorchServe. ([#468](https://github.com/open-mmlab/mmclassification/pull/468))
### Improvements
- Refactor Vision Transformer. ([#395](https://github.com/open-mmlab/mmclassification/pull/395))
- Use context manager to reuse matplotlib figures. ([#432](https://github.com/open-mmlab/mmclassification/pull/432))
### Bug Fixes
- Remove `DistSamplerSeedHook` if use `IterBasedRunner`. ([#501](https://github.com/open-mmlab/mmclassification/pull/501))
- Set the priority of `EvalHook` to "LOW" to avoid a bug when using `IterBasedRunner`. ([#488](https://github.com/open-mmlab/mmclassification/pull/488))
- Fix a wrong parameter of `get_root_logger` in `apis/train.py`. ([#486](https://github.com/open-mmlab/mmclassification/pull/486))
- Fix version check in dataset builder. ([#474](https://github.com/open-mmlab/mmclassification/pull/474))
### Docs Update
- Add English Colab tutorials and update Chinese Colab tutorials. ([#483](https://github.com/open-mmlab/mmclassification/pull/483), [#497](https://github.com/open-mmlab/mmclassification/pull/497))
- Add tutorial for config files. ([#487](https://github.com/open-mmlab/mmclassification/pull/487))
- Add model-pages in Model Zoo. ([#480](https://github.com/open-mmlab/mmclassification/pull/480))
- Add code-spell pre-commit hook and fix a large amount of typos. ([#470](https://github.com/open-mmlab/mmclassification/pull/470))
## v0.16.0(30/9/2021)
### Highlights
- We have improved compatibility with downstream repositories like MMDetection and MMSegmentation. We will add some examples about how to use our backbones in MMDetection.
- Add RepVGG backbone and checkpoints. Welcome to use it!
- Add timm backbones wrapper, now you can simply use backbones of pytorch-image-models in MMClassification!
### New Features
- Add RepVGG backbone and checkpoints. ([#414](https://github.com/open-mmlab/mmclassification/pull/414))
- Add timm backbones wrapper. ([#427](https://github.com/open-mmlab/mmclassification/pull/427))
### Improvements
- Fix TnT compatibility and verbose warning. ([#436](https://github.com/open-mmlab/mmclassification/pull/436))
- Support setting `--out-items` in `tools/test.py`. ([#437](https://github.com/open-mmlab/mmclassification/pull/437))
- Add datetime info and saving model using torch\<1.6 format. ([#439](https://github.com/open-mmlab/mmclassification/pull/439))
- Improve downstream repositories compatibility. ([#421](https://github.com/open-mmlab/mmclassification/pull/421))
- Rename the option `--options` to `--cfg-options` in some tools. ([#425](https://github.com/open-mmlab/mmclassification/pull/425))
- Add PyTorch 1.9 and Python 3.9 build workflow, and remove some CI. ([#422](https://github.com/open-mmlab/mmclassification/pull/422))
### Bug Fixes
- Fix format error in `test.py` when metric returns `np.ndarray`. ([#441](https://github.com/open-mmlab/mmclassification/pull/441))
- Fix `publish_model` bug if no parent of `out_file`. ([#463](https://github.com/open-mmlab/mmclassification/pull/463))
- Fix num_classes bug in pytorch2onnx.py. ([#458](https://github.com/open-mmlab/mmclassification/pull/458))
- Fix missing runtime requirement `packaging`. ([#459](https://github.com/open-mmlab/mmclassification/pull/459))
- Fix saving simplified model bug in ONNX export tool. ([#438](https://github.com/open-mmlab/mmclassification/pull/438))
### Docs Update
- Update `getting_started.md` and `install.md`. And rewrite `finetune.md`. ([#466](https://github.com/open-mmlab/mmclassification/pull/466))
- Use PyTorch style docs theme. ([#457](https://github.com/open-mmlab/mmclassification/pull/457))
- Update metafile and Readme. ([#435](https://github.com/open-mmlab/mmclassification/pull/435))
- Add `CITATION.cff`. ([#428](https://github.com/open-mmlab/mmclassification/pull/428))
## v0.15.0(31/8/2021)
### Highlights
- Support `hparams` argument in `AutoAugment` and `RandAugment` to provide hyperparameters for sub-policies.
- Support custom squeeze channels in `SELayer`.
- Support classwise weight in losses.
### New Features
- Add `hparams` argument in `AutoAugment` and `RandAugment` and some other improvement. ([#398](https://github.com/open-mmlab/mmclassification/pull/398))
- Support classwise weight in losses. ([#388](https://github.com/open-mmlab/mmclassification/pull/388))
- Enhance `SELayer` to support custom squeeze channels. ([#417](https://github.com/open-mmlab/mmclassification/pull/417))
### Code Refactor
- Better result visualization. ([#419](https://github.com/open-mmlab/mmclassification/pull/419))
- Use `post_process` function to handle pred result processing. ([#390](https://github.com/open-mmlab/mmclassification/pull/390))
- Update `digit_version` function. ([#402](https://github.com/open-mmlab/mmclassification/pull/402))
- Prevent albumentations from installing both opencv and opencv-headless. ([#397](https://github.com/open-mmlab/mmclassification/pull/397))
- Avoid unnecessary listdir when building ImageNet. ([#396](https://github.com/open-mmlab/mmclassification/pull/396))
- Use dynamic mmcv download link in TorchServe dockerfile. ([#387](https://github.com/open-mmlab/mmclassification/pull/387))
### Docs Improvement
- Add readme of some algorithms and update meta yml. ([#418](https://github.com/open-mmlab/mmclassification/pull/418))
- Add Copyright information. ([#413](https://github.com/open-mmlab/mmclassification/pull/413))
- Fix typo 'metirc'. ([#411](https://github.com/open-mmlab/mmclassification/pull/411))
- Update QQ group QR code. ([#393](https://github.com/open-mmlab/mmclassification/pull/393))
- Add PR template and modify issue template. ([#380](https://github.com/open-mmlab/mmclassification/pull/380))
## v0.14.0(4/8/2021)
### Highlights
- Add Transformer-in-Transformer backbone and pretrained checkpoints; refer to [the paper](https://arxiv.org/abs/2103.00112).
- Add Chinese colab tutorial.
- Provide dockerfile to build mmpretrain dev docker image.
### New Features
- Add transformer in transformer backbone and pretrain checkpoints. ([#339](https://github.com/open-mmlab/mmclassification/pull/339))
- Support mim, welcome to use mim to manage your mmpretrain project. ([#376](https://github.com/open-mmlab/mmclassification/pull/376))
- Add Dockerfile. ([#365](https://github.com/open-mmlab/mmclassification/pull/365))
- Add ResNeSt configs. ([#332](https://github.com/open-mmlab/mmclassification/pull/332))
### Improvements
- Use the `persistent_workers` option if available to accelerate training. ([#349](https://github.com/open-mmlab/mmclassification/pull/349))
- Add Chinese ipynb tutorial. ([#306](https://github.com/open-mmlab/mmclassification/pull/306))
- Refactor unit tests. ([#321](https://github.com/open-mmlab/mmclassification/pull/321))
- Support testing mmdet inference with an mmpretrain backbone. ([#343](https://github.com/open-mmlab/mmclassification/pull/343))
- Use zero as default value of `thrs` in metrics. ([#341](https://github.com/open-mmlab/mmclassification/pull/341))
### Bug Fixes
- Fix ImageNet dataset annotation file parse bug. ([#370](https://github.com/open-mmlab/mmclassification/pull/370))
- Fix docstring typo and init bug in ShuffleNetV1. ([#374](https://github.com/open-mmlab/mmclassification/pull/374))
- Use a local ATTENTION registry to avoid conflicts with other repositories. ([#375](https://github.com/open-mmlab/mmclassification/pull/375))
- Fix swin transformer config bug. ([#355](https://github.com/open-mmlab/mmclassification/pull/355))
- Fix `patch_cfg` argument bug in SwinTransformer. ([#368](https://github.com/open-mmlab/mmclassification/pull/368))
- Fix duplicate `init_weights` call in ViT init function. ([#373](https://github.com/open-mmlab/mmclassification/pull/373))
- Fix broken `_base_` link in a resnet config. ([#361](https://github.com/open-mmlab/mmclassification/pull/361))
- Fix vgg-19 model link missing. ([#363](https://github.com/open-mmlab/mmclassification/pull/363))
## v0.13.0(3/7/2021)
- Support Swin-Transformer backbone and add training configs for Swin-Transformer on ImageNet.
### New Features
- Support Swin-Transformer backbone and add training configs for Swin-Transformer on ImageNet. (#271)
- Add pretrained model of RegNetX. (#269)
- Support adding custom hooks in config file. (#305)
- Improve and add Chinese translation of `CONTRIBUTING.md` and all tools tutorials. (#320)
- Dump config before training. (#282)
- Add torchscript and torchserve deployment tools. (#279, #284)
### Improvements
- Improve test tools and add some new tools. (#322)
- Correct MobileNetV3 backbone structure and add pretrained models. (#291)
- Refactor `PatchEmbed` and `HybridEmbed` as independent components. (#330)
- Refactor mixup and cutmix as `Augments` to support more functions. (#278)
- Refactor weights initialization method. (#270, #318, #319)
- Refactor `LabelSmoothLoss` to support multiple calculation formulas. (#285)
### Bug Fixes
- Fix bug for CPU training. (#286)
- Fix missing test data when `num_imgs` can not be evenly divided by `num_gpus`. (#299)
- Fix build compatible with pytorch v1.3-1.5. (#301)
- Fix `magnitude_std` bug in `RandAugment`. (#309)
- Fix bug when `samples_per_gpu` is 1. (#311)
## v0.12.0(3/6/2021)
- Finish adding Chinese tutorials and build Chinese documentation on readthedocs.
- Update ResNeXt checkpoints and ResNet checkpoints on CIFAR.
### New Features
- Improve and add Chinese translation of `data_pipeline.md` and `new_modules.md`. (#265)
- Build Chinese translation on readthedocs. (#267)
- Add an argument efficientnet_style to `RandomResizedCrop` and `CenterCrop`. (#268)
### Improvements
- Only allow directory operation when rank==0 when testing. (#258)
- Fix typo in `base_head`. (#274)
- Update ResNeXt checkpoints. (#283)
### Bug Fixes
- Add attribute `data.test` in MNIST configs. (#264)
- Download CIFAR/MNIST dataset only on rank 0. (#273)
- Fix MMCV version compatibility. (#276)
- Fix CIFAR color channels bug and update checkpoints in model zoo. (#280)
## v0.11.1(21/5/2021)
- Refine `new_dataset.md` and add Chinese translation of `finetune.md` and `new_dataset.md`.
### New Features
- Add `dim` argument for `GlobalAveragePooling`. (#236)
- Add random noise to `RandAugment` magnitude. (#240)
- Refine `new_dataset.md` and add Chinese translation of `finetune.md` and `new_dataset.md`. (#243)
### Improvements
- Refactor arguments passing for Heads. (#239)
- Allow more flexible `magnitude_range` in `RandAugment`. (#249)
- Inherit the MMCV registry so that, in the future, OpenMMLab repos like MMDet and MMSeg can directly use the backbones supported in MMCls. (#252)
### Bug Fixes
- Fix typo in `analyze_results.py`. (#237)
- Fix typo in unittests. (#238)
- Check if specified tmpdir exists when testing to avoid deleting existing data. (#242 & #258)
- Add missing config files in `MANIFEST.in`. (#250 & #255)
- Use temporary directory under shared directory to collect results to avoid unavailability of temporary directory for multi-node testing. (#251)
## v0.11.0(1/5/2021)
- Support cutmix trick.
- Support random augmentation.
- Add `tools/deployment/test.py` as an ONNX runtime test tool.
- Support ViT backbone and add training configs for ViT on ImageNet.
- Add Chinese `README.md` and some Chinese tutorials.
### New Features
- Support cutmix trick. (#198)
- Add `simplify` option in `pytorch2onnx.py`. (#200)
- Support random augmentation. (#201)
- Add config and checkpoint for training ResNet on CIFAR-100. (#208)
- Add `tools/deployment/test.py` as an ONNX runtime test tool. (#212)
- Support ViT backbone and add training configs for ViT on ImageNet. (#214)
- Add finetuning configs for ViT on ImageNet. (#217)
- Add `device` option to support training on CPU. (#219)
- Add Chinese `README.md` and some Chinese tutorials. (#221)
- Add `metafile.yml` in configs to support interaction with Papers With Code (PWC) and MMCLI. (#225)
- Upload configs and converted checkpoints for ViT fine-tuning on ImageNet. (#230)
### Improvements
- Fix `LabelSmoothLoss` so that label smoothing and mixup could be enabled at the same time. (#203)
- Add `cal_acc` option in `ClsHead`. (#206)
- Check `CLASSES` in checkpoint to avoid unexpected key error. (#207)
- Check mmcv version when importing mmpretrain to ensure compatibility. (#209)
- Update `CONTRIBUTING.md` to align with that in MMCV. (#210)
- Change tags to html comments in configs README.md. (#226)
- Clean codes in ViT backbone. (#227)
- Reformat `pytorch2onnx.md` tutorial. (#229)
- Update `setup.py` to support MMCLI. (#232)
### Bug Fixes
- Fix missing `cutmix_prob` in ViT configs. (#220)
- Fix backend for resize in ResNeXt configs. (#222)
## v0.10.0(1/4/2021)
- Support AutoAugmentation
- Add tutorials for installation and usage.
### New Features
- Add `Rotate` pipeline for data augmentation. (#167)
- Add `Invert` pipeline for data augmentation. (#168)
- Add `Color` pipeline for data augmentation. (#171)
- Add `Solarize` and `Posterize` pipeline for data augmentation. (#172)
- Support fp16 training. (#178)
- Add tutorials for installation and basic usage of MMClassification. (#176)
- Support `AutoAugmentation`, `AutoContrast`, `Equalize`, `Contrast`, `Brightness` and `Sharpness` pipelines for data augmentation. (#179)
### Improvements
- Support dynamic shape export to onnx. (#175)
- Release training configs and update model zoo for fp16 (#184)
- Use MMCV's EvalHook in MMClassification (#182)
### Bug Fixes
- Fix wrong naming in vgg config (#181)
## v0.9.0(1/3/2021)
- Implement mixup trick.
- Add a new tool to create TensorRT engine from ONNX, run inference and verify outputs in Python.
### New Features
- Implement mixup and provide configs of training ResNet50 using mixup. (#160)
- Add `Shear` pipeline for data augmentation. (#163)
- Add `Translate` pipeline for data augmentation. (#165)
- Add `tools/onnx2tensorrt.py` as a tool to create TensorRT engine from ONNX, run inference and verify outputs in Python. (#153)
### Improvements
- Add `--eval-options` in `tools/test.py` to support eval options override, matching the behavior of other open-mmlab projects. (#158)
- Support showing and saving painted results in `mmpretrain.apis.test` and `tools/test.py`, matching the behavior of other open-mmlab projects. (#162)
### Bug Fixes
- Fix configs for VGG, replace checkpoints converted from other repos with the ones trained by ourselves and upload the missing logs in the model zoo. (#161)
## v0.8.0(31/1/2021)
- Support multi-label task.
- Support more flexible metrics settings.
- Fix bugs.
### New Features
- Add evaluation metrics: mAP, CP, CR, CF1, OP, OR, OF1 for multi-label task. (#123)
- Add BCE loss for multi-label task. (#130)
- Add focal loss for multi-label task. (#131)
- Support PASCAL VOC 2007 dataset for multi-label task. (#134)
- Add asymmetric loss for multi-label task. (#132)
- Add analyze_results.py to select images for success/fail demonstration. (#142)
- Support new metric that calculates the total number of occurrences of each label. (#143)
- Support class-wise evaluation results. (#143)
- Add thresholds in eval_metrics. (#146)
- Add heads and a baseline config for multilabel task. (#145)
### Improvements
- Remove the models with 0 checkpoint and ignore the repeated papers when counting papers to gain more accurate model statistics. (#135)
- Add tags in README.md. (#137)
- Fix optional issues in docstring. (#138)
- Update stat.py to classify papers. (#139)
- Fix mismatched columns in README.md. (#150)
- Fix test.py to support more evaluation metrics. (#155)
### Bug Fixes
- Fix bug in VGG weight_init. (#140)
- Fix bug in 2 ResNet configs in which outdated heads were used. (#147)
- Fix bug of misordered height and width in `RandomCrop` and `RandomResizedCrop`. (#151)
- Fix missing `meta_keys` in `Collect`. (#149 & #152)
## v0.7.0(31/12/2020)
- Add more evaluation metrics.
- Fix bugs.
### New Features
- Remove installation of MMCV from requirements. (#90)
- Add 3 evaluation metrics: precision, recall and F-1 score. (#93)
- Allow config override during testing and inference with `--options`. (#91 & #96)
### Improvements
- Use `build_runner` to make runners more flexible. (#54)
- Support to get category ids in `BaseDataset`. (#72)
- Allow `CLASSES` override during `BaseDateset` initialization. (#85)
- Allow input image as ndarray during inference. (#87)
- Optimize MNIST config. (#98)
- Add config links in model zoo documentation. (#99)
- Use functions from MMCV to collect environment. (#103)
- Refactor config files so that they are now categorized by methods. (#116)
- Add README in config directory. (#117)
- Add model statistics. (#119)
- Refactor documentation in consistency with other MM repositories. (#126)
### Bug Fixes
- Add missing `CLASSES` argument to dataset wrappers. (#66)
- Fix slurm evaluation error during training. (#69)
- Resolve error caused by shape in `Accuracy`. (#104)
- Fix bug caused by extremely insufficient data in distributed sampler. (#108)
- Fix bug in `gpu_ids` in distributed training. (#107)
- Fix bug caused by extremely insufficient data in collect results during testing (#114)
## v0.6.0(11/10/2020)
- Support new methods: ResNeSt and VGG.
- Support new dataset: CIFAR10.
- Provide new tools for model inference and model conversion from PyTorch to ONNX.
### New Features
- Add model inference. (#16)
- Add pytorch2onnx. (#20)
- Add PIL backend for transform `Resize`. (#21)
- Add ResNeSt. (#25)
- Add VGG and its pretrained models. (#27)
- Add CIFAR10 configs and models. (#38)
- Add albumentations transforms. (#45)
- Visualize results on image demo. (#58)
### Improvements
- Replace urlretrieve with urlopen in dataset.utils. (#13)
- Resize image according to its short edge. (#22)
- Update ShuffleNet config. (#31)
- Update pre-trained models for shufflenet_v2, shufflenet_v1, se-resnet50, se-resnet101. (#33)
### Bug Fixes
- Fix init_weights in `shufflenet_v2.py`. (#29)
- Fix the parameter `size` in test_pipeline. (#30)
- Fix the parameter in cosine lr schedule. (#32)
- Fix the convert tools for mobilenet_v2. (#34)
- Fix crash in CenterCrop transform when image is greyscale (#40)
- Fix outdated configs. (#53)
# Frequently Asked Questions
We list some common troubles faced by many users and their corresponding
solutions here. Feel free to enrich the list if you find any frequent issues
and have ways to help others to solve them. If the contents here do not cover
your issue, please create an issue using the
[provided templates](https://github.com/open-mmlab/mmpretrain/issues/new/choose)
and make sure you fill in all required information in the template.
## Installation
- Compatibility issue between MMEngine, MMCV and MMPretrain
Compatible MMPretrain, MMEngine, and MMCV versions are shown below. Please
choose the correct versions of MMEngine and MMCV to avoid installation issues.
| MMPretrain version | MMEngine version | MMCV version |
| :----------------: | :---------------: | :--------------: |
| 1.2.0 (main) | mmengine >= 0.8.3 | mmcv >= 2.0.0 |
| 1.1.1 | mmengine >= 0.8.3 | mmcv >= 2.0.0 |
| 1.0.0 | mmengine >= 0.8.0 | mmcv >= 2.0.0 |
| 1.0.0rc8 | mmengine >= 0.7.1 | mmcv >= 2.0.0rc4 |
| 1.0.0rc7 | mmengine >= 0.5.0 | mmcv >= 2.0.0rc4 |
```{note}
Since the `dev` branch is under frequent development, the MMEngine and MMCV
version dependency may be inaccurate. If you encounter problems when using
the `dev` branch, please try to update MMEngine and MMCV to the latest version.
```
- Using Albumentations
If you would like to use `albumentations`, we suggest using `pip install -r requirements/albu.txt` or
`pip install -U albumentations --no-binary qudida,albumentations`.
If you simply use `pip install albumentations>=0.3.2`, it will install `opencv-python-headless` simultaneously
(even though you have already installed `opencv-python`). Please refer to the
[official documentation](https://albumentations.ai/docs/getting_started/installation/#note-on-opencv-dependencies)
for details.
## General Questions
### Do I need to reinstall mmpretrain after some code modifications?
If you follow [the best practice](../get_started.md#best-practices) and install mmpretrain from source,
any local modifications made to the code will take effect without
reinstallation.
### How to develop with multiple MMPretrain versions?
Generally speaking, we recommend to use different virtual environments to
manage MMPretrain in different working directories. However, you
can also use the same environment to develop MMPretrain in different
folders, like mmpretrain-0.21 and mmpretrain-0.23. When you run the train or test shell scripts,
they will adopt the mmpretrain package in the current folder. When you run other Python
scripts, you can also add `` PYTHONPATH=`pwd` `` at the beginning of your command
to use the package in the current folder.
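For example, here is a toy demonstration of that mechanism with a throwaway package under `/tmp` (not mmpretrain itself): `PYTHONPATH` makes Python pick up the package from the folder you point at.

```shell
# Create a minimal package outside the current directory...
mkdir -p /tmp/pkgdemo/mypkg
echo "VERSION = 'local'" > /tmp/pkgdemo/mypkg/__init__.py
# ...then import it by pointing PYTHONPATH at its parent folder.
PYTHONPATH=/tmp/pkgdemo python3 -c "import mypkg; print(mypkg.VERSION)"
```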
Conversely, to use the default MMPretrain installed in the environment
rather than the one you are working with, you can remove the following line
in those shell scripts:
```shell
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH
```
### What's the relationship between the `load_from` and the `init_cfg`?
- `load_from`: If `resume=False`, it loads only the model weights, which is mainly used to load trained models;
  if `resume=True`, it loads the model weights, optimizer state, and other training information, which is
  mainly used to resume interrupted training.
- `init_cfg`: You can also specify `init_cfg=dict(type="Pretrained", checkpoint=xxx)` to load a checkpoint. This
  means the weights are loaded during model weight initialization, i.e., only once at the
  beginning of training. It's mainly used to fine-tune a pre-trained model; you can set it in
  the backbone config and use the `prefix` field to load only the backbone weights, for example:
```python
model = dict(
backbone=dict(
type='ResNet',
depth=50,
init_cfg=dict(type='Pretrained', checkpoint=xxx, prefix='backbone'),
)
...
)
```
See the [Fine-tune Models](./finetune_custom_dataset.md) for more details about fine-tuning.
### What's the difference between `default_hooks` and `custom_hooks`?
Almost no difference. Usually, the `default_hooks` field is used to specify the hooks that will be used in almost
all experiments, and the `custom_hooks` field is used in only some experiments.
Another difference is the `default_hooks` is a dict while the `custom_hooks` is a list, please don't be
confused.
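For illustration, the two fields might look like this in a config file; the specific hooks chosen here are just an assumed example, not a required setup:

```python
# default_hooks is a dict: one entry per conventional hook slot,
# used in almost all experiments.
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=100),
    checkpoint=dict(type='CheckpointHook', interval=1),
)

# custom_hooks is a list: extra hooks enabled only in some experiments.
custom_hooks = [
    dict(type='EMAHook', momentum=0.0002),
]
```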
### During training, I got no training log, what's the reason?
If your training dataset is small while the batch size is large, our default log interval may be too large to
record your training log.
You can shrink the log interval and try again, like:
```python
default_hooks = dict(
...
logger=dict(type='LoggerHook', interval=10),
...
)
```
### How to train with other datasets, like my own dataset or COCO?
We provide [specific examples](./pretrain_custom_dataset.md) to show how to train with other datasets.
# How to Fine-tune with Custom Dataset
In most scenarios, we want to apply a pre-trained model rather than train one from scratch, since training from scratch may introduce extra uncertainty about model convergence and is time-consuming.
A common practice is to start from a model trained on a large dataset, which hopefully provides better knowledge than a random initialization. Roughly speaking, this process is known as fine-tuning.
Models pre-trained on the ImageNet dataset have been demonstrated to be effective for other datasets and other downstream tasks.
Hence, this tutorial provides instructions for users to use the models provided in the [Model Zoo](../modelzoo_statistics.md) for other datasets to obtain better performance.
In this tutorial, we provide a practice example and some tips on how to fine-tune a model on your own dataset.
## Step-1: Prepare your dataset
Prepare your dataset following [Prepare Dataset](../user_guides/dataset_prepare.md).
And the root folder of the dataset can be like `data/custom_dataset/`.
Here, we assume you want to do supervised image-classification training, and use the sub-folder format
`CustomDataset` to organize your dataset as:
```text
data/custom_dataset/
├── train
│   ├── class_x
│   │   ├── x_1.png
│   │   ├── x_2.png
│   │   ├── x_3.png
│   │   └── ...
│   ├── class_y
│   └── ...
└── test
    ├── class_x
    │   ├── test_x_1.png
    │   ├── test_x_2.png
    │   ├── test_x_3.png
    │   └── ...
    ├── class_y
    └── ...
```
## Step-2: Choose one config as template
Here, we would like to use `configs/resnet/resnet50_8xb32_in1k.py` as the example. We first copy this config
file to the same folder and rename it as `resnet50_8xb32-ft_custom.py`.
```{tip}
As a convention, the last field of the config name is the dataset, e.g., `in1k` for the ImageNet dataset and `coco` for the COCO dataset.
```
The content of this config is:
```python
_base_ = [
'../_base_/models/resnet50.py', # model settings
'../_base_/datasets/imagenet_bs32.py', # data settings
'../_base_/schedules/imagenet_bs256.py', # schedule settings
'../_base_/default_runtime.py', # runtime settings
]
```
## Step-3: Edit the model settings
When fine-tuning a model, usually we want to load the pre-trained backbone
weights and train a new classification head from scratch.
To load the pre-trained backbone, we need to change the initialization config
of the backbone and use `Pretrained` initialization function. Besides, in the
`init_cfg`, we use `prefix='backbone'` to tell the initialization function
the prefix of the submodule that needs to be loaded in the checkpoint.
For example, `backbone` here means to load the backbone submodule. Here we
use an online checkpoint, which will be downloaded automatically during training;
you can also download the model manually and use a local path.
Then we need to modify the head according to the number of classes in the new
dataset by simply changing `num_classes` in the head.
When the new dataset is small and shares a domain with the pre-trained dataset,
we may want to freeze the parameters of the first several stages of the
backbone, which helps the network keep its ability to extract the low-level
information learned from the pre-trained model. In MMPretrain, you can simply
specify how many stages to freeze with the `frozen_stages` argument. For example, to
freeze the parameters of the first two stages, just use the following config:
```{note}
Not all backbones support the `frozen_stages` argument by now. Please check
[the docs](https://mmpretrain.readthedocs.io/en/latest/api.html#module-mmpretrain.models.backbones)
to confirm if your backbone supports it.
```
```python
_base_ = [
'../_base_/models/resnet50.py', # model settings
'../_base_/datasets/imagenet_bs32.py', # data settings
'../_base_/schedules/imagenet_bs256.py', # schedule settings
'../_base_/default_runtime.py', # runtime settings
]
# >>>>>>>>>>>>>>> Override model settings here >>>>>>>>>>>>>>>>>>>
model = dict(
backbone=dict(
frozen_stages=2,
init_cfg=dict(
type='Pretrained',
checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
prefix='backbone',
)),
head=dict(num_classes=10),
)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
```
```{tip}
Here we only need to set the part of the configs we want to modify, because the
inherited configs will be merged to produce the entire config.
```
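To illustrate the merging behavior, here is a toy sketch of recursive merging (not the actual MMEngine implementation): dict fields merge key by key, while other values are simply replaced.

```python
def merge(base, override):
    """Recursively merge `override` into `base` (toy version)."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)  # dicts merge recursively
        else:
            out[key] = value  # other values replace the base value
    return out

base = dict(backbone=dict(type='ResNet', depth=50), head=dict(num_classes=1000))
override = dict(head=dict(num_classes=10))
merged = merge(base, override)
print(merged['backbone'])  # inherited settings stay untouched
print(merged['head'])      # only num_classes is overridden
```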
## Step-4: Edit the dataset settings
To fine-tune on a new dataset, we need to override some dataset settings, like the type of dataset, the data
pipeline, etc.
```python
_base_ = [
'../_base_/models/resnet50.py', # model settings
'../_base_/datasets/imagenet_bs32.py', # data settings
'../_base_/schedules/imagenet_bs256.py', # schedule settings
'../_base_/default_runtime.py', # runtime settings
]
# model settings
model = dict(
backbone=dict(
frozen_stages=2,
init_cfg=dict(
type='Pretrained',
checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
prefix='backbone',
)),
head=dict(num_classes=10),
)
# >>>>>>>>>>>>>>> Override data settings here >>>>>>>>>>>>>>>>>>>
data_root = 'data/custom_dataset'
train_dataloader = dict(
dataset=dict(
type='CustomDataset',
data_root=data_root,
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='train',
))
val_dataloader = dict(
dataset=dict(
type='CustomDataset',
data_root=data_root,
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='test',
))
test_dataloader = val_dataloader
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
```
## Step-5: Edit the schedule settings (optional)
The fine-tuning hyper-parameters differ from the default schedule: fine-tuning
usually requires a smaller learning rate and a scheduler that decays over fewer epochs.
```python
_base_ = [
'../_base_/models/resnet50.py', # model settings
'../_base_/datasets/imagenet_bs32.py', # data settings
'../_base_/schedules/imagenet_bs256.py', # schedule settings
'../_base_/default_runtime.py', # runtime settings
]
# model settings
model = dict(
backbone=dict(
frozen_stages=2,
init_cfg=dict(
type='Pretrained',
checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
prefix='backbone',
)),
head=dict(num_classes=10),
)
# data settings
data_root = 'data/custom_dataset'
train_dataloader = dict(
dataset=dict(
type='CustomDataset',
data_root=data_root,
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='train',
))
val_dataloader = dict(
dataset=dict(
type='CustomDataset',
data_root=data_root,
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='test',
))
test_dataloader = val_dataloader
# >>>>>>>>>>>>>>> Override schedule settings here >>>>>>>>>>>>>>>>>>>
# optimizer hyper-parameters
optim_wrapper = dict(
optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
# learning policy
param_scheduler = dict(
type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
```
```{tip}
Refer to [Learn about Configs](../user_guides/config.md) for more detailed configurations.
```
## Start Training
Now we have finished the fine-tuning config file, as follows:
```python
_base_ = [
'../_base_/models/resnet50.py', # model settings
'../_base_/datasets/imagenet_bs32.py', # data settings
'../_base_/schedules/imagenet_bs256.py', # schedule settings
'../_base_/default_runtime.py', # runtime settings
]
# model settings
model = dict(
backbone=dict(
frozen_stages=2,
init_cfg=dict(
type='Pretrained',
checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
prefix='backbone',
)),
head=dict(num_classes=10),
)
# data settings
data_root = 'data/custom_dataset'
train_dataloader = dict(
dataset=dict(
type='CustomDataset',
data_root=data_root,
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='train',
))
val_dataloader = dict(
dataset=dict(
type='CustomDataset',
data_root=data_root,
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='test',
))
test_dataloader = val_dataloader
# schedule settings
optim_wrapper = dict(
optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
param_scheduler = dict(
type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
```
To train the model with 8 GPUs, use the following command:
```shell
bash tools/dist_train.sh configs/resnet/resnet50_8xb32-ft_custom.py 8
```
Alternatively, you can train the model with a single GPU:
```shell
python tools/train.py configs/resnet/resnet50_8xb32-ft_custom.py
```
But wait, an important config needs to be changed when using one GPU. We need to
change the dataset config as follows:
```python
data_root = 'data/custom_dataset'
train_dataloader = dict(
batch_size=256,
dataset=dict(
type='CustomDataset',
data_root=data_root,
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='train',
))
val_dataloader = dict(
dataset=dict(
type='CustomDataset',
data_root=data_root,
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='test',
))
test_dataloader = val_dataloader
```
This is because our training schedule assumes a batch size of 256. With 8 GPUs,
the `batch_size=32` setting in the base config file applies to every GPU, so the total batch
size is 256. But with one GPU, you need to change it to 256 manually to
match the training schedule.
However, a larger batch size requires more GPU memory. Here are several simple tricks to save GPU
memory:
1. Enable Automatic-Mixed-Precision training.
```shell
python tools/train.py configs/resnet/resnet50_8xb32-ft_custom.py --amp
```
2. Use a smaller batch size, like `batch_size=32` instead of 256, and enable the auto learning rate scaling.
```shell
python tools/train.py configs/resnet/resnet50_8xb32-ft_custom.py --auto-scale-lr
```
The auto learning rate scaling will adjust the learning rate according to the actual batch size and
`auto_scale_lr.base_batch_size` (you can find it in the base config
`configs/_base_/schedules/imagenet_bs256.py`).
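The scaling follows the linear rule `lr = base_lr * actual_batch_size / base_batch_size`, which can be sketched as:

```python
base_lr = 0.01          # learning rate tuned for the base batch size
base_batch_size = 256   # auto_scale_lr.base_batch_size in the base config
actual_batch_size = 32  # e.g. one GPU with batch_size=32

scaled_lr = base_lr * actual_batch_size / base_batch_size
print(scaled_lr)  # 0.00125
```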
```{note}
Most of these tricks may influence the training performance slightly.
```
### Apply pre-trained model with command line
If you don't want to modify the configs, you could use `--cfg-options` to add your pre-trained model path to `init_cfg`.
For example, the command below will also load the pre-trained model.
```shell
bash tools/dist_train.sh configs/resnet/resnet50_8xb32-ft_custom.py 8 \
--cfg-options model.backbone.init_cfg.type='Pretrained' \
model.backbone.init_cfg.checkpoint='https://download.openmmlab.com/mmselfsup/1.x/mocov3/mocov3_resnet50_8xb512-amp-coslr-100e_in1k/mocov3_resnet50_8xb512-amp-coslr-100e_in1k_20220927-f1144efa.pth' \
    model.backbone.init_cfg.prefix='backbone'
```
# How to Pretrain with Custom Dataset
In this tutorial, we provide a practice example and some tips on how to train on your own dataset.
In MMPretrain, we support `CustomDataset` (similar to `ImageFolder` in `torchvision`), which can read the images within the specified folder directly. You only need to prepare the path information of the custom dataset and edit the config.
## Step-1: Prepare your dataset
Prepare your dataset following [Prepare Dataset](../user_guides/dataset_prepare.md).
The root folder of the dataset can be, for example, `data/custom_dataset/`.
Here, we assume you want to do unsupervised training, and use the sub-folder format `CustomDataset` to
organize your dataset as:
```text
data/custom_dataset/
├── sample1.png
├── sample2.png
├── sample3.png
├── sample4.png
└── ...
```
## Step-2: Choose one config as template
Here, we would like to use `configs/mae/mae_vit-base-p16_8xb512-amp-coslr-300e_in1k.py` as the example. We
first copy this config file to the same folder and rename it as
`mae_vit-base-p16_8xb512-amp-coslr-300e_custom.py`.
```{tip}
As a convention, the last field of the config name is the dataset, e.g., `in1k` for the ImageNet dataset and `coco` for the COCO dataset.
```
The content of this config is:
```python
_base_ = [
'../_base_/models/mae_vit-base-p16.py',
'../_base_/datasets/imagenet_bs512_mae.py',
'../_base_/default_runtime.py',
]
# optimizer wrapper
optim_wrapper = dict(
type='AmpOptimWrapper',
loss_scale='dynamic',
optimizer=dict(
type='AdamW',
lr=1.5e-4 * 4096 / 256,
betas=(0.9, 0.95),
weight_decay=0.05),
paramwise_cfg=dict(
custom_keys={
'ln': dict(decay_mult=0.0),
'bias': dict(decay_mult=0.0),
'pos_embed': dict(decay_mult=0.),
'mask_token': dict(decay_mult=0.),
'cls_token': dict(decay_mult=0.)
}))
# learning rate scheduler
param_scheduler = [
dict(
type='LinearLR',
start_factor=0.0001,
by_epoch=True,
begin=0,
end=40,
convert_to_iter_based=True),
dict(
type='CosineAnnealingLR',
T_max=260,
by_epoch=True,
begin=40,
end=300,
convert_to_iter_based=True)
]
# runtime settings
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=300)
default_hooks = dict(
# only keeps the latest 3 checkpoints
checkpoint=dict(type='CheckpointHook', interval=1, max_keep_ckpts=3))
randomness = dict(seed=0, diff_rank_seed=True)
# auto resume
resume = True
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
auto_scale_lr = dict(base_batch_size=4096)
```
## Step-3: Edit the dataset related config
- Override the `type` of the dataset settings as `'CustomDataset'`.
- Override the `data_root` of the dataset settings as `data/custom_dataset`.
- Override the `ann_file` of the dataset settings as an empty string, since we assume you are using the sub-folder
format `CustomDataset`.
- Override the `data_prefix` of the dataset settings as an empty string, since we are using the whole dataset under
the `data_root` and you don't need to split samples into different subsets or set the `data_prefix`.
The modified config will be like:
```python
_base_ = [
'../_base_/models/mae_vit-base-p16.py',
'../_base_/datasets/imagenet_bs512_mae.py',
'../_base_/default_runtime.py',
]
# >>>>>>>>>>>>>>> Override dataset settings here >>>>>>>>>>>>>>>>>>>
train_dataloader = dict(
dataset=dict(
type='CustomDataset',
data_root='data/custom_dataset/',
ann_file='', # We assume you are using the sub-folder format without ann_file
data_prefix='', # The `data_root` is the data_prefix directly.
with_label=False,
)
)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# optimizer wrapper
optim_wrapper = dict(
type='AmpOptimWrapper',
loss_scale='dynamic',
optimizer=dict(
type='AdamW',
lr=1.5e-4 * 4096 / 256,
betas=(0.9, 0.95),
weight_decay=0.05),
paramwise_cfg=dict(
custom_keys={
'ln': dict(decay_mult=0.0),
'bias': dict(decay_mult=0.0),
'pos_embed': dict(decay_mult=0.),
'mask_token': dict(decay_mult=0.),
'cls_token': dict(decay_mult=0.)
}))
# learning rate scheduler
param_scheduler = [
dict(
type='LinearLR',
start_factor=0.0001,
by_epoch=True,
begin=0,
end=40,
convert_to_iter_based=True),
dict(
type='CosineAnnealingLR',
T_max=260,
by_epoch=True,
begin=40,
end=300,
convert_to_iter_based=True)
]
# runtime settings
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=300)
default_hooks = dict(
# only keeps the latest 3 checkpoints
checkpoint=dict(type='CheckpointHook', interval=1, max_keep_ckpts=3))
randomness = dict(seed=0, diff_rank_seed=True)
# auto resume
resume = True
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
auto_scale_lr = dict(base_batch_size=4096)
```
By using the edited config file, you are able to train a self-supervised model with MAE algorithm on the custom dataset.
## Another example: Train MAE on COCO Dataset
```{note}
You need to install MMDetection to use `mmdet.CocoDataset`. Please follow this [documentation](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/en/get_started.md).
```
Following the same idea, we also present an example of how to train MAE on the COCO dataset. The edited file will look like this:
```python
_base_ = [
'../_base_/models/mae_vit-base-p16.py',
'../_base_/datasets/imagenet_mae.py',
'../_base_/default_runtime.py',
]
# >>>>>>>>>>>>>>> Override dataset settings here >>>>>>>>>>>>>>>>>>>
train_dataloader = dict(
dataset=dict(
type='mmdet.CocoDataset',
data_root='data/coco/',
ann_file='annotations/instances_train2017.json', # Only for loading images, and the labels won't be used.
data_prefix=dict(img='train2017/'),
)
)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# optimizer wrapper
optim_wrapper = dict(
type='AmpOptimWrapper',
loss_scale='dynamic',
optimizer=dict(
type='AdamW',
lr=1.5e-4 * 4096 / 256,
betas=(0.9, 0.95),
weight_decay=0.05),
paramwise_cfg=dict(
custom_keys={
'ln': dict(decay_mult=0.0),
'bias': dict(decay_mult=0.0),
'pos_embed': dict(decay_mult=0.),
'mask_token': dict(decay_mult=0.),
'cls_token': dict(decay_mult=0.)
}))
# learning rate scheduler
param_scheduler = [
dict(
type='LinearLR',
start_factor=0.0001,
by_epoch=True,
begin=0,
end=40,
convert_to_iter_based=True),
dict(
type='CosineAnnealingLR',
T_max=260,
by_epoch=True,
begin=40,
end=300,
convert_to_iter_based=True)
]
# runtime settings
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=300)
default_hooks = dict(
# only keeps the latest 3 checkpoints
checkpoint=dict(type='CheckpointHook', interval=1, max_keep_ckpts=3))
randomness = dict(seed=0, diff_rank_seed=True)
# auto resume
resume = True
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
auto_scale_lr = dict(base_batch_size=4096)
```
# Projects based on MMPretrain
There are many projects built upon MMPretrain (previously MMClassification).
We list some of them as examples of how to extend MMPretrain for your own projects.
As the page might not be complete, please feel free to create a PR to update it.
## Projects as an extension
- [OpenMixup](https://github.com/Westlake-AI/openmixup): an open-source toolbox for supervised, self-, and semi-supervised visual representation learning with mixup based on PyTorch, especially for mixup-related methods.
- [AI Power](https://github.com/ykk648/AI_power): AI toolbox and pretrain models.
- [OpenBioSeq](https://github.com/Westlake-AI/OpenBioSeq): an open-source supervised and self-supervised bio-sequence representation learning toolbox based on PyTorch.
## Projects of papers
There are also projects released with papers.
Some of the papers are published in top-tier conferences (CVPR, ICCV, and ECCV), while the others are also highly influential.
To make this list a useful reference for the community to develop and compare new image classification algorithms, we list them in the time order of top-tier conferences.
Methods already supported and maintained by MMPretrain are not listed.
- Involution: Inverting the Inherence of Convolution for Visual Recognition, CVPR21. [[paper]](https://arxiv.org/abs/2103.06255)[[github]](https://github.com/d-li14/involution)
- Convolution of Convolution: Let Kernels Spatially Collaborate, CVPR22. [[paper]](https://openaccess.thecvf.com/content/CVPR2022/papers/Zhao_Convolution_of_Convolution_Let_Kernels_Spatially_Collaborate_CVPR_2022_paper.pdf)[[github]](https://github.com/Genera1Z/ConvolutionOfConvolution)
#!/usr/bin/env python
import re
import warnings
from collections import defaultdict
from pathlib import Path
from modelindex.load_model_index import load
from modelindex.models.Result import Result
from tabulate import tabulate
MMPT_ROOT = Path(__file__).absolute().parents[2]
PAPERS_ROOT = Path('papers') # Path to save generated paper pages.
GITHUB_PREFIX = 'https://github.com/open-mmlab/mmpretrain/blob/main/'
MODELZOO_TEMPLATE = """\
# Model Zoo Summary
In this page, we list [all algorithms](#all-supported-algorithms) we support. You can click the link to jump to the corresponding model pages.
And we also list all checkpoints for different tasks we provide. You can sort or search checkpoints in the table and click the corresponding link to model pages for more details.
## All supported algorithms
* Number of papers: {num_papers}
{type_msg}
* Number of checkpoints: {num_ckpts}
{paper_msg}
""" # noqa: E501
METRIC_ALIAS = {
'Top 1 Accuracy': 'Top-1 (%)',
'Top 5 Accuracy': 'Top-5 (%)',
}
model_index = load(str(MMPT_ROOT / 'model-index.yml'))
def build_collections(model_index):
col_by_name = {}
for col in model_index.collections:
setattr(col, 'models', [])
col_by_name[col.name] = col
for model in model_index.models:
col = col_by_name[model.in_collection]
col.models.append(model)
setattr(model, 'collection', col)
if model.results is None:
setattr(model, 'tasks', [])
else:
setattr(model, 'tasks', [result.task for result in model.results])
build_collections(model_index)
def count_papers(collections):
total_num_ckpts = 0
type_count = defaultdict(int)
paper_msgs = []
for collection in collections:
with open(MMPT_ROOT / collection.readme) as f:
readme = f.read()
ckpts = set(x.lower().strip()
for x in re.findall(r'\[model\]\((https?.*)\)', readme))
total_num_ckpts += len(ckpts)
title = collection.paper['Title']
papertype = collection.data.get('type', 'Algorithm')
type_count[papertype] += 1
readme = PAPERS_ROOT / Path(
collection.filepath).parent.with_suffix('.md').name
paper_msgs.append(
f'\t- [{papertype}] [{title}]({readme}) ({len(ckpts)} ckpts)')
type_msg = '\n'.join(
[f'\t- {type_}: {count}' for type_, count in type_count.items()])
paper_msg = '\n'.join(paper_msgs)
modelzoo = MODELZOO_TEMPLATE.format(
num_papers=len(collections),
num_ckpts=total_num_ckpts,
type_msg=type_msg,
paper_msg=paper_msg,
)
with open('modelzoo_statistics.md', 'w') as f:
f.write(modelzoo)
count_papers(model_index.collections)
def generate_paper_page(collection):
PAPERS_ROOT.mkdir(exist_ok=True)
# Write a copy of README
with open(MMPT_ROOT / collection.readme) as f:
readme = f.read()
folder = Path(collection.filepath).parent
copy = PAPERS_ROOT / folder.with_suffix('.md').name
def replace_link(matchobj):
# Replace relative link to GitHub link.
name = matchobj.group(1)
link = matchobj.group(2)
if not link.startswith('http'):
assert (folder / link).exists(), \
f'Link not found:\n{collection.readme}: {link}'
rel_link = (folder / link).absolute().relative_to(MMPT_ROOT)
link = GITHUB_PREFIX + str(rel_link)
return f'[{name}]({link})'
content = re.sub(r'\[([^\]]+)\]\(([^)]+)\)', replace_link, readme)
content = f'---\ngithub_page: /{collection.readme}\n---\n' + content
def make_tabs(matchobj):
"""modify the format from emphasis black symbol to tabs."""
content = matchobj.group()
content = content.replace('<!-- [TABS-BEGIN] -->', '')
content = content.replace('<!-- [TABS-END] -->', '')
# split the content by "**{Tab-Name}**""
splits = re.split(r'^\*\*(.*)\*\*$', content, flags=re.M)[1:]
tabs_list = []
for title, tab_content in zip(splits[::2], splits[1::2]):
title = ':::{tab} ' + title + '\n'
tab_content = tab_content.strip() + '\n:::\n'
tabs_list.append(title + tab_content)
return '::::{tabs}\n' + ''.join(tabs_list) + '::::'
if '<!-- [TABS-BEGIN] -->' in content and '<!-- [TABS-END] -->' in content:
        # Make the TABS block selective tabs
try:
pattern = r'<!-- \[TABS-BEGIN\] -->([\d\D]*?)<!-- \[TABS-END\] -->'
content = re.sub(pattern, make_tabs, content)
except Exception as e:
            warnings.warn(f'Cannot parse the TABS, got an error: {e}')
with open(copy, 'w') as copy_file:
copy_file.write(content)
for collection in model_index.collections:
generate_paper_page(collection)
def scatter_results(models):
model_result_pairs = []
for model in models:
if model.results is None:
result = Result(task=None, dataset=None, metrics={})
model_result_pairs.append((model, result))
else:
for result in model.results:
model_result_pairs.append((model, result))
return model_result_pairs
def generate_summary_table(task, model_result_pairs, title=None):
metrics = set()
for model, result in model_result_pairs:
if result.task == task:
metrics = metrics.union(result.metrics.keys())
metrics = sorted(list(metrics))
rows = []
for model, result in model_result_pairs:
if result.task != task:
continue
name = model.name
params = f'{model.metadata.parameters / 1e6:.2f}' # Params
if model.metadata.flops is not None:
flops = f'{model.metadata.flops / 1e9:.2f}' # Flops
else:
flops = None
readme = Path(model.collection.filepath).parent.with_suffix('.md').name
page = f'[link]({PAPERS_ROOT / readme})'
model_metrics = []
for metric in metrics:
model_metrics.append(str(result.metrics.get(metric, '')))
rows.append([name, params, flops, *model_metrics, page])
with open('modelzoo_statistics.md', 'a') as f:
if title is not None:
f.write(f'\n{title}')
f.write("""\n```{table}\n:class: model-summary\n""")
header = [
'Model',
'Params (M)',
'Flops (G)',
*[METRIC_ALIAS.get(metric, metric) for metric in metrics],
'Readme',
]
table_cfg = dict(
tablefmt='pipe',
floatfmt='.2f',
numalign='right',
stralign='center')
f.write(tabulate(rows, header, **table_cfg))
f.write('\n```\n')
def generate_dataset_wise_table(task, model_result_pairs, title=None):
dataset_rows = defaultdict(list)
for model, result in model_result_pairs:
if result.task == task:
dataset_rows[result.dataset].append((model, result))
if title is not None:
with open('modelzoo_statistics.md', 'a') as f:
f.write(f'\n{title}')
for dataset, pairs in dataset_rows.items():
generate_summary_table(task, pairs, title=f'### {dataset}')
model_result_pairs = scatter_results(model_index.models)
# Generate Pretrain Summary
generate_summary_table(
task=None,
model_result_pairs=model_result_pairs,
title='## Pretrained Models',
)
# Generate Image Classification Summary
generate_dataset_wise_table(
task='Image Classification',
model_result_pairs=model_result_pairs,
title='## Image Classification',
)
# Generate Multi-Label Classification Summary
generate_dataset_wise_table(
task='Multi-Label Classification',
model_result_pairs=model_result_pairs,
title='## Multi-Label Classification',
)
# Generate Image Retrieval Summary
generate_dataset_wise_table(
task='Image Retrieval',
model_result_pairs=model_result_pairs,
title='## Image Retrieval',
)
# Class Activation Map (CAM) Visualization
## Introduction of the CAM visualization tool
MMPretrain provides the `tools/visualization/vis_cam.py` tool to visualize class activation maps. Please use the `pip install "grad-cam>=1.3.6"` command to install [pytorch-grad-cam](https://github.com/jacobgil/pytorch-grad-cam).
The supported methods are as follows:
| Method | What it does |
| ------------ | ---------------------------------------------------------------------------------------------------------------------------- |
| GradCAM | Weight the 2D activations by the average gradient |
| GradCAM++ | Like GradCAM but uses second order gradients |
| XGradCAM | Like GradCAM but scale the gradients by the normalized activations |
| EigenCAM     | Takes the first principal component of the 2D activations (no class discrimination, but seems to give great results)         |
| EigenGradCAM | Like EigenCAM but with class discrimination: first principal component of Activations\*Grad. Looks like GradCAM, but cleaner |
| LayerCAM | Spatially weight the activations by positive gradients. Works better especially in lower layers |
More CAM methods supported by newer versions of `pytorch-grad-cam` may also work, but we haven't verified their availability.
**Command**
```bash
python tools/visualization/vis_cam.py \
${IMG} \
${CONFIG_FILE} \
${CHECKPOINT} \
[--target-layers ${TARGET-LAYERS}] \
[--preview-model] \
[--method ${METHOD}] \
[--target-category ${TARGET-CATEGORY}] \
[--save-path ${SAVE_PATH}] \
[--vit-like] \
    [--num-extra-tokens ${NUM-EXTRA-TOKENS}] \
    [--aug-smooth] \
    [--eigen-smooth] \
[--device ${DEVICE}] \
[--cfg-options ${CFG-OPTIONS}]
```
**Description of all arguments**
- `img`: The target picture path.
- `config`: The path of the model config file.
- `checkpoint`: The path of the checkpoint.
- `--target-layers`: The target layers to get activation maps, one or more network layers can be specified. If not set, use the norm layer of the last block.
- `--preview-model`: Whether to print all network layer names in the model.
- `--method`: Visualization method, supports `GradCAM`, `GradCAM++`, `XGradCAM`, `EigenCAM`, `EigenGradCAM`, `LayerCAM`, all case-insensitive. Defaults to `GradCAM`.
- `--target-category`: Target category, if not set, use the category detected by the given model.
- `--eigen-smooth`: Whether to use the principal component to reduce noise.
- `--aug-smooth`: Whether to use test-time augmentation (TTA) to get the CAM.
- `--save-path`: The path to save the CAM visualization image. If not set, the CAM image will not be saved.
- `--vit-like`: Whether the network is a ViT-like network.
- `--num-extra-tokens`: The number of extra tokens in ViT-like backbones. If not set, use the `num_extra_tokens` attribute of the backbone.
- `--device`: The computing device used. Defaults to 'cpu'.
- `--cfg-options`: Modifications to the configuration file, refer to [Learn about Configs](../user_guides/config.md).
```{note}
The argument `--preview-model` prints all network layer names in the given model. It is helpful if you don't know the model's layer names when setting `--target-layers`.
```
## How to visualize the CAM of CNN (ResNet-50)
Here are some examples of `target-layers` in ResNet-50, which can be any module or layer:
- `'backbone.layer4'` means the output of the fourth ResLayer.
- `'backbone.layer4.2'` means the output of the third BottleNeck block in the fourth ResLayer.
- `'backbone.layer4.2.conv1'` means the output of the `conv1` layer in the above BottleNeck block.
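These dotted names are standard PyTorch module paths, so you can list the valid names of any model with `named_modules` (the toy model below is hypothetical, only to show the naming scheme):

```python
import torch.nn as nn

# A toy model mimicking the backbone/head layout (hypothetical names).
class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, 3),
            nn.Sequential(nn.Conv2d(8, 8, 3), nn.BatchNorm2d(8)),
        )
        self.head = nn.Linear(8, 10)

names = [name for name, _ in ToyNet().named_modules() if name]
print(names)
# ['backbone', 'backbone.0', 'backbone.1', 'backbone.1.0', 'backbone.1.1', 'head']
```

Any of these names can be passed to `--target-layers`; `--preview-model` prints the same information for the real model.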
1. Use different methods to visualize the CAM for `ResNet50`. Here the `target-category` is the result predicted by the given checkpoint, and the default `target-layers` is used.
```shell
python tools/visualization/vis_cam.py \
demo/bird.JPEG \
configs/resnet/resnet50_8xb32_in1k.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth \
--method GradCAM
# GradCAM++, XGradCAM, EigenCAM, EigenGradCAM, LayerCAM
```
| Image | GradCAM | GradCAM++ | EigenGradCAM | LayerCAM |
| ------------------------------------ | --------------------------------------- | ----------------------------------------- | -------------------------------------------- | ---------------------------------------- |
| <div align=center><img src='https://user-images.githubusercontent.com/18586273/144429496-628d3fb3-1f6e-41ff-aa5c-1b08c60c32a9.JPEG' height="auto" width="160" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/147065002-f1c86516-38b2-47ba-90c1-e00b49556c70.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/147065119-82581fa1-3414-4d6c-a849-804e1503c74b.jpg' height="auto" width="150"></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/147065096-75a6a2c1-6c57-4789-ad64-ebe5e38765f4.jpg' height="auto" width="150"></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/147065129-814d20fb-98be-4106-8c5e-420adcc85295.jpg' height="auto" width="150"></div> |
2. Use different `target-category` values to get the CAM for the same picture. In the `ImageNet` dataset, category 238 is 'Greater Swiss Mountain dog' and category 281 is 'tabby, tabby cat'.
```shell
python tools/visualization/vis_cam.py \
demo/cat-dog.png configs/resnet/resnet50_8xb32_in1k.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth \
--target-layers 'backbone.layer4.2' \
--method GradCAM \
--target-category 238
# --target-category 281
```
| Category | Image | GradCAM | XGradCAM | LayerCAM |
| -------- | ---------------------------------------------- | ------------------------------------------------ | ------------------------------------------------- | ------------------------------------------------- |
| Dog | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144429526-f27f4cce-89b9-4117-bfe6-55c2ca7eaba6.png' height="auto" width="165" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144433562-968a57bc-17d9-413e-810e-f91e334d648a.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144433853-319f3a8f-95f2-446d-b84f-3028daca5378.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144433937-daef5a69-fd70-428f-98a3-5e7747f4bb88.jpg' height="auto" width="150" ></div> |
| Cat | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144429526-f27f4cce-89b9-4117-bfe6-55c2ca7eaba6.png' height="auto" width="165" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144434518-867ae32a-1cb5-4dbd-b1b9-5e375e94ea48.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144434603-0a2fd9ec-c02e-4e6c-a17b-64c234808c56.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144434623-b4432cc2-c663-4b97-aed3-583d9d3743e6.jpg' height="auto" width="150" ></div> |
3. Use `--eigen-smooth` and `--aug-smooth` to improve visual effects.
```shell
python tools/visualization/vis_cam.py \
demo/dog.jpg \
configs/mobilenet_v3/mobilenet-v3-large_8xb128_in1k.py \
https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_large-3ea3c186.pth \
--target-layers 'backbone.layer16' \
--method LayerCAM \
--eigen-smooth --aug-smooth
```
| Image | LayerCAM | eigen-smooth | aug-smooth | eigen&aug |
| ------------------------------------ | --------------------------------------- | ------------------------------------------- | ----------------------------------------- | ----------------------------------------- |
| <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557492-98ac5ce0-61f9-4da9-8ea7-396d0b6a20fa.jpg' height="auto" width="160"></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557541-a4cf7d86-7267-46f9-937c-6f657ea661b4.jpg' height="auto" width="145" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557547-2731b53e-e997-4dd2-a092-64739cc91959.jpg' height="auto" width="145" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557545-8189524a-eb92-4cce-bf6a-760cab4a8065.jpg' height="auto" width="145" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144557548-c1e3f3ec-3c96-43d4-874a-3b33cd3351c5.jpg' height="auto" width="145" ></div> |
## How to visualize the CAM of vision transformer
Here are some examples:
- `'backbone.norm3'` for Swin-Transformer;
- `'backbone.layers.11.ln1'` for ViT;
For ViT-like networks, such as ViT, T2T-ViT and Swin Transformer, the features are flattened. To draw the CAM, we need to specify the `--vit-like` argument to reshape the features into square feature maps.
Besides the flattened features, some ViT-like networks also add extra tokens, like the class token in ViT and T2T-ViT, and the distillation token in DeiT. In these networks, the final classification is done on the tokens computed in the last attention block; therefore, the classification score is not affected by the other features, and their gradients with respect to the classification score are zero. So you shouldn't use the output of the last attention block as the target layer in these networks.
To exclude these extra tokens, we need to know the number of extra tokens. Almost all transformer-based backbones in MMPretrain have the `num_extra_tokens` attribute. If you want to use this tool on a new or third-party network that doesn't have the `num_extra_tokens` attribute, please specify it via the `--num-extra-tokens` argument.
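The reshaping that `--vit-like` triggers can be sketched as follows. The shapes are illustrative assumptions (roughly a ViT-B/16 on a 224×224 input with one class token), not the tool's exact implementation:

```python
import torch

batch, num_extra_tokens, height, width, channels = 2, 1, 14, 14, 768
# ViT-style output: [class token] + 14*14 flattened patch tokens
tokens = torch.randn(batch, num_extra_tokens + height * width, channels)

patch_tokens = tokens[:, num_extra_tokens:]  # drop the extra (class) tokens
feature_map = patch_tokens.reshape(batch, height, width, channels)
feature_map = feature_map.permute(0, 3, 1, 2)  # to NCHW for CAM computation
print(feature_map.shape)  # torch.Size([2, 768, 14, 14])
```

Without dropping the extra tokens first, the remaining token count would not form a square feature map, which is why the tool needs `num_extra_tokens`.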
1. Visualize CAM for `Swin Transformer`, using default `target-layers`:
```shell
python tools/visualization/vis_cam.py \
demo/bird.JPEG \
configs/swin_transformer/swin-tiny_16xb64_in1k.py \
https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925-66df6be6.pth \
--vit-like
```
2. Visualize CAM for `Vision Transformer(ViT)`:
```shell
python tools/visualization/vis_cam.py \
demo/bird.JPEG \
configs/vision_transformer/vit-base-p16_64xb64_in1k-384px.py \
https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-98e8652b.pth \
--vit-like \
--target-layers 'backbone.layers.11.ln1'
```
3. Visualize CAM for `T2T-ViT`:
```shell
python tools/visualization/vis_cam.py \
demo/bird.JPEG \
configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py \
https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_3rdparty_8xb64_in1k_20210928-b7c09b62.pth \
--vit-like \
--target-layers 'backbone.encoder.12.ln1'
```
| Image | ResNet50 | ViT | Swin | T2T-ViT |
| --------------------------------------- | ------------------------------------------ | -------------------------------------- | --------------------------------------- | ------------------------------------------ |
| <div align=center><img src='https://user-images.githubusercontent.com/18586273/144429496-628d3fb3-1f6e-41ff-aa5c-1b08c60c32a9.JPEG' height="auto" width="165" ></div> | <div align=center><img src=https://user-images.githubusercontent.com/18586273/144431491-a2e19fe3-5c12-4404-b2af-a9552f5a95d9.jpg height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144436218-245a11de-6234-4852-9c08-ff5069f6a739.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144436168-01b0e565-442c-4e1e-910c-17c62cff7cd3.jpg' height="auto" width="150" ></div> | <div align=center><img src='https://user-images.githubusercontent.com/18586273/144436198-51dbfbda-c48d-48cc-ae06-1a923d19b6f6.jpg' height="auto" width="150" ></div> |