.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.engine
mmpretrain.engine
===================================
This package includes some runtime components, including hooks, runners, optimizers and loops. These components are useful in
classification tasks but not supported by MMEngine yet.
.. note::
Some components may be moved to MMEngine in the future.
.. contents:: mmpretrain.engine
:depth: 2
:local:
:backlinks: top
.. module:: mmpretrain.engine.hooks
Hooks
------------------
.. autosummary::
:toctree: generated
:nosignatures:
ClassNumCheckHook
PreciseBNHook
VisualizationHook
PrepareProtoBeforeValLoopHook
SetAdaptiveMarginsHook
EMAHook
SimSiamHook
DenseCLHook
SwAVHook
.. module:: mmpretrain.engine.optimizers
Optimizers
------------------
.. autosummary::
:toctree: generated
:nosignatures:
Lamb
LARS
LearningRateDecayOptimWrapperConstructor
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.evaluation
mmpretrain.evaluation
===================================
This package includes metrics and evaluators for classification tasks.
.. contents:: mmpretrain.evaluation
:depth: 1
:local:
:backlinks: top
Single Label Metric
----------------------
.. autosummary::
:toctree: generated
:nosignatures:
Accuracy
SingleLabelMetric
ConfusionMatrix
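As a rough illustration of what ``Accuracy`` computes, top-k accuracy can be sketched without mmpretrain as follows (a minimal self-contained sketch with illustrative names, not the actual implementation):

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    `scores` is a list of per-class score lists, `labels` the list of
    ground-truth class indices. Illustrative sketch only.
    """
    correct = 0
    for score, label in zip(scores, labels):
        # Indices of the k highest-scoring classes for this sample.
        topk = sorted(range(len(score)), key=lambda i: score[i], reverse=True)[:k]
        correct += label in topk
    return correct / len(labels)
```

``Accuracy`` additionally supports evaluating several k values (e.g. top-1 and top-5) in one pass.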
Multi Label Metric
----------------------
.. autosummary::
:toctree: generated
:nosignatures:
AveragePrecision
MultiLabelMetric
VOCAveragePrecision
VOCMultiLabelMetric
Retrieval Metric
----------------------
.. autosummary::
:toctree: generated
:nosignatures:
:template: classtemplate.rst
RetrievalRecall
RetrievalAveragePrecision
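``RetrievalRecall`` reports Recall@K; the underlying idea can be sketched like this (names are illustrative, not the mmpretrain API):

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of queries with at least one relevant item in the top-k.

    `ranked_ids` holds each query's retrieved ids in rank order,
    `relevant_ids` the set of ground-truth ids per query. Sketch only.
    """
    hits = sum(
        any(i in relevant for i in ranked[:k])
        for ranked, relevant in zip(ranked_ids, relevant_ids))
    return hits / len(ranked_ids)
```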
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.models
mmpretrain.models
===================================
The ``models`` package contains several sub-packages for addressing the different components of a model.
- :mod:`~mmpretrain.models.classifiers`: The top-level module which defines the whole process of a classification model.
- :mod:`~mmpretrain.models.selfsup`: The top-level module which defines the whole process of a self-supervised learning model.
- :mod:`~mmpretrain.models.retrievers`: The top-level module which defines the whole process of a retrieval model.
- :mod:`~mmpretrain.models.backbones`: Usually a feature extraction network, e.g., ResNet, MobileNet.
- :mod:`~mmpretrain.models.necks`: The component between backbones and heads, e.g., GlobalAveragePooling.
- :mod:`~mmpretrain.models.heads`: The component for specific tasks.
- :mod:`~mmpretrain.models.losses`: Loss functions.
- :mod:`~mmpretrain.models.peft`: The PEFT (Parameter-Efficient Fine-Tuning) module, e.g. LoRAModel.
- :mod:`~mmpretrain.models.utils`: Some helper functions and common components used in various networks.
- :mod:`~mmpretrain.models.utils.data_preprocessor`: The component before model to preprocess the inputs, e.g., ClsDataPreprocessor.
- :ref:`components`: Common components used in various networks.
- :ref:`helpers`: Helper functions.
Build Functions
---------------
.. autosummary::
:toctree: generated
:nosignatures:
build_classifier
build_backbone
build_neck
build_head
build_loss
.. module:: mmpretrain.models.classifiers
Classifiers
------------------
.. autosummary::
:toctree: generated
:nosignatures:
BaseClassifier
ImageClassifier
TimmClassifier
HuggingFaceClassifier
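A classifier is usually assembled from a backbone, neck, and head through a config file. A sketch following the common ResNet-50 ImageNet config (field values may differ between versions):

```python
# Sketch of an ImageClassifier definition in an MMPretrain config file.
model = dict(
    type='ImageClassifier',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(3, ),
        style='pytorch'),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LinearClsHead',
        num_classes=1000,
        in_channels=2048,
        loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
        topk=(1, 5),
    ))
```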
.. module:: mmpretrain.models.selfsup
Self-supervised Algorithms
--------------------------
.. _selfsup_algorithms:
.. autosummary::
:toctree: generated
:nosignatures:
BaseSelfSupervisor
BEiT
BYOL
BarlowTwins
CAE
DenseCL
EVA
iTPN
MAE
MILAN
MaskFeat
MixMIM
MoCo
MoCoV3
SimCLR
SimMIM
SimSiam
SparK
SwAV
.. _selfsup_backbones:
Some of the above algorithms modify the backbone module to adapt to extra
inputs like ``mask``. Here is a list of these **modified backbone** modules.
.. autosummary::
:toctree: generated
:nosignatures:
BEiTPretrainViT
CAEPretrainViT
iTPNHiViT
MAEHiViT
MAEViT
MILANViT
MaskFeatViT
MixMIMPretrainTransformer
MoCoV3ViT
SimMIMSwinTransformer
.. _target_generators:
Some self-supervised algorithms need an external **target generator** to
generate the optimization target. Here is a list of target generators.
.. autosummary::
:toctree: generated
:nosignatures:
VQKD
DALLEEncoder
HOGGenerator
CLIPGenerator
.. module:: mmpretrain.models.retrievers
Retrievers
------------------
.. autosummary::
:toctree: generated
:nosignatures:
BaseRetriever
ImageToImageRetriever
.. module:: mmpretrain.models.multimodal
Multi-Modality Algorithms
--------------------------
.. autosummary::
:toctree: generated
:nosignatures:
Blip2Caption
Blip2Retrieval
Blip2VQA
BlipCaption
BlipGrounding
BlipNLVR
BlipRetrieval
BlipVQA
Flamingo
OFA
MiniGPT4
Llava
Otter
.. module:: mmpretrain.models.backbones
Backbones
------------------
.. autosummary::
:toctree: generated
:nosignatures:
AlexNet
BEiTViT
CSPDarkNet
CSPNet
CSPResNeXt
CSPResNet
Conformer
ConvMixer
ConvNeXt
DaViT
DeiT3
DenseNet
DistilledVisionTransformer
EdgeNeXt
EfficientFormer
EfficientNet
EfficientNetV2
HiViT
HRNet
HorNet
InceptionV3
LeNet5
LeViT
MViT
MlpMixer
MobileNetV2
MobileNetV3
MobileOne
MobileViT
PCPVT
PoolFormer
PyramidVig
RegNet
RepLKNet
RepMLPNet
RepVGG
Res2Net
ResNeSt
ResNeXt
ResNet
ResNetV1c
ResNetV1d
ResNet_CIFAR
RevVisionTransformer
SEResNeXt
SEResNet
SVT
ShuffleNetV1
ShuffleNetV2
SparseResNet
SparseConvNeXt
SwinTransformer
SwinTransformerV2
T2T_ViT
TIMMBackbone
TNT
VAN
VGG
Vig
VisionTransformer
ViTSAM
XCiT
ViTEVA02
.. module:: mmpretrain.models.necks
Necks
------------------
.. autosummary::
:toctree: generated
:nosignatures:
BEiTV2Neck
CAENeck
ClsBatchNormNeck
DenseCLNeck
GeneralizedMeanPooling
GlobalAveragePooling
HRFuseScales
LinearNeck
MAEPretrainDecoder
MILANPretrainDecoder
MixMIMPretrainDecoder
MoCoV2Neck
NonLinearNeck
SimMIMLinearDecoder
SwAVNeck
iTPNPretrainDecoder
SparKLightDecoder
.. module:: mmpretrain.models.heads
Heads
------------------
.. autosummary::
:toctree: generated
:nosignatures:
ArcFaceClsHead
BEiTV1Head
BEiTV2Head
CAEHead
CSRAClsHead
ClsHead
ConformerHead
ContrastiveHead
DeiTClsHead
EfficientFormerClsHead
LatentCrossCorrelationHead
LatentPredictHead
LeViTClsHead
LinearClsHead
MAEPretrainHead
MIMHead
MixMIMPretrainHead
MoCoV3Head
MultiLabelClsHead
MultiLabelLinearClsHead
MultiTaskHead
SimMIMHead
StackedLinearClsHead
SwAVHead
VigClsHead
VisionTransformerClsHead
iTPNClipHead
SparKPretrainHead
.. module:: mmpretrain.models.losses
Losses
------------------
.. autosummary::
:toctree: generated
:nosignatures:
AsymmetricLoss
CAELoss
CosineSimilarityLoss
CrossCorrelationLoss
CrossEntropyLoss
FocalLoss
LabelSmoothLoss
PixelReconstructionLoss
SeesawLoss
SwAVLoss
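For instance, ``LabelSmoothLoss`` softens one-hot targets before computing cross-entropy. A minimal sketch of the "original" label-smoothing formulation (illustrative code, not the mmpretrain implementation):

```python
def smooth_labels(label, num_classes, eps=0.1):
    """Smoothed target: (1 - eps) * one_hot + eps / num_classes.

    Every class receives eps / num_classes probability mass, and the true
    class additionally gets 1 - eps, so the target still sums to 1.
    """
    off = eps / num_classes
    target = [off] * num_classes
    target[label] += 1 - eps
    return target
```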
.. module:: mmpretrain.models.peft
PEFT
------------------
.. autosummary::
:toctree: generated
:nosignatures:
LoRAModel
.. module:: mmpretrain.models.utils
models.utils
------------
This package includes some helper functions and common components used in various networks.
.. _components:
Common Components
^^^^^^^^^^^^^^^^^
.. autosummary::
:toctree: generated
:nosignatures:
ConditionalPositionEncoding
CosineEMA
HybridEmbed
InvertedResidual
LayerScale
MultiheadAttention
PatchEmbed
PatchMerging
SELayer
ShiftWindowMSA
WindowMSA
WindowMSAV2
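``CosineEMA`` combines an exponential-moving-average model update with a cosine momentum schedule, as popularized by BYOL. A sketch of the common formulation (parameter names are illustrative, not the actual API):

```python
import math

def cosine_momentum(step, total_steps, base=0.996, end=1.0):
    """Cosine schedule from `base` to `end`:
    tau = end - (end - base) * (cos(pi * step / total_steps) + 1) / 2
    """
    return end - (end - base) * (math.cos(math.pi * step / total_steps) + 1) / 2

def ema_update(ema, params, momentum):
    """One EMA step: ema <- momentum * ema + (1 - momentum) * params."""
    return [momentum * e + (1 - momentum) * p for e, p in zip(ema, params)]
```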
.. _helpers:
Helper Functions
^^^^^^^^^^^^^^^^
.. autosummary::
:toctree: generated
:nosignatures:
channel_shuffle
is_tracing
make_divisible
resize_pos_embed
resize_relative_position_bias_table
to_ntuple
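For example, ``make_divisible`` rounds a channel number to a multiple of a divisor, which many mobile backbones need. A sketch assuming the common mmcv-style semantics:

```python
def make_divisible(value, divisor, min_value=None, min_ratio=0.9):
    """Round `value` to the nearest multiple of `divisor`.

    Keeps the result at least `min_value` and no smaller than
    `min_ratio` of the original value. Sketch of the usual behavior.
    """
    if min_value is None:
        min_value = divisor
    new_value = max(min_value, int(value + divisor / 2) // divisor * divisor)
    # Ensure rounding down did not shrink the value by more than (1 - min_ratio).
    if new_value < min_ratio * value:
        new_value += divisor
    return new_value
```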
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.structures
mmpretrain.structures
===================================
This package includes basic data structures.
DataSample
-------------
.. autoclass:: DataSample
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.utils
mmpretrain.utils
===================================
This package includes some useful helper functions for development.
.. autosummary::
:toctree: generated
:nosignatures:
collect_env
register_all_modules
load_json_log
track_on_main_process
get_ori_model
.. role:: hidden
:class: hidden-section
.. module:: mmpretrain.visualization
mmpretrain.visualization
===================================
This package includes visualizer and some helper functions for visualization.
Visualizer
-------------
.. autoclass:: UniversalVisualizer
:members:
# flake8: noqa
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import subprocess
import sys
import pytorch_sphinx_theme
from sphinx.builders.html import StandaloneHTMLBuilder
sys.path.insert(0, os.path.abspath('../../'))
# -- Project information -----------------------------------------------------
project = 'MMPretrain'
copyright = '2020, OpenMMLab'
author = 'MMPretrain Authors'
# The full version, including alpha/beta/rc tags
version_file = '../../mmpretrain/version.py'
def get_version():
with open(version_file, 'r') as f:
exec(compile(f.read(), version_file, 'exec'))
return locals()['__version__']
release = get_version()
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.autosummary',
'sphinx.ext.intersphinx',
'sphinx.ext.napoleon',
'sphinx.ext.viewcode',
'myst_parser',
'sphinx_copybutton',
'sphinx_tabs.tabs',
'notfound.extension',
'sphinxcontrib.jquery',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = {
'.rst': 'restructuredtext',
'.md': 'markdown',
}
language = 'en'
# The master toctree document.
root_doc = 'index'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'pytorch_sphinx_theme'
html_theme_path = [pytorch_sphinx_theme.get_html_theme_path()]
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
# yapf: disable
html_theme_options = {
'menu': [
{
'name': 'GitHub',
'url': 'https://github.com/open-mmlab/mmpretrain'
},
{
'name': 'Colab Tutorials',
'children': [
{'name': 'Train and inference with shell commands',
'url': 'https://colab.research.google.com/github/mzr1996/mmpretrain-tutorial/blob/master/1.x/MMPretrain_tools.ipynb'},
{'name': 'Train and inference with Python APIs',
'url': 'https://colab.research.google.com/github/mzr1996/mmpretrain-tutorial/blob/master/1.x/MMPretrain_python.ipynb'},
]
},
{
'name': 'Version',
'children': [
{'name': 'MMPreTrain 0.x',
'url': 'https://mmpretrain.readthedocs.io/en/0.x/',
'description': '0.x branch'},
{'name': 'MMPreTrain 1.x',
'url': 'https://mmpretrain.readthedocs.io/en/latest/',
'description': 'Main branch'},
],
}
],
# Specify the language of shared menu
'menu_lang': 'en',
# Disable the default edit on GitHub
'default_edit_on_github': False,
}
# yapf: enable
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_css_files = [
'https://cdn.datatables.net/v/bs4/dt-1.12.1/datatables.min.css',
'css/readthedocs.css'
]
html_js_files = [
'https://cdn.datatables.net/v/bs4/dt-1.12.1/datatables.min.js',
'js/custom.js'
]
# -- Options for HTMLHelp output ---------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'mmpretraindoc'
# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(root_doc, 'mmpretrain.tex', 'MMPretrain Documentation', author, 'manual'),
]
# -- Options for manual page output ------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [(root_doc, 'mmpretrain', 'MMPretrain Documentation', [author], 1)]
# -- Options for Texinfo output ----------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(root_doc, 'mmpretrain', 'MMPretrain Documentation', author, 'mmpretrain',
'OpenMMLab pre-training toolbox and benchmark.', 'Miscellaneous'),
]
# -- Options for Epub output -------------------------------------------------
# Bibliographic Dublin Core info.
epub_title = project
# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#
# epub_identifier = ''
# A unique identification for the text.
#
# epub_uid = ''
# A list of files that should not be packed into the epub file.
epub_exclude_files = ['search.html']
# set priority when building html
StandaloneHTMLBuilder.supported_image_types = [
'image/svg+xml', 'image/gif', 'image/png', 'image/jpeg'
]
# -- Extension configuration -------------------------------------------------
# Ignore >>> when copying code
copybutton_prompt_text = r'>>> |\.\.\. '
copybutton_prompt_is_regexp = True
# Auto-generated header anchors
myst_heading_anchors = 3
# Enable "colon_fence" extension of myst.
myst_enable_extensions = ['colon_fence', 'dollarmath']
# Configuration for intersphinx
intersphinx_mapping = {
'python': ('https://docs.python.org/3', None),
'numpy': ('https://numpy.org/doc/stable', None),
'torch': ('https://pytorch.org/docs/stable/', None),
'mmcv': ('https://mmcv.readthedocs.io/en/2.x/', None),
'mmengine': ('https://mmengine.readthedocs.io/en/latest/', None),
'transformers':
('https://huggingface.co/docs/transformers/main/en/', None),
}
napoleon_custom_sections = [
# Custom sections for data elements.
('Meta fields', 'params_style'),
('Data fields', 'params_style'),
]
# Disable docstring inheritance
autodoc_inherit_docstrings = False
# Mock some imports during generate API docs.
autodoc_mock_imports = ['rich', 'attr', 'einops', 'mat4py']
# Disable displaying type annotations, these can be very verbose
autodoc_typehints = 'none'
# The not found page
notfound_template = '404.html'
def builder_inited_handler(app):
if subprocess.run(['./stat.py']).returncode != 0:
raise RuntimeError('Failed to run the script `stat.py`.')
def setup(app):
app.connect('builder-inited', builder_inited_handler)
# NPU (HUAWEI Ascend)
## Usage
### General Usage
Please refer to the [building documentation of MMCV](https://mmcv.readthedocs.io/en/latest/get_started/build.html#build-mmcv-full-on-ascend-npu-machine) to install MMCV and [MMEngine](https://mmengine.readthedocs.io/en/latest/get_started/installation.html#build-from-source) on NPU devices.
Run the following command to train the model with 8 NPUs on your machine:
```shell
bash ./tools/dist_train.sh configs/resnet/resnet50_8xb32_in1k.py 8
```
You can also train the model with a single NPU:
```shell
python ./tools/train.py configs/resnet/resnet50_8xb32_in1k.py
```
## Model Results
| Model | Top-1 (%) | Top-5 (%) | Config | Download |
| :---------------------------------------------------------: | :-------: | :-------: | :----------------------------------------------------------: | :-------------------------------------------------------------: |
| [ResNet-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/README.md) | 76.40 | 93.21 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/resnet50_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnet50_8xb32_in1k.log) |
| [ResNeXt-50-32x4d](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnext/README.md) | 77.48 | 93.75 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnext/resnext50-32x4d_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnext50-32x4d_8xb32_in1k.log) |
| [HRNet-W18](https://github.com/open-mmlab/mmclassification/blob/master/configs/hrnet/README.md) | 77.06 | 93.57 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/hrnet/hrnet-w18_4xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/hrnet-w18_4xb32_in1k.log) |
| [ResNetV1D-152](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/README.md) | 79.41 | 94.48 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/resnetv1d152_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnetv1d152_8xb32_in1k.log) |
| [SE-ResNet-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/seresnet/README.md) | 77.65 | 93.74 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/seresnet/seresnet50_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/seresnet50_8xb32_in1k.log) |
| [ShuffleNetV2 1.0x](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/shufflenet_v2/README.md) | 69.52 | 88.79 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/shufflenet-v2-1x_16xb64_in1k.log) |
| [MobileNetV2](https://github.com/open-mmlab/mmclassification/tree/1.x/configs/mobilenet_v2) | 71.74 | 90.28 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/mobilenet-v2_8xb32_in1k.log) |
| [MobileNetV3-Small](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v3/README.md) | 67.09 | 87.17 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v3/mobilenet-v3-small_8xb128_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/mobilenet-v3-small.log) |
| [\*CSPResNeXt50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/cspnet/README.md) | 77.25 | 93.46 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/cspnet/cspresnext50_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/cspresnext50_8xb32_in1k.log) |
| [\*EfficientNet-B4](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/efficientnet/README.md) | 75.73 | 92.91 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/efficientnet/efficientnet-b4_8xb32_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/efficientnet-b4_8xb32_in1k.log) |
| [\*\*DenseNet121](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/densenet/README.md) | 72.53 | 90.85 | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/densenet/densenet121_4xb256_in1k.py) | [log](https://download.openmmlab.com/mmclassification/v1/device/npu/densenet121_4xb256_in1k.log) |
**Notes:**
- Unless specially marked, the results on the NPU are almost the same as the FP32 results on the GPU.
- (\*) The training results of these models are lower than those in the corresponding model READMEs, mainly
  because the README results come from directly evaluating the timm weights, while the results here are
  retrained from the config with mmcls. Training the same config on the GPU gives results consistent with
  those on the NPU.
- (\*\*) The accuracy of this model is slightly lower because the config is designed for 4 devices while we
  ran it with 8; users can adjust hyperparameters to get the best accuracy.
**All above models are provided by Huawei Ascend group.**
[html writers]
table_style: colwidths-auto
# Prerequisites
In this section we demonstrate how to prepare an environment with PyTorch.
MMPretrain works on Linux, Windows and macOS. It requires Python 3.7+, CUDA 10.2+ and PyTorch 1.8+.
```{note}
If you are experienced with PyTorch and have already installed it, just skip this part and jump to the [next section](#installation). Otherwise, you can follow these steps for the preparation.
```
**Step 1.** Download and install Miniconda from the [official website](https://docs.conda.io/en/latest/miniconda.html).
**Step 2.** Create a conda environment and activate it.
```shell
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
```
**Step 3.** Install PyTorch following [official instructions](https://pytorch.org/get-started/locally/), e.g.
On GPU platforms:
```shell
conda install pytorch torchvision -c pytorch
```
```{warning}
This command will automatically install the latest version PyTorch and cudatoolkit, please check whether they match your environment.
```
On CPU platforms:
```shell
conda install pytorch torchvision cpuonly -c pytorch
```
# Installation
## Best Practices
According to your needs, we support two install modes:
- [Install from source (Recommended)](#install-from-source): You want to develop your own network or new features based on the MMPretrain framework, e.g., adding new datasets or new backbones. In this mode, you can use all the tools we provide.
- [Install as a Python package](#install-as-a-python-package): You just want to call MMPretrain's APIs or import MMPretrain's modules in your project.
### Install from source
In this case, install mmpretrain from source:
```shell
git clone https://github.com/open-mmlab/mmpretrain.git
cd mmpretrain
pip install -U openmim && mim install -e .
```
```{note}
`"-e"` means installing a project in editable mode, thus any local modifications made to the code will take effect without reinstallation.
```
### Install as a Python package
Just install with mim.
```shell
pip install -U openmim && mim install "mmpretrain>=1.0.0rc8"
```
```{note}
`mim` is a light-weight command-line tool to setup appropriate environment for OpenMMLab repositories according to PyTorch and CUDA version. It also has some useful functions for deep-learning experiments.
```
## Install multi-modality support (Optional)
The multi-modality models in MMPretrain require extra dependencies. To install these dependencies, you
can add `[multimodal]` during the installation. For example:
```shell
# Install from source
mim install -e ".[multimodal]"
# Install as a Python package
mim install "mmpretrain[multimodal]>=1.0.0rc8"
```
## Verify the installation
To verify whether MMPretrain is installed correctly, we provide some sample code to run an inference demo.
Option (a). If you installed mmpretrain from source, just run the following command:
```shell
python demo/image_demo.py demo/demo.JPEG resnet18_8xb32_in1k --device cpu
```
You will see the output result dict including `pred_label`, `pred_score` and `pred_class` in your terminal.
Option (b). If you installed mmpretrain as a Python package, open your Python interpreter and copy and paste the following code.
```python
from mmpretrain import get_model, inference_model
model = get_model('resnet18_8xb32_in1k', device='cpu') # or device='cuda:0'
inference_model(model, 'demo/demo.JPEG')
```
You will see a dict printed, including the predicted label, score and category name.
```{note}
The `resnet18_8xb32_in1k` is the model name, and you can use [`mmpretrain.list_models`](mmpretrain.apis.list_models) to
explore all models, or search them on the [Model Zoo Summary](./modelzoo_statistics.md).
```
## Customize Installation
### CUDA versions
When installing PyTorch, you need to specify the version of CUDA. If you are
not clear on which to choose, follow our recommendations:
- For Ampere-based NVIDIA GPUs, such as GeForce 30 series and NVIDIA A100, CUDA 11 is a must.
- For older NVIDIA GPUs, CUDA 11 is backward compatible, but CUDA 10.2 offers better compatibility and is more lightweight.
Please make sure the GPU driver satisfies the minimum version requirements. See [this table](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions__table-cuda-toolkit-driver-versions) for more information.
```{note}
Installing CUDA runtime libraries is enough if you follow our best practices,
because no CUDA code will be compiled locally. However if you hope to compile
MMCV from source or develop other CUDA operators, you need to install the
complete CUDA toolkit from NVIDIA's [website](https://developer.nvidia.com/cuda-downloads),
and its version should match the CUDA version of PyTorch. i.e., the specified
version of cudatoolkit in `conda install` command.
```
### Install on CPU-only platforms
MMPretrain can be built for CPU-only environments. In CPU mode, you can train, test, or run inference with a model.
### Install on Google Colab
See [the Colab tutorial](https://colab.research.google.com/github/mzr1996/mmclassification-tutorial/blob/master/1.x/MMClassification_tools.ipynb).
### Using MMPretrain with Docker
We provide a [Dockerfile](https://github.com/open-mmlab/mmpretrain/blob/main/docker/Dockerfile)
to build an image. Ensure that your [docker version](https://docs.docker.com/engine/install/) >=19.03.
```shell
# build an image with PyTorch 1.12.1, CUDA 11.3
# If you prefer other versions, just modify the Dockerfile
docker build -t mmpretrain docker/
```
Run it with
```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmpretrain/data mmpretrain
```
## Troubleshooting
If you have some issues during the installation, please first view the [FAQ](./notes/faq.md) page.
You may [open an issue](https://github.com/open-mmlab/mmpretrain/issues/new/choose)
on GitHub if no solution is found.
Welcome to MMPretrain's documentation!
============================================
MMPretrain is a newly upgraded open-source framework for pre-training.
It has set out to provide multiple powerful pre-trained backbones and
support different pre-training strategies. MMPretrain originated from the
famous open-source projects
`MMClassification <https://github.com/open-mmlab/mmclassification/tree/1.x>`_
and `MMSelfSup <https://github.com/open-mmlab/mmselfsup>`_, and is developed
with many exciting new features. The pre-training stage is currently
essential for vision recognition. With rich and strong pre-trained models,
we are able to improve various downstream vision tasks.
Our primary objective for the codebase is to become an easily accessible and
user-friendly library and to streamline research and engineering. We
detail the properties and design of MMPretrain across different sections.
Hands-on Roadmap of MMPretrain
-------------------------------
To help users quickly utilize MMPretrain, we recommend following the hands-on
roadmap we have created for the library:
- For users who want to try MMPretrain, we suggest reading the GetStarted_
section for the environment setup.
- For basic usage, we refer users to UserGuides_ for utilizing various
algorithms to obtain the pre-trained models and evaluate their performance
in downstream tasks.
- For those who wish to customize their own algorithms, we provide
AdvancedGuides_ that include hints and rules for modifying code.
- To find your desired pre-trained models, users can check the ModelZoo_,
which features a summary of various backbones and pre-training methods and
an introduction of different algorithms.
- Additionally, we provide Analysis_ and Visualization_ tools to help
diagnose algorithms.
- Besides, if you have any other questions or concerns, please refer to the
Notes_ section for potential answers.
We always welcome *PRs* and *Issues* for the betterment of MMPretrain.
.. _GetStarted:
.. toctree::
:maxdepth: 1
:caption: Get Started
get_started.md
.. _UserGuides:
.. toctree::
:maxdepth: 1
:caption: User Guides
user_guides/config.md
user_guides/dataset_prepare.md
user_guides/inference.md
user_guides/train.md
user_guides/test.md
user_guides/downstream.md
.. _AdvancedGuides:
.. toctree::
:maxdepth: 1
:caption: Advanced Guides
advanced_guides/datasets.md
advanced_guides/pipeline.md
advanced_guides/modules.md
advanced_guides/schedule.md
advanced_guides/runtime.md
advanced_guides/evaluation.md
advanced_guides/convention.md
.. _ModelZoo:
.. toctree::
:maxdepth: 1
:caption: Model Zoo
:glob:
modelzoo_statistics.md
papers/*
.. _Visualization:
.. toctree::
:maxdepth: 1
:caption: Visualization
useful_tools/dataset_visualization.md
useful_tools/scheduler_visualization.md
useful_tools/cam_visualization.md
useful_tools/t-sne_visualization.md
.. _Analysis:
.. toctree::
:maxdepth: 1
:caption: Analysis Tools
useful_tools/print_config.md
useful_tools/verify_dataset.md
useful_tools/log_result_analysis.md
useful_tools/complexity_analysis.md
useful_tools/confusion_matrix.md
useful_tools/shape_bias.md
.. toctree::
:maxdepth: 1
:caption: Deployment
useful_tools/model_serving.md
.. toctree::
:maxdepth: 1
:caption: Migration
migration.md
.. toctree::
:maxdepth: 1
:caption: API Reference
mmpretrain.apis <api/apis>
mmpretrain.engine <api/engine>
mmpretrain.datasets <api/datasets>
Data Process <api/data_process>
mmpretrain.models <api/models>
mmpretrain.structures <api/structures>
mmpretrain.visualization <api/visualization>
mmpretrain.evaluation <api/evaluation>
mmpretrain.utils <api/utils>
.. _Notes:
.. toctree::
:maxdepth: 1
:caption: Notes
notes/contribution_guide.md
notes/projects.md
notes/changelog.md
notes/faq.md
notes/pretrain_custom_dataset.md
notes/finetune_custom_dataset.md
.. toctree::
:maxdepth: 1
:caption: Device Support
device/npu.md
Indices and tables
==================
* :ref:`genindex`
* :ref:`search`
# Frequently Asked Questions
We list some common troubles faced by many users and their corresponding
solutions here. Feel free to enrich the list if you find any frequent issues
and have ways to help others to solve them. If the contents here do not cover
your issue, please create an issue using the
[provided templates](https://github.com/open-mmlab/mmpretrain/issues/new/choose)
and make sure you fill in all required information in the template.
## Installation
- Compatibility issue between MMEngine, MMCV and MMPretrain
Compatible MMPretrain, MMEngine, and MMCV versions are shown below. Please
choose the correct version of MMEngine and MMCV to avoid installation issues.
| MMPretrain version | MMEngine version | MMCV version |
| :----------------: | :---------------: | :--------------: |
| 1.2.0 (main) | mmengine >= 0.8.3 | mmcv >= 2.0.0 |
| 1.1.1 | mmengine >= 0.8.3 | mmcv >= 2.0.0 |
| 1.0.0 | mmengine >= 0.8.0 | mmcv >= 2.0.0 |
| 1.0.0rc8 | mmengine >= 0.7.1 | mmcv >= 2.0.0rc4 |
| 1.0.0rc7 | mmengine >= 0.5.0 | mmcv >= 2.0.0rc4 |
```{note}
Since the `dev` branch is under frequent development, the MMEngine and MMCV
version dependency may be inaccurate. If you encounter problems when using
the `dev` branch, please try to update MMEngine and MMCV to the latest version.
```
- Using Albumentations
If you would like to use `albumentations`, we suggest using `pip install -r requirements/albu.txt` or
`pip install -U albumentations --no-binary qudida,albumentations`.
If you simply use `pip install albumentations>=0.3.2`, it will install `opencv-python-headless` simultaneously
(even though you have already installed `opencv-python`). Please refer to the
[official documentation](https://albumentations.ai/docs/getting_started/installation/#note-on-opencv-dependencies)
for details.
## General Questions
### Do I need to reinstall mmpretrain after some code modifications?
If you follow [the best practice](../get_started.md#best-practices) and install mmpretrain from source,
any local modifications made to the code will take effect without
reinstallation.
### How to develop with multiple MMPretrain versions?
Generally speaking, we recommend using different virtual environments to
manage MMPretrain in different working directories. However, you
can also use the same environment to develop MMPretrain in different
folders, like `mmpretrain-0.21` and `mmpretrain-0.23`. When you run the train or test shell scripts,
they adopt the mmpretrain package in the current folder. And when you run other Python
scripts, you can also add `` PYTHONPATH=`pwd` `` at the beginning of your command
to use the package in the current folder.
Conversely, to use the default MMPretrain installed in the environment
rather than the one you are working with, you can remove the following line
in those shell scripts:
```shell
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH
```
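To see which copy of a package Python actually picks up, the lookup order can be demonstrated with a throwaway package (the `/tmp/pythonpath_demo` path and `mypkg` name below are hypothetical, purely for illustration):

```shell
# Create a throwaway package in a scratch folder.
mkdir -p /tmp/pythonpath_demo/mypkg
echo "x = 1" > /tmp/pythonpath_demo/mypkg/__init__.py

# From any other directory, prepending the folder to PYTHONPATH makes Python
# import that local copy before any version installed in the environment.
cd /tmp
PYTHONPATH=/tmp/pythonpath_demo python3 -c "import mypkg; print(mypkg.x)"  # prints 1
```

The same mechanism is what makes `` PYTHONPATH=`pwd` `` pick up the mmpretrain package in your working folder.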
### What's the relationship between the `load_from` and the `init_cfg`?
- `load_from`: If `resume=False`, it only imports the model weights, which is mainly used to load trained models;
  if `resume=True`, it loads the model weights, optimizer state, and other training information, which is
  mainly used to resume interrupted training.
- `init_cfg`: You can also specify `init_cfg=dict(type="Pretrained", checkpoint=xxx)` to load a checkpoint. This
  means the weights are loaded during model initialization, i.e., only at the
  beginning of training. It's mainly used to fine-tune a pre-trained model, and you can set it in
  the backbone config and use the `prefix` field to load only the backbone weights, for example:
```python
model = dict(
    backbone=dict(
        type='ResNet',
        depth=50,
        init_cfg=dict(type='Pretrained', checkpoint=xxx, prefix='backbone'),
    ),
    ...
)
```
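For comparison, a minimal sketch of the `load_from` usage (the checkpoint path below is hypothetical):

```python
# Resume an interrupted run: besides the model weights, the optimizer state
# and training progress are restored from the checkpoint.
load_from = 'work_dirs/resnet50_8xb32_in1k/epoch_20.pth'  # hypothetical path
resume = True
```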
See the [Fine-tune Models](./finetune_custom_dataset.md) for more details about fine-tuning.
### What's the difference between `default_hooks` and `custom_hooks`?
Almost no difference. Usually, the `default_hooks` field is used to specify the hooks that will be used in almost
all experiments, and the `custom_hooks` field is used in only some experiments.
Another difference is that `default_hooks` is a dict while `custom_hooks` is a list; please don't confuse them.
### During training, I got no training log. What's the reason?
If your training dataset is small while the batch size is large, our default log interval may be too large to
record your training log.
You can shrink the log interval and try again, like:
```python
default_hooks = dict(
    ...
    logger=dict(type='LoggerHook', interval=10),
    ...
)
```
### How to train with other datasets, like my own dataset or COCO?
We provide [specific examples](./pretrain_custom_dataset.md) to show how to train with other datasets.
# How to Fine-tune with Custom Dataset
In most scenarios, we want to apply a pre-trained model without training from scratch, since training from scratch may introduce extra uncertainty about model convergence and is therefore time-consuming.
Common practice is to learn from previous models trained on a large dataset, which can hopefully provide better knowledge than a random initialization. Roughly speaking, this process is known as fine-tuning.
Models pre-trained on the ImageNet dataset have been demonstrated to be effective for other datasets and other downstream tasks.
Hence, this tutorial provides instructions for users to use the models provided in the [Model Zoo](../modelzoo_statistics.md) for other datasets to obtain better performance.
In this tutorial, we provide a practice example and some tips on how to fine-tune a model on your own dataset.
## Step-1: Prepare your dataset
Prepare your dataset following [Prepare Dataset](../user_guides/dataset_prepare.md).
And the root folder of the dataset can be like `data/custom_dataset/`.
Here, we assume you want to do supervised image-classification training, and use the sub-folder format
`CustomDataset` to organize your dataset as:
```text
data/custom_dataset/
├── train
│   ├── class_x
│   │   ├── x_1.png
│   │   ├── x_2.png
│   │   ├── x_3.png
│   │   └── ...
│   ├── class_y
│   └── ...
└── test
    ├── class_x
    │   ├── test_x_1.png
    │   ├── test_x_2.png
    │   ├── test_x_3.png
    │   └── ...
    ├── class_y
    └── ...
```
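With this sub-folder format and no annotation file, the class names are inferred from the directory names. A minimal sketch of that inference (the `list_classes` helper is our own illustration, not a MMPretrain API):

```python
from pathlib import Path

def list_classes(split_dir):
    """Infer class names in the sub-folder format: one directory per class,
    sorted alphabetically."""
    return sorted(p.name for p in Path(split_dir).iterdir() if p.is_dir())

# e.g. list_classes('data/custom_dataset/train') -> ['class_x', 'class_y']
```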
## Step-2: Choose one config as template
Here, we would like to use `configs/resnet/resnet50_8xb32_in1k.py` as the example. We first copy this config
file to the same folder and rename it as `resnet50_8xb32-ft_custom.py`.
```{tip}
As a convention, the last field of the config name is the dataset, e.g., `in1k` for the ImageNet dataset, `coco` for the COCO dataset.
```
The content of this config is:
```python
_base_ = [
    '../_base_/models/resnet50.py',           # model settings
    '../_base_/datasets/imagenet_bs32.py',    # data settings
    '../_base_/schedules/imagenet_bs256.py',  # schedule settings
    '../_base_/default_runtime.py',           # runtime settings
]
```
## Step-3: Edit the model settings
When fine-tuning a model, usually we want to load the pre-trained backbone
weights and train a new classification head from scratch.
To load the pre-trained backbone, we need to change the initialization config
of the backbone and use the `Pretrained` initialization function. Besides, in the
`init_cfg`, we use `prefix='backbone'` to tell the initialization function
the prefix of the submodule that needs to be loaded from the checkpoint.
For example, `backbone` here means to load the backbone submodule. And here we
use an online checkpoint; it will be downloaded automatically during training, but
you can also download the model manually and use a local path.

Then we need to modify the head according to the number of classes of the new
dataset by simply changing `num_classes` in the head.

When the new dataset is small and shares the domain with the pre-trained dataset,
we might want to freeze the parameters of the first several stages of the
backbone, which helps the network keep the ability to extract the low-level
information learnt from the pre-trained model. In MMPretrain, you can simply
specify how many stages to freeze with the `frozen_stages` argument. For example, to
freeze the parameters of the first two stages, just use the following config:
```{note}
Not all backbones support the `frozen_stages` argument by now. Please check
[the docs](https://mmpretrain.readthedocs.io/en/latest/api.html#module-mmpretrain.models.backbones)
to confirm if your backbone supports it.
```
```python
_base_ = [
    '../_base_/models/resnet50.py',           # model settings
    '../_base_/datasets/imagenet_bs32.py',    # data settings
    '../_base_/schedules/imagenet_bs256.py',  # schedule settings
    '../_base_/default_runtime.py',           # runtime settings
]

# >>>>>>>>>>>>>>> Override model settings here >>>>>>>>>>>>>>>>>>>
model = dict(
    backbone=dict(
        frozen_stages=2,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=10),
)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
```
```{tip}
Here we only need to set the parts of the config we want to modify, because the
inherited configs will be merged with them to produce the complete config.
```
## Step-4: Edit the dataset settings
To fine-tune on a new dataset, we need to override some dataset settings, like the dataset type, the data
pipeline, etc.
```python
_base_ = [
    '../_base_/models/resnet50.py',           # model settings
    '../_base_/datasets/imagenet_bs32.py',    # data settings
    '../_base_/schedules/imagenet_bs256.py',  # schedule settings
    '../_base_/default_runtime.py',           # runtime settings
]

# model settings
model = dict(
    backbone=dict(
        frozen_stages=2,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=10),
)

# >>>>>>>>>>>>>>> Override data settings here >>>>>>>>>>>>>>>>>>>
data_root = 'data/custom_dataset'
train_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='train',
    ))
val_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='test',
    ))
test_dataloader = val_dataloader
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
```
## Step-5: Edit the schedule settings (optional)
The fine-tuning hyper-parameters differ from the default schedule. Fine-tuning
usually requires a smaller learning rate and a schedule that decays after fewer epochs.
```python
_base_ = [
    '../_base_/models/resnet50.py',           # model settings
    '../_base_/datasets/imagenet_bs32.py',    # data settings
    '../_base_/schedules/imagenet_bs256.py',  # schedule settings
    '../_base_/default_runtime.py',           # runtime settings
]

# model settings
model = dict(
    backbone=dict(
        frozen_stages=2,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=10),
)

# data settings
data_root = 'data/custom_dataset'
train_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='train',
    ))
val_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='test',
    ))
test_dataloader = val_dataloader

# >>>>>>>>>>>>>>> Override schedule settings here >>>>>>>>>>>>>>>>>>>
# optimizer hyper-parameters
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
# learning policy
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
```
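The `MultiStepLR` policy above multiplies the learning rate by `gamma` at each milestone epoch. A small sketch of the resulting schedule (our own illustration, not the MMEngine implementation):

```python
def lr_at_epoch(base_lr, milestones, gamma, epoch):
    """Learning rate after applying the MultiStepLR decay rule."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed

# With lr=0.01, milestones=[15] and gamma=0.1 as in the config above, the
# learning rate stays at 0.01 until epoch 15, then drops to 0.01 * 0.1.
print(lr_at_epoch(0.01, [15], 0.1, 10))
print(lr_at_epoch(0.01, [15], 0.1, 20))
```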
```{tip}
Refer to [Learn about Configs](../user_guides/config.md) for more detailed configurations.
```
## Start Training
Now, we have finished the fine-tuning config file as follows:
```python
_base_ = [
    '../_base_/models/resnet50.py',           # model settings
    '../_base_/datasets/imagenet_bs32.py',    # data settings
    '../_base_/schedules/imagenet_bs256.py',  # schedule settings
    '../_base_/default_runtime.py',           # runtime settings
]

# model settings
model = dict(
    backbone=dict(
        frozen_stages=2,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=10),
)

# data settings
data_root = 'data/custom_dataset'
train_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='train',
    ))
val_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='test',
    ))
test_dataloader = val_dataloader

# schedule settings
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
```
If you have 8 GPUs, you can train the model with the following command:
```shell
bash tools/dist_train.sh configs/resnet/resnet50_8xb32-ft_custom.py 8
```
Also, you can use only one GPU to train the model with the following command:
```shell
python tools/train.py configs/resnet/resnet50_8xb32-ft_custom.py
```
But wait, an important config needs to be changed when using one GPU. We need to
change the dataset config as follows:
```python
data_root = 'data/custom_dataset'
train_dataloader = dict(
    batch_size=256,
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='train',
    ))
val_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='test',
    ))
test_dataloader = val_dataloader
```
This is because our training schedule assumes a total batch size of 256. When using 8 GPUs,
it's enough to use the `batch_size=32` setting from the base config file on each GPU, and the total batch
size will be 256. But when using one GPU, you need to change it to 256 manually to
match the training schedule.
However, a larger batch size requires a larger GPU memory, and here are several simple tricks to save the GPU
memory:
1. Enable Automatic-Mixed-Precision training.
```shell
python tools/train.py configs/resnet/resnet50_8xb32-ft_custom.py --amp
```
2. Use a smaller batch size, like `batch_size=32` instead of 256, and enable the auto learning rate scaling.
```shell
python tools/train.py configs/resnet/resnet50_8xb32-ft_custom.py --auto-scale-lr
```
The auto learning rate scaling adjusts the learning rate according to the actual batch size and the
`auto_scale_lr.base_batch_size` (you can find it in the base config
`configs/_base_/schedules/imagenet_bs256.py`).
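The linear scaling rule behind `--auto-scale-lr` can be sketched as follows (the `scale_lr` helper is our own illustration; the actual logic lives in the runner):

```python
def scale_lr(base_lr, base_batch_size, actual_batch_size):
    """Scale the learning rate linearly with the actual total batch size."""
    return base_lr * actual_batch_size / base_batch_size

# With the config above (lr=0.01 for base_batch_size=256), training on a
# single GPU with batch_size=32 would use 0.01 * 32 / 256 = 0.00125.
print(scale_lr(0.01, 256, 32))
```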
```{note}
Most of these tricks may influence the training performance slightly.
```
### Apply the pre-trained model with the command line
If you don't want to modify the config, you can use `--cfg-options` to add your pre-trained model path to `init_cfg`.
For example, the command below will also load the pre-trained model:
```shell
bash tools/dist_train.sh configs/resnet/resnet50_8xb32-ft_custom.py 8 \
--cfg-options model.backbone.init_cfg.type='Pretrained' \
model.backbone.init_cfg.checkpoint='https://download.openmmlab.com/mmselfsup/1.x/mocov3/mocov3_resnet50_8xb512-amp-coslr-100e_in1k/mocov3_resnet50_8xb512-amp-coslr-100e_in1k_20220927-f1144efa.pth' \
model.backbone.init_cfg.prefix='backbone'
```
# How to Pretrain with Custom Dataset
In this tutorial, we provide a practice example and some tips on how to train on your own dataset.
In MMPretrain, we support `CustomDataset` (similar to `ImageFolder` in `torchvision`), which can directly read the images in the specified folder. You only need to prepare the path information of the custom dataset and edit the config.
## Step-1: Prepare your dataset
Prepare your dataset following [Prepare Dataset](../user_guides/dataset_prepare.md).
And the root folder of the dataset can be like `data/custom_dataset/`.
Here, we assume you want to do unsupervised training, and use the sub-folder format `CustomDataset` to
organize your dataset as:
```text
data/custom_dataset/
├── sample1.png
├── sample2.png
├── sample3.png
├── sample4.png
└── ...
```
## Step-2: Choose one config as template
Here, we would like to use `configs/mae/mae_vit-base-p16_8xb512-amp-coslr-300e_in1k.py` as the example. We
first copy this config file to the same folder and rename it as
`mae_vit-base-p16_8xb512-amp-coslr-300e_custom.py`.
```{tip}
As a convention, the last field of the config name is the dataset, e.g., `in1k` for the ImageNet dataset, `coco` for the COCO dataset.
```
The content of this config is:
```python
_base_ = [
    '../_base_/models/mae_vit-base-p16.py',
    '../_base_/datasets/imagenet_bs512_mae.py',
    '../_base_/default_runtime.py',
]

# optimizer wrapper
optim_wrapper = dict(
    type='AmpOptimWrapper',
    loss_scale='dynamic',
    optimizer=dict(
        type='AdamW',
        lr=1.5e-4 * 4096 / 256,
        betas=(0.9, 0.95),
        weight_decay=0.05),
    paramwise_cfg=dict(
        custom_keys={
            'ln': dict(decay_mult=0.0),
            'bias': dict(decay_mult=0.0),
            'pos_embed': dict(decay_mult=0.),
            'mask_token': dict(decay_mult=0.),
            'cls_token': dict(decay_mult=0.)
        }))

# learning rate scheduler
param_scheduler = [
    dict(
        type='LinearLR',
        start_factor=0.0001,
        by_epoch=True,
        begin=0,
        end=40,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        T_max=260,
        by_epoch=True,
        begin=40,
        end=300,
        convert_to_iter_based=True)
]

# runtime settings
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=300)
default_hooks = dict(
    # only keeps the latest 3 checkpoints
    checkpoint=dict(type='CheckpointHook', interval=1, max_keep_ckpts=3))
randomness = dict(seed=0, diff_rank_seed=True)

# auto resume
resume = True

# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
auto_scale_lr = dict(base_batch_size=4096)
```
## Step-3: Edit the dataset related config
- Override the `type` of dataset settings as `'CustomDataset'`
- Override the `data_root` of dataset settings as `data/custom_dataset`.
- Override the `ann_file` of dataset settings as an empty string since we assume you are using the sub-folder
format `CustomDataset`.
- Override the `data_prefix` of dataset settings as an empty string since we are using the whole dataset under
  the `data_root`, and there is no need to split samples into different subsets or set a `data_prefix`.
The modified config will be like:
```python
_base_ = [
    '../_base_/models/mae_vit-base-p16.py',
    '../_base_/datasets/imagenet_bs512_mae.py',
    '../_base_/default_runtime.py',
]

# >>>>>>>>>>>>>>> Override dataset settings here >>>>>>>>>>>>>>>>>>>
train_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root='data/custom_dataset/',
        ann_file='',  # We assume you are using the sub-folder format without ann_file
        data_prefix='',  # The `data_root` is the data_prefix directly.
        with_label=False,
    )
)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

# optimizer wrapper
optim_wrapper = dict(
    type='AmpOptimWrapper',
    loss_scale='dynamic',
    optimizer=dict(
        type='AdamW',
        lr=1.5e-4 * 4096 / 256,
        betas=(0.9, 0.95),
        weight_decay=0.05),
    paramwise_cfg=dict(
        custom_keys={
            'ln': dict(decay_mult=0.0),
            'bias': dict(decay_mult=0.0),
            'pos_embed': dict(decay_mult=0.),
            'mask_token': dict(decay_mult=0.),
            'cls_token': dict(decay_mult=0.)
        }))

# learning rate scheduler
param_scheduler = [
    dict(
        type='LinearLR',
        start_factor=0.0001,
        by_epoch=True,
        begin=0,
        end=40,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        T_max=260,
        by_epoch=True,
        begin=40,
        end=300,
        convert_to_iter_based=True)
]

# runtime settings
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=300)
default_hooks = dict(
    # only keeps the latest 3 checkpoints
    checkpoint=dict(type='CheckpointHook', interval=1, max_keep_ckpts=3))
randomness = dict(seed=0, diff_rank_seed=True)

# auto resume
resume = True

# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
auto_scale_lr = dict(base_batch_size=4096)
```
By using the edited config file, you are able to train a self-supervised model with MAE algorithm on the custom dataset.
## Another example: Train MAE on COCO Dataset
```{note}
You need to install MMDetection to use `mmdet.CocoDataset`. Follow this [documentation](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/en/get_started.md) to install it.
```
Following the aforementioned idea, we also present an example of how to train MAE on the COCO dataset. The edited file will look like this:
```python
_base_ = [
    '../_base_/models/mae_vit-base-p16.py',
    '../_base_/datasets/imagenet_mae.py',
    '../_base_/default_runtime.py',
]

# >>>>>>>>>>>>>>> Override dataset settings here >>>>>>>>>>>>>>>>>>>
train_dataloader = dict(
    dataset=dict(
        type='mmdet.CocoDataset',
        data_root='data/coco/',
        ann_file='annotations/instances_train2017.json',  # Only for loading images, and the labels won't be used.
        data_prefix=dict(img='train2017/'),
    )
)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

# optimizer wrapper
optim_wrapper = dict(
    type='AmpOptimWrapper',
    loss_scale='dynamic',
    optimizer=dict(
        type='AdamW',
        lr=1.5e-4 * 4096 / 256,
        betas=(0.9, 0.95),
        weight_decay=0.05),
    paramwise_cfg=dict(
        custom_keys={
            'ln': dict(decay_mult=0.0),
            'bias': dict(decay_mult=0.0),
            'pos_embed': dict(decay_mult=0.),
            'mask_token': dict(decay_mult=0.),
            'cls_token': dict(decay_mult=0.)
        }))

# learning rate scheduler
param_scheduler = [
    dict(
        type='LinearLR',
        start_factor=0.0001,
        by_epoch=True,
        begin=0,
        end=40,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        T_max=260,
        by_epoch=True,
        begin=40,
        end=300,
        convert_to_iter_based=True)
]

# runtime settings
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=300)
default_hooks = dict(
    # only keeps the latest 3 checkpoints
    checkpoint=dict(type='CheckpointHook', interval=1, max_keep_ckpts=3))
randomness = dict(seed=0, diff_rank_seed=True)

# auto resume
resume = True

# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
auto_scale_lr = dict(base_batch_size=4096)
```
# Projects based on MMPretrain
There are many projects built upon MMPretrain (previously MMClassification).
We list some of them as examples of how to extend MMPretrain for your own projects.
As the page might not be complete, please feel free to create a PR to update this page.
## Projects as an extension
- [OpenMixup](https://github.com/Westlake-AI/openmixup): an open-source toolbox for supervised, self-, and semi-supervised visual representation learning with mixup based on PyTorch, especially for mixup-related methods.
- [AI Power](https://github.com/ykk648/AI_power): AI toolbox and pretrain models.
- [OpenBioSeq](https://github.com/Westlake-AI/OpenBioSeq): an open-source supervised and self-supervised bio-sequence representation learning toolbox based on PyTorch.
## Projects of papers
There are also projects released with papers.
Some of the papers are published in top-tier conferences (CVPR, ICCV, and ECCV); the others are also highly influential.
To make this list a reference for the community to develop and compare new image classification algorithms, we list them in the chronological order of top-tier conferences.
Methods already supported and maintained by MMPretrain (previously MMClassification) are not listed.
- Involution: Inverting the Inherence of Convolution for Visual Recognition, CVPR21. [[paper]](https://arxiv.org/abs/2103.06255)[[github]](https://github.com/d-li14/involution)
- Convolution of Convolution: Let Kernels Spatially Collaborate, CVPR22. [[paper]](https://openaccess.thecvf.com/content/CVPR2022/papers/Zhao_Convolution_of_Convolution_Let_Kernels_Spatially_Collaborate_CVPR_2022_paper.pdf)[[github]](https://github.com/Genera1Z/ConvolutionOfConvolution)