Commit 0fd8347d authored by unknown

Add the mmclassification-0.24.1 code and remove mmclassification-speed-benchmark

parent cc567e9e
# optimizer
optimizer = dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='CosineAnnealing',
    min_lr=0,
    warmup='linear',
    warmup_iters=25025,
    warmup_ratio=0.25)
runner = dict(type='EpochBasedRunner', max_epochs=200)
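# Illustration (not part of the original config): the schedule above pairs a
# linear warmup with cosine annealing. The helper below sketches how such a
# schedule evaluates the learning rate per iteration; it mirrors the fields of
# this config but is a simplified stand-in, not MMCV's
# CosineAnnealingLrUpdaterHook.
import math


def warmup_cosine_lr(cur_iter, max_iters, base_lr=0.1, min_lr=0.0,
                     warmup_iters=25025, warmup_ratio=0.25):
    if cur_iter < warmup_iters:
        # Ramp linearly from base_lr * warmup_ratio up to base_lr.
        k = cur_iter / warmup_iters
        return base_lr * (warmup_ratio + (1 - warmup_ratio) * k)
    # Cosine decay from base_lr down to min_lr over the remaining iterations.
    progress = (cur_iter - warmup_iters) / max(1, max_iters - warmup_iters)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))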
# specific to vit pretrain
paramwise_cfg = dict(custom_keys={
    '.cls_token': dict(decay_mult=0.0),
    '.pos_embed': dict(decay_mult=0.0)
})
# optimizer
optimizer = dict(
    type='AdamW',
    lr=0.003,
    weight_decay=0.3,
    paramwise_cfg=paramwise_cfg,
)
optimizer_config = dict(grad_clip=dict(max_norm=1.0))
# learning policy
lr_config = dict(
    policy='CosineAnnealing',
    min_lr=0,
    warmup='linear',
    warmup_iters=10000,
    warmup_ratio=1e-4,
)
runner = dict(type='EpochBasedRunner', max_epochs=300)
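# Note (illustration, not part of the original config): the paramwise_cfg above
# disables weight decay for the ViT class token and position embedding via
# decay_mult=0.0. The sketch below shows how such key-based multipliers could
# be turned into optimizer parameter groups; it is a simplified stand-in for
# MMCV's DefaultOptimizerConstructor, not its actual implementation.
def build_param_groups(model, base_wd, custom_keys):
    groups = []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        wd = base_wd
        for key, opts in custom_keys.items():
            if key in name:  # substring match, as with the '.cls_token' style keys
                wd = base_wd * opts.get('decay_mult', 1.0)
                break
        groups.append({'params': [param], 'weight_decay': wd})
    return groups


# Example usage with any torch.nn.Module, e.g. a ViT backbone:
# optimizer = torch.optim.AdamW(
#     build_param_groups(model, base_wd=0.3,
#                        custom_keys={'.cls_token': dict(decay_mult=0.0),
#                                     '.pos_embed': dict(decay_mult=0.0)}),
#     lr=0.003)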
# optimizer
optimizer = dict(
    type='SGD', lr=0.003, momentum=0.9, weight_decay=0.0005, nesterov=True)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[40, 70, 90])
runner = dict(type='EpochBasedRunner', max_epochs=100)
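# Illustration (not part of the original config): the 'step' policy above simply
# scales the learning rate by a decay factor (0.1 in MMCV unless `gamma` is set)
# after each milestone epoch.
def step_lr(epoch, base_lr=0.003, steps=(40, 70, 90), gamma=0.1):
    """Illustrative step decay; lr shrinks by `gamma` after every milestone."""
    return base_lr * gamma ** sum(epoch >= s for s in steps)


# step_lr(0) -> 0.003, step_lr(40) -> 3e-4, step_lr(70) -> 3e-5, step_lr(90) -> 3e-6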
# Conformer
> [Conformer: Local Features Coupling Global Representations for Visual Recognition](https://arxiv.org/abs/2105.03889)
<!-- [ALGORITHM] -->
## Abstract
Within Convolutional Neural Network (CNN), the convolution operations are good at extracting local features but experience difficulty to capture global representations. Within visual transformer, the cascaded self-attention modules can capture long-distance feature dependencies but unfortunately deteriorate local feature details. In this paper, we propose a hybrid network structure, termed Conformer, to take advantage of convolutional operations and self-attention mechanisms for enhanced representation learning. Conformer roots in the Feature Coupling Unit (FCU), which fuses local features and global representations under different resolutions in an interactive fashion. Conformer adopts a concurrent structure so that local features and global representations are retained to the maximum extent. Experiments show that Conformer, under the comparable parameter complexity, outperforms the visual transformer (DeiT-B) by 2.3% on ImageNet. On MSCOCO, it outperforms ResNet-101 by 3.7% and 3.6% mAPs for object detection and instance segmentation, respectively, demonstrating the great potential to be a general backbone network.
<div align=center>
<img src="https://user-images.githubusercontent.com/26739999/144957687-926390ed-6119-4e4c-beaa-9bc0017fe953.png" width="90%"/>
</div>
## Results and models
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
| :-------------------: | :-------: | :------: | :-------: | :-------: | :---------------------------------------------------------------------: | :-----------------------------------------------------------------------: |
| Conformer-tiny-p16\* | 23.52 | 4.90 | 81.31 | 95.60 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-tiny-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-tiny-p16_3rdparty_8xb128_in1k_20211206-f6860372.pth) |
| Conformer-small-p32\* | 38.85 | 7.09 | 81.96 | 96.02 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-small-p32_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p32_8xb128_in1k_20211206-947a0816.pth) |
| Conformer-small-p16\* | 37.67 | 10.31 | 83.32 | 96.46 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-small-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p16_3rdparty_8xb128_in1k_20211206-3065dcf5.pth) |
| Conformer-base-p16\* | 83.29 | 22.89 | 83.82 | 96.59 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-base-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-base-p16_3rdparty_8xb128_in1k_20211206-bfdf8637.pth) |
*Models with * are converted from the [official repo](https://github.com/pengzhiliang/Conformer). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*
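To quickly sanity-check one of the converted checkpoints, mmclassification's high-level Python API can be used roughly as follows (paths assume the repository root and a locally downloaded checkpoint; the exact output keys may vary between versions):

```python
from mmcls.apis import inference_model, init_model

config = 'configs/conformer/conformer-tiny-p16_8xb128_in1k.py'
checkpoint = 'conformer-tiny-p16_3rdparty_8xb128_in1k_20211206-f6860372.pth'  # downloaded weights

model = init_model(config, checkpoint, device='cuda:0')  # or device='cpu'
result = inference_model(model, 'demo/demo.JPEG')  # any test image
print(result)  # typically includes pred_class, pred_label and pred_score
```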
## Citation
```bibtex
@article{peng2021conformer,
  title={Conformer: Local Features Coupling Global Representations for Visual Recognition},
  author={Zhiliang Peng and Wei Huang and Shanzhi Gu and Lingxi Xie and Yaowei Wang and Jianbin Jiao and Qixiang Ye},
  journal={arXiv preprint arXiv:2105.03889},
  year={2021},
}
```
_base_ = [
    '../_base_/models/conformer/base-p16.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_conformer.py',
    '../_base_/default_runtime.py'
]
data = dict(samples_per_gpu=128)
evaluation = dict(interval=1, metric='accuracy')
_base_ = [
    '../_base_/models/conformer/small-p16.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_conformer.py',
    '../_base_/default_runtime.py'
]
data = dict(samples_per_gpu=128)
evaluation = dict(interval=1, metric='accuracy')
_base_ = [
    '../_base_/models/conformer/small-p32.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_conformer.py',
    '../_base_/default_runtime.py'
]
data = dict(samples_per_gpu=128)
evaluation = dict(interval=1, metric='accuracy')
_base_ = [
    '../_base_/models/conformer/tiny-p16.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_conformer.py',
    '../_base_/default_runtime.py'
]
data = dict(samples_per_gpu=128)
evaluation = dict(interval=1, metric='accuracy')
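# Illustration (not part of the repo): each config above only lists its _base_
# files plus a couple of overrides; the full configuration is produced by
# merging them. The merged result can be inspected with MMCV, assuming the
# paths resolve from the repository root:
from mmcv import Config

cfg = Config.fromfile('configs/conformer/conformer-tiny-p16_8xb128_in1k.py')
print(cfg.data.samples_per_gpu)  # 128, the override set in this file
print(cfg.optimizer)             # AdamW settings inherited from the conformer schedule
print(cfg.pretty_text[:500])     # first part of the fully merged config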
Collections:
  - Name: Conformer
    Metadata:
      Training Data: ImageNet-1k
      Architecture:
        - Layer Normalization
        - Scaled Dot-Product Attention
        - Dropout
    Paper:
      URL: https://arxiv.org/abs/2105.03889
      Title: "Conformer: Local Features Coupling Global Representations for Visual Recognition"
    README: configs/conformer/README.md
    Code:
      URL: https://github.com/open-mmlab/mmclassification/blob/v0.19.0/mmcls/models/backbones/conformer.py
      Version: v0.19.0

Models:
  - Name: conformer-tiny-p16_3rdparty_8xb128_in1k
    In Collection: Conformer
    Config: configs/conformer/conformer-tiny-p16_8xb128_in1k.py
    Metadata:
      FLOPs: 4899611328
      Parameters: 23524704
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 81.31
          Top 5 Accuracy: 95.60
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/conformer/conformer-tiny-p16_3rdparty_8xb128_in1k_20211206-f6860372.pth
    Converted From:
      Weights: https://drive.google.com/file/d/19SxGhKcWOR5oQSxNUWUM2MGYiaWMrF1z/view?usp=sharing
      Code: https://github.com/pengzhiliang/Conformer/blob/main/models.py#L65
  - Name: conformer-small-p16_3rdparty_8xb128_in1k
    In Collection: Conformer
    Config: configs/conformer/conformer-small-p16_8xb128_in1k.py
    Metadata:
      FLOPs: 10311309312
      Parameters: 37673424
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 83.32
          Top 5 Accuracy: 96.46
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p16_3rdparty_8xb128_in1k_20211206-3065dcf5.pth
    Converted From:
      Weights: https://drive.google.com/file/d/1mpOlbLaVxOfEwV4-ha78j_1Ebqzj2B83/view?usp=sharing
      Code: https://github.com/pengzhiliang/Conformer/blob/main/models.py#L73
  - Name: conformer-small-p32_8xb128_in1k
    In Collection: Conformer
    Config: configs/conformer/conformer-small-p32_8xb128_in1k.py
    Metadata:
      FLOPs: 7087281792
      Parameters: 38853072
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 81.96
          Top 5 Accuracy: 96.02
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p32_8xb128_in1k_20211206-947a0816.pth
  - Name: conformer-base-p16_3rdparty_8xb128_in1k
    In Collection: Conformer
    Config: configs/conformer/conformer-base-p16_8xb128_in1k.py
    Metadata:
      FLOPs: 22892078080
      Parameters: 83289136
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 83.82
          Top 5 Accuracy: 96.59
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/conformer/conformer-base-p16_3rdparty_8xb128_in1k_20211206-bfdf8637.pth
    Converted From:
      Weights: https://drive.google.com/file/d/1oeQ9LSOGKEUaYGu7WTlUGl3KDsQIi0MA/view?usp=sharing
      Code: https://github.com/pengzhiliang/Conformer/blob/main/models.py#L89
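The metafile above follows the model-index YAML layout; for a quick listing of model names and their reported accuracy it can be read with plain PyYAML (the file name `configs/conformer/metafile.yml` is assumed from the upstream layout):

```python
import yaml

with open('configs/conformer/metafile.yml') as f:
    index = yaml.safe_load(f)

for model in index['Models']:
    top1 = model['Results'][0]['Metrics']['Top 1 Accuracy']
    print(f"{model['Name']}: top-1 {top1}%  ({model['Weights']})")
```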
# ConvMixer
> [Patches Are All You Need?](https://arxiv.org/abs/2201.09792)
<!-- [ALGORITHM] -->
## Abstract
<!-- [ABSTRACT] -->
Although convolutional networks have been the dominant architecture for vision tasks for many years, recent experiments have shown that Transformer-based models, most notably the Vision Transformer (ViT), may exceed their performance in some settings. However, due to the quadratic runtime of the self-attention layers in Transformers, ViTs require the use of patch embeddings, which group together small regions of the image into single input features, in order to be applied to larger image sizes. This raises a question: Is the performance of ViTs due to the inherently-more-powerful Transformer architecture, or is it at least partly due to using patches as the input representation? In this paper, we present some evidence for the latter: specifically, we propose the ConvMixer, an extremely simple model that is similar in spirit to the ViT and the even-more-basic MLP-Mixer in that it operates directly on patches as input, separates the mixing of spatial and channel dimensions, and maintains equal size and resolution throughout the network. In contrast, however, the ConvMixer uses only standard convolutions to achieve the mixing steps. Despite its simplicity, we show that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/42952108/156284977-abf2245e-d9ba-4e0d-8e10-c0664a20f4c8.png" width="100%"/>
</div>
## Results and models
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
| :-----------------: | :-------: | :------: | :-------: | :-------: | :----------------------------------------------------------------------: | :------------------------------------------------------------------------: |
| ConvMixer-768/32\* | 21.11 | 19.62 | 80.16 | 95.08 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convmixer/convmixer-768-32_10xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convmixer/convmixer-768-32_3rdparty_10xb64_in1k_20220323-bca1f7b8.pth) |
| ConvMixer-1024/20\* | 24.38 | 5.55 | 76.94 | 93.36 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convmixer/convmixer-1024-20_10xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convmixer/convmixer-1024-20_3rdparty_10xb64_in1k_20220323-48f8aeba.pth) |
| ConvMixer-1536/20\* | 51.63 | 48.71 | 81.37 | 95.61 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convmixer/convmixer-1536-20_10xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convmixer/convmixer-1536_20_3rdparty_10xb64_in1k_20220323-ea5786f3.pth) |
*Models with * are converted from the [official repo](https://github.com/locuslab/convmixer). The config files of these models are only for inference. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*
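The Params(M) column can be cross-checked by building a model from its config and counting parameters; a rough sketch (the count should land close to the table, up to how parameters are tallied):

```python
from mmcv import Config
from mmcls.models import build_classifier

cfg = Config.fromfile('configs/convmixer/convmixer-768-32_10xb64_in1k.py')
model = build_classifier(cfg.model)

n_params = sum(p.numel() for p in model.parameters())
print(f'{n_params / 1e6:.2f} M parameters')  # expected to be ~21.11 M for ConvMixer-768/32
```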
## Citation
```bibtex
@misc{trockman2022patches,
  title={Patches Are All You Need?},
  author={Asher Trockman and J. Zico Kolter},
  year={2022},
  eprint={2201.09792},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
_base_ = [
    '../_base_/models/convmixer/convmixer-1024-20.py',
    '../_base_/datasets/imagenet_bs64_convmixer_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py',
]
optimizer = dict(lr=0.01)
runner = dict(type='EpochBasedRunner', max_epochs=150)
_base_ = [
    '../_base_/models/convmixer/convmixer-1536-20.py',
    '../_base_/datasets/imagenet_bs64_convmixer_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py',
]
optimizer = dict(lr=0.01)
runner = dict(type='EpochBasedRunner', max_epochs=150)
_base_ = [
    '../_base_/models/convmixer/convmixer-768-32.py',
    '../_base_/datasets/imagenet_bs64_convmixer_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py',
]
optimizer = dict(lr=0.01)
runner = dict(type='EpochBasedRunner', max_epochs=300)
Collections:
  - Name: ConvMixer
    Metadata:
      Training Data: ImageNet-1k
      Architecture:
        - 1x1 Convolution
        - LayerScale
    Paper:
      URL: https://arxiv.org/abs/2201.09792
      Title: Patches Are All You Need?
    README: configs/convmixer/README.md

Models:
  - Name: convmixer-768-32_10xb64_in1k
    Metadata:
      FLOPs: 19623051264
      Parameters: 21110248
    In Collection: ConvMixer
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 80.16
          Top 5 Accuracy: 95.08
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/convmixer/convmixer-768-32_3rdparty_10xb64_in1k_20220323-bca1f7b8.pth
    Config: configs/convmixer/convmixer-768-32_10xb64_in1k.py
    Converted From:
      Weights: https://github.com/tmp-iclr/convmixer/releases/download/v1.0/convmixer_768_32_ks7_p7_relu.pth.tar
      Code: https://github.com/locuslab/convmixer
  - Name: convmixer-1024-20_10xb64_in1k
    Metadata:
      FLOPs: 5550112768
      Parameters: 24383464
    In Collection: ConvMixer
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 76.94
          Top 5 Accuracy: 93.36
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/convmixer/convmixer-1024-20_3rdparty_10xb64_in1k_20220323-48f8aeba.pth
    Config: configs/convmixer/convmixer-1024-20_10xb64_in1k.py
    Converted From:
      Weights: https://github.com/tmp-iclr/convmixer/releases/download/v1.0/convmixer_1024_20_ks9_p14.pth.tar
      Code: https://github.com/locuslab/convmixer
  - Name: convmixer-1536-20_10xb64_in1k
    Metadata:
      FLOPs: 48713170944
      Parameters: 51625960
    In Collection: ConvMixer
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 81.37
          Top 5 Accuracy: 95.61
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/convmixer/convmixer-1536_20_3rdparty_10xb64_in1k_20220323-ea5786f3.pth
    Config: configs/convmixer/convmixer-1536-20_10xb64_in1k.py
    Converted From:
      Weights: https://github.com/tmp-iclr/convmixer/releases/download/v1.0/convmixer_1536_20_ks9_p7.pth.tar
      Code: https://github.com/locuslab/convmixer
# ConvNeXt
> [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545v1)
<!-- [ALGORITHM] -->
## Abstract
<!-- [ABSTRACT] -->
The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model. A vanilla ViT, on the other hand, faces difficulties when applied to general computer vision tasks such as object detection and semantic segmentation. It is the hierarchical Transformers (e.g., Swin Transformers) that reintroduced several ConvNet priors, making Transformers practically viable as a generic vision backbone and demonstrating remarkable performance on a wide variety of vision tasks. However, the effectiveness of such hybrid approaches is still largely credited to the intrinsic superiority of Transformers, rather than the inherent inductive biases of convolutions. In this work, we reexamine the design spaces and test the limits of what a pure ConvNet can achieve. We gradually "modernize" a standard ResNet toward the design of a vision Transformer, and discover several key components that contribute to the performance difference along the way. The outcome of this exploration is a family of pure ConvNet models dubbed ConvNeXt. Constructed entirely from standard ConvNet modules, ConvNeXts compete favorably with Transformers in terms of accuracy and scalability, achieving 87.8% ImageNet top-1 accuracy and outperforming Swin Transformers on COCO detection and ADE20K segmentation, while maintaining the simplicity and efficiency of standard ConvNets.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/8370623/148624004-e9581042-ea4d-4e10-b3bd-42c92b02053b.png" width="100%"/>
</div>
## Results and models
### ImageNet-1k
| Model | Pretrain | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
| :-----------: | :----------: | :-------: | :------: | :-------: | :-------: | :-------------------------------------------------------------------: | :---------------------------------------------------------------------: |
| ConvNeXt-T\* | From scratch | 28.59 | 4.46 | 82.05 | 95.86 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convnext/convnext-tiny_32xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-tiny_3rdparty_32xb128_in1k_20220124-18abde00.pth) |
| ConvNeXt-S\* | From scratch | 50.22 | 8.69 | 83.13 | 96.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convnext/convnext-small_32xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-small_3rdparty_32xb128_in1k_20220124-d39b5192.pth) |
| ConvNeXt-B\* | From scratch | 88.59 | 15.36 | 83.85 | 96.74 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convnext/convnext-base_32xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-base_3rdparty_32xb128_in1k_20220124-d0915162.pth) |
| ConvNeXt-B\* | ImageNet-21k | 88.59 | 15.36 | 85.81 | 97.86 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convnext/convnext-base_32xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-base_in21k-pre-3rdparty_32xb128_in1k_20220124-eb2d6ada.pth) |
| ConvNeXt-L\* | From scratch | 197.77 | 34.37 | 84.30 | 96.89 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convnext/convnext-large_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-large_3rdparty_64xb64_in1k_20220124-f8a0ded0.pth) |
| ConvNeXt-L\* | ImageNet-21k | 197.77 | 34.37 | 86.61 | 98.04 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convnext/convnext-large_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-large_in21k-pre-3rdparty_64xb64_in1k_20220124-2412403d.pth) |
| ConvNeXt-XL\* | ImageNet-21k | 350.20 | 60.93 | 86.97 | 98.20 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/convnext/convnext-xlarge_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-xlarge_in21k-pre-3rdparty_64xb64_in1k_20220124-76b6863d.pth) |
*Models with * are converted from the [official repo](https://github.com/facebookresearch/ConvNeXt). The config files of these models are only for inference. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*
### Pre-trained Models
The pre-trained models on ImageNet-1k or ImageNet-21k can be used to fine-tune models on downstream tasks (see the config sketch after the table).
| Model | Training Data | Params(M) | Flops(G) | Download |
| :-----------: | :-----------: | :-------: | :------: | :-----------------------------------------------------------------------------------------------------------------------------------: |
| ConvNeXt-T\* | ImageNet-1k | 28.59 | 4.46 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-tiny_3rdparty_32xb128-noema_in1k_20220222-2908964a.pth) |
| ConvNeXt-S\* | ImageNet-1k | 50.22 | 8.69 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-small_3rdparty_32xb128-noema_in1k_20220222-fa001ca5.pth) |
| ConvNeXt-B\* | ImageNet-1k | 88.59 | 15.36 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-base_3rdparty_32xb128-noema_in1k_20220222-dba4f95f.pth) |
| ConvNeXt-B\* | ImageNet-21k | 88.59 | 15.36 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-base_3rdparty_in21k_20220124-13b83eec.pth) |
| ConvNeXt-L\* | ImageNet-21k | 197.77 | 34.37 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-large_3rdparty_in21k_20220124-41b5a79f.pth) |
| ConvNeXt-XL\* | ImageNet-21k | 350.20 | 60.93 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-xlarge_3rdparty_in21k_20220124-f909bad7.pth) |
*Models with * are converted from the [official repo](https://github.com/facebookresearch/ConvNeXt).*
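A minimal sketch of how one of the pre-trained checkpoints above could be plugged into a fine-tuning config via `init_cfg` (the file name, dataset base file, and class count are placeholders for your own downstream task; this is not a config shipped with the repo):

```python
# configs/convnext/my_convnext-base_in21k-pre_finetune.py  (hypothetical file)
_base_ = [
    '../_base_/models/convnext/convnext-base.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',   # replace with your dataset base
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py',
]

model = dict(
    backbone=dict(
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/convnext/convnext-base_3rdparty_in21k_20220124-13b83eec.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=1000),  # set to the number of classes of your task
)
```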
## Citation
```bibtex
@Article{liu2022convnet,
  author  = {Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
  title   = {A ConvNet for the 2020s},
  journal = {arXiv preprint arXiv:2201.03545},
  year    = {2022},
}
```
_base_ = [
    '../_base_/models/convnext/convnext-base.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=128)
optimizer = dict(lr=4e-3)
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
_base_ = [
    '../_base_/models/convnext/convnext-large.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=64)
optimizer = dict(lr=4e-3)
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
_base_ = [
    '../_base_/models/convnext/convnext-small.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=128)
optimizer = dict(lr=4e-3)
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
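# Illustration (not part of the original configs): the EMAHook above maintains
# an exponential moving average of the model weights with momentum=4e-5. A
# generic sketch of that update rule (not MMCV's exact hook implementation):
import torch


@torch.no_grad()
def ema_update(ema_model, model, momentum=4e-5):
    """Generic EMA step: ema <- (1 - momentum) * ema + momentum * current weights."""
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(1 - momentum).add_(p, alpha=momentum)


# Typical usage: keep a frozen copy of the model (e.g. copy.deepcopy(model)) and
# call ema_update(ema_model, model) after each optimizer step; evaluate with the
# averaged weights.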