# configs/hivit/hivit-base-p16_16xb64_in1k.py
_base_ = [
'../_base_/models/hivit/base_224.py',
'../_base_/datasets/imagenet_bs64_hivit_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_hivit.py',
'../_base_/default_runtime.py'
]
# schedule settings
optim_wrapper = dict(clip_grad=dict(max_norm=5.0))
# configs/hivit/hivit-small-p16_16xb64_in1k.py
_base_ = [
'../_base_/models/hivit/small_224.py',
'../_base_/datasets/imagenet_bs64_hivit_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_hivit.py',
'../_base_/default_runtime.py'
]
# schedule settings
optim_wrapper = dict(clip_grad=dict(max_norm=5.0))
# configs/hivit/hivit-tiny-p16_16xb64_in1k.py
_base_ = [
'../_base_/models/hivit/tiny_224.py',
'../_base_/datasets/imagenet_bs64_hivit_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_hivit.py',
'../_base_/default_runtime.py'
]
# schedule settings
optim_wrapper = dict(clip_grad=dict(max_norm=5.0))
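# All three HiViT configs above share the same gradient-clipping setting. A
# hedged sketch of what `clip_grad=dict(max_norm=5.0)` amounts to inside the
# optimizer wrapper (illustration only; the real call site lives in the
# wrapper, and `model`/`optimizer` here are placeholders):
#
#   import torch
#   # Rescale gradients so their global L2 norm is at most 5.0, then step.
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
#   optimizer.step()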
# configs/hivit/metafile.yml
Collections:
  - Name: HiViT
    Metadata:
      Architecture:
        - Dense Connections
        - Dropout
        - GELU
        - Layer Normalization
        - Multi-Head Attention
        - Scaled Dot-Product Attention
    Paper:
      Title: 'HiViT: A Simple and More Efficient Design of Hierarchical Vision Transformer'
      URL: https://arxiv.org/abs/2205.14949
    README: configs/hivit/README.md
    Code:
      URL: null
      Version: null

Models:
  - Name: hivit-tiny-p16_16xb64_in1k
    Metadata:
      FLOPs: 4603000000
      Parameters: 19181000
      Training Data:
        - ImageNet-1k
    In Collection: HiViT
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 82.1
        Task: Image Classification
    Weights:
    Config: configs/hivit/hivit-tiny-p16_16xb64_in1k.py
  - Name: hivit-small-p16_16xb64_in1k
    Metadata:
      FLOPs: 9072000000
      Parameters: 37526000
      Training Data:
        - ImageNet-1k
    In Collection: HiViT
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy:
        Task: Image Classification
    Weights:
    Config: configs/hivit/hivit-small-p16_16xb64_in1k.py
  - Name: hivit-base-p16_16xb64_in1k
    Metadata:
      FLOPs: 18474000000
      Parameters: 79051000
      Training Data:
        - ImageNet-1k
    In Collection: HiViT
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy:
        Task: Image Classification
    Weights:
    Config: configs/hivit/hivit-base-p16_16xb64_in1k.py
# HorNet
> [HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions](https://arxiv.org/abs/2207.14284)
<!-- [ALGORITHM] -->
## Abstract
Recent progress in vision Transformers exhibits great success in various tasks driven by the new spatial modeling mechanism based on dot-product self-attention. In this paper, we show that the key ingredients behind the vision Transformers, namely input-adaptive, long-range and high-order spatial interactions, can also be efficiently implemented with a convolution-based framework. We present the Recursive Gated Convolution (gnConv) that performs high-order spatial interactions with gated convolutions and recursive designs. The new operation is highly flexible and customizable, which is compatible with various variants of convolution and extends the two-order interactions in self-attention to arbitrary orders without introducing significant extra computation. gnConv can serve as a plug-and-play module to improve various vision Transformers and convolution-based models. Based on the operation, we construct a new family of generic vision backbones named HorNet. Extensive experiments on ImageNet classification, COCO object detection and ADE20K semantic segmentation show that HorNet outperforms Swin Transformers and ConvNeXt by a significant margin with similar overall architecture and training configurations. HorNet also shows favorable scalability to more training data and a larger model size. Apart from the effectiveness in visual encoders, we also show that gnConv can be applied to task-specific decoders and consistently improve dense prediction performance with less computation. Our results demonstrate that gnConv can be a new basic module for visual modeling that effectively combines the merits of both vision Transformers and CNNs. Code is available at https://github.com/raoyongming/HorNet.
<div align=center>
<img src="https://user-images.githubusercontent.com/24734142/188356236-b8e3db94-eaa6-48e9-b323-15e5ba7f2991.png" width="80%"/>
</div>
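
The recursive design is compact enough to sketch directly. The snippet below is a simplified, self-contained rendition of gnConv based on the description above and the [official repo](https://github.com/raoyongming/HorNet); the exact channel schedule, scaling factor and global-filter variants of the real implementation are omitted, so treat it as an illustration rather than the library code.

```python
import torch
import torch.nn as nn


class GnConvSketch(nn.Module):
    """Minimal sketch of recursive gated convolution (gnConv)."""

    def __init__(self, dim: int, order: int = 3):
        super().__init__()
        # Channel width doubles at each interaction order: dim/2^(order-1), ..., dim.
        self.dims = [dim // 2 ** i for i in range(order)][::-1]
        self.proj_in = nn.Conv2d(dim, 2 * dim, kernel_size=1)
        # A single depth-wise conv provides the spatial context for every order.
        total = sum(self.dims)
        self.dwconv = nn.Conv2d(total, total, kernel_size=7, padding=3, groups=total)
        self.pws = nn.ModuleList([
            nn.Conv2d(self.dims[i], self.dims[i + 1], kernel_size=1)
            for i in range(order - 1)
        ])
        self.proj_out = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate, context = torch.split(
            self.proj_in(x), [self.dims[0], sum(self.dims)], dim=1)
        contexts = torch.split(self.dwconv(context), self.dims, dim=1)
        out = gate * contexts[0]  # first-order gated interaction
        for pw, ctx in zip(self.pws, contexts[1:]):
            out = pw(out) * ctx  # recurse to the next interaction order
        return self.proj_out(out)


x = torch.rand(1, 64, 56, 56)
print(GnConvSketch(dim=64, order=3)(x).shape)  # torch.Size([1, 64, 56, 56])
```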
## How to use it?
<!-- [TABS-BEGIN] -->
**Predict image**
```python
from mmpretrain import inference_model
predict = inference_model('hornet-tiny_3rdparty_in1k', 'demo/bird.JPEG')
print(predict['pred_class'])
print(predict['pred_score'])
```
**Use the model**
```python
import torch
from mmpretrain import get_model
model = get_model('hornet-tiny_3rdparty_in1k', pretrained=True)
inputs = torch.rand(1, 3, 224, 224)
out = model(inputs)
print(type(out))
# To extract features.
feats = model.extract_feat(inputs)
print(type(feats))
```
**Test Command**
Prepare your dataset according to the [docs](https://mmpretrain.readthedocs.io/en/latest/user_guides/dataset_prepare.html#prepare-dataset).
Test:
```shell
python tools/test.py configs/hornet/hornet-tiny_8xb128_in1k.py https://download.openmmlab.com/mmclassification/v0/hornet/hornet-tiny_3rdparty_in1k_20220915-0e8eedff.pth
```
<!-- [TABS-END] -->
## Models and results
### Image Classification on ImageNet-1k
| Model | Pretrain | Params (M) | Flops (G) | Top-1 (%) | Top-5 (%) | Config | Download |
| :-------------------------------- | :----------: | :--------: | :-------: | :-------: | :-------: | :-------------------------------------: | :-----------------------------------------------------------------------------: |
| `hornet-tiny_3rdparty_in1k`\* | From scratch | 22.41 | 3.98 | 82.84 | 96.24 | [config](hornet-tiny_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hornet/hornet-tiny_3rdparty_in1k_20220915-0e8eedff.pth) |
| `hornet-tiny-gf_3rdparty_in1k`\* | From scratch | 22.99 | 3.90 | 82.98 | 96.38 | [config](hornet-tiny-gf_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hornet/hornet-tiny-gf_3rdparty_in1k_20220915-4c35a66b.pth) |
| `hornet-small_3rdparty_in1k`\* | From scratch | 49.53 | 8.83 | 83.79 | 96.75 | [config](hornet-small_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hornet/hornet-small_3rdparty_in1k_20220915-5935f60f.pth) |
| `hornet-small-gf_3rdparty_in1k`\* | From scratch | 50.40 | 8.71 | 83.98 | 96.77 | [config](hornet-small-gf_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hornet/hornet-small-gf_3rdparty_in1k_20220915-649ca492.pth) |
| `hornet-base_3rdparty_in1k`\* | From scratch | 87.26 | 15.58 | 84.24 | 96.94 | [config](hornet-base_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hornet/hornet-base_3rdparty_in1k_20220915-a06176bb.pth) |
| `hornet-base-gf_3rdparty_in1k`\* | From scratch | 88.42 | 15.42 | 84.32 | 96.95 | [config](hornet-base-gf_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hornet/hornet-base-gf_3rdparty_in1k_20220915-82c06fa7.pth) |
*Models with \* are converted from the [official repo](https://github.com/raoyongming/HorNet). The config files of these models are only for inference. We haven't reproduced the training results.*
## Citation
```bibtex
@article{rao2022hornet,
title={HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions},
author={Rao, Yongming and Zhao, Wenliang and Tang, Yansong and Zhou, Jie and Lim, Ser-Nam and Lu, Jiwen},
journal={arXiv preprint arXiv:2207.14284},
year={2022}
}
```
# configs/hornet/hornet-base-gf_8xb64_in1k.py
_base_ = [
'../_base_/models/hornet/hornet-base-gf.py',
'../_base_/datasets/imagenet_bs64_swin_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_swin.py',
'../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=64)
optim_wrapper = dict(optimizer=dict(lr=4e-3), clip_grad=dict(max_norm=1.0))
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
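# The `EMAHook` above maintains an exponential moving average of the model
# weights for evaluation. A sketch of the per-iteration update it is expected
# to apply (assuming MMEngine's default ExponentialMovingAverage; `momentum`
# is the 4e-5 value from the hook config):
#
#   ema_param = (1 - momentum) * ema_param + momentum * param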
# configs/hornet/hornet-base_8xb64_in1k.py
_base_ = [
'../_base_/models/hornet/hornet-base.py',
'../_base_/datasets/imagenet_bs64_swin_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_swin.py',
'../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=64)
optim_wrapper = dict(optimizer=dict(lr=4e-3), clip_grad=dict(max_norm=5.0))
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
# configs/hornet/hornet-small-gf_8xb64_in1k.py
_base_ = [
'../_base_/models/hornet/hornet-small-gf.py',
'../_base_/datasets/imagenet_bs64_swin_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_swin.py',
'../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=64)
optim_wrapper = dict(optimizer=dict(lr=4e-3), clip_grad=dict(max_norm=1.0))
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
# configs/hornet/hornet-small_8xb64_in1k.py
_base_ = [
'../_base_/models/hornet/hornet-small.py',
'../_base_/datasets/imagenet_bs64_swin_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_swin.py',
'../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=64)
optim_wrapper = dict(optimizer=dict(lr=4e-3), clip_grad=dict(max_norm=5.0))
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
# configs/hornet/hornet-tiny-gf_8xb128_in1k.py
_base_ = [
'../_base_/models/hornet/hornet-tiny-gf.py',
'../_base_/datasets/imagenet_bs64_swin_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_swin.py',
'../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=128)
optim_wrapper = dict(optimizer=dict(lr=4e-3), clip_grad=dict(max_norm=1.0))
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
# configs/hornet/hornet-tiny_8xb128_in1k.py
_base_ = [
'../_base_/models/hornet/hornet-tiny.py',
'../_base_/datasets/imagenet_bs64_swin_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_swin.py',
'../_base_/default_runtime.py',
]
data = dict(samples_per_gpu=128)
optim_wrapper = dict(optimizer=dict(lr=4e-3), clip_grad=dict(max_norm=100.0))
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
# configs/hornet/metafile.yml
Collections:
  - Name: HorNet
    Metadata:
      Training Data: ImageNet-1k
      Training Techniques:
        - AdamW
        - Weight Decay
      Architecture:
        - HorNet
        - gnConv
    Paper:
      URL: https://arxiv.org/abs/2207.14284
      Title: "HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions"
    README: configs/hornet/README.md
    Code:
      Version: v0.24.0
      URL: https://github.com/open-mmlab/mmpretrain/blob/v0.24.0/mmcls/models/backbones/hornet.py

Models:
  - Name: hornet-tiny_3rdparty_in1k
    Metadata:
      FLOPs: 3976156352 # 3.98G
      Parameters: 22409512 # 22.41M
    In Collection: HorNet
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 82.84
          Top 5 Accuracy: 96.24
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/hornet/hornet-tiny_3rdparty_in1k_20220915-0e8eedff.pth
    Config: configs/hornet/hornet-tiny_8xb128_in1k.py
    Converted From:
      Code: https://github.com/raoyongming/HorNet
      Weights: https://cloud.tsinghua.edu.cn/f/1ca970586c6043709a3f/?dl=1
  - Name: hornet-tiny-gf_3rdparty_in1k
    Metadata:
      FLOPs: 3896472160 # 3.9G
      Parameters: 22991848 # 22.99M
    In Collection: HorNet
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 82.98
          Top 5 Accuracy: 96.38
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/hornet/hornet-tiny-gf_3rdparty_in1k_20220915-4c35a66b.pth
    Config: configs/hornet/hornet-tiny-gf_8xb128_in1k.py
    Converted From:
      Code: https://github.com/raoyongming/HorNet
      Weights: https://cloud.tsinghua.edu.cn/f/511faad0bde94dfcaa54/?dl=1
  - Name: hornet-small_3rdparty_in1k
    Metadata:
      FLOPs: 8825621280 # 8.83G
      Parameters: 49528264 # 49.53M
    In Collection: HorNet
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 83.79
          Top 5 Accuracy: 96.75
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/hornet/hornet-small_3rdparty_in1k_20220915-5935f60f.pth
    Config: configs/hornet/hornet-small_8xb64_in1k.py
    Converted From:
      Code: https://github.com/raoyongming/HorNet
      Weights: https://cloud.tsinghua.edu.cn/f/46422799db2941f7b684/?dl=1
  - Name: hornet-small-gf_3rdparty_in1k
    Metadata:
      FLOPs: 8706094992 # 8.71G
      Parameters: 50401768 # 50.4M
    In Collection: HorNet
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 83.98
          Top 5 Accuracy: 96.77
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/hornet/hornet-small-gf_3rdparty_in1k_20220915-649ca492.pth
    Config: configs/hornet/hornet-small-gf_8xb64_in1k.py
    Converted From:
      Code: https://github.com/raoyongming/HorNet
      Weights: https://cloud.tsinghua.edu.cn/f/8405c984bf084d2ba85a/?dl=1
  - Name: hornet-base_3rdparty_in1k
    Metadata:
      FLOPs: 15582677376 # 15.59G
      Parameters: 87256680 # 87.26M
    In Collection: HorNet
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 84.24
          Top 5 Accuracy: 96.94
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/hornet/hornet-base_3rdparty_in1k_20220915-a06176bb.pth
    Config: configs/hornet/hornet-base_8xb64_in1k.py
    Converted From:
      Code: https://github.com/raoyongming/HorNet
      Weights: https://cloud.tsinghua.edu.cn/f/5c86cb3d655d4c17a959/?dl=1
  - Name: hornet-base-gf_3rdparty_in1k
    Metadata:
      FLOPs: 15423308992 # 15.42G
      Parameters: 88421352 # 88.42M
    In Collection: HorNet
    Results:
      - Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 84.32
          Top 5 Accuracy: 96.95
        Task: Image Classification
    Weights: https://download.openmmlab.com/mmclassification/v0/hornet/hornet-base-gf_3rdparty_in1k_20220915-82c06fa7.pth
    Config: configs/hornet/hornet-base-gf_8xb64_in1k.py
    Converted From:
      Code: https://github.com/raoyongming/HorNet
      Weights: https://cloud.tsinghua.edu.cn/f/6c84935e63b547f383fb/?dl=1
# HRNet
> [Deep High-Resolution Representation Learning for Visual Recognition](https://arxiv.org/abs/1908.07919v2)
<!-- [ALGORITHM] -->
## Abstract
High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions *in series* (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named as High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) Connect the high-to-low resolution convolution streams *in parallel*; (ii) Repeatedly exchange the information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that the HRNet is a stronger backbone for computer vision problems.
<div align=center>
<img src="https://user-images.githubusercontent.com/26739999/149920446-cbe05670-989d-4fe6-accc-df20ae2984eb.png" width="100%"/>
</div>
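
The core idea — parallel streams that repeatedly trade information — can be illustrated with a toy exchange unit. The sketch below assumes just two resolutions and uses a strided 3×3 convolution for high-to-low and a 1×1 convolution plus bilinear upsampling for low-to-high, roughly following the paper's fusion scheme; the actual backbone stacks many such units across up to four streams.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExchangeUnitSketch(nn.Module):
    """Toy two-stream version of HRNet's cross-resolution fusion."""

    def __init__(self, c_high: int, c_low: int):
        super().__init__()
        # High-res -> low-res: strided conv halves the spatial size.
        self.down = nn.Conv2d(c_high, c_low, kernel_size=3, stride=2, padding=1)
        # Low-res -> high-res: match channels, then upsample.
        self.up = nn.Conv2d(c_low, c_high, kernel_size=1)

    def forward(self, x_high: torch.Tensor, x_low: torch.Tensor):
        up = F.interpolate(self.up(x_low), size=x_high.shape[2:],
                           mode='bilinear', align_corners=False)
        # Each stream keeps its own resolution but absorbs the other's features.
        return x_high + up, x_low + self.down(x_high)


x_high, x_low = torch.rand(1, 32, 56, 56), torch.rand(1, 64, 28, 28)
y_high, y_low = ExchangeUnitSketch(32, 64)(x_high, x_low)
print(y_high.shape, y_low.shape)  # (1, 32, 56, 56) and (1, 64, 28, 28)
```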
## How to use it?
<!-- [TABS-BEGIN] -->
**Predict image**
```python
from mmpretrain import inference_model
predict = inference_model('hrnet-w18_3rdparty_8xb32_in1k', 'demo/bird.JPEG')
print(predict['pred_class'])
print(predict['pred_score'])
```
**Use the model**
```python
import torch
from mmpretrain import get_model
model = get_model('hrnet-w18_3rdparty_8xb32_in1k', pretrained=True)
inputs = torch.rand(1, 3, 224, 224)
out = model(inputs)
print(type(out))
# To extract features.
feats = model.extract_feat(inputs)
print(type(feats))
```
**Test Command**
Prepare your dataset according to the [docs](https://mmpretrain.readthedocs.io/en/latest/user_guides/dataset_prepare.html#prepare-dataset).
Test:
```shell
python tools/test.py configs/hrnet/hrnet-w18_4xb32_in1k.py https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w18_3rdparty_8xb32_in1k_20220120-0c10b180.pth
```
<!-- [TABS-END] -->
## Models and results
### Image Classification on ImageNet-1k
| Model | Pretrain | Params (M) | Flops (G) | Top-1 (%) | Top-5 (%) | Config | Download |
| :------------------------------------- | :----------: | :--------: | :-------: | :-------: | :-------: | :-------------------------------: | :------------------------------------------------------------------------------: |
| `hrnet-w18_3rdparty_8xb32_in1k`\* | From scratch | 21.30 | 4.33 | 76.75 | 93.44 | [config](hrnet-w18_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w18_3rdparty_8xb32_in1k_20220120-0c10b180.pth) |
| `hrnet-w30_3rdparty_8xb32_in1k`\* | From scratch | 37.71 | 8.17 | 78.19 | 94.22 | [config](hrnet-w30_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w30_3rdparty_8xb32_in1k_20220120-8aa3832f.pth) |
| `hrnet-w32_3rdparty_8xb32_in1k`\* | From scratch | 41.23 | 8.99 | 78.44 | 94.19 | [config](hrnet-w32_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w32_3rdparty_8xb32_in1k_20220120-c394f1ab.pth) |
| `hrnet-w40_3rdparty_8xb32_in1k`\* | From scratch | 57.55 | 12.77 | 78.94 | 94.47 | [config](hrnet-w40_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w40_3rdparty_8xb32_in1k_20220120-9a2dbfc5.pth) |
| `hrnet-w44_3rdparty_8xb32_in1k`\* | From scratch | 67.06 | 14.96 | 78.88 | 94.37 | [config](hrnet-w44_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w44_3rdparty_8xb32_in1k_20220120-35d07f73.pth) |
| `hrnet-w48_3rdparty_8xb32_in1k`\* | From scratch | 77.47 | 17.36 | 79.32 | 94.52 | [config](hrnet-w48_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w48_3rdparty_8xb32_in1k_20220120-e555ef50.pth) |
| `hrnet-w64_3rdparty_8xb32_in1k`\* | From scratch | 128.06 | 29.00 | 79.46 | 94.65 | [config](hrnet-w64_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w64_3rdparty_8xb32_in1k_20220120-19126642.pth) |
| `hrnet-w18_3rdparty_8xb32-ssld_in1k`\* | From scratch | 21.30 | 4.33 | 81.06 | 95.70 | [config](hrnet-w18_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w18_3rdparty_8xb32-ssld_in1k_20220120-455f69ea.pth) |
| `hrnet-w48_3rdparty_8xb32-ssld_in1k`\* | From scratch | 77.47 | 17.36 | 83.63 | 96.79 | [config](hrnet-w48_4xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/hrnet/hrnet-w48_3rdparty_8xb32-ssld_in1k_20220120-d0459c38.pth) |
*Models with \* are converted from the [official repo](https://github.com/HRNet/HRNet-Image-Classification). The config files of these models are only for inference. We haven't reproduced the training results.*
## Citation
```bibtex
@article{WangSCJDZLMTWLX19,
title={Deep High-Resolution Representation Learning for Visual Recognition},
author={Jingdong Wang and Ke Sun and Tianheng Cheng and
Borui Jiang and Chaorui Deng and Yang Zhao and Dong Liu and Yadong Mu and
Mingkui Tan and Xinggang Wang and Wenyu Liu and Bin Xiao},
journal={TPAMI},
year={2019}
}
```
# configs/hrnet/hrnet-w18_4xb32_in1k.py
_base_ = [
'../_base_/models/hrnet/hrnet-w18.py',
'../_base_/datasets/imagenet_bs32_pil_resize.py',
'../_base_/schedules/imagenet_bs256_coslr.py',
'../_base_/default_runtime.py'
]
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
# base_batch_size = (4 GPUs) x (32 samples per GPU)
auto_scale_lr = dict(base_batch_size=128)
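# A sketch of the linear scaling rule applied when automatic LR scaling is
# enabled at launch (assuming MMEngine's default behavior; `base_lr` is the
# value from the schedule file and 128 is the `base_batch_size` above):
#
#   scaled_lr = base_lr * actual_batch_size / 128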
# configs/hrnet/hrnet-w30_4xb32_in1k.py
_base_ = [
'../_base_/models/hrnet/hrnet-w30.py',
'../_base_/datasets/imagenet_bs32_pil_resize.py',
'../_base_/schedules/imagenet_bs256_coslr.py',
'../_base_/default_runtime.py'
]
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
# base_batch_size = (4 GPUs) x (32 samples per GPU)
auto_scale_lr = dict(base_batch_size=128)
# configs/hrnet/hrnet-w32_4xb32_in1k.py
_base_ = [
'../_base_/models/hrnet/hrnet-w32.py',
'../_base_/datasets/imagenet_bs32_pil_resize.py',
'../_base_/schedules/imagenet_bs256_coslr.py',
'../_base_/default_runtime.py'
]
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
# base_batch_size = (4 GPUs) x (32 samples per GPU)
auto_scale_lr = dict(base_batch_size=128)
# configs/hrnet/hrnet-w40_4xb32_in1k.py
_base_ = [
'../_base_/models/hrnet/hrnet-w40.py',
'../_base_/datasets/imagenet_bs32_pil_resize.py',
'../_base_/schedules/imagenet_bs256_coslr.py',
'../_base_/default_runtime.py'
]
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
# base_batch_size = (4 GPUs) x (32 samples per GPU)
auto_scale_lr = dict(base_batch_size=128)
# configs/hrnet/hrnet-w44_4xb32_in1k.py
_base_ = [
'../_base_/models/hrnet/hrnet-w44.py',
'../_base_/datasets/imagenet_bs32_pil_resize.py',
'../_base_/schedules/imagenet_bs256_coslr.py',
'../_base_/default_runtime.py'
]
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
# base_batch_size = (4 GPUs) x (32 samples per GPU)
auto_scale_lr = dict(base_batch_size=128)
# configs/hrnet/hrnet-w48_4xb32_in1k.py
_base_ = [
'../_base_/models/hrnet/hrnet-w48.py',
'../_base_/datasets/imagenet_bs32_pil_resize.py',
'../_base_/schedules/imagenet_bs256_coslr.py',
'../_base_/default_runtime.py'
]
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
# base_batch_size = (4 GPUs) x (32 samples per GPU)
auto_scale_lr = dict(base_batch_size=128)
# configs/hrnet/hrnet-w64_4xb32_in1k.py
_base_ = [
'../_base_/models/hrnet/hrnet-w64.py',
'../_base_/datasets/imagenet_bs32_pil_resize.py',
'../_base_/schedules/imagenet_bs256_coslr.py',
'../_base_/default_runtime.py'
]
# NOTE: `auto_scale_lr` is for automatically scaling LR
# based on the actual training batch size.
# base_batch_size = (4 GPUs) x (32 samples per GPU)
auto_scale_lr = dict(base_batch_size=128)