"...git@developer.sourcefind.cn:yangql/googletest.git" did not exist on "10f05a627c2da8d7de78da1b08f984ce8de398fb"
Commit 59c80aa2 authored by PRC-Huang, committed by zhe chen

release classification

parent 2cab2294
@@ -30,9 +30,9 @@ Deformable Convolutions](https://arxiv.org/abs/2211.05778).
ADE20K, outperforming previous models by a large margin.
## Coming soon
-- [ ] TensorRT inference.
 - [ ] Other downstream tasks.
-- [ ] Classification code of the InternImage series.
+- [x] TensorRT inference.
+- [x] Classification code of the InternImage series.
 - [x] InternImage-T/S/B/L/XL ImageNet-1k pretrained model.
 - [x] InternImage-L/XL ImageNet-22k pretrained model.
 - [x] InternImage-T/S/B/L/XL detection and instance segmentation model.
@@ -89,13 +89,13 @@ to reduce the strict inductive bias. Our model makes it possible to learn more
## Main Results of FPS
-| name | resolution | #params | FLOPs | Batch 1 FPS(PyTorch) | Batch 1 FPS(TensorRT) |
-| :------------: | :--------: | :-----: | :---: | :------------------: | :-------------------: |
-| InternImage-T | 224x224 | 30M | 5G | 44 | 156 |
-| InternImage-S | 224x224 | 50M | 8G | 40 | 129 |
-| InternImage-B | 224x224 | 97M | 16G | 40 | 116 |
-| InternImage-L | 384x384 | 223M | 108G | 40 | 56 |
-| InternImage-XL | 384x384 | 335M | 163G | 32 | 47 |
+| name | resolution | #params | FLOPs | Batch 1 FPS(TensorRT) |
+| :------------: | :--------: | :-----: | :---: | :-------------------: |
+| InternImage-T | 224x224 | 30M | 5G | 156 |
+| InternImage-S | 224x224 | 50M | 8G | 129 |
+| InternImage-B | 224x224 | 97M | 16G | 116 |
+| InternImage-L | 384x384 | 223M | 108G | 56 |
+| InternImage-XL | 384x384 | 335M | 163G | 47 |
## Citation
# InternImage for Image Classification
This folder contains the implementation of InternImage for image classification.
## Usage
### Install
- Clone this repo:
```bash
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
```
- Create a conda virtual environment and activate it:
```bash
conda create -n internimage python=3.7 -y
conda activate internimage
```
- Install `CUDA>=10.2` with `cudnn>=7` following
the [official installation instructions](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html)
- Install `PyTorch>=1.8.0` and `torchvision>=0.9.0` with `CUDA>=10.2`.
For example, to install torch==1.11 with CUDA==11.3:
```bash
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
```
- Install `timm==0.6.11`, `mmcv-full==1.5.0`, and `mmdet==2.28.1`:
```bash
pip install -U openmim
mim install mmcv-full==1.5.0
pip install timm==0.6.11 mmdet==2.28.1
```
- Install other requirements:
```bash
pip install opencv-python termcolor yacs pyyaml scipy
```
- Compile the CUDA operators:
```bash
cd ./ops_dcnv3
sh ./make.sh
# unit test (all checks should print True)
python test.py
```
### Data preparation
We use the standard ImageNet dataset, which you can download from http://image-net.org/. We support the following two ways to
load data:
- For a standard folder dataset, move the validation images into labeled sub-folders. The file structure should look like:
```bash
$ tree data
imagenet
├── train
│ ├── class1
│ │ ├── img1.jpeg
│ │ ├── img2.jpeg
│ │ └── ...
│ ├── class2
│ │ ├── img3.jpeg
│ │ └── ...
│ └── ...
└── val
├── class1
│ ├── img4.jpeg
│ ├── img5.jpeg
│ └── ...
├── class2
│ ├── img6.jpeg
│ └── ...
└── ...
```
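Folder-style datasets assign integer labels by sorting the class sub-folder names; a minimal, stand-alone sketch of that convention (the `find_classes` name here is illustrative, echoing a similar helper in this repo's loader code):

```python
import os
import tempfile

def find_classes(root):
    # Sort the sub-folder names and map each to an integer label,
    # as torchvision-style folder datasets do.
    classes = sorted(d for d in os.listdir(root)
                     if os.path.isdir(os.path.join(root, d)))
    return {name: idx for idx, name in enumerate(classes)}

# Build a tiny stand-in for the layout above and inspect the mapping.
root = tempfile.mkdtemp()
for name in ['class2', 'class1']:
    os.makedirs(os.path.join(root, 'train', name))
print(find_classes(os.path.join(root, 'train')))  # {'class1': 0, 'class2': 1}
```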
- To avoid the slow reads caused by massive numbers of small files, we also support zipped ImageNet, which includes
four files:
- `train.zip`, `val.zip`: the zipped folders for the train and validation splits.
- `train.txt`, `val.txt`: the relative path of each image inside the corresponding zip file, together with its ground-truth
label. Make sure the data folder looks like this:
```bash
$ tree data
data
└── ImageNet-Zip
├── train_map.txt
├── train.zip
├── val_map.txt
└── val.zip
$ head -n 5 meta_data/val.txt
ILSVRC2012_val_00000001.JPEG 65
ILSVRC2012_val_00000002.JPEG 970
ILSVRC2012_val_00000003.JPEG 230
ILSVRC2012_val_00000004.JPEG 809
ILSVRC2012_val_00000005.JPEG 516
$ head -n 5 meta_data/train.txt
n01440764/n01440764_10026.JPEG 0
n01440764/n01440764_10027.JPEG 0
n01440764/n01440764_10029.JPEG 0
n01440764/n01440764_10040.JPEG 0
n01440764/n01440764_10042.JPEG 0
```
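Each line of a map file pairs an image path (relative to the zip) with its integer label; a small sketch of parsing such lines (`parse_annotations` is an illustrative name, not this repo's API — the repo's own reader splits on tabs):

```python
def parse_annotations(lines):
    # Each line holds a path relative to the zip file and an integer
    # label, separated by whitespace.
    samples = []
    for line in lines:
        path, label = line.split()
        samples.append((path, int(label)))
    return samples

lines = [
    'ILSVRC2012_val_00000001.JPEG 65',
    'ILSVRC2012_val_00000002.JPEG 970',
]
print(parse_annotations(lines))
```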
- For ImageNet-22K dataset, make a folder named `fall11_whole` and move all images to labeled sub-folders in this
folder. Then download the train-val split
file ([ILSVRC2011fall_whole_map_train.txt](https://github.com/SwinTransformer/storage/releases/download/v2.0.1/ILSVRC2011fall_whole_map_train.txt)
& [ILSVRC2011fall_whole_map_val.txt](https://github.com/SwinTransformer/storage/releases/download/v2.0.1/ILSVRC2011fall_whole_map_val.txt))
and put them in the parent directory of `fall11_whole`. The file structure should look like:
```bash
$ tree imagenet22k/
imagenet22k/
└── fall11_whole
├── n00004475
├── n00005787
├── n00006024
├── n00006484
└── ...
```
### Evaluation
To evaluate a pre-trained `InternImage` on ImageNet val, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345 main.py --eval \
--cfg <config-file> --resume <checkpoint> --data-path <imagenet-path>
```
For example, to evaluate `InternImage-B` with a single GPU:
```bash
python -m torch.distributed.launch --nproc_per_node 1 --master_port 12345 main.py --eval \
--cfg configs/internimage_b_1k_224.yaml --resume internimage_b_1k_224.pth --data-path <imagenet-path>
```
### Training from scratch on ImageNet-1K
To train an `InternImage` on ImageNet from scratch, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345 main.py \
--cfg <config-file> --data-path <imagenet-path> [--batch-size <batch-size-per-gpu> --output <output-directory> --tag <job-tag>]
```
### Manage jobs with srun
For example, to run `InternImage` models with 8 GPUs on a single node (the commands below resume pretrained checkpoints and evaluate them):
`InternImage-T`:
```bash
GPUS=8 sh train_in1k.sh <partition> <job-name> configs/internimage_t_1k_224.yaml --resume internimage_t_1k_224.pth --eval
```
`InternImage-S`:
```bash
GPUS=8 sh train_in1k.sh <partition> <job-name> configs/internimage_s_1k_224.yaml --resume internimage_s_1k_224.pth --eval
```
`InternImage-XL`:
```bash
GPUS=8 sh train_in1k.sh <partition> <job-name> configs/internimage_xl_22kto1k_384.yaml --resume internimage_xl_22kto1k_384.pth --eval
```
<!--
### Test pretrained model on ImageNet-22K
For example, to evaluate the `InternImage-L-22k`:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345 main.py \
--cfg configs/internimage_xl_22k_192to384.yaml --data-path <imagenet-path> [--batch-size <batch-size-per-gpu> --output <output-directory>] \
--resume internimage_xl_22k_192to384.pth --eval
``` -->
<!-- ### Fine-tuning from a ImageNet-22K pre-trained model
For example, to fine-tune an `InternImage-XL-22k` model pre-trained on ImageNet-22K:
```bash
GPUS=8 sh train_in1k.sh <partition> <job-name> configs/intern_image_.yaml --pretrained intern_image_b.pth --eval
python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345 main.py \
--cfg configs/.yaml --pretrained swin_base_patch4_window7_224_22k.pth \
--data-path <imagenet-path> --batch-size 64 --accumulation-steps 2 [--use-checkpoint]
``` -->
### Export
To export `InternImage-T` from PyTorch to ONNX, run:
```shell
python export.py --model_name internimage_t_1k_224 --ckpt_dir /path/to/ckpt/dir --onnx
```
To export `InternImage-T` from PyTorch to TensorRT, run:
```shell
python export.py --model_name internimage_t_1k_224 --ckpt_dir /path/to/ckpt/dir --trt
```
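The exported ONNX or TensorRT model consumes the same normalized input as the PyTorch model; a hedged sketch of that preprocessing for a 224x224 RGB image, assuming the default `AUG.MEAN`/`AUG.STD` values from the config in this repo:

```python
import numpy as np

def preprocess(img_uint8):
    # HxWx3 uint8 RGB -> 1x3xHxW float32, normalized with the ImageNet
    # mean/std used in this repo's config (AUG.MEAN / AUG.STD).
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    x = img_uint8.astype(np.float32) / 255.0
    x = (x - mean) / std
    return x.transpose(2, 0, 1)[None]

batch = preprocess(np.zeros((224, 224, 3), dtype=np.uint8))
print(batch.shape)  # (1, 3, 224, 224)
```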
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import yaml
from yacs.config import CfgNode as CN
_C = CN()
# Base config files
_C.BASE = ['']
# -----------------------------------------------------------------------------
# Data settings
# -----------------------------------------------------------------------------
_C.DATA = CN()
# Batch size for a single GPU, could be overwritten by command line argument
_C.DATA.BATCH_SIZE = 128
# Path to dataset, could be overwritten by command line argument
_C.DATA.DATA_PATH = ''
# Dataset name
_C.DATA.DATASET = 'imagenet'
# Input image size
_C.DATA.IMG_SIZE = 224
# Interpolation to resize image (random, bilinear, bicubic)
_C.DATA.INTERPOLATION = 'bicubic'
# Use zipped dataset instead of folder dataset
# could be overwritten by command line argument
_C.DATA.ZIP_MODE = False
# Cache Data in Memory, could be overwritten by command line argument
_C.DATA.CACHE_MODE = 'part'
# Pin CPU memory in DataLoader for more efficient (sometimes) transfer to GPU.
_C.DATA.PIN_MEMORY = True
# Number of data loading threads
_C.DATA.NUM_WORKERS = 8
# Load data to memory
_C.DATA.IMG_ON_MEMORY = False
# -----------------------------------------------------------------------------
# Model settings
# -----------------------------------------------------------------------------
_C.MODEL = CN()
# Model type
_C.MODEL.TYPE = 'INTERN_IMAGE'
# Model name
_C.MODEL.NAME = 'intern_image'
# Pretrained weight from checkpoint, could be imagenet22k pretrained weight
# could be overwritten by command line argument
_C.MODEL.PRETRAINED = ''
# Checkpoint to resume, could be overwritten by command line argument
_C.MODEL.RESUME = ''
# Number of classes, overwritten in data preparation
_C.MODEL.NUM_CLASSES = 1000
# Dropout rate
_C.MODEL.DROP_RATE = 0.0
# Drop path rate
_C.MODEL.DROP_PATH_RATE = 0.1
# Drop path type
_C.MODEL.DROP_PATH_TYPE = 'linear' # linear, uniform
# Label Smoothing
_C.MODEL.LABEL_SMOOTHING = 0.1
# INTERN_IMAGE parameters
_C.MODEL.INTERN_IMAGE = CN()
_C.MODEL.INTERN_IMAGE.DEPTHS = [4, 4, 18, 4]
_C.MODEL.INTERN_IMAGE.GROUPS = [4, 8, 16, 32]
_C.MODEL.INTERN_IMAGE.CHANNELS = 64
_C.MODEL.INTERN_IMAGE.LAYER_SCALE = None
_C.MODEL.INTERN_IMAGE.OFFSET_SCALE = 1.0
_C.MODEL.INTERN_IMAGE.MLP_RATIO = 4.0
_C.MODEL.INTERN_IMAGE.CORE_OP = 'DCNv3'
_C.MODEL.INTERN_IMAGE.POST_NORM = False
# -----------------------------------------------------------------------------
# Training settings
# -----------------------------------------------------------------------------
_C.TRAIN = CN()
_C.TRAIN.START_EPOCH = 0
_C.TRAIN.EPOCHS = 300
_C.TRAIN.WARMUP_EPOCHS = 20
_C.TRAIN.WEIGHT_DECAY = 0.05
_C.TRAIN.BASE_LR = 5e-4
_C.TRAIN.WARMUP_LR = 5e-7
_C.TRAIN.MIN_LR = 5e-6
# Clip gradient norm
_C.TRAIN.CLIP_GRAD = 5.0
# Auto resume from latest checkpoint
_C.TRAIN.AUTO_RESUME = True
# Gradient accumulation steps
# could be overwritten by command line argument
_C.TRAIN.ACCUMULATION_STEPS = 0
# Whether to use gradient checkpointing to save memory
# could be overwritten by command line argument
_C.TRAIN.USE_CHECKPOINT = False
# LR scheduler
_C.TRAIN.LR_SCHEDULER = CN()
_C.TRAIN.LR_SCHEDULER.NAME = 'cosine'
# Epoch interval to decay LR, used in StepLRScheduler
_C.TRAIN.LR_SCHEDULER.DECAY_EPOCHS = 30
# LR decay rate, used in StepLRScheduler
_C.TRAIN.LR_SCHEDULER.DECAY_RATE = 0.1
# Optimizer
_C.TRAIN.OPTIMIZER = CN()
_C.TRAIN.OPTIMIZER.NAME = 'adamw'
# Optimizer Epsilon
_C.TRAIN.OPTIMIZER.EPS = 1e-8
# Optimizer Betas
_C.TRAIN.OPTIMIZER.BETAS = (0.9, 0.999)
# SGD momentum
_C.TRAIN.OPTIMIZER.MOMENTUM = 0.9
# ZeRO
_C.TRAIN.OPTIMIZER.USE_ZERO = False
# freeze backbone
_C.TRAIN.OPTIMIZER.FREEZE_BACKBONE = None
# dcn lr
_C.TRAIN.OPTIMIZER.DCN_LR_MUL = None
# EMA
_C.TRAIN.EMA = CN()
_C.TRAIN.EMA.ENABLE = False
_C.TRAIN.EMA.DECAY = 0.9998
# LR_LAYER_DECAY
_C.TRAIN.LR_LAYER_DECAY = False
_C.TRAIN.LR_LAYER_DECAY_RATIO = 0.875
# FT head init weights
_C.TRAIN.RAND_INIT_FT_HEAD = False
# -----------------------------------------------------------------------------
# Augmentation settings
# -----------------------------------------------------------------------------
_C.AUG = CN()
# Color jitter factor
_C.AUG.COLOR_JITTER = 0.4
# Use AutoAugment policy. "v0" or "original"
_C.AUG.AUTO_AUGMENT = 'rand-m9-mstd0.5-inc1'
# Random erase prob
_C.AUG.REPROB = 0.25
# Random erase mode
_C.AUG.REMODE = 'pixel'
# Random erase count
_C.AUG.RECOUNT = 1
# Mixup alpha, mixup enabled if > 0
_C.AUG.MIXUP = 0.8
# Cutmix alpha, cutmix enabled if > 0
_C.AUG.CUTMIX = 1.0
# Cutmix min/max ratio, overrides alpha and enables cutmix if set
_C.AUG.CUTMIX_MINMAX = None
# Probability of performing mixup or cutmix when either/both is enabled
_C.AUG.MIXUP_PROB = 1.0
# Probability of switching to cutmix when both mixup and cutmix enabled
_C.AUG.MIXUP_SWITCH_PROB = 0.5
# How to apply mixup/cutmix params. Per "batch", "pair", or "elem"
_C.AUG.MIXUP_MODE = 'batch'
# RandomResizedCrop
_C.AUG.RANDOM_RESIZED_CROP = False
_C.AUG.MEAN = (0.485, 0.456, 0.406)
_C.AUG.STD = (0.229, 0.224, 0.225)
# -----------------------------------------------------------------------------
# Testing settings
# -----------------------------------------------------------------------------
_C.TEST = CN()
# Whether to use center crop when testing
_C.TEST.CROP = True
# Whether to use SequentialSampler as validation sampler
_C.TEST.SEQUENTIAL = False
# -----------------------------------------------------------------------------
# Misc
# -----------------------------------------------------------------------------
# Mixed precision opt level, if O0, no amp is used ('O0', 'O1', 'O2')
# overwritten by command line argument
_C.AMP_OPT_LEVEL = ''
# Path to output folder, overwritten by command line argument
_C.OUTPUT = ''
# Tag of experiment, overwritten by command line argument
_C.TAG = 'default'
# Frequency to save checkpoint
_C.SAVE_FREQ = 1
# Frequency to logging info
_C.PRINT_FREQ = 10
# eval freq
_C.EVAL_FREQ = 1
# Fixed random seed
_C.SEED = 0
# Perform evaluation only, overwritten by command line argument
_C.EVAL_MODE = False
# Test throughput only, overwritten by command line argument
_C.THROUGHPUT_MODE = False
# local rank for DistributedDataParallel, given by command line argument
_C.LOCAL_RANK = 0
_C.EVAL_22K_TO_1K = False
_C.AMP_TYPE = 'float16'
def _update_config_from_file(config, cfg_file):
config.defrost()
with open(cfg_file, 'r') as f:
yaml_cfg = yaml.load(f, Loader=yaml.FullLoader)
for cfg in yaml_cfg.setdefault('BASE', ['']):
if cfg:
_update_config_from_file(
config, os.path.join(os.path.dirname(cfg_file), cfg))
print('=> merge config from {}'.format(cfg_file))
config.merge_from_file(cfg_file)
config.freeze()
def update_config(config, args):
_update_config_from_file(config, args.cfg)
config.defrost()
if hasattr(args, 'opts') and args.opts:
config.merge_from_list(args.opts)
# merge from specific arguments
if hasattr(args, 'batch_size') and args.batch_size:
config.DATA.BATCH_SIZE = args.batch_size
if hasattr(args, 'dataset') and args.dataset:
config.DATA.DATASET = args.dataset
if hasattr(args, 'data_path') and args.data_path:
config.DATA.DATA_PATH = args.data_path
if hasattr(args, 'zip') and args.zip:
config.DATA.ZIP_MODE = True
if hasattr(args, 'cache_mode') and args.cache_mode:
config.DATA.CACHE_MODE = args.cache_mode
if hasattr(args, 'pretrained') and args.pretrained:
config.MODEL.PRETRAINED = args.pretrained
if hasattr(args, 'resume') and args.resume:
config.MODEL.RESUME = args.resume
if hasattr(args, 'accumulation_steps') and args.accumulation_steps:
config.TRAIN.ACCUMULATION_STEPS = args.accumulation_steps
if hasattr(args, 'use_checkpoint') and args.use_checkpoint:
config.TRAIN.USE_CHECKPOINT = True
if hasattr(args, 'amp_opt_level') and args.amp_opt_level:
config.AMP_OPT_LEVEL = args.amp_opt_level
if hasattr(args, 'output') and args.output:
config.OUTPUT = args.output
if hasattr(args, 'tag') and args.tag:
config.TAG = args.tag
if hasattr(args, 'eval') and args.eval:
config.EVAL_MODE = True
if hasattr(args, 'throughput') and args.throughput:
config.THROUGHPUT_MODE = True
if hasattr(args, 'save_ckpt_num') and args.save_ckpt_num:
config.SAVE_CKPT_NUM = args.save_ckpt_num
if hasattr(args, 'use_zero') and args.use_zero:
config.TRAIN.OPTIMIZER.USE_ZERO = True
# set local rank for distributed training
if hasattr(args, 'local_rank') and args.local_rank:
config.LOCAL_RANK = args.local_rank
# output folder
config.MODEL.NAME = args.cfg.split('/')[-1].replace('.yaml', '')
config.OUTPUT = os.path.join(config.OUTPUT, config.MODEL.NAME)
# config.OUTPUT = os.path.join(config.OUTPUT, config.MODEL.NAME, config.TAG)
config.freeze()
def get_config(args):
"""Get a yacs CfgNode object with default values."""
# Return a clone so that the defaults will not be altered
# This is for the "local variable" use pattern
config = _C.clone()
update_config(config, args)
return config
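The `BASE` mechanism above merges every parent config first and then the file's own keys, so child values win; a stdlib-only sketch of that merge order (dict-based and purely illustrative — the real code works on yacs `CfgNode`s loaded from YAML files):

```python
def merge_with_base(configs, name):
    # Recursively apply every config listed under BASE first, then the
    # file's own keys, mirroring _update_config_from_file above.
    # `configs` maps a config name to its dict of settings.
    cfg = configs[name]
    merged = {}
    for base in cfg.get('BASE', ['']):
        if base:
            merged.update(merge_with_base(configs, base))
    merged.update({k: v for k, v in cfg.items() if k != 'BASE'})
    return merged

configs = {
    'base.yaml': {'EPOCHS': 300, 'BASE_LR': 5e-4},
    'finetune.yaml': {'BASE': ['base.yaml'], 'EPOCHS': 20},
}
print(merge_with_base(configs, 'finetune.yaml'))  # {'EPOCHS': 20, 'BASE_LR': 0.0005}
```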
DATA:
IMG_ON_MEMORY: True
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.5
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [4, 4, 21, 4]
GROUPS: [7, 14, 28, 56]
CHANNELS: 112
LAYER_SCALE: 1e-5
OFFSET_SCALE: 1.0
MLP_RATIO: 4.0
POST_NORM: True
TRAIN:
EMA:
ENABLE: True
DECAY: 0.9999
BASE_LR: 5e-4
DATA:
IMG_SIZE: 384
IMG_ON_MEMORY: True
AUG:
MIXUP: 0.0
CUTMIX: 0.0
REPROB: 0.0
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.1
LABEL_SMOOTHING: 0.3
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [5, 5, 22, 5]
GROUPS: [10, 20, 40, 80]
CHANNELS: 160
LAYER_SCALE: 1e-5
OFFSET_SCALE: 2.0
MLP_RATIO: 4.0
POST_NORM: True
TRAIN:
EMA:
ENABLE: true
DECAY: 0.9999
EPOCHS: 20
WARMUP_EPOCHS: 2
WEIGHT_DECAY: 0.05
BASE_LR: 2e-05 # 512
WARMUP_LR: .0
MIN_LR: .0
LR_LAYER_DECAY: true
LR_LAYER_DECAY_RATIO: 0.9
USE_CHECKPOINT: true
OPTIMIZER:
DCN_LR_MUL: 0.1
AMP_OPT_LEVEL: O0
EVAL_FREQ: 1
DATA:
IMG_ON_MEMORY: True
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.4
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [4, 4, 21, 4]
GROUPS: [5, 10, 20, 40]
CHANNELS: 80
LAYER_SCALE: 1e-5
OFFSET_SCALE: 1.0
MLP_RATIO: 4.0
POST_NORM: True
TRAIN:
EMA:
ENABLE: True
DECAY: 0.9999
BASE_LR: 5e-4
DATA:
IMG_ON_MEMORY: True
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.1
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [4, 4, 18, 4]
GROUPS: [4, 8, 16, 32]
CHANNELS: 64
OFFSET_SCALE: 1.0
MLP_RATIO: 4.0
TRAIN:
EMA:
ENABLE: True
DECAY: 0.9999
BASE_LR: 5e-4
DATA:
IMG_SIZE: 384
IMG_ON_MEMORY: True
AUG:
MIXUP: 0.0
CUTMIX: 0.0
REPROB: 0.0
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.2
LABEL_SMOOTHING: 0.3
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [5, 5, 24, 5]
GROUPS: [12, 24, 48, 96]
CHANNELS: 192
LAYER_SCALE: 1e-5
OFFSET_SCALE: 2.0
MLP_RATIO: 4.0
POST_NORM: True
TRAIN:
EMA:
ENABLE: true
DECAY: 0.9999
EPOCHS: 20
WARMUP_EPOCHS: 2
WEIGHT_DECAY: 0.05
BASE_LR: 2e-05 # 512
WARMUP_LR: .0
MIN_LR: .0
LR_LAYER_DECAY: true
LR_LAYER_DECAY_RATIO: 0.9
USE_CHECKPOINT: true
OPTIMIZER:
DCN_LR_MUL: 0.1
AMP_OPT_LEVEL: O0
EVAL_FREQ: 1
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
from .build import build_loader
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import torch
import numpy as np
import torch.distributed as dist
from torchvision import transforms
from timm.data import Mixup
from timm.data import create_transform
from .cached_image_folder import ImageCephDataset
from .samplers import SubsetRandomSampler, NodeDistributedSampler
try:
from torchvision.transforms import InterpolationMode
def _pil_interp(method):
if method == 'bicubic':
return InterpolationMode.BICUBIC
elif method == 'lanczos':
return InterpolationMode.LANCZOS
elif method == 'hamming':
return InterpolationMode.HAMMING
else:
return InterpolationMode.BILINEAR
except ImportError:
# older torchvision: fall back to timm's interpolation helper
from timm.data.transforms import _pil_interp
class TTA(torch.nn.Module):
def __init__(self, size, scales=[1.0, 1.05, 1.1]):
super().__init__()
self.size = size
self.scales = scales
def forward(self, img):
out = []
cc = transforms.CenterCrop(self.size)
for scale in self.scales:
size_ = int(scale * self.size)
rs = transforms.Resize(size_, interpolation=_pil_interp('bicubic'))
img_ = rs(img)
img_ = cc(img_)
out.append(img_)
return out
def __repr__(self) -> str:
return f"{self.__class__.__name__}(size={self.size}, scale={self.scales})"
def build_loader(config):
config.defrost()
dataset_train, config.MODEL.NUM_CLASSES = build_dataset('train',
config=config)
config.freeze()
print(f"local rank {config.LOCAL_RANK} / global rank {dist.get_rank()}"
"successfully build train dataset")
dataset_val, _ = build_dataset('val', config=config)
print(f"local rank {config.LOCAL_RANK} / global rank {dist.get_rank()}"
"successfully build val dataset")
dataset_test, _ = build_dataset('test', config=config)
print(f"local rank {config.LOCAL_RANK} / global rank {dist.get_rank()}"
"successfully build test dataset")
num_tasks = dist.get_world_size()
global_rank = dist.get_rank()
if dataset_train is not None:
if config.DATA.IMG_ON_MEMORY:
sampler_train = NodeDistributedSampler(dataset_train)
else:
if config.DATA.ZIP_MODE and config.DATA.CACHE_MODE == 'part':
indices = np.arange(dist.get_rank(), len(dataset_train),
dist.get_world_size())
sampler_train = SubsetRandomSampler(indices)
else:
sampler_train = torch.utils.data.DistributedSampler(
dataset_train,
num_replicas=num_tasks,
rank=global_rank,
shuffle=True)
if dataset_val is not None:
if config.TEST.SEQUENTIAL:
sampler_val = torch.utils.data.SequentialSampler(dataset_val)
else:
sampler_val = torch.utils.data.distributed.DistributedSampler(
dataset_val, shuffle=False)
if dataset_test is not None:
if config.TEST.SEQUENTIAL:
sampler_test = torch.utils.data.SequentialSampler(dataset_test)
else:
sampler_test = torch.utils.data.distributed.DistributedSampler(
dataset_test, shuffle=False)
data_loader_train = torch.utils.data.DataLoader(
dataset_train,
sampler=sampler_train,
batch_size=config.DATA.BATCH_SIZE,
num_workers=config.DATA.NUM_WORKERS,
pin_memory=config.DATA.PIN_MEMORY,
drop_last=True,
persistent_workers=True) if dataset_train is not None else None
data_loader_val = torch.utils.data.DataLoader(
dataset_val,
sampler=sampler_val,
batch_size=config.DATA.BATCH_SIZE,
shuffle=False,
num_workers=config.DATA.NUM_WORKERS,
pin_memory=config.DATA.PIN_MEMORY,
drop_last=False,
persistent_workers=True) if dataset_val is not None else None
data_loader_test = torch.utils.data.DataLoader(
dataset_test,
sampler=sampler_test,
batch_size=config.DATA.BATCH_SIZE,
shuffle=False,
num_workers=config.DATA.NUM_WORKERS,
pin_memory=config.DATA.PIN_MEMORY,
drop_last=False,
persistent_workers=True) if dataset_test is not None else None
# setup mixup / cutmix
mixup_fn = None
mixup_active = config.AUG.MIXUP > 0 or config.AUG.CUTMIX > 0. or config.AUG.CUTMIX_MINMAX is not None
if mixup_active:
mixup_fn = Mixup(mixup_alpha=config.AUG.MIXUP,
cutmix_alpha=config.AUG.CUTMIX,
cutmix_minmax=config.AUG.CUTMIX_MINMAX,
prob=config.AUG.MIXUP_PROB,
switch_prob=config.AUG.MIXUP_SWITCH_PROB,
mode=config.AUG.MIXUP_MODE,
label_smoothing=config.MODEL.LABEL_SMOOTHING,
num_classes=config.MODEL.NUM_CLASSES)
return dataset_train, dataset_val, dataset_test, data_loader_train, \
data_loader_val, data_loader_test, mixup_fn
def build_dataset(split, config):
transform = build_transform(split == 'train', config)
dataset = None
nb_classes = None
prefix = split
if config.DATA.DATASET == 'imagenet':
if prefix == 'train' and not config.EVAL_MODE:
root = os.path.join(config.DATA.DATA_PATH, 'train')
dataset = ImageCephDataset(root,
'train',
transform=transform,
on_memory=config.DATA.IMG_ON_MEMORY)
elif prefix == 'val':
root = os.path.join(config.DATA.DATA_PATH, 'val')
dataset = ImageCephDataset(root, 'val', transform=transform)
nb_classes = 1000
elif config.DATA.DATASET == 'imagenet22K':
if prefix == 'train':
if not config.EVAL_MODE:
root = config.DATA.DATA_PATH
dataset = ImageCephDataset(root,
'train',
transform=transform,
on_memory=config.DATA.IMG_ON_MEMORY)
nb_classes = 21841
elif prefix == 'val':
root = os.path.join(config.DATA.DATA_PATH, 'val')
dataset = ImageCephDataset(root, 'val', transform=transform)
nb_classes = 1000
else:
raise NotImplementedError(
f'build_dataset does not support {config.DATA.DATASET}')
return dataset, nb_classes
def build_transform(is_train, config):
resize_im = config.DATA.IMG_SIZE > 32
if is_train:
# this should always dispatch to transforms_imagenet_train
transform = create_transform(
input_size=config.DATA.IMG_SIZE,
is_training=True,
color_jitter=config.AUG.COLOR_JITTER
if config.AUG.COLOR_JITTER > 0 else None,
auto_augment=config.AUG.AUTO_AUGMENT
if config.AUG.AUTO_AUGMENT != 'none' else None,
re_prob=config.AUG.REPROB,
re_mode=config.AUG.REMODE,
re_count=config.AUG.RECOUNT,
interpolation=config.DATA.INTERPOLATION,
)
if not resize_im:
# replace RandomResizedCropAndInterpolation with
# RandomCrop
transform.transforms[0] = transforms.RandomCrop(
config.DATA.IMG_SIZE, padding=4)
return transform
t = []
if resize_im:
if config.TEST.CROP:
size = int(1.0 * config.DATA.IMG_SIZE)
t.append(
transforms.Resize(size,
interpolation=_pil_interp(
config.DATA.INTERPOLATION)),
# to maintain same ratio w.r.t. 224 images
)
t.append(transforms.CenterCrop(config.DATA.IMG_SIZE))
elif config.AUG.RANDOM_RESIZED_CROP:
t.append(
transforms.RandomResizedCrop(
(config.DATA.IMG_SIZE, config.DATA.IMG_SIZE),
interpolation=_pil_interp(config.DATA.INTERPOLATION)))
else:
t.append(
transforms.Resize(
(config.DATA.IMG_SIZE, config.DATA.IMG_SIZE),
interpolation=_pil_interp(config.DATA.INTERPOLATION)))
t.append(transforms.ToTensor())
t.append(transforms.Normalize(config.AUG.MEAN, config.AUG.STD))
return transforms.Compose(t)
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import io
import os
import re
import time
import json
import math
import mmcv
import torch
import logging
import os.path as osp
from PIL import Image
from tqdm import tqdm, trange
from abc import abstractmethod
import torch.utils.data as data
import torch.distributed as dist
from mmcv.fileio import FileClient
from .zipreader import is_zip_path, ZipReader
_logger = logging.getLogger(__name__)
_ERROR_RETRY = 50
def has_file_allowed_extension(filename, extensions):
"""Checks if a file is an allowed extension.
Args:
filename (string): path to a file
Returns:
bool: True if the filename ends with a known image extension
"""
filename_lower = filename.lower()
return any(filename_lower.endswith(ext) for ext in extensions)
def find_classes(dir):
classes = [
d for d in os.listdir(dir) if os.path.isdir(os.path.join(dir, d))
]
classes.sort()
class_to_idx = {classes[i]: i for i in range(len(classes))}
return classes, class_to_idx
def make_dataset(dir, class_to_idx, extensions):
images = []
dir = os.path.expanduser(dir)
for target in sorted(os.listdir(dir)):
d = os.path.join(dir, target)
if not os.path.isdir(d):
continue
for root, _, fnames in sorted(os.walk(d)):
for fname in sorted(fnames):
if has_file_allowed_extension(fname, extensions):
path = os.path.join(root, fname)
item = (path, class_to_idx[target])
images.append(item)
return images
def make_dataset_with_ann(ann_file, img_prefix, extensions):
images = []
with open(ann_file, "r") as f:
contents = f.readlines()
for line_str in contents:
path_contents = [c for c in line_str.split('\t')]
im_file_name = path_contents[0]
class_index = int(path_contents[1])
assert str.lower(os.path.splitext(im_file_name)[-1]) in extensions
item = (os.path.join(img_prefix, im_file_name), class_index)
images.append(item)
return images
class DatasetFolder(data.Dataset):
"""A generic data loader where the samples are arranged in this way: ::
root/class_x/xxx.ext
root/class_x/xxy.ext
root/class_x/xxz.ext
root/class_y/123.ext
root/class_y/nsdf3.ext
root/class_y/asd932_.ext
Args:
root (string): Root directory path.
loader (callable): A function to load a sample given its path.
extensions (list[string]): A list of allowed extensions.
transform (callable, optional): A function/transform that takes in
a sample and returns a transformed version.
E.g, ``transforms.RandomCrop`` for images.
target_transform (callable, optional): A function/transform that takes
in the target and transforms it.
Attributes:
samples (list): List of (sample path, class_index) tuples
"""
def __init__(self,
root,
loader,
extensions,
ann_file='',
img_prefix='',
transform=None,
target_transform=None,
cache_mode="no"):
# image folder mode
if ann_file == '':
_, class_to_idx = find_classes(root)
samples = make_dataset(root, class_to_idx, extensions)
# zip mode
else:
samples = make_dataset_with_ann(os.path.join(root, ann_file),
os.path.join(root, img_prefix),
extensions)
if len(samples) == 0:
raise (RuntimeError("Found 0 files in subfolders of: " + root +
"\n" + "Supported extensions are: " +
",".join(extensions)))
self.root = root
self.loader = loader
self.extensions = extensions
self.samples = samples
self.labels = [y_1k for _, y_1k in samples]
self.classes = list(set(self.labels))
self.transform = transform
self.target_transform = target_transform
self.cache_mode = cache_mode
if self.cache_mode != "no":
self.init_cache()
def init_cache(self):
assert self.cache_mode in ["part", "full"]
n_sample = len(self.samples)
global_rank = dist.get_rank()
world_size = dist.get_world_size()
samples_bytes = [None for _ in range(n_sample)]
start_time = time.time()
for index in range(n_sample):
if index % (n_sample // 10) == 0:
t = time.time() - start_time
print(
f'global_rank {dist.get_rank()} cached {index}/{n_sample} takes {t:.2f}s per block'
)
start_time = time.time()
path, target = self.samples[index]
if self.cache_mode == "full":
samples_bytes[index] = (ZipReader.read(path), target)
elif self.cache_mode == "part" and index % world_size == global_rank:
samples_bytes[index] = (ZipReader.read(path), target)
else:
samples_bytes[index] = (path, target)
self.samples = samples_bytes
def __getitem__(self, index):
"""
Args:
index (int): Index
Returns:
tuple: (sample, target) where target is class_index of the target class.
"""
path, target = self.samples[index]
sample = self.loader(path)
if self.transform is not None:
sample = self.transform(sample)
if self.target_transform is not None:
target = self.target_transform(target)
return sample, target
def __len__(self):
return len(self.samples)
def __repr__(self):
fmt_str = 'Dataset ' + self.__class__.__name__ + '\n'
fmt_str += ' Number of datapoints: {}\n'.format(self.__len__())
fmt_str += ' Root Location: {}\n'.format(self.root)
tmp = ' Transforms (if any): '
fmt_str += '{0}{1}\n'.format(
tmp,
self.transform.__repr__().replace('\n', '\n' + ' ' * len(tmp)))
tmp = ' Target Transforms (if any): '
fmt_str += '{0}{1}'.format(
tmp,
self.target_transform.__repr__().replace('\n',
'\n' + ' ' * len(tmp)))
return fmt_str
IMG_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif']
def pil_loader(path):
# open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835)
if isinstance(path, bytes):
img = Image.open(io.BytesIO(path))
elif is_zip_path(path):
data = ZipReader.read(path)
img = Image.open(io.BytesIO(data))
else:
with open(path, 'rb') as f:
img = Image.open(f)
return img.convert('RGB')
return img.convert('RGB')
def accimage_loader(path):
import accimage
try:
return accimage.Image(path)
except IOError:
# Potentially a decoding problem, fall back to PIL.Image
return pil_loader(path)
def default_img_loader(path):
from torchvision import get_image_backend
if get_image_backend() == 'accimage':
return accimage_loader(path)
else:
return pil_loader(path)
class CachedImageFolder(DatasetFolder):
"""A generic data loader where the images are arranged in this way: ::
root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png
root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png
Args:
root (string): Root directory path.
transform (callable, optional): A function/transform that takes in an PIL image
and returns a transformed version. E.g, ``transforms.RandomCrop``
target_transform (callable, optional): A function/transform that takes in the
target and transforms it.
loader (callable, optional): A function to load an image given its path.
Attributes:
imgs (list): List of (image path, class_index) tuples
"""
def __init__(self,
root,
ann_file='',
img_prefix='',
transform=None,
target_transform=None,
loader=default_img_loader,
cache_mode="no"):
super(CachedImageFolder,
self).__init__(root,
loader,
IMG_EXTENSIONS,
ann_file=ann_file,
img_prefix=img_prefix,
transform=transform,
target_transform=target_transform,
cache_mode=cache_mode)
self.imgs = self.samples
def __getitem__(self, index):
"""
Args:
index (int): Index
Returns:
tuple: (image, target) where target is class_index of the target class.
"""
path, target = self.samples[index]
image = self.loader(path)
if self.transform is not None:
img = self.transform(image)
else:
img = image
if self.target_transform is not None:
target = self.target_transform(target)
return img, target
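The `CachedImageFolder` docstring describes a `root/<class>/<image>` layout mapped into `(path, class_index)` tuples. A hypothetical sketch of that mapping, scanning an in-memory dict instead of the filesystem (a real loader walks the directory tree):

```python
# Build (path, class_index) samples from a class -> filenames layout.
def make_samples(layout):
    classes = sorted(layout)                      # deterministic class order
    class_to_idx = {c: i for i, c in enumerate(classes)}
    samples = []
    for cls in classes:
        for fname in sorted(layout[cls]):
            samples.append((f'{cls}/{fname}', class_to_idx[cls]))
    return samples, class_to_idx

samples, class_to_idx = make_samples({
    'dog': ['xxx.png', 'xxy.png'],
    'cat': ['123.png'],
})
print(class_to_idx)   # {'cat': 0, 'dog': 1}
print(samples[0])     # ('cat/123.png', 0)
```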
class ImageCephDataset(data.Dataset):
def __init__(self,
root,
split,
parser=None,
transform=None,
target_transform=None,
on_memory=False):
        # both ImageNet-22k and ImageNet-1k annotations live under meta_data/
        annotation_root = 'meta_data/'

if parser is None or isinstance(parser, str):
parser = ParserCephImage(root=root,
split=split,
annotation_root=annotation_root,
on_memory=on_memory)
self.parser = parser
self.transform = transform
self.target_transform = target_transform
self._consecutive_errors = 0
def __getitem__(self, index):
img, target = self.parser[index]
self._consecutive_errors = 0
if self.transform is not None:
img = self.transform(img)
if target is None:
target = -1
elif self.target_transform is not None:
target = self.target_transform(target)
return img, target
def __len__(self):
return len(self.parser)
def filename(self, index, basename=False, absolute=False):
return self.parser.filename(index, basename, absolute)
def filenames(self, basename=False, absolute=False):
return self.parser.filenames(basename, absolute)
class Parser:
def __init__(self):
pass
@abstractmethod
def _filename(self, index, basename=False, absolute=False):
pass
def filename(self, index, basename=False, absolute=False):
return self._filename(index, basename=basename, absolute=absolute)
def filenames(self, basename=False, absolute=False):
return [
self._filename(index, basename=basename, absolute=absolute)
for index in range(len(self))
]
class ParserCephImage(Parser):
def __init__(self,
root,
split,
annotation_root,
on_memory=False,
**kwargs):
super().__init__()
self.file_client = None
self.kwargs = kwargs
self.root = root # dataset:s3://imagenet22k
if '22k' in root:
self.io_backend = 'petrel'
with open(osp.join(annotation_root, '22k_class_to_idx.json'),
'r') as f:
self.class_to_idx = json.loads(f.read())
with open(osp.join(annotation_root, '22k_label.txt'), 'r') as f:
self.samples = f.read().splitlines()
else:
self.io_backend = 'disk'
self.class_to_idx = None
with open(osp.join(annotation_root, f'{split}.txt'), 'r') as f:
self.samples = f.read().splitlines()
local_rank = None
local_size = None
self._consecutive_errors = 0
self.on_memory = on_memory
if on_memory:
self.holder = {}
if local_rank is None:
local_rank = int(os.environ.get('LOCAL_RANK', 0))
if local_size is None:
local_size = int(os.environ.get('LOCAL_SIZE', 1))
self.local_rank = local_rank
self.local_size = local_size
self.rank = int(os.environ["RANK"])
self.world_size = int(os.environ['WORLD_SIZE'])
self.num_replicas = int(os.environ['WORLD_SIZE'])
self.num_parts = local_size
self.num_samples = int(
math.ceil(len(self.samples) * 1.0 / self.num_replicas))
self.total_size = self.num_samples * self.num_replicas
self.total_size_parts = self.num_samples * self.num_replicas // self.num_parts
self.load_onto_memory_v2()
def load_onto_memory(self):
print("Loading images onto memory...", self.local_rank,
self.local_size)
if self.file_client is None:
self.file_client = FileClient(self.io_backend, **self.kwargs)
for index in trange(len(self.samples)):
if index % self.local_size != self.local_rank:
continue
path, _ = self.samples[index].split(' ')
path = osp.join(self.root, path)
img_bytes = self.file_client.get(path)
self.holder[path] = img_bytes
print("Loading complete!")
def load_onto_memory_v2(self):
# print("Loading images onto memory...", self.local_rank, self.local_size)
t = torch.Generator()
t.manual_seed(0)
indices = torch.randperm(len(self.samples), generator=t).tolist()
# indices = range(len(self.samples))
indices = [i for i in indices if i % self.num_parts == self.local_rank]
# add extra samples to make it evenly divisible
indices += indices[:(self.total_size_parts - len(indices))]
assert len(indices) == self.total_size_parts
# subsample
indices = indices[self.rank // self.num_parts:self.
total_size_parts:self.num_replicas // self.num_parts]
assert len(indices) == self.num_samples
if self.file_client is None:
self.file_client = FileClient(self.io_backend, **self.kwargs)
for index in tqdm(indices):
if index % self.local_size != self.local_rank:
continue
path, _ = self.samples[index].split(' ')
path = osp.join(self.root, path)
img_bytes = self.file_client.get(path)
self.holder[path] = img_bytes
print("Loading complete!")
def __getitem__(self, index):
if self.file_client is None:
self.file_client = FileClient(self.io_backend, **self.kwargs)
filepath, target = self.samples[index].split(' ')
filepath = osp.join(self.root, filepath)
try:
if self.on_memory:
img_bytes = self.holder[filepath]
            else:
                img_bytes = self.file_client.get(filepath)
img = mmcv.imfrombytes(img_bytes)[:, :, ::-1]
except Exception as e:
_logger.warning(
f'Skipped sample (index {index}, file {filepath}). {str(e)}')
self._consecutive_errors += 1
if self._consecutive_errors < _ERROR_RETRY:
return self.__getitem__((index + 1) % len(self))
else:
raise e
self._consecutive_errors = 0
img = Image.fromarray(img)
try:
if self.class_to_idx is not None:
target = self.class_to_idx[target]
else:
target = int(target)
        except (KeyError, ValueError) as e:
            _logger.error(f'invalid target {target!r} for file {filepath}: {e}')
            raise
return img, target
def __len__(self):
return len(self.samples)
def _filename(self, index, basename=False, absolute=False):
filename, _ = self.samples[index].split(' ')
filename = osp.join(self.root, filename)
return filename
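`ParserCephImage.__getitem__` skips an unreadable sample by retrying the next index (modulo the dataset length) until a retry budget is exhausted. A self-contained sketch of that pattern, with illustrative names only:

```python
# Skip-and-retry __getitem__: on a failed read, advance to the next
# index until the retry budget runs out.
ERROR_RETRY = 3

class RetryDataset:
    def __init__(self, items, bad_indices):
        self.items = items
        self.bad = set(bad_indices)   # indices that simulate decode failures
        self.errors = 0

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        if index in self.bad:
            self.errors += 1
            if self.errors < ERROR_RETRY:
                return self[(index + 1) % len(self)]   # try the next sample
            raise IOError(f'giving up at index {index}')
        self.errors = 0                                # reset on success
        return self.items[index]

ds = RetryDataset(['a', 'b', 'c'], bad_indices=[1])
print(ds[1])   # index 1 fails, falls through to index 2 -> 'c'
```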
def get_temporal_info(date, miss_hour=False):
try:
if date:
if miss_hour:
pattern = re.compile(r'(\d*)-(\d*)-(\d*)', re.I)
else:
pattern = re.compile(r'(\d*)-(\d*)-(\d*) (\d*):(\d*):(\d*)',
re.I)
m = pattern.match(date.strip())
if m:
year = int(m.group(1))
month = int(m.group(2))
day = int(m.group(3))
x_month = math.sin(2 * math.pi * month / 12)
y_month = math.cos(2 * math.pi * month / 12)
if miss_hour:
x_hour = 0
y_hour = 0
else:
hour = int(m.group(4))
x_hour = math.sin(2 * math.pi * hour / 24)
y_hour = math.cos(2 * math.pi * hour / 24)
return [x_month, y_month, x_hour, y_hour]
else:
return [0, 0, 0, 0]
else:
return [0, 0, 0, 0]
    except (AttributeError, ValueError):
        return [0, 0, 0, 0]
def get_spatial_info(latitude, longitude):
if latitude and longitude:
latitude = math.radians(latitude)
longitude = math.radians(longitude)
x = math.cos(latitude) * math.cos(longitude)
y = math.cos(latitude) * math.sin(longitude)
z = math.sin(latitude)
return [x, y, z]
else:
return [0, 0, 0]
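The two helpers above embed cyclic quantities smoothly: months and hours as sin/cos points on the unit circle (so December and January end up close together), and latitude/longitude as a point on the unit sphere. A quick numeric check of both mappings:

```python
import math

# Month encoding: sin/cos on the unit circle.
def month_encoding(month):
    return (math.sin(2 * math.pi * month / 12),
            math.cos(2 * math.pi * month / 12))

x12, y12 = month_encoding(12)
x1, y1 = month_encoding(1)
# December (12) and January (1) are neighbours on the circle
dist = math.hypot(x12 - x1, y12 - y1)
print(round(dist, 3))   # ~0.518, versus 2.0 for opposite months (Jan vs Jul)

# Spatial encoding: a point on the equator at longitude 0 maps to (1, 0, 0).
lat, lon = math.radians(0.0), math.radians(0.0)
vec = (math.cos(lat) * math.cos(lon),
       math.cos(lat) * math.sin(lon),
       math.sin(lat))
print(vec)  # (1.0, 0.0, 0.0)
```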
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import torch
import os
import math
from torch.utils.data.sampler import Sampler
import torch.distributed as dist
import numpy as np
class SubsetRandomSampler(torch.utils.data.Sampler):
"""Samples elements randomly from a given list of indices, without replacement.
Arguments:
indices (sequence): a sequence of indices
"""
def __init__(self, indices):
self.epoch = 0
self.indices = indices
def __iter__(self):
return (self.indices[i] for i in torch.randperm(len(self.indices)))
def __len__(self):
return len(self.indices)
def set_epoch(self, epoch):
self.epoch = epoch
class NodeDistributedSampler(Sampler):
    """Sampler that restricts data loading to a subset of the dataset.
    It is especially useful in conjunction with
    :class:`torch.nn.parallel.DistributedDataParallel`. In such a case, each
    process can pass a DistributedSampler instance as a DataLoader sampler,
    and load a subset of the original dataset that is exclusive to it.
.. note::
Dataset is assumed to be of constant size.
Arguments:
dataset: Dataset used for sampling.
num_replicas (optional): Number of processes participating in
distributed training.
rank (optional): Rank of the current process within num_replicas.
"""
def __init__(self,
dataset,
num_replicas=None,
rank=None,
local_rank=None,
local_size=None):
if num_replicas is None:
if not dist.is_available():
raise RuntimeError(
"Requires distributed package to be available")
num_replicas = dist.get_world_size()
if rank is None:
if not dist.is_available():
raise RuntimeError(
"Requires distributed package to be available")
rank = dist.get_rank()
if local_rank is None:
local_rank = int(os.environ.get('LOCAL_RANK', 0))
if local_size is None:
local_size = int(os.environ.get('LOCAL_SIZE', 1))
self.dataset = dataset
self.num_replicas = num_replicas
self.num_parts = local_size
self.rank = rank
self.local_rank = local_rank
self.epoch = 0
self.num_samples = int(
math.ceil(len(self.dataset) * 1.0 / self.num_replicas))
self.total_size = self.num_samples * self.num_replicas
self.total_size_parts = self.num_samples * self.num_replicas // self.num_parts
def __iter__(self):
# deterministically shuffle based on epoch
g = torch.Generator()
g.manual_seed(self.epoch)
t = torch.Generator()
t.manual_seed(0)
indices = torch.randperm(len(self.dataset), generator=t).tolist()
# indices = range(len(self.dataset))
indices = [i for i in indices if i % self.num_parts == self.local_rank]
# add extra samples to make it evenly divisible
indices += indices[:(self.total_size_parts - len(indices))]
assert len(indices) == self.total_size_parts
# subsample
indices = indices[self.rank // self.num_parts:self.
total_size_parts:self.num_replicas // self.num_parts]
index = torch.randperm(len(indices), generator=g).tolist()
indices = list(np.array(indices)[index])
assert len(indices) == self.num_samples
return iter(indices)
def __len__(self):
return self.num_samples
def set_epoch(self, epoch):
self.epoch = epoch
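The index arithmetic in `NodeDistributedSampler.__iter__` can be checked with plain integers (shuffling omitted): each node keeps the indices with `i % num_parts == local_rank`, pads them to a common length, then each process strides over the node's share. `node_partition` below is an illustrative helper, not part of the codebase:

```python
import math

def node_partition(n, num_replicas, num_parts, rank, local_rank):
    num_samples = math.ceil(n / num_replicas)
    total_size_parts = num_samples * num_replicas // num_parts
    # node-local share: every num_parts-th index, offset by local_rank
    indices = [i for i in range(n) if i % num_parts == local_rank]
    indices += indices[:total_size_parts - len(indices)]   # pad evenly
    assert len(indices) == total_size_parts
    # stride so each node gets a disjoint slice of its GPU's share
    return indices[rank // num_parts:total_size_parts:num_replicas // num_parts]

# 2 nodes x 2 GPUs (num_replicas=4, num_parts=2), 8 samples:
print(node_partition(8, 4, 2, rank=0, local_rank=0))  # [0, 4]
print(node_partition(8, 4, 2, rank=3, local_rank=1))  # [3, 7]
```

Together the four ranks cover all eight indices exactly once, which is the disjointness the sampler relies on.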
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import zipfile
import io
import numpy as np
from PIL import Image
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
def is_zip_path(img_or_path):
    """Check whether the given path points inside a zip archive ('.zip@' marker)."""
return '.zip@' in img_or_path
class ZipReader(object):
"""A class to read zipped files"""
zip_bank = dict()
def __init__(self):
super(ZipReader, self).__init__()
@staticmethod
def get_zipfile(path):
zip_bank = ZipReader.zip_bank
if path not in zip_bank:
zfile = zipfile.ZipFile(path, 'r')
zip_bank[path] = zfile
return zip_bank[path]
@staticmethod
def split_zip_style_path(path):
        # str.index would raise ValueError instead of returning -1,
        # so use find() to make the assertion meaningful
        pos_at = path.find('@')
        assert pos_at != -1, "character '@' was not found in the given path '%s'" % path
zip_path = path[0:pos_at]
folder_path = path[pos_at + 1:]
folder_path = str.strip(folder_path, '/')
return zip_path, folder_path
@staticmethod
def list_folder(path):
zip_path, folder_path = ZipReader.split_zip_style_path(path)
zfile = ZipReader.get_zipfile(zip_path)
folder_list = []
        for file_folder_name in zfile.namelist():
            file_folder_name = str.strip(file_folder_name, '/')
            if file_folder_name.startswith(folder_path) and \
                    len(os.path.splitext(file_folder_name)[-1]) == 0 and \
                    file_folder_name != folder_path:
                if len(folder_path) == 0:
                    folder_list.append(file_folder_name)
                else:
                    folder_list.append(file_folder_name[len(folder_path) + 1:])
        return folder_list
@staticmethod
def list_files(path, extension=None):
if extension is None:
extension = ['.*']
zip_path, folder_path = ZipReader.split_zip_style_path(path)
zfile = ZipReader.get_zipfile(zip_path)
file_lists = []
        for file_folder_name in zfile.namelist():
            file_folder_name = str.strip(file_folder_name, '/')
            if file_folder_name.startswith(folder_path) and \
                    str.lower(os.path.splitext(file_folder_name)[-1]) in extension:
                if len(folder_path) == 0:
                    file_lists.append(file_folder_name)
                else:
                    file_lists.append(file_folder_name[len(folder_path) + 1:])
return file_lists
@staticmethod
def read(path):
zip_path, path_img = ZipReader.split_zip_style_path(path)
zfile = ZipReader.get_zipfile(zip_path)
data = zfile.read(path_img)
return data
@staticmethod
def imread(path):
zip_path, path_img = ZipReader.split_zip_style_path(path)
zfile = ZipReader.get_zipfile(zip_path)
data = zfile.read(path_img)
        try:
            im = Image.open(io.BytesIO(data))
        except Exception:
            # fall back to a random image so one corrupt file does not abort training
            print("ERROR IMG LOADED: ", path_img)
            random_img = np.random.rand(224, 224, 3) * 255
            im = Image.fromarray(np.uint8(random_img))
return im
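The `.zip@` convention used throughout `ZipReader`: everything before `@` is the archive on disk, everything after is the member path inside it. A standalone sketch of `split_zip_style_path`:

```python
# Split 'archive.zip@member/path' into (archive, member).
def split_zip_style_path(path):
    pos_at = path.find('@')
    assert pos_at != -1, f"'@' not found in '{path}'"
    return path[:pos_at], path[pos_at + 1:].strip('/')

print(split_zip_style_path('train.zip@/n01440764/img.jpg'))
# ('train.zip', 'n01440764/img.jpg')
```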
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
from typing import Any, Callable
import torch
import torch.distributed as dist
def _allreduce_fut(process_group: dist.ProcessGroup,
tensor: torch.Tensor) -> torch.futures.Future[torch.Tensor]:
"Averages the input gradient tensor by allreduce and returns a future."
group_to_use = process_group if process_group is not None else dist.group.WORLD
# Apply the division first to avoid overflow, especially for FP16.
tensor.div_(group_to_use.size())
return (dist.all_reduce(
tensor, group=group_to_use,
async_op=True).get_future().then(lambda fut: fut.value()[0]))
def allreduce_hook(
process_group: dist.ProcessGroup,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
    """
    This DDP communication hook simply calls ``allreduce`` on the ``GradBucket``
    tensors. Once gradient tensors are aggregated across all workers, its ``then``
    callback takes the mean and returns the result. If a user registers this hook,
    the DDP results are expected to be the same as the case where no hook was
    registered. Hence, this hook does not change the behavior of DDP, and users
    can use it as a reference or modify it to log useful information or for any
    other purpose, without affecting DDP behavior.
    Example::
        >>> ddp_model.register_comm_hook(process_group, allreduce_hook)
    """
return _allreduce_fut(process_group, bucket.buffer())
def fp16_compress_hook(
process_group: dist.ProcessGroup,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
"""
This DDP communication hook implements a simple gradient compression
approach that casts ``GradBucket`` tensor to half-precision floating-point format (``torch.float16``)
and then divides it by the process group size.
    It then allreduces those ``float16`` gradient tensors. Once the compressed gradient
    tensors are allreduced, the chained callback ``decompress`` casts them back to the input data type (such as ``float32``).
Example::
>>> ddp_model.register_comm_hook(process_group, fp16_compress_hook)
"""
group_to_use = process_group if process_group is not None else dist.group.WORLD
world_size = group_to_use.size()
compressed_tensor = bucket.buffer().to(torch.float16).div_(world_size)
fut = dist.all_reduce(compressed_tensor, group=group_to_use,
async_op=True).get_future()
def decompress(fut):
decompressed_tensor = bucket.buffer()
# Decompress in place to reduce the peak memory.
# See: https://github.com/pytorch/pytorch/issues/45968
decompressed_tensor.copy_(fut.value()[0])
return decompressed_tensor
return fut.then(decompress)
# TODO: create an internal helper function and extract the duplicate code in FP16_compress and BF16_compress.
def bf16_compress_hook(
process_group: dist.ProcessGroup,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
"""
Warning: This API is experimental, and it requires NCCL version later than 2.9.6.
This DDP communication hook implements a simple gradient compression
approach that casts ``GradBucket`` tensor to half-precision
`Brain floating point format <https://en.wikipedia.org/wiki/Bfloat16_floating-point_format>`_ (``torch.bfloat16``)
and then divides it by the process group size.
    It then allreduces those ``bfloat16`` gradient tensors. Once the compressed gradient
    tensors are allreduced, the chained callback ``decompress`` casts them back to the input data type (such as ``float32``).
Example::
>>> ddp_model.register_comm_hook(process_group, bf16_compress_hook)
"""
group_to_use = process_group if process_group is not None else dist.group.WORLD
world_size = group_to_use.size()
compressed_tensor = bucket.buffer().to(torch.bfloat16).div_(world_size)
fut = dist.all_reduce(compressed_tensor, group=group_to_use,
async_op=True).get_future()
def decompress(fut):
decompressed_tensor = bucket.buffer()
# Decompress in place to reduce the peak memory.
# See: https://github.com/pytorch/pytorch/issues/45968
decompressed_tensor.copy_(fut.value()[0])
return decompressed_tensor
return fut.then(decompress)
def fp16_compress_wrapper(
hook: Callable[[Any, dist.GradBucket], torch.futures.Future[torch.Tensor]]
) -> Callable[[Any, dist.GradBucket], torch.futures.Future[torch.Tensor]]:
"""
This wrapper casts the input gradient tensor of a given DDP communication hook to half-precision
floating point format (``torch.float16``), and casts the resulting tensor of the given hook back to
the input data type, such as ``float32``.
Therefore, ``fp16_compress_hook`` is equivalent to ``fp16_compress_wrapper(allreduce_hook)``.
Example::
>>> state = PowerSGDState(process_group=process_group, matrix_approximation_rank=1, start_powerSGD_iter=10)
>>> ddp_model.register_comm_hook(state, fp16_compress_wrapper(powerSGD_hook))
"""
def fp16_compress_wrapper_hook(
hook_state,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
# Cast bucket tensor to FP16.
bucket.set_buffer(bucket.buffer().to(torch.float16))
fut = hook(hook_state, bucket)
def decompress(fut):
decompressed_tensor = bucket.buffer()
# Decompress in place to reduce the peak memory.
# See: https://github.com/pytorch/pytorch/issues/45968
decompressed_tensor.copy_(fut.value())
return decompressed_tensor
# Decompress after hook has run.
return fut.then(decompress)
return fp16_compress_wrapper_hook
def bf16_compress_wrapper(
hook: Callable[[Any, dist.GradBucket], torch.futures.Future[torch.Tensor]]
) -> Callable[[Any, dist.GradBucket], torch.futures.Future[torch.Tensor]]:
"""
Warning: This API is experimental, and it requires NCCL version later than 2.9.6.
This wrapper casts the input gradient tensor of a given DDP communication hook to half-precision
    `Brain floating point format <https://en.wikipedia.org/wiki/Bfloat16_floating-point_format>`_ (``torch.bfloat16``),
and casts the resulting tensor of the given hook back to the input data type, such as ``float32``.
Therefore, ``bf16_compress_hook`` is equivalent to ``bf16_compress_wrapper(allreduce_hook)``.
Example::
>>> state = PowerSGDState(process_group=process_group, matrix_approximation_rank=1, start_powerSGD_iter=10)
>>> ddp_model.register_comm_hook(state, bf16_compress_wrapper(powerSGD_hook))
"""
def bf16_compress_wrapper_hook(
hook_state,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
# Cast bucket tensor to BF16.
bucket.set_buffer(bucket.buffer().to(torch.bfloat16))
fut = hook(hook_state, bucket)
def decompress(fut):
decompressed_tensor = bucket.buffer()
# Decompress in place to reduce the peak memory.
# See: https://github.com/pytorch/pytorch/issues/45968
decompressed_tensor.copy_(fut.value())
return decompressed_tensor
# Decompress after hook has run.
return fut.then(decompress)
return bf16_compress_wrapper_hook
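Both wrappers above share one pattern: cast the bucket down before the wrapped hook runs, then chain a callback that casts the result back up. A torch-free sketch of that compose-around-a-hook structure, using a plain function in place of the async future (all names here are illustrative):

```python
# Wrap a hook so its input is compressed first and its output
# is decompressed afterwards, mirroring fp16/bf16_compress_wrapper.
def compress_wrapper(hook, compress, decompress):
    def wrapped(bucket):
        result = hook(compress(bucket))    # run inner hook on compressed data
        return decompress(result)          # restore original representation
    return wrapped

# toy "hook": average a list of gradients; toy compression: round to ints
average_hook = lambda grads: sum(grads) / len(grads)
hook = compress_wrapper(average_hook,
                        compress=lambda g: [round(x) for x in g],
                        decompress=float)
print(hook([0.6, 1.4, 2.0]))  # averages the rounded [1, 1, 2]
```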
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import time
import argparse
import torch
from tqdm import tqdm
from config import get_config
from models import build_model
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('--model_name', type=str,
default='internimage_t_1k_224')
parser.add_argument('--ckpt_dir', type=str,
default='/mnt/petrelfs/share_data/huangzhenhang/code/internimage/checkpoint_dir/new/cls')
parser.add_argument('--onnx', default=False, action='store_true')
parser.add_argument('--trt', default=False, action='store_true')
args = parser.parse_args()
args.cfg = os.path.join('./configs', f'{args.model_name}.yaml')
args.ckpt = os.path.join(args.ckpt_dir, f'{args.model_name}.pth')
args.size = int(args.model_name.split('.')[0].split('_')[-1])
cfg = get_config(args)
return args, cfg
def get_model(args, cfg):
model = build_model(cfg)
ckpt = torch.load(args.ckpt, map_location='cpu')['model']
model.load_state_dict(ckpt)
return model
def speed_test(model, input):
    # warmup
    for _ in tqdm(range(100)):
        _ = model(input)
    # speed test: synchronize before AND after the timed loop, otherwise
    # asynchronous CUDA kernels are not fully counted in the elapsed time
    torch.cuda.synchronize()
    start = time.time()
    for _ in tqdm(range(100)):
        _ = model(input)
    torch.cuda.synchronize()
    end = time.time()
    th = 100 / (end - start)
    print(f"using time: {end - start}, throughput {th}")
def torch2onnx(args, cfg):
model = get_model(args, cfg).cuda()
# speed_test(model)
onnx_name = f'{args.model_name}.onnx'
torch.onnx.export(model,
torch.rand(1, 3, args.size, args.size).cuda(),
onnx_name,
input_names=['input'],
output_names=['output'])
return model
def onnx2trt(args):
from mmdeploy.backend.tensorrt import from_onnx
onnx_name = f'{args.model_name}.onnx'
from_onnx(
onnx_name,
args.model_name,
dict(
input=dict(
min_shape=[1, 3, args.size, args.size],
opt_shape=[1, 3, args.size, args.size],
max_shape=[1, 3, args.size, args.size],
)
),
max_workspace_size=2**30,
)
def check(args, cfg):
from mmdeploy.backend.tensorrt.wrapper import TRTWrapper
model = get_model(args, cfg).cuda()
model.eval()
trt_model = TRTWrapper(f'{args.model_name}.engine',
['output'])
x = torch.randn(1, 3, args.size, args.size).cuda()
torch_out = model(x)
trt_out = trt_model(dict(input=x))['output']
print('torch out shape:', torch_out.shape)
print('trt out shape:', trt_out.shape)
print('max delta:', (torch_out - trt_out).abs().max())
print('mean delta:', (torch_out - trt_out).abs().mean())
speed_test(model, x)
speed_test(trt_model, dict(input=x))
def main():
args, cfg = get_args()
if args.onnx or args.trt:
torch2onnx(args, cfg)
        print('torch -> onnx: success')
if args.trt:
onnx2trt(args)
print('onnx -> trt: success')
check(args, cfg)
if __name__ == '__main__':
main()
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import sys
import logging
import functools
from termcolor import colored
@functools.lru_cache()
def create_logger(output_dir, dist_rank=0, name=''):
# create logger
logger = logging.getLogger(name)
logger.setLevel(logging.DEBUG)
logger.propagate = False
# create formatter
fmt = '[%(asctime)s %(name)s] (%(filename)s %(lineno)d): %(levelname)s %(message)s'
color_fmt = colored('[%(asctime)s %(name)s]', 'green') + \
colored('(%(filename)s %(lineno)d)', 'yellow') + \
': %(levelname)s %(message)s'
# create console handlers for master process
if dist_rank == 0:
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(logging.DEBUG)
console_handler.setFormatter(
logging.Formatter(fmt=color_fmt, datefmt='%Y-%m-%d %H:%M:%S'))
logger.addHandler(console_handler)
# create file handlers
file_handler = logging.FileHandler(os.path.join(
output_dir, f'log_rank{dist_rank}.txt'),
mode='a')
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(
logging.Formatter(fmt=fmt, datefmt='%Y-%m-%d %H:%M:%S'))
logger.addHandler(file_handler)
return logger
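The `@functools.lru_cache()` on `create_logger` is what prevents duplicate handlers: a second call with the same arguments returns the same configured logger instead of attaching another handler. A minimal sketch of that idiom:

```python
import functools
import logging
import sys

# Cached logger factory: repeated calls with the same name reuse the
# already-configured logger rather than adding a second handler.
@functools.lru_cache()
def get_logger(name):
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(logging.StreamHandler(sys.stdout))
    return logger

a = get_logger('demo')
b = get_logger('demo')
print(a is b, len(a.handlers))  # True 1
```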
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import torch
from timm.scheduler.cosine_lr import CosineLRScheduler
from timm.scheduler.step_lr import StepLRScheduler
from timm.scheduler.scheduler import Scheduler
def build_scheduler(config, optimizer, n_iter_per_epoch):
num_steps = int(config.TRAIN.EPOCHS * n_iter_per_epoch)
warmup_steps = int(config.TRAIN.WARMUP_EPOCHS * n_iter_per_epoch)
decay_steps = int(config.TRAIN.LR_SCHEDULER.DECAY_EPOCHS *
n_iter_per_epoch)
lr_scheduler = None
if config.TRAIN.LR_SCHEDULER.NAME == 'cosine':
lr_scheduler = CosineLRScheduler(
optimizer,
t_initial=num_steps,
# t_mul=1.,
lr_min=config.TRAIN.MIN_LR,
warmup_lr_init=config.TRAIN.WARMUP_LR,
warmup_t=warmup_steps,
cycle_limit=1,
t_in_epochs=False,
)
elif config.TRAIN.LR_SCHEDULER.NAME == 'linear':
lr_scheduler = LinearLRScheduler(
optimizer,
t_initial=num_steps,
lr_min_rate=0.01,
warmup_lr_init=config.TRAIN.WARMUP_LR,
warmup_t=warmup_steps,
t_in_epochs=False,
)
elif config.TRAIN.LR_SCHEDULER.NAME == 'step':
lr_scheduler = StepLRScheduler(
optimizer,
decay_t=decay_steps,
decay_rate=config.TRAIN.LR_SCHEDULER.DECAY_RATE,
warmup_lr_init=config.TRAIN.WARMUP_LR,
warmup_t=warmup_steps,
t_in_epochs=False,
)
return lr_scheduler
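For reference, the cosine option built above follows the standard annealing curve, lr(t) = lr_min + 0.5 · (base_lr − lr_min) · (1 + cos(πt/T)); timm's `CosineLRScheduler` adds warmup and cycle handling on top. A sketch of just the core curve (not timm's implementation):

```python
import math

def cosine_lr(t, total_steps, base_lr, lr_min):
    # standard cosine annealing from base_lr down to lr_min
    return lr_min + 0.5 * (base_lr - lr_min) * (
        1 + math.cos(math.pi * t / total_steps))

print(cosine_lr(0, 100, 1e-3, 1e-5))    # starts at base_lr
print(cosine_lr(100, 100, 1e-3, 1e-5))  # ends at lr_min
```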
class LinearLRScheduler(Scheduler):
def __init__(
self,
optimizer: torch.optim.Optimizer,
t_initial: int,
lr_min_rate: float,
warmup_t=0,
warmup_lr_init=0.,
t_in_epochs=True,
noise_range_t=None,
noise_pct=0.67,
noise_std=1.0,
noise_seed=42,
initialize=True,
) -> None:
super().__init__(optimizer,
param_group_field="lr",
noise_range_t=noise_range_t,
noise_pct=noise_pct,
noise_std=noise_std,
noise_seed=noise_seed,
initialize=initialize)
self.t_initial = t_initial
self.lr_min_rate = lr_min_rate
self.warmup_t = warmup_t
self.warmup_lr_init = warmup_lr_init
self.t_in_epochs = t_in_epochs
if self.warmup_t:
self.warmup_steps = [(v - warmup_lr_init) / self.warmup_t
for v in self.base_values]
super().update_groups(self.warmup_lr_init)
else:
self.warmup_steps = [1 for _ in self.base_values]
def _get_lr(self, t):
if t < self.warmup_t:
lrs = [self.warmup_lr_init + t * s for s in self.warmup_steps]
else:
t = t - self.warmup_t
total_t = self.t_initial - self.warmup_t
lrs = [
v - ((v - v * self.lr_min_rate) * (t / total_t))
for v in self.base_values
]
return lrs
def get_epoch_values(self, epoch: int):
if self.t_in_epochs:
return self._get_lr(epoch)
else:
return None
def get_update_values(self, num_updates: int):
if not self.t_in_epochs:
return self._get_lr(num_updates)
else:
return None
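The warmup-then-linear-decay rule implemented by `_get_lr` above, rewritten as a standalone function for a single base learning rate (illustrative only):

```python
def linear_lr(t, t_initial, base_lr, lr_min_rate, warmup_t, warmup_lr_init):
    if t < warmup_t:
        # linear warmup from warmup_lr_init up to base_lr
        step = (base_lr - warmup_lr_init) / warmup_t
        return warmup_lr_init + t * step
    # linear decay from base_lr down to base_lr * lr_min_rate
    t -= warmup_t
    total_t = t_initial - warmup_t
    return base_lr - (base_lr - base_lr * lr_min_rate) * (t / total_t)

# 10 warmup steps from 1e-6 up to 1e-3, then decay to 1% of base over 90 steps
print(linear_lr(0, 100, 1e-3, 0.01, 10, 1e-6))    # 1e-06
print(linear_lr(100, 100, 1e-3, 0.01, 10, 1e-6))  # ~1e-05 (1% of base)
```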