"...git@developer.sourcefind.cn:yangql/googletest.git" did not exist on "10f05a627c2da8d7de78da1b08f984ce8de398fb"
Commit 59c80aa2 authored by PRC-Huang, committed by zhe chen

release classification

parent 2cab2294
@@ -30,9 +30,9 @@ Deformable Convolutions](https://arxiv.org/abs/2211.05778).
ADE20K, outperforming previous models by a large margin.
## Coming soon
-- [ ] TensorRT inference.
 - [ ] Other downstream tasks.
-- [ ] Classification code of the InternImage series.
+- [x] TensorRT inference.
+- [x] Classification code of the InternImage series.
 - [x] InternImage-T/S/B/L/XL ImageNet-1k pretrained model.
 - [x] InternImage-L/XL ImageNet-22k pretrained model.
 - [x] InternImage-T/S/B/L/XL detection and instance segmentation model.
@@ -89,13 +89,13 @@ to reduce the strict inductive bias. Our model makes it possible to learn more
## Main Results of FPS
-| name | resolution | #params | FLOPs | Batch 1 FPS(PyTorch) | Batch 1 FPS(TensorRT) |
-| :------------: | :--------: | :-----: | :---: | :------------------: | :-------------------: |
-| InternImage-T | 224x224 | 30M | 5G | 44 | 156 |
-| InternImage-S | 224x224 | 50M | 8G | 40 | 129 |
-| InternImage-B | 224x224 | 97M | 16G | 40 | 116 |
-| InternImage-L | 384x384 | 223M | 108G | 40 | 56 |
-| InternImage-XL | 384x384 | 335M | 163G | 32 | 47 |
+| name | resolution | #params | FLOPs | Batch 1 FPS(TensorRT) |
+| :------------: | :--------: | :-----: | :---: | :-------------------: |
+| InternImage-T | 224x224 | 30M | 5G | 156 |
+| InternImage-S | 224x224 | 50M | 8G | 129 |
+| InternImage-B | 224x224 | 97M | 16G | 116 |
+| InternImage-L | 384x384 | 223M | 108G | 56 |
+| InternImage-XL | 384x384 | 335M | 163G | 47 |
## Citation
# InternImage for Image Classification
This folder contains the implementation of InternImage for image classification.
## Usage
### Install
- Clone this repo:
```bash
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
```
- Create a conda virtual environment and activate it:
```bash
conda create -n internimage python=3.7 -y
conda activate internimage
```
- Install `CUDA>=10.2` with `cudnn>=7` following
the [official installation instructions](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html)
- Install `PyTorch>=1.8.0` and `torchvision>=0.9.0` with `CUDA>=10.2`.
For example, to install torch==1.11 with CUDA==11.3:
```bash
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
```
- Install `timm==0.6.11`, `mmcv-full==1.5.0`, and `mmdet==2.28.1`:
```bash
pip install -U openmim
mim install mmcv-full==1.5.0
pip install timm==0.6.11 mmdet==2.28.1
```
- Install other requirements:
```bash
pip install opencv-python termcolor yacs pyyaml scipy
```
- Compile the CUDA operators:
```bash
cd ./ops_dcnv3
sh ./make.sh
# unit test (all checks should print True)
python test.py
```
### Data preparation
We use the standard ImageNet dataset, which you can download from http://image-net.org/. We support the following two ways to
load data:
- For a standard folder dataset, move the validation images into labeled sub-folders. The file structure should look like:
```bash
$ tree data
imagenet
├── train
│ ├── class1
│ │ ├── img1.jpeg
│ │ ├── img2.jpeg
│ │ └── ...
│ ├── class2
│ │ ├── img3.jpeg
│ │ └── ...
│ └── ...
└── val
├── class1
│ ├── img4.jpeg
│ ├── img5.jpeg
│ └── ...
├── class2
│ ├── img6.jpeg
│ └── ...
└── ...
```
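Folder-style datasets assign integer labels by sorting the class sub-folder names; a minimal, stand-alone sketch of that convention (the `find_classes` name here is illustrative, echoing a similar helper in this repo's loader code):

```python
import os
import tempfile

def find_classes(root):
    # Sort the sub-folder names and map each to an integer label,
    # as torchvision-style folder datasets do.
    classes = sorted(d for d in os.listdir(root)
                     if os.path.isdir(os.path.join(root, d)))
    return {name: idx for idx, name in enumerate(classes)}

# Build a tiny stand-in for the layout above and inspect the mapping.
root = tempfile.mkdtemp()
for name in ['class2', 'class1']:
    os.makedirs(os.path.join(root, 'train', name))
print(find_classes(os.path.join(root, 'train')))  # {'class1': 0, 'class2': 1}
```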
- To avoid the slow reads caused by massive numbers of small files, we also support zipped ImageNet, which includes
four files:
- `train.zip`, `val.zip`: the zipped folders for the train and validation splits.
- `train.txt`, `val.txt`: the relative path of each image inside the corresponding zip file, together with its ground-truth
label. Make sure the data folder looks like this:
```bash
$ tree data
data
└── ImageNet-Zip
├── train_map.txt
├── train.zip
├── val_map.txt
└── val.zip
$ head -n 5 meta_data/val.txt
ILSVRC2012_val_00000001.JPEG 65
ILSVRC2012_val_00000002.JPEG 970
ILSVRC2012_val_00000003.JPEG 230
ILSVRC2012_val_00000004.JPEG 809
ILSVRC2012_val_00000005.JPEG 516
$ head -n 5 meta_data/train.txt
n01440764/n01440764_10026.JPEG 0
n01440764/n01440764_10027.JPEG 0
n01440764/n01440764_10029.JPEG 0
n01440764/n01440764_10040.JPEG 0
n01440764/n01440764_10042.JPEG 0
```
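Each line of a map file pairs an image path (relative to the zip) with its integer label; a small sketch of parsing such lines (`parse_annotations` is an illustrative name, not this repo's API — the repo's own reader splits on tabs):

```python
def parse_annotations(lines):
    # Each line holds a path relative to the zip file and an integer
    # label, separated by whitespace.
    samples = []
    for line in lines:
        path, label = line.split()
        samples.append((path, int(label)))
    return samples

lines = [
    'ILSVRC2012_val_00000001.JPEG 65',
    'ILSVRC2012_val_00000002.JPEG 970',
]
print(parse_annotations(lines))
```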
- For ImageNet-22K dataset, make a folder named `fall11_whole` and move all images to labeled sub-folders in this
folder. Then download the train-val split
file ([ILSVRC2011fall_whole_map_train.txt](https://github.com/SwinTransformer/storage/releases/download/v2.0.1/ILSVRC2011fall_whole_map_train.txt)
& [ILSVRC2011fall_whole_map_val.txt](https://github.com/SwinTransformer/storage/releases/download/v2.0.1/ILSVRC2011fall_whole_map_val.txt))
and put them in the parent directory of `fall11_whole`. The file structure should look like:
```bash
$ tree imagenet22k/
imagenet22k/
└── fall11_whole
├── n00004475
├── n00005787
├── n00006024
├── n00006484
└── ...
```
### Evaluation
To evaluate a pre-trained `InternImage` on ImageNet val, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345 main.py --eval \
--cfg <config-file> --resume <checkpoint> --data-path <imagenet-path>
```
For example, to evaluate `InternImage-B` with a single GPU:
```bash
python -m torch.distributed.launch --nproc_per_node 1 --master_port 12345 main.py --eval \
--cfg configs/internimage_b_1k_224.yaml --resume internimage_b_1k_224.pth --data-path <imagenet-path>
```
### Training from scratch on ImageNet-1K
To train an `InternImage` on ImageNet from scratch, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345 main.py \
--cfg <config-file> --data-path <imagenet-path> [--batch-size <batch-size-per-gpu> --output <output-directory> --tag <job-tag>]
```
### Manage jobs with srun
For example, to run `InternImage` models with 8 GPUs on a single node (the commands below resume pretrained checkpoints and evaluate them):
`InternImage-T`:
```bash
GPUS=8 sh train_in1k.sh <partition> <job-name> configs/internimage_t_1k_224.yaml --resume internimage_t_1k_224.pth --eval
```
`InternImage-S`:
```bash
GPUS=8 sh train_in1k.sh <partition> <job-name> configs/internimage_s_1k_224.yaml --resume internimage_s_1k_224.pth --eval
```
`InternImage-XL`:
```bash
GPUS=8 sh train_in1k.sh <partition> <job-name> configs/internimage_xl_22kto1k_384.yaml --resume internimage_xl_22kto1k_384.pth --eval
```
<!--
### Test pretrained model on ImageNet-22K
For example, to evaluate the `InternImage-L-22k`:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345 main.py \
--cfg configs/internimage_xl_22k_192to384.yaml --data-path <imagenet-path> [--batch-size <batch-size-per-gpu> --output <output-directory>] \
--resume internimage_xl_22k_192to384.pth --eval
``` -->
<!-- ### Fine-tuning from a ImageNet-22K pre-trained model
For example, to fine-tune an `InternImage-XL-22k` model pre-trained on ImageNet-22K:
```bash
GPUS=8 sh train_in1k.sh <partition> <job-name> configs/intern_image_.yaml --pretrained intern_image_b.pth --eval
python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345 main.py \
--cfg configs/.yaml --pretrained swin_base_patch4_window7_224_22k.pth \
--data-path <imagenet-path> --batch-size 64 --accumulation-steps 2 [--use-checkpoint]
``` -->
### Export
To export `InternImage-T` from PyTorch to ONNX, run:
```shell
python export.py --model_name internimage_t_1k_224 --ckpt_dir /path/to/ckpt/dir --onnx
```
To export `InternImage-T` from PyTorch to TensorRT, run:
```shell
python export.py --model_name internimage_t_1k_224 --ckpt_dir /path/to/ckpt/dir --trt
```
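The exported ONNX or TensorRT model consumes the same normalized input as the PyTorch model; a hedged sketch of that preprocessing for a 224x224 RGB image, assuming the default `AUG.MEAN`/`AUG.STD` values from the config in this repo:

```python
import numpy as np

def preprocess(img_uint8):
    # HxWx3 uint8 RGB -> 1x3xHxW float32, normalized with the ImageNet
    # mean/std used in this repo's config (AUG.MEAN / AUG.STD).
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    x = img_uint8.astype(np.float32) / 255.0
    x = (x - mean) / std
    return x.transpose(2, 0, 1)[None]

batch = preprocess(np.zeros((224, 224, 3), dtype=np.uint8))
print(batch.shape)  # (1, 3, 224, 224)
```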
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import yaml
from yacs.config import CfgNode as CN
_C = CN()
# Base config files
_C.BASE = ['']
# -----------------------------------------------------------------------------
# Data settings
# -----------------------------------------------------------------------------
_C.DATA = CN()
# Batch size for a single GPU, could be overwritten by command line argument
_C.DATA.BATCH_SIZE = 128
# Path to dataset, could be overwritten by command line argument
_C.DATA.DATA_PATH = ''
# Dataset name
_C.DATA.DATASET = 'imagenet'
# Input image size
_C.DATA.IMG_SIZE = 224
# Interpolation to resize image (random, bilinear, bicubic)
_C.DATA.INTERPOLATION = 'bicubic'
# Use zipped dataset instead of folder dataset
# could be overwritten by command line argument
_C.DATA.ZIP_MODE = False
# Cache Data in Memory, could be overwritten by command line argument
_C.DATA.CACHE_MODE = 'part'
# Pin CPU memory in DataLoader for more efficient (sometimes) transfer to GPU.
_C.DATA.PIN_MEMORY = True
# Number of data loading threads
_C.DATA.NUM_WORKERS = 8
# Load data to memory
_C.DATA.IMG_ON_MEMORY = False
# -----------------------------------------------------------------------------
# Model settings
# -----------------------------------------------------------------------------
_C.MODEL = CN()
# Model type
_C.MODEL.TYPE = 'INTERN_IMAGE'
# Model name
_C.MODEL.NAME = 'intern_image'
# Pretrained weight from checkpoint, could be imagenet22k pretrained weight
# could be overwritten by command line argument
_C.MODEL.PRETRAINED = ''
# Checkpoint to resume, could be overwritten by command line argument
_C.MODEL.RESUME = ''
# Number of classes, overwritten in data preparation
_C.MODEL.NUM_CLASSES = 1000
# Dropout rate
_C.MODEL.DROP_RATE = 0.0
# Drop path rate
_C.MODEL.DROP_PATH_RATE = 0.1
# Drop path type
_C.MODEL.DROP_PATH_TYPE = 'linear' # linear, uniform
# Label Smoothing
_C.MODEL.LABEL_SMOOTHING = 0.1
# INTERN_IMAGE parameters
_C.MODEL.INTERN_IMAGE = CN()
_C.MODEL.INTERN_IMAGE.DEPTHS = [4, 4, 18, 4]
_C.MODEL.INTERN_IMAGE.GROUPS = [4, 8, 16, 32]
_C.MODEL.INTERN_IMAGE.CHANNELS = 64
_C.MODEL.INTERN_IMAGE.LAYER_SCALE = None
_C.MODEL.INTERN_IMAGE.OFFSET_SCALE = 1.0
_C.MODEL.INTERN_IMAGE.MLP_RATIO = 4.0
_C.MODEL.INTERN_IMAGE.CORE_OP = 'DCNv3'
_C.MODEL.INTERN_IMAGE.POST_NORM = False
# -----------------------------------------------------------------------------
# Training settings
# -----------------------------------------------------------------------------
_C.TRAIN = CN()
_C.TRAIN.START_EPOCH = 0
_C.TRAIN.EPOCHS = 300
_C.TRAIN.WARMUP_EPOCHS = 20
_C.TRAIN.WEIGHT_DECAY = 0.05
_C.TRAIN.BASE_LR = 5e-4
_C.TRAIN.WARMUP_LR = 5e-7
_C.TRAIN.MIN_LR = 5e-6
# Clip gradient norm
_C.TRAIN.CLIP_GRAD = 5.0
# Auto resume from latest checkpoint
_C.TRAIN.AUTO_RESUME = True
# Gradient accumulation steps
# could be overwritten by command line argument
_C.TRAIN.ACCUMULATION_STEPS = 0
# Whether to use gradient checkpointing to save memory
# could be overwritten by command line argument
_C.TRAIN.USE_CHECKPOINT = False
# LR scheduler
_C.TRAIN.LR_SCHEDULER = CN()
_C.TRAIN.LR_SCHEDULER.NAME = 'cosine'
# Epoch interval to decay LR, used in StepLRScheduler
_C.TRAIN.LR_SCHEDULER.DECAY_EPOCHS = 30
# LR decay rate, used in StepLRScheduler
_C.TRAIN.LR_SCHEDULER.DECAY_RATE = 0.1
# Optimizer
_C.TRAIN.OPTIMIZER = CN()
_C.TRAIN.OPTIMIZER.NAME = 'adamw'
# Optimizer Epsilon
_C.TRAIN.OPTIMIZER.EPS = 1e-8
# Optimizer Betas
_C.TRAIN.OPTIMIZER.BETAS = (0.9, 0.999)
# SGD momentum
_C.TRAIN.OPTIMIZER.MOMENTUM = 0.9
# ZeRO
_C.TRAIN.OPTIMIZER.USE_ZERO = False
# freeze backbone
_C.TRAIN.OPTIMIZER.FREEZE_BACKBONE = None
# dcn lr
_C.TRAIN.OPTIMIZER.DCN_LR_MUL = None
# EMA
_C.TRAIN.EMA = CN()
_C.TRAIN.EMA.ENABLE = False
_C.TRAIN.EMA.DECAY = 0.9998
# LR_LAYER_DECAY
_C.TRAIN.LR_LAYER_DECAY = False
_C.TRAIN.LR_LAYER_DECAY_RATIO = 0.875
# FT head init weights
_C.TRAIN.RAND_INIT_FT_HEAD = False
# -----------------------------------------------------------------------------
# Augmentation settings
# -----------------------------------------------------------------------------
_C.AUG = CN()
# Color jitter factor
_C.AUG.COLOR_JITTER = 0.4
# Use AutoAugment policy. "v0" or "original"
_C.AUG.AUTO_AUGMENT = 'rand-m9-mstd0.5-inc1'
# Random erase prob
_C.AUG.REPROB = 0.25
# Random erase mode
_C.AUG.REMODE = 'pixel'
# Random erase count
_C.AUG.RECOUNT = 1
# Mixup alpha, mixup enabled if > 0
_C.AUG.MIXUP = 0.8
# Cutmix alpha, cutmix enabled if > 0
_C.AUG.CUTMIX = 1.0
# Cutmix min/max ratio, overrides alpha and enables cutmix if set
_C.AUG.CUTMIX_MINMAX = None
# Probability of performing mixup or cutmix when either/both is enabled
_C.AUG.MIXUP_PROB = 1.0
# Probability of switching to cutmix when both mixup and cutmix enabled
_C.AUG.MIXUP_SWITCH_PROB = 0.5
# How to apply mixup/cutmix params. Per "batch", "pair", or "elem"
_C.AUG.MIXUP_MODE = 'batch'
# RandomResizedCrop
_C.AUG.RANDOM_RESIZED_CROP = False
_C.AUG.MEAN = (0.485, 0.456, 0.406)
_C.AUG.STD = (0.229, 0.224, 0.225)
# -----------------------------------------------------------------------------
# Testing settings
# -----------------------------------------------------------------------------
_C.TEST = CN()
# Whether to use center crop when testing
_C.TEST.CROP = True
# Whether to use SequentialSampler as validation sampler
_C.TEST.SEQUENTIAL = False
# -----------------------------------------------------------------------------
# Misc
# -----------------------------------------------------------------------------
# Mixed precision opt level, if O0, no amp is used ('O0', 'O1', 'O2')
# overwritten by command line argument
_C.AMP_OPT_LEVEL = ''
# Path to output folder, overwritten by command line argument
_C.OUTPUT = ''
# Tag of experiment, overwritten by command line argument
_C.TAG = 'default'
# Frequency to save checkpoint
_C.SAVE_FREQ = 1
# Frequency to logging info
_C.PRINT_FREQ = 10
# eval freq
_C.EVAL_FREQ = 1
# Fixed random seed
_C.SEED = 0
# Perform evaluation only, overwritten by command line argument
_C.EVAL_MODE = False
# Test throughput only, overwritten by command line argument
_C.THROUGHPUT_MODE = False
# local rank for DistributedDataParallel, given by command line argument
_C.LOCAL_RANK = 0
_C.EVAL_22K_TO_1K = False
_C.AMP_TYPE = 'float16'
def _update_config_from_file(config, cfg_file):
config.defrost()
with open(cfg_file, 'r') as f:
yaml_cfg = yaml.load(f, Loader=yaml.FullLoader)
for cfg in yaml_cfg.setdefault('BASE', ['']):
if cfg:
_update_config_from_file(
config, os.path.join(os.path.dirname(cfg_file), cfg))
print('=> merge config from {}'.format(cfg_file))
config.merge_from_file(cfg_file)
config.freeze()
def update_config(config, args):
_update_config_from_file(config, args.cfg)
config.defrost()
if hasattr(args, 'opts') and args.opts:
config.merge_from_list(args.opts)
# merge from specific arguments
if hasattr(args, 'batch_size') and args.batch_size:
config.DATA.BATCH_SIZE = args.batch_size
if hasattr(args, 'dataset') and args.dataset:
config.DATA.DATASET = args.dataset
if hasattr(args, 'data_path') and args.data_path:
config.DATA.DATA_PATH = args.data_path
if hasattr(args, 'zip') and args.zip:
config.DATA.ZIP_MODE = True
if hasattr(args, 'cache_mode') and args.cache_mode:
config.DATA.CACHE_MODE = args.cache_mode
if hasattr(args, 'pretrained') and args.pretrained:
config.MODEL.PRETRAINED = args.pretrained
if hasattr(args, 'resume') and args.resume:
config.MODEL.RESUME = args.resume
if hasattr(args, 'accumulation_steps') and args.accumulation_steps:
config.TRAIN.ACCUMULATION_STEPS = args.accumulation_steps
if hasattr(args, 'use_checkpoint') and args.use_checkpoint:
config.TRAIN.USE_CHECKPOINT = True
if hasattr(args, 'amp_opt_level') and args.amp_opt_level:
config.AMP_OPT_LEVEL = args.amp_opt_level
if hasattr(args, 'output') and args.output:
config.OUTPUT = args.output
if hasattr(args, 'tag') and args.tag:
config.TAG = args.tag
if hasattr(args, 'eval') and args.eval:
config.EVAL_MODE = True
if hasattr(args, 'throughput') and args.throughput:
config.THROUGHPUT_MODE = True
if hasattr(args, 'save_ckpt_num') and args.save_ckpt_num:
config.SAVE_CKPT_NUM = args.save_ckpt_num
if hasattr(args, 'use_zero') and args.use_zero:
config.TRAIN.OPTIMIZER.USE_ZERO = True
# set local rank for distributed training
if hasattr(args, 'local_rank') and args.local_rank:
config.LOCAL_RANK = args.local_rank
# output folder
config.MODEL.NAME = args.cfg.split('/')[-1].replace('.yaml', '')
config.OUTPUT = os.path.join(config.OUTPUT, config.MODEL.NAME)
# config.OUTPUT = os.path.join(config.OUTPUT, config.MODEL.NAME, config.TAG)
config.freeze()
def get_config(args):
"""Get a yacs CfgNode object with default values."""
# Return a clone so that the defaults will not be altered
# This is for the "local variable" use pattern
config = _C.clone()
update_config(config, args)
return config
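The `BASE` mechanism above merges every parent config first and then the file's own keys, so child values win; a stdlib-only sketch of that merge order (dict-based and purely illustrative — the real code works on yacs `CfgNode`s loaded from YAML files):

```python
def merge_with_base(configs, name):
    # Recursively apply every config listed under BASE first, then the
    # file's own keys, mirroring _update_config_from_file above.
    # `configs` maps a config name to its dict of settings.
    cfg = configs[name]
    merged = {}
    for base in cfg.get('BASE', ['']):
        if base:
            merged.update(merge_with_base(configs, base))
    merged.update({k: v for k, v in cfg.items() if k != 'BASE'})
    return merged

configs = {
    'base.yaml': {'EPOCHS': 300, 'BASE_LR': 5e-4},
    'finetune.yaml': {'BASE': ['base.yaml'], 'EPOCHS': 20},
}
print(merge_with_base(configs, 'finetune.yaml'))  # {'EPOCHS': 20, 'BASE_LR': 0.0005}
```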
DATA:
IMG_ON_MEMORY: True
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.5
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [4, 4, 21, 4]
GROUPS: [7, 14, 28, 56]
CHANNELS: 112
LAYER_SCALE: 1e-5
OFFSET_SCALE: 1.0
MLP_RATIO: 4.0
POST_NORM: True
TRAIN:
EMA:
ENABLE: True
DECAY: 0.9999
BASE_LR: 5e-4
DATA:
IMG_SIZE: 384
IMG_ON_MEMORY: True
AUG:
MIXUP: 0.0
CUTMIX: 0.0
REPROB: 0.0
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.1
LABEL_SMOOTHING: 0.3
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [5, 5, 22, 5]
GROUPS: [10, 20, 40, 80]
CHANNELS: 160
LAYER_SCALE: 1e-5
OFFSET_SCALE: 2.0
MLP_RATIO: 4.0
POST_NORM: True
TRAIN:
EMA:
ENABLE: true
DECAY: 0.9999
EPOCHS: 20
WARMUP_EPOCHS: 2
WEIGHT_DECAY: 0.05
BASE_LR: 2e-05 # 512
WARMUP_LR: .0
MIN_LR: .0
LR_LAYER_DECAY: true
LR_LAYER_DECAY_RATIO: 0.9
USE_CHECKPOINT: true
OPTIMIZER:
DCN_LR_MUL: 0.1
AMP_OPT_LEVEL: O0
EVAL_FREQ: 1
DATA:
IMG_ON_MEMORY: True
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.4
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [4, 4, 21, 4]
GROUPS: [5, 10, 20, 40]
CHANNELS: 80
LAYER_SCALE: 1e-5
OFFSET_SCALE: 1.0
MLP_RATIO: 4.0
POST_NORM: True
TRAIN:
EMA:
ENABLE: True
DECAY: 0.9999
BASE_LR: 5e-4
DATA:
IMG_ON_MEMORY: True
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.1
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [4, 4, 18, 4]
GROUPS: [4, 8, 16, 32]
CHANNELS: 64
OFFSET_SCALE: 1.0
MLP_RATIO: 4.0
TRAIN:
EMA:
ENABLE: True
DECAY: 0.9999
BASE_LR: 5e-4
DATA:
IMG_SIZE: 384
IMG_ON_MEMORY: True
AUG:
MIXUP: 0.0
CUTMIX: 0.0
REPROB: 0.0
MODEL:
TYPE: intern_image
DROP_PATH_RATE: 0.2
LABEL_SMOOTHING: 0.3
INTERN_IMAGE:
CORE_OP: 'DCNv3'
DEPTHS: [5, 5, 24, 5]
GROUPS: [12, 24, 48, 96]
CHANNELS: 192
LAYER_SCALE: 1e-5
OFFSET_SCALE: 2.0
MLP_RATIO: 4.0
POST_NORM: True
TRAIN:
EMA:
ENABLE: true
DECAY: 0.9999
EPOCHS: 20
WARMUP_EPOCHS: 2
WEIGHT_DECAY: 0.05
BASE_LR: 2e-05 # 512
WARMUP_LR: .0
MIN_LR: .0
LR_LAYER_DECAY: true
LR_LAYER_DECAY_RATIO: 0.9
USE_CHECKPOINT: true
OPTIMIZER:
DCN_LR_MUL: 0.1
AMP_OPT_LEVEL: O0
EVAL_FREQ: 1
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
from .build import build_loader
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import torch
import numpy as np
import torch.distributed as dist
from torchvision import transforms
from timm.data import Mixup
from timm.data import create_transform
from .cached_image_folder import ImageCephDataset
from .samplers import SubsetRandomSampler, NodeDistributedSampler
try:
from torchvision.transforms import InterpolationMode
def _pil_interp(method):
if method == 'bicubic':
return InterpolationMode.BICUBIC
elif method == 'lanczos':
return InterpolationMode.LANCZOS
elif method == 'hamming':
return InterpolationMode.HAMMING
else:
return InterpolationMode.BILINEAR
except ImportError:
# older torchvision: fall back to timm's interpolation helper
from timm.data.transforms import _pil_interp
class TTA(torch.nn.Module):
def __init__(self, size, scales=[1.0, 1.05, 1.1]):
super().__init__()
self.size = size
self.scales = scales
def forward(self, img):
out = []
cc = transforms.CenterCrop(self.size)
for scale in self.scales:
size_ = int(scale * self.size)
rs = transforms.Resize(size_, interpolation=_pil_interp('bicubic'))
img_ = rs(img)
img_ = cc(img_)
out.append(img_)
return out
def __repr__(self) -> str:
return f"{self.__class__.__name__}(size={self.size}, scale={self.scales})"
def build_loader(config):
config.defrost()
dataset_train, config.MODEL.NUM_CLASSES = build_dataset('train',
config=config)
config.freeze()
print(f"local rank {config.LOCAL_RANK} / global rank {dist.get_rank()}"
"successfully build train dataset")
dataset_val, _ = build_dataset('val', config=config)
print(f"local rank {config.LOCAL_RANK} / global rank {dist.get_rank()}"
"successfully build val dataset")
dataset_test, _ = build_dataset('test', config=config)
print(f"local rank {config.LOCAL_RANK} / global rank {dist.get_rank()}"
"successfully build test dataset")
num_tasks = dist.get_world_size()
global_rank = dist.get_rank()
if dataset_train is not None:
if config.DATA.IMG_ON_MEMORY:
sampler_train = NodeDistributedSampler(dataset_train)
else:
if config.DATA.ZIP_MODE and config.DATA.CACHE_MODE == 'part':
indices = np.arange(dist.get_rank(), len(dataset_train),
dist.get_world_size())
sampler_train = SubsetRandomSampler(indices)
else:
sampler_train = torch.utils.data.DistributedSampler(
dataset_train,
num_replicas=num_tasks,
rank=global_rank,
shuffle=True)
if dataset_val is not None:
if config.TEST.SEQUENTIAL:
sampler_val = torch.utils.data.SequentialSampler(dataset_val)
else:
sampler_val = torch.utils.data.distributed.DistributedSampler(
dataset_val, shuffle=False)
if dataset_test is not None:
if config.TEST.SEQUENTIAL:
sampler_test = torch.utils.data.SequentialSampler(dataset_test)
else:
sampler_test = torch.utils.data.distributed.DistributedSampler(
dataset_test, shuffle=False)
data_loader_train = torch.utils.data.DataLoader(
dataset_train,
sampler=sampler_train,
batch_size=config.DATA.BATCH_SIZE,
num_workers=config.DATA.NUM_WORKERS,
pin_memory=config.DATA.PIN_MEMORY,
drop_last=True,
persistent_workers=True) if dataset_train is not None else None
data_loader_val = torch.utils.data.DataLoader(
dataset_val,
sampler=sampler_val,
batch_size=config.DATA.BATCH_SIZE,
shuffle=False,
num_workers=config.DATA.NUM_WORKERS,
pin_memory=config.DATA.PIN_MEMORY,
drop_last=False,
persistent_workers=True) if dataset_val is not None else None
data_loader_test = torch.utils.data.DataLoader(
dataset_test,
sampler=sampler_test,
batch_size=config.DATA.BATCH_SIZE,
shuffle=False,
num_workers=config.DATA.NUM_WORKERS,
pin_memory=config.DATA.PIN_MEMORY,
drop_last=False,
persistent_workers=True) if dataset_test is not None else None
# setup mixup / cutmix
mixup_fn = None
mixup_active = config.AUG.MIXUP > 0 or config.AUG.CUTMIX > 0. or config.AUG.CUTMIX_MINMAX is not None
if mixup_active:
mixup_fn = Mixup(mixup_alpha=config.AUG.MIXUP,
cutmix_alpha=config.AUG.CUTMIX,
cutmix_minmax=config.AUG.CUTMIX_MINMAX,
prob=config.AUG.MIXUP_PROB,
switch_prob=config.AUG.MIXUP_SWITCH_PROB,
mode=config.AUG.MIXUP_MODE,
label_smoothing=config.MODEL.LABEL_SMOOTHING,
num_classes=config.MODEL.NUM_CLASSES)
return dataset_train, dataset_val, dataset_test, data_loader_train, \
data_loader_val, data_loader_test, mixup_fn
def build_dataset(split, config):
transform = build_transform(split == 'train', config)
dataset = None
nb_classes = None
prefix = split
if config.DATA.DATASET == 'imagenet':
if prefix == 'train' and not config.EVAL_MODE:
root = os.path.join(config.DATA.DATA_PATH, 'train')
dataset = ImageCephDataset(root,
'train',
transform=transform,
on_memory=config.DATA.IMG_ON_MEMORY)
elif prefix == 'val':
root = os.path.join(config.DATA.DATA_PATH, 'val')
dataset = ImageCephDataset(root, 'val', transform=transform)
nb_classes = 1000
elif config.DATA.DATASET == 'imagenet22K':
if prefix == 'train':
if not config.EVAL_MODE:
root = config.DATA.DATA_PATH
dataset = ImageCephDataset(root,
'train',
transform=transform,
on_memory=config.DATA.IMG_ON_MEMORY)
nb_classes = 21841
elif prefix == 'val':
root = os.path.join(config.DATA.DATA_PATH, 'val')
dataset = ImageCephDataset(root, 'val', transform=transform)
nb_classes = 1000
else:
raise NotImplementedError(
f'build_dataset does not support {config.DATA.DATASET}')
return dataset, nb_classes
def build_transform(is_train, config):
resize_im = config.DATA.IMG_SIZE > 32
if is_train:
# this should always dispatch to transforms_imagenet_train
transform = create_transform(
input_size=config.DATA.IMG_SIZE,
is_training=True,
color_jitter=config.AUG.COLOR_JITTER
if config.AUG.COLOR_JITTER > 0 else None,
auto_augment=config.AUG.AUTO_AUGMENT
if config.AUG.AUTO_AUGMENT != 'none' else None,
re_prob=config.AUG.REPROB,
re_mode=config.AUG.REMODE,
re_count=config.AUG.RECOUNT,
interpolation=config.DATA.INTERPOLATION,
)
if not resize_im:
# replace RandomResizedCropAndInterpolation with
# RandomCrop
transform.transforms[0] = transforms.RandomCrop(
config.DATA.IMG_SIZE, padding=4)
return transform
t = []
if resize_im:
if config.TEST.CROP:
size = int(1.0 * config.DATA.IMG_SIZE)
t.append(
transforms.Resize(size,
interpolation=_pil_interp(
config.DATA.INTERPOLATION)),
# to maintain same ratio w.r.t. 224 images
)
t.append(transforms.CenterCrop(config.DATA.IMG_SIZE))
elif config.AUG.RANDOM_RESIZED_CROP:
t.append(
transforms.RandomResizedCrop(
(config.DATA.IMG_SIZE, config.DATA.IMG_SIZE),
interpolation=_pil_interp(config.DATA.INTERPOLATION)))
else:
t.append(
transforms.Resize(
(config.DATA.IMG_SIZE, config.DATA.IMG_SIZE),
interpolation=_pil_interp(config.DATA.INTERPOLATION)))
t.append(transforms.ToTensor())
t.append(transforms.Normalize(config.AUG.MEAN, config.AUG.STD))
return transforms.Compose(t)
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import io
import os
import re
import time
import json
import math
import mmcv
import torch
import logging
import os.path as osp
from PIL import Image
from tqdm import tqdm, trange
from abc import abstractmethod
import torch.utils.data as data
import torch.distributed as dist
from mmcv.fileio import FileClient
from .zipreader import is_zip_path, ZipReader
_logger = logging.getLogger(__name__)
_ERROR_RETRY = 50
def has_file_allowed_extension(filename, extensions):
"""Checks if a file is an allowed extension.
Args:
filename (string): path to a file
Returns:
bool: True if the filename ends with a known image extension
"""
filename_lower = filename.lower()
return any(filename_lower.endswith(ext) for ext in extensions)
def find_classes(dir):
classes = [
d for d in os.listdir(dir) if os.path.isdir(os.path.join(dir, d))
]
classes.sort()
class_to_idx = {classes[i]: i for i in range(len(classes))}
return classes, class_to_idx
def make_dataset(dir, class_to_idx, extensions):
images = []
dir = os.path.expanduser(dir)
for target in sorted(os.listdir(dir)):
d = os.path.join(dir, target)
if not os.path.isdir(d):
continue
for root, _, fnames in sorted(os.walk(d)):
for fname in sorted(fnames):
if has_file_allowed_extension(fname, extensions):
path = os.path.join(root, fname)
item = (path, class_to_idx[target])
images.append(item)
return images
def make_dataset_with_ann(ann_file, img_prefix, extensions):
images = []
with open(ann_file, "r") as f:
contents = f.readlines()
for line_str in contents:
path_contents = [c for c in line_str.split('\t')]
im_file_name = path_contents[0]
class_index = int(path_contents[1])
assert str.lower(os.path.splitext(im_file_name)[-1]) in extensions
item = (os.path.join(img_prefix, im_file_name), class_index)
images.append(item)
return images
class DatasetFolder(data.Dataset):
"""A generic data loader where the samples are arranged in this way: ::
root/class_x/xxx.ext
root/class_x/xxy.ext
root/class_x/xxz.ext
root/class_y/123.ext
root/class_y/nsdf3.ext
root/class_y/asd932_.ext
Args:
root (string): Root directory path.
loader (callable): A function to load a sample given its path.
extensions (list[string]): A list of allowed extensions.
transform (callable, optional): A function/transform that takes in
a sample and returns a transformed version.
E.g, ``transforms.RandomCrop`` for images.
target_transform (callable, optional): A function/transform that takes
in the target and transforms it.
Attributes:
samples (list): List of (sample path, class_index) tuples
"""
def __init__(self,
root,
loader,
extensions,
ann_file='',
img_prefix='',
transform=None,
target_transform=None,
cache_mode="no"):
# image folder mode
if ann_file == '':
_, class_to_idx = find_classes(root)
samples = make_dataset(root, class_to_idx, extensions)
# zip mode
else:
samples = make_dataset_with_ann(os.path.join(root, ann_file),
os.path.join(root, img_prefix),
extensions)
if len(samples) == 0:
raise (RuntimeError("Found 0 files in subfolders of: " + root +
"\n" + "Supported extensions are: " +
",".join(extensions)))
self.root = root
self.loader = loader
self.extensions = extensions
self.samples = samples
self.labels = [y_1k for _, y_1k in samples]
self.classes = list(set(self.labels))
self.transform = transform
self.target_transform = target_transform
self.cache_mode = cache_mode
if self.cache_mode != "no":
self.init_cache()
def init_cache(self):
assert self.cache_mode in ["part", "full"]
n_sample = len(self.samples)
global_rank = dist.get_rank()
world_size = dist.get_world_size()
samples_bytes = [None for _ in range(n_sample)]
start_time = time.time()
for index in range(n_sample):
if index % (n_sample // 10) == 0:
t = time.time() - start_time
print(
f'global_rank {dist.get_rank()} cached {index}/{n_sample} takes {t:.2f}s per block'
)
start_time = time.time()
path, target = self.samples[index]
if self.cache_mode == "full":
samples_bytes[index] = (ZipReader.read(path), target)
elif self.cache_mode == "part" and index % world_size == global_rank:
samples_bytes[index] = (ZipReader.read(path), target)
else:
samples_bytes[index] = (path, target)
self.samples = samples_bytes
def __getitem__(self, index):
"""
Args:
index (int): Index
Returns:
tuple: (sample, target) where target is class_index of the target class.
"""
path, target = self.samples[index]
sample = self.loader(path)
if self.transform is not None:
sample = self.transform(sample)
if self.target_transform is not None:
target = self.target_transform(target)
return sample, target
def __len__(self):
return len(self.samples)
def __repr__(self):
fmt_str = 'Dataset ' + self.__class__.__name__ + '\n'
fmt_str += ' Number of datapoints: {}\n'.format(self.__len__())
fmt_str += ' Root Location: {}\n'.format(self.root)
tmp = ' Transforms (if any): '
fmt_str += '{0}{1}\n'.format(
tmp,
self.transform.__repr__().replace('\n', '\n' + ' ' * len(tmp)))
tmp = ' Target Transforms (if any): '
fmt_str += '{0}{1}'.format(
tmp,
self.target_transform.__repr__().replace('\n',
'\n' + ' ' * len(tmp)))
return fmt_str
IMG_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif']
def pil_loader(path):
# open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835)
if isinstance(path, bytes):
img = Image.open(io.BytesIO(path))
elif is_zip_path(path):
data = ZipReader.read(path)
img = Image.open(io.BytesIO(data))
else:
with open(path, 'rb') as f:
img = Image.open(f)
return img.convert('RGB')
return img.convert('RGB')
def accimage_loader(path):
import accimage
try:
return accimage.Image(path)
except IOError:
# Potentially a decoding problem, fall back to PIL.Image
return pil_loader(path)
def default_img_loader(path):
from torchvision import get_image_backend
if get_image_backend() == 'accimage':
return accimage_loader(path)
else:
return pil_loader(path)
class CachedImageFolder(DatasetFolder):
"""A generic data loader where the images are arranged in this way: ::
root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png
root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png
Args:
root (string): Root directory path.
transform (callable, optional): A function/transform that takes in an PIL image
and returns a transformed version. E.g, ``transforms.RandomCrop``
target_transform (callable, optional): A function/transform that takes in the
target and transforms it.
loader (callable, optional): A function to load an image given its path.
Attributes:
imgs (list): List of (image path, class_index) tuples
"""
def __init__(self,
root,
ann_file='',
img_prefix='',
transform=None,
target_transform=None,
loader=default_img_loader,
cache_mode="no"):
super(CachedImageFolder,
self).__init__(root,
loader,
IMG_EXTENSIONS,
ann_file=ann_file,
img_prefix=img_prefix,
transform=transform,
target_transform=target_transform,
cache_mode=cache_mode)
self.imgs = self.samples
def __getitem__(self, index):
"""
Args:
index (int): Index
Returns:
tuple: (image, target) where target is class_index of the target class.
"""
path, target = self.samples[index]
image = self.loader(path)
if self.transform is not None:
img = self.transform(image)
else:
img = image
if self.target_transform is not None:
target = self.target_transform(target)
return img, target
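The `CachedImageFolder` docstring describes a `root/<class>/<image>` layout mapped into `(path, class_index)` tuples. A hypothetical sketch of that mapping, scanning an in-memory dict instead of the filesystem (a real loader walks the directory tree):

```python
# Build (path, class_index) samples from a class -> filenames layout.
def make_samples(layout):
    classes = sorted(layout)                      # deterministic class order
    class_to_idx = {c: i for i, c in enumerate(classes)}
    samples = []
    for cls in classes:
        for fname in sorted(layout[cls]):
            samples.append((f'{cls}/{fname}', class_to_idx[cls]))
    return samples, class_to_idx

samples, class_to_idx = make_samples({
    'dog': ['xxx.png', 'xxy.png'],
    'cat': ['123.png'],
})
print(class_to_idx)   # {'cat': 0, 'dog': 1}
print(samples[0])     # ('cat/123.png', 0)
```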
class ImageCephDataset(data.Dataset):
def __init__(self,
root,
split,
parser=None,
transform=None,
target_transform=None,
on_memory=False):
        # both ImageNet-22k and ImageNet-1k annotations live under meta_data/
        annotation_root = 'meta_data/'

if parser is None or isinstance(parser, str):
parser = ParserCephImage(root=root,
split=split,
annotation_root=annotation_root,
on_memory=on_memory)
self.parser = parser
self.transform = transform
self.target_transform = target_transform
self._consecutive_errors = 0
def __getitem__(self, index):
img, target = self.parser[index]
self._consecutive_errors = 0
if self.transform is not None:
img = self.transform(img)
if target is None:
target = -1
elif self.target_transform is not None:
target = self.target_transform(target)
return img, target
def __len__(self):
return len(self.parser)
def filename(self, index, basename=False, absolute=False):
return self.parser.filename(index, basename, absolute)
def filenames(self, basename=False, absolute=False):
return self.parser.filenames(basename, absolute)
class Parser:
def __init__(self):
pass
@abstractmethod
def _filename(self, index, basename=False, absolute=False):
pass
def filename(self, index, basename=False, absolute=False):
return self._filename(index, basename=basename, absolute=absolute)
def filenames(self, basename=False, absolute=False):
return [
self._filename(index, basename=basename, absolute=absolute)
for index in range(len(self))
]
class ParserCephImage(Parser):
def __init__(self,
root,
split,
annotation_root,
on_memory=False,
**kwargs):
super().__init__()
self.file_client = None
self.kwargs = kwargs
self.root = root # dataset:s3://imagenet22k
if '22k' in root:
self.io_backend = 'petrel'
with open(osp.join(annotation_root, '22k_class_to_idx.json'),
'r') as f:
self.class_to_idx = json.loads(f.read())
with open(osp.join(annotation_root, '22k_label.txt'), 'r') as f:
self.samples = f.read().splitlines()
else:
self.io_backend = 'disk'
self.class_to_idx = None
with open(osp.join(annotation_root, f'{split}.txt'), 'r') as f:
self.samples = f.read().splitlines()
local_rank = None
local_size = None
self._consecutive_errors = 0
self.on_memory = on_memory
if on_memory:
self.holder = {}
if local_rank is None:
local_rank = int(os.environ.get('LOCAL_RANK', 0))
if local_size is None:
local_size = int(os.environ.get('LOCAL_SIZE', 1))
self.local_rank = local_rank
self.local_size = local_size
self.rank = int(os.environ["RANK"])
self.world_size = int(os.environ['WORLD_SIZE'])
self.num_replicas = int(os.environ['WORLD_SIZE'])
self.num_parts = local_size
self.num_samples = int(
math.ceil(len(self.samples) * 1.0 / self.num_replicas))
self.total_size = self.num_samples * self.num_replicas
self.total_size_parts = self.num_samples * self.num_replicas // self.num_parts
self.load_onto_memory_v2()
def load_onto_memory(self):
print("Loading images onto memory...", self.local_rank,
self.local_size)
if self.file_client is None:
self.file_client = FileClient(self.io_backend, **self.kwargs)
for index in trange(len(self.samples)):
if index % self.local_size != self.local_rank:
continue
path, _ = self.samples[index].split(' ')
path = osp.join(self.root, path)
img_bytes = self.file_client.get(path)
self.holder[path] = img_bytes
print("Loading complete!")
def load_onto_memory_v2(self):
# print("Loading images onto memory...", self.local_rank, self.local_size)
t = torch.Generator()
t.manual_seed(0)
indices = torch.randperm(len(self.samples), generator=t).tolist()
# indices = range(len(self.samples))
indices = [i for i in indices if i % self.num_parts == self.local_rank]
# add extra samples to make it evenly divisible
indices += indices[:(self.total_size_parts - len(indices))]
assert len(indices) == self.total_size_parts
# subsample
indices = indices[self.rank // self.num_parts:self.
total_size_parts:self.num_replicas // self.num_parts]
assert len(indices) == self.num_samples
if self.file_client is None:
self.file_client = FileClient(self.io_backend, **self.kwargs)
for index in tqdm(indices):
if index % self.local_size != self.local_rank:
continue
path, _ = self.samples[index].split(' ')
path = osp.join(self.root, path)
img_bytes = self.file_client.get(path)
self.holder[path] = img_bytes
print("Loading complete!")
def __getitem__(self, index):
if self.file_client is None:
self.file_client = FileClient(self.io_backend, **self.kwargs)
filepath, target = self.samples[index].split(' ')
filepath = osp.join(self.root, filepath)
try:
if self.on_memory:
img_bytes = self.holder[filepath]
            else:
                img_bytes = self.file_client.get(filepath)
img = mmcv.imfrombytes(img_bytes)[:, :, ::-1]
except Exception as e:
_logger.warning(
f'Skipped sample (index {index}, file {filepath}). {str(e)}')
self._consecutive_errors += 1
if self._consecutive_errors < _ERROR_RETRY:
return self.__getitem__((index + 1) % len(self))
else:
raise e
self._consecutive_errors = 0
img = Image.fromarray(img)
try:
if self.class_to_idx is not None:
target = self.class_to_idx[target]
else:
target = int(target)
        except (KeyError, ValueError) as e:
            _logger.error(f'invalid target {target!r} for file {filepath}: {e}')
            raise
return img, target
def __len__(self):
return len(self.samples)
def _filename(self, index, basename=False, absolute=False):
filename, _ = self.samples[index].split(' ')
filename = osp.join(self.root, filename)
return filename
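`ParserCephImage.__getitem__` skips an unreadable sample by retrying the next index (modulo the dataset length) until a retry budget is exhausted. A self-contained sketch of that pattern, with illustrative names only:

```python
# Skip-and-retry __getitem__: on a failed read, advance to the next
# index until the retry budget runs out.
ERROR_RETRY = 3

class RetryDataset:
    def __init__(self, items, bad_indices):
        self.items = items
        self.bad = set(bad_indices)   # indices that simulate decode failures
        self.errors = 0

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        if index in self.bad:
            self.errors += 1
            if self.errors < ERROR_RETRY:
                return self[(index + 1) % len(self)]   # try the next sample
            raise IOError(f'giving up at index {index}')
        self.errors = 0                                # reset on success
        return self.items[index]

ds = RetryDataset(['a', 'b', 'c'], bad_indices=[1])
print(ds[1])   # index 1 fails, falls through to index 2 -> 'c'
```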
def get_temporal_info(date, miss_hour=False):
try:
if date:
if miss_hour:
pattern = re.compile(r'(\d*)-(\d*)-(\d*)', re.I)
else:
pattern = re.compile(r'(\d*)-(\d*)-(\d*) (\d*):(\d*):(\d*)',
re.I)
m = pattern.match(date.strip())
if m:
year = int(m.group(1))
month = int(m.group(2))
day = int(m.group(3))
x_month = math.sin(2 * math.pi * month / 12)
y_month = math.cos(2 * math.pi * month / 12)
if miss_hour:
x_hour = 0
y_hour = 0
else:
hour = int(m.group(4))
x_hour = math.sin(2 * math.pi * hour / 24)
y_hour = math.cos(2 * math.pi * hour / 24)
return [x_month, y_month, x_hour, y_hour]
else:
return [0, 0, 0, 0]
else:
return [0, 0, 0, 0]
    except (AttributeError, ValueError):
        return [0, 0, 0, 0]
def get_spatial_info(latitude, longitude):
if latitude and longitude:
latitude = math.radians(latitude)
longitude = math.radians(longitude)
x = math.cos(latitude) * math.cos(longitude)
y = math.cos(latitude) * math.sin(longitude)
z = math.sin(latitude)
return [x, y, z]
else:
return [0, 0, 0]
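The two helpers above embed cyclic quantities smoothly: months and hours as sin/cos points on the unit circle (so December and January end up close together), and latitude/longitude as a point on the unit sphere. A quick numeric check of both mappings:

```python
import math

# Month encoding: sin/cos on the unit circle.
def month_encoding(month):
    return (math.sin(2 * math.pi * month / 12),
            math.cos(2 * math.pi * month / 12))

x12, y12 = month_encoding(12)
x1, y1 = month_encoding(1)
# December (12) and January (1) are neighbours on the circle
dist = math.hypot(x12 - x1, y12 - y1)
print(round(dist, 3))   # ~0.518, versus 2.0 for opposite months (Jan vs Jul)

# Spatial encoding: a point on the equator at longitude 0 maps to (1, 0, 0).
lat, lon = math.radians(0.0), math.radians(0.0)
vec = (math.cos(lat) * math.cos(lon),
       math.cos(lat) * math.sin(lon),
       math.sin(lat))
print(vec)  # (1.0, 0.0, 0.0)
```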
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import torch
import os
import math
from torch.utils.data.sampler import Sampler
import torch.distributed as dist
import numpy as np
class SubsetRandomSampler(torch.utils.data.Sampler):
"""Samples elements randomly from a given list of indices, without replacement.
Arguments:
indices (sequence): a sequence of indices
"""
def __init__(self, indices):
self.epoch = 0
self.indices = indices
def __iter__(self):
return (self.indices[i] for i in torch.randperm(len(self.indices)))
def __len__(self):
return len(self.indices)
def set_epoch(self, epoch):
self.epoch = epoch
class NodeDistributedSampler(Sampler):
    """Sampler that restricts data loading to a subset of the dataset.
    It is especially useful in conjunction with
    :class:`torch.nn.parallel.DistributedDataParallel`. In such a case, each
    process can pass a DistributedSampler instance as a DataLoader sampler,
    and load a subset of the original dataset that is exclusive to it.
.. note::
Dataset is assumed to be of constant size.
Arguments:
dataset: Dataset used for sampling.
num_replicas (optional): Number of processes participating in
distributed training.
rank (optional): Rank of the current process within num_replicas.
"""
def __init__(self,
dataset,
num_replicas=None,
rank=None,
local_rank=None,
local_size=None):
if num_replicas is None:
if not dist.is_available():
raise RuntimeError(
"Requires distributed package to be available")
num_replicas = dist.get_world_size()
if rank is None:
if not dist.is_available():
raise RuntimeError(
"Requires distributed package to be available")
rank = dist.get_rank()
if local_rank is None:
local_rank = int(os.environ.get('LOCAL_RANK', 0))
if local_size is None:
local_size = int(os.environ.get('LOCAL_SIZE', 1))
self.dataset = dataset
self.num_replicas = num_replicas
self.num_parts = local_size
self.rank = rank
self.local_rank = local_rank
self.epoch = 0
self.num_samples = int(
math.ceil(len(self.dataset) * 1.0 / self.num_replicas))
self.total_size = self.num_samples * self.num_replicas
self.total_size_parts = self.num_samples * self.num_replicas // self.num_parts
def __iter__(self):
# deterministically shuffle based on epoch
g = torch.Generator()
g.manual_seed(self.epoch)
t = torch.Generator()
t.manual_seed(0)
indices = torch.randperm(len(self.dataset), generator=t).tolist()
# indices = range(len(self.dataset))
indices = [i for i in indices if i % self.num_parts == self.local_rank]
# add extra samples to make it evenly divisible
indices += indices[:(self.total_size_parts - len(indices))]
assert len(indices) == self.total_size_parts
# subsample
indices = indices[self.rank // self.num_parts:self.
total_size_parts:self.num_replicas // self.num_parts]
index = torch.randperm(len(indices), generator=g).tolist()
indices = list(np.array(indices)[index])
assert len(indices) == self.num_samples
return iter(indices)
def __len__(self):
return self.num_samples
def set_epoch(self, epoch):
self.epoch = epoch
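The index arithmetic in `NodeDistributedSampler.__iter__` can be checked with plain integers (shuffling omitted): each node keeps the indices with `i % num_parts == local_rank`, pads them to a common length, then each process strides over the node's share. `node_partition` below is an illustrative helper, not part of the codebase:

```python
import math

def node_partition(n, num_replicas, num_parts, rank, local_rank):
    num_samples = math.ceil(n / num_replicas)
    total_size_parts = num_samples * num_replicas // num_parts
    # node-local share: every num_parts-th index, offset by local_rank
    indices = [i for i in range(n) if i % num_parts == local_rank]
    indices += indices[:total_size_parts - len(indices)]   # pad evenly
    assert len(indices) == total_size_parts
    # stride so each node gets a disjoint slice of its GPU's share
    return indices[rank // num_parts:total_size_parts:num_replicas // num_parts]

# 2 nodes x 2 GPUs (num_replicas=4, num_parts=2), 8 samples:
print(node_partition(8, 4, 2, rank=0, local_rank=0))  # [0, 4]
print(node_partition(8, 4, 2, rank=3, local_rank=1))  # [3, 7]
```

Together the four ranks cover all eight indices exactly once, which is the disjointness the sampler relies on.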
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import zipfile
import io
import numpy as np
from PIL import Image
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
def is_zip_path(img_or_path):
    """Check whether the given path points inside a zip archive ('.zip@' marker)."""
return '.zip@' in img_or_path
class ZipReader(object):
"""A class to read zipped files"""
zip_bank = dict()
def __init__(self):
super(ZipReader, self).__init__()
@staticmethod
def get_zipfile(path):
zip_bank = ZipReader.zip_bank
if path not in zip_bank:
zfile = zipfile.ZipFile(path, 'r')
zip_bank[path] = zfile
return zip_bank[path]
@staticmethod
def split_zip_style_path(path):
        # str.index would raise ValueError instead of returning -1,
        # so use find() to make the assertion meaningful
        pos_at = path.find('@')
        assert pos_at != -1, "character '@' was not found in the given path '%s'" % path
zip_path = path[0:pos_at]
folder_path = path[pos_at + 1:]
folder_path = str.strip(folder_path, '/')
return zip_path, folder_path
@staticmethod
def list_folder(path):
zip_path, folder_path = ZipReader.split_zip_style_path(path)
zfile = ZipReader.get_zipfile(zip_path)
folder_list = []
        for file_folder_name in zfile.namelist():
            file_folder_name = str.strip(file_folder_name, '/')
            if file_folder_name.startswith(folder_path) and \
                    len(os.path.splitext(file_folder_name)[-1]) == 0 and \
                    file_folder_name != folder_path:
                if len(folder_path) == 0:
                    folder_list.append(file_folder_name)
                else:
                    folder_list.append(file_folder_name[len(folder_path) + 1:])
        return folder_list
@staticmethod
def list_files(path, extension=None):
if extension is None:
extension = ['.*']
zip_path, folder_path = ZipReader.split_zip_style_path(path)
zfile = ZipReader.get_zipfile(zip_path)
file_lists = []
        for file_folder_name in zfile.namelist():
            file_folder_name = str.strip(file_folder_name, '/')
            if file_folder_name.startswith(folder_path) and \
                    str.lower(os.path.splitext(file_folder_name)[-1]) in extension:
                if len(folder_path) == 0:
                    file_lists.append(file_folder_name)
                else:
                    file_lists.append(file_folder_name[len(folder_path) + 1:])
return file_lists
@staticmethod
def read(path):
zip_path, path_img = ZipReader.split_zip_style_path(path)
zfile = ZipReader.get_zipfile(zip_path)
data = zfile.read(path_img)
return data
@staticmethod
def imread(path):
zip_path, path_img = ZipReader.split_zip_style_path(path)
zfile = ZipReader.get_zipfile(zip_path)
data = zfile.read(path_img)
        try:
            im = Image.open(io.BytesIO(data))
        except Exception:
            # fall back to a random image so one corrupt file does not abort training
            print("ERROR IMG LOADED: ", path_img)
            random_img = np.random.rand(224, 224, 3) * 255
            im = Image.fromarray(np.uint8(random_img))
return im
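The `.zip@` convention used throughout `ZipReader`: everything before `@` is the archive on disk, everything after is the member path inside it. A standalone sketch of `split_zip_style_path`:

```python
# Split 'archive.zip@member/path' into (archive, member).
def split_zip_style_path(path):
    pos_at = path.find('@')
    assert pos_at != -1, f"'@' not found in '{path}'"
    return path[:pos_at], path[pos_at + 1:].strip('/')

print(split_zip_style_path('train.zip@/n01440764/img.jpg'))
# ('train.zip', 'n01440764/img.jpg')
```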
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
from typing import Any, Callable
import torch
import torch.distributed as dist
def _allreduce_fut(process_group: dist.ProcessGroup,
tensor: torch.Tensor) -> torch.futures.Future[torch.Tensor]:
"Averages the input gradient tensor by allreduce and returns a future."
group_to_use = process_group if process_group is not None else dist.group.WORLD
# Apply the division first to avoid overflow, especially for FP16.
tensor.div_(group_to_use.size())
return (dist.all_reduce(
tensor, group=group_to_use,
async_op=True).get_future().then(lambda fut: fut.value()[0]))
def allreduce_hook(
process_group: dist.ProcessGroup,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
    """
    This DDP communication hook simply calls ``allreduce`` on the ``GradBucket``
    tensors. Once gradient tensors are aggregated across all workers, its ``then``
    callback takes the mean and returns the result. If a user registers this hook,
    the DDP results are expected to be the same as the case where no hook was
    registered. Hence, this hook does not change the behavior of DDP, and users
    can use it as a reference or modify it to log useful information or for any
    other purpose, without affecting DDP behavior.
    Example::
        >>> ddp_model.register_comm_hook(process_group, allreduce_hook)
    """
return _allreduce_fut(process_group, bucket.buffer())
def fp16_compress_hook(
process_group: dist.ProcessGroup,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
"""
This DDP communication hook implements a simple gradient compression
approach that casts ``GradBucket`` tensor to half-precision floating-point format (``torch.float16``)
and then divides it by the process group size.
    It then allreduces those ``float16`` gradient tensors. Once the compressed gradient
    tensors are allreduced, the chained callback ``decompress`` casts them back to the input data type (such as ``float32``).
Example::
>>> ddp_model.register_comm_hook(process_group, fp16_compress_hook)
"""
group_to_use = process_group if process_group is not None else dist.group.WORLD
world_size = group_to_use.size()
compressed_tensor = bucket.buffer().to(torch.float16).div_(world_size)
fut = dist.all_reduce(compressed_tensor, group=group_to_use,
async_op=True).get_future()
def decompress(fut):
decompressed_tensor = bucket.buffer()
# Decompress in place to reduce the peak memory.
# See: https://github.com/pytorch/pytorch/issues/45968
decompressed_tensor.copy_(fut.value()[0])
return decompressed_tensor
return fut.then(decompress)
# TODO: create an internal helper function and extract the duplicate code in FP16_compress and BF16_compress.
def bf16_compress_hook(
process_group: dist.ProcessGroup,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
"""
Warning: This API is experimental, and it requires NCCL version later than 2.9.6.
This DDP communication hook implements a simple gradient compression
approach that casts ``GradBucket`` tensor to half-precision
`Brain floating point format <https://en.wikipedia.org/wiki/Bfloat16_floating-point_format>`_ (``torch.bfloat16``)
and then divides it by the process group size.
    It then allreduces those ``bfloat16`` gradient tensors. Once the compressed gradient
    tensors are allreduced, the chained callback ``decompress`` casts them back to the input data type (such as ``float32``).
Example::
>>> ddp_model.register_comm_hook(process_group, bf16_compress_hook)
"""
group_to_use = process_group if process_group is not None else dist.group.WORLD
world_size = group_to_use.size()
compressed_tensor = bucket.buffer().to(torch.bfloat16).div_(world_size)
fut = dist.all_reduce(compressed_tensor, group=group_to_use,
async_op=True).get_future()
def decompress(fut):
decompressed_tensor = bucket.buffer()
# Decompress in place to reduce the peak memory.
# See: https://github.com/pytorch/pytorch/issues/45968
decompressed_tensor.copy_(fut.value()[0])
return decompressed_tensor
return fut.then(decompress)
def fp16_compress_wrapper(
hook: Callable[[Any, dist.GradBucket], torch.futures.Future[torch.Tensor]]
) -> Callable[[Any, dist.GradBucket], torch.futures.Future[torch.Tensor]]:
"""
This wrapper casts the input gradient tensor of a given DDP communication hook to half-precision
floating point format (``torch.float16``), and casts the resulting tensor of the given hook back to
the input data type, such as ``float32``.
Therefore, ``fp16_compress_hook`` is equivalent to ``fp16_compress_wrapper(allreduce_hook)``.
Example::
>>> state = PowerSGDState(process_group=process_group, matrix_approximation_rank=1, start_powerSGD_iter=10)
>>> ddp_model.register_comm_hook(state, fp16_compress_wrapper(powerSGD_hook))
"""
def fp16_compress_wrapper_hook(
hook_state,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
# Cast bucket tensor to FP16.
bucket.set_buffer(bucket.buffer().to(torch.float16))
fut = hook(hook_state, bucket)
def decompress(fut):
decompressed_tensor = bucket.buffer()
# Decompress in place to reduce the peak memory.
# See: https://github.com/pytorch/pytorch/issues/45968
decompressed_tensor.copy_(fut.value())
return decompressed_tensor
# Decompress after hook has run.
return fut.then(decompress)
return fp16_compress_wrapper_hook
def bf16_compress_wrapper(
hook: Callable[[Any, dist.GradBucket], torch.futures.Future[torch.Tensor]]
) -> Callable[[Any, dist.GradBucket], torch.futures.Future[torch.Tensor]]:
"""
Warning: This API is experimental, and it requires NCCL version later than 2.9.6.
This wrapper casts the input gradient tensor of a given DDP communication hook to half-precision
    `Brain floating point format <https://en.wikipedia.org/wiki/Bfloat16_floating-point_format>`_ (``torch.bfloat16``),
and casts the resulting tensor of the given hook back to the input data type, such as ``float32``.
Therefore, ``bf16_compress_hook`` is equivalent to ``bf16_compress_wrapper(allreduce_hook)``.
Example::
>>> state = PowerSGDState(process_group=process_group, matrix_approximation_rank=1, start_powerSGD_iter=10)
>>> ddp_model.register_comm_hook(state, bf16_compress_wrapper(powerSGD_hook))
"""
def bf16_compress_wrapper_hook(
hook_state,
bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
# Cast bucket tensor to BF16.
bucket.set_buffer(bucket.buffer().to(torch.bfloat16))
fut = hook(hook_state, bucket)
def decompress(fut):
decompressed_tensor = bucket.buffer()
# Decompress in place to reduce the peak memory.
# See: https://github.com/pytorch/pytorch/issues/45968
decompressed_tensor.copy_(fut.value())
return decompressed_tensor
# Decompress after hook has run.
return fut.then(decompress)
return bf16_compress_wrapper_hook
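Both wrappers above share one pattern: cast the bucket down before the wrapped hook runs, then chain a callback that casts the result back up. A torch-free sketch of that compose-around-a-hook structure, using a plain function in place of the async future (all names here are illustrative):

```python
# Wrap a hook so its input is compressed first and its output
# is decompressed afterwards, mirroring fp16/bf16_compress_wrapper.
def compress_wrapper(hook, compress, decompress):
    def wrapped(bucket):
        result = hook(compress(bucket))    # run inner hook on compressed data
        return decompress(result)          # restore original representation
    return wrapped

# toy "hook": average a list of gradients; toy compression: round to ints
average_hook = lambda grads: sum(grads) / len(grads)
hook = compress_wrapper(average_hook,
                        compress=lambda g: [round(x) for x in g],
                        decompress=float)
print(hook([0.6, 1.4, 2.0]))  # averages the rounded [1, 1, 2]
```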
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import time
import argparse
import torch
from tqdm import tqdm
from config import get_config
from models import build_model
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('--model_name', type=str,
default='internimage_t_1k_224')
parser.add_argument('--ckpt_dir', type=str,
default='/mnt/petrelfs/share_data/huangzhenhang/code/internimage/checkpoint_dir/new/cls')
parser.add_argument('--onnx', default=False, action='store_true')
parser.add_argument('--trt', default=False, action='store_true')
args = parser.parse_args()
args.cfg = os.path.join('./configs', f'{args.model_name}.yaml')
args.ckpt = os.path.join(args.ckpt_dir, f'{args.model_name}.pth')
args.size = int(args.model_name.split('.')[0].split('_')[-1])
cfg = get_config(args)
return args, cfg
def get_model(args, cfg):
model = build_model(cfg)
ckpt = torch.load(args.ckpt, map_location='cpu')['model']
model.load_state_dict(ckpt)
return model
def speed_test(model, input):
    # warmup
    for _ in tqdm(range(100)):
        _ = model(input)
    # speed test: synchronize before AND after the timed loop, otherwise
    # asynchronous CUDA kernels are not fully counted in the elapsed time
    torch.cuda.synchronize()
    start = time.time()
    for _ in tqdm(range(100)):
        _ = model(input)
    torch.cuda.synchronize()
    end = time.time()
    th = 100 / (end - start)
    print(f"using time: {end - start}, throughput {th}")
def torch2onnx(args, cfg):
model = get_model(args, cfg).cuda()
# speed_test(model)
onnx_name = f'{args.model_name}.onnx'
torch.onnx.export(model,
torch.rand(1, 3, args.size, args.size).cuda(),
onnx_name,
input_names=['input'],
output_names=['output'])
return model
def onnx2trt(args):
from mmdeploy.backend.tensorrt import from_onnx
onnx_name = f'{args.model_name}.onnx'
from_onnx(
onnx_name,
args.model_name,
dict(
input=dict(
min_shape=[1, 3, args.size, args.size],
opt_shape=[1, 3, args.size, args.size],
max_shape=[1, 3, args.size, args.size],
)
),
max_workspace_size=2**30,
)
def check(args, cfg):
from mmdeploy.backend.tensorrt.wrapper import TRTWrapper
model = get_model(args, cfg).cuda()
model.eval()
trt_model = TRTWrapper(f'{args.model_name}.engine',
['output'])
x = torch.randn(1, 3, args.size, args.size).cuda()
torch_out = model(x)
trt_out = trt_model(dict(input=x))['output']
print('torch out shape:', torch_out.shape)
print('trt out shape:', trt_out.shape)
print('max delta:', (torch_out - trt_out).abs().max())
print('mean delta:', (torch_out - trt_out).abs().mean())
speed_test(model, x)
speed_test(trt_model, dict(input=x))
def main():
args, cfg = get_args()
if args.onnx or args.trt:
torch2onnx(args, cfg)
        print('torch -> onnx: success')
if args.trt:
onnx2trt(args)
print('onnx -> trt: success')
check(args, cfg)
if __name__ == '__main__':
main()
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import os
import sys
import logging
import functools
from termcolor import colored
@functools.lru_cache()
def create_logger(output_dir, dist_rank=0, name=''):
# create logger
logger = logging.getLogger(name)
logger.setLevel(logging.DEBUG)
logger.propagate = False
# create formatter
fmt = '[%(asctime)s %(name)s] (%(filename)s %(lineno)d): %(levelname)s %(message)s'
color_fmt = colored('[%(asctime)s %(name)s]', 'green') + \
colored('(%(filename)s %(lineno)d)', 'yellow') + \
': %(levelname)s %(message)s'
# create console handlers for master process
if dist_rank == 0:
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(logging.DEBUG)
console_handler.setFormatter(
logging.Formatter(fmt=color_fmt, datefmt='%Y-%m-%d %H:%M:%S'))
logger.addHandler(console_handler)
# create file handlers
file_handler = logging.FileHandler(os.path.join(
output_dir, f'log_rank{dist_rank}.txt'),
mode='a')
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(
logging.Formatter(fmt=fmt, datefmt='%Y-%m-%d %H:%M:%S'))
logger.addHandler(file_handler)
return logger
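The `@functools.lru_cache()` on `create_logger` is what prevents duplicate handlers: a second call with the same arguments returns the same configured logger instead of attaching another handler. A minimal sketch of that idiom:

```python
import functools
import logging
import sys

# Cached logger factory: repeated calls with the same name reuse the
# already-configured logger rather than adding a second handler.
@functools.lru_cache()
def get_logger(name):
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(logging.StreamHandler(sys.stdout))
    return logger

a = get_logger('demo')
b = get_logger('demo')
print(a is b, len(a.handlers))  # True 1
```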
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
import torch
from timm.scheduler.cosine_lr import CosineLRScheduler
from timm.scheduler.step_lr import StepLRScheduler
from timm.scheduler.scheduler import Scheduler
def build_scheduler(config, optimizer, n_iter_per_epoch):
num_steps = int(config.TRAIN.EPOCHS * n_iter_per_epoch)
warmup_steps = int(config.TRAIN.WARMUP_EPOCHS * n_iter_per_epoch)
decay_steps = int(config.TRAIN.LR_SCHEDULER.DECAY_EPOCHS *
n_iter_per_epoch)
lr_scheduler = None
if config.TRAIN.LR_SCHEDULER.NAME == 'cosine':
lr_scheduler = CosineLRScheduler(
optimizer,
t_initial=num_steps,
# t_mul=1.,
lr_min=config.TRAIN.MIN_LR,
warmup_lr_init=config.TRAIN.WARMUP_LR,
warmup_t=warmup_steps,
cycle_limit=1,
t_in_epochs=False,
)
elif config.TRAIN.LR_SCHEDULER.NAME == 'linear':
lr_scheduler = LinearLRScheduler(
optimizer,
t_initial=num_steps,
lr_min_rate=0.01,
warmup_lr_init=config.TRAIN.WARMUP_LR,
warmup_t=warmup_steps,
t_in_epochs=False,
)
elif config.TRAIN.LR_SCHEDULER.NAME == 'step':
lr_scheduler = StepLRScheduler(
optimizer,
decay_t=decay_steps,
decay_rate=config.TRAIN.LR_SCHEDULER.DECAY_RATE,
warmup_lr_init=config.TRAIN.WARMUP_LR,
warmup_t=warmup_steps,
t_in_epochs=False,
)
return lr_scheduler
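For reference, the cosine option built above follows the standard annealing curve, lr(t) = lr_min + 0.5 · (base_lr − lr_min) · (1 + cos(πt/T)); timm's `CosineLRScheduler` adds warmup and cycle handling on top. A sketch of just the core curve (not timm's implementation):

```python
import math

def cosine_lr(t, total_steps, base_lr, lr_min):
    # standard cosine annealing from base_lr down to lr_min
    return lr_min + 0.5 * (base_lr - lr_min) * (
        1 + math.cos(math.pi * t / total_steps))

print(cosine_lr(0, 100, 1e-3, 1e-5))    # starts at base_lr
print(cosine_lr(100, 100, 1e-3, 1e-5))  # ends at lr_min
```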
class LinearLRScheduler(Scheduler):
def __init__(
self,
optimizer: torch.optim.Optimizer,
t_initial: int,
lr_min_rate: float,
warmup_t=0,
warmup_lr_init=0.,
t_in_epochs=True,
noise_range_t=None,
noise_pct=0.67,
noise_std=1.0,
noise_seed=42,
initialize=True,
) -> None:
super().__init__(optimizer,
param_group_field="lr",
noise_range_t=noise_range_t,
noise_pct=noise_pct,
noise_std=noise_std,
noise_seed=noise_seed,
initialize=initialize)
self.t_initial = t_initial
self.lr_min_rate = lr_min_rate
self.warmup_t = warmup_t
self.warmup_lr_init = warmup_lr_init
self.t_in_epochs = t_in_epochs
if self.warmup_t:
self.warmup_steps = [(v - warmup_lr_init) / self.warmup_t
for v in self.base_values]
super().update_groups(self.warmup_lr_init)
else:
self.warmup_steps = [1 for _ in self.base_values]
def _get_lr(self, t):
if t < self.warmup_t:
lrs = [self.warmup_lr_init + t * s for s in self.warmup_steps]
else:
t = t - self.warmup_t
total_t = self.t_initial - self.warmup_t
lrs = [
v - ((v - v * self.lr_min_rate) * (t / total_t))
for v in self.base_values
]
return lrs
def get_epoch_values(self, epoch: int):
if self.t_in_epochs:
return self._get_lr(epoch)
else:
return None
def get_update_values(self, num_updates: int):
if not self.t_in_epochs:
return self._get_lr(num_updates)
else:
return None
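The warmup-then-linear-decay rule implemented by `_get_lr` above, rewritten as a standalone function for a single base learning rate (illustrative only):

```python
def linear_lr(t, t_initial, base_lr, lr_min_rate, warmup_t, warmup_lr_init):
    if t < warmup_t:
        # linear warmup from warmup_lr_init up to base_lr
        step = (base_lr - warmup_lr_init) / warmup_t
        return warmup_lr_init + t * step
    # linear decay from base_lr down to base_lr * lr_min_rate
    t -= warmup_t
    total_t = t_initial - warmup_t
    return base_lr - (base_lr - base_lr * lr_min_rate) * (t / total_t)

# 10 warmup steps from 1e-6 up to 1e-3, then decay to 1% of base over 90 steps
print(linear_lr(0, 100, 1e-3, 0.01, 10, 1e-6))    # 1e-06
print(linear_lr(100, 100, 1e-3, 0.01, 10, 1e-6))  # ~1e-05 (1% of base)
```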