Commit b6c19984 authored by dengjb
# FastRetri in FastReID
This project provides a strong baseline for fine-grained image retrieval.
## Datasets Preparation
We use `CUB200`, `Cars-196`, `Stanford Online Products` and `In-Shop` to evaluate model performance.
You can prepare the data following the [dml_cross_entropy](https://github.com/jeromerony/dml_cross_entropy) instructions.
## Usage
Each dataset's config file can be found in `projects/FastRetri/config`; you can use them to reproduce the results of this repo.
For example, to train on `CUB200`, run an experiment with `cub.yml`:
```bash
python3 projects/FastRetri/train_net.py --config-file projects/FastRetri/config/cub.yml --num-gpus 4
```
## Experiment Results
We refer to [A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses](https://arxiv.org/abs/2003.08983) as our baseline method, and on top of it we add a few tricks, such as GeM pooling.
More details can be found in the config file and code.
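The GeM (generalized-mean) pooling trick mentioned above can be sketched as follows. This is a minimal NumPy illustration of the pooling formula only, not the project's actual `GeneralizedMeanPooling` layer (which is a PyTorch module, typically with a learnable `p`):

```python
import numpy as np

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over the spatial dims of a (C, H, W) feature map.

    p = 1 reduces to average pooling; p -> infinity approaches max pooling.
    """
    x = np.clip(feature_map, eps, None)          # clamp to avoid 0 ** p issues
    return np.mean(x ** p, axis=(1, 2)) ** (1.0 / p)

# hypothetical backbone output: 2048 channels on a 7x7 grid
feats = np.random.rand(2048, 7, 7).astype(np.float32)
desc = gem_pool(feats, p=3.0)                    # a 2048-d global descriptor
```

With `p` between average and max pooling, GeM lets the descriptor emphasize the most salient activations, which is why it is a common retrieval trick.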
### CUB
| Method | Pretrained | Recall@1 | Recall@2 | Recall@4 | Recall@8 | Recall@16 | Recall@32 |
| :---: | :---: | :---: |:---: | :---: | :---: | :---: | :---: |
| [dml_cross_entropy](https://github.com/jeromerony/dml_cross_entropy) | ImageNet | 69.2 | 79.2 | 86.9 | 91.6 | 95.0 | 97.3 |
| FastRetri | ImageNet | 69.46 | 79.57 | 87.53 | 92.61 | 95.75 | 97.35 |
### Cars-196
| Method | Pretrained | Recall@1 | Recall@2 | Recall@4 | Recall@8 | Recall@16 | Recall@32 |
| :---: | :---: | :---: |:---: | :---: | :---: | :---: | :---: |
| [dml_cross_entropy](https://github.com/jeromerony/dml_cross_entropy) | ImageNet | 89.3 | 93.9 | 96.6 | 98.4 | 99.3 | 99.7 |
| FastRetri | ImageNet | 92.31 | 95.99 | 97.60 | 98.63 | 99.24 | 99.62 |
### Stanford Online Products
| Method | Pretrained | Recall@1 | Recall@10 | Recall@100 | Recall@1000 |
| :---: | :---: | :---: |:---: | :---: | :---: |
| [dml_cross_entropy](https://github.com/jeromerony/dml_cross_entropy) | ImageNet | 81.1 | 91.7 | 96.3 | 98.8 |
| FastRetri | ImageNet | 82.46 | 92.56 | 96.78 | 98.95 |
### In-Shop
| Method | Pretrained | Recall@1 | Recall@10 | Recall@20 | Recall@30 | Recall@40 | Recall@50 |
| :---: | :---: | :---: |:---: | :---: | :---: | :---: | :---: |
| [dml_cross_entropy](https://github.com/jeromerony/dml_cross_entropy) | ImageNet | 90.6 | 98.0 | 98.6 | 98.9 | 99.1 | 99.2 |
| FastRetri | ImageNet | 91.97 | 98.29 | 98.85 | 99.11 | 99.24 | 99.35 |
MODEL:
  META_ARCHITECTURE: Baseline

  BACKBONE:
    NAME: build_resnet_backbone
    DEPTH: 50x
    NORM: FrozenBN
    LAST_STRIDE: 1
    FEAT_DIM: 2048
    PRETRAIN: True

  HEADS:
    NAME: EmbeddingHead
    NORM: syncBN
    WITH_BNNECK: True
    NECK_FEAT: after
    EMBEDDING_DIM: 0
    POOL_LAYER: GeneralizedMeanPooling
    CLS_LAYER: Linear

  LOSSES:
    NAME: ("CrossEntropyLoss",)

    CE:
      EPSILON: 0.1
      SCALE: 1.

INPUT:
  SIZE_TRAIN: [256, 256]
  SIZE_TEST: [256, 256]

  CROP:
    ENABLED: True
    SIZE: [224,]
    SCALE: [0.16, 1.]
    RATIO: [0.75, 1.33333]

  FLIP:
    ENABLED: True

  CJ:
    ENABLED: False
    BRIGHTNESS: 0.3
    CONTRAST: 0.3
    SATURATION: 0.1
    HUE: 0.1

DATALOADER:
  SAMPLER_TRAIN: TrainingSampler
  NUM_WORKERS: 8

SOLVER:
  MAX_EPOCH: 100

  AMP:
    ENABLED: True

  OPT: SGD
  SCHED: CosineAnnealingLR
  BASE_LR: 0.003
  MOMENTUM: 0.99
  NESTEROV: True
  BIAS_LR_FACTOR: 1.
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.
  IMS_PER_BATCH: 128

  ETA_MIN_LR: 0.00003

  WARMUP_FACTOR: 0.1
  WARMUP_ITERS: 1000

  CHECKPOINT_PERIOD: 10

TEST:
  EVAL_PERIOD: 10
  IMS_PER_BATCH: 256

CUDNN_BENCHMARK: True
_BASE_: base-image_retri.yml

MODEL:
  LOSSES:
    CE:
      EPSILON: 0.4

INPUT:
  CJ:
    ENABLED: True
    BRIGHTNESS: 0.3
    CONTRAST: 0.3
    SATURATION: 0.3
    HUE: 0.1

  CROP:
    RATIO: (1., 1.)

SOLVER:
  MAX_EPOCH: 100
  BASE_LR: 0.05
  ETA_MIN_LR: 0.0005
  NESTEROV: False
  MOMENTUM: 0.

TEST:
  RECALLS: [1, 2, 4, 8, 16, 32]

DATASETS:
  NAMES: ("Cars196",)
  TESTS: ("Cars196",)

OUTPUT_DIR: projects/FastRetri/logs/r50-base_cars
_BASE_: base-image_retri.yml

MODEL:
  LOSSES:
    CE:
      EPSILON: 0.3

INPUT:
  SIZE_TRAIN: [256,]
  SIZE_TEST: [256,]

  CJ:
    ENABLED: True
    BRIGHTNESS: 0.25
    CONTRAST: 0.25
    SATURATION: 0.25
    HUE: 0.0

SOLVER:
  MAX_EPOCH: 30
  BASE_LR: 0.02
  ETA_MIN_LR: 0.00002
  NESTEROV: False
  MOMENTUM: 0.

TEST:
  RECALLS: [1, 2, 4, 8, 16, 32]

DATASETS:
  NAMES: ("CUB",)
  TESTS: ("CUB",)

OUTPUT_DIR: projects/FastRetri/logs/r50-base_cub
_BASE_: base-image_retri.yml

INPUT:
  SIZE_TRAIN: [0,]
  SIZE_TEST: [0,]

SOLVER:
  MAX_EPOCH: 100
  BASE_LR: 0.003
  ETA_MIN_LR: 0.00003
  MOMENTUM: 0.99
  NESTEROV: True

TEST:
  RECALLS: [1, 10, 20, 30, 40, 50]

DATASETS:
  NAMES: ("InShop",)
  TESTS: ("InShop",)

OUTPUT_DIR: projects/FastRetri/logs/r50-base_inshop
_BASE_: base-image_retri.yml

SOLVER:
  MAX_EPOCH: 100
  BASE_LR: 0.003
  ETA_MIN_LR: 0.00003
  MOMENTUM: 0.99
  NESTEROV: True

TEST:
  RECALLS: [1, 10, 100, 1000]

DATASETS:
  NAMES: ("SOP",)
  TESTS: ("SOP",)

OUTPUT_DIR: projects/FastRetri/logs/r50-base_sop
# encoding: utf-8
"""
@author: xingyu liao
@contact: sherlockliao01@gmail.com
"""
from .config import add_retri_config
from .datasets import *
from .retri_evaluator import RetriEvaluator
# encoding: utf-8
"""
@author: xingyu liao
@contact: sherlockliao01@gmail.com
"""
def add_retri_config(cfg):
    _C = cfg

    _C.TEST.RECALLS = [1, 2, 4, 8, 16, 32]
# encoding: utf-8
"""
@author: xingyu liao
@contact: sherlockliao01@gmail.com
"""
import os
from fastreid.data.datasets import DATASET_REGISTRY
from fastreid.data.datasets.bases import ImageDataset
__all__ = ["Cars196", "CUB", "SOP", "InShop"]
@DATASET_REGISTRY.register()
class Cars196(ImageDataset):
    dataset_dir = 'Cars_196'
    dataset_name = "cars"

    def __init__(self, root='datasets', **kwargs):
        self.root = root
        self.dataset_dir = os.path.join(self.root, self.dataset_dir)

        train_file = os.path.join(self.dataset_dir, "train.txt")
        test_file = os.path.join(self.dataset_dir, "test.txt")

        required_files = [
            self.dataset_dir,
            train_file,
            test_file,
        ]
        self.check_before_run(required_files)

        train = self.process_label_file(train_file, is_train=True)
        query = self.process_label_file(test_file, is_train=False)

        super(Cars196, self).__init__(train, query, [], **kwargs)

    def process_label_file(self, file, is_train):
        data_list = []
        with open(file, 'r') as f:
            lines = f.read().splitlines()
            for line in lines:
                img_name, label = line.split(',')
                if is_train:
                    label = self.dataset_name + '_' + str(label)
                data_list.append((os.path.join(self.dataset_dir, img_name), label, '0'))
        return data_list


@DATASET_REGISTRY.register()
class CUB(Cars196):
    dataset_dir = "CUB_200_2011"
    dataset_name = "cub"


@DATASET_REGISTRY.register()
class SOP(Cars196):
    dataset_dir = "Stanford_Online_Products"
    dataset_name = "sop"


@DATASET_REGISTRY.register()
class InShop(Cars196):
    dataset_dir = "InShop"
    dataset_name = "inshop"

    def __init__(self, root="datasets", **kwargs):
        self.root = root
        self.dataset_dir = os.path.join(self.root, self.dataset_dir)

        train_file = os.path.join(self.dataset_dir, "train.txt")
        query_file = os.path.join(self.dataset_dir, "test_query.txt")
        gallery_file = os.path.join(self.dataset_dir, "test_gallery.txt")

        required_files = [
            train_file,
            query_file,
            gallery_file,
        ]
        self.check_before_run(required_files)

        train = self.process_label_file(train_file, True)
        query = self.process_label_file(query_file, False)
        gallery = self.process_label_file(gallery_file, False)

        # Deliberately skip Cars196.__init__ and call ImageDataset.__init__
        # directly, because InShop builds its own train/query/gallery splits.
        super(Cars196, self).__init__(train, query, gallery, **kwargs)
# encoding: utf-8
"""
@author: xingyu liao
@contact: sherlockliao01@gmail.com
"""
import copy
import logging
from collections import OrderedDict
from typing import List, Optional, Dict
import faiss
import numpy as np
import torch
import torch.nn.functional as F
from fastreid.evaluation import DatasetEvaluator
from fastreid.utils import comm
logger = logging.getLogger("fastreid.retri_evaluator")
@torch.no_grad()
def recall_at_ks(query_features: torch.Tensor,
                 query_labels: np.ndarray,
                 ks: List[int],
                 gallery_features: Optional[torch.Tensor] = None,
                 gallery_labels: Optional[torch.Tensor] = None,
                 cosine: bool = False) -> Dict[int, float]:
    """
    Compute the recall between samples at each k. This function uses about 8GB of memory.

    Parameters
    ----------
    query_features : torch.Tensor
        Features for each query sample. shape: (num_queries, num_features)
    query_labels : torch.LongTensor
        Labels corresponding to the query features. shape: (num_queries,)
    ks : List[int]
        Values at which to compute the recall.
    gallery_features : torch.Tensor
        Features for each gallery sample. shape: (num_gallery, num_features)
    gallery_labels : torch.LongTensor
        Labels corresponding to the gallery features. shape: (num_gallery,)
    cosine : bool
        Use cosine similarity between samples instead of euclidean distance.

    Returns
    -------
    recalls : Dict[int, float]
        Values of the recall at each k.
    """
    offset = 0
    if gallery_features is None and gallery_labels is None:
        # No separate gallery: search query-vs-query and skip the self-match.
        offset = 1
        gallery_features = query_features
        gallery_labels = query_labels
    elif gallery_features is None or gallery_labels is None:
        raise ValueError('gallery_features and gallery_labels need to be both None or both Tensors.')

    if cosine:
        query_features = F.normalize(query_features, p=2, dim=1)
        gallery_features = F.normalize(gallery_features, p=2, dim=1)

    to_cpu_numpy = lambda x: x.cpu().numpy()
    query_features, gallery_features = map(to_cpu_numpy, [query_features, gallery_features])

    res = faiss.StandardGpuResources()
    flat_config = faiss.GpuIndexFlatConfig()
    flat_config.device = 0

    max_k = max(ks)
    index_function = faiss.GpuIndexFlatIP if cosine else faiss.GpuIndexFlatL2
    index = index_function(res, gallery_features.shape[1], flat_config)
    index.add(gallery_features)
    closest_indices = index.search(query_features, max_k + offset)[1]

    recalls = {}
    for k in ks:
        indices = closest_indices[:, offset:k + offset]
        recalls[k] = (query_labels[:, None] == gallery_labels[indices]).any(1).mean()
    return {k: round(v * 100, 2) for k, v in recalls.items()}
class RetriEvaluator(DatasetEvaluator):
    def __init__(self, cfg, num_query, output_dir=None):
        self.cfg = cfg
        self._num_query = num_query
        self._output_dir = output_dir
        self.recalls = cfg.TEST.RECALLS

        self.features = []
        self.labels = []

    def reset(self):
        self.features = []
        self.labels = []

    def process(self, inputs, outputs):
        self.features.append(outputs.cpu())
        self.labels.extend(inputs["targets"])

    def evaluate(self):
        if comm.get_world_size() > 1:
            comm.synchronize()
            features = comm.gather(self.features)
            features = sum(features, [])

            labels = comm.gather(self.labels)
            labels = sum(labels, [])

            # fmt: off
            if not comm.is_main_process(): return {}
            # fmt: on
        else:
            features = self.features
            labels = self.labels

        features = torch.cat(features, dim=0)
        # query features and labels
        query_features = features[:self._num_query]
        query_labels = np.asarray(labels[:self._num_query])
        # gallery features and labels
        gallery_features = features[self._num_query:]
        gallery_labels = np.asarray(labels[self._num_query:])

        self._results = OrderedDict()

        if self._num_query == len(features):
            # No separate gallery set: evaluate query against itself.
            cmc = recall_at_ks(query_features, query_labels, self.recalls, cosine=True)
        else:
            cmc = recall_at_ks(query_features, query_labels, self.recalls,
                               gallery_features, gallery_labels,
                               cosine=True)
        for r in self.recalls:
            self._results['Recall@{}'.format(r)] = cmc[r]
        self._results["metric"] = cmc[self.recalls[0]]
        return copy.deepcopy(self._results)
#!/usr/bin/env python
# encoding: utf-8
"""
@author: sherlock
@contact: sherlockliao01@gmail.com
"""
import sys
sys.path.append('.')
from fastreid.config import get_cfg
from fastreid.engine import default_argument_parser, default_setup, launch
from fastreid.utils.checkpoint import Checkpointer
from fastreid.engine.defaults import DefaultTrainer
from fastretri import *
class Trainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_dir=None):
        data_loader, num_query = cls.build_test_loader(cfg, dataset_name)
        return data_loader, RetriEvaluator(cfg, num_query, output_dir)


def setup(args):
    """
    Create configs and perform basic setups.
    """
    cfg = get_cfg()
    add_retri_config(cfg)
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)
    cfg.freeze()
    default_setup(cfg, args)
    return cfg


def main(args):
    cfg = setup(args)

    if args.eval_only:
        cfg.defrost()
        cfg.MODEL.BACKBONE.PRETRAIN = False
        model = Trainer.build_model(cfg)

        Checkpointer(model).load(cfg.MODEL.WEIGHTS)  # load trained model

        res = Trainer.test(cfg, model)
        return res

    trainer = Trainer(cfg)
    trainer.resume_or_load(resume=args.resume)
    return trainer.train()


if __name__ == "__main__":
    args = default_argument_parser().parse_args()
    print("Command Line Args:", args)
    launch(
        main,
        args.num_gpus,
        num_machines=args.num_machines,
        machine_rank=args.machine_rank,
        dist_url=args.dist_url,
        args=(args,),
    )
# Hyper-Parameter Optimization in FastReID
This project provides training of ReID models with hyper-parameter optimization.
Install the following dependencies:
```bash
pip install 'ray[tune]'
pip install hpbandster ConfigSpace hyperopt
```
## Example
This is an example of tuning `batch_size` and `num_instance` automatically.
To run hyper-parameter optimization with the BOHB (Bayesian Optimization HyperBand) search algorithm, run:
```bash
python3 projects/FastTune/tune_net.py --config-file projects/FastTune/configs/search_trial.yml --srch-algo "bohb"
```
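Ray Tune drives the actual search in this project; the core loop it automates can be illustrated with a plain random search over the same two hyperparameters. The objective below is a stand-in for a full training run, with hypothetical numbers:

```python
import random

# candidate values, mirroring the choices in the project's search space
search_space = {
    "bsz": [64, 96, 128, 160, 224, 256],   # images per batch
    "num_inst": [2, 4, 8, 16, 32],         # instances per identity
}

def objective(bsz, num_inst):
    # Dummy stand-in for "train a model, return validation score";
    # here it simply prefers bsz / num_inst close to 16.
    return 1.0 - abs(bsz / num_inst - 16) / 16

random.seed(0)
trials = [
    {"bsz": random.choice(search_space["bsz"]),
     "num_inst": random.choice(search_space["num_inst"])}
    for _ in range(20)
]
best = max(trials, key=lambda cfg: objective(cfg["bsz"], cfg["num_inst"]))
```

BOHB improves on this brute-force loop by modeling the objective (Bayesian optimization) and stopping unpromising trials early (HyperBand).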
## Known issues
todo
# encoding: utf-8
"""
@author: xingyu liao
@contact: sherlockliao01@gmail.com
"""
from .tune_hooks import TuneReportHook
# encoding: utf-8
"""
@author: xingyu liao
@contact: sherlockliao01@gmail.com
"""
import torch
from ray import tune
from fastreid.engine.hooks import EvalHook, flatten_results_dict
from fastreid.utils.checkpoint import Checkpointer
class TuneReportHook(EvalHook):
    def __init__(self, eval_period, eval_function):
        super().__init__(eval_period, eval_function)
        self.step = 0

    def _do_eval(self):
        results = self._func()

        if results:
            assert isinstance(
                results, dict
            ), "Eval function must return a dict. Got {} instead.".format(results)

            flattened_results = flatten_results_dict(results)
            for k, v in flattened_results.items():
                try:
                    v = float(v)
                except Exception:
                    raise ValueError(
                        "[EvalHook] eval_function should return a nested dict of float. "
                        "Got '{}: {}' instead.".format(k, v)
                    )

        # Remove extra memory cache of main process due to evaluation
        torch.cuda.empty_cache()

        self.step += 1
        # Here we save a checkpoint. It is automatically registered with
        # RayTune and will potentially be passed as the `checkpoint_dir`
        # parameter in future iterations.
        with tune.checkpoint_dir(step=self.step) as checkpoint_dir:
            additional_state = {"epoch": int(self.trainer.epoch)}
            # Change the save dir so that tune can find the checkpoint
            self.trainer.checkpointer.save_dir = checkpoint_dir
            self.trainer.checkpointer.save(name="checkpoint", **additional_state)

        metrics = dict(r1=results["Rank-1"], map=results["mAP"], score=(results["Rank-1"] + results["mAP"]) / 2)
        tune.report(**metrics)
MODEL:
  META_ARCHITECTURE: Baseline
  FREEZE_LAYERS: [ backbone ]

  BACKBONE:
    NAME: build_resnet_backbone
    DEPTH: 34x
    LAST_STRIDE: 1
    FEAT_DIM: 512
    NORM: BN
    WITH_NL: False
    WITH_IBN: True
    PRETRAIN: True
    PRETRAIN_PATH: /export/home/lxy/.cache/torch/checkpoints/resnet34_ibn_a-94bc1577.pth

  HEADS:
    NUM_CLASSES: 702
    NAME: EmbeddingHead
    NORM: BN
    NECK_FEAT: after
    EMBEDDING_DIM: 0
    POOL_LAYER: GeneralizedMeanPooling
    CLS_LAYER: CircleSoftmax
    SCALE: 64
    MARGIN: 0.35

  LOSSES:
    NAME: ("CrossEntropyLoss", "TripletLoss",)

    CE:
      EPSILON: 0.1
      SCALE: 1.

    TRI:
      MARGIN: 0.0
      HARD_MINING: True
      NORM_FEAT: False
      SCALE: 1.

INPUT:
  SIZE_TRAIN: [ 256, 128 ]
  SIZE_TEST: [ 256, 128 ]

  AUTOAUG:
    ENABLED: True
    PROB: 0.1

  REA:
    ENABLED: True

  CJ:
    ENABLED: True

  PADDING:
    ENABLED: True

DATALOADER:
  SAMPLER_TRAIN: NaiveIdentitySampler
  NUM_INSTANCE: 16
  NUM_WORKERS: 8

SOLVER:
  AMP:
    ENABLED: False
  MAX_EPOCH: 60
  OPT: Adam
  SCHED: CosineAnnealingLR
  BASE_LR: 0.00035
  BIAS_LR_FACTOR: 1.
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.0
  IMS_PER_BATCH: 64

  DELAY_EPOCHS: 30
  ETA_MIN_LR: 0.00000077

  FREEZE_ITERS: 500

  WARMUP_FACTOR: 0.1
  WARMUP_ITERS: 1000

  CHECKPOINT_PERIOD: 100

TEST:
  EVAL_PERIOD: 10
  IMS_PER_BATCH: 256

DATASETS:
  NAMES: ("DukeMTMC",)
  TESTS: ("DukeMTMC",)
  COMBINEALL: False

CUDNN_BENCHMARK: True

OUTPUT_DIR: projects/FastTune/logs/trial
#!/usr/bin/env python
# encoding: utf-8
"""
@author: sherlock
@contact: sherlockliao01@gmail.com
"""
import logging
import os
import sys
from functools import partial
import ConfigSpace as CS
import ray
from hyperopt import hp
from ray import tune
from ray.tune import CLIReporter
from ray.tune.schedulers import ASHAScheduler, PopulationBasedTraining
from ray.tune.schedulers.hb_bohb import HyperBandForBOHB
from ray.tune.suggest.bohb import TuneBOHB
from ray.tune.suggest.hyperopt import HyperOptSearch
sys.path.append('.')
from fastreid.config import get_cfg, CfgNode
from fastreid.engine import hooks
from fastreid.modeling import build_model
from fastreid.engine import DefaultTrainer, default_argument_parser, default_setup
from fastreid.utils.events import CommonMetricPrinter
from fastreid.utils import comm
from fastreid.utils.file_io import PathManager
from autotuner import *
logger = logging.getLogger("fastreid.auto_tuner")
ray.init(dashboard_host='127.0.0.1')
class AutoTuner(DefaultTrainer):
    def build_hooks(self):
        r"""
        Build a list of default hooks, including timing, evaluation,
        checkpointing, lr scheduling, precise BN, writing events.

        Returns:
            list[HookBase]:
        """
        cfg = self.cfg.clone()
        cfg.defrost()

        ret = [
            hooks.IterationTimer(),
            hooks.LRScheduler(self.optimizer, self.scheduler),
        ]

        ret.append(hooks.LayerFreeze(
            self.model,
            cfg.MODEL.FREEZE_LAYERS,
            cfg.SOLVER.FREEZE_ITERS,
            cfg.SOLVER.FREEZE_FC_ITERS,
        ))

        def test_and_save_results():
            self._last_eval_results = self.test(self.cfg, self.model)
            return self._last_eval_results

        # Do evaluation after checkpointer, because then if it fails,
        # we can use the saved checkpoint to debug.
        ret.append(TuneReportHook(cfg.TEST.EVAL_PERIOD, test_and_save_results))

        if comm.is_main_process():
            # run writers in the end, so that evaluation metrics are written
            ret.append(hooks.PeriodicWriter([CommonMetricPrinter(self.max_iter)], 200))

        return ret

    @classmethod
    def build_model(cls, cfg):
        model = build_model(cfg)
        return model
def setup(args):
    """
    Create configs and perform basic setups.
    """
    cfg = get_cfg()
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)
    cfg.freeze()
    default_setup(cfg, args)
    return cfg


def update_config(cfg, config):
    frozen = cfg.is_frozen()
    cfg.defrost()
    # cfg.SOLVER.BASE_LR = config["lr"]
    # cfg.SOLVER.ETA_MIN_LR = config["lr"] * 0.0001
    # cfg.SOLVER.DELAY_EPOCHS = int(config["delay_epochs"])
    # cfg.MODEL.LOSSES.CE.SCALE = config["ce_scale"]
    # cfg.MODEL.HEADS.SCALE = config["circle_scale"]
    # cfg.MODEL.HEADS.MARGIN = config["circle_margin"]
    # cfg.SOLVER.WEIGHT_DECAY = config["wd"]
    # cfg.SOLVER.WEIGHT_DECAY_BIAS = config["wd_bias"]
    cfg.SOLVER.IMS_PER_BATCH = config["bsz"]
    cfg.DATALOADER.NUM_INSTANCE = config["num_inst"]
    if frozen: cfg.freeze()
    return cfg


def train_tuner(config, checkpoint_dir=None, cfg=None):
    update_config(cfg, config)
    tuner = AutoTuner(cfg)
    # Load checkpoint if specified
    if checkpoint_dir:
        path = os.path.join(checkpoint_dir, "checkpoint.pth")
        checkpoint = tuner.checkpointer.resume_or_load(path, resume=False)
        tuner.start_epoch = checkpoint.get("epoch", -1) + 1

    # Regular model training
    tuner.train()
def main(args):
    cfg = setup(args)

    exp_metrics = dict(metric="score", mode="max")

    if args.srch_algo == "hyperopt":
        # Create a HyperOpt search space
        search_space = {
            # "lr": hp.loguniform("lr", np.log(1e-6), np.log(1e-3)),
            # "delay_epochs": hp.randint("delay_epochs", 20, 60),
            # "wd": hp.uniform("wd", 0, 1e-3),
            # "wd_bias": hp.uniform("wd_bias", 0, 1e-3),
            "bsz": hp.choice("bsz", [64, 96, 128, 160, 224, 256]),
            "num_inst": hp.choice("num_inst", [2, 4, 8, 16, 32]),
            # "ce_scale": hp.uniform("ce_scale", 0.1, 1.0),
            # "circle_scale": hp.choice("circle_scale", [16, 32, 64, 128, 256]),
            # "circle_margin": hp.uniform("circle_margin", 0, 1) * 0.4 + 0.1,
        }

        current_best_params = [{
            "bsz": 0,  # index of hp.choice list
            "num_inst": 3,
        }]

        search_algo = HyperOptSearch(
            search_space,
            points_to_evaluate=current_best_params,
            **exp_metrics)

        if args.pbt:
            scheduler = PopulationBasedTraining(
                time_attr="training_iteration",
                **exp_metrics,
                perturbation_interval=2,
                hyperparam_mutations={
                    "bsz": [64, 96, 128, 160, 224, 256],
                    "num_inst": [2, 4, 8, 16, 32],
                }
            )
        else:
            scheduler = ASHAScheduler(
                grace_period=2,
                reduction_factor=3,
                max_t=7,
                **exp_metrics)

    elif args.srch_algo == "bohb":
        search_space = CS.ConfigurationSpace()
        search_space.add_hyperparameters([
            # CS.UniformFloatHyperparameter(name="lr", lower=1e-6, upper=1e-3, log=True),
            # CS.UniformIntegerHyperparameter(name="delay_epochs", lower=20, upper=60),
            # CS.UniformFloatHyperparameter(name="ce_scale", lower=0.1, upper=1.0),
            # CS.UniformIntegerHyperparameter(name="circle_scale", lower=8, upper=128),
            # CS.UniformFloatHyperparameter(name="circle_margin", lower=0.1, upper=0.5),
            # CS.UniformFloatHyperparameter(name="wd", lower=0, upper=1e-3),
            # CS.UniformFloatHyperparameter(name="wd_bias", lower=0, upper=1e-3),
            CS.CategoricalHyperparameter(name="bsz", choices=[64, 96, 128, 160, 224, 256]),
            CS.CategoricalHyperparameter(name="num_inst", choices=[2, 4, 8, 16, 32]),
            # CS.CategoricalHyperparameter(name="autoaug_enabled", choices=[True, False]),
            # CS.CategoricalHyperparameter(name="cj_enabled", choices=[True, False]),
        ])

        search_algo = TuneBOHB(
            search_space, max_concurrent=4, **exp_metrics)

        scheduler = HyperBandForBOHB(
            time_attr="training_iteration",
            reduction_factor=3,
            max_t=7,
            **exp_metrics,
        )

    else:
        raise ValueError("Search algorithm must be chosen from [hyperopt, bohb], but got {}".format(args.srch_algo))

    reporter = CLIReporter(
        parameter_columns=["bsz", "num_inst"],
        metric_columns=["r1", "map", "training_iteration"])

    analysis = tune.run(
        partial(
            train_tuner,
            cfg=cfg),
        resources_per_trial={"cpu": 4, "gpu": 1},
        search_alg=search_algo,
        num_samples=args.num_trials,
        scheduler=scheduler,
        progress_reporter=reporter,
        local_dir=cfg.OUTPUT_DIR,
        keep_checkpoints_num=10,
        name=args.srch_algo)

    best_trial = analysis.get_best_trial("score", "max", "last")
    logger.info("Best trial config: {}".format(best_trial.config))
    logger.info("Best trial final validation mAP: {}, Rank-1: {}".format(
        best_trial.last_result["map"], best_trial.last_result["r1"]))

    save_dict = dict(R1=best_trial.last_result["r1"].item(), mAP=best_trial.last_result["map"].item())
    save_dict.update(best_trial.config)

    path = os.path.join(cfg.OUTPUT_DIR, "best_config.yaml")
    with PathManager.open(path, "w") as f:
        f.write(CfgNode(save_dict).dump())
    logger.info("Best config saved to {}".format(os.path.abspath(path)))
if __name__ == "__main__":
    parser = default_argument_parser()
    parser.add_argument("--num-trials", type=int, default=8, help="number of tune trials")
    parser.add_argument("--srch-algo", type=str, default="hyperopt",
                        help="search algorithm for the hyper-parameter search space")
    parser.add_argument("--pbt", action="store_true", help="use population based training")
    args = parser.parse_args()
    print("Command Line Args:", args)
    main(args)
# Black Re-ID: A Head-shoulder Descriptor for the Challenging Problem of Person Re-Identification
## Training
To train a model, run
```bash
CUDA_VISIBLE_DEVICES=gpus python train_net.py --config-file <config.yml>
```
## Evaluation
To evaluate a model on the test set, run similarly:
```bash
CUDA_VISIBLE_DEVICES=gpus python train_net.py --config-file <config.yml> --eval-only MODEL.WEIGHTS model.pth
```
## Experimental Results
### Market1501 dataset
| Method | Pretrained | Rank@1 | mAP |
| :---: | :---: | :---: |:---: |
| ResNet50 | ImageNet | 93.3% | 84.6% |
| MGN | ImageNet | 95.7% | 86.9% |
| HAA (ResNet50) | ImageNet | 95.0% | 87.1% |
| HAA (MGN) | ImageNet | 95.8% | 89.5% |
### DukeMTMC dataset
| Method | Pretrained | Rank@1 | mAP |
| :---: | :---: | :---: |:---: |
| ResNet50 | ImageNet | 86.2% | 75.3% |
| MGN | ImageNet | 88.7% | 78.4% |
| HAA (ResNet50) | ImageNet | 87.7% | 75.7% |
| HAA (MGN) | ImageNet | 89.0% | 80.4% |
### Black-reid black group
| Method | Pretrained | Rank@1 | mAP |
| :---: | :---: | :---: |:---: |
| ResNet50 | ImageNet | 80.9% | 70.8% |
| MGN | ImageNet | 86.7% | 79.1% |
| HAA (ResNet50) | ImageNet | 86.7% | 79.0% |
| HAA (MGN) | ImageNet | 91.0% | 83.8% |
### White-reid white group
| Method | Pretrained | Rank@1 | mAP |
| :---: | :---: | :---: |:---: |
| ResNet50 | ImageNet | 89.5% | 75.8% |
| MGN | ImageNet | 94.3% | 85.8% |
| HAA (ResNet50) | ImageNet | 93.5% | 84.4% |
| HSE (MGN) | ImageNet | 95.3% | 88.1% |
# NAIC20 Competition (ReID Track)
This repository contains the 1st-place solution to the ReID track of the NAIC competition; we took first place in the final stage.
## Introduction
Detailed information about the NAIC competition can be found [here](https://naic.pcl.ac.cn/homepage/index.html).
## Useful Tricks
- [x] DataAugmentation (RandomErasing + ColorJitter + Augmix + RandomAffine + RandomHorizontalFlip + Padding + RandomCrop)
- [x] LR Scheduler (Warmup + CosineAnnealing)
- [x] Optimizer (Adam)
- [x] FP16 mixed precision training
- [x] CircleSoftmax
- [x] Pairwise Cosface
- [x] GeM pooling
- [x] Remove Long Tail Data (pid with single image)
- [x] Channel Shuffle
- [x] Distmat Ensemble
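The **Distmat Ensemble** trick above amounts to computing one query-gallery distance matrix per model and averaging them before ranking. A minimal NumPy sketch (illustrative only, with hypothetical matrices, not the project's exact ensembling code):

```python
import numpy as np

def ensemble_distmats(distmats, weights=None):
    """Fuse per-model query-gallery distance matrices into one by weighted average."""
    distmats = np.stack(distmats)                    # (num_models, Q, G)
    if weights is None:
        weights = np.full(len(distmats), 1.0 / len(distmats))
    return np.tensordot(weights, distmats, axes=1)   # (Q, G)

# two hypothetical models' distances for 2 queries x 3 gallery images
d1 = np.array([[0.1, 0.9, 0.5], [0.8, 0.2, 0.6]])
d2 = np.array([[0.3, 0.7, 0.5], [0.6, 0.4, 0.6]])
fused = ensemble_distmats([d1, d2])
ranking = fused.argsort(axis=1)                      # best match first per query
```

Averaging in distance space rather than feature space lets models with different backbones and resolutions (e.g. `resnest101` and `resnest200`) be combined without sharing an embedding dimension.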
1. Due to the competition rules, pseudo labels were not allowed in the preliminary and semi-final rounds, but could be used in the finals.
2. We combine the naic19, naic20r1 and naic20r2 datasets, but there are overlaps and noise between them, so we use an automatic data-cleaning strategy. The cleaned txt files are put here. Sorry that this part cannot be open sourced.
3. Due to the characteristics of the encrypted dataset, we found **channel shuffle** very helpful.
It is an offline data augmentation method: for each id, randomly choose a channel order,
such as `(2, 1, 0)`, apply that order to all images of the id, and treat the result as a new id.
This enlarges the number of identities; in theory each id can yield up to 5 additional ids (the remaining channel permutations).
Considering computational efficiency and the marginal effect, we enlarge each id only once.
Note that this trick has no effect on normal (unencrypted) datasets.
4. Due to the distribution of the dataset, we found that pairwise cosface greatly boosts model performance.
5. `resnest` performs far better than `ibn` backbones.
We ensemble `resnest101` and `resnest200` trained at different resolutions (192x256, 192x384).
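The channel-shuffle augmentation described in item 3 can be sketched as follows; this is a minimal NumPy illustration under the assumption of `(H, W, 3)` uint8 images, not the competition's exact pipeline:

```python
import numpy as np

def channel_shuffle_id(images, order=(2, 1, 0)):
    """Apply one fixed channel permutation to every (H, W, 3) image of an identity.

    Using the same `order` for all images of an id produces a consistent
    new identity, rather than per-image color noise.
    """
    return [img[:, :, list(order)] for img in images]

# three hypothetical images of one id, with the red channel marked per image
imgs = [np.zeros((4, 4, 3), dtype=np.uint8) for _ in range(3)]
for i, img in enumerate(imgs):
    img[..., 0] = i + 1
new_id_imgs = channel_shuffle_id(imgs, order=(2, 1, 0))  # red and blue swapped
```

Each distinct non-identity permutation applied this way yields one extra identity, which is where the "up to 5 times" figure for 3-channel images comes from.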
## Training & Submission in Command Line
Before starting, please see [GETTING_STARTED.md](https://github.com/JDAI-CV/fast-reid/blob/master/GETTING_STARTED.md) for the basic setup of FastReID.
All configs are made for 2-GPU training.
1. To train a model, first set up the corresponding datasets following [datasets/README.md](https://github.com/JDAI-CV/fast-reid/tree/master/datasets), then run:
```bash
python3 projects/NAIC20/train_net.py --config-file projects/NAIC20/configs/r34-ibn.yml --num-gpus 2
```
2. After the model is trained, you can generate the submission file. First, modify the `MODEL` section of `submit.yml` to
match your trained model and set `MODEL.WEIGHTS` to the path of the trained weights, then run:
```bash
python3 projects/NAIC20/train_net.py --config-file projects/NAIC20/configs/submit.yml --eval-only --commit --num-gpus 2
```
You can find `submit.json` and `distmat.npy` in `OUTPUT_DIR` of `submit.yml`.
## Ablation Study
To quickly verify results, we use resnet34-ibn as the backbone for the ablation study.
The datasets are `naic19`, `naic20r1` and `naic20r2`.
| Setting | Rank-1 | mAP |
| ------ | ------ | --- |
| Baseline | 70.11 | 63.29 |
| w/ tripletx10 | 73.79 | 67.01 |
| w/ cosface | 75.61 | 70.07 |
MODEL:
  META_ARCHITECTURE: Baseline
  FREEZE_LAYERS: [ backbone ]

  HEADS:
    NAME: EmbeddingHead
    NORM: BN
    EMBEDDING_DIM: 0
    NECK_FEAT: after
    POOL_LAYER: GeneralizedMeanPooling
    CLS_LAYER: CircleSoftmax
    SCALE: 64
    MARGIN: 0.35

  LOSSES:
    NAME: ("CrossEntropyLoss", "Cosface",)

    CE:
      EPSILON: 0.
      SCALE: 1.

    TRI:
      MARGIN: 0.
      HARD_MINING: True
      NORM_FEAT: True
      SCALE: 1.

    COSFACE:
      MARGIN: 0.35
      GAMMA: 64
      SCALE: 1.

INPUT:
  SIZE_TRAIN: [ 256, 128 ]
  SIZE_TEST: [ 256, 128 ]

  FLIP:
    ENABLED: True

  PADDING:
    ENABLED: True

  AUGMIX:
    ENABLED: True
    PROB: 0.5

  AFFINE:
    ENABLED: True

  REA:
    ENABLED: True
    VALUE: [ 0., 0., 0. ]

  CJ:
    ENABLED: True
    BRIGHTNESS: 0.15
    CONTRAST: 0.1
    SATURATION: 0.
    HUE: 0.

DATALOADER:
  SAMPLER_TRAIN: NaiveIdentitySampler
  NUM_INSTANCE: 2
  NUM_WORKERS: 8

SOLVER:
  AMP:
    ENABLED: False
  OPT: Adam
  SCHED: CosineAnnealingLR
  MAX_EPOCH: 30
  BASE_LR: 0.0007
  BIAS_LR_FACTOR: 1.
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.0005
  IMS_PER_BATCH: 256

  DELAY_EPOCHS: 5
  ETA_MIN_LR: 0.0000007

  FREEZE_ITERS: 1000
  FREEZE_FC_ITERS: 0

  WARMUP_FACTOR: 0.1
  WARMUP_ITERS: 4000

  CHECKPOINT_PERIOD: 3

DATASETS:
  NAMES: ("NAIC20_R2", "NAIC20_R1", "NAIC19",)
  TESTS: ("NAIC20_R2",)
  RM_LT: True

TEST:
  EVAL_PERIOD: 3
  IMS_PER_BATCH: 256

  RERANK:
    ENABLED: False
    K1: 20
    K2: 3
    LAMBDA: 0.5

CUDNN_BENCHMARK: True
_BASE_: Base-naic.yml

MODEL:
  BACKBONE:
    NAME: build_resnest_backbone
    DEPTH: 101x
    WITH_IBN: False
    PRETRAIN: True

OUTPUT_DIR: projects/NAIC20/logs/nest101-128x256