Commit a4372e17 authored by huchen

Merge branch 'hepj_work' into 'main'

Add conformer code

See merge request dcutoolkit/deeplearing/dlexamples_new!7
parents 7f99c1c3 142dcf29
Apart from training/testing scripts, we provide many useful tools under the
`tools/` directory.
## Log Analysis
`tools/analysis_tools/analyze_logs.py` plots loss/mAP curves given a training
log file. Run `pip install seaborn` first to install the dependency.
```shell
python tools/analysis_tools/analyze_logs.py plot_curve ${JSON_LOGS} [--keys ${KEYS}] [--title ${TITLE}] [--legend ${LEGEND}] [--backend ${BACKEND}] [--style ${STYLE}] [--out ${OUT_FILE}]
```
![loss curve image](../resources/loss_curve.png)
Examples:
- Plot the classification loss of some run.
```shell
python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
```
- Plot the classification and regression loss of some run, and save the figure to a pdf.
```shell
python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
```
- Compare the bbox mAP of two runs in the same figure.
```shell
python tools/analysis_tools/analyze_logs.py plot_curve log1.json log2.json --keys bbox_mAP --legend run1 run2
```
- Compute the average training speed.
```shell
python tools/analysis_tools/analyze_logs.py cal_train_time log.json [--include-outliers]
```
The output is expected to be like the following.
```text
-----Analyze train time of work_dirs/some_exp/20190611_192040.log.json-----
slowest epoch 11, average time is 1.2024
fastest epoch 1, average time is 1.1909
time std over epochs is 0.0028
average iter time: 1.1959 s/iter
```
## Visualization
### Visualize Datasets
`tools/misc/browse_dataset.py` helps the user browse a detection dataset (both
images and bounding box annotations) visually, or save the images to a
designated directory.
```shell
python tools/misc/browse_dataset.py ${CONFIG} [-h] [--skip-type ${SKIP_TYPE[SKIP_TYPE...]}] [--output-dir ${OUTPUT_DIR}] [--not-show] [--show-interval ${SHOW_INTERVAL}]
```
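For example, the following sketch (the config path and output directory are placeholders for your own files) saves the visualized samples to disk instead of opening a window:
```shell
python tools/misc/browse_dataset.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    --output-dir work_dirs/dataset_vis --not-show
```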
### Visualize Models
First, convert the model to ONNX as described
[here](#mmdetection-model-to-onnx-experimental).
Note that currently only RetinaNet is supported; support for other models
will come in later versions.
The converted model could be visualized by tools like [Netron](https://github.com/lutzroeder/netron).
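As a rough sketch, assuming the converted file is named `retinanet.onnx` (a placeholder) and Netron is installed from PyPI, you can open it in Netron's local browser viewer:
```shell
pip install netron
netron retinanet.onnx
```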
### Visualize Predictions
If you need a lightweight GUI for visualizing the detection results, you can refer to the [DetVisGUI project](https://github.com/Chien-Hung/DetVisGUI/tree/mmdetection).
## Error Analysis
`tools/analysis_tools/coco_error_analysis.py` analyzes COCO results per category and by
different criteria. It can also produce plots that provide useful
information.
```shell
python tools/analysis_tools/coco_error_analysis.py ${RESULT} ${OUT_DIR} [-h] [--ann ${ANN}] [--types ${TYPES[TYPES...]}]
```
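For example, assuming a COCO-format result file `results.bbox.json` and the standard val2017 annotation path (both placeholders for your own files), the bbox errors can be analyzed and plotted into `results_analysis/`:
```shell
python tools/analysis_tools/coco_error_analysis.py results.bbox.json results_analysis \
    --ann data/coco/annotations/instances_val2017.json --types bbox
```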
## Model Complexity
`tools/analysis_tools/get_flops.py` is a script adapted from [flops-counter.pytorch](https://github.com/sovrasov/flops-counter.pytorch) to compute the FLOPs and params of a given model.
```shell
python tools/analysis_tools/get_flops.py ${CONFIG_FILE} [--shape ${INPUT_SHAPE}]
```
You will get results like the following.
```text
==============================
Input shape: (3, 1280, 800)
Flops: 239.32 GFLOPs
Params: 37.74 M
==============================
```
**Note**: This tool is still experimental and we do not guarantee that the
number is absolutely correct. You may use the result for simple
comparisons, but double-check it before adopting it in technical reports or papers.
1. FLOPs are related to the input shape while parameters are not. The default
input shape is (1, 3, 1280, 800).
2. Some operators are not counted into FLOPs like GN and custom operators. Refer to [`mmcv.cnn.get_model_complexity_info()`](https://github.com/open-mmlab/mmcv/blob/master/mmcv/cnn/utils/flops_counter.py) for details.
3. The FLOPs of two-stage detectors depend on the number of proposals.
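Since FLOPs depend on the input shape (note 1 above), you can pass the shape explicitly to check how the count changes. A minimal sketch, assuming the config path is a placeholder and that `--shape` accepts the height and width as two integers:
```shell
python tools/analysis_tools/get_flops.py configs/retinanet/retinanet_r50_fpn_1x_coco.py --shape 640 640
```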
## Model conversion
### MMDetection model to ONNX (experimental)
We provide a script to convert a model to [ONNX](https://github.com/onnx/onnx) format. We also support comparing the outputs of the PyTorch and ONNX models for verification.
```shell
python tools/deployment/pytorch2onnx.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --output_file ${ONNX_FILE} [--shape ${INPUT_SHAPE} --verify]
```
**Note**: This tool is still experimental. Some customized operators are not supported for now. For a detailed description of the usage and the list of supported models, please refer to [pytorch2onnx](tutorials/pytorch2onnx.md).
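A typical invocation might look like the following, where the config, checkpoint, and output filenames are placeholders for your own files:
```shell
python tools/deployment/pytorch2onnx.py configs/retinanet/retinanet_r50_fpn_1x_coco.py \
    checkpoints/retinanet_r50_fpn_1x_coco.pth --output_file retinanet.onnx --verify
```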
### MMDetection 1.x model to MMDetection 2.x
`tools/model_converters/upgrade_model_version.py` upgrades a previous MMDetection checkpoint
to the new version. Note that this script is not guaranteed to work as some
breaking changes are introduced in the new version. It is recommended to
directly use the new checkpoints.
```shell
python tools/model_converters/upgrade_model_version.py ${IN_FILE} ${OUT_FILE} [-h] [--num-classes NUM_CLASSES]
```
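For example, to upgrade a 1.x Faster R-CNN checkpoint trained on COCO (the filenames are placeholders; 81 reflects the 1.x convention of counting the background class):
```shell
python tools/model_converters/upgrade_model_version.py faster_rcnn_1x.pth faster_rcnn_2x_compatible.pth --num-classes 81
```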
### RegNet model to MMDetection
`tools/model_converters/regnet2mmdet.py` converts keys in pycls pretrained RegNet models to
MMDetection style.
```shell
python tools/model_converters/regnet2mmdet.py ${SRC} ${DST} [-h]
```
### Detectron ResNet to Pytorch
`tools/model_converters/detectron2pytorch.py` converts keys in the original Detectron pretrained
ResNet models to PyTorch style.
```shell
python tools/model_converters/detectron2pytorch.py ${SRC} ${DST} ${DEPTH} [-h]
```
### Prepare a model for publishing
`tools/model_converters/publish_model.py` helps users to prepare their model for publishing.
Before you upload a model to AWS, you may want to
1. convert model weights to CPU tensors,
2. delete the optimizer states, and
3. compute the hash of the checkpoint file and append the hash id to the
filename.
```shell
python tools/model_converters/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
```
E.g.,
```shell
python tools/model_converters/publish_model.py work_dirs/faster_rcnn/latest.pth faster_rcnn_r50_fpn_1x_20190801.pth
```
The final output filename will be `faster_rcnn_r50_fpn_1x_20190801-{hash id}.pth`.
## Dataset Conversion
`tools/dataset_converters/` contains tools to convert the Cityscapes dataset
and Pascal VOC dataset to the COCO format.
```shell
python tools/dataset_converters/cityscapes.py ${CITYSCAPES_PATH} [-h] [--img-dir ${IMG_DIR}] [--gt-dir ${GT_DIR}] [-o ${OUT_DIR}] [--nproc ${NPROC}]
python tools/dataset_converters/pascal_voc.py ${DEVKIT_PATH} [-h] [-o ${OUT_DIR}]
```
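For example, assuming the Pascal VOC devkit is unpacked at `data/VOCdevkit` (a placeholder path), the annotations can be converted with:
```shell
python tools/dataset_converters/pascal_voc.py data/VOCdevkit -o data/voc_coco_format
```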
## Miscellaneous
### Evaluating a metric
`tools/analysis_tools/eval_metric.py` evaluates certain metrics of a pkl result file
according to a config file.
```shell
python tools/analysis_tools/eval_metric.py ${CONFIG} ${PKL_RESULTS} [-h] [--format-only] [--eval ${EVAL[EVAL ...]}]
[--cfg-options ${CFG_OPTIONS [CFG_OPTIONS ...]}]
[--eval-options ${EVAL_OPTIONS [EVAL_OPTIONS ...]}]
```
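For example, assuming `results.pkl` was produced by `tools/test.py` with the same config (both file names are placeholders), the bbox mAP can be re-evaluated without rerunning inference:
```shell
python tools/analysis_tools/eval_metric.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py results.pkl --eval bbox
```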
### Print the entire config
`tools/misc/print_config.py` prints the whole config verbatim, expanding all its
imports.
```shell
python tools/misc/print_config.py ${CONFIG} [-h] [--options ${OPTIONS [OPTIONS...]}]
```
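For example (the config path is a placeholder):
```shell
python tools/misc/print_config.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
```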
### Test the robustness of detectors
Please refer to [robustness_benchmarking.md](robustness_benchmarking.md).
import mmcv
from .version import __version__, short_version
def digit_version(version_str):
digit_version = []
for x in version_str.split('.'):
if x.isdigit():
digit_version.append(int(x))
elif x.find('rc') != -1:
patch_version = x.split('rc')
digit_version.append(int(patch_version[0]) - 1)
digit_version.append(int(patch_version[1]))
return digit_version
mmcv_minimum_version = '1.2.4'
mmcv_maximum_version = '1.3'
mmcv_version = digit_version(mmcv.__version__)
assert (mmcv_version >= digit_version(mmcv_minimum_version)
and mmcv_version <= digit_version(mmcv_maximum_version)), \
f'MMCV=={mmcv.__version__} is used but incompatible. ' \
f'Please install mmcv>={mmcv_minimum_version}, <={mmcv_maximum_version}.'
__all__ = ['__version__', 'short_version']
from .inference import (async_inference_detector, inference_detector,
init_detector, show_result_pyplot)
from .test import multi_gpu_test, single_gpu_test
from .train import get_root_logger, set_random_seed, train_detector
__all__ = [
'get_root_logger', 'set_random_seed', 'train_detector', 'init_detector',
'async_inference_detector', 'inference_detector', 'show_result_pyplot',
'multi_gpu_test', 'single_gpu_test'
]
import warnings
import mmcv
import numpy as np
import torch
from mmcv.ops import RoIPool
from mmcv.parallel import collate, scatter
from mmcv.runner import load_checkpoint
from mmdet.core import get_classes
from mmdet.datasets import replace_ImageToTensor
from mmdet.datasets.pipelines import Compose
from mmdet.models import build_detector
def init_detector(config, checkpoint=None, device='cuda:0', cfg_options=None):
"""Initialize a detector from config file.
Args:
config (str or :obj:`mmcv.Config`): Config file path or the config
object.
checkpoint (str, optional): Checkpoint path. If left as None, the model
will not load any weights.
cfg_options (dict): Options to override some settings in the used
config.
Returns:
nn.Module: The constructed detector.
"""
if isinstance(config, str):
config = mmcv.Config.fromfile(config)
elif not isinstance(config, mmcv.Config):
raise TypeError('config must be a filename or Config object, '
f'but got {type(config)}')
if cfg_options is not None:
config.merge_from_dict(cfg_options)
config.model.pretrained = None
config.model.train_cfg = None
model = build_detector(config.model, test_cfg=config.get('test_cfg'))
if checkpoint is not None:
map_loc = 'cpu' if device == 'cpu' else None
checkpoint = load_checkpoint(model, checkpoint, map_location=map_loc)
if 'CLASSES' in checkpoint['meta']:
model.CLASSES = checkpoint['meta']['CLASSES']
else:
warnings.simplefilter('once')
warnings.warn('Class names are not saved in the checkpoint\'s '
'meta data, use COCO classes by default.')
model.CLASSES = get_classes('coco')
model.cfg = config # save the config in the model for convenience
model.to(device)
model.eval()
return model
class LoadImage(object):
"""A simple pipeline to load image."""
def __call__(self, results):
"""Call function to load images into results.
Args:
results (dict): A result dict contains the file name
of the image to be read.
Returns:
dict: ``results`` will be returned containing loaded image.
"""
if isinstance(results['img'], str):
results['filename'] = results['img']
results['ori_filename'] = results['img']
else:
results['filename'] = None
results['ori_filename'] = None
img = mmcv.imread(results['img'])
results['img'] = img
results['img_fields'] = ['img']
results['img_shape'] = img.shape
results['ori_shape'] = img.shape
return results
def inference_detector(model, img):
"""Inference image(s) with the detector.
    Args:
        model (nn.Module): The loaded detector.
        img (str or ndarray): Either an image file path or a loaded image.
    Returns:
        The detection results.
"""
cfg = model.cfg
device = next(model.parameters()).device # model device
# prepare data
if isinstance(img, np.ndarray):
# directly add img
data = dict(img=img)
cfg = cfg.copy()
# set loading pipeline type
cfg.data.test.pipeline[0].type = 'LoadImageFromWebcam'
else:
# add information into dict
data = dict(img_info=dict(filename=img), img_prefix=None)
# build the data pipeline
cfg.data.test.pipeline = replace_ImageToTensor(cfg.data.test.pipeline)
test_pipeline = Compose(cfg.data.test.pipeline)
data = test_pipeline(data)
data = collate([data], samples_per_gpu=1)
# just get the actual data from DataContainer
data['img_metas'] = [img_metas.data[0] for img_metas in data['img_metas']]
data['img'] = [img.data[0] for img in data['img']]
if next(model.parameters()).is_cuda:
# scatter to specified GPU
data = scatter(data, [device])[0]
else:
for m in model.modules():
assert not isinstance(
m, RoIPool
), 'CPU inference with RoIPool is not supported currently.'
# forward the model
with torch.no_grad():
result = model(return_loss=False, rescale=True, **data)[0]
return result
async def async_inference_detector(model, img):
"""Async inference image(s) with the detector.
Args:
model (nn.Module): The loaded detector.
        img (str | ndarray): Either an image file path or a loaded image.
Returns:
Awaitable detection results.
"""
cfg = model.cfg
device = next(model.parameters()).device # model device
# prepare data
if isinstance(img, np.ndarray):
# directly add img
data = dict(img=img)
cfg = cfg.copy()
# set loading pipeline type
cfg.data.test.pipeline[0].type = 'LoadImageFromWebcam'
else:
# add information into dict
data = dict(img_info=dict(filename=img), img_prefix=None)
# build the data pipeline
test_pipeline = Compose(cfg.data.test.pipeline)
data = test_pipeline(data)
data = scatter(collate([data], samples_per_gpu=1), [device])[0]
# We don't restore `torch.is_grad_enabled()` value during concurrent
# inference since execution can overlap
torch.set_grad_enabled(False)
result = await model.aforward_test(rescale=True, **data)
return result
def show_result_pyplot(model,
img,
result,
score_thr=0.3,
fig_size=(15, 10),
title='result',
block=True,
wait_time=0):
"""Visualize the detection results on the image.
Args:
model (nn.Module): The loaded detector.
img (str or np.ndarray): Image filename or loaded image.
result (tuple[list] or list): The detection result, can be either
(bbox, segm) or just bbox.
score_thr (float): The threshold to visualize the bboxes and masks.
fig_size (tuple): Figure size of the pyplot figure.
title (str): Title of the pyplot figure.
block (bool): Whether to block GUI. Default: True
wait_time (float): Value of waitKey param.
Default: 0.
"""
warnings.warn('"block" will be deprecated in v2.9.0,'
'Please use "wait_time"')
if hasattr(model, 'module'):
model = model.module
model.show_result(
img,
result,
score_thr=score_thr,
show=True,
wait_time=wait_time,
fig_size=fig_size,
win_name=title,
bbox_color=(72, 101, 241),
text_color=(72, 101, 241))
import os.path as osp
import pickle
import shutil
import tempfile
import time
import mmcv
import torch
import torch.distributed as dist
from mmcv.image import tensor2imgs
from mmcv.runner import get_dist_info
from mmdet.core import encode_mask_results
def single_gpu_test(model,
data_loader,
show=False,
out_dir=None,
show_score_thr=0.3):
model.eval()
results = []
dataset = data_loader.dataset
prog_bar = mmcv.ProgressBar(len(dataset))
for i, data in enumerate(data_loader):
with torch.no_grad():
result = model(return_loss=False, rescale=True, **data)
batch_size = len(result)
if show or out_dir:
if batch_size == 1 and isinstance(data['img'][0], torch.Tensor):
img_tensor = data['img'][0]
else:
img_tensor = data['img'][0].data[0]
img_metas = data['img_metas'][0].data[0]
imgs = tensor2imgs(img_tensor, **img_metas[0]['img_norm_cfg'])
assert len(imgs) == len(img_metas)
for i, (img, img_meta) in enumerate(zip(imgs, img_metas)):
h, w, _ = img_meta['img_shape']
img_show = img[:h, :w, :]
ori_h, ori_w = img_meta['ori_shape'][:-1]
img_show = mmcv.imresize(img_show, (ori_w, ori_h))
if out_dir:
out_file = osp.join(out_dir, img_meta['ori_filename'])
else:
out_file = None
model.module.show_result(
img_show,
result[i],
show=show,
out_file=out_file,
score_thr=show_score_thr)
# encode mask results
if isinstance(result[0], tuple):
result = [(bbox_results, encode_mask_results(mask_results))
for bbox_results, mask_results in result]
results.extend(result)
for _ in range(batch_size):
prog_bar.update()
return results
def multi_gpu_test(model, data_loader, tmpdir=None, gpu_collect=False):
"""Test model with multiple gpus.
This method tests model with multiple gpus and collects the results
    under two different modes: gpu and cpu modes. By setting 'gpu_collect=True',
    it encodes results to gpu tensors and uses gpu communication for results
    collection. In cpu mode it saves the results on different gpus to 'tmpdir'
and collects them by the rank 0 worker.
Args:
model (nn.Module): Model to be tested.
data_loader (nn.Dataloader): Pytorch data loader.
tmpdir (str): Path of directory to save the temporary results from
different gpus under cpu mode.
gpu_collect (bool): Option to use either gpu or cpu to collect results.
Returns:
list: The prediction results.
"""
model.eval()
results = []
dataset = data_loader.dataset
rank, world_size = get_dist_info()
if rank == 0:
prog_bar = mmcv.ProgressBar(len(dataset))
time.sleep(2) # This line can prevent deadlock problem in some cases.
for i, data in enumerate(data_loader):
with torch.no_grad():
result = model(return_loss=False, rescale=True, **data)
# encode mask results
if isinstance(result[0], tuple):
result = [(bbox_results, encode_mask_results(mask_results))
for bbox_results, mask_results in result]
results.extend(result)
if rank == 0:
batch_size = len(result)
for _ in range(batch_size * world_size):
prog_bar.update()
# collect results from all ranks
if gpu_collect:
results = collect_results_gpu(results, len(dataset))
else:
results = collect_results_cpu(results, len(dataset), tmpdir)
return results
def collect_results_cpu(result_part, size, tmpdir=None):
rank, world_size = get_dist_info()
# create a tmp dir if it is not specified
if tmpdir is None:
MAX_LEN = 512
# 32 is whitespace
dir_tensor = torch.full((MAX_LEN, ),
32,
dtype=torch.uint8,
device='cuda')
if rank == 0:
mmcv.mkdir_or_exist('.dist_test')
tmpdir = tempfile.mkdtemp(dir='.dist_test')
tmpdir = torch.tensor(
bytearray(tmpdir.encode()), dtype=torch.uint8, device='cuda')
dir_tensor[:len(tmpdir)] = tmpdir
dist.broadcast(dir_tensor, 0)
tmpdir = dir_tensor.cpu().numpy().tobytes().decode().rstrip()
else:
mmcv.mkdir_or_exist(tmpdir)
# dump the part result to the dir
mmcv.dump(result_part, osp.join(tmpdir, f'part_{rank}.pkl'))
dist.barrier()
# collect all parts
if rank != 0:
return None
else:
# load results of all parts from tmp dir
part_list = []
for i in range(world_size):
part_file = osp.join(tmpdir, f'part_{i}.pkl')
part_list.append(mmcv.load(part_file))
# sort the results
ordered_results = []
for res in zip(*part_list):
ordered_results.extend(list(res))
# the dataloader may pad some samples
ordered_results = ordered_results[:size]
# remove tmp dir
shutil.rmtree(tmpdir)
return ordered_results
def collect_results_gpu(result_part, size):
rank, world_size = get_dist_info()
# dump result part to tensor with pickle
part_tensor = torch.tensor(
bytearray(pickle.dumps(result_part)), dtype=torch.uint8, device='cuda')
# gather all result part tensor shape
shape_tensor = torch.tensor(part_tensor.shape, device='cuda')
shape_list = [shape_tensor.clone() for _ in range(world_size)]
dist.all_gather(shape_list, shape_tensor)
# padding result part tensor to max length
shape_max = torch.tensor(shape_list).max()
part_send = torch.zeros(shape_max, dtype=torch.uint8, device='cuda')
part_send[:shape_tensor[0]] = part_tensor
part_recv_list = [
part_tensor.new_zeros(shape_max) for _ in range(world_size)
]
# gather all result part
dist.all_gather(part_recv_list, part_send)
if rank == 0:
part_list = []
for recv, shape in zip(part_recv_list, shape_list):
part_list.append(
pickle.loads(recv[:shape[0]].cpu().numpy().tobytes()))
# sort the results
ordered_results = []
for res in zip(*part_list):
ordered_results.extend(list(res))
# the dataloader may pad some samples
ordered_results = ordered_results[:size]
return ordered_results
import random
import numpy as np
import torch
from mmcv.parallel import MMDataParallel, MMDistributedDataParallel
from mmcv.runner import (HOOKS, DistSamplerSeedHook, EpochBasedRunner,
Fp16OptimizerHook, OptimizerHook, build_optimizer)
from mmcv.utils import build_from_cfg
from mmdet.core import DistEvalHook, EvalHook
from mmdet.datasets import (build_dataloader, build_dataset,
replace_ImageToTensor)
from mmdet.utils import get_root_logger
def set_random_seed(seed, deterministic=False):
"""Set random seed.
Args:
seed (int): Seed to be used.
deterministic (bool): Whether to set the deterministic option for
CUDNN backend, i.e., set `torch.backends.cudnn.deterministic`
to True and `torch.backends.cudnn.benchmark` to False.
Default: False.
"""
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
if deterministic:
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
def train_detector(model,
dataset,
cfg,
distributed=False,
validate=False,
timestamp=None,
meta=None):
logger = get_root_logger(cfg.log_level)
# prepare data loaders
dataset = dataset if isinstance(dataset, (list, tuple)) else [dataset]
if 'imgs_per_gpu' in cfg.data:
logger.warning('"imgs_per_gpu" is deprecated in MMDet V2.0. '
'Please use "samples_per_gpu" instead')
if 'samples_per_gpu' in cfg.data:
logger.warning(
f'Got "imgs_per_gpu"={cfg.data.imgs_per_gpu} and '
f'"samples_per_gpu"={cfg.data.samples_per_gpu}, "imgs_per_gpu"'
                f'={cfg.data.imgs_per_gpu} is used in this experiment')
else:
logger.warning(
'Automatically set "samples_per_gpu"="imgs_per_gpu"='
                f'{cfg.data.imgs_per_gpu} in this experiment')
cfg.data.samples_per_gpu = cfg.data.imgs_per_gpu
data_loaders = [
build_dataloader(
ds,
cfg.data.samples_per_gpu,
cfg.data.workers_per_gpu,
# cfg.gpus will be ignored if distributed
len(cfg.gpu_ids),
dist=distributed,
seed=cfg.seed) for ds in dataset
]
# put model on gpus
if distributed:
find_unused_parameters = cfg.get('find_unused_parameters', False)
# Sets the `find_unused_parameters` parameter in
# torch.nn.parallel.DistributedDataParallel
model = MMDistributedDataParallel(
model.cuda(),
device_ids=[torch.cuda.current_device()],
broadcast_buffers=False,
find_unused_parameters=find_unused_parameters)
else:
model = MMDataParallel(
model.cuda(cfg.gpu_ids[0]), device_ids=cfg.gpu_ids)
# build runner
optimizer = build_optimizer(model, cfg.optimizer)
runner = EpochBasedRunner(
model,
optimizer=optimizer,
work_dir=cfg.work_dir,
logger=logger,
meta=meta)
# an ugly workaround to make .log and .log.json filenames the same
runner.timestamp = timestamp
# fp16 setting
fp16_cfg = cfg.get('fp16', None)
if fp16_cfg is not None:
optimizer_config = Fp16OptimizerHook(
**cfg.optimizer_config, **fp16_cfg, distributed=distributed)
elif distributed and 'type' not in cfg.optimizer_config:
optimizer_config = OptimizerHook(**cfg.optimizer_config)
else:
optimizer_config = cfg.optimizer_config
# register hooks
runner.register_training_hooks(cfg.lr_config, optimizer_config,
cfg.checkpoint_config, cfg.log_config,
cfg.get('momentum_config', None))
if distributed:
runner.register_hook(DistSamplerSeedHook())
# register eval hooks
if validate:
# Support batch_size > 1 in validation
val_samples_per_gpu = cfg.data.val.pop('samples_per_gpu', 1)
if val_samples_per_gpu > 1:
# Replace 'ImageToTensor' to 'DefaultFormatBundle'
cfg.data.val.pipeline = replace_ImageToTensor(
cfg.data.val.pipeline)
val_dataset = build_dataset(cfg.data.val, dict(test_mode=True))
val_dataloader = build_dataloader(
val_dataset,
samples_per_gpu=val_samples_per_gpu,
workers_per_gpu=cfg.data.workers_per_gpu,
dist=distributed,
shuffle=False)
eval_cfg = cfg.get('evaluation', {})
eval_hook = DistEvalHook if distributed else EvalHook
runner.register_hook(eval_hook(val_dataloader, **eval_cfg))
# user-defined hooks
if cfg.get('custom_hooks', None):
custom_hooks = cfg.custom_hooks
assert isinstance(custom_hooks, list), \
f'custom_hooks expect list type, but got {type(custom_hooks)}'
for hook_cfg in cfg.custom_hooks:
assert isinstance(hook_cfg, dict), \
'Each item in custom_hooks expects dict type, but got ' \
f'{type(hook_cfg)}'
hook_cfg = hook_cfg.copy()
priority = hook_cfg.pop('priority', 'NORMAL')
hook = build_from_cfg(hook_cfg, HOOKS)
runner.register_hook(hook, priority=priority)
if cfg.resume_from:
runner.resume(cfg.resume_from)
elif cfg.load_from:
runner.load_checkpoint(cfg.load_from)
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
from .anchor import * # noqa: F401, F403
from .bbox import * # noqa: F401, F403
from .evaluation import * # noqa: F401, F403
from .export import * # noqa: F401, F403
from .fp16 import * # noqa: F401, F403
from .mask import * # noqa: F401, F403
from .post_processing import * # noqa: F401, F403
from .utils import * # noqa: F401, F403
from .anchor_generator import (AnchorGenerator, LegacyAnchorGenerator,
YOLOAnchorGenerator)
from .builder import ANCHOR_GENERATORS, build_anchor_generator
from .point_generator import PointGenerator
from .utils import anchor_inside_flags, calc_region, images_to_levels
__all__ = [
'AnchorGenerator', 'LegacyAnchorGenerator', 'anchor_inside_flags',
'PointGenerator', 'images_to_levels', 'calc_region',
'build_anchor_generator', 'ANCHOR_GENERATORS', 'YOLOAnchorGenerator'
]
import mmcv
import numpy as np
import torch
from torch.nn.modules.utils import _pair
from .builder import ANCHOR_GENERATORS
@ANCHOR_GENERATORS.register_module()
class AnchorGenerator(object):
"""Standard anchor generator for 2D anchor-based detectors.
Args:
strides (list[int] | list[tuple[int, int]]): Strides of anchors
in multiple feature levels in order (w, h).
ratios (list[float]): The list of ratios between the height and width
of anchors in a single level.
scales (list[int] | None): Anchor scales for anchors in a single level.
            It cannot be set at the same time as `octave_base_scale` and
            `scales_per_octave`.
base_sizes (list[int] | None): The basic sizes
of anchors in multiple levels.
If None is given, strides will be used as base_sizes.
(If strides are non square, the shortest stride is taken.)
scale_major (bool): Whether to multiply scales first when generating
base anchors. If true, the anchors in the same row will have the
same scales. By default it is True in V2.0
octave_base_scale (int): The base scale of octave.
scales_per_octave (int): Number of scales for each octave.
`octave_base_scale` and `scales_per_octave` are usually used in
retinanet and the `scales` should be None when they are set.
centers (list[tuple[float, float]] | None): The centers of the anchor
relative to the feature grid center in multiple feature levels.
By default it is set to be None and not used. If a list of tuple of
float is given, they will be used to shift the centers of anchors.
center_offset (float): The offset of center in proportion to anchors'
width and height. By default it is 0 in V2.0.
Examples:
>>> from mmdet.core import AnchorGenerator
>>> self = AnchorGenerator([16], [1.], [1.], [9])
>>> all_anchors = self.grid_anchors([(2, 2)], device='cpu')
>>> print(all_anchors)
[tensor([[-4.5000, -4.5000, 4.5000, 4.5000],
[11.5000, -4.5000, 20.5000, 4.5000],
[-4.5000, 11.5000, 4.5000, 20.5000],
[11.5000, 11.5000, 20.5000, 20.5000]])]
>>> self = AnchorGenerator([16, 32], [1.], [1.], [9, 18])
>>> all_anchors = self.grid_anchors([(2, 2), (1, 1)], device='cpu')
>>> print(all_anchors)
[tensor([[-4.5000, -4.5000, 4.5000, 4.5000],
[11.5000, -4.5000, 20.5000, 4.5000],
[-4.5000, 11.5000, 4.5000, 20.5000],
[11.5000, 11.5000, 20.5000, 20.5000]]), \
tensor([[-9., -9., 9., 9.]])]
"""
def __init__(self,
strides,
ratios,
scales=None,
base_sizes=None,
scale_major=True,
octave_base_scale=None,
scales_per_octave=None,
centers=None,
center_offset=0.):
# check center and center_offset
if center_offset != 0:
assert centers is None, 'center cannot be set when center_offset' \
f'!=0, {centers} is given.'
if not (0 <= center_offset <= 1):
raise ValueError('center_offset should be in range [0, 1], '
f'{center_offset} is given.')
if centers is not None:
assert len(centers) == len(strides), \
'The number of strides should be the same as centers, got ' \
f'{strides} and {centers}'
# calculate base sizes of anchors
self.strides = [_pair(stride) for stride in strides]
self.base_sizes = [min(stride) for stride in self.strides
] if base_sizes is None else base_sizes
assert len(self.base_sizes) == len(self.strides), \
'The number of strides should be the same as base sizes, got ' \
f'{self.strides} and {self.base_sizes}'
# calculate scales of anchors
assert ((octave_base_scale is not None
and scales_per_octave is not None) ^ (scales is not None)), \
'scales and octave_base_scale with scales_per_octave cannot' \
' be set at the same time'
if scales is not None:
self.scales = torch.Tensor(scales)
elif octave_base_scale is not None and scales_per_octave is not None:
octave_scales = np.array(
[2**(i / scales_per_octave) for i in range(scales_per_octave)])
scales = octave_scales * octave_base_scale
self.scales = torch.Tensor(scales)
else:
raise ValueError('Either scales or octave_base_scale with '
'scales_per_octave should be set')
self.octave_base_scale = octave_base_scale
self.scales_per_octave = scales_per_octave
self.ratios = torch.Tensor(ratios)
self.scale_major = scale_major
self.centers = centers
self.center_offset = center_offset
self.base_anchors = self.gen_base_anchors()
@property
def num_base_anchors(self):
"""list[int]: total number of base anchors in a feature grid"""
return [base_anchors.size(0) for base_anchors in self.base_anchors]
@property
def num_levels(self):
"""int: number of feature levels that the generator will be applied"""
return len(self.strides)
def gen_base_anchors(self):
"""Generate base anchors.
Returns:
list(torch.Tensor): Base anchors of a feature grid in multiple \
feature levels.
"""
multi_level_base_anchors = []
for i, base_size in enumerate(self.base_sizes):
center = None
if self.centers is not None:
center = self.centers[i]
multi_level_base_anchors.append(
self.gen_single_level_base_anchors(
base_size,
scales=self.scales,
ratios=self.ratios,
center=center))
return multi_level_base_anchors
def gen_single_level_base_anchors(self,
base_size,
scales,
ratios,
center=None):
"""Generate base anchors of a single level.
Args:
base_size (int | float): Basic size of an anchor.
scales (torch.Tensor): Scales of the anchor.
            ratios (torch.Tensor): The ratio between the height
                and width of anchors in a single level.
center (tuple[float], optional): The center of the base anchor
related to a single feature grid. Defaults to None.
Returns:
torch.Tensor: Anchors in a single-level feature maps.
"""
w = base_size
h = base_size
if center is None:
x_center = self.center_offset * w
y_center = self.center_offset * h
else:
x_center, y_center = center
h_ratios = torch.sqrt(ratios)
w_ratios = 1 / h_ratios
if self.scale_major:
ws = (w * w_ratios[:, None] * scales[None, :]).view(-1)
hs = (h * h_ratios[:, None] * scales[None, :]).view(-1)
else:
ws = (w * scales[:, None] * w_ratios[None, :]).view(-1)
hs = (h * scales[:, None] * h_ratios[None, :]).view(-1)
# use float anchor and the anchor's center is aligned with the
# pixel center
base_anchors = [
x_center - 0.5 * ws, y_center - 0.5 * hs, x_center + 0.5 * ws,
y_center + 0.5 * hs
]
base_anchors = torch.stack(base_anchors, dim=-1)
return base_anchors
def _meshgrid(self, x, y, row_major=True):
"""Generate mesh grid of x and y.
Args:
x (torch.Tensor): Grids of x dimension.
y (torch.Tensor): Grids of y dimension.
row_major (bool, optional): Whether to return y grids first.
Defaults to True.
Returns:
tuple[torch.Tensor]: The mesh grids of x and y.
"""
xx = x.repeat(len(y))
yy = y.view(-1, 1).repeat(1, len(x)).view(-1)
if row_major:
return xx, yy
else:
return yy, xx
def grid_anchors(self, featmap_sizes, device='cuda'):
"""Generate grid anchors in multiple feature levels.
Args:
featmap_sizes (list[tuple]): List of feature map sizes in
multiple feature levels.
device (str): Device where the anchors will be put on.
Return:
list[torch.Tensor]: Anchors in multiple feature levels. \
The sizes of each tensor should be [N, 4], where \
N = width * height * num_base_anchors, width and height \
are the sizes of the corresponding feature level, \
num_base_anchors is the number of anchors for that level.
"""
assert self.num_levels == len(featmap_sizes)
multi_level_anchors = []
for i in range(self.num_levels):
anchors = self.single_level_grid_anchors(
self.base_anchors[i].to(device),
featmap_sizes[i],
self.strides[i],
device=device)
multi_level_anchors.append(anchors)
return multi_level_anchors
def single_level_grid_anchors(self,
base_anchors,
featmap_size,
stride=(16, 16),
device='cuda'):
"""Generate grid anchors of a single level.
Note:
This function is usually called by method ``self.grid_anchors``.
Args:
base_anchors (torch.Tensor): The base anchors of a feature grid.
featmap_size (tuple[int]): Size of the feature maps.
stride (tuple[int], optional): Stride of the feature map in order
(w, h). Defaults to (16, 16).
device (str, optional): Device the tensor will be put on.
Defaults to 'cuda'.
Returns:
torch.Tensor: Anchors in the overall feature maps.
"""
feat_h, feat_w = featmap_size
        # convert Tensor to int, so that we can convert to ONNX correctly
feat_h = int(feat_h)
feat_w = int(feat_w)
shift_x = torch.arange(0, feat_w, device=device) * stride[0]
shift_y = torch.arange(0, feat_h, device=device) * stride[1]
shift_xx, shift_yy = self._meshgrid(shift_x, shift_y)
shifts = torch.stack([shift_xx, shift_yy, shift_xx, shift_yy], dim=-1)
shifts = shifts.type_as(base_anchors)
# first feat_w elements correspond to the first row of shifts
# add A anchors (1, A, 4) to K shifts (K, 1, 4) to get
# shifted anchors (K, A, 4), reshape to (K*A, 4)
all_anchors = base_anchors[None, :, :] + shifts[:, None, :]
all_anchors = all_anchors.view(-1, 4)
# first A rows correspond to A anchors of (0, 0) in feature map,
# then (0, 1), (0, 2), ...
return all_anchors
def valid_flags(self, featmap_sizes, pad_shape, device='cuda'):
"""Generate valid flags of anchors in multiple feature levels.
Args:
featmap_sizes (list(tuple)): List of feature map sizes in
multiple feature levels.
pad_shape (tuple): The padded shape of the image.
device (str): Device where the anchors will be put on.
Return:
list(torch.Tensor): Valid flags of anchors in multiple levels.
"""
assert self.num_levels == len(featmap_sizes)
multi_level_flags = []
for i in range(self.num_levels):
anchor_stride = self.strides[i]
feat_h, feat_w = featmap_sizes[i]
h, w = pad_shape[:2]
valid_feat_h = min(int(np.ceil(h / anchor_stride[1])), feat_h)
valid_feat_w = min(int(np.ceil(w / anchor_stride[0])), feat_w)
flags = self.single_level_valid_flags((feat_h, feat_w),
(valid_feat_h, valid_feat_w),
self.num_base_anchors[i],
device=device)
multi_level_flags.append(flags)
return multi_level_flags
def single_level_valid_flags(self,
featmap_size,
valid_size,
num_base_anchors,
device='cuda'):
"""Generate the valid flags of anchor in a single feature map.
Args:
featmap_size (tuple[int]): The size of feature maps.
valid_size (tuple[int]): The valid size of the feature maps.
num_base_anchors (int): The number of base anchors.
device (str, optional): Device where the flags will be put on.
Defaults to 'cuda'.
Returns:
torch.Tensor: The valid flags of each anchor in a single level \
feature map.
"""
feat_h, feat_w = featmap_size
valid_h, valid_w = valid_size
assert valid_h <= feat_h and valid_w <= feat_w
valid_x = torch.zeros(feat_w, dtype=torch.bool, device=device)
valid_y = torch.zeros(feat_h, dtype=torch.bool, device=device)
valid_x[:valid_w] = 1
valid_y[:valid_h] = 1
valid_xx, valid_yy = self._meshgrid(valid_x, valid_y)
valid = valid_xx & valid_yy
valid = valid[:, None].expand(valid.size(0),
num_base_anchors).contiguous().view(-1)
return valid
def __repr__(self):
"""str: a string that describes the module"""
indent_str = ' '
repr_str = self.__class__.__name__ + '(\n'
repr_str += f'{indent_str}strides={self.strides},\n'
repr_str += f'{indent_str}ratios={self.ratios},\n'
repr_str += f'{indent_str}scales={self.scales},\n'
repr_str += f'{indent_str}base_sizes={self.base_sizes},\n'
repr_str += f'{indent_str}scale_major={self.scale_major},\n'
repr_str += f'{indent_str}octave_base_scale='
repr_str += f'{self.octave_base_scale},\n'
repr_str += f'{indent_str}scales_per_octave='
repr_str += f'{self.scales_per_octave},\n'
repr_str += f'{indent_str}num_levels={self.num_levels}\n'
repr_str += f'{indent_str}centers={self.centers},\n'
repr_str += f'{indent_str}center_offset={self.center_offset})'
return repr_str
@ANCHOR_GENERATORS.register_module()
class SSDAnchorGenerator(AnchorGenerator):
"""Anchor generator for SSD.
Args:
strides (list[int] | list[tuple[int, int]]): Strides of anchors
in multiple feature levels.
ratios (list[float]): The list of ratios between the height and width
of anchors in a single level.
basesize_ratio_range (tuple(float)): Ratio range of anchors.
        input_size (int): Size of the input image, 300 for SSD300,
            512 for SSD512.
scale_major (bool): Whether to multiply scales first when generating
base anchors. If true, the anchors in the same row will have the
same scales. It is always set to be False in SSD.
"""
def __init__(self,
strides,
ratios,
basesize_ratio_range,
input_size=300,
scale_major=True):
assert len(strides) == len(ratios)
assert mmcv.is_tuple_of(basesize_ratio_range, float)
self.strides = [_pair(stride) for stride in strides]
self.input_size = input_size
self.centers = [(stride[0] / 2., stride[1] / 2.)
for stride in self.strides]
self.basesize_ratio_range = basesize_ratio_range
# calculate anchor ratios and sizes
min_ratio, max_ratio = basesize_ratio_range
min_ratio = int(min_ratio * 100)
max_ratio = int(max_ratio * 100)
step = int(np.floor(max_ratio - min_ratio) / (self.num_levels - 2))
min_sizes = []
max_sizes = []
for ratio in range(int(min_ratio), int(max_ratio) + 1, step):
min_sizes.append(int(self.input_size * ratio / 100))
max_sizes.append(int(self.input_size * (ratio + step) / 100))
if self.input_size == 300:
if basesize_ratio_range[0] == 0.15: # SSD300 COCO
min_sizes.insert(0, int(self.input_size * 7 / 100))
max_sizes.insert(0, int(self.input_size * 15 / 100))
elif basesize_ratio_range[0] == 0.2: # SSD300 VOC
min_sizes.insert(0, int(self.input_size * 10 / 100))
max_sizes.insert(0, int(self.input_size * 20 / 100))
else:
raise ValueError(
'basesize_ratio_range[0] should be either 0.15'
'or 0.2 when input_size is 300, got '
f'{basesize_ratio_range[0]}.')
elif self.input_size == 512:
if basesize_ratio_range[0] == 0.1: # SSD512 COCO
min_sizes.insert(0, int(self.input_size * 4 / 100))
max_sizes.insert(0, int(self.input_size * 10 / 100))
elif basesize_ratio_range[0] == 0.15: # SSD512 VOC
min_sizes.insert(0, int(self.input_size * 7 / 100))
max_sizes.insert(0, int(self.input_size * 15 / 100))
else:
raise ValueError('basesize_ratio_range[0] should be either 0.1'
'or 0.15 when input_size is 512, got'
f' {basesize_ratio_range[0]}.')
else:
raise ValueError('Only support 300 or 512 in SSDAnchorGenerator'
f', got {self.input_size}.')
anchor_ratios = []
anchor_scales = []
for k in range(len(self.strides)):
scales = [1., np.sqrt(max_sizes[k] / min_sizes[k])]
anchor_ratio = [1.]
for r in ratios[k]:
anchor_ratio += [1 / r, r] # 4 or 6 ratio
anchor_ratios.append(torch.Tensor(anchor_ratio))
anchor_scales.append(torch.Tensor(scales))
self.base_sizes = min_sizes
self.scales = anchor_scales
self.ratios = anchor_ratios
self.scale_major = scale_major
self.center_offset = 0
self.base_anchors = self.gen_base_anchors()
def gen_base_anchors(self):
"""Generate base anchors.
Returns:
list(torch.Tensor): Base anchors of a feature grid in multiple \
feature levels.
"""
multi_level_base_anchors = []
for i, base_size in enumerate(self.base_sizes):
base_anchors = self.gen_single_level_base_anchors(
base_size,
scales=self.scales[i],
ratios=self.ratios[i],
center=self.centers[i])
indices = list(range(len(self.ratios[i])))
indices.insert(1, len(indices))
base_anchors = torch.index_select(base_anchors, 0,
torch.LongTensor(indices))
multi_level_base_anchors.append(base_anchors)
return multi_level_base_anchors
def __repr__(self):
"""str: a string that describes the module"""
indent_str = ' '
repr_str = self.__class__.__name__ + '(\n'
repr_str += f'{indent_str}strides={self.strides},\n'
repr_str += f'{indent_str}scales={self.scales},\n'
repr_str += f'{indent_str}scale_major={self.scale_major},\n'
repr_str += f'{indent_str}input_size={self.input_size},\n'
repr_str += f'{indent_str}scales={self.scales},\n'
repr_str += f'{indent_str}ratios={self.ratios},\n'
repr_str += f'{indent_str}num_levels={self.num_levels},\n'
repr_str += f'{indent_str}base_sizes={self.base_sizes},\n'
repr_str += f'{indent_str}basesize_ratio_range='
repr_str += f'{self.basesize_ratio_range})'
return repr_str
@ANCHOR_GENERATORS.register_module()
class LegacyAnchorGenerator(AnchorGenerator):
"""Legacy anchor generator used in MMDetection V1.x.
Note:
Difference to the V2.0 anchor generator:
        1. The center offset of V1.x anchors is set to be 0.5 rather than 0.
        2. The width/height are reduced by 1 when calculating the anchors' \
            centers and corners to meet the V1.x coordinate system.
        3. The anchors' corners are quantized.
Args:
strides (list[int] | list[tuple[int]]): Strides of anchors
in multiple feature levels.
ratios (list[float]): The list of ratios between the height and width
of anchors in a single level.
scales (list[int] | None): Anchor scales for anchors in a single level.
            It cannot be set at the same time as `octave_base_scale` and
            `scales_per_octave`.
base_sizes (list[int]): The basic sizes of anchors in multiple levels.
If None is given, strides will be used to generate base_sizes.
scale_major (bool): Whether to multiply scales first when generating
base anchors. If true, the anchors in the same row will have the
same scales. By default it is True in V2.0
octave_base_scale (int): The base scale of octave.
scales_per_octave (int): Number of scales for each octave.
`octave_base_scale` and `scales_per_octave` are usually used in
retinanet and the `scales` should be None when they are set.
centers (list[tuple[float, float]] | None): The centers of the anchor
relative to the feature grid center in multiple feature levels.
            By default it is set to be None and not used. If a list of floats
            is given, this list will be used to shift the centers of anchors.
        center_offset (float): The offset of center in proportion to anchors'
            width and height. By default it is 0 in V2.0, but it should be 0.5
            in v1.x models.
Examples:
>>> from mmdet.core import LegacyAnchorGenerator
>>> self = LegacyAnchorGenerator(
>>> [16], [1.], [1.], [9], center_offset=0.5)
>>> all_anchors = self.grid_anchors(((2, 2),), device='cpu')
>>> print(all_anchors)
[tensor([[ 0., 0., 8., 8.],
[16., 0., 24., 8.],
[ 0., 16., 8., 24.],
[16., 16., 24., 24.]])]
"""
def gen_single_level_base_anchors(self,
base_size,
scales,
ratios,
center=None):
"""Generate base anchors of a single level.
Note:
            The width/height of anchors are reduced by 1 when calculating \
                the centers and corners to meet the V1.x coordinate system.
Args:
base_size (int | float): Basic size of an anchor.
scales (torch.Tensor): Scales of the anchor.
            ratios (torch.Tensor): The ratio between the height
                and width of anchors in a single level.
center (tuple[float], optional): The center of the base anchor
related to a single feature grid. Defaults to None.
Returns:
torch.Tensor: Anchors in a single-level feature map.
"""
w = base_size
h = base_size
if center is None:
x_center = self.center_offset * (w - 1)
y_center = self.center_offset * (h - 1)
else:
x_center, y_center = center
h_ratios = torch.sqrt(ratios)
w_ratios = 1 / h_ratios
if self.scale_major:
ws = (w * w_ratios[:, None] * scales[None, :]).view(-1)
hs = (h * h_ratios[:, None] * scales[None, :]).view(-1)
else:
ws = (w * scales[:, None] * w_ratios[None, :]).view(-1)
hs = (h * scales[:, None] * h_ratios[None, :]).view(-1)
# use float anchor and the anchor's center is aligned with the
# pixel center
base_anchors = [
x_center - 0.5 * (ws - 1), y_center - 0.5 * (hs - 1),
x_center + 0.5 * (ws - 1), y_center + 0.5 * (hs - 1)
]
base_anchors = torch.stack(base_anchors, dim=-1).round()
return base_anchors
@ANCHOR_GENERATORS.register_module()
class LegacySSDAnchorGenerator(SSDAnchorGenerator, LegacyAnchorGenerator):
"""Legacy anchor generator used in MMDetection V1.x.
The difference between `LegacySSDAnchorGenerator` and `SSDAnchorGenerator`
can be found in `LegacyAnchorGenerator`.
"""
def __init__(self,
strides,
ratios,
basesize_ratio_range,
input_size=300,
scale_major=True):
super(LegacySSDAnchorGenerator,
self).__init__(strides, ratios, basesize_ratio_range, input_size,
scale_major)
self.centers = [((stride - 1) / 2., (stride - 1) / 2.)
for stride in strides]
self.base_anchors = self.gen_base_anchors()
@ANCHOR_GENERATORS.register_module()
class YOLOAnchorGenerator(AnchorGenerator):
"""Anchor generator for YOLO.
Args:
strides (list[int] | list[tuple[int, int]]): Strides of anchors
in multiple feature levels.
base_sizes (list[list[tuple[int, int]]]): The basic sizes
of anchors in multiple levels.
"""
def __init__(self, strides, base_sizes):
self.strides = [_pair(stride) for stride in strides]
self.centers = [(stride[0] / 2., stride[1] / 2.)
for stride in self.strides]
self.base_sizes = []
num_anchor_per_level = len(base_sizes[0])
for base_sizes_per_level in base_sizes:
assert num_anchor_per_level == len(base_sizes_per_level)
self.base_sizes.append(
[_pair(base_size) for base_size in base_sizes_per_level])
self.base_anchors = self.gen_base_anchors()
@property
def num_levels(self):
"""int: number of feature levels that the generator will be applied"""
return len(self.base_sizes)
def gen_base_anchors(self):
"""Generate base anchors.
Returns:
list(torch.Tensor): Base anchors of a feature grid in multiple \
feature levels.
"""
multi_level_base_anchors = []
for i, base_sizes_per_level in enumerate(self.base_sizes):
center = None
if self.centers is not None:
center = self.centers[i]
multi_level_base_anchors.append(
self.gen_single_level_base_anchors(base_sizes_per_level,
center))
return multi_level_base_anchors
def gen_single_level_base_anchors(self, base_sizes_per_level, center=None):
"""Generate base anchors of a single level.
Args:
base_sizes_per_level (list[tuple[int, int]]): Basic sizes of
anchors.
center (tuple[float], optional): The center of the base anchor
related to a single feature grid. Defaults to None.
Returns:
torch.Tensor: Anchors in a single-level feature maps.
"""
x_center, y_center = center
base_anchors = []
for base_size in base_sizes_per_level:
w, h = base_size
# use float anchor and the anchor's center is aligned with the
# pixel center
base_anchor = torch.Tensor([
x_center - 0.5 * w, y_center - 0.5 * h, x_center + 0.5 * w,
y_center + 0.5 * h
])
base_anchors.append(base_anchor)
base_anchors = torch.stack(base_anchors, dim=0)
return base_anchors
def responsible_flags(self, featmap_sizes, gt_bboxes, device='cuda'):
"""Generate responsible anchor flags of grid cells in multiple scales.
Args:
featmap_sizes (list(tuple)): List of feature map sizes in multiple
feature levels.
gt_bboxes (Tensor): Ground truth boxes, shape (n, 4).
device (str): Device where the anchors will be put on.
Return:
list(torch.Tensor): responsible flags of anchors in multiple level
"""
assert self.num_levels == len(featmap_sizes)
multi_level_responsible_flags = []
for i in range(self.num_levels):
anchor_stride = self.strides[i]
flags = self.single_level_responsible_flags(
featmap_sizes[i],
gt_bboxes,
anchor_stride,
self.num_base_anchors[i],
device=device)
multi_level_responsible_flags.append(flags)
return multi_level_responsible_flags
def single_level_responsible_flags(self,
featmap_size,
gt_bboxes,
stride,
num_base_anchors,
device='cuda'):
"""Generate the responsible flags of anchor in a single feature map.
Args:
featmap_size (tuple[int]): The size of feature maps.
gt_bboxes (Tensor): Ground truth boxes, shape (n, 4).
stride (tuple(int)): stride of current level
num_base_anchors (int): The number of base anchors.
device (str, optional): Device where the flags will be put on.
Defaults to 'cuda'.
Returns:
torch.Tensor: The valid flags of each anchor in a single level \
feature map.
"""
feat_h, feat_w = featmap_size
gt_bboxes_cx = ((gt_bboxes[:, 0] + gt_bboxes[:, 2]) * 0.5).to(device)
gt_bboxes_cy = ((gt_bboxes[:, 1] + gt_bboxes[:, 3]) * 0.5).to(device)
gt_bboxes_grid_x = torch.floor(gt_bboxes_cx / stride[0]).long()
gt_bboxes_grid_y = torch.floor(gt_bboxes_cy / stride[1]).long()
# row major indexing
gt_bboxes_grid_idx = gt_bboxes_grid_y * feat_w + gt_bboxes_grid_x
responsible_grid = torch.zeros(
feat_h * feat_w, dtype=torch.uint8, device=device)
responsible_grid[gt_bboxes_grid_idx] = 1
responsible_grid = responsible_grid[:, None].expand(
responsible_grid.size(0), num_base_anchors).contiguous().view(-1)
return responsible_grid
from mmcv.utils import Registry, build_from_cfg
ANCHOR_GENERATORS = Registry('Anchor generator')
def build_anchor_generator(cfg, default_args=None):
return build_from_cfg(cfg, ANCHOR_GENERATORS, default_args)
import torch
from .builder import ANCHOR_GENERATORS
@ANCHOR_GENERATORS.register_module()
class PointGenerator(object):
def _meshgrid(self, x, y, row_major=True):
xx = x.repeat(len(y))
yy = y.view(-1, 1).repeat(1, len(x)).view(-1)
if row_major:
return xx, yy
else:
return yy, xx
def grid_points(self, featmap_size, stride=16, device='cuda'):
feat_h, feat_w = featmap_size
shift_x = torch.arange(0., feat_w, device=device) * stride
shift_y = torch.arange(0., feat_h, device=device) * stride
shift_xx, shift_yy = self._meshgrid(shift_x, shift_y)
stride = shift_x.new_full((shift_xx.shape[0], ), stride)
shifts = torch.stack([shift_xx, shift_yy, stride], dim=-1)
all_points = shifts.to(device)
return all_points
def valid_flags(self, featmap_size, valid_size, device='cuda'):
feat_h, feat_w = featmap_size
valid_h, valid_w = valid_size
assert valid_h <= feat_h and valid_w <= feat_w
valid_x = torch.zeros(feat_w, dtype=torch.bool, device=device)
valid_y = torch.zeros(feat_h, dtype=torch.bool, device=device)
valid_x[:valid_w] = 1
valid_y[:valid_h] = 1
valid_xx, valid_yy = self._meshgrid(valid_x, valid_y)
valid = valid_xx & valid_yy
return valid
import torch
def images_to_levels(target, num_levels):
"""Convert targets by image to targets by feature level.
[target_img0, target_img1] -> [target_level0, target_level1, ...]
"""
target = torch.stack(target, 0)
level_targets = []
start = 0
for n in num_levels:
end = start + n
# level_targets.append(target[:, start:end].squeeze(0))
level_targets.append(target[:, start:end])
start = end
return level_targets
def anchor_inside_flags(flat_anchors,
valid_flags,
img_shape,
allowed_border=0):
"""Check whether the anchors are inside the border.
Args:
flat_anchors (torch.Tensor): Flatten anchors, shape (n, 4).
valid_flags (torch.Tensor): An existing valid flags of anchors.
img_shape (tuple(int)): Shape of current image.
allowed_border (int, optional): The border to allow the valid anchor.
Defaults to 0.
Returns:
torch.Tensor: Flags indicating whether the anchors are inside a \
valid range.
"""
img_h, img_w = img_shape[:2]
if allowed_border >= 0:
inside_flags = valid_flags & \
(flat_anchors[:, 0] >= -allowed_border) & \
(flat_anchors[:, 1] >= -allowed_border) & \
(flat_anchors[:, 2] < img_w + allowed_border) & \
(flat_anchors[:, 3] < img_h + allowed_border)
else:
inside_flags = valid_flags
return inside_flags
def calc_region(bbox, ratio, featmap_size=None):
"""Calculate a proportional bbox region.
The bbox center are fixed and the new h' and w' is h * ratio and w * ratio.
Args:
bbox (Tensor): Bboxes to calculate regions, shape (n, 4).
ratio (float): Ratio of the output region.
featmap_size (tuple): Feature map size used for clipping the boundary.
Returns:
tuple: x1, y1, x2, y2
"""
x1 = torch.round((1 - ratio) * bbox[0] + ratio * bbox[2]).long()
y1 = torch.round((1 - ratio) * bbox[1] + ratio * bbox[3]).long()
x2 = torch.round(ratio * bbox[0] + (1 - ratio) * bbox[2]).long()
y2 = torch.round(ratio * bbox[1] + (1 - ratio) * bbox[3]).long()
if featmap_size is not None:
x1 = x1.clamp(min=0, max=featmap_size[1])
y1 = y1.clamp(min=0, max=featmap_size[0])
x2 = x2.clamp(min=0, max=featmap_size[1])
y2 = y2.clamp(min=0, max=featmap_size[0])
return (x1, y1, x2, y2)
from .assigners import (AssignResult, BaseAssigner, CenterRegionAssigner,
MaxIoUAssigner, RegionAssigner)
from .builder import build_assigner, build_bbox_coder, build_sampler
from .coder import (BaseBBoxCoder, DeltaXYWHBBoxCoder, PseudoBBoxCoder,
TBLRBBoxCoder)
from .iou_calculators import BboxOverlaps2D, bbox_overlaps
from .samplers import (BaseSampler, CombinedSampler,
InstanceBalancedPosSampler, IoUBalancedNegSampler,
OHEMSampler, PseudoSampler, RandomSampler,
SamplingResult, ScoreHLRSampler)
from .transforms import (bbox2distance, bbox2result, bbox2roi,
bbox_cxcywh_to_xyxy, bbox_flip, bbox_mapping,
bbox_mapping_back, bbox_rescale, bbox_xyxy_to_cxcywh,
distance2bbox, roi2bbox)
__all__ = [
'bbox_overlaps', 'BboxOverlaps2D', 'BaseAssigner', 'MaxIoUAssigner',
'AssignResult', 'BaseSampler', 'PseudoSampler', 'RandomSampler',
'InstanceBalancedPosSampler', 'IoUBalancedNegSampler', 'CombinedSampler',
'OHEMSampler', 'SamplingResult', 'ScoreHLRSampler', 'build_assigner',
'build_sampler', 'bbox_flip', 'bbox_mapping', 'bbox_mapping_back',
'bbox2roi', 'roi2bbox', 'bbox2result', 'distance2bbox', 'bbox2distance',
'build_bbox_coder', 'BaseBBoxCoder', 'PseudoBBoxCoder',
'DeltaXYWHBBoxCoder', 'TBLRBBoxCoder', 'CenterRegionAssigner',
'bbox_rescale', 'bbox_cxcywh_to_xyxy', 'bbox_xyxy_to_cxcywh',
'RegionAssigner'
]
from .approx_max_iou_assigner import ApproxMaxIoUAssigner
from .assign_result import AssignResult
from .atss_assigner import ATSSAssigner
from .base_assigner import BaseAssigner
from .center_region_assigner import CenterRegionAssigner
from .grid_assigner import GridAssigner
from .hungarian_assigner import HungarianAssigner
from .max_iou_assigner import MaxIoUAssigner
from .point_assigner import PointAssigner
from .region_assigner import RegionAssigner
__all__ = [
'BaseAssigner', 'MaxIoUAssigner', 'ApproxMaxIoUAssigner', 'AssignResult',
'PointAssigner', 'ATSSAssigner', 'CenterRegionAssigner', 'GridAssigner',
'HungarianAssigner', 'RegionAssigner'
]
import torch
from ..builder import BBOX_ASSIGNERS
from ..iou_calculators import build_iou_calculator
from .max_iou_assigner import MaxIoUAssigner
@BBOX_ASSIGNERS.register_module()
class ApproxMaxIoUAssigner(MaxIoUAssigner):
"""Assign a corresponding gt bbox or background to each bbox.
    Each proposal will be assigned with an integer indicating the ground truth
    index. (semi-positive index: gt label (0-based), -1: background)
- -1: negative sample, no assigned gt
- semi-positive integer: positive sample, index (0-based) of assigned gt
Args:
pos_iou_thr (float): IoU threshold for positive bboxes.
neg_iou_thr (float or tuple): IoU threshold for negative bboxes.
min_pos_iou (float): Minimum iou for a bbox to be considered as a
positive bbox. Positive samples can have smaller IoU than
pos_iou_thr due to the 4th step (assign max IoU sample to each gt).
gt_max_assign_all (bool): Whether to assign all bboxes with the same
highest overlap with some gt to that gt.
ignore_iof_thr (float): IoF threshold for ignoring bboxes (if
`gt_bboxes_ignore` is specified). Negative values mean not
ignoring any bboxes.
ignore_wrt_candidates (bool): Whether to compute the iof between
`bboxes` and `gt_bboxes_ignore`, or the contrary.
match_low_quality (bool): Whether to allow quality matches. This is
usually allowed for RPN and single stage detectors, but not allowed
in the second stage.
gpu_assign_thr (int): The upper bound of the number of GT for GPU
assign. When the number of gt is above this threshold, will assign
on CPU device. Negative values mean not assign on CPU.
"""
def __init__(self,
pos_iou_thr,
neg_iou_thr,
min_pos_iou=.0,
gt_max_assign_all=True,
ignore_iof_thr=-1,
ignore_wrt_candidates=True,
match_low_quality=True,
gpu_assign_thr=-1,
iou_calculator=dict(type='BboxOverlaps2D')):
self.pos_iou_thr = pos_iou_thr
self.neg_iou_thr = neg_iou_thr
self.min_pos_iou = min_pos_iou
self.gt_max_assign_all = gt_max_assign_all
self.ignore_iof_thr = ignore_iof_thr
self.ignore_wrt_candidates = ignore_wrt_candidates
self.gpu_assign_thr = gpu_assign_thr
self.match_low_quality = match_low_quality
self.iou_calculator = build_iou_calculator(iou_calculator)
def assign(self,
approxs,
squares,
approxs_per_octave,
gt_bboxes,
gt_bboxes_ignore=None,
gt_labels=None):
"""Assign gt to approxs.
        This method assigns a gt bbox to each group of approxs (bboxes);
        each group of approxs is represented by a base approx (bbox) and
        will be assigned -1 or a semi-positive number.
        background_label (-1) means negative sample,
        a semi-positive number is the index (0-based) of the assigned gt.
        The assignment is done in the following steps; the order matters.
        1. assign every bbox to background_label (-1)
        2. use the max IoU of each group of approxs to assign
        3. assign proposals whose iou with all gts < neg_iou_thr to background
        4. for each bbox, if the iou with its nearest gt >= pos_iou_thr,
           assign it to that bbox
        5. for each gt bbox, assign its nearest proposals (may be more than
           one) to itself
Args:
approxs (Tensor): Bounding boxes to be assigned,
shape(approxs_per_octave*n, 4).
squares (Tensor): Base Bounding boxes to be assigned,
shape(n, 4).
approxs_per_octave (int): number of approxs per octave
gt_bboxes (Tensor): Groundtruth boxes, shape (k, 4).
gt_bboxes_ignore (Tensor, optional): Ground truth bboxes that are
labelled as `ignored`, e.g., crowd boxes in COCO.
gt_labels (Tensor, optional): Label of gt_bboxes, shape (k, ).
Returns:
:obj:`AssignResult`: The assign result.
"""
num_squares = squares.size(0)
num_gts = gt_bboxes.size(0)
if num_squares == 0 or num_gts == 0:
# No predictions and/or truth, return empty assignment
overlaps = approxs.new(num_gts, num_squares)
assign_result = self.assign_wrt_overlaps(overlaps, gt_labels)
return assign_result
# re-organize anchors by approxs_per_octave x num_squares
approxs = torch.transpose(
approxs.view(num_squares, approxs_per_octave, 4), 0,
1).contiguous().view(-1, 4)
assign_on_cpu = True if (self.gpu_assign_thr > 0) and (
num_gts > self.gpu_assign_thr) else False
# compute overlap and assign gt on CPU when number of GT is large
if assign_on_cpu:
device = approxs.device
approxs = approxs.cpu()
gt_bboxes = gt_bboxes.cpu()
if gt_bboxes_ignore is not None:
gt_bboxes_ignore = gt_bboxes_ignore.cpu()
if gt_labels is not None:
gt_labels = gt_labels.cpu()
all_overlaps = self.iou_calculator(approxs, gt_bboxes)
overlaps, _ = all_overlaps.view(approxs_per_octave, num_squares,
num_gts).max(dim=0)
overlaps = torch.transpose(overlaps, 0, 1)
if (self.ignore_iof_thr > 0 and gt_bboxes_ignore is not None
and gt_bboxes_ignore.numel() > 0 and squares.numel() > 0):
if self.ignore_wrt_candidates:
ignore_overlaps = self.iou_calculator(
squares, gt_bboxes_ignore, mode='iof')
ignore_max_overlaps, _ = ignore_overlaps.max(dim=1)
else:
ignore_overlaps = self.iou_calculator(
gt_bboxes_ignore, squares, mode='iof')
ignore_max_overlaps, _ = ignore_overlaps.max(dim=0)
overlaps[:, ignore_max_overlaps > self.ignore_iof_thr] = -1
assign_result = self.assign_wrt_overlaps(overlaps, gt_labels)
if assign_on_cpu:
assign_result.gt_inds = assign_result.gt_inds.to(device)
assign_result.max_overlaps = assign_result.max_overlaps.to(device)
if assign_result.labels is not None:
assign_result.labels = assign_result.labels.to(device)
return assign_result
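# --- Usage sketch (not part of the original sources) ------------------------
# A hedged, minimal example of ApproxMaxIoUAssigner.assign. It assumes the
# parent MaxIoUAssigner (imported above) provides assign_wrt_overlaps, as in
# mmdet v2.x. One base (square) anchor with two approx anchors per octave is
# matched against a single gt box; the resulting gt_inds are 1-based.
_example_assigner = ApproxMaxIoUAssigner(pos_iou_thr=0.5, neg_iou_thr=0.4)
_squares = torch.Tensor([[0., 0., 10., 10.]])  # (num_squares, 4)
_approxs = torch.Tensor([[0., 0., 10., 10.], [1., 1., 9., 9.]])  # (2 * num_squares, 4)
_gt_bboxes = torch.Tensor([[0., 0., 10., 10.]])
_gt_labels = torch.LongTensor([2])
_result = _example_assigner.assign(
_approxs, _squares, 2, _gt_bboxes, gt_labels=_gt_labels)
# _result.gt_inds -> tensor([1]): the square anchor is assigned to gt #1.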
import torch
from mmdet.utils import util_mixins
class AssignResult(util_mixins.NiceRepr):
"""Stores assignments between predicted and truth boxes.
Attributes:
num_gts (int): the number of truth boxes considered when computing this
assignment
gt_inds (LongTensor): for each predicted box indicates the 1-based
index of the assigned truth box. 0 means unassigned and -1 means
ignore.
max_overlaps (FloatTensor): the iou between the predicted box and its
assigned truth box.
labels (None | LongTensor): If specified, for each predicted box
indicates the category label of the assigned truth box.
Example:
>>> # An assign result between 4 predicted boxes and 9 true boxes
>>> # where only two boxes were assigned.
>>> num_gts = 9
>>> max_overlaps = torch.FloatTensor([0, .5, .9, 0])
>>> gt_inds = torch.LongTensor([-1, 1, 2, 0])
>>> labels = torch.LongTensor([0, 3, 4, 0])
>>> self = AssignResult(num_gts, gt_inds, max_overlaps, labels)
>>> print(str(self)) # xdoctest: +IGNORE_WANT
<AssignResult(num_gts=9, gt_inds.shape=(4,), max_overlaps.shape=(4,),
labels.shape=(4,))>
>>> # Force addition of gt labels (when adding gt as proposals)
>>> new_labels = torch.LongTensor([3, 4, 5])
>>> self.add_gt_(new_labels)
>>> print(str(self)) # xdoctest: +IGNORE_WANT
<AssignResult(num_gts=9, gt_inds.shape=(7,), max_overlaps.shape=(7,),
labels.shape=(7,))>
"""
def __init__(self, num_gts, gt_inds, max_overlaps, labels=None):
self.num_gts = num_gts
self.gt_inds = gt_inds
self.max_overlaps = max_overlaps
self.labels = labels
# Interface for possible user-defined properties
self._extra_properties = {}
@property
def num_preds(self):
"""int: the number of predictions in this assignment"""
return len(self.gt_inds)
def set_extra_property(self, key, value):
"""Set user-defined new property."""
assert key not in self.info
self._extra_properties[key] = value
def get_extra_property(self, key):
"""Get user-defined property."""
return self._extra_properties.get(key, None)
@property
def info(self):
"""dict: a dictionary of info about the object"""
basic_info = {
'num_gts': self.num_gts,
'num_preds': self.num_preds,
'gt_inds': self.gt_inds,
'max_overlaps': self.max_overlaps,
'labels': self.labels,
}
basic_info.update(self._extra_properties)
return basic_info
def __nice__(self):
"""str: a "nice" summary string describing this assign result"""
parts = []
parts.append(f'num_gts={self.num_gts!r}')
if self.gt_inds is None:
parts.append(f'gt_inds={self.gt_inds!r}')
else:
parts.append(f'gt_inds.shape={tuple(self.gt_inds.shape)!r}')
if self.max_overlaps is None:
parts.append(f'max_overlaps={self.max_overlaps!r}')
else:
parts.append('max_overlaps.shape='
f'{tuple(self.max_overlaps.shape)!r}')
if self.labels is None:
parts.append(f'labels={self.labels!r}')
else:
parts.append(f'labels.shape={tuple(self.labels.shape)!r}')
return ', '.join(parts)
@classmethod
def random(cls, **kwargs):
"""Create random AssignResult for tests or debugging.
Args:
num_preds: number of predicted boxes
num_gts: number of true boxes
p_ignore (float): probability of a predicted box assigned to an
ignored truth
p_assigned (float): probability of a predicted box not being
assigned
p_use_label (float | bool): with labels or not
rng (None | int | numpy.random.RandomState): seed or state
Returns:
:obj:`AssignResult`: Randomly generated assign results.
Example:
>>> from mmdet.core.bbox.assigners.assign_result import * # NOQA
>>> self = AssignResult.random()
>>> print(self.info)
"""
from mmdet.core.bbox import demodata
rng = demodata.ensure_rng(kwargs.get('rng', None))
num_gts = kwargs.get('num_gts', None)
num_preds = kwargs.get('num_preds', None)
p_ignore = kwargs.get('p_ignore', 0.3)
p_assigned = kwargs.get('p_assigned', 0.7)
p_use_label = kwargs.get('p_use_label', 0.5)
num_classes = kwargs.get('num_classes', 3)
if num_gts is None:
num_gts = rng.randint(0, 8)
if num_preds is None:
num_preds = rng.randint(0, 16)
if num_gts == 0:
max_overlaps = torch.zeros(num_preds, dtype=torch.float32)
gt_inds = torch.zeros(num_preds, dtype=torch.int64)
if p_use_label is True or p_use_label < rng.rand():
labels = torch.zeros(num_preds, dtype=torch.int64)
else:
labels = None
else:
import numpy as np
# Create an overlap for each predicted box
max_overlaps = torch.from_numpy(rng.rand(num_preds))
# Construct gt_inds for each predicted box
is_assigned = torch.from_numpy(rng.rand(num_preds) < p_assigned)
# maximum number of assignments constraints
n_assigned = min(num_preds, min(num_gts, is_assigned.sum()))
assigned_idxs = np.where(is_assigned)[0]
rng.shuffle(assigned_idxs)
assigned_idxs = assigned_idxs[0:n_assigned]
assigned_idxs.sort()
is_assigned[:] = 0
is_assigned[assigned_idxs] = True
is_ignore = torch.from_numpy(
rng.rand(num_preds) < p_ignore) & is_assigned
gt_inds = torch.zeros(num_preds, dtype=torch.int64)
true_idxs = np.arange(num_gts)
rng.shuffle(true_idxs)
true_idxs = torch.from_numpy(true_idxs)
gt_inds[is_assigned] = true_idxs[:n_assigned]
# NOTE: the per-gt assignment above is overwritten here with random
# 1-based gt indices; ignored/unassigned entries are reset just below.
gt_inds = torch.from_numpy(
rng.randint(1, num_gts + 1, size=num_preds))
gt_inds[is_ignore] = -1
gt_inds[~is_assigned] = 0
max_overlaps[~is_assigned] = 0
if p_use_label is True or p_use_label < rng.rand():
if num_classes == 0:
labels = torch.zeros(num_preds, dtype=torch.int64)
else:
labels = torch.from_numpy(
# remind that we set FG labels to [0, num_class-1]
# since mmdet v2.0
# BG cat_id: num_class
rng.randint(0, num_classes, size=num_preds))
labels[~is_assigned] = 0
else:
labels = None
self = cls(num_gts, gt_inds, max_overlaps, labels)
return self
def add_gt_(self, gt_labels):
"""Add ground truth as assigned results.
Args:
gt_labels (torch.Tensor): Labels of gt boxes
"""
self_inds = torch.arange(
1, len(gt_labels) + 1, dtype=torch.long, device=gt_labels.device)
self.gt_inds = torch.cat([self_inds, self.gt_inds])
self.max_overlaps = torch.cat(
[self.max_overlaps.new_ones(len(gt_labels)), self.max_overlaps])
if self.labels is not None:
self.labels = torch.cat([gt_labels, self.labels])
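# --- Usage sketch (not part of the original sources) ------------------------
# A hedged, minimal example of constructing an AssignResult by hand and
# attaching a user-defined extra property. gt_inds follows the usual mmdet
# convention: -1 = ignore, 0 = unassigned, >0 = 1-based index of the gt.
_gt_inds = torch.LongTensor([-1, 1, 2, 0])
_max_overlaps = torch.FloatTensor([0., 0.5, 0.9, 0.])
_labels = torch.LongTensor([0, 3, 4, 0])
_res = AssignResult(num_gts=9, gt_inds=_gt_inds, max_overlaps=_max_overlaps, labels=_labels)
_res.set_extra_property('my_flag', torch.zeros(4))  # hypothetical extra field
assert _res.num_preds == 4
# _res.info contains 'my_flag' alongside the basic fields.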
import torch
from ..builder import BBOX_ASSIGNERS
from ..iou_calculators import build_iou_calculator
from .assign_result import AssignResult
from .base_assigner import BaseAssigner
@BBOX_ASSIGNERS.register_module()
class ATSSAssigner(BaseAssigner):
"""Assign a corresponding gt bbox or background to each bbox.
Each proposal will be assigned with `0` or a positive integer
indicating the ground truth index.
- 0: negative sample, no assigned gt
- positive integer: positive sample, index (1-based) of assigned gt
Args:
topk (int): number of priors selected on each pyramid level
"""
def __init__(self,
topk,
iou_calculator=dict(type='BboxOverlaps2D'),
ignore_iof_thr=-1):
self.topk = topk
self.iou_calculator = build_iou_calculator(iou_calculator)
self.ignore_iof_thr = ignore_iof_thr
# https://github.com/sfzhang15/ATSS/blob/master/atss_core/modeling/rpn/atss/loss.py
def assign(self,
bboxes,
num_level_bboxes,
gt_bboxes,
gt_bboxes_ignore=None,
gt_labels=None):
"""Assign gt to bboxes.
The assignment is done in following steps
1. compute iou between all bbox (bbox of all pyramid levels) and gt
2. compute center distance between all bbox and gt
3. on each pyramid level, for each gt, select k bboxes whose centers
are closest to the gt center, so we select k*l bboxes in total as
candidates for each gt
4. get the corresponding iou for these candidates, and compute the
mean and std, set mean + std as the iou threshold
5. select candidates whose iou is greater than or equal to the
threshold as positive
6. limit the positive samples' centers to lie inside the gt
Args:
bboxes (Tensor): Bounding boxes to be assigned, shape(n, 4).
num_level_bboxes (List): num of bboxes in each level
gt_bboxes (Tensor): Groundtruth boxes, shape (k, 4).
gt_bboxes_ignore (Tensor, optional): Ground truth bboxes that are
labelled as `ignored`, e.g., crowd boxes in COCO.
gt_labels (Tensor, optional): Label of gt_bboxes, shape (k, ).
Returns:
:obj:`AssignResult`: The assign result.
"""
INF = 100000000
bboxes = bboxes[:, :4]
num_gt, num_bboxes = gt_bboxes.size(0), bboxes.size(0)
# compute iou between all bbox and gt
overlaps = self.iou_calculator(bboxes, gt_bboxes)
# assign 0 by default
assigned_gt_inds = overlaps.new_full((num_bboxes, ),
0,
dtype=torch.long)
if num_gt == 0 or num_bboxes == 0:
# No ground truth or boxes, return empty assignment
max_overlaps = overlaps.new_zeros((num_bboxes, ))
if num_gt == 0:
# No truth, assign everything to background
assigned_gt_inds[:] = 0
if gt_labels is None:
assigned_labels = None
else:
assigned_labels = overlaps.new_full((num_bboxes, ),
-1,
dtype=torch.long)
return AssignResult(
num_gt, assigned_gt_inds, max_overlaps, labels=assigned_labels)
# compute center distance between all bbox and gt
gt_cx = (gt_bboxes[:, 0] + gt_bboxes[:, 2]) / 2.0
gt_cy = (gt_bboxes[:, 1] + gt_bboxes[:, 3]) / 2.0
gt_points = torch.stack((gt_cx, gt_cy), dim=1)
bboxes_cx = (bboxes[:, 0] + bboxes[:, 2]) / 2.0
bboxes_cy = (bboxes[:, 1] + bboxes[:, 3]) / 2.0
bboxes_points = torch.stack((bboxes_cx, bboxes_cy), dim=1)
distances = (bboxes_points[:, None, :] -
gt_points[None, :, :]).pow(2).sum(-1).sqrt()
if (self.ignore_iof_thr > 0 and gt_bboxes_ignore is not None
and gt_bboxes_ignore.numel() > 0 and bboxes.numel() > 0):
ignore_overlaps = self.iou_calculator(
bboxes, gt_bboxes_ignore, mode='iof')
ignore_max_overlaps, _ = ignore_overlaps.max(dim=1)
ignore_idxs = ignore_max_overlaps > self.ignore_iof_thr
distances[ignore_idxs, :] = INF
assigned_gt_inds[ignore_idxs] = -1
# Selecting candidates based on the center distance
candidate_idxs = []
start_idx = 0
for level, bboxes_per_level in enumerate(num_level_bboxes):
# on each pyramid level, for each gt,
# select k bboxes whose centers are closest to the gt center
end_idx = start_idx + bboxes_per_level
distances_per_level = distances[start_idx:end_idx, :]
selectable_k = min(self.topk, bboxes_per_level)
_, topk_idxs_per_level = distances_per_level.topk(
selectable_k, dim=0, largest=False)
candidate_idxs.append(topk_idxs_per_level + start_idx)
start_idx = end_idx
candidate_idxs = torch.cat(candidate_idxs, dim=0)
# get the corresponding iou for these candidates, and compute the
# mean and std, set mean + std as the iou threshold
candidate_overlaps = overlaps[candidate_idxs, torch.arange(num_gt)]
overlaps_mean_per_gt = candidate_overlaps.mean(0)
overlaps_std_per_gt = candidate_overlaps.std(0)
overlaps_thr_per_gt = overlaps_mean_per_gt + overlaps_std_per_gt
is_pos = candidate_overlaps >= overlaps_thr_per_gt[None, :]
# limit the positive samples' centers to lie inside the gt
for gt_idx in range(num_gt):
candidate_idxs[:, gt_idx] += gt_idx * num_bboxes
ep_bboxes_cx = bboxes_cx.view(1, -1).expand(
num_gt, num_bboxes).contiguous().view(-1)
ep_bboxes_cy = bboxes_cy.view(1, -1).expand(
num_gt, num_bboxes).contiguous().view(-1)
candidate_idxs = candidate_idxs.view(-1)
# calculate the left, top, right, bottom distance between positive
# bbox center and gt side
l_ = ep_bboxes_cx[candidate_idxs].view(-1, num_gt) - gt_bboxes[:, 0]
t_ = ep_bboxes_cy[candidate_idxs].view(-1, num_gt) - gt_bboxes[:, 1]
r_ = gt_bboxes[:, 2] - ep_bboxes_cx[candidate_idxs].view(-1, num_gt)
b_ = gt_bboxes[:, 3] - ep_bboxes_cy[candidate_idxs].view(-1, num_gt)
is_in_gts = torch.stack([l_, t_, r_, b_], dim=1).min(dim=1)[0] > 0.01
is_pos = is_pos & is_in_gts
# if an anchor box is assigned to multiple gts,
# the one with the highest IoU will be selected.
overlaps_inf = torch.full_like(overlaps,
-INF).t().contiguous().view(-1)
index = candidate_idxs.view(-1)[is_pos.view(-1)]
overlaps_inf[index] = overlaps.t().contiguous().view(-1)[index]
overlaps_inf = overlaps_inf.view(num_gt, -1).t()
max_overlaps, argmax_overlaps = overlaps_inf.max(dim=1)
assigned_gt_inds[
max_overlaps != -INF] = argmax_overlaps[max_overlaps != -INF] + 1
if gt_labels is not None:
assigned_labels = assigned_gt_inds.new_full((num_bboxes, ), -1)
pos_inds = torch.nonzero(
assigned_gt_inds > 0, as_tuple=False).squeeze()
if pos_inds.numel() > 0:
assigned_labels[pos_inds] = gt_labels[
assigned_gt_inds[pos_inds] - 1]
else:
assigned_labels = None
return AssignResult(
num_gt, assigned_gt_inds, max_overlaps, labels=assigned_labels)
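# --- Usage sketch (not part of the original sources) ------------------------
# A hedged, minimal example of ATSSAssigner.assign on a single pyramid level.
# num_level_bboxes must sum to the number of priors. The first two anchors
# coincide with the gt, so the adaptive threshold (mean + std of the candidate
# IoUs) selects them as positives; gt_inds is 1-based.
_atss = ATSSAssigner(topk=2)
_bboxes = torch.Tensor([[0., 0., 10., 10.], [0., 0., 10., 10.],
[5., 5., 15., 15.], [20., 20., 30., 30.]])
_num_level_bboxes = [4]
_gt_bboxes = torch.Tensor([[0., 0., 10., 10.]])
_gt_labels = torch.LongTensor([1])
_atss_result = _atss.assign(
_bboxes, _num_level_bboxes, _gt_bboxes, gt_labels=_gt_labels)
# _atss_result.gt_inds -> tensor([1, 1, 0, 0])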
from abc import ABCMeta, abstractmethod
class BaseAssigner(metaclass=ABCMeta):
"""Base assigner that assigns boxes to ground truth boxes."""
@abstractmethod
def assign(self, bboxes, gt_bboxes, gt_bboxes_ignore=None, gt_labels=None):
"""Assign boxes to either a ground truth boxe or a negative boxes."""
pass
import torch
from ..builder import BBOX_ASSIGNERS
from ..iou_calculators import build_iou_calculator
from .assign_result import AssignResult
from .base_assigner import BaseAssigner
def scale_boxes(bboxes, scale):
"""Expand an array of boxes by a given scale.
Args:
bboxes (Tensor): Shape (m, 4)
scale (float): The scale factor of bboxes
Returns:
(Tensor): Shape (m, 4). Scaled bboxes
"""
assert bboxes.size(1) == 4
w_half = (bboxes[:, 2] - bboxes[:, 0]) * .5
h_half = (bboxes[:, 3] - bboxes[:, 1]) * .5
x_c = (bboxes[:, 2] + bboxes[:, 0]) * .5
y_c = (bboxes[:, 3] + bboxes[:, 1]) * .5
w_half *= scale
h_half *= scale
boxes_scaled = torch.zeros_like(bboxes)
boxes_scaled[:, 0] = x_c - w_half
boxes_scaled[:, 2] = x_c + w_half
boxes_scaled[:, 1] = y_c - h_half
boxes_scaled[:, 3] = y_c + h_half
return boxes_scaled
def is_located_in(points, bboxes):
"""Are points located in bboxes.
Args:
points (Tensor): Points, shape: (m, 2).
bboxes (Tensor): Bounding boxes, shape: (n, 4).
Returns:
Tensor: Flags indicating if points are located in bboxes, shape: (m, n).
"""
assert points.size(1) == 2
assert bboxes.size(1) == 4
return (points[:, 0].unsqueeze(1) > bboxes[:, 0].unsqueeze(0)) & \
(points[:, 0].unsqueeze(1) < bboxes[:, 2].unsqueeze(0)) & \
(points[:, 1].unsqueeze(1) > bboxes[:, 1].unsqueeze(0)) & \
(points[:, 1].unsqueeze(1) < bboxes[:, 3].unsqueeze(0))
def bboxes_area(bboxes):
"""Compute the area of an array of bboxes.
Args:
bboxes (Tensor): The coordinates of bboxes. Shape: (m, 4)
Returns:
Tensor: Area of the bboxes. Shape: (m, )
"""
assert bboxes.size(1) == 4
w = (bboxes[:, 2] - bboxes[:, 0])
h = (bboxes[:, 3] - bboxes[:, 1])
areas = w * h
return areas
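# --- Usage sketch (not part of the original sources) ------------------------
# A hedged, minimal example of the three helpers above. scale_boxes shrinks a
# box around its center when scale < 1 (here to its inner 20% region), and
# is_located_in tests strict containment of points in boxes.
_boxes = torch.Tensor([[0., 0., 10., 10.]])
_core = scale_boxes(_boxes, 0.2)              # tensor([[4., 4., 6., 6.]])
_points = torch.Tensor([[5., 5.], [9., 9.]])
_inside = is_located_in(_points, _core)       # tensor([[True], [False]])
_areas = bboxes_area(_boxes)                  # tensor([100.])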
@BBOX_ASSIGNERS.register_module()
class CenterRegionAssigner(BaseAssigner):
"""Assign pixels at the center region of a bbox as positive.
Each proposal will be assigned with `-1`, `0`, or a positive integer
indicating the ground truth index.
- -1: ignored sample (e.g. its center falls inside an ignored gt)
- 0: negative sample, no assigned gt
- positive integer: positive sample, index (1-based) of assigned gt
Args:
pos_scale (float): Threshold within which pixels are
labelled as positive.
neg_scale (float): Threshold within which (but outside the pos_scale
region) pixels are labelled as shadowed/ignored; pixels beyond it
are labelled as negative.
min_pos_iof (float): Minimum iof of a pixel with a gt to be
labelled as positive. Default: 1e-2
ignore_gt_scale (float): Threshold within which the pixels
are ignored when the gt is labelled as shadowed. Default: 0.5
foreground_dominate (bool): If True, the bbox will be assigned as
positive when a gt's kernel region overlaps with another's shadowed
(ignored) region, otherwise it is set as ignored. Default to False.
"""
def __init__(self,
pos_scale,
neg_scale,
min_pos_iof=1e-2,
ignore_gt_scale=0.5,
foreground_dominate=False,
iou_calculator=dict(type='BboxOverlaps2D')):
self.pos_scale = pos_scale
self.neg_scale = neg_scale
self.min_pos_iof = min_pos_iof
self.ignore_gt_scale = ignore_gt_scale
self.foreground_dominate = foreground_dominate
self.iou_calculator = build_iou_calculator(iou_calculator)
def get_gt_priorities(self, gt_bboxes):
"""Get gt priorities according to their areas.
Smaller gt has higher priority.
Args:
gt_bboxes (Tensor): Ground truth boxes, shape (k, 4).
Returns:
Tensor: The priority of gts so that gts with larger priority are \
more likely to be assigned. Shape (k, )
"""
gt_areas = bboxes_area(gt_bboxes)
# Rank all gt bbox areas. Smaller objects have larger priority
_, sort_idx = gt_areas.sort(descending=True)
sort_idx = sort_idx.argsort()
return sort_idx
def assign(self, bboxes, gt_bboxes, gt_bboxes_ignore=None, gt_labels=None):
"""Assign gt to bboxes.
This method assigns a gt to every bbox (proposal/anchor); each bbox \
will be assigned with -1, 0, or a positive number. -1 means the bbox \
is ignored, 0 means negative sample, and a positive number is the \
index (1-based) of the assigned gt.
Args:
bboxes (Tensor): Bounding boxes to be assigned, shape(n, 4).
gt_bboxes (Tensor): Groundtruth boxes, shape (k, 4).
gt_bboxes_ignore (tensor, optional): Ground truth bboxes that are
labelled as `ignored`, e.g., crowd boxes in COCO.
gt_labels (tensor, optional): Label of gt_bboxes, shape (num_gts,).
Returns:
:obj:`AssignResult`: The assigned result. Note that \
shadowed_labels of shape (N, 2) is also added as an \
`assign_result` attribute. `shadowed_labels` is a tensor \
composed of N pairs of [anchor_ind, class_label], where N \
is the number of anchors that lie in the outer region of a \
gt, anchor_ind is the shadowed anchor index and class_label \
is the shadowed class label.
Example:
>>> self = CenterRegionAssigner(0.2, 0.2)
>>> bboxes = torch.Tensor([[0, 0, 10, 10], [10, 10, 20, 20]])
>>> gt_bboxes = torch.Tensor([[0, 0, 10, 10]])
>>> assign_result = self.assign(bboxes, gt_bboxes)
>>> expected_gt_inds = torch.LongTensor([1, 0])
>>> assert torch.all(assign_result.gt_inds == expected_gt_inds)
"""
# There are in total 5 steps in the pixel assignment
# 1. Find core (the center region, say inner 0.2)
# and shadow (the relatively outer part, say inner 0.2-0.5)
# regions of every gt.
# 2. Find all prior bboxes that lie in gt_core and gt_shadow regions
# 3. Assign prior bboxes in gt_core with a one-hot id of the gt in
# the image.
# 3.1. For overlapping objects, the prior bboxes in gt_core are
# assigned to the object with the smallest area
# 4. Assign prior bboxes with class label according to its gt id.
# 4.1. Assign -1 to prior bboxes lying in shadowed gts
# 4.2. Assign positive prior boxes with the corresponding label
# 5. Find pixels lying in the shadow of an object and assign them with
# background label, but set the loss weight of its corresponding
# gt to zero.
assert bboxes.size(1) == 4, 'bboxes must have size of 4'
# 1. Find core positive and shadow region of every gt
gt_core = scale_boxes(gt_bboxes, self.pos_scale)
gt_shadow = scale_boxes(gt_bboxes, self.neg_scale)
# 2. Find prior bboxes that lie in gt_core and gt_shadow regions
bbox_centers = (bboxes[:, 2:4] + bboxes[:, 0:2]) / 2
# The center points lie within the gt boxes
is_bbox_in_gt = is_located_in(bbox_centers, gt_bboxes)
# Only calculate bbox and gt_core IoF. This enables small prior bboxes
# to match large gts
bbox_and_gt_core_overlaps = self.iou_calculator(
bboxes, gt_core, mode='iof')
# The center point of effective priors should be within the gt box
is_bbox_in_gt_core = is_bbox_in_gt & (
bbox_and_gt_core_overlaps > self.min_pos_iof) # shape (n, k)
is_bbox_in_gt_shadow = (
self.iou_calculator(bboxes, gt_shadow, mode='iof') >
self.min_pos_iof)
# Rule out center effective positive pixels
is_bbox_in_gt_shadow &= (~is_bbox_in_gt_core)
num_gts, num_bboxes = gt_bboxes.size(0), bboxes.size(0)
if num_gts == 0 or num_bboxes == 0:
# If no gts exist, assign all pixels to negative
assigned_gt_ids = \
is_bbox_in_gt_core.new_zeros((num_bboxes,),
dtype=torch.long)
pixels_in_gt_shadow = assigned_gt_ids.new_empty((0, 2))
else:
# Step 3: assign a one-hot gt id to each pixel, and smaller objects
# have high priority to assign the pixel.
sort_idx = self.get_gt_priorities(gt_bboxes)
assigned_gt_ids, pixels_in_gt_shadow = \
self.assign_one_hot_gt_indices(is_bbox_in_gt_core,
is_bbox_in_gt_shadow,
gt_priority=sort_idx)
if gt_bboxes_ignore is not None and gt_bboxes_ignore.numel() > 0:
# Ignore priors whose centers fall inside the (scaled) ignored gt boxes
gt_bboxes_ignore = scale_boxes(
gt_bboxes_ignore, scale=self.ignore_gt_scale)
is_bbox_in_ignored_gts = is_located_in(bbox_centers,
gt_bboxes_ignore)
is_bbox_in_ignored_gts = is_bbox_in_ignored_gts.any(dim=1)
assigned_gt_ids[is_bbox_in_ignored_gts] = -1
# 4. Assign prior bboxes with class label according to its gt id.
assigned_labels = None
shadowed_pixel_labels = None
if gt_labels is not None:
# Default assigned label is the background (-1)
assigned_labels = assigned_gt_ids.new_full((num_bboxes, ), -1)
pos_inds = torch.nonzero(
assigned_gt_ids > 0, as_tuple=False).squeeze()
if pos_inds.numel() > 0:
assigned_labels[pos_inds] = gt_labels[assigned_gt_ids[pos_inds]
- 1]
# 5. Find pixels lying in the shadow of an object
shadowed_pixel_labels = pixels_in_gt_shadow.clone()
if pixels_in_gt_shadow.numel() > 0:
pixel_idx, gt_idx =\
pixels_in_gt_shadow[:, 0], pixels_in_gt_shadow[:, 1]
assert (assigned_gt_ids[pixel_idx] != gt_idx).all(), \
'Some pixels are dually assigned to ignore and gt!'
shadowed_pixel_labels[:, 1] = gt_labels[gt_idx - 1]
override = (
assigned_labels[pixel_idx] == shadowed_pixel_labels[:, 1])
if self.foreground_dominate:
# When a pixel is both positive and shadowed, set it as pos
shadowed_pixel_labels = shadowed_pixel_labels[~override]
else:
# When a pixel is both pos and shadowed, set it as shadowed
assigned_labels[pixel_idx[override]] = -1
assigned_gt_ids[pixel_idx[override]] = 0
assign_result = AssignResult(
num_gts, assigned_gt_ids, None, labels=assigned_labels)
# Add shadowed_labels as assign_result property. Shape: (num_shadow, 2)
assign_result.set_extra_property('shadowed_labels',
shadowed_pixel_labels)
return assign_result
def assign_one_hot_gt_indices(self,
is_bbox_in_gt_core,
is_bbox_in_gt_shadow,
gt_priority=None):
"""Assign only one gt index to each prior box.
Gts with large gt_priority are more likely to be assigned.
Args:
is_bbox_in_gt_core (Tensor): Bool tensor indicating the bbox center
is in the core area of a gt (e.g. 0-0.2).
Shape: (num_prior, num_gt).
is_bbox_in_gt_shadow (Tensor): Bool tensor indicating the bbox
center is in the shadowed area of a gt (e.g. 0.2-0.5).
Shape: (num_prior, num_gt).
gt_priority (Tensor): Priorities of gts. The gt with a higher
priority is more likely to be assigned to the bbox when the bbox
matches multiple gts. Shape: (num_gt, ).
Returns:
tuple: Returns (assigned_gt_inds, shadowed_gt_inds).
- assigned_gt_inds: The assigned gt index of each prior bbox \
(i.e. index from 1 to num_gts). Shape: (num_prior, ).
- shadowed_gt_inds: shadowed gt indices. It is a tensor of \
shape (num_ignore, 2) with first column being the \
shadowed prior bbox indices and the second column the \
shadowed gt indices (1-based).
"""
num_bboxes, num_gts = is_bbox_in_gt_core.shape
if gt_priority is None:
gt_priority = torch.arange(
num_gts, device=is_bbox_in_gt_core.device)
assert gt_priority.size(0) == num_gts
# The bigger gt_priority, the more preferable to be assigned
# The assigned inds are by default 0 (background)
assigned_gt_inds = is_bbox_in_gt_core.new_zeros((num_bboxes, ),
dtype=torch.long)
# Shadowed bboxes are assigned to be background. But the corresponding
# label is ignored during loss calculation, which is done through
# shadowed_gt_inds
shadowed_gt_inds = torch.nonzero(is_bbox_in_gt_shadow, as_tuple=False)
if is_bbox_in_gt_core.sum() == 0: # No gt match
shadowed_gt_inds[:, 1] += 1 # 1-based. For consistency issue
return assigned_gt_inds, shadowed_gt_inds
# The priority of each prior box and gt pair. If one prior box is
# matched to multiple gts, only the pair with the highest priority
# is kept
pair_priority = is_bbox_in_gt_core.new_full((num_bboxes, num_gts),
-1,
dtype=torch.long)
# Each bbox could match with multiple gts.
# The following codes deal with this situation
# Matched bboxes (to any gt). Shape: (num_pos_anchor, )
inds_of_match = torch.any(is_bbox_in_gt_core, dim=1)
# The matched gt indices of positive bboxes. Length >= num_pos_anchor,
# since one bbox could match multiple gts
matched_bbox_gt_inds = torch.nonzero(
is_bbox_in_gt_core, as_tuple=False)[:, 1]
# Assign priority to each bbox-gt pair.
pair_priority[is_bbox_in_gt_core] = gt_priority[matched_bbox_gt_inds]
_, argmax_priority = pair_priority[inds_of_match].max(dim=1)
assigned_gt_inds[inds_of_match] = argmax_priority + 1 # 1-based
# Zero-out the assigned anchor box to filter the shadowed gt indices
is_bbox_in_gt_core[inds_of_match, argmax_priority] = 0
# Concatenate the shadowed indices caused by overlaps with gts outside
# the effective scale. Shape: (total_num_ignore, 2)
shadowed_gt_inds = torch.cat(
(shadowed_gt_inds, torch.nonzero(
is_bbox_in_gt_core, as_tuple=False)),
dim=0)
# `is_bbox_in_gt_core` should be changed back to keep arguments intact.
is_bbox_in_gt_core[inds_of_match, argmax_priority] = 1
# 1-based shadowed gt indices, to be consistent with `assigned_gt_inds`
if shadowed_gt_inds.numel() > 0:
shadowed_gt_inds[:, 1] += 1
return assigned_gt_inds, shadowed_gt_inds
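# --- Usage sketch (not part of the original sources) ------------------------
# A hedged, minimal example of CenterRegionAssigner with labels. The first
# prior sits in the core (inner 30%) region of the gt and becomes positive;
# the second only touches the shadow (inner 80%) region, so it stays negative
# but shows up in the 'shadowed_labels' extra property (pairs of
# [prior_index, class_label]) whose loss is meant to be ignored.
_cra = CenterRegionAssigner(pos_scale=0.3, neg_scale=0.8)
_priors = torch.Tensor([[4., 4., 6., 6.], [8., 8., 12., 12.]])
_gts = torch.Tensor([[0., 0., 10., 10.]])
_gt_labels = torch.LongTensor([2])
_cra_result = _cra.assign(_priors, _gts, gt_labels=_gt_labels)
# _cra_result.gt_inds -> tensor([1, 0])
_shadowed = _cra_result.get_extra_property('shadowed_labels')  # tensor([[1, 2]])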
import torch
from ..builder import BBOX_ASSIGNERS
from ..iou_calculators import build_iou_calculator
from .assign_result import AssignResult
from .base_assigner import BaseAssigner
@BBOX_ASSIGNERS.register_module()
class GridAssigner(BaseAssigner):
"""Assign a corresponding gt bbox or background to each bbox.
Each proposal will be assigned with `-1`, `0`, or a positive integer
indicating the ground truth index.
- -1: don't care
- 0: negative sample, no assigned gt
- positive integer: positive sample, index (1-based) of assigned gt
Args:
pos_iou_thr (float): IoU threshold for positive bboxes.
neg_iou_thr (float or tuple): IoU threshold for negative bboxes.
min_pos_iou (float): Minimum iou for a bbox to be considered as a
positive bbox. Positive samples can have smaller IoU than
pos_iou_thr due to the 4th step (assign max IoU sample to each gt).
gt_max_assign_all (bool): Whether to assign all bboxes with the same
highest overlap with some gt to that gt.
"""
def __init__(self,
pos_iou_thr,
neg_iou_thr,
min_pos_iou=.0,
gt_max_assign_all=True,
iou_calculator=dict(type='BboxOverlaps2D')):
self.pos_iou_thr = pos_iou_thr
self.neg_iou_thr = neg_iou_thr
self.min_pos_iou = min_pos_iou
self.gt_max_assign_all = gt_max_assign_all
self.iou_calculator = build_iou_calculator(iou_calculator)
def assign(self, bboxes, box_responsible_flags, gt_bboxes, gt_labels=None):
"""Assign gt to bboxes. The process is very much like the max iou
assigner, except that positive samples are constrained within the cell
that the gt boxes fell in.
This method assign a gt bbox to every bbox (proposal/anchor), each bbox
will be assigned with -1, 0, or a positive number. -1 means don't care,
0 means negative sample, positive number is the index (1-based) of
assigned gt.
The assignment is done in following steps, the order matters.
1. assign every bbox to -1
2. assign proposals whose iou with all gts <= neg_iou_thr to 0
3. for each bbox within a cell, if the iou with its nearest gt >
pos_iou_thr and the center of that gt falls inside the cell,
assign it to that bbox
4. for each gt bbox, assign its nearest proposals within the cell the
gt bbox falls in to itself.
Args:
bboxes (Tensor): Bounding boxes to be assigned, shape(n, 4).
box_responsible_flags (Tensor): flag to indicate whether box is
responsible for prediction, shape(n, )
gt_bboxes (Tensor): Groundtruth boxes, shape (k, 4).
gt_labels (Tensor, optional): Label of gt_bboxes, shape (k, ).
Returns:
:obj:`AssignResult`: The assign result.
"""
num_gts, num_bboxes = gt_bboxes.size(0), bboxes.size(0)
# compute iou between all gt and bboxes
overlaps = self.iou_calculator(gt_bboxes, bboxes)
# 1. assign -1 by default
assigned_gt_inds = overlaps.new_full((num_bboxes, ),
-1,
dtype=torch.long)
if num_gts == 0 or num_bboxes == 0:
# No ground truth or boxes, return empty assignment
max_overlaps = overlaps.new_zeros((num_bboxes, ))
if num_gts == 0:
# No truth, assign everything to background
assigned_gt_inds[:] = 0
if gt_labels is None:
assigned_labels = None
else:
assigned_labels = overlaps.new_full((num_bboxes, ),
-1,
dtype=torch.long)
return AssignResult(
num_gts,
assigned_gt_inds,
max_overlaps,
labels=assigned_labels)
# 2. assign negative: below
# for each anchor, which gt best overlaps with it
# for each anchor, the max iou of all gts
# shape of max_overlaps == argmax_overlaps == num_bboxes
max_overlaps, argmax_overlaps = overlaps.max(dim=0)
if isinstance(self.neg_iou_thr, float):
assigned_gt_inds[(max_overlaps >= 0)
& (max_overlaps <= self.neg_iou_thr)] = 0
elif isinstance(self.neg_iou_thr, (tuple, list)):
assert len(self.neg_iou_thr) == 2
assigned_gt_inds[(max_overlaps > self.neg_iou_thr[0])
& (max_overlaps <= self.neg_iou_thr[1])] = 0
# 3. assign positive: falls into responsible cell and above
# positive IOU threshold, the order matters.
# the precondition of the comparison is to filter out all
# unrelated anchors, i.e. not box_responsible_flags
overlaps[:, ~box_responsible_flags.type(torch.bool)] = -1.
# calculate max_overlaps again, but this time we only consider IOUs
# for anchors responsible for prediction
max_overlaps, argmax_overlaps = overlaps.max(dim=0)
# for each gt, which anchor best overlaps with it
# for each gt, the max iou of all proposals
# shape of gt_max_overlaps == gt_argmax_overlaps == num_gts
gt_max_overlaps, gt_argmax_overlaps = overlaps.max(dim=1)
pos_inds = (max_overlaps >
self.pos_iou_thr) & box_responsible_flags.type(torch.bool)
assigned_gt_inds[pos_inds] = argmax_overlaps[pos_inds] + 1
# 4. assign positive to max overlapped anchors within responsible cell
for i in range(num_gts):
if gt_max_overlaps[i] > self.min_pos_iou:
if self.gt_max_assign_all:
max_iou_inds = (overlaps[i, :] == gt_max_overlaps[i]) & \
box_responsible_flags.type(torch.bool)
assigned_gt_inds[max_iou_inds] = i + 1
elif box_responsible_flags[gt_argmax_overlaps[i]]:
assigned_gt_inds[gt_argmax_overlaps[i]] = i + 1
# assign labels of positive anchors
if gt_labels is not None:
assigned_labels = assigned_gt_inds.new_full((num_bboxes, ), -1)
pos_inds = torch.nonzero(
assigned_gt_inds > 0, as_tuple=False).squeeze()
if pos_inds.numel() > 0:
assigned_labels[pos_inds] = gt_labels[
assigned_gt_inds[pos_inds] - 1]
else:
assigned_labels = None
return AssignResult(
num_gts, assigned_gt_inds, max_overlaps, labels=assigned_labels)
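# --- Usage sketch (not part of the original sources) ------------------------
# A hedged, minimal example of GridAssigner.assign (YOLO-style assignment).
# box_responsible_flags marks which anchors live in the grid cell containing
# the gt center; only those can become positive. gt_inds is 1-based.
_grid = GridAssigner(pos_iou_thr=0.5, neg_iou_thr=0.3)
_anchors = torch.Tensor([[0., 0., 10., 10.], [20., 20., 30., 30.]])
_responsible = torch.BoolTensor([True, False])
_grid_gts = torch.Tensor([[0., 0., 9., 9.]])
_grid_labels = torch.LongTensor([5])
_grid_result = _grid.assign(_anchors, _responsible, _grid_gts, _grid_labels)
# _grid_result.gt_inds -> tensor([1, 0]); _grid_result.labels -> tensor([5, -1])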