Commit eb1107e4 authored by raojy's avatar raojy

fix_mmdetection

parent 7aa442d5
Models:
- Name: mask-rcnn_r50_fpn_albu-1x_coco
In Collection: Mask R-CNN
Config: mask-rcnn_r50_fpn_albu-1x_coco.py
Metadata:
Training Memory (GB): 4.4
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 38.0
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 34.5
Weights: https://download.openmmlab.com/mmdetection/v2.0/albu_example/mask_rcnn_r50_fpn_albu_1x_coco/mask_rcnn_r50_fpn_albu_1x_coco_20200208-ab203bcd.pth
_base_ = './atss_r50_fpn_1x_coco.py'
model = dict(
backbone=dict(
depth=101,
init_cfg=dict(type='Pretrained',
checkpoint='torchvision://resnet101')))
_base_ = './atss_r50_fpn_8xb8-amp-lsj-200e_coco.py'
model = dict(
backbone=dict(
depth=101,
init_cfg=dict(type='Pretrained',
checkpoint='torchvision://resnet101')))
_base_ = './atss_r50_fpn_8xb8-amp-lsj-200e_coco.py'
model = dict(
backbone=dict(
depth=18,
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet18')),
neck=dict(in_channels=[64, 128, 256, 512]))
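The `neck` override above is needed because ResNet-18/34 use BasicBlock (channel expansion 1) while ResNet-50/101/152 use Bottleneck (expansion 4), so the FPN input widths change with backbone depth. A minimal sketch of that relationship; `fpn_in_channels` is a hypothetical helper, not part of MMDetection:

```python
def fpn_in_channels(depth: int) -> list:
    """Per-stage output channels of a torchvision-style ResNet backbone."""
    base = [64, 128, 256, 512]            # stage widths before block expansion
    expansion = 1 if depth in (18, 34) else 4   # BasicBlock vs Bottleneck
    return [c * expansion for c in base]

print(fpn_in_channels(18))   # [64, 128, 256, 512]
print(fpn_in_channels(50))   # [256, 512, 1024, 2048]
```

This is why the ResNet-101 variants above need no `neck` override: depths 50 and 101 share the same Bottleneck widths.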
_base_ = [
'../_base_/datasets/coco_detection.py',
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]
# model settings
model = dict(
type='ATSS',
data_preprocessor=dict(
type='DetDataPreprocessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True,
pad_size_divisor=32),
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch',
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
start_level=1,
add_extra_convs='on_output',
num_outs=5),
bbox_head=dict(
type='ATSSHead',
num_classes=80,
in_channels=256,
stacked_convs=4,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[.0, .0, .0, .0],
target_stds=[0.1, 0.1, 0.2, 0.2]),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(type='GIoULoss', loss_weight=2.0),
loss_centerness=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
# training and testing settings
train_cfg=dict(
assigner=dict(type='ATSSAssigner', topk=9),
allowed_border=-1,
pos_weight=-1,
debug=False),
test_cfg=dict(
nms_pre=1000,
min_bbox_size=0,
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.6),
max_per_img=100))
# optimizer
optim_wrapper = dict(
optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
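The `bbox_coder` in the head above regresses normalized center/size deltas. A minimal sketch of the DeltaXYWH encoding, assuming `(x1, y1, x2, y2)` boxes and the `target_means`/`target_stds` from the config; this is an illustration, not MMDetection's implementation:

```python
import math

def encode(proposal, gt, means=(0., 0., 0., 0.), stds=(0.1, 0.1, 0.2, 0.2)):
    """Encode a ground-truth box as normalized deltas w.r.t. a proposal."""
    px, py = (proposal[0] + proposal[2]) / 2, (proposal[1] + proposal[3]) / 2
    pw, ph = proposal[2] - proposal[0], proposal[3] - proposal[1]
    gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    deltas = [(gx - px) / pw, (gy - py) / ph,
              math.log(gw / pw), math.log(gh / ph)]
    # Normalize by means/stds so all four targets have comparable scale.
    return [(d - m) / s for d, m, s in zip(deltas, means, stds)]

print(encode((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes -> all zeros
```

Dividing by `stds=(0.1, 0.1, 0.2, 0.2)` amplifies small center/size offsets, which keeps the regression targets well-scaled for the loss.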
_base_ = '../common/lsj-200e_coco-detection.py'
image_size = (1024, 1024)
batch_augments = [dict(type='BatchFixedSizePad', size=image_size)]
model = dict(
type='ATSS',
data_preprocessor=dict(
type='DetDataPreprocessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True,
pad_size_divisor=32,
batch_augments=batch_augments),
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch',
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
start_level=1,
add_extra_convs='on_output',
num_outs=5),
bbox_head=dict(
type='ATSSHead',
num_classes=80,
in_channels=256,
stacked_convs=4,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[.0, .0, .0, .0],
target_stds=[0.1, 0.1, 0.2, 0.2]),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(type='GIoULoss', loss_weight=2.0),
loss_centerness=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
# training and testing settings
train_cfg=dict(
assigner=dict(type='ATSSAssigner', topk=9),
allowed_border=-1,
pos_weight=-1,
debug=False),
test_cfg=dict(
nms_pre=1000,
min_bbox_size=0,
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.6),
max_per_img=100))
train_dataloader = dict(batch_size=8, num_workers=4)
# Enable automatic-mixed-precision training with AmpOptimWrapper.
optim_wrapper = dict(
type='AmpOptimWrapper',
optimizer=dict(
type='SGD', lr=0.01 * 4, momentum=0.9, weight_decay=0.00004))
# NOTE: `auto_scale_lr` is for automatically scaling LR,
# USER SHOULD NOT CHANGE ITS VALUES.
# base_batch_size = (8 GPUs) x (8 samples per GPU)
auto_scale_lr = dict(base_batch_size=64)
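`auto_scale_lr` applies the linear scaling rule: the effective LR is the configured LR multiplied by the ratio of the actual total batch size to `base_batch_size`. A sketch of the rule (illustrative helper, not MMEngine's implementation):

```python
def scale_lr(configured_lr: float, base_batch_size: int,
             num_gpus: int, samples_per_gpu: int) -> float:
    """Linear LR scaling: lr grows in proportion to the total batch size."""
    return configured_lr * (num_gpus * samples_per_gpu) / base_batch_size

# 8 GPUs x 8 samples/GPU matches base_batch_size=64, so the LR is unchanged:
print(scale_lr(0.04, 64, 8, 8))  # 0.04
# Halving the GPU count halves the effective LR:
print(scale_lr(0.04, 64, 4, 8))  # 0.02
```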
Collections:
- Name: ATSS
Metadata:
Training Data: COCO
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x V100 GPUs
Architecture:
- ATSS
- FPN
- ResNet
Paper:
URL: https://arxiv.org/abs/1912.02424
Title: 'Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection'
README: configs/atss/README.md
Code:
URL: https://github.com/open-mmlab/mmdetection/blob/v2.0.0/mmdet/models/detectors/atss.py#L6
Version: v2.0.0
Models:
- Name: atss_r50_fpn_1x_coco
In Collection: ATSS
Config: configs/atss/atss_r50_fpn_1x_coco.py
Metadata:
Training Memory (GB): 3.7
inference time (ms/im):
- value: 50.76
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (800, 1333)
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 39.4
Weights: https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209-985f7bd0.pth
- Name: atss_r101_fpn_1x_coco
In Collection: ATSS
Config: configs/atss/atss_r101_fpn_1x_coco.py
Metadata:
Training Memory (GB): 5.6
inference time (ms/im):
- value: 81.3
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (800, 1333)
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 41.5
Weights: https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r101_fpn_1x_coco/atss_r101_fpn_1x_20200825-dfcadd6f.pth
# We follow the original implementation which
# adopts the Caffe pre-trained backbone.
_base_ = [
'../_base_/datasets/coco_detection.py',
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]
# model settings
model = dict(
type='AutoAssign',
data_preprocessor=dict(
type='DetDataPreprocessor',
mean=[102.9801, 115.9465, 122.7717],
std=[1.0, 1.0, 1.0],
bgr_to_rgb=False,
pad_size_divisor=32),
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=False),
norm_eval=True,
style='caffe',
init_cfg=dict(
type='Pretrained',
checkpoint='open-mmlab://detectron2/resnet50_caffe')),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
start_level=1,
add_extra_convs=True,
num_outs=5,
relu_before_extra_convs=True,
init_cfg=dict(type='Caffe2Xavier', layer='Conv2d')),
bbox_head=dict(
type='AutoAssignHead',
num_classes=80,
in_channels=256,
stacked_convs=4,
feat_channels=256,
strides=[8, 16, 32, 64, 128],
loss_bbox=dict(type='GIoULoss', loss_weight=5.0)),
train_cfg=None,
test_cfg=dict(
nms_pre=1000,
min_bbox_size=0,
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.6),
max_per_img=100))
# learning rate
param_scheduler = [
dict(
type='LinearLR', start_factor=0.001, by_epoch=False, begin=0,
end=1000),
dict(
type='MultiStepLR',
begin=0,
end=12,
by_epoch=True,
milestones=[8, 11],
gamma=0.1)
]
# optimizer
optim_wrapper = dict(
optimizer=dict(lr=0.01), paramwise_cfg=dict(norm_decay_mult=0.))
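`paramwise_cfg=dict(norm_decay_mult=0.)` above zeroes weight decay for normalization-layer parameters, which are commonly excluded from regularization. A rough sketch of the grouping, assuming parameters are identified by name (the real implementation inspects module types, not names):

```python
def split_decay_groups(param_names, weight_decay=1e-4):
    """Split parameters into decayed and non-decayed optimizer groups."""
    norm_keys = ('bn', 'gn', 'norm')      # name fragments assumed to mark norm layers
    groups = {'decay': [], 'no_decay': []}
    for name in param_names:
        key = 'no_decay' if any(k in name.lower() for k in norm_keys) else 'decay'
        groups[key].append(name)
    return [
        {'params': groups['decay'], 'weight_decay': weight_decay},
        {'params': groups['no_decay'], 'weight_decay': 0.0},
    ]

names = ['backbone.conv1.weight', 'backbone.bn1.weight']
print(split_decay_groups(names))
```

The returned list mirrors the per-group `weight_decay` structure a PyTorch optimizer accepts.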
Collections:
- Name: AutoAssign
Metadata:
Training Data: COCO
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x V100 GPUs
Architecture:
- AutoAssign
- FPN
- ResNet
Paper:
URL: https://arxiv.org/abs/2007.03496
Title: 'AutoAssign: Differentiable Label Assignment for Dense Object Detection'
README: configs/autoassign/README.md
Code:
URL: https://github.com/open-mmlab/mmdetection/blob/v2.12.0/mmdet/models/detectors/autoassign.py#L6
Version: v2.12.0
Models:
- Name: autoassign_r50-caffe_fpn_1x_coco
In Collection: AutoAssign
Config: configs/autoassign/autoassign_r50-caffe_fpn_1x_coco.py
Metadata:
Training Memory (GB): 4.08
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 40.4
Weights: https://download.openmmlab.com/mmdetection/v2.0/autoassign/auto_assign_r50_fpn_1x_coco/auto_assign_r50_fpn_1x_coco_20210413_115540-5e17991f.pth
_base_ = './boxinst_r50_fpn_ms-90k_coco.py'
# model settings
model = dict(
backbone=dict(
depth=101,
init_cfg=dict(type='Pretrained',
checkpoint='torchvision://resnet101')))
_base_ = '../common/ms-90k_coco.py'
# model settings
model = dict(
type='BoxInst',
data_preprocessor=dict(
type='BoxInstDataPreprocessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True,
pad_size_divisor=32,
mask_stride=4,
pairwise_size=3,
pairwise_dilation=2,
pairwise_color_thresh=0.3,
bottom_pixels_removed=10),
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
start_level=1,
add_extra_convs='on_output', # use P5
num_outs=5,
relu_before_extra_convs=True),
bbox_head=dict(
type='BoxInstBboxHead',
num_params=593,
num_classes=80,
in_channels=256,
stacked_convs=4,
feat_channels=256,
strides=[8, 16, 32, 64, 128],
norm_on_bbox=True,
centerness_on_reg=True,
dcn_on_last_conv=False,
center_sampling=True,
conv_bias=True,
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(type='GIoULoss', loss_weight=1.0),
loss_centerness=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
mask_head=dict(
type='BoxInstMaskHead',
num_layers=3,
feat_channels=16,
size_of_interest=8,
mask_out_stride=4,
topk_masks_per_img=64,
mask_feature_head=dict(
in_channels=256,
feat_channels=128,
start_level=0,
end_level=2,
out_channels=16,
mask_stride=8,
num_stacked_convs=4,
norm_cfg=dict(type='BN', requires_grad=True)),
loss_mask=dict(
type='DiceLoss',
use_sigmoid=True,
activate=True,
eps=5e-6,
loss_weight=1.0)),
# model training and testing settings
test_cfg=dict(
nms_pre=1000,
min_bbox_size=0,
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.6),
max_per_img=100,
mask_thr=0.5))
# optimizer
optim_wrapper = dict(optimizer=dict(lr=0.01))
# evaluator
val_evaluator = dict(metric=['bbox', 'segm'])
test_evaluator = val_evaluator
Collections:
- Name: BoxInst
Metadata:
Training Data: COCO
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x A100 GPUs
Architecture:
- ResNet
- FPN
- CondInst
Paper:
URL: https://arxiv.org/abs/2012.02310
Title: 'BoxInst: High-Performance Instance Segmentation with Box Annotations'
README: configs/boxinst/README.md
Code:
URL: https://github.com/open-mmlab/mmdetection/blob/v3.0.0rc6/mmdet/models/detectors/boxinst.py#L8
Version: v3.0.0rc6
Models:
- Name: boxinst_r50_fpn_ms-90k_coco
In Collection: BoxInst
Config: configs/boxinst/boxinst_r50_fpn_ms-90k_coco.py
Metadata:
Iterations: 90000
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 39.4
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 30.8
Weights: https://download.openmmlab.com/mmdetection/v3.0/boxinst/boxinst_r50_fpn_ms-90k_coco/boxinst_r50_fpn_ms-90k_coco_20221228_163052-6add751a.pth
- Name: boxinst_r101_fpn_ms-90k_coco
In Collection: BoxInst
Config: configs/boxinst/boxinst_r101_fpn_ms-90k_coco.py
Metadata:
Iterations: 90000
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 41.8
- Task: Instance Segmentation
Dataset: COCO
Metrics:
mask AP: 32.7
Weights: https://download.openmmlab.com/mmdetection/v3.0/boxinst/boxinst_r101_fpn_ms-90k_coco/boxinst_r101_fpn_ms-90k_coco_20221229_145106-facf375b.pth
_base_ = ['../yolox/yolox_x_8xb8-300e_coco.py']
dataset_type = 'MOTChallengeDataset'
data_root = 'data/MOT17/'
img_scale = (1440, 800)  # width, height
batch_size = 4
detector = _base_.model
detector.pop('data_preprocessor')
detector.bbox_head.update(dict(num_classes=1))
detector.test_cfg.nms.update(dict(iou_threshold=0.7))
detector['init_cfg'] = dict(
type='Pretrained',
checkpoint= # noqa: E251
'https://download.openmmlab.com/mmdetection/v2.0/yolox/yolox_x_8x8_300e_coco/yolox_x_8x8_300e_coco_20211126_140254-1ef88d67.pth' # noqa: E501
)
del _base_.model
model = dict(
type='ByteTrack',
data_preprocessor=dict(
type='TrackDataPreprocessor',
pad_size_divisor=32,
# In ByteTrack the detector is trained jointly and tracking performance
# is evaluated on top of it; `use_det_processor=True` makes the model
# use the detector's own data_preprocessor. The detector can also be
# trained independently, as in StrongSORT.
use_det_processor=True,
batch_augments=[
dict(
type='BatchSyncRandomResize',
random_size_range=(576, 1024),
size_divisor=32,
interval=10)
]),
detector=detector,
tracker=dict(
type='ByteTracker',
motion=dict(type='KalmanFilter'),
obj_score_thrs=dict(high=0.6, low=0.1),
init_track_thr=0.7,
weight_iou_with_det_scores=True,
match_iou_thrs=dict(high=0.1, low=0.5, tentative=0.3),
num_frames_retain=30))
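The tracker thresholds above implement the BYTE association: detections above `obj_score_thrs.high` are matched to tracks first, then the remaining tracks get a second chance against low-score detections. A toy sketch of the two-stage idea using greedy IoU matching (the real tracker uses Kalman-predicted boxes and Hungarian assignment, and its IoU thresholds differ per stage):

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def byte_associate(tracks, dets, scores, high_thr=0.6, low_thr=0.1, iou_thr=0.5):
    """Two-stage BYTE association: match high-score dets, then low-score."""
    high = [i for i, s in enumerate(scores) if s >= high_thr]
    low = [i for i, s in enumerate(scores) if low_thr <= s < high_thr]
    matches, free_tracks = [], list(range(len(tracks)))
    for det_pool in (high, low):          # stage 1: high, stage 2: low
        for d in det_pool:
            best, best_iou = None, iou_thr
            for t in free_tracks:
                o = iou(tracks[t], dets[d])
                if o > best_iou:
                    best, best_iou = t, o
            if best is not None:
                matches.append((best, d))
                free_tracks.remove(best)
    return matches
```

The second stage is what lets occluded, low-confidence detections keep an existing track alive instead of being discarded.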
train_pipeline = [
dict(
type='Mosaic',
img_scale=img_scale,
pad_val=114.0,
bbox_clip_border=False),
dict(
type='RandomAffine',
scaling_ratio_range=(0.1, 2),
border=(-img_scale[0] // 2, -img_scale[1] // 2),
bbox_clip_border=False),
dict(
type='MixUp',
img_scale=img_scale,
ratio_range=(0.8, 1.6),
pad_val=114.0,
bbox_clip_border=False),
dict(type='YOLOXHSVRandomAug'),
dict(type='RandomFlip', prob=0.5),
dict(
type='Resize',
scale=img_scale,
keep_ratio=True,
clip_object_border=False),
dict(type='Pad', size_divisor=32, pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='FilterAnnotations', min_gt_bbox_wh=(1, 1), keep_empty=False),
dict(type='PackDetInputs')
]
test_pipeline = [
dict(
type='TransformBroadcaster',
transforms=[
dict(type='LoadImageFromFile', backend_args=_base_.backend_args),
dict(type='Resize', scale=img_scale, keep_ratio=True),
dict(
type='Pad',
size_divisor=32,
pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='LoadTrackAnnotations'),
]),
dict(type='PackTrackInputs')
]
train_dataloader = dict(
_delete_=True,
batch_size=batch_size,
num_workers=4,
persistent_workers=True,
pin_memory=True,
sampler=dict(type='DefaultSampler', shuffle=True),
dataset=dict(
type='MultiImageMixDataset',
dataset=dict(
type='ConcatDataset',
datasets=[
dict(
type='CocoDataset',
data_root='data/MOT17',
ann_file='annotations/half-train_cocoformat.json',
data_prefix=dict(img='train'),
filter_cfg=dict(filter_empty_gt=True, min_size=32),
metainfo=dict(classes=('pedestrian', )),
pipeline=[
dict(
type='LoadImageFromFile',
backend_args=_base_.backend_args),
dict(type='LoadAnnotations', with_bbox=True),
]),
dict(
type='CocoDataset',
data_root='data/crowdhuman',
ann_file='annotations/crowdhuman_train.json',
data_prefix=dict(img='train'),
filter_cfg=dict(filter_empty_gt=True, min_size=32),
metainfo=dict(classes=('pedestrian', )),
pipeline=[
dict(
type='LoadImageFromFile',
backend_args=_base_.backend_args),
dict(type='LoadAnnotations', with_bbox=True),
]),
dict(
type='CocoDataset',
data_root='data/crowdhuman',
ann_file='annotations/crowdhuman_val.json',
data_prefix=dict(img='val'),
filter_cfg=dict(filter_empty_gt=True, min_size=32),
metainfo=dict(classes=('pedestrian', )),
pipeline=[
dict(
type='LoadImageFromFile',
backend_args=_base_.backend_args),
dict(type='LoadAnnotations', with_bbox=True),
]),
]),
pipeline=train_pipeline))
val_dataloader = dict(
_delete_=True,
batch_size=1,
num_workers=2,
persistent_workers=True,
pin_memory=True,
drop_last=False,
# video_based
# sampler=dict(type='DefaultSampler', shuffle=False, round_up=False),
sampler=dict(type='TrackImgSampler'), # image_based
dataset=dict(
type=dataset_type,
data_root=data_root,
ann_file='annotations/half-val_cocoformat.json',
data_prefix=dict(img_path='train'),
test_mode=True,
pipeline=test_pipeline))
test_dataloader = val_dataloader
# optimizer
# default: 8 GPUs
base_lr = 0.001 / 8 * batch_size
optim_wrapper = dict(optimizer=dict(lr=base_lr))
# hyper-parameters
# training settings
max_epochs = 80
num_last_epochs = 10
interval = 5
train_cfg = dict(
type='EpochBasedTrainLoop',
max_epochs=max_epochs,
val_begin=70,
val_interval=1)
# learning policy
param_scheduler = [
dict(
# quadratic warmup over the first epoch
type='QuadraticWarmupLR',
by_epoch=True,
begin=0,
end=1,
convert_to_iter_based=True),
dict(
# cosine LR from epoch 1 to 70
type='CosineAnnealingLR',
eta_min=base_lr * 0.05,
begin=1,
T_max=max_epochs - num_last_epochs,
end=max_epochs - num_last_epochs,
by_epoch=True,
convert_to_iter_based=True),
dict(
# fixed LR during the last 10 epochs
type='ConstantLR',
by_epoch=True,
factor=1,
begin=max_epochs - num_last_epochs,
end=max_epochs,
)
]
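The three-phase schedule above can be sketched as a single LR curve. This is an illustrative approximation (boundary handling is simplified relative to MMEngine's iter-based conversion); `base_lr = 0.0005` assumes `batch_size = 4` as in this config:

```python
import math

base_lr = 0.0005              # 0.001 / 8 * batch_size with batch_size=4
eta_min = base_lr * 0.05
warmup_end, cosine_end = 1, 70

def lr_at(epoch: float) -> float:
    """LR under quadratic warmup -> cosine anneal -> constant tail."""
    if epoch < warmup_end:
        return base_lr * (epoch / warmup_end) ** 2
    if epoch < cosine_end:
        t = (epoch - warmup_end) / (cosine_end - warmup_end)
        return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t)) / 2
    return eta_min            # ConstantLR with factor=1 keeps the final value

print(lr_at(0.5))  # quarter of base_lr, mid-warmup
```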
custom_hooks = [
dict(
type='YOLOXModeSwitchHook',
num_last_epochs=num_last_epochs,
priority=48),
dict(type='SyncNormHook', priority=48),
dict(
type='EMAHook',
ema_type='ExpMomentumEMA',
momentum=0.0001,
update_buffers=True,
priority=49)
]
default_hooks = dict(
checkpoint=dict(
_delete_=True, type='CheckpointHook', interval=1, max_keep_ckpts=10),
visualization=dict(type='TrackVisualizationHook', draw=False))
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
type='TrackLocalVisualizer', vis_backends=vis_backends, name='visualizer')
# evaluator
val_evaluator = dict(
_delete_=True,
type='MOTChallengeMetric',
metric=['HOTA', 'CLEAR', 'Identity'],
postprocess_tracklet_cfg=[
dict(type='InterpolateTracklets', min_num_frames=5, max_num_frames=20)
])
test_evaluator = val_evaluator
# NOTE: `auto_scale_lr` is for automatically scaling LR,
# USER SHOULD NOT CHANGE ITS VALUES.
# base_batch_size = (8 GPUs) x (4 samples per GPU)
auto_scale_lr = dict(base_batch_size=32)
del detector
del _base_.tta_model
del _base_.tta_pipeline
del _base_.train_dataset
_base_ = [
'./bytetrack_yolox_x_8xb4-80e_crowdhuman-mot17halftrain_'
'test-mot17halfval.py'
]
dataset_type = 'MOTChallengeDataset'
img_scale = (1600, 896)  # width, height
model = dict(
data_preprocessor=dict(
type='TrackDataPreprocessor',
use_det_processor=True,
pad_size_divisor=32,
batch_augments=[
dict(type='BatchSyncRandomResize', random_size_range=(640, 1152))
]),
tracker=dict(
weight_iou_with_det_scores=False,
match_iou_thrs=dict(high=0.3),
))
train_pipeline = [
dict(
type='Mosaic',
img_scale=img_scale,
pad_val=114.0,
bbox_clip_border=True),
dict(
type='RandomAffine',
scaling_ratio_range=(0.1, 2),
border=(-img_scale[0] // 2, -img_scale[1] // 2),
bbox_clip_border=True),
dict(
type='MixUp',
img_scale=img_scale,
ratio_range=(0.8, 1.6),
pad_val=114.0,
bbox_clip_border=True),
dict(type='YOLOXHSVRandomAug'),
dict(type='RandomFlip', prob=0.5),
dict(
type='Resize',
scale=img_scale,
keep_ratio=True,
clip_object_border=True),
dict(type='Pad', size_divisor=32, pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='FilterAnnotations', min_gt_bbox_wh=(1, 1), keep_empty=False),
dict(type='PackDetInputs')
]
test_pipeline = [
dict(
type='TransformBroadcaster',
transforms=[
dict(type='LoadImageFromFile', backend_args=_base_.backend_args),
dict(type='Resize', scale=img_scale, keep_ratio=True),
dict(
type='Pad',
size_divisor=32,
pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='LoadTrackAnnotations'),
]),
dict(type='PackTrackInputs')
]
train_dataloader = dict(
dataset=dict(
type='MultiImageMixDataset',
dataset=dict(
type='ConcatDataset',
datasets=[
dict(
type='CocoDataset',
data_root='data/MOT20',
ann_file='annotations/train_cocoformat.json',
# TODO: mmdet uses `img` as the key, but `img_path` is needed
data_prefix=dict(img='train'),
filter_cfg=dict(filter_empty_gt=True, min_size=32),
metainfo=dict(classes=('pedestrian', )),
pipeline=[
dict(
type='LoadImageFromFile',
backend_args=_base_.backend_args),
dict(type='LoadAnnotations', with_bbox=True),
]),
dict(
type='CocoDataset',
data_root='data/crowdhuman',
ann_file='annotations/crowdhuman_train.json',
data_prefix=dict(img='train'),
filter_cfg=dict(filter_empty_gt=True, min_size=32),
metainfo=dict(classes=('pedestrian', )),
pipeline=[
dict(
type='LoadImageFromFile',
backend_args=_base_.backend_args),
dict(type='LoadAnnotations', with_bbox=True),
]),
dict(
type='CocoDataset',
data_root='data/crowdhuman',
ann_file='annotations/crowdhuman_val.json',
data_prefix=dict(img='val'),
filter_cfg=dict(filter_empty_gt=True, min_size=32),
metainfo=dict(classes=('pedestrian', )),
pipeline=[
dict(
type='LoadImageFromFile',
backend_args=_base_.backend_args),
dict(type='LoadAnnotations', with_bbox=True),
]),
]),
pipeline=train_pipeline))
val_dataloader = dict(
dataset=dict(ann_file='annotations/train_cocoformat.json'))
test_dataloader = dict(
dataset=dict(
data_root='data/MOT20', ann_file='annotations/test_cocoformat.json'))
test_evaluator = dict(
type='MOTChallengeMetric',
postprocess_tracklet_cfg=[
dict(type='InterpolateTracklets', min_num_frames=5, max_num_frames=20)
],
format_only=True,
outfile_prefix='./mot_20_test_res')
_base_ = [
'./bytetrack_yolox_x_8xb4-80e_crowdhuman-mot17halftrain_'
'test-mot17halfval.py'
]
# fp16 settings
optim_wrapper = dict(type='AmpOptimWrapper', loss_scale='dynamic')
val_cfg = dict(type='ValLoop', fp16=True)
test_cfg = dict(type='TestLoop', fp16=True)
_base_ = [
'./bytetrack/bytetrack_yolox_x_8xb4-amp-80e_crowdhuman-'
'mot17halftrain_test-mot17halfval.py'
]
test_dataloader = dict(
dataset=dict(
data_root='data/MOT17/',
ann_file='annotations/test_cocoformat.json',
data_prefix=dict(img_path='test')))
test_evaluator = dict(
type='MOTChallengeMetric',
postprocess_tracklet_cfg=[
dict(type='InterpolateTracklets', min_num_frames=5, max_num_frames=20)
],
format_only=True,
outfile_prefix='./mot_17_test_res')
_base_ = [
'./bytetrack_yolox_x_8xb4-80e_crowdhuman-mot20train_test-mot20test.py'
]
# fp16 settings
optim_wrapper = dict(type='AmpOptimWrapper', loss_scale='dynamic')
val_cfg = dict(type='ValLoop', fp16=True)
test_cfg = dict(type='TestLoop', fp16=True)
Collections:
- Name: ByteTrack
Metadata:
Training Techniques:
- SGD with Momentum
Training Resources: 8x V100 GPUs
Architecture:
- YOLOX
Paper:
URL: https://arxiv.org/abs/2110.06864
Title: 'ByteTrack: Multi-Object Tracking by Associating Every Detection Box'
README: configs/bytetrack/README.md
Models:
- Name: bytetrack_yolox_x_8xb4-amp-80e_crowdhuman-mot17halftrain_test-mot17halfval
In Collection: ByteTrack
Config: configs/bytetrack/bytetrack_yolox_x_8xb4-80e_crowdhuman-mot17halftrain_test-mot17halfval.py
Metadata:
Training Data: CrowdHuman + MOT17-half-train
Results:
- Task: Multiple Object Tracking
Dataset: MOT17-half-val
Metrics:
HOTA: 67.5
MOTA: 78.6
IDF1: 78.5
Weights: https://download.openmmlab.com/mmtracking/mot/bytetrack/bytetrack_yolox_x/bytetrack_yolox_x_crowdhuman_mot17-private-half_20211218_205500-1985c9f0.pth
- Name: bytetrack_yolox_x_8xb4-amp-80e_crowdhuman-mot17halftrain_test-mot17test
In Collection: ByteTrack
Config: configs/bytetrack/bytetrack_yolox_x_8xb4-amp-80e_crowdhuman-mot17halftrain_test-mot17test.py
Metadata:
Training Data: CrowdHuman + MOT17-half-train
Results:
- Task: Multiple Object Tracking
Dataset: MOT17-test
Metrics:
MOTA: 78.1
IDF1: 74.8
Weights: https://download.openmmlab.com/mmtracking/mot/bytetrack/bytetrack_yolox_x/bytetrack_yolox_x_crowdhuman_mot17-private-half_20211218_205500-1985c9f0.pth
- Name: bytetrack_yolox_x_8xb4-amp-80e_crowdhuman-mot20train_test-mot20test
In Collection: ByteTrack
Config: configs/bytetrack/bytetrack_yolox_x_8xb4-amp-80e_crowdhuman-mot20train_test-mot20test.py
Metadata:
Training Data: CrowdHuman + MOT20-train
Results:
- Task: Multiple Object Tracking
Dataset: MOT20-test
Metrics:
MOTA: 77.0
IDF1: 75.4
Weights: https://download.openmmlab.com/mmtracking/mot/bytetrack/bytetrack_yolox_x/bytetrack_yolox_x_crowdhuman_mot20-private_20220506_101040-9ce38a60.pth
_base_ = [
'../strongsort/yolox_x_8xb4-80e_crowdhuman-mot17halftrain_test-mot17halfval.py' # noqa: E501
]
# fp16 settings
optim_wrapper = dict(type='AmpOptimWrapper', loss_scale='dynamic')
_base_ = '../faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
model = dict(
data_preprocessor=dict(pad_size_divisor=64),
neck=dict(
type='FPN_CARAFE',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5,
start_level=0,
end_level=-1,
norm_cfg=None,
act_cfg=None,
order=('conv', 'norm', 'act'),
upsample_cfg=dict(
type='carafe',
up_kernel=5,
up_group=1,
encoder_kernel=3,
encoder_dilation=1,
compressed_channels=64)))