Commit 41b18fd8 authored by zhe chen

Use pre-commit to reformat code


parent ff20ea39
@@ -6,11 +6,10 @@

The Common Objects in COntext-stuff (COCO-Stuff) dataset is a scene-understanding dataset for tasks such as semantic segmentation, object detection, and image captioning. It was built by augmenting the original COCO dataset, which annotated things while neglecting stuff, with dense stuff annotations. COCO-Stuff-164K contains 164k images spanning 172 categories: 80 thing classes, 91 stuff classes, and 1 unlabeled class.

## Model Zoo

### Mask2Former + InternImage

| backbone | resolution | mIoU (ss) | train speed | train time | #param | FLOPs | Config | Download |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| InternImage-H | 896x896 | 52.6 | 1.6s / iter | 1.5d (2n) | 1.31B | 4635G | [config](./mask2former_internimage_h_896_80k_cocostuff164k_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_896_80k_cocostuff164k.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_896_80k_cocostuff164k.log.json) |
# Mapillary Vistas

Introduced by Neuhold et al. in [The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes](http://openaccess.thecvf.com/content_ICCV_2017/papers/Neuhold_The_Mapillary_Vistas_ICCV_2017_paper.pdf)

The Mapillary Vistas Dataset is a diverse street-level imagery dataset with pixel-accurate and instance-specific human annotations for understanding street scenes around the world. We first pretrain our models on the Mapillary Vistas dataset, then finetune them on the Cityscapes dataset.

## Model Zoo

### UperNet + InternImage

| backbone | resolution | schd | train speed | train time | #params | FLOPs | Config | Download |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| InternImage-L | 512x1024 | 80k | 0.50s / iter | 11.5h | 256M | 3234G | [config](./upernet_internimage_l_512x1024_80k_mapillary.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_l_512x1024_80k_mapillary.pth) |
| InternImage-XL | 512x1024 | 80k | 0.56s / iter | 13h | 368M | 4022G | [config](./upernet_internimage_xl_512x1024_80k_mapillary.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_xl_512x1024_80k_mapillary.pth) |

### SegFormerHead + InternImage

| backbone | resolution | schd | train speed | train time | #params | FLOPs | Config | Download |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| InternImage-L | 512x1024 | 80k | 0.37s / iter | 9h | 220M | 1580G | [config](./segformer_internimage_l_512x1024_80k_mapillary.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/segformer_internimage_l_512x1024_80k_mapillary.pth) |
| InternImage-XL | 512x1024 | 80k | 0.43s / iter | 10h | 330M | 2364G | [config](./segformer_internimage_xl_512x1024_80k_mapillary.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/segformer_internimage_xl_512x1024_80k_mapillary.pth) |
@@ -6,9 +6,9 @@ import os.path as osp
from functools import partial

import mmcv
import mmcv_custom
import mmseg_custom
import torch.multiprocessing as mp
from mmdeploy.apis import (create_calib_input_data, extract_model,
                           get_predefined_partition_cfg, torch2onnx,
                           torch2torchscript, visualize_model)
@@ -18,9 +18,8 @@ from mmdeploy.backend.sdk.export_info import export2SDK
from mmdeploy.utils import (IR, Backend, get_backend, get_calib_filename,
                            get_ir_config, get_partition_config,
                            get_root_logger, load_config, target_wrapper)
from torch.multiprocessing import Process, set_start_method


def parse_args():
    parser = argparse.ArgumentParser(description='Export model to backends.')
@@ -242,9 +241,8 @@ def main():
    # ncnn quantization
    if backend == Backend.NCNN and quant:
        from mmdeploy.apis.ncnn import get_quant_model_file, ncnn2int8
        from onnx2ncnn_quant_table import get_table

        model_param_paths = backend_files[::2]
        model_bin_paths = backend_files[1::2]
        backend_files = []
...
# Copyright (c) OpenMMLab. All rights reserved.
import argparse

import mmcv_custom  # noqa: F401,F403
import mmseg_custom  # noqa: F401,F403
import numpy as np
import torch
from mmcv import Config, DictAction
from mmseg.models import build_segmentor

try:
    from mmcv.cnn import get_model_complexity_info
    from mmcv.cnn.utils.flops_counter import flops_to_string, params_to_string
except ImportError:
    raise ImportError('Please upgrade mmcv to >0.6.2')
@@ -44,9 +43,11 @@ def parse_args():
    args = parser.parse_args()
    return args


def dcnv3_flops(n, k, c):
    return 5 * n * k * c
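The one-line cost model above is easy to sanity-check in isolation. A minimal sketch follows; the reading of `n`, `k`, and `c` as spatial locations, sampling points per location, and channels is inferred from typical DCNv3 usage, not stated in the diff:

```python
# Hedged sketch of the FLOPs rule above: the estimated cost grows
# linearly in the number of spatial locations (n), sampling points
# per location (k), and channels (c), with an assumed factor of 5
# ops per sampled value.
def dcnv3_flops(n, k, c):
    return 5 * n * k * c

# e.g. a 56x56 feature map with 9 sampling points and 64 channels
print(dcnv3_flops(56 * 56, 9, 64))  # 9031680
```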

def get_flops(model, input_shape):
    flops, params = get_model_complexity_info(model, input_shape, as_strings=False)
@@ -66,7 +67,7 @@ def get_flops(model, input_shape):
        flops = flops + temp
    return flops_to_string(flops), params_to_string(params)


if __name__ == '__main__':
    args = parse_args()
@@ -93,7 +94,7 @@ if __name__ == '__main__':
        cfg.model,
        train_cfg=cfg.get('train_cfg'),
        test_cfg=cfg.get('test_cfg'))
    if torch.cuda.is_available():
        model.cuda()
    model.eval()
...
# Copyright (c) OpenMMLab. All rights reserved.
import os
import os.path as osp
from argparse import ArgumentParser

import cv2
import mmcv
import mmcv_custom  # noqa: F401,F403
import mmseg_custom  # noqa: F401,F403
from mmcv.runner import load_checkpoint
from mmseg.apis import inference_segmentor, init_segmentor, show_result_pyplot
from mmseg.core import get_classes
from mmseg.core.evaluation import get_palette

def test_single_image(model, img_name, out_dir, color_palette, opacity):
    # check whether img_name is an image file
    assumed_imgformat = ('.png', '.jpg', '.jpeg', '.tiff', '.bmp', '.gif')
    if not img_name.lower().endswith(assumed_imgformat):
        print(f'Skip {img_name} because it is not an image file.')
        return

    result = inference_segmentor(model, img_name)
@@ -34,7 +33,7 @@ def test_single_image(model, img_name, out_dir, color_palette, opacity):
    mmcv.mkdir_or_exist(out_dir)
    out_path = osp.join(out_dir, osp.basename(img_name))
    cv2.imwrite(out_path, img)
    print(f'Result is saved at {out_path}')

def main():
@@ -43,7 +42,7 @@ def main():
        'img', help='Image file or a directory containing images')
    parser.add_argument('config', help='Config file')
    parser.add_argument('checkpoint', help='Checkpoint file')
    parser.add_argument('--out', type=str, default='demo', help='out dir')
    parser.add_argument(
        '--device', default='cuda:0', help='Device used for inference')
    parser.add_argument(
...
@@ -5,6 +5,7 @@
# --------------------------------------------------------
# -*- coding: utf-8 -*-
from .custom_layer_decay_optimizer_constructor import \
    CustomLayerDecayOptimizerConstructor

__all__ = ['CustomLayerDecayOptimizerConstructor',]
@@ -10,13 +10,13 @@ https://github.com/microsoft/unilm/blob/master/beit/semantic_segmentation/mmcv_c
import json

from mmcv.runner import (OPTIMIZER_BUILDERS, DefaultOptimizerConstructor,
                         get_dist_info)
from mmseg.utils import get_root_logger


def get_num_layer_for_swin(var_name, num_max_layer, depths):
    if var_name.startswith('backbone.patch_embed'):
        return 0
    elif var_name.startswith('decode_head.mask_embed'):
        return 0
@@ -28,12 +28,12 @@ def get_num_layer_for_swin(var_name, num_max_layer, depths):
        return 0
    elif var_name.startswith('decode_head.query_feat'):
        return 0
    if var_name.startswith('backbone.cb_modules.0.patch_embed'):
        return 0
    elif 'level_embeds' in var_name:
        return 0
    elif var_name.startswith('backbone.layers') or var_name.startswith(
            'backbone.levels'):
        if var_name.split('.')[3] not in ['downsample', 'norm']:
            stage_id = int(var_name.split('.')[2])
            layer_id = int(var_name.split('.')[4])
@@ -86,64 +86,64 @@ class CustomLayerDecayOptimizerConstructor(DefaultOptimizerConstructor):
        depths = self.paramwise_cfg.get('depths')
        offset_lr_scale = self.paramwise_cfg.get('offset_lr_scale', 1.0)
        logger.info('Build CustomLayerDecayOptimizerConstructor %f - %d' %
                    (layer_decay_rate, num_layers))
        weight_decay = self.base_wd

        for name, param in module.named_parameters():
            if not param.requires_grad:
                continue  # frozen weights
            if len(param.shape) == 1 or name.endswith('.bias') or \
                    'relative_position' in name or \
                    'norm' in name or \
                    'sampling_offsets' in name:
                group_name = 'no_decay'
                this_weight_decay = 0.
            else:
                group_name = 'decay'
                this_weight_decay = weight_decay

            layer_id = get_num_layer_for_swin(name, num_layers, depths)

            if layer_id == num_layers - 1 and dino_head and \
                    ('sampling_offsets' in name or 'reference_points' in name):
                group_name = 'layer_%d_%s_0.1x' % (layer_id, group_name)
            elif ('sampling_offsets' in name or 'reference_points' in name) \
                    and 'backbone' in name:
                group_name = 'layer_%d_%s_offset_lr_scale' % (layer_id,
                                                              group_name)
            else:
                group_name = 'layer_%d_%s' % (layer_id, group_name)

            if group_name not in parameter_groups:
                scale = layer_decay_rate ** (num_layers - layer_id - 1)
                if scale < 1 and backbone_small_lr == True:
                    scale = scale * 0.1
                if '0.1x' in group_name:
                    scale = scale * 0.1
                if 'offset_lr_scale' in group_name:
                    scale = scale * offset_lr_scale
                parameter_groups[group_name] = {
                    'weight_decay': this_weight_decay,
                    'params': [],
                    'param_names': [],
                    'lr_scale': scale,
                    'group_name': group_name,
                    'lr': scale * self.base_lr,
                }
            parameter_groups[group_name]['params'].append(param)
            parameter_groups[group_name]['param_names'].append(name)

        rank, _ = get_dist_info()
        if rank == 0:
            to_display = {}
            for key in parameter_groups:
                to_display[key] = {
                    'param_names': parameter_groups[key]['param_names'],
                    'lr_scale': parameter_groups[key]['lr_scale'],
                    'lr': parameter_groups[key]['lr'],
                    'weight_decay': parameter_groups[key]['weight_decay'],
                }
            logger.info('Param groups = %s' % json.dumps(to_display, indent=2))

        # state_dict = module.state_dict()
        # for group_name in parameter_groups:
@@ -151,4 +151,4 @@ class CustomLayerDecayOptimizerConstructor(DefaultOptimizerConstructor):
        #     for name in group["param_names"]:
        #         group["params"].append(state_dict[name])
        params.extend(parameter_groups.values())
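The grouping logic above is hard to follow in diff form. A minimal, self-contained sketch of just the layer-decay scale rule it applies (the base learning rate below is a hypothetical value, not one from the configs):

```python
# Sketch of the lr scaling in CustomLayerDecayOptimizerConstructor:
# each parameter group's lr is the base lr times
# layer_decay_rate ** (num_layers - layer_id - 1), so the deepest
# layer trains at the full lr and earlier layers are damped
# geometrically.
def lr_scale(layer_id, num_layers, layer_decay_rate):
    return layer_decay_rate ** (num_layers - layer_id - 1)

base_lr = 2e-5  # hypothetical base learning rate
for layer_id in (0, 6, 11):
    print(layer_id, lr_scale(layer_id, 12, 0.9) * base_lr)
```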
@@ -4,6 +4,6 @@
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------

from .core import *  # noqa: F401,F403
from .datasets import *  # noqa: F401,F403
from .models import *  # noqa: F401,F403
# Copyright (c) OpenMMLab. All rights reserved.
from .dataset_wrappers import ConcatDataset
from .mapillary import MapillaryDataset  # noqa: F401,F403
from .nyu_depth_v2 import NYUDepthV2Dataset  # noqa: F401,F403
from .pipelines import *  # noqa: F401,F403

__all__ = [
    'MapillaryDataset', 'NYUDepthV2Dataset', 'ConcatDataset'
]
@@ -5,9 +5,8 @@ from itertools import chain
import mmcv
import numpy as np
from mmcv.utils import build_from_cfg, print_log
from mmseg.datasets.builder import DATASETS
from torch.utils.data.dataset import ConcatDataset as _ConcatDataset


@DATASETS.register_module(force=True)
...
@@ -45,4 +45,4 @@ class MapillaryDataset(CustomDataset):
            img_suffix='.jpg',
            seg_map_suffix='.png',
            reduce_zero_label=False,
            **kwargs)
@@ -21,7 +21,6 @@ class NYUDepthV2Dataset(CustomDataset):
               'person', 'night stand', 'toilet', 'sink', 'lamp',
               'bathtub', 'bag', 'otherstructure', 'otherfurniture', 'otherprop')

    PALETTE = [[120, 120, 120], [180, 120, 120], [6, 230, 230], [80, 50, 50],
               [4, 200, 3], [120, 120, 80], [140, 140, 140], [204, 5, 255],
               [230, 230, 230], [4, 250, 7], [224, 5, 255], [235, 255, 7],
@@ -40,4 +39,3 @@ class NYUDepthV2Dataset(CustomDataset):
            split=split,
            reduce_zero_label=True,
            **kwargs)
@@ -93,7 +93,7 @@ class SETR_Resize(object):
            ``img_scale`` is sampled scale and None is just a placeholder
            to be consistent with :func:`random_select`.
        """

        assert mmcv.is_list_of(img_scales, tuple) and len(img_scales) == 2
        img_scale_long = [max(s) for s in img_scales]
        img_scale_short = [min(s) for s in img_scales]
@@ -105,7 +105,7 @@ class SETR_Resize(object):
                                       max(img_scale_short) + 1)
        img_scale = (long_edge, short_edge)
        return img_scale, None

    @staticmethod
    def random_sample_ratio(img_scale, ratio_range):
        """Randomly sample an img_scale when ``ratio_range`` is specified.
...
@@ -9,4 +9,4 @@ from .decode_heads import *  # noqa: F401,F403
from .losses import *  # noqa: F401,F403
from .plugins import *  # noqa: F401,F403
from .segmentors import *  # noqa: F401,F403
from .utils import *  # noqa: F401,F403
@@ -4,18 +4,18 @@
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------

from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.checkpoint as checkpoint
from mmcv.cnn import constant_init, trunc_normal_init
from mmcv.runner import _load_checkpoint
from mmseg.models.builder import BACKBONES
from mmseg.utils import get_root_logger
from ops_dcnv3 import modules as dcnv3
from timm.models.layers import DropPath, trunc_normal_

class to_channels_first(nn.Module):
@@ -86,7 +86,7 @@ class CrossAttention(nn.Module):
        attn_head_dim (int, optional): Dimension of attention head.
        out_dim (int, optional): Dimension of output.
    """

    def __init__(self,
                 dim,
                 num_heads=8,
@@ -178,7 +178,7 @@ class AttentiveBlock(nn.Module):
        attn_head_dim (int, optional): Dimension of attention head. Default: None.
        out_dim (int, optional): Dimension of output. Default: None.
    """

    def __init__(self,
                 dim,
                 num_heads,
@@ -187,7 +187,7 @@ class AttentiveBlock(nn.Module):
                 drop=0.,
                 attn_drop=0.,
                 drop_path=0.,
                 norm_layer='LN',
                 attn_head_dim=None,
                 out_dim=None):
        super().__init__()
@@ -593,10 +593,10 @@ class InternImage(nn.Module):
        logger.info(f'using activation layer: {act_layer}')
        logger.info(f'using main norm layer: {norm_layer}')
        logger.info(f'using dpr: {drop_path_type}, {drop_path_rate}')
        logger.info(f'level2_post_norm: {level2_post_norm}')
        logger.info(f'level2_post_norm_block_ids: {level2_post_norm_block_ids}')
        logger.info(f'res_post_norm: {res_post_norm}')
        logger.info(f'use_dcn_v4_op: {use_dcn_v4_op}')

        in_chans = 3
        self.patch_embed = StemLayer(in_chans=in_chans,
...
# Copyright (c) OpenMMLab. All rights reserved.
import warnings  # noqa: F401,F403

from mmcv.utils import Registry
...
@@ -388,8 +388,8 @@ class MaskFormerHead(BaseDecodeHead):
        # shape [num_gts, h, w] -> [num_gts * h * w]
        mask_targets = mask_targets.reshape(-1)
        # target is (1 - mask_targets) !!!
        print('mask_pred:', mask_preds.shape)
        print('mask_targets:', mask_targets.shape)
        loss_mask = self.loss_mask(
            mask_preds, 1 - mask_targets, avg_factor=num_total_masks * h * w)
...
@@ -176,4 +176,4 @@ class DiceLoss(nn.Module):
            reduction=reduction,
            avg_factor=avg_factor)
        return loss
@@ -265,4 +265,4 @@ class MSDeformAttnPixelDecoder(BaseModule):
        multi_scale_features = outs[:self.num_outs]
        mask_feature = self.mask_feature(outs[-1])
        return mask_feature, multi_scale_features
@@ -84,4 +84,4 @@ def get_uncertain_point_coords_with_randomness(mask_pred, labels, num_points,
    rand_roi_coords = torch.rand(
        batch_size, num_random_points, 2, device=mask_pred.device)
    point_coords = torch.cat((point_coords, rand_roi_coords), dim=1)
    return point_coords
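The point-padding step in the last hunk is easy to check in isolation. A shape-level NumPy analogue (the real code uses torch tensors on the prediction's device; the counts below are made up for illustration):

```python
import numpy as np

# Analogue of the hunk above: importance-sampled point coordinates
# are padded with uniformly random ones along the points axis, so
# every image ends up with the same total number of coordinates.
batch_size, num_uncertain_points, num_random_points = 2, 12, 4
point_coords = np.random.rand(batch_size, num_uncertain_points, 2)
rand_roi_coords = np.random.rand(batch_size, num_random_points, 2)
point_coords = np.concatenate((point_coords, rand_roi_coords), axis=1)
print(point_coords.shape)  # (2, 16, 2)
```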