README.md 3.42 KB
Newer Older
1
2
3
4
# Mapillary Vistas

Introduced by Neuhold et al. in [The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes](http://openaccess.thecvf.com/content_ICCV_2017/papers/Neuhold_The_Mapillary_Vistas_ICCV_2017_paper.pdf)

zhe chen's avatar
zhe chen committed
5
Mapillary Vistas Dataset is a diverse street-level imagery dataset with pixel‑accurate and instance‑specific human annotations for understanding street scenes around the world.
6
7
8
9
10
11
12

We first pretrain our models on the Mapillary Vistas dataset, then finetune them on the Cityscapes dataset.

## Model Zoo

### UperNet + InternImage

zhe chen's avatar
zhe chen committed
13
14
15
16
|    backbone    | resolution | schd | train speed  | train time | #params | FLOPs |                            Config                            |                                                      Download                                                       |
| :------------: | :--------: | :--: | :----------: | :--------: | :-----: | :---: | :----------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------: |
| InternImage-L  |  512x1024  | 80k  | 0.50s / iter |   11.5h    |  256M   | 3234G | [config](./upernet_internimage_l_512x1024_80k_mapillary.py)  | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_l_512x1024_80k_mapillary.pth)  |
| InternImage-XL |  512x1024  | 80k  | 0.56s / iter |    13h     |  368M   | 4022G | [config](./upernet_internimage_xl_512x1024_80k_mapillary.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_xl_512x1024_80k_mapillary.pth) |
17
18
19

### SegFormerHead + InternImage

zhe chen's avatar
zhe chen committed
20
21
22
23
|    backbone    | resolution | schd | train speed  | train time | #params | FLOPs |                             Config                             |                                                       Download                                                        |
| :------------: | :--------: | :--: | :----------: | :--------: | :-----: | :---: | :------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------: |
| InternImage-L  |  512x1024  | 80k  | 0.37s / iter |     9h     |  220M   | 1580G | [config](./segformer_internimage_l_512x1024_80k_mapillary.py)  | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/segformer_internimage_l_512x1024_80k_mapillary.pth)  |
| InternImage-XL |  512x1024  | 80k  | 0.43s / iter |    10h     |  330M   | 2364G | [config](./segformer_internimage_xl_512x1024_80k_mapillary.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/segformer_internimage_xl_512x1024_80k_mapillary.pth) |
zhe chen's avatar
zhe chen committed
24
25
26
27
28
29

### Mask2Former + InternImage

|   backbone    | resolution | schd | #params | FLOPs |                             Config                             |                                                       Download                                                        |
| :-----------: | :--------: | :--: | :-----: | :---: | :------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------: |
| InternImage-H |  896x896   | 80k  |  1094M  | 7878G | [config](./mask2former_internimage_h_896x896_80k_mapillary.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_896x896_80k_mapillary.pth) |