Commit 0d97cc8c authored by Sugon_ldc
add new model

# GINet: Graph Interaction Network for Scene Parsing
## Reference
> Wu, Tianyi, Yu Lu, Yu Zhu, Chuang Zhang, Ming Wu, Zhanyu Ma, and Guodong Guo. "GINet: Graph interaction network for scene parsing." In European Conference on Computer Vision, pp. 34-51. Springer, Cham, 2020.
## Performance
### Cityscapes
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|GINet|ResNet50_OS8|1024x512|80000|78.66%|79.07%|79.2%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/ginet_resnet50_os8_cityscapes_1024x512_80k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/ginet_resnet50_os8_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=bb439dc87b311c4105c426eadd5a641e) |
|GINet|ResNet101_OS8|1024x512|80000|78.4%|78.72%|78.99%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/ginet_resnet101_os8_cityscapes_1024x512_80k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/ginet_resnet101_os8_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=ffae8d094b755a4313d6e02540de9515) |
### Pascal VOC 2012 + Aug
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|GINet|ResNet50_OS8|512x512|40000|81.97%|82.02%|81.65%|[model](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/ginet_resnet50_os8_voc12aug_512x512_40k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/ginet_resnet50_os8_voc12aug_512x512_40k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=638ff8bcc88575489ee36da0edad51b6) |
|GINet|ResNet101_OS8|512x512|40000|79.79%|79.99%|80.6%|[model](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/ginet_resnet101_os8_voc12aug_512x512_40k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/ginet_resnet101_os8_voc12aug_512x512_40k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=a1f7d1040f371585d9aac1610116f594) |
### ADE20K
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|GINet|ResNet50_OS8|520x520|150000|43.45%|43.98%|43.80%|[model](https://paddleseg.bj.bcebos.com/dygraph/ade20k/ginet_resnet50_os8_ade20k_520x520_150k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/ade20k/ginet_resnet50_os8_ade20k_520x520_150k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=665901e12a35319710197380a5dfafa5) |
|GINet|ResNet101_OS8|520x520|150000|45.79%|45.94%|46.18%|[model](https://paddleseg.bj.bcebos.com/dygraph/ade20k/ginet_resnet101_os8_ade20k_520x520_150k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/ade20k/ginet_resnet101_os8_ade20k_520x520_150k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=46b63c18e421e2a0ba95faefdc8d5c39) |
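The mIoU columns in the tables above report mean intersection-over-union. As a minimal sketch of how that metric is commonly computed, assuming the standard per-class definition: the function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and label are illustrative assumptions, not PaddleSeg's evaluation code.

```python
def mean_iou(pred, label, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or the label.

    `pred` and `label` are flat sequences of integer class ids of equal length.
    Pixels whose label equals `ignore_index` are excluded from all counts.
    """
    intersect = [0] * num_classes
    pred_area = [0] * num_classes
    label_area = [0] * num_classes
    for p, t in zip(pred, label):
        if t == ignore_index:
            continue
        pred_area[p] += 1
        label_area[t] += 1
        if p == t:
            intersect[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + label_area[c] - intersect[c]
        if union > 0:  # assumption: classes absent everywhere are skipped
            ious.append(intersect[c] / union)
    return sum(ious) / len(ious) if ious else 0.0
```

Evaluation tools differ on how they treat classes that never occur, so small differences against published numbers are possible with a sketch like this.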
_base_: '../_base_/ade20k.yml'
batch_size: 4
iters: 150000
train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [520, 520]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.4
      contrast_range: 0.4
      saturation_range: 0.4
    - type: Normalize
model:
  type: GINet
  backbone:
    type: ResNet101_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet101_vd_ssld.tar.gz
  backbone_indices: [0, 1, 2, 3]
  enable_auxiliary_loss: True
  jpu: True
  align_corners: True
  pretrained: null
lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.005
  end_lr: 0
  power: 0.9
loss:
  types:
    - type: CrossEntropyLoss
    - type: CrossEntropyLoss
  coef: [1, 0.4]
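The `lr_scheduler` entry in the config above selects `PolynomialDecay` with `learning_rate: 0.005`, `end_lr: 0` and `power: 0.9`. A minimal sketch of the usual non-cycling polynomial decay formula follows; it assumes the decay runs over the full `iters: 150000`, and the function name `polynomial_decay_lr` is illustrative rather than a framework API.

```python
def polynomial_decay_lr(step, total_steps=150000, base_lr=0.005, end_lr=0.0, power=0.9):
    """Learning rate at `step` under non-cycling polynomial decay.

    lr = (base_lr - end_lr) * (1 - step / total_steps) ** power + end_lr
    Steps past `total_steps` are clamped, so the rate stays at `end_lr`.
    """
    step = min(step, total_steps)
    return (base_lr - end_lr) * (1 - step / total_steps) ** power + end_lr


if __name__ == "__main__":
    for s in (0, 75000, 150000):
        print(s, polynomial_decay_lr(s))
```

With `power` below 1 the curve stays above a straight line between `base_lr` and `end_lr`, so the rate falls slowly at first and fastest near the end of training.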
_base_: 'ginet_resnet50_os8_cityscapes_1024x512_80k.yml'
model:
  backbone:
    type: ResNet101_vd
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet101_vd_ssld.tar.gz
_base_: 'ginet_resnet50_os8_voc12aug_512x512_40k.yml'
model:
  backbone:
    type: ResNet101_vd
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet101_vd_ssld.tar.gz
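The two short ResNet101 configs above name a ResNet50 config in `_base_` and list only the backbone keys that change. A minimal sketch of that kind of recursive override follows, assuming nested mappings are merged key by key rather than replaced wholesale; the helper `merge_config` and the inline dictionaries are illustrative, not PaddleSeg's actual loader.

```python
import copy


def merge_config(base, override):
    """Return a new dict where `override` is applied on top of `base`.

    Nested dicts are merged recursively; any other value replaces the base value.
    Neither input is modified.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical, trimmed-down stand-ins for the base and child files.
base = {
    "model": {
        "type": "GINet",
        "backbone": {"type": "ResNet50_vd", "output_stride": 8},
        "jpu": True,
    }
}
child = {"model": {"backbone": {"type": "ResNet101_vd"}}}
merged = merge_config(base, child)
```

Under this assumption the child keeps `output_stride: 8` and every other model setting from the base file, which is why the ResNet101 variants can be so short.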
_base_: '../_base_/ade20k.yml'
batch_size: 4
iters: 150000
train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [520, 520]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.4
      contrast_range: 0.4
      saturation_range: 0.4
    - type: Normalize
model:
  type: GINet
  backbone:
    type: ResNet50_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
  backbone_indices: [0, 1, 2, 3]
  enable_auxiliary_loss: True
  jpu: True
  align_corners: True
  pretrained: null
lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.005
  end_lr: 0
  power: 0.9
loss:
  types:
    - type: CrossEntropyLoss
    - type: CrossEntropyLoss
  coef: [1, 0.4]
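The `ResizeStepScaling` transform in the config above is parameterised by `min_scale_factor: 0.5`, `max_scale_factor: 2.0` and `scale_step_size: 0.25`. One plausible reading is sketched below: a scale is drawn from an evenly spaced grid that includes both endpoints. The helper names `candidate_scales` and `pick_scale` are illustrative assumptions, and the real transform may sample differently.

```python
import random


def candidate_scales(min_scale=0.5, max_scale=2.0, step=0.25):
    """Evenly spaced scale factors from min_scale to max_scale, both inclusive."""
    count = int(round((max_scale - min_scale) / step)) + 1
    return [min_scale + i * step for i in range(count)]


def pick_scale(rng=random):
    """Draw one scale factor uniformly from the candidate grid."""
    return rng.choice(candidate_scales())


if __name__ == "__main__":
    print(candidate_scales())
    print(pick_scale(random.Random(0)))
```

After scaling, the `RandomPaddingCrop` entry with `crop_size: [520, 520]` brings every sample back to a fixed training size.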
_base_: '../_base_/cityscapes.yml'
batch_size: 2
iters: 80000
model:
  type: GINet
  backbone:
    type: ResNet50_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
  backbone_indices: [0, 1, 2, 3]
  enable_auxiliary_loss: True
  jpu: True
  align_corners: True
  pretrained: null
loss:
  types:
    - type: CrossEntropyLoss
  coef: [1, 0.4]
_base_: '../_base_/pascal_voc12aug.yml'
model:
  type: GINet
  backbone:
    type: ResNet50_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
  backbone_indices: [0, 1, 2, 3]
  enable_auxiliary_loss: True
  jpu: True
  align_corners: True
  pretrained: null
loss:
  types:
    - type: CrossEntropyLoss
  coef: [1, 0.4]
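The `loss` section above pairs `enable_auxiliary_loss: True` with `coef: [1, 0.4]`. A minimal sketch of how such coefficients are typically applied follows, assuming the first weight belongs to the main output and the second to the auxiliary output, and that the per-output losses are already scalar values. The function name `total_loss` is illustrative.

```python
def total_loss(losses, coef):
    """Weighted sum of per-output losses, one coefficient per output."""
    if len(losses) != len(coef):
        raise ValueError("expected one coefficient per loss value")
    return sum(weight * value for weight, value in zip(coef, losses))


if __name__ == "__main__":
    # Hypothetical loss values for the main and auxiliary heads.
    print(total_loss([2.0, 1.0], [1, 0.4]))
```

Weighting the auxiliary term below 1 keeps it as a training aid without letting it dominate the main segmentation loss.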
# Graph-Based Global Reasoning Networks
## Reference
> Chen, Yunpeng, Marcus Rohrbach, Zhicheng Yan, Yan Shuicheng, Jiashi Feng, and Yannis Kalantidis. "Graph-based global reasoning networks." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 433-442. 2019.
## Performance
### Cityscapes
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|GloRe|ResNet50_OS8|1024x512|80000|78.26%|78.61%|78.72%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/glore_resnet50_os8_cityscapes_1024x512_80k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/glore_resnet50_os8_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=de754e39ac9de4d2e951915c2334d6ec) |
### Pascal VOC 2012 + Aug
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|GloRe|ResNet50_OS8|512x512|40000|80.16%|80.35%|80.40%|[model](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/glore_resnet50_os8_voc12aug_512x512_40k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/glore_resnet50_os8_voc12aug_512x512_40k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=e40c1dd8d4fcbf2dcda01242dec9d9b5) |
_base_: '../_base_/cityscapes.yml'
batch_size: 2
iters: 80000
learning_rate:
  decay:
    end_lr: 1.0e-5
loss:
  types:
    - type: CrossEntropyLoss
  coef: [1, 0.4]
model:
  type: GloRe
  backbone:
    type: ResNet50_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
  enable_auxiliary_loss: True
  align_corners: False
  pretrained: null
_base_: '../_base_/pascal_voc12aug.yml'
model:
  type: GloRe
  backbone:
    type: ResNet50_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
  enable_auxiliary_loss: True
  align_corners: False
  pretrained: null
loss:
  types:
    - type: CrossEntropyLoss
  coef: [1, 0.4]
# Gated-SCNN: Gated Shape CNNs for Semantic Segmentation
## Reference
> Takikawa, Towaki, David Acuna, Varun Jampani, and Sanja Fidler. "Gated-scnn: Gated shape cnns for semantic segmentation." In Proceedings of the IEEE International Conference on Computer Vision, pp. 5229-5238. 2019.
## Performance
### Cityscapes
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|GSCNN|ResNet50_OS8|1024x512|80000|80.67%|80.88%|80.88%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/gscnn_resnet50_os8_cityscapes_1024x512_80k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/gscnn_resnet50_os8_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=11b79b6a2899739c0d009b1ce34bad77)|
_base_: '../../configs/_base_/cityscapes.yml'
batch_size: 2
iters: 80000
model:
  type: GSCNN
  backbone:
    type: ResNet50_vd
    output_stride: 8
    multi_grid: [1, 2, 4]
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
  num_classes: 19
  backbone_indices: [0, 1, 2, 3]
  aspp_ratios: [1, 12, 24, 36]
  aspp_out_channels: 256
  align_corners: False
  pretrained: null
loss:
  types:
    - type: CrossEntropyLoss
    - type: EdgeAttentionLoss
    - type: BCELoss
      edge_label: True
    - type: DualTaskLoss
  coef: [1, 1, 20, 1]
train_dataset:
  edge: True
# HarDNet: A Low Memory Traffic Network
## Reference
> Chao, Ping, Chao-Yang Kao, Yu-Shan Ruan, Chien-Hsiang Huang, and Youn-Long Lin. "Hardnet: A low memory traffic network." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3552-3561. 2019.
## Performance
### Cityscapes
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|-|-|-|-|-|-|-|-|
|HarDNet|-|1024x1024|160000|79.03%|79.49%|79.76%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/hardnet_cityscapes_1024x1024_160k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/hardnet_cityscapes_1024x1024_160k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=b90b16bff783a97baed70313821fe551)|
_base_: '../_base_/cityscapes_1024x1024.yml'
batch_size: 4
iters: 160000
lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.02
optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 5.0e-4
model:
  type: HarDNet
  pretrained: null
loss:
  types:
    - type: BootstrappedCrossEntropyLoss
      min_K: 4096
      loss_th: 0.3
  coef: [1]
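The HarDNet config above uses `BootstrappedCrossEntropyLoss` with `min_K: 4096` and `loss_th: 0.3`. A minimal sketch of the usual bootstrapping idea follows: average only the hard pixels whose loss exceeds the threshold, but never fewer than the `min_K` largest. It assumes per-pixel loss values are already computed; the function name `bootstrapped_mean` is illustrative, and the library's implementation may differ in detail.

```python
def bootstrapped_mean(pixel_losses, min_k=4096, loss_th=0.3):
    """Mean loss over hard pixels only.

    If more than `min_k` pixels have loss above `loss_th`, all of them are kept.
    Otherwise the `min_k` largest losses are kept (or every pixel if fewer exist).
    """
    ordered = sorted(pixel_losses, reverse=True)
    if not ordered:
        return 0.0
    hard = [value for value in ordered if value > loss_th]
    kept = hard if len(hard) > min_k else ordered[:min_k]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Tiny example with min_k lowered so the behaviour is visible.
    print(bootstrapped_mean([0.9, 0.8, 0.5, 0.05], min_k=2, loss_th=0.3))
```

Focusing the average on high-loss pixels keeps easy regions, which dominate most street scenes, from washing out the gradient signal.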
_base_: '../_base_/cityscapes.yml'
batch_size: 2
iters: 60000
optimizer:
  type: sgd
  weight_decay: 0.0002
loss:
  types:
    - type: CrossEntropyLoss
    - type: PixelContrastCrossEntropyLoss
      temperature: 0.1
      base_temperature: 0.07
      ignore_index: 255
      max_samples: 1024
      max_views: 100
  coef: [1, 0.1]
model:
  type: HRNetW48Contrast
  backbone:
    type: HRNet_W48
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/hrnet_w48_ssld.tar.gz
  in_channels: 720
  drop_prob: 0.1
  proj_dim: 720
# Exploring Cross-Image Pixel Contrast for Semantic Segmentation
## Reference
> Wenguan Wang, Tianfei Zhou, Fisher Yu, Jifeng Dai, Ender Konukoglu, Luc Van Gool. "Exploring Cross-Image Pixel Contrast for Semantic Segmentation." In Proceedings of the IEEE International Conference on Computer Vision. 2021.
## Performance
### Cityscapes
|Model|Backbone|Resolution|Training Iters|mIoU|Links|
| :---: | :---: | :---: | :---: | :---: | :---: |
|HRNet_W48_contrast|HRNet_W48|1024x512|60000|82.30%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/HRNet_W48_contrast_cityscapes_1024x512_60k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/HRNet_W48_contrast_cityscapes_1024x512_60k/train.log) \| [vdl](https://www.paddlepaddle.org.cn/paddle/visualdl/service/app/scalar?id=19772033a387e334dc10ea395a55a53a)|
# Interlaced Sparse Self-Attention for Semantic Segmentation
## Reference
> Lang Huang, Yuhui Yuan, Jianyuan Guo, Chao Zhang, Xilin Chen, Jingdong Wang: Interlaced Sparse Self-Attention for Semantic Segmentation. CoRR abs/1907.12273 (2019).
## Performance
### Cityscapes
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|-|-|-|-|-|-|-|-|
|ISANet|ResNet50_OS8|769x769|80000|79.03%|79.43%|79.52%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/isanet_resnet50_os8_cityscapes_769x769_80k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/isanet_resnet50_os8_cityscapes_769x769_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=ab7cc0627fdbf1e210557c33d94d2e8c)|
|ISANet|ResNet101_OS8|769x769|80000|80.10%|80.30%|80.26%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/isanet_resnet101_os8_cityscapes_769x769_80k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/isanet_resnet101_os8_cityscapes_769x769_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=76366b80293c3ac2374d981b4573eb52)|
### Pascal VOC 2012 + Aug
| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) |Links |
|-|-|-|-|-|-|-|-|
|ISANet|ResNet50_OS8|512x512|40000|79.69%|79.93%|80.53%|[model](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/isanet_resnet50_os8_voc12aug_512x512_40k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/isanet_resnet50_os8_voc12aug_512x512_40k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=84af8df983e48f1a0c89154a26f55032)|
|ISANet|ResNet101_OS8|512x512|40000|79.57%|79.69%|80.01%|[model](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/isanet_resnet101_os8_voc12aug_512x512_40k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/pascal_voc12/isanet_resnet101_os8_voc12aug_512x512_40k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=6874531f0adbfc72f22fb816bb231a46)|
_base_: '../_base_/cityscapes_769x769.yml'
batch_size: 2
iters: 80000
model:
  type: ISANet
  isa_channels: 256
  backbone:
    type: ResNet101_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet101_vd_ssld.tar.gz
  num_classes: 19
optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 0.00001
lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  power: 0.9
loss:
  types:
    - type: CrossEntropyLoss
    - type: CrossEntropyLoss
  coef: [1, 0.4]
_base_: '../_base_/pascal_voc12aug.yml'
model:
  type: ISANet
  isa_channels: 256
  backbone:
    type: ResNet101_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet101_vd_ssld.tar.gz
  align_corners: True
optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 4.0e-05
lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  power: 0.9
loss:
  types:
    - type: CrossEntropyLoss
    - type: CrossEntropyLoss
  coef: [1, 0.4]
_base_: '../_base_/cityscapes_769x769.yml'
batch_size: 2
iters: 80000
model:
  type: ISANet
  isa_channels: 256
  backbone:
    type: ResNet50_vd
    output_stride: 8
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
  num_classes: 19
optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 0.00001
lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  power: 0.9
loss:
  types:
    - type: CrossEntropyLoss
    - type: CrossEntropyLoss
  coef: [1, 0.4]