# MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context

[![Paper](http://img.shields.io/badge/Paper-arXiv.2112.11623-B3181B?logo=arXiv)](https://arxiv.org/abs/2112.11623)

This repository is the official implementation of the following
paper.

* [MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context](https://arxiv.org/abs/2112.11623)

## Description

MOSAIC is a neural network architecture for efficient and accurate semantic
image segmentation on mobile devices. It is built from neural operations that
are commonly supported across diverse mobile hardware platforms, allowing
flexible deployment. With a simple asymmetric encoder-decoder structure
consisting of an efficient multi-scale context encoder and a lightweight hybrid
decoder that recovers spatial details from aggregated information, MOSAIC
achieves a better balance between accuracy and computational cost. Deployed on
top of a tailored feature extraction backbone based on a searched
classification network, MOSAIC achieves a 5% absolute accuracy gain on ADE20K
with similar or lower latency compared to the current industry-standard MLPerf
Mobile v1.0 models and state-of-the-art architectures.

[MLPerf Mobile v2.0](https://mlcommons.org/en/inference-mobile-20/) included
MOSAIC as a new industry-standard benchmark model for image segmentation.
See the [announcement](https://mlcommons.org/en/news/mlperf-inference-1q2022/) for details.

You can also refer to the [MLCommons GitHub repository](https://github.com/mlcommons/mobile_open/tree/main/vision/mosaic).

## History

### Oct 13, 2022

*   First release of MOSAIC in TensorFlow 2 including checkpoints that have been
    pretrained on Cityscapes.

## Maintainers

* Weijun Wang ([weijunw-g](https://github.com/weijunw-g))
* Fang Yang ([fyangf](https://github.com/fyangf))
* Shixin Luo ([luotigerlsx](https://github.com/luotigerlsx))

## Requirements

[![Python](https://img.shields.io/pypi/pyversions/tensorflow.svg?style=plastic)](https://badge.fury.io/py/tensorflow)
[![tf-models-official PyPI](https://badge.fury.io/py/tf-models-official.svg)](https://badge.fury.io/py/tf-models-official)

## Results

The following table shows the mIoU measured on the `cityscapes` dataset.

| Config                  | Backbone             | Resolution | branch_filter_depths | pyramid_pool_bin_nums | mIoU  | Download |
|-------------------------|:--------------------:|:----------:|:--------------------:|:---------------------:|:-----:|:--------:|
| Paper reference config  | MobileNetMultiAVGSeg | 1024x2048  | [32, 32]             | [4, 8, 16]            | 75.98 | [ckpt](https://storage.googleapis.com/tf_model_garden/vision/mosaic/MobileNetMultiAVGSeg-r1024-ebf32-nogp.tar.gz)<br>[tensorboard](https://tensorboard.dev/experiment/okEog90bSwupajFgJwGEIw//#scalars) |
| Current best config     | MobileNetMultiAVGSeg | 1024x2048  | [64, 64]             | [1, 4, 8, 16]         | 77.24 | [ckpt](https://storage.googleapis.com/tf_model_garden/vision/mosaic/MobileNetMultiAVGSeg-r1024-ebf64-gp.tar.gz)<br>[tensorboard](https://tensorboard.dev/experiment/l5hkV7JaQM23EXeOBT6oJg/#scalars)  |

*   `branch_filter_depths`: the number of convolution channels in each branch at
    a pyramid level after `Spatial Pyramid Pooling`
*   `pyramid_pool_bin_nums`: the number of bins at each level of the `Spatial
    Pyramid Pooling`
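
For reference, here is how these two settings might appear in a model config.
This is a hedged sketch: the key paths below are assumptions for illustration
and should be checked against the experiment YAML shipped with the project;
the values match the "Current best config" row above.

```yaml
# Hypothetical fragment of a MOSAIC experiment config
# (key paths are illustrative, not verified against the shipped YAML).
task:
  model:
    head:
      branch_filter_depths: [64, 64]        # conv channels per branch after SPP
      pyramid_pool_bin_nums: [1, 4, 8, 16]  # bins per Spatial Pyramid Pooling level
```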

## Training

Training can run on Google Cloud Platform using Cloud TPU. Follow the
[Cloud TPU instructions](https://cloud.google.com/tpu/docs/how-to) to set up a
Cloud TPU, then launch training with:

```shell
EXP_TYPE=mosaic_mnv35_cityscapes
EXP_NAME="<experiment-name>"  # You can give any name to the experiment.
TPU_NAME="<tpu-name>"  # The name assigned while creating a Cloud TPU
MODEL_DIR="gs://<path-to-model-directory>"
# Now launch the experiment.
python3 -m official.projects.mosaic.train \
  --experiment=$EXP_TYPE \
  --mode=train \
  --tpu=$TPU_NAME \
  --model_dir=$MODEL_DIR \
  --config_file=official/projects/mosaic/configs/experiments/mosaic_mnv35_cityscapes_tdfs_tpu.yaml
```
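
To fine-tune from one of the pretrained checkpoints in the Results table,
the extracted checkpoint can be referenced from the experiment config. A
minimal sketch, assuming the Model Garden convention of a
`task.init_checkpoint` field; verify the field names against the provided
experiment YAML before use:

```yaml
task:
  init_checkpoint: 'gs://<path-to-extracted-checkpoint>'
  # Restore only matching submodules; 'all' or 'backbone' are typical values
  # in Model Garden configs (assumption -- check the task definition).
  init_checkpoint_modules: 'all'
```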

## Evaluation

Run the following command to evaluate a trained model:

```shell
EXP_TYPE=mosaic_mnv35_cityscapes
EXP_NAME="<experiment-name>"  # You can give any name to the experiment.
TPU_NAME="<tpu-name>"  # The name assigned while creating a Cloud TPU
MODEL_DIR="gs://<path-to-model-directory>"
# Now launch the experiment.
python3 -m official.projects.mosaic.train \
  --experiment=$EXP_TYPE \
  --mode=eval \
  --tpu=$TPU_NAME \
  --model_dir=$MODEL_DIR \
  --config_file=official/projects/mosaic/configs/experiments/mosaic_mnv35_cityscapes_tdfs_tpu.yaml
```

## License

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

This project is licensed under the terms of the **Apache License 2.0**.

## Citation

If you use this repository in your work, please cite the paper:

```bibtex
@article{weijun2021mosaic,
  title={MOSAIC: Mobile Segmentation via decoding Aggregated Information and
    encoded Context},
  author={Wang, Weijun and Howard, Andrew},
  journal={arXiv preprint arXiv:2112.11623},
  year={2021},
}
```