Semantic Segmentation
=====================

Install Package
---------------

- Clone the GitHub repo::
    
    git clone https://github.com/zhanghang1989/PyTorch-Encoding

- Install PyTorch Encoding if you have not done so yet, following the installation guide `Installing PyTorch Encoding <../notes/compile.html>`_.

Get Pre-trained Model
---------------------

.. hint::
    The model names contain the training information. For instance ``EncNet_ResNet50s_ADE``:
      - ``EncNet`` indicates the algorithm, Context Encoding for Semantic Segmentation.
      - ``ResNet50s`` is the name of the backbone network.
      - ``ADE`` means the ADE20K dataset.

    To load a pretrained model, for example ``EncNet_ResNet50s_ADE``::

        model = encoding.models.get_model('EncNet_ResNet50s_ADE', pretrained=True)

    Click ``cmd`` in a table row to show the command for training that model below the table.
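
The naming convention described in the hint can be sketched as a small parser. This is only an illustration: ``parse_model_name`` is a hypothetical helper, not part of the ``encoding`` package, and it assumes that every name has exactly three underscore-separated parts.

.. code-block:: python

    def parse_model_name(name):
        """Split a model-zoo name into its three parts (hypothetical helper)."""
        # Assumption: names follow the Algorithm_Backbone_Dataset pattern.
        algorithm, backbone, dataset = name.split('_')
        return {'algorithm': algorithm, 'backbone': backbone, 'dataset': dataset}


    info = parse_model_name('EncNet_ResNet50s_ADE')
    print(info)
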

.. role:: raw-html(raw)
   :format: html


ResNeSt Backbone Models
-----------------------

ADE20K Dataset
~~~~~~~~~~~~~~

==============================================================================  ====================    ===================    =========================================================================================================
Model                                                                           pixAcc                  mIoU                   Command                                                                                      
==============================================================================  ====================    ===================    =========================================================================================================
FCN_ResNeSt50_ADE                                                               80.18%                  42.94%                 :raw-html:`<a href="javascript:toggleblock('cmd_fcn_nest50_ade')" class="toggleblock">cmd</a>`
DeepLab_ResNeSt50_ADE                                                           81.17%                  45.12%                 :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_resnest50_ade')" class="toggleblock">cmd</a>`
DeepLab_ResNeSt101_ADE                                                          82.07%                  46.91%                 :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_resnest101_ade')" class="toggleblock">cmd</a>`
DeepLab_ResNeSt200_ADE                                                          82.45%                  48.36%                 :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_resnest200_ade')" class="toggleblock">cmd</a>`
DeepLab_ResNeSt269_ADE                                                          82.62%                  47.60%                 :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_resnest269_ade')" class="toggleblock">cmd</a>`
==============================================================================  ====================    ===================    =========================================================================================================

.. raw:: html

    <code xml:space="preserve" id="cmd_fcn_nest50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ADE20K --model fcn --aux --backbone resnest50
    </code>

    <code xml:space="preserve" id="cmd_enc_nest50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ADE20K --model EncNet --aux --se-loss --backbone resnest50
    </code>

    <code xml:space="preserve" id="cmd_deeplab_resnest50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ADE20K --model deeplab --aux --backbone resnest50
    </code>

    <code xml:space="preserve" id="cmd_deeplab_resnest101_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ADE20K --model deeplab --aux --backbone resnest101 --epochs 180
    </code>

    <code xml:space="preserve" id="cmd_deeplab_resnest200_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ADE20K --model deeplab --aux --backbone resnest200 --epochs 180
    </code>

    <code xml:space="preserve" id="cmd_deeplab_resnest269_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ADE20K --model deeplab --aux --backbone resnest269
    </code>


Pascal Context Dataset
~~~~~~~~~~~~~~~~~~~~~~

==============================================================================  ====================    ====================    =========================================================================================================
Model                                                                           pixAcc                  mIoU                    Command                                                                                      
==============================================================================  ====================    ====================    =========================================================================================================
FCN_ResNeSt50_PContext                                                          79.19%                  51.98%                  :raw-html:`<a href="javascript:toggleblock('cmd_fcn_nest50_pcont')" class="toggleblock">cmd</a>`
DeepLab_ResNeSt50_PContext                                                      80.41%                  53.19%                  :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_nest50_pcont')" class="toggleblock">cmd</a>`
DeepLab_ResNeSt101_PContext                                                     81.91%                  56.49%                  :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_nest101_pcont')" class="toggleblock">cmd</a>`
DeepLab_ResNeSt200_PContext                                                     82.50%                  58.37%                  :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_nest200_pcont')" class="toggleblock">cmd</a>`
DeepLab_ResNeSt269_PContext                                                     83.06%                  58.92%                  :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_nest269_pcont')" class="toggleblock">cmd</a>`
==============================================================================  ====================    ====================    =========================================================================================================

.. raw:: html

    <code xml:space="preserve" id="cmd_fcn_nest50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset pcontext --model fcn --aux --backbone resnest50
    </code>

    <code xml:space="preserve" id="cmd_deeplab_nest50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset pcontext --model deeplab --aux --backbone resnest50
    </code>

    <code xml:space="preserve" id="cmd_deeplab_nest101_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset pcontext --model deeplab --aux --backbone resnest101
    </code>

    <code xml:space="preserve" id="cmd_deeplab_nest200_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset pcontext --model deeplab --aux --backbone resnest200
    </code>

    <code xml:space="preserve" id="cmd_deeplab_nest269_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset pcontext --model deeplab --aux --backbone resnest269
    </code>


ResNet Backbone Models
----------------------

ADE20K Dataset
~~~~~~~~~~~~~~

==============================================================================  ====================    ====================    =============================================================================================
Model                                                                           pixAcc                  mIoU                    Command                                                                                      
==============================================================================  ====================    ====================    =============================================================================================
FCN_ResNet50s_ADE                                                               78.7%                   38.5%                   :raw-html:`<a href="javascript:toggleblock('cmd_fcn50_ade')" class="toggleblock">cmd</a>`
EncNet_ResNet50s_ADE                                                            80.1%                   41.5%                   :raw-html:`<a href="javascript:toggleblock('cmd_enc50_ade')" class="toggleblock">cmd</a>`    
EncNet_ResNet101s_ADE                                                           81.3%                   44.4%                   :raw-html:`<a href="javascript:toggleblock('cmd_enc101_ade')" class="toggleblock">cmd</a>`   
==============================================================================  ====================    ====================    =============================================================================================


.. raw:: html

    <code xml:space="preserve" id="cmd_fcn50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model FCN
    </code>

    <code xml:space="preserve" id="cmd_psp50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model PSP --aux
    </code>

    <code xml:space="preserve" id="cmd_enc50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model EncNet --aux --se-loss
    </code>

    <code xml:space="preserve" id="cmd_enc101_ade" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model EncNet --aux --se-loss --backbone resnet101
    </code>

Pascal Context Dataset
~~~~~~~~~~~~~~~~~~~~~~

==============================================================================  =====================    =====================    =============================================================================================
Model                                                                           pixAcc                   mIoU                     Command                                                                                      
==============================================================================  =====================    =====================    =============================================================================================
Encnet_ResNet50s_PContext                                                        79.2%                    51.0%                    :raw-html:`<a href="javascript:toggleblock('cmd_enc50_pcont')" class="toggleblock">cmd</a>`  
EncNet_ResNet101s_PContext                                                       80.7%                    54.1%                    :raw-html:`<a href="javascript:toggleblock('cmd_enc101_pcont')" class="toggleblock">cmd</a>` 
==============================================================================  =====================    =====================    =============================================================================================

.. raw:: html

    <code xml:space="preserve" id="cmd_fcn50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model FCN
    </code>

    <code xml:space="preserve" id="cmd_enc50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model EncNet --aux --se-loss
    </code>

    <code xml:space="preserve" id="cmd_enc101_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model EncNet --aux --se-loss --backbone resnet101
    </code>


Pascal VOC Dataset
~~~~~~~~~~~~~~~~~~

==============================================================================  ======================    =====================    =============================================================================================
Model                                                                           pixAcc                    mIoU                     Command                                                                                      
==============================================================================  ======================    =====================    =============================================================================================
EncNet_ResNet101s_VOC                                                           N/A                       85.9%                    :raw-html:`<a href="javascript:toggleblock('cmd_enc101_voc')" class="toggleblock">cmd</a>`   
==============================================================================  ======================    =====================    =============================================================================================

.. raw:: html

    <code xml:space="preserve" id="cmd_enc101_voc" style="display: none; text-align: left; white-space: pre-wrap">
    # First, fine-tune the COCO-pretrained model on the augmented set.
    # You can also train the model on COCO from scratch yourself.
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset Pascal_aug --model-zoo EncNet_Resnet101_COCO --aux --se-loss --lr 0.001 --syncbn --ngpus 4 --checkname res101 --ft
    # Then fine-tune on the original set.
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset Pascal_voc --model encnet --aux --se-loss --backbone resnet101 --lr 0.0001 --syncbn --ngpus 4 --checkname res101 --resume runs/Pascal_aug/encnet/res101/checkpoint.params --ft
    </code>


Test Pretrained
~~~~~~~~~~~~~~~

- Prepare the datasets by running the scripts in the ``scripts/`` folder, for example to prepare the ``ADE20K`` dataset::

      python scripts/prepare_ade20k.py
  
- The test script is in the ``experiments/segmentation/`` folder. To evaluate a pretrained model with
  multi-scale (MS) testing, for example ``EncNet_ResNet50s_ADE``::

      python test.py --dataset ADE20K --model-zoo EncNet_ResNet50s_ADE --eval
      # pixAcc: 0.801, mIoU: 0.415: 100%|████████████████████████| 250/250
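
The two numbers printed by the test script, ``pixAcc`` and ``mIoU``, are pixel accuracy and mean intersection-over-union. The following is a minimal pure-Python sketch of the textbook definitions on flattened label lists. The names ``pixel_accuracy`` and ``mean_iou`` are illustrative, and the package's own implementation may differ in details such as how unlabeled pixels are handled.

.. code-block:: python

    def pixel_accuracy(pred, target):
        """Fraction of pixels whose predicted label equals the ground truth."""
        correct = sum(1 for p, t in zip(pred, target) if p == t)
        return correct / len(target)


    def mean_iou(pred, target, num_classes):
        """Mean over classes of intersection / union, skipping absent classes."""
        ious = []
        for c in range(num_classes):
            inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
            union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
            if union > 0:
                ious.append(inter / union)
        return sum(ious) / len(ious)


    # Toy example: six pixels, three classes.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    acc = pixel_accuracy(pred, target)
    miou = mean_iou(pred, target, num_classes=3)
    print('pixAcc: %.3f, mIoU: %.3f' % (acc, miou))
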


Train Your Own Model
--------------------

- Prepare the datasets by running the scripts in the ``scripts/`` folder, for example to prepare the ``ADE20K`` dataset::

    python scripts/prepare_ade20k.py

- The training script is in the ``experiments/segmentation/`` folder. An example training command::

    python train.py --dataset ade20k --model encnet --aux --se-loss

- For detailed training options, run ``python train.py -h``. Commands for reproducing the pre-trained models can be found in the tables above.
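
When launching several runs, the training command can be assembled programmatically. The sketch below uses only flags that appear in the commands on this page (``--dataset``, ``--model``, ``--backbone``, ``--aux``, ``--se-loss``). ``build_train_command`` is a hypothetical helper, not part of the repository.

.. code-block:: python

    import shlex


    def build_train_command(dataset, model, backbone=None, aux=False, se_loss=False):
        """Build the argument list for train.py (hypothetical helper)."""
        args = ['python', 'train.py', '--dataset', dataset, '--model', model]
        if aux:
            args.append('--aux')
        if se_loss:
            args.append('--se-loss')
        if backbone is not None:
            args += ['--backbone', backbone]
        return args


    cmd = build_train_command('ade20k', 'encnet', aux=True, se_loss=True)
    # shlex.join (Python 3.8+) renders the list as a shell-safe command string.
    print(shlex.join(cmd))

The returned list can be passed to ``subprocess.run`` with ``cwd`` set to the ``experiments/segmentation/`` folder.
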

.. hint::
    The validation metrics reported during training use only a center crop and are meant just for
    monitoring that training is proceeding correctly. To evaluate a trained model on the validation set
    with multi-scale (MS) testing, use the command::

        python test.py --dataset pcontext --model encnet --aux --se-loss --resume mycheckpoint --eval
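
Multi-scale testing, in general, runs the model on several resized copies of the input and averages the per-pixel class scores before taking the arg-max. The sketch below illustrates only that averaging step on plain lists. The scores are made up for the example, and the scales and resizing actually used by ``test.py`` are not shown here.

.. code-block:: python

    def average_scores(per_scale_scores):
        """Average class scores over scales.

        per_scale_scores holds one entry per scale; each entry is a list of
        pixels, and each pixel is a list of class scores. All entries are
        assumed to be already resized to the same number of pixels.
        """
        num_scales = len(per_scale_scores)
        num_pixels = len(per_scale_scores[0])
        averaged = []
        for i in range(num_pixels):
            pixel_scores = [scores[i] for scores in per_scale_scores]
            averaged.append([sum(vals) / num_scales for vals in zip(*pixel_scores)])
        return averaged


    def predict_labels(scores):
        """Pick the highest-scoring class index for every pixel."""
        return [max(range(len(s)), key=s.__getitem__) for s in scores]


    # Made-up scores: two scales, two pixels, two classes.
    scale_a = [[0.6, 0.4], [0.3, 0.7]]
    scale_b = [[0.2, 0.8], [0.4, 0.6]]
    labels = predict_labels(average_scores([scale_a, scale_b]))
    print(labels)

In this toy input the first pixel would be assigned class 0 from ``scale_a`` alone, while the averaged scores favor class 1.
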


Quick Demo
~~~~~~~~~~

.. code-block:: python

    import torch
    import encoding

    # Get the model
    model = encoding.models.get_model('Encnet_ResNet50s_PContext', pretrained=True).cuda()
    model.eval()

    # Prepare the image
    url = 'https://github.com/zhanghang1989/image-data/blob/master/' + \
          'encoding/segmentation/pcontext/2010_001829_org.jpg?raw=true'
    filename = 'example.jpg'
    img = encoding.utils.load_image(
        encoding.utils.download(url, filename)).cuda().unsqueeze(0)

    # Make prediction
    output = model.evaluate(img)
    predict = torch.max(output, 1)[1].cpu().numpy() + 1

    # Get the color palette for visualization
    mask = encoding.utils.get_mask_pallete(predict, 'pcontext')
    mask.save('output.png')


.. image:: https://raw.githubusercontent.com/zhanghang1989/image-data/master/encoding/segmentation/pcontext/2010_001829_org.jpg
   :width: 45%

.. image:: https://raw.githubusercontent.com/zhanghang1989/image-data/master/encoding/segmentation/pcontext/2010_001829.png
   :width: 45%


Citation
--------

.. note::
    * Hang Zhang et al. "ResNeSt: Split-Attention Networks" *arXiv 2020*::

        @article{zhang2020resnest,
        title={ResNeSt: Split-Attention Networks},
        author={Zhang, Hang and Wu, Chongruo and Zhang, Zhongyue and Zhu, Yi and Zhang, Zhi and Lin, Haibin and Sun, Yue and He, Tong and Muller, Jonas and Manmatha, R. and Li, Mu and Smola, Alexander},
        journal={arXiv preprint arXiv:2004.08955},
        year={2020}
        }


    * Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. "Context Encoding for Semantic Segmentation"  *The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018*::

        @InProceedings{Zhang_2018_CVPR,
        author = {Zhang, Hang and Dana, Kristin and Shi, Jianping and Zhang, Zhongyue and Wang, Xiaogang and Tyagi, Ambrish and Agrawal, Amit},
        title = {Context Encoding for Semantic Segmentation},
        booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
        month = {June},
        year = {2018}
        }