Semantic Segmentation
=====================

Install Package
---------------

- Clone the GitHub repo::
    
    git clone https://github.com/zhanghang1989/PyTorch-Encoding

- Install PyTorch Encoding if you have not done so already, following the guide `Installing PyTorch Encoding <../notes/compile.html>`_.

Get Pre-trained Model
---------------------

.. hint::
    Model names encode the training setup. For instance, in ``FCN_ResNet50_PContext``:

    - ``FCN`` indicates that the algorithm is Fully Convolutional Network for Semantic Segmentation.
    - ``ResNet50`` is the name of the backbone network.
    - ``PContext`` means the PASCAL in Context dataset.

    To load a pre-trained model, for example ``FCN_ResNet50_PContext``::

        model = encoding.models.get_model('FCN_ResNet50_PContext', pretrained=True)

    Click ``cmd`` in a table row to show the command for training that model below the table.
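
The naming convention described in the hint above can be sketched as a small helper that splits a model name into its three parts. This is only an illustration: ``parse_model_name`` and ``ModelName`` are hypothetical names, not part of the ``encoding`` package, and the helper assumes every name follows the ``Algorithm_Backbone_Dataset`` pattern.

.. code-block:: python

    from typing import NamedTuple


    class ModelName(NamedTuple):
        """The three fields encoded in a model-zoo name (hypothetical type)."""
        algorithm: str
        backbone: str
        dataset: str


    def parse_model_name(name: str) -> ModelName:
        """Split a name such as 'FCN_ResNet50_PContext' on underscores."""
        parts = name.split('_')
        if len(parts) != 3:
            raise ValueError(
                f'expected Algorithm_Backbone_Dataset, got {name!r}')
        return ModelName(*parts)


    info = parse_model_name('FCN_ResNet50_PContext')
    print(info.algorithm, info.backbone, info.dataset)
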

.. role:: raw-html(raw)
   :format: html

ResNeSt Backbone Models
-----------------------

==============================================================================  ==============    ==============    =========================================================================================================
Model                                                                           pixAcc            mIoU              Command                                                                                      
==============================================================================  ==============    ==============    =========================================================================================================
FCN_ResNeSt50_ADE                                                               xx.xx%            xx.xx%            :raw-html:`<a href="javascript:toggleblock('cmd_fcn_nest50_ade')" class="toggleblock">cmd</a>`
DeepLabV3_ResNeSt50_ADE                                                         81.17%            45.12%            :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_resnest50_ade')" class="toggleblock">cmd</a>`
DeepLabV3_ResNeSt101_ADE                                                        82.07%            46.91%            :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_resnest101_ade')" class="toggleblock">cmd</a>`
==============================================================================  ==============    ==============    =========================================================================================================

.. raw:: html

    <code xml:space="preserve" id="cmd_fcn_nest50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ade20k --model fcn  --aux --backbone resnest50 --batch-size 2
    </code>

    <code xml:space="preserve" id="cmd_deeplab_resnest50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ADE20K --model deeplab --aux --backbone resnest50
    </code>

    <code xml:space="preserve" id="cmd_deeplab_resnest101_ade" style="display: none; text-align: left; white-space: pre-wrap">
    python train.py --dataset ADE20K --model deeplab --aux --backbone resnest101
    </code>


ResNet Backbone Models
----------------------

ADE20K Dataset
~~~~~~~~~~~~~~

==============================================================================  =================    ==============    =============================================================================================
Model                                                                           pixAcc               mIoU              Command                                                                                      
==============================================================================  =================    ==============    =============================================================================================
FCN_ResNet50_ADE                                                                78.7%                38.5%             :raw-html:`<a href="javascript:toggleblock('cmd_fcn50_ade')" class="toggleblock">cmd</a>`
EncNet_ResNet50_ADE                                                             80.1%                41.5%             :raw-html:`<a href="javascript:toggleblock('cmd_enc50_ade')" class="toggleblock">cmd</a>`    
EncNet_ResNet101_ADE                                                            81.3%                44.4%             :raw-html:`<a href="javascript:toggleblock('cmd_enc101_ade')" class="toggleblock">cmd</a>`   
EncNet_ResNet101_VOC                                                            N/A                  85.9%             :raw-html:`<a href="javascript:toggleblock('cmd_enc101_voc')" class="toggleblock">cmd</a>`   
==============================================================================  =================    ==============    =============================================================================================


.. raw:: html

    <code xml:space="preserve" id="cmd_fcn50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model FCN
    </code>

    <code xml:space="preserve" id="cmd_psp50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model PSP --aux
    </code>

    <code xml:space="preserve" id="cmd_enc50_ade" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model EncNet --aux --se-loss
    </code>

    <code xml:space="preserve" id="cmd_enc101_ade" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model EncNet --aux --se-loss --backbone resnet101 --base-size 640 --crop-size 576
    </code>

    <code xml:space="preserve" id="cmd_enc101_voc" style="display: none; text-align: left; white-space: pre-wrap">
    # First, fine-tune the COCO-pretrained model on the augmented set.
    # You can also train from scratch on COCO yourself.
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset Pascal_aug --model-zoo EncNet_Resnet101_COCO --aux --se-loss --lr 0.001 --syncbn --ngpus 4 --checkname res101 --ft
    # Then fine-tune on the original set.
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset Pascal_voc --model encnet --aux  --se-loss --backbone resnet101 --lr 0.0001 --syncbn --ngpus 4 --checkname res101 --resume runs/Pascal_aug/encnet/res101/checkpoint.params --ft
    </code>



Pascal Context Dataset
~~~~~~~~~~~~~~~~~~~~~~

==============================================================================  =================    ==============    =============================================================================================
Model                                                                           pixAcc               mIoU              Command                                                                                      
==============================================================================  =================    ==============    =============================================================================================
Encnet_ResNet50_PContext                                                        79.2%                51.0%             :raw-html:`<a href="javascript:toggleblock('cmd_enc50_pcont')" class="toggleblock">cmd</a>`  
EncNet_ResNet101_PContext                                                       80.7%                54.1%             :raw-html:`<a href="javascript:toggleblock('cmd_enc101_pcont')" class="toggleblock">cmd</a>` 
==============================================================================  =================    ==============    =============================================================================================

.. raw:: html

    <code xml:space="preserve" id="cmd_fcn50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model FCN
    </code>

    <code xml:space="preserve" id="cmd_enc50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model EncNet --aux --se-loss
    </code>

    <code xml:space="preserve" id="cmd_enc101_pcont" style="display: none; text-align: left; white-space: pre-wrap">
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model EncNet --aux --se-loss --backbone resnet101
    </code>


Test Pretrained
~~~~~~~~~~~~~~~

- Prepare the datasets by running the scripts in the ``scripts/`` folder. For example, to prepare the ``PASCAL Context`` dataset::

      python scripts/prepare_pcontext.py
  
- The test script is in the ``experiments/segmentation/`` folder. To evaluate a model with multi-scale (MS) testing,
  for example ``Encnet_ResNet50_PContext``::

      python test.py --dataset PContext --model-zoo Encnet_ResNet50_PContext --eval
      # pixAcc: 0.792, mIoU: 0.510: 100%|████████████████████████| 1276/1276 [46:31<00:00,  2.19s/it]
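
The ``pixAcc`` and ``mIoU`` values printed by the test script and listed in the tables are, by their usual definitions, the fraction of correctly labeled pixels and the per-class intersection-over-union averaged over classes. The sketch below computes both for small label maps given as nested lists. ``pixel_accuracy_and_miou`` is a hypothetical helper written for illustration; the library's own implementation may differ in details such as how unlabeled pixels are handled.

.. code-block:: python

    def pixel_accuracy_and_miou(pred, target, num_classes):
        """Return (pixAcc, mIoU) for two equally sized 2-D label maps."""
        correct = 0
        total = 0
        intersection = [0] * num_classes
        union = [0] * num_classes
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                total += 1
                if p == t:
                    correct += 1
                    intersection[p] += 1
                    union[p] += 1
                else:
                    # A mismatch adds one pixel to the union of both classes.
                    union[p] += 1
                    union[t] += 1
        if total == 0:
            raise ValueError('label maps must contain at least one pixel')
        # Average IoU only over classes that occur in pred or target.
        ious = [i / u for i, u in zip(intersection, union) if u > 0]
        return correct / total, sum(ious) / len(ious)


    pred = [[0, 0], [1, 1]]
    target = [[0, 1], [1, 1]]
    pix_acc, miou = pixel_accuracy_and_miou(pred, target, num_classes=2)
    print(f'pixAcc: {pix_acc:.3f}, mIoU: {miou:.3f}')
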

Quick Demo
~~~~~~~~~~

.. code-block:: python

    import torch
    import encoding

    # Get the model
    model = encoding.models.get_model('Encnet_ResNet50_PContext', pretrained=True).cuda()
    model.eval()

    # Prepare the image
    url = 'https://github.com/zhanghang1989/image-data/blob/master/' + \
          'encoding/segmentation/pcontext/2010_001829_org.jpg?raw=true'
    filename = 'example.jpg'
    img = encoding.utils.load_image(
        encoding.utils.download(url, filename)).cuda().unsqueeze(0)

    # Make prediction
    output = model.evaluate(img)
    predict = torch.max(output, 1)[1].cpu().numpy() + 1

    # Get the color palette for visualization
    mask = encoding.utils.get_mask_pallete(predict, 'pcontext')
    mask.save('output.png')
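
The prediction step in the demo picks, for each pixel, the class with the highest score (``torch.max(output, 1)[1]``) and then shifts the indices by one. The pure-Python sketch below mirrors that logic on a tiny score array so the shapes are easy to follow. ``argmax_labels`` is a hypothetical helper, and the reading of the ``+ 1`` offset (that the palette expects labels starting at 1) is an assumption rather than something stated in this guide.

.. code-block:: python

    def argmax_labels(scores, offset=1):
        """Pick the best class per pixel from scores[class][row][col]."""
        num_classes = len(scores)
        height = len(scores[0])
        width = len(scores[0][0])
        labels = []
        for y in range(height):
            row = []
            for x in range(width):
                # Index of the class with the highest score at this pixel.
                best = max(range(num_classes), key=lambda c: scores[c][y][x])
                row.append(best + offset)
            labels.append(row)
        return labels


    # Three classes scored over a 2x2 image.
    scores = [
        [[0.90, 0.20], [0.10, 0.30]],
        [[0.05, 0.70], [0.20, 0.30]],
        [[0.05, 0.10], [0.70, 0.40]],
    ]
    print(argmax_labels(scores))
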


.. image:: https://raw.githubusercontent.com/zhanghang1989/image-data/master/encoding/segmentation/pcontext/2010_001829_org.jpg
   :width: 45%

.. image:: https://raw.githubusercontent.com/zhanghang1989/image-data/master/encoding/segmentation/pcontext/2010_001829.png
   :width: 45%

Train Your Own Model
--------------------

- Prepare the datasets by running the scripts in the ``scripts/`` folder. For example, to prepare the ``ADE20K`` dataset::

    python scripts/prepare_ade20k.py

- The training script is in the ``experiments/segmentation/`` folder. An example training command::

    python train_dist.py --dataset ade20k --model encnet --aux --se-loss

- For detailed training options, run ``python train.py -h``. Commands for reproducing the pre-trained models can be found in the tables above.

.. hint::
    The validation metrics reported during training use only a center crop and are meant just for
    checking that training is proceeding correctly. To evaluate a model on the validation set with
    multi-scale (MS) testing, use the command::

        python test.py --dataset pcontext --model encnet --aux --se-loss --resume mycheckpoint --eval
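
One common way to implement multi-scale evaluation is to run the network on several resized copies of the image and average the resulting class probabilities. The sketch below illustrates that idea with plain PyTorch. It is not the evaluation code used by ``test.py``: ``multi_scale_predict``, the list of scales, and the one-layer stand-in model are all assumptions chosen for illustration.

.. code-block:: python

    import torch
    import torch.nn.functional as F


    def multi_scale_predict(model, image, scales=(0.5, 1.0, 1.5)):
        """Average per-pixel class probabilities over resized inputs.

        image has shape (N, C, H, W); the result has shape
        (N, num_classes, H, W).
        """
        _, _, height, width = image.shape
        total = None
        with torch.no_grad():
            for scale in scales:
                size = (max(1, int(height * scale)), max(1, int(width * scale)))
                scaled = F.interpolate(image, size=size, mode='bilinear',
                                       align_corners=False)
                logits = model(scaled)
                # Bring the prediction back to the original resolution.
                logits = F.interpolate(logits, size=(height, width),
                                       mode='bilinear', align_corners=False)
                probs = F.softmax(logits, dim=1)
                total = probs if total is None else total + probs
        return total / len(scales)


    # Stand-in "network": a 1x1 convolution mapping 3 channels to 5 classes.
    model = torch.nn.Conv2d(3, 5, kernel_size=1).eval()
    image = torch.rand(1, 3, 8, 8)
    probs = multi_scale_predict(model, image)
    predict = probs.argmax(dim=1)
    print(probs.shape, predict.shape)

The center-crop validation used during training corresponds to a single forward pass, which is why it is cheaper but typically reports different numbers than the averaged prediction.
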

Citation
--------

.. note::
    * Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. "Context Encoding for Semantic Segmentation"  *The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018*::

        @InProceedings{Zhang_2018_CVPR,
        author = {Zhang, Hang and Dana, Kristin and Shi, Jianping and Zhang, Zhongyue and Wang, Xiaogang and Tyagi, Ambrish and Agrawal, Amit},
        title = {Context Encoding for Semantic Segmentation},
        booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
        month = {June},
        year = {2018}
        }