
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/pruning_quick_start_mnist.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_tutorials_pruning_quick_start_mnist.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_pruning_quick_start_mnist.py:


Pruning Quickstart
==================

Here is a three-minute video to get you started with model pruning.

..  youtube:: wKh51Jnr0a8
    :align: center

Model pruning is a technique that reduces model size and computation by shrinking the size of the model weights or intermediate states.
There are three common practices for pruning a DNN model:

#. Pre-training a model -> Pruning the model -> Fine-tuning the pruned model
#. Pruning a model during training (i.e., pruning-aware training) -> Fine-tuning the pruned model
#. Pruning a model -> Training the pruned model from scratch

NNI supports all of the above pruning practices by focusing on the key pruning stage.
Follow this tutorial for a quick look at how to use NNI to prune a model following a common practice.

.. GENERATED FROM PYTHON SOURCE LINES 22-27

Preparation
-----------

In this tutorial, we use a simple model pre-trained on the MNIST dataset.
If you are familiar with defining a model and training it in PyTorch, you can skip directly to `Pruning Model`_.

.. GENERATED FROM PYTHON SOURCE LINES 27-40

.. code-block:: default


    import torch
    import torch.nn.functional as F
    from torch.optim import SGD

    from nni_assets.compression.mnist_model import TorchModel, trainer, evaluator, device

    # define the model
    model = TorchModel().to(device)

    # show the model structure; note that the pruner will wrap the model layers
    print(model)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    TorchModel(
      (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=256, out_features=120, bias=True)
      (fc2): Linear(in_features=120, out_features=84, bias=True)
      (fc3): Linear(in_features=84, out_features=10, bias=True)
      (relu1): ReLU()
      (relu2): ReLU()
      (relu3): ReLU()
      (relu4): ReLU()
      (pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      (pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    )




.. GENERATED FROM PYTHON SOURCE LINES 41-52

.. code-block:: default


    # define the optimizer and criterion for pre-training

    optimizer = SGD(model.parameters(), 1e-2)
    criterion = F.nll_loss

    # pre-train and evaluate the model on the MNIST dataset
    for epoch in range(3):
        trainer(model, optimizer, criterion)
        evaluator(model)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Average test loss: 1.3409, Accuracy: 6494/10000 (65%)
    Average test loss: 0.3263, Accuracy: 9003/10000 (90%)
    Average test loss: 0.2029, Accuracy: 9388/10000 (94%)
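
An optional sketch, assuming a hypothetical local path ``pretrained_mnist.pt``, for keeping a copy of the pre-trained weights so that pre-training does not need to be repeated:

.. code-block:: default

    # optional: save the pre-trained weights to a hypothetical local path
    torch.save(model.state_dict(), 'pretrained_mnist.pt')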




.. GENERATED FROM PYTHON SOURCE LINES 53-63

Pruning Model
-------------

Use L1NormPruner to prune the model and generate the masks.
Usually, a pruner requires the original model and a ``config_list`` as its inputs.
For details about how to write a ``config_list``, please refer to :doc:`compression config specification <../compression/compression_config_list>`.

The following `config_list` means all layers whose type is `Linear` or `Conv2d` will be pruned,
except the layer named `fc3`, because it is marked as `exclude` and will not be pruned.
The final sparsity ratio for each pruned layer is 50%.

.. GENERATED FROM PYTHON SOURCE LINES 63-72

.. code-block:: default


    config_list = [{
        'sparsity_per_layer': 0.5,
        'op_types': ['Linear', 'Conv2d']
    }, {
        'exclude': True,
        'op_names': ['fc3']
    }]
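
A ``config_list`` can also select layers by name instead of by type. A hypothetical variant using the ``op_names`` key (see the config specification linked above for all supported keys) that would prune only the two convolution layers:

.. code-block:: default

    # hypothetical variant: select layers by name via 'op_names' instead of 'op_types'
    config_list_by_name = [{
        'sparsity_per_layer': 0.5,
        'op_names': ['conv1', 'conv2']
    }]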








.. GENERATED FROM PYTHON SOURCE LINES 73-74

Pruners usually require `model` and `config_list` as input arguments.

.. GENERATED FROM PYTHON SOURCE LINES 74-81

.. code-block:: default


    from nni.compression.pytorch.pruning import L1NormPruner
    pruner = L1NormPruner(model, config_list)

    # show the wrapped model structure; `PrunerModuleWrapper` has wrapped the layers configured in the config_list
    print(model)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    TorchModel(
      (conv1): PrunerModuleWrapper(
        (module): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      )
      (conv2): PrunerModuleWrapper(
        (module): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      )
      (fc1): PrunerModuleWrapper(
        (module): Linear(in_features=256, out_features=120, bias=True)
      )
      (fc2): PrunerModuleWrapper(
        (module): Linear(in_features=120, out_features=84, bias=True)
      )
      (fc3): Linear(in_features=84, out_features=10, bias=True)
      (relu1): ReLU()
      (relu2): ReLU()
      (relu3): ReLU()
      (relu4): ReLU()
      (pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      (pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    )
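
As the printout shows, each configured layer is now wrapped in a ``PrunerModuleWrapper`` that keeps the original module as its ``module`` attribute. A quick sketch of reaching the underlying layer:

.. code-block:: default

    # the original Conv2d is still reachable through the wrapper's ``module`` attribute
    print(type(model.conv1))         # PrunerModuleWrapper
    print(type(model.conv1.module))  # Conv2d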




.. GENERATED FROM PYTHON SOURCE LINES 82-89

.. code-block:: default


    # compress the model and generate the masks
    _, masks = pruner.compress()
    # show the masks sparsity
    for name, mask in masks.items():
        print(name, ' sparsity : ', '{:.2}'.format(mask['weight'].sum() / mask['weight'].numel()))





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    conv1  sparsity :  0.5
    conv2  sparsity :  0.5
    fc1  sparsity :  0.5
    fc2  sparsity :  0.5
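
Each mask is a tensor with the same shape as the corresponding weight, where zero entries mark pruned weights. A minimal sketch, assuming the ``masks`` dict returned by ``pruner.compress()`` above, that counts how many ``conv1`` filters survive pruning:

.. code-block:: default

    # minimal sketch: L1NormPruner prunes whole filters, so a filter is kept
    # if any entry of its mask is non-zero
    conv1_mask = masks['conv1']['weight']
    kept_filters = int((conv1_mask.flatten(1).sum(dim=1) > 0).sum())
    print(f'conv1 keeps {kept_filters} of {conv1_mask.shape[0]} filters')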




.. GENERATED FROM PYTHON SOURCE LINES 90-93

Speed up the original model with the masks; note that `ModelSpeedup` requires an unwrapped model.
The model becomes smaller after speedup,
and reaches a higher sparsity ratio because `ModelSpeedup` propagates the masks across layers.

.. GENERATED FROM PYTHON SOURCE LINES 93-102

.. code-block:: default


    # unwrap the model before speedup if it has been wrapped by the pruner
    pruner._unwrap_model()

    # speed up the model; for more information about speedup, please refer to :doc:`pruning_speedup`
    from nni.compression.pytorch.speedup import ModelSpeedup

    ModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/ningshang/anaconda3/envs/nni-dev/lib/python3.8/site-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at  aten/src/ATen/core/TensorBody.h:417.)
      return self._grad




.. GENERATED FROM PYTHON SOURCE LINES 103-104

The model becomes truly smaller after speedup.

.. GENERATED FROM PYTHON SOURCE LINES 104-106

.. code-block:: default

    print(model)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    TorchModel(
      (conv1): Conv2d(1, 3, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=128, out_features=60, bias=True)
      (fc2): Linear(in_features=60, out_features=42, bias=True)
      (fc3): Linear(in_features=42, out_features=10, bias=True)
      (relu1): ReLU()
      (relu2): ReLU()
      (relu3): ReLU()
      (relu4): ReLU()
      (pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      (pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    )
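
As a rough sanity check, the parameter count of the sped-up model can be compared with a freshly constructed ``TorchModel`` (a sketch reusing the import from the preparation step):

.. code-block:: default

    # rough sanity check: compare parameter counts before and after speedup
    original_params = sum(p.numel() for p in TorchModel().parameters())
    pruned_params = sum(p.numel() for p in model.parameters())
    print(f'parameters: {original_params} -> {pruned_params}')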




.. GENERATED FROM PYTHON SOURCE LINES 107-111

Fine-tuning Compacted Model
---------------------------
Note that if the model has been sped up, you need to create a new optimizer for fine-tuning,
because speedup replaces the masked large layers with smaller dense ones.

.. GENERATED FROM PYTHON SOURCE LINES 111-115

.. code-block:: default


    optimizer = SGD(model.parameters(), 1e-2)
    for epoch in range(3):
        trainer(model, optimizer, criterion)
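
A minimal follow-up sketch, reusing the ``evaluator`` helper imported in the preparation step, for checking the accuracy of the fine-tuned compact model:

.. code-block:: default

    # optional check of the fine-tuned compact model
    evaluator(model)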








.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 1 minutes  0.810 seconds)


.. _sphx_glr_download_tutorials_pruning_quick_start_mnist.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: pruning_quick_start_mnist.py <pruning_quick_start_mnist.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: pruning_quick_start_mnist.ipynb <pruning_quick_start_mnist.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_