

.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/pruning_quick_start_mnist.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_tutorials_pruning_quick_start_mnist.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_pruning_quick_start_mnist.py:


Pruning Quickstart
==================

Here is a three-minute video to get you started with model pruning.

..  youtube:: wKh51Jnr0a8
    :align: center

Model pruning is a technique that reduces model size and computation cost by shrinking the model's weights or its intermediate states.
There are three common practices for pruning a DNN model:

#. Pre-training a model -> Pruning the model -> Fine-tuning the pruned model
#. Pruning a model during training (i.e., pruning-aware training) -> Fine-tuning the pruned model
#. Pruning a model -> Training the pruned model from scratch

NNI supports all of the above pruning practices by working on the key pruning stage.
Follow this tutorial for a quick look at how to use NNI to prune a model in a common practice.

.. GENERATED FROM PYTHON SOURCE LINES 22-27

Preparation
-----------

In this tutorial, we use a simple model pre-trained on the MNIST dataset.
If you are familiar with defining and training a model in PyTorch, you can skip directly to `Pruning Model`_.

.. GENERATED FROM PYTHON SOURCE LINES 27-40

.. code-block:: default


    import torch
    import torch.nn.functional as F
    from torch.optim import SGD

    from scripts.compression_mnist_model import TorchModel, trainer, evaluator, device

    # define the model
    model = TorchModel().to(device)

    # show the model structure; note that the pruner will later wrap the model's layers.
    print(model)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=256, out_features=120, bias=True)
      (fc2): Linear(in_features=120, out_features=84, bias=True)
      (fc3): Linear(in_features=84, out_features=10, bias=True)
      (relu1): ReLU()
      (relu2): ReLU()
      (relu3): ReLU()
      (relu4): ReLU()
      (pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      (pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    )
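
``TorchModel`` is imported from a helper script that this page does not show.
A minimal sketch consistent with the printed structure could look like the following.
The class name ``SketchModel``, the order of operations in ``forward``, and the final ``log_softmax`` are assumptions
inferred from the layer shapes, the ``F.nll_loss`` criterion, and the ``aten::log_softmax`` message in the speedup log;
this is not the actual helper code.

.. code-block:: python

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class SketchModel(nn.Module):
        """Hypothetical LeNet-style stand-in for TorchModel (assumed, not the real helper)."""

        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 6, 5, 1)
            self.conv2 = nn.Conv2d(6, 16, 5, 1)
            self.fc1 = nn.Linear(16 * 4 * 4, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)
            self.relu1 = nn.ReLU()
            self.relu2 = nn.ReLU()
            self.relu3 = nn.ReLU()
            self.relu4 = nn.ReLU()
            self.pool1 = nn.MaxPool2d((2, 2))
            self.pool2 = nn.MaxPool2d((2, 2))

        def forward(self, x):
            x = self.pool1(self.relu1(self.conv1(x)))  # 28 -> 24 -> 12
            x = self.pool2(self.relu2(self.conv2(x)))  # 12 -> 8 -> 4
            x = torch.flatten(x, 1)                    # 16 * 4 * 4 = 256 features
            x = self.relu3(self.fc1(x))
            x = self.relu4(self.fc2(x))
            # log-probabilities, which is what F.nll_loss expects
            return F.log_softmax(self.fc3(x), dim=1)


    sketch = SketchModel()
    out = sketch(torch.rand(3, 1, 28, 28))
    print(out.shape)

The flattened size of 256 explains ``in_features=256`` on ``fc1``: two 5x5 convolutions and two 2x2 poolings reduce a 28x28 image to 16 feature maps of 4x4.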


.. GENERATED FROM PYTHON SOURCE LINES 41-52

.. code-block:: default


    # define the optimizer and criterion for pre-training

    optimizer = SGD(model.parameters(), 1e-2)
    criterion = F.nll_loss

    # pre-train and evaluate the model on MNIST dataset
    for epoch in range(3):
        trainer(model, optimizer, criterion)
        evaluator(model)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Average test loss: 0.4925, Accuracy: 8414/10000 (84%)
    Average test loss: 0.2626, Accuracy: 9214/10000 (92%)
    Average test loss: 0.2006, Accuracy: 9369/10000 (94%)


.. GENERATED FROM PYTHON SOURCE LINES 53-63

Pruning Model
-------------

Use ``L1NormPruner`` to prune the model and generate the masks.
A pruner usually requires the original model and a ``config_list`` as its inputs.
For details about how to write a ``config_list``, please refer to :doc:`compression config specification <../compression/compression_config_list>`.

The following ``config_list`` means that all layers whose type is ``Linear`` or ``Conv2d`` will be pruned,
except the layer named ``fc3``, because ``fc3`` is marked ``exclude``.
The final sparsity ratio for each pruned layer is 50%.

.. GENERATED FROM PYTHON SOURCE LINES 63-72

.. code-block:: default


    config_list = [{
        'sparsity_per_layer': 0.5,
        'op_types': ['Linear', 'Conv2d']
    }, {
        'exclude': True,
        'op_names': ['fc3']
    }]
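
To make the selection rule concrete, the sketch below resolves which layers such a ``config_list`` would pick.
It follows a simplified reading of the rule described above (select by ``op_types``, then drop anything matched by an ``exclude`` entry).
The helper ``select_layers`` and the stand-in ``toy_model`` are hypothetical and are not part of NNI; the real resolution logic may differ.

.. code-block:: python

    import torch.nn as nn

    config_list = [{
        'sparsity_per_layer': 0.5,
        'op_types': ['Linear', 'Conv2d']
    }, {
        'exclude': True,
        'op_names': ['fc3']
    }]

    # Stand-in with the same layer names and types as the tutorial model (assumed).
    toy_model = nn.ModuleDict({
        'conv1': nn.Conv2d(1, 6, 5),
        'conv2': nn.Conv2d(6, 16, 5),
        'fc1': nn.Linear(256, 120),
        'fc2': nn.Linear(120, 84),
        'fc3': nn.Linear(84, 10),
        'relu1': nn.ReLU(),
    })


    def select_layers(model, config_list):
        """Hypothetical helper: map each selected layer name to its target sparsity."""
        selected = {}
        for config in config_list:
            for name, module in model.named_modules():
                if name == '':
                    continue  # skip the root module itself
                type_ok = 'op_types' not in config or type(module).__name__ in config['op_types']
                name_ok = 'op_names' not in config or name in config['op_names']
                if not (type_ok and name_ok):
                    continue
                if config.get('exclude'):
                    selected.pop(name, None)  # an exclude entry removes the layer
                else:
                    selected[name] = config['sparsity_per_layer']
        return selected


    print(select_layers(toy_model, config_list))

Under these assumptions, ``conv1``, ``conv2``, ``fc1``, and ``fc2`` are selected with sparsity 0.5, while ``fc3`` and the activation layer are left out.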


.. GENERATED FROM PYTHON SOURCE LINES 73-74

Pruners usually require ``model`` and ``config_list`` as input arguments.

.. GENERATED FROM PYTHON SOURCE LINES 74-81

.. code-block:: default


    from nni.compression.pytorch.pruning import L1NormPruner
    pruner = L1NormPruner(model, config_list)

    # show the wrapped model structure; `PrunerModuleWrapper` has wrapped the layers configured in the config_list.
    print(model)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): PrunerModuleWrapper(
        (module): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      )
      (conv2): PrunerModuleWrapper(
        (module): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      )
      (fc1): PrunerModuleWrapper(
        (module): Linear(in_features=256, out_features=120, bias=True)
      )
      (fc2): PrunerModuleWrapper(
        (module): Linear(in_features=120, out_features=84, bias=True)
      )
      (fc3): Linear(in_features=84, out_features=10, bias=True)
      (relu1): ReLU()
      (relu2): ReLU()
      (relu3): ReLU()
      (relu4): ReLU()
      (pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      (pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    )


.. GENERATED FROM PYTHON SOURCE LINES 82-89

.. code-block:: default


    # compress the model and generate the masks
    _, masks = pruner.compress()
    # show the sparsity of each mask, i.e. the fraction of weights that are masked out (zeros in the mask)
    for name, mask in masks.items():
        print(name, ' sparsity : ', '{:.2}'.format(1 - mask['weight'].sum() / mask['weight'].numel()))


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    conv1  sparsity :  0.5
    conv2  sparsity :  0.5
    fc1  sparsity :  0.5
    fc2  sparsity :  0.5
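
The idea behind L1-norm filter pruning can be sketched in plain PyTorch: rank the output filters of a layer by the L1 norm of their weights and mask out the smallest ones.
The helper ``l1_filter_mask`` below is hypothetical and only illustrates the principle; it is not NNI's implementation and ignores details such as bias masks.

.. code-block:: python

    import torch
    import torch.nn as nn


    def l1_filter_mask(weight, sparsity):
        """Hypothetical helper: 0/1 mask that zeroes the output filters with the smallest L1 norm."""
        num_filters = weight.shape[0]
        num_pruned = int(num_filters * sparsity)
        # L1 norm of each output filter: sum of absolute values over all remaining dimensions.
        norms = weight.detach().abs().flatten(1).sum(dim=1)
        pruned = torch.argsort(norms)[:num_pruned]
        keep = torch.ones(num_filters)
        keep[pruned] = 0.0
        # Broadcast the per-filter decision over the full weight shape.
        shape = [num_filters] + [1] * (weight.dim() - 1)
        return keep.view(shape).expand_as(weight).clone()


    torch.manual_seed(0)
    conv = nn.Conv2d(6, 16, kernel_size=5)
    mask = l1_filter_mask(conv.weight, sparsity=0.5)
    kept_ratio = (mask.sum() / mask.numel()).item()
    print('kept ratio: {:.2}'.format(kept_ratio))

Because whole filters are masked, every filter in ``mask`` is either all ones or all zeros, which is what later allows the masked filters to be physically removed.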


.. GENERATED FROM PYTHON SOURCE LINES 90-93

Speed up the original model with the masks. Note that ``ModelSpeedup`` requires an unwrapped model.
The model becomes smaller after speedup,
and reaches a higher sparsity ratio because ``ModelSpeedup`` propagates the masks across layers.
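
Mask propagation can be illustrated with two small ``Linear`` layers in plain PyTorch.
When output units of one layer are masked, the matching input columns of the next layer only ever see zeros, so both can be removed without changing the result.
This is a hand-written sketch under that assumption, with hypothetical names, and is not how ``ModelSpeedup`` is implemented internally.

.. code-block:: python

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    fc_a = nn.Linear(8, 4)
    fc_b = nn.Linear(4, 2)

    # Assume a pruner masked output units 1 and 3 of fc_a (weights and biases).
    keep = torch.tensor([0, 2])
    with torch.no_grad():
        masked_a = nn.Linear(8, 4)
        masked_a.load_state_dict(fc_a.state_dict())
        masked_a.weight[[1, 3]] = 0.0
        masked_a.bias[[1, 3]] = 0.0

        # Dense replacement for fc_a: keep only rows 0 and 2.
        small_a = nn.Linear(8, 2)
        small_a.weight.copy_(fc_a.weight[keep])
        small_a.bias.copy_(fc_a.bias[keep])

        # Propagated change: fc_b was never masked, but its input
        # columns 1 and 3 always receive zeros, so they can be dropped.
        small_b = nn.Linear(2, 2)
        small_b.weight.copy_(fc_b.weight[:, keep])
        small_b.bias.copy_(fc_b.bias)

    x = torch.rand(5, 8)
    masked_out = fc_b(torch.relu(masked_a(x)))
    small_out = small_b(torch.relu(small_a(x)))
    print(torch.allclose(masked_out, small_out, atol=1e-6))

The same effect is why ``fc3``, although excluded from pruning, ends up with fewer input features in the sped-up model printed further down.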

.. GENERATED FROM PYTHON SOURCE LINES 93-102

.. code-block:: default


    # the model must be unwrapped before speedup if it has been wrapped by a pruner
    pruner._unwrap_model()

    # speed up the model; for more information about speedup, please refer to :doc:`pruning_speedup`.
    from nni.compression.pytorch.speedup import ModelSpeedup

    ModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    aten::log_softmax is not Supported! Please report an issue at https://github.com/microsoft/nni. Thanks~
    Note: .aten::log_softmax.12 does not have corresponding mask inference object
    /home/ningshang/anaconda3/envs/nni-dev/lib/python3.8/site-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at  aten/src/ATen/core/TensorBody.h:417.)
      return self._grad


.. GENERATED FROM PYTHON SOURCE LINES 103-104

The model becomes physically smaller after speedup: the masked layers are replaced by smaller dense ones.

.. GENERATED FROM PYTHON SOURCE LINES 104-106

.. code-block:: default

    print(model)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): Conv2d(1, 3, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=128, out_features=60, bias=True)
      (fc2): Linear(in_features=60, out_features=42, bias=True)
      (fc3): Linear(in_features=42, out_features=10, bias=True)
      (relu1): ReLU()
      (relu2): ReLU()
      (relu3): ReLU()
      (relu4): ReLU()
      (pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      (pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    )
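
The size reduction can be checked with simple arithmetic on the layer shapes printed before and after speedup.
The helper ``count_params`` below is a hypothetical convenience function; each tuple holds the input channels or features, the output channels or features, and the number of kernel elements (25 for a 5x5 convolution, 1 for a linear layer).

.. code-block:: python

    def count_params(layers):
        """Sum weights and biases; each entry is (fan_in, fan_out, kernel_elems)."""
        return sum(fan_in * fan_out * k + fan_out for fan_in, fan_out, k in layers)


    # conv1, conv2, fc1, fc2, fc3 as printed before and after speedup
    layers_before = [(1, 6, 25), (6, 16, 25), (256, 120, 1), (120, 84, 1), (84, 10, 1)]
    layers_after = [(1, 3, 25), (3, 8, 25), (128, 60, 1), (60, 42, 1), (42, 10, 1)]

    n_before = count_params(layers_before)
    n_after = count_params(layers_after)
    print(n_before, n_after, round(1 - n_after / n_before, 3))

The count drops from 44426 to 11418 parameters, a reduction of roughly 74%.
This exceeds the 50% configured per layer because halving a layer's outputs also halves the inputs of the layer that follows it, including the excluded ``fc3``.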


.. GENERATED FROM PYTHON SOURCE LINES 107-111

Fine-tuning Compacted Model
---------------------------
Note that if the model has been sped up, you need to initialize a new optimizer for fine-tuning,
because speedup replaces the large masked layers with small dense ones.
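
The reason is that an optimizer keeps references to the parameter tensors it was constructed with.
The sketch below mimics speedup by swapping layers of a small network by hand and shows that the old optimizer no longer updates anything in the new model.
The network and variable names are hypothetical, and the manual layer swap only stands in for what speedup does.

.. code-block:: python

    import torch
    import torch.nn as nn
    from torch.optim import SGD

    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(4, 4), nn.Tanh(), nn.Linear(4, 2))
    stale_optimizer = SGD(net.parameters(), 1e-2)

    # Mimic speedup: replace the layers with smaller dense ones.
    net[0] = nn.Linear(4, 2)
    net[2] = nn.Linear(2, 2)

    # The stale optimizer still points at the old, discarded parameters.
    new_ids = {id(p) for p in net.parameters()}
    stale_ids = {id(p) for group in stale_optimizer.param_groups for p in group['params']}
    print('stale optimizer tracks new parameters:', bool(new_ids & stale_ids))

    # A step with the stale optimizer leaves the new weights untouched.
    weight_before = net[0].weight.detach().clone()
    net(torch.rand(8, 4)).sum().backward()
    stale_optimizer.step()
    unchanged = torch.equal(weight_before, net[0].weight)

    # A freshly created optimizer sees the new parameters and updates them.
    fresh_optimizer = SGD(net.parameters(), 1e-2)
    fresh_optimizer.step()
    changed = not torch.equal(weight_before, net[0].weight)
    print(unchanged, changed)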

.. GENERATED FROM PYTHON SOURCE LINES 111-115

.. code-block:: default


    optimizer = SGD(model.parameters(), 1e-2)
    for epoch in range(3):
        trainer(model, optimizer, criterion)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 1 minutes  30.730 seconds)


.. _sphx_glr_download_tutorials_pruning_quick_start_mnist.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: pruning_quick_start_mnist.py <pruning_quick_start_mnist.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: pruning_quick_start_mnist.ipynb <pruning_quick_start_mnist.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_