pruning_quick_start_mnist.rst


.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/pruning_quick_start_mnist.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_tutorials_pruning_quick_start_mnist.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_pruning_quick_start_mnist.py:


Pruning Quickstart
==================

Model pruning is a technique to reduce the model size and computation by reducing model weight size or intermediate state size.
It usually has following paths:

#. Pre-training a model -> Pruning the model -> Fine-tuning the model
#. Pruning the model aware training -> Fine-tuning the model
#. Pruning the model -> Pre-training the compact model

NNI supports the above three modes and mainly focuses on the pruning stage.
Follow this tutorial for a quick look at how to use NNI to prune a model in a common practice.

.. GENERATED FROM PYTHON SOURCE LINES 17-22

Preparation
-----------

In this tutorial, we use a simple model and pre-train on MNIST dataset.
If you are familiar with defining a model and training in pytorch, you can skip directly to `Pruning Model`_.

.. GENERATED FROM PYTHON SOURCE LINES 22-35

.. code-block:: default


    import torch
    import torch.nn.functional as F
    from torch.optim import SGD

    from scripts.compression_mnist_model import TorchModel, trainer, evaluator, device

    # define the model
    model = TorchModel().to(device)

    # show the model structure, note that pruner will wrap the model layer.
    print(model)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=256, out_features=120, bias=True)
      (fc2): Linear(in_features=120, out_features=84, bias=True)
      (fc3): Linear(in_features=84, out_features=10, bias=True)
    )


.. GENERATED FROM PYTHON SOURCE LINES 36-47

.. code-block:: default


    # define the optimizer and criterion for pre-training

    optimizer = SGD(model.parameters(), 1e-2)
    criterion = F.nll_loss

    # pre-train and evaluate the model on MNIST dataset
    for epoch in range(3):
        trainer(model, optimizer, criterion)
        evaluator(model)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Average test loss: 0.5876, Accuracy: 8158/10000 (82%)
    Average test loss: 0.2501, Accuracy: 9217/10000 (92%)
    Average test loss: 0.1786, Accuracy: 9486/10000 (95%)


.. GENERATED FROM PYTHON SOURCE LINES 48-58

Pruning Model
-------------

Using L1NormPruner pruning the model and generating the masks.
Usually, pruners require original model and ``config_list`` as parameters.
Detailed about how to write ``config_list`` please refer :doc:`compression config specification <../compression/compression_config_list>`.

This `config_list` means all layers whose type is `Linear` or `Conv2d` will be pruned,
except the layer named `fc3`, because `fc3` is `exclude`.
The final sparsity ratio for each layer is 50%. The layer named `fc3` will not be pruned.

.. GENERATED FROM PYTHON SOURCE LINES 58-67

.. code-block:: default


    config_list = [{
        'sparsity_per_layer': 0.5,
        'op_types': ['Linear', 'Conv2d']
    }, {
        'exclude': True,
        'op_names': ['fc3']
    }]


.. GENERATED FROM PYTHON SOURCE LINES 68-69

Pruners usually require `model` and `config_list` as input arguments.

.. GENERATED FROM PYTHON SOURCE LINES 69-76

.. code-block:: default


    from nni.algorithms.compression.v2.pytorch.pruning import L1NormPruner
    pruner = L1NormPruner(model, config_list)

    # show the wrapped model structure, `PrunerModuleWrapper` have wrapped the layers that configured in the config_list.
    print(model)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): PrunerModuleWrapper(
        (module): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      )
      (conv2): PrunerModuleWrapper(
        (module): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      )
      (fc1): PrunerModuleWrapper(
        (module): Linear(in_features=256, out_features=120, bias=True)
      )
      (fc2): PrunerModuleWrapper(
        (module): Linear(in_features=120, out_features=84, bias=True)
      )
      (fc3): Linear(in_features=84, out_features=10, bias=True)
    )


.. GENERATED FROM PYTHON SOURCE LINES 77-84

.. code-block:: default


    # compress the model and generate the masks
    _, masks = pruner.compress()
    # show the masks sparsity
    for name, mask in masks.items():
        print(name, ' sparsity : ', '{:.2}'.format(mask['weight'].sum() / mask['weight'].numel()))


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    conv1  sparsity :  0.5
    conv2  sparsity :  0.5
    fc1  sparsity :  0.5
    fc2  sparsity :  0.5


.. GENERATED FROM PYTHON SOURCE LINES 85-88

Speed up the original model with masks, note that `ModelSpeedup` requires an unwrapped model.
The model becomes smaller after speed-up,
and reaches a higher sparsity ratio because `ModelSpeedup` will propagate the masks across layers.

.. GENERATED FROM PYTHON SOURCE LINES 88-97

.. code-block:: default


    # need to unwrap the model, if the model is wrapped before speed up
    pruner._unwrap_model()

    # speed up the model
    from nni.compression.pytorch.speedup import ModelSpeedup

    ModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    aten::log_softmax is not Supported! Please report an issue at https://github.com/microsoft/nni. Thanks~
    Note: .aten::log_softmax.12 does not have corresponding mask inference object
    /home/ningshang/anaconda3/envs/nni-dev/lib/python3.8/site-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at  aten/src/ATen/core/TensorBody.h:417.)
      return self._grad


.. GENERATED FROM PYTHON SOURCE LINES 98-99

the model will become real smaller after speed up

.. GENERATED FROM PYTHON SOURCE LINES 99-101

.. code-block:: default

    print(model)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): Conv2d(1, 3, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=128, out_features=60, bias=True)
      (fc2): Linear(in_features=60, out_features=42, bias=True)
      (fc3): Linear(in_features=42, out_features=10, bias=True)
    )


.. GENERATED FROM PYTHON SOURCE LINES 102-106

Fine-tuning Compacted Model
---------------------------
Note that if the model has been sped up, you need to re-initialize a new optimizer for fine-tuning.
Because speed up will replace the masked big layers with dense small ones.

.. GENERATED FROM PYTHON SOURCE LINES 106-110

.. code-block:: default


    optimizer = SGD(model.parameters(), 1e-2)
    for epoch in range(3):
        trainer(model, optimizer, criterion)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 1 minutes  33.096 seconds)


.. _sphx_glr_download_tutorials_pruning_quick_start_mnist.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: pruning_quick_start_mnist.py <pruning_quick_start_mnist.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: pruning_quick_start_mnist.ipynb <pruning_quick_start_mnist.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_