.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/pruning_quick_start_mnist.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_tutorials_pruning_quick_start_mnist.py>` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_pruning_quick_start_mnist.py:


Pruning Quickstart
==================

Model pruning is a technique to reduce model size and computation by reducing the size of model weights or intermediate states. There are three common practices for pruning a DNN model:

#. Pre-training a model -> Pruning the model -> Fine-tuning the pruned model
#. Pruning a model during training (i.e., pruning-aware training) -> Fine-tuning the pruned model
#. Pruning a model -> Training the pruned model from scratch

NNI supports all of the above pruning practices by working on the key pruning stage. Follow this tutorial for a quick look at how to use NNI to prune a model in a common practice.

.. GENERATED FROM PYTHON SOURCE LINES 17-22

Preparation
-----------

In this tutorial, we use a simple model pre-trained on the MNIST dataset. If you are familiar with defining a model and training it in PyTorch, you can skip directly to `Pruning Model`_.

.. GENERATED FROM PYTHON SOURCE LINES 22-35

.. code-block:: default


    import torch
    import torch.nn.functional as F
    from torch.optim import SGD

    from scripts.compression_mnist_model import TorchModel, trainer, evaluator, device

    # define the model
    model = TorchModel().to(device)

    # show the model structure; note that the pruner will wrap the model layers
    print(model)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=256, out_features=120, bias=True)
      (fc2): Linear(in_features=120, out_features=84, bias=True)
      (fc3): Linear(in_features=84, out_features=10, bias=True)
    )
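``TorchModel`` and the ``trainer``/``evaluator`` helpers are imported from a helper script, so their definitions are not shown in this tutorial. For reference, the printed structure above (together with the ``nll_loss`` criterion used below) suggests a LeNet-style network roughly like the following sketch; the pooling layout and activations here are assumptions, not the exact source of ``TorchModel``.

.. code-block:: python

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TorchModelSketch(nn.Module):
        """Approximation of the imported TorchModel, inferred from print(model)."""

        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 6, 5)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 4 * 4, 120)  # 256 input features, as printed above
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            # 28x28 -> 24x24 -> 12x12 -> 8x8 -> 4x4 (the pooling layout is an assumption)
            x = F.max_pool2d(F.relu(self.conv1(x)), 2)
            x = F.max_pool2d(F.relu(self.conv2(x)), 2)
            x = torch.flatten(x, 1)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            # log_softmax matches the F.nll_loss criterion used for pre-training
            return F.log_softmax(self.fc3(x), dim=1)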
.. GENERATED FROM PYTHON SOURCE LINES 36-47

.. code-block:: default


    # define the optimizer and criterion for pre-training
    optimizer = SGD(model.parameters(), 1e-2)
    criterion = F.nll_loss

    # pre-train and evaluate the model on the MNIST dataset
    for epoch in range(3):
        trainer(model, optimizer, criterion)
        evaluator(model)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Average test loss: 0.5822, Accuracy: 8311/10000 (83%)
    Average test loss: 0.2795, Accuracy: 9154/10000 (92%)
    Average test loss: 0.2036, Accuracy: 9345/10000 (93%)

.. GENERATED FROM PYTHON SOURCE LINES 48-58

Pruning Model
-------------

Use ``L1NormPruner`` to prune the model and generate the masks. Usually, a pruner requires the original model and a ``config_list`` as its inputs. For details about how to write a ``config_list``, please refer to :doc:`compression config specification <../compression/compression_config_list>`.

The following ``config_list`` means all layers whose type is ``Linear`` or ``Conv2d`` will be pruned to a final sparsity ratio of 50% per layer, except the layer named ``fc3``, which is marked ``exclude`` and will not be pruned.

.. GENERATED FROM PYTHON SOURCE LINES 58-67

.. code-block:: default


    config_list = [{
        'sparsity_per_layer': 0.5,
        'op_types': ['Linear', 'Conv2d']
    }, {
        'exclude': True,
        'op_names': ['fc3']
    }]

.. GENERATED FROM PYTHON SOURCE LINES 68-69

Pruners usually require ``model`` and ``config_list`` as input arguments.

.. GENERATED FROM PYTHON SOURCE LINES 69-76

.. code-block:: default


    from nni.compression.pytorch.pruning import L1NormPruner
    pruner = L1NormPruner(model, config_list)

    # show the wrapped model structure; `PrunerModuleWrapper` has wrapped the layers configured in the config_list
    print(model)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): PrunerModuleWrapper(
        (module): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      )
      (conv2): PrunerModuleWrapper(
        (module): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      )
      (fc1): PrunerModuleWrapper(
        (module): Linear(in_features=256, out_features=120, bias=True)
      )
      (fc2): PrunerModuleWrapper(
        (module): Linear(in_features=120, out_features=84, bias=True)
      )
      (fc3): Linear(in_features=84, out_features=10, bias=True)
    )

.. GENERATED FROM PYTHON SOURCE LINES 77-84

.. code-block:: default


    # compress the model and generate the masks
    _, masks = pruner.compress()
    # show the masks sparsity
    for name, mask in masks.items():
        print(name, ' sparsity : ', '{:.2}'.format(mask['weight'].sum() / mask['weight'].numel()))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    conv1  sparsity :  0.5
    conv2  sparsity :  0.5
    fc1  sparsity :  0.5
    fc2  sparsity :  0.5

.. GENERATED FROM PYTHON SOURCE LINES 85-88

Speed up the original model with the masks; note that ``ModelSpeedup`` requires an unwrapped model. The model becomes smaller after speedup and reaches a higher sparsity ratio, because ``ModelSpeedup`` propagates the masks across layers.

.. GENERATED FROM PYTHON SOURCE LINES 88-97

.. code-block:: default


    # need to unwrap the model before speedup, if the model has been wrapped
    pruner._unwrap_model()

    # speed up the model
    from nni.compression.pytorch.speedup import ModelSpeedup

    ModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    aten::log_softmax is not Supported! Please report an issue at https://github.com/microsoft/nni. Thanks~
    Note: .aten::log_softmax.12 does not have corresponding mask inference object
    /home/ningshang/anaconda3/envs/nni-dev/lib/python3.8/site-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at aten/src/ATen/core/TensorBody.h:417.)
      return self._grad

.. GENERATED FROM PYTHON SOURCE LINES 98-99

After speedup, the model really becomes smaller: the masked channels are physically removed.

.. GENERATED FROM PYTHON SOURCE LINES 99-101

.. code-block:: default


    print(model)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    TorchModel(
      (conv1): Conv2d(1, 3, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=128, out_features=60, bias=True)
      (fc2): Linear(in_features=60, out_features=42, bias=True)
      (fc3): Linear(in_features=42, out_features=10, bias=True)
    )
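As a quick sanity check (a minimal sketch, not part of the original tutorial), you can count the remaining parameters and run a dummy batch through the compacted model to confirm it still produces 10-class outputs. It only relies on the ``model`` and ``device`` defined above.

.. code-block:: python

    # count the parameters that remain after speedup
    num_params = sum(p.numel() for p in model.parameters())
    print(f'parameters after speedup: {num_params}')

    # run a dummy batch through the compacted model
    with torch.no_grad():
        out = model(torch.rand(3, 1, 28, 28).to(device))
    print(out.shape)  # expected: torch.Size([3, 10])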
.. GENERATED FROM PYTHON SOURCE LINES 102-106

Fine-tuning Compacted Model
---------------------------

Note that if the model has been sped up, you need to re-initialize a new optimizer for fine-tuning, because speedup replaces the masked big layers with dense small ones.

.. GENERATED FROM PYTHON SOURCE LINES 106-110

.. code-block:: default


    optimizer = SGD(model.parameters(), 1e-2)
    for epoch in range(3):
        trainer(model, optimizer, criterion)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 1 minutes 38.500 seconds)


.. _sphx_glr_download_tutorials_pruning_quick_start_mnist.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: pruning_quick_start_mnist.py <pruning_quick_start_mnist.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: pruning_quick_start_mnist.ipynb <pruning_quick_start_mnist.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_