{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Pruning Quickstart\n\nModel pruning is a technique to reduce the model size and computation by reducing model weight size or intermediate state size.\nThere are three common practices for pruning a DNN model:\n\n#. Pre-training a model -> Pruning the model -> Fine-tuning the pruned model\n#. Pruning a model during training (i.e., pruning aware training) -> Fine-tuning the pruned model\n#. Pruning a model -> Training the pruned model from scratch\n\nNNI supports all of the above pruning practices by working on the key pruning stage.\nFollowing this tutorial for a quick look at how to use NNI to prune a model in a common practice.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparation\n\nIn this tutorial, we use a simple model and pre-trained on MNIST dataset.\nIf you are familiar with defining a model and training in pytorch, you can skip directly to `Pruning Model`_.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\nimport torch.nn.functional as F\nfrom torch.optim import SGD\n\nfrom scripts.compression_mnist_model import TorchModel, trainer, evaluator, device\n\n# define the model\nmodel = TorchModel().to(device)\n\n# show the model structure, note that pruner will wrap the model layer.\nprint(model)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# define the optimizer and criterion for pre-training\n\noptimizer = SGD(model.parameters(), 1e-2)\ncriterion = F.nll_loss\n\n# pre-train and evaluate the model on MNIST dataset\nfor epoch in range(3):\n trainer(model, optimizer, criterion)\n evaluator(model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pruning Model\n\nUsing L1NormPruner to prune the model and generate the masks.\nUsually, a pruner requires original model and ``config_list`` as its inputs.\nDetailed about how to write ``config_list`` please refer :doc:`compression config specification <../compression/compression_config_list>`.\n\nThe following `config_list` means all layers whose type is `Linear` or `Conv2d` will be pruned,\nexcept the layer named `fc3`, because `fc3` is `exclude`.\nThe final sparsity ratio for each layer is 50%. The layer named `fc3` will not be pruned.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "config_list = [{\n 'sparsity_per_layer': 0.5,\n 'op_types': ['Linear', 'Conv2d']\n}, {\n 'exclude': True,\n 'op_names': ['fc3']\n}]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pruners usually require `model` and `config_list` as input arguments.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from nni.algorithms.compression.v2.pytorch.pruning import L1NormPruner\npruner = L1NormPruner(model, config_list)\n\n# show the wrapped model structure, `PrunerModuleWrapper` have wrapped the layers that configured in the config_list.\nprint(model)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# compress the model and generate the masks\n_, masks = pruner.compress()\n# show the masks sparsity\nfor name, mask in masks.items():\n print(name, ' sparsity : ', '{:.2}'.format(mask['weight'].sum() / mask['weight'].numel()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Speedup the original model with masks, note that `ModelSpeedup` requires an unwrapped model.\nThe model becomes smaller after speedup,\nand reaches a higher sparsity ratio because `ModelSpeedup` will propagate the masks across layers.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# need to unwrap the model, if the model is wrapped before speedup\npruner._unwrap_model()\n\n# speedup the model\nfrom nni.compression.pytorch.speedup import ModelSpeedup\n\nModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "the model will become real smaller after speedup\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "print(model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fine-tuning Compacted Model\nNote that if the model has been sped up, you need to re-initialize a new optimizer for fine-tuning.\nBecause speedup will replace the masked big layers with dense small ones.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "optimizer = SGD(model.parameters(), 1e-2)\nfor epoch in range(3):\n trainer(model, optimizer, criterion)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 0 }