Quick Start
===========

..  toctree::
    :hidden:

    Notebook Example <compression_pipeline_example>


Model compression usually consists of three stages: 1) pre-training a model, 2) compressing the model, 3) fine-tuning the compressed model. NNI focuses mainly on the second stage and provides very simple APIs for compressing a model. Follow this guide for a quick look at how easy it is to use NNI to compress a model.
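
The overall workflow looks roughly like the sketch below. It only shows how the three stages fit together; ``MyModel``, ``pretrain`` and ``finetune`` are placeholders for your own model and training code, and the pruner used here is introduced in the next section.

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   model = MyModel()
   pretrain(model)            # stage 1: pre-train the model with your usual training loop

   config_list = [{'sparsity': 0.5, 'op_types': ['default']}]
   pruner = LevelPruner(model, config_list)
   model = pruner.compress()  # stage 2: compress the model with NNI

   finetune(model)            # stage 3: fine-tune the compressed model
   pruner.export_model(model_path='model.pth', mask_path='mask.pth')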

A `compression pipeline example <./compression_pipeline_example.rst>`__ with a Jupyter notebook is also provided; the code is available :githublink:`here <examples/notebooks/compression_pipeline_example.ipynb>`.

Model Pruning
-------------

Here we use `level pruner <../Compression/Pruner.rst#level-pruner>`__ as an example to show the usage of pruning in NNI.

Step1. Write configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^

Write a configuration to specify the layers that you want to prune. The following configuration prunes all ops of the ``default`` type to a sparsity of 0.5 while keeping the other layers unpruned.

.. code-block:: python

   config_list = [{
       'sparsity': 0.5,
       'op_types': ['default'],
   }]

The specification of the configuration format can be found `here <./Tutorial.rst#specify-the-configuration>`__. Note that different pruners may define their own fields in the configuration; please refer to each pruner's `usage <./Pruner.rst>`__ for details and adjust the configuration accordingly.
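
For example, a filter-level pruner such as ``L1FilterPruner`` works on convolution layers, so its configuration targets the ``Conv2d`` op type instead of ``default``. A minimal sketch (check the pruner's documentation for the exact fields it supports):

.. code-block:: python

   config_list = [{
       'sparsity': 0.5,
       'op_types': ['Conv2d'],
   }]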

Step2. Choose a pruner and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First instantiate the chosen pruner with your model and configuration as arguments, then invoke ``compress()`` to compress your model. Note that some algorithms may need gradient information for compression, so you may also have to define a trainer, an optimizer, and a criterion and pass them to the pruner.

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   pruner = LevelPruner(model, config_list)
   model = pruner.compress()

Some pruners (e.g., L1FilterPruner, FPGMPruner) prune the model once, while others (e.g., AGPPruner) prune it iteratively, adjusting the masks epoch by epoch during training.

So if a pruner prunes your model iteratively, or needs training or inference to obtain gradients, you need to pass the fine-tuning logic to the pruner.

For example:

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import AGPPruner

   pruner = AGPPruner(model, config_list, optimizer, trainer, criterion,
                      num_iterations=10, epochs_per_iteration=1, pruning_algorithm='level')
   model = pruner.compress()
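
In the example above, ``optimizer`` and ``criterion`` are ordinary PyTorch objects, and ``trainer`` is a callable that runs one training epoch; the pruner invokes it during iterative pruning. A minimal sketch of what they might look like, assuming a ``train_loader`` is already defined and that the pruner calls ``trainer(model, optimizer, criterion, epoch)`` as described in its documentation:

.. code-block:: python

   import torch
   import torch.nn.functional as F

   optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
   criterion = F.nll_loss

   def trainer(model, optimizer, criterion, epoch):
       # one epoch of ordinary training; the pruner calls this between pruning iterations
       model.train()
       for data, target in train_loader:
           optimizer.zero_grad()
           loss = criterion(model(data), target)
           loss.backward()
           optimizer.step()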

Step3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training, you can export the model weights to a file, and the generated masks to a file as well. Exporting an ONNX model is also supported.

.. code-block:: python

   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')
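
To also export an ONNX model, pass ``onnx_path`` together with an input shape and a device so the model can be traced. A sketch (the input shape and device below are illustrative; see the API reference for the exact parameters):

.. code-block:: python

   import torch

   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth',
                       onnx_path='pruned_vgg19_cifar10.onnx', input_shape=[1, 3, 32, 32],
                       device=torch.device('cpu'))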

Please refer to the :githublink:`mnist example <examples/model_compress/pruning/naive_prune_torch.py>` for example code.

More examples of pruning algorithms can be found in :githublink:`basic_pruners_torch <examples/model_compress/pruning/basic_pruners_torch.py>` and :githublink:`auto_pruners_torch <examples/model_compress/pruning/auto_pruners_torch.py>`.


Model Quantization
------------------

Here we use the `QAT Quantizer <../Compression/Quantizer.rst#qat-quantizer>`__ as an example to show the usage of quantization in NNI.

Step1. Write configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   config_list = [{
       'quant_types': ['weight', 'input'],
       'quant_bits': {
           'weight': 8,
           'input': 8,
       }, # when all `quant_types` share the same bit width, you can simply use an `int` here; see the config for `ReLU6` below
       'op_types': ['Conv2d', 'Linear'],
       'quant_dtype': 'int',
       'quant_scheme': 'per_channel_symmetric'
   }, {
       'quant_types': ['output'],
       'quant_bits': 8,
       'quant_start_step': 7000,
       'op_types': ['ReLU6'],
       'quant_dtype': 'uint',
       'quant_scheme': 'per_tensor_affine'
   }]

The specification of the quantization configuration can be found `here <./Tutorial.rst#quantization-specific-keys>`__.

Step2. Choose a quantizer and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

   quantizer = QAT_Quantizer(model, config_list)
   quantizer.compress()
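
``compress()`` wraps the specified layers with simulated quantization, and the quantization parameters are learned while the model keeps training. So after calling ``compress()``, continue training (fine-tuning) the model as usual. A minimal sketch, assuming ``train_loader``, ``optimizer`` and ``criterion`` are already defined:

.. code-block:: python

   model.train()
   for epoch in range(10):
       for data, target in train_loader:
           optimizer.zero_grad()
           loss = criterion(model(data), target)
           loss.backward()
           optimizer.step()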


Step3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training and calibration, you can export the model weights to a file, and the generated calibration parameters to a file as well. Exporting an ONNX model is also supported.

.. code-block:: python

   calibration_config = quantizer.export_model(model_path, calibration_path, onnx_path, input_shape, device)
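
The arguments above are placeholders; a concrete call might look like the following sketch (the paths, input shape and device are illustrative):

.. code-block:: python

   import torch

   calibration_config = quantizer.export_model(
       model_path='quantized_mnist.pth',
       calibration_path='calibration_mnist.pth',
       onnx_path='quantized_mnist.onnx',
       input_shape=[1, 1, 28, 28],
       device=torch.device('cuda'))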

Please refer to the :githublink:`mnist example <examples/model_compress/quantization/QAT_torch_quantizer.py>` for example code.

Congratulations! You've compressed your first model via NNI. For a more in-depth look at model compression in NNI, check out the `Tutorial <./Tutorial.rst>`__.