Quick Start
===========

..  toctree::
    :hidden:

    Notebook Example <compression_pipeline_example>


Model compression usually consists of three stages: 1) pre-training a model, 2) compressing the model, 3) fine-tuning the compressed model. NNI mainly focuses on the second stage and provides very simple APIs for compressing a model. Follow this guide for a quick look at how easy it is to use NNI to compress a model.

A `compression pipeline example <./compression_pipeline_example.rst>`__ is provided as a Jupyter notebook; the code can be found :githublink:`here <examples/notebooks/compression_pipeline_example.ipynb>`.

Model Pruning
-------------

Here we use `level pruner <../Compression/Pruner.rst#level-pruner>`__ as an example to show the usage of pruning in NNI.

Step1. Write configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^

Write a configuration to specify the layers that you want to prune. The following configuration prunes all ops of the ``default`` type to a sparsity of 0.5 while keeping other layers unpruned.

.. code-block:: python

   config_list = [{
       'sparsity': 0.5,
       'op_types': ['default'],
   }]

The specification of the configuration can be found `here <./Tutorial.rst#specify-the-configuration>`__. Note that different pruners may define their own fields in the configuration, for example ``start_epoch`` in the AGP pruner. Please refer to each pruner's `usage <./Pruner.rst>`__ for details and adjust the configuration accordingly.
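
Multiple entries can also be combined to treat different layers differently. The sketch below is illustrative (the layer names ``conv1`` and ``fc`` and the sparsity values are hypothetical, not from a real model); in NNI, later entries take precedence for the ops they match, and ``exclude`` keeps matched ops unpruned:

.. code-block:: python

   config_list = [{
       'sparsity': 0.8,
       'op_types': ['Conv2d'],   # prune every Conv2d layer to 80% sparsity...
   }, {
       'sparsity': 0.4,
       'op_names': ['conv1'],    # ...except 'conv1', which is pruned more gently...
   }, {
       'exclude': True,
       'op_names': ['fc'],       # ...and 'fc', which is left unpruned
   }]
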

Step2. Choose a pruner and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First, instantiate the chosen pruner with your model and configuration as arguments, then invoke ``compress()`` to compress your model. Note that some algorithms check gradients during compression, so you may also need to define a trainer, an optimizer, and a criterion, and pass them to the pruner.

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   pruner = LevelPruner(model, config_list)
   model = pruner.compress()

Some pruners (e.g., L1FilterPruner, FPGMPruner) prune the model once, while others (e.g., AGPPruner) prune it iteratively, adjusting the masks epoch by epoch during training.

So if a pruner prunes your model iteratively, or needs training or inference to obtain gradients, you need to pass the fine-tuning logic to it.

For example:

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import AGPPruner

   pruner = AGPPruner(model, config_list, optimizer, trainer, criterion,
                      num_iterations=10, epochs_per_iteration=1,
                      pruning_algorithm='level')
   model = pruner.compress()
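
The ``trainer``, ``optimizer``, and ``criterion`` above are supplied by you, not by NNI. A minimal sketch of what they might look like for a PyTorch classifier (the toy model and data below are placeholders so the sketch is self-contained; the ``(model, optimizer, criterion, epoch)`` signature is what the pruner invokes each epoch):

.. code-block:: python

   import torch
   from torch.utils.data import DataLoader, TensorDataset

   # Toy stand-ins so the sketch runs; replace with your real model and data.
   model = torch.nn.Linear(4, 2)
   train_loader = DataLoader(TensorDataset(torch.randn(8, 4),
                                           torch.randint(0, 2, (8,))),
                             batch_size=4)

   criterion = torch.nn.CrossEntropyLoss()
   optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

   def trainer(model, optimizer, criterion, epoch):
       # One epoch of ordinary training; the pruner calls this between pruning
       # iterations so the remaining weights can recover accuracy.
       model.train()
       for data, target in train_loader:
           optimizer.zero_grad()
           loss = criterion(model(data), target)
           loss.backward()
           optimizer.step()
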

Step3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training, you can export the model weights to a file, and the generated masks to a file as well. Exporting an ONNX model is also supported.

.. code-block:: python

   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')
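
As a quick sanity check, the exported mask file can be reloaded with ``torch.load`` and inspected. The helper below is a sketch that assumes the mask file maps layer names to dicts of binary tensors (the ``demo`` dict stands in for a real loaded mask):

.. code-block:: python

   import torch

   def mask_sparsity(masks):
       # Fraction of zeroed weights per layer, computed from the binary masks.
       return {name: 1.0 - m['weight'].sum().item() / m['weight'].numel()
               for name, m in masks.items()}

   # e.g. masks = torch.load('mask_vgg19_cifar10.pth')
   demo = {'conv1': {'weight': torch.tensor([[1., 0.], [0., 1.]])}}
   print(mask_sparsity(demo))  # {'conv1': 0.5}
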

Please refer to the :githublink:`MNIST example <examples/model_compress/pruning/naive_prune_torch.py>` for example code.

More examples of pruning algorithms can be found in :githublink:`basic_pruners_torch <examples/model_compress/pruning/basic_pruners_torch.py>` and :githublink:`auto_pruners_torch <examples/model_compress/pruning/auto_pruners_torch.py>`.


Model Quantization
------------------

Here we use the `QAT Quantizer <../Compression/Quantizer.rst#qat-quantizer>`__ as an example to show the usage of quantization in NNI.

Step1. Write configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   config_list = [{
       'quant_types': ['weight'],
       'quant_bits': {
           'weight': 8,
       }, # you can also just use an `int` here, since all `quant_types` share the same bit width; see the `ReLU6` config below.
       'op_types':['Conv2d', 'Linear'],
       'quant_dtype': 'int',
       'quant_scheme': 'per_channel_symmetric'
   }, {
       'quant_types': ['output'],
       'quant_bits': 8,
       'quant_start_step': 7000,
       'op_types':['ReLU6'],
       'quant_dtype': 'uint',
       'quant_scheme': 'per_tensor_affine'
   }]
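
As the comment in the first entry notes, when every type in ``quant_types`` uses the same bit width, ``quant_bits`` can be given as a plain ``int``. The first entry could equivalently be written as (a sketch; values unchanged):

.. code-block:: python

   config_list = [{
       'quant_types': ['weight'],
       'quant_bits': 8,  # shorthand for {'weight': 8}
       'op_types': ['Conv2d', 'Linear'],
       'quant_dtype': 'int',
       'quant_scheme': 'per_channel_symmetric'
   }]
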

The specification of configuration can be found `here <./Tutorial.rst#quantization-specific-keys>`__.

Step2. Choose a quantizer and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

   quantizer = QAT_Quantizer(model, config_list)
   quantizer.compress()


Step3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training and calibration, you can export the model weights to a file, and the generated calibration parameters to a file as well. Exporting an ONNX model is also supported.

.. code-block:: python

   calibration_config = quantizer.export_model(model_path, calibration_path, onnx_path, input_shape, device)

Please refer to the :githublink:`MNIST example <examples/model_compress/quantization/QAT_torch_quantizer.py>` for example code.

Congratulations! You've compressed your first model via NNI. To go a bit more in depth about model compression in NNI, check out the `Tutorial <./Tutorial.rst>`__.