Quick Start
===========

..  toctree::
   :hidden:

   Notebook Example <compression_pipeline_example>


Model compression usually consists of three stages: 1) pre-training a model, 2) compressing the model, 3) fine-tuning the model. NNI mainly focuses on the second stage and provides very simple APIs for compressing a model. Follow this guide for a quick look at how easy it is to use NNI to compress a model.

.. A `compression pipeline example <./compression_pipeline_example.rst>`__ with a Jupyter notebook is provided; refer to the code :githublink:`here <examples/notebooks/compression_pipeline_example.ipynb>`.

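Putting the three stages together, a minimal end-to-end sketch might look like the following (a toy model and random tensors stand in for a real network, dataset, and training loop; ``LevelPruner`` and ``config_list`` are explained in the pruning section below):

.. code-block:: python

   import torch
   import torch.nn as nn
   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   # Toy model and random data, used only to illustrate the workflow.
   model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
   optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
   criterion = nn.CrossEntropyLoss()
   data, target = torch.randn(128, 32), torch.randint(0, 10, (128,))

   def train_one_epoch():
       optimizer.zero_grad()
       loss = criterion(model(data), target)
       loss.backward()
       optimizer.step()

   train_one_epoch()                           # 1) pre-train the model

   config_list = [{'sparsity': 0.5, 'op_types': ['default']}]
   pruner = LevelPruner(model, config_list)
   model = pruner.compress()                   # 2) compress the model with NNI

   train_one_epoch()                           # 3) fine-tune the compressed model
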
Model Pruning
-------------

Here we use `level pruner <../Compression/Pruner.rst#level-pruner>`__ as an example to show the usage of pruning in NNI.

Step 1. Write a configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Write a configuration to specify the layers that you want to prune. The following configuration prunes all ops of the ``default`` type to a sparsity of 0.5 while keeping other layers unpruned.

.. code-block:: python

   config_list = [{
       'sparsity': 0.5,
       'op_types': ['default'],
   }]

The specification of the configuration can be found `here <./Tutorial.rst#specify-the-configuration>`__. Note that different pruners may define their own fields in the configuration; please refer to each pruner's `usage <./Pruner.rst>`__ for details and adjust the configuration accordingly.

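For example, a filter-level pruner such as L1FilterPruner removes whole convolution filters, so its configuration typically targets ``Conv2d`` ops instead of ``default`` (a sketch only; check the pruner's documentation for the exact fields it supports):

.. code-block:: python

   # Hypothetical configuration for a filter-level pruner (e.g., L1FilterPruner):
   # prune 50% of the filters in every Conv2d layer.
   config_list = [{
       'sparsity': 0.5,
       'op_types': ['Conv2d'],
   }]
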
Step 2. Choose a pruner and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First instantiate the chosen pruner with your model and the configuration as arguments, then invoke ``compress()`` to compress your model. Note that some algorithms may need gradients for compression, so you may also need to define a trainer, an optimizer, and a criterion and pass them to the pruner.

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   pruner = LevelPruner(model, config_list)
   model = pruner.compress()

Some pruners (e.g., L1FilterPruner, FPGMPruner) prune the model once, while others (e.g., AGPPruner) prune it iteratively: the masks are adjusted epoch by epoch during training.

So if a pruner prunes your model iteratively, or needs training or inference to obtain gradients, you need to pass the fine-tuning logic to the pruner.

For example:

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import AGPPruner

   pruner = AGPPruner(model, config_list, optimizer, trainer, criterion, num_iterations=10, epochs_per_iteration=1, pruning_algorithm='level')
   model = pruner.compress()

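The ``trainer`` passed above is an ordinary training function; in the NNI pruning examples it takes the model, the optimizer, the criterion, and the current epoch. A minimal sketch (``train_loader`` and ``device`` are placeholders assumed to be defined by your training script):

.. code-block:: python

   # Sketch of a trainer callable for iterative pruners such as AGPPruner.
   def trainer(model, optimizer, criterion, epoch):
       model.train()
       for data, target in train_loader:
           data, target = data.to(device), target.to(device)
           optimizer.zero_grad()
           loss = criterion(model(data), target)
           loss.backward()
           optimizer.step()
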
Step 3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training, you can export the model weights to a file, and the generated masks to a file as well. Exporting the model in ONNX format is also supported.

.. code-block:: python

   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')

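The exported files can be loaded back with ``torch.load`` as regular PyTorch checkpoints (a sketch, assuming a model of the same architecture is available; ``new_model`` is a placeholder for such a model, and the mask file maps layer names to their binary masks):

.. code-block:: python

   import torch

   # Load the pruned weights back into a freshly constructed model of the
   # same architecture (`new_model` is a placeholder).
   new_model.load_state_dict(torch.load('pruned_vgg19_cifar10.pth'))

   # The masks can be inspected, e.g. to verify the achieved sparsity per layer.
   masks = torch.load('mask_vgg19_cifar10.pth')
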
Please refer to the :githublink:`mnist example <examples/model_compress/pruning/naive_prune_torch.py>` for example code.

More examples of pruning algorithms can be found in :githublink:`basic_pruners_torch <examples/model_compress/pruning/basic_pruners_torch.py>` and :githublink:`auto_pruners_torch <examples/model_compress/pruning/auto_pruners_torch.py>`.


Model Quantization
------------------

Here we use the `QAT Quantizer <../Compression/Quantizer.rst#qat-quantizer>`__ as an example to show the usage of quantization in NNI.

Step 1. Write a configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   config_list = [{
       'quant_types': ['weight', 'input'],
       'quant_bits': {
           'weight': 8,
           'input': 8,
       },  # you can just use an `int` here because all `quant_types` share the same bit length; see the config for `ReLU6` below.
       'op_types': ['Conv2d', 'Linear'],
       'quant_dtype': 'int',
       'quant_scheme': 'per_channel_symmetric'
   }, {
       'quant_types': ['output'],
       'quant_bits': 8,
       'quant_start_step': 7000,
       'op_types': ['ReLU6'],
       'quant_dtype': 'uint',
       'quant_scheme': 'per_tensor_affine'
   }]

The specification of the configuration can be found `here <./Tutorial.rst#quantization-specific-keys>`__.

Step 2. Choose a quantizer and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

   quantizer = QAT_Quantizer(model, config_list)
   quantizer.compress()


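QAT (quantization-aware training) simulates quantization during training, so after calling ``compress()`` you keep training the model as usual so the quantization parameters can be calibrated before exporting. A minimal sketch (``train_loader``, ``optimizer``, ``criterion``, and ``device`` are placeholders from your training script):

.. code-block:: python

   # Continue training the compressed model as usual (quantization-aware training).
   model.to(device)
   for epoch in range(10):
       model.train()
       for data, target in train_loader:
           data, target = data.to(device), target.to(device)
           optimizer.zero_grad()
           loss = criterion(model(data), target)
           loss.backward()
           optimizer.step()
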
Step 3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training and calibration, you can export the model weights to a file, and the generated calibration parameters to a file as well. Exporting the model in ONNX format is also supported.

.. code-block:: python

   calibration_config = quantizer.export_model(model_path, calibration_path, onnx_path, input_shape, device)

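The arguments above are placeholders; a hypothetical call for an MNIST-sized model might look like the following (the file names and input shape are illustrative only):

.. code-block:: python

   import torch

   # Illustrative file names and input shape; adjust them to your own model.
   model_path = 'mnist_model.pth'
   calibration_path = 'mnist_calibration.pth'
   onnx_path = 'mnist_model.onnx'
   input_shape = (1, 1, 28, 28)
   device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

   calibration_config = quantizer.export_model(model_path, calibration_path, onnx_path, input_shape, device)
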
Please refer to the :githublink:`mnist example <examples/model_compress/quantization/QAT_torch_quantizer.py>` for example code.

Congratulations! You've compressed your first model via NNI. To go a bit more in depth about model compression in NNI, check out the `Tutorial <./Tutorial.rst>`__.