Quick Start
===========

..  toctree::
    :hidden:

    Tutorial <Tutorial>


Model compression usually consists of three stages: 1) pre-training a model, 2) compressing the model, 3) fine-tuning the compressed model. NNI focuses mainly on the second stage and provides very simple APIs for compressing a model. Follow this guide for a quick look at how easy it is to use NNI to compress a model.

Model Pruning
-------------

Here we use `level pruner <../Compression/Pruner.rst#level-pruner>`__ as an example to show the usage of pruning in NNI.

Step1. Write configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^

Write a configuration to specify the layers that you want to prune. The following configuration prunes all ops of the ``default`` type to a sparsity of 0.5 while leaving other layers unpruned.

.. code-block:: python

   config_list = [{
       'sparsity': 0.5,
       'op_types': ['default'],
   }]
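
To see what this configuration asks for, note that the level pruner masks the smallest-magnitude weights until the requested sparsity is reached. The following standalone sketch (plain Python, independent of NNI; the helper name ``level_prune_mask`` is ours, not an NNI API) illustrates the idea on a flat list of weights:

.. code-block:: python

   def level_prune_mask(weights, sparsity):
       """Return a 0/1 mask that zeroes the `sparsity` fraction of
       smallest-magnitude weights, keeping the rest."""
       n_prune = int(len(weights) * sparsity)
       # Indices sorted by ascending magnitude; the first n_prune get masked
       order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
       pruned = set(order[:n_prune])
       return [0 if i in pruned else 1 for i in range(len(weights))]

   weights = [0.9, -0.05, 0.4, -0.8, 0.02, 0.6]
   mask = level_prune_mask(weights, 0.5)
   # With sparsity 0.5, the three smallest-magnitude weights are masked out

NNI's pruners apply such masks per layer and keep them updated for you; this sketch only shows the magnitude-based selection rule.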

The specification of the configuration format can be found `here <./Tutorial.rst#specify-the-configuration>`__. Note that different pruners may define their own configuration fields, for example ``start_epoch`` in the AGP pruner. Please refer to each pruner's `usage <./Pruner.rst>`__ for details, and adjust the configuration accordingly.
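
For instance, a configuration for the AGP pruner might look like the following sketch. The field names here are illustrative of AGP-style schedules; check the pruner reference for your NNI version for the exact schema it accepts:

.. code-block:: python

   # Hypothetical AGP-style configuration: sparsity ramps from 0.0 to 0.8
   # over epochs 0..10, with the mask updated every epoch.
   config_list = [{
       'initial_sparsity': 0.0,
       'final_sparsity': 0.8,
       'start_epoch': 0,
       'end_epoch': 10,
       'frequency': 1,
       'op_types': ['default'],
   }]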

Step2. Choose a pruner and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First instantiate the chosen pruner with your model and configuration as arguments, then invoke ``compress()`` to compress your model. Note that some algorithms inspect gradients during compression, so you may also need to define an optimizer and pass it to the pruner.

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   pruner = LevelPruner(model, config_list)
   model = pruner.compress()

Some pruners (e.g., L1FilterPruner, FPGMPruner) prune once, while others (e.g., AGPPruner) prune the model iteratively, adjusting the masks epoch by epoch during training.

Note that ``pruner.compress`` simply applies masks to the model weights; it does not include fine-tuning logic. If you want to fine-tune the compressed model, you need to write the fine-tuning logic yourself after calling ``pruner.compress``.

For example:

.. code-block:: python

   for epoch in range(1, args.epochs + 1):
       pruner.update_epoch(epoch)
       train(args, model, device, train_loader, optimizer_finetune, epoch)
       test(model, device, test_loader)

More APIs to control the fine-tuning can be found `here <./Tutorial.rst#apis-to-control-the-fine-tuning>`__. 


Step3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training, you can export the model weights to a file, along with the generated masks. Exporting an ONNX model is also supported.

.. code-block:: python

   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')

Please refer to :githublink:`mnist example <examples/model_compress/pruning/naive_prune_torch.py>` for example code.

More examples of pruning algorithms can be found in :githublink:`basic_pruners_torch <examples/model_compress/pruning/basic_pruners_torch.py>` and :githublink:`auto_pruners_torch <examples/model_compress/pruning/auto_pruners_torch.py>`.


Model Quantization
------------------

Here we use the `QAT Quantizer <../Compression/Quantizer.rst#qat-quantizer>`__ as an example to show the usage of quantization in NNI.

Step1. Write configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   config_list = [{
       'quant_types': ['weight'],
       'quant_bits': {
           'weight': 8,
       }, # an `int` can also be used here when all `quant_types` share the same bit width; see the config for `ReLU6` below.
       'op_types':['Conv2d', 'Linear']
   }, {
       'quant_types': ['output'],
       'quant_bits': 8,
       'quant_start_step': 7000,
       'op_types':['ReLU6']
   }]

The specification of configuration can be found `here <./Tutorial.rst#quantization-specific-keys>`__.

Step2. Choose a quantizer and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

   quantizer = QAT_Quantizer(model, config_list)
   quantizer.compress()
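
QAT trains with "fake quantization": during the forward pass, values are rounded onto the low-precision grid so the network learns to tolerate the quantization error. A minimal standalone sketch of 8-bit affine quantize/dequantize (plain Python; this illustrates the concept, not NNI's internal implementation):

.. code-block:: python

   def fake_quantize(x, bits, x_min, x_max):
       """Simulate affine quantization: clamp x to [x_min, x_max],
       round it onto a (2**bits - 1)-step grid, and map it back."""
       levels = 2 ** bits - 1
       scale = (x_max - x_min) / levels
       x = min(max(x, x_min), x_max)          # clamp to the representable range
       q = round((x - x_min) / scale)         # integer grid index
       return q * scale + x_min               # dequantize back to float

   # An 8-bit grid over [-1, 1] has a step of 2/255, so 0.5 lands on
   # the nearest grid point rather than passing through exactly
   y = fake_quantize(0.5, 8, -1.0, 1.0)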


Step3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training and calibration, you can export the model weights to a file, along with the generated calibration parameters. Exporting an ONNX model is also supported.

.. code-block:: python

   calibration_config = quantizer.export_model(model_path, calibration_path, onnx_path, input_shape, device)

Please refer to :githublink:`mnist example <examples/model_compress/quantization/QAT_torch_quantizer.py>` for example code.

Congratulations! You've compressed your first model via NNI. To go a bit more in depth about model compression in NNI, check out the `Tutorial <./Tutorial.rst>`__.