"docs/vscode:/vscode.git/clone" did not exist on "566038de72f091488ca90c6a9c8959233f9adc9a"
QuickStart.rst 5.04 KB
Newer Older
colorjam's avatar
colorjam committed
1
2
Quick Start
===========

..  toctree::
    :hidden:

    Tutorial <Tutorial>


Model compression usually consists of three stages: 1) pre-training a model, 2) compressing the model, 3) fine-tuning the compressed model. NNI focuses mainly on the second stage and provides very simple APIs for compressing a model. Follow this guide for a quick look at how easy it is to use NNI to compress a model.

Model Pruning
-------------

Here we use the `level pruner <../Compression/Pruner.rst#level-pruner>`__ as an example to show the usage of pruning in NNI.

Step 1. Write configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Write a configuration to specify the layers that you want to prune. The following configuration means pruning all ops of the ``default`` type to a sparsity of 0.5 while keeping other layers unpruned.

.. code-block:: python

   config_list = [{
       'sparsity': 0.5,
       'op_types': ['default'],
   }]

The specification of the configuration can be found `here <./Tutorial.rst#specify-the-configuration>`__. Note that different pruners may define their own configuration fields, for example ``start_epoch`` in the AGP pruner. Please refer to each pruner's `usage <./Pruner.rst>`__ for details, and adjust the configuration accordingly.
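
As an illustration, a configuration for the AGP pruner might carry such scheduling fields; the field names below follow the AGP pruner's documented configuration, while the values are made up for this sketch:

```python
# Illustrative AGP pruner config: sparsity is raised gradually from
# `initial_sparsity` to `final_sparsity` between `start_epoch` and `end_epoch`.
config_list = [{
    'initial_sparsity': 0.0,
    'final_sparsity': 0.8,
    'start_epoch': 0,
    'end_epoch': 10,
    'frequency': 1,            # re-evaluate the masks every epoch
    'op_types': ['default'],
}]
```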

Step 2. Choose a pruner and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First, instantiate the chosen pruner with your model and the configuration as arguments, then invoke ``compress()`` to compress your model. Note that some algorithms may use gradients during compression, so we also define an optimizer and pass it to the pruner.

.. code-block:: python

   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   optimizer_finetune = torch.optim.SGD(model.parameters(), lr=0.01)
   pruner = LevelPruner(model, config_list, optimizer_finetune)
   model = pruner.compress()

Then you can train your model using a traditional training approach (e.g., SGD); pruning is applied transparently during training. Some pruners (e.g., L1FilterPruner, FPGMPruner) prune once at the beginning, so the subsequent training can be seen as fine-tuning. Other pruners (e.g., AGPPruner) prune your model iteratively; the masks are adjusted epoch by epoch during training.

Note that ``pruner.compress`` simply adds masks to the model weights; it does not include fine-tuning logic. If you want to fine-tune the compressed model, you need to write the fine-tuning logic yourself after calling ``pruner.compress``.

For example:

.. code-block:: python

   for epoch in range(1, args.epochs + 1):
       pruner.update_epoch(epoch)
       train(args, model, device, train_loader, optimizer_finetune, epoch)
       test(model, device, test_loader)

More APIs to control the fine-tuning can be found `here <./Tutorial.rst#apis-to-control-the-fine-tuning>`__. 


Step 3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After training, you can export the model weights to a file, and the generated masks to another file. Exporting an ONNX model is also supported.

.. code-block:: python

   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')
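
If you also want an ONNX model, ``export_model`` accepts an ``onnx_path`` together with an ``input_shape`` used to trace the model. A sketch, where the shape below is illustrative (a CIFAR-10-sized input):

```python
# Export weights, masks, and an ONNX model in one call; `input_shape` is the
# dummy input shape used for ONNX tracing (batch, channels, height, width).
pruner.export_model(model_path='pruned_vgg19_cifar10.pth',
                    mask_path='mask_vgg19_cifar10.pth',
                    onnx_path='pruned_vgg19_cifar10.onnx',
                    input_shape=[1, 3, 32, 32])
```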

Please refer to the :githublink:`mnist example <examples/model_compress/pruning/naive_prune_torch.py>` for example code.

More examples of pruning algorithms can be found in :githublink:`basic_pruners_torch <examples/model_compress/pruning/basic_pruners_torch.py>` and :githublink:`auto_pruners_torch <examples/model_compress/pruning/auto_pruners_torch.py>`.


Model Quantization
------------------

Here we use the `QAT Quantizer <../Compression/Quantizer.rst#qat-quantizer>`__ as an example to show the usage of quantization in NNI.

Step 1. Write configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   config_list = [{
       'quant_types': ['weight'],
       'quant_bits': {
           'weight': 8,
       }, # you can just use `int` here because all `quant_types` share the same bit length, see the config for `ReLU6` below
       'op_types': ['Conv2d', 'Linear']
   }, {
       'quant_types': ['output'],
       'quant_bits': 8,
       'quant_start_step': 7000,
       'op_types': ['ReLU6']
   }]

The specification of configuration can be found `here <./Tutorial.rst#quantization-specific-keys>`__.

Step 2. Choose a quantizer and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

   quantizer = QAT_Quantizer(model, config_list)
   quantizer.compress()
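
As with pruning, ``compress()`` only wraps the model with quantization operations; the quantization-aware training itself is the ordinary training loop you run afterwards. A minimal sketch, reusing the hypothetical ``train``/``test`` helpers and data loaders from the pruning example above:

```python
# Fine-tune after compress(): fake quantization is simulated during forward
# passes, and output quantization starts after `quant_start_step` steps.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for epoch in range(1, args.epochs + 1):
    train(args, model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)
```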


Step 3. Export compression result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can export the quantized model directly using the ``torch.save`` API, and the quantized model can be loaded by ``torch.load`` without any extra modification.

.. code-block:: python

   # Save quantized model which is generated by using NNI QAT algorithm
   torch.save(model.state_dict(), "quantized_model.pth")
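
Loading the weights back is plain PyTorch; ``VGG`` below stands in for whatever model class you actually trained:

```python
# Rebuild the model and load the quantized weights; no NNI import is needed.
model = VGG()  # hypothetical model class, replace with your own
model.load_state_dict(torch.load("quantized_model.pth"))
model.eval()
```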

Please refer to the :githublink:`mnist example <examples/model_compress/quantization/QAT_torch_quantizer.py>` for example code.

Congratulations! You've compressed your first model via NNI. To go a bit more in depth about model compression in NNI, check out the `Tutorial <./Tutorial.rst>`__.