.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/pruning_customize.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_tutorials_pruning_customize.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_pruning_customize.py:


Customize Basic Pruner
======================

Users can easily customize a basic pruner in NNI. A large number of basic modules are provided and can be reused.
By following the NNI pruning interface, users only need to focus on the creative parts of their algorithm,
without worrying about the other, more routine modules.

In this tutorial, we show how to customize a basic pruner.

Concepts
--------

NNI abstracts the basic pruning process into three steps: collecting data, calculating metrics, and allocating sparsity.
Most pruning algorithms rely on a metric to decide where to prune. Taking the L1 norm pruner as an example:
the first step collects the model weights, the second step calculates the L1 norm of the weight per output channel,
and the third step ranks the L1 norm metrics and masks the output channels with the smallest L1 norms.
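
To make these three steps concrete, here is a minimal sketch in plain PyTorch of what the L1 norm
pruner computes for a single ``Linear`` layer. It is independent of NNI's pruning tools, and the
variable names are for illustration only.

.. code-block:: default

    import torch

    # a hypothetical Linear weight with 4 output channels and 8 input features
    weight = torch.randn(4, 8)

    # step 1: collect data -- here, simply the weight tensor itself
    data = weight.data

    # step 2: calculate metrics -- the L1 norm per output channel, shape (4,)
    metric = data.abs().sum(dim=1)

    # step 3: allocate sparsity -- mask the half of the output channels with the smallest L1 norm
    threshold = torch.topk(metric, 2, largest=False)[0].max()
    channel_mask = torch.gt(metric, threshold).type_as(data)

    # expand the per-channel mask to the weight shape (4, 8)
    mask = channel_mask.unsqueeze(1).expand_as(data)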

In an NNI basic pruner, these three steps are implemented as ``DataCollector``, ``MetricsCalculator`` and ``SparsityAllocator``.

-   ``DataCollector``: This module takes the pruner as an initialization parameter.
    It gets the relevant information of the model from the pruner,
    and sometimes it also hooks the model to get the input, output or gradient of a layer or a tensor.
    It can also patch the optimizer if some special steps need to be executed before or after ``optimizer.step()``.

-   ``MetricsCalculator``: This module takes the data collected by the ``DataCollector``,
    then calculates the metrics. The metric shape is usually reduced from the data shape.
    The ``dim`` taken by ``MetricsCalculator`` specifies which dimensions will be kept after the metrics are calculated.
    For example, if the collected data shape is ``(10, 20, 30)`` and ``dim`` is ``1``, then dimension 1 will be kept
    and the output metric shape will be ``(20,)``.

-   ``SparsityAllocator``: This module takes the metrics and generates the masks.
    Different ``SparsityAllocator`` implementations have different mask generation strategies.
    A common and simple strategy is to sort the metric values, calculate a threshold according to the configured sparsity,
    and mask the positions whose metric values are smaller than the threshold.
    The ``dim`` taken by ``SparsityAllocator`` specifies which dimension of the weight the metrics correspond to;
    the mask will be expanded to the weight shape.
    For example, if the metric shape is ``(20,)``, the corresponding layer weight shape is ``(20, 40)``, and ``dim`` is ``0``,
    then ``SparsityAllocator`` will first generate a mask with shape ``(20,)``, then expand this mask to shape ``(20, 40)``
    (see the sketch after this list for both ``dim`` semantics).
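
Here is a minimal sketch in plain PyTorch of the ``dim`` semantics described above; it mirrors the
two examples but is not the NNI implementation itself.

.. code-block:: default

    import torch

    # MetricsCalculator side: reduce data of shape (10, 20, 30) with ``dim=1``,
    # i.e., keep dimension 1 and reduce over dimensions 0 and 2 -> metric shape (20,)
    data = torch.randn(10, 20, 30)
    metric = data.norm(p=1, dim=(0, 2))
    assert metric.shape == (20,)

    # SparsityAllocator side: a metric of shape (20,) with ``dim=0`` for a weight
    # of shape (20, 40) -> generate a (20,) mask, then expand it to (20, 40)
    mask_1d = torch.gt(metric, metric.median()).float()
    mask = mask_1d.unsqueeze(1).expand(20, 40)
    assert mask.shape == (20, 40)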

Simple Example: Customize a Block-L1NormPruner
----------------------------------------------

NNI already has an L1NormPruner, but in order to reproduce the original paper and reduce user configuration items,
it only supports pruning layer output channels. In this example, we will customize a pruner that supports block granularity for ``Linear`` layers.

Note that you don't need to implement all three kinds of tools every time;
NNI provides many predefined tools that you can use directly to customize your own pruner.
Because this is a tutorial, we show how to define all three kinds of pruning tools.

First, customize the pruning tools used by the pruner.

.. GENERATED FROM PYTHON SOURCE LINES 51-128

.. code-block:: default


    import torch
    from nni.algorithms.compression.v2.pytorch.pruning.basic_pruner import BasicPruner
    from nni.algorithms.compression.v2.pytorch.pruning.tools import (
        DataCollector,
        MetricsCalculator,
        SparsityAllocator
    )


    # This data collector collects the weights of the wrapped modules as data.
    # A wrapped module is a module configured in the pruner's config_list.
    # This implementation is similar to nni.algorithms.compression.v2.pytorch.pruning.tools.WeightDataCollector.
    class WeightDataCollector(DataCollector):
        def collect(self):
            data = {}
            # get_modules_wrapper returns all the wrappers in the compressor (pruner)
            # as a dict with the format {wrapper_name: wrapper};
            # use wrapper.module to get the wrapped module.
            for _, wrapper in self.compressor.get_modules_wrapper().items():
                data[wrapper.name] = wrapper.module.weight.data
            # return {wrapper_name: weight_data}
            return data


    class BlockNormMetricsCalculator(MetricsCalculator):
        def __init__(self, block_sparse_size):
            # Because we keep all dimensions with block granularity, we fix ``dim=None``,
            # which means all dimensions will be kept.
            super().__init__(dim=None, block_sparse_size=block_sparse_size)

        def calculate_metrics(self, data):
            data_length = len(self.block_sparse_size)
            reduce_unfold_dims = list(range(data_length, 2 * data_length))

            metrics = {}
            for name, t in data.items():
                # Unfold t according to the block size, then calculate the L1 norm of each block.
                for dim, size in enumerate(self.block_sparse_size):
                    t = t.unfold(dim, size, size)
                metrics[name] = t.norm(dim=reduce_unfold_dims, p=1)
            # return {wrapper_name: block_metric}
            return metrics


    # This implementation is similar to nni.algorithms.compression.v2.pytorch.pruning.tools.NormalSparsityAllocator.
    class BlockSparsityAllocator(SparsityAllocator):
        def __init__(self, pruner, block_sparse_size):
            super().__init__(pruner, dim=None, block_sparse_size=block_sparse_size, continuous_mask=True)

        def generate_sparsity(self, metrics):
            masks = {}
            for name, wrapper in self.pruner.get_modules_wrapper().items():
                # wrapper.config['total_sparsity'] gives the configured sparsity ratio for this wrapped module
                sparsity_rate = wrapper.config['total_sparsity']
                # get the metric for this wrapped module
                metric = metrics[name]
                # mask the metric with the old mask; if the masked positions never need to recover,
                # keeping this as-is is fine (recommended if you are new to NNI pruning)
                if self.continuous_mask:
                    metric *= self._compress_mask(wrapper.weight_mask)
                # convert sparsity ratio to prune number
                prune_num = int(sparsity_rate * metric.numel())
                # calculate the metric threshold
                threshold = torch.topk(metric.view(-1), prune_num, largest=False)[0].max()
                # generate the mask, keeping the positions whose metric values are greater than the threshold
                mask = torch.gt(metric, threshold).type_as(metric)
                # expand the mask to the weight size; a masked block is filled with zeros,
                # an unmasked block is filled with ones
                masks[name] = self._expand_mask(name, mask)
                # merge the new mask with the old mask; if the masked positions never need to recover,
                # keeping this as-is is fine (recommended if you are new to NNI pruning)
                if self.continuous_mask:
                    masks[name]['weight'] *= wrapper.weight_mask
            return masks
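
To see concretely what the unfold-then-norm trick in ``BlockNormMetricsCalculator`` produces,
here is a minimal standalone sketch with a concrete shape; the ``(8, 4)`` tensor is hypothetical
and not tied to any NNI module.

.. code-block:: default

    import torch

    # a hypothetical (8, 4) weight, unfolded into 2x2 blocks
    t = torch.randn(8, 4)
    t = t.unfold(0, 2, 2)  # shape (4, 4, 2)
    t = t.unfold(1, 2, 2)  # shape (4, 2, 2, 2)

    # dimensions 2 and 3 index inside each block, so reducing over them
    # yields one L1 norm per 2x2 block -> metric shape (4, 2)
    block_metric = t.norm(p=1, dim=[2, 3])
    assert block_metric.shape == (4, 2)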









.. GENERATED FROM PYTHON SOURCE LINES 129-130

Customize the pruner.

.. GENERATED FROM PYTHON SOURCE LINES 130-148

.. code-block:: default


    class BlockL1NormPruner(BasicPruner):
        def __init__(self, model, config_list, block_sparse_size):
            self.block_sparse_size = block_sparse_size
            super().__init__(model, config_list)

        # Implementing reset_tools is enough for this pruner.
        def reset_tools(self):
            if self.data_collector is None:
                self.data_collector = WeightDataCollector(self)
            else:
                self.data_collector.reset()
            if self.metrics_calculator is None:
                self.metrics_calculator = BlockNormMetricsCalculator(self.block_sparse_size)
            if self.sparsity_allocator is None:
                self.sparsity_allocator = BlockSparsityAllocator(self, self.block_sparse_size)
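
Implementing ``reset_tools`` is enough because the base class drives the three tools for us.
Roughly, ``BasicPruner.compress()`` chains them as follows (a simplified sketch of the control
flow, not NNI's exact source; ``compress_flow`` is a hypothetical name):

.. code-block:: default

    # simplified sketch of how BasicPruner.compress() chains the three tools
    def compress_flow(pruner):
        data = pruner.data_collector.collect()                        # {name: weight}
        metrics = pruner.metrics_calculator.calculate_metrics(data)   # {name: metric}
        masks = pruner.sparsity_allocator.generate_sparsity(metrics)  # {name: {'weight': mask}}
        return masks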









.. GENERATED FROM PYTHON SOURCE LINES 149-150

Try this pruner.

.. GENERATED FROM PYTHON SOURCE LINES 150-171

.. code-block:: default


    # Define a simple model.
    class TestModel(torch.nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.fc1 = torch.nn.Linear(4, 8)
            self.fc2 = torch.nn.Linear(8, 4)

        def forward(self, x):
            return self.fc2(self.fc1(x))

    model = TestModel()
    config_list = [{'op_types': ['Linear'], 'total_sparsity': 0.5}]
    # use 2x2 blocks
    _, masks = BlockL1NormPruner(model, config_list, [2, 2]).compress()

    # show the generated masks
    print('fc1 masks:\n', masks['fc1']['weight'])
    print('fc2 masks:\n', masks['fc2']['weight'])






.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    fc1 masks:
     tensor([[0., 0., 0., 0.],
            [0., 0., 0., 0.],
            [0., 0., 0., 0.],
            [0., 0., 0., 0.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.]])
    fc2 masks:
     tensor([[0., 0., 0., 0., 1., 1., 1., 1.],
            [0., 0., 0., 0., 1., 1., 1., 1.],
            [0., 0., 0., 0., 1., 1., 1., 1.],
            [0., 0., 0., 0., 1., 1., 1., 1.]])
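
As expected, half of each weight is masked at 2x2 block granularity: ``fc1.weight`` has shape
``(8, 4)``, i.e. 8 blocks of size 2x2, of which 4 are zeroed out; likewise for ``fc2.weight``
with shape ``(4, 8)``. A quick sanity check (a hypothetical follow-up, not part of the original
script) could verify the kept ratio:

.. code-block:: default

    # the mean of a 0/1 mask is the fraction of kept weights; 0.5 is expected here
    print(masks['fc1']['weight'].mean().item())  # 0.5
    print(masks['fc2']['weight'].mean().item())  # 0.5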




.. GENERATED FROM PYTHON SOURCE LINES 172-175

We have now successfully defined a new pruner with block pruning granularity!
Note that we did not include validation logic in this example, such as ``_validate_config_before_canonical``,
but for a robust implementation, we suggest you add such validation logic.
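
As a minimal sketch of what such validation could look like, assuming the hook receives the model
and the config list (check the ``BasicPruner`` source for the exact signature; NNI's built-in
pruners use schema-based validation here instead):

.. code-block:: default

    class ValidatedBlockL1NormPruner(BlockL1NormPruner):
        def _validate_config_before_canonical(self, model, config_list):
            # hypothetical, minimal checks for the keys this pruner relies on
            for config in config_list:
                assert 'total_sparsity' in config, 'total_sparsity must be set'
                assert 0 <= config['total_sparsity'] <= 1, 'total_sparsity must be in [0, 1]'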


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  1.175 seconds)


.. _sphx_glr_download_tutorials_pruning_customize.py:


.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example



  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: pruning_customize.py <pruning_customize.py>`



  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: pruning_customize.ipynb <pruning_customize.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_