"\n# Customize Basic Pruner\n\nUsers can easily customize a basic pruner in NNI. A large number of basic modules have been provided and can be reused.\nFollow the NNI pruning interface, users only need to focus on their creative parts without worrying about other regular modules.\n\nIn this tutorial, we show how to customize a basic pruner.\n\n## Concepts\n\nNNI abstracts the basic pruning process into three steps, collecting data, calculating metrics, allocating sparsity.\nMost pruning algorithms rely on a metric to decide where should be pruned. Using L1 norm pruner as an example,\nthe first step is collecting model weights, the second step is calculating L1 norm for weight per output channel,\nthe third step is ranking L1 norm metric and masking the output channels that have small L1 norm.\n\nIn NNI basic pruner, these three step is implement as ``DataCollector``, ``MetricsCalculator`` and ``SparsityAllocator``.\n\n- ``DataCollector``: This module take pruner as initialize parameter.\n It will get the relevant information of the model from the pruner,\n and sometimes it will also hook the model to get input, output or gradient of a layer or a tensor.\n It can also patch optimizer if some special steps need to be executed before or after ``optimizer.step()``.\n\n- ``MetricsCalculator``: This module will take the data collected from the ``DataCollector``,\n then calculate the metrics. The metric shape is usually reduced from the data shape.\n The ``dim`` taken by ``MetricsCalculator`` means which dimension will be kept after calculate metrics.\n i.e., the collected data shape is (10, 20, 30), and the ``dim`` is 1, then the dimension-1 will be kept,\n the output metrics shape should be (20,).\n\n- ``SparsityAllocator``: This module take the metrics and generate the masks.\n Different ``SparsityAllocator`` has different masks generation strategies.\n A common and simple strategy is sorting the metrics' values and calculating a threshold according to the configured sparsity,\n mask the positions which metric value smaller than the threshold.\n The ``dim`` taken by ``SparsityAllocator`` means the metrics are for which dimension, the mask will be expanded to weight shape.\n i.e., the metric shape is (20,), the corresponding layer weight shape is (20, 40), and the ``dim`` is 0.\n ``SparsityAllocator`` will first generate a mask with shape (20,), then expand this mask to shape (20, 40).\n\n## Simple Example: Customize a Block-L1NormPruner\n\nNNI already have L1NormPruner, but for the reason of reproducing the paper and reducing user configuration items,\nit only support pruning layer output channels. In this example, we will customize a pruner that supports block granularity for Linear.\n\nNote that you don't need to implement all these three kinds of tools for each time,\nNNI supports many predefined tools, and you can directly use these to customize your own pruner.\nThis is a tutorial so we show how to define all these three kinds of pruning tools.\n\nCustomize the pruning tools used by the pruner at first.\n"
]
},
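{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before implementing the tools, the following small sketch (plain ``torch``, independent of NNI) illustrates the ``dim`` semantics described in the Concepts section, using the same shapes as the examples above.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import torch\n\n# MetricsCalculator view: dim=1 keeps dimension 1 and reduces over all others,\n# so data of shape (10, 20, 30) yields a metric of shape (20,).\ndata = torch.rand(10, 20, 30)\nmetric = data.norm(p=1, dim=(0, 2))\nassert metric.shape == (20,)\n\n# SparsityAllocator view: with dim=0, a (20,)-shaped mask for a (20, 40) weight\n# is expanded so that each kept/masked channel is filled with ones/zeros.\nmask = (metric > metric.median()).float()\nweight_mask = mask.unsqueeze(1).expand(20, 40)\nassert weight_mask.shape == (20, 40)"
]
},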
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import torch\nfrom nni.algorithms.compression.v2.pytorch.pruning.basic_pruner import BasicPruner\nfrom nni.algorithms.compression.v2.pytorch.pruning.tools import (\n DataCollector,\n MetricsCalculator,\n SparsityAllocator\n)\n\n\n# This data collector collects weight in wrapped module as data.\n# The wrapped module is the module configured in pruner's config_list.\n# This implementation is similar as nni.algorithms.compression.v2.pytorch.pruning.tools.WeightDataCollector\nclass WeightDataCollector(DataCollector):\n def collect(self):\n data = {}\n # get_modules_wrapper will get all the wrapper in the compressor (pruner),\n # it returns a dict with format {wrapper_name: wrapper},\n # use wrapper.module to get the wrapped module.\n for _, wrapper in self.compressor.get_modules_wrapper().items():\n data[wrapper.name] = wrapper.module.weight.data\n # return {wrapper_name: weight_data}\n return data\n\n\nclass BlockNormMetricsCalculator(MetricsCalculator):\n def __init__(self, block_sparse_size):\n # Because we will keep all dimension with block granularity, so fix ``dim=None``,\n # means all dimensions will be kept.\n super().__init__(dim=None, block_sparse_size=block_sparse_size)\n\n def calculate_metrics(self, data):\n data_length = len(self.block_sparse_size)\n reduce_unfold_dims = list(range(data_length, 2 * data_length))\n\n metrics = {}\n for name, t in data.items():\n # Unfold t as block size, and calculate L1 Norm for each block.\n for dim, size in enumerate(self.block_sparse_size):\n t = t.unfold(dim, size, size)\n metrics[name] = t.norm(dim=reduce_unfold_dims, p=1)\n # return {wrapper_name: block_metric}\n return metrics\n\n\n# This implementation is similar as nni.algorithms.compression.v2.pytorch.pruning.tools.NormalSparsityAllocator\nclass BlockSparsityAllocator(SparsityAllocator):\n def __init__(self, pruner, block_sparse_size):\n super().__init__(pruner, dim=None, block_sparse_size=block_sparse_size, continuous_mask=True)\n\n def generate_sparsity(self, metrics):\n masks = {}\n for name, wrapper in self.pruner.get_modules_wrapper().items():\n # wrapper.config['total_sparsity'] can get the configured sparsity ratio for this wrapped module\n sparsity_rate = wrapper.config['total_sparsity']\n # get metric for this wrapped module\n metric = metrics[name]\n # mask the metric with old mask, if the masked position need never recover,\n # just keep this is ok if you are new in NNI pruning\n if self.continuous_mask:\n metric *= self._compress_mask(wrapper.weight_mask)\n # convert sparsity ratio to prune number\n prune_num = int(sparsity_rate * metric.numel())\n # calculate the metric threshold\n threshold = torch.topk(metric.view(-1), prune_num, largest=False)[0].max()\n # generate mask, keep the metric positions that metric values greater than the threshold\n mask = torch.gt(metric, threshold).type_as(metric)\n # expand the mask to weight size, if the block is masked, this block will be filled with zeros,\n # otherwise filled with ones\n masks[name] = self._expand_mask(name, mask)\n # merge the new mask with old mask, if the masked position need never recover,\n # just keep this is ok if you are new in NNI pruning\n if self.continuous_mask:\n masks[name]['weight'] *= wrapper.weight_mask\n return masks"
]
},
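{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see concretely what the unfold-based block metric above computes, here is a standalone sketch: a (6, 8) weight with block size (2, 2) yields a (3, 4) metric, one L1 norm per block.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import torch\n\nweight = torch.rand(6, 8)\nblock_sparse_size = [2, 2]\n\n# Unfold each dimension into non-overlapping blocks: (6, 8) -> (3, 4, 2, 2).\nt = weight\nfor dim, size in enumerate(block_sparse_size):\n    t = t.unfold(dim, size, size)\n\n# The L1 norm over the two trailing block dimensions gives one value per block.\nmetric = t.norm(p=1, dim=(2, 3))\nassert metric.shape == (3, 4)"
]
},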
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Customize the pruner.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"class BlockL1NormPruner(BasicPruner):\n def __init__(self, model, config_list, block_sparse_size):\n self.block_sparse_size = block_sparse_size\n super().__init__(model, config_list)\n\n # Implement reset_tools is enough for this pruner.\n def reset_tools(self):\n if self.data_collector is None:\n self.data_collector = WeightDataCollector(self)\n else:\n self.data_collector.reset()\n if self.metrics_calculator is None:\n self.metrics_calculator = BlockNormMetricsCalculator(self.block_sparse_size)\n if self.sparsity_allocator is None:\n self.sparsity_allocator = BlockSparsityAllocator(self, self.block_sparse_size)"
"This time we successfully define a new pruner with pruning block granularity!\nNote that we don't put validation logic in this example, like ``_validate_config_before_canonical``,\nbut for a robust implementation, we suggest you involve the validation logic.\n\n"
"\n# Customize a new quantization algorithm\n\nTo write a new quantization algorithm, you can write a class that inherits ``nni.compression.pytorch.Quantizer``.\nThen, override the member functions with the logic of your algorithm. The member function to override is ``quantize_weight``.\n``quantize_weight`` directly returns the quantized weights rather than mask, because for quantization the quantized weights cannot be obtained by applying mask.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from nni.compression.pytorch import Quantizer\n\nclass YourQuantizer(Quantizer):\n def __init__(self, model, config_list):\n \"\"\"\n Suggest you to use the NNI defined spec for config\n \"\"\"\n super().__init__(model, config_list)\n\n def quantize_weight(self, weight, config, **kwargs):\n \"\"\"\n quantize should overload this method to quantize weight tensors.\n This method is effectively hooked to :meth:`forward` of the model.\n\n Parameters\n ----------\n weight : Tensor\n weight that needs to be quantized\n config : dict\n the configuration for weight quantization\n \"\"\"\n\n # Put your code to generate `new_weight` here\n new_weight = ...\n return new_weight\n\n def quantize_output(self, output, config, **kwargs):\n \"\"\"\n quantize should overload this method to quantize output.\n This method is effectively hooked to `:meth:`forward` of the model.\n\n Parameters\n ----------\n output : Tensor\n output that needs to be quantized\n config : dict\n the configuration for output quantization\n \"\"\"\n\n # Put your code to generate `new_output` here\n new_output = ...\n return new_output\n\n def quantize_input(self, *inputs, config, **kwargs):\n \"\"\"\n quantize should overload this method to quantize input.\n This method is effectively hooked to :meth:`forward` of the model.\n\n Parameters\n ----------\n inputs : Tensor\n inputs that needs to be quantized\n config : dict\n the configuration for inputs quantization\n \"\"\"\n\n # Put your code to generate `new_input` here\n new_input = ...\n return new_input\n\n def update_epoch(self, epoch_num):\n pass\n\n def step(self):\n \"\"\"\n Can do some processing based on the model or weights binded\n in the func bind_model\n \"\"\"\n pass"
]
},
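{
"cell_type": "markdown",
"metadata": {},
"source": [
"For illustration, here is a hypothetical quantizer (not an NNI built-in) whose ``quantize_weight`` performs naive symmetric 8-bit quantization, following the skeleton above. A real algorithm would usually read the bit width from ``config`` and track per-layer scales.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import torch\nfrom nni.compression.pytorch import Quantizer\n\n# A hypothetical example, not an NNI built-in: naive symmetric 8-bit weight\n# quantization. A real algorithm would usually read the bit width from `config`.\nclass NaiveSymmetricQuantizer(Quantizer):\n    def quantize_weight(self, weight, config, **kwargs):\n        # choose the scale so that the largest weight magnitude maps to 127\n        scale = weight.abs().max().clamp(min=1e-8) / 127\n        # round to the integer grid and scale back (fake quantization)\n        return torch.round(weight / scale) * scale"
]
},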
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Customize backward function\n\nSometimes it's necessary for a quantization operation to have a customized backward function,\nsuch as `Straight-Through Estimator <https://stackoverflow.com/questions/38361314/the-concept-of-straight-through-estimator-ste>`__\\ ,\nuser can customize a backward function as follow:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from nni.compression.pytorch.compressor import Quantizer, QuantGrad, QuantType\n\nclass ClipGrad(QuantGrad):\n @staticmethod\n def quant_backward(tensor, grad_output, quant_type):\n \"\"\"\n This method should be overrided by subclass to provide customized backward function,\n default implementation is Straight-Through Estimator\n Parameters\n ----------\n tensor : Tensor\n input of quantization operation\n grad_output : Tensor\n gradient of the output of quantization operation\n quant_type : QuantType\n the type of quantization, it can be `QuantType.INPUT`, `QuantType.WEIGHT`, `QuantType.OUTPUT`,\n you can define different behavior for different types.\n Returns\n -------\n tensor\n gradient of the input of quantization operation\n \"\"\"\n\n # for quant_output function, set grad to zero if the absolute value of tensor is larger than 1\n if quant_type == QuantType.OUTPUT:\n grad_output[tensor.abs() > 1] = 0\n return grad_output\n\nclass _YourQuantizer(Quantizer):\n def __init__(self, model, config_list):\n super().__init__(model, config_list)\n # set your customized backward function to overwrite default backward function\n self.quant_grad = ClipGrad"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you do not customize ``QuantGrad``, the default backward is Straight-Through Estimator. \n\n"
]
}