- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. `Reference Paper <https://arxiv.org/abs/1602.02830>`__
* `DoReFaQuantizer <#dorefa-quantizer>`__
* `BNNQuantizer <#bnn-quantizer>`__
* `LSQQuantizer <#lsq-quantizer>`__
* `ObserverQuantizer <#observer-quantizer>`__
NaiveQuantizer
--------------
The experiment code can be found at :githublink:`examples/model_compress/quantization/BNN_quantizer_cifar10.py <examples/model_compress/quantization/BNN_quantizer_cifar10.py>`
Observer Quantizer
------------------
Observer quantizer is a framework for post-training quantization. It inserts observers at the places where quantization will happen. During quantization calibration, each observer records all the tensors it 'sees'; after calibration, these recorded tensors are used to calculate the quantization statistics (such as scale and zero point).
Usage
^^^^^
1. Configure which layers to quantize and which tensors (input/output/weight) of those layers to quantize.
2. Construct the observer quantizer.
3. Run quantization calibration by feeding calibration data through the model so that the observers can record tensors.
4. Call the ``compress`` API to calculate the scale and zero point for each tensor and switch the model to evaluation mode.
PyTorch code

.. code-block:: python

   from nni.algorithms.compression.pytorch.quantization import ObserverQuantizer
You can view example :githublink:`examples/model_compress/quantization/observer_quantizer.py <examples/model_compress/quantization/observer_quantizer.py>` for more information.
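As a rough sketch of the usage steps above: the snippet below builds a hypothetical ``config_list`` that quantizes the weights and inputs of all ``Conv2d``/``Linear`` layers to 8 bits, with the remaining steps outlined in comments. The key names follow the common NNI compression config format, but the bit widths and op types are illustrative; ``model``, ``optimizer``, and the calibration data are assumed to be your own.

.. code-block:: python

   # Step 1: choose which layers and which tensors to quantize.
   # Keys follow the common NNI compression config format.
   config_list = [{
       'quant_types': ['weight', 'input'],       # tensors to quantize
       'quant_bits': {'weight': 8, 'input': 8},  # bit width per tensor type
       'op_types': ['Conv2d', 'Linear'],         # layer types to quantize
   }]

   # Steps 2-4 (sketch; `model`, `optimizer`, and `calib_loader`
   # are placeholders for your own objects):
   #   quantizer = ObserverQuantizer(model.eval(), config_list, optimizer)
   #   for data, _ in calib_loader:   # calibration: observers record tensors
   #       model(data)
   #   quantizer.compress()           # compute scale/zero point per tensor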
User configuration for Observer Quantizer
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Common configuration needed by compression algorithms can be found in the `specification of config_list <./QuickStart.rst>`__.
.. note::
   This quantizer is still under development. Some quantizer settings are currently hard-coded: