Quantizer in NNI
================

Quantization algorithms compress a network by reducing the number of bits used to represent its weights or activations, which can reduce both computation and inference time.

.. list-table::
   :header-rows: 1
   :widths: auto

   * - Name
     - Brief Introduction of Algorithm
   * - :ref:`naive-quantizer`
     - Quantizes weights to 8 bits by default
   * - :ref:`qat-quantizer`
     - Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. `Reference Paper <https://arxiv.org/abs/1712.05877>`__
   * - :ref:`dorefa-quantizer`
     - DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. `Reference Paper <https://arxiv.org/abs/1606.06160>`__
   * - :ref:`bnn-quantizer`
     - Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. `Reference Paper <https://arxiv.org/abs/1602.02830>`__
   * - :ref:`lsq-quantizer`
     - Learned Step Size Quantization. `Reference Paper <https://arxiv.org/abs/1902.08153>`__
   * - :ref:`observer-quantizer`
     - Post-training quantization. Collects quantization information during calibration with observers.
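
The quantizers above differ in how they choose or learn their quantization parameters, but they all map floating-point values onto a small set of integer levels. The sketch below is a minimal, framework-agnostic illustration of uniform affine quantization; it is not the implementation of any particular NNI quantizer, and the helper name and tensor shape are made up for the example:

.. code-block:: python

   import torch

   def quantize_dequantize(weights: torch.Tensor, bits: int = 8) -> torch.Tensor:
       """Uniform affine quantization followed by dequantization.

       Illustrative only: maps floats onto 2**bits integer levels and back,
       so the returned tensor shows the precision that quantization keeps.
       """
       qmin, qmax = 0, 2 ** bits - 1
       wmin, wmax = weights.min(), weights.max()
       scale = (wmax - wmin) / (qmax - qmin)          # width of one quantization step
       zero_point = qmin - torch.round(wmin / scale)  # integer level for the float zero
       q = torch.clamp(torch.round(weights / scale) + zero_point, qmin, qmax)
       return (q - zero_point) * scale                # dequantized (simulated) weights

   w = torch.randn(64, 64)
   w_q = quantize_dequantize(w, bits=8)
   print((w - w_q).abs().max())  # worst-case rounding error is about scale / 2

Training-time quantizers such as QAT and LSQ insert this kind of quantize/dequantize pair into the forward pass so the network can adapt to the rounding error during training, while the observer quantizer instead collects the ``wmin``/``wmax`` statistics from calibration data after training.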