"include/vscode:/vscode.git/clone" did not exist on "57fadf6fb90bfab20e890aab21b940edee26ba63"
Commit 8ac072ae authored by J-shang, committed by GitHub

[Bugbash] promote pruning v2 (#4733)

parent 1a3c019a
docs/img/pruning_process.png (image replaced: 55.9 KB → 338 KB)
...@@ -172,7 +172,7 @@ Usage

.. code-block:: python

-   from nni.compression.pytorch.utils.counter import count_flops_params
+   from nni.compression.pytorch.utils import count_flops_params

    # Given input size (1, 1, 28, 28)
    flops, params, results = count_flops_params(model, (1, 1, 28, 28))
......
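For readers landing on this hunk without the surrounding tutorial, here is a minimal, self-contained sketch of how ``count_flops_params`` is typically called; the toy model below is illustrative and not part of the patch:

.. code-block:: python

   import torch.nn as nn
   from nni.compression.pytorch.utils import count_flops_params

   # Toy model; any nn.Module works.
   model = nn.Sequential(
       nn.Conv2d(1, 8, kernel_size=3, padding=1),
       nn.ReLU(),
       nn.Flatten(),
       nn.Linear(8 * 28 * 28, 10),
   )

   # Given input size (1, 1, 28, 28): returns total FLOPs, total parameters,
   # and a per-module breakdown in `results`.
   flops, params, results = count_flops_params(model, (1, 1, 28, 28))
   print(f'FLOPs: {flops}, params: {params}')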
...@@ -26,6 +26,7 @@ We further elaborate on the two methods, pruning and quantization, in the follow

.. image:: ../../img/prune_quant.jpg
   :target: ../../img/prune_quant.jpg
   :scale: 40%
+   :align: center
   :alt:

NNI provides an easy-to-use toolkit to help users design and use model pruning and quantization algorithms.
...@@ -46,8 +47,9 @@ There are several core features supported by NNI model compression:

.. image:: ../../img/compression_pipeline.png
   :target: ../../img/compression_pipeline.png
   :alt:
-   :scale: 20%
+   :align: center
+   :scale: 30%

The overall compression pipeline in NNI is shown above. For compressing a pretrained model, pruning and quantization can be used alone or in combination.
If users want to apply both, a sequential mode is recommended as common practice.
...@@ -58,74 +60,6 @@ If users want to apply both, a sequential mode is recommended as common practise

The interface and APIs are unified for both PyTorch and TensorFlow. Currently only the PyTorch version is supported; the TensorFlow version will be supported in the future.
.. rubric:: Supported Pruning Algorithms
Pruning algorithms compress the original network by removing redundant weights or channels of layers, which can reduce model complexity and mitigate the over-fitting issue.
.. list-table::
:header-rows: 1
:widths: auto
* - Name
- Brief Introduction of Algorithm
* - :ref:`level-pruner`
- Pruning the specified ratio of weight elements based on the absolute value of each element
* - :ref:`l1-norm-pruner`
- Pruning output channels with the smallest L1 norm of weights (Pruning Filters for Efficient Convnets) `Reference Paper <https://arxiv.org/abs/1608.08710>`__
* - :ref:`l2-norm-pruner`
- Pruning output channels with the smallest L2 norm of weights
* - :ref:`fpgm-pruner`
- Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration `Reference Paper <https://arxiv.org/abs/1811.00250>`__
* - :ref:`slim-pruner`
- Pruning output channels by pruning the scaling factors in BN layers (Learning Efficient Convolutional Networks through Network Slimming) `Reference Paper <https://arxiv.org/abs/1708.06519>`__
* - :ref:`activation-apoz-rank-pruner`
- Pruning output channels based on the metric APoZ (average percentage of zeros) which measures the percentage of zeros in activations of (convolutional) layers. `Reference Paper <https://arxiv.org/abs/1607.03250>`__
* - :ref:`activation-mean-rank-pruner`
- Pruning the output channels with the smallest mean value of output activations
* - :ref:`taylor-fo-weight-pruner`
- Pruning filters based on the first-order Taylor expansion of the weights (Importance Estimation for Neural Network Pruning) `Reference Paper <http://jankautz.com/publications/Importance4NNPruning_CVPR19.pdf>`__
* - :ref:`admm-pruner`
- Pruning based on ADMM optimization technique `Reference Paper <https://arxiv.org/abs/1804.03294>`__
* - :ref:`linear-pruner`
- The sparsity ratio increases linearly across pruning rounds; in each round, a basic pruner is used to prune the model.
* - :ref:`agp-pruner`
- Automated gradual pruning (To prune, or not to prune: exploring the efficacy of pruning for model compression) `Reference Paper <https://arxiv.org/abs/1710.01878>`__
* - :ref:`lottery-ticket-pruner`
- The pruning process used by "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks". It prunes a model iteratively. `Reference Paper <https://arxiv.org/abs/1803.03635>`__
* - :ref:`simulated-annealing-pruner`
- Automatic pruning with a guided heuristic search method, Simulated Annealing algorithm `Reference Paper <https://arxiv.org/abs/1907.03141>`__
* - :ref:`auto-compress-pruner`
- Automatic pruning by iteratively calling SimulatedAnnealing Pruner and ADMM Pruner `Reference Paper <https://arxiv.org/abs/1907.03141>`__
* - :ref:`amc-pruner`
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices `Reference Paper <https://arxiv.org/abs/1802.03494>`__
* - :ref:`movement-pruner`
- Movement Pruning: Adaptive Sparsity by Fine-Tuning `Reference Paper <https://arxiv.org/abs/2005.07683>`__
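As a concrete starting point for the table above, here is a minimal sketch of driving one of the basic pruners; the toy model and the 50% sparsity target are illustrative choices, not NNI defaults:

.. code-block:: python

   import torch.nn as nn
   from nni.compression.pytorch.pruning import L1NormPruner

   # Toy network standing in for a pretrained model.
   model = nn.Sequential(
       nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
       nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
       nn.Flatten(), nn.Linear(32 * 28 * 28, 10),
   )

   # Ask for 50% output-channel sparsity on every Conv2d layer.
   config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]

   pruner = L1NormPruner(model, config_list)
   _, masks = pruner.compress()  # masks: {layer_name: {'weight': mask tensor, ...}}
   for name, mask in masks.items():
       kept = (mask['weight'].sum() / mask['weight'].numel()).item()
       print(f'{name}: {kept:.2f} of weights kept')

The masks only simulate pruning; combining them with the model speedup step described below is what produces a genuinely smaller model.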
.. rubric:: Supported Quantization Algorithms
Quantization algorithms compress the original network by reducing the number of bits required to represent weights or activations, which can reduce the computations and the inference time.
.. list-table::
:header-rows: 1
:widths: auto
* - Name
- Brief Introduction of Algorithm
* - :ref:`naive-quantizer`
- Quantize weights to default 8 bits
* - :ref:`qat-quantizer`
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. `Reference Paper <http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf>`__
* - :ref:`dorefa-quantizer`
- DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. `Reference Paper <https://arxiv.org/abs/1606.06160>`__
* - :ref:`bnn-quantizer`
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. `Reference Paper <https://arxiv.org/abs/1602.02830>`__
* - :ref:`lsq-quantizer`
- Learned step size quantization. `Reference Paper <https://arxiv.org/pdf/1902.08153.pdf>`__
* - :ref:`observer-quantizer`
- Post-training quantization. Collect quantization information during calibration with observers.
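To make the quantizer side concrete as well, here is a minimal quantization-aware-training sketch; the toy model, layer types, and bit widths are illustrative assumptions, so consult the QAT_Quantizer reference for the authoritative configuration keys:

.. code-block:: python

   import torch
   import torch.nn as nn
   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

   model = nn.Sequential(
       nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
       nn.Flatten(), nn.Linear(8 * 28 * 28, 10),
   )
   optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

   # Quantize weights and outputs of conv/linear layers to 8 bit.
   config_list = [{
       'quant_types': ['weight', 'output'],
       'quant_bits': {'weight': 8, 'output': 8},
       'op_types': ['Conv2d', 'Linear'],
   }]

   quantizer = QAT_Quantizer(model, config_list, optimizer)
   quantizer.compress()
   # Continue the normal training loop afterwards so the quantization
   # parameters are learned during training.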
.. rubric:: Model Speedup

The final goal of model compression is to reduce inference latency and model size.
...@@ -137,7 +71,8 @@ The following figure shows how NNI prunes and speeds up your models.

.. image:: ../../img/nni_prune_process.png
   :target: ../../img/nni_prune_process.png
-   :scale: 20%
+   :scale: 30%
+   :align: center
   :alt:

The detailed tutorial of Speedup Model with Mask can be found :doc:`here <../tutorials/cp_pruning_speedup>`.
......
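For orientation, here is a minimal end-to-end sketch of the prune-then-speed-up flow shown in the figure; the toy model and sparsity value are illustrative, and the calls follow the pruner and ModelSpeedup references elsewhere in this change:

.. code-block:: python

   import torch
   import torch.nn as nn
   from nni.compression.pytorch.pruning import L1NormPruner
   from nni.compression.pytorch.speedup import ModelSpeedup

   model = nn.Sequential(
       nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
       nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
       nn.Flatten(), nn.Linear(32 * 28 * 28, 10),
   )

   # Step 1: compute masks with a pruner (simulated pruning).
   pruner = L1NormPruner(model, [{'sparsity': 0.5, 'op_types': ['Conv2d']}])
   _, masks = pruner.compress()

   # Step 2: drop the pruner wrappers and let ModelSpeedup rebuild the model
   # with genuinely smaller layers, which is what actually reduces latency.
   pruner._unwrap_model()
   ModelSpeedup(model, torch.rand(1, 1, 28, 28), masks).speedup_model()
   print(model)  # channel counts are now physically reduced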
-.. 8ac41c7d9227a65de0d1445a1085f8ec
+.. 8106afa0f255f5f0f75fb94dd1c0badd

模型压缩 (Model Compression)
========
......
Old (removed):

Pruner Reference
================
Basic Pruner
------------
.. _level-pruner:
Level Pruner
^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.LevelPruner
.. _l1-norm-pruner:
L1 Norm Pruner
^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.L1NormPruner
.. _l2-norm-pruner:
L2 Norm Pruner
^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.L2NormPruner
.. _fpgm-pruner:
FPGM Pruner
^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.FPGMPruner
.. _slim-pruner:
Slim Pruner
^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.SlimPruner
.. _activation-apoz-rank-pruner:
Activation APoZ Rank Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.ActivationAPoZRankPruner
.. _activation-mean-rank-pruner:
Activation Mean Rank Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.ActivationMeanRankPruner
.. _taylor-fo-weight-pruner:
Taylor FO Weight Pruner
^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.TaylorFOWeightPruner
.. _admm-pruner:
ADMM Pruner
^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.ADMMPruner
Scheduled Pruners
-----------------
.. _linear-pruner:
Linear Pruner
^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.LinearPruner
.. _agp-pruner:
AGP Pruner
^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.AGPPruner
.. _lottery-ticket-pruner:
Lottery Ticket Pruner
^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.LotteryTicketPruner
.. _simulated-annealing-pruner:
Simulated Annealing Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.SimulatedAnnealingPruner
.. _auto-compress-pruner:
Auto Compress Pruner
^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.AutoCompressPruner
.. _amc-pruner:
AMC Pruner
^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.AMCPruner
Other Pruner
------------
.. _movement-pruner:
Movement Pruner
^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.MovementPruner
\ No newline at end of file

New (added):

Pruner in NNI
=============
Pruning algorithms compress the original network by removing redundant weights or channels of layers, which can reduce model complexity and mitigate the over-fitting issue.
.. list-table::
:header-rows: 1
:widths: auto
* - Name
- Brief Introduction of Algorithm
* - :ref:`level-pruner`
- Pruning the specified ratio of weight elements based on the absolute value of each element
* - :ref:`l1-norm-pruner`
- Pruning output channels with the smallest L1 norm of weights (Pruning Filters for Efficient Convnets) `Reference Paper <https://arxiv.org/abs/1608.08710>`__
* - :ref:`l2-norm-pruner`
- Pruning output channels with the smallest L2 norm of weights
* - :ref:`fpgm-pruner`
- Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration `Reference Paper <https://arxiv.org/abs/1811.00250>`__
* - :ref:`slim-pruner`
- Pruning output channels by pruning the scaling factors in BN layers (Learning Efficient Convolutional Networks through Network Slimming) `Reference Paper <https://arxiv.org/abs/1708.06519>`__
* - :ref:`activation-apoz-rank-pruner`
- Pruning output channels based on the metric APoZ (average percentage of zeros) which measures the percentage of zeros in activations of (convolutional) layers. `Reference Paper <https://arxiv.org/abs/1607.03250>`__
* - :ref:`activation-mean-rank-pruner`
- Pruning the output channels with the smallest mean value of output activations
* - :ref:`taylor-fo-weight-pruner`
- Pruning filters based on the first-order Taylor expansion of the weights (Importance Estimation for Neural Network Pruning) `Reference Paper <http://jankautz.com/publications/Importance4NNPruning_CVPR19.pdf>`__
* - :ref:`admm-pruner`
- Pruning based on ADMM optimization technique `Reference Paper <https://arxiv.org/abs/1804.03294>`__
* - :ref:`linear-pruner`
- The sparsity ratio increases linearly across pruning rounds; in each round, a basic pruner is used to prune the model.
* - :ref:`agp-pruner`
- Automated gradual pruning (To prune, or not to prune: exploring the efficacy of pruning for model compression) `Reference Paper <https://arxiv.org/abs/1710.01878>`__
* - :ref:`lottery-ticket-pruner`
- The pruning process used by "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks". It prunes a model iteratively. `Reference Paper <https://arxiv.org/abs/1803.03635>`__
* - :ref:`simulated-annealing-pruner`
- Automatic pruning with a guided heuristic search method, Simulated Annealing algorithm `Reference Paper <https://arxiv.org/abs/1907.03141>`__
* - :ref:`auto-compress-pruner`
- Automatic pruning by iteratively calling SimulatedAnnealing Pruner and ADMM Pruner `Reference Paper <https://arxiv.org/abs/1907.03141>`__
* - :ref:`amc-pruner`
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices `Reference Paper <https://arxiv.org/abs/1802.03494>`__
* - :ref:`movement-pruner`
- Movement Pruning: Adaptive Sparsity by Fine-Tuning `Reference Paper <https://arxiv.org/abs/2005.07683>`__
...@@ -34,7 +34,7 @@ For a clearer structure vision, please refer to the figure below.

.. image:: ../../img/pruning_process.png
   :target: ../../img/pruning_process.png
-   :scale: 80%
+   :scale: 30%
   :align: center
   :alt:
......
...@@ -40,7 +40,7 @@ Using AGP Pruning as an example to explain how to implement an iterative pruning

    scheduler.compress()
    _, model, masks, _, _ = scheduler.get_best_result()

-The full script can be found :githublink:`here <examples/model_compress/pruning/v2/scheduler_torch.py>`.
+The full script can be found :githublink:`here <examples/model_compress/pruning/scheduler_torch.py>`.

In this example, we use an L1 Norm Pruner in dependency-aware mode as the basic pruner in each iteration.
Note that we do not need to pass ``model`` and ``config_list`` to the pruner, because in each iteration the ``model`` and ``config_list`` used by the pruner are received from the task generator.
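As a sketch of what that looks like in code (following the linked example; treat the exact keyword arguments as assumptions that may differ between NNI versions):

.. code-block:: python

   import torch
   from nni.compression.pytorch.pruning import L1NormPruner

   # Dummy input is required by the dependency-aware mode to trace channel
   # dependencies; the shape here is a placeholder for your real input.
   dummy_input = torch.rand(8, 3, 32, 32)

   # No model / config_list here: the pruning scheduler's task generator
   # supplies both at every iteration, as described above.
   pruner = L1NormPruner(None, None, mode='dependency_aware', dummy_input=dummy_input)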
...@@ -56,7 +56,8 @@ The pruning result will return to the ``TaskGenerator`` at the end of each itera

The information included in the ``Task`` and ``TaskResult`` can be found :githublink:`here <nni/algorithms/compression/v2/pytorch/base/scheduler.py>`.

-A clearer iterative pruning flow chart can be found `here <v2_pruning.rst>`__.
+A clearer iterative pruning flow chart can be found :doc:`here <pruning>`.

If you want to implement your own task generator, please follow the ``TaskGenerator`` :githublink:`interface <nni/algorithms/compression/v2/pytorch/pruning/tools/base.py>`.
Two main functions should be implemented, ``init_pending_tasks(self) -> List[Task]`` and ``generate_tasks(self, task_result: TaskResult) -> List[Task]``.
......
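A skeleton of such a custom task generator might look as follows; this is an illustrative stub built only from the two required methods named above, not a working generator:

.. code-block:: python

   from typing import List

   from nni.algorithms.compression.v2.pytorch.base.scheduler import Task, TaskResult
   from nni.algorithms.compression.v2.pytorch.pruning.tools import TaskGenerator

   class MyTaskGenerator(TaskGenerator):
       def init_pending_tasks(self) -> List[Task]:
           # Build the task(s) for the first iteration (model, config_list, masks, ...).
           raise NotImplementedError()

       def generate_tasks(self, task_result: TaskResult) -> List[Task]:
           # Inspect the finished task's result and schedule the next iteration;
           # returning an empty list ends the iterative pruning process.
           raise NotImplementedError()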
Old (removed):

Quantizer Reference
===================
.. _naive-quantizer:
Naive Quantizer
^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.NaiveQuantizer
.. _qat-quantizer:
QAT Quantizer
^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.QAT_Quantizer
.. _dorefa-quantizer:
DoReFa Quantizer
^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.DoReFaQuantizer
.. _bnn-quantizer:
BNN Quantizer
^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.BNNQuantizer
.. _lsq-quantizer:
LSQ Quantizer
^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.LsqQuantizer
.. _observer-quantizer:
Observer Quantizer
^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.ObserverQuantizer

New (added):

Quantizer in NNI
================
Quantization algorithms compress the original network by reducing the number of bits required to represent weights or activations, which can reduce the computations and the inference time.
.. list-table::
:header-rows: 1
:widths: auto
* - Name
- Brief Introduction of Algorithm
* - :ref:`naive-quantizer`
- Quantize weights to default 8 bits
* - :ref:`qat-quantizer`
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. `Reference Paper <http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf>`__
* - :ref:`dorefa-quantizer`
- DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. `Reference Paper <https://arxiv.org/abs/1606.06160>`__
* - :ref:`bnn-quantizer`
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. `Reference Paper <https://arxiv.org/abs/1602.02830>`__
* - :ref:`lsq-quantizer`
- Learned step size quantization. `Reference Paper <https://arxiv.org/pdf/1902.08153.pdf>`__
* - :ref:`observer-quantizer`
- Post-training quantization. Collect quantization information during calibration with observers.
Old (removed):

Compression API Reference
=========================
Pruner
------
Please refer to :doc:`../compression/pruner`.
Quantizer
---------
Please refer to :doc:`../compression/quantizer`.
Pruning Speedup
---------------
.. autoclass:: nni.compression.pytorch.speedup.ModelSpeedup
:members:
Quantization Speedup
--------------------
.. autoclass:: nni.compression.pytorch.quantization_speedup.ModelSpeedupTensorRT
:members:
Compression Utilities
---------------------
.. autoclass:: nni.compression.pytorch.utils.sensitivity_analysis.SensitivityAnalysis
:members:
.. autoclass:: nni.compression.pytorch.utils.shape_dependency.ChannelDependency
:members:
.. autoclass:: nni.compression.pytorch.utils.shape_dependency.GroupDependency
:members:
.. autoclass:: nni.compression.pytorch.utils.mask_conflict.ChannelMaskConflict
:members:
.. autoclass:: nni.compression.pytorch.utils.mask_conflict.GroupMaskConflict
:members:
.. autofunction:: nni.compression.pytorch.utils.counter.count_flops_params
.. autofunction:: nni.algorithms.compression.v2.pytorch.utils.pruning.compute_sparsity
Framework Related
-----------------
.. autoclass:: nni.algorithms.compression.v2.pytorch.base.Pruner
:members:
.. autoclass:: nni.algorithms.compression.v2.pytorch.base.PrunerModuleWrapper
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.basic_pruner.BasicPruner
:members:
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.tools.DataCollector
:members:
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.tools.MetricsCalculator
:members:
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.tools.SparsityAllocator
:members:
.. autoclass:: nni.algorithms.compression.v2.pytorch.base.BasePruningScheduler
:members:
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.tools.TaskGenerator
:members:
.. autoclass:: nni.compression.pytorch.compressor.Quantizer
:members:
.. autoclass:: nni.compression.pytorch.compressor.QuantizerModuleWrapper
:members:
.. autoclass:: nni.compression.pytorch.compressor.QuantGrad
:members:

New (added):

Framework Related
=================
Pruner
------
.. autoclass:: nni.algorithms.compression.v2.pytorch.base.Pruner
:members:
PrunerModuleWrapper
-------------------
.. autoclass:: nni.algorithms.compression.v2.pytorch.base.PrunerModuleWrapper
BasicPruner
-----------
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.basic_pruner.BasicPruner
:members:
DataCollector
-------------
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.tools.DataCollector
:members:
MetricsCalculator
-----------------
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.tools.MetricsCalculator
:members:
SparsityAllocator
-----------------
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.tools.SparsityAllocator
:members:
BasePruningScheduler
--------------------
.. autoclass:: nni.algorithms.compression.v2.pytorch.base.BasePruningScheduler
:members:
TaskGenerator
-------------
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.tools.TaskGenerator
:members:
Quantizer
---------
.. autoclass:: nni.compression.pytorch.compressor.Quantizer
:members:
QuantizerModuleWrapper
----------------------
.. autoclass:: nni.compression.pytorch.compressor.QuantizerModuleWrapper
:members:
QuantGrad
---------
.. autoclass:: nni.compression.pytorch.compressor.QuantGrad
:members:
Compression API Reference
=========================
.. toctree::
:maxdepth: 1
Pruner <pruner>
Quantizer <quantizer>
Pruning Speedup <pruning_speedup>
Quantization Speedup <quantization_speedup>
Compression Utilities <utils>
Framework Related <framework>
Pruner
======
Basic Pruner
------------
.. _level-pruner:
Level Pruner
^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.LevelPruner
.. _l1-norm-pruner:
L1 Norm Pruner
^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.L1NormPruner
.. _l2-norm-pruner:
L2 Norm Pruner
^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.L2NormPruner
.. _fpgm-pruner:
FPGM Pruner
^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.FPGMPruner
.. _slim-pruner:
Slim Pruner
^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.SlimPruner
.. _activation-apoz-rank-pruner:
Activation APoZ Rank Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.ActivationAPoZRankPruner
.. _activation-mean-rank-pruner:
Activation Mean Rank Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.ActivationMeanRankPruner
.. _taylor-fo-weight-pruner:
Taylor FO Weight Pruner
^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.TaylorFOWeightPruner
.. _admm-pruner:
ADMM Pruner
^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.ADMMPruner
Scheduled Pruners
-----------------
.. _linear-pruner:
Linear Pruner
^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.LinearPruner
.. _agp-pruner:
AGP Pruner
^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.AGPPruner
.. _lottery-ticket-pruner:
Lottery Ticket Pruner
^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.LotteryTicketPruner
.. _simulated-annealing-pruner:
Simulated Annealing Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.SimulatedAnnealingPruner
.. _auto-compress-pruner:
Auto Compress Pruner
^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.AutoCompressPruner
.. _amc-pruner:
AMC Pruner
^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.AMCPruner
Other Pruner
------------
.. _movement-pruner:
Movement Pruner
^^^^^^^^^^^^^^^
.. autoclass:: nni.compression.pytorch.pruning.MovementPruner
\ No newline at end of file
Pruning Speedup
===============
.. autoclass:: nni.compression.pytorch.speedup.ModelSpeedup
:members:
Quantization Speedup
====================
.. autoclass:: nni.compression.pytorch.quantization_speedup.ModelSpeedupTensorRT
:members:
Quantizer
=========
.. _naive-quantizer:
Naive Quantizer
^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.NaiveQuantizer
.. _qat-quantizer:
QAT Quantizer
^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.QAT_Quantizer
.. _dorefa-quantizer:
DoReFa Quantizer
^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.DoReFaQuantizer
.. _bnn-quantizer:
BNN Quantizer
^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.BNNQuantizer
.. _lsq-quantizer:
LSQ Quantizer
^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.LsqQuantizer
.. _observer-quantizer:
Observer Quantizer
^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.algorithms.compression.pytorch.quantization.ObserverQuantizer
Compression Utilities
=====================
SensitivityAnalysis
-------------------
.. autoclass:: nni.compression.pytorch.utils.SensitivityAnalysis
:members:
ChannelDependency
-----------------
.. autoclass:: nni.compression.pytorch.utils.ChannelDependency
:members:
GroupDependency
---------------
.. autoclass:: nni.compression.pytorch.utils.GroupDependency
:members:
ChannelMaskConflict
-------------------
.. autoclass:: nni.compression.pytorch.utils.ChannelMaskConflict
:members:
GroupMaskConflict
-----------------
.. autoclass:: nni.compression.pytorch.utils.GroupMaskConflict
:members:
count_flops_params
------------------
.. autofunction:: nni.compression.pytorch.utils.count_flops_params
compute_sparsity
----------------
.. autofunction:: nni.algorithms.compression.v2.pytorch.utils.pruning.compute_sparsity
...@@ -6,6 +6,6 @@ Python API Reference

   Hyperparameter Optimization <hpo>
   Neural Architecture Search <nas/index>
-   Model Compression <compression>
+   Model Compression <compression/index>
   Experiment <experiment>
   Others <others>
...@@ -35,12 +35,12 @@ PyTorch code

    loss.backward()

-The complete code for fine-tuning the pruned model can be found :githublink:`here <examples/model_compress/pruning/finetune_kd_torch.py>`
+The complete code for fine-tuning the pruned model can be found :githublink:`here <examples/model_compress/pruning/legacy/finetune_kd_torch.py>`

.. code-block:: bash

   python finetune_kd_torch.py --model [model name] --teacher-model-dir [pretrained checkpoint path] --student-model-dir [pruned checkpoint path] --mask-path [mask file path]

-Note that: for fine-tuning a pruned model, run :githublink:`basic_pruners_torch.py <examples/model_compress/pruning/basic_pruners_torch.py>` first to get the mask file, then pass the mask path as argument to the script.
+Note that: for fine-tuning a pruned model, run :githublink:`basic_pruners_torch.py <examples/model_compress/pruning/legacy/basic_pruners_torch.py>` first to get the mask file, then pass the mask path as argument to the script.
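For readers who want the gist without opening the full example, the distillation loss used in this kind of fine-tuning typically looks like the following; the temperature and weighting values are illustrative, not the ones hard-coded in finetune_kd_torch.py:

.. code-block:: python

   import torch.nn.functional as F

   def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
       # Soft targets from the (unpruned) teacher, plus the usual
       # cross-entropy on the hard labels for the pruned student.
       soft = F.kl_div(
           F.log_softmax(student_logits / T, dim=1),
           F.softmax(teacher_logits / T, dim=1),
           reduction='batchmean',
       ) * (T * T)
       hard = F.cross_entropy(student_logits, labels)
       return alpha * soft + (1 - alpha) * hard

   # Inside the fine-tuning loop:
   #   with torch.no_grad():
   #       teacher_logits = teacher_model(data)
   #   loss = kd_loss(student_model(data), teacher_logits, target)
   #   loss.backward()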
...@@ -13,7 +13,7 @@ The experiments are performed with the following pruners/datasets/models:

*
-  Models: :githublink:`VGG16, ResNet18, ResNet50 <examples/model_compress/pruning/models/cifar10>`
+  Models: :githublink:`VGG16, ResNet18, ResNet50 <examples/model_compress/models/cifar10>`

*
   Datasets: CIFAR-10

...@@ -50,24 +50,24 @@ The experiment result are shown in the following figures:

CIFAR-10, VGG16:

-.. image:: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_vgg16.png
-   :target: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_vgg16.png
+.. image:: ../../../examples/model_compress/pruning/legacy/comparison_of_pruners/img/performance_comparison_vgg16.png
+   :target: ../../../examples/model_compress/pruning/legacy/comparison_of_pruners/img/performance_comparison_vgg16.png
   :alt:

CIFAR-10, ResNet18:

-.. image:: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_resnet18.png
-   :target: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_resnet18.png
+.. image:: ../../../examples/model_compress/pruning/legacy/comparison_of_pruners/img/performance_comparison_resnet18.png
+   :target: ../../../examples/model_compress/pruning/legacy/comparison_of_pruners/img/performance_comparison_resnet18.png
   :alt:

CIFAR-10, ResNet50:

-.. image:: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_resnet50.png
-   :target: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_resnet50.png
+.. image:: ../../../examples/model_compress/pruning/legacy/comparison_of_pruners/img/performance_comparison_resnet50.png
+   :target: ../../../examples/model_compress/pruning/legacy/comparison_of_pruners/img/performance_comparison_resnet50.png
   :alt:

...@@ -96,14 +96,14 @@ Implementation Details

   This avoids potential issues of counting them of masked models.

*
-  The experiment code can be found :githublink:`here <examples/model_compress/pruning/auto_pruners_torch.py>`.
+  The experiment code can be found :githublink:`here <examples/model_compress/pruning/legacy/auto_pruners_torch.py>`.

Experiment Result Rendering
^^^^^^^^^^^^^^^^^^^^^^^^^^^

*
-  If you follow the practice in the :githublink:`example <examples/model_compress/pruning/auto_pruners_torch.py>`\ , for every single pruning experiment, the experiment result will be saved in JSON format as follows:
+  If you follow the practice in the :githublink:`example <examples/model_compress/pruning/legacy/auto_pruners_torch.py>`\ , for every single pruning experiment, the experiment result will be saved in JSON format as follows:

.. code-block:: json

...@@ -114,8 +114,8 @@ Experiment Result Rendering

   }

*
-  The experiment results are saved :githublink:`here <examples/model_compress/pruning/comparison_of_pruners>`.
-  You can refer to :githublink:`analyze <examples/model_compress/pruning/comparison_of_pruners/analyze.py>` to plot new performance comparison figures.
+  The experiment results are saved :githublink:`here <examples/model_compress/pruning/legacy/comparison_of_pruners>`.
+  You can refer to :githublink:`analyze <examples/model_compress/pruning/legacy/comparison_of_pruners/analyze.py>` to plot new performance comparison figures.

Contribution
------------
......
...@@ -80,7 +80,7 @@

   },
   "outputs": [],
   "source": [
-    "from nni.algorithms.compression.v2.pytorch.pruning import L1NormPruner\npruner = L1NormPruner(model, config_list)\n\n# show the wrapped model structure, `PrunerModuleWrapper` have wrapped the layers that configured in the config_list.\nprint(model)"
+    "from nni.compression.pytorch.pruning import L1NormPruner\npruner = L1NormPruner(model, config_list)\n\n# show the wrapped model structure, `PrunerModuleWrapper` have wrapped the layers that configured in the config_list.\nprint(model)"
   ]
   },
   {
......
...@@ -67,7 +67,7 @@ config_list = [{

# %%
# Pruners usually require `model` and `config_list` as input arguments.

-from nni.algorithms.compression.v2.pytorch.pruning import L1NormPruner
+from nni.compression.pytorch.pruning import L1NormPruner

pruner = L1NormPruner(model, config_list)

# show the wrapped model structure, `PrunerModuleWrapper` have wrapped the layers that configured in the config_list.
......