Unverified commit fe02b808 authored by J-shang, committed by GitHub

[Doc] split index to overview & toctree compression part (#4749)

parent b8d029b1
Overview of NNI Model Compression
=================================
.. toctree::
:hidden:
:maxdepth: 2
Pruning <pruning>
Quantization <quantization>
Config Specification <compression_config_list>
Advanced Usage <advanced_usage>
.. Using rubric to prevent the section heading from being included in the toc
.. rubric:: Overview
Deep neural networks (DNNs) have achieved great success in many tasks like computer vision, natural language processing, and speech processing.
However, typical neural networks are both computationally expensive and energy-intensive,
@@ -43,7 +30,8 @@ There are several core features supported by NNI model compression:
* Concise interface for users to customize their own compression algorithms.
Compression Pipeline
--------------------
.. image:: ../../img/compression_pipeline.png
:target: ../../img/compression_pipeline.png
@@ -60,7 +48,8 @@ If users want to apply both, a sequential mode is recommended as common practise
The interface and APIs are unified for both PyTorch and TensorFlow. Currently only the PyTorch version is supported; the TensorFlow version will be supported in the future.
Model Speedup
-------------
The final goal of model compression is to reduce inference latency and model size.
However, existing model compression algorithms mainly use simulation to check the performance (e.g., accuracy) of the compressed model.
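
As a minimal sketch of what the speedup step looks like in code (mirroring the ``ModelSpeedup`` call used in the pruning quickstart later in this change; ``model`` and ``masks`` are assumed to come from an earlier pruning step):

.. code-block:: python

   import torch
   from nni.compression.pytorch.speedup import ModelSpeedup

   # `model` is the pruned model and `masks` are the masks produced by a pruner;
   # both are assumed to exist from a previous pruning step.
   device = 'cuda' if torch.cuda.is_available() else 'cpu'
   dummy_input = torch.rand(3, 1, 28, 28).to(device)
   ModelSpeedup(model, dummy_input, masks).speedup_model()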
...
.. b6bdf52910e2e2c72085d03482d45340
Model Compression
=================
.. toctree::
:hidden:
:maxdepth: 2
Pruning <pruning>
Quantization <quantization>
Config Specification <compression_config_list>
Advanced Usage <advanced_usage>
Deep neural networks (DNNs) have achieved great success in fields such as computer vision, natural language processing, and speech processing.
However, typical neural networks are computationally expensive and energy-intensive, making them hard to deploy on devices with limited computing resources
or strict latency requirements. A natural idea, therefore, is to compress the model,
@@ -42,7 +33,8 @@ NNI provides the following core features:
* Friendly and easy-to-use compression tools that let users gain insight into the compression process and results.
* A concise interface for users to customize their own compression algorithms.
Compression Pipeline
--------------------
.. image:: ../../img/compression_pipeline.png
:target: ../../img/compression_pipeline.png
@@ -62,7 +54,8 @@ The overall model compression pipeline in NNI is shown in the figure above.
The interfaces for PyTorch and TensorFlow are unified. Currently only the PyTorch version is supported; the TensorFlow version will be supported in the future.
Model Speedup
-------------
The final goal of model compression is to reduce inference latency and model size.
However, existing model compression algorithms mainly use simulation to check the performance of the compressed model.
...
Pruner in NNI
=============
Pruning algorithms compress the original network by removing redundant weights or channels of layers, which can reduce model complexity and mitigate the over-fitting issue. NNI implements the main part of a pruning algorithm as a pruner. All pruners are implemented as close as possible to what is described in their papers (if one exists).
The following table provides a brief introduction to the pruners implemented in NNI; click the links in the table to view more detailed introductions and use cases.
There are two kinds of pruners in NNI; please refer to :ref:`basic pruner <basic-pruner>` and :ref:`scheduled pruner <scheduled-pruner>` for details.
.. list-table::
:header-rows: 1
...
Overview of NNI Model Pruning
=============================
Pruning is a common technique to compress neural network models.
Pruning methods explore the redundancy in the model weights (parameters) and try to remove/prune the redundant and non-critical weights.
@@ -7,21 +7,26 @@ The redundant elements are pruned from the model, their values are zeroed and we
The following concepts can help you understand pruning in NNI.
Pruning Target
--------------
Pruning target refers to where the sparsity is applied.
Most pruning methods prune the weights to reduce the model size and speed up inference.
Other pruning methods also apply sparsity to activations (e.g., inputs, outputs, or feature maps) to speed up inference.
NNI supports pruning module weights right now, and will support other pruning targets in the future.
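
For illustration, a ``config_list`` in the style of the pruning quickstart selects which module weights become pruning targets (a sketch; the module name ``fc3`` is just an example):

.. code-block:: python

   # Sketch of a config_list that marks module weights as the pruning target.
   # Keys follow the compression config specification; `fc3` is an example module name.
   config_list = [{
       'sparsity_per_layer': 0.5,          # prune 50% of the weights in each matched layer
       'op_types': ['Linear', 'Conv2d'],   # apply to all Linear and Conv2d modules
   }, {
       'exclude': True,
       'op_names': ['fc3'],                # keep the final classifier dense
   }]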
.. _basic-pruner:

Basic Pruner
------------
A basic pruner generates masks for each pruning target (weights) at a determined sparsity ratio.
It usually takes the model and a config as input arguments, then generates masks for each pruning target.
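
A minimal sketch of using a basic pruner, following the pattern in the pruning quickstart (``model`` and ``config_list`` are assumed from the section above):

.. code-block:: python

   from nni.compression.pytorch.pruning import L1NormPruner

   # The basic pruner wraps the targeted modules and generates a mask per pruning target.
   pruner = L1NormPruner(model, config_list)
   _, masks = pruner.compress()

   # Inspect the generated masks, as the quickstart does.
   for name, mask in masks.items():
       kept = mask['weight'].sum().item() / mask['weight'].numel()
       print(f'{name}: {1 - kept:.2f} of the weights are masked')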
.. _scheduled-pruner:

Scheduled Pruner
----------------
A scheduled pruner decides how to allocate the sparsity ratio to each pruning target;
it also handles the model speedup (after each pruning iteration) and the finetuning logic.
@@ -40,7 +45,8 @@ For a clearer structure vision, please refer to the figure below.
For more information about the scheduled pruning process, please refer to :doc:`Pruning Scheduler <pruning_scheduler>`.
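
A rough sketch of the iterative interface of a scheduled pruner; the class and argument names follow NNI's iterative pruning API as we understand it and may differ between versions, and ``train_one_epoch`` is a hypothetical finetuning helper:

.. code-block:: python

   from nni.compression.pytorch.pruning import LinearPruner

   def finetuner(model):
       # hypothetical helper: run a short finetuning pass after each pruning iteration
       train_one_epoch(model)

   # Allocate the target sparsity over several iterations, using a basic pruner
   # ('l1') internally at each step. Argument names are assumptions and may vary.
   scheduled_pruner = LinearPruner(model, config_list, pruning_algorithm='l1',
                                   total_iteration=5, finetuner=finetuner)
   scheduled_pruner.compress()
   _, pruned_model, masks, _, _ = scheduled_pruner.get_best_result()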
Granularity
-----------
Fine-grained pruning or unstructured pruning refers to pruning each individual weight separately.
Coarse-grained pruning or structured pruning prunes a regular group of weights, such as a convolutional filter.
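
As an illustration of the two granularities (a sketch; remember to unwrap the model before attaching another pruner, as the quickstart does):

.. code-block:: python

   from nni.compression.pytorch.pruning import LevelPruner, L1NormPruner

   # Fine-grained (unstructured): LevelPruner masks individual weight entries.
   fine_pruner = LevelPruner(model, [{'sparsity': 0.5, 'op_types': ['Linear']}])
   _, fine_masks = fine_pruner.compress()
   fine_pruner._unwrap_model()   # unwrap before using another pruner on the same model

   # Coarse-grained (structured): L1NormPruner removes whole filters / output channels.
   coarse_pruner = L1NormPruner(model, [{'sparsity': 0.5, 'op_types': ['Conv2d']}])
   _, coarse_masks = coarse_pruner.compress()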
@@ -49,7 +55,8 @@ Only :ref:`level-pruner` and :ref:`admm-pruner` support fine-grained pruning, al
.. _dependency-awareode-for-output-channel-pruning:
Dependency-aware Mode for Output Channel Pruning
------------------------------------------------
Currently, we support dependency-aware mode in several pruners: :ref:`l1-norm-pruner`, :ref:`l2-norm-pruner`, :ref:`fpgm-pruner`,
:ref:`activation-apoz-rank-pruner`, :ref:`activation-mean-rank-pruner`, :ref:`taylor-fo-weight-pruner`.
@@ -99,11 +106,3 @@ In addition, for the convolutional layers that have more than one filter group,
Overall, this pruner will prune the model according to the L1 norm of each filter and try to meet the topological constraints (channel dependency, etc.) to improve the final speed gain after the speedup process.
In the dependency-aware mode, the pruner will provide a better speed gain from model pruning.
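
A sketch of enabling dependency-aware mode; the ``mode`` and ``dummy_input`` keyword names are assumptions based on the pruners listed above and may differ between NNI versions:

.. code-block:: python

   import torch
   from nni.compression.pytorch.pruning import L1NormPruner

   # A dummy input is needed so the pruner can trace channel dependencies in the graph.
   dummy_input = torch.rand(8, 3, 224, 224)
   pruner = L1NormPruner(model, config_list, mode='dependency_aware',
                         dummy_input=dummy_input)
   _, masks = pruner.compress()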
.. toctree::
:hidden:
:maxdepth: 2
Quickstart <../tutorials/pruning_quick_start_mnist>
Pruner <pruner>
Speedup <../tutorials/pruning_speedup>
Overview of NNI Model Quantization
==================================
Quantization refers to compressing models by reducing the number of bits required to represent weights or activations,
which can reduce the computations and the inference time. In the context of deep neural networks, the major numerical
@@ -9,11 +9,3 @@ is an active field of research.
A quantizer is a quantization algorithm implementation in NNI.
You can also :doc:`create your own quantizer <../tutorials/quantization_customize>` using the NNI model compression interface.
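
A rough sketch of setting up quantization-aware training with NNI; the config keys and the ``QAT_Quantizer`` argument order follow the quantization quickstart pattern and are assumptions that may differ between versions:

.. code-block:: python

   import torch
   from torch.optim import SGD
   from nni.compression.pytorch.quantization import QAT_Quantizer

   # Quantize weights and outputs of Conv2d/Linear layers to 8 bits.
   config_list = [{
       'quant_types': ['weight', 'output'],
       'quant_bits': {'weight': 8, 'output': 8},
       'op_types': ['Conv2d', 'Linear'],
   }]
   optimizer = SGD(model.parameters(), lr=0.01)
   dummy_input = torch.rand(3, 1, 28, 28)
   quantizer = QAT_Quantizer(model, config_list, optimizer, dummy_input)
   quantizer.compress()   # wraps the targeted modules with fake-quantization logic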
.. toctree::
:hidden:
:maxdepth: 2
Quickstart <../tutorials/quantization_quick_start_mnist>
Quantizer <quantizer>
SpeedUp <../tutorials/quantization_speedup>
Quantizer in NNI
================
Quantization algorithms compress the original network by reducing the number of bits required to represent weights or activations, which can reduce the computations and the inference time. NNI implements the main part of a quantization algorithm as a quantizer. All quantizers are implemented as close as possible to what is described in their papers (if one exists).
The following table provides a brief introduction to the quantizers implemented in NNI; click the links in the table to view more detailed introductions and use cases.
.. list-table::
:header-rows: 1
...
Compression
===========
.. toctree::
:hidden:
:maxdepth: 2
Overview <overview>
Pruning <toctree_pruning>
Quantization <toctree_quantization>
Config Specification <compression_config_list>
Advanced Usage <advanced_usage>
Pruning
=======
.. toctree::
:hidden:
:maxdepth: 2
Overview <pruning>
Quickstart </tutorials/pruning_quick_start_mnist>
Pruner <pruner>
Speedup </tutorials/pruning_speedup>
Quantization
============
.. toctree::
:hidden:
:maxdepth: 2
Overview <quantization>
Quickstart </tutorials/quantization_quick_start_mnist>
Quantizer <quantizer>
SpeedUp </tutorials/quantization_speedup>
@@ -145,11 +145,6 @@ sphinx_tabs_disable_css_loading = True
tutorials_copy_list = [
# Seems that we don't need it for now.
# Add tuples back if we need it in future.
# ('tutorials/pruning_quick_start_mnist.rst', 'tutorials/cp_pruning_quick_start_mnist.rst'),
# ('tutorials/pruning_speedup.rst', 'tutorials/cp_pruning_speedup.rst'),
# ('tutorials/quantization_quick_start_mnist.rst', 'tutorials/cp_quantization_quick_start_mnist.rst'),
# ('tutorials/quantization_speedup.rst', 'tutorials/cp_quantization_speedup.rst'),
]
# Toctree ensures that toctree docs do not contain any other contents.
...
@@ -309,13 +309,3 @@ Benchmark
The benchmark dataset can be downloaded from `here <https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/>`__.
The reference code is at ``/examples/feature_engineering/gradient_feature_selector/benchmark_test.py``.
Reference and Feedback
----------------------
* To `report a bug <https://github.com/microsoft/nni/issues/new?template=bug-report.rst>`__ for this feature in GitHub;
* To `file a feature or improvement request <https://github.com/microsoft/nni/issues/new?template=enhancement.rst>`__ for this feature in GitHub;
* To know more about :githublink:`Neural Architecture Search with NNI <docs/en_US/NAS/Overview.rst>`\ ;
* To know more about :githublink:`Model Compression with NNI <docs/en_US/Compression/Overview.rst>`\ ;
* To know more about :githublink:`Hyperparameter Tuning with NNI <docs/en_US/Tuner/BuiltinTuner.rst>`\ ;
@@ -16,7 +16,7 @@ NNI Documentation
Hyperparameter Optimization <hpo/index>
nas/toctree
Model Compression <compression/toctree>
feature_engineering/toctree
experiment/toctree
@@ -45,7 +45,7 @@ NNI Documentation
* :doc:`Hyperparameter Optimization </hpo/overview>`
* :doc:`Neural Architecture Search </nas/overview>`
* :doc:`Model Compression </compression/overview>`
* :doc:`Feature Engineering </feature_engineering/overview>`
Get Started
...
.. 954c2f433b4617a40d684df9b1a5f16b
###########################
Neural Network Intelligence
@@ -15,7 +15,7 @@ Neural Network Intelligence
Tutorials <examples>
Hyperparameter Optimization <hpo/index>
Neural Architecture Search <nas/toctree>
Model Compression <compression/toctree>
Feature Engineering <feature_engineering/toctree>
NNI Experiment <experiment/toctree>
HPO API Reference <reference/hpo>
...
@@ -75,7 +75,7 @@ NNI provides an easy-to-use model compression framework to compress deep neural
inference speed without losing performance significantly. Model compression on NNI includes pruning algorithms and quantization algorithms. NNI provides many pruning and
quantization algorithms through the NNI trial SDK. Users can directly use them in their trial code and run the trial code without starting an NNI experiment. Users can also use the NNI model compression framework to customize their own pruning and quantization algorithms.
A detailed description of model compression and its usage can be found :doc:`here <../compression/overview>`.
Automatic Feature Engineering
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
@@ -32,14 +32,14 @@ Tutorials
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip=" Introduction ------------">
.. only:: html
.. figure:: /tutorials/images/thumb/sphx_glr_quantization_speedup_thumb.png
:alt: SpeedUp Model with Calibration Config
:ref:`sphx_glr_tutorials_quantization_speedup.py`
.. raw:: html
@@ -49,18 +49,18 @@ Tutorials
.. toctree::
:hidden:
/tutorials/quantization_speedup
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="Quantization reduces model size and speeds up inference time by reducing the number of bits req...">
.. only:: html
.. figure:: /tutorials/images/thumb/sphx_glr_quantization_quick_start_mnist_thumb.png
:alt: Quantization Quickstart
:ref:`sphx_glr_tutorials_quantization_quick_start_mnist.py`
.. raw:: html
@@ -70,7 +70,7 @@ Tutorials
.. toctree::
:hidden:
/tutorials/quantization_quick_start_mnist
.. raw:: html
...
@@ -109,7 +109,7 @@
},
"outputs": [],
"source": [
"# need to unwrap the model, if the model is wrapped before speedup\npruner._unwrap_model()\n\n# speedup the model\nfrom nni.compression.pytorch.speedup import ModelSpeedup\n\nModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()" "# need to unwrap the model, if the model is wrapped before speedup\npruner._unwrap_model()\n\n# speedup the model, for more information about speedup, please refer :doc:`pruning_speedup`.\nfrom nni.compression.pytorch.speedup import ModelSpeedup\n\nModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()"
]
},
{
@@ -165,7 +165,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
...
@@ -89,7 +89,7 @@ for name, mask in masks.items():
# need to unwrap the model, if the model is wrapped before speedup
pruner._unwrap_model()
# speedup the model, for more information about speedup, please refer to :doc:`pruning_speedup`.
from nni.compression.pytorch.speedup import ModelSpeedup
ModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()
...
930f8ee2f57b70037e3231152a72606c
\ No newline at end of file \ No newline at end of file
@@ -72,6 +72,12 @@ If you are familiar with defining a model and training in pytorch, you can skip
(fc1): Linear(in_features=256, out_features=120, bias=True)
(fc2): Linear(in_features=120, out_features=84, bias=True)
(fc3): Linear(in_features=84, out_features=10, bias=True)
(relu1): ReLU()
(relu2): ReLU()
(relu3): ReLU()
(relu4): ReLU()
(pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
@@ -102,9 +108,9 @@ If you are familiar with defining a model and training in pytorch, you can skip
.. code-block:: none
Average test loss: 0.5368, Accuracy: 8321/10000 (83%)
Average test loss: 0.3092, Accuracy: 9104/10000 (91%)
Average test loss: 0.2070, Accuracy: 9380/10000 (94%)
@@ -181,6 +187,12 @@ Pruners usually require `model` and `config_list` as input arguments.
(module): Linear(in_features=120, out_features=84, bias=True)
)
(fc3): Linear(in_features=84, out_features=10, bias=True)
(relu1): ReLU()
(relu2): ReLU()
(relu3): ReLU()
(relu4): ReLU()
(pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
@@ -229,7 +241,7 @@ and reaches a higher sparsity ratio because `ModelSpeedup` will propagate the ma
# need to unwrap the model, if the model is wrapped before speedup
pruner._unwrap_model()
# speedup the model, for more information about speedup, please refer to :doc:`pruning_speedup`.
from nni.compression.pytorch.speedup import ModelSpeedup
ModelSpeedup(model, torch.rand(3, 1, 28, 28).to(device), masks).speedup_model()
@@ -246,7 +258,7 @@ and reaches a higher sparsity ratio because `ModelSpeedup` will propagate the ma
aten::log_softmax is not Supported! Please report an issue at https://github.com/microsoft/nni. Thanks~
Note: .aten::log_softmax.12 does not have corresponding mask inference object
/home/nishang/anaconda3/envs/MCM/lib/python3.9/site-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at /opt/conda/conda-bld/pytorch_1640811803361/work/build/aten/src/ATen/core/TensorBody.h:417.)
return self._grad
@@ -278,6 +290,12 @@ the model will become real smaller after speedup
(fc1): Linear(in_features=128, out_features=60, bias=True)
(fc2): Linear(in_features=60, out_features=42, bias=True)
(fc3): Linear(in_features=42, out_features=10, bias=True)
(relu1): ReLU()
(relu2): ReLU()
(relu3): ReLU()
(relu4): ReLU()
(pool1): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
(pool2): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
@@ -308,7 +326,7 @@ Because speedup will replace the masked big layers with dense small ones.
.. rst-class:: sphx-glr-timing
**Total running time of the script:** ( 0 minutes 58.337 seconds)
.. _sphx_glr_download_tutorials_pruning_quick_start_mnist.py:
...