@@ -28,7 +28,7 @@ Write a configuration to specify the layers that you want to prune. The followin
'op_types':['default'],
}]
-The specification of configuration can be found `here <./Tutorial.rst#specify-the-configuration>`__. Note that different pruners may have their own defined fields in configuration, for example ``start_epoch`` in AGP pruner. Please refer to each pruner's `usage <./Pruner.rst>`__ for details, and adjust the configuration accordingly.
+The specification of configuration can be found `here <./Tutorial.rst#specify-the-configuration>`__. Note that different pruners may have their own defined fields in configuration. Please refer to each pruner's `usage <./Pruner.rst>`__ for details, and adjust the configuration accordingly.
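For reference, a minimal end-to-end sketch of this step might look like the following (a hedged example assuming NNI's v2.x compression API, the one-shot ``LevelPruner``, and a torchvision model for illustration; other pruners such as AGP take additional arguments):

.. code-block:: python

   import torch
   from torchvision.models import resnet18
   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   model = resnet18()

   # prune 50% of the weights in every layer the pruner supports by default
   config_list = [{
       'sparsity': 0.5,
       'op_types': ['default'],
   }]

   pruner = LevelPruner(model, config_list)
   model = pruner.compress()  # wraps the layers and applies the masks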
Step2. Choose a pruner and compress the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
@@ -80,9 +80,10 @@ Step1. Write configuration
.. code-block:: python

   config_list = [{
-      'quant_types': ['weight'],
+      'quant_types': ['weight', 'input'],
       'quant_bits': {
           'weight': 8,
+          'input': 8,
       }, # you can just use `int` here because all `quant_types` share the same bit length, see config for `ReLU6` below.
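Putting this configuration together with a quantizer, a minimal sketch might look like the following (a hedged example assuming NNI's v2.x QAT quantizer and a torchvision model; the ``ReLU6`` entry illustrates the shorthand ``int`` form mentioned in the comment above):

.. code-block:: python

   import torch
   from torchvision.models import mobilenet_v2
   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

   model = mobilenet_v2()

   config_list = [{
       'quant_types': ['weight', 'input'],
       'quant_bits': {'weight': 8, 'input': 8},
       'op_types': ['Conv2d', 'Linear'],
   }, {
       'quant_types': ['output'],
       'quant_bits': 8,  # shorthand: all quant_types share the same bit length
       'op_types': ['ReLU6'],
   }]

   optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
   quantizer = QAT_Quantizer(model, config_list, optimizer)
   quantizer.compress()
   # then run the usual training loop; fake quantization is applied during training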
"<ipython-input-1-0f2a9eb92f42>:22: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
"<ipython-input-1-0f2a9eb92f42>:22: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
" x = x.view(-1, x.size()[1:].numel())\n"
" x = x.view(-1, x.size()[1:].numel())\n"
]
]
},
},
{
{
"output_type": "stream",
"name": "stdout",
"name": "stdout",
"output_type": "stream",
"text": [
"text": [
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) start to speed up the model\n",
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) start to speed up the model\n",
"[2021-07-26 22:26:18] INFO (FixMaskConflict/MainThread) {'conv1': 1, 'conv2': 1}\n"
"[2021-07-26 22:26:18] INFO (FixMaskConflict/MainThread) {'conv1': 1, 'conv2': 1}\n"
]
]
},
},
{
{
"output_type": "stream",
"name": "stdout",
"name": "stdout",
"output_type": "stream",
"text": [
"text": [
"[2021-07-26 22:26:18] INFO (FixMaskConflict/MainThread) dim0 sparsity: 0.500000\n",
"[2021-07-26 22:26:18] INFO (FixMaskConflict/MainThread) dim0 sparsity: 0.500000\n",
"[2021-07-26 22:26:18] INFO (FixMaskConflict/MainThread) dim1 sparsity: 0.000000\n",
"[2021-07-26 22:26:18] INFO (FixMaskConflict/MainThread) dim1 sparsity: 0.000000\n",
...
@@ -991,16 +986,16 @@
]
},
{
"output_type": "stream",
"name": "stdout",
"name": "stdout",
"output_type": "stream",
"text": [
"text": [
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update indirect sparsity for relu3\n",
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update indirect sparsity for relu3\n",
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the relu3\n"
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the relu3\n"
]
]
},
},
{
{
"output_type": "stream",
"name": "stdout",
"name": "stdout",
"output_type": "stream",
"text": [
"text": [
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update indirect sparsity for fc1\n",
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update indirect sparsity for fc1\n",
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the fc1\n",
"[2021-07-26 22:26:18] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the fc1\n",
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. `Reference paper <http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf>`__
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. `Reference paper <https://arxiv.org/abs/1602.02830>`__
NNI model compression provides a concise interface for customizing new compression algorithms. The design philosophy of the interface is to wrap the framework-related implementation details so that users can focus on the compression logic. Users can dig deeper into our compression framework and build new compression algorithms (pruning or quantization) on top of it. In addition, NNI's auto-tuning capability can be leveraged to compress a model automatically. See `here <./advanced.rst>`__ for more details.
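As an illustration of that interface, a minimal custom pruner might look like the sketch below (a hedged example assuming the v2.x ``Pruner`` base class, whose ``calc_mask`` hook receives a wrapped layer; the fixed-threshold rule here is a placeholder, not a real algorithm):

.. code-block:: python

   import torch
   from nni.compression.pytorch.compressor import Pruner

   class MyMagnitudePruner(Pruner):
       """Toy pruner: masks weights whose magnitude falls below a fixed threshold."""

       def calc_mask(self, wrapper, **kwargs):
           weight = wrapper.module.weight.data
           threshold = 0.01  # placeholder; real algorithms derive this from the config
           mask = torch.gt(torch.abs(weight), threshold).type_as(weight)
           return {'weight_mask': mask}

   # usage sketch:
   #   pruner = MyMagnitudePruner(model, [{'sparsity': 0.5, 'op_types': ['default']}])
   #   model = pruner.compress()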