"src/sdk/vscode:/vscode.git/clone" did not exist on "a7b96de922bc6646f93c142b57e9fb9e2bf01920"
Unverified commit 070df4a0 authored by liuzhe-lz, committed by GitHub

Merge pull request #4291 from microsoft/v2.5

merge v2.5 back to master
parents 821706b8 6a082fe9
@@ -2,8 +2,6 @@
<img src="docs/img/nni_logo.png" width="300"/>
</p>
-----------
[![MIT licensed](https://img.shields.io/badge/license-MIT-brightgreen.svg)](LICENSE)
[![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/full%20test%20-%20linux?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=62&branchName=master)
[![Issues](https://img.shields.io/github/issues-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen)
@@ -27,10 +25,14 @@ The tool manages automated machine learning (AutoML) experiments, **dispatches a
## **What's NEW!** &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>
* **New release**: [v2.5 is available](https://github.com/microsoft/nni/releases) - _released on June-15-2021_
* **New demo available**: [Youtube entry](https://www.youtube.com/channel/UCKcafm6861B2mnYhPbZHavw) | [Bilibili 入口](https://space.bilibili.com/1649051673) - _last updated on May-26-2021_
* **New webinar**: [Introducing Retiarii: A deep learning exploratory-training framework on NNI](https://note.microsoft.com/MSR-Webinar-Retiarii-Registration-Live.html) - _scheduled on June-24-2021_
* **New community channel**: [Discussions](https://github.com/microsoft/nni/discussions)
* **New emoticons release**: [nnSpider](./docs/en_US/Tutorial/NNSpider.md)
<p align="center">
<a href="#nni-spider"><img width="100%" src="docs/img/emoicons/home.svg" /></a>
</p>
## **NNI capabilities in a glance**
@@ -251,7 +253,7 @@ Note:
* Download the examples via cloning the source code.

  ```bash
  git clone -b v2.5 https://github.com/Microsoft/nni.git
  ```

* Run the MNIST example.
......
@@ -8,7 +8,8 @@ torch == 1.9.0+cpu ; sys_platform != "darwin"
torch == 1.9.0 ; sys_platform == "darwin"
torchvision == 0.10.0+cpu ; sys_platform != "darwin"
torchvision == 0.10.0 ; sys_platform == "darwin"
pytorch-lightning >= 1.5
torchmetrics
onnx
peewee
graphviz
......
@@ -6,6 +6,7 @@ torchvision == 0.7.0+cpu
# It will install pytorch-lightning 0.8.x and unit tests won't work.
# Latest version has conflict with tensorboard and tensorflow 1.x.
pytorch-lightning
torchmetrics
keras == 2.1.6
onnx
......
@@ -2,7 +2,7 @@ astor
hyperopt == 0.1.2
json_tricks
psutil
pyyaml >= 5.4
requests
responses
schema
......
@@ -198,3 +198,11 @@ The latency is measured on one V100 GPU and the input tensor is ``torch.randn(1
.. image:: ../../img/SA_latency_accuracy.png
User configuration for ModelSpeedup
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.compression.pytorch.ModelSpeedup
@@ -100,6 +100,8 @@ Quantization algorithms compress the original network by reducing the number of
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. `Reference Paper <https://arxiv.org/abs/1602.02830>`__
* - `LSQ Quantizer <../Compression/Quantizer.rst#lsq-quantizer>`__
  - Learned step size quantization. `Reference Paper <https://arxiv.org/pdf/1902.08153.pdf>`__
* - `Observer Quantizer <../Compression/Quantizer.rst#observer-quantizer>`__
- Post-training quantization. Collect quantization information during calibration with observers.
Model Speedup
......
@@ -9,6 +9,7 @@ Index of supported quantization algorithms
* `DoReFa Quantizer <#dorefa-quantizer>`__
* `BNN Quantizer <#bnn-quantizer>`__
* `LSQ Quantizer <#lsq-quantizer>`__
* `Observer Quantizer <#observer-quantizer>`__
Naive Quantizer
---------------
@@ -102,6 +103,64 @@ The quantizer will automatically detect Conv-BN patterns and simulate batch norm
graph. Note that when the quantization aware training process is finished, the folded weight/bias would be restored after calling
`quantizer.export_model`.
Quantization dtype and scheme customization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Different backends on different devices use different quantization strategies, i.e., dtype (int or uint) and
scheme (per-tensor or per-channel, symmetric or affine). The QAT quantizer supports customization of the mainstream dtypes and schemes.
There are two ways to set them. One is to set them globally through the function `set_quant_scheme_dtype`, like:
.. code-block:: python
from nni.compression.pytorch.quantization.settings import set_quant_scheme_dtype
# This will set all the quantization of 'input' in 'per_tensor_affine' and 'uint' manner
set_quant_scheme_dtype('input', 'per_tensor_affine', 'uint')
# This will set all the quantization of 'output' in 'per_tensor_symmetric' and 'int' manner
set_quant_scheme_dtype('output', 'per_tensor_symmetric', 'int')
# This will set all the quantization of 'weight' in 'per_channel_symmetric' and 'int' manner
set_quant_scheme_dtype('weight', 'per_channel_symmetric', 'int')
The other way is more fine-grained: you can customize the dtype and scheme for each entry in the quantization config list, like:
.. code-block:: python
config_list = [{
'quant_types': ['weight'],
'quant_bits': 8,
'op_types':['Conv2d', 'Linear'],
'quant_dtype': 'int',
'quant_scheme': 'per_channel_symmetric'
}, {
'quant_types': ['output'],
'quant_bits': 8,
'quant_start_step': 7000,
'op_types':['ReLU6'],
'quant_dtype': 'uint',
'quant_scheme': 'per_tensor_affine'
}]
Multi-GPU training
^^^^^^^^^^^^^^^^^^^
QAT quantizer natively supports multi-GPU training (DataParallel and DistributedDataParallel). Note that the quantizer
should be instantiated before you wrap your model with DataParallel or DistributedDataParallel. For example:
.. code-block:: python
from torch.nn.parallel import DistributedDataParallel as DDP
from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer
model = define_your_model()
model = QAT_Quantizer(model, **other_params) # <--- QAT_Quantizer instantiation
model = DDP(model)
for i in range(epochs):
train(model)
eval(model)
----

LSQ Quantizer
@@ -253,3 +312,74 @@ We implemented one of the experiments in `Binarized Neural Networks: Training De
The experiments code can be found at :githublink:`examples/model_compress/quantization/BNN_quantizer_cifar10.py <examples/model_compress/quantization/BNN_quantizer_cifar10.py>`
Observer Quantizer
------------------
Observer quantizer is a framework for post-training quantization. It inserts observers at the places where quantization will happen. During quantization calibration, each observer records all the tensors it 'sees'; these tensors are used to calculate the quantization statistics after calibration.
Usage
^^^^^
1. Configure which layers to quantize and which tensors (input/output/weight) of those layers to quantize.
2. Construct the observer quantizer.
3. Run quantization calibration.
4. Call the `compress` API to calculate the scale and zero point for each tensor and switch the model to evaluation mode.
PyTorch code
.. code-block:: python
from nni.algorithms.compression.pytorch.quantization import ObserverQuantizer
def calibration(model, calib_loader):
model.eval()
with torch.no_grad():
for data, _ in calib_loader:
model(data)
model = Mnist()
configure_list = [{
'quant_bits': 8,
'quant_types': ['weight', 'input'],
'op_names': ['conv1', 'conv2'],
}, {
'quant_bits': 8,
'quant_types': ['output'],
'op_names': ['relu1', 'relu2'],
}]
quantizer = ObserverQuantizer(model, configure_list)
calibration(model, calib_loader)
model = quantizer.compress()
You can view example :githublink:`examples/model_compress/quantization/observer_quantizer.py <examples/model_compress/quantization/observer_quantizer.py>` for more information.
User configuration for Observer Quantizer
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Common configuration needed by compression algorithms can be found at `Specification of config_list <./QuickStart.rst>`__.
.. note::
This quantizer is still under development. For now, some quantizer settings are hard-coded:
- weight observer: per_tensor_symmetric, qint8
- output observer: per_tensor_affine, quint8, reduce_range=True
Other settings (such as quant_type and op_names) can be configured.
About the compress API
^^^^^^^^^^^^^^^^^^^^^^
Before the `compress` API is called, the model will only record tensors' statistics, and no quantization will be executed.
After the `compress` API is called, the model will no longer record tensors' statistics. The quantization scale and zero point will
be generated for each tensor and used to quantize that tensor during inference (we call this evaluation mode).
About calibration
^^^^^^^^^^^^^^^^^
Usually we pick about 100 training/evaluation examples for calibration. If you find the accuracy is a bit low, try
reducing the number of calibration examples.
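The snippet below is a minimal sketch of such a calibration run, reusing ``model``, ``quantizer`` and the ``calibration`` helper from the usage example above; the MNIST dataset and the subset size of 100 are illustrative assumptions, not requirements of the API.

.. code-block:: python

from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

# Illustrative dataset; any dataset matching the model's input works the same way.
dataset = datasets.MNIST('data/mnist', train=True, download=True,
                         transform=transforms.ToTensor())

# Pick roughly 100 examples for calibration, as suggested above.
calib_set = Subset(dataset, list(range(100)))
calib_loader = DataLoader(calib_set, batch_size=10)

calibration(model, calib_loader)  # the helper defined in the usage example above
model = quantizer.compress()      # freeze scales/zero points and switch to evaluation mode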
@@ -85,12 +85,16 @@ Step1. Write configuration
    'quant_bits': {
        'weight': 8,
    }, # you can just use `int` here because all `quant_types` share same bits length, see config for `ReLU6` below.
    'op_types':['Conv2d', 'Linear'],
    'quant_dtype': 'int',
    'quant_scheme': 'per_channel_symmetric'
}, {
    'quant_types': ['output'],
    'quant_bits': 8,
    'quant_start_step': 7000,
    'op_types':['ReLU6'],
    'quant_dtype': 'uint',
    'quant_scheme': 'per_tensor_affine'
}]

The specification of configuration can be found `here <./Tutorial.rst#quantization-specific-keys>`__.
......
@@ -66,7 +66,7 @@ to the weight parameter of modules. 'input' means applying quantization operatio
bit length of quantization; the key is the quantization type and the value is the quantization bit length, e.g.

.. code-block:: python

{
    quant_bits: {
@@ -77,36 +77,102 @@ bits length of quantization, key is the quantization type, value is the quantiza
when the value is of int type, all quantization types share the same bit length, e.g.

.. code-block:: python

{
    quant_bits: 8, # weight or output quantization are all 8 bits
}
* **quant_dtype** : str or dict of {str : str}
Quantization dtype, used to determine the range of quantized values. Two choices are available:

- int: the range is signed
- uint: the range is unsigned

There are two ways to set it. One is a dict whose key is the quantization type and whose value is the quantization dtype, e.g.
.. code-block:: python
{
quant_dtype: {
'weight': 'int',
'output': 'uint',
},
}
The other is a plain str value, in which case all quantization types share the same dtype, e.g.
.. code-block:: python
{
'quant_dtype': 'int', # the dtype of weight and output quantization are all 'int'
}
Only two values of `quant_dtype` are supported: 'int' and 'uint'.
* **quant_scheme** : str or dict of {str : str}
Quantization scheme, used to determine the quantization manner. Four choices are available:
- per_tensor_affine: per tensor, asymmetric quantization
- per_tensor_symmetric: per tensor, symmetric quantization
- per_channel_affine: per channel, asymmetric quantization
- per_channel_symmetric: per channel, symmetric quantization
There are two ways to set it. One is a dict whose key is the quantization type and whose value is the quantization scheme, e.g.
.. code-block:: python
{
quant_scheme: {
'weight': 'per_channel_symmetric',
'output': 'per_tensor_affine',
},
}
The other is a plain str value, in which case all quantization types share the same quant_scheme, e.g.
.. code-block:: python
{
quant_scheme: 'per_channel_symmetric', # the quant_scheme of weight and output quantization are all 'per_channel_symmetric'
}
Only four values of `quant_scheme` are supported: 'per_tensor_affine', 'per_tensor_symmetric', 'per_channel_affine' and 'per_channel_symmetric'.
The following example shows a more complete ``config_list``\ ; it uses ``op_names`` (or ``op_types``\ ) to specify the target layers along with the quantization bits for those layers.

.. code-block:: python
config_list = [{
    'quant_types': ['weight'],
    'quant_bits': 8,
    'op_names': ['conv1'],
    'quant_dtype': 'int',
    'quant_scheme': 'per_channel_symmetric'
},
{
    'quant_types': ['weight'],
    'quant_bits': 4,
    'quant_start_step': 0,
    'op_names': ['conv2'],
    'quant_dtype': 'int',
    'quant_scheme': 'per_tensor_symmetric'
},
{
    'quant_types': ['weight'],
    'quant_bits': 3,
    'op_names': ['fc1'],
    'quant_dtype': 'int',
    'quant_scheme': 'per_tensor_symmetric'
},
{
    'quant_types': ['weight'],
    'quant_bits': 2,
    'op_names': ['fc2'],
    'quant_dtype': 'int',
    'quant_scheme': 'per_channel_symmetric'
}]

In this example, 'op_names' specifies layers by name, and the four layers will be quantized with different quant_bits.
......
Pruning V2
==========
Pruning V2 is a refactoring of the old version and provides more powerful functionality.
Compared with the old version, the iterative pruning process is detached from the pruner, so a pruner is only responsible for pruning the model and generating the masks once.
What's more, pruning V2 unifies the pruning process and allows a freer combination of pruning components.
The task generator only cares about the pruning effect that should be achieved in each round, and uses a config list to express how to prune in the next step.
The pruner is reset with the model and config list given by the task generator, and then generates the masks for the current step.
For a clearer view of the structure, please refer to the figure below.
.. image:: ../../img/pruning_process.png
:target: ../../img/pruning_process.png
:alt:
In V2, a pruning process is usually driven by a pruning scheduler, which contains a specific pruner and a task generator.
Users can also use a pruner directly, as in pruning V1.
For details, please refer to the following tutorials:
.. toctree::
:maxdepth: 2
Pruning Algorithms <v2_pruning_algo>
Pruning Scheduler <v2_scheduler>
Supported Pruning Algorithms in NNI
===================================
NNI provides several pruning algorithms reproduced from the papers. In pruning V2, NNI splits each pruning algorithm into more detailed components.
This means users can freely combine components from different algorithms,
or easily replace a step of the original algorithm with a component of their own implementation to build their own pruning algorithm.
Right now, algorithms that generate masks in a single step are implemented as pruners,
and algorithms that schedule sparsity across iterations are implemented as iterative pruners.
**Pruner**
* `Level Pruner <#level-pruner>`__
* `L1 Norm Pruner <#l1-norm-pruner>`__
* `L2 Norm Pruner <#l2-norm-pruner>`__
* `FPGM Pruner <#fpgm-pruner>`__
* `Slim Pruner <#slim-pruner>`__
* `Activation APoZ Rank Pruner <#activation-apoz-rank-pruner>`__
* `Activation Mean Rank Pruner <#activation-mean-rank-pruner>`__
* `Taylor FO Weight Pruner <#taylor-fo-weight-pruner>`__
* `ADMM Pruner <#admm-pruner>`__
**Iterative Pruner**
* `Linear Pruner <#linear-pruner>`__
* `AGP Pruner <#agp-pruner>`__
* `Lottery Ticket Pruner <#lottery-ticket-pruner>`__
* `Simulated Annealing Pruner <#simulated-annealing-pruner>`__
Level Pruner
------------
This is a basic pruner; some papers call it magnitude pruning or fine-grained pruning.
It masks the weights with smaller absolute values in each specified layer, by a ratio configured in the config list.
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import LevelPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['default'] }]
pruner = LevelPruner(model, config_list)
masked_model, masks = pruner.compress()
User configuration for Level Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.LevelPruner
L1 Norm Pruner
--------------
L1 norm pruner computes the l1 norm of the layer weight on the first dimension,
then prunes the weight blocks on this dimension with the smaller l1 norm values.
That is, it uses the l1 norm of the filters in a convolution layer as metric values,
and the l1 norm of the weight rows in a linear layer as metric values.
For more details, please refer to `PRUNING FILTERS FOR EFFICIENT CONVNETS <https://arxiv.org/abs/1608.08710>`__\.
In addition, L1 norm pruner also supports dependency-aware mode.
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import L1NormPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = L1NormPruner(model, config_list)
masked_model, masks = pruner.compress()
User configuration for L1 Norm Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.L1NormPruner
L2 Norm Pruner
--------------
L2 norm pruner is a variant of L1 norm pruner. It uses l2 norm as metric to determine which weight elements should be pruned.
L2 norm pruner also supports dependency-aware mode.
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import L2NormPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = L2NormPruner(model, config_list)
masked_model, masks = pruner.compress()
User configuration for L2 Norm Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.L2NormPruner
FPGM Pruner
-----------
FPGM pruner prunes the weight blocks on the first dimension that are closest to the geometric median of the blocks in that layer.
In other words, FPGM chooses the weight blocks whose contribution is the most replaceable.
For more details, please refer to `Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration <https://arxiv.org/abs/1811.00250>`__.
FPGM pruner also supports dependency-aware mode.
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import FPGMPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = FPGMPruner(model, config_list)
masked_model, masks = pruner.compress()
User configuration for FPGM Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.FPGMPruner
Slim Pruner
-----------
Slim pruner adds sparsity regularization on the scaling factors of batch normalization (BN) layers during training to identify unimportant channels.
The channels with small scaling factor values will be pruned.
For more details, please refer to `Learning Efficient Convolutional Networks through Network Slimming <https://arxiv.org/abs/1708.06519>`__\.
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import SlimPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['BatchNorm2d'] }]
pruner = SlimPruner(model, config_list, trainer, optimizer, criterion, training_epochs=1)
masked_model, masks = pruner.compress()
User configuration for Slim Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.SlimPruner
Activation APoZ Rank Pruner
---------------------------
Activation APoZ rank pruner prunes on the first weight dimension, removing the blocks with the smallest importance according to the
criterion ``APoZ``, which is calculated from the output activations of convolution layers, to achieve a preset level of network sparsity.
The pruning criterion ``APoZ`` is explained in the paper `Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures <https://arxiv.org/abs/1607.03250>`__.
The APoZ is defined as:
:math:`APoZ_{c}^{(i)} = APoZ\left(O_{c}^{(i)}\right)=\frac{\sum_{k}^{N} \sum_{j}^{M} f\left(O_{c, j}^{(i)}(k)=0\right)}{N \times M}`
Activation APoZ rank pruner also supports dependency-aware mode.
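As a concrete illustration of the criterion (not the pruner's internal implementation), the APoZ of a channel is simply the fraction of zero entries in its output activations:

.. code-block:: python

import torch

def apoz_per_channel(activations):
    # activations: post-ReLU outputs of a Conv2d layer, shape (N, C, H, W).
    # APoZ_c = (number of zero entries in channel c) / (N * H * W),
    # matching the formula above with M = H * W.
    n, _, h, w = activations.shape
    zeros = (activations == 0).float().sum(dim=(0, 2, 3))
    return zeros / (n * h * w)

acts = torch.relu(torch.randn(8, 16, 32, 32))  # dummy post-ReLU activations
print(apoz_per_channel(acts))  # channels that output mostly zeros get a high APoZ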
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import ActivationAPoZRankPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = ActivationAPoZRankPruner(model, config_list, trainer, optimizer, criterion, training_batches=20)
masked_model, masks = pruner.compress()
User configuration for Activation APoZ Rank Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.ActivationAPoZRankPruner
Activation Mean Rank Pruner
---------------------------
Activation mean rank pruner prunes on the first weight dimension, removing the blocks with the smallest importance according to the
criterion ``mean activation``, which is calculated from the output activations of convolution layers, to achieve a preset level of network sparsity.
The pruning criterion ``mean activation`` is explained in section 2.2 of the paper `Pruning Convolutional Neural Networks for Resource Efficient Inference <https://arxiv.org/abs/1611.06440>`__.
Activation mean rank pruner also supports dependency-aware mode.
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import ActivationMeanRankPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = ActivationMeanRankPruner(model, config_list, trainer, optimizer, criterion, training_batches=20)
masked_model, masks = pruner.compress()
User configuration for Activation Mean Rank Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.ActivationMeanRankPruner
Taylor FO Weight Pruner
-----------------------
Taylor FO weight pruner prunes on the first weight dimension, based on an importance estimate calculated from the
first-order Taylor expansion on the weights, to achieve a preset level of network sparsity.
The estimated importance is defined in the paper `Importance Estimation for Neural Network Pruning <http://jankautz.com/publications/Importance4NNPruning_CVPR19.pdf>`__.
:math:`\widehat{\mathcal{I}}_{\mathcal{S}}^{(1)}(\mathbf{W}) \triangleq \sum_{s \in \mathcal{S}} \mathcal{I}_{s}^{(1)}(\mathbf{W})=\sum_{s \in \mathcal{S}}\left(g_{s} w_{s}\right)^{2}`
Taylor FO weight pruner also supports dependency-aware mode.
In addition, we provide a global-sort mode for this pruner, which is aligned with the paper's implementation.
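As a rough illustration of the importance score above (not the pruner's internal code), for a convolution filter the score is the sum of squared gradient-times-weight products over the filter's elements:

.. code-block:: python

import torch

def taylor_fo_importance(weight, grad):
    # weight, grad: a Conv2d weight and its gradient, shape (C_out, C_in, K, K).
    # Importance of a filter: sum over its elements of (g * w) ** 2, per the formula above.
    return ((weight * grad) ** 2).flatten(1).sum(dim=1)

w = torch.randn(16, 3, 3, 3, requires_grad=True)
loss = (w ** 2).sum()  # a dummy loss, used only to obtain a gradient
loss.backward()
print(taylor_fo_importance(w.detach(), w.grad))  # filters with lower scores are pruned first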
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import TaylorFOWeightPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = TaylorFOWeightPruner(model, config_list, trainer, optimizer, criterion, training_batches=20)
masked_model, masks = pruner.compress()
User configuration for Taylor FO Weight Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.TaylorFOWeightPruner
ADMM Pruner
-----------
Alternating Direction Method of Multipliers (ADMM) is a mathematical optimization technique
that decomposes the original nonconvex problem into two subproblems that can be solved iteratively.
In the weight pruning problem, these two subproblems are solved via 1) a gradient descent algorithm and 2) a Euclidean projection, respectively.
During the process of solving these two subproblems, the weights of the original model will be changed.
Then a fine-grained pruning will be applied to prune the model according to the config list given.
This solution framework applies both to non-structured and different variations of structured pruning schemes.
For more details, please refer to `A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers <https://arxiv.org/abs/1804.03294>`__.
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import ADMMPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = ADMMPruner(model, config_list, trainer, optimizer, criterion, iterations=10, training_epochs=1)
masked_model, masks = pruner.compress()
User configuration for ADMM Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.ADMMPruner
Linear Pruner
-------------
Linear pruner is an iterative pruner; it increases the sparsity evenly from scratch over the iterations.
For example, if the final sparsity is set to 0.5 and the iteration number is 5, the sparsity used in each iteration is ``[0, 0.1, 0.2, 0.3, 0.4, 0.5]``.
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import LinearPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = LinearPruner(model, config_list, pruning_algorithm='l1', total_iteration=10, finetuner=finetuner)
pruner.compress()
_, model, masks, _, _ = pruner.get_best_result()
User configuration for Linear Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.LinearPruner
AGP Pruner
----------
This is an iterative pruner, in which the sparsity is increased from an initial sparsity value :math:`s_{i}` (usually 0) to a final sparsity value :math:`s_{f}` over a span of :math:`n` pruning iterations,
starting at training step :math:`t_{0}` and with pruning frequency :math:`\Delta t`:
:math:`s_{t}=s_{f}+\left(s_{i}-s_{f}\right)\left(1-\frac{t-t_{0}}{n \Delta t}\right)^{3} \text { for } t \in\left\{t_{0}, t_{0}+\Delta t, \ldots, t_{0} + n \Delta t\right\}`
For more details please refer to `To prune, or not to prune: exploring the efficacy of pruning for model compression <https://arxiv.org/abs/1710.01878>`__\.
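The schedule itself is easy to evaluate; the sketch below (purely illustrative, not part of the NNI API) prints the sparsity at each step for ``s_i = 0``, ``s_f = 0.8`` and 10 iterations:

.. code-block:: python

def agp_sparsity(t, s_i=0.0, s_f=0.8, t0=0, n=10, delta_t=1):
    # s_t = s_f + (s_i - s_f) * (1 - (t - t0) / (n * delta_t)) ** 3
    return s_f + (s_i - s_f) * (1 - (t - t0) / (n * delta_t)) ** 3

# The sparsity rises quickly in early iterations and flattens out near the final value.
print([round(agp_sparsity(t), 3) for t in range(11)])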
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import AGPPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = AGPPruner(model, config_list, pruning_algorithm='l1', total_iteration=10, finetuner=finetuner)
pruner.compress()
_, model, masks, _, _ = pruner.get_best_result()
User configuration for AGP Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.AGPPruner
Lottery Ticket Pruner
---------------------
In `The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks <https://arxiv.org/abs/1803.03635>`__\ ,
the authors Jonathan Frankle and Michael Carbin provide comprehensive measurement and analysis,
and articulate the *lottery ticket hypothesis*\ : dense, randomly-initialized, feed-forward networks contain subnetworks (*winning tickets*\ ) that
-- when trained in isolation -- reach test accuracy comparable to the original network in a similar number of iterations.
In this paper, the authors use the following process to prune a model, called *iterative pruning*\ :
#. Randomly initialize a neural network f(x; theta_0) (where theta_0 follows D_{theta}).
#. Train the network for j iterations, arriving at parameters theta_j.
#. Prune p% of the parameters in theta_j, creating a mask m.
#. Reset the remaining parameters to their values in theta_0, creating the winning ticket f(x;m*theta_0).
#. Repeat steps 2, 3, and 4.
If the configured final sparsity is P (e.g., 0.8) and there are n pruning iterations,
each iteration prunes 1-(1-P)^(1/n) of the weights that survived the previous round.
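For example (an illustrative computation, not NNI code), with ``P = 0.8`` and ``n = 5`` each round prunes roughly 27.5% of the surviving weights:

.. code-block:: python

P, n = 0.8, 5
per_round = 1 - (1 - P) ** (1 / n)         # fraction of surviving weights pruned each round
print(round(per_round, 4))                 # ~0.2752
print(round(1 - (1 - per_round) ** n, 4))  # overall sparsity after n rounds: 0.8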
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import LotteryTicketPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = LotteryTicketPruner(model, config_list, pruning_algorithm='l1', total_iteration=10, finetuner=finetuner, reset_weight=True)
pruner.compress()
_, model, masks, _, _ = pruner.get_best_result()
User configuration for Lottery Ticket Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.LotteryTicketPruner
Simulated Annealing Pruner
--------------------------
We implement a guided heuristic search method, the Simulated Annealing (SA) algorithm. As mentioned in the paper, this method enhances guided search based on prior experience.
The enhanced SA technique is based on the observation that a DNN layer with a larger number of weights often tolerates a higher degree of model compression with less impact on overall accuracy.
* Randomly initialize a pruning rate distribution (sparsities).
* While current_temperature is higher than stop_temperature:

  #. generate a perturbation to the current distribution
  #. perform a fast evaluation on the perturbed distribution
  #. accept the perturbation according to the performance and probability; if not accepted, return to step 1
  #. cool down, current_temperature <- current_temperature * cool_down_rate
For more details, please refer to `AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates <https://arxiv.org/abs/1907.03141>`__.
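The search loop can be sketched as standard simulated annealing (illustrative pseudocode only, not the pruner's actual implementation; ``evaluate``, ``perturb`` and the temperature defaults are placeholder assumptions):

.. code-block:: python

import math
import random

def anneal(sparsities, evaluate, perturb,
           start_temperature=100.0, stop_temperature=20.0, cool_down_rate=0.9):
    # sparsities: the current per-layer pruning-rate distribution.
    current_temperature = start_temperature
    score = evaluate(sparsities)
    while current_temperature > stop_temperature:
        candidate = perturb(sparsities)
        candidate_score = evaluate(candidate)
        delta = candidate_score - score
        # Always accept improvements; accept worse candidates with a temperature-dependent probability.
        if delta > 0 or random.random() < math.exp(delta / current_temperature):
            sparsities, score = candidate, candidate_score
        current_temperature *= cool_down_rate
    return sparsities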
Usage
^^^^^^
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import SimulatedAnnealingPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
pruner = SimulatedAnnealingPruner(model, config_list, pruning_algorithm='l1', cool_down_rate=0.9, finetuner=finetuner)
pruner.compress()
_, model, masks, _, _ = pruner.get_best_result()
User configuration for Simulated Annealing Pruner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**PyTorch**
.. autoclass:: nni.algorithms.compression.v2.pytorch.pruning.SimulatedAnnealingPruner
Pruning Scheduler
=================
The pruning scheduler is a new feature in pruning V2. It brings more flexibility for pruning the model iteratively.
All the built-in iterative pruners (e.g., AGPPruner, SimulatedAnnealingPruner) are based on three abstracted components: pruning scheduler, pruners and task generators.
In addition to using the NNI built-in iterative pruners,
users can directly use the pruning schedulers to customize their own iterative pruning logic.
Workflow of Pruning Scheduler
-----------------------------
In iterative pruning, the final goal is broken down into small goals, and one small goal is completed in each iteration.
For example, each iteration increases the sparsity ratio a little, and after several pruning iterations the continuously pruned model reaches the final overall sparsity;
or, the overall sparsity is fixed, and each iteration tries a different way to allocate sparsity between layers in order to find the best allocation.
We define a small goal as a ``Task``; it usually includes state inherited from previous iterations (e.g., the pruned model and masks) and a description of the current goal (e.g., a config list that describes how to allocate sparsity).
Details about ``Task`` can be found in this :githublink:`file <nni/algorithms/compression/v2/pytorch/base/scheduler.py>`.
The pruning scheduler handles two main components, a basic pruner and a task generator. The logic of generating a ``Task`` is encapsulated in the task generator.
In an iteration (one pruning step), the pruning scheduler parses the ``Task`` obtained from the task generator,
and resets the pruner with the ``model``, ``masks`` and ``config_list`` parsed from the ``Task``.
Then the pruning scheduler generates the new masks with the pruner. During an iteration, the new masked model may also go through speed-up, finetuning, and evaluation.
After one iteration is done, the pruning scheduler collects the compact model, new masks and evaluation score, packages them into a ``TaskResult``, and passes it to the task generator.
The iteration process ends when the task generator has no more ``Task`` to offer.
How to Customize Iterative Pruning
-----------------------------------
Here we use AGP pruning as an example to explain how to implement iterative pruning with a scheduler in NNI.
.. code-block:: python
from nni.algorithms.compression.v2.pytorch.pruning import L1NormPruner, PruningScheduler
from nni.algorithms.compression.v2.pytorch.pruning.tools import AGPTaskGenerator
pruner = L1NormPruner(model=None, config_list=None, mode='dependency_aware', dummy_input=torch.rand(10, 3, 224, 224).to(device))
task_generator = AGPTaskGenerator(total_iteration=10, origin_model=model, origin_config_list=config_list, log_dir='.', keep_intermediate_result=True)
scheduler = PruningScheduler(pruner, task_generator, finetuner=finetuner, speed_up=True, dummy_input=dummy_input, evaluator=None, reset_weight=False)
scheduler.compress()
_, model, masks, _, _ = scheduler.get_best_result()
The full script can be found :githublink:`here <examples/model_compress/pruning/v2/scheduler_torch.py>`.
In this example, we use an L1 Norm Pruner in ``dependency_aware`` mode as the basic pruner during each iteration.
Note we do not need to pass ``model`` and ``config_list`` to the pruner, because in each iteration the ``model`` and ``config_list`` used by the pruner are received from the task generator.
Then we can use ``scheduler`` as an iterative pruner directly. In fact, this is the implementation of ``AGPPruner`` in NNI.
More about Task Generator
-------------------------
The task generator provides, for each iteration, the model that needs to be pruned and the corresponding config_list.
For example, ``AGPTaskGenerator`` will give the model pruned in the previous iteration and compute the sparsity used in the current iteration.
The ``TaskGenerator`` puts all this pruning information into a ``Task``; the pruning scheduler gets the ``Task`` and runs it.
The pruning result is returned to the ``TaskGenerator`` at the end of each iteration, and the ``TaskGenerator`` decides whether and how to generate the next ``Task``.
The information included in the ``Task`` and ``TaskResult`` can be found :githublink:`here <nni/algorithms/compression/v2/pytorch/base/scheduler.py>`.
A clearer iterative pruning flow chart can be found `here <v2_pruning.rst>`__.
If you want to implement your own task generator, please follow the ``TaskGenerator`` :githublink:`interface <nni/algorithms/compression/v2/pytorch/pruning/tools/base.py>`.
Two main functions should be implemented: ``init_pending_tasks(self) -> List[Task]`` and ``generate_tasks(self, task_result: TaskResult) -> List[Task]``.
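A minimal skeleton might look like the following. This is only a sketch: the import paths are inferred from the files linked above, and the construction of ``Task`` objects is left open because it depends on the fields defined there.

.. code-block:: python

from typing import List

from nni.algorithms.compression.v2.pytorch.base.scheduler import Task, TaskResult
from nni.algorithms.compression.v2.pytorch.pruning.tools.base import TaskGenerator

class MyTaskGenerator(TaskGenerator):
    def init_pending_tasks(self) -> List[Task]:
        # Return the first batch of tasks, e.g. one task carrying the original
        # model and an initial config list; see the linked scheduler.py for the
        # fields a Task carries.
        ...

    def generate_tasks(self, task_result: TaskResult) -> List[Task]:
        # Inspect task_result (compact model, masks, score) and decide whether
        # to emit further tasks; returning an empty list ends the iteration.
        ...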
Why Use Pruning Scheduler
-------------------------
One of the benefits of using a scheduler for iterative pruning is that users can reach more of the functionality of the NNI pruning components;
for simplicity of the interface and faithfulness to the papers, NNI does not fully expose all the low-level interfaces at the top level.
For example, resetting the weights to their original values in each iteration is a key point of the lottery ticket pruning algorithm, and this is implemented in ``LotteryTicketPruner``.
To reduce the complexity of the interface, we only support this function in ``LotteryTicketPruner``, not in other pruners.
If users want to reset weights during each iteration of AGP pruning, ``AGPPruner`` cannot do this, but they can easily set ``reset_weight=True`` in ``PruningScheduler`` to achieve it.
What's more, for a customized pruner or task generator, using the scheduler can easily enhance the algorithm.
In addition, users can also customize the scheduling process to implement their own scheduler.
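For instance, AGP-style iterative pruning with weight resetting can be obtained by reusing the scheduler from the customization example above and flipping the flag; all other arguments are the same assumptions as in that example.

.. code-block:: python

scheduler = PruningScheduler(pruner, task_generator, finetuner=finetuner, speed_up=True,
                             dummy_input=dummy_input, evaluator=None,
                             reset_weight=True)  # reset weights at each iteration, as in lottery ticket
scheduler.compress()
_, model, masks, _, _ = scheduler.get_best_result()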
@@ -102,9 +102,20 @@ Retiarii Experiments
.. autoclass:: nni.retiarii.experiment.pytorch.RetiariiExeConfig
   :members:
CGO Execution
-------------
.. autofunction:: nni.retiarii.evaluator.pytorch.cgo.evaluator.MultiModelSupervisedLearningModule
.. autofunction:: nni.retiarii.evaluator.pytorch.cgo.evaluator.Classification
.. autofunction:: nni.retiarii.evaluator.pytorch.cgo.evaluator.Regression
Utilities
---------
.. autofunction:: nni.retiarii.serialize
.. autofunction:: nni.retiarii.fixed_arch
@@ -15,14 +15,21 @@ Prerequisites
-------------
* Please prepare a folder to hold all the benchmark databases. By default, it can be found at ``${HOME}/.cache/nni/nasbenchmark``. Or you can place it anywhere you like, and specify it in ``NASBENCHMARK_DIR`` via ``export NASBENCHMARK_DIR=/path/to/your/nasbenchmark`` before importing NNI.
* Please install ``peewee`` via ``pip3 install peewee``\ , which NNI uses to connect to the database.

Data Preparation
----------------

Option 1 (Recommended)
^^^^^^^^^^^^^^^^^^^^^^
You can download the preprocessed benchmark files via ``python -m nni.nas.benchmarks.download <benchmark_name>``, where ``<benchmark_name>`` can be ``nasbench101``, ``nasbench201``, etc. Add ``--help`` to the command for supported command line arguments.
Option 2
^^^^^^^^
.. note:: If you have files that are processed before v2.5, it is recommended that you delete them and try option 1.
#. Clone NNI to your machine and enter the ``examples/nas/benchmarks`` directory.
......
@@ -53,7 +53,54 @@ Three steps are need to use graph-based execution engine.
For exporting top models, graph-based execution engine supports exporting source code for top models by running ``exp.export_top_models(formatter='code')``.

CGO Execution Engine (experimental)
-----------------------------------

CGO (Cross-Graph Optimization) execution engine does cross-model optimizations based on the graph-based execution engine. In the CGO execution engine, multiple models can be merged and trained together in one trial.
Currently, it only supports ``DedupInputOptimizer``, which can merge graphs that share the same dataset so that each batch of data is loaded and pre-processed only once, avoiding a bottleneck on data loading.

.. note:: To use the CGO engine, pytorch-lightning above version 1.4.2 is required.
To enable CGO execution engine, you need to follow these steps:
1. Create RetiariiExeConfig with remote training service. CGO execution engine currently only supports remote training service.
2. Add configurations for remote training service
3. Add configurations for CGO engine
.. code-block:: python
exp = RetiariiExperiment(base_model, trainer, mutators, strategy)
config = RetiariiExeConfig('remote')
# ...
# other configurations of RetiariiExeConfig
config.execution_engine = 'cgo' # set execution engine to CGO
config.max_concurrency_cgo = 3 # the maximum number of concurrent models to merge
config.batch_waiting_time = 10 # how many seconds CGO execution engine should wait before optimizing a new batch of models
rm_conf = RemoteMachineConfig()
# ...
# server configuration in rm_conf
rm_conf.gpu_indices = [0, 1, 2, 3] # gpu_indices must be set in RemoteMachineConfig for CGO execution engine
config.training_service.machine_list = [rm_conf]
exp.run(config, 8099)
CGO Execution Engine only supports pytorch-lightning trainers that inherit `MultiModelSupervisedLearningModule <./ApiReference.rst#nni.retiarii.evaluator.pytorch.cgo.evaluator.MultiModelSupervisedLearningModule>`__.
For a trial running multiple models, the trainers inheriting ``MultiModelSupervisedLearningModule`` can handle the multiple outputs from the merged model for training, test and validation.
We have already implemented two trainers: `Classification <./ApiReference.rst#nni.retiarii.evaluator.pytorch.cgo.evaluator.Classification>`__ and `Regression <./ApiReference.rst#nni.retiarii.evaluator.pytorch.cgo.evaluator.Regression>`__.
.. code-block:: python
from nni.retiarii.evaluator.pytorch.cgo.evaluator import Classification
trainer = Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
max_epochs=1, limit_train_batches=0.2)
Advanced users can also implement their own trainers by inheriting ``MultiModelSupervisedLearningModule``.
Sometimes, a mutated model cannot be executed (e.g., due to shape mismatch). When a trial running multiple models contains
a bad model, CGO execution engine will re-run each model independently in separate trials without cross-model optimizations.
@@ -5,6 +5,71 @@
Change Log
==========
Release 2.5 - 11/2/2021
-----------------------
Model Compression
^^^^^^^^^^^^^^^^^
* New major version of pruning framework `(doc) <https://nni.readthedocs.io/en/v2.5/Compression/v2_pruning.html>`__
* Iterative pruning is more automated; users need less code to implement it.
* Support exporting intermediate models in the iterative pruning process.
* The implementation of the pruning algorithm is closer to the paper.
* Users can easily customize their own iterative pruning by using ``PruningScheduler``.
* Optimized the basic pruners' underlying mask-generation logic, making it easier to extend with new functions.
* Optimized the memory usage of the pruners.
* MobileNetV2 end-to-end example `(notebook) <https://github.com/microsoft/nni/blob/v2.5/examples/model_compress/pruning/mobilenetv2_end2end/Compressing%20MobileNetV2%20with%20NNI%20Pruners.ipynb>`__
* Improved QAT quantizer `(doc) <https://nni.readthedocs.io/en/v2.5/Compression/Quantizer.html#qat-quantizer>`__
* support dtype and scheme customization
* support dp multi-gpu training
* support load_calibration_config
* Model speed-up now supports directly loading the mask `(doc) <https://nni.readthedocs.io/en/v2.5/Compression/ModelSpeedup.html#nni.compression.pytorch.ModelSpeedup>`__
* Support speed-up depth-wise convolution
* Support bn-folding for LSQ quantizer
* Support QAT and LSQ resume from PTQ
* Added doc for observer quantizer `(doc) <https://nni.readthedocs.io/en/v2.5/Compression/Quantizer.html#observer-quantizer>`__
Neural Architecture Search
^^^^^^^^^^^^^^^^^^^^^^^^^^
* NAS benchmark `(doc) <https://nni.readthedocs.io/en/v2.5/NAS/Benchmarks.html>`__
* Support benchmark table lookup in experiments
* New data preparation approach
* Improved `quick start doc <https://nni.readthedocs.io/en/v2.5/NAS/QuickStart.html>`__
* Experimental CGO execution engine `(doc) <https://nni.readthedocs.io/en/v2.5/NAS/ExecutionEngines.html#cgo-execution-engine-experimental>`__
Hyper-Parameter Optimization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* New training platform: Alibaba DSW+DLC `(doc) <https://nni.readthedocs.io/en/v2.5/TrainingService/DLCMode.html>`__
* Support passing ConfigSpace definition directly to BOHB `(doc) <https://nni.readthedocs.io/en/v2.5/Tuner/BohbAdvisor.html#usage>`__ (thanks to khituras)
* Reformatted `experiment config doc <https://nni.readthedocs.io/en/v2.5/reference/experiment_config.html>`__
* Added example config files for Windows (thanks to @politecat314)
* FrameworkController now supports reuse mode
Fixed Bugs
^^^^^^^^^^
* Experiment cannot start due to platform timestamp format (issue #4077 #4083)
* Cannot use ``1e-5`` in search space (issue #4080)
* Dependency version conflict caused by ConfigSpace (issue #3909) (thanks to @jexxers)
* Hardware-aware SPOS example does not work (issue #4198)
* Web UI show wrong remaining time when duration exceeds limit (issue #4015)
* cudnn.deterministic is always set in AMC pruner (#4117) thanks to @mstczuo
And...
^^^^^^
* New `emoticons <https://github.com/microsoft/nni/blob/v2.5/docs/en_US/Tutorial/NNSpider.md>`__!
.. image:: https://raw.githubusercontent.com/microsoft/nni/v2.5/docs/img/emoicons/Holiday.png
Release 2.4 - 8/11/2021
-----------------------
......
@@ -106,6 +106,7 @@ To use BOHB, you should add the following spec in your experiment's YAML config
* **random_fraction**\ (*float, optional, default = 0.33*\ ): fraction of purely random configurations that are sampled from the prior without the model.
* **bandwidth_factor**\ (*float, optional, default = 3.0*\ ): to encourage diversity, the points proposed to optimize EI are sampled from a 'widened' KDE where the bandwidth is multiplied by this factor. We suggest using the default value if you are not familiar with KDE.
* **min_bandwidth**\ (*float, optional, default = 0.001*\ ): to keep diversity, even when all (good) samples have the same value for one of the parameters, a minimum bandwidth (default: 1e-3) is used instead of zero. We suggest using the default value if you are not familiar with KDE.
* **config_space** (*str, optional*): directly use a .pcs file serialized by `ConfigSpace <https://automl.github.io/ConfigSpace/>`__ in "pcs new" format. In this case, the search space file (if provided in the config) will be ignored. Note that this path needs to be an absolute path; relative paths are currently not supported.
*Please note that the float type currently only supports decimal representations. You have to use 0.333 instead of 1/3 and 0.001 instead of 1e-3.*
......
@@ -108,4 +108,4 @@ A common example of this would be run the mnist example without installing tenso
As it shows, every trial has a log path, where you can find the trial's log and stderr.
In addition to experiment-level debugging, NNI also provides the capability for debugging a single trial without the need to start the entire experiment. Refer to `standalone mode <../TrialExample/Trials.rst#standalone-mode-for-debugging>`__ for more information about debugging single trial code.
@@ -24,7 +24,7 @@ Install NNI through source code
.. code-block:: bash

git clone -b v2.5 https://github.com/Microsoft/nni.git
cd nni
python3 -m pip install -U -r dependencies/setup.txt
python3 -m pip install -r dependencies/develop.txt
@@ -38,7 +38,7 @@ If you want to perform a persist install instead, we recommend to build your own
.. code-block:: bash

git clone -b v2.5 https://github.com/Microsoft/nni.git
cd nni
export NNI_RELEASE=2.0
python3 -m pip install -U -r dependencies/setup.txt
@@ -61,7 +61,7 @@ Verify installation
.. code-block:: bash

git clone -b v2.5 https://github.com/Microsoft/nni.git

*
  Run the MNIST example.
......
@@ -40,7 +40,7 @@ If you want to contribute to NNI, refer to `setup development environment <Setup
.. code-block:: bat

git clone -b v2.5 https://github.com/Microsoft/nni.git
cd nni
python -m pip install -U -r dependencies/setup.txt
python -m pip install -r dependencies/develop.txt
@@ -54,7 +54,7 @@ Verify installation
.. code-block:: bat

git clone -b v2.5 https://github.com/Microsoft/nni.git

*
  Run the MNIST example.
......