The 'default' op_type stands for the module types defined in [default_layers.py](https://github.com/microsoft/nni/blob/master/src/sdk/pynni/nni/compression/torch/default_layers.py) for PyTorch.
Therefore ```{ 'sparsity': 0.8, 'op_types': ['default'] }``` means that **all layers with specified op_types will be compressed with the same 0.8 sparsity**. When ```pruner.compress()``` is called, the model is compressed with masks, and after that you can normally fine-tune this model; **the weights that have been masked won't be updated**.
You can use other compression algorithms in the package of `nni.compression`. The algorithms are implemented in both PyTorch and TensorFlow, under `nni.compression.torch` and `nni.compression.tensorflow` respectively. You can refer to [Pruner](./Pruner.md) and [Quantizer](./Quantizer.md) for detailed descriptions of the supported algorithms.
The function call `pruner.compress()` modifies the user-defined model (in TensorFlow the model can be obtained with `tf.get_default_graph()`, while in PyTorch the model is the defined model class) by inserting masks. Then when you run the model, the masks take effect. The masks can be adjusted at runtime by the algorithms.
When instantiating a compression algorithm, a `config_list` is passed in. We describe how to write this config below.
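For example, a minimal sketch (assuming the PyTorch version and the built-in `LevelPruner`, used here only for illustration) of passing a `config_list`:

```
from nni.compression.torch import LevelPruner

# one dict per rule: all layers matching 'op_types' get pruned to 0.8 sparsity
config_list = [{ 'sparsity': 0.8, 'op_types': ['default'] }]
pruner = LevelPruner(model, config_list)
pruner.compress()
```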
There are two APIs related to training progress. One is `update_epoch`, which can be called with `pruner.update_epoch(epoch)` at the beginning of each epoch. The other is `step`, which can be called with `pruner.step()` after each minibatch. Note that not all algorithms need these two APIs; for those that do not need them, calling them is allowed but has no effect.
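A sketch of where these two calls fit in a typical PyTorch training loop (`train_one_batch` is a hypothetical helper):

```
for epoch in range(num_epochs):
    pruner.update_epoch(epoch)           # at the beginning of each epoch
    for data, target in train_loader:
        train_one_batch(data, target)    # hypothetical training step
        pruner.step()                    # after each minibatch
```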
If you are pruning your model, you can easily export the compressed model using the following API. The ```state_dict``` of the sparse model's weights will be stored in ```model.pth```, which can be loaded by ```torch.load('model.pth')```:
```
pruner.export_model(model_path='model.pth')
```
```mask_dict``` and the pruned model in ```onnx``` format (```input_shape``` needs to be specified) can also be exported like this (the `input_shape` value below is illustrative):
If you want to write a new pruning algorithm, you can write a class that inherits `nni.compression.tensorflow.Pruner` (or `nni.compression.torch.Pruner` for PyTorch) and override its member functions with the logic of your algorithm.
```
# nni.compression.tensorflow.Pruner with
# nni.compression.torch.Pruner
class YourPruner(nni.compression.tensorflow.Pruner):
    def __init__(self, model, config_list):
        """
        Suggest you to use the NNI defined spec for config
        """
        super().__init__(model, config_list)

    def calc_mask(self, layer, config):
        """
        Pruners should overload this method to provide mask for weight tensors.
        The mask must have the same shape and type comparing to the weight.
        It will be applied with ``mul()`` operation on the weight.
        This method is effectively hooked to ``forward()`` method of the model.

        Parameters
        ----------
        layer: LayerInfo
            calculate mask for ``layer``'s weight
        config: dict
            the configuration for generating the mask
        """
        return your_mask

    # note for pytorch version, there is no sess in input arguments
    def update_epoch(self, epoch_num, sess):
        pass

    # note for pytorch version, there is no sess in input arguments
    def step(self, sess):
        """
        Can do some processing based on the model or weights, e.g.,
        through ``self.bound_model``
        """
        pass
```
For the simplest algorithm, you only need to override ``calc_mask``. It receives the to-be-compressed layers one by one along with their compression configuration. You generate the mask for the layer's weight in this function and return it. Then NNI applies the mask for you.
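For instance, the body of a minimal magnitude-based ``calc_mask`` method might look like the sketch below (PyTorch version; it assumes `layer.module` exposes the wrapped module, and prunes the weights with the smallest absolute values up to the configured sparsity):

```
import torch

def calc_mask(self, layer, config):
    weight = layer.module.weight.data                 # weight tensor of this layer
    num_prune = int(weight.numel() * config['sparsity'])
    if num_prune == 0:
        return torch.ones(weight.shape).type_as(weight)
    # threshold: the largest magnitude among the num_prune smallest weights
    w_abs = weight.abs()
    threshold = torch.topk(w_abs.view(-1), num_prune, largest=False)[0].max()
    # 1 keeps a weight, 0 masks it out
    return torch.gt(w_abs, threshold).type_as(weight)
```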
Some algorithms generate masks based on training progress, i.e., epoch number. We provide `update_epoch` for the pruner to be aware of the training progress. It should be called at the beginning of each epoch.
Some algorithms may want global information for generating masks, for example, all weights of the model (for statistical information). You can use `self.bound_model` in the Pruner class for accessing weights. If you also need the optimizer's information (for example in PyTorch), you could override `__init__` to receive more arguments such as the model's optimizer. Then `step` can process or update the information according to the algorithm. You can refer to [source code of built-in algorithms](https://github.com/microsoft/nni/tree/master/src/sdk/pynni/nni/compressors) for example implementations.
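For example, a hypothetical pruner that also needs the optimizer could be sketched as:

```
class YourPruner(nni.compression.torch.Pruner):
    def __init__(self, model, config_list, optimizer):
        super().__init__(model, config_list)
        self.optimizer = optimizer      # keep a reference for later use

    def step(self):
        # e.g., inspect gradients or the learning rate via self.optimizer
        pass
```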
### Quantization algorithm
The interface for customizing a quantization algorithm is similar to that of pruning algorithms:
```
# nni.compression.tensorflow.Quantizer with
# nni.compression.torch.Quantizer
class YourQuantizer(nni.compression.tensorflow.Quantizer):
    def __init__(self, model, config_list):
        """
        Suggest you to use the NNI defined spec for config
        """
        super().__init__(model, config_list)

    def quantize_weight(self, weight, config, **kwargs):
        """
        Quantizers should overload this method to quantize weight tensors.
        """
        # generate and return the quantized weight here
        return new_weight
```
NNI supports two kinds of authorization methods in PAI: password and PAI token; see the [PAI authentication documentation](https://github.com/microsoft/pai/blob/b6bd2ab1c8890f91b7ac5859743274d2aa923c22/docs/rest-server/API.md#2-authentication). The authorization is configured in the `paiConfig` field.
For password authorization, the `paiConfig` schema is:
```
paiConfig:
userName: your_pai_nni_user
passWord: your_pai_password
host: 10.1.1.1
```
For PAI token authorization, the `paiConfig` schema is:
```
paiConfig:
userName: your_pai_nni_user
token: your_pai_token
host: 10.1.1.1
```
Once you have completed the NNI experiment config file and saved it (for example, as exp_pai.yml), run the following command to launch the experiment:
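Assuming the standard `nnictl` CLI:

```
nnictl create --config exp_pai.yml
```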
Its requirement of computation resource is relatively high. Specifically, it requires a large initial population to avoid falling into a local optimum.
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will target to maximize metrics. If 'minimize', the tuner will target to minimize metrics.
* **population_size** (*int value (should > 0), optional, default = 20*) - the initial size of the population (trial num) in the evolution tuner. It is suggested that `population_size` be much larger than `concurrency`, so users can get the most out of the algorithm (and at least `concurrency`, or the tuner will fail on its first generation of parameters).
**Usage example**
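A sketch of the `tuner` section in the experiment config file, assuming the Evolution tuner (the `population_size` value is illustrative):

```
tuner:
  builtinTunerName: Evolution
  classArgs:
    optimize_mode: maximize
    population_size: 100
```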
> Built-in Tuner Name: **SMAC**
**Please note that SMAC doesn't support running on Windows currently. For the specific reason, please refer to this [GitHub issue](https://github.com/automl/SMAC3/issues/483).**
**Installation**
SMAC needs to be installed by the following command before first use. As a reminder, `swig` is required for SMAC; on Ubuntu, `swig` can be installed with `apt`.
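Assuming the standard `nnictl` package command:

```
nnictl package install --name=SMAC
```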
To define a search space, users should define the name of each variable, the type of sampling strategy and its parameters.
Take the first line as an example. `dropout_rate` is defined as a variable whose prior distribution is a uniform distribution over the range from `0.1` to `0.5`.
Note that the capability of a search space is highly connected with your tuner; we list the supported types for each built-in tuner below. For a customized tuner, you don't have to follow our convention and you have the flexibility to define any type you want.
## Types
All types of sampling strategies and their parameters are listed here:
* Type for [Neural Architecture Search Space][1]. Value is also a dictionary, which contains key-value pairs representing the name and search space of each mutable_layer, respectively.
* For now, users can only use this type of search space with annotation, which means there is no need to define a JSON file for the search space since it will be automatically generated according to the annotation in the trial code.
* The following HPO tuners can be adapted to tune this search space: TPE, Random, Anneal, Evolution, Grid Search, Hyperband and BOHB.
* For detailed usage, please refer to [General NAS Interfaces][1].
## Search Space Types Supported by Each Tuner
Known Limitations:
* GP Tuner and Metis Tuner support only **numerical values** in the search space (`choice` type values can be non-numerical with other tuners, e.g. string values). Both GP Tuner and Metis Tuner use a Gaussian Process Regressor (GPR). GPR makes predictions based on a kernel function and the 'distance' between different points; it's hard to get the true distance between non-numerical values.
* Note that for nested search space:
  * Only the Random Search/TPE/Anneal/Evolution tuners support nested search space
  * We do not support nested search space "Hyper Parameter" in visualization now; the enhancement is being considered in [#1110](https://github.com/microsoft/nni/issues/1110). Any suggestions, discussions or contributions are warmly welcomed.