Unverified commit 035d58bc, authored by SparkSnail, committed by GitHub

Merge pull request #121 from Microsoft/master

merge master
parents b633c265 8e732f2c
...@@ -11,10 +11,12 @@ ...@@ -11,10 +11,12 @@
[![Pull Requests](https://img.shields.io/github/issues-pr-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/pulls?q=is%3Apr+is%3Aopen) [![Pull Requests](https://img.shields.io/github/issues-pr-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/pulls?q=is%3Apr+is%3Aopen)
[![Version](https://img.shields.io/github/release/Microsoft/nni.svg)](https://github.com/Microsoft/nni/releases) [![Join the chat at https://gitter.im/Microsoft/nni](https://badges.gitter.im/Microsoft/nni.svg)](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![Version](https://img.shields.io/github/release/Microsoft/nni.svg)](https://github.com/Microsoft/nni/releases) [![Join the chat at https://gitter.im/Microsoft/nni](https://badges.gitter.im/Microsoft/nni.svg)](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[简体中文](zh_CN/README.md)
NNI (Neural Network Intelligence) is a toolkit to help users run automated machine learning (AutoML) experiments. NNI (Neural Network Intelligence) is a toolkit to help users run automated machine learning (AutoML) experiments.
The tool dispatches and runs trial jobs generated by tuning algorithms to search the best neural architecture and/or hyper-parameters in different environments like local machine, remote servers and cloud. The tool dispatches and runs trial jobs generated by tuning algorithms to search the best neural architecture and/or hyper-parameters in different environments like local machine, remote servers and cloud.
### **NNI [v0.5](https://github.com/Microsoft/nni/releases) has been released!** ### **NNI [v0.5.1](https://github.com/Microsoft/nni/releases) has been released!**
<p align="center"> <p align="center">
<a href="#nni-v05-has-been-released"><img src="https://microsoft.github.io/nni/docs/img/overview.svg" /></a> <a href="#nni-v05-has-been-released"><img src="https://microsoft.github.io/nni/docs/img/overview.svg" /></a>
</p> </p>
...@@ -49,33 +51,33 @@ The tool dispatches and runs trial jobs generated by tuning algorithms to search ...@@ -49,33 +51,33 @@ The tool dispatches and runs trial jobs generated by tuning algorithms to search
</ul> </ul>
</td> </td>
<td> <td>
<a href="docs/HowToChooseTuner.md">Tuner</a> <a href="docs/Builtin_Tuner.md">Tuner</a>
<ul> <ul>
<li><a href="docs/HowToChooseTuner.md#TPE">TPE</a></li> <li><a href="docs/Builtin_Tuner.md#TPE">TPE</a></li>
<li><a href="docs/HowToChooseTuner.md#Random">Random Search</a></li> <li><a href="docs/Builtin_Tuner.md#Random">Random Search</a></li>
<li><a href="docs/HowToChooseTuner.md#Anneal">Anneal</a></li> <li><a href="docs/Builtin_Tuner.md#Anneal">Anneal</a></li>
<li><a href="docs/HowToChooseTuner.md#Evolution">Naive Evolution</a></li> <li><a href="docs/Builtin_Tuner.md#Evolution">Naive Evolution</a></li>
<li><a href="docs/HowToChooseTuner.md#SMAC">SMAC</a></li> <li><a href="docs/Builtin_Tuner.md#SMAC">SMAC</a></li>
<li><a href="docs/HowToChooseTuner.md#Batch">Batch</a></li> <li><a href="docs/Builtin_Tuner.md#Batch">Batch</a></li>
<li><a href="docs/HowToChooseTuner.md#Grid">Grid Search</a></li> <li><a href="docs/Builtin_Tuner.md#Grid">Grid Search</a></li>
<li><a href="docs/HowToChooseTuner.md#Hyperband">Hyperband</a></li> <li><a href="docs/Builtin_Tuner.md#Hyperband">Hyperband</a></li>
<li><a href="docs/HowToChooseTuner.md#NetworkMorphism">Network Morphism</a></li> <li><a href="docs/Builtin_Tuner.md#NetworkMorphism">Network Morphism</a></li>
<li><a href="examples/tuners/enas_nni/README.md">ENAS</a></li> <li><a href="examples/tuners/enas_nni/README.md">ENAS</a></li>
<li><a href="docs/HowToChooseTuner.md#NetworkMorphism#MetisTuner">Metis Tuner</a></li> <li><a href="docs/Builtin_Tuner.md#NetworkMorphism#MetisTuner">Metis Tuner</a></li>
</ul> </ul>
<a href="docs/HowToChooseTuner.md#assessor">Assessor</a> <a href="docs/Builtin_Tuner.md#assessor">Assessor</a>
<ul> <ul>
<li><a href="docs/HowToChooseTuner.md#Medianstop">Median Stop</a></li> <li><a href="docs/Builtin_Tuner.md#Medianstop">Median Stop</a></li>
<li><a href="docs/HowToChooseTuner.md#Curvefitting">Curve Fitting</a></li> <li><a href="docs/Builtin_Tuner.md#Curvefitting">Curve Fitting</a></li>
</ul> </ul>
</td> </td>
<td> <td>
<ul> <ul>
<li><a href="docs/tutorial_1_CR_exp_local_api.md">Local Machine</a></li> <li><a href="docs/tutorial_1_CR_exp_local_api.md">Local Machine</a></li>
<li><a href="docs/tutorial_2_RemoteMachineMode.md">Remote Servers</a></li> <li><a href="docs/RemoteMachineMode.md">Remote Servers</a></li>
<li><a href="docs/PAIMode.md">OpenPAI</a></li> <li><a href="docs/PAIMode.md">OpenPAI</a></li>
<li><a href="docs/KubeflowMode.md">Kubeflow</a></li> <li><a href="docs/KubeflowMode.md">Kubeflow</a></li>
<li><a href="docs/KubeflowMode.md">FrameworkController on K8S (AKS etc.)</a></li> <li><a href="docs/FrameworkControllerMode.md">FrameworkController on K8S (AKS etc.)</a></li>
</ul> </ul>
</td> </td>
</tr> </tr>
...@@ -112,7 +114,7 @@ Note: ...@@ -112,7 +114,7 @@ Note:
* We support Linux (Ubuntu 16.04 or higher), MacOS (10.14.1) in our current stage. * We support Linux (Ubuntu 16.04 or higher), MacOS (10.14.1) in our current stage.
* Run the following commands in an environment that has `python >= 3.5`, `git` and `wget`. * Run the following commands in an environment that has `python >= 3.5`, `git` and `wget`.
```bash ```bash
git clone -b v0.5 https://github.com/Microsoft/nni.git git clone -b v0.5.1 https://github.com/Microsoft/nni.git
cd nni cd nni
source install.sh source install.sh
``` ```
...@@ -124,7 +126,7 @@ For the system requirements of NNI, please refer to [Install NNI](docs/Installat ...@@ -124,7 +126,7 @@ For the system requirements of NNI, please refer to [Install NNI](docs/Installat
The following example is an experiment built on TensorFlow. Make sure you have **TensorFlow installed** before running it. The following example is an experiment built on TensorFlow. Make sure you have **TensorFlow installed** before running it.
* Download the examples via clone the source code. * Download the examples via clone the source code.
```bash ```bash
git clone -b v0.5 https://github.com/Microsoft/nni.git git clone -b v0.5.1 https://github.com/Microsoft/nni.git
``` ```
* Run the mnist example. * Run the mnist example.
```bash ```bash
...@@ -168,24 +170,25 @@ You can use these commands to get more information about the experiment ...@@ -168,24 +170,25 @@ You can use these commands to get more information about the experiment
## **Documentation** ## **Documentation**
* [NNI overview](docs/Overview.md) * [NNI overview](docs/Overview.md)
* [Quick start](docs/GetStarted.md) * [Quick start](docs/QuickStart.md)
## **How to** ## **How to**
* [Install NNI](docs/Installation.md) * [Install NNI](docs/Installation.md)
* [Use command line tool nnictl](docs/NNICTLDOC.md) * [Use command line tool nnictl](docs/NNICTLDOC.md)
* [Use NNIBoard](docs/WebUI.md) * [Use NNIBoard](docs/WebUI.md)
* [How to define search space](docs/SearchSpaceSpec.md) * [How to define search space](docs/SearchSpaceSpec.md)
* [How to define a trial](docs/howto_1_WriteTrial.md) * [How to define a trial](docs/Trials.md)
* [How to choose tuner/search-algorithm](docs/HowToChooseTuner.md) * [How to choose tuner/search-algorithm](docs/Builtin_Tuner.md)
* [Config an experiment](docs/ExperimentConfig.md) * [Config an experiment](docs/ExperimentConfig.md)
* [How to use annotation](docs/howto_1_WriteTrial.md#nni-python-annotation) * [How to use annotation](docs/Trials.md#nni-python-annotation)
## **Tutorials** ## **Tutorials**
* [Run an experiment on local (with multiple GPUs)?](docs/tutorial_1_CR_exp_local_api.md) * [Run an experiment on local (with multiple GPUs)?](docs/tutorial_1_CR_exp_local_api.md)
* [Run an experiment on multiple machines?](docs/tutorial_2_RemoteMachineMode.md) * [Run an experiment on multiple machines?](docs/RemoteMachineMode.md)
* [Run an experiment on OpenPAI?](docs/PAIMode.md) * [Run an experiment on OpenPAI?](docs/PAIMode.md)
* [Run an experiment on Kubeflow?](docs/KubeflowMode.md) * [Run an experiment on Kubeflow?](docs/KubeflowMode.md)
* [Try different tuners and assessors](docs/tutorial_3_tryTunersAndAssessors.md) * [Try different tuners](docs/tuners.rst)
* [Implement a customized tuner](docs/howto_2_CustomizedTuner.md) * [Try different assessors](docs/assessors.rst)
* [Implement a customized tuner](docs/Customize_Tuner.md)
* [Implement a customized assessor](examples/assessors/README.md) * [Implement a customized assessor](examples/assessors/README.md)
* [Use Genetic Algorithm to find good model architectures for Reading Comprehension task](examples/trials/ga_squad/README.md) * [Use Genetic Algorithm to find good model architectures for Reading Comprehension task](examples/trials/ga_squad/README.md)
......
...@@ -33,7 +33,7 @@ jobs: ...@@ -33,7 +33,7 @@ jobs:
displayName: 'Built-in tuners / assessors tests' displayName: 'Built-in tuners / assessors tests'
- script: | - script: |
cd test cd test
PATH=$HOME/.local/bin:$PATH python3 config_test.py --ts local PATH=$HOME/.local/bin:$PATH python3 config_test.py --ts local --local_gpu
displayName: 'Examples and advanced features tests on local machine' displayName: 'Examples and advanced features tests on local machine'
- script: | - script: |
cd test cd test
......
Dockerfile Dockerfile
=== ===
## 1.Description ## 1.Description
This is the Dockerfile of nni project. It includes serveral popular deep learning frameworks and NNI. It is tested on `Ubuntu 16.04 LTS`: This is the Dockerfile of NNI project. It includes serveral popular deep learning frameworks and NNI. It is tested on `Ubuntu 16.04 LTS`:
``` ```
CUDA 9.0, CuDNN 7.0 CUDA 9.0, CuDNN 7.0
numpy 1.14.3,scipy 1.1.0 numpy 1.14.3,scipy 1.1.0
TensorFlow 1.10.0 TensorFlow-gpu 1.10.0
Keras 2.1.6 Keras 2.1.6
PyTorch 0.4.1 PyTorch 0.4.1
scikit-learn 0.20.0 scikit-learn 0.20.0
pandas 0.23.4 pandas 0.23.4
lightgbm 2.2.2 lightgbm 2.2.2
NNI v0.5 NNI v0.5.1
``` ```
You can take this Dockerfile as a reference for your own customized Dockerfile. You can take this Dockerfile as a reference for your own customized Dockerfile.
...@@ -26,6 +26,8 @@ __Run the docker image__ ...@@ -26,6 +26,8 @@ __Run the docker image__
``` ```
docker run -it nni/nni docker run -it nni/nni
``` ```
Note that if you want to use TensorFlow without GPU support, uninstall `tensorflow-gpu` and install `tensorflow` in this Docker container, or modify the `Dockerfile` to install `tensorflow` (without GPU) and rebuild the Docker image.
* If use GPU in docker container, make sure you have installed [NVIDIA Container Runtime](https://github.com/NVIDIA/nvidia-docker), then run the following command * If use GPU in docker container, make sure you have installed [NVIDIA Container Runtime](https://github.com/NVIDIA/nvidia-docker), then run the following command
``` ```
nvidia-docker run -it nni/nni nvidia-docker run -it nni/nni
......
_build
_static
_templates
\ No newline at end of file
...@@ -8,6 +8,7 @@ Currently we recommend sharing weights through NFS (Network File System), which ...@@ -8,6 +8,7 @@ Currently we recommend sharing weights through NFS (Network File System), which
### Weight Sharing through NFS file ### Weight Sharing through NFS file
With the NFS setup (see below), trial code can share model weight through loading & saving files. Here we recommend that user feed the tuner with the storage path: With the NFS setup (see below), trial code can share model weight through loading & saving files. Here we recommend that user feed the tuner with the storage path:
```yaml ```yaml
tuner: tuner:
codeDir: path/to/customer_tuner codeDir: path/to/customer_tuner
...@@ -17,9 +18,10 @@ tuner: ...@@ -17,9 +18,10 @@ tuner:
... ...
save_dir_root: /nfs/storage/path/ save_dir_root: /nfs/storage/path/
``` ```
And let tuner decide where to save & load weights and feed the paths to trials through `nni.get_next_parameters()`: And let tuner decide where to save & load weights and feed the paths to trials through `nni.get_next_parameters()`:
![weight_sharing_design](./img/weight_sharing.png) <img src="https://user-images.githubusercontent.com/23273522/51817667-93ebf080-2306-11e9-8395-b18b322062bc.png" alt="drawing" width="700"/>
For example, in tensorflow: For example, in tensorflow:
```python ```python
...@@ -80,10 +82,10 @@ The feature of weight sharing enables trials from different machines, in which m ...@@ -80,10 +82,10 @@ The feature of weight sharing enables trials from different machines, in which m
``` ```
## Examples ## Examples
For details, please refer to this [simple weight sharing example](../test/async_sharing_test). We also provided a [practice example](../examples/trials/weight_sharing/ga_squad) for reading comprehension, based on previous [ga_squad](../examples/trials/ga_squad) example. For details, please refer to this [simple weight sharing example](https://github.com/Microsoft/nni/tree/master/test/async_sharing_test). We also provided a [practice example](https://github.com/Microsoft/nni/tree/master/examples/trials/weight_sharing/ga_squad) for reading comprehension, based on previous [ga_squad](https://github.com/Microsoft/nni/tree/master/examples/trials/ga_squad) example.
[1]: https://arxiv.org/abs/1802.03268 [1]: https://arxiv.org/abs/1802.03268
[2]: https://arxiv.org/abs/1707.07012 [2]: https://arxiv.org/abs/1707.07012
[3]: https://arxiv.org/abs/1806.09055 [3]: https://arxiv.org/abs/1806.09055
[4]: https://arxiv.org/abs/1806.10282 [4]: https://arxiv.org/abs/1806.10282
[5]: https://arxiv.org/abs/1703.01041 [5]: https://arxiv.org/abs/1703.01041
\ No newline at end of file
# NNI Annotation # NNI Annotation
For good user experience and reduce user effort, we need to design a good annotation grammar.
If users use NNI system, they only need to: ## Overview
1. Use nni.get_next_parameter() to retrieve hyper parameters from Tuner, before using other annotation, use following annotation at the begining of trial code: To improve user experience and reduce user effort, we design an annotation grammar. Using NNI annotation, users can adapt their code to NNI just by adding some standalone annotating strings, which does not affect the execution of the original code.
'''@nni.get_next_parameter()'''
2. Annotation variable in code as: Below is an example:
'''@nni.variable(nni.choice(2,3,5,7),name=self.conv_size)''' ```python
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1
```
The meaning of this example is that NNI will choose one of several values (0.1, 0.01, 0.001) to assign to the learning_rate variable. Specifically, this first line is an NNI annotation, which is a single string. Following is an assignment statement. What nni does here is to replace the right value of this assignment statement according to the information provided by the annotation line.
3. Annotation intermediate in code as:
'''@nni.report_intermediate_result(test_acc)''' In this way, users could either run the python code directly or launch NNI to tune hyper-parameter in this code, without changing any codes.
4. Annotation output in code as: ## Types of Annotation:
'''@nni.report_final_result(test_acc)''' In NNI, there are mainly four types of annotation:
5. Annotation `function_choice` in code as:
'''@nni.function_choice(max_pool(h_conv1, self.pool_size),avg_pool(h_conv1, self.pool_size),name=max_pool)''' ### 1. Annotate variables
In this way, they can easily implement automatic tuning on NNI. `'''@nni.variable(sampling_algo, name)'''`
For `@nni.variable`, `nni.choice` is the type of search space and there are 10 types to express your search space as follows: `@nni.variable` is used in NNI to annotate a variable.
1. `@nni.variable(nni.choice(option1,option2,...,optionN),name=variable)` **Arguments**
Which means the variable value is one of the options, which should be a list The elements of options can themselves be stochastic expressions
2. `@nni.variable(nni.randint(upper),name=variable)` - **sampling_algo**: Sampling algorithm that specifies a search space. User should replace it with a built-in NNI sampling function whose name consists of an `nni.` identification and a search space type specified in [SearchSpaceSpec](SearchSpaceSpec.md) such as `choice` or `uniform`.
Which means the variable value is a random integer in the range [0, upper). - **name**: The name of the variable that the selected value will be assigned to. Note that this argument should be the same as the left value of the following assignment statement.
3. `@nni.variable(nni.uniform(low, high),name=variable)` An example here is:
Which means the variable value is a value uniformly between low and high.
4. `@nni.variable(nni.quniform(low, high, q),name=variable)` ```python
Which means the variable value is a value like round(uniform(low, high) / q) * q '''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1
```
5. `@nni.variable(nni.loguniform(low, high),name=variable)` ### 2. Annotate functions
Which means the variable value is a value drawn according to exp(uniform(low, high)) so that the logarithm of the return value is uniformly distributed.
6. `@nni.variable(nni.qloguniform(low, high, q),name=variable)` `'''@nni.function_choice(*functions, name)'''`
Which means the variable value is a value like round(exp(uniform(low, high)) / q) * q
7. `@nni.variable(nni.normal(label, mu, sigma),name=variable)` `@nni.function_choice` is used to choose one from several functions.
Which means the variable value is a real value that's normally-distributed with mean mu and standard deviation sigma.
8. `@nni.variable(nni.qnormal(label, mu, sigma, q),name=variable)` **Arguments**
Which means the variable value is a value like round(normal(mu, sigma) / q) * q
9. `@nni.variable(nni.lognormal(label, mu, sigma),name=variable)` - **\*functions**: Several functions that are waiting to be selected from. Note that it should be a complete function call with arguments. Such as `max_pool(hidden_layer, pool_size)`.
Which means the variable value is a value drawn according to exp(normal(mu, sigma)) - **name**: The name of the function that will be replaced in the following assignment statement.
10. `@nni.variable(nni.qlognormal(label, mu, sigma, q),name=variable)` An example here is:
Which means the variable value is a value like round(exp(normal(mu, sigma)) / q) * q
```python
"""@nni.function_choice(max_pool(hidden_layer, pool_size), avg_pool(hidden_layer, pool_size), name=max_pool)"""
h_pooling = max_pool(hidden_layer, pool_size)
```
### 3. Annotate intermediate result
`'''@nni.report_intermediate_result(metrics)'''`
`@nni.report_intermediate_result` is used to report intermediate results. Its usage is the same as `nni.report_intermediate_result` in [Trials.md](Trials.md).
### 4. Annotate final result
`'''@nni.report_final_result(metrics)'''`
`@nni.report_final_result` is used to report the final result of the current trial. Its usage is the same as `nni.report_final_result` in [Trials.md](Trials.md).
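Because an annotation is a standalone string literal, an annotated file still runs as ordinary Python. The sketch below illustrates this and uses the standard `ast` module to list the annotation strings in a snippet. `find_annotations` is a hypothetical helper written for this illustration; it is not part of NNI and does not describe how NNI itself processes annotations.

```python
import ast

# A tiny annotated trial snippet (double-quoted here only so it can be
# embedded inside this example string).
SOURCE = '''
"""@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)"""
learning_rate = 0.1
'''


def find_annotations(source):
    """Return (line number, text) for each standalone string starting with '@nni.'."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Expr)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)
                and node.value.value.startswith("@nni.")):
            found.append((node.lineno, node.value.value))
    return found


# Running the snippet directly ignores the annotation string entirely,
# so learning_rate keeps the value written in the code.
namespace = {}
exec(SOURCE, namespace)
print(namespace["learning_rate"])
print(find_annotations(SOURCE))
```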
# Built-in Assessors
NNI provides state-of-the-art algorithms as built-in assessors and makes them easy to use. Below is a brief overview of NNI's current built-in assessors:
|Assessor|Brief Introduction of Algorithm|
|---|---|
|**Medianstop** [(Usage)](#MedianStop)|Medianstop is a simple early stopping rule mentioned in the [paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf). It stops a pending trial X at step S if the trial’s best objective value by step S is strictly worse than the median value of the running averages of all completed trials’ objectives reported up to step S.|
|[Curvefitting](https://github.com/Microsoft/nni/blob/master/src/sdk/pynni/nni/curvefitting_assessor/README.md) [(Usage)](#Curvefitting)|Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the predicted performance at the final epoch is worse than the best final performance in the trial history. In this algorithm, we use 12 curves to fit the accuracy curve.|
## Usage of Builtin Assessors
Using the built-in assessors provided by the NNI SDK requires declaring the **builtinAssessorName** and **classArgs** in the `config.yml` file. This part describes the suggested scenarios, classArg requirements, and a usage example for each assessor.
Note: Please follow the format when you write your `config.yml` file.
<a name="MedianStop"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Median Stop Assessor`
> Builtin Assessor Name: **Medianstop**
**Suggested scenario**
It is applicable to a wide range of performance curves, so it can be used in various scenarios to speed up the tuning process.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', assessor will **stop** the trial with smaller expectation. If 'minimize', assessor will **stop** the trial with larger expectation.
* **start_step** (*int, optional, default = 0*) - The assessor decides whether to stop a trial only after it has received start_step reported intermediate results.
**Usage example:**
```yaml
# config.yml
assessor:
  builtinAssessorName: Medianstop
  classArgs:
    optimize_mode: maximize
    start_step: 5
```
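The median stopping rule described in the overview table can be sketched in a few lines of Python. This is a minimal illustration of the rule as stated there, not NNI's assessor code; the function name `should_stop` and its arguments are hypothetical, and the parameters `optimize_mode` and `start_step` mirror the classArgs above.

```python
from statistics import median


def should_stop(trial_history, completed_histories,
                optimize_mode="maximize", start_step=0):
    """Return True if the running trial should be stopped early.

    trial_history: intermediate results reported so far by the running trial.
    completed_histories: one list of intermediate results per completed trial.
    """
    step = len(trial_history)
    # Never decide before any result, or before start_step results, arrived.
    if step == 0 or step < start_step:
        return False
    # Running average of each completed trial over its first `step` results.
    averages = [sum(h[:step]) / len(h[:step]) for h in completed_histories if h]
    if not averages:
        return False
    if optimize_mode == "maximize":
        # Stop when the trial's best value is strictly below the median.
        return max(trial_history) < median(averages)
    # Minimize: stop when the trial's best (lowest) value is strictly above it.
    return min(trial_history) > median(averages)


completed = [[0.5, 0.6, 0.7], [0.4, 0.5, 0.6], [0.6, 0.7, 0.8]]
print(should_stop([0.1, 0.2], completed))
print(should_stop([0.6, 0.9], completed))
```

With `start_step` set to 5 as in the configuration above, the first call would not stop the trial, because only two intermediate results have been reported.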
<br>
<a name="Curvefitting"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Curve Fitting Assessor`
> Builtin Assessor Name: **Curvefitting**
**Suggested scenario**
It is applicable to a wide range of performance curves, so it can be used in various scenarios to speed up the tuning process. Even better, it is able to handle and assess curves with similar performance.
**Requirement of classArg**
* **epoch_num** (*int, **required***) - The total number of epochs. The assessor needs this number to determine which point it has to predict.
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the assessor will **stop** the trial with smaller expectation. If 'minimize', the assessor will **stop** the trial with larger expectation.
* **start_step** (*int, optional, default = 6*) - The assessor starts to predict, and to decide whether to stop a trial, only after it has received start_step reported intermediate results.
* **threshold** (*float, optional, default = 0.95*) - The threshold used to decide whether to stop a worse-performing curve early. For example: if threshold = 0.95, optimize_mode = maximize, and the best performance in the history is 0.9, then we will stop any trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
**Usage example:**
```yaml
# config.yml
assessor:
  builtinAssessorName: Curvefitting
  classArgs:
    epoch_num: 20
    optimize_mode: maximize
    start_step: 6
    threshold: 0.95
```
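The threshold arithmetic from the **threshold** argument can be checked with a short sketch. It covers only the `maximize` case spelled out in the text; how the cutoff is applied in `minimize` mode is not stated here, so it is left out. The function name `below_threshold` is hypothetical and the predicted values are made up for illustration.

```python
import math


def below_threshold(predicted_final, best_final, threshold=0.95):
    """Maximize mode: stop when the prediction is lower than threshold * best."""
    return predicted_final < threshold * best_final


# Worked example from the text: threshold = 0.95, best performance = 0.9.
cutoff = 0.95 * 0.9
print(math.isclose(cutoff, 0.855))

print(below_threshold(0.80, 0.9))  # predicted value under the cutoff
print(below_threshold(0.88, 0.9))  # predicted value over the cutoff
```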
\ No newline at end of file
# Built-in Tuners
NNI provides state-of-the-art tuning algorithms as built-in tuners and makes them easy to use. Below is a brief summary of NNI's current built-in tuners:
|Tuner|Brief Introduction of Algorithm|
|---|---|
|**TPE** [(Usage)](#TPE)|The Tree-structured Parzen Estimator (TPE) is a sequential model-based optimization (SMBO) approach. SMBO methods sequentially construct models to approximate the performance of hyperparameters based on historical measurements, and then subsequently choose new hyperparameters to test based on this model.|
|**Random Search** [(Usage)](#Random)|The paper Random Search for Hyper-Parameter Optimization shows that Random Search can be surprisingly simple and effective. We suggest using Random Search as the baseline when there is no knowledge about the prior distribution of the hyper-parameters.|
|**Anneal** [(Usage)](#Anneal)|This simple annealing algorithm begins by sampling from the prior, but tends over time to sample from points closer and closer to the best ones observed. This algorithm is a simple variation on the random search that leverages smoothness in the response surface. The annealing rate is not adaptive.|
|**Naive Evolution** [(Usage)](#Evolution)|Naive Evolution comes from Large-Scale Evolution of Image Classifiers. It randomly initializes a population based on the search space. For each generation, it chooses the better ones and applies some mutation (e.g., changing a hyperparameter, adding/removing one layer) to them to get the next generation. Naive Evolution requires many trials to work, but it is very simple and easy to extend with new features.|
|**SMAC** [(Usage)](#SMAC)|SMAC is based on Sequential Model-Based Optimization (SMBO). It adapts the most prominent previously used model class (Gaussian stochastic process models) and introduces the model class of random forests to SMBO, in order to handle categorical parameters. The SMAC supported by NNI is a wrapper on the SMAC3 GitHub repo. Note that SMAC needs to be installed with the `nnictl package` command.|
|**Batch tuner** [(Usage)](#Batch)|Batch tuner allows users to simply provide several configurations (i.e., choices of hyper-parameters) for their trial code. After finishing all the configurations, the experiment is done. Batch tuner only supports the type choice in search space spec.|
|**Grid Search** [(Usage)](#GridSearch)|Grid Search performs an exhaustive search through a manually specified subset of the hyperparameter space defined in the search space file. Note that the only acceptable types of search space are choice, quniform, qloguniform. The number q in quniform and qloguniform has a special meaning (different from the spec in search space spec). It means the number of values that will be sampled evenly between low and high.|
|[Hyperband](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/hyperband_advisor) [(Usage)](#Hyperband)|Hyperband tries to use limited resources to explore as many configurations as possible, and finds the promising ones to get the final result. The basic idea is to generate many configurations, run each of them for a small number of STEPs to find the promising ones, and then train those further to select several even more promising ones.|
|[Network Morphism](https://github.com/Microsoft/nni/blob/master/src/sdk/pynni/nni/networkmorphism_tuner/README.md) [(Usage)](#NetworkMorphism)|Network Morphism provides functions to automatically search for architecture of deep learning models. Every child network inherits the knowledge from its parent network and morphs into diverse types of networks, including changes of depth, width, and skip-connection. Next, it estimates the value of a child network using the historic architecture and metric pairs. Then it selects the most promising one to train.|
|**Metis Tuner** [(Usage)](#MetisTuner)|Metis offers the following benefits when it comes to tuning parameters: While most tools only predict the optimal configuration, Metis gives you two outputs: (a) current prediction of optimal configuration, and (b) suggestion for the next trial. No more guesswork. While most tools assume training datasets do not have noisy data, Metis actually tells you if you need to re-sample a particular hyper-parameter.|
<br>
## Usage of Builtin Tuners
Using a built-in tuner provided by the NNI SDK requires declaring the **builtinTunerName** and **classArgs** in the `config.yml` file. This part describes the suggested scenarios, classArg requirements, and a usage example for each tuner.
Note: Please follow the format when you write your `config.yml` file. Some built-in tuners, such as SMAC, need to be installed with `nnictl package`.
<a name="TPE"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `TPE`
> Builtin Tuner Name: **TPE**
**Suggested scenario**
TPE, as a black-box optimization method, can be used in various scenarios and shows good performance in general, especially when you have limited computation resources and can only run a small number of trials. Across a large number of experiments, we found that TPE performs far better than Random Search.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will return the hyperparameter set with larger expectation. If 'minimize', the tuner will return the hyperparameter set with smaller expectation.
**Usage example:**
```yaml
# config.yml
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
```
<br>
<a name="Random"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Random Search`
> Builtin Tuner Name: **Random**
**Suggested scenario**
Random Search is suggested when each trial does not take too long (e.g., each trial completes quickly or is stopped early by an assessor) and you have enough computation resources, or when you want to explore the search space uniformly. Random Search can be considered the baseline among search algorithms.
**Requirement of classArg:**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will return the hyperparameter set with larger expectation. If 'minimize', the tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
  builtinTunerName: Random
  classArgs:
    optimize_mode: maximize
```
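As a rough illustration of what a random-search baseline does, the sketch below draws independent samples from a small search space. It is not NNI's Random tuner: the `sample` helper and the parameter names are hypothetical, the `_type`/`_value` layout follows the search space JSON shown for BatchTuner in this document, and the `uniform` entry with a `[low, high]` value is an assumption made for the example.

```python
import random

# Hypothetical search space for illustration.
SEARCH_SPACE = {
    "optimizer": {"_type": "choice", "_value": ["Adam", "SGD"]},
    "learning_rate": {"_type": "uniform", "_value": [0.0001, 0.1]},
}


def sample(search_space, rng):
    """Draw one configuration, sampling every parameter independently."""
    params = {}
    for name, spec in search_space.items():
        if spec["_type"] == "choice":
            params[name] = rng.choice(spec["_value"])
        elif spec["_type"] == "uniform":
            low, high = spec["_value"]
            params[name] = rng.uniform(low, high)
        else:
            raise ValueError("unsupported type in this sketch: " + spec["_type"])
    return params


rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [sample(SEARCH_SPACE, rng) for _ in range(5)]
for params in trials:
    print(params)
```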
<br>
<a name="Anneal"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Anneal`
> Builtin Tuner Name: **Anneal**
**Suggested scenario**
Anneal is suggested when each trial does not take too long and you have enough computation resources (almost the same as for Random Search), or when the variables in the search space can be sampled from some prior distribution.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will return the hyperparameter set with larger expectation. If 'minimize', the tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
  builtinTunerName: Anneal
  classArgs:
    optimize_mode: maximize
```
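The idea summarized for Anneal in the overview table (start from the prior, then sample closer and closer to the best point seen, with a non-adaptive rate) can be sketched for a single continuous parameter. This is a toy illustration under those assumptions, not the algorithm NNI ships; `anneal_search`, `shrink`, and `toy_objective` are hypothetical names.

```python
import random


def anneal_search(objective, low, high, n_trials=50, shrink=0.9, seed=0):
    """Maximize `objective` on [low, high] with a fixed shrinking radius."""
    rng = random.Random(seed)
    best_x = rng.uniform(low, high)  # the first sample comes from the prior
    best_y = objective(best_x)
    radius = (high - low) / 2
    for _ in range(n_trials - 1):
        radius *= shrink  # fixed schedule: the annealing rate is not adaptive
        # Sample near the best point so far, clipped to the allowed range.
        x = min(high, max(low, best_x + rng.uniform(-radius, radius)))
        y = objective(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y


def toy_objective(x):
    """Hypothetical objective with its maximum at x = 0.3."""
    return -(x - 0.3) ** 2


best_x, best_y = anneal_search(toy_objective, 0.0, 1.0)
print(best_x, best_y)
```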
<br>
<a name="Evolution"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Naive Evolution`
> Builtin Tuner Name: **Evolution**
**Suggested scenario**
Its computation resource requirement is relatively high. Specifically, it requires a large initial population to avoid falling into a local optimum. If your trial is short or leverages an assessor, this tuner is a good choice. It is even more suitable when your trial code supports weight transfer, that is, when a trial can inherit the converged weights from its parent(s). This can greatly speed up the training process.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will return the hyperparameter set with larger expectation. If 'minimize', the tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
  builtinTunerName: Evolution
  classArgs:
    optimize_mode: maximize
```
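The generation loop described for Naive Evolution (random initial population, keep the better individuals, mutate them to form the next generation) might look like the following toy sketch. It is not NNI's Evolution tuner; the parameter names, the `toy_fitness` function, and the choice to keep the better half of the population are all assumptions made for illustration.

```python
import random

# Hypothetical discrete search space for illustration.
CHOICES = {
    "conv_size": [2, 3, 5, 7],
    "learning_rate": [0.1, 0.01, 0.001],
}


def random_individual(rng):
    """Randomly initialize one individual from the search space."""
    return {name: rng.choice(values) for name, values in CHOICES.items()}


def mutate(individual, rng):
    """Copy the parent and re-draw one randomly chosen hyperparameter."""
    child = dict(individual)
    name = rng.choice(sorted(CHOICES))
    child[name] = rng.choice(CHOICES[name])
    return child


def toy_fitness(individual):
    """Hypothetical score; a real trial would train and evaluate a model."""
    return -abs(individual["conv_size"] - 5) - abs(individual["learning_rate"] - 0.01)


def evolve(fitness, population_size=8, generations=5, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Keep the better half, then refill the population with mutated copies.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


best = evolve(toy_fitness)
print(best)
```

Every call to `fitness` stands for one trial, which is why this kind of search needs many trials to be useful.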
<br>
<a name="SMAC"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `SMAC`
> Builtin Tuner Name: **SMAC**
**Installation**
SMAC needs to be installed with the following command before first use.
```bash
nnictl package install --name=SMAC
```
**Suggested scenario**
Similar to TPE, SMAC is also a black-box tuner that can be tried in various scenarios, and it is suggested when computation resources are limited. It is optimized for discrete hyperparameters, so it is suggested when most of your hyperparameters are discrete.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will return the hyperparameter set with larger expectation. If 'minimize', the tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
  builtinTunerName: SMAC
  classArgs:
    optimize_mode: maximize
```
<br>
<a name="Batch"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Batch Tuner`
> Builtin Tuner Name: **BatchTuner**
**Suggested scenario**
If you have already decided which configurations you want to try, you can list them in the search space file (using `choice`) and run them with the batch tuner.
**Usage example**
```yaml
# config.yml
tuner:
  builtinTunerName: BatchTuner
```
<br>
Note that BatchTuner supports a search space like the following:
```json
{
    "combine_params":
    {
        "_type" : "choice",
        "_value" : [{"optimizer": "Adam", "learning_rate": 0.00001},
                    {"optimizer": "Adam", "learning_rate": 0.0001},
                    {"optimizer": "Adam", "learning_rate": 0.001},
                    {"optimizer": "SGD", "learning_rate": 0.01},
                    {"optimizer": "SGD", "learning_rate": 0.005},
                    {"optimizer": "SGD", "learning_rate": 0.0002}]
    }
}
```
The search space file must include the top-level key `combine_params`. The type of the parameter in the search space must be `choice`, and its `_value` list must contain all of the combined parameter values.
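When there are many fixed configurations, it can be easier to generate this file than to write it by hand. The snippet below is a small sketch that builds the same structure in Python; the helper name `build_batch_search_space` is hypothetical and is not part of NNI.

```python
import json


def build_batch_search_space(configs):
    """Wrap a list of fixed configurations in the layout shown above."""
    configs = list(configs)
    if not configs:
        raise ValueError("at least one configuration is required")
    return {"combine_params": {"_type": "choice", "_value": configs}}


# Reproduce the six configurations from the JSON example.
configs = [
    {"optimizer": "Adam", "learning_rate": lr} for lr in (0.00001, 0.0001, 0.001)
] + [
    {"optimizer": "SGD", "learning_rate": lr} for lr in (0.01, 0.005, 0.0002)
]

search_space = build_batch_search_space(configs)
# The printed JSON can be saved as the experiment's search space file.
print(json.dumps(search_space, indent=4))
```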
<a name="GridSearch"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Grid Search`
> Builtin Tuner Name: **GridSearch**
**Suggested scenario**
Note that the only acceptable search space types are `choice`, `quniform`, and `qloguniform`. **The number `q` in `quniform` and `qloguniform` has a special meaning here (different from the one in the [search space spec](./SearchSpaceSpec.md)): it is the number of values that will be sampled evenly from the range between `low` and `high`.**
Grid Search is suggested when the search space is small enough that exhaustively sweeping the whole space is feasible.
**Usage example**
```yaml
# config.yml
tuner:
  builtinTunerName: GridSearch
```
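The following sketch illustrates the special meaning of `q` for grid search: a `quniform` entry `[low, high, q]` is expanded into `q` evenly spaced values, and the grid is the Cartesian product of all parameters. This is an illustration only, not NNI's code. Including both endpoints is an assumption, `qloguniform` is left out, and `expand_parameter` and `expand_grid` are hypothetical names.

```python
import itertools


def expand_parameter(spec):
    """Return the list of values grid search would try for one parameter."""
    kind, value = spec["_type"], spec["_value"]
    if kind == "choice":
        return list(value)
    if kind == "quniform":
        low, high, count = value
        if count < 2:
            return [low]
        # Assumption: evenly spaced values that include both endpoints.
        step = (high - low) / (count - 1)
        return [low + i * step for i in range(count)]
    raise ValueError("unsupported type in this sketch: %s" % kind)


def expand_grid(search_space):
    """Enumerate every combination of the expanded parameter values."""
    names = list(search_space)
    value_lists = [expand_parameter(search_space[name]) for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


space = {
    "batch_size": {"_type": "choice", "_value": [16, 32]},
    "dropout": {"_type": "quniform", "_value": [0.0, 1.0, 5]},
}
grid = expand_grid(space)
print(len(grid))  # 2 choices x 5 sampled values
```

The grid size is the product of the per-parameter counts, which is why this tuner is only practical for small search spaces.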
<br>
<a name="Hyperband"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Hyperband`
> Builtin Advisor Name: **Hyperband**
**Suggested scenario**
It is suggested when you have limited computation resources but a relatively large search space. It performs well when intermediate results (e.g., accuracy) reflect, to some extent, how good or bad the final result (e.g., accuracy) will be.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will return the hyperparameter set with the larger expectation. If 'minimize', the tuner will return the hyperparameter set with the smaller expectation.
* **R** (*int, optional, default = 60*) - the maximum number of STEPS (which could be mini-batches or epochs) that can be allocated to a trial. Each trial should use STEPS to control how long it runs.
* **eta** (*int, optional, default = 3*) - `(eta-1)/eta` is the proportion of trials discarded in each round.
**Usage example**
```yaml
# config.yml
advisor:
  builtinAdvisorName: Hyperband
  classArgs:
    optimize_mode: maximize
    R: 60
    eta: 3
```
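To see how `R` and `eta` interact, the sketch below computes the bracket schedule of the standard Hyperband algorithm. It follows the textbook formulation; NNI's advisor may round or schedule differently, and the function name `hyperband_schedule` is an assumption made for this example.

```python
def hyperband_schedule(R=60, eta=3):
    """Return one list of (num_trials, steps_per_trial) rounds per bracket."""
    # Largest s with eta**s <= R, computed with integers to avoid float log errors.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of trials in this bracket: ceil((s_max + 1) * eta**s / (s + 1)).
        total = (s_max + 1) * eta ** s
        n = -(-total // (s + 1))
        rounds = []
        for i in range(s + 1):
            n_i = n // eta ** i            # trials that survive into round i
            r_i = R * eta ** i / eta ** s  # STEPS given to each surviving trial
            rounds.append((n_i, r_i))
        brackets.append(rounds)
    return brackets


for rounds in hyperband_schedule(R=60, eta=3):
    print([(n, round(r, 2)) for n, r in rounds])
```

With the defaults, the most aggressive bracket starts with 27 trials and keeps one third of them after each round (27, 9, 3, 1), which matches the `(eta-1)/eta` discard proportion described above.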
<br>
<a name="NetworkMorphism"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Network Morphism`
> Builtin Tuner Name: **NetworkMorphism**
**Installation**
NetworkMorphism requires [PyTorch](https://pytorch.org/get-started/locally), so users should install it first.
**Suggested scenario**
It is suggested when you want to apply deep learning methods to your task (your own dataset) but have no idea how to choose or design a network. You can modify the [example](https://github.com/Microsoft/nni/tree/master/examples/trials/network_morphism/cifar10/cifar10_keras.py) to fit your own dataset and your own data augmentation method. You can also change the batch size, learning rate, or optimizer. It can find a good network architecture for different tasks. Currently, this tuner only supports the computer vision domain.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will return the hyperparameter set with the larger expectation. If 'minimize', the tuner will return the hyperparameter set with the smaller expectation.
* **task** (*('cv'), optional, default = 'cv'*) - The domain of the experiment. For now, this tuner only supports the computer vision (cv) domain.
* **input_width** (*int, optional, default = 32*) - input image width
* **input_channel** (*int, optional, default = 3*) - input image channel
* **n_output_node** (*int, optional, default = 10*) - number of classes
**Usage example**
```yaml
# config.yml
tuner:
  builtinTunerName: NetworkMorphism
  classArgs:
    optimize_mode: maximize
    task: cv
    input_width: 32
    input_channel: 3
    n_output_node: 10
```
<br>
<a name="MetisTuner"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Metis Tuner`
> Builtin Tuner Name: **MetisTuner**
Note that the only acceptable types of search space are `choice`, `quniform`, `uniform` and `randint`.
**Installation**
Metis Tuner requires [sklearn](https://scikit-learn.org/), so users should install it first, for example with `pip3 install scikit-learn`.
**Suggested scenario**
Similar to TPE and SMAC, Metis is a black-box tuner. If your system takes a long time to finish each trial, Metis is more favorable than other approaches such as random search. Furthermore, Metis provides guidance on the subsequent trial. Here is an [example](https://github.com/Microsoft/nni/tree/master/examples/trials/auto-gbdt/search_space_metis.json) of the use of Metis. Users only need to send the final result, such as `accuracy`, to the tuner by calling the NNI SDK.
**Requirement of classArg**
* **optimize_mode** (*'maximize' or 'minimize', optional, default = 'maximize'*) - If 'maximize', the tuner will return the hyperparameter set with the larger expectation. If 'minimize', the tuner will return the hyperparameter set with the smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
  builtinTunerName: MetisTuner
  classArgs:
    optimize_mode: maximize
```
...@@ -6,7 +6,7 @@ Firstly, if you are unsure or afraid of anything, just ask or submit the issue o ...@@ -6,7 +6,7 @@ Firstly, if you are unsure or afraid of anything, just ask or submit the issue o
However, for those individuals who want a bit more guidance on the best way to contribute to the project, read on. This document covers all the points we look for in your contributions, raising your chances of having them quickly merged or addressed.
Looking for a quick start? Get acquainted with our [Get Started](./QuickStart.md) guide.
There are a few simple guidelines that you need to follow before providing your hacks.
...@@ -30,7 +30,7 @@ Provide PRs with appropriate tags for bug fixes or enhancements to the source co ...@@ -30,7 +30,7 @@ Provide PRs with appropriate tags for bug fixes or enhancements to the source co
If you are looking for how to develop and debug the NNI source code, refer to the [How to set up NNI developer environment](./SetupNNIDeveloperEnvironment.md) doc in the `docs` folder.
Similarly, see [Quick Start](QuickStart.md). For everything else, refer to the [NNI Home page](http://nni.readthedocs.io).
## Solve Existing Issues
Head over to [issues](https://github.com/Microsoft/nni/issues) to find issues where help is needed from contributors. You can find issues tagged with 'good-first-issue' or 'help-wanted' to contribute to.
......
###############################
Contribute to NNI
###############################
.. toctree::
Development Setup<SetupNNIDeveloperEnvironment>
Contribution Guide<CONTRIBUTING>
Debug HowTo<HowToDebug>
...@@ -6,7 +6,7 @@ So, if user want to implement a customized Advisor, she/he only need to: ...@@ -6,7 +6,7 @@ So, if user want to implement a customized Advisor, she/he only need to:
1. Define an Advisor inheriting from the MsgDispatcherBase class
1. Implement the methods with prefix `handle_` except `handle_request`
1. Configure your customized Advisor in the experiment YAML config file
Here is an example:
...@@ -22,9 +22,9 @@ class CustomizedAdvisor(MsgDispatcherBase): ...@@ -22,9 +22,9 @@ class CustomizedAdvisor(MsgDispatcherBase):
**2) Implement the methods with prefix `handle_` except `handle_request`**
Please refer to the implementation of Hyperband ([src/sdk/pynni/nni/hyperband_advisor/hyperband_advisor.py](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/hyperband_advisor/hyperband_advisor.py)) for how to implement the methods.
**3) Configure your customized Advisor in experiment YAML config file**
This is similar to tuner and assessor: NNI needs to locate your customized Advisor class and instantiate it, so you need to specify the location of the customized Advisor class and pass literal values as parameters to the \_\_init__ constructor.
......
# Customize Assessor
NNI lets you build your own assessor to meet your tuning needs.
If you want to implement a customized Assessor, there are three things to do:
1. Inherit the base Assessor class
1. Implement the `assess_trial` function
1. Configure your customized Assessor in the experiment YAML config file
**1. Inherit the base Assessor class**
```python
from nni.assessor import Assessor

class CustomizedAssessor(Assessor):
    def __init__(self, ...):
        ...
```
**2. Implement the `assess_trial` function**
```python
from nni.assessor import Assessor, AssessResult

class CustomizedAssessor(Assessor):
    def __init__(self, ...):
        ...

    def assess_trial(self, trial_history):
        """
        Determines whether a trial should be killed. Must override.
        trial_history: a list of intermediate result objects.
        Returns AssessResult.Good or AssessResult.Bad.
        """
        # your code goes here.
        ...
```
**3. Configure your customized Assessor in experiment YAML config file**
NNI needs to locate your customized Assessor class and instantiate it, so you need to specify the location of the customized Assessor class and pass literal values as parameters to the \_\_init__ constructor.
```yaml
assessor:
  codeDir: /home/abc/myassessor
  classFileName: my_customized_assessor.py
  className: CustomizedAssessor
  # Any parameters that need to be passed to your Assessor class's __init__
  # constructor can be specified in this optional classArgs field, for example
  classArgs:
    arg1: value1
```
Please note that in step **2**, the `trial_history` object is exactly the sequence of objects that the Trial sends to the Assessor through the SDK function `report_intermediate_result`.
For more detailed examples, see:
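As an illustration of the kind of decision `assess_trial` has to make, here is a minimal sketch of a threshold rule written as a plain function so that it runs without NNI installed. The function name `is_promising`, its parameters, and the assumption that every intermediate result is a plain number are all hypothetical.

```python
def is_promising(trial_history, threshold=0.5, warmup_steps=3):
    """Return True if the trial should keep running.

    trial_history: numeric intermediate results, oldest first (assumed format).
    threshold: minimum best-so-far value a trial must reach after the warm-up.
    warmup_steps: number of reports to wait for before judging the trial.
    """
    if len(trial_history) < warmup_steps:
        # Too early to judge; let the trial continue.
        return True
    return max(trial_history) >= threshold


print(is_promising([0.1, 0.2]))       # still warming up
print(is_promising([0.1, 0.2, 0.3]))  # best value is below the threshold
print(is_promising([0.1, 0.6, 0.3]))  # best value reached the threshold
```

Inside a customized Assessor, `assess_trial` could return `AssessResult.Good` when such a rule holds and `AssessResult.Bad` otherwise.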
> * [medianstop-assessor](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/medianstop_assessor)
> * [curvefitting-assessor](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/curvefitting_assessor)
# Customize-Tuner
## Customize Tuner
NNI provides state-of-the-art tuning algorithms as built-in tuners. NNI also lets you build your own tuner to meet your tuning needs.
If you want to implement your own tuning algorithm, you can implement a customized Tuner. There are three things to do:
1. Inherit the base Tuner class
1. Implement the `receive_trial_result` and `generate_parameters` functions
1. Configure your customized tuner in the experiment YAML config file
Here is an example:
**1. Inherit the base Tuner class**
```python ```python
from nni.tuner import Tuner from nni.tuner import Tuner
...@@ -20,7 +22,7 @@ class CustomizedTuner(Tuner): ...@@ -20,7 +22,7 @@ class CustomizedTuner(Tuner):
... ...
``` ```
**2. Implement the `receive_trial_result` and `generate_parameters` functions**
```python ```python
from nni.tuner import Tuner from nni.tuner import Tuner
...@@ -31,10 +33,10 @@ class CustomizedTuner(Tuner): ...@@ -31,10 +33,10 @@ class CustomizedTuner(Tuner):
def receive_trial_result(self, parameter_id, parameters, value): def receive_trial_result(self, parameter_id, parameters, value):
''' '''
Record an observation of the objective function and Train Receive trial's final result.
parameter_id: int parameter_id: int
parameters: object created by 'generate_parameters()' parameters: object created by 'generate_parameters()'
value: final metrics of the trial, including reward value: final metrics of the trial, including default metric
''' '''
# your code implements here. # your code implements here.
... ...
...@@ -57,13 +59,14 @@ For example: ...@@ -57,13 +59,14 @@ For example:
If you implement `generate_parameters` like this:
```python
def generate_parameters(self, parameter_id):
    '''
    Returns a set of trial (hyper-)parameters, as a serializable object
    parameter_id: int
    '''
    # your code implements here.
    return {"dropout": 0.3, "learning_rate": 0.4}
```
This means your Tuner will always generate the parameters `{"dropout": 0.3, "learning_rate": 0.4}`. The Trial then receives `{"dropout": 0.3, "learning_rate": 0.4}` by calling the API `nni.get_next_parameter()`. Once the trial ends with a result (normally some kind of metric), it can send the result to the Tuner by calling the API `nni.report_final_result()`, for example `nni.report_final_result(0.93)`. Your Tuner's `receive_trial_result` function will then receive the result like this:
...@@ -83,7 +86,7 @@ _fd = open(os.path.join(_pwd, 'data.txt'), 'r') ...@@ -83,7 +86,7 @@ _fd = open(os.path.join(_pwd, 'data.txt'), 'r')
This is because your tuner is not executed in the directory of your tuner (i.e., `pwd` is not the directory of your own tuner).
**3. Configure your customized tuner in experiment YAML config file**
NNI needs to locate your customized tuner class and instantiate it, so you need to specify the location of the customized tuner class and pass literal values as parameters to the \_\_init__ constructor.
...@@ -96,14 +99,14 @@ tuner: ...@@ -96,14 +99,14 @@ tuner:
# can be specified in this optional classArgs field, for example # can be specified in this optional classArgs field, for example
classArgs: classArgs:
arg1: value1 arg1: value1
``` ```
For more detailed examples, see:
> * [evolution-tuner](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/evolution_tuner)
> * [hyperopt-tuner](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/hyperopt_tuner)
> * [evolution-based-customized-tuner](https://github.com/Microsoft/nni/tree/master/examples/tuners/ga_customer_tuner)
### Write a more advanced automl algorithm
The methods above are usually enough to write a general tuner. However, users may also want more information, such as intermediate results and trials' states (e.g., the methods available in an assessor), in order to build a more powerful AutoML algorithm. For this, NNI has another concept called `advisor`, which directly inherits from `MsgDispatcherBase` in [`src/sdk/pynni/nni/msg_dispatcher_base.py`](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/msg_dispatcher_base.py). Please refer to [Customize Advisor](Customize_Advisor.md) for how to write a customized advisor.
**Enable Assessor in your experiment**
===
The Assessor module assesses running trials. One common use case is early stopping, which terminates unpromising trial jobs based on their intermediate results.
## Using NNI built-in Assessor
Here we use the same example, `examples/trials/mnist-annotation`, with the `Medianstop` assessor. The YAML configuration file is shown below:
```
authorName: your_name
experimentName: auto_mnist
# how many trials could be concurrently running
trialConcurrency: 2
# maximum experiment running duration
maxExecDuration: 3h
# empty means never stop
maxTrialNum: 100
# choice: local, remote
trainingServicePlatform: local
# choice: true, false
useAnnotation: true
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
assessor:
  builtinAssessorName: Medianstop
  classArgs:
    optimize_mode: maximize
trial:
  command: python mnist.py
  codeDir: /usr/share/nni/examples/trials/mnist-annotation
  gpuNum: 0
```
For built-in assessors, you need to fill in two fields: `builtinAssessorName`, which chooses one of the assessors provided by NNI (refer to [here]() for built-in assessors), and `optimize_mode`, which is either maximize or minimize (whether you want to maximize or minimize your trial result).
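To give an intuition for early stopping based on intermediate results, here is a rough sketch of the general median stopping idea. It is not the code of NNI's `Medianstop` assessor; the function name `median_stop`, its arguments, and the use of running averages of completed trials are assumptions made for illustration.

```python
from statistics import median


def median_stop(current_history, completed_histories, optimize_mode="maximize"):
    """Return True if the running trial looks unpromising and could be stopped.

    current_history: intermediate results of the running trial, oldest first.
    completed_histories: intermediate results of already finished trials.
    """
    step = len(current_history)
    if step == 0:
        return False
    # Running average of each finished trial over the same number of steps.
    averages = [
        sum(history[:step]) / step
        for history in completed_histories
        if len(history) >= step
    ]
    if not averages:
        return False
    if optimize_mode == "maximize":
        return max(current_history) < median(averages)
    return min(current_history) > median(averages)


finished = [[0.5, 0.6, 0.7], [0.4, 0.5, 0.6], [0.8, 0.85, 0.9]]
print(median_stop([0.2, 0.3], finished))  # best so far is below the median
print(median_stop([0.6, 0.7], finished))  # best so far is above the median
```

The `optimize_mode` argument plays the same role as the `optimize_mode` field in the configuration above: it decides whether larger or smaller intermediate results count as better.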
## Using user customized Assessor
You can also write your own assessor following the guidance [here](). For example, suppose you wrote an assessor for `examples/trials/mnist-annotation`. You should prepare a YAML configuration like the one below:
```
authorName: your_name
experimentName: auto_mnist
# how many trials could be concurrently running
trialConcurrency: 2
# maximum experiment running duration
maxExecDuration: 3h
# empty means never stop
maxTrialNum: 100
# choice: local, remote
trainingServicePlatform: local
# choice: true, false
useAnnotation: true
tuner:
  # Possible values: TPE, Random, Anneal, Evolution
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
assessor:
  # Your assessor code directory
  codeDir:
  # Name of the file which contains your assessor class
  classFileName:
  # Your assessor class name, must be a subclass of nni.Assessor
  className:
  # Parameter names and literal values you want to pass to
  # the __init__ constructor of your assessor class
  classArgs:
    arg1: value1
  gpuNum: 0
trial:
  command: python mnist.py
  codeDir: /usr/share/nni/examples/trials/mnist-annotation
  gpuNum: 0
```
You need to fill in `codeDir`, `classFileName`, and `className`, and pass parameters to the \_\_init__ constructor through the `classArgs` field if the \_\_init__ constructor of your assessor class has required parameters.
**Note that** if you want to access a file (e.g., ```data.txt```) in the directory of your own assessor, you cannot use ```open('data.txt', 'r')```. Instead, you should use the following:
```
import os

_pwd = os.path.dirname(__file__)
_fd = open(os.path.join(_pwd, 'data.txt'), 'r')
```
This is because your assessor is not executed in the directory of your assessor (i.e., ```pwd``` is not the directory of your own assessor).
######################
Examples
######################
.. toctree::
:maxdepth: 2
MNIST<mnist_examples>
Cifar10<cifar10_examples>
Scikit-learn<sklearn_examples>
EvolutionSQuAD<SQuAD_evolution_examples>
GBDT<gbdt_example>
# FAQ
This page collects frequently asked questions and answers.
...@@ -7,23 +9,23 @@ When met errors like below, try to clean up **tmp** folder first. ...@@ -7,23 +9,23 @@ When met errors like below, try to clean up **tmp** folder first.
> OSError: [Errno 28] No space left on device
### Cannot get trials' metrics in OpenPAI mode
In OpenPAI training mode, NNI Manager starts a REST server that listens on port 51189 to receive metrics reported from trials running in the OpenPAI cluster. If you do not see any metrics in the WebUI in OpenPAI mode, check the machine where NNI Manager runs and make sure port 51189 is allowed by the firewall rules.
### Segmentation Fault (core dumped) when installing
> make: *** [install-XXX] Segmentation fault (core dumped)
Please try the following solutions in turn:
* Update or reinstall your current Python's pip, e.g. `python3 -m pip install -U pip`
* Install NNI with the `--no-cache-dir` flag, e.g. `python3 -m pip install nni --no-cache-dir`
### Job management error: getIPV4Address() failed because os.networkInterfaces().eth0 is undefined.
Your machine doesn't have an eth0 device; please set [nniManagerIp](ExperimentConfig.md) in your config file manually.
### Exceed the MaxDuration but didn't stop
When the duration of an experiment reaches the maximum duration, NNI Manager will not create new trials, but the existing trials will continue to run unless the user manually stops the experiment.
### Could not stop an experiment using `nnictl stop`
If you upgrade NNI or delete some of NNI's config files while an experiment is running, this kind of issue may happen because of the loss of the config files. You can use `ps -ef | grep node` to find the pid of your experiment, and use `kill -9 {pid}` to kill it manually.
### Could not get `default metric` in webUI of virtual machines
Configure the network mode as bridge mode, or another mode that makes the virtual machine's host accessible from external machines, and make sure the virtual machine's port is not blocked by the firewall.
......
...@@ -4,25 +4,25 @@ NNI supports running experiment using [FrameworkController](https://github.com/M ...@@ -4,25 +4,25 @@ NNI supports running experiment using [FrameworkController](https://github.com/M
## Prerequisite for on-premises Kubernetes Service
1. A **Kubernetes** cluster using Kubernetes 1.8 or later. Follow this [guideline](https://kubernetes.io/docs/setup/) to set up Kubernetes.
2. Prepare a **kubeconfig** file, which NNI will use to interact with your Kubernetes API server. By default, NNI Manager uses $(HOME)/.kube/config as the kubeconfig file's path. You can also specify other kubeconfig files by setting the **KUBECONFIG** environment variable. Refer to this [guideline](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig) to learn more about kubeconfig.
3. If your NNI trial job needs GPU resources, follow this [guideline](https://github.com/NVIDIA/k8s-device-plugin) to configure the **Nvidia device plugin for Kubernetes**.
4. Prepare an **NFS server** and export a general purpose mount (we recommend mapping your NFS server path with the `root_squash` option, otherwise permission issues may arise when NNI copies files to NFS. Refer to this [page](https://linux.die.net/man/5/exports) to learn what the root_squash option is), or **Azure File Storage**.
5. Install an **NFS client** on the machine where you install NNI and run nnictl to create experiments. Run this command to install the NFSv4 client:
```
apt-get install nfs-common
```
6. Install **NNI**, following the install guide [here](QuickStart.md).
## Prerequisite for Azure Kubernetes Service
1. NNI supports kubeflow based on Azure Kubernetes Service; follow the [guideline](https://azure.microsoft.com/en-us/services/kubernetes-service/) to set up Azure Kubernetes Service.
2. Install [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) and __kubectl__. Use `az login` to set your Azure account, and connect the kubectl client to AKS; refer to this [guideline](https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough#connect-to-the-cluster).
3. Follow the [guideline](https://docs.microsoft.com/en-us/azure/storage/common/storage-quickstart-create-account?tabs=portal) to create an Azure file storage account. If you use Azure Kubernetes Service, NNI needs Azure Storage Service to store code files and output files.
4. To access Azure Storage Service, NNI needs the access key of the storage account, and NNI uses the [Azure Key Vault](https://azure.microsoft.com/en-us/services/key-vault/) Service to protect your private key. Set up Azure Key Vault Service and add a secret to Key Vault to store the access key of the Azure storage account. Follow this [guideline](https://docs.microsoft.com/en-us/azure/key-vault/quick-create-cli) to store the access key.
## Set up FrameworkController
Follow the [guideline](https://github.com/Microsoft/frameworkcontroller/tree/master/example/run) to set up frameworkcontroller in the Kubernetes cluster. NNI supports frameworkcontroller in statefulset mode.
## Design
Please refer to the design of the [kubeflow training service](./KubeflowMode.md); the frameworkcontroller training service pipeline is similar.
```
frameworkcontrollerConfig:
  # ... (lines elided in the diff view)
    server: {your_nfs_server}
    path: {your_nfs_server_exported_path}
```
If you use Azure Kubernetes Service, you should set `frameworkcontrollerConfig` in your config YAML file as follows:
```
frameworkcontrollerConfig:
  storage: azureStorage
  # ... (lines elided in the diff view)
    accountName: {your_storage_account_name}
    azureShare: {your_azure_share_name}
```
Note: You should explicitly set `trainingServicePlatform: frameworkcontroller` in the NNI config YAML file if you want to start an experiment in frameworkcontroller mode.
The trial's config format for NNI frameworkcontroller mode is a simplified version of frameworkcontroller's official config. For a deeper understanding, refer to the [tensorflow example of frameworkcontroller](https://github.com/Microsoft/frameworkcontroller/blob/master/example/framework/scenario/tensorflow/cpu/tensorflowdistributedtrainingwithcpu.yaml).
Trial configuration in frameworkcontroller mode has the following configuration keys:
* taskRoles: you can set multiple task roles in the config file; each task role is a basic unit to process in the Kubernetes cluster.
    * name: the name of the task role, like "worker", "ps", "master".
......
**Get Started with NNI**
===
## **Installation**
* __Dependencies__
```bash
python >= 3.5
git
wget
```
Python pip should also be correctly installed. You can use `python3 -m pip --version` to check on Linux.
* Note: we don't support virtual environments in the current releases.
* __Install NNI through pip__
```bash
python3 -m pip install --user --upgrade nni
```
* __Install NNI through source code__
```bash
git clone -b v0.5 https://github.com/Microsoft/nni.git
cd nni
source install.sh
```
## **Quick start: run a customized experiment**
An experiment runs multiple trial jobs; each trial job tries a configuration which includes a specific neural architecture (or model) and hyper-parameter values. To run an experiment through NNI, you should:
* Provide a runnable trial
* Provide or choose a tuner
* Provide a YAML experiment configuration file
* (optional) Provide or choose an assessor
**Prepare trial**: Let's use a simple trial example, e.g. mnist, provided by NNI. After you install NNI, the NNI examples are placed in ~/nni/examples; run `ls ~/nni/examples/trials` to see all the trial examples. You can simply execute the following command to run the NNI mnist example:
```bash
python3 ~/nni/examples/trials/mnist-annotation/mnist.py
```
This command will be filled in the YAML configuration file below. Please refer to [here](howto_1_WriteTrial.md) for how to write your own trial.
**Prepare tuner**: NNI supports several popular AutoML algorithms, including Random Search, Tree of Parzen Estimators (TPE), Evolution algorithm, etc. Users can write their own tuner (refer to [here](howto_2_CustomizedTuner.md)), but for simplicity, here we choose a tuner provided by NNI as below:
```yaml
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
```
*builtinTunerName* specifies a tuner in NNI, *classArgs* are the arguments passed to the tuner, and *optimize_mode* indicates whether you want to maximize or minimize your trial's result.
**Prepare configuration file**: Now that you know which trial code you are going to run and which tuner you are going to use, it is time to prepare the YAML configuration file. NNI provides a demo configuration file for each trial example; run `cat ~/nni/examples/trials/mnist-annotation/config.yml` to see it. Its content is basically shown below:
```yaml
authorName: your_name
experimentName: auto_mnist
# how many trials could be concurrently running
trialConcurrency: 2
# maximum experiment running duration
maxExecDuration: 3h
# empty means never stop
maxTrialNum: 100
# choice: local, remote, pai
trainingServicePlatform: local
# choice: true, false
useAnnotation: true
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
trial:
  command: python mnist.py
  codeDir: ~/nni/examples/trials/mnist-annotation
  gpuNum: 0
```
Here *useAnnotation* is true because this trial example uses our Python annotation (refer to [here](../tools/annotation/README.md) for details). For the trial, we should provide *command*, which is the command to run the trial, and *codeDir*, where the trial code is. The command will be executed in this directory. We should also provide *gpuNum*, the number of GPUs a trial requires.
With all these steps done, we can run the experiment with the following command:

    nnictl create --config ~/nni/examples/trials/mnist-annotation/config.yml

You can refer to [here](NNICTLDOC.md) for more usage guide of the *nnictl* command line tool.
## View experiment results
Once the experiment is running, NNI provides a WebUI for you to view experiment progress, control your experiment, and use some other appealing features. The WebUI is opened by default by `nnictl create`.
## Read more
* [Tuners supported in the latest NNI release](./HowToChooseTuner.md)
* [Overview](Overview.md)
* [Installation](Installation.md)
* [Use command line tool nnictl](NNICTLDOC.md)
* [Use NNIBoard](WebUI.md)
* [Define search space](SearchSpaceSpec.md)
* [Config an experiment](ExperimentConfig.md)
* [How to run an experiment on local (with multiple GPUs)?](tutorial_1_CR_exp_local_api.md)
* [How to run an experiment on multiple machines?](tutorial_2_RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](PAIMode.md)
* [How to create a multi-phase experiment](multiPhase.md)
# How to use Tuner that NNI supports?
For now, NNI supports the following tuner algorithms. Note that the NNI installation only installs a subset of those algorithms; the other algorithms should be installed through `nnictl package install` before you use them. For example, for SMAC the installation command is `nnictl package install --name=SMAC`.
- [TPE](#TPE)
- [Random Search](#Random)
- [Anneal](#Anneal)
- [Naive Evolution](#Evolution)
- [SMAC](#SMAC) (to install through `nnictl`)
- [Batch Tuner](#Batch)
- [Grid Search](#Grid)
- [Hyperband](#Hyperband)
- [Network Morphism](#NetworkMorphism) (requires PyTorch)
- [Metis Tuner](#MetisTuner) (requires scikit-learn)
## Supported tuner algorithms
We will introduce some basic knowledge about the tuner algorithms, suggested scenarios for each tuner, and their example usage (for complete usage spec, please refer to [here]()).
<a name="TPE"></a>
**TPE**
The Tree-structured Parzen Estimator (TPE) is a sequential model-based optimization (SMBO) approach. SMBO methods sequentially construct models to approximate the performance of hyperparameters based on historical measurements, and then subsequently choose new hyperparameters to test based on this model.
The TPE approach models P(x|y) and P(y), where x represents hyperparameters and y the associated evaluation metric. P(x|y) is modeled by transforming the generative process of hyperparameters, replacing the distributions of the configuration prior with non-parametric densities.
This optimization approach is described in detail in [Algorithms for Hyper-Parameter Optimization][1].
_Suggested scenario_: TPE, as a black-box optimization method, can be used in various scenarios and shows good performance in general, especially when you have limited computation resource and can only try a small number of trials. From a large number of experiments, we found that TPE is far better than Random Search.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: TPE
  classArgs:
    # choice: maximize, minimize
    optimize_mode: maximize
```
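The density-ratio idea behind TPE can be sketched as follows. This is a heavily simplified, one-dimensional illustration and not NNI's implementation: the helper names (`kde`, `tpe_suggest`), the fixed Gaussian bandwidth, and the `gamma` split fraction are all assumptions made for the example. Observations are split into a "good" and a "bad" group by their metric, each group gets a kernel density estimate, and the candidate with the highest ratio of good density to bad density is suggested.

```python
import math
import random


def kde(points, bandwidth):
    """Return a Gaussian kernel density estimate over 1-D points."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm

    return density


def tpe_suggest(observations, low, high, rng, gamma=0.25, n_candidates=24, bandwidth=0.1):
    """Suggest the next x from (x, metric) pairs, assuming the metric is maximized."""
    if len(observations) < 2:
        raise ValueError("this sketch needs at least two observations")
    ranked = sorted(observations, key=lambda obs: obs[1], reverse=True)
    # Keep at least one point in each group.
    n_good = min(len(ranked) - 1, max(1, math.ceil(gamma * len(ranked))))
    good = [x for x, _ in ranked[:n_good]]
    bad = [x for x, _ in ranked[n_good:]]
    good_density, bad_density = kde(good, bandwidth), kde(bad, bandwidth)
    # Draw candidates around the good points, clipped to the allowed range.
    candidates = [
        min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
        for _ in range(n_candidates)
    ]
    # Pick the candidate that looks most "good" relative to "bad".
    return max(candidates, key=lambda x: good_density(x) / (bad_density(x) + 1e-12))


# Toy history: accuracy is high near x = 0.3 and drops as x grows.
history = [(0.28, 0.95), (0.32, 0.93), (0.5, 0.7), (0.6, 0.6),
           (0.7, 0.5), (0.75, 0.45), (0.8, 0.4), (0.9, 0.3)]
suggestion = tpe_suggest(history, 0.0, 1.0, random.Random(0))
```

The suggestion lands near the well-performing region and away from the poorly performing one, which is the behavior the density ratio is meant to encourage.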
<a name="Random"></a>
**Random Search**
[Random Search for Hyper-Parameter Optimization][2] shows that Random Search can be surprisingly simple and effective. We suggest using Random Search as a baseline when you have no knowledge about the prior distribution of hyper-parameters.
_Suggested scenario_: Random Search is suggested when each trial does not take too long (e.g., each trial can be completed very soon, or is early stopped by an assessor quickly) and you have enough computation resource, or when you want to uniformly explore the search space. Random Search can be considered a baseline search algorithm.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: Random
```
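As an illustration of what random search does with a search space, here is a minimal sketch. It assumes the `_type` / `_value` search space layout used elsewhere in this document and handles only `choice` and `uniform`; the function name `sample_config` is hypothetical and is not part of NNI.

```python
import random


def sample_config(search_space, rng):
    """Draw one configuration independently at random for every parameter."""
    config = {}
    for name, spec in search_space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "choice":
            config[name] = rng.choice(value)
        elif kind == "uniform":
            low, high = value
            config[name] = rng.uniform(low, high)
        else:
            raise ValueError(f"type not handled in this sketch: {kind}")
    return config


search_space = {
    "optimizer": {"_type": "choice", "_value": ["Adam", "SGD"]},
    "learning_rate": {"_type": "uniform", "_value": [0.0001, 0.1]},
}
rng = random.Random(0)
samples = [sample_config(search_space, rng) for _ in range(5)]
```

Each sample is independent of the results of earlier trials, which is why random search is a natural baseline.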
<a name="Anneal"></a>
**Anneal**
This simple annealing algorithm begins by sampling from the prior, but tends over time to sample from points closer and closer to the best ones observed. This algorithm is a simple variation on random search that leverages smoothness in the response surface. The annealing rate is not adaptive.
_Suggested scenario_: Anneal is suggested when each trial does not take too long and you have enough computation resource (almost the same as Random Search), or when the variables in the search space can be sampled from some prior distribution.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: Anneal
  classArgs:
    # choice: maximize, minimize
    optimize_mode: maximize
```
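The annealing behavior described above can be sketched for a single uniform parameter. This is an assumption-laden toy, not NNI's Anneal tuner: the function `anneal_suggest`, the geometric `shrink` factor, and the toy objective are all made up for illustration. With no history it samples the prior; afterwards it samples from a neighbourhood of the best point whose size shrinks at a fixed (non-adaptive) rate.

```python
import random


def anneal_suggest(low, high, best_x, step, rng, shrink=0.9):
    """Sample near best_x; the neighbourhood shrinks geometrically with step."""
    if best_x is None:
        return rng.uniform(low, high)  # no history yet: sample from the prior
    radius = (high - low) * (shrink ** step) / 2
    return rng.uniform(max(low, best_x - radius), min(high, best_x + radius))


def objective(x):
    """Toy metric to maximize; its optimum is at x = 0.3."""
    return -(x - 0.3) ** 2


rng = random.Random(0)
best_x, best_y = None, float("-inf")
for step in range(50):
    x = anneal_suggest(0.0, 1.0, best_x, step, rng)
    y = objective(x)
    if y > best_y:
        best_x, best_y = x, y
```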
<a name="Evolution"></a>
**Naive Evolution**
Naive Evolution comes from [Large-Scale Evolution of Image Classifiers][3]. It randomly initializes a population based on the search space. For each generation, it chooses the better individuals and applies some mutation (e.g., changing a hyperparameter, adding/removing one layer) to them to get the next generation. Naive Evolution requires many trials to work, but it is very simple and easy to extend with new features.
_Suggested scenario_: Its requirement of computation resource is relatively high. Specifically, it requires a large initial population to avoid falling into a local optimum. If your trial is short or leverages an assessor, this tuner is a good choice. It is especially suggested when your trial code supports weight transfer, that is, when the trial can inherit the converged weights from its parent(s). This can greatly speed up the training progress.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: Evolution
  classArgs:
    # choice: maximize, minimize
    optimize_mode: maximize
```
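The generation loop described above can be sketched as follows. This is a toy under stated assumptions, not NNI's Evolution tuner: the search space is reduced to plain lists of choices, the better half of the population survives each generation, and a mutation changes exactly one hyperparameter. The names `evolve`, `space`, and `toy_fitness` are hypothetical.

```python
import random


def evolve(search_space, fitness, population_size, generations, rng):
    """Keep the better half each generation and refill with mutated copies."""
    names = list(search_space)
    population = [
        {name: rng.choice(search_space[name]) for name in names}
        for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: max(1, population_size // 2)]
        children = []
        while len(survivors) + len(children) < population_size:
            child = dict(rng.choice(survivors))               # copy a parent
            name = rng.choice(names)                          # pick one hyperparameter
            child[name] = rng.choice(search_space[name])      # mutate it
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


space = {"layers": [1, 2, 3, 4], "lr": [0.1, 0.01, 0.001]}


def toy_fitness(config):
    """Toy score standing in for a trial's final metric."""
    return config["layers"] - abs(config["lr"] - 0.01)


best = evolve(space, toy_fitness, population_size=8, generations=10, rng=random.Random(0))
```

In a real experiment every fitness evaluation is a full trial, which is why this approach needs a relatively large computation budget.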
<a name="SMAC"></a>
**SMAC**
[SMAC][4] is based on Sequential Model-Based Optimization (SMBO). It adapts the most prominent previously used model class (Gaussian stochastic process models) and introduces the model class of random forests to SMBO, in order to handle categorical parameters. The SMAC supported by NNI is a wrapper on [the SMAC3 GitHub repo][5].
Note that SMAC on NNI only supports a subset of the types in [search space spec](./SearchSpaceSpec.md), including `choice`, `randint`, `uniform`, `loguniform`, `quniform(q=1)`.
_Installation_:
* Install swig first. (`sudo apt-get install swig` for Ubuntu users)
* Run `nnictl package install --name=SMAC`
_Suggested scenario_: Similar to TPE, SMAC is also a black-box tuner which can be tried in various scenarios, and is suggested when computation resource is limited. It is optimized for discrete hyperparameters, thus, suggested when most of your hyperparameters are discrete.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: SMAC
  classArgs:
    # choice: maximize, minimize
    optimize_mode: maximize
```
<a name="Batch"></a>
**Batch tuner**
Batch tuner allows users to simply provide several configurations (i.e., choices of hyper-parameters) for their trial code. After finishing all the configurations, the experiment is done. Batch tuner only supports the type `choice` in [search space spec](./SearchSpaceSpec.md).
_Suggested scenario_: If the configurations you want to try have been decided, you can list them in the search space file (using `choice`) and run them using batch tuner.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: BatchTuner
```
Note that the search space BatchTuner supports looks like:
```json
{
    "combine_params":
    {
        "_type" : "choice",
        "_value" : [{"optimizer": "Adam", "learning_rate": 0.00001},
                    {"optimizer": "Adam", "learning_rate": 0.0001},
                    {"optimizer": "Adam", "learning_rate": 0.001},
                    {"optimizer": "SGD", "learning_rate": 0.01},
                    {"optimizer": "SGD", "learning_rate": 0.005},
                    {"optimizer": "SGD", "learning_rate": 0.0002}]
    }
}
```
The search space file includes the top-level key `combine_params`. The type of the params in the search space must be `choice`, and `_value` lists all the combined parameter values.
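To make the structure concrete, the sketch below reads a search space of this shape and lists the configurations in the order they appear. It only illustrates the file layout; the helper `batch_configs` is hypothetical and says nothing about how BatchTuner is implemented internally.

```python
import json

SEARCH_SPACE_JSON = """
{
    "combine_params": {
        "_type": "choice",
        "_value": [
            {"optimizer": "Adam", "learning_rate": 0.001},
            {"optimizer": "SGD", "learning_rate": 0.01},
            {"optimizer": "SGD", "learning_rate": 0.005}
        ]
    }
}
"""


def batch_configs(search_space):
    """Return every configuration listed under `combine_params`, in file order."""
    spec = search_space["combine_params"]
    if spec["_type"] != "choice":
        raise ValueError("the batch search space must use the `choice` type")
    return list(spec["_value"])


configs = batch_configs(json.loads(SEARCH_SPACE_JSON))
for index, config in enumerate(configs):
    print(index, config["optimizer"], config["learning_rate"])
```

Once the last listed configuration has been tried, there is nothing left to run.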
<a name="Grid"></a>
**Grid Search**
Grid Search performs an exhaustive searching through a manually specified subset of the hyperparameter space defined in the searchspace file.
Note that the only acceptable types of search space are `choice`, `quniform`, `qloguniform`. **The number `q` in `quniform` and `qloguniform` has special meaning (different from the spec in [search space spec](./SearchSpaceSpec.md)). It means the number of values that will be sampled evenly from the range `low` and `high`.**
_Suggested scenario_: It is suggested when the search space is small and it is feasible to exhaustively sweep the whole search space.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: GridSearch
```
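The special meaning of `q` can be illustrated with a small sketch that expands a search space into a full grid. It assumes that the `q` evenly spaced values include both `low` and `high`, which the text above does not state explicitly; the helpers `expand` and `grid` are hypothetical, and `qloguniform` is left out for brevity.

```python
import itertools


def expand(spec):
    """List the values a single parameter contributes to the grid."""
    kind, value = spec["_type"], spec["_value"]
    if kind == "choice":
        return list(value)
    if kind == "quniform":
        low, high, count = value  # here q is a count, not a quantization step
        if count < 2:
            raise ValueError("this sketch needs at least two sampled values")
        step = (high - low) / (count - 1)
        return [low + step * i for i in range(count)]
    raise ValueError(f"type not handled in this sketch: {kind}")


def grid(search_space):
    """Return the Cartesian product of all parameter values as a list of dicts."""
    names = list(search_space)
    axes = [expand(search_space[name]) for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*axes)]


space = {
    "optimizer": {"_type": "choice", "_value": ["Adam", "SGD"]},
    "dropout": {"_type": "quniform", "_value": [0.0, 0.5, 3]},
}
configs = grid(space)
```

The grid size is the product of the per-parameter value counts (2 x 3 here), so it grows quickly as parameters are added.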
<a name="Hyperband"></a>
**Hyperband**
[Hyperband][6] tries to use limited resource to explore as many configurations as possible, and finds the promising ones to get the final result. The basic idea is to generate many configurations and run them for a small number of STEPs to find the promising ones, then further train those promising ones to select several even more promising ones. More details can be found [here](../src/sdk/pynni/nni/hyperband_advisor/README.md).
_Suggested scenario_: It is suggested when you have limited computation resource but a relatively large search space. It performs well in scenarios where the intermediate result (e.g., accuracy) reflects, to some extent, how good or bad the final result (e.g., accuracy) will be.
_Usage_:
```yaml
# config.yaml
advisor:
  builtinAdvisorName: Hyperband
  classArgs:
    # choice: maximize, minimize
    optimize_mode: maximize
    # R: the maximum STEPS (could be the number of mini-batches or epochs) that can be
    # allocated to a trial. Each trial should use STEPS to control how long it runs.
    R: 60
    # eta: reduction factor; only the best 1/eta of the trials are kept in each round,
    # the rest are discarded
    eta: 3
```
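To see how `R` and `eta` shape an experiment, the sketch below computes the bracket schedule from the Hyperband paper. It is an illustration only; NNI's advisor may round or order things differently, and the function name `hyperband_schedule` is hypothetical. Each bracket is a list of rounds, and each round is a pair of (number of trials still running, STEPS given to each of them).

```python
def hyperband_schedule(R, eta):
    """Return the brackets of the Hyperband paper for integer R and eta."""
    # s_max = floor(log_eta(R)), computed with integers to avoid float rounding.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) * eta^s / (s + 1)).
        total = (s_max + 1) * eta ** s
        n = -(-total // (s + 1))
        rounds = []
        for i in range(s + 1):
            n_i = n // eta ** i        # trials kept in round i
            r_i = R / eta ** (s - i)   # STEPS allocated to each trial in round i
            rounds.append((n_i, r_i))
        brackets.append(rounds)
    return brackets


schedule = hyperband_schedule(R=60, eta=3)
for bracket in schedule:
    print(bracket)
```

With `R: 60` and `eta: 3`, the most aggressive bracket starts many trials with only a few STEPS each and keeps roughly a third of them per round, while the last bracket runs a handful of trials for the full 60 STEPS.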
<a name="NetworkMorphism"></a>
**Network Morphism**
[Network Morphism][7] provides functions to automatically search for the architecture of deep learning models. Every child network inherits the knowledge from its parent network and morphs into diverse types of networks, including changes of depth, width and skip-connection. Next, it estimates the value of a child network using the historical architecture and metric pairs. Then it selects the most promising one to train. More details can be found [here](../src/sdk/pynni/nni/networkmorphism_tuner/README.md).
_Installation_:
NetworkMorphism requires [PyTorch](https://pytorch.org/get-started/locally), so users should install it first.
_Suggested scenario_: It is suggested when you want to apply deep learning methods to your task (your own dataset) but have no idea how to choose or design a network. You can modify the [example](../examples/trials/network_morphism/cifar10/cifar10_keras.py) to fit your own dataset and your own data augmentation method. You can also change the batch size, learning rate or optimizer. It can find a good network architecture for different tasks. For now, this tuner only supports the cv domain.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: NetworkMorphism
  classArgs:
    # choice: maximize, minimize
    optimize_mode: maximize
    # for now, this tuner only supports cv domain
    task: cv
    # input image width
    input_width: 32
    # input image channel
    input_channel: 3
    # number of classes
    n_output_node: 10
```
<a name="MetisTuner"></a>
**Metis Tuner**
[Metis][10] offers the following benefits when it comes to tuning parameters:
* While most tools only predict the optimal configuration, Metis gives you two outputs: (a) the current prediction of the optimal configuration, and (b) a suggestion for the next trial. No more guesswork!
* While most tools assume training datasets do not have noisy data, Metis actually tells you if you need to re-sample a particular hyper-parameter.
* While most tools have problems of being exploitation-heavy, Metis' search strategy balances exploration, exploitation, and (optional) re-sampling.

Metis belongs to the class of sequential model-based optimization (SMBO), and it is based on the Bayesian Optimization framework. To model the parameter-vs-performance space, Metis uses both Gaussian Process and GMM. Since each trial can impose a high time cost, Metis trades heavy inference computation for fewer naive trials. At each iteration, Metis does two tasks:
* It finds the global optimal point in the Gaussian Process space. This point represents the optimal configuration.
* It identifies the next hyper-parameter candidate. This is achieved by inferring the potential information gain of exploration, exploitation, and re-sampling.
Note that the only acceptable types of search space are `choice`, `quniform`, `uniform` and `randint`. We only support numerical `choice` now. More features will be supported later.
More details can be found in our paper: https://www.microsoft.com/en-us/research/publication/metis-robustly-tuning-tail-latencies-cloud-systems/
_Installation_:
Metis Tuner requires [sklearn](https://scikit-learn.org/), so users should install it first. Users can run `pip3 install scikit-learn` to install it.
_Suggested scenario_:
Similar to TPE and SMAC, Metis is a black-box tuner. If your system takes a long time to finish each trial, Metis is more favorable than other approaches such as random search. Furthermore, Metis provides guidance on the subsequent trial. Here is an [example](../examples/trials/auto-gbdt/search_space_metis.json) of using Metis. Users only need to send the final result, such as `accuracy`, to the tuner by calling the NNI SDK.
_Usage_:
```yaml
# config.yaml
tuner:
  builtinTunerName: MetisTuner
  classArgs:
    # choice: maximize, minimize
    optimize_mode: maximize
```
<a name="assessor"></a>
# How to use Assessor that NNI supports?
For now, NNI has supported the following assessor algorithms.
- [Medianstop](#Medianstop)
- [Curvefitting](#Curvefitting)
## Supported Assessor Algorithms
<a name="Medianstop"></a>
**Medianstop**
Medianstop is a simple early stopping rule mentioned in the [paper][8]. It stops a pending trial X at step S if the trial’s best objective value by step S is strictly worse than the median value of the running averages of all completed trials’ objectives reported up to step S.
_Suggested scenario_: It is applicable in a wide range of performance curves, thus, can be used in various scenarios to speed up the tuning progress.
_Usage_:
```yaml
assessor:
  builtinAssessorName: Medianstop
  classArgs:
    # choice: maximize, minimize
    optimize_mode: maximize
    # (optional) A trial is determined to be stopped or not
    # only after receiving start_step number of reported intermediate results.
    # The default value of start_step is 0.
    start_step: 5
```
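The median stopping rule stated above can be written down directly. The sketch below is a minimal reading of that sentence, not NNI's assessor code: the function name `should_stop` and the list-of-lists history format are assumptions, and each history is simply the list of intermediate results a trial has reported so far.

```python
from statistics import median


def should_stop(trial_history, completed_histories, start_step=0, maximize=True):
    """Apply the median stopping rule at step S = len(trial_history)."""
    step = len(trial_history)
    if step == 0 or step < start_step:
        return False  # not enough intermediate results to judge yet
    # Running average of every completed trial over its first S reports.
    averages = [sum(h[:step]) / len(h[:step]) for h in completed_histories if h]
    if not averages:
        return False  # nothing to compare against
    reference = median(averages)
    if maximize:
        return max(trial_history) < reference
    return min(trial_history) > reference


completed = [[0.5, 0.6, 0.7], [0.6, 0.7, 0.8], [0.4, 0.5, 0.6]]
weak_trial = [0.1, 0.2]
strong_trial = [0.9, 0.95]
print(should_stop(weak_trial, completed))    # best value is below the median average
print(should_stop(strong_trial, completed))  # best value is above the median average
```

Setting `start_step` delays the decision, which protects trials whose first few reports are noisy.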
<a name="Curvefitting"></a>
**Curvefitting**
Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the predicted performance at the final epoch is worse than the best final performance in the trial history. In this algorithm, we use 12 curves to fit the accuracy curve; this large set of parametric curve models is chosen from the [reference paper][9]. The shapes of these curves coincide with our prior knowledge about the form of learning curves: they are typically increasing, saturating functions.
_Suggested scenario_: It is applicable in a wide range of performance curves, thus, can be used in various scenarios to speed up the tuning progress. Even better, it's able to handle and assess curves with similar performance.
_Usage_:
```yaml
assessor:
  builtinAssessorName: Curvefitting
  classArgs:
    # (required) The total number of epochs.
    # We need to know the number of epochs to determine which point we need to predict.
    epoch_num: 20
    # (optional) choice: maximize, minimize
    # Note: if you choose minimize mode, please set threshold >= 1.0 (e.g. threshold=1.1).
    # The default value of optimize_mode is maximize.
    optimize_mode: maximize
    # (optional) A trial is determined to be stopped or not
    # only after receiving start_step number of reported intermediate results.
    # In order to save computing resource, we start to predict only when we have
    # more than start_step accuracy points.
    # The default value of start_step is 6.
    start_step: 6
    # (optional) The threshold used to decide whether to early stop a poorly performing curve.
    # For example: if threshold = 0.95, optimize_mode = maximize, and the best performance in the
    # history is 0.9, then we will stop a trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
    # The default value of threshold is 0.95.
    threshold: 0.95
```
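The role of `threshold` can be shown with a small decision function. This sketch covers only the final comparison; the curve fitting that produces the predicted value is out of scope here. The function name `should_stop` is hypothetical, and the minimize branch is an assumption based on the advice above to use a threshold of at least 1.0 in that mode.

```python
def should_stop(predicted_final, best_in_history, threshold=0.95, maximize=True):
    """Decide whether a trial's predicted final result is too far from the best one."""
    if maximize:
        # Stop when the prediction falls below threshold * best.
        return predicted_final < threshold * best_in_history
    # Assumed mirror rule for minimize mode, where threshold >= 1.0.
    return predicted_final > threshold * best_in_history


# Numbers from the comment above: threshold = 0.95 and best = 0.9 give a cut-off of 0.855.
print(should_stop(0.85, 0.9))  # below the cut-off: stop
print(should_stop(0.86, 0.9))  # above the cut-off: keep running
```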
[1]: https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf
[2]: http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
[3]: https://arxiv.org/pdf/1703.01041.pdf
[4]: https://www.cs.ubc.ca/~hutter/papers/10-TR-SMAC.pdf
[5]: https://github.com/automl/SMAC3
[6]: https://arxiv.org/pdf/1603.06560.pdf
[7]: https://arxiv.org/abs/1806.10282
[8]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf
[9]: http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf
[10]: https://www.microsoft.com/en-us/research/publication/metis-robustly-tuning-tail-latencies-cloud-systems/