@@ -106,7 +106,7 @@ We encourage researchers and students leverage these projects to accelerate the
## **Install & Verify**
If you are using NNI on Windows and run scripts with PowerShell for the first time, you need to **run PowerShell as administrator** with this command first:
```bash
Set-ExecutionPolicy -ExecutionPolicy Unrestricted
```
...
@@ -114,7 +114,7 @@ If you choose NNI Windows local mode and you use PowerShell to run script for th
**Install through pip**
* We currently support Linux, MacOS and Windows (local, remote and pai mode). Ubuntu 16.04 or higher, MacOS 10.14.1, and Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
Linux and MacOS
...
@@ -131,12 +131,12 @@ python -m pip install --upgrade nni
Note:
* `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* Currently NNI on Windows supports local, remote and pai mode. Anaconda or Miniconda is highly recommended to install NNI on Windows.
* If there is any error like `Segmentation fault`, please refer to [FAQ](docs/en_US/FAQ.md)
**Install through source code**
* We currently support Linux (Ubuntu 16.04 or higher), MacOS (10.14.1) and Windows (10.1809).
* Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the `Web UI url`.
@@ -94,7 +94,10 @@ NNI (Neural Network Intelligence) is a toolkit for automated machine learning (AutoML)
* [OpenPAI](https://github.com/Microsoft/pai): an open source platform that provides complete AI model training and resource management capabilities; it is easy to extend and supports on-premise, cloud and hybrid environments of all scales.
* [FrameworkController](https://github.com/Microsoft/frameworkcontroller): an open source general-purpose Kubernetes Pod controller that orchestrates all kinds of applications on Kubernetes through a single controller.
* [MMdnn](https://github.com/Microsoft/MMdnn): a comprehensive, cross-framework solution to convert, visualize and diagnose deep neural network models. The "MM" in MMdnn stands for model management and "dnn" is an acronym for deep neural network.
* [SPTAG](https://github.com/Microsoft/SPTAG): Space Partition Tree And Graph (SPTAG) is an open source library for large-scale vector nearest neighbor search scenarios.
We encourage researchers and students to leverage these projects to accelerate AI development and research.
## **Install & Verify**
...
@@ -106,9 +109,9 @@ NNI (Neural Network Intelligence) is a toolkit for automated machine learning (AutoML)
Which means the variable value is a value like round(uniform(low, high)). For now, the type of the chosen value is float. If you want an integer value, please convert it explicitly.
NNI provides state-of-the-art tuning algorithms as built-in tuners and makes them easy to use. Below is a brief summary of NNI's current built-in tuners:
Note: Click the **Tuner's name** to get a detailed description of the algorithm, and click the corresponding **Usage** for the tuner's installation requirements, suggested scenarios, and a usage example. Here is an [article](./CommunitySharings/HPOComparison.md) comparing different tuners on several problems.
__gpuIndices__ is used to specify designated GPU devices for NNI. If it is set, only the specified GPU devices are used for NNI trial jobs. Single or multiple GPU indices can be specified; multiple GPU indices are separated by a comma (,), such as `1` or `0,1,3`.
* __maxTrialNumPerGpu__
__maxTrialNumPerGpu__ is used to specify the maximum number of concurrent trials on a GPU device.
* __useActiveGpu__
__useActiveGpu__ is used to specify whether to use a GPU on which another process is already running. By default, NNI uses a GPU only if there is no other active process on it; if __useActiveGpu__ is set to true, NNI will use the GPU regardless of other processes. This field is not applicable to NNI on Windows.
* __machineList__
__machineList__ should be set if __trainingServicePlatform__ is set to remote; otherwise it should be empty.
...
@@ -433,6 +442,14 @@ machineList:
__gpuIndices__ is used to specify designated GPU devices for NNI on this remote machine. If it is set, only the specified GPU devices are used for NNI trial jobs. Single or multiple GPU indices can be specified; multiple GPU indices are separated by a comma (,), such as `1` or `0,1,3`.
* __maxTrialNumPerGpu__
__maxTrialNumPerGpu__ is used to specify the maximum number of concurrent trials on a GPU device.
* __useActiveGpu__
__useActiveGpu__ is used to specify whether to use a GPU on which another process is already running. By default, NNI uses a GPU only if there is no other active process on it; if __useActiveGpu__ is set to true, NNI will use the GPU regardless of other processes. This field is not applicable to NNI on Windows.
@@ -36,8 +36,8 @@ Unable to open the WebUI may have the following reasons:
* If you still can't see the WebUI after you use the server IP, you can check the proxy and the firewall of your machine, or use the browser on the machine where you started your NNI experiment.
* Another reason may be that your experiment failed and NNI could not get the experiment information. You can check the NNI manager log in the following directory: ~/nni/experiment/[your_experiment_id]/log/nnimanager.log
### NNI on Windows problems
Please refer to [NNI on Windows](NniOnWindows.md)
### Help us improve
Please search https://github.com/Microsoft/nni/issues to see whether the problem has already been reported by other people, and create a new issue if no existing one covers it.
# General Programming Interface for Neural Architecture Search
Automatic neural architecture search is taking an increasingly important role in finding better models. Recent research has proved the feasibility of automatic NAS and has found models that beat manually designed and tuned ones. Representative works include [NASNet][2], [ENAS][1], [DARTS][3], [Network Morphism][4], and [Evolution][5], and new innovations keep emerging. However, it takes great effort to implement these algorithms, and it is hard to reuse the code base of one algorithm to implement another.
To facilitate NAS innovations (e.g., designing/implementing new NAS models, comparing different NAS models side-by-side), an easy-to-use and flexible programming interface is crucial.
## Programming interface
A new programming interface for designing and searching for a model is often demanded in two scenarios. 1) When designing a neural network, the designer may have multiple choices for a layer, sub-model, or connection, and may not be sure which one, or which combination, performs best. It would be appealing to have an easy way to express the candidate layers/sub-models they want to try. 2) Researchers working on automatic NAS want a unified way to express the search space of neural architectures, so that unchanged trial code can be adapted to different search algorithms.
We designed a simple and flexible programming interface based on [NNI annotation](./AnnotationSpec.md). It is elaborated through examples below.
### Example: choose an operator for a layer
When designing the following model, there might be several choices for the fourth layer that may make the model perform well. In the script of this model, we can use an annotation for the fourth layer, as shown in the figure. This annotation has five fields in total:

* __layer_choice__: It is a list of function calls; each function should be defined in the user's script or in an imported library. The input arguments of a function should follow the format `def XXX(inputs, arg2, arg3, ...)`, where `inputs` is a list with two elements: the list of `fixed_inputs` and the list of inputs chosen from `optional_inputs`. `conv` and `pool` in the figure are examples of function definitions. For the function calls in this list, there is no need to write the first argument (i.e., `inputs`). Note that only one of the function calls is chosen for this layer.
* __fixed_inputs__: It is a list of variables; a variable could be an output tensor from a previous layer, either the `layer_output` of another nni.mutable_layer before this layer or another Python variable defined before this layer. All the variables in this list will be fed into the chosen function in `layer_choice` (as the first element of the `inputs` list).
* __optional_inputs__: It is a list of variables; a variable could be an output tensor from a previous layer, either the `layer_output` of another nni.mutable_layer before this layer or another Python variable defined before this layer. Only `optional_input_size` of these variables will be fed into the chosen function in `layer_choice` (as the second element of the `inputs` list).
* __optional_input_size__: It indicates how many inputs are chosen from `optional_inputs`. It could be a number or a range. A range [1,3] means it chooses 1, 2, or 3 inputs.
* __layer_output__: The name of the output(s) of this layer; in this case it represents the return value of the chosen function call in `layer_choice`. This is a variable name that can be used in subsequent Python code or in following nni.mutable_layer(s).
There are two ways to write the annotation for this example. In the upper one, the `inputs` argument of the function calls is `[[], [out3]]`; in the bottom one, it is `[[out3], []]`.
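To make the fields above concrete, here is a rough TensorFlow 1.x-style sketch of candidate functions that follow the required signature, with the five annotation fields listed as comments. The function names, tensor names (`out1`...`out4`), and values are illustrative only; the exact annotation syntax is defined in [NNI annotation](./AnnotationSpec.md).

```python
import tensorflow as tf

def conv(inputs, ch=32):
    # `inputs` is a two-element list: [list of fixed_inputs, list of chosen optional_inputs].
    x = tf.concat(inputs[0] + inputs[1], axis=-1)
    return tf.layers.conv2d(x, filters=ch, kernel_size=3, padding='same')

def pool(inputs):
    x = tf.concat(inputs[0] + inputs[1], axis=-1)
    return tf.layers.max_pooling2d(x, pool_size=2, strides=2)

# Schematic of the five annotation fields for the fourth layer
# (see AnnotationSpec.md for the exact annotation syntax):
#   layer_choice:        [conv(ch=32), conv(ch=64), pool()]  # one call is chosen
#   fixed_inputs:        [out3]         # always fed in (first element of `inputs`)
#   optional_inputs:     [out1, out2]   # candidates (second element of `inputs`)
#   optional_input_size: 1              # how many optional inputs are chosen
#   layer_output:        out4           # name usable by later code / mutable layers
```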
### Example: choose input connections for a layer
Designing the connections between layers is critical for building a high-performance model. With the provided interface, users can annotate which connections a layer takes as inputs, choosing several from a set of candidate connections. Below is an example that chooses two inputs from three candidate inputs for `concat`. Here `concat` always takes the output of its previous layer via `fixed_inputs`.

### Example: choose both operators and connections
In this example, we choose one of the three operators and choose two connections for it. As there are multiple variables in `inputs`, we call `concat` at the beginning of each function.

### Example: [ENAS][1] macro search space
To illustrate the convenience of the programming interface, we use it to implement the trial code of "ENAS + macro search space". The left figure shows the macro search space in the ENAS paper.

## Unified NAS search space specification
After finishing the trial code through the annotation above, users have implicitly specified the search space of neural architectures in the code. Based on the code, NNI will automatically generate a search space file which can be fed into tuning algorithms. This search space file follows a `json` format.
Accordingly, a specified neural architecture (generated by a tuning algorithm) is expressed as follows:
```json
{
    "mutable_1": {
        "layer_1": {
            "chosen_layer": "pool",
            "chosen_inputs": ["out1", "out3"]
        },
        "layer_2": {
            ...
        }
    }
}
```
With the specification of the format of the search space and the architecture (choice) expression, users are free to implement various (general) tuning algorithms for neural architecture search on NNI. One future work is to provide a general NAS algorithm.
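For illustration, a minimal custom tuner could emit an architecture choice in exactly this format. This is only a sketch: the random strategy and the chosen names are made up, and the method names follow NNI's customized-tuner interface (signatures may differ slightly across NNI versions).

```python
import random
from nni.tuner import Tuner

class RandomNasTuner(Tuner):
    """A toy tuner that picks a random layer and random inputs for each trial."""

    def update_search_space(self, search_space):
        self.search_space = search_space

    def generate_parameters(self, parameter_id):
        # Emit one architecture choice in the JSON format shown above.
        return {
            "mutable_1": {
                "layer_1": {
                    "chosen_layer": random.choice(["conv", "pool"]),
                    "chosen_inputs": random.sample(["out1", "out2", "out3"], 2),
                }
            }
        }

    def receive_trial_result(self, parameter_id, parameters, value):
        pass  # a real tuner would learn from the reported `value` here
```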
NNI's annotation compiler transforms the annotated trial code into code that can receive an architecture choice and build the corresponding model (i.e., graph). The NAS search space can be seen as a full graph (here, a full graph means enabling all the provided operators and connections to build a graph), and the architecture chosen by the tuning algorithm is a subgraph of it. By default, the compiled trial code only builds and executes the subgraph.

The above figure shows how the trial code runs on NNI. `nnictl` processes user trial code to generate a search space file and compiled trial code. The former is fed to the tuner, and the latter is used to run trials.
[__TODO__] Simple example of NAS on NNI.
### Weight sharing
Sharing weights among chosen architectures (i.e., trials) could speed up model search. For example, properly inheriting the weights of completed trials could speed up the convergence of new trials. One-Shot NAS (e.g., ENAS, DARTS) is more aggressive: the training of different architectures (i.e., subgraphs) shares the same copy of the weights of the full graph.

We believe weight sharing (transferring) plays a key role in speeding up NAS, while finding efficient ways to share weights is still a hot research topic. We provide a key-value store for users to store and load weights; tuners and trials use a provided KV client library to access the storage.
[__TODO__] Example of weight sharing on NNI.
### Support of One-Shot NAS
One-Shot NAS is a popular approach to finding a good neural architecture within a limited time and resource budget. Basically, it builds a full graph based on the search space and uses gradient descent to eventually find the best subgraph. There are different training approaches, such as [training subgraphs (per mini-batch)][1], [training the full graph through dropout][6], and [training with architecture weights (regularization)][3]. Here we focus on the first approach, i.e., training subgraphs (ENAS).
With the same annotated trial code, users could choose One-Shot NAS as the execution mode on NNI. Specifically, the compiled trial code builds the full graph (rather than the subgraph demonstrated above); it receives a chosen architecture, trains this architecture on the full graph for a mini-batch, and then requests another chosen architecture. This is supported by [NNI multi-phase](./multiPhase.md). We support this training approach because training a subgraph is very fast, and building the graph every time a subgraph is trained induces too much overhead.
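As a rough sketch of this control flow (assuming the multi-phase setting, in which a trial may request parameters repeatedly; `build_full_graph` and `train_one_minibatch` are hypothetical placeholders, not NNI APIs):

```python
import nni

def build_full_graph():
    # Hypothetical placeholder: construct all candidate operators and connections once.
    return {}

def train_one_minibatch(graph, arch):
    # Hypothetical placeholder: activate the subgraph described by `arch`, train it on
    # one mini-batch, and return the loss. Weights live in the shared full graph.
    return 0.0

graph = build_full_graph()
loss = 0.0
for step in range(1000):
    arch = nni.get_next_parameter()        # ask the tuner for the next subgraph choice
    loss = train_one_minibatch(graph, arch)
    nni.report_intermediate_result(loss)

nni.report_final_result(loss)              # final metric of the search process
```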

The design of One-Shot NAS on NNI is shown in the figure above. One-Shot NAS usually has only one trial job with the full graph. NNI supports running multiple such trial jobs, each of which runs independently. As One-Shot NAS is not stable, running multiple instances helps find a better model. Moreover, trial jobs are also able to synchronize weights during running (i.e., there is only one copy of the weights, as in asynchronous parameter-server mode). This may speed up convergence.
[__TODO__] Example of One-Shot NAS on NNI.
## General tuning algorithms for NAS
As with hyperparameter tuning, a relatively general algorithm for NAS is required. The general programming interface makes this task easier to some extent. We have an RL-based tuner algorithm for NAS from our contributors. We expect efforts from the community to design and implement better NAS algorithms.
[__TODO__] More tuning algorithms for NAS.
## Export best neural architecture and code
[__TODO__] After the NNI experiment is done, users could run `nnictl experiment export --code` to export the trial code with the best neural architecture.
## Conclusion and Future work
There could be different NAS algorithms and execution modes, but they could be supported with the same programming interface as demonstrated above.
There are many interesting research topics in this area, in both systems and machine learning.
We currently support Linux, MacOS and Windows. Ubuntu 16.04 or higher, MacOS 10.14.1 and Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
#### Linux and MacOS
```bash
...
```
#### Windows
If you are using NNI on Windows, you need to run the PowerShell command below as administrator the first time.
```bash
Set-ExecutionPolicy -ExecutionPolicy Unrestricted
```
...
@@ -151,10 +151,10 @@ Run the **config.yml** file from your command line to start MNIST experiment.
#### Windows
Run the **config_windows.yml** file from your command line to start MNIST experiment.
**Note**, if you're using NNI on Windows, you need to change `python3` to `python` in the config.yml file, or use the config_windows.yml file to start the experiment.
Note, **nnictl** is a command line tool, which can be used to control experiments, such as starting/stopping/resuming an experiment and starting/stopping NNIBoard. Click [here](Nnictl.md) for more usage of `nnictl`.
@@ -29,16 +29,16 @@ All types of sampling strategies and their parameter are listed here:
* Which means the variable's value is one of the options. Here 'options' should be a list, and each element of options is a number or a string. An element can also be a nested sub-search-space; this sub-search-space takes effect only when the corresponding element is chosen. The variables in such a sub-search-space can be seen as conditional variables.
* A simple [example](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nested-search-space/search_space.json) of a nested search space definition. If an element in the options list is a dict, it is a sub-search-space, and for our built-in tuners you have to add the key '_name' in this dict, which helps you identify which element is chosen. Accordingly, here is a [sample](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nested-search-space/sample.json) which users can get from nni with a nested search space definition. The tuners which support nested search spaces are listed below (a small sketch follows the list):
- Random Search
- TPE
- Anneal
- Evolution
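The shape of such a definition looks roughly like the sketch below (written as a Python dict mirroring the JSON file; the branch names and their hyperparameters are illustrative, and the linked search_space.json is the authoritative example):

```python
# A sketch of a nested search space: each branch carries its own conditional
# hyperparameters, which only take effect when that branch is chosen.
nested_search_space = {
    "layer0": {
        "_type": "choice",
        "_value": [
            {"_name": "Empty"},  # no extra parameters for this branch
            {"_name": "Conv",
             "kernel_size": {"_type": "choice", "_value": [1, 2, 3, 5]}},
            {"_name": "Max_pool",
             "pooling_size": {"_type": "choice", "_value": [2, 3, 5]}},
        ],
    },
}
```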
* {"_type":"randint","_value":[upper]}
* {"_type":"randint","_value":[lower, upper]}
* For now, we implement the "randint" distribution with "quniform", which means the variable value is a value like round(uniform(lower, upper)). The type of the chosen value is float; if you want an integer value, please convert it explicitly (see the short sketch after these type descriptions).
* {"_type":"uniform","_value":[low, high]}
* {"_type":"uniform","_value":[low, high]}
* Which means the variable value is sampled uniformly between low and high.
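For the `randint` case above, a trial can simply cast the sampled value back to an integer before use (a minimal sketch; `batch_size` is just an illustrative parameter name):

```python
import nni

params = nni.get_next_parameter()
# "randint" is currently sampled like "quniform", so the value arrives as a float.
batch_size = int(params['batch_size'])
```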
...
@@ -86,9 +86,19 @@ All types of sampling strategies and their parameter are listed here:
Known Limitations:
* Note that in the Grid Search Tuner, for users' convenience, the definitions of `quniform` and `qloguniform` change: here, q specifies the number of values that will be sampled. Details are listed as follows:
  * Type 'quniform' will receive three values [low, high, q], where [low, high] specifies a range and 'q' specifies the number of values that will be sampled evenly. Note that q should be at least 2. It will be sampled in a way that the first sampled value is 'low', and each of the following values is (high-low)/q larger than the value in front of it.
  * Type 'qloguniform' behaves like 'quniform' except that it first changes the range to [log(low), log(high)], samples, and then changes the sampled value back.
* Note that the Metis Tuner only supports numerical `choice` now.
* Note that for nested search spaces:
  * Only the Random Search/TPE/Anneal/Evolution tuners support nested search spaces.
  * We do not support the nested search space in the "Hyper Parameter" parallel graph now; this enhancement is being considered in [#1110](https://github.com/microsoft/nni/issues/1110). Any suggestions, discussions, or contributions are warmly welcomed.
* See the experiment trial profile and search space message.
* Support to download the experiment result.
* Support to export nni-manager and dispatcher log file.
* If you have any questions, you can click "Feedback" to report them.


* See good performance trials.
...
@@ -52,6 +54,14 @@ Click the tab "Trials Detail" to see the status of the all trials. Specifically:


* The button named "Add column" lets you select which columns to show in the table. If you run an experiment whose final result is a dict, you can see the other keys in the table.

* You can use the button named "Copy as python" to copy the trial's parameters.

* If you run on the OpenPAI or Kubeflow platform, you can also see the hdfsLog.