An experiment runs multiple trial jobs; each trial job tries a configuration, which includes a specific neural architecture (or model) and hyper-parameter values. To run an experiment through NNI, you should:
* Provide a runnable trial
...
**Prepare trial**: Let's use a simple trial example, e.g. mnist, provided by NNI. After you install NNI, the examples are placed in `~/nni/examples`; run `ls ~/nni/examples/trials` to see all the trial examples. You can simply execute the following command to run the NNI mnist example:
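The command itself is elided in this excerpt; based on the mnist-annotation example and the `command`/`codeDir` values shown in the configuration below, it is presumably:

```bash
python ~/nni/examples/trials/mnist-annotation/mnist.py
```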
This command will be filled into the YAML configuration file below. Please refer to [here](howto_1_WriteTrial.md) for how to write your own trial.
**Prepare tuner**: NNI supports several popular automl algorithms, including Random Search, Tree of Parzen Estimators (TPE), Evolution algorithm, etc. Users can write their own tuner (refer to [here](howto_2_CustomizedTuner.md)), but for simplicity, here we choose a tuner provided by NNI as below:
```yaml
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
```
*builtinTunerName* is used to specify a tuner in NNI, *classArgs* are the arguments passed to the tuner, and *optimize_mode* indicates whether you want to maximize or minimize your trial's result.
**Prepare configure file**: Now that you know which trial code you are going to run and which tuner you are going to use, it is time to prepare the YAML configuration file. NNI provides a demo configuration file for each trial example; run `cat ~/nni/examples/trials/mnist-annotation/config.yml` to see it. Its content is basically shown below:
```yaml
authorName: your_name
experimentName: auto_mnist
...
trial:
  command: python mnist.py
  codeDir: ~/nni/examples/trials/mnist-annotation
  gpuNum: 0
```
Here *useAnnotation* is true because this trial example uses our Python annotation (refer to [here](../tools/annotation/README.md) for details). In the *trial* section, *command* is the command to run the trial, *codeDir* is where the trial code resides (the command will be executed in this directory), and *gpuNum* specifies how many GPUs a trial requires.
...
The experiment is now running. NNI provides WebUI for you to view experiment progress, control your experiment, and enjoy some other appealing features. The WebUI is opened by default by `nnictl create`.
## Read more
* [Tuners supported in the latest NNI release](./HowToChooseTuner.md)
You can also install NNI in a Docker image. Please follow the instructions [here](../deployment/docker/README.md) to build the NNI Docker image. The NNI Docker image can also be retrieved from Docker Hub through the command `docker pull msranni/nni:latest`.
## **System requirements**
...
|**Internet**|Broadband internet connection|
|**Resolution**|1024 x 768 minimum display resolution|
Users who want to use tf-operator can set `ps` and `worker` in the trial config; users who want to use pytorch-operator can set `master` and `worker` in the trial config, as sketched below for tf-operator.
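For illustration, a minimal tf-operator trial section might look like the sketch below. The per-role fields (`replicas`, `command`, `gpuNum`, `cpuNum`, `memoryMB`, `image`) are assumptions based on the kubeflow-mode config schema, and the script name is a placeholder:

```yaml
trial:
  codeDir: ~/nni/examples/trials/mnist
  ps:                        # parameter-server role used by tf-operator
    replicas: 1
    command: python mnist_dist.py
    gpuNum: 0
    cpuNum: 1
    memoryMB: 8192
    image: msranni/nni:latest
  worker:                    # worker role used by tf-operator
    replicas: 1
    command: python mnist_dist.py
    gpuNum: 1
    cpuNum: 1
    memoryMB: 8192
    image: msranni/nni:latest
```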
## Supported storage type
NNI supports NFS and Azure Storage for storing code and output files; users can set the storage type in the config file along with the corresponding settings.
The settings for NFS storage are as follows:

```yaml
...
```
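Since the block above is elided in this excerpt, here is a hedged sketch of what the NFS settings typically look like in kubeflow mode; the `kubeflowConfig`/`nfs` field names are assumptions, and the server address and path are placeholders:

```yaml
kubeflowConfig:
  operator: tf-operator
  nfs:
    server: 10.10.10.10      # NFS server address
    path: /var/nfs/general   # exported path on the NFS server
```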
...
Once a trial job is completed, you can go to NNI WebUI's overview page (like http://localhost:8080/oview) to check the trial's information.
For any problems when using NNI in kubeflow mode, please create an issue on [NNI github repo](https://github.com/Microsoft/nni), or send mail to nni@microsoft.com
__nnictl__ is a command line tool, which can be used to control experiments, such as starting/stopping/resuming an experiment and starting/stopping NNIBoard.
## Commands
nnictl supports the following commands:
```bash
nnictl create
nnictl stop
nnictl update
...
nnictl tensorboard
nnictl top
nnictl --version
```
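As a quick illustration of the workflow (the config path follows the mnist example used throughout this document):

```bash
nnictl create --config ~/nni/examples/trials/mnist-annotation/config.yml
nnictl top    # monitor running trial jobs
nnictl stop   # stop the current experiment
```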
### Manage an experiment
* __nnictl create__
  * Description

    You can use this command to create a new experiment, using the configuration specified in the config file. After this command is successfully done, the context will be set to this experiment, which means the following commands you issue are associated with this experiment, unless you explicitly change the context (not supported yet).
...

    | id| False| |ID of the experiment you want to set|
    | --value, -v| True| |the experiment duration will be NUMBER seconds. SUFFIX may be 's' for seconds (the default), 'm' for minutes, 'h' for hours or 'd' for days.|

* __nnictl update trialnum__
  * Description

    You can use this command to update an experiment's maxtrialnum.

  * Usage

    nnictl update trialnum [OPTIONS]

    Options:

    | Name, shorthand | Required|Default|Description|
    | ------ | ------ | ------ |------ |
    | id| False| |ID of the experiment you want to set|

...

* __nnictl tensorboard start__
  * Usage

    Options:

    | Name, shorthand | Required|Default|Description|
    | ------ | ------ | ------ |------ |
    | id| False| |ID of the experiment you want to set|
    | --trialid| False| |ID of the trial|
    | --port| False| 6006|The port of the tensorboard process|

  * Detail

    1. NNICTL supports the tensorboard function on local and remote platforms for the moment; other platforms will be supported later.
    2. If you want to use tensorboard, you need to write your tensorboard log data to the path given by the environment variable [NNI_OUTPUT_DIR].
    3. In local mode, nnictl will set --logdir=[NNI_OUTPUT_DIR] directly and start a tensorboard process.
    4. In remote mode, nnictl will first create an ssh client to copy log data from the remote machine to a local temp directory, and then start a tensorboard process on your local machine. Note that nnictl only copies the log data once when you use the command; if you want to see later tensorboard results, you should execute the nnictl tensorboard command again.
    5. If there is only one trial job, you don't need to set trialid. If there are multiple trial jobs running, you should set trialid, or you can use [nnictl tensorboard start --trialid all] to map --logdir to all trial log paths.
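For example, to map tensorboard to all trial log paths on the default port (flags as listed in the options above):

```bash
nnictl tensorboard start --trialid all --port 6006
```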
...
Users can use nnictl and/or the visualized Web UI nniboard to monitor and debug a given experiment.
NNI provides a set of examples in the package to get you familiar with the above process.
## Key Concepts
...
### **Tutorials**
* [How to run an experiment on local (with multiple GPUs)?](tutorial_1_CR_exp_local_api.md)
* [How to run an experiment on multiple machines?](tutorial_2_RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](PAIMode.md)
NNI supports running an experiment on [OpenPAI](https://github.com/Microsoft/pai) (aka pai), called pai mode. Before starting to use NNI pai mode, you should have an account to access an [OpenPAI](https://github.com/Microsoft/pai) cluster. See [here](https://github.com/Microsoft/pai#how-to-deploy) if you don't have an OpenPAI account and want to deploy an OpenPAI cluster. In pai mode, your trial program will run in pai's container created by Docker.
## Setup environment
Install NNI following the install guide [here](GetStarted.md).
## Run an experiment
Use `examples/trials/mnist-annotation` as an example. The content of the NNI config YAML file is like:
```yaml
authorName: your_name
experimentName: auto_mnist
# how many trials could be concurrently running
...
paiConfig:
  passWord: your_pai_password
  host: 10.1.1.1
```
Note: you should set `trainingServicePlatform: pai` in the NNI config YAML file if you want to start the experiment in pai mode.
Compared with LocalMode and [RemoteMachineMode](RemoteMachineMode.md), trial configuration in pai mode has five additional keys:
* cpuNum
...
```bash
nnictl create --config exp_pai.yaml
```
to start the experiment in pai mode. NNI will create an OpenPAI job for each trial, and the job name format is something like `nni_exp_{experiment_id}_trial_{trial_id}`.
You can see the pai jobs created by NNI in your OpenPAI cluster's web portal, like:


...
If you also want to save the trial's other output, like model files, into HDFS, you can use the environment variable `NNI_OUTPUT_DIR` in your trial code to save your own output files, and the NNI SDK will copy all the files in `NNI_OUTPUT_DIR` from the trial's container to HDFS.
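A minimal sketch of that pattern in trial code (the file name and its contents are illustrative; `NNI_OUTPUT_DIR` is the environment variable described above):

```python
import os

# NNI sets NNI_OUTPUT_DIR inside the trial's container; files written here
# are copied from the container to HDFS by the NNI SDK after the trial ends.
output_dir = os.environ.get('NNI_OUTPUT_DIR', '.')
with open(os.path.join(output_dir, 'model_info.txt'), 'w') as f:
    f.write('best accuracy: 0.93\n')  # placeholder for real model artifacts
```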
For any problems when using NNI in pai mode, please create an issue on [NNI github repo](https://github.com/Microsoft/nni), or send mail to nni@microsoft.com
NNI supports running an experiment on multiple machines through the SSH channel, called `remote` mode. NNI assumes that you have access to those machines, and have already set up the environment for running deep learning training code.

e.g. Three machines, where you log in with account `bob` (Note: the account is not necessarily the same on different machines):
| IP | Username| Password |
| -------- |---------|-------|
...
| 10.1.1.3 | bob | bob123 |
## Setup NNI environment
Install NNI on each of your machines following the install guide [here](GetStarted.md).
For remote machines that are used only to run trials but not nnictl, you can just install the Python SDK:
* __Install python SDK through pip__
```bash
python3 -m pip install --user --upgrade nni-sdk
```
## Run an experiment
Install NNI on another machine which has network access to the three machines above, or just use any machine above to run the nnictl command line tool.
We use `examples/trials/mnist-annotation` as an example here. Run `cat ~/nni/examples/trials/mnist-annotation/config_remote.yml` to see the detailed configuration file:
```yaml
authorName: default
experimentName: example_mnist
trialConcurrency: 1
...
machineList:
  - ...
    username: bob
    passwd: bob123
```
Simply fill in the `machineList` section and then run:
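The launch command is elided here; following the `nnictl create --config` pattern used elsewhere in these docs, it is presumably:

```bash
nnictl create --config ~/nni/examples/trials/mnist-annotation/config_remote.yml
```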
The example defines `dropout_rate` as a variable whose prior distribution is the uniform distribution, with values ranging from `0.1` to `0.5`.
The tuner will sample parameters/architectures by understanding the search space first.
Users should define the variable's name, type, and candidate values, as in the JSON sketch below.
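For instance, the `dropout_rate` above would be declared in `search_space.json` with NNI's `_type`/`_value` convention, roughly like this:

```json
{
    "dropout_rate": { "_type": "uniform", "_value": [0.1, 0.5] }
}
```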
...
Note that SMAC only supports a subset of the types above, including `choice`, `randint`, `uniform`, `loguniform`, `quniform(q=1)`. In the current version, SMAC does not support cascaded search space (i.e., conditional variable in SMAC).
Note that GridSearch Tuner only supports a subset of the types above, including `choice`, `quniform` and `qloguniform`, where q here specifies the number of values that will be sampled. Details about the last two types are as follows:
* Type 'quniform' will receive three values [low, high, q], where [low, high] specifies a range and 'q' specifies the number of values that will be sampled evenly. Note that q should be at least 2. It will be sampled in a way that the first sampled value is 'low', and each of the following values is (high-low)/q larger than the value in front of it.
* Type 'qloguniform' behaves like 'quniform' except that it will first change the range to [log(low), log(high)], sample, and then change the sampled value back.
For debugging NNI source code, your development environment should be Ubuntu 16.04 (or above) with Python 3 and pip 3 installed; then follow the steps below.
...
**1. Clone the source code**

Run the command

```bash
git clone https://github.com/Microsoft/nni.git
```

to clone the source code.

**2. Prepare the debug environment and install dependencies**

Change directory to the source code folder, then run the command

```bash
make install-dependencies
```

to install the dependent tools for the environment.

**3. Build source code**

Run the command

```bash
make build
```

to build the source code.

**4. Install NNI to development environment**

Run the command

```bash
make dev-install
```

to install the distribution content to the development environment and create cli scripts.

**5. Check if the environment is ready**

Now, you can try to start an experiment to check if your environment is ready.
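For example, a quick smoke test could reuse the bundled mnist example referenced elsewhere in these docs (the config path is assumed from the examples layout):

```bash
nnictl create --config ~/nni/examples/trials/mnist-annotation/config.yml
```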
...
To define an NNI trial, you need to first define the set of parameters and then update the model. NNI provides two approaches for you to define a trial: `NNI API` and `NNI Python annotation`.
accuracy - The `accuracy` could be any python object, but if you use NNI built-in tuner/assessor, `accuracy` should be a numerical variable (e.g. float, int).
assessor - The assessor decides which trial should stop early based on the trial's performance history (the intermediate results of one trial).
...
searchSpacePath: /path/to/your/search_space.json
```
You can refer to [here](./ExperimentConfig.md) for more information about how to set up experiment configurations.
You can refer to [here](../examples/trials/README.md) for more information about how to write trial code using NNI APIs.
## NNI Python Annotation
An alternative way to write a trial is to use NNI's syntax for Python. As simple as any annotation, NNI annotation works like comments in your code. You don't have to restructure your existing code or make any other big changes. With a few lines of NNI annotation, you will be able to:
* annotate the variables you want to tune
* specify the range in which you want to tune the variables
* annotate which variable you want to report as an intermediate result to `assessor`
* annotate which variable you want to report as the final result (e.g. model accuracy) to `tuner`
Again, take MNIST as an example: it only requires 2 steps to write a trial with NNI Annotation.
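For instance, annotating a dropout rate variable could look like the sketch below (see the Annotation README linked later for the exact syntax; the variable name and default value are illustrative):

```python
'''@nni.variable(nni.uniform(0.1, 0.5), name=dropout_rate)'''
dropout_rate = 0.5  # default value, replaced by NNI at run time
```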
...
>>
>>`@nni.report_intermediate_result`/`@nni.report_final_result` will send the data to assessor/tuner at that line.
>>
>>Please refer to [Annotation README](../tools/nni_annotation/README.md) for more information about annotation syntax and its usage.
>Step 2 - Enable NNI Annotation
In the YAML configuration file, you need to set *useAnnotation* to true to enable NNI annotation:
```yaml
useAnnotation: true
```
## More Trial Examples
* [Automatic Model Architecture Search for Reading Comprehension.](../examples/trials/ga_squad/README.md)
...

```python
...
    def receive_trial_result(self, parameter_id, parameters, value):
        '''
        Record an observation of the objective function and Train
        ...
        '''
        # your code implements here.
        ...

    def generate_parameters(self, parameter_id):
        '''
        Returns a set of trial (hyper-)parameters, as a serializable object
        ...
        '''
        # your code implements here.
        return your_parameters
    ...
```
`receive_trial_result` receives `parameter_id`, `parameters`, and `value` as input, and the `value` object the Tuner receives is exactly the same value that the Trial sent.
The `your_parameters` returned from the `generate_parameters` function will be packaged as a JSON object by the NNI SDK, and the NNI SDK will unpack the JSON object so the Trial receives the exact same `your_parameters` from the Tuner.
For example:
If you implement the `generate_parameters` like this:
```python
def generate_parameters(self, parameter_id):
    '''
    ...
    '''
    # your code implements here.
    return {"dropout": 0.3, "learning_rate": 0.4}
```
It means your Tuner will always generate parameters `{"dropout": 0.3, "learning_rate": 0.4}`. Then Trial will receive `{"dropout": 0.3, "learning_rate": 0.4}` by calling API `nni.get_next_parameter()`. Once the trial ends with a result (normally some kind of metric), it can send the result to Tuner by calling API `nni.report_final_result()`, for example `nni.report_final_result(0.93)`. Then your Tuner's `receive_trial_result` function will receive the result like:

```python
parameter_id = 82347
parameters = {"dropout": 0.3, "learning_rate": 0.4}
value = 0.93
```
**Note that** if you want to access a file (e.g., `data.txt`) in the directory of your own tuner, you cannot use `open('data.txt', 'r')`. Instead, you should use the following:
```python
import os

_pwd = os.path.dirname(__file__)
_fd = open(os.path.join(_pwd, 'data.txt'), 'r')
```
This is because your tuner is not executed in the directory of your tuner (i.e., `pwd` is not the directory of your own tuner).
**3) Configure your customized tuner in experiment yaml config file**
NNI needs to locate your customized tuner class and instantiate it, so you need to specify the location of the customized tuner class and pass literal values as parameters to the \_\_init__ constructor, as in the sketch below.
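A sketch of what that section of the config might look like (the directory, file name, and argument values are placeholders):

```yaml
tuner:
  codeDir: /home/your_user/mytuner         # directory containing your tuner
  classFileName: my_customized_tuner.py    # file that defines the class
  className: CustomizedTuner
  classArgs:
    optimize_mode: maximize                # literal values passed to __init__
```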
The information above is usually enough to write a general tuner. However, users may also want more information, for example, intermediate results and trials' state (e.g., the information in assessor), in order to have a more powerful automl algorithm. Therefore, we have another concept called `advisor` which directly inherits from `MsgDispatcherBase` in [`src/sdk/pynni/nni/msg_dispatcher_base.py`](../src/sdk/pynni/nni/msg_dispatcher_base.py). Please refer to [here](./howto_3_CustomizedAdvisor.md) for how to write a customized advisor.
*Advisor targets the scenario where the automl algorithm wants the methods of both tuner and assessor. Advisor is similar to tuner in that it receives trial parameters requests and final results, and generates trial parameters. It is also similar to assessor in that it receives intermediate results and trials' end state, and can send commands to kill trials. Note that, if you use Advisor, tuner and assessor are not allowed to be used at the same time.*
So, if a user wants to implement a customized Advisor, she/he only needs to:
1. Define an Advisor inheriting from the MsgDispatcherBase class
1. Implement the methods with prefix `handle_` except `handle_request`
1. Configure your customized Advisor in experiment yaml config file
Here is an example:
**1) Define an Advisor inheriting from the MsgDispatcherBase class**
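A minimal sketch, assuming the import path implied by the file location above; the specific `handle_` method names shown are assumptions drawn from the `handle_` convention and should be checked against `MsgDispatcherBase`:

```python
from nni.msg_dispatcher_base import MsgDispatcherBase

class CustomizedAdvisor(MsgDispatcherBase):
    '''Implements the handle_* methods except handle_request.'''

    def handle_initialize(self, data):
        # data: the search space; set up internal state here.
        pass

    def handle_request_trial_jobs(self, data):
        # data: number of trial jobs requested; generate and send parameters.
        pass

    def handle_report_metric_data(self, data):
        # data: intermediate or final metrics reported by trials.
        pass
```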
This command will be filled into the YAML configuration file below. Please refer to [here](./howto_1_WriteTrial.md) for how to write your own trial.
**Prepare tuner**: NNI supports several popular automl algorithms, including Random Search, Tree of Parzen Estimators (TPE), Evolution algorithm, etc. Users can write their own tuner (refer to [here](./howto_2_CustomizedTuner.md)), but for simplicity, here we choose a tuner provided by NNI as below:
tuner:
  builtinTunerName: TPE
...
You can refer to [here](NNICTLDOC.md) for more usage of the *nnictl* command line tool.
## View experiment results
The experiment is now running. Other than *nnictl*, NNI also provides WebUI for you to view experiment progress, control your experiment, and enjoy some other appealing features.
## Using multiple local GPUs to speed up search
The following steps assume that you have 4 NVIDIA GPUs installed locally and [tensorflow with GPU support](https://www.tensorflow.org/install/gpu). The demo enables 4 concurrent trial jobs, and each trial job uses 1 GPU.
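The relevant config fields for that setup could look like the sketch below (other fields elided; the key names follow the config examples earlier in this document):

```yaml
trainingServicePlatform: local
trialConcurrency: 4    # four trial jobs run concurrently
trial:
  command: python mnist.py
  codeDir: ~/nni/examples/trials/mnist-annotation
  gpuNum: 1            # each trial job uses one GPU
```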