An experiment is to run multiple trial jobs, each trial job tries a configuration which includes a specific neural architecture (or model) and hyper-parameter values. To run an experiment through NNI, you should:
* Provide a runnable trial
...
...
@@ -32,22 +40,26 @@ An experiment is to run multiple trial jobs, each trial job tries a configuratio
**Prepare trial**: Let's use a simple trial example, e.g. mnist, provided by NNI. After you installed NNI, NNI examples have been put in ~/nni/examples, run `ls ~/nni/examples/trials` to see all the trial examples. You can simply execute the following command to run the NNI mnist example:
This command will be filled in the yaml configure file below. Please refer to [here](howto_1_WriteTrial.md) for how to write your own trial.
**Prepare tuner**: NNI supports several popular automl algorithms, including Random Search, Tree of Parzen Estimators (TPE), Evolution algorithm etc. Users can write their own tuner (refer to [here](howto_2_CustomizedTuner.md), but for simplicity, here we choose a tuner provided by NNI as below:
tuner:
```yaml
tuner:
builtinTunerName:TPE
classArgs:
optimize_mode:maximize
```
*builtinTunerName* is used to specify a tuner in NNI, *classArgs* are the arguments pass to the tuner, *optimization_mode* is to indicate whether you want to maximize or minimize your trial's result.
**Prepare configure file**: Since you have already known which trial code you are going to run and which tuner you are going to use, it is time to prepare the yaml configure file. NNI provides a demo configure file for each trial example, `cat ~/nni/examples/trials/mnist-annotation/config.yml` to see it. Its content is basically shown below:
```
```yaml
authorName:your_name
experimentName:auto_mnist
...
...
@@ -87,6 +99,7 @@ You can refer to [here](NNICTLDOC.md) for more usage guide of *nnictl* command l
The experiment has been running now, NNI provides WebUI for you to view experiment progress, to control your experiment, and some other appealing features. The WebUI is opened by default by `nnictl create`.
## Read more
*[Tuners supported in the latest NNI release](./HowToChooseTuner.md)
If users want to use tf-operator, he could set `ps` and `worker` in trial config. If users want to use pytorch-operator, he could set `master` and `worker` in trial config.
## Supported sotrage type
## Supported storage type
NNI support NFS and Azure Storage to store the code and output files, users could set storage type in config file and set the corresponding config.
The setting for NFS storage are as follows:
```
...
...
@@ -197,4 +197,3 @@ Notice: In kubeflow mode, NNIManager will start a rest server and listen on a po
Once a trial job is completed, you can goto NNI WebUI's overview page (like http://localhost:8080/oview) to check trial's information.
Any problems when using NNI in kubeflow mode, plesae create issues on [NNI github repo](https://github.com/Microsoft/nni), or send mail to nni@microsoft.com
__nnictl__ is a command line tool, which can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc.
## Commands
nnictl support commands:
```
```bash
nnictl create
nnictl stop
nnictl update
...
...
@@ -19,21 +24,22 @@ nnictl tensorboard
nnictl top
nnictl --version
```
### Manage an experiment
* __nnictl create__
* Description
You can use this command to create a new experiment, using the configuration specified in config file.
After this command is successfully done, the context will be set as this experiment,
which means the following command you issued is associated with this experiment,
unless you explicitly changes the context(not supported yet).
You can use this command to create a new experiment, using the configuration specified in config file. After this command is successfully done, the context will be set as this experiment, which means the following command you issued is associated with this experiment, unless you explicitly changes the context(not supported yet).
@@ -22,7 +22,7 @@ After user submits the experiment through a command line tool [nnictl](../tools/
User can use the nnictl and/or a visualized Web UI nniboard to monitor and debug a given experiment.
NNI provides a set of examples in the package to get you familiar with the above process. In the following example [/examples/trials/mnist], we had already set up the configuration and updated the training codes for you. You can directly run the following command to start an experiment.
NNI provides a set of examples in the package to get you familiar with the above process.
Use `examples/trials/mnist-annotation` as an example. The nni config yaml file's content is like:
```
```yaml
authorName:your_name
experimentName:auto_mnist
# how many trials could be concurrently running
...
...
@@ -39,6 +40,7 @@ paiConfig:
passWord:your_pai_password
host:10.1.1.1
```
Note: You should set `trainingServicePlatform: pai` in nni config yaml file if you want to start experiment in pai mode.
Compared with LocalMode and [RemoteMachineMode](RemoteMachineMode.md), trial configuration in pai mode have five additional keys:
...
...
@@ -58,7 +60,7 @@ Once complete to fill nni experiment config file and save (for example, save as
```
nnictl create --config exp_pai.yaml
```
to start the experiment in pai mode. NNI will create OpanPAI job for each trial, and the job name format is something like `nni_exp_{experiment_id}_trial_{trial_id}`.
to start the experiment in pai mode. NNI will create OpenPAI job for each trial, and the job name format is something like `nni_exp_{experiment_id}_trial_{trial_id}`.
You can see the pai jobs created by NNI in your OpenPAI cluster's web portal, like:

...
...
@@ -77,4 +79,3 @@ You can see there're three fils in output folder: stderr, stdout, and trial.log
If you also want to save trial's other output into HDFS, like model files, you can use environment variable `NNI_OUTPUT_DIR` in your trial code to save your own output files, and NNI SDK will copy all the files in `NNI_OUTPUT_DIR` from trial's container to HDFS.
Any problems when using NNI in pai mode, plesae create issues on [NNI github repo](https://github.com/Microsoft/nni), or send mail to nni@microsoft.com
NNI supports running an experiment on multiple machines through SSH channel, called `remote` mode. NNI assumes that you have access to those machines, and already setup the environment for running deep learning training code.
e.g. Three machines and you login in with account `bob` (Note: the account is not necessarily the same on different machine):
...
...
@@ -11,19 +13,24 @@ e.g. Three machines and you login in with account `bob` (Note: the account is no
| 10.1.1.3 | bob | bob123 |
## Setup NNI environment
Install NNI on each of your machines following the install guide [here](GetStarted.md).
For remote machines that are used only to run trials but not the nnictl, you can just install python SDK:
* __Install python SDK through pip__
```bash
python3 -m pip install--user--upgrade nni-sdk
```
## Run an experiment
Install NNI on another machine which has network accessibility to those three machines above, or you can just use any machine above to run nnictl command line tool.
We use `examples/trials/mnist-annotation` as an example here. `cat ~/nni/examples/trials/mnist-annotation/config_remote.yml` to see the detailed configuration file:
```
```yaml
authorName:default
experimentName:example_mnist
trialConcurrency:1
...
...
@@ -58,8 +65,11 @@ machineList:
username:bob
passwd:bob123
```
Simply filling the `machineList` section and then run:
The example define ```dropout_rate``` as variable which priori distribution is uniform distribution, and its value from ```0.1``` and ```0.5```.
The example define `dropout_rate` as variable which priori distribution is uniform distribution, and its value from `0.1` and `0.5`.
The tuner will sample parameters/architecture by understanding the search space first.
User should define the name of variable, type and candidate value of variable.
...
...
@@ -69,6 +69,6 @@ The candidate type and value for variable is here:
Note that SMAC only supports a subset of the types above, including `choice`, `randint`, `uniform`, `loguniform`, `quniform(q=1)`. In the current version, SMAC does not support cascaded search space (i.e., conditional variable in SMAC).
Note that GridSearch Tuner only supports a subset of the types above, including `choic`, `quniform` and `qloguniform`, where q here specifies the number of values that will be sampled. Details about the last two type as follows
Note that GridSearch Tuner only supports a subset of the types above, including `choice`, `quniform` and `qloguniform`, where q here specifies the number of values that will be sampled. Details about the last two type as follows
* Type 'quniform' will receive three values [low, high, q], where [low, high] specifies a range and 'q' specifies the number of values that will be sampled evenly. Note that q should be at least 2. It will be sampled in a way that the first sampled value is 'low', and each of the following values is (high-low)/q larger that the value in front of it.
* Type 'qloguniform' behaves like 'quniform' except that it will first change the range to [log(low), log(high)] and sample and then change the sampled value back.
For debugging NNI source code, your development environment should be under Ubuntu 16.04 (or above) system with python 3 and pip 3 installed, then follow the below steps.
...
...
@@ -7,42 +9,52 @@ For debugging NNI source code, your development environment should be under Ubun
**1. Clone the source code**
Run the command
```
git clone https://github.com/Microsoft/nni.git
```
to clone the source code
**2. Prepare the debug environment and install dependencies**
Change directory to the source code folder, then run the command
```
make install-dependencies
```
to install the dependent tools for the environment
**3. Build source code**
Run the command
```
make build
```
to build the source code
**4. Install NNI to development environment**
Run the command
```
make dev-install
```
to install the distribution content to development environment, and create cli scripts
**5. Check if the environment is ready**
Now, you can try to start an experiment to check if your environment is ready.
@@ -6,10 +6,12 @@ A **Trial** in NNI is an individual attempt at applying a set of parameters on a
To define a NNI trial, you need to firstly define the set of parameters and then update the model. NNI provide two approaches for you to define a trial: `NNI API` and `NNI Python annotation`.
Refer to [SearchSpaceSpec.md](SearchSpaceSpec.md) to learn more about search space.
Refer to [SearchSpaceSpec.md](./SearchSpaceSpec.md) to learn more about search space.
>Step 2 - Update model codes
~~~~
2.1 Declare NNI API
Include `import nni` in your trial code to use NNI APIs.
...
...
@@ -48,6 +52,7 @@ Refer to [SearchSpaceSpec.md](SearchSpaceSpec.md) to learn more about search spa
~~~~
**NOTE**:
~~~~
accuracy - The `accuracy` could be any python object, but if you use NNI built-in tuner/assessor, `accuracy` should be a numerical variable (e.g. float, int).
assessor - The assessor will decide which trial should early stop based on the history performance of trial (intermediate result of one trial).
...
...
@@ -63,11 +68,12 @@ useAnnotation: false
searchSpacePath: /path/to/your/search_space.json
```
You can refer to [here](ExperimentConfig.md) for more information about how to set up experiment configurations.
You can refer to [here](./ExperimentConfig.md) for more information about how to set up experiment configurations.
(../examples/trials/README.md) for more information about how to write trial code using NNI APIs.
You can refer to [here](../examples/trials/README.md) for more information about how to write trial code using NNI APIs.
## NNI Python Annotation
An alternative to write a trial is to use NNI's syntax for python. Simple as any annotation, NNI annotation is working like comments in your codes. You don't have to make structure or any other big changes to your existing codes. With a few lines of NNI annotation, you will be able to:
* annotate the variables you want to tune
* specify in which range you want to tune the variables
...
...
@@ -118,9 +124,11 @@ with tf.Session() as sess:
>Step 2 - Enable NNI Annotation
In the yaml configure file, you need to set *useAnnotation* to true to enable NNI annotation:
```
```yaml
useAnnotation:true
```
## More Trial Example
*[Automatic Model Architecture Search for Reading Comprehension.](../examples/trials/ga_squad/README.md)
So, if user want to implement a customized Tuner, she/he only need to:
1) Inherit a tuner of a base Tuner class
2) Implement receive_trial_result and generate_parameter function
3) Configure your customized tuner in experiment yaml config file
1. Inherit a tuner of a base Tuner class
1. Implement receive_trial_result and generate_parameter function
1. Configure your customized tuner in experiment yaml config file
Here ia an example:
Here is an example:
**1) Inherit a tuner of a base Tuner class**
```python
fromnni.tunerimportTuner
...
...
@@ -20,6 +21,7 @@ class CustomizedTuner(Tuner):
```
**2) Implement receive_trial_result and generate_parameter function**
```python
fromnni.tunerimportTuner
...
...
@@ -46,12 +48,14 @@ class CustomizedTuner(Tuner):
returnyour_parameters
...
```
```receive_trial_result``` will receive ```the parameter_id, parameters, value``` as parameters input. Also, Tuner will receive the ```value``` object are exactly same value that Trial send.
The ```your_parameters``` return from ```generate_parameters``` function, will be package as json object by NNI SDK. NNI SDK will unpack json object so the Trial will receive the exact same ```your_parameters``` from Tuner.
`receive_trial_result` will receive the `parameter_id, parameters, value` as parameters input. Also, Tuner will receive the `value` object are exactly same value that Trial send.
The `your_parameters` return from `generate_parameters` function, will be package as json object by NNI SDK. NNI SDK will unpack json object so the Trial will receive the exact same `your_parameters` from Tuner.
For example:
If the you implement the ```generate_parameters``` like this:
If the you implement the `generate_parameters` like this:
```python
defgenerate_parameters(self,parameter_id):
'''
...
...
@@ -61,23 +65,28 @@ If the you implement the ```generate_parameters``` like this:
# your code implements here.
return{"dropout":0.3,"learning_rate":0.4}
```
It means your Tuner will always generate parameters ```{"dropout": 0.3, "learning_rate": 0.4}```. Then Trial will receive ```{"dropout": 0.3, "learning_rate": 0.4}``` by calling API ```nni.get_next_parameter()```. Once the trial ends with a result (normally some kind of metrics), it can send the result to Tuner by calling API ```nni.report_final_result()```, for example ```nni.report_final_result(0.93)```. Then your Tuner's ```receive_trial_result``` function will receied the result like:
```
It means your Tuner will always generate parameters `{"dropout": 0.3, "learning_rate": 0.4}`. Then Trial will receive `{"dropout": 0.3, "learning_rate": 0.4}` by calling API `nni.get_next_parameter()`. Once the trial ends with a result (normally some kind of metrics), it can send the result to Tuner by calling API `nni.report_final_result()`, for example `nni.report_final_result(0.93)`. Then your Tuner's `receive_trial_result` function will receied the result like:
```python
parameter_id=82347
parameters={"dropout":0.3,"learning_rate":0.4}
value=0.93
```
**Note that** if you want to access a file (e.g., ```data.txt```) in the directory of your own tuner, you cannot use ```open('data.txt', 'r')```. Instead, you should use the following:
```
**Note that** if you want to access a file (e.g., `data.txt`) in the directory of your own tuner, you cannot use `open('data.txt', 'r')`. Instead, you should use the following:
```python
_pwd=os.path.dirname(__file__)
_fd=open(os.path.join(_pwd,'data.txt'),'r')
```
This is because your tuner is not executed in the directory of your tuner (i.e., ```pwd``` is not the directory of your own tuner).
This is because your tuner is not executed in the directory of your tuner (i.e., `pwd` is not the directory of your own tuner).
**3) Configure your customized tuner in experiment yaml config file**
NNI needs to locate your customized tuner class and instantiate the class, so you need to specify the location of the customized tuner class and pass literal values as parameters to the \_\_init__ constructor.
The methods above are usually enough to write a general tuner. However, users may also want more methods, for example, intermediate results, trials' state (e.g., the methods in assessor), in order to have a more powerful automl algorithm. Therefore, we have another concept called `advisor` which directly inherits from `MsgDispatcherBase` in [`src/sdk/pynni/nni/msg_dispatcher_base.py`](../src/sdk/pynni/nni/msg_dispatcher_base.py). Please refer to [here](./howto_3_CustomizedAdvisor.md) for how to write a customized advisor.
The information above are usually enough to write a general tuner. However, users may also want more information, for example, intermediate results, trials' state (e.g., the information in assessor), in order to have a more powerful automl algorithm. Therefore, we have another concept called `advisor` which directly inherits from `MsgDispatcherBase` in [`src/sdk/pynni/nni/msg_dispatcher_base.py`](../src/sdk/pynni/nni/msg_dispatcher_base.py). Please refer to [here](./howto_3_CustomizedAdvisor.md) for how to write a customized advisor.
*Advisor targets the scenario that the automl algorithm wants the methods of both tuner and assessor. Advisor is similar to tuner on that it receives trial configuration request, final results, and generate trial configurations. Also, it is similar to assessor on that it receives intermediate results, trial's end state, and could send trial kill command. Note that, if you use Advisor, tuner and assessor are not allowed to be used at the same time.*
*Advisor targets the scenario that the automl algorithm wants the methods of both tuner and assessor. Advisor is similar to tuner on that it receives trial parameters request, final results, and generate trial parameters. Also, it is similar to assessor on that it receives intermediate results, trial's end state, and could send trial kill command. Note that, if you use Advisor, tuner and assessor are not allowed to be used at the same time.*
So, if user want to implement a customized Advisor, she/he only need to:
1) Define an Advisor inheriting from the MsgDispatcherBase class
2) Implement the methods with prefix `handle_` except `handle_request`
3) Configure your customized Advisor in experiment yaml config file
1. Define an Advisor inheriting from the MsgDispatcherBase class
1. Implement the methods with prefix `handle_` except `handle_request`
1. Configure your customized Advisor in experiment yaml config file
Here ia an example:
Here is an example:
**1) Define an Advisor inheriting from the MsgDispatcherBase class**
This command will be filled in the yaml configure file below. Please refer to [here](howto_1_WriteTrial) for how to write your own trial.
This command will be filled in the yaml configure file below. Please refer to [here](./howto_1_WriteTrial.md) for how to write your own trial.
**Prepare tuner**: NNI supports several popular automl algorithms, including Random Search, Tree of Parzen Estimators (TPE), Evolution algorithm etc. Users can write their own tuner (refer to [here](CustomizedTuner.md)), but for simplicity, here we choose a tuner provided by NNI as below:
**Prepare tuner**: NNI supports several popular automl algorithms, including Random Search, Tree of Parzen Estimators (TPE), Evolution algorithm etc. Users can write their own tuner (refer to [here](./howto_2_CustomizedTuner.md)), but for simplicity, here we choose a tuner provided by NNI as below:
tuner:
builtinTunerName: TPE
...
...
@@ -133,7 +133,7 @@ With all these steps done, we can run the experiment with the following command:
You can refer to [here](NNICTLDOC.md) for more usage guide of *nnictl* command line tool.
## View experiment results
The experiment has been running now. Oher than *nnictl*, NNI also provides WebUI for you to view experiment progress, to control your experiment, and some other appealing features.
The experiment has been running now. Other than *nnictl*, NNI also provides WebUI for you to view experiment progress, to control your experiment, and some other appealing features.
## Using multiple local GPUs to speed up search
The following steps assume that you have 4 NVIDIA GPUs installed at local and [tensorflow with GPU support](https://www.tensorflow.org/install/gpu). The demo enables 4 concurrent trail jobs and each trail job uses 1 GPU.
@@ -35,7 +35,7 @@ class CustomizedAssessor(Assessor):
```python
importargparse
importCustomizedAssesor
importCustomizedAssessor
defmain():
parser=argparse.ArgumentParser(description='parse command line parameters.')
...
...
@@ -49,9 +49,9 @@ def main():
main()
```
Please noted in 2). The object ```trial_history``` are exact the object that Trial send to Assesor by using SDK ```report_intermediate_result``` function.
Please noted in 2). The object `trial_history` are exact the object that Trial send to Assessor by using SDK `report_intermediate_result` function.
Also, user could override the ```run``` function in Assessor to control the process logic.
Also, user could override the `run` function in Assessor to control the process logic.