* __remote__: submit trial jobs to remote Ubuntu machines. The __machineList__ field must be filled in to set up SSH connections to the remote machines.
* __pai__: submit trial jobs to Microsoft's [OpenPAI](https://github.com/Microsoft/pai). For more details of pai configuration, please refer to [PaiModeDoc](../TrainingService/PaiMode.md).
* __kubeflow__: submit trial jobs to [kubeflow](https://www.kubeflow.org/docs/about/kubeflow/). NNI supports kubeflow based on normal Kubernetes and [Azure Kubernetes](https://azure.microsoft.com/en-us/services/kubernetes-service/). For details, please refer to [KubeflowDoc](../TrainingService/KubeflowMode.md).
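As a sketch of how the platform is selected, a fragment of an experiment configuration file could look like this (the field names follow NNI's config schema; the IP and credential values are placeholders):

```yaml
# Fragment of an NNI experiment config (values are placeholders)
trainingServicePlatform: remote
machineList:
  - ip: 10.1.1.1
    port: 22
    username: bob
    passwd: bob123
```

Setting `trainingServicePlatform` to `pai` or `kubeflow` instead selects the corresponding training service, each with its own extra configuration section.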
There are three parts of NNI that may produce logs: nnimanager, dispatcher, and trial. Here we introduce them succinctly. For more information, please refer to [Overview](../Overview.md).
- **NNI controller**: NNI controller (nnictl) is the NNI command-line tool used to manage experiments (e.g., start an experiment).
- **nnimanager**: nnimanager is the core of NNI, whose log is important when the whole experiment fails (e.g., no webUI or the training service fails).
Dispatcher fails. Usually, for some new users of NNI, it means that the tuner fails.
Take the latter situation as an example. If you write a customized tuner whose `__init__` function has an argument called `optimize_mode`, which you do not provide in your configuration file, NNI will fail to run your tuner, so the experiment fails. You can see errors in the webUI like:
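This failure mode can be reproduced in plain Python. A minimal sketch (the class is a stand-in for a customized tuner, which in NNI would inherit from the `Tuner` base class):

```python
class MyCustomTuner:
    """Stand-in for a customized tuner; a real one inherits NNI's Tuner base class."""
    def __init__(self, optimize_mode):
        self.optimize_mode = optimize_mode

# NNI instantiates the tuner with the classArgs from the configuration file.
# If classArgs omits optimize_mode, instantiation raises a TypeError and the
# dispatcher (which hosts the tuner) exits, failing the experiment.
try:
    MyCustomTuner()  # simulates an empty classArgs section
except TypeError as err:
    print(err)  # e.g. missing 1 required positional argument: 'optimize_mode'
```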


Here we can see it is a dispatcher error, so we can check the dispatcher's log, which might look like:
It means your trial code (which is run by NNI) fails.
A common example of this is running the mnist example without installing TensorFlow. There will surely be an `ImportError` (importing tensorflow in your trial code without having installed it), and thus every trial fails.


As it shows, every trial has a log path where you can find the trial's log and stderr.
After you prepare NNI's environment, you can start a new experiment using `nnictl`.
## Using docker in remote platform
NNI supports starting experiments on the [remoteTrainingService](../TrainingService/RemoteMachineMode.md) and running trial jobs on remote machines. As docker can start an independent Ubuntu system as an SSH server, a docker container can be used as the remote machine in NNI's remote mode.
### Step 1: Set up the docker environment
### Step 3: Run NNI experiments
Set your config file to use the remote platform, and set the `machineList` configuration to connect to your docker SSH server (see [RemoteMachineMode](../TrainingService/RemoteMachineMode.md)). Note that you should set the correct `port`, `username`, and `passwd` or `sshKeyPath` of your host machine.
`port:` the host machine's port, mapped to the docker container's SSH port.
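For example, if the container's SSH port 22 is mapped to port 10022 on the host, the `machineList` entry could look like this (the IP and credential values are placeholders):

```yaml
machineList:
  - ip: 10.1.1.1     # the host machine's IP, not the container's
    port: 10022      # host port mapped to the container's SSH port 22
    username: bob
    passwd: bob123
    # or use an SSH key instead of a password:
    # sshKeyPath: ~/.ssh/id_rsa
```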
Information about the experiment is shown in the WebUI, including the experiment trial profile and search space information. NNI also supports downloading this information and the parameters through the **Download** button. You can download the experiment results at any time while it is running, or after the execution ends.


The top 10 trials are listed on the Overview page; you can browse all trials on the "Trials Detail" page.


#### View trials detail page
Click the tab "Default Metric" to see the point graph of all trials. Hover over a point to see its default metric and search space information.


Click the tab "Hyper Parameter" to see the parallel graph.
* You can select the percentage to see the top trials.
* Choose two axes to swap their positions.


Click the tab "Trial Duration" to see the bar graph.


Below is the status of all trials. Specifically:
* Kill: you can kill a job whose status is running.
* Support to search for a specific trial.


* Intermediate Result Graph


## Related Topic
* [Try different Tuners](../Tuner/BuiltinTuner.md)
* [Try different Assessors](../Assessor/BuiltinAssessor.md)
* [How to use command line tool nnictl](Nnictl.md)
* [How to write a trial](../TrialExample/Trials.md)
* [How to run an experiment on local (with multiple GPUs)?](../TrainingService/LocalMode.md)
* [How to run an experiment on multiple machines?](../TrainingService/RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](../TrainingService/PaiMode.md)
* [How to run an experiment on Kubernetes through Kubeflow?](../TrainingService/KubeflowMode.md)
* [How to run an experiment on Kubernetes through FrameworkController?](../TrainingService/FrameworkControllerMode.md)
* Only the Random Search/TPE/Anneal/Evolution tuners support nested search space.
* We do not support nested search space in the "Hyper Parameter" visualization yet; the enhancement is being considered in [#1110](https://github.com/microsoft/nni/issues/1110). Any suggestions, discussions, or contributions are warmly welcomed.
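For reference, a nested search space entry follows the pattern below (taken from the convention used in NNI's nested search space examples; the layer and kernel names are illustrative). Each nested choice is a dict carrying a `_name` key alongside its own inner parameters:

```json
{
  "layer0": {
    "_type": "choice",
    "_value": [
      {"_name": "Empty"},
      {
        "_name": "Conv",
        "kernel_size": {"_type": "choice", "_value": [1, 2, 3, 5]}
      }
    ]
  }
}
```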
To debug NNI source code, your development environment should be an Ubuntu 16.04 (or above) system with Python 3 and pip3 installed; then follow the steps below.
**1. Clone the source code**
Run the command
```
git clone https://github.com/Microsoft/nni.git
```
to clone the source code
**2. Prepare the debug environment and install dependencies**
Change directory to the source code folder, then run the command
```
make install-dependencies
```
to install the dependent tools for the environment
**3. Build source code**
Run the command
```
make build
```
to build the source code
**4. Install NNI to development environment**
Run the command
```
make dev-install
```
to install the distribution content to the development environment and create CLI scripts
**5. Check if the environment is ready**
Now, you can try to start an experiment to check if your environment is ready.
* If you have any questions, you can click "Feedback" to report them.
* If your experiment has more than 1000 trials, you can change the refresh interval here.


* See the trials with good performance.


## View job default metric
Click the tab "Default Metric" to see the point graph of all trials. Hover over a point to see its default metric and search space information.


## View hyper parameter
Click the tab "Hyper Parameter" to see the parallel graph.
* You can select the percentage to see the top trials.
* Choose two axes to swap their positions.


## View Trial Duration
Click the tab "Trial Duration" to see the bar graph.


## View Trial Intermediate Result Graph
Click the tab "Intermediate Result" to see the line graph.


The graph has a filter function. You can open the filter button and then enter the point you want to focus on in the scope input; the intermediate result inputs can also limit the range of intermediate results shown.


## View trials status
Click the tab "Trials Detail" to see the status of all trials. Specifically:
* Trial detail: trial's id, duration, start time, end time, status, accuracy, and search space file.


* The button named "Add column" lets you select which columns to show in the table. If you run an experiment whose final result is a dict, you can see the other keys in the table.


* If you want to compare some trials, you can select them and then click "Compare" to see the results.


* You can use the button named "Copy as python" to copy the trial's parameters.


* If you run on the OpenPAI or Kubeflow platform, you can also see the hdfsLog.


* Kill: you can kill a job whose status is running.
* Support to search for a specific trial.
* Intermediate Result Graph: you can see the default and other keys in this graph.
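The "default" and other keys above come from the dict a trial reports as its final result. A minimal sketch of building such a result (the `f1` key and the values are illustrative; the reporting call is NNI's `nni.report_final_result`):

```python
def make_final_metrics(accuracy, f1):
    # NNI treats the 'default' key as the trial's default metric; any extra
    # keys (here 'f1') appear as additional table columns via "Add column"
    # and as extra lines in the result graphs.
    return {"default": accuracy, "f1": f1}

metrics = make_final_metrics(0.93, 0.90)
# Inside a real trial you would report it with:
#   nni.report_final_result(metrics)
print(metrics)  # {'default': 0.93, 'f1': 0.9}
```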