Unverified Commit 843b642f authored by Yuge Zhang's avatar Yuge Zhang Committed by GitHub

Docs improvement: configurations and more (#1823)

* docs improvement

* docs improvement

* docs improvement

* docs improvement

* docs improvement

* docs improvement

* docs improvement

* docs improvement

* docs improvement

* update

* update
parent 2c17da7d
## Run an experiment
Install NNI on another machine that has network access to the three machines above, or simply run `nnictl` on any one of the three to launch the experiment.

We use `examples/trials/mnist-annotation` as an example here. Shown here is `examples/trials/mnist-annotation/config_remote.yml`:
```yaml
authorName: default
# ... (intermediate fields omitted) ...
machineList:
  - username: bob
    passwd: bob123
```
Files in `codeDir` will be automatically uploaded to the remote machines. You can run NNI on different operating systems (Windows, Linux, macOS) to spawn experiments on the remote machines (which themselves must run Linux):

```bash
nnictl create --config examples/trials/mnist-annotation/config_remote.yml
```
You can also use public/private key pairs instead of username/password for authentication. For advanced usages, please refer to [Experiment Config Reference](../Tutorial/ExperimentConfig.md).
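For example, a `machineList` entry that authenticates with an ssh key pair might look like this (the IP, key path and passphrase below are placeholders):

```yaml
machineList:
  - ip: 10.1.1.1              # placeholder: address of the remote machine
    port: 22
    username: bob
    sshKeyPath: ~/.ssh/id_rsa # path to the private key file
    passphrase: mypassphrase  # placeholder: omit if the key has no passphrase
```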
## Version check

NNI supports the version check feature since version 0.6; see [PAI mode](PaiMode.md) for reference.
# Experiment Config Reference
A config file is needed when creating an experiment. The path of the config file is provided to `nnictl`.
The config file is in YAML format.
This document describes the rules to write the config file, and provides some examples and templates.
- [Experiment Config Reference](#experiment-config-reference)
* [Template](#template)
* [Configuration Spec](#configuration-spec)
+ [authorName](#authorname)
+ [experimentName](#experimentname)
+ [trialConcurrency](#trialconcurrency)
+ [maxExecDuration](#maxexecduration)
+ [versionCheck](#versioncheck)
+ [debug](#debug)
+ [maxTrialNum](#maxtrialnum)
+ [trainingServicePlatform](#trainingserviceplatform)
+ [searchSpacePath](#searchspacepath)
+ [useAnnotation](#useannotation)
+ [multiPhase](#multiphase)
+ [multiThread](#multithread)
+ [nniManagerIp](#nnimanagerip)
+ [logDir](#logdir)
+ [logLevel](#loglevel)
+ [logCollection](#logcollection)
+ [tuner](#tuner)
- [builtinTunerName](#builtintunername)
- [codeDir](#codedir)
- [classFileName](#classfilename)
- [className](#classname)
- [classArgs](#classargs)
- [gpuIndices](#gpuindices)
- [includeIntermediateResults](#includeintermediateresults)
+ [assessor](#assessor)
- [builtinAssessorName](#builtinassessorname)
- [codeDir](#codedir-1)
- [classFileName](#classfilename-1)
- [className](#classname-1)
- [classArgs](#classargs-1)
+ [advisor](#advisor)
- [builtinAdvisorName](#builtinadvisorname)
- [codeDir](#codedir-2)
- [classFileName](#classfilename-2)
- [className](#classname-2)
- [classArgs](#classargs-2)
- [gpuIndices](#gpuindices-1)
+ [trial](#trial)
+ [localConfig](#localconfig)
- [gpuIndices](#gpuindices-2)
- [maxTrialNumPerGpu](#maxtrialnumpergpu)
- [useActiveGpu](#useactivegpu)
+ [machineList](#machinelist)
- [ip](#ip)
- [port](#port)
- [username](#username)
- [passwd](#passwd)
- [sshKeyPath](#sshkeypath)
- [passphrase](#passphrase)
- [gpuIndices](#gpuindices-3)
- [maxTrialNumPerGpu](#maxtrialnumpergpu-1)
- [useActiveGpu](#useactivegpu-1)
+ [kubeflowConfig](#kubeflowconfig)
- [operator](#operator)
- [storage](#storage)
- [nfs](#nfs)
- [keyVault](#keyvault)
- [azureStorage](#azurestorage)
- [uploadRetryCount](#uploadretrycount)
+ [paiConfig](#paiconfig)
- [userName](#username)
- [password](#password)
- [token](#token)
- [host](#host)
* [Examples](#examples)
+ [Local mode](#local-mode)
+ [Remote mode](#remote-mode)
+ [PAI mode](#pai-mode)
+ [Kubeflow mode](#kubeflow-mode)
+ [Kubeflow with azure storage](#kubeflow-with-azure-storage)
## Template
* __Light weight (without Annotation and Assessor)__
```yaml
authorName:
# ... (intermediate fields omitted) ...
machineList:
  - passwd:
```
## Configuration Spec
### authorName

Required. String.

The name of the author who creates the experiment.

*TBD: add default value.*

### experimentName

Required. String.

The name of the experiment created.

*TBD: add default value.*

### trialConcurrency

Required. Integer between 1 and 99999.

Specifies the maximum number of trial jobs that run simultaneously.

Note: if trialGpuNum is bigger than the number of free GPUs, and the number of trials running simultaneously therefore cannot reach __trialConcurrency__, some trial jobs will be put into a queue to wait for GPU allocation.
### maxExecDuration

Optional. String. Default: 999d.

__maxExecDuration__ specifies the maximum duration of an experiment. The unit of the time is {__s__, __m__, __h__, __d__}, which means {_seconds_, _minutes_, _hours_, _days_}.

Note: maxExecDuration limits the duration of an experiment, not of a trial job. When the experiment reaches the maximum duration, it will not stop, but it can no longer submit new trial jobs.

### versionCheck

Optional. Bool. Default: true.

NNI will check the version of the nniManager process against the version of trialKeeper on the remote, pai and kubernetes platforms. To disable version check, set versionCheck to false.

### debug

Optional. Bool. Default: false.

Debug mode will set versionCheck to false and set logLevel to 'debug'.

### maxTrialNum

Optional. Integer between 1 and 99999. Default: 99999.

Specifies the maximum number of trial jobs created by NNI, including succeeded and failed jobs.
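Taken together, the fields above can be sketched in a config file like this (all values are illustrative):

```yaml
authorName: bob
experimentName: example_mnist
trialConcurrency: 4      # run at most 4 trials at the same time
maxExecDuration: 2h      # stop submitting new trials after 2 hours
maxTrialNum: 100         # create at most 100 trials in total
versionCheck: true
debug: false
```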
### trainingServicePlatform

Required. String.

Specifies the platform to run the experiment: __local__, __remote__, __pai__, __kubeflow__ or __frameworkcontroller__.

* __local__ runs an experiment on the local ubuntu machine.
* __remote__ submits trial jobs to remote ubuntu machines; the __machineList__ field must be filled in order to set up the SSH connection to the remote machines.
* __pai__ submits trial jobs to [OpenPAI](https://github.com/Microsoft/pai) of Microsoft. For more details of pai configuration, please refer to [Guide to PAI Mode](../TrainingService/PaiMode.md).
* __kubeflow__ submits trial jobs to [kubeflow](https://www.kubeflow.org/docs/about/kubeflow/); NNI supports kubeflow based on normal kubernetes and [azure kubernetes](https://azure.microsoft.com/en-us/services/kubernetes-service/). For details please refer to [Kubeflow Docs](../TrainingService/KubeflowMode.md).
* TODO: explain frameworkcontroller.
### searchSpacePath

Optional. Path to an existing file.

Specifies the path of the search space file, which should be a valid path on the local linux machine.

The only case in which __searchSpacePath__ may be omitted is when `useAnnotation=True`.

### useAnnotation

Optional. Bool. Default: false.

Use annotation to analyze trial code and generate the search space.

Note: if __useAnnotation__ is true, the searchSpacePath field should be removed.
### multiPhase

Optional. Bool. Default: false.

Enable [multi-phase experiment](../AdvancedFeature/MultiPhase.md).

### multiThread

Optional. Bool. Default: false.

Enable multi-thread mode for the dispatcher. If multiThread is enabled, the dispatcher will start a thread to process each command from NNI Manager.
### nniManagerIp

Optional. String. Default: eth0 device IP.

Sets the IP address of the machine on which the NNI manager process runs. This field is optional; if it is not set, the eth0 device IP will be used instead.

Note: run `ifconfig` on the NNI manager's machine to check whether the eth0 device exists. If not, setting __nniManagerIp__ explicitly is recommended.

### logDir

Optional. Path to a directory. Default: `<user home directory>/nni/experiment`.

Configures the directory to store logs and data of the experiment.

### logLevel

Optional. String. Default: `info`.

Sets the log level for the experiment. Available log levels are: `trace`, `debug`, `info`, `warning`, `error`, `fatal`.
### logCollection

Optional. `http` or `none`. Default: `none`.

Sets the way to collect logs on the remote, pai, kubeflow and frameworkcontroller platforms. There are two ways to collect logs. With `http`, the trial keeper posts log content back through http requests, but this may slow down log processing in the trial keeper. With `none`, the trial keeper does not post log content back and only posts job metrics. If your log content is too big, consider setting this parameter to `none`.

### tuner

Required.

Specifies the tuner algorithm of the experiment. There are two ways to set the tuner. One is to use a tuner provided by the NNI sdk (built-in tuners), in which case you need to set __builtinTunerName__ and __classArgs__. The other is to use your own tuner file, in which case __codeDir__, __classFileName__, __className__ and __classArgs__ are needed. *Users must choose exactly one way.*
#### builtinTunerName

Required if using built-in tuners. String.

Specifies the name of a built-in tuner. The NNI sdk provides different tuners introduced [here](../Tuner/BuiltinTuner.md).

#### codeDir

Required if using customized tuners. Path relative to the location of the config file.

Specifies the directory of the tuner code.

#### classFileName

Required if using customized tuners. File path relative to __codeDir__.

Specifies the name of the tuner file.

#### className

Required if using customized tuners. String.

Specifies the name of the tuner class.

#### classArgs

Optional. Key-value pairs. Default: empty.

Specifies the arguments of the tuner algorithm. Please refer to [this file](../Tuner/BuiltinTuner.md) for the configurable arguments of each built-in tuner.

#### gpuIndices

Optional. String. Default: empty.

Specifies the GPUs that can be used by the tuner process. Single or multiple GPU indices can be specified. Multiple GPU indices are separated by comma (`,`), for example `1` or `0,1,3`. If the field is not set, no GPU will be visible to the tuner (`CUDA_VISIBLE_DEVICES` is set to an empty string).

#### includeIntermediateResults

Optional. Bool. Default: false.

If __includeIntermediateResults__ is true, the last intermediate result of a trial that is early stopped by the assessor is sent to the tuner as the final result.
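As a sketch, a built-in tuner can be configured as follows (TPE is one of the built-in tuners; the values are illustrative):

```yaml
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
  gpuIndices: '0'        # optional: make only GPU 0 visible to the tuner
```

A customized tuner would instead set `codeDir`, `classFileName`, `className` and `classArgs`; the two styles must not be mixed.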
### assessor

Specifies the assessor algorithm of the experiment. Similar to tuners, there are two ways to set the assessor. One is to use an assessor provided by the NNI sdk, in which case you need to set __builtinAssessorName__ and __classArgs__. The other is to use your own assessor file, in which case __codeDir__, __classFileName__, __className__ and __classArgs__ are needed. *Users must choose exactly one way.*

By default, no assessor is enabled.

#### builtinAssessorName

Required if using built-in assessors. String.

Specifies the name of a built-in assessor. The NNI sdk provides different assessors introduced [here](../Assessor/BuiltinAssessor.md).

#### codeDir

Required if using customized assessors. Path relative to the location of the config file.

Specifies the directory of the assessor code.

#### classFileName

Required if using customized assessors. File path relative to __codeDir__.

Specifies the name of the assessor file.

#### className

Required if using customized assessors. String.

Specifies the name of the assessor class.

#### classArgs

Optional. Key-value pairs. Default: empty.

Specifies the arguments of the assessor algorithm.
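For example, a built-in assessor can be enabled like this (Medianstop is one of the built-in assessors; the values are illustrative):

```yaml
assessor:
  builtinAssessorName: Medianstop
  classArgs:
    optimize_mode: maximize
```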
### advisor

Optional.

Specifies the advisor algorithm of the experiment. Similar to tuners and assessors, there are two ways to specify the advisor. One is to use an advisor provided by the NNI sdk, in which case you need to set __builtinAdvisorName__ and __classArgs__. The other is to use your own advisor file, in which case __codeDir__, __classFileName__, __className__ and __classArgs__ are needed.

When advisor is enabled, settings of tuners and assessors will be bypassed.

#### builtinAdvisorName

Specifies the name of a built-in advisor. The NNI sdk provides [BOHB](../Tuner/BohbAdvisor.md) and [Hyperband](../Tuner/HyperbandAdvisor.md).

#### codeDir

Required if using customized advisors. Path relative to the location of the config file.

Specifies the directory of the advisor code.

#### classFileName

Required if using customized advisors. File path relative to __codeDir__.

Specifies the name of the advisor file.

#### className

Required if using customized advisors. String.

Specifies the name of the advisor class.

#### classArgs

Optional. Key-value pairs. Default: empty.

Specifies the arguments of the advisor.

#### gpuIndices

Optional. String. Default: empty.

Specifies the GPUs that can be used by the advisor process. Single or multiple GPU indices can be specified. Multiple GPU indices are separated by comma (`,`), for example `1` or `0,1,3`. If the field is not set, no GPU will be visible to the advisor (`CUDA_VISIBLE_DEVICES` is set to an empty string).
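As a sketch, enabling a built-in advisor looks like this (the values are illustrative; when an advisor section is present, tuner and assessor settings are bypassed):

```yaml
advisor:
  builtinAdvisorName: Hyperband
  classArgs:
    optimize_mode: maximize
```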
### trial

Required. Key-value pairs.

In local and remote mode, the following keys are required.

* __command__: Required string. Specifies the command to run the trial process.
* __codeDir__: Required string. Specifies the directory of your own trial file. This directory will be automatically uploaded in remote mode.
* __gpuNum__: Optional integer. Specifies the number of GPUs to run the trial process. Default value is 0.

In PAI mode, the following keys are required.

* __command__: Required string. Specifies the command to run the trial process.
* __codeDir__: Required string. Specifies the directory of your own trial file. Files in the directory will be uploaded in PAI mode.
* __gpuNum__: Required integer. Specifies the number of GPUs to run the trial process.
* __cpuNum__: Required integer. Specifies the number of CPUs to be used in the pai container.
* __memoryMB__: Required integer. Sets the memory size to be used in the pai container, in megabytes.
* __image__: Required string. Sets the image to be used in pai.
* __authFile__: Optional string. Used to provide a Docker registry which needs authentication for image pull in PAI. [Reference](https://github.com/microsoft/pai/blob/2ea69b45faa018662bc164ed7733f6fdbb4c42b3/docs/faq.md#q-how-to-use-private-docker-registry-job-image-when-submitting-an-openpai-job).
* __shmMB__: Optional integer. Shared memory size of the container.
* __portList__: List of key-value pairs with `label`, `beginAt`, `portNumber`. See the [job tutorial of PAI](https://github.com/microsoft/pai/blob/master/docs/job_tutorial.md) for details.

In Kubeflow mode, the following keys are required.

* __codeDir__: The local directory where the code files are.
* __ps__: An optional configuration for kubeflow's tensorflow-operator, which includes
    * __replicas__: The replica number of the __ps__ role.
    * __command__: The run script in __ps__'s container.
    * __gpuNum__: The number of GPUs to be used in the __ps__ container.
    * __cpuNum__: The number of CPUs to be used in the __ps__ container.
    * __memoryMB__: The memory size of the container.
    * __image__: The image to be used in __ps__.
* __worker__: An optional configuration for kubeflow's tensorflow-operator, which includes
    * __replicas__: The replica number of the __worker__ role.
    * __command__: The run script in __worker__'s container.
    * __gpuNum__: The number of GPUs to be used in the __worker__ container.
    * __cpuNum__: The number of CPUs to be used in the __worker__ container.
    * __memoryMB__: The memory size of the container.
    * __image__: The image to be used in __worker__.
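For instance, in local mode a minimal trial section could look like this (the command and path depend on your project):

```yaml
trial:
  command: python3 mnist.py                        # command that starts one trial
  codeDir: ~/nni/examples/trials/mnist-annotation  # directory containing the trial code
  gpuNum: 0                                        # optional in local/remote mode
```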
### localConfig

Optional in local mode. Key-value pairs.

Only applicable if __trainingServicePlatform__ is set to `local`; otherwise there should be no __localConfig__ section in the configuration file.

#### gpuIndices

Optional. String. Default: none.

Used to specify designated GPU devices for NNI. If it is set, only the specified GPU devices are used for NNI trial jobs. Single or multiple GPU indices can be specified. Multiple GPU indices should be separated with comma (`,`), such as `1` or `0,1,3`. By default, all available GPUs will be used.

#### maxTrialNumPerGpu

Optional. Integer. Default: 99999.

Used to specify the maximum number of concurrent trials on a GPU device.

#### useActiveGpu

Optional. Bool. Default: false.

Used to specify whether to use a GPU on which there is another process. By default, NNI will use a GPU only if there is no other active process on it. If __useActiveGpu__ is set to true, NNI will use the GPU regardless of other processes. This field is not applicable for NNI on Windows.
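Combining these fields, a localConfig section might look like this (the values are illustrative):

```yaml
localConfig:
  gpuIndices: '0,1'      # restrict NNI trials to GPU 0 and 1
  maxTrialNumPerGpu: 2   # at most 2 concurrent trials per GPU
  useActiveGpu: false
```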
__storage__ specify the storage type of kubeflow, including {__nfs__, __azureStorage__}. This field is optional, and the default value is __nfs__. If the config use azureStorage, this field must be completed.
### machineList
* __nfs__
Required in remote mode. A list of key-value pairs with the following keys.
__server__ is the host of nfs server
#### ip
__path__ is the mounted path of nfs
Required. IP address that is accessible from the current machine.
* __keyVault__
The IP address of remote machine.
If users want to use azure kubernetes service, they should set keyVault to storage the private key of your azure storage account. Refer: https://docs.microsoft.com/en-us/azure/key-vault/key-vault-manage-with-cli2
#### port
* __vaultName__
Optional. Integer. Valid port. Default: 22.
__vaultName__ is the value of `--vault-name` used in az command.
The ssh port to be used to connect machine.
* __name__
#### username
__name__ is the value of `--name` used in az command.
Required if authentication with username/password. String.
* __azureStorage__
The account of remote machine.
If users use azure kubernetes service, they should set azure storage account to store code files.
#### passwd
* __accountName__
Required if authentication with username/password. String.
__accountName__ is the name of azure storage account.
Specifies the password of the account.
* __azureShare__
#### sshKeyPath
__azureShare__ is the share of the azure file storage.
Required if authentication with ssh key. Path to private key file.
* __uploadRetryCount__
If users use ssh key to login remote machine, __sshKeyPath__ should be a valid path to a ssh key file.
If upload files to azure storage failed, NNI will retry the process of uploading, this field will specify the number of attempts to re-upload files.
*Note: if users set passwd and sshKeyPath simultaneously, NNI will try passwd first.*
* __paiConfig__
#### passphrase
* __userName__
Optional. String.
__userName__ is the user name of your pai account.
Used to protect ssh key, which could be empty if users don't have passphrase.
* __password__
#### gpuIndices
__password__ is the password of the pai account.
Optional. String. Default: none.
* __host__
Used to specify designated GPU devices for NNI, if it is set, only the specified GPU devices are used for NNI trial jobs. Single or multiple GPU indices can be specified. Multiple GPU indices should be separated with comma (`,`), such as `1` or `0,1,3`. By default, all GPUs available will be used.
__host__ is the host of pai.
#### maxTrialNumPerGpu
Optional. Integer. Default: 99999.
Used to specify the max concurrency trial number on a GPU device.
#### useActiveGpu
Optional. Bool. Default: false.
Used to specify whether to use a GPU if there is another process running on it. By default, NNI uses a GPU only if there is no other active process on it. If __useActiveGpu__ is set to true, NNI will use the GPU regardless of other processes. This field is not applicable for NNI on Windows.
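Putting the fields above together, a sketch of a `machineList` entry that authenticates with an ssh key might look like the following (the IP, username, key path, and passphrase are placeholder values):

```yaml
machineList:
  - ip: 10.1.1.1
    port: 22
    username: bob
    sshKeyPath: ~/.ssh/id_rsa
    passphrase: qwert
    gpuIndices: "0,1"
    maxTrialNumPerGpu: 1
    useActiveGpu: false
```

With this entry, NNI trials dispatched to `10.1.1.1` would only run on GPUs 0 and 1, at most one trial per GPU.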
### kubeflowConfig
#### operator
Required. String. Has to be `tf-operator` or `pytorch-operator`.
Specifies which Kubeflow operator to use.
#### storage
Optional. String. Default: `nfs`.
Specifies the storage type for Kubeflow: either `nfs` or `azureStorage`.
#### nfs
Required if using nfs. Key-value pairs.
* __server__ is the host of nfs server.
* __path__ is the mounted path of nfs.
#### keyVault
Required if using azure storage. Key-value pairs.
Set __keyVault__ to store the private key of your Azure storage account. Refer to https://docs.microsoft.com/en-us/azure/key-vault/key-vault-manage-with-cli2.
* __vaultName__ is the value of `--vault-name` used in az command.
* __name__ is the value of `--name` used in az command.
#### azureStorage
Required if using azure storage. Key-value pairs.
Set azure storage account to store code files.
* __accountName__ is the name of azure storage account.
* __azureShare__ is the share of the azure file storage.
#### uploadRetryCount
Required if using azure storage. Integer between 1 and 99999.
If uploading files to Azure storage fails, NNI will retry the upload; this field specifies the number of attempts.
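As a sketch, the Azure-related fields above combine as follows (the vault, secret, account, and share names are hypothetical placeholders):

```yaml
kubeflowConfig:
  operator: tf-operator
  storage: azureStorage
  keyVault:
    vaultName: your-vault-name      # value of --vault-name in the az command
    name: your-secret-name          # value of --name in the az command
  azureStorage:
    accountName: yourstorageaccount
    azureShare: your-file-share
  uploadRetryCount: 10
```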
### paiConfig
#### userName
Required. String.
The user name of your pai account.
#### password
Required if using password authentication. String.
The password of the pai account.
#### token
Required if using token authentication. String.
Personal access token that can be retrieved from PAI portal.
#### host
Required. String.
The hostname or IP address of PAI.
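For illustration, a minimal `paiConfig` using token authentication might look like this (the user name, token, and host are placeholder values):

```yaml
paiConfig:
  userName: your_pai_user
  token: your-personal-access-token   # retrieved from the PAI portal
  host: 10.10.10.10
```

With password authentication, `token` would be replaced by a `password` field.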
## Examples
### Local mode
If users want to run trial jobs on the local machine and use annotation to generate the search space, they could use the following config:
```yaml
authorName: test
# ...
gpuNum: 0
```
You can add assessor configuration.
```yaml
authorName: test
# ...
gpuNum: 0
```
Or you could specify your own tuner and assessor files as follows:
```yaml
authorName: test
# ...
gpuNum: 0
```
### Remote mode
To run trial jobs on remote machines, users could specify the remote machine information in the following format:
```yaml
authorName: test
# ...
passphrase: qwert
```
### PAI mode
```yaml
authorName: test
# ...
host: 10.10.10.10
```
### Kubeflow mode
Kubeflow with NFS storage:
```yaml
# ...
path: /var/nfs/general
```
### Kubeflow with azure storage
```yaml
authorName: default
# ...
```
### Could not open webUI link
Failure to open the WebUI may have the following causes:
* `http://127.0.0.1`, `http://172.17.0.1` and `http://10.0.0.15` refer to localhost. If you start your experiment on a server or remote machine, replace the IP with your server's IP to view the WebUI, like `http://[your_server_ip]:8080`
* If you still can't see the WebUI after you use the server IP, you can check the proxy and the firewall of your machine. Or use the browser on the machine where you start your NNI experiment.
* Another reason may be that your experiment failed, so NNI could not get the experiment information. You can check the NNIManager log in the following location: `~/nni/experiment/[your_experiment_id]/log/nnimanager.log`
### Restful server start failed
Probably it's a problem with your network config. Here is a checklist.
* You might need to link `127.0.0.1` with `localhost`. Add a line `127.0.0.1 localhost` to `/etc/hosts`.
* It's also possible that you have set some proxy config. Check your environment for variables like `HTTP_PROXY` or `HTTPS_PROXY` and unset if they are set.
### NNI on Windows problems
Please refer to [NNI on Windows](NniOnWindows.md).
# Installation of NNI
Currently we support installation on Linux, Mac and Windows.
## **Installation on Linux & Mac**
# NNI on Windows (experimental feature)
Running NNI on Windows is an experimental feature. Windows 10.1809 is well tested and recommended.
## **Installation on Windows**
### Not supported tuner on Windows
SMAC is not supported currently; see this [GitHub issue](https://github.com/automl/SMAC3/issues/483) for the specific reason.
### Use a Windows server as a remote worker
Currently you can't.
Note:
* If there is any error like `Segmentation fault`, please refer to [FAQ](FAQ.md).