* We support Linux, MacOS and Windows (local, remote and pai mode) in current stage, Ubuntu 16.04 or higher, MacOS 10.14.1 along with Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
NNI supports and is tested on Ubuntu >= 16.04, macOS >= 10.14.1, and Windows 10 >= 1809. Simply run the following `pip install` in an environment that has `python 64-bit >= 3.5`.
Linux and MacOS
Linux or macOS
```bash
python3 -m pip install--upgrade nni
...
...
@@ -216,65 +216,39 @@ Windows
python -m pip install--upgrade nni
```
Note:
*`--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* Currently NNI on Windows support local, remote and pai mode. Anaconda or Miniconda is highly recommended to install NNI on Windows.
* If there is any error like `Segmentation fault`, please refer to [FAQ](docs/en_US/Tutorial/FAQ.md)
**Install through source code**
* We support Linux (Ubuntu 16.04 or higher), MacOS (10.14.1) and Windows (10.1809) in our current stage.
Linux and MacOS
* Run the following commands in an environment that has `python >= 3.5`, `git` and `wget`.
For detail system requirements of NNI, please refer to [here](docs/en_US/Tutorial/Installation.md#system-requirements).
For the system requirements of NNI, please refer to [Install NNI](docs/en_US/Tutorial/Installation.md)
Note:
For NNI on Windows, please refer to [NNI on Windows](docs/en_US/Tutorial/NniOnWindows.md)
* If there is any privilege issue, add `--user` to install NNI in the user directory.
* Currently NNI on Windows supports local, remote and pai mode. Anaconda or Miniconda is highly recommended to install NNI on Windows.
* If there is any error like `Segmentation fault`, please refer to [FAQ](docs/en_US/Tutorial/FAQ.md). For FAQ on Windows, please refer to [NNI on Windows](docs/en_US/Tutorial/NniOnWindows.md).
**Verify install**
### **Verify installation**
The following example is an experiment built on TensorFlow. Make sure you have **TensorFlow 1.x installed** before running it. Note that **currently Tensorflow 2.0 is NOT supported**.
The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is used** when running it.
* Download the examples via clone the source code.
* Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the `Web UI url`.
...
...
@@ -371,4 +345,3 @@ We encourage researchers and students leverage these projects to accelerate the
## **License**
The entire codebase is under [MIT license](LICENSE)
NNI supports running an experiment on multiple machines through SSH channel, called `remote` mode. NNI assumes that you have access to those machines, and already setup the environment for running deep learning training code.
NNI can run one experiment on multiple remote machines through SSH, called `remote` mode. It's like a lightweight training platform. In this mode, NNI can be started from your computer, and dispatch trials to remote machines in parallel.
e.g. Three machines and you login in with account `bob` (Note: the account is not necessarily the same on different machine):
## Remote machine requirements
| IP | Username| Password |
| -------- |---------|-------|
| 10.1.1.1 | bob | bob123 |
| 10.1.1.2 | bob | bob123 |
| 10.1.1.3 | bob | bob123 |
* It only supports Linux as remote machines, and [linux part in system specification](../Tutorial/Installation.md) is same as NNI local mode.
## Setup NNI environment
* Follow [installation](../Tutorial/Installation.md) to install NNI on each machine.
Install NNI on each of your machines following the install guide [here](../Tutorial/QuickStart.md).
* Make sure remote machines meet environment requirements of your trial code. If the default environment does not meet the requirements, the setup script can be added into `command` field of NNI config.
* Make sure remote machines can be accessed through SSH from the machine which runs `nnictl` command. It supports both password and key authentication of SSH. For advanced usages, please refer to [machineList part of configuration](../Tutorial/ExperimentConfig.md).
* Make sure the NNI version on each machine is consistent.
## Run an experiment
Install NNI on another machine which has network accessibility to those three machines above, or you can just run `nnictl` on any one of the three to launch the experiment.
e.g. there are three machines, which can be logged in with username and password.
| IP | Username | Password |
| -------- | -------- | -------- |
| 10.1.1.1 | bob | bob123 |
| 10.1.1.2 | bob | bob123 |
| 10.1.1.3 | bob | bob123 |
Install and run NNI on one of those three machines or another machine, which has network access to them.
We use `examples/trials/mnist-annotation` as an example here. Shown here is`examples/trials/mnist-annotation/config_remote.yml`:
Use `examples/trials/mnist-annotation` as the example. Below is content of`examples/trials/mnist-annotation/config_remote.yml`:
```yaml
authorName:default
...
...
@@ -58,14 +66,8 @@ machineList:
passwd:bob123
```
Files in `codeDir` will be automatically uploaded to the remote machine. You can run NNI on different operating systems (Windows, Linux, MacOS) to spawn experiments on the remote machines (only Linux allowed):
Files in `codeDir` will be uploaded to remote machines automatically. You can run below command on Windows, Linux, or macOS to spawn trials on remote Linux machines:
You can also use public/private key pairs instead of username/password for authentication. For advanced usages, please refer to [Experiment Config Reference](../Tutorial/ExperimentConfig.md).
## Version check
NNI support version check feature in since version 0.6, [reference](PaiMode.md).
@@ -4,10 +4,11 @@ NNI TrainingService provides the training platform for running NNI trial jobs. N
NNI not only provides few built-in training service options, but also provides a method for customers to build their own training service easily.
## Built-in TrainingService
|TrainingService|Brief Introduction|
|---|---|
|[__Local__](./LocalMode.md)|NNI supports running an experiment on local machine, called local mode. Local mode means that NNI will run the trial jobs and nniManager process in same machine, and support gpu schedule function for trial jobs.|
|[__Remote__](./RemoteMachineMode.md)|NNI supports running an experiment on multiple machines through SSH channel, called remote mode. NNI assumes that you have access to those machines, and already setup the environment for running deep learning training code. NNI will submit the trial jobs in remote machine, and schedule suitable machine with enouth gpu resource if specified.|
|[__Remote__](./RemoteMachineMode.md)|NNI supports running an experiment on multiple machines through SSH channel, called remote mode. NNI assumes that you have access to those machines, and already setup the environment for running deep learning training code. NNI will submit the trial jobs in remote machine, and schedule suitable machine with enough gpu resource if specified.|
|[__Pai__](./PaiMode.md)|NNI supports running an experiment on [OpenPAI](https://github.com/Microsoft/pai)(aka pai), called pai mode. Before starting to use NNI pai mode, you should have an account to access an [OpenPAI](https://github.com/Microsoft/pai) cluster. See [here](https://github.com/Microsoft/pai#how-to-deploy) if you don't have any OpenPAI account and want to deploy an OpenPAI cluster. In pai mode, your trial program will run in pai's container created by Docker.|
|[__Kubeflow__](./KubeflowMode.md)|NNI supports running experiment on [Kubeflow](https://github.com/kubeflow/kubeflow), called kubeflow mode. Before starting to use NNI kubeflow mode, you should have a Kubernetes cluster, either on-premises or [Azure Kubernetes Service(AKS)](https://azure.microsoft.com/en-us/services/kubernetes-service/), a Ubuntu machine on which [kubeconfig](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/) is setup to connect to your Kubernetes cluster. If you are not familiar with Kubernetes, [here](https://kubernetes.io/docs/tutorials/kubernetes-basics/) is a good start. In kubeflow mode, your trial program will run as Kubeflow job in Kubernetes cluster.|
|[__FrameworkController__](./FrameworkControllerMode.md)|NNI supports running experiment using [FrameworkController](https://github.com/Microsoft/frameworkcontroller), called frameworkcontroller mode. FrameworkController is built to orchestrate all kinds of applications on Kubernetes, you don't need to install Kubeflow for specific deep learning framework like tf-operator or pytorch-operator. Now you can use FrameworkController as the training service to run NNI experiment.|
...
...
@@ -16,7 +17,8 @@ NNI not only provides few built-in training service options, but also provides a
TrainingService is designed to be easily implemented, we define an abstract class TrainingService as the parent class of all kinds of TrainingService, users just need to inherit the parent class and complete their own child class if they want to implement customized TrainingService.
The abstract function in TrainingService is shown below:
@@ -32,5 +34,6 @@ abstract class TrainingService {
publicabstractrun():Promise<void>;
}
```
The parent class of TrainingService has a few abstract functions, users need to inherit the parent class and implement all of these abstract functions.
For more information about how to write your own TrainingService, please [refer](https://github.com/microsoft/nni/blob/master/docs/en_US/TrainingService/HowToImplementTrainingService.md).
@@ -22,25 +24,27 @@ Currently we support installation on Linux, Mac and Windows.
./install.sh
```
*__Install NNI in docker image__
*Use NNI in a docker image
You can also install NNI in a docker image. Please follow the instructions [here](https://github.com/Microsoft/nni/tree/master/deployment/docker/README.md) to build NNI docker image. The NNI docker image can also be retrieved from Docker Hub through the command `docker pull msranni/nni:latest`.
## **Installation on Windows**
## Install on Windows
Anaconda or Miniconda is highly recommended.
Anaconda or Miniconda is highly recommended to manage multiple Python environments.
*__Install NNI through pip__
* Install NNI through pip
Prerequisite: `python(64-bit) >= 3.5`
Prerequisites: `python64-bit >= 3.5`
```bash
python -m pip install--upgrade nni
```
* __Install NNI through source code__
* Install NNI through source code
If you are interested on special or latest code version, you can install NNI through source code.
Below are the minimum system requirements for NNI on Linux. Due to potential programming changes, the minimum system requirements for NNI may change over time.
|**Operating System**|Ubuntu 16.04 or above|Ubuntu 16.04 or above|
|**CPU**|Intel® Core™ i3 or AMD Phenom™ X3 8650|Intel® Core™ i5 or AMD Phenom™ II X3 or better|
|**GPU**|NVIDIA® GeForce® GTX 460|NVIDIA® GeForce® GTX 660 or better|
|**Memory**|4 GB RAM|6 GB RAM|
|**Storage**|30 GB available hare drive space|
|**Internet**|Boardband internet connection|
|**Resolution**|1024 x 768 minimum display resolution|
Below are the minimum system requirements for NNI on macOS. Due to potential programming changes, the minimum system requirements for NNI may change over time.
|**CPU**|Intel® Core™ i5-760 or better|Intel® Core™ i7-4770 or better|
|**GPU**|NVIDIA® GeForce® GT 750M or AMD Radeon™ R9 M290 or better|AMD Radeon™ R9 M395X or better|
|**Memory**|4 GB RAM|8 GB RAM|
|**Storage**|70GB available space 7200 RPM HDD|70GB available space SSD|
|**Internet**|Boardband internet connection|
|**Resolution**|1024 x 768 minimum display resolution|
Below are the minimum system requirements for NNI on Windows, Windows 10.1809 is well tested and recommend. Due to potential programming changes, the minimum system requirements for NNI may change over time.
* Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the `Web UI url`.
* Open the `Web UI url` in your browser, you can view detail information of the experiment and all the submitted trial jobs as shown below. [Here](../Tutorial/WebUI.md) are more Web UI pages.
We support Linux MacOS and Windows in current stage, Ubuntu 16.04 or higher, MacOS 10.14.1 and Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
#### Linux and MacOS
We support Linux macOS and Windows in current stage, Ubuntu 16.04 or higher, macOS 10.14.1 and Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
**Linux and macOS**
```bash
python3 -m pip install--upgrade nni
```
#### Windows
**Windows**
```bash
python -m pip install--upgrade nni
...
...
@@ -17,7 +18,7 @@ We support Linux MacOS and Windows in current stage, Ubuntu 16.04 or higher, Mac
Note:
* For Linux and MacOS `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* For Linux and macOS `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* If there is any error like `Segmentation fault`, please refer to [FAQ](FAQ.md)
* For the `system requirements` of NNI, please refer to [Install NNI](Installation.md)
...
...
@@ -53,7 +54,7 @@ The above code can only try one set of parameters at a time, if we want to tune
NNI is born for helping user do the tuning jobs, the NNI working process is presented below:
```
```text
input: search space, trial code, config file
output: one optimal hyperparameter configuration
...
...
@@ -68,7 +69,7 @@ output: one optimal hyperparameter configuration
If you want to use NNI to automatically train your model and find the optimal hyper-parameters, you need to do three changes base on your code:
**Three things required to do when using NNI**
**Three steps to start an experiment**
**Step 1**: Give a `Search Space` file in JSON, includes the `name` and the `distribution` (discrete valued or continuous valued) of all the hyperparameters you need to search.
...
...
@@ -138,22 +139,25 @@ Note, **for Windows, you need to change trial command `python3` to `python`**
All the codes above are already prepared and stored in [examples/trials/mnist-tfv1/](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1).
#### Linux and MacOS
**Linux and macOS**
Run the **config.yml** file from your command line to start MNIST experiment.
Run the **config_windows.yml** file from your command line to start MNIST experiment.
**Note**, if you're using NNI on Windows, it needs to change `python3` to `python` in the config.yml file, or use the config_windows.yml file to start the experiment.
Note, if you're using NNI on Windows, it needs to change `python3` to `python` in the config.yml file, or use the config_windows.yml file to start the experiment.
Note, **nnictl** is a command line tool, which can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc. Click [here](Nnictl.md) for more usage of `nnictl`
Note, `nnictl` is a command line tool, which can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc. Click [here](Nnictl.md) for more usage of `nnictl`
Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. And this is what we expected to get:
...
...
@@ -195,7 +199,7 @@ The Web UI urls are: [Your IP]:8080
Open the `Web UI url`(In this information is: `[Your IP]:8080`) in your browser, you can view detail information of the experiment and all the submitted trial jobs as shown below. If you can not open the WebUI link in your terminal, you can refer to [FAQ](FAQ.md).
#### View summary page
### View summary page
Click the tab "Overview".
...
...
@@ -207,7 +211,7 @@ Top 10 trials will be listed in the Overview page, you can browse all the trials

#### View trials detail page
### View trials detail page
Click the tab "Default Metric" to see the point graph of all trials. Hover to see its specific default metric and search space message.