@@ -87,81 +87,108 @@ The tool dispatches and runs trial jobs generated by tuning algorithms to search
</table>
## **Who should consider using NNI**
* Those who want to try different AutoML algorithms in their training code (model) on their local machine.
* Those who want to run AutoML trial jobs in different environments to speed up search (e.g. remote servers and cloud).
* Researchers and data scientists who want to implement their own AutoML algorithms and compare them with other algorithms.
* ML Platform owners who want to support AutoML in their platform.
## Related Projects
Aiming at openness and at advancing state-of-the-art technology, [Microsoft Research (MSR)](https://www.microsoft.com/en-us/research/group/systems-research-group-asia/) has also released a few other open source projects.
* [OpenPAI](https://github.com/Microsoft/pai): an open source platform that provides complete AI model training and resource management capabilities. It is easy to extend and supports on-premise, cloud, and hybrid environments at various scales.
* [FrameworkController](https://github.com/Microsoft/frameworkcontroller): an open source general-purpose Kubernetes Pod Controller that orchestrates all kinds of applications on Kubernetes with a single controller.
* [MMdnn](https://github.com/Microsoft/MMdnn): a comprehensive, cross-framework solution to convert, visualize, and diagnose deep neural network models. The "MM" in MMdnn stands for model management and "dnn" is an acronym for deep neural network.
We encourage researchers and students to leverage these projects to accelerate AI development and research.
## **Install & Verify**
If you choose NNI Windows local mode and are running a PowerShell script for the first time, you need to **run PowerShell as administrator** with this command first:
```bash
Set-ExecutionPolicy -ExecutionPolicy Unrestricted
```
**Install through pip**
* We currently support Linux, MacOS, and Windows (local mode). Ubuntu 16.04 or higher, MacOS 10.14.1, and Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
Linux and MacOS
```bash
python3 -m pip install --upgrade nni
```
Windows
```bash
python -m pip install --upgrade nni
```
Note:
* `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* Currently NNI on Windows only supports local mode. Anaconda is highly recommended for installing NNI on Windows.
* If there is any error like `Segmentation fault`, please refer to the [FAQ](docs/en_US/FAQ.md).
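
Before and after running the `pip install` command, it can help to confirm that the interpreter satisfies `python >= 3.5` and that the package can be located. The following is a minimal sketch using only the standard library; the helper names `meets_minimum_python` and `is_installed` are illustrative and are not part of NNI.

```python
import importlib.util
import sys


def meets_minimum_python(version_info=None, minimum=(3, 5)):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python >= 3.5:", meets_minimum_python())
    print("nni can be located:", is_installed("nni"))
```

Run the script with the same interpreter (`python3` or `python`) that was used for the installation, since each interpreter has its own set of installed packages.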
**Install through source code**
* We support Linux (Ubuntu 16.04 or higher), MacOS (10.14.1) and Windows local mode (10.1809) in our current stage.
Linux and MacOS
* Run the following commands in an environment that has `python >= 3.5`, `git` and `wget`.
* Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the `Web UI url`.
```
```text
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
...
...
@@ -195,10 +222,12 @@ You can use these commands to get more information about the experiment
</table>
## **Documentation**
* [NNI overview](docs/en_US/Overview.md)
* [Quick start](docs/en_US/QuickStart.md)
## **How to**
* [Install NNI](docs/en_US/Installation.md)
* [Use command line tool nnictl](docs/en_US/NNICTLDOC.md)
* [Use NNIBoard](docs/en_US/WebUI.md)
...
...
@@ -207,7 +236,9 @@ You can use these commands to get more information about the experiment
* [How to choose tuner/search-algorithm](docs/en_US/Builtin_Tuner.md)
* [Config an experiment](docs/en_US/ExperimentConfig.md)
* [How to use annotation](docs/en_US/Trials.md#nni-python-annotation)
## **Tutorials**
* [Run an experiment on local (with multiple GPUs)?](docs/en_US/LocalMode.md)
* [Run an experiment on multiple machines?](docs/en_US/RemoteMachineMode.md)
* [Run an experiment on OpenPAI?](docs/en_US/PAIMode.md)
...
...
@@ -219,6 +250,7 @@ You can use these commands to get more information about the experiment
* [Use Genetic Algorithm to find good model architectures for Reading Comprehension task](examples/trials/ga_squad/README.md)
## **Contribute**
This project welcomes contributions and suggestions. We use [GitHub issues](https://github.com/Microsoft/nni/issues) for tracking requests and bugs.
Issues with the **good first issue** label are simple, easy-to-start ones that we recommend for new contributors.
...
...
@@ -230,4 +262,5 @@ Before start coding, review and get familiar with the NNI Code Contribution Guid
We are still writing the instructions for [How to Debug](docs/en_US/HowToDebug.md); you are welcome to contribute questions or suggestions in this area.
## **License**
The entire codebase is under the [MIT license](LICENSE).
We currently support Linux, MacOS, and Windows (local mode). Ubuntu 16.04 or higher, MacOS 10.14.1, and Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
#### Linux and MacOS
```bash
python3 -m pip install --upgrade nni
```
#### Windows
```bash
python -m pip install --upgrade nni
```
Note:
* For Linux and MacOS, `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
...
...
@@ -49,7 +53,7 @@ The above code can only try one set of parameters at a time, if we want to tune
NNI is built to help users with tuning jobs. Its working process is presented below:
```
```pseudo
input: search space, trial code, config file
output: one optimal hyperparameter configuration
...
...
@@ -66,7 +70,7 @@ If you want to use NNI to automatically train your model and find the optimal hy
**Three things required when using NNI**
**Step 1**: Provide a `Search Space` file in JSON that includes the `name` and the `distribution` (discrete-valued or continuous-valued) of every hyperparameter you need to search.
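
As a rough illustration of what such a file might contain, the sketch below builds a search space with one discrete and one continuous hyperparameter and serializes it to JSON. The hyperparameter names and ranges are hypothetical, and the `_type`/`_value` key layout is an assumption that should be checked against the search space specification before use.

```python
import json

# Hypothetical search space: each entry pairs a hyperparameter name with
# a distribution. The `_type`/`_value` layout is assumed, not confirmed here.
SEARCH_SPACE = {
    # Discrete-valued: pick one of the listed options.
    "batch_size": {"_type": "choice", "_value": [16, 32, 64]},
    # Continuous-valued: sample between a lower and an upper bound.
    "learning_rate": {"_type": "uniform", "_value": [0.0001, 0.1]},
}


def to_json(space):
    """Serialize the search space to a JSON string."""
    return json.dumps(space, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Print the JSON; redirect it to a file to use it as a search space file.
    print(to_json(SEARCH_SPACE))
```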
**Note**: if you're using Windows local mode, either change `python3` to `python` in the config.yml file, or use the config_windows.yml file to start the experiment.
Note: **nnictl** is a command-line tool that can be used to control experiments, for example to start, stop, or resume an experiment, or to start or stop NNIBoard. Click [here](NNICTLDOC.md) for more usage of `nnictl`.
Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. The expected output looks like this:
```
```text
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
...
...
@@ -185,7 +192,7 @@ If you prepare `trial`, `search space` and `config` according to the above steps
After you start your experiment in NNI successfully, you can find a message in the command-line interface that tells you the `Web UI url`, like this:
```
```text
The Web UI urls are: [Your IP]:8080
```
...
...
@@ -229,7 +236,7 @@ Below is the status of the all trials. Specifically:
* check whether the version is consistent between nniManager and trialKeeper
* [Report final metrics for early stop job](https://github.com/Microsoft/nni/issues/776)
...
...
@@ -37,6 +41,7 @@
* Add intermediate result graph for all trials
### Bug fix
* [Add shmMB config key for PAI](https://github.com/Microsoft/nni/issues/842)
* Fix the bug where no result is shown if metrics is a dict
* Fix the number calculation issue for float types in hyperband
...
...
@@ -45,7 +50,9 @@
* Fix cold start issue in Metis Tuner
## Release 0.5.2 - 3/4/2019
### Improvements
* Curve fitting assessor performance improvement.
### Documentation
...
...
@@ -61,7 +68,6 @@
* Add integration test azure pipelines for remote machine, OpenPAI and kubeflow training services.
* Support Pylon in OpenPAI webhdfs client.
## Release 0.5.1 - 1/31/2019
### Improvements
* Making [log directory](https://github.com/Microsoft/nni/blob/v0.5.1/docs/en_US/ExperimentConfig.md) configurable
...
...
@@ -75,7 +81,6 @@
* Fix the bug of HDFS access failure on OpenPAI mode after OpenPAI is upgraded.
* Fix the bug where in-place flushed stdout sometimes makes the experiment crash
## Release 0.5.0 - 01/14/2019
### Major Features
...
...
@@ -170,7 +175,7 @@
* Support running multiple experiments simultaneously.
Before v0.3, NNI only supported running a single experiment at a time. After this release, users are able to run multiple experiments simultaneously. Each experiment requires a unique port; the first experiment uses the default port, as in previous versions. You can specify a unique port for the remaining experiments as below:
For other examples, you need to change the trial command `python3` to `python` in each example YAML.
## **FAQ**
### simplejson failed when installing NNI
Make sure a C++ 14.0 compiler is installed.
>building 'simplejson._speedups' extension error: [WinError 3] The system cannot find the path specified
### Fail to run PowerShell when installing NNI from source
If you run a PowerShell script for the first time and have not set the execution policy for running scripts, you will see the error below. Try running PowerShell as administrator with this command first:
```bash
Set-ExecutionPolicy -ExecutionPolicy Unrestricted
```
>...cannot be loaded because running scripts is disabled on this system.
### Trial failed with missing DLL in cmd or PowerShell
This error is caused by missing LIBIFCOREMD.DLL and LIBMMD.DLL and a failed SciPy installation. Anaconda Python is highly recommended. If you use the official Python distribution, make sure you have one of `Visual Studio`, `MATLAB`, `MKL`, or `Intel Distribution for Python` installed on Windows before running NNI. If not, try installing one of the products above or switch to Anaconda Python (64-bit).
>ImportError: DLL load failed
### Trial failed on webUI
Please check the trial log file stderr for more details. If there is no such file and NNI is installed through pip, then you need to run PowerShell as administrator with this command first:
```bash
Set-ExecutionPolicy -ExecutionPolicy Unrestricted
```
If there is a stderr file, please check its contents. Two possible causes are as follows:
* You forgot to change the trial command `python3` to `python` in the experiment YAML.
* You forgot to install experiment dependencies such as TensorFlow, Keras, and so on.
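
For the first case, the edit can also be scripted. The sketch below is a minimal illustration, assuming the trial command sits on a single `command:` line in the experiment YAML; the function name `use_windows_python` is hypothetical and not part of NNI. It works on the text directly so that the rest of the file is left untouched.

```python
import re


def use_windows_python(yaml_text):
    """Replace `python3` with `python`, but only on `command:` lines."""

    def fix_line(match):
        prefix, command = match.group(1), match.group(2)
        # Skip names such as `python3.6`, which refer to a specific version.
        return prefix + re.sub(r"\bpython3\b(?!\.)", "python", command)

    # (?m) makes ^ and $ match at every line boundary.
    return re.sub(r"(?m)^([ \t]*command:[ \t]*)(.*)$", fix_line, yaml_text)


if __name__ == "__main__":
    example = "trial:\n  command: python3 mnist.py\n  codeDir: .\n"
    print(use_windows_python(example))
```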
### Tuner support on Windows
* SMAC is not supported
* BOHB is supported; make sure a C++ 14.0 compiler and the dependencies are installed successfully.
Typically each trial job gets a single configuration (e.g., hyperparameters) from the tuner, tries this configuration, reports the result, and then exits. But sometimes a trial job may want to request multiple configurations from the tuner. We find this a very compelling feature. For example:
1. Job launch takes tens of seconds on some training platforms. If a configuration takes only around a minute to finish, running only one configuration in a trial job would be very inefficient. An appealing alternative is for a trial job to request a configuration and finish it, then request another configuration and run it. In the extreme case, a trial job can run an unlimited number of configurations. If you set concurrency to, for example, 6, there would be 6 __long running__ jobs that keep trying different configurations.
2. Some types of models have to be trained phase by phase, where the configuration of the next phase depends on the results of the previous phase(s). For example, to find the best quantization for a model, the training procedure is often as follows: the auto-quantization algorithm (i.e., the tuner in NNI) chooses a bit width (e.g., 16 bits), and a trial job gets this configuration, trains the model for some epochs, and reports the result (e.g., accuracy). The algorithm receives this result and decides whether to change 16 bits to 8 bits or to change back to 32 bits. This process is repeated a configured number of times.
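
The second scenario can be pictured with a small, self-contained simulation. It does not call NNI: `ToyQuantizationTuner`, `fake_train`, the halve-or-double policy, and the accuracy numbers are all hypothetical stand-ins, chosen only to show one long-running trial requesting several configurations in sequence and reporting a result after each.

```python
class ToyQuantizationTuner:
    """Hypothetical stand-in for a tuner: proposes a bit width per phase."""

    def __init__(self, start_bits=16, min_accuracy=0.9):
        self.bits = start_bits
        self.min_accuracy = min_accuracy

    def next_configuration(self):
        return {"bits": self.bits}

    def receive_result(self, accuracy):
        # Made-up policy: halve the bit width while accuracy holds,
        # otherwise double it again (capped at 32 bits).
        if accuracy >= self.min_accuracy:
            self.bits = max(self.bits // 2, 1)
        else:
            self.bits = min(self.bits * 2, 32)


def fake_train(config):
    """Pretend training step; the accuracies are invented for illustration."""
    return {32: 0.95, 16: 0.94, 8: 0.91, 4: 0.80}.get(config["bits"], 0.5)


def run_trial(tuner, phases):
    """One trial job that runs several phases instead of exiting after one."""
    history = []
    for _ in range(phases):
        config = tuner.next_configuration()   # request a configuration
        accuracy = fake_train(config)         # train with it
        tuner.receive_result(accuracy)        # report the result
        history.append((config["bits"], accuracy))
    return history


if __name__ == "__main__":
    for bits, accuracy in run_trial(ToyQuantizationTuner(), phases=4):
        print(bits, accuracy)
```

In a real trial the request and report steps would go through NNI rather than through direct method calls on a tuner object.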
...
...
@@ -34,12 +34,12 @@ It is pretty simple to use multi-phase in trial code, an example is shown below:
__2. Modify experiment configuration__
To enable multi-phase, you should also add `multiPhase: true` to your experiment YAML configuration file. If this line is not added, `nni.get_next_parameter()` will always return the same configuration. For all the built-in tuners/advisors, you can use multi-phase in your trial code without modifying the tuner/advisor spec in the YAML configuration file.
### Write a tuner that leverages multi-phase:
Before writing a multi-phase tuner, we highly recommend going through [Customize Tuner](https://nni.readthedocs.io/en/latest/Customize_Tuner.html). Unlike a normal tuner, your tuner needs to inherit from `MultiPhaseTuner` (in nni.multi_phase_tuner). The key difference between `Tuner` and `MultiPhaseTuner` is that the methods in `MultiPhaseTuner` receive an additional piece of information, `trial_job_id`. With it, the tuner knows which trial is requesting a configuration and which trial is reporting results. This gives your tuner enough flexibility to deal with different trials and different phases. For example, you may want to use the `trial_job_id` parameter of the `generate_parameters` method to generate hyperparameters for a specific trial job.
Of course, to use your multi-phase tuner, __you should add `multiPhase: true` to your experiment YAML configuration file__.
[ENAS tuner](https://github.com/countif/enas_nni/blob/master/nni/examples/tuners/enas/nni_controller_ptb.py) is an example of a multi-phase tuner.
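
As a rough illustration of how `trial_job_id` might be used, the sketch below keeps per-trial state in a dictionary. To stay runnable without NNI installed, it is a plain class rather than a subclass of `MultiPhaseTuner`. The method name `generate_parameters` comes from the text above; `receive_trial_result`, the argument order, and the learning-rate halving policy are assumptions to check against the Customize Tuner guide.

```python
from collections import defaultdict


class PerTrialTunerSketch:
    """Plain-Python sketch; a real tuner would inherit from MultiPhaseTuner."""

    def __init__(self, start_lr=0.1):
        self.start_lr = start_lr
        # Maps trial_job_id -> list of (parameters, value) reported so far.
        self.history = defaultdict(list)

    def generate_parameters(self, parameter_id, trial_job_id=None):
        # The number of results already reported by this trial tells us
        # which phase it is in.
        phase = len(self.history[trial_job_id])
        # Hypothetical policy: halve the learning rate at each new phase.
        return {"learning_rate": self.start_lr / (2 ** phase)}

    def receive_trial_result(self, parameter_id, parameters, value,
                             trial_job_id=None):
        # Record the result under the trial that reported it.
        self.history[trial_job_id].append((parameters, value))


if __name__ == "__main__":
    tuner = PerTrialTunerSketch()
    first = tuner.generate_parameters(0, trial_job_id="trial-a")
    tuner.receive_trial_result(0, first, 0.8, trial_job_id="trial-a")
    print(tuner.generate_parameters(1, trial_job_id="trial-a"))
    print(tuner.generate_parameters(2, trial_job_id="trial-b"))
```

Because state is keyed by `trial_job_id`, a second trial starts from the first phase regardless of how far the other trials have progressed.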