@@ -18,7 +18,7 @@ NNI (Neural Network Intelligence) is a toolkit to help users run automated machi
...
The tool dispatches and runs trial jobs generated by tuning algorithms to search for the best neural architecture and/or hyper-parameters in different environments such as a local machine, remote servers, and the cloud.
### **NNI v1.2 has been released! <a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
`Baseline` means no feature selection: the data is passed directly to LogisticRegression. For this benchmark, we use only 10% of the training data as test data. For the GradientFeatureSelector, we keep only the top 20 features. The metric is the mean accuracy on the given test data and labels.
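For illustration, a minimal sketch of this benchmark protocol is shown below. The dataset is only an example, and the commented-out selector usage is a hypothetical illustration of the NNI feature-engineering API (the exact class name and arguments may differ in your NNI version); the baseline follows the description above.

```python
# A sketch of the benchmark protocol above, not the exact benchmark script.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
# Hold out 10% of the data as the test set, as in the benchmark.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

# Baseline: no feature selection, the data goes directly to LogisticRegression.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("baseline mean accuracy:", baseline.score(X_test, y_test))

# GradientFeatureSelector: keep only the top-20 features (hypothetical usage,
# adjust the import path and class name to the NNI version you use):
# from nni.feature_engineering.gradient_selector import FeatureGradientSelector
# selector = FeatureGradientSelector(n_features=20)
# selector.fit(X_train, y_train)
# cols = selector.get_selected_features()
# model = LogisticRegression(max_iter=1000).fit(X_train[:, cols], y_train)
# print("selector mean accuracy:", model.score(X_test[:, cols], y_test))
```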
The paper [DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055) addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Their method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent.
In their implementation, the authors optimize the network weights and the architecture weights alternately on mini-batches. They further explore using second-order optimization (unrolling) instead of first-order optimization to improve performance.
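A runnable toy sketch of the first-order alternating update is shown below. Everything here (the two-operation "super-net", the random data, the optimizers) is a placeholder to make the idea concrete; it is not the NNI trainer API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy super-net: one layer that mixes two candidate ops with softmax weights
# over architecture parameters alpha (conceptual first-order DARTS sketch).
class ToySupernet(nn.Module):
    def __init__(self):
        super().__init__()
        self.op1 = nn.Linear(8, 2)
        self.op2 = nn.Linear(8, 2)
        self.alpha = nn.Parameter(torch.zeros(2))  # architecture weights

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)
        return w[0] * self.op1(x) + w[1] * self.op2(x)

supernet = ToySupernet()
criterion = nn.CrossEntropyLoss()
weight_optimizer = torch.optim.SGD(
    [p for n, p in supernet.named_parameters() if n != "alpha"], lr=0.05)
arch_optimizer = torch.optim.Adam([supernet.alpha], lr=3e-4)

def batch():
    # Random data standing in for train / validation mini-batches.
    return torch.randn(16, 8), torch.randint(0, 2, (16,))

for step in range(100):
    # 1) update the architecture weights (alpha) on a validation batch
    x_val, y_val = batch()
    arch_optimizer.zero_grad()
    criterion(supernet(x_val), y_val).backward()
    arch_optimizer.step()

    # 2) update the network weights (w) on a training batch
    x_train, y_train = batch()
    weight_optimizer.zero_grad()
    criterion(supernet(x_train), y_train).backward()
    weight_optimizer.step()
```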
The implementation on NNI is based on the [official implementation](https://github.com/quark0/darts) and a [popular third-party repo](https://github.com/khanrc/pt.darts). So far, first- and second-order optimization and training from scratch on CIFAR10 have been implemented.
## Reproduce Results
To reproduce the results in the paper, we ran experiments with first- and second-order optimization. Due to time limits, we retrain *only the best architecture* derived from the search phase and repeat the experiment *only once*. Our results are currently on par with the results reported in the paper. We will add more results later when they are ready.
The paper [Efficient Neural Architecture Search via Parameter Sharing](https://arxiv.org/abs/1802.03268) uses parameter sharing between child models to accelerate the NAS process. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set. Meanwhile, the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss.
The implementation on NNI is based on the [official implementation in Tensorflow](https://github.com/melodyguan/enas), including the macro and micro search spaces on CIFAR10. Since the code to train from scratch on NNI is not ready yet, reproduction results are currently unavailable.
`InputChoice` is a PyTorch module. In `__init__`, it needs meta information, for example, how many inputs to choose out of how many input candidates, and the key (name) of this `InputChoice`. The real candidate input tensors can only be obtained in the `forward` function, where the `InputChoice` module created in `__init__` (e.g., `self.input_switch`) is called with the real candidate input tensors.
Some [NAS trainers](#one-shot-training-mode) need to know the source layers of the input tensors; thus, we add an input argument `choose_from` to `InputChoice` to indicate the source layer of each candidate input. `choose_from` is a list of strings, where each element is the `key` of a `LayerChoice` or `InputChoice`, or the name of a module (refer to [the code](https://github.com/microsoft/nni/blob/master/src/sdk/pynni/nni/nas/pytorch/mutables.py) for more details).
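As a minimal sketch of the pattern described above (the module name `self.input_switch`, the key `'skip_connect'`, and the layer sizes are illustrative):

```python
import torch.nn as nn
from nni.nas.pytorch import mutables

class Cell(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(16, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 16, 5, padding=2)
        self.conv3 = nn.Conv2d(16, 16, 1)
        # Meta information only: choose 1 input out of 3 candidates;
        # the key names this InputChoice.
        self.input_switch = mutables.InputChoice(n_candidates=3, n_chosen=1, key='skip_connect')

    def forward(self, x):
        # The real candidate tensors are only available here, so the
        # InputChoice instance is called in forward.
        out1, out2, out3 = self.conv1(x), self.conv2(x), self.conv3(x)
        return self.input_switch([out1, out2, out3])
```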
...
@@ -102,8 +102,6 @@ Different trainers could have different input arguments depending on their algor
The supported trainers can be found [here](./Overview.md#supported-one-shot-nas-algorithms). A very simple example using the NNI NAS API can be found [here](https://github.com/microsoft/nni/tree/master/examples/nas/simple/train.py).
The complete example code can be found [here]().
### Classic distributed search
Neural architecture search was originally executed by running each child model independently as a trial job. We also support this search approach; it naturally fits into the NNI hyper-parameter tuning framework, where the tuner generates a child model for the next trial and trials run in the training service.
@@ -6,11 +6,11 @@ However, it takes great efforts to implement NAS algorithms, and it is hard to r
...
With this motivation, our ambition is to provide a unified architecture in NNI to accelerate innovation on NAS and apply state-of-the-art algorithms to real-world problems faster.
With [the unified interface](./NasInterface.md), there are two different modes for architecture search. [One](#supported-one-shot-nas-algorithms) is the so-called one-shot NAS, where a super-net is built based on the search space and one-shot training is used to generate a good-performing child model. [The other](./NasInterface.md#classic-distributed-search) is the traditional search approach, where each child model in the search space runs as an independent trial; the performance result is sent to the tuner, and the tuner generates new child models.
* [Supported One-shot NAS Algorithms](#supported-one-shot-nas-algorithms)
* [Classic Distributed NAS with NNI experiment](./NasInterface.md#classic-distributed-search)
* [NNI NAS Programming Interface](./NasInterface.md)
## Supported One-shot NAS Algorithms
...
@@ -37,7 +37,7 @@ Note, these algorithms run **standalone without nnictl**, and supports PyTorch o
#### Usage
ENAS in NNI is still under development; we only support the search phase for the macro/micro search space on CIFAR10. Training from scratch and the search space on PTB have not been finished yet. [Detailed Description](ENAS.md)
```bash
# In case NNI code is not cloned. If the code is cloned already, ignore this line and enter code folder.
...
@@ -58,7 +58,7 @@ python3 search.py -h
### DARTS
The main algorithmic contribution of [DARTS: Differentiable Architecture Search][3] is a novel method for differentiable network architecture search based on bilevel optimization. [Detailed Description](DARTS.md)
@@ -46,6 +46,33 @@ For each experiment, user only needs to define a search space and update a few l
...
For more details about how to run an experiment, please refer to [Get Started](Tutorial/QuickStart.md).
## Core Features
NNI provides the key capability of running multiple instances in parallel to find the best combinations of parameters. This feature can be used in various domains, such as finding the best hyperparameters for a deep learning model, or finding the best configuration for databases and other complex systems with real data.
NNI also aims to provide algorithm toolkits for machine learning and deep learning, especially neural architecture search (NAS) algorithms, model compression algorithms, and feature engineering algorithms.
### Hyperparameter Tuning
This is a core and basic feature of NNI. We provide many popular [automatic tuning algorithms](Tuner/BuiltinTuner.md) (i.e., tuners) and [early stop algorithms](Assessor/BuiltinAssessor.md) (i.e., assessors). You can follow the [Quick Start](Tutorial/QuickStart.md) to tune your model (or system): basically, complete the three steps above and then start an NNI experiment.
### General NAS Framework
This NAS framework lets users easily specify candidate neural architectures; for example, one can specify multiple candidate operations (e.g., separable conv, dilated conv) for a single layer and specify possible skip connections, as shown in the sketch below. NNI will find the best candidate automatically. On the other hand, the NAS framework provides a simple interface for another type of user (e.g., NAS algorithm researchers) to implement new NAS algorithms. Detailed description and usage can be found [here](NAS/Overview.md).
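A minimal sketch of specifying candidate operations for a single layer with the NNI mutables API (the layer sizes and the key name are illustrative):

```python
import torch.nn as nn
from nni.nas.pytorch import mutables

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # NNI decides which candidate operation this layer finally uses.
        self.conv = mutables.LayerChoice([
            nn.Conv2d(3, 16, 3, padding=1),               # ordinary conv
            nn.Conv2d(3, 16, 3, padding=2, dilation=2),   # dilated conv
            nn.Sequential(                                # depthwise-separable conv
                nn.Conv2d(3, 3, 3, padding=1, groups=3),
                nn.Conv2d(3, 16, 1),
            ),
        ], key='layer1_op')
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, 10)

    def forward(self, x):
        x = self.pool(self.conv(x))
        return self.fc(x.view(x.size(0), -1))
```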
NNI supports many one-shot NAS algorithms, such as ENAS and DARTS, through the NNI trial SDK. To use these algorithms you do not have to start an NNI experiment; instead, import an algorithm in your trial code and simply run the trial code. If you want to tune the hyperparameters of the algorithms or run multiple instances, you can choose a tuner and start an NNI experiment.
Besides one-shot NAS, NAS can also run in a classic mode where each candidate architecture runs as an independent trial job. In this mode, similar to hyperparameter tuning, users have to start an NNI experiment and choose a tuner for NAS.
### Model Compression
Model compression on NNI includes pruning algorithms and quantization algorithms. These algorithms are provided through the NNI trial SDK. Users can directly use them in their trial code and run the trial code without starting an NNI experiment, as in the sketch below. Detailed description and usage can be found [here](Compressor/Overview.md).
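For example, a pruner can be applied directly in ordinary trial code along these lines. This is a hedged sketch: the `LevelPruner` import path and call pattern follow the compression docs of this release, but please double-check them against your installed NNI version.

```python
# A minimal sketch of applying a pruner inside plain trial code
# (no NNI experiment needed). Verify the API against your NNI version.
import torch.nn as nn
from nni.compression.torch import LevelPruner

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Prune 80% of the weights for the default op types.
config_list = [{'sparsity': 0.8, 'op_types': ['default']}]
pruner = LevelPruner(model, config_list)
model = pruner.compress()

# ... continue with the usual training / fine-tuning loop on `model` ...
```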
There are different types of hyperparameters in model compression. One type is the hyperparameters in the input configuration of a compression algorithm, e.g., sparsity or quantization bits. The other type is the hyperparameters of the compression algorithms themselves. Here, hyperparameter tuning with NNI can help a lot in finding the best compressed model automatically. A simple example can be found [here](Compressor/AutoCompression.md).
### Automatic Feature Engineering
Automatic feature engineering helps users find the best features for their downstream tasks. Detailed description and usage can be found [here](FeatureEngineering/Overview.md). It is supported through the NNI trial SDK, which means you do not have to create an NNI experiment; instead, simply import a built-in auto-feature-engineering algorithm in your trial code and run the trial code directly.
The auto-feature-engineering algorithms usually have a number of hyperparameters themselves. If you want to automatically tune those hyperparameters, you can leverage the hyperparameter tuning of NNI: choose a tuning algorithm (i.e., a tuner) and start an NNI experiment for it.
## Learn More
* [Get started](Tutorial/QuickStart.md)
* [How to adapt your trial code on NNI?](TrialExample/Trials.md)
...
@@ -57,3 +84,6 @@ More details about how to run an experiment, please refer to [Get Started](Tutor
* [How to run an experiment on multiple machines?](TrainingService/RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](TrainingService/PaiMode.md)
* [Examples](TrialExample/MnistExamples.md)
* [Neural Architecture Search on NNI](NAS/Overview.md)
* [Automatic model compression on NNI](Compressor/Overview.md)
* [Automatic feature engineering on NNI](FeatureEngineering/Overview.md)
- [New model quantization algorithms](https://github.com/microsoft/nni/blob/v1.2/docs/en_US/Compressor/Overview.md): QAT quantizer, DoReFa quantizer
- Support the API for exporting compressed model.
* Training Service
- Support OpenPAI token authentication
* Examples:
- [An example to automatically tune RocksDB configuration with NNI](https://github.com/microsoft/nni/tree/v1.2/examples/trials/systems/rocksdb-fillrandom).
- [A new MNIST trial example that supports TensorFlow 2.0](https://github.com/microsoft/nni/tree/v1.2/examples/trials/mnist-tfv2).
* Engineering Improvements
- For the remote training service, trial jobs that require no GPU are now scheduled with a round-robin policy instead of randomly.
- Pylint rules added to check pull requests; new pull requests need to comply with these [pylint rules](https://github.com/microsoft/nni/blob/v1.2/pylintrc).
* Web Portal & User Experience
- Support users adding customized trials.
- Users can zoom in/out in detail graphs, except the hyper-parameter graph.
* Documentation
- Improved NNI API documentation with more API docstrings.
### Bug fix
- Fix the table sort issue when failed trials have no metrics. - Issue #1773
- Maintain selected status (Maximal/Minimal) when the page is switched. - PR #1710
- Make the hyper-parameter graph's default metric yAxis more accurate. - PR #1736
- Fix GPU script permission issue. - Issue #1665
## Release 1.1 - 10/23/2019
### Major Features
* New tuner: [PPO Tuner](https://github.com/microsoft/nni/blob/v1.1/docs/en_US/Tuner/PPOTuner.md)
* Tuners can now use dedicated GPU resource (see `gpuIndices` in [tutorial](https://github.com/microsoft/nni/blob/v1.1/docs/en_US/Tutorial/ExperimentConfig.md) for details)
* Web UI improvements
- Trials detail page can now list hyperparameters of each trial, as well as their start and end time (via "add column")
- [Cifar10 NAS example](https://github.com/microsoft/nni/blob/v1.1/examples/trials/nas_cifar10/README.md)
* [Model compression toolkit - Alpha release](https://github.com/microsoft/nni/blob/v1.1/docs/en_US/Compressor/Overview.md): We are glad to announce the alpha release of the model compression toolkit on top of NNI. It's still in the experimental phase and might evolve based on usage feedback. We'd like to invite you to use it, give feedback, and even contribute.
### Fixed Bugs
* Multiphase job hangs when search space is exhausted (issue #1204)
* `nnictl` fails when log is not available (issue #1548)
## Release 1.0 - 9/2/2019
### Major Features
* Tuners and Assessors
- Support auto-feature generator & selection - Issue #877 - PR #1387
@@ -127,6 +127,23 @@ In the YAML configure file, you need to set *useAnnotation* to true to enable NN
...
useAnnotation: true
```
## Standalone mode for debug
NNI supports a standalone mode in which trial code can run without starting an NNI experiment. This makes it more convenient to find bugs in trial code. NNI annotation natively supports standalone mode, since the added NNI-related lines are comments. The NNI trial APIs behave differently in standalone mode: some APIs return dummy values, and some APIs do not really report values. Please refer to the following list for the full set of these APIs.
```python
# NOTE: please assign default values to the hyperparameters in your trial code
nni.get_next_parameter()          # returns {}
nni.report_final_result()         # prints a log to stdout, but does not really report the result
nni.report_intermediate_result()  # prints a log to stdout, but does not really report the result
nni.get_experiment_id()           # returns "STANDALONE"
nni.get_trial_id()                # returns "STANDALONE"
nni.get_sequence_id()             # returns 0
```
You can try standalone mode with the [mnist example](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-tfv1). Simply run `python3 mnist.py` under the code directory. The trial code successfully runs with default hyperparameter values.
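The pattern looks roughly like the following minimal trial sketch, which runs both under NNI and in standalone mode (the hyperparameter names here are illustrative):

```python
import nni

def main(params):
    # ... build and train a model using params['lr'] and params['batch_size'] ...
    accuracy = 0.0  # placeholder for the real evaluation result
    nni.report_final_result(accuracy)

if __name__ == '__main__':
    # Default values keep the code runnable in standalone mode, because
    # nni.get_next_parameter() returns {} there and the defaults are kept.
    params = {'lr': 0.001, 'batch_size': 32}
    params.update(nni.get_next_parameter())
    main(params)
```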
For more debugging tips, please refer to [How to Debug](../Tutorial/HowToDebug.md).
@@ -51,4 +51,4 @@ Our documentation is built with [sphinx](http://sphinx-doc.org/), supporting [Ma
...
* For links, please consider using __relative paths__ first. However, if the documentation is written in Markdown format, and:
* It's an image link that needs embedded HTML formatting: please use a global URL like `https://user-images.githubusercontent.com/44491713/51381727-e3d0f780-1b4f-11e9-96ab-d26b9198ba65.png`, which can be automatically generated by dragging the picture onto the [Github Issue](https://github.com/Microsoft/nni/issues/new) box.
* It cannot be re-formatted by sphinx (e.g., source code): please use its global URL. For source code that links to our github repo, please use URLs rooted at `https://github.com/Microsoft/nni/tree/master/` ([mnist.py](https://github.com/Microsoft/nni/blob/master/examples/trials/mnist-tfv1/mnist.py) for example).
@@ -84,5 +84,6 @@ A common example of this would be run the mnist example without installing tenso
...


As it shows, every trial has a log path, where you can find trial's log and stderr.
In addition to experiment-level debugging, NNI also provides the capability to debug a single trial without starting the entire experiment. Refer to [standalone mode](../TrialExample/Trials.md#standalone-mode-for-debug) for more information about debugging single trial code.
Note: If you want to see the full implementation, please refer to [examples/trials/mnist-tfv1/mnist_before.py](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1/mnist_before.py)
The above code can only try one set of parameters at a time; if we want to tune the learning rate, we need to manually modify the hyperparameter and start the trial again and again.
...
@@ -84,7 +84,7 @@ If you want to use NNI to automatically train your model and find the optimal hy
**Step 3**: Define a `config` file in YAML, which declares the `path` to the search space and trial files, and also gives other information such as the tuning algorithm, max trial number, and max duration arguments. A sketch of such a file is shown below.
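For reference, a typical minimal config looks roughly like this (values are illustrative; the exact file shipped with NNI lives in the example directory referenced below):

```yaml
# An illustrative config sketch; see the shipped example for the exact file.
authorName: default
experimentName: example_mnist
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 10
trainingServicePlatform: local
# path to the search space file
searchSpacePath: search_space.json
useAnnotation: false
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
# the trial command and code directory
trial:
  command: python3 mnist.py
  codeDir: .
  gpuNum: 0
```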
...
@@ -134,15 +134,15 @@ trial:
Note, **for Windows, you need to change the trial command `python3` to `python`**
All the codes above are already prepared and stored in [examples/trials/mnist-tfv1/](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1).
#### Linux and MacOS
Run the **config.yml** file from your command line to start an MNIST experiment.
Run the **config_windows.yml** file from your command line to start an MNIST experiment.
...
@@ -150,7 +150,7 @@ Run the **config_windows.yml** file from your command line to start MNIST experi
**Note**: if you're using NNI on Windows, you need to change `python3` to `python` in the config.yml file, or use the config_windows.yml file to start the experiment.
Note, **nnictl** is a command line tool that can be used to control experiments, such as starting/stopping/resuming an experiment and starting/stopping NNIBoard. A few common commands are sketched below; click [here](Nnictl.md) for more usage of `nnictl`.
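```bash
# Start an experiment from a YAML config file.
nnictl create --config config.yml

# Stop the running experiment.
nnictl stop

# Resume a previously stopped experiment by its ID.
nnictl resume <experiment_id>
```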