To improve the reproducibility of NAS algorithms and reduce computing resource requirements, researchers have proposed a series of NAS benchmarks such as [NAS-Bench-101](https://arxiv.org/abs/1902.09635), [NAS-Bench-201](https://arxiv.org/abs/2001.00326), [NDS](https://arxiv.org/abs/1905.13214), etc. NNI provides a query interface for users to access these benchmarks. With just a few lines of code, researchers can evaluate their NAS algorithms easily and fairly using these benchmarks.
To improve the reproducibility of NAS algorithms and reduce computing resource requirements, researchers have proposed a series of NAS benchmarks such as [NAS-Bench-101](https://arxiv.org/abs/1902.09635), [NAS-Bench-201](https://arxiv.org/abs/2001.00326), [NDS](https://arxiv.org/abs/1905.13214), [NLP](https://arxiv.org/abs/2006.07116), etc. NNI provides a query interface for users to access these benchmarks. With just a few lines of code, researchers can evaluate their NAS algorithms easily and fairly using these benchmarks.
## Prerequisites
...
...
@@ -27,7 +29,7 @@ cd nni/examples/nas/benchmarks
```
Replace `${NNI_VERSION}` with a released version name or branch name, e.g., `v1.9`.
2. Install dependencies via `pip3 install -r xxx.requirements.txt`. `xxx` can be `nasbench101`, `nasbench201` or `nds`.
2. Install dependencies via `pip3 install -r xxx.requirements.txt`. `xxx` can be `nasbench101`, `nasbench201`, `nds` or `nlp`.
3. Generate the database via `./xxx.sh`. The directory that stores the benchmark file can be configured with the `NASBENCHMARK_DIR` environment variable, which defaults to `~/.nni/nasbenchmark`. Note that the NAS-Bench-201 dataset will be downloaded from Google Drive.
Please make sure there is at least 10GB of free disk space, and note that the conversion process can take hours to complete.
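Before running the generation script, it can help to confirm where the database will be written and whether the disk has room for it. The following is a minimal sketch, not part of NNI: the helper names `benchmark_dir` and `has_enough_space` are hypothetical, and the sketch only assumes the `NASBENCHMARK_DIR` variable, its default, and the 10GB figure stated above.

```python
import os
import shutil
from pathlib import Path

# "at least 10GB" from the step above, interpreted here as 10 GiB (an assumption).
REQUIRED_FREE_BYTES = 10 * 1024 ** 3


def benchmark_dir() -> Path:
    """Return NASBENCHMARK_DIR if it is set, else the documented default."""
    default = Path.home() / ".nni" / "nasbenchmark"
    return Path(os.environ.get("NASBENCHMARK_DIR", default)).expanduser()


def has_enough_space(path: Path, required: int = REQUIRED_FREE_BYTES) -> bool:
    """Check free space on the nearest existing ancestor of `path`."""
    probe = path
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free >= required


if __name__ == "__main__":
    target = benchmark_dir()
    print(f"benchmark directory: {target}")
    print(f"enough free space: {has_enough_space(target)}")
```

The check walks up to the nearest existing directory because the benchmark directory may not exist yet before the first run.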
...
...
@@ -109,7 +111,7 @@ _On Network Design Spaces for Visual Recognition_ released trial statistics of o
Instead of storing results obtained with different configurations in separate files, we dump them into one single database to enable comparison in multiple dimensions. Specifically, we use `model_family` to distinguish model types, `model_spec` for all hyper-parameters needed to build this model, `cell_spec` for detailed information on operators and connections if it is a NAS cell, `generator` to denote the sampling policy through which this configuration is generated. Refer to API documentation for details.
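To illustrate why a single database helps, the sketch below models trials as plain dictionaries keyed by the four fields named above and filters them along any combination of dimensions. It does not use the NNI query API; the `select` helper and all field values (`family_a`, `policy_x`, and so on) are made-up placeholders.

```python
from typing import Any, Dict, List

# Illustrative rows only: the four keys come from the text, the values are placeholders.
ROWS: List[Dict[str, Any]] = [
    {"model_family": "family_a", "model_spec": {"depth": 12, "width": 16},
     "cell_spec": None, "generator": "policy_x"},
    {"model_family": "family_a", "model_spec": {"depth": 20, "width": 32},
     "cell_spec": None, "generator": "policy_y"},
    {"model_family": "family_b", "model_spec": {"depth": 12, "width": 16},
     "cell_spec": {"op_0": "operator_1"}, "generator": "policy_x"},
]


def select(rows: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Keep the rows whose fields equal every given criterion."""
    return [row for row in rows
            if all(row.get(key) == value for key, value in criteria.items())]


if __name__ == "__main__":
    # Compare across one dimension at a time, or across several at once.
    print(len(select(ROWS, model_family="family_a")))
    print(len(select(ROWS, generator="policy_x",
                     model_spec={"depth": 12, "width": 16})))
```

Because every row carries all four fields, the same helper can slice by model type, by sampling policy, or by both, which is harder when each configuration lives in its own file.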
## Available Operators
### Available Operators
Here is a list of available operators used in NDS.
...
...
@@ -158,3 +160,22 @@ Here is a list of available operators used in NDS.
The paper "NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing" provides a search space of recurrent neural networks on text datasets and trains 14k architectures within it. It conducts both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding evaluation. There are two datasets: PTB and wikitext-2. The precomputed results (ptb_single_run + ptb_multi_run + wikitext-2) are available for use.
* **nfs**: (*Optional*) mounting external storage. For more information about using NFS, please check the paragraph below.
* **checkpoint**: (*Optional*) [storage settings](https://kubernetes.io/docs/concepts/storage/storage-classes/) for AdaptDL internal checkpoints. You can leave it unset if you are not a dev user.
* **namespace**: (*Optional*) Kubernetes namespace to launch the trials. Defaults to the `default` namespace.
* **adaptive**: (*Optional*) Boolean for the AdaptDL trainer. When `true`, the job is preemptible and adaptive.
* **image**: Docker image for the trial
* **imagePullSecret**: (*Optional*) If you are using a private registry,
you need to provide the secret to successfully pull the image.
* **codeDir**: the working directory of the container. `.` means the default working directory defined by the image.
* **command**: the bash command to start the trial
* **gpuNum**: the number of GPUs requested for this trial. It must be a non-negative integer.
* **cpuNum**: (*Optional*) the number of CPUs requested for this trial. It must be a non-negative integer.
* **memorySize**: (*Optional*) the size of memory requested for this trial. It must follow the Kubernetes
* **nfs**: (*Optional*) mounting external storage. For more information about using NFS, please check the paragraph below.
* **checkpoint**: (*Optional*) storage settings for model checkpoints.
* **storageClass**: check the [Kubernetes storage documentation](https://kubernetes.io/docs/concepts/storage/storage-classes/) for how to use the appropriate `storageClass`.
* **storageSize**: this value should be large enough to fit your model's checkpoints; otherwise it could cause a "disk quota exceeded" error.
@@ -12,7 +12,7 @@ Currently, we support the following algorithms:
|[__Random Search__](#Random)|The paper Random Search for Hyper-Parameter Optimization shows that Random Search can be surprisingly simple and effective. We suggest using Random Search as the baseline when there is no knowledge about the prior distribution of hyper-parameters. [Reference Paper](http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf)|
|[__Anneal__](#Anneal)|This simple annealing algorithm begins by sampling from the prior, but tends over time to sample from points closer and closer to the best ones observed. This algorithm is a simple variation on the random search that leverages smoothness in the response surface. The annealing rate is not adaptive.|
|[__Naïve Evolution__](#Evolution)|Naïve Evolution comes from Large-Scale Evolution of Image Classifiers. It randomly initializes a population based on the search space. For each generation, it chooses the better ones and applies some mutation (e.g., change a hyperparameter, add/remove one layer) to them to get the next generation. Naïve Evolution requires many trials to work, but it is very simple and easy to extend with new features. [Reference paper](https://arxiv.org/pdf/1703.01041.pdf)|
|[__SMAC__](#SMAC)|SMAC is based on Sequential Model-Based Optimization (SMBO). It adapts the most prominent previously used model class (Gaussian stochastic process models) and introduces the model class of random forests to SMBO, in order to handle categorical parameters. The SMAC supported by NNI is a wrapper on the SMAC3 GitHub repo. Note that SMAC needs to be installed with the `nnictl package` command. [Reference Paper](https://www.cs.ubc.ca/~hutter/papers/10-TR-SMAC.pdf), [GitHub Repo](https://github.com/automl/SMAC3)|
|[__SMAC__](#SMAC)|SMAC is based on Sequential Model-Based Optimization (SMBO). It adapts the most prominent previously used model class (Gaussian stochastic process models) and introduces the model class of random forests to SMBO, in order to handle categorical parameters. The SMAC supported by NNI is a wrapper on the SMAC3 GitHub repo. Note that SMAC needs to be installed with the `pip install nni[SMAC]` command. [Reference Paper](https://www.cs.ubc.ca/~hutter/papers/10-TR-SMAC.pdf), [GitHub Repo](https://github.com/automl/SMAC3)|
|[__Batch tuner__](#Batch)|Batch tuner allows users to simply provide several configurations (i.e., choices of hyper-parameters) for their trial code. After finishing all the configurations, the experiment is done. Batch tuner only supports the type choice in search space spec.|
|[__Grid Search__](#GridSearch)|Grid Search performs an exhaustive search through a manually specified subset of the hyperparameter space defined in the search space file. Note that the only acceptable types of search space are choice, quniform, randint. |
|[__Hyperband__](#Hyperband)|Hyperband tries to use limited resources to explore as many configurations as possible and returns the most promising ones as the final result. The basic idea is to generate many configurations and run them for a small number of trials. The least promising half of the configurations is thrown out, and the remaining ones are trained further along with a selection of new configurations. The size of these populations is sensitive to resource constraints (e.g. allotted search time). [Reference Paper](https://arxiv.org/pdf/1603.06560.pdf)|
...
...
@@ -27,7 +27,9 @@ Currently, we support the following algorithms:
Using a built-in tuner provided by the NNI SDK requires one to declare the **builtinTunerName** and **classArgs** in the `config.yml` file. In this part, we will introduce each tuner along with information about usage and suggested scenarios, classArg requirements, and an example configuration.
Note: Please follow the format when you write your `config.yml` file. Some built-in tuners need to be installed using `nnictl package`, like SMAC.
Note: Please follow the format when you write your `config.yml` file. Some built-in tuners have
dependencies that need to be installed using `pip install nni[<tuner>]`. For example, SMAC's dependencies can
be installed using `pip install nni[SMAC]`.
<a name="TPE"></a>
...
...
@@ -144,10 +146,10 @@ tuner:
**Installation**
SMAC needs to be installed with the following command before first use. As a reminder, `swig` is required for SMAC: on Ubuntu, `swig` can be installed with `apt`.
SMAC has dependencies that need to be installed with the following command before first use. As a reminder, `swig` is required for SMAC: on Ubuntu, `swig` can be installed with `apt`.
```bash
nnictl package install --name=SMAC
pip install nni[SMAC]
```
**Suggested scenario**
...
...
@@ -340,7 +342,7 @@ tuner:
BOHB advisor requires [ConfigSpace](https://github.com/automl/ConfigSpace) package. ConfigSpace can be installed using the following command.
**How to install customized algorithms as builtin tuners, assessors and advisors**
**How to register customized algorithms as builtin tuners, assessors and advisors**
===
## Overview
NNI provides a lot of [builtin tuners](../Tuner/BuiltinTuner.md), [advisors](../Tuner/HyperbandAdvisor.md) and [assessors](../Assessor/BuiltinAssessor.md) that can be used directly for Hyper Parameter Optimization, and some extra algorithms can be installed via `nnictl package install --name <name>` after NNI is installed. You can check these extra algorithms via the `nnictl package list` command.
NNI provides a lot of [builtin tuners](../Tuner/BuiltinTuner.md), [advisors](../Tuner/HyperbandAdvisor.md) and [assessors](../Assessor/BuiltinAssessor.md) that can be used directly for Hyper Parameter Optimization, and some extra algorithms can be registered via `nnictl algo register --meta <path_to_meta_file>` after NNI is installed. You can check builtin algorithms via the `nnictl algo list` command.
NNI also provides the ability to build your own customized tuners, advisors and assessors. To use the customized algorithm, users can simply follow the spec in experiment config file to properly reference the algorithm, which has been illustrated in the tutorials of [customized tuners](../Tuner/CustomizeTuner.md)/[advisors](../Tuner/CustomizeAdvisor.md)/[assessors](../Assessor/CustomizeAssessor.md).
...
...
@@ -13,8 +13,8 @@ tuner:
builtinTunerName: mytuner
```
## Install customized algorithms as builtin tuners, assessors and advisors
Follow the steps below to build a customized tuner/assessor/advisor and install it into NNI as a builtin algorithm.
## Register customized algorithms as builtin tuners, assessors and advisors
Follow the steps below to build a customized tuner/assessor/advisor and register it with NNI as a builtin algorithm.
### 1. Create a customized tuner/assessor/advisor
Refer to the following instructions to create one:
...
...
@@ -48,56 +48,43 @@ class MedianstopClassArgsValidator(ClassArgsValidator):
```
The validator is invoked before the experiment starts, to check whether the classArgs fields are valid for your customized algorithm.
### 3. Prepare package installation source
To be installed as builtin tuners, assessors and advisors, the customized algorithms need to be packaged as an installable source that the `pip` command can recognize. Under the hood, NNI calls `pip` to install the package.
Besides being a common pip source, the package needs to provide meta information in the `classifiers` field.
The format of the classifiers field is as follows:
```
NNI Package :: <type> :: <builtin name> :: <full class name of tuner> :: <full class name of class args validator>
```
* `type`: type of algorithm, one of `tuner`, `assessor`, `advisor`
* `builtin name`: builtin name used in the experiment configuration file
* `full class name of tuner`: tuner class name, including its module name, for example: `demo_tuner.DemoTuner`
* `full class name of class args validator`: class args validator class name, including its module name, for example: `demo_tuner.MyClassArgsValidator`
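The classifier format above is a `::`-separated string, so it can be split into its five parts mechanically. The sketch below shows one way to do that; it is an illustration of the format, not NNI's own parsing code, and the names `PackageMeta` and `parse_classifier` as well as the builtin name `demotuner` are hypothetical.

```python
from typing import NamedTuple

PREFIX = "NNI Package"
ALGO_TYPES = {"tuner", "assessor", "advisor"}


class PackageMeta(NamedTuple):
    algo_type: str
    builtin_name: str
    class_name: str
    validator_class_name: str


def parse_classifier(classifier: str) -> PackageMeta:
    """Split a classifier string into the four fields that follow the prefix."""
    parts = [part.strip() for part in classifier.split("::")]
    if len(parts) != 5 or parts[0] != PREFIX:
        raise ValueError(f"not an NNI package classifier: {classifier!r}")
    meta = PackageMeta(*parts[1:])
    if meta.algo_type not in ALGO_TYPES:
        raise ValueError(f"unknown algorithm type: {meta.algo_type!r}")
    return meta


if __name__ == "__main__":
    # "demotuner" is a made-up builtin name; the class names are the examples from the list above.
    print(parse_classifier(
        "NNI Package :: tuner :: demotuner :: "
        "demo_tuner.DemoTuner :: demo_tuner.MyClassArgsValidator"
    ))
```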
The following is an example of classifiers in a package's `setup.py`:
### 3. Install your customized algorithms into python environment
First, the customized algorithms need to be prepared as a Python package. Then you can install the package into your Python environment in one of two ways:
* Run `python setup.py develop` from the package directory. This installs the package in development mode, which is recommended if your algorithm is under development.
* Run `python setup.py bdist_wheel` from the package directory. This builds a whl file, which is a pip installation source. Then run `pip install <wheel file>` to install it.
Once you have the meta info in `setup.py`, you can build your pip installation source in one of two ways:
* Run `python setup.py develop` from the package directory. This builds the directory as a pip installation source.
* Run `python setup.py bdist_wheel` from the package directory. This builds a whl file, which is a pip installation source.
### 4. Prepare meta file
NNI looks for the classifier that starts with `NNI Package` to retrieve the package meta information when the package is installed with the `nnictl package install <source>` command.
Create a YAML file with the following keys as the meta file:
* `algoType`: type of algorithm, one of `tuner`, `assessor`, `advisor`
* `builtinName`: builtin name used in the experiment configuration file
* `className`: tuner class name, including its module name, for example: `demo_tuner.DemoTuner`
* `classArgsValidator`: class args validator class name, including its module name, for example: `demo_tuner.MyClassArgsValidator`
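The keys listed above can be sanity-checked before the meta file is handed to `nnictl algo register`. The following is a minimal sketch under stated assumptions: `check_meta` and `render_meta` are hypothetical helpers, not NNI functions, the builtin name `demotuner` is made up, and the checks only reflect what the list above says (allowed `algoType` values, class names that include a module name).

```python
from typing import Dict, List

ALGO_TYPES = ("tuner", "assessor", "advisor")
META_KEYS = ("algoType", "builtinName", "className", "classArgsValidator")

# "demotuner" is a placeholder; the class names are the examples from the list above.
DEMO_META: Dict[str, str] = {
    "algoType": "tuner",
    "builtinName": "demotuner",
    "className": "demo_tuner.DemoTuner",
    "classArgsValidator": "demo_tuner.MyClassArgsValidator",
}


def check_meta(meta: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the meta mapping (empty if none)."""
    problems = [f"unknown key: {key}" for key in sorted(set(meta) - set(META_KEYS))]
    if meta.get("algoType") not in ALGO_TYPES:
        problems.append(f"algoType must be one of {ALGO_TYPES}")
    for key in ("className", "classArgsValidator"):
        value = meta.get(key)
        if value is not None and "." not in value:
            problems.append(f"{key} should include its module name")
    return problems


def render_meta(meta: Dict[str, str]) -> str:
    """Render the mapping as flat `key: value` lines, in the documented key order."""
    return "".join(f"{key}: {meta[key]}\n" for key in META_KEYS if key in meta)


if __name__ == "__main__":
    print(check_meta(DEMO_META))
    print(render_meta(DEMO_META), end="")
```

The rendering is done by hand to keep the sketch free of third-party dependencies; a YAML library would be the more usual choice for anything beyond flat string values.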
Refer to the [customized tuner example](../Tuner/InstallCustomizedTuner.md) for a full example.
The following is an example of the YAML file:
### 4. Install customized algorithms package into NNI
If your installation source is prepared as a directory with `python setup.py develop`, you can install the package with the following command:
## 5. Use the installed builtin algorithms in experiment
## 6. Use the installed builtin algorithms in experiment
Once your customized algorithm is installed, you can use it in the experiment configuration file the same way as other builtin tuners/assessors/advisors, for example:
```yaml
...
...
@@ -109,56 +96,42 @@ tuner:
```
## Manage packages using `nnictl package`
## Manage builtin algorithms using `nnictl algo`
### List installed packages
### List builtin algorithms
Run the following command to list the installed packages:
Run the following command to list the registered builtin algorithms:
@@ -72,7 +72,7 @@ Here is a template configuration specification to use AdaptDL as a training serv
path: /
containerMountPath: /nfs
checkpoint: # optional
storageClass: microk8s-hostpath
storageClass: dfs
storageSize: 1Gi
Those configs not mentioned below, are following the
...
...
@@ -86,6 +86,7 @@ Those configs not mentioned below, are following the
* **tuner**\ : It supports the Tuun tuner and all NNI built-in tuners (except for the checkpoint feature of the NNI PBT tuners).
* **trial**\ : It defines the specs of an ``adl`` trial.
* **namespace**\ : (*Optional*\ ) Kubernetes namespace to launch the trials. Defaults to the ``default`` namespace.
* **adaptive**\ : (*Optional*\ ) Boolean for the AdaptDL trainer. When ``true``\ , the job is preemptible and adaptive.
* **image**\ : Docker image for the trial
* **imagePullSecret**\ : (*Optional*\ ) If you are using a private registry,
...
...
@@ -97,7 +98,10 @@ Those configs not mentioned below, are following the
* **memorySize**\ : (*Optional*\ ) the size of memory requested for this trial. It must follow the Kubernetes
`default format <https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory>`__.
* **nfs**\ : (*Optional*\ ) mounting external storage. For more information about using NFS please check the below paragraph.
* **checkpoint**\ : (*Optional*\ ) `storage settings <https://kubernetes.io/docs/concepts/storage/storage-classes/>`__ for AdaptDL internal checkpoints. You can leave it unset if you are not a dev user.
* **checkpoint**\ : (*Optional*\ ) storage settings for model checkpoints.
* **storageClass**\ : check `Kubernetes storage documentation <https://kubernetes.io/docs/concepts/storage/storage-classes/>`__ for how to use the appropriate ``storageClass``.
* **storageSize**\ : this value should be large enough to fit your model's checkpoints, or it could cause "disk quota exceeded" error.