Unverified Commit fbee0df1 authored by liuzhe-lz, committed by GitHub

Fix HPO doc fixme (#4661)

parent f886ae5d
...@@ -4,7 +4,7 @@ build/ ...@@ -4,7 +4,7 @@ build/
_build/ _build/
# ignored copied rst in tutorials # ignored copied rst in tutorials
**/tutorials/**/cp_*.rst /source/tutorials/**/cp_*.rst
# auto-generated reference table # auto-generated reference table
_modules/ _modules/
.. role:: raw-html(raw)
   :format: html
Built-in Assessors
==================
NNI provides state-of-the-art tuning algorithms as built-in assessors and makes them easy to use. Below is a brief overview of NNI's current built-in assessors.
Note: Click the **Assessor's name** to get each Assessor's installation requirements, suggested usage scenario, and a config example. A link to a detailed description of each algorithm is provided at the end of the suggested scenario for each Assessor.
Currently, we support the following Assessors:
.. list-table::
   :header-rows: 1
   :widths: auto

   * - Assessor
     - Brief Introduction of Algorithm
   * - `Medianstop <#MedianStop>`__
     - Medianstop is a simple early stopping rule. It stops a pending trial X at step S if the trial’s best objective value by step S is strictly worse than the median value of the running averages of all completed trials’ objectives reported up to step S. `Reference Paper <https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf>`__
   * - `Curvefitting <#Curvefitting>`__
     - The Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the predicted final-epoch performance is worse than the best final performance in the trial history. In this algorithm, we use 12 curves to fit the accuracy curve. `Reference Paper <http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf>`__
Usage of Builtin Assessors
--------------------------
To use a built-in assessor provided by the NNI SDK, you need to declare the **builtinAssessorName** and **classArgs** in the ``config.yml`` file. In this section, we introduce the detailed usage, the suggested scenarios, the classArg requirements, and an example for each assessor.
Note: Please follow the provided format when writing your ``config.yml`` file.
:raw-html:`<a name="MedianStop"></a>`
Median Stop Assessor
^^^^^^^^^^^^^^^^^^^^
..

   Builtin Assessor Name: **Medianstop**
**Suggested scenario**
It is applicable to a wide range of performance curves; thus, it can be used in various scenarios to speed up the tuning process. `Detailed Description <./MedianstopAssessor.rst>`__
**classArgs requirements:**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*\ ) - If 'maximize', the assessor will **stop** trials with smaller expected final value. If 'minimize', the assessor will **stop** trials with larger expected final value.
* **start_step** (*int, optional, default = 0*\ ) - The assessor only decides whether to stop a trial after receiving at least start_step intermediate results.
**Usage example:**
.. code-block:: yaml

   # config.yml
   assessor:
     builtinAssessorName: Medianstop
     classArgs:
       optimize_mode: maximize
       start_step: 5
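To make the rule concrete, the following is a minimal Python sketch of the median-stop decision described above. It is an illustration only, not NNI's actual implementation; the function and argument names are hypothetical.

.. code-block:: python

   import statistics

   def median_stop_should_stop(trial_history, completed_trials, optimize_mode='maximize'):
       """Illustrative median-stop check; not NNI's actual implementation.

       trial_history: intermediate results of the running trial up to step S.
       completed_trials: lists of intermediate results of completed trials.
       """
       step = len(trial_history)
       if step == 0:
           return False
       # Best objective value of the running trial by step S.
       best = max(trial_history) if optimize_mode == 'maximize' else min(trial_history)
       # Running average of each completed trial's objectives up to step S.
       running_avgs = [sum(h[:step]) / step for h in completed_trials if len(h) >= step]
       if not running_avgs:
           return False
       median = statistics.median(running_avgs)
       # Stop if the trial's best value is strictly worse than the median.
       return best < median if optimize_mode == 'maximize' else best > median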
:raw-html:`<br>`
:raw-html:`<a name="Curvefitting"></a>`
Curve Fitting Assessor
^^^^^^^^^^^^^^^^^^^^^^
..

   Builtin Assessor Name: **Curvefitting**
**Suggested scenario**
It is applicable to a wide range of performance curves; thus, it can be used in various scenarios to speed up the tuning process. Even better, it can handle and assess curves with similar performance. `Detailed Description <./CurvefittingAssessor.rst>`__

**Note**\ , according to the original paper, only incremental functions are supported. Therefore, this assessor can only be used to maximize optimization metrics. For example, it can be used for accuracy, but not for loss.
**classArgs requirements:**
* **epoch_num** (*int, required*\ ) - The total number of epochs. We need to know the number of epochs to determine which points to predict.
* **start_step** (*int, optional, default = 6*\ ) - The assessor only decides whether to stop a trial after receiving at least start_step intermediate results.
* **threshold** (*float, optional, default = 0.95*\ ) - The threshold used to decide whether to early-stop the worst performance curves. For example: if threshold = 0.95 and the best performance in the history is 0.9, then we will stop any trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
* **gap** (*int, optional, default = 1*\ ) - The gap interval between assessor judgements. For example: if gap = 2 and start_step = 6, then we will assess the result when we receive the 6th, 8th, 10th, 12th... intermediate results.
**Usage example:**
.. code-block:: yaml

   # config.yml
   assessor:
     builtinAssessorName: Curvefitting
     classArgs:
       epoch_num: 20
       start_step: 6
       threshold: 0.95
       gap: 1
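The interaction of ``threshold``, ``start_step``, and ``gap`` can be illustrated with a small sketch. The helper names below are hypothetical and the logic is a simplification of the assessor's behavior:

.. code-block:: python

   def should_assess(step, start_step=6, gap=1):
       """Whether the assessor judges the trial at this step (sketch only)."""
       return step >= start_step and (step - start_step) % gap == 0

   def should_stop(predicted_final, best_history, threshold=0.95):
       """Stop if the predicted final accuracy falls below threshold * best."""
       return predicted_final < threshold * best_history

   # With threshold = 0.95 and a historical best of 0.9,
   # trials predicted below 0.95 * 0.9 = 0.855 are stopped:
   assert should_stop(0.80, 0.90)
   assert not should_stop(0.88, 0.90)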
Network Morphism Tuner
======================
`Autokeras <https://arxiv.org/abs/1806.10282>`__ is a popular AutoML tool based on network morphism. The basic idea of Autokeras is to use Bayesian regression to estimate the metric of a neural network architecture. In each iteration, it generates several child networks from the parent networks, then uses naïve Bayesian regression to estimate their metric values from the history of trained (network, metric) pairs. Next, it chooses the child with the best estimated performance and adds it to the training queue. Inspired by the work of Autokeras and referring to its `code <https://github.com/jhfjhfj1/autokeras>`__, we implemented our network morphism method on the NNI platform.
If you want to know more about network morphism trial usage, please see the :githublink:`Readme.md <examples/trials/network_morphism/README.rst>`.
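The search loop described above can be sketched in a few lines of Python. This is a conceptual illustration only; ``generate_children``, ``BayesianRegressor``, and ``train_and_evaluate`` are hypothetical names, not NNI APIs:

.. code-block:: python

   # Conceptual sketch of the network morphism search loop; all helper
   # names are hypothetical and not part of the NNI API.
   def network_morphism_search(init_net, n_rounds):
       history = []                     # (architecture, metric) pairs
       regressor = BayesianRegressor()  # estimates a metric from history
       parent = init_net
       for _ in range(n_rounds):
           # Morph the parent: widen, deepen, or add skip-connections.
           children = generate_children(parent)
           regressor.fit(history)
           # Choose the child with the best estimated performance.
           best_child = max(children, key=regressor.predict)
           metric = train_and_evaluate(best_child)  # run as a trial
           history.append((best_child, metric))
           parent = best_child
       return max(history, key=lambda pair: pair[1])[0]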
Usage
-----
Installation
^^^^^^^^^^^^
NetworkMorphism requires :githublink:`PyTorch <examples/trials/network_morphism/requirements.txt>`.
classArgs Requirements
^^^^^^^^^^^^^^^^^^^^^^
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will try to maximize metrics. If 'minimize', the tuner will try to minimize metrics.
* **task** (*'cv', optional, default = 'cv'*) - The domain of the experiment. For now, this tuner only supports the computer vision (CV) domain.
* **input_width** (*int, optional, default = 32*) - input image width
* **input_channel** (*int, optional, default = 3*) - input image channel
* **n_output_node** (*int, optional, default = 10*) - number of classes
Config File
^^^^^^^^^^^
To use Network Morphism, you should modify the following spec in your ``config.yml`` file:
.. code-block:: yaml

   tuner:
     # choice: NetworkMorphism
     name: NetworkMorphism
     classArgs:
       # choice: maximize, minimize
       optimize_mode: maximize
       # for now, this tuner only supports the cv domain
       task: cv
       # modify to fit your input image width
       input_width: 32
       # modify to fit your input image channel
       input_channel: 3
       # modify to fit your number of classes
       n_output_node: 10
Example Configuration
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: yaml

   # config.yml
   tuner:
     name: NetworkMorphism
     classArgs:
       optimize_mode: maximize
       task: cv
       input_width: 32
       input_channel: 3
       n_output_node: 10
In the training procedure, the tuner generates a JSON file which represents a network graph. Users can call the ``json_to_graph()`` function to build a PyTorch or Keras model from this JSON file.
.. code-block:: python

   import nni
   from nni.networkmorphism_tuner.graph import json_to_graph

   def build_graph_from_json(ir_model_json):
       """Build a PyTorch model from its JSON representation."""
       graph = json_to_graph(ir_model_json)
       model = graph.produce_torch_model()
       return model

   # the trial gets its next parameter (a model JSON) from the network morphism tuner
   RCV_CONFIG = nni.get_next_parameter()
   # call the function to build the PyTorch model
   net = build_graph_from_json(RCV_CONFIG)

   # training procedure
   # ....

   # report the final accuracy to NNI
   nni.report_final_result(best_acc)
If you want to save and load the **best model**, the following methods are recommended.
.. code-block:: python

   import nni
   import torch
   from tensorflow.keras.models import model_from_json

   # 1. Use NNI API
   ## You can get the best model ID from the WebUI
   ## or from `nni-experiments/experiment_id/log/model_path/best_model.txt`
   ## read the JSON string from the model file and load it with the NNI API
   with open("best-model.json") as json_file:
       json_of_model = json_file.read()
   model = build_graph_from_json(json_of_model)

   # 2. Use Framework API (framework dependent)
   ## 2.1 Keras API
   ## Save the model with the Keras API in the trial code;
   ## in NNI local mode it is better to save the model with its trial ID
   model_id = nni.get_sequence_id()
   ## serialize the model to JSON
   model_json = model.to_json()
   with open("model-{}.json".format(model_id), "w") as json_file:
       json_file.write(model_json)
   ## serialize the weights to HDF5
   model.save_weights("model-{}.h5".format(model_id))

   ## Load the model with the Keras API if you want to reuse it
   ## load the JSON and re-create the model
   model_id = ""  # id of the model you want to reuse
   with open('model-{}.json'.format(model_id), 'r') as json_file:
       loaded_model_json = json_file.read()
   loaded_model = model_from_json(loaded_model_json)
   ## load the weights into the new model
   loaded_model.load_weights("model-{}.h5".format(model_id))

   ## 2.2 PyTorch API
   ## Save the model with the PyTorch API in the trial code
   model_id = nni.get_sequence_id()
   torch.save(model, "model-{}.pt".format(model_id))

   ## Load the model with the PyTorch API if you want to reuse it
   model_id = ""  # id of the model you want to reuse
   loaded_model = torch.load("model-{}.pt".format(model_id))
File Structure
--------------
The tuner consists of many files, functions, and classes. Here we give only a brief introduction to the most important files:

* ``networkmorphism_tuner.py`` is a tuner which uses network morphism techniques.
* ``bayesian.py`` is a Bayesian method to estimate the metric of unseen models based on the models we have already searched.
* ``graph.py`` is the meta graph data structure. The class Graph represents the neural architecture graph of a model.

  * Graph extracts the neural architecture graph from a model.
  * Each node in the graph is an intermediate tensor between layers.
  * Each layer is an edge in the graph.
  * Notably, multiple edges may refer to the same layer.

* ``graph_transformer.py`` includes some graph transformers which widen, deepen, or add skip-connections to the graph.
* ``layers.py`` includes all the layers we use in our model.
* ``layer_transformer.py`` includes some layer transformers which widen, deepen, or add skip-connections to the layer.
* ``nn.py`` includes the class which generates the initial network.
* ``metric.py`` includes some metric classes, such as Accuracy and MSE.
* ``utils.py`` contains example search network architectures for the ``cifar10`` dataset, using Keras.
The Network Representation JSON Example
---------------------------------------
Here is an example of the intermediate representation JSON file we defined, which is passed from the tuner to the trial during the architecture search procedure. Users can call the ``json_to_graph()`` function in the trial code to build a PyTorch or Keras model from this JSON file.
.. code-block:: json

   {
       "input_shape": [32, 32, 3],
       "weighted": false,
       "operation_history": [],
       "layer_id_to_input_node_ids": {
           "0": [0], "1": [1], "2": [2], "3": [3], "4": [4], "5": [5],
           "6": [6], "7": [7], "8": [8], "9": [9], "10": [10], "11": [11],
           "12": [12], "13": [13], "14": [14], "15": [15], "16": [16]
       },
       "layer_id_to_output_node_ids": {
           "0": [1], "1": [2], "2": [3], "3": [4], "4": [5], "5": [6],
           "6": [7], "7": [8], "8": [9], "9": [10], "10": [11], "11": [12],
           "12": [13], "13": [14], "14": [15], "15": [16], "16": [17]
       },
       "adj_list": {
           "0": [[1, 0]],
           "1": [[2, 1]],
           "2": [[3, 2]],
           "3": [[4, 3]],
           "4": [[5, 4]],
           "5": [[6, 5]],
           "6": [[7, 6]],
           "7": [[8, 7]],
           "8": [[9, 8]],
           "9": [[10, 9]],
           "10": [[11, 10]],
           "11": [[12, 11]],
           "12": [[13, 12]],
           "13": [[14, 13]],
           "14": [[15, 14]],
           "15": [[16, 15]],
           "16": [[17, 16]],
           "17": []
       },
       "reverse_adj_list": {
           "0": [],
           "1": [[0, 0]],
           "2": [[1, 1]],
           "3": [[2, 2]],
           "4": [[3, 3]],
           "5": [[4, 4]],
           "6": [[5, 5]],
           "7": [[6, 6]],
           "8": [[7, 7]],
           "9": [[8, 8]],
           "10": [[9, 9]],
           "11": [[10, 10]],
           "12": [[11, 11]],
           "13": [[12, 12]],
           "14": [[13, 13]],
           "15": [[14, 14]],
           "16": [[15, 15]],
           "17": [[16, 16]]
       },
       "node_list": [
           [0, [32, 32, 3]],
           [1, [32, 32, 3]],
           [2, [32, 32, 64]],
           [3, [32, 32, 64]],
           [4, [16, 16, 64]],
           [5, [16, 16, 64]],
           [6, [16, 16, 64]],
           [7, [16, 16, 64]],
           [8, [8, 8, 64]],
           [9, [8, 8, 64]],
           [10, [8, 8, 64]],
           [11, [8, 8, 64]],
           [12, [4, 4, 64]],
           [13, [64]],
           [14, [64]],
           [15, [64]],
           [16, [64]],
           [17, [10]]
       ],
       "layer_list": [
           [0, ["StubReLU", 0, 1]],
           [1, ["StubConv2d", 1, 2, 3, 64, 3]],
           [2, ["StubBatchNormalization2d", 2, 3, 64]],
           [3, ["StubPooling2d", 3, 4, 2, 2, 0]],
           [4, ["StubReLU", 4, 5]],
           [5, ["StubConv2d", 5, 6, 64, 64, 3]],
           [6, ["StubBatchNormalization2d", 6, 7, 64]],
           [7, ["StubPooling2d", 7, 8, 2, 2, 0]],
           [8, ["StubReLU", 8, 9]],
           [9, ["StubConv2d", 9, 10, 64, 64, 3]],
           [10, ["StubBatchNormalization2d", 10, 11, 64]],
           [11, ["StubPooling2d", 11, 12, 2, 2, 0]],
           [12, ["StubGlobalPooling2d", 12, 13]],
           [13, ["StubDropout2d", 13, 14, 0.25]],
           [14, ["StubDense", 14, 15, 64, 64]],
           [15, ["StubReLU", 15, 16]],
           [16, ["StubDense", 16, 17, 64, 10]]
       ]
   }
You can consider the model to be a `directed acyclic graph <https://en.wikipedia.org/wiki/Directed_acyclic_graph>`__. The definition of each model is a JSON object where:

* ``input_shape`` is a list of integers which does not include the batch axis.
* ``weighted`` means whether the weights and biases in the neural network should be included in the graph.
* ``operation_history`` is a list recording all the network morphism operations applied so far.
* ``layer_id_to_input_node_ids`` is a dictionary mapping layer identifiers to their input node identifiers.
* ``layer_id_to_output_node_ids`` is a dictionary mapping layer identifiers to their output node identifiers.
* ``adj_list`` is the adjacency list of the graph, a two-dimensional list. The first dimension is indexed by tensor identifiers. In each edge list, the elements are two-element tuples of (tensor identifier, layer identifier).
* ``reverse_adj_list`` is a reversed adjacency list in the same format as ``adj_list``.
* ``node_list`` is a list of nodes. The indices of the list are the identifiers, and each entry records the shape of the corresponding intermediate tensor.
* ``layer_list`` is a list of stub layers. The indices of the list are the identifiers.

  * For ``StubConv (StubConv1d, StubConv2d, StubConv3d)``, the numbering follows the format: its node input id (or id list), node output id, input_channel, filters, kernel_size, stride, and padding.
  * For ``StubDense``, the numbering follows the format: its node input id (or id list), node output id, input_units, and units.
  * For ``StubBatchNormalization (StubBatchNormalization1d, StubBatchNormalization2d, StubBatchNormalization3d)``, the numbering follows the format: its node input id (or id list), node output id, and number of features.
  * For ``StubDropout (StubDropout1d, StubDropout2d, StubDropout3d)``, the numbering follows the format: its node input id (or id list), node output id, and dropout rate.
  * For ``StubPooling (StubPooling1d, StubPooling2d, StubPooling3d)``, the numbering follows the format: its node input id (or id list), node output id, kernel_size, stride, and padding.
  * For other layers, the numbering follows the format: its node input id (or id list) and node output id.
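As a quick sanity check, the sketch below loads such a JSON file and decodes one layer entry according to the format above (the file name ``graph.json`` is an assumption for illustration):

.. code-block:: python

   import json

   # Minimal sketch: inspect the intermediate representation shown above,
   # assuming it has been saved as ``graph.json``.
   with open('graph.json') as f:
       ir = json.load(f)

   # Layer 1 in the example is ["StubConv2d", 1, 2, 3, 64, 3]:
   # input node id, output node id, input_channel, filters, kernel_size.
   name, in_node, out_node, in_ch, filters, kernel = ir['layer_list'][1][1]
   print(name, in_ch, '->', filters, 'kernel', kernel)  # StubConv2d 3 -> 64 kernel 3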
TODO
----
As the next step, we will change the API from a fixed network generator to a network generator with more available operators. We will also use ONNX instead of JSON as the intermediate representation spec in the future.
######################
Examples
######################
.. toctree::
   :maxdepth: 2

   MNIST<./TrialExample/MnistExamples>
   Cifar10<./TrialExample/Cifar10Examples>
   Scikit-learn<./TrialExample/SklearnExamples>
   GBDT<./TrialExample/GbdtExample>
   Pix2pix<./TrialExample/Pix2pixExample>
...@@ -46,6 +46,7 @@ extensions = [ ...@@ -46,6 +46,7 @@ extensions = [
'sphinx_gallery.gen_gallery', 'sphinx_gallery.gen_gallery',
'sphinx.ext.autodoc', 'sphinx.ext.autodoc',
'sphinx.ext.autosummary', 'sphinx.ext.autosummary',
'sphinx.ext.intersphinx',
'sphinx.ext.mathjax', 'sphinx.ext.mathjax',
'sphinxarg4nni.ext', 'sphinxarg4nni.ext',
'sphinx.ext.napoleon', 'sphinx.ext.napoleon',
......
...@@ -9,9 +9,8 @@ If a trial is predicted to produce suboptimal final result, the assessor will st ...@@ -9,9 +9,8 @@ If a trial is predicted to produce suboptimal final result, the assessor will st
to save computing resources for other hyperparameter sets. to save computing resources for other hyperparameter sets.
As introduced in quickstart tutorial, a trial is the evaluation process of a hyperparameter set, As introduced in quickstart tutorial, a trial is the evaluation process of a hyperparameter set,
and intermediate results are reported with ``nni.report_intermediate_result()`` API in trial code. and intermediate results are reported with :func:`nni.report_intermediate_result` API in trial code.
Typically, intermediate results are accuracy or loss metrics of each epoch. Typically, intermediate results are accuracy or loss metrics of each epoch.
(FIXME: links)
Using an assessor will increase the efficiency of computing resources, Using an assessor will increase the efficiency of computing resources,
but may slightly reduce the prediction accuracy of tuners. but may slightly reduce the prediction accuracy of tuners.
......
...@@ -6,11 +6,10 @@ Auto hyperparameter optimization (HPO), or auto tuning, is one of the key featur ...@@ -6,11 +6,10 @@ Auto hyperparameter optimization (HPO), or auto tuning, is one of the key featur
Introduction to HPO Introduction to HPO
------------------- -------------------
In machine learning, a hyperparameter is a parameter whose value is used to control learning process [1]_, In machine learning, a hyperparameter is a parameter whose value is used to control learning process,
and HPO is the problem of choosing a set of optimal hyperparameters for a learning algorithm [2]_. and HPO is the problem of choosing a set of optimal hyperparameters for a learning algorithm.
(`From <https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)>`__
.. [1] https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning) `Wikipedia <https://en.wikipedia.org/wiki/Hyperparameter_optimization>`__)
.. [2] https://en.wikipedia.org/wiki/Hyperparameter_optimization
Following code snippet demonstrates a naive HPO process: Following code snippet demonstrates a naive HPO process:
...@@ -35,9 +34,9 @@ Following code snippet demonstrates a naive HPO process: ...@@ -35,9 +34,9 @@ Following code snippet demonstrates a naive HPO process:
You may have noticed, the example will train 4×10×3=120 models in total. You may have noticed, the example will train 4×10×3=120 models in total.
Since it consumes so much computing resources, you may want to: Since it consumes so much computing resources, you may want to:
1. Find the best set of hyperparameters with less iterations. 1. Find the best set of hyperparameters with less iterations.
2. Train the models on distributed platforms. 2. Train the models on distributed platforms.
3. Have a portal to monitor and control the process. 3. Have a portal to monitor and control the process.
And NNI will do them for you. And NNI will do them for you.
...@@ -74,7 +73,7 @@ from simple on-premise servers to scalable commercial clouds. ...@@ -74,7 +73,7 @@ from simple on-premise servers to scalable commercial clouds.
With NNI you can write one piece of model code, and concurrently evaluate hyperparameter sets on local machine, SSH servers, With NNI you can write one piece of model code, and concurrently evaluate hyperparameter sets on local machine, SSH servers,
Kubernetes-based clusters, AzureML service, and much more. Kubernetes-based clusters, AzureML service, and much more.
Main article: (FIXME: link to training_services) Main article: :doc:`/experiment/training_service`
Web Portal Web Portal
^^^^^^^^^^ ^^^^^^^^^^
...@@ -82,7 +81,10 @@ Web Portal ...@@ -82,7 +81,10 @@ Web Portal
NNI provides a web portal to monitor training progress, to visualize hyperparameter performance, NNI provides a web portal to monitor training progress, to visualize hyperparameter performance,
to manually customize hyperparameters, and to manage multiple HPO experiments. to manually customize hyperparameters, and to manage multiple HPO experiments.
(FIXME: image and link) Main article: :doc:`/experiment/web_portal`
.. image:: ../../static/img/webui.gif
:width: 100%
Tutorials Tutorials
--------- ---------
...@@ -97,12 +99,11 @@ Extra Features ...@@ -97,12 +99,11 @@ Extra Features
After you are familiar with basic usage, you can explore more HPO features: After you are familiar with basic usage, you can explore more HPO features:
* :doc:`Assessor: Early stop non-optimal models <assessors>` * :doc:`Use command line tool to create and manage experiments (nnictl) </reference/nnictl>`
* :doc:`nnictl: Use command line tool to create and manage experiments </reference/nnictl>` * :doc:`Early stop non-optimal models (assessor) <assessors>`
* :doc:`Custom tuner: Implement your own tuner <custom_algorithm>` * :doc:`TensorBoard integration <tensorboard>`
* :doc:`Tensorboard support <tensorboard>` * :doc:`Implement your own algorithm <custom_algorithm>`
* :doc:`Tuner benchmark <hpo_benchmark>` * :doc:`Benchmark tuners <hpo_benchmark>`
* :doc:`NNI Annotation (legacy) <nni_annotation>`
Built-in Algorithms Built-in Algorithms
------------------- -------------------
...@@ -166,7 +167,7 @@ Main article: :doc:`tuners` ...@@ -166,7 +167,7 @@ Main article: :doc:`tuners`
* - :class:`DNGOTuner <nni.algorithms.hpo.dngo_tuner.DNGOTuner>` * - :class:`DNGOTuner <nni.algorithms.hpo.dngo_tuner.DNGOTuner>`
- Bayesian - Bayesian
- (FIXME: full name?) - Deep Networks for Global Optimization.
* - :class:`PPOTuner <nni.algorithms.hpo.ppo_tuner.PPOTuner>` * - :class:`PPOTuner <nni.algorithms.hpo.ppo_tuner.PPOTuner>`
- RL - RL
......
...@@ -3,14 +3,16 @@ Tuner: Tuning Algorithms ...@@ -3,14 +3,16 @@ Tuner: Tuning Algorithms
The tuner decides which hyperparameter sets will be evaluated. It is a most important part of NNI HPO. The tuner decides which hyperparameter sets will be evaluated. It is a most important part of NNI HPO.
A tuner works in following steps: A tuner works like following pseudocode:
1. Initialize with a search space. .. code-block:: python
2. Generate hyperparameter sets from the search space.
3. Send hyperparameters to trials. space = get_search_space()
4. Receive evaluation results. history = []
5. Update internal states according to the results. while not experiment_end:
6. Go to step 2, until experiment end. hp = suggest_hyperparameter_set(space, history)
result = run_trial(hp)
history.append((hp, result))
NNI has out-of-the-box support for many popular tuning algorithms. NNI has out-of-the-box support for many popular tuning algorithms.
They should be sufficient to cover most typical machine learning scenarios. They should be sufficient to cover most typical machine learning scenarios.
...@@ -26,15 +28,15 @@ All built-in tuners have similar usage. ...@@ -26,15 +28,15 @@ All built-in tuners have similar usage.
To use a built-in tuner, you need to specify its name and arguments in experiment config, To use a built-in tuner, you need to specify its name and arguments in experiment config,
and provides a standard :doc:`search_space`. and provides a standard :doc:`search_space`.
Some tuners, like SMAC and DNGO, have extra dependencies that need to be installed separately. Some tuners, like SMAC and DNGO, have extra dependencies that need to be installed separately.
Please check each tuner's reference page for what arguments it supports and whether it needs extra dependencies. Please check each tuner's reference page for what arguments it supports and whether it needs extra dependencies.
For a general example, random tuner can be configured as follow: As a general example, random tuner can be configured as follow:
.. code-block:: python .. code-block:: python
config.search_space = { config.search_space = {
'x': {'_type': 'uniform', '_value': [0, 1]} 'x': {'_type': 'uniform', '_value': [0, 1]},
'y': {'_type': 'choice', '_value': ['a', 'b', 'c']}
} }
config.tuner.name = 'Random' config.tuner.name = 'Random'
config.tuner.class_args = {'seed': 0} config.tuner.class_args = {'seed': 0}
...@@ -47,13 +49,25 @@ Built-in Tuners ...@@ -47,13 +49,25 @@ Built-in Tuners
:widths: auto :widths: auto
* - Tuner * - Tuner
- Brief Introduction of Algorithm - Brief Introduction
* - :class:`TPE <nni.algorithms.hpo.tpe_tuner.TpeTuner>` * - :class:`TPE <nni.algorithms.hpo.tpe_tuner.TpeTuner>`
- The Tree-structured Parzen Estimator (TPE) is a sequential model-based optimization (SMBO) approach. SMBO methods sequentially construct models to approximate the performance of hyperparameters based on historical measurements, and then subsequently choose new hyperparameters to test based on this model. `Reference Paper <https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf>`__ - Tree-structured Parzen Estimator, a classic Bayesian optimization algorithm.
(`paper <https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf>`__)
TPE is a lightweight tuner that has no extra dependency and supports all search space types.
Good to start with.
The drawback is that TPE cannot discover relationship between different hyperparameters.
* - :class:`Random Search <nni.algorithms.hpo.random_tuner.RandomTuner>` * - :class:`Random <nni.algorithms.hpo.random_tuner.RandomTuner>`
- In Random Search for Hyper-Parameter Optimization show that Random Search might be surprisingly simple and effective. We suggest that we could use Random Search as the baseline when we have no knowledge about the prior distribution of hyper-parameters. `Reference Paper <http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf>`__ - Naive random search, the baseline. It supports all search space types.
* - :class:`Grid Search <nni.algorithms.hpo.gridsearch_tuner.GridSearchTuner>`
- Divides search space into evenly spaced grid, and performs brute-force traverse. Another baseline.
It supports all search space types.
Recommended when the search space is small, and when you want to find the strictly optimal hyperparameters.
* - :class:`Anneal <nni.algorithms.hpo.hyperopt_tuner.HyperoptTuner>` * - :class:`Anneal <nni.algorithms.hpo.hyperopt_tuner.HyperoptTuner>`
- This simple annealing algorithm begins by sampling from the prior, but tends over time to sample from points closer and closer to the best ones observed. This algorithm is a simple variation on the random search that leverages smoothness in the response surface. The annealing rate is not adaptive. - This simple annealing algorithm begins by sampling from the prior, but tends over time to sample from points closer and closer to the best ones observed. This algorithm is a simple variation on the random search that leverages smoothness in the response surface. The annealing rate is not adaptive.
...@@ -69,9 +83,6 @@ Built-in Tuners ...@@ -69,9 +83,6 @@ Built-in Tuners
* - :class:`Batch <nni.algorithms.hpo.batch_tuner.BatchTuner>` * - :class:`Batch <nni.algorithms.hpo.batch_tuner.BatchTuner>`
- Batch tuner allows users to simply provide several configurations (i.e., choices of hyper-parameters) for their trial code. After finishing all the configurations, the experiment is done. Batch tuner only supports the type choice in search space spec. - Batch tuner allows users to simply provide several configurations (i.e., choices of hyper-parameters) for their trial code. After finishing all the configurations, the experiment is done. Batch tuner only supports the type choice in search space spec.
* - :class:`Grid Search <nni.algorithms.hpo.gridsearch_tuner.GridSearchTuner>`
- Grid Search performs an exhaustive searching through the search space.
* - :class:`Hyperband <nni.algorithms.hpo.hyperband_advisor.Hyperband>` * - :class:`Hyperband <nni.algorithms.hpo.hyperband_advisor.Hyperband>`
- Hyperband tries to use limited resources to explore as many configurations as possible and returns the most promising ones as a final result. The basic idea is to generate many configurations and run them for a small number of trials. The half least-promising configurations are thrown out, the remaining are further trained along with a selection of new configurations. The size of these populations is sensitive to resource constraints (e.g. allotted search time). `Reference Paper <https://arxiv.org/pdf/1603.06560.pdf>`__ - Hyperband tries to use limited resources to explore as many configurations as possible and returns the most promising ones as a final result. The basic idea is to generate many configurations and run them for a small number of trials. The half least-promising configurations are thrown out, the remaining are further trained along with a selection of new configurations. The size of these populations is sensitive to resource constraints (e.g. allotted search time). `Reference Paper <https://arxiv.org/pdf/1603.06560.pdf>`__
......
...@@ -28,7 +28,6 @@ Neural Network Intelligence ...@@ -28,7 +28,6 @@ Neural Network Intelligence
nnictl Commands <reference/nnictl> nnictl Commands <reference/nnictl>
Experiment Configuration <reference/experiment_config> Experiment Configuration <reference/experiment_config>
HPO API Reference <reference/hpo>
Python API <reference/_modules/nni> Python API <reference/_modules/nni>
API Reference <reference/python_api_ref> API Reference <reference/python_api_ref>
......
.. 4d670c5b2f7eebe5b65f593d7d350b85 .. 4908917bfebd7c3afbbdf8529b2d8a6c
########################### ###########################
Neural Network Intelligence Neural Network Intelligence
...@@ -19,6 +19,7 @@ Neural Network Intelligence ...@@ -19,6 +19,7 @@ Neural Network Intelligence
特征工程<feature_engineering> 特征工程<feature_engineering>
NNI实验 <experiment/overview> NNI实验 <experiment/overview>
HPO API Reference <reference/hpo> HPO API Reference <reference/hpo>
Experiment API Reference <reference/experiment>
参考<reference> 参考<reference>
示例与解决方案<CommunitySharings/community_sharings> 示例与解决方案<CommunitySharings/community_sharings>
研究和出版物 <ResearchPublications> 研究和出版物 <ResearchPublications>
......
...@@ -12,4 +12,3 @@ References ...@@ -12,4 +12,3 @@ References
Experiment Configuration <reference/experiment_config> Experiment Configuration <reference/experiment_config>
API References <reference/python_api_ref> API References <reference/python_api_ref>
Supported Framework Library <SupportedFramework_Library> Supported Framework Library <SupportedFramework_Library>
Launch from Python <Tutorial/HowToLaunchFromPython>
Experiment API Reference
========================
.. autoclass:: nni.experiment.Experiment
   :members:
Experiment
==========
nni.experiment
--------------
nni.runtime
-----------
nni.tools
---------
Hyperparameter Optimization
===========================
nni.algorithms.hpo
------------------
nni.tuner
---------
nni.assessor
------------
nni.trial
---------
...@@ -4,9 +4,9 @@ API Reference ...@@ -4,9 +4,9 @@ API Reference
.. toctree:: .. toctree::
:maxdepth: 1 :maxdepth: 1
Hyperparameter Optimization <./python_api/hpo> Hyperparameter Optimization <hpo>
Neural Architecture Search <./python_api/nas> Neural Architecture Search <./python_api/nas>
Model Compression <./python_api/compression> Model Compression <./python_api/compression>
Feature Engineering <./python_api/feature_engineering> Feature Engineering <./python_api/feature_engineering>
Experiment <./python_api/experiment> Experiment <experiment>
Others <./python_api/others> Others <./python_api/others>
.. bcc89d271f64dcf7c00d79b9442933a9 .. e973987e22c5e2d43f325d6f29717ecb
:orphan: :orphan:
...@@ -12,4 +12,3 @@ ...@@ -12,4 +12,3 @@
Experiment 配置 <reference/experiment_config> Experiment 配置 <reference/experiment_config>
API 参考 <reference/python_api_ref> API 参考 <reference/python_api_ref>
支持的框架和库 <SupportedFramework_Library> 支持的框架和库 <SupportedFramework_Library>
从 Python 发起实验 <Tutorial/HowToLaunchFromPython>
...@@ -15,21 +15,21 @@ ...@@ -15,21 +15,21 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"\n# NNI HPO Quickstart with PyTorch\nThis tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning.\n\nThe tutorial consists of 4 steps: \n\n 1. Modify the model for auto-tuning.\n 2. Define hyperparameters' search space.\n 3. Configure the experiment.\n 4. Run the experiment.\n\n" "\n# NNI HPO Quickstart with PyTorch\nThis tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning.\n\nThere is also a :doc:`TensorFlow version<../hpo_quickstart_tensorflow/main>` if you prefer it.\n\nThe tutorial consists of 4 steps: \n\n1. Modify the model for auto-tuning.\n2. Define hyperparameters' search space.\n3. Configure the experiment.\n4. Run the experiment.\n\n"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Step 1: Prepare the model\nIn first step, you need to prepare the model to be tuned.\n\nThe model should be put in a separate script.\nIt will be evaluated many times concurrently,\nand possibly will be trained on distributed platforms.\n\nIn this tutorial, the model is defined in :doc:`model.py <model>`.\n\nPlease understand the model code before continue to next step.\n\n" "## Step 1: Prepare the model\nIn first step, we need to prepare the model to be tuned.\n\nThe model should be put in a separate script.\nIt will be evaluated many times concurrently,\nand possibly will be trained on distributed platforms.\n\nIn this tutorial, the model is defined in :doc:`model.py <model>`.\n\nIn short, it is a PyTorch model with 3 additional API calls:\n\n1. Use :func:`nni.get_next_parameter` to fetch the hyperparameters to be evalutated.\n2. Use :func:`nni.report_intermediate_result` to report per-epoch accuracy metrics.\n3. Use :func:`nni.report_final_result` to report final accuracy.\n\nPlease understand the model code before continue to next step.\n\n"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Step 2: Define search space\nIn model code, we have prepared 3 hyperparameters to be tuned:\n*features*, *lr*, and *momentum*.\n\nHere we need to define their *search space* so the tuning algorithm can sample them in desired range.\n\nAssuming we have following prior knowledge for these hyperparameters:\n\n 1. *features* should be one of 128, 256, 512, 1024.\n 2. *lr* should be a float between 0.0001 and 0.1, and it follows exponential distribution.\n 3. *momentum* should be a float between 0 and 1.\n\nIn NNI, the space of *features* is called ``choice``;\nthe space of *lr* is called ``loguniform``;\nand the space of *momentum* is called ``uniform``.\nYou may have noticed, these names are derived from ``numpy.random``.\n\nFor full specification of search space, check :doc:`the reference </hpo/search_space>`.\n\nNow we can define the search space as follow:\n\n" "## Step 2: Define search space\nIn model code, we have prepared 3 hyperparameters to be tuned:\n*features*, *lr*, and *momentum*.\n\nHere we need to define their *search space* so the tuning algorithm can sample them in desired range.\n\nAssuming we have following prior knowledge for these hyperparameters:\n\n1. *features* should be one of 128, 256, 512, 1024.\n2. *lr* should be a float between 0.0001 and 0.1, and it follows exponential distribution.\n3. *momentum* should be a float between 0 and 1.\n\nIn NNI, the space of *features* is called ``choice``;\nthe space of *lr* is called ``loguniform``;\nand the space of *momentum* is called ``uniform``.\nYou may have noticed, these names are derived from ``numpy.random``.\n\nFor full specification of search space, check :doc:`the reference </hpo/search_space>`.\n\nNow we can define the search space as follow:\n\n"
] ]
}, },
{ {
...@@ -65,7 +65,7 @@ ...@@ -65,7 +65,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Now we start to configure the experiment.\n\nFirstly, specify the model code.\nIn NNI evaluation of each hyperparameter set is called a *trial*.\nSo the model script is called *trial code*.\n\nIf you are using Linux system without Conda, you many need to change ``python`` to ``python3``.\n\nWhen ``trial_code_directory`` is a relative path, it relates to current working directory.\nTo run ``main.py`` from a different path, you can set trial code directory to ``Path(__file__).parent``.\n\n" "Now we start to configure the experiment.\n\n### Configure trial code\nIn NNI evaluation of each hyperparameter set is called a *trial*.\nSo the model script is called *trial code*.\n\n"
] ]
}, },
{ {
...@@ -83,7 +83,14 @@ ...@@ -83,7 +83,14 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Then specify the search space we defined above:\n\n" "When ``trial_code_directory`` is a relative path, it relates to current working directory.\nTo run ``main.py`` in a different path, you can set trial code directory to ``Path(__file__).parent``.\n(`__file__ <https://docs.python.org/3.10/reference/datamodel.html#index-43>`__\nis only available in standard Python, not in Jupyter Notebook.)\n\n.. attention::\n\n If you are using Linux system without Conda,\n you may need to change ``\"python model.py\"`` to ``\"python3 model.py\"``.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure search space\n\n"
] ]
}, },
{ {
...@@ -101,7 +108,7 @@ ...@@ -101,7 +108,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Choose a tuning algorithm.\nHere we use :doc:`TPE tuner </hpo/tuners>`.\n\n" "### Configure tuning algorithm\nHere we use :doc:`TPE tuner </hpo/tuners>`.\n\n"
] ]
}, },
{ {
...@@ -119,7 +126,7 @@ ...@@ -119,7 +126,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Specify how many trials to run.\nHere we evaluate 10 sets of hyperparameters in total, and concurrently evaluate 4 sets at a time.\n\nPlease note that ``max_trial_number`` here is merely for a quick example.\nWith default config TPE tuner requires 20 trials to warm up.\nIn real world max trial number is commonly set to 100+.\n\nYou can also set ``max_experiment_duration = '1h'`` to limit running time.\n\nAnd alternatively, you can skip this part and set no limit at all.\nThe experiment will run forever until you press Ctrl-C.\n\n" "### Configure how many trials to run\nHere we evaluate 10 sets of hyperparameters in total, and concurrently evaluate 2 sets at a time.\n\n"
] ]
}, },
{ {
...@@ -130,14 +137,21 @@ ...@@ -130,14 +137,21 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"experiment.config.max_trial_number = 10\nexperiment.config.trial_concurrency = 4" "experiment.config.max_trial_number = 10\nexperiment.config.trial_concurrency = 2"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Step 4: Run the experiment\nNow the experiment is ready. Choose a port and launch it.\n\nYou can use the web portal to view experiment status: http://localhost:8080.\n\n" "<div class=\"alert alert-info\"><h4>Note</h4><p>``max_trial_number`` is set to 10 here for a fast example.\n In real world it should be set to a larger number.\n With default config TPE tuner requires 20 trials to warm up.</p></div>\n\nYou may also set ``max_experiment_duration = '1h'`` to limit running time.\n\nIf neither ``max_trial_number`` nor ``max_experiment_duration`` are set,\nthe experiment will run forever until you press Ctrl-C.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 4: Run the experiment\nNow the experiment is ready. Choose a port and launch it. (Here we use port 8080.)\n\nYou can use the web portal to view experiment status: http://localhost:8080.\n\n"
] ]
}, },
{ {
...@@ -150,6 +164,31 @@ ...@@ -150,6 +164,31 @@
"source": [ "source": [
"experiment.run(8080)" "experiment.run(8080)"
] ]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## After the experiment is done\nEverything is done and it is safe to exit now. The following are optional.\n\nIf you are using standard Python instead of Jupyter Notebook,\nyou can add ``input()`` or ``signal.pause()`` to prevent Python from exiting,\nallowing you to view the web portal after the experiment is done.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# input('Press enter to quit')\nexperiment.stop()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
":meth:`nni.experiment.Experiment.stop` is automatically invoked when Python exits,\nso it can be omitted in your code.\n\nAfter the experiment is stopped, you can run :meth:`nni.experiment.Experiment.view` to restart web portal.\n\n.. tip::\n\n This example uses :doc:`Python API </reference/experiment>` to create experiment.\n\n You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`.\n\n"
]
} }
], ],
"metadata": { "metadata": {
...@@ -168,7 +207,7 @@ ...@@ -168,7 +207,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.2" "version": "3.10.3"
} }
}, },
"nbformat": 4, "nbformat": 4,
......
...@@ -3,12 +3,14 @@ NNI HPO Quickstart with PyTorch ...@@ -3,12 +3,14 @@ NNI HPO Quickstart with PyTorch
=============================== ===============================
This tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning. This tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning.
There is also a :doc:`TensorFlow version<../hpo_quickstart_tensorflow/main>` if you prefer it.
The tutorial consists of 4 steps: The tutorial consists of 4 steps:
1. Modify the model for auto-tuning. 1. Modify the model for auto-tuning.
2. Define hyperparameters' search space. 2. Define hyperparameters' search space.
3. Configure the experiment. 3. Configure the experiment.
4. Run the experiment. 4. Run the experiment.
.. _official PyTorch quickstart: https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html .. _official PyTorch quickstart: https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
""" """
...@@ -16,7 +18,7 @@ The tutorial consists of 4 steps: ...@@ -16,7 +18,7 @@ The tutorial consists of 4 steps:
# %% # %%
# Step 1: Prepare the model # Step 1: Prepare the model
# ------------------------- # -------------------------
# In first step, you need to prepare the model to be tuned. # In first step, we need to prepare the model to be tuned.
# #
# The model should be put in a separate script. # The model should be put in a separate script.
# It will be evaluated many times concurrently, # It will be evaluated many times concurrently,
...@@ -24,6 +26,12 @@ The tutorial consists of 4 steps: ...@@ -24,6 +26,12 @@ The tutorial consists of 4 steps:
# #
# In this tutorial, the model is defined in :doc:`model.py <model>`. # In this tutorial, the model is defined in :doc:`model.py <model>`.
# #
# In short, it is a PyTorch model with 3 additional API calls:
#
# 1. Use :func:`nni.get_next_parameter` to fetch the hyperparameters to be evaluated.
# 2. Use :func:`nni.report_intermediate_result` to report per-epoch accuracy metrics.
# 3. Use :func:`nni.report_final_result` to report final accuracy.
#
# Please understand the model code before continue to next step. # Please understand the model code before continue to next step.
# %% # %%
...@@ -69,46 +77,81 @@ experiment = Experiment('local') ...@@ -69,46 +77,81 @@ experiment = Experiment('local')
# %% # %%
# Now we start to configure the experiment. # Now we start to configure the experiment.
# #
# Firstly, specify the model code. # Configure trial code
# ^^^^^^^^^^^^^^^^^^^^
# In NNI evaluation of each hyperparameter set is called a *trial*. # In NNI evaluation of each hyperparameter set is called a *trial*.
# So the model script is called *trial code*. # So the model script is called *trial code*.
#
# If you are using Linux system without Conda, you many need to change ``python`` to ``python3``.
#
# When ``trial_code_directory`` is a relative path, it relates to current working directory.
# To run ``main.py`` from a different path, you can set trial code directory to ``Path(__file__).parent``.
experiment.config.trial_command = 'python model.py' experiment.config.trial_command = 'python model.py'
experiment.config.trial_code_directory = '.' experiment.config.trial_code_directory = '.'
# %%
# When ``trial_code_directory`` is a relative path, it relates to current working directory.
# To run ``main.py`` in a different path, you can set trial code directory to ``Path(__file__).parent``.
# (`__file__ <https://docs.python.org/3.10/reference/datamodel.html#index-43>`__
# is only available in standard Python, not in Jupyter Notebook.)
#
# .. attention::
#
# If you are using Linux system without Conda,
# you may need to change ``"python model.py"`` to ``"python3 model.py"``.
# %% # %%
# Then specify the search space we defined above: # Configure search space
# ^^^^^^^^^^^^^^^^^^^^^^
experiment.config.search_space = search_space experiment.config.search_space = search_space
# %% # %%
# Choose a tuning algorithm. # Configure tuning algorithm
# ^^^^^^^^^^^^^^^^^^^^^^^^^^
# Here we use :doc:`TPE tuner </hpo/tuners>`. # Here we use :doc:`TPE tuner </hpo/tuners>`.
experiment.config.tuner.name = 'TPE' experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize' experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
# %% # %%
# Specify how many trials to run. # Configure how many trials to run
# Here we evaluate 10 sets of hyperparameters in total, and concurrently evaluate 4 sets at a time. # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Here we evaluate 10 sets of hyperparameters in total, and concurrently evaluate 2 sets at a time.
experiment.config.max_trial_number = 10
experiment.config.trial_concurrency = 2
# %%
# .. note::
# #
# Please note that ``max_trial_number`` here is merely for a quick example. # ``max_trial_number`` is set to 10 here for a fast example.
# In real world it should be set to a larger number.
# With default config TPE tuner requires 20 trials to warm up. # With default config TPE tuner requires 20 trials to warm up.
# In real world max trial number is commonly set to 100+.
# #
# You can also set ``max_experiment_duration = '1h'`` to limit running time. # You may also set ``max_experiment_duration = '1h'`` to limit running time.
# #
# And alternatively, you can skip this part and set no limit at all. # If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# The experiment will run forever until you press Ctrl-C. # the experiment will run forever until you press Ctrl-C.
experiment.config.max_trial_number = 10
experiment.config.trial_concurrency = 4
# %% # %%
# Step 4: Run the experiment # Step 4: Run the experiment
# -------------------------- # --------------------------
# Now the experiment is ready. Choose a port and launch it. # Now the experiment is ready. Choose a port and launch it. (Here we use port 8080.)
# #
# You can use the web portal to view experiment status: http://localhost:8080. # You can use the web portal to view experiment status: http://localhost:8080.
experiment.run(8080) experiment.run(8080)
# %%
# After the experiment is done
# ----------------------------
# Everything is done and it is safe to exit now. The following are optional.
#
# If you are using standard Python instead of Jupyter Notebook,
# you can add ``input()`` or ``signal.pause()`` to prevent Python from exiting,
# allowing you to view the web portal after the experiment is done.
# input('Press enter to quit')
experiment.stop()
# %%
# :meth:`nni.experiment.Experiment.stop` is automatically invoked when Python exits,
# so it can be omitted in your code.
#
# After the experiment is stopped, you can run :meth:`nni.experiment.Experiment.view` to restart web portal.
#
# .. tip::
#
# This example uses :doc:`Python API </reference/experiment>` to create experiment.
#
# You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`.
264c7f7dffbb756f8a6aa675d4e792e4 f3498812ae89cde34b6f0f54216012fd
\ No newline at end of file \ No newline at end of file