Commit 9f73153f authored by zhanggzh's avatar zhanggzh
Browse files

add dtk24.04 code

parent eb77376e
**How To** - Customize Your Own Advisor
=======================================
*Warning: API is subject to change in future releases.*
Advisor targets the scenario that the automl algorithm wants the methods of both tuner and assessor. Advisor is similar to tuner on that it receives trial parameters request, final results, and generate trial parameters. Also, it is similar to assessor on that it receives intermediate results, trial's end state, and could send trial kill command. Note that, if you use Advisor, tuner and assessor are not allowed to be used at the same time.
If a user want to implement a customized Advisor, she/he only needs to:
**1. Define an Advisor inheriting from the MsgDispatcherBase class.** For example:
.. code-block:: python
from nni.runtime.msg_dispatcher_base import MsgDispatcherBase
class CustomizedAdvisor(MsgDispatcherBase):
def __init__(self, ...):
...
**2. Implement the methods with prefix "handle_" except "handle_request""**
You might find `docs <../autotune_ref.rst#Advisor>`__ for ``MsgDispatcherBase`` helpful.
**3. Configure your customized Advisor in experiment YAML config file.**
Similar to tuner and assessor. NNI needs to locate your customized Advisor class and instantiate the class, so you need to specify the location of the customized Advisor class and pass literal values as parameters to the ``__init__`` constructor.
.. code-block:: yaml
advisor:
codeDir: /home/abc/myadvisor
classFileName: my_customized_advisor.py
className: CustomizedAdvisor
# Any parameter need to pass to your advisor class __init__ constructor
# can be specified in this optional classArgs field, for example
classArgs:
arg1: value1
**Note that** The working directory of your advisor is ``<home>/nni-experiments/<experiment_id>/log``, which can be retrieved with environment variable ``NNI_LOG_DIRECTORY``.
Example
-------
Here we provide an :githublink:`example <examples/tuners/mnist_keras_customized_advisor>`.
DNGO Tuner
==========
Usage
-----
Installation
^^^^^^^^^^^^
classArgs requirements
^^^^^^^^^^^^^^^^^^^^^^
* **optimize_mode** (*'maximize' or 'minimize'*) - If 'maximize', the tuner will target to maximize metrics. If 'minimize', the tuner will target to minimize metrics.
* **sample_size** (*int, default = 1000*) - Number of samples to select in each iteration. The best one will be picked from the samples as the next trial.
* **trials_per_update** (*int, default = 20*) - Number of trials to collect before updating the model.
* **num_epochs_per_training** (*int, default = 500*) - Number of epochs to train DNGO model.
Example Configuration
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: yaml
# config.yml
tuner:
name: DNGOTuner
classArgs:
optimize_mode: maximize
Naive Evolution Tuner
=====================
Naive Evolution comes from `Large-Scale Evolution of Image Classifiers <https://arxiv.org/pdf/1703.01041.pdf>`__. It randomly initializes a population based on the search space. For each generation, it chooses better ones and does some mutation (e.g., changes a hyperparameter, adds/removes one layer, etc.) on them to get the next generation. Naive Evolution requires many trials to works but it's very simple and it's easily expanded with new features.
Usage
-----
classArgs Requirements
^^^^^^^^^^^^^^^^^^^^^^
*
**optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will try to maximize metrics. If 'minimize', the tuner will try to minimize metrics.
*
**population_size** (*int value (should > 0), optional, default = 20*) - the initial size of the population (trial num) in the evolution tuner. It's suggested that ``population_size`` be much larger than ``concurrency`` so users can get the most out of the algorithm (and at least ``concurrency``, or the tuner will fail on its first generation of parameters).
Example Configuration
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: yaml
# config.yml
tuner:
name: Evolution
classArgs:
optimize_mode: maximize
population_size: 100
GP Tuner
========
Bayesian optimization works by constructing a posterior distribution of functions (a Gaussian Process) that best describes the function you want to optimize. As the number of observations grows, the posterior distribution improves, and the algorithm becomes more certain of which regions in parameter space are worth exploring and which are not.
GP Tuner is designed to minimize/maximize the number of steps required to find a combination of parameters that are close to the optimal combination. To do so, this method uses a proxy optimization problem (finding the maximum of the acquisition function) that, albeit still a hard problem, is cheaper (in the computational sense) to solve, and it's amenable to common tools. Therefore, Bayesian Optimization is suggested for situations where sampling the function to be optimized is very expensive.
Note that the only acceptable types within the search space are ``randint``, ``uniform``, ``quniform``, ``loguniform``, ``qloguniform``, and numerical ``choice``.
This optimization approach is described in Section 3 of `Algorithms for Hyper-Parameter Optimization <https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf>`__.
Usage
-----
classArgs requirements
^^^^^^^^^^^^^^^^^^^^^^
* **optimize_mode** (*'maximize' or 'minimize', optional, default = 'maximize'*) - If 'maximize', the tuner will try to maximize metrics. If 'minimize', the tuner will try to minimize metrics.
* **utility** (*'ei', 'ucb' or 'poi', optional, default = 'ei'*) - The utility function (acquisition function). 'ei', 'ucb', and 'poi' correspond to 'Expected Improvement', 'Upper Confidence Bound', and 'Probability of Improvement', respectively.
* **kappa** (*float, optional, default = 5*) - Used by the 'ucb' utility function. The bigger ``kappa`` is, the more exploratory the tuner will be.
* **xi** (*float, optional, default = 0*) - Used by the 'ei' and 'poi' utility functions. The bigger ``xi`` is, the more exploratory the tuner will be.
* **nu** (*float, optional, default = 2.5*) - Used to specify the Matern kernel. The smaller nu, the less smooth the approximated function is.
* **alpha** (*float, optional, default = 1e-6*) - Used to specify the Gaussian Process Regressor. Larger values correspond to an increased noise level in the observations.
* **cold_start_num** (*int, optional, default = 10*) - Number of random explorations to perform before the Gaussian Process. Random exploration can help by diversifying the exploration space.
* **selection_num_warm_up** (*int, optional, default = 1e5*) - Number of random points to evaluate when getting the point which maximizes the acquisition function.
* **selection_num_starting_points** (*int, optional, default = 250*) - Number of times to run L-BFGS-B from a random starting point after the warmup.
Example Configuration
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: yaml
# config.yml
tuner:
name: GPTuner
classArgs:
optimize_mode: maximize
utility: 'ei'
kappa: 5.0
xi: 0.0
nu: 2.5
alpha: 1e-6
cold_start_num: 10
selection_num_warm_up: 100000
selection_num_starting_points: 250
This diff is collapsed.
Metis Tuner
===========
`Metis <https://www.microsoft.com/en-us/research/publication/metis-robustly-tuning-tail-latencies-cloud-systems/>`__ offers several benefits over other tuning algorithms. While most tools only predict the optimal configuration, Metis gives you two outputs, a prediction for the optimal configuration and a suggestion for the next trial. No more guess work!
While most tools assume training datasets do not have noisy data, Metis actually tells you if you need to resample a particular hyper-parameter.
While most tools have problems of being exploitation-heavy, Metis' search strategy balances exploration, exploitation, and (optional) resampling.
Metis belongs to the class of sequential model-based optimization (SMBO) algorithms and it is based on the Bayesian Optimization framework. To model the parameter-vs-performance space, Metis uses both a Gaussian Process and GMM. Since each trial can impose a high time cost, Metis heavily trades inference computations with naive trials. At each iteration, Metis does two tasks:
*
It finds the global optimal point in the Gaussian Process space. This point represents the optimal configuration.
*
It identifies the next hyper-parameter candidate. This is achieved by inferring the potential information gain of exploration, exploitation, and resampling.
Note that the only acceptable types within the search space are ``quniform``, ``uniform``, ``randint``, and numerical ``choice``.
More details can be found in our `paper <https://www.microsoft.com/en-us/research/publication/metis-robustly-tuning-tail-latencies-cloud-systems/>`__.
Usage
-----
classArgs requirements
^^^^^^^^^^^^^^^^^^^^^^
* **optimize_mode** (*'maximize' or 'minimize', optional, default = 'maximize'*) - If 'maximize', the tuner will try to maximize metrics. If 'minimize', the tuner will try to minimize metrics.
Example Configuration
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: yaml
# config.yml
tuner:
name: MetisTuner
classArgs:
optimize_mode: maximize
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
############
Installation
############
Currently we support installation on Linux, Mac and Windows. We also allow you to use docker.
.. toctree::
:maxdepth: 2
Linux & Mac <Tutorial/InstallationLinux>
Windows <Tutorial/InstallationWin>
Use Docker <Tutorial/HowToUseDocker>
\ No newline at end of file
:orphan:
Python API Reference
====================
.. autosummary::
:toctree: _modules
:recursive:
nni
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment