Exploration Strategy
====================

There are two types of model space exploration approaches: **Multi-trial strategy** and **One-shot strategy**. Once the model space has been constructed, users can use either approach to explore it.

* :ref:`Multi-trial strategy <multi-trial-nas>` trains each sampled model in the model space independently.
* :ref:`One-shot strategy <one-shot-nas>` samples the model from a super model.

Here is the list of exploration strategies that NNI supports.

.. list-table::
   :header-rows: 1
   :widths: auto

   * - Name
     - Category
     - Brief Description
   * - :class:`Random <nni.retiarii.strategy.Random>`
     - :ref:`Multi-trial <multi-trial-nas>`
     - Randomly sample an architecture each time
   * - :class:`GridSearch <nni.retiarii.strategy.GridSearch>`
     - :ref:`Multi-trial <multi-trial-nas>`
     - Traverse the search space and try all possibilities
   * - :class:`RegularizedEvolution <nni.retiarii.strategy.RegularizedEvolution>`
     - :ref:`Multi-trial <multi-trial-nas>`
     - Evolution algorithm for NAS. `Reference <https://arxiv.org/abs/1802.01548>`__
   * - :class:`TPE <nni.retiarii.strategy.TPE>`
     - :ref:`Multi-trial <multi-trial-nas>`
     - Tree-structured Parzen Estimator (TPE). `Reference <https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf>`__
   * - :class:`PolicyBasedRL <nni.retiarii.strategy.PolicyBasedRL>`
     - :ref:`Multi-trial <multi-trial-nas>`
     - Policy-based reinforcement learning, based on the implementation of Tianshou. `Reference <https://arxiv.org/abs/1611.01578>`__
   * - :class:`DARTS <nni.retiarii.strategy.DARTS>`
     - :ref:`One-shot <one-shot-nas>`
     - Continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. `Reference <https://arxiv.org/abs/1806.09055>`__
   * - :class:`ENAS <nni.retiarii.strategy.ENAS>`
     - :ref:`One-shot <one-shot-nas>`
     - RL controller learns to generate the best network on a super-net. `Reference <https://arxiv.org/abs/1802.03268>`__
   * - :class:`GumbelDARTS <nni.retiarii.strategy.GumbelDARTS>`
     - :ref:`One-shot <one-shot-nas>`
     - Choose the best block by using Gumbel Softmax random sampling and differentiable training. `Reference <https://arxiv.org/abs/1812.03443>`__
   * - :class:`RandomOneShot <nni.retiarii.strategy.RandomOneShot>`
     - :ref:`One-shot <one-shot-nas>`
     - Train a super-net with uniform path sampling. `Reference <https://arxiv.org/abs/1904.00420>`__
   * - :class:`Proxyless <nni.retiarii.strategy.Proxyless>`
     - :ref:`One-shot <one-shot-nas>`
     - A low-memory-consuming optimized version of differentiable architecture search. `Reference <https://arxiv.org/abs/1812.00332>`__

.. _multi-trial-nas:

Multi-trial strategy
--------------------

In multi-trial NAS, each model sampled from the model space is trained independently; a typical example is `NASNet <https://arxiv.org/abs/1707.07012>`__. Multi-trial NAS requires a model evaluator to measure the performance of each sampled model, and an exploration strategy to sample models from the defined model space. Users can use an NNI-provided model evaluator or write their own, and can simply pick one of the built-in exploration strategies. Advanced users can also customize a new exploration strategy.

To use an exploration strategy, users simply instantiate an exploration strategy and pass the instantiated object to :class:`~nni.retiarii.experiment.pytorch.RetiariiExperiment`. Below is a simple example.

.. code-block:: python

   import nni.retiarii.strategy as strategy
   exploration_strategy = strategy.Random(dedup=True)

Rather than using :class:`strategy.Random <nni.retiarii.strategy.Random>`, users can choose one of the strategies from the table above.
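
As an aside, ``dedup=True`` above asks the strategy to avoid sampling the same architecture twice. Conceptually, deduplicated random sampling works like the following plain-Python sketch (an illustration only, not NNI's implementation; the toy search space is made up for demonstration):

```python
import random

# A toy search space: each key is a choice point, each value its candidates.
# Purely hypothetical, for illustration only.
search_space = {
    'conv_size': [3, 5, 7],
    'num_filters': [16, 32],
    'activation': ['relu', 'gelu'],
}

def random_search(space, budget, dedup=True):
    """Sample up to `budget` architectures, optionally skipping duplicates."""
    seen = set()
    samples = []
    while len(samples) < budget:
        arch = {k: random.choice(v) for k, v in space.items()}
        key = tuple(sorted(arch.items()))
        if dedup and key in seen:
            continue  # resample instead of evaluating the same model twice
        seen.add(key)
        samples.append(arch)
    return samples

archs = random_search(search_space, budget=5)
assert len(archs) == 5
assert len({tuple(sorted(a.items())) for a in archs}) == 5  # all distinct
```

Each sampled architecture would then be handed to the model evaluator; smarter strategies use the reported metrics to decide what to sample next, while random search simply ignores them.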

.. _one-shot-nas:

One-shot strategy
-----------------

One-shot NAS algorithms leverage weight sharing among models in the search space to train a supernet, and use this supernet to guide the selection of better models. Compared to independently training each model from scratch (which we call "multi-trial NAS"), this type of algorithm greatly reduces the computational resources required.
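
To make the weight-sharing idea concrete: differentiable one-shot methods such as DARTS replace a categorical choice among candidate operations with a softmax-weighted mixture, so the supernet can be trained end-to-end with gradient descent. Below is a minimal plain-Python sketch of that mixing step; the candidate operations and the alpha values are made up for illustration, and this is not NNI's implementation:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations at one choice point in the supernet.
candidate_ops = [
    lambda x: x,         # identity / skip connection
    lambda x: 2.0 * x,   # stand-in for a "conv" op
    lambda x: 0.0,       # "zero" op, effectively pruning the edge
]

# Architecture parameters (alpha); in DARTS these are learned by gradient descent.
alpha = [0.1, 1.5, -0.5]

def mixed_op(x, alpha, ops):
    """Continuous relaxation: softmax-weighted sum of all candidate outputs."""
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, ops))

y = mixed_op(1.0, alpha, candidate_ops)
assert 1.0 < y < 2.0  # dominated by the highest-weighted op (2.0 * x)

# After training, the final architecture keeps only the highest-weighted op.
best = max(range(len(alpha)), key=lambda i: alpha[i])
assert best == 1
```

Because every candidate contributes to the forward pass, all of them share the supernet's training signal; exporting the final architecture then reduces to an argmax over the learned alphas.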

Starting from v2.8, the usage of one-shot strategies is much like that of multi-trial strategies: simply create a strategy and run :class:`~nni.retiarii.experiment.pytorch.RetiariiExperiment`. Because one-shot strategies manipulate the training recipe, the evaluator must be one of the :ref:`PyTorch-Lightning evaluators <lightning-evaluator>`, either built-in or customized. Last but not least, don't forget to set the execution engine to ``oneshot``. For example:

.. code-block:: python

   import nni.retiarii.strategy as strategy
   import nni.retiarii.evaluator.pytorch.lightning as pl
   evaluator = pl.Classification(...)
   exploration_strategy = strategy.DARTS()

   exp_config.execution_engine = 'oneshot'

One-shot strategies support only a limited set of :ref:`mutation-primitives`, and do not support :doc:`customizing mutators <mutator>` at all. See the :ref:`reference <one-shot-strategy-reference>` for the detailed support list of each algorithm.

.. versionadded:: 2.8

   One-shot strategies are now compatible with `Lightning accelerators <https://pytorch-lightning.readthedocs.io/en/stable/accelerators/gpu.html>`__. This means you can accelerate one-shot strategies on hardware such as multiple GPUs. To enable this feature, simply pass the keyword arguments that used to be set in ``pytorch_lightning.Trainer`` to your evaluator. See :doc:`this reference </reference/nas/evaluator>` for more details.

One-shot strategy (legacy)
--------------------------

.. warning::

   .. deprecated:: 2.8

      The following usages are deprecated and will be removed in future releases. If you intend to use them, the references can be found :doc:`here </deprecated/oneshot_legacy>`.

The usage of the legacy one-shot NAS strategies is a little different from multi-trial strategies. A legacy one-shot strategy is implemented with a special type of object named *Trainer*. Following the common practice of one-shot NAS, a *Trainer* trains the super-net and searches for the optimal architecture in a single run. For example:

.. code-block:: python

   from nni.retiarii.oneshot.pytorch import DartsTrainer

   trainer = DartsTrainer(
      model=model,
      loss=criterion,
      metrics=lambda output, target: accuracy(output, target, topk=(1,)),
      optimizer=optim,
      dataset=dataset_train,
      batch_size=32,
      log_frequency=50
   )
   trainer.fit()

One-shot strategy can be used without :class:`~nni.retiarii.experiment.pytorch.RetiariiExperiment`. Thus, the ``trainer.fit()`` here runs the experiment locally.

After ``trainer.fit()`` completes, we can use ``trainer.export()`` to export the searched architecture (a dict of choices) to a file.

.. code-block:: python

   import json

   final_architecture = trainer.export()
   print('Final architecture:', final_architecture)
   json.dump(final_architecture, open('checkpoint.json', 'w'))

.. tip:: The trained super-net (neither the weights nor the exported JSON) can't be used directly. It's only an intermediate result used for deriving the final architecture. The exported architecture (which can be applied with :meth:`nni.retiarii.fixed_arch`) needs to be *retrained* with a standard training recipe to obtain the final model.
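
The exported dict simply maps each choice point in the model space to the selected candidate. Freezing a search space with such a dict, which is what :meth:`nni.retiarii.fixed_arch` does for real NNI model spaces, conceptually amounts to the sketch below; the search space and exported choices here are hypothetical, for illustration only:

```python
# A toy search space and an exported architecture (dict of choices), both made up.
search_space = {
    'layer1': {'conv3x3': '3x3 convolution', 'conv5x5': '5x5 convolution'},
    'layer2': {'maxpool': 'max pooling', 'avgpool': 'average pooling'},
}
exported = {'layer1': 'conv5x5', 'layer2': 'avgpool'}  # e.g. loaded from checkpoint.json

def freeze(space, choices):
    """Replace each choice point with the single candidate named in `choices`."""
    return {name: candidates[choices[name]] for name, candidates in space.items()}

fixed_model = freeze(search_space, exported)
assert fixed_model == {'layer1': '5x5 convolution', 'layer2': 'average pooling'}
```

The frozen model contains no remaining choice points, so it can be retrained from scratch with a standard training recipe.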