Neural Architecture Search with Retiarii (Experimental)
=======================================================

`Retiarii <https://www.usenix.org/system/files/osdi20-zhang_quanlu.pdf>`__ is a new framework to support neural architecture search and hyper-parameter tuning. It allows users to express various search spaces with high flexibility, to reuse many SOTA search algorithms, and to leverage system-level optimizations to speed up the search process. This framework provides the following new user experiences:

* Search space can be expressed directly in user model code; a tuning space can be expressed while defining a model.
* Neural architecture candidates and hyper-parameter candidates are supported more natively in an experiment.
* The experiment can be launched directly from Python code.

*We are working on migrating* `our previous NAS framework <../Overview.rst>`__ *to the Retiarii framework. Thus, this feature is still experimental. We recommend users try the new framework and provide valuable feedback for us to improve it. The old framework is still supported for now.*

.. contents::

There are mainly two steps to start an experiment for your neural architecture search task. First, define the model space you want to explore. Second, choose a search method to explore your defined model space.

Define your Model Space
-----------------------

A model space is defined by users to express a set of models that they want to explore and believe to include good-performing models. In this framework, a model space is defined with two parts: a base model and possible mutations on the base model.

Define Base Model
^^^^^^^^^^^^^^^^^

Defining a base model is almost the same as defining a PyTorch (or TensorFlow) model. There are only two small differences.

* Replace the code ``import torch.nn as nn`` with ``import nni.retiarii.nn.pytorch as nn`` for PyTorch modules, such as ``nn.Conv2d``, ``nn.ReLU``.
* Some **user-defined** modules should be decorated with ``@basic_unit``. For example, user-defined modules used in ``LayerChoice`` should be decorated. Users can refer to `here <#serialize-module>`__ for detailed usage instructions of ``@basic_unit``.

Below is a very simple example of defining a base model; it is almost the same as defining a PyTorch model.

.. code-block:: python

  import torch.nn.functional as F
  import nni.retiarii.nn.pytorch as nn

  class MyModule(nn.Module):
    def __init__(self):
      super().__init__()
      self.conv = nn.Conv2d(32, 1, 5)
      self.pool = nn.MaxPool2d(kernel_size=2)
    def forward(self, x):
      return self.pool(self.conv(x))

  class Model(nn.Module):
    def __init__(self):
      super().__init__()
      self.mymodule = MyModule()
    def forward(self, x):
      return F.relu(self.mymodule(x))

Users can refer to :githublink:`Darts base model <test/retiarii_test/darts/darts_model.py>` and :githublink:`Mnasnet base model <test/retiarii_test/mnasnet/base_mnasnet.py>` for more complicated examples.

Define Model Mutations
^^^^^^^^^^^^^^^^^^^^^^

A base model is only one concrete model, not a model space. We provide APIs and primitives for users to express how the base model can be mutated, i.e., a model space that includes many models.

**Express mutations in an inlined manner**

For easy usability and backward compatibility, we provide some APIs for users to easily express possible mutations after defining a base model. These APIs can be used just like PyTorch modules.

* ``nn.LayerChoice``. It allows users to provide several candidate operations (e.g., PyTorch modules), one of which is chosen in each explored model. *Note that if the candidate is a user-defined module, it should be decorated as a `serialize module <#serialize-module>`__. In the following example, ``ops.PoolBN`` and ``ops.SepConv`` should be decorated.*

  .. code-block:: python

    # import nni.retiarii.nn.pytorch as nn
    # declared in `__init__`
    self.layer = nn.LayerChoice([
      ops.PoolBN('max', channels, 3, stride, 1),
      ops.SepConv(channels, channels, 3, stride, 1),
      nn.Identity()
    ])
    # invoked in `forward` function
    out = self.layer(x)

* ``nn.InputChoice``. It is mainly for choosing (or trying) different connections. It takes several tensors and chooses ``n_chosen`` tensors from them.

  .. code-block:: python

    # import nni.retiarii.nn.pytorch as nn
    # declared in `__init__`
    self.input_switch = nn.InputChoice(n_chosen=1)
    # invoked in `forward` function, choose one from the three
    out = self.input_switch([tensor1, tensor2, tensor3])

* ``nn.ValueChoice``. It is for choosing one value from some candidate values. It can only be used as an input argument of the modules in ``nn.modules`` and of ``@basic_unit``-decorated user-defined modules.

  .. code-block:: python

    # import nni.retiarii.nn.pytorch as nn
    # used in `__init__`
    self.conv = nn.Conv2d(XX, XX, kernel_size=nn.ValueChoice([1, 3, 5]))
    self.op = MyOp(nn.ValueChoice([0, 1]), nn.ValueChoice([-1, 1]))

Detailed API description and usage can be found `here <./ApiReference.rst>`__\. Examples of using these APIs can be found in :githublink:`Darts base model <test/retiarii_test/darts/darts_model.py>`.

**Express mutations with mutators**

Though easy to use, inline mutations have limited expressiveness: some model spaces cannot be expressed. To improve expressiveness and flexibility, we provide primitives for users to write a *Mutator*, which expresses how they want to mutate the base model more flexibly. A mutator stands above the base model and thus has full ability to edit the model.

Users can instantiate several mutators as below; the mutators will be applied to the base model sequentially, one after another, to sample a new model.

.. code-block:: python

  applied_mutators = []
  applied_mutators.append(BlockMutator('mutable_0'))
  applied_mutators.append(BlockMutator('mutable_1'))
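
The sequential application described above can be sketched in plain Python. The following is an illustrative mock only, not NNI's implementation; the dict-based "model" and the simplified ``Mutator``/``BlockMutator`` classes are stand-ins:

.. code-block:: python

  import random

  class Mutator:
    # simplified stand-in for nni.retiarii.Mutator
    def choice(self, candidates):
      return random.choice(candidates)

  class BlockMutator(Mutator):
    def __init__(self, target, candidates):
      self.target = target
      self.candidates = candidates

    def mutate(self, model):
      # pick one candidate operation for the targeted block
      model[self.target] = self.choice(self.candidates)

  base_model = {'mutable_0': None, 'mutable_1': None}
  applied_mutators = [
    BlockMutator('mutable_0', ['conv3x3', 'conv5x5']),
    BlockMutator('mutable_1', ['maxpool', 'avgpool']),
  ]

  # applying the mutators one after another samples one concrete model
  sampled = dict(base_model)
  for mutator in applied_mutators:
    mutator.mutate(sampled)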

``BlockMutator`` is defined by users to express how to mutate the base model. A user-defined mutator should inherit the ``Mutator`` class and implement the mutation logic in the member function ``mutate``.

.. code-block:: python

  from typing import List

  from nni.retiarii import Mutator

  class BlockMutator(Mutator):
    def __init__(self, target: str, candidates: List):
      super().__init__()
      self.target = target
      self.candidate_op_list = candidates

    def mutate(self, model):
      nodes = model.get_nodes_by_label(self.target)
      for node in nodes:
        chosen_op = self.choice(self.candidate_op_list)
        node.update_operation(chosen_op.type, chosen_op.params)

The input of ``mutate`` is the graph IR of the base model (please refer to `here <./ApiReference.rst>`__ for the format and APIs of the IR). Users can mutate the graph with its member functions (e.g., ``get_nodes_by_label``, ``update_operation``). The mutation operations can be combined with the API ``self.choice`` to express a set of possible mutations. In the above example, the node's operation can be changed to any operation from ``candidate_op_list``.

Use a placeholder to make mutation easier: ``nn.Placeholder``. If you want to mutate a subgraph or node of your model, you can define a placeholder in the model to represent that subgraph or node. Then, use a mutator to replace the placeholder with real modules.

.. code-block:: python

  ph = nn.Placeholder(
    label='mutable_0',
    kernel_size_options=[1, 3, 5],
    n_layer_options=[1, 2, 3, 4],
    exp_ratio=exp_ratio,
    stride=stride
  )

``label`` is used by the mutator to identify this placeholder. The other parameters are the information required by the mutator. They can be accessed from ``node.operation.parameters`` as a dict, which can include any information that users want to pass to the user-defined mutator. The complete example code can be found in :githublink:`Mnasnet base model <test/retiarii_test/mnasnet/base_mnasnet.py>`.
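
To illustrate how a mutator might consume these parameters, here is a hedged, pure-Python mock; the ``Node`` and ``Operation`` classes below are simplified stand-ins for the real graph IR, not NNI's implementation:

.. code-block:: python

  import random

  class Operation:
    def __init__(self, parameters):
      self.parameters = parameters  # the dict carried by the placeholder

  class Node:
    def __init__(self, label, parameters):
      self.label = label
      self.operation = Operation(parameters)

  # a node produced from the `mutable_0` placeholder above
  node = Node('mutable_0', {
    'kernel_size_options': [1, 3, 5],
    'n_layer_options': [1, 2, 3, 4],
  })

  # a mutator can read the options and sample concrete values
  kernel_size = random.choice(node.operation.parameters['kernel_size_options'])
  n_layer = random.choice(node.operation.parameters['n_layer_options'])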

Explore the Defined Model Space
-------------------------------

After the model space is defined, it is time to explore it. Users can choose a proper exploration strategy and model evaluator to explore the model space.

Create an Evaluator and Exploration Strategy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Classic search approach:**
In this approach, the model evaluator is for training and testing each explored model, while the strategy is for sampling the models. Both an evaluator and a strategy are required to explore the model space. We recommend PyTorch Lightning to write the full evaluation process.

**Oneshot (weight-sharing) search approach:**
In this approach, users only need a oneshot trainer, because this trainer takes charge of search, training, and testing.

In the following table, we list the available evaluators, strategies, and oneshot trainers.

.. list-table::
  :header-rows: 1
  :widths: auto

  * - Evaluator
    - Strategy
    - Oneshot Trainer
  * - Classification
    - TPEStrategy
    - DartsTrainer
  * - Regression
    - Random
    - EnasTrainer
  * -
    - GridSearch
    - ProxylessTrainer
  * -
    - RegularizedEvolution
    - SinglePathTrainer (RandomTrainer)

Their usage and API documentation can be found `here <./ApiReference.rst>`__\.

Here is a simple example of using an evaluator.

.. code-block:: python

  import nni.retiarii.evaluator.pytorch.lightning as pl
  from nni.retiarii import serialize
  from torchvision import transforms
  from torchvision.datasets import MNIST

  transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
  train_dataset = serialize(MNIST, root='data/mnist', train=True, download=True, transform=transform)
  test_dataset = serialize(MNIST, root='data/mnist', train=False, download=True, transform=transform)
  lightning = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
                                val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
                                max_epochs=10)

.. Note:: For NNI to capture the dataset and dataloader and distribute them across different runs, please wrap your dataset with ``serialize`` and use ``pl.DataLoader`` instead of ``torch.utils.data.DataLoader``. See the ``basic_unit`` section below for details.
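
Conceptually, ``serialize`` remembers the class and its constructor arguments so that an identical object can be re-created in each trial process. The following is a rough, hypothetical sketch of the idea (``SerializedSpec`` and ``FakeDataset`` are made up for illustration; this is not NNI's actual implementation):

.. code-block:: python

  class SerializedSpec:
    # hypothetical stand-in illustrating the idea behind ``serialize``
    def __init__(self, cls, **kwargs):
      self._cls = cls
      self._kwargs = kwargs

    def instantiate(self):
      # each trial process can rebuild an identical object from the recorded spec
      return self._cls(**self._kwargs)

  class FakeDataset:
    # stand-in for a real dataset class such as torchvision's MNIST
    def __init__(self, root, train=True, download=False):
      self.root = root
      self.train = train

  spec = SerializedSpec(FakeDataset, root='data/mnist', train=False)
  dataset = spec.instantiate()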

Users can refer to the `API reference <./ApiReference.rst>`__ for detailed usage of evaluators, to "`write a trainer <./WriteTrainer.rst>`__" for how to write a new trainer, and to `this document <./WriteStrategy.rst>`__ for how to write a new strategy.

Set up an Experiment
^^^^^^^^^^^^^^^^^^^^

After all the above are prepared, it is time to start an experiment to do the model search. We designed a unified interface for users to start their experiment. An example is shown below.

.. code-block:: python

  exp = RetiariiExperiment(base_model, trainer, applied_mutators, simple_strategy)
  exp_config = RetiariiExeConfig('local')
  exp_config.experiment_name = 'mnasnet_search'
  exp_config.trial_concurrency = 2
  exp_config.max_trial_number = 10
  exp_config.training_service.use_active_gpu = False
  exp.run(exp_config, 8081)

This code starts an NNI experiment. Note that if inline mutation is used, ``applied_mutators`` should be ``None``.

The complete code of a simple MNIST example can be found :githublink:`here <test/retiarii_test/mnist/test.py>`.

Visualize your experiment
^^^^^^^^^^^^^^^^^^^^^^^^^

Users can visualize their experiment in the same way as visualizing a normal hyper-parameter tuning experiment. For example, open ``localhost:8081`` in your browser, where 8081 is the port you set in ``exp.run``. Please refer to `here <../../Tutorial/WebUI.rst>`__ for details. If users are using a oneshot trainer, they can refer to `here <../Visualization.rst>`__ for how to visualize their experiments.

Export the best model found in your experiment
----------------------------------------------

If you are using the *classic search approach*, you can simply find the best model from the WebUI.

If you are using the *oneshot (weight-sharing) search approach*, you can invoke ``exp.export_top_models`` to output the several best models found in the experiment.

Advanced and FAQ
----------------

.. _serialize-module:

**Serialize Module**

To understand the decorator ``basic_unit``, we first briefly explain how our framework works: it converts a user-defined model to a graph representation (called graph IR), where each instantiated module is converted to a subgraph. Then user-defined mutations are applied to the graph to generate new graphs. Each new graph is then converted back to PyTorch code and executed. ``@basic_unit`` means that the module will be converted to a single graph node rather than a subgraph; that is, the module will not be unfolded. Users should/can decorate a user-defined module class in the following cases:

* When a module class cannot be successfully converted to a subgraph due to implementation issues. For example, our framework currently does not support ad-hoc loops; if there is an ad-hoc loop in a module's ``forward``, this class should be decorated as a serializable module. The following ``MyModule`` should be decorated.

  .. code-block:: python

    @basic_unit
    class MyModule(nn.Module):
      def __init__(self):
        ...
      def forward(self, x):
        for i in range(10): # <- adhoc loop
          ...

* The candidate ops in ``LayerChoice`` should be decorated as serializable modules. For example, in ``self.op = nn.LayerChoice([Op1(...), Op2(...), Op3(...)])``, ``Op1``, ``Op2``, and ``Op3`` should be decorated if they are user-defined modules.
* When users want to use ``ValueChoice`` in a module's input arguments, the module should be decorated as a serializable module. For example, in ``self.conv = MyConv(kernel_size=nn.ValueChoice([1, 3, 5]))``, ``MyConv`` should be decorated.
* If no mutation targets a module, this module *can be* decorated as a serializable module.
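
The unfold-versus-single-node distinction described above can be sketched as follows. This is a conceptual mock of the idea, not NNI's graph conversion code (``module_to_ir`` and its dict output are made up for illustration):

.. code-block:: python

  def module_to_ir(name, submodule_names, is_basic_unit):
    # conceptual mock: a decorated module stays one opaque graph node,
    # while an undecorated module is unfolded into a subgraph of its submodules
    if is_basic_unit:
      return {'node': name}
    return {'subgraph': [{'node': s} for s in submodule_names]}

  unfolded = module_to_ir('MyModule', ['conv', 'pool'], is_basic_unit=False)
  single_node = module_to_ir('MyModule', ['conv', 'pool'], is_basic_unit=True)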