Neural Architecture Search with Retiarii (Alpha)
================================================

*This is a pre-release; its interfaces may be subject to minor changes. The roadmap of this feature is: experimental in V2.0 -> alpha version in V2.1 -> beta version in V2.2 -> official release in V2.3. Feel free to give us your comments and suggestions.*

`Retiarii <https://www.usenix.org/system/files/osdi20-zhang_quanlu.pdf>`__ is a new framework to support neural architecture search and hyper-parameter tuning. It allows users to express various search spaces with high flexibility, to reuse many SOTA search algorithms, and to leverage system-level optimizations to speed up the search process. This framework provides the following new user experiences.

* Search space can be expressed directly in user model code. A tuning space can be expressed while defining a model.
* Neural architecture candidates and hyper-parameter candidates are supported in a more user-friendly way within an experiment.
* The experiment can be launched directly from Python code.

.. Note:: `Our previous NAS framework <../Overview.rst>`__ is still supported for now, but will be migrated to the Retiarii framework in V2.3.

.. contents::

There are mainly three crucial components for a neural architecture search task, namely,

* A model search space that defines the set of models to explore.
* A search strategy that is used as the method to explore the search space.
* A model evaluator that reports the performance of a given model.
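
As a toy illustration of how these three components fit together, here is a minimal pure-Python random-search loop. All names (``explore``, the dict-based search space, the lambda evaluator) are hypothetical, for illustration only — Retiarii's actual search space, strategies, and evaluators are introduced in the sections below.

```python
import random

def explore(search_space, evaluate, budget=20):
    """Toy exploration loop: sample a model configuration from the search
    space, score it with the evaluator, and keep the best one seen."""
    best_config, best_score = None, float('-inf')
    for _ in range(budget):
        config = {name: random.choice(candidates)
                  for name, candidates in search_space.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Toy search space and evaluator: reward a larger hidden size and kernel size 3.
space = {'hidden_dim': [32, 64, 128], 'kernel_size': [1, 3, 5]}
best, score = explore(space, lambda cfg: cfg['hidden_dim'] - 10 * abs(cfg['kernel_size'] - 3))
```

In a real experiment the evaluator trains and validates the sampled model, which is why each sample is expensive and the choice of strategy matters.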

.. note:: Currently, PyTorch is the only framework supported by Retiarii, and we have only tested it with **PyTorch 1.6 and 1.7**. This documentation assumes a PyTorch context, but it should also apply to other frameworks, which is in our future plan.

Define your Model Space
-----------------------

A model space is defined by users to express a set of models that they want to explore, which contains potentially good-performing models. In this framework, a model space is defined with two parts: a base model and possible mutations on the base model.

Define Base Model
^^^^^^^^^^^^^^^^^

Defining a base model is almost the same as defining a PyTorch (or TensorFlow) model. Usually, you only need to replace the code ``import torch.nn as nn`` with ``import nni.retiarii.nn.pytorch as nn`` to use our wrapped PyTorch modules.

Below is a very simple example of defining a base model; it is almost the same as defining a PyTorch model.

.. code-block:: python

  import torch.nn.functional as F
  import nni.retiarii.nn.pytorch as nn
  from nni.retiarii import basic_unit  # for the @basic_unit decorator

  @basic_unit
  class BasicBlock(nn.Module):
    def __init__(self, const):
      super().__init__()
      self.const = const

    def forward(self, x):
      return x + self.const

  class ConvPool(nn.Module):
    def __init__(self):
      super().__init__()
      self.conv = nn.Conv2d(32, 1, 5)  # possibly mutate this conv
      self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x):
      return self.pool(self.conv(x))

  class Model(nn.Module):
    def __init__(self):
      super().__init__()
      self.convpool = ConvPool()
      self.mymodule = BasicBlock(2.)
    def forward(self, x):
      return F.relu(self.convpool(self.mymodule(x)))

The above example also shows how to use ``@basic_unit``. ``@basic_unit`` decorates a user-defined module to tell Retiarii that there will be no mutation within this module, so Retiarii can treat it as a basic unit (i.e., as a blackbox). It is useful when (1) users want to mutate the initialization parameters of this module, or (2) Retiarii fails to parse this module due to complex control flow (e.g., ``for``, ``while``). A more detailed description of ``@basic_unit`` can be found `here <./Advanced.rst>`__.

Users can refer to :githublink:`Darts base model <test/retiarii_test/darts/darts_model.py>` and :githublink:`Mnasnet base model <test/retiarii_test/mnasnet/base_mnasnet.py>` for more complicated examples.

Define Model Mutations
^^^^^^^^^^^^^^^^^^^^^^

A base model is only one concrete model, not a model space. We provide APIs and primitives for users to express how the base model can be mutated, i.e., a model space which includes many models.

We provide the APIs shown below for users to easily express possible mutations after defining a base model. The APIs can be used just like PyTorch modules. This approach is also called inline mutations.

* ``nn.LayerChoice``. It allows users to put several candidate operations (e.g., PyTorch modules), one of which is chosen in each explored model. Note that if a candidate is a user-defined module, it should be decorated as a `basic unit <./Advanced.rst>`__ with ``@basic_unit``. In the following example, ``ops.PoolBN`` and ``ops.SepConv`` should be decorated.

  .. code-block:: python

    # import nni.retiarii.nn.pytorch as nn
    # declared in `__init__` method
    self.layer = nn.LayerChoice([
      ops.PoolBN('max', channels, 3, stride, 1),
      ops.SepConv(channels, channels, 3, stride, 1),
      nn.Identity()
    ])
    # invoked in `forward` method
    out = self.layer(x)

* ``nn.InputChoice``. It is mainly for choosing (or trying) different connections. It takes several tensors and chooses ``n_chosen`` tensors from them.

  .. code-block:: python

    # import nni.retiarii.nn.pytorch as nn
    # declared in `__init__` method
    self.input_switch = nn.InputChoice(n_chosen=1)
    # invoked in `forward` method, choose one from the three
    out = self.input_switch([tensor1, tensor2, tensor3])

* ``nn.ValueChoice``. It is for choosing one value from some candidate values. It can only be used as an input argument of basic units, that is, modules in ``nni.retiarii.nn.pytorch`` and user-defined modules decorated with ``@basic_unit``.

  .. code-block:: python

    # import nni.retiarii.nn.pytorch as nn
    # used in `__init__` method
    self.conv = nn.Conv2d(XX, XX, kernel_size=nn.ValueChoice([1, 3, 5]))
    self.op = MyOp(nn.ValueChoice([0, 1]), nn.ValueChoice([-1, 1]))

All the APIs have an optional argument called ``label``; mutations with the same label will share the same choice. A typical example is,

  .. code-block:: python

    self.net = nn.Sequential(
        nn.Linear(10, nn.ValueChoice([32, 64, 128], label='hidden_dim')),
        nn.Linear(nn.ValueChoice([32, 64, 128], label='hidden_dim'), 3)
    )
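
The sharing semantics can be sketched in plain Python. This is a toy model for illustration only (the ``resolve_choices`` helper is hypothetical, not part of Retiarii): declared choices are grouped by label, and a single decision per label is applied to every occurrence.

```python
def resolve_choices(declared, decisions):
    """declared: list of (label, candidates); decisions: dict label -> chosen index.
    Every choice sharing a label receives the same decision."""
    return [candidates[decisions[label]] for label, candidates in declared]

# Mirrors the example above: both Linear layers reference label 'hidden_dim'.
declared = [('hidden_dim', [32, 64, 128]),   # out_features of the first Linear
            ('hidden_dim', [32, 64, 128])]   # in_features of the second Linear
resolved = resolve_choices(declared, {'hidden_dim': 1})
# both occurrences resolve to the same value, so the two layers always agree
```

This is why the two ``nn.Linear`` layers above can never end up with mismatched dimensions.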

Detailed API description and usage can be found `here <./ApiReference.rst>`__. Examples of using these APIs can be found in :githublink:`Darts base model <test/retiarii_test/darts/darts_model.py>`. We are actively enriching the set of inline mutations to make it easier to express a new search space.

If the inline mutation APIs are not enough for your scenario, you can refer to `defining model space using mutators <./Advanced.rst#express-mutations-with-mutators>`__ to write more complex model spaces.

Explore the Defined Model Space
-------------------------------

There are basically two exploration approaches: (1) search by evaluating each sampled model independently, and (2) one-shot weight-sharing based search. We demonstrate the first approach in this tutorial. Users can refer to `here <./OneshotTrainer.rst>`__ for the second approach.

Users can choose a proper search strategy to explore the model space, and use a chosen or user-defined model evaluator to evaluate the performance of each sampled model.

Choose a search strategy
^^^^^^^^^^^^^^^^^^^^^^^^

Retiarii currently supports the following search strategies:

* Grid search: enumerates all the possible models defined in the space.
* Random: randomly picks models from the search space.
* Regularized evolution: a genetic algorithm that explores the space based on inheritance and mutation.
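
To make the third strategy concrete, here is a minimal pure-Python sketch of regularized (aging) evolution on a toy problem. It is a simplification for illustration, not NNI's implementation; all names here are hypothetical.

```python
import collections
import random

def regularized_evolution(init, mutate, evaluate,
                          population_size=20, sample_size=5, cycles=200):
    """Aging evolution: each cycle, the best of a random sample is the parent,
    its mutated child joins the population, and the oldest member is removed."""
    population = collections.deque()
    for _ in range(population_size):
        model = init()
        population.append((evaluate(model), model))
    history = list(population)
    for _ in range(cycles):
        sample = random.sample(list(population), sample_size)
        _, parent = max(sample, key=lambda entry: entry[0])
        child = mutate(parent)
        entry = (evaluate(child), child)
        population.append(entry)
        population.popleft()  # aging: the oldest dies, regardless of fitness
        history.append(entry)
    return max(history, key=lambda entry: entry[0])

# Toy problem: maximize the number of ones in a 12-bit string.
def init():
    return [random.randint(0, 1) for _ in range(12)]

def mutate(bits):
    child = list(bits)
    i = random.randrange(len(child))
    child[i] ^= 1  # flip one random bit
    return child

best_score, best_bits = regularized_evolution(init, mutate, sum)
```

In NAS, ``init`` samples an architecture from the model space, ``mutate`` changes one choice, and ``evaluate`` trains and validates the model.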

Choosing (i.e., instantiating) a search strategy is very easy. An example is as follows,

.. code-block:: python

  import nni.retiarii.strategy as strategy

  search_strategy = strategy.Random(dedup=True)  # dedup=False if deduplication is not wanted

Detailed descriptions and usages of available strategies can be found `here <./ApiReference.rst>`__ .

Choose or write a model evaluator
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the NAS process, the search strategy repeatedly generates new models. A model evaluator is for training and validating each generated model. The obtained performance of a generated model is collected and sent to the search strategy for generating better models.

The model evaluator should correctly identify the use case of the model and the optimization goal. For example, on a classification task, an <input, label> dataset is needed, the loss function could be cross entropy, and the optimized metric could be accuracy. On a regression task, the optimized metric could be mean squared error.
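
To pin down these three quantities, they can be written in a few lines of plain Python (toy definitions for illustration, not the implementations Retiarii uses internally):

```python
import math

def cross_entropy(predicted_probs, true_label):
    """Classification loss for one example: negative log-probability of the true label."""
    return -math.log(predicted_probs[true_label])

def accuracy(predicted_labels, true_labels):
    """Classification metric: fraction of correct predictions."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)

def mean_squared_error(predictions, targets):
    """Regression metric: average squared difference."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)
```

The search strategy only sees the reported metric, so choosing the right one for the task directly shapes which models are considered "better".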

In the context of PyTorch, Retiarii provides two built-in model evaluators designed for simple use cases: classification and regression. These two evaluators are built upon the awesome library PyTorch-Lightning.

The example below creates a simple evaluator that runs on the MNIST dataset, trains for 10 epochs, and reports its validation accuracy.

.. code-block:: python

  import nni.retiarii.evaluator.pytorch.lightning as pl
  from nni.retiarii import serialize
  from torchvision import transforms
  from torchvision.datasets import MNIST

  transform = serialize(transforms.Compose, [serialize(transforms.ToTensor()), serialize(transforms.Normalize, (0.1307,), (0.3081,))])
  train_dataset = serialize(MNIST, root='data/mnist', train=True, download=True, transform=transform)
  test_dataset = serialize(MNIST, root='data/mnist', train=False, download=True, transform=transform)
  evaluator = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
                                val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
                                max_epochs=10)

As the model evaluator runs in another process (possibly on some remote machines), the defined evaluator, along with all its parameters, needs to be correctly serialized. For example, users should use the dataloader that has already been wrapped as a serializable class defined in ``nni.retiarii.evaluator.pytorch.lightning``. For the arguments used in a dataloader, recursive serialization needs to be done until the arguments are simple types like int, str, and float.
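
The idea of recursive serialization can be sketched in plain Python. This is a toy model with a hypothetical ``to_spec`` helper, not the actual ``serialize`` API: a constructor call is recorded as plain data, and the recording only succeeds when every argument bottoms out in simple types.

```python
import json

def to_spec(type_name, *args, **kwargs):
    """Record a constructor call as plain data that can be shipped to another
    process. json.dumps fails unless every argument is a simple type or itself
    plain data, which is why nested objects must be serialized recursively."""
    spec = {'type': type_name, 'args': list(args), 'kwargs': kwargs}
    json.dumps(spec)  # raises TypeError on non-serializable arguments
    return spec

# Mirrors the transform pipeline above: every level of the tree is plain data.
normalize = to_spec('Normalize', [0.1307], [0.3081])
transform = to_spec('Compose', [to_spec('ToTensor'), normalize])
```

Passing a raw object (e.g., an unwrapped transform instance) at any level would make the whole tree unserializable, which is exactly the failure ``serialize`` guards against.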

Detailed descriptions and usages of model evaluators can be found `here <./ApiReference.rst>`__ .

If the built-in model evaluators do not meet your requirement, or you already wrote the training code and just want to use it, you can follow `the guide to write a new evaluator <./WriteTrainer.rst>`__ .

.. note:: In case you want to run the model evaluator locally for debugging purposes, you can directly run the evaluator via ``evaluator._execute(Net)`` (note that it has to be ``Net``, not ``Net()``). However, this API is currently internal and subject to change.

.. warning:: Mutations on the parameters of a model evaluator (known as hyper-parameter tuning) are currently not supported but will be supported in the future.

.. warning:: To use PyTorch-Lightning with Retiarii, currently you need to install PyTorch-Lightning v1.1.x (v1.2 is not supported).

Launch an Experiment
--------------------

After all the above are prepared, it is time to start an experiment to do the model search. An example is shown below.

.. code-block:: python

  exp = RetiariiExperiment(base_model, evaluator, None, search_strategy)
  exp_config = RetiariiExeConfig('local')
  exp_config.experiment_name = 'mnasnet_search'
  exp_config.trial_concurrency = 2
  exp_config.max_trial_number = 10
  exp_config.training_service.use_active_gpu = False
  exp.run(exp_config, 8081)

The complete code of a simple MNIST example can be found :githublink:`here <test/retiarii_test/mnist/test.py>`.

Visualize the Experiment
------------------------

Users can visualize their experiment in the same way as visualizing a normal hyper-parameter tuning experiment. For example, open ``localhost:8081`` in your browser, where 8081 is the port you set in ``exp.run``. Please refer to `here <../../Tutorial/WebUI.rst>`__ for details.