Unverified Commit 8acddd0c authored by Yuge Zhang's avatar Yuge Zhang Committed by GitHub

Update NAS reference (#4682)

parent e8b88a79
...@@ -44,6 +44,15 @@ ValueChoice
:members:
:inherited-members: Module
.. _nas-model-parameter-choice:
ModelParameterChoice
^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.retiarii.nn.pytorch.ModelParameterChoice
:members:
:inherited-members: Module
.. _nas-repeat:
Repeat
...@@ -60,8 +69,6 @@ Cell
.. autoclass:: nni.retiarii.nn.pytorch.Cell
:members:
.. footbibliography::
.. _nas-cell-101:
NasBench101Cell
...@@ -70,8 +77,6 @@ NasBench101Cell
.. autoclass:: nni.retiarii.nn.pytorch.NasBench101Cell
:members:
.. footbibliography::
.. _nas-cell-201:
NasBench201Cell
...@@ -80,8 +85,6 @@ NasBench201Cell
.. autoclass:: nni.retiarii.nn.pytorch.NasBench201Cell
:members:
.. footbibliography::
.. _hyper-modules:
Hyper-module Library (experimental)
......
...@@ -56,8 +56,6 @@ TPE
:members:
:noindex:
.. footbibliography::
.. _policy-based-rl-strategy:
PolicyBasedRL
...@@ -67,8 +65,6 @@ PolicyBasedRL
:members:
:noindex:
.. footbibliography::
.. _one-shot-nas:
One-shot strategy
......
...@@ -10,9 +10,8 @@ Retiarii for Neural Architecture Search
exploration_strategy
evaluator
advanced_usage
reference
.. attention:: NNI's latest NAS support is based on the Retiarii framework. Users who are still on the `early version using NNI NAS v1.0 <https://nni.readthedocs.io/en/v2.2/nas.html>`__ should migrate their work to Retiarii as soon as possible. We plan to remove the legacy NAS framework in the next few releases.
.. note:: PyTorch is the **only supported framework on Retiarii**. Inquiries about NAS support on TensorFlow are in `this discussion <https://github.com/microsoft/nni/discussions/4605>`__. If you intend to run NAS with DL frameworks other than PyTorch and TensorFlow, please `open new issues <https://github.com/microsoft/nni/issues>`__ to let us know.
......
Retiarii API Reference
======================
nni.retiarii
------------
.. automodule:: nni.retiarii
:imported-members:
:members:
nni.retiarii.codegen
--------------------
.. automodule:: nni.retiarii.codegen
:imported-members:
:members:
nni.retiarii.converter
----------------------
.. automodule:: nni.retiarii.converter
:imported-members:
:members:
nni.retiarii.evaluator
----------------------
.. automodule:: nni.retiarii.evaluator
:imported-members:
:members:
.. automodule:: nni.retiarii.evaluator.pytorch
:imported-members:
:members:
:exclude-members: Trainer, DataLoader
.. autoclass:: nni.retiarii.evaluator.pytorch.Trainer
.. autoclass:: nni.retiarii.evaluator.pytorch.DataLoader
nni.retiarii.execution
----------------------
.. automodule:: nni.retiarii.execution
:imported-members:
:members:
:undoc-members:
nni.retiarii.experiment.pytorch
-------------------------------
.. automodule:: nni.retiarii.experiment.pytorch
:members:
nni.retiarii.nn.pytorch
-----------------------
Please refer to:
* :doc:`construct_space`.
* :doc:`mutator`.
* `torch.nn reference <https://pytorch.org/docs/stable/nn.html>`_.
nni.retiarii.oneshot
--------------------
.. automodule:: nni.retiarii.oneshot
:imported-members:
:members:
nni.retiarii.operation_def
--------------------------
.. automodule:: nni.retiarii.operation_def
:imported-members:
:members:
nni.retiarii.strategy
---------------------
.. automodule:: nni.retiarii.strategy
:imported-members:
:members:
nni.retiarii.utils
------------------
.. automodule:: nni.retiarii.utils
:members:
.. footbibliography::
...@@ -3,3 +3,101 @@ Neural Architecture Search
nni.retiarii
------------
.. automodule:: nni.retiarii
:imported-members:
:members:
nni.retiarii.codegen
--------------------
.. automodule:: nni.retiarii.codegen
:imported-members:
:members:
nni.retiarii.converter
----------------------
.. automodule:: nni.retiarii.converter
:imported-members:
:members:
nni.retiarii.evaluator
----------------------
.. automodule:: nni.retiarii.evaluator
:imported-members:
:members:
.. automodule:: nni.retiarii.evaluator.pytorch
:imported-members:
:members:
:exclude-members: Trainer, DataLoader
.. autoclass:: nni.retiarii.evaluator.pytorch.Trainer
.. autoclass:: nni.retiarii.evaluator.pytorch.DataLoader
nni.retiarii.execution
----------------------
.. automodule:: nni.retiarii.execution
:imported-members:
:members:
:undoc-members:
nni.retiarii.experiment.pytorch
-------------------------------
.. automodule:: nni.retiarii.experiment.pytorch
:members:
nni.retiarii.nn.pytorch
-----------------------
.. automodule:: nni.retiarii.nn.pytorch.api
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.component
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.hypermodule
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.mutation_utils
:imported-members:
:members:
nni.retiarii.oneshot
--------------------
.. automodule:: nni.retiarii.oneshot
:imported-members:
:members:
nni.retiarii.operation_def
--------------------------
.. automodule:: nni.retiarii.operation_def
:imported-members:
:members:
nni.retiarii.strategy
---------------------
.. automodule:: nni.retiarii.strategy
:imported-members:
:members:
nni.retiarii.utils
------------------
.. automodule:: nni.retiarii.utils
:members:
...@@ -818,7 +818,7 @@ class GraphConverterWithShape(GraphConverter):
def convert_to_graph(script_module, module, converter=None, **kwargs):
"""
Convert module to our graph IR, i.e., build a :class:`Model` type.
Parameters
----------
......
...@@ -75,12 +75,12 @@ class Model:
"""
Represents a neural network model.
During mutation, one :class:`Model` object is created for each trainable snapshot.
For example, consider a mutator that inserts a node at an edge in each iteration.
In one iteration, the mutator invokes 4 primitives: add node, remove edge, add edge to head, add edge to tail.
These 4 primitives operate on one :class:`Model` object.
When they are all done, the model will be set to "frozen" (trainable) status and be submitted to the execution engine.
Then a new iteration starts, and a new :class:`Model` object is created by forking the last model.
Attributes
----------
...@@ -89,7 +89,7 @@ class Model:
python_init_params
Initialization parameters of the Python class.
status
See :class:`ModelStatus`.
root_graph
The outermost graph, which usually takes the dataset as input and feeds its output to the loss function.
graphs
...@@ -98,11 +98,11 @@ class Model:
Model evaluator.
history
Mutation history.
``self`` is directly mutated from ``self.history[-1]``;
``self.history[-1]`` is mutated from ``self.history[-2]``, and so on.
``self.history[0]`` is the base graph.
metric
Training result of the model, or ``None`` if it's not yet trained or has failed to train.
intermediate_metrics
Intermediate training metrics. If the model is not trained, it's an empty list.
""" """
...@@ -262,9 +262,9 @@ class Graph:
Graph topology.
This class simply represents the topology, with no semantic meaning.
All other information like metrics, non-graph functions, mutation history, etc. should go to :class:`Model`.
Each graph belongs to one and only one :class:`Model`.
Attributes
----------
...@@ -281,15 +281,15 @@ class Graph:
output_names
Optional mnemonic names of output values.
input_node
Incoming node.
output_node
Output node.
hidden_nodes
Hidden nodes.
nodes
All input/output/hidden nodes.
edges
Edges.
python_name
The name of the ``torch.nn.Module``; should have a one-to-one mapping with items in the Python model.
""" """
...@@ -529,16 +529,16 @@ class Node:
"""
An operation or an opaque subgraph inside a graph.
Each node belongs to one and only one :class:`Graph`.
Nodes should never be created with the constructor. Use :meth:`Graph.add_node` instead.
The node itself is for topology only.
Information of tensor calculation should all go inside the ``operation`` attribute.
TODO: parameter of subgraph (cell)
It's easy to assign parameters on a cell node, but it's hard to "use" them.
We need to design a way to reference stored cell parameters in inner node operations,
e.g., ``self.fc = Linear(self.units)`` <- how to express ``self.units`` in the IR?
Attributes
----------
...@@ -554,10 +554,10 @@ class Node:
label
Optional. If two nodes have the same label, they are considered the same by the mutator.
operation
Operation.
cell
Read-only shortcut to get the referenced subgraph.
If this node is not a subgraph (i.e., is a primitive operation), accessing ``cell`` will raise an error.
predecessors
Predecessor nodes of this node in the graph. This is an optional mutation helper.
successors
...@@ -674,15 +674,15 @@ class Edge:
"""
A tensor, or "data flow", between two nodes.
Example forward code snippet: ::

    a, b, c = split(x)
    p = concat(a, c)
    q = sum(b, p)
    z = relu(q)

Edges in the above snippet: ::

    + head: (split, 0), tail: (concat, 0)    # a in concat
    + head: (split, 2), tail: (concat, 1)    # c in concat
    + head: (split, 1), tail: (sum, -1 or 0) # b in sum
...@@ -692,18 +692,18 @@ class Edge:
Attributes
----------
graph
Graph.
head
Head node.
tail
Tail node.
head_slot
Index of outputs in the head node.
If the node has only one output, this should be ``null``.
tail_slot
Index of inputs in the tail node.
If the node has only one input, this should be ``null``.
If the node does not care about order, this can be ``-1``.
"""
def __init__(self, head: EdgeEndpoint, tail: EdgeEndpoint, _internal: bool = False):
......
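The edge bookkeeping in the :class:`Edge` docstring above can be written down as plain tuples to check the slot semantics (an illustrative sketch, not the real :class:`Edge` API):

```python
# The edges of the docstring's snippet, written as plain
# (head, head_slot, tail, tail_slot) tuples; ``None`` plays the role of
# ``null`` for single-output / single-input nodes (illustrative only).
edges = [
    ("split", 0, "concat", 0),     # a in concat
    ("split", 2, "concat", 1),     # c in concat
    ("split", 1, "sum", 0),        # b in sum (slot -1 would also be accepted)
    ("concat", None, "sum", 1),    # p in sum; concat has a single output
    ("sum", None, "relu", None),   # q in relu; single output, single input
]

# Recover the ordered inputs of ``sum`` from the tail slots: b first, then p.
sum_inputs = sorted((e for e in edges if e[2] == "sum"), key=lambda e: e[3])
assert [e[0] for e in sum_inputs] == ["split", "concat"]
```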
...@@ -34,8 +34,14 @@ class Cell(nn.Module):
"""
Cell structure that is popularly used in NAS literature.
Find the details in:

* `Neural Architecture Search with Reinforcement Learning <https://arxiv.org/abs/1611.01578>`__
* `Learning Transferable Architectures for Scalable Image Recognition <https://arxiv.org/abs/1707.07012>`__
* `DARTS: Differentiable Architecture Search <https://arxiv.org/abs/1806.09055>`__

`On Network Design Spaces for Visual Recognition <https://arxiv.org/abs/1905.13214>`__
is a good summary of how this structure works in practice.
A cell consists of multiple "nodes". Each node is a sum of multiple operators. Each operator is chosen from
``op_candidates``, and takes one input from previous nodes and predecessors. Predecessor means the input of the cell.
...@@ -45,7 +51,10 @@ class Cell(nn.Module):
.. list-table::
:widths: 25 75
:header-rows: 1

* - Name
- Brief Description
* - Cell
- A cell consists of several nodes.
* - Node
...@@ -111,15 +120,20 @@ class Cell(nn.Module):
--------
Choose between conv2d and maxpool2d.
The cell has 4 nodes, 1 op per node, and 2 predecessors.

>>> cell = nn.Cell([nn.Conv2d(32, 32, 3), nn.MaxPool2d(3)], 4, 1, 2)

In forward:

>>> cell([input1, input2])

Use ``merge_op`` to specify how to construct the output.
The output will then have a dynamic shape, depending on which inputs have been used in the cell.

>>> cell = nn.Cell([nn.Conv2d(32, 32, 3), nn.MaxPool2d(3)], 4, 1, 2, merge_op='loose_end')

The op candidates can be a callable that accepts the node index in the cell, the op index in the node, and the input index.

>>> cell = nn.Cell([
...     lambda node_index, op_index, input_index: nn.Conv2d(32, 32, 3, stride=2 if input_index < 1 else 1),
... ], 4, 1, 2)
......
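The node-sum dataflow that the :class:`Cell` docstring describes can be sketched in pure Python (a toy stand-in with hypothetical helper names; the real ``nn.Cell`` builds torch modules instead):

```python
# A pure-Python sketch of the cell dataflow described above: each node sums
# ``ops_per_node`` operator outputs, each operator reading one earlier value
# (a predecessor or a previous node). Hypothetical names, not the real API.
def toy_cell(predecessors, num_nodes, ops_per_node, op, choose_input):
    values = list(predecessors)              # cell inputs come first
    for node_index in range(num_nodes):
        node_sum = 0
        for op_index in range(ops_per_node):
            source = choose_input(node_index, op_index, len(values))
            node_sum += op(values[source])
        values.append(node_sum)
    # "all" merge: return every node output (the real cell would concat them)
    return values[len(predecessors):]

# 2 predecessors, 2 nodes, 1 op per node; the op doubles its input, and each
# op always reads the most recently produced value.
out = toy_cell([1, 2], 2, 1, lambda v: 2 * v, lambda n, o, k: k - 1)
assert out == [4, 8]
```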
...@@ -147,7 +147,7 @@ class NasBench201Cell(nn.Module):
"""
Cell structure that is proposed in NAS-Bench-201.
Proposed by `NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search <https://arxiv.org/abs/2001.00326>`__.
This cell is a densely connected DAG with ``num_tensors`` nodes, where each node is a tensor.
For every i < j, there is an edge from the i-th node to the j-th node.
......
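The dense connectivity described above can be enumerated directly; a short sketch (illustrative, not the NNI implementation):

```python
# Dense DAG connectivity as described above: one edge (i, j) for every pair
# i < j, each edge carrying one candidate operator (illustrative sketch).
def dense_dag_edges(num_tensors):
    return [(i, j) for i in range(num_tensors) for j in range(i + 1, num_tensors)]

edges = dense_dag_edges(4)        # a 4-tensor cell, as in the NAS-Bench-201 paper
assert len(edges) == 6            # 4 * 3 / 2 pairs with i < j
assert (0, 3) in edges and (3, 0) not in edges
```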
...@@ -222,14 +222,16 @@ binary_modules = ['BinaryAdd', 'BinaryMul', 'BinaryMinus', 'BinaryDivide', 'Bina
class AutoActivation(nn.Module):
"""
This module is an implementation of the paper `Searching for Activation Functions <https://arxiv.org/abs/1710.05941>`__.
Parameters
----------
unit_num : int
The number of core units.
Notes
-----
Current ``beta`` is not a per-channel parameter.
""" """
def __init__(self, unit_num: int = 1, label: str = None):
super().__init__()
......
...@@ -221,7 +221,7 @@ class NasBench101Cell(Mutable):
"""
Cell structure that is proposed in NAS-Bench-101.
Proposed by `NAS-Bench-101: Towards Reproducible Neural Architecture Search <http://proceedings.mlr.press/v97/ying19a/ying19a.pdf>`__.
This cell is usually used in the evaluation of NAS algorithms, because there is a "comprehensive analysis" of this search space
available, which includes a full architecture dataset that "maps 423k unique architectures to metrics
......
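The NAS-Bench-101 search space restricts each cell to a small DAG (at most 7 nodes and 9 edges, encoded as an upper-triangular adjacency matrix). A hedged sketch of such a validity check, with illustrative names:

```python
# A sketch of the NAS-Bench-101 cell encoding (illustrative names, not NNI's
# code): a cell is a DAG given by an upper-triangular adjacency matrix,
# limited to 7 nodes and 9 edges in the benchmark's search space.
MAX_NODES, MAX_EDGES = 7, 9

def is_valid_spec(matrix):
    n = len(matrix)
    if n > MAX_NODES:
        return False
    # Edges may only go from a lower-index node to a higher-index one (acyclic).
    if any(matrix[i][j] and i >= j for i in range(n) for j in range(n)):
        return False
    return sum(map(sum, matrix)) <= MAX_EDGES

chain = [
    [0, 1, 0],   # input -> op
    [0, 0, 1],   # op -> output
    [0, 0, 0],
]
assert is_valid_spec(chain)
assert not is_valid_spec([[0, 0], [1, 0]])   # back edge: not a DAG
```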
...@@ -23,7 +23,9 @@ class PolicyBasedRL(BaseStrategy):
"""
Algorithm for policy-based reinforcement learning.
This is a wrapper of algorithms provided in tianshou (PPO by default),
and can be easily customized with other algorithms that inherit ``BasePolicy``
(e.g., `REINFORCE <https://link.springer.com/content/pdf/10.1007/BF00992696.pdf>`__
as in `this paper <https://arxiv.org/abs/1611.01578>`__).
Parameters
----------
...@@ -33,7 +35,8 @@ class PolicyBasedRL(BaseStrategy):
How many trials (trajectories) the collector collects each time.
After each collect, the trainer samples a batch from the replay buffer and does the update. Default: 20.
policy_fn : function
Takes :class:`ModelEvaluationEnv` as input and returns a policy.
See :meth:`PolicyBasedRL._default_policy_fn` for an example.
"""
def __init__(self, max_collect: int = 100, trial_per_collect = 20,
......
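The policy-gradient idea behind this strategy can be shown on a toy two-armed bandit (a self-contained sketch only; the real strategy wraps tianshou's PPO). The update below uses the *expected* REINFORCE gradient, so the run is deterministic:

```python
import math

# A self-contained policy-gradient sketch on a toy two-armed bandit
# (illustrative only; not the tianshou-backed implementation).
logits = [0.0, 0.0]
rewards = [0.2, 0.8]          # arm 1 yields the higher reward
lr = 0.5

def softmax(xs):
    exps = [math.exp(x - max(xs)) for x in xs]
    return [e / sum(exps) for e in exps]

for _ in range(200):
    probs = softmax(logits)
    for action in (0, 1):
        # grad of log pi(action) w.r.t. logits: one_hot(action) - probs;
        # weight by the probability of taking the action and its reward.
        for i in (0, 1):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * probs[action] * rewards[action] * grad

assert softmax(logits)[1] > 0.9   # the policy now strongly prefers arm 1
```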
...@@ -43,7 +43,8 @@ class TPE(BaseStrategy):
"""
The Tree-structured Parzen Estimator (TPE) is a sequential model-based optimization (SMBO) approach.
Find the details in
`Algorithms for Hyper-Parameter Optimization <https://papers.nips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf>`__.
SMBO methods sequentially construct models to approximate the performance of hyperparameters based on historical measurements,
and then choose new hyperparameters to test based on this model.
......
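The SMBO loop described above can be sketched in a heavily simplified TPE style (illustrative only, not NNI's tuner): split observations into "good" and "bad" halves by loss, model each side with a kernel density, and propose the candidate maximizing the density ratio l(x) / g(x):

```python
import math
import random

# A heavily simplified, 1-D TPE-style loop (illustrative sketch only).
def kde(xs, x, bandwidth=0.1):
    # Unnormalized Gaussian kernel density; fine for comparing ratios.
    return sum(math.exp(-((x - xi) / bandwidth) ** 2) for xi in xs) / len(xs)

def tpe_suggest(history, n_candidates=50, gamma=0.25):
    ordered = sorted(history, key=lambda pair: pair[1])   # ascending loss
    split = max(1, int(gamma * len(ordered)))
    good = [x for x, _ in ordered[:split]]
    bad = [x for x, _ in ordered[split:]]
    # Sample candidates around good points; keep the best density ratio.
    candidates = [random.gauss(random.choice(good), 0.1) for _ in range(n_candidates)]
    return max(candidates, key=lambda x: kde(good, x) / (kde(bad, x) + 1e-12))

random.seed(0)
loss = lambda x: (x - 0.3) ** 2          # toy objective, optimum at x = 0.3
history = [(x, loss(x)) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
for _ in range(30):
    x = tpe_suggest(history)
    history.append((x, loss(x)))

best_x = min(history, key=lambda pair: pair[1])[0]
assert abs(best_x - 0.3) < 0.1           # search concentrates near the optimum
```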