Unverified Commit 8acddd0c authored by Yuge Zhang, committed by GitHub

Update NAS reference (#4682)

parent e8b88a79
......@@ -44,6 +44,15 @@ ValueChoice
:members:
:inherited-members: Module
.. _nas-model-parameter-choice:
ModelParameterChoice
^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.retiarii.nn.pytorch.ModelParameterChoice
:members:
:inherited-members: Module
.. _nas-repeat:
Repeat
......@@ -60,8 +69,6 @@ Cell
.. autoclass:: nni.retiarii.nn.pytorch.Cell
:members:
.. footbibliography::
.. _nas-cell-101:
NasBench101Cell
......@@ -70,8 +77,6 @@ NasBench101Cell
.. autoclass:: nni.retiarii.nn.pytorch.NasBench101Cell
:members:
.. footbibliography::
.. _nas-cell-201:
NasBench201Cell
......@@ -80,8 +85,6 @@ NasBench201Cell
.. autoclass:: nni.retiarii.nn.pytorch.NasBench201Cell
:members:
.. footbibliography::
.. _hyper-modules:
Hyper-module Library (experimental)
......
......@@ -56,8 +56,6 @@ TPE
:members:
:noindex:
.. footbibliography::
.. _policy-based-rl-strategy:
PolicyBasedRL
......@@ -67,8 +65,6 @@ PolicyBasedRL
:members:
:noindex:
.. footbibliography::
.. _one-shot-nas:
One-shot strategy
......
......@@ -10,9 +10,8 @@ Retiarii for Neural Architecture Search
exploration_strategy
evaluator
advanced_usage
reference
.. attention:: NNI's latest NAS support is based on the Retiarii framework. Users who are still on the `early version using NNI NAS v1.0 <https://nni.readthedocs.io/en/v2.2/nas.html>`__ should migrate to Retiarii as soon as possible.
.. attention:: NNI's latest NAS support is based on the Retiarii framework. Users who are still on the `early version using NNI NAS v1.0 <https://nni.readthedocs.io/en/v2.2/nas.html>`__ should migrate to Retiarii as soon as possible. We plan to remove the legacy NAS framework in the next few releases.
.. note:: PyTorch is the **only supported framework on Retiarii**. Inquiries about NAS support on TensorFlow are in `this discussion <https://github.com/microsoft/nni/discussions/4605>`__. If you intend to run NAS with DL frameworks other than PyTorch and TensorFlow, please `open new issues <https://github.com/microsoft/nni/issues>`__ to let us know.
......
Retiarii API Reference
======================
nni.retiarii
------------
.. automodule:: nni.retiarii
:imported-members:
:members:
nni.retiarii.codegen
--------------------
.. automodule:: nni.retiarii.codegen
:imported-members:
:members:
nni.retiarii.converter
----------------------
.. automodule:: nni.retiarii.converter
:imported-members:
:members:
nni.retiarii.evaluator
----------------------
.. automodule:: nni.retiarii.evaluator
:imported-members:
:members:
.. automodule:: nni.retiarii.evaluator.pytorch
:imported-members:
:members:
:exclude-members: Trainer, DataLoader
.. autoclass:: nni.retiarii.evaluator.pytorch.Trainer
.. autoclass:: nni.retiarii.evaluator.pytorch.DataLoader
nni.retiarii.execution
----------------------
.. automodule:: nni.retiarii.execution
:imported-members:
:members:
:undoc-members:
nni.retiarii.experiment.pytorch
-------------------------------
.. automodule:: nni.retiarii.experiment.pytorch
:members:
nni.retiarii.nn.pytorch
-----------------------
Please refer to:
* :doc:`construct_space`.
* :doc:`mutator`.
* `torch.nn reference <https://pytorch.org/docs/stable/nn.html>`_.
nni.retiarii.oneshot
--------------------
.. automodule:: nni.retiarii.oneshot
:imported-members:
:members:
nni.retiarii.operation_def
--------------------------
.. automodule:: nni.retiarii.operation_def
:imported-members:
:members:
nni.retiarii.strategy
---------------------
.. automodule:: nni.retiarii.strategy
:imported-members:
:members:
nni.retiarii.utils
------------------
.. automodule:: nni.retiarii.utils
:members:
.. footbibliography::
......@@ -3,3 +3,101 @@ Neural Architecture Search
nni.retiarii
------------
.. automodule:: nni.retiarii
:imported-members:
:members:
nni.retiarii.codegen
--------------------
.. automodule:: nni.retiarii.codegen
:imported-members:
:members:
nni.retiarii.converter
----------------------
.. automodule:: nni.retiarii.converter
:imported-members:
:members:
nni.retiarii.evaluator
----------------------
.. automodule:: nni.retiarii.evaluator
:imported-members:
:members:
.. automodule:: nni.retiarii.evaluator.pytorch
:imported-members:
:members:
:exclude-members: Trainer, DataLoader
.. autoclass:: nni.retiarii.evaluator.pytorch.Trainer
.. autoclass:: nni.retiarii.evaluator.pytorch.DataLoader
nni.retiarii.execution
----------------------
.. automodule:: nni.retiarii.execution
:imported-members:
:members:
:undoc-members:
nni.retiarii.experiment.pytorch
-------------------------------
.. automodule:: nni.retiarii.experiment.pytorch
:members:
nni.retiarii.nn.pytorch
-----------------------
.. automodule:: nni.retiarii.nn.pytorch.api
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.component
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.hypermodule
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.mutation_utils
:imported-members:
:members:
nni.retiarii.oneshot
--------------------
.. automodule:: nni.retiarii.oneshot
:imported-members:
:members:
nni.retiarii.operation_def
--------------------------
.. automodule:: nni.retiarii.operation_def
:imported-members:
:members:
nni.retiarii.strategy
---------------------
.. automodule:: nni.retiarii.strategy
:imported-members:
:members:
nni.retiarii.utils
------------------
.. automodule:: nni.retiarii.utils
:members:
......@@ -818,7 +818,7 @@ class GraphConverterWithShape(GraphConverter):
def convert_to_graph(script_module, module, converter=None, **kwargs):
"""
Convert module to our graph ir, i.e., build a ```Model``` type
Convert a module to our graph IR, i.e., build a :class:`Model` type
Parameters
----------
......
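A minimal usage sketch of the converter described above; ``MyModel`` is a hypothetical module, and the import path is an assumption for illustration: ::

    import torch
    import torch.nn as torch_nn

    from nni.retiarii.converter import convert_to_graph  # import path assumed

    class MyModel(torch_nn.Module):  # hypothetical module for illustration
        def __init__(self):
            super().__init__()
            self.fc = torch_nn.Linear(8, 4)

        def forward(self, x):
            return self.fc(x)

    module = MyModel()
    script_module = torch.jit.script(module)            # TorchScript the module first
    model_ir = convert_to_graph(script_module, module)  # builds the graph IR `Model`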
......@@ -75,12 +75,12 @@ class Model:
"""
Represents a neural network model.
During mutation, one `Model` object is created for each trainable snapshot.
During mutation, one :class:`Model` object is created for each trainable snapshot.
For example, consider a mutator that inserts a node at an edge in each iteration.
In one iteration, the mutator invokes 4 primitives: add node, remove edge, add edge to head, add edge to tail.
These 4 primitives operate on one `Model` object.
These 4 primitives operate on one :class:`Model` object.
When they are all done, the model is set to "frozen" (trainable) status and submitted to the execution engine.
Then a new iteration starts, and a new `Model` object is created by forking the last model.
Then a new iteration starts, and a new :class:`Model` object is created by forking the last model.
Attributes
----------
......@@ -89,7 +89,7 @@ class Model:
python_init_params
Initialization parameters of python class.
status
See `ModelStatus`.
See :class:`ModelStatus`.
root_graph
The outermost graph which usually takes dataset as input and feeds output to loss function.
graphs
......@@ -98,11 +98,11 @@ class Model:
Model evaluator
history
Mutation history.
`self` is directly mutated from `self.history[-1]`;
`self.history[-1] is mutated from `self.history[-2]`, and so on.
`self.history[0]` is the base graph.
``self`` is directly mutated from ``self.history[-1]``;
``self.history[-1]`` is mutated from ``self.history[-2]``, and so on.
``self.history[0]`` is the base graph.
metric
Training result of the model, or `None` if it's not yet trained or has failed to train.
Training result of the model, or ``None`` if it's not yet trained or has failed to train.
intermediate_metrics
Intermediate training metrics. If the model is not trained, it's an empty list.
"""
......@@ -262,9 +262,9 @@ class Graph:
Graph topology.
This class simply represents the topology, with no semantic meaning.
All other information like metrics, non-graph functions, mutation history, etc. should go to `Model`.
All other information like metrics, non-graph functions, mutation history, etc. should go to :class:`Model`.
Each graph belongs to one and only one `Model`.
Each graph belongs to one and only one :class:`Model`.
Attributes
----------
......@@ -281,15 +281,15 @@ class Graph:
output_names
Optional mnemonic names of output values.
input_node
...
Incoming node.
output_node
...
Output node.
hidden_nodes
...
Hidden nodes.
nodes
All input/output/hidden nodes.
edges
...
Edges.
python_name
The name of the torch.nn.Module; it should have a one-to-one mapping with items in the Python model.
"""
......@@ -529,16 +529,16 @@ class Node:
"""
An operation or an opaque subgraph inside a graph.
Each node belongs to one and only one `Graph`.
Nodes should never be created with the constructor. Use `Graph.add_node()` instead.
Each node belongs to one and only one :class:`Graph`.
Nodes should never be created with the constructor. Use :meth:`Graph.add_node` instead.
The node itself is for topology only.
Information of tensor calculation should all go inside the `operation` attribute.
Information of tensor calculation should all go inside the ``operation`` attribute.
TODO: parameter of subgraph (cell)
It's easy to assign parameters on a cell node, but it's hard to "use" them.
We need to design a way to reference stored cell parameters in inner node operations.
e.g. `self.fc = Linear(self.units)` <- how to express `self.units` in IR?
e.g. ``self.fc = Linear(self.units)`` <- how to express ``self.units`` in IR?
Attributes
----------
......@@ -554,10 +554,10 @@ class Node:
label
Optional. If two nodes have the same label, they are considered the same by the mutator.
operation
...
Operation.
cell
Read-only shortcut to get the referenced subgraph.
If this node is not a subgraph (i.e., it is a primitive operation), accessing `cell` will raise an error.
If this node is not a subgraph (i.e., it is a primitive operation), accessing ``cell`` will raise an error.
predecessors
Predecessor nodes of this node in the graph. This is an optional mutation helper.
successors
......@@ -674,36 +674,36 @@ class Edge:
"""
A tensor, or "data flow", between two nodes.
Example forward code snippet:
```
a, b, c = split(x)
p = concat(a, c)
q = sum(b, p)
z = relu(q)
```
Edges in above snippet:
+ head: (split, 0), tail: (concat, 0) # a in concat
+ head: (split, 2), tail: (concat, 1) # c in concat
+ head: (split, 1), tail: (sum, -1 or 0) # b in sum
+ head: (concat, null), tail: (sum, -1 or 1) # p in sum
+ head: (sum, null), tail: (relu, null) # q in relu
Example forward code snippet: ::
a, b, c = split(x)
p = concat(a, c)
q = sum(b, p)
z = relu(q)
Edges in above snippet: ::
+ head: (split, 0), tail: (concat, 0) # a in concat
+ head: (split, 2), tail: (concat, 1) # c in concat
+ head: (split, 1), tail: (sum, -1 or 0) # b in sum
+ head: (concat, null), tail: (sum, -1 or 1) # p in sum
+ head: (sum, null), tail: (relu, null) # q in relu
Attributes
----------
graph
...
Graph.
head
Head node.
tail
Tail node.
head_slot
Index of outputs in head node.
If the node has only one output, this should be `null`.
If the node has only one output, this should be ``null``.
tail_slot
Index of inputs in tail node.
If the node has only one input, this should be `null`.
If the node does not care about order, this can be `-1`.
If the node has only one input, this should be ``null``.
If the node does not care about order, this can be ``-1``.
"""
def __init__(self, head: EdgeEndpoint, tail: EdgeEndpoint, _internal: bool = False):
......
......@@ -34,8 +34,14 @@ class Cell(nn.Module):
"""
Cell structure that is popularly used in NAS literature.
Refer to :footcite:t:`zoph2017neural,zoph2018learning,liu2018darts` for details.
:footcite:t:`radosavovic2019network` is a good summary of how this structure works in practice.
Find the details in:
* `Neural Architecture Search with Reinforcement Learning <https://arxiv.org/abs/1611.01578>`__.
* `Learning Transferable Architectures for Scalable Image Recognition <https://arxiv.org/abs/1707.07012>`__.
* `DARTS: Differentiable Architecture Search <https://arxiv.org/abs/1806.09055>`__.
`On Network Design Spaces for Visual Recognition <https://arxiv.org/abs/1905.13214>`__
is a good summary of how this structure works in practice.
A cell consists of multiple "nodes". Each node is a sum of multiple operators. Each operator is chosen from
``op_candidates``, and takes one input chosen from previous nodes and predecessors. A "predecessor" is an input of the cell.
......@@ -45,7 +51,10 @@ class Cell(nn.Module):
.. list-table::
:widths: 25 75
:header-rows: 1
* - Name
- Brief Description
* - Cell
- A cell consists of several nodes.
* - Node
......@@ -111,15 +120,20 @@ class Cell(nn.Module):
--------
Choose between conv2d and maxpool2d.
The cell has 4 nodes, 1 op per node, and 2 predecessors.
>>> cell = nn.Cell([nn.Conv2d(32, 32, 3), nn.MaxPool2d(3)], 4, 1, 2)
In forward:
>>> cell([input1, input2])
Use ``merge_op`` to specify how to construct the output.
The output will then have a dynamic shape, depending on which inputs have been used in the cell.
>>> cell = nn.Cell([nn.Conv2d(32, 32, 3), nn.MaxPool2d(3)], 4, 1, 2, merge_op='loose_end')
The op candidates can be callables that accept the node index in the cell, the op index in the node, and the input index.
>>> cell = nn.Cell([
... lambda node_index, op_index, input_index: nn.Conv2d(32, 32, 3, stride=2 if input_index < 1 else 1),
... ], 4, 1, 2)
......
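Putting the example above into a complete model-space sketch; the padding and channel sizes are illustrative assumptions so that node outputs share a shape and can be summed: ::

    import torch
    import nni.retiarii.nn.pytorch as nn

    class Space(nn.Module):
        def __init__(self):
            super().__init__()
            # 4 nodes, 1 op per node, 2 predecessors, as in the example above;
            # padding/stride keep spatial dims so node outputs can be summed.
            self.cell = nn.Cell([
                nn.Conv2d(32, 32, 3, padding=1),
                nn.MaxPool2d(3, stride=1, padding=1),
            ], 4, 1, 2)

        def forward(self, x1, x2):
            # outputs of the 4 nodes are concatenated on the channel dim by default
            return self.cell([x1, x2])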
......@@ -147,7 +147,7 @@ class NasBench201Cell(nn.Module):
"""
Cell structure that is proposed in NAS-Bench-201.
Refer to :footcite:t:`dong2019bench` for details.
Proposed by `NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search <https://arxiv.org/abs/2001.00326>`__.
This cell is a densely connected DAG with ``num_tensors`` nodes, where each node is a tensor.
For every i < j, there is an edge from the i-th node to the j-th node.
......
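A hedged declaration sketch for this cell; the op-candidate signature (factories taking input and output feature sizes) and the keyword names are assumptions based on the docstring above: ::

    import torch.nn as torch_nn
    import nni.retiarii.nn.pytorch as nn

    # Candidates are assumed to be factories taking (in_features, out_features).
    op_candidates = [
        lambda in_f, out_f: torch_nn.Linear(in_f, out_f),
        lambda in_f, out_f: torch_nn.Identity(),
    ]
    cell = nn.NasBench201Cell(op_candidates, in_features=16, out_features=16, num_tensors=4)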
......@@ -222,14 +222,16 @@ binary_modules = ['BinaryAdd', 'BinaryMul', 'BinaryMinus', 'BinaryDivide', 'Bina
class AutoActivation(nn.Module):
"""
This module is an implementation of the paper "Searching for Activation Functions"
(https://arxiv.org/abs/1710.05941).
NOTE: current `beta` is not per-channel parameter
This module is an implementation of the paper `Searching for Activation Functions <https://arxiv.org/abs/1710.05941>`__.
Parameters
----------
unit_num : int
The number of core units.
Notes
-----
Currently, ``beta`` is not a per-channel parameter.
"""
def __init__(self, unit_num: int = 1, label: str = None):
super().__init__()
......
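A brief usage sketch; the import path is an assumption based on where this module is documented above: ::

    import torch
    from nni.retiarii.nn.pytorch import AutoActivation  # import path assumed

    act = AutoActivation(unit_num=1)  # one core unit, per the parameter above
    y = act(torch.randn(8, 32))       # learned activation applied element-wise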
......@@ -221,7 +221,7 @@ class NasBench101Cell(Mutable):
"""
Cell structure that is proposed in NAS-Bench-101.
Refer to :footcite:t:`ying2019bench` for details.
Proposed by `NAS-Bench-101: Towards Reproducible Neural Architecture Search <http://proceedings.mlr.press/v97/ying19a/ying19a.pdf>`__.
This cell is usually used in the evaluation of NAS algorithms, because there is a "comprehensive analysis" of this search space
available, which includes a full architecture-dataset that "maps 423k unique architectures to metrics
......
......@@ -23,7 +23,9 @@ class PolicyBasedRL(BaseStrategy):
"""
Algorithm for policy-based reinforcement learning.
This is a wrapper of algorithms provided in tianshou (PPO by default),
and can be easily customized with other algorithms that inherit ``BasePolicy`` (e.g., REINFORCE :footcite:p:`zoph2017neural`).
and can be easily customized with other algorithms that inherit ``BasePolicy``
(e.g., `REINFORCE <https://link.springer.com/content/pdf/10.1007/BF00992696.pdf>`__
as in `this paper <https://arxiv.org/abs/1611.01578>`__).
Parameters
----------
......@@ -33,7 +35,8 @@ class PolicyBasedRL(BaseStrategy):
How many trials (trajectories) the collector collects each time.
After each collect, the trainer samples a batch from the replay buffer and does the update. Default: 20.
policy_fn : function
Takes ``ModelEvaluationEnv`` as input and returns a policy. See ``_default_policy_fn`` for an example.
Takes :class:`ModelEvaluationEnv` as input and returns a policy.
See :meth:`PolicyBasedRL._default_policy_fn` for an example.
"""
def __init__(self, max_collect: int = 100, trial_per_collect = 20,
......
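For reference, a minimal instantiation sketch using the parameters documented above (the import path is assumed): ::

    from nni.retiarii.strategy import PolicyBasedRL  # import path assumed

    # collect 20 trajectories per collect step, for at most 100 collect steps
    strategy = PolicyBasedRL(max_collect=100, trial_per_collect=20)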
......@@ -43,7 +43,8 @@ class TPE(BaseStrategy):
"""
The Tree-structured Parzen Estimator (TPE) is a sequential model-based optimization (SMBO) approach.
Refer to :footcite:t:`bergstra2011algorithms` for details.
Find the details in
`Algorithms for Hyper-Parameter Optimization <https://papers.nips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf>`__.
SMBO methods sequentially construct models to approximate the performance of hyperparameters based on historical measurements,
and then choose new hyperparameters to test based on this model.
......
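A minimal instantiation sketch (import path assumed); the strategy is then passed to a Retiarii experiment as the exploration strategy, and the experiment wiring in the comment is illustrative: ::

    from nni.retiarii.strategy import TPE  # import path assumed

    strategy = TPE()  # sequential model-based optimization over the search space
    # e.g., RetiariiExperiment(base_model, evaluator, [], strategy); exact wiring is illustrative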