Unverified Commit 8acddd0c authored by Yuge Zhang, committed by GitHub

Update NAS reference (#4682)

parent e8b88a79
......@@ -44,6 +44,15 @@ ValueChoice
:members:
:inherited-members: Module
.. _nas-model-parameter-choice:
ModelParameterChoice
^^^^^^^^^^^^^^^^^^^^
.. autoclass:: nni.retiarii.nn.pytorch.ModelParameterChoice
:members:
:inherited-members: Module
.. _nas-repeat:
Repeat
......@@ -60,8 +69,6 @@ Cell
.. autoclass:: nni.retiarii.nn.pytorch.Cell
:members:
.. footbibliography::
.. _nas-cell-101:
NasBench101Cell
......@@ -70,8 +77,6 @@ NasBench101Cell
.. autoclass:: nni.retiarii.nn.pytorch.NasBench101Cell
:members:
.. footbibliography::
.. _nas-cell-201:
NasBench201Cell
......@@ -80,8 +85,6 @@ NasBench201Cell
.. autoclass:: nni.retiarii.nn.pytorch.NasBench201Cell
:members:
.. footbibliography::
.. _hyper-modules:
Hyper-module Library (experimental)
......
......@@ -56,8 +56,6 @@ TPE
:members:
:noindex:
.. footbibliography::
.. _policy-based-rl-strategy:
PolicyBasedRL
......@@ -67,8 +65,6 @@ PolicyBasedRL
:members:
:noindex:
.. footbibliography::
.. _one-shot-nas:
One-shot strategy
......
......@@ -10,9 +10,8 @@ Retiarii for Neural Architecture Search
exploration_strategy
evaluator
advanced_usage
reference
.. attention:: NNI's latest NAS support is based on the Retiarii framework. Users who are still on the `early version using NNI NAS v1.0 <https://nni.readthedocs.io/en/v2.2/nas.html>`__ should migrate to Retiarii as soon as possible.
.. attention:: NNI's latest NAS support is based on the Retiarii framework. Users who are still on the `early version using NNI NAS v1.0 <https://nni.readthedocs.io/en/v2.2/nas.html>`__ should migrate to Retiarii as soon as possible. We plan to remove the legacy NAS framework in the next few releases.
.. note:: PyTorch is the **only supported framework on Retiarii**. Inquiries about NAS support on TensorFlow are in `this discussion <https://github.com/microsoft/nni/discussions/4605>`__. If you intend to run NAS with DL frameworks other than PyTorch and TensorFlow, please `open new issues <https://github.com/microsoft/nni/issues>`__ to let us know.
......
Retiarii API Reference
======================
nni.retiarii
------------
.. automodule:: nni.retiarii
:imported-members:
:members:
nni.retiarii.codegen
--------------------
.. automodule:: nni.retiarii.codegen
:imported-members:
:members:
nni.retiarii.converter
----------------------
.. automodule:: nni.retiarii.converter
:imported-members:
:members:
nni.retiarii.evaluator
----------------------
.. automodule:: nni.retiarii.evaluator
:imported-members:
:members:
.. automodule:: nni.retiarii.evaluator.pytorch
:imported-members:
:members:
:exclude-members: Trainer, DataLoader
.. autoclass:: nni.retiarii.evaluator.pytorch.Trainer
.. autoclass:: nni.retiarii.evaluator.pytorch.DataLoader
nni.retiarii.execution
----------------------
.. automodule:: nni.retiarii.execution
:imported-members:
:members:
:undoc-members:
nni.retiarii.experiment.pytorch
-------------------------------
.. automodule:: nni.retiarii.experiment.pytorch
:members:
nni.retiarii.nn.pytorch
-----------------------
Please refer to:
* :doc:`construct_space`.
* :doc:`mutator`.
* `torch.nn reference <https://pytorch.org/docs/stable/nn.html>`_.
nni.retiarii.oneshot
--------------------
.. automodule:: nni.retiarii.oneshot
:imported-members:
:members:
nni.retiarii.operation_def
--------------------------
.. automodule:: nni.retiarii.operation_def
:imported-members:
:members:
nni.retiarii.strategy
---------------------
.. automodule:: nni.retiarii.strategy
:imported-members:
:members:
nni.retiarii.utils
------------------
.. automodule:: nni.retiarii.utils
:members:
.. footbibliography::
......@@ -3,3 +3,101 @@ Neural Architecture Search
nni.retiarii
------------
.. automodule:: nni.retiarii
:imported-members:
:members:
nni.retiarii.codegen
--------------------
.. automodule:: nni.retiarii.codegen
:imported-members:
:members:
nni.retiarii.converter
----------------------
.. automodule:: nni.retiarii.converter
:imported-members:
:members:
nni.retiarii.evaluator
----------------------
.. automodule:: nni.retiarii.evaluator
:imported-members:
:members:
.. automodule:: nni.retiarii.evaluator.pytorch
:imported-members:
:members:
:exclude-members: Trainer, DataLoader
.. autoclass:: nni.retiarii.evaluator.pytorch.Trainer
.. autoclass:: nni.retiarii.evaluator.pytorch.DataLoader
nni.retiarii.execution
----------------------
.. automodule:: nni.retiarii.execution
:imported-members:
:members:
:undoc-members:
nni.retiarii.experiment.pytorch
-------------------------------
.. automodule:: nni.retiarii.experiment.pytorch
:members:
nni.retiarii.nn.pytorch
-----------------------
.. automodule:: nni.retiarii.nn.pytorch.api
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.component
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.hypermodule
:imported-members:
:members:
:noindex:
.. automodule:: nni.retiarii.nn.pytorch.mutation_utils
:imported-members:
:members:
nni.retiarii.oneshot
--------------------
.. automodule:: nni.retiarii.oneshot
:imported-members:
:members:
nni.retiarii.operation_def
--------------------------
.. automodule:: nni.retiarii.operation_def
:imported-members:
:members:
nni.retiarii.strategy
---------------------
.. automodule:: nni.retiarii.strategy
:imported-members:
:members:
nni.retiarii.utils
------------------
.. automodule:: nni.retiarii.utils
:members:
......@@ -818,7 +818,7 @@ class GraphConverterWithShape(GraphConverter):
def convert_to_graph(script_module, module, converter=None, **kwargs):
"""
Convert module to our graph ir, i.e., build a ```Model``` type
Convert a module to our graph IR, i.e., build a :class:`Model` type
Parameters
----------
......
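A minimal usage sketch of the converter described above; ``MyModel`` is a hypothetical module, and the import path is an assumption for illustration: ::

    import torch
    import torch.nn as torch_nn

    from nni.retiarii.converter import convert_to_graph  # import path assumed

    class MyModel(torch_nn.Module):  # hypothetical module for illustration
        def __init__(self):
            super().__init__()
            self.fc = torch_nn.Linear(8, 4)

        def forward(self, x):
            return self.fc(x)

    module = MyModel()
    script_module = torch.jit.script(module)            # TorchScript the module first
    model_ir = convert_to_graph(script_module, module)  # builds the graph IR `Model`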
......@@ -75,12 +75,12 @@ class Model:
"""
Represents a neural network model.
During mutation, one `Model` object is created for each trainable snapshot.
During mutation, one :class:`Model` object is created for each trainable snapshot.
For example, consider a mutator that inserts a node at an edge in each iteration.
In one iteration, the mutator invokes 4 primitives: add node, remove edge, add edge to head, add edge to tail.
These 4 primitives operate on one `Model` object.
These 4 primitives operate on one :class:`Model` object.
When they are all done, the model is set to "frozen" (trainable) status and submitted to the execution engine.
Then a new iteration starts, and a new `Model` object is created by forking the last model.
Then a new iteration starts, and a new :class:`Model` object is created by forking the last model.
Attributes
----------
......@@ -89,7 +89,7 @@ class Model:
python_init_params
Initialization parameters of python class.
status
See `ModelStatus`.
See :class:`ModelStatus`.
root_graph
The outermost graph which usually takes dataset as input and feeds output to loss function.
graphs
......@@ -98,11 +98,11 @@ class Model:
Model evaluator
history
Mutation history.
`self` is directly mutated from `self.history[-1]`;
`self.history[-1] is mutated from `self.history[-2]`, and so on.
`self.history[0]` is the base graph.
``self`` is directly mutated from ``self.history[-1]``;
``self.history[-1]`` is mutated from ``self.history[-2]``, and so on.
``self.history[0]`` is the base graph.
metric
Training result of the model, or `None` if it's not yet trained or has failed to train.
Training result of the model, or ``None`` if it's not yet trained or has failed to train.
intermediate_metrics
Intermediate training metrics. If the model is not trained, it's an empty list.
"""
......@@ -262,9 +262,9 @@ class Graph:
Graph topology.
This class simply represents the topology, with no semantic meaning.
All other information like metrics, non-graph functions, mutation history, etc. should go to `Model`.
All other information like metrics, non-graph functions, mutation history, etc. should go to :class:`Model`.
Each graph belongs to one and only one `Model`.
Each graph belongs to one and only one :class:`Model`.
Attributes
----------
......@@ -281,15 +281,15 @@ class Graph:
output_names
Optional mnemonic names of output values.
input_node
...
Incoming node.
output_node
...
Output node.
hidden_nodes
...
Hidden nodes.
nodes
All input/output/hidden nodes.
edges
...
Edges.
python_name
The name of the torch.nn.Module; it should have a one-to-one mapping with items in the Python model.
"""
......@@ -529,16 +529,16 @@ class Node:
"""
An operation or an opaque subgraph inside a graph.
Each node belongs to one and only one `Graph`.
Nodes should never be created with the constructor. Use `Graph.add_node()` instead.
Each node belongs to one and only one :class:`Graph`.
Nodes should never be created with the constructor. Use :meth:`Graph.add_node` instead.
The node itself is for topology only.
Information of tensor calculation should all go inside the `operation` attribute.
Information of tensor calculation should all go inside the ``operation`` attribute.
TODO: parameter of subgraph (cell)
It's easy to assign parameters on a cell node, but it's hard to "use" them.
We need to design a way to reference stored cell parameters in inner node operations.
e.g. `self.fc = Linear(self.units)` <- how to express `self.units` in IR?
e.g. ``self.fc = Linear(self.units)`` <- how to express ``self.units`` in IR?
Attributes
----------
......@@ -554,10 +554,10 @@ class Node:
label
Optional. If two nodes have the same label, they are considered the same by the mutator.
operation
...
Operation.
cell
Read-only shortcut to get the referenced subgraph.
If this node is not a subgraph (i.e., it is a primitive operation), accessing `cell` will raise an error.
If this node is not a subgraph (i.e., it is a primitive operation), accessing ``cell`` will raise an error.
predecessors
Predecessor nodes of this node in the graph. This is an optional mutation helper.
successors
......@@ -674,36 +674,36 @@ class Edge:
"""
A tensor, or "data flow", between two nodes.
Example forward code snippet:
```
a, b, c = split(x)
p = concat(a, c)
q = sum(b, p)
z = relu(q)
```
Edges in above snippet:
+ head: (split, 0), tail: (concat, 0) # a in concat
+ head: (split, 2), tail: (concat, 1) # c in concat
+ head: (split, 1), tail: (sum, -1 or 0) # b in sum
+ head: (concat, null), tail: (sum, -1 or 1) # p in sum
+ head: (sum, null), tail: (relu, null) # q in relu
Example forward code snippet: ::
a, b, c = split(x)
p = concat(a, c)
q = sum(b, p)
z = relu(q)
Edges in above snippet: ::
+ head: (split, 0), tail: (concat, 0) # a in concat
+ head: (split, 2), tail: (concat, 1) # c in concat
+ head: (split, 1), tail: (sum, -1 or 0) # b in sum
+ head: (concat, null), tail: (sum, -1 or 1) # p in sum
+ head: (sum, null), tail: (relu, null) # q in relu
Attributes
----------
graph
...
Graph.
head
Head node.
tail
Tail node.
head_slot
Index of outputs in head node.
If the node has only one output, this should be `null`.
If the node has only one output, this should be ``null``.
tail_slot
Index of inputs in tail node.
If the node has only one input, this should be `null`.
If the node does not care about order, this can be `-1`.
If the node has only one input, this should be ``null``.
If the node does not care about order, this can be ``-1``.
"""
def __init__(self, head: EdgeEndpoint, tail: EdgeEndpoint, _internal: bool = False):
......
......@@ -34,8 +34,14 @@ class Cell(nn.Module):
"""
Cell structure that is popularly used in NAS literature.
Refer to :footcite:t:`zoph2017neural,zoph2018learning,liu2018darts` for details.
:footcite:t:`radosavovic2019network` is a good summary of how this structure works in practice.
Find the details in:
* `Neural Architecture Search with Reinforcement Learning <https://arxiv.org/abs/1611.01578>`__.
* `Learning Transferable Architectures for Scalable Image Recognition <https://arxiv.org/abs/1707.07012>`__.
* `DARTS: Differentiable Architecture Search <https://arxiv.org/abs/1806.09055>`__.
`On Network Design Spaces for Visual Recognition <https://arxiv.org/abs/1905.13214>`__
is a good summary of how this structure works in practice.
A cell consists of multiple "nodes". Each node is a sum of multiple operators. Each operator is chosen from
``op_candidates``, and takes one input chosen from previous nodes and predecessors. A "predecessor" is an input of the cell.
......@@ -45,7 +51,10 @@ class Cell(nn.Module):
.. list-table::
:widths: 25 75
:header-rows: 1
* - Name
- Brief Description
* - Cell
- A cell consists of several nodes.
* - Node
......@@ -111,15 +120,20 @@ class Cell(nn.Module):
--------
Choose between conv2d and maxpool2d.
The cell has 4 nodes, 1 op per node, and 2 predecessors.
>>> cell = nn.Cell([nn.Conv2d(32, 32, 3), nn.MaxPool2d(3)], 4, 1, 2)
In forward:
>>> cell([input1, input2])
Use ``merge_op`` to specify how to construct the output.
The output will then have a dynamic shape, depending on which inputs have been used in the cell.
>>> cell = nn.Cell([nn.Conv2d(32, 32, 3), nn.MaxPool2d(3)], 4, 1, 2, merge_op='loose_end')
The op candidates can be callables that accept the node index in the cell, the op index in the node, and the input index.
>>> cell = nn.Cell([
... lambda node_index, op_index, input_index: nn.Conv2d(32, 32, 3, stride=2 if input_index < 1 else 1),
... ], 4, 1, 2)
......
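Putting the example above into a complete model-space sketch; the padding and channel sizes are illustrative assumptions so that node outputs share a shape and can be summed: ::

    import torch
    import nni.retiarii.nn.pytorch as nn

    class Space(nn.Module):
        def __init__(self):
            super().__init__()
            # 4 nodes, 1 op per node, 2 predecessors, as in the example above;
            # padding/stride keep spatial dims so node outputs can be summed.
            self.cell = nn.Cell([
                nn.Conv2d(32, 32, 3, padding=1),
                nn.MaxPool2d(3, stride=1, padding=1),
            ], 4, 1, 2)

        def forward(self, x1, x2):
            # outputs of the 4 nodes are concatenated on the channel dim by default
            return self.cell([x1, x2])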
......@@ -147,7 +147,7 @@ class NasBench201Cell(nn.Module):
"""
Cell structure that is proposed in NAS-Bench-201.
Refer to :footcite:t:`dong2019bench` for details.
Proposed by `NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search <https://arxiv.org/abs/2001.00326>`__.
This cell is a densely connected DAG with ``num_tensors`` nodes, where each node is a tensor.
For every i < j, there is an edge from the i-th node to the j-th node.
......
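A hedged declaration sketch for this cell; the op-candidate signature (factories taking input and output feature sizes) and the keyword names are assumptions based on the docstring above: ::

    import torch.nn as torch_nn
    import nni.retiarii.nn.pytorch as nn

    # Candidates are assumed to be factories taking (in_features, out_features).
    op_candidates = [
        lambda in_f, out_f: torch_nn.Linear(in_f, out_f),
        lambda in_f, out_f: torch_nn.Identity(),
    ]
    cell = nn.NasBench201Cell(op_candidates, in_features=16, out_features=16, num_tensors=4)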
......@@ -222,14 +222,16 @@ binary_modules = ['BinaryAdd', 'BinaryMul', 'BinaryMinus', 'BinaryDivide', 'Bina
class AutoActivation(nn.Module):
"""
This module is an implementation of the paper "Searching for Activation Functions"
(https://arxiv.org/abs/1710.05941).
NOTE: current `beta` is not per-channel parameter
This module is an implementation of the paper `Searching for Activation Functions <https://arxiv.org/abs/1710.05941>`__.
Parameters
----------
unit_num : int
The number of core units.
Notes
-----
Currently, ``beta`` is not a per-channel parameter.
"""
def __init__(self, unit_num: int = 1, label: str = None):
super().__init__()
......
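A brief usage sketch; the import path is an assumption based on where this module is documented above: ::

    import torch
    from nni.retiarii.nn.pytorch import AutoActivation  # import path assumed

    act = AutoActivation(unit_num=1)  # one core unit, per the parameter above
    y = act(torch.randn(8, 32))       # learned activation applied element-wise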
......@@ -221,7 +221,7 @@ class NasBench101Cell(Mutable):
"""
Cell structure that is proposed in NAS-Bench-101.
Refer to :footcite:t:`ying2019bench` for details.
Proposed by `NAS-Bench-101: Towards Reproducible Neural Architecture Search <http://proceedings.mlr.press/v97/ying19a/ying19a.pdf>`__.
This cell is usually used in the evaluation of NAS algorithms, because there is a "comprehensive analysis" of this search space
available, which includes a full architecture-dataset that "maps 423k unique architectures to metrics
......
......@@ -23,7 +23,9 @@ class PolicyBasedRL(BaseStrategy):
"""
Algorithm for policy-based reinforcement learning.
This is a wrapper of algorithms provided in tianshou (PPO by default),
and can be easily customized with other algorithms that inherit ``BasePolicy`` (e.g., REINFORCE :footcite:p:`zoph2017neural`).
and can be easily customized with other algorithms that inherit ``BasePolicy``
(e.g., `REINFORCE <https://link.springer.com/content/pdf/10.1007/BF00992696.pdf>`__
as in `this paper <https://arxiv.org/abs/1611.01578>`__).
Parameters
----------
......@@ -33,7 +35,8 @@ class PolicyBasedRL(BaseStrategy):
How many trials (trajectories) the collector collects each time.
After each collect, the trainer samples a batch from the replay buffer and does the update. Default: 20.
policy_fn : function
Takes ``ModelEvaluationEnv`` as input and returns a policy. See ``_default_policy_fn`` for an example.
Takes :class:`ModelEvaluationEnv` as input and returns a policy.
See :meth:`PolicyBasedRL._default_policy_fn` for an example.
"""
def __init__(self, max_collect: int = 100, trial_per_collect = 20,
......
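For reference, a minimal instantiation sketch using the parameters documented above (the import path is assumed): ::

    from nni.retiarii.strategy import PolicyBasedRL  # import path assumed

    # collect 20 trajectories per collect step, for at most 100 collect steps
    strategy = PolicyBasedRL(max_collect=100, trial_per_collect=20)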
......@@ -43,7 +43,8 @@ class TPE(BaseStrategy):
"""
The Tree-structured Parzen Estimator (TPE) is a sequential model-based optimization (SMBO) approach.
Refer to :footcite:t:`bergstra2011algorithms` for details.
Find the details in
`Algorithms for Hyper-Parameter Optimization <https://papers.nips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf>`__.
SMBO methods sequentially construct models to approximate the performance of hyperparameters based on historical measurements,
and then choose new hyperparameters to test based on this model.
......
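A minimal instantiation sketch (import path assumed); the strategy is then passed to a Retiarii experiment as the exploration strategy, and the experiment wiring in the comment is illustrative: ::

    from nni.retiarii.strategy import TPE  # import path assumed

    strategy = TPE()  # sequential model-based optimization over the search space
    # e.g., RetiariiExperiment(base_model, evaluator, [], strategy); exact wiring is illustrative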