Unverified Commit dda6a126 authored by Yuge Zhang's avatar Yuge Zhang Committed by GitHub

Translation for home page and NAS (#4748)

parent fe02b808
@@ -62,13 +62,11 @@ See the :doc:`installation guide </installation>` if you need additional help on
Try your first NNI experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To run your first NNI experiment:

.. code-block:: shell

   $ nnictl hello

.. note:: You need to have `PyTorch <https://pytorch.org/>`_ (as well as `torchvision <https://pytorch.org/vision/stable/index.html>`_) installed to run this experiment.

To start your journey now, please follow the :doc:`absolute quickstart of NNI <quickstart>`!

@@ -261,7 +259,7 @@ Get Support and Contribute Back
NNI is maintained on the `NNI GitHub repository <https://github.com/microsoft/nni>`_. We collect feedback and new proposals/ideas on GitHub. You can:

* Open a `GitHub issue <https://github.com/microsoft/nni/issues>`_ for bugs and feature requests.
* Open a `pull request <https://github.com/microsoft/nni/pulls>`_ to contribute code (make sure to read the :doc:`contribution guide <notes/contributing>` before doing this).
* Participate in `NNI Discussion <https://github.com/microsoft/nni/discussions>`_ for general questions and new ideas.
* Join the following IM groups.
......
@@ -9,7 +9,7 @@ Execution engine is for running Retiarii Experiment. NNI supports three executio
* **CGO execution engine** has the same requirements and capabilities as the **Graph-based execution engine**, but further enables cross-model optimizations, which makes model space exploration faster.

.. _pure-python-execution-engine:

Pure-python Execution Engine
----------------------------

@@ -20,7 +20,7 @@ Remember to add :meth:`nni.retiarii.model_wrapper` decorator outside the whole
.. note:: You should always use ``super().__init__()`` instead of ``super(MyNetwork, self).__init__()`` in the PyTorch model, because the latter one has issues with model wrapper.

.. _graph-based-execution-engine:

Graph-based Execution Engine
----------------------------
......
@@ -43,7 +43,7 @@ Search Space Design
The search space defines which architectures can be represented in principle. Incorporating prior knowledge about typical properties of architectures well-suited for a task can reduce the size of the search space and simplify the search. However, this also introduces a human bias, which may prevent finding novel architectural building blocks that go beyond current human knowledge. Search space design can be very challenging for beginners, who might not possess the experience to balance richness and simplicity.

In NNI, we provide a wide range of APIs to build the search space. There are :doc:`high-level APIs <construct_space>`, which make it possible to incorporate human knowledge about what makes a good architecture or search space, and :doc:`low-level APIs <mutator>`, a list of primitives to construct a network operation by operation.

Exploration strategy
^^^^^^^^^^^^^^^^^^^^

@@ -57,7 +57,7 @@ Performance estimation
The objective of NAS is typically to find architectures that achieve high predictive performance on unseen data. Performance estimation refers to the process of estimating this performance. The problem with performance estimation is mostly its scalability, i.e., how to run and manage multiple trials simultaneously.

In NNI, this process is standardized with the :doc:`evaluator <evaluator>`, which is responsible for estimating a model's performance. NNI has quite a few built-in evaluators, ranging from the simplest option, e.g., performing standard training and validation of the architecture on data, to complex configurations and implementations. Evaluators are run in *trials*, and trials can be spawned onto distributed platforms with our powerful :doc:`training service </experiment/training_service/overview>`.

Tutorials
---------
......
.. 1bfa9317e112e9ffc5c7c6a2625188ab
Neural Architecture Search
==========================

.. toctree::
   :hidden:

   Quickstart </tutorials/hello_nas>
   Construct Model Space <construct_space>
   Exploration Strategy <exploration_strategy>
   Evaluator <evaluator>
   Advanced Usage <advanced_usage>
.. attention:: NNI's latest architecture search support is all based on the Retiarii framework. Users who are still using the `early version of NNI architecture search <https://nni.readthedocs.io/en/v2.2/nas.html>`__ should migrate their work to Retiarii as soon as possible. We plan to remove the legacy architecture search framework in the next few releases.

.. attention:: PyTorch is the **only framework supported by Retiarii**. Requests for architecture search support on TensorFlow are collected in `this discussion <https://github.com/microsoft/nni/discussions/4605>`__. If you intend to run NAS with a DL framework other than PyTorch and TensorFlow, please `open a new issue <https://github.com/microsoft/nni/issues>`__ to let us know.
Overview
--------

Automatic neural architecture search (NAS) plays an increasingly important role in finding better models. Recent research has proved the feasibility of automatic architecture search and has led to models that beat many manually designed and tuned ones. Representative works include `NASNet <https://arxiv.org/abs/1707.07012>`__, `ENAS <https://arxiv.org/abs/1802.03268>`__, `DARTS <https://arxiv.org/abs/1806.09055>`__, `Network Morphism <https://arxiv.org/abs/1806.10282>`__ and `Evolution <https://arxiv.org/abs/1703.01041>`__. Furthermore, new innovations keep emerging.

In general, solving any particular task with neural architecture search requires: search space design, exploration strategy selection, and performance estimation. The three components form the following loop (the figure is from the `NAS survey <https://arxiv.org/abs/1808.05377>`__):

.. image:: ../../img/nas_abstract_illustration.png
   :align: center
   :width: 700

In this figure:

* The *model search space* is a set of models from which the best model is explored/searched, often abbreviated as *search space* or *model space*.
* The *exploration strategy* is the algorithm used to explore the model search space. It is sometimes also called the *search strategy*.
* The *model evaluator* is responsible for training a model and evaluating its performance.

The process is similar to :doc:`hyperparameter optimization </hpo/index>`, except that the target is the best architecture rather than the best hyperparameters. Concretely, the exploration strategy selects an architecture from a predefined search space. The architecture is passed to performance estimation to get a score, which represents how well this architecture performs on a particular task. This process repeats until the search can find the best architecture.
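The loop above can be sketched in a few lines of plain Python. This is only an illustration of the sample-estimate-update cycle; ``estimate``, ``nas_loop``, and the dict-based "architectures" are toy placeholders, not NNI APIs.

```python
import random

def estimate(arch):
    # Stand-in for performance estimation: real NAS would train
    # the candidate model and validate it on held-out data.
    return -abs(arch['width'] - 128) + random.random()

def nas_loop(search_space, n_trials):
    """Toy NAS loop: sample an architecture, estimate it, keep the best."""
    best_arch, best_score = None, float('-inf')
    for _ in range(n_trials):
        arch = random.choice(search_space)  # exploration strategy (random search)
        score = estimate(arch)              # performance estimation (one "trial")
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch

space = [{'width': w} for w in (64, 128, 256)]
print(nas_loop(space, n_trials=20))
```

A real experiment replaces ``estimate`` with a trained-and-validated model score and ``random.choice`` with a smarter strategy; the control flow stays the same.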
Key Features
------------

The current architecture search framework in NNI is backed by the research of `Retiarii: A Deep Learning Exploratory-Training Framework <https://www.usenix.org/system/files/osdi20-zhang_quanlu.pdf>`__, with the following features:

* :doc:`Simple APIs to build the search space easily <construct_space>`
* :doc:`SOTA architecture search algorithms to explore the search space efficiently <exploration_strategy>`
* :doc:`Backend support to run experiments on large-scale AI platforms </experiment/overview>`

Why Architecture Search with NNI
--------------------------------

Without NNI, implementing architecture search is highly challenging, mainly in the following three aspects. When users want to try architecture search techniques in their own scenarios, the solutions NNI provides can greatly reduce their workload.
Search Space Design
^^^^^^^^^^^^^^^^^^^

The search space defines the feasible set of architectures. To simplify the search, we usually need to incorporate task-related prior knowledge to reduce the size of the search space. However, this also introduces human bias and, to some extent, may lose the possibility of going beyond human understanding. In any case, search space design is extremely challenging for beginners, who may fail to strike a balance between a simple space and rich expressiveness.

In NNI, we provide APIs at different levels to build the search space. There are :doc:`high-level APIs <construct_space>`, which carry a lot of priors and help users quickly understand what makes a good architecture or search space, and :doc:`low-level APIs <mutator>`, which provide the most primitive operators and graph-mutation primitives.
Exploration Strategy
^^^^^^^^^^^^^^^^^^^^

The exploration strategy defines how to explore the (usually exponentially large) search space. It embodies the classic exploration-exploitation trade-off: on the one hand, we want to find well-performing architectures quickly; on the other hand, we should avoid premature convergence to a region of suboptimal architectures. The "best" exploration strategy for a particular scenario usually has to be found by trial and error. Since many recently published exploration strategies are implemented with their own code bases, switching from one to another becomes very cumbersome.

In NNI, we also provide :doc:`a series of exploration strategies <exploration_strategy>`. Some of them are powerful but time-consuming, while others may not find the optimal architecture but are very efficient. Since all strategies are implemented with a unified user interface, users can easily find one that matches their needs.
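The value of a unified interface can be illustrated with a minimal sketch. The ``suggest``/``observe`` protocol and both strategy classes below are hypothetical, not NNI's actual classes: the point is only that any strategy honoring one interface can be swapped in without touching the rest of the experiment.

```python
import random

class RandomStrategy:
    """Pick architectures uniformly at random; ignores feedback."""
    def __init__(self, space):
        self.space = space
    def suggest(self):
        return random.choice(self.space)
    def observe(self, arch, score):
        pass  # random search does not use feedback

class GreedyNeighborStrategy:
    """Hill-climb: always try the next-larger neighbor of the best so far."""
    def __init__(self, space):
        self.space = space
        self.best, self.best_score = None, float('-inf')
    def suggest(self):
        if self.best is None:
            return random.choice(self.space)
        idx = self.space.index(self.best)
        return self.space[min(idx + 1, len(self.space) - 1)]
    def observe(self, arch, score):
        if score > self.best_score:
            self.best, self.best_score = arch, score

def run_search(strategy, estimate, n_trials):
    # The experiment loop only speaks the suggest/observe interface,
    # so strategies are interchangeable.
    history = []
    for _ in range(n_trials):
        arch = strategy.suggest()
        score = estimate(arch)
        strategy.observe(arch, score)
        history.append((arch, score))
    return max(history, key=lambda t: t[1])[0]

space = [16, 32, 64, 128]
estimate = lambda width: width / 128  # toy score: bigger width wins
for strat in (RandomStrategy(space), GreedyNeighborStrategy(space)):
    print(type(strat).__name__, run_search(strat, estimate, 10))
```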
Performance Estimation
^^^^^^^^^^^^^^^^^^^^^^

The goal of architecture search is usually to find architectures that perform well on unseen data. The role of performance estimation is to quantify how good each architecture is. Its main difficulty lies in scalability, i.e., how to run and manage multiple trials on large-scale training platforms simultaneously.

In NNI, we use an :doc:`evaluator <evaluator>` to standardize the performance-estimation process. It is responsible for estimating a model's performance. NNI has quite a few built-in evaluators, ranging from the simplest, such as standard training and validation, to complex custom configurations. Evaluators run in *trials*, which can be dispatched to large-scale training platforms through our powerful :doc:`training service </experiment/training_service/overview>`.
Tutorials
---------

To get started with the NNI architecture search framework, we recommend reading at least the following tutorials:

* :doc:`Quickstart </tutorials/hello_nas>`
* :doc:`Construct Model Space <construct_space>`
* :doc:`Exploration Strategy <exploration_strategy>`
* :doc:`Evaluator <evaluator>`

Resources
---------

The following articles will help you better understand the latest developments in NAS:

* `Neural Architecture Search: A Survey <https://arxiv.org/abs/1808.05377>`__
* `A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions <https://arxiv.org/abs/2006.02903>`__
.. ccd00e2e56b44cf452b0afb81e8cecff
Quickstart
==========

.. cardlinkitem::
   :header: HPO Quickstart with PyTorch
   :description: Use hyperparameter optimization (HPO) to tune a PyTorch FashionMNIST model.
   :link: tutorials/hpo_quickstart_pytorch/main
   :image: ../img/thumbnails/hpo-pytorch.svg
   :background: purple

.. cardlinkitem::
   :header: NAS Quickstart
   :description: A beginner's guide to searching for a neural architecture on the MNIST dataset with NNI.
   :link: tutorials/hello_nas
   :image: ../img/thumbnails/nas-tutorial.svg
   :background: cyan

.. cardlinkitem::
   :header: Model Compression Quickstart
   :description: Learn pruning to compress your model.
   :link: tutorials/pruning_quick_start_mnist
   :image: ../img/thumbnails/pruning-tutorial.svg
   :background: blue
@@ -213,7 +213,25 @@
   },
   "outputs": [],
   "source": [
    "for model_dict in exp.export_top_models(formatter='dict'):\n    print(model_dict)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
   "The output is a ``json`` object which records the mutation actions of the top model.\nIf users want to output source code of the top model,\nthey can use `graph-based execution engine <graph-based-execution-engine>` for the experiment,\nby simply adding the following two lines.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"exp_config.execution_engine = 'base'\nexport_formatter = 'code'"
   ]
  }
 ],
......
@@ -354,11 +354,11 @@ def evaluate_model_with_visualization(model_cls):
for model_dict in exp.export_top_models(formatter='dict'):
    print(model_dict)

# %%
# The output is a ``json`` object which records the mutation actions of the top model.
# If users want to output source code of the top model,
# they can use :ref:`graph-based execution engine <graph-based-execution-engine>` for the experiment,
# by simply adding the following two lines.

exp_config.execution_engine = 'base'
export_formatter = 'code'
0e49e3aef98633744807b814786f6b31
\ No newline at end of file
@@ -466,6 +466,27 @@ Launch the experiment. The experiment should take several minutes to finish on a
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    INFO:nni.experiment:Creating experiment, Experiment ID: z8ns5fv7
    INFO:nni.experiment:Connecting IPC pipe...
    INFO:nni.experiment:Starting web server...
    INFO:nni.experiment:Setting up...
    INFO:nni.runtime.msg_dispatcher_base:Dispatcher started
    INFO:nni.retiarii.experiment.pytorch:Web UI URLs: http://127.0.0.1:8081 http://10.190.172.35:8081 http://192.168.49.1:8081 http://172.17.0.1:8081
    INFO:nni.retiarii.experiment.pytorch:Start strategy...
    INFO:root:Successfully update searchSpace.
    INFO:nni.retiarii.strategy.bruteforce:Random search running in fixed size mode. Dedup: on.
    INFO:nni.retiarii.experiment.pytorch:Stopping experiment, please wait...
    INFO:nni.retiarii.experiment.pytorch:Strategy exit
    INFO:nni.retiarii.experiment.pytorch:Waiting for experiment to become DONE (you can ctrl+c if there is no running trial jobs)...
    INFO:nni.runtime.msg_dispatcher_base:Dispatcher exiting...
    INFO:nni.retiarii.experiment.pytorch:Experiment stopped
@@ -526,7 +547,7 @@ Export Top Models
Users can export top models after the exploration is done using ``export_top_models``.

.. GENERATED FROM PYTHON SOURCE LINES 353-357

.. code-block:: default

@@ -534,14 +555,6 @@ Users can export top models after the exploration is done using ``export_top_mod
    for model_dict in exp.export_top_models(formatter='dict'):
        print(model_dict)

@@ -552,7 +565,28 @@ Users can export top models after the exploration is done using ``export_top_mod
.. code-block:: none

    {'model_1': '0', 'model_2': 0.25, 'model_3': 64}

.. GENERATED FROM PYTHON SOURCE LINES 358-362

The output is a ``json`` object which records the mutation actions of the top model.
If users want to output source code of the top model,
they can use :ref:`graph-based execution engine <graph-based-execution-engine>` for the experiment,
by simply adding the following two lines.

.. GENERATED FROM PYTHON SOURCE LINES 362-365

.. code-block:: default

    exp_config.execution_engine = 'base'
    export_formatter = 'code'

@@ -560,7 +594,7 @@ Users can export top models after the exploration is done using ``export_top_mod
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 2 minutes 4.499 seconds)

.. _sphx_glr_download_tutorials_hello_nas.py:
......
.. 8a873f2c9cb0e8e3ed2d66b9d16c330f
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/hello_nas.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_tutorials_hello_nas.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_hello_nas.py:
Beginner's Tutorial for Architecture Search
===========================================

This is a beginner's tutorial on neural architecture search (NAS) with NNI.
In this tutorial, we will search for a neural architecture on the MNIST dataset with the help of the NNI NAS framework, i.e., *Retiarii*.
We use multi-trial architecture search as an example to show how to construct and explore a model space.

A neural architecture search task has three main components, namely,

* a model search space that defines the set of models to explore,
* a proper strategy as the method to explore this model space, and
* a model evaluator that reports the performance of every model in the space.

Currently, Retiarii only supports PyTorch and has been tested on **PyTorch 1.7 to 1.10**.
This tutorial assumes PyTorch is the deep learning framework in use; support for more frameworks is planned.

Define Your Model Space
-----------------------

The model space is defined by users to express the set of models that they want to explore, which is expected to contain potentially good-performing models.
In the NNI framework, a model space is defined by two parts: a base model and possible mutations on the base model.
.. GENERATED FROM PYTHON SOURCE LINES 26-34
Define the Base Model
^^^^^^^^^^^^^^^^^^^^^

Defining a base model is almost the same as defining a PyTorch (or TensorFlow) model.
Usually, you only need to replace the code ``import torch.nn as nn`` with
``import nni.retiarii.nn.pytorch as nn`` to use our wrapped PyTorch modules.

Below is a very simple example of defining a base model.
.. GENERATED FROM PYTHON SOURCE LINES 35-61
.. code-block:: default

    import torch
    import torch.nn.functional as F
    import nni.retiarii.nn.pytorch as nn
    from nni.retiarii import model_wrapper


    @model_wrapper  # this decorator should be put on the outermost module
    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 32, 3, 1)
            self.conv2 = nn.Conv2d(32, 64, 3, 1)
            self.dropout1 = nn.Dropout(0.25)
            self.dropout2 = nn.Dropout(0.5)
            self.fc1 = nn.Linear(9216, 128)
            self.fc2 = nn.Linear(128, 10)

        def forward(self, x):
            x = F.relu(self.conv1(x))
            x = F.max_pool2d(self.conv2(x), 2)
            x = torch.flatten(self.dropout1(x), 1)
            x = self.fc2(self.dropout2(F.relu(self.fc1(x))))
            output = F.log_softmax(x, dim=1)
            return output
.. GENERATED FROM PYTHON SOURCE LINES 62-104
.. tip:: Remember that you should use ``import nni.retiarii.nn.pytorch as nn`` and :meth:`nni.retiarii.model_wrapper`.
    Many mistakes are the result of forgetting one of those.
    Also, to use a submodule of ``nn.init``, go through ``torch.nn``, e.g., ``torch.nn.init`` instead of ``nn.init``.

Define the Model Mutations
^^^^^^^^^^^^^^^^^^^^^^^^^^

A base model is only one concrete model, not a model space. We provide :doc:`APIs for model mutations </nas/construct_space>`
so that users can express how the base model can be mutated, i.e., build a search space that contains many models.

Based on the base model above, we can define the following model space.
.. code-block:: diff

    @model_wrapper
    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 32, 3, 1)
    -       self.conv2 = nn.Conv2d(32, 64, 3, 1)
    +       self.conv2 = nn.LayerChoice([
    +           nn.Conv2d(32, 64, 3, 1),
    +           DepthwiseSeparableConv(32, 64)
    +       ])
    -       self.dropout1 = nn.Dropout(0.25)
    +       self.dropout1 = nn.Dropout(nn.ValueChoice([0.25, 0.5, 0.75]))
            self.dropout2 = nn.Dropout(0.5)
    -       self.fc1 = nn.Linear(9216, 128)
    -       self.fc2 = nn.Linear(128, 10)
    +       feature = nn.ValueChoice([64, 128, 256])
    +       self.fc1 = nn.Linear(9216, feature)
    +       self.fc2 = nn.Linear(feature, 10)

        def forward(self, x):
            x = F.relu(self.conv1(x))
            x = F.max_pool2d(self.conv2(x), 2)
            x = torch.flatten(self.dropout1(x), 1)
            x = self.fc2(self.dropout2(F.relu(self.fc1(x))))
            output = F.log_softmax(x, dim=1)
            return output
This results in the following code:
.. GENERATED FROM PYTHON SOURCE LINES 104-147
.. code-block:: default

    class DepthwiseSeparableConv(nn.Module):
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, groups=in_ch)
            self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))


    @model_wrapper
    class ModelSpace(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 32, 3, 1)
            # LayerChoice is used to select a layer between Conv2d and DwConv.
            self.conv2 = nn.LayerChoice([
                nn.Conv2d(32, 64, 3, 1),
                DepthwiseSeparableConv(32, 64)
            ])
            # ValueChoice is used to select a dropout rate.
            # ValueChoice can be used as parameter of modules wrapped in `nni.retiarii.nn.pytorch`
            # or customized modules wrapped with `@basic_unit`.
            self.dropout1 = nn.Dropout(nn.ValueChoice([0.25, 0.5, 0.75]))  # choose dropout rate from 0.25, 0.5 and 0.75
            self.dropout2 = nn.Dropout(0.5)
            feature = nn.ValueChoice([64, 128, 256])
            self.fc1 = nn.Linear(9216, feature)
            self.fc2 = nn.Linear(feature, 10)

        def forward(self, x):
            x = F.relu(self.conv1(x))
            x = F.max_pool2d(self.conv2(x), 2)
            x = torch.flatten(self.dropout1(x), 1)
            x = self.fc2(self.dropout2(F.relu(self.fc1(x))))
            output = F.log_softmax(x, dim=1)
            return output


    model_space = ModelSpace()
    model_space
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    ModelSpace(
      (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1))
      (conv2): LayerChoice([Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1)), DepthwiseSeparableConv(
        (depthwise): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), groups=32)
        (pointwise): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1))
      )], label='model_1')
      (dropout1): Dropout(p=0.25, inplace=False)
      (dropout2): Dropout(p=0.5, inplace=False)
      (fc1): Linear(in_features=9216, out_features=64, bias=True)
      (fc2): Linear(in_features=64, out_features=10, bias=True)
    )
.. GENERATED FROM PYTHON SOURCE LINES 148-182
This example uses two mutation APIs: :class:`nn.LayerChoice <nni.retiarii.nn.pytorch.LayerChoice>` and :class:`nn.ValueChoice <nni.retiarii.nn.pytorch.ValueChoice>`.
:class:`nn.LayerChoice <nni.retiarii.nn.pytorch.LayerChoice>` takes a list of candidate submodules (two in this example) and selects one of them for each sampled model.
It can be used just like an ordinary PyTorch submodule.
:class:`nn.ValueChoice <nni.retiarii.nn.pytorch.ValueChoice>` takes a list of candidate values, and its semantics is to select one value for each sampled model.

More detailed API descriptions and usages can be found :doc:`here </nas/construct_space>`.

.. note::

    We are actively enriching the mutation APIs to make it easier to construct a model space.
    If the currently supported mutation APIs cannot express your model space,
    please refer to :doc:`this document </nas/mutator>` for customizing mutators.
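Conceptually, sampling a model from the space means fixing every choice to one of its candidates. The ``Choice`` class and the ``sample`` helper below are toy stand-ins for illustration, not the real NNI ``LayerChoice``/``ValueChoice`` classes:

```python
import random

class Choice:
    """Toy stand-in for a mutation API: a labeled list of candidates."""
    _counter = 0
    def __init__(self, candidates):
        Choice._counter += 1
        self.label = f'model_{Choice._counter}'  # auto-generated label, like NNI's
        self.candidates = candidates

# A space mirroring the tutorial's mutations: conv layer, dropout rate, feature size.
space = {
    'conv2': Choice(['conv3x3', 'dwsep_conv']),
    'dropout1': Choice([0.25, 0.5, 0.75]),
    'feature': Choice([64, 128, 256]),
}

def sample(space):
    """Fix every choice to one candidate, yielding one concrete model config."""
    return {name: random.choice(c.candidates) for name, c in space.items()}

print(sample(space))
```

A multi-trial experiment repeats this sampling step, builds the concrete model for each sample, and sends it to the evaluator.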
Explore the Defined Model Space
-------------------------------

Simply put, there are two exploration approaches:
(1) evaluate each sampled model independently, which is the search approach in :ref:`multi-trial NAS <multi-trial-nas>`, and
(2) one-shot weight-sharing search, called one-shot NAS for short.
We demonstrate the first approach in this tutorial. Users can refer to :ref:`here <one-shot-nas>` for the second approach.

First, users need to pick a proper exploration strategy to explore the defined model space.
Second, users need to pick or customize a model evaluator to evaluate the performance of each explored model.

Pick an Exploration Strategy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Retiarii supports many :doc:`exploration strategies </nas/exploration_strategy>`.
Simply choosing (i.e., instantiating) an exploration strategy is enough, as demonstrated below:
.. GENERATED FROM PYTHON SOURCE LINES 182-186
.. code-block:: default

    import nni.retiarii.strategy as strategy

    search_strategy = strategy.Random(dedup=True)  # dedup=False if deduplication is not wanted
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    /home/yugzhan/miniconda3/envs/cu102/lib/python3.8/site-packages/ray/autoscaler/_private/cli_logger.py:57: FutureWarning: Not all Ray CLI dependencies were found. In Ray 1.4+, the Ray CLI, autoscaler, and dashboard will only be usable via `pip install 'ray[default]'`. Please update your install command.
      warnings.warn(
.. GENERATED FROM PYTHON SOURCE LINES 187-200
Pick or Customize a Model Evaluator
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

During the exploration, the strategy repeatedly generates new models. A model evaluator is responsible for training and validating each generated model to obtain its performance.
The performance is sent back to the exploration strategy as the model's score to help the strategy generate better models.

Retiarii provides :doc:`built-in model evaluators </nas/evaluator>`, but to start with,
we recommend using :class:`FunctionalEvaluator <nni.retiarii.evaluator.FunctionalEvaluator>`, i.e., wrapping your own training and evaluation code in a function.
This function should receive one single model class and report the final score of this model with :func:`nni.report_final_result`.

The example here creates a simple evaluator that runs on the MNIST dataset, trains for a few epochs, and reports the model's validation accuracy.
.. GENERATED FROM PYTHON SOURCE LINES 200-268
.. code-block:: default

    import nni

    from torchvision import transforms
    from torchvision.datasets import MNIST
    from torch.utils.data import DataLoader


    def train_epoch(model, device, train_loader, optimizer, epoch):
        loss_fn = torch.nn.CrossEntropyLoss()
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()
            output = model(data)
            loss = loss_fn(output, target)
            loss.backward()
            optimizer.step()
            if batch_idx % 10 == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                    epoch, batch_idx * len(data), len(train_loader.dataset),
                    100. * batch_idx / len(train_loader), loss.item()))


    def test_epoch(model, device, test_loader):
        model.eval()
        correct = 0
        with torch.no_grad():
            for data, target in test_loader:
                data, target = data.to(device), target.to(device)
                output = model(data)
                pred = output.argmax(dim=1, keepdim=True)
                correct += pred.eq(target.view_as(pred)).sum().item()

        accuracy = 100. * correct / len(test_loader.dataset)

        print('\nTest set: Accuracy: {}/{} ({:.0f}%)\n'.format(
            correct, len(test_loader.dataset), accuracy))

        return accuracy


    def evaluate_model(model_cls):
        # "model_cls" is a class, need to instantiate
        model = model_cls()

        device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
        model.to(device)

        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        transf = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
        train_loader = DataLoader(MNIST('data/mnist', download=True, transform=transf), batch_size=64, shuffle=True)
        test_loader = DataLoader(MNIST('data/mnist', download=True, train=False, transform=transf), batch_size=64)

        for epoch in range(3):
            # train the model for one epoch
            train_epoch(model, device, train_loader, optimizer, epoch)
            # test the model for one epoch
            accuracy = test_epoch(model, device, test_loader)
            # call report intermediate result. Result can be float or dict
            nni.report_intermediate_result(accuracy)

        # report final test result
        nni.report_final_result(accuracy)
.. GENERATED FROM PYTHON SOURCE LINES 269-270
Create the evaluator
.. GENERATED FROM PYTHON SOURCE LINES 270-274
.. code-block:: default

    from nni.retiarii.evaluator import FunctionalEvaluator
    evaluator = FunctionalEvaluator(evaluate_model)
.. GENERATED FROM PYTHON SOURCE LINES 275-286
The ``train_epoch`` and ``test_epoch`` here can be any customized functions, where users can write their own training logic.
It is recommended that ``evaluate_model`` here accepts no additional arguments other than ``model_cls``.
However, in the :doc:`advanced tutorial </nas/evaluator>`, we will show how to use additional arguments in case you actually need them.
In the future, we will support mutation on the arguments of evaluators, which is commonly called "hyperparameter tuning".
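Until then, one generic workaround (plain Python, not an NNI feature) is to bind the extra arguments before handing the function over, e.g., with ``functools.partial``. The evaluator body and its ``learning_rate``/``epochs`` parameters below are hypothetical placeholders:

```python
import functools

def evaluate_model_with_args(model_cls, learning_rate, epochs):
    # Hypothetical evaluator body: a real one would instantiate the model
    # and train it with these settings, then report the result.
    return f'training {model_cls.__name__} with lr={learning_rate} for {epochs} epochs'

# Bind everything except `model_cls`, producing a one-argument callable
# with the single-parameter shape recommended above.
evaluate_fn = functools.partial(evaluate_model_with_args, learning_rate=1e-3, epochs=3)

class DummyModel:
    pass

print(evaluate_fn(DummyModel))  # → training DummyModel with lr=0.001 for 3 epochs
```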
Launch an Experiment
--------------------

After all the above are prepared, it is time to start an experiment to do the model search, as demonstrated below.
.. GENERATED FROM PYTHON SOURCE LINES 287-293
.. code-block:: default

    from nni.retiarii.experiment.pytorch import RetiariiExperiment, RetiariiExeConfig
    exp = RetiariiExperiment(model_space, evaluator, [], search_strategy)
    exp_config = RetiariiExeConfig('local')
    exp_config.experiment_name = 'mnist_search'
.. GENERATED FROM PYTHON SOURCE LINES 294-295
The following configurations control how many trials to run at most, and how many to run concurrently.
.. GENERATED FROM PYTHON SOURCE LINES 295-299
.. code-block:: default

    exp_config.max_trial_number = 4   # spawn 4 trials at most
    exp_config.trial_concurrency = 2  # will run two trials concurrently
.. GENERATED FROM PYTHON SOURCE LINES 300-302
Set the following configurations if you want to use a GPU.
``use_active_gpu`` should be set to true if you wish to use an occupied GPU (for example, a GPU on which a GUI may be running).
.. GENERATED FROM PYTHON SOURCE LINES 302-306
.. code-block:: default

    exp_config.trial_gpu_number = 1
    exp_config.training_service.use_active_gpu = True
.. GENERATED FROM PYTHON SOURCE LINES 307-308
Launch the experiment. The whole experiment should take several minutes to finish on a workstation with two GPUs.
.. GENERATED FROM PYTHON SOURCE LINES 308-311
.. code-block:: default

    exp.run(exp_config, 8081)
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    INFO:nni.experiment:Creating experiment, Experiment ID: z8ns5fv7
    INFO:nni.experiment:Connecting IPC pipe...
    INFO:nni.experiment:Starting web server...
    INFO:nni.experiment:Setting up...
    INFO:nni.runtime.msg_dispatcher_base:Dispatcher started
    INFO:nni.retiarii.experiment.pytorch:Web UI URLs: http://127.0.0.1:8081 http://10.190.172.35:8081 http://192.168.49.1:8081 http://172.17.0.1:8081
    INFO:nni.retiarii.experiment.pytorch:Start strategy...
    INFO:root:Successfully update searchSpace.
    INFO:nni.retiarii.strategy.bruteforce:Random search running in fixed size mode. Dedup: on.
    INFO:nni.retiarii.experiment.pytorch:Stopping experiment, please wait...
    INFO:nni.retiarii.experiment.pytorch:Strategy exit
    INFO:nni.retiarii.experiment.pytorch:Waiting for experiment to become DONE (you can ctrl+c if there is no running trial jobs)...
    INFO:nni.runtime.msg_dispatcher_base:Dispatcher exiting...
    INFO:nni.retiarii.experiment.pytorch:Experiment stopped
.. GENERATED FROM PYTHON SOURCE LINES 312-330
Besides the ``local`` training service, users can also run Retiarii trials on :doc:`different training services </experiment/training_service/overview>`.

Visualize the Experiment
------------------------

Users can visualize their architecture search experiment in the same way as visualizing a hyperparameter tuning experiment.
For example, open ``localhost:8081`` in your browser, where 8081 is the port you set in ``exp.run``.
Please refer to :doc:`here </experiment/web_portal/web_portal>` for details.

We support visualizing models with third-party visualization engines (like `Netron <https://netron.app/>`__).
This can be used by clicking ``Visualization`` in the detail panel of each trial.
Note that the current visualization is based on `onnx <https://onnx.ai/>`__,
so if the model cannot be exported to onnx, visualization is not feasible.

Built-in evaluators (e.g., Classification) will automatically export the model into a file.
For your own evaluator, you need to save your model to ``$NNI_OUTPUT_DIR/model.onnx``.
For instance,
.. GENERATED FROM PYTHON SOURCE LINES 330-344
.. code-block:: default

    import os
    from pathlib import Path


    def evaluate_model_with_visualization(model_cls):
        model = model_cls()
        # dump the model into an onnx
        if 'NNI_OUTPUT_DIR' in os.environ:
            dummy_input = torch.zeros(1, 3, 32, 32)
            torch.onnx.export(model, (dummy_input, ),
                              Path(os.environ['NNI_OUTPUT_DIR']) / 'model.onnx')
        evaluate_model(model_cls)
.. GENERATED FROM PYTHON SOURCE LINES 345-353
Relaunch the experiment, and a button will appear on the web portal.

.. image:: ../../img/netron_entrance_webui.png

Export Top Models
-----------------

Users can export top models after the exploration is done using ``export_top_models``.
.. GENERATED FROM PYTHON SOURCE LINES 353-357
.. code-block:: default

    for model_dict in exp.export_top_models(formatter='dict'):
        print(model_dict)
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'model_1': '0', 'model_2': 0.25, 'model_3': 64}
.. GENERATED FROM PYTHON SOURCE LINES 358-362
The output is a JSON object which records, for each choice in the best model, which candidate was selected.
If users want the source code of the searched model, they can use the :ref:`graph-based execution engine <graph-based-execution-engine>`, simply by adding the following two lines.
.. GENERATED FROM PYTHON SOURCE LINES 362-365
.. code-block:: default

    exp_config.execution_engine = 'base'
    export_formatter = 'code'
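As an aside, the exported dict maps each choice's auto-generated label to the selected candidate, so it can also be interpreted by hand. The ``resolve`` helper and the candidate lists below are an illustrative sketch, not an NNI API:

```python
def resolve(space, sample):
    """Check an exported sample against candidate lists and return the selections."""
    resolved = {}
    for label, candidates in space.items():
        chosen = sample[label]
        assert chosen in candidates, f'{chosen!r} is not a candidate of {label}'
        resolved[label] = chosen
    return resolved

# Candidate lists mirroring the tutorial's model space: conv layer choice
# (recorded by index), dropout rate, and hidden feature size.
space = {
    'model_1': ['0', '1'],         # LayerChoice candidates, referenced by index
    'model_2': [0.25, 0.5, 0.75],  # dropout rate
    'model_3': [64, 128, 256],     # hidden feature size
}
print(resolve(space, {'model_1': '0', 'model_2': 0.25, 'model_3': 64}))
# → {'model_1': '0', 'model_2': 0.25, 'model_3': 64}
```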
.. rst-class:: sphx-glr-timing
**Total running time of the script:** ( 2 minutes 4.499 seconds)
.. _sphx_glr_download_tutorials_hello_nas.py:
.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: hello_nas.py <hello_nas.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: hello_nas.ipynb <hello_nas.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
@@ -5,10 +5,10 @@
Computation times
=================

**02:04.499** total execution time for **tutorials** files:

+-----------------------------------------------------------------------------------------------------+-----------+--------+
| :ref:`sphx_glr_tutorials_hello_nas.py` (``hello_nas.py``)                                           | 02:04.499 | 0.0 MB |
+-----------------------------------------------------------------------------------------------------+-----------+--------+
| :ref:`sphx_glr_tutorials_nasbench_as_dataset.py` (``nasbench_as_dataset.py``)                       | 00:00.000 | 0.0 MB |
+-----------------------------------------------------------------------------------------------------+-----------+--------+
......
@@ -12,18 +12,18 @@ $(document).ready(function() {
     // the image links are stored in layout.html
     // to leverage jinja engine
     downloadNote.html(`
-        <a class="notebook-action-link" href="${colabLink}">
-            <div class="notebook-action-div">
-                <img src="${GALLERY_LINKS.colab}"/>
-                <div>Run in Google Colab</div>
-            </div>
-        </a>
         <a class="notebook-action-link" href="${notebookLink}">
             <div class="notebook-action-div">
                 <img src="${GALLERY_LINKS.notebook}"/>
                 <div>Download Notebook</div>
             </div>
         </a>
+        <a class="notebook-action-link" href="${colabLink}">
+            <div class="notebook-action-div">
+                <img src="${GALLERY_LINKS.colab}"/>
+                <div>Run in Google Colab</div>
+            </div>
+        </a>
         <a class="notebook-action-link" href="${githubLink}">
             <div class="notebook-action-div">
                 <img src="${GALLERY_LINKS.github}"/>
@@ -78,7 +78,7 @@ for path in iterate_dir(Path('source')):
             failed_files.append('(redundant) ' + source_path.as_posix())
             if not pipeline_mode:
                 print(f'Deleting {source_path}')
-                source_path.unlink()
+                path.unlink()

 if pipeline_mode and failed_files:
@@ -354,11 +354,11 @@ def evaluate_model_with_visualization(model_cls):
 for model_dict in exp.export_top_models(formatter='dict'):
     print(model_dict)

-# The output is `json` object which records the mutation actions of the top model.
-# If users want to output source code of the top model, they can use graph-based execution engine for the experiment,
+# %%
+# The output is ``json`` object which records the mutation actions of the top model.
+# If users want to output source code of the top model,
+# they can use :ref:`graph-based execution engine <graph-based-execution-engine>` for the experiment,
 # by simply adding the following two lines.
-#
-# .. code-block:: python
-#
-#    exp_config.execution_engine = 'base'
-#    export_formatter = 'code'
+
+exp_config.execution_engine = 'base'
+export_formatter = 'code'
@@ -90,6 +90,13 @@ class Cell(nn.Module):
     (e.g., the next cell wants to have the outputs of both this cell and previous cell as its input).
     By default, directly use this cell's output.

+    .. tip::
+        It's highly recommended to make the candidate operators have an output of the same shape as input.
+        This is because there can be dynamic connections within the cell. If an operation changes the shape,
+        the input shape of the subsequent operation becomes unknown.
+        In addition, the final concatenation could have shape mismatch issues.
+
     Parameters
     ----------
     op_candidates : list of module or function, or dict
@@ -131,7 +138,7 @@ class Cell(nn.Module):
         Choose between conv2d and maxpool2d.
         The cell has 4 nodes, 1 op per node, and 2 predecessors.

-        >>> cell = nn.Cell([nn.Conv2d(32, 32, 3), nn.MaxPool2d(3)], 4, 1, 2)
+        >>> cell = nn.Cell([nn.Conv2d(32, 32, 3, padding=1), nn.MaxPool2d(3, padding=1)], 4, 1, 2)

         In forward:
@@ -169,7 +176,7 @@ class Cell(nn.Module):
     Warnings
     --------
-    :class:`Cell` is not supported in :ref:`graph-based execution engine <graph-based-exeuction-engine>`.
+    :class:`Cell` is not supported in :ref:`graph-based execution engine <graph-based-execution-engine>`.

     Attributes
     ----------
@@ -280,7 +280,7 @@ class NasBench101Cell(Mutable):
     Warnings
     --------
-    :class:`NasBench101Cell` is not supported in :ref:`graph-based execution engine <graph-based-exeuction-engine>`.
+    :class:`NasBench101Cell` is not supported in :ref:`graph-based execution engine <graph-based-execution-engine>`.
     """

     @staticmethod