Commit 4446280d authored by liuzhe-lz, committed by GitHub

update hpo tutorials (#4758)

parent b7dfc7cf
@@ -19,6 +19,14 @@ More examples can be found in our :githublink:`GitHub repository <nni/examples>`
    :background: purple
    :tags: HPO

+.. cardlinkitem::
+   :header: HPO using command line tool
+   :description: Run HPO experiment with nnictl
+   :link: tutorials/hpo_nnictl/nnictl
+   :image: ../img/thumbnails/hpo-pytorch.svg
+   :background: purple
+   :tags: HPO
+
 .. cardlinkitem::
    :header: Hello, NAS!
    :description: Beginners' NAS tutorial on how to search for neural architectures for MNIST dataset.
......
-###########################
-Hyperparameter Optimization
-###########################
+Advanced Usage
+==============

 .. toctree::
-   :maxdepth: 2
+   :hidden:

+   Command Line Tool Example </tutorials/hpo_nnictl/nnictl>
    Implement Custom Tuners and Assessors <custom_algorithm>
    Install Custom or 3rd-party Tuners and Assessors <custom_algorithm_installation>
    Tuner Benchmark <hpo_benchmark>
......
@@ -106,6 +106,9 @@ Extra Features

 After you are familiar with basic usage, you can explore more HPO features:

 * :doc:`Use command line tool to create and manage experiments (nnictl) </reference/nnictl>`
+* :doc:`nnictl example </tutorials/hpo_nnictl/nnictl>`
 * :doc:`Early stop non-optimal models (assessor) <assessors>`
 * :doc:`TensorBoard integration </experiment/web_portal/tensorboard>`
 * :doc:`Implement your own algorithm <custom_algorithm>`
......
-.. 317442fd7a0540c0776a08ad773566cf
+.. c74f6d072f5f8fa93eadd214bba992b4

 Hyperparameter Optimization
 ========

 Automatic hyperparameter optimization (HPO) is one of NNI's main features.

 Introduction to hyperparameter optimization
 ------------
@@ -36,24 +36,24 @@
 2. :ref:`Train on a distributed platform <zh-hpo-overview-platforms>`
 3. :ref:`Monitor the tuning process with the web portal <zh-hpo-overview-portal>`

 NNI can meet these needs.

 Key features of NNI HPO
----------------------
+----------------------

 .. _zh-hpo-overview-tuners:

 Tuning algorithms
 ^^^^^^^^
 NNI uses tuning algorithms, called "tuners", to find the best hyperparameter combination faster.
 A tuning algorithm decides which hyperparameter combinations to run and evaluate, and in what order to evaluate them.
 An efficient algorithm uses the results of already-evaluated combinations to predict the best hyperparameter values, reducing the number of evaluations needed to find the optimum.
 The example at the beginning evaluates every possible combination in a fixed order and ignores the evaluation results; this naive approach is called "grid search".
 NNI has many popular tuning algorithms built in, including naive algorithms such as random search and grid search, Bayesian optimization algorithms such as TPE and SMAC, reinforcement learning algorithms such as PPO, and more.

 Full reference: :doc:`tuners`
@@ -62,9 +62,9 @@ NNI has many popular tuning algorithms built in, including naive algorithms such as random search,
 Training platforms
 ^^^^^^^^
 If you do not plan to use a distributed training platform, you can run NNI HPO directly on your own machine, just like an ordinary Python library.
 If you want more computing resources to speed up tuning, you can also use NNI's built-in training platform integrations, which cover everything from simple SSH servers to scalable Kubernetes clusters.

 Full reference: :doc:`/experiment/training_service/overview`
@@ -73,7 +73,7 @@ NNI has many popular tuning algorithms built in, including naive algorithms such as random search,
 Web portal
 ^^^^^^^^^^
 You can use NNI's web portal to monitor HPO experiments. It supports displaying experiment progress in real time, visualizing hyperparameter performance, manually adjusting hyperparameter values, managing multiple experiments at once, and more.

 Full reference: :doc:`/experiment/web_portal/web_portal`
@@ -83,17 +83,20 @@ NNI has many popular tuning algorithms built in, including naive algorithms such as random search,
 Tutorials
 ----

 We provide the following tutorials to help you get started with NNI HPO; pick the machine learning framework you are most familiar with:

 * :doc:`HPO tutorial with PyTorch </tutorials/hpo_quickstart_pytorch/main>`
-* :doc:`HPO tutorial with TensorFlow </tutorials/hpo_quickstart_tensorflow/main>`
+* :doc:`HPO tutorial with TensorFlow (in English) </tutorials/hpo_quickstart_tensorflow/main>`

 Extra features
 --------

 After you have mastered the basic usage of NNI HPO, you can try the following extra features:

 * :doc:`Use command line tool to create and manage experiments (nnictl) </reference/nnictl>`
+* :doc:`nnictl example </tutorials/hpo_nnictl/nnictl>`
 * :doc:`Early stop non-optimal models (assessor) <assessors>`
 * :doc:`TensorBoard integration </experiment/web_portal/tensorboard>`
 * :doc:`Implement your own algorithm <custom_algorithm>`
......
Quickstart
==========

.. toctree::

    PyTorch </tutorials/hpo_quickstart_pytorch/main>
    TensorFlow </tutorials/hpo_quickstart_tensorflow/main>
@@ -275,17 +275,17 @@ Search Space Types Supported by Each Tuner
     -
   * - :class:`BOHB <nni.algorithms.hpo.bohb_advisor.BOHB>`
-    - choice
-    - choice(nested)
-    - randint
-    - uniform
-    - quniform
-    - loguniform
-    - qloguniform
-    - normal
-    - qnormal
-    - lognormal
-    - qlognormal
+    -
+    -
+    -
+    -
+    -
+    -
+    -
+    -
+    -
+    -
+    -
   * - :class:`GP <nni.algorithms.hpo.gp_tuner.GPTuner>`
     - ✓
@@ -301,17 +301,17 @@ Search Space Types Supported by Each Tuner
     -
   * - :class:`PBT <nni.algorithms.hpo.pbt_tuner.PBTTuner>`
-    - choice
-    - choice(nested)
-    - randint
-    - uniform
-    - quniform
-    - loguniform
-    - qloguniform
-    - normal
-    - qnormal
-    - lognormal
-    - qlognormal
+    -
+    -
+    -
+    -
+    -
+    -
+    -
+    -
+    -
+    -
+    -
   * - :class:`DNGO <nni.algorithms.hpo.dngo_tuner.DNGOTuner>`
     - ✓
......
@@ -2,11 +2,11 @@ Hyperparameter Optimization
 ===========================

 .. toctree::
-   :maxdepth: 2
+   :hidden:

    Overview <overview>
-   Tutorial </tutorials/hpo_quickstart_pytorch/main>
+   quickstart
    Search Space <search_space>
    Tuners <tuners>
    Assessors <assessors>
-   Advanced Usage <advanced_toctree.rst>
+   advanced_usage
.. 21e9c3e0f6b182cf42a99a7f6c4ecf98

Hyperparameter Optimization
===========================

.. toctree::
   :hidden:

   Overview <overview>
   Tutorials <quickstart>
   Search Space <search_space>
   Tuners <tuners>
   Assessors <assessors>
   Advanced Usage <advanced_usage>
@@ -14,7 +14,7 @@ NNI Documentation
    :caption: User Guide
    :hidden:

-   Hyperparameter Optimization <hpo/index>
+   hpo/toctree
    nas/toctree
    Model Compression <compression/toctree>
    feature_engineering/toctree
@@ -82,7 +82,7 @@ NNI makes AutoML techniques plug-and-play

 .. codesnippetcard::
    :icon: ../img/thumbnails/hpo-small.svg
-   :title: Hyper-parameter Tuning
+   :title: Hyperparameter Tuning
    :link: tutorials/hpo_quickstart_pytorch/main

    .. code-block::
......
-.. 27dfb81863f35f50fabc494a7d1ca457
+.. f2a86f83def6c4b2e35ba50ce2487deb

 NNI Documentation
 =================
@@ -16,7 +16,7 @@ NNI Documentation
    :caption: User Guide
    :hidden:

-   Hyperparameter Optimization <hpo/index>
+   Hyperparameter Optimization <hpo/toctree>
    Neural Architecture Search <nas/toctree>
    Model Compression <compression/toctree>
    Feature Engineering <feature_engineering/toctree>
......
@@ -22,7 +22,7 @@ In this figure:

 * *Exploration strategy* is the algorithm that is used to explore a model search space. Sometimes we also call it *search strategy*.
 * *Model evaluator* is responsible for training a model and evaluating its performance.

-The process is similar to :doc:`Hyperparameter Optimization </hpo/index>`, except that the target is the best architecture rather than hyperparameter. Concretely, an exploration strategy selects an architecture from a predefined search space. The architecture is passed to a performance evaluation to get a score, which represents how well this architecture performs on a particular task. This process is repeated until the search process is able to find the best architecture.
+The process is similar to :doc:`Hyperparameter Optimization </hpo/overview>`, except that the target is the best architecture rather than hyperparameter. Concretely, an exploration strategy selects an architecture from a predefined search space. The architecture is passed to a performance evaluation to get a score, which represents how well this architecture performs on a particular task. This process is repeated until the search process is able to find the best architecture.

 Key Features
 ------------
......
-.. 1bfa9317e112e9ffc5c7c6a2625188ab
+.. 48c39585a539a877461aadef63078c48

 Neural Architecture Search
 ===========================
@@ -33,7 +33,7 @@

 * *Exploration strategy* is the algorithm used to explore the model search space. Sometimes we also call it the *search strategy*.
 * *Model evaluator* is responsible for training a model and evaluating its performance.

-The process is similar to :doc:`Hyperparameter Optimization </hpo/index>`, except that the target is the best network architecture rather than the best hyperparameters. Concretely, the exploration strategy selects an architecture from a predefined search space. The architecture is passed to a performance evaluation to get a score that represents how well this architecture performs on a particular task. This process is repeated until the search can find the best architecture.
+The process is similar to :doc:`Hyperparameter Optimization </hpo/overview>`, except that the target is the best network architecture rather than the best hyperparameters. Concretely, the exploration strategy selects an architecture from a predefined search space. The architecture is passed to a performance evaluation to get a score that represents how well this architecture performs on a particular task. This process is repeated until the search can find the best architecture.

 Key Features
 ------------
......
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# Port PyTorch Quickstart to NNI\nThis is a modified version of `PyTorch quickstart`_.\n\nIt can be run directly and will have the exact same result as original version.\n\nFurthermore, it enables the ability of auto tuning with an NNI *experiment*, which will be detailed later.\n\nIt is recommended to run this script directly first to verify the environment.\n\nThere are 2 key differences from the original version:\n\n1. In `Get optimized hyperparameters`_ part, it receives generated hyperparameters.\n2. In `Train model and report accuracy`_ part, it reports accuracy metrics to NNI.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import nni\nimport torch\nfrom torch import nn\nfrom torch.utils.data import DataLoader\nfrom torchvision import datasets\nfrom torchvision.transforms import ToTensor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Hyperparameters to be tuned\nThese are the hyperparameters that will be tuned.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"params = {\n 'features': 512,\n 'lr': 0.001,\n 'momentum': 0,\n}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get optimized hyperparameters\nIf run directly, :func:`nni.get_next_parameter` is a no-op and returns an empty dict.\nBut with an NNI *experiment*, it will receive optimized hyperparameters from tuning algorithm.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"optimized_params = nni.get_next_parameter()\nparams.update(optimized_params)\nprint(params)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load dataset\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"training_data = datasets.FashionMNIST(root=\"data\", train=True, download=True, transform=ToTensor())\ntest_data = datasets.FashionMNIST(root=\"data\", train=False, download=True, transform=ToTensor())\n\nbatch_size = 64\n\ntrain_dataloader = DataLoader(training_data, batch_size=batch_size)\ntest_dataloader = DataLoader(test_data, batch_size=batch_size)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build model with hyperparameters\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\nprint(f\"Using {device} device\")\n\nclass NeuralNetwork(nn.Module):\n def __init__(self):\n super(NeuralNetwork, self).__init__()\n self.flatten = nn.Flatten()\n self.linear_relu_stack = nn.Sequential(\n nn.Linear(28*28, params['features']),\n nn.ReLU(),\n nn.Linear(params['features'], params['features']),\n nn.ReLU(),\n nn.Linear(params['features'], 10)\n )\n\n def forward(self, x):\n x = self.flatten(x)\n logits = self.linear_relu_stack(x)\n return logits\n\nmodel = NeuralNetwork().to(device)\n\nloss_fn = nn.CrossEntropyLoss()\noptimizer = torch.optim.SGD(model.parameters(), lr=params['lr'], momentum=params['momentum'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define train and test\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def train(dataloader, model, loss_fn, optimizer):\n size = len(dataloader.dataset)\n model.train()\n for batch, (X, y) in enumerate(dataloader):\n X, y = X.to(device), y.to(device)\n pred = model(X)\n loss = loss_fn(pred, y)\n optimizer.zero_grad()\n loss.backward()\n optimizer.step()\n\ndef test(dataloader, model, loss_fn):\n size = len(dataloader.dataset)\n num_batches = len(dataloader)\n model.eval()\n test_loss, correct = 0, 0\n with torch.no_grad():\n for X, y in dataloader:\n X, y = X.to(device), y.to(device)\n pred = model(X)\n test_loss += loss_fn(pred, y).item()\n correct += (pred.argmax(1) == y).type(torch.float).sum().item()\n test_loss /= num_batches\n correct /= size\n return correct"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train model and report accuracy\nReport accuracy metrics to NNI so the tuning algorithm can suggest better hyperparameters.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"epochs = 5\nfor t in range(epochs):\n print(f\"Epoch {t+1}\\n-------------------------------\")\n train(train_dataloader, model, loss_fn, optimizer)\n accuracy = test(test_dataloader, model, loss_fn)\n nni.report_intermediate_result(accuracy)\nnni.report_final_result(accuracy)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
\ No newline at end of file
"""
Port PyTorch Quickstart to NNI
==============================
This is a modified version of `PyTorch quickstart`_.
It can be run directly and will have the exact same result as original version.
Furthermore, it enables the ability of auto tuning with an NNI *experiment*, which will be detailed later.
It is recommended to run this script directly first to verify the environment.
There are 2 key differences from the original version:
1. In `Get optimized hyperparameters`_ part, it receives generated hyperparameters.
2. In `Train model and report accuracy`_ part, it reports accuracy metrics to NNI.
.. _PyTorch quickstart: https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
"""
# %%
import nni
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
# %%
# Hyperparameters to be tuned
# ---------------------------
# These are the hyperparameters that will be tuned.
params = {
'features': 512,
'lr': 0.001,
'momentum': 0,
}
# %%
# Get optimized hyperparameters
# -----------------------------
# If run directly, :func:`nni.get_next_parameter` is a no-op and returns an empty dict.
# But with an NNI *experiment*, it will receive optimized hyperparameters from tuning algorithm.
optimized_params = nni.get_next_parameter()
params.update(optimized_params)
print(params)
# %%
# Load dataset
# ------------
training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=ToTensor())
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)
# %%
# Build model with hyperparameters
# --------------------------------
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")
class NeuralNetwork(nn.Module):
def __init__(self):
super(NeuralNetwork, self).__init__()
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(28*28, params['features']),
nn.ReLU(),
nn.Linear(params['features'], params['features']),
nn.ReLU(),
nn.Linear(params['features'], 10)
)
def forward(self, x):
x = self.flatten(x)
logits = self.linear_relu_stack(x)
return logits
model = NeuralNetwork().to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=params['lr'], momentum=params['momentum'])
# %%
# Define train and test
# ---------------------
def train(dataloader, model, loss_fn, optimizer):
size = len(dataloader.dataset)
model.train()
for batch, (X, y) in enumerate(dataloader):
X, y = X.to(device), y.to(device)
pred = model(X)
loss = loss_fn(pred, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
def test(dataloader, model, loss_fn):
size = len(dataloader.dataset)
num_batches = len(dataloader)
model.eval()
test_loss, correct = 0, 0
with torch.no_grad():
for X, y in dataloader:
X, y = X.to(device), y.to(device)
pred = model(X)
test_loss += loss_fn(pred, y).item()
correct += (pred.argmax(1) == y).type(torch.float).sum().item()
test_loss /= num_batches
correct /= size
return correct
# %%
# Train model and report accuracy
# -------------------------------
# Report accuracy metrics to NNI so the tuning algorithm can suggest better hyperparameters.
epochs = 5
for t in range(epochs):
print(f"Epoch {t+1}\n-------------------------------")
train(train_dataloader, model, loss_fn, optimizer)
accuracy = test(test_dataloader, model, loss_fn)
nni.report_intermediate_result(accuracy)
nni.report_final_result(accuracy)
ed8bfc27e3d555d842fc4eec2635e619
\ No newline at end of file
:orphan:
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/hpo_nnictl/model.py"
.. LINE NUMBERS ARE GIVEN BELOW.
.. only:: html
.. note::
:class: sphx-glr-download-link-note
Click :ref:`here <sphx_glr_download_tutorials_hpo_nnictl_model.py>`
to download the full example code
.. rst-class:: sphx-glr-example-title
.. _sphx_glr_tutorials_hpo_nnictl_model.py:
Port PyTorch Quickstart to NNI
==============================
This is a modified version of `PyTorch quickstart`_.
It can be run directly and will have the exact same result as original version.
Furthermore, it enables the ability of auto tuning with an NNI *experiment*, which will be detailed later.
It is recommended to run this script directly first to verify the environment.
There are 2 key differences from the original version:
1. In `Get optimized hyperparameters`_ part, it receives generated hyperparameters.
2. In `Train model and report accuracy`_ part, it reports accuracy metrics to NNI.
.. _PyTorch quickstart: https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
.. GENERATED FROM PYTHON SOURCE LINES 21-28
.. code-block:: default
import nni
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
.. GENERATED FROM PYTHON SOURCE LINES 29-32
Hyperparameters to be tuned
---------------------------
These are the hyperparameters that will be tuned.
.. GENERATED FROM PYTHON SOURCE LINES 32-38
.. code-block:: default
params = {
'features': 512,
'lr': 0.001,
'momentum': 0,
}
.. GENERATED FROM PYTHON SOURCE LINES 39-43
Get optimized hyperparameters
-----------------------------
If run directly, :func:`nni.get_next_parameter` is a no-op and returns an empty dict.
But with an NNI *experiment*, it will receive optimized hyperparameters from tuning algorithm.
.. GENERATED FROM PYTHON SOURCE LINES 43-47
.. code-block:: default
optimized_params = nni.get_next_parameter()
params.update(optimized_params)
print(params)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
{'features': 512, 'lr': 0.001, 'momentum': 0}
.. GENERATED FROM PYTHON SOURCE LINES 48-50
Load dataset
------------
.. GENERATED FROM PYTHON SOURCE LINES 50-58
.. code-block:: default
training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=ToTensor())
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)
.. GENERATED FROM PYTHON SOURCE LINES 59-61
Build model with hyperparameters
--------------------------------
.. GENERATED FROM PYTHON SOURCE LINES 61-86
.. code-block:: default
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")
class NeuralNetwork(nn.Module):
def __init__(self):
super(NeuralNetwork, self).__init__()
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(28*28, params['features']),
nn.ReLU(),
nn.Linear(params['features'], params['features']),
nn.ReLU(),
nn.Linear(params['features'], 10)
)
def forward(self, x):
x = self.flatten(x)
logits = self.linear_relu_stack(x)
return logits
model = NeuralNetwork().to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=params['lr'], momentum=params['momentum'])
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
Using cpu device
.. GENERATED FROM PYTHON SOURCE LINES 87-89
Define train and test
---------------------
.. GENERATED FROM PYTHON SOURCE LINES 89-115
.. code-block:: default
def train(dataloader, model, loss_fn, optimizer):
size = len(dataloader.dataset)
model.train()
for batch, (X, y) in enumerate(dataloader):
X, y = X.to(device), y.to(device)
pred = model(X)
loss = loss_fn(pred, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
def test(dataloader, model, loss_fn):
size = len(dataloader.dataset)
num_batches = len(dataloader)
model.eval()
test_loss, correct = 0, 0
with torch.no_grad():
for X, y in dataloader:
X, y = X.to(device), y.to(device)
pred = model(X)
test_loss += loss_fn(pred, y).item()
correct += (pred.argmax(1) == y).type(torch.float).sum().item()
test_loss /= num_batches
correct /= size
return correct
.. GENERATED FROM PYTHON SOURCE LINES 116-119
Train model and report accuracy
-------------------------------
Report accuracy metrics to NNI so the tuning algorithm can suggest better hyperparameters.
.. GENERATED FROM PYTHON SOURCE LINES 119-126
.. code-block:: default
epochs = 5
for t in range(epochs):
print(f"Epoch {t+1}\n-------------------------------")
train(train_dataloader, model, loss_fn, optimizer)
accuracy = test(test_dataloader, model, loss_fn)
nni.report_intermediate_result(accuracy)
nni.report_final_result(accuracy)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
Epoch 1
-------------------------------
[2022-03-21 01:09:37] INFO (nni/MainThread) Intermediate result: 0.461 (Index 0)
Epoch 2
-------------------------------
[2022-03-21 01:09:42] INFO (nni/MainThread) Intermediate result: 0.5529 (Index 1)
Epoch 3
-------------------------------
[2022-03-21 01:09:47] INFO (nni/MainThread) Intermediate result: 0.6155 (Index 2)
Epoch 4
-------------------------------
[2022-03-21 01:09:52] INFO (nni/MainThread) Intermediate result: 0.6345 (Index 3)
Epoch 5
-------------------------------
[2022-03-21 01:09:56] INFO (nni/MainThread) Intermediate result: 0.6505 (Index 4)
[2022-03-21 01:09:56] INFO (nni/MainThread) Final result: 0.6505
.. rst-class:: sphx-glr-timing
**Total running time of the script:** ( 0 minutes 24.441 seconds)
.. _sphx_glr_download_tutorials_hpo_nnictl_model.py:
.. only :: html
.. container:: sphx-glr-footer
:class: sphx-glr-footer-example
.. container:: sphx-glr-download sphx-glr-download-python
:download:`Download Python source code: model.py <model.py>`
.. container:: sphx-glr-download sphx-glr-download-jupyter
:download:`Download Jupyter notebook: model.ipynb <model.ipynb>`
.. only:: html
.. rst-class:: sphx-glr-signature
`Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
Run HPO Experiment with nnictl
==============================
This tutorial has exactly the same effect as :doc:`../hpo_quickstart_pytorch/main`.
Both tutorials optimize the model in `official PyTorch quickstart
<https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html>`__ with auto-tuning,
while this one manages the experiment with the command line tool and a YAML config file, instead of pure Python code.
The tutorial consists of 4 steps:
1. Modify the model for auto-tuning.
2. Define hyperparameters' search space.
3. Create config file.
4. Run the experiment.
The first two steps are identical to the quickstart.
Step 1: Prepare the model
-------------------------
In the first step, we need to prepare the model to be tuned.
The model should be put in a separate script,
because it will be evaluated many times concurrently,
and possibly trained on distributed platforms.
In this tutorial, the model is defined in :doc:`model.py <model>`.
In short, it is a PyTorch model with 3 additional API calls:
1. Use :func:`nni.get_next_parameter` to fetch the hyperparameters to be evaluated.
2. Use :func:`nni.report_intermediate_result` to report per-epoch accuracy metrics.
3. Use :func:`nni.report_final_result` to report the final accuracy.
Please understand the model code before continuing to the next step.
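Below is a minimal sketch (for orientation only, not the full :doc:`model.py <model>`) of where the three calls sit in a trial script; the training details are placeholders:

.. code-block:: python

    import nni

    # Default hyperparameters; values generated by the tuner will override them.
    params = {'features': 512, 'lr': 0.001, 'momentum': 0}
    params.update(nni.get_next_parameter())       # 1. fetch the hyperparameters to evaluate

    # ... build the model and data loaders using `params` ...

    for epoch in range(5):
        # ... train one epoch and compute validation accuracy ...
        accuracy = 0.0                            # placeholder for the real metric
        nni.report_intermediate_result(accuracy)  # 2. per-epoch metric

    nni.report_final_result(accuracy)             # 3. final metric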
Step 2: Define search space
---------------------------
In the model code, we have prepared 3 hyperparameters to be tuned:
*features*, *lr*, and *momentum*.
Here we need to define their *search space* so the tuning algorithm can sample them in the desired range.
Assume we have the following prior knowledge about these hyperparameters:
1. *features* should be one of 128, 256, 512, 1024.
2. *lr* should be a float between 0.0001 and 0.1, following an exponential distribution.
3. *momentum* should be a float between 0 and 1.
In NNI, the space of *features* is called ``choice``;
the space of *lr* is called ``loguniform``;
and the space of *momentum* is called ``uniform``.
As you may have noticed, these names are derived from ``numpy.random``.
For the full specification of search space, check :doc:`the reference </hpo/search_space>`.
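For illustration only, the three space types roughly correspond to the following ``numpy.random`` draws (a sketch of the sampling semantics, not of how NNI is implemented):

.. code-block:: python

    import numpy as np

    rng = np.random.default_rng()

    features = rng.choice([128, 256, 512, 1024])            # choice: pick one of the listed values
    lr = np.exp(rng.uniform(np.log(0.0001), np.log(0.1)))   # loguniform: log(value) is uniform in [log(low), log(high)]
    momentum = rng.uniform(0, 1)                             # uniform: uniform in [low, high]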
Now we can define the search space as follows:
.. code-block:: yaml

    search_space:
      features:
        _type: choice
        _value: [ 128, 256, 512, 1024 ]
      lr:
        _type: loguniform
        _value: [ 0.0001, 0.1 ]
      momentum:
        _type: uniform
        _value: [ 0, 1 ]
Step 3: Configure the experiment
--------------------------------
NNI uses an *experiment* to manage the HPO process.
The *experiment config* defines how to train the models and how to explore the search space.
In this tutorial we use a YAML file ``config.yaml`` to define the experiment.
Configure trial code
^^^^^^^^^^^^^^^^^^^^
In NNI, the evaluation of each set of hyperparameters is called a *trial*,
so the model script is called the *trial code*.
.. code-block:: yaml

    trial_command: python model.py
    trial_code_directory: .

When ``trial_code_directory`` is a relative path, it is resolved relative to the config file.
So in this case we need to put ``config.yaml`` and ``model.py`` in the same directory.
.. attention::
    The rules for resolving relative paths differ between the YAML config file and the :doc:`Python experiment API </reference/experiment>`.
    With the Python experiment API, relative paths are relative to the current working directory.
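For comparison, here is a rough sketch of the same two settings made through the Python experiment API (following the quickstart tutorial); note that ``'.'`` here would be resolved against the directory you launch Python from, not against any config file:

.. code-block:: python

    from nni.experiment import Experiment

    experiment = Experiment('local')
    experiment.config.trial_command = 'python model.py'
    # With the Python API, this relative path is resolved against the
    # current working directory rather than the config file's location.
    experiment.config.trial_code_directory = '.'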
Configure how many trials to run
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here we evaluate 10 sets of hyperparameters in total, evaluating 2 sets concurrently at a time.
.. code-block:: yaml

    max_trial_number: 10
    trial_concurrency: 2

You may also set ``max_experiment_duration = '1h'`` to limit the running time.
If neither ``max_trial_number`` nor ``max_experiment_duration`` is set,
the experiment will run forever until you stop it.
.. note::
    ``max_trial_number`` is set to 10 here for a fast example.
    In the real world it should be set to a larger number.
    With the default config, the TPE tuner requires 20 trials to warm up.
Configure tuning algorithm
^^^^^^^^^^^^^^^^^^^^^^^^^^
Here we use the :doc:`TPE tuner </hpo/tuners>`.
.. code-block:: yaml

    tuner:
      name: TPE
      class_args:
        optimize_mode: maximize
Configure training service
^^^^^^^^^^^^^^^^^^^^^^^^^^
In this tutorial we use *local* mode,
which means the models will be trained on the local machine, without using any special training platform.
.. code-block:: yaml

    training_service:
      platform: local
Wrap up
^^^^^^^
The full content of ``config.yaml`` is as follows:
.. code-block:: yaml

    search_space:
      features:
        _type: choice
        _value: [ 128, 256, 512, 1024 ]
      lr:
        _type: loguniform
        _value: [ 0.0001, 0.1 ]
      momentum:
        _type: uniform
        _value: [ 0, 1 ]
    trial_command: python model.py
    trial_code_directory: .
    trial_concurrency: 2
    max_trial_number: 10
    tuner:
      name: TPE
      class_args:
        optimize_mode: maximize
    training_service:
      platform: local
Step 4: Run the experiment
--------------------------
Now the experiment is ready. Launch it with the ``nnictl create`` command:
.. code-block:: bash
$ nnictl create --config config.yaml --port 8080
You can use the web portal to view experiment status: http://localhost:8080.
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
[2022-04-01 12:00:00] Creating experiment, Experiment ID: p43ny6ew
[2022-04-01 12:00:00] Starting web server...
[2022-04-01 12:00:01] Setting up...
[2022-04-01 12:00:01] Web portal URLs: http://127.0.0.1:8080 http://192.168.1.1:8080
[2022-04-01 12:00:01] To stop experiment run "nnictl stop p43ny6ew" or "nnictl stop --all"
[2022-04-01 12:00:01] Reference: https://nni.readthedocs.io/en/stable/reference/nnictl.html
When the experiment is done, use the ``nnictl stop`` command to stop it.
.. code-block:: bash
$ nnictl stop p43ny6ew
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
INFO: Stopping experiment p43ny6ew
INFO: Stop experiment success.
@@ -15,7 +15,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "\n# NNI HPO Quickstart with PyTorch\nThis tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning.\n\nThere is also a :doc:`TensorFlow version<../hpo_quickstart_tensorflow/main>` if you prefer it.\n\nThe tutorial consists of 4 steps: \n\n1. Modify the model for auto-tuning.\n2. Define hyperparameters' search space.\n3. Configure the experiment.\n4. Run the experiment.\n\n"
+    "\n# HPO Quickstart with PyTorch\nThis tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning.\n\nThe tutorial consists of 4 steps: \n\n1. Modify the model for auto-tuning.\n2. Define hyperparameters' search space.\n3. Configure the experiment.\n4. Run the experiment.\n\n"
    ]
   },
   {
@@ -144,7 +144,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "<div class=\"alert alert-info\"><h4>Note</h4><p>``max_trial_number`` is set to 10 here for a fast example.\n In real world it should be set to a larger number.\n With default config TPE tuner requires 20 trials to warm up.</p></div>\n\nYou may also set ``max_experiment_duration = '1h'`` to limit running time.\n\nIf neither ``max_trial_number`` nor ``max_experiment_duration`` are set,\nthe experiment will run forever until you press Ctrl-C.\n\n"
+    "You may also set ``max_experiment_duration = '1h'`` to limit running time.\n\nIf neither ``max_trial_number`` nor ``max_experiment_duration`` are set,\nthe experiment will run forever until you press Ctrl-C.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>``max_trial_number`` is set to 10 here for a fast example.\n In real world it should be set to a larger number.\n With default config TPE tuner requires 20 trials to warm up.</p></div>\n\n"
    ]
   },
   {
@@ -187,7 +187,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    ":meth:`nni.experiment.Experiment.stop` is automatically invoked when Python exits,\nso it can be omitted in your code.\n\nAfter the experiment is stopped, you can run :meth:`nni.experiment.Experiment.view` to restart web portal.\n\n.. tip::\n\n This example uses :doc:`Python API </reference/experiment>` to create experiment.\n\n You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`.\n\n"
+    ":meth:`nni.experiment.Experiment.stop` is automatically invoked when Python exits,\nso it can be omitted in your code.\n\nAfter the experiment is stopped, you can run :meth:`nni.experiment.Experiment.view` to restart web portal.\n\n.. tip::\n\n This example uses :doc:`Python API </reference/experiment>` to create experiment.\n\n You can also create and manage experiments with :doc:`command line tool <../hpo_nnictl/nnictl>`.\n\n"
    ]
   }
  ],
@@ -207,7 +207,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.3"
+   "version": "3.10.4"
   }
  },
  "nbformat": 4,
......
""" """
NNI HPO Quickstart with PyTorch HPO Quickstart with PyTorch
=============================== ===========================
This tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning. This tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning.
There is also a :doc:`TensorFlow version<../hpo_quickstart_tensorflow/main>` if you prefer it.
The tutorial consists of 4 steps: The tutorial consists of 4 steps:
1. Modify the model for auto-tuning. 1. Modify the model for auto-tuning.
...@@ -113,16 +111,16 @@ experiment.config.tuner.class_args['optimize_mode'] = 'maximize' ...@@ -113,16 +111,16 @@ experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.max_trial_number = 10 experiment.config.max_trial_number = 10
experiment.config.trial_concurrency = 2 experiment.config.trial_concurrency = 2
# %% # %%
# You may also set ``max_experiment_duration = '1h'`` to limit running time.
#
# If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# the experiment will run forever until you press Ctrl-C.
#
# .. note:: # .. note::
# #
# ``max_trial_number`` is set to 10 here for a fast example. # ``max_trial_number`` is set to 10 here for a fast example.
# In real world it should be set to a larger number. # In real world it should be set to a larger number.
# With default config TPE tuner requires 20 trials to warm up. # With default config TPE tuner requires 20 trials to warm up.
#
# You may also set ``max_experiment_duration = '1h'`` to limit running time.
#
# If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# the experiment will run forever until you press Ctrl-C.
# %% # %%
# Step 4: Run the experiment # Step 4: Run the experiment
...@@ -154,4 +152,4 @@ experiment.stop() ...@@ -154,4 +152,4 @@ experiment.stop()
# #
# This example uses :doc:`Python API </reference/experiment>` to create experiment. # This example uses :doc:`Python API </reference/experiment>` to create experiment.
# #
# You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`. # You can also create and manage experiments with :doc:`command line tool <../hpo_nnictl/nnictl>`.