Unverified Commit f5b89bb6 authored by J-shang, committed by GitHub

Merge pull request #4776 from microsoft/v2.7

parents 7aa44612 1546962f
......@@ -5,10 +5,10 @@
Computation times
=================
**00:24.441** total execution time for **tutorials_hpo_quickstart_pytorch** files:
**01:24.367** total execution time for **tutorials_hpo_quickstart_pytorch** files:
+--------------------------------------------------------------------------+-----------+--------+
| :ref:`sphx_glr_tutorials_hpo_quickstart_pytorch_model.py` (``model.py``) | 00:24.441 | 0.0 MB |
| :ref:`sphx_glr_tutorials_hpo_quickstart_pytorch_main.py` (``main.py``) | 01:24.367 | 0.0 MB |
+--------------------------------------------------------------------------+-----------+--------+
| :ref:`sphx_glr_tutorials_hpo_quickstart_pytorch_main.py` (``main.py``) | 00:00.000 | 0.0 MB |
| :ref:`sphx_glr_tutorials_hpo_quickstart_pytorch_model.py` (``model.py``) | 00:00.000 | 0.0 MB |
+--------------------------------------------------------------------------+-----------+--------+
......@@ -15,7 +15,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# NNI HPO Quickstart with TensorFlow\nThis tutorial optimizes the model in `official TensorFlow quickstart`_ with auto-tuning.\n\nThe tutorial consists of 4 steps: \n\n1. Modify the model for auto-tuning.\n2. Define hyperparameters' search space.\n3. Configure the experiment.\n4. Run the experiment.\n\n"
"\n# HPO Quickstart with TensorFlow\nThis tutorial optimizes the model in `official TensorFlow quickstart`_ with auto-tuning.\n\nThe tutorial consists of 4 steps: \n\n1. Modify the model for auto-tuning.\n2. Define hyperparameters' search space.\n3. Configure the experiment.\n4. Run the experiment.\n\n"
]
},
{
......@@ -144,7 +144,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-info\"><h4>Note</h4><p>``max_trial_number`` is set to 10 here for a fast example.\n In real world it should be set to a larger number.\n With default config TPE tuner requires 20 trials to warm up.</p></div>\n\nYou may also set ``max_experiment_duration = '1h'`` to limit running time.\n\nIf neither ``max_trial_number`` nor ``max_experiment_duration`` are set,\nthe experiment will run forever until you press Ctrl-C.\n\n"
"You may also set ``max_experiment_duration = '1h'`` to limit running time.\n\nIf neither ``max_trial_number`` nor ``max_experiment_duration`` are set,\nthe experiment will run forever until you press Ctrl-C.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>``max_trial_number`` is set to 10 here for a fast example.\n In real world it should be set to a larger number.\n With default config TPE tuner requires 20 trials to warm up.</p></div>\n\n"
]
},
{
......@@ -187,7 +187,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
":meth:`nni.experiment.Experiment.stop` is automatically invoked when Python exits,\nso it can be omitted in your code.\n\nAfter the experiment is stopped, you can run :meth:`nni.experiment.Experiment.view` to restart web portal.\n\n.. tip::\n\n This example uses :doc:`Python API </reference/experiment>` to create experiment.\n\n You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`.\n\n"
":meth:`nni.experiment.Experiment.stop` is automatically invoked when Python exits,\nso it can be omitted in your code.\n\nAfter the experiment is stopped, you can run :meth:`nni.experiment.Experiment.view` to restart web portal.\n\n.. tip::\n\n This example uses :doc:`Python API </reference/experiment>` to create experiment.\n\n You can also create and manage experiments with :doc:`command line tool <../hpo_nnictl/nnictl>`.\n\n"
]
}
],
......@@ -207,7 +207,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.3"
"version": "3.10.4"
}
},
"nbformat": 4,
......
"""
NNI HPO Quickstart with TensorFlow
==================================
HPO Quickstart with TensorFlow
==============================
This tutorial optimizes the model in `official TensorFlow quickstart`_ with auto-tuning.
The tutorial consists of 4 steps:
......@@ -113,16 +113,16 @@ experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.max_trial_number = 10
experiment.config.trial_concurrency = 2
# %%
# You may also set ``max_experiment_duration = '1h'`` to limit running time.
#
# If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# the experiment will run forever until you press Ctrl-C.
#
# .. note::
#
# ``max_trial_number`` is set to 10 here for a fast example.
# In real world it should be set to a larger number.
# With default config TPE tuner requires 20 trials to warm up.
#
# You may also set ``max_experiment_duration = '1h'`` to limit running time.
#
# If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# the experiment will run forever until you press Ctrl-C.
# %%
# Step 4: Run the experiment
......@@ -154,4 +154,4 @@ experiment.stop()
#
# This example uses :doc:`Python API </reference/experiment>` to create experiment.
#
# You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`.
# You can also create and manage experiments with :doc:`command line tool <../hpo_nnictl/nnictl>`.
fe5546e4ae3f3dbf5e852af322dae15f
\ No newline at end of file
b8a9880a36233005ade7a8dae6d428a8
\ No newline at end of file
:orphan:
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
......@@ -19,8 +18,8 @@
.. _sphx_glr_tutorials_hpo_quickstart_tensorflow_main.py:
NNI HPO Quickstart with TensorFlow
==================================
HPO Quickstart with TensorFlow
==============================
This tutorial optimizes the model in `official TensorFlow quickstart`_ with auto-tuning.
The tutorial consists of 4 steps:
......@@ -213,17 +212,17 @@ Here we evaluate 10 sets of hyperparameters in total, and concurrently evaluate
.. GENERATED FROM PYTHON SOURCE LINES 116-126
You may also set ``max_experiment_duration = '1h'`` to limit running time.
If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
the experiment will run forever until you press Ctrl-C.
.. note::
``max_trial_number`` is set to 10 here for a fast example.
In real world it should be set to a larger number.
With default config TPE tuner requires 20 trials to warm up.
You may also set ``max_experiment_duration = '1h'`` to limit running time.
If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
the experiment will run forever until you press Ctrl-C.
.. GENERATED FROM PYTHON SOURCE LINES 128-133
Step 4: Run the experiment
......@@ -248,10 +247,10 @@ You can use the web portal to view experiment status: http://localhost:8080.
.. code-block:: none
[2022-03-20 21:12:19] Creating experiment, Experiment ID: 8raiuoyb
[2022-03-20 21:12:19] Starting web server...
[2022-03-20 21:12:20] Setting up...
[2022-03-20 21:12:20] Web portal URLs: http://127.0.0.1:8080 http://192.168.100.103:8080
[2022-04-13 12:11:34] Creating experiment, Experiment ID: enw27qxj
[2022-04-13 12:11:34] Starting web server...
[2022-04-13 12:11:35] Setting up...
[2022-04-13 12:11:35] Web portal URLs: http://127.0.0.1:8080 http://192.168.100.103:8080
True
......@@ -285,8 +284,8 @@ allowing you to view the web portal after the experiment is done.
.. code-block:: none
[2022-03-20 21:13:41] Stopping experiment, please wait...
[2022-03-20 21:13:44] Experiment stopped
[2022-04-13 12:12:55] Stopping experiment, please wait...
[2022-04-13 12:12:58] Experiment stopped
......@@ -302,12 +301,12 @@ After the experiment is stopped, you can run :meth:`nni.experiment.Experiment.vi
This example uses :doc:`Python API </reference/experiment>` to create experiment.
You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`.
You can also create and manage experiments with :doc:`command line tool <../hpo_nnictl/nnictl>`.
.. rst-class:: sphx-glr-timing
**Total running time of the script:** ( 1 minutes 24.257 seconds)
**Total running time of the script:** ( 1 minutes 24.384 seconds)
.. _sphx_glr_download_tutorials_hpo_quickstart_tensorflow_main.py:
......
......@@ -5,10 +5,10 @@
Computation times
=================
**02:27.156** total execution time for **tutorials_hpo_quickstart_tensorflow** files:
**01:24.384** total execution time for **tutorials_hpo_quickstart_tensorflow** files:
+-----------------------------------------------------------------------------+-----------+--------+
| :ref:`sphx_glr_tutorials_hpo_quickstart_tensorflow_model.py` (``model.py``) | 02:27.156 | 0.0 MB |
| :ref:`sphx_glr_tutorials_hpo_quickstart_tensorflow_main.py` (``main.py``) | 01:24.384 | 0.0 MB |
+-----------------------------------------------------------------------------+-----------+--------+
| :ref:`sphx_glr_tutorials_hpo_quickstart_tensorflow_main.py` (``main.py``) | 00:00.000 | 0.0 MB |
| :ref:`sphx_glr_tutorials_hpo_quickstart_tensorflow_model.py` (``model.py``) | 00:00.000 | 0.0 MB |
+-----------------------------------------------------------------------------+-----------+--------+
......@@ -189,12 +189,12 @@ Tutorials
.. raw:: html
<div class="sphx-glr-thumbcontainer" tooltip="There is also a TensorFlow version&lt;../hpo_quickstart_tensorflow/main&gt; if you prefer it.">
<div class="sphx-glr-thumbcontainer" tooltip="The tutorial consists of 4 steps: ">
.. only:: html
.. figure:: /tutorials/hpo_quickstart_pytorch/images/thumb/sphx_glr_main_thumb.png
:alt: NNI HPO Quickstart with PyTorch
:alt: HPO Quickstart with PyTorch
:ref:`sphx_glr_tutorials_hpo_quickstart_pytorch_main.py`
......@@ -246,7 +246,7 @@ Tutorials
.. only:: html
.. figure:: /tutorials/hpo_quickstart_tensorflow/images/thumb/sphx_glr_main_thumb.png
:alt: NNI HPO Quickstart with TensorFlow
:alt: HPO Quickstart with TensorFlow
:ref:`sphx_glr_tutorials_hpo_quickstart_tensorflow_main.py`
......
......@@ -5,10 +5,10 @@
Computation times
=================
**02:15.810** total execution time for **tutorials** files:
**02:04.499** total execution time for **tutorials** files:
+-----------------------------------------------------------------------------------------------------+-----------+--------+
| :ref:`sphx_glr_tutorials_hello_nas.py` (``hello_nas.py``) | 02:15.810 | 0.0 MB |
| :ref:`sphx_glr_tutorials_hello_nas.py` (``hello_nas.py``) | 02:04.499 | 0.0 MB |
+-----------------------------------------------------------------------------------------------------+-----------+--------+
| :ref:`sphx_glr_tutorials_nasbench_as_dataset.py` (``nasbench_as_dataset.py``) | 00:00.000 | 0.0 MB |
+-----------------------------------------------------------------------------------------------------+-----------+--------+
......
......@@ -12,18 +12,18 @@ $(document).ready(function() {
// the image links are stored in layout.html
// to leverage jinja engine
downloadNote.html(`
<a class="notebook-action-link" href="${colabLink}">
<div class="notebook-action-div">
<img src="${GALLERY_LINKS.colab}"/>
<div>Run in Google Colab</div>
</div>
</a>
<a class="notebook-action-link" href="${notebookLink}">
<div class="notebook-action-div">
<img src="${GALLERY_LINKS.notebook}"/>
<div>Download Notebook</div>
</div>
</a>
<a class="notebook-action-link" href="${colabLink}">
<div class="notebook-action-div">
<img src="${GALLERY_LINKS.colab}"/>
<div>Run in Google Colab</div>
</div>
</a>
<a class="notebook-action-link" href="${githubLink}">
<div class="notebook-action-div">
<img src="${GALLERY_LINKS.github}"/>
......
......@@ -78,7 +78,7 @@ for path in iterate_dir(Path('source')):
failed_files.append('(redundant) ' + source_path.as_posix())
if not pipeline_mode:
print(f'Deleting {source_path}')
source_path.unlink()
path.unlink()
if pipeline_mode and failed_files:
......
......@@ -354,11 +354,11 @@ def evaluate_model_with_visualization(model_cls):
for model_dict in exp.export_top_models(formatter='dict'):
print(model_dict)
# The output is `json` object which records the mutation actions of the top model.
# If users want to output source code of the top model, they can use graph-based execution engine for the experiment,
# %%
# The output is ``json`` object which records the mutation actions of the top model.
# If users want to output source code of the top model,
# they can use :ref:`graph-based execution engine <graph-based-execution-engine>` for the experiment,
# by simply adding the following two lines.
#
# .. code-block:: python
#
# exp_config.execution_engine = 'base'
# export_formatter = 'code'
exp_config.execution_engine = 'base'
export_formatter = 'code'
search_space:
  features:
    _type: choice
    _value: [ 128, 256, 512, 1024 ]
  lr:
    _type: loguniform
    _value: [ 0.0001, 0.1 ]
  momentum:
    _type: uniform
    _value: [ 0, 1 ]
trial_command: python model.py
trial_code_directory: .
trial_concurrency: 2
max_trial_number: 10
tuner:
  name: TPE
  class_args:
    optimize_mode: maximize
training_service:
  platform: local
"""
Port PyTorch Quickstart to NNI
==============================
This is a modified version of `PyTorch quickstart`_.
It can be run directly and will have the exact same result as the original version.
Furthermore, it enables auto-tuning with an NNI *experiment*, which will be detailed later.
It is recommended to run this script directly first to verify the environment.
There are 2 key differences from the original version:
1. In the `Get optimized hyperparameters`_ part, it receives generated hyperparameters.
2. In the `Train model and report accuracy`_ part, it reports accuracy metrics to NNI.
.. _PyTorch quickstart: https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
"""
# %%
import nni
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
# %%
# Hyperparameters to be tuned
# ---------------------------
# These are the hyperparameters that will be tuned.
params = {
    'features': 512,
    'lr': 0.001,
    'momentum': 0,
}
# %%
# Get optimized hyperparameters
# -----------------------------
# If run directly, :func:`nni.get_next_parameter` is a no-op and returns an empty dict.
# But with an NNI *experiment*, it will receive optimized hyperparameters from the tuning algorithm.
optimized_params = nni.get_next_parameter()
params.update(optimized_params)
print(params)
# %%
# Load dataset
# ------------
training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=ToTensor())
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)
# %%
# Build model with hyperparameters
# --------------------------------
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        # The tuned ``features`` hyperparameter controls the width of the hidden layers.
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, params['features']),
            nn.ReLU(),
            nn.Linear(params['features'], params['features']),
            nn.ReLU(),
            nn.Linear(params['features'], 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
model = NeuralNetwork().to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=params['lr'], momentum=params['momentum'])
# %%
# Define train and test
# ---------------------
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error.
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    # Accuracy is the metric reported to NNI.
    return correct
# %%
# Train model and report accuracy
# -------------------------------
# Report accuracy metrics to NNI so the tuning algorithm can suggest better hyperparameters.
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    accuracy = test(test_dataloader, model, loss_fn)
    # Report per-epoch accuracy so the tuner and web portal can track progress.
    nni.report_intermediate_result(accuracy)
# Report the final accuracy after the last epoch.
nni.report_final_result(accuracy)
Run HPO Experiment with nnictl
==============================
This tutorial has exactly the same effect as :doc:`PyTorch quickstart <../hpo_quickstart_pytorch/main>`.
Both tutorials optimize the model in `official PyTorch quickstart
<https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html>`__ with auto-tuning,
while this one manages the experiment with the command line tool and a YAML config file, instead of pure Python code.
The tutorial consists of 4 steps:
1. Modify the model for auto-tuning.
2. Define hyperparameters' search space.
3. Create config file.
4. Run the experiment.
The first two steps are identical to the quickstart.
Step 1: Prepare the model
-------------------------
In the first step, we need to prepare the model to be tuned.
The model should be put in a separate script.
It will be evaluated many times concurrently,
and may be trained on distributed platforms.
In this tutorial, the model is defined in :doc:`model.py <model>`.
In short, it is a PyTorch model with 3 additional API calls:
1. Use :func:`nni.get_next_parameter` to fetch the hyperparameters to be evaluated.
2. Use :func:`nni.report_intermediate_result` to report per-epoch accuracy metrics.
3. Use :func:`nni.report_final_result` to report final accuracy.
Please make sure you understand the model code before continuing to the next step; a minimal sketch of where these calls sit is shown below.
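This sketch is not the full ``model.py``; the placeholder accuracy value merely stands in for a real training and evaluation loop, but the three NNI calls are used exactly as the tutorial model uses them.
.. code-block:: python

    import nni

    # Default hyperparameters; values received from the tuner override them.
    params = {'features': 512, 'lr': 0.001, 'momentum': 0}
    params.update(nni.get_next_parameter())

    epochs = 5
    for epoch in range(epochs):
        accuracy = 0.1 * (epoch + 1)               # placeholder for real evaluation
        nni.report_intermediate_result(accuracy)   # per-epoch metric
    nni.report_final_result(accuracy)              # final metric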
Step 2: Define search space
---------------------------
In the model code, we have prepared 3 hyperparameters to be tuned:
*features*, *lr*, and *momentum*.
Here we need to define their *search space* so the tuning algorithm can sample them in the desired range.
Assuming we have the following prior knowledge for these hyperparameters:
1. *features* should be one of 128, 256, 512, 1024.
2. *lr* should be a float between 0.0001 and 0.1, sampled log-uniformly (i.e., uniform on a log scale).
3. *momentum* should be a float between 0 and 1.
In NNI, the space of *features* is called ``choice``;
the space of *lr* is called ``loguniform``;
and the space of *momentum* is called ``uniform``.
You may have noticed that these names are derived from ``numpy.random``.
For full specification of search space, check :doc:`the reference </hpo/search_space>`.
Now we can define the search space as follows:
.. code-block:: yaml
search_space:
  features:
    _type: choice
    _value: [ 128, 256, 512, 1024 ]
  lr:
    _type: loguniform
    _value: [ 0.0001, 0.1 ]
  momentum:
    _type: uniform
    _value: [ 0, 1 ]
Step 3: Configure the experiment
--------------------------------
NNI uses an *experiment* to manage the HPO process.
The *experiment config* defines how to train the models and how to explore the search space.
In this tutorial we use a YAML file ``config.yaml`` to define the experiment.
Configure trial code
^^^^^^^^^^^^^^^^^^^^
In NNI, the evaluation of each hyperparameter set is called a *trial*.
So the model script is called *trial code*.
.. code-block:: yaml
trial_command: python model.py
trial_code_directory: .
When ``trial_code_directory`` is a relative path, it is resolved relative to the config file.
So in this case we need to put ``config.yaml`` and ``model.py`` in the same directory.
.. attention::
The rules for resolving relative paths differ between the YAML config file and the :doc:`Python experiment API </reference/experiment>`.
In the Python experiment API, relative paths are resolved against the current working directory.
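For comparison, here is a minimal sketch of the Python-API counterpart; converting the script's own directory to an absolute path is only a defensive choice for illustration, not something the tutorial requires.
.. code-block:: python

    from pathlib import Path
    from nni.experiment import Experiment

    experiment = Experiment('local')
    experiment.config.trial_command = 'python model.py'
    # With the Python API, a relative path would be resolved against the
    # current working directory, so an absolute path is less surprising.
    experiment.config.trial_code_directory = str(Path(__file__).parent)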
Configure how many trials to run
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here we evaluate 10 sets of hyperparameters in total, and concurrently evaluate 2 sets at a time.
.. code-block:: yaml
max_trial_number: 10
trial_concurrency: 2
You may also set ``max_experiment_duration: 1h`` to limit running time.
If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
the experiment will run forever until you stop it.
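For example, assuming the snake_case field name used elsewhere in this v2 YAML config, both limits can be given together:
.. code-block:: yaml

    max_trial_number: 10
    max_experiment_duration: 1h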
.. note::
``max_trial_number`` is set to 10 here for a fast example.
In real-world use it should be set to a larger number.
With the default config, the TPE tuner requires 20 trials to warm up.
Configure tuning algorithm
^^^^^^^^^^^^^^^^^^^^^^^^^^
Here we use the :doc:`TPE tuner </hpo/tuners>`.
.. code-block:: yaml
tuner:
  name: TPE
  class_args:
    optimize_mode: maximize
Configure training service
^^^^^^^^^^^^^^^^^^^^^^^^^^
In this tutorial we use *local* mode,
which means models will be trained on the local machine, without using any special training platform.
.. code-block:: yaml
training_service:
  platform: local
Wrap up
^^^^^^^
The full content of ``config.yaml`` is as follows:
.. code-block:: yaml
search_space:
  features:
    _type: choice
    _value: [ 128, 256, 512, 1024 ]
  lr:
    _type: loguniform
    _value: [ 0.0001, 0.1 ]
  momentum:
    _type: uniform
    _value: [ 0, 1 ]
trial_command: python model.py
trial_code_directory: .
trial_concurrency: 2
max_trial_number: 10
tuner:
  name: TPE
  class_args:
    optimize_mode: maximize
training_service:
  platform: local
Step 4: Run the experiment
--------------------------
Now the experiment is ready. Launch it with the ``nnictl create`` command:
.. code-block:: bash
$ nnictl create --config config.yaml --port 8080
You can use the web portal to view experiment status: http://localhost:8080.
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
[2022-04-01 12:00:00] Creating experiment, Experiment ID: p43ny6ew
[2022-04-01 12:00:00] Starting web server...
[2022-04-01 12:00:01] Setting up...
[2022-04-01 12:00:01] Web portal URLs: http://127.0.0.1:8080 http://192.168.1.1:8080
[2022-04-01 12:00:01] To stop experiment run "nnictl stop p43ny6ew" or "nnictl stop --all"
[2022-04-01 12:00:01] Reference: https://nni.readthedocs.io/en/stable/reference/nnictl.html
When the experiment is done, use the ``nnictl stop`` command to stop it.
.. code-block:: bash
$ nnictl stop p43ny6ew
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
INFO: Stopping experiment p43ny6ew
INFO: Stop experiment success.
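If you want to look at the results again later, a stopped experiment can be reopened in the web portal with ``nnictl view`` (shown with the same example experiment ID):
.. code-block:: bash

    $ nnictl view p43ny6ew --port 8080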
"""
NNI HPO Quickstart with PyTorch
===============================
HPO Quickstart with PyTorch
===========================
This tutorial optimizes the model in `official PyTorch quickstart`_ with auto-tuning.
There is also a :doc:`TensorFlow version<../hpo_quickstart_tensorflow/main>` if you prefer it.
The tutorial consists of 4 steps:
1. Modify the model for auto-tuning.
......@@ -113,16 +111,16 @@ experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.max_trial_number = 10
experiment.config.trial_concurrency = 2
# %%
# You may also set ``max_experiment_duration = '1h'`` to limit running time.
#
# If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# the experiment will run forever until you press Ctrl-C.
#
# .. note::
#
# ``max_trial_number`` is set to 10 here for a fast example.
# In real world it should be set to a larger number.
# With default config TPE tuner requires 20 trials to warm up.
#
# You may also set ``max_experiment_duration = '1h'`` to limit running time.
#
# If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# the experiment will run forever until you press Ctrl-C.
# %%
# Step 4: Run the experiment
......@@ -154,4 +152,4 @@ experiment.stop()
#
# This example uses :doc:`Python API </reference/experiment>` to create experiment.
#
# You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`.
# You can also create and manage experiments with :doc:`command line tool <../hpo_nnictl/nnictl>`.
"""
NNI HPO Quickstart with TensorFlow
==================================
HPO Quickstart with TensorFlow
==============================
This tutorial optimizes the model in `official TensorFlow quickstart`_ with auto-tuning.
The tutorial consists of 4 steps:
......@@ -113,16 +113,16 @@ experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.max_trial_number = 10
experiment.config.trial_concurrency = 2
# %%
# You may also set ``max_experiment_duration = '1h'`` to limit running time.
#
# If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# the experiment will run forever until you press Ctrl-C.
#
# .. note::
#
# ``max_trial_number`` is set to 10 here for a fast example.
# In real world it should be set to a larger number.
# With default config TPE tuner requires 20 trials to warm up.
#
# You may also set ``max_experiment_duration = '1h'`` to limit running time.
#
# If neither ``max_trial_number`` nor ``max_experiment_duration`` are set,
# the experiment will run forever until you press Ctrl-C.
# %%
# Step 4: Run the experiment
......@@ -154,4 +154,4 @@ experiment.stop()
#
# This example uses :doc:`Python API </reference/experiment>` to create experiment.
#
# You can also create and manage experiments with :doc:`command line tool </reference/nnictl>`.
# You can also create and manage experiments with :doc:`command line tool <../hpo_nnictl/nnictl>`.
......@@ -46,6 +46,7 @@ class FrameworkControllerConfig(TrainingServiceConfig):
service_account_name: Optional[str]
task_roles: List[FrameworkControllerRoleConfig]
reuse_mode: Optional[bool] = True
namespace: str = 'default'
def _canonicalize(self, parents):
super()._canonicalize(parents)
......
......@@ -43,6 +43,7 @@ class KubeflowConfig(TrainingServiceConfig):
ps: Optional[KubeflowRoleConfig] = None
master: Optional[KubeflowRoleConfig] = None
reuse_mode: Optional[bool] = True #set reuse mode as true for v2 config
namespace: str = 'default'
def _canonicalize(self, parents):
super()._canonicalize(parents)
......
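A sketch of how the new ``namespace`` field might be set in a v2 YAML experiment config; the ``nni-experiments`` value is hypothetical, and the other required Kubeflow settings (operator, storage, worker roles, and so on) are omitted here:
.. code-block:: yaml

    training_service:
      platform: kubeflow
      namespace: nni-experiments   # hypothetical; falls back to "default" if omitted
      reuse_mode: true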
......@@ -90,6 +90,13 @@ class Cell(nn.Module):
(e.g., the next cell wants to have the outputs of both this cell and previous cell as its input).
By default, directly use this cell's output.
.. tip::
It's highly recommended to make the candidate operators have an output of the same shape as input.
This is because, there can be dynamic connections within cell. If there's shape change within operations,
the input shape of the subsequent operation becomes unknown.
In addition, the final concatenation could have shape mismatch issues.
Parameters
----------
op_candidates : list of module or function, or dict
......@@ -131,7 +138,7 @@ class Cell(nn.Module):
Choose between conv2d and maxpool2d.
The cell have 4 nodes, 1 op per node, and 2 predecessors.
>>> cell = nn.Cell([nn.Conv2d(32, 32, 3), nn.MaxPool2d(3)], 4, 1, 2)
>>> cell = nn.Cell([nn.Conv2d(32, 32, 3, padding=1), nn.MaxPool2d(3, padding=1)], 4, 1, 2)
In forward:
......@@ -169,7 +176,7 @@ class Cell(nn.Module):
Warnings
--------
:class:`Cell` is not supported in :ref:`graph-based execution engine <graph-based-exeuction-engine>`.
:class:`Cell` is not supported in :ref:`graph-based execution engine <graph-based-execution-engine>`.
Attributes
----------
......