**How to Debug in NNI**
===========================
Overview
--------
There are three parts of NNI that may produce logs: nnimanager, dispatcher and trial. Here we introduce them succinctly. For more information, please refer to `Overview <../Overview.rst>`__.

* **NNI controller**: NNI controller (``nnictl``) is the NNI command-line tool used to manage experiments (e.g., start an experiment).
* **nnimanager**: nnimanager is the core of NNI, whose log is important when the whole experiment fails (e.g., no webUI or the training service fails).
* **Dispatcher**\ : Dispatcher calls the methods of **Tuner** and **Assessor**. Logs of dispatcher are related to the tuner or assessor code.
* **Tuner**\ : Tuner is an AutoML algorithm, which generates a new configuration for the next try. A new trial will run with this configuration.
* **Assessor**\ : Assessor analyzes trial's intermediate results (e.g., periodically evaluated accuracy on test dataset) to tell whether this trial can be early stopped or not.
* **Trial**: Trial code is the code you write to run your experiment, which is an individual attempt at applying a new configuration (e.g., a set of hyperparameter values, a specific neural architecture).
Where is the log
----------------
There are three kinds of logs in NNI. When creating a new experiment, you can specify the log level as debug by adding ``--debug``. Besides, you can set a more detailed log level in your configuration file by using the
``logLevel`` keyword. Available logLevels are: ``trace``, ``debug``, ``info``, ``warning``, ``error``, ``fatal``.
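
For instance, in a YAML experiment configuration this might look like the fragment below (the surrounding key is illustrative, not required):

```yaml
# Fragment of an experiment configuration file (illustrative).
experimentName: example_mnist
logLevel: debug        # one of: trace, debug, info, warning, error, fatal
```
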
NNI controller
^^^^^^^^^^^^^^
All possible errors that happen when launching an NNI experiment can be found here.
You can use ``nnictl log stderr`` to find error information. For more options, please refer to `NNICTL <Nnictl.rst>`__.
Experiment Root Directory
^^^^^^^^^^^^^^^^^^^^^^^^^
Every experiment has a root folder, which is shown in the top-right corner of the webUI. You can also assemble it yourself, in case the webUI fails, by substituting your actual experiment ID for ``experiment_id`` in the path ``~/nni-experiments/experiment_id/``. The ``experiment_id`` is shown when you run ``nnictl create ...`` to create a new experiment.
For flexibility, we also offer a ``logDir`` option in your configuration, which specifies the directory to store all experiments (defaults to ``~/nni-experiments``). Please refer to `Configuration <ExperimentConfig.rst>`__ for more details.
Under that directory, there is another directory named ``log``\ , where ``nnimanager.log`` and ``dispatcher.log`` are placed.
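
As a small sketch (with a hypothetical experiment ID ``EXP123``), the two log files can be located like this:

```python
from pathlib import Path

# Hypothetical experiment ID; substitute the ID printed by `nnictl create`.
experiment_id = 'EXP123'
root = Path.home() / 'nni-experiments' / experiment_id

print(root / 'log' / 'nnimanager.log')
print(root / 'log' / 'dispatcher.log')
```
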
Trial Root Directory
^^^^^^^^^^^^^^^^^^^^
Usually in the webUI, you can click the ``+`` to the left of every trial to expand it and see the trial's log path.
Besides, there is another directory under the experiment root directory named ``trials``, which stores all the trials.
Every trial has a unique ID as its directory name. In this directory, a file named ``stderr`` records the trial's error output, and another file named ``trial.log`` records the trial's log.
Different kinds of errors
-------------------------
There are different kinds of errors, but they can be divided into three categories based on their severity, so when NNI fails, check each part sequentially.
Generally, if the webUI is started successfully, the ``Status`` in the ``Overview`` tab serves as a possible indicator of what kind of error happened. Otherwise you should check manually.
**NNI** Fails
^^^^^^^^^^^^^^^^^
This is the most serious error. When this happens, the whole experiment fails and no trial will be run. Usually this is related to an installation problem.
When this happens, you should check ``nnictl``'s error output file ``stderr`` (i.e., ``nnictl log stderr``) and then ``nnimanager``'s log to find whether there is any error.
**Dispatcher** Fails
^^^^^^^^^^^^^^^^^^^^^^^^
Usually, for new users of NNI, this means that the tuner fails. You can check the dispatcher's log to see what happened to your dispatcher. For builtin tuners, common errors include an invalid search space (an unsupported type of search space, or an inconsistency between the initialization args in the configuration file and the actual args of the tuner's ``__init__`` function).
Take the latter situation as an example. If you write a customized tuner whose ``__init__`` function has an argument called ``optimize_mode``, and you do not provide it in your configuration file, NNI will fail to run your tuner, so the experiment fails. You can see errors in the webUI like:

.. image:: ../../img/dispatcher_error.jpg
   :target: ../../img/dispatcher_error.jpg
   :alt:

Here we can see it is a dispatcher error. So we can check dispatcher's log, which might look like:

.. code-block:: bash

    [2019-02-19 19:36:45] DEBUG (nni.main/MainThread) START
    [2019-02-19 19:36:47] ERROR (nni.main/MainThread) __init__() missing 1 required positional arguments: 'optimize_mode'
    Traceback (most recent call last):
      File "/usr/lib/python3.7/site-packages/nni/__main__.py", line 202, in <module>
        main()
      File "/usr/lib/python3.7/site-packages/nni/__main__.py", line 164, in main
        args.tuner_args)
      File "/usr/lib/python3.7/site-packages/nni/__main__.py", line 81, in create_customized_class_instance
        instance = class_constructor(**class_args)
    TypeError: __init__() missing 1 required positional arguments: 'optimize_mode'.

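
This failure can be reproduced in plain Python: NNI instantiates the tuner with the ``classArgs`` dict from the configuration file, so a required ``__init__`` argument absent from ``classArgs`` raises exactly this ``TypeError``. The tuner class below is a hypothetical stand-in (a real one would subclass ``nni.tuner.Tuner``):

```python
# Hypothetical stand-in for a customized tuner; a real one would subclass nni.tuner.Tuner.
class MyCustomTuner:
    def __init__(self, optimize_mode):
        self.optimize_mode = optimize_mode

# NNI does roughly: instance = class_constructor(**class_args)
class_args = {}  # 'optimize_mode' was not provided in the configuration file
err_msg = ''
try:
    tuner = MyCustomTuner(**class_args)
except TypeError as err:
    err_msg = str(err)
print(err_msg)  # the same "missing 1 required positional argument" message as in the log
```
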
**Trial** Fails
^^^^^^^^^^^^^^^^^^^
In this situation, NNI can still run and create new trials.
It means your trial code (which is run by NNI) fails. This kind of error is strongly related to your trial code. Please check the trial's log to fix any errors shown there.
A common example would be running the MNIST example without installing TensorFlow: there is an ``ImportError`` (importing tensorflow in your trial code without having installed it), and thus every trial fails.

.. image:: ../../img/trial_error.jpg
   :target: ../../img/trial_error.jpg
   :alt:

As it shows, every trial has a log path, where you can find the trial's log and stderr.
In addition to experiment-level debugging, NNI also provides the capability to debug a single trial without starting the entire experiment. Refer to `standalone mode <../TrialExample/Trials.rst#standalone-mode-for-debugging>`__ for more information about debugging single trial code.
How to Launch an Experiment from Python
=======================================

.. toctree::
   :hidden:

   Start Usage <python_api_start>
   Connect Usage <python_api_connect>

Overview
--------
Since ``v2.0``, NNI provides a new way to launch experiments. Before that, you needed to configure the experiment in a YAML configuration file and then use the ``nnictl`` command to launch it. Now, you can also configure and run experiments directly in a Python file. If you are familiar with Python programming, this will undoubtedly bring you more convenience.
Run a New Experiment
--------------------
After successfully installing ``nni`` and preparing the `trial code <../TrialExample/Trials.rst>`__, you can start the experiment with a Python script in the following two steps.
Step 1 - Initialize an experiment instance and configure it
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    from nni.experiment import Experiment

    experiment = Experiment('local')

Now you have an ``Experiment`` instance, and this experiment will launch trials on your local machine because the training service is set to ``'local'``.
See all `training services <../training_services.rst>`__ supported in NNI.

.. code-block:: python

    experiment.config.experiment_name = 'MNIST example'
    experiment.config.trial_concurrency = 2
    experiment.config.max_trial_number = 10
    experiment.config.search_space = search_space
    experiment.config.trial_command = 'python3 mnist.py'
    experiment.config.trial_code_directory = Path(__file__).parent
    experiment.config.tuner.name = 'TPE'
    experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
    experiment.config.training_service.use_active_gpu = True

Use the form like ``experiment.config.foo = 'bar'`` to configure your experiment.
See all `builtin tuners <../builtin_tuner.rst>`__ supported in NNI.
See `configuration reference <../reference/experiment_config.rst>`__ for more detailed usage of these fields.
Step 2 - Just run
^^^^^^^^^^^^^^^^^

.. code-block:: python

    experiment.run(port=8080)

Now you have successfully launched an NNI experiment, and you can type ``localhost:8080`` in your browser to observe it in real time.
In this way, the experiment runs in the foreground and exits automatically when it finishes.

.. Note:: If you want to run an experiment interactively, use ``start()`` in Step 2. If you launch the experiment from a Python script, please use ``run()``, as ``start()`` is designed for interactive scenarios.
Example
^^^^^^^
Below is an example of this new launching approach. You can find this code in :githublink:`mnist-tfv2/launch.py <examples/trials/mnist-tfv2/launch.py>`.

.. code-block:: python

    from pathlib import Path
    from nni.experiment import Experiment

    search_space = {
        "dropout_rate": { "_type": "uniform", "_value": [0.5, 0.9] },
        "conv_size": { "_type": "choice", "_value": [2, 3, 5, 7] },
        "hidden_size": { "_type": "choice", "_value": [124, 512, 1024] },
        "batch_size": { "_type": "choice", "_value": [16, 32] },
        "learning_rate": { "_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1] }
    }

    experiment = Experiment('local')
    experiment.config.experiment_name = 'MNIST example'
    experiment.config.trial_concurrency = 2
    experiment.config.max_trial_number = 10
    experiment.config.search_space = search_space
    experiment.config.trial_command = 'python3 mnist.py'
    experiment.config.trial_code_directory = Path(__file__).parent
    experiment.config.tuner.name = 'TPE'
    experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
    experiment.config.training_service.use_active_gpu = True
    experiment.run(8080)

Start and Manage a New Experiment
---------------------------------
NNI migrates the APIs of ``NNI Client`` to this new launching approach. Launch the experiment with ``start()`` instead of ``run()``; then you can use these APIs in interactive mode.
Please refer to `example usage <./python_api_start.rst>`__ and code file :githublink:`python_api_start.ipynb <examples/trials/sklearn/classification/python_api_start.ipynb>`.
.. Note:: ``run()`` polls the experiment status and automatically calls ``stop()`` when the experiment finishes. ``start()`` just launches a new experiment, so you need to stop it manually by calling ``stop()``.
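
A minimal sketch of the interactive pattern (assuming the experiment is configured as in Step 1; this requires a working NNI installation, so treat it as an illustration rather than a runnable unit):

```python
from nni.experiment import Experiment

experiment = Experiment('local')
# ... configure experiment.config as in Step 1 ...

experiment.start(8080)   # returns immediately; the experiment keeps running
# ... interact with the running experiment here ...
experiment.stop()        # must be called manually when you are done
```
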
Connect and Manage an Existing Experiment
-----------------------------------------
If you launched an experiment with ``nnictl`` and also want to use these APIs, you can use ``Experiment.connect()`` to connect to the existing experiment.
Please refer to `example usage <./python_api_connect.rst>`__ and code file :githublink:`python_api_connect.ipynb <examples/trials/sklearn/classification/python_api_connect.ipynb>`.
.. Note:: You can use ``stop()`` to stop the experiment when connecting to an existing experiment.
Resume/View and Manage a Stopped Experiment
-------------------------------------------
You can use ``Experiment.resume()`` and ``Experiment.view()`` to resume and view a stopped experiment; these functions behave like ``nnictl resume`` and ``nnictl view``.
If you want to manage the experiment, set ``wait_completion`` to ``False`` and the functions will return an ``Experiment`` instance. For more parameters, please refer to the API reference.
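
A short sketch of that pattern (``EXP123`` is a hypothetical experiment ID; exact parameters may differ across NNI versions, so check the API reference below):

```python
from nni.experiment import Experiment

# Resume a stopped experiment and keep managing it from Python.
experiment = Experiment.resume('EXP123', wait_completion=False)
# ... inspect or manage the resumed experiment ...
experiment.stop()
```
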
API Reference
-------------
Detailed usage could be found `here <../reference/experiment_config.rst>`__.
* `Experiment`_
* `Experiment Config <#Experiment-Config>`_
* `Algorithm Config <#Algorithm-Config>`_
* `Training Service Config <#Training-Service-Config>`_
* `Local Config <#Local-Config>`_
* `Remote Config <#Remote-Config>`_
* `Openpai Config <#Openpai-Config>`_
* `AML Config <#AML-Config>`_
* `Shared Storage Config <#Shared-Storage-Config>`_
Experiment
^^^^^^^^^^

.. autoclass:: nni.experiment.Experiment
   :members:

Experiment Config
^^^^^^^^^^^^^^^^^
.. autoattribute:: nni.experiment.config.ExperimentConfig.experiment_name
.. autoattribute:: nni.experiment.config.ExperimentConfig.search_space_file
.. autoattribute:: nni.experiment.config.ExperimentConfig.search_space
.. autoattribute:: nni.experiment.config.ExperimentConfig.trial_command
.. autoattribute:: nni.experiment.config.ExperimentConfig.trial_code_directory
.. autoattribute:: nni.experiment.config.ExperimentConfig.trial_concurrency
.. autoattribute:: nni.experiment.config.ExperimentConfig.trial_gpu_number
.. autoattribute:: nni.experiment.config.ExperimentConfig.max_experiment_duration
.. autoattribute:: nni.experiment.config.ExperimentConfig.max_trial_number
.. autoattribute:: nni.experiment.config.ExperimentConfig.nni_manager_ip
.. autoattribute:: nni.experiment.config.ExperimentConfig.use_annotation
.. autoattribute:: nni.experiment.config.ExperimentConfig.debug
.. autoattribute:: nni.experiment.config.ExperimentConfig.log_level
.. autoattribute:: nni.experiment.config.ExperimentConfig.experiment_working_directory
.. autoattribute:: nni.experiment.config.ExperimentConfig.tuner_gpu_indices
.. autoattribute:: nni.experiment.config.ExperimentConfig.tuner
.. autoattribute:: nni.experiment.config.ExperimentConfig.assessor
.. autoattribute:: nni.experiment.config.ExperimentConfig.advisor
.. autoattribute:: nni.experiment.config.ExperimentConfig.training_service
.. autoattribute:: nni.experiment.config.ExperimentConfig.shared_storage
Algorithm Config
^^^^^^^^^^^^^^^^
.. autoattribute:: nni.experiment.config.AlgorithmConfig.name
.. autoattribute:: nni.experiment.config.AlgorithmConfig.class_args
.. autoattribute:: nni.experiment.config.CustomAlgorithmConfig.class_name
.. autoattribute:: nni.experiment.config.CustomAlgorithmConfig.code_directory
.. autoattribute:: nni.experiment.config.CustomAlgorithmConfig.class_args
Training Service Config
^^^^^^^^^^^^^^^^^^^^^^^
Local Config
************
.. autoattribute:: nni.experiment.config.LocalConfig.platform
.. autoattribute:: nni.experiment.config.LocalConfig.use_active_gpu
.. autoattribute:: nni.experiment.config.LocalConfig.max_trial_number_per_gpu
.. autoattribute:: nni.experiment.config.LocalConfig.gpu_indices
Remote Config
*************
.. autoattribute:: nni.experiment.config.RemoteConfig.platform
.. autoattribute:: nni.experiment.config.RemoteConfig.reuse_mode
.. autoattribute:: nni.experiment.config.RemoteConfig.machine_list
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.host
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.port
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.user
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.password
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.ssh_key_file
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.ssh_passphrase
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.use_active_gpu
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.max_trial_number_per_gpu
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.gpu_indices
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.python_path
Openpai Config
**************
.. autoattribute:: nni.experiment.config.OpenpaiConfig.platform
.. autoattribute:: nni.experiment.config.OpenpaiConfig.host
.. autoattribute:: nni.experiment.config.OpenpaiConfig.username
.. autoattribute:: nni.experiment.config.OpenpaiConfig.token
.. autoattribute:: nni.experiment.config.OpenpaiConfig.trial_cpu_number
.. autoattribute:: nni.experiment.config.OpenpaiConfig.trial_memory_size
.. autoattribute:: nni.experiment.config.OpenpaiConfig.storage_config_name
.. autoattribute:: nni.experiment.config.OpenpaiConfig.docker_image
.. autoattribute:: nni.experiment.config.OpenpaiConfig.local_storage_mount_point
.. autoattribute:: nni.experiment.config.OpenpaiConfig.container_storage_mount_point
.. autoattribute:: nni.experiment.config.OpenpaiConfig.reuse_mode
.. autoattribute:: nni.experiment.config.OpenpaiConfig.openpai_config
.. autoattribute:: nni.experiment.config.OpenpaiConfig.openpai_config_file
AML Config
**********
.. autoattribute:: nni.experiment.config.AmlConfig.platform
.. autoattribute:: nni.experiment.config.AmlConfig.subscription_id
.. autoattribute:: nni.experiment.config.AmlConfig.resource_group
.. autoattribute:: nni.experiment.config.AmlConfig.workspace_name
.. autoattribute:: nni.experiment.config.AmlConfig.compute_target
.. autoattribute:: nni.experiment.config.AmlConfig.docker_image
.. autoattribute:: nni.experiment.config.AmlConfig.max_trial_number_per_gpu
Shared Storage Config
^^^^^^^^^^^^^^^^^^^^^
Nfs Config
**********
.. autoattribute:: nni.experiment.config.NfsConfig.storage_type
.. autoattribute:: nni.experiment.config.NfsConfig.nfs_server
.. autoattribute:: nni.experiment.config.NfsConfig.exported_directory
Azure Blob Config
*****************
.. autoattribute:: nni.experiment.config.AzureBlobConfig.storage_type
.. autoattribute:: nni.experiment.config.AzureBlobConfig.storage_account_name
.. autoattribute:: nni.experiment.config.AzureBlobConfig.storage_account_key
.. autoattribute:: nni.experiment.config.AzureBlobConfig.container_name
**How to Use Docker in NNI**
================================
Overview
--------
`Docker <https://www.docker.com/>`__ is a tool that makes it easier to deploy and run applications by starting containers. Docker is not a virtual machine: it does not create a virtual operating system, but lets different applications use the same OS kernel while isolating them from each other in containers.
Users can start NNI experiments using Docker. NNI also provides an official Docker image `msranni/nni <https://hub.docker.com/r/msranni/nni>`__ on Docker Hub.
Using Docker on your local machine
----------------------------------
Step 1: Installation of Docker
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Before you start using Docker for NNI experiments, you should install Docker on your local machine. `See here <https://docs.docker.com/install/linux/docker-ce/ubuntu/>`__.
Step 2: Start a Docker container
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you have installed Docker on your local machine, you can start a Docker container instance to run NNI examples. Note that because NNI starts a webUI process in the container that keeps listening on a port, you need to specify a port mapping between your host machine and the Docker container to make the webUI accessible from outside the container. By visiting the host IP address and port, you are redirected to the webUI process started in the Docker container and can visit the webUI content.
For example, you could start a new Docker container from the following command:

.. code-block:: bash

    docker run -i -t -p [hostPort]:[containerPort] [image]

``-i:`` Start the container in interactive mode.
``-t:`` Allocate a pseudo-terminal for the container.
``-p:`` Port mapping; map a host port to a container port.
For more information about Docker commands, please `refer to this <https://docs.docker.com/engine/reference/run/>`__.
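
For example, to run NNI's official image and expose the webUI's default port 8080 on the host (the port choice is illustrative):

```shell
# Map host port 8080 to container port 8080 so the webUI is reachable at http://<host-ip>:8080
docker run -i -t -p 8080:8080 msranni/nni
```
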
Note:

.. code-block:: text

    NNI only supports Ubuntu and macOS in local mode for the moment; please use the correct Docker image type. If you want to use a GPU in a Docker container, please use nvidia-docker.

Step 3: Run NNI in a Docker container
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you start a container from NNI's official image ``msranni/nni``, you can directly start NNI experiments using the ``nnictl`` command. The official image has NNI's running environment and basic Python and deep learning frameworks preinstalled.
If you start your own Docker image, you may need to install the NNI package first; please refer to `NNI installation <InstallationLinux.rst>`__.
If you want to run NNI's official examples, you may need to clone the NNI repo in GitHub using

.. code-block:: bash

    git clone https://github.com/Microsoft/nni.git

then you can enter ``nni/examples/trials`` to start an experiment.
After you prepare NNI's environment, you can start a new experiment using the ``nnictl`` command. `See here <QuickStart.rst>`__.
Using Docker on a remote platform
---------------------------------
NNI supports starting experiments in `remoteTrainingService <../TrainingService/RemoteMachineMode.rst>`__\ , and running trial jobs on remote machines. As Docker can start an independent Ubuntu system as an SSH server, a Docker container can be used as the remote machine in NNI's remote mode.
Step 1: Setting a Docker environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You should install the Docker software on your remote machine first, please `refer to this <https://docs.docker.com/install/linux/docker-ce/ubuntu/>`__.
To make sure NNI experiments can connect to your Docker container, you should build your own Docker image with an SSH server set up, or use an image with an SSH configuration. If you want to use a Docker container as an SSH server, you should configure SSH password login or private key login; please `refer to this <https://docs.docker.com/engine/examples/running_ssh_service/>`__.
Note:

.. code-block:: text

    NNI's official image msranni/nni does not support SSH servers for the time being; you should build your own Docker image with an SSH configuration or use other images as a remote server.

Step 2: Start a Docker container on a remote machine
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
An SSH server needs a port, and you need to expose the Docker container's SSH port to NNI as the connection port. For example, if you set your container's SSH port to ``A``, you should map the container's port ``A`` to another port ``B`` on your remote host machine. NNI will connect to port ``B`` as the SSH port, the host machine will forward the connection from port ``B`` to port ``A``, and NNI can then connect to your Docker container.
For example, you could start your Docker container using the following commands:

.. code-block:: bash

    docker run -dit -p [hostPort]:[containerPort] [image]

``containerPort`` is the SSH port used in your Docker container, and ``hostPort`` is the port on your host machine exposed to NNI. You can set your NNI config file to connect to ``hostPort``, and the connection will be forwarded to your Docker container.
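
As a concrete sketch, with the container's SSH server on the standard port 22 (``A``) and host port 2222 (``B``); the image name is hypothetical:

```shell
# Host port 2222 -> container port 22; NNI's machineList then connects to port 2222.
docker run -dit -p 2222:22 my-ssh-enabled-image
```
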
For more information about Docker commands, please `refer to this <https://docs.docker.com/v17.09/edge/engine/reference/run/>`__.
Note:

.. code-block:: text

    If you use your own Docker image as a remote server, please make sure that the image has a basic Python environment and an NNI SDK runtime environment. If you want to use a GPU in a Docker container, please use nvidia-docker.

Step 3: Run NNI experiments
^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can set your config file as a remote platform and set the ``machineList`` configuration to connect to your Docker SSH server; `refer to this <../TrainingService/RemoteMachineMode.rst>`__. Note that you should set the correct ``port``\ , ``username``\ , and ``passWd`` or ``sshKeyPath`` of your host machine.
``port:`` The host machine's port, mapping to Docker's SSH port.
``username:`` The username of the Docker container.
``passWd:`` The password of the Docker container.
``sshKeyPath:`` The path of the private key of the Docker container.
After configuring the config file, you can start an experiment; `refer to this <QuickStart.rst>`__.
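
A sketch of the relevant fragment of a YAML config file; the IP and credentials are placeholders, the port is the host port mapped to the container's SSH port, and the exact key spellings should follow the remote machine mode reference above:

```yaml
trainingServicePlatform: remote
machineList:
  - ip: ${remote_host_ip}
    port: 2222              # host port mapped to the container's SSH port
    username: ${container_user}
    passwd: ${container_password}
```
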
**How to Use Shared Storage**
=============================
If you want to use your own storage while using NNI, shared storage can satisfy this need.
Compared with the training service's native storage, shared storage brings you more convenience.
All the information generated by the experiment is stored under the ``/nni`` folder in your shared storage.
All the output produced by a trial is located under the ``/nni/{EXPERIMENT_ID}/trials/{TRIAL_ID}/nnioutput`` folder in your shared storage.
This saves you from searching for experiment-related information in various places.
Remember that your trial working directory is ``/nni/{EXPERIMENT_ID}/trials/{TRIAL_ID}``, so if you upload your data to this shared storage, you can open it like a local file in your trial code without downloading it.
We will develop more practical features based on shared storage in the future. The config reference can be found `here <../reference/experiment_config.html#sharedstorageconfig>`_.
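
As a small sketch of this path layout (``EXP123`` and ``T456`` are hypothetical IDs): a dataset uploaded to ``/nni/data`` on the shared storage sits three levels above the trial working directory, so trial code can reach it with a relative path:

```python
import os

# Trial working directory on the shared storage (hypothetical IDs).
trial_dir = '/nni/EXP123/trials/T456'

# A dataset uploaded to /nni/data on the same shared storage is three
# levels above the trial working directory.
dataset = os.path.normpath(os.path.join(trial_dir, '../../../data/train.csv'))
print(dataset)  # /nni/data/train.csv
```
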

.. note::

    Shared storage is currently in the experimental stage. We suggest using AzureBlob under Ubuntu/CentOS/RHEL, and NFS under Ubuntu/CentOS/RHEL/Fedora/Debian, for remote.
    Also make sure your local machine can mount NFS or fuse AzureBlob, and that the machine used in the training service has ``sudo`` permission without a password. We only support shared storage with training services in reuse mode for now.


.. note::

    What is the difference between training service native storage and shared storage? Training service native storage is usually provided by the specific training service,
    e.g., the local storage on the remote machine in remote mode, or the provided storage in openpai mode. These storages might not be easy to use; for example, users may have to upload datasets to all remote machines to train the model.
    In these cases, shared storage can be automatically mounted to the machines in the training platform, so users can directly save and load data from the shared storage, and all the data and logs used or generated in one experiment are kept in the same place.
    After the experiment finishes, the shared storage is automatically unmounted from the training platform.

Example
-------
If you want to use AzureBlob, add the snippet below to your config. For the full config file, see :githublink:`mnist-sharedstorage/config_azureblob.yml <examples/trials/mnist-sharedstorage/config_azureblob.yml>`.

.. code-block:: yaml

    sharedStorage:
      storageType: AzureBlob
      # Set localMountPoint to an absolute path outside the code directory,
      # because NNI will copy user code to localMountPoint.
      localMountPoint: ${your/local/mount/point}
      # remoteMountPoint is the mount point on the training service machine; it can be an absolute or relative path.
      # Make sure you have `sudo` permission without a password on the training service machine.
      remoteMountPoint: ${your/remote/mount/point}
      storageAccountName: ${replace_to_your_storageAccountName}
      storageAccountKey: ${replace_to_your_storageAccountKey}
      containerName: ${replace_to_your_containerName}
      # usermount means you have already mounted this storage on localMountPoint.
      # nnimount means NNI will try to mount this storage on localMountPoint.
      # nomount means the storage will not be mounted on the local machine (partial storages will be supported in the future).
      localMounted: nnimount

You can find ``storageAccountName``, ``storageAccountKey``, ``containerName`` on azure storage account portal.

.. image:: ../../img/azure_storage.png
   :target: ../../img/azure_storage.png
   :alt:

If you want to use NFS, add the snippet below to your config. For the full config file, see :githublink:`mnist-sharedstorage/config_nfs.yml <examples/trials/mnist-sharedstorage/config_nfs.yml>`.

.. code-block:: yaml

    sharedStorage:
      storageType: NFS
      localMountPoint: ${your/local/mount/point}
      remoteMountPoint: ${your/remote/mount/point}
      nfsServer: ${nfs-server-ip}
      exportedDirectory: ${nfs/exported/directory}
      # usermount means you have already mounted this storage on localMountPoint.
      # nnimount means NNI will try to mount this storage on localMountPoint.
      # nomount means the storage will not be mounted on the local machine (partial storages will be supported in the future).
      localMounted: nnimount

**How to register customized algorithms as builtin tuners, assessors and advisors**
=======================================================================================
.. contents::
Overview
--------
NNI provides many `builtin tuners <../Tuner/BuiltinTuner.rst>`_, `advisors <../Tuner/HyperbandAdvisor.rst>`__ and `assessors <../Assessor/BuiltinAssessor.rst>`__ that can be used directly for hyperparameter optimization, and extra algorithms can be registered via ``nnictl algo register --meta <path_to_meta_file>`` after NNI is installed. You can check the builtin algorithms via the ``nnictl algo list`` command.
NNI also provides the ability to build your own customized tuners, advisors and assessors. To use a customized algorithm, users can simply follow the spec in the experiment config file to properly reference the algorithm, as illustrated in the tutorials on `customized tuners <../Tuner/CustomizeTuner.rst>`_ / `advisors <../Tuner/CustomizeAdvisor.rst>`__ / `assessors <../Assessor/CustomizeAssessor.rst>`__.
NNI also allows users to install a customized algorithm as a builtin algorithm, so that it can be used the same way as NNI's builtin tuners/advisors/assessors. More importantly, this makes it much easier to share or distribute an implemented algorithm to others. Once customized tuners/advisors/assessors are installed into NNI as builtin algorithms, you can use them in your experiment configuration file the same way as builtin ones. For example, if you built a customized tuner and installed it into NNI under the builtin name ``mytuner``, you can use it in your configuration file like below:

.. code-block:: yaml

    tuner:
      builtinTunerName: mytuner

Register customized algorithms as builtin tuners, assessors and advisors
------------------------------------------------------------------------
You can follow the steps below to build a customized tuner/assessor/advisor and register it into NNI as a builtin algorithm.
1. Create a customized tuner/assessor/advisor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Refer to the following instructions to create one:
* `customized tuner <../Tuner/CustomizeTuner.rst>`_
* `customized assessor <../Assessor/CustomizeAssessor.rst>`_
* `customized advisor <../Tuner/CustomizeAdvisor.rst>`_
2. (Optional) Create a validator to validate classArgs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NNI provides a ``ClassArgsValidator`` interface for authors of customized algorithms to validate the classArgs parameters that are passed from the experiment configuration file to the algorithm's constructor.
The ``ClassArgsValidator`` interface is defined as:

.. code-block:: python

    class ClassArgsValidator(object):
        def validate_class_args(self, **kwargs):
            """
            The classArgs fields in experiment configuration are packed as a dict and
            passed to validator as kwargs.
            """
            pass

For example, you can implement your validator like this:

.. code-block:: python

    from schema import Schema, Optional

    from nni import ClassArgsValidator

    class MedianstopClassArgsValidator(ClassArgsValidator):
        def validate_class_args(self, **kwargs):
            Schema({
                Optional('optimize_mode'): self.choices('optimize_mode', 'maximize', 'minimize'),
                Optional('start_step'): self.range('start_step', int, 0, 9999),
            }).validate(kwargs)

The validator is invoked before the experiment starts, to check whether the classArgs fields are valid for your customized algorithm.
3. Install your customized algorithms into python environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
First, the customized algorithm needs to be prepared as a Python package. Then you can install the package into your Python environment in one of two ways:

* Run ``python setup.py develop`` from the package directory. This installs the package in development mode and is recommended while your algorithm is still under development.
* Run ``python setup.py bdist_wheel`` from the package directory. This builds a whl file that serves as a pip installation source; then run ``pip install <wheel file>`` to install it.
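
A minimal ``setup.py`` sketch for such a package (the name and metadata are hypothetical):

```python
# setup.py for a hypothetical customized tuner package.
from setuptools import setup, find_packages

setup(
    name='demo-tuner',
    version='0.1.0',
    packages=find_packages(),
    install_requires=['nni'],   # the tuner depends on the NNI SDK
)
```
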
4. Prepare meta file
^^^^^^^^^^^^^^^^^^^^
Create a yaml file with the following keys as the meta file:

* ``algoType``: the type of the algorithm; one of ``tuner``, ``assessor``, ``advisor``
* ``builtinName``: the builtin name used in the experiment configuration file
* ``className``: the tuner class name, including its module name, for example: ``demo_tuner.DemoTuner``
* ``classArgsValidator``: the class args validator class name, including its module name, for example: ``demo_tuner.MyClassArgsValidator``
Following is an example of the yaml file:
.. code-block:: yaml
algoType: tuner
builtinName: demotuner
className: demo_tuner.DemoTuner
classArgsValidator: demo_tuner.MyClassArgsValidator
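Before registering, it can help to sanity-check the meta file fields. A hypothetical standalone checker (not part of NNI) for the keys described above might look like:

```python
# Hypothetical helper (not part of NNI) that checks a parsed meta dict
# carries the fields described above.
REQUIRED_KEYS = {'algoType', 'builtinName', 'className'}
VALID_ALGO_TYPES = {'tuner', 'assessor', 'advisor'}

def check_meta(meta):
    missing = REQUIRED_KEYS - meta.keys()
    if missing:
        raise ValueError(f'missing keys: {sorted(missing)}')
    if meta['algoType'] not in VALID_ALGO_TYPES:
        raise ValueError(f'algoType must be one of {sorted(VALID_ALGO_TYPES)}')
    return True

# the example meta file above, as a parsed dict
meta = {
    'algoType': 'tuner',
    'builtinName': 'demotuner',
    'className': 'demo_tuner.DemoTuner',
    'classArgsValidator': 'demo_tuner.MyClassArgsValidator',
}
check_meta(meta)  # returns True
```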
5. Register customized algorithms into NNI
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Run the following command to register the customized algorithm as a builtin algorithm in NNI:
.. code-block:: bash
nnictl algo register --meta <path_to_meta_file>
The ``<path_to_meta_file>`` is the path to the YAML file you created in the section above.
See the `customized tuner example <#example-register-a-customized-tuner-as-a-builtin-tuner>`_ for a full example.
Use the installed builtin algorithms in experiment
--------------------------------------------------
Once your customized algorithm is installed, you can use it in the experiment configuration file the same way as other builtin tuners/assessors/advisors, for example:
.. code-block:: yaml
tuner:
builtinTunerName: demotuner
classArgs:
#choice: maximize, minimize
optimize_mode: maximize
Manage builtin algorithms using ``nnictl algo``
-----------------------------------------------
List builtin algorithms
^^^^^^^^^^^^^^^^^^^^^^^
Run the following command to list the registered builtin algorithms:
.. code-block:: bash
nnictl algo list
+-----------------+------------+-----------+----------------------+------------------------------------------+
| Name | Type | Source | Class Name | Module Name |
+-----------------+------------+-----------+----------------------+------------------------------------------+
| TPE | tuners | nni | HyperoptTuner | nni.hyperopt_tuner.hyperopt_tuner |
| Random | tuners | nni | HyperoptTuner | nni.hyperopt_tuner.hyperopt_tuner |
| Anneal | tuners | nni | HyperoptTuner | nni.hyperopt_tuner.hyperopt_tuner |
| Evolution | tuners | nni | EvolutionTuner | nni.evolution_tuner.evolution_tuner |
| BatchTuner | tuners | nni | BatchTuner | nni.batch_tuner.batch_tuner |
| GridSearch | tuners | nni | GridSearchTuner | nni.gridsearch_tuner.gridsearch_tuner |
| NetworkMorphism | tuners | nni | NetworkMorphismTuner | nni.networkmorphism_tuner.networkmo... |
| MetisTuner | tuners | nni | MetisTuner | nni.metis_tuner.metis_tuner |
| GPTuner | tuners | nni | GPTuner | nni.gp_tuner.gp_tuner |
| PBTTuner | tuners | nni | PBTTuner | nni.pbt_tuner.pbt_tuner |
| SMAC | tuners | nni | SMACTuner | nni.smac_tuner.smac_tuner |
| PPOTuner | tuners | nni | PPOTuner | nni.ppo_tuner.ppo_tuner |
| Medianstop | assessors | nni | MedianstopAssessor | nni.medianstop_assessor.medianstop_... |
| Curvefitting | assessors | nni | CurvefittingAssessor | nni.curvefitting_assessor.curvefitt... |
| Hyperband | advisors | nni | Hyperband | nni.hyperband_advisor.hyperband_adv... |
| BOHB | advisors | nni | BOHB | nni.bohb_advisor.bohb_advisor |
+-----------------+------------+-----------+----------------------+------------------------------------------+
Unregister builtin algorithms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Run the following command to unregister an installed algorithm:
``nnictl algo unregister <builtin name>``
For example:
``nnictl algo unregister demotuner``
Porting customized algorithms from v1.x to v2.x
-----------------------------------------------
All you need to do is delete the ``NNI Package :: tuner`` metadata in ``setup.py`` and add a meta file as described in `4. Prepare meta file`_. Then you can follow `Register customized algorithms as builtin tuners, assessors and advisors`_ to register your customized algorithm.
Example: Register a customized tuner as a builtin tuner
-------------------------------------------------------
You can follow the steps below to register the customized tuner in ``nni/examples/tuners/customized_tuner`` as a builtin tuner.
Install the customized tuner package into python environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are two options for installing the package into the Python environment:
Option 1: install from directory
""""""""""""""""""""""""""""""""
From ``nni/examples/tuners/customized_tuner`` directory, run:
``python setup.py develop``
This command installs the ``nni/examples/tuners/customized_tuner`` directory in development mode.
Option 2: install from whl file
"""""""""""""""""""""""""""""""
Step 1: From ``nni/examples/tuners/customized_tuner`` directory, run:
``python setup.py bdist_wheel``
This command builds a whl file, which is a pip installation source.
Step 2: Run command:
``pip install dist/demo_tuner-0.1-py3-none-any.whl``
Register the customized tuner as a builtin tuner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Run the following command:
``nnictl algo register --meta meta_file.yml``
Check the registered builtin algorithms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Then run ``nnictl algo list``; you should see that demotuner is registered:
.. code-block:: bash
+-----------------+------------+-----------+----------------------+------------------------------------------+
| Name | Type | Source | Class Name | Module Name |
+-----------------+------------+-----------+----------------------+------------------------------------------+
| demotuner | tuners | User | DemoTuner | demo_tuner |
+-----------------+------------+-----------+----------------------+------------------------------------------+
Install on Linux & Mac
======================
Installation
------------
Installation on Linux and macOS follows the same instructions, given below.
Install NNI through pip
^^^^^^^^^^^^^^^^^^^^^^^
Prerequisite: ``python 64-bit >= 3.6``
.. code-block:: bash
python3 -m pip install --upgrade nni
Install NNI through source code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you are interested in a specific or the latest version of the code, you can install NNI from source code.
Prerequisites: ``python 64-bit >=3.6``, ``git``
.. code-block:: bash
git clone -b v2.6 https://github.com/Microsoft/nni.git
cd nni
python3 -m pip install -U -r dependencies/setup.txt
python3 -m pip install -r dependencies/develop.txt
python3 setup.py develop
Build wheel package from NNI source code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The previous section shows how to install NNI in `development mode <https://setuptools.readthedocs.io/en/latest/userguide/development_mode.html>`__.
If you want a persistent install instead, we recommend building your own wheel package and installing NNI from the wheel.
.. code-block:: bash
git clone -b v2.6 https://github.com/Microsoft/nni.git
cd nni
export NNI_RELEASE=2.0
python3 -m pip install -U -r dependencies/setup.txt
python3 -m pip install -r dependencies/develop.txt
python3 setup.py clean --all
python3 setup.py build_ts
python3 setup.py bdist_wheel -p manylinux1_x86_64
python3 -m pip install dist/nni-2.0-py3-none-manylinux1_x86_64.whl
Use NNI in a docker image
^^^^^^^^^^^^^^^^^^^^^^^^^
You can also install NNI in a docker image. Please follow the instructions `here <../Tutorial/HowToUseDocker.rst>`__ to build an NNI docker image. The NNI docker image can also be retrieved from Docker Hub through the command ``docker pull msranni/nni:latest``.
Verify installation
-------------------
*
Download the examples via cloning the source code.
.. code-block:: bash
git clone -b v2.6 https://github.com/Microsoft/nni.git
*
Run the MNIST example.
.. code-block:: bash
nnictl create --config nni/examples/trials/mnist-pytorch/config.yml
*
Wait for the message ``INFO: Successfully started experiment!`` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the ``Web UI url``.
.. code-block:: text
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
INFO: Successfully set local config!
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: http://223.255.255.1:8080 http://127.0.0.1:8080
-----------------------------------------------------------------------
You can use these commands to get more information about the experiment
-----------------------------------------------------------------------
commands description
1. nnictl experiment show show the information of experiments
2. nnictl trial ls list all of trial jobs
3. nnictl top monitor the status of running experiments
4. nnictl log stderr show stderr log content
5. nnictl log stdout show stdout log content
6. nnictl stop stop an experiment
7. nnictl trial kill kill a trial job by id
8. nnictl --help get help information about nnictl
-----------------------------------------------------------------------
* Open the ``Web UI url`` in your browser, you can view detailed information about the experiment and all the submitted trial jobs as shown below. `Here <../Tutorial/WebUI.rst>`__ are more Web UI pages.
.. image:: ../../img/webui_overview_page.png
:target: ../../img/webui_overview_page.png
:alt: overview
.. image:: ../../img/webui_trialdetail_page.png
:target: ../../img/webui_trialdetail_page.png
:alt: detail
System requirements
-------------------
Due to potential programming changes, the minimum system requirements of NNI may change over time.
Linux
^^^^^
.. list-table::
:header-rows: 1
:widths: auto
* -
- Recommended
- Minimum
* - **Operating System**
- Ubuntu 16.04 or above
-
* - **CPU**
- Intel® Core™ i5 or AMD Phenom™ II X3 or better
- Intel® Core™ i3 or AMD Phenom™ X3 8650
* - **GPU**
- NVIDIA® GeForce® GTX 660 or better
- NVIDIA® GeForce® GTX 460
* - **Memory**
- 6 GB RAM
- 4 GB RAM
* - **Storage**
- 30 GB available hard drive space
-
* - **Internet**
- Broadband internet connection
-
* - **Resolution**
- 1024 x 768 minimum display resolution
-
macOS
^^^^^
.. list-table::
:header-rows: 1
:widths: auto
* -
- Recommended
- Minimum
* - **Operating System**
- macOS 10.14.1 or above
-
* - **CPU**
- Intel® Core™ i7-4770 or better
- Intel® Core™ i5-760 or better
* - **GPU**
- AMD Radeon™ R9 M395X or better
- NVIDIA® GeForce® GT 750M or AMD Radeon™ R9 M290 or better
* - **Memory**
- 8 GB RAM
- 4 GB RAM
* - **Storage**
- 70 GB available space on SSD
- 70 GB available space on 7200 RPM HDD
* - **Internet**
- Broadband internet connection
-
* - **Resolution**
- 1024 x 768 minimum display resolution
-
Further reading
---------------
* `Overview <../Overview.rst>`__
* `Use command line tool nnictl <Nnictl.rst>`__
* `Use NNIBoard <WebUI.rst>`__
* `Define search space <SearchSpaceSpec.rst>`__
* `Config an experiment <ExperimentConfig.rst>`__
* `How to run an experiment on local (with multiple GPUs)? <../TrainingService/LocalMode.rst>`__
* `How to run an experiment on multiple machines? <../TrainingService/RemoteMachineMode.rst>`__
* `How to run an experiment on OpenPAI? <../TrainingService/PaiMode.rst>`__
* `How to run an experiment on Kubernetes through Kubeflow? <../TrainingService/KubeflowMode.rst>`__
* `How to run an experiment on Kubernetes through FrameworkController? <../TrainingService/FrameworkControllerMode.rst>`__
* `How to run an experiment on Kubernetes through AdaptDL? <../TrainingService/AdaptDLMode.rst>`__
Install on Windows
==================
Prerequisites
-------------
*
Python 3.6 (or above) 64-bit. `Anaconda <https://www.anaconda.com/products/individual>`__ or `Miniconda <https://docs.conda.io/en/latest/miniconda.html>`__ is highly recommended to manage multiple Python environments on Windows.
*
If it's a newly installed Python environment, you need to install `Microsoft C++ Build Tools <https://visualstudio.microsoft.com/visual-cpp-build-tools/>`__ to support building NNI dependencies like ``scikit-learn``.
.. code-block:: bat
pip install cython wheel
*
git, for verifying the installation.
Install NNI
-----------
In most cases, you can install and upgrade NNI from pip package. It's easy and fast.
If you are interested in a specific or the latest version of the code, you can install NNI from source code.
If you want to contribute to NNI, refer to `setup development environment <SetupNniDeveloperEnvironment.rst>`__.
*
From pip package
.. code-block:: bat
python -m pip install --upgrade nni
*
From source code
.. code-block:: bat
git clone -b v2.6 https://github.com/Microsoft/nni.git
cd nni
python -m pip install -U -r dependencies/setup.txt
python -m pip install -r dependencies/develop.txt
python setup.py develop
Verify installation
-------------------
*
Clone examples within source code.
.. code-block:: bat
git clone -b v2.6 https://github.com/Microsoft/nni.git
*
Run the MNIST example.
.. code-block:: bat
nnictl create --config nni\examples\trials\mnist-pytorch\config_windows.yml
Note: If you are familiar with other frameworks, you can choose a corresponding example under ``examples\trials``. You need to change the trial command from ``python3`` to ``python`` in each example YAML, since the default installation provides a ``python.exe`` executable, not ``python3.exe``.
*
Wait for the message ``INFO: Successfully started experiment!`` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the ``Web UI url``.
.. code-block:: text
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
INFO: Successfully set local config!
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: http://223.255.255.1:8080 http://127.0.0.1:8080
-----------------------------------------------------------------------
You can use these commands to get more information about the experiment
-----------------------------------------------------------------------
commands description
1. nnictl experiment show show the information of experiments
2. nnictl trial ls list all of trial jobs
3. nnictl top monitor the status of running experiments
4. nnictl log stderr show stderr log content
5. nnictl log stdout show stdout log content
6. nnictl stop stop an experiment
7. nnictl trial kill kill a trial job by id
8. nnictl --help get help information about nnictl
-----------------------------------------------------------------------
* Open the ``Web UI url`` in your browser, you can view detailed information about the experiment and all the submitted trial jobs as shown below. `Here <../Tutorial/WebUI.rst>`__ are more Web UI pages.
.. image:: ../../img/webui_overview_page.png
:target: ../../img/webui_overview_page.png
:alt: overview
.. image:: ../../img/webui_trialdetail_page.png
:target: ../../img/webui_trialdetail_page.png
:alt: detail
System requirements
-------------------
Below are the minimum system requirements for NNI on Windows. Windows 10 (1809) is well tested and recommended. Due to potential programming changes, the minimum system requirements for NNI may change over time.
.. list-table::
:header-rows: 1
:widths: auto
* -
- Recommended
- Minimum
* - **Operating System**
- Windows 10 1809 or above
-
* - **CPU**
- Intel® Core™ i5 or AMD Phenom™ II X3 or better
- Intel® Core™ i3 or AMD Phenom™ X3 8650
* - **GPU**
- NVIDIA® GeForce® GTX 660 or better
- NVIDIA® GeForce® GTX 460
* - **Memory**
- 6 GB RAM
- 4 GB RAM
* - **Storage**
- 30 GB available hard drive space
-
* - **Internet**
- Broadband internet connection
-
* - **Resolution**
- 1024 x 768 minimum display resolution
-
FAQ
---
simplejson failed when installing NNI
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Make sure a C++ 14.0 compiler is installed.
..
building 'simplejson._speedups' extension error: [WinError 3] The system cannot find the path specified
Trial failed with missing DLL in command line or PowerShell
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This error is caused by missing LIBIFCOREMD.DLL and LIBMMD.DLL, which makes SciPy fail to install. Using Anaconda or Miniconda with 64-bit Python can solve it.
..
ImportError: DLL load failed
Trial failed on webUI
^^^^^^^^^^^^^^^^^^^^^
Please check the trial's stderr log file for more details. Two common causes are:
* forgetting to change the trial command ``python3`` to ``python`` in each experiment YAML.
* forgetting to install experiment dependencies such as TensorFlow, Keras and so on.
Fail to use BOHB on Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Make sure a C++ 14.0 compiler is installed when trying to run ``pip install nni[BOHB]`` to install the dependencies.
Unsupported tuners on Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SMAC is currently not supported; for the specific reason, refer to this `GitHub issue <https://github.com/automl/SMAC3/issues/483>`__.
Use Windows as a remote worker
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Refer to `Remote Machine mode <../TrainingService/RemoteMachineMode.rst>`__.
Segmentation fault (core dumped) when installing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Refer to `FAQ <FAQ.rst>`__.
Further reading
---------------
* `Overview <../Overview.rst>`__
* `Use command line tool nnictl <Nnictl.rst>`__
* `Use NNIBoard <WebUI.rst>`__
* `Define search space <SearchSpaceSpec.rst>`__
* `Config an experiment <ExperimentConfig.rst>`__
* `How to run an experiment on local (with multiple GPUs)? <../TrainingService/LocalMode.rst>`__
* `How to run an experiment on multiple machines? <../TrainingService/RemoteMachineMode.rst>`__
* `How to run an experiment on OpenPAI? <../TrainingService/PaiMode.rst>`__
* `How to run an experiment on Kubernetes through Kubeflow? <../TrainingService/KubeflowMode.rst>`__
* `How to run an experiment on Kubernetes through FrameworkController? <../TrainingService/FrameworkControllerMode.rst>`__
<table>
<tr>
<td>
<div>
<img style="
width: 300px;
"
src="../../img/emoicons/NoBug.png"/>
</div>
</td>
<td>
<div>
<img style="
width: 300px;
" src="../../img/emoicons/Holiday.png"/>
</div>
</td>
<td>
<div>
<img style="
width: 300px;
height: 180px;
" src="../../img/emoicons/Error.png"/>
</div>
</td>
</tr>
<tr>
<td align="center">No bug</td>
<td align="center">Holiday</td>
<td align="center">Error</td>
</tr>
<tr>
<td>
<div>
<img style="
width: 300px;
height: 210px;
" src="../../img/emoicons/Working.png"/>
</div>
</td>
<td >
<div>
<img style="
width: 300px;
" src="../../img/emoicons/Sign.png"/>
</div>
</td>
<td>
<div>
<img style="
width: 300px;
" src="../../img/emoicons/Crying.png"/>
</div>
</td>
</tr>
<tr>
<td align="center" >Working</td>
<td align="center" >Sign</td>
<td align="center" >Crying</td>
</tr>
<tr>
<td>
<div>
<img style="
width: 300px;
height: 190px;
" src="../../img/emoicons/Cut.png"/>
</div>
</td>
<td>
<div>
<img style="
width: 300px;
" src="../../img/emoicons/Weaving.png"/>
</div>
</td>
<td>
<div>
<img style="
width: 300px;
" src="../../img/emoicons/Comfort.png"/>
</div>
</td>
</tr>
<tr>
<td align="center">Cut</td>
<td align="center">Weaving</td>
<td align="center">Comfort</td>
</tr>
<tr>
<td>
<div>
<img style="
width: 300px;
" src="../../img/emoicons/Sweat.png"/>
</div>
</td>
<td></td>
<td></td>
</tr>
<tr>
<td align="center">Sweat</td>
<td align="center"></td>
<td align="center"></td>
</tr>
</table>
.. role:: raw-html(raw)
:format: html
nnictl
======
Introduction
------------
**nnictl** is a command line tool, which can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc.
Commands
--------
nnictl supports the following commands:
* `nnictl create <#create>`__
* `nnictl resume <#resume>`__
* `nnictl view <#view>`__
* `nnictl stop <#stop>`__
* `nnictl update <#update>`__
* `nnictl trial <#trial>`__
* `nnictl top <#top>`__
* `nnictl experiment <#experiment>`__
* `nnictl platform <#platform>`__
* `nnictl config <#config>`__
* `nnictl log <#log>`__
* `nnictl webui <#webui>`__
* `nnictl algo <#algo>`__
* `nnictl ss_gen <#ss_gen>`__
* `nnictl --version <#version>`__
Manage an experiment
^^^^^^^^^^^^^^^^^^^^
:raw-html:`<a name="create"></a>`
nnictl create
^^^^^^^^^^^^^
*
Description
You can use this command to create a new experiment, using the configuration specified in the config file.
After this command completes successfully, the context is set to this experiment, which means subsequent commands you issue are associated with this experiment, unless you explicitly change the context (not supported yet).
*
Usage
.. code-block:: bash
nnictl create [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - --config, -c
- True
-
- YAML configure file of the experiment
* - --port, -p
- False
-
- the port of restful server
* - --debug, -d
- False
-
- set debug mode
* - --foreground, -f
- False
-
- set foreground mode, print log content to terminal
*
Examples
..
create a new experiment with the default port: 8080
.. code-block:: bash
nnictl create --config nni/examples/trials/mnist-pytorch/config.yml
..
create a new experiment with specified port 8088
.. code-block:: bash
nnictl create --config nni/examples/trials/mnist-pytorch/config.yml --port 8088
..
create a new experiment with specified port 8088 and debug mode
.. code-block:: bash
nnictl create --config nni/examples/trials/mnist-pytorch/config.yml --port 8088 --debug
Note:
.. code-block:: text
Debug mode will disable the version check function in Trialkeeper.
:raw-html:`<a name="resume"></a>`
nnictl resume
^^^^^^^^^^^^^
*
Description
You can use this command to resume a stopped experiment.
*
Usage
.. code-block:: bash
nnictl resume [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- True
-
- The id of the experiment you want to resume
* - --port, -p
- False
-
- Rest port of the experiment you want to resume
* - --debug, -d
- False
-
- set debug mode
* - --foreground, -f
- False
-
- set foreground mode, print log content to terminal
* - --experiment_dir, -e
- False
-
- Resume experiment from external folder, specify the full path of experiment folder
*
Example
..
resume an experiment with specified port 8088
.. code-block:: bash
nnictl resume [experiment_id] --port 8088
:raw-html:`<a name="view"></a>`
nnictl view
^^^^^^^^^^^
*
Description
You can use this command to view a stopped experiment.
*
Usage
.. code-block:: bash
nnictl view [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- True
-
- The id of the experiment you want to view
* - --port, -p
- False
-
- Rest port of the experiment you want to view
* - --experiment_dir, -e
- False
-
- View experiment from external folder, specify the full path of experiment folder
*
Example
..
view an experiment with specified port 8088
.. code-block:: bash
nnictl view [experiment_id] --port 8088
:raw-html:`<a name="stop"></a>`
nnictl stop
^^^^^^^^^^^
*
Description
You can use this command to stop a running experiment or multiple experiments.
*
Usage
.. code-block:: bash
nnictl stop [Options]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- The id of the experiment you want to stop
* - --port, -p
- False
-
- Rest port of the experiment you want to stop
* - --all, -a
- False
-
- Stop all of experiments
*
Details & Examples
#.
If no id is specified and an experiment is running, the running experiment will be stopped; otherwise, an error message is printed.
.. code-block:: bash
nnictl stop
#.
If an id is specified and it matches a running experiment, nnictl will stop the corresponding experiment; otherwise, it will print an error message.
.. code-block:: bash
nnictl stop [experiment_id]
#.
If there is a port specified, and an experiment is running on that port, the experiment will be stopped.
.. code-block:: bash
nnictl stop --port 8080
#.
Users can use ``nnictl stop --all`` to stop all experiments.
.. code-block:: bash
nnictl stop --all
#.
If the id ends with \*, nnictl will stop all experiments whose ids match the pattern.
#. If the id does not exist but matches the prefix of an experiment id, nnictl will stop the matched experiment.
#. If the id does not exist but matches the prefix of multiple experiment ids, nnictl will print the id information.
:raw-html:`<a name="update"></a>`
nnictl update
^^^^^^^^^^^^^
*
**nnictl update searchspace**
*
Description
You can use this command to update an experiment's search space.
*
Usage
.. code-block:: bash
nnictl update searchspace [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
* - --filename, -f
- True
-
- the file storing your new search space
*
Example
..
update the experiment's search space with the file 'examples/trials/mnist-pytorch/search_space.json'
.. code-block:: bash
nnictl update searchspace [experiment_id] --filename examples/trials/mnist-pytorch/search_space.json
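A search space file such as the one referenced above is a JSON object mapping hyperparameter names to ``_type``/``_value`` specifications. A minimal illustrative sketch (the parameter names and ranges here are hypothetical):

```python
import json

# Illustrative sketch of a search-space file like
# examples/trials/mnist-pytorch/search_space.json; the exact parameter
# names and ranges are hypothetical, not copied from the example.
search_space = {
    "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
    "lr": {"_type": "loguniform", "_value": [0.0001, 0.1]},
}

# serialize as it would appear in search_space.json
text = json.dumps(search_space, indent=2)
```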
*
**nnictl update concurrency**
*
Description
You can use this command to update an experiment's concurrency.
*
Usage
.. code-block:: bash
nnictl update concurrency [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
* - --value, -v
- True
-
- the number of allowed concurrent trials
*
Example
..
update experiment's concurrency
.. code-block:: bash
nnictl update concurrency [experiment_id] --value [concurrency_number]
*
**nnictl update duration**
*
Description
You can use this command to update an experiment's duration.
*
Usage
.. code-block:: bash
nnictl update duration [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
* - --value, -v
- True
-
- Strings like '1m' for one minute or '2h' for two hours. SUFFIX may be 's' for seconds, 'm' for minutes, 'h' for hours or 'd' for days.
*
Example
..
update experiment's duration
.. code-block:: bash
nnictl update duration [experiment_id] --value [duration]
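The duration strings described above follow a simple number-plus-suffix form. A hypothetical helper (not part of nnictl) converting them to seconds:

```python
import re

# Hypothetical helper (not part of nnictl) converting duration strings
# like '1m' or '2h' into seconds, per the suffix rules described above.
_UNIT_SECONDS = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400}

def duration_to_seconds(text):
    match = re.fullmatch(r'(\d+)([smhd])', text)
    if match is None:
        raise ValueError(f'invalid duration: {text!r}')
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]

duration_to_seconds('1m')   # 60
duration_to_seconds('2h')   # 7200
```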
*
**nnictl update trialnum**
*
Description
You can use this command to update an experiment's maxtrialnum.
*
Usage
.. code-block:: bash
nnictl update trialnum [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
* - --value, -v
- True
-
- the new number of maxtrialnum you want to set
*
Example
..
update experiment's trial num
.. code-block:: bash
nnictl update trialnum [experiment_id] --value [trial_num]
:raw-html:`<a name="trial"></a>`
nnictl trial
^^^^^^^^^^^^
*
**nnictl trial ls**
*
Description
You can use this command to show the trials' information. Note that if ``head`` or ``tail`` is set, only completed trials will be listed.
*
Usage
.. code-block:: bash
nnictl trial ls
nnictl trial ls --head 10
nnictl trial ls --tail 10
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
* - --head
- False
-
- the number of items to be listed with the highest default metric
* - --tail
- False
-
- the number of items to be listed with the lowest default metric
*
**nnictl trial kill**
*
Description
You can use this command to kill a trial job.
*
Usage
.. code-block:: bash
nnictl trial kill [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- Experiment ID of the trial
* - --trial_id, -T
- True
-
- ID of the trial you want to kill.
*
Example
..
kill a trial job
.. code-block:: bash
nnictl trial kill [experiment_id] --trial_id [trial_id]
:raw-html:`<a name="top"></a>`
nnictl top
^^^^^^^^^^
*
Description
Monitor all running experiments.
*
Usage
.. code-block:: bash
nnictl top
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
* - --time, -t
- False
-
- The interval at which to refresh the experiment status, in seconds. The default value is 3 seconds.
:raw-html:`<a name="experiment"></a>`
Manage experiment information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
*
**nnictl experiment show**
*
Description
Show the information of the experiment.
*
Usage
.. code-block:: bash
nnictl experiment show
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
*
**nnictl experiment status**
*
Description
Show the status of the experiment.
*
Usage
.. code-block:: bash
nnictl experiment status
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
*
**nnictl experiment list**
*
Description
Show the information of all the (running) experiments.
*
Usage
.. code-block:: bash
nnictl experiment list [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - --all
- False
-
- list all experiments
*
**nnictl experiment delete**
*
Description
Delete one or all experiments, including logs, results, environment information and cache. It is used to delete useless experiment results or to save disk space.
*
Usage
.. code-block:: bash
nnictl experiment delete [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment
* - --all
- False
-
- delete all experiments
*
**nnictl experiment export**
*
Description
You can use this command to export the rewards & hyper-parameters of trial jobs to a csv or json file.
*
Usage
.. code-block:: bash
nnictl experiment export [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment
* - --filename, -f
- True
-
- File path of the output file
* - --type
- True
-
- Type of output file, only supports "csv" and "json"
* - --intermediate, -i
- False
-
- Whether intermediate results are included
*
Examples
..
export all trial data in an experiment as json format
.. code-block:: bash
nnictl experiment export [experiment_id] --filename [file_path] --type json --intermediate
*
**nnictl experiment import**
*
Description
You can use this command to import several prior or supplementary trial hyperparameters & results for NNI hyperparameter tuning. The data are fed to the tuning algorithm (e.g., tuner or advisor).
*
Usage
.. code-block:: bash
nnictl experiment import [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- The id of the experiment you want to import data into
* - --filename, -f
- True
-
- a file with data you want to import in json format
*
Details
NNI allows users to import their own data; please express the data in the correct format. An example is shown below:
.. code-block:: json
[
{"parameter": {"x": 0.5, "y": 0.9}, "value": 0.03},
{"parameter": {"x": 0.4, "y": 0.8}, "value": 0.05},
{"parameter": {"x": 0.3, "y": 0.7}, "value": 0.04}
]
Every element in the top-level list is a sample. For our built-in tuners/advisors, each sample should have at least two keys: ``parameter`` and ``value``. The ``parameter`` must match this experiment's search space; that is, all the keys (or hyperparameters) in ``parameter`` must match the keys in the search space. Otherwise, the tuner/advisor may have unpredictable behavior. ``value`` should follow the same rule as the input in ``nni.report_final_result``\ , that is, either a number or a dict with a key named ``default``. For your customized tuner/advisor, the file could have any json content depending on how you implement the corresponding methods (e.g., ``import_data``\ ).
You also can use `nnictl experiment export <#export>`__ to export a valid json file including previous experiment trial hyperparameters and results.
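As a sanity check before importing, the rules above can be expressed in a few lines of Python. The helper below is a hypothetical sketch (not part of ``nnictl``): it verifies that each sample has ``parameter`` and ``value``, that the parameter keys cover the search space keys, and that the value is a number or a dict with a ``default`` key.

```python
# Hypothetical sanity check for data passed to ``nnictl experiment import``
# (not part of nnictl itself). It enforces the format rules described above.

def validate_import_data(samples, search_space_keys):
    for i, sample in enumerate(samples):
        # Each sample must carry both "parameter" and "value".
        if not {"parameter", "value"} <= set(sample):
            raise ValueError(f"sample {i}: missing 'parameter' or 'value'")
        # Parameter keys must cover the search space keys (extra keys such as
        # "TRIAL_BUDGET" for BOHB are allowed).
        if not set(search_space_keys) <= set(sample["parameter"]):
            raise ValueError(f"sample {i}: parameters do not cover the search space")
        # Value must be a number, or a dict with a "default" key.
        value = sample["value"]
        if not isinstance(value, (int, float)) and not (
                isinstance(value, dict) and "default" in value):
            raise ValueError(f"sample {i}: value must be a number or have a 'default' key")
    return True

samples = [
    {"parameter": {"x": 0.5, "y": 0.9}, "value": 0.03},
    {"parameter": {"x": 0.4, "y": 0.8}, "value": 0.05},
]
print(validate_import_data(samples, ["x", "y"]))
```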
Currently, the following tuners and advisors support importing data:
.. code-block:: yaml
builtinTunerName: TPE, Anneal, GridSearch, MetisTuner
builtinAdvisorName: BOHB
*If you want to import data to the BOHB advisor, you are suggested to add "TRIAL_BUDGET" in the parameter as NNI does; otherwise, BOHB will use max_budget as "TRIAL_BUDGET". Here is an example:*
.. code-block:: json
[
{"parameter": {"x": 0.5, "y": 0.9, "TRIAL_BUDGET": 27}, "value": 0.03}
]
*
Examples
..
import data to a running experiment
.. code-block:: bash
nnictl experiment import [experiment_id] -f experiment_data.json
*
**nnictl experiment save**
*
Description
Save nni experiment metadata and code data.
*
Usage
.. code-block:: bash
nnictl experiment save [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- True
-
- The id of the experiment you want to save
* - --path, -p
- False
-
- the folder path to store nni experiment data; defaults to the current working directory
* - --saveCodeDir, -s
- False
-
- save codeDir data of the experiment, default False
*
Examples
..
save an experiment
.. code-block:: bash
nnictl experiment save [experiment_id] --saveCodeDir
*
**nnictl experiment load**
*
Description
Load an nni experiment.
*
Usage
.. code-block:: bash
nnictl experiment load [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - --path, -p
- True
-
- the file path of nni package
* - --codeDir, -c
- True
-
- the path of codeDir for the loaded experiment; the code in the loaded experiment package will also be placed under this path
* - --logDir, -l
- False
-
- the path of logDir for loaded experiment
* - --searchSpacePath, -s
- True
-
- the path of the search space file for the loaded experiment; this path includes the file name. Defaults to $codeDir/search_space.json
*
Examples
..
load an experiment
.. code-block:: bash
nnictl experiment load --path [path] --codeDir [codeDir]
:raw-html:`<a name="platform"></a>`
Manage platform information
^^^^^^^^^^^^^^^^^^^^^^^^^^^
*
**nnictl platform clean**
*
Description
It is used to clean up disk space on a target platform. The provided YAML file includes the information of the target platform, and it follows the same schema as the NNI configuration file.
*
Note
If the target platform is being used by other users, cleaning it may cause unexpected errors for them.
*
Usage
.. code-block:: bash
nnictl platform clean [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - --config
- True
-
- the path of the YAML config file used when creating an experiment
:raw-html:`<a name="config"></a>`
nnictl config show
^^^^^^^^^^^^^^^^^^
*
Description
Display the current context information.
*
Usage
.. code-block:: bash
nnictl config show
:raw-html:`<a name="log"></a>`
Manage log
^^^^^^^^^^
*
**nnictl log stdout**
*
Description
Show the stdout log content.
*
Usage
.. code-block:: bash
nnictl log stdout [options]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
* - --head, -h
- False
-
- show head lines of stdout
* - --tail, -t
- False
-
- show tail lines of stdout
* - --path, -p
- False
-
- show the path of stdout file
*
Example
..
Show the tail of stdout log content
.. code-block:: bash
nnictl log stdout [experiment_id] --tail [lines_number]
*
**nnictl log stderr**
*
Description
Show the stderr log content.
*
Usage
.. code-block:: bash
nnictl log stderr [options]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to set
* - --head, -h
- False
-
- show head lines of stderr
* - --tail, -t
- False
-
- show tail lines of stderr
* - --path, -p
- False
-
- show the path of stderr file
*
**nnictl log trial**
*
Description
Show trial log path.
*
Usage
.. code-block:: bash
nnictl log trial [options]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- Experiment ID of the trial
* - --trial_id, -T
- False
-
- ID of the trial whose log path should be found; required when id is not empty.
:raw-html:`<a name="webui"></a>`
Manage webui
^^^^^^^^^^^^
*
**nnictl webui url**
*
Description
Show an experiment's webui url
*
Usage
.. code-block:: bash
nnictl webui url [options]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- Experiment ID
:raw-html:`<a name="algo"></a>`
Manage builtin algorithms
^^^^^^^^^^^^^^^^^^^^^^^^^
*
**nnictl algo register**
*
Description
Register customized algorithms as builtin tuner/assessor/advisor.
*
Usage
.. code-block:: bash
nnictl algo register --meta <path_to_meta_file>
``<path_to_meta_file>`` is the path to the meta data file in yml format, which has following keys:
*
``algoType``: type of algorithms, could be one of ``tuner``, ``assessor``, ``advisor``
*
``builtinName``: builtin name used in experiment configuration file
*
``className``: tuner class name, including its module name, for example: ``demo_tuner.DemoTuner``
*
``classArgsValidator``: class args validator class name, including its module name, for example: ``demo_tuner.MyClassArgsValidator``
*
Example
..
Install a customized tuner in nni examples
.. code-block:: bash
cd nni/examples/tuners/customized_tuner
python3 setup.py develop
nnictl algo register --meta meta_file.yml
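The ``meta_file.yml`` referenced above could look like the following sketch, assembled from the keys listed earlier (the values follow the demo tuner example and are illustrative):

.. code-block:: yaml

   # Illustrative meta file for the customized demo tuner.
   algoType: tuner
   builtinName: demotuner
   className: demo_tuner.DemoTuner
   classArgsValidator: demo_tuner.MyClassArgsValidator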
*
**nnictl algo show**
*
Description
Show the detailed information of specified registered algorithms.
*
Usage
.. code-block:: bash
nnictl algo show <builtinName>
*
Example
.. code-block:: bash
nnictl algo show SMAC
*
**nnictl algo list**
*
Description
List the registered builtin algorithms.
*
Usage
.. code-block:: bash
nnictl algo list
*
Example
.. code-block:: bash
nnictl algo list
*
**nnictl algo unregister**
*
Description
Unregister a registered customized builtin algorithm. The NNI-provided builtin algorithms cannot be unregistered.
*
Usage
.. code-block:: bash
nnictl algo unregister <builtinName>
*
Example
.. code-block:: bash
nnictl algo unregister demotuner
:raw-html:`<a name="ss_gen"></a>`
Generate search space
^^^^^^^^^^^^^^^^^^^^^
*
**nnictl ss_gen**
*
Description
Generate search space from user trial code which uses NNI NAS APIs.
*
Usage
.. code-block:: bash
nnictl ss_gen [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - --trial_command
- True
-
- The command of the trial code
* - --trial_dir
- False
- ./
- The directory of the trial code
* - --file
- False
- nni_auto_gen_search_space.json
- The file for storing generated search space
*
Example
..
Generate a search space
.. code-block:: bash
nnictl ss_gen --trial_command="python3 mnist.py" --trial_dir=./ --file=ss.json
:raw-html:`<a name="version"></a>`
Check NNI version
^^^^^^^^^^^^^^^^^
*
**nnictl --version**
*
Description
Describe the current version of NNI installed.
*
Usage
.. code-block:: bash
nnictl --version
QuickStart
==========
Installation
------------
Currently, NNI supports running on Linux, macOS and Windows. Ubuntu 16.04 or higher, macOS 10.14.1, and Windows 10.1809 are tested and supported. Simply run the following ``pip install`` in an environment that has ``python >= 3.6``.
Linux and macOS
^^^^^^^^^^^^^^^
.. code-block:: bash
python3 -m pip install --upgrade nni
Windows
^^^^^^^
.. code-block:: bash
python -m pip install --upgrade nni
.. Note:: For Linux and macOS, ``--user`` can be added if you want to install NNI in your home directory, which does not require any special privileges.
.. Note:: If there is an error like ``Segmentation fault``, please refer to the :doc:`FAQ <FAQ>`.
.. Note:: For the system requirements of NNI, please refer to :doc:`Install NNI on Linux & Mac <InstallationLinux>` or :doc:`Windows <InstallationWin>`. If you want to use docker, refer to :doc:`HowToUseDocker <HowToUseDocker>`.
"Hello World" example on MNIST
------------------------------
NNI is a toolkit to help users run automated machine learning experiments. It can automatically do the cyclic process of getting hyperparameters, running trials, testing results, and tuning hyperparameters. Here, we'll show how to use NNI to help you find the optimal hyperparameters on the MNIST dataset.
Here is an example script to train a CNN on the MNIST dataset **without NNI**:
.. code-block:: python
def main(args):
# load data
train_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=args['batch_size'], shuffle=True)
test_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=1000, shuffle=True)
# build model
model = Net(hidden_size=args['hidden_size'])
optimizer = optim.SGD(model.parameters(), lr=args['lr'], momentum=args['momentum'])
# train
for epoch in range(10):
train(args, model, device, train_loader, optimizer, epoch)
test_acc = test(args, model, device, test_loader)
print(test_acc)
print('final accuracy:', test_acc)
if __name__ == '__main__':
params = {
'batch_size': 32,
'hidden_size': 128,
'lr': 0.001,
'momentum': 0.5
}
main(params)
The above code can only try one set of parameters at a time. If you want to tune the learning rate, you need to manually modify the hyperparameter and start the trial again and again.
NNI was built to help users tune jobs; its working process is presented below:
.. code-block:: text
input: search space, trial code, config file
output: one optimal hyperparameter configuration
1: For t = 0, 1, 2, ..., maxTrialNum,
2: hyperparameter = choose a set of parameters from the search space
3: final result = run_trial_and_evaluate(hyperparameter)
4: report final result to NNI
5: If the time limit is reached,
6: Stop the experiment
7: return hyperparameter value with best final result
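This working process can be sketched in plain Python, with a random "tuner" and a dummy objective standing in for NNI's machinery (all names here are illustrative, not NNI APIs):

```python
import random
import time

# Toy stand-ins for NNI's machinery: a random "tuner" over a small
# search space and a dummy objective in place of real trial code.
search_space = {"lr": [0.0001, 0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}

def run_trial_and_evaluate(params):
    # Dummy objective: pretend smaller lr and larger batch size work better.
    return 1.0 - params["lr"] + params["batch_size"] / 1000

max_trial_num, deadline = 10, time.time() + 3600
best_params, best_result = None, float("-inf")
for t in range(max_trial_num):
    params = {k: random.choice(v) for k, v in search_space.items()}  # step 2
    result = run_trial_and_evaluate(params)                          # step 3
    if result > best_result:                                         # step 4
        best_params, best_result = params, result
    if time.time() > deadline:                                       # steps 5-6
        break
print(best_params, best_result)                                      # step 7
```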
.. note::
If you want to use NNI to automatically train your model and find the optimal hyper-parameters, there are two approaches:
1. Write a config file and start the experiment from the command line.
2. Config and launch the experiment directly from a Python file
In this part, we will focus on the first approach. For the second approach, please refer to `this tutorial <HowToLaunchFromPython.rst>`__\ .
Step 1: Modify the ``Trial`` Code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Modify your ``Trial`` file to get the hyperparameter set from NNI and report the final results to NNI.
.. code-block:: diff
+ import nni
def main(args):
# load data
train_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=args['batch_size'], shuffle=True)
test_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=1000, shuffle=True)
# build model
model = Net(hidden_size=args['hidden_size'])
optimizer = optim.SGD(model.parameters(), lr=args['lr'], momentum=args['momentum'])
# train
for epoch in range(10):
train(args, model, device, train_loader, optimizer, epoch)
test_acc = test(args, model, device, test_loader)
- print(test_acc)
+ nni.report_intermediate_result(test_acc)
- print('final accuracy:', test_acc)
+ nni.report_final_result(test_acc)
if __name__ == '__main__':
- params = {'batch_size': 32, 'hidden_size': 128, 'lr': 0.001, 'momentum': 0.5}
+ params = nni.get_next_parameter()
main(params)
*Example:* :githublink:`mnist.py <examples/trials/mnist-pytorch/mnist.py>`
Step 2: Define the Search Space
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Define a ``Search Space`` in a YAML file, including the ``name`` and the ``distribution`` (discrete-valued or continuous-valued) of all the hyperparameters you want to search.
.. code-block:: yaml
searchSpace:
batch_size:
_type: choice
_value: [16, 32, 64, 128]
hidden_size:
_type: choice
_value: [128, 256, 512, 1024]
lr:
_type: choice
_value: [0.0001, 0.001, 0.01, 0.1]
momentum:
_type: uniform
_value: [0, 1]
*Example:* :githublink:`config_detailed.yml <examples/trials/mnist-pytorch/config_detailed.yml>`
You can also write your search space in a JSON file and specify the file path in the configuration. For detailed tutorial on how to write the search space, please see `here <SearchSpaceSpec.rst>`__.
Step 3: Config the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In addition to the search space defined in `step 2 <step-2-define-the-search-space>`__, you need to configure the experiment in the YAML file. It specifies the key information of the experiment, such as the trial files, tuning algorithm, max trial number, and max duration.
.. code-block:: yaml
experimentName: MNIST # An optional name to distinguish the experiments
trialCommand: python3 mnist.py # NOTE: change "python3" to "python" if you are using Windows
trialConcurrency: 2 # Run 2 trials concurrently
maxTrialNumber: 10 # Generate at most 10 trials
maxExperimentDuration: 1h # Stop generating trials after 1 hour
tuner: # Configure the tuning algorithm
name: TPE
classArgs: # Algorithm specific arguments
optimize_mode: maximize
trainingService: # Configure the training platform
platform: local
Experiment config reference could be found `here <../reference/experiment_config.rst>`__.
.. _nniignore:
.. Note:: If you are planning to use remote machines or clusters as your :doc:`training service <../TrainingService/Overview>`, to avoid too much pressure on network, NNI limits the number of files to 2000 and total size to 300MB. If your codeDir contains too many files, you can choose which files and subfolders should be excluded by adding a ``.nniignore`` file that works like a ``.gitignore`` file. For more details on how to write this file, see the `git documentation <https://git-scm.com/docs/gitignore#_pattern_format>`__.
*Example:* :githublink:`config_detailed.yml <examples/trials/mnist-pytorch/config_detailed.yml>` and :githublink:`.nniignore <examples/trials/mnist-pytorch/.nniignore>`
All the code above is already prepared and stored in :githublink:`examples/trials/mnist-pytorch <examples/trials/mnist-pytorch>`.
Step 4: Launch the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Linux and macOS
***************
Run the **config_detailed.yml** file from your command line to start the experiment.
.. code-block:: bash
nnictl create --config nni/examples/trials/mnist-pytorch/config_detailed.yml
Windows
*******
Change ``python3`` to ``python`` of the ``trialCommand`` field in the **config_detailed.yml** file, and run the **config_detailed.yml** file from your command line to start the experiment.
.. code-block:: bash
nnictl create --config nni\examples\trials\mnist-pytorch\config_detailed.yml
.. Note:: ``nnictl`` is a command line tool that can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc. Click :doc:`here <Nnictl>` for more usage of ``nnictl``.
Wait for the message ``INFO: Successfully started experiment!`` in the command line. This message indicates that your experiment has been successfully started. This is the expected output:
.. code-block:: text
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
INFO: Successfully set local config!
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: [Your IP]:8080
-----------------------------------------------------------------------
You can use these commands to get more information about the experiment
-----------------------------------------------------------------------
commands description
1. nnictl experiment show show the information of experiments
2. nnictl trial ls list all of trial jobs
3. nnictl top monitor the status of running experiments
4. nnictl log stderr show stderr log content
5. nnictl log stdout show stdout log content
6. nnictl stop stop an experiment
7. nnictl trial kill kill a trial job by id
8. nnictl --help get help information about nnictl
-----------------------------------------------------------------------
If you prepared ``trial``\ , ``search space``\ , and ``config`` according to the above steps and successfully created an NNI job, NNI will automatically tune the optimal hyper-parameters and run different hyper-parameter sets for each trial according to the defined search space. You can clearly see its progress through the WebUI.
Step 5: View the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^
After starting the experiment successfully, you can find a message in the command-line interface that tells you the ``Web UI url`` like this:
.. code-block:: text
The Web UI urls are: [Your IP]:8080
Open the ``Web UI url`` (here: ``[Your IP]:8080``\ ) in your browser to view detailed information about the experiment and all the submitted trial jobs, as shown below. If you cannot open the WebUI link in your terminal, please refer to the `FAQ <FAQ.rst#could-not-open-webui-link>`__.
View Overview Page
******************
Information about this experiment will be shown in the WebUI, including the experiment profile and search space message. NNI also supports downloading this information and the parameters through the **Experiment summary** button.
.. image:: ../../img/webui-img/full-oview.png
:target: ../../img/webui-img/full-oview.png
:alt: overview
View Trials Detail Page
***********************
You can see the best trial metrics and the hyper-parameter graph on this page. The table shows more columns when you click the ``Add/Remove columns`` button.
.. image:: ../../img/webui-img/full-detail.png
:target: ../../img/webui-img/full-detail.png
:alt: detail
View Experiments Management Page
********************************
On the ``All experiments`` page, you can see all the experiments on your machine.
.. image:: ../../img/webui-img/managerExperimentList/expList.png
:target: ../../img/webui-img/managerExperimentList/expList.png
:alt: Experiments list
For more detailed usage of WebUI, please refer to `this doc <./WebUI.rst>`__.
Related Topic
-------------
* `How to debug? <HowToDebug.rst>`__
* `How to write a trial? <../TrialExample/Trials.rst>`__
* `How to try different Tuners? <../Tuner/BuiltinTuner.rst>`__
* `How to try different Assessors? <../Assessor/BuiltinAssessor.rst>`__
* `How to run an experiment on the different training platforms? <../training_services.rst>`__
* `How to use Annotation? <AnnotationSpec.rst>`__
* `How to use the command line tool nnictl? <Nnictl.rst>`__
* `How to launch Tensorboard on WebUI? <Tensorboard.rst>`__
.. role:: raw-html(raw)
:format: html
Search Space
============
Overview
--------
In NNI, the tuner samples parameters/architectures according to the search space.
To define a search space, users should specify the name of each variable, the type of its sampling strategy, and that strategy's parameters.
* An example of a search space definition in a JSON file is as follows:
.. code-block:: json
{
"dropout_rate": {"_type": "uniform", "_value": [0.1, 0.5]},
"conv_size": {"_type": "choice", "_value": [2, 3, 5, 7]},
"hidden_size": {"_type": "choice", "_value": [124, 512, 1024]},
"batch_size": {"_type": "choice", "_value": [50, 250, 500]},
"learning_rate": {"_type": "uniform", "_value": [0.0001, 0.1]}
}
Take the first line as an example. ``dropout_rate`` is defined as a variable whose prior distribution is a uniform distribution with a range from ``0.1`` to ``0.5``.
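A single draw from this example search space can be sketched in a few lines of Python. This is illustrative only; in practice NNI's tuners handle sampling internally and deliver the result via ``nni.get_next_parameter()``.

```python
import random

# The example search space from above (trimmed to two strategy types).
search_space = {
    "dropout_rate": {"_type": "uniform", "_value": [0.1, 0.5]},
    "conv_size": {"_type": "choice", "_value": [2, 3, 5, 7]},
    "learning_rate": {"_type": "uniform", "_value": [0.0001, 0.1]},
}

def sample(space):
    # Draw one configuration: "choice" picks one option,
    # "uniform" draws a real number from [low, high].
    params = {}
    for name, spec in space.items():
        if spec["_type"] == "choice":
            params[name] = random.choice(spec["_value"])
        elif spec["_type"] == "uniform":
            low, high = spec["_value"]
            params[name] = random.uniform(low, high)
    return params

print(sample(search_space))
```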
.. note:: In the `experiment configuration (V2) schema <ExperimentConfig.rst>`_, NNI supports defining the search space directly in the configuration file, detailed usage can be found `here <QuickStart.rst#step-2-define-the-search-space>`__. When using Python API, users can write the search space in the Python file, refer `here <HowToLaunchFromPython.rst>`__.
Note that the available sampling strategies within a search space depend on the tuner you want to use. We list the supported types for each builtin tuner below. For a customized tuner, you don't have to follow our convention and you will have the flexibility to define any type you want.
Types
-----
All types of sampling strategies and their parameter are listed here:
*
``{"_type": "choice", "_value": options}``
* The variable's value is one of the options. Here ``options`` should be a list of **numbers** or a list of **strings**. Using arbitrary objects as members of this list (like sublists, a mixture of numbers and strings, or null values) should work in most cases, but may trigger undefined behaviors.
* ``options`` can also be a nested sub-search-space; this sub-search-space takes effect only when the corresponding element is chosen. The variables in this sub-search-space can be seen as conditional variables. Here is a simple :githublink:`example of nested search space definition <examples/trials/mnist-nested-search-space/search_space.json>`. If an element in the options list is a dict, it is a sub-search-space, and for our built-in tuners you have to add a ``_name`` key in this dict, which helps you to identify which element is chosen. Accordingly, here is a :githublink:`sample <examples/trials/mnist-nested-search-space/sample.json>` which users can get from NNI with a nested search space definition. See the table below for the tuners which support nested search spaces.
*
``{"_type": "randint", "_value": [lower, upper]}``
* A random integer is chosen between ``lower`` (inclusive) and ``upper`` (exclusive).
* Note: Different tuners may interpret ``randint`` differently. Some (e.g., TPE, GridSearch) treat integers from lower
to upper as unordered ones, while others respect the ordering (e.g., SMAC). If you want all the tuners to respect
the ordering, please use ``quniform`` with ``q=1``.
*
``{"_type": "uniform", "_value": [low, high]}``
* The variable value is uniformly sampled between low and high.
* When optimizing, this variable is constrained to a two-sided interval.
*
``{"_type": "quniform", "_value": [low, high, q]}``
* The variable value is determined using ``clip(round(uniform(low, high) / q) * q, low, high)``\ , where the clip operation is used to constrain the generated value within the bounds. For example, for ``_value`` specified as [0, 10, 2.5], possible values are [0, 2.5, 5.0, 7.5, 10.0]; For ``_value`` specified as [2, 10, 5], possible values are [2, 5, 10].
* Suitable for a discrete value with respect to which the objective is still somewhat "smooth", but which should be bounded both above and below. If you want to uniformly choose an integer from a range [low, high], you can write ``_value`` like this: ``[low, high, 1]``.
*
``{"_type": "loguniform", "_value": [low, high]}``
* The variable value is drawn from a range [low, high] according to a loguniform distribution like exp(uniform(log(low), log(high))), so that the logarithm of the return value is uniformly distributed.
* When optimizing, this variable is constrained to be positive.
*
``{"_type": "qloguniform", "_value": [low, high, q]}``
* The variable value is determined using ``clip(round(loguniform(low, high) / q) * q, low, high)``\ , where the clip operation is used to constrain the generated value within the bounds.
* Suitable for a discrete variable with respect to which the objective is "smooth" and gets smoother with the size of the value, but which should be bounded both above and below.
*
``{"_type": "normal", "_value": [mu, sigma]}``
* The variable value is a real value that's normally-distributed with mean mu and standard deviation sigma. When optimizing, this is an unconstrained variable.
*
``{"_type": "qnormal", "_value": [mu, sigma, q]}``
* The variable value is determined using ``round(normal(mu, sigma) / q) * q``
* Suitable for a discrete variable that probably takes a value around mu, but is fundamentally unbounded.
*
``{"_type": "lognormal", "_value": [mu, sigma]}``
* The variable value is drawn according to ``exp(normal(mu, sigma))`` so that the logarithm of the return value is normally distributed. When optimizing, this variable is constrained to be positive.
*
``{"_type": "qlognormal", "_value": [mu, sigma, q]}``
* The variable value is determined using ``round(exp(normal(mu, sigma)) / q) * q``
* Suitable for a discrete variable with respect to which the objective is smooth and gets smoother with the size of the variable, which is bounded from one side.
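The ``randint`` and quantized strategies above can be sketched directly from their formulas. This is an illustrative re-implementation, not NNI's actual sampler:

```python
import math
import random

def randint(lower, upper):
    # A random integer in [lower, upper): lower inclusive, upper exclusive.
    return random.randrange(lower, upper)

def quniform(low, high, q):
    # clip(round(uniform(low, high) / q) * q, low, high)
    v = round(random.uniform(low, high) / q) * q
    return min(max(v, low), high)

def qloguniform(low, high, q):
    # clip(round(exp(uniform(log(low), log(high))) / q) * q, low, high)
    v = round(math.exp(random.uniform(math.log(low), math.log(high))) / q) * q
    return min(max(v, low), high)

# For _value [0, 10, 2.5] the possible quniform outcomes are
# 0, 2.5, 5.0, 7.5 and 10.0, as described above.
print(sorted({quniform(0, 10, 2.5) for _ in range(1000)}))
```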
Search Space Types Supported by Each Tuner
------------------------------------------
.. list-table::
:header-rows: 1
:widths: auto
* -
- choice
- choice(nested)
- randint
- uniform
- quniform
- loguniform
- qloguniform
- normal
- qnormal
- lognormal
- qlognormal
* - TPE Tuner
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
* - Random Search Tuner
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
* - Anneal Tuner
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
* - Evolution Tuner
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
* - SMAC Tuner
- :raw-html:`&#10003;`
-
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
-
-
-
-
-
* - Batch Tuner
- :raw-html:`&#10003;`
-
-
-
-
-
-
-
-
-
-
* - Grid Search Tuner
- :raw-html:`&#10003;`
-
- :raw-html:`&#10003;`
-
- :raw-html:`&#10003;`
-
-
-
-
-
-
* - Hyperband Advisor
- :raw-html:`&#10003;`
-
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
* - Metis Tuner
- :raw-html:`&#10003;`
-
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
-
-
-
-
-
-
* - GP Tuner
- :raw-html:`&#10003;`
-
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
-
-
-
-
* - DNGO Tuner
- :raw-html:`&#10003;`
-
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
- :raw-html:`&#10003;`
-
-
-
-
Known Limitations:
*
GP Tuner, Metis Tuner and DNGO Tuner support only **numerical values** in the search space (\ ``choice`` type values can be non-numerical with other tuners, e.g. string values). Both GP Tuner and Metis Tuner use a Gaussian Process Regressor (GPR). GPR makes predictions based on a kernel function and the 'distance' between different points; it is hard to get a true distance between non-numerical values.
*
Note that for nested search space:
* Only the Random Search, TPE, Anneal, and Evolution tuners support nested search spaces (see the table above)
Setup NNI development environment
=================================
The NNI development environment supports Ubuntu 16.04 (or above), and Windows 10 with 64-bit Python 3.
Installation
------------
1. Clone source code
^^^^^^^^^^^^^^^^^^^^
.. code-block:: bash
git clone https://github.com/Microsoft/nni.git
Note: if you want to contribute code back, you need to fork your own NNI repo and clone from there.
2. Install from source code
^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: bash
python3 -m pip install -U -r dependencies/setup.txt
python3 -m pip install -r dependencies/develop.txt
python3 setup.py develop
This installs NNI in `development mode <https://setuptools.readthedocs.io/en/latest/userguide/development_mode.html>`__,
so you don't need to reinstall it after editing.
3. Check if the environment is ready
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Now, you can try to start an experiment to check if your environment is ready.
For example, run the command
.. code-block:: bash
nnictl create --config examples/trials/mnist-pytorch/config.yml
Then open the WebUI to check that everything is OK.
4. Reload changes
^^^^^^^^^^^^^^^^^
Python
******
Nothing to do; the code is already linked to the package folders.
TypeScript (Linux and macOS)
****************************
* If ``ts/nni_manager`` is changed, run ``yarn watch`` under this folder. It will watch and build the code continuously. ``nnictl`` needs to be restarted to reload the NNI manager.
* If ``ts/webui`` is changed, run ``yarn dev``\ , which will run a mock API server and a webpack dev server simultaneously. Use ``EXPERIMENT`` environment variable (e.g., ``mnist-tfv1-running``\ ) to specify the mock data being used. Built-in mock experiments are listed in ``src/webui/mock``. An example of the full command is ``EXPERIMENT=mnist-tfv1-running yarn dev``.
TypeScript (Windows)
********************
Currently you must rebuild the TypeScript modules with ``python3 setup.py build_ts`` after editing.
5. Submit Pull Request
^^^^^^^^^^^^^^^^^^^^^^
All changes are merged into the master branch from your forked repo. The description of the Pull Request must be meaningful and useful.
We will review the changes as soon as possible; once they pass review, we will merge them into the master branch.
For more contribution guidelines and coding styles, you can refer to the `contributing document <Contributing.rst>`__.
How to Use Tensorboard within WebUI
===================================
Since NNI v2.2, you can launch a TensorBoard process across one or multiple trials within the WebUI. For now, this feature supports the local training service and reuse-mode training services with shared storage; more scenarios will be supported in later NNI versions.
Preparation
-----------
Make sure TensorBoard is installed in your environment. If you have never used TensorBoard, here are getting-started tutorials for your reference: `tensorboard with tensorflow <https://www.tensorflow.org/tensorboard/get_started>`__ and `tensorboard with pytorch <https://pytorch.org/tutorials/recipes/recipes/tensorboard_with_pytorch.html>`__.
Use the WebUI to Launch TensorBoard
-----------------------------------
1. Save Logs
^^^^^^^^^^^^
NNI automatically uses the ``tensorboard`` subfolder under the trial's output folder as the TensorBoard logdir. So in the trial's source code, you need to save the TensorBoard logs under ``NNI_OUTPUT_DIR/tensorboard``. This log path can be joined as:
.. code-block:: python
log_dir = os.path.join(os.environ["NNI_OUTPUT_DIR"], 'tensorboard')
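A slightly fuller sketch follows: it resolves the log directory as above, creates it, and notes where a writer would attach. The temporary-directory fallback is only so the snippet also runs outside NNI, where ``NNI_OUTPUT_DIR`` is unset.

```python
import os
import tempfile

# Resolve the TensorBoard log directory as described above; fall back to a
# temporary directory when running outside NNI (NNI_OUTPUT_DIR unset).
output_dir = os.environ.get("NNI_OUTPUT_DIR") or tempfile.mkdtemp()
log_dir = os.path.join(output_dir, "tensorboard")
os.makedirs(log_dir, exist_ok=True)

# A writer can then log there, e.g. with PyTorch (assuming it is installed):
#   from torch.utils.tensorboard import SummaryWriter
#   writer = SummaryWriter(log_dir)
print(log_dir)
```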
2. Launch Tensorboard
^^^^^^^^^^^^^^^^^^^^^
As with the compare feature, first select the trials you want to combine, then click the ``Tensorboard`` button.
.. image:: ../../img/Tensorboard_1.png
:target: ../../img/Tensorboard_1.png
:alt:
After clicking the ``OK`` button in the pop-up box, you will jump to the TensorBoard portal.
.. image:: ../../img/Tensorboard_2.png
:target: ../../img/Tensorboard_2.png
:alt:
You can see the ``SequenceID-TrialID`` on the tensorboard portal.
.. image:: ../../img/Tensorboard_3.png
:target: ../../img/Tensorboard_3.png
:alt:
3. Stop All
^^^^^^^^^^^^
If you want to reopen a portal you have already launched, click its TensorBoard id. If you no longer need TensorBoard, click the ``Stop all tensorboard`` button.
.. image:: ../../img/Tensorboard_4.png
:target: ../../img/Tensorboard_4.png
:alt:
WebUI
=====
Experiments management
-----------------------
Click the tab ``All experiments`` on the nav bar.
.. image:: ../../img/webui-img/managerExperimentList/experimentListNav.png
:target: ../../img/webui-img/managerExperimentList/experimentListNav.png
:alt: ExperimentList nav
* On the ``All experiments`` page, you can see all the experiments on your machine.
.. image:: ../../img/webui-img/managerExperimentList/expList.png
:target: ../../img/webui-img/managerExperimentList/expList.png
:alt: Experiments list
* When you want to see more details about an experiment, click its ID:
.. image:: ../../img/webui-img/managerExperimentList/toAnotherExp.png
:target: ../../img/webui-img/managerExperimentList/toAnotherExp.png
:alt: See this experiment detail
* If there are many experiments in the table, you can use the ``filter`` button.
.. image:: ../../img/webui-img/managerExperimentList/expFilter.png
:target: ../../img/webui-img/managerExperimentList/expFilter.png
:alt: filter button
View summary page
-----------------
Click the tab ``Overview``.
* On the overview tab, you can see the experiment's information, its status, and the performance of the ``top trials``.
.. image:: ../../img/webui-img/full-oview.png
:target: ../../img/webui-img/full-oview.png
:alt: overview
* If you want to see the experiment's search space and config, click the ``Search space`` or ``Config`` button on the right (the label appears when you hover over the button).
1. Search space file:
.. image:: ../../img/webui-img/searchSpace.png
:target: ../../img/webui-img/searchSpace.png
:alt: searchSpace
2. Config file:
.. image:: ../../img/webui-img/config.png
:target: ../../img/webui-img/config.png
:alt: config
* You can view and download the ``nni-manager/dispatcher log files`` here.
.. image:: ../../img/webui-img/review-log.png
:target: ../../img/webui-img/review-log.png
:alt: logfile
* If your experiment has many trials, you can change the refresh interval here.
.. image:: ../../img/webui-img/refresh-interval.png
:target: ../../img/webui-img/refresh-interval.png
:alt: refresh
* You can review and download the experiment results (``experiment config``, ``trial message`` and ``intermediate metrics``) by clicking the ``Experiment summary`` button.
.. image:: ../../img/webui-img/summary.png
:target: ../../img/webui-img/summary.png
:alt: summary
* You can change some experiment configurations such as ``maxExecDuration``, ``maxTrialNum`` and ``trial concurrency`` here.
.. image:: ../../img/webui-img/edit-experiment-param.png
:target: ../../img/webui-img/edit-experiment-param.png
:alt: editExperimentParams
* You can click the error icon to see the specific error message, and view the ``nni-manager/dispatcher log files`` by clicking the ``Learn about`` link.
.. image:: ../../img/webui-img/experimentError.png
:target: ../../img/webui-img/experimentError.png
:alt: experimentError
* You can click ``About`` to see the version and report any issues.
View job default metric
-----------------------
* Click the tab ``Default metric`` to see the point graph of all trials. Hover over a point to see its default metric and search space values.
.. image:: ../../img/webui-img/default-metric.png
:target: ../../img/webui-img/default-metric.png
:alt: defaultMetricGraph
* Turn on the switch named ``Optimization curve`` to see the experiment's optimization curve.
.. image:: ../../img/webui-img/best-curve.png
:target: ../../img/webui-img/best-curve.png
:alt: bestCurveGraph
View hyper parameter
--------------------
Click the tab ``Hyper-parameter`` to see the parallel graph.
* You can click the ``add/remove`` button to add or remove axes.
* Drag the axes to swap axes on the chart.
* You can select the percentage to see top trials.
.. image:: ../../img/webui-img/hyperPara.png
:target: ../../img/webui-img/hyperPara.png
:alt: hyperParameterGraph
View Trial Duration
-------------------
Click the tab ``Trial Duration`` to see the bar graph.
.. image:: ../../img/webui-img/trial_duration.png
:target: ../../img/webui-img/trial_duration.png
:alt: trialDurationGraph
View Trial Intermediate Result Graph
------------------------------------
Click the tab ``Intermediate Result`` to see the line graph.
.. image:: ../../img/webui-img/trials_intermeidate.png
:target: ../../img/webui-img/trials_intermeidate.png
:alt: trialIntermediateGraph
A trial may produce many intermediate results during training. To see the trend of certain trials more clearly, we provide a filtering function for the intermediate result graph.
You may find that some trials get better or worse at a particular intermediate result, which indicates that it is an important and relevant point. To take a closer look at it, enter the corresponding X value at ``#Intermediate``, then input the range of metrics at this intermediate result. In the picture below, we choose the No. 4 intermediate result and set the range of metrics to 0.8-1.
.. image:: ../../img/webui-img/filter-intermediate.png
:target: ../../img/webui-img/filter-intermediate.png
:alt: filterIntermediateGraph
View trials status
------------------
Click the tab ``Trials Detail`` to see the status of all trials. Specifically:
* Trial detail: trial's id, trial's duration, start time, end time, status, accuracy, and search space file.
.. image:: ../../img/webui-img/detail-local.png
:target: ../../img/webui-img/detail-local.png
:alt: detailLocalImage
* You can search for a specific trial by its id, status, Trial No., or trial parameters.
1. Trial id:
.. image:: ../../img/webui-img/detail/searchId.png
:target: ../../img/webui-img/detail/searchId.png
:alt: searchTrialId
2. Trial No.:
.. image:: ../../img/webui-img/detail/searchNo.png
:target: ../../img/webui-img/detail/searchNo.png
:alt: searchTrialNo.
3. Trial status:
.. image:: ../../img/webui-img/detail/searchStatus.png
:target: ../../img/webui-img/detail/searchStatus.png
:alt: searchStatus
4. Trial parameters:
(1) parameters whose type is choice:
.. image:: ../../img/webui-img/detail/searchParameterChoice.png
:target: ../../img/webui-img/detail/searchParameterChoice.png
:alt: searchParameterChoice
(2) parameters whose type is not choice:
.. image:: ../../img/webui-img/detail/searchParameterRange.png
:target: ../../img/webui-img/detail/searchParameterRange.png
:alt: searchParameterRange
* The ``Add column`` button lets you select which columns to show in the table. If you run an experiment whose final result is a dict, you can see the other keys in the table. You can choose the column ``Intermediate count`` to watch a trial's progress.
.. image:: ../../img/webui-img/addColumn.png
:target: ../../img/webui-img/addColumn.png
:alt: addColumnGraph
* If you want to compare some trials, you can select them and then click ``Compare`` to see the results.
.. image:: ../../img/webui-img/select-trial.png
:target: ../../img/webui-img/select-trial.png
:alt: selectTrialGraph
.. image:: ../../img/webui-img/compare.png
:target: ../../img/webui-img/compare.png
:alt: compareTrialsGraph
* For ``Tensorboard``, please refer to the `doc <Tensorboard.rst>`__.
* You can use the button named ``Copy as python`` to copy the trial's parameters.
.. image:: ../../img/webui-img/copyParameter.png
:target: ../../img/webui-img/copyParameter.png
:alt: copyTrialParameters
* You can see trial logs on the ``Log`` tab. In local mode there are three buttons: ``View trial log``, ``View trial error`` and ``View trial stdout``. If you run on the OpenPAI or Kubeflow platform, you can see the trial stdout and NFS log.
1. local mode:
.. image:: ../../img/webui-img/detail/log-local.png
:target: ../../img/webui-img/detail/log-local.png
:alt: logOnLocal
2. OpenPAI, Kubeflow and other modes:
.. image:: ../../img/webui-img/detail-pai.png
:target: ../../img/webui-img/detail-pai.png
:alt: detailPai
* Intermediate Result Graph: you can see the default metric in this graph by clicking the intermediate button.
.. image:: ../../img/webui-img/intermediate.png
:target: ../../img/webui-img/intermediate.png
:alt: intermeidateGraph
* Kill: you can kill a job whose status is running.
.. image:: ../../img/webui-img/kill-running.png
:target: ../../img/webui-img/kill-running.png
:alt: killTrial
* Customized trial: you can change a trial's parameters and then submit it to the experiment. If you want to rerun a failed trial, you can submit the same parameters to the experiment.
.. image:: ../../img/webui-img/detail/customizedTrialButton.png
:target: ../../img/webui-img/detail/customizedTrialButton.png
:alt: customizedTrialButton
.. image:: ../../img/webui-img/detail/customizedTrial.png
:target: ../../img/webui-img/detail/customizedTrial.png
:alt: customizedTrial
{
"cells": [
{
"cell_type": "markdown",
"id": "white-electron",
"metadata": {},
"source": [
"## Connect and Manage an Exist Experiment"
]
},
{
"cell_type": "markdown",
"id": "recent-italic",
"metadata": {},
"source": [
"### 1. Connect Experiment"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "statistical-repair",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2021-03-05 12:18:28] Connect to port 8080 success, experiment id is DH8pVfXc, status is RUNNING.\n"
]
}
],
"source": [
"from nni.experiment import Experiment\n",
"experiment = Experiment.connect(8080)"
]
},
{
"cell_type": "markdown",
"id": "defensive-scratch",
"metadata": {},
"source": [
"### 2. Experiment View & Control"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "independent-touch",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'id': 'DH8pVfXc',\n",
" 'revision': 4,\n",
" 'execDuration': 10,\n",
" 'logDir': '/home/ningshang/nni-experiments/DH8pVfXc',\n",
" 'nextSequenceId': 1,\n",
" 'params': {'authorName': 'default',\n",
" 'experimentName': 'example_sklearn-classification',\n",
" 'trialConcurrency': 1,\n",
" 'maxExecDuration': 3600,\n",
" 'maxTrialNum': 100,\n",
" 'searchSpace': '{\"C\": {\"_type\": \"uniform\", \"_value\": [0.1, 1]}, \"kernel\": {\"_type\": \"choice\", \"_value\": [\"linear\", \"rbf\", \"poly\", \"sigmoid\"]}, \"degree\": {\"_type\": \"choice\", \"_value\": [1, 2, 3, 4]}, \"gamma\": {\"_type\": \"uniform\", \"_value\": [0.01, 0.1]}, \"coef0\": {\"_type\": \"uniform\", \"_value\": [0.01, 0.1]}}',\n",
" 'trainingServicePlatform': 'local',\n",
" 'tuner': {'builtinTunerName': 'TPE',\n",
" 'classArgs': {'optimize_mode': 'maximize'},\n",
" 'checkpointDir': '/home/ningshang/nni-experiments/DH8pVfXc/checkpoint'},\n",
" 'versionCheck': True,\n",
" 'clusterMetaData': [{'key': 'trial_config',\n",
" 'value': {'command': 'python3 main.py',\n",
" 'codeDir': '/home/ningshang/nni/examples/trials/sklearn/classification/.',\n",
" 'gpuNum': 0}}]},\n",
" 'startTime': 1614946699989}"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"experiment.get_experiment_profile()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "printable-bookmark",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2021-03-05 12:18:32] (root) Successfully update maxTrialNum.\n"
]
}
],
"source": [
"experiment.update_max_trial_number(200)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "marine-serial",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'id': 'DH8pVfXc',\n",
" 'revision': 5,\n",
" 'execDuration': 14,\n",
" 'logDir': '/home/ningshang/nni-experiments/DH8pVfXc',\n",
" 'nextSequenceId': 1,\n",
" 'params': {'authorName': 'default',\n",
" 'experimentName': 'example_sklearn-classification',\n",
" 'trialConcurrency': 1,\n",
" 'maxExecDuration': 3600,\n",
" 'maxTrialNum': 200,\n",
" 'searchSpace': '{\"C\": {\"_type\": \"uniform\", \"_value\": [0.1, 1]}, \"kernel\": {\"_type\": \"choice\", \"_value\": [\"linear\", \"rbf\", \"poly\", \"sigmoid\"]}, \"degree\": {\"_type\": \"choice\", \"_value\": [1, 2, 3, 4]}, \"gamma\": {\"_type\": \"uniform\", \"_value\": [0.01, 0.1]}, \"coef0\": {\"_type\": \"uniform\", \"_value\": [0.01, 0.1]}}',\n",
" 'trainingServicePlatform': 'local',\n",
" 'tuner': {'builtinTunerName': 'TPE',\n",
" 'classArgs': {'optimize_mode': 'maximize'},\n",
" 'checkpointDir': '/home/ningshang/nni-experiments/DH8pVfXc/checkpoint'},\n",
" 'versionCheck': True,\n",
" 'clusterMetaData': [{'key': 'trial_config',\n",
" 'value': {'command': 'python3 main.py',\n",
" 'codeDir': '/home/ningshang/nni/examples/trials/sklearn/classification/.',\n",
" 'gpuNum': 0}}]},\n",
" 'startTime': 1614946699989}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"experiment.get_experiment_profile()"
]
},
{
"cell_type": "markdown",
"id": "opened-lounge",
"metadata": {},
"source": [
"### 3. Stop Experiment"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "emotional-machinery",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2021-03-05 12:18:36] Stopping experiment, please wait...\n",
"[2021-03-05 12:18:38] Experiment stopped\n"
]
}
],
"source": [
"experiment.stop()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "nni-dev",
"language": "python",
"name": "nni-dev"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "technological-script",
"metadata": {},
"source": [
"## Start and Manage a New Experiment"
]
},
{
"cell_type": "markdown",
"id": "reported-somerset",
"metadata": {},
"source": [
"### 1. Configure Search Space"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "potential-williams",
"metadata": {},
"outputs": [],
"source": [
"search_space = {\n",
" \"C\": {\"_type\":\"quniform\",\"_value\":[0.1, 1, 0.1]},\n",
" \"kernel\": {\"_type\":\"choice\",\"_value\":[\"linear\", \"rbf\", \"poly\", \"sigmoid\"]},\n",
" \"degree\": {\"_type\":\"choice\",\"_value\":[1, 2, 3, 4]},\n",
" \"gamma\": {\"_type\":\"quniform\",\"_value\":[0.01, 0.1, 0.01]},\n",
" \"coef0\": {\"_type\":\"quniform\",\"_value\":[0.01, 0.1, 0.01]}\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "greek-archive",
"metadata": {},
"source": [
"### 2. Configure Experiment "
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "fiscal-expansion",
"metadata": {},
"outputs": [],
"source": [
"from nni.experiment import Experiment\n",
"experiment = Experiment('local')\n",
"experiment.config.experiment_name = 'Example'\n",
"experiment.config.trial_concurrency = 2\n",
"experiment.config.max_trial_number = 10\n",
"experiment.config.search_space = search_space\n",
"experiment.config.trial_command = 'python3 main.py'\n",
"experiment.config.trial_code_directory = './'\n",
"experiment.config.tuner.name = 'TPE'\n",
"experiment.config.tuner.class_args['optimize_mode'] = 'maximize'\n",
"experiment.config.training_service.use_active_gpu = True"
]
},
{
"cell_type": "markdown",
"id": "received-tattoo",
"metadata": {},
"source": [
"### 3. Start Experiment"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "pleasant-patent",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2021-03-05 12:12:19] Creating experiment, Experiment ID: wdt0le3v\n",
"[2021-03-05 12:12:19] Starting web server...\n",
"[2021-03-05 12:12:20] Setting up...\n",
"[2021-03-05 12:12:20] Web UI URLs: http://127.0.0.1:8080 http://10.0.1.5:8080 http://172.17.0.1:8080\n"
]
}
],
"source": [
"experiment.start(8080)"
]
},
{
"cell_type": "markdown",
"id": "miniature-prison",
"metadata": {},
"source": [
"### 4. Experiment View & Control"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "animated-english",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'RUNNING'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"experiment.get_status()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "alpha-ottawa",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[TrialResult(parameter={'C': 0.30000000000000004, 'kernel': 'linear', 'degree': 3, 'gamma': 0.03, 'coef0': 0.07}, value=0.9888888888888889, trialJobId='VLqU9'),\n",
" TrialResult(parameter={'C': 0.5, 'kernel': 'sigmoid', 'degree': 1, 'gamma': 0.03, 'coef0': 0.07}, value=0.8888888888888888, trialJobId='DLo6r')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"experiment.export_data()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "unique-rendering",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'DLo6r': [TrialMetricData(timestamp=1614946351592, trialJobId='DLo6r', parameterId='1', type='FINAL', sequence=0, data=0.8888888888888888)],\n",
" 'VLqU9': [TrialMetricData(timestamp=1614946351607, trialJobId='VLqU9', parameterId='0', type='FINAL', sequence=0, data=0.9888888888888889)]}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"experiment.get_job_metrics()"
]
},
{
"cell_type": "markdown",
"id": "welsh-difference",
"metadata": {},
"source": [
"### 5. Stop Experiment"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "technological-cleanup",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2021-03-05 12:12:40] Stopping experiment, please wait...\n",
"[2021-03-05 12:12:42] Experiment stopped\n"
]
}
],
"source": [
"experiment.stop()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "nni-dev",
"language": "python",
"name": "nni-dev"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{% extends "!layout.html" %}
{% set title = "Welcome To Neural Network Intelligence !!!"%}
{% block document %}
<div class="rowHeight">
<div class="chinese"><a href="https://nni.readthedocs.io/zh/stable/">简体中文</a></div>
<b>NNI (Neural Network Intelligence)</b> is a lightweight but powerful toolkit to
help users <b>automate</b>
<a href="{{ pathto('FeatureEngineering/Overview') }}">Feature Engineering</a>,
<a href="{{ pathto('NAS/Overview') }}">Neural Architecture Search</a>,
<a href="{{ pathto('Tuner/BuiltinTuner') }}">Hyperparameter Tuning</a> and
<a href="{{ pathto('Compression/Overview') }}">Model Compression</a>.
</div>
<p class="gap rowHeight">
The tool manages automated machine learning (AutoML) experiments,
<b>dispatches and runs</b>
experiments' trial jobs generated by tuning algorithms to search the best neural
architecture and/or hyper-parameters in
<b>different training environments</b> like
<a href="{{ pathto('TrainingService/LocalMode') }}">Local Machine</a>,
<a href="{{ pathto('TrainingService/RemoteMachineMode') }}">Remote Servers</a>,
<a href="{{ pathto('TrainingService/PaiMode') }}">OpenPAI</a>,
<a href="{{ pathto('TrainingService/KubeflowMode') }}">Kubeflow</a>,
<a href="{{ pathto('TrainingService/FrameworkControllerMode') }}">FrameworkController on K8S (AKS etc.)</a>,
<a href="{{ pathto('TrainingService/DLTSMode') }}">DLWorkspace (aka. DLTS)</a>,
<a href="{{ pathto('TrainingService/AMLMode') }}">AML (Azure Machine Learning)</a>,
<a href="{{ pathto('TrainingService/AdaptDLMode') }}">AdaptDL (aka. ADL)</a>, other cloud options and even <a href="{{ pathto('TrainingService/HybridMode') }}">Hybrid mode</a>.
</p>
<!-- Who should consider using NNI -->
<div>
<h2 class="title">Who should consider using NNI</h2>
<ul>
<li>Those who want to <b>try different AutoML algorithms</b> in their training code/model.</li>
<li>Those who want to run AutoML trial jobs <b>in different environments</b> to speed up search.</li>
<li class="rowHeight">Researchers and data scientists who want to easily <b>implement and experiment with new AutoML
algorithms</b>, be it a hyperparameter tuning algorithm,
a neural architecture search algorithm or a model compression algorithm.
</li>
<li>ML Platform owners who want to <b>support AutoML in their platform</b></li>
</ul>
</div>
<!-- what's new -->
<div>
<div class="inline gap">
<h2>What's NEW! </h2>
<img width="48" src="_static/img/release_icon.png">
</div>
<hr class="whatNew"/>
<ul>
<li><b>New release:</b> <a href='https://github.com/microsoft/nni/releases/tag/v2.6'>{{ release }} is available <i>- released on Jan-18-2022</i></a></li>
<li><b>New demo available:</b> <a href="https://www.youtube.com/channel/UCKcafm6861B2mnYhPbZHavw">Youtube entry</a> | <a href="https://space.bilibili.com/1649051673">Bilibili</a> entry <i>- last updated on May-26-2021</i></li>
<li><b>New webinar:</b> <a href="https://note.microsoft.com/MSR-Webinar-Retiarii-Registration-On-Demand.html">
Introducing Retiarii: A deep learning exploratory-training framework on NNI
</a> <i>- scheduled on June-24-2021</i>
</li>
<li><b>New community channel:</b> <a href="https://github.com/microsoft/nni/discussions">Discussions</a></li>
<li>
<div><b>New emoticons release:</b> <a href="{{ pathto('nnSpider') }}">nnSpider</a></div>
<img class="gap" src="_static/img/home.svg"></img>
</li>
</ul>
</div>
<!-- NNI capabilities in a glance -->
<div class="gap">
<h2 class="title">NNI capabilities in a glance</h2>
<p class="rowHeight">
NNI provides a command-line tool as well as a user-friendly WebUI to manage training experiments.
With the extensible API, you can customize your own AutoML algorithms and training services.
To make it easy for new users, NNI also provides a set of built-in state-of-the-art
AutoML algorithms and out-of-the-box support for popular training platforms.
</p>
<p class="rowHeight">
In the following table, we summarize the current NNI capabilities.
We are gradually adding new capabilities and we'd love to have your contribution.
</p>
</div>
<p align="center">
<a href="#overview"><img src="_static/img/overview.svg" /></a>
</p>
<table class="list">
<tbody>
<tr align="center" valign="bottom" class="column">
<td></td>
<td class="framework">
<b>Frameworks & Libraries</b>
</td>
<td>
<b>Algorithms</b>
</td>
<td>
<b>Training Services</b>
</td>
</tr>
<tr>
<td class="verticalMiddle"><b>Built-in</b></td>
<td>
<ul class="firstUl">
<li><b>Supported Frameworks</b></li>
<ul class="circle">
<li>PyTorch</li>
<li>Keras</li>
<li>TensorFlow</li>
<li>MXNet</li>
<li>Caffe2</li>
<a href="{{ pathto('SupportedFramework_Library') }}">More...</a><br />
</ul>
</ul>
<ul class="firstUl">
<li><b>Supported Libraries</b></li>
<ul class="circle">
<li>Scikit-learn</li>
<li>XGBoost</li>
<li>LightGBM</li>
<a href="{{ pathto('SupportedFramework_Library') }}">More...</a><br />
</ul>
</ul>
<ul class="firstUl">
<li><b>Examples</b></li>
<ul class="circle">
<li><a href="https://github.com/microsoft/nni/tree/master/examples/trials/mnist-pytorch">MNIST-pytorch</a></li>
<li><a href="https://github.com/microsoft/nni/tree/master/examples/trials/mnist-tfv2">MNIST-tensorflow</a></li>
<li><a href="https://github.com/microsoft/nni/tree/master/examples/trials/mnist-keras">MNIST-keras</a></li>
<li><a href="{{ pathto('TrialExample/GbdtExample') }}">Auto-gbdt</a></li>
<li><a href="{{ pathto('TrialExample/Cifar10Examples') }}">Cifar10-pytorch</a></li>
<li><a href="{{ pathto('TrialExample/SklearnExamples') }}">Scikit-learn</a></li>
<li><a href="{{ pathto('TrialExample/EfficientNet') }}">EfficientNet</a></li>
<li><a href="{{ pathto('TrialExample/OpEvoExamples') }}">Kernel Tuning</a></li>
<a href="{{ pathto('SupportedFramework_Library') }}">More...</a><br />
</ul>
</ul>
</td>
<td align="left">
<a href="{{ pathto('Tuner/BuiltinTuner') }}">Hyperparameter Tuning</a>
<ul class="firstUl">
<div><b>Exhaustive search</b></div>
<ul class="circle">
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#Random">Random Search</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#GridSearch">Grid Search</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#Batch">Batch</a></li>
</ul>
<div><b>Heuristic search</b></div>
<ul class="circle">
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#Evolution">Naïve Evolution</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#Anneal">Anneal</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#Hyperband">Hyperband</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#PBTTuner">PBT</a></li>
</ul>
<div><b>Bayesian optimization</b></div>
<ul class="circle">
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#BOHB">BOHB</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#TPE">TPE</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#SMAC">SMAC</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#MetisTuner">Metis Tuner</a></li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#GPTuner">GP Tuner</a> </li>
<li><a href="{{ pathto('Tuner/BuiltinTuner') }}#DNGOTuner">DNGO Tuner</a></li>
</ul>
</ul>
<a href="{{ pathto('NAS/Overview') }}">Neural Architecture Search (Retiarii)</a>
<ul class="firstUl">
<ul class="circle">
<li><a href="{{ pathto('NAS/ENAS') }}">ENAS</a></li>
<li><a href="{{ pathto('NAS/DARTS') }}">DARTS</a></li>
<li><a href="{{ pathto('NAS/SPOS') }}">SPOS</a></li>
<li><a href="{{ pathto('NAS/Proxylessnas') }}">ProxylessNAS</a></li>
<li><a href="{{ pathto('NAS/FBNet') }}">FBNet</a></li>
<li><a href="{{ pathto('NAS/ExplorationStrategies') }}">Reinforcement Learning</a></li>
<li><a href="{{ pathto('NAS/ExplorationStrategies') }}">Regularized Evolution</a></li>
<li><a href="{{ pathto('NAS/Overview') }}">More...</a></li>
</ul>
</ul>
<a href="{{ pathto('Compression/Overview') }}">Model Compression</a>
<ul class="firstUl">
<div><b>Pruning</b></div>
<ul class="circle">
<li><a href="{{ pathto('Compression/Pruner') }}#agp-pruner">AGP Pruner</a></li>
<li><a href="{{ pathto('Compression/Pruner') }}#slim-pruner">Slim Pruner</a></li>
<li><a href="{{ pathto('Compression/Pruner') }}#fpgm-pruner">FPGM Pruner</a></li>
<li><a href="{{ pathto('Compression/Pruner') }}#netadapt-pruner">NetAdapt Pruner</a></li>
<li><a href="{{ pathto('Compression/Pruner') }}#simulatedannealing-pruner">SimulatedAnnealing Pruner</a></li>
<li><a href="{{ pathto('Compression/Pruner') }}#admm-pruner">ADMM Pruner</a></li>
<li><a href="{{ pathto('Compression/Pruner') }}#autocompress-pruner">AutoCompress Pruner</a></li>
<li><a href="{{ pathto('Compression/Overview') }}">More...</a></li>
</ul>
<div><b>Quantization</b></div>
<ul class="circle">
<li><a href="{{ pathto('Compression/Quantizer') }}#qat-quantize">QAT Quantizer</a></li>
<li><a href="{{ pathto('Compression/Quantizer') }}#dorefa-quantizer">DoReFa Quantizer</a></li>
<li><a href="{{ pathto('Compression/Quantizer') }}#bnn-quantizer">BNN Quantizer</a></li>
</ul>
</ul>
<a href="{{ pathto('FeatureEngineering/Overview') }}">Feature Engineering (Beta)</a>
<ul class="circle">
<li><a href="{{ pathto('FeatureEngineering/GradientFeatureSelector') }}">GradientFeatureSelector</a></li>
<li><a href="{{ pathto('FeatureEngineering/GBDTSelector') }}">GBDTSelector</a></li>
</ul>
<a href="{{ pathto('Assessor/BuiltinAssessor') }}">Early Stop Algorithms</a>
<ul class="circle">
<li><a href="{{ pathto('Assessor/BuiltinAssessor') }}#MedianStop">Median Stop</a></li>
<li><a href="{{ pathto('Assessor/BuiltinAssessor') }}#Curvefitting">Curve Fitting</a></li>
</ul>
</td>
<td>
<ul class="firstUl">
<li><a href="{{ pathto('TrainingService/LocalMode') }}">Local Machine</a></li>
<li><a href="{{ pathto('TrainingService/RemoteMachineMode') }}">Remote Servers</a></li>
<li><a href="{{ pathto('TrainingService/HybridMode') }}">Hybrid mode</a></li>
<li><a href="{{ pathto('TrainingService/AMLMode') }}">AML(Azure Machine Learning)</a></li>
<li><b>Kubernetes based services</b></li>
<ul>
<li><a href="{{ pathto('TrainingService/PaiMode') }}">OpenPAI</a></li>
<li><a href="{{ pathto('TrainingService/KubeflowMode') }}">Kubeflow</a></li>
<li><a href="{{ pathto('TrainingService/FrameworkControllerMode') }}">FrameworkController on K8S (AKS etc.)</a></li>
<li><a href="{{ pathto('TrainingService/DLTSMode') }}">DLWorkspace (aka. DLTS)</a></li>
<li><a href="{{ pathto('TrainingService/AdaptDLMode') }}">AdaptDL (aka. ADL)</a></li>
</ul>
</ul>
</td>
</tr>
<tr valign="top">
<td class="verticalMiddle"><b>References</b></td>
<td>
<ul class="firstUl">
<li><a href="{{ pathto('Tutorial/HowToLaunchFromPython') }}">Python API</a></li>
<li><a href="{{ pathto('Tutorial/AnnotationSpec') }}">NNI Annotation</a></li>
<li><a href="{{ pathto('installation') }}">Supported OS</a></li>
</ul>
</td>
<td>
<ul class="firstUl">
<li><a href="{{ pathto('Tuner/CustomizeTuner') }}">CustomizeTuner</a></li>
<li><a href="{{ pathto('Assessor/CustomizeAssessor') }}">CustomizeAssessor</a></li>
<li><a href="{{ pathto('Tutorial/InstallCustomizedAlgos') }}">Install Customized Algorithms as Builtin Tuners/Assessors/Advisors</a></li>
<li><a href="{{ pathto('NAS/QuickStart') }}">Define NAS Model Space</a></li>
<li><a href="{{ pathto('NAS/ApiReference') }}">NAS/Retiarii APIs</a></li>
</ul>
</td>
<td>
<ul class="firstUl">
<li><a href="{{ pathto('TrainingService/Overview') }}">Support TrainingService</a></li>
<li><a href="{{ pathto('TrainingService/HowToImplementTrainingService') }}">Implement TrainingService</a></li>
</ul>
</td>
</tr>
</tbody>
</table>
<!-- Installation -->
<div class="gap">
<h2 class="title">Installation</h2>
<div>
<h3 class="second-title">Install</h3>
<div class="gap2">
NNI supports and is tested on Ubuntu >= 16.04, macOS >= 10.14.1,
and Windows 10 >= 1809. Simply run the following <code>pip install</code>
in an environment that has <code>python 64-bit >= 3.6</code>.
</div>
<div class="command-intro">Linux or macOS</div>
<div class="command">python3 -m pip install --upgrade nni</div>
<div class="command-intro">Windows</div>
<div class="command">python -m pip install --upgrade nni</div>
<div class="command-intro">If you want to try latest code, please <a href="{{ pathto('installation') }}">install
NNI</a> from source code.
</div>
<div class="chinese">For detail system requirements of NNI, please refer to <a href="{{ pathto('Tutorial/InstallationLinux') }}">here</a>
for Linux & macOS, and <a href="{{ pathto('Tutorial/InstallationWin') }}">here</a> for Windows.</div>
</div>
<div>
<p>Note:</p>
<ul>
<li>If there is any privilege issue, add <code>--user</code> to install NNI in the user directory.</li>
<li class="rowHeight">Currently NNI on Windows supports local, remote and pai mode. Anaconda or Miniconda is highly
recommended to install <a href="{{ pathto('Tutorial/InstallationWin') }}">NNI on Windows</a>.</li>
<li>If there is any error like Segmentation fault, please refer to <a
href="{{ pathto('installation') }}">FAQ</a>. For FAQ on Windows, please refer
to <a href="{{ pathto('Tutorial/InstallationWin') }}">NNI on Windows</a>.</li>
</ul>
</div>
<div>
<h3 class="second-title gap">Verify installation</h3>
<div>
The following example is built on PyTorch. Make sure <b>PyTorch is installed</b> when running
it.
</div>
<ul>
<li>
<div class="command-intro">Download the examples via clone the source code.</div>
<div class="command">git clone -b {{ release }} https://github.com/Microsoft/nni.git</div>
</li>
<li>
<div>Run the MNIST example.</div>
<div class="command-intro">Linux or macOS</div>
<div class="command">nnictl create --config nni/examples/trials/mnist-pytorch/config.yml</div>
<div class="command-intro">Windows</div>
<div class="command">nnictl create --config nni\examples\trials\mnist-pytorch\config_windows.yml</div>
</li>
<li>
<div class="rowHeight">
Wait for the message <code>INFO: Successfully started experiment!</code> in the command line.
This message indicates that your experiment has been successfully started.
You can explore the experiment using the Web UI url.
</div>
<!-- Indentation affects style! -->
<pre class="code">
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
INFO: Successfully set local config!
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: http://223.255.255.1:8080 http://127.0.0.1:8080
-----------------------------------------------------------------------
You can use these commands to get more information about the experiment
-----------------------------------------------------------------------
commands description
1. nnictl experiment show show the information of experiments
2. nnictl trial ls list all of trial jobs
3. nnictl top monitor the status of running experiments
4. nnictl log stderr show stderr log content
5. nnictl log stdout show stdout log content
6. nnictl stop stop an experiment
7. nnictl trial kill kill a trial job by id
8. nnictl --help get help information about nnictl
-----------------------------------------------------------------------
</pre>
</li>
<li class="rowHeight">
                Open the Web UI URL in your browser to view detailed information about the experiment and
                all the submitted trial jobs, as shown below. <a href="{{ pathto('Tutorial/WebUI') }}">Here</a> are more Web UI
                pages.
                <img class="gap" src="_static/img/webui.gif" width="100%"/>
            </li>
</ul>
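<div class="rowHeight">
            When you are finished exploring, you can stop the experiment from the same command line. A minimal sketch
            using the <b>nnictl</b> commands listed above (assumes the experiment you just created is still running):
        </div>
        <!-- Indentation affects style! -->
        <pre class="code">
nnictl experiment show    # check the status of the running experiment
nnictl stop               # stop the experiment
</pre>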
</div>
<!-- Releases and Contributing -->
<div class="gap">
<h2 class="title">Releases and Contributing</h2>
<div>NNI has a monthly release cycle (major releases). Please let us know if you encounter a bug by filing an issue.</div>
<br/>
<div>We appreciate all contributions. If you are planning to contribute any bug fixes, please do so without further discussion.</div>
<br/>
<div class="rowHeight">If you plan to contribute new features, new tuners, new training services, etc., please first open an issue or reuse an existing issue, and discuss the feature with us. We will respond on the issue in a timely manner or set up conference calls if needed.</div>
<br/>
<div>To learn more about making a contribution to NNI, please refer to our <a href="{{ pathto('contribution') }}">How-to contribution page</a>.</div>
<br/>
<div>We appreciate all contributions and thank all the contributors!</div>
<img class="gap" src="_static/img/contributors.png"/>
</div>
<!-- feedback -->
<div class="gap">
<h2 class="title">Feedback</h2>
<ul>
<li><a href="https://github.com/microsoft/nni/issues/new/choose">File an issue</a> on GitHub.</li>
<li>Open or participate in a <a href="https://github.com/microsoft/nni/discussions">discussion</a>.</li>
<li>Discuss on the <a href="https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge">NNI Gitter</a>.</li>
</ul>
<div>
<div class="rowHeight">Join IM discussion groups:</div>
<table class="gap" border=1 style="border-collapse: collapse;">
<tbody>
<tr style="line-height: 30px;">
<th>Gitter</th>
<td></td>
<th>WeChat</th>
</tr>
<tr>
<td class="QR">
<img src="https://user-images.githubusercontent.com/39592018/80665738-e0574a80-8acc-11ea-91bc-0836dc4cbf89.png" alt="Gitter" />
</td>
<td width="80" align="center" class="or">OR</td>
<td class="QR">
<img src="https://github.com/scarlett2018/nniutil/raw/master/wechat.png" alt="NNI Wechat" />
</td>
</tr>
</tbody>
</table>
</div>
</div>
<!-- Test status -->
<div class="gap">
<h2 class="title">Test status</h2>
<h3>Essentials</h3>
<table class="pipeline">
<tr>
<th>Type</th>
<th>Status</th>
</tr>
<tr>
<td>Fast test</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=54&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/fast%20test?branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>Full linux</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=62&repoName=microsoft%2Fnni&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/full%20test%20-%20linux?repoName=microsoft%2Fnni&branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>Full windows</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=63&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/full%20test%20-%20windows?branchName=master"/>
</a>
</td>
</tr>
</table>
<h3 class="gap">Training services</h3>
<table class="pipeline">
<tr>
<th>Type</th>
<th>Status</th>
</tr>
<tr>
<td>Remote - linux to linux</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=64&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20remote%20-%20linux%20to%20linux?branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>Remote - linux to windows</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=67&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20remote%20-%20linux%20to%20windows?branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>Remote - windows to linux</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=68&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20remote%20-%20windows%20to%20linux?branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>OpenPAI</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=65&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20openpai%20-%20linux?branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>Frameworkcontroller</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=70&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20frameworkcontroller?branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>Kubeflow</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=69&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20kubeflow?branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>Hybrid</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=79&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20hybrid?branchName=master"/>
</a>
</td>
</tr>
<tr>
<td>AzureML</td>
<td>
<a href="https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=78&branchName=master">
<img src="https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20aml?branchName=master"/>
</a>
</td>
</tr>
</table>
</div>
<!-- Related Projects -->
<div class="gap">
<h2 class="title">Related Projects</h2>
<p class="rowHeight">
        Targeting openness and advancing state-of-the-art technology,
        <a href="https://www.microsoft.com/en-us/research/group/systems-and-networking-research-group-asia/">Microsoft Research (MSR)</a>
        has also released a few
        other open source projects.</p>
<ul id="relatedProject">
<li class="rowHeight">
            <a href="https://github.com/Microsoft/pai">OpenPAI</a> : an open source platform that provides complete AI model
            training and resource management
            capabilities; it is easy to extend and supports on-premise,
            cloud, and hybrid environments at various scales.
        </li>
<li class="rowHeight">
            <a href="https://github.com/Microsoft/frameworkcontroller">FrameworkController</a> : an open source
            general-purpose Kubernetes Pod Controller that orchestrates
            all kinds of applications on Kubernetes with a single controller.
        </li>
<li class="rowHeight">
<a href="https://github.com/Microsoft/MMdnn">MMdnn</a> : A comprehensive, cross-framework solution to convert,
visualize and diagnose deep neural network
models. The "MM" in MMdnn stands for model management
and "dnn" is an acronym for deep neural network.
</li>
<li class="rowHeight">
            <a href="https://github.com/Microsoft/SPTAG">SPTAG</a> : Space Partition Tree And Graph (SPTAG) is an open
            source library
            for large-scale vector approximate nearest neighbor search.
        </li>
<li class="rowHeight">
            <a href="https://github.com/microsoft/nn-Meter">nn-Meter</a> : An accurate inference latency predictor for DNN models on diverse edge devices.
        </li>
</ul>
<p>We encourage researchers and students to leverage these projects to accelerate AI development and research.</p>
</div>
<!-- License -->
<div>
<h2 class="title">License</h2>
<p>The entire codebase is under the <a href="https://github.com/microsoft/nni/blob/master/LICENSE">MIT license</a>.</p>
</div>
</div>
{% endblock %}
{% extends "!layout.html" %}
{% block sidebartitle %}
{% if logo and theme_logo_only %}
<a href="{{ pathto('index') }}">
{% else %}
<a href="{{ pathto('index') }}" class="icon icon-home"> {{ project }}
{% endif %}
{% if logo %}
{# Not strictly valid HTML, but it's the only way to display/scale it properly, without weird scripting or heaps of work #}
<img src="{{ pathto('_static/' + logo, 1) }}" class="logo" />
{% endif %}
</a>
{% if theme_display_version %}
{%- set nav_version = version %}
{% if READTHEDOCS and current_version %}
{%- set nav_version = current_version %}
{% endif %}
{% if nav_version %}
<div class="version">
{{ nav_version }}
</div>
{% endif %}
{% endif %}
{% include "searchbox.html" %}
{% endblock %}
{% extends "!layout.html" %}
{% set title = "Welcome To Neural Network Intelligence !!!"%}
{% block document %}
<h2 class="center">nnSpider emoticons</h2>
<ul class="emotion">
<li class="first">
<div>
<a href="{{ pathto('nnSpider/nobug') }}">
<img src="_static/img/NoBug.png" alt="NoBug" />
</a>
</div>
<p class="center">NoBug</p>
</li>
<li class="first">
<div>
<a href="{{ pathto('nnSpider/holiday') }}">
<img src="_static/img/Holiday.png" alt="Holiday" />
</a>
</div>
<p class="center">Holiday</p>
</li>
<li class="first">
<div>
<a href="{{ pathto('nnSpider/errorEmotion') }}">
<img src="_static/img/Error.png" alt="Error" />
</a>
</div>
<p class="center">Error</p>
</li>
<li class="second">
<div>
<a href="{{ pathto('nnSpider/working') }}">
<img class="working" src="_static/img/Working.png" alt="Working" />
</a>
</div>
<p class="center">Working</p>
</li>
<li class="second">
<div>
<a href="{{ pathto('nnSpider/sign') }}">
<img class="sign" src="_static/img/Sign.png" alt="Sign" />
</a>
</div>
<p class="center">Sign</p>
</li>
<li class="second">
<div>
<a href="{{ pathto('nnSpider/crying') }}">
<img class="crying" src="_static/img/Crying.png" alt="Crying" />
</a>
</div>
<p class="center">Crying</p>
</li>
<li class="three">
<div>
<a href="{{ pathto('nnSpider/cut') }}">
<img src="_static/img/Cut.png" alt="Cut" />
</a>
</div>
<p class="center">Cut</p>
</li>
<li class="three">
<div>
<a href="{{ pathto('nnSpider/weaving') }}">
<img class="weaving" src="_static/img/Weaving.png" alt="Weaving" />
</a>
</div>
<p class="center">Weaving</p>
</li>
<li class="three">
<div class="comfort">
<a href="{{ pathto('nnSpider/comfort') }}">
<img src="_static/img/Comfort.png" alt="Comfort" />
</a>
</div>
<p class="center">Comfort</p>
</li>
<li class="four">
<div>
<a href="{{ pathto('nnSpider/sweat') }}">
<img src="_static/img/Sweat.png" alt="Sweat" />
</a>
</div>
<p class="center">Sweat</p>
</li>
<div class="clear"></div>
</ul>
{% endblock %}
{% extends "!layout.html" %}
{% set title = "Welcome To Neural Network Intelligence !!!"%}
{% block document %}
<h2>Comfort</h2>
<div class="details-container">
<img src="../_static/img/Comfort.png" alt="Comfort" />
</div>
{% endblock %}