There are three parts of NNI that may produce logs: nnimanager, dispatcher, and trial. We introduce them briefly here; for more information, please refer to `Overview <../Overview.rst>`__.
* **NNI controller**\ : NNI controller (nnictl) is the nni command-line tool that is used to manage experiments (e.g., start an experiment).
* **nnimanager**\ : nnimanager is the core of NNI, whose log is important when the whole experiment fails (e.g., no webUI or training service fails)
* **Dispatcher**\ : Dispatcher calls the methods of **Tuner** and **Assessor**. Logs of dispatcher are related to the tuner or assessor code.
* **Tuner**\ : Tuner is an AutoML algorithm, which generates a new configuration for the next try. A new trial will run with this configuration.
* **Assessor**\ : Assessor analyzes trial's intermediate results (e.g., periodically evaluated accuracy on test dataset) to tell whether this trial can be early stopped or not.
* **Trial**\ : Trial code is the code you write to run your experiment, which is an individual attempt at applying a new configuration (e.g., a set of hyperparameter values, a specific neural architecture).
Where Are the Logs
------------------

There are three kinds of logs in NNI. When creating a new experiment, you can set the log level to debug by adding ``--debug``. Besides, you can set a more detailed log level in your configuration file with the ``logLevel`` field.
All possible errors that happen when launching an NNI experiment can be found here.
You can use ``nnictl log stderr`` to find error information. For more options, please refer to `NNICTL <Nnictl.rst>`__.
Experiment Root Directory
^^^^^^^^^^^^^^^^^^^^^^^^^
Every experiment has a root directory, which is shown in the top-right corner of the webUI. If the webUI fails, you can construct the path yourself by substituting your actual experiment ID into ``~/nni-experiments/experiment_id/``. The ``experiment_id`` is shown when you run ``nnictl create ...`` to create a new experiment.

For flexibility, we also offer a ``logDir`` option in your configuration, which specifies the directory to store all experiments (defaults to ``~/nni-experiments``\ ). Please refer to `Configuration <ExperimentConfig.rst>`__ for more details.
Under that directory, there is another directory named ``log``\ , where ``nnimanager.log`` and ``dispatcher.log`` are placed.
Trial Root Directory
^^^^^^^^^^^^^^^^^^^^
Usually in the webUI, you can click ``+`` to the left of every trial to expand it and see that trial's log path.
Besides, there is another directory under experiment root directory, named ``trials``\ , which stores all the trials.
Every trial has a unique ID as its directory name. In this directory, a file named ``stderr`` records the trial's error output and another named ``trial.log`` records the trial's log.
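For example, assuming the default ``logDir`` and illustrative IDs, you can inspect a trial's logs from the command line like this:

.. code-block:: bash

   # <experiment_id> and <trial_id> are placeholders; find the real values in the webUI
   tail -f ~/nni-experiments/<experiment_id>/trials/<trial_id>/trial.log
   cat ~/nni-experiments/<experiment_id>/trials/<trial_id>/stderr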
Different kinds of errors
-------------------------
There are different kinds of errors, but they can be divided into three categories based on their severity, so when NNI fails, check each part in turn.
Generally, if the webUI starts successfully, the ``Status`` field on the ``Overview`` tab serves as an indicator of what kind of error happened. Otherwise, you should check manually.
**NNI** Fails
^^^^^^^^^^^^^^^^^
This is the most serious error. When this happens, the whole experiment fails and no trial will be run. Usually this might be related to some installation problem.
When this happens, you should check ``nnictl``\ 's error output file ``stderr`` (i.e., ``nnictl log stderr``\ ) and then ``nnimanager``\ 's log to see whether there is any error.
**Dispatcher** Fails
^^^^^^^^^^^^^^^^^^^^^^^^
When the dispatcher fails, which for new users of NNI usually means the tuner fails, check the dispatcher's log to see what happened. For built-in tuners, common errors include an invalid search space (an unsupported search space type) or an inconsistency between the initialization args in the configuration file and the actual args of the tuner's ``__init__`` function.
Take the latter situation as an example. If you write a customized tuner whose ``__init__`` function has an argument called ``optimize_mode``\ , which you do not provide in your configuration file, NNI will fail to run your tuner, so the experiment fails. You can see errors in the webUI like:
.. image:: ../../img/dispatcher_error.jpg
:target: ../../img/dispatcher_error.jpg
:alt:
Here we can see it is a dispatcher error, so we check the dispatcher's log, which might look like:
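A hypothetical excerpt for the ``optimize_mode`` scenario described above (the exact wording depends on your tuner):

.. code-block:: text

   TypeError: __init__() missing 1 required positional argument: 'optimize_mode'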
**Trial** Fails
^^^^^^^^^^^^^^^

In this situation, NNI can still run and create new trials.

It means your trial code (which is run by NNI) fails. This kind of error is strongly related to your trial code. Please check the trial's log to fix any errors shown there.

A common example is running the mnist example without installing TensorFlow. An ``ImportError`` is raised (TensorFlow is imported in the trial code but not installed), and thus every trial fails.
.. image:: ../../img/trial_error.jpg
:target: ../../img/trial_error.jpg
:alt:
As it shows, every trial has a log path, where you can find the trial's log and ``stderr``.

In addition to experiment-level debugging, NNI also provides the capability to debug a single trial without starting the entire experiment. Refer to `standalone mode <../TrialExample/Trials.rst#standalone-mode-for-debugging>`__ for more information about debugging a single trial.
Since ``v2.0``, NNI provides a new way to launch experiments. Previously, you had to configure the experiment in a YAML configuration file and then use the ``nnictl`` command to launch it. Now, you can also configure and run experiments directly in a Python file. If you are familiar with Python programming, this will undoubtedly bring you more convenience.
Run a New Experiment
--------------------
After successfully installing ``nni`` and preparing the `trial code <../TrialExample/Trials.rst>`__, you can start the experiment with a Python script in the following 2 steps.

Step 1 - Initialize an experiment instance and configure it
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Use the form like ``experiment.config.foo = 'bar'`` to configure your experiment.
See all `builtin tuners <../builtin_tuner.rst>`__ supported in NNI.
See `configuration reference <../reference/experiment_config.rst>`__ for more detailed usage of these fields.
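Putting it together, a minimal sketch might look like this (the field names follow the configuration reference above; ``mnist.py`` and the search space are placeholders for your own trial):

.. code-block:: python

   from nni.experiment import Experiment

   search_space = {
       'lr': {'_type': 'loguniform', '_value': [0.0001, 0.1]},
   }

   experiment = Experiment('local')   # run trials on the local machine
   experiment.config.experiment_name = 'MNIST example'
   experiment.config.trial_command = 'python3 mnist.py'
   experiment.config.trial_code_directory = '.'
   experiment.config.search_space = search_space
   experiment.config.tuner.name = 'TPE'
   experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
   experiment.config.max_trial_number = 10
   experiment.config.trial_concurrency = 2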
Step 2 - Just run
^^^^^^^^^^^^^^^^^
.. code-block:: python
experiment.run(port=8080)
Now you have successfully launched an NNI experiment, and you can type ``localhost:8080`` into your browser to observe the experiment in real time.
In this way, the experiment runs in the foreground and exits automatically when it finishes.
.. Note:: If you want to run an experiment in an interactive way, use ``start()`` in Step 2. If you launch the experiment in Python script, please use ``run()``, as ``start()`` is designed for the interactive scenarios.
Example
^^^^^^^
Below is an example for this new launching approach. You can find this code in :githublink:`mnist-tfv2/launch.py <examples/trials/mnist-tfv2/launch.py>`.
NNI migrates the APIs of ``NNI Client`` to this new launching approach. Launch the experiment with ``start()`` instead of ``run()``\ , and then you can use these APIs in interactive mode.
Please refer to `example usage <./python_api_start.rst>`__ and code file :githublink:`python_api_start.ipynb <examples/trials/sklearn/classification/python_api_start.ipynb>`.
.. Note:: ``run()`` polls the experiment status and automatically calls ``stop()`` when the experiment finishes. ``start()`` just launches a new experiment, so you need to stop it manually by calling ``stop()``.
Connect and Manage an Existing Experiment
------------------------------------------
If you launch an experiment by ``nnictl`` and also want to use these APIs, you can use ``Experiment.connect()`` to connect to an existing experiment.
Please refer to `example usage <./python_api_connect.rst>`__ and code file :githublink:`python_api_connect.ipynb <examples/trials/sklearn/classification/python_api_connect.ipynb>`.
.. Note:: You can use ``stop()`` to stop the experiment when connecting to an existing experiment.
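A minimal sketch, assuming an experiment is already listening on port 8080:

.. code-block:: python

   from nni.experiment import Experiment

   experiment = Experiment.connect(8080)   # port of the running experiment
   print(experiment.get_status())
   experiment.stop()                       # optional: stop the connected experiment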
Resume/View and Manage a Stopped Experiment
-------------------------------------------
You can use ``Experiment.resume()`` and ``Experiment.view()`` to resume and view a stopped experiment; these functions behave like ``nnictl resume`` and ``nnictl view``.
If you want to manage the experiment, set ``wait_completion`` to ``False`` and the functions will return an ``Experiment`` instance. For more parameters, please refer to the API reference.
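A minimal sketch (the experiment ID is a placeholder):

.. code-block:: python

   from nni.experiment import Experiment

   # resume without blocking, so we get an Experiment instance to manage
   experiment = Experiment.resume('YOUR_EXPERIMENT_ID', wait_completion=False)
   print(experiment.get_status())
   experiment.stop()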
API Reference
-------------
Detailed usage could be found `here <../reference/experiment_config.rst>`__.
* `Experiment`_
* `Experiment Config <#Experiment-Config>`_
* `Algorithm Config <#Algorithm-Config>`_
* `Training Service Config <#Training-Service-Config>`_
`Docker <https://www.docker.com/>`__ is a tool that makes it easier for users to deploy and run applications on their own operating system by starting containers. Docker is not a virtual machine: it does not create a virtual operating system, but it allows different applications to share the same OS kernel while isolating them in containers.
Users can start NNI experiments using Docker. NNI also provides an official Docker image `msranni/nni <https://hub.docker.com/r/msranni/nni>`__ on Docker Hub.
Using Docker in local machine
-----------------------------
Step 1: Installation of Docker
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Before you start using Docker for NNI experiments, you should install Docker on your local machine. `See here <https://docs.docker.com/install/linux/docker-ce/ubuntu/>`__.
Step 2: Start a Docker container
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you have installed Docker on your local machine, you can start a Docker container instance to run NNI examples. Note that NNI starts a web UI process in the container and listens on a port, so you need to specify a port mapping between your host machine and the Docker container to access the web UI from outside the container. By visiting the host IP address and port, you will be redirected to the web UI process running in the Docker container.
For example, you could start a new Docker container from the following command:
.. code-block:: bash
docker run -i -t -p [hostPort]:[containerPort] [image]
``-i:`` Start the container in interactive mode.

``-t:`` Allocate a pseudo-terminal so the container has an input terminal.

``-p:`` Port mapping; map a host port to a container port.
For more information about Docker commands, please `refer to this <https://docs.docker.com/engine/reference/run/>`__.
.. note:: NNI only supports Ubuntu and macOS systems in local mode for the moment; please use the correct Docker image type. If you want to use a GPU in a Docker container, please use nvidia-docker.
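For instance, a concrete sketch using NNI's official image, mapping container port 8080 (the web UI port used elsewhere in this document) to host port 8080:

.. code-block:: bash

   docker run -i -t -p 8080:8080 msranni/nni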
Step 3: Run NNI in a Docker container
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you start a container from NNI's official image ``msranni/nni``\ , you can directly start NNI experiments by using the ``nnictl`` command. Our official image has NNI's running environment and basic Python and deep learning frameworks preinstalled.
If you start your own Docker image, you may need to install the NNI package first; please refer to `NNI installation <InstallationLinux.rst>`__.
If you want to run NNI's official examples, you may need to clone the NNI repo in GitHub using
.. code-block:: bash
git clone https://github.com/Microsoft/nni.git
then you can enter ``nni/examples/trials`` to start an experiment.
After you prepare NNI's environment, you can start a new experiment using the ``nnictl`` command. `See here <QuickStart.rst>`__.
Using Docker on a remote platform
---------------------------------
NNI supports starting experiments in `remoteTrainingService <../TrainingService/RemoteMachineMode.rst>`__\ , and running trial jobs on remote machines. As Docker can start an independent Ubuntu system as an SSH server, a Docker container can be used as the remote machine in NNI's remote mode.
Step 1: Setting a Docker environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You should install the Docker software on your remote machine first, please `refer to this <https://docs.docker.com/install/linux/docker-ce/ubuntu/>`__.
To make sure your Docker container can be connected by NNI experiments, you should build your own Docker image to set an SSH server or use images with an SSH configuration. If you want to use a Docker container as an SSH server, you should configure the SSH password login or private key login; please `refer to this <https://docs.docker.com/engine/examples/running_ssh_service/>`__.
.. note:: NNI's official image ``msranni/nni`` does not support SSH servers for the time being; you should build your own Docker image with an SSH configuration or use other images as a remote server.
Step 2: Start a Docker container on a remote machine
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
An SSH server needs a port, and you need to expose the container's SSH port to NNI as the connection port. For example, if you set your container's SSH port to ``A``, map port ``A`` to another port ``B`` on your remote host machine. NNI will connect to port ``B`` as the SSH port, and your host machine will forward the connection from port ``B`` to port ``A``, so NNI can reach your Docker container.
For example, you could start your Docker container using the following commands:
.. code-block:: bash
docker run -dit -p [hostPort]:[containerPort] [image]
The ``containerPort`` is the SSH port used in your Docker container and the ``hostPort`` is your host machine's port exposed to NNI. You can set your NNI's config file to connect to ``hostPort`` and the connection will be transmitted to your Docker container.
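For instance, a sketch with concrete ports (the image name is a placeholder for your own SSH-enabled image):

.. code-block:: bash

   # expose the container's SSH port 22 on host port 2222
   docker run -dit -p 2222:22 my-ssh-image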
For more information about Docker commands, please `refer to this <https://docs.docker.com/v17.09/edge/engine/reference/run/>`__.
.. note:: If you use your own Docker image as a remote server, please make sure that this image has a basic Python environment and an NNI SDK runtime environment. If you want to use a GPU in a Docker container, please use nvidia-docker.
Step 3: Run NNI experiments
^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can set your config file to the remote platform and set the ``machineList`` configuration to connect to your Docker SSH server; `refer to this <../TrainingService/RemoteMachineMode.rst>`__. Note that you should set the correct ``port``\ , ``username``\ , and ``passwd`` or ``sshKeyPath`` of your host machine.

``port:`` The host machine's port, mapped to Docker's SSH port.

``username:`` The username of the Docker container.

``passwd:`` The password of the Docker container.

``sshKeyPath:`` The path to the private key for the Docker container.
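A sketch of the corresponding ``machineList`` section (all values are placeholders):

.. code-block:: yaml

   machineList:
     - ip: 10.0.0.2       # remote host running the Docker container
       port: 2222         # host port mapped to the container's SSH port 22
       username: root
       passwd: your_password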
After configuring the config file, you can start an experiment; `refer to this <QuickStart.rst>`__.
If you want to use your own storage while using NNI, shared storage can satisfy you. Compared with training service native storage, shared storage brings more convenience.
All the information generated by the experiment is stored under the ``/nni`` folder in your shared storage, and all the output produced by a trial is located under the ``/nni/{EXPERIMENT_ID}/trials/{TRIAL_ID}/nnioutput`` folder.
This saves you from hunting for experiment-related information in various places.
Remember that your trial working directory is ``/nni/{EXPERIMENT_ID}/trials/{TRIAL_ID}``, so if you upload your data to this shared storage, you can open it like a local file in your trial code without downloading it.
We will develop more practical features based on shared storage in the future. The config reference can be found `here <../reference/experiment_config.html#sharedstorageconfig>`_.
.. note::
   Shared storage is currently in the experimental stage. We suggest using AzureBlob under Ubuntu/CentOS/RHEL, and NFS under Ubuntu/CentOS/RHEL/Fedora/Debian for remote mode.
   Make sure your local machine can mount NFS or fuse AzureBlob, and that the machine used in the training service has passwordless ``sudo`` permission. We only support shared storage with reuse-mode training services for now.
.. note::
   What is the difference between training service native storage and shared storage? Training service native storage is usually provided by the specific training service,
   e.g., the local storage on a remote machine in remote mode, or the storage provided in openpai mode. These storages might not be easy to use; for example, users have to upload datasets to all remote machines to train the model.
   In these cases, shared storage is automatically mounted on the machines in the training platform, so users can directly save and load data from it. All the data and logs used or generated by one experiment can be placed in the same place.
   After the experiment finishes, the shared storage is automatically unmounted from the training platform.
Example
-------
If you want to use AzureBlob, add the snippet below to your config. For the full config file, see :githublink:`mnist-sharedstorage/config_azureblob.yml <examples/trials/mnist-sharedstorage/config_azureblob.yml>`.
.. code-block:: yaml
   sharedStorage:
     storageType: AzureBlob
     # set localMountPoint to an absolute path outside the code directory,
     # because NNI copies user code to localMountPoint
     localMountPoint: ${your/local/mount/point}
     # remoteMountPoint is the mount point on the training service machine;
     # it can be either an absolute or a relative path
     # (make sure you have passwordless `sudo` permission on the training service machine)
     remoteMountPoint: ${your/remote/mount/point}
     storageAccountName: ${replace_to_your_storageAccountName}
     storageAccountKey: ${replace_to_your_storageAccountKey}
     containerName: ${replace_to_your_containerName}
     # usermount: you have already mounted this storage at localMountPoint
     # nnimount: NNI will try to mount this storage at localMountPoint
     # nomount: storage will not be mounted on the local machine
     #          (partial storages will be supported in the future)
     localMounted: nnimount
You can find ``storageAccountName``, ``storageAccountKey``, and ``containerName`` in the Azure storage account portal.
.. image:: ../../img/azure_storage.png
:target: ../../img/azure_storage.png
:alt:
If you want to use NFS, add the snippet below to your config. For the full config file, see :githublink:`mnist-sharedstorage/config_nfs.yml <examples/trials/mnist-sharedstorage/config_nfs.yml>`.
.. code-block:: yaml
   sharedStorage:
     storageType: NFS
     localMountPoint: ${your/local/mount/point}
     remoteMountPoint: ${your/remote/mount/point}
     nfsServer: ${nfs-server-ip}
     exportedDirectory: ${nfs/exported/directory}
     # usermount: you have already mounted this storage at localMountPoint
     # nnimount: NNI will try to mount this storage at localMountPoint
     # nomount: storage will not be mounted on the local machine
     #          (partial storages will be supported in the future)
     localMounted: nnimount
NNI provides many `builtin tuners <../Tuner/BuiltinTuner.rst>`_, `advisors <../Tuner/HyperbandAdvisor.rst>`__ and `assessors <../Assessor/BuiltinAssessor.rst>`__ that can be used directly for hyperparameter optimization, and extra algorithms can be registered via ``nnictl algo register --meta <path_to_meta_file>`` after NNI is installed. You can check the builtin algorithms via the ``nnictl algo list`` command.

NNI also provides the ability to build your own customized tuners, advisors and assessors. To use a customized algorithm, users can simply follow the spec in the experiment config file to properly reference the algorithm, as illustrated in the tutorials for `customized tuners <../Tuner/CustomizeTuner.rst>`_ / `advisors <../Tuner/CustomizeAdvisor.rst>`__ / `assessors <../Assessor/CustomizeAssessor.rst>`__.

NNI also allows users to install a customized algorithm as a builtin algorithm, so that it can be used the same way as NNI builtin tuners/advisors/assessors. More importantly, this makes it much easier to share or distribute an implemented algorithm to others. Once customized tuners/advisors/assessors are installed into NNI as builtin algorithms, you can use them in your experiment configuration file just like builtin ones. For example, if you built a customized tuner and installed it into NNI under the builtin name ``mytuner``, you can use it in your configuration file like below:
.. code-block:: yaml
   tuner:
     builtinTunerName: mytuner
Register customized algorithms as builtin tuners, assessors and advisors
-------------------------------------------------------------------------
NNI provides a ``ClassArgsValidator`` interface for customized algorithm authors to validate the classArgs parameters that the experiment configuration file passes to the customized algorithm's constructor.
The ``ClassArgsValidator`` interface is defined as:
.. code-block:: python
   class ClassArgsValidator(object):
       def validate_class_args(self, **kwargs):
           """
           The classArgs fields in experiment configuration are packed as a dict and
           passed to validator as kwargs.
           """
           pass
For example, you can implement your validator as below (a minimal sketch; the schema rules are an assumption based on the Medianstop assessor's ``optimize_mode`` argument):

.. code-block:: python

   from schema import Schema, Optional
   from nni import ClassArgsValidator

   class MedianstopClassArgsValidator(ClassArgsValidator):
       def validate_class_args(self, **kwargs):
           Schema({Optional('optimize_mode'): self.choices('optimize_mode', 'maximize', 'minimize')}).validate(kwargs)
3. Install the customized algorithm into the python environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Firstly, the customized algorithm needs to be prepared as a python package. Then you can install the package into the python environment via one of the following:

* Run ``python setup.py develop`` from the package directory. This installs the package in development mode and is recommended if your algorithm is still under development.
* Run ``python setup.py bdist_wheel`` from the package directory. This builds a whl file, which is a pip installation source; then run ``pip install <wheel file>`` to install it.
4. Prepare meta file
^^^^^^^^^^^^^^^^^^^^
Create a yaml file with the following keys as the meta file:

* ``algoType``: type of algorithm; one of ``tuner``, ``assessor``, ``advisor``
* ``builtinName``: builtin name used in the experiment configuration file
* ``className``: tuner class name, including its module name, for example: ``demo_tuner.DemoTuner``
* ``classArgsValidator``: class args validator class name, including its module name, for example: ``demo_tuner.MyClassArgsValidator``
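Putting these together, a sketch of a meta file for the ``demo_tuner`` example above:

.. code-block:: yaml

   algoType: tuner
   builtinName: demotuner
   className: demo_tuner.DemoTuner
   classArgsValidator: demo_tuner.MyClassArgsValidator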
Once your customized algorithm is registered (via ``nnictl algo register --meta <path_to_meta_file>``\ ), you can use it in the experiment configuration file the same way as other builtin tuners/assessors/advisors, for example:
.. code-block:: yaml
   tuner:
     builtinTunerName: demotuner
     classArgs:
       #choice: maximize, minimize
       optimize_mode: maximize
Manage builtin algorithms using ``nnictl algo``
-----------------------------------------------
List builtin algorithms
^^^^^^^^^^^^^^^^^^^^^^^
Run the following command to list the registered builtin algorithms:

``nnictl algo list``

Unregister builtin algorithms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Run the following command to unregister an installed algorithm:
``nnictl algo unregister <builtin name>``
For example:
``nnictl algo unregister demotuner``
Porting customized algorithms from v1.x to v2.x
-----------------------------------------------
All that needs to be modified is to delete the ``NNI Package :: tuner`` metadata in ``setup.py`` and add a meta file as described in `4. Prepare meta file`_. Then you can follow `Register customized algorithms as builtin tuners, assessors and advisors`_ to register your customized algorithm.
Example: Register a customized tuner as a builtin tuner
--------------------------------------------------------
You can also install NNI in a docker image. Please follow the instructions `here <../Tutorial/HowToUseDocker.rst>`__ to build an NNI docker image. The NNI docker image can also be retrieved from Docker Hub through the command ``docker pull msranni/nni:latest``.
Verify installation
-------------------
*
Download the examples via cloning the source code.
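For example (a sketch; the config path follows the MNIST example referenced elsewhere in this document):

.. code-block:: bash

   git clone https://github.com/Microsoft/nni.git
   # start the MNIST example
   nnictl create --config nni/examples/trials/mnist-pytorch/config.yml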
Wait for the message ``INFO: Successfully started experiment!`` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the ``Web UI url``.
* Open the ``Web UI url`` in your browser, you can view detailed information about the experiment and all the submitted trial jobs as shown below. `Here <../Tutorial/WebUI.rst>`__ are more Web UI pages.
.. image:: ../../img/webui_overview_page.png
:target: ../../img/webui_overview_page.png
:alt: overview
.. image:: ../../img/webui_trialdetail_page.png
:target: ../../img/webui_trialdetail_page.png
:alt: detail
System requirements
-------------------
Due to potential programming changes, the minimum system requirements of NNI may change over time.
Linux
^^^^^
.. list-table::
:header-rows: 1
:widths: auto
* -
- Recommended
- Minimum
* - **Operating System**
- Ubuntu 16.04 or above
-
* - **CPU**
- Intel® Core™ i5 or AMD Phenom™ II X3 or better
- Intel® Core™ i3 or AMD Phenom™ X3 8650
* - **GPU**
- NVIDIA® GeForce® GTX 660 or better
- NVIDIA® GeForce® GTX 460
* - **Memory**
- 6 GB RAM
- 4 GB RAM
* - **Storage**
- 30 GB available hard drive space
-
* - **Internet**
- Broadband internet connection
-
* - **Resolution**
- 1024 x 768 minimum display resolution
-
macOS
^^^^^
.. list-table::
:header-rows: 1
:widths: auto
* -
- Recommended
- Minimum
* - **Operating System**
- macOS 10.14.1 or above
-
* - **CPU**
- Intel® Core™ i7-4770 or better
- Intel® Core™ i5-760 or better
* - **GPU**
- AMD Radeon™ R9 M395X or better
- NVIDIA® GeForce® GT 750M or AMD Radeon™ R9 M290 or better
* - **Memory**
- 8 GB RAM
- 4 GB RAM
* - **Storage**
- 70 GB available space on an SSD
- 70 GB available space on a 7200 RPM HDD
* - **Internet**
- Broadband internet connection
-
* - **Resolution**
- 1024 x 768 minimum display resolution
-
Further reading
---------------
* `Overview <../Overview.rst>`__
* `Use command line tool nnictl <Nnictl.rst>`__
* `Use NNIBoard <WebUI.rst>`__
* `Define search space <SearchSpaceSpec.rst>`__
* `Config an experiment <ExperimentConfig.rst>`__
* `How to run an experiment on local (with multiple GPUs)? <../TrainingService/LocalMode.rst>`__
* `How to run an experiment on multiple machines? <../TrainingService/RemoteMachineMode.rst>`__
* `How to run an experiment on OpenPAI? <../TrainingService/PaiMode.rst>`__
* `How to run an experiment on Kubernetes through Kubeflow? <../TrainingService/KubeflowMode.rst>`__
* `How to run an experiment on Kubernetes through FrameworkController? <../TrainingService/FrameworkControllerMode.rst>`__
* `How to run an experiment on Kubernetes through AdaptDL? <../TrainingService/AdaptDLMode.rst>`__
Python 3.6 (or above) 64-bit. `Anaconda <https://www.anaconda.com/products/individual>`__ or `Miniconda <https://docs.conda.io/en/latest/miniconda.html>`__ is highly recommended to manage multiple Python environments on Windows.
*
If it's a newly installed Python environment, you need to install `Microsoft C++ Build Tools <https://visualstudio.microsoft.com/visual-cpp-build-tools/>`__ to support building NNI dependencies like ``scikit-learn``.
.. code-block:: bat
pip install cython wheel
*
git for verifying installation.
Install NNI
-----------
In most cases, you can install and upgrade NNI from the pip package. It's easy and fast.
If you are interested in special or the latest code versions, you can install NNI from the source code.
If you want to contribute to NNI, refer to `setup development environment <SetupNniDeveloperEnvironment.rst>`__.
Note: If you are familiar with other frameworks, you can choose the corresponding example under ``examples\trials``. You need to change the trial command from ``python3`` to ``python`` in each example YAML, since the default installation has a ``python.exe``\ , not a ``python3.exe`` executable.
*
Wait for the message ``INFO: Successfully started experiment!`` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the ``Web UI url``.
* Open the ``Web UI url`` in your browser, you can view detailed information about the experiment and all the submitted trial jobs as shown below. `Here <../Tutorial/WebUI.rst>`__ are more Web UI pages.
.. image:: ../../img/webui_overview_page.png
:target: ../../img/webui_overview_page.png
:alt: overview
.. image:: ../../img/webui_trialdetail_page.png
:target: ../../img/webui_trialdetail_page.png
:alt: detail
System requirements
-------------------
Below are the minimum system requirements for NNI on Windows; Windows 10.1809 is well tested and recommended. Due to potential programming changes, the minimum system requirements for NNI may change over time.
.. list-table::
:header-rows: 1
:widths: auto
* -
- Recommended
- Minimum
* - **Operating System**
- Windows 10 1809 or above
-
* - **CPU**
- Intel® Core™ i5 or AMD Phenom™ II X3 or better
- Intel® Core™ i3 or AMD Phenom™ X3 8650
* - **GPU**
- NVIDIA® GeForce® GTX 660 or better
- NVIDIA® GeForce® GTX 460
* - **Memory**
- 6 GB RAM
- 4 GB RAM
* - **Storage**
- 30 GB available hard drive space
-
* - **Internet**
- Broadband internet connection
-
* - **Resolution**
- 1024 x 768 minimum display resolution
-
FAQ
---
simplejson failed when installing NNI
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Make sure a C++ 14.0 compiler is installed.
.. code-block:: text

   building 'simplejson._speedups' extension error: [WinError 3] The system cannot find the path specified
Trial failed with missing DLL in command line or PowerShell
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This error is caused by missing LIBIFCOREMD.DLL and LIBMMD.DLL, which makes the SciPy installation fail. Using Anaconda or Miniconda with 64-bit Python can solve it.
.. code-block:: text

   ImportError: DLL load failed
Trial failed on webUI
^^^^^^^^^^^^^^^^^^^^^
Please check the trial's ``stderr`` log file for more details. If a ``stderr`` file exists, check it. Two possible causes are:
* forgetting to change the trial command ``python3`` to ``python`` in each experiment YAML.
* forgetting to install experiment dependencies such as TensorFlow, Keras and so on.
Fail to use BOHB on Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Make sure a C++ 14.0 compiler is installed when trying to run ``pip install nni[BOHB]`` to install the dependencies.
Not supported tuner on Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SMAC is not supported currently; for the specific reason refer to this `GitHub issue <https://github.com/automl/SMAC3/issues/483>`__.
Use Windows as a remote worker
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Refer to `Remote Machine mode <../TrainingService/RemoteMachineMode.rst>`__.
Segmentation fault (core dumped) when installing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Refer to `FAQ <FAQ.rst>`__.
Further reading
---------------
* `Overview <../Overview.rst>`__
* `Use command line tool nnictl <Nnictl.rst>`__
* `Use NNIBoard <WebUI.rst>`__
* `Define search space <SearchSpaceSpec.rst>`__
* `Config an experiment <ExperimentConfig.rst>`__
* `How to run an experiment on local (with multiple GPUs)? <../TrainingService/LocalMode.rst>`__
* `How to run an experiment on multiple machines? <../TrainingService/RemoteMachineMode.rst>`__
* `How to run an experiment on OpenPAI? <../TrainingService/PaiMode.rst>`__
* `How to run an experiment on Kubernetes through Kubeflow? <../TrainingService/KubeflowMode.rst>`__
* `How to run an experiment on Kubernetes through FrameworkController? <../TrainingService/FrameworkControllerMode.rst>`__
**nnictl** is a command line tool, which can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc.
Commands
--------
``nnictl`` supports the following commands:
* `nnictl create <#create>`__
* `nnictl resume <#resume>`__
* `nnictl view <#view>`__
* `nnictl stop <#stop>`__
* `nnictl update <#update>`__
* `nnictl trial <#trial>`__
* `nnictl top <#top>`__
* `nnictl experiment <#experiment>`__
* `nnictl platform <#platform>`__
* `nnictl config <#config>`__
* `nnictl log <#log>`__
* `nnictl webui <#webui>`__
* `nnictl algo <#algo>`__
* `nnictl ss_gen <#ss_gen>`__
* `nnictl --version <#version>`__
Manage an experiment
^^^^^^^^^^^^^^^^^^^^
:raw-html:`<a name="create"></a>`
nnictl create
^^^^^^^^^^^^^
*
Description
You can use this command to create a new experiment, using the configuration specified in the config file.

After this command completes successfully, the context is set to this experiment, which means the following commands you issue are associated with this experiment, unless you explicitly change the context (not supported yet).
*
Usage
.. code-block:: bash
nnictl create [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - --config, -c
- True
-
- YAML configure file of the experiment
* - --port, -p
- False
-
- the port of restful server
* - --debug, -d
- False
-
- set debug mode
* - --foreground, -f
- False
-
- set foreground mode, print log content to terminal
*
Examples
create a new experiment with the default port 8080:
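A sketch of the command (the config path is illustrative):

.. code-block:: bash

   nnictl create --config nni/examples/trials/mnist-pytorch/config.yml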
- The interval at which the experiment status is updated; the unit is seconds, and the default value is 3 seconds.
:raw-html:`<a name="experiment"></a>`
Manage experiment information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
*
**nnictl experiment show**
*
Description
Show the information of an experiment.
*
Usage
.. code-block:: bash
nnictl experiment show
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment you want to show
*
**nnictl experiment status**
*
Description
Show the status of an experiment.
*
Usage
.. code-block:: bash
nnictl experiment status
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment whose status you want to check
*
**nnictl experiment list**
*
Description
Show the information of all the (running) experiments.
*
Usage
.. code-block:: bash
nnictl experiment list [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - --all
- False
-
- list all experiments
*
**nnictl experiment delete**
*
Description
Delete one or all experiments, including logs, results, environment information, and cache. Use it to delete useless experiment results or to save disk space.
*
Usage
.. code-block:: bash
nnictl experiment delete [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment
* - --all
- False
-
- delete all experiments
*
**nnictl experiment export**
*
Description
You can use this command to export the reward & hyper-parameters of trial jobs to a csv or json file.
*
Usage
.. code-block:: bash
nnictl experiment export [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- ID of the experiment
* - --filename, -f
- True
-
- File path of the output file
* - --type
- True
-
- Type of output file, only support "csv" and "json"
* - --intermediate, -i
- False
-
- Whether intermediate results are included
*
Examples
export all trial data in an experiment in JSON format:
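A sketch of the command, combining the options above:

.. code-block:: bash

   nnictl experiment export [experiment_id] --filename [file_path] --type json --intermediate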
You can use this command to import several prior or supplementary trial hyperparameters & results for NNI hyperparameter tuning. The data are fed to the tuning algorithm (e.g., tuner or advisor).
*
Usage
.. code-block:: bash
nnictl experiment import [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - id
- False
-
- The id of the experiment you want to import data into
* - --filename, -f
- True
-
- a file with data you want to import in json format
*
Details
NNI supports importing users' own data; please express the data in the correct format.
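A sketch with hypothetical hyperparameters ``x`` and ``y``, where each record pairs a parameter set with its final result:

.. code-block:: json

   [
       {"parameter": {"x": 0.5, "y": 0.9}, "value": 0.03},
       {"parameter": {"x": 0.4, "y": 0.8}, "value": 0.05},
       {"parameter": {"x": 0.3, "y": 0.7}, "value": 0.04}
   ]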
Every element in the top level list is a sample. For our built-in tuners/advisors, each sample should have at least two keys: ``parameter`` and ``value``. The ``parameter`` must match this experiment's search space, that is, all the keys (or hyperparameters) in ``parameter`` must match the keys in the search space. Otherwise, tuner/advisor may have unpredictable behavior. ``Value`` should follow the same rule of the input in ``nni.report_final_result``\ , that is, either a number or a dict with a key named ``default``. For your customized tuner/advisor, the file could have any json content depending on how you implement the corresponding methods (e.g., ``import_data``\ ).
You also can use `nnictl experiment export <#export>`__ to export a valid json file including previous experiment trial hyperparameters and results.
Currently, the following tuners and advisors support importing data:

*If you want to import data to the BOHB advisor, users are suggested to add "TRIAL_BUDGET" in the parameter as NNI does; otherwise, BOHB will use max_budget as "TRIAL_BUDGET".*
It is used to clean up disk space on a target platform. The provided YAML file includes the information of the target platform, and it follows the same schema as the NNI configuration file.
*
Note
If the target platform is being used by other users, it may cause unexpected errors for them.
*
Usage
.. code-block:: bash
nnictl platform clean [OPTIONS]
*
Options
.. list-table::
:header-rows: 1
:widths: auto
* - Name, shorthand
- Required
- Default
- Description
* - --config
- True
-
- the path of the yaml config file used when creating the experiment
NNI is a toolkit to help users run automated machine learning experiments. It can automatically do the cyclic process of getting hyperparameters, running trials, testing results, and tuning hyperparameters. Here, we'll show how to use NNI to help you find the optimal hyperparameters on the MNIST dataset.
An example script that trains a CNN on the MNIST dataset **without NNI** can be found in :githublink:`examples/trials/mnist-pytorch <examples/trials/mnist-pytorch>`.

Such code can only try one set of parameters at a time. If you want to tune the learning rate, you need to manually modify the hyperparameter and start the trial again and again.
NNI is born to help users tune jobs, whose working process is presented below:
.. code-block:: text
input: search space, trial code, config file
output: one optimal hyperparameter configuration
1: For t = 0, 1, 2, ..., maxTrialNum,
2: hyperparameter = choose a set of parameters from the search space
3: final result = run_trial_and_evaluate(hyperparameter)
4: report final result to NNI
5: If the upper time limit is reached,
6: Stop the experiment
7: return hyperparameter value with best final result
.. note::
If you want to use NNI to automatically train your model and find the optimal hyper-parameters, there are two approaches:
1. Write a config file and start the experiment from the command line.
2. Config and launch the experiment directly from a Python file
In this part, we will focus on the first approach. For the second approach, please refer to `this tutorial <HowToLaunchFromPython.rst>`__\ .
Step 1: Modify the ``Trial`` Code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Modify your ``Trial`` file to get the hyperparameter set from NNI and report the final results to NNI.
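A minimal sketch of the required changes (``nni.get_next_parameter`` and the report APIs are NNI's trial APIs; the surrounding training code is elided):

.. code-block:: python

   import nni

   params = {'batch_size': 32, 'lr': 0.001}   # default hyperparameters
   params.update(nni.get_next_parameter())    # overridden by the tuner's choices

   # ... build the model and train it with `params` ...

   nni.report_intermediate_result(0.90)       # e.g., accuracy after an epoch
   nni.report_final_result(0.95)              # final metric of this trial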
Step 2: Define the Search Space
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Define a ``Search Space`` in a YAML file, including the ``name`` and the ``distribution`` (discrete-valued or continuous-valued) of all the hyperparameters you want to search.
You can also write your search space in a JSON file and specify the file path in the configuration. For detailed tutorial on how to write the search space, please see `here <SearchSpaceSpec.rst>`__.
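For instance, a sketch of a search space in JSON (the hyperparameter names are illustrative):

.. code-block:: json

   {
       "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
       "lr": {"_type": "loguniform", "_value": [0.0001, 0.1]},
       "momentum": {"_type": "uniform", "_value": [0, 1]}
   }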
Step 3: Config the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In addition to the search space defined in `step 2 <step-2-define-the-search-space>`__, you need to configure the experiment in the YAML file. The config specifies the key information of the experiment, such as the trial files, tuning algorithm, max trial number, and max duration.
.. code-block:: yaml
   experimentName: MNIST            # An optional name to distinguish the experiments
   trialCommand: python3 mnist.py   # NOTE: change "python3" to "python" if you are using Windows
   trialConcurrency: 2              # Run 2 trials concurrently
   maxTrialNumber: 10               # Generate at most 10 trials
   maxExperimentDuration: 1h        # Stop generating trials after 1 hour
   tuner:                           # Configure the tuning algorithm
     name: TPE
     classArgs:                     # Algorithm specific arguments
       optimize_mode: maximize
   trainingService:                 # Configure the training platform
     platform: local
Experiment config reference could be found `here <../reference/experiment_config.rst>`__.
.. _nniignore:
.. Note:: If you are planning to use remote machines or clusters as your :doc:`training service <../TrainingService/Overview>`, to avoid too much pressure on network, NNI limits the number of files to 2000 and total size to 300MB. If your codeDir contains too many files, you can choose which files and subfolders should be excluded by adding a ``.nniignore`` file that works like a ``.gitignore`` file. For more details on how to write this file, see the `git documentation <https://git-scm.com/docs/gitignore#_pattern_format>`__.
*Example:* :githublink:`config_detailed.yml <examples/trials/mnist-pytorch/config_detailed.yml>` and :githublink:`.nniignore <examples/trials/mnist-pytorch/.nniignore>`
All the code above is already prepared and stored in :githublink:`examples/trials/mnist-pytorch <examples/trials/mnist-pytorch>`.
Step 4: Launch the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Linux and macOS
***************
Run the **config_detailed.yml** file from your command line to start the experiment, as shown below.
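A sketch, assuming you run it from the root of the cloned NNI repository:

.. code-block:: bash

   nnictl create --config examples/trials/mnist-pytorch/config_detailed.yml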
Windows
*******

Change ``python3`` to ``python`` in the ``trialCommand`` field of the **config_detailed.yml** file, and run it from your command line to start the experiment.
.. Note:: ``nnictl`` is a command line tool that can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc. Click :doc:`here <Nnictl>` for more usage of ``nnictl``.
Wait for the message ``INFO: Successfully started experiment!`` in the command line. This message indicates that your experiment has been successfully started.

If you prepared the ``trial``\ , ``search space``\ , and ``config`` according to the above steps and successfully created an NNI job, NNI will automatically tune the optimal hyper-parameters and run different hyper-parameter sets for each trial according to the defined search space. You can clearly see its progress through the WebUI.
Step 5: View the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^
After starting the experiment successfully, you can find a message in the command-line interface that tells you the ``Web UI url`` like this:
.. code-block:: text
The Web UI urls are: [Your IP]:8080
Open the ``Web UI url`` (here it's ``[Your IP]:8080``\ ) in your browser; you can view detailed information about the experiment and all the submitted trial jobs as shown below. If you cannot open the Web UI link in your terminal, please refer to the `FAQ <FAQ.rst#could-not-open-webui-link>`__.
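A sketch of a search space file whose first line defines ``dropout_rate`` (the second entry is illustrative):

.. code-block:: json

   {
       "dropout_rate": {"_type": "uniform", "_value": [0.1, 0.5]},
       "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]}
   }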
Take the first line as an example. ``dropout_rate`` is defined as a variable whose prior distribution is a uniform distribution with a range from ``0.1`` to ``0.5``.
.. note:: In the `experiment configuration (V2) schema <ExperimentConfig.rst>`_, NNI supports defining the search space directly in the configuration file, detailed usage can be found `here <QuickStart.rst#step-2-define-the-search-space>`__. When using Python API, users can write the search space in the Python file, refer `here <HowToLaunchFromPython.rst>`__.
Note that the available sampling strategies within a search space depend on the tuner you want to use. We list the supported types for each builtin tuner below. For a customized tuner, you don't have to follow our convention and you will have the flexibility to define any type you want.
Types
-----
All types of sampling strategies and their parameter are listed here:
*
``{"_type": "choice", "_value": options}``
* The variable's value is one of the options. Here ``options`` should be a list of **numbers** or a list of **strings**. Using arbitrary objects as members of this list (like sublists, a mixture of numbers and strings, or null values) should work in most cases, but may trigger undefined behaviors.
* ``options`` can also be a nested sub-search-space, this sub-search-space takes effect only when the corresponding element is chosen. The variables in this sub-search-space can be seen as conditional variables. Here is an simple :githublink:`example of nested search space definition <examples/trials/mnist-nested-search-space/search_space.json>`. If an element in the options list is a dict, it is a sub-search-space, and for our built-in tuners you have to add a ``_name`` key in this dict, which helps you to identify which element is chosen. Accordingly, here is a :githublink:`sample <examples/trials/mnist-nested-search-space/sample.json>` which users can get from nni with nested search space definition. See the table below for the tuners which support nested search spaces.
*
``{"_type": "quniform", "_value": [low, high, q]}``

* The variable value is determined using ``clip(round(uniform(low, high) / q) * q, low, high)``\ , where the clip operation constrains the generated value within the bounds. For example, for ``_value`` specified as [0, 10, 2.5], possible values are [0, 2.5, 5.0, 7.5, 10.0]; for ``_value`` specified as [2, 10, 5], possible values are [2, 5, 10].
* Suitable for a discrete value with respect to which the objective is still somewhat "smooth", but which should be bounded both above and below. If you want to uniformly choose an integer from a range [low, high], you can write ``_value`` like this: ``[low, high, 1]``.
*
``{"_type": "loguniform", "_value": [low, high]}``

* The variable value is drawn from a range [low, high] according to a loguniform distribution like ``exp(uniform(log(low), log(high)))``\ , so that the logarithm of the return value is uniformly distributed.
* When optimizing, this variable is constrained to be positive.
*
``{"_type": "qloguniform", "_value": [low, high, q]}``

* The variable value is determined using ``clip(round(loguniform(low, high) / q) * q, low, high)``\ , where the clip operation constrains the generated value within the bounds.
* Suitable for a discrete variable with respect to which the objective is "smooth" and gets smoother with the size of the value, but which should be bounded both above and below.
*
``{"_type": "normal", "_value": [mu, sigma]}``
* The variable value is a real value that's normally-distributed with mean mu and standard deviation sigma. When optimizing, this is an unconstrained variable.
*
``{"_type": "qnormal", "_value": [mu, sigma, q]}``

* The variable value is determined using ``round(normal(mu, sigma) / q) * q``
* Suitable for a discrete variable that probably takes a value around mu, but is fundamentally unbounded.
*
``{"_type": "lognormal", "_value": [mu, sigma]}``
* The variable value is drawn according to ``exp(normal(mu, sigma))`` so that the logarithm of the return value is normally distributed. When optimizing, this variable is constrained to be positive.
*
``{"_type": "qlognormal", "_value": [mu, sigma, q]}``

* The variable value is determined using ``round(exp(normal(mu, sigma)) / q) * q``
* Suitable for a discrete variable with respect to which the objective is smooth and gets smoother with the size of the variable, which is bounded from one side.
Search Space Types Supported by Each Tuner
------------------------------------------
.. list-table::
:header-rows: 1
:widths: auto
* -
- choice
- choice(nested)
- randint
- uniform
- quniform
- loguniform
- qloguniform
- normal
- qnormal
- lognormal
- qlognormal
* - TPE Tuner
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
* - Random Search Tuner
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
* - Anneal Tuner
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
* - Evolution Tuner
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
* - SMAC Tuner
- :raw-html:`✓`
-
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
-
-
-
-
-
* - Batch Tuner
- :raw-html:`✓`
-
-
-
-
-
-
-
-
-
-
* - Grid Search Tuner
- :raw-html:`✓`
-
- :raw-html:`✓`
-
- :raw-html:`✓`
-
-
-
-
-
-
* - Hyperband Advisor
- :raw-html:`✓`
-
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
* - Metis Tuner
- :raw-html:`✓`
-
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
-
-
-
-
-
-
* - GP Tuner
- :raw-html:`✓`
-
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
-
-
-
-
* - DNGO Tuner
- :raw-html:`✓`
-
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
- :raw-html:`✓`
-
-
-
-
Known Limitations:
*
GP Tuner, Metis Tuner and DNGO Tuner support only **numerical values** in the search space (``choice``-type values can be non-numerical with other tuners, e.g., string values). GP Tuner and Metis Tuner use a Gaussian Process Regressor (GPR). GPR makes predictions based on a kernel function and the 'distance' between different points, and it is hard to define a true distance between non-numerical values.
*
Note that for nested search space:
* Only the Random Search, TPE, Anneal, and Evolution tuners support nested search spaces (see the ``choice(nested)`` column in the table above)
Nothing needs to be done; the code is already linked to the package folders.
TypeScript (Linux and macOS)
****************************
* If ``ts/nni_manager`` is changed, run ``yarn watch`` under that folder. It will watch and rebuild the code continuously. ``nnictl`` needs to be restarted to reload the NNI manager.
* If ``ts/webui`` is changed, run ``yarn dev``\ , which will run a mock API server and a webpack dev server simultaneously. Use ``EXPERIMENT`` environment variable (e.g., ``mnist-tfv1-running``\ ) to specify the mock data being used. Built-in mock experiments are listed in ``src/webui/mock``. An example of the full command is ``EXPERIMENT=mnist-tfv1-running yarn dev``.
TypeScript (Windows)
********************
Currently you must rebuild the TypeScript modules with ``python3 setup.py build_ts`` after each edit.
5. Submit Pull Request
^^^^^^^^^^^^^^^^^^^^^^
All changes are merged into the master branch from your forked repo. The description of a pull request must be meaningful and useful.
We will review the changes as soon as possible. Once it passes review, we will merge it to master branch.
For more contribution guidelines and coding styles, you can refer to the `contributing document <Contributing.rst>`__.
Since NNI v2.2, you can launch a TensorBoard process across one or more trials from the webUI. For now, this feature supports the local training service and reuse-mode training services with shared storage; more scenarios will be supported in later NNI versions.
Preparation
-----------
Make sure TensorBoard is installed in your environment. If you have never used TensorBoard, here are getting-started tutorials for your reference: `TensorBoard with TensorFlow <https://www.tensorflow.org/tensorboard/get_started>`__ and `TensorBoard with PyTorch <https://pytorch.org/tutorials/recipes/recipes/tensorboard_with_pytorch.html>`__.
Use the WebUI to Launch TensorBoard
-----------------------------------
1. Save Logs
^^^^^^^^^^^^
NNI automatically fetches the ``tensorboard`` subfolder under a trial's output folder as the TensorBoard logdir, so in your trial source code you need to save the TensorBoard logs under ``NNI_OUTPUT_DIR/tensorboard``. This log path can be joined as shown below.
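A minimal sketch (``NNI_OUTPUT_DIR`` is the environment variable NNI sets for each trial):

.. code-block:: python

   import os

   # trials should write TensorBoard logs under NNI_OUTPUT_DIR/tensorboard
   log_dir = os.path.join(os.environ['NNI_OUTPUT_DIR'], 'tensorboard')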
2. Launch Tensorboard
^^^^^^^^^^^^^^^^^^^^^

As with the compare feature, first select the trials you want to combine, then click the ``Tensorboard`` button.
.. image:: ../../img/Tensorboard_1.png
:target: ../../img/Tensorboard_1.png
:alt:
After clicking the ``OK`` button in the pop-up box, you will be redirected to the TensorBoard portal.
.. image:: ../../img/Tensorboard_2.png
:target: ../../img/Tensorboard_2.png
:alt:
You can see the ``SequenceID-TrialID`` on the tensorboard portal.
.. image:: ../../img/Tensorboard_3.png
:target: ../../img/Tensorboard_3.png
:alt:
3. Stop All
^^^^^^^^^^^^
If you want to open a portal you have already launched, click its TensorBoard ID. If you don't need TensorBoard anymore, click the ``Stop all tensorboard`` button.
* On the overview tab, you can see the experiment information and status and the performance of ``top trials``.
.. image:: ../../img/webui-img/full-oview.png
:target: ../../img/webui-img/full-oview.png
:alt: overview
* If you want to see the experiment's search space and config, click the ``Search space`` and ``Config`` buttons on the right (the labels appear when you hover over the buttons).
1. Search space file:
.. image:: ../../img/webui-img/searchSpace.png
:target: ../../img/webui-img/searchSpace.png
:alt: searchSpace
2. Config file:
.. image:: ../../img/webui-img/config.png
:target: ../../img/webui-img/config.png
:alt: config
* You can view and download the ``nni-manager/dispatcher log files`` here.
.. image:: ../../img/webui-img/review-log.png
:target: ../../img/webui-img/review-log.png
:alt: logfile
* If your experiment has many trials, you can change the refresh interval here.
* You can review and download the experiment results (``experiment config``, ``trial message`` and ``intermediate metrics``) by clicking the ``Experiment summary`` button.
.. image:: ../../img/webui-img/summary.png
:target: ../../img/webui-img/summary.png
:alt: summary
* You can change some experiment configurations such as ``maxExecDuration``, ``maxTrialNum`` and ``trial concurrency`` here.
The trial may produce many intermediate results during training. To see the trend of some trials more clearly, we provide a filtering function for the intermediate result graph.

You may find that some trials get better or worse at a particular intermediate result. This indicates that it is an important and relevant intermediate result. To take a closer look at this point, enter its corresponding X-value at #Intermediate, then input the range of metrics at this intermediate result. In the picture below, we choose the No. 4 intermediate result and set the range of metrics to 0.8-1.
* The ``Add column`` button selects which columns to show in the table. If you run an experiment whose final result is a dict, you can see the other keys in the table. You can choose the ``Intermediate count`` column to watch a trial's progress.
.. image:: ../../img/webui-img/addColumn.png
:target: ../../img/webui-img/addColumn.png
:alt: addColumnGraph
* If you want to compare some trials, you can select them and then click ``Compare`` to see the results.
* You can use the button named ``Copy as python`` to copy the trial's parameters.
.. image:: ../../img/webui-img/copyParameter.png
:target: ../../img/webui-img/copyParameter.png
:alt: copyTrialParameters
* You can see a trial's logs on the ``Log`` tab. In local mode there are three buttons: ``View trial log``, ``View trial error`` and ``View trial stdout``. If you run on the OpenPAI or Kubeflow platform, you can see the trial stdout and NFS log.
* Intermediate Result Graph: you can see the default metric in this graph by clicking the intermediate button.
.. image:: ../../img/webui-img/intermediate.png
:target: ../../img/webui-img/intermediate.png
:alt: intermeidateGraph
* Kill: you can kill a job whose status is running.
.. image:: ../../img/webui-img/kill-running.png
:target: ../../img/webui-img/kill-running.png
:alt: killTrial
* Customized trial: you can change a trial's parameters and then submit them to the experiment. If you want to rerun a failed trial, you can submit the same parameters to the experiment.