Commit c6005142 authored by QuanluZhang, committed by GitHub

[doc] experiment doc refactor (#4617)

parent ebd56271
Experiment Management
=====================
TBD
\ No newline at end of file
An experiment can be created with the command line tool ``nnictl`` or with the Python APIs. NNI provides both ``nnictl`` and a web portal for managing experiments: creating, stopping, resuming, deleting, ranking, and comparing them.
Management with ``nnictl``
--------------------------
``nnictl`` offers almost the same experiment management capabilities as :doc:`./webui`. Refer to :doc:`../reference/nnictl` for detailed usage. ``nnictl`` is highly recommended when visualization is not well supported in your environment (e.g., no GUI on your machine).
Management with web portal
--------------------------
Experiment management on the web portal gives a quick overview of all the experiments on the user's machine. Users can easily switch to any experiment from this page. Refer to :ref:`exp-manage-webportal` for details. Experiment management on the web portal is still under intensive development to bring more user-friendly features.
\ No newline at end of file
Hybrid Training Service
=======================
TBD
\ No newline at end of file
Hybrid training service aggregates different types of computation resources into a virtually unified resource pool, to which trial jobs are dispatched. It lets users pool all of their available computation resources to work jointly on one AutoML task, and it is flexible enough to switch among different types of computation resources. For example, NNI could submit trial jobs to multiple remote machines and AML simultaneously.
Prerequisite
------------
NNI supports :doc:`./local`, :doc:`./remote`, :doc:`./openpai`, :doc:`./aml`, :doc:`./kubeflow`, and :doc:`./frameworkcontroller` in hybrid training service. Before starting an experiment using hybrid training service, users should first set up each chosen (sub) training service (e.g., remote training service) according to that training service's own document page.
Usage
-----
Unlike other training services (e.g., ``platform: remote`` in remote training service), hybrid training service has no dedicated keyword. Users simply list the configurations of their chosen training services under the ``trainingService`` field. Below is an example experiment configuration YAML for a hybrid training service that combines remote training service and local training service.

.. code-block:: yaml

   # the experiment config yaml file
   ...
   trainingService:
     - platform: remote
       machineList:
         - host: 127.0.0.1 # your machine's IP address
           user: bob
           password: bob
     - platform: local
   ...

A complete example configuration file can be found in :githublink:`examples/trials/mnist-pytorch/config_hybrid.yml`.
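
The scenario mentioned in the introduction, multiple remote machines working together with AML, could be sketched along the following lines. This is only an illustration: the hosts and credentials are placeholders, and the AML field names and values are assumptions that should be checked against :doc:`./aml` before use.

.. code-block:: yaml

   # sketch only: every host, credential, and AML value is a placeholder
   trainingService:
     - platform: remote
       machineList:
         - host: 192.0.2.10        # first remote machine (placeholder address)
           user: bob
           password: bob
         - host: 192.0.2.11        # second remote machine (placeholder address)
           user: bob
           password: bob
     - platform: aml
       # field names below are assumed; verify them in the AML training service page
       dockerImage: <your-docker-image>
       subscriptionId: <your-subscription-id>
       resourceGroup: <your-resource-group>
       workspaceName: <your-workspace-name>
       computeTarget: <your-compute-target>

Each list entry under ``trainingService`` is configured exactly as it would be if that training service were used on its own.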
\ No newline at end of file
......@@ -14,9 +14,12 @@ Usage

.. code-block:: yaml

   # the experiment config yaml file
   ...
   trainingService:
     platform: local
     useActiveGpu: false # optional
   ...

Local training service supports other fields as well, such as ``maxTrialNumberPerGpu``, for concurrently running multiple trials on one GPU, and ``gpuIndices``, for running trials on a subset of the GPUs on your machine. Please refer to :ref:`reference-local-config-label` in the reference for detailed usage.
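
As a minimal sketch of the two fields named above, the fragment below lets two trials share each GPU and restricts trials to the GPUs at indices 0 and 1. The specific values are illustrative assumptions, not recommendations; consult the reference for the accepted formats.

.. code-block:: yaml

   # illustrative values only (assumed for this sketch)
   trialGpuNumber: 1
   trialConcurrency: 4
   trainingService:
     platform: local
     useActiveGpu: false
     maxTrialNumberPerGpu: 2   # up to two trials may share one GPU
     gpuIndices: [0, 1]        # only use GPUs 0 and 1 on this machine

With these settings, two GPUs each hosting up to two single-GPU trials would in principle allow the four concurrent trials requested by ``trialConcurrency``.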
......@@ -28,12 +31,9 @@ Then we explain how local training service works with different configurations o

.. code-block:: yaml

   ...
   trialGpuNumber: 1
   trialConcurrency: 4
   ...
   trainingService:
     platform: local
     useActiveGpu: false

......
Visualize Trial with Netron
===========================
TBD
\ No newline at end of file
Web Portal
==========
The web portal lets users conveniently visualize their NNI experiments, including tuning and training progress, detailed metrics, and error logs. It also lets users control their NNI experiments and trials, for example by updating an experiment's concurrency or duration, or by rerunning trials.

.. toctree::
   :hidden:

   Experiment Web Portal <webui>
   Visualize with TensorBoard <tensorboard>
   Visualize with Netron <netron>

\ No newline at end of file
......@@ -89,6 +89,8 @@ How to use dict intermediate result
`The discussion <https://github.com/microsoft/nni/discussions/4289>`_ could help you.
.. _exp-manage-webportal:
Experiments management
----------------------
......
.. 1f92c8fa8fbaa1300343c01b53541b92
.. ebf0627529ecdbf758f9db38701b4225
Web Portal
==========
......@@ -90,6 +90,7 @@ Q&A
`The discussion <https://github.com/microsoft/nni/discussions/4289>`_ could help you.
.. _exp-manage-webportal:
Experiments management
----------------------
......
......@@ -17,7 +17,7 @@ Neural Network Intelligence
.. toctree::
:maxdepth: 2
:caption: Advanced Materials
:caption: Full-scale Materials
:hidden:
Hyperparameter Optimization <hpo/index>
......@@ -33,7 +33,6 @@ Neural Network Intelligence
nnictl Commands <reference/nnictl>
Experiment Configuration <reference/experiment_config>
Experiment Configuration (legacy) <Tutorial/ExperimentConfig>
Python API <reference/_modules/nni>
.. toctree::
......
.. 16313ff0f7a4b190c06f8a388509a199
.. 1a07f898d4bde4f9566abae45239cad9
###########################
Neural Network Intelligence
......
......@@ -10,7 +10,6 @@ References
nnictl Commands <reference/nnictl>
Experiment Configuration <reference/experiment_config>
Experiment Configuration (legacy) <Tutorial/ExperimentConfig>
SDK API References <sdk_reference>
Supported Framework Library <SupportedFramework_Library>
Launch from Python <Tutorial/HowToLaunchFromPython>
.. ebdb4f520eb0601c779312975a205bdc
.. 435c9eaf84995753a99fb489190daaa6
:orphan:
......@@ -10,7 +10,6 @@
nnictl Commands <reference/nnictl>
Experiment Configuration <reference/experiment_config>
Experiment Configuration (legacy) <Tutorial/ExperimentConfig>
SDK API References <sdk_reference>
Supported Frameworks and Libraries <SupportedFramework_Library>
Launch Experiments from Python <Tutorial/HowToLaunchFromPython>