the source code of NNI for DCU

Commit 1011377c authored Mar 31, 2022 by qianyj
20 changed files
--- a/docs/en_US/NAS/Mutators.rst
+++ b/docs/en_US/NAS/Mutators.rst
+Express Mutations with Mutators
+===============================
+Besides the inline mutation APIs demonstrated `here <./MutationPrimitives.rst>`__, NNI provides a more general approach to expressing a model space, the *Mutator*, which covers more complex model spaces. The inline mutation APIs are themselves implemented with mutators in the underlying system, so they can be seen as a special case of model mutation.
+.. note:: Mutator and inline mutation APIs cannot be used together.
+A mutator is a piece of logic to express how to mutate a given model. Users are free to write their own mutators. Then a model space is expressed with a base model and a list of mutators. A model in the model space is sampled by applying the mutators on the base model one after another. An example is shown below.
+.. code-block:: python
+  applied_mutators = []
+  applied_mutators.append(BlockMutator('mutable_0'))
+  applied_mutators.append(BlockMutator('mutable_1'))
+``BlockMutator`` is defined by users to express how to mutate the base model. 
+Write a mutator
+---------------
+A user-defined mutator should inherit from the ``Mutator`` class and implement its mutation logic in the member function ``mutate``.
+.. code-block:: python
+  from typing import List
+  from nni.retiarii import Mutator
+  class BlockMutator(Mutator):
+    def __init__(self, target: str, candidates: List):
+      super().__init__()
+      # label of the node(s) to mutate, and the operations to choose from
+      self.target = target
+      self.candidate_op_list = candidates
+    def mutate(self, model):
+      # ``model`` is the graph IR of the base model
+      nodes = model.get_nodes_by_label(self.target)
+      for node in nodes:
+        # pick one candidate operation for each matching node
+        chosen_op = self.choice(self.candidate_op_list)
+        node.update_operation(chosen_op.type, chosen_op.params)
+The input of ``mutate`` is the graph IR (Intermediate Representation) of the base model (please refer to `here <./ApiReference.rst>`__ for the format and APIs of the IR). Users can mutate the graph using the graph's member functions (e.g., ``get_nodes_by_label``, ``update_operation``). The mutation operations can be combined with the API ``self.choice`` to express a set of possible mutations. In the above example, the node's operation can be changed to any operation from ``candidate_op_list``.
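+The idea of sampling a model by applying mutators one after another can be sketched without NNI at all. The following is only a toy illustration: ``ToyMutator``, ``sample_model``, the dict-based "model" and the operation names are all hypothetical and are not part of the NNI API.

```python
import random


class ToyMutator:
    """Hypothetical stand-in for a mutator: fills one labelled slot of a dict-based model."""

    def __init__(self, target, candidates):
        self.target = target
        self.candidates = candidates

    def mutate(self, model, rng):
        # Work on a copy so the base model itself is never modified.
        mutated = dict(model)
        mutated[self.target] = rng.choice(self.candidates)
        return mutated


def sample_model(base_model, mutators, seed=None):
    """Apply the mutators one after another to obtain one model of the space."""
    rng = random.Random(seed)
    model = base_model
    for mutator in mutators:
        model = mutator.mutate(model, rng)
    return model


base_model = {'mutable_0': None, 'mutable_1': None}
candidates = ['conv3x3', 'conv5x5', 'maxpool']  # made-up operation names
applied_mutators = [ToyMutator('mutable_0', candidates),
                    ToyMutator('mutable_1', candidates)]
sampled = sample_model(base_model, applied_mutators, seed=0)
print(sampled)
```

+Each call with a different seed yields a different member of the model space, while the base model stays unchanged.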
+Use a placeholder to make mutation easier: ``nn.Placeholder``. If you want to mutate a subgraph or node of your model, you can define a placeholder in the model to represent that subgraph or node. Then use a mutator to replace the placeholder with real modules.
+.. code-block:: python
+  ph = nn.Placeholder(
+    label='mutable_0',
+    kernel_size_options=[1, 3, 5],
+    n_layer_options=[1, 2, 3, 4],
+    exp_ratio=exp_ratio,
+    stride=stride
+  )
+``label`` is used by the mutator to identify this placeholder. The other parameters carry the information required by the mutator. They can be accessed from ``node.operation.parameters`` as a dict, which can include any information that users want to pass to a user-defined mutator. The complete example code can be found in :githublink:`Mnasnet base model <examples/nas/multi-trial/mnasnet/base_mnasnet.py>`.
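+As a rough sketch of how a mutator might consume such a parameter dict, the snippet below picks one value for every ``*_options`` entry and passes the other values through. The helper ``choose_block_config`` and the concrete values of ``exp_ratio`` and ``stride`` are assumptions made for illustration; a plain dict stands in for ``node.operation.parameters``.

```python
import random

# Hypothetical contents of ``node.operation.parameters`` for the placeholder above.
# ``exp_ratio`` and ``stride`` are given made-up values for illustration.
parameters = {
    'kernel_size_options': [1, 3, 5],
    'n_layer_options': [1, 2, 3, 4],
    'exp_ratio': 6,
    'stride': 2,
}


def choose_block_config(parameters, rng):
    """Pick one option per ``*_options`` entry and pass the remaining values through."""
    config = {}
    for key, value in parameters.items():
        if key.endswith('_options'):
            config[key[:-len('_options')]] = rng.choice(value)
        else:
            config[key] = value
    return config


config = choose_block_config(parameters, random.Random(0))
print(config)
```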
+Starting an experiment is almost the same as using inline mutation APIs. The only difference is that the applied mutators should be passed to ``RetiariiExperiment``. Below is a simple example.
+.. code-block:: python
+  exp = RetiariiExperiment(base_model, trainer, applied_mutators, simple_strategy)
+  exp_config = RetiariiExeConfig('local')
+  exp_config.experiment_name = 'mnasnet_search'
+  exp_config.trial_concurrency = 2
+  exp_config.max_trial_number = 10
+  exp_config.training_service.use_active_gpu = False
+  exp.run(exp_config, 8081)
--- a/docs/en_US/NAS/OneshotTrainer.rst
+++ b/docs/en_US/NAS/OneshotTrainer.rst
+One-shot NAS
+============
+Before reading this tutorial, we highly recommend that you first go through the tutorial on how to `define a model space <./QuickStart.rst#define-your-model-space>`__.
+Model Search with One-shot Trainer
+----------------------------------
+With a defined model space, users can explore the space in two ways. One is using a strategy and a single-arch evaluator as demonstrated `here <./QuickStart.rst#explore-the-defined-model-space>`__. The other is using a one-shot trainer, which consumes much less computational resource than the first one. In this tutorial we focus on the one-shot approach. The principle of the one-shot approach is to combine all the models in a model space into one big model (usually called a super-model or super-graph). The trainer takes charge of search, training and testing by training and evaluating this big model.
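+The super-model principle can be illustrated with a deliberately tiny, framework-free sketch. Everything here (``SPACE``, ``shared_weights``, the scalar "operations") is hypothetical; the point is only that all candidates live in one shared parameter table and that every sub-model reads from it.

```python
import random

SPACE = {
    'layer1': ['scale', 'shift'],
    'layer2': ['scale', 'shift', 'skip'],
}

# One shared parameter per (layer, candidate): this table is the toy "super-model".
shared_weights = {(layer, op): 1.0 for layer, cands in SPACE.items() for op in cands}


def forward(architecture, x):
    """Run the sub-model selected by ``architecture`` using the shared parameters."""
    for layer, op in architecture.items():
        w = shared_weights[(layer, op)]
        if op == 'scale':
            x = w * x
        elif op == 'shift':
            x = x + w
        # 'skip' leaves x unchanged
    return x


def sample_architecture(rng):
    return {layer: rng.choice(cands) for layer, cands in SPACE.items()}


rng = random.Random(0)
arch = sample_architecture(rng)
# "Training" one sampled path only touches the parameters on that path ...
for layer, op in arch.items():
    shared_weights[(layer, op)] += 0.5
# ... and every other sub-model that reuses those candidates sees the update.
print(arch, forward(arch, 2.0))
```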
+We list the supported one-shot trainers here:
+* DARTS trainer
+* ENAS trainer
+* ProxylessNAS trainer
+* Single-path (random) trainer
+See `API reference <./ApiReference.rst>`__ for detailed usage. Here, we show an example that uses the DARTS trainer manually.
+.. code-block:: python
+  from nni.retiarii.oneshot.pytorch import DartsTrainer
+  trainer = DartsTrainer(
+      model=model,
+      loss=criterion,
+      metrics=lambda output, target: accuracy(output, target, topk=(1,)),
+      optimizer=optim,
+      num_epochs=args.epochs,
+      dataset=dataset_train,
+      batch_size=args.batch_size,
+      log_frequency=args.log_frequency,
+      unrolled=args.unrolled
+  )
+  trainer.fit()
+  final_architecture = trainer.export()
+After the search is done, we can use the exported architecture to instantiate the full network for retraining. Here is an example:
+.. code-block:: python
+    from nni.retiarii import fixed_arch
+    with fixed_arch('/path/to/checkpoint.json'):
+        model = Model()
--- a/docs/en_US/NAS/Overview.rst
+++ b/docs/en_US/NAS/Overview.rst
+Retiarii for Neural Architecture Search
+=======================================
+.. attention:: NNI's latest NAS support is all based on the Retiarii framework. Users who are still on an `early version using NNI NAS v1.0 <https://nni.readthedocs.io/en/v2.2/nas.html>`__ should migrate their work to Retiarii as soon as possible.
+.. contents::
+Motivation
+----------
+Automatic neural architecture search is playing an increasingly important role in finding better models. Recent research has proven the feasibility of automatic NAS and has led to models that beat many manually designed and tuned models. Representative works include `NASNet <https://arxiv.org/abs/1707.07012>`__\ , `ENAS <https://arxiv.org/abs/1802.03268>`__\ , `DARTS <https://arxiv.org/abs/1806.09055>`__\ , `Network Morphism <https://arxiv.org/abs/1806.10282>`__\ , and `Evolution <https://arxiv.org/abs/1703.01041>`__. In addition, new innovations continue to emerge.
+However, it is pretty hard to use existing NAS work to help develop common DNN models. Therefore, we designed `Retiarii <https://www.usenix.org/system/files/osdi20-zhang_quanlu.pdf>`__, a novel NAS/HPO framework, and implemented it in NNI. It helps users easily construct a model space (or search space, tuning space), and utilize existing NAS algorithms. The framework also facilitates NAS innovation and is used to design new NAS algorithms.
+Overview
+--------
+There are three key characteristics of the Retiarii framework:
+* Simple APIs are provided for defining model search space within PyTorch/TensorFlow model.
+* SOTA NAS algorithms are built-in to be used for exploring model search space.
+* System-level optimizations are implemented for speeding up the exploration.
+There are two types of model space exploration approaches: **Multi-trial NAS** and **One-shot NAS**. Multi-trial NAS trains each sampled model in the model space independently, while One-shot NAS samples models from a super-model. After constructing the model space, users can use either exploration approach to explore it.
+Multi-trial NAS
+---------------
+Multi-trial NAS means each sampled model from the model space is trained independently. A typical multi-trial NAS is `NASNet <https://arxiv.org/abs/1707.07012>`__. The algorithm to sample models from the model space is called the exploration strategy. NNI supports the following exploration strategies for multi-trial NAS.
+.. list-table::
+   :header-rows: 1
+   :widths: auto
+   * - Exploration Strategy Name
+     - Brief Introduction of Algorithm
+   * - Random Strategy
+     - Randomly sampling new model(s) from user defined model space. (``nni.retiarii.strategy.Random``)
+   * - Grid Search
+     - Sampling new model(s) from user defined model space using grid search algorithm. (``nni.retiarii.strategy.GridSearch``)
+   * - Regularized Evolution
+     - Generating new model(s) from generated models using `regularized evolution algorithm <https://arxiv.org/abs/1802.01548>`__ . (``nni.retiarii.strategy.RegularizedEvolution``)
+   * - TPE Strategy
+     - Sampling new model(s) from user defined model space using `TPE algorithm <https://papers.nips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf>`__ . (``nni.retiarii.strategy.TPEStrategy``)
+   * - RL Strategy
+     - It uses `PPO algorithm <https://arxiv.org/abs/1707.06347>`__ to sample new model(s) from user defined model space. (``nni.retiarii.strategy.PolicyBasedRL``)
+Please refer to `here <./multi_trial_nas.rst>`__ for detailed usage of multi-trial NAS.
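+To make the difference between the first two strategies in the table concrete, here is a small framework-free sketch of grid search and random sampling over a toy space. The space, the function names and the ``dedup`` flag of this helper are illustrative assumptions and do not reproduce the behaviour of ``nni.retiarii.strategy``.

```python
import itertools
import math
import random

space = {'kernel_size': [3, 5, 7], 'hidden_units': [64, 128]}


def grid_search(space):
    """Enumerate every combination of the space exactly once, in a fixed order."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def random_search(space, n, seed=None, dedup=True):
    """Draw up to ``n`` samples uniformly; with ``dedup`` no model is returned twice."""
    rng = random.Random(seed)
    total = math.prod(len(options) for options in space.values())
    seen, results = set(), []
    while len(results) < n:
        if dedup and len(seen) == total:
            break  # the whole space has been visited
        sample = {k: rng.choice(options) for k, options in space.items()}
        key = tuple(sample.items())
        if dedup and key in seen:
            continue
        seen.add(key)
        results.append(sample)
    return results


grid_models = list(grid_search(space))
random_models = random_search(space, n=10, seed=0)
print(len(grid_models), len(random_models))
```

+Grid search is exhaustive and deterministic, whereas random sampling can stop after any number of trials; with deduplication it can never return more models than the space contains.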
+One-shot NAS
+------------
+One-shot NAS means building model space into a super-model, training the super-model with weight sharing, and then sampling models from the super-model to find the best one. `DARTS <https://arxiv.org/abs/1806.09055>`__ is a typical one-shot NAS.
+Below are the supported one-shot NAS algorithms. More one-shot NAS algorithms will be supported soon.
+.. list-table::
+   :header-rows: 1
+   :widths: auto
+   * - One-shot Algorithm Name
+     - Brief Introduction of Algorithm
+   * - `ENAS <ENAS.rst>`__
+     - `Efficient Neural Architecture Search via Parameter Sharing <https://arxiv.org/abs/1802.03268>`__. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. It uses parameter sharing between child models to achieve fast speed and excellent performance.
+   * - `DARTS <DARTS.rst>`__
+     - `DARTS: Differentiable Architecture Search <https://arxiv.org/abs/1806.09055>`__ introduces a novel algorithm for differentiable network architecture search based on bilevel optimization.
+   * - `SPOS <SPOS.rst>`__
+     - `Single Path One-Shot Neural Architecture Search with Uniform Sampling <https://arxiv.org/abs/1904.00420>`__ constructs a simplified supernet trained with a uniform path sampling method and applies an evolutionary algorithm to efficiently search for the best-performing architectures.
+   * - `ProxylessNAS <Proxylessnas.rst>`__
+     - `ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware <https://arxiv.org/abs/1812.00332>`__. It removes the proxy and directly learns the architectures for large-scale target tasks and target hardware platforms.
+Please refer to `here <one_shot_nas.rst>`__ for detailed usage of one-shot NAS algorithms.
+Reference and Feedback
+----------------------
+* `Quick Start <./QuickStart.rst>`__ ;
+* `Construct Your Model Space <./construct_space.rst>`__ ;
+* `Retiarii: A Deep Learning Exploratory-Training Framework <https://www.usenix.org/system/files/osdi20-zhang_quanlu.pdf>`__ ;
+* To `report a bug <https://github.com/microsoft/nni/issues/new?template=bug-report.rst>`__ for this feature in GitHub ;
+* To `file a feature or improvement request <https://github.com/microsoft/nni/issues/new?template=enhancement.rst>`__ for this feature in GitHub .
--- a/docs/en_US/NAS/Proxylessnas.rst
+++ b/docs/en_US/NAS/Proxylessnas.rst
+ProxylessNAS on NNI
+===================
+Introduction
+------------
+The paper `ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware <https://arxiv.org/pdf/1812.00332.pdf>`__ removes the proxy and directly learns the architectures for large-scale target tasks and target hardware platforms. The authors address the high memory consumption of differentiable NAS and reduce the computational cost to the same level as regular training while still allowing a large candidate set. Please refer to the paper for the details.
+Usage
+-----
+To use the ProxylessNAS training/searching approach, users need to specify the search space in their model using the `NNI NAS interface <./MutationPrimitives.rst>`__\ , e.g., ``LayerChoice``\ , ``InputChoice``. After defining and instantiating the model, the remaining work can be left to ProxylessNasTrainer by instantiating the trainer and passing the model to it.
+.. code-block:: python
+   trainer = ProxylessTrainer(model,
+                              loss=LabelSmoothingLoss(),
+                              dataset=None,
+                              optimizer=optimizer,
+                              metrics=lambda output, target: accuracy(output, target, topk=(1, 5,)),
+                              num_epochs=120,
+                              log_frequency=10,
+                              grad_reg_loss_type=args.grad_reg_loss_type,
+                              grad_reg_loss_params=grad_reg_loss_params,
+                              applied_hardware=args.applied_hardware,
+                              dummy_input=(1, 3, 224, 224),
+                              ref_latency=args.reference_latency)
+   trainer.train()
+   trainer.export(args.arch_path)
+The complete example code can be found :githublink:`here <examples/nas/oneshot/proxylessnas>`.
+**Input arguments of ProxylessNasTrainer**
+* **model** (*PyTorch model, required*\ ) - The model that users want to tune/search. It has mutables to specify search space.
+* **metrics** (*PyTorch module, required*\ ) - The main term of the loss function for model training. Receives logits and ground truth labels, and returns a loss tensor.
+* **optimizer** (*PyTorch Optimizer, required*\) - The optimizer used for optimizing the model.
+* **num_epochs** (*int, optional, default = 120*\ ) - The number of epochs to train/search.
+* **dataset** (*PyTorch dataset, required*\ ) - Dataset for training. Will be split for training weights and architecture weights.
+* **warmup_epochs** (*int, optional, default = 0*\ ) - The number of epochs to do during warmup.
+* **batch_size** (*int, optional, default = 64*\ ) - Batch size.
+* **workers** (*int, optional, default = 4*\ ) - Workers for data loading.
+* **device** (*device, optional, default = 'cpu'*\ ) - The devices that users provide to do the train/search. The trainer applies data parallel on the model for users.
+* **log_frequency** (*int, optional, default = None*\ ) - Step count per logging.
+* **arc_learning_rate** (*float, optional, default = 1e-3*\ ) - The learning rate of the architecture parameters optimizer.
+* **grad_reg_loss_type** (*'mul#log', 'add#linear', or None, optional, default = 'add#linear'*\ ) - Regularization type to add hardware related loss. The trainer will not apply loss regularization when grad_reg_loss_type is set as None.
+* **grad_reg_loss_params** (*dict, optional, default = None*\ ) - Regularization params. 'alpha' and 'beta' are required when ``grad_reg_loss_type`` is 'mul#log'; 'lambda' is required when ``grad_reg_loss_type`` is 'add#linear'.
+* **applied_hardware** (*string, optional, default = None*\ ) - The hardware used to constrain the model's latency. Latency is predicted by Microsoft nn-Meter (https://github.com/microsoft/nn-Meter).
+* **dummy_input** (*tuple, optional, default = (1, 3, 224, 224)*\ ) - The dummy input shape when applied to the target hardware.
+* **ref_latency** (*float, optional, default = 65.0*\ ) - Reference latency value in the applied hardware (ms).
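+The two regularization types can be read as two ways of combining the task loss with a latency term. The sketch below assumes the commonly used forms (a multiplicative log-ratio term for 'mul#log' and an additive relative-difference term for 'add#linear'); the function ``regularized_loss`` and the exact formulas are assumptions for illustration, so please check the trainer source for the authoritative definition.

```python
import math


def regularized_loss(ce_loss, expected_latency, ref_latency, loss_type, params=None):
    """Combine a task loss with a latency penalty (assumed formulas, latencies in ms)."""
    params = params or {}
    if loss_type is None:
        # No hardware-related regularization.
        return ce_loss
    if loss_type == 'mul#log':
        # Assumed form: alpha * loss * (log(latency) / log(ref_latency)) ** beta.
        # Both latencies are expected to be greater than 1 so the logs are positive.
        alpha, beta = params['alpha'], params['beta']
        ratio = math.log(expected_latency) / math.log(ref_latency)
        return alpha * ce_loss * ratio ** beta
    if loss_type == 'add#linear':
        # Assumed form: loss + lambda * (latency - ref_latency) / ref_latency.
        lam = params['lambda']
        return ce_loss + lam * (expected_latency - ref_latency) / ref_latency
    raise ValueError(f'Unknown grad_reg_loss_type: {loss_type!r}')


# A model that is twice as slow as the reference gets a larger loss in both modes.
print(regularized_loss(2.0, 130.0, 65.0, 'mul#log', {'alpha': 1.0, 'beta': 0.6}))
print(regularized_loss(2.0, 130.0, 65.0, 'add#linear', {'lambda': 0.2}))
```

+In both assumed forms, a model whose predicted latency equals ``ref_latency`` is left with its unregularized loss (for 'mul#log' when ``alpha`` is 1).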
+Implementation
+--------------
+The implementation on NNI is based on the `official implementation <https://github.com/mit-han-lab/ProxylessNAS>`__. The official implementation supports two training approaches: gradient descent and RL based. Our current implementation on NNI supports the gradient descent training approach. Complete support of ProxylessNAS is ongoing.
+The official implementation supports different target hardware, including 'mobile', 'cpu', 'gpu8', 'flops'. In the NNI repo, hardware latency prediction is supported by `Microsoft nn-Meter <https://github.com/microsoft/nn-Meter>`__. nn-Meter is an accurate inference latency predictor for DNN models on diverse edge devices. nn-Meter currently supports four hardware targets: *'cortexA76cpu_tflite21'*, *'adreno640gpu_tflite21'*, *'adreno630gpu_tflite21'*, and *'myriadvpu_openvino2019r2'*. More hardware will be supported in the future. Users can find more information about nn-Meter on its website, and more details about applying ``nn-Meter`` `here <./HardwareAwareNAS.rst>`__ .
+Below we will describe implementation details. Like other one-shot NAS algorithms on NNI, ProxylessNAS is composed of two parts: *search space* and *training approach*. For users to flexibly define their own search space and use built-in ProxylessNAS training approach, we put the specified search space in :githublink:`example code <examples/nas/oneshot/proxylessnas>` using :githublink:`NNI NAS interface <nni/retiarii/oneshot/pytorch/proxyless>`.
+.. image:: ../../img/proxylessnas.png
+   :target: ../../img/proxylessnas.png
+   :alt: 
+ProxylessNAS training approach is composed of ProxylessLayerChoice and ProxylessNasTrainer. ProxylessLayerChoice instantiates MixedOp for each mutable (i.e., LayerChoice), and manages architecture weights in MixedOp. **For DataParallel**\ , architecture weights should be included in the user model. Specifically, in the ProxylessNAS implementation, we add MixedOp to the corresponding mutable (i.e., LayerChoice) as a member variable. The ProxylessLayerChoice class also exposes two member functions, i.e., ``resample``\ , ``finalize_grad``\ , for the trainer to control the training of architecture weights.
+ProxylessNasMutator also implements the forward logic of the mutables (i.e., LayerChoice).
+Reproduce Results
+-----------------
+To reproduce the result, we first ran the search. We found that although it runs for many epochs, the chosen architecture converges within the first several epochs. This is probably caused by the hyper-parameters or the implementation; we are working on it.
\ No newline at end of file
--- a/docs/en_US/NAS/QuickStart.rst
+++ b/docs/en_US/NAS/QuickStart.rst
+Quick Start of Retiarii on NNI
+==============================
+.. contents::
+In this quick start, we use multi-trial NAS as an example to show how to construct and explore a model space. There are mainly three crucial components for a neural architecture search task, namely,
+* Model search space that defines a set of models to explore.
+* A proper strategy as the method to explore this model space.
+* A model evaluator that reports the performance of every model in the space.
+The tutorial for One-shot NAS can be found `here <./OneshotTrainer.rst>`__.
+Currently, PyTorch is the only framework supported by Retiarii, and we have only tested **PyTorch 1.7 to 1.10**. This documentation assumes a PyTorch context, but it should also apply to other frameworks, support for which is in our future plans.
+Define your Model Space
+-----------------------
+Model space is defined by users to express a set of models that users want to explore, which contains potentially good-performing models. In this framework, a model space is defined with two parts: a base model and possible mutations on the base model.
+Define Base Model
+^^^^^^^^^^^^^^^^^
+Defining a base model is almost the same as defining a PyTorch (or TensorFlow) model. Usually, you only need to replace the code ``import torch.nn as nn`` with ``import nni.retiarii.nn.pytorch as nn`` to use our wrapped PyTorch modules.
+Below is a very simple example of defining a base model.
+.. code-block:: python
+  import torch
+  import torch.nn.functional as F
+  import nni.retiarii.nn.pytorch as nn
+  from nni.retiarii import model_wrapper
+  @model_wrapper      # this decorator should be the outermost one
+  class Net(nn.Module):
+    def __init__(self):
+      super().__init__()
+      self.conv1 = nn.Conv2d(1, 32, 3, 1)
+      self.conv2 = nn.Conv2d(32, 64, 3, 1)
+      self.dropout1 = nn.Dropout(0.25)
+      self.dropout2 = nn.Dropout(0.5)
+      self.fc1 = nn.Linear(9216, 128)
+      self.fc2 = nn.Linear(128, 10)
+    def forward(self, x):
+      x = F.relu(self.conv1(x))
+      x = F.max_pool2d(self.conv2(x), 2)
+      x = torch.flatten(self.dropout1(x), 1)
+      x = self.fc2(self.dropout2(F.relu(self.fc1(x))))
+      output = F.log_softmax(x, dim=1)
+      return output
+.. tip:: Always keep in mind that you should use ``import nni.retiarii.nn.pytorch as nn`` and :meth:`nni.retiarii.model_wrapper`. Many mistakes are a result of forgetting one of those. Also, please use ``torch.nn`` for submodules of ``nn.init``, e.g., ``torch.nn.init`` instead of ``nn.init``. 
+Define Model Mutations
+^^^^^^^^^^^^^^^^^^^^^^
+A base model is only one concrete model, not a model space. We provide `APIs and primitives <./MutationPrimitives.rst>`__ for users to express how the base model can be mutated, that is, to build a model space which includes many models.
+Based on the above base model, we can define a model space as below. 
+.. code-block:: diff
+  import torch
+  import torch.nn.functional as F
+  import nni.retiarii.nn.pytorch as nn
+  from nni.retiarii import model_wrapper
+  @model_wrapper
+  class Net(nn.Module):
+    def __init__(self):
+      super().__init__()
+      self.conv1 = nn.Conv2d(1, 32, 3, 1)
+  -   self.conv2 = nn.Conv2d(32, 64, 3, 1)
+  +   self.conv2 = nn.LayerChoice([
+  +       nn.Conv2d(32, 64, 3, 1),
+  +       DepthwiseSeparableConv(32, 64)
+  +   ])
+  -   self.dropout1 = nn.Dropout(0.25)
+  +   self.dropout1 = nn.Dropout(nn.ValueChoice([0.25, 0.5, 0.75]))
+      self.dropout2 = nn.Dropout(0.5)
+  -   self.fc1 = nn.Linear(9216, 128)
+  -   self.fc2 = nn.Linear(128, 10)
+  +   feature = nn.ValueChoice([64, 128, 256])
+  +   self.fc1 = nn.Linear(9216, feature)
+  +   self.fc2 = nn.Linear(feature, 10)
+    def forward(self, x):
+      x = F.relu(self.conv1(x))
+      x = F.max_pool2d(self.conv2(x), 2)
+      x = torch.flatten(self.dropout1(x), 1)
+      x = self.fc2(self.dropout2(F.relu(self.fc1(x))))
+      output = F.log_softmax(x, dim=1)
+      return output
+This example uses two mutation APIs, ``nn.LayerChoice`` and ``nn.ValueChoice``. ``nn.LayerChoice`` takes a list of candidate modules (two in this example), and one will be chosen for each sampled model. It can be used like a normal PyTorch module. ``nn.ValueChoice`` takes a list of candidate values, and one will be chosen to take effect for each sampled model.
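+The size of the model space defined above can be worked out by enumerating the independent choices. The following sketch uses plain strings and numbers as stand-ins for the candidates (it does not touch NNI); note that ``feature`` is one shared ``ValueChoice`` used by both ``fc1`` and ``fc2``, so it counts once.

```python
import itertools

# Stand-ins for the three independent choices in the example above.
conv2_candidates = ['Conv2d(32, 64, 3, 1)', 'DepthwiseSeparableConv(32, 64)']
dropout1_rates = [0.25, 0.5, 0.75]
fc_features = [64, 128, 256]

model_space = [
    {'conv2': conv, 'dropout1': rate, 'feature': feature}
    for conv, rate, feature in itertools.product(conv2_candidates, dropout1_rates, fc_features)
]
print(len(model_space))  # 2 * 3 * 3 = 18 distinct models
```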
+More detailed API description and usage can be found `here <./construct_space.rst>`__ .
+.. note:: We are actively enriching the mutation APIs, to facilitate easy construction of model space. If the currently supported mutation APIs cannot express your model space, please refer to `this doc <./Mutators.rst>`__ for customizing mutators.
+Explore the Defined Model Space
+-------------------------------
+There are basically two exploration approaches: (1) search by evaluating each sampled model independently, which is the search approach in multi-trial NAS and (2) one-shot weight-sharing based search, which is used in one-shot NAS. We demonstrate the first approach in this tutorial. Users can refer to `here <./OneshotTrainer.rst>`__ for the second approach.
+First, users need to pick a proper exploration strategy to explore the defined model space. Second, users need to pick or customize a model evaluator to evaluate the performance of each explored model.
+Pick an exploration strategy
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Retiarii supports many `exploration strategies <./ExplorationStrategies.rst>`__.
+Simply choose (i.e., instantiate) an exploration strategy as below.
+.. code-block:: python
+  import nni.retiarii.strategy as strategy
+  search_strategy = strategy.Random(dedup=True)  # dedup=False if deduplication is not wanted
+Pick or customize a model evaluator
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+In the exploration process, the exploration strategy repeatedly generates new models. A model evaluator is for training and validating each generated model to obtain the model's performance. The performance is sent to the exploration strategy for the strategy to generate better models.
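+The interplay between strategy and evaluator can be pictured as a simple loop. The sketch below is framework-free and purely illustrative: the space, the made-up scoring function ``evaluate`` and the helper ``explore`` are assumptions, and the "strategy" is plain random sampling.

```python
import random

space = {'feature': [64, 128, 256], 'dropout': [0.25, 0.5, 0.75]}


def evaluate(model):
    """Hypothetical evaluator: a made-up score standing in for validation accuracy."""
    return model['feature'] / 256 - abs(model['dropout'] - 0.5)


def explore(space, evaluate, max_trials, seed=0):
    rng = random.Random(seed)
    history = []
    for _ in range(max_trials):
        # The strategy proposes a model (here: uniformly at random) ...
        model = {name: rng.choice(options) for name, options in space.items()}
        # ... and its score is recorded, so a smarter strategy could learn from it.
        history.append((evaluate(model), model))
    return max(history, key=lambda item: item[0])


best_score, best_model = explore(space, evaluate, max_trials=20)
print(best_score, best_model)
```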
+Retiarii provides `built-in model evaluators <./ModelEvaluators.rst>`__, but to start with, it is recommended to use ``FunctionalEvaluator``, that is, to wrap your own training and evaluation code in a single function. This function should receive a single model class and use ``nni.report_final_result`` to report the final score of this model.
+The example here creates a simple evaluator that runs on the MNIST dataset, trains for 3 epochs, and reports its validation accuracy.
+..  code-block:: python
+    import nni
+    import torch
+    from torch.utils.data import DataLoader
+    from torchvision import transforms
+    from torchvision.datasets import MNIST
+    from nni.retiarii.evaluator import FunctionalEvaluator
+    def evaluate_model(model_cls):
+      # "model_cls" is a class, need to instantiate
+      model = model_cls()
+      optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
+      transf = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
+      train_loader = DataLoader(MNIST('data/mnist', download=True, transform=transf), batch_size=64, shuffle=True)
+      test_loader = DataLoader(MNIST('data/mnist', download=True, train=False, transform=transf), batch_size=64)
+      device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
+      for epoch in range(3):
+        # train the model for one epoch
+        train_epoch(model, device, train_loader, optimizer, epoch)
+        # test the model for one epoch
+        accuracy = test_epoch(model, device, test_loader)
+        # call report intermediate result. Result can be float or dict
+        nni.report_intermediate_result(accuracy)
+      # report final test result
+      nni.report_final_result(accuracy)
+    # Create the evaluator
+    evaluator = FunctionalEvaluator(evaluate_model)
+The ``train_epoch`` and ``test_epoch`` here can be any customized function, where users can write their own training recipe. See :githublink:`examples/nas/multi-trial/mnist/search.py` for the full example.
+It is recommended that ``evaluate_model`` accept no additional arguments other than ``model_cls``. However, in the `advanced tutorial <./ModelEvaluators.rst>`__, we show how to use additional arguments in case you actually need them. In the future, we will support mutation on the arguments of evaluators, which is commonly called "hyper-parameter tuning".
+Launch an Experiment
+--------------------
+After all the above are prepared, it is time to start an experiment to do the model search. An example is shown below.
+.. code-block:: python
+  exp = RetiariiExperiment(base_model, evaluator, [], search_strategy)
+  exp_config = RetiariiExeConfig('local')
+  exp_config.experiment_name = 'mnist_search'
+  exp_config.trial_concurrency = 2
+  exp_config.max_trial_number = 20
+  exp_config.training_service.use_active_gpu = False
+  exp.run(exp_config, 8081)
+The complete code of this example can be found :githublink:`here <examples/nas/multi-trial/mnist/search.py>`. Users can also run Retiarii Experiment with `different training services <../training_services.rst>`__ besides ``local`` training service.
+Visualize the Experiment
+------------------------
+Users can visualize their experiment in the same way as visualizing a normal hyper-parameter tuning experiment. For example, open ``localhost:8081`` in your browser, where 8081 is the port that you set in ``exp.run``. Please refer to `here <../Tutorial/WebUI.rst>`__ for details.
+We support visualizing models with 3rd-party visualization engines (like `Netron <https://netron.app/>`__). This can be used by clicking ``Visualization`` in detail panel for each trial. Note that current visualization is based on `onnx <https://onnx.ai/>`__ , thus visualization is not feasible if the model cannot be exported into onnx. Built-in evaluators (e.g., Classification) will automatically export the model into a file. For your own evaluator, you need to save your file into ``$NNI_OUTPUT_DIR/model.onnx`` to make this work.
+Export Top Models
+-----------------
+Users can export top models after the exploration is done using ``export_top_models``.
+.. code-block:: python
+  for model_code in exp.export_top_models(formatter='dict'):
+    print(model_code)
+The output is a JSON object which records the mutation actions of the top model. If users want to output the source code of the top model, they can use the graph-based execution engine for the experiment by simply adding the following two lines.
+.. code-block:: python
+  exp_config.execution_engine = 'base'
+  export_formatter = 'code'
--- a/docs/en_US/NAS/SPOS.rst
+++ b/docs/en_US/NAS/SPOS.rst
+Single Path One-Shot (SPOS)
+===========================
+Introduction
+------------
+`Single Path One-Shot Neural Architecture Search with Uniform Sampling <https://arxiv.org/abs/1904.00420>`__ proposes a one-shot NAS method that addresses the difficulties in training one-shot NAS models by constructing a simplified supernet trained with a uniform path sampling method, so that all underlying architectures (and their weights) get trained fully and equally. An evolutionary algorithm is then applied to efficiently search for the best-performing architectures without any fine-tuning.
+The implementation on NNI is based on the `official repo <https://github.com/megvii-model/SinglePathOneShot>`__. We implement a trainer that trains the supernet and an evolution tuner that leverages the NNI framework to speed up the evolutionary search phase.
+Examples
+--------
+Here is a use case, which is the search space in paper. However, we applied latency limit instead of flops limit to perform the architecture search phase.
+:githublink:`Example code <examples/nas/oneshot/spos>`
+Requirements
+^^^^^^^^^^^^
+Prepare ImageNet in the standard format (follow the script `here <https://gist.github.com/BIGBALLON/8a71d225eff18d88e469e6ea9b39cef4>`__\ ). Linking it to ``data/imagenet`` will be more convenient.
+Download the checkpoint file from `here <https://1drv.ms/u/s!Am_mmG2-KsrnajesvSdfsq_cN48?e=aHVppN>`__ (maintained by `Megvii <https://github.com/megvii-model>`__\ ) if you don't want to retrain the supernet.
+Put ``checkpoint-150000.pth.tar`` under ``data`` directory.
+After preparation, it's expected to have the following code structure:
+.. code-block:: bash
+   spos
+   ├── architecture_final.json
+   ├── blocks.py
+   ├── data
+   │   ├── imagenet
+   │   │   ├── train
+   │   │   └── val
+   │   └── checkpoint-150000.pth.tar
+   ├── network.py
+   ├── readme.md
+   ├── supernet.py
+   ├── evaluation.py
+   ├── search.py
+   └── utils.py
+Step 1. Train Supernet
+^^^^^^^^^^^^^^^^^^^^^^
+.. code-block:: bash
+   python supernet.py
+This exports the checkpoint to the ``checkpoints`` directory for the next step.
+NOTE: The data loading used in the official repo is `slightly different from usual <https://github.com/megvii-model/SinglePathOneShot/issues/5>`__\ , as they use BGR tensors and intentionally keep the values between 0 and 255 to align with their own DL framework. The option ``--spos-preprocessing`` simulates the original behavior and enables you to use the pretrained checkpoints.
+Step 2. Evolution Search
+^^^^^^^^^^^^^^^^^^^^^^^^
+Single Path One-Shot leverages evolution algorithm to search for the best architecture. In the paper, the search module, which is responsible for testing the sampled architecture, recalculates all the batch norm for a subset of training images, and evaluates the architecture on the full validation set.
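+The general shape of such an evolutionary search can be sketched in a few lines of plain Python. This is a toy illustration only: the encoding (one block index per layer), the made-up ``fitness`` function, and all helper names are assumptions, and the sketch leaves out the batch-norm recalculation and the latency/FLOPs constraint used in practice.

```python
import random

NUM_LAYERS = 4
CHOICES = [0, 1, 2, 3]  # index of the candidate block chosen for each layer (toy values)


def fitness(arch):
    """Hypothetical stand-in for evaluating an architecture on the validation set."""
    return sum(arch)


def mutate(arch, rng, prob=0.25):
    # Re-draw each layer's choice with probability ``prob``.
    return tuple(rng.choice(CHOICES) if rng.random() < prob else gene for gene in arch)


def crossover(a, b, rng):
    # Take each layer's choice from one of the two parents.
    return tuple(rng.choice(pair) for pair in zip(a, b))


def evolution_search(generations=20, population_size=10, top_k=5, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.choice(CHOICES) for _ in range(NUM_LAYERS))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Keep the best architectures and refill the population from them.
        parents = sorted(population, key=fitness, reverse=True)[:top_k]
        children = []
        while len(children) < population_size - top_k:
            if rng.random() < 0.5:
                children.append(mutate(rng.choice(parents), rng))
            else:
                children.append(crossover(rng.choice(parents), rng.choice(parents), rng))
        population = parents + children
    return max(population, key=fitness)


best = evolution_search()
print(best, fitness(best))
```

+Because the best candidates are always carried over to the next generation, the best fitness in the population never decreases from one generation to the next.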
+In this example, we have an incomplete implementation of the evolution search. The example only supports training from scratch. Inheriting weights from a pretrained supernet is not supported yet. To search with the regularized evolution strategy, run
+.. code-block:: bash
+   python search.py
+The final architecture exported from every epoch of evolution can be found in ``trials`` under the working directory of your tuner, which, by default, is ``$HOME/nni-experiments/your_experiment_id/trials``.
+Step 3. Train for Evaluation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+.. code-block:: bash
+   python evaluation.py
+By default, it uses ``architecture_final.json``, the architecture provided by the official repo (converted into NNI format). You can use any architecture (e.g., the one found in step 2) with the ``--fixed-arc`` option.
+Reference
+---------
+PyTorch
+^^^^^^^
+..  autoclass:: nni.retiarii.oneshot.pytorch.SinglePathTrainer
+    :noindex:
+Known Limitations
+-----------------
+* Block search only. Channel search is not supported yet.
+* In the search phase, training from scratch is required. Inheriting weights from the supernet is not supported yet.
+Current Reproduction Results
+----------------------------
+Reproduction is still in progress. Due to the gap between the official release and the original paper, we compare our current results with both the official repo (our run) and the paper.
+* The evolution phase is almost aligned with the official repo. Our evolution algorithm shows a converging trend and reaches ~65% accuracy at the end of the search. Nevertheless, this result is not on par with the paper. For details, please refer to `this issue <https://github.com/megvii-model/SinglePathOneShot/issues/6>`__.
+* The retrain phase is not aligned. Our retraining code, which uses the architecture released by the authors, reaches 72.14% accuracy, which still falls short of the 73.61% from the official release and the 74.3% reported in the original paper.
--- a/docs/en_US/NAS/Serialization.rst
+++ b/docs/en_US/NAS/Serialization.rst
+Serialization
+=============
+In multi-trial NAS, a sampled model must be executable on a remote machine or a training platform (e.g., AzureML, OpenPAI). "Serialization" enables re-instantiation of the model evaluator in another process or machine, so both the model and its model evaluator must be correctly serialized. To make NNI correctly serialize the model evaluator, users should apply ``nni.trace`` to some of their functions and objects. API references can be found in :func:`nni.trace`.
+Serialization is implemented as a combination of `json-tricks <https://json-tricks.readthedocs.io/en/latest/>`_ and `cloudpickle <https://github.com/cloudpipe/cloudpickle>`_. Essentially, it is json-tricks, an enhanced version of Python's JSON support that can serialize numpy arrays, dates/times, decimals, fractions, etc. The difference lies in the handling of class instances. Json-tricks handles class instances through ``__dict__`` and ``__class__``, which in most of our cases are not reliable (e.g., datasets, dataloaders). Instead, our serialization handles class instances in one of two ways:
+1. If the class / factory that creates the object is decorated with ``nni.trace``, we can serialize the class / factory function, along with the parameters, such that the instance can be re-instantiated.
+2. Otherwise, cloudpickle is used to serialize the object into a binary.
+We recommend always adding ``nni.trace`` unless you are absolutely certain that serializing the object into binary causes no problems and no extra burden. In most cases, tracing is cleaner and enables possibilities such as mutation of parameters (to be supported in the future).
+.. warning::
+    **What will happen if I forget to "trace" my objects?**
+    It is likely that the program can still run. NNI will try to serialize the untraced object into a binary. This might fail in complex cases, for example, when the object is too large. Even if it succeeds, the result might be a very large object. For example, if you forget to add ``nni.trace`` on ``MNIST``, the MNIST dataset object will be serialized into binary, which will be dozens of megabytes because the object stores all 60k images inside. You might see warnings and even errors when running experiments. To avoid such issues, the easiest way is to always add ``nni.trace`` to non-primitive objects.
+.. note:: In Retiarii, the serializer throws an exception when any single object in the recursive serialization is larger than 64 KB when binary serialized. This indicates that the object needs to be wrapped by ``nni.trace``. In rare cases, if you insist on pickling large data, the limit can be overridden by setting the environment variable ``PICKLE_SIZE_LIMIT``, whose unit is bytes. Please note that even if the experiment is able to run, this can still cause performance issues or even crash the NNI experiment.
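The size check described in the note above can be illustrated with a small sketch. It is not NNI's implementation: ``check_pickle_size`` is a hypothetical helper, and the standard-library ``pickle`` is used in place of cloudpickle so the example has no dependencies. Only the 64 KB default and the ``PICKLE_SIZE_LIMIT`` variable name come from the text.

```python
import os
import pickle

DEFAULT_LIMIT = 64 * 1024  # 64 KB, the limit quoted in the note


def check_pickle_size(obj):
    """Return the pickled size of `obj`, raising if it exceeds the limit.

    Hypothetical helper: the limit is read from PICKLE_SIZE_LIMIT (in bytes)
    and falls back to 64 KB when the variable is not set.
    """
    limit = int(os.environ.get('PICKLE_SIZE_LIMIT', DEFAULT_LIMIT))
    size = len(pickle.dumps(obj))
    if size > limit:
        raise ValueError(
            f'pickled object is {size} bytes, limit is {limit}; '
            'consider wrapping its class with nni.trace'
        )
    return size


small = {'lr': 0.1, 'batch_size': 32}  # a few dozen bytes when pickled
large = bytes(100 * 1024)              # stands in for a dataset holding raw images
```

With the default limit, ``check_pickle_size(small)`` returns normally while ``check_pickle_size(large)`` raises; raising ``PICKLE_SIZE_LIMIT`` lets the large object through at the cost of shipping all of its bytes.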
+To trace a function or class, apply ``nni.trace`` as a decorator:
+.. code-block:: python
+    @nni.trace
+    class MyClass:
+        ...
+Inline tracing, which traces at the moment of object instantiation or function invocation, is also acceptable: ``nni.trace(MyClass)(parameters)``.
+Assuming a class ``cls`` is already traced, when it is serialized, its class type along with its initialization parameters will be dumped. As the parameters are possibly class instances (if not primitive types like ``int`` and ``str``), serializing them poses the same problem. We recommend decorating them with ``nni.trace`` as well. In other words, ``nni.trace`` should be applied recursively if necessary.
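To make the recursive behavior concrete, here is a minimal sketch of the idea. The ``trace`` decorator and ``dump`` function below are hypothetical stand-ins written for this illustration, not ``nni.trace`` or NNI's serializer; they only show how recording constructor arguments lets a nested object be dumped as a type name plus parameters.

```python
import json


def trace(cls):
    """Hypothetical stand-in for nni.trace: remember constructor arguments."""
    class Traced(cls):
        def __init__(self, *args, **kwargs):
            self._trace_args = args
            self._trace_kwargs = kwargs
            super().__init__(*args, **kwargs)

    Traced.__name__ = cls.__name__
    Traced.__qualname__ = cls.__qualname__
    return Traced


def dump(obj):
    """Recursively turn traced objects into JSON-compatible data."""
    if hasattr(obj, '_trace_args'):
        return {
            '__type__': type(obj).__name__,
            'args': [dump(a) for a in obj._trace_args],
            'kwargs': {k: dump(v) for k, v in obj._trace_kwargs.items()},
        }
    if isinstance(obj, (list, tuple)):
        return [dump(x) for x in obj]
    return obj  # primitives pass through unchanged


@trace
class Normalize:
    def __init__(self, mean, std):
        self.mean, self.std = mean, std


@trace
class Dataset:
    def __init__(self, root, transform=None):
        self.root, self.transform = root, transform


dataset = Dataset('data/mnist', transform=Normalize(0.1307, 0.3081))
payload = json.dumps(dump(dataset))
```

If ``Normalize`` were not traced, ``dump`` would return the instance itself and ``json.dumps`` would fail, which is the situation where a real serializer has to fall back to pickling.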
+In the example below, ``transforms.Compose``, ``transforms.Normalize``, and ``MNIST`` are traced manually with ``nni.trace``. ``nni.trace`` takes a class or function as its argument and returns a wrapped class or function that behaves the same as the original. The wrapped class or function is also used in exactly the same way as the original, except that its arguments are recorded. There is no need to apply ``nni.trace`` to ``pl.Classification`` and ``pl.DataLoader`` because they are already traced.
+.. code-block:: python
+  import nni
+  import nni.retiarii.evaluator.pytorch.lightning as pl
+  from torchvision import transforms
+  from torchvision.datasets import MNIST
+  def create_mnist_dataset(root, transform):
+    # Use the ``root`` argument instead of a hard-coded path.
+    return MNIST(root=root, train=False, download=True, transform=transform)
+  transform = nni.trace(transforms.Compose)([nni.trace(transforms.ToTensor)(), nni.trace(transforms.Normalize)((0.1307,), (0.3081,))])
+  # If you write like following, the whole transform will be serialized into a pickle.
+  # This actually works fine, but we do NOT recommend such practice.
+  # transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
+  train_dataset = nni.trace(MNIST)(root='data/mnist', train=True, download=True, transform=transform)
+  test_dataset = nni.trace(create_mnist_dataset)('data/mnist', transform=transform)  # factory is also acceptable
+  evaluator = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
+                                val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
+                                max_epochs=10)
+.. note::
+    **What's the relationship between model_wrapper, basic_unit and nni.trace?**
+    They are fundamentally different. ``model_wrapper`` is used to wrap a base model (search space), and ``basic_unit`` annotates a module as primitive. ``nni.trace`` enables serialization of general objects. Although they share similar underlying implementations, keep in mind that you will encounter errors if you mix them up.
+    .. seealso:: Please refer to API reference of :meth:`nni.retiarii.model_wrapper`, :meth:`nni.retiarii.basic_unit`, and :meth:`nni.trace`.
--- a/docs/en_US/NAS/WriteOneshot.rst
+++ b/docs/en_US/NAS/WriteOneshot.rst
+Customize a New One-shot Trainer
+================================
+One-shot trainers should inherit ``nni.retiarii.oneshot.BaseOneShotTrainer`` and implement the ``fit()`` method (which conducts the fitting and searching process) and the ``export()`` method (which returns the best architecture found).
+Writing a one-shot trainer is very different from writing a single-arch evaluator. First, there are no restrictions on the arguments of the init method; any Python arguments are acceptable. Second, the model fed into a one-shot trainer might contain Retiarii-specific modules, such as LayerChoice and InputChoice. Such a model cannot forward-propagate directly, so the trainer needs to decide how to handle those modules.
+A typical example is DartsTrainer, where learnable parameters are used to combine multiple choices in LayerChoice. Retiarii provides easy-to-use utility functions for module replacement, namely ``replace_layer_choice`` and ``replace_input_choice``. A simplified example is as follows:
+.. code-block:: python
+    import torch
+    import torch.nn as nn
+    import torch.nn.functional as F
+    from nni.retiarii.oneshot import BaseOneShotTrainer
+    from nni.retiarii.oneshot.pytorch import replace_layer_choice, replace_input_choice
+    class DartsLayerChoice(nn.Module):
+        def __init__(self, layer_choice):
+            super(DartsLayerChoice, self).__init__()
+            self.name = layer_choice.label
+            self.op_choices = nn.ModuleDict(layer_choice.named_children())
+            self.alpha = nn.Parameter(torch.randn(len(self.op_choices)) * 1e-3)
+        def forward(self, *args, **kwargs):
+            op_results = torch.stack([op(*args, **kwargs) for op in self.op_choices.values()])
+            alpha_shape = [-1] + [1] * (len(op_results.size()) - 1)
+            return torch.sum(op_results * F.softmax(self.alpha, -1).view(*alpha_shape), 0)
+    class DartsTrainer(BaseOneShotTrainer):
+        def __init__(self, model, loss, metrics, optimizer):
+            self.model = model
+            self.loss = loss
+            self.metrics = metrics
+            self.num_epochs = 10
+            self.nas_modules = []
+            replace_layer_choice(self.model, DartsLayerChoice, self.nas_modules)
+            ... # init dataloaders and optimizers
+        def fit(self):
+            for i in range(self.num_epochs):
+                for (trn_X, trn_y), (val_X, val_y) in zip(self.train_loader, self.valid_loader):
+                    self.train_architecture(val_X, val_y)
+                    self.train_model_weight(trn_X, trn_y)
+        @torch.no_grad()
+        def export(self):
+            result = dict()
+            for name, module in self.nas_modules:
+                if name not in result:
+                    result[name] = select_best_of_module(module)
+            return result
+The full code of DartsTrainer is available in the Retiarii source code; see :githublink:`DartsTrainer <nni/retiarii/oneshot/pytorch/darts.py>`.
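The simplified example calls ``select_best_of_module`` without defining it. One plausible reading, given that each replaced layer choice holds one learnable ``alpha`` per candidate, is "pick the candidate with the largest softmax weight". The sketch below shows that selection rule in plain Python; ``select_best``, the candidate names, and the alpha values are all assumptions for illustration, and a tensor-based version would typically use ``torch.argmax`` instead.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_best(op_names, alpha):
    """Return the candidate name with the highest softmax weight."""
    weights = softmax(alpha)
    best_index = max(range(len(weights)), key=weights.__getitem__)
    return op_names[best_index]


# Hypothetical candidates and architecture parameters for one layer choice.
choice = select_best(['conv3x3', 'conv5x5', 'maxpool'], [0.2, 1.5, -0.3])
```

Since softmax is monotonic, taking the argmax of ``alpha`` directly gives the same answer; the softmax is kept here only to mirror how the weights are used in ``forward``.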
--- a/docs/en_US/NAS/construct_space.rst
+++ b/docs/en_US/NAS/construct_space.rst
+#####################
+Construct Model Space
+#####################
+NNI provides powerful APIs for users to easily express a model space (or search space). First, users can use mutation primitives (e.g., ValueChoice, LayerChoice) to inline a space in their model. Second, NNI provides a simple interface for users to customize new mutators for expressing more complicated model spaces. In most cases, the mutation primitives are enough to express users' model spaces.
+..  toctree::
+    :maxdepth: 1
+    Mutation Primitives <MutationPrimitives>
+    Customize Mutators <Mutators>
+    Hypermodule Lib <Hypermodules>
\ No newline at end of file
--- a/docs/en_US/NAS/multi_trial_nas.rst
+++ b/docs/en_US/NAS/multi_trial_nas.rst
+Multi-trial NAS
+===============
+In multi-trial NAS, users need a model evaluator to evaluate the performance of each sampled model and an exploration strategy to sample models from a defined model space. Users can use the model evaluators provided by NNI or write their own model evaluator, and can simply choose one of the exploration strategies. Advanced users can also customize a new exploration strategy. For a simple example of how to run a multi-trial NAS experiment, please refer to `Quick Start <./QuickStart.rst>`__.
+..  toctree::
+    :maxdepth: 2
+    Model Evaluators <ModelEvaluators>
+    Exploration Strategies <ExplorationStrategies>
+    Execution Engines <ExecutionEngines>
+    Serialization <Serialization>
--- a/docs/en_US/NAS/one_shot_nas.rst
+++ b/docs/en_US/NAS/one_shot_nas.rst
+One-shot NAS
+============
+One-shot NAS algorithms leverage weight sharing among models in the neural architecture search space to train a supernet, and use this supernet to guide the selection of better models. This type of algorithm greatly reduces the computational resources required compared to independently training each model from scratch (which we call "Multi-trial NAS"). NNI supports many popular one-shot NAS algorithms, listed below.
+..  toctree::
+    :maxdepth: 1
+    Run One-shot NAS <OneshotTrainer>
+    ENAS <ENAS>
+    DARTS <DARTS>
+    SPOS <SPOS>
+    ProxylessNAS <Proxylessnas>
+    FBNet <FBNet>
+    Customize One-shot NAS <WriteOneshot>
--- a/docs/en_US/Overview.rst
+++ b/docs/en_US/Overview.rst
+Overview
+========
+NNI (Neural Network Intelligence) is a toolkit to help users design and tune machine learning models (e.g., hyperparameters), neural network architectures, or complex system's parameters, in an efficient and automatic way. NNI has several appealing properties: ease-of-use, scalability, flexibility, and efficiency.
+* **Ease-of-use**\ : NNI can be easily installed through python pip. Only a few lines need to be added to your code in order to use NNI's power. You can use both the commandline tool and WebUI to work with your experiments.
+* **Scalability**\ : Tuning hyperparameters or the neural architecture often demands a large amount of computational resources, and NNI is designed to fully leverage different computation resources, such as remote machines and training platforms (e.g., OpenPAI, Kubernetes). Hundreds of trials can run in parallel, depending on the capacity of your configured training platforms.
+* **Flexibility**\ : Besides rich built-in algorithms, NNI allows users to customize various hyperparameter tuning algorithms, neural architecture search algorithms, early stopping algorithms, etc. Users can also extend NNI with more training platforms, such as virtual machines, kubernetes service on the cloud. Moreover, NNI can connect to external environments to tune special applications/models on them.
+* **Efficiency**\ : We are intensively working on more efficient model tuning on both the system and algorithm level. For example, we leverage early feedback to speed up the tuning procedure.
+The figure below shows the high-level architecture of NNI.
+.. raw:: html
+   <p align="center">
+   <img src="https://user-images.githubusercontent.com/16907603/92089316-94147200-ee00-11ea-9944-bf3c4544257f.png" alt="drawing" width="700"/>
+   </p>
+Key Concepts
+------------
+* 
+  *Experiment*\ : One task of, for example, finding out the best hyperparameters of a model, finding out the best neural network architecture, etc. It consists of trials and AutoML algorithms.
+* 
+  *Search Space*\ : The feasible region for tuning the model. For example, the value range of each hyperparameter.
+* 
+  *Configuration*\ : An instance from the search space, that is, each hyperparameter has a specific value.
+* 
+  *Trial*\ : An individual attempt at applying a new configuration (e.g., a set of hyperparameter values, a specific neural architecture, etc.). Trial code should be able to run with the provided configuration.
+* 
+  *Tuner*\ : An AutoML algorithm, which generates a new configuration for the next try. A new trial will run with this configuration.
+* 
+  *Assessor*\ : Analyze a trial's intermediate results (e.g., periodically evaluated accuracy on test dataset) to tell whether this trial can be early stopped or not.
+* 
+  *Training Platform*\ : Where trials are executed. Depending on your experiment's configuration, it could be your local machine, or remote servers, or large-scale training platform (e.g., OpenPAI, Kubernetes).
+Basically, an experiment runs as follows: Tuner receives search space and generates configurations. These configurations will be submitted to training platforms, such as the local machine, remote machines, or training clusters. Their performances are reported back to Tuner. Then, new configurations are generated and submitted.
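The loop described above can be sketched in plain Python. Nothing here uses the NNI API: ``RandomTuner``, ``run_trial``, and the search space are hypothetical names invented for the illustration, and the "training" is a made-up scoring function that runs locally instead of on a training platform.

```python
import random


class RandomTuner:
    """Hypothetical tuner: samples each hyperparameter uniformly from its choices."""

    def __init__(self, search_space, seed=0):
        self.search_space = search_space
        self.rng = random.Random(seed)
        self.history = []  # (configuration, metric) pairs reported by trials

    def generate_configuration(self):
        return {name: self.rng.choice(values)
                for name, values in self.search_space.items()}

    def receive_result(self, config, metric):
        self.history.append((config, metric))


def run_trial(config):
    """Stand-in for a trial: a made-up score that peaks at lr=0.01, batch_size=64."""
    return 1.0 - abs(config['lr'] - 0.01) - 0.001 * abs(config['batch_size'] - 64)


search_space = {'lr': [0.1, 0.01, 0.001], 'batch_size': [32, 64, 128]}
tuner = RandomTuner(search_space)

for _ in range(20):
    config = tuner.generate_configuration()  # tuner proposes a configuration
    metric = run_trial(config)               # trial runs with that configuration
    tuner.receive_result(config, metric)     # performance is reported back

best_config, best_metric = max(tuner.history, key=lambda item: item[1])
```

A random tuner ignores the reported results; smarter tuners would use ``history`` when generating the next configuration, which is the reason the feedback step exists.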
+For each experiment, the user only needs to define a search space and update a few lines of code, and then leverage NNI built-in Tuner/Assessor and training platforms to search the best hyperparameters and/or neural architecture. There are basically 3 steps:
+..
+   Step 1: `Define search space <Tutorial/SearchSpaceSpec.rst>`__
+   Step 2: `Update model codes <TrialExample/Trials.rst>`__
+   Step 3: `Define Experiment <reference/experiment_config.rst>`__
+.. raw:: html
+   <p align="center">
+   <img src="https://user-images.githubusercontent.com/23273522/51816627-5d13db80-2302-11e9-8f3e-627e260203d5.jpg" alt="drawing"/>
+   </p>
+For more details about how to run an experiment, please refer to `Get Started <Tutorial/QuickStart.rst>`__.
+Core Features
+-------------
+NNI provides a key capacity to run multiple instances in parallel to find the best combinations of parameters. This feature can be used in various domains, like finding the best hyperparameters for a deep learning model or finding the best configuration for database and other complex systems with real data.
+NNI also provides algorithm toolkits for machine learning and deep learning, especially neural architecture search (NAS) algorithms, model compression algorithms, and feature engineering algorithms.
+Hyperparameter Tuning
+^^^^^^^^^^^^^^^^^^^^^
+Hyperparameter tuning is a core and basic feature of NNI. We provide many popular `automatic tuning algorithms <Tuner/BuiltinTuner.rst>`__ (i.e., tuners) and `early stop algorithms <Assessor/BuiltinAssessor.rst>`__ (i.e., assessors). You can follow `Quick Start <Tutorial/QuickStart.rst>`__ to tune your model (or system). Basically, you follow the three steps above and then start an NNI experiment.
+General NAS Framework
+^^^^^^^^^^^^^^^^^^^^^
+This NAS framework lets users easily specify candidate neural architectures. For example, one can specify multiple candidate operations (e.g., separable conv, dilated conv) for a single layer, and specify possible skip connections. NNI will find the best candidate automatically. In addition, the NAS framework provides a simple interface for another type of user (e.g., NAS algorithm researchers) to implement new NAS algorithms. A detailed description of NAS and its usage can be found `here <NAS/Overview.rst>`__.
+NNI has support for many one-shot NAS algorithms such as ENAS and DARTS through NNI trial SDK. To use these algorithms you do not have to start an NNI experiment. Instead, import an algorithm in your trial code and simply run your trial code. If you want to tune the hyperparameters in the algorithms or want to run multiple instances, you can choose a tuner and start an NNI experiment.
+Other than one-shot NAS, NAS can also run in a classic mode where each candidate architecture runs as an independent trial job. In this mode, similar to hyperparameter tuning, users have to start an NNI experiment and choose a tuner for NAS.
+Model Compression
+^^^^^^^^^^^^^^^^^
+NNI provides an easy-to-use model compression framework to compress deep neural networks. The compressed networks typically have a much smaller model size and much faster
+inference speed without losing performance significantly. Model compression on NNI includes pruning algorithms and quantization algorithms. NNI provides many pruning and
+quantization algorithms through NNI trial SDK. Users can directly use them in their trial code and run the trial code without starting an NNI experiment. Users can also use NNI model compression framework to customize their own pruning and quantization algorithms.
+A detailed description of model compression and its usage can be found `here <Compression/Overview.rst>`__.
+Automatic Feature Engineering
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Automatic feature engineering is for users to find the best features for their tasks. A detailed description of automatic feature engineering and its usage can be found `here <FeatureEngineering/Overview.rst>`__. It is supported through NNI trial SDK, which means you do not have to create an NNI experiment. Instead, simply import a built-in auto-feature-engineering algorithm in your trial code and directly run your trial code. 
+The auto-feature-engineering algorithms usually have a bunch of hyperparameters themselves. If you want to automatically tune those hyperparameters, you can leverage hyperparameter tuning of NNI, that is, choose a tuning algorithm (i.e., tuner) and start an NNI experiment for it.
+Learn More
+----------
+* `Get started <Tutorial/QuickStart.rst>`__
+* `How to adapt your trial code on NNI? <TrialExample/Trials.rst>`__
+* `What are tuners supported by NNI? <Tuner/BuiltinTuner.rst>`__
+* `How to customize your own tuner? <Tuner/CustomizeTuner.rst>`__
+* `What are assessors supported by NNI? <Assessor/BuiltinAssessor.rst>`__
+* `How to customize your own assessor? <Assessor/CustomizeAssessor.rst>`__
+* `How to run an experiment on local? <TrainingService/LocalMode.rst>`__
+* `How to run an experiment on multiple machines? <TrainingService/RemoteMachineMode.rst>`__
+* `How to run an experiment on OpenPAI? <TrainingService/PaiMode.rst>`__
+* `Examples <TrialExample/MnistExamples.rst>`__
+* `Neural Architecture Search on NNI <NAS/Overview.rst>`__
+* `Model Compression on NNI <Compression/Overview.rst>`__
+* `Automatic feature engineering on NNI <FeatureEngineering/Overview.rst>`__
--- a/docs/en_US/Release.rst
+++ b/docs/en_US/Release.rst
+.. role:: raw-html(raw)
+   :format: html
+Change Log
+==========
+Release 2.6.1 - 2/18/2022
+-------------------------
+Bug Fixes
+^^^^^^^^^
+* Fix a bug that new TPE does not support dict metrics.
+* Fix a bug caused by a missing comma. (Thanks to @mrshu)
+Release 2.6 - 1/19/2022
+-----------------------
+**NOTE**: NNI v2.6 is the last version that supports Python 3.6. From next release NNI will require Python 3.7+.
+Hyper-Parameter Optimization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Experiment
+""""""""""
+* The legacy experiment config format is now deprecated. `(doc of new config) <https://nni.readthedocs.io/en/v2.6/reference/experiment_config.html>`__
+  * If you are still using legacy format, nnictl will show equivalent new config on start. Please save it to replace the old one.
+* nnictl now uses ``nni.experiment.Experiment`` `APIs <https://nni.readthedocs.io/en/stable/Tutorial/HowToLaunchFromPython.html>`__ as backend. The output messages of the create, resume, and view commands have changed.
+* Added Kubeflow and Frameworkcontroller support to hybrid mode.  `(doc) <https://nni.readthedocs.io/en/v2.6/TrainingService/HybridMode.html>`__
+* The hidden tuner manifest file has been updated. This should be transparent to users, but if you encounter issues like failed to find tuner, please try to remove ``~/.config/nni``.
+Algorithms
+""""""""""
+* Random tuner now supports classArgs ``seed``. `(doc) <https://nni.readthedocs.io/en/v2.6/Tuner/RandomTuner.html>`__
+* TPE tuner is refactored: `(doc) <https://nni.readthedocs.io/en/v2.6/Tuner/TpeTuner.html>`__
+  * Support classArgs ``seed``.
+  * Support classArgs ``tpe_args`` for expert users to customize algorithm behavior.
+  * Parallel optimization has been turned on by default. To turn it off set ``tpe_args.constant_liar_type`` to ``null`` (or ``None`` in Python).
+  * ``parallel_optimize`` and ``constant_liar_type`` have been removed. If you are using them please update your config to use ``tpe_args.constant_liar_type`` instead.
+* Grid search tuner now supports all search space types, including uniform, normal, and nested choice. `(doc) <https://nni.readthedocs.io/en/v2.6/Tuner/GridsearchTuner.html>`__
+Neural Architecture Search
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Enhancement to serialization utilities `(doc) <https://nni.readthedocs.io/en/v2.6/NAS/Serialization.html>`__ and changes to recommended practice of customizing evaluators. `(doc) <https://nni.readthedocs.io/en/v2.6/NAS/QuickStart.html#pick-or-customize-a-model-evaluator>`__
+* Support latency constraint on edge device for ProxylessNAS based on nn-Meter. `(doc) <https://nni.readthedocs.io/en/v2.6/NAS/Proxylessnas.html>`__
+* Trial parameters are shown in a more readable way in Retiarii experiments.
+* Refactor NAS examples of ProxylessNAS and SPOS.
+Model Compression
+^^^^^^^^^^^^^^^^^
+* New Pruner Supported in Pruning V2
+  * Auto-Compress Pruner `(doc) <https://nni.readthedocs.io/en/v2.6/Compression/v2_pruning_algo.html#auto-compress-pruner>`__
+  * AMC Pruner `(doc) <https://nni.readthedocs.io/en/v2.6/Compression/v2_pruning_algo.html#amc-pruner>`__
+  * Movement Pruning Pruner `(doc) <https://nni.readthedocs.io/en/v2.6/Compression/v2_pruning_algo.html#movement-pruner>`__
+* Support ``nni.trace`` wrapped ``Optimizer`` in Pruning V2. The input parameters of the optimizer are traced with as little impact on the user experience as possible. `(doc) <https://nni.readthedocs.io/en/v2.6/Compression/v2_pruning_algo.html>`__
+* Optimize Taylor Pruner, APoZ Activation Pruner, Mean Activation Pruner in V2 memory usage.
+* Add more examples for Pruning V2.
+* Add document for pruning config list.  `(doc) <https://nni.readthedocs.io/en/v2.6/Compression/v2_pruning_config_list.html>`__
+* Parameter ``masks_file`` of ``ModelSpeedup`` now accepts `pathlib.Path` object. (Thanks to @dosemeion) `(doc) <https://nni.readthedocs.io/en/v2.6/Compression/ModelSpeedup.html#user-configuration-for-modelspeedup>`__
+* Bug Fix
+  * Fix Slim Pruner in V2 not sparsifying the BN weight.
+  * Fix Simulator Annealing Task Generator generating configs that ignore 0 sparsity.
+Documentation
+^^^^^^^^^^^^^
+* Supported GitHub feature "Cite this repository".
+* Updated index page of readthedocs.
+* Updated Chinese documentation.
+  * From now on NNI only maintains translation for the most important docs and ensures they are up to date.
+* Reorganized HPO tuners' doc.
+Bugfixes
+^^^^^^^^
+* Fixed a bug where numpy array is used as a truth value. (Thanks to @khituras)
+* Fixed a bug in updating search space.
+* Fixed a bug that HPO search space file does not support scientific notation and tab indent.
+  * For now NNI does not support mixing scientific notation and YAML features. We are waiting for PyYAML to update.
+* Fixed a bug that causes DARTS 2nd order to crash.
+* Fixed a bug that causes deep copy of mutation primitives (e.g., LayerChoice) to crash.
+* Removed blank at bottom in Web UI overview page.
+Release 2.5 - 11/2/2021
+-----------------------
+Model Compression
+^^^^^^^^^^^^^^^^^
+* New major version of pruning framework `(doc) <https://nni.readthedocs.io/en/v2.5/Compression/v2_pruning.html>`__
+  * Iterative pruning is more automated, users can use less code to implement iterative pruning.
+  * Support exporting intermediate models in the iterative pruning process.
+  * The implementation of the pruning algorithm is closer to the paper.
+  * Users can easily customize their own iterative pruning by using ``PruningScheduler``.
+  * Optimized the underlying mask-generation logic of the basic pruners, making it easier to extend with new functions.
+  * Optimized the memory usage of the pruners.
+* MobileNetV2 end-to-end example `(notebook) <https://github.com/microsoft/nni/blob/v2.5/examples/model_compress/pruning/mobilenetv2_end2end/Compressing%20MobileNetV2%20with%20NNI%20Pruners.ipynb>`__
+* Improved QAT quantizer `(doc) <https://nni.readthedocs.io/en/v2.5/Compression/Quantizer.html#qat-quantizer>`__
+  * support dtype and scheme customization
+  * support dp multi-gpu training
+  * support load_calibration_config
+* Model speed-up now supports directly loading the mask `(doc) <https://nni.readthedocs.io/en/v2.5/Compression/ModelSpeedup.html#nni.compression.pytorch.ModelSpeedup>`__
+* Support speed-up depth-wise convolution
+* Support bn-folding for LSQ quantizer
+* Support QAT and LSQ resume from PTQ
+* Added doc for observer quantizer `(doc) <https://nni.readthedocs.io/en/v2.5/Compression/Quantizer.html#observer-quantizer>`__
+Neural Architecture Search
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+* NAS benchmark `(doc) <https://nni.readthedocs.io/en/v2.5/NAS/Benchmarks.html>`__
+  * Support benchmark table lookup in experiments
+  * New data preparation approach
+* Improved `quick start doc <https://nni.readthedocs.io/en/v2.5/NAS/QuickStart.html>`__
+* Experimental CGO execution engine `(doc) <https://nni.readthedocs.io/en/v2.5/NAS/ExecutionEngines.html#cgo-execution-engine-experimental>`__
+Hyper-Parameter Optimization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* New training platform: Alibaba DSW+DLC `(doc) <https://nni.readthedocs.io/en/v2.5/TrainingService/DLCMode.html>`__
+* Support passing ConfigSpace definition directly to BOHB `(doc) <https://nni.readthedocs.io/en/v2.5/Tuner/BohbAdvisor.html#usage>`__ (thanks to khituras)
+* Reformatted `experiment config doc <https://nni.readthedocs.io/en/v2.5/reference/experiment_config.html>`__
+* Added example config files for Windows (thanks to @politecat314)
+* FrameworkController now supports reuse mode
+Fixed Bugs
+^^^^^^^^^^
+* Experiment cannot start due to platform timestamp format (issue #4077 #4083)
+* Cannot use ``1e-5`` in search space (issue #4080)
+* Dependency version conflict caused by ConfigSpace (issue #3909) (thanks to @jexxers)
+* Hardware-aware SPOS example does not work (issue #4198)
+* Web UI shows the wrong remaining time when duration exceeds limit (issue #4015)
+* cudnn.deterministic is always set in AMC pruner (#4117) thanks to @mstczuo
+And...
+^^^^^^
+* New `emoticons <https://github.com/microsoft/nni/blob/v2.5/docs/en_US/Tutorial/NNSpider.md>`__!
+.. image:: https://raw.githubusercontent.com/microsoft/nni/v2.5/docs/img/emoicons/Holiday.png
+Release 2.4 - 8/11/2021
+-----------------------
+Major Updates
+^^^^^^^^^^^^^
+Neural Architecture Search
+""""""""""""""""""""""""""
+* NAS visualization: visualize model graph through Netron (#3878)
+* Support NAS bench 101/201 on Retiarii framework (#3871 #3920)
+* Support hypermodule AutoActivation (#3868)
+* Support PyTorch v1.8/v1.9 (#3937)
+* Support Hardware-aware NAS with nn-Meter (#3938)
+* Enable `fixed_arch` on Retiarii (#3972)
+Model Compression
+"""""""""""""""""
+* Refactor of ModelSpeedup: auto shape/mask inference (#3462)
+* Added more examples for ModelSpeedup (#3880)
+* Support global sort for Taylor pruning (#3896)
+* Support TransformerHeadPruner (#3884)
+* Support batch normalization folding in QAT quantizer (#3911, thanks to the external contributor @chenbohua3)
+* Support post-training observer quantizer (#3915, thanks to the external contributor @chenbohua3)
+* Support ModelSpeedup for Slim Pruner (#4008)
+* Support TensorRT 8.0.0 in ModelSpeedup (#3866)
+Hyper-parameter Tuning
+""""""""""""""""""""""
+* Improve HPO benchmarks (#3925)
+* Improve type validation of user defined search space (#3975)
+Training service & nnictl
+"""""""""""""""""""""""""
+* Support JupyterLab (#3668 #3954)
+* Support viewing experiment from experiment folder (#3870)
+* Support kubeflow in training service reuse framework (#3919)
+* Support viewing trial log on WebUI for an experiment launched in `view` mode (#3872)
+Minor Updates & Bug Fixes
+"""""""""""""""""""""""""
+* Fix the failure of the exit of Retiarii experiment (#3899)
+* Fix `exclude` not supported in some `config_list` cases (#3815)
+* Fix bug in remote training service on reuse mode (#3941)
+* Improve IP address detection with a more modern approach (#3860)
+* Fix bug of the search box on WebUI (#3935)
+* Fix bug in url_prefix of WebUI (#4051)
+* Support dict format of intermediate on WebUI (#3895)
+* Fix bug in openpai training service induced by experiment config v2 (#4027 #4057)
+* Improved doc (#3861 #3885 #3966 #4004 #3955)
+* Improved the API `export_model` in model compression (#3968)
+* Supported `UnSqueeze` in ModelSpeedup (#3960)
+* Thanks other external contributors: @Markus92 (#3936), @thomasschmied (#3963), @twmht (#3842)
+Release 2.3 - 6/15/2021
+-----------------------
+Major Updates
+^^^^^^^^^^^^^
+Neural Architecture Search
+""""""""""""""""""""""""""
+* Retiarii Framework (NNI NAS 2.0) Beta Release with new features:
+  * Support new high-level APIs: ``Repeat`` and ``Cell`` (#3481)
+  * Support pure-python execution engine (#3605)
+  * Support policy-based RL strategy (#3650)
+  * Support nested ModuleList (#3652)
+  * Improve documentation (#3785)
+  **Note**: more Retiarii features are planned for future releases; please refer to the `Retiarii Roadmap <https://github.com/microsoft/nni/discussions/3744>`__ for more information.
+* Add new NAS algorithm: Blockwise DNAS FBNet (#3532, thanks the external contributor @alibaba-yiwuyao) 
+Model Compression
+"""""""""""""""""
+* Support Auto Compression Framework (#3631)
+* Support slim pruner in Tensorflow (#3614)
+* Support LSQ quantizer (#3503, thanks the external contributor @chenbohua3)
+* Improve APIs for iterative pruners (#3507 #3688)
+Training service & Rest
+"""""""""""""""""""""""
+* Support 3rd-party training service (#3662 #3726)
+* Support setting prefix URL (#3625 #3674 #3672 #3643)
+* Improve NNI manager logging (#3624)
+* Remove outdated TensorBoard code on nnictl (#3613)
+Hyper-Parameter Optimization
+""""""""""""""""""""""""""""
+* Add new tuner: DNGO (#3479 #3707)
+* Add benchmark for tuners (#3644 #3720 #3689)
+WebUI
+"""""
+* Improve search parameters on trial detail page (#3651 #3723 #3715)
+* Make selected trials consistent after auto-refresh in detail table (#3597)
+* Add trial stdout button on local mode (#3653 #3690)
+Examples & Documentation
+""""""""""""""""""""""""
+* Convert all trial examples from config v1 to config v2 (#3721 #3733 #3711 #3600)
+* Add new jupyter notebook examples (#3599 #3700)
+Dev Excellence
+""""""""""""""
+* Upgrade dependencies in Dockerfile (#3713 #3722)
+* Replace ``ruamel.yaml`` with PyYAML (#3702)
+* Add pipelines for AML and hybrid training service and experiment config V2 (#3477 #3648)
+* Add pipeline badge in README (#3589)
+* Update issue bug report template (#3501)
+Bug Fixes & Minor Updates
+^^^^^^^^^^^^^^^^^^^^^^^^^
+* Fix syntax error on Windows (#3634)
+* Fix a logging related bug (#3705)
+* Fix a bug in GPU indices (#3721)
+* Fix a bug in FrameworkController (#3730)
+* Fix a bug in ``export_data_url`` format (#3665)
+* Report version check failure as a warning (#3654)
+* Fix bugs and lints in nnictl (#3712)
+* Fix bug of ``optimize_mode`` on WebUI (#3731)
+* Fix bug of ``useActiveGpu`` in AML v2 config (#3655)
+* Fix bug of ``experiment_working_directory`` in Retiarii config (#3607)
+* Fix a bug in mask conflict (#3629, thanks the external contributor @Davidxswang) 
+* Fix a bug in model speedup shape inference (#3588, thanks the external contributor @Davidxswang)
+* Fix a bug in multithread on Windows (#3604, thanks the external contributor @Ivanfangsc)
+* Delete redundant code in training service (#3526, thanks the external contributor @maxsuren)
+* Fix typo in DoReFa compression doc (#3693, thanks the external contributor @Erfandarzi)
+* Update docstring in model compression (#3647, thanks the external contributor @ichejun)
+* Fix a bug when using Kubernetes container (#3719, thanks the external contributor @rmfan)
+Release 2.2 - 4/26/2021
+-----------------------
+Major updates
+^^^^^^^^^^^^^
+Neural Architecture Search
+""""""""""""""""""""""""""
+* Improve NAS 2.0 (Retiarii) Framework (Alpha Release)
+  * Support local debug mode (#3476)
+  * Support nesting ``ValueChoice`` in ``LayerChoice`` (#3508)
+  * Support dict/list type in ``ValueChoice`` (#3508)
+  * Improve the format of export architectures (#3464)
+  * Refactor of NAS examples (#3513)
+  * Refer to `here <https://github.com/microsoft/nni/issues/3301>`__ for Retiarii Roadmap
+Model Compression
+"""""""""""""""""
+* Support speedup for mixed precision quantization model (Experimental) (#3488 #3512)
+* Support model export for quantization algorithm (#3458 #3473)
+* Support model export in model compression for TensorFlow (#3487)
+* Improve documentation (#3482)
+nnictl & nni.experiment
+"""""""""""""""""""""""
+* Add native support for experiment config V2 (#3466 #3540 #3552)
+* Add resume and view mode in Python API ``nni.experiment`` (#3490 #3524 #3545)
+Training Service
+""""""""""""""""
+* Support umount for shared storage in remote training service (#3456)
+* Support Windows as the remote training service in reuse mode (#3500)
+* Remove duplicated env folder in remote training service (#3472)
+* Add log information for GPU metric collector (#3506)
+* Enable optional Pod Spec for FrameworkController platform (#3379, thanks the external contributor @mbu93)
+WebUI
+"""""
+* Support launching TensorBoard on WebUI (#3454 #3361 #3531)
+* Upgrade echarts-for-react to v5 (#3457)
+* Add wrap for dispatcher/nnimanager log monaco editor (#3461)
+Bug Fixes
+^^^^^^^^^
+* Fix bug of FLOPs counter (#3497)
+* Fix bug of hyper-parameter Add/Remove axes and table Add/Remove columns button conflict (#3491)
+* Fix bug that monaco editor search text is not displayed completely (#3492)
+* Fix bug of Cream NAS (#3498, thanks the external contributor @AliCloud-PAI)
+* Fix typos in docs (#3448, thanks the external contributor @OliverShang)
+* Fix typo in NAS 1.0 (#3538, thanks the external contributor @ankitaggarwal23)
+Release 2.1 - 3/10/2021
+-----------------------
+Major updates
+^^^^^^^^^^^^^
+Neural architecture search
+""""""""""""""""""""""""""
+* Improve NAS 2.0 (Retiarii) Framework (Improved Experimental)
+  * Improve the robustness of graph generation and code generation for PyTorch models (#3365)
+  * Support the inline mutation API ``ValueChoice`` (#3349 #3382)
+  * Improve the design and implementation of Model Evaluator (#3359 #3404)
+  * Support Random/Grid/Evolution exploration strategies (i.e., search algorithms) (#3377)
+  * Refer to `here <https://github.com/microsoft/nni/issues/3301>`__ for Retiarii Roadmap
+Training service
+""""""""""""""""
+* Support shared storage for reuse mode (#3354)
+* Support Windows as the local training service in hybrid mode (#3353)
+* Remove PAIYarn training service (#3327)
+* Add "recently-idle" scheduling algorithm (#3375)
+* Deprecate ``preCommand`` and enable ``pythonPath`` for remote training service (#3284 #3410)
+* Refactor reuse mode temp folder (#3374)
+nnictl & nni.experiment
+"""""""""""""""""""""""
+* Migrate ``nnicli`` to new Python API ``nni.experiment`` (#3334)
+* Refactor the way of specifying tuner in experiment Python API (``nni.experiment``), more aligned with ``nnictl`` (#3419)
+WebUI
+"""""
+* Support showing the assigned training service of each trial in hybrid mode on WebUI (#3261 #3391)
+* Support multiple selection for filter status in experiments management page (#3351)
+* Improve overview page (#3316 #3317 #3352)
+* Support copy trial id in the table (#3378)
+Documentation
+^^^^^^^^^^^^^
+* Improve model compression examples and documentation (#3326 #3371)
+* Add Python API examples and documentation (#3396)
+* Add SECURITY doc (#3358)
+* Add 'What's NEW!' section in README (#3395) 
+* Update English contributing doc (#3398, thanks external contributor @Yongxuanzhang)
+Bug fixes
+^^^^^^^^^
+* Fix AML outputs path and python process not killed (#3321)
+* Fix bug that an experiment launched from Python cannot be resumed by nnictl (#3309)
+* Fix import path of network morphism example (#3333)
+* Fix bug in the tuple unpack (#3340)
+* Fix a security bug that allowed arbitrary code execution (#3311, thanks external contributor @huntr-helper)
+* Fix ``NoneType`` error on jupyter notebook (#3337, thanks external contributor @tczhangzhi)
+* Fix bugs in Retiarii (#3339 #3341 #3357, thanks external contributor @tczhangzhi)
+* Fix bug in AdaptDL mode example (#3381, thanks external contributor @ZeyaWang)
+* Fix the spelling mistake of assessor (#3416, thanks external contributor @ByronCHAO)
+* Fix bug in ruamel import (#3430, thanks external contributor @rushtehrani)
+Release 2.0 - 1/14/2021
+-----------------------
+Major updates
+^^^^^^^^^^^^^
+Neural architecture search
+""""""""""""""""""""""""""
+* Support an improved NAS framework: Retiarii (experimental)
+  * Feature roadmap (`issue #3301 <https://github.com/microsoft/nni/issues/3301>`__)
+  * `Related issues and pull requests <https://github.com/microsoft/nni/issues?q=label%3Aretiarii-v2.0>`__
+  * Documentation (#3221 #3282 #3287)
+* Support a new NAS algorithm: Cream (#2705)
+* Add a new NAS benchmark for NLP model search (#3140)
+Training service
+""""""""""""""""
+* Support hybrid training service (#3097 #3251 #3252)
+* Support AdlTrainingService, a new training service based on Kubernetes (#3022, thanks external contributors Petuum @pw2393)
+Model compression
+"""""""""""""""""
+* Support pruning schedule for fpgm pruning algorithm (#3110)
+* ModelSpeedup improvement: support torch v1.7 (updated graph_utils.py) (#3076)
+* Improve model compression utility: model flops counter (#3048 #3265)
+WebUI & nnictl 
+""""""""""""""
+* Support experiments management on WebUI, add a web page for it (#3081 #3127)
+* Improve the layout of overview page (#3046 #3123)
+* Add navigation bar on the right for logs and configs; add expanded icons for table (#3069 #3103)
+Others
+""""""
+* Support launching an experiment from Python code (#3111 #3210 #3263)
+* Refactor builtin/customized tuner installation (#3134)
+* Support new experiment configuration V2 (#3138 #3248 #3251)
+* Reorganize source code directory hierarchy (#2962 #2987 #3037)
+* Change SIGKILL to SIGTERM in local mode when cancelling trial jobs (#3173)
+* Refactor hyperband (#3040)
+Documentation
+^^^^^^^^^^^^^
+* Port markdown docs to reStructuredText docs and introduce ``githublink`` (#3107)
+* List related research and publications in doc (#3150)
+* Add tutorial of saving and loading quantized model (#3192)
+* Remove paiYarn doc and add description of ``reuse`` config in remote mode (#3253)
+* Update EfficientNet doc to clarify repo versions (#3158, thanks external contributor @ahundt)
+Bug fixes
+^^^^^^^^^
+* Fix exp-duration pause timing under NO_MORE_TRIAL status (#3043)
+* Fix bug in NAS SPOS trainer, apply_fixed_architecture (#3051, thanks external contributor @HeekangPark)
+* Fix ``_compute_hessian`` bug in NAS DARTS (PyTorch version) (#3058, thanks external contributor @hroken)
+* Fix bug of conv1d in the cdarts utils (#3073, thanks external contributor @athaker)
+* Fix the handling of unknown trials when resuming an experiment (#3096)
+* Fix bug of kill command under Windows (#3106)
+* Fix lazy logging (#3108, thanks external contributor @HarshCasper)
+* Fix checkpoint load and save issue in QAT quantizer (#3124, thanks external contributor @eedalong)
+* Fix quant grad function calculation error (#3160, thanks external contributor @eedalong)
+* Fix device assignment bug in quantization algorithm (#3212, thanks external contributor @eedalong)
+* Fix bug in ModelSpeedup and enhance UT for it (#3279)
+* and others (#3063 #3065 #3098 #3109 #3125 #3143 #3156 #3168 #3175 #3180 #3181 #3183 #3203 #3205 #3207 #3214 #3216 #3219 #3223 #3224 #3230 #3237 #3239 #3240 #3245 #3247 #3255 #3257 #3258 #3262 #3263 #3267 #3269 #3271 #3279 #3283 #3289 #3290 #3295)
+Release 1.9 - 10/22/2020
+------------------------
+Major updates
+^^^^^^^^^^^^^
+Neural architecture search
+""""""""""""""""""""""""""
+* Support regularized evolution algorithm for NAS scenario (#2802)
+* Add NASBench201 in search space zoo (#2766)
+Model compression
+"""""""""""""""""
+* AMC pruner improvement: support resnet, support reproduction of the experiments (default parameters in our example code) in AMC paper (#2876 #2906)
+* Support constraint-aware on some of our pruners to improve model compression efficiency (#2657)
+* Support "tf.keras.Sequential" in model compression for TensorFlow (#2887)
+* Support customized op in the model flops counter (#2795)
+* Support quantizing bias in QAT quantizer (#2914)
+Training service
+""""""""""""""""
+* Support configuring python environment using "preCommand" in remote mode (#2875)
+* Support AML training service in Windows (#2882)
+* Support reuse mode for remote training service (#2923)
+WebUI & nnictl
+""""""""""""""
+* The "Overview" page on WebUI is redesigned with a new layout (#2914)
+* Upgraded node, yarn and FabricUI, and enabled Eslint (#2894 #2873 #2744)
+* Add/Remove columns in hyper-parameter chart and trials table in "Trials detail" page (#2900)
+* Beautify the JSON format utility on WebUI (#2863)
+* Support nnictl command auto-completion (#2857)
+UT & IT
+^^^^^^^
+* Add integration test for experiment import and export (#2878)
+* Add integration test for user installed builtin tuner (#2859)
+* Add unit test for nnictl (#2912)
+Documentation
+^^^^^^^^^^^^^
+* Refactor of the document for model compression (#2919)
+Bug fixes
+^^^^^^^^^
+* Fix naïve evolution tuner to correctly handle trial failures (#2695)
+* Resolve the warning "WARNING (nni.protocol) IPC pipeline not exists, maybe you are importing tuner/assessor from trial code?" (#2864)
+* Fix search space issue in experiment save/load (#2886)
+* Fix bug in experiment import data (#2878)
+* Fix annotation in remote mode (python 3.8 ast update issue) (#2881)
+* Support boolean type for "choice" hyper-parameter when customizing trial configuration on WebUI (#3003)
+Release 1.8 - 8/27/2020
+-----------------------
+Major updates
+^^^^^^^^^^^^^
+Training service
+""""""""""""""""
+* Access trial log directly on WebUI (local mode only) (#2718)
+* Add OpenPAI trial job detail link (#2703)
+* Support GPU scheduler in reusable environment (#2627) (#2769)
+* Add timeout for ``web_channel`` in ``trial_runner`` (#2710)
+* Show environment error message in AzureML mode (#2724)
+* Add more log information when copying data in OpenPAI mode (#2702)
+WebUI, nnictl and nnicli
+""""""""""""""""""""""""
+* Improve hyper-parameter parallel coordinates plot (#2691) (#2759)
+* Add pagination for trial job list (#2738) (#2773)
+* Enable panel close when clicking overlay region (#2734)
+* Remove support for Multiphase on WebUI (#2760)
+* Support save and restore experiments (#2750)
+* Add intermediate results in export result (#2706)
+* Add `command <https://github.com/microsoft/nni/blob/v1.8/docs/en_US/Tutorial/Nnictl.md#nnictl-trial>`__ to list trial results with highest/lowest metrics (#2747)
+* Improve the user experience of `nnicli <https://github.com/microsoft/nni/blob/v1.8/docs/en_US/nnicli_ref.md>`__ with `examples <https://github.com/microsoft/nni/blob/v1.8/examples/notebooks/retrieve_nni_info_with_python.ipynb>`__ (#2713)
+Neural architecture search
+""""""""""""""""""""""""""
+* `Search space zoo: ENAS and DARTS <https://github.com/microsoft/nni/blob/v1.8/docs/en_US/NAS/SearchSpaceZoo.md>`__ (#2589)
+* API to query intermediate results in NAS benchmark (#2728)
+Model compression
+"""""""""""""""""
+* Support the List/Tuple Construct/Unpack operation for TorchModuleGraph (#2609)
+* Model speedup improvement: Add support of DenseNet and InceptionV3 (#2719)
+* Support the multiple successive tuple unpack operations (#2768)
+* `Doc of comparing the performance of supported pruners <https://github.com/microsoft/nni/blob/v1.8/docs/en_US/CommunitySharings/ModelCompressionComparison.md>`__ (#2742)
+* New pruners: `Sensitivity pruner <https://github.com/microsoft/nni/blob/v1.8/docs/en_US/Compressor/Pruner.md#sensitivity-pruner>`__ (#2684) and `AMC pruner <https://github.com/microsoft/nni/blob/v1.8/docs/en_US/Compressor/Pruner.md>`__ (#2573) (#2786)
+* TensorFlow v2 support in model compression (#2755)
+Backward incompatible changes
+"""""""""""""""""""""""""""""
+* Update the default experiment folder from ``$HOME/nni/experiments`` to ``$HOME/nni-experiments``. If you want to view the experiments created by previous NNI releases, you can move the experiments folders from  ``$HOME/nni/experiments`` to ``$HOME/nni-experiments`` manually. (#2686) (#2753)
+* Dropped support for Python 3.5 and scikit-learn 0.20 (#2778) (#2777) (#2783) (#2787) (#2788) (#2790)
+Others
+""""""
+* Upgrade TensorFlow version in Docker image (#2732) (#2735) (#2720)
+Examples
+^^^^^^^^
+* Remove gpuNum in assessor examples (#2641)
+Documentation
+^^^^^^^^^^^^^
+* Improve customized tuner documentation (#2628)
+* Fix several typos and grammar mistakes in documentation (#2637 #2638, thanks @tomzx)
+* Improve AzureML training service documentation (#2631)
+* Improve CI of Chinese translation (#2654)
+* Improve OpenPAI training service documentation (#2685)
+* Improve documentation of community sharing (#2640)
+* Add tutorial of Colab support (#2700)
+* Improve documentation structure for model compression (#2676)
+Bug fixes
+^^^^^^^^^
+* Fix mkdir error in training service (#2673)
+* Fix bug when using chmod in remote training service (#2689)
+* Fix dependency issue by making ``_graph_utils`` imported inline (#2675)
+* Fix mask issue in ``SimulatedAnnealingPruner`` (#2736)
+* Fix intermediate graph zooming issue (#2738)
+* Fix issue when dict is unordered when querying NAS benchmark (#2728)
+* Fix import issue for gradient selector dataloader iterator (#2690)
+* Fix support of adding tens of machines in remote training service (#2725)
+* Fix several styling issues in WebUI (#2762 #2737)
+* Fix support of unusual types in metrics including NaN and Infinity (#2782)
+* Fix nnictl experiment delete (#2791)
+Release 1.7 - 7/8/2020
+----------------------
+Major Features
+^^^^^^^^^^^^^^
+Training Service
+""""""""""""""""
+* Support AML (Azure Machine Learning) platform as NNI training service.
+* OpenPAI jobs can be reused. When a trial completes, the OpenPAI job does not stop but waits for the next trial. Refer to the `reuse flag in OpenPAI config <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/TrainingService/PaiMode.md#openpai-configurations>`__.
+* `Support ignoring files and folders in code directory with .nniignore when uploading code directory to training service <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/TrainingService/Overview.md#how-to-use-training-service>`__.
+Neural Architecture Search (NAS)
+""""""""""""""""""""""""""""""""
+* `Provide NAS Open Benchmarks (NasBench101, NasBench201, NDS) with friendly APIs <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/NAS/Benchmarks.md>`__.
+* `Support Classic NAS (i.e., non-weight-sharing mode) on TensorFlow 2.X <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/NAS/ClassicNas.md>`__.
+Model Compression
+"""""""""""""""""
+* Improve Model Speedup: track more dependencies among layers and automatically resolve mask conflict, support the speedup of pruned resnet.
+* Added new pruners: three auto model pruning algorithms (`NetAdapt Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/Compressor/Pruner.md#netadapt-pruner>`__, `SimulatedAnnealing Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/Compressor/Pruner.md#simulatedannealing-pruner>`__, and `AutoCompress Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/Compressor/Pruner.md#autocompress-pruner>`__) as well as `ADMM Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/Compressor/Pruner.md#admm-pruner>`__.
+* Added `model sensitivity analysis tool <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/Compressor/CompressionUtils.md>`__ to help users find the sensitivity of each layer to the pruning.
+* `Easy flops calculation for model compression and NAS <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/Compressor/CompressionUtils.md#model-flops-parameters-counter>`__.
+* Update lottery ticket pruner to export winning ticket.
+Examples
+""""""""
+* Automatically optimize tensor operators on NNI with a new `customized tuner OpEvo <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/TrialExample/OpEvoExamples.md>`__.
+Built-in tuners/assessors/advisors
+""""""""""""""""""""""""""""""""""
+* `Allow customized tuners/assessor/advisors to be installed as built-in algorithms <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/Tutorial/InstallCustomizedAlgos.md>`__.
+WebUI
+"""""
+* Support friendlier visualization of nested search spaces.
+* Show trial's dict keys in hyper-parameter graph.
+* Enhancements to trial duration display.
+Others
+""""""
+* Provide utility function to merge parameters received from NNI
+* Support setting paiStorageConfigName in pai mode
+Documentation
+^^^^^^^^^^^^^
+* Improve `documentation for model compression <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/Compressor/Overview.md>`__
+* Improve `documentation <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/NAS/Benchmarks.md>`__
+  and `examples <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/NAS/BenchmarksExample.ipynb>`__ for NAS benchmarks.
+* Improve `documentation for AzureML training service <https://github.com/microsoft/nni/blob/v1.7/docs/en_US/TrainingService/AMLMode.md>`__
+* Migrate homepage to Read the Docs.
+Bug Fixes
+^^^^^^^^^
+* Fix bug for model graph with shared nn.Module
+* Fix Node.js OOM when running ``make build``
+* Fix NASUI bugs
+* Fix duration and intermediate results pictures update issue.
+* Fix minor WebUI table style issues.
+Release 1.6 - 5/26/2020
+-----------------------
+Major Features
+^^^^^^^^^^^^^^
+New Features and Improvements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Raise the IPC limit to 100W (1,000,000)
+* Improve code storage upload logic among trials on non-local platforms
+* Support ``__version__`` for SDK version
+* Support Windows dev install
+Web UI
+^^^^^^
+* Show trial error message
+* Finalize homepage layout
+* Refactor overview's best trials module
+* Remove multiphase from WebUI
+* Add tooltip for trial concurrency in the overview page
+* Show top trials for hyper-parameter graph
+HPO Updates
+^^^^^^^^^^^
+* Improve PBT on failure handling and support experiment resume for PBT
+NAS Updates
+^^^^^^^^^^^
+* NAS support for TensorFlow 2.0 (preview): `TF2.0 NAS examples <https://github.com/microsoft/nni/tree/v1.6/examples/nas/naive-tf>`__
+* Use OrderedDict for LayerChoice
+* Prettify the format of export
+* Replace layer choice with selected module after applied fixed architecture
+Model Compression Updates
+^^^^^^^^^^^^^^^^^^^^^^^^^
+* Model compression PyTorch 1.4 support
+Training Service Updates
+^^^^^^^^^^^^^^^^^^^^^^^^
+* Update PAI YAML merge logic
+* Support Windows as a remote machine in remote mode (see `Remote Mode <https://github.com/microsoft/nni/blob/v1.6/docs/en_US/TrainingService/RemoteMachineMode.md#windows>`__)
+Bug Fix
+^^^^^^^
+* Fix dev install
+* Fix SPOS example crash when the checkpoints do not have ``state_dict``
+* Fix table sort issue when experiment had failed trial
+* Support multiple Python environments (conda, pyenv, etc.)
+Release 1.5 - 4/13/2020
+-----------------------
+New Features and Documentation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Hyper-Parameter Optimizing
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+* New tuner: `Population Based Training (PBT) <https://github.com/microsoft/nni/blob/v1.5/docs/en_US/Tuner/PBTTuner.md>`__
+* Trials can now report infinity and NaN as result
+Neural Architecture Search
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+* New NAS algorithm: `TextNAS <https://github.com/microsoft/nni/blob/v1.5/docs/en_US/NAS/TextNAS.md>`__
+* ENAS and DARTS now support `visualization <https://github.com/microsoft/nni/blob/v1.5/docs/en_US/NAS/Visualization.md>`__ through web UI.
+Model Compression
+^^^^^^^^^^^^^^^^^
+* New Pruner: `GradientRankFilterPruner <https://github.com/microsoft/nni/blob/v1.5/docs/en_US/Compression/Pruner.md#gradientrankfilterpruner>`__
+* Compressors will validate configuration by default
+* Refactor: Add optimizer as an input argument of pruner, for easy support of DataParallel and more efficient iterative pruning. This is a breaking change for the usage of iterative pruning algorithms.
+* Model compression examples are refactored and improved
+* Added documentation for `implementing compressing algorithm <https://github.com/microsoft/nni/blob/v1.5/docs/en_US/Compression/Framework.md>`__
+Training Service
+^^^^^^^^^^^^^^^^
+* Kubeflow now supports pytorchjob crd v1 (thanks external contributor @jiapinai)
+* Experimental `DLTS <https://github.com/microsoft/nni/blob/v1.5/docs/en_US/TrainingService/DLTSMode.md>`__ support
+Overall Documentation Improvement
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Documentation is significantly improved on grammar, spelling, and wording (thanks external contributor @AHartNtkn)
+Fixed Bugs
+^^^^^^^^^^
+* ENAS cannot have more than one LSTM layer (thanks external contributor @marsggbo)
+* NNI manager's timers will never unsubscribe (thanks external contributor @guilhermehn)
+* NNI manager may exhaust heap memory (thanks external contributor @Sundrops)
+* Batch tuner does not support customized trials (#2075)
+* Experiment cannot be killed if it failed on start (#2080)
+* Non-number type metrics break web UI (#2278)
+* A bug in lottery ticket pruner
+* Other minor glitches
+Release 1.4 - 2/19/2020
+-----------------------
+Major Features
+^^^^^^^^^^^^^^
+Neural Architecture Search
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Support `C-DARTS <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/NAS/CDARTS.md>`__ algorithm and add `the example <https://github.com/microsoft/nni/tree/v1.4/examples/nas/cdarts>`__ using it
+* Support a preliminary version of `ProxylessNAS <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/NAS/Proxylessnas.md>`__ and the corresponding `example <https://github.com/microsoft/nni/tree/v1.4/examples/nas/proxylessnas>`__
+* Add unit tests for the NAS framework
+Model Compression
+^^^^^^^^^^^^^^^^^
+* Support DataParallel for compressing models, and provide `an example <https://github.com/microsoft/nni/blob/v1.4/examples/model_compress/multi_gpu.py>`__ of using DataParallel
+* Support `model speedup <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/Compressor/ModelSpeedup.md>`__ for compressed models, in Alpha version
+Training Service
+^^^^^^^^^^^^^^^^
+* Support complete PAI configurations by allowing users to specify PAI config file path
+* Add example config yaml files for the new PAI mode (i.e., paiK8S)
+* Support deleting experiments using sshkey in remote mode (thanks external contributor @tyusr)
+WebUI
+^^^^^
+* WebUI refactor: adopt fabric framework
+Others
+^^^^^^
+* Support running `NNI experiment at foreground <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/Tutorial/Nnictl.md#manage-an-experiment>`__\ , i.e., ``--foreground`` argument in ``nnictl create/resume/view``
+* Support canceling the trials in UNKNOWN state
+* Support large search space whose size could be up to 50 MB (thanks external contributor @Sundrops)
+Documentation
+^^^^^^^^^^^^^
+* Improve `the index structure <https://nni.readthedocs.io/en/latest/>`__ of NNI readthedocs
+* Improve `documentation for NAS <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/NAS/NasGuide.md>`__
+* Improve documentation for `the new PAI mode <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/TrainingService/PaiMode.md>`__
+* Add QuickStart guidance for `NAS <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/NAS/QuickStart.md>`__ and `model compression <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/Compressor/QuickStart.md>`__
+* Improve documentation for `the supported EfficientNet <https://github.com/microsoft/nni/blob/v1.4/docs/en_US/TrialExample/EfficientNet.md>`__
+Bug Fixes
+^^^^^^^^^
+* Correctly support NaN in metric data, JSON compliant
+* Fix the out-of-range bug of ``randint`` type in search space
+* Fix the bug of wrong tensor device when exporting onnx model in model compression
+* Fix incorrect handling of nnimanagerIP in the new PAI mode (i.e., paiK8S)
+Release 1.3 - 12/30/2019
+------------------------
+Major Features
+^^^^^^^^^^^^^^
+Neural Architecture Search Algorithms Support
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* `Single Path One Shot <https://github.com/microsoft/nni/tree/v1.3/examples/nas/spos/>`__ algorithm and the example using it
+Model Compression Algorithms Support
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* `Knowledge Distillation <https://github.com/microsoft/nni/blob/v1.3/docs/en_US/TrialExample/KDExample.md>`__ algorithm and the example using it
+* Pruners
+  * `L2Filter Pruner <https://github.com/microsoft/nni/blob/v1.3/docs/en_US/Compressor/Pruner.md#3-l2filter-pruner>`__
+  * `ActivationAPoZRankFilterPruner <https://github.com/microsoft/nni/blob/v1.3/docs/en_US/Compressor/Pruner.md#1-activationapozrankfilterpruner>`__
+  * `ActivationMeanRankFilterPruner <https://github.com/microsoft/nni/blob/v1.3/docs/en_US/Compressor/Pruner.md#2-activationmeanrankfilterpruner>`__
+* `BNN Quantizer <https://github.com/microsoft/nni/blob/v1.3/docs/en_US/Compressor/Quantizer.md#bnn-quantizer>`__
+Training Service
+^^^^^^^^^^^^^^^^
+* 
+  NFS Support for PAI
+    Since OpenPAI v0.11, OpenPAI can use NFS, AzureBlob, or other storage as its default storage instead of HDFS. In this release, NNI extends its support to cover this change and can integrate with OpenPAI v0.11 or later using any of these default storage options.
+* 
+  Kubeflow update adoption
+    Adopted Kubeflow 0.7's new support for tf-operator.
+Engineering (code and build automation)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Enforced `ESLint <https://eslint.org/>`__ on static code analysis.
+Small changes & Bug Fixes
+^^^^^^^^^^^^^^^^^^^^^^^^^
+* Correctly recognize builtin tuners and customized tuners
+* Logging in dispatcher base
+* Fix the bug where tuner/assessor's failure sometimes kills the experiment.
+* Fix local system as remote machine (`issue <https://github.com/microsoft/nni/issues/1852>`__)
+* De-duplicate trial configurations in SMAC tuner (`ticket <https://github.com/microsoft/nni/issues/1364>`__)
+Release 1.2 - 12/02/2019
+------------------------
+Major Features
+^^^^^^^^^^^^^^
+* `Feature Engineering <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/FeatureEngineering/Overview.md>`__
+  * New feature engineering interface
+  * Feature selection algorithms: `Gradient feature selector <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/FeatureEngineering/GradientFeatureSelector.md>`__ & `GBDT selector <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/FeatureEngineering/GBDTSelector.md>`__
+  * `Examples for feature engineering <https://github.com/microsoft/nni/tree/v1.2/examples/feature_engineering>`__
+* Neural Architecture Search (NAS) on NNI
+  * `New NAS interface <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/NAS/NasInterface.md>`__
+  * NAS algorithms: `ENAS <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/NAS/Overview.md#enas>`__\ , `DARTS <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/NAS/Overview.md#darts>`__\ , `P-DARTS <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/NAS/Overview.md#p-darts>`__ (in PyTorch)
+  * NAS in classic mode (each trial runs independently)
+* Model compression
+  * `New model pruning algorithms <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/Compressor/Overview.md>`__\ : lottery ticket pruning approach, L1Filter pruner, Slim pruner, FPGM pruner
+  * `New model quantization algorithms <https://github.com/microsoft/nni/blob/v1.2/docs/en_US/Compressor/Overview.md>`__\ : QAT quantizer, DoReFa quantizer
+  * Support the API for exporting compressed model.
+* Training Service
+  * Support OpenPAI token authentication
+* Examples:
+  * `An example to automatically tune rocksdb configuration with NNI <https://github.com/microsoft/nni/tree/v1.2/examples/trials/systems/rocksdb-fillrandom>`__.
+  * `A new MNIST trial example supports tensorflow 2.0 <https://github.com/microsoft/nni/tree/v1.2/examples/trials/mnist-tfv2>`__.
+* Engineering Improvements
+  * For remote training service, trial jobs that require no GPU are now scheduled with a round-robin policy instead of randomly.
+  * Pylint rules added to check pull requests; new pull requests need to comply with these `pylint rules <https://github.com/microsoft/nni/blob/v1.2/pylintrc>`__.
+* Web Portal & User Experience
+  * Support adding customized trials.
+  * Users can zoom in and out in the detail graphs, except the Hyper-parameter graph.
+* Documentation
+  * Improved NNI API documentation with more API docstrings.
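The round-robin scheduling mentioned under Engineering Improvements can be illustrated with a minimal sketch. This is not NNI's actual scheduler: the function name ``assign_round_robin`` and the trial and machine identifiers are hypothetical, chosen only to show how a round-robin policy spreads jobs evenly across machines, which a random policy does not guarantee.

```python
from itertools import cycle


def assign_round_robin(trial_ids, machines):
    """Assign each trial to the next machine in turn, wrapping around.

    Hypothetical helper for illustration; not part of the NNI API.
    """
    if not machines:
        raise ValueError("at least one machine is required")
    machine_cycle = cycle(machines)
    # Each trial takes the next machine from the endless cycle.
    return {trial_id: next(machine_cycle) for trial_id in trial_ids}


assignments = assign_round_robin(["t1", "t2", "t3", "t4", "t5"], ["m1", "m2"])
for trial_id, machine in assignments.items():
    print(trial_id, "->", machine)
```

With two machines and five trials, no machine ever holds more than one trial more than the other.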
+Bug fix
+^^^^^^^
+* Fix the table sort issue when failed trials have no metrics. -Issue #1773
+* Maintain the selected status (Maximal/Minimal) when the page is switched. -PR#1710
+* Make hyper-parameters graph's default metric yAxis more accurate. -PR#1736
+* Fix GPU script permission issue. -Issue #1665
+Release 1.1 - 10/23/2019
+------------------------
+Major Features
+^^^^^^^^^^^^^^
+* New tuner: `PPO Tuner <https://github.com/microsoft/nni/blob/v1.1/docs/en_US/Tuner/PPOTuner.md>`__
+* `View stopped experiments <https://github.com/microsoft/nni/blob/v1.1/docs/en_US/Tutorial/Nnictl.md#view>`__
+* Tuners can now use dedicated GPU resource (see ``gpuIndices`` in `tutorial <https://github.com/microsoft/nni/blob/v1.1/docs/en_US/Tutorial/ExperimentConfig.md>`__ for details)
+* Web UI improvements
+  * Trials detail page can now list hyperparameters of each trial, as well as their start and end time (via "add column")
+  * Viewing huge experiments is now less laggy
+* More examples
+  * `EfficientNet PyTorch example <https://github.com/ultmaster/EfficientNet-PyTorch>`__
+  * `Cifar10 NAS example <https://github.com/microsoft/nni/blob/v1.1/examples/trials/nas_cifar10/README.md>`__
+* `Model compression toolkit - Alpha release <https://github.com/microsoft/nni/blob/v1.1/docs/en_US/Compressor/Overview.md>`__\ : We are glad to announce the alpha release of the model compression toolkit on top of NNI. It is still experimental and may evolve based on usage feedback. We invite you to use it, give feedback, and even contribute.
+Fixed Bugs
+^^^^^^^^^^
+* Multiphase job hangs when the search space is exhausted (issue #1204)
+* ``nnictl`` fails when the log is not available (issue #1548)
+Release 1.0 - 9/2/2019
+----------------------
+Major Features
+^^^^^^^^^^^^^^
+* 
+  Tuners and Assessors
+  * Support Auto-Feature generator & selection    -Issue#877  -PR #1387
+    * Provide auto feature interface
+    * Tuner based on beam search
+    * `Add Pakdd example <https://github.com/microsoft/nni/tree/v1.0/examples/trials/auto-feature-engineering>`__
+  * Add a parallel algorithm to improve the performance of TPE with large concurrency.  -PR #1052
+  * Support multiphase for hyperband    -PR #1257
+* 
+  Training Service
+  * Support private docker registry   -PR #755
+  * Engineering Improvements
+    * Python wrapper for the REST API, supporting programmatic retrieval of metric values  -PR #1318
+    * New Python API: get_experiment_id(), get_trial_id()  -PR #1353   -Issue #1331 & -Issue#1368
+    * Optimized NAS Searchspace  -PR #1393
+      * Unify NAS search space with _type -- "mutable_type"
+      * Update random search tuner
+    * Set gpuNum as optional      -Issue #1365
+    * Remove outputDir and dataDir configuration in PAI mode   -Issue #1342
+    * When creating a trial in Kubeflow mode, codeDir will no longer be copied to logDir   -Issue #1224
+* 
+  Web Portal & User Experience
+  * Show the best metric curve during search progress in WebUI  -Issue #1218
+  * Show the current number of parameters list in multiphase experiment   -Issue1210  -PR #1348
+  * Add "Intermediate count" option in AddColumn.      -Issue #1210
+  * Support searching parameter values in WebUI     -Issue #1208
+  * Enable automatic scaling of axes for metric value  in default metric graph   -Issue #1360
+  * Add a detailed documentation link to the nnictl command in the command prompt    -Issue #1260
+  * UX improvement for showing Error log   -Issue #1173
+* 
+  Documentation
+  * Update the docs structure  -Issue #1231
+  * (deprecated) Multi phase document improvement   -Issue #1233  -PR #1242
+    * Add configuration example
+  * `WebUI description improvement <Tutorial/WebUI.rst>`__  -PR #1419
+Bug fix
+^^^^^^^
+* (Bug fix)Fix the broken links in 0.9 release  -Issue #1236
+* (Bug fix)Script for auto-complete
+* (Bug fix)Fix the pipeline issue where only the exit code of the last command in a script was checked.  -PR #1417
+* (Bug fix)quniform for tuners    -Issue #1377
+* (Bug fix)'quniform' has different meaning between GridSearch and other tuners.   -Issue #1335
+* (Bug fix)"nnictl experiment list" gives the status of a "RUNNING" experiment as "INITIALIZED" -PR #1388
+* (Bug fix)SMAC cannot be installed if nni is installed in dev mode    -Issue #1376
+* (Bug fix)The filter button of the intermediate result cannot be clicked   -Issue #1263
+* (Bug fix)API "/api/v1/nni/trial-jobs/xxx" doesn't show a trial's all parameters in multiphase experiment    -Issue #1258
+* (Bug fix)Succeeded trial doesn't have a final result but WebUI shows ×××(FINAL)  -Issue #1207
+* (Bug fix)IT for nnictl stop -Issue #1298
+* (Bug fix)fix security warning
+* (Bug fix)Hyper-parameter page broken  -Issue #1332
+* (Bug fix)Run flake8 tests to find Python syntax errors and undefined names -PR #1217
+Release 0.9 - 7/1/2019
+----------------------
+Major Features
+^^^^^^^^^^^^^^
+* General NAS programming interface
+  * Add ``enas-mode``  and ``oneshot-mode`` for NAS interface: `PR #1201 <https://github.com/microsoft/nni/pull/1201#issue-291094510>`__
+* 
+  `Gaussian Process Tuner with Matern kernel <Tuner/GPTuner.rst>`__
+* 
+  (deprecated) Multiphase experiment supports
+  * Added new training service support for multiphase experiment: PAI mode supports multiphase experiment since v0.9.
+  * Added multiphase capability for the following builtin tuners:
+    * TPE, Random Search, Anneal, Naïve Evolution, SMAC, Network Morphism, Metis Tuner.
+* 
+  Web Portal
+  * Enable trial comparison in Web Portal. For details, refer to `View trials status <Tutorial/WebUI.rst>`__
+  * Allow users to adjust the rendering interval of Web Portal. For details, refer to `View Summary Page <Tutorial/WebUI.rst>`__
+  * Show intermediate results in a friendlier way. For details, refer to `View trials status <Tutorial/WebUI.rst>`__
+* `Commandline Interface <Tutorial/Nnictl.rst>`__
+  * ``nnictl experiment delete``\ : delete one or all experiments, including logs, results, environment information and cache. Use it to delete useless experiment results or to save disk space.
+  * ``nnictl platform clean``\ : clean up disk space on a target platform. The provided YAML file includes the information of the target platform and follows the same schema as the NNI configuration file.
+Bug fix and other changes
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Tuner Installation Improvements: add `sklearn <https://scikit-learn.org/stable/>`__ to nni dependencies.
+* (Bug Fix) Failed to connect to PAI http code - `Issue #1076 <https://github.com/microsoft/nni/issues/1076>`__
+* (Bug Fix) Validate file name for PAI platform - `Issue #1164 <https://github.com/microsoft/nni/issues/1164>`__
+* (Bug Fix) Update GMM evaluation in Metis Tuner
+* (Bug Fix) Negative time number rendering in Web Portal - `Issue #1182 <https://github.com/microsoft/nni/issues/1182>`__\ , `Issue #1185 <https://github.com/microsoft/nni/issues/1185>`__
+* (Bug Fix) Hyper-parameter not shown correctly in WebUI when there is only one hyper parameter - `Issue #1192 <https://github.com/microsoft/nni/issues/1192>`__
+Release 0.8 - 6/4/2019
+----------------------
+Major Features
+^^^^^^^^^^^^^^
+* Support NNI on Windows for OpenPAI/Remote mode
+  * NNI running on Windows for remote mode
+  * NNI running on Windows for OpenPAI mode
+* Advanced features for using GPU
+  * Run multiple trial jobs on the same GPU for local and remote mode
+  * Run trial jobs on the GPU running non-NNI jobs
+* Kubeflow v1beta2 operator
+  * Support Kubeflow TFJob/PyTorchJob v1beta2
+* `General NAS programming interface <https://github.com/microsoft/nni/blob/v0.8/docs/en_US/GeneralNasInterfaces.md>`__
+  * Provide NAS programming interface for users to easily express their neural architecture search space through NNI annotation
+  * Provide a new command ``nnictl trial codegen`` for debugging the NAS code
+  * Tutorial of NAS programming interface, example of NAS on MNIST, customized random tuner for NAS
+* Support resuming the tuner/advisor's state when an experiment is resumed
+* When an experiment is resumed, the tuner/advisor is restored by replaying the finished trial data
+* Web Portal
+  * Improve the design of copying trial's parameters
+  * Support 'randint' type in hyper-parameter graph
+  * Use shouldComponentUpdate to avoid unnecessary rendering
+Bug fix and other changes
+^^^^^^^^^^^^^^^^^^^^^^^^^
+* Bug fix that ``nnictl update`` has inconsistent command styles
+* Support import data for SMAC tuner
+* Bug fix that experiment state transition from ERROR back to RUNNING
+* Fix bug of table entries
+* Nested search space refinement
+* Refine 'randint' type and support lower bound
+* `Comparison of different hyper-parameter tuning algorithm <CommunitySharings/HpoComparison.rst>`__
+* `Comparison of NAS algorithm <CommunitySharings/NasComparison.rst>`__
+* `NNI practice on Recommenders <CommunitySharings/RecommendersSvd.rst>`__
+Release 0.7 - 4/29/2019
+-----------------------
+Major Features
+^^^^^^^^^^^^^^
+* `Support NNI on Windows <Tutorial/InstallationWin.rst>`__
+  * NNI running on Windows for local mode
+* `New advisor: BOHB <Tuner/BohbAdvisor.rst>`__
+  * Support a new advisor, BOHB, a robust and efficient hyperparameter tuning algorithm that combines the advantages of Bayesian optimization and Hyperband
+* `Support import and export experiment data through nnictl <Tutorial/Nnictl.rst>`__
+  * Generate analysis results report after the experiment execution
+  * Support import data to tuner and advisor for tuning
+* `Designated gpu devices for NNI trial jobs <Tutorial/ExperimentConfig.rst#localConfig>`__
+  * Specify GPU devices for NNI trial jobs via the gpuIndices configuration. If gpuIndices is set in the experiment configuration file, only the specified GPU devices are used for NNI trial jobs.
+* Web Portal enhancement
+  * Decimal format of metrics other than default on the Web UI
+  * Hints in WebUI about Multi-phase
+  * Enable copy/paste for hyperparameters as python dict
+  * Enable early stopped trials data for tuners.
+* NNICTL provides better error messages
+  * nnictl provides more meaningful error messages for YAML file format errors
+Bug fix
+^^^^^^^
+* Unable to kill all python threads after nnictl stop in async dispatcher mode
+* nnictl --version does not work with make dev-install
+* The status of all trial jobs stays 'waiting' for a long time on the OpenPAI platform
+Release 0.6 - 4/2/2019
+----------------------
+Major Features
+^^^^^^^^^^^^^^
+* `Version checking <TrainingService/PaiMode.rst>`__
+  * Check whether the version is consistent between nniManager and trialKeeper
+* `Report final metrics for early stop job <https://github.com/microsoft/nni/issues/776>`__
+  * If includeIntermediateResults is true, the last intermediate result of the trial that is early stopped by assessor is sent to tuner as final result. The default value of includeIntermediateResults is false.
+* `Separate Tuner/Assessor <https://github.com/microsoft/nni/issues/841>`__
+  * Adds two pipes to separate message receiving channels for tuner and assessor.
+* Make log collection feature configurable
+* Add intermediate result graph for all trials
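The ``includeIntermediateResults`` behavior described above for early-stopped jobs can be sketched as a small decision function. This is only an illustration of the rule as stated in the release note, not NNI's implementation: the function name ``result_sent_to_tuner`` and its parameters are hypothetical.

```python
def result_sent_to_tuner(final_result, intermediate_results,
                         early_stopped, include_intermediate_results=False):
    """Return the value the tuner would receive for a trial, or None.

    Hypothetical helper mirroring the release note: when a trial is
    early stopped by the assessor and includeIntermediateResults is true,
    its last intermediate result is reported as the final result.
    """
    if final_result is not None:
        # The trial finished normally and reported its own final result.
        return final_result
    if early_stopped and include_intermediate_results and intermediate_results:
        return intermediate_results[-1]
    # Default (includeIntermediateResults is false): nothing is sent.
    return None


print(result_sent_to_tuner(None, [0.1, 0.4], early_stopped=True,
                           include_intermediate_results=True))
```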
+Bug fix
+^^^^^^^
+* `Add shmMB config key for OpenPAI <https://github.com/microsoft/nni/issues/842>`__
+* Fix the bug that no result is shown if the metric is a dict
+* Fix the number calculation issue for float types in hyperband
+* Fix a bug in the search space conversion in SMAC tuner
+* Fix the WebUI issue when parsing experiment.json with illegal format
+* Fix cold start issue in Metis Tuner
+Release 0.5.2 - 3/4/2019
+------------------------
+Improvements
+^^^^^^^^^^^^
+* Curve fitting assessor performance improvement.
+Documentation
+^^^^^^^^^^^^^
+* Chinese version document: https://nni.readthedocs.io/zh/latest/
+* Debuggability/serviceability document: https://nni.readthedocs.io/en/latest/Tutorial/HowToDebug.html
+* Tuner assessor reference: https://nni.readthedocs.io/en/latest/sdk_reference.html
+Bug Fixes and Other Changes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Fix a race condition bug that does not store trial job cancel status correctly.
+* Fix search space parsing error when using SMAC tuner.
+* Fix cifar10 example broken pipe issue.
+* Add unit test cases for nnimanager and local training service.
+* Add integration test azure pipelines for remote machine, OpenPAI and kubeflow training services.
+* Support Pylon in OpenPAI webhdfs client.
+Release 0.5.1 - 1/31/2019
+-------------------------
+Improvements
+^^^^^^^^^^^^
+* Making `log directory <https://github.com/microsoft/nni/blob/v0.5.1/docs/ExperimentConfig.md>`__ configurable
+* Support `different levels of logs <https://github.com/microsoft/nni/blob/v0.5.1/docs/ExperimentConfig.md>`__\ , making it easier for debugging
+Documentation
+^^^^^^^^^^^^^
+* Reorganized documentation & New Homepage Released: https://nni.readthedocs.io/en/latest/
+Bug Fixes and Other Changes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Fix the bug of installation in python virtualenv, and refactor the installation logic
+* Fix the bug of HDFS access failure on OpenPAI mode after OpenPAI is upgraded.
+* Fix the bug that sometimes in-place flushed stdout makes experiment crash
+Release 0.5.0 - 01/14/2019
+--------------------------
+Major Features
+^^^^^^^^^^^^^^
+New tuner and assessor supports
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Support `Metis tuner <Tuner/MetisTuner.rst>`__ as a new NNI tuner. The Metis algorithm has been proven to perform well for **online** hyper-parameter tuning.
+* Support `ENAS customized tuner <https://github.com/countif/enas_nni>`__\ , a tuner contributed by a GitHub community user. ENAS is a neural architecture search algorithm that learns network architectures via reinforcement learning and achieves better performance than NAS.
+* Support `Curve fitting assessor <Assessor/CurvefittingAssessor.rst>`__ for early stop policy using learning curve extrapolation.
+* Advanced Support of `Weight Sharing <https://github.com/microsoft/nni/blob/v0.5/docs/AdvancedNAS.md>`__\ : Enable weight sharing for NAS tuners, currently through NFS.
+Training Service Enhancement
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* `FrameworkController Training service <TrainingService/FrameworkControllerMode.rst>`__\ : Support running experiments using FrameworkController on Kubernetes
+  * FrameworkController is a controller on Kubernetes that is general enough to run (distributed) jobs with various machine learning frameworks, such as TensorFlow, PyTorch and MXNet.
+  * NNI provides unified and simple specification for job definition.
+  * MNIST example for how to use FrameworkController.
+User Experience improvements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* A better trial logging support for NNI experiments in OpenPAI, Kubeflow and FrameworkController mode:
+  * An improved logging architecture to send stdout/stderr of trials to NNI manager via HTTP POST. NNI manager will store each trial's stdout/stderr messages in a local log file.
+  * Show the link for trial log file on WebUI.
+* Support showing all key-value pairs of the final result.
+Release 0.4.1 - 12/14/2018
+--------------------------
+Major Features
+^^^^^^^^^^^^^^
+New tuner supports
+^^^^^^^^^^^^^^^^^^
+* Support `network morphism <Tuner/NetworkmorphismTuner.rst>`__ as a new tuner
+Training Service improvements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Migrate `Kubeflow training service <TrainingService/KubeflowMode.rst>`__\ 's dependency from kubectl CLI to `Kubernetes API <https://kubernetes.io/docs/concepts/overview/kubernetes-api/>`__ client
+* `Pytorch-operator <https://github.com/kubeflow/pytorch-operator>`__ support for Kubeflow training service
+* Improvement on local code files uploading to OpenPAI HDFS
+* Fixed OpenPAI integration WebUI bug: WebUI doesn't show latest trial job status, which is caused by OpenPAI token expiration
+NNICTL improvements
+^^^^^^^^^^^^^^^^^^^
+* Show version information both in nnictl and WebUI. You can run **nnictl -v** to show your current installed NNI version
+WebUI improvements
+^^^^^^^^^^^^^^^^^^
+* Enable modifying the concurrency number during an experiment
+* Add feedback link to NNI GitHub 'create issue' page
+* Enable customizing the top 10 trials by metric value (largest or smallest)
+* Enable downloading logs for dispatcher & nnimanager
+* Enable automatic scaling of axes for metric number
+* Update annotation to support displaying real choice in searchspace
+New examples
+^^^^^^^^^^^^
+* `FashionMnist <https://github.com/microsoft/nni/tree/v0.5/examples/trials/network_morphism>`__\ , which works together with the network morphism tuner
+* `Distributed MNIST example <https://github.com/microsoft/nni/tree/v0.5/examples/trials/mnist-distributed-pytorch>`__ written in PyTorch
+Release 0.4 - 12/6/2018
+-----------------------
+Major Features
+^^^^^^^^^^^^^^
+* `Kubeflow Training service <TrainingService/KubeflowMode.rst>`__
+  * Support tf-operator
+  * `Distributed trial example <https://github.com/microsoft/nni/tree/v0.4/examples/trials/mnist-distributed/dist_mnist.py>`__ on Kubeflow
+* `Grid search tuner <Tuner/GridsearchTuner.rst>`__
+* `Hyperband tuner <Tuner/HyperbandAdvisor.rst>`__
+* Support launching NNI experiments on macOS
+* WebUI
+  * UI support for hyperband tuner
+  * Remove tensorboard button
+  * Show experiment error message
+  * Show line numbers in search space and trial profile
+  * Support searching for a specific trial by trial number
+  * Show trial's hdfsLogPath
+  * Download experiment parameters
+Others
+^^^^^^
+* Asynchronous dispatcher
+* Docker file update, add pytorch library
+* Refactor the 'nnictl stop' process to send SIGTERM to the NNI manager process rather than calling the stop REST API.
+* OpenPAI training service bug fix
+  * Support NNI Manager IP configuration(nniManagerIp) in OpenPAI cluster config file, to fix the issue that user’s machine has no eth0 device
+  * The number of files in codeDir is now capped at 1000, to prevent users from mistakenly setting the root directory as codeDir
+  * Don’t print useless ‘metrics is empty’ log in OpenPAI job’s stdout. Only print useful message once new metrics are recorded, to reduce confusion when user checks OpenPAI trial’s output for debugging purpose
+  * Add timestamp at the beginning of each log entry in trial keeper.
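The codeDir file cap mentioned above can be illustrated with a short sketch. This is an assumption about how such a check might look, not the code NNI uses: ``count_files`` and ``check_code_dir`` are hypothetical names, and only the limit of 1000 comes from the release note.

```python
import os

MAX_FILES = 1000  # cap stated in the release note


def count_files(code_dir, limit=MAX_FILES):
    """Count files under code_dir, stopping early once the limit is exceeded."""
    count = 0
    for _root, _dirs, files in os.walk(code_dir):
        count += len(files)
        if count > limit:
            # No need to walk a huge tree (e.g. "/") to the end.
            break
    return count


def check_code_dir(code_dir, limit=MAX_FILES):
    """Raise ValueError if code_dir holds more files than the limit."""
    if count_files(code_dir, limit) > limit:
        raise ValueError(
            f"codeDir {code_dir!r} contains more than {limit} files; "
            "did you point it at the wrong directory?"
        )
```

Stopping the walk early matters here because the check exists precisely for the case where codeDir is accidentally a very large directory.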
+Release 0.3.0 - 11/2/2018
+-------------------------
+NNICTL new features and updates
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* 
+  Support running multiple experiments simultaneously.
+  Before v0.3, NNI supported running only a single experiment at a time. From this release on, users can run multiple experiments simultaneously. Each experiment requires a unique port; the first experiment uses the default port, as in previous versions. You can specify a unique port for each of the other experiments as below:
+  .. code-block:: bash
+     nnictl create --port 8081 --config <config file path>
+* 
+  Support updating max trial number.
+  Use ``nnictl update --help`` to learn more, or refer to `NNICTL Spec <Tutorial/Nnictl.rst>`__ for the full usage of NNICTL.
+API new features and updates
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* 
+  :raw-html:`<span style="color:red">**breaking change**</span>`\ : nni.get_parameters() is refactored to nni.get_next_parameter(). Examples from prior releases cannot run on v0.3; please clone the NNI repo to get the new examples. If you have applied NNI to your own code, please update the API calls accordingly.
+  .. code-block:: bash
+     git clone -b v0.3 https://github.com/microsoft/nni.git
+* 
+  New API **nni.get_sequence_id()**.
+  Each trial job is allocated a unique sequence number, which can be retrieved with the nni.get_sequence_id() API.
+* 
+  **nni.report_final_result(result)** API supports more data types for result parameter.
+  It can be of following types:
+  * int
+  * float
+  * A python dict containing 'default' key, the value of 'default' key should be of type int or float. The dict can contain any other key value pairs.
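The accepted types for the ``result`` parameter listed above can be expressed as a small validation sketch. It does not call NNI and is not NNI's own check: ``is_valid_final_result`` is a hypothetical helper, and rejecting ``bool`` (a subclass of ``int`` in Python) is an assumption made here for clarity.

```python
def is_valid_final_result(result):
    """Return True if result is an int, a float, or a dict whose
    'default' value is an int or float (other keys are allowed)."""

    def is_number(value):
        # Assumption: booleans are not treated as metric values.
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    if is_number(result):
        return True
    if isinstance(result, dict):
        return is_number(result.get("default"))
    return False


for candidate in (1, 0.93, {"default": 0.93, "loss": 0.1}, {"loss": 0.1}):
    print(candidate, is_valid_final_result(candidate))
```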
+New tuner support
+^^^^^^^^^^^^^^^^^
+* **Batch Tuner**, which iterates over all parameter combinations, can be used to submit batch trial jobs.
+New examples
+^^^^^^^^^^^^
+* 
+  An NNI Docker image for public usage:
+  .. code-block:: bash
+     docker pull msranni/nni:latest
+* 
+  New trial example: `NNI Sklearn Example <https://github.com/microsoft/nni/tree/v0.3/examples/trials/sklearn>`__
+* New competition example: `Kaggle Competition TGS Salt Example <https://github.com/microsoft/nni/tree/v0.3/examples/trials/kaggle-tgs-salt>`__
+Others
+^^^^^^
+* UI refactoring, refer to `WebUI doc <Tutorial/WebUI.rst>`__ for how to work with the new UI.
+* Continuous Integration: NNI has switched to Azure Pipelines
+Release 0.2.0 - 9/29/2018
+-------------------------
+Major Features
+^^^^^^^^^^^^^^
+* Support `OpenPAI <https://github.com/microsoft/pai>`__ Training Platform (See `here <TrainingService/PaiMode.rst>`__ for instructions about how to submit NNI job in pai mode)
+  * Support training services on pai mode. NNI trials will be scheduled to run on OpenPAI cluster
+  * NNI trial's output (including logs and model file) will be copied to OpenPAI HDFS for further debugging and checking
+* Support `SMAC <https://www.cs.ubc.ca/~hutter/papers/10-TR-SMAC.pdf>`__ tuner (See `here <Tuner/SmacTuner.rst>`__ for instructions about how to use SMAC tuner)
+  * `SMAC <https://www.cs.ubc.ca/~hutter/papers/10-TR-SMAC.pdf>`__ is based on Sequential Model-Based Optimization (SMBO). It adapts the most prominent previously used model class (Gaussian stochastic process models) and introduces the model class of random forests to SMBO to handle categorical parameters. The SMAC supported by NNI is a wrapper on `SMAC3 <https://github.com/automl/SMAC3>`__
+* Support NNI installation on `conda <https://conda.io/docs/index.html>`__ and python virtual environment
+* Others
+  * Update ga squad example and related documentation
+  * WebUI UX small enhancement and bug fix
+Release 0.1.0 - 9/10/2018 (initial release)
+-------------------------------------------
+Initial release of Neural Network Intelligence (NNI).
+Major Features
+^^^^^^^^^^^^^^
+* Installation and Deployment
+  * Support installation via pip and from source code
+  * Support training services on local mode(including Multi-GPU mode) as well as multi-machines mode
+* Tuners, Assessors and Trial
+  * Support AutoML algorithms including:  hyperopt_tpe, hyperopt_annealing, hyperopt_random, and evolution_tuner
+  * Support assessor(early stop) algorithms including: medianstop algorithm
+  * Provide Python API for user defined tuners and assessors
+  * Provide Python API for users to wrap trial code as NNI deployable code
+* Experiments
+  * Provide a command line toolkit 'nnictl' for experiments management
+  * Provide a WebUI for viewing experiments details and managing experiments
+* Continuous Integration
+  * Support CI by providing out-of-box integration with `travis-ci <https://github.com/travis-ci>`__ on ubuntu
+* Others
+  * Support simple GPU job scheduling
--- a/docs/en_US/Release_v1.0.md
+++ b/docs/en_US/Release_v1.0.md
+<p align="center">
+<img src=".././img/release-1-title-1.png" width="100%" />
+</p>
+From September 2018 to September 2019, we are still moving on …
+ **Great news!**&nbsp;&nbsp;With the tags of **Scalability** and **Ease of Use**, NNI v1.0 is coming. Based on various types of [Tuning Algorithms](./Tuner/BuiltinTuner.md), NNI supports hyperparameter tuning, neural architecture search and auto feature engineering, which is exciting news for algorithm engineers. Besides these, NNI v1.0 makes many improvements in the optimization of tuning algorithms, [WebUI's simplicity and intuition](./Tutorial/WebUI.md) and [Platform diversification](./TrainingService/SupportTrainingService.md). NNI has grown into a more intelligent automated machine learning (AutoML) toolkit.
+<br/>
+<br/>
+<br/>
+<p align="center">
+<img src=".././img/nni-1.png" width="80%" />
+</p>
+<br />
+<br />
+<p align="center">
+<img src=".././img/release-1-title-2.png" width="100%" />
+</p>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**Step one**: Start with the [Tutorial Doc](./Tutorial/Installation.md), and install NNI v1.0 first.<br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**Step two**: Find a "Hello world" example, follow the [Tutorial Doc](./Tutorial/QuickStart.md) and have a Quick Start. <br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**Step three**: Get familiar with the [WebUI Tutorial](./Tutorial/WebUI.md) and let NNI better assist with your tuning tour.<br>
+The fully automated tool greatly improves the efficiency of the tuning process. For more detail about the 1.0 updates, refer to [Release 1.0](https://github.com/microsoft/nni/releases). For our future plans, refer to our [Roadmap](https://github.com/microsoft/nni/wiki/Roadmap). We also welcome more contributors to join us; there are many ways to participate, so please refer to [How to contribute](./Tutorial/Contributing.md) for more details.
\ No newline at end of file
--- a/docs/en_US/ResearchPublications.rst
+++ b/docs/en_US/ResearchPublications.rst
+Research and Publications
+=========================
+We are intensively working on both tool chain and research to make automatic model design and tuning really practical and powerful. On the one hand, our main work is tool chain oriented development. On the other hand, our research works aim to improve this tool chain, rethink challenging problems in AutoML (on both system and algorithm) and propose elegant solutions. Below we list some of our research works. We encourage more research on this topic and welcome collaboration with us.
+System Research
+---------------
+* `Retiarii: A Deep Learning Exploratory-Training Framework <https://www.usenix.org/system/files/osdi20-zhang_quanlu.pdf>`__
+.. code-block:: bibtex
+   @inproceedings{zhang2020retiarii,
+     title={Retiarii: A Deep Learning Exploratory-Training Framework},
+     author={Zhang, Quanlu and Han, Zhenhua and Yang, Fan and Zhang, Yuge and Liu, Zhe and Yang, Mao and Zhou, Lidong},
+     booktitle={14th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 20)},
+     pages={919--936},
+     year={2020}
+   }
+* `AutoSys: The Design and Operation of Learning-Augmented Systems <https://www.usenix.org/system/files/atc20-liang-chieh-jan.pdf>`__
+.. code-block:: bibtex
+   @inproceedings{liang2020autosys,
+     title={AutoSys: The Design and Operation of Learning-Augmented Systems},
+     author={Liang, Chieh-Jan Mike and Xue, Hui and Yang, Mao and Zhou, Lidong and Zhu, Lifei and Li, Zhao Lucis and Wang, Zibo and Chen, Qi and Zhang, Quanlu and Liu, Chuanjie and others},
+     booktitle={2020 {USENIX} Annual Technical Conference ({USENIX} {ATC} 20)},
+     pages={323--336},
+     year={2020}
+   }
+* `Gandiva: Introspective Cluster Scheduling for Deep Learning <https://www.usenix.org/system/files/osdi18-xiao.pdf>`__
+.. code-block:: bibtex
+   @inproceedings{xiao2018gandiva,
+     title={Gandiva: Introspective cluster scheduling for deep learning},
+     author={Xiao, Wencong and Bhardwaj, Romil and Ramjee, Ramachandran and Sivathanu, Muthian and Kwatra, Nipun and Han, Zhenhua and Patel, Pratyush and Peng, Xuan and Zhao, Hanyu and Zhang, Quanlu and others},
+     booktitle={13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18)},
+     pages={595--610},
+     year={2018}
+   }
+Algorithm Research
+------------------
+New Algorithms
+^^^^^^^^^^^^^^
+* `TextNAS: A Neural Architecture Search Space Tailored for Text Representation <https://arxiv.org/pdf/1912.10729.pdf>`__
+.. code-block:: bibtex
+   @inproceedings{wang2020textnas,
+     title={TextNAS: A Neural Architecture Search Space Tailored for Text Representation.},
+     author={Wang, Yujing and Yang, Yaming and Chen, Yiren and Bai, Jing and Zhang, Ce and Su, Guinan and Kou, Xiaoyu and Tong, Yunhai and Yang, Mao and Zhou, Lidong},
+     booktitle={AAAI},
+     pages={9242--9249},
+     year={2020}
+   }
+* `Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search <https://papers.nips.cc/paper/2020/file/d072677d210ac4c03ba046120f0802ec-Paper.pdf>`__
+.. code-block:: bibtex
+   @article{peng2020cream,
+     title={Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search},
+     author={Peng, Houwen and Du, Hao and Yu, Hongyuan and Li, Qi and Liao, Jing and Fu, Jianlong},
+     journal={Advances in Neural Information Processing Systems},
+     volume={33},
+     year={2020}
+   }
+* `Metis: Robustly tuning tail latencies of cloud systems <https://www.usenix.org/system/files/conference/atc18/atc18-li-zhao.pdf>`__
+.. code-block:: bibtex
+   @inproceedings{li2018metis,
+     title={Metis: Robustly tuning tail latencies of cloud systems},
+     author={Li, Zhao Lucis and Liang, Chieh-Jan Mike and He, Wenjia and Zhu, Lianjie and Dai, Wenjun and Jiang, Jin and Sun, Guangzhong},
+     booktitle={2018 {USENIX} Annual Technical Conference ({USENIX} {ATC} 18)},
+     pages={981--992},
+     year={2018}
+   }
+* `OpEvo: An Evolutionary Method for Tensor Operator Optimization <https://arxiv.org/abs/2006.05664>`__
+.. code-block:: bibtex
+   @article{Gao2021opevo,
+     title={OpEvo: An Evolutionary Method for Tensor Operator Optimization},
+     author={Gao, Xiaotian and Cui, Wei and Zhang, Lintao and Yang, Mao},
+     journal={Proceedings of the AAAI Conference on Artificial Intelligence},
+     volume={35},
+     number={14},
+     pages={12320--12327},
+     year={2021},
+     month={May},
+     url={https://ojs.aaai.org/index.php/AAAI/article/view/17462}
+   }
+Measurement and Understanding
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* `Deeper insights into weight sharing in neural architecture search <https://arxiv.org/pdf/2001.01431.pdf>`__
+.. code-block:: bibtex
+   @article{zhang2020deeper,
+     title={Deeper insights into weight sharing in neural architecture search},
+     author={Zhang, Yuge and Lin, Zejun and Jiang, Junyang and Zhang, Quanlu and Wang, Yujing and Xue, Hui and Zhang, Chen and Yang, Yaming},
+     journal={arXiv preprint arXiv:2001.01431},
+     year={2020}
+   }
+* `How Does Supernet Help in Neural Architecture Search? <https://arxiv.org/abs/2010.08219>`__
+.. code-block:: bibtex
+   @article{zhang2020does,
+     title={How Does Supernet Help in Neural Architecture Search?},
+     author={Zhang, Yuge and Zhang, Quanlu and Yang, Yaming},
+     journal={arXiv preprint arXiv:2010.08219},
+     year={2020}
+   }
+Applications
+^^^^^^^^^^^^
+* `AutoADR: Automatic Model Design for Ad Relevance <https://arxiv.org/pdf/2010.07075.pdf>`__
+.. code-block:: bibtex
+   @inproceedings{chen2020autoadr,
+     title={AutoADR: Automatic Model Design for Ad Relevance},
+     author={Chen, Yiren and Yang, Yaming and Sun, Hong and Wang, Yujing and Xu, Yu and Shen, Wei and Zhou, Rong and Tong, Yunhai and Bai, Jing and Zhang, Ruofei},
+     booktitle={Proceedings of the 29th ACM International Conference on Information \& Knowledge Management},
+     pages={2365--2372},
+     year={2020}
+   }
--- a/docs/en_US/SupportedFramework_Library.rst
+++ b/docs/en_US/SupportedFramework_Library.rst
+.. role:: raw-html(raw)
+   :format: html
+Framework and Library Supports
+==============================
+With the built-in Python API, NNI naturally supports hyperparameter tuning and neural network search for all AI frameworks and libraries that support Python models (\ ``version >= 3.6``\ ). NNI also provides a set of examples and tutorials for some popular scenarios to make getting started easier.
+Supported AI Frameworks
+-----------------------
+* `PyTorch <https://github.com/pytorch/pytorch>`__
+  * :githublink:`MNIST-pytorch <examples/trials/mnist-distributed-pytorch>`
+  * `CIFAR-10 <./TrialExample/Cifar10Examples.rst>`__
+  * :githublink:`TGS salt identification challenge <examples/trials/kaggle-tgs-salt/README.md>`
+  * :githublink:`Network_morphism <examples/trials/network_morphism/README.md>`
+* `TensorFlow <https://github.com/tensorflow/tensorflow>`__
+  * :githublink:`MNIST-tensorflow <examples/trials/mnist-distributed>`
+  * :githublink:`Squad <examples/trials/ga_squad/README.md>`
+* `Keras <https://github.com/keras-team/keras>`__
+  * :githublink:`MNIST-keras <examples/trials/mnist-keras>`
+  * :githublink:`Network_morphism <examples/trials/network_morphism/README.md>`
+* `MXNet <https://github.com/apache/incubator-mxnet>`__
+* `Caffe2 <https://github.com/BVLC/caffe>`__
+* `CNTK (Python language) <https://github.com/microsoft/CNTK>`__
+* `Spark MLlib <http://spark.apache.org/mllib/>`__
+* `Chainer <https://chainer.org/>`__
+* `Theano <https://pypi.org/project/Theano/>`__
+You are encouraged to `contribute more examples <Tutorial/Contributing.rst>`__ for other NNI users. 
+Supported Library
+-----------------
+NNI also supports all libraries written in Python. Here are some common libraries, including some GBDT-based algorithms: XGBoost, CatBoost and LightGBM.
+* `Scikit-learn <https://scikit-learn.org/stable/>`__
+  * `Scikit-learn <TrialExample/SklearnExamples.rst>`__
+* `XGBoost <https://xgboost.readthedocs.io/en/latest/>`__
+* `CatBoost <https://catboost.ai/>`__
+* `LightGBM <https://lightgbm.readthedocs.io/en/latest/>`__
+  * `Auto-gbdt <TrialExample/GbdtExample.rst>`__
+This is only a small list of the libraries supported by NNI. If you are interested in NNI, you can refer to the `tutorial <TrialExample/Trials.rst>`__ to complete your own hacks.
+In addition to the above examples, we welcome users to apply NNI to their own work. If you have any questions, please refer to `Write a Trial Run on NNI <TrialExample/Trials.rst>`__. In particular, if you want to contribute to NNI, whether by sharing examples, writing a Tuner or otherwise, we look forward to your participation. For more information, please refer to `here <Tutorial/Contributing.rst>`__.
--- a/docs/en_US/TrainingService/AMLMode.rst
+++ b/docs/en_US/TrainingService/AMLMode.rst
+**Run an Experiment on Azure Machine Learning**
+===================================================
+NNI supports running an experiment on `AML <https://azure.microsoft.com/en-us/services/machine-learning/>`__ , called aml mode.
+Setup environment
+-----------------
+Step 1. Install NNI by following the install guide `here <../Tutorial/QuickStart.rst>`__.
+Step 2. Create an Azure account/subscription using this `link <https://azure.microsoft.com/en-us/free/services/machine-learning/>`__. If you already have an Azure account/subscription, skip this step.
+Step 3. Install the Azure CLI on your machine by following the install guide `here <https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest>`__.
+Step 4. Authenticate to your Azure subscription from the CLI. To authenticate interactively, open a command line or terminal and use the following command:
+.. code-block:: bash
+   az login
+Step 5. Log into your Azure account with a web browser and create a Machine Learning resource. You will need to choose a resource group and specify a workspace name. Then download ``config.json``\ , which will be used later.
+.. image:: ../../img/aml_workspace.png
+   :target: ../../img/aml_workspace.png
+   :alt: 
+Step 6. Create an AML cluster as the computeTarget.
+.. image:: ../../img/aml_cluster.png
+   :target: ../../img/aml_cluster.png
+   :alt: 
+Step 7. Open a command line and install the AML packages.
+.. code-block:: bash
+   python3 -m pip install azureml
+   python3 -m pip install azureml-sdk
+Run an experiment
+-----------------
+Use ``examples/trials/mnist-pytorch`` as an example. The NNI config YAML file's content is like:
+.. code-block:: yaml
+   searchSpaceFile: search_space.json
+   trialCommand: python3 mnist.py
+   trialConcurrency: 1
+   maxTrialNumber: 10
+   tuner:
+     name: TPE
+     classArgs:
+       optimize_mode: maximize
+   trainingService:
+     platform: aml
+     dockerImage: msranni/nni
+     subscriptionId: ${your subscription ID}
+     resourceGroup: ${your resource group}
+     workspaceName: ${your workspace name}
+     computeTarget: ${your compute target}
+Note: You should set ``platform: aml`` in the NNI config YAML file if you want to start the experiment in aml mode.
+Compared with `LocalMode <LocalMode.rst>`__\ , the training service configuration in aml mode has these additional keys:
+* dockerImage
+  * required key. The Docker image name used in the job. NNI provides the image ``msranni/nni`` for running aml jobs.
+.. Note:: This image is built on a CUDA environment and may not be suitable for CPU clusters in AML.
+amlConfig:
+* subscriptionId
+  * required key, the subscriptionId of your account
+* resourceGroup
+  * required key, the resourceGroup of your account
+* workspaceName
+  * required key, the workspaceName of your account
+* computeTarget
+  * required key, the name of the compute cluster you want to use in your AML workspace (see Step 6). For background, see the `compute target documentation <https://docs.microsoft.com/en-us/azure/machine-learning/concept-compute-target>`__.
+* maxTrialNumberPerGpu
+  * optional key, default 1. Specifies the maximum number of concurrent trials on a GPU device.
+* useActiveGpu
+  * optional key, default false. Specifies whether to use a GPU that is already running another process. By default, NNI will use a GPU only if there is no other active process on it.
+The required values for amlConfig can be found in the ``config.json`` downloaded in Step 5.
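+As a rough sketch of how the values in ``config.json`` map onto the ``trainingService`` keys listed above, the snippet below assumes that the downloaded file uses the keys ``subscription_id``\ , ``resource_group`` and ``workspace_name``. The helper name ``aml_training_service`` and the sample values are hypothetical; this is an illustration, not part of NNI.

```python
import json


def aml_training_service(workspace_config: dict, compute_target: str,
                         docker_image: str = "msranni/nni") -> dict:
    """Build a ``trainingService`` section from a workspace config dict.

    Assumption: the dict uses the keys subscription_id, resource_group and
    workspace_name. Adjust the lookups if your config.json differs.
    """
    return {
        "platform": "aml",
        "dockerImage": docker_image,
        "subscriptionId": workspace_config["subscription_id"],
        "resourceGroup": workspace_config["resource_group"],
        "workspaceName": workspace_config["workspace_name"],
        "computeTarget": compute_target,
    }


# Stand-in for the contents of config.json (made-up values). Real code would
# load the downloaded file with json.load instead.
sample = json.loads(
    '{"subscription_id": "0000", "resource_group": "my-rg", "workspace_name": "my-ws"}'
)
section = aml_training_service(sample, compute_target="my-cluster")
print(json.dumps(section, indent=2))
```

+The printed dictionary mirrors the ``trainingService`` block of the YAML file, so its values can be copied into ``config_aml.yml`` by hand.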
+Run the following commands to start the example experiment:
+.. code-block:: bash
+   git clone -b ${NNI_VERSION} https://github.com/microsoft/nni
+   cd nni/examples/trials/mnist-pytorch
+   # modify config_aml.yml ...
+   nnictl create --config config_aml.yml
+Replace ``${NNI_VERSION}`` with a released version name or branch name, e.g., ``v2.4``.
+Monitor your code in the cloud by using the studio
+--------------------------------------------------
+To monitor your job's code, visit the studio you created in Step 5. Once the job completes, go to the Outputs + logs tab. There you can see a ``70_driver_log.txt`` file. This file contains the standard output from a run and can be useful when you're debugging remote runs in the cloud. Learn more about AML `here <https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-hello-world>`__.
--- a/docs/en_US/TrainingService/AdaptDLMode.rst
+++ b/docs/en_US/TrainingService/AdaptDLMode.rst
+Run an Experiment on AdaptDL
+============================
+NNI supports running experiments on `AdaptDL <https://github.com/petuum/adaptdl>`__. Before starting to use NNI AdaptDL mode, you should have a Kubernetes cluster, either on-premises or on `Azure Kubernetes Service(AKS) <https://azure.microsoft.com/en-us/services/kubernetes-service/>`__\ , and an Ubuntu machine on which `kubeconfig <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/>`__ is set up to connect to your Kubernetes cluster. In AdaptDL mode, your trial program will run as an AdaptDL job in the Kubernetes cluster.
+AdaptDL aims to make distributed deep learning easy and efficient in dynamic-resource environments such as shared clusters and the cloud.
+Prerequisite for Kubernetes Service
+-----------------------------------
+#. A **Kubernetes** cluster using Kubernetes 1.14 or later with storage. Follow this guideline to set up Kubernetes `on Azure <https://azure.microsoft.com/en-us/services/kubernetes-service/>`__\ , or `on-premise <https://kubernetes.io/docs/setup/>`__ with `cephfs <https://kubernetes.io/docs/concepts/storage/storage-classes/#ceph-rbd>`__\ , or `microk8s with storage add-on enabled <https://microk8s.io/docs/addons>`__.
+#. Install the **AdaptDL Scheduler** on your Kubernetes cluster with Helm. Follow this `guideline <https://adaptdl.readthedocs.io/en/latest/installation/install-adaptdl.html>`__ to set up the AdaptDL scheduler.
+#. Prepare a **kubeconfig** file, which will be used by NNI to interact with your Kubernetes API server. By default, NNI manager will use ``$(HOME)/.kube/config`` as the kubeconfig file's path. You can also specify other kubeconfig files by setting the **KUBECONFIG** environment variable. Refer to this `guideline <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig>`__ to learn more about kubeconfig.
+#. If your NNI trial job needs GPU resources, you should follow this `guideline <https://github.com/NVIDIA/k8s-device-plugin>`__ to configure the **Nvidia device plugin for Kubernetes**.
+#. (Optional) Prepare an **NFS server** and export a general purpose mount as external storage.
+#. Install **NNI** by following the install guide `here <../Tutorial/QuickStart.rst>`__.
+Verify Prerequisites
+^^^^^^^^^^^^^^^^^^^^
+.. code-block:: bash
+   nnictl --version
+   # Expected: <version_number>
+.. code-block:: bash
+   kubectl version
+   # Expected that the kubectl client version matches the server version.
+.. code-block:: bash
+   kubectl api-versions | grep adaptdl
+   # Expected: adaptdl.petuum.com/v1
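+The checks above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names ``missing_tools`` and ``has_adaptdl_api`` are hypothetical, and the second helper expects the text printed by ``kubectl api-versions`` to be passed in rather than running the command itself.

```python
import shutil


def missing_tools(tools=("nnictl", "kubectl")):
    """Return the command-line tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def has_adaptdl_api(api_versions_output: str) -> bool:
    """Check the output of `kubectl api-versions` for the AdaptDL API group."""
    return any(
        line.strip().startswith("adaptdl.petuum.com/")
        for line in api_versions_output.splitlines()
    )


# Example with captured output; in practice, pass in the text printed by
# `kubectl api-versions`.
print(has_adaptdl_api("apps/v1\nadaptdl.petuum.com/v1\nv1"))
print(missing_tools())
```

+An empty list from ``missing_tools`` means both commands are on ``PATH``; it does not check that their versions match the cluster.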
+Run an experiment
+-----------------
+We have a CIFAR10 example that fully leverages the AdaptDL scheduler under ``examples/trials/cifar10_pytorch`` folder. (\ ``main_adl.py`` and ``config_adl.yaml``\ )
+Here is a template configuration specification to use AdaptDL as a training service.
+.. code-block:: yaml
+   authorName: default
+   experimentName: minimal_adl
+   trainingServicePlatform: adl
+   nniManagerIp: 10.1.10.11
+   logCollection: http
+   tuner:
+     builtinTunerName: GridSearch
+   searchSpacePath: search_space.json
+   trialConcurrency: 2
+   maxTrialNum: 2
+   trial:
+     adaptive: false # optional.
+     image: <image_tag>
+     imagePullSecrets:  # optional
+       - name: stagingsecret
+     codeDir: .
+     command: python main.py
+     gpuNum: 1
+     cpuNum: 1  # optional
+     memorySize: 8Gi  # optional
+     nfs: # optional
+       server: 10.20.41.55
+       path: /
+       containerMountPath: /nfs
+     checkpoint: # optional
+       storageClass: dfs
+       storageSize: 1Gi
+Configs not mentioned below follow the
+`default specs defined </Tutorial/ExperimentConfig.rst#configuration-spec>`__  in the NNI doc.
+* **trainingServicePlatform**\ : Choose ``adl`` to use the Kubernetes cluster with AdaptDL scheduler.
+* **nniManagerIp**\ : *Required* to get the correct info and metrics back from the cluster, for ``adl`` training service.
+  IP address of the machine with NNI manager (NNICTL) that launches NNI experiment.
+* **logCollection**\ : *Recommended* to set to ``http``. It collects the trial logs from the cluster back to your machine via HTTP.
+* **tuner**\ : It supports the Tuun tuner and all NNI built-in tuners (except for the checkpoint feature of the NNI PBT tuners).
+* **trial**\ : It defines the specs of an ``adl`` trial.
+  * **namespace**\ : (*Optional*\ ) Kubernetes namespace in which to launch the trials. Defaults to the ``default`` namespace.
+  * **adaptive**\ : (*Optional*\ ) Boolean for AdaptDL trainer. When ``true``\ , the job is preemptible and adaptive.
+  * **image**\ : Docker image for the trial
+  * **imagePullSecrets**\ : (*Optional*\ ) If you are using a private registry,
+    you need to provide the secret to successfully pull the image.
+  * **codeDir**\ : the working directory of the container. ``.`` means the default working directory defined by the image.
+  * **command**\ : the bash command to start the trial
+  * **gpuNum**\ : the number of GPUs requested for this trial. It must be a non-negative integer.
+  * **cpuNum**\ : (*Optional*\ ) the number of CPUs requested for this trial. It must be a non-negative integer.
+  * **memorySize**\ : (*Optional*\ ) the size of memory requested for this trial. It must follow the Kubernetes
+    `default format <https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory>`__.
+  * **nfs**\ : (*Optional*\ ) mounting external storage. For more information about using NFS, see the NFS Storage section below.
+  * **checkpoint**\ : (*Optional*\ ) storage settings for model checkpoints.
+    * **storageClass**\ : check `Kubernetes storage documentation <https://kubernetes.io/docs/concepts/storage/storage-classes/>`__ for how to use the appropriate ``storageClass``.
+    * **storageSize**\ : this value should be large enough to fit your model's checkpoints, or it could cause "disk quota exceeded" error.
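+The constraints listed above can be checked before submitting an experiment. The following is a minimal sketch, not NNI's own validation: the helper ``check_adl_trial`` is hypothetical, it assumes that the keys not marked *Optional* above (``image``\ , ``codeDir``\ , ``command``\ , ``gpuNum``\ ) are required, and it accepts only the common integer-plus-suffix form of Kubernetes memory quantities.

```python
import re

# Kubernetes memory quantities: an integer, optionally followed by a decimal
# suffix (k, M, G, T, P, E) or a binary suffix (Ki, Mi, Gi, Ti, Pi, Ei).
# This sketch deliberately ignores rarer forms such as "129e6" or "0.5Gi".
_MEMORY_RE = re.compile(r"^[0-9]+(k|M|G|T|P|E|Ki|Mi|Gi|Ti|Pi|Ei)?$")


def check_adl_trial(trial: dict) -> list:
    """Return a list of problems found in an ``adl`` trial section."""
    problems = []
    # Assumption: keys not marked optional in the list above are required.
    for key in ("image", "codeDir", "command", "gpuNum"):
        if key not in trial:
            problems.append(f"missing required key: {key}")
    for key in ("gpuNum", "cpuNum"):
        value = trial.get(key)
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if value is not None and (not is_int or value < 0):
            problems.append(f"{key} must be a non-negative integer")
    memory = trial.get("memorySize")
    if memory is not None and not _MEMORY_RE.match(str(memory)):
        problems.append("memorySize is not a recognised Kubernetes quantity")
    return problems


trial = {
    "image": "<image_tag>",
    "codeDir": ".",
    "command": "python main.py",
    "gpuNum": 1,
    "memorySize": "8Gi",
}
print(check_adl_trial(trial))
```

+An empty list means the sketch found nothing to complain about; it says nothing about whether the cluster can actually schedule the request.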
+NFS Storage
+^^^^^^^^^^^
+As you may have noticed in the above configuration spec,
+an *optional* section is available to configure NFS external storage. It can be omitted when no external storage is required, for example when a Docker image already contains the code and data.
+Note that the ``adl`` training service does NOT mount the NFS on the local dev machine; you can mount it locally yourself to manage the filesystem, copy data or code, etc.
+The ``adl`` training service can then mount it in the Kubernetes cluster for every trial, with the proper configuration:
+* **server**\ : NFS server address, e.g. IP address or domain
+* **path**\ : NFS server export path, i.e. the absolute path in NFS that can be mounted to trials
+* **containerMountPath**\ : Absolute path in the container at which the NFS **path** above is mounted,
+  so that every trial has access to the NFS.
+  In the trial containers, you can access the NFS with this path.
+Use cases:
+* If your training trials depend on a large dataset, you may want to download it onto the NFS first,
+  and mount it so that it can be shared across multiple trials.
+* Container storage is ephemeral, and the trial containers are deleted after a trial's lifecycle is over.
+  So if you want to export your trained models,
+  you may mount the NFS to the trial to persist and export them.
+In short, there is no restriction on how a trial reads from or writes to the NFS storage, so you may use it flexibly as per your needs.
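+As an illustration of the second use case, a trial could write its outputs into a per-trial directory under the NFS mount. The sketch below is an assumption-laden example: the helper ``trial_export_dir`` and the ``exports`` subdirectory are hypothetical, and a temporary directory stands in for the configured ``containerMountPath`` so that the snippet also runs outside a trial container.

```python
import os
import tempfile


def trial_export_dir(container_mount_path: str) -> str:
    """Create and return a per-trial directory under the NFS mount."""
    # NNI_TRIAL_JOB_ID is available inside trial containers; fall back to a
    # placeholder so the sketch also runs outside of a trial.
    trial_id = os.getenv("NNI_TRIAL_JOB_ID", "local-debug")
    export_dir = os.path.join(container_mount_path, "exports", trial_id)
    os.makedirs(export_dir, exist_ok=True)
    return export_dir


# In a real trial this would be the configured containerMountPath, e.g. "/nfs".
mount = tempfile.mkdtemp()
print(trial_export_dir(mount))
```

+Files saved under the returned directory stay on the NFS after the trial container is deleted.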
+Monitor via Log Stream
+----------------------
+Follow the log streaming of a certain trial:
+.. code-block:: bash
+   nnictl log trial --trial_id=<trial_id>
+.. code-block:: bash
+   nnictl log trial <experiment_id> --trial_id=<trial_id>
+Note that *after* a trial has finished and its pod has been deleted,
+no logs can be retrieved via this command.
+However, you may still be able to access the past trial logs
+using the following approach.
+Monitor via TensorBoard
+-----------------------
+In the context of NNI, an experiment has multiple trials.
+For easy comparison across trials in a model tuning process,
+we support TensorBoard integration. Each experiment has
+its own TensorBoard logging directory and thus its own dashboard.
+You can only use TensorBoard while the monitored experiment is running.
+In other words, monitoring stopped experiments is not supported.
+In the trial container you may have access to two environment variables:
+* ``ADAPTDL_TENSORBOARD_LOGDIR``\ : the TensorBoard logging directory for the current experiment,
+* ``NNI_TRIAL_JOB_ID``\ : the ``trial`` job id for the current trial.
+It is recommended to join them to form the logging directory for each trial,
+for example in Python:
+.. code-block:: python
+   import os
+   tensorboard_logdir = os.path.join(
+       os.getenv("ADAPTDL_TENSORBOARD_LOGDIR"),
+       os.getenv("NNI_TRIAL_JOB_ID")
+   )
+If an experiment is stopped, the data logged here
+(defined by *the above envs* for monitoring with the following commands)
+will be lost. To persist the logged data, you can use external storage (e.g. a mounted NFS)
+to export it and view the TensorBoard locally.
+With the above setting, you can monitor the experiment easily
+via TensorBoard with
+.. code-block:: bash
+   nnictl tensorboard start
+If multiple experiments are running at the same time, you may use
+.. code-block:: bash
+   nnictl tensorboard start <experiment_id>
+It will provide you with the web URL to access TensorBoard.
+Note that you have the flexibility to set the local ``--port``
+for TensorBoard.
--- a/docs/en_US/TrainingService/DLCMode.rst
+++ b/docs/en_US/TrainingService/DLCMode.rst
+**Run an Experiment on Aliyun PAI-DSW + PAI-DLC**
+===================================================
+NNI supports running an experiment on `PAI-DSW <https://help.aliyun.com/document_detail/194831.html>`__ and submitting trials to `PAI-DLC <https://help.aliyun.com/document_detail/165137.html>`__\ , called dlc mode.
+The PAI-DSW server submits the jobs, while PAI-DLC is where the training jobs run.
+Setup environment
+-----------------
+Step 1. Install NNI by following the install guide `here <../Tutorial/QuickStart.rst>`__.
+Step 2. Create a PAI-DSW server following this `link <https://help.aliyun.com/document_detail/163684.html?section-2cw-lsi-es9#title-ji9-re9-88x>`__. Note that because training runs on PAI-DLC, the PAI-DSW server needs few resources, and a CPU-only server may be enough.
+Step 3. Open PAI-DLC `here <https://pai-dlc.console.aliyun.com/#/guide>`__ and select the same region as your PAI-DSW server. Move to ``dataset configuration`` and mount the same NAS disk as the PAI-DSW server does. (Note that currently only the PAI-DLC public cluster is supported.)
+Step 4. Open your PAI-DSW server command line, then download and install the PAI-DLC Python SDK used to submit DLC tasks; refer to `this link <https://help.aliyun.com/document_detail/203290.html>`__. Skip this step if the SDK is already installed.
+.. code-block:: bash
+   wget https://sdk-portal-cluster-prod.oss-cn-zhangjiakou.aliyuncs.com/downloads/u-3536038a-3de7-4f2e-9379-0cb309d29355-python-pai-dlc.zip
+   unzip u-3536038a-3de7-4f2e-9379-0cb309d29355-python-pai-dlc.zip
+   pip install ./pai-dlc-20201203  # pai-dlc-20201203 is the name of the unzipped SDK; replace it accordingly.
+Run an experiment
+-----------------
+Use ``examples/trials/mnist-pytorch`` as an example. The NNI config YAML file's content is like:
+.. code-block:: yaml
+  # working directory on DSW, please provide the FULL path
+  experimentWorkingDirectory: /home/admin/workspace/{your_working_dir}
+  searchSpaceFile: search_space.json
+  # the command on trial runner(or, DLC container), be aware of data_dir
+  trialCommand: python mnist.py --data_dir /root/data/{your_data_dir}
+  trialConcurrency: 1  # NOTE: please provide number <= 3 due to DLC system limit.
+  maxTrialNumber: 10
+  tuner:
+    name: TPE
+    classArgs:
+      optimize_mode: maximize
+  # ref: https://help.aliyun.com/document_detail/203290.html?spm=a2c4g.11186623.6.727.6f9b5db6bzJh4x
+  trainingService:
+    platform: dlc
+    type: Worker
+    image: registry-vpc.cn-beijing.aliyuncs.com/pai-dlc/pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04
+    jobType: PyTorchJob                             # choices: [TFJob, PyTorchJob]
+    podCount: 1
+    ecsSpec: ecs.c6.large
+    region: cn-hangzhou
+    accessKeyId: ${your_ak_id}
+    accessKeySecret: ${your_ak_key}
+    nasDataSourceId: ${your_nas_data_source_id}     # NAS datasource ID, e.g., datat56by9n1xt0a
+    localStorageMountPoint: /home/admin/workspace/  # default NAS path on DSW
+    containerStorageMountPoint: /root/data/         # default NAS path on DLC container, change it according to your setting
+Note: You should set ``platform: dlc`` in the NNI config YAML file if you want to start the experiment in dlc mode.
+Compared with `LocalMode <LocalMode.rst>`__\ , the training service configuration in dlc mode has additional keys such as ``type/image/jobType/podCount/ecsSpec/region/nasDataSourceId/accessKeyId/accessKeySecret``. For a detailed explanation, refer to this `link <https://help.aliyun.com/document_detail/203111.html#h2-url-3>`__.
+Also, as dlc mode requires DSW/DLC to mount the same NAS disk to share information, there are two extra keys related to this: ``localStorageMountPoint`` and ``containerStorageMountPoint``.
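+Because the same NAS disk appears under different paths on the two sides, a path used on DSW has to be rewritten before it is passed to the trial (for example in ``--data_dir``\ ). The following is a minimal sketch of that translation; the helper ``to_container_path`` is hypothetical, and its defaults simply reuse the two mount points from the example config above.

```python
from pathlib import PurePosixPath


def to_container_path(local_path: str,
                      local_mount: str = "/home/admin/workspace/",
                      container_mount: str = "/root/data/") -> str:
    """Translate a path under the DSW mount into the DLC container path.

    Raises ValueError if local_path is not located under local_mount.
    """
    relative = PurePosixPath(local_path).relative_to(PurePosixPath(local_mount))
    return str(PurePosixPath(container_mount) / relative)


print(to_container_path("/home/admin/workspace/mnist/raw"))
# prints "/root/data/mnist/raw"
```

+Paths outside ``localStorageMountPoint`` are rejected, since they would not be visible inside the DLC container.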
+Run the following commands to start the example experiment:
+.. code-block:: bash
+   git clone -b ${NNI_VERSION} https://github.com/microsoft/nni
+   cd nni/examples/trials/mnist-pytorch
+   # modify config_dlc.yml ...
+   nnictl create --config config_dlc.yml
+Replace ``${NNI_VERSION}`` with a released version name or branch name, e.g., ``v2.3``.
+Monitor your job
+----------------
+To monitor your job on DLC, visit `DLC <https://pai-dlc.console.aliyun.com/#/jobs>`__ to check the job status.
--- a/docs/en_US/TrainingService/DLTSMode.rst
+++ b/docs/en_US/TrainingService/DLTSMode.rst
+**Run an Experiment on DLTS**
+=================================
+NNI supports running an experiment on `DLTS <https://github.com/microsoft/DLWorkspace.git>`__\ , called dlts mode. Before starting to use NNI dlts mode, you should have an account to access the DLTS dashboard.
+Setup Environment
+-----------------
+Step 1. Choose a cluster from the DLTS dashboard. Ask your administrator for the cluster dashboard URL.
+.. image:: ../../img/dlts-step1.png
+   :target: ../../img/dlts-step1.png
+   :alt: Choose Cluster
+Step 2. Prepare an NNI config YAML like the following:
+.. code-block:: yaml
+   # Set this field to "dlts"
+   trainingServicePlatform: dlts
+   authorName: your_name
+   experimentName: auto_mnist
+   trialConcurrency: 2
+   maxExecDuration: 3h
+   maxTrialNum: 100
+   searchSpacePath: search_space.json
+   useAnnotation: false
+   tuner:
+     builtinTunerName: TPE
+     classArgs:
+       optimize_mode: maximize
+   trial:
+     command: python3 mnist.py
+     codeDir: .
+     gpuNum: 1
+     image: msranni/nni
+   # Configuration to access DLTS
+   dltsConfig:
+     dashboard: # Ask administrator for the cluster dashboard URL
+Remember to fill in the cluster dashboard URL on the last line.
+Step 3. Open your working directory on the cluster and paste the NNI config as well as the related code into a directory.
+.. image:: ../../img/dlts-step3.png
+   :target: ../../img/dlts-step3.png
+   :alt: Copy Config
+Step 4. Submit an NNI manager job to the specified cluster.
+.. image:: ../../img/dlts-step4.png
+   :target: ../../img/dlts-step4.png
+   :alt: Submit Job
+Step 5. Go to the Endpoints tab of the newly created job and click the Port 40000 link to check the trial's information.
+.. image:: ../../img/dlts-step5.png
+   :target: ../../img/dlts-step5.png
+   :alt: View NNI WebUI