Typically, each trial job gets a single configuration (e.g., a set of hyperparameters) from the tuner, tries this configuration, reports its result, and then exits. But sometimes a trial job may want to request multiple configurations from the tuner. We find this a very compelling feature. For example:
1. Job launch takes tens of seconds on some training platforms. If a configuration takes only around a minute to finish, running only one configuration per trial job would be very inefficient. An appealing alternative is for a trial job to request a configuration, finish it, then request another configuration and run that one. In the extreme case, a trial job can run an unlimited number of configurations. If you set the concurrency to, for example, 6, there would be 6 __long-running__ jobs continuously trying different configurations.
2. Some types of models have to be trained phase by phase, where the configuration of the next phase depends on the results of the previous phase(s). For example, to find the best quantization for a model, the training procedure often goes as follows: the auto-quantization algorithm (i.e., the tuner in NNI) chooses a bit width (e.g., 16 bits), a trial job gets this configuration, trains the model for some epochs, and reports a result (e.g., accuracy). The algorithm receives this result and decides whether to change 16 bits to 8 bits, or to change back to 32 bits. This process is repeated a configured number of times.
Both cases can be supported by the same feature: multi-phase execution. To support them, a trial job must be able to request multiple configurations from the tuner, and the tuner must be aware of whether two configuration requests come from the same trial job or from different ones. In multi-phase mode, a trial job can also report multiple final results. A sketch of case 2 is shown below.
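For instance, case 2 could look roughly like the following sketch in trial code. The helper `train_with_bits` and the parameter name `bits` are illustrative assumptions, not part of NNI's API:

```python
import nni

def train_with_bits(bits):
    """Stand-in for real quantization-aware training; returns a fake accuracy."""
    return 1.0 - 1.0 / bits

while True:
    # Ask the tuner (the auto-quantization algorithm) for the bit width of
    # the next phase; None means the tuner has decided to stop.
    param = nni.get_next_parameter()
    if param is None:
        break
    accuracy = train_with_bits(param['bits'])
    nni.report_final_result(accuracy)
```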
## Create a multi-phase experiment
### Write trial code which leverages multi-phase:
__1. Update trial code__
Using multi-phase in trial code is pretty simple; an example is shown below:
```python
import nni

# ...
for i in range(5):
    # get a parameter from the tuner
    tuner_param = nni.get_next_parameter()
    # nni.get_next_parameter returns None if the tuner cannot generate
    # any more hyperparameters.
    if tuner_param is None:
        break
    # consume the params
    # ...
    # report the final result for the parameter retrieved above
    nni.report_final_result(0.93)  # replace 0.93 with your actual metric
    # ...
# ...
```
In multi-phase experiments, each time the API `nni.get_next_parameter()` is called, it returns a new hyperparameter generated by the tuner; the trial code then consumes this hyperparameter and reports a final result for it. `nni.get_next_parameter()` and `nni.report_final_result()` should be called sequentially: __call the former, then call the latter, and repeat this pattern__. If `nni.get_next_parameter()` is called multiple times consecutively and `nni.report_final_result()` is then called once, the result is associated only with the last configuration, i.e., the one retrieved by the last `get_next_parameter` call. No results are associated with the earlier `get_next_parameter` calls, which may break some multi-phase algorithms.
Note that `nni.get_next_parameter` returns `None` if the tuner cannot generate any more hyperparameters.
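To make the constraint concrete, the following hedged sketch contrasts the correct call pattern with the problematic one described above (the metric values are placeholders):

```python
import nni

# Correct pattern: request, consume, report, repeat.
param = nni.get_next_parameter()
# ... train with `param` ...
nni.report_final_result(0.9)  # result is associated with `param`

# Problematic pattern: two consecutive requests, then one report.
param_a = nni.get_next_parameter()
param_b = nni.get_next_parameter()
nni.report_final_result(0.9)  # associated only with param_b; param_a never gets a result
```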
__2. Experiment configuration__
To enable multi-phase, you should also add `multiPhase: true` to your experiment YAML configuration file. If this line is not added, `nni.get_next_parameter()` would always return the same configuration.
Multi-phase experiment configuration example:
```yaml
authorName: default
experimentName: multiphase experiment
trialConcurrency: 2
maxExecDuration: 1h
maxTrialNum: 8
trainingServicePlatform: local
searchSpacePath: search_space.json
multiPhase: true
useAnnotation: false
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
trial:
  command: python3 mytrial.py
  codeDir: .
  gpuNum: 0
```
### Write a tuner that leverages multi-phase:
Before writing a multi-phase tuner, we highly suggest you go through [Customize Tuner](https://nni.readthedocs.io/en/latest/Tuner/CustomizeTuner.html). As with writing a normal tuner, your tuner needs to inherit from the `Tuner` class. When you enable multi-phase through the configuration (set `multiPhase` to true), your tuner will receive an additional parameter, `trial_job_id`, via the following tuner methods:
```text
generate_parameters
generate_multiple_parameters
receive_trial_result
receive_customized_trial_result
trial_end
```
With this information, the tuner knows which trial is requesting a configuration and which trial is reporting results. This provides enough flexibility for your tuner to deal with different trials and different phases. For example, you may want to use the `trial_job_id` parameter of `generate_parameters` to generate hyperparameters for a specific trial job. A minimal sketch is shown below.
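Below is a minimal sketch of such a tuner. The bit-width logic mirrors the quantization example above and is purely illustrative; only the method signatures and the `trial_job_id` keyword argument come from the documentation above:

```python
import random

from nni.tuner import Tuner  # base class described in Customize Tuner


class MultiPhaseSketchTuner(Tuner):
    """Illustrative multi-phase tuner; the bit-width logic is a toy example."""

    def __init__(self, optimize_mode='maximize'):
        self.optimize_mode = optimize_mode
        # Per-trial history: trial_job_id -> list of (parameters, result).
        self.history = {}

    def update_search_space(self, search_space):
        self.search_space = search_space

    def generate_parameters(self, parameter_id, **kwargs):
        # trial_job_id is passed via **kwargs when multiPhase is enabled.
        trial_job_id = kwargs.get('trial_job_id')
        previous = self.history.get(trial_job_id, [])
        if not previous:
            return {'bits': 16}  # first phase for this trial job
        # A real algorithm would inspect the previous results here; this toy
        # version just moves to a neighboring bit width at random.
        last_bits = previous[-1][0]['bits']
        return {'bits': random.choice([max(last_bits // 2, 8), min(last_bits * 2, 32)])}

    def receive_trial_result(self, parameter_id, parameters, value, **kwargs):
        trial_job_id = kwargs.get('trial_job_id')
        self.history.setdefault(trial_job_id, []).append((parameters, value))

    def trial_end(self, parameter_id, success, **kwargs):
        # trial_job_id is also available here via kwargs.get('trial_job_id').
        pass
```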
## Setup FrameworkController
Follow the [guideline](https://github.com/Microsoft/frameworkcontroller/tree/master/example/run) to set up FrameworkController in the Kubernetes cluster; NNI supports FrameworkController in stateful set mode. If your cluster enforces authorization, you need to create a service account with the required permissions for FrameworkController, and then pass the name of the FrameworkController service account to the NNI experiment config ([reference](https://github.com/Microsoft/frameworkcontroller/tree/master/example/run#run-by-kubernetes-statefulset)).
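For example, the service account name might be passed like this in the experiment YAML (a hedged sketch; check the FrameworkController training service reference for the exact field placement):

```yaml
frameworkcontrollerConfig:
  serviceAccountName: frameworkcontroller  # the service account created above
  # ... storage and other FrameworkController settings go here
```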
## Design
...
...
Random search is suggested when each trial does not take very long (e.g., each trial can be completed very quickly or early stopped by the assessor) and you have enough computational resources. It's also useful if you want to uniformly explore the search space. Random search can be considered a baseline search algorithm. [Detailed Description](./HyperoptTuner.md)
**classArgs Requirements:**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will try to maximize metrics. If 'minimize', the tuner will try to minimize metrics.
`receive_trial_result` receives `parameter_id`, `parameters`, and `value` as input. The `value` object the tuner receives is exactly the same value that the trial sends. If `multiPhase` is set to `true` in the experiment configuration file, an additional `trial_job_id` parameter is passed to `receive_trial_result` and `generate_parameters` through the `**kwargs` parameter.
The `your_parameters` returned from the `generate_parameters` function will be packaged as a JSON object by the NNI SDK. The SDK will then unpack this JSON object, so the trial receives exactly the same `your_parameters` from the tuner.
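As a hedged illustration (the parameter names are made up), a tuner method might return a plain dict, and the trial reads the same dict back:

```python
# Sketch of a method on a Tuner subclass; parameter names are illustrative.
def generate_parameters(self, parameter_id, **kwargs):
    # This dict is serialized to JSON by the NNI SDK and delivered unchanged
    # to the trial via nni.get_next_parameter().
    return {'learning_rate': 0.01, 'batch_size': 32}
```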
...
...
### Write a more advanced automl algorithm
The methods above are usually enough to write a general tuner. However, users may also want access to more information, such as intermediate results and trials' state (e.g., the methods in the assessor), in order to build a more powerful automl algorithm. Therefore, we have another concept called `advisor`, which directly inherits from `MsgDispatcherBase` in [`src/sdk/pynni/nni/msg_dispatcher_base.py`](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/msg_dispatcher_base.py). Please refer to [here](CustomizeAdvisor.md) for how to write a customized advisor.
#### TypeScript
* If `src/nni_manager` is changed, run `yarn watch` continually under this folder. It will rebuild code instantly. The nnictl may need to be restarted to reload NNI manager.
* If `src/webui` or `src/nasui` are changed, run `yarn start` under the corresponding folder. The web UI will refresh automatically if code is changed.