Unverified commit 7e35d32e authored by Chi Song, committed by GitHub

Remove multiphase from WebUI (#2396)

As required by #2390, multiphase is removed from the WebUI and documentation, and all trials are now shown flat in the WebUI.

1. The major change is to split trial jobs into trials in the WebUI, using the parameter id as part of the trial id.
2. If multiphase is enabled, the trial limit controls the job count only, not the actual number of trials.
3. When multiphase is enabled, the trial status may not be accurate: previous trials in a job are marked as succeeded, and only the last trial reflects the job's status.
4. Multiphase-related documents and UX are removed.

Minor changes:
1. Updated the dev document for WebUI development.
parent 31a247b7
# Multi-phase
## What is multi-phase experiment
Typically each trial job gets a single configuration (e.g., hyperparameters) from the tuner, tries this configuration, reports the result, and then exits. But sometimes a trial job may want to request multiple configurations from the tuner. We find this a very compelling feature. For example:
1. Job launch takes tens of seconds on some training platforms. If a configuration takes only around a minute to finish, running only one configuration in a trial job would be very inefficient. An appealing alternative is for a trial job to request a configuration and finish it, then request another configuration and run it. In the extreme case, a trial job can run an unlimited number of configurations. If you set concurrency to, for example, 6, there would be 6 __long-running__ jobs that keep trying different configurations.
2. Some types of models have to be trained phase by phase, where the configuration of the next phase depends on the results of the previous phase(s). For example, to find the best quantization for a model, the training procedure is often as follows: the auto-quantization algorithm (i.e., the tuner in NNI) chooses a bit width (e.g., 16 bits), a trial job gets this configuration, trains the model for some epochs, and reports the result (e.g., accuracy). The algorithm receives this result and decides whether to change 16 bits to 8 bits, or back to 32 bits. This process is repeated a configured number of times.
The above cases can be supported by the same feature, i.e., multi-phase execution. To support them, a trial job should be able to request multiple configurations from the tuner, and the tuner must be aware of whether two configuration requests come from the same trial job or from different ones. Also, in multi-phase mode a trial job can report multiple final results.
## Create multi-phase experiment
### Write trial code which leverages multi-phase:
__1. Update trial code__
It is pretty simple to use multi-phase in trial code; an example is shown below:
```python
import nni

# ...
for i in range(5):
    # get a parameter from the tuner
    tuner_param = nni.get_next_parameter()
    # nni.get_next_parameter returns None if there are no more hyperparameters to be generated by the tuner.
    if tuner_param is None:
        break
    # consume the parameters
    # ...
    # report the final result somewhere for the parameter retrieved above
    nni.report_final_result()
    # ...
# ...
```
In multi-phase experiments, each time the API `nni.get_next_parameter()` is called, it returns a new hyperparameter generated by the tuner; the trial code then consumes this new hyperparameter and reports its final result. `nni.get_next_parameter()` and `nni.report_final_result()` should be called sequentially: __call the former, then call the latter, and repeat this pattern__. If `nni.get_next_parameter()` is called multiple times consecutively and then `nni.report_final_result()` is called once, the result is associated with the last configuration, i.e., the one retrieved from the last get_next_parameter call. There is therefore no result associated with the previous get_next_parameter calls, which may break some multi-phase algorithms.
Note that `nni.get_next_parameter` returns None if there are no more hyperparameters to be generated by the tuner.
__2. Experiment configuration__
To enable multi-phase, you should also add `multiPhase: true` to your experiment YAML configuration file. If this line is not added, `nni.get_next_parameter()` would always return the same configuration.
Multi-phase experiment configuration example:
```yaml
authorName: default
experimentName: multiphase experiment
trialConcurrency: 2
maxExecDuration: 1h
maxTrialNum: 8
trainingServicePlatform: local
searchSpacePath: search_space.json
multiPhase: true
useAnnotation: false
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
trial:
  command: python3 mytrial.py
  codeDir: .
  gpuNum: 0
```
### Write a tuner that leverages multi-phase:
Before writing a multi-phase tuner, we highly suggest you go through [Customize Tuner](https://nni.readthedocs.io/en/latest/Tuner/CustomizeTuner.html). As when writing a normal tuner, your tuner needs to inherit from the `Tuner` class. When you enable multi-phase through the configuration (set `multiPhase` to true), your tuner will receive an additional parameter `trial_job_id` via the following tuner methods:
```text
generate_parameters
generate_multiple_parameters
receive_trial_result
receive_customized_trial_result
trial_end
```
With this information, the tuner knows which trial is requesting a configuration and which trial is reporting results, which gives it enough flexibility to deal with different trials and different phases. For example, you may want to use the trial_job_id parameter of the generate_parameters method to generate hyperparameters for a specific trial job, as sketched below.
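Below is a minimal sketch of such a tuner, assuming the `Tuner` base class from the NNI SDK; the class name `HistoryAwareTuner`, the per-job history bookkeeping, and the learning-rate strategy are illustrative assumptions, not part of NNI's API.
```python
from nni.tuner import Tuner

class HistoryAwareTuner(Tuner):
    """Sketch: keep per-job history so each phase can depend on the previous ones."""

    def __init__(self):
        self.search_space = {}
        self.history = {}  # trial_job_id -> list of (parameters, result)

    def update_search_space(self, search_space):
        self.search_space = search_space

    def generate_parameters(self, parameter_id, **kwargs):
        # With multiPhase enabled, NNI passes the requesting job's id in kwargs.
        trial_job_id = kwargs.get('trial_job_id')
        previous = self.history.get(trial_job_id, [])
        # Hypothetical strategy: halve the learning rate for each new phase of the same job.
        return {'learning_rate': 0.1 / (2 ** len(previous))}

    def receive_trial_result(self, parameter_id, parameters, value, **kwargs):
        trial_job_id = kwargs.get('trial_job_id')
        self.history.setdefault(trial_job_id, []).append((parameters, value))
```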
### Tuners that support multi-phase experiments:
[TPE](../Tuner/HyperoptTuner.md), [Random](../Tuner/HyperoptTuner.md), [Anneal](../Tuner/HyperoptTuner.md), [Evolution](../Tuner/EvolutionTuner.md), [SMAC](../Tuner/SmacTuner.md), [NetworkMorphism](../Tuner/NetworkmorphismTuner.md), [MetisTuner](../Tuner/MetisTuner.md), [BOHB](../Tuner/BohbAdvisor.md), [Hyperband](../Tuner/HyperbandAdvisor.md).
### Training services that support multi-phase experiments:
[Local Machine](../TrainingService/LocalMode.md), [Remote Servers](../TrainingService/RemoteMachineMode.md), [OpenPAI](../TrainingService/PaiMode.md)
......@@ -206,7 +206,7 @@
* Documentation
- Update the docs structure -Issue #1231
- [Multi phase document improvement](AdvancedFeature/MultiPhase.md) -Issue #1233 -PR #1242
- (deprecated) Multi phase document improvement -Issue #1233 -PR #1242
+ Add configuration example
- [WebUI description improvement](Tutorial/WebUI.md) -PR #1419
......@@ -234,12 +234,10 @@
* Add `enas-mode` and `oneshot-mode` for NAS interface: [PR #1201](https://github.com/microsoft/nni/pull/1201#issue-291094510)
* [Gaussian Process Tuner with Matern kernel](Tuner/GPTuner.md)
* Multiphase experiment support
* (deprecated) Multiphase experiment support
* Added new training service support for multiphase experiment: PAI mode supports multiphase experiment since v0.9.
* Added multiphase capability for the following builtin tuners:
* TPE, Random Search, Anneal, Naïve Evolution, SMAC, Network Morphism, Metis Tuner.
For details, please refer to [Write a tuner that leverages multi-phase](AdvancedFeature/MultiPhase.md)
* Web Portal
* Enable trial comparison in the Web Portal. For details, refer to [View trials status](Tutorial/WebUI.md)
......@@ -549,4 +547,3 @@ Initial release of Neural Network Intelligence (NNI).
* Support CI by providing out-of-box integration with [travis-ci](https://github.com/travis-ci) on ubuntu
* Others
* Support simple GPU job scheduling
......@@ -51,7 +51,7 @@ class CustomizedTuner(Tuner):
...
```
`receive_trial_result` will receive `parameter_id, parameters, value` as its input parameters. Also, the `value` object the tuner receives is exactly the same value that the trial sends. If `multiPhase` is set to `true` in the experiment configuration file, an additional `trial_job_id` parameter is passed to `receive_trial_result` and `generate_parameters` through the `**kwargs` parameter.
`receive_trial_result` will receive `parameter_id, parameters, value` as its input parameters. Also, the `value` object the tuner receives is exactly the same value that the trial sends.
The `your_parameters` value returned from the `generate_parameters` function will be packaged as a JSON object by the NNI SDK. The NNI SDK then unpacks this JSON object, so the trial receives exactly the same `your_parameters` from the tuner.
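A small illustration of this round trip follows; the parameter names (`learning_rate`, `batch_size`) are made-up values for the example, not part of any NNI API.
```python
# Tuner side (inside your Tuner subclass): return a plain, JSON-serializable dict.
#     def generate_parameters(self, parameter_id, **kwargs):
#         return {'learning_rate': 0.01, 'batch_size': 32}

# Trial side: after the NNI SDK packs and unpacks the JSON object,
# the trial receives the very same dict.
import nni

params = nni.get_next_parameter()
print(params)  # e.g. {'learning_rate': 0.01, 'batch_size': 32}
```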
......@@ -109,4 +109,4 @@ More detail example you could see:
### Write a more advanced automl algorithm
The methods above are usually enough to write a general tuner. However, users may also want access to more information, such as intermediate results and trials' state (e.g., the methods in the assessor), in order to build a more powerful AutoML algorithm. Therefore, we have another concept called `advisor`, which directly inherits from `MsgDispatcherBase` in [`src/sdk/pynni/nni/msg_dispatcher_base.py`](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/msg_dispatcher_base.py). Please refer to [here](CustomizeAdvisor.md) for how to write a customized advisor.
\ No newline at end of file
The methods above are usually enough to write a general tuner. However, users may also want access to more information, such as intermediate results and trials' state (e.g., the methods in the assessor), in order to build a more powerful AutoML algorithm. Therefore, we have another concept called `advisor`, which directly inherits from `MsgDispatcherBase` in [`src/sdk/pynni/nni/msg_dispatcher_base.py`](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/msg_dispatcher_base.py). Please refer to [here](CustomizeAdvisor.md) for how to write a customized advisor.
......@@ -17,7 +17,6 @@ This document describes the rules to write the config file, and provides some ex
+ [trainingServicePlatform](#trainingserviceplatform)
+ [searchSpacePath](#searchspacepath)
+ [useAnnotation](#useannotation)
+ [multiPhase](#multiphase)
+ [multiThread](#multithread)
+ [nniManagerIp](#nnimanagerip)
+ [logDir](#logdir)
......@@ -94,8 +93,6 @@ searchSpacePath:
#choice: true, false, default: false
useAnnotation:
#choice: true, false, default: false
multiPhase:
#choice: true, false, default: false
multiThread:
tuner:
#choice: TPE, Random, Anneal, Evolution
......@@ -130,8 +127,6 @@ searchSpacePath:
#choice: true, false, default: false
useAnnotation:
#choice: true, false, default: false
multiPhase:
#choice: true, false, default: false
multiThread:
tuner:
#choice: TPE, Random, Anneal, Evolution
......@@ -171,8 +166,6 @@ trainingServicePlatform:
#choice: true, false, default: false
useAnnotation:
#choice: true, false, default: false
multiPhase:
#choice: true, false, default: false
multiThread:
tuner:
#choice: TPE, Random, Anneal, Evolution
......@@ -283,12 +276,6 @@ Use annotation to analysis trial code and generate search space.
Note: if __useAnnotation__ is true, the searchSpacePath field should be removed.
### multiPhase
Optional. Bool. Default: false.
Enable [multi-phase experiment](../AdvancedFeature/MultiPhase.md).
### multiThread
Optional. Bool. Default: false.
......
......@@ -67,10 +67,8 @@ It doesn't need to redeploy, but the nnictl may need to be restarted.
#### TypeScript
* If `src/nni_manager` will be changed, run `yarn watch` continually under this folder. It will rebuild code instantly.
* If `src/webui` or `src/nasui` is changed, use **step 3** to rebuild code.
The nnictl may need to be restarted.
* If `src/nni_manager` is changed, run `yarn watch` continually under this folder. It will rebuild code instantly. The nnictl may need to be restarted to reload NNI manager.
* If `src/webui` or `src/nasui` are changed, run `yarn start` under the corresponding folder. The web UI will refresh automatically if code is changed.
---
......
......@@ -4,7 +4,6 @@ Advanced Features
.. toctree::
:maxdepth: 2
Enable Multi-phase <AdvancedFeature/MultiPhase>
Write a New Tuner <Tuner/CustomizeTuner>
Write a New Assessor <Assessor/CustomizeAssessor>
Write a New Advisor <Tuner/CustomizeAdvisor>
......
# Multi-phase
## What is a multi-phase experiment
Typically, each trial job gets a single configuration (e.g., hyperparameters) from the tuner, runs with it, reports the result, and then exits. But sometimes a trial job may need to request configurations from the tuner multiple times. This is a very useful feature. For example:
1. On some training platforms it takes tens of seconds to launch a job. If a configuration takes only about a minute to finish, running a single configuration per trial job would be very inefficient. In this case, after finishing one configuration, the same trial job can request and run another. In the extreme case, a trial job can run an unlimited number of configurations. If concurrency is set to, for example, 6, there will be 6 **long-running** jobs that keep trying different configurations.
2. Some types of models must be trained in multiple phases, where the configuration of the next phase depends on the results of the previous phase(s). For example, to find the best quantization of a model, the training procedure is often: the auto-quantization algorithm (i.e., the tuner in NNI) chooses a bit width (e.g., 16 bits), the trial job gets this configuration, trains the model for several epochs, and returns the result (e.g., accuracy). Based on this result, the algorithm decides whether to change 16 bits to 8 bits, or back to 32 bits. This process is repeated several times.
The above cases can be supported by the multi-phase execution feature. To support them, a trial job needs to be able to request multiple configurations from the tuner, and the tuner needs to know whether two configuration requests come from the same trial job or from different ones. Also, in multi-phase mode a trial job can report multiple final results.
## Create a multi-phase experiment
### Write trial code that leverages multi-phase:
**1. Update trial code**
Using multi-phase in trial code is very easy; an example is shown below:
```python
import nni

# ...
for i in range(5):
    # get a parameter from the tuner
    tuner_param = nni.get_next_parameter()
    # nni.get_next_parameter returns None if there are no more hyperparameters to generate.
    if tuner_param is None:
        break
    # consume the parameters
    # ...
    # report the final result
    nni.report_final_result()
    # ...
# ...
```
In a multi-phase experiment, each time the API `nni.get_next_parameter()` is called, it returns a new hyperparameter generated by the tuner; the trial code then uses this new hyperparameter and reports its final result. `nni.get_next_parameter()` and `nni.report_final_result()` need to be called in turn: **call the former, then the latter, and repeat this pattern**. If `nni.get_next_parameter()` is called several times in a row and then `nni.report_final_result()` is called once, the final result is only associated with the last configuration returned by the last get_next_parameter call. The earlier get_next_parameter calls therefore have no associated results, which may break some multi-phase algorithms.
Note that if `nni.get_next_parameter` returns None, the tuner has no more hyperparameters to generate.
**2. Experiment configuration**
To enable multi-phase, add `multiPhase: true` to the experiment's YAML configuration file. Without this setting, `nni.get_next_parameter()` always returns the same configuration.
Example multi-phase experiment configuration:
```yaml
authorName: default
experimentName: multiphase experiment
trialConcurrency: 2
maxExecDuration: 1h
maxTrialNum: 8
trainingServicePlatform: local
searchSpacePath: search_space.json
multiPhase: true
useAnnotation: false
tuner:
  builtinTunerName: TPE
  classArgs:
    optimize_mode: maximize
trial:
  command: python3 mytrial.py
  codeDir: .
  gpuNum: 0
```
### Write a tuner that leverages multi-phase:
We strongly recommend reading [Customize Tuner](https://nni.readthedocs.io/zh/latest/Tuner/CustomizeTuner.html) before implementing a multi-phase tuner. As with a normal tuner, your tuner needs to inherit from the `Tuner` class. When multi-phase is enabled through the configuration (`multiPhase` set to true), the tuner receives an additional parameter `trial_job_id` through the following methods:
```text
generate_parameters
generate_multiple_parameters
receive_trial_result
receive_customized_trial_result
trial_end
```
With this information, the tuner knows which trial is requesting a configuration and whose result is being reported, so it can flexibly handle different trials and their phases. For example, you may use the trial_job_id parameter in the generate_parameters method to generate hyperparameters for a specific trial job.
### Tuners that support multi-phase experiments:
[TPE](../Tuner/HyperoptTuner.md), [Random](../Tuner/HyperoptTuner.md), [Anneal](../Tuner/HyperoptTuner.md), [Evolution](../Tuner/EvolutionTuner.md), [SMAC](../Tuner/SmacTuner.md), [NetworkMorphism](../Tuner/NetworkmorphismTuner.md), [MetisTuner](../Tuner/MetisTuner.md), [BOHB](../Tuner/BohbAdvisor.md), [Hyperband](../Tuner/HyperbandAdvisor.md).
### Training services that support multi-phase experiments:
[Local Machine](../TrainingService/LocalMode.md), [Remote Servers](../TrainingService/RemoteMachineMode.md), [OpenPAI](../TrainingService/PaiMode.md)
\ No newline at end of file
......@@ -7,6 +7,7 @@ import {
import { MANAGER_IP, DRAWEROPTION } from '../../static/const';
import MonacoEditor from 'react-monaco-editor';
import '../../static/style/logDrawer.scss';
import { TrialManager } from '../../static/model/trialmanager';
interface ExpDrawerProps {
isVisble: boolean;
......@@ -37,27 +38,27 @@ class ExperimentDrawer extends React.Component<ExpDrawerProps, ExpDrawerState> {
axios.get(`${MANAGER_IP}/trial-jobs`),
axios.get(`${MANAGER_IP}/metric-data`)
])
.then(axios.spread((res, res1, res2) => {
if (res.status === 200 && res1.status === 200 && res2.status === 200) {
if (res.data.params.searchSpace) {
res.data.params.searchSpace = JSON.parse(res.data.params.searchSpace);
.then(axios.spread((resExperiment, resTrialJobs, resMetricData) => {
if (resExperiment.status === 200 && resTrialJobs.status === 200 && resMetricData.status === 200) {
if (resExperiment.data.params.searchSpace) {
resExperiment.data.params.searchSpace = JSON.parse(resExperiment.data.params.searchSpace);
}
const trialMessagesArr = res1.data;
const interResultList = res2.data;
const trialMessagesArr = TrialManager.expandJobsToTrials(resTrialJobs.data);
const interResultList = resMetricData.data;
Object.keys(trialMessagesArr).map(item => {
// not deal with trial's hyperParameters
const trialId = trialMessagesArr[item].id;
// add intermediate result message
trialMessagesArr[item].intermediate = [];
Object.keys(interResultList).map(key => {
const interId = interResultList[key].trialJobId;
const interId = `${interResultList[key].trialJobId}-${interResultList[key].parameterId}`;
if (trialId === interId) {
trialMessagesArr[item].intermediate.push(interResultList[key]);
}
});
});
const result = {
experimentParameters: res.data,
experimentParameters: resExperiment.data,
trialMessage: trialMessagesArr
};
if (this._isCompareMount === true) {
......
......@@ -77,7 +77,7 @@ class KillJob extends React.Component<KillJobProps, KillJobState> {
onKill = (): void => {
this.setState({ isCalloutVisible: false }, () => {
const { trial } = this.props;
killJob(trial.key, trial.id, trial.status);
killJob(trial.key, trial.jobId, trial.status);
});
}
......@@ -127,4 +127,4 @@ class KillJob extends React.Component<KillJobProps, KillJobState> {
}
}
export default KillJob;
\ No newline at end of file
export default KillJob;
......@@ -3,7 +3,6 @@ import * as copy from 'copy-to-clipboard';
import { Stack, PrimaryButton, Pivot, PivotItem } from 'office-ui-fabric-react';
import { Trial } from '../../static/model/trial';
import { EXPERIMENT, TRIALS } from '../../static/datamodel';
import { MANAGER_IP } from '../../static/const';
import JSONTree from 'react-json-tree';
import PaiTrialLog from '../public-child/PaiTrialLog';
import TrialLog from '../public-child/TrialLog';
......@@ -60,31 +59,12 @@ class OpenRow extends React.Component<OpenRowProps, OpenRowState> {
const { isHidenInfo, typeInfo, info } = this.state;
const trialId = this.props.trialId;
const trial = TRIALS.getTrial(trialId);
const trialLink: string = `${MANAGER_IP}/trial-jobs/${trialId}`;
const logPathRow = trial.info.logPath || 'This trial\'s log path is not available.';
const multiProgress = trial.info.hyperParameters === undefined ? 0 : trial.info.hyperParameters.length;
return (
<Stack className="openRow">
<Stack className="openRowContent">
<Pivot>
<PivotItem headerText="Parameters" key="1" itemIcon="TestParameter">
{
EXPERIMENT.multiPhase
?
<Stack className="link">
{
`
Trails for multiphase experiment will return a set of parameters,
we are listing the latest parameter in webportal.
For the entire parameter set, please refer to the following "
`
}
<a href={trialLink} rel="noopener noreferrer" target="_blank">{trialLink}</a>{`".`}
<div>Current Phase: {multiProgress}.</div>
</Stack>
:
null
}
{
trial.info.hyperParameters !== undefined
?
......
......@@ -9,7 +9,7 @@ import { LineChart, blocked, copy } from '../Buttons/Icon';
import { MANAGER_IP, COLUMNPro } from '../../static/const';
import { convertDuration, formatTimestamp, intermediateGraphOption, parseMetrics } from '../../static/function';
import { EXPERIMENT, TRIALS } from '../../static/datamodel';
import { TableRecord } from '../../static/interface';
import { TableRecord, TrialJobInfo } from '../../static/interface';
import Details from '../overview/Details';
import ChangeColumnComponent from '../Modals/ChangeColumnComponent';
import Compare from '../Modals/Compare';
......@@ -231,18 +231,23 @@ class TableList extends React.Component<TableListProps, TableListState> {
)
};
showIntermediateModal = async (id: string, event: React.SyntheticEvent<EventTarget>): Promise<void> => {
showIntermediateModal = async (record: TrialJobInfo, event: React.SyntheticEvent<EventTarget>): Promise<void> => {
event.preventDefault();
event.stopPropagation();
const res = await axios.get(`${MANAGER_IP}/metric-data/${id}`);
const res = await axios.get(`${MANAGER_IP}/metric-data/${record.jobId}`);
if (res.status === 200) {
const intermediateArr: number[] = [];
// Support dict intermediate results: the last intermediate result of a
// succeeded trial is its final result, which may be a dict.
// Get the keys of the intermediate result dict.
const { intermediateKey } = this.state;
const otherkeys: string[] = [ ];
if (res.data.length !== 0) {
const otherkeys: string[] = [];
// One trial job may contain multiple parameter ids;
// only show the current trial's metric data.
const metricDatas = res.data.filter(item => {
return item.parameterId == record.parameterId;
});
if (metricDatas.length !== 0) {
// just add type=number keys
const intermediateMetrics = parseMetrics(res.data[0].data);
for (const key in intermediateMetrics) {
......@@ -252,9 +257,10 @@ class TableList extends React.Component<TableListProps, TableListState> {
}
}
// intermediateArr just store default val
Object.keys(res.data).map(item => {
if (res.data[item].type === 'PERIODICAL') {
const temp = parseMetrics(res.data[item].data);
metricDatas.map(item => {
if (item.type === 'PERIODICAL') {
const temp = parseMetrics(item.data);
if (typeof temp === 'object') {
intermediateArr.push(temp[intermediateKey]);
} else {
......@@ -262,12 +268,12 @@ class TableList extends React.Component<TableListProps, TableListState> {
}
}
});
const intermediate = intermediateGraphOption(intermediateArr, id);
const intermediate = intermediateGraphOption(intermediateArr, record.id);
this.setState({
intermediateData: res.data, // store origin intermediate data for a trial
intermediateOption: intermediate,
intermediateOtherKeys: otherkeys,
intermediateId: id
intermediateId: record.id
});
}
this.setState({ modalVisible: true });
......@@ -426,8 +432,6 @@ class TableList extends React.Component<TableListProps, TableListState> {
// when user click [Add Column] need to use the function
private initTableColumnList = (columnList: string[]): IColumn[] => {
// const { columnList } = this.props;
// [supportCustomizedTrial: true]
const supportCustomizedTrial = (EXPERIMENT.multiPhase === true) ? false : true;
const disabledAddCustomizedTrial = ['DONE', 'ERROR', 'STOPPED'].includes(EXPERIMENT.status);
const showColumn: IColumn[] = [];
for (const item of columnList) {
......@@ -479,7 +483,7 @@ class TableList extends React.Component<TableListProps, TableListState> {
<PrimaryButton
className="detail-button-operation"
title="Intermediate"
onClick={this.showIntermediateModal.bind(this, record.id)}
onClick={this.showIntermediateModal.bind(this, record)}
>
{LineChart}
</PrimaryButton>
......@@ -494,20 +498,14 @@ class TableList extends React.Component<TableListProps, TableListState> {
<KillJob trial={record} />
}
{/* Add a new trial-customized trial */}
{
supportCustomizedTrial
?
<PrimaryButton
className="detail-button-operation"
title="Customized trial"
onClick={this.setCustomizedTrial.bind(this, record.id)}
disabled={disabledAddCustomizedTrial}
>
{copy}
</PrimaryButton>
:
null
}
<PrimaryButton
className="detail-button-operation"
title="Customized trial"
onClick={this.setCustomizedTrial.bind(this, record.id)}
disabled={disabledAddCustomizedTrial}
>
{copy}
</PrimaryButton>
</Stack>
);
},
......@@ -659,4 +657,4 @@ class TableList extends React.Component<TableListProps, TableListState> {
}
}
export default TableList;
\ No newline at end of file
export default TableList;
......@@ -2,7 +2,10 @@
const METRIC_GROUP_UPDATE_THRESHOLD = 100;
const METRIC_GROUP_UPDATE_SIZE = 20;
const MANAGER_IP = `/api/v1/nni`;
let MANAGER_IP = `/api/v1/nni`;
if (process.env.NODE_ENV == "development") {
MANAGER_IP = `//${window.location.hostname}:8080` + MANAGER_IP;
}
const DOWNLOAD_IP = `/logs`;
const WEBUIDOC = 'https://nni.readthedocs.io/en/latest/Tutorial/WebUI.html';
const trialJobStatus = [
......@@ -34,8 +37,8 @@ const OPERATION = 'Operation';
const COLUMN = ['Trial No.', 'ID', 'Duration', 'Status', 'Default', OPERATION];
// all choice column !dictory final
const COLUMNPro = ['Trial No.', 'ID', 'Start Time', 'End Time', 'Duration', 'Status',
'Intermediate result', 'Default', OPERATION];
const CONCURRENCYTOOLTIP = 'Trial concurrency is the number of trials running concurrently.';
'Intermediate result', 'Default', OPERATION];
const CONCURRENCYTOOLTIP = 'Trial concurrency is the number of trials running concurrently.';
export {
MANAGER_IP, DOWNLOAD_IP, trialJobStatus, COLUMNPro, WEBUIDOC,
......
......@@ -18,6 +18,8 @@ interface TableRecord {
startTime: number;
endTime?: number;
id: string;
jobId: string;
parameterId: string;
duration: number;
status: string;
intermediateCount: number;
......@@ -99,6 +101,7 @@ interface Intermedia {
interface MetricDataRecord {
timestamp: number;
trialJobId: string;
trialId: string;
parameterId: string;
type: string;
sequence: number;
......@@ -107,6 +110,8 @@ interface MetricDataRecord {
interface TrialJobInfo {
id: string;
jobId: string;
parameterId: string;
sequenceId: number;
status: string;
startTime?: number;
......@@ -126,7 +131,6 @@ interface ExperimentParams {
maxTrialNum: number;
searchSpace: string;
trainingServicePlatform: string;
multiPhase?: boolean;
multiThread?: boolean;
versionCheck?: boolean;
logCollection?: string;
......@@ -189,4 +193,4 @@ export {
AccurPoint, DetailAccurPoint, TooltipForIntermediate, TooltipForAccuracy,
Dimobj, ParaObj, Intermedia, MetricDataRecord, TrialJobInfo, ExperimentParams,
ExperimentProfile, NNIManagerStatus, EventMap
};
\ No newline at end of file
};
......@@ -66,10 +66,6 @@ class Experiment {
return !!(this.profile.params.logCollection && this.profile.params.logCollection !== 'none');
}
get multiPhase(): boolean {
return !!(this.profile.params.multiPhase);
}
get status(): string {
if (!this.statusField) {
throw Error('Experiment status not initialized');
......
......@@ -4,7 +4,7 @@ import { getFinal, formatAccuracy, metricAccuracy, parseMetrics, isArrayType } f
class Trial implements TableObj {
private metricsInitialized: boolean = false;
private infoField: TrialJobInfo | undefined;
private intermediates: (MetricDataRecord | undefined)[] = [ ];
private intermediates: (MetricDataRecord | undefined)[] = [];
public final: MetricDataRecord | undefined;
private finalAcc: number | undefined;
......@@ -29,7 +29,7 @@ class Trial implements TableObj {
}
get intermediateMetrics(): MetricDataRecord[] {
const ret: MetricDataRecord[] = [ ];
const ret: MetricDataRecord[] = [];
for (let i = 0; i < this.intermediates.length; i++) {
if (this.intermediates[i]) {
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
......@@ -80,6 +80,8 @@ class Trial implements TableObj {
key: this.info.id,
sequenceId: this.info.sequenceId,
id: this.info.id,
jobId: this.info.jobId,
parameterId: this.info.parameterId,
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
startTime: this.info.startTime!,
endTime: this.info.endTime,
......@@ -122,8 +124,8 @@ class Trial implements TableObj {
get description(): Parameters {
const ret: Parameters = {
parameters: { },
intermediate: [ ],
parameters: {},
intermediate: [],
multiProgress: 1
};
const tempHyper = this.info.hyperParameters;
......@@ -142,7 +144,7 @@ class Trial implements TableObj {
ret.logPath = this.info.logPath;
}
const mediate: number[] = [ ];
const mediate: number[] = [];
for (const items of this.intermediateMetrics) {
if (typeof parseMetrics(items.data) === 'object') {
mediate.push(parseMetrics(items.data).default);
......
......@@ -6,13 +6,29 @@ import { Trial } from './trial';
function groupMetricsByTrial(metrics: MetricDataRecord[]): Map<string, MetricDataRecord[]> {
const ret = new Map<string, MetricDataRecord[]>();
for (const metric of metrics) {
if (ret.has(metric.trialJobId)) {
const trialId = `${metric.trialJobId}-${metric.parameterId}`;
metric.trialId = trialId;
if (ret.has(trialId)) {
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
ret.get(metric.trialJobId)!.push(metric);
ret.get(trialId)!.push(metric);
} else {
ret.set(metric.trialJobId, [ metric ]);
ret.set(trialId, [metric]);
}
}
// To be compatible with multiple trials in the same job, fix the sequence-number offset.
ret.forEach((trialMetrics) => {
let minSequenceNumber = Number.POSITIVE_INFINITY;
trialMetrics.map((item) => {
if (item.sequence < minSequenceNumber && item.type !== "FINAL") {
minSequenceNumber = item.sequence;
}
});
trialMetrics.map((item) => {
if (item.type !== "FINAL") {
item.sequence -= minSequenceNumber;
}
});
});
return ret;
}
......@@ -31,7 +47,7 @@ class TrialManager {
}
public async update(lastTime?: boolean): Promise<boolean> {
const [ infoUpdated, metricUpdated ] = await Promise.all([ this.updateInfo(), this.updateMetrics(lastTime) ]);
const [infoUpdated, metricUpdated] = await Promise.all([this.updateInfo(), this.updateMetrics(lastTime)]);
return infoUpdated || metricUpdated;
}
......@@ -71,14 +87,14 @@ class TrialManager {
public countStatus(): Map<string, number> {
const cnt = new Map<string, number>([
[ 'UNKNOWN', 0 ],
[ 'WAITING', 0 ],
[ 'RUNNING', 0 ],
[ 'SUCCEEDED', 0 ],
[ 'FAILED', 0 ],
[ 'USER_CANCELED', 0 ],
[ 'SYS_CANCELED', 0 ],
[ 'EARLY_STOPPED', 0 ],
['UNKNOWN', 0],
['WAITING', 0],
['RUNNING', 0],
['SUCCEEDED', 0],
['FAILED', 0],
['USER_CANCELED', 0],
['SYS_CANCELED', 0],
['EARLY_STOPPED', 0],
]);
for (const trial of this.trials.values()) {
if (trial.initialized()) {
......@@ -89,19 +105,71 @@ class TrialManager {
return cnt;
}
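// Expand each trial job into one entry per parameter id so the WebUI can list
// trials flat: the trial id becomes `${jobId}-${parameterId}`; every trial except
// the last of a job is marked SUCCEEDED, and the last one inherits the job's status.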
public static expandJobsToTrials(jobs: TrialJobInfo[]): TrialJobInfo[] {
const trials: TrialJobInfo[] = [];
for (const jobInfo of jobs as TrialJobInfo[]) {
if (jobInfo.hyperParameters) {
let trial: TrialJobInfo | undefined;
let lastTrial: TrialJobInfo | undefined;
for (let i = 0; i < jobInfo.hyperParameters.length; i++) {
const hyperParameters = jobInfo.hyperParameters[i]
const hpObject = JSON.parse(hyperParameters);
const parameterId = hpObject["parameter_id"];
trial = {
id: `${jobInfo.id}-${parameterId}`,
jobId: jobInfo.id,
parameterId: parameterId,
sequenceId: parameterId,
status: "SUCCEEDED",
startTime: jobInfo.startTime,
endTime: jobInfo.startTime,
hyperParameters: [hyperParameters],
logPath: jobInfo.logPath,
stderrPath: jobInfo.stderrPath,
};
if (jobInfo.finalMetricData) {
for (const metricData of jobInfo.finalMetricData) {
if (metricData.parameterId == parameterId) {
trial.finalMetricData = [metricData];
trial.endTime = metricData.timestamp;
break;
}
}
}
if (lastTrial) {
trial.startTime = lastTrial.endTime;
} else {
trial.startTime = jobInfo.startTime;
}
lastTrial = trial;
trials.push(trial);
}
if (lastTrial !== undefined) {
lastTrial.status = jobInfo.status;
lastTrial.endTime = jobInfo.endTime;
}
} else {
trials.push(jobInfo);
}
}
return trials;
}
private async updateInfo(): Promise<boolean> {
const response = await axios.get(`${MANAGER_IP}/trial-jobs`);
let updated = false;
if (response.status === 200) {
for (const info of response.data as TrialJobInfo[]) {
if (this.trials.has(info.id)) {
const newTrials = TrialManager.expandJobsToTrials(response.data);
for (const trialInfo of newTrials as TrialJobInfo[]) {
if (this.trials.has(trialInfo.id)) {
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
updated = this.trials.get(info.id)!.updateTrialJobInfo(info) || updated;
updated = this.trials.get(trialInfo.id)!.updateTrialJobInfo(trialInfo) || updated;
} else {
this.trials.set(info.id, new Trial(info, undefined));
this.trials.set(trialInfo.id, new Trial(trialInfo, undefined));
updated = true;
}
this.maxSequenceId = Math.max(this.maxSequenceId, info.sequenceId);
this.maxSequenceId = Math.max(this.maxSequenceId, trialInfo.sequenceId);
}
this.infoInitialized = true;
}
......@@ -146,7 +214,7 @@ class TrialManager {
private doUpdateMetrics(allMetrics: MetricDataRecord[], latestOnly: boolean): boolean {
let updated = false;
for (const [ trialId, metrics ] of groupMetricsByTrial(allMetrics).entries()) {
for (const [trialId, metrics] of groupMetricsByTrial(allMetrics).entries()) {
if (this.trials.has(trialId)) {
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
const trial = this.trials.get(trialId)!;
......