Commit fcbb0ea3 authored by Shufan Huang's avatar Shufan Huang Committed by chicm-ms

Handling import data in Tuner/Advisor (#992)

Handling import data in Tuner/Advisor
parent 26c195e9
@@ -298,24 +298,6 @@ Debug mode will disable version check function in Trialkeeper.
nnictl trial [trial_id] --experiment [experiment_id]
```
* __nnictl trial export__
* Description
You can use this command to export reward & hyper-parameter of trial jobs to a csv file.
* Usage
```bash
nnictl trial export [OPTIONS]
```
* Options
|Name, shorthand|Required|Default|Description|
|------|------|------ |------|
|id| False| |ID of the experiment |
|--file| True| |File path of the output csv file |
<a name="top"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `nnictl top`
@@ -388,6 +370,92 @@ Debug mode will disable version check function in Trialkeeper.
nnictl experiment list
```
<a name="export"></a>
* __nnictl experiment export__
* Description
You can use this command to export the reward & hyper-parameters of trial jobs to a csv or json file.
* Usage
```bash
nnictl experiment export [OPTIONS]
```
* Options
|Name, shorthand|Required|Default|Description|
|------|------|------ |------|
|id| False| |ID of the experiment |
|--file| True| |File path of the output file |
|--type| True| |Type of output file, only support "csv" and "json"|
* Examples
> export all trial data of an experiment in json format
```bash
nnictl experiment export [experiment_id] --file [file_path] --type json
```
* __nnictl experiment import__
* Description
You can use this command to import prior or supplementary trial hyperparameters & results for NNI hyperparameter tuning. The data are fed to the tuning algorithm (i.e., the tuner or advisor).
* Usage
```bash
nnictl experiment import [OPTIONS]
```
* Options
|Name, shorthand|Required|Default|Description|
|------|------|------|------|
|id| False| |The id of the experiment you want to import data into|
|--file, -f| True| |Path of the json file containing the data to import|
* Details
NNI allows users to import their own data; it must be expressed in the correct format. An example is shown below:
```json
[
{"parameter": {"x": 0.5, "y": 0.9}, "value": 0.03},
{"parameter": {"x": 0.4, "y": 0.8}, "value": 0.05},
{"parameter": {"x": 0.3, "y": 0.7}, "value": 0.04}
]
```
Every element in the top-level list is a sample. For our built-in tuners/advisors, each sample should have at least two keys: `parameter` and `value`. The `parameter` must match this experiment's search space: all the keys (or hyperparameters) in `parameter` must match the keys in the search space; otherwise, the tuner/advisor may behave unpredictably. `value` should follow the same rule as the input of `nni.report_final_result`: either a number or a dict with a key named `default`. For your customized tuner/advisor, the file could have any json content, depending on how you implement the corresponding methods (e.g., `import_data`); a minimal sketch is given after the examples below.
You can also use [nnictl experiment export](#export) to export a valid json file containing trial hyperparameters and results from a previous experiment.
Currently, the following tuners and advisors support importing data:
```yml
builtinTunerName: TPE, Anneal, GridSearch, MetisTuner
builtinAdvisorName: BOHB
```
*If you want to import data to the BOHB advisor, you are suggested to add "TRIAL_BUDGET" in the parameter as NNI does; otherwise, BOHB will use max_budget as "TRIAL_BUDGET". Here is an example:*
```json
[
{"parameter": {"x": 0.5, "y": 0.9, "TRIAL_BUDGET": 27}, "value": 0.03}
]
```
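For a customized tuner or advisor, a minimal `import_data` might simply replay each imported sample through its normal update path. The sketch below is illustrative only: the class name is hypothetical and the other required `Tuner` methods are omitted.
```python
from nni.tuner import Tuner

class MyCustomTuner(Tuner):
    def import_data(self, data):
        # 'data' is the parsed json list: [{"parameter": ..., "value": ...}, ...]
        for i, trial_info in enumerate(data):
            params = trial_info["parameter"]
            value = trial_info["value"]
            # Feed each prior sample back through the tuner's usual update path.
            self.receive_trial_result("import_%d" % i, params, value)
```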
* Examples
> import data to a running experiment
```bash
nnictl experiment import [experiment_id] -f experiment_data.json
```
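> a typical workflow: export trial data from one experiment, then import it into a new one (the file name is illustrative)
```bash
nnictl experiment export [experiment_id] --file data.json --type json
nnictl experiment import [new_experiment_id] -f data.json
```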
<a name="config"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `nnictl config show`
@@ -94,4 +94,7 @@ class BatchTuner(Tuner):
return self.values[self.count]
def receive_trial_result(self, parameter_id, parameters, value):
pass
\ No newline at end of file
pass
def import_data(self, data):
pass
@@ -573,3 +573,35 @@ class BOHB(MsgDispatcherBase):
def handle_add_customized_trial(self, data):
pass
def handle_import_data(self, data):
"""Import additional data for tuning
Parameters
----------
data:
a list of dictionaries, each of which has at least two keys, 'parameter' and 'value'
Raises
------
AssertionError
data doesn't have the required keys 'parameter' and 'value'
"""
_completed_num = 0
for trial_info in data:
logger.info("Importing data, current processing progress %s / %s" %(_completed_num), len(data))
_completed_num += 1
assert "parameter" in trial_info
_params = trial_info["parameter"]
assert "value" in trial_info
_value = trial_info['value']
# Default "TRIAL_BUDGET" to the maximum budget when the imported sample does not carry one.
if _KEY not in _params:
_params[_KEY] = self.max_budget
logger.info("Set \"TRIAL_BUDGET\" value to %s (max budget)" % self.max_budget)
# The config generator minimizes loss, so negate the value when maximizing.
if self.optimize_mode is OptimizeMode.Maximize:
reward = -_value
else:
reward = _value
_budget = _params[_KEY]
self.cg.new_result(loss=reward, budget=_budget, parameters=_params, update_model=True)
logger.info("Successfully imported tuning data to BOHB advisor.")
@@ -34,7 +34,6 @@ from nni.tuner import Tuner
from nni.utils import extract_scalar_reward
from .. import parameter_expressions
@unique
class OptimizeMode(Enum):
"""Optimize Mode class
@@ -299,3 +298,6 @@ class EvolutionTuner(Tuner):
indiv = Individual(config=params, result=reward)
self.population.append(indiv)
def import_data(self, data):
pass
@@ -24,14 +24,17 @@ gridsearch_tuner.py including:
import copy
import numpy as np
import logging
import nni
from nni.tuner import Tuner
from nni.utils import convert_dict2tuple
TYPE = '_type'
CHOICE = 'choice'
VALUE = '_value'
logger = logging.getLogger('grid_search_AutoML')
class GridSearchTuner(Tuner):
'''
@@ -51,6 +54,7 @@ class GridSearchTuner(Tuner):
def __init__(self):
self.count = -1
self.expanded_search_space = []
self.supplement_data = dict()
def json2paramater(self, ss_spec):
'''
@@ -135,9 +139,31 @@ class GridSearchTuner(Tuner):
def generate_parameters(self, parameter_id):
self.count += 1
if self.count > len(self.expanded_search_space)-1:
raise nni.NoMoreTrialError('no more parameters now.')
return self.expanded_search_space[self.count]
while (self.count <= len(self.expanded_search_space)-1):
_params_tuple = convert_dict2tuple(self.expanded_search_space[self.count])
if _params_tuple in self.supplement_data:
self.count += 1
else:
return self.expanded_search_space[self.count]
raise nni.NoMoreTrialError('no more parameters now.')
def receive_trial_result(self, parameter_id, parameters, value):
pass
def import_data(self, data):
"""Import additional data for tuning
Parameters
----------
data:
a list of dictionaries, each of which has at least two keys, 'parameter' and 'value'
"""
_completed_num = 0
for trial_info in data:
logger.info("Importing data, current processing progress %s / %s" %(_completed_num), len(data))
_completed_num += 1
assert "parameter" in trial_info
_params = trial_info["parameter"]
_params_tuple = convert_dict2tuple(_params)
self.supplement_data[_params_tuple] = True
logger.info("Successfully import data to grid search tuner.")
@@ -419,3 +419,6 @@ class Hyperband(MsgDispatcherBase):
def handle_add_customized_trial(self, data):
pass
def handle_import_data(self, data):
pass
@@ -172,6 +172,7 @@ class HyperoptTuner(Tuner):
self.json = None
self.total_data = {}
self.rval = None
self.supplement_data_num = 0
def _choose_tuner(self, algorithm_name):
"""
@@ -353,3 +354,27 @@ class HyperoptTuner(Tuner):
# remove '_index' from json2parameter and save params-id
total_params = json2parameter(self.json, parameter)
return total_params
def import_data(self, data):
"""Import additional data for tuning
Parameters
----------
data:
a list of dictionaries, each of which has at least two keys, 'parameter' and 'value'
"""
_completed_num = 0
for trial_info in data:
logger.info("Importing data, current processing progress %s / %s" %(_completed_num), len(data))
_completed_num += 1
# Random search keeps no history, so imported data is simply ignored.
if self.algorithm_name == 'random_search':
return
assert "parameter" in trial_info
_params = trial_info["parameter"]
assert "value" in trial_info
_value = trial_info['value']
self.supplement_data_num += 1
_parameter_id = '_'.join(["ImportData", str(self.supplement_data_num)])
self.total_data[_parameter_id] = _params
self.receive_trial_result(parameter_id=_parameter_id, parameters=_params, value=_value)
logger.info("Successfully import data to TPE/Anneal tuner.")
@@ -96,7 +96,7 @@ class MetisTuner(Tuner):
self.samples_x = []
self.samples_y = []
self.samples_y_aggregation = []
self.history_parameters = []
self.total_data = []
self.space = None
self.no_resampling = no_resampling
self.no_candidates = no_candidates
@@ -107,6 +107,7 @@ class MetisTuner(Tuner):
self.exploration_probability = exploration_probability
self.minimize_constraints_fun = None
self.minimize_starting_points = None
self.supplement_data_num = 0
def update_search_space(self, search_space):
@@ -392,15 +393,35 @@
# ===== STEP 7: If current optimal hyperparameter occurs in the history or exploration probability is less than the threshold, take next config as exploration step =====
outputs = self._pack_output(lm_current['hyperparameter'])
ap = random.uniform(0, 1)
if outputs in self.history_parameters or ap<=self.exploration_probability:
if outputs in self.total_data or ap<=self.exploration_probability:
if next_candidate is not None:
outputs = self._pack_output(next_candidate['hyperparameter'])
else:
random_parameter = _rand_init(x_bounds, x_types, 1)[0]
outputs = self._pack_output(random_parameter)
self.history_parameters.append(outputs)
self.total_data.append(outputs)
return outputs
def import_data(self, data):
"""Import additional data for tuning
Parameters
----------
data:
a list of dictionaries, each of which has at least two keys, 'parameter' and 'value'
"""
_completed_num = 0
for trial_info in data:
logger.info("Importing data, current processing progress %s / %s" %(_completed_num), len(data))
_completed_num += 1
assert "parameter" in trial_info
_params = trial_info["parameter"]
assert "value" in trial_info
_value = trial_info['value']
self.supplement_data_num += 1
_parameter_id = '_'.join(["ImportData", str(self.supplement_data_num)])
self.total_data.append(_params)
self.receive_trial_result(parameter_id=_parameter_id, parameters=_params, value=_value)
logger.info("Successfully import data to metis tuner.")
def _rand_with_constraints(x_bounds, x_types):
outputs = None
@@ -154,6 +154,9 @@ class MultiPhaseMsgDispatcher(MsgDispatcherBase):
self.tuner.trial_end(json_tricks.loads(data['hyper_params'])['parameter_id'], data['event'] == 'SUCCEEDED', trial_job_id)
return True
def handle_import_data(self, data):
pass
def _handle_intermediate_metric_data(self, data):
if data['type'] != 'PERIODICAL':
return True
@@ -95,3 +95,6 @@ class MultiPhaseTuner(Recoverable):
def _on_error(self):
pass
def import_data(self, data):
pass
@@ -307,3 +307,7 @@ class NetworkMorphismTuner(Tuner):
if item["model_id"] == model_id:
return item["metric_value"]
return None
def import_data(self, data):
pass
@@ -24,7 +24,7 @@ class Recoverable:
def load_checkpoint(self):
pass
def save_checkpont(self):
def save_checkpoint(self):
pass
def get_checkpoint_path(self):
@@ -261,3 +261,6 @@ class SMACTuner(Tuner):
params.append(self.convert_loguniform_categorical(challenger.get_dictionary()))
cnt += 1
return params
def import_data(self, data):
pass
@@ -24,6 +24,8 @@ from .env_vars import dispatcher_env_vars
def extract_scalar_reward(value, scalar_key='default'):
"""
Extract scalar reward from trial result.
Raises
------
RuntimeError
@@ -35,9 +37,20 @@ def extract_scalar_reward(value, scalar_key='default'):
elif isinstance(value, dict) and scalar_key in value and isinstance(value[scalar_key], (float, int)):
reward = value[scalar_key]
else:
raise RuntimeError('Incorrect final result: the final result for %s should be float/int, or a dict which has a key named "default" whose value is float/int.' % str(self.__class__))
raise RuntimeError('Incorrect final result: the final result should be float/int, or a dict which has a key named "default" whose value is float/int.')
return reward
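# Usage sketch (values illustrative, based on the branches above):
#   extract_scalar_reward(0.9)                          -> 0.9
#   extract_scalar_reward({"default": 0.9, "acc": 0.8}) -> 0.9
#   extract_scalar_reward("bad")                        -> raises RuntimeError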
def convert_dict2tuple(value):
"""
Recursively convert a dict into a tuple of sorted (key, value) pairs so it becomes hashable. Note: the input dict is modified in place.
"""
if isinstance(value, dict):
for _keys in value:
value[_keys] = convert_dict2tuple(value[_keys])
return tuple(sorted(value.items()))
else:
return value
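# Usage sketch (illustrative): nested dicts become sorted, hashable tuples.
#   convert_dict2tuple({"y": {"z": 2}, "x": 1})
#   -> (("x", 1), ("y", (("z", 2),)))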
def init_dispatcher_logger():
""" Initialize dispatcher logging configuration"""
logger_file_path = 'dispatcher.log'
......