"...resnet50_tensorflow.git" did not exist on "bd64315eb909fda4387b2a8e9e713bbbe184f6b9"
Commit 102faea1 authored by ShufanHuang, committed by QuanluZhang

Add curve fitting assessor (#481)

* Add curve fitting assessor

* Update HowToChooseTuner.md

* Update HowToChooseTuner.md

* Update HowToChooseTuner.md

* Update README.md

* Update README.md

* Update README.md

* Update HowToChooseTuner.md

* Update HowToChooseTuner.md

* Update HowToChooseTuner.md

* Update HowToChooseTuner.md

* Update curvefitting_assessor.py

* Update config_schema.py

* Add some comments and modifications

* Remove unnecessary .json file

* Remove unnecessary .lock file

* Revert "Remove unnecessary .lock file"

This reverts commit cdfaacb29114b3dee9c797d3e9b46ee18d7d34cc.

* Revert "Revert "Remove unnecessary .lock file""

This reverts commit 7182a5fb31a02b01684429eabb3347952bf7ce2a.

* Revert "Revert "Revert "Remove unnecessary .lock file"""

This reverts commit 0f010e2b508e9f7b34c809647ba09e4e132876d8.

* Revert "Remove unnecessary .json file"

This reverts commit c6f7b47c199dd0db7ccb850d4f2ac1fd97b0caf8.

* Revert "Add some comments and modifications"

This reverts commit f78f055df9a4eec5b433a9241ce93d8ba78e3500.

* Add some modifications by comments

* support minimize mode

* Update README.md

* Update README.md

* Update modelfactory.py

* minor changes and fix typo

* minor changes

* update README.md
parent c17ea4d1
......@@ -206,7 +206,7 @@ _Usage_:
For now, NNI supports the following assessor algorithms.
- [Medianstop](#Medianstop)
- Curve Extrapolation (ongoing)
- [Curvefitting](#Curvefitting)
## Supported Assessor Algorithms
......@@ -230,6 +230,36 @@ _Usage_:
start_step: 5
```
<a name="Curvefitting"></a>
**Curvefitting**
Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the prediction of the final epoch's performance is worse than the best final performance in the trial history. In this algorithm, we use 12 curves to fit the accuracy curve; this large set of parametric curve models is chosen from the [reference paper][9]. The learning curves' shapes coincide with our prior knowledge about the form of learning curves: they are typically increasing, saturating functions.
_Suggested scenario_: It is applicable to a wide range of performance curves and can therefore be used in various scenarios to speed up the tuning process. Even better, it is able to handle and assess curves with similar performance.
_Usage_:
```yaml
assessor:
builtinAssessorName: Curvefitting
classArgs:
# (required) The total number of epochs.
# We need to know the number of epochs to determine which point to predict.
epoch_num: 20
# (optional) choice: maximize, minimize
# Note that if you choose minimize mode, please adjust the threshold to a value >= 1.0 (e.g. threshold=1.1).
# The default value of optimize_mode is maximize.
optimize_mode: maximize
# (optional) Controls when a trial may first be stopped.
# To save computing resources, we only start to predict after receiving start_step (default 6) intermediate results.
# The default value of start_step is 6.
start_step: 6
# (optional) The threshold used to decide whether to early stop a worse-performing curve.
# For example: if threshold = 0.95, optimize_mode = maximize and the best performance in the history is 0.9,
# then we will stop any trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
# The default value of threshold is 0.95.
threshold: 0.95
```
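For reference, the assessor consumes the intermediate results each trial reports per epoch. Below is a minimal sketch of a matching trial loop; `build_model`, `train_one_epoch` and `evaluate` are hypothetical placeholders:

```python
import nni

def run_trial(params, epoch_num=20):
    model = build_model(params)        # hypothetical model builder
    for epoch in range(epoch_num):
        train_one_epoch(model)         # hypothetical training step
        acc = evaluate(model)          # hypothetical validation accuracy
        # Each reported value becomes one point of the learning curve
        # that the Curvefitting assessor extrapolates.
        nni.report_intermediate_result(acc)
    nni.report_final_result(acc)
```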
[1]: https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf
[2]: http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
[3]: https://arxiv.org/pdf/1703.01041.pdf
......@@ -237,4 +267,5 @@ _Usage_:
[5]: https://github.com/automl/SMAC3
[6]: https://arxiv.org/pdf/1603.06560.pdf
[7]: https://arxiv.org/abs/1806.10282
[8]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf
\ No newline at end of file
[8]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf
[9]: http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf
......@@ -16,7 +16,7 @@ tuner:
#choice: maximize, minimize
optimize_mode: maximize
assessor:
#choice: Medianstop
#choice: Medianstop, Curvefitting
builtinAssessorName: Medianstop
classArgs:
#choice: maximize, minimize
......
......@@ -128,7 +128,7 @@ export namespace ValidationSchemas {
checkpointDir: joi.string()
}),
assessor: joi.object({
builtinAssessorName: joi.string().valid('Medianstop'),
builtinAssessorName: joi.string().valid('Medianstop', 'Curvefitting'),
codeDir: joi.string(),
classFileName: joi.string(),
className: joi.string(),
......
......@@ -27,7 +27,8 @@ ModuleName = {
'BatchTuner': 'nni.batch_tuner.batch_tuner',
'Medianstop': 'nni.medianstop_assessor.medianstop_assessor',
'GridSearch': 'nni.gridsearch_tuner.gridsearch_tuner',
'NetworkMorphism': 'nni.networkmorphism_tuner.networkmorphism_tuner'
'NetworkMorphism': 'nni.networkmorphism_tuner.networkmorphism_tuner',
'Curvefitting': 'nni.curvefitting_assessor.curvefitting_assessor'
}
ClassName = {
......@@ -40,7 +41,8 @@ ClassName = {
'GridSearch': 'GridSearchTuner',
'NetworkMorphism':'NetworkMorphismTuner',
'Medianstop': 'MedianstopAssessor'
'Medianstop': 'MedianstopAssessor',
'Curvefitting': 'CurvefittingAssessor'
}
ClassArgs = {
......
Curve Fitting Assessor on NNI
===
## 1. Introduction
Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the prediction of the final epoch's performance is worse than the best final performance in the trial history.
In this algorithm, we use 12 curves to fit the learning curve; this large set of parametric curve models is chosen from the [reference paper][1]. The learning curves' shapes coincide with our prior knowledge about the form of learning curves: they are typically increasing, saturating functions.
<p align="center">
<img src="./learning_curve.PNG" alt="drawing"/>
</p>
We combine all learning curve models into a single, more powerful model. This combined model is given by a weighted linear combination:
<p align="center">
<img src="./f_comb.gif" alt="drawing"/>
</p>
where the new combined parameter vector is
<p align="center">
<img src="./expression_xi.gif" alt="drawing"/>
</p>
Assuming additive Gaussian noise, the noise parameter is initialized to its maximum likelihood estimate.
We determine the maximum-probability value of the new combined parameter vector by learning from the historical data, use that value to predict future trial performance, and stop inadequate experiments to save computing resources.
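To make this concrete, here is a minimal sketch of the weighted combination; the function and variable names are illustrative, and the actual implementation is `f_comb` in `model_factory.py`:

```python
def combined_model(x, fitted_curves, weights):
    # f_comb(x | xi) = sum_k w_k * f_k(x | theta_k),
    # where each fitted curve already has its theta_k baked in
    # and the weights w_k sum to 1.
    return sum(w * f(x) for f, w in zip(fitted_curves, weights))
```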
Concretely, this algorithm goes through three stages: learning, predicting and assessing.
* Step 1: Learning. We learn from the trial history of the current trial and determine \xi from a Bayesian perspective. First, we fit each curve using the least squares method (implemented by `fit_theta`) to save time. After obtaining the parameters, we filter the curves and remove outliers (implemented by `filter_curve`). Finally, we use MCMC sampling (implemented by `mcmc_sampling`) to adjust the weight of each curve. At this point, all the parameters in \xi are determined.
* Step 2: Predicting. We calculate the expected final accuracy (implemented by `f_comb`) at the target position (i.e. the total number of epochs), using \xi and the formula of the combined model.
* Step 3: Assessing. If the fitting result does not converge, the predicted value will be `None`; in this case we return `AssessResult.Good` to wait for more accuracy information and predict again. Otherwise, `predict()` returns a positive value: if this value is strictly greater than the best final performance in the history multiplied by `THRESHOLD` (default 0.95), we return `AssessResult.Good`; otherwise we return `AssessResult.Bad`. The sketch after this list shows the decision rule.
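A minimal sketch of this decision rule in maximize mode, simplified from the assessor code below (`predicted` stands for the output of `predict()`, `best_history` for the best completed final performance):

```python
from nni.assessor import AssessResult

def assess(predicted, best_history, threshold=0.95):
    if predicted is None:
        # Fitting has not converged: wait for more intermediate results.
        return AssessResult.Good
    # Keep the trial only if its extrapolated final performance beats
    # a fraction (threshold) of the best completed performance so far.
    if predicted > best_history * threshold:
        return AssessResult.Good
    return AssessResult.Bad
```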
The figure below shows the result of our algorithm on MNIST trial history data, where the green points represent the data seen by the assessor, the blue points represent the future but unknown data, and the red line is the curve predicted by the Curve Fitting Assessor.
<p align="center">
<img src="./example_of_curve_fitting.PNG" alt="drawing"/>
</p>
## 2. Usage
To use Curve Fitting Assessor, you should add the following spec in your experiment's yaml config file:
```yaml
assessor:
builtinAssessorName: Curvefitting
classArgs:
# (required) The total number of epochs.
# We need to know the number of epochs to determine which point to predict.
epoch_num: 20
# (optional) choice: maximize, minimize
# Note that if you choose minimize mode, please adjust the threshold to a value >= 1.0 (e.g. threshold=1.1).
# The default value of optimize_mode is maximize.
optimize_mode: maximize
# (optional) Controls when a trial may first be stopped.
# To save computing resources, we only start to predict after receiving start_step (default 6) intermediate results.
# The default value of start_step is 6.
start_step: 6
# (optional) The threshold used to decide whether to early stop a worse-performing curve.
# For example: if threshold = 0.95, optimize_mode = maximize and the best performance in the history is 0.9,
# then we will stop any trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
# The default value of threshold is 0.95.
threshold: 0.95
```
## 3. File Structure
The assessor contains a number of files, functions and classes. Here we give a brief introduction to the main files:
* `curvefunctions.py` includes all the function expression and default parameters.
* `modelfactory.py` includes learning and predicting, the corresponding calculation part is also implemented here.
* `curvefitting_assessor.py` is the assessor that receives the trial history and assesses whether to early stop the trial.
## 4. TODO
* Further improve the accuracy of the prediction and test it on more models.
[1]: http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
import logging
from nni.assessor import Assessor, AssessResult
from .model_factory import CurveModel
logger = logging.getLogger('curvefitting_Assessor')
class CurvefittingAssessor(Assessor):
'''
CurvefittingAssessor uses a learning curve fitting algorithm to predict future learning curve performance.
It stops a pending trial X at step S if the trial's forecast result at the target step converges and is lower than the
best performance in the history.
'''
def __init__(self, epoch_num=20, optimize_mode='maximize', start_step=6, threshold=0.95):
if start_step <= 0:
logger.warning('It\'s recommended to set start_step to a positive number')
# Record the target position we predict
self.target_pos = epoch_num
# Record the optimize_mode
if optimize_mode == 'maximize':
self.higher_better = True
elif optimize_mode == 'minimize':
self.higher_better = False
else:
self.higher_better = True
logger.warning('unrecognized optimize_mode: %s', optimize_mode)
# Start forecasting when historical data reaches start step
self.start_step = start_step
# Record the compared threshold
self.threshold = threshold
# Record the best performance
self.set_best_performance = False
self.completed_best_performance = None
self.trial_history = []
logger.info('Successfully initialized the curvefitting assessor')
def trial_end(self, trial_job_id, success):
'''
trial end: update the best performance of completed trial job
'''
if success:
if self.set_best_performance:
self.completed_best_performance = max(self.completed_best_performance, self.trial_history[-1])
else:
self.set_best_performance = True
self.completed_best_performance = self.trial_history[-1]
logger.info('Updated completed best performance, trial job id: %s', trial_job_id)
else:
logger.info('No need to update, trial job id: %s', trial_job_id)
def assess_trial(self, trial_job_id, trial_history):
'''
assess whether a trial should be early stopped by the curve fitting algorithm
return AssessResult.Good or AssessResult.Bad
'''
self.trial_history = trial_history
curr_step = len(trial_history)
if curr_step < self.start_step:
return AssessResult.Good
if not self.set_best_performance:
return AssessResult.Good
try:
curvemodel = CurveModel(self.target_pos)
predict_y = curvemodel.predict(trial_history)
logger.info('Prediction done. Trial job id = %s. Predicted value = %s', trial_job_id, predict_y)
if predict_y is None:
logger.info('wait for more information to predict precisely')
return AssessResult.Good
standard_performance = self.completed_best_performance * self.threshold
if self.higher_better:
if predict_y > standard_performance:
return AssessResult.Good
return AssessResult.Bad
else:
if predict_y < standard_performance:
return AssessResult.Good
return AssessResult.Bad
except Exception as exception:
logger.exception('unrecognized exception in curvefitting_assessor: %s', exception)
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
import numpy as np
all_models = {}
model_para = {}
model_para_num = {}
curve_combination_models = ['vap', 'pow3', 'linear', 'logx_linear', 'dr_hill_zero_background', 'log_power', 'pow4', 'mmf',
'exp4', 'ilog2', 'weibull', 'janoschek']
def vap(x, a, b, c):
''' Vapor pressure model '''
return np.exp(a+b/x+c*np.log(x))
all_models['vap'] = vap
model_para['vap'] = [-0.622028, -0.470050, 0.042322]
model_para_num['vap'] = 3
def pow3(x, c, a, alpha):
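''' Power law model (pow3) '''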
return c - a * x**(-alpha)
all_models['pow3'] = pow3
model_para['pow3'] = [0.84, 0.52, 0.01]
model_para_num['pow3'] = 3
def linear(x, a, b):
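''' Linear model '''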
return a*x + b
all_models['linear'] = linear
model_para['linear'] = [1., 0]
model_para_num['linear'] = 2
def logx_linear(x, a, b):
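''' Log-linear model: a * log(x) + b '''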
x = np.log(x)
return a*x + b
all_models['logx_linear'] = logx_linear
model_para['logx_linear'] = [0.378106, 0.046506]
model_para_num['logx_linear'] = 2
def dr_hill_zero_background(x, theta, eta, kappa):
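''' Hill model with zero background '''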
return (theta* x**eta) / (kappa**eta + x**eta)
all_models['dr_hill_zero_background'] = dr_hill_zero_background
model_para['dr_hill_zero_background'] = [0.772320, 0.586449, 2.460843]
model_para_num['dr_hill_zero_background'] = 3
def log_power(x, a, b, c):
#logistic power
return a/(1.+(x/np.exp(b))**c)
all_models['log_power'] = log_power
model_para['log_power'] = [0.77, 2.98, -0.51]
model_para_num['log_power'] = 3
def pow4(x, alpha, a, b, c):
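''' Power law model with four parameters (pow4) '''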
return c - (a*x+b)**-alpha
all_models['pow4'] = pow4
model_para['pow4'] = [0.1, 200, 0., 0.8]
model_para_num['pow4'] = 4
def mmf(x, alpha, beta, kappa, delta):
'''
Morgan-Mercer-Flodin
http://www.pisces-conservation.com/growthhelp/index.html?morgan_mercer_floden.htm
'''
return alpha - (alpha - beta) / (1. + (kappa * x)**delta)
all_models['mmf'] = mmf
model_para['mmf'] = [0.7, 0.1, 0.01, 5]
model_para_num['mmf'] = 4
def exp4(x, c, a, b, alpha):
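''' Exponential model with four parameters (exp4) '''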
return c - np.exp(-a*(x**alpha)+b)
all_models['exp4'] = exp4
model_para['exp4'] = [0.7, 0.8, -0.8, 0.3]
model_para_num['exp4'] = 4
def ilog2(x, c, a):
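''' Inverse log model (ilog2) '''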
return c - a / np.log(x)
all_models['ilog2'] = ilog2
model_para['ilog2'] = [0.78, 0.43]
model_para_num['ilog2'] = 2
def weibull(x, alpha, beta, kappa, delta):
'''
Weibull model
http://www.pisces-conservation.com/growthhelp/index.html?morgan_mercer_floden.htm
'''
return alpha - (alpha - beta) * np.exp(-(kappa * x)**delta)
all_models['weibull'] = weibull
model_para['weibull'] = [0.7, 0.1, 0.01, 1]
model_para_num['weibull'] = 4
def janoschek(x, a, beta, k, delta):
'''http://www.pisces-conservation.com/growthhelp/janoschek.htm'''
return a - (a - beta) * np.exp(-k*x**delta)
all_models['janoschek'] = janoschek
model_para['janoschek'] = [0.73, 0.07, 0.355, 0.46]
model_para_num['janoschek'] = 4
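# A minimal sketch of fitting one of these models to a partial learning curve
# with scipy.optimize.curve_fit, mirroring what fit_theta in model_factory.py
# does for all 12 models. The accuracy values below are hypothetical.
if __name__ == '__main__':
    from scipy import optimize
    xs = list(range(1, 8))
    ys = [0.50, 0.62, 0.70, 0.74, 0.76, 0.77, 0.78]
    (c, a, alpha), _ = optimize.curve_fit(pow3, xs, ys, maxfev=1000000)
    # extrapolate the fitted power-law curve to epoch 20
    print('predicted accuracy at epoch 20:', pow3(20, c, a, alpha))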
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
import logging
import numpy as np
from scipy import optimize
from .curvefunctions import *
# Number of curve functions we prepared, more details can be found in "curvefunctions.py"
NUM_OF_FUNCTIONS = 12
# Maximum number of iterations when fitting the curve optimal parameters
MAXFEV = 1000000
# Number of simulation iterations when we do MCMC sampling
NUM_OF_SIMULATION_TIME = 20
# Number of samples we select when we do MCMC sampling
NUM_OF_INSTANCE = 10
# The step size of each noise when we do MCMC sampling
STEP_SIZE = 0.0005
# Minimum number of fitted functions; if fewer functions pass the filter, we will ask for more information
LEAST_FITTED_FUNCTION = 4
logger = logging.getLogger('curvefitting_Assessor')
class CurveModel(object):
def __init__(self, target_pos):
self.target_pos = target_pos
self.trial_history = []
self.point_num = 0
self.effective_model = []
self.effective_model_num = 0
self.weight_samples = []
def fit_theta(self):
'''use least squares to fit each default curve's parameters separately'''
x = range(1, self.point_num + 1)
y = self.trial_history
for i in range(NUM_OF_FUNCTIONS):
model = curve_combination_models[i]
try:
if model_para_num[model] == 2:
a, b = optimize.curve_fit(all_models[model], x, y, maxfev=MAXFEV)[0]
model_para[model][0] = a
model_para[model][1] = b
elif model_para_num[model] == 3:
a, b, c = optimize.curve_fit(all_models[model], x, y, maxfev=MAXFEV)[0]
model_para[model][0] = a
model_para[model][1] = b
model_para[model][2] = c
elif model_para_num[model] == 4:
a, b, c, d = optimize.curve_fit(all_models[model], x, y, maxfev=MAXFEV)[0]
model_para[model][0] = a
model_para[model][1] = b
model_para[model][2] = c
model_para[model][3] = d
except (RuntimeError, FloatingPointError, OverflowError, ZeroDivisionError):
# Ignore exceptions caused by numerical calculations
pass
except Exception as exception:
logger.critical("Exceptions in fit_theta:", exception)
def filter_curve(self):
'''filter out the poorly performing curves'''
avg = np.sum(self.trial_history) / self.point_num
standard = avg * avg * self.point_num
predict_data = []
tmp_model = []
for i in range(NUM_OF_FUNCTIONS):
var = 0
model = curve_combination_models[i]
for j in range(1, self.point_num + 1):
y = self.predict_y(model, j)
var += (y - self.trial_history[j - 1]) * (y - self.trial_history[j - 1])
if var < standard:
predict_data.append(y)
tmp_model.append(curve_combination_models[i])
median = np.median(predict_data)
std = np.std(predict_data)
for model in tmp_model:
y = self.predict_y(model, self.target_pos)
if y < median + 3 * std and y > median - 3 * std:
self.effective_model.append(model)
self.effective_model_num = len(self.effective_model)
logger.info('List of effective models: %s', self.effective_model)
def predict_y(self, model, pos):
'''return the predicted y of 'model' when epoch = pos'''
if model_para_num[model] == 2:
y = all_models[model](pos, model_para[model][0], model_para[model][1])
elif model_para_num[model] == 3:
y = all_models[model](pos, model_para[model][0], model_para[model][1], model_para[model][2])
elif model_para_num[model] == 4:
y = all_models[model](pos, model_para[model][0], model_para[model][1], model_para[model][2], model_para[model][3])
return y
def f_comb(self, pos, sample):
'''return the value of the f_comb when epoch = pos'''
ret = 0
for i in range(self.effective_model_num):
model = self.effective_model[i]
y = self.predict_y(model, pos)
ret += sample[i] * y
return ret
def normalize_weight(self, samples):
'''normalize the weights so that each sample sums to 1'''
for i in range(NUM_OF_INSTANCE):
total = 0
for j in range(self.effective_model_num):
total += samples[i][j]
for j in range(self.effective_model_num):
samples[i][j] /= total
return samples
def sigma_sq(self, sample):
'''returns the value of sigma square, given the weight's sample'''
ret = 0
for i in range(1, self.point_num + 1):
temp = self.trial_history[i - 1] - self.f_comb(i, sample)
ret += temp * temp
return 1.0 * ret / self.point_num
def normal_distribution(self, pos, sample):
'''returns the value of normal distribution, given the weight's sample and target position'''
curr_sigma_sq = self.sigma_sq(sample)
delta = self.trial_history[pos - 1] - self.f_comb(pos, sample)
return np.exp(np.square(delta) / (-2.0 * curr_sigma_sq)) / np.sqrt(2 * np.pi * np.sqrt(curr_sigma_sq))
def likelihood(self, samples):
'''likelihood of the observed history under each weight sample'''
ret = np.ones(NUM_OF_INSTANCE)
for i in range(NUM_OF_INSTANCE):
for j in range(1, self.point_num + 1):
ret[i] *= self.normal_distribution(j, samples[i])
return ret
def prior(self, samples):
'''prior distribution'''
ret = np.ones(NUM_OF_INSTANCE)
for i in range(NUM_OF_INSTANCE):
for j in range(self.effective_model_num):
if not samples[i][j] > 0:
ret[i] = 0
if self.f_comb(1, samples[i]) >= self.f_comb(self.target_pos, samples[i]):
ret[i] = 0
return ret
def target_distribution(self, samples):
'''posterior probability'''
curr_likelihood = self.likelihood(samples)
curr_prior = self.prior(samples)
ret = np.ones(NUM_OF_INSTANCE)
for i in range(NUM_OF_INSTANCE):
ret[i] = curr_likelihood[i] * curr_prior[i]
return ret
def mcmc_sampling(self):
'''
Adjust the weight of each function using MCMC sampling.
The initial value of each weight is evenly distributed.
Brief introduction:
(1) Definition of sample:
Sample is a (1 * NUM_OF_FUNCTIONS) matrix, representing {w1, w2, ..., wk}
(2) Definition of samples:
Samples is a collection of samples; it's a (NUM_OF_INSTANCE * NUM_OF_FUNCTIONS) matrix,
representing {{w11, w12, ..., w1k}, {w21, w22, ..., w2k}, ..., {wk1, wk2, ..., wkk}}
(3) Definition of model:
Model is the function we have chosen, such as 'vap' or 'weibull'.
(4) Definition of pos:
Pos is the position we want to predict, corresponding to the epoch number.
'''
init_weight = np.ones((self.effective_model_num), dtype=np.float) / self.effective_model_num
self.weight_samples = np.broadcast_to(init_weight, (NUM_OF_INSTANCE, self.effective_model_num))
for i in range(NUM_OF_SIMULATION_TIME):
# sample new value from Q(i, j)
new_values = np.random.randn(NUM_OF_INSTANCE, self.effective_model_num) * STEP_SIZE + self.weight_samples
new_values = self.normalize_weight(new_values)
# compute alpha(i, j) = min{1, P(j)Q(j, i)/P(i)Q(i, j)}
alpha = np.minimum(1, self.target_distribution(new_values) / self.target_distribution(self.weight_samples))
# sample u
u = np.random.rand(NUM_OF_INSTANCE)
# new value
change_value_flag = (u < alpha).astype(np.int)
for j in range(NUM_OF_INSTANCE):
new_values[j] = self.weight_samples[j] * (1 - change_value_flag[j]) + new_values[j] * change_value_flag[j]
self.weight_samples = new_values
def predict(self, trial_history):
'''predict the value at the target position'''
self.trial_history = trial_history
self.point_num = len(trial_history)
self.fit_theta()
self.filter_curve()
if self.effective_model_num < LEAST_FITTED_FUNCTION:
# the curves' predictions are too scattered; ask for more information
return None
self.mcmc_sampling()
ret = 0
for i in range(NUM_OF_INSTANCE):
ret += self.f_comb(self.target_pos, self.weight_samples[i])
return ret / NUM_OF_INSTANCE
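# A minimal usage sketch of CurveModel; the accuracy history is hypothetical.
if __name__ == '__main__':
    model = CurveModel(target_pos=20)
    history = [0.50, 0.62, 0.70, 0.74, 0.76, 0.77, 0.78]
    # predict() returns the extrapolated accuracy at epoch 20, or None if
    # fewer than LEAST_FITTED_FUNCTION curves survive filtering.
    print(model.predict(history))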
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
import unittest
import numpy as np
from nni.assessor import AssessResult
from .curvefitting_assessor import CurvefittingAssessor
from .model_factory import CurveModel
class TestCurveFittingAssessor(unittest.TestCase):
def test_init(self):
new_assessor = CurvefittingAssessor(20)
self.assertEquals(new_assessor.start_step, 6)
self.assertEquals(new_assessor.target_pos, 20)
self.assertEquals(new_assessor.completed_best_performance, None)
def test_insufficient_point(self):
new_assessor = CurvefittingAssessor(20)
ret = new_assessor.assess_trial(1, [1])
self.assertEquals(ret, AssessResult.Good)
def test_not_converged(self):
new_assessor = CurvefittingAssessor(20)
with self.assertRaises(TypeError):
ret = new_assessor.assess_trial([1, 199, 0, 199, 1, 209, 2])
ret = new_assessor.assess_trial(1, [1, 199, 0, 199, 1, 209, 2])
self.assertEquals(ret, AssessResult.Good)
models = CurveModel(21)
self.assertEquals(models.predict([1, 199, 0, 199, 1, 209, 2]), None)
def test_curve_model(self):
test_model = CurveModel(21)
test_model.effective_model = ['vap', 'pow3', 'linear', 'logx_linear', 'dr_hill_zero_background', 'log_power', 'pow4', 'mmf', 'exp4', 'ilog2', 'weibull', 'janoschek']
test_model.effective_model_num = 12
test_model.point_num = 9
test_model.target_pos = 20
test_model.trial_history = ([1, 1, 1, 1, 1, 1, 1, 1, 1])
test_model.weight_samples = np.ones((test_model.effective_model_num), dtype=np.float) / test_model.effective_model_num
self.assertAlmostEquals(test_model.predict_y('vap', 9), 0.5591906328335763)
self.assertAlmostEquals(test_model.predict_y('logx_linear', 15), 1.0704360293379522)
self.assertAlmostEquals(test_model.f_comb(9, test_model.weight_samples), 1.1543379521172443)
self.assertAlmostEquals(test_model.f_comb(15, test_model.weight_samples), 1.6949395581692737)
if __name__ == '__main__':
unittest.main()
......@@ -82,6 +82,15 @@ Optional('assessor'): Or({
Optional('start_step'): And(int, lambda x: 0 <= x <= 9999)
},
Optional('gpuNum'): And(int, lambda x: 0 <= x <= 99999)
},{
'builtinAssessorName': lambda x: x in ['Curvefitting'],
Optional('classArgs'): {
'epoch_num': And(int, lambda x: 0 <= x <= 9999),
Optional('optimize_mode'): Or('maximize', 'minimize'),
Optional('start_step'): And(int, lambda x: 0 <= x <= 9999),
Optional('threshold'): And(float, lambda x: 0.0 <= x <= 9999.0)
},
Optional('gpuNum'): And(int, lambda x: 0 <= x <= 99999)
},{
'codeDir': os.path.exists,
'classFileName': str,
......