Unverified Commit e75a9f5a authored by Chi Song's avatar Chi Song Committed by GitHub

remove optimize_mode from curve fitting (#2471)

others

1. fix failed curve fitting UTs, due to code changes.
1. move all SDK UTs to tests folder, so that they can be run in default tests.
1. fix some deprecated ut assert function calls.
parent 131fb2c1
......@@ -55,13 +55,15 @@ assessor:
It's applicable to a wide range of performance curves, so it can be used in various scenarios to speed up the tuning process. Even better, it's able to handle and assess curves with similar performance. [Detailed Description](./CurvefittingAssessor.md)
**Note**, according to the original paper, only incremental functions are supported. Therefore this assessor can only be used to maximize optimization metrics. For example, it can be used for accuracy, but not for loss.
**classArgs requirements:**
* **epoch_num** (*int, **required***) - The total number of epochs. We need to know the number of epochs to determine which points we need to predict.
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', assessor will **stop** the trial with smaller expectation. If 'minimize', assessor will **stop** the trial with larger expectation.
* **start_step** (*int, optional, default = 6*) - A trial is determined to be stopped or not only after receiving start_step number of reported intermediate results.
* **threshold** (*float, optional, default = 0.95*) - The threshold that we use to decide to early stop the worst performance curve. For example: if threshold = 0.95, optimize_mode = maximize, and the best performance in the history is 0.9, then we will stop the trial who's predicted value is lower than 0.95 * 0.9 = 0.855.
* **gap** (*int, optional, default = 1*) - The gap interval between Assesor judgements. For example: if gap = 2, start_step = 6, then we will assess the result when we get 6, 8, 10, 12...intermediate results.
* **threshold** (*float, optional, default = 0.95*) - The threshold that we use to decide to early stop the worst performance curve. For example: if threshold = 0.95 and the best performance in the history is 0.9, then we will stop the trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
* **gap** (*int, optional, default = 1*) - The gap interval between Assessor judgements. For example: if gap = 2, start_step = 6, then we will assess the result when we get 6, 8, 10, 12... intermediate results.
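The interplay of `start_step` and `gap` can be sketched with a tiny helper (hypothetical, for illustration only — not part of NNI's API):

```python
def assessed_steps(start_step, gap, max_step):
    """Steps at which the assessor judges a trial, per the classArgs above."""
    return [s for s in range(start_step, max_step + 1) if (s - start_step) % gap == 0]

print(assessed_steps(6, 2, 12))   # -> [6, 8, 10, 12]
```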
**Usage example:**
......@@ -71,8 +73,7 @@ assessor:
builtinAssessorName: Curvefitting
classArgs:
epoch_num: 20
optimize_mode: maximize
start_step: 6
threshold: 0.95
gap: 1
```
\ No newline at end of file
```
Curve Fitting Assessor on NNI
===
# Curve Fitting Assessor on NNI
## Introduction
## 1. Introduction
The Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the prediction of the final epoch's performance is worse than the best final performance in the trial history.
In this algorithm, we use 12 curves to fit the learning curve. The set of parametric curve models is chosen from this [reference paper][1]. These curves' shapes coincide with our prior knowledge about the form of learning curves: they are typically increasing, saturating functions.
![](../../img/curvefitting_learning_curve.PNG)
![learning_curve](../../img/curvefitting_learning_curve.PNG)
We combine all learning curve models into a single, more powerful model. This combined model is given by a weighted linear combination:
![](../../img/curvefitting_f_comb.gif)
![f_comb](../../img/curvefitting_f_comb.gif)
with the new combined parameter vector
![](../../img/curvefitting_expression_xi.gif)
![expression_xi](../../img/curvefitting_expression_xi.gif)
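For readers who cannot view the images above, the combined model can be restated in text form (following the reference paper; here `K = 12` is the number of curve models):

```latex
% Weighted linear combination of the K curve models, with combined
% parameter vector \xi collecting the weights, the per-model parameters,
% and the Gaussian noise variance.
f_{\mathrm{comb}}(x \mid \xi) = \sum_{k=1}^{K} w_k \, f_k(x \mid \theta_k),
\qquad
\xi = (w_1, \dots, w_K, \theta_1, \dots, \theta_K, \sigma^2)
```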
We assume additive Gaussian noise, with the noise parameter initialized to its maximum likelihood estimate.
......@@ -30,44 +30,46 @@ Concretely, this algorithm goes through three stages of learning, predicting, an
The figure below shows the result of our algorithm on MNIST trial history data, where the green points represent the data obtained by the Assessor, the blue points represent future but unknown data, and the red line is the curve predicted by the Curve Fitting Assessor.
![](../../img/curvefitting_example.PNG)
![examples](../../img/curvefitting_example.PNG)
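The assessing stage described above boils down to a single comparison. A minimal sketch, simplified from NNI's actual implementation (the real assessor also handles history bookkeeping and prediction internally):

```python
from enum import Enum

class AssessResult(Enum):
    Good = True
    Bad = False

def assess(predicted_final, best_completed, threshold=0.95):
    """Early-stop rule: a trial is Bad when its predicted final performance
    falls below threshold * best completed performance."""
    if predicted_final is None:        # not enough data to predict yet
        return AssessResult.Good
    if predicted_final > threshold * best_completed:
        return AssessResult.Good
    return AssessResult.Bad
```

With `threshold = 0.95` and a historical best of 0.9, trials predicted below 0.855 are stopped early.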
## Usage
## 2. Usage
To use Curve Fitting Assessor, you should add the following spec in your experiment's YAML config file:
```
```yaml
assessor:
builtinAssessorName: Curvefitting
classArgs:
# (required)The total number of epoch.
# We need to know the number of epoch to determine which point we need to predict.
epoch_num: 20
# (optional) choice: maximize, minimize
* The default value of optimize_mode is maximize
optimize_mode: maximize
# (optional) In order to save our computing resource, we start to predict when we have more than only after receiving start_step number of reported intermediate results.
* The default value of start_step is 6.
start_step: 6
# (optional) The threshold that we decide to early stop the worse performance curve.
# For example: if threshold = 0.95, optimize_mode = maximize, best performance in the history is 0.9, then we will stop the trial which predict value is lower than 0.95 * 0.9 = 0.855.
* The default value of threshold is 0.95.
# Kindly reminds that if you choose minimize mode, please adjust the value of threshold >= 1.0 (e.g threshold=1.1)
threshold: 0.95
# (optional) The gap interval between Assesor judgements.
# For example: if gap = 2, start_step = 6, then we will assess the result when we get 6, 8, 10, 12...intermedian result.
* The default value of gap is 1.
gap: 1
builtinAssessorName: Curvefitting
classArgs:
# (required) The total number of epochs.
# We need to know the number of epochs to determine which points we need to predict.
epoch_num: 20
# (optional) To save computing resources, we start to predict only after receiving start_step reported intermediate results.
# The default value of start_step is 6.
start_step: 6
# (optional) The threshold used to decide whether to early stop the worst performance curve.
# For example: if threshold = 0.95 and the best performance in the history is 0.9, then we will stop the trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
# The default value of threshold is 0.95.
threshold: 0.95
# (optional) The gap interval between Assessor judgements.
# For example: if gap = 2, start_step = 6, then we will assess the result when we get 6, 8, 10, 12... intermediate results.
# The default value of gap is 1.
gap: 1
```
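To make the extrapolation idea concrete, here is a much simplified, self-contained sketch. NNI's real implementation (`model_factory.py`) combines all 12 models with MCMC-sampled weights; this illustration fits just the single `pow3` model (`c - a * x**(-alpha)`, one of the 12) by linear least squares over a grid of `alpha` values, then extrapolates to the target epoch:

```python
import numpy as np

def fit_pow3(steps, history, alphas=np.linspace(0.1, 2.0, 20)):
    """Fit y = c - a * x**(-alpha), returning the (c, a, alpha)
    with the smallest squared fitting error."""
    best = None
    for alpha in alphas:
        # For fixed alpha the model is linear in (c, a)
        X = np.column_stack([np.ones(len(steps)), -np.asarray(steps, float) ** (-alpha)])
        coef, *_ = np.linalg.lstsq(X, history, rcond=None)
        err = float(np.sum((X @ coef - history) ** 2))
        if best is None or err < best[0]:
            best = (err, coef[0], coef[1], alpha)
    return best[1], best[2], best[3]

# Six intermediate accuracies from a hypothetical trial
steps = np.arange(1, 7)
history = 0.9 - 0.4 * steps ** -0.5          # synthetic data, generated from pow3 itself

c, a, alpha = fit_pow3(steps, history)
predicted_final = c - a * 20.0 ** (-alpha)   # extrapolate to epoch_num = 20
```

The predicted final value is then compared against `threshold * best_completed_performance` to decide whether the trial is worth continuing.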
## 3. File Structure
## Limitation
According to the original paper, only incremental functions are supported. Therefore this assessor can only be used to maximize optimization metrics. For example, it can be used for accuracy, but not for loss.
## File Structure
The assessor consists of several files, functions, and classes. Here we briefly describe a few of them.
* `curvefunctions.py` includes all the function expressions and default parameters.
* `modelfactory.py` includes learning and predicting; the corresponding calculation part is also implemented here.
* `curvefitting_assessor.py` is the assessor, which receives the trial history and assesses whether to early stop the trial.
## 4. TODO
* Further improve the accuracy of the prediction and test it on more models.
## TODO
* Further improve the accuracy of the prediction and test it on more models.
[1]: http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf
......@@ -19,8 +19,6 @@ assessor:
#choice: Medianstop, Curvefitting
builtinAssessorName: Curvefitting
classArgs:
#choice: maximize, minimize
optimize_mode: maximize
epoch_num: 20
threshold: 0.9
trial:
......
......@@ -19,8 +19,6 @@ assessor:
#choice: Medianstop, Curvefitting
builtinAssessorName: Curvefitting
classArgs:
#choice: maximize, minimize
optimize_mode: maximize
epoch_num: 20
threshold: 0.9
trial:
......
......@@ -19,8 +19,6 @@ assessor:
#choice: Medianstop, Curvefitting
builtinAssessorName: Curvefitting
classArgs:
#choice: maximize, minimize
optimize_mode: maximize
epoch_num: 20
threshold: 0.9
trial:
......
......@@ -9,6 +9,7 @@ from .model_factory import CurveModel
logger = logging.getLogger('curvefitting_Assessor')
class CurvefittingAssessor(Assessor):
"""CurvefittingAssessor uses learning curve fitting algorithm to predict the learning curve performance in the future.
It stops a pending trial X at step S if the trial's forecast result at the target step has converged and is lower than the
......@@ -18,26 +19,17 @@ class CurvefittingAssessor(Assessor):
----------
epoch_num : int
The total number of epoch
optimize_mode : str
optimize mode, 'maximize' or 'minimize'
start_step : int
only after receiving start_step number of reported intermediate results
threshold : float
The threshold that we decide to early stop the worse performance curve.
"""
def __init__(self, epoch_num=20, optimize_mode='maximize', start_step=6, threshold=0.95, gap=1):
def __init__(self, epoch_num=20, start_step=6, threshold=0.95, gap=1):
if start_step <= 0:
logger.warning('It\'s recommended to set start_step to a positive number')
# Record the target position we predict
self.target_pos = epoch_num
# Record the optimize_mode
if optimize_mode == 'maximize':
self.higher_better = True
elif optimize_mode == 'minimize':
self.higher_better = False
else:
self.higher_better = True
logger.warning('unrecognized optimize_mode %s', optimize_mode)
# Start forecasting when historical data reaches start step
self.start_step = start_step
# Record the compared threshold
......@@ -109,10 +101,12 @@ class CurvefittingAssessor(Assessor):
# Predict the final result
curvemodel = CurveModel(self.target_pos)
predict_y = curvemodel.predict(scalar_trial_history)
logger.info('Prediction done. Trial job id = %s. Predict value = %s', trial_job_id, predict_y)
log_message = "Prediction done. Trial job id = {}, Predict value = {}".format(trial_job_id, predict_y)
if predict_y is None:
logger.info('wait for more information to predict precisely')
logger.info('%s, wait for more information to predict precisely', log_message)
return AssessResult.Good
else:
logger.info(log_message)
standard_performance = self.completed_best_performance * self.threshold
end_time = datetime.datetime.now()
......@@ -122,14 +116,9 @@ class CurvefittingAssessor(Assessor):
trial_job_id, self.trial_history
)
if self.higher_better:
if predict_y > standard_performance:
return AssessResult.Good
return AssessResult.Bad
else:
if predict_y < standard_performance:
return AssessResult.Good
return AssessResult.Bad
if predict_y > standard_performance:
return AssessResult.Good
return AssessResult.Bad
except Exception as exception:
logger.exception('unrecognized exception in curvefitting_assessor %s', exception)
......@@ -14,6 +14,7 @@ model_para_num = {}
curve_combination_models = ['vap', 'pow3', 'linear', 'logx_linear', 'dr_hill_zero_background', 'log_power', 'pow4', 'mmf',
'exp4', 'ilog2', 'weibull', 'janoschek']
def vap(x, a, b, c):
"""Vapor pressure model
......@@ -31,10 +32,12 @@ def vap(x, a, b, c):
"""
return np.exp(a+b/x+c*np.log(x))
all_models['vap'] = vap
model_para['vap'] = [-0.622028, -0.470050, 0.042322]
model_para_num['vap'] = 3
def pow3(x, c, a, alpha):
"""pow3
......@@ -52,10 +55,12 @@ def pow3(x, c, a, alpha):
"""
return c - a * x**(-alpha)
all_models['pow3'] = pow3
model_para['pow3'] = [0.84, 0.52, 0.01]
model_para_num['pow3'] = 3
def linear(x, a, b):
"""linear
......@@ -72,10 +77,12 @@ def linear(x, a, b):
"""
return a*x + b
all_models['linear'] = linear
model_para['linear'] = [1., 0]
model_para_num['linear'] = 2
def logx_linear(x, a, b):
"""logx linear
......@@ -93,10 +100,12 @@ def logx_linear(x, a, b):
x = np.log(x)
return a*x + b
all_models['logx_linear'] = logx_linear
model_para['logx_linear'] = [0.378106, 0.046506]
model_para_num['logx_linear'] = 2
def dr_hill_zero_background(x, theta, eta, kappa):
"""dr hill zero background
......@@ -112,12 +121,14 @@ def dr_hill_zero_background(x, theta, eta, kappa):
float
(theta* x**eta) / (kappa**eta + x**eta)
"""
return (theta* x**eta) / (kappa**eta + x**eta)
return (theta * x**eta) / (kappa**eta + x**eta)
all_models['dr_hill_zero_background'] = dr_hill_zero_background
model_para['dr_hill_zero_background'] = [0.772320, 0.586449, 2.460843]
model_para_num['dr_hill_zero_background'] = 3
def log_power(x, a, b, c):
"""logistic power
......@@ -135,10 +146,12 @@ def log_power(x, a, b, c):
"""
return a/(1.+(x/np.exp(b))**c)
all_models['log_power'] = log_power
model_para['log_power'] = [0.77, 2.98, -0.51]
model_para_num['log_power'] = 3
def pow4(x, alpha, a, b, c):
"""pow4
......@@ -157,10 +170,12 @@ def pow4(x, alpha, a, b, c):
"""
return c - (a*x+b)**-alpha
all_models['pow4'] = pow4
model_para['pow4'] = [0.1, 200, 0., 0.8]
model_para_num['pow4'] = 4
def mmf(x, alpha, beta, kappa, delta):
"""Morgan-Mercer-Flodin
http://www.pisces-conservation.com/growthhelp/index.html?morgan_mercer_floden.htm
......@@ -180,10 +195,12 @@ def mmf(x, alpha, beta, kappa, delta):
"""
return alpha - (alpha - beta) / (1. + (kappa * x)**delta)
all_models['mmf'] = mmf
model_para['mmf'] = [0.7, 0.1, 0.01, 5]
model_para_num['mmf'] = 4
def exp4(x, c, a, b, alpha):
"""exp4
......@@ -202,10 +219,12 @@ def exp4(x, c, a, b, alpha):
"""
return c - np.exp(-a*(x**alpha)+b)
all_models['exp4'] = exp4
model_para['exp4'] = [0.7, 0.8, -0.8, 0.3]
model_para_num['exp4'] = 4
def ilog2(x, c, a):
"""ilog2
......@@ -222,10 +241,12 @@ def ilog2(x, c, a):
"""
return c - a / np.log(x)
all_models['ilog2'] = ilog2
model_para['ilog2'] = [0.78, 0.43]
model_para_num['ilog2'] = 2
def weibull(x, alpha, beta, kappa, delta):
"""Weibull model
http://www.pisces-conservation.com/growthhelp/index.html?morgan_mercer_floden.htm
......@@ -245,10 +266,12 @@ def weibull(x, alpha, beta, kappa, delta):
"""
return alpha - (alpha - beta) * np.exp(-(kappa * x)**delta)
all_models['weibull'] = weibull
model_para['weibull'] = [0.7, 0.1, 0.01, 1]
model_para_num['weibull'] = 4
def janoschek(x, a, beta, k, delta):
"""http://www.pisces-conservation.com/growthhelp/janoschek.htm
......@@ -267,6 +290,7 @@ def janoschek(x, a, beta, k, delta):
"""
return a - (a - beta) * np.exp(-k*x**delta)
all_models['janoschek'] = janoschek
model_para['janoschek'] = [0.73, 0.07, 0.355, 0.46]
model_para_num['janoschek'] = 4
......@@ -4,30 +4,29 @@
import numpy as np
import unittest
from .curvefitting_assessor import CurvefittingAssessor
from .model_factory import CurveModel
from nni.curvefitting_assessor import CurvefittingAssessor
from nni.curvefitting_assessor.model_factory import CurveModel
from nni.assessor import AssessResult
class TestCurveFittingAssessor(unittest.TestCase):
def test_init(self):
new_assessor = CurvefittingAssessor(20)
self.assertEquals(new_assessor.start_step, 6)
self.assertEquals(new_assessor.target_pos, 20)
self.assertEquals(new_assessor.completed_best_performance, 0.0001)
self.assertEqual(new_assessor.start_step, 6)
self.assertEqual(new_assessor.target_pos, 20)
def test_insufficient_point(self):
new_assessor = CurvefittingAssessor(20)
ret = new_assessor.assess_trial(1, [1])
self.assertEquals(ret, AssessResult.Good)
self.assertEqual(ret, AssessResult.Good)
def test_not_converged(self):
new_assessor = CurvefittingAssessor(20)
with self.assertRaises(TypeError):
ret = new_assessor.assess_trial([1, 199, 0, 199, 1, 209, 2])
ret = new_assessor.assess_trial(1, [1, 199, 0, 199, 1, 209, 2])
self.assertEquals(ret, AssessResult.Good)
self.assertEqual(ret, AssessResult.Good)
models = CurveModel(21)
self.assertEquals(models.predict([1, 199, 0, 199, 1, 209, 2]), -1)
self.assertEqual(models.predict([1, 199, 0, 199, 1, 209, 2]), None)
def test_curve_model(self):
test_model = CurveModel(21)
......@@ -37,10 +36,10 @@ class TestCurveFittingAssessor(unittest.TestCase):
test_model.target_pos = 20
test_model.trial_history = ([1, 1, 1, 1, 1, 1, 1, 1, 1])
test_model.weight_samples = np.ones((test_model.effective_model_num), dtype=np.float) / test_model.effective_model_num
self.assertAlmostEquals(test_model.predict_y('vap', 9), 0.5591906328335763)
self.assertAlmostEquals(test_model.predict_y('logx_linear', 15), 1.0704360293379522)
self.assertAlmostEquals(test_model.f_comb(9, test_model.weight_samples), 1.1543379521172443)
self.assertAlmostEquals(test_model.f_comb(15, test_model.weight_samples), 1.6949395581692737)
self.assertAlmostEqual(test_model.predict_y('vap', 9), 0.5591906328335763)
self.assertAlmostEqual(test_model.predict_y('logx_linear', 15), 1.0704360293379522)
self.assertAlmostEqual(test_model.f_comb(9, test_model.weight_samples), 1.1543379521172443)
self.assertAlmostEqual(test_model.f_comb(15, test_model.weight_samples), 1.6949395581692737)
if __name__ == '__main__':
unittest.main()
......@@ -84,15 +84,15 @@ class GraphUtilsTestCase(TestCase):
expected_proto = GraphDef()
text_format.Parse(expected_str, expected_proto)
self.assertEquals(len(expected_proto.node), len(actual_proto.node))
self.assertEqual(len(expected_proto.node), len(actual_proto.node))
for i in range(len(expected_proto.node)):
expected_node = expected_proto.node[i]
actual_node = actual_proto.node[i]
self.assertEquals(expected_node.name, actual_node.name)
self.assertEquals(expected_node.op, actual_node.op)
self.assertEquals(expected_node.input, actual_node.input)
self.assertEquals(expected_node.device, actual_node.device)
self.assertEquals(
self.assertEqual(expected_node.name, actual_node.name)
self.assertEqual(expected_node.op, actual_node.op)
self.assertEqual(expected_node.input, actual_node.input)
self.assertEqual(expected_node.device, actual_node.device)
self.assertEqual(
sorted(expected_node.attr.keys()), sorted(actual_node.attr.keys()))
@unittest.skipIf(torch.__version__ < "1.4.0", "not supported")
......
......@@ -13,7 +13,6 @@ assessor:
builtinAssessorName: Curvefitting
classArgs:
epoch_num: 20
optimize_mode: maximize
start_step: 6
threshold: 0.95
trial:
......
......@@ -222,7 +222,6 @@ assessor_schema_dict = {
'builtinAssessorName': 'Curvefitting',
Optional('classArgs'): {
'epoch_num': setNumberRange('epoch_num', int, 0, 9999),
Optional('optimize_mode'): setChoice('optimize_mode', 'maximize', 'minimize'),
Optional('start_step'): setNumberRange('start_step', int, 0, 9999),
Optional('threshold'): setNumberRange('threshold', float, 0, 9999),
Optional('gap'): setNumberRange('gap', int, 1, 9999),
......