Commit 1f9b7617 authored by Yuge Zhang, committed by QuanluZhang

Add comprehensive tests for tuners (merge into master) (#1681)

* Add comprehensive tests for tuners (#1570)
parent 358bdb18
......@@ -25,6 +25,8 @@ jobs:
displayName: 'Run flake8 tests to find Python syntax errors and undefined names'
- script: |
cd test
sudo apt install -y swig
PATH=$HOME/.local/bin:$PATH nnictl package install --name=SMAC
source unittest.sh
displayName: 'Unit test'
- script: |
......@@ -65,7 +67,11 @@ jobs:
displayName: 'Install nni toolkit via source code'
- script: |
cd test
PATH=$HOME/Library/Python/3.7/bin:$PATH && source unittest.sh
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" < /dev/null 2> /dev/null
brew install swig@3
ln -s /usr/local/opt/swig\@3/bin/swig /usr/local/bin/swig
PATH=$HOME/Library/Python/3.7/bin:$PATH nnictl package install --name=SMAC
PATH=$HOME/Library/Python/3.7/bin:$PATH source unittest.sh
displayName: 'Unit test'
- script: |
cd test
......
......@@ -122,7 +122,7 @@ Its requirement of computation resource is relatively high. Specifically, it req
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the tuner will target to maximize metrics. If 'minimize', the tuner will target to minimize metrics.
* **population_size** (*int value (should > 0), optional, default = 20*) - the initial size of the population(trial num) in evolution tuner. Suggests `population_size` be much larger than `concurrency`, so users can get the most out of the algorithm (and at least `concurrency`, or the tuner will fail on their first generation of parameters).
* **population_size** (*int value (should > 0), optional, default = 20*) - the initial size of the population (trial num) in the evolution tuner. It is suggested that `population_size` be much larger than `concurrency`, so users can get the most out of the algorithm (and at least `concurrency`, or the tuner will fail on its first generation of parameters).
**Usage example**
......@@ -143,11 +143,11 @@ tuner:
> Built-in Tuner Name: **SMAC**
**Please note that SMAC doesn't support running on windows currently. The specific reason can be referred to this [GitHub issue](https://github.com/automl/SMAC3/issues/483).**
**Please note that SMAC doesn't support running on Windows currently. See this [GitHub issue](https://github.com/automl/SMAC3/issues/483) for the specific reason.**
**Installation**
SMAC need to be installed by following command before first use.
SMAC needs to be installed with the following command before first use. As a reminder, `swig` is required for SMAC; on Ubuntu, `swig` can be installed with `apt`.
```bash
nnictl package install --name=SMAC
......
......@@ -21,6 +21,8 @@ To define a search space, users should define the name of variable, the type of
Take the first line as an example. `dropout_rate` is defined as a variable whose prior distribution is a uniform distribution over the range from `0.1` to `0.5`.
Note that what a search space can express is highly dependent on your tuner. We list the supported types for each built-in tuner below. For a customized tuner, you don't have to follow our convention and you have the flexibility to define any type you want.
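As a concrete illustration, here is a minimal sketch of such a search space, written as a Python dict with the same structure as the JSON form (the second variable is hypothetical):

```python
# a minimal search space sketch, mirroring the dropout_rate example above
search_space = {
    "dropout_rate": {"_type": "uniform", "_value": [0.1, 0.5]},
    "optimizer": {"_type": "choice", "_value": ["sgd", "adam"]},  # hypothetical extra variable
}
```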
## Types
All types of sampling strategies and their parameters are listed here:
......@@ -74,6 +76,8 @@ All types of sampling strategies and their parameter are listed here:
* `{"_type": "mutable_layer", "_value": {mutable_layer_infomation}}`
* Type for [Neural Architecture Search Space][1]. The value is also a dictionary, containing key-value pairs that represent the name and search space of each mutable_layer, respectively.
* For now, users can only use this type of search space with annotation, meaning there is no need to define a JSON file for the search space, since it will be generated automatically from the annotations in the trial code. A hedged sketch of such a space follows this list.
* The following HPO tuners can be adapted to tune this search space: TPE, Random, Anneal, Evolution, Grid Search, Hyperband and BOHB.
* For detailed usage, please refer to [General NAS Interfaces][1].
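A hedged sketch of a `mutable_layer` space, with field names inferred from the range checks in `test_tuner.py` later in this diff (treat the concrete block, layer, and op names as assumptions):

```python
# a hedged sketch of a mutable_layer search space; keys such as layer_choice,
# optional_inputs and optional_input_size are those consulted by check_range below
nas_space = {
    "mutable_block_0": {                              # hypothetical block name
        "_type": "mutable_layer",
        "_value": {
            "mutable_layer_0": {                      # hypothetical layer name
                "layer_choice": ["conv2d", "maxpool"],  # candidate ops (assumed)
                "optional_inputs": ["prev_layer"],      # assumed input names
                "optional_input_size": [0, 1],          # a [min, max] range, per the checker
            }
        }
    }
}
```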
## Search Space Types Supported by Each Tuner
......@@ -94,12 +98,12 @@ All types of sampling strategies and their parameter are listed here:
Known Limitations:
* GP Tuner and Metis Tuner support only **numerical values** in search space(`choice` type values can be no-numeraical with other tuners, e.g. string values). Both GP Tuner and Metis Tuner use Gaussian Process Regressor(GPR). GPR make predictions based on a kernel function and the 'distance' between different points, it's hard to get the true distance between no-numerical values.
* GP Tuner and Metis Tuner support only **numerical values** in the search space (`choice` values can be non-numerical with other tuners, e.g. string values). Both GP Tuner and Metis Tuner use a Gaussian Process Regressor (GPR). GPR makes predictions based on a kernel function and the 'distance' between points, and it is hard to define a true distance between non-numerical values. A short sketch after this list makes the limitation concrete.
* Note that for nested search space:
* Only Random Search/TPE/Anneal/Evolution tuner supports nested search space
* We do not support nested search space "Hyper Parameter" in visualization now, the enhancement is being considered in #1110(https://github.com/microsoft/nni/issues/1110), any suggestions or discussions or contributions are warmly welcomed
* We do not support nested search space in the "Hyper Parameter" visualization yet; the enhancement is being considered in [#1110](https://github.com/microsoft/nni/issues/1110), and any suggestions, discussions or contributions are warmly welcomed.
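The numerical-values limitation is exercised by the test asset later in this commit: `choice_str` lists `metis` and `gp` under `"fail"`, while `choice_float` does not. In sketch form:

```python
# grounded in assets/search_space.json below: string choices are expected to
# fail for Metis/GP, numeric choices to pass
ok_for_gp_metis = {"choice_float": {"_type": "choice", "_value": [0.3, 1, 2.0]}}
bad_for_gp_metis = {"choice_str": {"_type": "choice", "_value": ["cat", "dog", "elephant"]}}
```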
[1]: ../AdvancedFeature/GeneralNasInterfaces.md
......@@ -158,11 +158,11 @@ class EvolutionTuner(Tuner):
EvolutionTuner is a tuner using a naive evolution algorithm.
"""
def __init__(self, optimize_mode, population_size=32):
def __init__(self, optimize_mode="maximize", population_size=32):
"""
Parameters
----------
optimize_mode : str
optimize_mode : str, default 'maximize'
population_size : int
initial population size. The larger population size,
the better evolution performance.
......
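A minimal usage sketch of the new default (the `population_size=100` value mirrors `test_evolution` later in this diff):

```python
# minimal sketch; optimize_mode now defaults to "maximize"
from nni.evolution_tuner.evolution_tuner import EvolutionTuner

tuner = EvolutionTuner(population_size=100)  # as in test_evolution below
```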
......@@ -265,6 +265,8 @@ def convert_nas_search_space(search_space):
param search_space: raw search space
return: the new search space, mutable_layers will be converted into choice
"""
if not isinstance(search_space, dict):
return search_space
ret = dict()
for k, v in search_space.items():
if "_type" not in v:
......
......@@ -48,7 +48,7 @@ def uniform(low, high, random_state):
high: a float that represents an upper bound
random_state: an object of numpy.random.RandomState
'''
assert high > low, 'Upper bound must be larger than lower bound'
assert high >= low, 'Upper bound must not be less than lower bound'
return random_state.uniform(low, high)
......
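With the relaxed assertion, a degenerate bound now samples deterministically instead of failing. A quick sanity sketch, assuming the `uniform` helper above is in scope and numpy is installed:

```python
# sanity sketch for the relaxed bound check
import numpy as np

rs = np.random.RandomState(0)
assert uniform(99.9, 99.9, rs) == 99.9  # matches the "uniform_equal" asset below
print(uniform(-1.0, 1.5, rs))           # a sample from [-1.0, 1.5)
```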
{
"choice_str": {
"_type": "choice",
"_value": ["cat", "dog", "elephant", "cow", "sheep", "panda"],
"fail": ["metis", "gp"]
},
"choice_int": {
"_type": "choice",
"_value": [42, 43, -1]
},
"choice_mixed": {
"_type": "choice",
"_value": [0.3, "cat", 1, null],
"fail": ["metis", "gp"]
},
"choice_float": {
"_type": "choice",
"_value": [0.3, 1, 2.0]
},
"choice_single": {
"_type": "choice",
"_value": [1]
},
"randint_ok": {
"_type": "randint",
"_value": [-2, 3]
},
"randint_single": {
"_type": "randint",
"_value": [10, 11]
},
"randint_fail_equal": {
"_type": "randint",
"_value": [0, 0]
},
"uniform_ok": {
"_type": "uniform",
"_value": [-1.0, 1.5]
},
"uniform_equal": {
"_type": "uniform",
"_value": [99.9, 99.9]
},
"quniform_ok": {
"_type": "quniform",
"_value": [0.0, 10.0, 2.5]
},
"quniform_clip": {
"_type": "quniform",
"_value": [2.0, 10.0, 5.0]
},
"quniform_clip_2": {
"_type": "quniform",
"_value": [-5.5, -0.5, 6]
},
"loguniform_ok": {
"_type": "loguniform",
"_value": [0.001, 100]
},
"loguniform_equal": {
"_type": "loguniform",
"_value": [1, 1]
},
"qloguniform_ok": {
"_type": "qloguniform",
"_value": [0.001, 100, 1]
},
"qloguniform_equal": {
"_type": "qloguniform",
"_value": [2, 2, 1]
},
"normal_ok": {
"_type": "normal",
"_value": [-1.0, 5.0]
},
"qnormal_ok": {
"_type": "qnormal",
"_value": [-1.5, 5.0, 0.1]
},
"lognormal_ok": {
"_type": "lognormal",
"_value": [-1.0, 5.0]
},
"qlognormal_ok": {
"_type": "qlognormal",
"_value": [-1.5, 5.0, 0.1]
}
}
\ No newline at end of file
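The asset above follows a naming convention consumed by `search_space_test_all` in `test_tuner.py` below: each key begins with the sampled `_type`, and the optional `"fail"` list names tuners expected to reject that entry. A reading sketch:

```python
# sketch of how the test suite consumes this asset (cf. search_space_test_all below)
import json
import os

with open(os.path.join("assets", "search_space.json")) as fp:
    search_space_all = json.load(fp)

for name, space in search_space_all.items():
    expected_fail_for = space.pop("fail", [])  # tuners expected to reject this entry
    print(name, space["_type"], expected_fail_for)
```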
# Copyright (c) Microsoft Corporation. All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
# associated documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish, distribute,
# sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all copies or
# substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
# NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT
# OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# ==================================================================================================
import json
from io import BytesIO
from unittest import TestCase, main
import nni.protocol
from nni.msg_dispatcher import MsgDispatcher
from nni.protocol import CommandType, send, receive
from nni.tuner import Tuner
from nni.utils import extract_scalar_reward
class NaiveTuner(Tuner):
def __init__(self):
self.param = 0
self.trial_results = []
self.search_space = None
self._accept_customized_trials()
def generate_parameters(self, parameter_id, **kwargs):
# report Tuner's internal states to generated parameters,
# so we don't need to pause the main loop
self.param += 2
return {
'param': self.param,
'trial_results': self.trial_results,
'search_space': self.search_space
}
def receive_trial_result(self, parameter_id, parameters, value, **kwargs):
reward = extract_scalar_reward(value)
self.trial_results.append((parameter_id, parameters['param'], reward, kwargs.get("customized")))
def update_search_space(self, search_space):
self.search_space = search_space
_in_buf = BytesIO()
_out_buf = BytesIO()
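# The helpers below swap NNI's protocol streams so the test can play both sides
# of the pipe: first write commands into the tuner's incoming stream, then read
# back what the dispatcher wrote to its outgoing stream.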
def _reverse_io():
_in_buf.seek(0)
_out_buf.seek(0)
nni.protocol._out_file = _in_buf
nni.protocol._in_file = _out_buf
def _restore_io():
_in_buf.seek(0)
_out_buf.seek(0)
nni.protocol._in_file = _in_buf
nni.protocol._out_file = _out_buf
class MsgDispatcherTestCase(TestCase):
def test_msg_dispatcher(self):
_reverse_io() # now we are sending to Tuner's incoming stream
send(CommandType.RequestTrialJobs, '2')
send(CommandType.ReportMetricData, '{"parameter_id":0,"type":"PERIODICAL","value":10}')
send(CommandType.ReportMetricData, '{"parameter_id":1,"type":"FINAL","value":11}')
send(CommandType.UpdateSearchSpace, '{"name":"SS0"}')
send(CommandType.AddCustomizedTrialJob, '{"param":-1}')
send(CommandType.ReportMetricData, '{"parameter_id":2,"type":"FINAL","value":22}')
send(CommandType.RequestTrialJobs, '1')
send(CommandType.KillTrialJob, 'null')
_restore_io()
tuner = NaiveTuner()
dispatcher = MsgDispatcher(tuner)
nni.msg_dispatcher_base._worker_fast_exit_on_terminate = False
dispatcher.run()
e = dispatcher.worker_exceptions[0]
self.assertIs(type(e), AssertionError)
self.assertEqual(e.args[0], 'Unsupported command: CommandType.KillTrialJob')
_reverse_io() # now we are receiving from Tuner's outgoing stream
self._assert_params(0, 2, [], None)
self._assert_params(1, 4, [], None)
command, data = receive() # this one is customized
data = json.loads(data)
self.assertIs(command, CommandType.NewTrialJob)
self.assertEqual(data['parameter_id'], 2)
self.assertEqual(data['parameter_source'], 'customized')
self.assertEqual(data['parameters'], {'param': -1})
self._assert_params(3, 6, [[1, 4, 11, False], [2, -1, 22, True]], {'name': 'SS0'})
self.assertEqual(len(_out_buf.read()), 0) # no more commands
def _assert_params(self, parameter_id, param, trial_results, search_space):
command, data = receive()
self.assertIs(command, CommandType.NewTrialJob)
data = json.loads(data)
self.assertEqual(data['parameter_id'], parameter_id)
self.assertEqual(data['parameter_source'], 'algorithm')
self.assertEqual(data['parameters']['param'], param)
self.assertEqual(data['parameters']['trial_results'], trial_results)
self.assertEqual(data['parameters']['search_space'], search_space)
if __name__ == '__main__':
main()
......@@ -17,107 +17,184 @@
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT
# OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# ==================================================================================================
import nni.protocol
from nni.protocol import CommandType, send, receive
from nni.tuner import Tuner
from nni.msg_dispatcher import MsgDispatcher
from nni.utils import extract_scalar_reward
from io import BytesIO
import glob
import json
import logging
import os
import shutil
import sys
from unittest import TestCase, main
from nni.batch_tuner.batch_tuner import BatchTuner
from nni.evolution_tuner.evolution_tuner import EvolutionTuner
from nni.gp_tuner.gp_tuner import GPTuner
from nni.gridsearch_tuner.gridsearch_tuner import GridSearchTuner
from nni.hyperopt_tuner.hyperopt_tuner import HyperoptTuner
from nni.metis_tuner.metis_tuner import MetisTuner
try:
from nni.smac_tuner.smac_tuner import SMACTuner
except ImportError:
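# SMAC cannot be installed on Windows (see the SMAC note in the docs above)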
assert sys.platform == "win32"
from nni.tuner import Tuner
class NaiveTuner(Tuner):
def __init__(self):
self.param = 0
self.trial_results = []
self.search_space = None
self._accept_customized_trials()
def generate_parameters(self, parameter_id, **kwargs):
# report Tuner's internal states to generated parameters,
# so we don't need to pause the main loop
self.param += 2
return {
'param': self.param,
'trial_results': self.trial_results,
'search_space': self.search_space
}
def receive_trial_result(self, parameter_id, parameters, value, customized, **kwargs):
reward = extract_scalar_reward(value)
self.trial_results.append((parameter_id, parameters['param'], reward, customized))
def update_search_space(self, search_space):
self.search_space = search_space
_in_buf = BytesIO()
_out_buf = BytesIO()
def _reverse_io():
_in_buf.seek(0)
_out_buf.seek(0)
nni.protocol._out_file = _in_buf
nni.protocol._in_file = _out_buf
def _restore_io():
_in_buf.seek(0)
_out_buf.seek(0)
nni.protocol._in_file = _in_buf
nni.protocol._out_file = _out_buf
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger('test_tuner')
class TunerTestCase(TestCase):
def test_tuner(self):
_reverse_io() # now we are sending to Tuner's incoming stream
send(CommandType.RequestTrialJobs, '2')
send(CommandType.ReportMetricData, '{"parameter_id":0,"type":"PERIODICAL","value":10}')
send(CommandType.ReportMetricData, '{"parameter_id":1,"type":"FINAL","value":11}')
send(CommandType.UpdateSearchSpace, '{"name":"SS0"}')
send(CommandType.AddCustomizedTrialJob, '{"param":-1}')
send(CommandType.ReportMetricData, '{"parameter_id":2,"type":"FINAL","value":22}')
send(CommandType.RequestTrialJobs, '1')
send(CommandType.KillTrialJob, 'null')
_restore_io()
tuner = NaiveTuner()
dispatcher = MsgDispatcher(tuner)
nni.msg_dispatcher_base._worker_fast_exit_on_terminate = False
dispatcher.run()
e = dispatcher.worker_exceptions[0]
self.assertIs(type(e), AssertionError)
self.assertEqual(e.args[0], 'Unsupported command: CommandType.KillTrialJob')
_reverse_io() # now we are receiving from Tuner's outgoing stream
self._assert_params(0, 2, [], None)
self._assert_params(1, 4, [], None)
command, data = receive() # this one is customized
data = json.loads(data)
self.assertIs(command, CommandType.NewTrialJob)
self.assertEqual(data['parameter_id'], 2)
self.assertEqual(data['parameter_source'], 'customized')
self.assertEqual(data['parameters'], {'param': -1})
self._assert_params(3, 6, [[1, 4, 11, False], [2, -1, 22, True]], {'name': 'SS0'})
self.assertEqual(len(_out_buf.read()), 0) # no more commands
def _assert_params(self, parameter_id, param, trial_results, search_space):
command, data = receive()
self.assertIs(command, CommandType.NewTrialJob)
data = json.loads(data)
self.assertEqual(data['parameter_id'], parameter_id)
self.assertEqual(data['parameter_source'], 'algorithm')
self.assertEqual(data['parameters']['param'], param)
self.assertEqual(data['parameters']['trial_results'], trial_results)
self.assertEqual(data['parameters']['search_space'], search_space)
"""
Targeted at testing functions of built-in tuners, including
- [ ] load_checkpoint
- [ ] save_checkpoint
- [X] update_search_space
- [X] generate_multiple_parameters
- [ ] import_data
- [ ] trial_end
- [ ] receive_trial_result
"""
def search_space_test_one(self, tuner_factory, search_space):
tuner = tuner_factory()
self.assertIsInstance(tuner, Tuner)
tuner.update_search_space(search_space)
parameters = tuner.generate_multiple_parameters(list(range(0, 50)))
logger.info(parameters)
self.check_range(parameters, search_space)
if not parameters: # TODO: not strict
raise ValueError("No parameters generated")
return parameters
def check_range(self, generated_params, search_space):
EPS = 1E-6
for param in generated_params:
if self._testMethodName == "test_batch":
param = {list(search_space.keys())[0]: param}
for k, v in param.items():
if k.startswith("_mutable_layer"):
_, block, layer, choice = k.split("/")
cand = search_space[block]["_value"][layer].get(choice)
# cand could be None, e.g., optional_inputs_chosen_state
if choice == "layer_choice":
self.assertIn(v, cand)
if choice == "optional_input_size":
if isinstance(cand, int):
self.assertEqual(v, cand)
else:
self.assertGreaterEqual(v, cand[0])
self.assertLessEqual(v, cand[1])
if choice == "optional_inputs":
pass # ignore for now
continue
item = search_space[k]
if item["_type"] == "choice":
self.assertIn(v, item["_value"])
if item["_type"] == "randint":
self.assertIsInstance(v, int)
if item["_type"] == "uniform":
self.assertIsInstance(v, float)
if item["_type"] in ("randint", "uniform", "quniform", "loguniform", "qloguniform"):
self.assertGreaterEqual(v, item["_value"][0])
self.assertLessEqual(v, item["_value"][1])
if item["_type"].startswith("q"):
multiple = v / item["_value"][2]
print(k, v, multiple, item)
if item["_value"][0] + EPS < v < item["_value"][1] - EPS:
self.assertAlmostEqual(int(round(multiple)), multiple)
if item["_type"] in ("qlognormal", "lognormal"):
self.assertGreaterEqual(v, 0)
if item["_type"] == "mutable_layer":
for layer_name in item["_value"].keys():
self.assertIn(v[layer_name]["chosen_layer"], item["layer_choice"])
def search_space_test_all(self, tuner_factory, supported_types=None, ignore_types=None):
# NOTE(yuge): ignore types
# Supported types are listed in the table. They are meant to be supported and should be correct.
# Other than those, all the rest are "unsupported", which are expected to produce ridiculous results
# or throw some exceptions. However, there are certain types I can't check. For example, generating
# "normal" with GP Tuner returns successfully, and the results look fine if we check the range (-inf to +inf),
# but they make no sense: it's not a normal distribution. So they are ignored in tests for now.
with open(os.path.join(os.path.dirname(__file__), "assets/search_space.json"), "r") as fp:
search_space_all = json.load(fp)
if supported_types is None:
supported_types = ["choice", "randint", "uniform", "quniform", "loguniform", "qloguniform",
"normal", "qnormal", "lognormal", "qlognormal"]
full_supported_search_space = dict()
for single in search_space_all:
single_keyword = single.split("_")
space = search_space_all[single]
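# naming convention: "<type>_<qualifier>"; an entry is expected to fail when its
# key names no supported type or contains the word "fail"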
expected_fail = not any([t in single_keyword for t in supported_types]) or "fail" in single_keyword
if ignore_types is not None and any([t in ignore_types for t in single_keyword]):
continue
if "fail" in space:
if self._testMethodName.split("_", 1)[1] in space.pop("fail"):
expected_fail = True
single_search_space = {single: space}
if not expected_fail:
# supports this key
self.search_space_test_one(tuner_factory, single_search_space)
full_supported_search_space.update(single_search_space)
else:
# unsupported key
with self.assertRaises(Exception, msg="Testing {}".format(single)) as cm:
self.search_space_test_one(tuner_factory, single_search_space)
logger.info("%s %s %s", tuner_factory, single, cm.exception)
if not any(t in self._testMethodName for t in ["batch", "grid_search"]):
# grid search fails for too many combinations
logger.info("Full supported search space: %s", full_supported_search_space)
self.search_space_test_one(tuner_factory, full_supported_search_space)
def test_grid_search(self):
self.search_space_test_all(lambda: GridSearchTuner(),
supported_types=["choice", "randint", "quniform"])
def test_tpe(self):
self.search_space_test_all(lambda: HyperoptTuner("tpe"))
def test_random_search(self):
self.search_space_test_all(lambda: HyperoptTuner("random_search"))
def test_anneal(self):
self.search_space_test_all(lambda: HyperoptTuner("anneal"))
def test_smac(self):
if sys.platform == "win32":
return # smac doesn't work on windows
self.search_space_test_all(lambda: SMACTuner(),
supported_types=["choice", "randint", "uniform", "quniform", "loguniform"])
def test_batch(self):
self.search_space_test_all(lambda: BatchTuner(),
supported_types=["choice"])
def test_evolution(self):
# Needs enough population size, otherwise it will throw a runtime error
self.search_space_test_all(lambda: EvolutionTuner(population_size=100))
def test_gp(self):
self.search_space_test_all(lambda: GPTuner(),
supported_types=["choice", "randint", "uniform", "quniform", "loguniform",
"qloguniform"],
ignore_types=["normal", "lognormal", "qnormal", "qlognormal"])
def test_metis(self):
self.search_space_test_all(lambda: MetisTuner(),
supported_types=["choice", "randint", "uniform", "quniform"])
def test_networkmorphism(self):
pass
def test_ppo(self):
pass
def tearDown(self):
file_list = glob.glob("smac3*") + ["param_config_space.pcs", "scenario.txt", "model_path"]
for file in file_list:
if os.path.exists(file):
if os.path.isdir(file):
shutil.rmtree(file)
else:
os.remove(file)
if __name__ == '__main__':
......
......@@ -50,6 +50,8 @@ def run_test():
if status == 'DONE':
num_succeeded = get_succeeded_trial_num(TRIAL_JOBS_URL)
print_stderr(TRIAL_JOBS_URL)
if sys.platform == "win32":
time.sleep(sleep_interval) # Windows seems to have some issues with updating in time
assert num_succeeded == max_trial_num, 'only %d succeeded trial jobs, there should be %d' % (num_succeeded, max_trial_num)
check_metrics()
break
......
......@@ -20,9 +20,7 @@ echo "===========================Testing: nni_sdk==========================="
cd ${CWD}/../src/sdk/pynni/
python3 -m unittest discover -v tests
# -------------For typescrip unittest-------------
# -------------For typescript unittest-------------
cd ${CWD}/../src/nni_manager
echo ""
echo "===========================Testing: nni_manager==========================="
......