# Scikit-learn in NNI

[Scikit-learn](https://github.com/scikit-learn/scikit-learn) is a popular machine learning library for data mining and data analysis. It supports many kinds of machine learning models, such as LinearRegression, LogisticRegression, DecisionTree, and SVM. Making scikit-learn more efficient to use is therefore a valuable topic.

NNI supports many tuning algorithms that search for the best models and/or hyper-parameters for scikit-learn, and it supports many environments, such as a local machine, remote servers, and the cloud.

## 1. How to run the example

To start using NNI, install the NNI package and use the command line tool `nnictl` to start an experiment. For more information about installation and preparing the environment, please refer to the [QuickStart](../Tutorial/QuickStart.md).

After installing NNI, enter the corresponding example folder and start the experiment with the following command:

```bash
nnictl create --config ./config.yml
```

## 2. Description of the example

### 2.1 classification

This example uses the digits dataset, which is made up of 1797 8x8 images, each showing a hand-written digit. The goal is to classify these images into 10 classes.

In this example, we use SVC as the model and tune some of its parameters, including `"C", "kernel", "degree", "gamma" and "coef0"`. For more information about these parameters, please refer to the [SVC documentation](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html).
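
As a rough illustration of the setup described above, the sketch below trains an SVC on the digits dataset from a parameter dict and returns the test accuracy. The function name `evaluate_svc`, the 75/25 split, and the sample parameter values are assumptions made for this sketch, not the code shipped with the example.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def evaluate_svc(params, seed=0):
    """Train an SVC on the digits dataset and return its test accuracy.

    `params` is expected to hold the five tuned keys:
    C, kernel, degree, gamma and coef0.
    """
    digits = load_digits()  # 1797 samples, 64 features (flattened 8x8 images)
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.25, random_state=seed
    )
    model = SVC(
        C=params["C"],
        kernel=params["kernel"],
        degree=params["degree"],  # only used by the "poly" kernel
        gamma=params["gamma"],
        coef0=params["coef0"],    # only used by "poly" and "sigmoid"
    )
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)  # mean accuracy on the held-out split


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    example = {"C": 1.0, "kernel": "linear", "degree": 3, "gamma": 0.01, "coef0": 0.01}
    print(f"accuracy: {evaluate_svc(example):.3f}")
```

In an NNI trial, the parameter dict would come from the tuner rather than being written by hand.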

### 2.2 regression

This example uses the Boston Housing Dataset, which consists of house prices in various parts of Boston together with information such as the crime rate (CRIM), the proportion of non-retail business area in the town (INDUS), and the proportion of owner-occupied houses built before 1940 (AGE). The goal is to predict house prices in Boston.

In this example, we tune different kinds of regression models, including `"LinearRegression", "SVR", "KNeighborsRegressor", "DecisionTreeRegressor"`, and some of their parameters, such as `"svr_kernel", "knr_weights"`. You can find more details about these models [here](https://scikit-learn.org/stable/supervised_learning.html#supervised-learning).
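
One way to turn such a parameter dict into a model is a small dispatch function, sketched below. The helper names `build_regressor` and `evaluate` are hypothetical. A synthetic dataset from `make_regression` stands in for the Boston data so that the sketch runs on any recent scikit-learn version, and the fallback values for `svr_kernel` and `knr_weights` are assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def build_regressor(params):
    """Map the tuned `model_name` (plus model-specific options) to an estimator."""
    name = params["model_name"]
    if name == "LinearRegression":
        return LinearRegression()
    if name == "SVR":
        return SVR(kernel=params.get("svr_kernel", "rbf"))
    if name == "KNeighborsRegressor":
        return KNeighborsRegressor(weights=params.get("knr_weights", "uniform"))
    if name == "DecisionTreeRegressor":
        return DecisionTreeRegressor(random_state=0)
    raise ValueError(f"unknown model_name: {name!r}")


def evaluate(params, seed=0):
    """Fit the selected model on synthetic data and return R^2 on a held-out split."""
    # Synthetic stand-in for the housing data (an assumption of this sketch).
    X, y = make_regression(n_samples=300, n_features=13, noise=10.0, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = build_regressor(params)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


if __name__ == "__main__":
    for name in ["LinearRegression", "SVR", "KNeighborsRegressor", "DecisionTreeRegressor"]:
        print(name, round(evaluate({"model_name": name}), 3))
```

Parameters that do not apply to the selected model, such as `svr_kernel` when `model_name` is `KNeighborsRegressor`, are simply ignored here.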

## 3. How to write scikit-learn code using NNI

It is easy to use NNI in your scikit-learn code; only a few steps are needed.

* __step 1__

  Prepare a search_space.json file to store your search space.
  For example, if you want to choose different models, you may try:

  ```json
  {
    "model_name":{"_type":"choice","_value":["LinearRegression", "SVR", "KNeighborsRegressor", "DecisionTreeRegressor"]}
  }
  ```

  If you want to choose different models and parameters, you could put them together in a search_space.json file.

  ```json
  {
    "model_name":{"_type":"choice","_value":["LinearRegression", "SVR", "KNeighborsRegressor", "DecisionTreeRegressor"]},
    "svr_kernel": {"_type":"choice","_value":["linear", "poly", "rbf"]},
    "knr_weights": {"_type":"choice","_value":["uniform", "distance"]}
  }
  ```

  Then you can read these values as a dict in your Python code, as described in step 2.
* __step 2__

  At the beginning of your Python code, you should `import nni` to ensure that the package works normally.

  First, use the `nni.get_next_parameter()` function to get the parameters given by NNI. Then use these parameters to update your code.
  For example, if you define your search_space.json in the following format:

  ```json
  {
    "C": {"_type":"uniform","_value":[0.1, 1]},
    "kernel": {"_type":"choice","_value":["linear", "rbf", "poly", "sigmoid"]},
    "degree": {"_type":"choice","_value":[1, 2, 3, 4]},
    "gamma": {"_type":"uniform","_value":[0.01, 0.1]},
    "coef0": {"_type":"uniform","_value":[0.01, 0.1]}
  }
  ```

  You may get a parameter dict like this:

  ```python
  params = {
      'C': 1.0,
      'kernel': 'linear',
      'degree': 3,
      'gamma': 0.01,
      'coef0': 0.01
  }
  ```

  Then you can use these variables to write your scikit-learn code.
* __step 3__

  After training finishes, you get a score for the model, such as precision, recall, or MSE. NNI needs this score so that the tuner algorithm can generate the next group of parameters, so please report the score back to NNI, which then starts the next trial job.

  You just need to call `nni.report_final_result(score)` to communicate with NNI after your scikit-learn code has run. If you have multiple scores during training, you can also report them back to NNI using `nni.report_intermediate_result(score)`. Note that reporting intermediate results is optional, but you must report your final result.