"src/targets/vscode:/vscode.git/clone" did not exist on "eca3cb69230a848474e44c67cd28a159620ebb3e"
sklearn_examples.md 4.5 KB
Newer Older
Yan Ni's avatar
Yan Ni committed
1
# Scikit-learn in NNI
2

Yan Ni's avatar
Yan Ni committed
3
4
[Scikit-learn](https://github.com/scikit-learn/scikit-learn) is a pupular meachine learning tool for data mining and data analysis. It supports many kinds of meachine learning models like LinearRegression, LogisticRegression, DecisionTree, SVM etc. How to make the use of scikit-learn more efficiency is a valuable topic.  
NNI supports many kinds of tuning algorithms to search the best models and/or hyper-parameters for scikit-learn, and support many kinds of environments like local machine, remote servers and cloud.
5

Yan Ni's avatar
Yan Ni committed
6
## 1. How to run the example
7

Yan Ni's avatar
Yan Ni committed
8
To start using NNI, you should install the nni package, and use the command line tool `nnictl` to start an experiment. For more information about installation and preparing for the environment,  please [refer](QuickStart.md).
Yan Ni's avatar
Yan Ni committed
9
After you installed NNI, you could enter the corresponding folder and start the experiment using following commands:
10
11

```bash
Yan Ni's avatar
Yan Ni committed
12
13
14
15
16
17
nnictl create --config ./config.yml
```

## 2. Description of the example

### 2.1 classification
18

Yan Ni's avatar
Yan Ni committed
19
20
21
22
This example uses the dataset of digits, which is made up of 1797 8x8 images, and each image is a hand-written digit, the goal is to classify these images into 10 classes.  
In this example, we use SVC as the model, and choose some parameters of this model, including `"C", "keral", "degree", "gamma" and "coef0"`. For more information of these parameters, please [refer](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html).

### 2.2 regression
23

Yan Ni's avatar
Yan Ni committed
24
25
26
27
This example uses the Boston Housing Dataset, this dataset consists of price of houses in various places in Boston and the information such as Crime (CRIM), areas of non-retail business in the town (INDUS), the age of people who own the house (AGE) etc to predict the house price of boston.
In this example, we tune different kinds of regression models including `"LinearRegression", "SVR", "KNeighborsRegressor", "DecisionTreeRegressor"` and some parameters like `"svr_kernel", "knr_weights"`. You could get more details about these models from [here](https://scikit-learn.org/stable/supervised_learning.html#supervised-learning).

## 3. How to write sklearn code using nni
28

Yan Ni's avatar
Yan Ni committed
29
It is easy to use nni in your sklearn code, there are only a few steps.
30
31
32
33

* __step 1__

  Prepare a search_space.json to storage your choose spaces.
Yan Ni's avatar
Yan Ni committed
34
  For example, if you want to choose different models, you may try:
35
36

  ```json
Yan Ni's avatar
Yan Ni committed
37
38
39
40
  {
    "model_name":{"_type":"choice","_value":["LinearRegression", "SVR", "KNeighborsRegressor", "DecisionTreeRegressor"]}
  }
  ```
41

Yan Ni's avatar
Yan Ni committed
42
  If you want to choose different models and parameters, you could put them together in a search_space.json file.
43
44

  ```json
Yan Ni's avatar
Yan Ni committed
45
46
47
48
49
50
  {
    "model_name":{"_type":"choice","_value":["LinearRegression", "SVR", "KNeighborsRegressor", "DecisionTreeRegressor"]},
    "svr_kernel": {"_type":"choice","_value":["linear", "poly", "rbf"]},
    "knr_weights": {"_type":"choice","_value":["uniform", "distance"]}
  }
  ```
51

Yan Ni's avatar
Yan Ni committed
52
53
54
55
56
  Then you could read these values as a dict from your python code, please get into the step 2.
* __step 2__  
  At the beginning of your python code, you should `import nni` to insure the packages works normally.
  First, you should use `nni.get_next_parameter()` function to get your parameters given by nni. Then you could use these parameters to update your code.
  For example, if you define your search_space.json like following format:
57
58

  ```json
Yan Ni's avatar
Yan Ni committed
59
60
61
62
63
64
65
66
  {
    "C": {"_type":"uniform","_value":[0.1, 1]},
    "keral": {"_type":"choice","_value":["linear", "rbf", "poly", "sigmoid"]},
    "degree": {"_type":"choice","_value":[1, 2, 3, 4]},
    "gamma": {"_type":"uniform","_value":[0.01, 0.1]},
    "coef0 ": {"_type":"uniform","_value":[0.01, 0.1]}
  }
  ```
67

Yan Ni's avatar
Yan Ni committed
68
  You may get a parameter dict like this:
69
70

  ```python
Yan Ni's avatar
Yan Ni committed
71
72
73
74
75
76
77
78
  params = {
        'C': 1.0,
        'keral': 'linear',
        'degree': 3,
        'gamma': 0.01,
        'coef0': 0.01
  }
  ```
79

Yan Ni's avatar
Yan Ni committed
80
81
  Then you could use these variables to write your scikit-learn code.
* __step 3__  
82
  After you finished your training, you could get your own score of the model, like your percision, recall or MSE etc. NNI needs your score to tuner algorithms and generate next group of parameters, please report the score back to NNI and start next trial job.  
Yan Ni's avatar
Yan Ni committed
83
  You just need to use `nni.report_final_result(score)` to communitate with NNI after you process your scikit-learn code. Or if you have multiple scores in the steps of training, you could also report them back to NNI using `nni.report_intemediate_result(score)`. Note, you may not report intemediate result of your job, but you must report back your final result.