# QuickStart

## Installation

NNI currently supports Linux, MacOS, and Windows (local mode). Ubuntu 16.04 or higher, MacOS 10.14.1, and Windows 10.1809 are tested and supported. Simply run the following `pip install` command in an environment that has `python >= 3.5`.
#### Linux and MacOS

```bash
    python3 -m pip install --upgrade nni
```

#### Windows
If you use Windows local mode and run scripts from PowerShell, you need to run the following PowerShell command as administrator the first time:
```bash
    Set-ExecutionPolicy -ExecutionPolicy Unrestricted
```
Then install nni through pip:
```bash
    python -m pip install --upgrade nni
```

Note:

* On Linux and MacOS, you can add `--user` to install NNI in your home directory, which does not require any special privileges.
* If you encounter an error such as `Segmentation fault`, please refer to the [FAQ](FAQ.md).
* For the system requirements of NNI, please refer to [Install NNI](Installation.md).

## "Hello World" example on MNIST

NNI is a toolkit that helps users run automated machine learning experiments. It automates the cycle of getting hyperparameters, running trials, testing results, and tuning hyperparameters. Here we show how to use NNI to find the optimal hyperparameters.

Here is an example script that trains a CNN on the MNIST dataset **without NNI**:

```python
# Excerpt: `tf`, `input_data` and `MnistNetwork` are imported or defined
# in the full example linked below.
def run_trial(params):
    # Input data
    mnist = input_data.read_data_sets(params['data_dir'], one_hot=True)
    # Build network
    mnist_network = MnistNetwork(
        channel_1_num=params['channel_1_num'],
        channel_2_num=params['channel_2_num'],
        conv_size=params['conv_size'],
        hidden_size=params['hidden_size'],
        pool_size=params['pool_size'],
        learning_rate=params['learning_rate'])
    mnist_network.build_network()

    test_acc = 0.0
    with tf.Session() as sess:
        # Train network
        mnist_network.train(sess, mnist)
        # Evaluate network
        test_acc = mnist_network.evaluate(mnist)

if __name__ == '__main__':
    params = {
        'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5,
        'channel_1_num': 32, 'channel_2_num': 64, 'conv_size': 5, 'pool_size': 2,
        'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
    run_trial(params)
```

Note: If you want to see the full implementation, please refer to [examples/trials/mnist/mnist_before.py](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist/mnist_before.py)

The code above can only try one set of parameters at a time. To tune the learning rate, we would have to modify the hyperparameter manually and restart the trial again and again.

NNI was built to help users with such tuning jobs. The NNI working process is presented below:

```pseudo
input: search space, trial code, config file
output: one optimal hyperparameter configuration

1: For t = 0, 1, 2, ..., maxTrialNum,
2:      hyperparameter = choose a set of parameters from the search space
3:      final result = run_trial_and_evaluate(hyperparameter)
4:      report final result to NNI
5:      If the upper time limit is reached,
6:          Stop the experiment
7: return the hyperparameter value with the best final result
```
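
The loop above can be sketched in plain Python. This is only an illustration: it uses random search as a stand-in for a tuner and is not how NNI implements tuning. The names `sample`, `run_experiment`, and `toy_trial` are hypothetical and introduced here for the example.

```python
import random
import time


def sample(search_space, rng):
    """Pick one value per hyperparameter (random search, standing in for a tuner)."""
    return {name: rng.choice(values) for name, values in search_space.items()}


def run_experiment(search_space, run_trial_and_evaluate, max_trial_num,
                   max_duration_s, seed=0):
    """Run up to max_trial_num trials and return the best (params, result)."""
    rng = random.Random(seed)
    deadline = time.monotonic() + max_duration_s
    best_params, best_result = None, float("-inf")
    for _ in range(max_trial_num):
        params = sample(search_space, rng)
        result = run_trial_and_evaluate(params)  # the trial's final result
        if result > best_result:
            best_params, best_result = params, result
        if time.monotonic() >= deadline:  # upper time limit reached
            break
    return best_params, best_result


# Toy objective (hypothetical): pretend accuracy peaks at learning_rate = 0.001.
def toy_trial(params):
    return 1.0 - abs(params["learning_rate"] - 0.001)


space = {"learning_rate": [0.0001, 0.001, 0.01, 0.1]}
best_params, best_result = run_experiment(
    space, toy_trial, max_trial_num=10, max_duration_s=3600)
print(best_params, best_result)
```

In a real experiment, NNI drives this loop for you; your trial code only receives one parameter set and reports one final result.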

To use NNI to train your model automatically and find the optimal hyperparameters, you need to make three changes to your code:

**Three steps required when using NNI**

**Step 1**: Provide a `Search Space` file in JSON that includes the `name` and the `distribution` (discrete-valued or continuous-valued) of every hyperparameter you want to search.

```diff
-   params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64,
-   'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
+ {
+     "dropout_rate":{"_type":"uniform","_value":[0.5, 0.9]},
+     "conv_size":{"_type":"choice","_value":[2,3,5,7]},
+     "hidden_size":{"_type":"choice","_value":[124, 512, 1024]},
+     "batch_size": {"_type":"choice", "_value": [1, 4, 8, 16, 32]},
+     "learning_rate":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]}
+ }
```

*Implemented code directory: [search_space.json](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist/search_space.json)*
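
To make the `_type`/`_value` format concrete, here is a sketch that draws one configuration from a search space like the one above. It handles only the two types used here, `choice` and `uniform`, and uses plain random sampling. `sample_config` is a hypothetical helper, not part of the NNI API, and a real tuner such as TPE chooses values differently.

```python
import json
import random

SEARCH_SPACE_JSON = """
{
    "dropout_rate": {"_type": "uniform", "_value": [0.5, 0.9]},
    "conv_size": {"_type": "choice", "_value": [2, 3, 5, 7]},
    "hidden_size": {"_type": "choice", "_value": [124, 512, 1024]},
    "batch_size": {"_type": "choice", "_value": [1, 4, 8, 16, 32]},
    "learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]}
}
"""


def sample_config(search_space, rng):
    """Draw one value for every hyperparameter in the search space."""
    config = {}
    for name, spec in search_space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "choice":
            # Discrete: pick one of the listed candidates.
            config[name] = rng.choice(value)
        elif kind == "uniform":
            # Continuous: draw from the [low, high] interval.
            low, high = value
            config[name] = rng.uniform(low, high)
        else:
            raise ValueError(f"unsupported _type: {kind}")
    return config


search_space = json.loads(SEARCH_SPACE_JSON)
config = sample_config(search_space, random.Random(0))
print(config)
```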

**Step 2**: Modify your `Trial` file to get the hyperparameter set from NNI and report the final result to NNI.

```diff
+ import nni

  def run_trial(params):
      mnist = input_data.read_data_sets(params['data_dir'], one_hot=True)

      mnist_network = MnistNetwork(channel_1_num=params['channel_1_num'], channel_2_num=params['channel_2_num'], conv_size=params['conv_size'], hidden_size=params['hidden_size'], pool_size=params['pool_size'], learning_rate=params['learning_rate'])
      mnist_network.build_network()

      with tf.Session() as sess:
          mnist_network.train(sess, mnist)
          test_acc = mnist_network.evaluate(mnist)
+         nni.report_final_result(test_acc)

  if __name__ == '__main__':
-     params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64,
-     'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
+     params = nni.get_next_parameter()
      run_trial(params)
```

*Implemented code directory: [mnist.py](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist/mnist.py)*
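
One detail worth noting: the search space in Step 1 omits keys such as `data_dir` and `channel_1_num` that `run_trial` still reads. A common way to handle this is to start from a dictionary of defaults and overlay the values received from the tuner. The sketch below assumes that pattern. `default_params` and `merge_params` are hypothetical names, and `tuner_params` is a hard-coded stand-in for what `nni.get_next_parameter()` would return inside a running experiment.

```python
def default_params():
    """Defaults for every key that run_trial reads (values from the script above)."""
    return {
        'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5,
        'channel_1_num': 32, 'channel_2_num': 64, 'conv_size': 5, 'pool_size': 2,
        'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000,
        'batch_size': 32,
    }


def merge_params(defaults, tuner_params):
    """Return a copy of defaults with tuner-provided values taking precedence."""
    merged = dict(defaults)
    merged.update(tuner_params)
    return merged


# Stand-in for nni.get_next_parameter(); these values are made up for illustration.
tuner_params = {'dropout_rate': 0.7, 'conv_size': 3, 'hidden_size': 512,
                'batch_size': 16, 'learning_rate': 0.001}

params = merge_params(default_params(), tuner_params)
print(params['data_dir'], params['learning_rate'])
```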

**Step 3**: Define a `config` file in YAML that declares the `path` to the search space and the trial, and gives `other information` such as the tuning algorithm, the maximum trial number, and the maximum duration.

```yaml
authorName: default
experimentName: example_mnist
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 10
trainingServicePlatform: local
# The path to Search Space
searchSpacePath: search_space.json
useAnnotation: false
tuner:
  builtinTunerName: TPE
# The path and the running command of trial
trial:  
  command: python3 mnist.py
  codeDir: .
  gpuNum: 0
```

Note: **on Windows, you need to change `python3` to `python` in the trial command.**

*Implemented code directory: [config.yml](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist/config.yml)*
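
If you generate the config from a script, the Windows note above can be handled programmatically. This is a minimal sketch that assumes only what the note states: the interpreter is `python` on Windows and `python3` elsewhere. `trial_command` is a hypothetical helper, not an NNI function.

```python
import sys


def trial_command(script, platform=sys.platform):
    """Build the trial command for the given platform string."""
    # sys.platform is "win32" on Windows; other platforms use python3 here.
    interpreter = "python" if platform.startswith("win") else "python3"
    return f"{interpreter} {script}"


print(trial_command("mnist.py"))
```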

All of the code above is already prepared and stored in [examples/trials/mnist/](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist).

#### Linux and MacOS
Run the **config.yml** file from your command line to start the MNIST experiment.

```bash
    nnictl create --config nni/examples/trials/mnist/config.yml
```
#### Windows
Run the **config_windows.yml** file from your command line to start the MNIST experiment.

**Note**: if you're using Windows local mode, either change `python3` to `python` in the config.yml file, or use the config_windows.yml file to start the experiment.

```bash
    nnictl create --config nni/examples/trials/mnist/config_windows.yml
```

Note: **nnictl** is a command line tool for controlling experiments, for example to start/stop/resume an experiment or to start/stop NNIBoard. Click [here](Nnictl.md) for more usage of `nnictl`.

Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has started successfully. This is the output you should expect:

```text
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
INFO: Successfully set local config!
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: [Your IP]:8080
-----------------------------------------------------------------------

You can use these commands to get more information about the experiment
-----------------------------------------------------------------------
         commands                       description
1. nnictl experiment show        show the information of experiments
2. nnictl trial ls               list all of trial jobs
3. nnictl top                    monitor the status of running experiments
4. nnictl log stderr             show stderr log content
5. nnictl log stdout             show stdout log content
6. nnictl stop                   stop an experiment
7. nnictl trial kill             kill a trial job by id
8. nnictl --help                 get help information about nnictl
-----------------------------------------------------------------------
```
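
If you start experiments from a script, the experiment id and Web UI address can be pulled out of this output. The sketch below assumes the exact wording shown above (`The experiment id is ...`, `The Web UI urls are: ...`), which may differ between NNI versions. `parse_startup_output` is a hypothetical helper.

```python
import re

# A shortened copy of the output shown above.
SAMPLE_OUTPUT = """\
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: [Your IP]:8080
-----------------------------------------------------------------------
"""


def parse_startup_output(text):
    """Extract the start status, experiment id, and Web UI address from the output."""
    id_match = re.search(r"The experiment id is (\S+)", text)
    url_match = re.search(r"The Web UI urls are:\s*(.+)", text)
    return {
        "started": "Successfully started experiment!" in text,
        "experiment_id": id_match.group(1) if id_match else None,
        "web_ui": url_match.group(1).strip() if url_match else None,
    }


info = parse_startup_output(SAMPLE_OUTPUT)
print(info)
```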

If you prepared the `trial`, `search space`, and `config` according to the steps above and successfully created an NNI job, NNI will automatically tune the hyperparameters, running a different hyperparameter set for each trial according to the requirements you set. You can follow its progress in the NNI WebUI.

## WebUI

After you start your experiment successfully, the command-line interface prints a message with the `Web UI url`, like this:

```text
The Web UI urls are: [Your IP]:8080
```

Open the `Web UI url` (here: `[Your IP]:8080`) in your browser to view detailed information about the experiment and all submitted trial jobs, as shown below. If you cannot open the WebUI link, refer to the [FAQ](FAQ.md).

#### View summary page

Click the tab "Overview".

Information about this experiment is shown in the WebUI, including the experiment trial profile and the search space. NNI also supports downloading this information and the parameters through the **Download** button. You can download the experiment results at any time, either while the experiment is running or after it has finished.

![](../img/QuickStart1.png)

The top 10 trials are listed on the Overview page; you can browse all trials on the "Trials Detail" page.

![](../img/QuickStart2.png)

#### View trials detail page

Click the tab "Default Metric" to see the point graph of all trials. Hover over a point to see that trial's default metric and search space information.

![](../img/QuickStart3.png)

Click the tab "Hyper Parameter" to see the parallel graph.

* Select a percentage to see the top trials.
* Choose two axes to swap their positions.
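
As a rough idea of what a "top percentage" filter does, here is a sketch that ranks trials by their default metric and keeps the best share. It assumes a higher metric is better and rounds the count up. The trial records are made-up illustration data, and `top_percent` is a hypothetical helper rather than WebUI code.

```python
import math


def top_percent(trials, percent):
    """Return the best `percent` of trials, ranked by metric (higher is better)."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ranked = sorted(trials, key=lambda t: t["metric"], reverse=True)
    keep = max(1, math.ceil(len(ranked) * percent / 100))
    return ranked[:keep]


# Hypothetical trial records, for illustration only.
trials = [
    {"id": "a", "metric": 0.91},
    {"id": "b", "metric": 0.97},
    {"id": "c", "metric": 0.88},
    {"id": "d", "metric": 0.95},
    {"id": "e", "metric": 0.93},
]

best = top_percent(trials, 40)
print([t["id"] for t in best])
```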

![](../img/QuickStart4.png)

Click the tab "Trial Duration" to see the bar graph.

![](../img/QuickStart5.png)

Below is the status of all trials. Specifically:

* Trial detail: the trial's id, duration, start time, end time, status, accuracy, and search space file.
* If you run on the OpenPAI platform, you can also see the hdfsLogPath.
* Kill: you can kill a job whose status is running.
* Search: you can search for a specific trial.

![](../img/QuickStart6.png)

* Intermediate Result Graph

![](../img/QuickStart7.png)

## Related Topic

* [Try different Tuners](BuiltinTuner.md)
* [Try different Assessors](BuiltinAssessors.md)
* [How to use command line tool nnictl](Nnictl.md)
* [How to write a trial](Trials.md)
* [How to run an experiment on local (with multiple GPUs)?](LocalMode.md)
* [How to run an experiment on multiple machines?](RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](PaiMode.md)
* [How to run an experiment on Kubernetes through Kubeflow?](KubeflowMode.md)
* [How to run an experiment on Kubernetes through FrameworkController?](FrameworkControllerMode.md)