# QuickStart

## Installation

NNI currently supports Linux, macOS, and Windows. Ubuntu 16.04 or higher, macOS 10.14.1, and Windows 10.1809 are tested and supported. Run the following `pip install` command in an environment that has `python >= 3.5`.

**Linux and macOS**

```bash
    python3 -m pip install --upgrade nni
```

**Windows**

```bash
    python -m pip install --upgrade nni
```

Note:

* For Linux and macOS, `--user` can be added if you want to install NNI in your home directory; this does not require any special privileges.
* If there is an error like `Segmentation fault`, please refer to the [FAQ](FAQ.md).
* For the `system requirements` of NNI, please refer to [Install NNI on Linux&Mac](InstallationLinux.md) or [Windows](InstallationWin.md).

## "Hello World" example on MNIST

NNI is a toolkit that helps users run automated machine learning experiments. It automates the cycle of getting hyperparameters, running trials, testing results, and tuning hyperparameters. Here, we'll show how to use NNI to find the optimal hyperparameters for an MNIST model.

Here is an example script to train a CNN on the MNIST dataset **without NNI**:

```python
def run_trial(params):
    # Input data
    mnist = input_data.read_data_sets(params['data_dir'], one_hot=True)
    # Build network
    mnist_network = MnistNetwork(channel_1_num=params['channel_1_num'], channel_2_num=params['channel_2_num'], conv_size=params['conv_size'], hidden_size=params['hidden_size'], pool_size=params['pool_size'], learning_rate=params['learning_rate'])
    mnist_network.build_network()

    test_acc = 0.0
    with tf.Session() as sess:
        # Train network
        mnist_network.train(sess, mnist)
        # Evaluate network
        test_acc = mnist_network.evaluate(mnist)

if __name__ == '__main__':
    params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64, 'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
    run_trial(params)
```

Note: If you want to see the full implementation, please refer to [examples/trials/mnist-tfv1/mnist_before.py](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1/mnist_before.py).

The above code can only try one set of parameters at a time; if we want to tune the learning rate, we need to modify the hyperparameter by hand and restart the trial again and again.

NNI is designed to automate this kind of tuning job; its working process is outlined below:

```text
input: search space, trial code, config file
output: one optimal hyperparameter configuration

1: For t = 0, 1, 2, ..., maxTrialNum,
2:      hyperparameter = choose a set of parameters from the search space
3:      final result = run_trial_and_evaluate(hyperparameter)
4:      report final result to NNI
5:      If the time limit is reached,
6:          Stop the experiment
7: return the hyperparameter value with the best final result
```
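The loop above can be sketched in plain Python. This is only an illustration of the control flow, not NNI's implementation: random sampling stands in for a real tuner such as TPE, the search space is simplified to a dict of candidate lists, and `tune`, `max_duration_sec`, and the toy objective are hypothetical names.

```python
import random
import time


def tune(search_space, run_trial_and_evaluate, max_trial_num, max_duration_sec, seed=0):
    """Run up to max_trial_num trials and return (best_params, best_result).

    search_space maps each hyperparameter name to a list of candidate values
    (a simplification of NNI's JSON search space format).
    """
    rng = random.Random(seed)
    start = time.monotonic()
    best_params, best_result = None, float("-inf")
    for _ in range(max_trial_num):
        # Choose a set of parameters from the search space (random stand-in for a tuner).
        params = {name: rng.choice(values) for name, values in search_space.items()}
        # Run the trial and collect its final result.
        result = run_trial_and_evaluate(params)
        if result > best_result:
            best_params, best_result = params, result
        # Stop the experiment once the time limit is reached.
        if time.monotonic() - start >= max_duration_sec:
            break
    return best_params, best_result


if __name__ == "__main__":
    # Toy objective: the score is higher the closer the learning rate is to 0.01.
    space = {"learning_rate": [0.0001, 0.001, 0.01, 0.1]}
    best, score = tune(space, lambda p: -abs(p["learning_rate"] - 0.01),
                       max_trial_num=10, max_duration_sec=3600)
    print(best, score)
```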

If you want to use NNI to automatically train your model and find the optimal hyperparameters, you need to make three changes to your code:

**Three steps to start an experiment**

**Step 1**: Give a `Search Space` file in JSON, including the `name` and the `distribution` (discrete-valued or continuous-valued) of all the hyperparameters you need to search.

```diff
-   params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64,
-   'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
+ {
+     "dropout_rate":{"_type":"uniform","_value":[0.5, 0.9]},
+     "conv_size":{"_type":"choice","_value":[2,3,5,7]},
+     "hidden_size":{"_type":"choice","_value":[124, 512, 1024]},
+     "batch_size": {"_type":"choice", "_value": [1, 4, 8, 16, 32]},
+     "learning_rate":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]}
+ }
```
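To make the `_type` and `_value` fields concrete, here is a minimal sketch that draws one random hyperparameter set from the search space above. It assumes `choice` means "pick one of the listed values" and `uniform` means "draw a float between the two bounds". It is not NNI's sampler (a tuner such as TPE picks values adaptively), and `sample_parameters` is a hypothetical helper name.

```python
import json
import random

SEARCH_SPACE_JSON = """
{
    "dropout_rate": {"_type": "uniform", "_value": [0.5, 0.9]},
    "conv_size": {"_type": "choice", "_value": [2, 3, 5, 7]},
    "hidden_size": {"_type": "choice", "_value": [124, 512, 1024]},
    "batch_size": {"_type": "choice", "_value": [1, 4, 8, 16, 32]},
    "learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]}
}
"""


def sample_parameters(search_space, rng):
    """Draw one value per hyperparameter according to its _type."""
    params = {}
    for name, spec in search_space.items():
        if spec["_type"] == "choice":
            # Discrete: pick one of the listed candidates.
            params[name] = rng.choice(spec["_value"])
        elif spec["_type"] == "uniform":
            # Continuous: draw a float between the lower and upper bound.
            low, high = spec["_value"]
            params[name] = rng.uniform(low, high)
        else:
            raise ValueError("unsupported _type in this sketch: %r" % spec["_type"])
    return params


if __name__ == "__main__":
    space = json.loads(SEARCH_SPACE_JSON)
    print(sample_parameters(space, random.Random(0)))
```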

*Implemented code directory: [search_space.json](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1/search_space.json)*

**Step 2**: Modify your `Trial` file to get the hyperparameter set from NNI and report the final result to NNI.

```diff
+ import nni

  def run_trial(params):
      mnist = input_data.read_data_sets(params['data_dir'], one_hot=True)

      mnist_network = MnistNetwork(channel_1_num=params['channel_1_num'], channel_2_num=params['channel_2_num'], conv_size=params['conv_size'], hidden_size=params['hidden_size'], pool_size=params['pool_size'], learning_rate=params['learning_rate'])
      mnist_network.build_network()

      with tf.Session() as sess:
          mnist_network.train(sess, mnist)
          test_acc = mnist_network.evaluate(mnist)
+         nni.report_final_result(test_acc)

  if __name__ == '__main__':
-     params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64,
-     'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
+     params = nni.get_next_parameter()
      run_trial(params)
```
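Note that the Step 1 search space lists only five hyperparameters, while `run_trial` also reads keys such as `data_dir` and `channel_1_num`. One way to handle this, sketched below under the assumption that the tuner returns a plain dict of the searched values, is to start from defaults and overlay the tuned values. `DEFAULT_PARAMS`, `merge_params`, and the sample `tuned` dict are illustrative names, not part of NNI.

```python
# Default values taken from the original, untuned script.
DEFAULT_PARAMS = {
    'data_dir': '/tmp/tensorflow/mnist/input_data',
    'dropout_rate': 0.5,
    'channel_1_num': 32,
    'channel_2_num': 64,
    'conv_size': 5,
    'pool_size': 2,
    'hidden_size': 1024,
    'learning_rate': 1e-4,
    'batch_num': 2000,
    'batch_size': 32,
}


def merge_params(defaults, tuned):
    """Return a new dict: defaults overridden by the tuned values."""
    merged = dict(defaults)  # copy so the defaults stay untouched
    merged.update(tuned)
    return merged


# In a real trial, `tuned` would come from nni.get_next_parameter();
# a hand-written stand-in is used here so the sketch runs without NNI.
tuned = {'conv_size': 3, 'learning_rate': 0.001}
params = merge_params(DEFAULT_PARAMS, tuned)
print(params['conv_size'], params['data_dir'])
```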

*Implemented code directory: [mnist.py](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1/mnist.py)*

**Step 3**: Define a `config` file in YAML which declares the `path` to the search space and trial files. It also gives other information such as the tuning algorithm, max trial number, and max duration arguments.

```yaml
authorName: default
experimentName: example_mnist
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 10
trainingServicePlatform: local
# The path to Search Space
searchSpacePath: search_space.json
useAnnotation: false
tuner:
  builtinTunerName: TPE
# The path and the running command of trial
trial:
  command: python3 mnist.py
  codeDir: .
  gpuNum: 0
```

Note: **on Windows, you need to change the trial command from `python3` to `python`**.

*Implemented code directory: [config.yml](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1/config.yml)*

All the code above is already prepared and stored in [examples/trials/mnist-tfv1/](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1).

**Linux and macOS**

Run the **config.yml** file from your command line to start an MNIST experiment.

```bash
    nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
```

**Windows**

Run the **config_windows.yml** file from your command line to start an MNIST experiment.

Note: if you're using NNI on Windows, you need to change `python3` to `python` in the config.yml file or use the config_windows.yml file to start the experiment.

```bash
    nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
```

Note: `nnictl` is a command-line tool for controlling experiments, for example starting, stopping, or resuming an experiment and starting or stopping NNIBoard. See [here](Nnictl.md) for more on using `nnictl`.

Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has started successfully. The expected output looks like this:

```text
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
INFO: Successfully set local config!
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: [Your IP]:8080
-----------------------------------------------------------------------

You can use these commands to get more information about the experiment
-----------------------------------------------------------------------
         commands                       description
1. nnictl experiment show        show the information of experiments
2. nnictl trial ls               list all of trial jobs
3. nnictl top                    monitor the status of running experiments
4. nnictl log stderr             show stderr log content
5. nnictl log stdout             show stdout log content
6. nnictl stop                   stop an experiment
7. nnictl trial kill             kill a trial job by id
8. nnictl --help                 get help information about nnictl
-----------------------------------------------------------------------
```
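If you start experiments from a script, the startup text can be checked programmatically. The sketch below assumes the output has been captured as a string and that the lines keep the wording shown above; `parse_startup_output` is a hypothetical helper, not part of `nnictl`.

```python
import re

# A shortened copy of the startup text shown above.
SAMPLE_OUTPUT = """\
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: [Your IP]:8080
-----------------------------------------------------------------------
"""


def parse_startup_output(text):
    """Extract the success flag, experiment id, and Web UI address from the text."""
    id_match = re.search(r"^The experiment id is (\S+)\s*$", text, re.MULTILINE)
    url_match = re.search(r"^The Web UI urls are:\s*(.+?)\s*$", text, re.MULTILINE)
    return {
        "started": "Successfully started experiment!" in text,
        "experiment_id": id_match.group(1) if id_match else None,
        "web_ui": url_match.group(1) if url_match else None,
    }


if __name__ == "__main__":
    print(parse_startup_output(SAMPLE_OUTPUT))
```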

If you prepared the `trial`, `search space`, and `config` according to the steps above and successfully created an NNI job, NNI will automatically search for the optimal hyperparameters, running a different hyperparameter set in each trial according to the settings in your config. You can follow its progress in the NNI WebUI.

## WebUI

After you start your experiment in NNI successfully, you can find a message in the command-line interface that tells you the `Web UI url` like this:

```text
The Web UI urls are: [Your IP]:8080
```

Open the `Web UI url` (here, `[Your IP]:8080`) in your browser to view detailed information about the experiment and all the submitted trial jobs, as shown below. If you cannot open the WebUI link in your terminal, please refer to the [FAQ](FAQ.md).

### View summary page

Click the "Overview" tab.

The WebUI shows information about this experiment, including the experiment trial profile and the search space message. You can download this information and the parameters with the **Download** button, either at any time while the experiment is running or after it has finished.

![](../../img/QuickStart1.png)

The top 10 trials will be listed on the Overview page. You can browse all the trials on the "Trials Detail" page.

![](../../img/QuickStart2.png)

### View trials detail page

Click the "Default Metric" tab to see the point graph of all trials. Hover to see specific default metrics and search space messages.

![](../../img/QuickStart3.png)

Click the "Hyper Parameter" tab to see the parallel graph.

* You can select the percentage to see the top trials.
* Choose two axes to swap their positions.

![](../../img/QuickStart4.png)

Click the "Trial Duration" tab to see the bar graph.

![](../../img/QuickStart5.png)

Below is the status of all trials. Specifically:

* Trial detail: trial's id, duration, start time, end time, status, accuracy, and search space file.
* If you run on the OpenPAI platform, you can also see the hdfsLogPath.
* Kill: you can kill a job that has the `Running` status.
* Support: Used to search for a specific trial.

![](../../img/QuickStart6.png)

* Intermediate Result Graph

![](../../img/QuickStart7.png)

## Related Topic

* [Try different Tuners](../Tuner/BuiltinTuner.md)
* [Try different Assessors](../Assessor/BuiltinAssessor.md)
* [How to use command line tool nnictl](Nnictl.md)
* [How to write a trial](../TrialExample/Trials.md)
* [How to run an experiment on local (with multiple GPUs)?](../TrainingService/LocalMode.md)
* [How to run an experiment on multiple machines?](../TrainingService/RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](../TrainingService/PaiMode.md)
* [How to run an experiment on Kubernetes through Kubeflow?](../TrainingService/KubeflowMode.md)
* [How to run an experiment on Kubernetes through FrameworkController?](../TrainingService/FrameworkControllerMode.md)