QuickStart
==========

Installation
------------

Currently, NNI supports running on Linux, macOS and Windows. Ubuntu 16.04 or higher, macOS 10.14.1, and Windows 10.1809 are tested and supported. Simply run the following ``pip install`` in an environment that has ``python >= 3.6``.

Linux and macOS
^^^^^^^^^^^^^^^

.. code-block:: bash

   python3 -m pip install --upgrade nni

Windows
^^^^^^^

.. code-block:: bash

   python -m pip install --upgrade nni

.. Note:: For Linux and macOS, ``--user`` can be added if you want to install NNI in your home directory, which does not require any special privileges.

.. Note:: If there is an error like ``Segmentation fault``, please refer to the :doc:`FAQ <FAQ>`.

.. Note:: For the system requirements of NNI, please refer to :doc:`Install NNI on Linux & Mac <InstallationLinux>` or :doc:`Windows <InstallationWin>`. If you want to use docker, refer to :doc:`HowToUseDocker <HowToUseDocker>`.
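
To quickly check that the installation succeeded, you can confirm that the package and the ``nnictl`` command line tool are available (use ``python`` instead of ``python3`` on Windows); this is only a sanity check and does not start anything:

.. code-block:: bash

   python3 -m pip show nni   # confirm the package is installed
   nnictl --help             # confirm the command line tool is available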


"Hello World" example on MNIST
------------------------------

NNI is a toolkit to help users run automated machine learning experiments. It can automate the cyclic process of getting hyperparameters, running trials, testing results, and tuning hyperparameters. Here, we'll show how to use NNI to help you find the optimal hyperparameters on the MNIST dataset.

Here is an example script to train a CNN on the MNIST dataset **without NNI**:

.. code-block:: python

    import torch
    import torch.optim as optim
    from torchvision import datasets

    # Note: the Net model class, the train() and test() helpers, and the target
    # device are defined in the full example script and omitted here for brevity.
    def main(args):
        # load data
        train_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=args['batch_size'], shuffle=True)
        test_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=1000, shuffle=True)
        # build model
        model = Net(hidden_size=args['hidden_size'])
        optimizer = optim.SGD(model.parameters(), lr=args['lr'], momentum=args['momentum'])
        # train
        for epoch in range(10):
            train(args, model, device, train_loader, optimizer, epoch)
            test_acc = test(args, model, device, test_loader)
            print(test_acc)
        print('final accuracy:', test_acc)
         
    if __name__ == '__main__':
        params = {
            'batch_size': 32,
            'hidden_size': 128,
            'lr': 0.001,
            'momentum': 0.5
        }
        main(params)

The above code can only try one set of parameters at a time; if you want to tune the learning rate, you have to manually modify the hyperparameter and restart the trial each time.

NNI was created to help users tune such jobs; its working process is presented below:

.. code-block:: text

   input: search space, trial code, config file
   output: one optimal hyperparameter configuration

   1: For t = 0, 1, 2, ..., maxTrialNum,
   2:      hyperparameter = choose a set of parameters from the search space
   3:      final result = run_trial_and_evaluate(hyperparameter)
   4:      report final result to NNI
   5:      If the upper time limit is reached,
   6:          Stop the experiment
   7: return the hyperparameter value with the best final result

.. note::

   If you want to use NNI to automatically train your model and find the optimal hyper-parameters, there are two approaches:

   1. Write a config file and start the experiment from the command line.
   2. Configure and launch the experiment directly from a Python file.

   In this part, we will focus on the first approach. For the second approach, please refer to `this tutorial <HowToLaunchFromPython.rst>`__; a minimal sketch of it is also shown below.
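
For reference, a minimal sketch of the second approach could look like the following. It assumes NNI's ``nni.experiment`` Python API; field names may differ slightly between NNI versions, so please follow the linked tutorial for the authoritative version.

.. code-block:: python

   # Launch an experiment directly from Python (sketch; see HowToLaunchFromPython.rst).
   from nni.experiment import Experiment

   search_space = {
       'batch_size': {'_type': 'choice', '_value': [16, 32, 64, 128]},
       'hidden_size': {'_type': 'choice', '_value': [128, 256, 512, 1024]},
       'lr': {'_type': 'choice', '_value': [0.0001, 0.001, 0.01, 0.1]},
       'momentum': {'_type': 'uniform', '_value': [0, 1]},
   }

   experiment = Experiment('local')                      # use the local training service
   experiment.config.trial_command = 'python3 mnist.py'  # command that runs one trial
   experiment.config.trial_code_directory = '.'          # directory containing mnist.py
   experiment.config.search_space = search_space
   experiment.config.tuner.name = 'TPE'
   experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
   experiment.config.max_trial_number = 10
   experiment.config.trial_concurrency = 2
   experiment.run(8080)                                  # start and serve the WebUI on port 8080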


Step 1: Modify the ``Trial`` Code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Modify your ``Trial`` file to get the hyperparameter set from NNI and report the final result back to NNI.

.. code-block:: diff

    + import nni

      def main(args):
          # load data
          train_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=args['batch_size'], shuffle=True)
          test_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=1000, shuffle=True)
          # build model
          model = Net(hidden_size=args['hidden_size'])
          optimizer = optim.SGD(model.parameters(), lr=args['lr'], momentum=args['momentum'])
          # train
          for epoch in range(10):
              train(args, model, device, train_loader, optimizer, epoch)
              test_acc = test(args, model, device, test_loader)
    -         print(test_acc)
    +         nni.report_intermediate_result(test_acc)
    -     print('final accuracy:', test_acc)
    +     nni.report_final_result(test_acc)
           
      if __name__ == '__main__':
    -     params = {'batch_size': 32, 'hidden_size': 128, 'lr': 0.001, 'momentum': 0.5}
    +     params = nni.get_next_parameter()
          main(params)

*Example:* :githublink:`mnist.py <examples/trials/mnist-pytorch/mnist.py>`


Step 2: Define the Search Space
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Define a ``Search Space`` in a YAML file, including the ``name`` and the ``distribution`` (discrete-valued or continuous-valued) of all the hyperparameters you want to search.

.. code-block:: yaml

   searchSpace:
      batch_size:
         _type: choice
         _value: [16, 32, 64, 128]
      hidden_size:
         _type: choice
         _value: [128, 256, 512, 1024]
      lr:
         _type: choice
         _value: [0.0001, 0.001, 0.01, 0.1]
      momentum:
         _type: uniform
         _value: [0, 1]

*Example:* :githublink:`config_detailed.yml <examples/trials/mnist-pytorch/config_detailed.yml>`

You can also write your search space in a JSON file and specify the file path in the configuration. For a detailed tutorial on how to write the search space, please see `here <SearchSpaceSpec.rst>`__.
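
For illustration, the same search space written as a standalone JSON file (assumed here to be named ``search_space.json`` and referenced from the config, e.g. via the ``searchSpaceFile`` field; see the linked tutorial for details) would look like this:

.. code-block:: json

   {
       "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
       "hidden_size": {"_type": "choice", "_value": [128, 256, 512, 1024]},
       "lr": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]},
       "momentum": {"_type": "uniform", "_value": [0, 1]}
   }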


Step 3: Config the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In addition to the search space defined in `Step 2 <step-2-define-the-search-space>`__, you need to configure the experiment in the YAML file. The configuration specifies the key information of the experiment, such as the trial files, the tuning algorithm, the maximum number of trials, and the maximum duration.

.. code-block:: yaml

   experimentName: MNIST               # An optional name to distinguish the experiments
   trialCommand: python3 mnist.py      # NOTE: change "python3" to "python" if you are using Windows
   trialConcurrency: 2                 # Run 2 trials concurrently
   maxTrialNumber: 10                  # Generate at most 10 trials
   maxExperimentDuration: 1h           # Stop generating trials after 1 hour
   tuner:                              # Configure the tuning algorithm
      name: TPE
      classArgs:                       # Algorithm specific arguments
         optimize_mode: maximize
   trainingService:                    # Configure the training platform
      platform: local

The experiment config reference can be found `here <../reference/experiment_config.rst>`__.

.. _nniignore:

.. Note:: If you plan to use remote machines or clusters as your training service, NNI limits the number of files to 2000 and the total size to 300 MB to avoid putting too much pressure on the network. If your codeDir contains too many files, you can choose which files and subfolders should be excluded by adding a ``.nniignore`` file that works like a ``.gitignore`` file. For more details on how to write this file, see the `git documentation <https://git-scm.com/docs/gitignore#_pattern_format>`__.
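
For illustration, a ``.nniignore`` that excludes typical large artifacts might look like the following (the patterns below are only examples; adjust them to your own codeDir):

.. code-block:: text

   # exclude datasets, checkpoints and logs from the upload
   data/
   checkpoints/
   logs/
   *.pth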

*Example:* :githublink:`config_detailed.yml <examples/trials/mnist-pytorch/config_detailed.yml>` and :githublink:`.nniignore <examples/trials/mnist-pytorch/.nniignore>`

All the code above is already prepared and stored in :githublink:`examples/trials/mnist-pytorch <examples/trials/mnist-pytorch>`.
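
If you have not obtained these example files yet, one way (assuming ``git`` is available) is to clone the NNI repository; the launch commands below assume it lives in a local ``nni`` directory:

.. code-block:: bash

   git clone https://github.com/microsoft/nni.git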


Step 4: Launch the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Linux and macOS
***************

Start the experiment from your command line using the **config_detailed.yml** file.

.. code-block:: bash

   nnictl create --config nni/examples/trials/mnist-pytorch/config_detailed.yml

Windows
*******

Change ``python3`` to ``python`` in the ``trialCommand`` field of the **config_detailed.yml** file, then start the experiment from your command line.

.. code-block:: bash

   nnictl create --config nni\examples\trials\mnist-pytorch\config_detailed.yml

.. Note:: ``nnictl`` is a command line tool that can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc. Click :doc:`here <../reference/nnictl>` for more usage of ``nnictl``.
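
For example, once the experiment is running you can stop it and later resume it by its ID (the ID is printed when the experiment starts; ``<experiment_id>`` below is a placeholder):

.. code-block:: bash

   nnictl stop <experiment_id>      # stop the experiment
   nnictl resume <experiment_id>    # resume a stopped experiment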

Wait for the message ``INFO: Successfully started experiment!`` in the command line. This message indicates that your experiment has been started successfully. Here is what you can expect to see:

.. code-block:: text

   INFO: Starting restful server...
   INFO: Successfully started Restful server!
   INFO: Setting local config...
   INFO: Successfully set local config!
   INFO: Starting experiment...
   INFO: Successfully started experiment!
   -----------------------------------------------------------------------
   The experiment id is egchD4qy
   The Web UI urls are: [Your IP]:8080
   -----------------------------------------------------------------------

   You can use these commands to get more information about the experiment
   -----------------------------------------------------------------------
            commands                       description
   1. nnictl experiment show        show the information of experiments
   2. nnictl trial ls               list all of trial jobs
   3. nnictl top                    monitor the status of running experiments
   4. nnictl log stderr             show stderr log content
   5. nnictl log stdout             show stdout log content
   6. nnictl stop                   stop an experiment
   7. nnictl trial kill             kill a trial job by id
   8. nnictl --help                 get help information about nnictl
   -----------------------------------------------------------------------

If you prepared the ``trial``\ , ``search space``\ , and ``config`` according to the steps above and successfully created an NNI job, NNI will automatically tune the hyper-parameters by running a different hyper-parameter set for each trial according to the defined search space. You can clearly see its progress through the WebUI.

Step 5: View the Experiment
^^^^^^^^^^^^^^^^^^^^^^^^^^^

After starting the experiment successfully, you can find a message in the command-line interface that tells you the ``Web UI url`` like this:

.. code-block:: text

   The Web UI urls are: [Your IP]:8080

Open the ``Web UI url`` (here it is ``[Your IP]:8080``\ ) in your browser to view detailed information about the experiment and all the submitted trial jobs, as shown below. If you cannot open the WebUI link in your terminal, please refer to the `FAQ <FAQ.rst#could-not-open-webui-link>`__.


View Overview Page
******************

Information about this experiment is shown in the WebUI, including the experiment profile and the search space. NNI also supports downloading this information and the parameters through the **Experiment summary** button.

.. image:: ../../img/webui-img/full-oview.png
   :target: ../../img/webui-img/full-oview.png
   :alt: overview


View Trials Detail Page
***********************

You can see the best trial metrics and the hyper-parameter graph on this page. The table includes more columns when you click the ``Add/Remove columns`` button.

.. image:: ../../img/webui-img/full-detail.png
   :target: ../../img/webui-img/full-detail.png
   :alt: detail


View Experiments Management Page
********************************

On the ``All experiments`` page, you can see all the experiments on your machine. 

.. image:: ../../img/webui-img/managerExperimentList/expList.png
   :target: ../../img/webui-img/managerExperimentList/expList.png
   :alt: Experiments list

For more detailed usage of WebUI, please refer to `this doc <./WebUI.rst>`__.


Related Topics
--------------

* `How to debug? <HowToDebug.rst>`__
* `How to write a trial? <../TrialExample/Trials.rst>`__
* `How to try different Tuners? <../Tuner/BuiltinTuner.rst>`__
* `How to try different Assessors? <../Assessor/BuiltinAssessor.rst>`__
* `How to run an experiment on the different training platforms? <../training_services.rst>`__
* `How to use Annotation? <AnnotationSpec.rst>`__
* `How to use the command line tool nnictl? <Nnictl.rst>`__
* `How to launch Tensorboard on WebUI? <Tensorboard.rst>`__