"tests/vscode:/vscode.git/clone" did not exist on "f717d47fe0c64aa045b62f1298a028d246b16172"
Unverified Commit 1d845267 authored by SparkSnail's avatar SparkSnail Committed by GitHub
Browse files

Merge pull request #126 from Microsoft/master

merge master
parents 40391ec2 a6c2ac2b
@@ -50,7 +50,7 @@ Compared with LocalMode and [RemoteMachineMode](RemoteMachineMode.md), trial con
* Required key. Should be a positive number based on your trial program's memory requirement.
* image
* Required key. In pai mode, your trial program will be scheduled by OpenPAI to run in a [Docker container](https://www.docker.com/). This key is used to specify the Docker image used to create the container in which your trial will run.
* We have already built a Docker image [msranni/nni](https://hub.docker.com/r/msranni/nni/) on [Docker Hub](https://hub.docker.com/). It contains the NNI Python packages, Node modules and JavaScript artifact files required to start an experiment, as well as all of NNI's dependencies. The Dockerfile used to build this image can be found [here](https://github.com/Microsoft/nni/tree/master/deployment/Dockerfile.build.base). You can either use this image directly in your config file, or build your own image based on it.
* We have already built a Docker image [msranni/nni](https://hub.docker.com/r/msranni/nni/) on [Docker Hub](https://hub.docker.com/). It contains the NNI Python packages, Node modules and JavaScript artifact files required to start an experiment, as well as all of NNI's dependencies. The Dockerfile used to build this image can be found [here](https://github.com/Microsoft/nni/tree/master/deployment/docker/Dockerfile). You can either use this image directly in your config file, or build your own image based on it.
* dataDir
* Optional key. It specifies the HDFS data directory for the trial to download data from. The format should be something like hdfs://{your HDFS host}:9000/{your data directory}
* outputDir
@@ -62,20 +62,20 @@ nnictl create --config exp_pai.yml
```
to start the experiment in pai mode. NNI will create an OpenPAI job for each trial, and the job name follows the format `nni_exp_{experiment_id}_trial_{trial_id}`.
You can see jobs created by NNI in the OpenPAI cluster's web portal, like:
![](./img/nni_pai_joblist.jpg)
![](../img/nni_pai_joblist.jpg)
Notice: In pai mode, NNIManager will start a REST server and listen on a port that is your NNI WebUI's port plus 1. For example, if your WebUI port is `8080`, the REST server will listen on `8081` to receive metrics from the trial jobs running on OpenPAI. So you should open TCP port `8081` in your firewall rules to allow incoming traffic.
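As a minimal sketch, the port can be opened with a common Linux firewall tool (this assumes the default WebUI port `8080`, hence REST port `8081`; adjust the port to your setup):
```bash
# ufw (Debian/Ubuntu)
sudo ufw allow 8081/tcp

# firewalld (RHEL/CentOS)
sudo firewall-cmd --permanent --add-port=8081/tcp
sudo firewall-cmd --reload
```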
Once a trial job is completed, you can go to the NNI WebUI's overview page (e.g. http://localhost:8080/oview) to check the trial's information.
Expand a trial's information in the trial list view and click the logPath link:
![](./img/nni_webui_joblist.jpg)
![](../img/nni_webui_joblist.jpg)
You will be redirected to the HDFS web portal, where you can browse the output files of that trial in HDFS:
![](./img/nni_trial_hdfs_output.jpg)
![](../img/nni_trial_hdfs_output.jpg)
You can see there are three files in the output folder: stderr, stdout, and trial.log.
If you also want to save the trial's other output to HDFS, such as model files, you can use the environment variable `NNI_OUTPUT_DIR` in your trial code to save your own output files, and the NNI SDK will copy all files in `NNI_OUTPUT_DIR` from the trial's container to HDFS.
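A minimal sketch of that idea (the fallback to the current directory is only for running the script outside of NNI and is an assumption, not documented behaviour):
```python
import os

# NNI exports NNI_OUTPUT_DIR inside the trial container; anything written here
# is copied to HDFS together with stdout, stderr and trial.log.
output_dir = os.environ.get('NNI_OUTPUT_DIR', '.')
with open(os.path.join(output_dir, 'my_artifact.txt'), 'w') as f:
    f.write('example artifact saved by the trial\n')
```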
Any problems when using NNI in pai mode, plesae create issues on [NNI github repo](https://github.com/Microsoft/nni).
If you have any problems when using NNI in pai mode, please create an issue on the [NNI github repo](https://github.com/Microsoft/nni).
@@ -183,28 +183,28 @@ Click the tab "Overview".
Information about this experiment will be shown in the WebUI, including the experiment trial profile and search space message. NNI also supports downloading this information and the parameters through the **Download** button. You can download the experiment results at any time while it is running or after execution has finished.
![](./img/QuickStart1.png)
![](../img/QuickStart1.png)
The top 10 trials will be listed on the Overview page; you can browse all trials on the "Trials Detail" page.
![](./img/QuickStart2.png)
![](../img/QuickStart2.png)
#### View trials detail page
Click the tab "Default Metric" to see the point graph of all trials. Hover to see its specific default metric and search space message.
![](./img/QuickStart3.png)
![](../img/QuickStart3.png)
Click the tab "Hyper Parameter" to see the parallel graph.
* You can select the percentage to see top trials.
* Choose two axis to swap its positions
![](./img/QuickStart4.png)
![](../img/QuickStart4.png)
Click the tab "Trial Duration" to see the bar graph.
![](./img/QuickStart5.png)
![](../img/QuickStart5.png)
Below is the status of all the trials. Specifically:
@@ -213,11 +213,11 @@ Below is the status of the all trials. Specifically:
* Kill: you can kill a job whose status is running.
* You can search for a specific trial.
![](./img/QuickStart6.png)
![](../img/QuickStart6.png)
* Intermediate Result Graph
![](./img/QuickStart7.png)
![](../img/QuickStart7.png)
## Related Topic
@@ -61,7 +61,7 @@ searchSpacePath: /path/to/your/search_space.json
You can refer to [here](ExperimentConfig.md) for more information about how to set up experiment configurations.
*Please refer to [here](sdk_reference.md) for more APIs (e.g., `nni.get_sequence_id()`) provided by NNI.
*Please refer to [here](https://nni.readthedocs.io/en/latest/sdk_reference.html) for more APIs (e.g., `nni.get_sequence_id()`) provided by NNI.
<a name="nni-annotation"></a>
@@ -7,16 +7,16 @@ Click the tab "Overview".
* See the experiment trial profile and search space message.
* Support for downloading the experiment result.
![](./img/webui-img/over1.png)
![](../img/webui-img/over1.png)
* See the best-performing trials.
![](./img/webui-img/over2.png)
![](../img/webui-img/over2.png)
## View job default metric
Click the tab "Default Metric" to see the point graph of all trials. Hover to see its specific default metric and search space message.
![](./img/accuracy.png)
![](../img/accuracy.png)
## View hyper parameter
@@ -25,13 +25,13 @@ Click the tab "Hyper Parameter" to see the parallel graph.
* You can select a percentage to see only the top trials.
* Choose two axes to swap their positions.
![](./img/hyperPara.png)
![](../img/hyperPara.png)
## View Trial Duration
Click the tab "Trial Duration" to see the bar graph.
![](./img/trial_duration.png)
![](../img/trial_duration.png)
## View trials status
@@ -39,15 +39,15 @@ Click the tab "Trials Detail" to see the status of the all trials. Specifically:
* Trial detail: trial's id, trial's duration, start time, end time, status, accuracy and search space file.
![](./img/webui-img/detail-local.png)
![](../img/webui-img/detail-local.png)
* If you run on the OpenPAI or Kubeflow platform, you can also see the hdfsLog.
![](./img/webui-img/detail-pai.png)
![](../img/webui-img/detail-pai.png)
* Kill: you can kill a job whose status is running.
* You can search for a specific trial.
* Intermediate Result Graph.
![](./img/intermediate.png)
![](../img/intermediate.png)
@@ -8,7 +8,7 @@ Here is an experimental result of MNIST after using 'Curvefitting' Assessor in '
*Implemented code directory: config_assessor.yml <https://github.com/Microsoft/nni/blob/master/examples/trials/mnist/config_assessor.yml>*
.. image:: ./img/Assessor.png
.. image:: ../img/Assessor.png
Like Tuners, users can either use built-in Assessors or customize an Assessor on their own. Please refer to the following tutorials for details:
@@ -44,6 +44,7 @@ extensions = [
    'sphinx.ext.mathjax',
    'sphinx_markdown_tables',
    'sphinxarg.ext',
    'sphinx.ext.napoleon',
]
# Add any paths that contain templates here, relative to this directory.
@@ -107,7 +108,7 @@ html_theme_options = {
#
# html_sidebars = {}
html_logo = './img/nni_logo_dark.png'
html_logo = '../img/nni_logo_dark.png'
# -- Options for HTMLHelp output ---------------------------------------------
......
# GBDT in nni
Gradient boosting is a machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion as other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function.
Gradient boosting decision trees (GBDT) have many popular implementations, such as [lightgbm](https://github.com/Microsoft/LightGBM), [xgboost](https://github.com/dmlc/xgboost), and [catboost](https://github.com/catboost/catboost). GBDT is a great tool for solving traditional machine learning problems, and because it is a robust algorithm it can be used in many domains. The better the hyper-parameters, the better the performance you can achieve.
NNI is a great platform for tuning hyper-parameters: you can try the various built-in search algorithms in NNI and run multiple trials concurrently.
## 1. Search Space in GBDT
There are many hyper-parameters in GBDT, but which of them affect performance or speed? Based on practical experience, here are some suggestions (taking lightgbm as an example); a sample search space built from these ranges is sketched after the reference links below:
> * For better accuracy
  * `learning_rate`. The range of `learning_rate` could be [0.001, 0.9].
  * `num_leaves`. `num_leaves` is related to `max_depth`; you don't have to tune both of them.
  * `bagging_freq`. `bagging_freq` could be [1, 2, 4, 8, 10].
  * `num_iterations`. May be larger if underfitting.
> * For faster speed
  * `bagging_fraction`. The range of `bagging_fraction` could be [0.7, 1.0].
  * `feature_fraction`. The range of `feature_fraction` could be [0.6, 1.0].
  * `max_bin`.
> * To avoid overfitting
  * `min_data_in_leaf`. This depends on your dataset.
  * `min_sum_hessian_in_leaf`. This depends on your dataset.
  * `lambda_l1` and `lambda_l2`.
  * `min_gain_to_split`.
  * `num_leaves`.
Reference links:
[lightgbm](https://lightgbm.readthedocs.io/en/latest/Parameters-Tuning.html) and [autoxgboost](https://github.com/ja-thomas/autoxgboost/blob/master/poster_2018.pdf)
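As a rough illustration, a few of the ranges above could be written in NNI's search space JSON format like this (the concrete choices are only examples drawn from the suggestions, not tuned recommendations):
```json
{
    "learning_rate": {"_type": "loguniform", "_value": [0.001, 0.9]},
    "num_leaves": {"_type": "choice", "_value": [20, 31, 63, 127]},
    "bagging_fraction": {"_type": "uniform", "_value": [0.7, 1.0]},
    "feature_fraction": {"_type": "uniform", "_value": [0.6, 1.0]},
    "bagging_freq": {"_type": "choice", "_value": [1, 2, 4, 8, 10]}
}
```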
## 2. Task description
Now we come back to our example "auto-gbdt", which runs with lightgbm and NNI. The data includes [train data](https://github.com/Microsoft/nni/blob/master/examples/trials/auto-gbdt/data/regression.train) and [test data](https://github.com/Microsoft/nni/blob/master/examples/trials/auto-gbdt/data/regression.test).
Given the features and labels in the train data, we train a GBDT regression model and use it to make predictions.
## 3. How to run in nni
### 3.1 Prepare your trial code
You need to prepare basic trial code like the following:
```python
import lightgbm as lgb
from sklearn.metrics import mean_squared_error

...

def get_default_parameters():
    ...
    return params

def load_data(train_path='./data/regression.train', test_path='./data/regression.test'):
    '''
    Load or create dataset
    '''
    ...
    return lgb_train, lgb_eval, X_test, y_test

def run(lgb_train, lgb_eval, params, X_test, y_test):
    # train
    gbm = lgb.train(params,
                    lgb_train,
                    num_boost_round=20,
                    valid_sets=lgb_eval,
                    early_stopping_rounds=5)
    # predict
    y_pred = gbm.predict(X_test, num_iteration=gbm.best_iteration)
    # eval
    rmse = mean_squared_error(y_test, y_pred) ** 0.5
    print('The rmse of prediction is:', rmse)

if __name__ == '__main__':
    lgb_train, lgb_eval, X_test, y_test = load_data()
    PARAMS = get_default_parameters()
    # train
    run(lgb_train, lgb_eval, PARAMS, X_test, y_test)
```
### 3.2 Prepare your search space.
If you want to tune `num_leaves`, `learning_rate`, `bagging_fraction` and `bagging_freq`, you could write a [search_space.json](https://github.com/Microsoft/nni/blob/master/examples/trials/auto-gbdt/search_space.json) as follows:
```json
{
"num_leaves":{"_type":"choice","_value":[31, 28, 24, 20]},
"learning_rate":{"_type":"choice","_value":[0.01, 0.05, 0.1, 0.2]},
"bagging_fraction":{"_type":"uniform","_value":[0.7, 1.0]},
"bagging_freq":{"_type":"choice","_value":[1, 2, 4, 8, 10]}
}
```
More supported variable types can be found [here](SearchSpaceSpec.md).
### 3.3 Add the NNI SDK to your code.
```diff
+import nni
import lightgbm as lgb
from sklearn.metrics import mean_squared_error

...

def get_default_parameters():
    ...
    return params

def load_data(train_path='./data/regression.train', test_path='./data/regression.test'):
    '''
    Load or create dataset
    '''
    ...
    return lgb_train, lgb_eval, X_test, y_test

def run(lgb_train, lgb_eval, params, X_test, y_test):
    # train
    gbm = lgb.train(params,
                    lgb_train,
                    num_boost_round=20,
                    valid_sets=lgb_eval,
                    early_stopping_rounds=5)
    # predict
    y_pred = gbm.predict(X_test, num_iteration=gbm.best_iteration)
    # eval
    rmse = mean_squared_error(y_test, y_pred) ** 0.5
    print('The rmse of prediction is:', rmse)
+   nni.report_final_result(rmse)

if __name__ == '__main__':
    lgb_train, lgb_eval, X_test, y_test = load_data()
+   RECEIVED_PARAMS = nni.get_next_parameter()
    PARAMS = get_default_parameters()
+   PARAMS.update(RECEIVED_PARAMS)
    # train
    run(lgb_train, lgb_eval, PARAMS, X_test, y_test)
```
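Note that `nni.get_next_parameter()` returns a plain dict sampled from the search space, so merging it into the defaults is just a dict update. A tiny sketch of that merge with hypothetical values:
```python
PARAMS = {'num_leaves': 31, 'learning_rate': 0.05, 'bagging_fraction': 0.9}   # defaults
RECEIVED_PARAMS = {'num_leaves': 24, 'learning_rate': 0.1}                    # what the tuner might send
PARAMS.update(RECEIVED_PARAMS)
# PARAMS is now {'num_leaves': 24, 'learning_rate': 0.1, 'bagging_fraction': 0.9}
```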
### 3.4 Write a config file and run it.
In the config file, you can configure settings including:
* Experiment settings: `trialConcurrency`, `maxExecDuration`, `maxTrialNum`, `trial gpuNum`, etc.
* Platform settings: `trainingServicePlatform`, etc.
* Path settings: `searchSpacePath`, `trial codeDir`, etc.
* Algorithm settings: select the `tuner` algorithm, `tuner optimize_mode`, etc.
An example config.yml follows:
```yaml
authorName: default
experimentName: example_auto-gbdt
trialConcurrency: 1
maxExecDuration: 10h
maxTrialNum: 10
#choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner
  #SMAC (SMAC should be installed through nnictl)
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: minimize
trial:
  command: python3 main.py
  codeDir: .
  gpuNum: 0
```
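The tuner comment above notes that SMAC has to be installed separately. In NNI releases of roughly this vintage that was typically done through nnictl's package command (shown as a sketch; check your NNI version's documentation for the exact syntax):
```bash
nnictl package install --name=SMAC
```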
Run this experiment with the following command:
```bash
nnictl create --config ./config.yml
```
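When the experiment starts, `nnictl` prints the WebUI address where trials can be monitored; when you are finished, the experiment can be stopped from the same machine (a minimal sketch):
```bash
nnictl stop
```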