"...resnet50_tensorflow.git" did not exist on "59c6bdb3c63f0630941d620db022ea18a50ed9ad"
Commit df6145a2 authored by Yuge Zhang's avatar Yuge Zhang
Browse files

Merge branch 'master' of https://github.com/microsoft/nni into dev-retiarii

parents 0f0c6288 f8424a9f
Dockerfile
===
## 1.Description
This is the Dockerfile of the NNI project. It includes several popular deep learning frameworks and NNI. It has been tested on `Ubuntu 16.04 LTS`:
```
CUDA 9.0
CuDNN 7.0
numpy 1.14.3
scipy 1.1.0
tensorflow-gpu 1.15.0
keras 2.1.6
torch 1.4.0
scikit-learn 0.23.2
pandas 0.23.4
lightgbm 2.2.2
nni
```
You can take this Dockerfile as a reference for your own customized Dockerfile.
## 2.How to build and run
__Use the following command from `nni/deployment/docker` to build docker image__
```
docker build -t nni/nni .
```
__Run the docker image__
* If you do not use a GPU in the docker container, simply run the following command
```
docker run -it nni/nni
```
Note that if you want to use tensorflow, please uninstall tensorflow-gpu and install tensorflow in this docker container, or modify `Dockerfile` to install tensorflow (without GPU support) and rebuild the docker image.
* If you use a GPU in the docker container, make sure you have installed [NVIDIA Container Runtime](https://github.com/NVIDIA/nvidia-docker), then run the following command
```
nvidia-docker run -it nni/nni
```
or
```
docker run --runtime=nvidia -it nni/nni
```
## 3.Directly retrieve the docker image
Use the following command to retrieve the NNI docker image from Docker Hub
```
docker pull msranni/nni:latest
```
# Dockerfile
## 1. Description
This is the Dockerfile of the NNI project. It includes NNI and several popular deep learning frameworks. It has been tested on `Ubuntu 16.04 LTS`:
```
CUDA 9.0
CuDNN 7.0
numpy 1.14.3
scipy 1.1.0
tensorflow-gpu 1.15.0
keras 2.1.6
torch 1.4.0
scikit-learn 0.23.2
pandas 0.23.4
lightgbm 2.2.2
nni
```
You can take this Dockerfile as a reference for your own customized Dockerfile.
## 2. How to build and run
**Use the following command from `nni/deployment/docker` to build the docker image.**
```
docker build -t nni/nni .
```
**Run the docker image**
* If no GPU is used in the docker container, simply run the following command
```
docker run -it nni/nni
```
Note that if you want to use tensorflow, you need to uninstall tensorflow-gpu and install tensorflow in the docker container first, or modify `Dockerfile` to install the tensorflow version without GPU support and rebuild the docker image.
* If a GPU is used in the docker container, make sure you have installed the [NVIDIA Container Runtime](https://github.com/NVIDIA/nvidia-docker), then run the following command
```
nvidia-docker run -it nni/nni
```
or
```
docker run --runtime=nvidia -it nni/nni
```
## 3. Pull the docker image
Use the following command to pull the NNI docker image from Docker Hub.
```
docker pull msranni/nni:latest
```
# Python Package Index (PyPI) for NNI
This is the PyPI build and upload tool for NNI project.
## **For Linux**
* __Prepare environment__
Before building and uploading the NNI package, make sure the OS and tools below are available.
```
Ubuntu 16.04 LTS
make
wget
Python >= 3.6
Pip
Node.js
Yarn
```
* __How to build__
```bash
make
```
* __How to upload__
**upload for testing**
```bash
TWINE_REPOSITORY_URL=https://test.pypi.org/legacy/ make upload
```
You may need to input the account and password of https://test.pypi.org during this process.
**upload for release**
```bash
make upload
```
You may need to input the account and password of https://pypi.org during this process.
## **For Windows**
* __Prepare environment__
Before building and uploading the NNI package, make sure the OS and tools below are available.
```
Windows 10
powershell
Python >= 3.6
Pip
Yarn
```
* __How to build__
The parameter `version_os` selects whether to build for 64-bit or 32-bit Windows.
```bash
powershell ./install.ps1 -version_os [64/32]
```
* __How to upload__
**upload for testing**
```bash
powershell ./upload.ps1
```
You may need to input the account and password of https://test.pypi.org during this process.
**upload for release**
```bash
powershell ./upload.ps1 -test $False
```
You may need to input the account and password of https://pypi.org during this process.
# Python Package Index (PyPI) for NNI
This is the PyPI build and upload tool for the NNI project.
## **Linux**
* **Prepare environment**
Before building and uploading the NNI package, make sure the following environment is available.
```
Ubuntu 16.04 LTS
make
wget
Python >= 3.5
Pip
Node.js
Yarn
```
* **How to build**
```bash
make
```
* **How to upload**
**Upload for testing**
```bash
TWINE_REPOSITORY_URL=https://test.pypi.org/legacy/ make upload
```
You may need to input the account and password of https://test.pypi.org during this process.
**Upload for release**
```bash
make upload
```
You may need to input the account and password of https://pypi.org during this process.
## **Windows**
* **Prepare environment**
Before building and uploading the NNI package, make sure the following environment is available.
```
Windows 10
powershell
Python >= 3.5
Pip
Yarn
```
* **How to build**
The parameter `version_os` selects whether to build for 64-bit or 32-bit Windows.
```bash
powershell ./install.ps1 -version_os [64/32]
```
* **How to upload**
**Upload for testing**
```bash
powershell ./upload.ps1
```
You may need to input the account and password of https://test.pypi.org during this process.
**Upload for release**
```bash
powershell ./upload.ps1 -test $False
```
You may need to input the account and password of https://pypi.org during this process.
## Install ##
Install for current user:
```bash
pip install --user -e .
```
Install for all users:
```bash
pip install -e .
```
## Test ##
```bash
python setup.py test
```
## Install
Install for the current user:
```bash
pip install --user -e .
```
Install for all users:
```bash
pip install -e .
```
## Test
```bash
python setup.py test
```
# NNI Annotation
## Overview
To improve user experience and reduce user effort, we design an annotation grammar. Using NNI annotation, users can adapt their code to NNI just by adding some standalone annotating strings, which does not affect the execution of the original code.
Below is an example:
```python
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1
```
The meaning of this example is that NNI will choose one of several values (0.1, 0.01, 0.001) to assign to the learning_rate variable. Specifically, the first line is an NNI annotation, which is a single string. It is followed by an assignment statement. What NNI does here is replace the right-hand value of this assignment statement according to the information provided by the annotation line.
In this way, users can either run the Python code directly or launch NNI to tune the hyper-parameters in this code, without changing any code.
## Types of Annotation:
In NNI, there are mainly four types of annotation:
### 1. Annotate variables
`'''@nni.variable(sampling_algo, name)'''`
`@nni.variable` is used in NNI to annotate a variable.
**Arguments**
- **sampling_algo**: Sampling algorithm that specifies a search space. The user should replace it with a built-in NNI sampling function whose name consists of the `nni.` prefix and a search space type specified in [SearchSpaceSpec](https://nni.readthedocs.io/en/latest/Tutorial/SearchSpaceSpec.html), such as `choice` or `uniform`.
- **name**: The name of the variable that the selected value will be assigned to. Note that this argument should be the same as the left value of the following assignment statement.
There are 10 types for expressing your search space:
* `@nni.variable(nni.choice(option1,option2,...,optionN),name=variable)`
Which means the variable value is one of the options, which should be a list. The elements of the options can themselves be stochastic expressions.
* `@nni.variable(nni.randint(upper),name=variable)`
Which means the variable value is a random integer in the range [0, upper).
* `@nni.variable(nni.uniform(low, high),name=variable)`
Which means the variable value is a value uniformly distributed between low and high.
* `@nni.variable(nni.quniform(low, high, q),name=variable)`
Which means the variable value is a value like round(uniform(low, high) / q) * q
* `@nni.variable(nni.loguniform(low, high),name=variable)`
Which means the variable value is a value drawn according to exp(uniform(low, high)) so that the logarithm of the return value is uniformly distributed.
* `@nni.variable(nni.qloguniform(low, high, q),name=variable)`
Which means the variable value is a value like round(exp(uniform(low, high)) / q) * q
* `@nni.variable(nni.normal(mu, sigma),name=variable)`
Which means the variable value is a real value that's normally-distributed with mean mu and standard deviation sigma.
* `@nni.variable(nni.qnormal(mu, sigma, q),name=variable)`
Which means the variable value is a value like round(normal(mu, sigma) / q) * q
* `@nni.variable(nni.lognormal(mu, sigma),name=variable)`
Which means the variable value is a value drawn according to exp(normal(mu, sigma))
* `@nni.variable(nni.qlognormal(mu, sigma, q),name=variable)`
Which means the variable value is a value like round(exp(normal(mu, sigma)) / q) * q
Below is an example:
```python
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1
```
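The other sampling types follow the same pattern. For instance, here is a minimal sketch using `nni.uniform` (the variable name `dropout_rate` is only illustrative, not taken from the examples above):
```python
'''@nni.variable(nni.uniform(0.5, 0.9), name=dropout_rate)'''
dropout_rate = 0.5
```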
### 2. Annotate functions
`'''@nni.function_choice(*functions, name)'''`
`@nni.function_choice` is used to choose one from several functions.
**Arguments**
- **functions**: Several candidate functions to choose from. Note that each should be a complete function call with its arguments, such as `max_pool(hidden_layer, pool_size)`.
- **name**: The name of the function that will be replaced in the following assignment statement.
An example here is:
```python
"""@nni.function_choice(max_pool(hidden_layer, pool_size), avg_pool(hidden_layer, pool_size), name=max_pool)"""
h_pooling = max_pool(hidden_layer, pool_size)
```
### 3. Annotate intermediate result
`'''@nni.report_intermediate_result(metrics)'''`
`@nni.report_intermediate_result` is used to report intermediate results; its usage is the same as `nni.report_intermediate_result` in [Write a trial run on NNI](https://nni.readthedocs.io/en/latest/TrialExample/Trials.html).
### 4. Annotate final result
`'''@nni.report_final_result(metrics)'''`
`@nni.report_final_result` is used to report the final result of the current trial; its usage is the same as `nni.report_final_result` in [Write a trial run on NNI](https://nni.readthedocs.io/en/latest/TrialExample/Trials.html).
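To make the reporting annotations concrete, here is a minimal, self-contained sketch of where they typically sit in a trial script; the dummy metric computation simply stands in for real training and evaluation code:
```python
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1

test_acc = 0.0
for epoch in range(10):
    # A stand-in for one epoch of real training and evaluation.
    test_acc = 1.0 - learning_rate / (epoch + 1)
    '''@nni.report_intermediate_result(test_acc)'''

'''@nni.report_final_result(test_acc)'''
```
Because the annotations are plain strings, this script runs unchanged with or without NNI.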
# NNI Annotation
## Overview
To provide a good user experience and minimize the impact on existing code, NNI designed an annotation-based syntax. With annotations, you only need to add some comment strings to your code to enable NNI, without affecting the original execution logic of the code at all.
An example is shown below:
```python
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1
```
In this example, NNI will choose one of the values (0.1, 0.01, 0.001) and assign it to the learning_rate variable. The first line is the NNI annotation, which is a string in Python. The following line must be an assignment statement. NNI assigns the corresponding value to the variable in this line according to the information provided by the annotation line.
In this way, the code can either be run directly or tuned with NNI, without modifying any code.
## Types of annotation:
There are 4 types of annotation in NNI:
### 1. Variables
`'''@nni.variable(sampling_algo, name)'''`
`@nni.variable` is used to annotate a variable.
**Arguments**
- **sampling_algo**: The sampling algorithm that specifies the search space. It can be replaced with any other sampling function supported by NNI; the function name must start with `nni.`, e.g., `choice` or `uniform`, as described in [SearchSpaceSpec](https://nni.readthedocs.io/zh/latest/Tutorial/SearchSpaceSpec.html).
- **name**: The name of the variable to be assigned. Note that this argument should be the same as the left-hand side of the assignment statement on the following line.
NNI supports the following 10 types to express the search space:
- `@nni.variable(nni.choice(option1,option2,...,optionN),name=variable)` The variable value is one of the options; the options can be arbitrary expressions.
- `@nni.variable(nni.randint(upper),name=variable)` The variable can be any integer in the range [0, upper).
- `@nni.variable(nni.uniform(low, high),name=variable)` The variable value is uniformly distributed between low and high.
- `@nni.variable(nni.quniform(low, high, q),name=variable)` The variable value is a value between low and high, following the formula round(uniform(low, high) / q) * q.
- `@nni.variable(nni.loguniform(low, high),name=variable)` The variable value is drawn from exp(uniform(low, high)), so its logarithm is uniformly distributed.
- `@nni.variable(nni.qloguniform(low, high, q),name=variable)` The variable value is a value between low and high, following the formula round(exp(uniform(low, high)) / q) * q.
- `@nni.variable(nni.normal(mu, sigma),name=variable)` The variable value is a normally distributed real value with mean mu and standard deviation sigma.
- `@nni.variable(nni.qnormal(mu, sigma, q),name=variable)` The variable value follows the formula round(normal(mu, sigma) / q) * q.
- `@nni.variable(nni.lognormal(mu, sigma),name=variable)` The variable value follows the formula exp(normal(mu, sigma)).
- `@nni.variable(nni.qlognormal(mu, sigma, q),name=variable)` The variable value follows the formula round(exp(normal(mu, sigma)) / q) * q.
An example is shown below:
```python
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1
```
### 2. Functions
`'''@nni.function_choice(*functions, name)'''`
`@nni.function_choice` chooses one of several functions to execute.
**Arguments**
- **functions**: The candidate functions. Note that each must be a complete function call with its arguments, e.g., `max_pool(hidden_layer, pool_size)`.
- **name**: The name of the function that will be replaced.
For example:
```python
"""@nni.function_choice(max_pool(hidden_layer, pool_size), avg_pool(hidden_layer, pool_size), name=max_pool)"""
h_pooling = max_pool(hidden_layer, pool_size)
```
### 3. Intermediate results
`'''@nni.report_intermediate_result(metrics)'''`
`@nni.report_intermediate_result` is used to report intermediate results; its usage is the same as `nni.report_intermediate_result` in [Write a trial run on NNI](https://nni.readthedocs.io/zh/latest/TrialExample/Trials.html).
### 4. Final result
`'''@nni.report_final_result(metrics)'''`
`@nni.report_final_result` is used to report the final result of the current trial; its usage is the same as `nni.report_final_result` in [Write a trial run on NNI](https://nni.readthedocs.io/zh/latest/TrialExample/Trials.html).
## NNI CTL
The NNI CTL module is used to control Neural Network Intelligence, including starting a new experiment, stopping an experiment, updating an experiment, etc.
## Environment
```
Ubuntu 16.04 or other Linux OS
python >= 3.6
```
## Installation
1. Enter tools directory
1. Use pip to install packages
* Install for current user:
```bash
python3 -m pip install --user -e .
```
* Install for all users:
```bash
python3 -m pip install -e .
```
1. Change the mode of nnictl file
```bash
chmod +x ./nnictl
```
1. Add nnictl to your PATH system environment variable.
* You could use the `export` command to set the PATH variable temporarily.
export PATH={your nnictl path}:$PATH
* Or you could edit your `/etc/profile` file.
```txt
1.sudo vim /etc/profile
2.At the end of the file, add
export PATH={your nnictl path}:$PATH
save and exit.
3.source /etc/profile
```
## To start using NNI CTL
please refer to the [NNI CTL document].
[NNI CTL document]: ../docs/en_US/Nnictl.md
## NNI CTL
The NNI CTL module is used to control Neural Network Intelligence, including starting a new experiment, stopping an experiment, updating an experiment, etc.
## Environment
```
Ubuntu 16.04 or other Linux OS
python >= 3.6
```
## Installation
1. Enter the tools directory
2. Use pip to install the package
- Install for the current user:
```bash
python3 -m pip install --user -e .
```
- Install for all users:
```bash
python3 -m pip install -e .
```
3. Change the mode of the nnictl file
```bash
chmod +x ./nnictl
```
4. Add nnictl to your PATH system environment variable.
- You can use the `export` command to set the PATH variable temporarily.
export PATH={your nnictl path}:$PATH
- Or edit your `/etc/profile` file.
```txt
1.sudo vim /etc/profile
2.At the end of the file, add
export PATH={your nnictl path}:$PATH
save and exit.
3.source /etc/profile
```
## To start using NNI CTL
Please refer to the [NNI CTL document](../docs/zh_CN/Nnictl.md).
# Azure hosted agents specification:
# https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/hosted?view=azure-devops
jobs:
- job: 'ubuntu_1804_python36'
  pool:
    vmImage: 'Ubuntu 18.04'

  steps:
  - script: |
      set -e
      python3 -m pip install --upgrade pip setuptools --user
      python3 -m pip install coverage --user
      echo "##vso[task.setvariable variable=PATH]${HOME}/.local/bin:${PATH}"
    displayName: 'Install python tools'
  - script: |
      python3 setup.py develop
    displayName: 'Install nni toolkit via source code'
  - script: |
      set -e
      cd ts/nni_manager
      yarn eslint
      # uncomment following 2 lines to enable webui eslint
      cd ../webui
      yarn eslint
    displayName: 'Run eslint'
  - script: |
      set -e
      sudo apt-get install -y pandoc
      python3 -m pip install pygments --user --upgrade
      python3 -m pip install torch==1.5.0+cpu torchvision==0.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html --user
      python3 -m pip install tensorflow==2.2.0 --user
      python3 -m pip install keras==2.4.2 --user
      python3 -m pip install gym onnx peewee thop --user
      python3 -m pip install sphinx==1.8.3 sphinx-argparse==0.2.5 sphinx-markdown-tables==0.0.9 sphinx-rtd-theme==0.4.2 sphinxcontrib-websupport==1.1.0 recommonmark==0.5.0 nbsphinx --user
      sudo apt-get install swig -y
      python3 -m pip install -e .[SMAC,BOHB]
    displayName: 'Install dependencies'
  - script: |
      cd test
      source scripts/unittest.sh
    displayName: 'Unit test'
  - script: |
      cd test
      python3 nni_test/nnitest/run_tests.py --config config/pr_tests.yml
    displayName: 'Simple test'
  #- script: |
  #    cd docs/en_US/
  #    sphinx-build -M html . _build -W
  #  displayName: 'Sphinx Documentation Build check'

- job: 'ubuntu_1804_python36_legacy_torch_tf'
  pool:
    vmImage: 'Ubuntu 18.04'

  steps:
  - script: |
      set -e
      python3 -m pip install --upgrade pip setuptools --user
      python3 -m pip install pylint==2.6.0 astroid==2.4.2 --user
      python3 -m pip install coverage --user
      python3 -m pip install thop --user
      echo "##vso[task.setvariable variable=PATH]${HOME}/.local/bin:${PATH}"
    displayName: 'Install python tools'
  - script: |
      python3 setup.py develop
    displayName: 'Install nni toolkit via source code'
  - script: |
      set -e
      python3 -m pip install torch==1.3.1+cpu torchvision==0.4.2+cpu -f https://download.pytorch.org/whl/torch_stable.html --user
      python3 -m pip install tensorflow==1.15.2 --user
      python3 -m pip install keras==2.1.6 --user
      python3 -m pip install gym onnx peewee --user
      sudo apt-get install swig -y
      python3 -m pip install -e .[SMAC,BOHB]
    displayName: 'Install dependencies'
  - script: |
      set -e
      python3 -m pylint --rcfile pylintrc nni
    displayName: 'Run pylint'
  - script: |
      set -e
      python3 -m pip install flake8 --user
      python3 -m flake8 nni --count --select=E9,F63,F72,F82 --show-source --statistics
      EXCLUDES=examples/trials/mnist-nas/*/mnist*.py,examples/trials/nas_cifar10/src/cifar10/general_child.py
      python3 -m flake8 examples --count --exclude=$EXCLUDES --select=E9,F63,F72,F82 --show-source --statistics
    displayName: 'Run flake8 tests to find Python syntax errors and undefined names'
  - script: |
      cd test
      source scripts/unittest.sh
    displayName: 'Unit test'
  - script: |
      cd test
      python3 nni_test/nnitest/run_tests.py --config config/pr_tests.yml
    displayName: 'Simple test'
#- job: 'macos_latest_python38'
# pool:
# vmImage: 'macOS-latest'
#
# steps:
# - script: |
# export PYTHON38_BIN_DIR=/usr/local/Cellar/python@3.8/`ls /usr/local/Cellar/python@3.8`/bin
# echo "##vso[task.setvariable variable=PATH]${PYTHON38_BIN_DIR}:${HOME}/Library/Python/3.8/bin:${PATH}"
# python3 -m pip install --upgrade pip setuptools
# displayName: 'Install python tools'
# - script: |
# echo "network-timeout 600000" >> ${HOME}/.yarnrc
# source install.sh
# displayName: 'Install nni toolkit via source code'
# - script: |
# set -e
# # pytorch Mac binary does not support CUDA, default is cpu version
# python3 -m pip install torchvision==0.6.0 torch==1.5.0 --user
# python3 -m pip install tensorflow==2.2 --user
# brew install swig@3
# rm -f /usr/local/bin/swig
# ln -s /usr/local/opt/swig\@3/bin/swig /usr/local/bin/swig
# nnictl package install --name=SMAC
# displayName: 'Install dependencies'
# - script: |
# cd test
# source scripts/unittest.sh
# displayName: 'Unit test'
# - script: |
# cd test
# python3 nni_test/nnitest/run_tests.py --config config/pr_tests.yml
# displayName: 'Simple test'
#
#- job: 'win2016_python37'
# pool:
# vmImage: 'vs2017-win2016'
#
# steps:
# - script: |
# powershell.exe -file install.ps1
# displayName: 'Install nni toolkit via source code'
# - script: |
# python -m pip install scikit-learn==0.23.2 --user
# python -m pip install keras==2.1.6 --user
# python -m pip install torch==1.5.0+cpu torchvision==0.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html --user
# python -m pip install tensorflow==1.15.2 --user
# displayName: 'Install dependencies'
# - script: |
# cd test
# powershell.exe -file scripts/unittest.ps1
# displayName: 'unit test'
# - script: |
# cd test
# python nni_test/nnitest/run_tests.py --config config/pr_tests.yml
# displayName: 'Simple test'
advisors:
- builtinName: Hyperband
  classArgsValidator: nni.algorithms.hpo.hyperband_advisor.HyperbandClassArgsValidator
  className: nni.algorithms.hpo.hyperband_advisor.Hyperband
  source: nni
- builtinName: BOHB
  classArgsValidator: nni.algorithms.hpo.bohb_advisor.BOHBClassArgsValidator
  className: nni.algorithms.hpo.bohb_advisor.BOHB
  source: nni
assessors:
- builtinName: Medianstop
  classArgsValidator: nni.algorithms.hpo.medianstop_assessor.MedianstopClassArgsValidator
  className: nni.algorithms.hpo.medianstop_assessor.MedianstopAssessor
  source: nni
- builtinName: Curvefitting
  classArgsValidator: nni.algorithms.hpo.curvefitting_assessor.CurvefittingClassArgsValidator
  className: nni.algorithms.hpo.curvefitting_assessor.CurvefittingAssessor
  source: nni
tuners:
- builtinName: PPOTuner
  classArgsValidator: nni.algorithms.hpo.ppo_tuner.PPOClassArgsValidator
  className: nni.algorithms.hpo.ppo_tuner.PPOTuner
  source: nni
- builtinName: SMAC
  classArgsValidator: nni.algorithms.hpo.smac_tuner.SMACClassArgsValidator
  className: nni.algorithms.hpo.smac_tuner.SMACTuner
  source: nni
- builtinName: TPE
  classArgs:
    algorithm_name: tpe
  classArgsValidator: nni.algorithms.hpo.hyperopt_tuner.HyperoptClassArgsValidator
  className: nni.algorithms.hpo.hyperopt_tuner.HyperoptTuner
  source: nni
- acceptClassArgs: false
  builtinName: Random
  classArgs:
    algorithm_name: random_search
  classArgsValidator: nni.algorithms.hpo.hyperopt_tuner.HyperoptClassArgsValidator
  className: nni.algorithms.hpo.hyperopt_tuner.HyperoptTuner
  source: nni
- builtinName: Anneal
  classArgs:
    algorithm_name: anneal
  classArgsValidator: nni.algorithms.hpo.hyperopt_tuner.HyperoptClassArgsValidator
  className: nni.algorithms.hpo.hyperopt_tuner.HyperoptTuner
  source: nni
- builtinName: Evolution
  classArgsValidator: nni.algorithms.hpo.evolution_tuner.EvolutionClassArgsValidator
  className: nni.algorithms.hpo.evolution_tuner.EvolutionTuner
  source: nni
- acceptClassArgs: false
  builtinName: BatchTuner
  className: nni.algorithms.hpo.batch_tuner.BatchTuner
  source: nni
- acceptClassArgs: false
  builtinName: GridSearch
  className: nni.algorithms.hpo.gridsearch_tuner.GridSearchTuner
  source: nni
- builtinName: NetworkMorphism
  classArgsValidator: nni.algorithms.hpo.networkmorphism_tuner.NetworkMorphismClassArgsValidator
  className: nni.algorithms.hpo.networkmorphism_tuner.NetworkMorphismTuner
  source: nni
- builtinName: MetisTuner
  classArgsValidator: nni.algorithms.hpo.metis_tuner.MetisClassArgsValidator
  className: nni.algorithms.hpo.metis_tuner.MetisTuner
  source: nni
- builtinName: GPTuner
  classArgsValidator: nni.algorithms.hpo.gp_tuner.GPClassArgsValidator
  className: nni.algorithms.hpo.gp_tuner.GPTuner
  source: nni
- builtinName: PBTTuner
  classArgsValidator: nni.algorithms.hpo.pbt_tuner.PBTClassArgsValidator
  className: nni.algorithms.hpo.pbt_tuner.PBTTuner
  source: nni
- builtinName: RegularizedEvolutionTuner
  classArgsValidator: nni.algorithms.hpo.regularized_evolution_tuner.EvolutionClassArgsValidator
  className: nni.algorithms.hpo.regularized_evolution_tuner.RegularizedEvolutionTuner
  source: nni
# Built-in Assessors
NNI provides state-of-the-art tuning algorithms within our builtin-assessors and makes them easy to use. Below is a brief overview of NNI's current builtin Assessors.
Note: Click the **Assessor's name** to get each Assessor's installation requirements, suggested usage scenario, and a config example. A link to a detailed description of each algorithm is provided at the end of the suggested scenario for each Assessor.
Currently, we support the following Assessors:
|Assessor|Brief Introduction of Algorithm|
|---|---|
|[__Medianstop__](#MedianStop)|Medianstop is a simple early stopping rule. It stops a pending trial X at step S if the trial’s best objective value by step S is strictly worse than the median value of the running averages of all completed trials’ objectives reported up to step S. [Reference Paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf)|
|[__Curvefitting__](#Curvefitting)|Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the prediction of the final epoch's performance is worse than the best final performance in the trial history. In this algorithm, we use 12 curves to fit the accuracy curve. [Reference Paper](http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf)|
## Usage of Builtin Assessors
Usage of builtin assessors provided by the NNI SDK requires one to declare the **builtinAssessorName** and **classArgs** in the `config.yml` file. In this part, we will introduce the details of usage and the suggested scenarios, classArg requirements, and an example for each assessor.
Note: Please follow the provided format when writing your `config.yml` file.
<a name="MedianStop"></a>
### Median Stop Assessor
> Builtin Assessor Name: **Medianstop**
**Suggested scenario**
It's applicable in a wide range of performance curves, thus, it can be used in various scenarios to speed up the tuning progress. [Detailed Description](./MedianstopAssessor.md)
**classArgs requirements:**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', the assessor will **stop** trials with a smaller expected final value. If 'minimize', the assessor will **stop** trials with a larger expected final value.
* **start_step** (*int, optional, default = 0*) - A trial is assessed for early stopping only after it has reported at least start_step intermediate results.
**Usage example:**
```yaml
# config.yml
assessor:
builtinAssessorName: Medianstop
classArgs:
optimize_mode: maximize
start_step: 5
```
<br>
<a name="Curvefitting"></a>
### Curve Fitting Assessor
> Builtin Assessor Name: **Curvefitting**
**Suggested scenario**
It's applicable in a wide range of performance curves, thus, it can be used in various scenarios to speed up the tuning progress. Even better, it's able to handle and assess curves with similar performance. [Detailed Description](./CurvefittingAssessor.md)
**Note**, according to the original paper, only incremental functions are supported. Therefore this assessor can only be used to maximize optimization metrics. For example, it can be used for accuracy, but not for loss.
**classArgs requirements:**
* **epoch_num** (*int, **required***) - The total number of epochs. We need to know the number of epochs to determine which points we need to predict.
* **start_step** (*int, optional, default = 6*) - A trial is assessed for early stopping only after it has reported at least start_step intermediate results.
* **threshold** (*float, optional, default = 0.95*) - The threshold used to decide whether to early stop the worst performance curves. For example, if threshold = 0.95 and the best performance in the history is 0.9, then we will stop any trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
* **gap** (*int, optional, default = 1*) - The interval between assessor judgements. For example, if gap = 2 and start_step = 6, then we will assess the result when we receive 6, 8, 10, 12... intermediate results.
**Usage example:**
```yaml
# config.yml
assessor:
builtinAssessorName: Curvefitting
classArgs:
epoch_num: 20
start_step: 6
threshold: 0.95
gap: 1
```
# Curve Fitting Assessor on NNI
## Introduction
The Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the prediction of the final epoch's performance is worse than the best final performance in the trial history.
In this algorithm, we use 12 curves to fit the learning curve. The set of parametric curve models are chosen from this [reference paper][1]. The learning curves' shape coincides with our prior knowledge about the form of learning curves: They are typically increasing, saturating functions.
![learning_curve](../../img/curvefitting_learning_curve.PNG)
We combine all learning curve models into a single, more powerful model. This combined model is given by a weighted linear combination:
![f_comb](../../img/curvefitting_f_comb.gif)
with the new combined parameter vector
![expression_xi](../../img/curvefitting_expression_xi.gif)
We assume additive Gaussian noise, with the noise parameter initialized to its maximum likelihood estimate.
We determine the most probable value of the new combined parameter vector by learning from the historical data, and use it to predict future trial performance and stop inadequate trials early to save computing resources.
Concretely, this algorithm goes through three stages of learning, predicting, and assessing.
* Step 1: Learning. We learn from the trial history of the current trial and determine \xi from a Bayesian perspective. First, we fit each curve using the least-squares method, implemented in `fit_theta`. After obtaining the parameters, we filter the curves and remove outliers, implemented in `filter_curve`. Finally, we use MCMC sampling, implemented in `mcmc_sampling`, to adjust the weight of each curve. At this point, all the parameters in \xi have been determined.
* Step 2: Predicting. We calculate the expected final accuracy, implemented in `f_comb`, at the target position (i.e., the total number of epochs) using \xi and the formula of the combined model (see the sketch below).
* Step 3: Assessing. If the fit does not converge, the predicted value is `None`; in this case we return `AssessResult.Good`, wait for more accuracy information, and predict again. Otherwise, `predict()` returns a positive value. If this value is strictly greater than the best final performance in the history multiplied by `THRESHOLD` (default value = 0.95), we return `AssessResult.Good`; otherwise, we return `AssessResult.Bad`.
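As an illustration of the combined model described above, the following sketch combines two hypothetical parametric learning curves with a weight vector; the real assessor uses 12 curve models and learns both the weights and the curve parameters from the trial history:
```python
import numpy as np

def pow_curve(x, a, b, c):
    # Saturating power-law curve: c - a * x^(-b)
    return c - a * np.power(x, -b)

def exp_curve(x, a, b, c):
    # Exponential saturation curve: c - a * exp(-b * x)
    return c - a * np.exp(-b * x)

def f_comb(x, weights, thetas):
    # Weighted linear combination of the individual curve models,
    # playing the role of the combined model f_comb described above.
    curves = [pow_curve, exp_curve]
    return sum(w * f(x, *theta) for w, f, theta in zip(weights, curves, thetas))

# Predict the accuracy at the target epoch (e.g. epoch_num = 20) with made-up parameters.
predicted_final_acc = f_comb(20, weights=[0.6, 0.4],
                             thetas=[(0.5, 0.7, 0.95), (0.5, 0.3, 0.95)])
print(predicted_final_acc)
```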
The figure below shows the result of our algorithm on MNIST trial history data, where the green points represent the data obtained by the assessor, the blue points represent the future but unknown data, and the red line is the curve predicted by the Curve Fitting Assessor.
![examples](../../img/curvefitting_example.PNG)
## Usage
To use Curve Fitting Assessor, you should add the following spec in your experiment's YAML config file:
```yaml
assessor:
builtinAssessorName: Curvefitting
classArgs:
  # (required) The total number of epochs.
  # We need to know the number of epochs to determine which points we need to predict.
epoch_num: 20
  # (optional) To save computing resources, we start predicting only after receiving start_step intermediate results.
# The default value of start_step is 6.
start_step: 6
  # (optional) The threshold used to decide whether to early stop the worst performance curves.
  # For example: if threshold = 0.95 and the best performance in the history is 0.9, then we will stop any trial whose predicted value is lower than 0.95 * 0.9 = 0.855.
# The default value of threshold is 0.95.
threshold: 0.95
  # (optional) The interval between assessor judgements.
  # For example: if gap = 2 and start_step = 6, then we will assess the result when we receive 6, 8, 10, 12... intermediate results.
# The default value of gap is 1.
gap: 1
```
## Limitation
According to the original paper, only incremental functions are supported. Therefore this assessor can only be used to maximize optimization metrics. For example, it can be used for accuracy, but not for loss.
## File Structure
The assessor has a lot of different files, functions, and classes. Here we briefly describe a few of them.
* `curvefunctions.py` includes all the function expressions and default parameters.
* `modelfactory.py` includes learning and predicting; the corresponding calculation part is also implemented here.
* `curvefitting_assessor.py` is the assessor that receives the trial history and assesses whether to stop the trial early.
## TODO
* Further improve the accuracy of the prediction and test it on more models.
[1]: http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf
# Customize Assessor
NNI supports building an assessor by yourself to meet your tuning needs.
If you want to implement a customized Assessor, there are three things to do:
1. Inherit the base Assessor class
1. Implement assess_trial function
1. Configure your customized Assessor in experiment YAML config file
**1. Inherit the base Assessor class**
```python
from nni.assessor import Assessor
class CustomizedAssessor(Assessor):
def __init__(self, ...):
...
```
**2. Implement assess trial function**
```python
from nni.assessor import Assessor, AssessResult
class CustomizedAssessor(Assessor):
def __init__(self, ...):
...
def assess_trial(self, trial_history):
"""
Determines whether a trial should be killed. Must override.
trial_history: a list of intermediate result objects.
Returns AssessResult.Good or AssessResult.Bad.
"""
        # implement your code here.
...
```
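For instance, a minimal (illustrative, not built-in) assessor might stop a trial whose latest intermediate result falls below a fixed threshold once enough results have been reported; this is only a sketch of the interface shown above:
```python
from nni.assessor import Assessor, AssessResult

class ThresholdAssessor(Assessor):
    def __init__(self, threshold=0.8, start_step=3):
        self.threshold = threshold
        self.start_step = start_step

    def assess_trial(self, trial_history):
        # Wait until enough intermediate results have been reported.
        if len(trial_history) < self.start_step:
            return AssessResult.Good
        # Stop the trial if its latest intermediate result is below the threshold.
        if trial_history[-1] < self.threshold:
            return AssessResult.Bad
        return AssessResult.Good
```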
**3. Configure your customized Assessor in experiment YAML config file**
NNI needs to locate your customized Assessor class and instantiate the class, so you need to specify the location of the customized Assessor class and pass literal values as parameters to the \_\_init__ constructor.
```yaml
assessor:
codeDir: /home/abc/myassessor
classFileName: my_customized_assessor.py
className: CustomizedAssessor
  # Any parameter that needs to be passed to your Assessor's __init__ constructor
  # can be specified in this optional classArgs field, for example
classArgs:
arg1: value1
```
Please note that in **2**, the object `trial_history` is exactly the object that the trial sends to the assessor via the SDK function `report_intermediate_result`.
The working directory of your assessor is `<home>/nni-experiments/<experiment_id>/log`, which can be retrieved with the environment variable `NNI_LOG_DIRECTORY`.
For more detailed examples, see:
> * [medianstop-assessor](https://github.com/Microsoft/nni/tree/v1.9/src/sdk/pynni/nni/medianstop_assessor)
> * [curvefitting-assessor](https://github.com/Microsoft/nni/tree/v1.9/src/sdk/pynni/nni/curvefitting_assessor)
Medianstop Assessor on NNI
===
## Median Stop
Medianstop is a simple early stopping rule mentioned in this [paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf). It stops a pending trial X after step S if the trial’s best objective value by step S is strictly worse than the median value of the running averages of all completed trials’ objectives reported up to step S.
# Auto Completion for nnictl Commands
NNI's command line tool __nnictl__ supports auto-completion, i.e., you can complete an nnictl command by pressing the `tab` key.
For example, if the current command is
```
nnictl cre
```
By pressing the `tab` key, it will be completed to
```
nnictl create
```
For now, auto-completion is not enabled by default if you install NNI through `pip`, and it only works on Linux with the bash shell. If you want to enable this feature on your computer, please refer to the following steps:
### Step 1. Download `bash-completion`
```
cd ~
wget https://raw.githubusercontent.com/microsoft/nni/{nni-version}/tools/bash-completion
```
Here, {nni-version} should be replaced by the version of NNI, e.g., `master`, `v1.9`. You can also check the latest `bash-completion` script [here](https://github.com/microsoft/nni/blob/v1.9/tools/bash-completion).
### Step 2. Install the script
If you are running as root and want to install this script for all users
```
install -m644 ~/bash-completion /usr/share/bash-completion/completions/nnictl
```
If you just want to install this script for yourself
```
mkdir -p ~/.bash_completion.d
install -m644 ~/bash-completion ~/.bash_completion.d/nnictl
echo '[[ -f ~/.bash_completion.d/nnictl ]] && source ~/.bash_completion.d/nnictl' >> ~/.bash_completion
```
### Step 3. Reopen your terminal
Reopen your terminal and you should be able to use the auto-completion feature. Enjoy!
### Step 4. Uninstall
If you want to uninstall this feature, just revert the changes in the steps above.
# Hyper Parameter Optimization Comparison
*Posted by Anonymous Author*
Comparison of Hyperparameter Optimization (HPO) algorithms on several problems.
The hyperparameter optimization algorithms compared are listed below:
- [Random Search](../Tuner/BuiltinTuner.md)
- [Grid Search](../Tuner/BuiltinTuner.md)
- [Evolution](../Tuner/BuiltinTuner.md)
- [Anneal](../Tuner/BuiltinTuner.md)
- [Metis](../Tuner/BuiltinTuner.md)
- [TPE](../Tuner/BuiltinTuner.md)
- [SMAC](../Tuner/BuiltinTuner.md)
- [HyperBand](../Tuner/BuiltinTuner.md)
- [BOHB](../Tuner/BuiltinTuner.md)
All algorithms run in NNI local environment.
Machine Environment:
```
OS: Linux Ubuntu 16.04 LTS
CPU: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz 2600 MHz
Memory: 112 GB
NNI Version: v0.7
NNI Mode(local|pai|remote): local
Python version: 3.6
Is conda or virtualenv used?: Conda
is running in docker?: no
```
## AutoGBDT Example
### Problem Description
Nonconvex problem on the hyper-parameter search of [AutoGBDT](../TrialExample/GbdtExample.md) example.
### Search Space
```json
{
"num_leaves": {
"_type": "choice",
"_value": [10, 12, 14, 16, 18, 20, 22, 24, 28, 32, 48, 64, 96, 128]
},
"learning_rate": {
"_type": "choice",
"_value": [0.00001, 0.0001, 0.001, 0.01, 0.05, 0.1, 0.2, 0.5]
},
"max_depth": {
"_type": "choice",
"_value": [-1, 2, 3, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 28, 32, 48, 64, 96, 128]
},
"feature_fraction": {
"_type": "choice",
"_value": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
},
"bagging_fraction": {
"_type": "choice",
"_value": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
},
"bagging_freq": {
"_type": "choice",
"_value": [1, 2, 4, 8, 10, 12, 14, 16]
}
}
```
The total search space size is 1,204,224; we set the maximum number of trials to 1,000 and the time limit to 48 hours.
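For reference, a trial consuming such a search space might look roughly like the sketch below; the synthetic regression data and the metric (validation MSE) are placeholders, not the actual benchmark setup used in the experiments:
```python
import numpy as np
import lightgbm as lgb
import nni

params = nni.get_next_parameter()  # one sampled point from the search space above

# Placeholder data; the real experiment uses the AutoGBDT example's dataset.
X = np.random.rand(2000, 20)
y = X[:, 0] * 2 + np.random.rand(2000)
X_train, y_train = X[:1500], y[:1500]
X_valid, y_valid = X[1500:], y[1500:]

lgb_params = {
    'objective': 'regression',
    'num_leaves': int(params['num_leaves']),
    'learning_rate': params['learning_rate'],
    'max_depth': int(params['max_depth']),
    'feature_fraction': params['feature_fraction'],
    'bagging_fraction': params['bagging_fraction'],
    'bagging_freq': int(params['bagging_freq']),
    'verbosity': -1,
}

booster = lgb.train(lgb_params, lgb.Dataset(X_train, label=y_train), num_boost_round=100)
valid_mse = float(np.mean((booster.predict(X_valid) - y_valid) ** 2))
nni.report_final_result(valid_mse)  # the tuner optimizes this metric
```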
### Results
| Algorithm | Best loss | Average of Best 5 Losses | Average of Best 10 Losses |
| ------------- | ------------ | ------------- | ------------- |
| Random Search |0.418854|0.420352|0.421553|
| Random Search |0.417364|0.420024|0.420997|
| Random Search |0.417861|0.419744|0.420642|
| Grid Search |0.498166|0.498166|0.498166|
| Evolution |0.409887|0.409887|0.409887|
| Evolution |0.413620|0.413875|0.414067|
| Evolution |0.409887|0.409887|0.409887|
| Anneal |0.414877|0.417289|0.418281|
| Anneal |0.409887|0.409887|0.410118|
| Anneal |0.413683|0.416949|0.417537|
| Metis |0.416273|0.420411|0.422380|
| Metis |0.420262|0.423175|0.424816|
| Metis |0.421027|0.424172|0.425714|
| TPE |0.414478|0.414478|0.414478|
| TPE |0.415077|0.417986|0.418797|
| TPE |0.415077|0.417009|0.418053|
| SMAC |**0.408386**|**0.408386**|**0.408386**|
| SMAC |0.414012|0.414012|0.414012|
| SMAC |**0.408386**|**0.408386**|**0.408386**|
| BOHB |0.410464|0.415319|0.417755|
| BOHB |0.418995|0.420268|0.422604|
| BOHB |0.415149|0.418072|0.418932|
| HyperBand |0.414065|0.415222|0.417628|
| HyperBand |0.416807|0.417549|0.418828|
| HyperBand |0.415550|0.415977|0.417186|
| GP |0.414353|0.418563|0.420263|
| GP |0.414395|0.418006|0.420431|
| GP |0.412943|0.416566|0.418443|
In this example, all the algorithms are used with their default parameters. For Metis, there are only about 300 trials because it runs slowly due to the high time complexity, O(n^3), of its Gaussian Process.
## RocksDB Benchmark 'fillrandom' and 'readrandom'
### Problem Description
[DB_Bench](https://github.com/facebook/rocksdb/wiki/Benchmarking-tools) is the main tool used to benchmark [RocksDB](https://rocksdb.org/)'s performance. It has many hyperparameters to tune.
The performance of `DB_Bench` depends on the machine configuration and installation method. We run `DB_Bench` on a Linux machine and install RocksDB as a shared library.
#### Machine configuration
```
RocksDB: version 6.1
CPU: 6 * Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
CPUCache: 35840 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
```
#### Storage performance
**Latency**: each IO request takes some time to complete; this is called the average latency. Several factors affect this time, including network connection quality and hard disk IO performance.
**IOPS**: **IO operations per second**, i.e., the number of _read or write operations_ that can be completed in one second.
**IO size**: **the size of each IO request**. Depending on the operating system and the application/service that needs disk access, it will issue a request to read or write a certain amount of data at a time.
**Throughput (in MB/s) = Average IO size × IOPS**. For example, with an average IO size of 4 KB and 100,000 IOPS, the throughput is about 400 MB/s.
IOPS is related to online processing ability, so we use IOPS as the metric in our experiments.
### Search Space
```json
{
"max_background_compactions": {
"_type": "quniform",
"_value": [1, 256, 1]
},
"block_size": {
"_type": "quniform",
"_value": [1, 500000, 1]
},
"write_buffer_size": {
"_type": "quniform",
"_value": [1, 130000000, 1]
},
"max_write_buffer_number": {
"_type": "quniform",
"_value": [1, 128, 1]
},
"min_write_buffer_number_to_merge": {
"_type": "quniform",
"_value": [1, 32, 1]
},
"level0_file_num_compaction_trigger": {
"_type": "quniform",
"_value": [1, 256, 1]
},
"level0_slowdown_writes_trigger": {
"_type": "quniform",
"_value": [1, 1024, 1]
},
"level0_stop_writes_trigger": {
"_type": "quniform",
"_value": [1, 1024, 1]
},
"cache_size": {
"_type": "quniform",
"_value": [1, 30000000, 1]
},
"compaction_readahead_size": {
"_type": "quniform",
"_value": [1, 30000000, 1]
},
"new_table_reader_for_compaction_inputs": {
"_type": "randint",
"_value": [1]
}
}
```
The search space is enormous (about 10^40), so we set the maximum number of trials to 100 to limit the computation resources.
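A trial for this benchmark essentially translates the sampled parameters into `db_bench` command-line flags and reports the achieved IOPS. The sketch below illustrates the idea; the flag names are assumed to mirror the option names above, and the output parsing depends on the `db_bench` version, so treat it as a rough outline rather than the exact experiment code:
```python
import subprocess
import nni

params = nni.get_next_parameter()

# Assumed mapping: each tuned option becomes a --<name>=<value> flag of db_bench.
cmd = ['db_bench', '--benchmarks=fillrandom', '--num=1000000']
cmd += ['--{}={}'.format(name, int(value)) for name, value in params.items()]

output = subprocess.run(cmd, capture_output=True, text=True).stdout

# db_bench prints a summary line containing "ops/sec" for the benchmark;
# extract that number and report it as the trial's final metric (IOPS).
for line in output.splitlines():
    if 'ops/sec' in line:
        iops = float(line.split('ops/sec')[0].strip().split()[-1])
        nni.report_final_result(iops)
        break
```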
### Results
#### 'fillrandom' Benchmark
| Model | Best IOPS (Repeat 1) | Best IOPS (Repeat 2) | Best IOPS (Repeat 3) |
| --------- | -------------------- | -------------------- | -------------------- |
| Random | 449901 | 427620 | 477174 |
| Anneal | 461896 | 467150 | 437528 |
| Evolution | 436755 | 389956 | 389790 |
| TPE | 378346 | 482316 | 468989 |
| SMAC | 491067 | 490472 | **491136** |
| Metis | 444920 | 457060 | 454438 |
Figure:
![](../../img/hpo_rocksdb_fillrandom.png)
#### 'readrandom' Benchmark
| Model | Best IOPS (Repeat 1) | Best IOPS (Repeat 2) | Best IOPS (Repeat 3) |
| --------- | -------------------- | -------------------- | -------------------- |
| Random | 2276157 | 2285301 | 2275142 |
| Anneal | 2286330 | 2282229 | 2284012 |
| Evolution | 2286524 | 2283673 | 2283558 |
| TPE | 2287366 | 2282865 | 2281891 |
| SMAC | 2270874 | 2284904 | 2282266 |
| Metis | **2287696** | 2283496 | 2277701 |
Figure:
![](../../img/hpo_rocksdb_readrandom.png)
# Comparison of Filter Pruning Algorithms
To provide an initial insight into the performance of various filter pruning algorithms,
we conduct extensive experiments with various pruning algorithms on some benchmark models and datasets.
We present the experiment result in this document.
In addition, we provide friendly instructions on the re-implementation of these experiments to facilitate further contributions to this effort.
## Experiment Setting
The experiments are performed with the following pruners/datasets/models:
* Models: [VGG16, ResNet18, ResNet50](https://github.com/microsoft/nni/tree/v1.9/examples/model_compress/models/cifar10)
* Datasets: CIFAR-10
* Pruners:
  - These pruners are included:
    - Pruners with scheduling: `SimulatedAnnealing Pruner`, `NetAdapt Pruner`, `AutoCompress Pruner`.
      Given the overall sparsity requirement, these pruners can automatically generate a sparsity distribution among the different layers.
    - One-shot pruners: `L1Filter Pruner`, `L2Filter Pruner`, `FPGM Pruner`.
      The sparsity of each layer is set the same as the overall sparsity in this experiment.
  - Only **filter pruning** performances are compared here.
    For the pruners with scheduling, `L1Filter Pruner` is used as the base algorithm. That is to say, after the sparsity distribution is decided by the scheduling algorithm, `L1Filter Pruner` is used to perform the real pruning (a minimal usage sketch follows this list).
  - All the pruners listed above are implemented in [nni](https://github.com/microsoft/nni/tree/v1.9/docs/en_US/Compression/Overview.md).
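As a reference for how the one-shot pruners are invoked with their default class arguments, here is a minimal sketch using the NNI v1.x compression API; the toy model below stands in for the VGG16/ResNet models used in the experiments:
```python
import torch.nn as nn
from nni.compression.torch import L1FilterPruner  # NNI v1.x import path

# A toy convolutional model standing in for VGG16 / ResNet18 / ResNet50.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
)

# Only the target sparsity and the op types are specified; all other
# class arguments are left at their defaults, as in the experiments.
config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]
pruner = L1FilterPruner(model, config_list)
model = pruner.compress()
```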
## Experiment Result
For each dataset/model/pruner combination, we prune the model to different levels by setting a series of target sparsities for the pruner.
Here we plot both **Number of Weights - Performances** curve and **FLOPs - Performance** curve.
As a reference, we also plot the result declared in the paper [AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates](http://arxiv.org/abs/1907.03141) for models VGG16 and ResNet18 on CIFAR-10.
The experiment results are shown in the following figures:
CIFAR-10, VGG16:
![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_vgg16.png)
CIFAR-10, ResNet18:
![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet18.png)
CIFAR-10, ResNet50:
![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet50.png)
## Analysis
From the experiment results, we draw the following conclusions:
* Given the constraint on the number of parameters, the pruners with scheduling (`AutoCompress Pruner`, `SimulatedAnnealing Pruner`) perform better than the others when the constraint is strict. However, they have no such advantage in the FLOPs/performance comparison, since only the number-of-parameters constraint is considered in the optimization process;
* The basic algorithms `L1Filter Pruner`, `L2Filter Pruner`, and `FPGM Pruner` perform very similarly in these experiments;
* `NetAdapt Pruner` cannot achieve a very high compression rate. This is caused by its mechanism of pruning only one layer per pruning iteration, which leads to unacceptable complexity if the sparsity per iteration is much lower than the overall sparsity constraint.
## Experiments Reproduction
### Implementation Details
* The experiment results are all collected with the default configuration of the pruners in nni, which means that when we call a pruner class in nni, we don't change any default class arguments.
* Both FLOPs and the number of parameters are counted with [Model FLOPs/Parameters Counter](https://github.com/microsoft/nni/tree/v1.9/docs/en_US/Compression/CompressionUtils.md#model-flopsparameters-counter) after [model speed up](https://github.com/microsoft/nni/tree/v1.9/docs/en_US/Compression/ModelSpeedup.md).
This avoids the potential issues of counting them on masked models.
* The experiment code can be found [here]( https://github.com/microsoft/nni/tree/v1.9/examples/model_compress/auto_pruners_torch.py).
### Experiment Result Rendering
* If you follow the practice in the [example]( https://github.com/microsoft/nni/tree/v1.9/examples/model_compress/auto_pruners_torch.py), for every single pruning experiment, the experiment result will be saved in JSON format as follows:
``` json
{
"performance": {"original": 0.9298, "pruned": 0.1, "speedup": 0.1, "finetuned": 0.7746},
"params": {"original": 14987722.0, "speedup": 167089.0},
"flops": {"original": 314018314.0, "speedup": 38589922.0}
}
```
* The experiment results are saved [here](https://github.com/microsoft/nni/tree/v1.9/examples/model_compress/comparison_of_pruners).
You can refer to [analyze](https://github.com/microsoft/nni/tree/v1.9/examples/model_compress/comparison_of_pruners/analyze.py) to plot new performance comparison figures.
## Contribution
### TODO Items
* Pruners constrained by FLOPS/latency
* More pruning algorithms/datasets/models
### Issues
For algorithm implementation & experiment issues, please [create an issue](https://github.com/microsoft/nni/issues/new/).
# NNI review article from Zhihu: <an open source project with highly reasonable design> - By Garvin Li
The article is by an NNI user on the Zhihu forum. In the article, Garvin shared his experience of using NNI for automatic feature engineering. We think this article is very useful for users who are interested in using NNI for feature engineering. With the author's permission, we translated the original article into English.
**原文(source)**: [如何看待微软最新发布的AutoML平台NNI?By Garvin Li](https://www.zhihu.com/question/297982959/answer/964961829?utm_source=wechat_session&utm_medium=social&utm_oi=28812108627968&from=singlemessage&isappinstalled=0)
## 01 Overview of AutoML
In the author's opinion, AutoML is not only about hyperparameter optimization, but
also a process that can target various stages of the machine learning process,
including feature engineering, NAS, HPO, etc.
## 02 Overview of NNI
NNI (Neural Network Intelligence) is an open source AutoML toolkit from
Microsoft, to help users design and tune machine learning models, neural network
architectures, or a complex system’s parameters in an efficient and automatic
way.
Link:[ https://github.com/Microsoft/nni](https://github.com/Microsoft/nni)
In general, most Microsoft tools have one prominent characteristic: the design is highly reasonable (regardless of the degree of technological innovation).
NNI's AutoFeatureENG basically meets all user requirements of AutoFeatureENG
with a very reasonable underlying framework design.
## 03 Details of NNI-AutoFeatureENG
>The article is following the github project: [https://github.com/SpongebBob/tabular_automl_NNI](https://github.com/SpongebBob/tabular_automl_NNI).
Every new user can do AutoFeatureENG with NNI easily and efficiently. To explore the AutoFeatureENG capability, download the required files as follows, and then install NNI through pip.
![](https://pic3.zhimg.com/v2-8886eea730cad25f5ac06ef1897cd7e4_r.jpg)
NNI treats AutoFeatureENG as a two-step task: feature generation exploration and feature selection. Feature generation exploration is mainly about feature derivation and high-order feature combination.
## 04 Feature Exploration
For feature derivation, NNI offers many operations which can automatically generate new features, listed [as follows](https://github.com/SpongebBob/tabular_automl_NNI/blob/master/AutoFEOp.md):
**count**: Count encoding is based on replacing categories with their counts computed on the train set, also named frequency encoding.
**target**: Target encoding is based on encoding categorical variable values with the mean of target variable per value.
**embedding**: Regard features as sentences, generate vectors using *Word2Vec.*
**crosscout**: Count encoding on more than one dimension, similar to CTR (Click Through Rate).
**aggregete**: Decide the aggregation functions of the features, including min/max/mean/var.
**nunique**: Statistics of the number of unique features.
**histsta**: Statistics of feature buckets, like histogram statistics.
The search space can be defined in a **JSON file**: it defines how specific features intersect, which two columns intersect, and how features are generated from the corresponding columns.
![](https://pic1.zhimg.com/v2-3c3eeec6eea9821e067412725e5d2317_r.jpg)
The picture shows us the procedure of defining the search space. NNI provides count encoding as a 1-order op, as well as cross count encoding and aggregate statistics (min, max, var, mean, median, nunique) as 2-order ops.
For example, if we want to search for features that are frequency-encoded (value count) features on the columns named {"C1", ..., "C26"}, we can define it in the following way:
![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%203.jpg)
Similarly, we can define a cross frequency encoding (value count on crossed dimensions) method on columns {"C1",...,"C26"} x {"C1",...,"C26"} in the following way:
![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%204.jpg)
The purpose of exploration is to generate new features. You can use the **get_next_parameter** function to get the feature candidates received for one trial.
>RECEIVED_PARAMS = nni.get_next_parameter()
## 05 Feature selection
To avoid feature explosion and overfitting, feature selection is necessary. In the feature selection of NNI-AutoFeatureENG, LightGBM (Light Gradient Boosting Machine), a gradient boosting framework developed by Microsoft, is mainly promoted.
![](https://pic2.zhimg.com/v2-7bf9c6ae1303692101a911def478a172_r.jpg)
If you have used **XGBoost** or **GBDT**, you would know that tree-based algorithms can easily calculate the importance of each feature with respect to the results. This makes LightGBM naturally suited for feature selection.
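As a rough sketch of what tree-based feature selection looks like (illustrative only, not the exact code of the tabular_automl_NNI project; the random data is a placeholder):
```python
import numpy as np
import lightgbm as lgb

# Placeholder data: 26 candidate feature columns, binary target.
X = np.random.rand(1000, 26)
y = np.random.randint(0, 2, size=1000)

booster = lgb.train({'objective': 'binary', 'verbosity': -1},
                    lgb.Dataset(X, label=y), num_boost_round=50)

# Rank features by how often they are used in splits and keep the top 10.
importance = booster.feature_importance(importance_type='split')
top_feature_indices = np.argsort(importance)[::-1][:10]
```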
The issue is that the selected features might be applicable to *GBDT* (Gradient Boosting Decision Tree), but not to linear algorithms like *LR* (Logistic Regression).
![](https://pic4.zhimg.com/v2-d2f919497b0ed937acad0577f7a8df83_r.jpg)
## 06 Summary
NNI's AutoFeatureEng sets a well-established standard, showing us the operation procedure and the available modules, and it is highly convenient to use. However, a simple model is probably not enough for good results.
## Suggestions to NNI
About exploration: it would be better to consider using a DNN (like xDeepFM) to extract high-order features.
About selection: there could be more intelligent options, such as an automatic selection system based on downstream models.
Conclusion: NNI can offer users some design inspiration, and it is a good open source project. I suggest researchers leverage it to accelerate AI research.
Tips: Because the scripts of the open source project are compiled based on gcc 7, Mac systems may encounter problems with gcc (GNU Compiler Collection). The solution is to run `brew install libomp`.