Unverified commit 4f66d0c1, authored by SparkSnail, committed by GitHub

Merge pull request #229 from microsoft/master

merge master
parents 4132f620 049634f7
......@@ -88,6 +88,7 @@ Within the following table, we summarized the current NNI capabilities, we are g
<li><a href="docs/en_US/TrialExample/GbdtExample.md">Auto-gbdt</a></li>
<li><a href="docs/en_US/TrialExample/Cifar10Examples.md">Cifar10-pytorch</a></li>
<li><a href="docs/en_US/TrialExample/SklearnExamples.md">Scikit-learn</a></li>
<li><a href="docs/en_US/TrialExample/EfficientNet.md">EfficientNet</a></li>
<a href="docs/en_US/SupportedFramework_Library.md">More...</a><br/>
</ul>
</ul>
......@@ -126,6 +127,7 @@ Within the following table, we summarized the current NNI capabilities, we are g
<li><a href="docs/en_US/NAS/Overview.md#enas">ENAS</a></li>
<li><a href="docs/en_US/NAS/Overview.md#darts">DARTS</a></li>
<li><a href="docs/en_US/NAS/Overview.md#p-darts">P-DARTS</a></li>
<li><a href="docs/en_US/NAS/Overview.md#cdarts">CDARTS</a></li>
<li><a href="docs/en_US/Tuner/BuiltinTuner.md#NetworkMorphism">Network Morphism</a> </li>
</ul>
</ul>
......@@ -196,13 +198,13 @@ Within the following table, we summarized the current NNI capabilities, we are g
</tbody>
</table>
## **Install & Verify**
## **Installation**
**Install through pip**
### **Install**
* We support Linux, MacOS and Windows (local, remote and pai mode) in current stage, Ubuntu 16.04 or higher, MacOS 10.14.1 along with Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
NNI supports and is tested on Ubuntu >= 16.04, macOS >= 10.14.1, and Windows 10 >= 1809. Simply run the following `pip install` in an environment that has `python 64-bit >= 3.5`.
Linux and MacOS
Linux or macOS
```bash
python3 -m pip install --upgrade nni
......@@ -214,65 +216,39 @@ Windows
python -m pip install --upgrade nni
```
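A quick way to confirm the `python 64-bit >= 3.5` requirement before installing is to ask the interpreter itself. The snippet below is a minimal sketch; `meets_nni_requirements` is a hypothetical helper name, not part of NNI.

```python
import struct
import sys


def meets_nni_requirements(version=None, pointer_bits=None):
    """Return True for a 64-bit interpreter at version 3.5 or newer.

    Both arguments default to the running interpreter; they exist only
    so the check can be exercised with other values.
    """
    if version is None:
        version = sys.version_info[:2]
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit Python build.
        pointer_bits = struct.calcsize("P") * 8
    return tuple(version) >= (3, 5) and pointer_bits == 64


if __name__ == "__main__":
    print("OK to install NNI" if meets_nni_requirements() else "Unsupported Python")
```

Run it with the same `python` or `python3` executable that you pass to `-m pip`, so that the check and the install target the same environment.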
Note:
* `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* Currently NNI on Windows support local, remote and pai mode. Anaconda or Miniconda is highly recommended to install NNI on Windows.
* If there is any error like `Segmentation fault`, please refer to [FAQ](docs/en_US/Tutorial/FAQ.md)
**Install through source code**
* We support Linux (Ubuntu 16.04 or higher), MacOS (10.14.1) and Windows (10.1809) in our current stage.
If you want to try latest code, please [install NNI](docs/en_US/Tutorial/Installation.md) from source code.
Linux and MacOS
For the detailed system requirements of NNI, please refer to [here](docs/en_US/Tutorial/Installation.md#system-requirements).
* Run the following commands in an environment that has `python >= 3.5`, `git` and `wget`.
```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
cd nni
source install.sh
```
Windows
* Run the following commands in an environment that has `python >=3.5`, `git` and `PowerShell`
```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
cd nni
powershell -ExecutionPolicy Bypass -file install.ps1
```
For the system requirements of NNI, please refer to [Install NNI](docs/en_US/Tutorial/Installation.md)
Note:
For NNI on Windows, please refer to [NNI on Windows](docs/en_US/Tutorial/NniOnWindows.md)
* If there is any privilege issue, add `--user` to install NNI in the user directory.
* Currently NNI on Windows supports local, remote and pai mode. Anaconda or Miniconda is highly recommended to install NNI on Windows.
* If there is any error like `Segmentation fault`, please refer to [FAQ](docs/en_US/Tutorial/FAQ.md). For FAQ on Windows, please refer to [NNI on Windows](docs/en_US/Tutorial/NniOnWindows.md).
**Verify install**
### **Verify installation**
The following example is an experiment built on TensorFlow. Make sure you have **TensorFlow 1.x installed** before running it. Note that **currently Tensorflow 2.0 is NOT supported**.
The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is used** when running it.
* Download the examples by cloning the source code.
```bash
```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
```
Linux and MacOS
```
* Run the MNIST example.
```bash
nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
```
Linux or macOS
Windows
```bash
nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
```
* Run the MNIST example.
Windows
```bash
```bash
nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
```
```
* Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the `Web UI url`.
......@@ -322,9 +298,10 @@ When you submit a pull request, a CLA-bot will automatically determine whether y
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the Code of [Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact opencode@microsoft.com with any additional questions or comments.
After getting familiar with the contribution agreements, you are ready to create your first PR =). Follow the NNI developer tutorials to get started:
* We recommend that new contributors start with ['good first issue'](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) or ['help-wanted'](https://github.com/microsoft/nni/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22); these issues are simple and easy to start with.
* We recommend that new contributors start with simple issues: ['good first issue'](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) or ['help-wanted'](https://github.com/microsoft/nni/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22).
* [NNI developer environment installation tutorial](docs/en_US/Tutorial/SetupNniDeveloperEnvironment.md)
* [How to debug](docs/en_US/Tutorial/HowToDebug.md)
* If you have any questions on usage, review the [FAQ](https://github.com/microsoft/nni/blob/master/docs/en_US/Tutorial/FAQ.md) first. If there are no relevant issues or answers to your question, try contacting the NNI dev team and users on [Gitter](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) or [file an issue](https://github.com/microsoft/nni/issues/new/choose) on GitHub.
* [Customize your own Tuner](docs/en_US/Tuner/CustomizeTuner.md)
* [Implement customized TrainingService](docs/en_US/TrainingService/HowToImplementTrainingService.md)
* [Implement a new NAS trainer on NNI](https://github.com/microsoft/nni/blob/master/docs/en_US/NAS/NasInterface.md#implement-a-new-nas-trainer-on-nni)
......@@ -368,4 +345,3 @@ We encourage researchers and students leverage these projects to accelerate the
## **License**
The entire codebase is under [MIT license](LICENSE)
......@@ -4,7 +4,7 @@
* * *
[![MIT license](https://img.shields.io/badge/license-MIT-brightgreen.svg)](LICENSE) [![Build status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/Microsoft.nni)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=6) [![Issues](https://img.shields.io/github/issues-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen) [![Bug](https://img.shields.io/github/issues/Microsoft/nni/bug.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3Abug) [![Pull requests](https://img.shields.io/github/issues-pr-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/pulls?q=is%3Apr+is%3Aopen) [![Release](https://img.shields.io/github/release/Microsoft/nni.svg)](https://github.com/Microsoft/nni/releases) [![Join the chat at https://gitter.im/Microsoft/nni to ask questions](https://badges.gitter.im/Microsoft/nni.svg)](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![Documentation status](https://readthedocs.org/projects/nni/badge/?version=latest)](https://nni.readthedocs.io/zh/latest/?badge=latest)
[![MIT license](https://img.shields.io/badge/license-MIT-brightgreen.svg)](LICENSE) [![Build status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration-test-local?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=17&branchName=master) [![Issues](https://img.shields.io/github/issues-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen) [![Bug](https://img.shields.io/github/issues/Microsoft/nni/bug.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3Abug) [![Pull requests](https://img.shields.io/github/issues-pr-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/pulls?q=is%3Apr+is%3Aopen) [![Release](https://img.shields.io/github/release/Microsoft/nni.svg)](https://github.com/Microsoft/nni/releases) [![Join the chat at https://gitter.im/Microsoft/nni to ask questions](https://badges.gitter.im/Microsoft/nni.svg)](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![Documentation status](https://readthedocs.org/projects/nni/badge/?version=latest)](https://nni.readthedocs.io/zh/latest/?badge=latest)
[English](README.md)
......@@ -83,6 +83,7 @@ NNI provides a command-line tool and a friendly WebUI to manage training Experiments.
<li><a href="docs/zh_CN/TrialExample/GbdtExample.md">Auto-gbdt</a></li>
<li><a href="docs/zh_CN/TrialExample/Cifar10Examples.md">Cifar10-pytorch</a></li>
<li><a href="docs/zh_CN/TrialExample/SklearnExamples.md">Scikit-learn</a></li>
<li><a href="docs/zh_CN/TrialExample/EfficientNet.md">EfficientNet</a></li>
<a href="docs/zh_CN/SupportedFramework_Library.md">More...</a><br/>
</ul>
</ul>
......@@ -121,6 +122,7 @@ NNI provides a command-line tool and a friendly WebUI to manage training Experiments.
<li><a href="docs/zh_CN/NAS/Overview.md#enas">ENAS</a></li>
<li><a href="docs/zh_CN/NAS/Overview.md#darts">DARTS</a></li>
<li><a href="docs/zh_CN/NAS/Overview.md#p-darts">P-DARTS</a></li>
<li><a href="docs/zh_CN/NAS/Overview.md#cdarts">CDARTS</a></li>
<li><a href="docs/zh_CN/Tuner/BuiltinTuner.md#NetworkMorphism">Network Morphism</a> </li>
</ul>
</ul>
......@@ -191,13 +193,13 @@ NNI provides a command-line tool and a friendly WebUI to manage training Experiments.
</tbody>
</table>
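## **Install & Verify**
## **Installation**
**Install through pip**
### **Install**
* Linux, MacOS and Windows (local, remote and OpenPAI modes) are currently supported; Ubuntu 16.04 or higher, MacOS 10.14.1 and Windows 10.1809 have been tested. In an environment that has `python >= 3.5`, simply run `pip install` to complete the installation.
NNI supports and is tested on Ubuntu >= 16.04, macOS >= 10.14.1, and Windows 10 >= 1809. In an environment that has `python 64-bit >= 3.5`, simply run `pip install` to complete the installation.
Linux and macOS
Linux or macOS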
```bash
python3 -m pip install --upgrade nni
......@@ -209,65 +211,39 @@ Windows
python -m pip install --upgrade nni
```
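Note:
* To install NNI in your home directory, add `--user`; this does not require any special privileges.
* Currently NNI on Windows supports local, remote and OpenPAI modes. Anaconda or Miniconda is highly recommended for installing NNI on Windows.
* If you hit any error such as `Segmentation fault`, please refer to the [FAQ](docs/zh_CN/Tutorial/FAQ.md)
**Install through source code**
If you want to try the latest code, you can [install NNI](docs/zh_CN/Tutorial/Installation.md) from source code.
* Linux (Ubuntu 16.04 or higher), MacOS (10.14.1) and Windows 10 (version 1809) are currently supported.
Linux and MacOS
* Run the commands in an environment that has `python >= 3.5`; make sure `git` and `wget` are both installed.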
```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
cd nni
source install.sh
```
Windows
* Run the commands in an environment that has `python >=3.5`; make sure `git` and `PowerShell` are both installed.
```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
cd nni
powershell -ExecutionPolicy Bypass -file install.ps1
```
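For the detailed system requirements of NNI, refer to [here](docs/zh_CN/Tutorial/Installation.md#system-requirements)
Refer to [Install NNI](docs/zh_CN/Tutorial/Installation.md) for the system requirements.
Note:
For Windows, refer to [NNI on Windows](docs/zh_CN/Tutorial/NniOnWindows.md)
* If you hit any privilege issue, add `--user` to install NNI in the user directory.
* Currently NNI on Windows supports local, remote and OpenPAI modes. Anaconda or Miniconda is highly recommended for installing NNI on Windows.
* If you hit an error such as `Segmentation fault`, refer to the [FAQ](docs/zh_CN/Tutorial/FAQ.md). For the FAQ on Windows, refer to [NNI on Windows](docs/zh_CN/Tutorial/NniOnWindows.md)
**Verify installation**
### **Verify installation**
The following example Experiment depends on TensorFlow. Make sure **TensorFlow 1.x** is installed before running it. Note that **TensorFlow 2.0 is currently not supported**.
The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x** is used in the runtime environment.
* Download the examples by cloning the source code.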
```bash
```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
```
Linux and MacOS
```
* Run the MNIST example.
```bash
Linux or macOS
```bash
nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
```
```
Windows
Windows
* Run the MNIST example.
```bash
```bash
nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
```
```
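* Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that the Experiment has started successfully. Use the `Web UI url` printed in the command line to access the Experiment's interface.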
......@@ -319,11 +295,12 @@ You can use these commands to get more information about the experiment
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information, see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/), or contact opencode@microsoft.com with any questions or comments.
Once you are familiar with the contribution agreement, you can follow the NNI developer tutorials to create your first PR =)
Once you are familiar with the contribution agreement, you can follow the NNI developer tutorials to create your first PR:
* New contributors are encouraged to first look for issues labeled ['good first issue'](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) or ['help-wanted'](https://github.com/microsoft/nni/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22). These are relatively simple and are a good place to start.
* New contributors are encouraged to start with simple issues: ['good first issue'](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) or ['help-wanted'](https://github.com/microsoft/nni/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22)
* [NNI developer environment installation tutorial](docs/zh_CN/Tutorial/SetupNniDeveloperEnvironment.md)
* [How to debug](docs/zh_CN/Tutorial/HowToDebug.md)
* If you have questions about usage, check the [FAQ](https://github.com/microsoft/nni/blob/master/docs/zh_CN/Tutorial/FAQ.md) first. If that does not resolve the problem, contact the NNI dev team via [Gitter](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) or [file an issue](https://github.com/microsoft/nni/issues/new/choose) on GitHub
* [Customize your own Tuner](docs/zh_CN/Tuner/CustomizeTuner.md)
* [Implement a customized TrainingService](docs/zh_CN/TrainingService/HowToImplementTrainingService.md)
* [Implement a new NAS trainer on NNI](https://github.com/microsoft/nni/blob/master/docs/zh_CN/NAS/NasInterface.md#implement-a-new-nas-trainer-on-nni)
......@@ -349,7 +326,7 @@ You can use these commands to get more information about the experiment
* [Automatically tuning SPTAG with NNI](docs/zh_CN/CommunitySharings/SptagAutoTune.md)
* [Finding hyperparameters for scikit-learn with NNI](https://towardsdatascience.com/find-thy-hyper-parameters-for-scikit-learn-pipelines-using-microsoft-nni-f1015b1224c1)
* **Blog** - [A comparison of AutoML tools (Advisor, NNI and Google Vizier)](http://gaocegege.com/Blog/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/katib-new#%E6%80%BB%E7%BB%93%E4%B8%8E%E5%88%86%E6%9E%90) by [@gaocegege](https://github.com/gaocegege) - the summary and analysis section of the design and implementation of kubeflow/katib
* **Blog (Chinese)** - [Summary of new NNI features in 2019](https://mp.weixin.qq.com/s/7_KRT-rRojQbNuJzkjFMuA) by @squirrelsc
* **Blog** - [Summary of new NNI features in 2019](https://mp.weixin.qq.com/s/7_KRT-rRojQbNuJzkjFMuA) by @squirrelsc
## **Feedback**
......
# NNI review article from Zhihu: <an open source project with highly reasonable design> - By Garvin Li
This article was written by an NNI user on the Zhihu forum. In it, Garvin shares his experience of using NNI for automatic feature engineering. We think it is very useful for users who are interested in using NNI for feature engineering. With the author's permission, we translated the original article into English.
**原文(source)**: [如何看待微软最新发布的AutoML平台NNI?By Garvin Li](https://www.zhihu.com/question/297982959/answer/964961829?utm_source=wechat_session&utm_medium=social&utm_oi=28812108627968&from=singlemessage&isappinstalled=0)
## 01 Overview of AutoML
In the author's opinion, AutoML is not only about hyperparameter optimization; it is
a process that can target various stages of the machine learning workflow,
including feature engineering, NAS, HPO, etc.
## 02 Overview of NNI
NNI (Neural Network Intelligence) is an open source AutoML toolkit from
Microsoft, to help users design and tune machine learning models, neural network
architectures, or a complex system’s parameters in an efficient and automatic
way.
Link: [https://github.com/Microsoft/nni](https://github.com/Microsoft/nni)
In general, most Microsoft tools share one prominent characteristic: the
design is highly reasonable (regardless of the degree of technical innovation).
NNI's AutoFeatureENG basically meets all user requirements of AutoFeatureENG
with a very reasonable underlying framework design.
## 03 Details of NNI-AutoFeatureENG
> This article follows the GitHub project: [https://github.com/SpongebBob/tabular_automl_NNI](https://github.com/SpongebBob/tabular_automl_NNI).
Any new user can do AutoFeatureENG with NNI easily and efficiently. To explore the AutoFeatureENG capability, download the following required files, and then install NNI through pip.
![](https://pic3.zhimg.com/v2-8886eea730cad25f5ac06ef1897cd7e4_r.jpg)
NNI treats AutoFeatureENG as a two-step task: feature generation exploration and feature selection. Feature generation exploration is mainly about feature derivation and high-order feature combination.
## 04 Feature Exploration
For feature derivation, NNI offers many operations that automatically generate new features, listed [as follows](https://github.com/SpongebBob/tabular_automl_NNI/blob/master/AutoFEOp.md):
**count**: Count encoding replaces categories with their counts computed on the training set; it is also called frequency encoding.
**target**: Target encoding encodes each value of a categorical variable with the mean of the target variable for that value.
**embedding**: Regard features as sentences and generate vectors using *Word2Vec*.
**crosscount**: Count encoding on more than one dimension, similar to CTR (Click Through Rate).
**aggregate**: Decide the aggregation functions of the features, including min/max/mean/var.
**nunique**: Statistics on the number of unique values of a feature.
**histsta**: Statistics of feature buckets, like histogram statistics.
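To make the first two operations concrete, here is a minimal pure-Python sketch of count encoding and target encoding. The function names, the toy `colors`/`clicks` data and the `default` value for unseen categories are assumptions made for illustration; they are not taken from the project's code.

```python
from collections import Counter, defaultdict


def count_encode(train_values, values):
    """Replace each category with its frequency in the training set."""
    counts = Counter(train_values)
    return [counts.get(v, 0) for v in values]


def target_encode(train_values, train_targets, values, default=0.0):
    """Replace each category with the mean training target for that category."""
    sums = defaultdict(float)
    sizes = Counter()
    for v, t in zip(train_values, train_targets):
        sums[v] += t
        sizes[v] += 1
    # Categories never seen in training fall back to `default`.
    return [sums[v] / sizes[v] if v in sizes else default for v in values]


colors = ["red", "blue", "red", "green", "red", "blue"]
clicks = [1, 0, 1, 0, 0, 1]

print(count_encode(colors, ["red", "blue", "pink"]))  # [3, 2, 0]
print(target_encode(colors, clicks, ["red", "blue"]))
```

Both encoders are fitted on the training values only and then applied to any list of values, which is why they take the training data as a separate argument.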
The search space can be defined in a **JSON file**, which specifies how specific features intersect, which two columns intersect, and how features are generated from the corresponding columns.
![](https://pic1.zhimg.com/v2-3c3eeec6eea9821e067412725e5d2317_r.jpg)
The picture shows the procedure for defining a search space. NNI provides count encoding as a 1-order-op, and cross count encoding and aggregate statistics (min, max, var, mean, median, nunique) as 2-order-ops.
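The two 2-order operations can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the project's implementation: the function names and toy columns are hypothetical, and population variance is an arbitrary choice for "var".

```python
from collections import Counter, defaultdict
from statistics import mean, pvariance


def cross_count_encode(col_a, col_b):
    """Count encoding over the pair (a, b) instead of a single column."""
    pairs = list(zip(col_a, col_b))
    counts = Counter(pairs)
    return [counts[p] for p in pairs]


def aggregate_encode(numeric_col, group_col):
    """For each row, the min/max/mean/var of numeric_col within its group."""
    groups = defaultdict(list)
    for x, g in zip(numeric_col, group_col):
        groups[g].append(x)
    stats = {
        g: {"min": min(xs), "max": max(xs), "mean": mean(xs), "var": pvariance(xs)}
        for g, xs in groups.items()
    }
    return [stats[g] for g in group_col]


site = ["a", "a", "b", "a"]
device = ["ios", "ios", "ios", "web"]
price = [1.0, 3.0, 5.0, 7.0]

print(cross_count_encode(site, device))  # [2, 2, 1, 1]
print(aggregate_encode(price, site)[0])
```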
For example, to search over frequency encoding (value count) features on the columns named {"C1", ..., "C26"}, define the search space in the following way:
![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%203.jpg)
We can define a cross frequency encoding (value count on cross dims) method on columns {"C1",...,"C26"} x {"C1",...,"C26"} in the following way:
![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%204.jpg)
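Since the two definitions are only given as screenshots, the sketch below shows one way such a search space could be generated and serialized. The key names reuse the operation names from the list of operations; the exact layout (a flat column list for the 1-order op, one list per side of the cross product for the 2-order op) is an assumption, so check it against the project's own search space file before relying on it.

```python
import json

# Column names C1..C26, as in the example in the text.
columns = ["C{}".format(i) for i in range(1, 27)]

search_space = {
    # 1-order op: frequency (value count) encoding on each column.
    "count": columns,
    # 2-order op: cross frequency encoding on {C1..C26} x {C1..C26}.
    # Assumed layout: one list per side of the cross product.
    "crosscount": [columns, columns],
}

search_space_json = json.dumps(search_space, indent=2)
print(search_space_json[:80])
```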
The purpose of exploration is to generate new features. You can use the **get_next_parameter** function to receive the feature candidates for one trial.
>RECEIVED_PARAMS = nni.get_next_parameter()
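What the trial does with the received dictionary depends on the tuner. As a hedged sketch, assume the candidates arrive under a hypothetical `sample_feature` key as names of the form `<op>_<col>[_<col>]`; both the key and the naming scheme are assumptions for this example. To keep the snippet runnable without NNI, the dictionary is hard-coded where a real trial would call `nni.get_next_parameter()`.

```python
def split_candidates(received_params, key="sample_feature"):
    """Group candidate feature names by the operation that generates them.

    `key` and the `<op>_<col>[_<col>]` naming scheme are assumptions made
    for this sketch; adapt them to what your tuner actually sends.
    """
    by_op = {}
    for name in received_params.get(key, []):
        op, *cols = name.split("_")
        by_op.setdefault(op, []).append(tuple(cols))
    return by_op


# In a trial this dict would be: RECEIVED_PARAMS = nni.get_next_parameter()
RECEIVED_PARAMS = {"sample_feature": ["count_C1", "count_C7", "crosscount_C1_C2"]}

print(split_candidates(RECEIVED_PARAMS))
# {'count': [('C1',), ('C7',)], 'crosscount': [('C1', 'C2')]}
```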
## 05 Feature selection
To avoid feature explosion and overfitting, feature selection is necessary. For feature selection, NNI-AutoFeatureENG mainly relies on LightGBM (Light Gradient Boosting Machine), a gradient boosting framework developed by Microsoft.
![](https://pic2.zhimg.com/v2-7bf9c6ae1303692101a911def478a172_r.jpg)
If you have used **XGBoost** or **GBDT**, you will know that tree-based algorithms can easily calculate the importance of each feature to the result, so LightGBM can perform feature selection naturally.
The issue is that the selected features might suit *GBDT* (Gradient Boosting Decision Tree) but not a linear algorithm such as *LR* (Logistic Regression).
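Importance-based selection can be sketched without any model at all: given one importance score per feature (for example, scores read from a trained tree model), keep the highest-ranked ones. The function name, the `keep_ratio` knob and the scores below are hypothetical and only illustrate the idea.

```python
def select_top_features(importances, keep_ratio=0.5):
    """Keep the most important features.

    `importances` maps feature name -> importance score, e.g. scores read
    from a trained tree model. `keep_ratio` is a hypothetical knob.
    """
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    ranked = sorted(importances, key=importances.get, reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:keep]


scores = {"count_C1": 120.0, "C3": 15.0, "crosscount_C1_C2": 80.0, "I5": 2.0}
print(select_top_features(scores))  # ['count_C1', 'crosscount_C1_C2']
```

Because the ranking comes from a tree model, the caveat above still applies: features selected this way are not guaranteed to help a linear model.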
![](https://pic4.zhimg.com/v2-d2f919497b0ed937acad0577f7a8df83_r.jpg)
## 06 Summary
NNI's AutoFeatureEng sets a well-established standard, showing the operating procedure and the available modules, and it is highly convenient to use. However, a simple model is probably not enough for good results.
## Suggestions to NNI
About Exploration: It would be better to consider using a DNN (like xDeepFM) to extract high-order features.
About Selection: There could be more intelligent options, such as an automatic selection system based on downstream models.
Conclusion: NNI offers users some design inspiration, and it is a good open source project. I suggest researchers leverage it to accelerate AI research.
Tips: Because the scripts of the open source project are compiled with gcc7, macOS users may run into gcc (GNU Compiler Collection) problems. The solution is as follows:
`brew install libomp`
......@@ -13,3 +13,4 @@ In addtion to the official tutorilas and examples, we encourage community contri
Hyper-parameter Tuning Algorithm Comparison <HpoComparision>
Parallelizing Optimization for TPE <ParallelizingTpeSearch>
Automatically tune systems with NNI <TuningSystems>
NNI review article from Zhihu: - By Garvin Li <NNI_AutoFeatureEng>
# CDARTS
## Introduction
CDARTS builds a cyclic feedback mechanism between the search and evaluation networks. First, the search network generates an initial topology for evaluation, so that the weights of the evaluation network can be optimized. Second, the architecture topology in the search network is further optimized by the label supervision in classification, as well as the regularization from the evaluation network through feature distillation. Repeating the above cycle results in a joint optimization of the search and evaluation networks, and thus enables the evolution of the topology to fit the final evaluation network.
In the implementation of `CdartsTrainer`, two models and two mutators (one for each model) are instantiated first. The first model is the so-called "search network", which is mutated with a `RegularizedDartsMutator`, a mutator that differs subtly from `DartsMutator`. The second model is the "evaluation network", which is mutated with a discrete mutator that leverages the search network's mutator to sample a single path each time. Trainers train models and mutators alternately. Users can refer to the [reference](#reference) section if they are interested in more details on these trainers and mutators.
## Reproduction Results
This is CDARTS implemented on the NNI platform, which currently supports CIFAR10 search and retrain. ImageNet search and retrain should also be supported, and we provide the corresponding interfaces. Our reproduced results on NNI are slightly lower than those in the paper, but much higher than the original DARTS. Here we show the results of three independent experiments on CIFAR10.
| Runs | Paper | NNI |
| ---- |:-------------:| :-----:|
| 1 | 97.52 | 97.44 |
| 2 | 97.53 | 97.48 |
| 3 | 97.58 | 97.56 |
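To put "slightly lower" into numbers, the three runs can be averaged. This is plain arithmetic on the table values; the table does not name the metric, so treating the figures as accuracy in percent is an assumption.

```python
from statistics import mean

# Results of the three CIFAR10 runs from the table (presumably accuracy in %).
paper = [97.52, 97.53, 97.58]
nni_runs = [97.44, 97.48, 97.56]

gap = mean(paper) - mean(nni_runs)
print("paper mean: {:.2f}".format(mean(paper)))
print("NNI mean:   {:.2f}".format(mean(nni_runs)))
print("gap:        {:.2f}".format(gap))
```

The averages differ by about 0.05, and in every individual run the NNI number is within 0.1 of the paper's.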
## Examples
[Example code](https://github.com/microsoft/nni/tree/master/examples/nas/cdarts)
```bash
# Clone NNI if you have not already; otherwise skip this line.
git clone https://github.com/Microsoft/nni.git
# Install apex for distributed training.
git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install --cpp_ext --cuda_ext
cd ..
# Search for the best architecture (the path assumes NNI was cloned into ./nni).
cd nni/examples/nas/cdarts
bash run_search_cifar.sh
# Retrain the best architecture.
bash run_retrain_cifar.sh
```
## Reference
### PyTorch
```eval_rst
.. autoclass:: nni.nas.pytorch.cdarts.CdartsTrainer
    :members:

    .. automethod:: __init__

.. autoclass:: nni.nas.pytorch.cdarts.RegularizedDartsMutator
    :members:

.. autoclass:: nni.nas.pytorch.cdarts.DartsDiscreteMutator
    :members:

    .. automethod:: __init__

.. autoclass:: nni.nas.pytorch.cdarts.RegularizedMutatorParallel
    :members:
```
......@@ -22,6 +22,7 @@ NNI supports below NAS algorithms now and is adding more. User can reproduce an
| [DARTS](DARTS.md) | [DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055) introduces a novel algorithm for differentiable network architecture search on bilevel optimization. |
| [P-DARTS](PDARTS.md) | [Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation](https://arxiv.org/abs/1904.12760) is based on DARTS. It introduces an efficient algorithm which allows the depth of searched architectures to grow gradually during the training procedure. |
| [SPOS](SPOS.md) | [Single Path One-Shot Neural Architecture Search with Uniform Sampling](https://arxiv.org/abs/1904.00420) constructs a simplified supernet trained with an uniform path sampling method, and applies an evolutionary algorithm to efficiently search for the best-performing architectures. |
| [CDARTS](CDARTS.md) | [Cyclic Differentiable Architecture Search](https://arxiv.org/abs/****) builds a cyclic feedback mechanism between the search and evaluation networks. It introduces a cyclic differentiable architecture search framework which integrates the two networks into a unified architecture.|
One-shot algorithms run **standalone without nnictl**. Only PyTorch version has been implemented. Tensorflow 2.x will be supported in future release.
......
# Run an Experiment on Multiple Machines
# Run an Experiment on Remote Machines
NNI supports running an experiment on multiple machines through SSH channel, called `remote` mode. NNI assumes that you have access to those machines, and already setup the environment for running deep learning training code.
NNI can run one experiment on multiple remote machines through SSH, called `remote` mode. It's like a lightweight training platform. In this mode, NNI can be started from your computer, and dispatch trials to remote machines in parallel.
e.g. Three machines and you login in with account `bob` (Note: the account is not necessarily the same on different machine):
## Remote machine requirements
| IP | Username| Password |
| -------- |---------|-------|
| 10.1.1.1 | bob | bob123 |
| 10.1.1.2 | bob | bob123 |
| 10.1.1.3 | bob | bob123 |
* Only Linux is supported on remote machines; the [Linux part of the system specification](../Tutorial/Installation.md) is the same as for NNI local mode.
* Follow [installation](../Tutorial/Installation.md) to install NNI on each machine.
## Setup NNI environment
* Make sure remote machines meet environment requirements of your trial code. If the default environment does not meet the requirements, the setup script can be added into `command` field of NNI config.
Install NNI on each of your machines following the install guide [here](../Tutorial/QuickStart.md).
* Make sure remote machines can be accessed through SSH from the machine which runs the `nnictl` command. Both password and key authentication are supported for SSH. For advanced usage, please refer to the [machineList part of configuration](../Tutorial/ExperimentConfig.md).
* Make sure the NNI version on each machine is consistent.
## Run an experiment
Install NNI on another machine which has network accessibility to those three machines above, or you can just run `nnictl` on any one of the three to launch the experiment.
e.g. there are three machines, which can be logged in with username and password.
We use `examples/trials/mnist-annotation` as an example here. Shown here is `examples/trials/mnist-annotation/config_remote.yml`:
| IP | Username | Password |
| -------- | -------- | -------- |
| 10.1.1.1 | bob | bob123 |
| 10.1.1.2 | bob | bob123 |
| 10.1.1.3 | bob | bob123 |
Install and run NNI on one of those three machines or another machine, which has network access to them.
Use `examples/trials/mnist-annotation` as the example. Below is the content of `examples/trials/mnist-annotation/config_remote.yml`:
```yaml
authorName: default
......@@ -58,14 +66,8 @@ machineList:
passwd: bob123
```
Files in `codeDir` will be automatically uploaded to the remote machine. You can run NNI on different operating systems (Windows, Linux, MacOS) to spawn experiments on the remote machines (only Linux allowed):
Files in `codeDir` will be uploaded to remote machines automatically. You can run the command below on Windows, Linux, or macOS to spawn trials on remote Linux machines:
```bash
nnictl create --config examples/trials/mnist-annotation/config_remote.yml
```
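When there are many machines, the `machineList` section can be generated rather than typed by hand. The sketch below renders it as YAML text using only the standard library. `passwd` appears in the config above; the `ip` and `username` field names are assumed from the table's columns, so check them against the Experiment Config Reference. `render_machine_list` is a hypothetical helper, not part of NNI.

```python
def render_machine_list(machines, indent="  "):
    """Render a `machineList` section as YAML text without extra dependencies."""
    lines = ["machineList:"]
    for m in machines:
        lines.append("{}- ip: {}".format(indent, m["ip"]))
        lines.append("{}  username: {}".format(indent, m["username"]))
        lines.append("{}  passwd: {}".format(indent, m["passwd"]))
    return "\n".join(lines)


# The three example machines from the table.
machines = [
    {"ip": "10.1.1.{}".format(i), "username": "bob", "passwd": "bob123"}
    for i in (1, 2, 3)
]
print(render_machine_list(machines))
```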
You can also use public/private key pairs instead of username/password for authentication. For advanced usages, please refer to [Experiment Config Reference](../Tutorial/ExperimentConfig.md).
## Version check
NNI support version check feature in since version 0.6, [reference](PaiMode.md).
\ No newline at end of file
......@@ -4,10 +4,11 @@ NNI TrainingService provides the training platform for running NNI trial jobs. N
NNI not only provides few built-in training service options, but also provides a method for customers to build their own training service easily.
## Built-in TrainingService
|TrainingService|Brief Introduction|
|---|---|
|[__Local__](./LocalMode.md)|NNI supports running an experiment on local machine, called local mode. Local mode means that NNI will run the trial jobs and nniManager process in same machine, and support gpu schedule function for trial jobs.|
|[__Remote__](./RemoteMachineMode.md)|NNI supports running an experiment on multiple machines through SSH channel, called remote mode. NNI assumes that you have access to those machines, and already setup the environment for running deep learning training code. NNI will submit the trial jobs in remote machine, and schedule suitable machine with enouth gpu resource if specified.|
|[__Remote__](./RemoteMachineMode.md)|NNI supports running an experiment on multiple machines through SSH channel, called remote mode. NNI assumes that you have access to those machines, and already setup the environment for running deep learning training code. NNI will submit the trial jobs in remote machine, and schedule suitable machine with enough gpu resource if specified.|
|[__Pai__](./PaiMode.md)|NNI supports running an experiment on [OpenPAI](https://github.com/Microsoft/pai) (aka pai), called pai mode. Before starting to use NNI pai mode, you should have an account to access an [OpenPAI](https://github.com/Microsoft/pai) cluster. See [here](https://github.com/Microsoft/pai#how-to-deploy) if you don't have any OpenPAI account and want to deploy an OpenPAI cluster. In pai mode, your trial program will run in pai's container created by Docker.|
|[__Kubeflow__](./KubeflowMode.md)|NNI supports running experiment on [Kubeflow](https://github.com/kubeflow/kubeflow), called kubeflow mode. Before starting to use NNI kubeflow mode, you should have a Kubernetes cluster, either on-premises or [Azure Kubernetes Service(AKS)](https://azure.microsoft.com/en-us/services/kubernetes-service/), a Ubuntu machine on which [kubeconfig](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/) is setup to connect to your Kubernetes cluster. If you are not familiar with Kubernetes, [here](https://kubernetes.io/docs/tutorials/kubernetes-basics/) is a good start. In kubeflow mode, your trial program will run as Kubeflow job in Kubernetes cluster.|
|[__FrameworkController__](./FrameworkControllerMode.md)|NNI supports running experiment using [FrameworkController](https://github.com/Microsoft/frameworkcontroller), called frameworkcontroller mode. FrameworkController is built to orchestrate all kinds of applications on Kubernetes, you don't need to install Kubeflow for specific deep learning framework like tf-operator or pytorch-operator. Now you can use FrameworkController as the training service to run NNI experiment.|
......@@ -16,7 +17,8 @@ NNI not only provides few built-in training service options, but also provides a
TrainingService is designed to be easily implemented, we define an abstract class TrainingService as the parent class of all kinds of TrainingService, users just need to inherit the parent class and complete their own child class if they want to implement customized TrainingService.
The abstract function in TrainingService is shown below:
```
```javascript
abstract class TrainingService {
public abstract listTrialJobs(): Promise<TrialJobDetail[]>;
public abstract getTrialJob(trialJobId: string): Promise<TrialJobDetail>;
......@@ -32,5 +34,6 @@ abstract class TrainingService {
public abstract run(): Promise<void>;
}
```
The TrainingService parent class has several abstract functions; users need to inherit from the parent class and implement all of them.
For more information about how to write your own TrainingService, please refer to [this guide](https://github.com/microsoft/nni/blob/master/docs/en_US/TrainingService/HowToImplementTrainingService.md).
# EfficientNet
[EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946)
Use grid search to find the best combination of alpha, beta and gamma for EfficientNet-B1, as discussed in Section 3.3 of the paper. Search space, tuner, and configuration examples are provided here.
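
To make the grid search concrete, here is a minimal sketch of how candidate (alpha, beta, gamma) triples could be enumerated. It assumes the compound scaling constraint from Section 3.3 of the paper (alpha · beta² · gamma² ≈ 2, with all three coefficients at least 1). The grid values, the tolerance, and the function name `candidate_coefficients` are illustrative assumptions, not the tuner shipped with the example.

```python
import itertools


def candidate_coefficients(values, target=2.0, tolerance=0.1):
    """Yield (alpha, beta, gamma) triples whose FLOPS factor
    alpha * beta**2 * gamma**2 lies within `tolerance` of `target`."""
    for alpha, beta, gamma in itertools.product(values, repeat=3):
        flops_factor = alpha * beta ** 2 * gamma ** 2
        if abs(flops_factor - target) <= tolerance:
            yield alpha, beta, gamma


# Hypothetical grid: 1.0, 1.05, ..., 1.4 (every coefficient is >= 1).
grid = [round(1.0 + 0.05 * i, 2) for i in range(9)]
candidates = list(candidate_coefficients(grid))

print(len(candidates), "candidate triples satisfy the constraint")
```

Each surviving triple would then be evaluated as one trial; the constraint keeps the cost of every trial roughly comparable.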
## Instructions
[Example code](https://github.com/microsoft/nni/tree/master/examples/trials/efficientnet)
1. Set your working directory to the example code directory.
2. Run `git clone https://github.com/ultmaster/EfficientNet-PyTorch` to clone this modified version of [EfficientNet-PyTorch](https://github.com/lukemelas/EfficientNet-PyTorch). The modifications were made to adhere to the original [TensorFlow version](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet) as closely as possible (including EMA, label smoothing, etc.); the parts that get parameters from the tuner and report intermediate/final results were also added. Clone it into `EfficientNet-PyTorch`; files such as `main.py` and `train_imagenet.sh` will appear inside, as specified in the configuration files.
3. Run `nnictl create --config config_local.yml` (use `config_pai.yml` for OpenPAI) to find the best EfficientNet-B1. Adjust the training service (PAI/local/remote) and the batch size in the config files according to the environment.
For training on ImageNet, read `EfficientNet-PyTorch/train_imagenet.sh`. Download ImageNet beforehand and extract it following the [PyTorch format](https://pytorch.org/docs/stable/torchvision/datasets.html#imagenet), then replace `/mnt/data/imagenet` in that script with the location of the ImageNet storage. This file should also be a good example to follow for mounting ImageNet into the container on OpenPAI.
## Results
The following image is a screenshot demonstrating the relationship between acc@1 and alpha, beta, gamma.
![](../../img/efficientnet_search_result.png)
......@@ -47,5 +47,9 @@ Probably it's a problem with your network config. Here is a checklist.
### NNI on Windows problems
Please refer to [NNI on Windows](NniOnWindows.md)
### More FAQ issues
[NNI Issues with FAQ labels](https://github.com/microsoft/nni/labels/FAQ)
### Help us improve
Please search https://github.com/Microsoft/nni/issues to see whether other people have already reported the problem; create a new issue if none exists.
# Installation of NNI
Currently we support installation on Linux, Mac and Windows.
Currently we support installation on Linux, macOS and Windows.
## **Installation on Linux & Mac**
## Install on Linux or macOS
* __Install NNI through pip__
* Install NNI through pip
Prerequisite: `python >= 3.5`
Prerequisite: `python 64-bit >= 3.5`
```bash
python3 -m pip install --upgrade nni
```
* __Install NNI through source code__
* Install NNI through source code
Prerequisite: `python >=3.5`, `git`, `wget`
If you are interested in a specific or the latest code version, you can install NNI from source code.
Prerequisites: `python 64-bit >=3.5`, `git`, `wget`
```bash
git clone -b v0.8 https://github.com/Microsoft/nni.git
......@@ -22,25 +24,27 @@ Currently we support installation on Linux, Mac and Windows.
./install.sh
```
* __Install NNI in docker image__
* Use NNI in a docker image
You can also install NNI in a docker image. Please follow the instructions [here](https://github.com/Microsoft/nni/tree/master/deployment/docker/README.md) to build the NNI docker image. The NNI docker image can also be retrieved from Docker Hub with the command `docker pull msranni/nni:latest`.
## **Installation on Windows**
## Install on Windows
Anaconda or Miniconda is highly recommended.
Anaconda or Miniconda is highly recommended to manage multiple Python environments.
* __Install NNI through pip__
* Install NNI through pip
Prerequisite: `python(64-bit) >= 3.5`
Prerequisites: `python 64-bit >= 3.5`
```bash
python -m pip install --upgrade nni
```
* __Install NNI through source code__
* Install NNI through source code
If you are interested in a specific or the latest code version, you can install NNI from source code.
Prerequisite: `python >=3.5`, `git`, `PowerShell`.
Prerequisites: `python 64-bit >=3.5`, `git`, `PowerShell`.
```bash
git clone -b v0.8 https://github.com/Microsoft/nni.git
......@@ -48,43 +52,103 @@ Currently we support installation on Linux, Mac and Windows.
powershell -ExecutionPolicy Bypass -file install.ps1
```
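
The prerequisites above ask for a 64-bit Python 3.5 or later. A minimal sketch for checking this before running `pip install` is shown below; the function name `python_meets_requirement` is a hypothetical helper for illustration and is not part of NNI.

```python
import struct
import sys


def python_meets_requirement(version_info=sys.version_info,
                             pointer_bits=struct.calcsize("P") * 8):
    """Return True for a 64-bit interpreter at version 3.5 or later."""
    # The size of a C pointer tells whether the interpreter is a 64-bit build.
    return tuple(version_info[:2]) >= (3, 5) and pointer_bits == 64


if python_meets_requirement():
    print("Python version and bitness look fine for NNI.")
else:
    print("NNI needs a 64-bit Python >= 3.5.")
```

Run it with the same interpreter you intend to use for `pip install` (`python3` on Linux or macOS, `python` on Windows).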
## **System requirements**
Below are the minimum system requirements for NNI on Linux. Due to potential programming changes, the minimum system requirements for NNI may change over time.
||Minimum Requirements|Recommended Specifications|
|---|---|---|
|**Operating System**|Ubuntu 16.04 or above|Ubuntu 16.04 or above|
|**CPU**|Intel® Core™ i3 or AMD Phenom™ X3 8650|Intel® Core™ i5 or AMD Phenom™ II X3 or better|
|**GPU**|NVIDIA® GeForce® GTX 460|NVIDIA® GeForce® GTX 660 or better|
|**Memory**|4 GB RAM|6 GB RAM|
|**Storage**|30 GB available hard drive space|
|**Internet**|Broadband internet connection|
|**Resolution**|1024 x 768 minimum display resolution|
Below are the minimum system requirements for NNI on macOS. Due to potential programming changes, the minimum system requirements for NNI may change over time.
||Minimum Requirements|Recommended Specifications|
|---|---|---|
|**Operating System**|macOS 10.14.1 (latest version)|macOS 10.14.1 (latest version)|
|**CPU**|Intel® Core™ i5-760 or better|Intel® Core™ i7-4770 or better|
|**GPU**|NVIDIA® GeForce® GT 750M or AMD Radeon™ R9 M290 or better|AMD Radeon™ R9 M395X or better|
|**Memory**|4 GB RAM|8 GB RAM|
|**Storage**|70GB available space 7200 RPM HDD|70GB available space SSD|
|**Internet**|Broadband internet connection|
|**Resolution**|1024 x 768 minimum display resolution|
Below are the minimum system requirements for NNI on Windows; Windows 10.1809 is well tested and recommended. Due to potential programming changes, the minimum system requirements for NNI may change over time.
||Minimum Requirements|Recommended Specifications|
|---|---|---|
|**Operating System**|Windows 10|Windows 10|
|**CPU**|Intel® Core™ i3 or AMD Phenom™ X3 8650|Intel® Core™ i5 or AMD Phenom™ II X3 or better|
|**GPU**|NVIDIA® GeForce® GTX 460|NVIDIA® GeForce® GTX 660 or better|
|**Memory**|4 GB RAM|6 GB RAM|
|**Storage**|30 GB available hard drive space|
|**Internet**|Broadband internet connection|
|**Resolution**|1024 x 768 minimum display resolution|
## Verify installation
The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is used** when running it.
* Download the examples by cloning the source code.
```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
```
* Run the MNIST example.
Linux or macOS
```bash
nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
```
Windows
```bash
nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
```
* Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the `Web UI url`.
```text
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
INFO: Successfully set local config!
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: http://223.255.255.1:8080 http://127.0.0.1:8080
-----------------------------------------------------------------------
You can use these commands to get more information about the experiment
-----------------------------------------------------------------------
commands description
1. nnictl experiment show show the information of experiments
2. nnictl trial ls list all of trial jobs
3. nnictl top monitor the status of running experiments
4. nnictl log stderr show stderr log content
5. nnictl log stdout show stdout log content
6. nnictl stop stop an experiment
7. nnictl trial kill kill a trial job by id
8. nnictl --help get help information about nnictl
-----------------------------------------------------------------------
```
* Open the `Web UI url` in your browser to view detailed information about the experiment and all the submitted trial jobs, as shown below. More Web UI pages are described [here](../Tutorial/WebUI.md).
![overview](../../img/webui_overview_page.png)
![detail](../../img/webui_trialdetail_page.png)
## System requirements
Due to potential programming changes, the minimum system requirements of NNI may change over time.
### Linux
| | Recommended | Minimum |
| -------------------- | ---------------------------------------------- | -------------------------------------- |
| **Operating System** | Ubuntu 16.04 or above |
| **CPU** | Intel® Core™ i5 or AMD Phenom™ II X3 or better | Intel® Core™ i3 or AMD Phenom™ X3 8650 |
| **GPU** | NVIDIA® GeForce® GTX 660 or better | NVIDIA® GeForce® GTX 460 |
| **Memory** | 6 GB RAM | 4 GB RAM |
| **Storage** | 30 GB available hard drive space |
| **Internet** | Broadband internet connection |
| **Resolution** | 1024 x 768 minimum display resolution |
### macOS
| | Recommended | Minimum |
| -------------------- | ------------------------------------- | --------------------------------------------------------- |
| **Operating System** | macOS 10.14.1 or above |
| **CPU** | Intel® Core™ i7-4770 or better | Intel® Core™ i5-760 or better |
| **GPU** | AMD Radeon™ R9 M395X or better | NVIDIA® GeForce® GT 750M or AMD Radeon™ R9 M290 or better |
| **Memory** | 8 GB RAM | 4 GB RAM |
| **Storage** | 70GB available space SSD | 70GB available space 7200 RPM HDD |
| **Internet** | Broadband internet connection |
| **Resolution** | 1024 x 768 minimum display resolution |
### Windows
| | Recommended | Minimum |
| -------------------- | ---------------------------------------------- | -------------------------------------- |
| **Operating System** | Windows 10 1809 or above |
| **CPU** | Intel® Core™ i5 or AMD Phenom™ II X3 or better | Intel® Core™ i3 or AMD Phenom™ X3 8650 |
| **GPU** | NVIDIA® GeForce® GTX 660 or better | NVIDIA® GeForce® GTX 460 |
| **Memory** | 6 GB RAM | 4 GB RAM |
| **Storage** | 30 GB available hard drive space |
| **Internet** | Broadband internet connection |
| **Resolution** | 1024 x 768 minimum display resolution |
## Further reading
......
......@@ -2,14 +2,15 @@
## Installation
We support Linux MacOS and Windows in current stage, Ubuntu 16.04 or higher, MacOS 10.14.1 and Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
#### Linux and MacOS
We currently support Linux, macOS and Windows; Ubuntu 16.04 or higher, macOS 10.14.1 and Windows 10.1809 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
**Linux and macOS**
```bash
python3 -m pip install --upgrade nni
```
#### Windows
**Windows**
```bash
python -m pip install --upgrade nni
......@@ -17,7 +18,7 @@ We support Linux MacOS and Windows in current stage, Ubuntu 16.04 or higher, Mac
Note:
* For Linux and MacOS `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* For Linux and macOS, `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* If there is any error like `Segmentation fault`, please refer to [FAQ](FAQ.md)
* For the `system requirements` of NNI, please refer to [Install NNI](Installation.md)
......@@ -53,7 +54,7 @@ The above code can only try one set of parameters at a time, if we want to tune
NNI is designed to help users with tuning jobs; the NNI working process is presented below:
```
```text
input: search space, trial code, config file
output: one optimal hyperparameter configuration
......@@ -68,7 +69,7 @@ output: one optimal hyperparameter configuration
If you want to use NNI to automatically train your model and find the optimal hyper-parameters, you need to make three changes to your code:
**Three things required to do when using NNI**
**Three steps to start an experiment**
**Step 1**: Provide a `Search Space` file in JSON that includes the `name` and the `distribution` (discrete-valued or continuous-valued) of all the hyperparameters you need to search.
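
As an illustration of Step 1, the sketch below builds a small search space in Python and renders it as JSON. It assumes NNI's `_type`/`_value` search space convention; the hyperparameter names and ranges are made-up examples, and `render_search_space` is a hypothetical helper rather than an NNI function.

```python
import json

# Illustrative hyperparameters: one continuous, one discrete, one log-scaled.
search_space = {
    "dropout_rate": {"_type": "uniform", "_value": [0.5, 0.9]},
    "batch_size": {"_type": "choice", "_value": [16, 32, 64]},
    "learning_rate": {"_type": "loguniform", "_value": [0.0001, 0.1]},
}


def render_search_space(space):
    """Serialize the search space dictionary to an indented JSON string."""
    return json.dumps(space, indent=4)


print(render_search_space(search_space))
```

Saving the printed text to a file such as `search_space.json` gives a file that the experiment configuration can point to.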
......@@ -138,22 +139,25 @@ Note, **for Windows, you need to change trial command `python3` to `python`**
All the code above is already prepared and stored in [examples/trials/mnist-tfv1/](https://github.com/Microsoft/nni/tree/master/examples/trials/mnist-tfv1).
#### Linux and MacOS
**Linux and macOS**
Run the **config.yml** file from your command line to start MNIST experiment.
```bash
nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
```
#### Windows
**Windows**
Run the **config_windows.yml** file from your command line to start MNIST experiment.
**Note**, if you're using NNI on Windows, it needs to change `python3` to `python` in the config.yml file, or use the config_windows.yml file to start the experiment.
Note, if you're using NNI on Windows, you need to change `python3` to `python` in the config.yml file, or use the config_windows.yml file to start the experiment.
```bash
nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
```
Note, **nnictl** is a command line tool, which can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc. Click [here](Nnictl.md) for more usage of `nnictl`
Note, `nnictl` is a command line tool that can be used to control experiments, such as starting/stopping/resuming an experiment or starting/stopping NNIBoard. Click [here](Nnictl.md) for more usage of `nnictl`.
Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. This is the expected output:
......@@ -195,7 +199,7 @@ The Web UI urls are: [Your IP]:8080
Open the `Web UI url` (in this example, `[Your IP]:8080`) in your browser to view detailed information about the experiment and all the submitted trial jobs, as shown below. If you cannot open the Web UI link from your terminal, refer to the [FAQ](FAQ.md).
#### View summary page
### View summary page
Click the tab "Overview".
......@@ -207,7 +211,7 @@ Top 10 trials will be listed in the Overview page, you can browse all the trials
![](../../img/QuickStart2.png)
#### View trials detail page
### View trials detail page
Click the tab "Default Metric" to see the point graph of all trials. Hover to see its specific default metric and search space message.
......
......@@ -47,6 +47,9 @@ extensions = [
'sphinx.ext.napoleon',
]
# Add mock modules
autodoc_mock_imports = ['apex']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
......
......@@ -12,3 +12,4 @@ Examples
GBDT<./TrialExample/GbdtExample>
RocksDB <./TrialExample/RocksdbExamples>
KDExample <./TrialExample/KDExample>
EfficientNet <./TrialExample/EfficientNet>
......@@ -24,3 +24,4 @@ For details, please refer to the following tutorials:
DARTS <NAS/DARTS>
P-DARTS <NAS/PDARTS>
SPOS <NAS/SPOS>
CDARTS <NAS/CDARTS>
......@@ -10,4 +10,4 @@ numpy
scipy
coverage
scikit-learn==0.20
torch==1.3.1
\ No newline at end of file
https://download.pytorch.org/whl/cpu/torch-1.3.1%2Bcpu-cp37-cp37m-linux_x86_64.whl
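# Review from Zhihu: <an open source project with highly reasonable design> - by Garvin Li
This article was posted by an NNI user on the Zhihu forum. In it, Garvin shares his experience of using NNI for automatic feature engineering. We think it is very useful for users who are interested in using NNI for feature engineering. With the author's permission, the original article is excerpted below.
**Original**: [What do you think of NNI, the AutoML platform newly released by Microsoft? By Garvin Li](https://www.zhihu.com/question/297982959/answer/964961829?utm_source=wechat_session&utm_medium=social&utm_oi=28812108627968&from=singlemessage&isappinstalled=0)
## 01 Overview of AutoML
The author believes that AutoML is more than hyperparameter tuning and should include automatic feature engineering. AutoML is a systematic framework that covers automatic feature engineering (AutoFeatureEng), automatic tuning (AutoTuning), neural architecture search (NAS), and so on.
## 02 Overview of NNI
NNI (Neural Network Intelligence) is an open source AutoML toolkit from Microsoft that helps users design and tune machine learning models, neural network architectures, or the parameters of complex systems in an automatic and efficient way.
Link: [https://github.com/Microsoft/nni](https://github.com/Microsoft/nni)
So far I have only studied the automatic feature engineering module. Overall, Microsoft's tools share one notable trait: the technology may not be the most novel, but the design is excellent. NNI's AutoFeatureENG covers basically everything a user could wish for in AutoFeatureENG. Being a PD at Microsoft must be quite pleasant, since the design of these underlying frameworks is extremely sound.
## 03 Details of NNI - AutoFeatureENG
> This article uses this project: [https://github.com/SpongebBob/tabular_automl_NNI](https://github.com/SpongebBob/tabular_automl_NNI).
New users can do AutoFeatureENG with NNI easily and efficiently. Usage is very simple: install the requirements listed in the project's file, then pip install NNI.
![](https://pic3.zhimg.com/v2-8886eea730cad25f5ac06ef1897cd7e4_r.jpg) NNI splits AutoFeatureENG into two modules: exploration and selection. Exploration is mainly about feature derivation and crossing; selection is about how to filter features.
## 04 Feature Exploration
For feature derivation, NNI offers many operations that can generate new features automatically. The [list](https://github.com/SpongebBob/tabular_automl_NNI/blob/master/AutoFEOp.md) is as follows:
**count**: Classic statistics; counts how often values appear in the data.
**target**: Mapping features between a feature and the target column.
**embedding**: Treats features as sentences and builds vectors in the *word2vector* way.
**crosscount**: Division between features, somewhat similar to CTR.
**aggregate**: The min/max/var/mean of features.
**nunique**: Counts the number of unique values of a feature.
**histsta**: Statistics over feature buckets, such as histogram statistics.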
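
To give a feel for the simplest of these operations, here is a minimal pure-Python sketch of count encoding, a pairwise count over two crossed columns, and nunique. It is an illustration only and not the implementation used by the tabular_automl_NNI project; all function names are hypothetical.

```python
from collections import Counter


def count_encode(column):
    """Replace each value with the number of times it appears in the column."""
    counts = Counter(column)
    return [counts[value] for value in column]


def cross_count_encode(col_a, col_b):
    """Replace each row with the number of times its (a, b) pair appears."""
    pairs = list(zip(col_a, col_b))
    counts = Counter(pairs)
    return [counts[pair] for pair in pairs]


def nunique(column):
    """Return the number of distinct values in the column."""
    return len(set(column))


city = ["NY", "SF", "NY", "LA"]
device = ["ios", "ios", "ios", "web"]
print(count_encode(city))
print(cross_count_encode(city, device))
print(nunique(city))
```

Each function turns a categorical column into a numeric feature that a tree model such as LightGBM can consume directly.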
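How exactly are features crossed, which column is crossed with which, and how is each column derived? All of this is controlled through the **search_space.json** file.
![](https://pic1.zhimg.com/v2-3c3eeec6eea9821e067412725e5d2317_r.jpg)
The figure shows how the search space is defined. NNI provides count encoding as a 1-order operation and aggregated statistics (min, max, var, mean, median, nunique) as 2-order operations.
For example, to search for frequency-encoding (valuecount) features on the columns {"C1", ..., "C26"}, define the following:
![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%203.jpg)
A cross-frequency-encoding (value count over crossed dimensions) method can be defined on the columns {"C1",...,"C26"} x {"C1",...,"C26"}:
![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%204.jpg)
The purpose of Exploration is to generate new features. In code, the tuning parameters can be fetched with **get_next_parameter**:
> RECEIVED_PARAMS = nni.get_next_parameter()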
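
A minimal sketch of how a trial might use the received parameters is given below. It relies on `nni.get_next_parameter()` and `nni.report_final_result()`; the default values, the `merge_params` helper, and the `train_and_evaluate` callback are hypothetical names introduced for illustration, not part of the project described here.

```python
# Hypothetical defaults used when the tuner does not supply a value.
DEFAULT_PARAMS = {"learning_rate": 0.01, "num_leaves": 31}


def merge_params(defaults, received):
    """Overlay the tuner-provided parameters on top of the defaults."""
    merged = dict(defaults)
    merged.update(received or {})
    return merged


def run_trial(train_and_evaluate):
    """Fetch parameters from NNI, run one trial, and report its score."""
    import nni  # imported here so merge_params works without NNI installed

    received_params = nni.get_next_parameter()
    params = merge_params(DEFAULT_PARAMS, received_params)
    score = train_and_evaluate(params)  # user-supplied function returning a number
    nni.report_final_result(score)
```

Calling `run_trial(my_training_function)` from the trial script lets the tuner drive which features or hyperparameters are tried in each run.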
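## 05 Feature Selection
To avoid feature explosion and overfitting, a Selection mechanism is needed to pick features. Selection in NNI-AutoFeatureENG mainly uses LightGBM (Light Gradient Boosting Machine), a gradient boosting framework developed by Microsoft.
![](https://pic2.zhimg.com/v2-7bf9c6ae1303692101a911def478a172_r.jpg)
Anyone familiar with xgboost or GBDT knows that tree-based algorithms make it easy to compute each feature's impact on the result, so LightGBM lends itself naturally to feature selection.
The drawback is that if the downstream model is a linear algorithm such as *LR* (logistic regression), it is unclear whether the selected features generalize.
![](https://pic4.zhimg.com/v2-d2f919497b0ed937acad0577f7a8df83_r.jpg)
## 06 Summary
NNI's AutoFeature module sets a textbook standard for the whole industry: it shows how this should be done and which modules are involved, and it is very convenient to use. However, a simple pattern like this alone may not deliver very good results.
## Suggestions for NNI
On Exploration, I think feature combination methods from DNNs (such as xDeepFM) could be borrowed to extract higher-dimensional features.
On Selection, more intelligent options are possible, for example choosing the Selection mechanism automatically based on the downstream algorithm.
In short, NNI's design gave me some inspiration. It is a good open source project and I recommend it to everyone. AI researchers are encouraged to use it to speed up their research.
If you use a Mac, you may run into gcc problems, because the script bundled with the open source project is compiled with gcc7. The following command works around it:
`brew install libomp`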
......@@ -13,3 +13,4 @@
Comparison of hyperparameter tuning algorithms<HpoComparision>
Parallelizing optimization for TPE<ParallelizingTpeSearch>
Automatically tune systems with NNI <TuningSystems>
Review from Zhihu: by Garvin Li <NNI_AutoFeatureEng>
......@@ -335,5 +335,3 @@ pruner.compress()
- **sparsity:** The percentage of convolutional filters to prune.
- **op_types:** Only Conv2d is supported in ActivationMeanRankFilterPruner.
***
\ No newline at end of file