Merge pull request #173 from microsoft/master

merge master

Unverified Commit c5acd8c2 authored May 27, 2019 by SparkSnail Committed by GitHub May 27, 2019
13 changed files
--- a/src/sdk/pynni/nni/evolution_tuner/README_zh_CN.md
+++ b/src/sdk/pynni/nni/evolution_tuner/README_zh_CN.md
+# Naive Evolution Tuner
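+## Naive Evolution
+Naive Evolution comes from [Large-Scale Evolution of Image Classifiers](https://arxiv.org/pdf/1703.01041.pdf). It randomly generates a population based on the search space. In each generation, it selects the better results and applies some mutation (e.g., changing a hyperparameter, adding or removing a layer) to produce the next generation. Naive Evolution needs many Trials to be effective, but it is very simple and easy to extend with new features.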
\ No newline at end of file
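The select-and-mutate loop described in this README can be sketched as follows. This is a minimal illustration, not NNI's implementation: `evolve`, `mutate`, the survivor ratio of one half, and the representation of the search space as a dict of candidate lists are all assumptions made for the example.

```python
import random


def mutate(config, search_space, rng):
    """Return a copy of config with one randomly chosen hyperparameter re-sampled."""
    child = dict(config)
    key = rng.choice(sorted(search_space))
    child[key] = rng.choice(search_space[key])
    return child


def evolve(search_space, evaluate, population_size=8, generations=5, seed=0):
    """Keep the better half of each generation and refill it with mutated copies."""
    rng = random.Random(seed)
    # Initial population: every hyperparameter drawn at random from the search space.
    population = [
        {key: rng.choice(values) for key, values in search_space.items()}
        for _ in range(population_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        survivors = ranked[: population_size // 2]
        children = [
            mutate(rng.choice(survivors), search_space, rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=evaluate)
```

Each call to `evaluate` stands in for one Trial, which is why the approach needs many Trials: every generation re-scores the whole population.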
--- a/src/sdk/pynni/nni/gridsearch_tuner/README_zh_CN.md
+++ b/src/sdk/pynni/nni/gridsearch_tuner/README_zh_CN.md
+# Grid Search
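+## Grid Search
+Grid Search exhaustively tries every hyperparameter combination defined in the search space file. Note that the search space only supports `choice`, `quniform`, and `qloguniform`. **The number `q` in `quniform` and `qloguniform` has a different meaning here (different from the [search space](../../../../../docs/zh_CN/SearchSpaceSpec.md) specification). Here it is the number of values sampled evenly between `low` and `high`.**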
\ No newline at end of file
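To illustrate the meaning of `q` in this README, here is a minimal sketch that expands a `quniform`-style entry into `q` evenly spaced values. The helper name `grid_values` is hypothetical, and including both `low` and `high` as endpoints is an assumption; the README only says the values are taken evenly between them.

```python
def grid_values(low, high, q):
    """Return q evenly spaced values from low to high, both endpoints included."""
    if q < 2:
        raise ValueError("q must be at least 2 to span low..high")
    step = (high - low) / (q - 1)
    return [low + step * i for i in range(q)]


print(grid_values(0.0, 1.0, 5))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Under this reading, `q` counts grid points, so a larger `q` multiplies the number of combinations Grid Search has to enumerate.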
--- a/src/sdk/pynni/nni/hyperband_advisor/README_zh_CN.md
+++ b/src/sdk/pynni/nni/hyperband_advisor/README_zh_CN.md
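+# Hyperband on NNI
+## 1. Introduction
+[Hyperband](https://arxiv.org/pdf/1603.06560.pdf) is a popular automated machine learning algorithm. The basic idea of Hyperband is to group configurations into brackets: each bracket has `n` randomly generated hyperparameter configurations, and each configuration uses `r` resources (e.g., number of epochs or number of batches). After the `n` configurations finish, the best `n/eta` configurations are kept and run with the resources increased to `r*eta`. In the end, the best configuration found is chosen.
+## 2. Implementation with parallelism
+First, this example is an AutoML algorithm built on MsgDispatcherBase rather than on Tuner and Assessor. Implemented this way, Hyperband combines the functions of both Tuner and Assessor, so it is called an Advisor.
+Second, this implementation fully exploits Hyperband's internal parallelism. Specifically, the next bracket does not strictly wait for the current bracket to finish; a new bracket starts as soon as resources are available.
+## 3. Usage
+To use Hyperband, make the following changes in the Experiment's YAML configuration file.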
+    advisor:
+      # choice: Hyperband
+      builtinAdvisorName: Hyperband
+      classArgs:
+        # R: the maximum STEPS
+        R: 100
+        # eta: proportion of discarded Trials
+        eta: 3
+        # choice: maximize, minimize
+        optimize_mode: maximize
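+Note that once an Advisor is used, Tuner and Assessor cannot be added to the configuration file. When using Hyperband, the hyperparameters (key-value pairs) received by the Trial code contain one extra key, `STEPS`, in addition to the user-defined hyperparameters. **Using `STEPS`, the Trial can control how long it runs.**
+For `report_intermediate_result(metric)` and `report_final_result(metric)` in the Trial code, **`metric` should be a number, or a dict that contains a `default` key whose value is a number**. This is the value to maximize or minimize, such as accuracy or loss.
+`R` and `eta` are the Hyperband parameters that can be changed. `R` is the maximum number of STEPS that can be allocated to a configuration. Here, STEPS can mean the number of epochs or the number of batches. The Trial code should use `STEPS` to control how long it runs. See the example in `examples/trials/mnist-hyperband/` for details.
+`eta` means that `n/eta` of the `n` configurations survive and are run again with more STEPS.
+Here is an example with `R=81` and `eta=3`: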
+|   | s=4  | s=3  | s=2  | s=1  | s=0  |
+| - | ---- | ---- | ---- | ---- | ---- |
+| i | n r  | n r  | n r  | n r  | n r  |
+| 0 | 81 1 | 27 3 | 9 9  | 6 27 | 5 81 |
+| 1 | 27 3 | 9 9  | 3 27 | 2 81 |      |
+| 2 | 9 9  | 3 27 | 1 81 |      |      |
+| 3 | 3 27 | 1 81 |      |      |      |
+| 4 | 1 81 |      |      |      |      |
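+`s` denotes the bracket, `n` the number of generated configurations, and the corresponding `r` the number of STEPS each configuration runs. `i` is the round; for example, bracket 4 has 5 rounds and bracket 3 has 4 rounds.
+For how to write the Trial code, see the instructions in `examples/trials/mnist-hyperband/`.
+## 4. To be improved
+The current Hyperband implementation could be improved by supporting a better early-stopping algorithm, because the best `n/eta` configurations do not necessarily all perform well. Poorly performing configurations could be stopped earlier.
+The current implementation follows the design in [this paper](https://arxiv.org/pdf/1603.06560.pdf), so configurations are generated randomly. To improve further, the configuration generation process could use more advanced algorithms.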
\ No newline at end of file
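The `R=81`, `eta=3` table in this README can be reproduced with a short sketch. The function name `hyperband_schedule` is hypothetical, and the formula for the initial bracket size `n` is an assumption chosen so that the output matches the table; it is not taken from NNI's source. The sketch also assumes `R` is a power of `eta`, as in the example.

```python
def hyperband_schedule(R, eta):
    """Map each bracket s to its list of (n, r) pairs, one pair per round i."""
    # Largest s with eta**s <= R, computed with integers to avoid float log error.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1

    schedule = {}
    for s in range(s_max, -1, -1):
        # Assumption: initial bracket size picked to reproduce the README table.
        n = ((s_max + 1) // (s + 1)) * eta ** s
        r = R // eta ** s
        # Each round keeps n/eta of the configurations and gives them eta times more STEPS.
        schedule[s] = [(n // eta ** i, r * eta ** i) for i in range(s + 1)]
    return schedule


for s, rounds in hyperband_schedule(81, 3).items():
    print(s, rounds)
```

Reading the output column by column gives the same `n r` pairs as the table, for example `(6, 27)` then `(2, 81)` for bracket `s=1`.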
--- a/src/sdk/pynni/nni/hyperopt_tuner/README_zh_CN.md
+++ b/src/sdk/pynni/nni/hyperopt_tuner/README_zh_CN.md
+# TPE, Random Search, Anneal Tuners
+## TPE
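+The Tree-structured Parzen Estimator (TPE) is a sequential model-based optimization (SMBO) approach. SMBO methods sequentially construct models from historical measurements to approximate the performance of hyperparameters, and then choose new hyperparameters to test based on this model. TPE models P(x|y) and P(y), where x represents the hyperparameters and y the associated evaluation metric. P(x|y) is modeled by transforming the generative process of the hyperparameters, replacing the prior distributions of the configuration with non-parametric densities. For details, see [Algorithms for Hyper-Parameter Optimization](https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf).
+## Random Search
+[Random Search for Hyper-Parameter Optimization](http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf) shows that random search can be surprisingly simple and effective. We suggest using random search as a baseline when nothing is known about the prior distribution of the hyperparameters.
+## Anneal
+This simple annealing algorithm starts by sampling from the prior and, over time, samples closer and closer to the best points found so far. It is a simple variant of random search that exploits the smoothness of the response surface. The annealing rate is not adaptive.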
\ No newline at end of file
--- a/src/sdk/pynni/nni/medianstop_assessor/README_zh_CN.md
+++ b/src/sdk/pynni/nni/medianstop_assessor/README_zh_CN.md
+# Medianstop Assessor
+## Median Stop
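+Medianstop is a simple early-stopping strategy for Trials; see the [paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf). If the best objective value of Trial X by step S is clearly lower than the median of the values that all completed Trials reported at step S, Trial X is stopped early.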
\ No newline at end of file
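The stopping rule described in this README can be sketched as follows. This is a minimal illustration under stated assumptions, not NNI's assessor: the name `should_stop` is hypothetical, the objective is assumed to be maximized, each history is a list of intermediate results indexed by step, and a plain "less than" comparison stands in for "clearly lower".

```python
from statistics import median


def should_stop(trial_history, completed_histories, step):
    """Decide whether a running Trial should be stopped at the given step (maximize mode)."""
    # Values that completed Trials reported at this step (0-based index).
    peers = [history[step] for history in completed_histories if len(history) > step]
    if not peers:
        return False  # nothing to compare against yet
    best_so_far = max(trial_history[: step + 1])
    return best_so_far < median(peers)
```

For a metric that should be minimized, such as loss, the comparison would be flipped: take the minimum so far and stop when it is greater than the median.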
--- a/src/sdk/pynni/nni/metis_tuner/README_zh_CN.md
+++ b/src/sdk/pynni/nni/metis_tuner/README_zh_CN.md
+# Metis Tuner
+## Metis Tuner
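+Most tuning tools only predict the optimal configuration, whereas [Metis](https://www.microsoft.com/en-us/research/publication/metis-robustly-tuning-tail-latencies-cloud-systems/) has the advantage of producing two outputs: (a) the current prediction of the optimal configuration, and (b) a suggestion for the next Trial. No more random guessing!
+Most tools assume the training data contains no noise, but Metis can tell whether a particular hyperparameter needs to be re-sampled.
+Most tools lean heavily toward exploiting existing results, whereas Metis's search strategy balances exploration, exploitation, and (optionally) re-sampling.
+Metis belongs to the class of sequential model-based optimization (SMBO) algorithms and is based on the Bayesian optimization framework. To model the hyperparameter-vs-performance space, Metis uses both a Gaussian Process and a Gaussian Mixture Model (GMM). Because each Trial can be expensive in time, Metis makes heavy use of the existing models for inference computation. In each iteration, Metis performs two tasks:
+* It finds the global optimal point in the Gaussian Process space. This point represents the best configuration.
+* It identifies the next hyperparameter candidate. This is done by inferring the potential information gain of exploration, exploitation, and re-sampling.
+Note that the search space only supports `choice`, `quniform`, `uniform`, and `randint`.
+For more details, see the paper: https://www.microsoft.com/en-us/research/publication/metis-robustly-tuning-tail-latencies-cloud-systems/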
\ No newline at end of file
--- a/src/webui/src/components/public-child/OpenRow.tsx
+++ b/src/webui/src/components/public-child/OpenRow.tsx
@@ -3,9 +3,10 @@ import * as copy from 'copy-to-clipboard';
 import PaiTrialLog from '../public-child/PaiTrialLog';
 import TrialLog from '../public-child/TrialLog';
 import { TableObj } from '../../static/interface';
-import { Row, Tabs, Button, message } from 'antd';
+import { Row, Tabs, Button, message, Modal } from 'antd';
 import { MANAGER_IP } from '../../static/const';
 import '../../static/style/overview.scss';
+import '../../static/style/copyParameter.scss';
 import JSONTree from 'react-json-tree';
 const TabPane = Tabs.TabPane;
@@ -17,43 +18,62 @@ interface OpenRowProps {
 }
 interface OpenRowState {
-    idList: Array<string>;
+    isShowFormatModal: boolean;
+    formatStr: string;
 }
 class OpenRow extends React.Component<OpenRowProps, OpenRowState> {
+    public _isMounted: boolean;
    constructor(props: OpenRowProps) {
        super(props);
        this.state = {
-            idList: ['']
+            isShowFormatModal: false,
+            formatStr: ''
        };
+    }
+    showFormatModal = (record: TableObj) => {
+        // get copy parameters
+        const params = JSON.stringify(record.description.parameters, null, 4);
+        // open modal with format string
+        if (this._isMounted === true) {
+            this.setState(() => ({ isShowFormatModal: true, formatStr: params }));
+        }
+    }
+    hideFormatModal = () => {
+        // close modal, destroy state format string data
+        if (this._isMounted === true) {
+            this.setState(() => ({ isShowFormatModal: false, formatStr: '' }));
+        }
    }
-    copyParams = (record: TableObj) => {
+    copyParams = () => {
        // json format
-        const params = JSON.stringify(record.description.parameters, null, 4);
+        const { formatStr } = this.state;
-        if (copy(params)) {
+        if (copy(formatStr)) {
            message.destroy();
            message.success('Success copy parameters to clipboard in form of python dict !', 3);
-            const { idList } = this.state;
-            const copyIdList: Array<string> = idList;
-            copyIdList[copyIdList.length - 1] = record.id;
-            this.setState(() => ({
-                idList: copyIdList
-            }));
        } else {
            message.destroy();
            message.error('Failed !', 2);
        }
+        this.hideFormatModal();
    }
+    componentDidMount() {
+        this._isMounted = true;
+    }
+    componentWillUnmount() {
+        this._isMounted = false;
+    }
    render() {
        const { trainingPlatform, record, logCollection, multiphase } = this.props;
-        const { idList } = this.state;
+        const { isShowFormatModal, formatStr } = this.state;
        let isClick = false;
        let isHasParameters = true;
-        if (idList.indexOf(record.id) !== -1) { isClick = true; }
        if (record.description.parameters.error) {
            isHasParameters = false;
        }
@@ -101,7 +121,7 @@ class OpenRow extends React.Component<OpenRowProps, OpenRowState> {
                                    </Row>
                                    <Row className="copy">
                                        <Button
-                                            onClick={this.copyParams.bind(this, record)}
+                                            onClick={this.showFormatModal.bind(this, record)}
                                        >
                                            Copy as python
                                        </Button>
@@ -128,6 +148,21 @@ class OpenRow extends React.Component<OpenRowProps, OpenRowState> {
                        }
                    </TabPane>
                </Tabs>
+                <Modal
+                    title="Format"
+                    okText="Copy"
+                    centered={true}
+                    visible={isShowFormatModal}
+                    onCancel={this.hideFormatModal}
+                    maskClosable={false} // clicking the mask layer does not close the modal
+                    onOk={this.copyParams}
+                    destroyOnClose={true}
+                    width="60%"
+                    className="format"
+                >
+                    {/* write string in pre to show format string */}
+                    <pre className="formatStr">{formatStr}</pre>
+                </Modal>
            </Row >
        );
    }

--- a/src/webui/src/static/style/copyParameter.scss
+++ b/src/webui/src/static/style/copyParameter.scss
+$color: #f2f2f2;
+.formatStr{
+    border: 1px solid #8f8f8f;
+    color: #333;
+    padding: 5px 10px;
+    background-color: #fff; 
+}
+.format {
+    .ant-modal-header{
+        background-color: $color;
+        border-bottom: none;
+    }
+    .ant-modal-footer{
+        background-color: $color;
+        border-top: none;
+    }
+    .ant-modal-body{
+        background-color: $color;
+        padding: 10px 24px !important;
+    }
+}
--- a/src/webui/src/static/style/overview.scss
+++ b/src/webui/src/static/style/overview.scss
@@ -52,4 +52,3 @@
 .link{
    margin-bottom: 10px;
 }
--- a/test/pipelines-it-remote-windows.yml
+++ b/test/pipelines-it-remote-windows.yml
+jobs:
+- job: 'integration_test_remote_windows'
+  steps:
+  - script: python -m pip install --upgrade pip setuptools
+    displayName: 'Install python tools'
+  - task: CopyFilesOverSSH@0
+    inputs:
+      sshEndpoint: $(end_point)
+      targetFolder: /tmp/nnitest/$(Build.BuildId)/nni-remote
+      overwrite: true
+    displayName: 'Copy all files to remote machine'
+  - script: |
+      powershell.exe -file install.ps1
+    displayName: 'Install nni toolkit via source code'
+  - script: |
+      python -m pip install scikit-learn==0.20.1 --user
+    displayName: 'Install dependencies for integration tests'
+  - task: SSH@0
+    inputs:
+      sshEndpoint: $(end_point)
+      runOptions: inline
+      inline: cd /tmp/nnitest/$(Build.BuildId)/nni-remote/deployment/pypi;make build
+    continueOnError: true
+    displayName: 'build nni bdsit_wheel'
+  - task: SSH@0
+    inputs:
+      sshEndpoint: $(end_point)
+      runOptions: commands
+      commands: python3 /tmp/nnitest/$(Build.BuildId)/nni-remote/test/remote_docker.py --mode start --name $(Build.BuildId) --image nni/nni --os windows
+    displayName: 'Start docker'
+  - powershell: |
+      Write-Host "Downloading Putty..."
+      (New-Object Net.WebClient).DownloadFile("https://the.earth.li/~sgtatham/putty/latest/w64/pscp.exe", "$(Agent.TempDirectory)\pscp.exe")
+      $(Agent.TempDirectory)\pscp.exe -hostkey $(hostkey) -pw $(pscp_pwd) $(remote_user)@$(remote_host):/tmp/nnitest/$(Build.BuildId)/port test\port
+      Get-Content test\port
+    displayName: 'Get docker port'
+  - powershell: |
+      cd test
+      python generate_ts_config.py --ts remote --remote_user $(docker_user) --remote_host $(remote_host) --remote_port $(Get-Content port) --remote_pwd $(docker_pwd) --nni_manager_ip $(nni_manager_ip)
+      Get-Content training_service.yml
+      python config_test.py --ts remote --exclude cifar10,smac,bohb
+    displayName: 'integration test'
+  - task: SSH@0
+    inputs:
+      sshEndpoint: $(end_point)
+      runOptions: commands
+      commands: python3 /tmp/nnitest/$(Build.BuildId)/nni-remote/test/remote_docker.py --mode stop --name $(Build.BuildId) --os windows
+    displayName: 'Stop docker'
--- a/test/remote_docker.py
+++ b/test/remote_docker.py
@@ -30,18 +30,33 @@ def find_wheel_package(dir):
            return file_name
    return None
-def start_container(image, name):
+def start_container(image, name, nnimanager_os):
    '''Start docker container, generate a port in /tmp/nnitest/{name}/port file'''
    port = find_port()
    source_dir = '/tmp/nnitest/' + name
    run_cmds = ['docker', 'run', '-d', '-p', str(port) + ':22', '--name', name, '--mount', 'type=bind,source=' + source_dir + ',target=/tmp/nni', image]
    output = check_output(run_cmds)
    commit_id = output.decode('utf-8')
-    wheel_name = find_wheel_package(os.path.join(source_dir, 'dist'))
+    if nnimanager_os == 'windows':
+        wheel_name = find_wheel_package(os.path.join(source_dir, 'nni-remote/deployment/pypi/dist'))
+    else:
+        wheel_name = find_wheel_package(os.path.join(source_dir, 'dist'))
    if not wheel_name:
        print('Error: could not find wheel package in {0}'.format(source_dir))
        exit(1)
-    sdk_cmds = ['docker', 'exec', name, 'python3', '-m', 'pip', 'install', '/tmp/nni/dist/{0}'.format(wheel_name)]
+    def get_dist(wheel_name):
+        '''get the wheel package path'''
+        if nnimanager_os == 'windows':
+            return '/tmp/nni/nni-remote/deployment/pypi/dist/{0}'.format(wheel_name)
+        else:
+            return '/tmp/nni/dist/{0}'.format(wheel_name)
+    pip_cmds = ['docker', 'exec', name, 'python3', '-m', 'pip', 'install', '--upgrade', 'pip']
+    check_call(pip_cmds)
+    sdk_cmds = ['docker', 'exec', name, 'python3', '-m', 'pip', 'install', get_dist(wheel_name)]
    check_call(sdk_cmds)
    with open(source_dir + '/port', 'w') as file:
        file.write(str(port))
@@ -58,8 +73,9 @@ if __name__ == '__main__':
    parser.add_argument('--mode', required=True, choices=['start', 'stop'], dest='mode', help='start or stop a container')
    parser.add_argument('--name', required=True, dest='name', help='the name of container to be used')
    parser.add_argument('--image', dest='image', help='the image to be used')
+    parser.add_argument('--os', dest='os', default='unix', choices=['unix', 'windows'], help='nniManager os version')
    args = parser.parse_args()
    if args.mode == 'start':
-        start_container(args.image, args.name)
+        start_container(args.image, args.name, args.os)
    else:
        stop_container(args.name)
--- a/tools/README_zh_CN.md
+++ b/tools/README_zh_CN.md
@@ -54,4 +54,4 @@ NNI CTL 模块用来控制 Neural Network Intelligence，包括开始新 Experim
 ## Getting started with NNI CTL
-See the [NNI CTL documentation](../docs/zh_CN/NNICTLDOC.md).
+See the [NNI CTL documentation](../docs/zh_CN/Nnictl.md).
\ No newline at end of file
--- a/uninstall.ps1
+++ b/uninstall.ps1
+$NNI_DEPENDENCY_FOLDER = [System.IO.Path]::GetTempPath()+$env:USERNAME
-$NNI_DEPENDENCY_FOLDER = "C:\tmp\$env:USERNAME"
 $env:PYTHONIOENCODING = "UTF-8"
 if($env:VIRTUAL_ENV){
@@ -27,4 +26,4 @@ Remove-Item "src/nni_manager/node_modules" -Recurse -Force
 Remove-Item "src/webui/build" -Recurse -Force
 Remove-Item "src/webui/node_modules" -Recurse -Force
 Remove-Item $NNI_YARN_FOLDER -Recurse -Force
 Remove-Item $NNI_NODE_FOLDER -Recurse -Force
\ No newline at end of file