Commit e773dfcc authored by qianyj's avatar qianyj

create branch for v2.9

# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
# Required
version: 2
# Set the version of Python and other tools you might need
build:
os: ubuntu-20.04
tools:
python: "3.9"
nodejs: "16" # specified but actually not used
# You can also specify other tool versions:
# rust: "1.55"
# golang: "1.17"
# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/source/conf.py
# Optionally declare the Python requirements required to build your docs
python:
install:
- requirements: dependencies/develop.txt
- requirements: dependencies/required.txt
# The issue with smac and swig prevents us from installing required_extra.
# As a result, the docstring from several tuners including SMAC, PPO cannot be rendered.
# - requirements: dependencies/required_extra.txt
- requirements: dependencies/recommended.txt
# We cannot have `python setup.py install` here,
# because it's not supported by NNI.
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- name: "Microsoft"
title: "Neural Network Intelligence"
date-released: 2021-01-14
url: "https://github.com/microsoft/nni"
version: 2.0
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
FROM nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04
ARG NNI_RELEASE
LABEL maintainer='Microsoft NNI Team<nni@microsoft.com>'
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update
RUN apt-get -y install \
automake \
build-essential \
cmake \
curl \
git \
openssh-server \
python3 \
python3-dev \
python3-pip \
sudo \
unzip \
wget \
zip
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*
RUN ln -s python3 /usr/bin/python
RUN python3 -m pip --no-cache-dir install pip==22.0.3 setuptools==60.9.1 wheel==0.37.1
RUN python3 -m pip --no-cache-dir install \
lightgbm==3.3.2 \
numpy==1.22.2 \
pandas==1.4.1 \
scikit-learn==1.0.2 \
scipy==1.8.0
RUN python3 -m pip --no-cache-dir install \
torch==1.10.2+cu113 \
torchvision==0.11.3+cu113 \
torchaudio==0.10.2+cu113 \
-f https://download.pytorch.org/whl/cu113/torch_stable.html
RUN python3 -m pip --no-cache-dir install pytorch-lightning==1.6.1
RUN python3 -m pip --no-cache-dir install tensorflow==2.9.1
RUN python3 -m pip --no-cache-dir install azureml==0.2.7 azureml-sdk==1.38.0
COPY dist/nni-${NNI_RELEASE}-py3-none-manylinux1_x86_64.whl .
RUN python3 -m pip install nni-${NNI_RELEASE}-py3-none-manylinux1_x86_64.whl
RUN rm nni-${NNI_RELEASE}-py3-none-manylinux1_x86_64.whl
ENV PATH=/root/.local/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/bin:/usr/bin:/usr/sbin
WORKDIR /root
Copyright (c) Microsoft Corporation.
MIT License
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
<div align="center">
<img src="docs/img/nni_logo.png" width="600"/>
</div>
<br/>
[![MIT licensed](https://img.shields.io/badge/license-MIT-brightgreen.svg)](LICENSE)
[![Issues](https://img.shields.io/github/issues-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen)
[![Bugs](https://img.shields.io/github/issues/Microsoft/nni/bug.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3Abug)
[![Pull Requests](https://img.shields.io/github/issues-pr-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/pulls?q=is%3Apr+is%3Aopen)
[![Version](https://img.shields.io/github/release/Microsoft/nni.svg)](https://github.com/Microsoft/nni/releases)
[![Documentation Status](https://readthedocs.org/projects/nni/badge/?version=stable)](https://nni.readthedocs.io/en/stable/?badge=stable)
[![](https://img.shields.io/github/contributors-anon/microsoft/nni)](https://github.com/microsoft/nni/graphs/contributors)
[<img src="docs/img/readme_banner.png" width="100%"/>](https://nni.readthedocs.io/en/stable)
NNI automates feature engineering, neural architecture search, hyperparameter tuning, and model compression for deep learning. Find the latest features, API, examples and tutorials in our **[official documentation](https://nni.readthedocs.io/) ([简体中文版点这里](https://nni.readthedocs.io/zh/stable))**.
## What's NEW! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>
* **New release**: [v2.9 is available](https://github.com/microsoft/nni/releases/tag/v2.9) - _released on Sept-8-2022_
* **New demo available**: [Youtube entry](https://www.youtube.com/channel/UCKcafm6861B2mnYhPbZHavw) | [Bilibili 入口](https://space.bilibili.com/1649051673) - _last updated on June-22-2022_
* **New research paper**: [SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute](https://www.usenix.org/system/files/osdi22-zheng-ningxin.pdf) - _published in OSDI 2022_
* **New research paper**: [Privacy-preserving Online AutoML for Domain-Specific Face Detection](https://openaccess.thecvf.com/content/CVPR2022/papers/Yan_Privacy-Preserving_Online_AutoML_for_Domain-Specific_Face_Detection_CVPR_2022_paper.pdf) - _published in CVPR 2022_
* **Newly upgraded documentation**: [Doc upgraded](https://nni.readthedocs.io/en/stable)
## Installation
See the [NNI installation guide](https://nni.readthedocs.io/en/stable/installation.html) to install with pip or build from source.
To install the current release:
```
$ pip install nni
```
To update NNI to the latest version, add the `--upgrade` flag to the above command.
## NNI capabilities at a glance
<img src="docs/img/overview.svg" width="100%"/>
<table>
<tbody>
<tr align="center" valign="bottom">
<td></td>
<td>
<b>Hyperparameter Tuning</b>
<img src="docs/img/bar.png" />
</td>
<td>
<b>Neural Architecture Search</b>
<img src="docs/img/bar.png" />
</td>
<td>
<b>Model Compression</b>
<img src="docs/img/bar.png" />
</td>
</tr>
<tr valign="top">
<td align="center" valign="middle">
<b>Algorithms</b>
</td>
<td>
<ul>
<li><b>Exhaustive search</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.gridsearch_tuner.GridSearchTuner">Grid Search</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.random_tuner.RandomTuner">Random</a></li>
</ul>
<li><b>Heuristic search</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.hyperopt_tuner.HyperoptTuner">Anneal</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.evolution_tuner.EvolutionTuner">Evolution</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.hyperband_advisor.Hyperband">Hyperband</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.pbt_tuner.PBTTuner">PBT</a></li>
</ul>
<li><b>Bayesian optimization</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.bohb_advisor.BOHB">BOHB</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.dngo_tuner.DNGOTuner">DNGO</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.gp_tuner.GPTuner">GP</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.metis_tuner.MetisTuner">Metis</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.smac_tuner.SMACTuner">SMAC</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/reference/hpo.html#nni.algorithms.hpo.tpe_tuner.TpeTuner">TPE</a></li>
</ul>
</ul>
</td>
<td>
<ul>
<li><b>Multi-trial</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#grid-search-strategy">Grid Search</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#policy-based-rl-strategy">Policy Based RL</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#random-strategy">Random</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#regularized-evolution-strategy">Regularized Evolution</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#tpe-strategy">TPE</a></li>
</ul>
<li><b>One-shot</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#darts-strategy">DARTS</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#enas-strategy">ENAS</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#fbnet-strategy">FBNet</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#proxylessnas-strategy">ProxylessNAS</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/nas/exploration_strategy.html#spos-strategy">SPOS</a></li>
</ul>
</ul>
</td>
<td>
<ul>
<li><b>Pruning</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/compression/pruner.html#level-pruner">Level</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/pruner.html#l1-norm-pruner">L1 Norm</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/pruner.html#taylor-fo-weight-pruner">Taylor FO Weight</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/pruner.html#movement-pruner">Movement</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/pruner.html#agp-pruner">AGP</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/pruner.html#auto-compress-pruner">Auto Compress</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/pruner.html">More...</a></li>
</ul>
<li><b>Quantization</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/compression/quantizer.html#naive-quantizer">Naive</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/quantizer.html#qat-quantizer">QAT</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/quantizer.html#lsq-quantizer">LSQ</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/quantizer.html#observer-quantizer">Observer</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/quantizer.html#dorefa-quantizer">DoReFa</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/compression/quantizer.html#bnn-quantizer">BNN</a></li>
</ul>
</ul>
</td>
<tr align="center" valign="bottom">
<td></td>
<td>
<b>Supported Frameworks</b>
<img src="docs/img/bar.png" />
</td>
<td>
<b>Training Services</b>
<img src="docs/img/bar.png" />
</td>
<td>
<b>Tutorials</b>
<img src="docs/img/bar.png" />
</td>
</tr>
<tr valign="top">
<td align="center" valign="middle">
<b>Supports</b>
</td>
<td>
<ul>
<li>PyTorch</li>
<li>TensorFlow</li>
<li>Scikit-learn</li>
<li>XGBoost</li>
<li>LightGBM</li>
<li>MXNet</li>
<li>Caffe2</li>
<li>More...</li>
</ul>
</td>
<td>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/local.html">Local machine</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/remote.html">Remote SSH servers</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/aml.html">Azure Machine Learning (AML)</a></li>
<li><b>Kubernetes Based</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/openpai.html">OpenPAI</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/kubeflow.html">Kubeflow</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/frameworkcontroller.html">FrameworkController</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/adaptdl.html">AdaptDL</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/paidlc.html">PAI DLC</a></li>
</ul>
<li><a href="https://nni.readthedocs.io/en/latest/experiment/hybrid.html">Hybrid training services</a></li>
</ul>
</td>
<td>
<ul>
<li><b>HPO</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/tutorials/hpo_quickstart_pytorch/main.html">PyTorch</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/tutorials/hpo_quickstart_tensorflow/main.html">TensorFlow</a></li>
</ul>
<li><b>NAS</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/tutorials/hello_nas.html">Hello NAS</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/tutorials/nasbench_as_dataset.html">NAS Benchmarks</a></li>
</ul>
<li><b>Compression</b></li>
<ul>
<li><a href="https://nni.readthedocs.io/en/latest/tutorials/pruning_quick_start_mnist.html">Pruning</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/tutorials/pruning_speed_up.html">Pruning Speedup</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/tutorials/quantization_quick_start_mnist.html">Quantization</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/tutorials/quantization_speed_up.html">Quantization Speedup</a></li>
</ul>
</ul>
</td>
</tbody>
</table>
<img src="docs/static/img/webui.gif" alt="webui" width="100%"/>
## Resources
* [NNI Documentation Homepage](https://nni.readthedocs.io/en/stable)
* [NNI Installation Guide](https://nni.readthedocs.io/en/stable/installation.html)
* [NNI Examples](https://nni.readthedocs.io/en/latest/examples.html)
* [Python API Reference](https://nni.readthedocs.io/en/latest/reference/python_api.html)
* [Releases (Change Log)](https://nni.readthedocs.io/en/latest/release.html)
* [Related Research and Publications](https://nni.readthedocs.io/en/latest/notes/research_publications.html)
* [Youtube Channel of NNI](https://www.youtube.com/channel/UCKcafm6861B2mnYhPbZHavw)
* [Bilibili Space of NNI](https://space.bilibili.com/1649051673)
* [Webinar of Introducing Retiarii: A deep learning exploratory-training framework on NNI](https://note.microsoft.com/MSR-Webinar-Retiarii-Registration-Live.html)
* [Community Discussions](https://github.com/microsoft/nni/discussions)
## Contribution guidelines
If you want to contribute to NNI, be sure to review the [contribution guidelines](https://nni.readthedocs.io/en/stable/notes/contributing.html), which include instructions for submitting feedback, best coding practices, and the code of conduct.
We use [GitHub issues](https://github.com/microsoft/nni/issues) to track feature requests and bugs.
Please use [NNI Discussion](https://github.com/microsoft/nni/discussions) for general questions and new ideas.
For questions of specific use cases, please go to [Stack Overflow](https://stackoverflow.com/questions/tagged/nni).
You are also welcome to join discussions via the following IM groups.
|Gitter| |WeChat|
|----|----|----|
|![image](https://user-images.githubusercontent.com/39592018/80665738-e0574a80-8acc-11ea-91bc-0836dc4cbf89.png)| OR |![image](https://github.com/scarlett2018/nniutil/raw/master/wechat.png)|
Over the past few years, NNI has received thousands of pieces of feedback through GitHub issues and pull requests from hundreds of contributors.
We appreciate all contributions from the community that make NNI thrive.
<img src="https://img.shields.io/github/contributors-anon/microsoft/nni"/>
<a href="https://github.com/microsoft/nni/graphs/contributors"><img src="https://contrib.rocks/image?repo=microsoft/nni&max=240&columns=18" /></a>
## Test status
### Essentials
| Type | Status |
| :---: | :---: |
| Fast test | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/fast%20test?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=54&branchName=master) |
| Full test - HPO | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/full%20test%20-%20HPO?repoName=microsoft%2Fnni&branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=90&repoName=microsoft%2Fnni&branchName=master) |
| Full test - NAS | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/full%20test%20-%20NAS?repoName=microsoft%2Fnni&branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=89&repoName=microsoft%2Fnni&branchName=master) |
| Full test - compression | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/full%20test%20-%20compression?repoName=microsoft%2Fnni&branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=91&repoName=microsoft%2Fnni&branchName=master) |
### Training services
| Type | Status |
| :---: | :---: |
| Local - linux | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20local%20-%20linux?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=92&branchName=master) |
| Local - windows | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20local%20-%20windows?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=98&branchName=master) |
| Remote - linux to linux | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20remote%20-%20linux%20to%20linux?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=64&branchName=master) |
| Remote - windows to windows | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20remote%20-%20windows%20to%20windows?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=99&branchName=master) |
| OpenPAI | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20openpai%20-%20linux?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=65&branchName=master) |
| Frameworkcontroller | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20frameworkcontroller?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=70&branchName=master) |
| Kubeflow | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20kubeflow?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=69&branchName=master) |
| Hybrid | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20hybrid?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=79&branchName=master) |
| AzureML | [![Build Status](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/integration%20test%20-%20aml?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=78&branchName=master) |
## Related Projects
Targeting openness and advancing state-of-the-art technology, [Microsoft Research (MSR)](https://www.microsoft.com/en-us/research/group/systems-and-networking-research-group-asia/) has also released a few other open-source projects.
* [OpenPAI](https://github.com/Microsoft/pai): an open-source platform that provides complete AI model training and resource management capabilities; it is easy to extend and supports on-premise, cloud, and hybrid environments at various scales.
* [FrameworkController](https://github.com/Microsoft/frameworkcontroller): an open-source, general-purpose Kubernetes Pod controller that orchestrates all kinds of applications on Kubernetes with a single controller.
* [MMdnn](https://github.com/Microsoft/MMdnn): a comprehensive, cross-framework solution to convert, visualize, and diagnose deep neural network models. The "MM" in MMdnn stands for model management, and "dnn" is an acronym for deep neural network.
* [SPTAG](https://github.com/Microsoft/SPTAG): Space Partition Tree And Graph (SPTAG) is an open-source library for large-scale approximate nearest neighbor search of vectors.
* [nn-Meter](https://github.com/microsoft/nn-Meter) : An accurate inference latency predictor for DNN models on diverse edge devices.
We encourage researchers and students to leverage these projects to accelerate AI development and research.
## License
The entire codebase is under [MIT license](LICENSE).
<!-- BEGIN MICROSOFT SECURITY.MD V0.0.5 BLOCK -->
## Security
Microsoft takes the security of our software products and services seriously. This includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).
If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below.
## Reporting Security Issues
**Please do not report security vulnerabilities through public GitHub issues.**
Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report).
If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).
You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
* Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
* Full paths of source file(s) related to the manifestation of the issue
* The location of the affected source code (tag/branch/commit or direct URL)
* Any special configuration required to reproduce the issue
* Step-by-step instructions to reproduce the issue
* Proof-of-concept or exploit code (if possible)
* Impact of the issue, including how an attacker might exploit the issue
This information will help us triage your report more quickly.
If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://microsoft.com/msrc/bounty) page for more details about our active programs.
## Preferred Languages
We prefer all communications to be in English.
## Policy
Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://www.microsoft.com/en-us/msrc/cvd).
<!-- END MICROSOFT SECURITY.MD BLOCK -->
project_id_env: CROWDIN_PROJECT_ID
api_token_env: CROWDIN_PERSONAL_TOKEN
preserve_hierarchy: true
files:
- source: /docs/en_US/**/*
ignore:
- /docs/zh_CN/**/*
translation: /docs/%locale_with_underscore%/**/%original_file_name%
- source: '/**/*.[mM][dD]'
ignore:
- '/**/*_%locale_with_underscore%.md'
- /docs
- /%locale_with_underscore%
- /.github
translation: /%original_path%/%file_name%_%locale_with_underscore%.md
aioconsole
coverage
cython
flake8
ipython
jupyter
jupyterlab == 3.0.9
nbsphinx
pylint < 2.15
pyright == 1.1.250
pytest
pytest-cov
rstcheck >= 6.0
sphinx >= 4.5
sphinx-argparse-nni >= 0.4.0
sphinx-copybutton
sphinx-gallery
sphinx-intl
sphinx-tabs
sphinxcontrib-bibtex
sphinxcontrib-youtube
git+https://github.com/bashtage/sphinx-material@6e0ef82#egg=sphinx_material
# Recommended because some non-commonly-used modules/examples depend on those packages.
-f https://download.pytorch.org/whl/torch_stable.html
tensorflow >= 2.7.0
tensorboard >= 2.7.0
torch == 1.10.0+cpu ; sys_platform != "darwin"
torch == 1.10.0 ; sys_platform == "darwin"
torchvision == 0.11.1+cpu ; sys_platform != "darwin"
torchvision == 0.11.1 ; sys_platform == "darwin"
pytorch-lightning >= 1.6.1
torchmetrics
lightgbm
onnx
peewee
graphviz
gym
tianshou >= 0.4.1
matplotlib
timm >= 0.5.4
# Recommended because some non-commonly-used modules/examples depend on those packages.
-f https://download.pytorch.org/whl/torch_stable.html
tensorflow
torch == 1.10.0+cu113
torchvision == 0.11.1+cu113
pytorch-lightning >= 1.6.1
lightgbm
onnx
peewee
graphviz
gym
tianshou >= 0.4.1
timm >= 0.5.4
-f https://download.pytorch.org/whl/torch_stable.html
torch == 1.7.1+cpu
torchvision == 0.8.2+cpu
# It will install pytorch-lightning 0.8.x and unit tests won't work.
# Latest version has conflict with tensorboard and tensorflow 1.x.
pytorch-lightning
torchmetrics
lightgbm
onnx
peewee
graphviz
gym < 0.23
tianshou >= 0.4.1, < 0.4.9
matplotlib
timm >= 0.5.4
# TODO: time to drop tensorflow 1.x
keras
tensorflow < 2.0
protobuf <= 3.20.1
astor
cloudpickle
colorama
filelock
hyperopt == 0.1.2
json_tricks >= 3.15.5
numpy < 1.22 ; python_version < "3.8"
numpy ; python_version >= "3.8"
packaging
pandas
prettytable
psutil
PythonWebHDFS
pyyaml >= 5.4
requests
responses
schema
scikit-learn >= 0.24.1
scipy < 1.8 ; python_version < "3.8"
scipy ; python_version >= "3.8"
tqdm
typeguard
typing_extensions >= 4.0.0
websockets >= 10.1
# the following content will be read by setup.py.
# please follow the logic in setup.py.
# SMAC
ConfigSpaceNNI>=0.4.7.3
smac4nni
# BOHB
ConfigSpace>=0.4.17
statsmodels>=0.12.0
# PPOTuner
gym
# DNGO
pybnn
pip < 23
setuptools < 63
wheel < 0.38
build/
# legacy build
_build/
# ignored copied rst in tutorials
/source/tutorials/**/cp_*.rst
# auto-generated reference table
_modules/
# Machine-style translation files
*.mo
.PHONY: all
all: en zh
.PHONY: en
en:
## English ##
sphinx-build -T source build/html
.PHONY: zh
zh:
## Chinese ##
sphinx-build -T -D language=zh source build/html_zh
# Build message catalogs for translation
.PHONY: i18n
i18n:
sphinx-build -b getpartialtext source build/i18n
sphinx-intl update -p build/i18n -d source/locales -l zh
.PHONY: clean
clean:
rm -rf build
rm -rf source/reference/_modules
Curve Fitting Assessor on NNI
=============================
Introduction
------------
The Curve Fitting Assessor is an LPA (learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the prediction of the final epoch's performance is worse than the best final performance in the trial history.
In this algorithm, we use 12 curves to fit the learning curve. The set of parametric curve models is chosen from this `reference paper <http://aad.informatik.uni-freiburg.de/papers/15-IJCAI-Extrapolation_of_Learning_Curves.pdf>`__. The shape of these learning curves coincides with our prior knowledge about the form of learning curves: they are typically increasing, saturating functions.
.. image:: ../../img/curvefitting_learning_curve.PNG
:target: ../../img/curvefitting_learning_curve.PNG
:alt: learning_curve
We combine all learning curve models into a single, more powerful model. This combined model is given by a weighted linear combination:
.. image:: ../../img/curvefitting_f_comb.gif
:target: ../../img/curvefitting_f_comb.gif
:alt: f_comb
with the new combined parameter vector
.. image:: ../../img/curvefitting_expression_xi.gif
:target: ../../img/curvefitting_expression_xi.gif
:alt: expression_xi
We assume additive Gaussian noise, with the noise parameter initialized to its maximum-likelihood estimate.
We determine the maximum-probability value of the combined parameter vector by learning from the trial history, then use this value to predict future trial performance and stop unpromising trials early to save computing resources.
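As an illustration, the weighted combination can be sketched in Python. The two curve models and their parameter values below are illustrative stand-ins, not the 12 models actually implemented in NNI's ``curvefunctions.py``:

```python
import math

# Two example parametric learning-curve models in the spirit of the
# reference paper. Parameter values here are illustrative, not fitted.
def pow3(x, c=0.84, a=0.52, alpha=0.01):
    """y = c - a * x^(-alpha): increasing and saturating toward c."""
    return c - a * x ** (-alpha)

def log_power(x, a=0.77, b=2.98, c=-0.51):
    """y = a / (1 + (x / e^b)^c): also increasing and saturating."""
    return a / (1.0 + (x / math.exp(b)) ** c)

def f_comb(x, weights, models):
    """Weighted linear combination of the individual curve models."""
    return sum(w * m(x) for w, m in zip(weights, models))

# Predict performance at epoch 20 with equal weights.
prediction = f_comb(20, [0.5, 0.5], [pow3, log_power])
```

In the real assessor, the weights and per-curve parameters form the combined parameter vector, which is estimated from the observed trial history rather than fixed.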
Concretely, this algorithm goes through three stages of learning, predicting, and assessing.
* Step1: Learning. We learn from the trial history of the current trial and determine ``\xi`` from a Bayesian perspective. First, we fit each curve using the least-squares method, implemented by ``fit_theta``. After obtaining the parameters, we filter the curves and remove outliers, implemented by ``filter_curve``. Finally, we use the MCMC sampling method, implemented by ``mcmc_sampling``, to adjust the weight of each curve. At this point, all the parameters in ``\xi`` are determined.
* Step2: Predicting. We calculate the expected final accuracy at the target position (i.e., the total number of epochs), implemented by ``f_comb``, using ``\xi`` and the formula of the combined model.
* Step3: Assessing. If the fitting result doesn't converge, the predicted value is ``None``; in this case, we return ``AssessResult.Good`` to ask for more accuracy information and predict again later. Otherwise, ``predict()`` returns a positive value: if it is strictly greater than the best final performance in history multiplied by ``THRESHOLD`` (default 0.95), we return ``AssessResult.Good``; otherwise, we return ``AssessResult.Bad``.
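The assessing decision in Step 3 can be condensed into a minimal sketch (a hypothetical helper returning strings instead of ``AssessResult`` values, not NNI's actual implementation):

```python
THRESHOLD = 0.95  # same default as the `threshold` class argument

def assess(predicted_final, best_history):
    """Decide whether a trial may continue ('Good') or should stop ('Bad').

    predicted_final is the output of the predict step, or None when the
    curve fitting did not converge.
    """
    if predicted_final is None:
        # Fit did not converge: ask for more intermediate results.
        return "Good"
    if predicted_final > best_history * THRESHOLD:
        return "Good"
    return "Bad"

assert assess(None, 0.9) == "Good"   # not enough information yet
assert assess(0.86, 0.9) == "Good"   # 0.86 > 0.9 * 0.95 = 0.855
assert assess(0.80, 0.9) == "Bad"    # predicted to underperform
```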
The figure below shows the result of our algorithm on MNIST trial history data, where the green points represent the data observed by the assessor, the blue points represent the future but unknown data, and the red line is the curve predicted by the Curve Fitting Assessor.
.. image:: ../../img/curvefitting_example.PNG
:target: ../../img/curvefitting_example.PNG
:alt: examples
Usage
-----
To use the Curve Fitting Assessor, add the following spec to your experiment's YAML config file:
.. code-block:: yaml
   assessor:
     builtinAssessorName: Curvefitting
     classArgs:
       # (required) The total number of epochs.
       # We need to know the number of epochs to determine which point to predict.
       epoch_num: 20
       # (optional) To save computing resources, we start predicting only after
       # receiving start_step intermediate results. Default: 6.
       start_step: 6
       # (optional) The threshold used to decide whether to early-stop a poorly
       # performing trial. For example: if threshold = 0.95 and the best
       # performance in history is 0.9, we stop any trial whose predicted value
       # is lower than 0.95 * 0.9 = 0.855. Default: 0.95.
       threshold: 0.95
       # (optional) The gap between assessor judgements. For example: if gap = 2
       # and start_step = 6, we assess at intermediate results 6, 8, 10, 12, ...
       # Default: 1.
       gap: 1
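The interaction between ``start_step`` and ``gap`` can be illustrated with a small sketch (the helper below is illustrative, not part of NNI's public API):

```python
def judgement_steps(epoch_num, start_step=6, gap=1):
    """Intermediate-result indices at which the assessor judges a trial.

    Illustrative helper, not part of NNI's API: judging starts at
    start_step and then repeats every `gap` results.
    """
    return [step for step in range(start_step, epoch_num + 1)
            if (step - start_step) % gap == 0]

# With gap = 2 and start_step = 6 (the example from the config comments):
steps = judgement_steps(epoch_num=20, start_step=6, gap=2)
# steps == [6, 8, 10, 12, 14, 16, 18, 20]
```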
Limitation
----------
According to the original paper, only increasing functions are supported. Therefore, this assessor can only be used to maximize optimization metrics; for example, it works for accuracy but not for loss.
File Structure
--------------
The assessor consists of several files, functions, and classes. Here we briefly describe a few of them.
* ``curvefunctions.py`` includes all the function expressions and default parameters.
* ``modelfactory.py`` includes learning and predicting; the corresponding calculation part is also implemented here.
* ``curvefitting_assessor.py`` is the assessor, which receives the trial history and assesses whether to early stop the trial.
TODO
----
* Further improve the accuracy of the prediction and test it on more models.
**How to Debug in NNI**
===========================
Overview
--------
There are three parts of NNI that might produce logs: nnimanager, dispatcher, and trial. Here we introduce them briefly; for more information, please refer to `Overview <../Overview.rst>`__.
* **NNI controller**\ : NNI controller (nnictl) is the nni command-line tool that is used to manage experiments (e.g., start an experiment).
* **nnimanager**\ : nnimanager is the core of NNI, whose log is important when the whole experiment fails (e.g., no webUI or training service fails)
* **Dispatcher**\ : Dispatcher calls the methods of **Tuner** and **Assessor**. Logs of dispatcher are related to the tuner or assessor code.
* **Tuner**\ : Tuner is an AutoML algorithm, which generates a new configuration for the next try. A new trial will run with this configuration.
* **Assessor**\ : Assessor analyzes trial's intermediate results (e.g., periodically evaluated accuracy on test dataset) to tell whether this trial can be early stopped or not.
* **Trial**\ : Trial code is the code you write to run your experiment; each trial is an individual attempt at applying a new configuration (e.g., a set of hyperparameter values, a specific neural architecture).
Where is the log
----------------
There are three kinds of logs in NNI. When creating a new experiment, you can set the log level to debug by adding ``--debug``. Besides, you can set a more detailed log level in your configuration file by using the
``logLevel`` keyword. Available logLevels are: ``trace``\ , ``debug``\ , ``info``\ , ``warning``\ , ``error``\ , ``fatal``.
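For example, a configuration-file fragment setting the most verbose level might look like this (all other experiment fields omitted):

```yaml
# fragment of an experiment YAML configuration file
logLevel: debug
```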
NNI controller
^^^^^^^^^^^^^^
All possible errors that happen when launching an NNI experiment can be found here.
You can use ``nnictl log stderr`` to find error information. For more options, please refer to `NNICTL <Nnictl.rst>`__.
Experiment Root Directory
^^^^^^^^^^^^^^^^^^^^^^^^^
Every experiment has a root folder, which is shown in the top-right corner of the webUI. If the webUI fails, you can assemble the path yourself by replacing ``experiment_id`` with your actual experiment ID in ``~/nni-experiments/experiment_id/``. The ``experiment_id`` is shown when you run ``nnictl create ...`` to create a new experiment.
..
For flexibility, we also offer a ``logDir`` option in your configuration, which specifies the directory to store all experiments (defaults to ``~/nni-experiments``\ ). Please refer to `Configuration <ExperimentConfig.rst>`__ for more details.
Under that directory, there is another directory named ``log``\ , where ``nnimanager.log`` and ``dispatcher.log`` are placed.
Trial Root Directory
^^^^^^^^^^^^^^^^^^^^
Usually in the webUI, you can click the ``+`` on the left of every trial to expand it and see the trial's log path.
Besides, there is another directory under experiment root directory, named ``trials``\ , which stores all the trials.
Every trial has a unique ID as its directory name. In this directory, a file named ``stderr`` records the trial's error output, and another named ``trial.log`` records the trial's log.
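The layout above can be expressed as a small path helper (an illustrative sketch, not an NNI API; the experiment and trial IDs below are placeholders):

```python
from pathlib import Path

def trial_log_paths(experiment_id, trial_id, log_dir="~/nni-experiments"):
    """Assemble the stderr and trial.log paths for a single trial
    under the default experiment root directory."""
    trial_dir = Path(log_dir).expanduser() / experiment_id / "trials" / trial_id
    return trial_dir / "stderr", trial_dir / "trial.log"

stderr_path, log_path = trial_log_paths("GgJSanItU", "aBcDe")  # placeholder IDs
```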
Different kinds of errors
-------------------------
There are different kinds of errors. However, they can be divided into three categories based on their severity, so when NNI fails, check each part sequentially.
Generally, if the webUI starts successfully, there is a ``Status`` in the ``Overview`` tab, which serves as an indicator of what kind of error happened. Otherwise, you should check manually.
**NNI** Fails
^^^^^^^^^^^^^^^^^
This is the most serious error. When this happens, the whole experiment fails and no trial will be run. Usually this might be related to some installation problem.
When this happens, you should check ``nnictl``\ 's error output file ``stderr`` (i.e., nnictl log stderr) and then the ``nnimanager``\ 's log to find if there is any error.
**Dispatcher** Fails
^^^^^^^^^^^^^^^^^^^^^^^^
Usually, for new users of NNI, a dispatcher failure means that the tuner has failed. You can check the dispatcher's log to see what happened. For a built-in tuner, a common error is an invalid search space (an unsupported type of search space, or an inconsistency between the initialization args in the configuration file and the actual args of the tuner's ``__init__`` function).
Take the latter situation as an example. If you write a customized tuner whose ``__init__`` function has an argument called ``optimize_mode`` that you do not provide in your configuration file, NNI will fail to run your tuner and the experiment fails. You can see errors in the webUI like:
.. image:: ../../img/dispatcher_error.jpg
:target: ../../img/dispatcher_error.jpg
:alt:
Here we can see it is a dispatcher error. So we can check dispatcher's log, which might look like:
.. code-block:: bash
[2019-02-19 19:36:45] DEBUG (nni.main/MainThread) START
[2019-02-19 19:36:47] ERROR (nni.main/MainThread) __init__() missing 1 required positional arguments: 'optimize_mode'
Traceback (most recent call last):
File "/usr/lib/python3.7/site-packages/nni/__main__.py", line 202, in <module>
main()
File "/usr/lib/python3.7/site-packages/nni/__main__.py", line 164, in main
args.tuner_args)
File "/usr/lib/python3.7/site-packages/nni/__main__.py", line 81, in create_customized_class_instance
instance = class_constructor(**class_args)
TypeError: __init__() missing 1 required positional arguments: 'optimize_mode'.
**Trial** Fails
^^^^^^^^^^^^^^^^^^^
In this situation, NNI can still run and create new trials.
It means your trial code (which is run by NNI) fails. This kind of error is strongly related to your trial code. Please check the trial's log to fix any errors shown there.
A common example would be running the MNIST example without installing TensorFlow: there will be an ``ImportError`` (trying to import TensorFlow without having installed it), and thus every trial fails.
.. image:: ../../img/trial_error.jpg
:target: ../../img/trial_error.jpg
:alt:
As it shows, every trial has a log path, where you can find trial's log and stderr.
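One way to catch such import problems early is a small dependency check at the top of the trial code (an illustrative pattern, not part of NNI):

```python
import importlib.util

def missing_dependencies(*modules):
    """Return the names of modules that cannot be imported, so a trial
    can fail fast with a clear message instead of an ImportError mid-run."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

missing = missing_dependencies("json", "some_uninstalled_framework")
# in real trial code you would raise SystemExit here if `missing` is non-empty
```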
In addition to experiment-level debugging, NNI also provides the capability to debug a single trial without starting the entire experiment. Refer to `standalone mode <../TrialExample/Trials.rst#standalone-mode-for-debugging>`__ for more information about debugging single trial code.
How to Launch an Experiment from Python
=======================================
.. toctree::
:hidden:
Start Usage <python_api_start>
Connect Usage <python_api_connect>
Overview
--------
Since ``v2.0``, NNI provides a new way to launch the experiments. Before that, you need to configure the experiment in the YAML configuration file and then use the ``nnictl`` command to launch the experiment. Now, you can also configure and run experiments directly in the Python file. If you are familiar with Python programming, this will undoubtedly bring you more convenience.
Run a New Experiment
--------------------
After successfully installing ``nni`` and preparing the `trial code <../TrialExample/Trials.rst>`__, you can start the experiment with a Python script in the following two steps.
Step 1 - Initialize an experiment instance and configure it
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: python
from nni.experiment import Experiment
experiment = Experiment('local')
Now you have an ``Experiment`` instance, and this experiment will launch trials on your local machine because the training service is set to ``'local'``.
See all `training services <../training_services.rst>`__ supported in NNI.
.. code-block:: python
experiment.config.experiment_name = 'MNIST example'
experiment.config.trial_concurrency = 2
experiment.config.max_trial_number = 10
experiment.config.search_space = search_space
experiment.config.trial_command = 'python3 mnist.py'
experiment.config.trial_code_directory = Path(__file__).parent
experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.training_service.use_active_gpu = True
Use the form like ``experiment.config.foo = 'bar'`` to configure your experiment.
See all `builtin tuners <../builtin_tuner.rst>`__ supported in NNI.
See `configuration reference <../reference/experiment_config.rst>`__ for more detailed usage of these fields.
Step 2 - Just run
^^^^^^^^^^^^^^^^^
.. code-block:: python
experiment.run(port=8080)
Now you have successfully launched an NNI experiment, and you can type ``localhost:8080`` into your browser to observe your experiment in real time.
In this way, the experiment runs in the foreground and exits automatically when it finishes.
.. Note:: If you want to run an experiment interactively, use ``start()`` in Step 2. If you launch the experiment from a Python script, please use ``run()``, as ``start()`` is designed for interactive scenarios.
Example
^^^^^^^
Below is an example for this new launching approach. You can find this code in :githublink:`mnist-tfv2/launch.py <examples/trials/mnist-tfv2/launch.py>`.
.. code-block:: python
from pathlib import Path
from nni.experiment import Experiment
search_space = {
"dropout_rate": { "_type": "uniform", "_value": [0.5, 0.9] },
"conv_size": { "_type": "choice", "_value": [2, 3, 5, 7] },
"hidden_size": { "_type": "choice", "_value": [124, 512, 1024] },
"batch_size": { "_type": "choice", "_value": [16, 32] },
"learning_rate": { "_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1] }
}
experiment = Experiment('local')
experiment.config.experiment_name = 'MNIST example'
experiment.config.trial_concurrency = 2
experiment.config.max_trial_number = 10
experiment.config.search_space = search_space
experiment.config.trial_command = 'python3 mnist.py'
experiment.config.trial_code_directory = Path(__file__).parent
experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.training_service.use_active_gpu = True
experiment.run(8080)
Start and Manage a New Experiment
---------------------------------
NNI migrates the API of ``NNI Client`` to this new launching approach. Launch the experiment with ``start()`` instead of ``run()``; then you can use these APIs in interactive mode.
Please refer to `example usage <./python_api_start.rst>`__ and code file :githublink:`python_api_start.ipynb <examples/trials/sklearn/classification/python_api_start.ipynb>`.
.. Note:: ``run()`` polls the experiment status and automatically calls ``stop()`` when the experiment finishes. ``start()`` just launches a new experiment, so you need to stop it manually by calling ``stop()``.
Connect and Manage an Existing Experiment
-----------------------------------------
If you launch an experiment by ``nnictl`` and also want to use these APIs, you can use ``Experiment.connect()`` to connect to an existing experiment.
Please refer to `example usage <./python_api_connect.rst>`__ and code file :githublink:`python_api_connect.ipynb <examples/trials/sklearn/classification/python_api_connect.ipynb>`.
.. Note:: You can use ``stop()`` to stop the experiment when connecting to an existing experiment.
Resume/View and Manage a Stopped Experiment
-------------------------------------------
You can use ``Experiment.resume()`` and ``Experiment.view()`` to resume and view a stopped experiment; these functions behave like ``nnictl resume`` and ``nnictl view``.
If you want to manage the experiment, set ``wait_completion`` to ``False`` and the functions will return an ``Experiment`` instance. For more parameters, please refer to the API reference.
API Reference
-------------
Detailed usage could be found `here <../reference/experiment_config.rst>`__.
* `Experiment`_
* `Experiment Config <#Experiment-Config>`_
* `Algorithm Config <#Algorithm-Config>`_
* `Training Service Config <#Training-Service-Config>`_
* `Local Config <#Local-Config>`_
* `Remote Config <#Remote-Config>`_
* `Openpai Config <#Openpai-Config>`_
* `AML Config <#AML-Config>`_
* `Shared Storage Config <#Shared-Storage-Config>`_
Experiment
^^^^^^^^^^
.. autoclass:: nni.experiment.Experiment
:members:
Experiment Config
^^^^^^^^^^^^^^^^^
.. autoattribute:: nni.experiment.config.ExperimentConfig.experiment_name
.. autoattribute:: nni.experiment.config.ExperimentConfig.search_space_file
.. autoattribute:: nni.experiment.config.ExperimentConfig.search_space
.. autoattribute:: nni.experiment.config.ExperimentConfig.trial_command
.. autoattribute:: nni.experiment.config.ExperimentConfig.trial_code_directory
.. autoattribute:: nni.experiment.config.ExperimentConfig.trial_concurrency
.. autoattribute:: nni.experiment.config.ExperimentConfig.trial_gpu_number
.. autoattribute:: nni.experiment.config.ExperimentConfig.max_experiment_duration
.. autoattribute:: nni.experiment.config.ExperimentConfig.max_trial_number
.. autoattribute:: nni.experiment.config.ExperimentConfig.nni_manager_ip
.. autoattribute:: nni.experiment.config.ExperimentConfig.use_annotation
.. autoattribute:: nni.experiment.config.ExperimentConfig.debug
.. autoattribute:: nni.experiment.config.ExperimentConfig.log_level
.. autoattribute:: nni.experiment.config.ExperimentConfig.experiment_working_directory
.. autoattribute:: nni.experiment.config.ExperimentConfig.tuner_gpu_indices
.. autoattribute:: nni.experiment.config.ExperimentConfig.tuner
.. autoattribute:: nni.experiment.config.ExperimentConfig.assessor
.. autoattribute:: nni.experiment.config.ExperimentConfig.advisor
.. autoattribute:: nni.experiment.config.ExperimentConfig.training_service
.. autoattribute:: nni.experiment.config.ExperimentConfig.shared_storage
Algorithm Config
^^^^^^^^^^^^^^^^
.. autoattribute:: nni.experiment.config.AlgorithmConfig.name
.. autoattribute:: nni.experiment.config.AlgorithmConfig.class_args
.. autoattribute:: nni.experiment.config.CustomAlgorithmConfig.class_name
.. autoattribute:: nni.experiment.config.CustomAlgorithmConfig.code_directory
.. autoattribute:: nni.experiment.config.CustomAlgorithmConfig.class_args
Training Service Config
^^^^^^^^^^^^^^^^^^^^^^^
Local Config
************
.. autoattribute:: nni.experiment.config.LocalConfig.platform
.. autoattribute:: nni.experiment.config.LocalConfig.use_active_gpu
.. autoattribute:: nni.experiment.config.LocalConfig.max_trial_number_per_gpu
.. autoattribute:: nni.experiment.config.LocalConfig.gpu_indices
Remote Config
*************
.. autoattribute:: nni.experiment.config.RemoteConfig.platform
.. autoattribute:: nni.experiment.config.RemoteConfig.reuse_mode
.. autoattribute:: nni.experiment.config.RemoteConfig.machine_list
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.host
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.port
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.user
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.password
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.ssh_key_file
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.ssh_passphrase
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.use_active_gpu
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.max_trial_number_per_gpu
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.gpu_indices
.. autoattribute:: nni.experiment.config.RemoteMachineConfig.python_path
Openpai Config
**************
.. autoattribute:: nni.experiment.config.OpenpaiConfig.platform
.. autoattribute:: nni.experiment.config.OpenpaiConfig.host
.. autoattribute:: nni.experiment.config.OpenpaiConfig.username
.. autoattribute:: nni.experiment.config.OpenpaiConfig.token
.. autoattribute:: nni.experiment.config.OpenpaiConfig.trial_cpu_number
.. autoattribute:: nni.experiment.config.OpenpaiConfig.trial_memory_size
.. autoattribute:: nni.experiment.config.OpenpaiConfig.storage_config_name
.. autoattribute:: nni.experiment.config.OpenpaiConfig.docker_image
.. autoattribute:: nni.experiment.config.OpenpaiConfig.local_storage_mount_point
.. autoattribute:: nni.experiment.config.OpenpaiConfig.container_storage_mount_point
.. autoattribute:: nni.experiment.config.OpenpaiConfig.reuse_mode
.. autoattribute:: nni.experiment.config.OpenpaiConfig.openpai_config
.. autoattribute:: nni.experiment.config.OpenpaiConfig.openpai_config_file
AML Config
**********
.. autoattribute:: nni.experiment.config.AmlConfig.platform
.. autoattribute:: nni.experiment.config.AmlConfig.subscription_id
.. autoattribute:: nni.experiment.config.AmlConfig.resource_group
.. autoattribute:: nni.experiment.config.AmlConfig.workspace_name
.. autoattribute:: nni.experiment.config.AmlConfig.compute_target
.. autoattribute:: nni.experiment.config.AmlConfig.docker_image
.. autoattribute:: nni.experiment.config.AmlConfig.max_trial_number_per_gpu
Shared Storage Config
^^^^^^^^^^^^^^^^^^^^^
Nfs Config
**********
.. autoattribute:: nni.experiment.config.NfsConfig.storage_type
.. autoattribute:: nni.experiment.config.NfsConfig.nfs_server
.. autoattribute:: nni.experiment.config.NfsConfig.exported_directory
Azure Blob Config
*****************
.. autoattribute:: nni.experiment.config.AzureBlobConfig.storage_type
.. autoattribute:: nni.experiment.config.AzureBlobConfig.storage_account_name
.. autoattribute:: nni.experiment.config.AzureBlobConfig.storage_account_key
.. autoattribute:: nni.experiment.config.AzureBlobConfig.container_name
**How to Use Docker in NNI**
================================
Overview
--------
`Docker <https://www.docker.com/>`__ is a tool that makes it easier for users to deploy and run applications by starting containers. Docker is not a virtual machine: it does not create a virtual operating system, but lets different applications share the same OS kernel while isolating them from each other in containers.
Users can start NNI experiments using Docker. NNI also provides an official Docker image `msranni/nni <https://hub.docker.com/r/msranni/nni>`__ on Docker Hub.
Using Docker in local machine
-----------------------------
Step 1: Installation of Docker
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Before you start using Docker for NNI experiments, you should install Docker on your local machine. `See here <https://docs.docker.com/install/linux/docker-ce/ubuntu/>`__.
Step 2: Start a Docker container
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you have installed Docker on your local machine, you can start a Docker container instance to run NNI examples. Note that NNI starts a web UI process inside the container that listens on a port, so you need to specify a port mapping between your host machine and the container to make the web UI accessible from outside. By visiting the host's IP address and port, you are redirected to the web UI process running inside the container.
For example, you could start a new Docker container with the following command:
.. code-block:: bash
docker run -i -t -p [hostPort]:[containerPort] [image]
``-i:`` Start the container in interactive mode.
``-t:`` Allocate a pseudo-terminal for the container.
``-p:`` Port mapping; map a host port to a container port.
For more information about Docker commands, please `refer to this <https://docs.docker.com/engine/reference/run/>`__.
Note:
.. code-block:: bash
NNI only supports Ubuntu and macOS systems in local mode for the moment, so please use the correct Docker image. If you want to use a GPU in a Docker container, please use nvidia-docker.
Step 3: Run NNI in a Docker container
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you start a Docker container using NNI's official image ``msranni/nni``\ , you can directly start NNI experiments with the ``nnictl`` command. Our official image has NNI's running environment and basic Python and deep learning frameworks preinstalled.
If you start your own Docker image, you may need to install the NNI package first; please refer to `NNI installation <InstallationLinux.rst>`__.
If you want to run NNI's official examples, you may need to clone the NNI repo in GitHub using
.. code-block:: bash
git clone https://github.com/Microsoft/nni.git
then you can enter ``nni/examples/trials`` to start an experiment.
After you prepare NNI's environment, you can start a new experiment using the ``nnictl`` command. `See here <QuickStart.rst>`__.
Using Docker on a remote platform
---------------------------------
NNI supports starting experiments in `remoteTrainingService <../TrainingService/RemoteMachineMode.rst>`__\ , and running trial jobs on remote machines. As Docker can start an independent Ubuntu system as an SSH server, a Docker container can be used as the remote machine in NNI's remote mode.
Step 1: Setting a Docker environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You should install the Docker software on your remote machine first, please `refer to this <https://docs.docker.com/install/linux/docker-ce/ubuntu/>`__.
To make sure your Docker container can be connected by NNI experiments, you should build your own Docker image to set an SSH server or use images with an SSH configuration. If you want to use a Docker container as an SSH server, you should configure the SSH password login or private key login; please `refer to this <https://docs.docker.com/engine/examples/running_ssh_service/>`__.
Note:
.. code-block:: text
NNI's official image msranni/nni does not support SSH servers for the time being; you should build your own Docker image with an SSH configuration or use other images as a remote server.
Step 2: Start a Docker container on a remote machine
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
An SSH server needs a port, and you need to expose the container's SSH port to NNI as the connection port. For example, if you set your container's SSH port to ``A``, you should map the container's port ``A`` to another port ``B`` on your remote host machine. NNI will connect to port ``B`` as the SSH port, and your host machine will forward the connection from port ``B`` to port ``A``, so NNI can connect to your Docker container.
For example, you could start your Docker container using the following commands:
.. code-block:: bash
docker run -dit -p [hostPort]:[containerPort] [image]
The ``containerPort`` is the SSH port used in your Docker container, and the ``hostPort`` is the port on your host machine exposed to NNI. You can set your NNI config file to connect to ``hostPort``, and the connection will be forwarded to your Docker container.
For more information about Docker commands, please `refer to this <https://docs.docker.com/v17.09/edge/engine/reference/run/>`__.
Note:
.. code-block:: bash
If you use your own Docker image as a remote server, please make sure that this image has a basic python environment and an NNI SDK runtime environment. If you want to use a GPU in a Docker container, please use nvidia-docker.
Step 3: Run NNI experiments
^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can set your config file as a remote platform and set the ``machineList`` configuration to connect to your Docker SSH server; `refer to this <../TrainingService/RemoteMachineMode.rst>`__. Note that you should set the correct ``port``\ , ``username``\ , and ``passWd`` or ``sshKeyPath`` of your host machine.
``port:`` The host machine's port, mapping to Docker's SSH port.
``username:`` The username of the Docker container.
``passWd:`` The password of the Docker container.
``sshKeyPath:`` The path of the private key of the Docker container.
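Putting the fields together, a ``machineList`` fragment for a Docker container whose SSH port is mapped to host port 2222 might look like this (the address and credentials are placeholders):

```yaml
machineList:
  - ip: 192.168.0.10    # remote host running the Docker container
    port: 2222          # host port B mapped to the container's SSH port A
    username: root
    passWd: your_password
```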
After the configuration of the config file, you could start an experiment, `refer to this <QuickStart.rst>`__.