* Open the `Web UI url` in your browser, you can view detail information of the experiment and all the submitted trial jobs as shown below. [Here](docs/WebUI.md) are more Web UI pages.
* Open the `Web UI url` in your browser, you can view detail information of the experiment and all the submitted trial jobs as shown below. [Here](docs/en_US/WebUI.md) are more Web UI pages.
@@ -169,27 +169,27 @@ You can use these commands to get more information about the experiment
</table>
</table>
## **Documentation**
* [NNI overview](docs/Overview.md)
* [NNI overview](docs/en_US/Overview.md)
* [Quick start](docs/QuickStart.md)
* [Quick start](docs/en_US/QuickStart.md)
## **How to**
* [Install NNI](docs/Installation.md)
* [Install NNI](docs/en_US/Installation.md)
* [Use command line tool nnictl](docs/NNICTLDOC.md)
* [Use command line tool nnictl](docs/en_US/NNICTLDOC.md)
* [Use NNIBoard](docs/WebUI.md)
* [Use NNIBoard](docs/en_US/WebUI.md)
* [How to define search space](docs/SearchSpaceSpec.md)
* [How to define search space](docs/en_US/SearchSpaceSpec.md)
* [How to define a trial](docs/Trials.md)
* [How to define a trial](docs/en_US/Trials.md)
* [How to choose tuner/search-algorithm](docs/Builtin_Tuner.md)
* [How to choose tuner/search-algorithm](docs/en_US/Builtin_Tuner.md)
* [Config an experiment](docs/ExperimentConfig.md)
* [Config an experiment](docs/en_US/ExperimentConfig.md)
* [How to use annotation](docs/Trials.md#nni-python-annotation)
* [How to use annotation](docs/en_US/Trials.md#nni-python-annotation)
## **Tutorials**
* [Run an experiment on local (with multiple GPUs)?](docs/tutorial_1_CR_exp_local_api.md)
* [Run an experiment on local (with multiple GPUs)?](docs/en_US/tutorial_1_CR_exp_local_api.md)
* [Run an experiment on multiple machines?](docs/RemoteMachineMode.md)
* [Run an experiment on multiple machines?](docs/en_US/RemoteMachineMode.md)
* [Run an experiment on OpenPAI?](docs/PAIMode.md)
* [Run an experiment on OpenPAI?](docs/en_US/PAIMode.md)
* [Run an experiment on Kubeflow?](docs/KubeflowMode.md)
* [Run an experiment on Kubeflow?](docs/en_US/KubeflowMode.md)
* [Try different tuners](docs/tuners.rst)
* [Try different tuners](docs/en_US/tuners.rst)
* [Try different assessors](docs/assessors.rst)
* [Try different assessors](docs/en_US/assessors.rst)
* [Implement a customized tuner](docs/Customize_Tuner.md)
* [Implement a customized tuner](docs/en_US/Customize_Tuner.md)
* [Implement a customized assessor](examples/assessors/README.md)
* [Implement a customized assessor](docs/en_US/Customize_Assessor.md)
* [Use Genetic Algorithm to find good model architectures for Reading Comprehension task](examples/trials/ga_squad/README.md)
## **Contribute**
...
@@ -197,11 +197,11 @@ This project welcomes contributions and suggestions, we use [GitHub issues](http
Issues with the **good first issue** label are simple and easy-to-start ones that we recommend new contributors to start with.
To set up environment for NNI development, refer to the instruction: [Set up NNI developer environment](docs/SetupNNIDeveloperEnvironment.md)
To set up environment for NNI development, refer to the instruction: [Set up NNI developer environment](docs/en_US/SetupNNIDeveloperEnvironment.md)
Before start coding, review and get familiar with the NNI Code Contribution Guideline: [Contributing](docs/CONTRIBUTING.md)
Before start coding, review and get familiar with the NNI Code Contribution Guideline: [Contributing](docs/en_US/CONTRIBUTING.md)
We are in construction of the instruction for [How to Debug](docs/HowToDebug.md), you are also welcome to contribute questions or suggestions on this area.
We are in construction of the instruction for [How to Debug](docs/en_US/HowToDebug.md), you are also welcome to contribute questions or suggestions on this area.
## **License**
The entire codebase is under [MIT license](LICENSE)
@@ -38,4 +38,14 @@ Head over to [issues](https://github.com/Microsoft/nni/issues) to find issues wh
A person looking to contribute can take up an issue by claiming it in a comment or assigning their GitHub ID to it. If there is no PR or update in progress on the issue for a week, the issue reopens for anyone to take up again. High-priority issues and regressions need a response time of about a day.
## Code Styles & Naming Conventions
We follow [PEP8](https://www.python.org/dev/peps/pep-0008/) for Python code and naming conventions, do try to adhere to the same when making a pull request or making a change. One can also take the help of linters such as `flake8` or `pylint`
* We follow [PEP8](https://www.python.org/dev/peps/pep-0008/) for Python code and naming conventions, do try to adhere to the same when making a pull request or making a change. One can also take the help of linters such as `flake8` or `pylint`
* We also follow [NumPy Docstring Style](https://www.sphinx-doc.org/en/master/usage/extensions/example_numpy.html#example-numpy) for Python Docstring Conventions. During the [documentation building](CONTRIBUTING.md#documentation), we use [sphinx.ext.napoleon](https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html) to generate Python API documentation from Docstring.
## Documentation
Our documentation is built with [sphinx](http://sphinx-doc.org/), supporting the [Markdown](https://guides.github.com/features/mastering-markdown/) and [reStructuredText](http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html) formats. All our documentation is placed under [docs/en_US](https://github.com/Microsoft/nni/tree/master/docs).
* Before submitting a documentation change, please __build the homepage locally__: `cd docs/en_US && make html`. You can then see all the built documentation webpages under the folder `docs/en_US/_build/html`. We also highly recommend addressing __every WARNING__ during the build, since a warning very likely signals a __dead link__ or another annoying issue.
* For links, please consider using __relative paths__ first. However, if the documentation is written in Markdown format, and:
    * It's an image link which needs to be formatted with embedded HTML syntax, please use a global URL like `https://user-images.githubusercontent.com/44491713/51381727-e3d0f780-1b4f-11e9-96ab-d26b9198ba65.png`, which can be generated automatically by dragging the picture onto a [GitHub Issue](https://github.com/Microsoft/nni/issues/new) box.
    * It cannot be re-formatted by sphinx, such as source code, please use its global URL. For source code that links to our GitHub repo, please use URLs rooted at `https://github.com/Microsoft/nni/tree/master/` ([mnist.py](https://github.com/Microsoft/nni/blob/master/examples/trials/mnist/mnist.py) for example).
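
As a rough aid for the link rules above, here is a minimal Python sketch that lists the relative links in a Markdown file whose targets are missing on disk. It is not part of NNI or sphinx: the names `extract_targets`, `is_relative`, and `find_dead_links` are hypothetical, and the regular expression only handles simple inline links (no reference-style links or nested brackets).

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target) or ![alt](target).
# Assumption: simple inline syntax only; reference-style links are ignored.
LINK_PATTERN = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)[^)]*\)")

# Matches a URL scheme prefix such as "https:" or "mailto:".
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def extract_targets(markdown_text):
    """Return every inline link target found in the given Markdown text."""
    return LINK_PATTERN.findall(markdown_text)


def is_relative(target):
    """True for targets that are not absolute URLs and not pure in-page anchors."""
    if SCHEME_PATTERN.match(target) or target.startswith("//"):
        return False
    return not target.startswith("#")


def find_dead_links(markdown_file):
    """List relative link targets in markdown_file that do not exist on disk."""
    markdown_file = Path(markdown_file)
    dead = []
    for target in extract_targets(markdown_file.read_text(encoding="utf-8")):
        if not is_relative(target):
            continue
        path_part = target.split("#", 1)[0]  # drop any "#section" anchor
        if path_part and not (markdown_file.parent / path_part).exists():
            dead.append(target)
    return dead
```

Calling `find_dead_links("README.md")` from the repository root would return the relative targets that no longer resolve, which is one way to catch links missed when files move into `docs/en_US`. The warnings from the sphinx build remain the authoritative check.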
@@ -23,7 +23,7 @@ Now NNI supports running experiment on [Kubeflow](https://github.com/kubeflow/ku
5. To access Azure storage service, NNI needs the access key of the storage account, and NNI uses [Azure Key Vault](https://azure.microsoft.com/en-us/services/key-vault/) Service to protect your private key. Set up Azure Key Vault Service and add a secret to Key Vault to store the access key of the Azure storage account. Follow this [guideline](https://docs.microsoft.com/en-us/azure/key-vault/quick-create-cli) to store the access key.
## Design


Kubeflow training service instantiates a Kubernetes REST client to interact with your K8s cluster's API server.
For each trial, we upload all the files in your local codeDir path (configured in nni_config.yml), together with NNI-generated files like parameter.cfg, into a storage volume. Right now we support two kinds of storage volumes: [nfs](https://en.wikipedia.org/wiki/Network_File_System) and [azure file storage](https://azure.microsoft.com/en-us/services/storage/files/); you should configure the storage volume in the NNI config YAML file. After the files are prepared, Kubeflow training service calls the K8s REST API to create Kubeflow jobs ([tf-operator](https://github.com/kubeflow/tf-operator) job or [pytorch-operator](https://github.com/kubeflow/pytorch-operator) job) in K8s and mounts your storage volume into the job's pod. Output files of the Kubeflow job, like stdout, stderr, trial.log or model files, are also copied back to the storage volume. NNI shows the storage volume's URL for each trial in the WebUI, so that users can browse the log files and the job's output files.
...
@@ -179,7 +179,7 @@ Trial configuration in kubeflow mode have the following configuration keys:
* gpuNum
* image
* Required key. In kubeflow mode, your trial program will be scheduled by Kubernetes to run in a [Pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/). This key specifies the Docker image used to create the pod where your trial program will run.
* We already build a docker image [msranni/nni](https://hub.docker.com/r/msranni/nni/) on [Docker Hub](https://hub.docker.com/). It contains NNI python packages, Node modules and javascript artifact files required to start experiment, and all of NNI dependencies. The docker file used to build this image can be found at [here](https://github.com/Microsoft/nni/tree/master/deployment/Dockerfile.build.base). You can either use this image directly in your config file, or build your own image based on it.
* We already build a docker image [msranni/nni](https://hub.docker.com/r/msranni/nni/) on [Docker Hub](https://hub.docker.com/). It contains NNI python packages, Node modules and javascript artifact files required to start experiment, and all of NNI dependencies. The docker file used to build this image can be found at [here](https://github.com/Microsoft/nni/tree/master/deployment/docker/Dockerfile). You can either use this image directly in your config file, or build your own image based on it.
* apiVersion
* Required key. The API version of your kubeflow.
* ps (optional). This config section is used to configure tensorflow parameter server role.
...
@@ -196,4 +196,4 @@ Notice: In kubeflow mode, NNIManager will start a rest server and listen on a po
Once a trial job is completed, you can go to NNI WebUI's overview page (like http://localhost:8080/oview) to check the trial's information.
Any problems when using NNI in kubeflow mode, plesae create issues on [NNI Github repo](https://github.com/Microsoft/nni).
Any problems when using NNI in kubeflow mode, please create issues on [NNI Github repo](https://github.com/Microsoft/nni).