"symphony/git@developer.sourcefind.cn:cnjsdfcy/simbricks.git" did not exist on "81fe3a288ddfdc7a8075b9342d376cd64f1233f7"
Commit bc9eab33 authored by Yan Ni's avatar Yan Ni Committed by QuanluZhang
Browse files

doc refactor to master (#648)

* initial commit for document refactor (#533)

* enable sphinx

* fix docs/readme link

* update conf

* use mkdocs to build homepage

* Update mkdocs.yml

use docs/README as main

* delete sphinx file

* add nav to mkdocs config

* fix mkdocs bug

* add requirement for online build

* clean requirements.txt

* add contribution

* change requirements file location

* delete sphinx from gitignore

* mnist examples doc (#566)

* mnist examples doc

* update

* update

* update

* update

* update

* Docs refactor of Tutorial (QuickStart, Tuners, Assessors) (#554)

* Refactor of QuickStart

* fix some typo

* Make new changes based on suggestions

* update successful INFO

* update Tuners

* update Tuners:overview

* Update Tuners.md

* Add Assessor.md

* update

* update

* mkdocs.yml

* Update QuickStart.md

* update

* update

* update

* update

* update

* add diff

* modified QuickStart.md and add mnist without nni example

* update

* small change

* update

* update

* update

* update

* update and refactor the mnist.py

* update

* update working process

* refator the mnist.py

* update

* update

* update mkdocs.md

* add metis tuner

* update QuickStart webUI part

* update capture

* update capture

* update description

* update picture for assessor

* update

* modified mkdocs.yml

* update Tuners.md

* test format

* update Tuner.md

* update Tuner.md

* change display format

* fix typo

* update

* fix typo

* fix typo and rename customize_Advisor

* fix typo and modified

* Dev doc: Add docs for Trials, SearchSpace, Annotation and GridSearch (#569)

* add Trials.md

* add Trials.md

* add Trials.md

* add Trials.md

* add docs

* add docs

* add docs

* add docs

* add docs

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* update triaL.MD

* update triaL.MD

* update triaL.MD

* update triaL.MD

* update triaL.MD

* add grid search tuner doc

* add grid search tuner doc

* Chec dev doc (#606)

* multiPhase doc

* updates

* updates

* updates

* updates

* updates

* updates

* update dev-doc to sphinx (#630)

* add trigger (#544)

* NNI logging architecture improvement  (#539)

* Removed unused log code, refactor to rename some class name in nni sdk and trial_tools

* Fix the regression bug that loca/remote mode doesnt work

* [WebUI] Fix issue#517 & issue#459 (#524)

* [WebUI] Fix issue#517 & issue#459

* update

* [Logging architecture refactor] Remove unused metrics related code in nni trial_tools, support kubeflow mode for logging architecture refactor (#551)

* Remove unused metrics related code in nni trial_tools, support kubeflow mode for logging architecture refactor

* Doc typo and format fixes (#560)

* fix incorrect document

* fix doc format and typo

* fix state transition (#504)

* Add enas nni version from contributor (#557)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* update readme

* update

* fix path

* update reference

* fix bug in config file

* update nni_arch_overview.png

* update

* update

* update

* add enas_nni

* Code coverage report (#559)

* Add UT code coverage report

* updates

* updates

* updates

* updates

* updates

* updates

* integration test python code coverage report

* Updating Readme to add the Related Projects like PAI, KubeLauncher and MMdnn (#565)

* Adding related projects to Readme

* Fix remote TrainingService bug, change forEach to "for of" (#564)

trial job could not be stopped in remote machine when experiment is stopped, because awit/async does not work normally in forEach, refer https://codeburst.io/javascript-async-await-with-foreach-b6ba62bbf404

.

* To install the whole nni in an virtual environment (#538)

* support venv

* adapt venv

* adapt venv

* adapt venv

* adapt venv

* new test

* new test

* new test

* support venv

* support venv

* support venv

* support venv

* support venv

* support venv

* support venv

* colorful output for mac

* colorful output for mac

* permission denied in /tmp

* permission denied in /tmp

* permission denied in /tmp

* remove unused variable

* final

* remove build python

* Make it feasible for annotation whether to add an extra line "nni.get_next_parameter()" or not (#526)

* fix bug

* add docs

* add ut

* add ut

* add to ci

* update doc

* update doc

* update ut

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* test

* test

* test

* test

* test

* test

* test

* test

* test

* test

* revert

* refactor

* refactor

* s

* merge

* fix annotation for extra line

* add deprecation warning

* fix permision deny (#567)

* Add Metis Tuner (#534)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* update readme

* update

* fix path

* update reference

* fix bug in config file

* update nni_arch_overview.png

* update

* update

* update

* add metis tuner code

* 1. fix bug about import 2.update other sdk file

* add auto-gbdt-example and remove unused code

* add metis_tuner into README

* update the README

* update README | remove unused variable

* fix typo

* add sklearn into requirments

* Update src/sdk/pynni/nni/metis_tuner/metis_tuner.py

add default value in __init__
Co-Authored-By: default avatarxuehui1991 <xuehui@microsoft.com>

* Update docs/HowToChooseTuner.md
Co-Authored-By: default avatarxuehui1991 <xuehui@microsoft.com>

* Update docs/HowToChooseTuner.md
Co-Authored-By: default avatarxuehui1991 <xuehui@microsoft.com>

* fix typo | add more comments

* Change WARNING to INFO (#574)

change the warning level to info level when expand relative path
add nnictl --version log
update readme.md

* Fix some bugs in doc and log (#561)

* fix some bugs in doc and log

* The learning rate focus more on validation sets accuracy than training sets accuracy.

*  Fix a race condidtion issue in trial_keeper for reading log from pipe (#578)

* Fix a race condidtion issue in trial_keeper for reading log from pipe

* [WebUI] Fix issue#458 about final result as dict (#563)

* [WebUI] Fix issue#458 about final result as dict

* Fix comments

* fix bug

* support frameworkcontroller log (#572)

support frameworkcontroller log

* Dev weight sharing (#568) (#576)

* Dev weight sharing (#568)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* simple weight sharing

* update gitignore file

* change tuner codedir to relative path

* add python cache files to gitignore list

* move extract scalar reward logic from dispatcher to tuner

* update tuner code corresponding to last commit

* update doc for receive_trial_result api change

* add numpy to package whitelist of pylint

* distinguish param value from return reward for tuner.extract_scalar_reward

* update pylintrc

* add comments to dispatcher.handle_report_metric_data

* update install for mac support

* fix root mode bug on Makefile

* Quick fix bug: nnictl port value error (#245)

* fix port bug

* Dev exp stop more (#221)

* Exp stop refactor (#161)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* fix setup.py (#115)

* Add DAG model configuration format for SQuAD example.

* Explain config format for SQuAD QA model.

* Add more detailed introduction about the evolution algorithm.

* Fix install.sh add add trial log path (#109)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* show trial log path

* update document

* fix install.sh

* set default vallue for maxTrialNum and maxExecDuration

* fix nnictl

* Dev smac (#116)

* support package install (#91)

* fix nnictl bug

* support package install

* update

* update package install logic

* Fix package install issue (#95)

* fix nnictl bug

* fix pakcage install

* support SMAC as a tuner on nni (#81)

* update doc

* update doc

* update doc

* update hyperopt installation

* update doc

* update doc

* update description in setup.py

* update setup.py

* modify encoding

* encoding

* add encoding

* remove pymc3

* update doc

* update builtin tuner spec

* support smac in sdk, fix logging issue

* support smac tuner

* add optimize_mode

* update config in nnictl

* add __init__.py

* update smac

* update import path

* update setup.py: remove entry_point

* update rest server validation

* fix bug in nnictl launcher

* support classArgs: optimize_mode

* quick fix bug

* test travis

* add dependency

* add dependency

* add dependency

* add dependency

* create smac python package

* fix trivial points

* optimize import of tuners, modify nnictl accordingly

* fix bug: incorrect algorithm_name

* trivial refactor

* for debug

* support virtual

* update doc of SMAC

* update smac requirements

* update requirements

* change debug mode

* update doc

* update doc

* refactor based on comments

* fix comments

* modify example config path to relative path and increase maxTrialNum (#94)

* modify example config path to relative path and increase maxTrialNum

* add document

* support conda (#90) (#110)

* support install from venv and travis CI

* support install from venv and travis CI

* support install from venv and travis CI

* support conda

* support conda

* modify example config path to relative path and increase maxTrialNum

* undo messy commit

* undo messy commit

* Support pip install as root (#77)

* Typo on #58 (#122)

* PAI Training Service implementation (#128)

* PAI Training service implementation
**1. Implement PAITrainingService
**2. Add trial-keeper python module, and modify setup.py to install the module
**3. Add PAItrainingService rest server to collect metrics from PAI container.

* fix datastore for multiple final result (#129)

* Update NNI v0.2 release notes (#132)

Update NNI v0.2 release notes

* Update setup.py Makefile and documents (#130)

* update makefile and setup.py

* update makefile and setup.py

* update document

* update document

* Update Makefile no travis

*  update doc

*  update doc

* fix convert from ss to pcs (#133)

* Fix bugs about webui (#131)

* Fix webui bugs

* Fix tslint

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* Merge branch V0.2 to Master (#143)

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* fix bug (#147)

* Refactor nnictl and add config_pai.yml (#144)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* add config_pai.yml

* refactor nnictl create logic and add colorful print

* fix nnictl stop logic

* add annotation for config_pai.yml

* add document for start experiment

* fix config.yml

* fix document

* Fix trial keeper wrongly exit issue (#152)

* Fix trial keeper bug, use actual exitcode to exit rather than 1

* Fix bug of table sort (#145)

* Update doc for PAIMode and v0.2 release notes (#153)

* Update v0.2 documentation regards to release note and PAI training service

* Update document to describe NNI docker image

* fix antd (#159)

* refactor experiment stopping logic

* support change concurrency

* remove trialJobs.ts

* trivial changes

* fix bugs

* fix bug

* support updating maxTrialNum

* Modify IT scripts for supporting multiple experiments

* Update ci (#175)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* modify CI cuz of refracting exp stop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* file saving

* fix issues from code merge

* remove $(INSTALL_PREFIX)/nni/nni_manager before install

* fix indent

* fix merge issue

* socket close

* update port

* fix merge error

* modify ci logic in nnimanager

* fix ci

* fix bug

* change suspended to done

* update ci (#229)

* update ci

* update ci

* update ci (#232)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update ci (#233)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update azure-pipelines

* run.py (#238)

* Nnupdate ci (#239)

* run.py

* test ci

* Nnupdate ci (#240)

* run.py

* test ci

* test ci

* Udci (#241)

* run.py

* test ci

* test ci

* test ci

* update ci (#242)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh (#244)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh

* add comments

* remove assert

* trivial change

* trivial change

* update Makefile (#246)

* update Makefile

* update Makefile

* quick fix for ci (#248)

* add update trialNum and fix bugs (#261)

* Add builtin tuner to CI (#247)

* update Makefile

* update Makefile

* add builtin-tuner test

* add builtin-tuner test

* refractor ci

* update azure.yml

* add built-in tuner test

* fix bugs

* Doc refactor (#258)

* doc refactor

* image name refactor

* Refactor nnictl to support listing stopped experiments. (#256)

Refactor nnictl to support listing stopped experiments.

* Show experiment parameters more beautifully (#262)

* fix error on example of RemoteMachineMode (#269)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* Update docker file to use latest nni release (#263)

* fix bug about execDuration and endTime (#270)

* fix bug about execDuration and endTime

* modify time interval to 30 seconds

* refactor based on Gems's suggestion

* for triggering ci

* Refactor dockerfile (#264)

* refactor Dockerfile

* Support nnictl tensorboard (#268)

support tensorboard

* Sdk update (#272)

* Rename get_parameters to get_next_parameter

* annotations add get_next_parameter

* updates

* updates

* updates

* updates

* updates

* add experiment log path to experiment profile (#276)

* refactor extract reward from dict by tuner

* update Makefile for mac support, wait for aka.ms support

* refix Makefile for colorful echo

* unversion config.yml with machine information

* sync graph.py between tuners & trial of ga_squad

* sync graph.py between tuners & trial of ga_squad

* copy weight shared ga_squad under weight_sharing folder

* mv ga_squad code back to master

* simple tuner & trial ready

* Fix nnictl multiThread option

* weight sharing with async dispatcher simple example ready

* update for ga_squad

* fix bug

* modify multihead attention name

* add min_layer_num to Graph

* fix bug

* update share id calc

* fix bug

* add save logging

* fix ga_squad tuner bug

* sync bug fix for ga_squad tuner

* fix same hash_id bug

* add lock to simple tuner in weight sharing

* Add readme to simple weight sharing

* update

* update

* add paper link

* update

* reformat with autopep8

* add documentation for weight sharing

* test for weight sharing

* delete irrelevant files

* move details of weight sharing in to code comments

* Dev weight sharing update doc (#577)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* simple weight sharing

* update gitignore file

* change tuner codedir to relative path

* add python cache files to gitignore list

* move extract scalar reward logic from dispatcher to tuner

* update tuner code corresponding to last commit

* update doc for receive_trial_result api change

* add numpy to package whitelist of pylint

* distinguish param value from return reward for tuner.extract_scalar_reward

* update pylintrc

* add comments to dispatcher.handle_report_metric_data

* update install for mac support

* fix root mode bug on Makefile

* Quick fix bug: nnictl port value error (#245)

* fix port bug

* Dev exp stop more (#221)

* Exp stop refactor (#161)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* fix setup.py (#115)

* Add DAG model configuration format for SQuAD example.

* Explain config format for SQuAD QA model.

* Add more detailed introduction about the evolution algorithm.

* Fix install.sh add add trial log path (#109)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* show trial log path

* update document

* fix install.sh

* set default vallue for maxTrialNum and maxExecDuration

* fix nnictl

* Dev smac (#116)

* support package install (#91)

* fix nnictl bug

* support package install

* update

* update package install logic

* Fix package install issue (#95)

* fix nnictl bug

* fix pakcage install

* support SMAC as a tuner on nni (#81)

* update doc

* update doc

* update doc

* update hyperopt installation

* update doc

* update doc

* update description in setup.py

* update setup.py

* modify encoding

* encoding

* add encoding

* remove pymc3

* update doc

* update builtin tuner spec

* support smac in sdk, fix logging issue

* support smac tuner

* add optimize_mode

* update config in nnictl

* add __init__.py

* update smac

* update import path

* update setup.py: remove entry_point

* update rest server validation

* fix bug in nnictl launcher

* support classArgs: optimize_mode

* quick fix bug

* test travis

* add dependency

* add dependency

* add dependency

* add dependency

* create smac python package

* fix trivial points

* optimize import of tuners, modify nnictl accordingly

* fix bug: incorrect algorithm_name

* trivial refactor

* for debug

* support virtual

* update doc of SMAC

* update smac requirements

* update requirements

* change debug mode

* update doc

* update doc

* refactor based on comments

* fix comments

* modify example config path to relative path and increase maxTrialNum (#94)

* modify example config path to relative path and increase maxTrialNum

* add document

* support conda (#90) (#110)

* support install from venv and travis CI

* support install from venv and travis CI

* support install from venv and travis CI

* support conda

* support conda

* modify example config path to relative path and increase maxTrialNum

* undo messy commit

* undo messy commit

* Support pip install as root (#77)

* Typo on #58 (#122)

* PAI Training Service implementation (#128)

* PAI Training service implementation
**1. Implement PAITrainingService
**2. Add trial-keeper python module, and modify setup.py to install the module
**3. Add PAItrainingService rest server to collect metrics from PAI container.

* fix datastore for multiple final result (#129)

* Update NNI v0.2 release notes (#132)

Update NNI v0.2 release notes

* Update setup.py Makefile and documents (#130)

* update makefile and setup.py

* update makefile and setup.py

* update document

* update document

* Update Makefile no travis

*  update doc

*  update doc

* fix convert from ss to pcs (#133)

* Fix bugs about webui (#131)

* Fix webui bugs

* Fix tslint

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* Merge branch V0.2 to Master (#143)

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* fix bug (#147)

* Refactor nnictl and add config_pai.yml (#144)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* add config_pai.yml

* refactor nnictl create logic and add colorful print

* fix nnictl stop logic

* add annotation for config_pai.yml

* add document for start experiment

* fix config.yml

* fix document

* Fix trial keeper wrongly exit issue (#152)

* Fix trial keeper bug, use actual exitcode to exit rather than 1

* Fix bug of table sort (#145)

* Update doc for PAIMode and v0.2 release notes (#153)

* Update v0.2 documentation regards to release note and PAI training service

* Update document to describe NNI docker image

* fix antd (#159)

* refactor experiment stopping logic

* support change concurrency

* remove trialJobs.ts

* trivial changes

* fix bugs

* fix bug

* support updating maxTrialNum

* Modify IT scripts for supporting multiple experiments

* Update ci (#175)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* modify CI cuz of refracting exp stop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* file saving

* fix issues from code merge

* remove $(INSTALL_PREFIX)/nni/nni_manager before install

* fix indent

* fix merge issue

* socket close

* update port

* fix merge error

* modify ci logic in nnimanager

* fix ci

* fix bug

* change suspended to done

* update ci (#229)

* update ci

* update ci

* update ci (#232)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update ci (#233)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update azure-pipelines

* run.py (#238)

* Nnupdate ci (#239)

* run.py

* test ci

* Nnupdate ci (#240)

* run.py

* test ci

* test ci

* Udci (#241)

* run.py

* test ci

* test ci

* test ci

* update ci (#242)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh (#244)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh

* add comments

* remove assert

* trivial change

* trivial change

* update Makefile (#246)

* update Makefile

* update Makefile

* quick fix for ci (#248)

* add update trialNum and fix bugs (#261)

* Add builtin tuner to CI (#247)

* update Makefile

* update Makefile

* add builtin-tuner test

* add builtin-tuner test

* refractor ci

* update azure.yml

* add built-in tuner test

* fix bugs

* Doc refactor (#258)

* doc refactor

* image name refactor

* Refactor nnictl to support listing stopped experiments. (#256)

Refactor nnictl to support listing stopped experiments.

* Show experiment parameters more beautifully (#262)

* fix error on example of RemoteMachineMode (#269)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* Update docker file to use latest nni release (#263)

* fix bug about execDuration and endTime (#270)

* fix bug about execDuration and endTime

* modify time interval to 30 seconds

* refactor based on Gems's suggestion

* for triggering ci

* Refactor dockerfile (#264)

* refactor Dockerfile

* Support nnictl tensorboard (#268)

support tensorboard

* Sdk update (#272)

* Rename get_parameters to get_next_parameter

* annotations add get_next_parameter

* updates

* updates

* updates

* updates

* updates

* add experiment log path to experiment profile (#276)

* refactor extract reward from dict by tuner

* update Makefile for mac support, wait for aka.ms support

* refix Makefile for colorful echo

* unversion config.yml with machine information

* sync graph.py between tuners & trial of ga_squad

* sync graph.py between tuners & trial of ga_squad

* copy weight shared ga_squad under weight_sharing folder

* mv ga_squad code back to master

* simple tuner & trial ready

* Fix nnictl multiThread option

* weight sharing with async dispatcher simple example ready

* update for ga_squad

* fix bug

* modify multihead attention name

* add min_layer_num to Graph

* fix bug

* update share id calc

* fix bug

* add save logging

* fix ga_squad tuner bug

* sync bug fix for ga_squad tuner

* fix same hash_id bug

* add lock to simple tuner in weight sharing

* Add readme to simple weight sharing

* update

* update

* add paper link

* update

* reformat with autopep8

* add documentation for weight sharing

* test for weight sharing

* delete irrelevant files

* move details of weight sharing in to code comments

* add example section

* Dev weight sharing update (#579)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* simple weight sharing

* update gitignore file

* change tuner codedir to relative path

* add python cache files to gitignore list

* move extract scalar reward logic from dispatcher to tuner

* update tuner code corresponding to last commit

* update doc for receive_trial_result api change

* add numpy to package whitelist of pylint

* distinguish param value from return reward for tuner.extract_scalar_reward

* update pylintrc

* add comments to dispatcher.handle_report_metric_data

* update install for mac support

* fix root mode bug on Makefile

* Quick fix bug: nnictl port value error (#245)

* fix port bug

* Dev exp stop more (#221)

* Exp stop refactor (#161)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* fix setup.py (#115)

* Add DAG model configuration format for SQuAD example.

* Explain config format for SQuAD QA model.

* Add more detailed introduction about the evolution algorithm.

* Fix install.sh add add trial log path (#109)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* show trial log path

* update document

* fix install.sh

* set default vallue for maxTrialNum and maxExecDuration

* fix nnictl

* Dev smac (#116)

* support package install (#91)

* fix nnictl bug

* support package install

* update

* update package install logic

* Fix package install issue (#95)

* fix nnictl bug

* fix pakcage install

* support SMAC as a tuner on nni (#81)

* update doc

* update doc

* update doc

* update hyperopt installation

* update doc

* update doc

* update description in setup.py

* update setup.py

* modify encoding

* encoding

* add encoding

* remove pymc3

* update doc

* update builtin tuner spec

* support smac in sdk, fix logging issue

* support smac tuner

* add optimize_mode

* update config in nnictl

* add __init__.py

* update smac

* update import path

* update setup.py: remove entry_point

* update rest server validation

* fix bug in nnictl launcher

* support classArgs: optimize_mode

* quick fix bug

* test travis

* add dependency

* add dependency

* add dependency

* add dependency

* create smac python package

* fix trivial points

* optimize import of tuners, modify nnictl accordingly

* fix bug: incorrect algorithm_name

* trivial refactor

* for debug

* support virtual

* update doc of SMAC

* update smac requirements

* update requirements

* change debug mode

* update doc

* update doc

* refactor based on comments

* fix comments

* modify example config path to relative path and increase maxTrialNum (#94)

* modify example config path to relative path and increase maxTrialNum

* add document

* support conda (#90) (#110)

* support install from venv and travis CI

* support install from venv and travis CI

* support install from venv and travis CI

* support conda

* support conda

* modify example config path to relative path and increase maxTrialNum

* undo messy commit

* undo messy commit

* Support pip install as root (#77)

* Typo on #58 (#122)

* PAI Training Service implementation (#128)

* PAI Training service implementation
**1. Implement PAITrainingService
**2. Add trial-keeper python module, and modify setup.py to install the module
**3. Add PAItrainingService rest server to collect metrics from PAI container.

* fix datastore for multiple final result (#129)

* Update NNI v0.2 release notes (#132)

Update NNI v0.2 release notes

* Update setup.py Makefile and documents (#130)

* update makefile and setup.py

* update makefile and setup.py

* update document

* update document

* Update Makefile no travis

*  update doc

*  update doc

* fix convert from ss to pcs (#133)

* Fix bugs about webui (#131)

* Fix webui bugs

* Fix tslint

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* Merge branch V0.2 to Master (#143)

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* fix bug (#147)

* Refactor nnictl and add config_pai.yml (#144)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* add config_pai.yml

* refactor nnictl create logic and add colorful print

* fix nnictl stop logic

* add annotation for config_pai.yml

* add document for start experiment

* fix config.yml

* fix document

* Fix trial keeper wrongly exit issue (#152)

* Fix trial keeper bug, use actual exitcode to exit rather than 1

* Fix bug of table sort (#145)

* Update doc for PAIMode and v0.2 release notes (#153)

* Update v0.2 documentation regards to release note and PAI training service

* Update document to describe NNI docker image

* fix antd (#159)

* refactor experiment stopping logic

* support change concurrency

* remove trialJobs.ts

* trivial changes

* fix bugs

* fix bug

* support updating maxTrialNum

* Modify IT scripts for supporting multiple experiments

* Update ci (#175)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* modify CI cuz of refracting exp stop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* file saving

* fix issues from code merge

* remove $(INSTALL_PREFIX)/nni/nni_manager before install

* fix indent

* fix merge issue

* socket close

* update port

* fix merge error

* modify ci logic in nnimanager

* fix ci

* fix bug

* change suspended to done

* update ci (#229)

* update ci

* update ci

* update ci (#232)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update ci (#233)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update azure-pipelines

* run.py (#238)

* Nnupdate ci (#239)

* run.py

* test ci

* Nnupdate ci (#240)

* run.py

* test ci

* test ci

* Udci (#241)

* run.py

* test ci

* test ci

* test ci

* update ci (#242)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh (#244)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh

* add comments

* remove assert

* trivial change

* trivial change

* update Makefile (#246)

* update Makefile

* update Makefile

* quick fix for ci (#248)

* add update trialNum and fix bugs (#261)

* Add builtin tuner to CI (#247)

* update Makefile

* update Makefile

* add builtin-tuner test

* add builtin-tuner test

* refractor ci

* update azure.yml

* add built-in tuner test

* fix bugs

* Doc refactor (#258)

* doc refactor

* image name refactor

* Refactor nnictl to support listing stopped experiments. (#256)

Refactor nnictl to support listing stopped experiments.

* Show experiment parameters more beautifully (#262)

* fix error on example of RemoteMachineMode (#269)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* Update docker file to use latest nni release (#263)

* fix bug about execDuration and endTime (#270)

* fix bug about execDuration and endTime

* modify time interval to 30 seconds

* refactor based on Gems's suggestion

* for triggering ci

* Refactor dockerfile (#264)

* refactor Dockerfile

* Support nnictl tensorboard (#268)

support tensorboard

* Sdk update (#272)

* Rename get_parameters to get_next_parameter

* annotations add get_next_parameter

* updates

* updates

* updates

* updates

* updates

* add experiment log path to experiment profile (#276)

* refactor extract reward from dict by tuner

* update Makefile for mac support, wait for aka.ms support

* refix Makefile for colorful echo

* unversion config.yml with machine information

* sync graph.py between tuners & trial of ga_squad

* sync graph.py between tuners & trial of ga_squad

* copy weight shared ga_squad under weight_sharing folder

* mv ga_squad code back to master

* simple tuner & trial ready

* Fix nnictl multiThread option

* weight sharing with async dispatcher simple example ready

* update for ga_squad

* fix bug

* modify multihead attention name

* add min_layer_num to Graph

* fix bug

* update share id calc

* fix bug

* add save logging

* fix ga_squad tuner bug

* sync bug fix for ga_squad tuner

* fix same hash_id bug

* add lock to simple tuner in weight sharing

* Add readme to simple weight sharing

* update

* update

* add paper link

* update

* reformat with autopep8

* add documentation for weight sharing

* test for weight sharing

* delete irrelevant files

* move details of weight sharing in to code comments

* add example section

* update weight sharing tutorial

* Dev weight sharing (#581)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* simple weight sharing

* update gitignore file

* change tuner codedir to relative path

* add python cache files to gitignore list

* move extract scalar reward logic from dispatcher to tuner

* update tuner code corresponding to last commit

* update doc for receive_trial_result api change

* add numpy to package whitelist of pylint

* distinguish param value from return reward for tuner.extract_scalar_reward

* update pylintrc

* add comments to dispatcher.handle_report_metric_data

* update install for mac support

* fix root mode bug on Makefile

* Quick fix bug: nnictl port value error (#245)

* fix port bug

* Dev exp stop more (#221)

* Exp stop refactor (#161)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* fix setup.py (#115)

* Add DAG model configuration format for SQuAD example.

* Explain config format for SQuAD QA model.

* Add more detailed introduction about the evolution algorithm.

* Fix install.sh add add trial log path (#109)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* show trial log path

* update document

* fix install.sh

* set default vallue for maxTrialNum and maxExecDuration

* fix nnictl

* Dev smac (#116)

* support package install (#91)

* fix nnictl bug

* support package install

* update

* update package install logic

* Fix package install issue (#95)

* fix nnictl bug

* fix pakcage install

* support SMAC as a tuner on nni (#81)

* update doc

* update doc

* update doc

* update hyperopt installation

* update doc

* update doc

* update description in setup.py

* update setup.py

* modify encoding

* encoding

* add encoding

* remove pymc3

* update doc

* update builtin tuner spec

* support smac in sdk, fix logging issue

* support smac tuner

* add optimize_mode

* update config in nnictl

* add __init__.py

* update smac

* update import path

* update setup.py: remove entry_point

* update rest server validation

* fix bug in nnictl launcher

* support classArgs: optimize_mode

* quick fix bug

* test travis

* add dependency

* add dependency

* add dependency

* add dependency

* create smac python package

* fix trivial points

* optimize import of tuners, modify nnictl accordingly

* fix bug: incorrect algorithm_name

* trivial refactor

* for debug

* support virtual

* update doc of SMAC

* update smac requirements

* update requirements

* change debug mode

* update doc

* update doc

* refactor based on comments

* fix comments

* modify example config path to relative path and increase maxTrialNum (#94)

* modify example config path to relative path and increase maxTrialNum

* add document

* support conda (#90) (#110)

* support install from venv and travis CI

* support install from venv and travis CI

* support install from venv and travis CI

* support conda

* support conda

* modify example config path to relative path and increase maxTrialNum

* undo messy commit

* undo messy commit

* Support pip install as root (#77)

* Typo on #58 (#122)

* PAI Training Service implementation (#128)

* PAI Training service implementation
**1. Implement PAITrainingService
**2. Add trial-keeper python module, and modify setup.py to install the module
**3. Add PAItrainingService rest server to collect metrics from PAI container.

* fix datastore for multiple final result (#129)

* Update NNI v0.2 release notes (#132)

Update NNI v0.2 release notes

* Update setup.py Makefile and documents (#130)

* update makefile and setup.py

* update makefile and setup.py

* update document

* update document

* Update Makefile no travis

*  update doc

*  update doc

* fix convert from ss to pcs (#133)

* Fix bugs about webui (#131)

* Fix webui bugs

* Fix tslint

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* Merge branch V0.2 to Master (#143)

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* fix bug (#147)

* Refactor nnictl and add config_pai.yml (#144)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* add config_pai.yml

* refactor nnictl create logic and add colorful print

* fix nnictl stop logic

* add annotation for config_pai.yml

* add document for start experiment

* fix config.yml

* fix document

* Fix trial keeper wrongly exit issue (#152)

* Fix trial keeper bug, use actual exitcode to exit rather than 1

* Fix bug of table sort (#145)

* Update doc for PAIMode and v0.2 release notes (#153)

* Update v0.2 documentation regards to release note and PAI training service

* Update document to describe NNI docker image

* fix antd (#159)

* refactor experiment stopping logic

* support change concurrency

* remove trialJobs.ts

* trivial changes

* fix bugs

* fix bug

* support updating maxTrialNum

* Modify IT scripts for supporting multiple experiments

* Update ci (#175)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* modify CI cuz of refracting exp stop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* file saving

* fix issues from code merge

* remove $(INSTALL_PREFIX)/nni/nni_manager before install

* fix indent

* fix merge issue

* socket close

* update port

* fix merge error

* modify ci logic in nnimanager

* fix ci

* fix bug

* change suspended to done

* update ci (#229)

* update ci

* update ci

* update ci (#232)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update ci (#233)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update azure-pipelines

* run.py (#238)

* Nnupdate ci (#239)

* run.py

* test ci

* Nnupdate ci (#240)

* run.py

* test ci

* test ci

* Udci (#241)

* run.py

* test ci

* test ci

* test ci

* update ci (#242)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh (#244)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh

* add comments

* remove assert

* trivial change

* trivial change

* update Makefile (#246)

* update Makefile

* update Makefile

* quick fix for ci (#248)

* add update trialNum and fix bugs (#261)

* Add builtin tuner to CI (#247)

* update Makefile

* update Makefile

* add builtin-tuner test

* add builtin-tuner test

* refractor ci

* update azure.yml

* add built-in tuner test

* fix bugs

* Doc refactor (#258)

* doc refactor

* image name refactor

* Refactor nnictl to support listing stopped experiments. (#256)

Refactor nnictl to support listing stopped experiments.

* Show experiment parameters more beautifully (#262)

* fix error on example of RemoteMachineMode (#269)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* Update docker file to use latest nni release (#263)

* fix bug about execDuration and endTime (#270)

* fix bug about execDuration and endTime

* modify time interval to 30 seconds

* refactor based on Gems's suggestion

* for triggering ci

* Refactor dockerfile (#264)

* refactor Dockerfile

* Support nnictl tensorboard (#268)

support tensorboard

* Sdk update (#272)

* Rename get_parameters to get_next_parameter

* annotations add get_next_parameter

* updates

* updates

* updates

* updates

* updates

* add experiment log path to experiment profile (#276)

* refactor extract reward from dict by tuner

* update Makefile for mac support, wait for aka.ms support

* refix Makefile for colorful echo

* unversion config.yml with machine information

* sync graph.py between tuners & trial of ga_squad

* sync graph.py between tuners & trial of ga_squad

* copy weight shared ga_squad under weight_sharing folder

* mv ga_squad code back to master

* simple tuner & trial ready

* Fix nnictl multiThread option

* weight sharing with async dispatcher simple example ready

* update for ga_squad

* fix bug

* modify multihead attention name

* add min_layer_num to Graph

* fix bug

* update share id calc

* fix bug

* add save logging

* fix ga_squad tuner bug

* sync bug fix for ga_squad tuner

* fix same hash_id bug

* add lock to simple tuner in weight sharing

* Add readme to simple weight sharing

* update

* update

* add paper link

* update

* reformat with autopep8

* add documentation for weight sharing

* test for weight sharing

* delete irrelevant files

* move details of weight sharing in to code comments

* add example section

* update weight sharing tutorial

* fix divide by zero risk

* update tuner thread exception handling

* fix bug for async test

* Add frameworkcontroller document (#530)

Add frameworkcontroller document.
Fix other document small issues.

* [WebUI] Show trial log for pai and k8s (#580)

* [WebUI] Show trial log for pai and k8s

* fix lint

* Fix comments

* [WebUI] Show trial log for pai and k8s (#580)

* [WebUI] Show trial log for pai and k8s

* fix lint

* Fix comments

* add __init__.py to metis_tuner (#588)

* [Document] Update webui doc (#587)

* Update webui document

* update

* Update Dockerfile and README (#589)

* fix some bugs in doc and log

* The learning rate focus more on validation sets accuracy than training sets accuracy.

* update Dockerfile and README

* Update README.md

Merge to branch v0.5

* [WebUI] Fix bug (#591)

* fix bug

* fix bug of background

* update

* update

* add frameworkcontroller platform

* update README in metis and update RuntimeError info (#595)

* update README in metis and update RuntimeError

* fix typo

* add numerical choice check

* update

* udpate NFS setup tutorial (#597)

* Remove unused example (#600)

* update README in metis and update RuntimeError

* remove smart params

* Update release note (#603)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md
…

* update doc: overview (#555)

* doc overview

* update overview

* modification

* update overview

* update

* update

* update

* update

* Delete mkdocs.yml

* cifar10 example doc (#573)

* add cifar10 examples

* add search space and result

* update

* update typo

* update

* update

* Update doc: refactor ExperimentConfig.md (#602)

* fix doc

* add link in doc

* update

* add nnictl package cmd

* update doc index & add api reference (#636)

* add trigger (#544)

* NNI logging architecture improvement  (#539)

* Removed unused log code, refactor to rename some class name in nni sdk and trial_tools

* Fix the regression bug that loca/remote mode doesnt work

* [WebUI] Fix issue#517 & issue#459 (#524)

* [WebUI] Fix issue#517 & issue#459

* update

* [Logging architecture refactor] Remove unused metrics related code in nni trial_tools, support kubeflow mode for logging architecture refactor (#551)

* Remove unused metrics related code in nni trial_tools, support kubeflow mode for logging architecture refactor

* Doc typo and format fixes (#560)

* fix incorrect document

* fix doc format and typo

* fix state transition (#504)

* Add enas nni version from contributor (#557)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* update readme

* update

* fix path

* update reference

* fix bug in config file

* update nni_arch_overview.png

* update

* update

* update

* add enas_nni

* Code coverage report (#559)

* Add UT code coverage report

* updates

* updates

* updates

* updates

* updates

* updates

* integration test python code coverage report

* Updating Readme to add the Related Projects like PAI, KubeLauncher and MMdnn (#565)

* Adding related projects to Readme

* add Trials.md

* add Trials.md

* add Trials.md

* Fix remote TrainingService bug, change forEach to "for of" (#564)

trial job could not be stopped in remote machine when experiment is stopped, because awit/async does not work normally in forEach, refer https://codeburst.io/javascript-async-await-with-foreach-b6ba62bbf404

.

* add Trials.md

* add docs

* To install the whole nni in an virtual environment (#538)

* support venv

* adapt venv

* adapt venv

* adapt venv

* adapt venv

* new test

* new test

* new test

* support venv

* support venv

* support venv

* support venv

* support venv

* support venv

* support venv

* colorful output for mac

* colorful output for mac

* permission denied in /tmp

* permission denied in /tmp

* permission denied in /tmp

* remove unused variable

* final

* remove build python

* Make it feasible for annotation whether to add an extra line "nni.get_next_parameter()" or not (#526)

* fix bug

* add docs

* add ut

* add ut

* add to ci

* update doc

* update doc

* update ut

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* add ut to ci

* test

* test

* test

* test

* test

* test

* test

* test

* test

* test

* revert

* refactor

* refactor

* s

* merge

* fix annotation for extra line

* add deprecation warning

* fix permision deny (#567)

* Add Metis Tuner (#534)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* update readme

* update

* fix path

* update reference

* fix bug in config file

* update nni_arch_overview.png

* update

* update

* update

* add metis tuner code

* 1. fix bug about import 2.update other sdk file

* add auto-gbdt-example and remove unused code

* add metis_tuner into README

* update the README

* update README | remove unused variable

* fix typo

* add sklearn into requirments

* Update src/sdk/pynni/nni/metis_tuner/metis_tuner.py

add default value in __init__
Co-Authored-By: default avatarxuehui1991 <xuehui@microsoft.com>

* Update docs/HowToChooseTuner.md
Co-Authored-By: default avatarxuehui1991 <xuehui@microsoft.com>

* Update docs/HowToChooseTuner.md
Co-Authored-By: default avatarxuehui1991 <xuehui@microsoft.com>

* fix typo | add more comments

* Change WARNING to INFO (#574)

change the warning level to info level when expand relative path
add nnictl --version log
update readme.md

* Fix some bugs in doc and log (#561)

* fix some bugs in doc and log

* The learning rate focus more on validation sets accuracy than training sets accuracy.

*  Fix a race condidtion issue in trial_keeper for reading log from pipe (#578)

* Fix a race condidtion issue in trial_keeper for reading log from pipe

* [WebUI] Fix issue#458 about final result as dict (#563)

* [WebUI] Fix issue#458 about final result as dict

* Fix comments

* fix bug

* support frameworkcontroller log (#572)

support frameworkcontroller log

* Dev weight sharing (#568) (#576)

* Dev weight sharing (#568)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* simple weight sharing

* update gitignore file

* change tuner codedir to relative path

* add python cache files to gitignore list

* move extract scalar reward logic from dispatcher to tuner

* update tuner code corresponding to last commit

* update doc for receive_trial_result api change

* add numpy to package whitelist of pylint

* distinguish param value from return reward for tuner.extract_scalar_reward

* update pylintrc

* add comments to dispatcher.handle_report_metric_data

* update install for mac support

* fix root mode bug on Makefile

* Quick fix bug: nnictl port value error (#245)

* fix port bug

* Dev exp stop more (#221)

* Exp stop refactor (#161)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* fix setup.py (#115)

* Add DAG model configuration format for SQuAD example.

* Explain config format for SQuAD QA model.

* Add more detailed introduction about the evolution algorithm.

* Fix install.sh add add trial log path (#109)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* show trial log path

* update document

* fix install.sh

* set default vallue for maxTrialNum and maxExecDuration

* fix nnictl

* Dev smac (#116)

* support package install (#91)

* fix nnictl bug

* support package install

* update

* update package install logic

* Fix package install issue (#95)

* fix nnictl bug

* fix pakcage install

* support SMAC as a tuner on nni (#81)

* update doc

* update doc

* update doc

* update hyperopt installation

* update doc

* update doc

* update description in setup.py

* update setup.py

* modify encoding

* encoding

* add encoding

* remove pymc3

* update doc

* update builtin tuner spec

* support smac in sdk, fix logging issue

* support smac tuner

* add optimize_mode

* update config in nnictl

* add __init__.py

* update smac

* update import path

* update setup.py: remove entry_point

* update rest server validation

* fix bug in nnictl launcher

* support classArgs: optimize_mode

* quick fix bug

* test travis

* add dependency

* add dependency

* add dependency

* add dependency

* create smac python package

* fix trivial points

* optimize import of tuners, modify nnictl accordingly

* fix bug: incorrect algorithm_name

* trivial refactor

* for debug

* support virtual

* update doc of SMAC

* update smac requirements

* update requirements

* change debug mode

* update doc

* update doc

* refactor based on comments

* fix comments

* modify example config path to relative path and increase maxTrialNum (#94)

* modify example config path to relative path and increase maxTrialNum

* add document

* support conda (#90) (#110)

* support install from venv and travis CI

* support install from venv and travis CI

* support install from venv and travis CI

* support conda

* support conda

* modify example config path to relative path and increase maxTrialNum

* undo messy commit

* undo messy commit

* Support pip install as root (#77)

* Typo on #58 (#122)

* PAI Training Service implementation (#128)

* PAI Training service implementation
**1. Implement PAITrainingService
**2. Add trial-keeper python module, and modify setup.py to install the module
**3. Add PAItrainingService rest server to collect metrics from PAI container.

* fix datastore for multiple final result (#129)

* Update NNI v0.2 release notes (#132)

Update NNI v0.2 release notes

* Update setup.py Makefile and documents (#130)

* update makefile and setup.py

* update makefile and setup.py

* update document

* update document

* Update Makefile no travis

*  update doc

*  update doc

* fix convert from ss to pcs (#133)

* Fix bugs about webui (#131)

* Fix webui bugs

* Fix tslint

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* Merge branch V0.2 to Master (#143)

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* fix bug (#147)

* Refactor nnictl and add config_pai.yml (#144)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* add config_pai.yml

* refactor nnictl create logic and add colorful print

* fix nnictl stop logic

* add annotation for config_pai.yml

* add document for start experiment

* fix config.yml

* fix document

* Fix trial keeper wrongly exit issue (#152)

* Fix trial keeper bug, use actual exitcode to exit rather than 1

* Fix bug of table sort (#145)

* Update doc for PAIMode and v0.2 release notes (#153)

* Update v0.2 documentation regards to release note and PAI training service

* Update document to describe NNI docker image

* fix antd (#159)

* refactor experiment stopping logic

* support change concurrency

* remove trialJobs.ts

* trivial changes

* fix bugs

* fix bug

* support updating maxTrialNum

* Modify IT scripts for supporting multiple experiments

* Update ci (#175)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* modify CI cuz of refracting exp stop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* file saving

* fix issues from code merge

* remove $(INSTALL_PREFIX)/nni/nni_manager before install

* fix indent

* fix merge issue

* socket close

* update port

* fix merge error

* modify ci logic in nnimanager

* fix ci

* fix bug

* change suspended to done

* update ci (#229)

* update ci

* update ci

* update ci (#232)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update ci (#233)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update azure-pipelines

* run.py (#238)

* Nnupdate ci (#239)

* run.py

* test ci

* Nnupdate ci (#240)

* run.py

* test ci

* test ci

* Udci (#241)

* run.py

* test ci

* test ci

* test ci

* update ci (#242)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh (#244)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh

* add comments

* remove assert

* trivial change

* trivial change

* update Makefile (#246)

* update Makefile

* update Makefile

* quick fix for ci (#248)

* add update trialNum and fix bugs (#261)

* Add builtin tuner to CI (#247)

* update Makefile

* update Makefile

* add builtin-tuner test

* add builtin-tuner test

* refractor ci

* update azure.yml

* add built-in tuner test

* fix bugs

* Doc refactor (#258)

* doc refactor

* image name refactor

* Refactor nnictl to support listing stopped experiments. (#256)

Refactor nnictl to support listing stopped experiments.

* Show experiment parameters more beautifully (#262)

* fix error on example of RemoteMachineMode (#269)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* Update docker file to use latest nni release (#263)

* fix bug about execDuration and endTime (#270)

* fix bug about execDuration and endTime

* modify time interval to 30 seconds

* refactor based on Gems's suggestion

* for triggering ci

* Refactor dockerfile (#264)

* refactor Dockerfile

* Support nnictl tensorboard (#268)

support tensorboard

* Sdk update (#272)

* Rename get_parameters to get_next_parameter

* annotations add get_next_parameter

* updates

* updates

* updates

* updates

* updates

* add experiment log path to experiment profile (#276)

* refactor extract reward from dict by tuner

* update Makefile for mac support, wait for aka.ms support

* refix Makefile for colorful echo

* unversion config.yml with machine information

* sync graph.py between tuners & trial of ga_squad

* sync graph.py between tuners & trial of ga_squad

* copy weight shared ga_squad under weight_sharing folder

* mv ga_squad code back to master

* simple tuner & trial ready

* Fix nnictl multiThread option

* weight sharing with async dispatcher simple example ready

* update for ga_squad

* fix bug

* modify multihead attention name

* add min_layer_num to Graph

* fix bug

* update share id calc

* fix bug

* add save logging

* fix ga_squad tuner bug

* sync bug fix for ga_squad tuner

* fix same hash_id bug

* add lock to simple tuner in weight sharing

* Add readme to simple weight sharing

* update

* update

* add paper link

* update

* reformat with autopep8

* add documentation for weight sharing

* test for weight sharing

* delete irrelevant files

* move details of weight sharing in to code comments

* Dev weight sharing update doc (#577)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* simple weight sharing

* update gitignore file

* change tuner codedir to relative path

* add python cache files to gitignore list

* move extract scalar reward logic from dispatcher to tuner

* update tuner code corresponding to last commit

* update doc for receive_trial_result api change

* add numpy to package whitelist of pylint

* distinguish param value from return reward for tuner.extract_scalar_reward

* update pylintrc

* add comments to dispatcher.handle_report_metric_data

* update install for mac support

* fix root mode bug on Makefile

* Quick fix bug: nnictl port value error (#245)

* fix port bug

* Dev exp stop more (#221)

* Exp stop refactor (#161)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* fix setup.py (#115)

* Add DAG model configuration format for SQuAD example.

* Explain config format for SQuAD QA model.

* Add more detailed introduction about the evolution algorithm.

* Fix install.sh add add trial log path (#109)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* show trial log path

* update document

* fix install.sh

* set default vallue for maxTrialNum and maxExecDuration

* fix nnictl

* Dev smac (#116)

* support package install (#91)

* fix nnictl bug

* support package install

* update

* update package install logic

* Fix package install issue (#95)

* fix nnictl bug

* fix pakcage install

* support SMAC as a tuner on nni (#81)

* update doc

* update doc

* update doc

* update hyperopt installation

* update doc

* update doc

* update description in setup.py

* update setup.py

* modify encoding

* encoding

* add encoding

* remove pymc3

* update doc

* update builtin tuner spec

* support smac in sdk, fix logging issue

* support smac tuner

* add optimize_mode

* update config in nnictl

* add __init__.py

* update smac

* update import path

* update setup.py: remove entry_point

* update rest server validation

* fix bug in nnictl launcher

* support classArgs: optimize_mode

* quick fix bug

* test travis

* add dependency

* add dependency

* add dependency

* add dependency

* create smac python package

* fix trivial points

* optimize import of tuners, modify nnictl accordingly

* fix bug: incorrect algorithm_name

* trivial refactor

* for debug

* support virtual

* update doc of SMAC

* update smac requirements

* update requirements

* change debug mode

* update doc

* update doc

* refactor based on comments

* fix comments

* modify example config path to relative path and increase maxTrialNum (#94)

* modify example config path to relative path and increase maxTrialNum

* add document

* support conda (#90) (#110)

* support install from venv and travis CI

* support install from venv and travis CI

* support install from venv and travis CI

* support conda

* support conda

* modify example config path to relative path and increase maxTrialNum

* undo messy commit

* undo messy commit

* Support pip install as root (#77)

* Typo on #58 (#122)

* PAI Training Service implementation (#128)

* PAI Training service implementation
**1. Implement PAITrainingService
**2. Add trial-keeper python module, and modify setup.py to install the module
**3. Add PAItrainingService rest server to collect metrics from PAI container.

* fix datastore for multiple final result (#129)

* Update NNI v0.2 release notes (#132)

Update NNI v0.2 release notes

* Update setup.py Makefile and documents (#130)

* update makefile and setup.py

* update makefile and setup.py

* update document

* update document

* Update Makefile no travis

*  update doc

*  update doc

* fix convert from ss to pcs (#133)

* Fix bugs about webui (#131)

* Fix webui bugs

* Fix tslint

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* Merge branch V0.2 to Master (#143)

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* fix bug (#147)

* Refactor nnictl and add config_pai.yml (#144)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* add config_pai.yml

* refactor nnictl create logic and add colorful print

* fix nnictl stop logic

* add annotation for config_pai.yml

* add document for start experiment

* fix config.yml

* fix document

* Fix trial keeper wrongly exit issue (#152)

* Fix trial keeper bug, use actual exitcode to exit rather than 1

* Fix bug of table sort (#145)

* Update doc for PAIMode and v0.2 release notes (#153)

* Update v0.2 documentation regards to release note and PAI training service

* Update document to describe NNI docker image

* fix antd (#159)

* refactor experiment stopping logic

* support change concurrency

* remove trialJobs.ts

* trivial changes

* fix bugs

* fix bug

* support updating maxTrialNum

* Modify IT scripts for supporting multiple experiments

* Update ci (#175)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* modify CI cuz of refracting exp stop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* file saving

* fix issues from code merge

* remove $(INSTALL_PREFIX)/nni/nni_manager before install

* fix indent

* fix merge issue

* socket close

* update port

* fix merge error

* modify ci logic in nnimanager

* fix ci

* fix bug

* change suspended to done

* update ci (#229)

* update ci

* update ci

* update ci (#232)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update ci (#233)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update azure-pipelines

* run.py (#238)

* Nnupdate ci (#239)

* run.py

* test ci

* Nnupdate ci (#240)

* run.py

* test ci

* test ci

* Udci (#241)

* run.py

* test ci

* test ci

* test ci

* update ci (#242)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh (#244)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh

* add comments

* remove assert

* trivial change

* trivial change

* update Makefile (#246)

* update Makefile

* update Makefile

* quick fix for ci (#248)

* add update trialNum and fix bugs (#261)

* Add builtin tuner to CI (#247)

* update Makefile

* update Makefile

* add builtin-tuner test

* add builtin-tuner test

* refractor ci

* update azure.yml

* add built-in tuner test

* fix bugs

* Doc refactor (#258)

* doc refactor

* image name refactor

* Refactor nnictl to support listing stopped experiments. (#256)

Refactor nnictl to support listing stopped experiments.

* Show experiment parameters more beautifully (#262)

* fix error on example of RemoteMachineMode (#269)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* Update docker file to use latest nni release (#263)

* fix bug about execDuration and endTime (#270)

* fix bug about execDuration and endTime

* modify time interval to 30 seconds

* refactor based on Gems's suggestion

* for triggering ci

* Refactor dockerfile (#264)

* refactor Dockerfile

* Support nnictl tensorboard (#268)

support tensorboard

* Sdk update (#272)

* Rename get_parameters to get_next_parameter

* annotations add get_next_parameter

* updates

* updates

* updates

* updates

* updates

* add experiment log path to experiment profile (#276)

* refactor extract reward from dict by tuner

* update Makefile for mac support, wait for aka.ms support

* refix Makefile for colorful echo

* unversion config.yml with machine information

* sync graph.py between tuners & trial of ga_squad

* sync graph.py between tuners & trial of ga_squad

* copy weight shared ga_squad under weight_sharing folder

* mv ga_squad code back to master

* simple tuner & trial ready

* Fix nnictl multiThread option

* weight sharing with async dispatcher simple example ready

* update for ga_squad

* fix bug

* modify multihead attention name

* add min_layer_num to Graph

* fix bug

* update share id calc

* fix bug

* add save logging

* fix ga_squad tuner bug

* sync bug fix for ga_squad tuner

* fix same hash_id bug

* add lock to simple tuner in weight sharing

* Add readme to simple weight sharing

* update

* update

* add paper link

* update

* reformat with autopep8

* add documentation for weight sharing

* test for weight sharing

* delete irrelevant files

* move details of weight sharing in to code comments

* add example section

* Dev weight sharing update (#579)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* simple weight sharing

* update gitignore file

* change tuner codedir to relative path

* add python cache files to gitignore list

* move extract scalar reward logic from dispatcher to tuner

* update tuner code corresponding to last commit

* update doc for receive_trial_result api change

* add numpy to package whitelist of pylint

* distinguish param value from return reward for tuner.extract_scalar_reward

* update pylintrc

* add comments to dispatcher.handle_report_metric_data

* update install for mac support

* fix root mode bug on Makefile

* Quick fix bug: nnictl port value error (#245)

* fix port bug

* Dev exp stop more (#221)

* Exp stop refactor (#161)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* fix setup.py (#115)

* Add DAG model configuration format for SQuAD example.

* Explain config format for SQuAD QA model.

* Add more detailed introduction about the evolution algorithm.

* Fix install.sh add add trial log path (#109)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* show trial log path

* update document

* fix install.sh

* set default vallue for maxTrialNum and maxExecDuration

* fix nnictl

* Dev smac (#116)

* support package install (#91)

* fix nnictl bug

* support package install

* update

* update package install logic

* Fix package install issue (#95)

* fix nnictl bug

* fix pakcage install

* support SMAC as a tuner on nni (#81)

* update doc

* update doc

* update doc

* update hyperopt installation

* update doc

* update doc

* update description in setup.py

* update setup.py

* modify encoding

* encoding

* add encoding

* remove pymc3

* update doc

* update builtin tuner spec

* support smac in sdk, fix logging issue

* support smac tuner

* add optimize_mode

* update config in nnictl

* add __init__.py

* update smac

* update import path

* update setup.py: remove entry_point

* update rest server validation

* fix bug in nnictl launcher

* support classArgs: optimize_mode

* quick fix bug

* test travis

* add dependency

* add dependency

* add dependency

* add dependency

* create smac python package

* fix trivial points

* optimize import of tuners, modify nnictl accordingly

* fix bug: incorrect algorithm_name

* trivial refactor

* for debug

* support virtual

* update doc of SMAC

* update smac requirements

* update requirements

* change debug mode

* update doc

* update doc

* refactor based on comments

* fix comments

* modify example config path to relative path and increase maxTrialNum (#94)

* modify example config path to relative path and increase maxTrialNum

* add document

* support conda (#90) (#110)

* support install from venv and travis CI

* support install from venv and travis CI

* support install from venv and travis CI

* support conda

* support conda

* modify example config path to relative path and increase maxTrialNum

* undo messy commit

* undo messy commit

* Support pip install as root (#77)

* Typo on #58 (#122)

* PAI Training Service implementation (#128)

* PAI Training service implementation
**1. Implement PAITrainingService
**2. Add trial-keeper python module, and modify setup.py to install the module
**3. Add PAItrainingService rest server to collect metrics from PAI container.

* fix datastore for multiple final result (#129)

* Update NNI v0.2 release notes (#132)

Update NNI v0.2 release notes

* Update setup.py Makefile and documents (#130)

* update makefile and setup.py

* update makefile and setup.py

* update document

* update document

* Update Makefile no travis

*  update doc

*  update doc

* fix convert from ss to pcs (#133)

* Fix bugs about webui (#131)

* Fix webui bugs

* Fix tslint

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* Merge branch V0.2 to Master (#143)

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* fix bug (#147)

* Refactor nnictl and add config_pai.yml (#144)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* add config_pai.yml

* refactor nnictl create logic and add colorful print

* fix nnictl stop logic

* add annotation for config_pai.yml

* add document for start experiment

* fix config.yml

* fix document

* Fix trial keeper wrongly exit issue (#152)

* Fix trial keeper bug, use actual exitcode to exit rather than 1

* Fix bug of table sort (#145)

* Update doc for PAIMode and v0.2 release notes (#153)

* Update v0.2 documentation regards to release note and PAI training service

* Update document to describe NNI docker image

* fix antd (#159)

* refactor experiment stopping logic

* support change concurrency

* remove trialJobs.ts

* trivial changes

* fix bugs

* fix bug

* support updating maxTrialNum

* Modify IT scripts for supporting multiple experiments

* Update ci (#175)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* modify CI cuz of refracting exp stop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* file saving

* fix issues from code merge

* remove $(INSTALL_PREFIX)/nni/nni_manager before install

* fix indent

* fix merge issue

* socket close

* update port

* fix merge error

* modify ci logic in nnimanager

* fix ci

* fix bug

* change suspended to done

* update ci (#229)

* update ci

* update ci

* update ci (#232)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update ci (#233)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update azure-pipelines

* run.py (#238)

* Nnupdate ci (#239)

* run.py

* test ci

* Nnupdate ci (#240)

* run.py

* test ci

* test ci

* Udci (#241)

* run.py

* test ci

* test ci

* test ci

* update ci (#242)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh (#244)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh

* add comments

* remove assert

* trivial change

* trivial change

* update Makefile (#246)

* update Makefile

* update Makefile

* quick fix for ci (#248)

* add update trialNum and fix bugs (#261)

* Add builtin tuner to CI (#247)

* update Makefile

* update Makefile

* add builtin-tuner test

* add builtin-tuner test

* refractor ci

* update azure.yml

* add built-in tuner test

* fix bugs

* Doc refactor (#258)

* doc refactor

* image name refactor

* Refactor nnictl to support listing stopped experiments. (#256)

Refactor nnictl to support listing stopped experiments.

* Show experiment parameters more beautifully (#262)

* fix error on example of RemoteMachineMode (#269)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* Update docker file to use latest nni release (#263)

* fix bug about execDuration and endTime (#270)

* fix bug about execDuration and endTime

* modify time interval to 30 seconds

* refactor based on Gems's suggestion

* for triggering ci

* Refactor dockerfile (#264)

* refactor Dockerfile

* Support nnictl tensorboard (#268)

support tensorboard

* Sdk update (#272)

* Rename get_parameters to get_next_parameter

* annotations add get_next_parameter

* updates

* updates

* updates

* updates

* updates

* add experiment log path to experiment profile (#276)

* refactor extract reward from dict by tuner

* update Makefile for mac support, wait for aka.ms support

* refix Makefile for colorful echo

* unversion config.yml with machine information

* sync graph.py between tuners & trial of ga_squad

* sync graph.py between tuners & trial of ga_squad

* copy weight shared ga_squad under weight_sharing folder

* mv ga_squad code back to master

* simple tuner & trial ready

* Fix nnictl multiThread option

* weight sharing with async dispatcher simple example ready

* update for ga_squad

* fix bug

* modify multihead attention name

* add min_layer_num to Graph

* fix bug

* update share id calc

* fix bug

* add save logging

* fix ga_squad tuner bug

* sync bug fix for ga_squad tuner

* fix same hash_id bug

* add lock to simple tuner in weight sharing

* Add readme to simple weight sharing

* update

* update

* add paper link

* update

* reformat with autopep8

* add documentation for weight sharing

* test for weight sharing

* delete irrelevant files

* move details of weight sharing in to code comments

* add example section

* update weight sharing tutorial

* Dev weight sharing (#581)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* simple weight sharing

* update gitignore file

* change tuner codedir to relative path

* add python cache files to gitignore list

* move extract scalar reward logic from dispatcher to tuner

* update tuner code corresponding to last commit

* update doc for receive_trial_result api change

* add numpy to package whitelist of pylint

* distinguish param value from return reward for tuner.extract_scalar_reward

* update pylintrc

* add comments to dispatcher.handle_report_metric_data

* update install for mac support

* fix root mode bug on Makefile

* Quick fix bug: nnictl port value error (#245)

* fix port bug

* Dev exp stop more (#221)

* Exp stop refactor (#161)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* fix setup.py (#115)

* Add DAG model configuration format for SQuAD example.

* Explain config format for SQuAD QA model.

* Add more detailed introduction about the evolution algorithm.

* Fix install.sh add add trial log path (#109)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* show trial log path

* update document

* fix install.sh

* set default vallue for maxTrialNum and maxExecDuration

* fix nnictl

* Dev smac (#116)

* support package install (#91)

* fix nnictl bug

* support package install

* update

* update package install logic

* Fix package install issue (#95)

* fix nnictl bug

* fix pakcage install

* support SMAC as a tuner on nni (#81)

* update doc

* update doc

* update doc

* update hyperopt installation

* update doc

* update doc

* update description in setup.py

* update setup.py

* modify encoding

* encoding

* add encoding

* remove pymc3

* update doc

* update builtin tuner spec

* support smac in sdk, fix logging issue

* support smac tuner

* add optimize_mode

* update config in nnictl

* add __init__.py

* update smac

* update import path

* update setup.py: remove entry_point

* update rest server validation

* fix bug in nnictl launcher

* support classArgs: optimize_mode

* quick fix bug

* test travis

* add dependency

* add dependency

* add dependency

* add dependency

* create smac python package

* fix trivial points

* optimize import of tuners, modify nnictl accordingly

* fix bug: incorrect algorithm_name

* trivial refactor

* for debug

* support virtual

* update doc of SMAC

* update smac requirements

* update requirements

* change debug mode

* update doc

* update doc

* refactor based on comments

* fix comments

* modify example config path to relative path and increase maxTrialNum (#94)

* modify example config path to relative path and increase maxTrialNum

* add document

* support conda (#90) (#110)

* support install from venv and travis CI

* support install from venv and travis CI

* support install from venv and travis CI

* support conda

* support conda

* modify example config path to relative path and increase maxTrialNum

* undo messy commit

* undo messy commit

* Support pip install as root (#77)

* Typo on #58 (#122)

* PAI Training Service implementation (#128)

* PAI Training service implementation
**1. Implement PAITrainingService
**2. Add trial-keeper python module, and modify setup.py to install the module
**3. Add PAItrainingService rest server to collect metrics from PAI container.

* fix datastore for multiple final result (#129)

* Update NNI v0.2 release notes (#132)

Update NNI v0.2 release notes

* Update setup.py Makefile and documents (#130)

* update makefile and setup.py

* update makefile and setup.py

* update document

* update document

* Update Makefile no travis

*  update doc

*  update doc

* fix convert from ss to pcs (#133)

* Fix bugs about webui (#131)

* Fix webui bugs

* Fix tslint

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* Merge branch V0.2 to Master (#143)

* webui logpath and document (#135)

* Add webui document and logpath as a href

* fix tslint

* fix comments by Chengmin

* Pai training service bug fix and enhancement (#136)

* Add NNI installation scripts

* Update pai script, update NNI_out_dir

* Update NNI dir in nni sdk local.py

* Create .nni folder in nni sdk local.py

* Add check before creating .nni folder

* Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT

* Improve annotation (#138)

* Improve annotation

* Minor bugfix

* Selectively install through pip (#139)

Selectively install through pip 
* update setup.py

* fix paiTrainingService bugs (#137)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* Add documentation for NNI PAI mode experiment (#141)

* Add documentation for NNI PAI mode

* Fix typo based on PR comments

* Exit with subprocess return code of trial keeper

* Remove additional exit code

* Fix typo based on PR comments

* update doc for smac tuner (#140)

* Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)

* Revert "Selectively install through pip (#139)"

This reverts commit 1d174836.

* Add exit code of subprocess for trial_keeper

* Update README, add link to PAImode doc

* fix bug (#147)

* Refactor nnictl and add config_pai.yml (#144)

* fix nnictl bug

* add hdfs host validation

* fix bugs

* fix dockerfile

* fix install.sh

* update install.sh

* fix dockerfile

* Set timeout for HDFSUtility exists function

* remove unused TODO

* fix sdk

* add optional for outputDir and dataDir

* refactor dockerfile.base

* Remove unused import in hdfsclientUtility

* add config_pai.yml

* refactor nnictl create logic and add colorful print

* fix nnictl stop logic

* add annotation for config_pai.yml

* add document for start experiment

* fix config.yml

* fix document

* Fix trial keeper wrongly exit issue (#152)

* Fix trial keeper bug, use actual exitcode to exit rather than 1

* Fix bug of table sort (#145)

* Update doc for PAIMode and v0.2 release notes (#153)

* Update v0.2 documentation regards to release note and PAI training service

* Update document to describe NNI docker image

* fix antd (#159)

* refactor experiment stopping logic

* support change concurrency

* remove trialJobs.ts

* trivial changes

* fix bugs

* fix bug

* support updating maxTrialNum

* Modify IT scripts for supporting multiple experiments

* Update ci (#175)

* Update RemoteMachineMode.md (#63)

* Remove unused classes for SQuAD QA example.

* Remove more unused functions for SQuAD QA example.

* Fix default dataset config.

* Add Makefile README (#64)

* update document (#92)

* Edit readme.md

* updated a word

* Update GetStarted.md

* Update GetStarted.md

* refact readme, getstarted and write your trial md.

* Update README.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Update WriteYourTrial.md

* Fix nnictl bugs and add new feature (#75)

* fix nnictl bug

* fix nnictl create bug

* add experiment status logic

* add more information for nnictl

* fix Evolution Tuner bug

* refactor code

* fix code in updater.py

* fix nnictl --help

* fix classArgs bug

* update check response.status_code logic

* remove Buffer warning (#100)

* update readme in ga_squad

* update readme

* fix typo

* Update README.md

* Update README.md

* Update README.md

* Add support for debugging mode

* modify CI cuz of refracting exp stop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* update CI for expstop

* file saving

* fix issues from code merge

* remove $(INSTALL_PREFIX)/nni/nni_manager before install

* fix indent

* fix merge issue

* socket close

* update port

* fix merge error

* modify ci logic in nnimanager

* fix ci

* fix bug

* change suspended to done

* update ci (#229)

* update ci

* update ci

* update ci (#232)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update ci (#233)

* update ci

* update ci

* update azure-pipelines

* update azure-pipelines

* update azure-pipelines

* run.py (#238)

* Nnupdate ci (#239)

* run.py

* test ci

* Nnupdate ci (#240)

* run.py

* test ci

* test ci

* Udci (#241)

* run.py

* test ci

* test ci

* test ci

* update ci (#242)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh (#244)

* run.py

* test ci

* test ci

* test ci

* update ci

* revert install.sh

* add comments

* remove assert

* trivial change

* trivial change

* update Makefile (#246)

* update Makefile

* update Makefile

* quick fix for ci (#248)

* add update trialNum and fix bugs (#261)

* Add builtin tuner to CI (#247)

* update Makefile

* update Makefile

* add builtin-tuner test

* add builtin-tuner test

* refractor ci

* update azure.yml

* add built-in tuner test

* fix bugs

* Doc refactor (#258)

* doc refactor

* image name refactor

* Refactor nnictl to support listing stopped experiments. (#256)

Refactor nnictl to support listing stopped experiments.

* Show experiment parameters more beautifully (#262)

* fix error on example of RemoteMachineMode (#269)

* add pycharm project files to .gitignore list

* update pylintrc to conform vscode settings

* fix RemoteMachineMode for wrong trainingServicePlatform

* Update docker file to use latest nni release (#263)

* fix bug about execDuration and endTime (#270)

* fix bug about execDuration and endTime

* modify time interval to 30 seconds

* refactor based on Gems's suggestion

* for triggering ci

* Refactor dockerfile (#264)

* refactor Dockerfile

* Support nnictl tensorboard (#268)

support tensorboard

* Sdk update (#272)

* Rename get_parameters to get_next_parameter

* annotations add get_next_parameter

* updates

* updates

* updates

* updates

* updates

* add experiment log path to experiment profile (#276)

* refactor extract reward from dict by tuner

* update Makefile for mac support, wait for aka.ms support

* refix Makefile for colorful echo

* unversion config.yml with machine information

* sync graph.py between tuners & trial of ga_squad

* sync graph.py between tuners & trial of ga_squad

* copy weight shared ga_squad under weight_sharing folder

* mv ga_squad code back to master

* simple tuner & trial ready

* Fix nnictl multiThread option

* weight sharing with async dispatcher simple example ready

* update for ga_squad

* fix bug

* modify multihead attention name

* add min_layer_num to Graph

* fix bug

* update share id calc

* fix bug

* add save logging

* fix ga_squad tuner bug

* sync bug fix for ga_squad tuner

* fix same hash_id bug

* add lock to simple tuner in weight sharing

* Add readme to simple weight sharing

* update

* update

* add paper link

* update

* reformat with autopep8

* add documentation for weight sharing

* test for weight sharing

* delete irrelevant files

* move details of weight sharing in to code comments

* add example section

* update weight sharing tutorial

* fix divide by zero risk

* update tuner thread exception handling

* fix bug for async test

* Add frameworkcontroller document (#530)

Add frameworkcontroller document.
Fix other document small issues.

* [WebUI] Show trial log for pai and k8s (#580)

* [WebUI] Show trial log for pai and k8s

* fix lint

* Fix comments

* [WebUI] Show trial log for pai and k8s (#580)

* [WebUI] Show trial log for pai and k8s

* fix lint

* Fix comments

* add __init__.py to metis_tuner (#588)

* add docs

* [Document] Update webui doc (#587)

* Update webui document

* update

* Update Dockerfile and README (#589)

* fix some bugs in doc and log

* The learning rate focus more on validation sets accuracy than training sets accuracy.

* update Dockerfile and README

* Update README.md

Merge to branch v0.5

* [WebUI] Fix bug (#591)

* fix bug

* fix bug of background

* update

* update

* add frameworkcontroller platform

* update README in metis and update RuntimeError info (#595)

* update README in metis and update RuntimeError

* fix typo

* add numerical choice check

* update

* udpate NFS setup tutorial (#597)

* Remove unused example (#600)

* update README in metis and update RuntimeError

* remove smart params

* Update release note …

* Add sklearn_example.md (#647)

* fix doc

* add link in doc

* update

* add nnictl package cmd

* add sklearn-example.md

* update

* update

* fix link

* add desc in example.rst

* revert nnictl before sphinx try

* fix mnist.py example

* add SQuAD_evolution_examples.md (#620)

* add SQuAD_evolution_examples.md

* add update

* remove yml file

* remove mkdoc.yml

* update Example.rst

* Add GBDT example doc (#654)

* add SQuAD_evolution_examples.md

* add update

* remove yml file

* remove mkdoc.yml

* update Example.rst

* update gbdt_example.md

* update

* add run command line

* update Evlution_SQuAD.md

* update link

* add gbdt in Example.rst

* Update SearchSpaceSpec (#656)

* add Trials.md

* add Trials.md

* add Trials.md

* add Trials.md

* add docs

* add docs

* add docs

* add docs

* add docs

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* docs modification

* update triaL.MD

* update triaL.MD

* update triaL.MD

* update triaL.MD

* update triaL.MD

* add grid search tuner doc

* add grid search tuner doc

* update SearchSpaceSpec

* update SearchSpaceSpec

* update SearchSpaceSpec

* update SearchSpaceSpec

* update SearchSpaceSpec

* update SearchSpaceSpec

* update SearchSpaceSpec

* update SearchSpaceSpec

* update SearchSpaceSpec

* fix color for zejun

* fix mnist before

* fix image

* Fix doc format (#658)

* add SQuAD_evolution_examples.md

* add update

* remove yml file

* fix format problem

* update index

* fix typo

* fix broken-links of quickstart/tuners/assessors (#662)

* update assessor.rst

* fix

* Dev doc fix4 (#672)

* refactor index

* fix title format for remote

* fix nnictl doc w/ rst

* fix assessor pic link

* fix type for training service

* adjust tutorial

* fix typo

* Dev doc (#669)

* changed image link

* deleted link of triallog.png

new version do not has this function

* Update AdvancedNAS.md

* changed image link

weight sharing image

* update doc (#670)

* update doc

* Dev doc fix4 (#672)

* refactor index

* fix title format for remote

* fix nnictl doc w/ rst

* fix assessor pic link

* fix type for training service

* adjust tutorial

* fix typo

* Update customized assessor doc (#671)

* Update customized assessor doc

* updates

* fix typo

* update doc (#673)
parent a441558c
_build
_static
_templates
\ No newline at end of file
......@@ -19,7 +19,7 @@ tuner:
```
And let tuner decide where to save & load weights and feed the paths to trials through `nni.get_next_parameters()`:
![weight_sharing_design](./img/weight_sharing.png)
<img src="https://user-images.githubusercontent.com/23273522/51817667-93ebf080-2306-11e9-8395-b18b322062bc.png" alt="drawing" width="700"/>
For example, in tensorflow:
```python
......@@ -86,4 +86,4 @@ For details, please refer to this [simple weight sharing example](../test/async_
[2]: https://arxiv.org/abs/1707.07012
[3]: https://arxiv.org/abs/1806.09055
[4]: https://arxiv.org/abs/1806.10282
[5]: https://arxiv.org/abs/1703.01041
\ No newline at end of file
[5]: https://arxiv.org/abs/1703.01041
# NNI Annotation
For good user experience and reduce user effort, we need to design a good annotation grammar.
If users use NNI system, they only need to:
## Overview
1. Use nni.get_next_parameter() to retrieve hyper parameters from Tuner, before using other annotation, use following annotation at the begining of trial code:
'''@nni.get_next_parameter()'''
To improve user experience and reduce user effort, we design an annotation grammar. Using NNI annotation, users can adapt their code to NNI just by adding some standalone annotating strings, which does not affect the execution of the original code.
2. Annotation variable in code as:
Below is an example:
'''@nni.variable(nni.choice(2,3,5,7),name=self.conv_size)'''
```python
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1
```
The meaning of this example is that NNI will choose one of several values (0.1, 0.01, 0.001) to assign to the learning_rate variable. Specifically, this first line is an NNI annotation, which is a single string. Following is an assignment statement. What nni does here is to replace the right value of this assignment statement according to the information provided by the annotation line.
3. Annotation intermediate in code as:
'''@nni.report_intermediate_result(test_acc)'''
In this way, users could either run the python code directly or launch NNI to tune hyper-parameter in this code, without changing any codes.
4. Annotation output in code as:
## Types of Annotation:
'''@nni.report_final_result(test_acc)'''
In NNI, there are mainly four types of annotation:
5. Annotation `function_choice` in code as:
'''@nni.function_choice(max_pool(h_conv1, self.pool_size),avg_pool(h_conv1, self.pool_size),name=max_pool)'''
### 1. Annotate variables
In this way, they can easily implement automatic tuning on NNI.
`'''@nni.variable(sampling_algo, name)'''`
For `@nni.variable`, `nni.choice` is the type of search space and there are 10 types to express your search space as follows:
`@nni.variable` is used in NNI to annotate a variable.
1. `@nni.variable(nni.choice(option1,option2,...,optionN),name=variable)`
Which means the variable value is one of the options, which should be a list The elements of options can themselves be stochastic expressions
**Arguments**
2. `@nni.variable(nni.randint(upper),name=variable)`
Which means the variable value is a random integer in the range [0, upper).
- **sampling_algo**: Sampling algorithm that specifies a search space. User should replace it with a built-in NNI sampling function whose name consists of an `nni.` identification and a search space type specified in [SearchSpaceSpec](SearchSpaceSpec.md) such as `choice` or `uniform`.
- **name**: The name of the variable that the selected value will be assigned to. Note that this argument should be the same as the left value of the following assignment statement.
3. `@nni.variable(nni.uniform(low, high),name=variable)`
Which means the variable value is a value uniformly between low and high.
An example here is:
4. `@nni.variable(nni.quniform(low, high, q),name=variable)`
Which means the variable value is a value like round(uniform(low, high) / q) * q
```python
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.1
```
5. `@nni.variable(nni.loguniform(low, high),name=variable)`
Which means the variable value is a value drawn according to exp(uniform(low, high)) so that the logarithm of the return value is uniformly distributed.
### 2. Annotate functions
6. `@nni.variable(nni.qloguniform(low, high, q),name=variable)`
Which means the variable value is a value like round(exp(uniform(low, high)) / q) * q
`'''@nni.function_choice(*functions, name)'''`
7. `@nni.variable(nni.normal(label, mu, sigma),name=variable)`
Which means the variable value is a real value that's normally-distributed with mean mu and standard deviation sigma.
`@nni.function_choice` is used to choose one from several functions.
8. `@nni.variable(nni.qnormal(label, mu, sigma, q),name=variable)`
Which means the variable value is a value like round(normal(mu, sigma) / q) * q
**Arguments**
9. `@nni.variable(nni.lognormal(label, mu, sigma),name=variable)`
Which means the variable value is a value drawn according to exp(normal(mu, sigma))
- **\*functions**: Several functions that are waiting to be selected from. Note that it should be a complete function call with arguments. Such as `max_pool(hidden_layer, pool_size)`.
- **name**: The name of the function that will be replaced in the following assignment statement.
10. `@nni.variable(nni.qlognormal(label, mu, sigma, q),name=variable)`
Which means the variable value is a value like round(exp(normal(mu, sigma)) / q) * q
An example here is:
```python
"""@nni.function_choice(max_pool(hidden_layer, pool_size), avg_pool(hidden_layer, pool_size), name=max_pool)"""
h_pooling = max_pool(hidden_layer, pool_size)
```
### 3. Annotate intermediate result
`'''@nni.report_intermediate_result(metrics)'''`
`@nni.report_intermediate_result` is used to report intermediate result, whose usage is the same as `nni.report_intermediate_result` in [Trials.md](Trials.md)
### 4. Annotate final result
`'''@nni.report_final_result(metrics)'''`
`@nni.report_final_result` is used to report the final result of the current trial, whose usage is the same as `nni.report_final_result` in [Trials.md](Trials.md)
# Built-in Assessors
NNI provides state-of-the-art tuning algorithm in our builtin-assessors and makes them easy to use. Below is the brief overview of NNI current builtin Assessors:
|Assessor|Brief Introduction of Algorithm|
|---|---|
|**Medianstop**<br>[(Usage)](#MedianStop)|Medianstop is a simple early stopping rule mentioned in the [paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf). It stops a pending trial X at step S if the trial’s best objective value by step S is strictly worse than the median value of the running averages of all completed trials’ objectives reported up to step S.|
|[Curvefitting](https://github.com/Microsoft/nni/blob/master/src/sdk/pynni/nni/curvefitting_assessor/README.md)<br>[(Usage)](#Curvefitting)|Curve Fitting Assessor is a LPA(learning, predicting, assessing) algorithm. It stops a pending trial X at step S if the prediction of final epoch's performance worse than the best final performance in the trial history. In this algorithm, we use 12 curves to fit the accuracy curve|
<br>
## Usage of Builtin Assessors
Use builtin assessors provided by NNI SDK requires to declare the **builtinAssessorName** and **classArgs** in `config.yml` file. In this part, we will introduce the detailed usage about the suggested scenarios, classArg requirements, and example for each assessor.
Note: Please follow the format when you write your `config.yml` file.
<a name="MedianStop"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Median Stop Assessor`
> Builtin Assessor Name: **Medianstop**
**Suggested scenario**
It is applicable in a wide range of performance curves, thus, can be used in various scenarios to speed up the tuning progress.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', assessor will **stop** the trial with smaller expectation. If 'minimize', assessor will **stop** the trial with larger expectation.
* **start_step** (*int, optional, default = 0*) - A trial is determined to be stopped or not, only after receiving start_step number of reported intermediate results.
**Usage example:**
```yaml
# config.yml
assessor:
builtinAssessorName: Medianstop
classArgs:
optimize_mode: maximize
start_step: 5
```
<br>
<a name="Curvefitting"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Curve Fitting Assessor`
> Builtin Assessor Name: **Curvefitting**
**Suggested scenario**
It is applicable in a wide range of performance curves, thus, can be used in various scenarios to speed up the tuning progress. Even better, it's able to handle and assess curves with similar performance.
**Requirement of classArg**
* **epoch_num** (*int, **required***) - The total number of epoch. We need to know the number of epoch to determine which point we need to predict.
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', assessor will **stop** the trial with smaller expectation. If 'minimize', assessor will **stop** the trial with larger expectation.
* **start_step** (*int, optional, default = 6*) - A trial is determined to be stopped or not, we start to predict only after receiving start_step number of reported intermediate results.
* **threshold** (*float, optional, default = 0.95*) - The threshold that we decide to early stop the worse performance curve. For example: if threshold = 0.95, optimize_mode = maximize, best performance in the history is 0.9, then we will stop the trial which predict value is lower than 0.95 * 0.9 = 0.855.
**Usage example:**
```yaml
# config.yml
assessor:
builtinAssessorName: Curvefitting
classArgs:
epoch_num: 20
optimize_mode: maximize
start_step: 6
threshold: 0.95
```
\ No newline at end of file
# Built-in Tuners
NNI provides state-of-the-art tuning algorithm as our builtin-tuners and makes them easy to use. Below is the brief summary of NNI currently built-in Tuners:
|Tuner|Brief Introduction of Algorithm|
|---|---|
|**TPE**<br>[(Usage)](#TPE)|The Tree-structured Parzen Estimator (TPE) is a sequential model-based optimization (SMBO) approach. SMBO methods sequentially construct models to approximate the performance of hyperparameters based on historical measurements, and then subsequently choose new hyperparameters to test based on this model.|
|**Random Search**<br>[(Usage)](#Random)|In Random Search for Hyper-Parameter Optimization show that Random Search might be surprisingly simple and effective. We suggest that we could use Random Search as the baseline when we have no knowledge about the prior distribution of hyper-parameters.|
|**Anneal**<br>[(Usage)](#Anneal)|This simple annealing algorithm begins by sampling from the prior, but tends over time to sample from points closer and closer to the best ones observed. This algorithm is a simple variation on the random search that leverages smoothness in the response surface. The annealing rate is not adaptive.|
|**Naive Evolution**<br>[(Usage)](#Evolution)|Naive Evolution comes from Large-Scale Evolution of Image Classifiers. It randomly initializes a population-based on search space. For each generation, it chooses better ones and does some mutation (e.g., change a hyperparameter, add/remove one layer) on them to get the next generation. Naive Evolution requires many trials to works, but it's very simple and easy to expand new features.|
|**SMAC**<br>[(Usage)](#SMAC)|SMAC is based on Sequential Model-Based Optimization (SMBO). It adapts the most prominent previously used model class (Gaussian stochastic process models) and introduces the model class of random forests to SMBO, in order to handle categorical parameters. The SMAC supported by nni is a wrapper on the SMAC3 Github repo.|
|**Batch tuner**<br>[(Usage)](#Batch)|Batch tuner allows users to simply provide several configurations (i.e., choices of hyper-parameters) for their trial code. After finishing all the configurations, the experiment is done. Batch tuner only supports the type choice in search space spec.|
|**Grid Search**<br>[(Usage)](#GridSearch)|Grid Search performs an exhaustive searching through a manually specified subset of the hyperparameter space defined in the searchspace file. Note that the only acceptable types of search space are choice, quniform, qloguniform. The number q in quniform and qloguniform has special meaning (different from the spec in search space spec). It means the number of values that will be sampled evenly from the range low and high.|
|[Hyperband](https://github.com/Microsoft/nni/tree/master/src/sdk/pynni/nni/hyperband_advisor)<br>[(Usage)](#Hyperband)|Hyperband tries to use the limited resource to explore as many configurations as possible, and finds out the promising ones to get the final result. The basic idea is generating many configurations and to run them for the small number of STEPs to find out promising one, then further training those promising ones to select several more promising one.|
|[Network Morphism](https://github.com/Microsoft/nni/blob/master/src/sdk/pynni/nni/networkmorphism_tuner/README.md)<br>[(Usage)](#NetworkMorphism)|Network Morphism provides functions to automatically search for architecture of deep learning models. Every child network inherits the knowledge from its parent network and morphs into diverse types of networks, including changes of depth, width, and skip-connection. Next, it estimates the value of a child network using the historic architecture and metric pairs. Then it selects the most promising one to train.|
|**Metis Tuner**<br>[(Usage)](#MetisTuner)|Metis offers the following benefits when it comes to tuning parameters: While most tools only predict the optimal configuration, Metis gives you two outputs: (a) current prediction of optimal configuration, and (b) suggestion for the next trial. No more guesswork. While most tools assume training datasets do not have noisy data, Metis actually tells you if you need to re-sample a particular hyper-parameter.|
<br>
## Usage of Builtin Tuners
Use builtin tuner provided by NNI SDK requires to declare the **builtinTunerName** and **classArgs** in `config.yml` file. In this part, we will introduce the detailed usage about the suggested scenarios, classArg requirements and example for each tuner.
Note: Please follow the format when you write your `config.yml` file.
<a name="TPE"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `TPE`
> Builtin Tuner Name: **TPE**
**Suggested scenario**
TPE, as a black-box optimization, can be used in various scenarios and shows good performance in general. Especially when you have limited computation resource and can only try a small number of trials. From a large amount of experiments, we could found that TPE is far better than Random Search.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', tuners will return the hyperparameter set with larger expectation. If 'minimize', tuner will return the hyperparameter set with smaller expectation.
**Usage example:**
```yaml
# config.yml
tuner:
builtinTunerName: TPE
classArgs:
optimize_mode: maximize
```
<br>
<a name="Random"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Random Search`
> Builtin Tuner Name: **Random**
**Suggested scenario**
Random search is suggested when each trial does not take too long (e.g., each trial can be completed very soon, or early stopped by assessor quickly), and you have enough computation resource. Or you want to uniformly explore the search space. Random Search could be considered as baseline of search algorithm.
**Requirement of classArg:**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', tuners will return the hyperparameter set with larger expectation. If 'minimize', tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
builtinTunerName: Random
classArgs:
optimize_mode: maximize
```
<br>
<a name="Anneal"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Anneal`
> Builtin Tuner Name: **Anneal**
**Suggested scenario**
Anneal is suggested when each trial does not take too long, and you have enough computation resource(almost same with Random Search). Or the variables in search space could be sample from some prior distribution.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', tuners will return the hyperparameter set with larger expectation. If 'minimize', tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
builtinTunerName: Anneal
classArgs:
optimize_mode: maximize
```
<br>
<a name="Evolution"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Naive Evolution`
> Builtin Tuner Name: **Evolution**
**Suggested scenario**
Its requirement of computation resource is relatively high. Specifically, it requires large initial population to avoid falling into local optimum. If your trial is short or leverages assessor, this tuner is a good choice. And, it is more suggested when your trial code supports weight transfer, that is, the trial could inherit the converged weights from its parent(s). This can greatly speed up the training progress.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', tuners will return the hyperparameter set with larger expectation. If 'minimize', tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
builtinTunerName: Evolution
classArgs:
optimize_mode: maximize
```
<br>
<a name="SMAC"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `SMAC`
> Builtin Tuner Name: **SMAC**
**Suggested scenario**
Similar to TPE, SMAC is also a black-box tuner which can be tried in various scenarios, and is suggested when computation resource is limited. It is optimized for discrete hyperparameters, thus, suggested when most of your hyperparameters are discrete.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', tuners will return the hyperparameter set with larger expectation. If 'minimize', tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
builtinTunerName: SMAC
classArgs:
optimize_mode: maximize
```
<br>
<a name="Batch"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Batch Tuner`
> Builtin Tuner Name: BatchTuner
**Suggested scenario**
If the configurations you want to try have been decided, you can list them in searchspace file (using `choice`) and run them using batch tuner.
**Usage example**
```yaml
# config.yml
tuner:
builtinTunerName: BatchTuner
```
<br>
Note that the search space that BatchTuner supported like:
```json
{
"combine_params":
{
"_type" : "choice",
"_value" : [{"optimizer": "Adam", "learning_rate": 0.00001},
{"optimizer": "Adam", "learning_rate": 0.0001},
{"optimizer": "Adam", "learning_rate": 0.001},
{"optimizer": "SGD", "learning_rate": 0.01},
{"optimizer": "SGD", "learning_rate": 0.005},
{"optimizer": "SGD", "learning_rate": 0.0002}]
}
}
```
The search space file including the high-level key `combine_params`. The type of params in search space must be `choice` and the `values` including all the combined-params value.
<a name="GridSearch"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Grid Search`
> Builtin Tuner Name: **Grid Search**
**Suggested scenario**
Note that the only acceptable types of search space are `choice`, `quniform`, `qloguniform`. **The number `q` in `quniform` and `qloguniform` has special meaning (different from the spec in [search space spec](./SearchSpaceSpec.md)). It means the number of values that will be sampled evenly from the range `low` and `high`.**
It is suggested when search space is small, it is feasible to exhaustively sweeping the whole search space.
**Usage example**
```yaml
# config.yml
tuner:
builtinTunerName: GridSearch
```
<br>
<a name="Hyperband"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Hyperband`
> Builtin Advisor Name: **Hyperband**
**Suggested scenario**
It is suggested when you have limited computation resource but have relatively large search space. It performs well in the scenario that intermediate result (e.g., accuracy) can reflect good or bad of final result (e.g., accuracy) to some extent.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', tuners will return the hyperparameter set with larger expectation. If 'minimize', tuner will return the hyperparameter set with smaller expectation.
* **R** (*int, optional, default = 60*) - the maximum STEPS (could be the number of mini-batches or epochs) can be allocated to a trial. Each trial should use STEPS to control how long it runs.
* **eta** (*int, optional, default = 3*) - `(eta-1)/eta` is the proportion of discarded trials
**Usage example**
```yaml
# config.yml
advisor:
builtinAdvisorName: Hyperband
classArgs:
optimize_mode: maximize
R: 60
eta: 3
```
<br>
<a name="NetworkMorphism"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Network Morphism`
> Builtin Tuner Name: **NetworkMorphism**
**Installation**
NetworkMorphism requires [pyTorch](https://pytorch.org/get-started/locally), so users should install it first.
**Suggested scenario**
It is suggested that you want to apply deep learning methods to your task (your own dataset) but you have no idea of how to choose or design a network. You modify the [example](../examples/trials/network_morphism/cifar10/cifar10_keras.py) to fit your own dataset and your own data augmentation method. Also you can change the batch size, learning rate or optimizer. It is feasible for different tasks to find a good network architecture. Now this tuner only supports the computer vision domain.
**Requirement of classArg**
* **optimize_mode** (*maximize or minimize, optional, default = maximize*) - If 'maximize', tuners will return the hyperparameter set with larger expectation. If 'minimize', tuner will return the hyperparameter set with smaller expectation.
* **task** (*('cv'), optional, default = 'cv'*) - The domain of experiment, for now, this tuner only supports the computer vision(cv) domain.
* **input_width** (*int, optional, default = 32*) - input image width
* **input_channel** (*int, optional, default = 3*) - input image channel
* **n_output_node** (*int, optional, default = 10*) - number of classes
**Usage example**
```yaml
# config.yml
tuner:
builtinTunerName: NetworkMorphism
classArgs:
optimize_mode: maximize
task: cv
input_width: 32
input_channel: 3
n_output_node: 10
```
<br>
<a name="MetisTuner"></a>
![](https://placehold.it/15/1589F0/000000?text=+) `Metis Tuner`
> Builtin Tuner Name: **MetisTuner**
Note that the only acceptable types of search space are `choice`, `quniform`, `uniform` and `randint`.
**Installation**
Metis Tuner requires [sklearn](https://scikit-learn.org/), so users should install it first. User could use `pip3 install sklearn` to install it.
**Suggested scenario**
Similar to TPE and SMAC, Metis is a black-box tuner. If your system takes a long time to finish each trial, Metis is more favorable than other approaches such as random search. Furthermore, Metis provides guidance on the subsequent trial. Here is an [example](../examples/trials/auto-gbdt/search_space_metis.json) about the use of Metis. User only need to send the final result like `accuracy` to tuner, by calling the nni SDK.
**Requirement of classArg**
* **optimize_mode** (*'maximize' or 'minimize', optional, default = 'maximize'*) - If 'maximize', tuners will return the hyperparameter set with larger expectation. If 'minimize', tuner will return the hyperparameter set with smaller expectation.
**Usage example**
```yaml
# config.yml
tuner:
builtinTunerName: MetisTuner
classArgs:
optimize_mode: maximize
```
###############################
Contribution to NNI
###############################
.. toctree::
Development Setup<SetupNNIDeveloperEnvironment>
Contribution Guide<CONTRIBUTING>
Debug HowTo<HowToDebug>
\ No newline at end of file
# Customize Assessor
## Customize Assessor
NNI also support building an assessor by yourself to adjust your tuning demand.
If you want to implement a customized Assessor, there are three things for you to do:
1) Inherit an assessor of a base Assessor class
2) Implement assess_trial function
3) Configure your customized Assessor in experiment yaml config file
**1. Inherit an assessor of a base Assessor class**
```python
from nni.assessor import Assessor
class CustomizedAssessor(Assessor):
def __init__(self, ...):
...
```
**2. Implement assess trial function**
```python
from nni.assessor import Assessor, AssessResult
class CustomizedAssessor(Assessor):
def __init__(self, ...):
...
def assess_trial(self, trial_history):
"""
Determines whether a trial should be killed. Must override.
trial_history: a list of intermediate result objects.
Returns AssessResult.Good or AssessResult.Bad.
"""
# you code implement here.
...
```
**3. Configure your customized Assessor in experiment yaml config file**
NNI needs to locate your customized Assessor class and instantiate the class, so you need to specify the location of the customized Assessor class and pass literal values as parameters to the \_\_init__ constructor.
```yaml
assessor:
codeDir: /home/abc/myassessor
classFileName: my_customized_assessor.py
className: CustomizedAssessor
# Any parameter need to pass to your Assessor class __init__ constructor
# can be specified in this optional classArgs field, for example
classArgs:
arg1: value1
```
Please noted in **2**. The object `trial_history` are exact the object that Trial send to Assessor by using SDK `report_intermediate_result` function.
More detail example you could see:
> * [medianstop-assessor](../src/sdk/pynni/nni/medianstop_assessor)
> * [curvefitting-assessor](../src/sdk/pynni/nni/curvefitting_assessor)
\ No newline at end of file
# Customize-Tuner
## Customize Tuner
NNI provides state-of-the-art tuning algorithm in our builtin-tuners. We also support building a tuner by yourself to adjust your tuning demand.
If you want to implement and use your own tuning algorithm, you can implement a customized Tuner, there are three things for you to do:
1) Inherit a tuner of a base Tuner class
2) Implement receive_trial_result and generate_parameter function
3) Configure your customized tuner in experiment yaml config file
Here is an example:
**1. Inherit a tuner of a base Tuner class**
```python
from nni.tuner import Tuner
class CustomizedTuner(Tuner):
def __init__(self, ...):
...
```
**2. Implement receive_trial_result and generate_parameter function**
```python
from nni.tuner import Tuner
class CustomizedTuner(Tuner):
def __init__(self, ...):
...
def receive_trial_result(self, parameter_id, parameters, value):
'''
Record an observation of the objective function and Train
parameter_id: int
parameters: object created by 'generate_parameters()'
value: final metrics of the trial, including default matrix
'''
# your code implements here.
...
def generate_parameters(self, parameter_id):
'''
Returns a set of trial (hyper-)parameters, as a serializable object
parameter_id: int
'''
# your code implements here.
return your_parameters
...
```
`receive_trial_result` will receive the `parameter_id, parameters, value` as parameters input. Also, Tuner will receive the `value` object are exactly same value that Trial send.
The `your_parameters` return from `generate_parameters` function, will be package as json object by NNI SDK. NNI SDK will unpack json object so the Trial will receive the exact same `your_parameters` from Tuner.
For example:
If the you implement the `generate_parameters` like this:
```python
def generate_parameters(self, parameter_id):
'''
Returns a set of trial (hyper-)parameters, as a serializable object
parameter_id: int
'''
# your code implements here.
return {"dropout": 0.3, "learning_rate": 0.4}
```
It means your Tuner will always generate parameters `{"dropout": 0.3, "learning_rate": 0.4}`. Then Trial will receive `{"dropout": 0.3, "learning_rate": 0.4}` by calling API `nni.get_next_parameter()`. Once the trial ends with a result (normally some kind of metrics), it can send the result to Tuner by calling API `nni.report_final_result()`, for example `nni.report_final_result(0.93)`. Then your Tuner's `receive_trial_result` function will receied the result like:
```
parameter_id = 82347
parameters = {"dropout": 0.3, "learning_rate": 0.4}
value = 0.93
```
**Note that** if you want to access a file (e.g., `data.txt`) in the directory of your own tuner, you cannot use `open('data.txt', 'r')`. Instead, you should use the following:
```
_pwd = os.path.dirname(__file__)
_fd = open(os.path.join(_pwd, 'data.txt'), 'r')
```
This is because your tuner is not executed in the directory of your tuner (i.e., `pwd` is not the directory of your own tuner).
**3. Configure your customized tuner in experiment yaml config file**
NNI needs to locate your customized tuner class and instantiate the class, so you need to specify the location of the customized tuner class and pass literal values as parameters to the \_\_init__ constructor.
```yaml
tuner:
codeDir: /home/abc/mytuner
classFileName: my_customized_tuner.py
className: CustomizedTuner
# Any parameter need to pass to your tuner class __init__ constructor
# can be specified in this optional classArgs field, for example
classArgs:
arg1: value1
```
More detail example you could see:
> * [evolution-tuner](../src/sdk/pynni/nni/evolution_tuner)
> * [hyperopt-tuner](../src/sdk/pynni/nni/hyperopt_tuner)
> * [evolution-based-customized-tuner](../examples/tuners/ga_customer_tuner)
### Write a more advanced automl algorithm
The methods above are usually enough to write a general tuner. However, users may also want more methods, for example, intermediate results, trials' state (e.g., the methods in assessor), in order to have a more powerful automl algorithm. Therefore, we have another concept called `advisor` which directly inherits from `MsgDispatcherBase` in [`src/sdk/pynni/nni/msg_dispatcher_base.py`](../src/sdk/pynni/nni/msg_dispatcher_base.py). Please refer to [here](./howto_3_CustomizedAdvisor.md) for how to write a customized advisor.
\ No newline at end of file
######################
Examples
######################
.. toctree::
:maxdepth: 2
MNIST<mnist_examples>
Cifar10<cifar10_examples>
Scikit-learn<sklearn_examples>
EvolutionSQuAD<SQuAD_evolution_examples>
GBDT<gbdt_example>
......@@ -3,6 +3,12 @@
A config file is needed when create an experiment, the path of the config file is provide to nnictl.
The config file is written in yaml format, and need to be written correctly.
This document describes the rule to write config file, and will provide some examples and templates.
- [Template](#Template) (the templates of an config file)
- [Configuration spec](#Configuration) (the configuration specification of every attribute in config file)
- [Examples](#Examples) (the examples of config file)
<a name="Template"></a>
## Template
* __light weight(without Annotation and Assessor)__
......@@ -112,8 +118,8 @@ machineList:
username:
passwd:
```
## Configuration
<a name="Configuration"></a>
## Configuration spec
* __authorName__
* Description
......@@ -131,12 +137,14 @@ machineList:
__trialConcurrency__ specifies the max num of trial jobs run simultaneously.
Note: if trialGpuNum is bigger than the free gpu numbers, and the trial jobs running simultaneously can not reach trialConcurrency number, some trial jobs will be put into a queue to wait for gpu allocation.
Note: if trialGpuNum is bigger than the free gpu numbers, and the trial jobs running simultaneously can not reach trialConcurrency number, some trial jobs will be put into a queue to wait for gpu allocation.
* __maxExecDuration__
* Description
__maxExecDuration__ specifies the max duration time of an experiment.The unit of the time is {__s__, __m__, __h__, __d__}, which means {_seconds_, _minutes_, _hours_, _days_}.
Note: The maxExecDuration spec set the time of an experiment, not a trial job. If the experiment reach the max duration time, the experiment will not stop, but could not submit new trial jobs any more.
* __maxTrialNum__
* Description
......@@ -437,7 +445,7 @@ machineList:
<a name="Examples"></a>
## Examples
* __local mode__
......
# FAQ
This page is for frequent asked questions and answers.
......
**Get Started with NNI**
===
# Get Started with NNI
## **Installation**
......
Grid Search on nni
===
## 1. Introduction
Grid Search performs an exhaustive searching through a manually specified subset of the hyperparameter space defined in the searchspace file.
Note that the only acceptable types of search space are `choice`, `quniform`, `qloguniform` since only these three types of search space can be exhausted.
Moreover, in GridSearch Tuner, for users' convenience, the definition of `quniform` and `qloguniform` change, where q here specifies the number of values that will be sampled. Details about them are listed as follows:
* Type 'quniform' will receive three values [low, high, q], where [low, high] specifies a range and 'q' specifies the number of values that will be sampled evenly. Note that q should be at least 2. It will be sampled in a way that the first sampled value is 'low', and each of the following values is (high-low)/q larger that the value in front of it.
* Type 'qloguniform' behaves like 'quniform' except that it will first change the range to [log(low), log(high)] and sample and then change the sampled value back.
## 2. Usage
Since Grid Search Tuner will exhaust all possible hyper-parameter combination according to the search space file without any hyper-parameter for tuner itself, all you need to do is to specify tuner name in your experiment's yaml config file:
```
tuner:
builtinTunerName: GridSearch
```
**Installation of NNI**
===
# Installation of NNI
Currently we only support installation on Linux & Mac.
## **Installation**
* __Dependencies__
```bash
python >= 3.5
git
wget
```
python pip should also be correctly installed. You could use "python3 -m pip -v" to check pip version.
* __Install NNI through pip__
Prerequisite: `python >= 3.5`
```bash
python3 -m pip install --user --upgrade nni
python3 -m pip install --upgrade nni
```
* __Install NNI through source code__
Prerequisite: `python >=3.5, git, wget`
```bash
git clone -b v0.5 https://github.com/Microsoft/nni.git
git clone -b v0.5.1 https://github.com/Microsoft/nni.git
cd nni
source install.sh
./install.sh
```
* __Install NNI in docker image__
......@@ -66,5 +58,7 @@ Below are the minimum system requirements for NNI on macOS. Due to potential pro
* [Define search space](SearchSpaceSpec.md)
* [Config an experiment](ExperimentConfig.md)
* [How to run an experiment on local (with multiple GPUs)?](tutorial_1_CR_exp_local_api.md)
* [How to run an experiment on multiple machines?](tutorial_2_RemoteMachineMode.md)
* [How to run an experiment on multiple machines?](RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](PAIMode.md)
* [How to run an experiment on Kubernetes through Kubeflow?](KubeflowMode.md)
* [How to run an experiment on Kubernetes through FrameworkController?](FrameworkControllerMode.md)
\ No newline at end of file
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
\ No newline at end of file
.. role:: raw-html-m2r(raw)
:format: html
nnictl
======
Introduction
------------
**nnictl** is a command line tool, which can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc.
Commands
--------
nnictl support commands:
* `nnictl create <#create>`_
* `nnictl resume <#resume>`_
* `nnictl stop <#stop>`_
* `nnictl update <#update>`_
* `nnictl trial <#trial>`_
* `nnictl top <#top>`_
* `nnictl experiment <#experiment>`_
* `nnictl config <#config>`_
* `nnictl log <#log>`_
* `nnictl webui <#webui>`_
* `nnictl tensorboard <#tensorboard>`_
* `nnictl package <#package>`_
Manage an experiment
^^^^^^^^^^^^^^^^^^^^
:raw-html-m2r:`<a name="create"></a>`
*
**nnictl create**
*
Description
You can use this command to create a new experiment, using the configuration specified in config file.
After this command is successfully done, the context will be set as this experiment,
which means the following command you issued is associated with this experiment,
unless you explicitly changes the context(not supported yet).
*
Usage
.. code-block:: bash
nnictl create [OPTIONS]
*
Options:
+-------------------+-----------+-----------+-------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+=====================================+
| --config, -c | True | |yaml configure file of the experiment|
+-------------------+-----------+-----------+-------------------------------------+
| --port, -p | False | |the port of restful server |
+-------------------+-----------+-----------+-------------------------------------+
:raw-html-m2r:`<a name="resume"></a>`
*
**nnictl resume**
*
Description
You can use this command to resume a stopped experiment.
*
Usage
.. code-block:: bash
nnictl resume [OPTIONS]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |The id of the experiment you want to resume |
+-------------------+-----------+-----------+-----------------------------------------------+
| --port, -p | False | |Rest port of the experiment you want to resume |
+-------------------+-----------+-----------+-----------------------------------------------+
:raw-html-m2r:`<a name="stop"></a>`
*
**nnictl stop**
*
Description
You can use this command to stop a running experiment or multiple experiments.
*
Usage
.. code-block:: bash
nnictl stop [id]
*
Detail
1.If there is an id specified, and the id matches the running experiment, nnictl will stop the corresponding experiment, or will print error message.
2.If there is no id specified, and there is an experiment running, stop the running experiment, or print error message.
3.If the id ends with *, nnictl will stop all experiments whose ids matchs the regular.
4.If the id does not exist but match the prefix of an experiment id, nnictl will stop the matched experiment.
5.If the id does not exist but match multiple prefix of the experiment ids, nnictl will give id information.
6.Users could use 'nnictl stop all' to stop all experiments
:raw-html-m2r:`<a name="update"></a>`
*
**nnictl update**
*
**nnictl update searchspace**
*
Description
You can use this command to update an experiment's search space.
*
Usage
.. code-block:: bash
nnictl update searchspace [OPTIONS]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --filename, -f | True | |the file storing your new search space |
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl update concurrency**
*
Description
You can use this command to update an experiment's concurrency.
*
Usage
.. code-block:: bash
nnictl update concurrency [OPTIONS]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --value, -v | True | |the number of allowed concurrent trials |
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl update duration**
*
Description
You can use this command to update an experiment's concurrency.
*
Usage
.. code-block:: bash
nnictl update duration [OPTIONS]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --value, -v | True | |the experiment duration will be NUMBER seconds.|
| | | |SUFFIX may be 's' for seconds (the default), |
| | | |'m' for minutes, 'h' for hours or 'd' for days.|
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl update trialnum**
*
Description
You can use this command to update an experiment's maxtrialnum.
*
Usage
.. code-block:: bash
nnictl update trialnum [OPTIONS]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --value, -v | True | |the new number of maxtrialnum you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
:raw-html-m2r:`<a name="trial"></a>`
*
**nnictl trial**
*
**nnictl trial ls**
*
Description
You can use this command to show trial's information.
*
Usage
.. code-block:: bash
nnictl trial ls
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl trial kill**
*
Description
You can use this command to kill a trial job.
*
Usage
.. code-block:: bash
nnictl trial kill [OPTIONS]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --trialid, -t | True | |ID of the trial you want to kill. |
+-------------------+-----------+-----------+-----------------------------------------------+
:raw-html-m2r:`<a name="top"></a>`
*
**nnictl top**
*
Description
Monitor all of running experiments.
*
Usage
.. code-block:: bash
nnictl top
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --time, -t | False | |The interval to update the experiment status, |
| | | |the unit of time is second, |
| | | |and the default value is 3 second. |
+-------------------+-----------+-----------+-----------------------------------------------+
:raw-html-m2r:`<a name="experiment"></a>`
Manage experiment information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
*
**nnictl experiment show**
*
Description
Show the information of experiment.
*
Usage
.. code-block:: bash
nnictl experiment show
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl experiment status**
*
Description
Show the status of experiment.
*
Usage
.. code-block:: bash
nnictl experiment status
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to check |
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl experiment list**
*
Description
Show the information of all the (running) experiments.
*
Usage
.. code-block:: bash
nnictl experiment list
:raw-html-m2r:`<a name="config"></a>`
*
**nnictl config show**
*
Description
Display the current context information.
*
Usage
.. code-block:: bash
nnictl config show
:raw-html-m2r:`<a name="log"></a>`
Manage log
^^^^^^^^^^
*
**nnictl log stdout**
*
Description
Show the stdout log content.
*
Usage
.. code-block:: bash
nnictl log stdout [options]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --head, -h | False | |show head lines of stdout |
+-------------------+-----------+-----------+-----------------------------------------------+
| --tail, -t | False | |show tail lines of stdout |
+-------------------+-----------+-----------+-----------------------------------------------+
| --path, -p | False | |show the path of stdout file |
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl log stderr**
*
Description
Show the stderr log content.
*
Usage
.. code-block:: bash
nnictl log stderr [options]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --head, -h | False | |show head lines of stderr |
+-------------------+-----------+-----------+-----------------------------------------------+
| --tail, -t | False | |show tail lines of stderr |
+-------------------+-----------+-----------+-----------------------------------------------+
| --path, -p | False | |show the path of stderr file |
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl log trial**
*
Description
Show trial log path.
*
Usage
:raw-html-m2r:`<a name="webui"></a>`
Manage webui
^^^^^^^^^^^^
*
**nnictl webui url**
:raw-html-m2r:`<a name="tensorboard"></a>`
Manage tensorboard
^^^^^^^^^^^^^^^^^^
*
**nnictl tensorboard start**
*
Description
Start the tensorboard process.
*
Usage
.. code-block:: bash
nnictl tensorboard start
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment you want to set |
+-------------------+-----------+-----------+-----------------------------------------------+
| --trialid | False | |ID of the trial |
+-------------------+-----------+-----------+-----------------------------------------------+
| --port | False | 6006 |The port of the tensorboard process |
+-------------------+-----------+-----------+-----------------------------------------------+
*
Detail
#. NNICTL support tensorboard function in local and remote platform for the moment, other platforms will be supported later.
#. If you want to use tensorboard, you need to write your tensorboard log data to environment variable [NNI_OUTPUT_DIR] path.
#. In local mode, nnictl will set --logdir=[NNI_OUTPUT_DIR] directly and start a tensorboard process.
#. In remote mode, nnictl will create a ssh client to copy log data from remote machine to local temp directory firstly, and then start a tensorboard process in your local machine. You need to notice that nnictl only copy the log data one time when you use the command, if you want to see the later result of tensorboard, you should execute nnictl tensorboard command again.
#. If there is only one trial job, you don't need to set trialid. If there are multiple trial jobs running, you should set the trialid, or you could use [nnictl tensorboard start --trialid all] to map --logdir to all trial log paths.
*
**nnictl tensorboard stop**
*
Description
Stop all of the tensorboard process.
*
Usage
.. code-block:: bash
nnictl tensorboard stop
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| id | False | |ID of the experiment |
+-------------------+-----------+-----------+-----------------------------------------------+
:raw-html-m2r:`<a name="package"></a>`
Manage package
^^^^^^^^^^^^^^
*
**nnictl package install**
*
Description
Install the packages needed in nni experiments.
*
Usage
.. code-block:: bash
nnictl package install [OPTIONS]
*
Options:
+-------------------+-----------+-----------+-----------------------------------------------+
| Name, shorthand | Required | Default | Description |
+===================+===========+===========+===============================================+
| --name | True | |The name of package to be installed |
+-------------------+-----------+-----------+-----------------------------------------------+
*
**nnictl package show**
*
Description
.. code-block:: bash
List the packages supported.
*
Usage
.. code-block:: bash
nnictl package show
......@@ -8,43 +8,42 @@ __nnictl__ is a command line tool, which can be used to control experiments, suc
## Commands
nnictl support commands:
- [nnictl create](#create)
- [nnictl resume](#resume)
- [nnictl stop](#stop)
- [nnictl update](#update)
- [nnictl trial](#trial)
- [nnictl top](#top)
- [nnictl experiment](#experiment)
- [nnictl config](#config)
- [nnictl log](#log)
- [nnictl webui](#webui)
- [nnictl tensorboard](#tensorboard)
- [nnictl package](#package)
```bash
nnictl create
nnictl stop
nnictl update
nnictl resume
nnictl trial
nnictl experiment
nnictl config
nnictl log
nnictl webui
nnictl tensorboard
nnictl top
nnictl --version
```
### Manage an experiment
* __nnictl create__
* Description
You can use this command to create a new experiment, using the configuration specified in config file. After this command is successfully done, the context will be set as this experiment, which means the following command you issued is associated with this experiment, unless you explicitly changes the context(not supported yet).
### Manage an experiment
<a name="create"></a>
* __nnictl create__
* Description
You can use this command to create a new experiment, using the configuration specified in config file.
After this command is successfully done, the context will be set as this experiment,
which means the following command you issued is associated with this experiment,
unless you explicitly changes the context(not supported yet).
* Usage
```bash
nnictl create [OPTIONS]
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| --config, -c| True| |yaml configure file of the experiment|
| --port, -p | False| |the port of restful server|
| --debug, -d | False| |Set log level to debug|
nnictl create [OPTIONS]
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| --config, -c| True| |yaml configure file of the experiment|
| --port, -p | False| |the port of restful server|
<a name="resume"></a>
* __nnictl resume__
* Description
......@@ -63,8 +62,10 @@ nnictl --version
| ------ | ------ | ------ |------ |
| id| False| |The id of the experiment you want to resume|
| --port, -p| False| |Rest port of the experiment you want to resume|
| --debug, -d | False| |Set log level to debug|
<a name="stop"></a>
* __nnictl stop__
* Description
......@@ -77,92 +78,82 @@ nnictl --version
```
* Detail
1. If there is an id specified, and the id matches the running experiment, nnictl will stop the corresponding experiment, or will print error message.
2. If there is no id specified, and there is an experiment running, stop the running experiment, or print error message.
3. If the id ends with *, nnictl will stop all experiments whose ids matchs the regular.
4. If the id does not exist but match the prefix of an experiment id, nnictl will stop the matched experiment.
5. If the id does not exist but match multiple prefix of the experiment ids, nnictl will give id information.
6. Users could use 'nnictl stop all' to stop all experiments
1.If there is an id specified, and the id matches the running experiment, nnictl will stop the corresponding experiment, or will print error message.
2.If there is no id specified, and there is an experiment running, stop the running experiment, or print error message.
3.If the id ends with *, nnictl will stop all experiments whose ids matchs the regular.
4.If the id does not exist but match the prefix of an experiment id, nnictl will stop the matched experiment.
5.If the id does not exist but match multiple prefix of the experiment ids, nnictl will give id information.
6.Users could use 'nnictl stop all' to stop all experiments
<a name="update"></a>
* __nnictl update__
* __nnictl update searchspace__
* Description
You can use this command to update an experiment's search space.
* Usage
```bash
nnictl update searchspace [OPTIONS]
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --filename, -f| True| |the file storing your new search space|
* __nnictl update concurrency__
* Description
You can use this command to update an experiment's concurrency.
* Usage
```bash
nnictl update concurrency [OPTIONS]
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --value, -v| True| |the number of allowed concurrent trials|
* __nnictl update duration__
* Description
You can use this command to update an experiment's duration.
* Usage
```bash
nnictl update duration [OPTIONS]
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --value, -v| True| |the experiment duration will be NUMBER seconds. SUFFIX may be 's' for seconds (the default), 'm' for minutes, 'h' for hours or 'd' for days.|
* __nnictl update trialnum__
* Description
You can use this command to update an experiment's maxtrialnum.
* Usage
```bash
nnictl update trialnum [OPTIONS]
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --value, -v| True| |the new number of maxtrialnum you want to set|
* __nnictl update searchspace__
* Description
You can use this command to update an experiment's search space.
* Usage
nnictl update searchspace [OPTIONS]
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --filename, -f| True| |the file storing your new search space|
* __nnictl update concurrency__
* Description
You can use this command to update an experiment's concurrency.
* Usage
nnictl update concurrency [OPTIONS]
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --value, -v| True| |the number of allowed concurrent trials|
* __nnictl update duration__
* Description
You can use this command to update an experiment's concurrency.
* Usage
nnictl update duration [OPTIONS]
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --value, -v| True| |the experiment duration will be NUMBER seconds. SUFFIX may be 's' for seconds (the default), 'm' for minutes, 'h' for hours or 'd' for days.|
* __nnictl update trialnum__
* Description
You can use this command to update an experiment's maxtrialnum.
* Usage
nnictl update trialnum [OPTIONS]
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --value, -v| True| |the new number of maxtrialnum you want to set|
<a name="trial"></a>
* __nnictl trial__
* __nnictl trial ls__
......@@ -184,41 +175,39 @@ nnictl --version
| id| False| |ID of the experiment you want to set|
* __nnictl trial kill__
* Description
You can use this command to kill a trial job.
* Usage
```bash
nnictl trial kill [OPTIONS]
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --trialid, -t| True| |ID of the trial you want to kill.|
* Description
You can use this command to kill a trial job.
* Usage
* __nnictl top__
nnictl trial kill [OPTIONS]
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --trialid, -t| True| |ID of the trial you want to kill.|
<a name="top"></a>
* __nnictl top__
* Description
Monitor all of running experiments.
* Usage
```bash
nnictl top
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --time, -t| False| |The interval to update the experiment status, the unit of time is second, and the default value is 3 second.|
Monitor all of running experiments.
* Usage
nnictl top
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
| --time, -t| False| |The interval to update the experiment status, the unit of time is second, and the default value is 3 second.|
<a name="experiment"></a>
### Manage experiment information
* __nnictl experiment show__
......@@ -268,23 +257,18 @@ nnictl --version
nnictl experiment list
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| all| False| False|Show all of experiments, including stopped experiments.|
<a name="config"></a>
* __nnictl config show__
* Description
Display the current context information.
* Usage
```bash
nnictl config show
```
* Description
Display the current context information.
* Usage
nnictl config show
<a name="log"></a>
### Manage log
* __nnictl log stdout__
......@@ -335,36 +319,12 @@ nnictl --version
* Usage
```bash
nnictl log trial [options]
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |the id of trial|
<a name="webui"></a>
### Manage webui
* __nnictl webui url__
* Description
Show the urls of the experiment.
* Usage
```bash
nnictl webui url
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
<a name="tensorboard"></a>
### Manage tensorboard
* __nnictl tensorboard start__
......@@ -396,32 +356,42 @@ nnictl --version
5. If there is only one trial job, you don't need to set trialid. If there are multiple trial jobs running, you should set the trialid, or you could use [nnictl tensorboard start --trialid all] to map --logdir to all trial log paths.
* __nnictl tensorboard stop__
* Description
Stop all of the tensorboard process.
* Usage
```bash
nnictl tensorboard stop
```
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
### Check nni version
* __nnictl --version__
* Description
Describe the current version of nni installed.
* Usage
```bash
nnictl --version
```
* Description
Stop all of the tensorboard process.
* Usage
nnictl tensorboard stop
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| id| False| |ID of the experiment you want to set|
<a name="package"></a>
### Manage package
* __nnictl package install__
* Description
Install the packages needed in nni experiments.
* Usage
nnictl package install [OPTIONS]
Options:
| Name, shorthand | Required|Default | Description |
| ------ | ------ | ------ |------ |
| --name| True| |The name of package to be installed|
* __nnictl package show__
* Description
List the packages supported.
* Usage
nnictl package show
# NNI Overview
# Overview
NNI (Neural Network Intelligence) is a toolkit to help users run automated machine learning experiments. For each experiment, user only need to define a search space and update a few lines of code, and then leverage NNI build-in algorithms and training services to search the best hyper parameters and/or neural architecture.
NNI (Neural Network Intelligence) is a toolkit to help users design and tune machine learning models (e.g., hyperparameters), neural network architectures, or complex system's parameters, in an efficient and automatic way. NNI has several appealing properties: easy-to-use, scalability, flexibility, and efficiency.
>Step 1: [Define search space](SearchSpaceSpec.md)
* **Easy-to-use**: NNI can be easily installed through python pip. Only several lines need to be added to your code in order to use NNI's power. You can use both commandline tool and WebUI to work with your experiments.
* **Scalability**: Tuning hyperparameters or neural architecture often demands large amount of computation resource, while NNI is designed to fully leverage different computation resources, such as remote machines, training platforms (e.g., PAI, Kubernetes). Thousands of trials could run in parallel by depending on the capacity of your configured training platforms.
* **Flexibility**: Besides rich built-in algorithms, NNI allows users to customize various hyperparameter tuning algorithms, neural architecture search algorithms, early stopping algorithms, etc. Users could also extend NNI with more training platforms, such as virtual machines, kubernetes service on the cloud. Moreover, NNI can connect to external environments to tune special applications/models on them.
* **Efficiency**: We are intensively working on more efficient model tuning from both system level and algorithm level. For example, leveraging early feedback to speedup tuning procedure.
>Step 2: [Update model codes](howto_1_WriteTrial.md)
The figure below shows high-level architecture of NNI.
>Step 3: [Define Experiment](ExperimentConfig.md)
<p align="center">
<img src="https://user-images.githubusercontent.com/23273522/51816536-ed055580-2301-11e9-8ad8-605a79ee1b9a.png" alt="drawing" width="700"/>
</p>
## Key Concepts
* *Experiment*: An experiment is one task of, for example, finding out the best hyperparameters of a model, finding out the best neural network architecture. It consists of trials and AutoML algorithms.
<p align="center">
<img src="./img/3_steps.jpg" alt="drawing"/>
</p>
* *Search Space*: It means the feasible region for tuning the model. For example, the value range of each hyperparameters.
After user submits the experiment through a command line tool [nnictl](../tools/README.md), a demon process (NNI manager) take care of search process. NNI manager continuously get search settings that generated by tuning algorithms, then NNI manager asks the training service component to dispatch and run trial jobs in a targeted training environment (e.g. local machine, remote servers and cloud). The results of trials jobs such as model accurate will send back to tuning algorithms for generating more meaningful search settings. NNI manager stops the search process after it find the best models.
* *Configuration*: A configuration is an instance from the search space, that is, each hyperparameter has a specific value.
## Architecture Overview
<p align="center">
<img src="./img/nni_arch_overview.png" alt="drawing"/>
</p>
* *Trial*: Trial is an individual attempt at applying a new configuration (e.g., a set of hyperparameter values, a specific nerual architecture). Trial code should be able to run with the provided configuration.
User can use the nnictl and/or a visualized Web UI nniboard to monitor and debug a given experiment.
* *Tuner*: Tuner is an AutoML algorithm, which generates a new configuration for the next try. A new trial will run with this configuration.
NNI provides a set of examples in the package to get you familiar with the above process.
* *Assessor*: Assessor analyzes trial's intermediate results (e.g., periodically evaluated accuracy on test dataset) to tell whether this trial can be early stopped or not.
## Key Concepts
* *Training Platform*: It means where trials are executed. Depending on your experiment's configuration, it could be your local machine, or remote servers, or large-scale training platform (e.g., PAI, Kubernetes).
**Experiment** in NNI is a method for testing different assumptions (hypotheses) by Trials under conditions constructed and controlled by NNI. During the experiment, one or more conditions are allowed to change in an organized manner and effects of these changes on associated conditions.
Basically, an experiment runs as follows: Tuner receives search space and generates configurations. These configurations will be submitted to training platforms, such as local machine, remote machines, or training clusters. Their performances are reported back to Tuner. Then, new configurations are generated and submitted.
### **Trial**
**Trial** in NNI is an individual attempt at applying a set of parameters on a model.
For each experiment, user only needs to define a search space and update a few lines of code, and then leverage NNI built-in Tuner/Assessor and training platforms to search the best hyperparameters and/or neural architecture. There are basically 3 steps:
### **Tuner**
**Tuner** in NNI is an implementation of Tuner API for a special tuning algorithm. [Read more about the Tuners supported in the latest NNI release](HowToChooseTuner.md)
>Step 1: [Define search space](SearchSpaceSpec.md)
>Step 2: [Update model codes](Trials.md)
>Step 3: [Define Experiment](ExperimentConfig.md)
### **Assessor**
**Assessor** in NNI is an implementation of Assessor API for optimizing the execution of experiment.
<p align="center">
<img src="https://user-images.githubusercontent.com/23273522/51816627-5d13db80-2302-11e9-8f3e-627e260203d5.jpg" alt="drawing"/>
</p>
More details about how to run an experiment, please refer to [Get Started](QuickStart.md).
## Learn More
* [Get started](GetStarted.md)
* [Install NNI](Installation.md)
* [Use command line tool nnictl](NNICTLDOC.md)
* [Use NNIBoard](WebUI.md)
* [Use annotation](howto_1_WriteTrial.md#nni-python-annotation)
### **Tutorials**
* [How to run an experiment on local (with multiple GPUs)?](tutorial_1_CR_exp_local_api.md)
* [Get started](QuickStart.md)
* [How to adapt your trial code on NNI?](Trials.md)
* [What are tuners supported by NNI?](Builtin_Tuner.md)
* [How to customize your own tuner?](Customize_Tuner.md)
* [What are assessors supported by NNI?](Builtin_Assessors.md)
* [How to customize your own assessor?](Customize_Assessor.md)
* [How to run an experiment on local?](tutorial_1_CR_exp_local_api.md)
* [How to run an experiment on multiple machines?](tutorial_2_RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](PAIMode.md)
* [Examples](mnist_examples.md)
[How to do trouble shooting when using NNI?]: <> ()
\ No newline at end of file
# QuickStart
## Installation
We support Linux and MacOS in current stage, Ubuntu 16.04 or higher and MacOS 10.14.1 are tested and supported. Simply run the following `pip install` in an environment that has `python >= 3.5`.
```bash
python3 -m pip install --upgrade nni
```
Note:
* `--user` can be added if you want to install NNI in your home directory, which does not require any special privileges.
* If there is any error like `Segmentation fault`, please refer to [FAQ](FAQ.md)
* For the `system requirements` of NNI, please refer to [Install NNI](Installation.md)
## "Hello World" example on MNIST
NNI is a toolkit to help users run automated machine learning experiments. It can automatically do the cyclic process of getting hyperparameters, running trials, testing results, tuning hyperparameters. Now, we show how to use NNI to help you find the optimal hyperparameters.
Here is an example script to train a CNN on MNIST dataset **without NNI**:
```python
def run_trial(params):
# Input data
mnist = input_data.read_data_sets(params['data_dir'], one_hot=True)
# Build MNIST network
mnist_network = MnistNetwork(channel_1_num=params['channel_1_num'], channel_2_num=params['channel_2_num'], conv_size=params['conv_size'], hidden_size=params['hidden_size'], pool_size=params['pool_size'], learning_rate=params['learning_rate'])
mnist_network.build_network()
test_acc = 0.0
with tf.Session() as sess:
# Train MNIST network
mnist_network.train(sess, mnist)
# Evaluate MNIST network
test_acc = mnist_network.evaluate(mnist)
if __name__ == '__main__':
params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64, 'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
run_trial(params)
```
Note: If you want to see the full implementation, please refer to [examples/trials/mnist/mnist_before.py](../examples/trials/mnist/mnist_before.py)
The above code can only try one set of parameters at a time, if we want to tune learning rate, we need to manually modify the hyperparameter and start the trial again and again.
NNI is born for helping user do the tuning jobs, the NNI working process is presented below:
```
input: search space, trial code, config file
output: one optimal hyperparameter configuration
1: For t = 0, 1, 2, ..., maxTrialNum,
2: hyperparameter = chose a set of parameter from search space
3: final result = run_trial_and_evaluate(hyperparameter)
4: report final result to NNI
5: If reach the upper limit time,
6: Stop the experiment
7: return hyperparameter value with best final result
```
If you want to use NNI to automatically train your model and find the optimal hyper-parameters, you need to do three changes base on your code:
**Three things required to do when using NNI**
**Step 1**: Give a `Search Space` file in json, includes the `name` and the `distribution` (discrete valued or continuous valued) of all the hyperparameters you need to search.
```diff
- params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64,
- 'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
+ {
+ "dropout_rate":{"_type":"uniform","_value":[0.5, 0.9]},
+ "conv_size":{"_type":"choice","_value":[2,3,5,7]},
+ "hidden_size":{"_type":"choice","_value":[124, 512, 1024]},
+ "batch_size": {"_type":"choice", "_value": [1, 4, 8, 16, 32]},
+ "learning_rate":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]}
+ }
```
*Implemented code directory: [search_space.json](../examples/trials/mnist/search_space.json)*
**Step 2**: Modified your `Trial` file to get the hyperparameter set from NNI and report the final result to NNI.
```diff
+ import nni
def run_trial(params):
mnist = input_data.read_data_sets(params['data_dir'], one_hot=True)
mnist_network = MnistNetwork(channel_1_num=params['channel_1_num'], channel_2_num=params['channel_2_num'], conv_size=params['conv_size'], hidden_size=params['hidden_size'], pool_size=params['pool_size'], learning_rate=params['learning_rate'])
mnist_network.build_network()
with tf.Session() as sess:
mnist_network.train(sess, mnist)
test_acc = mnist_network.evaluate(mnist)
+ nni.report_final_result(acc)
if __name__ == '__main__':
- params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64,
- 'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
+ params = nni.get_next_parameter()
run_trial(params)
```
*Implemented code directory: [mnist.py](../examples/trials/mnist/mnist.py)*
**Step 3**: Define a `config` file in yaml, which declare the `path` to search space and trial, also give `other information` such as tuning algorithm, max trial number and max runtime arguments.
```yaml
authorName: default
experimentName: example_mnist
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 10
trainingServicePlatform: local
# The path to Search Space
searchSpacePath: search_space.json
useAnnotation: false
tuner:
builtinTunerName: TPE
# The path and the running command of trial
trial:
command: python3 mnist.py
codeDir: .
gpuNum: 0
```
*Implemented code directory: [config.yml](../examples/trials/mnist/config.yml)*
All the codes above are already prepared and stored in [examples/trials/mnist/](../examples/trials/mnist).
When these things are done, **run the config.yml file from your command line to start the experiment**.
```bash
nnictl create --config nni/examples/trials/mnist/config.yml
```
Note: **nnictl** is a command line tool, which can be used to control experiments, such as start/stop/resume an experiment, start/stop NNIBoard, etc. Click [here](NNICTLDOC.md) for more usage of `nnictl`
Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. And this is what we expected to get:
```
INFO: Starting restful server...
INFO: Successfully started Restful server!
INFO: Setting local config...
INFO: Successfully set local config!
INFO: Starting experiment...
INFO: Successfully started experiment!
-----------------------------------------------------------------------
The experiment id is egchD4qy
The Web UI urls are: [Your IP]:8080
-----------------------------------------------------------------------
You can use these commands to get more information about the experiment
-----------------------------------------------------------------------
commands description
1. nnictl experiment show show the information of experiments
2. nnictl trial ls list all of trial jobs
3. nnictl top monitor the status of running experiments
4. nnictl log stderr show stderr log content
5. nnictl log stdout show stdout log content
6. nnictl stop stop an experiment
7. nnictl trial kill kill a trial job by id
8. nnictl --help get help information about nnictl
-----------------------------------------------------------------------
```
If you prepare `trial`, `search space` and `config` according to the above steps and successfully create a NNI job, NNI will automatically tune the optimal hyper-parameters and run different hyper-parameters sets for each trial according to the requirements you set. You can clearly sees its progress by NNI WebUI.
## WebUI
After you start your experiment in NNI successfully, you can find a message in the command-line interface to tell you `Web UI url` like this:
```
The Web UI urls are: [Your IP]:8080
```
Open the `Web UI url`(In this information is: `[Your IP]:8080`) in your browser, you can view detail information of the experiment and all the submitted trial jobs as shown below.
### View summary page
Click the tab "Overview".
Information about this experiment will be shown in the WebUI, including the experiment trial profile and search space message. NNI also support `download these information and parameters` through the **Download** button. You can download the experiment result anytime in the middle for the running or at the end of the execution, etc.
![](./img/QuickStart1.png)
Top 10 trials will be listed in the Overview page, you can browse all the trials in "Trials Detail" page.
![](./img/QuickStart2.png)
### View trials detail page
Click the tab "Default Metric" to see the point graph of all trials. Hover to see its specific default metric and search space message.
![](./img/QuickStart3.png)
Click the tab "Hyper Parameter" to see the parallel graph.
* You can select the percentage to see top trials.
* Choose two axis to swap its positions
![](./img/QuickStart4.png)
Click the tab "Trial Duration" to see the bar graph.
![](./img/QuickStart5.png)
Below is the status of the all trials. Specifically:
* Trial detail: trial's id, trial's duration, start time, end time, status, accuracy and search space file.
* If you run a pai experiment, you can also see the hdfsLogPath.
* Kill: you can kill a job that status is running.
* Support to search for a specific trial.
![](./img/QuickStart6.png)
* Intermediate Result Grap
![](./img/QuickStart7.png)
## Related Topic
* [Try different Tuners](Builtin_Tuner.md)
* [Try different Assessors](Builtin_Assessors.md)
* [How to use command line tool nnictl](NNICTLDOC.md)
* [How to write a trial](Trials.md)
* [How to run an experiment on local (with multiple GPUs)?](tutorial_1_CR_exp_local_api.md)
* [How to run an experiment on multiple machines?](RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](PAIMode.md)
* [How to run an experiment on Kubernetes through Kubeflow?](KubeflowMode.md)
* [How to run an experiment on Kubernetes through FrameworkController?](FrameworkControllerMode.md)
\ No newline at end of file
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment