- 03 Jan, 2019 2 commits
-
-
chicm-ms authored
* Add UT code coverage report * updates * updates * updates * updates * updates * updates * integration test python code coverage report
-
QuanluZhang authored
-
- 02 Jan, 2019 1 commit
-
-
fishyds authored
[Logging architecture refactor] Remove unused metrics related code in nni trial_tools, support kubeflow mode for logging architecture refactor (#551)
-
- 29 Dec, 2018 2 commits
- 26 Dec, 2018 2 commits
-
-
Zejun Lin authored
* fix bug * add docs * add ut * add ut * add to ci * update doc * update doc * update ut * add ut to ci * add ut to ci * add ut to ci * add ut to ci * add ut to ci * add ut to ci * add ut to ci * add ut to ci * test * test * test * test * test * test * test * test * test * test * revert * refactor * refactor * s * merge
-
goooxu authored
-
- 25 Dec, 2018 2 commits
-
-
SparkSnail authored
Remote TrainingService: passwd is no longer required when users set sshKeyPath.
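The rule in this commit can be pictured as a small credential check. The sketch below is only an illustration: the helper name `pick_auth_method` and the dictionary layout are hypothetical, and only the field names `sshKeyPath` and `passwd` come from the commit message. The actual training service code may be organized differently.

```python
from typing import Mapping


def pick_auth_method(machine: Mapping[str, str]) -> str:
    """Return "key" or "password" for one remote machine entry.

    Hypothetical helper: only the field names sshKeyPath and passwd
    are taken from the commit message.
    """
    if machine.get("sshKeyPath"):
        # A key path is sufficient on its own; passwd may be absent.
        return "key"
    if machine.get("passwd"):
        return "password"
    raise ValueError("set either sshKeyPath or passwd")
```

With this shape, an entry that carries only `sshKeyPath` passes validation, which is the behavior the commit describes.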
-
SparkSnail authored
Add frameworkcontroller training service based on kubeflow training service. Refactor code structure: add kubernetes training service as the parent class, and make kubeflow training service and frameworkcontroller training service its child classes.
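The parent/child split described in this commit might look roughly like the following. This is a minimal sketch in Python for illustration only; the class names, the `submit_trial` and `build_job_spec` methods, and the job-spec fields are all assumptions and are not taken from the repository.

```python
from abc import ABC, abstractmethod


class KubernetesTrainingService(ABC):
    """Hypothetical parent class holding logic shared by both services."""

    def submit_trial(self, trial_id: str) -> dict:
        # Shared flow: ask the subclass for its job spec, then wrap it.
        return {"trial": trial_id, "job": self.build_job_spec(trial_id)}

    @abstractmethod
    def build_job_spec(self, trial_id: str) -> dict:
        """Return the platform-specific job description."""


class KubeflowTrainingService(KubernetesTrainingService):
    def build_job_spec(self, trial_id: str) -> dict:
        return {"platform": "kubeflow", "name": f"nni-{trial_id}"}


class FrameworkControllerTrainingService(KubernetesTrainingService):
    def build_job_spec(self, trial_id: str) -> dict:
        return {"platform": "frameworkcontroller", "name": f"nni-{trial_id}"}
```

The point of the design is that code common to both Kubernetes-based platforms lives in one place, and each child class supplies only what differs.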
-
- 21 Dec, 2018 6 commits
- 20 Dec, 2018 3 commits
-
-
Zejun Lin authored
* fix-annotation * fix annotation for error printing * update annotation * update annotation * update annotation * update annotation * update unittest for annotation * update unit test * update annotation * update annotation * update annotation * update ut * update ut
-
fishyds authored
* Update nnictl.py Fix the issue that nnictl --version via pip installation doesn't work * Update kubeflow training service document (#494) * Remove kubectl related document, add messages for kubeconfig * Add design section for kubeflow training service * Move the image files for PAI training service doc into img folder. * Update KubeflowMode.md (#498) Update KubeflowMode.md, small terms change * [V0.4.1 bug fix] Cannot run kubeflow training service due to trial_keeper change (#503) * Update kubeflow training service document * fix a bug that kubeflow trial job cannot run * upgrade version number (#499) * [V0.4.1 bug fix] Support reading K8S config from KUBECONFIG environment variable (#507) * Add KUBECONFIG env variable support * In main.ts, throw cached error to make sure nnictl can show the error in stderr
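The KUBECONFIG support mentioned in #507 follows the usual kubectl convention: use the path from the environment variable when it is set, otherwise fall back to `~/.kube/config`. A minimal sketch of that lookup, assuming a hypothetical helper `resolve_kubeconfig_path` and simplifying multi-file KUBECONFIG values to their first entry:

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_kubeconfig_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the kubeconfig file to read (hypothetical helper).

    Simplification: KUBECONFIG may list several files separated by
    os.pathsep; this sketch uses only the first one.
    """
    env = os.environ if env is None else env
    configured = env.get("KUBECONFIG")
    if configured:
        first = configured.split(os.pathsep)[0]
        return Path(first).expanduser()
    # Conventional default location used by kubectl.
    return Path.home() / ".kube" / "config"
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.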
-
ShufanHuang authored
* Add curve fitting assessor * Update HowToChooseTuner.md * Update HowToChooseTuner.md * Update HowToChooseTuner.md * Update README.md * Update README.md * Update README.md * Update HowToChooseTuner.md * Update HowToChooseTuner.md * Update HowToChooseTuner.md * Update HowToChooseTuner.md * Update curvefitting_assessor.py * Update config_schema.py * Add some comments and modifications * Remove unnecessary .json file * Remove unnecessary .lock file * Revert "Remove unnecessary .lock file" This reverts commit cdfaacb29114b3dee9c797d3e9b46ee18d7d34cc. * Revert "Revert "Remove unnecessary .lock file"" This reverts commit 7182a5fb31a02b01684429eabb3347952bf7ce2a. * Revert "Revert "Revert "Remove unnecessary .lock file""" This reverts commit 0f010e2b508e9f7b34c809647ba09e4e132876d8. * Revert "Remove unnecessary .json file" This reverts commit c6f7b47c199dd0db7ccb850d4f2ac1fd97b0caf8. * Revert "Add some comments and modifications" This reverts commit f78f055df9a4eec5b433a9241ce93d8ba78e3500. * Add some modifications by comments * support minimize mode * Update README.md * Update README.md * Update modelfactory.py * minor changes and fix typo * minor changes * update README.md
-
- 19 Dec, 2018 1 commit
-
-
fishyds authored
* Small refactor: remove useless INFO log, and print valid PAI token error message
-
- 18 Dec, 2018 1 commit
-
-
Lijiao authored
* [WebUI] Show version and support download logfile * Update download style
-
- 17 Dec, 2018 4 commits
-
-
Gems Guo authored
-
fishyds authored
* [PAI training service] codeDir files upload improvement * Create full local temp folder * Organize the folder structure for experiment and trial files
-
chicm-ms authored
* Support download log files * updates
-
Zejun Lin authored
* modify status experiment_running to running * fix hyperband doc * fix hyperband doc * fix hyperband doc
-
- 14 Dec, 2018 2 commits
-
-
SparkSnail authored
The Kubernetes REST API does not base64-encode values itself, so the username is now base64-encoded before the secret is created.
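For context, a Kubernetes Secret sent through the REST API must carry base64-encoded values in its `data` field, whereas `kubectl create secret` performs that encoding for the caller. The sketch below builds such a manifest; the function name `build_secret_manifest` and the choice of keys are hypothetical, and no request is actually sent.

```python
import base64
from typing import Mapping


def build_secret_manifest(name: str, values: Mapping[str, str]) -> dict:
    """Build a Secret body whose data values are base64-encoded.

    Hypothetical helper for illustration; it only constructs the
    dictionary and does not talk to a cluster.
    """
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }
```

Decoding any entry of `data` with `base64.b64decode` yields the original bytes, which is a quick way to check the manifest before submitting it.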
-
Lijiao authored
* [WebUI] Close timer * Add edit concurrency btn * fix bug
-
- 13 Dec, 2018 2 commits
-
-
fishyds authored
[Kubeflow training service] Use Kubernetes API server to replace kubectl dependency
-
Lee authored
* Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * PR merge to 0.3 (#297) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Update README.md (#288) added License badge * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * fix doc mistakes and broken links. (#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * revise the installation cmd to v0.2 * revise to install v0.2 * remove enas readme (#292) * Fix datastore performance issue (#301) * Fix nnictl in v0.3 (#299) Fix old version of config file fix sklearn requirements Fix resume log logic * add basic tuner and trial for network morphism * Complete basic receive_trial_result() and generate_parameters(). 
Use onnx as the intermediate representation (but it cannot be converted to a pytorch model) * add tensorflow cifar10 for network morphism * add unit test for tuner and its function * use temporary torch_model * fix request bug and program can communicate nni * add basic pickle support for graph and train successful in pytorch * Update unittest for networkmorphism_tuner * Network Morphism add multi-gpu trial training support * Format code with black tool * change intermediate representation from pickle file to json we defined * successfully pass the unittest for test_graph_json_transform * add README for network morphism and it works fine in both Pytorch and Keras. * separate the original Readme.md in network-morphism into two parts (tuner and trial) * change the openpai image path * beautify the file structure of network_morphism and add a fashion_mnist keras example * prettify the source and add some docstrings for functions in order to pass the pylint. * remove unused module import and add some docstring * add some details for the application scenario Network Morphism Tuner * follow the advice and modify the doc file * add the config file for each task in the examples trial of network morphism * change default python interpreter from python to python3
-
- 10 Dec, 2018 2 commits
-
-
SparkSnail authored
quick fix paiTrainingService, add deferred.resolve();
-
YiChia Huang authored
-
- 07 Dec, 2018 2 commits
-
-
SparkSnail authored
1. Support pytorch-operator 2. Remove unsupported operator
-
SparkSnail authored
Update pai token every 2 hours.
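One way to picture a two-hour token refresh is a small cache that re-fetches once the stored token is older than the limit. This is a sketch under stated assumptions: the class `TokenCache`, the `fetch_token` callable, and the lazy refresh-on-read approach are all hypothetical, and the real service may instead use a timer that refreshes in the background.

```python
import time
from typing import Callable, Optional

TOKEN_MAX_AGE_SECONDS = 2 * 60 * 60  # two hours, as stated in the commit


class TokenCache:
    """Cache a token and re-fetch it once it is two hours old."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token  # hypothetical token request
        self._clock = clock              # injectable for testing
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        now = self._clock()
        expired = now - self._fetched_at >= TOKEN_MAX_AGE_SECONDS
        if self._token is None or expired:
            self._token = self._fetch_token()
            self._fetched_at = now
        return self._token
```

Callers use `cache.get()` wherever a token is needed, so a stale token is never handed out for longer than the configured age.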
-
- 05 Dec, 2018 7 commits
-
-
Yan Ni authored
* add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * add python cache files to gitignore list * move extract scalar reward logic from dispatcher to tuner * update tuner code corresponding to last commit * update doc for receive_trial_result api change * add numpy to package whitelist of pylint * distinguish param value from return reward for tuner.extract_scalar_reward * update pylintrc * add comments to dispatcher.handle_report_metric_data * update install for mac support * fix root mode bug on Makefile * Quick fix bug: nnictl port value error (#245) * fix port bug * Dev exp stop more (#221) * Exp stop refactor (#161) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. * Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * fix setup.py (#115) * Add DAG model configuration format for SQuAD example. * Explain config format for SQuAD QA model. * Add more detailed introduction about the evolution algorithm. 
* Fix install.sh add add trial log path (#109) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * show trial log path * update document * fix install.sh * set default vallue for maxTrialNum and maxExecDuration * fix nnictl * Dev smac (#116) * support package install (#91) * fix nnictl bug * support package install * update * update package install logic * Fix package install issue (#95) * fix nnictl bug * fix pakcage install * support SMAC as a tuner on nni (#81) * update doc * update doc * update doc * update hyperopt installation * update doc * update doc * update description in setup.py * update setup.py * modify encoding * encoding * add encoding * remove pymc3 * update doc * update builtin tuner spec * support smac in sdk, fix logging issue * support smac tuner * add optimize_mode * update config in nnictl * add __init__.py * update smac * update import path * update setup.py: remove entry_point * update rest server validation * fix bug in nnictl launcher * support classArgs: optimize_mode * quick fix bug * test travis * add dependency * add dependency * add dependency * add dependency * create smac python package * fix trivial points * optimize import of tuners, modify nnictl accordingly * fix bug: incorrect algorithm_name * trivial refactor * for debug * support virtual * update doc of SMAC * update smac requirements * update requirements * change debug mode * update doc * update doc * refactor based on comments * fix comments * modify example config path to relative path and increase maxTrialNum (#94) * modify example config path to relative path and increase maxTrialNum * add document * support conda (#90) (#110) * support install from venv and travis CI * support install from venv and travis CI * support install from venv and travis CI * support conda * 
support conda * modify example config path to relative path and increase maxTrialNum * undo messy commit * undo messy commit * Support pip install as root (#77) * Typo on #58 (#122) * PAI Training Service implementation (#128) * PAI Training service implementation **1. Implement PAITrainingService **2. Add trial-keeper python module, and modify setup.py to install the module **3. Add PAItrainingService rest server to collect metrics from PAI container. * fix datastore for multiple final result (#129) * Update NNI v0.2 release notes (#132) Update NNI v0.2 release notes * Update setup.py Makefile and documents (#130) * update makefile and setup.py * update makefile and setup.py * update document * update document * Update Makefile no travis * update doc * update doc * fix convert from ss to pcs (#133) * Fix bugs about webui (#131) * Fix webui bugs * Fix tslint * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo 
based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836. * Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * Merge branch V0.2 to Master (#143) * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * fix bug (#147) * Refactor nnictl and add config_pai.yml (#144) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * Fix trial keeper wrongly exit issue (#152) * Fix trial keeper bug, use actual exitcode to exit rather than 1 * Fix bug of table sort (#145) * Update doc for PAIMode and v0.2 release notes (#153) * Update v0.2 documentation regards to release note and PAI training service * Update document to describe NNI docker image * fix antd (#159) * refactor experiment stopping logic * support change concurrency * remove trialJobs.ts * trivial changes * fix bugs * fix bug * support updating maxTrialNum * Modify IT scripts for supporting multiple experiments * Update ci (#175) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * modify CI cuz of refracting exp stop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * file saving * fix issues from code merge * remove $(INSTALL_PREFIX)/nni/nni_manager before install * fix indent * fix merge issue * socket close * update port * fix merge error * modify ci logic in nnimanager * fix ci * fix bug * change suspended to done * update ci (#229) * update ci * update ci * update ci (#232) * update ci * update ci * update azure-pipelines * update azure-pipelines * update ci (#233) * update ci * update ci * update azure-pipelines * update azure-pipelines * update azure-pipelines * run.py (#238) * Nnupdate ci (#239) * run.py * test ci * Nnupdate ci (#240) * run.py * test ci * test ci * Udci (#241) * run.py * test ci * test ci * test ci * update ci (#242) * run.py * test ci * test ci * test ci * update ci * revert install.sh (#244) * run.py * test ci * test ci * test ci * update ci * revert install.sh * add comments * remove assert * trivial change * trivial change * update Makefile (#246) * update Makefile * update Makefile * quick fix for ci (#248) * add update trialNum and fix bugs (#261) * Add builtin tuner to CI (#247) * update Makefile * update Makefile * add builtin-tuner test * add 
builtin-tuner test * refractor ci * update azure.yml * add built-in tuner test * fix bugs * Doc refactor (#258) * doc refactor * image name refactor * Refactor nnictl to support listing stopped experiments. (#256) Refactor nnictl to support listing stopped experiments. * Show experiment parameters more beautifully (#262) * fix error on example of RemoteMachineMode (#269) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * Update docker file to use latest nni release (#263) * fix bug about execDuration and endTime (#270) * fix bug about execDuration and endTime * modify time interval to 30 seconds * refactor based on Gems's suggestion * for triggering ci * Refactor dockerfile (#264) * refactor Dockerfile * Support nnictl tensorboard (#268) support tensorboard * Sdk update (#272) * Rename get_parameters to get_next_parameter * annotations add get_next_parameter * updates * updates * updates * updates * updates * add experiment log path to experiment profile (#276) * refactor extract reward from dict by tuner * update Makefile for mac support, wait for aka.ms support * refix Makefile for colorful echo * update Makefile with shorturl * fix false fail on mac webui * fix cross os remote tmpdir issue * add readonly to RemoteMachineTrainingService.remoteOS * fix var name for PR 386 * cross platform package * update pypi/makefile for multiple platform support * update linux os spec * udpate doc for installation & pypi * update readme * job timestamp compatibility for mac
-
fishyds authored
* Document update for kubeflow and release notes
-
Lijiao authored
* Update webui document * Fix comments of Chengmin
-
chicm-ms authored
-
fishyds authored
* Remove unused kubernetesServer config entry in config file and schema validation
-
Zejun Lin authored
* modify loguniform and lognormal * fix bug * fix bug * update doc * update doc * fix * update tpe for loguniform * update tpe for loguniform * update for loguniform * update for loguniform * update loguniform and qloguniform * update doc * update * revert * revert * revert * revert * update loguniform for smac * update loguniform for smac * update loguniform for smac * update loguniform for smac
-
xuehui authored
* update readme in ga_squad * fix typo * Update README.md * Update README.md * Update README.md * fix path * update README reference * fix bug in config file about batch tuner
-
- 03 Dec, 2018 1 commit
-
-
QuanluZhang authored
* test * tuners * refactor doc of tuners * update * update assessor doc * update * update
-