1. 20 Jun, 2019 1 commit
  2. 19 Jun, 2019 1 commit
  3. 27 Mar, 2019 1 commit
  4. 22 Mar, 2019 1 commit
  5. 15 Mar, 2019 1 commit
      Support version check of nni (#807) · d0b22fc7
      SparkSnail authored
      Check the nni version in trialkeeper to make sure the trialkeeper version is consistent with trainingService.
      Add a debug mode to the config file.
      d0b22fc7
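The version check described in this commit can be pictured with a minimal sketch. The commit message does not state the comparison rule, so this assumes that two versions are "consistent" when their major and minor components match. The function names `normalize` and `versions_match` are hypothetical and are not taken from the nni code base.

```python
def normalize(version: str) -> tuple:
    """Reduce a version string such as 'v0.5.2' to its (major, minor) parts."""
    parts = version.strip().lstrip("v").split(".")
    return tuple(parts[:2])


def versions_match(trial_keeper_version: str, training_service_version: str) -> bool:
    """Assumed rule: versions are consistent when major.minor are equal."""
    return normalize(trial_keeper_version) == normalize(training_service_version)
```

Under this assumed rule, `versions_match("0.5.2", "v0.5.1")` is true and `versions_match("0.5", "0.6")` is false. A stricter check would compare the full version strings.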
  6. 25 Feb, 2019 2 commits
      Support webhdfs path in python hdfs client (#722) · 8c4c0ef2
      SparkSnail authored
      trial_keeper used port 50070 to connect to the webhdfs server, and PAI used a mapping method to map port 50070 to port 5070 to visit the restful server. This method carries some risk, because PAI may not support this kind of mapping in later releases. The webhdfs client of trial_keeper now uses the Pylon path (/webhdfs/api/v1) instead of port 50070; the path is transmitted by trainingService.
      This PR makes the following changes:
      
      1. Use the webhdfs path instead of port 50070 in the hdfs client.
      2. Switch to a new hdfs package, "PythonWebHDFS", which I built to support Pylon. You can test the new functionality with the "sparksnail/nni:dev-pai" image when testing the PAI trainingService.
      3. Rename some variables according to review comments.
      8c4c0ef2
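The difference between the two addressing schemes in this commit can be sketched as follows. The port-based form uses the standard WebHDFS REST prefix `/webhdfs/v1`, and the path-based form uses the Pylon path named in the commit message. The helper names are hypothetical and this is not the PythonWebHDFS API. Appending the HDFS path and `?op=...` directly to the Pylon prefix is also an assumption.

```python
def port_based_url(host: str, port: int = 50070) -> str:
    """Old scheme: talk to the WebHDFS server on an explicit port."""
    return f"http://{host}:{port}/webhdfs/v1"


def path_based_url(host: str, webhdfs_path: str = "/webhdfs/api/v1") -> str:
    """New scheme: go through a gateway path, with no port in the URL."""
    return f"http://{host}/" + webhdfs_path.strip("/")


def file_status_url(base_url: str, hdfs_path: str) -> str:
    """Build a GETFILESTATUS request URL (assumes the same suffix layout for both schemes)."""
    return f"{base_url}/{hdfs_path.lstrip('/')}?op=GETFILESTATUS"
```

For example, `file_status_url(path_based_url("10.0.0.1"), "/user/nni")` addresses the file through the gateway path rather than through a mapped port.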
      Fix a race condition bug that does not store Trial Job cancel status correctly (#707) · 9a3a75c8
      fishyds authored
      * Fix a race condition bug that does not store Trial Job cancel status correctly
      9a3a75c8
  7. 17 Dec, 2018 1 commit
  8. 20 Nov, 2018 1 commit
      [Kubeflow Training Service] V1, merge from kubeflow branch to master branch (#382) · 806afeb6
      fishyds authored
      * Kubeflow TrainingService support, v1 (#373)
      
      1. Create a new Training Service: kubeflow training service, which uses 'kubectl' and the kubeflow tfjobs CRD to submit and manage jobs
      2. Update the nni python SDK to support the new kubeflow platform
      3. Update the nni python SDK's get_sequence_id() implementation to read the NNI_TRIAL_SEQ_ID env variable instead of reading the .nni/sequence_id file
      4. This version only supports the Tensorflow operator. Support for more operators will be added in future versions
      806afeb6
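Item 3 above can be illustrated with a short sketch of reading the sequence id from the environment. Only the function name `get_sequence_id()` and the variable name `NNI_TRIAL_SEQ_ID` come from the commit message. Returning an integer and raising an error when the variable is missing are assumptions, and the real SDK may behave differently.

```python
import os


def get_sequence_id() -> int:
    """Read the trial sequence id from the NNI_TRIAL_SEQ_ID environment variable."""
    value = os.environ.get("NNI_TRIAL_SEQ_ID")
    if value is None:
        # Assumed behavior: fail loudly instead of guessing a default id.
        raise RuntimeError("NNI_TRIAL_SEQ_ID is not set")
    return int(value)
```

Reading an environment variable avoids depending on a file under `.nni/`, which may be why it suits a platform where the trial runs in a container.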
  9. 12 Nov, 2018 1 commit
  10. 05 Nov, 2018 1 commit
  11. 02 Nov, 2018 1 commit
  12. 31 Oct, 2018 3 commits
  13. 17 Oct, 2018 1 commit
  14. 12 Oct, 2018 1 commit
      Add api nni.get_sequence_id() (#203) · 1388d763
      chicm-ms authored
      * Pull latest code (#2)
      
      * webui logpath and document (#135)
      
      * Add webui document and logpath as a href
      
      * fix tslint
      
      * fix comments by Chengmin
      
      * Pai training service bug fix and enhancement (#136)
      
      * Add NNI installation scripts
      
      * Update pai script, update NNI_out_dir
      
      * Update NNI dir in nni sdk local.py
      
      * Create .nni folder in nni sdk local.py
      
      * Add check before creating .nni folder
      
      * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT
      
      * Improve annotation (#138)
      
      * Improve annotation
      
      * Minor bugfix
      
      * Selectively install through pip (#139)
      
      Selectively install through pip 
      * update setup.py
      
      * fix paiTrainingService bugs (#137)
      
      * fix nnictl bug
      
      * add hdfs host validation
      
      * fix bugs
      
      * fix dockerfile
      
      * fix install.sh
      
      * update install.sh
      
      * fix dockerfile
      
      * Set timeout for HDFSUtility exists function
      
      * remove unused TODO
      
      * fix sdk
      
      * add optional for outputDir and dataDir
      
      * refactor dockerfile.base
      
      * Remove unused import in hdfsclientUtility
      
      * Add documentation for NNI PAI mode experiment (#141)
      
      * Add documentation for NNI PAI mode
      
      * Fix typo based on PR comments
      
      * Exit with subprocess return code of trial keeper
      
      * Remove additional exit code
      
      * Fix typo based on PR comments
      
      * update doc for smac tuner (#140)
      
      * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)
      
      * Revert "Selectively install through pip (#139)"
      
      This reverts commit 1d174836.
      
      * Add exit code of subprocess for trial_keeper
      
      * Update README, add link to PAImode doc
      
      * fix bug (#147)
      
      * Refactor nnictl and add config_pai.yml (#144)
      
      * fix nnictl bug
      
      * add hdfs host validation
      
      * fix bugs
      
      * fix dockerfile
      
      * fix install.sh
      
      * update install.sh
      
      * fix dockerfile
      
      * Set timeout for HDFSUtility exists function
      
      * remove unused TODO
      
      * fix sdk
      
      * add optional for outputDir and dataDir
      
      * refactor dockerfile.base
      
      * Remove unused import in hdfsclientUtility
      
      * add config_pai.yml
      
      * refactor nnictl create logic and add colorful print
      
      * fix nnictl stop logic
      
      * add annotation for config_pai.yml
      
      * add document for start experiment
      
      * fix config.yml
      
      * fix document
      
      * Fix trial keeper wrongly exit issue (#152)
      
      * Fix trial keeper bug, use actual exitcode to exit rather than 1
      
      * Fix bug of table sort (#145)
      
      * Update doc for PAIMode and v0.2 release notes (#153)
      
      * Update v0.2 documentation regards to release note and PAI training service
      
      * Update document to describe NNI docker image
      
      * Bug fix for SQuAD example tuner. (#134)
      
      * Update Makefile (#151)
      
      * test
      
      * update setup.py
      
      * update Makefile and install.sh
      
      * revert setup.py
      
      * change color
      
      * update doc
      
      * update doc
      
      * fix auto-completion's extra space
      
      * update Makefile
      
      * update webui
      
      * Update doc image (#163)
      
      * update doc
      
      * trivial
      
      * trivial
      
      * trivial
      
      * trivial
      
      * trivial
      
      * trivial
      
      * update image
      
      * update image size
      
      * Update ga squad (#104)
      
      * update readme in ga_squad
      
      * update readme
      
      * fix typo
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * update readme
      
      * sklearn examples (#169)
      
      * fix nnictl bug
      
      * fix install.sh
      
      * add sklearn-regression example
      
      * add sklearn classification
      
      * update sklearn
      
      * update example
      
      * remove additional code
      
      * Update batch tuner (#158)
      
      * update readme in ga_squad
      
      * update readme
      
      * fix typo
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * update readme
      
      * update batch tuner
      
      * Quickly fix cascading search space bug in tuner (#156)
      
      * update readme in ga_squad
      
      * update readme
      
      * fix typo
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * update readme
      
      * quickly fix cascading searchspace bug in tuner
      
      * Add iterative search space example (#119)
      
      * update readme in ga_squad
      
      * update readme
      
      * fix typo
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * update readme
      
      * add iterative search space example
      
      * update
      
      * update readme
      
      * change name
      
      * Add api nni.get_sequence_id()
      
      * Add sequence_id to TrialJobDetail
      1388d763
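One of the fixes folded into this commit, "Fix trial keeper bug, use actual exitcode to exit rather than 1" (#152), can be sketched as below. This is a minimal illustration of passing a child process's exit code back to the caller, not the trial keeper's actual code. The function name `run_trial` is hypothetical.

```python
import subprocess
import sys


def run_trial(command: list) -> int:
    """Run the trial command and return its real exit code."""
    process = subprocess.Popen(command)
    # wait() blocks until the child exits and returns its exit code.
    return process.wait()
```

A wrapper script would then end with `sys.exit(run_trial(command))`, so that the platform sees the trial's own exit code instead of a fixed value of 1.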
  15. 29 Sep, 2018 1 commit
      Merge branch V0.2 to Master (#143) · 2a28a578
      fishyds authored
      * webui logpath and document (#135)
      
      * Add webui document and logpath as a href
      
      * fix tslint
      
      * fix comments by Chengmin
      
      * Pai training service bug fix and enhancement (#136)
      
      * Add NNI installation scripts
      
      * Update pai script, update NNI_out_dir
      
      * Update NNI dir in nni sdk local.py
      
      * Create .nni folder in nni sdk local.py
      
      * Add check before creating .nni folder
      
      * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT
      
      * Improve annotation (#138)
      
      * Improve annotation
      
      * Minor bugfix
      
      * Selectively install through pip (#139)
      
      Selectively install through pip 
      * update setup.py
      
      * fix paiTrainingService bugs (#137)
      
      * fix nnictl bug
      
      * add hdfs host validation
      
      * fix bugs
      
      * fix dockerfile
      
      * fix install.sh
      
      * update install.sh
      
      * fix dockerfile
      
      * Set timeout for HDFSUtility exists function
      
      * remove unused TODO
      
      * fix sdk
      
      * add optional for outputDir and dataDir
      
      * refactor dockerfile.base
      
      * Remove unused import in hdfsclientUtility
      
      * Add documentation for NNI PAI mode experiment (#141)
      
      * Add documentation for NNI PAI mode
      
      * Fix typo based on PR comments
      
      * Exit with subprocess return code of trial keeper
      
      * Remove additional exit code
      
      * Fix typo based on PR comments
      
      * update doc for smac tuner (#140)
      
      * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142)
      
      * Revert "Selectively install through pip (#139)"
      
      This reverts commit 1d174836.
      
      * Add exit code of subprocess for trial_keeper
      
      * Update README, add link to PAImode doc
      2a28a578
  16. 28 Sep, 2018 1 commit
      Pai training service bug fix and enhancement (#136) · 70be7d0f
      fishyds authored
      * Add NNI installation scripts
      
      * Update pai script, update NNI_out_dir
      
      * Update NNI dir in nni sdk local.py
      
      * Create .nni folder in nni sdk local.py
      
      * Add check before creating .nni folder
      
      * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT
      70be7d0f
  17. 27 Sep, 2018 1 commit
      PAI Training Service implementation (#128) · d3506e34
      fishyds authored
      * PAI Training service implementation
      1. Implement PAITrainingService
      2. Add trial-keeper python module, and modify setup.py to install the module
      3. Add PAItrainingService rest server to collect metrics from PAI container.
      d3506e34