1. 27 Mar, 2019 1 commit
  2. 22 Mar, 2019 1 commit
  3. 15 Mar, 2019 1 commit
      Support version check of nni (#807) · d0b22fc7
      SparkSnail authored
      Check the NNI version in trialkeeper to make sure the trialkeeper version is consistent with the trainingService version.
      Add a debug mode to the config file.
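A version consistency check of the kind this commit describes could be sketched as follows. This is only an illustration: the function names are hypothetical, and the rule that two versions count as consistent when their major and minor components match is an assumption, not something the commit message states.

```python
import re


def _major_minor(version: str) -> tuple:
    """Extract (major, minor) from strings such as 'v0.5.2' or '0.5'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_consistent(trial_keeper_version: str, training_service_version: str) -> bool:
    """Hypothetical check: compare only major.minor (assumed rule)."""
    return _major_minor(trial_keeper_version) == _major_minor(training_service_version)
```

A caller could log a warning or abort the trial when `versions_consistent` returns `False`; which of the two the real code does is not stated in the commit message.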
  4. 29 Dec, 2018 1 commit
  5. 25 Dec, 2018 1 commit
      support frameworkcontroller training service (#484) · 36dbc0fe
      SparkSnail authored
      Add a frameworkcontroller training service based on the kubeflow training service.
      Refactor the code structure: add a kubernetes training service as the parent class, and make the kubeflow training service and the frameworkcontroller training service its child classes.
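The parent/child structure described in this commit can be pictured with a small sketch. The class and method names below are hypothetical and written in Python for illustration only; they are not taken from the NNI code base. The shared job-building logic lives in the parent, and each child supplies only the kind of Kubernetes custom resource it submits.

```python
from abc import ABC, abstractmethod


class KubernetesTrainingService(ABC):
    """Hypothetical parent class holding logic shared by both services."""

    @abstractmethod
    def job_kind(self) -> str:
        """Kind of the custom resource this service submits."""

    def build_job(self, trial_id: str) -> dict:
        # Shared behavior: only the resource kind differs per child class.
        return {"kind": self.job_kind(), "name": f"nni-trial-{trial_id.lower()}"}


class KubeflowTrainingService(KubernetesTrainingService):
    def job_kind(self) -> str:
        return "TFJob"  # kubeflow TensorFlow job resource


class FrameworkControllerTrainingService(KubernetesTrainingService):
    def job_kind(self) -> str:
        return "Framework"  # frameworkcontroller resource
```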
  6. 29 Nov, 2018 1 commit
  7. 23 Nov, 2018 1 commit
      Add nniManagerIp in nnictl and trainingService (#393) · c2a4ce6c
      SparkSnail authored
      Add nniManagerIp to nnictl, the pai TrainingService, and the kubeflow TrainingService.
      If users set nniManagerIp, pai and kubeflow use this IP instead of calling the getIPV4() function.
      The Web UI also uses this nniManagerIp.
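The fallback behavior in this commit (prefer a user-supplied IP, otherwise detect one) might look roughly like the sketch below. Both function names are hypothetical, and `detect_local_ipv4` is only a stand-in for the getIPV4() function named in the commit message, whose actual implementation is not shown here.

```python
import socket


def detect_local_ipv4() -> str:
    """Best-effort local IPv4 lookup (stand-in for getIPV4(), an assumption)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects a route.
        sock.connect(("8.8.8.8", 80))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def resolve_manager_ip(nni_manager_ip=None, detect=detect_local_ipv4) -> str:
    """Use the configured nniManagerIp when set, otherwise auto-detect."""
    return nni_manager_ip if nni_manager_ip else detect()
```

Passing the detector as a parameter keeps the fallback easy to test without touching the network.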
  8. 20 Nov, 2018 1 commit
      [Kubeflow Training Service] V1, merge from kubeflow branch to master branch (#382) · 806afeb6
      fishyds authored
      * Kubeflow TrainingService support, v1 (#373)
      
      1. Create a new Training Service: the kubeflow training service, which uses 'kubectl' and the kubeflow tfjobs CRD to submit and manage jobs
      2. Update the nni python SDK to support the new kubeflow platform
      3. Update the nni python SDK's get_sequence_id() implementation to read the NNI_TRIAL_SEQ_ID env variable instead of reading the .nni/sequence_id file
      4. This version only supports the TensorFlow operator. Support for more operators will be added in future versions
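Item 3 above replaces a file read with an environment variable lookup. A minimal sketch of that idea follows; the function name `read_trial_sequence_id` and the `-1` default for a missing variable are assumptions made for illustration, not the SDK's actual code.

```python
import os


def read_trial_sequence_id(default: int = -1) -> int:
    """Read the trial sequence id from NNI_TRIAL_SEQ_ID.

    The fallback value for an unset variable is an assumption.
    """
    raw = os.environ.get("NNI_TRIAL_SEQ_ID")
    return int(raw) if raw is not None else default
```

Reading from the environment avoids depending on a file inside the trial's working directory, which is convenient when the trial runs in a container started by another system.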
  9. 16 Oct, 2018 1 commit
  10. 27 Sep, 2018 1 commit
      PAI Training Service implementation (#128) · d3506e34
      fishyds authored
      * PAI Training Service implementation
      1. Implement PAITrainingService
      2. Add the trial-keeper Python module, and modify setup.py to install the module
      3. Add a PAITrainingService REST server to collect metrics from the PAI container
  11. 24 Aug, 2018 1 commit
  12. 20 Aug, 2018 1 commit