1. 20 Jun, 2019 1 commit
  2. 19 Jun, 2019 1 commit
  3. 25 Dec, 2018 1 commit
      support frameworkcontroller training service (#484) · 36dbc0fe
      SparkSnail authored
      Add a frameworkcontroller training service based on the kubeflow training service.
      Refactor the code structure: add a kubernetes training service as the parent class, with the kubeflow and frameworkcontroller training services as child classes.
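The parent/child structure described in this commit can be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: the class names, the `crd_kind` and `describe_job` methods, and the job naming scheme are all hypothetical.

```python
from abc import ABC, abstractmethod


class KubernetesTrainingService(ABC):
    """Hypothetical parent class holding logic shared by Kubernetes-based services."""

    def __init__(self, namespace: str = "default") -> None:
        self.namespace = namespace

    @abstractmethod
    def crd_kind(self) -> str:
        """Return the custom resource kind this service submits."""

    def describe_job(self, trial_id: str) -> dict:
        # Shared behaviour: every child builds the same job skeleton
        # and only the resource kind differs.
        return {
            "kind": self.crd_kind(),
            "namespace": self.namespace,
            "name": f"nni-trial-{trial_id}",  # hypothetical naming scheme
        }


class KubeflowTrainingService(KubernetesTrainingService):
    def crd_kind(self) -> str:
        return "TFJob"


class FrameworkControllerTrainingService(KubernetesTrainingService):
    def crd_kind(self) -> str:
        return "Framework"
```

Keeping the shared job handling in one parent class means a further Kubernetes-based service would only need to supply its own resource-specific parts.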
  4. 13 Dec, 2018 1 commit
  5. 07 Dec, 2018 1 commit
  6. 05 Dec, 2018 1 commit
  7. 28 Nov, 2018 1 commit
  8. 23 Nov, 2018 1 commit
      Add nniManagerIp in nnictl and trainingService (#393) · c2a4ce6c
      SparkSnail authored
      Add nniManagerIp to nnictl, the pai TrainingService, and the kubeflow TrainingService.
      If users set nniManagerIp, pai and kubeflow use this IP instead of calling the getIPV4() function.
      The Web UI also uses this nniManagerIp.
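The fallback behaviour described in this commit might look something like the sketch below. It is written in Python for illustration only: `resolve_nni_manager_ip` and `detect_ipv4` are hypothetical names, and `detect_ipv4` is merely a stand-in for the getIPV4() function mentioned in the commit message, whose real implementation is not shown here.

```python
import socket


def detect_ipv4() -> str:
    """Best-effort local IPv4 lookup (hypothetical stand-in for getIPV4())."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        # Assumption: fall back to loopback if the hostname cannot be resolved.
        return "127.0.0.1"


def resolve_nni_manager_ip(config: dict) -> str:
    """Prefer the user-supplied nniManagerIp; otherwise auto-detect."""
    configured = config.get("nniManagerIp")
    return configured if configured else detect_ipv4()
```

An explicit setting is useful when the machine has several network interfaces and automatic detection could pick an address that the remote trial jobs cannot reach.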
  9. 22 Nov, 2018 1 commit
      [Kubeflow training service] Update kubeflow exp job config schema to support distributed training (#387) · e341df81
      fishyds authored
      
      * Support distributed training on tf-operator, for worker and ps
      
      * Update validation rule for kubeflow config
      
      * Small refactoring of private methods
      
      * Use different output folder for ps and worker
  10. 20 Nov, 2018 1 commit
      [Kubeflow Training Service] V1, merge from kubeflow branch to master branch (#382) · 806afeb6
      fishyds authored
      * Kubeflow TrainingService support, v1 (#373)
      
      1. Create a new training service, the kubeflow training service, which uses 'kubectl' and the kubeflow tfjobs CRD to submit and manage jobs
      2. Update the nni python SDK to support the new kubeflow platform
      3. Update the nni python SDK's get_sequence_id() implementation to read the NNI_TRIAL_SEQ_ID env variable instead of the .nni/sequence_id file
      4. This version supports only the Tensorflow operator; support for more operators will be added in future versions
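Item 3 of this commit, reading the trial sequence ID from an environment variable rather than a file, could be sketched as below. This is a minimal illustration, not the SDK's actual code: the function name `read_trial_sequence_id` and the `default` fallback are assumptions, and only the NNI_TRIAL_SEQ_ID variable name comes from the commit message.

```python
import os


def read_trial_sequence_id(default: int = 0) -> int:
    """Return the trial sequence ID taken from the NNI_TRIAL_SEQ_ID env variable."""
    raw = os.environ.get("NNI_TRIAL_SEQ_ID")
    # Assumption: fall back to a default when the variable is unset,
    # for example when the trial code runs outside of an experiment.
    return int(raw) if raw is not None else default
```

An environment variable can be set by whatever launches the trial process, so this approach does not depend on a file being present in the trial's working directory.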