- 20 Jun, 2019 1 commit
demianzhang authored
* fix local and remote training services tslint
-
- 19 Jun, 2019 1 commit
Hongarc authored
-
- 25 Dec, 2018 1 commit
SparkSnail authored
Add a FrameworkController training service based on the Kubeflow training service. Refactor the code structure: add a Kubernetes training service as the parent class, and make the Kubeflow and FrameworkController training services its child classes.
-
- 13 Dec, 2018 1 commit
fishyds authored
[Kubeflow training service] Use the Kubernetes API server to replace the kubectl dependency
-
- 07 Dec, 2018 1 commit
SparkSnail authored
1. Support pytorch-operator
2. Remove unsupported operator
-
- 05 Dec, 2018 1 commit
fishyds authored
* Remove unused kubernetesServer config entry in config file and schema validation
-
- 28 Nov, 2018 1 commit
SparkSnail authored
Support AKS in the Kubeflow training service. Support setting nniManagerIp through nnictl.
-
- 23 Nov, 2018 1 commit
SparkSnail authored
Add nniManagerIp to nnictl, the PAI TrainingService, and the Kubeflow TrainingService. If users set nniManagerIp, PAI and Kubeflow use this IP instead of calling the getIPV4() function. The Web UI also uses this nniManagerIp.
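The fallback described in this commit can be sketched roughly as follows. This is a minimal Python illustration and not the project's actual code: `resolve_manager_ip` and `detect_ipv4` are hypothetical names, and `detect_ipv4` is only an assumed stand-in for the getIPV4() function mentioned above.

```python
import socket
from typing import Callable, Optional


def detect_ipv4() -> str:
    """Hypothetical stand-in for getIPV4(): guess the local outbound IPv4 address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # Connecting a UDP socket sends no packets; it only selects a local interface.
            sock.connect(("10.255.255.255", 1))
            return sock.getsockname()[0]
        except OSError:
            # No usable route: fall back to loopback.
            return "127.0.0.1"


def resolve_manager_ip(
    nni_manager_ip: Optional[str],
    detect: Callable[[], str] = detect_ipv4,
) -> str:
    """Prefer the user-configured nniManagerIp; otherwise auto-detect."""
    if nni_manager_ip:
        return nni_manager_ip
    return detect()
```

An explicit setting takes priority because auto-detection can pick an interface that remote trial jobs cannot reach.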
-
- 22 Nov, 2018 1 commit
fishyds authored
[Kubeflow training service] Update kubeflow exp job config schema to support distributed training (#387)
* Support distributed training on tf-operator, for worker and ps
* Update validation rule for kubeflow config
* Small code refactor adjustment for private methods
* Use different output folder for ps and worker
-
- 20 Nov, 2018 1 commit
fishyds authored
* Kubeflow TrainingService support, v1 (#373)
1. Create a new training service, the Kubeflow training service, which uses 'kubectl' and the Kubeflow tfjobs CRD to submit and manage jobs
2. Update the NNI Python SDK to support the new Kubeflow platform
3. Update the NNI Python SDK's get_sequence_id() implementation to read the NNI_TRIAL_SEQ_ID environment variable instead of the .nni/sequence_id file
4. This version only supports the TensorFlow operator; support for more operators will be added in future versions
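As a rough illustration of item 3, reading the sequence ID from the environment could look like the sketch below. It is not the SDK's actual implementation: the `default` parameter and the behavior when the variable is unset are assumptions made for the example.

```python
import os


def get_sequence_id(default: int = 0) -> int:
    """Return the trial sequence ID from the NNI_TRIAL_SEQ_ID environment variable.

    The `default` fallback is an assumption for this sketch; the commit only
    states that the variable replaces the .nni/sequence_id file.
    """
    raw = os.environ.get("NNI_TRIAL_SEQ_ID")
    if raw is None:
        return default
    return int(raw)
```

Reading an environment variable avoids depending on a file in the trial's working directory, which may not exist on a remote platform.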
-