1. 11 Dec, 2019 1 commit
  2. 10 Dec, 2019 1 commit
  3. 25 Nov, 2019 1 commit
  4. 06 Nov, 2019 1 commit
  5. 28 Oct, 2019 1 commit
  6. 26 Sep, 2019 1 commit
  7. 26 Aug, 2019 1 commit
  8. 14 Aug, 2019 1 commit
  9. 12 Aug, 2019 1 commit
  10. 05 Aug, 2019 1 commit
  11. 01 Aug, 2019 1 commit
  12. 30 Jul, 2019 1 commit
  13. 24 Jun, 2019 1 commit
  14. 20 Jun, 2019 1 commit
  15. 19 Jun, 2019 2 commits
  16. 23 May, 2019 1 commit
  17. 17 Apr, 2019 1 commit
  18. 11 Apr, 2019 2 commits
  19. 27 Mar, 2019 1 commit
  20. 22 Mar, 2019 1 commit
  21. 20 Mar, 2019 1 commit
  22. 15 Mar, 2019 1 commit
    • Support version check of nni (#807) · d0b22fc7
      SparkSnail authored
      Check the nni version in trialkeeper to make sure the version of trialkeeper is consistent with trainingService.
      Add a debug mode to the config file.
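The version check described in this commit could look roughly like the sketch below. This is only an illustration: the helper names `parse_version` and `versions_consistent`, and the rule of comparing only the major and minor numbers, are assumptions and are not taken from the NNI source.

```python
import re


def parse_version(version):
    """Extract (major, minor) from a version string such as 'v0.5.2'.

    Hypothetical helper; the accepted format is an assumption.
    """
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_consistent(trial_keeper_version, training_service_version):
    """Return True when both sides report the same major.minor version."""
    return parse_version(trial_keeper_version) == parse_version(training_service_version)
```

Under these assumptions a caller would log a warning or abort the trial when `versions_consistent(...)` returns `False`.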
  23. 25 Feb, 2019 2 commits
    • Support webhdfs path in python hdfs client (#722) · 8c4c0ef2
      SparkSnail authored
      trial_keeper uses port 50070 to connect to the webhdfs server, and PAI maps port 50070 to port 5070 to reach the RESTful server. This approach is risky because PAI may not support this kind of mapping in later releases. The webhdfs client of trial_keeper now uses the Pylon path (/webhdfs/api/v1) instead of port 50070; the path is passed in from trainingService.
      This PR makes the following changes:
      
      1. Use the webhdfs path instead of port 50070 in the hdfs client.
      2. Switch to a new hdfs package, "PythonWebHDFS", which I built to support Pylon. You can try the new function with the "sparksnail/nni:dev-pai" image to test the pai trainingService.
      3. Rename some variables according to review comments.
    • Fix a race condition bug that does not store Trial Job cancel status correctly (#707) · 9a3a75c8
      fishyds authored
      * Fix a race condition bug that does not store Trial Job cancel status correctly
  24. 25 Jan, 2019 2 commits
    • [PAI bug fixing] Fix the incorrect PAI webhdfs endpoint path (#653) · 6bc12de0
      fishyds authored
      
      * Fix PAI webhdfs api endpoint
    • Refactoring nnimanager log (#652) · 6d591989
      chicm-ms authored
      * Pull code (#22)
      
      * Support distributed job for frameworkcontroller (#612)
      
      support distributed job for frameworkcontroller
      
      * Multiphase doc (#519)
      
      * multiPhase doc
      
      * updates
      
      * updates
      
      * Add time parser for 'nnictl update duration' (#632)
      
      Currently, nnictl update duration only supports seconds; add a parser so that this command supports the units {s, m, h, d}.
      
      * fix experiment state bug (#629)
      
      * update top README.md (#622)
      
      * Update README.md
      
      * update (#634)
      
      * Integration tests refactoring (#625)
      
      * Integration test refactoring (#21) (#616)
      
      * Integration test refactoring (#21)
      
      * Refactoring integration tests
      
      * test metrics
      
      * update azure pipeline
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * update trigger
      
      * Integration test refactoring (#618)
      
      * updates
      
      * updates
      
      * update pipeline (#619)
      
      * update pipeline
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * test pipeline (#623)
      
      * test pipeline
      
      * updates
      
      * updates
      
      * updates
      
      * Update integration test (#624)
      
      * Update integration test
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * Revert "Pull code (#22)"
      
      This reverts commit 62fc165ad7b2ba724eead3b99f010aa34491e2c7.
      
      * Update nnimanager logs
      
      * updates
      
      * Update README.md
      
      * Revert "Update README.md"
      
      This reverts commit bc67061160e5d57305a6e7fb63d491d12d0e9002.
      
      * updates
      
      * updates
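The time parser for 'nnictl update duration' mentioned in the squashed message above (#632) can be sketched as follows. This is a minimal illustration, not the nnictl implementation: the function name `parse_duration`, the choice to return seconds, and treating a bare number as seconds are all assumptions.

```python
# Seconds per supported unit: {s, m, h, d} as listed in the commit message.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a duration such as '90', '30m', '2h' or '1d' to seconds.

    Hypothetical helper; a value without a unit suffix is assumed to be seconds.
    """
    text = text.strip()
    if not text:
        raise ValueError("empty duration")
    if text[-1] in UNIT_SECONDS:
        number, unit = text[:-1], text[-1]
    else:
        number, unit = text, "s"
    if not number.isdigit():
        raise ValueError(f"invalid duration: {text!r}")
    return int(number) * UNIT_SECONDS[unit]
```

For example, `parse_duration("2h")` yields the same number of seconds as `parse_duration("120m")`.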
  25. 04 Jan, 2019 1 commit
  26. 03 Jan, 2019 1 commit
  27. 29 Dec, 2018 1 commit
  28. 19 Dec, 2018 1 commit
  29. 17 Dec, 2018 1 commit
  30. 10 Dec, 2018 1 commit
  31. 07 Dec, 2018 1 commit
  32. 30 Nov, 2018 1 commit
  33. 29 Nov, 2018 1 commit
  34. 28 Nov, 2018 1 commit
  35. 25 Nov, 2018 1 commit
    • Fix trialjobstate (#385) · c4d1aefe
      QuanluZhang authored
      * add one more trial job status, EARLY_STOPPED
      
      * fix datastore/nnimanager/mockeddatastore. test/webui/metrics_reader not done. USER_TO_CANCEL
      
      * fix bug
      
      * modifications based on Deshui's comments
      
      * fix bug
      
      * fix bug in remote mode
  36. 23 Nov, 2018 1 commit
    • Add nniManagerIp in nnictl and trainingService (#393) · c2a4ce6c
      SparkSnail authored
      Add nniManagerIp to nnictl, pai TrainingService and kubeflow TrainingService.
      If users set nniManagerIp, pai and kubeflow will use this IP instead of calling the getIPV4() function.
      The Web UI will also use this nniManagerIp.
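The fallback behaviour described in this commit (use nniManagerIp when the user sets it, otherwise detect the address) might be sketched as below. The names `detect_local_ipv4` and `resolve_manager_ip`, and the dictionary-shaped config, are assumptions for illustration; `detect_local_ipv4` only stands in for the getIPV4() function named in the commit and is not its actual implementation.

```python
import socket


def detect_local_ipv4():
    """Best-effort local IPv4 lookup (hypothetical stand-in for getIPV4())."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packet; it only selects a local interface.
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def resolve_manager_ip(config):
    """Prefer the user-supplied nniManagerIp, otherwise fall back to detection."""
    configured = config.get("nniManagerIp")
    return configured if configured else detect_local_ipv4()
```

With this shape, `resolve_manager_ip({"nniManagerIp": "10.0.0.5"})` returns the configured address unchanged, and an empty config falls back to detection.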