1. 05 Mar, 2021 2 commits
  2. 04 Mar, 2021 2 commits
  3. 01 Mar, 2021 1 commit
      [chores]: make CI more efficient and update py39 env a bit (#447) · 5eb6b8c7
      Min Xu authored
      * [chores]: CI py39 on GPU and more efficiency
      
      * add test list files
      
      * fix
      
      * add test list files
      
      * split benchmark run into 2 runs
      
      * fix 1.8 version and balance benchmarks
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * recording tests
      
      * py39 install fix
      
      * test again
      
      * move tests
      
      * reorg tests
      
      * skip tests for torch 1.8 due to an upstream bug
      
      * removed __init__.py from tests since it confuses pytest
      
      * Revert "removed __init__.py from tests since it confuses pytest"
      
      This reverts commit 7e156ba33dfaa5ed052031780613ec0cb57a45b0.
      
      * don't include __init__ in file list
      
      * notes on __init__.py and added missing ones
      
      * fixed mypy in a test file
      
      * balance test runtime
      
      * better pip install
      
      * balance more
      
      * pip fix
      
      * balance
      
      * balance more, all tests should finish within 20m now
      
      * minor license update
      
      * trying cu102
      
      * more doc and addressed Ben's comments
      
      * debugging
      
      * debugging...
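The "test list files" and "balance" steps above amount to splitting test files across CI jobs so that each job finishes in roughly the same time. A minimal sketch of one way to do that, assuming hypothetical file names and runtimes and a greedy longest-first assignment (the commit does not say how the lists were actually balanced, and they may have been tuned by hand):

```python
import heapq
from typing import Dict, List, Tuple


def balance_tests(runtimes: Dict[str, float], num_jobs: int) -> List[List[str]]:
    """Greedy longest-first split of test files across CI jobs (illustrative)."""
    # Heap of (total_runtime_so_far, job_index); the lightest job is popped first.
    heap: List[Tuple[float, int]] = [(0.0, i) for i in range(num_jobs)]
    heapq.heapify(heap)
    jobs: List[List[str]] = [[] for _ in range(num_jobs)]
    # Place the slowest files first so the large items spread across jobs.
    for name, cost in sorted(runtimes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        jobs[idx].append(name)
        heapq.heappush(heap, (total + cost, idx))
    return jobs


# Hypothetical file names and runtimes in minutes, for illustration only.
example = {
    "test_a.py": 12.0,
    "test_b.py": 9.0,
    "test_c.py": 7.0,
    "test_d.py": 4.0,
    "test_e.py": 3.0,
}
lists = balance_tests(example, num_jobs=2)
totals = [sum(example[name] for name in job) for job in lists]
```

Each inner list could then be written to its own test list file and passed to a separate CI job.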
  4. 26 Feb, 2021 1 commit
  5. 04 Feb, 2021 1 commit
  6. 03 Feb, 2021 2 commits
  7. 29 Jan, 2021 1 commit
      [test]: test with py39 + torch 1.8 nightly (#339) · e348806b
      Min Xu authored
      * [test]: test with py39 + torch 1.8 nightly
      
      * version fix
      
      * more fix
      
      * fix version function for nightly version
      
      * fix torch_pg build
      
      * invalidate cache
      
      * separate benchmark requirements
      
      * comment
      
      * fixed mypy
      
      * fixed a test
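The "fix version function for nightly version" step suggests that a version check tripped over nightly version strings. A minimal sketch of a tolerant parser, assuming nightly builds report strings shaped like `1.8.0.dev20210129+cu110` (the exact format and the helper name `parse_version` are assumptions, not taken from the commit):

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Return the leading numeric release components of a version string."""
    # Match the leading dotted numbers and ignore any suffix such as
    # ".dev20210129+cu110" that nightly or local builds may append.
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups() if part is not None)
```

Comparing the resulting tuple, for example `parse_version(v) >= (1, 8, 0)`, avoids the pitfalls of comparing version strings lexically.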
  8. 27 Jan, 2021 1 commit
  9. 25 Jan, 2021 1 commit
      [test] cover python 3.7 to 3.9 on CPU (#303) · 8459634f
      Min Xu authored
      * [test] cover python 3.7 to 3.9 on CPU
      
      - covering common python versions on CPU tests
      - added doc build test
      
      * add doc build test
      
      * skipping failing tests on py39
      
      * catching doc build warnings
      
      * add doc build to py38 and py39
      
      * minor fix
      
      * fix doc build for adascale
      
      * removed dead code
      
      * fix the skipping
      
      * skip unit test for py39
      
      * add failing example
      
      * no more py39 skipping the tests
  10. 16 Jan, 2021 1 commit
  11. 15 Jan, 2021 1 commit
  12. 11 Jan, 2021 1 commit
  13. 05 Jan, 2021 1 commit
      [fix] Flaky tests (#283) · 79365ee6
      Benjamin Lefaudeux authored
      * adding the pytest timeout plugin to properly root out hanging tests
      * removing redundant code, slightly more reasonable timeout, works on single cuda
      * finding the root bug for some of the cpu hangs, rpc init
      * propagating all the rpc init test changes to the pipe and model parallel tests
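The point of a test timeout is to turn a hang into an ordinary failure. In a pytest suite this would normally come from the pytest-timeout plugin (its `--timeout` option or `@pytest.mark.timeout(...)` marker); the commit does not show which was used. The standard-library sketch below only illustrates the idea and is not how the plugin is implemented; `run_with_timeout` is a hypothetical helper:

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_s: float) -> str:
    """Run a Python snippet in a child process; report 'timeout' if it hangs."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout_s,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so no process lingers.
        return "timeout"
    return "passed" if result.returncode == 0 else "failed"
```

Running each test in its own process means a stuck test can be killed without taking the rest of the run down with it.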
  14. 30 Dec, 2020 1 commit
  15. 22 Dec, 2020 1 commit
  16. 30 Nov, 2020 1 commit
  17. 22 Nov, 2020 1 commit
  18. 21 Nov, 2020 1 commit
      [feat] ShardedDataParallel with autoreduce (#157) · ad933b34
      Benjamin Lefaudeux authored
      * rewrite using autograd and Variable execution queue to make the reduce automatic
      * share buckets with OSS to remove duplication
      * some speedup is likely still on the table, since the measured speed vs. bucketing does not match expectations; could be a follow-up
  19. 20 Nov, 2020 1 commit
  20. 19 Nov, 2020 1 commit
  21. 06 Nov, 2020 1 commit
  22. 30 Oct, 2020 1 commit
  23. 29 Oct, 2020 1 commit
  24. 28 Oct, 2020 1 commit
  25. 23 Oct, 2020 1 commit
  26. 22 Oct, 2020 1 commit
  27. 21 Oct, 2020 1 commit
  28. 17 Oct, 2020 1 commit
  29. 16 Oct, 2020 1 commit
  30. 14 Oct, 2020 1 commit
  31. 10 Oct, 2020 1 commit
  32. 09 Oct, 2020 1 commit
  33. 08 Oct, 2020 1 commit
  34. 01 Oct, 2020 1 commit
  35. 24 Sep, 2020 1 commit
  36. 22 Sep, 2020 1 commit
  37. 17 Sep, 2020 1 commit
      Multi-process pipe (#90) · 63f7796a
      Tom Birch authored
      Adds support for distributing pipeline stages across multiple processes (and therefore multiple machines)
      * Adds a style argument to the Pipe constructor, defaulting to PipelineStyle.SingleProcess, but also supporting PipelineStyle.MultiProcess
      * Added support for lazy construction of modules (see lazy_construction for an example)
      * Added two implementations of inter-process communication: one based on rpc with globally visible queues, one based on send/recv
      * Copied all the relevant tests from tests/pipe to tests/pipe_process and modified them to exercise PipelineStyle.MultiProcess
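The style argument and lazy construction described above can be pictured with a small standalone sketch. Only the names `PipelineStyle.SingleProcess` and `PipelineStyle.MultiProcess` come from the commit message; the helper `build_local_stages`, its `balance` and `rank` parameters, and the use of plain callables as layer factories are assumptions for illustration and are not the library's API:

```python
from enum import Enum, auto
from typing import Callable, List, Sequence


class PipelineStyle(Enum):
    # Member names mirror the commit message; everything else is illustrative.
    SingleProcess = auto()
    MultiProcess = auto()


def build_local_stages(
    factories: Sequence[Callable[[], object]],
    balance: Sequence[int],
    rank: int,
    style: PipelineStyle = PipelineStyle.SingleProcess,
) -> List[object]:
    """Construct only the layers this process owns (hypothetical helper)."""
    if sum(balance) != len(factories):
        raise ValueError("balance must account for every layer")
    if style is PipelineStyle.SingleProcess:
        # One process holds the whole pipeline, so build every layer.
        return [make() for make in factories]
    # MultiProcess: rank r owns the r-th contiguous slice described by balance.
    start = sum(balance[:rank])
    return [make() for make in factories[start:start + balance[rank]]]
```

Passing factories instead of built modules means a process never allocates layers that belong to another stage, which is presumably the motivation for lazy construction when stages live on different machines.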