1. 30 Oct, 2020 1 commit
  2. 29 Oct, 2020 1 commit
  3. 23 Oct, 2020 1 commit
  4. 21 Oct, 2020 1 commit
  5. 20 Oct, 2020 1 commit
  6. 17 Oct, 2020 1 commit
  7. 16 Oct, 2020 2 commits
  8. 14 Oct, 2020 1 commit
  9. 08 Oct, 2020 2 commits
    • [feat] moe: initial implementation of MOELayer (#128) · 22ff665d
      msbaines authored
      Currently implemented only for a single process and a single expert.
    • [test] Add unittest for checkpoint & DDP (#126) · 6658be22
      Min Xu authored
      * Add unittest for checkpoint & DDP
      
      - this change adds test cases that reproduce the error with checkpoint & DDP
      - mandeep mentioned that there is also a deadlock in this case, but this
        change doesn't cover that
      - we cover cases where weight sharing is OK
      - however, checkpointing the same module multiple times and using
        find_unused_parameters are both not OK
      
      * Added norm checks
  10. 06 Oct, 2020 1 commit
    • [feat] OSS/SDP : bucketing (#122) · 341d8b2b
      Benjamin Lefaudeux authored
      Same bucketing strategy for OSS and SDP:
      sort everything ahead of time, per rank and per size, smallest tensors first. Bucket the smallest tensors in a fixed buffer and send it asynchronously, then send all the other tensors asynchronously, then return to the bucket. Once that is done, scatter its contents if needed.
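The planning half of the strategy described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the commit: `plan_buckets`, the `(name, numel, rank)` tuples, and `bucket_capacity` are hypothetical names, tensors are represented only by their element counts, and the asynchronous sends and the final scatter are left out.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a tensor: (name, number of elements, destination rank).
Param = Tuple[str, int, int]


def plan_buckets(params: List[Param], bucket_capacity: int) -> Dict[int, Dict[str, List[str]]]:
    """For each rank, decide which tensors share the fixed buffer and which go alone."""
    # Group ahead of time by destination rank.
    by_rank: Dict[int, List[Tuple[int, str]]] = {}
    for name, numel, rank in params:
        by_rank.setdefault(rank, []).append((numel, name))

    plan: Dict[int, Dict[str, List[str]]] = {}
    for rank, items in by_rank.items():
        items.sort()  # smallest tensors first (ties broken by name)
        bucketed: List[str] = []
        direct: List[str] = []
        used = 0
        for numel, name in items:
            if used + numel <= bucket_capacity:
                # Small enough to share the fixed-size buffer.
                bucketed.append(name)
                used += numel
            else:
                # Too large for the remaining space: would be sent on its own.
                direct.append(name)
        plan[rank] = {"bucketed": bucketed, "direct": direct}
    return plan


params = [("w1", 4, 0), ("b1", 2, 0), ("w2", 100, 0), ("b2", 1, 1), ("w3", 50, 1)]
plan = plan_buckets(params, bucket_capacity=8)
```

With a capacity of 8 elements, rank 0 packs `b1` and `w1` into the bucket and sends `w2` on its own. In a real implementation the bucket and the direct tensors would each be handed to an asynchronous collective, and the bucket contents copied back into the individual tensors afterwards.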
      341d8b2b
  11. 05 Oct, 2020 1 commit
  12. 02 Oct, 2020 1 commit
  13. 29 Sep, 2020 1 commit
  14. 17 Sep, 2020 2 commits
    • Multi-process pipe (#90) · 63f7796a
      Tom Birch authored
      Adds support for distributing pipeline stages across multiple processes (and therefore multiple machines).
      * Adds a style argument to the Pipe constructor, defaulting to PipelineStyle.SingleProcess, but also supporting PipelineStyle.MultiProcess
      * Adds support for lazy construction of modules (see lazy_construction for an example)
      * Adds two implementations of inter-process communication: one based on rpc with globally visible queues, one based on send/recv
      * Copies all the relevant tests from tests/pipe to tests/pipe_process and modifies them to exercise PipelineStyle.MultiProcess
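One way to picture how a style argument and lazy construction could fit together is the toy sketch below. It is an assumption-laden illustration, not the Pipe class from this commit: `SketchPipe`, `stage_factory`, and the `rank` parameter are hypothetical, the enum only mirrors the two member names mentioned above, and stages are plain callables rather than modules.

```python
from enum import Enum, auto
from typing import Callable, Dict, List


class PipelineStyle(Enum):
    # Mirrors the two member names from the commit message; the values are arbitrary.
    SingleProcess = auto()
    MultiProcess = auto()


Stage = Callable[[int], int]


class SketchPipe:
    """Hypothetical stand-in: takes stage factories so construction can be deferred."""

    def __init__(
        self,
        factories: List[Callable[[], Stage]],
        style: PipelineStyle = PipelineStyle.SingleProcess,
        rank: int = 0,
    ) -> None:
        self.style = style
        if style is PipelineStyle.MultiProcess:
            # Assumption: each process builds only the stage it owns.
            self.stages: Dict[int, Stage] = {rank: factories[rank]()}
        else:
            # A single process builds every stage.
            self.stages = {i: make() for i, make in enumerate(factories)}


built: List[int] = []  # records which stages were actually constructed


def stage_factory(index: int) -> Callable[[], Stage]:
    def make() -> Stage:
        built.append(index)
        return lambda x: x + index
    return make


factories = [stage_factory(i) for i in range(3)]

local = SketchPipe(factories)  # default style
built_single = list(built)

built.clear()
remote = SketchPipe(factories, style=PipelineStyle.MultiProcess, rank=1)
built_multi = list(built)
```

Passing factories instead of already-built stages means that in the multi-process case only the stage for the local rank is ever instantiated, which is the benefit lazy construction is usually meant to provide.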
    • [feat] Sharded DDP - small refactor and new features (#97) · 49a198c9
      Benjamin Lefaudeux authored
      - rename oss_ddp to ShardedDataParallel
      - some refactoring
      - ShardedDataParallel owns the sharded optimizer, exposed if need be
      - some small perf bumps
  15. 28 Aug, 2020 1 commit
  16. 06 Aug, 2020 1 commit
  17. 31 Jul, 2020 1 commit
  18. 08 Jul, 2020 1 commit