1. 27 Apr, 2021 1 commit
  2. 21 Apr, 2021 1 commit
  3. 06 Apr, 2021 1 commit
  4. 05 Apr, 2021 1 commit
  5. 04 Apr, 2021 1 commit
  6. 19 Mar, 2021 1 commit
  7. 17 Mar, 2021 1 commit
  8. 15 Mar, 2021 1 commit
  9. 11 Mar, 2021 2 commits
  10. 09 Mar, 2021 2 commits
  11. 05 Mar, 2021 2 commits
  12. 23 Feb, 2021 1 commit
    • 
      Add FullyShardedDataParallel (FSDP) (#413) · 15512d9e
      Myle Ott authored
      Recent work by [Microsoft](https://arxiv.org/abs/1910.02054) and [Google](https://arxiv.org/abs/2004.13336) has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the new **`FullyShardedDataParallel` (FSDP)** wrapper, which is a drop-in replacement for PyTorch's `DistributedDataParallel` (DDP) wrapper.
      
      Compared to PyTorch DDP:
      * FSDP shards parameters (FP16 + FP32) and optimizer state across data parallel GPUs
      * FSDP with `reshard_after_forward=False` has the same communication cost as PyTorch DDP and is similar to ZeRO-2
      * FSDP with `reshard_after_forward=True` increases total communication by 50% and is similar to ZeRO-3:
          * all-gather parameters at start of forward pass and start of backward pass
          * reduce-scatter grads at end of backward pass
      Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
      Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
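The communication costs listed above can be checked with a back-of-the-envelope sketch. The helper below is illustrative only (not fairscale's API); it counts per-step communication volume in units of parameter count, ignoring world-size correction factors:

```python
def comm_volume(n_params, reshard_after_forward):
    """Approximate per-step communication volume, in parameter counts.

    all-gather of parameters happens once before the forward pass; with
    reshard_after_forward=True the full parameters are freed afterwards,
    so a second all-gather is needed before the backward pass.
    Gradients are reduce-scattered at the end of backward either way.
    """
    all_gather = n_params          # gather full params for forward
    if reshard_after_forward:
        all_gather += n_params     # re-gather params for backward (ZeRO-3-like)
    reduce_scatter = n_params      # shard gradients after backward
    return all_gather + reduce_scatter

n = 10**6
ddp = 2 * n  # DDP's gradient all-reduce moves roughly 2N
print(comm_volume(n, False) == ddp)      # ZeRO-2-like: same cost as DDP
print(comm_volume(n, True) == 3 * n)     # ZeRO-3-like: 50% more than DDP
```

This matches the claim in the commit message: resharding after forward trades a 50% increase in communication for lower peak memory.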
  13. 22 Feb, 2021 1 commit
  14. 19 Feb, 2021 1 commit
  15. 14 Feb, 2021 1 commit
  16. 12 Feb, 2021 3 commits
  17. 08 Feb, 2021 1 commit
  18. 05 Feb, 2021 1 commit
  19. 04 Feb, 2021 1 commit
  20. 02 Feb, 2021 1 commit
  21. 27 Jan, 2021 1 commit
  22. 26 Jan, 2021 1 commit
  23. 21 Jan, 2021 1 commit
  24. 20 Jan, 2021 1 commit
  25. 11 Jan, 2021 2 commits
  26. 08 Jan, 2021 3 commits
  27. 30 Dec, 2020 1 commit
  28. 22 Dec, 2020 1 commit
    • 
      [OSS] Balance the trainable params only (#262) · c386e937
      Benjamin Lefaudeux authored
      * fix, one liner
      
      * adjust so that frozen trunks still get spread, even if this should have little consequence
      
      * removing dead code, hopeful unit test fix
      
      * now with some linting..
      
      * adding a proper unit test case
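The balancing idea behind this commit can be sketched with a minimal greedy partitioner: assign each parameter, largest first, to the rank with the smallest load so far. This is illustrative only; the function name and behavior are assumptions, not fairscale's actual OSS implementation:

```python
def partition(param_sizes, world_size):
    """Greedily split parameters (given by size) across ranks.

    Returns one list of parameter indices per rank, assigning each
    parameter, largest first, to the currently least-loaded rank.
    """
    loads = [0] * world_size
    buckets = [[] for _ in range(world_size)]
    for i, size in sorted(enumerate(param_sizes), key=lambda t: -t[1]):
        rank = loads.index(min(loads))  # least-loaded rank so far
        buckets[rank].append(i)
        loads[rank] += size
    return buckets

# Four parameters of sizes 4, 3, 3, 2 split evenly across two ranks:
print(partition([4, 3, 3, 2], 2))  # e.g. [[0, 3], [1, 2]] -> loads 6 and 6
```

Balancing only the trainable parameters (as the commit title says) would simply mean filtering the frozen ones out of `param_sizes` before partitioning.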
  29. 19 Dec, 2020 1 commit
  30. 17 Dec, 2020 2 commits
  31. 16 Dec, 2020 1 commit