  1. 02 Jan, 2021 1 commit
  2. 30 Dec, 2020 1 commit
  3. 29 Dec, 2020 2 commits
  4. 28 Dec, 2020 1 commit
  5. 22 Dec, 2020 1 commit
    • [OSS] Balance the trainable params only (#262) · c386e937
      Benjamin Lefaudeux authored
      * fix, one liner
      
      * adjust so that frozen trunks still get spread, even though this should have little consequence
      
      * removing dead code, hopeful unit test fix
      
      * now with some linting..
      
      * adding a proper unit test case
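The idea in this commit message (balance shards on trainable parameters only, while still spreading frozen ones) can be illustrated with a minimal sketch. This is not the code from the commit: the `partition_params` helper, the tuple representation of a parameter, and the nominal weight of 1 given to frozen parameters are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical stand-in for a parameter: (name, number of elements, requires_grad)
Param = Tuple[str, int, bool]


def partition_params(params: List[Param], world_size: int) -> List[List[str]]:
    """Greedily assign each parameter to the currently lightest rank."""
    shards: List[List[str]] = [[] for _ in range(world_size)]
    sizes = [0] * world_size
    for name, numel, requires_grad in params:
        rank = sizes.index(min(sizes))  # first rank with the smallest load so far
        shards[rank].append(name)
        # Only trainable parameters count at full size. Frozen ones get a
        # nominal weight of 1 (an assumption of this sketch) so that a frozen
        # trunk is still spread across ranks instead of piling up on one.
        sizes[rank] += numel if requires_grad else 1
    return shards
```

With two frozen trunk tensors and two small trainable head tensors on two ranks, each rank ends up with one trunk tensor and one head tensor, even though the trunk tensors are far larger.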
  6. 19 Dec, 2020 1 commit
  7. 16 Dec, 2020 1 commit
    • [feat]: AdaScale work with lr_scheduler and tests, examples (#229) · d65cd838
      Min Xu authored
      * [doc]: AdaScale example and notes
      
      * formatted notes correctly as suggested by Benjamin
      
      * added feature and unit test to make sure lr_scheduler works
      
      * update the example with lr_scheduler
      
      * fixed doc with "make html"
      
      * addressed Mike's suggestions
  8. 14 Dec, 2020 1 commit
  9. 10 Dec, 2020 1 commit
  10. 06 Dec, 2020 1 commit
  11. 04 Dec, 2020 1 commit
  12. 03 Dec, 2020 1 commit
    • [feat] AdaScale: Gradient Accumulation and Add PyTest unit tests (#202) · ce5860ea
      Min Xu authored
      * added AdaScale to README
      
      * [adascale] added gradient accumulation
      
      - added gradient accumulation
      - tested full CIFAR training runs with different accumulation values
      and verified that full accuracy is obtained
      - also removed the patch optimize flag until we need it
      
      * [adascale] adding pytest
      
      - added basic and ddp tests and grad_accum
      - closes #195
      
      * added changelog
      
      * added ddp grad_accum test
      
      * moved ddp and non-ddp tests into separate files
      
      * added checkpoint test
      
      * more doc
      
      * addressed Mike's comments
  13. 01 Dec, 2020 2 commits
  14. 21 Nov, 2020 1 commit
    • [feat] ShardedDataParallel with autoreduce (#157) · ad933b34
      Benjamin Lefaudeux authored
      * rewrite using autograd and Variable execution queue to make the reduce automatic
      * share buckets with OSS to remove duplication
      * some speed is likely still on the table, since speed vs. bucketing does not match expectations; this could be a follow-up
  15. 18 Nov, 2020 1 commit
  16. 16 Nov, 2020 1 commit
  17. 11 Nov, 2020 2 commits
  18. 10 Nov, 2020 1 commit
    • Single-process control via PipeRPCWrapper (#156) · 5d4f50fb
      Tom Birch authored
      Adds support for:
      * Reused layers (e.g. for weight sharing)
      * Lazily-constructed layers
      * Single-process control via PipeRPCWrapper
      * PipelineStyle.AsyncSchedule, which lays the foundation for asynchronous pipeline work by introducing an event loop for each rank/worker to process either activations or gradients as they arrive
      
      Also added examples for multi-process and PipeRPCWrapper
  19. 06 Nov, 2020 1 commit
  20. 30 Oct, 2020 1 commit
  21. 29 Oct, 2020 1 commit
  22. 28 Oct, 2020 1 commit
  23. 23 Oct, 2020 1 commit
  24. 21 Oct, 2020 1 commit
  25. 20 Oct, 2020 1 commit
  26. 17 Oct, 2020 1 commit
  27. 16 Oct, 2020 2 commits
  28. 14 Oct, 2020 2 commits
  29. 08 Oct, 2020 3 commits
  30. 06 Oct, 2020 1 commit
    • [feat] OSS/SDP : bucketing (#122) · 341d8b2b
      Benjamin Lefaudeux authored
      Same bucketing strategy for OSS and SDP:
      Sort everything ahead of time, per rank and per size, smaller tensors first. Bucket the smallest elements in a fixed buffer and send it asynchronously, then send all the others asynchronously, and come back to the bucket. Once that is done, scatter the contents if needed.
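The planning half of the bucketing strategy described above can be sketched in plain Python. This is only an illustration under stated assumptions: `plan_buckets`, the tuple representation of a tensor, and the `buffer_size` parameter are hypothetical names, and the actual asynchronous sends and the final scatter are left out.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical stand-in for a tensor: (name, number of elements, destination rank)
TensorInfo = Tuple[str, int, int]


def plan_buckets(tensors: List[TensorInfo], buffer_size: int) -> Dict[int, Dict[str, List[str]]]:
    """Decide, per rank, which tensors go in the fixed buffer and which are sent directly."""
    per_rank: Dict[int, List[TensorInfo]] = defaultdict(list)
    for info in tensors:
        per_rank[info[2]].append(info)

    plan: Dict[int, Dict[str, List[str]]] = {}
    for rank in sorted(per_rank):
        # Sort ahead of time, smaller tensors first.
        ordered = sorted(per_rank[rank], key=lambda info: info[1])
        bucketed: List[str] = []
        direct: List[str] = []
        used = 0
        for name, numel, _ in ordered:
            if used + numel <= buffer_size:
                bucketed.append(name)  # fits in the fixed-size buffer
                used += numel
            else:
                direct.append(name)  # too large: would be sent on its own
        plan[rank] = {"bucketed": bucketed, "direct": direct}
    return plan
```

For example, with `buffer_size=8`, tensors of 2 and 4 elements bound for rank 0 share the bucket, while a 100-element tensor for the same rank is planned as a direct send.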
  31. 05 Oct, 2020 1 commit
  32. 02 Oct, 2020 1 commit
  33. 29 Sep, 2020 1 commit