1. 03 Feb, 2021 1 commit
    • [refactor] Refactor and enable multiprocess nn.Pipe benchmarks. (#319) · cd186441
      anj-s authored
      
      
      * mp cleanup
      
      * round of multiprocess refactoring
      
      * test golden run
      
      * print cuda stats
      
      * fix lint errors
      
      * enable multiprocess pipe benchmarks
      
      * set world size to be available gpus
      
      * more changes
      
      * use synthetic loaders for intermediate pipeline stages
      
      * merged master
      
      * fix for the devices property
      
      * dataloader fix
      
      * modify rank check
      
      * print wps stats
      
      * enable verification
      
      * fix logging
      
      * fix flag name
      
      * fix flag name
      
      * check for rank
      
      * fix indent
      
      * pass args
      
      * pass args
      
      * modify golden data
      
      * remove unused print message
      
      * fix lint errors
      
      * add comments
      
      * fix benchmarks
      Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
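      Two of the squashed messages above ("set world size to be available gpus" and "enable multiprocess pipe benchmarks") come down to sizing the process group from the visible devices before spawning one worker per GPU. A minimal sketch of that launch pattern, assuming an illustrative run_worker that builds the pipe stage for its rank (not the actual benchmark script):

        import os

        import torch
        import torch.distributed as dist
        import torch.multiprocessing as mp


        def run_worker(rank: int, world_size: int) -> None:
            # Each worker owns one GPU and builds only its own pipeline stage.
            os.environ.setdefault("MASTER_ADDR", "localhost")
            os.environ.setdefault("MASTER_PORT", "29500")
            dist.init_process_group("nccl", rank=rank, world_size=world_size)
            torch.cuda.set_device(rank)
            # ... construct the pipe stage for `rank` and run the benchmark loop ...
            dist.destroy_process_group()


        if __name__ == "__main__":
            # World size follows the number of visible GPUs, as in the change above.
            world_size = torch.cuda.device_count()
            mp.spawn(run_worker, args=(world_size,), nprocs=world_size, join=True)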
  2. 29 Jan, 2021 1 commit
    • [test]: test with py39 + torch 1.8 nightly (#339) · e348806b
      Min Xu authored
      * [test]: test with py39 + torch 1.8 nightly
      
      * version fix
      
      * more fix
      
      * fix version function for nightly version
      
      * fix torch_pg build
      
      * invalidate cache
      
      * separate benchmark requirements
      
      * comment
      
      * fixed mypy
      
      * fixed a test
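      The "fix version function for nightly version" item reflects that nightly builds report versions such as "1.8.0.dev20210129+cu110", which break naive parsing. A hedged sketch of a helper that copes with this; the name torch_version is illustrative rather than necessarily the function used in the repo:

        import re
        from typing import Tuple

        import torch


        def torch_version(version: str = torch.__version__) -> Tuple[int, ...]:
            # Nightly builds look like "1.8.0.dev20210129+cu110"; keep only the
            # leading numeric release so comparisons like `>= (1, 8, 0)` still work.
            match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
            if match is None:
                return (0, 0, 0)
            return tuple(int(part) for part in match.groups())


        assert torch_version("1.8.0.dev20210129+cu110") >= (1, 8, 0)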
  3. 27 Jan, 2021 1 commit
  4. 25 Jan, 2021 1 commit
    • [test] cover python 3.7 to 3.9 on CPU (#303) · 8459634f
      Min Xu authored
      * [test] cover python 3.7 to 3.9 on CPU
      
      - covering common python versions on CPU tests
      - added doc build test
      
      * add doc build test
      
      * skipping failing tests on py39
      
      * catching doc build warnings
      
      * add doc build to py38 and py39
      
      * minor fix
      
      * fix doc build for adascale
      
      * removed dead code
      
      * fix the skipping
      
      * skip unit test for py39
      
      * add failing example
      
      * no longer skipping the tests on py39
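      The "skipping failing tests on py39" / "fix the skipping" items follow the usual pytest version-gating pattern; a minimal sketch (the test name and reason string are illustrative):

        import sys

        import pytest


        @pytest.mark.skipif(
            sys.version_info >= (3, 9),
            reason="known failure on Python 3.9, tracked separately",
        )
        def test_not_yet_green_on_py39() -> None:
            assert 1 + 1 == 2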
  5. 16 Jan, 2021 1 commit
  6. 15 Jan, 2021 1 commit
  7. 11 Jan, 2021 1 commit
  8. 05 Jan, 2021 1 commit
    • [fix] Flaky tests (#283) · 79365ee6
      Benjamin Lefaudeux authored
      * adding the pytest timeout plugin to properly root out hanging tests
      * removing redundant code, slightly more reasonable timeout, works on single cuda
      * finding the root bug for some of the cpu hangs, rpc init
      * propagating all the rpc init test changes to the pipe and model parallel tests
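      The pytest-timeout plugin mentioned in the first bullet turns a hanging test into a hard failure instead of a stuck CI job. A minimal sketch of its use; the 60-second limit and the test body are illustrative:

        import time

        import pytest


        @pytest.mark.timeout(60)  # fail, rather than hang, if the test runs longer than 60s
        def test_rpc_init_does_not_hang() -> None:
            # ... initialize rpc / process groups here ...
            time.sleep(0.1)


        # Alternatively, a global limit can be passed on the command line: pytest --timeout=60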
  9. 30 Dec, 2020 1 commit
  10. 22 Dec, 2020 1 commit
  11. 30 Nov, 2020 1 commit
  12. 22 Nov, 2020 1 commit
  13. 21 Nov, 2020 1 commit
    • [feat] ShardedDataParallel with autoreduce (#157) · ad933b34
      Benjamin Lefaudeux authored
      * rewrite using autograd and Variable execution queue to make the reduce automatic
      * share buckets with OSS to remove duplication
      * some speed is likely still on the table, since the speed with bucketing does not match expectations; could be a follow-up
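      "Autoreduce" here means gradient reduction is hooked into the backward pass, so the training loop no longer issues an explicit reduce call. A minimal sketch of the intended usage, assuming the ShardedDataParallel/OSS API of this era (exact constructor arguments may differ from the version at this commit):

        import torch
        from fairscale.nn.data_parallel import ShardedDataParallel
        from fairscale.optim.oss import OSS

        # Assumes torch.distributed has already been initialized on every rank.
        model = torch.nn.Linear(32, 32).cuda()
        optimizer = OSS(params=model.parameters(), optim=torch.optim.SGD, lr=1e-3)
        model = ShardedDataParallel(model, optimizer)  # gradient buckets are shared with OSS

        for _ in range(5):
            optimizer.zero_grad()
            loss = model(torch.randn(8, 32, device="cuda")).sum()
            loss.backward()  # gradients are reduced automatically via autograd hooks
            optimizer.step()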
  14. 20 Nov, 2020 1 commit
  15. 19 Nov, 2020 1 commit
  16. 06 Nov, 2020 1 commit
  17. 30 Oct, 2020 1 commit
  18. 29 Oct, 2020 1 commit
  19. 28 Oct, 2020 1 commit
  20. 23 Oct, 2020 1 commit
  21. 22 Oct, 2020 1 commit
  22. 21 Oct, 2020 1 commit
  23. 17 Oct, 2020 1 commit
  24. 16 Oct, 2020 1 commit
  25. 14 Oct, 2020 1 commit
  26. 10 Oct, 2020 1 commit
  27. 09 Oct, 2020 1 commit
  28. 08 Oct, 2020 1 commit
  29. 01 Oct, 2020 1 commit
  30. 24 Sep, 2020 1 commit
  31. 22 Sep, 2020 1 commit
  32. 17 Sep, 2020 2 commits
    • Multi-process pipe (#90) · 63f7796a
      Tom Birch authored
      Adds support for distributing pipeline stages across multiple processes (and therefore multiple machines)
      * Adds a style argument to the Pipe constructor, defaulting to PipelineStyle.SingleProcess, but also supporting PipelineStyle.MultiProcess
      * Added support for lazy construction of modules (see lazy_construction for an example)
      * Added two implementations of inter-process communication: one based on rpc with globally visible queues, one based on send/recv
      * Copied all the relevant tests from tests/pipe to tests/pipe_process and modified them to exercise PipelineStyle.MultiProcess
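      A hedged sketch of what the new style argument looks like in use; the exact enum location and the extra arguments multi-process mode needs (for example a worker map) may differ from what this PR actually landed:

        import torch.nn as nn
        from fairscale.nn import Pipe

        model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

        # Default, single-process style: both partitions live in this process
        # (pinned to CPU here so the sketch does not require GPUs).
        pipe = Pipe(model, balance=[2, 1], devices=["cpu", "cpu"], chunks=4)

        # Multi-process style (sketch only): every rank constructs the Pipe with
        # the same balance but materializes only its own stage, and activations
        # travel between ranks over rpc or send/recv, along the lines of
        #   Pipe(model, balance=[2, 1], style=PipelineStyle.MultiProcess, ...)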
    • [feat] Sharded DDP - small refactor and new features (#97) · 49a198c9
      Benjamin Lefaudeux authored
      - rename oss_ddp to ShardedDataParallel
      - some refactoring
      - ShardedDataParallel owns the sharded optimizer, exposed if need be
      - some small perf bumps
  33. 03 Sep, 2020 1 commit
    • Add grad scaler (#48) · b6a5e634
      Jun Ru Anderson authored
      
      
      Add GradScaler to Fairscale, subclassing PyTorch's GradScaler, and use it in the pipe benchmark; gradient scaling is not needed in this case, but it is a good example of how to use it for larger models that do require it in order to converge.
      Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
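      Because the fairscale scaler subclasses the PyTorch one, its call pattern is the standard torch.cuda.amp loop; a minimal sketch using the stock torch.cuda.amp.GradScaler (the fairscale subclass drops into the same positions):

        import torch

        model = torch.nn.Linear(32, 10).cuda()
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
        scaler = torch.cuda.amp.GradScaler()

        for _ in range(5):
            optimizer.zero_grad()
            with torch.cuda.amp.autocast():
                loss = model(torch.randn(8, 32, device="cuda")).sum()
            scaler.scale(loss).backward()  # scale the loss to avoid fp16 gradient underflow
            scaler.step(optimizer)         # unscale grads and skip the step on inf/nan
            scaler.update()                # adjust the scale factor for the next iteration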
  34. 21 Aug, 2020 1 commit
    • [feat] Simple macro OSS benchmark (#47) · 46c3776b
      Benjamin Lefaudeux authored
      
      
      * initial commit, dummy training loop, pure pytorch but not DDP
      
      * probably slightly broken, but rough DDP benchmark run
      
      * adding the torchvision requirement for testing
      
      * brainfart
      
      * reduce the loss, do something slightly distributed
      
      * Some cleanup, distributing the training on two GPUs
      
      * some cleanup + adding a vanilla run, still not good to go
      
      * less silly defaults, gtg for a start I think
      
      * smaller batch to fit the smaller gpus used in the circleci rigs
      
      * Adding some options for the benchmark, and regression testing
      
      * [test] set torch seed for Adam tests (#49)
      
      Set the torch seed for tests. xfail mixed precision and memory-efficient mixed-precision state_dict tests due to their states being cast to FP16 and back to FP32 during load_state_dict.
      Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
      
      * linting, I really need to automate this isort insanity
      Co-authored-by: Jun Ru Anderson <33384298+andersonic@users.noreply.github.com>
      Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
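      At its core the benchmark wraps a plain torch optimizer in OSS so that each rank keeps only its shard of the optimizer state; a minimal sketch of that pattern (model, learning rate, and process-group setup are illustrative, not the benchmark's actual values):

        import torch
        from fairscale.optim.oss import OSS

        # Assumes torch.distributed.init_process_group(...) has already run on every rank.
        model = torch.nn.Linear(128, 10).cuda()
        # OSS shards the SGD state across ranks; each rank steps its own shard and
        # then broadcasts the updated parameters to the other ranks.
        optimizer = OSS(params=model.parameters(), optim=torch.optim.SGD, lr=1e-2, momentum=0.9)

        for _ in range(5):
            optimizer.zero_grad()
            loss = model(torch.randn(32, 128, device="cuda")).sum()
            loss.backward()
            optimizer.step()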
  35. 14 Aug, 2020 1 commit
  36. 13 Aug, 2020 2 commits
  37. 31 Jul, 2020 2 commits