1. 06 Oct, 2020 1 commit
  2. 05 Oct, 2020 1 commit
  3. 02 Oct, 2020 1 commit
  4. 01 Oct, 2020 3 commits
  5. 29 Sep, 2020 1 commit
  6. 24 Sep, 2020 3 commits
  7. 22 Sep, 2020 3 commits
  8. 17 Sep, 2020 6 commits
  9. 16 Sep, 2020 2 commits
  10. 15 Sep, 2020 2 commits
  11. 14 Sep, 2020 1 commit
  12. 12 Sep, 2020 1 commit
  13. 11 Sep, 2020 1 commit
  14. 10 Sep, 2020 3 commits
  15. 09 Sep, 2020 7 commits
  16. 08 Sep, 2020 1 commit
    • [feat] OSS: Sync all attributes (#67) · 5a268b25
      Benjamin Lefaudeux authored
      Ensure that all attributes (not just the learning rate) stay in sync between OSS.param_groups and the param_groups of the wrapped optimizer. Some frameworks allow any attribute to be adjusted on a schedule, which can be useful depending on the optimizer, so all keys need to be supported generically (not just "lr"). Not syncing these attributes was the worst case, since the adjustments were silently dropped; this change fixes that.
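      A minimal sketch of the syncing idea described above, assuming a hypothetical helper name; it mirrors every hyperparameter key from the wrapper's param_groups onto the wrapped optimizer rather than just "lr" (this is not the actual fairscale implementation):

      ```python
      def sync_param_groups(wrapper_groups, wrapped_optimizer):
          """Mirror every hyperparameter key (lr, momentum, weight_decay, ...)
          from the wrapper's param_groups onto the wrapped optimizer."""
          for wrapper_group, local_group in zip(wrapper_groups, wrapped_optimizer.param_groups):
              for key, value in wrapper_group.items():
                  if key != "params":  # the parameter tensors themselves are not hyperparameters
                      local_group[key] = value

      # Hypothetical usage: a scheduler mutates oss.param_groups, and the change is
      # propagated to the inner optimizer before the next step.
      # sync_param_groups(oss.param_groups, oss.optim)  # the `optim` attribute name is an assumption
      ```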
  17. 04 Sep, 2020 1 commit
  18. 03 Sep, 2020 2 commits
    • [feat] Add a memory usage regression test to the OSS benchmark (#62) · ee38e1e0
      Benjamin Lefaudeux authored
      * Aligning the optimizer state dict with what PyTorch expects
      
      * Adding a check on the dict keys, ensure that `state` and `param_groups` are there
      
      * After installing the pinned isort and black versions, a one-liner to please the linter
      
      * Adding measurement of memory consumption while training + checkpointing (see the sketch after this list)
      
      * mandatory lintfix commit
      
      * Reset the memory usage counter at the beginning of training, in case two runs are executed in a row
      
      * move reset stats call, hotfix
      
      * Switch the optimizer to RMSprop, which carries more state and is still used in CV
      
      * Trying to track down a SIGSEGV in CircleCI
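      As referenced in the list above, a hedged sketch of how a peak-memory regression check could be wired around a training run; the function and threshold names are illustrative and not the benchmark's actual code:

      ```python
      import torch

      def check_peak_memory(train_one_epoch, device, max_allowed_bytes):
          # Reset the peak counter first, in case another run executed just before this one
          torch.cuda.reset_peak_memory_stats(device)

          train_one_epoch()

          peak = torch.cuda.max_memory_allocated(device)
          assert peak <= max_allowed_bytes, (
              f"Memory regression: peak {peak / 2**20:.1f} MiB exceeds "
              f"the allowed {max_allowed_bytes / 2**20:.1f} MiB"
          )
      ```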
    • Add grad scaler (#48) · b6a5e634
      Jun Ru Anderson authored
      
      
      Add GradScaler to Fairscale, subclassing PyTorch's GradScaler. Use it in the pipe benchmark; gradient scaling is not strictly needed there, but it serves as a good example of how to apply it to larger models that do require it in order to converge.
      Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
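      A short sketch of the standard PyTorch GradScaler pattern that the subclass referenced in this commit builds on (scale the loss, step through the scaler, then update); the training-loop names are illustrative:

      ```python
      import torch
      from torch.cuda.amp import GradScaler, autocast

      scaler = GradScaler()

      def training_step(model, optimizer, loss_fn, batch, targets):
          optimizer.zero_grad()
          with autocast():                   # forward pass in mixed precision
              loss = loss_fn(model(batch), targets)
          scaler.scale(loss).backward()      # scale the loss so fp16 gradients do not underflow
          scaler.step(optimizer)             # unscales gradients, skips the step on inf/nan
          scaler.update()                    # adjust the scale factor for the next iteration
      ```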