1. 24 Dec, 2021 1 commit
  2. 21 Dec, 2021 5 commits
  3. 16 Dec, 2021 1 commit
    • Added warn_on_trainable_params_changed constructor parameter to allow the user to suppress the warning on trainable parameters changed (#886) · 99163d4f
      Freddy Snijder authored
      
      * Added warn_on_trainable_params_changed constructor parameter to allow the user to suppress the warning on trainable parameters changed; the default is True and thus the default behavior is unchanged
      
      * Added parameter documentation
  4. 13 Dec, 2021 1 commit
    • [feat] support eval in mevo (#884) · 56add6d5
      Min Xu authored
      - During eval, we fall back to just the output projection without fusing
      - Added a unit test to ensure the shape is correct
  5. 06 Dec, 2021 1 commit
    • Fix for Key Error that can happen in certain FSDP wrapping scenarios of Huggingface model sub-modules (issue #876) (#881) · e6acdcc3
      Freddy Snijder authored
      
      * Fix for Key Error that can happen in certain FSDP wrapping scenarios of Huggingface model sub-modules (issue #876)
      
      * Styling fixes
      
      * Updated the test to be independent of the Huggingface transformers package
      
      * Added test for issue #876
      
      * Small error message fix
      
      * Skip test when CUDA is not available
      
      * Fixed naming of model
  6. 02 Dec, 2021 5 commits
  7. 29 Nov, 2021 1 commit
  8. 24 Nov, 2021 2 commits
  9. 21 Nov, 2021 1 commit
  10. 19 Nov, 2021 1 commit
    • Add installation instructions through conda (#863) · 117fc8bd
      h-vetinari authored
      * DOC: fix the rst-headers in installation instructions
      
      * DOC: add installation through conda-forge to instructions
      
      * DOC: fix rst-syntax in installation-instructions
      
      * DOC: add comment about building from source with GPU-support
  11. 18 Nov, 2021 4 commits
  12. 17 Nov, 2021 2 commits
  13. 15 Nov, 2021 1 commit
    • Allow sharded grad scaler to cpu offload with FSDP (#831) · ba5785f7
      Anupam Bhatnagar authored
      * first commit
      
      * sharded scaler hitting nan assertions
      
      * adding test for sharded grad scaler without cpu offload
      
      * ddp grad scaler and fsdp sharded grad scaler test failing
      
      * removing test_output
      
      * fix no cpu offload test
      
      * changing optimizer from OSS to SGD
      
      * all tests passing, code cleanup pending
      
      * code cleanup
      
      * fix pyproject.toml
      
      * removing .isort.cfg
      
      * running isort linter
      
      * resolving isort issues
      
      * resolving black linter issue
      
      * resolving mypy issues
      
      * fix import statement
      
      * fix mypy error
      
      * modifying import statement
      
      * adding pytorch version requirement
      
      * fixing pytest skip test decorator
      
      * apply version guard for ShardedGradScaler
      
      * removing test_fsdp_grad_scaler
      
      * increasing num_epochs for ShardedGradScaler so that updates are not skipped
      
      * adding support for torch 1.8
      
      * minor edit
      
      * [skip ci] more torch 1.8 changes
      
      * parametrizing the tests
      
      * cleanup code with linters
      
      * [skip ci] update doc string
      
      * [skip ci] addressing some more comments
  14. 12 Nov, 2021 1 commit
    • Setup pre-commit github action and apply pre-commit to all files (#849) · 7d7edf6d
      Anupam Bhatnagar authored
      * adding pre-commit files
      
      * applying pre-commit to all files
      
      * adding no-strict-optional argument to mypy in circle ci config
      
      * fix typo
      
      * updating python versions
      
      * [skip ci] remove extra args
      
      * adding python 3.9
      
      * [skip ci] set pre-commit version in requirements-dev.txt
      
      * set CACHE_VERSION
      
      * move linters from circleci to github actions
      
      * update python version
      
      * update python version in benchmarks_2
      
      * moving to python 3.9.7
  15. 09 Nov, 2021 1 commit
  16. 08 Nov, 2021 3 commits
  17. 05 Nov, 2021 1 commit
    • [feat] experimental MEVO layer (#840) · 8347c1a2
      Min Xu authored
      
      
      * [feat] MEVO kernel
      
      - initial import from min/softmax and min/testing branches
      - need to rename and further cleanup
      
      * only test with newer pytorch
      
      * renamed and added comments and code cleanup
      
      * rename and reduce test memory
      
      * testing
      
      * minor fixing
      
      * fixing
      
      * more fix
      
      * changelog
      
      * more 1.7 and 1.8 paper cuts
      
      * remove dead code
      
      * addressed Benjamin's comments
      
      * addressed more comments
      Co-authored-by: Min Xu <min.xu.public@gmail.com>
  18. 03 Nov, 2021 1 commit
  19. 02 Nov, 2021 2 commits
  20. 01 Nov, 2021 2 commits
    • [feat] [FSDP]: add experimental support to shared weights (#836) · f2af4c66
      Min Xu authored
      
      
      * added a new test, passing without shared weights
      
      * tested weight sharing
      
      * added the test to test list file
      
      * extended to world_size = 2
      
      * fixed test
      
      * [feat]: add limited and experimental support for shared parameters
      
      * fixed tests
      
      * simplify to work with layers that have at least 1 non-shared param, and add code to pick up the linked_param field for sharding the shared param
      
      * fixed the case where linked param is not in separate FSDP
      
      * changelog and remove old code
      Co-authored-by: Min Xu <min.xu.public@gmail.com>
    • [feature] Add the low level SSD APIs (#829) · a9fcaa28
      anj-s authored
      * add doc strings
      
      * add lower level SSD APIs and tests
      
      * add the test to the list to be run
      
      * remove unused imports
      
      * more doc string changes
      
      * fix lint errors
  21. 28 Oct, 2021 1 commit
  22. 27 Oct, 2021 2 commits
    • [fix] Decouple `move_params_to_cpu` from the `mixed_precision`. (#822) · ed7ca766
      anj-s authored
      * remove offload dependency on fp16
      
      * update python version for cpu tests
      
      * run CPU tests with updated PyTorch version
      
      * split changes
      
      * revert tests config
      
      * fix lint errors
      
      * update nightly and test PyTorch versions
      
      * skip failing multiprocess pipe test
      
      * always skip test
      
      * always skip test
      
      * always skip test
      
      * lint error
      
      * skip unsupported versions
      
      * improve skip message
      
      * lint errors
      
      * modify docs
      
      * add tests
      
      * fix test failures
      
      * modify comments
      
      * fix lint errors
      
      * fix lint errors
    • [test] improve a test's coverage (#798) · b60f3db0
      Min Xu authored
      
      
      * checkpoint + nonflat + mixed_precision
      
      * make tests pass with expected errors
      
      * addressed comments
      
      * add a comment
      Co-authored-by: Min Xu <min.xu.public@gmail.com>