- 15 Nov, 2021 1 commit
Anupam Bhatnagar authored
* first commit
* sharded scaler hitting nan assertions
* adding test for sharded grad scaler without cpu offload
* ddp grad scaler and fsdp sharded grad scaler test failing
* removing test_output
* fix no cpu offload test
* changing optimizer from OSS to SGD
* all tests passing, code cleanup pending
* code cleanup
* fix pyproject.toml
* removing .isort.cfg
* running isort linter
* resolving isort issues
* resolving black linter issue
* resolving mypy issues
* fix import statement
* fix mypy error
* modifying import statement
* adding pytorch version requirement
* fixing pytest skip test decorator
* apply version guard for ShardedGradScaler
* removing test_fsdp_grad_scaler
* increasing num_epochs for ShardedGradScaler so that updates are not skipped
* adding support for torch 1.8
* minor edit
* [skip ci] more torch 1.8 changes
* parametrizing the tests
* cleanup code with linters
* [skip ci] update doc string
* [skip ci] addressing some more comments
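A minimal usage sketch of the sharded grad scaler this commit introduces, assuming it follows the `torch.cuda.amp.GradScaler` interface and lives at `fairscale.optim.grad_scaler`; the model, data, and training loop below are illustrative only, and a distributed process group plus a GPU are assumed to be available:

```python
import torch
import torch.nn as nn
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP
from fairscale.optim.grad_scaler import ShardedGradScaler

# assumes torch.distributed is already initialized and a GPU is available
model = FSDP(nn.Linear(16, 4).cuda())
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = ShardedGradScaler()

for _ in range(10):
    inputs = torch.randn(8, 16, device="cuda")
    targets = torch.randn(8, 4, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = nn.functional.mse_loss(model(inputs), targets)
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)         # skip the step if any shard saw inf/nan gradients
    scaler.update()                # adjust the scale factor for the next iteration
```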
- 08 Nov, 2021 1 commit
anj-s authored
* update release notes
* initial commit
* lint cleanup etc.
* helper functions; lint errors
* lint errors
* lint errors
* add back the boolean for named_parameters
* address comments and fix lint
* remove unused functions and class
* remove unused state
- 27 Oct, 2021 1 commit
anj-s authored
* remove offload dependency on fp16
* update Python version for CPU tests
* run CPU tests with updated PyTorch version
* split changes
* revert tests config
* fix lint errors
* update nightly and test PyTorch versions
* skip failing multiprocess pipe test
* always skip test
* always skip test
* always skip test
* lint error
* skip unsupported versions
* improve skip message
* lint errors
* modify docs
* add tests
* fix test failures
* modify comments
* fix lint errors
* fix lint errors
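A sketch of what removing the offload dependency on fp16 enables: CPU offload with full-precision parameters. The keyword arguments below reflect fairscale's FSDP constructor as commonly documented and should be read as assumptions, not a confirmed snippet from this change:

```python
import torch.nn as nn
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP

# offload full-precision (fp32) parameters to CPU; per the commit title,
# this path previously required fp16 / mixed precision (assumption)
model = FSDP(nn.Linear(16, 16), move_params_to_cpu=True, mixed_precision=False)
```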
- 26 Jul, 2021 1 commit
Min Xu authored
* [feat] FSDP: supporting multiple flatten parameter groups - step 3: make FSDP use FlattenParamModule unconditionally
* fixing the auto_wrap tests
* minor
* rewrite local_metadata_dict - updated FPW so that custom flat param name is also supported
* bug fix
* mypy
* rewrote consolidate_shard_weights - test_consolidate passes
* comments
* fixing pickling
* Fix shared params and MoE logic (#749)
* add strict kwarg to support fairseq:gshard MoE saving logic
* Test fairseq style shard
* style
* formatting and address comments
* added changelog
* fixing a test after padding renaming

Co-authored-by: Min Xu <min.xu.public@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
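A hedged sketch of the consolidation flow this commit reworks: each rank saves its local shard plus metadata, and `consolidate_shard_weights` rebuilds a full state dict offline. The file names, rank/world_size handling, and dictionary keys are illustrative assumptions:

```python
import torch
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP

world_size = 2  # illustrative

# On each rank, save the flattened local shard and its metadata, e.g.:
#   torch.save({"weights": model.local_state_dict(),
#               "metadata": model.local_metadata_dict()}, f"shard{rank}.pt")

# Offline, rebuild the full unflattened state dict from all shards:
shards = [torch.load(f"shard{rank}.pt") for rank in range(world_size)]
full_state = FSDP.consolidate_shard_weights(
    shard_weights=[s["weights"] for s in shards],
    shard_metadata=[s["metadata"] for s in shards],
)
```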
- 21 Jun, 2021 1 commit
Min Xu authored
* [feat] FSDP: supporting multiple flatten parameter groups - step 2: extending FPW to support multiple flat param groups - FSDP still only uses one group - unit tests cover the new code paths - updated the changelog
* first cut, mypy passed
* test_flatten_params_wrapper.py::TestFlattenParams tests pass
* added two more test cases and fixed a case in the code
* fixed one bug with param_path_infos
* fixed two more tests with hardcoded flat_param names
* Update CHANGELOG.md

Co-authored-by: Min Xu <min.xu.public@gmail.com>
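A sketch of what multiple flat parameter groups could look like at the FlattenParamsWrapper level. The `param_list`-as-list-of-groups shape is an assumption inferred from the commit description, not a confirmed API:

```python
import torch.nn as nn
from fairscale.nn.misc import FlattenParamsWrapper

module = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 4))

# one flat parameter per group instead of a single flat parameter overall
# (passing param_list as a list of parameter groups is an assumption)
fpw = FlattenParamsWrapper(
    module,
    param_list=[list(module[0].parameters()), list(module[1].parameters())],
)
```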
- 01 Jun, 2021 1 commit
Pete authored
* add failing test for buffer dtype
* fix buffer dtype issue
* update CHANGELOG
* fix
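The fix concerns the dtype of module buffers under mixed precision. A minimal sketch, assuming FSDP's `mixed_precision` and `buffer_dtype` keyword arguments; the behavior described in the comments is inferred from the commit title:

```python
import torch
import torch.nn as nn
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP

# BatchNorm carries running-mean/var buffers alongside its parameters
model = FSDP(
    nn.BatchNorm1d(8),
    mixed_precision=True,        # parameters cast to fp16 for compute
    buffer_dtype=torch.float32,  # buffers kept in fp32 (assumed kwarg)
)
```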
- 14 May, 2021 1 commit
Shruti Bhosale authored
* fix saving and loading checkpoints with use_sharded_state=True
* mypy fix
* better fix of the infinite recursion - we need to specifically call FSDP.state_dict from its local state_dict - added unit test that fails without the fix and works with the fix - fixed mypy for the overloaded functions
* make cpu-only fsdp work for state_dict at least

Co-authored-by: Min Xu <min.xu@acm.org>
Co-authored-by: Min Xu <min.xu.public@gmail.com>
Co-authored-by: Min Xu <m1n@fb.com>
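A hedged sketch of the sharded-checkpoint path this fix targets, using FSDP's `local_state_dict` / `load_local_state_dict` pair (which, per the commit, must route through `FSDP.state_dict` internally to avoid the infinite recursion). The wrapped module, file names, and rank handling are illustrative:

```python
import torch
import torch.distributed as dist
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP

# assumes a process group is already initialized
model = FSDP(torch.nn.Linear(8, 8))
rank = dist.get_rank()

# each rank saves only its own flattened shard of the parameters
torch.save(model.local_state_dict(), f"ckpt_shard{rank}.pt")

# later, each rank restores its own shard
model.load_local_state_dict(torch.load(f"ckpt_shard{rank}.pt"))
```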
- 07 Apr, 2021 1 commit
Myle Ott authored