Commits · f13382ba4eea840dcd5123db024d06f1c0c36bd3 · OpenDAS / fairscale

11 Apr, 2023 2 commits
- Update documentation to remove obsolete references (#1116) · f13382ba
  Dale Evans authored Apr 11, 2023
```
All the examples were deleted from the repo as part of issue #712
```
  f13382ba
- Fix docstring typo (#1118) · cf76d94e
  gregor-soniox authored Apr 11, 2023
  
  cf76d94e
10 Mar, 2023 1 commit
- Fix bibtex entry (#1110) · 5b38de38
  mrbaozi authored Mar 10, 2023
  
  5b38de38
28 Feb, 2023 1 commit
- make a logging warning once (#1108) · 3fbde78f
  Min Xu authored Feb 28, 2023
```
- fixes #1107
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
  3fbde78f
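
One common way to emit a log warning only once per distinct message is to memoize the logging call. The sketch below is an illustration of that general pattern, not the change made in #1108; the logger name, `warn_once`, and `_ListHandler` are hypothetical names introduced for the example.

```python
import functools
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("warn_once_demo")


class _ListHandler(logging.Handler):
    """Collects formatted messages in a list so the behavior can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


@functools.lru_cache(maxsize=None)
def warn_once(message: str) -> None:
    """Log `message` at WARNING level the first time it is seen.

    lru_cache remembers the argument, so repeated calls with the same
    message return the cached result without logging again.
    """
    logger.warning(message)


handler = _ListHandler()
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

for _ in range(3):
    warn_once("feature X is deprecated")
warn_once("another message")
```

After this runs, `handler.messages` holds each distinct message a single time, regardless of how often `warn_once` was called with it.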
23 Feb, 2023 1 commit
- Remove `torch._six` from `__init__.py` (#1106) · da0cbd7c
  Nikita Shulga authored Feb 23, 2023
```
Just use `torch.inf`, as torch._six is gone after https://github.com/pytorch/pytorch/pull/94709
```
  da0cbd7c
15 Feb, 2023 1 commit
- [fix] typo in wikitext2_data.py (#1104) · 377e9696
  Junyeol Ryu authored Feb 16, 2023
```
* [fix] typo in wikitext2_data.py

* [fix] typo and code duplication in fsdp.py
```
  377e9696
04 Feb, 2023 1 commit
- [fix] typo in flatten_params_wrapper.py (#1103) · f87255d5
  Ikko Eltociear Ashimine authored Feb 05, 2023
```
heirarchy -> hierarchy
```
  f87255d5
12 Dec, 2022 1 commit

[test] ci py 3.11 tests (#1099) · 4a98000c

Min Xu authored Dec 11, 2022

* [test] ci py 3.11 tests
Co-authored-by: Min Xu <min.xu.public@gmail.com>

* fixed setup.py

* fixed ci config

* fixed ci config's python 3.11 version

* fixed torch installs on cpu

* update pygit2 for 3.11

* we don't run benchmark on cpu, so no need to install the benchmark reqs

* update torch install

* try to install torchvision

* numpy version 311

* fix cpu test dependency installation

* pip git install cmd fix

* bypass some tests in 3.11; they fail because the packages they use haven't been updated for 3.11 yet
Co-authored-by: Min Xu <min.xu.public@gmail.com>

4a98000c

11 Dec, 2022 3 commits
- ci and benchmark test (#1098) · 6ad51702
  Min Xu authored Dec 11, 2022
```
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
  6ad51702
- 0.4.13 release · 50b06d25
  Anupam Bhatnagar authored Dec 11, 2022
  
  50b06d25
- add fair_dev packages (#1097) · 52043ace
  Min Xu authored Dec 11, 2022
```
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
  52043ace
05 Dec, 2022 1 commit

Implement _compute_intra_grad_corr_mean for gradient computation (#1095) · 99edeff1

Changyu Gao authored Dec 05, 2022

* Fix gradient accumulation

Add ``is_scaled_loss`` flag to support both scaled / unscaled loss
Fix ``test_grad_accum`` and ``test_set_num_gradients_to_accumulate``

* Add a method to scale grad for grad_accum using unscaled loss

- Revert the changes in `step` method
- Add a method `scale_grad_by_num_grads_to_accum` to handle gradient accumulation using unscaled loss more explicitly
- Add gradient tests

* Implement _compute_corr_mean_between_grads

* Improve tests and comments

* Use ubuntu-20.04 instead of latest

Use ubuntu-20.04 to fix the `arch x64 not found` issue
[Version 3.10 with arch x64 not found actions/setup-python#401](https://github.com/actions/setup-python/issues/401)

* Switch flake8 from gitlab to github

Flake8 was moved to GitHub.
See discussions https://www.reddit.com/r/Python/comments/yvfww8/flake8_took_down_the_gitlab_repository_in_favor/

* Fix scikit-learn package

* Update PyTorch versions

* Resolve comments from Min

* Minor fix

* Disable broken tests for new versions of PyTorch

99edeff1

21 Oct, 2022 1 commit
- minor cleanup (#1091) · ee647b97
  Min Xu authored Oct 21, 2022
```
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
  ee647b97
05 Oct, 2022 3 commits

Fix gradient accumulation (#1086) · f5e727cc

Changyu Gao authored Oct 05, 2022

* Fix gradient accumulation

- Add ``is_scaled_loss`` flag to support both scaled / unscaled loss
- Add a method `scale_grad_by_num_grads_to_accum` to handle gradient accumulation using unscaled loss more explicitly
- Fix ``test_grad_accum`` and ``test_set_num_gradients_to_accumulate``
- Add tests for gradient

f5e727cc
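
The difference between a scaled and an unscaled loss during gradient accumulation can be sketched in plain Python. This is a toy illustration under stated assumptions, not fairscale's implementation: only the flag name `is_scaled_loss` comes from the commit message, while `grad_of_loss`, `accumulate`, and the one-parameter model are hypothetical.

```python
import math
from typing import List, Tuple


def grad_of_loss(w: float, x: float, y: float) -> float:
    """Gradient w.r.t. w of the toy loss 0.5 * (w * x - y) ** 2."""
    return (w * x - y) * x


def accumulate(w: float, batches: List[Tuple[float, float]], is_scaled_loss: bool) -> float:
    """Accumulate gradients over micro-batches and return the averaged gradient.

    is_scaled_loss=True models a caller that already divided each loss by the
    number of micro-batches, so every per-step gradient arrives pre-scaled.
    is_scaled_loss=False models raw losses, so the accumulated gradient is
    divided by the number of micro-batches once at the end.
    """
    n = len(batches)
    total = 0.0
    for x, y in batches:
        g = grad_of_loss(w, x, y)
        if is_scaled_loss:
            g /= n  # gradient of (loss / n)
        total += g
    if not is_scaled_loss:
        total /= n  # explicit post-hoc scaling of the accumulated gradient
    return total


batches = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 4.0)]
scaled = accumulate(0.5, batches, is_scaled_loss=True)
unscaled = accumulate(0.5, batches, is_scaled_loss=False)
```

Both paths should produce the same averaged gradient; the flag only records where the division by the number of accumulation steps happens.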

0.4.12 release · b0b92e70
Anupam Bhatnagar authored Oct 05, 2022

b0b92e70
fix doc build (#1087) · d91658f5
Min Xu authored Oct 05, 2022
```
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
d91658f5

30 Sep, 2022 1 commit
- 0.4.11 release · 287d2e6d
  Anupam Bhatnagar authored Sep 30, 2022
  
  287d2e6d
25 Sep, 2022 1 commit
- reenable a test (#1081) · 4975b05e
  Min Xu authored Sep 24, 2022
```
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
  4975b05e
24 Sep, 2022 3 commits

[cleanup] remove ssd offload to simplify the FSDP code (#1080) · e71d2570

Min Xu authored Sep 24, 2022

* simplified the readme

* clean up ssd offload

* try to fix readthedocs
Co-authored-by: Min Xu <min.xu.public@gmail.com>

e71d2570

[Fix][FSDP] Don't remove post backward hooks for multiple backward fix (#1079) · f4fcee7e

Min Xu authored Sep 24, 2022

* tmp

* test again

* test again

* add new test

* clean up

* add test file to the testlist

* more comments

* add changelog
Co-authored-by: Min Xu <min.xu.public@gmail.com>

f4fcee7e

[chore] move fair_dev into fairscale (#1078) · 8f8f8ef9
Min Xu authored Sep 23, 2022
```
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
8f8f8ef9

23 Sep, 2022 6 commits

[fix] SDP syncing buffers during gradient accumulation (#1075) · bfd57ff3

Min Xu authored Sep 23, 2022

- Fixes from Benjamin.

Original commit msg:
  - Fixes #1041. I just had a minute or two, hoping that it's enough :)
Co-authored-by: Min Xu <min.xu.public@gmail.com>

bfd57ff3

disable code cov (#1077) · abfa7193
Min Xu authored Sep 23, 2022
```
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
abfa7193
disable codecov (#1076) · 72fcabec
Min Xu authored Sep 23, 2022
```
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
72fcabec
0.4.10 release · 6f03e415
Anupam Bhatnagar authored Sep 23, 2022

6f03e415

[fix] better handling non-flatten in FSDP (#1072) · 429f3d31

Min Xu authored Sep 23, 2022

* [fix] better handling non-flatten in FSDP

- see the detailed comment about that backward firing case
- also minor debugging help in FSDP
- also minor fix in FPW's state dict

* [feat] disallow reset_parameters by default

* [feat] adding fsdp_instances API - useful for checking wrapping in user code

* [fix] one line fix but more than a day of debugging

* fixed the case of loading combined check with empty fsdp instances

* fixed another bug around state loading the root/nonroot module full param caching due to not resharding after forward

* [feat] support .half and .float better

* fixed a bug where gathering optim state loses extra keys from the original state_dict

* fixed a test failure in mixed precision

* fixed another bug affecting no_sync grad acc

* fixed a bug and a test in fsdp optim state

* fixed another corner case

* added a comment

* skip ssd offload tests

* skip fsdp one for ssd offload
Co-authored-by: Min Xu <min.xu.public@gmail.com>

429f3d31

[fix] don't import ProcessGroup eagerly (#1074) · 47ce21ac

Min Xu authored Sep 22, 2022

* [fix] don't import ProcessGroup eagerly

- move the import into typing since it is only used for type checking
- fixes #1057

* more fixes

* one more

* tested at least
Co-authored-by: Min Xu <min.xu.public@gmail.com>

47ce21ac
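
Moving an import under `typing.TYPE_CHECKING` can be sketched as follows. This is a minimal illustration of the general pattern rather than the actual diff from #1074; `describe_group` is a hypothetical function, and the annotation is written as a string so the name is never looked up at runtime.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Seen only by static type checkers. At runtime TYPE_CHECKING is False,
    # so torch.distributed is not imported by this module.
    from torch.distributed import ProcessGroup


def describe_group(group: "Optional[ProcessGroup]" = None) -> str:
    """Hypothetical helper whose signature mentions ProcessGroup only as a type."""
    if group is None:
        return "default group"
    return f"group {group!r}"


summary = describe_group()
```

Because the annotation is a string, the module imports cleanly even where `torch.distributed` is unavailable, while type checkers still resolve `ProcessGroup`.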

13 Sep, 2022 3 commits
- [bug] fix optim state gather when there are empty FSDP instances (#1071) · d8fc94d9
  Min Xu authored Sep 13, 2022
```
* [bug] fix optim state gather when there are empty FSDP instances

* fixes an assert and a test bug
```
  d8fc94d9
- [minor] add a warning in the doc (#1070) · 203dd668
  Min Xu authored Sep 12, 2022
  
  203dd668
- [feat] support namedtuple in container.py (#1069) · eeb6684e
  Min Xu authored Sep 12, 2022
  
  eeb6684e
10 Sep, 2022 1 commit

[minor] help pure fp16 FSDP init a bit (#1068) · 73bf5964

Min Xu authored Sep 10, 2022

* [minor] [FSDP] add a better for pure fp16

* [minor] [wrap] add a flag to help fsdp pure fp16 wrapping

73bf5964

07 Sep, 2022 4 commits
- [minor] fix doc and assert and test around percent (#1067) · 454537d1
  Min Xu authored Sep 07, 2022
  
  454537d1
- [feat] add random_sparse_mask api (#1066) · 1a8d234d
  Min Xu authored Sep 07, 2022
```
* [feat] add random_sparse_mask api

* correct test skip
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
  1a8d234d
- 0.4.9 release · 19033c32
  Anupam Bhatnagar authored Sep 07, 2022
  
  19033c32
- [feat] support a context for loading state_dict for FSDP (#1065) · 4b126c7b
  Min Xu authored Sep 06, 2022
```
* [fix]: add a context for supporting state_dict from a non-FSDP parent module

* formatting
Co-authored-by: Min Xu <min.xu.public@gmail.com>
```
  4b126c7b
26 Aug, 2022 1 commit

[feat] support optional SST and DST (#1063) · 3cc7fa8d

Min Xu authored Aug 25, 2022

* [feat] support sst disabled and dst disabled cases

* added tests
Co-authored-by: Min Xu <min.xu.public@gmail.com>

3cc7fa8d

25 Aug, 2022 1 commit

[chore] update nightly version (#1064) · 15d4cf15

Min Xu authored Aug 25, 2022

* update nightly version

* update wgit to use numpy for load/store

- this was needed because the new nightly torch version made torch.save() stop
  producing deterministic bytes
- this converts tensor<->numpy and then does the save/load through numpy to avoid that issue.

* fixed tests
Co-authored-by: Min Xu <min.xu.public@gmail.com>

15d4cf15
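
The idea of serializing through NumPy to get reproducible bytes can be sketched like this. It is an illustration under assumptions, not wgit's code: `array_digest` is a hypothetical helper, and a real tensor would first be converted with something like `tensor.cpu().numpy()` (and restored with `torch.from_numpy`).

```python
import hashlib
import io

import numpy as np


def array_digest(arr: np.ndarray) -> str:
    """Serialize an array in .npy format in memory and return its SHA-256 hex digest."""
    buf = io.BytesIO()
    # allow_pickle=False keeps the output to the plain .npy header plus raw data.
    np.save(buf, arr, allow_pickle=False)
    return hashlib.sha256(buf.getvalue()).hexdigest()


a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = a.copy()  # equal contents, separate memory

digest_a = array_digest(a)
digest_b = array_digest(b)
digest_changed = array_digest(a + 1)
```

Arrays with the same dtype, shape, and contents should hash identically, which is the property a content-addressed store depends on; changing the data changes the digest.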

11 Aug, 2022 1 commit

[feat] signal sparsity profiling class (#1060) · e982b433

Min Xu authored Aug 11, 2022

* added a profiling class

* no more type ignore after merging main

* fixed an int/round bug

* add unit tests

* skip if no cuda for a test
Co-authored-by: Min Xu <min.xu.public@gmail.com>

e982b433

08 Aug, 2022 2 commits

[fix] bugs in signal sparsity class and improving tests (#1058) · 4c830de1

Min Xu authored Aug 08, 2022

* update examples and comment

* fixed issue with fft/ifft only doing the last dim

* fixed an int/round bug; fixed tests

* add cuda tests

* add atol and rtol

* skip cuda test correctly
Co-authored-by: Min Xu <min.xu.public@gmail.com>

4c830de1

Disable broken tests (#1055) · f81a60be
Crutcher Dunnavant authored Aug 08, 2022

f81a60be