1. 28 Apr, 2021 1 commit
    • [feat] save memory by using bucket buffer only in backward (#633) · a5594032
      Min Xu authored
      
      
      * [feat] save memory by using bucket buffer only in backward
      
      - this fixes bug #627
      - added documentation to clarify the buffer's cost and speed/memory
        tradeoff
      - added setup/teardown calls so that the buffer is only allocated
        during the backward pass, freeing more memory during the forward pass
        and the optimizer step so that it can be used for things like
        activations.
      - added a unit test that asserts the memory usage is in range.
      
      Comparing with DDP:
      
        1. buffer size scales with the number of FSDP instances, not with model size
        2. buffer is only allocated during the backward pass
        3. buffer is used for small tensors only, to reduce overhead
        4. the overlap of compute and gradient reduction is very different
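      A minimal sketch of the idea behind points 2 and 3 (illustrative only, not
      the actual FSDP internals; the GradBucket name is hypothetical): the flat
      buffer is created right before backward, used to batch small gradients for
      reduction, and released afterwards.
      
      ```
      import torch
      
      class GradBucket:
          """Flat buffer that only lives for the duration of the backward pass."""
      
          def __init__(self, numel: int, dtype=torch.float32, device="cpu"):
              self.numel = numel
              self.dtype = dtype
              self.device = device
              self.buffer = None  # nothing allocated until backward starts
      
          def setup(self):
              # Called at the start of backward: allocate the flat buffer.
              self.buffer = torch.zeros(self.numel, dtype=self.dtype, device=self.device)
      
          def teardown(self):
              # Called once gradient reduction is done: release the memory so it
              # can be reused for activations in forward and for the optimizer step.
              self.buffer = None
      
          def pack_small_grads(self, grads, threshold=2048):
              # Only small gradient tensors go through the bucket; large ones
              # would be reduced directly to avoid extra copies.
              small = [g.reshape(-1) for g in grads if g.numel() <= threshold]
              flat = torch.cat(small) if small else torch.empty(0, dtype=self.dtype)
              self.buffer[: flat.numel()].copy_(flat)
              return self.buffer[: flat.numel()]
      ```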
      
      * add PR number to changelog
      
      * filled in with memory number on 1.9
      
      * addressed comments
      
      * update comments
      
      * fix for 1.6
      
      * add a todo
      Co-authored-by: Min Xu <min.xu@acm.org>
  2. 26 Apr, 2021 1 commit
  3. 23 Apr, 2021 1 commit
    • [FSDP] relax checking root condition (#620) · d3b86d65
      shuyingsunshine21 authored
      * relax checking root condition
      
      * formatting
      
      * add unittest
      
      * add unittest to ci test list
      
      * isort for import of unittest
      
      * format black .
      
      * move test to list 1
      
      * add skip no cuda
      
      * black and isort
  4. 22 Apr, 2021 3 commits
  5. 21 Apr, 2021 1 commit
  6. 20 Apr, 2021 1 commit
  7. 19 Apr, 2021 1 commit
    • FSDP: fixing training with freezing weights (#614) · 24da3b11
      Min Xu authored
      
      
      * FSDP: fixing training with freezing weights
      
      - an assert is changed to catch this case correctly
      - added a unit test (based on Quentin's test code) for this case that
        compares DDP and FSDP
      
      fixes: #610
      
      * added test file to list 1
      
      * Use better and simpler code as suggested by Myle
      
      * testing both methods of freezing as well (the two methods are sketched below)
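      A hedged sketch of the two freezing approaches such a test can compare
      (illustrative only, not the actual test code):
      
      ```
      import torch
      from torch import nn
      
      model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
      trunk, head = model[0], model[1]
      
      # Method 1: turn off gradients on the frozen trunk.
      for p in trunk.parameters():
          p.requires_grad = False
      
      # Method 2: keep requires_grad=True, but hand only the head's parameters
      # to the optimizer, so the trunk is never updated.
      optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
      
      # The model (or its submodules) can then be wrapped with DDP or FSDP and
      # the resulting parameters compared between the two after a few steps.
      ```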
      Co-authored-by: Min Xu <min.xu@acm.org>
  8. 14 Apr, 2021 1 commit
  9. 13 Apr, 2021 1 commit
  10. 08 Apr, 2021 1 commit
  11. 07 Apr, 2021 1 commit
  12. 04 Apr, 2021 1 commit
  13. 03 Apr, 2021 1 commit
  14. 31 Mar, 2021 1 commit
    • [fix] FSDP: disable single rank process group for auto_wrap_bn and fixed mixed precision regnet test (#556) · a0458b98
      Min Xu authored
      
      * [fix] disable single rank process group for auto_wrap_bn
      
      - beefed up unit test with regnet-like model
      - found that the single-rank process group was causing problems
      - disabled it to enable convergence tests on the vissl side
      - use `raise e from None` to get a better assertion output
        in testing.py.
      
      * [test] fix regnet test for ddp+mixed_precision
      
      - the AMP context is needed with FSDP
      - worked around a difference between DDP & FSDP when bias=True
      - fixed a bug in input data generation that caused different ranks to
        have the same data and a wrong iteration count
      - added a TODO about needing a better loss and grad_scaler; reduced the
        number of iterations so there is no NaN
      - added (disabled) debugging code
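      A hedged, simplified sketch of that test pattern (illustrative only, not
      the actual regnet test; a sharded-aware grad scaler may be needed for the
      real FSDP case): the forward pass runs under autocast, a grad scaler
      handles loss scaling, and each rank seeds its data generation differently.
      
      ```
      import torch
      from torch.cuda.amp import autocast, GradScaler
      
      def run_rank(rank, model, optimizer, num_iters=3, device="cuda"):
          # `model` stands in for the DDP/FSDP-wrapped network under test
          # (here assumed to map 8 -> 2 features).
          # Seed per rank so the randomly generated inputs differ across ranks.
          torch.manual_seed(1234 + rank)
          scaler = GradScaler()
          for _ in range(num_iters):
              batch = torch.randn(4, 8, device=device)
              target = torch.randn(4, 2, device=device)
              optimizer.zero_grad()
              with autocast():  # AMP context wraps the forward pass and the loss
                  loss = torch.nn.functional.mse_loss(model(batch), target)
              scaler.scale(loss).backward()
              scaler.step(optimizer)
              scaler.update()
      ```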
      
      * lint
      
      * lint
      
      * add scaler
      
      * lint
      
      * scaler
      
      * add a real loss
      
      * seeding in the ranks
      
      * balance tests
      
      * run the AMP DDP==FSDP test only on CUDA version 11 and up
      
      * add relu inplace and a comment
      
      * make wrap_bn cover more cases in full precision mode
  15. 25 Mar, 2021 1 commit
  16. 20 Mar, 2021 1 commit
  17. 18 Mar, 2021 3 commits
  18. 17 Mar, 2021 1 commit
  19. 12 Mar, 2021 1 commit
  20. 09 Mar, 2021 3 commits
  21. 08 Mar, 2021 3 commits
  22. 06 Mar, 2021 1 commit
  23. 04 Mar, 2021 1 commit
  24. 02 Mar, 2021 2 commits
    • d2924670
      Myle Ott authored
    • [feat] Add context manager to FSDP for easier child module wrapping (#446) · f3359550
      Sean Naren authored
      This adds a context manager that makes it easier to wrap child modules with shared defaults.
      Usage:
      ```
      from fairscale.nn.misc import enable_wrap, wrap
      
      with enable_wrap(**handleful_of_important_params):
          layer_1 = wrap(torch.nn.Linear(5, 5))
          layer_2 = wrap(torch.nn.Linear(5, 5), flatten_parameters=True) # Override parameters if you'd like
      
      # outside the context manager, wrap() is a no-op and returns the plain Linear layer
      layer_1 = wrap(torch.nn.Linear(5, 5))
      ```
      If not within the FSDP context, this would be a no-op. This makes it easier to annotate layers without having to repeat the same parameters for every wrapped module.
  25. 01 Mar, 2021 1 commit
  26. 27 Feb, 2021 1 commit
  27. 26 Feb, 2021 2 commits
  28. 25 Feb, 2021 1 commit
  29. 24 Feb, 2021 1 commit
  30. 23 Feb, 2021 1 commit