- 06 Apr, 2021 1 commit
-
-
Benjamin Lefaudeux authored
-
- 05 Apr, 2021 3 commits
-
-
anj-s authored
* add model
* add offload regression benchmarks
* add golden data
* remove mp pipe benchmark
* fix lint
* remove rank
* add check for model type
* lint errors
-
Benjamin Lefaudeux authored
* making APIs more private
* linting
-
Benjamin Lefaudeux authored
* fixing given torchvision's change
-
- 04 Apr, 2021 3 commits
-
-
Sam Shleifer authored
-
msbaines authored
This test is flaky for torch >= 1.8.0.
-
Benjamin Lefaudeux authored
-
- 03 Apr, 2021 1 commit
-
-
Shruti Bhosale authored
-
- 02 Apr, 2021 6 commits
-
-
msbaines authored
NCCL all_to_all is now supported in PyTorch (since v1.8.0).
Fixes: #548
-
Min Xu authored
- releasing 0.3.3
- I need it in vissl for the auto_wrap_bn change
-
anj-s authored
-
Anjali Sridhar authored
-
Anjali Sridhar authored
-
anj-s authored
* add record_function support
* add more record_function cutpoints
* add more record_function cutpoints
* lint errors
* make string ids more specific
-
- 01 Apr, 2021 1 commit
-
-
msbaines authored
-
- 31 Mar, 2021 5 commits
-
-
Siddharth Goyal authored
-
msbaines authored
-
anj-s authored
* renaming/adding error messages
* address comments
* address comments
* add more comments
* add more comments
-
Min Xu authored
[fix] FSDP: disable single rank process group for auto_wrap_bn and fixed mixed precision regnet test (#556)
* [fix] disable single rank process group for auto_wrap_bn
  - beefed up unit test with regnet-like model
  - found that single-rank process group is causing problems
  - disabled it to enable convergence tests on the vissl side
  - use `raise e from None` to get a better assertion output in testing.py
* [test] fix regnet test for ddp+mixed_precision
  - need AMP context in FSDP
  - work around a difference between ddp & fsdp when bias=True
  - fixed a bug in input data generation that caused different ranks to have the same data with the wrong iteration count
  - added a TODO about needing a better loss and grad_scaler, and reduced iters so there is no nan
  - added (disabled) debugging code
* lint
* lint
* add scaler
* lint
* scaler
* add a real loss
* seeding in the ranks
* balance tests
* run AMP DDP==FSDP test only on cuda version 11 and up
* add relu inplace and comment
* make wrap_bn cover more cases in full precision mode
-
msbaines authored
-
- 30 Mar, 2021 1 commit
-
-
Benjamin Lefaudeux authored
* survive the model being moved to device post-construction
* make sure that a unit test would catch a regression
-
- 29 Mar, 2021 3 commits
-
-
msbaines authored
-
anj-s authored
* codecov testing
* codecov testing
* more changes for uploading cov
* fix invalid config
* fix invalid config
* modify name
* fix config

Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
-
msbaines authored
-
- 28 Mar, 2021 1 commit
-
-
msbaines authored
-
- 26 Mar, 2021 2 commits
- 25 Mar, 2021 3 commits
-
-
Benjamin Lefaudeux authored
-
Benjamin Lefaudeux authored
* re-activating unit test
* removing changes that slipped in
-
Sam Shleifer authored
Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
-
- 22 Mar, 2021 1 commit
-
-
Benjamin Lefaudeux authored
-
- 20 Mar, 2021 1 commit
-
-
Myle Ott authored
* Add new test for weight init (fails)
* Set FSDP.compute_device so summon_full_params works before module moves to CUDA
* Override FSDP.apply to enable custom weight init
-
- 19 Mar, 2021 3 commits
-
-
Benjamin Lefaudeux authored
* param buckets
* unifying the buckets
-
msbaines authored
-
msbaines authored
-
- 18 Mar, 2021 5 commits
-
-
Benjamin Lefaudeux authored
-
Min Xu authored
-
Benjamin Lefaudeux authored
* extracting the buckets in a dedicated class, fixing the resize_ bug
* adding a unit test
* copyright
-
Myle Ott authored
-
Benjamin Lefaudeux authored
* enabling disabled tests
-