- 12 Feb, 2021 1 commit
-
-
Benjamin Lefaudeux authored
* Better unit testing
* Make it possible to refresh the DDP assumptions when the model has changed. Make it optional so that you save some time
* Enabling accumulation tests
-
- 10 Feb, 2021 1 commit
-
-
Myle Ott authored
* Add fairscale.utils.containers
  Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
* Add fairscale.nn.misc.checkpoint_activations
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
-
- 09 Feb, 2021 1 commit
-
-
msbaines authored
-
- 05 Feb, 2021 1 commit
-
-
Benjamin Lefaudeux authored
fix a broken earlier commit, only worked for the first step
-
- 04 Feb, 2021 4 commits
-
-
msbaines authored
-
Benjamin Lefaudeux authored
* Adding a proper ddp parity / AMP unit test, overdue
* catch non-AMP pytorch
-
msbaines authored
-
msbaines authored
-
- 03 Feb, 2021 4 commits
-
-
Benjamin Lefaudeux authored
* precise skip, only if agent has only cpu
-
msbaines authored
-
Min Xu authored
* [feat] Add AdaScaleWrapper
  - This enables a different API for wrapping an optimizer with AdaScale.
  - This also enables AdaScale to be wrapped by OSS.
  - However, OSS wrapping AdaScale results in different optimization, and future research will be needed to study its effects.
  testing: add unit tests.
* addressed comment: typo
-
Benjamin Lefaudeux authored
* adding the .to(device) support + unit testing
* doc update
-
- 02 Feb, 2021 2 commits
-
-
Benjamin Lefaudeux authored
* adding a test to prove the interoperability with upstream pytorch
* updating the changelog
* eager state pruning
* pytorch 1.5 compat
-
Benjamin Lefaudeux authored
* no idea about the root issue, but it proved to be fairly narrow (gloo+cpu+python3.8+no cuda installed) so I guess that's out of scope for fairscale
-
- 30 Jan, 2021 1 commit
-
-
msbaines authored
-
- 29 Jan, 2021 2 commits
- 28 Jan, 2021 1 commit
-
-
Min Xu authored
* [test]: test adascale with oss
* minor fix
* add a small comment
* refactor: moved find_tensor_by_shape
* refactor: move test golden data into its own module
* refactor: simplified the train function
* refactor: added comments as suggested
-
- 27 Jan, 2021 2 commits
-
-
Benjamin Lefaudeux authored
-
msbaines authored
-
- 23 Jan, 2021 1 commit
-
-
Siddharth Goyal authored
* Add AMPnet implementation (clean version)
* Move ampnet to experimental
* Move stuff around pipeline
* Address review comments and fix pre-commit errors
* Refactor and modify delegate functionality
* Modify header in pipe.py
-
- 21 Jan, 2021 3 commits
-
-
Benjamin Lefaudeux authored
* working around broken mypy
-
Myle Ott authored
-
Myle Ott authored
-
- 20 Jan, 2021 1 commit
-
-
Benjamin Lefaudeux authored
-
- 15 Jan, 2021 1 commit
-
-
Benjamin Lefaudeux authored
* minor, but ease of life, one less papercut
-
- 11 Jan, 2021 1 commit
-
-
Benjamin Lefaudeux authored
* tentatively fixing the cpu version of circleci jobs, now pipe tests are the last ones standing
* fixing oss backcompat, trying to fix rpc in old pytorch also
* fixing the file based init in torch 1.5
-
- 08 Jan, 2021 3 commits
-
-
Benjamin Lefaudeux authored
* adding a parity unit test
* code review, better testing, use torch defaults and check for the loss, log world size
-
Benjamin Lefaudeux authored
-
Joshua Meier authored
* add additional unit test
* support model parallelism in oss
-
- 05 Jan, 2021 1 commit
-
-
Benjamin Lefaudeux authored
* adding the pytest timeout plugin to properly root out hanging tests
* removing redundant code, slightly more reasonable timeout, works on single cuda
* finding the root bug for some of the cpu hangs, rpc init
* propagating all the rpc init test changes to the pipe and model parallel tests
-
- 04 Jan, 2021 1 commit
-
-
Min Xu authored
* [feat] sync adascale from internal repo
  - tbd
  testing: tbd
* Update argument document of __init__
* update documentation around set_num_gradients_to_accumulate
* added checking code for proper API calling places
* rename internal APIs to make them internal
* updated changelog
* added support for add_param_group and its unit test
* added unit test for set_num_gradients_to_accumulate
* added debias_ewma unit test
* fixed test_set_num_gradients_to_accumulate (need zero_grad() call)
* added missing zero_grad() to test_lr_scheduler
* fixed test_add_param_group with respect to optim.zero_grad()
* added test_gradient_value
* added test_scale_not_equal_default for scale != world_size * grad_accum
* added test_unhook()
* removed print statements
* fixed a typo
* addressed Ben's comment
-
- 02 Jan, 2021 1 commit
-
-
Benjamin Lefaudeux authored
* fix typo, backend for CPU test
-
- 30 Dec, 2020 1 commit
-
-
Sean Naren authored
* Add function to add handle for sync BN
* Add test to ensure batch norm handles have been added
-
- 29 Dec, 2020 2 commits
-
-
Benjamin Lefaudeux authored
* properly catching a given test failing if there are not enough gpus
-
Joshua Meier authored
author: Joshua Meier
-
- 28 Dec, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* file based dist init
* nicer handling of broken world sizes vs. number of available GPUs, do not break but warn instead
-
- 22 Dec, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* fix, one liner
* adjust so that frozen trunks still get spread, even if this should have little consequence
* removing dead code, hopeful unit test fix
* now with some linting..
* adding a proper unit test case
-
- 19 Dec, 2020 1 commit
-
-
Benjamin Lefaudeux authored
[OSS] Getting rid of the "should bucket" hash table, just use a list + non trainable params fix (#259)
* Getting rid of the "should bucket" hash table, just use a list
  Properly handle all params, with or without requires_grad
* make sure that this case is unit tested
-
- 16 Dec, 2020 1 commit
-
-
Min Xu authored
* [doc]: AdaScale example and notes
* formatted notes correctly as suggested by Benjamin
* added feature and unit test to make sure lr_scheduler works
* update the example with lr_scheduler
* fixed doc with "make html"
* addressed Mike's suggestions
-