"vscode:/vscode.git/clone" did not exist on "c89cd065841d16d27ffaae81d10f2a93f569ab14"
- 27 Oct, 2021 3 commits
-
-
Min Xu authored
* added the failing test
* fixed the bug
* fine-tune the condition
* typo
* typo
* changelog and added test to test files

Co-authored-by: Min Xu <min.xu.public@gmail.com>
-
anj-s authored
-
Eugen Hotaj authored
Fixes #827.

Co-authored-by: Eugen Hotaj <ehotaj@fb.com>
-
- 24 Oct, 2021 1 commit
-
-
anj-s authored
* relax speed constraints
* relax the regression constraints
-
- 22 Oct, 2021 2 commits
-
-
anj-s authored
-
Eugen Hotaj authored
auto_shard.py currently uses torch.fx to create a symbolic DAG of operations and linearizes that DAG into an nn.Sequential so it can later be used for model offloading. This works in most cases but runs into issues for certain eager mode features, such as dynamic conditionals, shape-dependent computation, etc. This PR extends auto_shard.py to first run a preprocessing step which wraps any nn.Module which cannot be traced through. It adds a test for dynamic conditionals and updates existing failing test code. There are some immediate extensions to this approach which are marked as TODO in the code.
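The linearization step described above can be illustrated with a small, library-free sketch. It is not the code in auto_shard.py: the operation names and the `linearize` helper are hypothetical, and a plain topological sort stands in for whatever ordering the real implementation derives from the torch.fx graph. A module that cannot be traced through would simply appear here as one opaque node.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical operation graph: each key maps to the set of operations it
# depends on. These names are illustrative and do not come from auto_shard.py.
dependencies = {
    "conv": {"input"},
    "relu": {"conv"},
    "pool": {"relu"},
    "skip": {"input"},
    "add": {"pool", "skip"},
    "fc": {"add"},
}


def linearize(deps):
    """Return one execution order in which every op follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = linearize(dependencies)
print(order)
```

Any order produced this way can be executed one stage after another, which is the property a sequential container needs; the exact order among independent operations (for example `skip` versus `conv`) is not fixed.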
-
- 21 Oct, 2021 2 commits
-
-
anj-s authored
* update pytorch version for benchmarks
* reduce golden data precision check
-
anj-s authored
* update python version for CPU tests
* run CPU tests with updated PyTorch version
* update nightly and test PyTorch versions
* skip failing multiprocess pipe test
* always skip test
* always skip test
* always skip test
* lint error
* skip unsupported versions
* improve skip message
* lint errors
-
- 20 Oct, 2021 3 commits
-
-
anj-s authored
* add log for new memory tracker features
* add log for new memory tracker features
-
Quentin Duval authored
* [feat] layer memory tracking
* [feat] layer memory tracking (add tests in CI)
* [feat] layer memory tracking: doc typos
* [feat] layer memory tracking: mypy fixes
* [feat] layer memory tracking: fixes for FSDP all gather tracking on pytorch 1.9 and above
* [feat] layer memory tracking: lint
* [feat] layer memory tracking: mypy

Co-authored-by: QuentinDuval <QuentinDuval@users.noreply.github.com>
-
anj-s authored
-
- 19 Oct, 2021 1 commit
-
-
Rohan Varma authored
* fix
* remove dup file
-
- 28 Sep, 2021 1 commit
-
-
Anjali Sridhar authored
-
- 24 Sep, 2021 1 commit
-
-
Anjali Sridhar authored
-
- 22 Sep, 2021 1 commit
-
-
tmarkstrum authored
* update master branch to main
* added FAQ about updating the branch from master to main
* fixed some false positive corrections
* added what is new section
* fixed the quoted code area
* added release what is new section
* added a step in release.md
* fixed a word
-
- 21 Sep, 2021 1 commit
-
-
anj-s authored
-
- 20 Sep, 2021 1 commit
-
-
tmarkstrum authored
* [chore] 0.4.1 release
* put more details in one change log
-
- 17 Sep, 2021 1 commit
-
-
tmarkstrum authored
* add a toggle to disable the use of the nccl base collectives
* added todo to remove the toggle when the issue is resolved
-
- 13 Sep, 2021 1 commit
-
-
Benjamin Lefaudeux authored
-
- 12 Sep, 2021 2 commits
-
-
Min Xu authored
* add changelog for previous commit
* add changelog for previous commit
* add changelog for previous commit
* fix a merge induced error

Co-authored-by: Min Xu <min.xu.public@gmail.com>
-
Darryl Barnhart authored
* [fix] FSDP intra-backwards gradient accumulation.

  Ensure gradient reduction accumulates into the unsharded gradient tensor within a backward pass. This matters when an FSDP module is called multiple times within a forward pass, and reduction is _not_ deferred using activation checkpoint forward counters, bucketing or some other mechanism.

  Closes #780
* [refactor] Remove forward counters. Comments.

  Removed forward counters from the activation checkpointing utility, now that FSDP does not require them for correct operation. Added a more detailed comment about memory usage behaviour with gradient reduction.
* [refactor] Delete deprecated forward counter usage.
* [refactor] Add state assertion at end of pre-backward hook.
-
- 11 Sep, 2021 1 commit
-
-
Alex Xiao authored
Before this commit, output tensors of checkpointed modules always require grad, even if they shouldn't. This commit makes it so that the outputs of checkpointed modules only require grad if either the input requires grad or if the parameters require grad.

To achieve this, this commit also adds a new _unflattened_param_views attribute to modules being flattened. This allows the checkpointing to still access the parameters and check if gradients need to be computed.

Co-authored-by: Alex Xiao <axiao@fb.com>
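The rule stated in this commit message can be written down as a small predicate. This is only a sketch of the stated condition, not the fairscale implementation: `FakeTensor` and `output_requires_grad` are hypothetical names, and the stand-in class tracks nothing but a `requires_grad` flag.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor; it only carries requires_grad."""
    requires_grad: bool = False


def output_requires_grad(
    inputs: Iterable[FakeTensor], params: Iterable[FakeTensor]
) -> bool:
    """Outputs need grad if any input or any parameter requires grad."""
    return any(t.requires_grad for t in inputs) or any(
        p.requires_grad for p in params
    )


# A frozen module fed an input that does not require grad: no grad needed.
print(output_requires_grad([FakeTensor(False)], [FakeTensor(False)]))
# A trainable parameter is enough to make the output require grad.
print(output_requires_grad([FakeTensor(False)], [FakeTensor(True)]))
```

Checking the parameters is the part that needs access to the module's own weights, which is why the commit message mentions keeping parameter views reachable after flattening.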
-
- 10 Sep, 2021 2 commits
-
-
Min Xu authored
Co-authored-by: Min Xu <min.xu.public@gmail.com>
-
Benjamin Lefaudeux authored
-
- 07 Sep, 2021 1 commit
-
-
Achal Dixit authored
* [test] Added disable_checkpointing unit test
* [test] Added disable_checkpointing unit test (Clean-up)
* [test] Added disable_checkpointing unit test (Clean-up)
-
- 06 Sep, 2021 2 commits
-
-
-
Min Xu authored
[cleanup] CI test updates; mypy cleanup; partial broadcast_object cleanup; pre-commit documentation (#744)

* changelog; mypy; oss cleanup
* more broadcast_object cleanup in FSDP
* one more mypy fix
* retire pytorch 1.6 from circleci, add new nightly, add 1.8 LTS and 1.9 stable release
* update torch version for LTS
* minor fixes
* update cache key
* trying newer gpu VMs
* bump the cache
* update to gpu.medium, which should be 2 GPUs
* update nightly version
* add pre-commit instruction
* fixed CHANGELOG after merging
* updated to newer nightly
* retained the older broadcast function for older GPUs for oss.py
* fixed a bug
* added a comment
* fixing a test for pytorch 1.10
* testing a fix
* Update fairscale/optim/oss.py
* Update CONTRIBUTING.md

Co-authored-by: Min Xu <min.xu.public@gmail.com>
-
- 05 Sep, 2021 1 commit
-
-
Min Xu authored
* [bug] [FSDP] making sure we use full params for multiple backwards within an iteration
* changelog

Co-authored-by: Min Xu <min.xu.public@gmail.com>
-
- 18 Aug, 2021 1 commit
-
-
Vittorio Caggiano authored
-
- 12 Aug, 2021 4 commits
-
-
anj-s authored
-
Min Xu authored
* minor: changelog and pre-commit
* addressed comment
* update the release doc

Co-authored-by: Min Xu <min.xu.public@gmail.com>
-
anj-s authored
* add additional assert for checking if the requires_grad field is set
* fix lint errors
* add unit tests and address comments
-
anj-s authored
[FSDP][feature] Support returning the original parameter names after a model has been wrapped with FSDP (#755)

* checkpoint work
* fix lint issues
* remove debug statement
* remove print
* fix lint errors
* fix lint errors
* fix lint errors
* add comments and fix lint errors
* modified comments and tests
-
- 10 Aug, 2021 1 commit
-
-
Rahul Iyer authored
Pre-commit hook fails when run on all files for three reasons (see trace below):

1. Trailing whitespace on multiple files
2. mypy fails to load numpy and then subsequently fails to load LazyModule from pipe.py
3. isort sees issues with known_third_party packages

```
> pre-commit run --all-files
Trim Trailing Whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook

Fixing docs/source/conf.py
Fixing fairscale/experimental/nn/auto_shard.py
Fixing docs/source/deep_dive/activation_checkpointing.rst
Fixing docs/source/tutorials/pipe.rst
Fixing docs/source/installation_instructions.rst
Fixing docs/source/deep_dive/pipeline_parallelism.rst
Fixing docs/source/tutorials/activation_checkpointing.rst
Fixing docs/source/tutorials/offload_model.rst
Fixing docs/source/deep_dive/oss_sdp_fsdp.rst
Fixing docs/source/what_is_fairscale.rst
Fixing CHANGELOG.md
Fixing fairscale/experimental/nn/offload.py
Fixing docs/source/index.rst
Fixing docs/source/deep_dive/adascale.rst
Fixing README.md
Fixing docs/source/tutorials/oss.rst
Fixing docs/source/deep_dive/offload.rst

Check python ast.........................................................Passed
Check for merge conflicts................................................Passed
Don't commit to branch...................................................Passed
Check for added large files..............................................Passed
Fix End of Files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing requirements.txt
Fixing docs/source/getting_started.rst
Fixing docs/source/installation_instructions.rst
Fixing codecov.yml
Fixing docs/source/deep_dive/adascale.rst
Fixing docs/source/tutorials/oss.rst
Fixing docs/source/deep_dive/offload.rst

black....................................................................Passed
flake8...................................................................Passed
seed isort known_third_party.............................................Failed
- hook id: seed-isort-config
- exit code: 1
- files were modified by this hook
isort....................................................................Passed
mypy.....................................................................Failed
- hook id: mypy
- exit code: 2

setup.cfg:45: error: Error importing plugin 'numpy.typing.mypy_plugin': No module named 'numpy'
Found 1 error in 1 file (checked 197 source files)
```
-
- 02 Aug, 2021 2 commits
-
-
mrshenli authored
`wrap` from `auto_wrap` is used in the docstring example but is missing from the imports.
-
Howard Huang authored
-
- 01 Aug, 2021 1 commit
-
-
Min Xu authored
Co-authored-by: Min Xu <min.xu.public@gmail.com>
-
- 31 Jul, 2021 1 commit
-
-
Myle Ott authored
* Add test (broken) for gradient accumulation without no_sync context manager
* changelog
* no_sync to grad_acc renaming for tests
* clean up tmp files
* support grad acc without no_sync
* minor
* update changelog
* Update fairscale/nn/data_parallel/fully_sharded_data_parallel.py

  Better assertion from Sam.

  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* lint

Co-authored-by: Min Xu <min.xu.public@gmail.com>
Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
-
- 30 Jul, 2021 1 commit
-
-
Yanli Zhao authored
Move final backward callback to pre-backward hook of root FSDP instance

Summary: Move the final backward callback to the pre-backward hook of the root FSDP instance, so that it is always attached to the outermost backward call and fired after all backward calls are completed. Also added flags to check that the final backward callback is fired when it is required. If the root FSDP is checkpointed and called multiple times in forward, the checkpoint counter is used to make sure the final backward callback is queued inside the last inner backward call as well.

Test Plan: unit tests

* reformat
* nits and unit tests
* address some comments
* replace m with self
* reformat
* nits
* remove the fired flag
* assert state on root only
* comments
* comments
-
- 27 Jul, 2021 1 commit
-
-
Min Xu authored
* [chore] 0.3.9 release
* update changelog
* address comments

Co-authored-by: Min Xu <min.xu.public@gmail.com>
-