Unverified Commit 31e36453 authored by Min Xu, committed by GitHub

[fix] minor fixes for master branch (#792)



* add changelog for previous commit

* add changelog for previous commit

* add changelog for previous commit

* fix a merge induced error
Co-authored-by: Min Xu <min.xu.public@gmail.com>
parent 4fa2ab9b
@@ -13,6 +13,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  that don't require grad and hence can fail the previous assert. [#761]
- FSDP: Fixed a bug where, when multiple backward passes are called within an iteration,
  parameters' sharding state might be incorrect. [#775]
- activation checkpoint: Ensure outputs of checkpointed modules only require grad if either
  the input or the parameters require grad. [#787]
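The #787 entry above means the checkpoint wrapper decides whether its outputs should carry gradients by looking at both the inputs and the wrapped module's parameters. A minimal sketch of that rule, using a hypothetical `outputs_should_require_grad` helper rather than fairscale's actual code:

```python
import torch


def outputs_should_require_grad(module: torch.nn.Module, *inputs) -> bool:
    # Hypothetical helper illustrating the rule from the #787 entry:
    # checkpointed outputs should require grad only if some tensor input
    # requires grad, or some parameter of the wrapped module requires grad.
    inputs_need_grad = any(torch.is_tensor(x) and x.requires_grad for x in inputs)
    params_need_grad = any(p.requires_grad for p in module.parameters())
    return inputs_need_grad or params_need_grad
```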
### Added
- FSDP: Added support for returning the original names of parameters when `named_parameters` is called on
@@ -20,7 +22,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  call `named_parameters` under the `summon_full_params` context when using flattened params or original
  params. If you are using original params (i.e. flatten_params=False), calling `named_parameters` outside
  of the `summon_full_params` context will still return the original param names along with the local shards. [#755]
- FSDP: Ensure gradient reduction accumulates into the unsharded gradient tensor
within a backwards pass. This matters when an FSDP module is called
multiple times within a forward pass, and reduction is not deferred
using activation checkpoint forward counters, bucketing or some other
mechanism. [#784]
- activation checkpoint: Added a context manager to disable checkpoint in case the same wrapped module
needs to be checkpointed and not checkpointed in different parts of
the module forward pass. [#772]
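The #784 entry above concerns accumulating, rather than overwriting, the unsharded gradient when an FSDP module's gradient reduction fires more than once inside a single backward pass (for example, because the module was called twice in forward). A minimal sketch of the difference using plain tensors, purely illustrative and not fairscale's internals:

```python
import torch

# Two reduction results arrive for the same parameter within one backward pass.
unsharded_grad = torch.zeros(4)
first_reduction = torch.ones(4)
second_reduction = 2 * torch.ones(4)

# Without accumulation, a later reduction simply replaces an earlier one,
# and the first contribution is lost.
overwritten = first_reduction.clone()
overwritten = second_reduction.clone()
assert torch.equal(overwritten, second_reduction)

# Accumulating into the unsharded gradient tensor keeps both contributions.
unsharded_grad += first_reduction
unsharded_grad += second_reduction
assert torch.equal(unsharded_grad, torch.full((4,), 3.0))
```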
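The #772 entry above adds a way to turn checkpointing off for an already wrapped module in parts of the forward pass. The exact name and import path of fairscale's context manager are not shown in this diff, so the sketch below only illustrates the general pattern with a hypothetical thread-local flag; it is not fairscale's actual API:

```python
import threading
from contextlib import contextmanager

_local = threading.local()


@contextmanager
def disable_checkpointing():
    # Hypothetical context manager: temporarily mark checkpointing as disabled
    # for the current thread, restoring the previous state on exit.
    previous = getattr(_local, "disabled", False)
    _local.disabled = True
    try:
        yield
    finally:
        _local.disabled = previous


def checkpointing_disabled() -> bool:
    return getattr(_local, "disabled", False)
```

A checkpoint wrapper would consult `checkpointing_disabled()` in its forward and fall back to a plain call when it returns True, which is what lets the same wrapped module run both checkpointed and non-checkpointed within one forward pass.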
## [0.4.0] - 2021-07-31
### Fixed
...
@@ -277,6 +277,7 @@ class CheckpointFunction(torch.autograd.Function):
    with torch.no_grad(), enable_checkpointing():
        unpacked_args, unpacked_kwargs = unpack_kwargs(kwarg_keys, args)
        outputs = run_function(*unpacked_args, **unpacked_kwargs)
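        # The wrapped module comes through as the first unpacked argument; per the
        # #787 changelog entry above, it is kept so the parameters' requires_grad
        # state can later be checked when deciding whether outputs require grad.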
        the_module = unpacked_args[0]
    # Because we run with torch.no_grad(), we can't actually access
    # outputs.requires_grad. Instead, we manually compute it by
...