Add proper barriers around FSDP checkpointing
Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/621

FSDP checkpointing should be surrounded by barriers so that other ranks do not continue training while rank 0 is still checkpointing.

Also add a log message after the checkpoint finishes.

Reviewed By: wat3rBro

Differential Revision: D49541229

fbshipit-source-id: ac8c086eb0d65611be0b258e3006d9e14b7387ad
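The barrier pattern described in the summary can be sketched roughly as follows. This is not the d2go implementation: the helper name `save_checkpoint_with_barriers` and its parameters `save_fn`, `rank`, and `barrier` are hypothetical, and the barrier is passed in as a callable so the sketch runs without a distributed setup. It also assumes only rank 0 writes the checkpoint, as the summary implies.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def save_checkpoint_with_barriers(
    save_fn: Callable[[], None],   # hypothetical: writes the checkpoint to disk
    rank: int,                     # rank of the calling process
    barrier: Callable[[], None],   # hypothetical: blocks until all ranks arrive
) -> None:
    # First barrier: every rank reaches the checkpoint step before saving starts.
    barrier()
    if rank == 0:
        save_fn()
    # Second barrier: no rank resumes training until the save has finished.
    barrier()
    if rank == 0:
        # Log only after the checkpoint has finished.
        logger.info("Finished saving checkpoint")
```

In a real distributed job, `barrier` would typically be `torch.distributed.barrier` and `rank` would come from `torch.distributed.get_rank()`, with the process group already initialized.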