FullyShardedDataParallel: only return full state dict on rank 0 (#885)
* FullyShardedDataParallel: only return full state dict on rank 0 * Add flag and make rank 0 only optional * Add tests * Add docs * address comments * update comments * update torch nightly version * update torchvision number for torch nightly dependence * add changelog * Update CHANGELOG.md * Update CHANGELOG.md
Showing
Please register or sign in to comment