Olatunji Ruwase authored
* Support saving and loading ZeRO checkpoints on different data parallelism degree.
* Fix formatting
* Support checkpoint with varying GPU count in ZeRO stage 1
* Fix formatting
* Formatting fixes
* Update model tests
* Remove pprint
* Minor fix
* Fix formatting
* Update model tests

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
7ccc9daf