Handle parameter groups smaller than DP (#273)
* Load non-DeepSpeed checkpoints into ZeRO optimizer
* Handle parameters smaller than DP
* Formatting fixes
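
For illustration only, here is a minimal sketch of the edge case named in the title: a flattened parameter group with fewer elements than the data-parallel (DP) world size. It is not the code from this commit. The helper names `partition_bounds` and `padded_numel` are hypothetical, and the equal-sized, zero-padded partitioning scheme is an assumption about how such a group could be split.

```python
from typing import List, Tuple


def padded_numel(numel: int, dp_size: int) -> int:
    """Smallest multiple of dp_size that is >= numel (hypothetical helper)."""
    if dp_size <= 0:
        raise ValueError("dp_size must be positive")
    per_rank = -(-numel // dp_size)  # ceiling division
    return per_rank * dp_size


def partition_bounds(numel: int, dp_size: int) -> List[Tuple[int, int]]:
    """Return the (start, end) range of real elements owned by each DP rank.

    Assumed scheme: every rank gets an equal-sized slot; ranks past the end
    of the group receive an empty range and would hold only padding.
    """
    per_rank = padded_numel(numel, dp_size) // dp_size
    bounds = []
    for rank in range(dp_size):
        start = min(rank * per_rank, numel)
        end = min(start + per_rank, numel)
        bounds.append((start, end))
    return bounds


if __name__ == "__main__":
    # A group of 3 elements split across 8 DP ranks.
    for rank, (start, end) in enumerate(partition_bounds(3, 8)):
        print(rank, start, end)
```

With 3 elements and 8 ranks, the first three ranks each own one element and the remaining five own an empty range, so code that indexes into a rank's partition has to tolerate zero real elements.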