Merge branch 'dist_chkpt_act' into 'main'
Revisited distributing checkpointed activations along the tensor parallel ranks

See merge request ADLR/megatron-lm!311