Merge branch 'sequence_parallel' into 'main'
Sequence parallelism + attention checkpoint

See merge request ADLR/megatron-lm!413