Merge branch 'sequence_parallel' into 'main'
preallocating global buffer to avoid memory fragmentation

See merge request ADLR/megatron-lm!419
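
The idea named in the commit message can be illustrated with a minimal sketch. This is not the code from the merge request: the class `GlobalBuffer`, its `get` method, and the buffer name `"gather"` are hypothetical, and plain `bytearray` storage stands in for device tensors. The assumption is that repeated requests for scratch space under the same name reuse one backing allocation instead of allocating and freeing differently sized blocks each time.

```python
from math import prod


class GlobalBuffer:
    """Hypothetical helper: one reusable backing store per buffer name."""

    def __init__(self):
        self._buffers = {}  # name -> bytearray backing store

    def get(self, name, shape, itemsize=4):
        """Return a view over the first prod(shape) * itemsize bytes."""
        nbytes = prod(shape) * itemsize
        buf = self._buffers.get(name)
        if buf is None or len(buf) < nbytes:
            # Allocate only when nothing exists yet or the request outgrows
            # the current store; otherwise the existing store is reused.
            buf = bytearray(nbytes)
            self._buffers[name] = buf
        return memoryview(buf)[:nbytes]


pool = GlobalBuffer()
a = pool.get("gather", (4, 8))  # first request allocates 128 bytes
b = pool.get("gather", (2, 8))  # smaller request reuses the same store
```

Because both views share one backing store, a caller in this sketch must finish with one result before requesting the same name again; that is the trade-off for avoiding repeated allocations.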