Merge branch 'nmt-main' into 'main'
inverse_square_root learning param schedule

See merge request ADLR/megatron-lm!466
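For context, a minimal sketch of what an inverse-square-root learning rate schedule commonly looks like: linear warmup to a peak value, then decay proportional to the inverse square root of the step count. This is not taken from the merge request itself; the function name `inverse_square_root_lr` and the parameters `peak_lr` and `warmup_steps` are hypothetical, and the actual Megatron-LM implementation may differ.

```python
import math


def inverse_square_root_lr(step, peak_lr=1e-3, warmup_steps=4000):
    """Return the learning rate at a 1-indexed training step.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not the merge request's code.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if warmup_steps < 1:
        raise ValueError("warmup_steps must be >= 1")
    if step <= warmup_steps:
        # Linear warmup from near zero up to peak_lr.
        return peak_lr * step / warmup_steps
    # After warmup, decay as 1/sqrt(step), continuous at step == warmup_steps.
    return peak_lr * math.sqrt(warmup_steps / step)


if __name__ == "__main__":
    for s in (1, 2000, 4000, 16000, 64000):
        print(s, inverse_square_root_lr(s))
```

With these assumed defaults, the rate peaks at step 4000 and halves each time the step count quadruples after that.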