Merge branch 'refactor_learning_rate' into 'blendable_dataset'
Refactor the learning rate so that it is easier to base it on consumed samples.

See merge request ADLR/megatron-lm!179