Merge branch 'rescaling' into 'main'
Add support for signal-based dynamic checkpointing

See merge request ADLR/megatron-lm!361