    Samyamr/grad acc stage2 (#338) · 7240abf3
    Samyam Rajbhandari authored
    
    
    * Adding gradient accumulation support for ZeRO Stage 2. Changing all Megatron-LM tests to also test gradient accumulation
    
    * Gradient Accumulation support for Stage 2. Model tests added to test the feature
    
    * formatting
    
    * Update deepspeed_light.py
    
    removing comment
    
    * Update ds_config_func_bs8_zero1.json
    
    reverting this file. It's not needed for this PR
    
    * defining baseline prefix
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
run_func_test.py 10.9 KB