"tests/python/vscode:/vscode.git/clone" did not exist on "57281e9f21bb70225ed4b2f048e4b58338cac8db"
Samyamr/grad acc stage2 (#338)
* Adding gradient accumulation support for ZeRO Stage 2. Changing all Megatron-LM tests to also test gradient accumulation
* Gradient Accumulation support for Stage 2. Model tests added to test the feature
* formatting
* Update deepspeed_light.py
removing comment
* Update ds_config_func_bs8_zero1.json
reverting this file back. It's not needed for this PR
* defining baseline prefix
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>