Ruoxi authored
* Multiply lr scheduler steps by `num_processes`.
* Stop multiplying steps by gradient accumulation.
ece55227
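
The arithmetic behind the two changes can be sketched as follows. This is a minimal illustration, not the code from this commit: it assumes a training setup in which the scheduler wrapper advances the schedule `num_processes` times per optimizer step and already skips gradient-accumulation micro-steps, so the step counts handed to the scheduler scale with the process count only. The function and parameter names (`scheduler_step_counts`, `warmup_steps`, `max_train_steps`) are hypothetical.

```python
def scheduler_step_counts(warmup_steps: int, max_train_steps: int,
                          num_processes: int) -> tuple[int, int]:
    """Return (warmup, total) step counts to pass to the lr scheduler.

    Assumption: the scheduler is stepped num_processes times for every
    optimizer step, and is not stepped during gradient-accumulation
    micro-steps, so no accumulation factor appears here.
    """
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")
    # Scale by the number of processes only.
    scaled_warmup = warmup_steps * num_processes
    scaled_total = max_train_steps * num_processes
    return scaled_warmup, scaled_total


# Example: 500 warmup steps and 10,000 optimizer steps on 4 processes.
warmup, total = scheduler_step_counts(500, 10_000, num_processes=4)
```

Under these assumptions, multiplying by the gradient accumulation factor as well would stretch the schedule, and the learning rate would not reach the end of its decay by the last optimizer step.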