Commit 7cd531c4 authored by Serge Panev, committed by GitHub

[Dist][Optim] Change op order in SparseAdagrad to be numerically closer to PyTorch (#4253)


Signed-off-by: Serge Panev <spanev@nvidia.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
parent 8292bf32
@@ -255,7 +255,7 @@ class SparseAdagrad(DistSparseGradOptimizer):
 update_event.record()
 # update emb
-std_values = grad_state.add_(eps).sqrt_()
+std_values = grad_state.sqrt_().add_(eps)
 tmp = clr * grad_values / std_values
 tmp_dst = tmp.to(state_dev, non_blocking=True)
......
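The effect of the reordering can be illustrated with a minimal sketch. It uses plain Python floats rather than tensors, and the helper names `std_old` and `std_new`, the `eps` value, and the sample accumulator values are assumptions chosen for illustration, not taken from the commit. The old order computes sqrt(state + eps), while the new order computes sqrt(state) + eps, which is the order PyTorch's Adagrad uses. The two differ most when the accumulated squared gradient is close to zero.

```python
import math


def std_old(state_sum: float, eps: float) -> float:
    # Order before the change: add eps first, then take the square root.
    return math.sqrt(state_sum + eps)


def std_new(state_sum: float, eps: float) -> float:
    # Order after the change: take the square root first, then add eps.
    return math.sqrt(state_sum) + eps


eps = 1e-10  # assumed value, for illustration only

# Hypothetical accumulator values: zero, tiny, and moderate.
for state_sum in (0.0, 1e-12, 1.0):
    old = std_old(state_sum, eps)
    new = std_new(state_sum, eps)
    print(f"state_sum={state_sum:g} old={old:.6e} new={new:.6e}")
```

With a zero accumulator the old denominator is sqrt(eps) and the new one is eps, so the resulting step sizes can differ by several orders of magnitude. Once the accumulator is well above eps the two orders agree closely.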