Move EMA to after backward.
Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/494

Currently the EMA computation runs in the after-step hook. That hook sits on the critical path, where no other work is available to overlap with it, so it increases the training iteration time.

This diff moves the EMA computation to after the backward pass but before the optimizer step. At that point the CPU is waiting for the GPU to finish the backward pass anyway, so most of the EMA computation time on the CPU can be hidden. This change may completely hide the EMA CPU time. It reduces the EMA time from 20ms to 4ms, where the remaining 4ms is the GPU time.

However, because the update now runs before the optimizer step, the EMA is computed from the previous iteration's values. Since we train for many epochs, a one-iteration difference may not be significant.

Reviewed By: tglik

Differential Revision: D43527552

fbshipit-source-id: 1faa9d910b20cae0fc77da541bc0ad176bce18a8
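
The one-iteration lag described in the summary can be illustrated with a minimal pure-Python sketch. This is not the d2go implementation: the scalar weight, the quadratic loss, and the names `ema_update`, `train`, and `ema_before_step` are all hypothetical and chosen only for illustration. It models the ordering of the EMA update relative to the optimizer step, not the CPU/GPU overlap that motivates the change.

```python
def ema_update(ema, value, decay):
    """One exponential-moving-average update (hypothetical helper)."""
    return decay * ema + (1.0 - decay) * value


def train(num_iters, ema_before_step, lr=0.1, decay=0.9):
    """Toy training loop on a single scalar weight with loss w**2.

    ema_before_step=True  : EMA runs after backward, before the optimizer step.
    ema_before_step=False : EMA runs after the optimizer step.
    Returns a list of (weight, ema) pairs, one per iteration.
    """
    w = 1.0
    ema = w  # assumption: the EMA starts from the initial weight
    history = []
    for _ in range(num_iters):
        grad = 2.0 * w  # stands in for the backward pass
        if ema_before_step:
            # Weights have not been stepped yet, so the EMA sees the
            # values produced by the previous iteration.
            ema = ema_update(ema, w, decay)
        w = w - lr * grad  # stands in for optimizer.step()
        if not ema_before_step:
            ema = ema_update(ema, w, decay)
        history.append((w, ema))
    return history


if __name__ == "__main__":
    before = train(5, ema_before_step=True)
    after = train(5, ema_before_step=False)
    for k in range(4):
        # Under the initialization above, the two EMAs differ by one iteration.
        print(k, before[k + 1][1], after[k][1])
```

Under these assumptions the weights follow the same trajectory in both orderings, and the before-step EMA at iteration k+1 matches the after-step EMA at iteration k, which is the one-iteration difference the summary refers to.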