Grad scaler (#1277)
* add keyword argument of `grad_scaler`
* update test
* pass dtype to fwd_step_func
* add log
* calc loss in autocast, as per https://pytorch.org/docs/stable/amp.html#autocasting (see the sketch below)
* option to turn off autocast inside the forward_step function, since some users activate `autocast` outside the fwd/bwd functions
* add missing `disable_autocast` arg
* reorder args of the no-pipeline path
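A minimal sketch of the pattern these bullets describe, not the actual API of this PR: `forward_step_func`, `grad_scaler`, `dtype`, and `disable_autocast` mirror the names in the commit, while the step function itself, its batch/model arguments, and the wrapper name are hypothetical.

```python
import torch

def forward_backward_step(forward_step_func, batch, model,
                          grad_scaler=None, dtype=torch.float16,
                          disable_autocast=False):
    # Compute the forward pass, loss included, under autocast, following
    # https://pytorch.org/docs/stable/amp.html#autocasting. Users who already
    # wrap their whole training step in autocast can pass
    # disable_autocast=True to avoid nesting the context.
    with torch.cuda.amp.autocast(enabled=not disable_autocast, dtype=dtype):
        loss = forward_step_func(batch, model)
    if grad_scaler is not None:
        # Scale the loss before backward so small fp16 gradients
        # do not underflow to zero.
        grad_scaler.scale(loss).backward()
    else:
        loss.backward()
    return loss
```

With this shape, a caller typically constructs `torch.cuda.amp.GradScaler()` once, passes it in as `grad_scaler` on every step, and calls `grad_scaler.step(optimizer)` and `grad_scaler.update()` after the backward pass.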