    clean up variance options support by all fused optimizers: · 18062b69
    Deyu Fu authored
    Do not apply bias correction to epsilon (same as the recent upstream change)
    Do not apply bias correction to weight decay (consistent with upstream AdamW)
    Add adam_w_mode to FusedAdam/LAMB to select L2 regularization or decoupled weight decay (Adam vs AdamW)
    Correctly document reg_inside_moment as distinct from adam_w_mode in FusedNovoGrad
    Remove legacy eps_mode from FusedAdam
    Use float as the internal math type across fused optimizers
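
    For illustration only, a minimal scalar sketch of the update the message describes. It is not the fused CUDA kernel: the function name adam_step and its signature are hypothetical, and the arithmetic follows the standard Adam/AdamW formulas under the assumption that epsilon and weight decay are both left out of bias correction.

```python
import math

def adam_step(p, g, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, adam_w_mode=True,
              bias_correction=True):
    """One scalar Adam/AdamW step (hypothetical helper, not the apex API)."""
    if not adam_w_mode:
        # L2 mode (classic Adam): fold the decay term into the gradient,
        # so it flows through both moment estimates.
        g = g + weight_decay * p

    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g

    if bias_correction:
        c1 = 1.0 - beta1 ** step
        c2 = 1.0 - beta2 ** step
    else:
        c1 = c2 = 1.0

    # eps is added after bias correction, so it is not itself rescaled.
    denom = math.sqrt(v / c2) + eps
    update = (m / c1) / denom

    if adam_w_mode:
        # Decoupled weight decay (AdamW): added outside the moments
        # and without any bias correction.
        update = update + weight_decay * p

    p = p - lr * update
    return p, m, v
```

    With a zero gradient, adam_w_mode=True shrinks the parameter by exactly lr * weight_decay * p, while adam_w_mode=False routes the decay term through the moments, so the first step has magnitude close to lr.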
amp_C_frontend.cpp 2.69 KB