[transformer] Warn only when `gradient_accumulation_fusion` is `True` and `fused_weight_gradient_mlp_cuda` is missing (#1317)
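
A minimal sketch of the condition this commit title describes, not the actual patch: the warning fires only if `gradient_accumulation_fusion` is `True` and the `fused_weight_gradient_mlp_cuda` extension cannot be found. The helper name `maybe_warn_missing_fused_kernel`, its `extension_name` parameter, and the warning text are hypothetical and chosen for illustration.

```python
import importlib.util
import warnings


def maybe_warn_missing_fused_kernel(
    gradient_accumulation_fusion: bool,
    extension_name: str = "fused_weight_gradient_mlp_cuda",
) -> bool:
    """Warn only when fusion is requested and the extension is missing.

    Hypothetical helper for illustration. Returns True if a warning was emitted.
    """
    # Fusion not requested: the extension is irrelevant, so stay silent.
    if not gradient_accumulation_fusion:
        return False

    # Fusion requested and the extension is importable: nothing to report.
    if importlib.util.find_spec(extension_name) is not None:
        return False

    # Fusion requested but the extension is missing: this is the only case that warns.
    warnings.warn(
        f"`gradient_accumulation_fusion` is True but `{extension_name}` "
        "could not be found.",
        RuntimeWarning,
        stacklevel=2,
    )
    return True
```

With this shape, users who never enable `gradient_accumulation_fusion` see no warning even when the extension is not built.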