fix: update grad_output quant to avoid redundant work (#1736)
* fix: update grad_output quant to avoid redundant work Signed-off-by:kshitij12345 <kshitijkalambarkar@gmail.com> * add test Signed-off-by:
kshitij12345 <kshitijkalambarkar@gmail.com> * don't keep only columnwise quant if requires_dgrad=False Signed-off-by:
kshitij12345 <kshitijkalambarkar@gmail.com> * fix stray merge Signed-off-by:
kshitij12345 <kshitijkalambarkar@gmail.com> * fix for ctx.use_bias is True case Signed-off-by:
kshitij12345 <kshitijkalambarkar@gmail.com> * Skip if FP8 not available Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
kshitij12345 <kshitijkalambarkar@gmail.com> Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Showing
Please register or sign in to comment