Commit eec890c1 authored by Wentao Ye, committed by GitHub

[Bug] Fix B200 DeepGEMM E8M0 Accuracy Issue (#22399)


Signed-off-by: yewentao256 <zhyanwentao@126.com>
parent 46a13949
@@ -799,7 +799,8 @@ def requant_weight_ue8m0_inplace(
 s_exp = s_exp[:m_cur, :k_cur]
 w_dq = w_q.to(torch.float32) * s_exp
 # Re-quantise using power-of-two scaling (UE8M0).
-w_requant, s_requant = per_block_cast_to_fp8(w_dq, [block_m, block_k])
+w_requant, s_requant = per_block_cast_to_fp8(w_dq, [block_m, block_k],
+                                             use_ue8m0=True)
 # Write back the results in-place.
 w_q.copy_(w_requant)
......
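
The change passes `use_ue8m0=True` so that the re-quantised block scales are powers of two, as the in-code comment requires. The sketch below illustrates the idea in plain Python. It is not the `per_block_cast_to_fp8` implementation: the helper names `ceil_to_power_of_two` and `block_scale` are hypothetical, and the `1e-4` clamp is an assumed guard against all-zero blocks. The value 448 is the largest finite magnitude of the FP8 E4M3 format.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def ceil_to_power_of_two(x: float) -> float:
    """Round a positive value up to the nearest power of two."""
    return 2.0 ** math.ceil(math.log2(x))


def block_scale(amax: float, use_ue8m0: bool) -> float:
    """Scale that maps a block's absolute maximum into the FP8 range.

    With use_ue8m0=True the scale is rounded up to a power of two, so it
    can be stored as an exponent only (no mantissa bits).
    """
    scale = max(amax, 1e-4) / FP8_E4M3_MAX  # assumed clamp for all-zero blocks
    return ceil_to_power_of_two(scale) if use_ue8m0 else scale


# A block whose largest magnitude is 100.0:
plain = block_scale(100.0, use_ue8m0=False)  # arbitrary float scale
pow2 = block_scale(100.0, use_ue8m0=True)    # power-of-two scale
```

Rounding the scale up rather than down keeps `amax / scale` within the FP8 range, at the cost of using slightly less of that range.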