Commit 8b8c209e authored by Eldar Kurtić, committed by GitHub

static_scaled_fp8_quant should not run when scale.numel is not 1 (#20076)

parent 23a04e08
@@ -1276,7 +1276,7 @@ def scaled_fp8_quant(
             torch.ops._C.dynamic_scaled_fp8_quant(output, input, scale)
     else:
         # num_token_padding not implemented for this case
-        assert (scale.numel() == 1 or num_token_padding is None)
+        assert (scale.numel() == 1 and num_token_padding is None)
         torch.ops._C.static_scaled_fp8_quant(output, input, scale)
 
     return output, scale
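
The effect of the one-word change is easiest to see by evaluating both guards side by side. The following is a minimal sketch in plain Python with no torch dependency: `old_guard` and `new_guard` are hypothetical helper names introduced only for illustration, `scale_numel` stands in for `scale.numel()`, and the example values 4096 and 16 are arbitrary. It suggests that the old `or` condition let a multi-element scale reach `static_scaled_fp8_quant` whenever `num_token_padding` was `None`, which is the case named in the commit title.

```python
from typing import Optional


def old_guard(scale_numel: int, num_token_padding: Optional[int]) -> bool:
    # Condition before the commit: passes if EITHER check holds.
    return scale_numel == 1 or num_token_padding is None


def new_guard(scale_numel: int, num_token_padding: Optional[int]) -> bool:
    # Condition after the commit: passes only if BOTH checks hold.
    return scale_numel == 1 and num_token_padding is None


# (scale_numel, num_token_padding) combinations; the values are illustrative.
cases = [(1, None), (1, 16), (4096, None), (4096, 16)]

# Map each case to (old result, new result).
results = {case: (old_guard(*case), new_guard(*case)) for case in cases}

for (numel, padding), (old_ok, new_ok) in results.items():
    print(f"numel={numel:<5} padding={padding!s:<5} old={old_ok!s:<5} new={new_ok}")
```

Under these assumptions only the first case, a single-element scale with no padding, passes the new guard. The new condition also rejects a single-element scale combined with `num_token_padding`, which matches the comment that padding is not implemented on the static path.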