Unverified Commit c7fc6b13 authored by Lucia Fang, committed by GitHub

Fix incompatibility with non-CUDA platforms for nvfp4 (#23478)


Signed-off-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>
parent ad788684
@@ -47,8 +47,10 @@ QUANT_OPS: dict[QuantKey, OpOverload] = {
     torch.ops._C.dynamic_scaled_fp8_quant.default,  # noqa: E501
     kFp8DynamicTokenSym:
     torch.ops._C.dynamic_per_token_scaled_fp8_quant.default,  # noqa: E501
-    kNvfp4Quant: torch.ops._C.scaled_fp4_quant.default,  # noqa: E501
 }
+if current_platform.is_cuda() and hasattr(torch.ops._C, "scaled_fp4_quant"):
+    QUANT_OPS[
+        kNvfp4Quant] = torch.ops._C.scaled_fp4_quant.default  # noqa: E501
 
 
 class FusedRMSQuantKey(NamedTuple):
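
The pattern in this change can be illustrated outside the codebase. The sketch below is a minimal stand-in, not the project's code: `build_quant_ops`, the string keys, and the `SimpleNamespace` objects are hypothetical substitutes for `QUANT_OPS`, the `QuantKey` constants, and `torch.ops._C`. It assumes that looking up a missing op raises `AttributeError`, which is why an unconditional dict entry would fail as soon as the module is imported on a build without the FP4 kernel.

```python
from types import SimpleNamespace


def build_quant_ops(ops_namespace, is_cuda: bool) -> dict:
    """Build a key -> op table, registering the FP4 op only when available.

    `ops_namespace` is a hypothetical stand-in for a compiled-op namespace.
    """
    # Entries that are assumed to exist on every supported platform.
    table = {
        "fp8_dynamic_token": ops_namespace.dynamic_per_token_scaled_fp8_quant,
    }
    # Guarded entry: check both the platform and the presence of the op
    # before touching the attribute, so a missing op never raises here.
    if is_cuda and hasattr(ops_namespace, "scaled_fp4_quant"):
        table["nvfp4"] = ops_namespace.scaled_fp4_quant
    return table


# A build that ships the FP4 op, and one that does not.
cuda_ops = SimpleNamespace(
    dynamic_per_token_scaled_fp8_quant=object(),
    scaled_fp4_quant=object(),
)
other_ops = SimpleNamespace(dynamic_per_token_scaled_fp8_quant=object())

cuda_table = build_quant_ops(cuda_ops, is_cuda=True)
other_table = build_quant_ops(other_ops, is_cuda=False)
```

With this shape, code that consumes the table can test for the key (for example `"nvfp4" in table`) rather than assuming the op is always registered.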