Unverified commit b532a5fd authored by xiaobochen, committed by GitHub

fix moe-ep accuracy issue for fp8 (#2489)

parent a0592c05
@@ -644,6 +644,10 @@ class Fp8EPMoEMethod(Fp8MoEMethod):
                         "QuantConfig has static quantization, but found "
                         "activation scales are None."
                     )
+            layer.w13_weight_scale = torch.nn.Parameter(
+                torch.max(layer.w13_weight_scale, dim=1).values,
+                requires_grad=False,
+            )
             return
 
     def apply(
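For reference, the added lines collapse the per-shard w13 weight scales into a single scale per expert by taking the maximum along the shard dimension; the larger scale is the conservative choice because it can represent the dynamic range of both shards. Below is a minimal sketch of the effect under an assumed [num_experts, 2] scale layout (one entry each for the gate and up projections). The shapes and example values are illustrative, not taken from the repository; only the torch.max(..., dim=1).values reduction mirrors the diff.

import torch

# Assumed layout: one fp8 weight scale per (expert, shard), where the two
# shards are the gate (w1) and up (w3) projections fused into w13.
w13_weight_scale = torch.tensor(
    [[0.02, 0.05],
     [0.01, 0.01],
     [0.10, 0.03],
     [0.04, 0.08]]
)

# Same reduction as the added code: keep the larger of the two shard scales
# for each expert, yielding a single per-expert scale vector.
per_expert_scale = torch.max(w13_weight_scale, dim=1).values
print(per_expert_scale)        # tensor([0.0500, 0.0100, 0.1000, 0.0800])
print(per_expert_scale.shape)  # torch.Size([4]) -- one scale per expert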