Commit 0df25101 authored by rainkert, committed by GitHub

[Bugfix] Fix gptq_marlin for deepseek-v3 (#13750)


Signed-off-by: dangshunya <dangshunya@baichuan-inc.com>
Co-authored-by: dangshunya <dangshunya@baichuan-inc.com>
parent e123aafd
@@ -569,7 +569,9 @@ class GPTQMarlinMoEMethod(FusedMoEMethodBase):
         replace_parameter(layer, "w13_scales", marlin_w13_scales)
         marlin_w2_scales = marlin_moe_permute_scales(
             s=layer.w2_scales,
-            size_k=layer.w2_scales.shape[1] * self.quant_config.pack_factor,
+            size_k=layer.w2_scales.shape[1] *
+            (self.quant_config.group_size if self.quant_config.group_size != -1
+             else self.quant_config.pack_factor),
             size_n=layer.w2_scales.shape[2],
             group_size=self.quant_config.group_size,
         )
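
The effect of the changed `size_k` expression can be illustrated with plain integers. This is a minimal sketch, not code from the patch: the helper names `size_k_before` and `size_k_after` are hypothetical, and it assumes that for grouped quantization the second dimension of `w2_scales` holds one entry per group, i.e. `K // group_size`. The sample values (K = 2048, group size 128, 4-bit weights packed into 32-bit words) are illustrative only.

```python
def size_k_before(num_scale_rows: int, pack_factor: int) -> int:
    # Expression on the removed line: always multiply by pack_factor.
    return num_scale_rows * pack_factor


def size_k_after(num_scale_rows: int, group_size: int, pack_factor: int) -> int:
    # Expression on the added lines: multiply by group_size when grouping
    # is enabled, and fall back to pack_factor when group_size == -1.
    multiplier = group_size if group_size != -1 else pack_factor
    return num_scale_rows * multiplier


# Illustrative values (assumptions, not taken from the commit).
K = 2048
GROUP_SIZE = 128
PACK_FACTOR = 32 // 4  # eight 4-bit values per 32-bit word

# Assumed layout: one row of scales per group along K.
num_scale_rows = K // GROUP_SIZE

print(size_k_before(num_scale_rows, PACK_FACTOR))             # 128
print(size_k_after(num_scale_rows, GROUP_SIZE, PACK_FACTOR))  # 2048
```

Under that assumption, the old expression recovers K only when `group_size` happens to equal `pack_factor`, whereas the new one recovers K for any positive group size and leaves the `group_size == -1` path unchanged.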
......