Commit 4d251ad0 authored by Michael Goin, committed by GitHub

Fix CompressedTensorsWNA16MoE with grouped scales (#13769)

parent 18e50593
@@ -527,7 +527,8 @@ class CompressedTensorsWNA16MoEMethod(CompressedTensorsMoEMethod):
         replace_tensor("w13_weight_scale", marlin_w13_scales)
         marlin_w2_scales = marlin_moe_permute_scales(
             layer.w2_weight_scale,
-            layer.w2_weight_scale.shape[1] * self.packed_factor,
+            layer.w2_weight_scale.shape[1] *
+            (self.group_size if self.group_size != -1 else self.packed_factor),
             size_k2,
             self.group_size,
             self.num_bits,
...
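
For readers who want to see what the changed argument evaluates to, here is a minimal standalone sketch. It is not part of the commit: the helper name `w2_scale_size_k` and the example numbers are hypothetical, and `packed_factor = 8` assumes 4-bit weights packed into 32-bit words. The function only mirrors the expression on the added lines.

```python
def w2_scale_size_k(scale_dim1: int, group_size: int, packed_factor: int) -> int:
    """Hypothetical helper mirroring the second argument in the patched call.

    scale_dim1 stands in for layer.w2_weight_scale.shape[1].
    """
    # New behaviour: multiply by group_size when scales are grouped,
    # and fall back to packed_factor only when group_size == -1.
    multiplier = group_size if group_size != -1 else packed_factor
    return scale_dim1 * multiplier


# Illustrative values only (assumed, not taken from the commit).
packed_factor = 8  # assumption: 32 bits // 4-bit weights

grouped = w2_scale_size_k(scale_dim1=32, group_size=128, packed_factor=packed_factor)
ungrouped = w2_scale_size_k(scale_dim1=1, group_size=-1, packed_factor=packed_factor)
old_grouped = 32 * packed_factor  # what the removed line would have passed

print(grouped, ungrouped, old_grouped)
```

With these assumed values, the old expression and the new one differ only in the grouped case, which is the case the commit title refers to.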