Unverified commit 56383649, authored by Rex, committed by GitHub

Refactor 2 awq gemm kernels into m16nXk32 (#2723)


Co-authored-by: Chunan Zeng <chunanzeng@Chunans-Air.attlocal.net>
parent 4ca2c358
@@ -145,8 +145,8 @@ class AWQLinearMethod(LinearMethodBase):
                       x: torch.Tensor,
                       bias: Optional[torch.Tensor] = None) -> torch.Tensor:
         qweight = weights["qweight"]
-        qzeros = weights["qzeros"]
         scales = weights["scales"]
+        qzeros = weights["qzeros"]
         pack_factor = self.quant_config.pack_factor
         out_shape = (x.shape[:-1] + (qweight.shape[-1] * pack_factor, ))
         reshaped_x = x.reshape(-1, x.shape[-1])
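
For readers following the shape arithmetic in this hunk, here is a minimal sketch of what `out_shape` and `reshaped_x` evaluate to. It works on plain tuples rather than tensors, and the helper names `awq_out_shape` and `flattened_rows` are hypothetical, introduced only for illustration. It also assumes 4-bit weights packed into 32-bit integers, so that `pack_factor` is 8; the commit page itself does not state the value.

```python
from typing import Tuple


def awq_out_shape(x_shape: Tuple[int, ...],
                  qweight_shape: Tuple[int, ...],
                  pack_factor: int) -> Tuple[int, ...]:
    """Tuple version of `x.shape[:-1] + (qweight.shape[-1] * pack_factor, )`.

    Hypothetical helper: the last dimension of the packed weight is
    multiplied by pack_factor to recover the unpacked output width.
    """
    return tuple(x_shape[:-1]) + (qweight_shape[-1] * pack_factor,)


def flattened_rows(x_shape: Tuple[int, ...]) -> Tuple[int, int]:
    """Shape produced by `x.reshape(-1, x.shape[-1])` (hypothetical helper)."""
    rows = 1
    for dim in x_shape[:-1]:
        rows *= dim
    return (rows, x_shape[-1])


# Assumption: 4-bit weights packed into 32-bit ints, so 8 values per int.
WEIGHT_BITS = 4
PACK_FACTOR = 32 // WEIGHT_BITS

x_shape = (2, 16, 4096)                       # (batch, seq_len, in_features)
qweight_shape = (4096, 4096 // PACK_FACTOR)   # packed along the output dim

print(awq_out_shape(x_shape, qweight_shape, PACK_FACTOR))  # (2, 16, 4096)
print(flattened_rows(x_shape))                             # (32, 4096)
```

The leading dimensions of `x` pass through unchanged; only the last dimension is replaced by the unpacked output width, while the GEMM itself sees a 2-D view of the input.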