Commit d20e2611, authored by Qubitium-ModelCloud, committed by GitHub

Fix non-contiguous input passed to Marlin kernel (#15319)

parent f622dbcf
@@ -115,6 +115,10 @@ class MarlinLinearKernel(MPLinearKernel):
                       layer: torch.nn.Module,
                       x: torch.Tensor,
                       bias: Optional[torch.Tensor] = None) -> torch.Tensor:
+        # Marlin requires a contiguous memory layout;
+        # prefix caching may cause x to be non-contiguous.
+        x = x.contiguous()  # no-op if already contiguous
         c = self.config
         w_q, w_s, w_zp, w_gidx = self._get_weight_params(layer)
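
The effect of the added `x.contiguous()` call can be illustrated in isolation. The following is a minimal sketch, not code from this commit: the helper `ensure_contiguous` is hypothetical, and the non-contiguous input is produced here by column slicing purely for illustration (the commit attributes the real case to prefix caching, which is not reproduced here).

```python
import torch


def ensure_contiguous(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper mirroring the line added in this commit."""
    # Returns x itself when it is already contiguous, otherwise a dense copy.
    return x.contiguous()


# A column slice of a row-major tensor is a strided view, not a dense buffer.
full = torch.arange(12, dtype=torch.float32).reshape(3, 4)
view = full[:, :2]  # shape (3, 2), but rows are still 4 elements apart

dense = ensure_contiguous(view)  # copies into a fresh dense buffer

already_dense = torch.zeros(3, 2)
same = ensure_contiguous(already_dense)  # no copy needed

print(view.is_contiguous(), view.stride())
print(dense.is_contiguous(), dense.stride())
print(same.data_ptr() == already_dense.data_ptr())
```

The copy is only paid when the input is actually strided, which is why the call is cheap to add unconditionally before handing `x` to a kernel that assumes a dense layout.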