"vscode:/vscode.git/clone" did not exist on "bc4eb65b5492b4f84a1b714bfc14bcff73d401f1"
Unverified commit 7938d121, authored by DorBernsohn, committed by GitHub

[Bugfix] Fix CPU backend crash in KV cache block zeroing (#37550)


Signed-off-by: DorBernsohn <dor.bernsohn@gmail.com>
parent debd6e76
@@ -88,6 +88,11 @@ class CPUModelRunner(GPUModelRunner):
     def _sync_device(self) -> None:
         pass
 
+    def _zero_block_ids(self, block_ids: list[int]) -> None:
+        # CPU attention assigns -INF to logits at invalid positions,
+        # so stale KV cache data never affects computation.
+        pass
+
     def get_dp_padding(self, num_tokens: int) -> tuple[int, torch.Tensor | None]:
         # Note: For CPU backend, dp padding is not required for now.
         return 0, None
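
The comment in `_zero_block_ids` argues that zeroing is unnecessary because invalid positions are masked to -INF before the softmax. The sketch below illustrates that argument with a toy single-query attention in pure Python. It is not vLLM's CPU attention kernel: `masked_attention`, `valid_len`, and the sample data are hypothetical names chosen for illustration, and the stale slots are assumed to hold finite values.

```python
import math


def masked_attention(query, keys, values, valid_len):
    """Toy single-query softmax attention (illustrative, not vLLM code).

    Slots at index >= valid_len are treated as invalid: their logits are
    set to -inf, so their softmax weight is exactly 0.0.
    """
    # Logits: dot product for valid slots, -inf for invalid ones.
    scores = [
        sum(q * k for q, k in zip(query, key)) if i < valid_len else -math.inf
        for i, key in enumerate(keys)
    ]
    peak = max(scores)  # finite as long as valid_len >= 1
    weights = [math.exp(s - peak) for s in scores]  # exp(-inf) == 0.0
    total = sum(weights)
    dim = len(values[0])
    # Weighted sum of values; invalid slots contribute weight 0.0.
    return [
        sum(w * v[d] for w, v in zip(weights, values)) / total
        for d in range(dim)
    ]


query = [1.0, 0.5]

# Cache whose unused slots (indices 2 and 3) were zeroed.
keys_zeroed = [[0.2, 0.1], [0.4, -0.3], [0.0, 0.0], [0.0, 0.0]]
values_zeroed = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0]]

# Same cache, but the unused slots still hold stale (finite) data.
keys_stale = [[0.2, 0.1], [0.4, -0.3], [9e3, -7e3], [123.0, 456.0]]
values_stale = [[1.0, 2.0], [3.0, 4.0], [-5e4, 8e4], [77.0, -99.0]]

out_zeroed = masked_attention(query, keys_zeroed, values_zeroed, valid_len=2)
out_stale = masked_attention(query, keys_stale, values_stale, valid_len=2)
print(out_zeroed == out_stale)
```

Under these assumptions the two outputs agree, which is why a no-op override is sufficient in this toy model. The sketch only covers finite stale values; a multiply-by-zero scheme like this one would not suppress NaN or infinite entries, so whether the real kernel reads invalid slots at all is a separate question that this example does not answer.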