Commit 0d8ce320 authored by Nicolò Lucchesi, committed by khluu

[Bugfix] Fix DeepseekV32 `AssertionError: num_kv_heads == 1` (#33090)


Signed-off-by: NickLucche <nlucches@redhat.com>
(cherry picked from commit 492a7983)
parent d51e1f8b
...
@@ -322,7 +322,7 @@ class TpKVTopology:
         # Figure out whether the first dimension of the cache is K/V
         # or num_blocks. This is used to register the memory regions correctly.
         kv_cache_shape = self.attn_backend.get_kv_cache_shape(
-            num_blocks=1, block_size=16, num_kv_heads=4, head_size=1
+            num_blocks=1, block_size=16, num_kv_heads=1, head_size=1
         )
         # Non-MLA backends caches have 5 dims [2, num_blocks, H,N,D],
         # we just mock num_blocks to 1 for the dimension check below.
...
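
The following is a minimal sketch of why the mocked `num_kv_heads` value matters, not the actual vLLM code. `StandardBackend`, `SingleKVHeadBackend`, `probe_shape`, and `is_kv_first` are hypothetical names, the shapes they return are illustrative, and the final check is only one plausible form of the "dimension check below", which this hunk does not show. It assumes that a backend accepting only a single KV head asserts `num_kv_heads == 1` inside `get_kv_cache_shape`, as the commit title suggests.

```python
# Hypothetical stand-ins for attention backends; not vLLM's real classes.
class StandardBackend:
    @staticmethod
    def get_kv_cache_shape(num_blocks, block_size, num_kv_heads, head_size):
        # K/V-first layout with 5 dims, as described in the diff comment.
        return (2, num_blocks, block_size, num_kv_heads, head_size)


class SingleKVHeadBackend:
    @staticmethod
    def get_kv_cache_shape(num_blocks, block_size, num_kv_heads, head_size):
        # Assumed behavior: this kind of backend rejects any other head count.
        assert num_kv_heads == 1
        # Illustrative num_blocks-first layout (assumption).
        return (num_blocks, block_size, head_size)


def probe_shape(backend, num_kv_heads):
    # Mock call used only to inspect the layout; the values are placeholders.
    return backend.get_kv_cache_shape(
        num_blocks=1, block_size=16, num_kv_heads=num_kv_heads, head_size=1
    )


def is_kv_first(backend):
    # Probing with num_kv_heads=1 is accepted by both kinds of backend.
    shape = probe_shape(backend, num_kv_heads=1)
    # With num_blocks mocked to 1, a leading 2 in a 5-dim shape means K/V first.
    return len(shape) == 5 and shape[0] == 2
```

Under these assumptions, probing `SingleKVHeadBackend` with `num_kv_heads=4` raises `AssertionError`, while probing with `num_kv_heads=1` works for both backends. Since the mocked values are only used to inspect the layout, the smaller value loses nothing.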