Commit 9f1710f1 authored by Ying Zhong, committed by GitHub

Fix mla prefill context performance (#13897)


Signed-off-by: ZhongYingMatrix <zhongyingmatrix@gmail.com>
parent e642ec96
@@ -1308,7 +1308,7 @@ class MLACommonImpl(MLAAttentionImpl[T], Generic[T]):
 )
 kv_c_normed = workspace[:toks]\
-    [..., :self.kv_lora_rank].unsqueeze(1)
+    [..., :self.kv_lora_rank]
 k_pe = workspace[:toks]\
     [..., self.kv_lora_rank:].unsqueeze(1)
@@ -874,7 +874,7 @@ class MLACommonImpl(MLAAttentionImpl[M], Generic[M]):
 )
 kv_c_normed = workspace[:toks]\
-    [..., :self.kv_lora_rank].unsqueeze(1)
+    [..., :self.kv_lora_rank]
 k_pe = workspace[:toks]\
     [..., self.kv_lora_rank:].unsqueeze(1)
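
Both hunks make the same one-line change: `kv_c_normed` no longer gets `.unsqueeze(1)`, while `k_pe` keeps it. The sketch below only tracks shapes, to make the before/after difference concrete. It is not the project's code and does not show why the change is faster. It assumes `workspace[:toks]` is 2-D with a last dimension of `kv_lora_rank + rope_dim`. The helpers `slice_last` and `unsqueeze` and every size used are hypothetical, chosen for illustration.

```python
# Shape-only model of the slicing in the hunks above; no tensors are allocated.
# Assumption: workspace[:toks] is 2-D. All sizes are illustrative, not from the commit.

def slice_last(shape, start, stop):
    """Shape after indexing [..., start:stop] on the last dimension."""
    *lead, last = shape
    lo, hi, _ = slice(start, stop).indices(last)
    return (*lead, max(hi - lo, 0))

def unsqueeze(shape, dim):
    """Shape after inserting a size-1 dimension at non-negative position `dim`."""
    if not 0 <= dim <= len(shape):
        raise IndexError(f"dim {dim} out of range for shape {shape}")
    return (*shape[:dim], 1, *shape[dim:])

toks = 8             # hypothetical number of context tokens
kv_lora_rank = 512   # hypothetical latent width
rope_dim = 64        # hypothetical width of the positional part
workspace_rows = (toks, kv_lora_rank + rope_dim)  # shape of workspace[:toks]

# Before the commit: the latent slice carried an extra singleton dimension.
old_kv_c_normed = unsqueeze(slice_last(workspace_rows, None, kv_lora_rank), 1)
# After the commit: the latent slice stays 2-D.
new_kv_c_normed = slice_last(workspace_rows, None, kv_lora_rank)
# Unchanged by the commit: k_pe still has the singleton dimension.
k_pe = unsqueeze(slice_last(workspace_rows, kv_lora_rank, None), 1)

print(old_kv_c_normed, new_kv_c_normed, k_pe)
```

Under these assumptions, `kv_c_normed` goes from three dimensions to two, and `k_pe` keeps its size-1 middle dimension.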