Commit f976e3b9 (unverified), authored by xaguilar-amd, committed by GitHub

[Performance] Remove unnecessary zero-fill of MLA decode output tensor in Aiter backend (#37539)


Signed-off-by: xaguilar-amd <xaguilar@amd.com>
parent d468322d
@@ -416,7 +416,7 @@ class AiterMLAImpl(MLACommonImpl[AiterMLAMetadata]):
         else:
             kernel_num_heads = self.num_heads
-        o = torch.zeros(
+        o = torch.empty(
             B,
             kernel_num_heads,
             self.kv_lora_rank,
...
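
Note on why the change is safe: `torch.zeros` allocates and then fills the buffer, while `torch.empty` only allocates. If the decode kernel writes every element of `o` (which the commit title implies, but the diff itself does not show), the fill is wasted work. The sketch below illustrates this on CPU. `fake_decode_kernel` and the shape values are hypothetical stand-ins, not the Aiter kernel or the real sizes.

```python
import torch


def fake_decode_kernel(q: torch.Tensor, out: torch.Tensor) -> None:
    # Hypothetical stand-in for the real decode kernel.
    # Assumption: like the real kernel, it overwrites every element of `out`.
    out.copy_(q * 2.0)


# Illustrative sizes only; the real B, kernel_num_heads and kv_lora_rank differ.
B, kernel_num_heads, kv_lora_rank = 4, 8, 16
q = torch.randn(B, kernel_num_heads, kv_lora_rank)

# Old behaviour: allocate, then zero-fill.
o_zeros = torch.zeros(B, kernel_num_heads, kv_lora_rank, dtype=q.dtype)
# New behaviour: allocate only; contents are undefined until written.
o_empty = torch.empty(B, kernel_num_heads, kv_lora_rank, dtype=q.dtype)

fake_decode_kernel(q, o_zeros)
fake_decode_kernel(q, o_empty)

# Both buffers hold the same result once the kernel has written all of them.
same = torch.equal(o_zeros, o_empty)
print(same)
```

The equivalence only holds when the output is fully overwritten. A kernel that accumulates into `o`, or that skips some rows (for example padded entries), would read or expose uninitialized memory with `torch.empty`.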