Commit 047797ef authored by vllmellm, committed by GitHub

[Bugfix] Triton FA function takes no keyword arguments (#16902)


Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
parent eb8ef422
...
@@ -1091,7 +1091,14 @@ class MLACommonImpl(MLAAttentionImpl[T], Generic[T]):
                 q,
                 k,
                 maybe_padded_v,
-                **kwargs,
+                None,  # output
+                kwargs["cu_seqlens_q"],
+                kwargs["cu_seqlens_k"],
+                kwargs["max_seqlen_q"],
+                kwargs["max_seqlen_k"],
+                kwargs["causal"],
+                softmax_scale,
+                None,  # bias
             )
         if is_vllm_fa:
             attn_out = self.flash_attn_varlen_func(
...
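
The change above replaces a `**kwargs` expansion with explicit positional arguments. As a rough illustration of why that matters, the sketch below uses a hypothetical positional-only stand-in, `positional_only_attention`, which is not the real Triton FA function. Its parameter order only mirrors the argument order visible in the diff, and the sample values are made up. It assumes Python 3.8+ for the `/` positional-only marker.

```python
def positional_only_attention(q, k, v, output, cu_seqlens_q, cu_seqlens_k,
                              max_seqlen_q, max_seqlen_k, causal,
                              softmax_scale, bias, /):
    """Hypothetical stand-in: every parameter is positional-only."""
    # Echo a few arguments back so the caller can check the binding order.
    return {
        "max_seqlen_q": max_seqlen_q,
        "causal": causal,
        "softmax_scale": softmax_scale,
        "bias": bias,
    }


# Illustrative values only; the keys match those read from kwargs in the diff.
kwargs = {
    "cu_seqlens_q": [0, 4],
    "cu_seqlens_k": [0, 4],
    "max_seqlen_q": 4,
    "max_seqlen_k": 4,
    "causal": True,
}
softmax_scale = 0.125

# Old call shape: expanding kwargs is rejected by a positional-only callable.
try:
    positional_only_attention("q", "k", "v", **kwargs)
    keyword_call_failed = False
except TypeError:
    keyword_call_failed = True

# New call shape: every argument is passed by position, in the diff's order.
result = positional_only_attention(
    "q",
    "k",
    "v",
    None,  # output
    kwargs["cu_seqlens_q"],
    kwargs["cu_seqlens_k"],
    kwargs["max_seqlen_q"],
    kwargs["max_seqlen_k"],
    kwargs["causal"],
    softmax_scale,
    None,  # bias
)
```

With positional passing, the order of arguments carries the meaning, which is presumably why the diff annotates the two `None` placeholders with `# output` and `# bias`.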