Commit 980a1724 authored by Percy, committed by GitHub

[Kernel] update comment for KV shape in unified triton attn (#18099)


Signed-off-by: haochengxia <xhc_1007@163.com>
parent e1f5a71e
@@ -31,8 +31,8 @@ def apply_softcap(S, x):
 def kernel_unified_attention_2d(
     output_ptr, # [num_tokens, num_query_heads, head_size]
     query_ptr, # [num_tokens, num_query_heads, head_size]
-    key_cache_ptr, # [num_blks, num_kv_heads, head_size // x, blk_size, x]
-    value_cache_ptr, # [num_blks, num_kv_heads, head_size, blk_size]
+    key_cache_ptr, # [num_blks, blk_size, num_kv_heads, head_size]
+    value_cache_ptr, # [num_blks, blk_size, num_kv_heads, head_size]
     block_tables_ptr, # [num_seqs, max_num_blocks_per_seq]
     seq_lens_ptr, # [num_seqs]
     alibi_slopes_ptr, # [num_query_heads]
......
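
For readers unfamiliar with the layout named in the updated comments, the following is a minimal plain-Python sketch (not the Triton kernel itself) of how a token's key vector could be located in a cache shaped [num_blks, blk_size, num_kv_heads, head_size]. The helper names `locate_token` and `kv_flat_offset` and all sizes are hypothetical, and the offset arithmetic assumes a contiguous row-major buffer; the real kernel may address the cache differently, for example through explicit strides.

```python
# Hypothetical sizes chosen only for illustration.
NUM_BLKS, BLK_SIZE, NUM_KV_HEADS, HEAD_SIZE = 6, 4, 2, 8


def kv_flat_offset(physical_blk: int, slot: int, kv_head: int, dim: int) -> int:
    """Row-major offset into a contiguous cache laid out as
    [num_blks, blk_size, num_kv_heads, head_size] (assumed contiguous)."""
    return ((physical_blk * BLK_SIZE + slot) * NUM_KV_HEADS + kv_head) * HEAD_SIZE + dim


def locate_token(block_table: list[int], position: int) -> tuple[int, int]:
    """Map a token position in a sequence to (physical block, slot in block)."""
    logical_blk, slot = divmod(position, BLK_SIZE)
    return block_table[logical_blk], slot


# One row of block_tables ([num_seqs, max_num_blocks_per_seq]).
block_table = [3, 0, 5]

# Token 5 falls in logical block 1, which maps to physical block 0, slot 1.
blk, slot = locate_token(block_table, position=5)
start = kv_flat_offset(blk, slot, kv_head=1, dim=0)
print(blk, slot, start)  # prints "0 1 24"
```

Under this layout the head_size elements of one (token, KV head) pair are adjacent in memory, so the key vector for the token above would occupy flat offsets 24 through 31.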