feat(v1 attention): 为 ROCm FlashAttention 接入 unified kv layout,并打通...
feat(v1 attention): 为 ROCm FlashAttention 接入 unified kv layout,并打通 mm_prefix、qq_bias 与 use_alibi_sqrt 透传 在 ROCm FlashAttention 后端增加 unified KV layout 选择逻辑 接入 unified varlen kernel 调用路径 在 FlashAttention metadata 中补充 mm_prefix_range 与 qq_bias 透传
Showing
Please register or sign in to comment