• feat(v1 attention): wire the unified KV layout into ROCm FlashAttention and pass through... · ea9b8584
    laibao authored
    feat(v1 attention): wire the unified KV layout into ROCm FlashAttention and pass through mm_prefix, qq_bias, and use_alibi_sqrt
    Add unified KV layout selection logic to the ROCm FlashAttention backend
    Hook up the unified varlen kernel call path
    Pass mm_prefix_range and qq_bias through the FlashAttention metadata
backend.py 26.6 KB