Implements dual-chunk-flash-attn backend for dual chunk attention with sparse attention support (#11844)