Support MHA with chunked prefix cache for flashinfer/flashmla backend, support page size > 1 for MHA chunked prefix (#8616)
Co-authored-by: xuyongfei.xyf <xuyongfei.xyf@antgroup.com>