[CK_TILE] Fix incorrect computation of group mode PagedAttention (#1688)
* Allow getting batch size from splitkv tile partitioner
* Fix wrong paged-kvcache impl for group mode
* Fix wrong example code for page-kvcache
* Undo changes in fmha_fwd.cpp
* Always use 2D block table
* Add is_gappy kernel argument for paged-kvcache
The is_gappy argument distinguishes how seqstart_k_ptr is used by
flash-attention and by xformers
* Remove out-of-date comments
* Remove no-longer used method
* Fix wrong calculation of the number of page-blocks
* Fix wrong comment
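
For context on the page-block count and the 2D block table mentioned above, here is a minimal sketch of how such quantities are commonly computed in a paged KV cache. It is not the CK_TILE implementation: the helper names `num_page_blocks` and `physical_token_index`, the row-major `[batch, max_blocks_per_seq]` table layout, and all parameter names are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a CK_TILE API): number of page blocks needed to
// hold seqlen_k tokens, using ceiling division so a partially filled last
// block is still counted.
constexpr int32_t num_page_blocks(int32_t seqlen_k, int32_t page_block_size)
{
    return (seqlen_k + page_block_size - 1) / page_block_size;
}

// Hypothetical helper (not a CK_TILE API): translate a logical token index of
// one batch entry into a physical token index in the cache.
// Assumption: block_table is a row-major 2D table of shape
// [batch, max_blocks_per_seq] holding physical page-block ids.
inline int64_t physical_token_index(const std::vector<int32_t>& block_table,
                                    int32_t max_blocks_per_seq,
                                    int32_t batch,
                                    int32_t token,
                                    int32_t page_block_size)
{
    const int32_t logical_block = token / page_block_size; // column in the table
    const int32_t offset        = token % page_block_size; // position inside the block
    const int32_t physical_block =
        block_table[batch * max_blocks_per_seq + logical_block];
    return static_cast<int64_t>(physical_block) * page_block_size + offset;
}
```

Plain truncating division (`seqlen_k / page_block_size`) would drop the last, partially filled block whenever the sequence length is not a multiple of the block size, which is one typical way a page-block count goes wrong.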
---------
Co-authored-by: Qianfeng <qianfeng.zhang@amd.com>