[CK_TILE] fmha forward split-kv + combine kernels (#1338)
* FA fwd dropout
* FA bwd
* epilogue reuse
* CMakeLists update
* [CK_TILE] support alibi (#1269)
* add alibi support
* fix code
* update code based on comment
* Support more hdim
* fix fp8 bias
* support seqlen_k=0 case
* remove unused printf
* fix format
---------
Co-authored-by:
rocking <ChunYu.Lai@amd.com>
* now fwd/bwd can build
* bwd alibi
* add bwd validation stream_config
* update generated filenames
* update bwd kernel launch
* CK_TILE_HOST_DEVICE in philox
* Transpose -> transpose
* format
* format
* format
* Generate the instance for FA required
* format
* fix error in WarpGemm
* Add num_splits option and dummy split-kv api method
* Generate fmha_fwd_splitkv()
* Add SplitKV kernel codegen logics
* Add SplitKV combine kernel codegen logics
* Fix mismatched return type
* Clean-up code
* Replace sentinel value before storing
* Fix wrong l...
Showing
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Please register or sign in to comment