[CK_TILE] Add fmha fwd headdim96 support (#1608)
* Add ceil_to_qualified_tile_length()
* Rename kK0BlockLength to kQKHeaddim
* Add kSubQKHeaddim concept to support headdim96
* Fix in math.hpp to avoid using __half interfaces
* Add LdsBufferSequence instance for headdim96
* Update in fmha_fwd/fmha_fwd_splitkv codegen to support hd96 testing
* Disable hd96 instance generation in codegen fmha_fwd and fmha_fwd_splitkv to save compiling time
* Reformat one file
* Fix text alignment in fmha_fwd_splitkv.py
---------
Co-authored-by:
Po Yen Chen <PoYen.Chen@amd.com>
Showing
Please register or sign in to comment