Change in fwd-splitkv kernel to support num_splits=1 case (#1690)
* Change in fwd-splitkv kernel to support num_splits=1 case
* Update in codegen fwd-splitkv to make num_splits > 1 cases pass
* Specify instance traits in dispatch
* Fix link error for fp8 kernels
---------
Co-authored-by:
Po Yen Chen <PoYen.Chen@amd.com>
Showing
Please register or sign in to comment