- 03 Nov, 2023 1 commit
carlushuang authored
* unify q persistent in register * add refactor warp_gemm dispatcher
-
- 27 Oct, 2023 1 commit
carlushuang authored
* support batch & nhead * support scale * tile scheduler * rename tile-scheduler to tile-partitioner * add some exp2 math * fix a bug when changing tile size
-
- 19 Oct, 2023 1 commit
carlushuang authored
* Revert "Extract gemm0 prefetch0 out from loop" This reverts commit d3b56f39f9fd12edb476b24ae9cf480841d311e4. * add fmha fwd pipeline * Extract gemm0 prefetch0 out from loop * move blockSize to another place; fix a missing header in tile_window_impl_static_distribution.hpp * remove KArgs from tile modules
---------
Co-authored-by: Po-Yen, Chen <PoYen.Chen@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
-