Added Int4 mixed batch gemm support (#1839)
* remove redundant kernels. * added batched_gemm_xdl_fp16int4_b_scale_v3 * Enabled the split K. * added the batched_gemm_b_scale ckProfiler, meet function issue * fix some typo * fix ckProfiler build issue * fix some bugs * updated some debug info * comment some code * Fix * fixed some bugs and refactor the code * fixed a function bug. * formatted files. * formatted * uncommented the ckProfiler CMakeLists * fixed. * fix ckProfiler for batched_gemm_b_scale --------- Co-authored-by:mtgu0705 <mtgu@amd.com> Co-authored-by:
aska-0096 <haocwang@amd.com> Co-authored-by:
Bartlomiej Kocot <barkocot@amd.com>
Showing
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Please register or sign in to comment