[Kernel][Triton][AMD] Use block size heuristic for avg 2.8x speedup for int8 models (#11698)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>