Update tuning parameter & compilation options of DeviceGemmXdl<> instance (layout=TT) (#819)
* Enable pipeline v2 opt for layout=TT instance
* Use better thread mapping for reading A tile
* Conditionally enable pipeline v2 opt
* Allow enabling only fp16 gemm instances in profiler
* Fix formatting error
* Fix compilation error if we enable fp32 in profiler