Alp Dener authored
Make `CanonicalizeGemmInput()` support non-TN layout FP8 GEMM on Blackwell with column-wise/transposed data (#2233)

Modified `CanonicalizeGemmInput()` logic to pull from column-wise data for FP8 GEMM on Blackwell when row-wise is not available.

Signed-off-by: Alp Dener <adener@nvidia.com>
ee384ab5
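
The fallback described in the commit message can be illustrated with a toy model. This is a minimal sketch, not the code from this commit: `ToyTensor`, `canonicalize`, `apply_op`, and the `supports_non_tn` flag are hypothetical names, and the sketch assumes that the column-wise buffer holds the transpose of the row-wise data, so reading it with the opposite transpose flag yields the same operand.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    """Return the transpose of a row-major nested-list matrix."""
    return [list(row) for row in zip(*m)]


@dataclass
class ToyTensor:
    # Hypothetical stand-in for a quantized tensor with two optional buffers.
    rowwise: Optional[Matrix] = None     # data in its original orientation
    columnwise: Optional[Matrix] = None  # ASSUMPTION: stores the transpose


def canonicalize(t: ToyTensor, want_transposed: bool,
                 supports_non_tn: bool) -> Tuple[Matrix, bool]:
    """Pick a buffer and a transpose flag that together describe the operand.

    Prefers row-wise data. If only column-wise data exists and the (assumed)
    hardware path accepts non-TN layouts, use it with the flag flipped.
    """
    if t.rowwise is not None:
        return t.rowwise, want_transposed
    if t.columnwise is not None and supports_non_tn:
        return t.columnwise, not want_transposed
    raise ValueError("no usable buffer for the requested layout")


def apply_op(buf: Matrix, transposed: bool) -> Matrix:
    """Materialize op(buf), i.e. buf or its transpose."""
    return transpose(buf) if transposed else buf


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
only_columnwise = ToyTensor(columnwise=transpose(a))
buf, flag = canonicalize(only_columnwise, want_transposed=False,
                         supports_non_tn=True)
recovered = apply_op(buf, flag)  # same operand as the missing row-wise data
```

In this toy model, flipping the flag is what makes the column-wise buffer interchangeable with the row-wise one; without non-TN support the function refuses rather than silently producing a transposed operand.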