Make `CanonicalizeGemmInput()` support non-TN layout FP8 GEMM on Blackwell with column-wise/transposed data (#2233)
Modified the `CanonicalizeGemmInput()` logic to pull from column-wise data for FP8 GEMM on Blackwell when row-wise data is not available.
Signed-off-by: Alp Dener <adener@nvidia.com>
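
The fallback described in the commit message can be illustrated with a minimal sketch. This is not the actual `CanonicalizeGemmInput()` code: `Fp8Operand`, `OperandChoice`, and `SelectOperand` are hypothetical names introduced here, and the sketch assumes that the column-wise buffer holds the transpose of the row-wise buffer, so reading from it flips the requested transpose flag.

```cpp
#include <stdexcept>

// Hypothetical stand-in for an FP8 tensor that may carry row-wise data,
// column-wise (transposed) data, or both. Not the library's real type.
struct Fp8Operand {
  bool has_rowwise;
  bool has_columnwise;
};

// Which buffer to hand to the GEMM, and with which transpose flag.
struct OperandChoice {
  bool use_columnwise;
  bool transposed;
};

// Assumption: the hardware accepts non-TN FP8 layouts, so the row-wise
// buffer is preferred whenever it exists. If only the column-wise buffer
// is present, use it and flip the transpose flag, because that buffer is
// assumed to store the transposed data.
inline OperandChoice SelectOperand(const Fp8Operand &t, bool transposed) {
  if (t.has_rowwise) {
    return {false, transposed};
  }
  if (t.has_columnwise) {
    return {true, !transposed};
  }
  throw std::invalid_argument(
      "operand has neither row-wise nor column-wise data");
}
```

With this sketch, an operand that only has column-wise data and is requested as non-transposed is passed to the GEMM as the column-wise buffer with the transpose flag set.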