    [GEMM] F8 GEMM, performance optimized. (#1384) · 8c90f25b
    Haocong WANG authored
    
    
    * add ab_scale init support
    
    * enabled interwave
    
    * add scale type; update isSupport
    
    * adjust example
    
    * clean
    
    * enable f8 pure gemm rcr ckprofiler
    
    * Add gemm_multiply_multiply instances
    
    * clang format
    
    * Optimize for ScaleBlockMNK=128
    
    * enable abscale f8 gemm ck profiler
    
    * Add pure f8 gemm test suite
    
    * Reverting to the state of project at f60fd77
    
    * update copyright
    
    * clang format
    
    * update copyright
    
    ---------
    Co-authored-by: root <jizhan@amd.com>
profile_gemm_ab_scale.cpp 6.42 KB