    GEMM pipeline v2 (#317) · 63914743
    Po Yen Chen authored
    
    
    * format
    
    * improving pipeline
    
    * fix typo
    
    * format
    
    * adding thread group
    
    * adding thread group
    
    * adding thread group
    
    * adding gemm pipeline
    
    * tweak
    
    * refactor
    
    * refactor
    
    * add missing type convert
    
    * refactor
    
    * refactor
    
    * refactor
    
    * clean
    
    * fix build
    
    * refactor
    
    * format
    
    * clean up
    
    * use remove_cvref_t
    
    * clean
    
    * use pipeline_v2 for gemm kernel
    
    * Remove inconsistent indent
    
    * Fix compilation errors due to incomplete merge process
    
    * Add missing include directives
    
    * Fix compilation errors in currently unused files
    
    * Add license in newly added files
    
    * Re-format touched files by clang-format-10
    
    * Fix wrong template argument count of DeviceGemm<>
    
    * Use language construct to choose between types
    
    * Use language construct to choose GEMM example instance
    
    * Fix compilation error due to interface change
    
    * Re-use type alias to avoid duplication
    
    * Unify type alias usage in source file
    
    * Only use v2 pipeline in one gridwise GEMM type
    
    * Remove no-longer used include directives
    
    * Add static_assert() to check pipeline type requirements
    
    * Revert "Add static_assert() to check pipeline type requirements"
    
    This reverts commit f0985f0a132671a1caaea92810c9f30dcf062bde.
    
    * clean
    
    * clean
    
    * clean
    
    * clean
    Co-authored-by: Chao Liu <chao.liu2@amd.com>
    Co-authored-by: shaojiewang <wsjmessi@163.com>
contraction_scale_xdl_fp32.cpp 25.7 KB