    Navi21 gemm (#197) · 40b59a63
    Jianfeng Yan authored
    
    
    * start adding navi21 GEMM
    
    * navi_gemm_km_kn_mn_fp32 compiles and passes one test.
    
    * rename variables and functions in gridwise_gemm_dlops_v1r3
    
    * add other 3 layouts; format instance
    
    * adding more tuning parameters
    
    add tuning parameters for other 3 layouts
    
    * add gemm_dlops_f16
    
    * tmp
    
    * add dependence of DeviceGemm::IsSupportedArg() on arch
    
    * minor changes
    
    * minor changes
    
    * minor changes
    
    * minor changes
    
    * minor changes
    
    * minor changes
    
    * minor changes
    
    * push gemm_dlops into profiler
    
    * minor changes
    
    * move the choice between xdl and dlops into profiler_gemm_impl
    
    * minor changes
    
    * minor changes
    
    * remove is_xdl from profile_gemm_impl
    
    * make IsSupportedArg dependent on arch for other device_gemm
    
    * minor changes
    
    * minor changes
    
    * fix a bug in f_generate_tensor_value
    
    * add 64x64x64 for gemm_dlops_int8
    
    * add 64x64x64 for gemm_dlops_int8
    
    * comment out 3 layouts in gemm_dlops_int8; add 32x32x32 for gemm_dlops_int8; init A values to 1
    
    * fix
    
    * start fixing tuning parameters
    
    * minor
    
    * minor changes
    
    * minor changes
    
    * minor changes
    
    * fixing
    
    * adding example
    
    * adding example
    
    * adding example
    
    * add gemm fp32 example
    
    * clean up
    
    * use 128x128x16 as MNK tile in navi21 gemm example
    
    * bug fix
    
    * fix test
    
    * use new block c tile
    
    * clean
    
    * fix build
    Co-authored-by: Chao Liu <chao.liu2@amd.com>
    Co-authored-by: shaojiewang <wsjmessi@163.com>
static_buffer.hpp 5.41 KB