1. 26 May, 2022 1 commit
    • ltqin's avatar
      Add FP64 XDL GEMM built-in function (#199) · 3e6c2610
      ltqin authored
      
      
      * add intrin_mfma_f64_16x16x4f64
      
      * add example
      
      * gemm reference add double data type
      
      * chang init data
      
      * fix M N PerXdlops
      
      * fix ifdef
      
      * add comparsion config
      
      * add conv fwd example
      
      * format log out
      
      * change rc matrix egister layout
      
      * reorganize example
      
      * reorganize example 2
      
      * format,because merge develop
      
      * fix call impl adding acc data type
      
      * lost ;
      
      * add compiler warning
      
      * change example tunning parameters
      
      * add test for fp64
      
      * add instance
      
      * add test/gemm/gemm_fp64.cpp
      
      * fix get name issue
      
      * remove some tunning parameter
      
      * fix conflict
      
      * format
      
      * use integer value for GEMM test
      
      * add acc data type
      
      * remove typeid because fp16
      
      * fix streamconfig etc bug from merging develop
      
      * format
      
      * remove test_gemm_xdl_fp64
      
      * add AccDataType
      
      * AccDataType problem
      Co-authored-by: default avatarqinletao <letaoqin@amd.com>
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      3e6c2610
  2. 24 May, 2022 1 commit
    • Jianfeng Yan's avatar
      Navi21 gemm (#197) · 40b59a63
      Jianfeng Yan authored
      
      
      * start adding navi21 GEMM
      
      * navi_gemm_km_kn_mn_fp32 compiles and passes one test.
      
      * rename variables and functions in gridwise_gemm_dlops_v1r3
      
      * add other 3 layouts; format instance
      
      * adding more tuning parameters
      
      add tuning parameters for other 3 layouts
      
      * add gemm_dlops_f16
      
      * tmp
      
      * add dependence of DeviceGemm::IsSupportedArg() on arch
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * push gemm_dlops into profiler
      
      * minor changes
      
      * if using xdl or dlops is moved into profiler_gemm_impl
      
      * minor changes
      
      * minor changes
      
      * remove is_xdl from profile_gemm_impl
      
      * make IsSupportedArg dependent on arch for other device_gemm
      
      * minor changes
      
      * minor changes
      
      * fix a bug in f_generate_tensor_value
      
      * add 64x64x64 for gemm_dlops_int8
      
      * add 64x64x64 for gemm_dlops_int8
      
      * comment out 3 layouts in gemm_dlops_int8; add 32x32x32 for gemm_dlops_int8; init A values to 1
      
      * fix
      
      * start fixing tuning parameters
      
      * monir
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * fixing
      
      * adding example
      
      * adding example
      
      * adding example
      
      * add gemm fp32 example
      
      * clean up
      
      * use 128x128x16 as MNK tile in navi21 gemm example
      
      * bug fix
      
      * fix test
      
      * use new block c tile
      
      * clean
      
      * fix build
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      Co-authored-by: wangshaojie6's avatarshaojiewang <wsjmessi@163.com>
      40b59a63
  3. 09 Mar, 2022 1 commit
    • Chao Liu's avatar
      Reorganize files, Part 1 (#119) · 5d37d7bf
      Chao Liu authored
      * delete obselete files
      
      * move files
      
      * build
      
      * update cmake
      
      * update cmake
      
      * fix build
      
      * reorg examples
      
      * update cmake for example and test
      5d37d7bf