1. 25 Jun, 2022 1 commit
    • Chao Liu's avatar
      Absolute include path (#281) · d1db6a0c
      Chao Liu authored
      * ad gelu and fast_gelu
      
      * added GeLU and fast GeLU
      
      * clean up
      
      * add gemm+fastgelu example
      
      * add gemm+gelu instances
      
      * update profiler
      
      * clean up
      
      * clean up
      
      * adding gemm+bias+activation
      
      * clean
      
      * adding bias
      
      * clean
      
      * adding gemm multiple d
      
      * debugging
      
      * add gemm bias add fastgelu
      
      * rename, clean
      
      * refactoring; add readme
      
      * refactor
      
      * refactor
      
      * refactor
      
      * refactor
      
      * refactor
      
      * refactor
      
      * fix
      
      * fix
      
      * update example
      
      * update example
      
      * rename
      
      * update example
      
      * add ckProfiler
      
      * clean
      
      * clean
      
      * clean
      
      * clean
      
      * add client app example
      
      * update readme
      
      * delete obselete files
      
      * remove old client app
      
      * delete old file
      
      * cleaning
      
      * clean
      
      * remove half
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path for all examples
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * fix header path
      
      * revert client app example
      
      * clean build
      
      * fix build
      
      * temporary disable client test on Jenkins
      
      * clean
      
      * clean
      
      * clean
      d1db6a0c
  2. 24 May, 2022 1 commit
    • Jianfeng Yan's avatar
      Navi21 gemm (#197) · 40b59a63
      Jianfeng Yan authored
      
      
      * start adding navi21 GEMM
      
      * navi_gemm_km_kn_mn_fp32 compiles and passes one test.
      
      * rename variables and functions in gridwise_gemm_dlops_v1r3
      
      * add other 3 layouts; format instance
      
      * adding more tuning parameters
      
      add tuning parameters for other 3 layouts
      
      * add gemm_dlops_f16
      
      * tmp
      
      * add dependence of DeviceGemm::IsSupportedArg() on arch
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * push gemm_dlops into profiler
      
      * minor changes
      
      * if using xdl or dlops is moved into profiler_gemm_impl
      
      * minor changes
      
      * minor changes
      
      * remove is_xdl from profile_gemm_impl
      
      * make IsSupportedArg dependent on arch for other device_gemm
      
      * minor changes
      
      * minor changes
      
      * fix a bug in f_generate_tensor_value
      
      * add 64x64x64 for gemm_dlops_int8
      
      * add 64x64x64 for gemm_dlops_int8
      
      * comment out 3 layouts in gemm_dlops_int8; add 32x32x32 for gemm_dlops_int8; init A values to 1
      
      * fix
      
      * start fixing tuning parameters
      
      * monir
      
      * minor changes
      
      * minor changes
      
      * minor changes
      
      * fixing
      
      * adding example
      
      * adding example
      
      * adding example
      
      * add gemm fp32 example
      
      * clean up
      
      * use 128x128x16 as MNK tile in navi21 gemm example
      
      * bug fix
      
      * fix test
      
      * use new block c tile
      
      * clean
      
      * fix build
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      Co-authored-by: wangshaojie6's avatarshaojiewang <wsjmessi@163.com>
      40b59a63