1. 07 Aug, 2023 1 commit
    • Illia Silin's avatar
      Allow building CK for specific data types and split off last remaining DL instances. (#830) · 08eb1769
      Illia Silin authored
      * properly split conv_nd_bwd_data instances
      
      * split conv2d_fwd instance data types
      
      * split the gemm, conv2d_fwd and batched_gemm_softamx_gemm
      
      * split the tests by data types where possible
      
      * filter examples by DTYPES
      
      * split few remaining examples by DTYPES
      
      * filter most instances by DTYPES
      
      * add new lines at end of headers, fix grouped_gemm profiler
      
      * fix syntax
      
      * split the ckprofiler instances by DTYPES
      
      * split the conv2d and quantization DL and XDL instances
      
      * fix the splitting of conv2d DL instances
      
      * split softmax and pool_fwd tests for fp16 and fp32 types
      
      * fix syntax
      
      * fix the dl_int8 quantization instances isolation
      08eb1769
  2. 15 Jun, 2023 1 commit
    • Illia Silin's avatar
      Enable gfx941 and gfx942 architectures. (#752) · 027e46ee
      Illia Silin authored
      * enable gfx941/942 targets
      
      * fix clang format
      
      * fix the cmake logic for multiple targets
      
      * fix cmake syntax for looping over targets
      
      * add gfx941/942 support for gemm_xdl instances
      027e46ee
  3. 23 May, 2023 1 commit
    • Illia Silin's avatar
      Enable gemm_dl and other kernels on Navi3x. (#714) · d821d1e5
      Illia Silin authored
      * enable dl kernels on navi3
      
      * do not build xdl tests and examples on Navi
      
      * run tests before building everything on jenkins
      
      * disable gemm_bilinear on gfx1030
      
      * add gpu targets to installer on Navi
      
      * put tests in the same order as before
      
      * reduce the number of navi targets in CI
      
      * build CI installed for gfx940 as well
      
      * only build for MI300 during QA runs
      d821d1e5
  4. 28 Oct, 2022 1 commit
  5. 27 Oct, 2022 1 commit
    • Rostyslav Geyyer's avatar
      Fix Batched Gemm op for int8 data (#482) · cd517326
      Rostyslav Geyyer authored
      * Fix for lwpck-425, update BlockTransferSrcVectorDim
      
      * Revert "Fix for lwpck-425, update BlockTransferSrcVectorDim"
      
      This reverts commit fd24e280e28ff238b452cfdde58a988affd46461.
      
      * Add Batched Gemm int8 test, expect it to fail
      
      * Format
      
      * Re-add the fix
      cd517326
  6. 29 Jul, 2022 1 commit
    • Chao Liu's avatar
      Clean up conv example, Instances, profiler and test (#324) · 500fa995
      Chao Liu authored
      * convnd_fwd fp16 example
      
      * update example
      
      * update example
      
      * update instance
      
      * updating refernce conv
      
      * update reference conv
      
      * update conv fwd profiler
      
      * update conv 1d and 3d instance
      
      * update include path
      
      * clean
      
      * update profiler for conv bwd data and weight
      
      * update conv bwd weight
      
      * clean
      
      * update conv example
      
      * update profiler for conv bwd weight
      
      * update ckprofiler for conv bwd data
      
      * fix reference conv bwd data bug; update conv bwd data test
      
      * update examples
      
      * fix initialization issue
      
      * update test for conv fwd
      
      * clean
      
      * clean
      
      * remove test case too sensitive to error threshhold
      
      * fix test
      
      * clean
      
      * fix build
      
      * adding conv multiple d
      
      * adding conv multiple D
      
      * add matrix padder
      
      * add gemm padding to convnd
      
      * adding group conv
      
      * update gemm multi-d
      
      * refactor
      
      * refactor
      
      * refactor
      
      * clean
      
      * clean
      
      * refactor
      
      * refactor
      
      * reorg
      
      * add ds
      
      * add bias
      
      * clean
      
      * add G
      
      * adding group
      
      * adding group
      
      * adding group
      
      * update Tensor
      
      * clean
      
      * update example
      
      * update DeviceGemmMultipleD_Xdl_CShuffle
      
      * update conv bwd-data and bwd-weight
      
      * upate contraction example
      
      * update gemm and batch gemm with e permute
      
      * fix example build
      
      * instance for grouped conv1d
      
      * update example
      
      * adding group conv instance
      
      * update gemm bilinear instance
      
      * update gemm+add+add+fastgelu instance
      
      * update profiler
      
      * update profiler
      
      * update test
      
      * update test and client example
      
      * clean
      
      * add grouped conv into profiler
      
      * update profiler
      
      * clean
      
      * add test grouped conv, update all conv test to gtest
      
      * update test
      500fa995
  7. 21 Mar, 2022 1 commit