1. 19 Apr, 2023 7 commits
  2. 18 Apr, 2023 1 commit
    • Allow using ROCm release candidate compilers. (#679) · bb0b772d
      Illia Silin authored
      * enable use of rocm5.5 release candidate 4
      
      * upgrade to ROCM5.5 RC5
      
      * try to fix the PUB_KEY error, remove the cmake-data package
      
      * upgrade to latest cmake version
      
      * use private dockerhub repo for rocm5.5 rc5
      
      * add missing bracket
  3. 17 Apr, 2023 6 commits
  4. 16 Apr, 2023 2 commits
  5. 14 Apr, 2023 1 commit
  6. 11 Apr, 2023 5 commits
  7. 10 Apr, 2023 1 commit
    • Groupnorm + swish external api (#668) · ed3a2e52
      rocking5566 authored
      * Rename to proper naming
      
      * Add example of groupnorm + swish
      
      * Extract duplicate code in example
      
      * Add groupnorm + swish instances
      
      * Refactor instance generation, split into multiple cpp files
      
      * Add external api and client example
      
      * Refine profiler message
      
      * Use ck math version of exp
      
      * Refine problem size in example
      
      * Add host version of exp
  8. 07 Apr, 2023 1 commit
  9. 06 Apr, 2023 1 commit
  10. 30 Mar, 2023 3 commits
  11. 29 Mar, 2023 3 commits
  12. 27 Mar, 2023 1 commit
  13. 24 Mar, 2023 1 commit
  14. 23 Mar, 2023 1 commit
  15. 22 Mar, 2023 2 commits
  16. 20 Mar, 2023 2 commits
  17. 15 Mar, 2023 2 commits
    • Update cmake-ck-dev.sh script (#641) · fa998675
      Rostyslav Geyyer authored
      
      Co-authored-by: Rosty Geyyer <rosty.geyyer@amd.com>
    • gemm/Conv xdlops + dlops quantization (#625) · 16dc18e0
      rocking5566 authored
      * Add conv per-layer quantization
      
      * Add gemm_dlops quantization
      
      * Support int8 for innerproduct
      
      * Refine gemm dlops int8 kernel parameter
      
      * Support gfx908(MI100) and gfx90a(MI200)
      
      * clang-format
      
      * Rename example number
      
      * Support different layouts for d tensor
      
      * Add conv dlops per-channel quantization example
      
      * Move to example 40
      
      * Extract the common code for different platforms (dlops and xdlops)
      
      * Move to subfolder. Prepare to add other quantization ops
      
      * Refine the quantization instance library
      
      * Add conv dl instances and client example
      
      * Remove unnecessary type
      
      * Add gemm quantization instance
      
      * Add external api and client example
      
      * Refine num_bytes
      
      * Separate different layouts into different cpp files
      
      * Add more xdl instances
      
      * Revert "Remove unnecessary type"
      
      This reverts commit 82086918.
      
      * Remove CShuffleDataType in dlops
      Let acc and CShuffleDataType be the same in xdlops
      
      ---------
      Co-authored-by: zjing14 <zhangjing14@gmail.com>