1. 22 Apr, 2023 1 commit
  2. 21 Apr, 2023 2 commits
  3. 18 Apr, 2023 1 commit
    • Allow using ROCm release candidate compilers. (#679) · bb0b772d
      Illia Silin authored
      * enable use of rocm5.5 release candidate 4
      
      * upgrade to ROCM5.5 RC5
      
      * try fix the PUB_KEY error, remove the cmake-data package
      
      * upgrade to latest cmake version
      
      * use private dockerhub repo for rocm5.5 rc5
      
      * add missing bracket
  4. 17 Apr, 2023 1 commit
  5. 16 Apr, 2023 2 commits
  6. 11 Apr, 2023 5 commits
  7. 10 Apr, 2023 1 commit
    • Groupnorm + swish external api (#668) · ed3a2e52
      rocking5566 authored
      * Rename to proper naming
      
      * Add example of groupnorm + swish
      
      * Extract duplicate code in example
      
      * Add groupnorm + swish instances
      
      * Refactor instance generation, split into multiple cpp files
      
      * Add external api and client example
      
      * Refine profiler message
      
      * Use ck math version of exp
      
      * Refine problem size in example
      
      * Add host version of exp
  8. 07 Apr, 2023 1 commit
  9. 30 Mar, 2023 3 commits
  10. 29 Mar, 2023 3 commits
  11. 27 Mar, 2023 1 commit
  12. 24 Mar, 2023 1 commit
  13. 23 Mar, 2023 1 commit
  14. 22 Mar, 2023 2 commits
  15. 20 Mar, 2023 2 commits
  16. 15 Mar, 2023 6 commits
  17. 10 Mar, 2023 2 commits
  18. 09 Mar, 2023 2 commits
  19. 08 Mar, 2023 1 commit
    • GroupedGEMM + Gelu client example/instances/profiler (#614) · 9096b1c7
      Adam Osewski authored
      * Grouped gemm + Gelu instances.
      
      * Device Instance Factory for GroupedGemm+Gelu
      
      * Client example
      
      * Rangify fill helper functions.
      
      * Fix name clash.
      
      * Profiler for grouped_gemm+gelu
      
      * No need to use full namespace name.
      
      * Add check for MRaw divisible by vector load.
      
      * Ugly fix for big errors.
      
      * Add grouped_gemm+gelu to profiler CMakeLists.
      
      * Store in argument additional info.
      
      * Information about Mraw, Nraw, Kraw values.
      
      * Use FastGelu instead of Gelu.
      
      * Change client example to use FastGelu
      
      * Remove relaxed error precision.
      
      * Remove duplicate output elementwise-op
      
      ---------
      Co-authored-by: Adam Osewski <aosewski@amd.com>
      Co-authored-by: zjing14 <zhangjing14@gmail.com>
  20. 06 Mar, 2023 2 commits