  1. 06 Feb, 2025 1 commit
  2. 24 Jan, 2025 1 commit
  3. 22 Jan, 2025 1 commit
  4. 14 Jan, 2025 1 commit
  5. 13 Jan, 2025 3 commits
  6. 10 Jan, 2025 2 commits
    • Grouped convolution backward weight special vector size loads (#1772) · fd46a01d
      Bartłomiej Kocot authored
      * Grouped convolution backward weight special vector size loads
      
      * Instances and tests
      
      * Fixes
      
      * Add 7 and 13 special cases
      
      * fix comments
      
      * Fix
      
      * Fix2
      
      * fixes
      
      * fix atomic add bf16
    • Ck tile/gemm perf measure (#1750) · 73a076ee
      Thomas Ning authored
      * Finished adding the performance benchmark for ck tile gemm
      
      * Fix the executable rename problem
      
      * fix the executable name error
      
      * delete the unsupported layout combinations
      
      * Update run_full_test.sh
      
      * Update benchmark_mem_pipeline.sh
      
      * Update benchmark_basic.sh
      
      * change the executable of gemm_universal
      
      * change ck_tile_gemm script permissions
      
      * Addressed the comment
      
      * Addressed the comment
      
      * Fixed the comments
      
      * Fixed Comment
      
      * roll back the malfunctioning change
      
      * Fix the Typo
      
      * finalize the tile_gemm_fp16 performance monitoring
      
      * fix the stash names for ck_tile gemm logs
      
      * change the stashing logic
      
      * change stashing syntax
      
      ---------
      Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
      Co-authored-by: illsilin <Illia.Silin@amd.com>
  7. 08 Jan, 2025 12 commits
  8. 07 Jan, 2025 3 commits
  9. 04 Jan, 2025 3 commits
  10. 03 Jan, 2025 4 commits
  11. 02 Jan, 2025 2 commits
  12. 01 Jan, 2025 1 commit
  13. 29 Dec, 2024 1 commit
    • Remove using partitioner for all fmha kernels (#1778) · 4e076909
      Qianfeng authored
      * Remove using tile partitioner for fmha_fwd_kernel
      
      * Remove using tile partitioner for fmha_fwd_splitkv and splitkv-combine kernels
      
      * Remove using tile partitioner for fmha_fwd_appendkv kernel
      
      * Unify the format of GetTileIndex
  14. 28 Dec, 2024 1 commit
  15. 25 Dec, 2024 1 commit
  16. 23 Dec, 2024 1 commit
  17. 20 Dec, 2024 2 commits