1. 14 Jan, 2025 2 commits
  2. 10 Jan, 2025 1 commit
  3. 26 Dec, 2024 2 commits
  4. 25 Dec, 2024 1 commit
  5. 23 Dec, 2024 1 commit
  6. 20 Dec, 2024 3 commits
  7. 19 Dec, 2024 1 commit
  8. 18 Dec, 2024 3 commits
    • [CK TILE] Refactor GemmKernel to be reused by other GEMM related operators (#1730) · 453ca373
      aledudek authored
      * Gemm Kernel Refactor part1
      
      * Gemm Kernel Refactor common gemm pipeline part2
      
      * [CK TILE] Refactor batched gemm to reuse GemmKernel
      
      * [CK TILE] Refactor GemmKernel - review changes part1
      
      * [CK TILE] Refactor GemmKernel - references fix
      
      * [CK TILE] Refactor GemmKernel - naming changes, add problem
      
      * [CK_TILE] Refactor GemmKernel - update tests
      
      * [CK_TILE] Refactor GemmKernel - review changes
      
      * [CK_TILE] Refactor GemmKernel - update test
      
      * [CK_TILE] Refactor GemmKernel - constness fixes
      
      * [CK_TILE] Refactor GemmKernel - update tests
    • Disambiguate bit_cast (#1749) · 1c1b3363
      Xiaodong Wang authored
      
      
      Adding namespace to disambiguate with std::bit_cast
      Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
    • [CK_TILE] Move hipmalloc/memcpy calls out of gpu reference gemm (#1743) · f6c4d614
      aledudek authored
      * [CK_TILE] Move hipmalloc/memcpy calls out of gpu reference gemm
      
      * [CK_TILE] Move hipmalloc/memcpy calls out of gpu reference gemm - review changes
      
      * [CK_TILE] Move hipmalloc/memcpy calls out of gpu reference gemm - review fix
  9. 17 Dec, 2024 6 commits
  10. 16 Dec, 2024 5 commits
  11. 15 Dec, 2024 1 commit
  12. 14 Dec, 2024 2 commits
  13. 13 Dec, 2024 2 commits
  14. 12 Dec, 2024 1 commit
    • [CK_TILE] naive attn (#1708) · 77a38e02
      carlushuang authored
      * add reference attention fwd
      
      * refactor addresser
      
      * update
      
      * paged, and i8 reflect-quant
      
      * let's call it forward-quant
      
      * fix error in decode variation
      
      * update naive-attn
      
      * fix page table
      
      * fix build err
  15. 10 Dec, 2024 4 commits
  16. 09 Dec, 2024 3 commits
  17. 06 Dec, 2024 2 commits
    • Refactor CI performance tests. (#1726) · 355893cd
      Illia Silin authored
      * merge the build and performance tests CI stages together
      
      * add gemm performance test on gfx11/gfx12
      
      * add suffixes to distinguish gemm performance logs from different archs
      
      * use smaller gemm set in CI for gfx10/gfx11/gfx12
      
      * disable performance tests on gfx1030
      
      * fix the shashing logic
      
      * fix finding python3 for mha instances
    • Add copy assignment op test (#1718) · 5e6bd75a
      Rostyslav Geyyer authored
      * Add copy assignment op test
      
      * Add a deep copy test