1. 27 Jun, 2024 1 commit
  2. 18 Jun, 2024 1 commit
  3. 06 Jun, 2024 4 commits
  4. 05 Jun, 2024 3 commits
    • Code cleanup. · e628b162
      Adam Osewski authored
    • Fix GetNextKTiles. · 2a16c61c
      Adam Osewski authored
    • Add a scale op, related instances and examples (#1242) · cb0645be
      Rostyslav Geyyer authored
      
      
      * Add a scale op
      
      * Update the element op
      
      * Add instances
      
      * Add an example
      
      * Add a client example
      
      * Add a flag check
      
      * Revert flag check addition
      
      * Fix flag check
      
      * Update d strides in example
      
      * Update d strides in client example
      
      * Apply suggestions from code review
      
      Update copyright header
      Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
      
      * Move the example
      
      * Move the client example
      
      * Update element op
      
      * Update example with the new element op
      
      * Add scalar layout
      
      * Update example
      
      * Update kernel for scalar Ds
      
      * Revert kernel changes
      
      * Update element op
      
      * Update example to use scales' pointers
      
      * Format
      
      * Update instances
      
      * Update client example
      
      * Move element op to unary elements
      
      * Update element op to work with values instead of pointers
      
      * Update instances to take element op as an argument
      
      * Update examples to use random scale values
      
      ---------
      Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
  5. 04 Jun, 2024 3 commits
  6. 01 Jun, 2024 1 commit
    • Post-merge fix of PR 1300 (#1313) · 6fb1f4e0
      zjing14 authored
      * add f8 gemm with multiD for both row/col wise
      
      * change compute_type to fp8
      
      * changed tuning parameters in the example
      
      * add rcr example
      
      * post-merge fix
      
      * fix
      
      * reduce init range
  7. 28 May, 2024 2 commits
    • add f8 gemm multiD with both row/col wise scale (#1300) · 80db62f0
      zjing14 authored
      * add f8 gemm with multiD for both row/col wise
      
      * change compute_type to fp8
      
      * changed tuning parameters in the example
      
      * add rcr example
    • [CK_TILE] support group from cmdline (#1295) · 5055b3bd
      carlushuang authored
      * support cmdline seqlen decode
      
      * silent print
      
      * update readme
      
      * update kernel launch 3d
      
      * update tile partitioner
      
      * fix spill for bf16
      
      * modify based on comment
      
      * modify payload_t
      
      * fix bug for alibi mode
      
      * fix alibi test err
      
      * refactor kernel launch, support select timer
      
      * add missing file
      
      * remove useless code
      
      * add some comments
  8. 22 May, 2024 1 commit
  9. 21 May, 2024 3 commits
  10. 20 May, 2024 1 commit
  11. 17 May, 2024 2 commits
  12. 15 May, 2024 3 commits
  13. 14 May, 2024 3 commits
  14. 10 May, 2024 2 commits
  15. 09 May, 2024 2 commits
  16. 08 May, 2024 2 commits
  17. 07 May, 2024 2 commits
  18. 06 May, 2024 1 commit
    • Multiple fixes. · 160932b6
      Adam Osewski authored
      * fix Accumulation when there's only one workgroup per K dim.
      * Update occupancy values after KBatch update and fix its calculation.
  19. 02 May, 2024 1 commit
  20. 29 Apr, 2024 2 commits