  1. 24 Jun, 2024 1 commit
  2. 21 Jun, 2024 2 commits
  3. 20 Jun, 2024 3 commits
  4. 19 Jun, 2024 1 commit
  5. 18 Jun, 2024 3 commits
  6. 17 Jun, 2024 2 commits
  7. 14 Jun, 2024 1 commit
  8. 13 Jun, 2024 1 commit
  9. 10 Jun, 2024 1 commit
  10. 05 Jun, 2024 2 commits
    • Integrate universal gemm with conv forward (#1320) · ac58cc5d
      Bartłomiej Kocot authored
      * Integrate universal gemm with conv fwd
      
      * Fix conv fwd wmma test
      
      * Fix instances
      
      * Remove direct load check
    • Add a scale op, related instances and examples (#1242) · cb0645be
      Rostyslav Geyyer authored
      
      
      * Add a scale op
      
      * Update the element op
      
      * Add instances
      
      * Add an example
      
      * Add a client example
      
      * Add a flag check
      
      * Revert flag check addition
      
      * Fix flag check
      
      * Update d strides in example
      
      * Update d strides in client example
      
      * Apply suggestions from code review
      
      Update copyright header
      Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
      
      * Move the example
      
      * Move the client example
      
      * Update element op
      
      * Update example with the new element op
      
      * Add scalar layout
      
      * Update example
      
      * Update kernel for scalar Ds
      
      * Revert kernel changes
      
      * Update element op
      
      * Update example to use scales' pointers
      
      * Format
      
      * Update instances
      
      * Update client example
      
      * Move element op to unary elements
      
      * Update element op to work with values instead of pointers
      
      * Update instances to take element op as an argument
      
      * Update examples to use random scale values
      
      ---------
      Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
  11. 04 Jun, 2024 1 commit
    • CK Tile FA Training kernels (#1286) · 2cab8d39
      Dan Yao authored
      
      
      * FA fwd dropout
      
      * FA bwd
      
      * epilogue reuse
      
      * CMakeLists update
      
      * [CK_TILE] support alibi (#1269)
      
      * add alibi support
      
      * fix code
      
      * update code based on comment
      
      * Support more hdim
      
      * fix fp8 bias
      
      * support seqlen_k=0 case
      
      * remove unused printf
      
      * fix format
      
      ---------
      Co-authored-by: rocking <ChunYu.Lai@amd.com>
      
      * now fwd/bwd can build
      
      * bwd alibi
      
      * add bwd validation stream_config
      
      * update generated filenames
      
      * update bwd kernel launch
      
      * CK_TILE_HOST_DEVICE in philox
      
      * Transpose -> transpose
      
      * format
      
      * format
      
      * format
      
      * Generate the instance for FA required
      
      * format
      
      * fix error in WarpGemm
      
      ---------
      
      Co-authored-by: danyao12 <danyao12>
      Co-authored-by: carlushuang <carlus.huang@amd.com>
      Co-authored-by: rocking <ChunYu.Lai@amd.com>
      Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
      Co-authored-by: Jing Zhang <jizhan@amd.com>
  12. 01 Jun, 2024 1 commit
    • Post-merge fix of PR 1300 (#1313) · 6fb1f4e0
      zjing14 authored
      * add f8 gemm with multiD for both row/col wise
      
      * change compute_type to fp8
      
      * changed tuning parameters in the example
      
      * add rcr example
      
      * post-merge fix
      
      * fix
      
      * reduce init range
  13. 28 May, 2024 2 commits
    • add f8 gemm multiD with both row/col wise scale (#1300) · 80db62f0
      zjing14 authored
      * add f8 gemm with multiD for both row/col wise
      
      * change compute_type to fp8
      
      * changed tuning parameters in the example
      
      * add rcr example
    • [CK_TILE] support group from cmdline (#1295) · 5055b3bd
      carlushuang authored
      * support cmdline seqlen decode
      
      * silent print
      
      * update readme
      
      * update kernel launch 3d
      
      * update tile partitioner
      
      * fix spill for bf16
      
      * modify based on comment
      
      * modify payload_t
      
      * fix bug for alibi mode
      
      * fix alibi test err
      
      * refactor kernel launch, support select timer
      
      * add missing file
      
      * remove useless code
      
      * add some comments
  14. 22 May, 2024 1 commit
  15. 20 May, 2024 1 commit
  16. 17 May, 2024 2 commits
  17. 15 May, 2024 3 commits
  18. 10 May, 2024 2 commits
  19. 09 May, 2024 2 commits
  20. 08 May, 2024 2 commits
  21. 07 May, 2024 2 commits
  22. 02 May, 2024 1 commit
  23. 29 Apr, 2024 1 commit
  24. 26 Apr, 2024 2 commits