  1. 13 Dec, 2024 1 commit
  2. 12 Dec, 2024 1 commit
    • [CK_TILE] naive attn (#1708) · 77a38e02
      carlushuang authored
      * add reference attention fwd
      
      * refactor addresser
      
      * update
      
      * paged, and i8 reflect-quant
      
      * let's call it forward-quant
      
      * fix error in decode variation
      
      * update naive-attn
      
      * fix page table
      
      * fix build err
  3. 10 Dec, 2024 4 commits
  4. 09 Dec, 2024 3 commits
  5. 06 Dec, 2024 5 commits
  6. 05 Dec, 2024 2 commits
  7. 04 Dec, 2024 2 commits
  8. 03 Dec, 2024 2 commits
    • Add basic documentation structure (#1715) · 5affda81
      Bartłomiej Kocot authored
      * Add basic documentation structure
      
      * Add terminology placeholder
      
      * Add codegen placeholder
      
      * Create template for each page
    • OCP FP8 support for gfx12. (#1710) · 08d5c02c
      Illia Silin authored
      * (2/5) bilinear gemm pass, perf bug: skip a lds has lower performance than skip b lds
      
      * (3/5) batched gemm pass, perf bug: skip a lds has lower performance than skip b lds
      
      * (4/5) grouped conv pass
      
      * (5/5) attention pass, todo: debug lds perf bug
      
      * AIT Attention API refactor (#8)
      
      * sanity pass
      
      * sanity pass 2
      
      * confirm significant performance regression.
      
      * turn on all instances
      
      * turn off instance format
      
      * Fix bug & tuning & format
      
      * DML meta, self_attn+cross_attn
      
      * sanity pass
      
      * remove useless flag
      
      * update tile and problem size used in AIT attention
      
      * bug fix in grouped conv supporting check
      
      * deprecate inline asm wmma
      
      * Bug fix: double lds skip
      
      * clang-format
      
      * Fix errors in
      1. example, fmha
      2. gridwise pipeline
      3. deviceop, fmha, change some containers from vector to array
      
      * part2 of previous commit
      
      * clang format
      
      * API fix of gridwisegemmpipeline
      
      * separate array base and vector base attention...
  9. 02 Dec, 2024 2 commits
  10. 30 Nov, 2024 2 commits
  11. 29 Nov, 2024 2 commits
  12. 28 Nov, 2024 3 commits
  13. 27 Nov, 2024 3 commits
  14. 26 Nov, 2024 7 commits
  15. 25 Nov, 2024 1 commit