1. 27 Oct, 2023 1 commit
    • support batch & nhead, and scale (#20) · 95889861
      carlushuang authored
      * support batch & nhead
      
      * support scale
      
      * tile scheduler
      
      * rename tile-scheduler to tile-partitioner
      
      * add some exp2 math
      
      * fix a bug when changing tile size
  2. 19 Oct, 2023 1 commit
    • add fmha fwd pipeline (#17) · 9f36ac7c
      carlushuang authored
      
      
      * Revert "Extract gemm0 prefetch0 out from loop"
      
      This reverts commit d3b56f39f9fd12edb476b24ae9cf480841d311e4.
      
      * add fmha fwd pipeline
      
      * Extract gemm0 prefetch0 out from loop
      
      * move blockSize to another place; fix a missing header in tile_window_impl_static_distribution.hpp
      
      * remove KArgs from tile modules
      
      ---------
      Co-authored-by: Po-Yen, Chen <PoYen.Chen@amd.com>
      Co-authored-by: Chao Liu <chao.liu2@amd.com>
  3. 12 Oct, 2023 2 commits
    • Refactor 1010 (#14) · 7337ec25
      Chao Liu authored
      * refactor
      
      * refactor
      
      * change load_tile, update block gemm
      
      * debug
      
      * clean
      
      * clean
      
      * experiment lod
      
      * workaround spilling issue
      
      * clean
    • slice kv, and use 3d padding LDS layout (#15) · 7b1a0b7f
      carlushuang authored
      * slice kv, and use 3d padding LDS layout
      
      * add missing sync
      
      * put sync to another place
      
      * move sync place
      
      * revert to normal
  4. 14 Sep, 2023 1 commit