1. 30 Jul, 2024 1 commit
    • Fp8 kernel with "in-kernel" transpose of V in producer (#1100) · 5018ac6a
      jayhshah authored
      * base version
      
      * restructure pipelines, add special fp8 epilogue
      
      * add variants
      
      * add fp8 causal and modify dynamic tile scheduler
      
      * better causal schedule
      
      * maintain two schedules for non causal and causal
      
      * removing macros
      
      * fix regression
      
      * clean up unneeded methods and variants
      
      * fix mistake with NumProducerThreads
      
      * use seqlen traits
      
      * add fp8 .cu files and benchmark script
      
      * fix merge issue
      
      * fix merge issue
      
      * fix merge issue
      
      * remove duplicate code
      
      * fix regression with varseqlen
      
      * move varseqlen init in constexpr
      
      * fix test script
      
      * more constexpr on varseqlen and add max offset
      
      * add back test cases
  2. 25 Jul, 2024 1 commit
  3. 23 Jul, 2024 1 commit
    • Changes For FP8 (#1075) · 1899c970
      ganeshcolfax authored

      * adding files for fp8 changes.
      
      * removed contiguous check.
      
      * enable all tests except odd-seq-lengths, where it crashes now.
      
      * undid clang formatting.
      
      * change to correct tile size for headdim=128.
      
      * fixed odd-seq-len-k.
      
      * minor formatting.
      
      * minor reformatting.
      
      ---------
      Co-authored-by: Tri Dao <tridao@users.noreply.github.com>
  4. 11 Jul, 2024 1 commit