1. 30 Jul, 2024 1 commit
    • Fp8 kernel with "in-kernel" transpose of V in producer (#1100) · 5018ac6a
      jayhshah authored
      * base version
      
      * restructure pipelines, add special fp8 epilogue
      
      * add variants
      
      * add fp8 causal and modify dynamic tile scheduler
      
      * better causal schedule
      
      * maintain two schedules for non causal and causal
      
      * removing macros
      
      * fix regression
      
      * clean up unneeded methods and variants
      
      * fix mistake with NumProducerThreads
      
      * use seqlen traits
      
      * add fp8 .cu files and benchmark script
      
      * fix merge issue
      
      * fix merge issue
      
      * fix merge issue
      
      * remove duplicate code
      
      * fix regression with varseqlen
      
      * move varseqlen init in constexpr
      
      * fix test script
      
      * more constexpr on varseqlen and add max offset
      
      * add back test cases
  2. 25 Jul, 2024 1 commit
  3. 23 Jul, 2024 2 commits
  4. 15 Jul, 2024 1 commit
  5. 11 Jul, 2024 1 commit