- 30 Jul, 2024 1 commit
jayhshah authored
* base version
* restructure pipelines, add special fp8 epilogue
* add variants
* add fp8 causal and modify dynamic tile scheduler
* better causal schedule
* maintain two schedules for non causal and causal
* removing macros
* fix regression
* clean up unneeded methods and variants
* fix mistake with NumProducerThreads
* base version
* restructure pipelines, add special fp8 epilogue
* add variants
* add fp8 causal and modify dynamic tile scheduler
* better causal schedule
* maintain two schedules for non causal and causal
* removing macros
* fix regression
* clean up unneeded methods and variants
* fix mistake with NumProducerThreads
* use seqlen traits
* add fp8 .cu files and benchmark script
* fix merge issue
* fix merge issue
* fix merge issue
* remove duplicate code
* fix regression with varseqlen
* move varseqlen init in constexpr
* fix test script
* more constexpr on varseqlen and add max offset
* add back test cases
- 25 Jul, 2024 1 commit
- 23 Jul, 2024 2 commits
ganeshcolfax authored
* adding files for fp8 changes.
* removed contiguous check.
* enable all tests except odd-seq-lengths, where it crashes now.
* undid clang formatting.
* change to correct tile size for headdim=128.
* fixed odd-seq-len-k.
* minor formatting.
* minor reformatting.
---------
Co-authored-by: Tri Dao <tridao@users.noreply.github.com>
Ying Zhang authored
* fwd var-seq-len
* fixes
* benchmark
* fixes
---------
Co-authored-by: Tri Dao <tridao@users.noreply.github.com>
- 15 Jul, 2024 1 commit
Tri Dao authored
- 11 Jul, 2024 1 commit
Tri Dao authored