- 29 Aug, 2024 2 commits
-
-
Pavani Majety authored
[Core][Kernels] Enable FP8 KV Cache with Flashinfer backend. + BugFix for kv_cache_dtype=auto (#7985)
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
youkaichao authored
-
- 28 Aug, 2024 1 commit
-
-
Pavani Majety authored
Co-authored-by: Simon Mo <simon.mo@hey.com>
-
- 21 Aug, 2024 1 commit
-
-
LI MOU authored
[BUG] fix crash on flashinfer backend with cudagraph disabled, when attention group_size not in [1,2,4,8] (#7509)
-
- 16 Aug, 2024 1 commit
-
-
jon-chuang authored
-
- 04 Jul, 2024 1 commit
-
-
Lily Liu authored
Co-authored-by: Simon Mo <simon.mo@hey.com>
-