- 05 Dec, 2024 2 commits
- 04 Dec, 2024 1 commit
-
-
zhuwenwen authored
-
- 03 Dec, 2024 1 commit
-
-
gaoqiong authored
-
- 27 Nov, 2024 1 commit
-
-
zhuwenwen authored
add VLLM_OPTEST_MODELS_PATH/OPTEST_MODELS_PATH to load models from local path instead of Hugging Face Hub
-
- 20 Nov, 2024 1 commit
-
-
zhuwenwen authored
-
- 19 Nov, 2024 1 commit
-
-
zhuwenwen authored
-
- 15 Nov, 2024 1 commit
-
-
王敏 authored
-
- 10 Nov, 2024 1 commit
-
-
zhuwenwen authored
-
- 09 Oct, 2024 1 commit
-
-
zhuwenwen authored
-
- 25 Sep, 2024 1 commit
-
-
bnellnm authored
-
- 23 Sep, 2024 1 commit
-
-
Lucas Wilkinson authored
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Divakar Verma <137818590+divakar-amd@users.noreply.github.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
- 19 Sep, 2024 1 commit
-
-
Charlie Fu authored
-
- 18 Sep, 2024 2 commits
-
-
Tyler Michael Smith authored
-
Cyrus Leung authored
-
- 17 Sep, 2024 2 commits
-
-
Tyler Michael Smith authored
-
Simon Mo authored
-
- 16 Sep, 2024 2 commits
-
-
Luka Govedič authored
-
ElizaWszola authored
Co-authored-by: Dipika <dipikasikka1@gmail.com>
-
- 15 Sep, 2024 1 commit
-
-
Isotr0py authored
-
- 14 Sep, 2024 1 commit
-
-
Charlie Fu authored
-
- 13 Sep, 2024 1 commit
-
-
Cyrus Leung authored
-
- 11 Sep, 2024 1 commit
-
-
bnellnm authored
Co-authored-by: Sage Moore <sage@neuralmagic.com>
-
- 10 Sep, 2024 1 commit
-
-
Dipika Sikka authored
-
- 05 Sep, 2024 1 commit
-
-
Elfie Guo authored
-
- 29 Aug, 2024 2 commits
-
-
Pavani Majety authored
[Core][Kernels] Enable FP8 KV Cache with Flashinfer backend. + BugFix for kv_cache_dtype=auto (#7985)
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
youkaichao authored
-
- 28 Aug, 2024 3 commits
-
-
Mor Zusman authored
-
rasmith authored
[Kernel] [Triton] [AMD] Adding Triton implementations awq_dequantize and awq_gemm to support AWQ (#7386)
-
Pavani Majety authored
Co-authored-by: Simon Mo <simon.mo@hey.com>
-
- 21 Aug, 2024 2 commits
- 20 Aug, 2024 1 commit
-
-
Lucas Wilkinson authored
-
- 16 Aug, 2024 3 commits
-
-
Charlie Fu authored
-
youkaichao authored
-
jon-chuang authored
-
- 12 Aug, 2024 1 commit
-
-
jon-chuang authored
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 08 Aug, 2024 1 commit
-
-
Luka Govedič authored
-
- 06 Aug, 2024 2 commits
-
-
afeldman-nm authored
[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)
Co-authored-by: Andrew Feldman <afeld2012@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
-
Luka Govedič authored
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-