- 25 Feb, 2025 14 commits
-
-
Shanshan Shen authored
-
Russell Bryant authored
Signed-off-by: Russell Bryant <rbryant@redhat.com>
-
Cyrus Leung authored
-
Varun Sundar Rabindranath authored
-
Michael Goin authored
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
Lucas Wilkinson authored
-
Mark McLoughlin authored
-
wangxiyuan authored
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
-
Tyler Michael Smith authored
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
cjackal authored
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
-
Eli Boyarski authored
-
Harry Mellor authored
-
Robert Shaw authored
-
- 24 Feb, 2025 10 commits
-
-
Robert Shaw authored
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
Roger Wang authored
-
afeldman-nm authored
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
-
Nicolò Lucchesi authored
[Misc][Docs] Raise error when flashinfer is not installed and `VLLM_ATTENTION_BACKEND` is set (#12513)
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Zhonghua Deng authored
-
Jongseok Park authored
-
Roger Meier authored
-
Mengqing Cao authored
-
Roger Wang authored
-
- 23 Feb, 2025 7 commits
-
-
Nick Hill authored
Even though ZMQ context.destroy() is meant to close open sockets before terminating the context, it appears to be necessary to close them explicitly, or else it can hang in the context.term() method.
Close ZMQ sockets explicitly before terminating the context, make shutdown of client resources more robust, and shut down the engine core process before terminating the ZMQ context.
Signed-off-by: Nick Hill <nhill@redhat.com>
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
Nick Hill authored
-
Isotr0py authored
-
Kyle Sayers authored
-
Kevin H. Luu authored
-
Andy Lo authored
Signed-off-by: Andy Lo <andy@mistral.ai>
-
- 22 Feb, 2025 9 commits
-
-
Helena Kloosterman authored
-
Gregory Shtrasberg authored
-
Sage Moore authored
[V1][Kernel] Refactor the prefix_prefill kernel so that the caller no longer has to pass in the context lengths (#13095)
-
Kaixi Hou authored
-
Keyun Tong authored
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
Cyrus Leung authored
-
Jee Jee Li authored
-