- 24 Feb, 2025 10 commits
-
-
afeldman-nm authored
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com> Co-authored-by: Nick Hill <nhill@redhat.com>
-
Nicolò Lucchesi authored
[Misc][Docs] Raise error when flashinfer is not installed and `VLLM_ATTENTION_BACKEND` is set (#12513) Signed-off-by: NickLucche <nlucches@redhat.com>
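The check this commit title describes could be sketched roughly as follows. This is a minimal illustration, not the code from #12513: the function name `check_attention_backend`, the `"FLASHINFER"` value, and the error wording are assumptions made for the example.

```python
import importlib.util
import os


def check_attention_backend(env=None, find_spec=importlib.util.find_spec):
    """Hypothetical helper: fail early if the requested backend is missing.

    `env` and `find_spec` are injectable so the check can be exercised
    without touching the real environment or installed packages.
    """
    env = os.environ if env is None else env
    backend = env.get("VLLM_ATTENTION_BACKEND")
    # Assumption: "FLASHINFER" is the value that selects the flashinfer backend.
    if backend == "FLASHINFER" and find_spec("flashinfer") is None:
        raise ImportError(
            "VLLM_ATTENTION_BACKEND is set to FLASHINFER, "
            "but the flashinfer package is not installed."
        )
    return backend
```

Raising at configuration time, rather than falling back silently, makes the mismatch visible before any model is loaded.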
-
Zhonghua Deng authored
-
Jongseok Park authored
-
Roger Meier authored
-
Roger Meier authored
-
Mengqing Cao authored
-
Roger Wang authored
-
Kevin H. Luu authored
Signed-off-by: <> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-63-253.us-west-2.compute.internal>
-
Huy Do authored
Signed-off-by: Huy Do <huydhn@gmail.com>
-
- 23 Feb, 2025 9 commits
-
-
Nick Hill authored
Even though ZMQ context.destroy() is meant to close open sockets before terminating the context, it appears to be necessary to close them explicitly, or else it can hang in the context.term() method. Close ZMQ sockets explicitly before terminating the context, make shutdown of client resources more robust, and shut down the engine core process prior to terminating the ZMQ context. Signed-off-by: Nick Hill <nhill@redhat.com>
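The shutdown order described in this message can be sketched as below. This is an illustrative sketch, not the code from the commit: `shutdown_client`, its parameters, and `demo` are hypothetical names, and the objects are duck-typed so that any socket with `close(linger=...)` and any context with `term()` will work. The `demo` function assumes pyzmq is installed.

```python
def shutdown_client(ctx, sockets, proc=None, join_timeout=5.0):
    """Tear down client resources in the order the commit message describes."""
    # 1. Stop the engine core process before touching the ZMQ context.
    if proc is not None and proc.is_alive():
        proc.terminate()
        proc.join(join_timeout)
    # 2. Close every socket explicitly. linger=0 discards unsent messages,
    #    so close() does not wait for them to be delivered.
    for sock in sockets:
        if not sock.closed:
            sock.close(linger=0)
    # 3. Only now terminate the context; no open sockets remain for
    #    term() to wait on.
    ctx.term()


def demo():
    """Usage with real pyzmq objects (assumes `pip install pyzmq`)."""
    import zmq

    ctx = zmq.Context()
    push = ctx.socket(zmq.PUSH)
    push.bind("inproc://demo")
    pull = ctx.socket(zmq.PULL)
    pull.connect("inproc://demo")
    shutdown_client(ctx, [push, pull])
```

Keeping the teardown in one function makes the ordering explicit, instead of relying on `context.destroy()` to close the sockets implicitly.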
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
Roger Wang authored
Signed-off-by: Roger Wang <ywang@roblox.com>
-
Nick Hill authored
-
Isotr0py authored
-
Kyle Sayers authored
-
Kevin H. Luu authored
-
Andy Lo authored
Signed-off-by: Andy Lo <andy@mistral.ai>
-
Roger Wang authored
Signed-off-by: Roger Wang <ywang@roblox.com>
-
- 22 Feb, 2025 21 commits
-
-
Daniele authored
-
Yan Ma authored
-
Helena Kloosterman authored
-
Cyrus Leung authored
-
Gregory Shtrasberg authored
-
Sage Moore authored
[V1][Kernel] Refactor the prefix_prefill kernel so that the caller no longer has to pass in the context lengths (#13095)
-
Kaixi Hou authored
-
Keyun Tong authored
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
Cyrus Leung authored
-
Jee Jee Li authored
-
Mark McLoughlin authored
-
Mark McLoughlin authored
-
Yu Chin Fabian Lim authored
-
Jennifer Zhao authored
Signed-off-by: Jennifer Zhao <7443418+JenZhao@users.noreply.github.com> Co-authored-by: Roger Wang <ywang@roblox.com>
-
Lu Fang authored
Signed-off-by: Lu Fang <lufang@fb.com>
-
Robin authored
[Bugfix] Fix benchmark script bug: inaccurate stats for vllm backend when max_model_len < input_len + output_len (#13691) Signed-off-by: WangErXiao <863579016@qq.com>
-
Dipika Sikka authored
-
Shane A authored
-
Gordon Wong authored
-