- 19 Jan, 2025 1 commit
-
-
Isotr0py authored
Signed-off-by:Isotr0py <2037008807@qq.com>
-
- 11 Jan, 2025 1 commit
-
-
sixgod authored
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
-
- 30 Dec, 2024 1 commit
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 26 Nov, 2024 1 commit
-
-
Roger Wang authored
Signed-off-by:Roger Wang <ywang@roblox.com>
-
- 24 Nov, 2024 1 commit
-
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
- 22 Nov, 2024 1 commit
-
-
youkaichao authored
Signed-off-by:youkaichao <youkaichao@gmail.com>
-
- 19 Nov, 2024 1 commit
-
-
Jee Jee Li authored
Signed-off-by:Jee Jee Li <pandaleefree@gmail.com>
-
- 18 Nov, 2024 2 commits
-
-
B-201 authored
Signed-off-by:B-201 <Joy25810@foxmail.com>
-
Isotr0py authored
Signed-off-by:Isotr0py <2037008807@qq.com>
-
- 14 Nov, 2024 1 commit
-
-
youkaichao authored
Signed-off-by:youkaichao <youkaichao@gmail.com>
-
- 13 Nov, 2024 1 commit
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 11 Nov, 2024 1 commit
-
-
youkaichao authored
Signed-off-by:youkaichao <youkaichao@gmail.com>
-
- 09 Nov, 2024 2 commits
-
-
youkaichao authored
Signed-off-by:youkaichao <youkaichao@gmail.com>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 06 Nov, 2024 3 commits
-
-
Joe Runde authored
Signed-off-by:Joe Runde <Joseph.Runde@ibm.com>
-
Aaron Pham authored
Signed-off-by:Aaron Pham <contact@aarnphm.xyz>
-
zifeitong authored
Signed-off-by:Zifei Tong <zifeitong@gmail.com>
-
- 02 Nov, 2024 1 commit
-
-
youkaichao authored
Signed-off-by:youkaichao <youkaichao@gmail.com>
-
- 24 Oct, 2024 1 commit
-
-
Cyrus Leung authored
-
- 16 Oct, 2024 1 commit
-
-
Cyrus Leung authored
-
- 11 Oct, 2024 1 commit
-
-
sixgod authored
-
- 04 Oct, 2024 1 commit
-
-
Murali Andoorveedu authored
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 30 Aug, 2024 1 commit
-
-
afeldman-nm authored
-
- 20 Aug, 2024 1 commit
-
-
Zijian Hu authored
-
- 13 Aug, 2024 1 commit
-
-
Cyrus Leung authored
-
- 02 Jul, 2024 2 commits
-
-
Qubitium-ModelCloud authored
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: ZX <zx@lbx.dev>
-
Murali Andoorveedu authored
Signed-off-by:Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
-
- 27 Jun, 2024 2 commits
-
-
Cyrus Leung authored
-
Cyrus Leung authored
-
- 22 May, 2024 1 commit
-
-
Cody Yu authored
The second PR for #4532. This PR adds support for loading FP8 kv-cache scaling factors from an FP8 checkpoint (one that carries a .kv_scale parameter).
-
- 18 May, 2024 1 commit
-
-
SangBin Cho authored
Currently the rotary embedding kernel has to be called once per LoRA, which makes it hard to serve multiple long-context LoRAs. This change adds a batched rotary embedding kernel and pipes it through. It replaces the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor. Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
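The idea of one rotary pass over tokens that use different scaling factors can be illustrated with a minimal pure-Python sketch. This is not the kernel from the commit: the function names, the linear position scaling, the NeoX-style half split, and the per-token offset layout are all assumptions made for illustration.

```python
import math

def build_cos_sin_cache(max_pos, rot_dim, scaling_factor, base=10000.0):
    """Cache for ONE scaling factor; row p is [cos..., sin...] for position p.

    Assumes linear scaling: the cache covers max_pos * scaling_factor
    positions and each position is divided by the scaling factor.
    """
    inv_freq = [base ** (-i / rot_dim) for i in range(0, rot_dim, 2)]
    rows = []
    for p in range(max_pos * scaling_factor):
        angles = [(p / scaling_factor) * f for f in inv_freq]
        rows.append([math.cos(a) for a in angles] + [math.sin(a) for a in angles])
    return rows

def build_batched_cache(max_pos, rot_dim, scaling_factors):
    """Concatenate the per-factor caches and record where each one starts."""
    cache, offsets = [], {}
    for sf in scaling_factors:
        offsets[sf] = len(cache)
        cache.extend(build_cos_sin_cache(max_pos, rot_dim, sf))
    return cache, offsets

def rotate(vec, row):
    """Apply one cache row to one vector (first half paired with second half)."""
    half = len(vec) // 2
    cos, sin = row[:half], row[half:]
    x1, x2 = vec[:half], vec[half:]
    first = [a * c - b * s for a, b, c, s in zip(x1, x2, cos, sin)]
    second = [b * c + a * s for a, b, c, s in zip(x1, x2, cos, sin)]
    return first + second

def batched_rotary(tokens, positions, token_offsets, cache):
    """Single pass over all tokens; each token selects its own cache region."""
    return [rotate(v, cache[p + o])
            for v, p, o in zip(tokens, positions, token_offsets)]

# Hypothetical usage: two tokens at the same position, different scaling factors.
cache, offsets = build_batched_cache(max_pos=4, rot_dim=4, scaling_factors=[1, 2])
tokens = [[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]]
out = batched_rotary(tokens, [2, 2], [offsets[1], offsets[2]], cache)
```

Because every token carries its own offset into the concatenated cache, tokens that belong to different scaling factors can share one call instead of requiring one call per factor.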
-
- 13 May, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 26 Apr, 2024 1 commit
-
-
Cody Yu authored
-
- 16 Apr, 2024 1 commit
-
-
Antoni Baum authored
-
- 10 Apr, 2024 1 commit
-
-
youkaichao authored
[WIP][Core][Refactor] move vllm/model_executor/parallel_utils into vllm/distributed and vllm/device_communicators (#3950)
-
- 26 Mar, 2024 1 commit
-
-
Jee Li authored
Co-authored-by:Antoni Baum <antoni.baum@protonmail.com>
-
- 25 Mar, 2024 2 commits
-
-
SangBin Cho authored
-
Woosuk Kwon authored
-
- 20 Mar, 2024 1 commit
-
-
Roy authored
-
- 07 Mar, 2024 1 commit
-
-
Woosuk Kwon authored
-