- 29 Jan, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 20 Jan, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 12 Jan, 2025 1 commit
-
-
Rafael Vasquez authored
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
-
- 08 Jan, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 18 Dec, 2024 1 commit
-
-
Dipika Sikka authored
Co-authored-by: Faraz Shahsavan <faraz.shahsavan@gmail.com>
Co-authored-by: ilmarkov <markovilya197@gmail.com>
Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
-
- 11 Dec, 2024 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 25 Nov, 2024 1 commit
-
-
Wallas Henrique authored
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
- 12 Nov, 2024 1 commit
-
-
youkaichao authored
-
- 10 Nov, 2024 1 commit
-
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
- 09 Nov, 2024 1 commit
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 08 Nov, 2024 2 commits
-
-
Florian Zimmermeister authored
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 07 Nov, 2024 1 commit
-
-
Russell Bryant authored
Signed-off-by: Russell Bryant <rbryant@redhat.com>
-
- 06 Nov, 2024 2 commits
-
-
Joe Runde authored
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
-
Aaron Pham authored
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
-
- 04 Nov, 2024 1 commit
-
-
bnellnm authored
Signed-off-by: Bill Nell <bill@neuralmagic.com>
-
- 27 Oct, 2024 1 commit
-
-
bnellnm authored
Signed-off-by: Bill Nell <bill@neuralmagic.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
-
- 14 Oct, 2024 1 commit
-
-
Daniele authored
-
- 24 Sep, 2024 1 commit
-
-
Daniele authored
-
- 23 Sep, 2024 1 commit
-
-
Daniele authored
Co-authored-by: youkaichao <youkaichao@126.com>
-
- 18 Sep, 2024 1 commit
-
-
Aaron Pham authored
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
- 13 Sep, 2024 2 commits
-
-
Cyrus Leung authored
-
Cyrus Leung authored
-
- 27 Aug, 2024 1 commit
-
-
Jonathan Berkhahn authored
-
- 21 Aug, 2024 2 commits
-
-
sasha0552 authored
-
Cyrus Leung authored
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Fei <dfdfcai4@gmail.com>
-
- 12 Aug, 2024 1 commit
-
-
Daniele authored
-
- 01 Aug, 2024 1 commit
-
-
Sage Moore authored
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
- 31 Jul, 2024 2 commits
-
-
Simon Mo authored
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
-
Cyrus Leung authored
-
- 12 Jul, 2024 1 commit
-
-
Cody Yu authored
-
- 30 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 18 Jun, 2024 1 commit
-
-
Roger Wang authored
-
- 06 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 01 Jun, 2024 1 commit
-
-
Tyler Michael Smith authored
-
- 28 May, 2024 1 commit
-
-
Cyrus Leung authored
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 21 May, 2024 1 commit
-
-
Michael Goin authored
-
- 18 May, 2024 1 commit
-
-
SangBin Cho authored
Currently the rotary embedding kernel has to be called once per LoRA, which makes it hard to serve multiple long-context LoRAs together. This change adds a batched rotary embedding kernel and pipes it through. It replaces the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor. Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
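The idea of a single rotary pass over several cos-sin caches can be illustrated with a small pure-Python sketch. This is not the kernel from the commit: it assumes linear position scaling (position divided by the scaling factor) and a NeoX-style pairing of dimensions, and every name in it (`build_cos_sin_cache`, `build_batched_cache`, `batched_rotary`) is hypothetical. The per-factor caches are concatenated into one table, and each token carries an offset that selects the cache for its scaling factor.

```python
import math


def build_cos_sin_cache(rot_dim, max_pos, scaling_factor, base=10000.0):
    """Build (cos, sin) rows for one scaling factor (assumed: linear scaling)."""
    inv_freq = [base ** (-2.0 * i / rot_dim) for i in range(rot_dim // 2)]
    cache = []
    for pos in range(max_pos):
        scaled = pos / scaling_factor
        angles = [scaled * f for f in inv_freq]
        cache.append(([math.cos(a) for a in angles],
                      [math.sin(a) for a in angles]))
    return cache


def build_batched_cache(rot_dim, max_pos, scaling_factors):
    """Concatenate the per-factor caches and record where each one starts."""
    cache, offsets = [], {}
    for factor in scaling_factors:
        offsets[factor] = len(cache)
        # A larger factor covers a proportionally longer context.
        cache.extend(build_cos_sin_cache(rot_dim, int(max_pos * factor), factor))
    return cache, offsets


def rotate(vec, cos, sin):
    """Rotate the pairs (vec[i], vec[i + half]); NeoX-style pairing is assumed."""
    half = len(vec) // 2
    out = [0.0] * len(vec)
    for i in range(half):
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * cos[i] - x2 * sin[i]
        out[i + half] = x2 * cos[i] + x1 * sin[i]
    return out


def batched_rotary(vectors, positions, cache_offsets, cache):
    """Single pass over all tokens; each token reads row offset + position."""
    return [rotate(vec, *cache[off + pos])
            for vec, pos, off in zip(vectors, positions, cache_offsets)]


# Two tokens from requests with different scaling factors, handled in one call.
cache, offsets = build_batched_cache(rot_dim=4, max_pos=8, scaling_factors=[1.0, 4.0])
vectors = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
rotated = batched_rotary(vectors, [1, 4], [offsets[1.0], offsets[4.0]], cache)
```

With these inputs, position 4 under factor 4.0 maps to the same angles as position 1 under factor 1.0, so both tokens receive the same rotation even though they read from different regions of the combined cache.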
-
- 30 Apr, 2024 1 commit
-
-
Michael Goin authored
-
- 26 Apr, 2024 1 commit
-
-
SangBin Cho authored
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
-