- 29 Jul, 2024 1 commit
-
-
Isotr0py authored
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 26 Jul, 2024 1 commit
-
-
Michael Goin authored
-
- 23 Jul, 2024 2 commits
-
-
Roger Wang authored
-
Roger Wang authored
-
- 22 Jul, 2024 1 commit
-
-
Roger Wang authored
-
- 10 Jul, 2024 1 commit
-
-
Abhinav Goyal authored
-
- 01 Jul, 2024 1 commit
-
-
Thomas Parnell authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>
-
- 27 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 21 Jun, 2024 1 commit
-
-
Joshua Rosenkranz authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
-
- 18 May, 2024 1 commit
-
-
SangBin Cho authored
Currently, the rotary embedding kernel must be called once per LoRA, which makes it hard to serve multiple long-context LoRAs. Add a batched rotary embedding kernel and pipe it through. It replaces the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor. Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
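The idea in the commit message above can be illustrated with a small pure-Python sketch. It is not the kernel from this commit: the function names, the linear position scaling, the neox-style pairing of elements, and the "one merged table plus per-factor offsets" layout are all assumptions chosen for illustration. It shows how a single batched call can serve tokens that belong to different scaling factors by giving each token an offset into a merged cos-sin table.

```python
import math


def build_cache(rot_dim, max_pos, scaling_factor=1.0, base=10000.0):
    """Hypothetical helper: (cos, sin) rows for one scaling factor.

    Assumes linear scaling, i.e. positions are divided by the factor and
    the cache covers max_pos * scaling_factor positions.
    """
    inv_freq = [base ** (-2.0 * i / rot_dim) for i in range(rot_dim // 2)]
    rows = []
    for pos in range(int(max_pos * scaling_factor)):
        angles = [(pos / scaling_factor) * f for f in inv_freq]
        rows.append(([math.cos(a) for a in angles],
                     [math.sin(a) for a in angles]))
    return rows


def build_merged_cache(rot_dim, max_pos, scaling_factors):
    """Concatenate the per-factor caches and record where each one starts."""
    merged, offsets = [], {}
    for factor in scaling_factors:
        offsets[factor] = len(merged)
        merged.extend(build_cache(rot_dim, max_pos, factor))
    return merged, offsets


def rotate(vec, cos, sin):
    """Rotate one vector, pairing element i with element i + half."""
    half = len(vec) // 2
    x1, x2 = vec[:half], vec[half:]
    out1 = [a * c - b * s for a, b, c, s in zip(x1, x2, cos, sin)]
    out2 = [b * c + a * s for a, b, c, s in zip(x1, x2, cos, sin)]
    return out1 + out2


def batched_rotary(vectors, positions, cache_offsets, merged_cache):
    """One pass over the batch; each token selects its own cache region."""
    out = []
    for vec, pos, offset in zip(vectors, positions, cache_offsets):
        cos, sin = merged_cache[offset + pos]
        out.append(rotate(vec, cos, sin))
    return out
```

In this sketch a token at position 2 under scaling factor 2.0 receives the same rotation as a token at position 1 under factor 1.0, and both can sit in the same batch because only their cache offsets differ.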
-
- 09 May, 2024 1 commit
-
-
Hao Zhang authored
Co-authored-by: Dash Desai <1723932+iamontheinet@users.noreply.github.com>
Co-authored-by: Aurick Qiao <qiao@aurick.net>
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
Co-authored-by: Aurick Qiao <aurickq@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 26 Apr, 2024 1 commit
-
-
SangBin Cho authored
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
-
- 23 Apr, 2024 1 commit
-
-
SangBin Cho authored
-
- 12 Apr, 2024 1 commit
-
-
Michael Feil authored
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
-
- 27 Mar, 2024 1 commit
-
-
Megha Agarwal authored
-
- 25 Mar, 2024 1 commit
-
-
SangBin Cho authored
-
- 21 Mar, 2024 2 commits
-
-
Woosuk Kwon authored
Co-authored-by: Roy <jasonailu87@gmail.com>
Co-authored-by: Roger Meier <r.meier@siemens.com>
-
Lalit Pradhan authored
-
- 11 Mar, 2024 1 commit
-
-
Zhuohan Li authored
-
- 29 Feb, 2024 1 commit
-
-
Seonghyeon authored
-
- 27 Feb, 2024 1 commit
-
-
Roy authored
-
- 19 Feb, 2024 1 commit
-
-
Isotr0py authored
-
- 14 Feb, 2024 1 commit
-
-
Roy authored
-
- 13 Feb, 2024 3 commits
-
-
Philipp Moritz authored
Co-authored-by: Roy <jasonailu87@gmail.com>
-
Philipp Moritz authored
This reverts commit 5c976a7e.
-
Roy authored
-
- 20 Nov, 2023 1 commit
-
-
Simon Mo authored
-
- 16 Nov, 2023 1 commit
-
-
Megha Agarwal authored
-
- 07 Nov, 2023 1 commit
-
-
GoHomeToMacDonal authored
-
- 06 Nov, 2023 1 commit
-
-
Roy authored
-
- 01 Nov, 2023 1 commit
-
-
Woosuk Kwon authored
-
- 13 Oct, 2023 2 commits
-
-
Lu Wang authored
-
Woosuk Kwon authored
-
- 28 Sep, 2023 2 commits
-
-
Woosuk Kwon authored
-
Chris Bamford authored
Co-authored-by: timlacroix <t@mistral.ai>
-
- 27 Sep, 2023 1 commit
-
-
Qing authored
-
- 22 Aug, 2023 1 commit
-
-
shunxing1234 authored
* add aquila
  Signed-off-by: ftgreat <ftgreat@163.com>
* fix some bug
  Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* delete pdb
  Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* fix bugs
  Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* fix bugs
  Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* delete whitespace
  Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* format
* fix order
---------
Signed-off-by: ftgreat <ftgreat@163.com>
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
Co-authored-by: ftgreat <ftgreat@163.com>
-
- 08 Aug, 2023 1 commit
-
-
Qing authored
Co-authored-by: wq.chu <wq.chu@tianrang-inc.com>
-
- 02 Aug, 2023 1 commit
-
-
Zhuohan Li authored
-
- 17 Jul, 2023 1 commit
-
-
codethazine authored
-