- 21 Jun, 2024 1 commit
-
-
Joshua Rosenkranz authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
-
- 18 Jun, 2024 1 commit
-
-
Ronen Schaffer authored
This PR adds basic support for OpenTelemetry distributed tracing, enabling tracing functionality and improving monitoring capabilities. It also adds a Markdown guide with screenshots that shows users how to use the feature.
-
- 17 Jun, 2024 3 commits
-
-
Bruce Fontaine authored
-
Kunshang Ji authored
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
-
Amit Garg authored
-
- 15 Jun, 2024 2 commits
-
-
Nick Hill authored
-
SangBin Cho authored
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
- 13 Jun, 2024 2 commits
-
-
Antoni Baum authored
-
Cody Yu authored
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
-
- 12 Jun, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 11 Jun, 2024 3 commits
-
-
Nick Hill authored
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
-
sasha0552 authored
-
maor-ps authored
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 10 Jun, 2024 1 commit
-
-
Dipika Sikka authored
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
- 07 Jun, 2024 1 commit
-
-
Roger Wang authored
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 06 Jun, 2024 2 commits
-
-
liuyhwangyh authored
Co-authored-by: mulin.lyh <mulin.lyh@taobao.com>
-
Cyrus Leung authored
-
- 05 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 03 Jun, 2024 2 commits
-
-
Kaiyang Chen authored
-
Cyrus Leung authored
-
- 01 Jun, 2024 1 commit
-
-
chenqianfzh authored
-
- 30 May, 2024 1 commit
-
-
Robert Shaw authored
-
- 27 May, 2024 1 commit
-
-
Zhuohan Li authored
Co-authored-by: rsnm2 <rshaw@neuralmagic.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
-
- 22 May, 2024 2 commits
- 18 May, 2024 1 commit
-
-
SangBin Cho authored
Currently the rotary embedding kernel has to be called once per LoRA, which makes it hard to serve multiple long-context LoRAs. This change adds a batched rotary embedding kernel and pipes it through. It replaces the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor. Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
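The idea of one rotary pass over several cos-sin caches can be sketched in plain Python. This is only an illustration, not the kernel from the commit: the function names, the linear position scaling (`pos / scaling_factor`), and the half-split pairing of vector elements are all assumptions made for the example. The per-factor caches are concatenated into one table, and each token carries an offset into that table, so tokens that belong to different scaling factors can be rotated in a single loop.

```python
import math

def build_cos_sin_cache(rot_dim, max_pos, base=10000.0, scaling_factor=1.0):
    """Return one (cos, sin) row per position for a single scaling factor."""
    inv_freq = [base ** (-2.0 * i / rot_dim) for i in range(rot_dim // 2)]
    cache = []
    for pos in range(max_pos):
        scaled = pos / scaling_factor  # assumed: linear position scaling
        angles = [scaled * f for f in inv_freq]
        cache.append(([math.cos(a) for a in angles],
                      [math.sin(a) for a in angles]))
    return cache

def build_batched_cache(rot_dim, max_pos, scaling_factors):
    """Concatenate the per-factor caches and record where each one starts."""
    cache, offsets = [], {}
    for factor in scaling_factors:
        offsets[factor] = len(cache)
        cache.extend(build_cos_sin_cache(rot_dim, max_pos,
                                         scaling_factor=factor))
    return cache, offsets

def rotate(vec, cos, sin):
    """Rotate one vector; element i is paired with element i + half (assumed layout)."""
    half = len(vec) // 2
    x1, x2 = vec[:half], vec[half:]
    out1 = [a * c - b * s for a, b, c, s in zip(x1, x2, cos, sin)]
    out2 = [b * c + a * s for a, b, c, s in zip(x1, x2, cos, sin)]
    return out1 + out2

def batched_rotary(vectors, positions, token_offsets, cache):
    """Single pass: each token reads the shared cache at position + offset."""
    return [rotate(v, *cache[p + off])
            for v, p, off in zip(vectors, positions, token_offsets)]

# Two tokens at the same position, served with different scaling factors.
cache, offsets = build_batched_cache(rot_dim=4, max_pos=8,
                                     scaling_factors=[1.0, 4.0])
vectors = [[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]]
positions = [4, 4]
token_offsets = [offsets[1.0], offsets[4.0]]
rotated = batched_rotary(vectors, positions, token_offsets, cache)
```

In this sketch the second token, at position 4 with scaling factor 4.0, reads the same angles as position 1 of the unscaled cache, while the first token uses the unscaled angles for position 4. Both are handled by the same call.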
-
- 17 May, 2024 1 commit
-
-
Alexei-V-Ivanov-AMD authored
[Build/CI] Extending the set of AMD tests with Regression, Basic Correctness, Distributed, Engine, Llava Tests (#4797)
-
- 16 May, 2024 2 commits
-
-
Alexander Matveev authored
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
-
Aurick Qiao authored
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 15 May, 2024 1 commit
-
-
zifeitong authored
-
- 14 May, 2024 1 commit
-
-
Nick Hill authored
Co-authored-by: SAHIL SUNEJA <suneja@us.ibm.com>
-
- 13 May, 2024 1 commit
-
-
Cody Yu authored
-
- 11 May, 2024 1 commit
-
-
Chang Su authored
-
- 09 May, 2024 1 commit
-
-
Michael Goin authored
-
- 08 May, 2024 1 commit
-
-
Cody Yu authored
Co-authored-by: Cade Daniel <edacih@gmail.com>
-
- 05 May, 2024 1 commit
-
-
zhaoyang-star authored
-
- 04 May, 2024 2 commits
-
-
DearPlanet authored
-
SangBin Cho authored
-
- 03 May, 2024 2 commits
-
-
Lily Liu authored
Co-authored-by: LiuXiaoxuanPKU <llilyliupku@gmail.com>
-
SangBin Cho authored
-