"vscode:/vscode.git/clone" did not exist on "ab74b2a27a4eb88b90356bfb4b452d29edf05574"
- 23 Oct, 2024 2 commits
-
-
Michael Goin authored
-
Chen Zhang authored
-
- 22 Oct, 2024 1 commit
-
-
Jeremy Arnold authored
-
- 17 Oct, 2024 1 commit
-
-
Kuntai Du authored
Remove block manager v1. This is the first step toward a prefix-caching-centric design: to get there, the code path must be simplified so that only the v2 block manager is used, which performs much better on prefix caching.
-
- 10 Oct, 2024 2 commits
- 07 Oct, 2024 1 commit
-
-
youkaichao authored
-
- 06 Oct, 2024 1 commit
-
-
Brendan Wong authored
Co-authored-by: youkaichao <youkaichao@126.com>
-
- 26 Sep, 2024 1 commit
-
-
zhuwenwen authored
-
- 24 Sep, 2024 1 commit
-
-
youkaichao authored
Co-authored-by: Brendan Wong <bjwpokemon@gmail.com>
-
- 19 Sep, 2024 1 commit
-
-
Kunshang Ji authored
-
- 11 Sep, 2024 1 commit
-
-
Aarni Koskela authored
-
- 05 Sep, 2024 2 commits
- 04 Sep, 2024 1 commit
-
-
Nick Hill authored
-
- 27 Aug, 2024 1 commit
-
-
Megha Agarwal authored
Co-authored-by: Alexander Matveev <alexm@neuralmagic.com>
-
- 23 Aug, 2024 1 commit
-
-
Alexander Matveev authored
-
- 15 Aug, 2024 1 commit
-
-
zhuwenwen authored
-
- 10 Aug, 2024 1 commit
-
-
zhuwenwen authored
-
- 03 Aug, 2024 1 commit
-
-
zhuwenwen authored
-
- 01 Aug, 2024 1 commit
-
-
zhuwenwen authored
-
- 31 Jul, 2024 1 commit
-
-
zhuwenwen authored
-
- 28 Jun, 2024 1 commit
-
-
Ilya Lavrenov authored
-
- 20 Jun, 2024 1 commit
-
-
Michael Goin authored
-
- 17 Jun, 2024 1 commit
-
-
Kunshang Ji authored
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
-
- 15 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 14 Jun, 2024 1 commit
-
-
Kuntai Du authored
[CI/Build][Misc] Add CI that benchmarks vllm performance on those PRs with `perf-benchmarks` label (#5073)
Co-authored-by: simon-mo <simon.mo@hey.com>
-
- 12 Jun, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 08 Jun, 2024 1 commit
-
-
Benjamin Kitor authored
-
- 22 May, 2024 1 commit
-
-
Cody Yu authored
The second PR for #4532. It adds support for loading FP8 KV-cache scaling factors from an FP8 checkpoint (with the .kv_scale parameter).
-
- 16 May, 2024 2 commits
- 24 Apr, 2024 1 commit
-
-
zifeitong authored
-
- 18 Apr, 2024 1 commit
-
-
Michael Goin authored
-
- 11 Apr, 2024 1 commit
-
-
SangBin Cho authored
-
- 10 Apr, 2024 1 commit
-
-
Zedong Peng authored
-
- 04 Apr, 2024 1 commit
-
-
TianYu GUO authored
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 03 Apr, 2024 1 commit
-
-
Adrian Abeyta authored
Co-authored-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Co-authored-by: HaiShaw <hixiao@gmail.com>
Co-authored-by: AdrianAbeyta <Adrian.Abeyta@amd.com>
Co-authored-by: Matthew Wong <Matthew.Wong2@amd.com>
Co-authored-by: root <root@gt-pla-u18-08.pla.dcgpu>
Co-authored-by: mawong-amd <156021403+mawong-amd@users.noreply.github.com>
Co-authored-by: ttbachyinsda <ttbachyinsda@outlook.com>
Co-authored-by: guofangze <guofangze@kuaishou.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: jacobthebanana <50071502+jacobthebanana@users.noreply.github.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 29 Mar, 2024 1 commit
-
-
Yile (Michael) Gu authored
-
- 27 Mar, 2024 1 commit
-
-
AmadeusChan authored
-