- 02 Feb, 2025 1 commit
-
-
Shawn Du authored
As mentioned in RFC https://github.com/vllm-project/vllm/issues/12254, this PR combines allocate_slots and append_slots. There should be no functional change, except that decode now also raises an exception when num_tokens is zero (as prefill already does); the unit test is updated accordingly. @comaniac @rickyyx @WoosukKwon @youkaichao @heheda12345 @simon-mo
---------
Signed-off-by: Shawn Du <shawnd200@outlook.com>
-
- 31 Jan, 2025 1 commit
-
-
Chen Zhang authored
This PR adds extra keys to the block hash, so that two blocks with the same tokens but different extra_keys in their parent blocks get different hash values. For example, it generates different hash values for the second block of the following two requests:

```python
request1 = make_request(
    request_id=0,
    prompt_token_ids=[_ for _ in range(6)],
    mm_positions=[{"offset": 0, "length": 3}, {"offset": 3, "length": 3}],
    mm_hashes=["hash1", "hash2"],
)
request2 = make_request(
    request_id=1,
    prompt_token_ids=[_ for _ in range(6)],
    mm_positions=[{"offset": 0, "length": 3}, {"offset": 3, "length": 3}],
    mm_hashes=["hash3", "hash2"],
)
```
---------
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
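The effect described in this commit message can be illustrated with a minimal sketch of chained block hashing. This is not vLLM's actual implementation: `hash_block` and `hash_request` are hypothetical helpers, SHA-256 over a `repr` is an arbitrary choice made for illustration, and a block size of 3 is assumed so that each 6-token prompt splits into two blocks. Because each block's hash includes its parent's hash, a different extra key in the first block changes the second block's hash even when the second block's own tokens and extra key are identical.

```python
import hashlib
from typing import List, Optional, Sequence, Tuple


def hash_block(parent_hash: Optional[str],
               token_ids: Sequence[int],
               extra_keys: Tuple[str, ...] = ()) -> str:
    """Hash one block from its parent's hash, its tokens, and its extra keys.

    Hypothetical helper for illustration only.
    """
    payload = repr((parent_hash, tuple(token_ids), tuple(extra_keys)))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def hash_request(token_ids: Sequence[int],
                 block_size: int,
                 extra_keys_per_block: Sequence[Tuple[str, ...]]) -> List[str]:
    """Return the chained hashes of all full blocks in a request."""
    hashes: List[str] = []
    parent: Optional[str] = None
    num_full = len(token_ids) // block_size
    for block_idx in range(num_full):
        start = block_idx * block_size
        block = token_ids[start:start + block_size]
        parent = hash_block(parent, block, extra_keys_per_block[block_idx])
        hashes.append(parent)
    return hashes


tokens = list(range(6))
# Same tokens in both requests; only the first block's extra key differs.
hashes1 = hash_request(tokens, 3, [("hash1",), ("hash2",)])
hashes2 = hash_request(tokens, 3, [("hash3",), ("hash2",)])
second_blocks_differ = hashes1[1] != hashes2[1]
```

In this sketch the second blocks carry the same tokens and the same extra key (`"hash2"`), yet their hashes differ because the parent hashes differ.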
-
- 27 Jan, 2025 1 commit
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 24 Jan, 2025 1 commit
-
-
Nick Hill authored
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
-
- 23 Jan, 2025 1 commit
-
-
Cody Yu authored
-
- 22 Jan, 2025 1 commit
-
-
Cody Yu authored
-
- 21 Jan, 2025 1 commit
-
-
Ricky Xu authored
Signed-off-by: rickyx <rickyx@anyscale.com>
-
- 17 Jan, 2025 1 commit
-
-
Chen Zhang authored
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 15 Jan, 2025 1 commit
-
-
Chen Zhang authored
-
- 13 Jan, 2025 1 commit
-
-
Robert Shaw authored
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
-
- 12 Jan, 2025 1 commit
-
-
Robert Shaw authored
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
-
- 10 Jan, 2025 1 commit
-
-
Chen Zhang authored
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
-
- 06 Jan, 2025 2 commits
-
-
Roger Wang authored
[V1] Extend beyond image modality and support mixed-modality inference with Llava-OneVision (#11685)
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Rui Qiao authored
-
- 04 Jan, 2025 2 commits
-
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
xcnick authored
Signed-off-by: xcnick <xcnick0412@gmail.com>
-
- 03 Jan, 2025 1 commit
-
-
Robert Shaw authored
-
- 01 Jan, 2025 1 commit
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 31 Dec, 2024 2 commits
-
-
Chen Zhang authored
-
sakunkun authored
[Bugfix] Move the _touch(computed_blocks) call in the allocate_slots method to after the check for allocating new blocks. (#11565)
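One plausible reading of why this ordering matters is sketched below with a toy block manager. Everything here is an assumption made for illustration: `ToyBlockManager`, `Block`, and the reference-count bookkeeping are hypothetical and much simpler than vLLM's real KV cache manager. The idea is that if the cached blocks are touched (their reference counts bumped) before the capacity check and the allocation then fails, those blocks stay referenced by a request that never got scheduled. Checking first avoids that.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    block_id: int
    ref_cnt: int = 0


class ToyBlockManager:
    """Hypothetical, simplified block manager (not vLLM's implementation)."""

    def __init__(self, num_blocks: int) -> None:
        self.free_blocks: List[Block] = [Block(i) for i in range(num_blocks)]

    def _touch(self, blocks: List[Block]) -> None:
        # Mark already-computed blocks as in use by one more request.
        for block in blocks:
            block.ref_cnt += 1

    def allocate_slots(self, computed_blocks: List[Block],
                       num_new_blocks: int) -> Optional[List[Block]]:
        # Check capacity first, so a failed allocation has no side effects.
        if num_new_blocks > len(self.free_blocks):
            return None
        # Only touch the computed blocks once allocation is known to succeed.
        self._touch(computed_blocks)
        new_blocks = [self.free_blocks.pop() for _ in range(num_new_blocks)]
        for block in new_blocks:
            block.ref_cnt = 1
        return computed_blocks + new_blocks


manager = ToyBlockManager(num_blocks=1)
cached = [Block(block_id=100)]

# Requesting more blocks than are free fails and leaves ref counts untouched.
failed = manager.allocate_slots(cached, num_new_blocks=2)

# A request that fits succeeds and touches the cached block exactly once.
succeeded = manager.allocate_slots(cached, num_new_blocks=1)
```

In this toy version, `failed` is `None` and the cached block's reference count is only incremented by the second, successful call.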
-
- 28 Dec, 2024 2 commits
-
-
Robert Shaw authored
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
-
Robert Shaw authored
-
- 27 Dec, 2024 1 commit
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 26 Dec, 2024 1 commit
-
-
sroy745 authored
Signed-off-by: Sourashis Roy <sroy@roblox.com>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 18 Dec, 2024 1 commit
-
-
Cody Yu authored
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 13 Dec, 2024 1 commit
-
-
Cody Yu authored
-
- 12 Dec, 2024 1 commit
-
-
Alexander Matveev authored
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: Alexander Matveev <alexm@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 03 Dec, 2024 1 commit
-
-
Alexander Matveev authored
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 28 Nov, 2024 2 commits
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Ricky Xu authored
Signed-off-by: rickyx <rickyx@anyscale.com>
-
- 26 Nov, 2024 1 commit
-
-
Ricky Xu authored
Signed-off-by: rickyx <rickyx@anyscale.com>
-
- 25 Nov, 2024 1 commit
-
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
- 22 Nov, 2024 1 commit
-
-
Ricky Xu authored
Signed-off-by: rickyx <rickyx@anyscale.com>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 13 Nov, 2024 2 commits
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 11 Nov, 2024 1 commit
-
-
Robert Shaw authored
Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
- 08 Nov, 2024 1 commit
-
-
Cody Yu authored
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
-