- 19 Feb, 2024 4 commits
-
-
Ronen Schaffer authored
-
Simon Mo authored
-
Isotr0py authored
-
Zhuohan Li authored
-
- 18 Feb, 2024 2 commits
-
-
Zhuohan Li authored
-
Mark Mozolewski authored
-
- 17 Feb, 2024 2 commits
-
-
jvmncs authored
How to serve the LoRAs (mimicking the [multilora inference example](https://github.com/vllm-project/vllm/blob/main/examples/multilora_inference.py)):

```terminal
$ export LORA_PATH=~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/
$ python -m vllm.entrypoints.api_server \
    --model meta-llama/Llama-2-7b-hf \
    --enable-lora \
    --lora-modules sql-lora=$LORA_PATH sql-lora2=$LORA_PATH
```

The above server will list 3 separate values if the user queries `/models`: one for the base served model, and one each for the specified LoRA modules. In this case sql-lora and sql-lora2 point to the same underlying LoRA, but this need not be the case.

LoRA config values take the same values they do in EngineArgs.

No work has been done here to scope client permissions to specific models.
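A minimal sketch of how `name=path` pairs passed to `--lora-modules` could map onto the model list described above. The helper names `parse_lora_modules` and `served_model_names` and the `/path/to/lora` placeholder are hypothetical and are not taken from vLLM's code; this only illustrates why two modules plus the base model yield three entries.

```python
def parse_lora_modules(specs):
    """Split 'name=path' strings into (name, path) pairs (hypothetical helper)."""
    modules = []
    for spec in specs:
        name, sep, path = spec.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected name=path, got {spec!r}")
        modules.append((name, path))
    return modules


def served_model_names(base_model, lora_specs):
    """Return the base model name followed by one name per LoRA module."""
    return [base_model] + [name for name, _ in parse_lora_modules(lora_specs)]


# Two names may point at the same adapter path, as in the command above.
names = served_model_names(
    "meta-llama/Llama-2-7b-hf",
    ["sql-lora=/path/to/lora", "sql-lora2=/path/to/lora"],
)
print(names)
```

Each module name becomes its own entry, so a client can select an adapter by name even when several names share one set of weights.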
-
Nick Hill authored
If the SamplingParams object passed to LLMEngine.add_request() is mutated after it returns, it could affect the async sampling process for that request. Suggested by @Yard1 https://github.com/vllm-project/vllm/pull/2514#discussion_r1490106059
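The hazard described above can be sketched with stand-in classes. `SamplingParamsSketch` and `EngineSketch` are hypothetical and are not the real vLLM `SamplingParams` or `LLMEngine`; the sketch assumes a copy-on-entry approach, where the engine stores its own deep copy of the parameters so that later mutation by the caller cannot affect the in-flight request.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class SamplingParamsSketch:
    """Hypothetical stand-in for SamplingParams; not the real vLLM class."""
    temperature: float = 1.0
    stop: list = field(default_factory=list)


class EngineSketch:
    """Hypothetical stand-in for an engine that keeps per-request params."""

    def __init__(self):
        self.requests = {}

    def add_request(self, request_id, params):
        # Store a private deep copy so caller-side mutation after this
        # call returns cannot reach the params used for this request.
        self.requests[request_id] = copy.deepcopy(params)


engine = EngineSketch()
params = SamplingParamsSketch(temperature=0.2, stop=["\n"])
engine.add_request("req-0", params)

# The caller reuses and mutates the object after add_request() returns.
params.temperature = 1.5
params.stop.append("###")

stored = engine.requests["req-0"]
print(stored.temperature, stored.stop)
```

A deep copy is used rather than a shallow one because mutable fields such as the stop list would otherwise still be shared with the caller.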
-
- 16 Feb, 2024 2 commits
-
-
Woosuk Kwon authored
-
shiyi.c_98 authored
-
- 15 Feb, 2024 4 commits
-
-
Hongxia Yang authored
-
Philipp Moritz authored
-
Woosuk Kwon authored
-
Philipp Moritz authored
* Fix AttributeError: MixtralModel object has no attribute org_vocab_size.
* Make LoRA logic for Mistral and Mixtral the same

---------

Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>
-
- 14 Feb, 2024 6 commits
-
-
Woosuk Kwon authored
-
Roy authored
-
Nikola Borisov authored
-
Woosuk Kwon authored
-
-
Philipp Moritz authored
Co-authored-by: Roy <jasonailu87@gmail.com>
-
- 13 Feb, 2024 7 commits
-
-
Terry authored
* add mixtral lora support
* formatting
* fix incorrectly ported logic
* polish tests
* minor fixes and refactoring
* minor fixes
* formatting
* rename and remove redundant logic
* refactoring
* refactoring
* minor fix
* minor refactoring
* fix code smell
-
Philipp Moritz authored
Co-authored-by: Roy <jasonailu87@gmail.com>
-
Woosuk Kwon authored
-
Philipp Moritz authored
This reverts commit 5c976a7e.
-
Roy authored
-
Simon Mo authored
-
Roger Wang authored
-
- 12 Feb, 2024 2 commits
-
-
Rex authored
Co-authored-by: Chunan Zeng <chunanzeng@Chunans-Air.attlocal.net>
-
Philipp Moritz authored
-
- 11 Feb, 2024 1 commit
-
-
Hongxia Yang authored
-
- 08 Feb, 2024 2 commits
-
-
Woosuk Kwon authored
-
SangBin Cho authored
-
- 07 Feb, 2024 2 commits
-
-
Philipp Moritz authored
-
Hongxia Yang authored
-
- 06 Feb, 2024 3 commits
-
-
Lily Liu authored
-
liuyhwangyh authored
-
Woosuk Kwon authored
-
- 05 Feb, 2024 3 commits
-
-
Douglas Lehr authored
-
Lukas authored
-
Hongxia Yang authored
-