- 22 Feb, 2024 9 commits
  - Ronen Schaffer authored
  - Woosuk Kwon authored
  - 44670 authored
  - Woosuk Kwon authored
  - Massimiliano Pronesti authored
  - Woosuk Kwon authored
  - Roy authored
  - Mustafa Eyceoz authored
  - Ronen Schaffer authored
- 21 Feb, 2024 7 commits
  - Zhuohan Li authored
    This version focuses on broader model support, adding support for Gemma models (#2964) and OLMo models (#2832).
  - Nick Hill authored
  - Woosuk Kwon authored
  - Zhuohan Li authored
  - Woosuk Kwon authored
  - Xiang Xu authored
  - Antoni Baum authored
- 20 Feb, 2024 3 commits
  - Antoni Baum authored
  - Zhuohan Li authored
  - James Whedbee authored
- 19 Feb, 2024 4 commits
  - Ronen Schaffer authored
  - Simon Mo authored
  - Isotr0py authored
  - Zhuohan Li authored
- 18 Feb, 2024 2 commits
  - Zhuohan Li authored
  - Mark Mozolewski authored
- 17 Feb, 2024 2 commits
  - jvmncs authored
    How to serve the LoRAs (mimicking the [multilora inference example](https://github.com/vllm-project/vllm/blob/main/examples/multilora_inference.py)):

    ```terminal
    $ export LORA_PATH=~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/
    $ python -m vllm.entrypoints.api_server \
        --model meta-llama/Llama-2-7b-hf \
        --enable-lora \
        --lora-modules sql-lora=$LORA_PATH sql-lora2=$LORA_PATH
    ```

    The above server will list 3 separate values if the user queries `/models`: one for the base served model, and one each for the specified LoRA modules. In this case `sql-lora` and `sql-lora2` point to the same underlying LoRA, but this need not be the case. LoRA config options take the same values they do in EngineArgs. No work has been done here to scope client permissions to specific models.
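As an aside, `--lora-modules` takes space-separated `name=path` pairs, and distinct names may resolve to the same path (as `sql-lora` and `sql-lora2` do above). A minimal sketch of how such pairs could map to a module table; this is not vLLM's actual parser, and the function name is an assumption:

```python
def parse_lora_modules(pairs):
    """Parse `name=path` pairs (as passed to --lora-modules) into a dict.

    Several names may point at the same underlying path, which is why
    sql-lora and sql-lora2 above both resolve to $LORA_PATH.
    """
    modules = {}
    for pair in pairs:
        name, _, path = pair.partition("=")
        if not name or not path:
            raise ValueError(f"expected name=path, got {pair!r}")
        modules[name] = path
    return modules

lora_path = "~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/"
table = parse_lora_modules([f"sql-lora={lora_path}", f"sql-lora2={lora_path}"])
assert table["sql-lora"] == table["sql-lora2"]  # two names, one shared path
```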
  - Nick Hill authored
    If the SamplingParams object passed to LLMEngine.add_request() is mutated after it returns, it could affect the async sampling process for that request. Suggested by @Yard1: https://github.com/vllm-project/vllm/pull/2514#discussion_r1490106059
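The hazard described here can be sketched generically. The classes below are stand-ins, not vLLM's real SamplingParams or LLMEngine; the fix shown (a defensive deep copy at the add_request boundary) is the usual pattern for this class of bug:

```python
import copy
from dataclasses import dataclass

@dataclass
class Params:  # stand-in for SamplingParams
    temperature: float = 1.0

class Engine:  # stand-in for LLMEngine
    def __init__(self):
        self.requests = {}

    def add_request(self, request_id, params):
        # Defensive copy: later caller-side mutation of `params` can no
        # longer leak into the in-flight async sampling state.
        self.requests[request_id] = copy.deepcopy(params)

engine = Engine()
params = Params(temperature=0.7)
engine.add_request("req-1", params)
params.temperature = 2.0  # caller mutates after add_request returns
assert engine.requests["req-1"].temperature == 0.7  # request unaffected
```

Without the `deepcopy`, the stored reference and the caller's object would be the same, and the late mutation would silently change the sampling parameters of a request already in flight.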
- 16 Feb, 2024 2 commits
  - Woosuk Kwon authored
  - shiyi.c_98 authored
- 15 Feb, 2024 4 commits
  - Hongxia Yang authored
  - Philipp Moritz authored
  - Woosuk Kwon authored
  - Philipp Moritz authored
    * Fix `AttributeError: MixtralModel object has no attribute org_vocab_size`.
    * Make the LoRA logic for Mistral and Mixtral the same.

    Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>
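The class of bug fixed in the 15 Feb commit above can be illustrated generically: shared code reads an attribute that only one of two sibling model classes sets, so the other raises `AttributeError` until the attribute is set in both. The class and attribute names below only mirror the commit message; this is a sketch, not vLLM's code:

```python
class MistralModel:
    def __init__(self, vocab_size):
        self.org_vocab_size = vocab_size  # original (pre-padding) vocab size
        self.vocab_size = vocab_size

class MixtralModelBroken:
    def __init__(self, vocab_size):
        self.vocab_size = vocab_size  # org_vocab_size never set

class MixtralModelFixed:
    def __init__(self, vocab_size):
        self.org_vocab_size = vocab_size  # now matches MistralModel
        self.vocab_size = vocab_size

def shared_lora_logic(model):
    # Shared code path that assumes org_vocab_size exists on every model.
    return model.org_vocab_size

try:
    shared_lora_logic(MixtralModelBroken(32000))
    raised = False
except AttributeError:
    raised = True
assert raised  # the broken class lacks org_vocab_size
assert shared_lora_logic(MixtralModelFixed(32000)) == 32000
```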
- 14 Feb, 2024 6 commits
  - Woosuk Kwon authored
  - Roy authored
  - Nikola Borisov authored
  - Woosuk Kwon authored
  - Philipp Moritz authored
    Co-authored-by: Roy <jasonailu87@gmail.com>
- 13 Feb, 2024 1 commit
  - Terry authored
    * add mixtral lora support
    * formatting
    * fix incorrectly ported logic
    * polish tests
    * minor fixes and refactoring
    * minor fixes
    * formatting
    * rename and remove redundant logic
    * refactoring
    * refactoring
    * minor fix
    * minor refactoring
    * fix code smell