- 13 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 11 Jun, 2024 2 commits
-
-
Cyrus Leung authored
-
maor-ps authored
Co-authored-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 10 Jun, 2024 1 commit
-
-
Itay Etelis authored
-
- 07 Jun, 2024 1 commit
-
-
Itay Etelis authored
-
- 04 Jun, 2024 1 commit
-
-
Toshiki Kataoka authored
-
- 03 Jun, 2024 1 commit
-
-
Breno Faria authored
-
- 30 May, 2024 1 commit
-
-
Breno Faria authored
Co-authored-by:Breno Faria <breno.faria@intrafind.com>
-
- 28 May, 2024 1 commit
-
-
Cyrus Leung authored
Co-authored-by:Roger Wang <ywang@roblox.com>
-
- 15 May, 2024 1 commit
-
-
Cyrus Leung authored
-
- 13 May, 2024 1 commit
-
-
Cyrus Leung authored
Since #4335 was merged, I've noticed that the definition of ServerRunner in the tests is the same as in the test for the OpenAI API. I have moved the class to the test utilities to avoid code duplication. (Although it has only been repeated twice so far, I will add another similar test suite in #4200, which would duplicate the code a third time.) I have also moved the test utilities file (test_utils.py) under the test directory (tests/utils.py), since none of its code is actually used in the main package. Note that I have added __init__.py to each test subpackage and updated the ray.init() call in the test utilities file so that tests/utils.py can be imported with relative imports.
-
- 11 May, 2024 1 commit
-
-
Chang Su authored
-
- 03 May, 2024 1 commit
-
-
Sebastian Schoennenbeck authored
-
- 01 May, 2024 1 commit
-
-
sasha0552 authored
-
- 30 Apr, 2024 1 commit
-
-
Florian Greinacher authored
Co-authored-by:Lily Liu <lilyliupku@gmail.com>
Co-authored-by:Cyrus Leung <tlleungac@connect.ust.hk>
-
- 27 Apr, 2024 1 commit
-
-
Cyrus Leung authored
-
- 20 Apr, 2024 1 commit
-
-
Ayush Rautwar authored
Co-authored-by:Ubuntu <ubuntu@ip-172-31-13-147.ec2.internal>
-
- 18 Apr, 2024 1 commit
-
-
James Whedbee authored
-
- 16 Apr, 2024 1 commit
-
-
Noam Gat authored
Co-authored-by:Simon Mo <simon.mo@hey.com>
-
- 11 Apr, 2024 2 commits
-
-
Dylan Hawk authored
Co-authored-by:Dylan Hawk <dylanwawk@gmail.com>
-
SangBin Cho authored
-
- 29 Mar, 2024 1 commit
-
-
Roy authored
-
- 25 Mar, 2024 2 commits
-
-
Dylan Hawk authored
Co-authored-by:Dylan Hawk <dylanwawk@gmail.com>
-
SangBin Cho authored
-
- 16 Mar, 2024 1 commit
-
-
Simon Mo authored
-
- 11 Mar, 2024 1 commit
-
-
Zhuohan Li authored
-
- 04 Mar, 2024 1 commit
-
-
Antoni Baum authored
Co-authored-by:Avnish Narayan <avnish@anyscale.com>
-
- 29 Feb, 2024 1 commit
-
-
felixzhu555 authored
Co-authored-by:br3no <breno@veltefaria.de>
Co-authored-by:simon-mo <simon.mo@hey.com>
-
- 27 Feb, 2024 1 commit
-
-
Dylan Hawk authored
-
- 26 Feb, 2024 1 commit
-
-
Jared Moore authored
-
- 17 Feb, 2024 1 commit
-
-
jvmncs authored
how to serve the loras (mimicking the [multilora inference example](https://github.com/vllm-project/vllm/blob/main/examples/multilora_inference.py)):

```terminal
$ export LORA_PATH=~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/
$ python -m vllm.entrypoints.api_server \
    --model meta-llama/Llama-2-7b-hf \
    --enable-lora \
    --lora-modules sql-lora=$LORA_PATH sql-lora2=$LORA_PATH
```

the above server will list 3 separate values if the user queries `/models`: one for the base served model, and one each for the specified lora modules. in this case sql-lora and sql-lora2 point to the same underlying lora, but this need not be the case.

lora config values take the same values they do in EngineArgs.

no work has been done here to scope client permissions to specific models.
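
to make the "3 separate values" concrete, here is a minimal standalone sketch of how `name=path` pairs passed to `--lora-modules` could map to the list of served model names. this is not vLLM's code: `parse_lora_modules`, `served_model_names` and the placeholder path are hypothetical names made up for illustration, and the base-model-first ordering is an assumption.

```python
from typing import Dict, List


def parse_lora_modules(pairs: List[str]) -> Dict[str, str]:
    """Turn ["name=path", ...] into {name: path} (hypothetical helper)."""
    modules: Dict[str, str] = {}
    for pair in pairs:
        # Split on the first "=" only, so paths containing "=" survive intact.
        name, sep, path = pair.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected name=path, got {pair!r}")
        modules[name] = path
    return modules


def served_model_names(base_model: str, lora_modules: Dict[str, str]) -> List[str]:
    """Base model first, then one entry per LoRA module name (assumed ordering)."""
    return [base_model, *lora_modules]


# Placeholder path; two module names may point at the same underlying LoRA.
lora_path = "/path/to/lora"
modules = parse_lora_modules([f"sql-lora={lora_path}", f"sql-lora2={lora_path}"])
names = served_model_names("meta-llama/Llama-2-7b-hf", modules)
print(names)
```

the two module names stay distinct entries even though both map to the same path, which matches the behaviour described above.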
-
- 25 Jan, 2024 1 commit
-
-
Simon Mo authored
-
- 19 Jan, 2024 1 commit
-
-
Simon Mo authored
-
- 17 Jan, 2024 1 commit
-
-
FlorianJoncour authored
-