1. 11 Jun, 2024 2 commits
  2. 10 Jun, 2024 1 commit
  3. 07 Jun, 2024 1 commit
  4. 04 Jun, 2024 1 commit
  5. 03 Jun, 2024 1 commit
  6. 30 May, 2024 1 commit
  7. 28 May, 2024 1 commit
  8. 15 May, 2024 1 commit
  9. 13 May, 2024 1 commit
    • [CI/Build] Move `test_utils.py` to `tests/utils.py` (#4425) · 350f9e10
      Cyrus Leung authored
      Since #4335 was merged, the definition of `ServerRunner` in the tests has been identical to the one in the test for the OpenAI API. I have moved the class into the test utilities to avoid code duplication. (It has only been repeated twice so far, but the similar test suite I plan to add in #4200 would otherwise duplicate the code a third time.)
      
      I have also moved the test utilities file (`test_utils.py`) into the test directory (`tests/utils.py`), since none of its code is used in the main package. To make `tests/utils.py` reachable through relative imports, I have added `__init__.py` to each test subpackage and updated the `ray.init()` call in the test utilities file.
  10. 11 May, 2024 1 commit
  11. 03 May, 2024 1 commit
  12. 01 May, 2024 1 commit
  13. 30 Apr, 2024 1 commit
  14. 27 Apr, 2024 1 commit
  15. 20 Apr, 2024 1 commit
  16. 18 Apr, 2024 1 commit
  17. 16 Apr, 2024 1 commit
  18. 11 Apr, 2024 2 commits
  19. 29 Mar, 2024 1 commit
  20. 25 Mar, 2024 2 commits
  21. 16 Mar, 2024 1 commit
  22. 11 Mar, 2024 1 commit
  23. 04 Mar, 2024 1 commit
  24. 29 Feb, 2024 1 commit
  25. 27 Feb, 2024 1 commit
  26. 26 Feb, 2024 1 commit
  27. 17 Feb, 2024 1 commit
    • multi-LoRA as extra models in OpenAI server (#2775) · 8f36444c
      jvmncs authored
      How to serve the LoRAs (mimicking the [multi-LoRA inference example](https://github.com/vllm-project/vllm/blob/main/examples/multilora_inference.py)):
      ```terminal
      $ export LORA_PATH=~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/
      $ python -m vllm.entrypoints.openai.api_server \
       --model meta-llama/Llama-2-7b-hf \
       --enable-lora \
       --lora-modules sql-lora=$LORA_PATH sql-lora2=$LORA_PATH
      ```
      The server above lists three separate entries when the user queries `/models`: one for the base served model and one for each of the specified LoRA modules. In this case `sql-lora` and `sql-lora2` point to the same underlying LoRA, but this need not be the case. The LoRA config options take the same values they do in `EngineArgs`.
      
      No work has been done here to scope client permissions to specific models.
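
To make the `/models` listing described above concrete, here is a minimal Python sketch of how `name=path` pairs could map to the model ids a client would see. It is not the server's code: `parse_lora_modules` and `served_model_ids` are hypothetical helpers introduced only for illustration, and the ordering (base model first) is an assumption.

```python
from typing import Dict, List


def parse_lora_modules(pairs: List[str]) -> Dict[str, str]:
    """Split 'name=path' strings (the format passed to --lora-modules).

    Hypothetical helper for illustration; not part of vLLM.
    """
    modules: Dict[str, str] = {}
    for pair in pairs:
        # partition splits on the first '=' only, so paths may contain '='.
        name, sep, path = pair.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected name=path, got {pair!r}")
        modules[name] = path
    return modules


def served_model_ids(base_model: str, lora_modules: Dict[str, str]) -> List[str]:
    """Return the base model id followed by one id per LoRA module.

    Assumption: the base model is listed first; the commit message only
    states that three entries are listed.
    """
    return [base_model, *lora_modules]


lora_path = "~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/"
modules = parse_lora_modules([f"sql-lora={lora_path}", f"sql-lora2={lora_path}"])
model_ids = served_model_ids("meta-llama/Llama-2-7b-hf", modules)
print(model_ids)
```

Both LoRA names resolve to the same path here, mirroring the command in the commit message, yet each still appears as its own entry in the list.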
  28. 25 Jan, 2024 1 commit
  29. 19 Jan, 2024 1 commit
  30. 17 Jan, 2024 1 commit