  1. 13 May, 2025 1 commit
  2. 09 May, 2025 1 commit
  3. 08 May, 2025 1 commit
      feat: Qwen3, Gemma3 and Llama4 support (#1002) · ceaeba3e
      Graham King authored
      - New mistralrs and llamacpp version
      - mistralrs: Handle Gemma 3 and Llama 4 as vision models
      - Update the dynamo-run docs to use Qwen 3
      - Our pre-processor now supports Llama 4's newer multi-modal `config.json`
      - Upgrade minijinja to handle Qwen 3's prompt template
      
      For Llama 4 we'll need to limit the max seq len. vllm says:
      > To serve at least one request with the models's max seq len (10485760), (240.00 GiB KV cache is needed,...
      
      I was able to run Llama 4 with llamacpp and a quantized GGUF, with Dynamo doing the pre-processing.
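The figures in the vllm message above give a rough way to size a smaller max seq len. The following is a minimal sketch, assuming the KV cache for a single request grows linearly with sequence length (which the quoted numbers suggest but the message does not state). The names `bytes_per_token` and `max_tokens_for_budget` and the 24 GiB budget are hypothetical, chosen only for illustration.

```python
# Back-of-the-envelope sizing from the two numbers quoted by vllm.
GIB = 1024 ** 3
MAX_SEQ_LEN = 10_485_760      # model's max seq len, from the vllm message
KV_CACHE_BYTES = 240 * GIB    # KV cache needed for one request at that length

# Assumption: KV cache size is proportional to the number of tokens.
bytes_per_token = KV_CACHE_BYTES // MAX_SEQ_LEN  # 24576 bytes, i.e. 24 KiB


def max_tokens_for_budget(budget_gib: float) -> int:
    """Longest single-request sequence that fits in budget_gib of KV cache."""
    return int(budget_gib * GIB) // bytes_per_token


if __name__ == "__main__":
    # Hypothetical budget: 24 GiB of memory left over for the KV cache.
    print(bytes_per_token)
    print(max_tokens_for_budget(24))
```

Under that linearity assumption, a tenth of the memory supports a tenth of the context, so the value to cap the max seq len at follows directly from the memory actually available for the KV cache.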
  4. 07 May, 2025 3 commits
  5. 06 May, 2025 2 commits
      feat(dynamo-run): vllm and sglang subprocess engines (#954) · 28fd481c
      Graham King authored
      New vllm and sglang engines that run in a sub-process. Will hopefully replace the existing embedded python engines.
          
      Why?
          
        - Pure Python, does not require knowing Rust to work on it. Much simpler to maintain.
        - No embedded Python interpreter, which avoids linking libpython and avoids the macOS virtualenv issues.
        - Should have better performance as it's "native" vllm / sglang.
        - Works with any version of vllm (including v1!) and sglang. Less upgrade struggle.
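The general shape of a sub-process engine can be sketched as follows. This is not the code from the commit: it is a minimal illustration, assuming a parent that talks to a pure-Python worker over stdin/stdout using one JSON object per line. The worker here only upper-cases the prompt; `WORKER_SOURCE`, `start_worker`, `send_request`, and `stop_worker` are hypothetical names, and the real engines may use a different transport entirely.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for an engine worker. It reads one JSON request per
# line on stdin and writes one JSON response per line on stdout.
WORKER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    # A real worker would call into vllm or sglang here; this one just echoes.
    response = {"id": request["id"], "text": request["prompt"].upper()}
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
'''


def start_worker() -> subprocess.Popen:
    """Launch the worker in its own Python interpreter."""
    return subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def send_request(worker: subprocess.Popen, request: dict) -> dict:
    """Send one request and block until the matching response line arrives."""
    worker.stdin.write(json.dumps(request) + "\n")
    worker.stdin.flush()
    return json.loads(worker.stdout.readline())


def stop_worker(worker: subprocess.Popen) -> int:
    """Close stdin so the worker's read loop ends, then collect its exit code."""
    worker.stdin.close()
    code = worker.wait(timeout=10)
    worker.stdout.close()
    return code


if __name__ == "__main__":
    worker = start_worker()
    print(send_request(worker, {"id": 1, "prompt": "hello"}))
    stop_worker(worker)
```

Because the worker is an ordinary Python process, the parent never links libpython, and the worker can import whichever engine version is installed in its own environment.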
    • hhzhang16's avatar
      403344e5
  6. 05 May, 2025 1 commit
  7. 29 Apr, 2025 2 commits
  8. 28 Apr, 2025 1 commit
  9. 26 Apr, 2025 2 commits
  10. 25 Apr, 2025 2 commits
  11. 24 Apr, 2025 1 commit
  12. 23 Apr, 2025 1 commit
  13. 22 Apr, 2025 1 commit
  14. 21 Apr, 2025 1 commit
  15. 18 Apr, 2025 4 commits
  16. 15 Apr, 2025 3 commits
  17. 11 Apr, 2025 2 commits
  18. 09 Apr, 2025 3 commits
  19. 08 Apr, 2025 1 commit
  20. 07 Apr, 2025 1 commit
  21. 03 Apr, 2025 1 commit
  22. 25 Mar, 2025 1 commit
  23. 24 Mar, 2025 1 commit
  24. 21 Mar, 2025 3 commits