1. 18 Apr, 2025 1 commit
    • feat(dynamo-engine-vllm): vllm 0.8.X support (#728) · a745a980
      Graham King authored
      vllm 0.8 is different enough from 0.7 that I made a new engine, `vllm0_8`, and renamed the previous engine to `vllm0_7`.
      
      `dynamo-run out=vllm` now expects 0.8. This matches the container change in #690.
      
      For older vllm versions (0.7), use `dynamo-run out=vllm0_7`.
  2. 07 Apr, 2025 1 commit
    • feat(dynamo-run): Basic routing choice (#524) · ec2e7307
      Graham King authored
      As a first step towards KV routing:
      - Introduce a `--router-mode` flag in dynamo-run. For now it only supports random and round-robin, so it is not that interesting yet.
      - Make the vllm engine publish the KV events received from our patched vllm.
      
      Now we "just" need to connect the two. Easy right?
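
For readers unfamiliar with the two routing modes named in this commit, here is a minimal sketch of how random and round-robin selection differ. It is an illustration only, not dynamo-run's implementation: the `Router` class, the worker names, and the mode strings `"random"` and `"round-robin"` are assumptions made for the example.

```python
import itertools
import random


class Router:
    """Hypothetical router that picks one worker per request."""

    def __init__(self, workers, mode="round-robin", seed=None):
        if not workers:
            raise ValueError("need at least one worker")
        if mode not in ("random", "round-robin"):
            raise ValueError(f"unknown router mode: {mode}")
        self.workers = list(workers)
        self.mode = mode
        # Round-robin: walk the worker list in order, wrapping around forever.
        self._cycle = itertools.cycle(self.workers)
        # Random: a private RNG so a seed gives repeatable picks.
        self._rng = random.Random(seed)

    def pick(self):
        """Return the worker that should handle the next request."""
        if self.mode == "round-robin":
            return next(self._cycle)
        return self._rng.choice(self.workers)


if __name__ == "__main__":
    rr = Router(["worker-0", "worker-1", "worker-2"], mode="round-robin")
    print([rr.pick() for _ in range(4)])
```

Neither mode looks at the request or at worker state, which is why KV-aware routing needs the published KV events as an extra input.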
  3. 03 Apr, 2025 1 commit
  4. 25 Mar, 2025 1 commit
  5. 24 Mar, 2025 1 commit
  6. 08 Mar, 2025 1 commit
  7. 05 Mar, 2025 1 commit
  8. 04 Mar, 2025 1 commit
  9. 28 Feb, 2025 1 commit
  10. 25 Feb, 2025 1 commit
    • feat: sglang backend for tio (#271) · e97493eb
      Graham King authored
      - Set up venv:
      
      ```
      uv venv
      source .venv/bin/activate
      uv pip install pip
      uv pip install sgl-kernel --force-reinstall --no-deps
      uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
      ```
      
      - Build: `cargo build --release --features sglang`
      
      - Run single node (make sure you're in the venv): `./tio out=sglang ~/llm_models/my_model`
      
      - Run DeepSeek multi-GPU / multi-node:
      
      Node 1:
      ```
      tio in=http out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 0 --dist-init-addr 10.217.98.122:9876
      ```
      
      Node 2:
      ```
      tio in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 1 --dist-init-addr 10.217.98.122:9876
      ```
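
The two commands above differ only in `in=` and `--node-rank`. A small helper can generate them, sketched here under assumptions: `node_commands` is a hypothetical function, not part of tio, and extending the pattern beyond two nodes (every rank above 0 using `in=none`) is a guess based on the two-node example.

```python
def node_commands(model_path, num_nodes, tp_size, dist_init_addr):
    """Build one `tio` command line per node, mirroring the two-node example."""
    commands = []
    for rank in range(num_nodes):
        # Rank 0 serves HTTP; other ranks take no direct input (assumed pattern).
        in_mode = "http" if rank == 0 else "none"
        args = [
            "tio", f"in={in_mode}", "out=sglang",
            "--model-path", model_path,
            "--tensor-parallel-size", str(tp_size),
            "--num-nodes", str(num_nodes),
            "--node-rank", str(rank),
            "--dist-init-addr", dist_init_addr,
        ]
        # Plain join keeps `~` unquoted so the shell can still expand it;
        # paths containing spaces would need quoting by the caller.
        commands.append(" ".join(args))
    return commands


if __name__ == "__main__":
    for cmd in node_commands(
        "~/llm_models/DeepSeek-R1-Distill-Llama-70B/", 2, 8, "10.217.98.122:9876"
    ):
        print(cmd)
```

Every node must be given the same `--dist-init-addr`, which in the example is an address on node 1.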