1. 25 Feb, 2025 1 commit
    • feat: sglang backend for tio (#271) · e97493eb
      Graham King authored
      - Set up venv
      
      ```
      uv venv
      source .venv/bin/activate
      uv pip install pip
      uv pip install sgl-kernel --force-reinstall --no-deps
      uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
      ```
      
      - Build: `cargo build --release --features sglang`
      
      - Run single node (make sure you're in the venv): `./tio out=sglang ~/llm_models/my_model`
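      
      The "make sure you're in the venv" check can be scripted. This is a minimal sketch, not part of the commit: `in_venv` is a hypothetical helper name, and it relies only on the `VIRTUAL_ENV` variable that `source .venv/bin/activate` exports.
      
      ```shell
      # Hypothetical helper (not from the commit): succeeds only when a
      # virtual environment is active, i.e. VIRTUAL_ENV is set and non-empty.
      in_venv() {
          [ -n "${VIRTUAL_ENV:-}" ]
      }
      
      if in_venv; then
          echo "venv active: $VIRTUAL_ENV"
      else
          echo "no venv active; run: source .venv/bin/activate" >&2
      fi
      ```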
      
      - Run DeepSeek multi-GPU / multi-node:
      
      Node 1:
      ```
      tio in=http out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 0 --dist-init-addr 10.217.98.122:9876
      ```
      
      Node 2:
      ```
      tio in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 1 --dist-init-addr 10.217.98.122:9876
      ```
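      
      The two node commands above differ only in the `in=` value and `--node-rank`. A small sketch that prints the command for a given rank follows. `tio_node_cmd` and the variable names are hypothetical, and the rule "rank 0 gets `in=http`, every other rank gets `in=none`" is an assumption inferred from the two-node example, not something the commit states for larger clusters.
      
      ```shell
      # Values copied from the example above; adjust for your cluster.
      # The model path is single-quoted so the printed command keeps the
      # literal "~", which expands when the command is run on the node.
      MODEL_PATH='~/llm_models/DeepSeek-R1-Distill-Llama-70B/'
      TP_SIZE=8
      NUM_NODES=2
      DIST_INIT_ADDR=10.217.98.122:9876
      
      # Hypothetical helper: print the tio command line for one node rank.
      # Assumption: only rank 0 serves HTTP (in=http); others use in=none.
      tio_node_cmd() {
          rank="$1"
          if [ "$rank" -eq 0 ]; then
              input="http"
          else
              input="none"
          fi
          echo "tio in=$input out=sglang --model-path $MODEL_PATH --tensor-parallel-size $TP_SIZE --num-nodes $NUM_NODES --node-rank $rank --dist-init-addr $DIST_INIT_ADDR"
      }
      
      tio_node_cmd 0   # command for Node 1
      tio_node_cmd 1   # command for Node 2
      ```
      
      The function only prints the command, so it can be reviewed before being run on each node.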