    feat: sglang backend for tio (#271) · e97493eb
    Graham King authored
    - Set up venv
    
    ```
    uv venv
    source .venv/bin/activate
    uv pip install pip
    uv pip install sgl-kernel --force-reinstall --no-deps
    uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
    ```
    
    - Build: `cargo build --release --features sglang`
    
    - Run single node (make sure you're in the venv): `./tio out=sglang ~/llm_models/my_model`
    
    - Run DeepSeek multi-GPU / multi-node:
    
    Node 1 (rank 0):
    ```
    tio in=http out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 0 --dist-init-addr 10.217.98.122:9876
    ```
    
    Node 2 (rank 1):
    ```
    tio in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 1 --dist-init-addr 10.217.98.122:9876
    ```
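
    The two commands above differ only in `in=` and `--node-rank`. A minimal sketch of a helper that prints the command for a given rank, so the same script can be used on every node. The function name `tio_cmd` and the variables `MODEL_PATH`, `TP_SIZE`, `NUM_NODES` and `DIST_INIT_ADDR` are hypothetical, with defaults copied from the example. Using `in=none` for every rank other than 0 is an assumption extrapolated from the two-node case.

    ```shell
    #!/bin/sh
    # Sketch only: prints the tio command for one node, it does not run it.
    # Variable names are hypothetical; defaults mirror the two-node example.
    set -e

    MODEL_PATH="${MODEL_PATH:-$HOME/llm_models/DeepSeek-R1-Distill-Llama-70B/}"
    TP_SIZE="${TP_SIZE:-8}"
    NUM_NODES="${NUM_NODES:-2}"
    DIST_INIT_ADDR="${DIST_INIT_ADDR:-10.217.98.122:9876}"

    tio_cmd() {
        rank="$1"
        # In the example, rank 0 takes HTTP input and the other node takes none.
        # Applying in=none to all non-zero ranks is an assumption.
        if [ "$rank" -eq 0 ]; then
            input="http"
        else
            input="none"
        fi
        echo "tio in=${input} out=sglang --model-path ${MODEL_PATH}" \
             "--tensor-parallel-size ${TP_SIZE} --num-nodes ${NUM_NODES}" \
             "--node-rank ${rank} --dist-init-addr ${DIST_INIT_ADDR}"
    }

    # Print the command for the rank passed as the first argument (default 0).
    tio_cmd "${1:-0}"
    ```

    Saved under a name of your choosing, running it with `0` on the first node and `1` on the second prints the two commands shown above. Pipe the output to `sh` only after checking it.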
opt.rs 3.62 KB