"launch/llmctl/git@developer.sourcefind.cn:OpenDAS/dynamo.git" did not exist on "00730fc6aaa43a64d7d4eb916ed0b9a7f104708f"
Commit e0571935 authored by Hongkuan Zhou, committed by GitHub

fix: add multi-node deployment instruction for vllm-nixl (#93)


Co-authored-by: hongkuanz <hongkuanz@nvidia.com>
parent f784b36a
@@ -217,6 +217,19 @@ CUDA_VISIBLE_DEVICES=1 python3 worker.py \
<optional disaggregated router args: --conditional-disagg --custom-disagg-router --max-local-prefill-length <length>>
```
### Multi-Node Deployment
For multi-node deployment, etcd, NATS, the processor, and the KV router
need to run only on the head node. The workers are the only components
that must be deployed on every node.
Set the following environment variables on each node before running the workers:
```bash
export NATS_SERVER="nats://<nats-server-host>:<nats-server-port>"
export ETCD_ENDPOINTS="http://<etcd-server-host>:<etcd-server-port>"
```
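As a minimal sketch of the step above, each worker node could derive both variables from a single head-node address. The names `HEAD_NODE_HOST`, `NATS_PORT`, and `ETCD_PORT` and the address `10.0.0.1` are hypothetical placeholders, not part of this repository. The ports shown are the default client ports of NATS (4222) and etcd (2379), so adjust them if your head node uses different ones.

```bash
# Hypothetical head-node address; replace with the host running etcd and NATS.
HEAD_NODE_HOST="${HEAD_NODE_HOST:-10.0.0.1}"

# Default client ports (assumed; override if the head node is configured differently).
NATS_PORT="${NATS_PORT:-4222}"
ETCD_PORT="${ETCD_PORT:-2379}"

# Export the two variables the workers read, as described above.
export NATS_SERVER="nats://${HEAD_NODE_HOST}:${NATS_PORT}"
export ETCD_ENDPOINTS="http://${HEAD_NODE_HOST}:${ETCD_PORT}"

# Print the resolved values so a wrong address is easy to spot before starting workers.
echo "NATS_SERVER=${NATS_SERVER}"
echo "ETCD_ENDPOINTS=${ETCD_ENDPOINTS}"
```

Deriving both endpoints from one host value keeps the per-node setup to a single override, for example `HEAD_NODE_HOST=<head-node-ip>` set before sourcing the snippet.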
### Common Issues
If torch GLOO backend is complaining about file name too long, set
......