OpenDAS / dynamo
Commit c29f78c1 (Unverified)
Authored Jan 09, 2026 by dagil-nvidia; committed by GitHub on Jan 09, 2026
docs: add host and bootstrap port to disagg multinode example (#5309)
Signed-off-by: Dan Gil <dagil@nvidia.com>
Parent: 5f8d90a3
Showing 1 changed file with 14 additions and 0 deletions
examples/basics/multinode/README.md (+14 -0), viewed at c29f78c1
@@ -97,6 +97,7 @@ Ensure the following ports are accessible between nodes:
 - **2379**: etcd client port
 - **4222**: NATS client port
 - **8000**: Frontend HTTP port (only needed on frontend node)
+- **${DISAGG_BOOTSTRAP_PORT}**: SGLang disaggregation bootstrap port (set in Step 1; must be reachable across nodes)
 - **High-speed interconnect**: For optimal NIXL performance (InfiniBand, RoCE, or high-bandwidth Ethernet)

 ### 4. Hardware Setup
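
Reviewer's sketch, not part of the commit: one way to confirm that the ports listed in this hunk are reachable from another node before launching workers. It assumes `bash` and the coreutils `timeout` command are installed. `check_port` and `report` are hypothetical helper names, and which host serves the bootstrap port depends on where the worker runs.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens within 2 seconds.
# Uses bash's /dev/tcp redirection, so bash is invoked explicitly.
check_port() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Hypothetical helper: prints one status line per host:port argument.
report() {
  for target in "$@"; do
    host="${target%:*}"   # text before the last colon
    port="${target##*:}"  # text after the last colon
    if check_port "$host" "$port"; then
      echo "reachable: $target"
    else
      echo "unreachable: $target"
    fi
  done
}
```

For example, `report "${INFRA_NODE_IP}:2379" "${INFRA_NODE_IP}:4222"` probes etcd and NATS. Adding `<worker-node-ip>:${DISAGG_BOOTSTRAP_PORT}` (with the placeholder filled in) probes the bootstrap port once a worker is running.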
@@ -128,6 +129,9 @@ export INFRA_NODE_IP=<INFRA_NODE_IP>
 export ETCD_ENDPOINTS=http://${INFRA_NODE_IP}:2379
 export NATS_SERVER=nats://${INFRA_NODE_IP}:4222
 export DYN_LOG=debug # Enable debug logging to see routing decisions
+# Use a fixed, reachable port for the disaggregation bootstrap server
+# Pick any free port and ensure it's open between nodes
+export DISAGG_BOOTSTRAP_PORT=32963
```
```
 ### Step 2: Launch Replica 1 (Node 1)
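
Reviewer's sketch, not part of the commit: because `DISAGG_BOOTSTRAP_PORT` is now passed to every worker, a quick sanity check on its value can catch typos before launch. `is_valid_port` is a hypothetical helper. It only checks that the value is an integer in the TCP port range, not that the port is free or open between nodes.

```shell
# Hypothetical helper: accepts only a decimal integer in the TCP port range 1-65535.
is_valid_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;  # empty or contains a non-digit
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Fall back to the value used in this diff if the variable is not set.
DISAGG_BOOTSTRAP_PORT="${DISAGG_BOOTSTRAP_PORT:-32963}"

if is_valid_port "$DISAGG_BOOTSTRAP_PORT"; then
  echo "DISAGG_BOOTSTRAP_PORT=${DISAGG_BOOTSTRAP_PORT} is in range"
else
  echo "DISAGG_BOOTSTRAP_PORT='${DISAGG_BOOTSTRAP_PORT}' is not a valid TCP port" >&2
fi
```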
@@ -141,8 +145,10 @@ CUDA_VISIBLE_DEVICES=0 python3 -m dynamo.sglang \
   --served-model-name Qwen/Qwen3-0.6B \
   --page-size 16 \
   --tp 1 \
+  --host 0.0.0.0 \
   --trust-remote-code \
   --skip-tokenizer-init \
+  --disaggregation-bootstrap-port ${DISAGG_BOOTSTRAP_PORT} \
   --disaggregation-mode prefill \
   --disaggregation-transfer-backend nixl &
@@ -151,8 +157,10 @@ CUDA_VISIBLE_DEVICES=1 python3 -m dynamo.sglang \
   --served-model-name Qwen/Qwen3-0.6B \
   --page-size 16 \
   --tp 1 \
+  --host 0.0.0.0 \
   --trust-remote-code \
   --skip-tokenizer-init \
+  --disaggregation-bootstrap-port ${DISAGG_BOOTSTRAP_PORT} \
   --disaggregation-mode decode \
   --disaggregation-transfer-backend nixl
```
```
@@ -161,6 +169,8 @@ CUDA_VISIBLE_DEVICES=1 python3 -m dynamo.sglang \
 >
 > - `CUDA_VISIBLE_DEVICES`: Controls which GPU each worker uses (0 and 1 for different > GPUs)
 > - `--page-size 16`: Sets the KV cache block size - must be identical across all workers
+> - `--host 0.0.0.0`: Exposes the SGLang bootstrap server on all interfaces so other nodes can reach it
+> - `--disaggregation-bootstrap-port`: Uses the fixed port you set in `DISAGG_BOOTSTRAP_PORT`; ensure this port is open between nodes
 > - `--disaggregation-mode`: Separates prefill (prompt processing) from decode (token > generation)
 > - `--disaggregation-transfer-backend nixl`: Enables high-speed GPU-to-GPU transfers
 > - `--skip-tokenizer-init`: Avoids duplicate tokenizer loading since the frontend > handles tokenization
@@ -176,8 +186,10 @@ CUDA_VISIBLE_DEVICES=0 python3 -m dynamo.sglang \
   --served-model-name Qwen/Qwen3-0.6B \
   --page-size 16 \
   --tp 1 \
+  --host 0.0.0.0 \
   --trust-remote-code \
   --skip-tokenizer-init \
+  --disaggregation-bootstrap-port ${DISAGG_BOOTSTRAP_PORT} \
   --disaggregation-mode prefill \
   --disaggregation-transfer-backend nixl &
@@ -187,8 +199,10 @@ CUDA_VISIBLE_DEVICES=1 python3 -m dynamo.sglang \
   --served-model-name Qwen/Qwen3-0.6B \
   --page-size 16 \
   --tp 1 \
+  --host 0.0.0.0 \
   --trust-remote-code \
   --skip-tokenizer-init \
+  --disaggregation-bootstrap-port ${DISAGG_BOOTSTRAP_PORT} \
   --disaggregation-mode decode \
   --disaggregation-transfer-backend nixl
```
```
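
Reviewer's sketch, not part of the commit: the four launch commands in this diff repeat the same flags and differ only in `--disaggregation-mode`. One way to keep them in sync is to generate the shared flags from a single place. `worker_flags` is a hypothetical helper that covers only the flags visible in these hunks. Anything outside the hunks (for example the model path) is deliberately left out and would still need to be supplied.

```shell
# Hypothetical helper: prints the flags shown in this diff for one worker,
# one token per line. Argument: "prefill" or "decode".
worker_flags() {
  printf '%s\n' \
    --served-model-name Qwen/Qwen3-0.6B \
    --page-size 16 \
    --tp 1 \
    --host 0.0.0.0 \
    --trust-remote-code \
    --skip-tokenizer-init \
    --disaggregation-bootstrap-port "${DISAGG_BOOTSTRAP_PORT:-32963}" \
    --disaggregation-mode "$1" \
    --disaggregation-transfer-backend nixl
}

# Show what would be passed to a decode worker.
worker_flags decode
```

In bash, the output can be read into an array with `mapfile -t FLAGS < <(worker_flags prefill)` and then expanded as `"${FLAGS[@]}"` after the remaining arguments of the launch command.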