Unverified commit 2a24e4aa, authored by ishandhanani, committed by GitHub

fix: fix the ooms in gb200 default instructions (#3768)

parent 9a723734
@@ -59,7 +59,6 @@ MC_TE_METRIC=true \
 SGLANG_DISAGGREGATION_HEARTBEAT_MAX_FAILURE=100000 \
 SGLANG_DISAGGREGATION_BOOTSTRAP_TIMEOUT=100000 \
 SGLANG_DISAGGREGATION_WAITING_TIMEOUT=100000 \
-SGLANG_MOONCAKE_CUSTOM_MEM_POOL=True \
 MC_FORCE_MNNVL=1 \
 NCCL_MNNVL_ENABLE=1 \
 NCCL_CUMEM_ENABLE=1 \
@@ -98,7 +97,7 @@ python3 -m dynamo.sglang \
 --disable-cuda-graph \
 --chunked-prefill-size 16384 \
 --max-total-tokens 32768 \
---mem-fraction-static 0.8 \
+--mem-fraction-static 0.82 \
 --log-level debug
 ```
@@ -128,10 +127,10 @@ python3 -m dynamo.sglang \
 --disaggregation-mode decode \
 --dist-init-addr ${HEAD_DECODE_NODE_IP}:29500 \
 --disaggregation-bootstrap-port 30001 \
---nnodes 12 \
+--nnodes 2 \
 --node-rank 0 \
---tp-size 48 \
---dp-size 48 \
+--tp-size 8 \
+--dp-size 8 \
 --enable-dp-attention \
 --host 0.0.0.0 \
 --decode-log-interval 1 \
@@ -155,4 +154,4 @@ python3 -m dynamo.sglang \
 --log-level debug
 ```
-On the other decode nodes (this example has 2 total decode nodes), run the same command but change `--node-rank` to 1
\ No newline at end of file
+On the other decode nodes (this example has 2 total decode nodes), run the same command but change `--node-rank` to 1.
\ No newline at end of file
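
The decode values in this hunk are consistent with 4 GPUs per node: 12 nodes at `--tp-size 48` before, 2 nodes at `--tp-size 8` after. The following is a minimal sketch, assuming that per-node GPU count (the commit does not state it), of how the sizes can be derived and which flags differ between the two decode nodes. `GPUS_PER_NODE` and the other variable names are illustrative and do not appear in the instructions.

```shell
# Sketch only: GPUS_PER_NODE=4 is an assumption inferred from the old and new
# values in this diff; it is not stated anywhere in the commit.
NNODES=2
GPUS_PER_NODE=4

# The diff keeps --dp-size equal to --tp-size, so derive both from one product.
TP_SIZE=$((NNODES * GPUS_PER_NODE))
DP_SIZE=$TP_SIZE

# Print the flags that depend on the node; every other flag stays exactly
# as in the documented decode command.
NODE_RANK=0
while [ "$NODE_RANK" -lt "$NNODES" ]; do
  echo "node ${NODE_RANK}: --nnodes ${NNODES} --node-rank ${NODE_RANK} --tp-size ${TP_SIZE} --dp-size ${DP_SIZE}"
  NODE_RANK=$((NODE_RANK + 1))
done
```

Only `--node-rank` changes between the two lines printed, which matches the instruction to rerun the same command with `--node-rank` set to 1.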
@@ -39,6 +39,7 @@ python3 -m dynamo.sglang \
 --disaggregation-mode prefill \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
+--load-balance-method round_robin \
 --host 0.0.0.0 \
 --mem-fraction-static 0.82
 ```
@@ -60,6 +61,7 @@ python3 -m dynamo.sglang \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
 --host 0.0.0.0 \
+--load-balance-method round_robin \
 --mem-fraction-static 0.82
 ```
@@ -80,6 +82,7 @@ python3 -m dynamo.sglang \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
 --host 0.0.0.0 \
+--prefill-round-robin-balance \
 --mem-fraction-static 0.82
 ```
@@ -100,6 +103,7 @@ python3 -m dynamo.sglang \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
 --host 0.0.0.0 \
+--prefill-round-robin-balance \
 --mem-fraction-static 0.82
 ```