OpenDAS / dynamo
Commit cebe9219 (Unverified)
Authored Aug 13, 2025 by Ryan McCormick; committed by GitHub, Aug 13, 2025
feat: Add vars to multi-node trtllm slurm scripts to support xP yD deployments (#2429)
Parent: dcfa87be
Showing 2 changed files with 44 additions and 34 deletions
components/backends/trtllm/multinode/multinode-examples.md (+4 -0)
components/backends/trtllm/multinode/srun_disaggregated.sh (+40 -34)
components/backends/trtllm/multinode/multinode-examples.md (at cebe9219)
...
@@ -186,6 +186,10 @@ deployment across 8 nodes:
 ./srun_disaggregated.sh
 ```

+> [!Tip]
+> To launch multiple replicas of the configured prefill/decode workers, you can set
+> NUM_PREFILL_WORKERS and NUM_DECODE_WORKERS respectively (default: 1).
+
 ## Understanding the Output

 1. The `srun_aggregated.sh` launches two `srun` jobs. The first launches
...
...
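As a rough illustration of the tip added in this file, the sketch below shows how the `${VAR:-default}` pattern used by `srun_disaggregated.sh` resolves the replica counts. The `describe_layout` helper is hypothetical and exists only for this example; it is not part of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in srun_disaggregated.sh): prints the xP yD layout
# that the script's defaults would resolve to for the current environment.
describe_layout() {
  echo "${NUM_PREFILL_WORKERS:-1}P ${NUM_DECODE_WORKERS:-1}D"
}

# With neither variable set, both fall back to the default of 1.
(unset NUM_PREFILL_WORKERS NUM_DECODE_WORKERS; describe_layout)

# Exporting the variables overrides the defaults,
# e.g. 2 prefill replicas and 3 decode replicas.
(export NUM_PREFILL_WORKERS=2 NUM_DECODE_WORKERS=3; describe_layout)
```

In practice the same overrides would be passed when invoking the script, for example `NUM_PREFILL_WORKERS=2 NUM_DECODE_WORKERS=3 ./srun_disaggregated.sh`.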
components/backends/trtllm/multinode/srun_disaggregated.sh (at cebe9219)
...
@@ -16,9 +16,11 @@ MOUNTS="${MOUNTS:-${DEFAULT_MOUNT}}"
 NUM_GPUS_PER_NODE=${NUM_GPUS_PER_NODE:-4}
 NUM_PREFILL_NODES=${NUM_PREFILL_NODES:-4}
+NUM_PREFILL_WORKERS=${NUM_PREFILL_WORKERS:-1}
 PREFILL_ENGINE_CONFIG="${PREFILL_ENGINE_CONFIG:-/mnt/engine_configs/deepseek_r1/wide_ep/wide_ep_prefill.yaml}"
 NUM_DECODE_NODES=${NUM_DECODE_NODES:-4}
+NUM_DECODE_WORKERS=${NUM_DECODE_WORKERS:-1}
 DECODE_ENGINE_CONFIG="${DECODE_ENGINE_CONFIG:-/mnt/engine_configs/deepseek_r1/wide_ep/wide_ep_decode.yaml}"
 DISAGGREGATION_STRATEGY=${DISAGGREGATION_STRATEGY:-"decode_first"}
...
...
@@ -59,10 +61,11 @@ srun \
 # NOTE: Output streamed to stdout for ease of understanding the example, but
 # in practice you would probably set `srun --output ... --error ...` to pipe
 # the stdout/stderr to files.
-echo "Launching multi-node prefill worker in background."
-DISAGGREGATION_MODE=prefill \
-ENGINE_CONFIG=${PREFILL_ENGINE_CONFIG} \
-srun \
+for ((i=1; i<=${NUM_PREFILL_WORKERS}; i++)); do
+  echo "Launching multi-node prefill worker in background."
+  DISAGGREGATION_MODE=prefill \
+  ENGINE_CONFIG=${PREFILL_ENGINE_CONFIG} \
+  srun \
     --mpi pmix \
     --oversubscribe \
     --container-image "${IMAGE}" \
...
...
@@ -76,11 +79,13 @@ srun \
     --ntasks-per-node "${NUM_GPUS_PER_NODE}" \
     --jobid "${SLURM_JOB_ID}" \
     /mnt/multinode/start_trtllm_worker.sh &
+done

-echo "Launching multi-node decode worker in background."
-DISAGGREGATION_MODE=decode \
-ENGINE_CONFIG=${DECODE_ENGINE_CONFIG} \
-srun \
+for ((i=1; i<=${NUM_DECODE_WORKERS}; i++)); do
+  echo "Launching multi-node decode worker in background."
+  DISAGGREGATION_MODE=decode \
+  ENGINE_CONFIG=${DECODE_ENGINE_CONFIG} \
+  srun \
     --mpi pmix \
     --oversubscribe \
     --container-image "${IMAGE}" \
...
...
@@ -94,3 +99,4 @@ srun \
     --ntasks-per-node "${NUM_GPUS_PER_NODE}" \
     --jobid "${SLURM_JOB_ID}" \
     /mnt/multinode/start_trtllm_worker.sh &
+done
\ No newline at end of file
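The launch pattern introduced by this change (one backgrounded job per replica, with the mode and engine config passed as per-command environment variables) can be sketched without Slurm as follows. This is a minimal sketch under stated assumptions: `launch_worker` is a hypothetical stand-in for the `srun ... start_trtllm_worker.sh` call, `launch_replicas` is a hypothetical wrapper, and the trailing `wait` is added for the example only and does not appear in the diff.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the `srun ... start_trtllm_worker.sh` invocation.
# It only reports the environment it was launched with.
launch_worker() {
  echo "mode=${DISAGGREGATION_MODE} config=${ENGINE_CONFIG} replica=$1"
}

# Hypothetical wrapper. $1: mode, $2: engine config path, $3: replica count.
launch_replicas() {
  local mode="$1" config="$2" count="$3" i
  for ((i = 1; i <= count; i++)); do
    # The VAR=value prefixes apply to this one command, as in the script.
    DISAGGREGATION_MODE="${mode}" ENGINE_CONFIG="${config}" launch_worker "${i}" &
  done
  # Assumption for this sketch: block until every background replica exits.
  wait
}

# Example: two prefill replicas and one decode replica (placeholder config paths).
launch_replicas prefill /tmp/prefill.yaml 2
launch_replicas decode /tmp/decode.yaml 1
```

Because the replicas run in the background, the order of their output lines is not guaranteed.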