OpenDAS / dynamo
Commit 57402e70
Authored May 08, 2025 by Ryan McCormick
Committed by GitHub, May 08, 2025
docs: Add slurm env var workaround for MPI spawn errors (#992)
Parent: 02145479
Showing 1 changed file with 8 additions and 0 deletions
examples/tensorrt_llm/README.md
...
...
@@ -197,6 +197,14 @@ Notes:
cd /workspace/examples/tensorrt_llm
dynamo serve components.worker:TensorRTLLMWorker -f ./configs/disagg.yaml --service-name TensorRTLLMWorker &
```
- If you see an error about MPI Spawn failing during TRTLLM Worker initialization on a Slurm-based cluster,
try unsetting the following environment variables before launching the TRTLLM worker. If you intend to
run other Slurm-based commands or processes on the same node after deploying the TRTLLM worker, you may
want to save these values into temporary variables and then restore them afterwards.
```bash
# Workaround for error: `mpi4py.MPI.Exception: MPI_ERR_SPAWN: could not spawn processes`
unset SLURM_JOBID SLURM_JOB_ID SLURM_NODELIST
```
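The save-and-restore step mentioned above could look roughly like the sketch below. It is not part of the upstream README: the function names and the `SAVED_*` variable names are hypothetical, chosen only for illustration, and the sketch treats an empty value the same as an unset one.

```bash
# Hypothetical helpers (not part of dynamo or Slurm) that stash the three
# Slurm variables, unset them, and later put them back.

save_and_unset_slurm_vars() {
  # Copy the current values; "${VAR-}" expands to an empty string if VAR is unset.
  SAVED_SLURM_JOBID="${SLURM_JOBID-}"
  SAVED_SLURM_JOB_ID="${SLURM_JOB_ID-}"
  SAVED_SLURM_NODELIST="${SLURM_NODELIST-}"
  unset SLURM_JOBID SLURM_JOB_ID SLURM_NODELIST
}

restore_slurm_vars() {
  # Re-export only the variables that had a non-empty value when saved.
  if [ -n "${SAVED_SLURM_JOBID-}" ]; then export SLURM_JOBID="$SAVED_SLURM_JOBID"; fi
  if [ -n "${SAVED_SLURM_JOB_ID-}" ]; then export SLURM_JOB_ID="$SAVED_SLURM_JOB_ID"; fi
  if [ -n "${SAVED_SLURM_NODELIST-}" ]; then export SLURM_NODELIST="$SAVED_SLURM_NODELIST"; fi
  # Drop the temporary copies.
  unset SAVED_SLURM_JOBID SAVED_SLURM_JOB_ID SAVED_SLURM_NODELIST
}

# Intended usage (left as comments so the sketch has no side effects):
#   save_and_unset_slurm_vars
#   <launch the TRTLLM worker here>
#   restore_slurm_vars
```

Because the functions change the environment of the calling shell, they need to be defined or sourced in the same shell session that launches the worker.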
### Client
...
...