OpenDAS / dynamo
Commit ae7e08a3 (Unverified)
Authored Jun 14, 2025 by Ryan McCormick; committed by GitHub Jun 13, 2025
fix: Fix NATS_SERVER value, add details on customizing MOUNTS (#1520)
parent 75503dae
Showing 2 changed files with 22 additions and 2 deletions (+22 -2)
examples/tensorrt_llm/configs/deepseek_r1/multinode/README.md (+19 -0)
examples/tensorrt_llm/configs/deepseek_r1/multinode/srun_script.sh (+3 -2)
examples/tensorrt_llm/configs/deepseek_r1/multinode/README.md
...
...
@@ -68,6 +68,25 @@ inside an interactive shell on one of the allocated nodes:
# https://github.com/ai-dynamo/dynamo/tree/main/examples/tensorrt_llm#build-docker
export IMAGE="<dynamo_trtllm_image>"
# MOUNTS are the host:container path pairs that are mounted into the containers
# launched by each `srun` command.
#
# If you want to reference files, such as $MODEL_PATH below, in a
# different location, you can customize MOUNTS or specify additional
# comma-separated mount pairs here.
#
# NOTE: Currently, this example assumes that the local bash scripts and configs
# referenced are mounted into /mnt inside the container. If you want to
# customize the location of the scripts, make sure to modify `srun_script.sh`
# accordingly for the new locations of `start_frontend_services.sh` and
# `start_trtllm_worker.sh`.
#
# For example, assuming your cluster had a `/lustre` directory on the host, you
# could add that as a mount like so:
#
# export MOUNTS="${PWD}:/mnt,/lustre:/lustre"
export MOUNTS="${PWD}:/mnt"
# NOTE: In general, Deepseek R1 is very large, so it is recommended to
# pre-download the model weights and save them in some shared location,
# NFS storage, HF_CACHE, etc. and modify the `--model-path` below
...
...
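The README comments above describe `MOUNTS` as a comma-separated list of host:container path pairs. As a minimal sketch of that format (the helper name `list_mounts` is hypothetical and not part of this commit), the following splits such a value and prints each pair, which can help sanity-check a customized `MOUNTS` before launching anything:

```shell
# Hypothetical helper, not part of the repository: prints one
# "host -> container" line per comma-separated mount pair.
# The body runs in a subshell so the IFS and globbing changes
# do not leak into the caller.
list_mounts() (
  IFS=','
  set -f  # disable globbing while the unquoted value is split
  for pair in $1; do
    # Text before the first ':' is the host path, the rest is the container path.
    printf '%s -> %s\n' "${pair%%:*}" "${pair#*:}"
  done
)

# Example using the /lustre mount suggested in the README comment.
list_mounts "${PWD}:/mnt,/lustre:/lustre"
```

This only parses the string; it does not check that the host paths exist or that the container runtime accepts them.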
examples/tensorrt_llm/configs/deepseek_r1/multinode/srun_script.sh
...
...
@@ -10,7 +10,8 @@ IMAGE="${IMAGE:-""}"
# but you may freely customize the mounts based on your cluster. A common practice
# is to mount paths to NFS storage for common scripts, model weights, etc.
# NOTE: This can be a comma separated list of multiple mounts as well.
-MOUNTS="$PWD:/mnt"
+DEFAULT_MOUNT="${PWD}:/mnt"
+MOUNTS="${MOUNTS:-${DEFAULT_MOUNT}}"
# Example values, assuming 4 nodes with 4 GPUs on each node, such as 4xGB200 nodes.
# For 8xH100 nodes as an example, you may set this to 2 nodes x 16 gpus, or 4 nodes x 32 gpus instead.
...
...
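The `${MOUNTS:-${DEFAULT_MOUNT}}` expansion in this hunk is what lets a value exported in the calling shell take effect: the default is used only when `MOUNTS` is unset or empty. A small sketch of that behavior, assuming a hypothetical wrapper function `resolve_mounts` that does not exist in `srun_script.sh`:

```shell
# Hypothetical wrapper around the same expansion used in srun_script.sh.
resolve_mounts() {
  DEFAULT_MOUNT="${PWD}:/mnt"
  # ":-" substitutes the default when MOUNTS is unset OR empty.
  printf '%s\n' "${MOUNTS:-${DEFAULT_MOUNT}}"
}

# No value provided: falls back to the default mount.
unset MOUNTS
resolve_mounts

# Value provided by the caller: kept unchanged.
MOUNTS="${PWD}:/mnt,/lustre:/lustre"
resolve_mounts
```

Before this change the script assigned `MOUNTS` unconditionally, so a value exported by the user beforehand would have been overwritten.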
@@ -23,7 +24,7 @@ ACCOUNT="$(sacctmgr -nP show assoc where user=$(whoami) format=account)"
export HEAD_NODE="${SLURMD_NODENAME}"
export HEAD_NODE_IP="$(hostname -i)"
export ETCD_ENDPOINTS="${HEAD_NODE_IP}:2379"
-export NATS_SERVER="${HEAD_NODE_IP}:4222"
+export NATS_SERVER="nats://${HEAD_NODE_IP}:4222"
if [[ -z ${IMAGE} ]]; then
  echo "ERROR: You need to set the IMAGE environment variable to the " \
...
...
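The fix in this hunk adds the `nats://` scheme to `NATS_SERVER`. If the value can also come from elsewhere (for example, a user-set environment variable), a guard along these lines could normalize it. This is only a sketch: `with_nats_scheme` is a hypothetical helper, and the address below is a placeholder rather than anything from the commit.

```shell
# Hypothetical helper, not part of srun_script.sh: prefixes "nats://"
# unless the value already carries a URL scheme.
with_nats_scheme() {
  case "$1" in
    *://*) printf '%s\n' "$1" ;;         # already has a scheme, keep as-is
    *)     printf 'nats://%s\n' "$1" ;;  # bare host:port, add the scheme
  esac
}

# Placeholder address for illustration only.
with_nats_scheme "10.0.0.1:4222"         # prints nats://10.0.0.1:4222
with_nats_scheme "nats://10.0.0.1:4222"  # prints nats://10.0.0.1:4222
```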