OpenDAS / vllm_cscc
Commit 977180c9 (Unverified)
Authored Jul 08, 2025 by Ricardo Decal
Committed by GitHub, Jul 08, 2025
[Docs] Improve documentation for multi-node service helper script (#20600)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
parent c40784c7
Showing 1 changed file with 32 additions and 7 deletions
examples/online_serving/multi-node-serving.sh (+32 -7)
 #!/bin/bash
+#
+# Helper script to manually start or join a Ray cluster for online serving of vLLM models.
+# This script is first executed on the head node, and then on each worker node with the IP address
+# of the head node.
+#
+# Subcommands:
+#   leader: Launches a Ray head node and blocks until the cluster reaches the expected size (head + workers).
+#   worker: Starts a worker node that connects to an existing Ray head node.
+#
+# Example usage:
+# On the head node machine, start the Ray head node process and run a vLLM server.
+#   ./multi-node-serving.sh leader --ray_port=6379 --ray_cluster_size=<SIZE> [<extra ray args>] && \
+#   python3 -m vllm.entrypoints.openai.api_server --port 8080 --model meta-llama/Meta-Llama-3.1-405B-Instruct --tensor-parallel-size 8 --pipeline_parallel_size 2
+#
+# On each worker node, start the Ray worker node process.
+#   ./multi-node-serving.sh worker --ray_address=<HEAD_NODE_IP> --ray_port=6379 [<extra ray args>]
+#
+# About Ray:
+# Ray is an open-source distributed execution framework that simplifies
+# distributed computing. Learn more:
+# https://ray.io/
-subcommand=$1
-shift
-ray_port=6379
-ray_init_timeout=300
-declare -a start_params
+subcommand=$1  # Either "leader" or "worker".
+shift  # Remove the subcommand from the argument list.
+ray_port=6379  # Port used by the Ray head node.
+ray_init_timeout=300  # Seconds to wait before timing out.
+declare -a start_params  # Parameters forwarded to the underlying 'ray start' command.
+# Handle the worker subcommand.
 case "$subcommand" in
   worker)
     ray_address=""
...
@@ -32,6 +55,7 @@ case "$subcommand" in
       exit 1
     fi

+    # Retry until the worker node connects to the head node or the timeout expires.
     for (( i=0; i < $ray_init_timeout; i+=5 )); do
       ray start --address=$ray_address:$ray_port --block "${start_params[@]}"
       if [ $? -eq 0 ]; then
...
@@ -45,6 +69,7 @@ case "$subcommand" in
     exit 1
     ;;

+  # Handle the leader subcommand.
   leader)
     ray_cluster_size=""
     while [ $# -gt 0 ]; do
...
@@ -69,10 +94,10 @@ case "$subcommand" in
       exit 1
     fi

-    # start the ray daemon
+    # Start the Ray head node.
     ray start --head --port=$ray_port "${start_params[@]}"

-    # wait until all workers are active
+    # Poll Ray until every worker node is active.
     for (( i=0; i < $ray_init_timeout; i+=5 )); do
       active_nodes=`python3 -c 'import ray; ray.init(); print(sum(node["Alive"] for node in ray.nodes()))'`
       if [ $active_nodes -eq $ray_cluster_size ]; then
...
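The leader's wait loop in the last hunk can be read as a generic "poll until a count is reached or a timeout expires" pattern. The sketch below is one way to isolate that control flow so it can be tried without a Ray cluster. It is not part of the commit: the names `wait_for_nodes` and `count_alive_nodes` and the `interval` argument are hypothetical. The sleep between polls is an assumption inferred from the `i+=5` step, since the rest of the loop body is elided in the diff.

```shell
#!/bin/bash
# Sketch of the leader's poll-until-ready loop, with the Ray probe factored out.
# Hypothetical names (not in the script): wait_for_nodes, count_alive_nodes.

# Print the number of alive Ray nodes, using the same one-liner as the script.
count_alive_nodes() {
  python3 -c 'import ray; ray.init(); print(sum(node["Alive"] for node in ray.nodes()))'
}

# Usage: wait_for_nodes EXPECTED TIMEOUT INTERVAL PROBE [PROBE_ARGS...]
# Runs PROBE every INTERVAL seconds until it prints EXPECTED, or until
# TIMEOUT seconds of polling budget are used up.
# INTERVAL must be a positive integer, otherwise the loop never advances.
# Returns 0 on success and 1 on timeout.
wait_for_nodes() {
  local expected="$1" timeout="$2" interval="$3"
  local elapsed=0 active
  shift 3
  while [ "$elapsed" -lt "$timeout" ]; do
    # The probe's stderr is discarded; only its stdout is compared.
    active=$("$@" 2>/dev/null)
    # String comparison avoids a test error when the probe prints nothing.
    if [ "$active" = "$expected" ]; then
      echo "All $expected nodes are active."
      return 0
    fi
    echo "Waiting for nodes: ${active:-0}/$expected active (${elapsed}s elapsed)."
    sleep "$interval"  # Assumed pause between polls, matching the loop step.
    elapsed=$((elapsed + interval))
  done
  echo "Timed out after ${timeout}s waiting for $expected nodes." >&2
  return 1
}
```

With the script's own variables, a call would look like `wait_for_nodes "$ray_cluster_size" "$ray_init_timeout" 5 count_alive_nodes`. Passing the probe as a command makes it possible to substitute a stub when exercising the timeout path.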