OpenDAS / dynamo
Commit 01a634d1 (Unverified)
Authored Dec 02, 2025 by ishandhanani
Committed by GitHub, Dec 02, 2025
feat: SGLang FP8 improvements and vLLM benchmark enhancements (#4675)
Parent: c9fdc2ea
Showing 1 changed file with 30 additions and 8 deletions
examples/backends/sglang/slurm_jobs/scripts/vllm/bench.sh (+30, -8)
...
...
@@ -27,19 +27,42 @@ chosen_req_rate=$8
 echo "Config ${chosen_isl}; ${chosen_osl}; ${chosen_concurrencies[@]}; ${chosen_req_rate}"
-wait_for_model_timeout=1500 # 25 minutes
-wait_for_model_check_interval=5 # check interval -> 5s
-wait_for_model_report_interval=60 # wait_for_model report interval -> 60s
+wait_for_model_timeout=3000
+wait_for_model_check_interval=5
+wait_for_model_report_interval=60
-wait_for_model $head_node $head_port $n_prefill $n_decode $wait_for_model_check_interval $wait_for_model_timeout $wait_for_model_report_interval
+wait_for_model $head_node $head_port $n_prefill $n_decode \
+    $wait_for_model_check_interval $wait_for_model_timeout $wait_for_model_report_interval
 set -e
-# Warmup the model with a sweep of concurrencies
+# Warmup defaults
 warmup_isl=$chosen_isl
 warmup_osl=$chosen_osl
 warmup_req_rate=250
-warmup_concurrency_list=(1 4 8 32 64 128 256 512 1024 4096)
+warmup_concurrency_list=(1 4 8 32 64 128 256 512)
+# Ensure all chosen concurrencies are in warmup list
+for c in "${chosen_concurrencies[@]}"; do
+    found=false
+    for w in "${warmup_concurrency_list[@]}"; do
+        if [[ "$c" == "$w" ]]; then
+            found=true
+            break
+        fi
+    done
+    if [[ "$found" == false ]]; then
+        warmup_concurrency_list+=("$c")
+    fi
+done
+# Optional: sort warmup list numerically
+IFS=$'\n' warmup_concurrency_list=($(sort -n <<<"${warmup_concurrency_list[*]}"))
+unset IFS
+echo "Final warmup list: ${warmup_concurrency_list[@]}"
+# Warmup
 for warmup_concurrency in "${warmup_concurrency_list[@]}"
 do
     echo "Warming up model with concurrency $warmup_concurrency"
...
...
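The membership loop and the `IFS`-based sort in the hunk above could also be collapsed into a single step. The following is a minimal sketch, assuming bash 4 or newer (for `mapfile`) and that every concurrency is a plain integer. `merge_concurrencies` is a hypothetical helper and the `chosen_concurrencies` values are made up for illustration; neither comes from bench.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bench.sh): print the numerically sorted,
# de-duplicated union of all arguments, one value per line.
merge_concurrencies() {
    printf '%s\n' "$@" | sort -n -u
}

warmup_concurrency_list=(1 4 8 32 64 128 256 512)
chosen_concurrencies=(16 512 2048)   # illustrative values, not from the commit

# mapfile stores one line per array element, so IFS never has to be
# changed and restored, and the values are not subject to globbing.
mapfile -t warmup_concurrency_list < <(
    merge_concurrencies "${warmup_concurrency_list[@]}" "${chosen_concurrencies[@]}"
)

echo "Final warmup list: ${warmup_concurrency_list[*]}"
```

One behavioral difference from the committed loop: `sort -n -u` also removes duplicates that were already present in the default list, whereas the loop only avoids adding new ones.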
@@ -73,7 +96,7 @@ for concurrency in "${chosen_concurrencies[@]}"
 do
     num_prompts=$((concurrency * 5))
     echo "Running benchmark with concurrency: $concurrency and num-prompts: $num_prompts, writing to file ${result_dir}"
-    result_filename="isl_${chosen_isl}_osl_${chosen_osl}_concurrency_${concurrency}_req_rate_${chosen_req_rate}_ctx${prefill_gpus}_gen${decode_gpus}.json"
+    result_filename="isl_${chosen_isl}_osl_${chosen_osl}_concurrency_${concurrency}_req_rate_${chosen_req_rate}_ctx_${prefill_gpus}_gen_${decode_gpus}_gpus_${total_gpus}.json"
     set -x
     echo "$(date '+%Y-%m-%d %H:%M:%S')"
...
...
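The renamed result file puts an underscore after `ctx` and `gen` and appends the total GPU count, which makes the name easier to split mechanically. Below is a sketch of building and then parsing such a name. `build_result_filename` is a hypothetical helper and the sample values are illustrative. This diff does not show where `total_gpus` is set, so the sketch takes it as an explicit argument rather than deriving it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bench.sh): assemble the result filename
# in the field order used by the new naming scheme.
build_result_filename() {
    local isl=$1 osl=$2 concurrency=$3 req_rate=$4
    local prefill_gpus=$5 decode_gpus=$6 total_gpus=$7
    printf 'isl_%s_osl_%s_concurrency_%s_req_rate_%s_ctx_%s_gen_%s_gpus_%s.json\n' \
        "$isl" "$osl" "$concurrency" "$req_rate" \
        "$prefill_gpus" "$decode_gpus" "$total_gpus"
}

# Illustrative values only.
name=$(build_result_filename 1024 1024 64 inf 4 8 12)
echo "$name"

# Recover the total GPU count from the name with parameter expansion.
gpus=${name##*_gpus_}   # drop everything up to and including "_gpus_"
gpus=${gpus%.json}      # drop the extension
echo "total GPUs: $gpus"
```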
@@ -98,4 +121,3 @@ do
    echo "Completed benchmark with concurrency: $concurrency"
    echo "-----------------------------------------"
done
set +e
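The first hunk raises `wait_for_model_timeout` from 1500 to 3000 seconds and leaves the check and report intervals at 5 and 60 seconds. `wait_for_model` is defined outside this diff, so the following is only a sketch of how a poller might combine those three values. `poll_until` and its argument order are hypothetical and chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical poller (not the real wait_for_model): run a check command
# every check_interval seconds, print a progress line at most once per
# report_interval seconds, and give up after timeout seconds.
# Returns 0 once the check succeeds, 1 on timeout.
poll_until() {
    local check_interval=$1 timeout=$2 report_interval=$3
    shift 3
    local start=$SECONDS last_report=$SECONDS elapsed
    while true; do
        if "$@"; then
            return 0
        fi
        elapsed=$((SECONDS - start))
        if (( elapsed >= timeout )); then
            echo "Timed out after ${elapsed}s" >&2
            return 1
        fi
        if (( SECONDS - last_report >= report_interval )); then
            echo "Still waiting after ${elapsed}s" >&2
            last_report=$SECONDS
        fi
        sleep "$check_interval"
    done
}

# Demo with a check that succeeds immediately; a real check would probe
# the model server instead of calling `true`.
poll_until 5 3000 60 true && echo "model ready"
```

Keeping the report interval separate from the check interval lets the poller react within seconds while logging only about once a minute, so a longer timeout does not flood the job log.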