OpenDAS / dynamo
Commit 9780bf3a
Authored Mar 04, 2026 by Qi Wang; committed by GitHub, Mar 04, 2026
perf: multimodal benchmark sweep (#6795)
parent f0bfda1e
Showing 1 changed file with 37 additions and 0 deletions
examples/backends/vllm/launch/vllm_serve_embedding_cache.sh
0 → 100755
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

MODEL="Qwen/Qwen3-VL-30B-A3B-Instruct-FP8"
CAPACITY_GB=10
EXTRA_ARGS=()

while [[ $# -gt 0 ]]; do
    case "$1" in
        --model) MODEL="$2"; shift 2 ;;
        --multimodal-embedding-cache-capacity-gb) CAPACITY_GB="$2"; shift 2 ;;
        *) EXTRA_ARGS+=("$1"); shift ;;
    esac
done

# Need vLLM main or v0.17+
EC_ARGS=()
if [[ "$CAPACITY_GB" != "0" ]]; then
    EC_ARGS=(--ec-transfer-config "{
        \"ec_role\": \"ec_both\",
        \"ec_connector\": \"DynamoMultimodalEmbeddingCacheConnector\",
        \"ec_connector_module_path\": \"dynamo.vllm.multimodal_utils.multimodal_embedding_cache_connector\",
        \"ec_connector_extra_config\": {
            \"multimodal_embedding_cache_capacity_gb\": $CAPACITY_GB
        }
    }")
fi

CUDA_VISIBLE_DEVICES=2 \
vllm serve "$MODEL" \
    --enable-log-requests \
    --max-model-len 16384 \
    --gpu-memory-utilization .9 \
    "${EC_ARGS[@]}" \
    "${EXTRA_ARGS[@]}"
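The argument handling in the script above can be exercised without launching vLLM. The following is a minimal dry-run sketch, not part of the commit: it copies the script's parsing loop into a hypothetical `parse_args` function and prints the resulting values. The model name `my/model` and the forwarded `--port 8001` pair are placeholder inputs chosen for illustration.

```shell
#!/bin/bash
# Dry-run sketch (hypothetical helper, not in the commit): mirrors the
# script's argument loop and reports what would be passed to `vllm serve`.

parse_args() {
    # Same defaults as the committed script.
    MODEL="Qwen/Qwen3-VL-30B-A3B-Instruct-FP8"
    CAPACITY_GB=10
    EXTRA_ARGS=()
    while [[ $# -gt 0 ]]; do
        case "$1" in
            --model) MODEL="$2"; shift 2 ;;
            --multimodal-embedding-cache-capacity-gb) CAPACITY_GB="$2"; shift 2 ;;
            *) EXTRA_ARGS+=("$1"); shift ;;   # anything unrecognized is forwarded as-is
        esac
    done
}

# Placeholder invocation: override the model, disable the cache, forward --port.
parse_args --model my/model --multimodal-embedding-cache-capacity-gb 0 --port 8001

# As in the script, a capacity of "0" leaves the connector arguments empty.
EC_ARGS=()
if [[ "$CAPACITY_GB" != "0" ]]; then
    EC_ARGS=(--ec-transfer-config "<json omitted in this sketch>")
fi

printf 'model=%s capacity=%s extra=%s ec_args=%d\n' \
    "$MODEL" "$CAPACITY_GB" "${EXTRA_ARGS[*]}" "${#EC_ARGS[@]}"
# prints: model=my/model capacity=0 extra=--port 8001 ec_args=0
```

One behavior worth knowing when calling the script: if `--model` or `--multimodal-embedding-cache-capacity-gb` is given as the last argument with no value, `shift 2` fails in bash without shifting, so the loop never terminates. Always pass a value with those two flags.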