OpenDAS / dynamo
Commit 869562da (Unverified)
Authored Jan 13, 2026 by Biswa Panda. Committed by GitHub, Jan 13, 2026.
feat: add examples for kv state approximation based routing (#5320)
Parent: 648e1cd2
Showing 2 changed files with 50 additions and 3 deletions
examples/backends/sglang/launch/agg_router.sh (+21 -3)
examples/backends/trtllm/launch/agg_router_approx.sh (+29 -0)
examples/backends/sglang/launch/agg_router.sh
@@ -13,16 +13,22 @@ trap cleanup EXIT INT TERM
 # Parse command line arguments
 ENABLE_OTEL=false
+APPROX_MODE=false
 while [[ $# -gt 0 ]]; do
     case $1 in
         --enable-otel)
             ENABLE_OTEL=true
             shift
             ;;
+        --approx)
+            APPROX_MODE=true
+            shift
+            ;;
         -h|--help)
             echo "Usage: $0 [OPTIONS]"
             echo "Options:"
             echo " --enable-otel Enable OpenTelemetry tracing"
+            echo " --approx Enable approximate KV routing (no KV events)"
             echo " -h, --help Show this help message"
             echo ""
             echo "Note: System metrics are enabled by default on ports 8081 (worker-1), 8082 (worker-2)"
@@ -47,11 +53,23 @@ fi
 # run ingress
 # dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000)
+FRONTEND_ARGS=(--router-mode kv)
+if [ "$APPROX_MODE" = true ]; then
+    FRONTEND_ARGS+=(--no-kv-events)
+fi
 OTEL_SERVICE_NAME=dynamo-frontend \
-python3 -m dynamo.frontend --router-mode kv &
+python3 -m dynamo.frontend "${FRONTEND_ARGS[@]}" &
 DYNAMO_PID=$!
 # run worker
+# Build KV events args conditionally (only when not in approx mode)
+KV_EVENTS_ARGS_1=()
+KV_EVENTS_ARGS_2=()
+if [ "$APPROX_MODE" = false ]; then
+    KV_EVENTS_ARGS_1=(--kv-events-config '{"publisher":"zmq","topic":"kv-events","endpoint":"tcp://*:5557"}')
+    KV_EVENTS_ARGS_2=(--kv-events-config '{"publisher":"zmq","topic":"kv-events","endpoint":"tcp://*:5558"}')
+fi
 OTEL_SERVICE_NAME=dynamo-worker-1 DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT_WORKER1:-8081} \
 python3 -m dynamo.sglang \
   --model-path Qwen/Qwen3-0.6B \
@@ -59,7 +77,7 @@ python3 -m dynamo.sglang \
   --page-size 16 \
   --tp 1 \
   --trust-remote-code \
-  --kv-events-config '{"publisher":"zmq","topic":"kv-events","endpoint":"tcp://*:5557"}' \
+  "${KV_EVENTS_ARGS_1[@]}" \
   --enable-metrics \
   "${TRACE_ARGS[@]}" &
 WORKER_PID=$!
@@ -71,6 +89,6 @@ CUDA_VISIBLE_DEVICES=1 python3 -m dynamo.sglang \
   --page-size 16 \
   --tp 1 \
   --trust-remote-code \
-  --kv-events-config '{"publisher":"zmq","topic":"kv-events","endpoint":"tcp://*:5558"}' \
+  "${KV_EVENTS_ARGS_2[@]}" \
   --enable-metrics \
   "${TRACE_ARGS[@]}"
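The change above replaces hard-coded flags with bash arrays that are filled only when a mode is active. As a minimal sketch of why that works (the `build_args` helper and the `printf` stand-in for the real command are hypothetical, not part of the commit): a quoted `"${arr[@]}"` expansion of an empty array produces zero words, so no stray empty argument reaches the command.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: build_args is a hypothetical helper, and printf
# stands in for the real `python3 -m dynamo.frontend` invocation.

build_args() {
    local approx_mode=$1
    local args=(--router-mode kv)
    local extra=()              # stays empty unless approx mode is requested

    if [ "$approx_mode" = true ]; then
        extra=(--no-kv-events)
    fi

    # Print one resulting argument per line. An empty array expands to
    # nothing at all, not to an empty string.
    printf '%s\n' "${args[@]}" "${extra[@]}"
}

build_args false
build_args true
```

With `false` the function prints only `--router-mode` and `kv`; with `true` it also prints `--no-kv-events`. One caveat if a script enables `set -u`: bash versions before 4.4 treat the expansion of an empty array as an unbound variable, so this pattern is safest without `set -u` or on bash 4.4 and later.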
examples/backends/trtllm/launch/agg_router_approx.sh
new file, mode 0 → 100755
+#!/bin/bash
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# Environment variables with defaults
+export DYNAMO_HOME=${DYNAMO_HOME:-"/workspace"}
+export MODEL_PATH=${MODEL_PATH:-"Qwen/Qwen3-0.6B"}
+export SERVED_MODEL_NAME=${SERVED_MODEL_NAME:-"Qwen/Qwen3-0.6B"}
+export AGG_ENGINE_ARGS=${AGG_ENGINE_ARGS:-"$DYNAMO_HOME/examples/backends/trtllm/engine_configs/qwen3/agg.yaml"}
+# Setup cleanup trap
+cleanup() {
+    echo "Cleaning up background processes..."
+    kill $DYNAMO_PID 2>/dev/null || true
+    wait $DYNAMO_PID 2>/dev/null || true
+    echo "Cleanup complete."
+}
+trap cleanup EXIT INT TERM
+# run frontend with KV router in approximate mode (i.e. no KV events)
+python3 -m dynamo.frontend --router-mode kv --no-kv-events &
+DYNAMO_PID=$!
+# run worker (no event publishing needed - frontend handles routing with predictive approx kv mode)
+python3 -m dynamo.trtllm \
+  --model-path "$MODEL_PATH" \
+  --served-model-name "$SERVED_MODEL_NAME" \
+  --extra-engine-args "$AGG_ENGINE_ARGS"
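The new script takes its settings from environment variables with `${VAR:-default}` fallbacks. A small sketch of how that expansion behaves (the `resolve_model_path` helper and the `my-org/my-model` value are hypothetical, used only for illustration): the default applies when the variable is unset or empty, and any non-empty value set by the caller wins.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: resolve_model_path is a hypothetical helper that
# mirrors the ${MODEL_PATH:-"Qwen/Qwen3-0.6B"} line in the script.

resolve_model_path() {
    # ${VAR:-default} yields the default when VAR is unset OR empty.
    echo "${MODEL_PATH:-Qwen/Qwen3-0.6B}"
}

unset MODEL_PATH
resolve_model_path              # falls back to the default model

MODEL_PATH="my-org/my-model"    # hypothetical override
resolve_model_path              # uses the caller's value
```

In practice this means the launch script can presumably be pointed at another model by exporting `MODEL_PATH` (and, if needed, `SERVED_MODEL_NAME` and `AGG_ENGINE_ARGS`) before running it, without editing the file.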