OpenDAS / dynamo
Commit a7b703bd (Unverified)
Authored Oct 28, 2025 by hhzhang16
Committed by GitHub, Oct 29, 2025

fix: profiler sidecar and other SLA-driven autodeployment fixes (#3932)

Signed-off-by: Hannah Zhang <hannahz@nvidia.com>

Parent: 3998fdcb

Showing 7 changed files with 27 additions and 13 deletions (+27 -13)

benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml (+2 -3)
benchmarks/profiler/deploy/profile_sla_dgdr.yaml (+4 -7)
benchmarks/profiler/utils/search_space_autogen.py (+4 -0)
deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml (+0 -1)
deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller.go (+17 -0)
docs/benchmarks/sla_driven_profiling.md (+0 -1)
docs/planner/sla_planner_quickstart.md (+0 -1)

benchmarks/profiler/deploy/profile_sla_aic_dgdr.yaml

    @@ -12,7 +12,7 @@ spec:
       # ProfilingConfig maps directly to the profile_sla.py config format
       profilingConfig:
    -    profilerImage: "nvcr.io/nvidian/dynamo-dev/vllm-runtime:dep-540.5"
    +    profilerImage: "nvcr.io/nvidian/dynamo-dev/vllm-runtime:dep-554.0"
         config:
           # Sweep/profiling configuration
           sweep:
    @@ -31,8 +31,7 @@ spec:
       # Deployment overrides for the auto-created DGD
       deploymentOverrides:
    -    workersImage: "nvcr.io/nvidian/dynamo-dev/trtllm-runtime:dep-540.5"
    +    workersImage: "nvcr.io/nvidian/dynamo-dev/trtllm-runtime:dep-554.0"
       # Automatically create DynamoGraphDeployment after profiling
       autoApply: true

benchmarks/profiler/deploy/profile_sla_dgdr.yaml

     # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
     # SPDX-License-Identifier: Apache-2.0
     #
    -# DynamoGraphDeploymentRequest for standard online profiling
    -# Converted from profile_sla_job.yaml
    +# DynamoGraphDeploymentRequest for online profiling (actual deployment testing)
     apiVersion: nvidia.com/v1alpha1
     kind: DynamoGraphDeploymentRequest
     metadata:
    @@ -13,12 +12,11 @@ spec:
       # ProfilingConfig maps directly to the profile_sla.py config format
       profilingConfig:
    -    profilerImage: "nvcr.io/nvidian/dynamo-dev/vllm-runtime:dep-540.5"
    +    profilerImage: "nvcr.io/nvidian/dynamo-dev/vllm-runtime:dep-554.0"
         config:
           # Sweep/profiling configuration
           sweep:
    -        skip_existing_results: true
    -        # Standard online profiling (not using AI Configurator)
    +        # Online profiling mode (real deployment testing)
             use_ai_configurator: false
           # SLA targets for profiling
    @@ -30,8 +28,7 @@ spec:
       # Deployment overrides for the auto-created DGD
       deploymentOverrides:
    -    workersImage: "nvcr.io/nvidian/dynamo-dev/vllm-runtime:dep-540.5"
    +    workersImage: "nvcr.io/nvidian/dynamo-dev/vllm-runtime:dep-554.0"
       # Automatically create DynamoGraphDeployment after profiling
       autoApply: true

benchmarks/profiler/utils/search_space_autogen.py

    @@ -44,6 +44,10 @@ def auto_generate_search_space(args: argparse.Namespace) -> None:
         logger.info(f"Updating model in DGD config file to {args.model}")
         config = config_modifier.update_model(config, args.model)
    +    if args.dgd_image:
    +        logger.info(f"Updating DGD image to {args.dgd_image}")
    +        config = config_modifier.update_image(config, args.dgd_image)
         config_fn = f"{args.output_dir}/disagg_config.yaml"
         logger.info(f"Saving generated disagg DGD config for profiling to {config_fn}")
         os.makedirs(args.output_dir, exist_ok=True)
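
The added lines in `search_space_autogen.py` override the image only when `args.dgd_image` is set. The body of `config_modifier.update_image` is not part of this diff, so the following is only a minimal sketch of what such a helper might do. The config layout (`spec.services.<name>.extraPodSpec.mainContainer.image`) and the wrapper name `apply_optional_image` are assumptions made for illustration, not taken from the commit.

```python
import copy
from typing import Optional


def update_image(config: dict, image: str) -> dict:
    """Return a copy of `config` with every service's main container image replaced.

    Assumption: services live under spec.services and carry their image at
    extraPodSpec.mainContainer.image. The real helper may use a different layout.
    """
    updated = copy.deepcopy(config)
    for service in updated.get("spec", {}).get("services", {}).values():
        container = service.setdefault("extraPodSpec", {}).setdefault("mainContainer", {})
        container["image"] = image
    return updated


def apply_optional_image(config: dict, dgd_image: Optional[str]) -> dict:
    """Mirror the guard in the diff: only touch the config when an image is given."""
    if dgd_image:
        return update_image(config, dgd_image)
    return config
```

Returning a modified copy rather than mutating in place matches the `config = config_modifier.update_image(config, ...)` call style shown in the diff, where the result is reassigned.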

deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeploymentrequest.yaml

    @@ -48,7 +48,6 @@ spec:
           # Sweep/profiling configuration
           sweep:
    -        skip_existing_results: true  # Skip configurations that already have results
             prefill_interpolation_granularity: 16  # Samples for TTFT interpolation
             decode_interpolation_granularity: 6  # Samples for ITL interpolation
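
This commit drops `skip_existing_results` from the `sweep` section of the sample and doc configs; the diff does not state why. If an older DynamoGraphDeploymentRequest were loaded as a dict and needed the same key removed, a minimal sketch could look like the following. The helper name `drop_sweep_key` is hypothetical; the `spec.profilingConfig.config.sweep` path follows the nesting visible in the YAML hunks.

```python
import copy


def drop_sweep_key(dgdr: dict, key: str = "skip_existing_results") -> dict:
    """Return a copy of a DGDR dict with `key` removed from the sweep section."""
    cleaned = copy.deepcopy(dgdr)
    sweep = (
        cleaned.get("spec", {})
        .get("profilingConfig", {})
        .get("config", {})
        .get("sweep")
    )
    # Only act when a sweep mapping is actually present.
    if isinstance(sweep, dict):
        sweep.pop(key, None)
    return cleaned
```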

deploy/cloud/operator/internal/controller/dynamographdeploymentrequest_controller.go

    @@ -159,7 +159,24 @@ const (
     const sidecarScriptTemplate = `
     set -e
     set -o pipefail
    +# Wait for the profiler container to complete, not just for the file to exist
    +# This ensures we capture the final config, not intermediate results
    +echo "Waiting for profiler to complete..."
    +while true; do
    +  # Check if profiler container has finished (either Completed or Error state)
    +  # Use kubectl to check the pod's container status
    +  STATUS=$(kubectl get pod $HOSTNAME -n {{.Namespace}} -o jsonpath='{.status.containerStatuses[?(@.name=="profiler")].state}' 2>/dev/null || echo "")
    +  if echo "$STATUS" | grep -q "terminated"; then
    +    echo "Profiler container has terminated"
    +    break
    +  fi
    +  sleep 5
    +done
    +# Now wait for the output file to exist
    +echo "Waiting for output file {{.OutputPath}}/{{.OutputFile}}..."
     while [ ! -f {{.OutputPath}}/{{.OutputFile}} ]; do sleep 2; done
    +echo "Output file found, creating ConfigMap..."
     # Start building ConfigMap YAML with DGD spec
     cat >/tmp/cm.yaml <<EOF
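
The sidecar change turns a single wait (file exists) into two phases: first wait until the profiler container reports a terminated state, then wait for the output file. The same control flow can be sketched in Python for readers who want to reason about it outside the shell template. This is an illustration only, not the operator's code: `get_container_state` stands in for the `kubectl ... -o jsonpath=...` call (returning an empty string on failure, as the `|| echo ""` fallback does), and the function and parameter names are assumptions.

```python
import os
import time
from typing import Callable


def wait_for_profiler_output(
    get_container_state: Callable[[], str],
    output_file: str,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Block until the profiler has terminated and its output file exists."""
    # Phase 1: poll the container state every 5 seconds until it contains
    # "terminated", so intermediate results are never picked up.
    while "terminated" not in get_container_state():
        sleep(5)
    # Phase 2: only now poll for the output file, every 2 seconds.
    while not os.path.isfile(output_file):
        sleep(2)
    return output_file
```

As in the shell script, neither loop has a timeout: if the profiler terminates without writing the file, the second loop keeps polling. Passing `sleep` as a parameter is a choice made here so the sketch can be exercised without real delays.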

docs/benchmarks/sla_driven_profiling.md

    @@ -345,7 +345,6 @@ spec:
           sweep:
             use_ai_configurator: false
    -        skip_existing_results: false
       deploymentOverrides:
         workersImage: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.6.1"

docs/planner/sla_planner_quickstart.md

    @@ -324,7 +324,6 @@ profilingConfig:
         # Profiling sweep settings (optional)
         sweep:
    -      skip_existing_results: false
           force_rerun: false
     ```