OpenDAS / dynamo
Commit b8461b6c
Authored Jul 27, 2025 by Neelay Shah; committed by GitHub, Jul 27, 2025

chore: updated health checks to use new probes (#2124)

Parent: 222245e2

Showing 5 changed files with 182 additions and 120 deletions

components/backends/vllm/deploy/agg.yaml +23 -15
components/backends/vllm/deploy/agg_router.yaml +23 -15
components/backends/vllm/deploy/disagg.yaml +46 -30
components/backends/vllm/deploy/disagg_planner.yaml +46 -30
components/backends/vllm/deploy/disagg_router.yaml +44 -30

components/backends/vllm/deploy/agg.yaml
@@ -48,24 +48,19 @@ spec:
     VllmDecodeWorker:
       envFromSecret: hf-token-secret
       livenessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - "exit 0"
-        periodSeconds: 60
+        httpGet:
+          path: /live
+          port: 9090
+        periodSeconds: 5
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 1
       readinessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - 'grep "VllmWorker.*has been initialized" /tmp/vllm.log'
-        initialDelaySeconds: 60
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 10
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 60
       dynamoNamespace: vllm-agg
       componentType: worker
       replicas: 1
@@ -78,8 +73,21 @@ spec:
           cpu: "10"
           memory: "20Gi"
           gpu: "1"
+      envs:
+        - name: DYN_SYSTEM_ENABLED
+          value: "true"
+        - name: DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS
+          value: "[\"generate\"]"
+        - name: DYN_SYSTEM_PORT
+          value: "9090"
       extraPodSpec:
         mainContainer:
+          startupProbe:
+            httpGet:
+              path: /health
+              port: 9090
+            periodSeconds: 10
+            failureThreshold: 60
           image: nvcr.io/nvidian/nim-llm-dev/vllm-runtime:dep-233.17
           workingDir: /workspace/components/backends/vllm
           command:

components/backends/vllm/deploy/agg_router.yaml
@@ -48,24 +48,19 @@ spec:
     VllmDecodeWorker:
       envFromSecret: hf-token-secret
       livenessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - "exit 0"
-        periodSeconds: 60
+        httpGet:
+          path: /live
+          port: 9090
+        periodSeconds: 5
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 1
       readinessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - 'grep "VllmWorker.*has been initialized" /tmp/vllm.log'
-        initialDelaySeconds: 60
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 10
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 60
       dynamoNamespace: vllm-agg-router
       componentType: worker
       replicas: 2
@@ -78,8 +73,21 @@ spec:
           cpu: "10"
           memory: "20Gi"
           gpu: "1"
+      envs:
+        - name: DYN_SYSTEM_ENABLED
+          value: "true"
+        - name: DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS
+          value: "[\"generate\"]"
+        - name: DYN_SYSTEM_PORT
+          value: "9090"
       extraPodSpec:
         mainContainer:
+          startupProbe:
+            httpGet:
+              path: /health
+              port: 9090
+            periodSeconds: 10
+            failureThreshold: 60
           image: nvcr.io/nvidian/nim-llm-dev/vllm-runtime:dep-233.17
           workingDir: /workspace/components/backends/vllm
           command:

components/backends/vllm/deploy/disagg.yaml
@@ -51,24 +51,19 @@ spec:
       componentType: worker
       replicas: 1
       livenessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - "exit 0"
-        periodSeconds: 60
+        httpGet:
+          path: /live
+          port: 9090
+        periodSeconds: 5
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 1
       readinessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - 'grep "VllmWorker.*has been initialized" /tmp/vllm.log'
-        initialDelaySeconds: 60
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 10
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 60
       resources:
         requests:
           cpu: "32"
@@ -78,8 +73,21 @@ spec:
           cpu: "32"
           memory: "40Gi"
           gpu: "1"
+      envs:
+        - name: DYN_SYSTEM_ENABLED
+          value: "true"
+        - name: DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS
+          value: "[\"generate\"]"
+        - name: DYN_SYSTEM_PORT
+          value: "9090"
       extraPodSpec:
         mainContainer:
+          startupProbe:
+            httpGet:
+              path: /health
+              port: 9090
+            periodSeconds: 10
+            failureThreshold: 60
           image: nvcr.io/nvidian/nim-llm-dev/vllm-runtime:dep-233.17
           workingDir: /workspace/components/backends/vllm
           command:
@@ -93,24 +101,19 @@ spec:
       componentType: worker
       replicas: 1
       livenessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - "exit 0"
-        periodSeconds: 60
+        httpGet:
+          path: /live
+          port: 9090
+        periodSeconds: 5
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 1
       readinessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - 'grep "VllmWorker.*has been initialized" /tmp/vllm.log'
-        initialDelaySeconds: 60
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 10
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 60
       resources:
         requests:
           cpu: "32"
@@ -120,8 +123,21 @@ spec:
           cpu: "32"
           memory: "40Gi"
           gpu: "1"
+      envs:
+        - name: DYN_SYSTEM_ENABLED
+          value: "true"
+        - name: DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS
+          value: "[\"generate\"]"
+        - name: DYN_SYSTEM_PORT
+          value: "9090"
       extraPodSpec:
         mainContainer:
+          startupProbe:
+            httpGet:
+              path: /health
+              port: 9090
+            periodSeconds: 10
+            failureThreshold: 60
           image: nvcr.io/nvidian/nim-llm-dev/vllm-runtime:dep-233.17
           workingDir: /workspace/components/backends/vllm
           command:

components/backends/vllm/deploy/disagg_planner.yaml
@@ -51,24 +51,19 @@ spec:
       componentType: worker
       replicas: 1
       livenessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - "exit 0"
-        periodSeconds: 60
+        httpGet:
+          path: /live
+          port: 9090
+        periodSeconds: 5
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 1
       readinessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - 'grep "VllmWorker.*has been initialized" /tmp/vllm.log'
-        initialDelaySeconds: 60
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 10
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 60
       resources:
         requests:
           cpu: "10"
@@ -78,8 +73,21 @@ spec:
           cpu: "10"
           memory: "20Gi"
           gpu: "1"
+      envs:
+        - name: DYN_SYSTEM_ENABLED
+          value: "true"
+        - name: DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS
+          value: "[\"generate\"]"
+        - name: DYN_SYSTEM_PORT
+          value: "9090"
       extraPodSpec:
         mainContainer:
+          startupProbe:
+            httpGet:
+              path: /health
+              port: 9090
+            periodSeconds: 10
+            failureThreshold: 60
           image: nvcr.io/nvidian/nim-llm-dev/vllm-runtime:dep-233.17
           workingDir: /workspace/components/backends/vllm
           command:
@@ -93,24 +101,19 @@ spec:
       componentType: worker
       replicas: 1
       livenessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - "exit 0"
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 5
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 1
       readinessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - 'grep "VllmWorker.*has been initialized" /tmp/vllm.log'
-        initialDelaySeconds: 60
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 10
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 60
       resources:
         requests:
           cpu: "10"
@@ -120,8 +123,21 @@ spec:
           cpu: "10"
           memory: "20Gi"
           gpu: "1"
+      envs:
+        - name: DYN_SYSTEM_ENABLED
+          value: "true"
+        - name: DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS
+          value: "[\"generate\"]"
+        - name: DYN_SYSTEM_PORT
+          value: "9090"
       extraPodSpec:
         mainContainer:
+          startupProbe:
+            httpGet:
+              path: /health
+              port: 9090
+            periodSeconds: 10
+            failureThreshold: 60
           image: nvcr.io/nvidian/nim-llm-dev/vllm-runtime:dep-233.17
           workingDir: /workspace/components/backends/vllm
           command:

components/backends/vllm/deploy/disagg_router.yaml
@@ -51,24 +51,19 @@ spec:
       componentType: worker
       replicas: 2
       livenessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - "exit 0"
-        periodSeconds: 60
+        httpGet:
+          path: /live
+          port: 9090
+        periodSeconds: 5
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 1
       readinessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - 'grep "VllmWorker.*has been initialized" /tmp/vllm.log'
-        initialDelaySeconds: 60
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 10
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 60
       resources:
         requests:
           cpu: "10"
@@ -78,8 +73,19 @@ spec:
           cpu: "10"
           memory: "20Gi"
           gpu: "1"
+      envs:
+        - name: DYN_SYSTEM_ENABLED
+          value: "true"
+        - name: DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS
+          value: "[\"generate\"]"
       extraPodSpec:
         mainContainer:
+          startupProbe:
+            httpGet:
+              path: /health
+              port: 9090
+            periodSeconds: 10
+            failureThreshold: 60
           image: nvcr.io/nvidian/nim-llm-dev/vllm-runtime:dep-233.17
           workingDir: /workspace/components/backends/vllm
           command:
@@ -93,24 +99,19 @@ spec:
       componentType: worker
       replicas: 1
       livenessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - "exit 0"
-        periodSeconds: 60
+        httpGet:
+          path: /live
+          port: 9090
+        periodSeconds: 5
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 1
       readinessProbe:
-        exec:
-          command:
-            - /bin/sh
-            - -c
-            - 'grep "VllmWorker.*has been initialized" /tmp/vllm.log'
-        initialDelaySeconds: 60
-        periodSeconds: 60
+        httpGet:
+          path: /health
+          port: 9090
+        periodSeconds: 10
         timeoutSeconds: 30
-        failureThreshold: 10
+        failureThreshold: 60
       resources:
         requests:
           cpu: "10"
@@ -120,8 +121,21 @@ spec:
           cpu: "10"
           memory: "20Gi"
           gpu: "1"
+      envs:
+        - name: DYN_SYSTEM_ENABLED
+          value: "true"
+        - name: DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS
+          value: "[\"generate\"]"
+        - name: DYN_SYSTEM_PORT
+          value: "9090"
       extraPodSpec:
         mainContainer:
+          startupProbe:
+            httpGet:
+              path: /health
+              port: 9090
+            periodSeconds: 10
+            failureThreshold: 60
           image: nvcr.io/nvidian/nim-llm-dev/vllm-runtime:dep-233.17
           workingDir: /workspace/components/backends/vllm
           command:
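
The probe timings in the hunks above can be compared with a little arithmetic. The following is a minimal sketch, not something contained in the commit: it copies the new-side values for the `VllmDecodeWorker` probes in `agg.yaml` and multiplies `periodSeconds` by `failureThreshold` to estimate how long a probe may keep failing before Kubernetes acts on it. The helper name `failure_budget_seconds` and the `PROBES` table are hypothetical, and the estimate deliberately ignores `timeoutSeconds` and any initial delay.

```python
# Probe settings copied from the new side of the agg.yaml diff.
# PROBES and failure_budget_seconds are illustrative names, not part of the commit.
PROBES = {
    "startupProbe": {"path": "/health", "periodSeconds": 10, "failureThreshold": 60},
    "livenessProbe": {"path": "/live", "periodSeconds": 5, "failureThreshold": 1},
    "readinessProbe": {"path": "/health", "periodSeconds": 10, "failureThreshold": 60},
}


def failure_budget_seconds(probe: dict) -> int:
    """Rough time of consecutive failures tolerated: period times threshold."""
    return probe["periodSeconds"] * probe["failureThreshold"]


# Map each probe name to its approximate failure budget in seconds.
budgets = {name: failure_budget_seconds(cfg) for name, cfg in PROBES.items()}

if __name__ == "__main__":
    for name, seconds in budgets.items():
        print(f"{name}: about {seconds}s of consecutive failures tolerated")
```

Under these assumptions the startup probe gives the worker roughly 600 seconds to begin answering on `/health`, while the liveness probe on `/live` tolerates a single failed check at a 5 second period once startup has succeeded.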