OpenDAS / dynamo
Commit f3f764eb (Unverified)
Authored Nov 19, 2025 by Hongkuan Zhou; committed by GitHub, Nov 19, 2025

fix: use hf id in dsr1 recipe to support DGDR (#4481)

Signed-off-by: hongkuanz <hongkuanz@nvidia.com>

Parent: 473cb57e
Showing 2 changed files with 22 additions and 60 deletions (+22 -60)
recipes/deepseek-r1/sglang/disagg-16gpu/deploy.yaml (+10 -28)
recipes/deepseek-r1/sglang/disagg-8gpu/deploy.yaml (+12 -32)
recipes/deepseek-r1/sglang/disagg-16gpu/deploy.yaml @ f3f764eb
@@ -6,8 +6,11 @@ kind: DynamoGraphDeployment
 metadata:
   name: sgl-dsr1-16gpu
 spec:
+  envs:
+    - name: HF_HOME
+      value: /opt/model
   pvcs:
-    - name: model-cache-pvc
+    - name: model-cache
       create: false
   services:
     Frontend:
@@ -16,13 +19,6 @@ spec:
       replicas: 1
       extraPodSpec:
         mainContainer:
-          startupProbe:
-            httpGet:
-              path: /health
-              port: 8000
-            periodSeconds: 10
-            timeoutSeconds: 1800
-            failureThreshold: 60
           image: my-registry/sglang-runtime:my-tag
     decode:
       dynamoNamespace: sgl-dsr1-16gpu
@@ -34,19 +30,12 @@ spec:
         limits:
           gpu: "8"
       volumeMounts:
-        - name: model-cache-pvc
-          mountPoint: /model-cache
+        - name: model-cache
+          mountPoint: /opt/model
       sharedMemory:
         size: 80Gi
       extraPodSpec:
         mainContainer:
-          startupProbe:
-            httpGet:
-              path: /health
-              port: 9090
-            periodSeconds: 10
-            timeoutSeconds: 10
-            failureThreshold: 600
           image: my-registry/sglang-runtime:my-tag
           workingDir: /sgl-workspace/dynamo
           command:
@@ -55,7 +44,7 @@ spec:
             - dynamo.sglang
           args:
             - --model-path
-            - /model-cache/deepseek-r1
+            - deepseek-ai/DeepSeek-R1
             - --served-model-name
             - deepseek-ai/DeepSeek-R1
             - --tp
@@ -86,19 +75,12 @@ spec:
         limits:
           gpu: "8"
       volumeMounts:
-        - name: model-cache-pvc
-          mountPoint: /model-cache
+        - name: model-cache
+          mountPoint: /opt/model
       sharedMemory:
         size: 80Gi
       extraPodSpec:
         mainContainer:
-          startupProbe:
-            httpGet:
-              path: /health
-              port: 9090
-            periodSeconds: 10
-            timeoutSeconds: 10
-            failureThreshold: 600
           image: my-registry/sglang-runtime:my-tag
           workingDir: /sgl-workspace/dynamo
           command:
@@ -107,7 +89,7 @@ spec:
             - dynamo.sglang
           args:
             - --model-path
-            - /model-cache/deepseek-r1
+            - deepseek-ai/DeepSeek-R1
             - --served-model-name
             - deepseek-ai/DeepSeek-R1
             - --tp
recipes/deepseek-r1/sglang/disagg-8gpu/deploy.yaml @ f3f764eb
@@ -6,8 +6,11 @@ kind: DynamoGraphDeployment
 metadata:
   name: sgl-dsr1-8gpu
 spec:
+  envs:
+    - name: HF_HOME
+      value: /opt/model
   pvcs:
-    - name: model-cache-pvc
+    - name: model-cache
       create: false
   services:
     Frontend:
@@ -16,13 +19,6 @@ spec:
       replicas: 1
       extraPodSpec:
         mainContainer:
-          startupProbe:
-            httpGet:
-              path: /health
-              port: 8000
-            periodSeconds: 10
-            timeoutSeconds: 1800
-            failureThreshold: 60
           image: my-registry/sglang-runtime:my-tag
     decode:
       dynamoNamespace: sgl-dsr1-8gpu
@@ -32,28 +28,21 @@ spec:
         limits:
           gpu: "8"
       volumeMounts:
-        - name: model-cache-pvc
-          mountPoint: /model-cache
+        - name: model-cache
+          mountPoint: /opt/model
       sharedMemory:
         size: 80Gi
       extraPodSpec:
         mainContainer:
-          startupProbe:
-            httpGet:
-              path: /health
-              port: 9090
-            periodSeconds: 10
-            timeoutSeconds: 10
-            failureThreshold: 600
           image: my-registry/sglang-runtime:my-tag
-          workingDir: /sgl-workspace/dynamo
+          workingDir: /workspace
           command:
             - python3
             - -m
             - dynamo.sglang
           args:
             - --model-path
-            - /model-cache/deepseek-r1
+            - deepseek-ai/DeepSeek-R1
             - --served-model-name
             - deepseek-ai/DeepSeek-R1
             - --tp
@@ -64,7 +53,6 @@ spec:
             - --ep-size
             - "8"
             - --trust-remote-code
-            - --skip-tokenizer-init
             - --disaggregation-mode
             - decode
             - --disaggregation-bootstrap-port
@@ -80,28 +68,21 @@ spec:
         limits:
           gpu: "8"
       volumeMounts:
-        - name: model-cache-pvc
-          mountPoint: /model-cache
+        - name: model-cache
+          mountPoint: /opt/model
       sharedMemory:
         size: 80Gi
       extraPodSpec:
         mainContainer:
-          startupProbe:
-            httpGet:
-              path: /health
-              port: 9090
-            periodSeconds: 10
-            timeoutSeconds: 10
-            failureThreshold: 600
           image: my-registry/sglang-runtime:my-tag
-          workingDir: /sgl-workspace/dynamo
+          workingDir: /workspace
           command:
             - python3
             - -m
             - dynamo.sglang
           args:
             - --model-path
-            - /model-cache/deepseek-r1
+            - deepseek-ai/DeepSeek-R1
             - --served-model-name
             - deepseek-ai/DeepSeek-R1
             - --tp
@@ -109,7 +90,6 @@ spec:
             - --ep-size
             - "8"
             - --trust-remote-code
-            - --skip-tokenizer-init
             - --disaggregation-mode
             - prefill
             - --disaggregation-bootstrap-port
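Note on the `--model-path` change: both recipes now pass the Hugging Face id `deepseek-ai/DeepSeek-R1` instead of the local path `/model-cache/deepseek-r1`, and set `HF_HOME` to `/opt/model`, the new mount point of the `model-cache` PVC. The sketch below shows where the weights would be looked up under that setup. It assumes the worker resolves the id through the standard Hugging Face hub cache layout (`$HF_HOME/hub/models--<org>--<name>`) and that `HF_HUB_CACHE` is not set to override it; the diff itself does not show this. The helper `hub_cache_dir` is a hypothetical name used only for illustration.

```python
import os
from pathlib import Path
from typing import Optional


def hub_cache_dir(model_id: str, hf_home: Optional[str] = None) -> Path:
    """Return the hub-cache directory for a model id.

    Assumption: the default Hugging Face layout, where the hub cache lives in
    $HF_HOME/hub and each model repo is stored as models--<org>--<name>.
    """
    # Fall back to the HF_HOME environment variable, then to the library default.
    home = Path(hf_home or os.environ.get("HF_HOME", "~/.cache/huggingface")).expanduser()
    # "/" in the repo id is replaced by "--" in the on-disk folder name.
    return home / "hub" / ("models--" + model_id.replace("/", "--"))


if __name__ == "__main__":
    # Values taken from the recipes in this commit.
    print(hub_cache_dir("deepseek-ai/DeepSeek-R1", "/opt/model").as_posix())
```

If that assumption holds, a PVC that previously held a flat `deepseek-r1` directory would need the weights in the hub cache layout under `/opt/model/hub`, or the worker would download them on first start.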