OpenDAS / dynamo
Commit a294dbe8 (Unverified)
Authored Dec 31, 2025 by Hongkuan Zhou; committed by GitHub on Dec 31, 2025

fix: sglang dsr1 recipe pvc path (#5119)

Signed-off-by: hongkuanz <hongkuanz@nvidia.com>

Parent: 0b33c1df
Showing 5 changed files with 49 additions and 3 deletions

benchmarks/profiler/deploy/profile_sla_moe_dgdr.yaml  +1 -1
recipes/README.md  +4 -2
recipes/deepseek-r1/model-cache/model-download-sglang.yaml  +38 -0
recipes/deepseek-r1/sglang/disagg-16gpu/deploy.yaml  +3 -0
recipes/deepseek-r1/sglang/disagg-8gpu/deploy.yaml  +3 -0

benchmarks/profiler/deploy/profile_sla_moe_dgdr.yaml
@@ -29,7 +29,7 @@ spec:
   # Reference to ConfigMap containing the DGD base config
   # For MoE models, this should point to the appropriate disagg config
-  # Original path: /sgl-workspace/dynamo/recipes/deepseek-r1/sglang/disagg-16gpu.yaml
+  # Original path: /sgl-workspace/dynamo/recipes/deepseek-r1/sglang/disagg-16gpu/deploy.yaml
   configMapRef:
     name: deepseek-r1-config
     key: tep16p-dep16d-disagg.yaml

recipes/README.md
@@ -16,10 +16,12 @@ Production-tested Kubernetes deployment recipes for LLM inference using NVIDIA D
 | **[Qwen3-32B-FP8](qwen3-32b-fp8/trtllm/disagg/)** | TensorRT-LLM | Disaggregated | 8x GPU | ✅ | ✅ | Prefill + Decode separation | ❌ |
 | **[GPT-OSS-120B](gpt-oss-120b/trtllm/agg/)** | TensorRT-LLM | Aggregated | 4x GB200 | ✅ | ✅ | Blackwell only, WideEP | ❌ |
 | **[GPT-OSS-120B](gpt-oss-120b/trtllm/disagg/)** | TensorRT-LLM | Disaggregated | TBD | ❌ | ❌ | Engine configs only, no K8s manifest | ❌ |
-| **[DeepSeek-R1](deepseek-r1/sglang/disagg-8gpu/)** | SGLang | Disagg WideEP | 8x H200 | ✅ | ❌ | Benchmark recipe pending | ❌ |
-| **[DeepSeek-R1](deepseek-r1/sglang/disagg-16gpu/)** | SGLang | Disagg WideEP | 16x H200 | ✅ | ❌ | Benchmark recipe pending | ❌ |
+| **[DeepSeek-R1](deepseek-r1/sglang/disagg-8gpu/)** | SGLang | Disagg WideEP | 8x H200 | ✅*1 | ❌ | Benchmark recipe pending | ❌ |
+| **[DeepSeek-R1](deepseek-r1/sglang/disagg-16gpu/)** | SGLang | Disagg WideEP | 16x H200 | ✅*1 | ❌ | Benchmark recipe pending | ❌ |
 | **[DeepSeek-R1](deepseek-r1/trtllm/disagg/wide_ep/gb200/)** | TensorRT-LLM | Disagg WideEP (GB200) | 32+4 GB200 | ✅ | ✅ |Multi-node: 8 decode + 1 prefill nodes | ❌ |
+
+*1: Please use `deepseek-r1/model-cache/model-download-sglang.yaml` to download the model into the PVC.
 
 **Legend:**
 - **Deployment**: ✅ = Complete `deploy.yaml` manifest available | ❌ = Missing or incomplete
 - **Benchmark Recipe**: ✅ = Includes `perf.yaml` for running AIPerf benchmarks | ❌ = No benchmark recipe provided

recipes/deepseek-r1/model-cache/model-download-sglang.yaml
new file, mode 0 → 100644
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: model-download
+spec:
+  backoffLimit: 3
+  completions: 1
+  parallelism: 1
+  template:
+    metadata:
+      labels:
+        app: model-download
+    spec:
+      restartPolicy: Never
+      tolerations: []
+      containers:
+        - name: model-download
+          image: python:3.10-slim
+          command: ["sh", "-c"]
+          env:
+            - name: HF_HUB_ENABLE_HF_TRANSFER
+              value: "1"
+            - name: HF_HOME
+              value: /opt/model-cache
+          args:
+            - |
+              set -eux
+              pip install --no-cache-dir huggingface_hub hf_transfer
+              hf download deepseek-ai/DeepSeek-R1
+          volumeMounts:
+            - name: model-cache
+              mountPath: /opt/model-cache
+      volumes:
+        - name: model-cache
+          persistentVolumeClaim:
+            claimName: model-cache
\ No newline at end of file
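
Reviewer note: the download Job mounts the `model-cache` PVC at `/opt/model-cache`, while the two `deploy.yaml` changes in this commit mount it at `/opt/model`. The sketch below works out where the downloaded repository would sit under each mount point. It is a minimal illustration, not part of the commit: it assumes the default Hugging Face cache layout (`<HF_HOME>/hub/models--<org>--<name>`, with files under `snapshots/<revision>/`), that `HF_HUB_CACHE` is not overridden, and that both pods mount the PVC at its root with no `subPath`. The helper names `hf_repo_cache_dir` and `remap_mount` are hypothetical.

```python
from pathlib import PurePosixPath


def hf_repo_cache_dir(hf_home: str, repo_id: str, repo_type: str = "model") -> PurePosixPath:
    """Return the cache directory for a repo under HF_HOME.

    Assumes the default layout: <HF_HOME>/hub/<type>s--<org>--<name>.
    """
    folder = "--".join([f"{repo_type}s", *repo_id.split("/")])
    return PurePosixPath(hf_home) / "hub" / folder


def remap_mount(path: PurePosixPath, old_mount: str, new_mount: str) -> PurePosixPath:
    """Translate a path on a volume from one mount point to another."""
    return PurePosixPath(new_mount) / path.relative_to(old_mount)


# Path as seen inside the download Job (HF_HOME=/opt/model-cache).
job_path = hf_repo_cache_dir("/opt/model-cache", "deepseek-ai/DeepSeek-R1")
print(job_path)

# Same directory as seen by a pod that mounts the PVC at /opt/model.
print(remap_mount(job_path, "/opt/model-cache", "/opt/model"))
```

Under those assumptions, any model path passed to the serving containers would need to point below `/opt/model/hub/...` rather than `/opt/model-cache/...`; the worker arguments are not visible in this diff, so this is worth checking against the full `deploy.yaml`.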

recipes/deepseek-r1/sglang/disagg-16gpu/deploy.yaml
@@ -17,6 +17,9 @@ spec:
       dynamoNamespace: sgl-dsr1-16gpu
       componentType: frontend
       replicas: 1
+      volumeMounts:
+        - name: model-cache
+          mountPoint: /opt/model
       extraPodSpec:
         mainContainer:
           image: my-registry/sglang-runtime:my-tag

recipes/deepseek-r1/sglang/disagg-8gpu/deploy.yaml
@@ -17,6 +17,9 @@ spec:
       dynamoNamespace: sgl-dsr1-8gpu
       componentType: frontend
       replicas: 1
+      volumeMounts:
+        - name: model-cache
+          mountPoint: /opt/model
       extraPodSpec:
         mainContainer:
           image: my-registry/sglang-runtime:my-tag