OpenDAS / dynamo
Commit 9a002b30 (Unverified)
Authored Oct 15, 2025 by ishandhanani
Committed by GitHub, Oct 16, 2025

fix: add `--host` to k8s and bump sgl version (#3666)

parent ea07d51f
Showing 11 changed files with 37 additions and 8 deletions

components/backends/sglang/deploy/disagg-multinode.yaml  +4 -0
components/backends/sglang/deploy/disagg.yaml  +8 -1
components/backends/sglang/deploy/disagg_planner.yaml  +8 -0
container/Dockerfile.sglang  +1 -1
container/Dockerfile.sglang-wideep  +1 -1
docs/backends/sglang/README.md  +2 -2
docs/backends/sglang/dsr1-wideep-h100.md  +2 -0
docs/backends/sglang/multinode-examples.md  +4 -0
pyproject.toml  +1 -1
recipes/deepseek-r1/sglang-wideep/tep16p-dep16d-disagg.yaml  +3 -1
recipes/deepseek-r1/sglang-wideep/tep8p-dep8d-disagg.yaml  +3 -1
components/backends/sglang/deploy/disagg-multinode.yaml
@@ -56,6 +56,8 @@ spec:
 - nixl
 - --disaggregation-bootstrap-port
 - "30001"
+- --host
+- "0.0.0.0"
 - --mem-fraction-static
 - "0.82"
 prefill:
@@ -93,3 +95,5 @@ spec:
 - "30001"
 - --mem-fraction-static
 - "0.82"
+- --host
+- "0.0.0.0"
\ No newline at end of file
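
Note on the `--host` change: the commit message does not state why the flag is added. One common reason in Kubernetes is that a server bound to a loopback address cannot be reached from other pods, whereas `0.0.0.0` listens on every interface. The sketch below is an illustration under that assumption and is not part of the commit. `host_binding` is a hypothetical helper that classifies the `--host` value in an argument list like the ones in this diff, using only Python's standard `ipaddress` module.

```python
import ipaddress


def host_binding(args):
    """Classify the value that follows --host in a list of CLI arguments.

    Hypothetical helper for illustration; it is not part of dynamo or SGLang.
    """
    if "--host" not in args:
        return "unset"
    index = args.index("--host")
    if index + 1 >= len(args):
        raise ValueError("--host is missing its value")
    value = args[index + 1]
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return "hostname"  # not an IP literal, e.g. "localhost"
    if address.is_unspecified:  # 0.0.0.0 or ::
        return "all-interfaces"
    if address.is_loopback:  # 127.0.0.0/8 or ::1
        return "loopback-only"
    return "single-interface"


# Argument values copied from the hunk above.
decode_args = [
    "--disaggregation-bootstrap-port", "30001",
    "--host", "0.0.0.0",
    "--mem-fraction-static", "0.82",
]
print(host_binding(decode_args))
```

A check like this could run over each worker's `args` before deployment to flag workers that leave `--host` unset. Whether an unset host is actually a problem depends on the server's default, which this diff does not show.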
components/backends/sglang/deploy/disagg.yaml
@@ -46,7 +46,10 @@ spec:
 - decode
 - --disaggregation-transfer-backend
 - nixl
+- --disaggregation-bootstrap-port
+- "12345"
+- --host
+- "0.0.0.0"
 prefill:
 envFromSecret: hf-token-secret
 dynamoNamespace: sglang-disagg
@@ -79,3 +82,7 @@ spec:
 - prefill
 - --disaggregation-transfer-backend
 - nixl
+- --disaggregation-bootstrap-port
+- "12345"
+- --host
+- "0.0.0.0"
\ No newline at end of file
components/backends/sglang/deploy/disagg_planner.yaml
@@ -70,6 +70,10 @@ spec:
 - decode
 - --disaggregation-transfer-backend
 - nixl
+- --disaggregation-bootstrap-port
+- "12345"
+- --host
+- "0.0.0.0"
 prefill:
 dynamoNamespace: dynamo
 envFromSecret: hf-token-secret
@@ -102,3 +106,7 @@ spec:
 - prefill
 - --disaggregation-transfer-backend
 - nixl
+- --disaggregation-bootstrap-port
+- "12345"
+- --host
+- "0.0.0.0"
container/Dockerfile.sglang
@@ -14,7 +14,7 @@ ARG RUNTIME_IMAGE="nvcr.io/nvidia/cuda"
 ARG RUNTIME_IMAGE_TAG="12.8.1-runtime-ubuntu24.04"
 # Make sure to update the dependency version in pyproject.toml when updating this
-ARG SGLANG_VERSION="0.5.3.post1"
+ARG SGLANG_VERSION="0.5.3.post2"
 # Define general architecture ARGs for supporting both x86 and aarch64 builds.
container/Dockerfile.sglang-wideep
 # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
-ARG SGLANG_IMAGE_TAG="v0.5.3.post1"
+ARG SGLANG_IMAGE_TAG="v0.5.3.post2"
 ARG BRANCH_TYPE
 FROM scratch AS local_src
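
Note on the version bump: the comment in container/Dockerfile.sglang asks that the pyproject.toml dependency be updated together with the Dockerfile ARG, and this commit changes the pin in both Dockerfiles and in pyproject.toml. Below is a minimal sketch of a consistency check, assuming the pins keep the textual forms shown in this diff. The helper names and regular expressions are hypothetical and not part of the repository. The leading `v` of `SGLANG_IMAGE_TAG` is stripped before comparison.

```python
import re

# Assumed line formats, taken from the hunks in this commit.
PIN_PATTERNS = {
    "Dockerfile.sglang": r'ARG SGLANG_VERSION="([^"]+)"',
    "Dockerfile.sglang-wideep": r'ARG SGLANG_IMAGE_TAG="v([^"]+)"',
    "pyproject.toml": r'sglang\[all\]==([^"\s,]+)',
}


def extract_pins(texts):
    """Map each file name to the SGLang version found in its text."""
    pins = {}
    for name, pattern in PIN_PATTERNS.items():
        match = re.search(pattern, texts[name])
        if match is None:
            raise ValueError(f"no SGLang pin found in {name}")
        pins[name] = match.group(1)
    return pins


def pins_agree(pins):
    """Return True when every file pins the same version."""
    return len(set(pins.values())) == 1


# Sample contents using the post-commit values from this diff.
sample_texts = {
    "Dockerfile.sglang": 'ARG SGLANG_VERSION="0.5.3.post2"\n',
    "Dockerfile.sglang-wideep": 'ARG SGLANG_IMAGE_TAG="v0.5.3.post2"\n',
    "pyproject.toml": 'sglang = [\n    "uvloop",\n    "sglang[all]==0.5.3.post2",\n]\n',
}
sample_pins = extract_pins(sample_texts)
print(sample_pins, pins_agree(sample_pins))
```

In practice the texts would be read from the three files on disk. The check only compares version strings and does not verify that the pinned release exists.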
docs/backends/sglang/README.md
@@ -104,8 +104,8 @@ cd $DYNAMO_HOME
 # installs sglang supported version along with dynamo
 # include the prerelease flag to install flashinfer rc versions
 uv pip install -e .
-# install any sglang version >= 0.5.3
-uv pip install "sglang[all]==0.5.3.post1"
+# install any sglang version >= 0.5.3.post2
+uv pip install "sglang[all]==0.5.3.post2"
 ```
 </details>
docs/backends/sglang/dsr1-wideep-h100.md
@@ -58,6 +58,7 @@ python3 -m dynamo.sglang \
 --skip-tokenizer-init \
 --disaggregation-mode prefill \
 --disaggregation-transfer-backend nixl \
+--host 0.0.0.0 \
 --disaggregation-bootstrap-port 30001 \
 --dist-init-addr ${HEAD_PREFILL_NODE_IP}:29500 \
 --nnodes 4 \
@@ -95,6 +96,7 @@ python3 -m dynamo.sglang \
 --disaggregation-mode decode \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
+--host 0.0.0.0 \
 --dist-init-addr ${HEAD_DECODE_NODE_IP}:29500 \
 --nnodes 4 \
 --node-rank 0 \
docs/backends/sglang/multinode-examples.md
@@ -39,6 +39,7 @@ python3 -m dynamo.sglang \
 --disaggregation-mode prefill \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
+--host 0.0.0.0 \
 --mem-fraction-static 0.82
 ```
@@ -58,6 +59,7 @@ python3 -m dynamo.sglang \
 --disaggregation-mode prefill \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
+--host 0.0.0.0 \
 --mem-fraction-static 0.82
 ```
@@ -77,6 +79,7 @@ python3 -m dynamo.sglang \
 --disaggregation-mode decode \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
+--host 0.0.0.0 \
 --mem-fraction-static 0.82
 ```
@@ -96,6 +99,7 @@ python3 -m dynamo.sglang \
 --disaggregation-mode decode \
 --disaggregation-transfer-backend nixl \
 --disaggregation-bootstrap-port 30001 \
+--host 0.0.0.0 \
 --mem-fraction-static 0.82
 ```
pyproject.toml
@@ -60,7 +60,7 @@ vllm = [
 sglang = [
 "uvloop",
 "nixl<=0.6.0",
-"sglang[all]==0.5.3.post1",
+"sglang[all]==0.5.3.post2",
 ]
 [dependency-groups]
recipes/deepseek-r1/sglang-wideep/tep16p-dep16d-disagg.yaml
@@ -67,6 +67,7 @@ spec:
 --disaggregation-transfer-backend nixl
 --disaggregation-bootstrap-port 30001
 --mem-fraction-static 0.8
+--host 0.0.0.0
 prefill:
 dynamoNamespace: sgl-dsr1-16gpu
 componentType: worker
@@ -108,3 +109,4 @@ spec:
 --disaggregation-transfer-backend nixl
 --disaggregation-bootstrap-port 30001
 --mem-fraction-static 0.8
+--host 0.0.0.0
\ No newline at end of file
recipes/deepseek-r1/sglang-wideep/tep8p-dep8d-disagg.yaml
@@ -64,6 +64,7 @@ spec:
 --disaggregation-mode decode
 --disaggregation-transfer-backend nixl
 --disaggregation-bootstrap-port 30001
+--host 0.0.0.0
 prefill:
 dynamoNamespace: sgl-dsr1-8gpu
 componentType: worker
@@ -102,3 +103,4 @@ spec:
 --disaggregation-mode prefill
 --disaggregation-transfer-backend nixl
 --disaggregation-bootstrap-port 30001
+--host 0.0.0.0
\ No newline at end of file