fix: point to new sglang container and rm references to old (#4383)

Commit 48c340f4 (unverified) · authored Nov 17, 2025 by ishandhanani · committed by GitHub Nov 18, 2025
7 changed files
--- a/.github/filters.yaml
+++ b/.github/filters.yaml
@@ -35,7 +35,6 @@ vllm: &vllm

 sglang: &sglang
  - 'container/Dockerfile.sglang'
-  - 'container/Dockerfile.sglang-wideep'
  - 'examples/backends/sglang/**'
  - 'components/src/dynamo/sglang/**'
  - 'container/build.sh'

--- a/container/Dockerfile.sglang
+++ b/container/Dockerfile.sglang
@@ -382,6 +382,7 @@ COPY --chown=dynamo: examples /workspace/examples
 COPY --chown=dynamo: benchmarks /workspace/benchmarks
 COPY --chown=dynamo: deploy /workspace/deploy
 COPY --chown=dynamo: components/ /workspace/components/
+COPY --chown=dynamo: recipes/ /workspace/recipes/

 ENTRYPOINT ["/opt/nvidia/nvidia_entrypoint.sh"]
 CMD []

--- a/docs/backends/sglang/README.md
+++ b/docs/backends/sglang/README.md
@@ -135,11 +135,9 @@ We are in the process of shipping pre-built docker containers that contain insta

 ```bash
 cd $DYNAMO_ROOT
-docker build \
-  -f container/Dockerfile.sglang-wideep \
-  -t dynamo-sglang \
-  --no-cache \
-  .
+./container/build.sh \
+  --framework SGLANG \
+  --tag dynamo-sglang:latest
 ```

 And then run it using

--- a/docs/backends/sglang/dsr1-wideep-gb200.md
+++ b/docs/backends/sglang/dsr1-wideep-gb200.md
@@ -5,26 +5,21 @@ SPDX-License-Identifier: Apache-2.0

 # Running DeepSeek-R1 Disaggregated with WideEP on GB200s

-Dynamo supports SGLang's GB200 implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-06-16-gb200-part-1/) for more details. We provide a Dockerfile for this in `container/Dockerfile.sglang-wideep` and a sample configuration that demonstrates WideEP and P/D  disaggregation. To run the exact configuration shown in the blog post, you can view the commands created by the SGLang team [here](https://github.com/sgl-project/sglang/issues/7227). In this example, we will run 1 prefill worker on 2 GB200 nodes (4 GPUs each) and 1 decode worker on 2 GB200 nodes (total 8 GPUs).
+Dynamo supports SGLang's GB200 implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-06-16-gb200-part-1/) for more details. We provide a sample configuration that demonstrates WideEP and P/D  disaggregation. To run the exact configuration shown in the blog post, you can view the commands created by the SGLang team [here](https://github.com/sgl-project/sglang/issues/7227). In this example, we will run 1 prefill worker on 2 GB200 nodes (4 GPUs each) and 1 decode worker on 2 GB200 nodes (total 8 GPUs).

 ## Instructions

-1. Build the Dynamo container using the latest published dynamo version and stable sglang version. If you want to build from a local dynamo repo, you can add `--build-arg BRANCH_TYPE=local` to the build command. If you want to build from a remote dynamo repo, you can add `--build-arg BRANCH_TYPE=remote` to the build command. If you want to use a specific tag for the default sglang version, you can add `--build-arg SGLANG_IMAGE_TAG=<tag>` to the build command.
+1. Build the Dynamo container for ARM64 (GB200) using the `build.sh` script.

 > [!Note]
-> Please ensure that you are building this on an ARM64 machine. The correct SGLang image will be selected automatically via the multi-arch manifest.
-
-> [!Note]
-> Please use `--build-arg SGLANG_IMAGE_TAG=nightly-dev-20251019-fda0cb2a` to build the container due to a bug that we found with the DeepEP version being installed. This was fixed in [PR 11773](https://github.com/sgl-project/sglang/pull/11773). When SGLang releases a version > `0.5.3.post3` we will update these instructions.
+> Please ensure that you are building this on an ARM64 machine. The build script will automatically configure the correct platform and build arguments for SGLang on ARM64/GB200.

 ```bash
 cd $DYNAMO_ROOT
-docker build \
-  -f container/Dockerfile.sglang-wideep \
-  -t dynamo-wideep-gb200 \
-  --build-arg SGLANG_IMAGE_TAG=nightly-dev-20251019-fda0cb2a \
-  --no-cache \
-  .
+./container/build.sh \
+  --framework SGLANG \
+  --platform linux/arm64 \
+  --tag dynamo-wideep-gb200:latest
 ```

 2. You can run this container on each 4xGB200 node using the following command.
@@ -177,4 +172,4 @@ python3 -m dynamo.sglang \
  --disaggregation-transfer-backend nixl
 ```

-On the other decode nodes (this example has 2 total decode nodes), run the same command but change `--node-rank` to 1.
\ No newline at end of file
+On the other decode nodes (this example has 2 total decode nodes), run the same command but change `--node-rank` to 1.
--- a/docs/backends/sglang/dsr1-wideep-h100.md
+++ b/docs/backends/sglang/dsr1-wideep-h100.md
@@ -5,22 +5,20 @@ SPDX-License-Identifier: Apache-2.0

 # Running DeepSeek-R1 Disaggregated with WideEP on H100s

-Dynamo supports SGLang's implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-05-05-large-scale-ep/) for more details. We provide a Dockerfile for this in `container/Dockerfile.sglang-wideep` and a sample configuration that demonstrates WideEP and P/D  disaggregation. To run the exact configuration shown in the blog post, you can view the commands created by the SGLang team [here](https://github.com/sgl-project/sglang/issues/6017). In this example, we will run 1 prefill worker on 4 H100 nodes and 1 decode worker on 4 H100 nodes (64 total GPUs).
+Dynamo supports SGLang's implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-05-05-large-scale-ep/) for more details. We provide a sample configuration that demonstrates WideEP and P/D disaggregation. To run the exact configuration shown in the blog post, you can view the commands created by the SGLang team [here](https://github.com/sgl-project/sglang/issues/6017). In this example, we will run 1 prefill worker on 4 H100 nodes (8 GPUs each) and 1 decode worker on 4 H100 nodes (total 64 GPUs).

 ## Instructions

-1. Build the Dynamo container using the latest published dynamo version and stable sglang version. If you want to build from a local dynamo repo, you can add `--build-arg BRANCH_TYPE=local` to the build command. If you want to build from a remote dynamo repo, you can add `--build-arg BRANCH_TYPE=remote` to the build command. If you want to use a specific tag for the default sglang version, you can add `--build-arg SGLANG_IMAGE_TAG=<tag>` to the build command.
+1. Build the Dynamo container for AMD64/x86_64 (H100) using the `build.sh` script.

 > [!Note]
-> Please ensure that you are building this on an AMD64 (x86_64) machine. The correct SGLang image will be selected automatically via the multi-arch manifest.
+> Please ensure that you are building this on an AMD64 (x86_64) machine. The build script will automatically configure the correct platform for SGLang.

 ```bash
 cd $DYNAMO_ROOT
-docker build \
-  -f container/Dockerfile.sglang-wideep \
-  -t dynamo-wideep \
-  --no-cache \
-  .
+./container/build.sh \
+  --framework SGLANG \
+  --tag dynamo-wideep:latest
 ```

 2. You can run this container on each 8xH100 node using the following command.

--- a/docs/backends/sglang/multinode-examples.md
+++ b/docs/backends/sglang/multinode-examples.md
@@ -13,7 +13,9 @@ SGLang allows you to deploy multi-node sized models by adding in the `dist-init-

 ```bash
 cd $DYNAMO_ROOT
-docker build -f container/Dockerfile.sglang-wideep . -t dynamo-wideep --no-cache
+./container/build.sh \
+  --framework SGLANG \
+  --tag dynamo-wideep:latest
 ```

 You can use a specific tag from the [lmsys dockerhub](https://hub.docker.com/r/lmsysorg/sglang/tags) by adding `--build-arg SGLANG_IMAGE_TAG=<tag>` to the build command.

--- a/recipes/deepseek-r1/sglang/README.md
+++ b/recipes/deepseek-r1/sglang/README.md
@@ -4,10 +4,10 @@ This recipe is for running DeepSeek R1 with SGLang in disaggregated mode. It is

 ## Container

-Use the Dockerfile in `container/Dockerfile.sglang-wideep` to build the container, or
+Build the container using the `build.sh` script:

 ```bash
-./container/build.sh --framework sglang-wideep
+./container/build.sh --framework SGLANG
 ```

 Dynamo commits after `1b3eed4b6a0e735d4ecec6681f4c0b89f2112167` (Sep 18, 2025) are required.
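
Review note: since this change removes references to `container/Dockerfile.sglang-wideep`, a quick way to check that none remain is a fixed-string search over the checkout. The helper below is only a sketch; the function name `find_stale_refs` and its default arguments are hypothetical and not part of the repository, and it assumes a `grep` that supports `--exclude-dir` (GNU and BSD grep both do).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): list files under a directory
# that still mention the removed Dockerfile name.
find_stale_refs() {
  local root="${1:-.}"                            # directory to search
  local pattern="${2:-Dockerfile.sglang-wideep}"  # string to look for
  # -r: recurse, -l: print only file names, -F: fixed string (no regex)
  grep -rlF --exclude-dir=.git -- "$pattern" "$root"
}
```

The function returns grep's exit status, so it succeeds when stale references exist and fails when the tree is clean. For example, `if find_stale_refs "$DYNAMO_ROOT"; then echo "stale references remain"; fi` prints the offending files followed by the message.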