Commit 48c340f4 (unverified), authored by ishandhanani, committed by GitHub

fix: point to new sglang container and rm references to old (#4383)

parent cf64ca7c
@@ -35,7 +35,6 @@ vllm: &vllm
  sglang: &sglang
  - 'container/Dockerfile.sglang'
- - 'container/Dockerfile.sglang-wideep'
  - 'examples/backends/sglang/**'
  - 'components/src/dynamo/sglang/**'
  - 'container/build.sh'
......
@@ -382,6 +382,7 @@ COPY --chown=dynamo: examples /workspace/examples
  COPY --chown=dynamo: benchmarks /workspace/benchmarks
  COPY --chown=dynamo: deploy /workspace/deploy
  COPY --chown=dynamo: components/ /workspace/components/
+ COPY --chown=dynamo: recipes/ /workspace/recipes/
  ENTRYPOINT ["/opt/nvidia/nvidia_entrypoint.sh"]
  CMD []
......
@@ -135,11 +135,9 @@ We are in the process of shipping pre-built docker containers that contain insta
  ```bash
  cd $DYNAMO_ROOT
- docker build \
- -f container/Dockerfile.sglang-wideep \
- -t dynamo-sglang \
- --no-cache \
- .
+ ./container/build.sh \
+ --framework SGLANG \
+ --tag dynamo-sglang:latest \
  ```
  And then run it using
......
@@ -5,26 +5,21 @@ SPDX-License-Identifier: Apache-2.0
  # Running DeepSeek-R1 Disaggregated with WideEP on GB200s
- Dynamo supports SGLang's GB200 implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-06-16-gb200-part-1/) for more details. We provide a Dockerfile for this in `container/Dockerfile.sglang-wideep` and a sample configuration that demonstrates WideEP and P/D disaggregation. To run the exact configuration shown in the blog post, you can view the commands created by the SGLang team [here](https://github.com/sgl-project/sglang/issues/7227). In this example, we will run 1 prefill worker on 2 GB200 nodes (4 GPUs each) and 1 decode worker on 2 GB200 nodes (total 8 GPUs).
+ Dynamo supports SGLang's GB200 implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-06-16-gb200-part-1/) for more details. We provide a sample configuration that demonstrates WideEP and P/D disaggregation. To run the exact configuration shown in the blog post, you can view the commands created by the SGLang team [here](https://github.com/sgl-project/sglang/issues/7227). In this example, we will run 1 prefill worker on 2 GB200 nodes (4 GPUs each) and 1 decode worker on 2 GB200 nodes (total 8 GPUs).
  ## Instructions
- 1. Build the Dynamo container using the latest published dynamo version and stable sglang version. If you want to build from a local dynamo repo, you can add `--build-arg BRANCH_TYPE=local` to the build command. If you want to build from a remote dynamo repo, you can add `--build-arg BRANCH_TYPE=remote` to the build command. If you want to use a specific tag for the default sglang version, you can add `--build-arg SGLANG_IMAGE_TAG=<tag>` to the build command.
+ 1. Build the Dynamo container for ARM64 (GB200) using the `build.sh` script.
  > [!Note]
- > Please ensure that you are building this on an ARM64 machine. The correct SGLang image will be selected automatically via the multi-arch manifest.
- > [!Note]
- > Please use `--build-arg SGLANG_IMAGE_TAG=nightly-dev-20251019-fda0cb2a` to build the container due to a bug that we found with the DeepEP version being installed. This was fixed in [PR 11773](https://github.com/sgl-project/sglang/pull/11773). When SGLang releases a version > `0.5.3.post3` we will update these instructions.
+ > Please ensure that you are building this on an ARM64 machine. The build script will automatically configure the correct platform and build arguments for SGLang on ARM64/GB200.
  ```bash
  cd $DYNAMO_ROOT
- docker build \
- -f container/Dockerfile.sglang-wideep \
- -t dynamo-wideep-gb200 \
- --build-arg SGLANG_IMAGE_TAG=nightly-dev-20251019-fda0cb2a \
- --no-cache \
- .
+ ./container/build.sh \
+ --framework SGLANG \
+ --platform linux/arm64 \
+ --tag dynamo-wideep-gb200:latest
  ```
  2. You can run this container on each 4xGB200 node using the following command.
@@ -177,4 +172,4 @@ python3 -m dynamo.sglang \
  --disaggregation-transfer-backend nixl
  ```
  On the other decode nodes (this example has 2 total decode nodes), run the same command but change `--node-rank` to 1.
\ No newline at end of file
@@ -5,22 +5,20 @@ SPDX-License-Identifier: Apache-2.0
  # Running DeepSeek-R1 Disaggregated with WideEP on H100s
- Dynamo supports SGLang's implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-05-05-large-scale-ep/) for more details. We provide a Dockerfile for this in `container/Dockerfile.sglang-wideep` and a sample configuration that demonstrates WideEP and P/D disaggregation. To run the exact configuration shown in the blog post, you can view the commands created by the SGLang team [here](https://github.com/sgl-project/sglang/issues/6017). In this example, we will run 1 prefill worker on 4 H100 nodes and 1 decode worker on 4 H100 nodes (64 total GPUs).
+ Dynamo supports SGLang's implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-05-05-large-scale-ep/) for more details. We provide a sample configuration that demonstrates WideEP and P/D disaggregation. To run the exact configuration shown in the blog post, you can view the commands created by the SGLang team [here](https://github.com/sgl-project/sglang/issues/6017). In this example, we will run 1 prefill worker on 4 H100 nodes (32 GPUs each) and 1 decode worker on 4 H100 nodes (total 64 GPUs).
  ## Instructions
- 1. Build the Dynamo container using the latest published dynamo version and stable sglang version. If you want to build from a local dynamo repo, you can add `--build-arg BRANCH_TYPE=local` to the build command. If you want to build from a remote dynamo repo, you can add `--build-arg BRANCH_TYPE=remote` to the build command. If you want to use a specific tag for the default sglang version, you can add `--build-arg SGLANG_IMAGE_TAG=<tag>` to the build command.
+ 1. Build the Dynamo container for AMD64/x86_64 (H100) using the `build.sh` script.
  > [!Note]
- > Please ensure that you are building this on an AMD64 (x86_64) machine. The correct SGLang image will be selected automatically via the multi-arch manifest.
+ > Please ensure that you are building this on an AMD64 (x86_64) machine. The build script will automatically configure the correct platform for SGLang.
  ```bash
  cd $DYNAMO_ROOT
- docker build \
- -f container/Dockerfile.sglang-wideep \
- -t dynamo-wideep \
- --no-cache \
- .
+ ./container/build.sh \
+ --framework SGLANG \
+ --tag dynamo-wideep:latest \
  ```
  2. You can run this container on each 8xH100 node using the following command.
......
@@ -13,7 +13,9 @@ SGLang allows you to deploy multi-node sized models by adding in the `dist-init-
  ```bash
  cd $DYNAMO_ROOT
- docker build -f container/Dockerfile.sglang-wideep . -t dynamo-wideep --no-cache
+ ./container/build.sh \
+ --framework SGLANG \
+ --tag dynamo-wideep:latest \
  ```
  You can use a specific tag from the [lmsys dockerhub](https://hub.docker.com/r/lmsysorg/sglang/tags) by adding `--build-arg SGLANG_IMAGE_TAG=<tag>` to the build command.
......
@@ -4,10 +4,10 @@ This recipe is for running DeepSeek R1 with SGLang in disaggregated mode. It is
  ## Container
- Use the Dockerfile in `container/Dockerfile.sglang-wideep` to build the container, or
+ Build the container using the `build.sh` script:
  ```bash
- ./container/build.sh --framework sglang-wideep
+ ./container/build.sh --framework SGLANG
  ```
  Dynamo commits after `1b3eed4b6a0e735d4ecec6681f4c0b89f2112167` (Sep 18, 2025) are required.
......
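Reviewer sketch, not part of the commit: the GB200 and H100 instructions in this change differ only in whether `--platform linux/arm64` is passed to `build.sh`. The snippet below illustrates that difference under stated assumptions. `print_build_cmd` is a hypothetical helper that does not exist in the Dynamo repository, and it only assembles the `--framework`, `--platform`, and `--tag` arguments that appear in the hunks above. It prints the command rather than running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the Dynamo repo): print the build.sh invocation
# that matches the instructions in this diff for a given CPU architecture.
print_build_cmd() {
  local arch="$1" tag="$2"
  case "$arch" in
    aarch64|arm64)
      # GB200 instructions pass the platform explicitly.
      echo "./container/build.sh --framework SGLANG --platform linux/arm64 --tag $tag"
      ;;
    x86_64|amd64)
      # H100 instructions pass no --platform flag.
      echo "./container/build.sh --framework SGLANG --tag $tag"
      ;;
    *)
      # Architectures not covered by this diff are rejected.
      echo "unsupported architecture: $arch" >&2
      return 1
      ;;
  esac
}

# Show the command for the current machine; run it from $DYNAMO_ROOT.
print_build_cmd "$(uname -m)" "dynamo-wideep:latest"
```

Printing instead of executing keeps the sketch safe to run anywhere; the tag names are the ones used in the updated READMEs and can be replaced freely.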