Unverified Commit 8bd37c96 authored by Anant Sharma, committed by GitHub

refactor: move backend deploy, launch and slurm files from components to examples (#3849)


Signed-off-by: Anant Sharma <anants@nvidia.com>
parent 78359046
@@ -19,7 +19,7 @@ helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-$
 helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE}
 ```
 3. Model hosting with vLLM backend
-This `agg_router.yaml` is adpated from vLLM deployment [example](https://github.com/ai-dynamo/dynamo/blob/main/components/backends/vllm/deploy/agg_router.yaml). It has following customizations
+This `agg_router.yaml` is adpated from vLLM deployment [example](https://github.com/ai-dynamo/dynamo/blob/main/examples/backends/vllm/deploy/agg_router.yaml). It has following customizations
 - Deployed `Qwen/Qwen2.5-1.5B-Instruct` model
 - Use KV cache based routing in frontend deployment via the `DYN_ROUTER_MODE=kv` environment variable
 - Mounted a local cache folder `/YOUR/LOCAL/CACHE/FOLDER` for model artifacts reuse
......
@@ -39,7 +39,7 @@ spec:
 volumeMounts:
 - name: local-model-cache
 mountPath: /root/.cache
-workingDir: /workspace/components/backends/vllm
+workingDir: /workspace/examples/backends/vllm
 command:
 - /bin/sh
 - -c
......
@@ -36,7 +36,7 @@ spec:
 type: DirectoryOrCreate
 mainContainer:
 image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:my-tag
-workingDir: /workspace/components/backends/vllm
+workingDir: /workspace/examples/backends/vllm
 volumeMounts:
 - name: local-model-cache
 mountPath: /root/.cache
@@ -64,7 +64,7 @@ spec:
 type: DirectoryOrCreate
 mainContainer:
 image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:my-tag
-workingDir: /workspace/components/backends/vllm
+workingDir: /workspace/examples/backends/vllm
 volumeMounts:
 - name: local-model-cache
 mountPath: /root/.cache
......
@@ -46,7 +46,7 @@ spec:
 extraPodSpec:
 mainContainer:
 image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.5.0
-workingDir: /workspace/components/backends/vllm
+workingDir: /workspace/examples/backends/vllm
 command:
 - /bin/sh
 - -c
......
@@ -32,7 +32,7 @@ if [[ -z ${IMAGE} ]]; then
 echo "ERROR: You need to set the IMAGE environment variable to the " \
 "Dynamo+TRTLLM docker image or .sqsh file from 'enroot import' " \
 "See how to build one from source here: " \
-"https://github.com/ai-dynamo/dynamo/tree/main/components/backends/trtllm#build-docker"
+"https://github.com/ai-dynamo/dynamo/tree/main/docs/backends/trtllm/README.md#build-container"
 exit 1
 fi
......
@@ -37,7 +37,7 @@ if [[ -z ${IMAGE} ]]; then
 echo "ERROR: You need to set the IMAGE environment variable to the " \
 "Dynamo+TRTLLM docker image or .sqsh file from 'enroot import' " \
 "See how to build one from source here: " \
-"https://github.com/ai-dynamo/dynamo/tree/main/components/backends/trtllm#build-docker"
+"https://github.com/ai-dynamo/dynamo/tree/main/docs/backends/trtllm/README.md#build-container"
 exit 1
 fi
......
@@ -84,13 +84,13 @@ Please follow steps below to create this task
 |ETCD_ENDPOINTS|Value|http://IP_ADDRESS:2379|
 |NATS_SERVER|Value|nats://IP_ADDRESS:4222|
 - Docker configuration
-Add `sh,-c` in **Entry point** and `cd components/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager` in **Command**
+Add `sh,-c` in **Entry point** and `cd examples/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager` in **Command**
 2. Dynamo vLLM PrefillWorker Task
 Create the PrefillWorker task same as the frontend worker, except for following changes
 - Set container name as `dynamo-prefill`
 - No container port mapping
-- Docker configuration with command `cd components/backends/vllm && python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker`
+- Docker configuration with command `cd examples/backends/vllm && python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker`
 ## 5. Task Deployment
 You can create a service or directly run the task from the task definition
......
@@ -23,7 +23,7 @@
 "-c"
 ],
 "command": [
-"cd components/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager"
+"cd examples/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager"
 ],
 "environment": [
 {
......
@@ -15,7 +15,7 @@
 "-c"
 ],
 "command": [
-"cd components/backends/vllm && python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker"
+"python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker"
 ],
 "environment": [
 {
......
@@ -86,7 +86,7 @@ helm install dynamo-platform ./platform/ \
 Your pods should be running like below
 ```
-ubuntu@ip-192-168-83-157:~/dynamo/components/backends/vllm/deploy$ kubectl get pods -A
+ubuntu@ip-192-168-83-157:~/dynamo/examples/backends/vllm/deploy$ kubectl get pods -A
 NAMESPACE NAME READY STATUS RESTARTS AGE
 dynamo-cloud dynamo-platform-dynamo-operator-controller-manager-86795c5f4j4k 2/2 Running 0 4h17m
 dynamo-cloud dynamo-platform-etcd-0 1/1 Running 0 4h17m
......
@@ -3,7 +3,7 @@
 ## 1. Deploy Dynamo Graph
 ```
-cd dynamo/components/backends/vllm/deploy
+cd dynamo/examples/backends/vllm/deploy
 vim agg_router.yaml #under metadata add namespace: dynamo-cloud and change image to your built base image
 kubectl apply -f agg_router.yaml
 ```
@@ -11,7 +11,7 @@ kubectl apply -f agg_router.yaml
 Your pods should be running like below
 ```
-ubuntu@ip-192-168-83-157:~/dynamo/components/backends/vllm/deploy$ kubectl get pods -A
+ubuntu@ip-192-168-83-157:~/dynamo/examples/backends/vllm/deploy$ kubectl get pods -A
 NAMESPACE NAME READY STATUS RESTARTS AGE
 dynamo-cloud dynamo-platform-dynamo-operator-controller-manager-86795c5f4j4k 2/2 Running 0 4h17m
 dynamo-cloud dynamo-platform-etcd-0 1/1 Running 0 4h17m
......
@@ -25,7 +25,7 @@ spec:
 extraPodSpec:
 mainContainer:
 image: my-registry/sglang-runtime:my-tag
-workingDir: /workspace/components/backends/sglang
+workingDir: /workspace/examples/backends/sglang
 command:
 - /bin/sh
 - -c
@@ -48,7 +48,7 @@ spec:
 extraPodSpec:
 mainContainer:
 image: my-registry/sglang-runtime:my-tag
-workingDir: /workspace/components/backends/sglang
+workingDir: /workspace/examples/backends/sglang
 command:
 - /bin/sh
 - -c
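The hunks in this commit all replace a `components/backends/` path with a new location, so a checkout can be scanned for references that were missed. The following is a minimal sketch, assuming a `grep` that supports `-r` (GNU and BSD both do); the function name `find_stale_paths` is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the dynamo repo): print every line under
# a directory that still mentions the pre-move prefix "components/backends/".
find_stale_paths() {
    root="${1:-.}"   # directory (or file) to scan; defaults to the current one
    # -r: recurse, -n: print line numbers, -F: treat the pattern as a literal.
    # grep exits 1 when nothing matches; "|| true" makes the function
    # return 0 whether or not any matches are found.
    grep -rnF -- 'components/backends/' "$root" || true
}
```

Calling `find_stale_paths path/to/checkout` prints one `file:line:text` entry per hit. Hits are candidates for review, not necessarily errors: the commit title says only the deploy, launch and slurm files moved, so other references to `components/backends/` may still be valid.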
......