Unverified Commit 8bd37c96 authored by Anant Sharma's avatar Anant Sharma Committed by GitHub
Browse files

refactor: move backend deploy, launch and slurm files from components to examples (#3849)


Signed-off-by: default avatarAnant Sharma <anants@nvidia.com>
parent 78359046
......@@ -19,7 +19,7 @@ helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-$
helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE}
```
3. Model hosting with vLLM backend
This `agg_router.yaml` is adpated from vLLM deployment [example](https://github.com/ai-dynamo/dynamo/blob/main/components/backends/vllm/deploy/agg_router.yaml). It has following customizations
This `agg_router.yaml` is adpated from vLLM deployment [example](https://github.com/ai-dynamo/dynamo/blob/main/examples/backends/vllm/deploy/agg_router.yaml). It has following customizations
- Deployed `Qwen/Qwen2.5-1.5B-Instruct` model
- Use KV cache based routing in frontend deployment via the `DYN_ROUTER_MODE=kv` environment variable
- Mounted a local cache folder `/YOUR/LOCAL/CACHE/FOLDER` for model artifacts reuse
......
......@@ -39,7 +39,7 @@ spec:
volumeMounts:
- name: local-model-cache
mountPath: /root/.cache
workingDir: /workspace/components/backends/vllm
workingDir: /workspace/examples/backends/vllm
command:
- /bin/sh
- -c
......
......@@ -36,7 +36,7 @@ spec:
type: DirectoryOrCreate
mainContainer:
image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:my-tag
workingDir: /workspace/components/backends/vllm
workingDir: /workspace/examples/backends/vllm
volumeMounts:
- name: local-model-cache
mountPath: /root/.cache
......@@ -64,7 +64,7 @@ spec:
type: DirectoryOrCreate
mainContainer:
image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:my-tag
workingDir: /workspace/components/backends/vllm
workingDir: /workspace/examples/backends/vllm
volumeMounts:
- name: local-model-cache
mountPath: /root/.cache
......
......@@ -46,7 +46,7 @@ spec:
extraPodSpec:
mainContainer:
image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.5.0
workingDir: /workspace/components/backends/vllm
workingDir: /workspace/examples/backends/vllm
command:
- /bin/sh
- -c
......
......@@ -32,7 +32,7 @@ if [[ -z ${IMAGE} ]]; then
echo "ERROR: You need to set the IMAGE environment variable to the " \
"Dynamo+TRTLLM docker image or .sqsh file from 'enroot import' " \
"See how to build one from source here: " \
"https://github.com/ai-dynamo/dynamo/tree/main/components/backends/trtllm#build-docker"
"https://github.com/ai-dynamo/dynamo/tree/main/docs/backends/trtllm/README.md#build-container"
exit 1
fi
......
......@@ -37,7 +37,7 @@ if [[ -z ${IMAGE} ]]; then
echo "ERROR: You need to set the IMAGE environment variable to the " \
"Dynamo+TRTLLM docker image or .sqsh file from 'enroot import' " \
"See how to build one from source here: " \
"https://github.com/ai-dynamo/dynamo/tree/main/components/backends/trtllm#build-docker"
"https://github.com/ai-dynamo/dynamo/tree/main/docs/backends/trtllm/README.md#build-container"
exit 1
fi
......
......@@ -84,13 +84,13 @@ Please follow steps below to create this task
|ETCD_ENDPOINTS|Value|http://IP_ADDRESS:2379|
|NATS_SERVER|Value|nats://IP_ADDRESS:4222|
- Docker configuration
Add `sh,-c` in **Entry point** and `cd components/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager` in **Command**
Add `sh,-c` in **Entry point** and `cd examples/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager` in **Command**
2. Dynamo vLLM PrefillWorker Task
Create the PrefillWorker task same as the frontend worker, except for following changes
- Set container name as `dynamo-prefill`
- No container port mapping
- Docker configuration with command `cd components/backends/vllm && python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker`
- Docker configuration with command `cd examples/backends/vllm && python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker`
## 5. Task Deployment
You can create a service or directly run the task from the task definition
......
......@@ -23,7 +23,7 @@
"-c"
],
"command": [
"cd components/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager"
"cd examples/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager"
],
"environment": [
{
......
......@@ -15,7 +15,7 @@
"-c"
],
"command": [
"cd components/backends/vllm && python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker"
"python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker"
],
"environment": [
{
......
......@@ -86,7 +86,7 @@ helm install dynamo-platform ./platform/ \
Your pods should be running like below
```
ubuntu@ip-192-168-83-157:~/dynamo/components/backends/vllm/deploy$ kubectl get pods -A
ubuntu@ip-192-168-83-157:~/dynamo/examples/backends/vllm/deploy$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
dynamo-cloud dynamo-platform-dynamo-operator-controller-manager-86795c5f4j4k 2/2 Running 0 4h17m
dynamo-cloud dynamo-platform-etcd-0 1/1 Running 0 4h17m
......
......@@ -3,7 +3,7 @@
## 1. Deploy Dynamo Graph
```
cd dynamo/components/backends/vllm/deploy
cd dynamo/examples/backends/vllm/deploy
vim agg_router.yaml #under metadata add namespace: dynamo-cloud and change image to your built base image
kubectl apply -f agg_router.yaml
```
......@@ -11,7 +11,7 @@ kubectl apply -f agg_router.yaml
Your pods should be running like below
```
ubuntu@ip-192-168-83-157:~/dynamo/components/backends/vllm/deploy$ kubectl get pods -A
ubuntu@ip-192-168-83-157:~/dynamo/examples/backends/vllm/deploy$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
dynamo-cloud dynamo-platform-dynamo-operator-controller-manager-86795c5f4j4k 2/2 Running 0 4h17m
dynamo-cloud dynamo-platform-etcd-0 1/1 Running 0 4h17m
......
......@@ -25,7 +25,7 @@ spec:
extraPodSpec:
mainContainer:
image: my-registry/sglang-runtime:my-tag
workingDir: /workspace/components/backends/sglang
workingDir: /workspace/examples/backends/sglang
command:
- /bin/sh
- -c
......@@ -48,7 +48,7 @@ spec:
extraPodSpec:
mainContainer:
image: my-registry/sglang-runtime:my-tag
workingDir: /workspace/components/backends/sglang
workingDir: /workspace/examples/backends/sglang
command:
- /bin/sh
- -c
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment