Unverified commit f8096590, authored by atchernych, committed by GitHub

docs: hello world deploy example (#2102)

parent 1e6709db
......@@ -4,6 +4,9 @@
Follow individual examples under components/backends/ to serve models locally.
For example, follow the [vLLM Backend Example](../../components/backends/vllm/README.md).
For a basic GPU-unaware example, see the [Hello World Example](../../examples/runtime/hello_world/README.md).
## Deploying Examples to Kubernetes
......@@ -14,14 +17,9 @@ Before you can deploy your graphs, you need to deploy the Dynamo Runtime and Dyn
If you are a **👤 Dynamo User**, first follow the [Quickstart Guide](../guides/dynamo_deploy/quickstart.md).
### Instructions for Dynamo Contributor
If you are a **🧑‍💻 Dynamo Contributor** first follow the instructions in [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md) to create your Dynamo Cloud deployment.
You would have to rebuild the dynamo platform images as the code evolves. For more details please look at the [Cloud Guide](../guides/dynamo_deploy/dynamo_cloud.md)
```bash
export DYNAMO_IMAGE=<your-registry>/<your-image-name>:<your-tag>
```
If you are a **🧑‍💻 Dynamo Contributor**, you may have to rebuild the Dynamo platform images as the code evolves.
For more details, read the [Cloud Guide](../guides/dynamo_deploy/dynamo_cloud.md).
To read more on deploying Dynamo Cloud, see [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md).
### Deploying a particular example
......@@ -42,12 +40,26 @@ kubectl apply -f components/backends/vllm/deploy/agg.yaml -n ${NAMESPACE}
You can use `kubectl get dynamoGraphDeployment -n ${NAMESPACE}` to view your deployment.
You can use `kubectl delete dynamoGraphDeployment <your-dep-name> -n ${NAMESPACE}` to delete the deployment.
We provide a Custom Resource YAML file for many examples under the `deploy/` folder.
See the [VLLM YAML](../../components/backends/vllm/deploy/agg.yaml) for an example.
**Note 1** Example Image
The examples use a prebuilt image from the `nvcr.io/nvidian/nim-llm-dev registry`.
The examples use a prebuilt image from the `nvcr.io` registry.
You can build your own image and update the image location in your CR file prior to applying.
See [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image)
You can build your own image using:
```bash
./container/build.sh --framework <your-inference-framework>
```
For example, for `sglang` run:
```bash
./container/build.sh --framework sglang
```
Then overwrite the image in the example's CR file:
```bash
extraPodSpec:
......@@ -72,4 +84,4 @@ kubectl port-forward svc/${SERVICE_NAME}-frontend 8080:8080 -n ${NAMESPACE}
Consult the [Port Forward Documentation](https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/)
More on [LLM examples](llm_deployment.md)
......@@ -101,3 +101,19 @@ Hello star!
- **`worker`**: A dynamo worker that connects to the backend service and processes the streaming response
## Deployment to Kubernetes
Follow the [Quickstart Guide](../../../docs/guides/dynamo_deploy/quickstart.md) to install Dynamo Cloud.
Then deploy to Kubernetes using:
```bash
export NAMESPACE=<your-namespace>
cd dynamo
kubectl apply -f examples/runtime/hello_world/deploy/hello_world.yaml -n ${NAMESPACE}
```
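To confirm that the deployment came up, the `kubectl get` commands shown elsewhere in these docs can be wrapped in a small helper. This is a minimal sketch, not part of the example itself: the function name `check_hello_world` and the `KUBECTL` override variable are hypothetical, and the resource name `hello-world` is taken from the manifest's `metadata.name`.

```bash
# Hypothetical helper (not shipped with the example): show the hello-world
# DynamoGraphDeployment and the pods in the given namespace.
# KUBECTL can be overridden, e.g. to point at a different kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

check_hello_world() {
  ns="$1"
  # Show the custom resource first, then the pods it produced.
  "$KUBECTL" get dynamographdeployment hello-world -n "$ns" &&
    "$KUBECTL" get pods -n "$ns"
}

# Usage (requires access to a cluster):
#   check_hello_world "${NAMESPACE}"
```

The worker's readiness probe in the manifest waits for `Serving endpoint` to appear in its log, so the worker pod may take a while to report ready.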
To delete your deployment:
```bash
kubectl delete dynamographdeployment hello-world -n ${NAMESPACE}
```
\ No newline at end of file
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
apiVersion: nvidia.com/v1alpha1
kind: DynamoGraphDeployment
metadata:
  name: hello-world
spec:
  services:
    Frontend:
      livenessProbe:
        httpGet:
          path: /health
          port: 8000
        initialDelaySeconds: 10
        periodSeconds: 10
        timeoutSeconds: 2
        failureThreshold: 3
      readinessProbe:
        exec:
          command:
            - /bin/sh
            - -c
            - 'echo ok'
        initialDelaySeconds: 10
        periodSeconds: 10
        timeoutSeconds: 2
        failureThreshold: 3
      dynamoNamespace: hello-world
      componentType: main
      replicas: 1
      resources:
        requests:
          cpu: "1"
          memory: "2Gi"
        limits:
          cpu: "1"
          memory: "2Gi"
      extraPodSpec:
        mainContainer:
          image: gitlab-master.nvidia.com:5005/dl/ai-dynamo/dynamo/dynamo:helloworld
          workingDir: /workspace/examples/runtime/hello_world/
          command:
            - /bin/sh
            - -c
          args:
            - "python3 client.py"
    HelloWorldWorker:
      livenessProbe:
        exec:
          command:
            - /bin/sh
            - -c
            - "exit 0"
        periodSeconds: 5
        timeoutSeconds: 2
        failureThreshold: 3
      readinessProbe:
        exec:
          command:
            - /bin/sh
            - -c
            - 'grep "Serving endpoint" /tmp/hello_world.log'
        initialDelaySeconds: 60
        periodSeconds: 60
        timeoutSeconds: 30
        failureThreshold: 10
      dynamoNamespace: hello-world
      componentType: worker
      replicas: 1
      resources:
        requests:
          cpu: "1"
          memory: "4Gi"
        limits:
          cpu: "1"
          memory: "4Gi"
      extraPodSpec:
        mainContainer:
          image: gitlab-master.nvidia.com:5005/dl/ai-dynamo/dynamo/dynamo:helloworld
          workingDir: /workspace/examples/runtime/hello_world/
          command:
            - /bin/sh
            - -c
          args:
            - python3 hello_world.py 2>&1 | tee /tmp/hello_world.log
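
Both services in the manifest above pin the same image. If you built your own image, one way to point the manifest at it without editing the file by hand is to rewrite the `image:` lines on the fly. The following is a minimal sketch under stated assumptions: the helper name `set_image` is hypothetical, `DYNAMO_IMAGE` is assumed to hold your image reference, and the image name is assumed to contain no `|` or `&` characters, which would confuse the `sed` expression.

```bash
# Hypothetical helper: print a manifest with every `image:` value replaced.
# Usage: set_image <manifest.yaml> <image>
set_image() {
  manifest="$1"
  image="$2"
  # Keep the original indentation and key; replace only the value.
  sed -E "s|^([[:space:]]*image:[[:space:]]*).*$|\1${image}|" "$manifest"
}

# Usage (requires access to a cluster):
#   set_image examples/runtime/hello_world/deploy/hello_world.yaml "${DYNAMO_IMAGE}" \
#     | kubectl apply -n "${NAMESPACE}" -f -
```

Writing the result to stdout leaves the checked-in manifest untouched, so the override stays local to your deployment.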