Unverified commit 57fe1068, authored by atchernych, committed by GitHub

docs: Improve Dynamo Deploy Documentation (#1692)


Signed-off-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
Co-authored-by: Julien Mancuso <jmancuso@nvidia.com>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
parent 27c24b3f
# Contributing to Dynamo Deploy
Welcome to the Dynamo Deploy project! This guide will help you get started with contributing to the deployment infrastructure and tooling for the Dynamo distributed inference platform.
## Getting Started
### Prerequisites
### Quick Setup
### Project Structure
The deploy directory contains several key components:
```
deploy/
├── cloud/ # Cloud deployment platform
│ ├── helm/ # Cloud platform Helm charts
│ └── operator/ # Kubernetes operator (Go)
├── helm/ # Manual deployment Helm charts
├── metrics/ # Monitoring and observability
├── sdk/ # Python scripts
└── inference-gateway/ # Gateway components
```
## Development Environment
### Setting Up Your Environment
### IDE Configuration
**VS Code:**
- Install Go extension
- Install Python extension
- Configure settings for Go formatting and linting
- Add workspace settings for consistent formatting
### Contribution Workflow Caveats
- Commits must be signed:
```bash
git commit -S
```
- Every time you modify `deploy/cloud/helm/crds/templates/*.yaml`, bump the version of the CRD Helm chart in:
1. `deploy/cloud/helm/platform/components/operator/Chart.yaml`
2. `deploy/cloud/helm/platform/Chart.yaml`

Then update the chart dependencies:
```bash
cd deploy/cloud/helm/platform
helm dependency update
```
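The version bump can be computed rather than typed by hand. The following is a minimal sketch, assuming the chart versions follow a plain `MAJOR.MINOR.PATCH` scheme and that a patch-level bump is what you want; `bump_patch` is a hypothetical helper, not something shipped in the repository.

```bash
# Hypothetical helper: print the next patch version for a MAJOR.MINOR.PATCH string.
# Returns non-zero (and prints nothing) if the input is not three numeric fields.
bump_patch() {
  printf '%s\n' "$1" | awk -F. '
    $0 !~ /^[0-9]+\.[0-9]+\.[0-9]+$/ { exit 1 }
    { printf "%s.%s.%d\n", $1, $2, $3 + 1 }'
}

bump_patch 0.1.3   # prints 0.1.4
```

You would still edit the `version` fields in the two `Chart.yaml` files yourself; the helper only computes the new value.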
#### Commit Message Guidelines
Follow conventional commit format:
- `feat:` new features
- `fix:` bug fixes
- `docs:` documentation changes
- `test:` adding or updating tests
- `refactor:` code refactoring
- `perf:` performance improvements
- `ci:` CI/CD changes
Examples:
```
feat(operator): add support for custom resource limits
fix(sdk): resolve service discovery timeout issue
docs(helm): update deployment guide with new examples
test(e2e): add integration tests for disaggregated serving
```
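The prefixes listed above can be checked mechanically before you push. This is a minimal sketch, assuming an optional scope in parentheses made of lowercase letters, digits, `_` and `-`; `is_conventional_commit` and `COMMIT_TYPES` are hypothetical names, and the project may enforce different rules in CI.

```bash
# Commit types listed in this guide.
COMMIT_TYPES='feat|fix|docs|test|refactor|perf|ci'

# Return 0 if the subject looks like "type: summary" or "type(scope): summary".
is_conventional_commit() {
  printf '%s\n' "$1" | grep -Eq "^(${COMMIT_TYPES})(\([a-z0-9_-]+\))?: .+"
}

# Example: validate one of the subjects shown above.
is_conventional_commit "fix(sdk): resolve service discovery timeout issue" && echo "valid"
```

A check like this could be called from a local `commit-msg` hook, passing the first line of the message file as the argument.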
## Style Guide
### Go Code Style (Operator)
Follow standard Go conventions.
### Python Code Style (SDK)
Follow PEP 8 and use modern Python practices.
### YAML/Helm Templates
```yaml
# Use consistent indentation (2 spaces)
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "dynamo.fullname" . }}
labels:
{{- include "dynamo.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "dynamo.selectorLabels" . | nindent 6 }}
```
## Testing
Once you have an MR up and the standard checks pass, trigger the integration tests by adding the comment `/ok to test <COMMIT-ID>`.
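The comment text can be generated from your branch head. This is a minimal sketch, assuming the commit ID is the SHA that `git rev-parse HEAD` prints; whether a full or abbreviated SHA is expected is not stated in this guide, and `ok_to_test_comment` is a hypothetical helper.

```bash
# Hypothetical helper: build the comment that triggers the integration tests.
# Pass a commit ID explicitly, or omit it to use the SHA of the current HEAD.
ok_to_test_comment() {
  local commit_id="${1:-}"
  if [ -z "$commit_id" ]; then
    commit_id="$(git rev-parse HEAD)" || return 1
  fi
  echo "/ok to test ${commit_id}"
}
```

Running `ok_to_test_comment` inside your checkout prints a line you can paste into the review thread.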
### Unit Tests
**Go Tests (Operator):**
```bash
cd deploy/cloud/operator
go test ./... -v
go test -race ./...
```
**Python Tests (SDK):**
```bash
cd deploy/sdk
pytest tests/ -v
pytest tests/ --cov=dynamo.sdk
```
### Integration Tests
**End-to-End Deployment Tests:**
```bash
# Run full deployment test suite
pytest tests/serve/test_dynamo_serve.py -v
# Test specific deployment scenarios
pytest "tests/serve/test_dynamo_serve.py::test_serve_deployment[agg]" -v
```
**Operator Integration Tests:**
```bash
cd deploy/cloud/operator
make test-e2e
```
### Writing Tests
**Example Unit Test:**
**Example Integration Test:**
### Examples Testing
Ensure documentation examples work.
Thank you for contributing to Dynamo Deploy! 🚀
...@@ -17,72 +17,38 @@ limitations under the License. ...@@ -17,72 +17,38 @@ limitations under the License.
# 🚀 Deploy Dynamo Cloud to Kubernetes # 🚀 Deploy Dynamo Cloud to Kubernetes
## 🏗️ Building Docker images for Dynamo Cloud components Dynamo Cloud acts as an orchestration layer between the end user and Kubernetes, handling the complexity of deploying your graphs for you.
Before you can deploy your graphs, you need to deploy the Dynamo Runtime and Dynamo Cloud images. This is a one-time action, only necessary the first time you deploy a DynamoGraph.
You can build and push Docker images for the Dynamo cloud components (API server, API store, and operator) to any container registry of your choice. Here's how to build each component: [See Dynamo Cloud Guide](../../../docs/guides/dynamo_deploy/dynamo_cloud.md) for advanced cases and details on how to install and use Dynamo Cloud. For a quick start follow the steps below.
### 📋 Prerequisites
- [Earthly](https://earthly.dev/) installed
- Docker installed and running
- Access to a container registry of your choice
### ⚙️ Building and Pushing Images ## 🏗️ Building Docker images for Dynamo Cloud components
First, set the required environment variables: You can build and push Docker images for the Dynamo cloud components to any container registry of your choice.
```bash
export DOCKER_SERVER=<CONTAINER_REGISTRY>
export IMAGE_TAG=<TAG>
```
As a description of the placeholders: **Important** Make sure you're logged in to your container registry before pushing images. For example:
- `<CONTAINER_REGISTRY>`: Your container registry (e.g., `nvcr.io`, `docker.io/<your-username>`, etc.)
- `<TAG>`: The tag you want to use for the image (e.g., `latest`, `0.0.1`, etc.)
Note: Make sure you're logged in to your container registry before pushing images. For example:
```bash ```bash
docker login <CONTAINER_REGISTRY> docker login <CONTAINER_REGISTRY>
``` ```
You can build each component individually or build all components at once: #### 🛠️ Build and push images for the Dynamo Cloud platform components
#### 🛠️ Build and push platform components
```bash
earthly --push +all-docker --DOCKER_SERVER=$DOCKER_SERVER --IMAGE_TAG=$IMAGE_TAG
```
## 🚀 Deploy Dynamo Cloud Platform
### 📋 Prerequisites
Before deploying Dynamo Cloud, ensure your Kubernetes cluster meets the following requirements:
#### 1. 🛡️ Istio Installation [One-time Action]
Dynamo Cloud requires Istio for service mesh capabilities. Verify Istio is installed and running: You should build the images for the Dynamo Cloud Platform.
If you are a **👤 Dynamo User** you would do this step once.
```bash ```bash
# Check if Istio is installed export DOCKER_SERVER=<your-docker-server>
kubectl get pods -n istio-system export IMAGE_TAG=<TAG>
earthly --push +all-docker --DOCKER_SERVER=$DOCKER_SERVER --IMAGE_TAG=$IMAGE_TAG
# Expected output should show running Istio pods
# istiod-* pods should be in Running state
``` ```
#### 2. 💾 PVC Support with Default Storage Class If you are a **🧑‍💻 Dynamo Contributor** you would have to rebuild the dynamo platform images as the code evolves. To do so please look at the [Cloud Guide](../../../docs/guides/dynamo_deploy/dynamo_cloud.md).
Dynamo Cloud requires Persistent Volume Claim (PVC) support with a default storage class. Verify your cluster configuration:
```bash
# Check if default storage class exists
kubectl get storageclass
# Expected output should show at least one storage class marked as (default)
# Example:
# NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
# standard (default) kubernetes.io/gce-pd Delete Immediate true 1d
```
> [!TIP] ### 🚀 Deploying the Dynamo Cloud Platform
> Don't have a Kubernetes cluster? Check out our [Minikube setup guide](../../../docs/guides/dynamo_deploy/minikube.md) to set up a local environment! 🏠
### 📥 Installation
1. Set the required environment variables: 1. Set the required environment variables:
```bash ```bash
...@@ -101,16 +67,23 @@ export DYNAMO_INGRESS_SUFFIX=dynamo-cloud.com # change this to whatever you want ...@@ -101,16 +67,23 @@ export DYNAMO_INGRESS_SUFFIX=dynamo-cloud.com # change this to whatever you want
cd $PROJECT_ROOT/deploy/cloud/helm cd $PROJECT_ROOT/deploy/cloud/helm
kubectl create namespace $NAMESPACE kubectl create namespace $NAMESPACE
kubectl config set-context --current --namespace=$NAMESPACE kubectl config set-context --current --namespace=$NAMESPACE
```
3. Deploy Dynamo Cloud using the Helm chart via the provided deploy script:
To deploy the Dynamo Cloud Platform on Kubernetes, run:
kubectl create secret docker-registry docker-imagepullsecret \ ```bash
--docker-server=$DOCKER_SERVER \ ./deploy.sh --crds
--docker-username=$DOCKER_USERNAME \
--docker-password=$DOCKER_PASSWORD \
--namespace=$NAMESPACE
``` ```
3. Deploy the helm chart using the deploy script: if you want guidance during the process, run the deployment script with the `--interactive` flag:
```bash ```bash
./deploy.sh ./deploy.sh --crds --interactive
``` ```
omitting `--crds` will skip the CRDs installation/upgrade. This is useful when installing on a shared cluster as CRDs are cluster-scoped resources.
...@@ -16,6 +16,17 @@ ...@@ -16,6 +16,17 @@
# limitations under the License. # limitations under the License.
set -euo pipefail set -euo pipefail
trap 'echo "Error at line $LINENO. Exiting."' ERR
required_tools=(envsubst kubectl helm)
for tool in "${required_tools[@]}"; do
if ! command -v "$tool" >/dev/null 2>&1; then
echo "Error: Required tool '$tool' is not installed or not in PATH."
exit 1
fi
done
# Use system helm # Use system helm
HELM_CMD=$(which helm) HELM_CMD=$(which helm)
...@@ -42,6 +53,7 @@ export ENABLE_LWS="${ENABLE_LWS:=false}" ...@@ -42,6 +53,7 @@ export ENABLE_LWS="${ENABLE_LWS:=false}"
# Add command line options # Add command line options
INTERACTIVE=false INTERACTIVE=false
INSTALL_CRDS=false INSTALL_CRDS=false
YAML_ONLY=false
# Parse command line arguments # Parse command line arguments
while [[ $# -gt 0 ]]; do while [[ $# -gt 0 ]]; do
key="$1" key="$1"
...@@ -54,10 +66,15 @@ while [[ $# -gt 0 ]]; do ...@@ -54,10 +66,15 @@ while [[ $# -gt 0 ]]; do
INSTALL_CRDS=true INSTALL_CRDS=true
shift shift
;; ;;
--yaml-only)
YAML_ONLY=true
shift
;;
--help) --help)
echo "Usage: $0 [options]" echo "Usage: $0 [options]"
echo "Options:" echo "Options:"
echo " --interactive Run in interactive mode" echo " --interactive Run in interactive mode"
echo " --yaml-only Only generate generated-values.yaml and exit"
echo " --help Show this help message" echo " --help Show this help message"
echo " --crds Also install the CRDs" echo " --crds Also install the CRDs"
exit 0 exit 0
...@@ -122,8 +139,6 @@ retry_command() { ...@@ -122,8 +139,6 @@ retry_command() {
# Update the helm repo and build the dependencies # Update the helm repo and build the dependencies
retry_command "$HELM_CMD repo add nats https://nats-io.github.io/k8s/helm/charts/" 5 5 && \ retry_command "$HELM_CMD repo add nats https://nats-io.github.io/k8s/helm/charts/" 5 5 && \
retry_command "$HELM_CMD repo add bitnami https://charts.bitnami.com/bitnami" 5 5 && \
retry_command "$HELM_CMD repo add minio https://charts.min.io/" 5 5 && \
retry_command "$HELM_CMD repo update" 5 5 retry_command "$HELM_CMD repo update" 5 5
...@@ -153,6 +168,10 @@ cat generated-values.yaml ...@@ -153,6 +168,10 @@ cat generated-values.yaml
echo "" echo ""
echo "Generated values file saved as generated-values.yaml" echo "Generated values file saved as generated-values.yaml"
if [ "$YAML_ONLY" = true ]; then
echo "--yaml-only flag detected; skipping Helm deployment steps."
exit 0
fi
# Build dependencies before installation # Build dependencies before installation
echo "Building helm dependencies..." echo "Building helm dependencies..."
...@@ -166,11 +185,16 @@ if [ "$INSTALL_CRDS" = true ]; then ...@@ -166,11 +185,16 @@ if [ "$INSTALL_CRDS" = true ]; then
$HELM_CMD upgrade --install dynamo-crds crds/ --namespace default --wait --atomic $HELM_CMD upgrade --install dynamo-crds crds/ --namespace default --wait --atomic
fi fi
# Install/upgrade the helm chart # Build Platform
echo "Installing/upgrading helm chart..." echo "Building platform..."
$HELM_CMD upgrade --install $RELEASE_NAME platform/ \ $HELM_CMD dep build ./platform/
-f generated-values.yaml \
--create-namespace \ # Install platform
--namespace ${NAMESPACE} echo "Installing platform..."
helm install dynamo-platform ./platform/ \
--namespace ${NAMESPACE} \
--set "dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator" \
--set "dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG}" \
--set "dynamo-operator.imagePullSecrets[0].name=docker-imagepullsecret" \
--set controller.env.DYNAMO_CLOUD=${DYNAMO_CLOUD}
echo "Helm chart deployment complete" echo "Helm chart deployment complete"
# Examples of using Dynamo Platform
## Serving examples locally
Follow individual examples to serve models locally.
## Deploying Examples to Kubernetes
First you need to install the Dynamo Cloud Platform. Dynamo Cloud acts as an orchestration layer between the end user and Kubernetes, handling the complexity of deploying your graphs for you.
Before you can deploy your graphs, you need to deploy the Dynamo Runtime and Dynamo Cloud images. This is a one-time action, only necessary the first time you deploy a DynamoGraph.
### Instructions for Dynamo User
If you are a **👤 Dynamo User**, first follow the [Quickstart Guide](../guides/dynamo_deploy/quickstart.md).
### Instructions for Dynamo Contributor
If you are a **🧑‍💻 Dynamo Contributor** first follow the instructions in [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md) to create your Dynamo Cloud deployment.
Make sure the `deploy.sh --crds --interactive` script from that deployment finished successfully.
You will have to rebuild the Dynamo platform images as the code evolves. For more details, see the [Cloud Guide](../guides/dynamo_deploy/dynamo_cloud.md).
Export the [Dynamo Base Image](../get_started.md#building-the-dynamo-base-image) you want to use (or the one you built during the prerequisites step) as the `DYNAMO_IMAGE` environment variable.
```bash
export DYNAMO_IMAGE=<your-registry>/<your-image-name>:<your-tag>
```
### Post Install Instructions
```bash
# Set your dynamo root directory
cd <root-dynamo-folder>
export PROJECT_ROOT=$(pwd)
export NAMESPACE=<your-namespace> # the namespace you used to deploy Dynamo cloud to.
```
Pick your deployment destination.
For a local deployment:
```bash
export DYNAMO_CLOUD=http://localhost:8080
```
For a Kubernetes deployment:
```bash
export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com
```
Deploying an example then takes a single `kubectl apply -f` command.
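As an illustration, the apply step can be wrapped so that a missing namespace is caught before anything reaches the cluster. This is a minimal sketch: `deploy_example` and the `DRY_RUN` switch are hypothetical, and the manifest path is the hello world example used elsewhere in these docs.

```bash
# Hypothetical helper: apply an example manifest to the namespace Dynamo Cloud runs in.
# Set DRY_RUN=1 to print the kubectl command instead of running it.
deploy_example() {
  local manifest="$1"
  if [ -z "${NAMESPACE:-}" ]; then
    echo "NAMESPACE is not set" >&2
    return 1
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "kubectl apply -f $manifest -n $NAMESPACE"
  else
    kubectl apply -f "$manifest" -n "$NAMESPACE"
  fi
}

# Example: preview the command for the hello world graph.
NAMESPACE=dynamo-cloud DRY_RUN=1 deploy_example examples/hello_world/deploy/hello_world.yaml
```

Dropping `DRY_RUN=1` runs the real `kubectl apply` against your current context.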
...@@ -26,7 +26,7 @@ This example demonstrates the basic concepts of Dynamo by creating a simple mult ...@@ -26,7 +26,7 @@ This example demonstrates the basic concepts of Dynamo by creating a simple mult
3. Set up a simple HTTP API endpoint 3. Set up a simple HTTP API endpoint
4. Deploy and interact with a Dynamo service graph 4. Deploy and interact with a Dynamo service graph
Pipeline Architecture: Graph Architecture:
``` ```
Users/Clients (HTTP) Users/Clients (HTTP)
...@@ -66,6 +66,12 @@ Users/Clients (HTTP) ...@@ -66,6 +66,12 @@ Users/Clients (HTTP)
## Running the Example Locally ## Running the Example Locally
Make sure you are running etcd and nats
```bash
sudo systemctl start etcd
sudo systemctl start nats-server
```
1. Launch all three services using a single command: 1. Launch all three services using a single command:
```bash ```bash
...@@ -87,65 +93,37 @@ curl -X 'POST' \ ...@@ -87,65 +93,37 @@ curl -X 'POST' \
}' }'
``` ```
## Deploying to and Running the Example in Kubernetes # Deploy to Kubernetes
This example can be deployed to a Kubernetes cluster using [Dynamo Cloud](../../docs/guides/dynamo_deploy/dynamo_cloud.md) and the Dynamo CLI.
### Prerequisites
You must have first followed the instructions in [deploy/cloud/helm/README.md](https://github.com/ai-dynamo/dynamo/blob/main/deploy/cloud/helm/README.md) to create your Dynamo cloud deployment.
### Deployment Steps For your Hello World graph. You should first deploy the Dynamo Cloud Platform.
If you are a **👤 Dynamo User** first follow the [Quickstart Guide](../guides/dynamo_deploy/quickstart.md).
If you are a **🧑‍💻 Dynamo Contributor** and you have changed the platform code you would have to rebuild the dynamo platform. To do so please look at the [Cloud Guide](../guides/dynamo_deploy/dynamo_cloud.md).
For detailed deployment instructions, please refer to the [Operator Deployment Guide](../../docs/guides/dynamo_deploy/operator_deployment.md). The following are the specific commands for the hello world example: ## Deploy your service using a DynamoGraphDeployment CR.
```bash ```bash
Make sure your dynamo cloud deploy.sh script from the prior step finished successfully and setup port forwaring in another window kubectl apply -f examples/hello_world/deploy/hello_world.yaml -n ${NAMESPACE}
per its suggestion.
kubectl port-forward svc/...-dynamo-api-store <local-port>:80 -n $NAMESPACE
# Set your dynamo root directory
cd <root dynamo folder>
export PROJECT_ROOT=$(pwd)
# Configure environment variables (see operator_deployment.md for details)
export KUBE_NS=hello-world
export DYNAMO_CLOUD=http://localhost:8080 # If using port-forward
# OR
# export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com # If using Ingress/VirtualService
# Build the Dynamo base image (see operator_deployment.md for details)
export DYNAMO_IMAGE=<your-registry>/<your-image-name>:<your-tag>
# Build the service
cd $PROJECT_ROOT/examples/hello_world
DYNAMO_TAG=$(dynamo build hello_world:Frontend | grep "Successfully built" | awk '{ print $3 }' | sed 's/\.$//')
# Deploy to Kubernetes
# TODO: Deploy your service using a DynamoGraphDeployment CR.
``` ```
### Testing the Deployment ## Testing the Deployment
Once the deployment is complete, you can test it using: Once the deployment is complete, you can test it using commands below.
Do the port forward in another terminal if needed.
```bash ```bash
# Find your frontend pod export DEPLOYMENT_NAME=hello-world
export FRONTEND_POD=$(kubectl get pods -n ${KUBE_NS} | grep "${DEPLOYMENT_NAME}-frontend" | sort -k1 | tail -n1 | awk '{print $1}')
# Forward the pod's port to localhost # Forward the pod's port to localhost
kubectl port-forward pod/$FRONTEND_POD 8000:8000 -n ${KUBE_NS} kubectl port-forward svc/$DEPLOYMENT_NAME-frontend 8000:8000 -n ${NAMESPACE}
```
```bash
# Test the API endpoint # Test the API endpoint
curl -X 'POST' 'http://localhost:8000/generate' \ curl -N -X POST http://localhost:8000/generate \
-H 'accept: text/event-stream' \ -H "accept: text/event-stream" \
-H 'Content-Type: application/json' \ -H "Content-Type: application/json" \
-d '{"text": "test"}' -d '{"text": "test"}'
``` ```
For more details on managing deployments, testing, and troubleshooting, please refer to the [Operator Deployment Guide](../../docs/guides/dynamo_deploy/operator_deployment.md).
## Expected Output ## Expected Output
......
...@@ -253,11 +253,8 @@ export DEPLOYMENT_NAME=llm-agg ...@@ -253,11 +253,8 @@ export DEPLOYMENT_NAME=llm-agg
Once the deployment is complete, you can test it using: Once the deployment is complete, you can test it using:
```bash ```bash
# Find your frontend pod # Forward the port to localhost
export FRONTEND_POD=$(kubectl get pods -n ${KUBE_NS} | grep "${DEPLOYMENT_NAME}-frontend" | sort -k1 | tail -n1 | awk '{print $1}') kubectl port-forward svc/$DEPLOYMENT_NAME-frontend 8000:8000 -n ${KUBE_NS}
# Forward the pod's port to localhost
kubectl port-forward pod/$FRONTEND_POD 8000:8000 -n ${KUBE_NS}
# Test the API endpoint # Test the API endpoint
curl localhost:8000/v1/chat/completions \ curl localhost:8000/v1/chat/completions \
...@@ -274,5 +271,3 @@ curl localhost:8000/v1/chat/completions \ ...@@ -274,5 +271,3 @@ curl localhost:8000/v1/chat/completions \
"max_tokens": 30 "max_tokens": 30
}' }'
``` ```
For more details on managing deployments, testing, and troubleshooting, please refer to the [Operator Deployment Guide](../../docs/guides/dynamo_deploy/operator_deployment.md).
<!--
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Building Dynamo (`dynamo build`)
This guide explains how to use the `dynamo build` command to containerize Dynamo inference graphs (pipelines) for deployment.
`dynamo build` is a command-line tool that helps containerize inference graphs created with the Dynamo SDK.
Run `dynamo build --containerize` to build a stand-alone Docker container that encapsulates your entire inference graph.
The generated container image can then be shared and/or run standalone.
> [!Caution]
> This experimental feature has been tested on the examples in the `examples/` directory.
> You may need to make some modifications for your own inference graphs.
> Pay particular attention if your inference graph introduces custom dependencies.
## Building a containerized inference graph
The basic workflow for using `dynamo build` includes:
1. Defining your inference graph and testing locally with `dynamo serve`.
2. Specifying a base image for your inference graph. More on this below.
3. Running `dynamo build` to build a containerized inference graph.
### Basic Usage
```bash
dynamo build <graph_definition> --containerize
```
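Before running the build, it can help to confirm that the tools it relies on are available. This is a minimal sketch that follows the same `command -v` pattern as the `deploy.sh` pre-flight check; `missing_tools` is a hypothetical helper, and the assumption that the `docker` CLI must be on `PATH` is inferred from the fact that the command produces a Docker container.

```bash
# Hypothetical helper: print each named tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  local rc=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Example: warn about missing tools before building.
if ! missing="$(missing_tools dynamo docker)"; then
  echo "Missing required tools: $missing" >&2
fi
```

If nothing is reported, continue with `dynamo build <graph_definition> --containerize` as shown above.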
...@@ -36,7 +36,8 @@ The Dynamo Cloud Platform (`deploy/cloud/`) provides a managed deployment experi ...@@ -36,7 +36,8 @@ The Dynamo Cloud Platform (`deploy/cloud/`) provides a managed deployment experi
For detailed instructions on using the Dynamo Cloud Platform, see: For detailed instructions on using the Dynamo Cloud Platform, see:
- [Dynamo Cloud Platform Guide](dynamo_cloud.md): walks through installing and configuring the Dynamo cloud components on your Kubernetes cluster. - [Dynamo Cloud Platform Guide](dynamo_cloud.md): walks through installing and configuring the Dynamo cloud components on your Kubernetes cluster.
- [Operator Deployment Guide](operator_deployment.md) - [Dynamo Operator Guide](dynamo_operator.md)
### Manual Deployment with Helm Charts ### Manual Deployment with Helm Charts
......
...@@ -24,7 +24,6 @@ The Dynamo Cloud platform is a comprehensive solution for deploying and managing ...@@ -24,7 +24,6 @@ The Dynamo Cloud platform is a comprehensive solution for deploying and managing
The Dynamo cloud platform consists of several key components: The Dynamo cloud platform consists of several key components:
- **Dynamo Operator**: A Kubernetes operator that manages the lifecycle of Dynamo inference graphs from build ➡️ deploy. For more information on the operator, see [Dynamo Kubernetes Operator Documentation](../dynamo_deploy/dynamo_operator.md) - **Dynamo Operator**: A Kubernetes operator that manages the lifecycle of Dynamo inference graphs from build ➡️ deploy. For more information on the operator, see [Dynamo Kubernetes Operator Documentation](../dynamo_deploy/dynamo_operator.md)
- **API Store**: Stores and manages service configurations and metadata related to Dynamo deployments. Needs to be exposed externally.
- **Custom Resources**: Kubernetes custom resources for defining and managing Dynamo services - **Custom Resources**: Kubernetes custom resources for defining and managing Dynamo services
These components work together to provide a seamless deployment experience, handling everything from containerization to scaling and monitoring. These components work together to provide a seamless deployment experience, handling everything from containerization to scaling and monitoring.
...@@ -42,8 +41,30 @@ Before getting started with the Dynamo cloud platform, ensure you have: ...@@ -42,8 +41,30 @@ Before getting started with the Dynamo cloud platform, ensure you have:
- `kubectl` configured to access your cluster - `kubectl` configured to access your cluster
- Helm installed (version 3.0 or later) - Helm installed (version 3.0 or later)
```{tip}
Don't have a Kubernetes cluster? Check out our [Minikube setup guide](./minikube.md) to set up a local environment! > [!TIP]
> Don't have a Kubernetes cluster? Check out our [Minikube setup guide](../../../docs/guides/dynamo_deploy/minikube.md) to set up a local environment! 🏠
#### 🏗️ Build Dynamo inference runtime.
[One-time Action]
Before you could use Dynamo make sure you have setup the Inference Runtime Image.
For basic cases you could use the prebuilt image for the Dynamo Inference Runtime.
Just export the environment variable. This will be the image used by your individual components. You pick whatever dynamo version you want or use the latest (default)
```bash
export DYNAMO_IMAGE=nvcr.io/nvidia/dynamo:latest-vllm
```
For advanced examples make sure you have first built and pushed to your registry Dynamo Base Image for Dynamo inference runtime. This is a one-time operation.
```bash
# Run the script to build the default dynamo:latest-vllm image.
./container/build.sh
export IMAGE_TAG=<TAG>
# retag the image
docker tag dynamo:latest-vllm <your-registry>/dynamo:${IMAGE_TAG}
docker push <your-registry>/dynamo:${IMAGE_TAG}
``` ```
## Building Docker Images for Dynamo Cloud Components ## Building Docker Images for Dynamo Cloud Components
...@@ -61,9 +82,12 @@ export DOCKER_SERVER=<CONTAINER_REGISTRY> ...@@ -61,9 +82,12 @@ export DOCKER_SERVER=<CONTAINER_REGISTRY>
export IMAGE_TAG=<TAG> export IMAGE_TAG=<TAG>
``` ```
Where: As a description of the placeholders:
- `<CONTAINER_REGISTRY>`: Your container registry (e.g., `nvcr.io`, `docker.io/<your-username>`, etc.) - `<CONTAINER_REGISTRY>`: Your container registry (e.g., `nvcr.io`, `docker.io/<your-username>`, etc.)
- `<TAG>`: The version tag for your images (e.g., `latest`, `0.0.1`, `v1.0.0`) - `<TAG>`: The tag you want to use for the images of the Dynamo cloud components (e.g., `latest`, `0.0.1`, etc.)
If the runtime image tag is not explicitly set, the default is the `latest`.
The tag will go into the dynamo-operator:<IMAGE_TAG> image for the Operator. The runtime (base) image handles the inference toolchain and the sdk and built by the (`build.sh`). The tags do not have to match the runtime image tag but the images must be compatible.
**Important** Make sure you're logged in to your container registry before pushing images. For example: **Important** Make sure you're logged in to your container registry before pushing images. For example:
...@@ -79,7 +103,7 @@ You can build and push all platform components at once: ...@@ -79,7 +103,7 @@ You can build and push all platform components at once:
earthly --push +all-docker --DOCKER_SERVER=$DOCKER_SERVER --IMAGE_TAG=$IMAGE_TAG earthly --push +all-docker --DOCKER_SERVER=$DOCKER_SERVER --IMAGE_TAG=$IMAGE_TAG
``` ```
## Deploying the Dynamo Cloud Platform ### 🚀 Deploying the Dynamo Cloud Platform
Once you've built and pushed the components, you can deploy the platform to your Kubernetes cluster. Once you've built and pushed the components, you can deploy the platform to your Kubernetes cluster.
...@@ -87,7 +111,18 @@ Once you've built and pushed the components, you can deploy the platform to your ...@@ -87,7 +111,18 @@ Once you've built and pushed the components, you can deploy the platform to your
Before deploying Dynamo Cloud, ensure your Kubernetes cluster meets the following requirements: Before deploying Dynamo Cloud, ensure your Kubernetes cluster meets the following requirements:
#### PVC Support with Default Storage Class #### 1. 🛡️ Istio Installation
Dynamo Cloud requires Istio for service mesh capabilities. Verify Istio is installed and running:
```bash
# Check if Istio is installed
kubectl get pods -n istio-system
# Expected output should show running Istio pods
# istiod-* pods should be in Running state
```
#### 2. 💾 PVC Support with Default Storage Class
Dynamo Cloud requires Persistent Volume Claim (PVC) support with a default storage class. Verify your cluster configuration: Dynamo Cloud requires Persistent Volume Claim (PVC) support with a default storage class. Verify your cluster configuration:
```bash ```bash
...@@ -100,21 +135,19 @@ kubectl get storageclass ...@@ -100,21 +135,19 @@ kubectl get storageclass
# standard (default) kubernetes.io/gce-pd Delete Immediate true 1d # standard (default) kubernetes.io/gce-pd Delete Immediate true 1d
``` ```
### Cloud Provider-Specific deployment
#### Google Kubernetes Engine (GKE) deployment
You can find detailed instructions for deployment in GKE [here](../dynamo_deploy/gke_setup.md)
### Installation ### Installation using the helper script
1. Set the required environment variables: 1. Set the required environment variables:
```bash ```bash
export PROJECT_ROOT=$(pwd)
export DOCKER_USERNAME=<your-docker-username> export DOCKER_USERNAME=<your-docker-username>
export DOCKER_PASSWORD=<your-docker-password> export DOCKER_PASSWORD=<your-docker-password>
export DOCKER_SERVER=<your-docker-server> export DOCKER_SERVER=<your-docker-server>
export IMAGE_TAG=<TAG> # Use the same tag you used when building the images export IMAGE_TAG=<TAG> # Use the same tag you used when building the images
export NAMESPACE=dynamo-cloud # change this to whatever you want! export NAMESPACE=dynamo-cloud # change this to whatever you want!
export DYNAMO_INGRESS_SUFFIX=dynamo-cloud.com # change this to whatever you want!
``` ```
``` {note} ``` {note}
...@@ -124,12 +157,14 @@ A docker image pull secret is created automatically if these variables are set. ...@@ -124,12 +157,14 @@ A docker image pull secret is created automatically if these variables are set.
The Dynamo Cloud Platform auto-generates docker images for pipelines and pushes them to a container registry. The Dynamo Cloud Platform auto-generates docker images for pipelines and pushes them to a container registry.
By default, the platform uses the same container registry as the platform components (specified by `DOCKER_SERVER`). By default, the platform uses the same container registry as the platform components (specified by `DOCKER_SERVER`).
However, you can specify a different container registry for pipelines by additionally setting the following environment variables: However, you can use a different container registry for the platform components by making sure an associated kubernetes secret is present:
```bash ```bash
export PIPELINES_DOCKER_SERVER=<your-docker-server> kubectl create secret docker-registry dynamo-components-imagepullsecret \
export PIPELINES_DOCKER_USERNAME=<your-docker-username> --docker-server=<docker-registry-for-dynamo-components> \
export PIPELINES_DOCKER_PASSWORD=<your-docker-password> --docker-username=<username> \
--docker-password=<password> \
--namespace=${NAMESPACE}
``` ```
If you wish to expose your Dynamo Cloud Platform externally, you can setup the following environment variables: If you wish to expose your Dynamo Cloud Platform externally, you can setup the following environment variables:
...@@ -168,48 +203,17 @@ if you want guidance during the process, run the deployment script with the `--i ...@@ -168,48 +203,17 @@ if you want guidance during the process, run the deployment script with the `--i
omitting `--crds` will skip the CRDs installation/upgrade. This is useful when installing on a shared cluster as CRDs are cluster-scoped resources. omitting `--crds` will skip the CRDs installation/upgrade. This is useful when installing on a shared cluster as CRDs are cluster-scoped resources.
4. **Expose Dynamo Cloud Externally** If you'd like to only generate the generated-values.yaml file without deploying to Kubernetes (e.g., for inspection, CI workflows, or dry-run testing), use:
``` {note}
The script automatically displays information about the endpoint that you can use to access Dynamo Cloud. We refer to this externally available endpoint as `DYNAMO_CLOUD`.
```
The simplest way to expose the `dynamo-store` service within the namespace externally is to use a port-forward:
```bash ```bash
kubectl port-forward svc/dynamo-store <local-port>:80 -n $NAMESPACE ./deploy_dynamo_cloud.py --yaml-only
export DYNAMO_CLOUD=http://localhost:<local-port>
``` ```
## Next Steps
After deploying the Dynamo cloud platform, you can:
1. Deploy your first inference graph using the [Dynamo CLI](operator_deployment.md)
2. Deploy Dynamo LLM pipelines to Kubernetes using the [Dynamo CLI](../../examples/llm_deployment.md)
3. Manage your deployments using the Dynamo CLI
For more detailed information about deploying inference graphs, see the [Dynamo Deploy Guide](README.md).
### Installation using published helm chart ### Installation using published helm chart
To install Dynamo Cloud using the published Helm chart, you'll need to configure Docker registry credentials and image settings. The chart supports both direct credential configuration and existing Kubernetes secrets. To install Dynamo Cloud using the published Helm chart, you'll need to configure Docker registry credentials and image settings.
#### Configuration Options
You have two options for providing Docker registry credentials:
**Option 1: Direct Credentials** (Simpler for testing)
- Provide username and password directly via Helm values
- Credentials are stored in a Kubernetes secret created by the chart
**Option 2: Existing Secret** (Recommended for production)
- Use an existing Kubernetes secret containing Docker registry credentials
- More secure and follows Kubernetes best practices
#### Environment Setup #### Environment Setup
...@@ -217,23 +221,16 @@ Set the required environment variables: ...@@ -217,23 +221,16 @@ Set the required environment variables:
```bash ```bash
# Docker registry configuration # Docker registry configuration
export DOCKER_SERVER="your-registry.com" # Docker registry server where images of dynamo cloud services (api-server and operator) are available export DOCKER_SERVER="your-registry.com" # Docker registry server where images of dynamo cloud services (operator) are available
export IMAGE_TAG="v1.0.0" # Image tag to deploy export IMAGE_TAG="v1.0.0" # Image tag to deploy
export NAMESPACE="dynamo-cloud" # Target namespace export NAMESPACE="dynamo-cloud" # Target namespace
# Pipeline-specific Docker registry (can be different from DOCKER_SERVER) # Components-specific Docker registry (if different from DOCKER_SERVER)
export PIPELINES_DOCKER_SERVER="your-pipeline-registry.com" # Registry for pipeline images export COMPONENTS_DOCKER_SERVER="your-pipeline-registry.com" # Registry for Dynamo components images
# Option 1: Direct credentials
export PIPELINES_DOCKER_USERNAME="your-username"
export PIPELINES_DOCKER_PASSWORD="your-password"
# Option 2: Existing secret (recommended)
export PIPELINES_DOCKER_CREDS_SECRET="my-docker-secret" # Name of existing secret
# Note: If not specified, the chart will look for a secret named "dynamo-regcred"
# Image pull secret for the operator itself # Image pull secret for the operator itself
export DOCKER_SECRET_NAME="my-pull-secret" # Secret for pulling images of dynamo cloud services (api-server and operator) operator images export DOCKER_SECRET_NAME="my-pull-secret" # Secret for pulling images of dynamo cloud services (operator) operator images
export COMPONENTS_DOCKER_SECRET_NAME="my-components-pull-secret" # Secret for pulling images of dynamo components images (if needed)
``` ```
you can easily create an image pull secret with the following command : you can easily create an image pull secret with the following command :
...@@ -241,9 +238,17 @@ you can easily create an image pull secret with the following command : ...@@ -241,9 +238,17 @@ you can easily create an image pull secret with the following command :
```bash ```bash
kubectl create secret docker-registry ${DOCKER_SECRET_NAME} \ kubectl create secret docker-registry ${DOCKER_SECRET_NAME} \
--docker-server=${DOCKER_SERVER} \ --docker-server=${DOCKER_SERVER} \
--docker-username=${DOCKER_USERNAME} \ --docker-username=<docker-server-username> \
--docker-password=${DOCKER_PASSWORD} \ --docker-password=<docker-server-password> \
--namespace=${NAMESPACE}
# Only if using a different registry for Dynamo components
kubectl create secret docker-registry ${COMPONENTS_DOCKER_SECRET_NAME} \
--docker-server=${COMPONENTS_DOCKER_SERVER} \
--docker-username=<components-docker-server-username> \
--docker-password=<components-docker-server-password> \
--namespace=${NAMESPACE} --namespace=${NAMESPACE}
``` ```
#### Installation Commands #### Installation Commands
...@@ -259,39 +264,28 @@ helm install dynamo-crds dynamo-crds-helm-chart.tgz \ ...@@ -259,39 +264,28 @@ helm install dynamo-crds dynamo-crds-helm-chart.tgz \
**Step 2: Install Dynamo Platform** **Step 2: Install Dynamo Platform**
Choose one of the following approaches based on your credential configuration: Run the following helm command:
**Using Direct Credentials:**
```bash ```bash
helm install dynamo-platform dynamo-platform-helm-chart.tgz \ helm install dynamo-platform dynamo-platform-helm-chart.tgz \
--namespace ${NAMESPACE} \ --namespace ${NAMESPACE} \
--create-namespace \
--set "dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator" \ --set "dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator" \
--set "dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG}" \ --set "dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG}" \
--set "dynamo-operator.imagePullSecrets[0].name=${DOCKER_SECRET_NAME}" \ --set "dynamo-operator.imagePullSecrets[0].name=${DOCKER_SECRET_NAME}"
--set "dynamo-operator.dynamo.dockerRegistry.server=${PIPELINES_DOCKER_SERVER:-$DOCKER_SERVER}" \
--set "dynamo-operator.dynamo.dockerRegistry.username=${PIPELINES_DOCKER_USERNAME}" \
--set "dynamo-operator.dynamo.dockerRegistry.password=${PIPELINES_DOCKER_PASSWORD}" \
--set "dynamo-api-store.image.repository=${DOCKER_SERVER}/dynamo-api-store" \
--set "dynamo-api-store.image.tag=${IMAGE_TAG}" \
--set "dynamo-api-store.imagePullSecrets[0].name=${DOCKER_SECRET_NAME}"
``` ```
**Using Existing Secret (Recommended):** ### Cloud Provider-Specific deployment
```bash
helm install dynamo-platform dynamo-platform-helm-chart.tgz \ #### Google Kubernetes Engine (GKE) deployment
--namespace ${NAMESPACE} \
--create-namespace \ You can find detailed instructions for deployment in GKE [here](../dynamo_deploy/gke_setup.md)
--set "dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator" \
--set "dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG}" \
--set "dynamo-operator.imagePullSecrets[0].name=${DOCKER_SECRET_NAME}" \
--set "dynamo-operator.dynamo.dockerRegistry.server=${PIPELINES_DOCKER_SERVER:-$DOCKER_SERVER}" \
--set "dynamo-operator.dynamo.dockerRegistry.existingSecretName=${PIPELINES_DOCKER_CREDS_SECRET:-dynamo-regcred}" \
--set "dynamo-api-store.image.repository=${DOCKER_SERVER}/dynamo-api-store" \
--set "dynamo-api-store.image.tag=${IMAGE_TAG}" \
--set "dynamo-api-store.imagePullSecrets[0].name=${DOCKER_SECRET_NAME}"
```
[!Note] ## Next Steps
- If `PIPELINES_DOCKER_SERVER` is not set, it defaults to `DOCKER_SERVER`
- If `PIPELINES_DOCKER_CREDS_SECRET` is not set, the chart will look for a secret named `dynamo-regcred` After deploying the Dynamo cloud platform, you can:
1. Deploy your first inference graph using the [Dynamo CLI](operator_deployment.md)
2. Deploy Dynamo LLM graphs to Kubernetes using the [Dynamo CLI](../../examples/llm_deployment.md)
3. Manage your deployments using the Dynamo CLI
For more detailed information about deploying inference graphs, see the [Dynamo Deploy Guide](README.md).
...@@ -12,7 +12,6 @@ Dynamo operator is a Kubernetes operator that simplifies the deployment, configu ...@@ -12,7 +12,6 @@ Dynamo operator is a Kubernetes operator that simplifies the deployment, configu
- **Controllers:** - **Controllers:**
- `DynamoGraphDeploymentController`: Watches `DynamoGraphDeployment` CRs and orchestrates graph deployments. - `DynamoGraphDeploymentController`: Watches `DynamoGraphDeployment` CRs and orchestrates graph deployments.
- `DynamoComponentDeploymentController`: Watches `DynamoComponentDeployment` CRs and handles individual component deployments. - `DynamoComponentDeploymentController`: Watches `DynamoComponentDeployment` CRs and handles individual component deployments.
- `DynamoComponentController`: Watches `DynamoComponent` CRs and manages image builds and artifact tracking.
- **Workflow:** - **Workflow:**
1. A custom resource is created by the user or API server. 1. A custom resource is created by the user or API server.
...@@ -29,8 +28,8 @@ Dynamo operator is a Kubernetes operator that simplifies the deployment, configu ...@@ -29,8 +28,8 @@ Dynamo operator is a Kubernetes operator that simplifies the deployment, configu
| Field | Type | Description | Required | Default | | Field | Type | Description | Required | Default |
|------------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------|----------|---------| |------------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------|----------|---------|
| `dynamoComponent`| string | Reference to the dynamoComponent identifier | Yes | | | `services` | map | Map of service names to runtime configurations. This allows the user to override the service configuration defined in the DynamoComponentDeployment. | Yes | |
| `services` | map | Map of service names to runtime configurations. This allows the user to override the service configuration defined in the DynamoComponentDeployment. | No | | | `envs` | list | list of global environment variables. | No | |
**API Version:** `nvidia.com/v1alpha1` **API Version:** `nvidia.com/v1alpha1`
...@@ -43,7 +42,6 @@ kind: DynamoGraphDeployment ...@@ -43,7 +42,6 @@ kind: DynamoGraphDeployment
metadata: metadata:
name: disagg name: disagg
spec: spec:
dynamoComponent: frontend:jh2o6dqzpsgfued4
envs: envs:
- name: GLOBAL_ENV_VAR - name: GLOBAL_ENV_VAR
value: some_global_value value: some_global_value
...@@ -80,11 +78,8 @@ spec: ...@@ -80,11 +78,8 @@ spec:
| Field | Type | Description | Required | Default | | Field | Type | Description | Required | Default |
|--------------------|----------|---------------------------------------------------------------|----------|---------| |--------------------|----------|---------------------------------------------------------------|----------|---------|
| `dynamoNamespace` | string | Namespace of the DynamoComponent | Yes | | | `dynamoNamespace` | string | Namespace of the DynamoComponent | Yes | |
| `dynamoComponent` | string | Name of the dynamoComponent artifact | Yes | |
| `dynamoTag` | string | FQDN of the service to run | Yes | |
| `serviceName` | string | Logical name of the service being deployed | Yes | | | `serviceName` | string | Logical name of the service being deployed | Yes | |
| `envs` | array | Environment variables for runtime | No | `[]` | | `envs` | array | Environment variables for runtime | No | `[]` |
| `externalServices`| map | External service dependencies | No | |
| `annotations` | map | Additional metadata annotations for the pod | No | | | `annotations` | map | Additional metadata annotations for the pod | No | |
| `labels` | map | Custom labels applied to the deployment and pod | No | | | `labels` | map | Custom labels applied to the deployment and pod | No | |
| `resources` | object | Resource limits and requests (CPU, memory, GPU) | No | | | `resources` | object | Resource limits and requests (CPU, memory, GPU) | No | |
...@@ -109,8 +104,6 @@ metadata: ...@@ -109,8 +104,6 @@ metadata:
name: test-41fa991-vllmworker name: test-41fa991-vllmworker
spec: spec:
dynamoNamespace: dynamo dynamoNamespace: dynamo
dynamoComponent: frontend:jh2o6dqzpsgfued4
dynamoTag: graphs.disagg:Frontend
envs: envs:
- name: DYN_DEPLOYMENT_CONFIG - name: DYN_DEPLOYMENT_CONFIG
value: '<long JSON config>' value: '<long JSON config>'
...@@ -130,35 +123,6 @@ spec: ...@@ -130,35 +123,6 @@ spec:
serviceName: Frontend serviceName: Frontend
``` ```
### CRD: `DynamoComponent`
| Field | Type | Description | Required | Default |
|---------------------------------|--------------------------|--------------------------------------------------------------------------------------|----------|---------|
| `dynamoComponent` | string | Name of the dynamoComponent artifact | Yes | |
| `image` | string | Custom container image. If not specified, an image will be built | No | |
| `imageBuildTimeout` | Duration | Timeout duration for the image building process | No | |
| `buildArgs` | []string | Additional arguments to pass to the container image build process | No | |
| `imageBuilderExtraPodMetadata` | ExtraPodMetadata | Additional metadata to add to the image builder pod | No | |
| `imageBuilderExtraPodSpec` | ExtraPodSpec | Additional pod spec configurations for the image builder pod | No | |
| `imageBuilderExtraContainerEnv` | []EnvVar | Additional environment variables for the image builder container | No | |
| `imageBuilderContainerResources`| ResourceRequirements | Resource requirements (CPU, memory) for the image builder container | No | |
| `imagePullSecrets` | []LocalObjectReference | Secrets required for pulling private container images | No | |
| `dockerConfigJsonSecretName` | string | Name of the secret containing Docker registry credentials | No | |
| `downloaderContainerEnvFrom` | []EnvFromSource | Environment variables to be sourced for the downloader container | No | |
**API Version:** `nvidia.com/v1alpha1`
**Scope:** Namespaced
#### Example
```yaml
apiVersion: nvidia.com/v1alpha1
kind: DynamoComponent
metadata:
name: frontend--jh2o6dqzpsgfued4
spec:
dynamoComponent: frontend:jh2o6dqzpsgfued4
```
## Installation ## Installation
[See installation steps](dynamo_cloud.md#overview) [See installation steps](dynamo_cloud.md#overview)
...@@ -214,7 +178,6 @@ kind: DynamoGraphDeployment ...@@ -214,7 +178,6 @@ kind: DynamoGraphDeployment
metadata: metadata:
name: llm-agg name: llm-agg
spec: spec:
dynamoComponent: frontend:jh2o6dqzpsgfued4 # Use the tag from Step 1
services: services:
Frontend: Frontend:
replicas: 1 replicas: 1
...@@ -245,25 +208,9 @@ Commit and push this file to your Git repository. FluxCD will detect the new CR ...@@ -245,25 +208,9 @@ Commit and push this file to your Git repository. FluxCD will detect the new CR
### Step 3: Update Existing Deployment ### Step 3: Update Existing Deployment
To update your pipeline: To update your pipeline, just update the associated DynamoGraphDeployment CRD
1. Build and push a new version of your pipeline:
```bash
DYNAMO_TAG=$(dynamo build --push graphs.agg:Frontend | grep "Successfully built" | awk '{ print $NF }' | sed 's/\.$//')
```
2. Update the `dynamoComponent` field in your CR with the new tag:
```yaml
spec:
dynamoComponent: frontend:new_tag_here # Update with new tag from Step 1
```
3. Commit and push the changes to your Git repository.
The Dynamo operator will: The Dynamo operator will automatically reconcile it.
- Detect the updated CR
- Build new container images for the updated components
- Perform a rolling update of the deployments when the new images are ready and the components are ready to serve traffic
- Preserve existing PVCs and their data
### Monitoring the Deployment ### Monitoring the Deployment
...@@ -302,13 +249,6 @@ kubectl get dynamocomponentdeployment -n $KUBE_NS ...@@ -302,13 +249,6 @@ kubectl get dynamocomponentdeployment -n $KUBE_NS
- **Status Management:** - **Status Management:**
- `.status.conditions`: Reflects readiness, failure, progress states - `.status.conditions`: Reflects readiness, failure, progress states
### DynamoComponent
- **Actions:**
- Create a job to build the docker image
- **Status Management:**
- `.status.conditions`: Reflects readiness, failure, progress states
## Configuration ## Configuration
...@@ -316,18 +256,7 @@ kubectl get dynamocomponentdeployment -n $KUBE_NS ...@@ -316,18 +256,7 @@ kubectl get dynamocomponentdeployment -n $KUBE_NS
| Name | Description | Default | | Name | Description | Default |
|----------------------------------------------------|--------------------------------------|--------------------------------------------------------| |----------------------------------------------------|--------------------------------------|--------------------------------------------------------|
| `ADD_NAMESPACE_PREFIX_TO_IMAGE_NAME` | Adds namespace prefix to image names | `false` |
| `DYNAMO_IMAGE_BUILD_ENGINE` | Engine used for building images | `buildkit` |
| `BUILDKIT_URL` | BuildKit daemon URL | `tcp://dynamo-platform-dynamo-operator-buildkitd:1234` |
| `DOCKER_REGISTRY_DYNAMO_COMPONENTS_REPOSITORY_NAME`| Repository name for dynamo images | `dynamo-components` |
| `DOCKER_REGISTRY_SECURE` | Use secure connection for registry | `true` |
| `DOCKER_REGISTRY_SERVER` | Docker registry server address | `nvcr.io/nvidian/dynamo` |
| `DOCKER_REGISTRY_USERNAME` | Registry authentication username | `username` |
| `ESTARGZ_ENABLED` | Enable eStargz image optimization | `false` |
| `INTERNAL_IMAGES_BUILDKIT` | BuildKit image | `moby/buildkit:v0.20.2` |
| `LOG_LEVEL` | Logging verbosity level | `info` | | `LOG_LEVEL` | Logging verbosity level | `info` |
| `API_STORE_ENDPOINT` | Api store service endpoint | `http://dynamo-store` |
| `DYNAMO_IMAGE_BUILDER_NAMESPACE` | Namespace for image building | `dynamo` |
| `DYNAMO_SYSTEM_NAMESPACE` | System namespace | `dynamo` | | `DYNAMO_SYSTEM_NAMESPACE` | System namespace | `dynamo` |
- **Flags:** - **Flags:**
......
...@@ -136,23 +136,16 @@ dynamo-operator: ...@@ -136,23 +136,16 @@ dynamo-operator:
iam.gke.io/gcp-service-account: your-sa@your-gcp-project.iam.gserviceaccount.com iam.gke.io/gcp-service-account: your-sa@your-gcp-project.iam.gserviceaccount.com
... ...
dynamo: dynamo:
dockerRegistry:
useKubernetesSecret: false
server: us-central1-docker.pkg.dev/your-project/your-registry
components: components:
serviceAccount: serviceAccount:
annotations: annotations:
iam.gke.io/gcp-service-account: your-sa@your-gcp-project.iam.gserviceaccount.com iam.gke.io/gcp-service-account: your-sa@your-gcp-project.iam.gserviceaccount.com
imageBuilder:
serviceAccount:
annotations:
iam.gke.io/gcp-service-account: your-sa@your-gcp-project.iam.gserviceaccount.com
... ...
.... ....
``` ```
You can use it during helm installation (last step of deploy.sh) You can use it during helm installation (last step of /deploy/cloud/helm/deploy.sh)
```bash ```bash
helm upgrade --install ${RELEASE} platform/ -f values.yaml --namespace ${NAMESPACE} helm upgrade --install ${RELEASE} platform/ -f values.yaml --namespace ${NAMESPACE}
......
...@@ -178,7 +178,6 @@ kind: DynamoGraphDeployment ...@@ -178,7 +178,6 @@ kind: DynamoGraphDeployment
metadata: metadata:
name: model-caching name: model-caching
spec: spec:
dynamoGraph: "frontend:3x6rl5b3gcnf5skh"
envs: envs:
- name: HF_HOME - name: HF_HOME
value: /model value: /model
...@@ -275,7 +274,6 @@ kind: DynamoGraphDeployment ...@@ -275,7 +274,6 @@ kind: DynamoGraphDeployment
metadata: metadata:
name: my-hello-world name: my-hello-world
spec: spec:
dynamoGraph: frontend:214c1690
envs: envs:
- name: DYN_LOG - name: DYN_LOG
value: "debug" value: "debug"
......
<!--
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform
This guide walks you through deploying an inference graph created with the Dynamo SDK onto a Kubernetes cluster using the Dynamo cloud platform and the Dynamo deploy CLI. The Dynamo cloud platform provides a streamlined experience for deploying and managing your inference services.
## Prerequisites
Before proceeding with deployment, ensure you have:
- [Dynamo Python package](../../get_started.md#alternative-setup-manual-installation) installed
- A Kubernetes cluster with the [Dynamo cloud platform](dynamo_cloud.md) installed
- Ubuntu 24.04 as the base image for your services
- Required dependencies:
- Helm package manager
- Rust packages and toolchain
Before you begin, follow the instructions in [deploy/dynamo/helm/README.md](https://github.com/ai-dynamo/dynamo/blob/main/deploy/dynamo/cloud/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
## Understanding the Deployment Process
The deployment process involves two main steps:
1. **Local Build (`dynamo build`)**
- Creates a Dynamo service archive containing:
- Service code and dependencies
- Service configuration and metadata
- Runtime requirements
- Service graph definition
- This archive is used as input for the remote build process
2. **Remote Image Build**
- A `yatai-dynamonim-image-builder` pod is created in your cluster
- This pod:
- Takes the Dynamo service archive
- Containerizes it using the specified base image
- Pushes the final container image to your cluster's registry
- The build process is managed by the Dynamo operator
## Deployment Steps
### 1. Configure Environment Variables
First, set up your environment variables for working with Dynamo Cloud. You have two options for accessing the `dynamo-store` service:
#### Option 1: Using Port-Forward (Local Development)
This is the simplest approach for local development and testing:
```bash
# Set your project root directory
export PROJECT_ROOT=$(pwd)
# Set your Kubernetes namespace (must match the namespace where Dynamo cloud is installed)
export KUBE_NS=hello-world
# In a separate terminal, run port-forward to expose the dynamo-store service locally
kubectl port-forward svc/dynamo-store 8080:80 -n $KUBE_NS
# Set DYNAMO_CLOUD to use the local port-forward endpoint
export DYNAMO_CLOUD=http://localhost:8080
```
#### Option 2: Using Ingress/VirtualService (Production)
For production environments, you should use proper ingress configuration:
```bash
# Set your project root directory
export PROJECT_ROOT=$(pwd)
# Set your Kubernetes namespace (must match the namespace where Dynamo cloud is installed)
export KUBE_NS=hello-world
# Set DYNAMO_CLOUD to your externally accessible endpoint
# This could be your Ingress hostname or VirtualService URL
export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com # Replace with your actual endpoint
```
``` {note}
The `DYNAMO_CLOUD` environment variable is required for all Dynamo deployment commands. Make sure it's set before running any deployment operations.
```
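Because every deployment command depends on these variables, it can help to check them up front. The following is a minimal sketch; `require_env` is a hypothetical helper and not part of the Dynamo CLI.

```bash
# Hypothetical helper (not part of the Dynamo CLI): fail fast when a
# required environment variable is unset or empty.
require_env() {
  local name value missing
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (portable across sh and bash).
    eval "value=\${${name}:-}"
    if [ -z "$value" ]; then
      echo "error: ${name} is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, before running any deployment command:
#   require_env DYNAMO_CLOUD KUBE_NS || exit 1
```

The function reports every missing variable before returning, so a single run shows everything that still needs to be exported.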
### 2. Build the Dynamo Base Image
Before building your service, you need to ensure the base image is properly set up:
1. For detailed instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../get_started.md#building-the-dynamo-base-image) section in the main README.
2. Export the image from the previous step to your environment.
```bash
# Export the image from the previous step to your environment
export DYNAMO_IMAGE=<your-registry>/<your-image-name>:<your-tag>
# Navigate to your project directory
cd $PROJECT_ROOT/examples/hello_world
# Build the service and capture the tag
DYNAMO_TAG=$(dynamo build hello_world:Frontend | grep "Successfully built" | awk '{ print $3 }' | sed 's/\.$//')
```
### 3. Deploy to Kubernetes
TODO: Deploy your service using a DynamoGraphDeployment CR.
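Until this step is documented, the sketch below shows one possible shape for it. It assumes the `nvidia.com/v1alpha1` API version and the `services` map used by the `DynamoGraphDeployment` examples elsewhere in these docs; `render_graph_deployment` is a hypothetical helper, and the `Frontend` service name and replica count are illustrative. A real deployment will likely need additional per-service fields.

```bash
# Hypothetical helper: print a minimal DynamoGraphDeployment manifest.
# The "Frontend" service name and the replica count are illustrative only.
render_graph_deployment() {
  local name="$1"
  local replicas="${2:-1}"
  cat <<EOF
apiVersion: nvidia.com/v1alpha1
kind: DynamoGraphDeployment
metadata:
  name: ${name}
spec:
  services:
    Frontend:
      replicas: ${replicas}
EOF
}

# Usage (requires cluster access):
#   render_graph_deployment hello-world | kubectl apply -n "$KUBE_NS" -f -
```

Printing the manifest to standard output keeps the helper inspectable: you can review or diff the YAML before piping it to `kubectl apply`.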
#### Managing Deployments
Once you have deployments running, you can manage them using the following commands:
To see a list of all deployments in your namespace:
```bash
dynamo deployment list
```
This command displays a table of all deployments.
To get detailed information about a specific deployment:
```bash
dynamo deployment get $DEPLOYMENT_NAME
```
To update a specific deployment:
```bash
dynamo deployment update $DEPLOYMENT_NAME [--config-file FILENAME] [--env ENV_VAR]
```
To remove a deployment and all its associated resources:
```bash
dynamo deployment delete $DEPLOYMENT_NAME
```
```{warning}
This command permanently deletes the deployment and all associated resources. Make sure you have any necessary backups before proceeding.
```
### 4. Test the Deployment
The deployment process creates several pods:
1. A `yatai-dynamonim-image-builder` pod for building the container image
2. Service pods prefixed with `$DEPLOYMENT_NAME` once the build is complete
To test your deployment:
```bash
# Forward the service port to localhost
kubectl -n ${KUBE_NS} port-forward svc/${DEPLOYMENT_NAME}-frontend 3000:3000
# Test the API endpoint
curl -X 'POST' 'http://localhost:3000/generate' \
-H 'accept: text/event-stream' \
-H 'Content-Type: application/json' \
-d '{"text": "test"}'
```
## Expected Output
When you send a request with "test" as input, you'll see how the text flows through each service:
```
Frontend: Middle: Backend: test-mid-back
```
This demonstrates the service pipeline:
1. The Frontend receives "test"
2. The Middle service adds "-mid" to create "test-mid"
3. The Backend service adds "-back" to create "test-mid-back"
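The flow above can be mimicked locally with three shell functions. This is only an illustration of how the text is transformed at each hop, not the hello_world service code; the function names are hypothetical.

```bash
# Local illustration of the three-stage pipeline (not the actual services).
backend()  { echo "Backend: ${1}-back"; }               # appends "-back"
middle()   { echo "Middle: $(backend "${1}-mid")"; }    # appends "-mid", then calls backend
frontend() { echo "Frontend: $(middle "$1")"; }         # passes the input through to middle

frontend test
# prints: Frontend: Middle: Backend: test-mid-back
```

Each stage prefixes its own name to the response it gets back, which is why the prefixes appear in call order while the suffixes accumulate on the payload.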
## Using Kubernetes Secrets for Environment Variables
Dynamo supports securely injecting environment variables from Kubernetes secrets into your deployment. This is only supported when deploying with `--target kubernetes`.
### Creating a Secret
First, create a Kubernetes secret containing your sensitive values:
```bash
export HF_TOKEN=your_hf_token
kubectl create secret generic dynamo-env-secrets \
--from-literal=huggingface.token=$HF_TOKEN \
--from-literal=another_secret.key=value \
-n $KUBE_NS
```
### Referencing Secrets in Your Deployment
You can reference secret keys in your deployment using the `--env-from-secret` flag:
- `--env-from-secret HF_TOKEN=huggingface.token` will set the `HF_TOKEN` environment variable from the `huggingface.token` key in the secret.
- `--env-from-secret ANOTHER_SECRET=another_secret.key` will set the `ANOTHER_SECRET` environment variable from the `another_secret.key` key in the secret.
- You can also mix normal envs: `--env NORMAL_ENV_KEY=value`.
By default, Dynamo will look for a secret named `dynamo-env-secrets`. You can override this with the `--env-secrets-name` flag or the `DYNAMO_ENV_SECRETS` environment variable.
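To make the mapping concrete, the sketch below splits an `--env-from-secret` argument into its two parts and resolves the secret name from `DYNAMO_ENV_SECRETS`. Both functions are hypothetical illustrations, not the Dynamo CLI's implementation, and the `--env-secrets-name` flag is not modeled because its precedence relative to the environment variable is not stated here.

```bash
# Hypothetical helpers illustrating the mapping; not the Dynamo CLI's code.

# Resolve the secret name: DYNAMO_ENV_SECRETS if set, else the documented default.
secret_name() {
  echo "${DYNAMO_ENV_SECRETS:-dynamo-env-secrets}"
}

# Split "ENV_NAME=secret.key" at the first '=' and describe the mapping.
describe_env_from_secret() {
  local spec="$1"
  local env_name="${spec%%=*}"    # text before the first '='
  local secret_key="${spec#*=}"   # text after the first '='
  echo "${env_name} <- $(secret_name)/${secret_key}"
}

# Usage:
#   describe_env_from_secret HF_TOKEN=huggingface.token
```

The left side of the `=` is the variable name seen by the container, and the right side is the key looked up inside the secret.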
### Example Full Command
```bash
dynamo deploy $DYNAMO_TAG -n $DEPLOYMENT_NAME -f ./configs/agg.yaml \
--env NORMAL_ENV_KEY=value \
--env-from-secret HF_TOKEN=huggingface.token \
--env-from-secret ANOTHER_SECRET=another_secret.key \
--target kubernetes
```
# Quickstart
Before deploying your inference graphs, you need to install the Dynamo Inference Platform and Dynamo Cloud.
## 1. Installing from Published Artifacts
Use this approach when installing from pre-built helm charts and docker images published to NGC.
### Prerequisites
```bash
export NAMESPACE=dynamo-cloud
export RELEASE_VERSION=0.3.2
```
Install `envsubst`, `kubectl`, and `helm`.
### Authenticate with NGC
```bash
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia --username='$oauthtoken' --password=<YOUR_NGC_CLI_API_KEY>
```
### Fetch Helm Charts
```bash
# Fetch the CRDs helm chart
helm fetch https://helm.ngc.nvidia.com/nvidia/charts/dynamo-crds-v${RELEASE_VERSION}.tgz
# Fetch the platform helm chart
helm fetch https://helm.ngc.nvidia.com/nvidia/charts/dynamo-platform-v${RELEASE_VERSION}.tgz
```
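Both chart URLs follow the same pattern, so a small helper can build them from the chart name and `RELEASE_VERSION`. This is a convenience sketch; `chart_url` is a hypothetical function name, and the URL pattern is taken from the two `helm fetch` commands above.

```bash
# Hypothetical helper: build the NGC chart URL used by the fetch commands above.
chart_url() {
  local chart="$1"
  local version="${2:-${RELEASE_VERSION:-}}"   # default to RELEASE_VERSION when omitted
  echo "https://helm.ngc.nvidia.com/nvidia/charts/${chart}-v${version}.tgz"
}

# Usage:
#   helm fetch "$(chart_url dynamo-crds)"
#   helm fetch "$(chart_url dynamo-platform)"
```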
### Install Dynamo Cloud
**Step 1: Install Custom Resource Definitions (CRDs)**
```bash
helm install dynamo-crds dynamo-crds-v${RELEASE_VERSION}.tgz \
--namespace default \
--wait \
--atomic
```
**Step 2: Install Dynamo Platform**
```bash
kubectl create namespace ${NAMESPACE}
helm install dynamo-platform dynamo-platform-v${RELEASE_VERSION}.tgz --namespace ${NAMESPACE}
```
## 2. Installing from Source
Use this approach when developing or customizing Dynamo as a contributor, or using local helm charts from the source repository.
### Prerequisites
Ensure you have the source code checked out and are in the `dynamo` directory:
```bash
cd deploy/cloud/helm/
```
### Set Environment Variables
```bash
export NAMESPACE=dynamo-cloud
export DOCKER_USERNAME=your-username
export DOCKER_PASSWORD=your-password
export DOCKER_SERVER=your-docker-registry.com
export IMAGE_TAG=your-image-tag
```
### Install Dynamo Cloud
You can either run the `deploy.sh` script or run the manual commands under Step 1 and Step 2.
**Installing with a script (alternative to Step 1 and Step 2)**
Create the namespace and the docker registry secret.
```bash
kubectl create namespace ${NAMESPACE}
kubectl create secret docker-registry docker-imagepullsecret \
--docker-server=${DOCKER_SERVER} \
--docker-username=${DOCKER_USERNAME} \
--docker-password=${DOCKER_PASSWORD} \
--namespace=${NAMESPACE}
```
```bash
./deploy.sh --crds
```
If you want guidance during the process, run the deployment script with the `--interactive` flag:
```bash
./deploy.sh --crds --interactive
```
**Step 1: Install Custom Resource Definitions (CRDs)**
```bash
helm install dynamo-crds ./crds/ \
--namespace default \
--wait \
--atomic
```
**Step 2: Build Dependencies and Install Platform**
```bash
helm dep build ./platform/
kubectl create namespace ${NAMESPACE}
# Create docker registry secret
kubectl create secret docker-registry docker-imagepullsecret \
--docker-server=${DOCKER_SERVER} \
--docker-username=${DOCKER_USERNAME} \
--docker-password=${DOCKER_PASSWORD} \
--namespace=${NAMESPACE}
# Install platform
helm install dynamo-platform ./platform/ \
--namespace ${NAMESPACE} \
--set "dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator" \
--set "dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG}" \
--set "dynamo-operator.imagePullSecrets[0].name=docker-imagepullsecret"
```
[More on Deploying to Dynamo Cloud](./dynamo_cloud.md)
## Explore Examples
### Hello World
For a basic example that doesn't require a GPU, see the [Hello World](../../examples/hello_world.md)
### LLM
Create a Kubernetes secret containing your sensitive values if needed:
```bash
export HF_TOKEN=your_hf_token
kubectl create secret generic hf-token-secret \
--from-literal=HF_TOKEN=${HF_TOKEN} \
-n ${NAMESPACE}
```
Pick your deployment destination.
For a local deployment:
```bash
export DYNAMO_CLOUD=http://localhost:8080
```
For a Kubernetes deployment:
```bash
export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com
```
```bash
# Go to your main dynamo directory.
cd ../../../
kubectl apply -f examples/llm/deploy/agg.yaml -n $NAMESPACE
```
...@@ -30,9 +30,16 @@ The NVIDIA Dynamo Platform is a high-performance, low-latency inference framewor ...@@ -30,9 +30,16 @@ The NVIDIA Dynamo Platform is a high-performance, low-latency inference framewor
- `Dynamo examples repo <https://github.com/ai-dynamo/examples>`_ - `Dynamo examples repo <https://github.com/ai-dynamo/examples>`_
Quick Start
-----------------
Follow the :doc:`Quick Guide to install Dynamo Platform <guides/dynamo_deploy/quickstart>`.
Dive in: Examples Dive in: Examples
----------------- -----------------
The examples below assume you build the latest image yourself from source. If using a prebuilt image follow the examples from the corresponding branch.
.. grid:: 1 2 2 2 .. grid:: 1 2 2 2
:gutter: 3 :gutter: 3
:margin: 0 :margin: 0
...@@ -105,6 +112,7 @@ Dive in: Examples ...@@ -105,6 +112,7 @@ Dive in: Examples
:hidden: :hidden:
:caption: Deployment Guides :caption: Deployment Guides
Dynamo Deploy Quickstart <guides/dynamo_deploy/quickstart.md>
Dynamo Cloud Kubernetes Platform <guides/dynamo_deploy/dynamo_cloud.md> Dynamo Cloud Kubernetes Platform <guides/dynamo_deploy/dynamo_cloud.md>
Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform <guides/dynamo_deploy/operator_deployment.md> Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform <guides/dynamo_deploy/operator_deployment.md>
Manual Helm Deployment <guides/dynamo_deploy/manual_helm_deployment.md> Manual Helm Deployment <guides/dynamo_deploy/manual_helm_deployment.md>
......
...@@ -245,7 +245,7 @@ in `dynamo deployment get ${DEPLOYMENT_NAME}` and skip the steps to find and for ...@@ -245,7 +245,7 @@ in `dynamo deployment get ${DEPLOYMENT_NAME}` and skip the steps to find and for
export FRONTEND_POD=$(kubectl get pods -n ${KUBE_NS} | grep "${DEPLOYMENT_NAME}-frontend" | sort -k1 | tail -n1 | awk '{print $1}') export FRONTEND_POD=$(kubectl get pods -n ${KUBE_NS} | grep "${DEPLOYMENT_NAME}-frontend" | sort -k1 | tail -n1 | awk '{print $1}')
# Forward the pod's port to localhost # Forward the pod's port to localhost
kubectl port-forward pod/$FRONTEND_POD 3000:3000 -n ${KUBE_NS} dynamo-operator-deployment.yaml/$FRONTEND_POD 3000:3000 -n ${KUBE_NS}
# Test the API endpoint # Test the API endpoint
curl localhost:3000/v1/chat/completions \ curl localhost:3000/v1/chat/completions \
......