Unverified commit 1c77531a authored by hhzhang16, committed by GitHub

docs: move deploy docs to docs/guides (#674)

Signed-off-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
Here's how to build it:
```bash
./container/build.sh
docker tag dynamo:latest-vllm <your-registry>/dynamo-base:latest-vllm
docker login <your-registry>
docker push <your-registry>/dynamo-base:latest-vllm
```
After building, you can use this image by setting the `DYNAMO_IMAGE` environment variable to point to your built image:
```bash
export DYNAMO_IMAGE=<your-registry>/dynamo-base:latest-vllm
```
> [!NOTE]
> We are working on leaner base images that can be built using the targets in the top-level Earthfile.
### Running and Interacting with an LLM Locally
To run a model and interact with it locally you can call `dynamo
<!--
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Deploy Dynamo Cloud to Kubernetes
## Building Docker images for Dynamo Cloud components
Dynamo CLI has the following 4 sub-commands.
- :runner: dynamo run: quickly spin up a server to experiment with a specified model, input and output target.
- :palm_up_hand: dynamo serve: compose a graph of workers locally and serve.
- :hammer: (Experimental) dynamo build: containerize either the entire graph or parts of the graph into multiple containers
- :rocket: (Experimental) dynamo deploy: deploy to Kubernetes with Helm charts or custom operators
For more detailed examples on serving LLMs with disaggregated serving, KV-aware routing, etc., please refer to [LLM deployment examples](https://github.com/ai-dynamo/dynamo/blob/main/examples/llm/README.md)
# Deploying Dynamo Inference Graphs to Kubernetes
This guide provides an overview of the different deployment options available for Dynamo inference graphs in Kubernetes environments.
## Deployment Options
Dynamo provides two distinct deployment paths, each serving different use cases:
### 1. 🚀 Dynamo Cloud Kubernetes Platform [PREFERRED]
The Dynamo Cloud Platform (`deploy/dynamo/helm/`) provides a managed deployment experience:
- Contains the infrastructure components required for the Dynamo cloud platform
- Used when deploying with the `dynamo deploy` CLI commands
- Provides a managed deployment experience
For detailed instructions on using the Dynamo Cloud Platform, see:
- [Dynamo Cloud Platform Guide](dynamo_cloud.md): walks through installing and configuring the Dynamo cloud components on your Kubernetes cluster.
- [Operator Deployment Guide](operator_deployment.md)
### 2. Manual Deployment with Helm Charts
The manual deployment path (`deploy/Kubernetes/`) is available for users who need more control over their deployments:
- Used for manually deploying inference graphs to Kubernetes
- Contains Helm charts and configurations for deploying individual inference pipelines
- Provides full control over deployment parameters
- Requires manual management of infrastructure components
- Documentation:
- [Deploying Dynamo Inference Graphs to Kubernetes using Helm](../../Kubernetes/pipeline/README.md): all-in-one script
- [Manual Helm Deployment Guide](manual_helm_deployment.md): detailed instructions on manual deployment
## Getting Started
1. **For Dynamo Cloud Platform**:
- Follow the [Dynamo Cloud Platform Guide](dynamo_cloud.md)
- Deploy a Hello World pipeline using the [Operator Deployment Guide](operator_deployment.md)
   - Deploy a Dynamo LLM pipeline to Kubernetes using the [Deploy LLM Guide](../../../examples/llm/README.md#deploy-to-kubernetes)
2. **For Manual Deployment**:
- Follow the [Manual Helm Deployment Guide](manual_helm_deployment.md)
## Example Deployment
See the [Hello World example](../../../examples/hello_world/README.md#deploying-to-and-running-the-example-in-kubernetes) for a complete walkthrough of deploying a simple inference graph.
See the [LLM example](../../../examples/llm/README.md#deploy-to-kubernetes) for a complete walkthrough of deploying a production-ready LLM inference pipeline to Kubernetes.
# 🚀 Dynamo Cloud Kubernetes Platform (Dynamo Deploy)
The Dynamo Cloud platform is a comprehensive solution for deploying and managing Dynamo inference graphs (also referred to as pipelines) in Kubernetes environments. It provides a streamlined experience for deploying, scaling, and monitoring your inference services. You can interface with Dynamo Cloud using the `deploy` subcommand available in the Dynamo CLI (e.g., `dynamo deploy`).
## Overview
The Dynamo cloud platform consists of several key components:
- **Dynamo Operator**: A Kubernetes operator that manages the lifecycle of Dynamo inference graphs from build ➡️ deploy.
- **API Store**: Stores and manages service configurations and metadata related to Dynamo deployments. Needs to be exposed externally.
- **Custom Resources**: Kubernetes custom resources for defining and managing Dynamo services
These components work together to provide a seamless deployment experience, handling everything from containerization to scaling and monitoring.
![Dynamo Deploy](../../images/dynamo-deploy.png)
## Prerequisites
Before getting started with the Dynamo cloud platform, ensure you have:
- A Kubernetes cluster (version 1.24 or later)
- [Earthly](https://earthly.dev/) installed for building components
- Docker installed and running
- Access to a container registry (e.g., Docker Hub, NVIDIA NGC, etc.)
- `kubectl` configured to access your cluster
- Helm installed (version 3.0 or later)
## Building Docker Images for Dynamo Cloud Components
The Dynamo cloud platform components need to be built and pushed to a container registry before deployment. You can build these components individually or all at once.
### Setting Up Environment Variables
First, set the required environment variables for building and pushing images:
```bash
# Set your container registry and organization
export CI_REGISTRY_IMAGE=<CONTAINER_REGISTRY>/<ORGANIZATION>
# Set the image tag (e.g., latest, 0.0.1, etc.)
export CI_COMMIT_SHA=<TAG>
```
Where:
- `<CONTAINER_REGISTRY>/<ORGANIZATION>`: Your container registry and organization name
- Examples: `nvcr.io/myorg`, `docker.io/myorg`
- `<TAG>`: The version tag for your images
- Examples: `latest`, `0.0.1`, `v1.0.0`
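For example, with Docker Hub (the values below are placeholders; substitute your own registry, organization, and tag):

```shell
# Example values only -- substitute your own registry, organization, and tag
export CI_REGISTRY_IMAGE=docker.io/myorg
export CI_COMMIT_SHA=v0.0.1
```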
> [!IMPORTANT]
> Make sure you're logged in to your container registry before pushing images:
> ```bash
> docker login <CONTAINER_REGISTRY>
> ```
### Building Components
You have two options for building the components:
#### Option 1: Build All Components at Once
This is the simplest approach: build and push all components in a single command:
```bash
earthly --push +all-docker --CI_REGISTRY_IMAGE=$CI_REGISTRY_IMAGE --CI_COMMIT_SHA=$CI_COMMIT_SHA
```
#### Option 2: Build Components Individually
If you need to build components separately:
1. **API Store**
```bash
cd deploy/dynamo/api-store
earthly --push +docker --CI_REGISTRY_IMAGE=$CI_REGISTRY_IMAGE --CI_COMMIT_SHA=$CI_COMMIT_SHA
```
2. **Operator**
```bash
cd deploy/dynamo/operator
earthly --push +docker --CI_REGISTRY_IMAGE=$CI_REGISTRY_IMAGE --CI_COMMIT_SHA=$CI_COMMIT_SHA
```
## Deploying the Dynamo Cloud Platform
Once you've built and pushed the components, you can deploy the platform to your Kubernetes cluster.
### Prerequisites
Make sure you're in the correct directory:
```bash
cd deploy/dynamo/helm
```
Set your namespace (this will be used for all deployments):
```bash
export KUBE_NS=hello-world # Change this to your preferred namespace
```
### Deployment Steps
1. **Create Namespace and Set Context**
```bash
# Create a new namespace
kubectl create namespace $KUBE_NS
# Set the namespace as your default context
kubectl config set-context --current --namespace=$KUBE_NS
# [Optional] Create image pull secrets if your registry requires authentication
kubectl create secret docker-registry docker-imagepullsecret \
--docker-server=<your-registry> \
--docker-username=<your-username> \
--docker-password=<your-password> \
--namespace=$KUBE_NS
```
2. **Deploy Using the Helm Chart**
Set the required environment variables:
```bash
export NGC_TOKEN=$NGC_API_TOKEN
export NAMESPACE=$KUBE_NS
export CI_COMMIT_SHA=<TAG> # Use the same tag you used when building the images
export CI_REGISTRY_IMAGE=<CONTAINER_REGISTRY>/<ORGANIZATION> # Use the same registry/org you used when building the images
export RELEASE_NAME=$KUBE_NS
```
Deploy the platform:
```bash
./deploy.sh
```
3. **Expose Dynamo Cloud Externally**
You must also expose the `dynamo-store` service within the namespace externally. This endpoint is what the CLI uses to interface with Dynamo Cloud. You might set up an Ingress, use an `ExternalService` with Istio, or simply port-forward. In our docs, we refer to this externally available endpoint as `DYNAMO_SERVER`.
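A minimal sketch of the port-forward route is shown below. The local and service ports are assumptions; check the port actually exposed by `dynamo-store` in your installation:

```shell
# Forward the dynamo-store service to localhost (ports are assumed examples):
# kubectl -n "$KUBE_NS" port-forward svc/dynamo-store 8080:80 &

# Point the CLI at the forwarded endpoint
export DYNAMO_SERVER=http://localhost:8080
```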
## Next Steps
After deploying the Dynamo cloud platform, you can:
1. Deploy your first inference graph using the [Dynamo CLI](operator_deployment.md)
2. Deploy Dynamo LLM pipelines to Kubernetes using the [Dynamo CLI](../../../examples/llm/README.md)!
3. Manage your deployments using the Dynamo CLI
For more detailed information about deploying inference graphs, see the [Dynamo Deploy Guide](README.md).
# Deploying Dynamo Inference Graphs to Kubernetes using Helm
This guide will walk you through the process of deploying an inference graph created using the Dynamo SDK onto a Kubernetes cluster.
While this guide covers deployment of Dynamo inference graphs using Helm, the preferred method to deploy an inference graph is to [deploy with the Dynamo cloud platform](operator_deployment.md). The [Dynamo cloud platform](dynamo_cloud.md) simplifies the deployment and management of Dynamo inference graphs. It includes a set of components (Operator, Kubernetes Custom Resources, etc.) that work together to streamline the deployment and management process.
![Dynamo Deploy](../images/dynamo-deploy.png)
Once an inference graph is defined using the Dynamo SDK, it can be deployed onto a Kubernetes cluster using a simple `dynamo deploy` command that orchestrates the following deployment steps:
Follow these steps to containerize and deploy your inference pipeline:
1. Build and containerize the pipeline:
> [!NOTE]
> For instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image) section in the main README.
```bash
# Navigate to example directory
```
# Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform
This guide walks you through deploying an inference graph created with the Dynamo SDK onto a Kubernetes cluster using the Dynamo cloud platform and the Dynamo deploy CLI. The Dynamo cloud platform provides a streamlined experience for deploying and managing your inference services.
## Prerequisites
Before proceeding with deployment, ensure you have:
- [Dynamo CLI](../README.md#installation) installed
- A Kubernetes cluster with the [Dynamo cloud platform](dynamo_cloud.md) installed
- Ubuntu 24.04 as the base image for your services
- Required dependencies:
- Helm package manager
- Dynamo SDK and CLI tools
- Rust packages and toolchain
You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../deploy/dynamo/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
**Note**: The `KUBE_NS` variable in the following steps must match the Kubernetes namespace where you installed Dynamo Cloud. You must also expose the `dynamo-store` service externally; this is the endpoint the CLI uses to interface with Dynamo Cloud.
## Understanding the Deployment Process
The deployment process involves two main steps:
1. **Local Build (`dynamo build`)**
- Creates a Dynamo service archive containing:
- Service code and dependencies
- Service configuration and metadata
- Runtime requirements
- Service graph definition
- This archive is used as input for the remote build process
2. **Remote Image Build**
- A `yatai-dynamonim-image-builder` pod is created in your cluster
- This pod:
- Takes the Dynamo service archive
- Containerizes it using the specified base image
- Pushes the final container image to your cluster's registry
- The build process is managed by the Dynamo operator
## Deployment Steps
### 1. Login to Dynamo Server
First, configure your environment and login to the Dynamo server:
```bash
# Set your project root directory
export PROJECT_ROOT=$(pwd)
# Set your Kubernetes namespace (must match the namespace where Dynamo cloud is installed)
export KUBE_NS=hello-world
# Externally accessible endpoint to the `dynamo-store` service within your Dynamo Cloud installation
export DYNAMO_SERVER=https://${KUBE_NS}.dev.aire.nvidia.com
# Login to the Dynamo server
dynamo server login --api-token TEST-TOKEN --endpoint $DYNAMO_SERVER
```
### 2. Build the Dynamo Base Image
> [!NOTE]
> For instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image) section in the main README.
```bash
# Set your runtime image name
export DYNAMO_IMAGE=<dynamo_docker_image_name>
# Navigate to your project directory
cd $PROJECT_ROOT/examples/hello_world
# Build the service and capture the tag
DYNAMO_TAG=$(dynamo build hello_world:Frontend | grep "Successfully built" | awk -F"\"" '{ print $2 }')
```
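The `grep`/`awk` pipeline above simply pulls the quoted tag out of the build output. Here is a sketch against a hypothetical line of `dynamo build` output (the exact output format may differ between versions):

```shell
# Hypothetical sample of the "Successfully built" line printed by `dynamo build`
sample='Successfully built Dynamo service "hello_world:abc123".'

# Split on double quotes and take the second field, i.e. the quoted tag
DYNAMO_TAG=$(printf '%s\n' "$sample" | grep "Successfully built" | awk -F'"' '{ print $2 }')
echo "$DYNAMO_TAG"   # hello_world:abc123
```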
### 3. Deploy to Kubernetes
Deploy your service using the Dynamo deployment command:
```bash
# Set your Helm release name
export DEPLOYMENT_NAME=hello-world
# Create the deployment
dynamo deployment create $DYNAMO_TAG --no-wait -n $DEPLOYMENT_NAME
```
To delete an existing deployment:
```bash
kubectl delete dynamodeployment $DEPLOYMENT_NAME
```
### 4. Test the Deployment
The deployment process creates several pods:
1. A `yatai-dynamonim-image-builder` pod for building the container image
2. Service pods prefixed with `$DEPLOYMENT_NAME` once the build is complete
To test your deployment:
```bash
# Forward the service port to localhost
kubectl -n ${KUBE_NS} port-forward svc/${DEPLOYMENT_NAME}-frontend 3000:3000
# Test the API endpoint
curl -X 'POST' 'http://localhost:3000/generate' \
-H 'accept: text/event-stream' \
-H 'Content-Type: application/json' \
-d '{"text": "test"}'
```
## Expected Output
When you send a request with "test" as input, you'll see how the text flows through each service:
```
Frontend: Middle: Backend: test-mid-back
```
This demonstrates the service pipeline:
1. The Frontend receives "test"
2. The Middle service adds "-mid" to create "test-mid"
3. The Backend service adds "-back" to create "test-mid-back"
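The flow above can be sketched as plain shell functions (illustrative only, not the actual Dynamo service code):

```shell
# Each "service" appends its suffix and hands the text to the next stage
backend()  { printf '%s-back\n' "$1"; }
middle()   { backend "$1-mid"; }
frontend() { middle "$1"; }

frontend test   # prints: test-mid-back
```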
```bash
export DYNAMO_SERVER=https://${KUBE_NS}.dev.aire.nvidia.com
dynamo server login --api-token TEST-TOKEN --endpoint $DYNAMO_SERVER
```
2. **Build the Dynamo Base Image**
> [!NOTE]
> For instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image) section in the main README.
```bash
# Set runtime image name
```
See [multinode-examples.md](multinode-examples.md) for more details.
### Close deployment
See the [close deployment](../../docs/guides/dynamo_serve.md#close-deployment) section to learn how to close the deployment.
## Deploy to Kubernetes
These examples can be deployed to a Kubernetes cluster using Dynamo Cloud and the Dynamo deploy CLI.
### Prerequisites
Before deploying, ensure you have:
- Dynamo CLI installed
- Ubuntu 24.04 as the base image
- Required dependencies:
- Helm package manager
- Dynamo SDK and CLI tools
- Rust packages and toolchain
You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../deploy/dynamo/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
**Note**: The `KUBE_NS` variable in the following steps must match the Kubernetes namespace where you installed Dynamo Cloud. You must also expose the `dynamo-store` service externally; this is the endpoint the CLI uses to interface with Dynamo Cloud.
### Deployment Steps
1. **Login to Dynamo Cloud**
```bash
export PROJECT_ROOT=$(pwd)
export KUBE_NS=dynamo-cloud # Note: This must match the Kubernetes namespace where you installed Dynamo Cloud
export DYNAMO_SERVER=https://${KUBE_NS}.dev.aire.nvidia.com # Externally accessible endpoint to the `dynamo-store` service within your Dynamo Cloud installation
dynamo server login --api-token TEST-TOKEN --endpoint $DYNAMO_SERVER
```
2. **Build the Dynamo Base Image**
> [!NOTE]
> For instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image) section in the main README.
```bash
# Set runtime image name
export DYNAMO_IMAGE=<dynamo_docker_image_name>
# Prepare your project for deployment.
cd $PROJECT_ROOT/examples/llm
DYNAMO_TAG=$(dynamo build graphs.agg:Frontend | grep "Successfully built" | awk '{ print $NF }' | sed 's/\.$//')
```
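Here the tag is taken from the last whitespace-separated field of the "Successfully built" line and the trailing period is stripped. A sketch against a hypothetical output line (the real format may differ between versions):

```shell
# Hypothetical sample of the final line printed by `dynamo build`
sample='Successfully built Dynamo service graphs.agg:abc123.'

# Take the last field, then strip the trailing period
tag=$(printf '%s\n' "$sample" | awk '{ print $NF }' | sed 's/\.$//')
echo "$tag"   # graphs.agg:abc123
```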
3. **Deploy to Kubernetes**
```bash
echo $DYNAMO_TAG
export DEPLOYMENT_NAME=llm-agg
dynamo deployment create $DYNAMO_TAG --no-wait -n $DEPLOYMENT_NAME -f ./configs/agg.yaml
```
To delete an existing Dynamo deployment:
```bash
kubectl delete dynamodeployment $DEPLOYMENT_NAME
```
4. **Test the deployment**
After you create the Dynamo deployment, a pod prefixed with `yatai-dynamonim-image-builder` will begin running. When it finishes, service pods are created using the image it built. Once the pods prefixed with `$DEPLOYMENT_NAME` are up and running, you can test out your example!
```bash
# Forward the service port to localhost
kubectl -n ${KUBE_NS} port-forward svc/${DEPLOYMENT_NAME}-frontend 3000:3000
# Test the API endpoint
curl localhost:3000/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
"messages": [
{
"role": "user",
"content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time, lost to the world for centuries. You are an intrepid explorer, known for your unparalleled curiosity and courage, who has stumbled upon an ancient map hinting at ests that Aeloria holds a secret so profound that it has the potential to reshape the very fabric of reality. Your journey will take you through treacherous deserts, enchanted forests, and across perilous mountain ranges. Your Task: Character Background: Develop a detailed background for your character. Describe their motivations for seeking out Aeloria, their skills and weaknesses, and any personal connections to the ancient city or its legends. Are they driven by a quest for knowledge, a search for lost familt clue is hidden."
}
],
"stream":false,
"max_tokens": 30
}'
```