Unverified commit 1c77531a authored by hhzhang16, committed by GitHub

docs: move deploy docs to docs/guides (#674)

Signed-off-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
Here's how to build it:
```bash
./container/build.sh
docker tag dynamo:latest-vllm <your-registry>/dynamo-base:latest-vllm
docker login <your-registry>
docker push <your-registry>/dynamo-base:latest-vllm
```
After building, you can use this image by setting the `DYNAMO_IMAGE` environment variable to point to your built image:
```bash
export DYNAMO_IMAGE=<your-registry>/dynamo-base:latest-vllm
```
> [!NOTE]
> We are working on leaner base images that can be built using the targets in the top-level Earthfile.
### Running and Interacting with an LLM Locally
To run a model and interact with it locally you can call `dynamo
<!--
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Deploy Dynamo Cloud to Kubernetes
## Building Docker images for Dynamo Cloud components
Dynamo CLI has the following 4 sub-commands.
- :runner: dynamo run: quickly spin up a server to experiment with a specified model, input and output target.
- :palm_up_hand: dynamo serve: compose a graph of workers locally and serve.
- :hammer: (Experimental) dynamo build: containerize either the entire graph or parts of the graph into multiple containers
- :rocket: (Experimental) dynamo deploy: deploy to Kubernetes with Helm charts or custom operators
For more detailed examples on serving LLMs with disaggregated serving, KV-aware routing, etc., please refer to [LLM deployment examples](https://github.com/ai-dynamo/dynamo/blob/main/examples/llm/README.md)
# Deploying Dynamo Inference Graphs to Kubernetes
This guide provides an overview of the different deployment options available for Dynamo inference graphs in Kubernetes environments.
## Deployment Options
Dynamo provides two distinct deployment paths, each serving different use cases:
### 1. 🚀 Dynamo Cloud Kubernetes Platform [PREFERRED]
The Dynamo Cloud Platform (`deploy/dynamo/helm/`) provides a managed deployment experience:
- Contains the infrastructure components required for the Dynamo cloud platform
- Used when deploying with the `dynamo deploy` CLI commands
- Provides a managed deployment experience
For detailed instructions on using the Dynamo Cloud Platform, see:
- [Dynamo Cloud Platform Guide](dynamo_cloud.md): walks through installing and configuring the Dynamo cloud components on your Kubernetes cluster.
- [Operator Deployment Guide](operator_deployment.md)
### 2. Manual Deployment with Helm Charts
The manual deployment path (`deploy/Kubernetes/`) is available for users who need more control over their deployments:
- Used for manually deploying inference graphs to Kubernetes
- Contains Helm charts and configurations for deploying individual inference pipelines
- Provides full control over deployment parameters
- Requires manual management of infrastructure components
- Documentation:
- [Deploying Dynamo Inference Graphs to Kubernetes using Helm](../../Kubernetes/pipeline/README.md): all-in-one script
- [Manual Helm Deployment Guide](manual_helm_deployment.md): detailed instructions on manual deployment
## Getting Started
1. **For Dynamo Cloud Platform**:
- Follow the [Dynamo Cloud Platform Guide](dynamo_cloud.md)
- Deploy a Hello World pipeline using the [Operator Deployment Guide](operator_deployment.md)
   - Deploy a Dynamo LLM pipeline to Kubernetes using the [Deploy LLM Guide](../../../examples/llm/README.md#deploy-to-kubernetes)
2. **For Manual Deployment**:
- Follow the [Manual Helm Deployment Guide](manual_helm_deployment.md)
## Example Deployment
See the [Hello World example](../../../examples/hello_world/README.md#deploying-to-and-running-the-example-in-kubernetes) for a complete walkthrough of deploying a simple inference graph.
See the [LLM example](../../../examples/llm/README.md#deploy-to-kubernetes) for a complete walkthrough of deploying a production-ready LLM inference pipeline to Kubernetes.
# 🚀 Dynamo Cloud Kubernetes Platform (Dynamo Deploy)
The Dynamo Cloud platform is a comprehensive solution for deploying and managing Dynamo inference graphs (also referred to as pipelines) in Kubernetes environments. It provides a streamlined experience for deploying, scaling, and monitoring your inference services. You can interface with Dynamo Cloud using the `deploy` subcommand available in the Dynamo CLI (e.g., `dynamo deploy`).
## Overview
The Dynamo cloud platform consists of several key components:
- **Dynamo Operator**: A Kubernetes operator that manages the lifecycle of Dynamo inference graphs from build ➡️ deploy.
- **API Store**: Stores and manages service configurations and metadata related to Dynamo deployments. Needs to be exposed externally.
- **Custom Resources**: Kubernetes custom resources for defining and managing Dynamo services
These components work together to provide a seamless deployment experience, handling everything from containerization to scaling and monitoring.
![Dynamo Deploy](../../images/dynamo-deploy.png)
## Prerequisites
Before getting started with the Dynamo cloud platform, ensure you have:
- A Kubernetes cluster (version 1.24 or later)
- [Earthly](https://earthly.dev/) installed for building components
- Docker installed and running
- Access to a container registry (e.g., Docker Hub, NVIDIA NGC, etc.)
- `kubectl` configured to access your cluster
- Helm installed (version 3.0 or later)
## Building Docker Images for Dynamo Cloud Components
The Dynamo cloud platform components need to be built and pushed to a container registry before deployment. You can build these components individually or all at once.
### Setting Up Environment Variables
First, set the required environment variables for building and pushing images:
```bash
# Set your container registry and organization
export CI_REGISTRY_IMAGE=<CONTAINER_REGISTRY>/<ORGANIZATION>
# Set the image tag (e.g., latest, 0.0.1, etc.)
export CI_COMMIT_SHA=<TAG>
```
Where:
- `<CONTAINER_REGISTRY>/<ORGANIZATION>`: Your container registry and organization name
- Examples: `nvcr.io/myorg`, `docker.io/myorg`
- `<TAG>`: The version tag for your images
- Examples: `latest`, `0.0.1`, `v1.0.0`
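For example, with Docker Hub (the values below are placeholders; substitute your own registry, organization, and tag):

```shell
# Example values only -- substitute your own registry, organization, and tag
export CI_REGISTRY_IMAGE=docker.io/myorg
export CI_COMMIT_SHA=v0.0.1
```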
> [!IMPORTANT]
> Make sure you're logged in to your container registry before pushing images:
> ```bash
> docker login <CONTAINER_REGISTRY>
> ```
### Building Components
You have two options for building the components:
#### Option 1: Build All Components at Once
This is the simplest approach: build and push all components in a single command:
```bash
earthly --push +all-docker --CI_REGISTRY_IMAGE=$CI_REGISTRY_IMAGE --CI_COMMIT_SHA=$CI_COMMIT_SHA
```
#### Option 2: Build Components Individually
If you need to build components separately:
1. **API Store**
```bash
cd deploy/dynamo/api-store
earthly --push +docker --CI_REGISTRY_IMAGE=$CI_REGISTRY_IMAGE --CI_COMMIT_SHA=$CI_COMMIT_SHA
```
2. **Operator**
```bash
cd deploy/dynamo/operator
earthly --push +docker --CI_REGISTRY_IMAGE=$CI_REGISTRY_IMAGE --CI_COMMIT_SHA=$CI_COMMIT_SHA
```
## Deploying the Dynamo Cloud Platform
Once you've built and pushed the components, you can deploy the platform to your Kubernetes cluster.
### Prerequisites
Make sure you're in the correct directory:
```bash
cd deploy/dynamo/helm
```
Set your namespace (this will be used for all deployments):
```bash
export KUBE_NS=hello-world # Change this to your preferred namespace
```
### Deployment Steps
1. **Create Namespace and Set Context**
```bash
# Create a new namespace
kubectl create namespace $KUBE_NS
# Set the namespace as your default context
kubectl config set-context --current --namespace=$KUBE_NS
# [Optional] Create image pull secrets if your registry requires authentication
kubectl create secret docker-registry docker-imagepullsecret \
--docker-server=<your-registry> \
--docker-username=<your-username> \
--docker-password=<your-password> \
--namespace=$KUBE_NS
```
2. **Deploy Using the Helm Chart**
Set the required environment variables:
```bash
export NGC_TOKEN=$NGC_API_TOKEN
export NAMESPACE=$KUBE_NS
export CI_COMMIT_SHA=<TAG> # Use the same tag you used when building the images
export CI_REGISTRY_IMAGE=<CONTAINER_REGISTRY>/<ORGANIZATION> # Use the same registry/org you used when building the images
export RELEASE_NAME=$KUBE_NS
```
Deploy the platform:
```bash
./deploy.sh
```
3. **Expose Dynamo Cloud Externally**
You must also expose the `dynamo-store` service within the namespace externally. This endpoint is what the CLI uses to interface with Dynamo Cloud. You might set up an Ingress, use an `ExternalService` with Istio, or simply port-forward. In our docs, we refer to this externally available endpoint as `DYNAMO_SERVER`.
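A minimal sketch of the port-forward route is shown below. The local and service ports are assumptions; check the port actually exposed by `dynamo-store` in your installation:

```shell
# Forward the dynamo-store service to localhost (ports are assumed examples):
# kubectl -n "$KUBE_NS" port-forward svc/dynamo-store 8080:80 &

# Point the CLI at the forwarded endpoint
export DYNAMO_SERVER=http://localhost:8080
```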
## Next Steps
After deploying the Dynamo cloud platform, you can:
1. Deploy your first inference graph using the [Dynamo CLI](operator_deployment.md)
2. Deploy Dynamo LLM pipelines to Kubernetes using the [Dynamo CLI](../../../examples/llm/README.md)!
3. Manage your deployments using the Dynamo CLI
For more detailed information about deploying inference graphs, see the [Dynamo Deploy Guide](README.md).
# Deploying Dynamo Inference Graphs to Kubernetes using Helm
This guide will walk you through the process of deploying an inference graph created using the Dynamo SDK onto a Kubernetes cluster.
While this guide covers deployment of Dynamo inference graphs using Helm, the preferred method to deploy an inference graph is to [deploy with the Dynamo cloud platform](operator_deployment.md). The [Dynamo cloud platform](dynamo_cloud.md) simplifies the deployment and management of Dynamo inference graphs. It includes a set of components (Operator, Kubernetes Custom Resources, etc.) that work together to streamline the deployment and management process.
![Dynamo Deploy](../images/dynamo-deploy.png)
Once an inference graph is defined using the Dynamo SDK, it can be deployed onto a Kubernetes cluster using a simple `dynamo deploy` command that orchestrates the following deployment steps:
Follow these steps to containerize and deploy your inference pipeline:
1. Build and containerize the pipeline:
> [!NOTE]
> For instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image) section in the main README.
```bash
# Navigate to example directory
```
# Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform
This guide walks you through deploying an inference graph created with the Dynamo SDK onto a Kubernetes cluster using the Dynamo cloud platform and the Dynamo deploy CLI. The Dynamo cloud platform provides a streamlined experience for deploying and managing your inference services.
## Prerequisites
Before proceeding with deployment, ensure you have:
- [Dynamo CLI](../README.md#installation) installed
- A Kubernetes cluster with the [Dynamo cloud platform](dynamo_cloud.md) installed
- Ubuntu 24.04 as the base image for your services
- Required dependencies:
- Helm package manager
- Dynamo SDK and CLI tools
- Rust packages and toolchain
You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../deploy/dynamo/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
**Note**: The `KUBE_NS` variable in the following steps must match the Kubernetes namespace where you installed Dynamo Cloud. You must also expose the `dynamo-store` service externally; this is the endpoint the CLI uses to interface with Dynamo Cloud.
## Understanding the Deployment Process
The deployment process involves two main steps:
1. **Local Build (`dynamo build`)**
- Creates a Dynamo service archive containing:
- Service code and dependencies
- Service configuration and metadata
- Runtime requirements
- Service graph definition
- This archive is used as input for the remote build process
2. **Remote Image Build**
- A `yatai-dynamonim-image-builder` pod is created in your cluster
- This pod:
- Takes the Dynamo service archive
- Containerizes it using the specified base image
- Pushes the final container image to your cluster's registry
- The build process is managed by the Dynamo operator
## Deployment Steps
### 1. Login to Dynamo Server
First, configure your environment and login to the Dynamo server:
```bash
# Set your project root directory
export PROJECT_ROOT=$(pwd)
# Set your Kubernetes namespace (must match the namespace where Dynamo cloud is installed)
export KUBE_NS=hello-world
# Externally accessible endpoint to the `dynamo-store` service within your Dynamo Cloud installation
export DYNAMO_SERVER=https://${KUBE_NS}.dev.aire.nvidia.com
# Login to the Dynamo server
dynamo server login --api-token TEST-TOKEN --endpoint $DYNAMO_SERVER
```
### 2. Build the Dynamo Base Image
> [!NOTE]
> For instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image) section in the main README.
```bash
# Set your runtime image name
export DYNAMO_IMAGE=<dynamo_docker_image_name>
# Navigate to your project directory
cd $PROJECT_ROOT/examples/hello_world
# Build the service and capture the tag
DYNAMO_TAG=$(dynamo build hello_world:Frontend | grep "Successfully built" | awk -F"\"" '{ print $2 }')
```
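The `grep`/`awk` pipeline above simply pulls the quoted tag out of the build output. Here is a sketch against a hypothetical line of `dynamo build` output (the exact output format may differ between versions):

```shell
# Hypothetical sample of the "Successfully built" line printed by `dynamo build`
sample='Successfully built Dynamo service "hello_world:abc123".'

# Split on double quotes and take the second field, i.e. the quoted tag
DYNAMO_TAG=$(printf '%s\n' "$sample" | grep "Successfully built" | awk -F'"' '{ print $2 }')
echo "$DYNAMO_TAG"   # hello_world:abc123
```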
### 3. Deploy to Kubernetes
Deploy your service using the Dynamo deployment command:
```bash
# Set your Helm release name
export DEPLOYMENT_NAME=hello-world
# Create the deployment
dynamo deployment create $DYNAMO_TAG --no-wait -n $DEPLOYMENT_NAME
```
To delete an existing deployment:
```bash
kubectl delete dynamodeployment $DEPLOYMENT_NAME
```
### 4. Test the Deployment
The deployment process creates several pods:
1. A `yatai-dynamonim-image-builder` pod for building the container image
2. Service pods prefixed with `$DEPLOYMENT_NAME` once the build is complete
To test your deployment:
```bash
# Forward the service port to localhost
kubectl -n ${KUBE_NS} port-forward svc/${DEPLOYMENT_NAME}-frontend 3000:3000
# Test the API endpoint
curl -X 'POST' 'http://localhost:3000/generate' \
-H 'accept: text/event-stream' \
-H 'Content-Type: application/json' \
-d '{"text": "test"}'
```
## Expected Output
When you send a request with "test" as input, you'll see how the text flows through each service:
```
Frontend: Middle: Backend: test-mid-back
```
This demonstrates the service pipeline:
1. The Frontend receives "test"
2. The Middle service adds "-mid" to create "test-mid"
3. The Backend service adds "-back" to create "test-mid-back"
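The flow above can be sketched as plain shell functions (illustrative only, not the actual Dynamo service code):

```shell
# Each "service" appends its suffix and hands the text to the next stage
backend()  { printf '%s-back\n' "$1"; }
middle()   { backend "$1-mid"; }
frontend() { middle "$1"; }

frontend test   # prints: test-mid-back
```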
```bash
export DYNAMO_SERVER=https://${KUBE_NS}.dev.aire.nvidia.com
dynamo server login --api-token TEST-TOKEN --endpoint $DYNAMO_SERVER
```
2. **Build the Dynamo Base Image**
> [!NOTE]
> For instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image) section in the main README.
```bash
# Set runtime image name
```
See [multinode-examples.md](multinode-examples.md) for more details.
### Close deployment
See the [close deployment](../../docs/guides/dynamo_serve.md#close-deployment) section to learn how to close the deployment.
## Deploy to Kubernetes
These examples can be deployed to a Kubernetes cluster using Dynamo Cloud and the Dynamo deploy CLI.
### Prerequisites
Before deploying, ensure you have:
- Dynamo CLI installed
- Ubuntu 24.04 as the base image
- Required dependencies:
- Helm package manager
- Dynamo SDK and CLI tools
- Rust packages and toolchain
You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../deploy/dynamo/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
**Note**: The `KUBE_NS` variable in the following steps must match the Kubernetes namespace where you installed Dynamo Cloud. You must also expose the `dynamo-store` service externally; this is the endpoint the CLI uses to interface with Dynamo Cloud.
### Deployment Steps
1. **Login to Dynamo Cloud**
```bash
export PROJECT_ROOT=$(pwd)
export KUBE_NS=dynamo-cloud # Note: This must match the Kubernetes namespace where you installed Dynamo Cloud
export DYNAMO_SERVER=https://${KUBE_NS}.dev.aire.nvidia.com # Externally accessible endpoint to the `dynamo-store` service within your Dynamo Cloud installation
dynamo server login --api-token TEST-TOKEN --endpoint $DYNAMO_SERVER
```
2. **Build the Dynamo Base Image**
> [!NOTE]
> For instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../README.md#building-the-dynamo-base-image) section in the main README.
```bash
# Set runtime image name
export DYNAMO_IMAGE=<dynamo_docker_image_name>
# Prepare your project for deployment.
cd $PROJECT_ROOT/examples/llm
DYNAMO_TAG=$(dynamo build graphs.agg:Frontend | grep "Successfully built" | awk '{ print $NF }' | sed 's/\.$//')
```
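Here the tag is taken from the last whitespace-separated field of the "Successfully built" line and the trailing period is stripped. A sketch against a hypothetical output line (the real format may differ between versions):

```shell
# Hypothetical sample of the final line printed by `dynamo build`
sample='Successfully built Dynamo service graphs.agg:abc123.'

# Take the last field, then strip the trailing period
tag=$(printf '%s\n' "$sample" | awk '{ print $NF }' | sed 's/\.$//')
echo "$tag"   # graphs.agg:abc123
```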
3. **Deploy to Kubernetes**
```bash
echo $DYNAMO_TAG
export DEPLOYMENT_NAME=llm-agg
dynamo deployment create $DYNAMO_TAG --no-wait -n $DEPLOYMENT_NAME -f ./configs/agg.yaml
```
To delete an existing Dynamo deployment:
```bash
kubectl delete dynamodeployment $DEPLOYMENT_NAME
```
4. **Test the deployment**
After you create the Dynamo deployment, a pod prefixed with `yatai-dynamonim-image-builder` will begin running. When it finishes, service pods are created using the image it built. Once the pods prefixed with `$DEPLOYMENT_NAME` are up and running, you can test out your example!
```bash
# Forward the service port to localhost
kubectl -n ${KUBE_NS} port-forward svc/${DEPLOYMENT_NAME}-frontend 3000:3000
# Test the API endpoint
curl localhost:3000/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
"messages": [
{
"role": "user",
"content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time, lost to the world for centuries. You are an intrepid explorer, known for your unparalleled curiosity and courage, who has stumbled upon an ancient map hinting at ests that Aeloria holds a secret so profound that it has the potential to reshape the very fabric of reality. Your journey will take you through treacherous deserts, enchanted forests, and across perilous mountain ranges. Your Task: Character Background: Develop a detailed background for your character. Describe their motivations for seeking out Aeloria, their skills and weaknesses, and any personal connections to the ancient city or its legends. Are they driven by a quest for knowledge, a search for lost familt clue is hidden."
}
],
"stream":false,
"max_tokens": 30
}'
```