operator_deployment.md 7.71 KB
Newer Older
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
<!--
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

18
19
20
21
22
23
24
25
# Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform

This guide walks you through deploying an inference graph created with the Dynamo SDK onto a Kubernetes cluster using the Dynamo cloud platform and the Dynamo deploy CLI. The Dynamo cloud platform provides a streamlined experience for deploying and managing your inference services.

## Prerequisites

Before proceeding with deployment, ensure you have:

26
- [Dynamo Python package](../../get_started.md#installation) installed
27
28
29
30
31
32
- A Kubernetes cluster with the [Dynamo cloud platform](dynamo_cloud.md) installed
- Ubuntu 24.04 as the base image for your services
- Required dependencies:
  - Helm package manager
  - Rust packages and toolchain

33
You must have first followed the instructions in [deploy/dynamo/helm/README.md](https://github.com/ai-dynamo/dynamo/blob/main/deploy/dynamo/cloud/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56

## Understanding the Deployment Process

The deployment process involves two main steps:

1. **Local Build (`dynamo build`)**
   - Creates a Dynamo service archive containing:
     - Service code and dependencies
     - Service configuration and metadata
     - Runtime requirements
     - Service graph definition
   - This archive is used as input for the remote build process

2. **Remote Image Build**
   - A `yatai-dynamonim-image-builder` pod is created in your cluster
   - This pod:
     - Takes the Dynamo service archive
     - Containerizes it using the specified base image
     - Pushes the final container image to your cluster's registry
   - The build process is managed by the Dynamo operator

## Deployment Steps

57
### 1. Configure Environment Variables
58

59
60
61
62
First, set up your environment variables for working with Dynamo Cloud. You have two options for accessing the `dynamo-store` service:

#### Option 1: Using Port-Forward (Local Development)
This is the simplest approach for local development and testing:
63
64
65
66
67
68
69
70

```bash
# Set your project root directory
export PROJECT_ROOT=$(pwd)

# Set your Kubernetes namespace (must match the namespace where Dynamo cloud is installed)
export KUBE_NS=hello-world

71
72
73
74
75
# In a separate terminal, run port-forward to expose the dynamo-store service locally
kubectl port-forward svc/dynamo-store 8080:80 -n $KUBE_NS

# Set DYNAMO_CLOUD to use the local port-forward endpoint
export DYNAMO_CLOUD=http://localhost:8080
76
77
```

78
79
#### Option 2: Using Ingress/VirtualService (Production)
For production environments, you should use proper ingress configuration:
80

81
82
83
84
85
86
87
88
89
90
91
```bash
# Set your project root directory
export PROJECT_ROOT=$(pwd)

# Set your Kubernetes namespace (must match the namespace where Dynamo cloud is installed)
export KUBE_NS=hello-world

# Set DYNAMO_CLOUD to your externally accessible endpoint
# This could be your Ingress hostname or VirtualService URL
export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com  # Replace with your actual endpoint
```
92

93
94
95
``` {note}
The `DYNAMO_CLOUD` environment variable is required for all Dynamo deployment commands. Make sure it's set before running any deployment operations.
```
96
97
98
99
100

### 2. Build the Dynamo Base Image

Before building your service, you need to ensure the base image is properly set up:

101
1. For detailed instructions on building and pushing the Dynamo base image, see the [Building the Dynamo Base Image](../../get_started.md#building-the-dynamo-base-image) section in the main README.
102

103
2. Export the image from the previous step to your environment.
104
```bash
105
106
# Export the image from the previous step to your environment
export DYNAMO_IMAGE=<your-registry>/<your-image-name>:<your-tag>
107
108
109
110
111

# Navigate to your project directory
cd $PROJECT_ROOT/examples/hello_world

# Build the service and capture the tag
112
DYNAMO_TAG=$(dynamo build hello_world:Frontend | grep "Successfully built" | awk '{ print $3 }' | sed 's/\.$//')
113
114
115
116
117
118
119
120
121
122
123
```

### 3. Deploy to Kubernetes

Deploy your service using the Dynamo deployment command:

```bash
# Set your Helm release name
export DEPLOYMENT_NAME=hello-world

# Create the deployment
124
dynamo deployment create $DYNAMO_TAG -n $DEPLOYMENT_NAME
125
126
```

127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
#### Managing Deployments

Once you have deployments running, you can manage them using the following commands:

To see a list of all deployments in your namespace:

```bash
dynamo deployment list
```
This command displays a table of all deployments.

To get detailed information about a specific deployment:

```bash
dynamo deployment get $DEPLOYMENT_NAME
```

144
145
146
147
148
149
To update a specific deployment:

```bash
dynamo deployment update $DEPLOYMENT_NAME [--config-file FILENAME] [--env ENV_VAR]
```

150
151
To remove a deployment and all its associated resources:

152
```bash
153
dynamo deployment delete $DEPLOYMENT_NAME
154
```
155
156
157
158

```{warning}
This command permanently deletes the deployment and all associated resources. Make sure you have any necessary backups before proceeding.
```
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190

### 4. Test the Deployment

The deployment process creates several pods:
1. A `yatai-dynamonim-image-builder` pod for building the container image
2. Service pods prefixed with `$DEPLOYMENT_NAME` once the build is complete

To test your deployment:

```bash
# Forward the service port to localhost
kubectl -n ${KUBE_NS} port-forward svc/${DEPLOYMENT_NAME}-frontend 3000:3000

# Test the API endpoint
curl -X 'POST' 'http://localhost:3000/generate' \
    -H 'accept: text/event-stream' \
    -H 'Content-Type: application/json' \
    -d '{"text": "test"}'
```

## Expected Output

When you send a request with "test" as input, you'll see how the text flows through each service:

```
Frontend: Middle: Backend: test-mid-back
```

This demonstrates the service pipeline:
1. The Frontend receives "test"
2. The Middle service adds "-mid" to create "test-mid"
3. The Backend service adds "-back" to create "test-mid-back"
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227

## Using Kubernetes Secrets for Environment Variables

Dynamo supports securely injecting environment variables from Kubernetes secrets into your deployment. This is only supported when deploying with `--target kubernetes`.

### Creating a Secret

First, create a Kubernetes secret containing your sensitive values:

```bash
export HF_TOKEN=your_hf_token
kubectl create secret generic dynamo-env-secrets \
  --from-literal=huggingface.token=$HF_TOKEN \
  --from-literal=another_secret.key=value \
  -n $KUBE_NS
```

### Referencing Secrets in Your Deployment

You can reference secret keys in your deployment using the `--env-from-secret` flag:

- `--env-from-secret HF_TOKEN=huggingface.token` will set the `HF_TOKEN` environment variable from the `huggingface.token` key in the secret.
- `--env-from-secret ANOTHER_SECRET=another_secret.key` will set the `ANOTHER_SECRET` environment variable from the same-named key in the secret.
- You can also mix normal envs: `--env NORMAL_ENV_KEY=value`.

By default, Dynamo will look for a secret named `dynamo-env-secrets`. You can override this with the `--env-secrets-name` flag or the `DYNAMO_ENV_SECRETS` environment variable.

### Example Full Command

```bash
dynamo deploy $DYNAMO_TAG -n $DEPLOYMENT_NAME -f ./configs/agg.yaml \
  --env NORMAL_ENV_KEY=value \
  --env-from-secret HF_TOKEN=huggingface.token \
  --env-from-secret ANOTHER_SECRET=another_secret.key \
  --target kubernetes
```