<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

{{ template "chart.header" . }}

{{ template "chart.description" . }}

{{ template "chart.versionBadge" . }}{{ template "chart.typeBadge" . }}{{ template "chart.appVersionBadge" . }}

## 🚀 Overview

The Dynamo Platform Helm chart deploys the complete Dynamo Kubernetes Platform infrastructure, including:

- **Dynamo Operator**: Kubernetes operator for managing Dynamo deployments
- **NATS**: High-performance messaging system for component communication
- **etcd**: Distributed key-value store for service discovery (optional, disabled by default)
- **Grove**: Multi-node inference orchestration (optional)
- **Kai Scheduler**: Advanced workload scheduling (optional)

## 📋 Prerequisites

- Kubernetes cluster (v1.20+)
- Helm 3.8+
- Sufficient cluster resources for your deployment scale
- Container registry access (if using private images)
- TLS certificate infrastructure for admission webhooks (auto-generated via Helm hooks by default, or [cert-manager](https://cert-manager.io/), or externally managed)

## 🔄 Upgrading Notes

### Webhooks are now mandatory (v1.0.0+)

The `webhook.enabled` Helm value has been removed. Admission webhooks are now a required component of the operator and cannot be disabled. This change aligns with the upcoming addition of CRD conversion webhooks, which are mandatory for multi-version API support.

No action is required for most upgrades: Helm hooks automatically generate TLS certificates and inject the CA bundle during `helm upgrade`. If you use cert-manager or externally managed certificates, verify that your existing certificate configuration is correct before upgrading.

---

## ⚠️ Important: Cluster-Wide vs Namespace-Scoped Deployment

### Single Cluster-Wide Operator (Recommended)

**By default, the Dynamo operator runs with cluster-wide permissions and should only be deployed ONCE per cluster.**

- ✅ **Recommended**: Deploy one cluster-wide operator per cluster
- ❌ **Not Recommended**: Multiple cluster-wide operators in the same cluster

### Multiple Namespace-Scoped Operators (Advanced)

If you need multiple operator instances (e.g., for multi-tenancy), use namespace-scoped deployment:

```yaml
# values.yaml
dynamo-operator:
  namespaceRestriction:
    enabled: true
    targetNamespace: "my-tenant-namespace"  # Optional, defaults to release namespace
```
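
As an illustration of the multi-tenant case, the sketch below shows values for two separate releases, each restricted to its own namespace. The file names and the namespaces `tenant-a` and `tenant-b` are hypothetical; only the `namespaceRestriction` keys come from this chart.

```yaml
# values-tenant-a.yaml (hypothetical file name)
dynamo-operator:
  namespaceRestriction:
    enabled: true
    targetNamespace: "tenant-a"  # hypothetical namespace
---
# values-tenant-b.yaml (hypothetical file name)
dynamo-operator:
  namespaceRestriction:
    enabled: true
    targetNamespace: "tenant-b"  # hypothetical namespace
```

Each file would be passed to its own release of the chart. Because the two operators target different namespaces, their scopes do not overlap.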

### Validation and Safety

The chart includes built-in validation that blocks conflicting operator installations:

- **Automatic Detection**: Scans for existing operators (both cluster-wide and namespace-restricted) during installation
- **Prevents Multiple Cluster-Wide**: Installation will fail if another cluster-wide operator exists
- **Prevents Mixed Deployments (Type 1)**: Installation will fail if you try to install a namespace-restricted operator while a cluster-wide operator exists
- **Prevents Mixed Deployments (Type 2)**: Installation will fail if you try to install a cluster-wide operator while namespace-restricted operators exist
- **Safe Defaults**: Leader election uses a shared ID for proper coordination

#### 🚫 **Blocked Conflict Scenarios**

| Existing Operator | New Operator | Status | Reason |
|-------------------|--------------|---------|--------|
| None | Cluster-wide | ✅ **Allowed** | No conflicts |
| None | Namespace-restricted | ✅ **Allowed** | No conflicts |
| Cluster-wide | Cluster-wide | ❌ **Blocked** | Multiple cluster managers |
| Cluster-wide | Namespace-restricted | ❌ **Blocked** | Cluster-wide already manages target namespace |
| Namespace-restricted | Cluster-wide | ❌ **Blocked** | Would conflict with existing namespace operators |
| Namespace-restricted A | Namespace-restricted B (diff ns) | ✅ **Allowed** | Different scopes |

## 🔧 Configuration

{{ template "chart.requirementsSection" . }}

{{ template "chart.valuesSection" . }}

### NATS Configuration

For detailed NATS configuration options beyond `nats.enabled`, please refer to the official NATS Helm chart documentation:
**[NATS Helm Chart Documentation](https://github.com/nats-io/k8s/tree/main/helm/charts/nats)**

### etcd Configuration

etcd is **no longer required** for the Dynamo platform. The operator uses Kubernetes-native service discovery by default, and the bundled etcd subchart is **disabled by default**.

To enable the bundled etcd subchart (e.g., for etcd-based service discovery):

```yaml
global:
  etcd:
    install: true
```

To use an external etcd instance instead:

```yaml
dynamo-operator:
  etcdAddr: "http://my-external-etcd:2379"
```
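
An external etcd setup can also state the bundled subchart's default explicitly, which may make the intent easier to read in a shared values file. This is a minimal sketch that combines the two keys shown above; the address is a placeholder, and `install: false` only restates the documented default.

```yaml
global:
  etcd:
    install: false  # bundled etcd subchart stays disabled (the default)
dynamo-operator:
  etcdAddr: "http://my-external-etcd:2379"  # placeholder address of your external etcd
```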

For detailed etcd configuration options, please refer to the official Bitnami etcd Helm chart documentation:
**[etcd Helm Chart Documentation](https://github.com/bitnami/charts/tree/main/bitnami/etcd)**


## 📚 Additional Resources

- [Dynamo Cloud Deployment Installation Guide](../../../../docs/kubernetes/installation-guide.md)
- [NATS Documentation](https://docs.nats.io/)
- [etcd Documentation](https://etcd.io/docs/)
- [Kubernetes Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)

{{ template "helm-docs.versionFooter" . }}