@@ -118,11 +118,14 @@ The chart includes built-in validation to prevent all operator conflicts:
| dynamo-operator.etcdAddr | string | `""` | etcd server address for an external etcd instance. Only needed when using external etcd without the bundled subchart. Format: "http://hostname:port" or "https://hostname:port" |
| dynamo-operator.nats.enabled | bool | `true` | Whether NATS is enabled |
| dynamo-operator.modelExpressURL | string | `""` | URL for the Model Express server if not deployed by this helm chart. This is ignored if Model Express server is installed by this helm chart (global.model-express.enabled is true). |
| dynamo-operator.namespaceRestriction | object | `{"enabled":false,"lease":{"duration":"30s","renewInterval":"10s"},"targetNamespace":null}` | Namespace access controls for the operator |
| dynamo-operator.namespaceRestriction.enabled | bool | `false` | Whether to restrict operator to specific namespaces. By default, the operator will run with cluster-wide permissions. Only 1 instance of the operator should be deployed in the cluster. If you want to deploy multiple operator instances, you can set this to true and specify the target namespace (by default, the target namespace is the helm release namespace). |
| dynamo-operator.namespaceRestriction.targetNamespace | string | `nil` | Target namespace for operator deployment (leave empty for current namespace) |
| dynamo-operator.gpuDiscovery.enabled | bool | `true` | Whether to provision a ClusterRole for the namespace-scoped operator to read GPU node labels. When true (default), Helm creates a ClusterRole/ClusterRoleBinding granting node read access. Set to false if your installer lacks ClusterRole creation permissions. |
| dynamo-operator.namespaceRestriction | object | `{"enabled":false,"lease":{"duration":"30s","renewInterval":"10s"},"targetNamespace":null}` | DEPRECATED: Namespace-restricted mode is deprecated and will be removed in a future release. Use cluster-wide mode (the default) instead. Do not enable this for new deployments. |
| dynamo-operator.namespaceRestriction.enabled | bool | `false` | DEPRECATED: Do not enable for new deployments. Namespace-restricted mode is deprecated. |
| dynamo-operator.namespaceRestriction.targetNamespace | string | `nil` | DEPRECATED: Only used in namespace-restricted mode, which is deprecated. |
| dynamo-operator.namespaceRestriction.lease | object | `{"duration":"30s","renewInterval":"10s"}` | DEPRECATED: Only used in namespace-restricted mode, which is deprecated. |
| dynamo-operator.namespaceRestriction.lease.duration | string | `"30s"` | DEPRECATED: Lease duration for namespace-restricted mode, which is deprecated. |
| dynamo-operator.namespaceRestriction.lease.renewInterval | string | `"10s"` | DEPRECATED: Lease renew interval for namespace-restricted mode, which is deprecated. |
| dynamo-operator.gpuDiscovery | object | `{"enabled":true}` | DEPRECATED: GPU discovery for namespace-scoped operators is deprecated along with namespace-restricted mode. |
| dynamo-operator.gpuDiscovery.enabled | bool | `true` | DEPRECATED: Only relevant when namespaceRestriction is enabled, which is deprecated. |
| dynamo-operator.controllerManager.tolerations | list | `[]` | Node tolerations for controller manager pods |
| dynamo-operator.controllerManager.leaderElection.id | string | `""` | Leader election ID for cluster-wide coordination. WARNING: All cluster-wide operators must use the SAME ID to prevent split-brain. Different IDs would allow multiple leaders simultaneously. |
# Environment variables to pass to operator Deployment.
env: []
# Namespace restriction configuration for the operator
# If enabled: true and targetNamespace is empty, the operator will be restricted to the release namespace
# If enabled: true and targetNamespace is set, the operator will be restricted to the specified namespace
# If enabled: false, the operator will run with cluster-wide permissions
# -- DEPRECATED: Namespace-restricted mode is deprecated and will be removed in a future release. Use cluster-wide mode (the default) instead. Do not enable this for new deployments.
namespaceRestriction:
# Whether to restrict the operator to a single namespace
# -- DEPRECATED: Do not enable for new deployments. Namespace-restricted mode is deprecated.
enabled: false
# The target namespace to restrict to. If empty, defaults to the release namespace
# -- DEPRECATED: Only used in namespace-restricted mode, which is deprecated.
targetNamespace: ""
# Namespace scope marker lease configuration (used to prevent conflicts when running both cluster-wide and namespace-restricted operators)
# -- DEPRECATED: Only used in namespace-restricted mode, which is deprecated.
lease:
# Duration before the namespace scope marker lease expires if not renewed (namespace-restricted mode only). When a namespace-restricted operator is running, it creates a lease in its namespace. The cluster-wide operator detects this lease and excludes that namespace from processing. If the namespace operator stops renewing the lease (e.g., crashes), the lease expires and the cluster-wide operator automatically resumes processing that namespace.
# -- DEPRECATED: Lease duration for namespace-restricted mode, which is deprecated.
duration: 30s
# Interval for renewing the namespace scope marker lease (namespace-restricted mode only). The namespace-restricted operator renews its lease at this interval to signal it's still running.
# -- DEPRECATED: Lease renew interval for namespace-restricted mode, which is deprecated.
renewInterval: 10s
# -- GPU discovery configuration (only applies when namespaceRestriction.enabled=true)
# -- DEPRECATED: GPU discovery for namespace-scoped operators is deprecated along with namespace-restricted mode.
gpuDiscovery:
# -- Whether to provision a ClusterRole for the namespace-scoped operator to read GPU node labels.
# When true (default), Helm creates a ClusterRole/ClusterRoleBinding granting node read access.
# Set to false if your installer lacks ClusterRole creation permissions; you must then provide
# hardware config manually in each DynamoGraphDeploymentRequest.
# -- DEPRECATED: Only relevant when namespaceRestriction is enabled, which is deprecated.
# -- URL for the Model Express server if not deployed by this helm chart. This is ignored if Model Express server is installed by this helm chart (global.model-express.enabled is true).
modelExpressURL: ""
# -- Namespace access controls for the operator
# -- DEPRECATED: Namespace-restricted mode is deprecated and will be removed in a future release. Use cluster-wide mode (the default) instead. Do not enable this for new deployments.
namespaceRestriction:
# -- Whether to restrict operator to specific namespaces. By default, the operator will run with cluster-wide permissions. Only 1 instance of the operator should be deployed in the cluster. If you want to deploy multiple operator instances, you can set this to true and specify the target namespace (by default, the target namespace is the helm release namespace).
# -- DEPRECATED: Do not enable for new deployments. Namespace-restricted mode is deprecated.
enabled: false
# -- Target namespace for operator deployment (leave empty for current namespace)
# -- DEPRECATED: Only used in namespace-restricted mode, which is deprecated.
targetNamespace:
# Namespace scope marker lease configuration (used to prevent conflicts when running both cluster-wide and namespace-restricted operators)
# -- DEPRECATED: Only used in namespace-restricted mode, which is deprecated.
lease:
# Duration before the namespace scope marker lease expires if not renewed (namespace-restricted mode only). When a namespace-restricted operator is running, it creates a lease in its namespace. The cluster-wide operator detects this lease and excludes that namespace from processing. If the namespace operator stops renewing the lease (e.g., crashes), the lease expires and the cluster-wide operator automatically resumes processing that namespace.
# -- DEPRECATED: Lease duration for namespace-restricted mode, which is deprecated.
duration: 30s
# Interval for renewing the namespace scope marker lease (namespace-restricted mode only). The namespace-restricted operator renews its lease at this interval to signal it's still running.
# -- DEPRECATED: Lease renew interval for namespace-restricted mode, which is deprecated.
renewInterval: 10s
# -- GPU discovery configuration (only applies when namespaceRestriction.enabled=true)
# -- DEPRECATED: GPU discovery for namespace-scoped operators is deprecated along with namespace-restricted mode.
gpuDiscovery:
# -- Whether to provision a ClusterRole for the namespace-scoped operator to read GPU node labels.
# When true (default), Helm creates a ClusterRole/ClusterRoleBinding granting node read access.
# Set to false if your installer lacks ClusterRole creation permissions.
# -- DEPRECATED: Only relevant when namespaceRestriction is enabled, which is deprecated.
enabled: true
# -- The Dynamo discovery backend to use. Default is "kubernetes" for Kubernetes API service discovery. Set to "etcd" to use ETCD for discovery. --
> **DEPRECATED:** Namespace-restricted mode (`namespaceRestriction.enabled=true`) is deprecated and will be removed in a future release. Use cluster-wide mode (the default) instead.
For more details or customization options (including multinode deployments), see **[Installation Guide for Dynamo Kubernetes Platform](installation-guide.md)**.
| `restricted` _string_ | Deprecated: Namespace-restricted mode is deprecated and will be removed in a future release.<br />Use cluster-wide mode (leave Restricted empty) instead. | | |
| `scope` _[NamespaceScopeConfiguration](#namespacescopeconfiguration)_ | Deprecated: Scope is only used in namespace-restricted mode, which is deprecated. | | |
#### NamespaceScopeConfiguration
NamespaceScopeConfiguration holds lease settings for namespace-restricted mode.
Deprecated: NamespaceScopeConfiguration is used only by the deprecated namespace-restricted
- A cluster-wide Dynamo operator is likely already running
- **Do NOT install another operator** - use the existing cluster-wide operator
- Only install a namespace-restricted operator if you specifically need to prevent the cluster-wide operator from managing your namespace (e.g., testing operator features you're developing)
Note: Use the full path `dynamo-operator.namespaceRestriction.enabled=true` (not just `namespaceRestriction.enabled=true`).
If you see this validation error, you need namespace restriction:
```
VALIDATION ERROR: Cannot install cluster-wide Dynamo operator.
Found existing namespace-restricted Dynamo operators in namespaces: ...
```
> **DEPRECATED:** Namespace-restricted mode (`namespaceRestriction.enabled=true`) is deprecated and will be removed in a future release. New deployments should use the default cluster-wide mode. If you are currently using namespace-restricted mode, plan to migrate to cluster-wide mode.
> [!TIP]
> For multinode deployments, you need to install multinode orchestration components:
...
...
@@ -196,17 +182,13 @@ Found existing namespace-restricted Dynamo operators in namespaces: ...
> By default, Dynamo Operator is installed cluster-wide and will monitor all namespaces.
> If you wish to restrict the operator to monitor only a specific namespace (the helm release namespace by default), you can set the namespaceRestriction.enabled to true.
> You can also change the restricted namespace by setting the targetNamespace property.
> [!WARNING]
> **DEPRECATED:** Namespace-restricted mode is deprecated and will be removed in a future release.
> By default, Dynamo Operator is installed cluster-wide and will monitor all namespaces. This is the recommended and only supported mode going forward.
### GPU Discovery for DynamoGraphDeploymentRequests (Deprecated Namespace-Scoped Mode)
### GPU Discovery for DynamoGraphDeploymentRequests with Namespace-Scoped Operators
> **DEPRECATED:** This section applies only to the deprecated namespace-restricted mode. New deployments should use cluster-wide mode, which has GPU discovery by default.
GPU discovery is **enabled by default** for namespace-scoped operators. The Helm chart automatically provisions a ClusterRole/ClusterRoleBinding granting the operator read-only access to node GPU labels.
...
...
@@ -270,15 +252,13 @@ cd deploy/helm/charts
# 4. Install Platform (CRDs are automatically installed by the chart)
helm dep build ./platform/
# To install cluster-wide instead, set NS_RESTRICT_FLAGS="" (empty) or omit that line entirely.
# NOTE: Namespace-restricted mode is DEPRECATED. Use cluster-wide mode (the default).
Note: Use the full path `dynamo-operator.namespaceRestriction.enabled=true` (not just `namespaceRestriction.enabled=true`).
Solution: Migrate the existing namespace-restricted operators to cluster-wide mode. Namespace-restricted mode is deprecated and should no longer be used.
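The values side of that migration can be sketched as a small override file. This is only an illustration: the file name is hypothetical, the key path is taken from the chart's values table, and it does not cover removing the existing namespace-restricted releases.

```yaml
# values-cluster-wide.yaml (hypothetical file name)
dynamo-operator:
  namespaceRestriction:
    # false is the chart default and selects cluster-wide mode.
    # Setting it explicitly guards against an older override re-enabling
    # the deprecated namespace-restricted mode.
    enabled: false
```

Because `false` is already the default, dropping any earlier `dynamo-operator.namespaceRestriction.*` overrides should have the same effect.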
@@ -15,7 +15,7 @@ TAS is **opt-in**. Existing deployments without topology constraints continue to
| **Grove** | Installed on the cluster. See the [Grove Installation Guide](https://github.com/NVIDIA/grove/blob/main/docs/installation.md). |
| **ClusterTopology CR** | A cluster-scoped `ClusterTopology` resource configured by the cluster admin, mapping topology domain names to node labels. See [Grove documentation](https://github.com/NVIDIA/grove) for setup instructions. |
| **KAI Scheduler** | [KAI Scheduler](https://github.com/NVIDIA/KAI-Scheduler) is required by Grove for topology-aware pod placement. |
| **Dynamo operator** | The latest Dynamo operator Helm chart includes read-only RBAC for `clustertopologies.grove.io` via a dedicated ClusterRole. This works for both cluster-wide and namespace-restricted operator deployments — no extra configuration is needed. |
| **Dynamo operator** | The latest Dynamo operator Helm chart includes read-only RBAC for `clustertopologies.grove.io` via a dedicated ClusterRole. No extra configuration is needed. |
@@ -18,7 +18,6 @@ All webhook types (validating, mutating, conversion, etc.) share the same **webh
- ✅ **Shared certificate infrastructure** - All webhook types use the same TLS certificates
- ✅ **Automatic certificate generation and rotation** - Built-in cert-controller, no manual management required
- ✅ **cert-manager integration** - Optional integration for custom PKI or organizational certificate policies
- ✅ **Multi-operator support** - Lease-based coordination for cluster-wide and namespace-restricted deployments
- ✅ **Immutability enforcement** - Critical fields protected via CEL validation rules
### Current Webhook Types
...
...
@@ -165,7 +164,7 @@ webhook:
values: ["disabled"]
```
**Note:** For **namespace-restricted operators**, the namespace selector is automatically set to validate only the operator's namespace. This configuration is ignored in namespace-restricted mode.
**Note:** For **namespace-restricted operators** (deprecated), the namespace selector is automatically set to validate only the operator's namespace. This configuration is ignored in namespace-restricted mode.
> **DEPRECATED:** Namespace-restricted mode and multi-operator deployments are deprecated and will be removed in a future release. Use a single cluster-wide operator instead.
The operator supports running both **cluster-wide** and **namespace-restricted** instances simultaneously using a **lease-based coordination mechanism**.
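For existing deployments still on the deprecated mode, the timing of this coordination is governed by two chart values. The fragment below is a sketch using the defaults listed in the chart's values table; it is not a recommendation to enable the mode.

```yaml
dynamo-operator:
  namespaceRestriction:
    enabled: true          # deprecated; shown only for existing deployments
    lease:
      duration: 30s        # lease expires if not renewed within this window
      renewInterval: 10s   # namespace-restricted operator renews at this interval
```

With these defaults the renew interval is one third of the lease duration. If the namespace-restricted operator stops renewing (for example, after a crash), the lease expires and the cluster-wide operator resumes processing that namespace.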