Unverified Commit ad3a46a6 authored by Julien Mancuso's avatar Julien Mancuso Committed by GitHub
Browse files

feat: deprecated namespace-restricted mode (#7934)

parent 0078e283
......@@ -85,10 +85,11 @@ kubectl create secret generic hf-token-secret \
### Step 1.3: Install Dynamo Platform
If your cluster uses namespace-restricted Dynamo operators, you'll need to install the Dynamo platform in the workload namespace. Follow the [Dynamo Kubernetes Installation Guide](https://github.com/ai-dynamo/dynamo/blob/main/docs/kubernetes/installation-guide.md) to install the platform in `dynamo-bench`.
Follow the [Dynamo Kubernetes Installation Guide](https://github.com/ai-dynamo/dynamo/blob/main/docs/kubernetes/installation-guide.md) to install the platform in `dynamo-bench`.
> **Note:** Namespace-restricted mode (`namespaceRestriction.enabled=true`) is deprecated and will be removed in a future release. Use cluster-wide mode for new deployments.
**Key Configuration Notes:**
- If your cluster uses namespace restrictions, ensure `dynamo-operator.namespaceRestriction.enabled=true` is set during installation
- Adjust version tags to match your cluster's available Dynamo versions
- If you encounter operator compatibility issues (e.g., unsupported MPI arguments), consult your cluster administrator or the Dynamo troubleshooting documentation
......
......@@ -332,10 +332,13 @@ See [AI Configurator documentation](https://github.com/ai-dynamo/aiconfigurator#
The operator automatically discovers GPU resources from cluster nodes, providing hardware info (GPU model, VRAM, GPUs per node) and automatic profiling search space calculation.
**Requirements:**
- **Cluster-scoped operators**: Have node read permissions by default
- **Namespace-scoped operators**: GPU discovery is enabled by default when installing via Helm — the chart provisions the required ClusterRole/ClusterRoleBinding automatically
- **Cluster-scoped operators** (recommended): Have node read permissions by default. GPU discovery works automatically.
**For namespace-scoped operators**, GPU discovery is controlled by a Helm value:
> **DEPRECATED:** The following applies only to namespace-scoped operators, which are deprecated and will be removed in a future release. Use cluster-wide mode for new deployments.
- **Namespace-scoped operators** (deprecated): GPU discovery is enabled by default when installing via Helm — the chart provisions the required ClusterRole/ClusterRoleBinding automatically
**For namespace-scoped operators (deprecated)**, GPU discovery is controlled by a Helm value:
```bash
# GPU discovery enabled (default) — Helm provisions read-only node access automatically
......
......@@ -107,7 +107,7 @@ For the full spec reference, see the [DGDR API Reference](api-reference.md) and
[Profiler Guide](../components/profiler/profiler-guide.md).
> [!IMPORTANT]
> If you are using a **namespace-scoped operator** with GPU discovery disabled, you must also
> If you are using a **namespace-scoped operator** (deprecated) with GPU discovery disabled, you must also
> provide explicit hardware info or the DGDR will be rejected at admission:
>
> ```yaml
......@@ -119,8 +119,10 @@ For the full spec reference, see the [DGDR API Reference](api-reference.md) and
> vramMb: 81920
> ```
>
> See the [installation guide](installation-guide.md#gpu-discovery-for-dynamographdeploymentrequests-with-namespace-scoped-operators)
> See the [installation guide](installation-guide.md#gpu-discovery-for-dynamographdeploymentrequests-deprecated-namespace-scoped-mode)
> for details.
>
> **Note:** Namespace-scoped mode is deprecated. Use cluster-wide mode for new deployments.
## Step 3: Monitor Profiling Progress
......@@ -269,7 +271,7 @@ kubectl describe dynamographdeploymentrequest qwen3-first-model -n ${NAMESPACE}
Common causes: no available GPU nodes, image pull failure (check image tag; NGC credentials are
optional but may be needed if you hit rate limits pulling from public NGC), missing `hardware`
config for a namespace-scoped operator.
config for a namespace-scoped operator (deprecated).
> [!TIP]
> **GPU node taints** are a frequent cause of pods staying `Pending`. Many clusters (including
......
......@@ -28,7 +28,7 @@ Dynamo operator is a Kubernetes operator that simplifies the deployment, configu
The Dynamo operator supports three deployment modes to accommodate different cluster environments and use cases:
### 1. Cluster-Wide Mode (Default)
### 1. Cluster-Wide Mode (Default, Recommended)
The operator monitors and manages DynamoGraph resources across **all namespaces** in the cluster.
......@@ -39,7 +39,9 @@ The operator monitors and manages DynamoGraph resources across **all namespaces*
---
### 2. Namespace-Scoped Mode
### 2. Namespace-Scoped Mode (DEPRECATED)
> **DEPRECATED:** Namespace-scoped mode (`namespaceRestriction.enabled=true`) is deprecated and will be removed in a future release. Use cluster-wide mode instead. Do not use this for new deployments.
The operator monitors and manages DynamoGraph resources **only in a specific namespace**. A lease marker is created to signal the operator's presence to any cluster-wide operators.
......@@ -59,7 +61,9 @@ helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz \
---
### 3. Hybrid Mode
### 3. Hybrid Mode (DEPRECATED)
> **DEPRECATED:** Hybrid mode relies on namespace-scoped operators, which are deprecated and will be removed in a future release. Use a single cluster-wide operator instead.
A **cluster-wide operator** manages most namespaces, while **one or more namespace-scoped operators** run in specific namespaces (e.g., for testing new versions). The cluster-wide operator automatically detects and excludes namespaces with namespace-scoped operators using lease markers.
......@@ -128,7 +132,6 @@ The Dynamo Operator uses **Kubernetes admission webhooks** for real-time validat
- ✅ Shared certificate infrastructure across all webhook types
- ✅ Automatic certificate generation and rotation (default, all environments)
- ✅ cert-manager integration (optional, for custom PKI)
- ✅ Multi-operator support with lease-based coordination
- ✅ Immutability enforcement for critical fields
For complete documentation on webhooks, certificate management, and troubleshooting, see:
......@@ -175,7 +178,7 @@ helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-$
helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE} --create-namespace
```
> **Note:** For shared/multi-tenant clusters or testing scenarios, see [Deployment Modes](#deployment-modes) above for namespace-scoped and hybrid configurations.
> **Note:** Namespace-scoped and hybrid deployment modes are deprecated. Use cluster-wide mode for all new deployments. See [Deployment Modes](#deployment-modes) above if you need backward-compatible configurations.
### Building from Source
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment