fix: allow v1alpha1 DGDR creation by fixing webhook version matching and backend enum (#6803)

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>

Commit 0404a9be (unverified), authored Mar 03, 2026 by Julien Mancuso, committed by GitHub Mar 03, 2026
5 changed files
--- a/deploy/helm/charts/platform/components/operator/crds/nvidia.com_dynamographdeploymentrequests.yaml
+++ b/deploy/helm/charts/platform/components/operator/crds/nvidia.com_dynamographdeploymentrequests.yaml
@@ -94,6 +94,7 @@ spec:
                    The controller automatically sets this value in profilingConfig.config.engine.backend.
                    Profiling runs on real GPUs or via AIC simulation to collect performance data.
                  enum:
+                    - auto
                    - vllm
                    - sglang
                    - trtllm

--- a/deploy/helm/charts/platform/components/operator/templates/webhook-configuration.yaml
+++ b/deploy/helm/charts/platform/components/operator/templates/webhook-configuration.yaml
@@ -160,7 +160,6 @@ webhooks:
  - apiGroups:
    - nvidia.com
    apiVersions:
-    - v1alpha1
    - v1beta1
    operations:
    - CREATE

--- a/deploy/operator/api/v1alpha1/dynamographdeploymentrequest_types.go
+++ b/deploy/operator/api/v1alpha1/dynamographdeploymentrequest_types.go
@@ -151,7 +151,7 @@ type DynamoGraphDeploymentRequestSpec struct {
 	// The controller automatically sets this value in profilingConfig.config.engine.backend.
 	// Profiling runs on real GPUs or via AIC simulation to collect performance data.
 	// +kubebuilder:validation:Required
-	// +kubebuilder:validation:Enum=vllm;sglang;trtllm
+	// +kubebuilder:validation:Enum=auto;vllm;sglang;trtllm
 	Backend string `json:"backend"`

 	// UseMocker indicates whether to deploy a mocker DynamoGraphDeployment instead of

--- a/deploy/operator/config/crd/bases/nvidia.com_dynamographdeploymentrequests.yaml
+++ b/deploy/operator/config/crd/bases/nvidia.com_dynamographdeploymentrequests.yaml
@@ -94,6 +94,7 @@ spec:
                    The controller automatically sets this value in profilingConfig.config.engine.backend.
                    Profiling runs on real GPUs or via AIC simulation to collect performance data.
                  enum:
+                    - auto
                    - vllm
                    - sglang
                    - trtllm

--- a/docs/kubernetes/api-reference.md
+++ b/docs/kubernetes/api-reference.md
@@ -510,7 +510,7 @@ _Appears in:_
 | Field | Description | Default | Validation |
 | --- | --- | --- | --- |
 | `model` _string_ | Model specifies the model to deploy (e.g., "Qwen/Qwen3-0.6B", "meta-llama/Llama-3-70b").<br />This is a high-level identifier for easy reference in kubectl output and logs.<br />The controller automatically sets this value in profilingConfig.config.deployment.model. |  | Required: \{\} <br /> |
-| `backend` _string_ | Backend specifies the inference backend for profiling.<br />The controller automatically sets this value in profilingConfig.config.engine.backend.<br />Profiling runs on real GPUs or via AIC simulation to collect performance data. |  | Enum: [vllm sglang trtllm] <br />Required: \{\} <br /> |
+| `backend` _string_ | Backend specifies the inference backend for profiling.<br />The controller automatically sets this value in profilingConfig.config.engine.backend.<br />Profiling runs on real GPUs or via AIC simulation to collect performance data. |  | Enum: [auto vllm sglang trtllm] <br />Required: \{\} <br /> |
 | `useMocker` _boolean_ | UseMocker indicates whether to deploy a mocker DynamoGraphDeployment instead of<br />a real backend deployment. When true, the deployment uses simulated engines that<br />don't require GPUs, using the profiling data to simulate realistic timing behavior.<br />Mocker is available in all backend images and useful for large-scale experiments.<br />Profiling still runs against the real backend (specified above) to collect performance data. | false |  |
 | `profilingConfig` _[ProfilingConfigSpec](#profilingconfigspec)_ | ProfilingConfig provides the complete configuration for the profiling job.<br />Note: GPU discovery is automatically attempted to detect GPU resources from Kubernetes<br />cluster nodes. If the operator has node read permissions (cluster-wide or explicitly granted),<br />discovered GPU configuration is used as defaults when hardware configuration is not manually<br />specified (minNumGpusPerEngine, maxNumGpusPerEngine, numGpusPerNode). User-specified values<br />always take precedence over auto-discovered values. If GPU discovery fails (e.g.,<br />namespace-restricted operator without node permissions), manual hardware config is required.<br />This configuration is passed directly to the profiler.<br />The structure matches the profile_sla config format exactly (see ProfilingConfigSpec for schema).<br />Note: deployment.model and engine.backend are automatically set from the high-level<br />modelName and backend fields and should not be specified in this config. |  | Required: \{\} <br /> |
 | `enableGpuDiscovery` _boolean_ | EnableGPUDiscovery controls whether the operator attempts to discover GPU hardware from cluster nodes.<br />DEPRECATED: This field is deprecated and will be removed in v1beta1. GPU discovery is now always<br />attempted automatically. Setting this field has no effect - the operator will always try to discover<br />GPU hardware when node read permissions are available. If discovery is unavailable (e.g., namespace-scoped<br />operator without permissions), manual hardware configuration is required regardless of this setting. | true | Optional: \{\} <br /> |
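
For illustration only, here is a minimal Go sketch of the check that the updated `backend` enum implies. In the cluster the API server enforces this from the generated CRD schema; `allowedBackends` and `validateBackend` are hypothetical names introduced here and are not part of the operator code.

```go
package main

import "fmt"

// allowedBackends mirrors the marker after this commit:
// +kubebuilder:validation:Enum=auto;vllm;sglang;trtllm
// (hypothetical variable, for illustration only)
var allowedBackends = []string{"auto", "vllm", "sglang", "trtllm"}

// validateBackend is a hypothetical helper that returns nil when the
// value is one of the enum members and an error otherwise.
func validateBackend(backend string) error {
	for _, b := range allowedBackends {
		if backend == b {
			return nil
		}
	}
	return fmt.Errorf("backend %q not in enum %v", backend, allowedBackends)
}

func main() {
	// "tensorrt" is a deliberately invalid value to show the rejection path.
	for _, b := range []string{"auto", "vllm", "tensorrt"} {
		if err := validateBackend(b); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", b)
	}
}
```

Before this change a v1alpha1 spec with `backend: auto` would fail the equivalent schema check, since `auto` was absent from the enum.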