Commit 0404a9be, authored by Julien Mancuso, committed by GitHub

fix: allow v1alpha1 DGDR creation by fixing webhook version matching and backend enum (#6803)


Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
parent 57bdbdf1
@@ -94,6 +94,7 @@ spec:
  The controller automatically sets this value in profilingConfig.config.engine.backend.
  Profiling runs on real GPUs or via AIC simulation to collect performance data.
  enum:
+ - auto
  - vllm
  - sglang
  - trtllm
......
@@ -160,7 +160,6 @@ webhooks:
  - apiGroups:
  - nvidia.com
  apiVersions:
- - v1alpha1
  - v1beta1
  operations:
  - CREATE
......
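Note on the webhook hunk above: Kubernetes invokes an admission webhook only for requests whose API group, version, and operation all appear in one of its rules. The following is a minimal Go sketch of that matching logic, not the operator's or the API server's code. The `rule` type and its helpers are hypothetical, and only the entries visible in the hunk are modeled.

```go
package main

import "fmt"

// rule is a hypothetical, simplified stand-in for one admission webhook rule.
// It models only the fields visible in the hunk above.
type rule struct {
	apiGroups   []string
	apiVersions []string
	operations  []string
}

// contains reports whether list holds want or the "*" wildcard.
func contains(list []string, want string) bool {
	for _, v := range list {
		if v == "*" || v == want {
			return true
		}
	}
	return false
}

// matches reports whether a request with the given group, version, and
// operation is selected by the rule. All three must match.
func (r rule) matches(group, version, operation string) bool {
	return contains(r.apiGroups, group) &&
		contains(r.apiVersions, version) &&
		contains(r.operations, operation)
}

func main() {
	before := rule{
		apiGroups:   []string{"nvidia.com"},
		apiVersions: []string{"v1alpha1", "v1beta1"},
		operations:  []string{"CREATE"},
	}
	after := rule{
		apiGroups:   []string{"nvidia.com"},
		apiVersions: []string{"v1beta1"},
		operations:  []string{"CREATE"},
	}

	// A v1alpha1 CREATE was selected by the old rule but not by the new one.
	fmt.Println(before.matches("nvidia.com", "v1alpha1", "CREATE"))
	fmt.Println(after.matches("nvidia.com", "v1alpha1", "CREATE"))
	fmt.Println(after.matches("nvidia.com", "v1beta1", "CREATE"))
}
```

Under these assumptions, a v1alpha1 CREATE request no longer reaches this webhook after the change, while v1beta1 requests still do.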
@@ -151,7 +151,7 @@ type DynamoGraphDeploymentRequestSpec struct {
  // The controller automatically sets this value in profilingConfig.config.engine.backend.
  // Profiling runs on real GPUs or via AIC simulation to collect performance data.
  // +kubebuilder:validation:Required
- // +kubebuilder:validation:Enum=vllm;sglang;trtllm
+ // +kubebuilder:validation:Enum=auto;vllm;sglang;trtllm
  Backend string `json:"backend"`
  // UseMocker indicates whether to deploy a mocker DynamoGraphDeployment instead of
......
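Note on the marker change above: the `+kubebuilder:validation:Enum` marker is turned into an `enum` list in the generated CRD schema, and the API server rejects any `backend` value outside that list. The Go sketch below only illustrates the effect of the new value set. The `allowedBackends` variable and `isAllowedBackend` function are hypothetical names, and `tensorrt` is just an example of a value outside the enum.

```go
package main

import "fmt"

// allowedBackends mirrors the enum values in the updated marker.
// This variable is illustrative and is not part of the operator's code.
var allowedBackends = []string{"auto", "vllm", "sglang", "trtllm"}

// isAllowedBackend reports whether backend is one of the enum values.
func isAllowedBackend(backend string) bool {
	for _, b := range allowedBackends {
		if b == backend {
			return true
		}
	}
	return false
}

func main() {
	// "tensorrt" is a made-up value used to show a rejected input.
	for _, b := range []string{"auto", "vllm", "tensorrt"} {
		fmt.Printf("%s allowed: %t\n", b, isAllowedBackend(b))
	}
}
```

Before this commit the list lacked `auto`, so a request with `backend: auto` would have failed schema validation.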
@@ -94,6 +94,7 @@ spec:
  The controller automatically sets this value in profilingConfig.config.engine.backend.
  Profiling runs on real GPUs or via AIC simulation to collect performance data.
  enum:
+ - auto
  - vllm
  - sglang
  - trtllm
......
@@ -510,7 +510,7 @@ _Appears in:_
  | Field | Description | Default | Validation |
  | --- | --- | --- | --- |
  | `model` _string_ | Model specifies the model to deploy (e.g., "Qwen/Qwen3-0.6B", "meta-llama/Llama-3-70b").<br />This is a high-level identifier for easy reference in kubectl output and logs.<br />The controller automatically sets this value in profilingConfig.config.deployment.model. | | Required: \{\} <br /> |
- | `backend` _string_ | Backend specifies the inference backend for profiling.<br />The controller automatically sets this value in profilingConfig.config.engine.backend.<br />Profiling runs on real GPUs or via AIC simulation to collect performance data. | | Enum: [vllm sglang trtllm] <br />Required: \{\} <br /> |
+ | `backend` _string_ | Backend specifies the inference backend for profiling.<br />The controller automatically sets this value in profilingConfig.config.engine.backend.<br />Profiling runs on real GPUs or via AIC simulation to collect performance data. | | Enum: [auto vllm sglang trtllm] <br />Required: \{\} <br /> |
  | `useMocker` _boolean_ | UseMocker indicates whether to deploy a mocker DynamoGraphDeployment instead of<br />a real backend deployment. When true, the deployment uses simulated engines that<br />don't require GPUs, using the profiling data to simulate realistic timing behavior.<br />Mocker is available in all backend images and useful for large-scale experiments.<br />Profiling still runs against the real backend (specified above) to collect performance data. | false | |
  | `profilingConfig` _[ProfilingConfigSpec](#profilingconfigspec)_ | ProfilingConfig provides the complete configuration for the profiling job.<br />Note: GPU discovery is automatically attempted to detect GPU resources from Kubernetes<br />cluster nodes. If the operator has node read permissions (cluster-wide or explicitly granted),<br />discovered GPU configuration is used as defaults when hardware configuration is not manually<br />specified (minNumGpusPerEngine, maxNumGpusPerEngine, numGpusPerNode). User-specified values<br />always take precedence over auto-discovered values. If GPU discovery fails (e.g.,<br />namespace-restricted operator without node permissions), manual hardware config is required.<br />This configuration is passed directly to the profiler.<br />The structure matches the profile_sla config format exactly (see ProfilingConfigSpec for schema).<br />Note: deployment.model and engine.backend are automatically set from the high-level<br />modelName and backend fields and should not be specified in this config. | | Required: \{\} <br /> |
  | `enableGpuDiscovery` _boolean_ | EnableGPUDiscovery controls whether the operator attempts to discover GPU hardware from cluster nodes.<br />DEPRECATED: This field is deprecated and will be removed in v1beta1. GPU discovery is now always<br />attempted automatically. Setting this field has no effect - the operator will always try to discover<br />GPU hardware when node read permissions are available. If discovery is unavailable (e.g., namespace-scoped<br />operator without permissions), manual hardware configuration is required regardless of this setting. | true | Optional: \{\} <br /> |
......