| `name` _string_ | Name of the ConfigMap containing the desired data. | | Required: {} <br/> |
| `name` _string_ | Name of the ConfigMap containing the desired data. | | Required: \{\}<br/> |
| `key` _string_ | Key in the ConfigMap to select. If not specified, defaults to "disagg.yaml". | disagg.yaml | |
...
@@ -95,11 +95,11 @@ _Appears in:_
| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `name` _string_ | Name is the desired name for the created DynamoGraphDeployment.<br/>If not specified, defaults to the DGDR name. | | Optional: {} <br/> |
| `name` _string_ | Name is the desired name for the created DynamoGraphDeployment.<br/>If not specified, defaults to the DGDR name. | | Optional: \{\}<br/> |
| `namespace` _string_ | Namespace is the desired namespace for the created DynamoGraphDeployment.<br/>If not specified, defaults to the DGDR namespace. | | Optional: {} <br/> |
| `namespace` _string_ | Namespace is the desired namespace for the created DynamoGraphDeployment.<br/>If not specified, defaults to the DGDR namespace. | | Optional: \{\}<br/> |
| `labels` _object (keys:string, values:string)_ | Labels are additional labels to add to the DynamoGraphDeployment metadata.<br/>These are merged with auto-generated labels from the profiling process. | | Optional: {} <br/> |
| `labels` _object (keys:string, values:string)_ | Labels are additional labels to add to the DynamoGraphDeployment metadata.<br/>These are merged with auto-generated labels from the profiling process. | | Optional: \{\}<br/> |
| `annotations` _object (keys:string, values:string)_ | Annotations are additional annotations to add to the DynamoGraphDeployment metadata. | | Optional: {} <br/> |
| `annotations` _object (keys:string, values:string)_ | Annotations are additional annotations to add to the DynamoGraphDeployment metadata. | | Optional: \{\}<br/> |
| `workersImage` _string_ | WorkersImage specifies the container image to use for DynamoGraphDeployment worker components.<br/>This image is used for both temporary DGDs created during online profiling and the final DGD.<br/>If omitted, the image from the base config file (e.g., disagg.yaml) is used.<br/>Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.6.1" | | Optional: {} <br/> |
| `workersImage` _string_ | WorkersImage specifies the container image to use for DynamoGraphDeployment worker components.<br/>This image is used for both temporary DGDs created during online profiling and the final DGD.<br/>If omitted, the image from the base config file (e.g., disagg.yaml) is used.<br/>Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.6.1" | | Optional: \{\}<br/> |
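The defaulting rules in the rows above (name and namespace fall back to the DGDR's own, labels are merged with auto-generated ones) can be sketched as follows. This is only an illustration, not the controller's code: `resolve_dgd_metadata` is a hypothetical helper, and the choice that a user-supplied label wins over an auto-generated label with the same key is an assumption the reference does not state.

```python
from typing import Optional


def resolve_dgd_metadata(
    dgdr_name: str,
    dgdr_namespace: str,
    overrides: Optional[dict] = None,
    generated_labels: Optional[dict] = None,
) -> dict:
    """Compute metadata for the created DGD from deploymentOverrides (hypothetical helper)."""
    overrides = overrides or {}
    # Assumption: on a key collision the user-supplied label wins.
    labels = {**(generated_labels or {}), **(overrides.get("labels") or {})}
    return {
        # Unset name/namespace fall back to the DGDR's own values.
        "name": overrides.get("name") or dgdr_name,
        "namespace": overrides.get("namespace") or dgdr_namespace,
        "labels": labels,
        "annotations": dict(overrides.get("annotations") or {}),
    }
```

Calling it with no overrides returns the DGDR name and namespace unchanged, which matches the "If not specified" wording in the table.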
#### DeploymentStatus
...
@@ -159,7 +159,8 @@ _Appears in:_
| `serviceName` _string_ | The name of the component | | |
| `componentType` _string_ | ComponentType indicates the role of this component (for example, "main"). | | |
| `subComponentType` _string_ | SubComponentType indicates the sub-role of this component (for example, "prefill"). | | |
| `dynamoNamespace` _string_ | Dynamo namespace of the service (allows overriding the Dynamo namespace defined in annotations inside the Dynamo archive) | | |
| `dynamoNamespace` _string_ | DynamoNamespace is deprecated and will be removed in a future version.<br/>The DGD Kubernetes namespace and DynamoGraphDeployment name are used to construct the Dynamo namespace for each component | | Optional: \{\}<br/> |
| `globalDynamoNamespace` _boolean_ | GlobalDynamoNamespace indicates that the Component will be placed in the global Dynamo namespace | | |
| `resources` _[Resources](#resources)_ | Resources requested and limits for this component, including CPU, memory,<br/>GPUs/devices, and any runtime-specific resources. | | |
| `autoscaling` _[Autoscaling](#autoscaling)_ | Autoscaling config for this component (replica range, target utilization, etc.). | | |
| `envs` _[EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#envvar-v1-core) array_ | Envs defines additional environment variables to inject into the component containers. | | |
...
@@ -194,7 +195,8 @@ _Appears in:_
| `serviceName` _string_ | The name of the component | | |
| `componentType` _string_ | ComponentType indicates the role of this component (for example, "main"). | | |
| `subComponentType` _string_ | SubComponentType indicates the sub-role of this component (for example, "prefill"). | | |
| `dynamoNamespace` _string_ | Dynamo namespace of the service (allows overriding the Dynamo namespace defined in annotations inside the Dynamo archive) | | |
| `dynamoNamespace` _string_ | DynamoNamespace is deprecated and will be removed in a future version.<br/>The DGD Kubernetes namespace and DynamoGraphDeployment name are used to construct the Dynamo namespace for each component | | Optional: \{\}<br/> |
| `globalDynamoNamespace` _boolean_ | GlobalDynamoNamespace indicates that the Component will be placed in the global Dynamo namespace | | |
| `resources` _[Resources](#resources)_ | Resources requested and limits for this component, including CPU, memory,<br/>GPUs/devices, and any runtime-specific resources. | | |
| `autoscaling` _[Autoscaling](#autoscaling)_ | Autoscaling config for this component (replica range, target utilization, etc.). | | |
| `envs` _[EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#envvar-v1-core) array_ | Envs defines additional environment variables to inject into the component containers. | | |
...
@@ -237,7 +239,6 @@ DynamoGraphDeploymentRequest is the Schema for the dynamographdeploymentrequests
It serves as the primary interface for users to request model deployments with
specific performance and resource constraints, enabling SLA-driven deployments.
Lifecycle:
1. Initial → Pending: Validates spec and prepares for profiling
2. Pending → Profiling: Creates and runs profiling job (online or AIC)
...
@@ -246,7 +247,6 @@ Lifecycle:
5. Ready: Terminal state when DGD is operational or spec is available
6. DeploymentDeleted: Terminal state when auto-created DGD is manually deleted
The spec becomes immutable once profiling starts. Users must delete and recreate
the DGDR to modify configuration after this point.
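The immutability rule above can be illustrated with a small check. This is a sketch under stated assumptions, not the controller's logic: `spec_change_allowed` is a hypothetical function, the state names are the ones this reference lists for `status.state`, and treating only `""` and `"Pending"` as editable (so that `"Failed"` counts as immutable) is an assumption.

```python
# State names as listed for status.state in this reference.
KNOWN_STATES = {"", "Pending", "Profiling", "Deploying", "Ready", "DeploymentDeleted", "Failed"}

# Assumption: only the states that precede profiling accept spec edits.
MUTABLE_STATES = {"", "Pending"}


def spec_change_allowed(state: str, generation: int, observed_generation: int) -> bool:
    """Return True if the current spec generation may be accepted (hypothetical helper)."""
    if state not in KNOWN_STATES:
        raise ValueError(f"unknown state: {state!r}")
    if generation == observed_generation:
        # The spec has not changed since it was last observed.
        return True
    # The spec changed: allowed only before profiling starts.
    return state in MUTABLE_STATES
```

A changed spec in `"Profiling"` or later returns `False`, which corresponds to the delete-and-recreate requirement.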
...
@@ -278,12 +278,12 @@ _Appears in:_
| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `model` _string_ | Model specifies the model to deploy (e.g., "Qwen/Qwen3-0.6B", "meta-llama/Llama-3-70b").<br/>This is a high-level identifier for easy reference in kubectl output and logs.<br/>The controller automatically sets this value in profilingConfig.config.deployment.model. | | Required: {} <br/> |
| `model` _string_ | Model specifies the model to deploy (e.g., "Qwen/Qwen3-0.6B", "meta-llama/Llama-3-70b").<br/>This is a high-level identifier for easy reference in kubectl output and logs.<br/>The controller automatically sets this value in profilingConfig.config.deployment.model. | | Required: \{\}<br/> |
| `backend` _string_ | Backend specifies the inference backend to use.<br/>The controller automatically sets this value in profilingConfig.config.engine.backend. | | Enum: [vllm sglang trtllm] <br/>Required: {} <br/> |
| `backend` _string_ | Backend specifies the inference backend to use.<br/>The controller automatically sets this value in profilingConfig.config.engine.backend. | | Enum: [vllm sglang trtllm] <br/>Required: \{\}<br/> |
| `enableGpuDiscovery` _boolean_ | EnableGpuDiscovery controls whether the profiler should automatically discover GPU<br/>resources from the Kubernetes cluster nodes. When enabled, the profiler will override<br/>any manually specified hardware configuration (min_num_gpus_per_engine, max_num_gpus_per_engine,<br/>num_gpus_per_node) with values detected from the cluster.<br/>Requires cluster-wide node access permissions - only available with cluster-scoped operators. | false | Optional: {} <br/> |
| `enableGpuDiscovery` _boolean_ | EnableGpuDiscovery controls whether the profiler should automatically discover GPU<br/>resources from the Kubernetes cluster nodes. When enabled, the profiler will override<br/>any manually specified hardware configuration (min_num_gpus_per_engine, max_num_gpus_per_engine,<br/>num_gpus_per_node) with values detected from the cluster.<br/>Requires cluster-wide node access permissions - only available with cluster-scoped operators. | false | Optional: \{\}<br/> |
| `profilingConfig` _[ProfilingConfigSpec](#profilingconfigspec)_ | ProfilingConfig provides the complete configuration for the profiling job.<br/>This configuration is passed directly to the profiler.<br/>The structure matches the profile_sla config format exactly (see ProfilingConfigSpec for schema).<br/>Note: deployment.model and engine.backend are automatically set from the high-level<br/>model and backend fields and should not be specified in this config. | | Required: {} <br/> |
| `profilingConfig` _[ProfilingConfigSpec](#profilingconfigspec)_ | ProfilingConfig provides the complete configuration for the profiling job.<br/>This configuration is passed directly to the profiler.<br/>The structure matches the profile_sla config format exactly (see ProfilingConfigSpec for schema).<br/>Note: deployment.model and engine.backend are automatically set from the high-level<br/>model and backend fields and should not be specified in this config. | | Required: \{\}<br/> |
| `autoApply` _boolean_ | AutoApply indicates whether to automatically create a DynamoGraphDeployment<br/>after profiling completes. If false, only the spec is generated and stored in status.<br/>Users can then manually create a DGD using the generated spec. | false | |
| `deploymentOverrides` _[DeploymentOverridesSpec](#deploymentoverridesspec)_ | DeploymentOverrides allows customizing metadata for the auto-created DGD.<br/>Only applicable when AutoApply is true. | | Optional: {} <br/> |
| `deploymentOverrides` _[DeploymentOverridesSpec](#deploymentoverridesspec)_ | DeploymentOverrides allows customizing metadata for the auto-created DGD.<br/>Only applicable when AutoApply is true. | | Optional: \{\}<br/> |
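To show how the required fields above fit together, here is a minimal sketch that assembles a DynamoGraphDeploymentRequest as a plain dictionary and applies the documented constraints client-side. `build_dgdr` is a hypothetical helper, the `apiVersion` value is an assumption (check it against the installed CRD), and rejecting a config that sets `deployment.model` or `engine.backend` is one possible reading of "should not be specified".

```python
import json

# Enum values documented for the `backend` field.
VALID_BACKENDS = ("vllm", "sglang", "trtllm")


def build_dgdr(name, model, backend, profiler_image, auto_apply=False, profiling_config=None):
    """Assemble a DGDR manifest as a dict (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    if backend not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}")
    if not profiler_image:
        raise ValueError("profilingConfig.profilerImage is required")
    config = dict(profiling_config or {})
    # The controller sets these two values itself, so they are rejected here.
    if "model" in config.get("deployment", {}) or "backend" in config.get("engine", {}):
        raise ValueError("do not set deployment.model or engine.backend in config")
    return {
        "apiVersion": "nvidia.com/v1alpha1",  # assumption: verify against the CRD
        "kind": "DynamoGraphDeploymentRequest",
        "metadata": {"name": name},
        "spec": {
            "model": model,
            "backend": backend,
            "autoApply": auto_apply,
            "profilingConfig": {"profilerImage": profiler_image, "config": config},
        },
    }


if __name__ == "__main__":
    manifest = build_dgdr(
        "demo", "Qwen/Qwen3-0.6B", "vllm",
        "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.6.1",
    )
    print(json.dumps(manifest, indent=2))
```

The model, backend, and image values are the examples given in the table. With `autoApply` left at its default of `false`, only the generated spec would be stored in status.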
#### DynamoGraphDeploymentRequestStatus
...
@@ -301,12 +301,12 @@ _Appears in:_
| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `state` _string_ | State is a high-level textual status of the deployment request lifecycle.<br/>Possible values: "", "Pending", "Profiling", "Deploying", "Ready", "DeploymentDeleted", "Failed"<br/>Empty string ("") represents the initial state before initialization. | | |
| `backend` _string_ | Backend is extracted from profilingConfig.config.engine.backend for display purposes.<br/>This field is populated by the controller and shown in kubectl output. | | Optional: {} <br/> |
| `backend` _string_ | Backend is extracted from profilingConfig.config.engine.backend for display purposes.<br/>This field is populated by the controller and shown in kubectl output. | | Optional: \{\}<br/> |
| `observedGeneration` _integer_ | ObservedGeneration reflects the generation of the most recently observed spec.<br/>Used to detect spec changes and enforce immutability after profiling starts. | | |
| `conditions` _[Condition](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#condition-v1-meta) array_ | Conditions contains the latest observed conditions of the deployment request.<br/>Standard condition types include: Validation, Profiling, SpecGenerated, DeploymentReady.<br/>Conditions are merged by type on patch updates. | | |
| `profilingResults` _string_ | ProfilingResults contains a reference to the ConfigMap holding profiling data.<br/>Format: "configmap/<name>" | | Optional: {} <br/> |
| `profilingResults` _string_ | ProfilingResults contains a reference to the ConfigMap holding profiling data.<br/>Format: "configmap/<name>" | | Optional: \{\}<br/> |
| `generatedDeployment` _[RawExtension](#rawextension)_ | GeneratedDeployment contains the full generated DynamoGraphDeployment specification<br/>including metadata, based on profiling results. Users can extract this to create<br/>a DGD manually, or it's used automatically when autoApply is true.<br/>Stored as RawExtension to preserve all fields including metadata. | | EmbeddedResource: {} <br/>Optional: {} <br/> |
| `generatedDeployment` _[RawExtension](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#rawextension-runtime-pkg)_ | GeneratedDeployment contains the full generated DynamoGraphDeployment specification<br/>including metadata, based on profiling results. Users can extract this to create<br/>a DGD manually, or it's used automatically when autoApply is true.<br/>Stored as RawExtension to preserve all fields including metadata. | | EmbeddedResource: \{\}<br/>Optional: \{\}<br/> |
| `deployment` _[DeploymentStatus](#deploymentstatus)_ | Deployment tracks the auto-created DGD when AutoApply is true.<br/>Contains name, namespace, state, and creation status of the managed DGD. | | Optional: {} <br/> |
| `deployment` _[DeploymentStatus](#deploymentstatus)_ | Deployment tracks the auto-created DGD when AutoApply is true.<br/>Contains name, namespace, state, and creation status of the managed DGD. | | Optional: \{\}<br/> |
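The `profilingResults` value uses the documented `configmap/<name>` format. A small parser for that string could look like the sketch below; `parse_profiling_results` is a hypothetical name, and raising `ValueError` on any other shape is a design choice rather than documented behavior.

```python
def parse_profiling_results(ref: str) -> str:
    """Return the ConfigMap name from a "configmap/<name>" reference (hypothetical helper)."""
    kind, sep, name = ref.partition("/")
    if kind != "configmap" or not sep or not name:
        raise ValueError(f"expected 'configmap/<name>', got {ref!r}")
    return name
```

The returned name can then be used to look up the ConfigMap in the DGDR's namespace.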
#### DynamoGraphDeploymentSpec
...
@@ -322,9 +322,9 @@ _Appears in:_
| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `pvcs` _[PVC](#pvc) array_ | PVCs defines a list of persistent volume claims that can be referenced by components.<br/>Each PVC must have a unique name that can be referenced in component specifications. | | Optional: {} <br/> |
| `pvcs` _[PVC](#pvc) array_ | PVCs defines a list of persistent volume claims that can be referenced by components.<br/>Each PVC must have a unique name that can be referenced in component specifications. | | MaxItems: 100 <br/>Optional: \{\}<br/> |
| `services` _object (keys:string, values:[DynamoComponentDeploymentSharedSpec](#dynamocomponentdeploymentsharedspec))_ | Services are the services to deploy as part of this deployment. | | Optional: {} <br/> |
| `services` _object (keys:string, values:[DynamoComponentDeploymentSharedSpec](#dynamocomponentdeploymentsharedspec))_ | Services are the services to deploy as part of this deployment. | | MaxProperties: 25 <br/>Optional: \{\}<br/> |
| `envs` _[EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#envvar-v1-core) array_ | Envs are environment variables applied to all services in the deployment unless<br/>overridden by service-specific configuration. | | Optional: {} <br/> |
| `envs` _[EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#envvar-v1-core) array_ | Envs are environment variables applied to all services in the deployment unless<br/>overridden by service-specific configuration. | | Optional: \{\}<br/> |
| `create` _boolean_ | Create indicates to create a new PVC | | |
| `name` _string_ | Name is the name of the PVC | | Required: {} <br/> |
| `name` _string_ | Name is the name of the PVC | | Required: \{\}<br/> |
| `storageClass` _string_ | StorageClass to be used for PVC creation. Required when create is true. | | |
| `size` _[Quantity](#quantity)_ | Size of the volume in Gi, used during PVC creation. Required when create is true. | | |
| `size` _[Quantity](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#quantity-resource-api)_ | Size of the volume in Gi, used during PVC creation. Required when create is true. | | |
| `volumeAccessMode` _[PersistentVolumeAccessMode](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#persistentvolumeaccessmode-v1-core)_ | VolumeAccessMode is the volume access mode of the PVC. Required when create is true. | | |
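The PVC rows above state that `storageClass`, `size`, and `volumeAccessMode` are required when `create` is true. A client-side check of that rule might look like this sketch; `validate_pvc` is a hypothetical helper and its error messages are illustrative only.

```python
from typing import List


def validate_pvc(pvc: dict) -> List[str]:
    """Return a list of problems found in one PVC entry (hypothetical helper)."""
    errors = []
    if not pvc.get("name"):
        errors.append("name is required")
    if pvc.get("create"):
        # These fields only matter when a new PVC is being created.
        for field in ("storageClass", "size", "volumeAccessMode"):
            if not pvc.get(field):
                errors.append(f"{field} is required when create is true")
    return errors
```

An entry that only references an existing claim by `name` passes with an empty list, since the creation-only fields are not needed in that case.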
...
@@ -439,9 +439,9 @@ _Appears in:_
| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `config` _[JSON](#json)_ | Config is the profiling configuration as arbitrary JSON/YAML. This will be passed directly to the profiler.<br/>The profiler will validate the configuration and report any errors. | | Optional: {} <br/>Type: object <br/> |
| `config` _[JSON](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#json-v1-apiextensions-k8s-io)_ | Config is the profiling configuration as arbitrary JSON/YAML. This will be passed directly to the profiler.<br/>The profiler will validate the configuration and report any errors. | | Optional: \{\}<br/>Type: object <br/> |
| `configMapRef` _[ConfigMapKeySelector](#configmapkeyselector)_ | ConfigMapRef is an optional reference to a ConfigMap containing the DynamoGraphDeployment<br/>base config file (disagg.yaml). This is separate from the profiling config above.<br/>The path to this config will be set as engine.config in the profiling config. | | Optional: {} <br/> |
| `configMapRef` _[ConfigMapKeySelector](#configmapkeyselector)_ | ConfigMapRef is an optional reference to a ConfigMap containing the DynamoGraphDeployment<br/>base config file (disagg.yaml). This is separate from the profiling config above.<br/>The path to this config will be set as engine.config in the profiling config. | | Optional: \{\}<br/> |
| `profilerImage` _string_ | ProfilerImage specifies the container image to use for profiling jobs.<br/>This image contains the profiler code and dependencies needed for SLA-based profiling.<br/>Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.6.1" | | Required: {} <br/> |
| `profilerImage` _string_ | ProfilerImage specifies the container image to use for profiling jobs.<br/>This image contains the profiler code and dependencies needed for SLA-based profiling.<br/>Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.6.1" | | Required: \{\}<br/> |
| `name` _string_ | Name references a PVC name defined in the top-level PVCs map | | Required: {} <br/> |
| `name` _string_ | Name references a PVC name defined in the top-level PVCs map | | Required: \{\}<br/> |
| `mountPoint` _string_ | MountPoint specifies where to mount the volume.<br/>If useAsCompilationCache is true and mountPoint is not specified,<br/>a backend-specific default will be used. | | |
| `useAsCompilationCache` _boolean_ | UseAsCompilationCache indicates this volume should be used as a compilation cache.<br/>When true, backend-specific environment variables will be set and default mount points may be used. | false | |