| dynamo-operator.enabled | bool | `true` | Whether to enable the Dynamo Kubernetes operator deployment |
| dynamo-operator.natsAddr | string | `""` | NATS server address for operator communication (leave empty to use the bundled NATS chart). Format: "nats://hostname:port" |
| dynamo-operator.etcdAddr | string | `""` | etcd server address for operator state storage (leave empty to use the bundled etcd chart). Format: "http://hostname:port" or "https://hostname:port" |
| dynamo-operator.namespaceRestriction.enabled | bool | `true` | Whether to restrict operator to specific namespaces |
| dynamo-operator.namespaceRestriction.targetNamespace | string | `nil` | Target namespace for operator deployment (leave empty for current namespace) |
| dynamo-operator.controllerManager.tolerations | list | `[]` | Node tolerations for controller manager pods |
| dynamo-operator.controllerManager.manager.image.repository | string | `"nvcr.io/nvidia/ai-dynamo/kubernetes-operator"` | Official NVIDIA Dynamo operator image repository |
| dynamo-operator.controllerManager.manager.image.tag | string | `""` | Image tag (leave empty to use chart default) |
| dynamo-operator.controllerManager.manager.image.pullPolicy | string | `"IfNotPresent"` | Image pull policy - when to pull the image |
| dynamo-operator.controllerManager.manager.args[0] | string | `"--health-probe-bind-address=:8081"` | Health probe endpoint for Kubernetes health checks |
| dynamo-operator.controllerManager.manager.args[1] | string | `"--metrics-bind-address=127.0.0.1:8080"` | Metrics endpoint for Prometheus scraping (localhost only for security) |
| dynamo-operator.imagePullSecrets | list | `[]` | Secrets for pulling private container images |
| dynamo-operator.dynamo.groveTerminationDelay | string | `"15m"` | How long to wait before forcefully terminating Grove instances |
| dynamo-operator.dynamo.virtualServiceSupportsHTTPS | bool | `false` | Whether VirtualServices should support HTTPS routing |
| grove.enabled | bool | `false` | Whether to enable Grove for multi-node inference coordination, if enabled, the Grove operator will be deployed cluster-wide |
| kai-scheduler.enabled | bool | `false` | Whether to enable Kai Scheduler for intelligent resource allocation, if enabled, the Kai Scheduler operator will be deployed cluster-wide |
| etcd.enabled | bool | `true` | Whether to enable etcd deployment, disable if you want to use an external etcd instance |
| nats.enabled | bool | `true` | Whether to enable NATS deployment, disable if you want to use an external NATS instance |
### NATS Configuration
For detailed NATS configuration options beyond `nats.enabled`, please refer to the official NATS Helm chart documentation:
# Used to generate top-level secrets (overridden by custom-values.yaml)
# Subcharts
# Subcharts configuration
# Dynamo operator configuration
dynamo-operator:
# -- Whether to enable the Dynamo Kubernetes operator deployment
enabled:true
# -- NATS server address for operator communication (leave empty to use the bundled NATS chart). Format: "nats://hostname:port"
natsAddr:""
# -- etcd server address for operator state storage (leave empty to use the bundled etcd chart). Format: "http://hostname:port" or "https://hostname:port"
etcdAddr:""
# Namespace access controls for the operator
namespaceRestriction:
# -- Whether to restrict operator to specific namespaces
enabled:true
# -- Target namespace for operator deployment (leave empty for current namespace)
targetNamespace:
# Controller manager configuration
controllerManager:
# -- Node tolerations for controller manager pods
tolerations:[]
manager:
# Container image configuration for the operator manager
image:
# -- Official NVIDIA Dynamo operator image repository
| `metadata` _[ObjectMeta](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#objectmeta-v1-meta)_ | Refer to Kubernetes API documentation for fields of `metadata`. | | |
| `spec` _[DynamoComponentDeploymentSpec](#dynamocomponentdeploymentspec)_ | Spec defines the desired state for this Dynamo component deployment. | | |
| `annotations` _object (keys:string, values:string)_ | Annotations to add to generated Kubernetes resources for this component<br/>(such as Pod, Service, and Ingress when applicable). | | |
| `labels` _object (keys:string, values:string)_ | Labels to add to generated Kubernetes resources for this component. | | |
| `serviceName` _string_ | The name of the component | | |
| `componentType` _string_ | ComponentType indicates the role of this component (for example, "main"). | | |
| `dynamoNamespace` _string_ | Dynamo namespace of the service (allows to override the Dynamo namespace of the service defined in annotations inside the Dynamo archive) | | |
| `resources` _[Resources](#resources)_ | Resources requested and limits for this component, including CPU, memory,<br/>GPUs/devices, and any runtime-specific resources. | | |
| `autoscaling` _[Autoscaling](#autoscaling)_ | Autoscaling config for this component (replica range, target utilization, etc.). | | |
| `envs` _[EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#envvar-v1-core) array_ | Envs defines additional environment variables to inject into the component containers. | | |
| `envFromSecret` _string_ | EnvFromSecret references a Secret whose key/value pairs will be exposed as<br/>environment variables in the component containers. | | |
| `pvc` _[PVC](#pvc)_ | PVC config describing volumes to be mounted by the component. | | |
| `ingress` _[IngressSpec](#ingressspec)_ | Ingress config to expose the component outside the cluster (or through a service mesh). | | |
| `sharedMemory` _[SharedMemorySpec](#sharedmemoryspec)_ | SharedMemory controls the tmpfs mounted at /dev/shm (enable/disable and size). | | |
| `extraPodMetadata` _[ExtraPodMetadata](#extrapodmetadata)_ | ExtraPodMetadata adds labels/annotations to the created Pods. | | |
| `extraPodSpec` _[ExtraPodSpec](#extrapodspec)_ | ExtraPodSpec allows to override the main pod spec configuration.<br/>It is a k8s standard PodSpec. It also contains a MainContainer (standard k8s Container) field<br/>that allows overriding the main container configuration. | | |
| `livenessProbe` _[Probe](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#probe-v1-core)_ | LivenessProbe to detect and restart unhealthy containers. | | |
| `readinessProbe` _[Probe](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#probe-v1-core)_ | ReadinessProbe to signal when the container is ready to receive traffic. | | |
| `replicas` _integer_ | Replicas is the desired number of Pods for this component when autoscaling is not used. | | |
| `multinode` _[MultinodeSpec](#multinodespec)_ | Multinode is the configuration for multinode components. | | |
#### DynamoComponentDeploymentSpec
DynamoComponentDeploymentSpec defines the desired state of DynamoComponentDeployment
| `dynamoComponent` _string_ | DynamoComponent selects the Dynamo component from the archive to deploy.<br/>Typically corresponds to a component defined in the packaged Dynamo artifacts. | | |
| `dynamoTag` _string_ | contains the tag of the DynamoComponent: for example, "my_package:MyService" | | |
| `annotations` _object (keys:string, values:string)_ | Annotations to add to generated Kubernetes resources for this component<br/>(such as Pod, Service, and Ingress when applicable). | | |
| `labels` _object (keys:string, values:string)_ | Labels to add to generated Kubernetes resources for this component. | | |
| `serviceName` _string_ | The name of the component | | |
| `componentType` _string_ | ComponentType indicates the role of this component (for example, "main"). | | |
| `dynamoNamespace` _string_ | Dynamo namespace of the service (allows to override the Dynamo namespace of the service defined in annotations inside the Dynamo archive) | | |
| `resources` _[Resources](#resources)_ | Resources requested and limits for this component, including CPU, memory,<br/>GPUs/devices, and any runtime-specific resources. | | |
| `autoscaling` _[Autoscaling](#autoscaling)_ | Autoscaling config for this component (replica range, target utilization, etc.). | | |
| `envs` _[EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#envvar-v1-core) array_ | Envs defines additional environment variables to inject into the component containers. | | |
| `envFromSecret` _string_ | EnvFromSecret references a Secret whose key/value pairs will be exposed as<br/>environment variables in the component containers. | | |
| `pvc` _[PVC](#pvc)_ | PVC config describing volumes to be mounted by the component. | | |
| `ingress` _[IngressSpec](#ingressspec)_ | Ingress config to expose the component outside the cluster (or through a service mesh). | | |
| `sharedMemory` _[SharedMemorySpec](#sharedmemoryspec)_ | SharedMemory controls the tmpfs mounted at /dev/shm (enable/disable and size). | | |
| `extraPodMetadata` _[ExtraPodMetadata](#extrapodmetadata)_ | ExtraPodMetadata adds labels/annotations to the created Pods. | | |
| `extraPodSpec` _[ExtraPodSpec](#extrapodspec)_ | ExtraPodSpec allows to override the main pod spec configuration.<br/>It is a k8s standard PodSpec. It also contains a MainContainer (standard k8s Container) field<br/>that allows overriding the main container configuration. | | |
| `livenessProbe` _[Probe](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#probe-v1-core)_ | LivenessProbe to detect and restart unhealthy containers. | | |
| `readinessProbe` _[Probe](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#probe-v1-core)_ | ReadinessProbe to signal when the container is ready to receive traffic. | | |
| `replicas` _integer_ | Replicas is the desired number of Pods for this component when autoscaling is not used. | | |
| `multinode` _[MultinodeSpec](#multinodespec)_ | Multinode is the configuration for multinode components. | | |
#### DynamoGraphDeployment
DynamoGraphDeployment is the Schema for the dynamographdeployments API.
| `metadata` _[ObjectMeta](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#objectmeta-v1-meta)_ | Refer to Kubernetes API documentation for fields of `metadata`. | | |
| `spec` _[DynamoGraphDeploymentSpec](#dynamographdeploymentspec)_ | Spec defines the desired state for this graph deployment. | | |
| `status` _[DynamoGraphDeploymentStatus](#dynamographdeploymentstatus)_ | Status reflects the current observed state of this graph deployment. | | |
#### DynamoGraphDeploymentSpec
DynamoGraphDeploymentSpec defines the desired state of DynamoGraphDeployment.
_Appears in:_
-[DynamoGraphDeployment](#dynamographdeployment)
| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `dynamoGraph` _string_ | DynamoGraph selects the graph (workflow/topology) to deploy. This must match<br/>a graph name packaged with the Dynamo archive. | | |
| `envs` _[EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#envvar-v1-core) array_ | Envs are environment variables applied to all services in the graph unless<br/>overridden by service-specific configuration. | | Optional: {} <br/> |
DynamoGraphDeploymentStatus defines the observed state of DynamoGraphDeployment.
_Appears in:_
-[DynamoGraphDeployment](#dynamographdeployment)
| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `state` _string_ | State is a high-level textual status of the graph deployment lifecycle. | | |
| `conditions` _[Condition](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#condition-v1-meta) array_ | Conditions contains the latest observed conditions of the graph deployment.<br/>The slice is merged by type on patch updates. | | |
| `nodeCount` _integer_ | Indicates the number of nodes to deploy for multinode components.<br/>Total number of GPUs is NumberOfNodes * GPU limit.<br/>Must be greater than 1. | 2 | Minimum: 2 <br/> |
| `create` _boolean_ | Create indicates to create a new PVC | | |
| `name` _string_ | Name is the name of the PVC | | |
| `storageClass` _string_ | StorageClass to be used for PVC creation. Leave it as empty if the PVC is already created. | | |
| `size` _[Quantity](#quantity)_ | Size of the NIM cache in Gi, used during PVC creation | | |
| `volumeAccessMode` _[PersistentVolumeAccessMode](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#persistentvolumeaccessmode-v1-core)_ | VolumeAccessMode is the volume access mode of the PVC | | |
@@ -59,6 +59,14 @@ It's a Kubernetes Custom Resource that defines your inference pipeline:
The scripts in the `components/<backend>/launch` folder like `agg.sh` demonstrate how you can serve your models locally. The corresponding YAML files like `agg.yaml` show you how you could create a kubernetes deployment for your inference graph.
## 📖 API Reference & Documentation
For detailed technical specifications of Dynamo's Kubernetes resources:
-**[API Reference](api-reference.md)** - Complete CRD field specifications for `DynamoGraphDeployment` and `DynamoComponentDeployment`
-**[Operator Guide](dynamo_operator.md)** - Dynamo operator configuration and management
| `services` | map | Map of service names to runtime configurations. This allows the user to override the service configuration defined in the DynamoComponentDeployment. | Yes | |
| `envs` | list | list of global environment variables. | No | |
**API Version:**`nvidia.com/v1alpha1`
**Scope:** Namespaced
#### Example
```yaml
apiVersion:nvidia.com/v1alpha1
kind:DynamoGraphDeployment
metadata:
name:disagg
spec:
envs:
-name:GLOBAL_ENV_VAR
value:some_global_value
services:
Frontend:
replicas:1
envs:
-name:SPECIFIC_ENV_VAR
value:some_specific_value
Processor:
replicas:1
envs:
-name:SPECIFIC_ENV_VAR
value:some_specific_value
VllmWorker:
replicas:1
envs:
-name:SPECIFIC_ENV_VAR
value:some_specific_value
PrefillWorker:
replicas:1
envs:
-name:SPECIFIC_ENV_VAR
value:some_specific_value
```
**📖 [Dynamo CRD API Reference](../../../deploy/cloud/operator/docs/api_reference.md)**
@@ -87,10 +87,14 @@ Grove represents a significant advancement in Kubernetes-based orchestration for
## Getting Started
> **Note**: Grove is currently in development and aligning with NVIDIA Dynamo's release schedule.
Grove relies on KAI Scheduler for resource allocation and scheduling.
For KAI Scheduler, see the [KAI Scheduler Deployment Guide](https://github.com/NVIDIA/KAI-Scheduler).
For installation instructions, see the [Grove Installation Guide](https://github.com/NVIDIA/grove/blob/main/docs/installation.md).
For practical examples of Grove-based multinode deployments in action, see the [Multinode Deployment Guide](multinode-deployment.md), which demonstrates multi-node disaggregated serving scenarios.
For the latest updates on Grove, refer to the [official project on GitHub](https://github.com/NVIDIA/grove).
Dynamo Cloud also allows you to install Grove and KAI Scheduler as part of the platform installation. See the [Dynamo Cloud Deployment Guide](dynamo_cloud.md) for more details.