Unverified commit 7b941d7e, authored by dagil-nvidia and committed by GitHub

docs: fix kubernetes docs links and bump example image tags to 1.0.0 (#7400)


Signed-off-by: Dan Gil <dagil@nvidia.com>
parent 5e326d6f
@@ -131,7 +131,7 @@ spec:
 WorkersImage specifies the container image to use for DynamoGraphDeployment worker components.
 This image is used for both temporary DGDs created during online profiling and the final DGD.
 If omitted, the image from the base config file (e.g., disagg.yaml) is used.
-Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 type: string
 type: object
 enableGpuDiscovery:
@@ -206,7 +206,7 @@ spec:
 description: |-
 ProfilerImage specifies the container image to use for profiling jobs.
 This image contains the profiler code and dependencies needed for SLA-based profiling.
-Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 type: string
 resources:
 description: |-
......
@@ -64,7 +64,7 @@ type ProfilingConfigSpec struct {
 // ProfilerImage specifies the container image to use for profiling jobs.
 // This image contains the profiler code and dependencies needed for SLA-based profiling.
-// Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+// Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 // +kubebuilder:validation:Required
 ProfilerImage string `json:"profilerImage"`
@@ -132,7 +132,7 @@ type DeploymentOverridesSpec struct {
 // WorkersImage specifies the container image to use for DynamoGraphDeployment worker components.
 // This image is used for both temporary DGDs created during online profiling and the final DGD.
 // If omitted, the image from the base config file (e.g., disagg.yaml) is used.
-// Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+// Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 // +kubebuilder:validation:Optional
 WorkersImage string `json:"workersImage,omitempty"`
 }
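Note: the `WorkersImage` comment describes a fallback: an empty override means the image from the base config file is used. A minimal sketch of that rule follows, assuming a hypothetical helper `resolveWorkersImage` that is not part of the operator code shown in this diff; the custom image reference is also made up for illustration.

```go
package main

import "fmt"

// resolveWorkersImage is a hypothetical helper, not taken from the operator.
// It returns the override when one is set, and otherwise falls back to the
// image named in the base config file (e.g., disagg.yaml).
func resolveWorkersImage(override, baseImage string) string {
	if override != "" {
		return override
	}
	return baseImage
}

func main() {
	base := "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"

	// No override: the base config image is used.
	fmt.Println(resolveWorkersImage("", base))

	// Override set (illustrative reference): it takes precedence.
	fmt.Println(resolveWorkersImage("example.com/custom/vllm-runtime:dev", base))
}
```

Because the field is tagged `omitempty`, an unset `workersImage` arrives as the empty string, which is why the sketch tests for `""` rather than a nil pointer.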
......
@@ -131,7 +131,7 @@ spec:
 WorkersImage specifies the container image to use for DynamoGraphDeployment worker components.
 This image is used for both temporary DGDs created during online profiling and the final DGD.
 If omitted, the image from the base config file (e.g., disagg.yaml) is used.
-Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 type: string
 type: object
 enableGpuDiscovery:
@@ -206,7 +206,7 @@ spec:
 description: |-
 ProfilerImage specifies the container image to use for profiling jobs.
 This image contains the profiler code and dependencies needed for SLA-based profiling.
-Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 type: string
 resources:
 description: |-
......
@@ -25,7 +25,7 @@ spec:
 backend: trtllm
 # Image is the container image reference for the profiling job
-image: "nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.9.0"
+image: "nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:1.0.0"
 # SearchStrategy controls the profiling search depth
 # "rapid" for fast sweep; "thorough" for deeper exploration
......
@@ -72,7 +72,7 @@ kubectl create secret generic hf-token-secret \
 ```
 Create a model configuration file similar to the vllm_agg_qwen.yaml for your model.
-This file demonstrates the values needed for the Vllm Agg setup in [agg.yaml](../../examples/backends/vllm/deploy/agg.yaml)
+This file demonstrates the values needed for the vLLM aggregated setup in [agg.yaml](https://github.com/ai-dynamo/dynamo/blob/main/examples/backends/vllm/deploy/agg.yaml)
 Take a note of the model's block size provided in the model card.
 ### 4. Build EPP image (Optional)
......
@@ -45,7 +45,7 @@ make docker-push-placeholder \
 PLACEHOLDER_IMG="${PLACEHOLDER_IMAGE}"
 ```
-This flow is defined in [deploy/snapshot/Makefile](../../deploy/snapshot/Makefile) and [deploy/snapshot/Dockerfile](../../deploy/snapshot/Dockerfile). The placeholder image preserves the base runtime entrypoint and command contract, and adds the CRIU, `cuda-checkpoint`, and `nsrestore` tooling needed for restore.
+This flow is defined in [deploy/snapshot/Makefile](https://github.com/ai-dynamo/dynamo/blob/main/deploy/snapshot/Makefile) and [deploy/snapshot/Dockerfile](https://github.com/ai-dynamo/dynamo/blob/main/deploy/snapshot/Dockerfile). The placeholder image preserves the base runtime entrypoint and command contract, and adds the CRIU, `cuda-checkpoint`, and `nsrestore` tooling needed for restore.
 ### 2. Enable checkpointing in the platform and verify it
@@ -75,7 +75,7 @@ kubectl get configmap "${OPERATOR_CONFIG}" -n "${PLATFORM_NAMESPACE}" \
 Verify that the rendered config includes `enabled: true` and the same PVC name and base path you plan to use for the snapshot chart.
-For the full platform/operator configuration surface, see [deploy/helm/charts/platform/README.md](../../deploy/helm/charts/platform/README.md) and [deploy/helm/charts/platform/components/operator/values.yaml](../../deploy/helm/charts/platform/components/operator/values.yaml).
+For the full platform/operator configuration surface, see [deploy/helm/charts/platform/README.md](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/platform/README.md) and [deploy/helm/charts/platform/components/operator/values.yaml](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/platform/components/operator/values.yaml).
 ### 3. Install the snapshot chart
@@ -97,11 +97,11 @@ kubectl get pvc snapshot-pvc -n ${NAMESPACE}
 kubectl rollout status daemonset/snapshot-agent -n ${NAMESPACE}
 ```
-For the full snapshot chart configuration surface, see [deploy/helm/charts/snapshot/README.md](../../deploy/helm/charts/snapshot/README.md) and [deploy/helm/charts/snapshot/values.yaml](../../deploy/helm/charts/snapshot/values.yaml).
+For the full snapshot chart configuration surface, see [deploy/helm/charts/snapshot/README.md](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/snapshot/README.md) and [deploy/helm/charts/snapshot/values.yaml](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/snapshot/values.yaml).
 ### 4. Apply a snapshot-compatible `DynamoGraphDeployment`
-This example is adapted from [examples/backends/vllm/deploy/agg.yaml](../../examples/backends/vllm/deploy/agg.yaml). The worker must use the placeholder image from step 1, and the checkpoint identity must describe the runtime state you want to reuse.
+This example is adapted from [examples/backends/vllm/deploy/agg.yaml](https://github.com/ai-dynamo/dynamo/blob/main/examples/backends/vllm/deploy/agg.yaml). The worker must use the placeholder image from step 1, and the checkpoint identity must describe the runtime state you want to reuse.
 ```yaml
 apiVersion: nvidia.com/v1alpha1
@@ -490,6 +490,6 @@ Or use `mode: Auto` with the same identity and snapshot-hash label, and the oper
 ## Related Documentation
-- [Dynamo Snapshot Helm Chart README](../../deploy/helm/charts/snapshot/README.md) - Chart configuration
+- [Dynamo Snapshot Helm Chart README](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/snapshot/README.md) - Chart configuration
 - [Installation Guide](installation-guide.md) - Platform installation
 - [API Reference](api-reference.md) - Complete CRD specifications
@@ -66,7 +66,7 @@ Each DGDR requires a container image for profiling and deployment:
 ```yaml
 spec:
-image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 ```
 #### Quick Start: Deploy with DGDR
@@ -83,7 +83,7 @@ metadata:
 spec:
 model: "Qwen/Qwen3-0.6B"
 backend: vllm
-image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 workload:
 isl: 3000
@@ -229,7 +229,7 @@ metadata:
 spec:
 model: "Qwen/Qwen3-0.6B"
 backend: vllm
-image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
 workload: { ... }
 sla: { ... }
......
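Note: the link fixes in this commit follow one pattern: repo-relative paths that climb out of the docs tree (`../../...`) become absolute `https://github.com/ai-dynamo/dynamo/blob/main/...` URLs, while sibling doc links such as `installation-guide.md` stay relative. The sketch below illustrates that pattern under stated assumptions. `toAbsoluteLink` is a hypothetical helper, not a tool from the repository, and it assumes the `../` segments climb exactly to the repository root.

```go
package main

import (
	"fmt"
	"strings"
)

// repoBlobBase is the URL prefix used by the rewritten links in this commit.
const repoBlobBase = "https://github.com/ai-dynamo/dynamo/blob/main/"

// toAbsoluteLink is a hypothetical helper for illustration only.
// Links that do not start with "../" (sibling docs) are returned unchanged.
// Otherwise every leading "../" segment is stripped and the remainder is
// appended to repoBlobBase, assuming the path climbed to the repo root.
func toAbsoluteLink(rel string) string {
	if !strings.HasPrefix(rel, "../") {
		return rel
	}
	for strings.HasPrefix(rel, "../") {
		rel = strings.TrimPrefix(rel, "../")
	}
	return repoBlobBase + rel
}

func main() {
	for _, link := range []string{
		"../../deploy/snapshot/Makefile",
		"../../examples/backends/vllm/deploy/agg.yaml",
		"installation-guide.md",
	} {
		fmt.Printf("%s -> %s\n", link, toAbsoluteLink(link))
	}
}
```

Absolute URLs keep resolving when the docs are published outside the repository tree. The trade-off is that they pin readers to the `main` branch instead of the revision the docs were built from.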