Commit 7b941d7e authored by dagil-nvidia, committed by GitHub

docs: fix kubernetes docs links and bump example image tags to 1.0.0 (#7400)


Signed-off-by: Dan Gil <dagil@nvidia.com>
parent 5e326d6f
......@@ -131,7 +131,7 @@ spec:
WorkersImage specifies the container image to use for DynamoGraphDeployment worker components.
This image is used for both temporary DGDs created during online profiling and the final DGD.
If omitted, the image from the base config file (e.g., disagg.yaml) is used.
-Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
type: string
type: object
enableGpuDiscovery:
......@@ -206,7 +206,7 @@ spec:
description: |-
ProfilerImage specifies the container image to use for profiling jobs.
This image contains the profiler code and dependencies needed for SLA-based profiling.
-Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
type: string
resources:
description: |-
......
......@@ -64,7 +64,7 @@ type ProfilingConfigSpec struct {
// ProfilerImage specifies the container image to use for profiling jobs.
// This image contains the profiler code and dependencies needed for SLA-based profiling.
-// Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+// Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
// +kubebuilder:validation:Required
ProfilerImage string `json:"profilerImage"`
......@@ -132,7 +132,7 @@ type DeploymentOverridesSpec struct {
// WorkersImage specifies the container image to use for DynamoGraphDeployment worker components.
// This image is used for both temporary DGDs created during online profiling and the final DGD.
// If omitted, the image from the base config file (e.g., disagg.yaml) is used.
-// Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+// Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
// +kubebuilder:validation:Optional
WorkersImage string `json:"workersImage,omitempty"`
}
......
......@@ -131,7 +131,7 @@ spec:
WorkersImage specifies the container image to use for DynamoGraphDeployment worker components.
This image is used for both temporary DGDs created during online profiling and the final DGD.
If omitted, the image from the base config file (e.g., disagg.yaml) is used.
-Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
type: string
type: object
enableGpuDiscovery:
......@@ -206,7 +206,7 @@ spec:
description: |-
ProfilerImage specifies the container image to use for profiling jobs.
This image contains the profiler code and dependencies needed for SLA-based profiling.
-Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+Example: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
type: string
resources:
description: |-
......
......@@ -25,7 +25,7 @@ spec:
backend: trtllm
# Image is the container image reference for the profiling job
-image: "nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.9.0"
+image: "nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:1.0.0"
# SearchStrategy controls the profiling search depth
# "rapid" for fast sweep; "thorough" for deeper exploration
......
......@@ -72,7 +72,7 @@ kubectl create secret generic hf-token-secret \
```
Create a model configuration file similar to the vllm_agg_qwen.yaml for your model.
-This file demonstrates the values needed for the Vllm Agg setup in [agg.yaml](../../examples/backends/vllm/deploy/agg.yaml)
+This file demonstrates the values needed for the vLLM aggregated setup in [agg.yaml](https://github.com/ai-dynamo/dynamo/blob/main/examples/backends/vllm/deploy/agg.yaml)
Take a note of the model's block size provided in the model card.
### 4. Build EPP image (Optional)
......
......@@ -45,7 +45,7 @@ make docker-push-placeholder \
PLACEHOLDER_IMG="${PLACEHOLDER_IMAGE}"
```
-This flow is defined in [deploy/snapshot/Makefile](../../deploy/snapshot/Makefile) and [deploy/snapshot/Dockerfile](../../deploy/snapshot/Dockerfile). The placeholder image preserves the base runtime entrypoint and command contract, and adds the CRIU, `cuda-checkpoint`, and `nsrestore` tooling needed for restore.
+This flow is defined in [deploy/snapshot/Makefile](https://github.com/ai-dynamo/dynamo/blob/main/deploy/snapshot/Makefile) and [deploy/snapshot/Dockerfile](https://github.com/ai-dynamo/dynamo/blob/main/deploy/snapshot/Dockerfile). The placeholder image preserves the base runtime entrypoint and command contract, and adds the CRIU, `cuda-checkpoint`, and `nsrestore` tooling needed for restore.
### 2. Enable checkpointing in the platform and verify it
......@@ -75,7 +75,7 @@ kubectl get configmap "${OPERATOR_CONFIG}" -n "${PLATFORM_NAMESPACE}" \
Verify that the rendered config includes `enabled: true` and the same PVC name and base path you plan to use for the snapshot chart.
-For the full platform/operator configuration surface, see [deploy/helm/charts/platform/README.md](../../deploy/helm/charts/platform/README.md) and [deploy/helm/charts/platform/components/operator/values.yaml](../../deploy/helm/charts/platform/components/operator/values.yaml).
+For the full platform/operator configuration surface, see [deploy/helm/charts/platform/README.md](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/platform/README.md) and [deploy/helm/charts/platform/components/operator/values.yaml](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/platform/components/operator/values.yaml).
### 3. Install the snapshot chart
......@@ -97,11 +97,11 @@ kubectl get pvc snapshot-pvc -n ${NAMESPACE}
kubectl rollout status daemonset/snapshot-agent -n ${NAMESPACE}
```
-For the full snapshot chart configuration surface, see [deploy/helm/charts/snapshot/README.md](../../deploy/helm/charts/snapshot/README.md) and [deploy/helm/charts/snapshot/values.yaml](../../deploy/helm/charts/snapshot/values.yaml).
+For the full snapshot chart configuration surface, see [deploy/helm/charts/snapshot/README.md](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/snapshot/README.md) and [deploy/helm/charts/snapshot/values.yaml](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/snapshot/values.yaml).
### 4. Apply a snapshot-compatible `DynamoGraphDeployment`
-This example is adapted from [examples/backends/vllm/deploy/agg.yaml](../../examples/backends/vllm/deploy/agg.yaml). The worker must use the placeholder image from step 1, and the checkpoint identity must describe the runtime state you want to reuse.
+This example is adapted from [examples/backends/vllm/deploy/agg.yaml](https://github.com/ai-dynamo/dynamo/blob/main/examples/backends/vllm/deploy/agg.yaml). The worker must use the placeholder image from step 1, and the checkpoint identity must describe the runtime state you want to reuse.
```yaml
apiVersion: nvidia.com/v1alpha1
......@@ -490,6 +490,6 @@ Or use `mode: Auto` with the same identity and snapshot-hash label, and the oper
## Related Documentation
-- [Dynamo Snapshot Helm Chart README](../../deploy/helm/charts/snapshot/README.md) - Chart configuration
+- [Dynamo Snapshot Helm Chart README](https://github.com/ai-dynamo/dynamo/blob/main/deploy/helm/charts/snapshot/README.md) - Chart configuration
 - [Installation Guide](installation-guide.md) - Platform installation
 - [API Reference](api-reference.md) - Complete CRD specifications
......@@ -66,7 +66,7 @@ Each DGDR requires a container image for profiling and deployment:
```yaml
spec:
-image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
```
#### Quick Start: Deploy with DGDR
......@@ -83,7 +83,7 @@ metadata:
spec:
model: "Qwen/Qwen3-0.6B"
backend: vllm
-image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
workload:
isl: 3000
......@@ -229,7 +229,7 @@ metadata:
spec:
model: "Qwen/Qwen3-0.6B"
backend: vllm
-image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0"
+image: "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
workload: { ... }
sla: { ... }
......
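Reviewer note: the field comments in this diff say that `WorkersImage` is optional and that the image from the base config file is used when it is omitted. The diff does not show how the operator implements that fallback, so the Go sketch below only illustrates the documented behaviour. `resolveWorkersImage` and `imageTag` are hypothetical helper names, not functions from the Dynamo codebase.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveWorkersImage is a hypothetical helper: it returns the override
// when one is set and otherwise falls back to the base config image,
// mirroring the documented "If omitted, ..." behaviour.
func resolveWorkersImage(override, baseImage string) string {
	if strings.TrimSpace(override) != "" {
		return override
	}
	return baseImage
}

// imageTag is a hypothetical helper that extracts the tag from an image
// reference. It returns "" when the reference carries no tag.
func imageTag(ref string) string {
	// Drop a digest suffix such as "@sha256:..." if present.
	if at := strings.Index(ref, "@"); at != -1 {
		ref = ref[:at]
	}
	slash := strings.LastIndex(ref, "/")
	colon := strings.LastIndex(ref, ":")
	// A colon before the last slash belongs to a registry port, not a tag.
	if colon == -1 || colon < slash {
		return ""
	}
	return ref[colon+1:]
}

func main() {
	base := "nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0"
	fmt.Println(resolveWorkersImage("", base)) // no override: base image is used
	fmt.Println(imageTag(base))                // tag portion of the reference
}
```

A tag check along these lines could help confirm that every example image in the docs was bumped together, though whether the project runs such a check is not stated in this commit.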