feat: revamp kubernetes doc (#3173)

Signed-off-by: Julien Mancuso <161955438+julienmancuso@users.noreply.github.com>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>

Commit 4a718028 authored Sep 23, 2025 by Julien Mancuso; committed by GitHub Sep 23, 2025
18 changed files
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -50,13 +50,13 @@ Quickstart
   :hidden:
   :caption: Kubernetes Deployment

-   Quickstart (K8s) <../guides/dynamo_deploy/README.md>
-   Detailed Installation Guide <../guides/dynamo_deploy/installation_guide.md>
-   Dynamo Operator <../guides/dynamo_deploy/dynamo_operator.md>
-   Metrics <../guides/dynamo_deploy/metrics.md>
-   Logging <../guides/dynamo_deploy/logging.md>
-   Multinode <../guides/dynamo_deploy/multinode-deployment.md>
-   Minikube Setup <../guides/dynamo_deploy/minikube.md>
+   Quickstart (K8s) <../kubernetes/README.md>
+   Detailed Installation Guide <../kubernetes/installation_guide.md>
+   Dynamo Operator <../kubernetes/dynamo_operator.md>
+   Metrics <../kubernetes/metrics.md>
+   Logging <../kubernetes/logging.md>
+   Multinode <../kubernetes/multinode-deployment.md>
+   Minikube Setup <../kubernetes/minikube.md>

 .. toctree::
   :hidden:

--- a/docs/guides/dynamo_deploy/README.md
+++ b/docs/guides/dynamo_deploy/README.md
@@ -31,12 +31,11 @@ helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-${REL
 helm install dynamo-crds dynamo-crds-${RELEASE_VERSION}.tgz --namespace default

 # 3. Install Platform
-kubectl create namespace ${NAMESPACE}
 helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-${RELEASE_VERSION}.tgz
-helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE}
+helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE} --create-namespace
 ```

-For more details or customization options, see **[Installation Guide for Dynamo Kubernetes Platform](/docs/guides/dynamo_deploy/installation_guide.md)**.
+For more details or customization options (including multinode deployments), see **[Installation Guide for Dynamo Kubernetes Platform](/docs/kubernetes/installation_guide.md)**.

 ## 2. Choose Your Backend

@@ -44,9 +43,9 @@ Each backend has deployment examples and configuration options:

 | Backend | Available Configurations |
 |---------|--------------------------|
-| **[vLLM](/components/backends/vllm/deploy/README.md)** | Aggregated, Aggregated + Router, Disaggregated, Disaggregated + Router, Disaggregated + Planner |
+| **[vLLM](/components/backends/vllm/deploy/README.md)** | Aggregated, Aggregated + Router, Disaggregated, Disaggregated + Router, Disaggregated + Planner, Disaggregated Multi-node |
 | **[SGLang](/components/backends/sglang/deploy/README.md)** | Aggregated, Aggregated + Router, Disaggregated, Disaggregated + Planner, Disaggregated Multi-node |
-| **[TensorRT-LLM](/components/backends/trtllm/deploy/README.md)** | Aggregated, Aggregated + Router, Disaggregated, Disaggregated + Router |
+| **[TensorRT-LLM](/components/backends/trtllm/deploy/README.md)** | Aggregated, Aggregated + Router, Disaggregated, Disaggregated + Router, Disaggregated Multi-node |

 ## 3. Deploy Your First Model

@@ -73,15 +72,15 @@ It's a Kubernetes Custom Resource that defines your inference pipeline:
 - Scaling policies
 - Frontend/backend connections

-The scripts in the `components/<backend>/launch` folder like `agg.sh` demonstrate how you can serve your models locally. The corresponding YAML files like `agg.yaml` show you how you could create a kubernetes deployment for your inference graph.
+Refer to the [API Reference and Documentation](/docs/kubernetes/api_reference.md) for more details.

 ## 📖 API Reference & Documentation

 For detailed technical specifications of Dynamo's Kubernetes resources:

- **[API Reference](/docs/guides/dynamo_deploy/api_reference.md)** - Complete CRD field specifications for `DynamoGraphDeployment` and `DynamoComponentDeployment`
- **[Operator Guide](/docs/guides/dynamo_deploy/dynamo_operator.md)** - Dynamo operator configuration and management
- **[Create Deployment](/docs/guides/dynamo_deploy/create_deployment.md)** - Step-by-step deployment creation examples
+- **[API Reference](/docs/kubernetes/api_reference.md)** - Complete CRD field specifications for `DynamoGraphDeployment` and `DynamoComponentDeployment`
+- **[Operator Guide](/docs/kubernetes/dynamo_operator.md)** - Dynamo operator configuration and management
+- **[Create Deployment](/docs/kubernetes/create_deployment.md)** - Step-by-step deployment creation examples

 ### Choosing Your Architecture Pattern

@@ -165,7 +164,12 @@ Key customization points include:
 ## Additional Resources

 - **[Examples](/examples/README.md)** - Complete working examples
- **[Create Custom Deployments](/docs/guides/dynamo_deploy/create_deployment.md)** - Build your own CRDs
- **[Operator Documentation](/docs/guides/dynamo_deploy/dynamo_operator.md)** - How the platform works
+- **[Create Custom Deployments](/docs/kubernetes/create_deployment.md)** - Build your own CRDs
+- **[Operator Documentation](/docs/kubernetes/dynamo_operator.md)** - How the platform works
 - **[Helm Charts](/deploy/helm/README.md)** - For advanced users
- **[GitOps Deployment with FluxCD](/docs/guides/dynamo_deploy/fluxcd.md)** - For advanced users
\ No newline at end of file
+- **[GitOps Deployment with FluxCD](/docs/kubernetes/fluxcd.md)** - For advanced users
+- **[Logging](/docs/kubernetes/logging.md)** - For logging setup
+- **[Multinode Deployment](/docs/kubernetes/multinode-deployment.md)** - For multinode deployment
+- **[Grove](/docs/kubernetes/grove.md)** - For grove details and custom installation
+- **[Monitoring](/docs/kubernetes/metrics.md)** - For monitoring setup
+- **[Model Caching with Fluid](/docs/kubernetes/model_caching_with_fluid.md)** - For model caching with Fluid
\ No newline at end of file
--- a/docs/guides/dynamo_deploy/api_reference.md
+++ b/docs/guides/dynamo_deploy/api_reference.md
--- a/docs/guides/dynamo_deploy/create_deployment.md
+++ b/docs/guides/dynamo_deploy/create_deployment.md
@@ -13,13 +13,13 @@ Select the architecture pattern as your template that best fits your use case.
 For example, when using the `VLLM` inference backend:

 - **Development / Testing**
-  Use [`agg.yaml`](../../../components/backends/vllm/deploy/agg.yaml) as the base configuration.
+  Use [`agg.yaml`](/components/backends/vllm/deploy/agg.yaml) as the base configuration.

 - **Production with Load Balancing**
-  Use [`agg_router.yaml`](../../../components/backends/vllm/deploy/agg_router.yaml) to enable scalable, load-balanced inference.
+  Use [`agg_router.yaml`](/components/backends/vllm/deploy/agg_router.yaml) to enable scalable, load-balanced inference.

 - **High Performance / Disaggregated Deployment**
-  Use [`disagg_router.yaml`](../../../components/backends/vllm/deploy/disagg_router.yaml) for maximum throughput and modular scalability.
+  Use [`disagg_router.yaml`](/components/backends/vllm/deploy/disagg_router.yaml) for maximum throughput and modular scalability.


 ## Step 2: Customize the Template
@@ -90,7 +90,7 @@ Consult the corresponding sh file. Each of the python commands to launch a compo

 The front end is launched with "python3 -m dynamo.frontend [--http-port 8000] [--router-mode kv]"
 Each worker will launch `python -m dynamo.YOUR_INFERENCE_BACKEND --model YOUR_MODEL --your-flags `command.
-If you are a Dynamo contributor the [dynamo run guide](../dynamo_run.md) for details on how to run this command.
+If you are a Dynamo contributor, see the [dynamo run guide](/docs/guides/dynamo_run.md) for details on how to run this command.


 ## Step 3: Key Customization Points

--- a/docs/guides/dynamo_deploy/dynamo_operator.md
+++ b/docs/guides/dynamo_deploy/dynamo_operator.md
@@ -23,11 +23,11 @@ Dynamo operator is a Kubernetes operator that simplifies the deployment, configu

 For the complete technical API reference for Dynamo Custom Resource Definitions, see:

-**📖 [Dynamo CRD API Reference](/docs/guides/dynamo_deploy/api_reference.md)**
+**📖 [Dynamo CRD API Reference](/docs/kubernetes/api_reference.md)**

 ## Installation

-[See installation steps](/docs/guides/dynamo_deploy/installation_guide.md#overview)
+[See installation steps](/docs/kubernetes/installation_guide.md#overview)


 ## Development

--- a/docs/guides/dynamo_deploy/fluxcd.md
+++ b/docs/guides/dynamo_deploy/fluxcd.md
 # GitOps Deployment with FluxCD

-This section describes how to use FluxCD for GitOps-based deployment of Dynamo inference graphs. GitOps enables you to manage your Dynamo deployments declaratively using Git as the source of truth. We'll use the [aggregated vLLM example](../../../components/backends/vllm/README.md) to demonstrate the workflow.
+This section describes how to use FluxCD for GitOps-based deployment of Dynamo inference graphs. GitOps enables you to manage your Dynamo deployments declaratively using Git as the source of truth. We'll use the [aggregated vLLM example](/components/backends/vllm/README.md) to demonstrate the workflow.

 ## Prerequisites

- A Kubernetes cluster with [Dynamo Cloud](/docs/guides/dynamo_deploy/installation_guide.md) installed
+- A Kubernetes cluster with [Dynamo Cloud](/docs/kubernetes/installation_guide.md) installed
 - [FluxCD](https://fluxcd.io/flux/installation/) installed in your cluster
 - A Git repository to store your deployment configurations

@@ -18,7 +18,7 @@ The GitOps workflow for Dynamo deployments consists of three main steps:

 ## Step 1: Build and Push Dynamo Cloud Operator

-First, follow to [See Install Dynamo Cloud](/docs/guides/dynamo_deploy/installation_guide.md).
+First, follow the [Install Dynamo Cloud](/docs/kubernetes/installation_guide.md) guide.

 ## Step 2: Create Initial Deployment


--- a/docs/guides/dynamo_deploy/gke_setup.md
+++ b/docs/guides/dynamo_deploy/gke_setup.md
--- a/docs/guides/dynamo_deploy/grove.md
+++ b/docs/guides/dynamo_deploy/grove.md
@@ -19,7 +19,7 @@ Grove enables disaggregated serving by breaking down large language model infere

 Grove implements disaggregated serving through several custom Kubernetes resources that provide declarative composition of role-based pod groups:

-### PodGangSet
+### PodCliqueSet
 The top-level Grove object that defines a group of components managed and colocated together. Key features include:
 - Support for autoscaling
 - Topology-aware spread of replicas for availability
@@ -39,10 +39,10 @@ A set of PodCliques that scale and are scheduled together, ideal for tightly cou
 Grove provides several specialized features that make it particularly well-suited for disaggregated serving:

 ### Flexible Gang Scheduling
-PodCliques and PodCliqueScalingGroups allow users to specify flexible gang-scheduling requirements at multiple levels within a PodGangSet to prevent resource deadlocks and ensure all components of a disaggregated system start together.
+PodCliques and PodCliqueScalingGroups allow users to specify flexible gang-scheduling requirements at multiple levels within a PodCliqueSet to prevent resource deadlocks and ensure all components of a disaggregated system start together.

 ### Multi-level Horizontal Auto-Scaling
-Supports pluggable horizontal auto-scaling solutions to scale PodGangSet, PodClique, and PodCliqueScalingGroup custom resources independently based on their specific metrics and requirements.
+Supports pluggable horizontal auto-scaling solutions to scale PodCliqueSet, PodClique, and PodCliqueScalingGroup custom resources independently based on their specific metrics and requirements.

 ### Network Topology-Aware Scheduling
 Allows specifying network topology pack and spread constraints to optimize for both network performance and service availability, crucial for disaggregated systems where components need efficient inter-node communication.

--- a/docs/guides/dynamo_deploy/installation_guide.md
+++ b/docs/guides/dynamo_deploy/installation_guide.md
@@ -21,7 +21,7 @@ Deploy and manage Dynamo inference graphs on Kubernetes with automated orchestra

 ## Quick Start Paths

-Platform is installed using Dynamo Kubernetes Platform [helm chart](../../../deploy/cloud/helm/platform/README.md).
+Platform is installed using Dynamo Kubernetes Platform [helm chart](/deploy/cloud/helm/platform/README.md).

 **Path A: Production Install**
 Install from published artifacts on your existing cluster → [Jump to Path A](#path-a-production-install)
@@ -32,6 +32,20 @@ Set up Minikube first → [Minikube Setup](minikube.md) → Then follow Path A
 **Path C: Custom Development**
 Build from source for customization → [Jump to Path C](#path-c-custom-development)

+The default values used by any helm install command can be overridden, either by editing the chart's values.yaml file or by passing in your own values file:
+
+```bash
+helm install ... \
+  -f your-values.yaml
+```
+
+and/or setting values as flags to the helm install command, as follows:
+
+```bash
+helm install ... \
+  --set "your-value=your-value"
+```
+
 ## Prerequisites

 ```bash
@@ -68,7 +82,9 @@ helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace
 ```

 > [!TIP]
-> By default, Grove and Kai Scheduler are NOT installed. You can enable them by setting the following flags in the helm install command:
+> For multinode deployments, you need to enable Grove and Kai Scheduler.
+> You can choose to install them manually or through the dynamo-platform helm install command.
+> When using the dynamo-platform helm install command, Grove and Kai Scheduler are NOT installed by default. You can enable their installation by setting the following flags in the helm install command:

 ```bash
 --set "grove.enabled=true"
@@ -111,7 +127,7 @@ docker build -t $DOCKER_SERVER/dynamo-operator:$IMAGE_TAG . && docker push $DOCK

 cd -

-# 3. Create namespace and secrets to be able to pull the operator image
+# 3. Create namespace and secrets to be able to pull the operator image (only needed if you pushed the operator image to a private registry)
 kubectl create namespace ${NAMESPACE}
 kubectl create secret docker-registry docker-imagepullsecret \
  --docker-server=${DOCKER_SERVER} \
@@ -123,9 +139,8 @@ kubectl create secret docker-registry docker-imagepullsecret \
 helm upgrade --install dynamo-crds ./crds/ --namespace default

 # 5. Install Platform
-helm repo add bitnami https://charts.bitnami.com/bitnami
 helm dep build ./platform/
-helm upgrade --install dynamo-platform ./platform/ \
+helm install dynamo-platform ./platform/ \
  --namespace ${NAMESPACE} \
  --set dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator \
  --set dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG} \
@@ -158,9 +173,9 @@ kubectl get pods -n ${NAMESPACE}
   ```

 2. **Explore Backend Guides**
-   - [vLLM Deployments](../../../components/backends/vllm/deploy/README.md)
-   - [SGLang Deployments](../../../components/backends/sglang/deploy/README.md)
-   - [TensorRT-LLM Deployments](../../../components/backends/trtllm/deploy/README.md)
+   - [vLLM Deployments](/components/backends/vllm/deploy/README.md)
+   - [SGLang Deployments](/components/backends/sglang/deploy/README.md)
+   - [TensorRT-LLM Deployments](/components/backends/trtllm/deploy/README.md)

 3. **Optional:**
   - [Set up Prometheus & Grafana](metrics.md)
@@ -200,7 +215,7 @@ just add the following to the helm install command:

 ## Advanced Options

- [Helm Chart Configuration](../../../deploy/cloud/helm/platform/README.md)
+- [Helm Chart Configuration](/deploy/cloud/helm/platform/README.md)
 - [GKE-specific setup](gke_setup.md)
 - [Create custom deployments](create_deployment.md)
 - [Dynamo Operator details](dynamo_operator.md)
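
The two override styles added to the installation guide above (a values file and `--set` flags) can be sketched together as follows. This is a minimal illustration, not the project's documented procedure: the file name `my-values.yaml` and the placeholder defaults for `RELEASE_VERSION` and `NAMESPACE` are assumptions, and `grove.enabled` is the only key taken from the guide. The script only prints the resulting commands so they can be reviewed before running.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder defaults (assumptions); export real values before running.
RELEASE_VERSION="${RELEASE_VERSION:-x.y.z}"
NAMESPACE="${NAMESPACE:-dynamo-system}"

# Hypothetical values file; equivalent to --set "grove.enabled=true".
cat > my-values.yaml <<'EOF'
grove:
  enabled: true
EOF

# Style 1: pass your own values file with -f.
cmd_file=(helm install dynamo-platform "dynamo-platform-${RELEASE_VERSION}.tgz"
          --namespace "${NAMESPACE}" --create-namespace
          -f my-values.yaml)

# Style 2: set the same value as a flag.
cmd_set=(helm install dynamo-platform "dynamo-platform-${RELEASE_VERSION}.tgz"
         --namespace "${NAMESPACE}" --create-namespace
         --set "grove.enabled=true")

# Print both commands for review; run "${cmd_file[@]}" or "${cmd_set[@]}" to install.
printf '%s ' "${cmd_file[@]}"; echo
printf '%s ' "${cmd_set[@]}"; echo
```

When both styles are combined in one command, Helm gives `--set` values precedence over values loaded with `-f`.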

--- a/docs/guides/dynamo_deploy/logging.md
+++ b/docs/guides/dynamo_deploy/logging.md
--- a/docs/guides/dynamo_deploy/metrics.md
+++ b/docs/guides/dynamo_deploy/metrics.md
@@ -28,7 +28,7 @@ helm install prometheus -n monitoring --create-namespace prometheus-community/ku
 > The commands enumerated below assume you have installed the kube-prometheus-stack with the installation method listed above. Depending on your installation configuration of the monitoring stack, you may need to modify the `kubectl` commands that follow in this document accordingly (e.g modifying Namespace or Service names accordingly).

 ### Install Dynamo Operator
-Before setting up metrics collection, you'll need to have the Dynamo operator installed in your cluster. Follow our [Installation Guide](../dynamo_deploy/installation_guide.md) for detailed instructions on deploying the Dynamo operator.
+Before setting up metrics collection, you'll need to have the Dynamo operator installed in your cluster. Follow our [Installation Guide](/docs/kubernetes/installation_guide.md) for detailed instructions on deploying the Dynamo operator.
 Make sure to set the `prometheusEndpoint` to the Prometheus endpoint you installed in the previous step.

 ```bash
@@ -64,8 +64,8 @@ This will create two components:
 - A Worker component exposing metrics on its system port

 Both components expose a `/metrics` endpoint following the OpenMetrics format, but with different metrics appropriate to their roles. For details about:
- Deployment configuration: See the [vLLM README](../../components/backends/vllm/README.md)
- Available metrics: See the [metrics guide](../metrics.md)
+- Deployment configuration: See the [vLLM README](/components/backends/vllm/README.md)
+- Available metrics: See the [metrics guide](/docs/guides/metrics.md)

 ### Validate the Deployment


--- a/docs/guides/dynamo_deploy/minikube.md
+++ b/docs/guides/dynamo_deploy/minikube.md
--- a/docs/guides/dynamo_deploy/model_caching_with_fluid.md
+++ b/docs/guides/dynamo_deploy/model_caching_with_fluid.md
--- a/docs/guides/dynamo_deploy/multinode-deployment.md
+++ b/docs/guides/dynamo_deploy/multinode-deployment.md
--- a/docs/guides/dynamo_deploy/sla_planner_deployment.md
+++ b/docs/guides/dynamo_deploy/sla_planner_deployment.md
--- a/examples/custom_backend/hello_world/README.md
+++ b/examples/custom_backend/hello_world/README.md
@@ -106,7 +106,7 @@ Hello star!
 Note that this a very simple degenerate example which does not demonstrate the standard Dynamo FrontEnd-Backend deployment. The hello-world client is not a web server, it is a one-off function which sends the predefined text "world,sun,moon,star" to the backend. The example is meant to show the HelloWorldWorker. As such you will only see the HelloWorldWorker pod in deployment. The client will run and exit and the pod will not be operational.


-Follow the [Quickstart Guide](../../../docs/guides/dynamo_deploy/README.md) to install Dynamo Kubernetes Platform.
+Follow the [Quickstart Guide](../../../docs/kubernetes/README.md) to install Dynamo Kubernetes Platform.
 Then deploy to kubernetes using

 ```bash

--- a/examples/deployments/AKS/AKS-deployment.md
+++ b/examples/deployments/AKS/AKS-deployment.md
@@ -90,7 +90,7 @@ git clone https://github.com/ai-dynamo/dynamo.git
 cd dynamo
 ```

-2. Install Dynamo from Published Artifacts on NGC (see the [Dynamo Cloud guide](../../../docs/guides/dynamo_deploy/installation_guide.md)):
+2. Install Dynamo from Published Artifacts on NGC (see the [Dynamo Cloud guide](../../../docs/kubernetes/installation_guide.md)):
 ```bash
 export NAMESPACE=dynamo-cloud
 export RELEASE_VERSION=0.3.2
@@ -124,7 +124,7 @@ dynamo-platform-nats-0                                            2/2     Runnin
 dynamo-platform-nats-box-5dbf45c748-kln82                         1/1     Running   0          2m51s
 ```

-There are other ways to install Dynamo, you can find them [here](../../../docs/guides/dynamo_deploy/installation_guide.md).
+There are other ways to install Dynamo; you can find them [here](../../../docs/kubernetes/installation_guide.md).

 ### Task 4. Deploy a model


--- a/recipes/README.md
+++ b/recipes/README.md
@@ -19,7 +19,7 @@ export NAMESPACE=your-namespace
 kubectl create namespace ${NAMESPACE}
 ```

-2. **Dynamo Cloud Platform installed** - Follow [Quickstart Guide](../docs/guides/dynamo_deploy/README.md)
+2. **Dynamo Cloud Platform installed** - Follow [Quickstart Guide](../docs/kubernetes/README.md)

 3. **Kubernetes cluster with GPU support**
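
Because this change repoints links from `docs/guides/dynamo_deploy/` to `docs/kubernetes/`, a quick search for leftover references to the old path may be useful before merging. The sketch below is one possible check, not part of the commit: the function name `find_stale_links` is hypothetical, and it assumes a `grep` that supports `--include` (GNU and BSD grep both do).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print every Markdown/RST line under the given root that still
# references the old docs path. Prints nothing when no match is found.
find_stale_links() {
  local root="${1:-.}"
  grep -rn --include='*.md' --include='*.rst' 'guides/dynamo_deploy/' "$root" || true
}
```

For example, `find_stale_links docs examples/ recipes/` would need one call per directory, since the function takes a single root; calling it as `find_stale_links .` from the repository root covers everything at once.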