Unverified commit 4a718028 authored by Julien Mancuso, committed by GitHub

feat: revamp kubernetes doc (#3173)


Signed-off-by: Julien Mancuso <161955438+julienmancuso@users.noreply.github.com>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
parent 13a5d61b
@@ -148,7 +148,7 @@ Rerun with `curl -N` and change `stream` in the request to `true` to get the res
 ### Deploying Dynamo
-- Follow the [Quickstart Guide](docs/guides/dynamo_deploy/README.md) to deploy on Kubernetes.
+- Follow the [Quickstart Guide](docs/kubernetes/README.md) to deploy on Kubernetes.
 - Check out [Backends](components/backends) to deploy various workflow configurations (e.g. SGLang with router, vLLM with disaggregated serving, etc.)
 - Run some [Examples](examples) to learn about building components in Dynamo and exploring various integrations.
......
@@ -74,7 +74,7 @@ extraPodSpec:
 Before using these templates, ensure you have:
-1. **Dynamo Cloud Platform installed** - See [Installing Dynamo Cloud](../../../../docs/guides/dynamo_deploy/installation_guide.md)
+1. **Dynamo Cloud Platform installed** - See [Installing Dynamo Cloud](../../../../docs/kubernetes/installation_guide.md)
 2. **Kubernetes cluster with GPU support**
 3. **Container registry access** for SGLang runtime images
 4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
@@ -144,9 +144,9 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you
 ## Further Reading
-- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
-- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
-- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/installation_guide.md)
+- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
+- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
+- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
 - **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
@@ -159,4 +159,4 @@ Common issues and solutions:
 3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds`
 4. **Out of memory**: Increase memory limits or reduce model batch size
-For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/README.md).
+For additional support, refer to the [deployment guide](../../../../docs/kubernetes/README.md).
@@ -102,7 +102,7 @@ extraPodSpec:
 Before using these templates, ensure you have:
-1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
+1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/kubernetes/README.md)
 2. **Kubernetes cluster with GPU support**
 3. **Container registry access** for TensorRT-LLM runtime images
 4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
@@ -153,7 +153,7 @@ args:
 ### 3. Deploy
-See the [Create Deployment Guide](../../../../docs/guides/dynamo_deploy/create_deployment.md) to learn how to deploy the deployment file.
+See the [Create Deployment Guide](../../../../docs/kubernetes/create_deployment.md) to learn how to deploy the deployment file.
 First, create a secret for the HuggingFace token.
```bash ```bash
@@ -277,9 +277,9 @@ Configure the `model` name and `host` based on your deployment.
 ## Further Reading
-- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
-- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
-- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/installation_guide.md)
+- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
+- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
+- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
 - **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
 - **Multinode Deployment**: [Multinode Examples](../multinode/multinode-examples.md)
@@ -298,4 +298,4 @@ Common issues and solutions:
 6. **Git LFS issues**: Ensure git-lfs is installed before building containers
 7. **ARM deployment**: Use `--platform linux/arm64` when building on ARM machines
-For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
+For additional support, refer to the [deployment troubleshooting guide](../../../../docs/kubernetes/README.md).
@@ -82,7 +82,7 @@ extraPodSpec:
 Before using these templates, ensure you have:
-1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
+1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/kubernetes/README.md)
 2. **Kubernetes cluster with GPU support**
 3. **Container registry access** for vLLM runtime images
 4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
@@ -234,10 +234,10 @@ args:
 ## Further Reading
-- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
-- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
-- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/installation_guide.md)
-- **SLA Planner**: [SLA Planner Deployment Guide](../../../../docs/guides/dynamo_deploy/sla_planner_deployment.md)
+- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
+- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
+- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
+- **SLA Planner**: [SLA Planner Deployment Guide](../../../../docs/kubernetes/sla_planner_deployment.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
 - **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
@@ -251,4 +251,4 @@ Common issues and solutions:
 4. **Out of memory**: Increase memory limits or reduce model batch size
 5. **Port forwarding issues**: Ensure correct pod UUID in port-forward command
-For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
+For additional support, refer to the [deployment troubleshooting guide](../../../../docs/kubernetes/README.md).
@@ -17,4 +17,4 @@ limitations under the License.
 # Dynamo Kubernetes Platform CRDs Helm Chart
-This chart installs the [CRDs](../../../../docs/guides/dynamo_deploy/api_reference.md) for the Dynamo Kubernetes Platform.
\ No newline at end of file
+This chart installs the [CRDs](../../../../docs/kubernetes/api_reference.md) for the Dynamo Kubernetes Platform.
\ No newline at end of file
@@ -103,7 +103,7 @@ For detailed etcd configuration options beyond `etcd.enabled`, please refer to t
 ## 📚 Additional Resources
-- [Dynamo Cloud Deployment Installation Guide](../../../../docs/guides/dynamo_deploy/installation_guide.md)
+- [Dynamo Cloud Deployment Installation Guide](../../../../docs/kubernetes/installation_guide.md)
 - [NATS Documentation](https://docs.nats.io/)
 - [etcd Documentation](https://etcd.io/docs/)
 - [Kubernetes Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)
......
@@ -57,7 +57,7 @@ For detailed etcd configuration options beyond `etcd.enabled`, please refer to t
 ## 📚 Additional Resources
-- [Dynamo Cloud Deployment Installation Guide](../../../../docs/guides/dynamo_deploy/installation_guide.md)
+- [Dynamo Cloud Deployment Installation Guide](../../../../docs/kubernetes/installation_guide.md)
 - [NATS Documentation](https://docs.nats.io/)
 - [etcd Documentation](https://etcd.io/docs/)
 - [Kubernetes Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)
......
@@ -288,7 +288,7 @@ generate-api-docs: crd-ref-docs ## Generate API reference documentation from CRD
 --output-path=./docs/api_reference.md
 @echo "✅ Generated API reference at ./docs/api_reference.md"
 # concatenate header.md and api_reference.md
-cat docs/header.md ./docs/api_reference.md > ../../../docs/guides/dynamo_deploy/api_reference.md
+cat docs/header.md ./docs/api_reference.md > ../../../docs/kubernetes/api_reference.md
 rm ./docs/api_reference.md
 @echo "✅ Concatenated header.md and api_reference.md"
......
@@ -24,4 +24,4 @@ make
 ### Install
-See [Dynamo Kubernetes Platform Installation Guide](/docs/guides/dynamo_deploy/installation_guide.md) for installation instructions.
+See [Dynamo Kubernetes Platform Installation Guide](/docs/kubernetes/installation_guide.md) for installation instructions.
@@ -24,7 +24,7 @@ Currently, these setups are only supported with the kGateway based Inference Gat
 ### 1. Install Dynamo Platform ###
-[See Quickstart Guide](../../docs/guides/dynamo_deploy/README.md) to install Dynamo Cloud.
+[See Quickstart Guide](../../docs/kubernetes/README.md) to install Dynamo Cloud.
 ### 2. Deploy Inference Gateway ###
......
 # Dynamo Logging on Kubernetes
-For detailed documentation on collecting and visualizing logs on Kubernetes, see [docs/guides/dynamo_deploy/logging.md](../../docs/guides/dynamo_deploy/logging.md).
+For detailed documentation on collecting and visualizing logs on Kubernetes, see [docs/kubernetes/logging.md](../../docs/kubernetes/logging.md).
 # Dynamo Metrics Collection on Kubernetes
-For detailed documentation on collecting and visualizing metrics on Kubernetes, see [docs/guides/dynamo_deploy/metrics.md](../../../docs/guides/dynamo_deploy/metrics.md).
+For detailed documentation on collecting and visualizing metrics on Kubernetes, see [docs/kubernetes/metrics.md](../../../docs/kubernetes/metrics.md).
@@ -6,7 +6,7 @@ This directory contains utilities and manifests for Dynamo benchmarking and prof
 **Before using these utilities, you must first set up Dynamo Cloud following the main installation guide:**
-👉 **[Follow the Dynamo Cloud installation guide](/docs/guides/dynamo_deploy/installation_guide.md) to install the Dynamo Kubernetes Platform first.**
+👉 **[Follow the Dynamo Cloud installation guide](/docs/kubernetes/installation_guide.md) to install the Dynamo Kubernetes Platform first.**
 This includes:
 1. Installing the Dynamo CRDs
......
@@ -56,7 +56,7 @@ fi
 if ! kubectl get pods -n "$NAMESPACE" | grep -q "dynamo-platform"; then
 warn "Dynamo platform pods not found in namespace $NAMESPACE"
 warn "Please ensure Dynamo Cloud platform is installed first:"
-warn " See: docs/guides/dynamo_deploy/installation_guide.md"
+warn " See: docs/kubernetes/installation_guide.md"
 if [[ -z "${FORCE:-}" && -z "${YES:-}" ]]; then
 read -p "Continue anyway? [y/N]: " -r ans
 [[ "$ans" =~ ^[Yy]$ ]] || exit 1
......
@@ -110,7 +110,7 @@ Finally, SLA planner applies the change by scaling up/down the number of prefill
 ### K8s Deployment
-For detailed deployment instructions including setup, configuration, troubleshooting, and architecture overview, see the [SLA Planner Deployment Guide](../guides/dynamo_deploy/sla_planner_deployment.md).
+For detailed deployment instructions including setup, configuration, troubleshooting, and architecture overview, see the [SLA Planner Deployment Guide](../kubernetes/sla_planner_deployment.md).
 **To deploy SLA Planner:**
```bash ```bash
......
@@ -97,7 +97,7 @@ Client-side benchmarking runs on your local machine and connects to Kubernetes d
 Follow these steps to benchmark Dynamo deployments using client-side benchmarking:
 ### Step 1: Establish Kubernetes Cluster and Install Dynamo
-Set up your Kubernetes cluster with NVIDIA GPUs and install the Dynamo Cloud platform. First follow the [installation guide](/docs/guides/dynamo_deploy/installation_guide.md) to install Dynamo Cloud, then use [deploy/utils/README](../../deploy/utils/README.md) to set up benchmarking resources.
+Set up your Kubernetes cluster with NVIDIA GPUs and install the Dynamo Cloud platform. First follow the [installation guide](/docs/kubernetes/installation_guide.md) to install Dynamo Cloud, then use [deploy/utils/README](../../deploy/utils/README.md) to set up benchmarking resources.
 ### Step 2: Deploy DynamoGraphDeployments
 Deploy your DynamoGraphDeployments separately using the [deployment documentation](../../components/backends/). Each deployment should have a frontend service exposed.
......
@@ -89,7 +89,7 @@ SLA planner can work with any interpolation data that follows the above format.
 ## Running the Profiling Script in Kubernetes
-Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](../../deploy/README.md), then set up profiling resources using [deploy/utils/README](../../deploy/utils/README.md). If your namespace is already set up, skip this step.
+Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](/docs/kubernetes/installation_guide.md), then set up profiling resources using [deploy/utils/README](/deploy/utils/README.md). If your namespace is already set up, skip this step.
 **Prerequisites**: Ensure all dependencies are installed. If you ran the setup script above, dependencies are already installed. Otherwise, install them manually:
```bash ```bash
......
@@ -146,4 +146,4 @@ curl -d '{"model": "Qwen/Qwen3-0.6B", "max_completion_tokens": 2049, "messages":
 - [Distributed Runtime Architecture](../architecture/distributed_runtime.md)
 - [Dynamo Architecture Overview](../architecture/architecture.md)
 - [Backend Guide](backend.md)
-- [Log Aggregation in Kubernetes](dynamo_deploy/logging.md)
+- [Log Aggregation in Kubernetes](../kubernetes/logging.md)
@@ -31,7 +31,7 @@ Dynamo automatically exposes metrics with the `dynamo_` name prefixes. It also a
 **Specialized Component Metrics**: Components can also expose additional metrics specific to their functionality. For example, a `preprocessor` component exposes metrics with the `dynamo_preprocessor_*` prefix. See the [Available Metrics section](../../deploy/metrics/README.md#available-metrics) for details on specialized component metrics.
-**Kubernetes Integration**: For comprehensive Kubernetes deployment and monitoring setup, see the [Kubernetes Metrics Guide](dynamo_deploy/metrics.md). This includes Prometheus Operator setup, metrics collection configuration, and visualization in Grafana.
+**Kubernetes Integration**: For comprehensive Kubernetes deployment and monitoring setup, see the [Kubernetes Metrics Guide](../kubernetes/metrics.md). This includes Prometheus Operator setup, metrics collection configuration, and visualization in Grafana.
 ## Metrics Hierarchy
......
@@ -24,16 +24,16 @@
 API/nixl_connect/write_operation.md
 API/nixl_connect/README.md
-guides/dynamo_deploy/api_reference.md
-guides/dynamo_deploy/create_deployment.md
-guides/dynamo_deploy/fluxcd.md
-guides/dynamo_deploy/gke_setup.md
-guides/dynamo_deploy/grove.md
-guides/dynamo_deploy/model_caching_with_fluid.md
-guides/dynamo_deploy/README.md
+kubernetes/api_reference.md
+kubernetes/create_deployment.md
+kubernetes/fluxcd.md
+kubernetes/gke_setup.md
+kubernetes/grove.md
+kubernetes/model_caching_with_fluid.md
+kubernetes/README.md
 guides/dynamo_run.md
-guides/dynamo_deploy/sla_planner_deployment.md
+kubernetes/sla_planner_deployment.md
 guides/metrics.md
 guides/run_kvbm_in_vllm.md
 guides/run_kvbm_in_trtllm.md
......
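Nearly every path change in this commit follows one pattern: the `guides/dynamo_deploy/` segment becomes `kubernetes/`. As a rough illustration only, a bulk rewrite of that pattern could be sketched as below. This script is not part of the commit, and the names `rewrite_links` and `rewrite_tree` are hypothetical. It would not cover links written relative to `docs/guides/` itself (for example `dynamo_deploy/logging.md` becoming `../kubernetes/logging.md`), nor the profiling-guide links that switch to root-relative targets; those would still need manual edits.

```python
from pathlib import Path

# Assumption: every occurrence of the old segment should map to the new one.
OLD_SEGMENT = "guides/dynamo_deploy/"
NEW_SEGMENT = "kubernetes/"


def rewrite_links(text: str) -> str:
    """Replace the old docs path segment with the new one."""
    return text.replace(OLD_SEGMENT, NEW_SEGMENT)


def rewrite_tree(root: Path, suffixes=(".md", ".txt")) -> list[Path]:
    """Rewrite matching files under root in place; return the files changed."""
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        original = path.read_text(encoding="utf-8")
        updated = rewrite_links(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Running `rewrite_tree(Path("."))` from a repository root would report which files were touched, so the result can be reviewed with `git diff` before committing.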