Unverified commit 4a718028, authored by Julien Mancuso and committed by GitHub

feat: revamp kubernetes doc (#3173)


Signed-off-by: Julien Mancuso <161955438+julienmancuso@users.noreply.github.com>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
parent 13a5d61b
......@@ -148,7 +148,7 @@ Rerun with `curl -N` and change `stream` in the request to `true` to get the res
### Deploying Dynamo
- Follow the [Quickstart Guide](docs/guides/dynamo_deploy/README.md) to deploy on Kubernetes.
- Follow the [Quickstart Guide](docs/kubernetes/README.md) to deploy on Kubernetes.
- Check out [Backends](components/backends) to deploy various workflow configurations (e.g. SGLang with router, vLLM with disaggregated serving, etc.)
- Run some [Examples](examples) to learn about building components in Dynamo and exploring various integrations.
......
......@@ -74,7 +74,7 @@ extraPodSpec:
Before using these templates, ensure you have:
1. **Dynamo Cloud Platform installed** - See [Installing Dynamo Cloud](../../../../docs/guides/dynamo_deploy/installation_guide.md)
1. **Dynamo Cloud Platform installed** - See [Installing Dynamo Cloud](../../../../docs/kubernetes/installation_guide.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for SGLang runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
......@@ -144,9 +144,9 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you
## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/installation_guide.md)
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
......@@ -159,4 +159,4 @@ Common issues and solutions:
3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds`
4. **Out of memory**: Increase memory limits or reduce model batch size
For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/README.md).
For additional support, refer to the [deployment guide](../../../../docs/kubernetes/README.md).
......@@ -102,7 +102,7 @@ extraPodSpec:
Before using these templates, ensure you have:
1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/kubernetes/README.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for TensorRT-LLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
......@@ -153,7 +153,7 @@ args:
### 3. Deploy
See the [Create Deployment Guide](../../../../docs/guides/dynamo_deploy/create_deployment.md) to learn how to deploy the deployment file.
See the [Create Deployment Guide](../../../../docs/kubernetes/create_deployment.md) to learn how to deploy the deployment file.
First, create a secret for the HuggingFace token.
```bash
......@@ -277,9 +277,9 @@ Configure the `model` name and `host` based on your deployment.
## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/installation_guide.md)
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
- **Multinode Deployment**: [Multinode Examples](../multinode/multinode-examples.md)
......@@ -298,4 +298,4 @@ Common issues and solutions:
6. **Git LFS issues**: Ensure git-lfs is installed before building containers
7. **ARM deployment**: Use `--platform linux/arm64` when building on ARM machines
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/kubernetes/README.md).
......@@ -82,7 +82,7 @@ extraPodSpec:
Before using these templates, ensure you have:
1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/kubernetes/README.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for vLLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
......@@ -234,10 +234,10 @@ args:
## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/installation_guide.md)
- **SLA Planner**: [SLA Planner Deployment Guide](../../../../docs/guides/dynamo_deploy/sla_planner_deployment.md)
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
- **SLA Planner**: [SLA Planner Deployment Guide](../../../../docs/kubernetes/sla_planner_deployment.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
......@@ -251,4 +251,4 @@ Common issues and solutions:
4. **Out of memory**: Increase memory limits or reduce model batch size
5. **Port forwarding issues**: Ensure correct pod UUID in port-forward command
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/kubernetes/README.md).
......@@ -17,4 +17,4 @@ limitations under the License.
# Dynamo Kubernetes Platform CRDs Helm Chart
This chart installs the [CRDs](../../../../docs/guides/dynamo_deploy/api_reference.md) for the Dynamo Kubernetes Platform.
\ No newline at end of file
This chart installs the [CRDs](../../../../docs/kubernetes/api_reference.md) for the Dynamo Kubernetes Platform.
\ No newline at end of file
......@@ -103,7 +103,7 @@ For detailed etcd configuration options beyond `etcd.enabled`, please refer to t
## 📚 Additional Resources
- [Dynamo Cloud Deployment Installation Guide](../../../../docs/guides/dynamo_deploy/installation_guide.md)
- [Dynamo Cloud Deployment Installation Guide](../../../../docs/kubernetes/installation_guide.md)
- [NATS Documentation](https://docs.nats.io/)
- [etcd Documentation](https://etcd.io/docs/)
- [Kubernetes Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)
......
......@@ -57,7 +57,7 @@ For detailed etcd configuration options beyond `etcd.enabled`, please refer to t
## 📚 Additional Resources
- [Dynamo Cloud Deployment Installation Guide](../../../../docs/guides/dynamo_deploy/installation_guide.md)
- [Dynamo Cloud Deployment Installation Guide](../../../../docs/kubernetes/installation_guide.md)
- [NATS Documentation](https://docs.nats.io/)
- [etcd Documentation](https://etcd.io/docs/)
- [Kubernetes Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)
......
......@@ -288,7 +288,7 @@ generate-api-docs: crd-ref-docs ## Generate API reference documentation from CRD
--output-path=./docs/api_reference.md
@echo "✅ Generated API reference at ./docs/api_reference.md"
# concatenate header.md and api_reference.md
cat docs/header.md ./docs/api_reference.md > ../../../docs/guides/dynamo_deploy/api_reference.md
cat docs/header.md ./docs/api_reference.md > ../../../docs/kubernetes/api_reference.md
rm ./docs/api_reference.md
@echo "✅ Concatenated header.md and api_reference.md"
......
......@@ -24,4 +24,4 @@ make
### Install
See [Dynamo Kubernetes Platform Installation Guide](/docs/guides/dynamo_deploy/installation_guide.md) for installation instructions.
See [Dynamo Kubernetes Platform Installation Guide](/docs/kubernetes/installation_guide.md) for installation instructions.
......@@ -24,7 +24,7 @@ Currently, these setups are only supported with the kGateway based Inference Gat
### 1. Install Dynamo Platform ###
[See Quickstart Guide](../../docs/guides/dynamo_deploy/README.md) to install Dynamo Cloud.
[See Quickstart Guide](../../docs/kubernetes/README.md) to install Dynamo Cloud.
### 2. Deploy Inference Gateway ###
......
# Dynamo Logging on Kubernetes
For detailed documentation on collecting and visualizing logs on Kubernetes, see [docs/guides/dynamo_deploy/logging.md](../../docs/guides/dynamo_deploy/logging.md).
For detailed documentation on collecting and visualizing logs on Kubernetes, see [docs/kubernetes/logging.md](../../docs/kubernetes/logging.md).
# Dynamo Metrics Collection on Kubernetes
For detailed documentation on collecting and visualizing metrics on Kubernetes, see [docs/guides/dynamo_deploy/metrics.md](../../../docs/guides/dynamo_deploy/metrics.md).
For detailed documentation on collecting and visualizing metrics on Kubernetes, see [docs/kubernetes/metrics.md](../../../docs/kubernetes/metrics.md).
......@@ -6,7 +6,7 @@ This directory contains utilities and manifests for Dynamo benchmarking and prof
**Before using these utilities, you must first set up Dynamo Cloud following the main installation guide:**
👉 **[Follow the Dynamo Cloud installation guide](/docs/guides/dynamo_deploy/installation_guide.md) to install the Dynamo Kubernetes Platform first.**
👉 **[Follow the Dynamo Cloud installation guide](/docs/kubernetes/installation_guide.md) to install the Dynamo Kubernetes Platform first.**
This includes:
1. Installing the Dynamo CRDs
......
......@@ -56,7 +56,7 @@ fi
if ! kubectl get pods -n "$NAMESPACE" | grep -q "dynamo-platform"; then
warn "Dynamo platform pods not found in namespace $NAMESPACE"
warn "Please ensure Dynamo Cloud platform is installed first:"
warn " See: docs/guides/dynamo_deploy/installation_guide.md"
warn " See: docs/kubernetes/installation_guide.md"
if [[ -z "${FORCE:-}" && -z "${YES:-}" ]]; then
read -p "Continue anyway? [y/N]: " -r ans
[[ "$ans" =~ ^[Yy]$ ]] || exit 1
......
......@@ -110,7 +110,7 @@ Finally, SLA planner applies the change by scaling up/down the number of prefill
### K8s Deployment
For detailed deployment instructions including setup, configuration, troubleshooting, and architecture overview, see the [SLA Planner Deployment Guide](../guides/dynamo_deploy/sla_planner_deployment.md).
For detailed deployment instructions including setup, configuration, troubleshooting, and architecture overview, see the [SLA Planner Deployment Guide](../kubernetes/sla_planner_deployment.md).
**To deploy SLA Planner:**
```bash
......
......@@ -97,7 +97,7 @@ Client-side benchmarking runs on your local machine and connects to Kubernetes d
Follow these steps to benchmark Dynamo deployments using client-side benchmarking:
### Step 1: Establish Kubernetes Cluster and Install Dynamo
Set up your Kubernetes cluster with NVIDIA GPUs and install the Dynamo Cloud platform. First follow the [installation guide](/docs/guides/dynamo_deploy/installation_guide.md) to install Dynamo Cloud, then use [deploy/utils/README](../../deploy/utils/README.md) to set up benchmarking resources.
Set up your Kubernetes cluster with NVIDIA GPUs and install the Dynamo Cloud platform. First follow the [installation guide](/docs/kubernetes/installation_guide.md) to install Dynamo Cloud, then use [deploy/utils/README](../../deploy/utils/README.md) to set up benchmarking resources.
### Step 2: Deploy DynamoGraphDeployments
Deploy your DynamoGraphDeployments separately using the [deployment documentation](../../components/backends/). Each deployment should have a frontend service exposed.
......
......@@ -89,7 +89,7 @@ SLA planner can work with any interpolation data that follows the above format.
## Running the Profiling Script in Kubernetes
Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](../../deploy/README.md), then set up profiling resources using [deploy/utils/README](../../deploy/utils/README.md). If your namespace is already set up, skip this step.
Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](/docs/kubernetes/installation_guide.md), then set up profiling resources using [deploy/utils/README](/deploy/utils/README.md). If your namespace is already set up, skip this step.
**Prerequisites**: Ensure all dependencies are installed. If you ran the setup script above, dependencies are already installed. Otherwise, install them manually:
```bash
......
......@@ -146,4 +146,4 @@ curl -d '{"model": "Qwen/Qwen3-0.6B", "max_completion_tokens": 2049, "messages":
- [Distributed Runtime Architecture](../architecture/distributed_runtime.md)
- [Dynamo Architecture Overview](../architecture/architecture.md)
- [Backend Guide](backend.md)
- [Log Aggregation in Kubernetes](dynamo_deploy/logging.md)
- [Log Aggregation in Kubernetes](../kubernetes/logging.md)
......@@ -31,7 +31,7 @@ Dynamo automatically exposes metrics with the `dynamo_` name prefixes. It also a
**Specialized Component Metrics**: Components can also expose additional metrics specific to their functionality. For example, a `preprocessor` component exposes metrics with the `dynamo_preprocessor_*` prefix. See the [Available Metrics section](../../deploy/metrics/README.md#available-metrics) for details on specialized component metrics.
**Kubernetes Integration**: For comprehensive Kubernetes deployment and monitoring setup, see the [Kubernetes Metrics Guide](dynamo_deploy/metrics.md). This includes Prometheus Operator setup, metrics collection configuration, and visualization in Grafana.
**Kubernetes Integration**: For comprehensive Kubernetes deployment and monitoring setup, see the [Kubernetes Metrics Guide](../kubernetes/metrics.md). This includes Prometheus Operator setup, metrics collection configuration, and visualization in Grafana.
## Metrics Hierarchy
......
......@@ -24,16 +24,16 @@
API/nixl_connect/write_operation.md
API/nixl_connect/README.md
guides/dynamo_deploy/api_reference.md
guides/dynamo_deploy/create_deployment.md
guides/dynamo_deploy/fluxcd.md
guides/dynamo_deploy/gke_setup.md
guides/dynamo_deploy/grove.md
guides/dynamo_deploy/model_caching_with_fluid.md
guides/dynamo_deploy/README.md
kubernetes/api_reference.md
kubernetes/create_deployment.md
kubernetes/fluxcd.md
kubernetes/gke_setup.md
kubernetes/grove.md
kubernetes/model_caching_with_fluid.md
kubernetes/README.md
guides/dynamo_run.md
guides/dynamo_deploy/sla_planner_deployment.md
kubernetes/sla_planner_deployment.md
guides/metrics.md
guides/run_kvbm_in_vllm.md
guides/run_kvbm_in_trtllm.md
......
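
Every hunk in this commit applies the same substitution: links under `guides/dynamo_deploy/` now point to `kubernetes/`. A minimal sketch of how leftover references could be detected and rewritten is shown below. It is not part of the commit; the helper names `find_stale_links` and `rewrite_links` are hypothetical, and the two prefixes are assumptions taken from the paths visible in the diff.

```python
# Hypothetical helpers, not part of the Dynamo repository.
# Assumption: the rename is a pure prefix swap, as the hunks above suggest.
OLD_PREFIX = "guides/dynamo_deploy/"
NEW_PREFIX = "kubernetes/"


def find_stale_links(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still mention the old directory."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if OLD_PREFIX in line
    ]


def rewrite_links(text: str) -> str:
    """Replace the old directory prefix with the new one everywhere in text."""
    return text.replace(OLD_PREFIX, NEW_PREFIX)


if __name__ == "__main__":
    sample = (
        "- Follow the [Quickstart Guide](docs/guides/dynamo_deploy/README.md)\n"
        "- Run some [Examples](examples)\n"
    )
    for number, line in find_stale_links(sample):
        print(f"stale link on line {number}: {line}")
    print(rewrite_links(sample), end="")
```

A plain prefix swap covers links of the form `docs/guides/dynamo_deploy/...` and `../guides/dynamo_deploy/...`. It does not cover links written relative to `docs/guides/` itself, such as `dynamo_deploy/logging.md`, which this commit changes to `../kubernetes/logging.md`; those would need separate handling.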