Unverified Commit 39d645e5 authored by Jonathan Tong's avatar Jonathan Tong Committed by GitHub
Browse files

docs: migrate Fern docs from fern/ into docs/ (#6206)


Signed-off-by: default avatarJont828 <jt572@cornell.edu>
parent d381e6ff
---
orphan: true
---
# <Backend> Guide
Advanced deployment and configuration for the <Backend> backend.
## Deployment
### Single-Node Setup
<!-- Local deployment instructions -->
### Multi-Node Setup
<!-- Distributed deployment with TP/PP -->
### Kubernetes Deployment
```yaml
# Full DGDR example
```
## Configuration
### CLI Arguments
| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| <!-- arg --> | <!-- type --> | <!-- default --> | <!-- description --> |
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| <!-- var --> | <!-- default --> | <!-- description --> |
### Model Configuration
<!-- Model-specific settings, quantization -->
## Performance Tuning
### Memory Optimization
<!-- KV cache sizing, batch limits -->
### Throughput Optimization
<!-- Concurrency, prefill/decode settings -->
## Troubleshooting
### Common Issues
| Issue | Cause | Solution |
|-------|-------|----------|
| <!-- issue --> | <!-- cause --> | <!-- solution --> |
### Debug Mode
```bash
# Add debug command from existing docs
```
## See Also
| Document | Path |
|----------|------|
| `<Backend> Overview` | `./README.md` |
| Backend Comparison | `../README.md` |
<!-- Convert to links when using template -->
---
orphan: true
---
# <Backend> Backend
<!-- 2-3 sentence overview of this backend integration -->
## Feature Matrix
<!-- Copy actual feature matrix from existing backend docs -->
<!-- Example pattern (from vLLM README): -->
| Feature | Status | Notes |
|---------|--------|-------|
| Disaggregated Serving | ✅ | |
| KV-Aware Routing | ✅ | |
| SLA-Based Planner | ✅ | |
| Multimodal | ✅ | Vision models |
| LoRA | 🚧 | Experimental |
## Quick Start
### Prerequisites
- <!-- List prerequisites -->
### Usage
```bash
# Add minimal usage example from existing backend docs
# Example pattern (vLLM):
# python -m dynamo.vllm --model <model-name>
# Example pattern (SGLang):
# python -m dynamo.sglang --model <model-name>
```
### Kubernetes
```yaml
# Add DGDR example - use apiVersion: nvidia.com/v1alpha1
# See recipes/ folder for production examples
```
## Configuration
| Parameter | Default | Description |
|-----------|---------|-------------|
| <!-- param --> | <!-- default --> | <!-- description --> |
<!-- EXAMPLE: Filled-in Configuration for vLLM would look like:
| Parameter | Default | Description |
|-----------|---------|-------------|
| `--model` | required | Model path or HuggingFace ID |
| `--tensor-parallel-size` | `1` | Number of GPUs for tensor parallelism |
| `--max-model-len` | auto | Maximum sequence length |
-->
## Next Steps
| Document | Path | Description |
|----------|------|-------------|
| `<Backend> Guide` | `<backend>_guide.md` | Advanced configuration |
| Backend Comparison | `../README.md` | Compare backends |
<!-- Convert table rows to markdown links -->
---
orphan: true
---
# <Component> Design
Architecture and design decisions for the <Component>.
## Overview
<!-- High-level architecture description -->
## Design Goals
1. **Goal 1** - Description
2. **Goal 2** - Description
3. **Goal 3** - Description
## Architecture
### Components
<!-- Description of internal components -->
### Data Flow
```
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Input │───▶│ Process │───▶│ Output │
└─────────┘ └─────────┘ └─────────┘
```
## Design Decisions
### Decision 1: <!-- Title -->
**Context:** <!-- What problem were we solving? -->
**Options Considered:**
1. Option A - Pros/Cons
2. Option B - Pros/Cons
**Decision:** <!-- What we chose and why -->
**Consequences:** <!-- Trade-offs accepted -->
## Algorithms
### <!-- Algorithm Name -->
<!-- Algorithm description -->
```
Pseudocode or formula
```
## Performance Considerations
<!-- Performance characteristics, bottlenecks, optimization opportunities -->
## Future Work
- <!-- Planned improvement 1 -->
- <!-- Planned improvement 2 -->
## References
- <!-- Related design docs -->
- <!-- External papers or resources -->
---
orphan: true
---
# <Component> Examples
Usage examples for the <Component>.
## Basic Examples
### Example 1: <!-- Title -->
```bash
# Add example from existing docs
```
### Example 2: <!-- Title -->
```python
# Add example from existing docs
```
## Kubernetes Examples
### Minimal Deployment
```yaml
# Add minimal DGDR from existing docs
```
### Production Deployment
```yaml
# Add production DGDR from existing docs
```
## Advanced Examples
### <!-- Advanced Use Case Title -->
<!-- Description -->
```bash
# Add example
```
## Sample Configurations
### config-minimal.yaml
```yaml
# Add from existing docs
```
---
orphan: true
---
# <Component> Guide
This guide covers deployment, configuration, and integration for the <Component>.
## Deployment
### Single-Node Setup
<!-- Instructions for local/single-node deployment -->
### Multi-Node Setup
<!-- Instructions for distributed deployment -->
### Kubernetes Deployment
```yaml
# Full DGDR example
```
## Configuration
### CLI Arguments
| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| <!-- arg --> | <!-- type --> | <!-- default --> | <!-- description --> |
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| <!-- var --> | <!-- default --> | <!-- description --> |
### Configuration File
```yaml
# Add config file example if applicable
```
## Integration
### With Router
<!-- How to integrate with Router -->
### With Planner
<!-- How to integrate with Planner -->
### With Observability
<!-- Metrics, logging, tracing integration -->
## Troubleshooting
### Common Issues
| Issue | Cause | Solution |
|-------|-------|----------|
| Error message | Root cause | Fix |
### Debug Mode
```bash
# Add debug command from existing docs
```
## See Also
| Document | Path |
|----------|------|
| `<Component> Examples` | `<component>_examples.md` |
| `<Component> Design` | `/docs/design_docs/<component>_design.md` |
<!-- Convert table rows to markdown links -->
---
orphan: true
---
# <Component>
<!-- 2-3 sentence overview of what this component does and its role in Dynamo -->
## Feature Matrix
| Feature | Status |
|---------|--------|
| Feature 1 | ✅ Supported |
| Feature 2 | 🚧 Experimental |
| Feature 3 | ❌ Not Supported |
## Quick Start
### Prerequisites
- <!-- List prerequisites -->
### Usage
```bash
# Add minimal usage example from existing docs
# Example pattern (from Router):
# python -m dynamo.frontend --router-mode kv --http-port 8000
```
### Kubernetes
```yaml
# Add DGDR example - use apiVersion: nvidia.com/v1alpha1
# Example pattern (from Router):
# apiVersion: nvidia.com/v1alpha1
# kind: DynamoGraphDeployment
# metadata:
# name: <component>-deployment
# spec:
# services:
# ...
```
<!-- EXAMPLE: Filled-in Quick Start for Router would look like:
### Prerequisites
- Dynamo platform installed
- At least one backend worker running
### Usage
```bash
python -m dynamo.frontend --router-mode kv --http-port 8000
```
### Kubernetes
```yaml
apiVersion: nvidia.com/v1alpha1
kind: DynamoGraphDeployment
metadata:
name: router-example
spec:
graphs:
- name: frontend
replicas: 1
```
-->
## Configuration
| Parameter | Default | Description |
|-----------|---------|-------------|
| <!-- param --> | <!-- default --> | <!-- description --> |
## Next Steps
| Document | Path | Description |
|----------|------|-------------|
| `<Component> Guide` | `<component>_guide.md` | Deployment and configuration |
| `<Component> Examples` | `<component>_examples.md` | Usage examples |
| `<Component> Design` | `/docs/design_docs/<component>_design.md` | Architecture |
<!-- Convert table rows to markdown links -->
---
orphan: true
---
# <Feature> with <Backend>
Using <Feature> with the <Backend> backend.
## Prerequisites
- <Backend> installed with <feature> support
- <!-- Other requirements -->
## Configuration
### CLI Arguments
| Argument | Default | Description |
|----------|---------|-------------|
| <!-- arg --> | <!-- default --> | <!-- description --> |
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| <!-- var --> | <!-- default --> | <!-- description --> |
## Examples
### Basic Usage
```python
# Add example from existing docs
```
### Kubernetes Deployment
```yaml
# Add DGDR example from existing docs
```
## Limitations
- <!-- Backend-specific limitations -->
## Troubleshooting
| Issue | Solution |
|-------|----------|
| <!-- issue --> | <!-- solution --> |
## See Also
| Document | Path |
|----------|------|
| `<Feature> Overview` | `./README.md` |
| `<Backend> Guide` | `/docs/backends/<backend>/README.md` |
<!-- Convert to links: [Multimodal Overview](./README.md) -->
---
orphan: true
---
# <Feature>
<!-- 2-3 sentence overview of this cross-cutting feature -->
## Backend Support
<!-- Copy actual backend support from existing feature docs -->
<!-- Example pattern (from Multimodal index.md): -->
| Backend | Status | Notes |
|---------|--------|-------|
| vLLM | ✅ | Full support |
| SGLang | ✅ | |
| TensorRT-LLM | 🚧 | Limited support |
See the Feature Matrix for full compatibility.
## Overview
<!-- How this feature works across backends -->
## Quick Start
<!-- Add minimal example from existing feature docs -->
## Backend-Specific Guides
| Backend | Guide |
|---------|-------|
| vLLM | `<feature>_vllm.md` |
| SGLang | `<feature>_sglang.md` |
| TensorRT-LLM | `<feature>_trtllm.md` |
<!-- Convert table rows to markdown links -->
## See Also
- <!-- Related features -->
- <!-- Related components -->
---
orphan: true
---
# Dynamo <Component>
<!-- One-sentence description -->
See `docs/components/<component>/` for full documentation.
<!-- When using this template, replace with actual link to component docs.
For backends, use: docs/backends/<backend>/
-->
---
orphan: true
---
# <Topic>
<!-- 2-3 sentence overview of this infrastructure topic. -->
## Quick Start
<!-- Minimal steps to get started -->
## Guides
| Guide | Path |
|-------|------|
| Guide 1 | `<subtopic1>.md` |
| Guide 2 | `<subtopic2>.md` |
## Reference
<!-- Links to reference material -->
## See Also
| Topic | Path |
|-------|------|
| Related topic 1 | `../related/` |
| Related topic 2 | `../other/` |
---
orphan: true
---
# <Integration> Integration
<!-- 2-3 sentence overview of this external integration -->
## Version Compatibility
| Dynamo | <Integration> | Notes |
|--------|---------------|-------|
| 0.9.x | 1.2.x | Recommended |
| 0.8.x | 1.1.x | |
## Backend Support
| Backend | Status | Notes |
|---------|--------|-------|
| vLLM | ✅ | |
| SGLang | 🚧 | |
| TensorRT-LLM | ❌ | |
## Quick Start
```bash
# Add installation and usage from existing integration docs
# Example pattern (LMCache):
# python -m dynamo.vllm --model <model> --connector lmcache
```
## Configuration
| Parameter | Default | Description |
|-----------|---------|-------------|
| <!-- param --> | <!-- default --> | <!-- description --> |
## Guides
| Document | Path | Description |
|----------|------|-------------|
| `<Integration> Setup` | `<integration>_setup.md` | Installation and configuration |
| `<Integration> with vLLM` | `<integration>_vllm.md` | vLLM-specific usage |
<!-- Convert table rows to markdown links -->
## External Resources
- [<Integration> Documentation](https://...)
- [<Integration> GitHub](https://github.com/...)
......@@ -270,6 +270,8 @@ navigation:
path: ../pages/backends/vllm/prometheus.md
- page: Prompt Embeddings
path: ../pages/backends/vllm/prompt-embeddings.md
- page: vLLM-Omni
path: ../pages/backends/vllm/vllm-omni.md
- section: SGLang Details
contents:
- page: Expert Distribution (EPLB)
......
......@@ -82,4 +82,4 @@ If you're running Kubernetes/cloud deployment examples (EKS, AKS, GKE), you'll a
| **kubectl** | v1.24+ | [Install kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) |
| **Helm** | v3.0+ | [Install Helm](https://helm.sh/docs/intro/install/) |
See the [Kubernetes Installation Guide](/docs/kubernetes/installation_guide.md#prerequisites) for detailed setup instructions and pre-deployment checks.
See the [Kubernetes Installation Guide](/docs/pages/kubernetes/installation-guide.md#prerequisites) for detailed setup instructions and pre-deployment checks.
......@@ -74,7 +74,7 @@ extraPodSpec:
Before using these templates, ensure you have:
1. **Dynamo Kubernetes Platform installed** - See [Installing Dynamo Kubernetes Platform](../../../../docs/kubernetes/installation_guide.md)
1. **Dynamo Kubernetes Platform installed** - See [Installing Dynamo Kubernetes Platform](../../../../docs/pages/kubernetes/installation-guide.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for SGLang runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
......@@ -144,10 +144,10 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you
## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/deployment/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/kubernetes/installation_guide.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/pages/kubernetes/deployment/create-deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/pages/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/pages/kubernetes/installation-guide.md)
- **Examples**: [Deployment Examples](../../../../docs/pages/getting-started/examples.md)
- **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
## Troubleshooting
......@@ -159,4 +159,4 @@ Common issues and solutions:
3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds`
4. **Out of memory**: Increase memory limits or reduce model batch size
For additional support, refer to the [deployment guide](../../../../docs/kubernetes/README.md).
For additional support, refer to the [deployment guide](../../../../docs/pages/kubernetes/README.md).
......@@ -17,7 +17,7 @@ For this example, we will make some assumptions about your SLURM cluster:
If your cluster supports similar container based plugins, you may be able to
modify the template to use that instead.
3. We assume you have already built a recent Dynamo+SGLang container image as
described [here](../../../../docs/backends/sglang/README.md#using-docker-containers).
described [here](../../../../docs/pages/backends/sglang/README.md#using-docker-containers).
This is the image that can be passed to the `--container-image` argument in later steps.
## Scripts Overview
......
......@@ -223,6 +223,6 @@ To add other backends (TensorRT, ONNX, Python, etc.), edit the Makefile's `build
## Related Documentation
- [Dynamo Backend Guide](../../../docs/development/backend-guide.md)
- [Dynamo Backend Guide](../../../docs/pages/development/backend-guide.md)
- [Triton Inference Server](https://github.com/triton-inference-server/server)
- [KServe Protocol](https://kserve.github.io/website/latest/modelserving/data_plane/v2_protocol/)
......@@ -53,7 +53,7 @@ Advanced disaggregated deployment with SLA-based automatic scaling.
- `TRTLLMPrefillWorker`: Specialized prefill-only worker
> [!NOTE]
> This deployment requires pre-deployment profiling to be completed first. See [Pre-Deployment Profiling](../../../../docs/components/profiler/profiler_guide.md) for detailed instructions.
> This deployment requires pre-deployment profiling to be completed first. See [Pre-Deployment Profiling](../../../../docs/pages/components/profiler/profiler-guide.md) for detailed instructions.
## CRD Structure
......@@ -102,7 +102,7 @@ extraPodSpec:
Before using these templates, ensure you have:
1. **Dynamo Kubernetes Platform installed** - See [Quickstart Guide](../../../../docs/kubernetes/README.md)
1. **Dynamo Kubernetes Platform installed** - See [Quickstart Guide](../../../../docs/pages/kubernetes/README.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for TensorRT-LLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
......@@ -155,7 +155,7 @@ args:
### 3. Deploy
See the [Create Deployment Guide](../../../../docs/kubernetes/deployment/create_deployment.md) to learn how to deploy the deployment file.
See the [Create Deployment Guide](../../../../docs/pages/kubernetes/deployment/create-deployment.md) to learn how to deploy the deployment file.
First, create a secret for the HuggingFace token.
```bash
......@@ -219,7 +219,7 @@ TensorRT-LLM workers are configured through command-line arguments in the deploy
## Testing the Deployment
Send a test request to verify your deployment. See the [client section](../../../../docs/backends/vllm/README.md#client) for detailed instructions.
Send a test request to verify your deployment. See the [client section](../../../../docs/pages/backends/vllm/README.md#client) for detailed instructions.
**Note:** For multi-node deployments, target the node running `python3 -m dynamo.frontend <args>`.
......@@ -241,11 +241,11 @@ TensorRT-LLM supports two methods for KV cache transfer in disaggregated serving
- **UCX** (default): Standard method for KV cache transfer
- **NIXL** (experimental): Alternative transfer method
For detailed configuration instructions, see the [KV cache transfer guide](../../../../docs/backends/trtllm/kv-cache-transfer.md).
For detailed configuration instructions, see the [KV cache transfer guide](../../../../docs/pages/backends/trtllm/kv-cache-transfer.md).
## Request Migration
You can enable [request migration](../../../../docs/fault_tolerance/request_migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:
You can enable [request migration](../../../../docs/pages/fault-tolerance/request-migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:
```yaml
args:
......@@ -264,13 +264,13 @@ Configure the `model` name and `host` based on your deployment.
## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/deployment/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/kubernetes/installation_guide.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/design_docs/disagg_serving.md), [KV-Aware Routing](../../../../docs/components/router/README.md)
- **Multinode Deployment**: [Multinode Examples](../../../../docs/backends/trtllm/multinode/multinode-examples.md)
- **Speculative Decoding**: [Llama 4 + Eagle Guide](../../../../docs/backends/trtllm/llama4_plus_eagle.md)
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/pages/kubernetes/deployment/create-deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/pages/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/pages/kubernetes/installation-guide.md)
- **Examples**: [Deployment Examples](../../../../docs/pages/getting-started/examples.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/pages/design-docs/disagg-serving.md), [KV-Aware Routing](../../../../docs/pages/components/router/README.md)
- **Multinode Deployment**: [Multinode Examples](../../../../docs/pages/backends/trtllm/multinode/multinode-examples.md)
- **Speculative Decoding**: [Llama 4 + Eagle Guide](../../../../docs/pages/backends/trtllm/llama4-plus-eagle.md)
- **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
## Troubleshooting
......@@ -285,4 +285,4 @@ Common issues and solutions:
6. **Git LFS issues**: Ensure git-lfs is installed before building containers
7. **ARM deployment**: Use `--platform linux/arm64` when building on ARM machines
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/kubernetes/README.md).
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/pages/kubernetes/README.md).
......@@ -41,7 +41,7 @@ Please note that:
3. `post_process.py` - Scan the aiperf results to produce a json with entries to each config point.
4. `plot_performance_comparison.py` - Takes the json result file for disaggregated and/or aggregated configuration sweeps and plots a pareto line for better visualization.
For more finer grained details on how to launch TRTLLM backend workers with DeepSeek R1 on GB200 slurm, please refer [multinode-examples.md](../../../../docs/backends/trtllm/multinode/multinode-examples.md). This guide shares similar assumption to the multinode examples guide.
For more finer grained details on how to launch TRTLLM backend workers with DeepSeek R1 on GB200 slurm, please refer [multinode-examples.md](../../../../docs/pages/backends/trtllm/multinode/multinode-examples.md). This guide shares similar assumption to the multinode examples guide.
## Usage
......@@ -49,7 +49,7 @@ For more finer grained details on how to launch TRTLLM backend workers with Deep
Before running the scripts, ensure you have:
1. Access to a SLURM cluster
2. Container image of Dynamo with TensorRT-LLM built using instructions from [here](https://github.com/ai-dynamo/dynamo/tree/main/docs/backends/trtllm/README.md#build-container).
2. Container image of Dynamo with TensorRT-LLM built using instructions from [here](https://github.com/ai-dynamo/dynamo/tree/main/docs/pages/backends/trtllm/README.md#build-container).
3. Model files accessible on the cluster
4. Required environment variables set
......@@ -69,7 +69,7 @@ export SLURM_JOB_NAME=""
# NOTE: IMAGE must be set manually for now
# To build an iamge, see the steps here:
# https://github.com/ai-dynamo/dynamo/tree/main/docs/backends/trtllm/README.md#build-container
# https://github.com/ai-dynamo/dynamo/tree/main/docs/pages/backends/trtllm/README.md#build-container
export IMAGE="<dynamo_trtllm_image>"
# NOTE: In general, Deepseek R1 is very large, so it is recommended to
......
......@@ -92,7 +92,7 @@ extraPodSpec:
Before using these templates, ensure you have:
1. **Dynamo Kubernetes Platform installed** - See [Quickstart Guide](../../../../docs/kubernetes/README.md)
1. **Dynamo Kubernetes Platform installed** - See [Quickstart Guide](../../../../docs/pages/kubernetes/README.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for vLLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
......@@ -110,7 +110,7 @@ docker build -f container/rendered.Dockerfile .
### Pre-Deployment Profiling (SLA Planner Only)
If using the SLA Planner deployment (`disagg_planner.yaml`), follow the [pre-deployment profiling guide](../../../../docs/components/profiler/profiler_guide.md) to run pre-deployment profiling.
If using the SLA Planner deployment (`disagg_planner.yaml`), follow the [pre-deployment profiling guide](../../../../docs/pages/components/profiler/profiler-guide.md) to run pre-deployment profiling.
## Usage
......@@ -235,7 +235,7 @@ All templates use **Qwen/Qwen3-0.6B** as the default model, but you can use any
## Request Migration
You can enable [request migration](../../../../docs/fault_tolerance/request_migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:
You can enable [request migration](../../../../docs/pages/fault-tolerance/request-migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:
```yaml
args:
......@@ -245,12 +245,12 @@ args:
## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/deployment/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/kubernetes/installation_guide.md)
- **SLA Planner**: [SLA Planner Quickstart Guide](../../../../docs/components/planner/planner_guide.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/design_docs/disagg_serving.md), [KV-Aware Routing](../../../../docs/components/router/README.md)
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/pages/kubernetes/deployment/create-deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/pages/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/pages/kubernetes/installation-guide.md)
- **SLA Planner**: [SLA Planner Quickstart Guide](../../../../docs/pages/components/planner/planner-guide.md)
- **Examples**: [Deployment Examples](../../../../docs/pages/getting-started/examples.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/pages/design-docs/disagg-serving.md), [KV-Aware Routing](../../../../docs/pages/components/router/README.md)
## Troubleshooting
......@@ -262,4 +262,4 @@ Common issues and solutions:
4. **Out of memory**: Increase memory limits or reduce model batch size
5. **Port forwarding issues**: Ensure correct pod UUID in port-forward command
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/kubernetes/README.md).
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/pages/kubernetes/README.md).
......@@ -11,7 +11,7 @@ This deployment pattern enables dynamic LoRA adapter loading from S3-compatible
- Kubernetes cluster with GPU support
- Helm 3.x installed
- `kubectl` configured to access your cluster
- Dynamo Kubernetes Platform installed ([Installation Guide](../../../../../docs/kubernetes/installation_guide.md))
- Dynamo Kubernetes Platform installed ([Installation Guide](../../../../../docs/pages/kubernetes/installation-guide.md))
- HuggingFace token for downloading Base and LoRA adapters
## Files in This Directory
......@@ -293,5 +293,5 @@ kubectl delete secret hf-token-secret -n ${NAMESPACE}
## Further Reading
- [vLLM Deployment Guide](../README.md) - Other deployment patterns
- [Dynamo Kubernetes Guide](../../../../../docs/kubernetes/README.md) - Platform setup
- [Installation Guide](../../../../../docs/kubernetes/installation_guide.md) - Platform installation
- [Dynamo Kubernetes Guide](../../../../../docs/pages/kubernetes/README.md) - Platform setup
- [Installation Guide](../../../../../docs/pages/kubernetes/installation-guide.md) - Platform installation
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment