Unverified Commit ece08dc9 authored by Neal Vaidya, committed by GitHub

docs: restructure docs directory and move fern config to fern/ (#6700)


Signed-off-by: Neal Vaidya <nealv@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
parent 1412e44b
......@@ -82,4 +82,4 @@ If you're running Kubernetes/cloud deployment examples (EKS, AKS, GKE), you'll a
| **kubectl** | v1.24+ | [Install kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) |
| **Helm** | v3.0+ | [Install Helm](https://helm.sh/docs/intro/install/) |
See the [Kubernetes Installation Guide](/docs/pages/kubernetes/installation-guide.md#prerequisites) for detailed setup instructions and pre-deployment checks.
See the [Kubernetes Installation Guide](/docs/kubernetes/installation-guide.md#prerequisites) for detailed setup instructions and pre-deployment checks.
......@@ -74,7 +74,7 @@ extraPodSpec:
Before using these templates, ensure you have:
1. **Dynamo Kubernetes Platform installed** - See [Installing Dynamo Kubernetes Platform](../../../../docs/pages/kubernetes/installation-guide.md)
1. **Dynamo Kubernetes Platform installed** - See [Installing Dynamo Kubernetes Platform](../../../../docs/kubernetes/installation-guide.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for SGLang runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
......@@ -144,10 +144,10 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you
## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/pages/kubernetes/deployment/create-deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/pages/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/pages/kubernetes/installation-guide.md)
- **Examples**: [Deployment Examples](../../../../docs/pages/getting-started/examples.md)
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/deployment/create-deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/kubernetes/installation-guide.md)
- **Examples**: [Deployment Examples](../../../../docs/getting-started/examples.md)
- **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
## Troubleshooting
......@@ -159,4 +159,4 @@ Common issues and solutions:
3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds`
4. **Out of memory**: Increase memory limits or reduce model batch size
For additional support, refer to the [deployment guide](../../../../docs/pages/kubernetes/README.md).
For additional support, refer to the [deployment guide](../../../../docs/kubernetes/README.md).
......@@ -223,6 +223,6 @@ To add other backends (TensorRT, ONNX, Python, etc.), edit the Makefile's `build
## Related Documentation
- [Dynamo Backend Guide](../../../docs/pages/development/backend-guide.md)
- [Dynamo Backend Guide](../../../docs/development/backend-guide.md)
- [Triton Inference Server](https://github.com/triton-inference-server/server)
- [KServe Protocol](https://kserve.github.io/website/latest/modelserving/data_plane/v2_protocol/)
......@@ -53,7 +53,7 @@ Advanced disaggregated deployment with SLA-based automatic scaling.
- `TRTLLMPrefillWorker`: Specialized prefill-only worker
> [!NOTE]
> This deployment requires pre-deployment profiling to be completed first. See [Pre-Deployment Profiling](../../../../docs/pages/components/profiler/profiler-guide.md) for detailed instructions.
> This deployment requires pre-deployment profiling to be completed first. See [Pre-Deployment Profiling](../../../../docs/components/profiler/profiler-guide.md) for detailed instructions.
## CRD Structure
......@@ -102,7 +102,7 @@ extraPodSpec:
Before using these templates, ensure you have:
1. **Dynamo Kubernetes Platform installed** - See [Quickstart Guide](../../../../docs/pages/kubernetes/README.md)
1. **Dynamo Kubernetes Platform installed** - See [Quickstart Guide](../../../../docs/kubernetes/README.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for TensorRT-LLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
......@@ -155,7 +155,7 @@ args:
### 3. Deploy
See the [Create Deployment Guide](../../../../docs/pages/kubernetes/deployment/create-deployment.md) to learn how to deploy the deployment file.
See the [Create Deployment Guide](../../../../docs/kubernetes/deployment/create-deployment.md) to learn how to deploy the deployment file.
First, create a secret for the HuggingFace token.
```bash
......@@ -219,7 +219,7 @@ TensorRT-LLM workers are configured through command-line arguments in the deploy
## Testing the Deployment
Send a test request to verify your deployment. See the [client section](../../../../docs/pages/backends/vllm/README.md#client) for detailed instructions.
Send a test request to verify your deployment. See the [client section](../../../../docs/backends/vllm/README.md#client) for detailed instructions.
**Note:** For multi-node deployments, target the node running `python3 -m dynamo.frontend <args>`.
......@@ -241,11 +241,11 @@ TensorRT-LLM supports two methods for KV cache transfer in disaggregated serving
- **UCX** (default): Standard method for KV cache transfer
- **NIXL** (experimental): Alternative transfer method
For detailed configuration instructions, see the [KV cache transfer guide](../../../../docs/pages/backends/trtllm/kv-cache-transfer.md).
For detailed configuration instructions, see the [KV cache transfer guide](../../../../docs/backends/trtllm/kv-cache-transfer.md).
## Request Migration
You can enable [request migration](../../../../docs/pages/fault-tolerance/request-migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:
You can enable [request migration](../../../../docs/fault-tolerance/request-migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:
```yaml
args:
......@@ -264,13 +264,13 @@ Configure the `model` name and `host` based on your deployment.
## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/pages/kubernetes/deployment/create-deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/pages/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/pages/kubernetes/installation-guide.md)
- **Examples**: [Deployment Examples](../../../../docs/pages/getting-started/examples.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/pages/design-docs/disagg-serving.md), [KV-Aware Routing](../../../../docs/pages/components/router/README.md)
- **Multinode Deployment**: [Multinode Examples](../../../../docs/pages/backends/trtllm/multinode/multinode-examples.md)
- **Speculative Decoding**: [Llama 4 + Eagle Guide](../../../../docs/pages/backends/trtllm/llama4-plus-eagle.md)
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/deployment/create-deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
- **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/kubernetes/installation-guide.md)
- **Examples**: [Deployment Examples](../../../../docs/getting-started/examples.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/design-docs/disagg-serving.md), [KV-Aware Routing](../../../../docs/components/router/README.md)
- **Multinode Deployment**: [Multinode Examples](../../../../docs/backends/trtllm/multinode/multinode-examples.md)
- **Speculative Decoding**: [Llama 4 + Eagle Guide](../../../../docs/backends/trtllm/llama4-plus-eagle.md)
- **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
## Troubleshooting
......@@ -285,4 +285,4 @@ Common issues and solutions:
6. **Git LFS issues**: Ensure git-lfs is installed before building containers
7. **ARM deployment**: Use `--platform linux/arm64` when building on ARM machines
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/pages/kubernetes/README.md).
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/kubernetes/README.md).
......@@ -41,7 +41,7 @@ Please note that:
3. `post_process.py` - Scan the aiperf results to produce a json with entries to each config point.
4. `plot_performance_comparison.py` - Takes the json result file for disaggregated and/or aggregated configuration sweeps and plots a pareto line for better visualization.
For finer-grained details on how to launch TRTLLM backend workers with DeepSeek R1 on GB200 slurm, please refer to [multinode-examples.md](../../../../docs/pages/backends/trtllm/multinode/multinode-examples.md). This guide shares similar assumptions with the multinode examples guide.
For finer-grained details on how to launch TRTLLM backend workers with DeepSeek R1 on GB200 slurm, please refer to [multinode-examples.md](../../../../docs/backends/trtllm/multinode/multinode-examples.md). This guide shares similar assumptions with the multinode examples guide.
## Usage
......@@ -49,7 +49,7 @@ For more finer grained details on how to launch TRTLLM backend workers with Deep
Before running the scripts, ensure you have:
1. Access to a SLURM cluster
2. Container image of Dynamo with TensorRT-LLM built using instructions from [here](https://github.com/ai-dynamo/dynamo/tree/main/docs/pages/backends/trtllm/README.md#build-container).
2. Container image of Dynamo with TensorRT-LLM built using instructions from [here](https://github.com/ai-dynamo/dynamo/tree/main/docs/backends/trtllm/README.md#build-container).
3. Model files accessible on the cluster
4. Required environment variables set
......@@ -69,7 +69,7 @@ export SLURM_JOB_NAME=""
# NOTE: IMAGE must be set manually for now
# To build an image, see the steps here:
# https://github.com/ai-dynamo/dynamo/tree/main/docs/pages/backends/trtllm/README.md#build-container
# https://github.com/ai-dynamo/dynamo/tree/main/docs/backends/trtllm/README.md#build-container
export IMAGE="<dynamo_trtllm_image>"
# NOTE: In general, Deepseek R1 is very large, so it is recommended to
......
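Every hunk in this diff makes the same mechanical change: a link that pointed under `docs/pages/` now points under `docs/`. A minimal sketch of that rewrite follows, assuming a plain text substitution is enough for these README links. The helper name `rewrite_doc_links` and the regular expression are illustrative assumptions, not taken from the commit.

```python
import re

# Hypothetical pattern: the legacy "docs/pages/" path segment.
# The lookbehind skips names that merely end in "docs", e.g. "mydocs/pages/".
_OLD_SEGMENT = re.compile(r"(?<![\w.-])docs/pages/")


def rewrite_doc_links(text: str) -> str:
    """Return text with every 'docs/pages/' path segment shortened to 'docs/'."""
    return _OLD_SEGMENT.sub("docs/", text)


if __name__ == "__main__":
    old_line = "See the [Quickstart Guide](../../../../docs/pages/kubernetes/README.md)"
    print(rewrite_doc_links(old_line))
```

The same substitution covers relative links, root-relative links such as `/docs/pages/...`, and the absolute GitHub URLs in this diff, since all of them contain the same `docs/pages/` segment.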