Unverified Commit a6e1d5fe authored by hhzhang16's avatar hhzhang16 Committed by GitHub
Browse files

docs: add mention of deploy/utils quickstart to planner deployment docs (#2821)


Signed-off-by: default avatarHannah Zhang <hannahz@nvidia.com>
parent 7eef9ac3
...@@ -28,7 +28,9 @@ The SLA planner consists of several key components: ...@@ -28,7 +28,9 @@ The SLA planner consists of several key components:
## Pre-Deployment Profiling ## Pre-Deployment Profiling
SLA-based planner requires pre-deployment profiling to operate. See [Pre-Deployment Profiling](../benchmarks/pre_deployment_profiling.md) for more details. **Prerequisite**: SLA-based planner requires pre-deployment profiling to be completed before deployment. The profiling process analyzes your model's performance characteristics to determine optimal tensor parallelism configurations and scaling parameters that the planner will use during operation.
See [Pre-Deployment Profiling](../benchmarks/pre_deployment_profiling.md) for detailed instructions on running the profiling process.
## Load Prediction ## Load Prediction
......
...@@ -23,10 +23,11 @@ flowchart LR ...@@ -23,10 +23,11 @@ flowchart LR
## Prerequisites ## Prerequisites
- Kubernetes cluster with GPU nodes - Kubernetes cluster with GPU nodes
- `hf-token-secret` created in target namespace - [Pre-Deployment Profiling](../../benchmarks/pre_deployment_profiling.md) completed and its results saved to `dynamo-pvc` PVC.
- [Pre-Deployment Profiling](../../benchmarks/pre_deployment_profiling.md) results saved to `dynamo-pvc` PVC.
- Prefill and decode worker uses the best parallelization mapping suggested by the pre-deployment profiling script. - Prefill and decode worker uses the best parallelization mapping suggested by the pre-deployment profiling script.
> [!NOTE]
> **Important**: The profiling that occurs before Planner deployment requires additional Kubernetes manifests (ServiceAccount, Role, RoleBinding, PVC) that are not included in standard Dynamo deployments. Apply these manifests in the same namespace as `$NAMESPACE`. For a complete setup, start with the [Quick Start guide](../../../deploy/utils/README.md#quick-start), which provides a fully encapsulated deployment including all required manifests.
```bash ```bash
export NAMESPACE=your-namespace export NAMESPACE=your-namespace
``` ```
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment