Commit 5528f3b4 (unverified), authored by hhzhang16, committed by GitHub

fix: update planner + docs now that profiling results are stored in /data (#4098)


Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
parent 83a3fe4e
......@@ -92,25 +92,38 @@ python3 -m deploy.utils.inject_manifest \
--dest /data/configs/disagg.yaml
```
**Download benchmark/profiling results:**
**Download benchmark results:**
```bash
# After benchmarking or profiling completes, download results
# After benchmarking completes, download results
python3 -m deploy.utils.download_pvc_results \
--namespace $NAMESPACE \
--output-dir ./pvc_files \
--output-dir ./benchmarks/results \
--folder /data/results \
--no-config # optional: skip *.yaml/*.yml in the download
```
**Download profiling results (optional, for local inspection):**
```bash
# Optional: Download profiling data for local analysis
# The planner reads directly from the PVC, so this is only needed for inspection
python3 -m deploy.utils.download_pvc_results \
--namespace $NAMESPACE \
--output-dir ./profiling_data \
--folder /data
```
> **Note on Profiling Results**: When using DGDR (DynamoGraphDeploymentRequest) for SLA-driven profiling, profiling data is stored in `/data/` on the PVC. The planner component reads this data directly from the PVC, so downloading is **optional** - only needed if you want to inspect the profiling results locally (e.g., view performance plots, check configurations).
#### Path Requirements
**Important**: The PVC is mounted at `/data` in the access pod for security reasons. All destination paths must start with `/data/`.
**Common path patterns:**
- `/data/configs/` - Configuration files (DGD manifests)
- `/data/results/` - Benchmark results
- `/data/profiling_results/` - Profiling data
- `/data/results/` - Benchmark results (for download after benchmarking jobs)
- `/data/` - Profiling data (used directly by planner, typically not downloaded)
- `/data/benchmarking/` - Benchmarking artifacts
**User-friendly error messages**: If you forget the `/data/` prefix, the script will show a helpful error message with the correct path and example commands.
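The path rule above can be illustrated with a small check. This is a minimal sketch, not the script's actual validation: the helper names `is_valid_pvc_path` and `suggest_fix` are hypothetical, and it assumes that both the mount point `/data` itself and any path beneath it should be accepted, since the examples in this change pass `--folder /data`.

```python
import posixpath

PVC_MOUNT = "/data"  # mount point of the PVC in the access pod, per the docs above


def is_valid_pvc_path(folder: str) -> bool:
    """Return True if `folder` is the PVC mount itself or a path beneath it.

    Hypothetical helper for illustration only.
    """
    # Normalize first so that '/data/' equals '/data' and '/data/../etc' is rejected.
    normalized = posixpath.normpath(folder)
    return normalized == PVC_MOUNT or normalized.startswith(PVC_MOUNT + "/")


def suggest_fix(folder: str) -> str:
    """Build a corrected path for an error message (hypothetical helper)."""
    return posixpath.join(PVC_MOUNT, folder.lstrip("/"))


if __name__ == "__main__":
    for candidate in ["/data", "/data/results", "/configs", "/database"]:
        if is_valid_pvc_path(candidate):
            print(f"ok:      {candidate}")
        else:
            print(f"invalid: {candidate} (try {suggest_fix(candidate)})")
```

Comparing against `/data` plus a trailing slash, rather than the bare prefix, keeps look-alike paths such as `/database` from passing the check.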
......
......@@ -182,7 +182,7 @@ def main():
parser.add_argument(
"--folder",
required=True,
help="Absolute folder path in the PVC to download, must start with /data/, e.g. /data/profiling_results or /data/benchmarking_results",
help="Absolute folder path in the PVC to download, must start with /data",
)
args = parser.parse_args()
......@@ -192,10 +192,6 @@ def main():
print("❌ Error: Folder path must start with '/data/'")
print(f" Provided: {args.folder}")
print(" Quick Fix: Add '/data/' prefix to your path")
print(" Examples:")
print(" /profiling_results → /data/profiling_results")
print(" /benchmarking_results → /data/benchmarking_results")
print(" /configs → /data/configs")
sys.exit(1)
print("📥 PVC Results Download")
......
......@@ -134,7 +134,6 @@ def main():
print("🔍 Common patterns:")
print(" /configs/file.yaml → /data/configs/file.yaml")
print(" /results/data.yaml → /data/results/data.yaml")
print(" /profiling_results/... → /data/profiling_results/...")
print("=" * 60)
sys.exit(1)
......
......@@ -345,14 +345,18 @@ DGDRs are **immutable** - if you need to update SLAs or configuration:
### Manual Deployment Control
Disable auto-deployment to review configurations before deploying:
There are two ways to manually control deployment after profiling:
#### Option 1: Use DGDR-Generated Configuration (Recommended)
Disable auto-deployment to review the generated DGD before applying:
```yaml
spec:
autoApply: false
```
Then manually apply the generated DGD:
Then manually extract and apply the generated DGD:
```bash
# Extract generated config
......@@ -365,6 +369,27 @@ vi my-dgd.yaml
kubectl apply -f my-dgd.yaml -n $NAMESPACE
```
The generated DGD includes optimized configurations and the SLA planner component.
#### Option 2: Use Standalone Planner Templates (Advanced)
For advanced use cases, you can manually deploy using the standalone planner templates in `examples/backends/*/deploy/disagg_planner.yaml`:
```bash
# After profiling completes, profiling data is stored on the PVC at /data
# Optional: Download profiling results for local inspection
python3 -m deploy.utils.download_pvc_results \
--namespace $NAMESPACE \
--output-dir ./profiling_data \
--folder /data
# Update backend planner manifest as needed, then deploy
kubectl apply -f examples/backends/<backend>/deploy/disagg_planner.yaml -n $NAMESPACE
```
> **Note**: The standalone templates are provided as examples and may need customization for your model and requirements. The DGDR-generated configuration (Option 1) is recommended as it's automatically tuned to your profiling results and SLA targets.
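If you maintain a customized copy of a standalone planner manifest, its planner arguments may still point at the old profiling directory. The sketch below shows one way to update such an argument list. It is an illustration under stated assumptions, not part of the repository: the function name `migrate_planner_args` is hypothetical, and it assumes the arguments are available as a plain list of strings in the `--flag=value` form used by the templates.

```python
OLD_DIR = "/data/profiling_results"  # location used before this change
NEW_DIR = "/data"                    # location the planner reads from now
FLAG = "--profile-results-dir"


def migrate_planner_args(args: list[str]) -> list[str]:
    """Return a copy of `args` with the old profiling directory replaced.

    Hypothetical helper; only the exact old value (with or without a
    trailing slash) is rewritten, all other arguments are left untouched.
    """
    old_values = {f"{FLAG}={OLD_DIR}", f"{FLAG}={OLD_DIR}/"}
    return [f"{FLAG}={NEW_DIR}" if arg in old_values else arg for arg in args]


if __name__ == "__main__":
    planner_args = [
        "--environment=kubernetes",
        "--backend=vllm",
        "--adjustment-interval=60",
        "--profile-results-dir=/data/profiling_results",
    ]
    print(migrate_planner_args(planner_args))
```

Matching the full `--profile-results-dir=...` string, instead of doing a substring replacement, avoids rewriting unrelated arguments that happen to contain the same path.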
### Relationship to DynamoGraphDeployment (DGD)
- **DGDR**: High-level "intent" - what you want deployed
......
......@@ -37,7 +37,7 @@ spec:
- --environment=kubernetes
- --backend=sglang
- --adjustment-interval=60
- --profile-results-dir=/data/profiling_results
- --profile-results-dir=/data
decode:
dynamoNamespace: dynamo
envFromSecret: hf-token-secret
......
......@@ -57,7 +57,7 @@ spec:
- --environment=kubernetes
- --backend=trtllm
- --adjustment-interval=60
- --profile-results-dir=/data/profiling_results
- --profile-results-dir=/data
- --prometheus-port=9085
TRTLLMDecodeWorker:
dynamoNamespace: trtllm-disagg-planner
......
......@@ -36,7 +36,7 @@ spec:
- --environment=kubernetes
- --backend=vllm
- --adjustment-interval=60
- --profile-results-dir=/data/profiling_results
- --profile-results-dir=/data
VllmDecodeWorker:
dynamoNamespace: vllm-disagg-planner
envFromSecret: hf-token-secret
......