fix: update planner + docs now that profiling results are stored in /data (#4098)

Signed-off-by: Hannah Zhang <hannahz@nvidia.com>

Commit 5528f3b4 (unverified), authored Nov 04, 2025 by hhzhang16; committed by GitHub Nov 05, 2025
7 changed files
--- a/deploy/utils/README.md
+++ b/deploy/utils/README.md
@@ -92,25 +92,38 @@ python3 -m deploy.utils.inject_manifest \
  --dest /data/configs/disagg.yaml
 ```
-**Download benchmark/profiling results:**
+**Download benchmark results:**
 ```bash
-# After benchmarking or profiling completes, download results
+# After benchmarking completes, download results
 python3 -m deploy.utils.download_pvc_results \
  --namespace $NAMESPACE \
-  --output-dir ./pvc_files \
+  --output-dir ./benchmarks/results \
  --folder /data/results \
  --no-config   # optional: skip *.yaml/*.yml in the download
 ```
+**Download profiling results (optional, for local inspection):**
+```bash
+# Optional: Download profiling data for local analysis
+# The planner reads directly from the PVC, so this is only needed for inspection
+python3 -m deploy.utils.download_pvc_results \
+  --namespace $NAMESPACE \
+  --output-dir ./profiling_data \
+  --folder /data
+```
+> **Note on Profiling Results**: When using DGDR (DynamoGraphDeploymentRequest) for SLA-driven profiling, profiling data is stored in `/data/` on the PVC. The planner component reads this data directly from the PVC, so downloading is **optional** - only needed if you want to inspect the profiling results locally (e.g., view performance plots, check configurations).
 #### Path Requirements
 **Important**: The PVC is mounted at `/data` in the access pod for security reasons. All destination paths must start with `/data/`.
 **Common path patterns:**
 - `/data/configs/` - Configuration files (DGD manifests)
-- `/data/results/` - Benchmark results
+- `/data/results/` - Benchmark results (for download after benchmarking jobs)
-- `/data/profiling_results/` - Profiling data
+- `/data/` - Profiling data (used directly by planner, typically not downloaded)
 - `/data/benchmarking/` - Benchmarking artifacts
 **User-friendly error messages**: If you forget the `/data/` prefix, the script will show a helpful error message with the correct path and example commands.

--- a/deploy/utils/download_pvc_results.py
+++ b/deploy/utils/download_pvc_results.py
@@ -182,7 +182,7 @@ def main():
    parser.add_argument(
        "--folder",
        required=True,
-        help="Absolute folder path in the PVC to download, must start with /data/, e.g. /data/profiling_results or /data/benchmarking_results",
+        help="Absolute folder path in the PVC to download, must start with /data",
    )
    args = parser.parse_args()
@@ -192,10 +192,6 @@ def main():
        print("❌ Error: Folder path must start with '/data/'")
        print(f"   Provided: {args.folder}")
        print("   Quick Fix: Add '/data/' prefix to your path")
-        print("   Examples:")
-        print("     /profiling_results → /data/profiling_results")
-        print("     /benchmarking_results → /data/benchmarking_results")
-        print("     /configs → /data/configs")
        sys.exit(1)
    print("📥 PVC Results Download")
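
Review note: the help text now says the folder "must start with /data", while the error message still says `'/data/'`, and the docs in this commit pass `--folder /data` with no trailing slash. The check itself is outside this hunk, so its behavior for a bare `/data` cannot be confirmed from the diff. The following is a minimal sketch of a check that accepts both `/data` and anything beneath it; `is_valid_pvc_folder` and `PVC_MOUNT` are hypothetical names, not taken from the repository.

```python
from pathlib import PurePosixPath

# Mount point of the PVC in the access pod, as stated in the README.
PVC_MOUNT = PurePosixPath("/data")


def is_valid_pvc_folder(folder: str) -> bool:
    """Hypothetical helper: True if folder is /data itself or lies beneath it."""
    path = PurePosixPath(folder)
    # Reject relative paths and any ".." component that could escape the mount.
    if not path.is_absolute() or ".." in path.parts:
        return False
    # Compare whole path components, so "/database" does not match "/data".
    return path == PVC_MOUNT or PVC_MOUNT in path.parents
```

Comparing path components rather than using a plain string-prefix test keeps look-alike paths such as `/database` from passing.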

--- a/deploy/utils/inject_manifest.py
+++ b/deploy/utils/inject_manifest.py
@@ -134,7 +134,6 @@ def main():
        print("🔍 Common patterns:")
        print("  /configs/file.yaml     → /data/configs/file.yaml")
        print("  /results/data.yaml     → /data/results/data.yaml")
-        print("  /profiling_results/... → /data/profiling_results/...")
        print("=" * 60)
        sys.exit(1)

--- a/docs/planner/sla_planner_quickstart.md
+++ b/docs/planner/sla_planner_quickstart.md
@@ -345,14 +345,18 @@ DGDRs are **immutable** - if you need to update SLAs or configuration:
 ### Manual Deployment Control
-Disable auto-deployment to review configurations before deploying:
+There are two ways to manually control deployment after profiling:
+#### Option 1: Use DGDR-Generated Configuration (Recommended)
+Disable auto-deployment to review the generated DGD before applying:
 ```yaml
 spec:
  autoApply: false
 ```
-Then manually apply the generated DGD:
+Then manually extract and apply the generated DGD:
 ```bash
 # Extract generated config
@@ -365,6 +369,27 @@ vi my-dgd.yaml
 kubectl apply -f my-dgd.yaml -n $NAMESPACE
 ```
+The generated DGD includes optimized configurations and the SLA planner component.
+#### Option 2: Use Standalone Planner Templates (Advanced)
+For advanced use cases, you can manually deploy using the standalone planner templates in `examples/backends/*/deploy/disagg_planner.yaml`:
+```bash
+# After profiling completes, profiling data is stored on the PVC at /data
+# Optional: Download profiling results for local inspection
+python3 -m deploy.utils.download_pvc_results \
+  --namespace $NAMESPACE \
+  --output-dir ./profiling_data \
+  --folder /data
+# Update backend planner manifest as needed, then deploy
+kubectl apply -f examples/backends/<backend>/deploy/disagg_planner.yaml -n $NAMESPACE
+```
+> **Note**: The standalone templates are provided as examples and may need customization for your model and requirements. The DGDR-generated configuration (Option 1) is recommended as it's automatically tuned to your profiling results and SLA targets.
 ### Relationship to DynamoGraphDeployment (DGD)
 - **DGDR**: High-level "intent" - what you want deployed

--- a/examples/backends/sglang/deploy/disagg_planner.yaml
+++ b/examples/backends/sglang/deploy/disagg_planner.yaml
@@ -37,7 +37,7 @@ spec:
            - --environment=kubernetes
            - --backend=sglang
            - --adjustment-interval=60
-            - --profile-results-dir=/data/profiling_results
+            - --profile-results-dir=/data
    decode:
      dynamoNamespace: dynamo
      envFromSecret: hf-token-secret

--- a/examples/backends/trtllm/deploy/disagg_planner.yaml
+++ b/examples/backends/trtllm/deploy/disagg_planner.yaml
@@ -57,7 +57,7 @@ spec:
            - --environment=kubernetes
            - --backend=trtllm
            - --adjustment-interval=60
-            - --profile-results-dir=/data/profiling_results
+            - --profile-results-dir=/data
            - --prometheus-port=9085
    TRTLLMDecodeWorker:
      dynamoNamespace: trtllm-disagg-planner

--- a/examples/backends/vllm/deploy/disagg_planner.yaml
+++ b/examples/backends/vllm/deploy/disagg_planner.yaml
@@ -36,7 +36,7 @@ spec:
            - --environment=kubernetes
            - --backend=vllm
            - --adjustment-interval=60
-            - --profile-results-dir=/data/profiling_results
+            - --profile-results-dir=/data
    VllmDecodeWorker:
      dynamoNamespace: vllm-disagg-planner
      envFromSecret: hf-token-secret
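
Review note: all three backend templates now pass `--profile-results-dir=/data`. For anyone maintaining a customized copy of these manifests, a small check for the stale `/data/profiling_results` value could look like the sketch below. It operates on an argument list that is already loaded into Python; the helper name `profile_results_dir` is hypothetical, and the sample list simply mirrors the vLLM hunk above.

```python
from typing import List, Optional

FLAG_PREFIX = "--profile-results-dir="


def profile_results_dir(args: List[str]) -> Optional[str]:
    """Hypothetical helper: return the value of --profile-results-dir, if present."""
    for arg in args:
        if arg.startswith(FLAG_PREFIX):
            return arg[len(FLAG_PREFIX):]
    return None


# Planner arguments as they appear in the vLLM template after this commit.
planner_args = [
    "--environment=kubernetes",
    "--backend=vllm",
    "--adjustment-interval=60",
    "--profile-results-dir=/data",
]

# Flag manifests that still point at the pre-commit location.
is_stale = profile_results_dir(planner_args) == "/data/profiling_results"
```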