fix(replay): resolve planner_profile_data directory (#7586)

Signed-off-by: Yongming Ding <yongmingd@nvidia.com>

Unverified commit 16bca7b6, authored Mar 24, 2026 by Yongming Ding, committed by GitHub Mar 24, 2026

Showing with 30 additions and 5 deletions

docs/mocker/mocker.md +9 -0

lib/bindings/python/src/dynamo/replay/main.py +21 -5

--- a/docs/mocker/mocker.md
+++ b/docs/mocker/mocker.md
@@ -234,6 +234,15 @@ python -m dynamo.mocker \
 The AIC model automatically uses `--model-path` and `--engine-type` to select the appropriate performance data. Available systems include `h200_sxm`, `h100_sxm`, etc. (see AIC SDK documentation for the full list).
+When using `python -m dynamo.replay`, there are no dedicated AIC flags. Pass the equivalent fields directly via `--extra-engine-args`:
+```bash
+python -m dynamo.replay /path/to/trace.jsonl \
+    --extra-engine-args '{"aic_backend":"vllm","aic_system":"h200_sxm","aic_model_path":"nvidia/Llama-3.1-8B-Instruct-FP8","aic_tp_size":1}'
+```
+The `aic_backend` field enables the AIC perf model and should match `engine_type` (`"vllm"` or `"sglang"`). The `aic_model_path` field is the equivalent of `--model-path` in `dynamo.mocker`.
 Example `--reasoning` configuration:
 ```bash

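The JSON passed to `--extra-engine-args` in the documentation change above is easy to mistype when quoted by hand. A minimal sketch of building it programmatically follows; the field names and values are the ones from the added docs, while `build_replay_command` is a hypothetical helper for illustration and not part of `dynamo.replay`.

```python
import json
import shlex

# Field names and values are taken from the documentation example above.
aic_args = {
    "aic_backend": "vllm",  # should match engine_type ("vllm" or "sglang")
    "aic_system": "h200_sxm",
    "aic_model_path": "nvidia/Llama-3.1-8B-Instruct-FP8",
    "aic_tp_size": 1,
}


def build_replay_command(trace_path: str, engine_args: dict) -> str:
    """Hypothetical helper: return a shell-quoted dynamo.replay command line."""
    # Compact separators keep the payload on one line, as in the docs example.
    payload = json.dumps(engine_args, separators=(",", ":"))
    argv = [
        "python", "-m", "dynamo.replay", trace_path,
        "--extra-engine-args", payload,
    ]
    # shlex.join quotes each argument so the JSON survives shell parsing.
    return shlex.join(argv)


command = build_replay_command("/path/to/trace.jsonl", aic_args)
print(command)
```

Serializing with `json.dumps` and quoting with `shlex.join` avoids hand-balancing single and double quotes in the shell.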
--- a/lib/bindings/python/src/dynamo/replay/main.py
+++ b/lib/bindings/python/src/dynamo/replay/main.py
@@ -4,13 +4,16 @@
 from __future__ import annotations
 import argparse
+import json
 import os
 import sys
 from collections.abc import Sequence
+from pathlib import Path
 os.environ.setdefault("DYNAMO_SKIP_PYTHON_LOG_INIT", "1")
 from dynamo.llm import KvRouterConfig, MockEngineArgs
+from dynamo.mocker.args import resolve_planner_profile_data
 from dynamo.replay import run_synthetic_trace_replay, run_trace_replay
 from dynamo.replay.reporting import format_report_table, write_report_json
@@ -71,11 +74,24 @@ def main(argv: Sequence[str] | None = None) -> int:
            "synthetic replay requires --input-tokens, --output-tokens, and --request-count"
        )
-    extra_engine_args = (
-        MockEngineArgs.from_json(args.extra_engine_args)
-        if args.extra_engine_args is not None
-        else None
-    )
+    # Resolve planner_profile_data directory -> NPZ before passing to Rust.
+    # Rust only accepts NPZ files; resolve_planner_profile_data handles conversion.
+    profile_data_result = None
+    if args.extra_engine_args is not None:
+        raw = json.loads(args.extra_engine_args)
+        if "planner_profile_data" in raw:
+            profile_data_result = resolve_planner_profile_data(
+                Path(raw["planner_profile_data"])
+            )
+            if profile_data_result.npz_path is not None:
+                raw["planner_profile_data"] = str(profile_data_result.npz_path)
+            else:
+                del raw["planner_profile_data"]
+            extra_engine_args = MockEngineArgs.from_json(json.dumps(raw))
+        else:
+            extra_engine_args = MockEngineArgs.from_json(args.extra_engine_args)
+    else:
+        extra_engine_args = None
    router_config = (
        KvRouterConfig.from_json(args.router_config)
        if args.router_config is not None
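The rewrite step in this hunk can be exercised in isolation. The sketch below mirrors the branch logic from the diff but makes two assumptions: `ProfileDataResult` is a hypothetical stand-in that models only the `npz_path` attribute the diff reads, and `fake_resolve` is an invented resolver (the `profile.npz` file name is illustrative, not what `resolve_planner_profile_data` actually produces).

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class ProfileDataResult:
    """Hypothetical stand-in; only `npz_path` is known from the diff."""
    npz_path: Optional[Path]


def rewrite_profile_data(
    raw_json: str, resolve: Callable[[Path], ProfileDataResult]
) -> str:
    """Replace planner_profile_data with a resolved NPZ path, or drop it."""
    raw = json.loads(raw_json)
    if "planner_profile_data" not in raw:
        # Nothing to resolve: pass the original string through untouched.
        return raw_json
    result = resolve(Path(raw["planner_profile_data"]))
    if result.npz_path is not None:
        raw["planner_profile_data"] = str(result.npz_path)
    else:
        # No NPZ available: remove the key so the consumer never sees a directory.
        del raw["planner_profile_data"]
    return json.dumps(raw)


def fake_resolve(path: Path) -> ProfileDataResult:
    """Invented resolver: .npz files pass through, directories map to a fixed name."""
    if path.suffix == ".npz":
        return ProfileDataResult(npz_path=path)
    return ProfileDataResult(npz_path=path / "profile.npz")


rewritten = rewrite_profile_data(
    '{"planner_profile_data": "/data/profiles", "aic_tp_size": 1}', fake_resolve
)
print(rewritten)
```

Passing the resolver in as a parameter keeps the sketch runnable without `dynamo` installed; in the actual change the call goes directly to `resolve_planner_profile_data`.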