Commit 5c35805e authored by Hongkuan Zhou, committed by GitHub

docs(profiler): clarify when planner-profile-data ConfigMap is emitted [DYN-2751] (#8486)


Signed-off-by: hongkuanz <hongkuanz@nvidia.com>
parent 6d44b904
@@ -402,6 +402,14 @@ async def run_profile(
phase=ops.current_phase,
)
if not is_disagg_config:
# TODO: agg + throughput-scaling has no profiling-data
# fallback today. The NPZ sweep (thorough) and the AIC
# spec (rapid, see build_aic_interpolation_spec) are both
# shaped around prefill + decode picks. For agg picks the
# planner currently falls back to DYN_BENCHMARK_MODE at
# runtime only. Extend AICInterpolationSpec and
# run_interpolation to carry an agg_pick so both paths
# work for aggregated deployments too.
logger.info(
"Picked config is aggregated (chosen_exp=%r) — "
"skipping interpolation (requires disaggregated config).",
......
@@ -90,7 +90,11 @@ def assemble_final_config(
planner config so the planner runs AIC interpolation at bootstrap
if the endpoint is unavailable.
4. **Profile data** — attach interpolation-data ConfigMap when mocker
or planner-thorough is enabled. The ConfigMap is only emitted when
the picked config is disaggregated AND the interpolation NPZ files
were produced on disk; rapid-mode deployments never emit it (the
planner uses AIC in-process or ``get_perf_metrics`` instead), and
agg picks skip interpolation entirely.
"""
if not dgd_config:
return dgd_config
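
The emission rule stated in the docstring above can be sketched as a single predicate. This is only an illustration of the documented conditions, not code from the repository: `ProfileOutcome`, its fields, and `should_emit_profile_configmap` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileOutcome:
    """Hypothetical summary of a profiling run (not a type from the repo)."""

    wants_profile_data: bool  # mocker or planner-thorough is enabled
    is_disagg_config: bool    # the picked config is disaggregated
    rapid_mode: bool          # pre-deployment sweep ran in rapid mode
    npz_files_on_disk: bool   # interpolation NPZ files were produced


def should_emit_profile_configmap(outcome: ProfileOutcome) -> bool:
    """Return True only when every condition named in the docstring holds."""
    if not outcome.wants_profile_data:
        return False
    if outcome.rapid_mode:
        # Rapid mode never emits the ConfigMap; the planner uses AIC
        # in-process or get_perf_metrics instead.
        return False
    if not outcome.is_disagg_config:
        # Aggregated picks skip interpolation entirely.
        return False
    return outcome.npz_files_on_disk
```

Under these assumptions, a thorough-mode run with a disaggregated pick and NPZ files on disk is the only combination that yields `True`.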
@@ -519,6 +523,19 @@ def build_aic_interpolation_spec(
* ``pre_deployment_sweeping_mode`` is not ``Rapid``
* picks are missing
* ``resolved_backend`` is not one AIC supports
.. note::
The spec only carries ``prefill_pick`` + ``decode_pick``, so the
caller in ``profile_sla.py`` gates this on a disaggregated pick
(``is_disagg_config``). When rapid AIC picks an aggregated config
and the override to disagg fails, ``aic_spec`` is ``None`` and the
planner has no AIC fallback — it relies solely on the
``get_perf_metrics`` endpoint (``DYN_BENCHMARK_MODE``).
TODO: extend ``AICInterpolationSpec`` with an ``agg_pick`` so
throughput-scaling on an aggregated deployment has a matching
AIC bootstrap path (planner + mocker + thorough NPZ). Tracking
via the wider agg+throughput-scaling rework.
"""
planner = (
dgdr.features.planner  # type: ignore[union-attr]
......
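
Taken together, the TODO in `run_profile` and the note in `build_aic_interpolation_spec` describe which bootstrap data source the planner has for each combination of sweep mode and pick shape. The sketch below restates that matrix as a minimal, assumed model. `SweepMode`, `BootstrapSource`, and `planner_bootstrap_source` are hypothetical names, and the function ignores the other cases in which the real spec is `None` (missing picks, unsupported backend).

```python
from enum import Enum


class SweepMode(Enum):
    """Hypothetical stand-in for the pre-deployment sweeping mode."""

    RAPID = "rapid"
    THOROUGH = "thorough"


class BootstrapSource(Enum):
    """Where the planner could get profiling data at bootstrap."""

    NPZ_SWEEP = "npz"           # thorough-mode interpolation files
    AIC_SPEC = "aic_spec"       # rapid-mode AIC interpolation spec
    RUNTIME_ONLY = "runtime"    # get_perf_metrics / DYN_BENCHMARK_MODE only


def planner_bootstrap_source(mode: SweepMode, is_disagg_config: bool) -> BootstrapSource:
    """Map (sweep mode, pick shape) to the fallback described in the diff."""
    if not is_disagg_config:
        # Both the NPZ sweep and the AIC spec are shaped around
        # prefill + decode picks, so an aggregated pick has neither.
        return BootstrapSource.RUNTIME_ONLY
    if mode is SweepMode.RAPID:
        return BootstrapSource.AIC_SPEC
    return BootstrapSource.NPZ_SWEEP
```

The TODO's proposed `agg_pick` extension would, in this model, replace the first branch with a mode-dependent source for aggregated picks as well.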