docs: clarify GAIE fallback and source installs (#7077)

Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io>

4c3e5991 · Daneyon Hansen · GitHub · b2c59aa4 · 4c3e5991
Unverified Commit 4c3e5991 authored Mar 24, 2026 by Daneyon Hansen Committed by GitHub Mar 24, 2026

Showing with 2 additions and 1 deletion

docs/kubernetes/inference-gateway.md +2 -1

--- a/docs/kubernetes/inference-gateway.md
+++ b/docs/kubernetes/inference-gateway.md
@@ -12,7 +12,7 @@ Integrate Dynamo with the Gateway API Inference Extension for intelligent KV-awa
 EPP's default kv-routing approach is not token-aware because the prompt is not tokenized. But the Dynamo plugin uses a token-aware KV algorithm. It employs the dynamo router which implements kv routing by running your model's tokenizer inline. The EPP plugin configuration lives in [`helm/dynamo-gaie/epp-config-dynamo.yaml`](https://github.com/ai-dynamo/dynamo/blob/main/deploy/inference-gateway/standalone/helm/dynamo-gaie/epp-config-dynamo.yaml) per EPP [convention](https://gateway-api-inference-extension.sigs.k8s.io/guides/epp-configuration/config-text/).
-Dynamo Integration with the Inference Gateway supports Aggregated and Disaggregated Serving. The epp config is the same for both. If no prefill workers found the service degrades gracefully to perform aggregated serving.
+Dynamo Integration with the Inference Gateway supports Aggregated and Disaggregated Serving. A request only exercises disaggregated routing when the EPP config defines a `prefill` profile and prefill workers are available. The standalone [`epp-config-dynamo.yaml`](https://github.com/ai-dynamo/dynamo/blob/main/deploy/inference-gateway/standalone/helm/dynamo-gaie/epp-config-dynamo.yaml) currently only defines a `decode` profile, while the recipe examples use separate aggregated and disaggregated configs under `recipes/llama-3-70b/vllm/agg/gaie/` and `recipes/llama-3-70b/vllm/disagg-single-node/gaie/`. Unless `DYN_ENFORCE_DISAGG=true`, deployments without a `prefill` profile or prefill workers fall back to aggregated serving.
 If you want to use LoRA deploy Dynamo without the Inference Gateway.
 Currently, these setups are only supported with the kGateway based Inference Gateway.
@@ -27,6 +27,7 @@ Currently, these setups are only supported with the kGateway based Inference Gat
 ### 1. Install Dynamo Platform ###
 [See Quickstart Guide](./README.md) to install Dynamo Kubernetes Platform.
+If you are installing from the source tree rather than a release chart, follow [Path B: Custom Build from Source](./installation-guide.md#path-b-custom-build-from-source) and run `helm dep build ./platform/` before `helm install` so the vendored subcharts match the local chart contents.
 ### 2. Deploy Inference Gateway ###
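
Reviewer note: the fallback rule added in the first hunk can be read as a small decision table. The sketch below is a hypothetical model of that rule only, not Dynamo's or the EPP's actual code. The names `ServingMode`, `select_serving_mode`, and `enforce_disagg_from_env` are invented for illustration. The doc does not say what happens when `DYN_ENFORCE_DISAGG=true` and no prefill path exists, so raising an error there is an assumption, as is treating only the string `true` (case-insensitive) as enabled.

```python
import os
from enum import Enum


class ServingMode(Enum):
    AGGREGATED = "aggregated"
    DISAGGREGATED = "disaggregated"


def enforce_disagg_from_env(env=os.environ) -> bool:
    """Read DYN_ENFORCE_DISAGG from a mapping (defaults to the process env).

    Assumption: only the value "true" (any case) enables enforcement.
    """
    return env.get("DYN_ENFORCE_DISAGG", "").lower() == "true"


def select_serving_mode(
    has_prefill_profile: bool,
    prefill_workers_available: bool,
    enforce_disagg: bool,
) -> ServingMode:
    """Hypothetical model of the documented fallback rule."""
    # Disaggregated routing needs both a `prefill` profile in the EPP
    # config and at least one available prefill worker.
    if has_prefill_profile and prefill_workers_available:
        return ServingMode.DISAGGREGATED

    # Assumption: with enforcement on, the request is rejected rather
    # than served in aggregated mode. The doc does not specify this.
    if enforce_disagg:
        raise RuntimeError("disaggregated serving enforced but unavailable")

    # Documented default: fall back to aggregated serving.
    return ServingMode.AGGREGATED
```

Under this reading, the standalone `epp-config-dynamo.yaml`, which defines only a `decode` profile, corresponds to `has_prefill_profile=False` and therefore resolves to aggregated serving unless enforcement is enabled.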