docs: fix broken documentation links (#5330)

Commit 4f6996c7 (unverified), authored Jan 09, 2026 by dagil-nvidia; committed by GitHub Jan 09, 2026

Showing 2 changed files with 5 additions and 1 deletion

docs/backends/trtllm/kv-cache-transfer.md +4 -0

docs/router/kv_cache_routing.md +1 -1

--- a/docs/backends/trtllm/kv-cache-transfer.md
+++ b/docs/backends/trtllm/kv-cache-transfer.md
@@ -21,6 +21,10 @@ limitations under the License.
 In disaggregated serving architectures, KV cache must be transferred between prefill and decode workers. TensorRT-LLM supports two methods for this transfer:
+## Using NIXL for KV Cache Transfer
+Start the disaggregated service: See [Disaggregated Serving](./README.md#disaggregated) to learn how to start the deployment.
 ## Default Method: NIXL
 By default, TensorRT-LLM uses **NIXL** (NVIDIA Inference Xfer Library) with UCX (Unified Communication X) as backend for KV cache transfer between prefill and decode workers. [NIXL](https://github.com/ai-dynamo/nixl) is NVIDIA's high-performance communication library designed for efficient data transfer in distributed GPU environments.

--- a/docs/router/kv_cache_routing.md
+++ b/docs/router/kv_cache_routing.md
@@ -79,7 +79,7 @@ For basic model registration without KV routing, you can use `--router-mode roun
 ## Disaggregated Serving (Prefill and Decode)
-Dynamo supports disaggregated serving where prefill (prompt processing) and decode (token generation) are handled by separate worker pools. When you register workers with `ModelType.Prefill` (see [Backend Guide](../development/backend-guide.md#model-types)), the frontend automatically detects them and activates an internal prefill router.
+Dynamo supports disaggregated serving where prefill (prompt processing) and decode (token generation) are handled by separate worker pools. When you register workers with `ModelType.Prefill` (see [Backend Guide](../development/backend-guide.md)), the frontend automatically detects them and activates an internal prefill router.
 ### Automatic Prefill Router Activation