docs: remove stale router diagram in trtllm docs (#5034)

Signed-off-by: PeaBrane <yanrpei@gmail.com>

0055f2a4 · Yan Ru Pei · GitHub · 043c80c4 · 0055f2a4
Commit 0055f2a4 (unverified), authored Dec 19, 2025 by Yan Ru Pei, committed by GitHub Dec 19, 2025

Showing 1 changed file with 1 addition and 17 deletions

docs/backends/trtllm/README.md +1 -17

--- a/docs/backends/trtllm/README.md
+++ b/docs/backends/trtllm/README.md
@@ -107,23 +107,7 @@ apt-get update && apt-get -y install git git-lfs
 > [!IMPORTANT]
 > Below we provide some simple shell scripts that run the components for each configuration. Each shell script is simply running the `python3 -m dynamo.frontend <args>` to start up the ingress and using `python3 -m dynamo.trtllm <args>` to start up the workers. You can easily take each command and run them in separate terminals.
-This figure shows an overview of the major components to deploy:
+For detailed information about the architecture and how KV-aware routing works, see the [KV Cache Routing documentation](../../router/kv_cache_routing.md).
-```
-+------+      +-----------+      +------------------+             +---------------+
-| HTTP |----->| processor |----->|      Worker1     |------------>|    Worker2    |
-|      |<-----|           |<-----|                  |<------------|               |
-+------+      +-----------+      +------------------+             +---------------+
-                  |    ^                  |
-       query best |    | return           | publish kv events
-           worker |    | worker_id        v
-                  |    |         +------------------+
-                  |    +---------|     kv-router    |
-                  +------------->|                  |
-                                 +------------------+
-```
-**Note:** The diagram above shows all possible components in a deployment. In disaggregated serving, Worker1 acts as the decode worker and Worker2 as the prefill worker, with the unified frontend coordinating request routing between them.
 ### Aggregated
 ```bash