Unverified commit 80e7bafd, authored by akshatha-k, committed by GitHub

docs: Migrate router documentation to three-tier structure (#5979)


Signed-off-by: akshatha-k <akshutk@gmail.com>
Signed-off-by: dagil-nvidia <dagil@nvidia.com>
Signed-off-by: Dan Gil <dagil@nvidia.com>
Co-authored-by: dagil-nvidia <dagil@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
parent b5c0db63
@@ -249,7 +249,7 @@ args:
 - **Platform Setup**: [Dynamo Kubernetes Platform Installation](../../../../docs/kubernetes/installation_guide.md)
 - **SLA Planner**: [SLA Planner Quickstart Guide](../../../../docs/planner/sla_planner_quickstart.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
-- **Architecture Docs**: [Disaggregated Serving](../../../../docs/design_docs/disagg_serving.md), [KV-Aware Routing](../../../../docs/router/kv_cache_routing.md)
+- **Architecture Docs**: [Disaggregated Serving](../../../../docs/design_docs/disagg_serving.md), [KV-Aware Routing](../../../../docs/router/README.md)
 ## Troubleshooting
@@ -5,7 +5,7 @@ This example demonstrates running Dynamo across multiple nodes with **KV-aware r
 For more information about the core concepts, see:
 - [Dynamo Disaggregated Serving](../../../docs/design_docs/disagg_serving.md)
-- [KV Cache Routing Architecture](../../../docs/router/kv_cache_routing.md)
+- [KV Cache Routing](../../../docs/router/README.md)
 ## Architecture Overview
@@ -65,7 +65,7 @@ This is particularly beneficial for:
 - **Similar queries**: Common prefixes are computed once and reused
 - **Batch processing**: Related requests can be routed to workers with shared context
-For detailed technical information about how KV routing works, see the [KV Cache Routing Architecture documentation](../../../docs/router/kv_cache_routing.md).
+For detailed technical information about how KV routing works, see the [Router Guide](../../../docs/router/router_guide.md).
 ## Prerequisites
@@ -475,7 +475,7 @@ python -m dynamo.frontend \
 --router-temperature 0.0 # Temperature for probabilistic routing (0 = deterministic)
 ```
-For more advanced configuration options including custom worker selection, block size tuning, and alternative indexing strategies, see the [KV Cache Routing documentation](../../../docs/router/kv_cache_routing.md).
+For more advanced configuration options including custom worker selection, block size tuning, and alternative indexing strategies, see the [Router Guide](../../../docs/router/router_guide.md).
 ## Cleanup