Unverified commit 598cbbb7, authored by Anish and committed by GitHub

docs: reorganizing documentation to make things clearer (#3658)


Signed-off-by: athreesh <anish.maddipoti@utexas.edu>
Co-authored-by: Claude <noreply@anthropic.com>
parent 34fc9693
@@ -197,4 +197,4 @@ date: Wed, 03 Sep 2025 13:42:45 GMT
 - [Distributed Runtime Architecture](../architecture/distributed_runtime.md)
 - [Dynamo Architecture Overview](../architecture/architecture.md)
-- [Backend Guide](backend.md)
+- [Backend Guide](../development/backend-guide.md)
@@ -187,5 +187,5 @@ curl -d '{"model": "Qwen/Qwen3-0.6B", "max_completion_tokens": 2049, "messages":
 - [Distributed Runtime Architecture](../architecture/distributed_runtime.md)
 - [Dynamo Architecture Overview](../architecture/architecture.md)
-- [Backend Guide](backend.md)
+- [Backend Guide](../development/backend-guide.md)
 - [Log Aggregation in Kubernetes](../kubernetes/logging.md)
@@ -96,6 +96,6 @@ The metrics system includes a pre-configured Grafana dashboard for visualizing s
 - [Distributed Runtime Architecture](../architecture/distributed_runtime.md)
 - [Dynamo Architecture Overview](../architecture/architecture.md)
-- [Backend Guide](backend.md)
+- [Backend Guide](../development/backend-guide.md)
 - [Metrics Implementation Examples](../../deploy/metrics/README.md#implementation-examples)
 - [Complete Metrics Setup Guide](../../deploy/metrics/README.md)
\ No newline at end of file
@@ -29,7 +29,7 @@ Key features include:
 .. admonition:: 🚀 Quick Start
 :class: seealso
-**New to SLA Planner?** Start with the [SLA Planner Quick Start Guide](/docs/kubernetes/sla_planner_quickstart.md) for a complete, step-by-step workflow.
+**New to SLA Planner?** Start with the [SLA Planner Quick Start Guide](/docs/planner/sla_planner_quickstart.md) for a complete, step-by-step workflow.
 **Prerequisites**: SLA-based planner requires pre-deployment profiling (2-4 hours on real silicon or a few minutes using simulator) before deployment. The Quick Start guide includes everything you need.
@@ -77,6 +77,6 @@ Key features include:
 :hidden:
 Overview <self>
-SLA Planner Quick Start <../kubernetes/sla_planner_quickstart>
+SLA Planner Quick Start <sla_planner_quickstart>
 Pre-Deployment Profiling <../benchmarks/pre_deployment_profiling.md>
 SLA-based Planner <sla_planner.md>
 # SLA-based Planner
 > [!TIP]
-> **New to SLA Planner?** For a complete workflow including profiling and deployment, see the [SLA Planner Quick Start Guide](/docs/kubernetes/sla_planner_quickstart.md).
+> **New to SLA Planner?** For a complete workflow including profiling and deployment, see the [SLA Planner Quick Start Guide](/docs/planner/sla_planner_quickstart.md).
 This document covers information regarding the SLA-based planner in `examples/common/utils/planner_core.py`.
@@ -129,7 +129,7 @@ Finally, SLA planner applies the change by scaling up/down the number of prefill
 ## Deploying
-For complete deployment instructions, see the [SLA Planner Quick Start Guide](/docs/kubernetes/sla_planner_quickstart.md).
+For complete deployment instructions, see the [SLA Planner Quick Start Guide](/docs/planner/sla_planner_quickstart.md).
 > [!NOTE]
 > The SLA planner requires a frontend that reports metrics at the `/metrics` HTTP endpoint with the number of requests, ISL, OSL, TTFT, and ITL in the correct format. The dynamo frontend provides these metrics automatically.
......
@@ -249,7 +249,7 @@ This is because the `subComponentType` field has only been added in newer versio
 ## Next Steps
-- **Architecture Details**: See [SLA-based Planner Architecture](/docs/architecture/sla_planner.md) for technical details
+- **Architecture Details**: See [SLA-based Planner Architecture](/docs/planner/sla_planner.md) for technical details
 - **Performance Tuning**: See [Pre-Deployment Profiling Guide](/docs/benchmarks/pre_deployment_profiling.md) for advanced profiling options
 - **Load Testing**: See [SLA Planner Load Test](/tests/planner/README.md) for comprehensive testing tools
......
@@ -50,7 +50,7 @@ maturin develop --uv
 ### Prerequisite
-See [README.md](../../../docs/runtime/README.md#prerequisites).
+See [README.md](../../../docs/development/runtime-guide.md#prerequisites).
 ### Hello World Example
......