Unverified commit ece08dc9, authored by Neal Vaidya, committed by GitHub

docs: restructure docs directory and move fern config to fern/ (#6700)


Signed-off-by: Neal Vaidya <nealv@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
parent 1412e44b
@@ -8,7 +8,7 @@ This document provides an in-depth look at the architecture, components, framewo
 ## KVBM Components
-![Internal Components of Dynamo KVBM](../../assets/img/kvbm-components.png)
+![Internal Components of Dynamo KVBM](../assets/img/kvbm-components.png)
 *Internal Components of Dynamo KVBM*
@@ -39,7 +39,7 @@ This document provides an in-depth look at the architecture, components, framewo
 ## KVBM Data Flows
-![KVBM Data Flows](../../assets/img/kvbm-data-flows.png)
+![KVBM Data Flows](../assets/img/kvbm-data-flows.png)
 *KVBM Data Flows from device to other memory hierarchies*
@@ -72,7 +72,7 @@ This document provides an in-depth look at the architecture, components, framewo
 ## Internal Architecture Deep Dive
-![Internal architecture and key modules in the Dynamo KVBM](../../assets/img/kvbm-internal-arch.png)
+![Internal architecture and key modules in the Dynamo KVBM](../assets/img/kvbm-internal-arch.png)
 *Internal architecture and key modules in the Dynamo KVBM*
@@ -320,23 +320,23 @@ There are two components of the interface:
 - **Scheduler (Leader)**: Responsible for orchestration of KV block offload/onboard, builds metadata specifying transfer data to the workers. It also maintains hooks for handling asynchronous transfer completion.
 - **Worker**: Responsible for reading metadata built by the scheduler (leader), performs async onboarding/offloading at the end of the forward pass.
-![vLLM KVBM Integration](../../assets/img/kvbm-integrations.png)
+![vLLM KVBM Integration](../assets/img/kvbm-integrations.png)
 *Typical integration of KVBM with inference frameworks (vLLM shown as example)*
 ### Onboarding Operations
-![Onboarding blocks from Host to Device](../../assets/img/kvbm-onboard-host2device.png)
+![Onboarding blocks from Host to Device](../assets/img/kvbm-onboard-host2device.png)
 *Onboarding blocks from Host to Device*
-![Onboarding blocks from Disk to Device](../../assets/img/kvbm-onboard-disk2device.png)
+![Onboarding blocks from Disk to Device](../assets/img/kvbm-onboard-disk2device.png)
 *Onboarding blocks from Disk to Device*
 ### Offloading Operations
-![Offloading blocks from Device to Host & Disk](../../assets/img/kvbm-offload.png)
+![Offloading blocks from Device to Host & Disk](../assets/img/kvbm-offload.png)
 *Offloading blocks from Device to Host & Disk*
......
@@ -12,7 +12,7 @@ The Planner is Dynamo's autoscaling controller. It supports two scaling modes: *
 ## Throughput-Based Scaling
-![Planner architecture showing Metric Collector, Load Predictor, and Performance Interpolator feeding into the Scaling Algorithm and Connector Layer](../../assets/img/planner-architecture.svg)
+![Planner architecture showing Metric Collector, Load Predictor, and Performance Interpolator feeding into the Scaling Algorithm and Connector Layer](../assets/img/planner-architecture.svg)
 ## Scaling Algorithm
......
@@ -23,17 +23,17 @@ AIConfigurator answers these questions in seconds, providing:
 ### End-to-End Workflow
-![AIConfigurator end-to-end workflow](../../../assets/img/e2e-workflow.svg)
+![AIConfigurator end-to-end workflow](../../assets/img/e2e-workflow.svg)
 ### Aggregated vs Disaggregated Architecture
 AIConfigurator evaluates two deployment architectures and recommends the best one for your workload:
-![Aggregated vs Disaggregated architecture comparison](../../../assets/img/arch-comparison.svg)
+![Aggregated vs Disaggregated architecture comparison](../../assets/img/arch-comparison.svg)
 ### When to Use Each Architecture
-![Decision flowchart for choosing aggregated vs disaggregated](../../../assets/img/decision-flowchart.svg)
+![Decision flowchart for choosing aggregated vs disaggregated](../../assets/img/decision-flowchart.svg)
 ## Quick Start
@@ -287,7 +287,7 @@ Run AIPerf **inside the cluster** to avoid network latency affecting measurement
 To use AIPerf to benchmark an AIC-recommended configuration, you'll need to translate AIC parameters into AIPerf profiling arguments (we are working to automate this):
-![AIC-to-AIPerf parameter mapping](../../../assets/img/param-mapping.svg)
+![AIC-to-AIPerf parameter mapping](../../assets/img/param-mapping.svg)
 | AIC Output | AIPerf Parameter | Notes |
 |------------|-----------------|-------|
......