docs: restructure docs directory and move fern config to fern/ (#6700)

Signed-off-by: Neal Vaidya <nealv@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Commit ece08dc9 (unverified), authored Mar 01, 2026 by Neal Vaidya, committed by GitHub Mar 01, 2026
20 changed files
--- a/docs/pages/design-docs/kvbm-design.md
+++ b/docs/pages/design-docs/kvbm-design.md
@@ -8,7 +8,7 @@ This document provides an in-depth look at the architecture, components, framewo

 ## KVBM Components

-![Internal Components of Dynamo KVBM](../../assets/img/kvbm-components.png)
+![Internal Components of Dynamo KVBM](../assets/img/kvbm-components.png)


 *Internal Components of Dynamo KVBM*
@@ -39,7 +39,7 @@ This document provides an in-depth look at the architecture, components, framewo

 ## KVBM Data Flows

-![KVBM Data Flows](../../assets/img/kvbm-data-flows.png)
+![KVBM Data Flows](../assets/img/kvbm-data-flows.png)


 *KVBM Data Flows from device to other memory hierarchies*
@@ -72,7 +72,7 @@ This document provides an in-depth look at the architecture, components, framewo

 ## Internal Architecture Deep Dive

-![Internal architecture and key modules in the Dynamo KVBM](../../assets/img/kvbm-internal-arch.png)
+![Internal architecture and key modules in the Dynamo KVBM](../assets/img/kvbm-internal-arch.png)


 *Internal architecture and key modules in the Dynamo KVBM*
@@ -320,23 +320,23 @@ There are two components of the interface:
 - **Scheduler (Leader)**: Responsible for orchestration of KV block offload/onboard, builds metadata specifying transfer data to the workers. It also maintains hooks for handling asynchronous transfer completion.
 - **Worker**: Responsible for reading metadata built by the scheduler (leader), performs async onboarding/offloading at the end of the forward pass.

-![vLLM KVBM Integration](../../assets/img/kvbm-integrations.png)
+![vLLM KVBM Integration](../assets/img/kvbm-integrations.png)

 *Typical integration of KVBM with inference frameworks (vLLM shown as example)*

 ### Onboarding Operations

-![Onboarding blocks from Host to Device](../../assets/img/kvbm-onboard-host2device.png)
+![Onboarding blocks from Host to Device](../assets/img/kvbm-onboard-host2device.png)

 *Onboarding blocks from Host to Device*

-![Onboarding blocks from Disk to Device](../../assets/img/kvbm-onboard-disk2device.png)
+![Onboarding blocks from Disk to Device](../assets/img/kvbm-onboard-disk2device.png)

 *Onboarding blocks from Disk to Device*

 ### Offloading Operations

-![Offloading blocks from Device to Host & Disk](../../assets/img/kvbm-offload.png)
+![Offloading blocks from Device to Host & Disk](../assets/img/kvbm-offload.png)

 *Offloading blocks from Device to Host & Disk*


--- a/docs/pages/design-docs/planner-design.md
+++ b/docs/pages/design-docs/planner-design.md
@@ -12,7 +12,7 @@ The Planner is Dynamo's autoscaling controller. It supports two scaling modes: *

 ## Throughput-Based Scaling

-![Planner architecture showing Metric Collector, Load Predictor, and Performance Interpolator feeding into the Scaling Algorithm and Connector Layer](../../assets/img/planner-architecture.svg)
+![Planner architecture showing Metric Collector, Load Predictor, and Performance Interpolator feeding into the Scaling Algorithm and Connector Layer](../assets/img/planner-architecture.svg)

 ## Scaling Algorithm


--- a/docs/pages/design-docs/request-plane.md
+++ b/docs/pages/design-docs/request-plane.md
--- a/docs/pages/design-docs/router-design.md
+++ b/docs/pages/design-docs/router-design.md
--- a/docs/pages/development/backend-guide.md
+++ b/docs/pages/development/backend-guide.md
--- a/docs/pages/development/jail-stream.md
+++ b/docs/pages/development/jail-stream.md
--- a/docs/pages/development/runtime-guide.md
+++ b/docs/pages/development/runtime-guide.md
--- a/docs/pages/fault-tolerance/README.md
+++ b/docs/pages/fault-tolerance/README.md
--- a/docs/pages/fault-tolerance/graceful-shutdown.md
+++ b/docs/pages/fault-tolerance/graceful-shutdown.md
--- a/docs/pages/fault-tolerance/request-cancellation.md
+++ b/docs/pages/fault-tolerance/request-cancellation.md
--- a/docs/pages/fault-tolerance/request-migration.md
+++ b/docs/pages/fault-tolerance/request-migration.md
--- a/docs/pages/fault-tolerance/request-rejection.md
+++ b/docs/pages/fault-tolerance/request-rejection.md
--- a/docs/pages/fault-tolerance/testing.md
+++ b/docs/pages/fault-tolerance/testing.md
--- a/docs/pages/features/disaggregated-serving/README.md
+++ b/docs/pages/features/disaggregated-serving/README.md
@@ -23,17 +23,17 @@ AIConfigurator answers these questions in seconds, providing:

 ### End-to-End Workflow

-![AIConfigurator end-to-end workflow](../../../assets/img/e2e-workflow.svg)
+![AIConfigurator end-to-end workflow](../../assets/img/e2e-workflow.svg)

 ### Aggregated vs Disaggregated Architecture

 AIConfigurator evaluates two deployment architectures and recommends the best one for your workload:

-![Aggregated vs Disaggregated architecture comparison](../../../assets/img/arch-comparison.svg)
+![Aggregated vs Disaggregated architecture comparison](../../assets/img/arch-comparison.svg)

 ### When to Use Each Architecture

-![Decision flowchart for choosing aggregated vs disaggregated](../../../assets/img/decision-flowchart.svg)
+![Decision flowchart for choosing aggregated vs disaggregated](../../assets/img/decision-flowchart.svg)

 ## Quick Start

@@ -287,7 +287,7 @@ Run AIPerf **inside the cluster** to avoid network latency affecting measurement

 To use AIPerf to benchmark an AIC-recommended configuration, you'll need to translate AIC parameters into AIPerf profiling arguments (we are working to automate this):

-![AIC-to-AIPerf parameter mapping](../../../assets/img/param-mapping.svg)
+![AIC-to-AIPerf parameter mapping](../../assets/img/param-mapping.svg)

 | AIC Output | AIPerf Parameter | Notes |
 |------------|-----------------|-------|

--- a/docs/pages/features/lora/README.md
+++ b/docs/pages/features/lora/README.md
--- a/docs/pages/features/multimodal/README.md
+++ b/docs/pages/features/multimodal/README.md
--- a/docs/pages/features/multimodal/multimodal-sglang.md
+++ b/docs/pages/features/multimodal/multimodal-sglang.md
--- a/docs/pages/features/multimodal/multimodal-trtllm.md
+++ b/docs/pages/features/multimodal/multimodal-trtllm.md
--- a/docs/pages/features/multimodal/multimodal-vllm.md
+++ b/docs/pages/features/multimodal/multimodal-vllm.md
--- a/docs/pages/features/speculative-decoding/README.md
+++ b/docs/pages/features/speculative-decoding/README.md
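
Every hunk shown above makes the same edit: an image link pointing at `assets/img/` loses one leading `../`. As a minimal sketch of that rewrite (not the tooling used for this commit, which the page does not show), the helper below applies it to a Markdown string. The function name `drop_one_parent` and the rule of leaving single-`../` links untouched are assumptions made for illustration.

```python
import re

# Matches a Markdown image link whose target starts with one or more "../"
# segments followed by "assets/img/". Groups: link opening, "../" run, remainder.
_IMG_LINK = re.compile(r"(!\[[^\]]*\]\()((?:\.\./)+)(assets/img/[^)\s]+\))")


def drop_one_parent(markdown):
    """Remove one leading "../" from each relative assets/img image link.

    Hypothetical helper: links with only a single "../" are left unchanged,
    which is an assumption rather than behavior taken from the commit.
    """
    def _shorten(match):
        prefix, ups, rest = match.groups()
        # "../" is three characters; strip one segment only if two or more exist.
        shorter = ups[3:] if len(ups) > 3 else ups
        return f"{prefix}{shorter}{rest}"

    return _IMG_LINK.sub(_shorten, markdown)


if __name__ == "__main__":
    before = "![KVBM Data Flows](../../assets/img/kvbm-data-flows.png)"
    print(drop_one_parent(before))
```

Because single-`../` links are skipped, running the helper on an already-updated design doc leaves it unchanged; pages that start with three levels, such as the disaggregated-serving README, would still be shortened again on a second pass.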