docs: full migration of docs/ to fern format in fern/ (#6050)

Signed-off-by: Dan Gil <dagil@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

Commit 2c3066bd (unverified), authored Feb 06, 2026 by dagil-nvidia; committed by GitHub on Feb 06, 2026
20 changed files
--- a/fern/pages/design-docs/disagg-serving.md
+++ b/fern/pages/design-docs/disagg-serving.md
--- a/fern/pages/design-docs/distributed-runtime.md
+++ b/fern/pages/design-docs/distributed-runtime.md
--- a/fern/pages/design-docs/dynamo-flow.md
+++ b/fern/pages/design-docs/dynamo-flow.md
--- a/fern/pages/design-docs/event-plane.md
+++ b/fern/pages/design-docs/event-plane.md
@@ -16,8 +16,7 @@ Dynamo's coordination layer adapts to the deployment environment:
 | **Kubernetes** (with operator) | Native K8s (CRDs, EndpointSlices) | NATS (optional) | TCP |
 | **Bare metal / Local** (default) | etcd | NATS (optional) | TCP |

-> [!NOTE]
-> The runtime always defaults to `kv_store` (etcd) for service discovery. Kubernetes deployments must explicitly set `DYN_DISCOVERY_BACKEND=kubernetes` - the Dynamo operator handles this automatically.
+> **Note:** The runtime always defaults to `kv_store` (etcd) for service discovery. Kubernetes deployments must explicitly set `DYN_DISCOVERY_BACKEND=kubernetes` - the Dynamo operator handles this automatically.

 ```
 ┌─────────────────────────────────────────────────────────────────────┐
@@ -51,8 +50,7 @@ The operator explicitly sets:
 DYN_DISCOVERY_BACKEND=kubernetes
 ```

-> [!WARNING]
-> This must be explicitly configured. The runtime defaults to `kv_store` in all environments.
+> **Important:** This must be explicitly configured. The runtime defaults to `kv_store` in all environments.

 ### How It Works

@@ -461,5 +459,5 @@ This provides KV-aware routing with reduced accuracy but no NATS dependency.
 ## Related Documentation

 - [Distributed Runtime](distributed-runtime.md) - Runtime architecture
-- [Request Plane](../guides/request-plane.md) - Request transport configuration
-- [Fault Tolerance](../fault-tolerance/request-cancellation.md) - Failure handling
+- [Request Plane](request-plane.md) - Request transport configuration
+- [Fault Tolerance](../fault-tolerance/README.md) - Failure handling
--- a/fern/pages/design-docs/kvbm-design.md
+++ b/fern/pages/design-docs/kvbm-design.md
--- a/fern/pages/design-docs/planner-design.md
+++ b/fern/pages/design-docs/planner-design.md
--- a/fern/pages/guides/request-plane.md
+++ b/fern/pages/guides/request-plane.md
--- a/fern/pages/design-docs/router-design.md
+++ b/fern/pages/design-docs/router-design.md
--- a/fern/pages/development/backend-guide.md
+++ b/fern/pages/development/backend-guide.md
@@ -72,7 +72,6 @@ The `model_type` can be:
 - `model_name`: The name to call the model. Your incoming HTTP requests model name must match this. Defaults to the hugging face repo name or the folder name.
 - `context_length`: Max model length in tokens. Defaults to the model's set max. Only set this if you need to reduce KV cache allocation to fit into VRAM.
 - `kv_cache_block_size`: Size of a KV block for the engine, in tokens. Defaults to 16.
-- `migration_limit`: Maximum number of times a request may be [migrated to another Instance](../fault-tolerance/request-migration.md). Defaults to 0.
 - `user_data`: Optional dictionary containing custom metadata for worker behavior (e.g., LoRA configuration). Defaults to None.

 See `examples/backends` for full code examples.

--- a/fern/pages/guides/jail-stream-readme.md
+++ b/fern/pages/guides/jail-stream-readme.md
--- a/fern/pages/development/runtime-guide.md
+++ b/fern/pages/development/runtime-guide.md
@@ -61,7 +61,7 @@ be operating within your distributed runtime.

 The current examples use a hard-coded `namespace`. We will address the `namespace` collisions later.

-All examples require the `etcd` and `nats.io` pre-requisites to be running and available.
+Most examples require `etcd` for service discovery. `nats.io` is required for KV-aware routing with event tracking; for approximate mode (`--no-kv-events`), NATS is optional.

 #### Rust `hello_world`


--- a/fern/pages/fault-tolerance/graceful-shutdown.md
+++ b/fern/pages/fault-tolerance/graceful-shutdown.md
--- a/fern/pages/fault-tolerance/request-cancellation.md
+++ b/fern/pages/fault-tolerance/request-cancellation.md
--- a/fern/pages/fault-tolerance/request-migration.md
+++ b/fern/pages/fault-tolerance/request-migration.md
--- a/fern/pages/features/disaggregated-serving/README.md
+++ b/fern/pages/features/disaggregated-serving/README.md
--- a/fern/pages/features/lora/README.md
+++ b/fern/pages/features/lora/README.md
--- a/fern/pages/multimodal/index.md
+++ b/fern/pages/multimodal/index.md
--- a/fern/pages/multimodal/sglang.md
+++ b/fern/pages/multimodal/sglang.md
--- a/fern/pages/multimodal/trtllm.md
+++ b/fern/pages/multimodal/trtllm.md
--- a/fern/pages/multimodal/vllm.md
+++ b/fern/pages/multimodal/vllm.md