docs: remove duplicate H1 headings from Fern pages (#6410)

Signed-off-by: Dan Gil <dagil@nvidia.com>

Unverified commit 03360b84, authored Feb 25, 2026 by dagil-nvidia; committed by GitHub Feb 25, 2026.
20 changed files
--- a/docs/pages/api/nixl-connect/README.md
+++ b/docs/pages/api/nixl-connect/README.md
@@ -4,8 +4,6 @@
 title: NIXL Connect API
 ---
-# Dynamo NIXL Connect
 Dynamo NIXL Connect specializes in moving data between models/workers in a Dynamo Graph, and for the use cases where registration and memory regions need to be dynamic.
 Dynamo connect provides utilities for such use cases, using the NIXL-based I/O subsystem via a set of Python classes.
 The relaxed registration comes with some performance overheads, but simplifies the integration process.

--- a/docs/pages/api/nixl-connect/connector.md
+++ b/docs/pages/api/nixl-connect/connector.md
@@ -4,8 +4,6 @@
 title: Connector
 ---
-# dynamo.nixl_connect.Connector
 Core class for managing the connection between workers in a distributed environment.
 Use this class to create readable and writable operations, or read and write data to remote workers.

--- a/docs/pages/api/nixl-connect/descriptor.md
+++ b/docs/pages/api/nixl-connect/descriptor.md
@@ -4,8 +4,6 @@
 title: Descriptor
 ---
-# dynamo.nixl_connect.Descriptor
 Memory descriptor that ensures memory is registered with the NIXL-base I/O subsystem.
 Memory must be registered with the NIXL subsystem to enable interaction with the memory.

--- a/docs/pages/api/nixl-connect/device-kind.md
+++ b/docs/pages/api/nixl-connect/device-kind.md
@@ -4,8 +4,6 @@
 title: Device Kind
 ---
-# dynamo.nixl_connect.DeviceKind(IntEnum)
 Represents the kind of device a [`Device`](device.md) object represents.

--- a/docs/pages/api/nixl-connect/device.md
+++ b/docs/pages/api/nixl-connect/device.md
@@ -4,8 +4,6 @@
 title: Device
 ---
-# dynamo.nixl_connect.Device
 `Device` class describes the device a given allocation resides in.
 Usually host (`"cpu"`) or GPU (`"cuda"`) memory.

--- a/docs/pages/api/nixl-connect/operation-status.md
+++ b/docs/pages/api/nixl-connect/operation-status.md
@@ -4,8 +4,6 @@
 title: Operation Status
 ---
-# dynamo.nixl_connect.OperationStatus(IntEnum)
 Represents the current state or status of an operation.

--- a/docs/pages/api/nixl-connect/rdma-metadata.md
+++ b/docs/pages/api/nixl-connect/rdma-metadata.md
@@ -4,8 +4,6 @@
 title: RDMA Metadata
 ---
-# dynamo.nixl_connect.RdmaMetadata
 A Pydantic type intended to provide JSON serialized NIXL metadata about a [`ReadableOperation`](readable-operation.md) or [`WritableOperation`](writable-operation.md) object.
 NIXL metadata contains detailed information about a worker process and how to access memory regions registered with the corresponding agent.
 This data is required to perform data transfers using the NIXL-based I/O subsystem.

--- a/docs/pages/api/nixl-connect/read-operation.md
+++ b/docs/pages/api/nixl-connect/read-operation.md
@@ -4,8 +4,6 @@
 title: Read Operation
 ---
-# dynamo.nixl_connect.ReadOperation
 An operation which transfers data from a remote worker to the local worker.
 To create the operation, NIXL metadata ([RdmaMetadata](rdma-metadata.md)) from a remote worker's [`ReadableOperation`](readable-operation.md)

--- a/docs/pages/api/nixl-connect/readable-operation.md
+++ b/docs/pages/api/nixl-connect/readable-operation.md
@@ -4,8 +4,6 @@
 title: Readable Operation
 ---
-# dynamo.nixl_connect.ReadableOperation
 An operation which enables a remote worker to read data from the local worker.
 To create the operation, a set of local [`Descriptor`](descriptor.md) objects must be provided that reference memory intended to be transferred to a remote worker.

--- a/docs/pages/api/nixl-connect/writable-operation.md
+++ b/docs/pages/api/nixl-connect/writable-operation.md
@@ -4,8 +4,6 @@
 title: Writable Operation
 ---
-# dynamo.nixl_connect.WritableOperation
 An operation which enables a remote worker to write data to the local worker.
 To create the operation, a set of local [`Descriptor`](descriptor.md) objects must be provided which reference memory intended to receive data from a remote worker.

--- a/docs/pages/api/nixl-connect/write-operation.md
+++ b/docs/pages/api/nixl-connect/write-operation.md
@@ -4,8 +4,6 @@
 title: Write Operation
 ---
-# dynamo.nixl_connect.WriteOperation
 An operation which transfers data from the local worker to a remote worker.
 To create the operation, NIXL metadata ([RdmaMetadata](rdma-metadata.md)) from a remote worker's [`WritableOperation`](writable-operation.md)

--- a/docs/pages/backends/sglang/README.md
+++ b/docs/pages/backends/sglang/README.md
@@ -4,8 +4,6 @@
 title: SGLang
 ---
-# Running SGLang with Dynamo
 ## Use the Latest Release
 We recommend using the latest stable release of Dynamo to avoid breaking changes:

--- a/docs/pages/backends/sglang/sglang-diffusion.md
+++ b/docs/pages/backends/sglang/sglang-diffusion.md
@@ -4,8 +4,6 @@
 title: Diffusion
 ---
-# Diffusion Models
 Dynamo SGLang supports three types of diffusion-based generation: **LLM diffusion** (text generation via iterative refinement), **image diffusion** (text-to-image), and **video generation** (text-to-video). Each uses a different worker flag and handler, but all integrate with SGLang's `DiffGenerator`.
 ## Overview

--- a/docs/pages/backends/sglang/sglang-disaggregation.md
+++ b/docs/pages/backends/sglang/sglang-disaggregation.md
@@ -4,8 +4,6 @@
 title: Disaggregation
 ---
-# SGLang Disaggregated Serving
 This document explains how SGLang's disaggregated prefill-decode architecture works, both standalone and within Dynamo.
 ## Overview

--- a/docs/pages/backends/sglang/sglang-examples.md
+++ b/docs/pages/backends/sglang/sglang-examples.md
@@ -4,8 +4,6 @@
 title: Examples
 ---
-# SGLang Examples
 For quick start instructions, see the [SGLang README](README.md). This document provides all deployment patterns for running SGLang with Dynamo, including LLMs, multimodal, and diffusion models, and Kubernetes deployment.
 ## Table of Contents

--- a/docs/pages/backends/sglang/sglang-observability.md
+++ b/docs/pages/backends/sglang/sglang-observability.md
@@ -4,8 +4,6 @@
 title: Observability
 ---
-# SGLang Observability
 This guide covers metrics, tracing, and visualization for SGLang deployments running through Dynamo.
 ## Prometheus Metrics

--- a/docs/pages/backends/sglang/sglang-reference-guide.md
+++ b/docs/pages/backends/sglang/sglang-reference-guide.md
@@ -5,8 +5,6 @@ title: Reference Guide
 subtitle: Architecture, configuration, and operational details for the SGLang backend
 ---
-# Reference Guide
 ## Overview
 The SGLang backend in Dynamo uses a modular architecture where `main.py` dispatches to specialized initialization modules based on the worker type. Each worker type has its own init module, request handler, health check, and registration logic.

--- a/docs/pages/backends/trtllm/README.md
+++ b/docs/pages/backends/trtllm/README.md
@@ -4,8 +4,6 @@
 title: TensorRT-LLM
 ---
-# LLM Deployment using TensorRT-LLM
 This directory contains examples and reference implementations for deploying Large Language Models (LLMs) in various configurations using TensorRT-LLM.
 ## Use the Latest Release

--- a/docs/pages/backends/trtllm/gemma3-sliding-window-attention.md
+++ b/docs/pages/backends/trtllm/gemma3-sliding-window-attention.md
@@ -4,8 +4,6 @@
 title: Gemma3 Sliding Window
 ---
-# Gemma 3 with Variable Sliding Window Attention
 This guide demonstrates how to deploy google/gemma-3-1b-it with Variable Sliding Window Attention (VSWA) using Dynamo. Since google/gemma-3-1b-it is a small model, each aggregated, decode, or prefill worker only requires one H100 GPU or one GB200 GPU.
 VSWA is a mechanism in which a model’s layers alternate between multiple sliding window sizes. An example of this is Gemma 3, which incorporates both global attention layers and sliding window layers.

--- a/docs/pages/backends/trtllm/gpt-oss.md
+++ b/docs/pages/backends/trtllm/gpt-oss.md
@@ -4,8 +4,6 @@
 title: GPT-OSS
 ---
-# Running gpt-oss-120b Disaggregated with TensorRT-LLM
 Dynamo supports disaggregated serving of gpt-oss-120b with TensorRT-LLM. This guide demonstrates how to deploy gpt-oss-120b using disaggregated prefill/decode serving on a single B200 node with 8 GPUs, running 1 prefill worker on 4 GPUs and 1 decode worker on 4 GPUs.
 ## Overview