docs: Simplify sphinx build and table of contents on webpage (#2519)

Unverified Commit 766d3f2c authored Aug 25, 2025 by Ryan McCormick Committed by GitHub Aug 25, 2025
20 changed files
--- a/components/backends/sglang/deploy/README.md
+++ b/components/backends/sglang/deploy/README.md
@@ -145,7 +145,7 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you
 ## Further Reading
 - **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md)
+- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
 - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
 - **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
@@ -159,4 +159,4 @@ Common issues and solutions:
 3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds`
 4. **Out of memory**: Increase memory limits or reduce model batch size
-For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/quickstart.md).
+For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/README.md).
--- a/components/backends/trtllm/deploy/README.md
+++ b/components/backends/trtllm/deploy/README.md
@@ -81,7 +81,7 @@ extraPodSpec:
 Before using these templates, ensure you have:
-1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/quickstart.md)
+1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
 2. **Kubernetes cluster with GPU support**
 3. **Container registry access** for TensorRT-LLM runtime images
 4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
@@ -257,7 +257,7 @@ Configure the `model` name and `host` based on your deployment.
 ## Further Reading
 - **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md)
+- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
 - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
 - **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
@@ -277,4 +277,4 @@ Common issues and solutions:
 6. **Git LFS issues**: Ensure git-lfs is installed before building containers
 7. **ARM deployment**: Use `--platform linux/arm64` when building on ARM machines
-For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/quickstart.md#troubleshooting).
+For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
--- a/components/backends/vllm/deploy/README.md
+++ b/components/backends/vllm/deploy/README.md
@@ -82,7 +82,7 @@ extraPodSpec:
 Before using these templates, ensure you have:
-1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/quickstart.md)
+1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
 2. **Kubernetes cluster with GPU support**
 3. **Container registry access** for vLLM runtime images
 4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
@@ -236,7 +236,7 @@ args:
 ## Further Reading
 - **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md)
+- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
 - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
 - **SLA Planner**: [SLA Planner Deployment Guide](../../../../docs/guides/dynamo_deploy/sla_planner_deployment.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
@@ -252,4 +252,4 @@ Common issues and solutions:
 4. **Out of memory**: Increase memory limits or reduce model batch size
 5. **Port forwarding issues**: Ensure correct pod UUID in port-forward command
-For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/quickstart.md#troubleshooting).
+For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
\ No newline at end of file
--- a/deploy/inference-gateway/README.md
+++ b/deploy/inference-gateway/README.md
@@ -20,7 +20,7 @@ Currently, these setups are only supported with the kGateway based Inference Gat
 1. **Install Dynamo Platform**
-[See Quickstart Guide](../../docs/guides/dynamo_deploy/quickstart.md) to install Dynamo Cloud.
+[See Quickstart Guide](../../docs/guides/dynamo_deploy/README.md) to install Dynamo Cloud.
 2. **Deploy Inference Gateway**
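
The hunks above all repoint links from `quickstart.md` to `README.md` in the same directory. As a rough sketch of how any remaining references could be located in a local checkout, the following scans Markdown files for inline links whose target still ends in `quickstart.md`. The helper name `find_stale_links` and the regular expression are illustrative assumptions, not part of the repository, and the scan only covers `.md` files (links in `.rst` sources would need a separate pattern).

```python
import re
from pathlib import Path

# Matches Markdown inline links whose target ends in quickstart.md,
# optionally followed by a #fragment. (Illustrative pattern, not from the repo.)
STALE_LINK = re.compile(r"\]\(([^)\s]*quickstart\.md(?:#[^)\s]*)?)\)")


def find_stale_links(root):
    """Return (path, line number, link target) for each link to quickstart.md."""
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        # is_file() is False for directories and for broken symlinks.
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in STALE_LINK.finditer(line):
                hits.append((str(path), lineno, match.group(1)))
    return hits
```

Calling `find_stale_links(".")` from the repository root and printing each tuple gives a quick list of files that would still need the same one-line change.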

--- a/docs/_includes/dive_in_examples.rst
+++ b/docs/_includes/dive_in_examples.rst
+The examples below assume you build the latest image yourself from source. If you use a prebuilt image, follow the examples from the corresponding branch.
+.. grid:: 1 2 2 2
+    :gutter: 3
+    :margin: 0
+    :padding: 3 4 0 0
+    .. grid-item-card:: :doc:`Hello World <../examples/runtime/hello_world/README>`
+        :link: ../examples/runtime/hello_world/README
+        :link-type: doc
+        Demonstrates the basic concepts of Dynamo by creating a simple GPU-unaware graph.
+    .. grid-item-card:: :doc:`vLLM <../components/backends/vllm/README>`
+        :link: ../components/backends/vllm/README
+        :link-type: doc
+        Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with vLLM.
+    .. grid-item-card:: :doc:`SGLang <../components/backends/sglang/README>`
+        :link: ../components/backends/sglang/README
+        :link-type: doc
+        Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with SGLang.
+    .. grid-item-card:: :doc:`TensorRT-LLM <../components/backends/trtllm/README>`
+        :link: ../components/backends/trtllm/README
+        :link-type: doc
+        Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with TensorRT-LLM.
--- a/docs/_includes/install.rst
+++ b/docs/_includes/install.rst
+Pip (PyPI)
+----------
+Install a pre-built wheel from PyPI.
+.. code-block:: bash
+   # Create a virtual environment and activate it
+   uv venv venv
+   source venv/bin/activate
+   # Install Dynamo from PyPI (choose one backend extra)
+   uv pip install "ai-dynamo[sglang]==0.4.1"  # or [vllm], [trtllm]
+Pip from source
+---------------
+Install directly from a local checkout for development.
+.. code-block:: bash
+   # Clone the repository
+   git clone https://github.com/ai-dynamo/dynamo.git
+   cd dynamo
+   # Create a virtual environment and activate it
+   uv venv venv
+   source venv/bin/activate
+   uv pip install ".[sglang]"  # or [vllm], [trtllm]
+Docker
+------
+Pull and run prebuilt images from NVIDIA NGC (``nvcr.io``).
+.. code-block:: bash
+   # Run a container (mount your workspace if needed)
+   docker run --rm -it \
+     --gpus all \
+     --network host \
+     nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.4.1  # or vllm, tensorrtllm
--- a/docs/_includes/quick_start_local.rst
+++ b/docs/_includes/quick_start_local.rst
+Get started with Dynamo locally in just a few commands:
+**1. Install Dynamo**
+.. code-block:: bash
+   # Install uv (recommended Python package manager)
+   curl -LsSf https://astral.sh/uv/install.sh | sh
+   # Create virtual environment and install Dynamo
+   uv venv venv
+   source venv/bin/activate
+   uv pip install "ai-dynamo[sglang]==0.4.1"  # or [vllm], [trtllm]
+**2. Start etcd/NATS**
+.. code-block:: bash
+   # Fetch and start etcd and NATS using Docker Compose
+   curl -fsSL -o docker-compose.yml https://raw.githubusercontent.com/ai-dynamo/dynamo/release/0.4.1/deploy/docker-compose.yml
+   docker compose -f docker-compose.yml up -d
+**3. Run Dynamo**
+.. code-block:: bash
+   # Start the OpenAI compatible frontend (default port is 8080)
+   python -m dynamo.frontend
+   # In another terminal, start an SGLang worker
+   python -m dynamo.sglang --model-path Qwen/Qwen3-0.6B
+**4. Test your deployment**
+.. code-block:: bash
+   curl localhost:8080/v1/chat/completions \
+     -H "Content-Type: application/json" \
+     -d '{"model": "Qwen/Qwen3-0.6B",
+          "messages": [{"role": "user", "content": "Hello!"}],
+          "max_tokens": 50}'
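
The `curl` call in step 4 of `quick_start_local.rst` can also be issued from Python. The sketch below builds the same JSON body with the standard library only; it assumes the frontend is listening on `localhost:8080` as stated in the quick start. The function names `build_chat_request` and `send_chat` are hypothetical helpers for illustration and are not part of Dynamo.

```python
import json
import urllib.request


def build_chat_request(prompt, model="Qwen/Qwen3-0.6B",
                       base_url="http://localhost:8080", max_tokens=50):
    """Build a POST request mirroring the curl example in the quick start."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(prompt, **kwargs):
    """Send the request and return the decoded JSON response body."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With the frontend and a worker running, `send_chat("Hello!")` would return the parsed response; separating request construction from sending keeps the payload easy to inspect without a live server.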
--- a/docs/_sections/architecture.rst
+++ b/docs/_sections/architecture.rst
+Overview
+============
+.. include:: ../architecture/architecture.md
+   :parser: myst_parser.sphinx_
+.. toctree::
+   :hidden:
+   Overview <self>
+   Disaggregated Serving <../architecture/disagg_serving>
--- a/docs/_sections/backends.rst
+++ b/docs/_sections/backends.rst
+..
+    SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+    SPDX-License-Identifier: Apache-2.0
+    Licensed under the Apache License, Version 2.0 (the "License");
+    you may not use this file except in compliance with the License.
+    You may obtain a copy of the License at
+    http://www.apache.org/licenses/LICENSE-2.0
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+Backends
+========
+NVIDIA Dynamo supports multiple inference backends to provide flexibility and performance optimization for different use cases and model architectures. Backends are the underlying engines that execute AI model inference, each optimized for specific scenarios, hardware configurations, and performance requirements.
+Overview
+--------
+Dynamo's multi-backend architecture allows you to:
+* **Choose the optimal engine** for your specific workload and hardware
+* **Switch between backends** without changing your application code
+* **Leverage specialized optimizations** from each backend
+* **Scale flexibly** across different deployment scenarios
+Supported Backends
+------------------
+Dynamo currently supports the following high-performance inference backends:
+.. toctree::
+   :maxdepth: 1
+   vLLM <../components/backends/vllm/README>
+   SGLang <../components/backends/sglang/README>
+   TensorRT-LLM <../components/backends/trtllm/README>
--- a/docs/_sections/examples.rst
+++ b/docs/_sections/examples.rst
+..
+    Examples Page (left sidebar target)
+..
+Examples
+========
+.. include:: ../_includes/dive_in_examples.rst
\ No newline at end of file
--- a/docs/_sections/installation.rst
+++ b/docs/_sections/installation.rst
+..
+    Installation Page (left sidebar target)
+..
+Installation
+============
+.. include:: ../_includes/install.rst
--- a/docs/architecture/kvbm_intro.rst
+++ b/docs/architecture/kvbm_intro.rst
@@ -48,9 +48,6 @@ The Dynamo KV Block Manager serves as a reference implementation that emphasizes
   * -
     - ❌
     - SGLang
-   * -
-     - ❌
-     - llama.cpp
   * - **Serving Type**
     - ✅
     - Aggregated
@@ -61,7 +58,9 @@ The Dynamo KV Block Manager serves as a reference implementation that emphasizes
 .. toctree::
   :hidden:
+   Overview <self>
   Motivation <kvbm_motivation.md>
   KVBM Architecture <kvbm_architecture.md>
   Understanding KVBM components <kvbm_components.md>
   KVBM Further Reading <kvbm_reading>
+   LMCache Integration <../components/backends/vllm/LMCache_Integration.md>
--- a/docs/architecture/planner_intro.rst
+++ b/docs/architecture/planner_intro.rst
@@ -49,9 +49,6 @@ Key features include:
   * -
     - ❌
     - SGLang
-   * -
-     - ❌
-     - llama.cpp
   * - **Serving Type**
     - ✅
     - Aggregated
@@ -73,6 +70,7 @@ Key features include:
 .. toctree::
   :hidden:
+   Overview <self>
   Pre-Deployment Profiling <pre_deployment_profiling.md>
-   Load-based Planner <load_planner.md>
+   SLA-based Planner <sla_planner.md>
-   SLA-based Planner <sla_planner.md>
+   Planner Benchmark <../guides/planner_benchmark/README.md>
\ No newline at end of file
--- a/docs/architecture/pre_deployment_profiling.md
+++ b/docs/architecture/pre_deployment_profiling.md
@@ -96,7 +96,7 @@ Use the default pre-built image and inject custom configurations via PVC:
 1. **Set the container image:**
   ```bash
-   export DOCKER_IMAGE=nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.4.0 # or any existing image tag
+   export DOCKER_IMAGE=nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.4.1 # or any existing image tag
   ```
 2. **Inject your custom disagg configuration:**

--- a/docs/components/backends/llm/README.md
+++ b/docs/components/backends/llm/README.md
-../../../../components/backends/llm/README.md
\ No newline at end of file
--- a/docs/components/backends/sglang/README.md
+++ b/docs/components/backends/sglang/README.md
+../../../../components/backends/sglang/README.md
\ No newline at end of file
--- a/docs/components/backends/trtllm/multinode/multinode-examples.md
+++ b/docs/components/backends/trtllm/multinode/multinode-examples.md
+../../../../../components/backends/trtllm/multinode/multinode-examples.md
\ No newline at end of file
--- a/docs/components/backends/vllm/LMCache_Integration.md
+++ b/docs/components/backends/vllm/LMCache_Integration.md
+../../../../components/backends/vllm/LMCache_Integration.md
\ No newline at end of file
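
The single-line files under `docs/components/backends/` above each contain only a relative path back into `components/`, which suggests they are symlinks (an assumption; the diff does not show file modes). If so, a stale or mistyped target would silently drop a page from the build. A minimal sketch for listing dangling symlinks in a checkout follows; `broken_symlinks` is a hypothetical helper, not something shipped with the repository, and it assumes a POSIX-style filesystem.

```python
import os
from pathlib import Path


def broken_symlinks(root):
    """Return the sorted paths of symlinks under root whose targets do not exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories appear in dirnames; all others in filenames.
        for name in dirnames + filenames:
            path = Path(dirpath, name)
            # exists() follows the link, so it is False for a dangling symlink.
            if path.is_symlink() and not path.exists():
                broken.append(str(path))
    return sorted(broken)
```

Running `broken_symlinks("docs")` before a Sphinx build would flag, for example, a link left pointing at the removed `components/backends/llm/README.md`.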
--- a/docs/conf.py
+++ b/docs/conf.py
-#!/usr/bin/env python3
 # SPDX-FileCopyrightText: Copyright (c) 2023-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
 # Configuration file for the Sphinx documentation builder.
-#
-# This file only contains a selection of the most common options. For a full
-# list see the documentation:
-# https://www.sphinx-doc.org/en/master/usage/configuration.html
-# -- Path setup --------------------------------------------------------------
-import json
 import os
 import sys
-from datetime import date
-# If extensions (or modules to document with autodoc) are in another directory,
-# add these directories to sys.path here. If the directory is relative to the
-# documentation root, use os.path.abspath to make it absolute, like shown here.
-#
-import httplib2
-from packaging.version import Version
-sys.path.insert(0, os.path.abspath("_extensions"))
-# -- conf.py setup -----------------------------------------------------------
-# conf.py needs to be run in the top level 'docs'
-# directory but the calling build script needs to
-# be called from the current working directory. We
-# change to the 'docs' dir here and then revert back
-# at the end of the file.
-# current_dir = os.getcwd()
-# os.chdir("docs")
 # -- Project information -----------------------------------------------------
+project = "NVIDIA Dynamo"
-project = "Dynamo"
+copyright = "2024-2025, NVIDIA CORPORATION & AFFILIATES"
-copyright = "2025-{}, NVIDIA Corporation".format(date.today().year)
 author = "NVIDIA"
-# Get the version of dynamo this is building.
-version_long = "0.1.0"
-version_short = version_long
-version_short_split = version_short.split(".")
-one_before = f"{version_short_split[0]}.{int(version_short_split[1]) - 1}.{version_short_split[2]}"
 # -- General configuration ---------------------------------------------------
-# Add any Sphinx extension module names here, as strings. They can be
+# Standard extensions
-# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
-# ones.
 extensions = [
    "ablog",
    "myst_parser",
@@ -82,188 +29,53 @@ extensions = [
    "sphinx.ext.ifconfig",
    "sphinx.ext.extlinks",
    "sphinxcontrib.mermaid",
-    "github_alerts",  # Custom extension for GitHub alert conversion
-]
-suppress_warnings = ["myst.domains", "ref.ref", "myst.header"]
-source_suffix = [".rst", ".md"]
-autodoc_default_options = {
-    "members": True,
-    "undoc-members": True,
-    "private-members": True,
-}
-autosummary_generate = True
-autosummary_mock_imports = [
-    "tritonclient.grpc.model_config_pb2",
-    "tritonclient.grpc.service_pb2",
-    "tritonclient.grpc.service_pb2_grpc",
 ]
-napoleon_include_special_with_doc = True
+# Custom extensions
+sys.path.insert(0, os.path.abspath("_extensions"))
+extensions.append("github_alerts")
-numfig = True
+# Treat ```mermaid code fences as directives so sphinxcontrib-mermaid renders them
+myst_fence_as_directive = ["mermaid"]  # Requires sphinxcontrib.mermaid (listed in extensions above)
-# final location of docs for seo/sitemap
+# File extensions (myst_parser automatically handles .md files)
-html_baseurl = "https://docs.nvidia.com/dynamo/latest/"
+source_suffix = [".rst", ".md"]
+# MyST parser configuration
 myst_enable_extensions = [
-    "dollarmath",
+    "colon_fence",  # ::: code blocks
-    "amsmath",
+    "deflist",  # Definition lists
-    "deflist",
+    "html_image",  # HTML images
-    # "html_admonition",
+    "tasklist",  # Task lists
-    "html_image",
-    "colon_fence",
-    # "smartquotes",
-    "replacements",
-    # "linkify",
-    "substitution",
 ]
-myst_heading_anchors = 5
-myst_fence_as_directive = ["mermaid"]
-# Add any paths that contain templates here, relative to this directory.
+# Templates path
-# templates_path = ["_templates"] # disable it for nvidia-sphinx-theme to show footer
+templates_path = ["_templates"]
+# List of patterns to ignore when looking for source files
+exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "build"]
 # -- Options for HTML output -------------------------------------------------
-# The theme to use for HTML and HTML Help pages.  See the documentation for
-# a list of builtin themes.
-#
 html_theme = "nvidia_sphinx_theme"
-# Add any paths that contain custom static files (such as style sheets) here,
-# relative to this directory. They are copied after the builtin static files,
-# so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ["_static"]
-# html_js_files = ["custom.js"]
-# html_css_files = ["custom.css"] # Not needed with new theme
 html_theme_options = {
    "collapse_navigation": False,
    "github_url": "https://github.com/ai-dynamo/dynamo",
-    # "switcher": {
+    "navbar_start": ["navbar-logo"],
-    # use for local testing
-    # "json_url": "http://localhost:8000/_static/switcher.json",
-    # "json_url": "https://docs.nvidia.com/dynamo/latest/_static/switcher.json",
-    # "version_match": one_before if "dev" in version_long else version_short,
-    # },
-    "navbar_start": ["navbar-logo", "version-switcher"],
    "primary_sidebar_end": [],
 }
-# Theme options are theme-specific and customize the look and feel of a theme
+# Document settings
-# further.  For a list of options available for each theme, see the
+master_doc = "index"
-# documentation.
+html_title = f"{project} Documentation"
-#
+html_short_title = project
-html_theme_options.update(
+html_baseurl = "https://docs.nvidia.com/dynamo/latest/"
-    {
-        "collapse_navigation": False,
-    }
-)
-deploy_ngc_org = "nvidia"
-deploy_ngc_team = "dynamo"
-myst_substitutions = {
-    "VersionNum": version_short,
-    "deploy_ngc_org_team": f"{deploy_ngc_org}/{deploy_ngc_team}"
-    if deploy_ngc_team
-    else deploy_ngc_org,
-}
-def ultimateReplace(app, docname, source):
-    result = source[0]
-    for key in app.config.ultimate_replacements:
-        result = result.replace(key, app.config.ultimate_replacements[key])
-    source[0] = result
-# this is a necessary hack to allow us to fill in variables that exist in code blocks
-ultimate_replacements = {
-    "{VersionNum}": version_short,
-    "{SamplesVersionNum}": version_short,
-    "{NgcOrgTeam}": f"{deploy_ngc_org}/{deploy_ngc_team}"
-    if deploy_ngc_team
-    else deploy_ngc_org,
-}
-# bibtex_bibfiles = ["references.bib"]
-# To test that style looks good with common bibtex config
-# bibtex_reference_style = "author_year"
-# bibtex_default_style = "plain"
-### We currently use Myst: https://myst-nb.readthedocs.io/en/latest/use/execute.html
-nb_execution_mode = "off"  # Global execution disable
-# execution_excludepatterns = ['tutorials/tts-python-basics.ipynb']  # Individual notebook disable
-###############################
-# SETUP SWITCHER
-###############################
-switcher_path = os.path.join(html_static_path[0], "switcher.json")
-versions = []
-# Triton 2 releases
-correction = -1 if "dev" in version_long else 0
-upper_bound = version_short.split(".")[1]
-for i in range(2, int(version_short.split(".")[1]) + correction):
-    versions.append((f"2.{i}.0", f"dynamo{i}0"))
-# Patch releases
-# Add here.
-versions = sorted(versions, key=lambda v: Version(v[0]), reverse=True)
-# Build switcher data
-json_data = []
-for v in versions:
-    json_data.append(
-        {
-            "name": v[0],
-            "version": v[0],
-            "url": f"https://docs.nvidia.com/dynamo/archives/{v[1]}/user-guide/docs",
-        }
-    )
-if "dev" in version_long:
-    json_data.insert(
-        0,
-        {
-            "name": f"{one_before} (current_release)",
-            "version": f"{one_before}",
-            "url": "https://docs.nvidia.com/dynamo/latest/index.html",
-        },
-    )
-else:
-    json_data.insert(
-        0,
-        {
-            "name": f"{version_short} (current release)",
-            "version": f"{version_short}",
-            "url": "https://docs.nvidia.com/dynamo/latest/index.html",
-        },
-    )
-# Trim to last N releases.
-json_data = json_data[0:12]
-json_data.append(
-    {
-        "name": "older releases",
-        "version": "archives",
-        "url": "https://docs.nvidia.com/dynamo/archives/",
-    }
-)
-# validate the links
+# Suppress warnings for external links and missing references
-for i, d in enumerate(json_data):
+suppress_warnings = [
-    h = httplib2.Http()
+    "myst.xref_missing",  # Missing cross-references of relative links outside docs folder
-    resp = h.request(d["url"], "HEAD")
+]
-    if int(resp[0]["status"]) >= 400:
-        print(d["url"], "NOK", resp[0]["status"])
-        # exit(1)
-# Write switcher data to file
+# Additional MyST configuration
-with open(switcher_path, "w") as f:
+myst_heading_anchors = 7  # Generate anchors for headers
-    json.dump(json_data, f, ensure_ascii=False, indent=4)
+myst_substitutions = {}  # Custom substitutions
--- a/docs/examples/README.md
+++ b/docs/examples/README.md
-# Examples of using Dynamo Platform
-## Serving examples locally
-Follow individual examples under components/backends/ to serve models locally.
-For example follow the [vLLM Backend Example](../../components/backends/vllm/README.md)
-For a basic GPU - unaware example see the [Hello World Example](../../examples/runtime/hello_world/README.md)
-## Deploying Examples to Kubernetes
-First you need to install the Dynamo Cloud Platform. Dynamo Cloud acts as an orchestration layer between the end user and Kubernetes, handling the complexity of deploying your graphs for you.
-Before you can deploy your graphs, you need to deploy the Dynamo Runtime and Dynamo Cloud images. This is a one-time action, only necessary the first time you deploy a DynamoGraph.
-### Instructions for Dynamo User
-If you are a **👤 Dynamo User** first follow the [Quickstart Guide](../guides/dynamo_deploy/quickstart.md) first.
-### Instructions for Dynamo Contributor
-If you are a **🧑‍💻 Dynamo Contributor** you may have to rebuild the dynamo platform images as the code evolves.
-For more details read the [Cloud Guide](../guides/dynamo_deploy/dynamo_cloud.md)
-Read more on deploying Dynamo Cloud read [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md).
-### Deploying a particular example
-```bash
-# Set your dynamo root directory
-cd <root-dynamo-folder>
-export PROJECT_ROOT=$(pwd)
-export NAMESPACE=<your-namespace> # the namespace you used to deploy Dynamo cloud to.
-```
-Deploying an example consists of the simple `kubectl apply -f ... -n ${NAMESPACE}` command. For example:
-```bash
-kubectl apply -f components/backends/vllm/deploy/agg.yaml -n ${NAMESPACE}
-```
-You can use `kubectl get dynamoGraphDeployment -n ${NAMESPACE}` to view your deployment.
-You can use `kubectl delete dynamoGraphDeployment <your-dep-name> -n ${NAMESPACE}` to delete the deployment.
-We provide a Custom Resource yaml file for many examples under the `components/backends/<backend-name>/deploy/`folder.
-Consult the examples below for the CRs for your specific inference backend.
-[View SGLang k8s](../../components/backends/sglang/deploy/README.md)
-[View vLLM K8s](../../components/backends/vllm/deploy/README.md)
-[View TRTLLM k8s](../../components/backends/trtllm/deploy/README.md)
-**Note 1** Example Image
-The examples use a prebuilt image from the `nvcr.io` registry.
-You can build your own image and update the image location in your CR file prior to applying.
-You could build your own image using
-```bash
-./container/build.sh --framework <your-inference-framework>
-```
-For example for the `sglang` run
-```bash
-./container/build.sh --framework sglang
-```
-Then you would need to overwrite the image in the examples.
-```bash
-extraPodSpec:
-        mainContainer:
-          image: <image-in-your-$DYNAMO_IMAGE>
-```
-**Note 2**
-Setup port forward if needed when deploying to Kubernetes.
-List the services in your namespace:
-```bash
-kubectl get svc -n ${NAMESPACE}
-```
-Look for one that ends in `-frontend` and use it for port forward.
-```bash
-SERVICE_NAME=$(kubectl get svc -n ${NAMESPACE} -o name | grep frontend | sed 's|.*/||' | sed 's|-frontend||' | head -n1)
-kubectl port-forward svc/${SERVICE_NAME}-frontend 8080:8080 -n ${NAMESPACE}
-```
-Consult the [Port Forward Documentation](https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/)