Commit 766d3f2c authored by Ryan McCormick, committed by GitHub

docs: Simplify sphinx build and table of contents on webpage (#2519)

parent f5a41004
...@@ -145,7 +145,7 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you ...@@ -145,7 +145,7 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you
## Further Reading ## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md) - **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md) - **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md) - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md) - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) - **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
...@@ -159,4 +159,4 @@ Common issues and solutions: ...@@ -159,4 +159,4 @@ Common issues and solutions:
3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds` 3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds`
4. **Out of memory**: Increase memory limits or reduce model batch size 4. **Out of memory**: Increase memory limits or reduce model batch size
For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/quickstart.md). For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/README.md).
...@@ -81,7 +81,7 @@ extraPodSpec: ...@@ -81,7 +81,7 @@ extraPodSpec:
Before using these templates, ensure you have: Before using these templates, ensure you have:
1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/quickstart.md) 1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
2. **Kubernetes cluster with GPU support** 2. **Kubernetes cluster with GPU support**
3. **Container registry access** for TensorRT-LLM runtime images 3. **Container registry access** for TensorRT-LLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`) 4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
...@@ -257,7 +257,7 @@ Configure the `model` name and `host` based on your deployment. ...@@ -257,7 +257,7 @@ Configure the `model` name and `host` based on your deployment.
## Further Reading ## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md) - **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md) - **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md) - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md) - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md) - **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
...@@ -277,4 +277,4 @@ Common issues and solutions: ...@@ -277,4 +277,4 @@ Common issues and solutions:
6. **Git LFS issues**: Ensure git-lfs is installed before building containers 6. **Git LFS issues**: Ensure git-lfs is installed before building containers
7. **ARM deployment**: Use `--platform linux/arm64` when building on ARM machines 7. **ARM deployment**: Use `--platform linux/arm64` when building on ARM machines
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/quickstart.md#troubleshooting). For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
...@@ -82,7 +82,7 @@ extraPodSpec: ...@@ -82,7 +82,7 @@ extraPodSpec:
Before using these templates, ensure you have: Before using these templates, ensure you have:
1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/quickstart.md) 1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
2. **Kubernetes cluster with GPU support** 2. **Kubernetes cluster with GPU support**
3. **Container registry access** for vLLM runtime images 3. **Container registry access** for vLLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`) 4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
...@@ -236,7 +236,7 @@ args: ...@@ -236,7 +236,7 @@ args:
## Further Reading ## Further Reading
- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md) - **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md) - **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md) - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
- **SLA Planner**: [SLA Planner Deployment Guide](../../../../docs/guides/dynamo_deploy/sla_planner_deployment.md) - **SLA Planner**: [SLA Planner Deployment Guide](../../../../docs/guides/dynamo_deploy/sla_planner_deployment.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md) - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
...@@ -252,4 +252,4 @@ Common issues and solutions: ...@@ -252,4 +252,4 @@ Common issues and solutions:
4. **Out of memory**: Increase memory limits or reduce model batch size 4. **Out of memory**: Increase memory limits or reduce model batch size
5. **Port forwarding issues**: Ensure correct pod UUID in port-forward command 5. **Port forwarding issues**: Ensure correct pod UUID in port-forward command
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/quickstart.md#troubleshooting). For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
\ No newline at end of file
...@@ -20,7 +20,7 @@ Currently, these setups are only supported with the kGateway based Inference Gat ...@@ -20,7 +20,7 @@ Currently, these setups are only supported with the kGateway based Inference Gat
1. **Install Dynamo Platform** 1. **Install Dynamo Platform**
[See Quickstart Guide](../../docs/guides/dynamo_deploy/quickstart.md) to install Dynamo Cloud. [See Quickstart Guide](../../docs/guides/dynamo_deploy/README.md) to install Dynamo Cloud.
2. **Deploy Inference Gateway** 2. **Deploy Inference Gateway**
......
The examples below assume you build the latest image yourself from source. If you use a prebuilt image, follow the examples from the corresponding branch.
.. grid:: 1 2 2 2
:gutter: 3
:margin: 0
:padding: 3 4 0 0
.. grid-item-card:: :doc:`Hello World <../examples/runtime/hello_world/README>`
:link: ../examples/runtime/hello_world/README
:link-type: doc
Demonstrates the basic concepts of Dynamo by creating a simple GPU-unaware graph.
.. grid-item-card:: :doc:`vLLM <../components/backends/vllm/README>`
:link: ../components/backends/vllm/README
:link-type: doc
Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with vLLM.
.. grid-item-card:: :doc:`SGLang <../components/backends/sglang/README>`
:link: ../components/backends/sglang/README
:link-type: doc
Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with SGLang.
.. grid-item-card:: :doc:`TensorRT-LLM <../components/backends/trtllm/README>`
:link: ../components/backends/trtllm/README
:link-type: doc
Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with TensorRT-LLM.
Pip (PyPI)
----------
Install a pre-built wheel from PyPI.
.. code-block:: bash
# Create a virtual environment and activate it
uv venv venv
source venv/bin/activate
# Install Dynamo from PyPI (choose one backend extra)
uv pip install "ai-dynamo[sglang]==0.4.1" # or [vllm], [trtllm]
Pip from source
---------------
Install directly from a local checkout for development.
.. code-block:: bash
# Clone the repository
git clone https://github.com/ai-dynamo/dynamo.git
cd dynamo
# Create a virtual environment and activate it
uv venv venv
source venv/bin/activate
uv pip install ".[sglang]" # or [vllm], [trtllm]
Docker
------
Pull and run prebuilt images from NVIDIA NGC (`nvcr.io`).
.. code-block:: bash
# Run a container (mount your workspace if needed)
docker run --rm -it \
--gpus all \
--network host \
nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.4.1 # or vllm, tensorrtllm
Get started with Dynamo locally in just a few commands:
**1. Install Dynamo**
.. code-block:: bash
# Install uv (recommended Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create virtual environment and install Dynamo
uv venv venv
source venv/bin/activate
uv pip install "ai-dynamo[sglang]==0.4.1" # or [vllm], [trtllm]
**2. Start etcd/NATS**
.. code-block:: bash
# Fetch and start etcd and NATS using Docker Compose
curl -fsSL -o docker-compose.yml https://raw.githubusercontent.com/ai-dynamo/dynamo/release/0.4.1/deploy/docker-compose.yml
docker compose -f docker-compose.yml up -d
**3. Run Dynamo**
.. code-block:: bash
# Start the OpenAI compatible frontend (default port is 8080)
python -m dynamo.frontend
# In another terminal, start an SGLang worker
python -m dynamo.sglang --model-path Qwen/Qwen3-0.6B
**4. Test your deployment**
.. code-block:: bash
curl localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "Qwen/Qwen3-0.6B",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 50}'
Overview
============
.. include:: ../architecture/architecture.md
:parser: myst_parser.sphinx_
.. toctree::
:hidden:
Overview <self>
Disaggregated Serving <../architecture/disagg_serving>
..
SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Backends
========
NVIDIA Dynamo supports multiple inference backends to provide flexibility and performance optimization for different use cases and model architectures. Backends are the underlying engines that execute AI model inference, each optimized for specific scenarios, hardware configurations, and performance requirements.
Overview
--------
Dynamo's multi-backend architecture allows you to:
* **Choose the optimal engine** for your specific workload and hardware
* **Switch between backends** without changing your application code
* **Leverage specialized optimizations** from each backend
* **Scale flexibly** across different deployment scenarios
Supported Backends
------------------
Dynamo currently supports the following high-performance inference backends:
.. toctree::
:maxdepth: 1
vLLM <../components/backends/vllm/README>
SGLang <../components/backends/sglang/README>
TensorRT-LLM <../components/backends/trtllm/README>
..
Quickstart Page (left sidebar target)
..
Examples
========
.. include:: ../_includes/dive_in_examples.rst
\ No newline at end of file
..
Installation Page (left sidebar target)
..
Installation
============
.. include:: ../_includes/install.rst
...@@ -48,9 +48,6 @@ The Dynamo KV Block Manager serves as a reference implementation that emphasizes ...@@ -48,9 +48,6 @@ The Dynamo KV Block Manager serves as a reference implementation that emphasizes
* - * -
- ❌ - ❌
- SGLang - SGLang
* -
- ❌
- llama.cpp
* - **Serving Type** * - **Serving Type**
- ✅ - ✅
- Aggregated - Aggregated
...@@ -61,7 +58,9 @@ The Dynamo KV Block Manager serves as a reference implementation that emphasizes ...@@ -61,7 +58,9 @@ The Dynamo KV Block Manager serves as a reference implementation that emphasizes
.. toctree:: .. toctree::
:hidden: :hidden:
Overview <self>
Motivation <kvbm_motivation.md> Motivation <kvbm_motivation.md>
KVBM Architecture <kvbm_architecture.md> KVBM Architecture <kvbm_architecture.md>
Understanding KVBM components <kvbm_components.md> Understanding KVBM components <kvbm_components.md>
KVBM Further Reading <kvbm_reading> KVBM Further Reading <kvbm_reading>
LMCache Integration <../components/backends/vllm/LMCache_Integration.md>
...@@ -49,9 +49,6 @@ Key features include: ...@@ -49,9 +49,6 @@ Key features include:
* - * -
- ❌ - ❌
- SGLang - SGLang
* -
- ❌
- llama.cpp
* - **Serving Type** * - **Serving Type**
- ✅ - ✅
- Aggregated - Aggregated
...@@ -73,6 +70,7 @@ Key features include: ...@@ -73,6 +70,7 @@ Key features include:
.. toctree:: .. toctree::
:hidden: :hidden:
Overview <self>
Pre-Deployment Profiling <pre_deployment_profiling.md> Pre-Deployment Profiling <pre_deployment_profiling.md>
Load-based Planner <load_planner.md> SLA-based Planner <sla_planner.md>
SLA-based Planner <sla_planner.md> Planner Benchmark <../guides/planner_benchmark/README.md>
\ No newline at end of file \ No newline at end of file
...@@ -96,7 +96,7 @@ Use the default pre-built image and inject custom configurations via PVC: ...@@ -96,7 +96,7 @@ Use the default pre-built image and inject custom configurations via PVC:
1. **Set the container image:** 1. **Set the container image:**
```bash ```bash
export DOCKER_IMAGE=nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.4.0 # or any existing image tag export DOCKER_IMAGE=nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.4.1 # or any existing image tag
``` ```
2. **Inject your custom disagg configuration:** 2. **Inject your custom disagg configuration:**
......
../../../../components/backends/llm/README.md
\ No newline at end of file
../../../../components/backends/sglang/README.md
\ No newline at end of file
../../../../../components/backends/trtllm/multinode/multinode-examples.md
\ No newline at end of file
../../../../components/backends/vllm/LMCache_Integration.md
\ No newline at end of file
#!/usr/bin/env python3
# SPDX-FileCopyrightText: Copyright (c) 2023-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. # SPDX-FileCopyrightText: Copyright (c) 2023-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0 # SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Configuration file for the Sphinx documentation builder. # Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
import json
import os import os
import sys import sys
from datetime import date
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import httplib2
from packaging.version import Version
sys.path.insert(0, os.path.abspath("_extensions"))
# -- conf.py setup -----------------------------------------------------------
# conf.py needs to be run in the top level 'docs'
# directory but the calling build script needs to
# be called from the current working directory. We
# change to the 'docs' dir here and then revert back
# at the end of the file.
# current_dir = os.getcwd()
# os.chdir("docs")
# -- Project information ----------------------------------------------------- # -- Project information -----------------------------------------------------
project = "NVIDIA Dynamo"
project = "Dynamo" copyright = "2024-2025, NVIDIA CORPORATION & AFFILIATES"
copyright = "2025-{}, NVIDIA Corporation".format(date.today().year)
author = "NVIDIA" author = "NVIDIA"
# Get the version of dynamo this is building.
version_long = "0.1.0"
version_short = version_long
version_short_split = version_short.split(".")
one_before = f"{version_short_split[0]}.{int(version_short_split[1]) - 1}.{version_short_split[2]}"
# -- General configuration --------------------------------------------------- # -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be # Standard extensions
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [ extensions = [
"ablog", "ablog",
"myst_parser", "myst_parser",
...@@ -82,188 +29,53 @@ extensions = [ ...@@ -82,188 +29,53 @@ extensions = [
"sphinx.ext.ifconfig", "sphinx.ext.ifconfig",
"sphinx.ext.extlinks", "sphinx.ext.extlinks",
"sphinxcontrib.mermaid", "sphinxcontrib.mermaid",
"github_alerts", # Custom extension for GitHub alert conversion
]
suppress_warnings = ["myst.domains", "ref.ref", "myst.header"]
source_suffix = [".rst", ".md"]
autodoc_default_options = {
"members": True,
"undoc-members": True,
"private-members": True,
}
autosummary_generate = True
autosummary_mock_imports = [
"tritonclient.grpc.model_config_pb2",
"tritonclient.grpc.service_pb2",
"tritonclient.grpc.service_pb2_grpc",
] ]
napoleon_include_special_with_doc = True # Custom extensions
sys.path.insert(0, os.path.abspath("_extensions"))
extensions.append("github_alerts")
numfig = True # Handle Mermaid diagrams as code blocks (not directives) to avoid warnings
myst_fence_as_directive = ["mermaid"] # Uncomment if sphinxcontrib-mermaid is installed
# final location of docs for seo/sitemap # File extensions (myst_parser automatically handles .md files)
html_baseurl = "https://docs.nvidia.com/dynamo/latest/" source_suffix = [".rst", ".md"]
# MyST parser configuration
myst_enable_extensions = [ myst_enable_extensions = [
"dollarmath", "colon_fence", # ::: code blocks
"amsmath", "deflist", # Definition lists
"deflist", "html_image", # HTML images
# "html_admonition", "tasklist", # Task lists
"html_image",
"colon_fence",
# "smartquotes",
"replacements",
# "linkify",
"substitution",
] ]
myst_heading_anchors = 5
myst_fence_as_directive = ["mermaid"]
# Add any paths that contain templates here, relative to this directory. # Templates path
# templates_path = ["_templates"] # disable it for nvidia-sphinx-theme to show footer templates_path = ["_templates"]
# List of patterns to ignore when looking for source files
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "build"]
# -- Options for HTML output ------------------------------------------------- # -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "nvidia_sphinx_theme" html_theme = "nvidia_sphinx_theme"
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"] html_static_path = ["_static"]
# html_js_files = ["custom.js"]
# html_css_files = ["custom.css"] # Not needed with new theme
html_theme_options = { html_theme_options = {
"collapse_navigation": False, "collapse_navigation": False,
"github_url": "https://github.com/ai-dynamo/dynamo", "github_url": "https://github.com/ai-dynamo/dynamo",
# "switcher": { "navbar_start": ["navbar-logo"],
# use for local testing
# "json_url": "http://localhost:8000/_static/switcher.json",
# "json_url": "https://docs.nvidia.com/dynamo/latest/_static/switcher.json",
# "version_match": one_before if "dev" in version_long else version_short,
# },
"navbar_start": ["navbar-logo", "version-switcher"],
"primary_sidebar_end": [], "primary_sidebar_end": [],
} }
# Theme options are theme-specific and customize the look and feel of a theme # Document settings
# further. For a list of options available for each theme, see the master_doc = "index"
# documentation. html_title = f"{project} Documentation"
# html_short_title = project
html_theme_options.update( html_baseurl = "https://docs.nvidia.com/dynamo/latest/"
{
"collapse_navigation": False,
}
)
deploy_ngc_org = "nvidia"
deploy_ngc_team = "dynamo"
myst_substitutions = {
"VersionNum": version_short,
"deploy_ngc_org_team": f"{deploy_ngc_org}/{deploy_ngc_team}"
if deploy_ngc_team
else deploy_ngc_org,
}
def ultimateReplace(app, docname, source):
result = source[0]
for key in app.config.ultimate_replacements:
result = result.replace(key, app.config.ultimate_replacements[key])
source[0] = result
# this is a necessary hack to allow us to fill in variables that exist in code blocks
ultimate_replacements = {
"{VersionNum}": version_short,
"{SamplesVersionNum}": version_short,
"{NgcOrgTeam}": f"{deploy_ngc_org}/{deploy_ngc_team}"
if deploy_ngc_team
else deploy_ngc_org,
}
# bibtex_bibfiles = ["references.bib"]
# To test that style looks good with common bibtex config
# bibtex_reference_style = "author_year"
# bibtex_default_style = "plain"
### We currently use Myst: https://myst-nb.readthedocs.io/en/latest/use/execute.html
nb_execution_mode = "off" # Global execution disable
# execution_excludepatterns = ['tutorials/tts-python-basics.ipynb'] # Individual notebook disable
###############################
# SETUP SWITCHER
###############################
switcher_path = os.path.join(html_static_path[0], "switcher.json")
versions = []
# Triton 2 releases
correction = -1 if "dev" in version_long else 0
upper_bound = version_short.split(".")[1]
for i in range(2, int(version_short.split(".")[1]) + correction):
versions.append((f"2.{i}.0", f"dynamo{i}0"))
# Patch releases
# Add here.
versions = sorted(versions, key=lambda v: Version(v[0]), reverse=True)
# Build switcher data
json_data = []
for v in versions:
json_data.append(
{
"name": v[0],
"version": v[0],
"url": f"https://docs.nvidia.com/dynamo/archives/{v[1]}/user-guide/docs",
}
)
if "dev" in version_long:
json_data.insert(
0,
{
"name": f"{one_before} (current_release)",
"version": f"{one_before}",
"url": "https://docs.nvidia.com/dynamo/latest/index.html",
},
)
else:
json_data.insert(
0,
{
"name": f"{version_short} (current release)",
"version": f"{version_short}",
"url": "https://docs.nvidia.com/dynamo/latest/index.html",
},
)
# Trim to last N releases.
json_data = json_data[0:12]
json_data.append(
{
"name": "older releases",
"version": "archives",
"url": "https://docs.nvidia.com/dynamo/archives/",
}
)
# validate the links # Suppress warnings for external links and missing references
for i, d in enumerate(json_data): suppress_warnings = [
h = httplib2.Http() "myst.xref_missing", # Missing cross-references of relative links outside docs folder
resp = h.request(d["url"], "HEAD") ]
if int(resp[0]["status"]) >= 400:
print(d["url"], "NOK", resp[0]["status"])
# exit(1)
# Write switcher data to file # Additional MyST configuration
with open(switcher_path, "w") as f: myst_heading_anchors = 7 # Generate anchors for headers
json.dump(json_data, f, ensure_ascii=False, indent=4) myst_substitutions = {} # Custom substitutions
# Examples of using Dynamo Platform
## Serving examples locally
Follow individual examples under components/backends/ to serve models locally.
For example, follow the [vLLM Backend Example](../../components/backends/vllm/README.md).
For a basic GPU-unaware example, see the [Hello World Example](../../examples/runtime/hello_world/README.md).
## Deploying Examples to Kubernetes
First you need to install the Dynamo Cloud Platform. Dynamo Cloud acts as an orchestration layer between the end user and Kubernetes, handling the complexity of deploying your graphs for you.
Before you can deploy your graphs, you need to deploy the Dynamo Runtime and Dynamo Cloud images. This is a one-time action, only necessary the first time you deploy a DynamoGraph.
### Instructions for Dynamo User
If you are a **👤 Dynamo User**, first follow the [Quickstart Guide](../guides/dynamo_deploy/quickstart.md).
### Instructions for Dynamo Contributor
If you are a **🧑‍💻 Dynamo Contributor**, you may have to rebuild the Dynamo platform images as the code evolves.
For more details, read the [Cloud Guide](../guides/dynamo_deploy/dynamo_cloud.md).
For more on deploying Dynamo Cloud, read [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md).
### Deploying a particular example
```bash
# Set your dynamo root directory
cd <root-dynamo-folder>
export PROJECT_ROOT=$(pwd)
export NAMESPACE=<your-namespace> # the namespace you used to deploy Dynamo cloud to.
```
Deploying an example takes a single `kubectl apply -f ... -n ${NAMESPACE}` command. For example:
```bash
kubectl apply -f components/backends/vllm/deploy/agg.yaml -n ${NAMESPACE}
```
You can use `kubectl get dynamoGraphDeployment -n ${NAMESPACE}` to view your deployment.
You can use `kubectl delete dynamoGraphDeployment <your-dep-name> -n ${NAMESPACE}` to delete the deployment.
We provide a Custom Resource (CR) YAML file for many examples under the `components/backends/<backend-name>/deploy/` folder.
Consult the links below for the CRs for your specific inference backend.
[View SGLang K8s](../../components/backends/sglang/deploy/README.md)
[View vLLM K8s](../../components/backends/vllm/deploy/README.md)
[View TRTLLM K8s](../../components/backends/trtllm/deploy/README.md)
**Note 1: Example image**
The examples use a prebuilt image from the `nvcr.io` registry.
To use your own image instead, build it and update the image location in your CR file before applying:
```bash
./container/build.sh --framework <your-inference-framework>
```
For example, for `sglang`, run:
```bash
./container/build.sh --framework sglang
```
Then overwrite the image in the example CR file:
```yaml
extraPodSpec:
  mainContainer:
    image: <image-in-your-$DYNAMO_IMAGE>
```
**Note 2**
Set up port forwarding if needed when deploying to Kubernetes.
List the services in your namespace:
```bash
kubectl get svc -n ${NAMESPACE}
```
Look for one that ends in `-frontend` and use it for port forwarding.
```bash
SERVICE_NAME=$(kubectl get svc -n ${NAMESPACE} -o name | grep frontend | sed 's|.*/||' | sed 's|-frontend||' | head -n1)
kubectl port-forward svc/${SERVICE_NAME}-frontend 8080:8080 -n ${NAMESPACE}
```
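The `grep`/`sed`/`head` pipeline above can be hard to read at a glance. The following is a minimal Python sketch of the same name derivation, assuming the input lines look like the output of `kubectl get svc -o name` (for example `service/<name>`). The function name and the sample service names are hypothetical and only illustrate the string handling; the sketch does not call `kubectl`.

```python
from typing import Iterable, Optional


def frontend_service_base(svc_names: Iterable[str]) -> Optional[str]:
    """Mirror: grep frontend | sed 's|.*/||' | sed 's|-frontend||' | head -n1."""
    for line in svc_names:
        if "frontend" not in line:  # grep frontend
            continue
        name = line.strip().rsplit("/", 1)[-1]  # sed 's|.*/||' (drop "service/")
        # sed 's|-frontend||' removes the first match; head -n1 keeps the first line
        return name.replace("-frontend", "", 1)
    return None


# Hypothetical output lines of `kubectl get svc -n ${NAMESPACE} -o name`
sample = ["service/etcd", "service/vllm-agg-frontend", "service/nats"]
base = frontend_service_base(sample)
print(f"kubectl port-forward svc/{base}-frontend 8080:8080")
```

Like the shell pipeline, this picks the first service whose name contains `frontend`, so a namespace with several frontends may need an explicit service name instead.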
Consult the [Port Forward Documentation](https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/) for details.
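Once the port-forward is active, the frontend should be reachable on `localhost:8080`. As a rough sketch, the chat completions request from the local quickstart (`/v1/chat/completions` with `model`, `messages`, and `max_tokens`) can be built with the Python standard library as shown here. The helper names are hypothetical, and the model name is a placeholder taken from the local quickstart; replace it with the model your deployment actually serves.

```python
import json
import urllib.request


def build_chat_request(model, prompt, host="localhost", port=8080, max_tokens=50):
    """Build (but do not send) a POST request for the OpenAI-compatible frontend."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"http://{host}:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req, timeout=30.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Placeholder model name; use the model configured in your deployment.
req = build_chat_request("Qwen/Qwen3-0.6B", "Hello!")
print(req.get_method(), req.full_url)
```

Calling `send(req)` performs the actual HTTP call, so it only succeeds while the port-forward (or a local frontend) is running.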