docs: add release-artifacts.md with comprehensive artifact inventory (#5619)

Signed-off-by: Dan Gil <dagil@nvidia.com>
Signed-off-by: Dmitry Tokarev <dtokarev@nvidia.com>
Signed-off-by: dagil-nvidia <dagil@nvidia.com>
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>

Commit 24807345 (unverified), authored Jan 26, 2026 by dagil-nvidia; committed by GitHub Jan 26, 2026
5 changed files
--- a/README.md
+++ b/README.md
@@ -61,7 +61,7 @@ Built in Rust for performance and Python for extensibility, Dynamo is fully open
 | [**Multimodal**](docs/multimodal/index.md)                           | ✅   | ✅     | ✅           |
 | [**Tool Calling**](docs/agents/tool-calling.md)                      | ✅   | ✅     | ✅           |
-> **[Full Feature Matrix →](feature-matrix.md)** — Detailed compatibility including LoRA, Request Migration, Speculative Decoding, and feature interactions.
+> **[Full Feature Matrix →](docs/reference/feature-matrix.md)** — Detailed compatibility including LoRA, Request Migration, Speculative Decoding, and feature interactions.
 ## Latest News

--- a/docs/index.rst
+++ b/docs/index.rst
@@ -43,6 +43,8 @@ Quickstart
   Quickstart <self>
   Installation <_sections/installation>
   Support Matrix <reference/support-matrix.md>
+   Feature Matrix <reference/feature-matrix.md>
+   Release Artifacts <reference/release-artifacts.md>
   Examples <_sections/examples>
 .. toctree::

--- a/feature-matrix.md
+++ b/feature-matrix.md
--- a/docs/reference/release-artifacts.md
+++ b/docs/reference/release-artifacts.md
+<!--
+SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES.
+All rights reserved.
+SPDX-License-Identifier: Apache-2.0
+-->
+# Dynamo Release Artifacts
+This document inventories all Dynamo release artifacts: container images, Python wheels, Helm charts, and Rust crates.
+> **See also:** [Support Matrix](support-matrix.md) for hardware and platform compatibility | [Feature Matrix](feature-matrix.md) for backend feature support
+> [!Note]
+> Release history in this document begins at v0.6.0, as we expect the majority of users to be on v0.6.0 or later.
+## Current Release: Dynamo v0.8.1
+- **GitHub Release:** [v0.8.1](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.1)
+- **Docs:** [v0.8.1](https://docs.nvidia.com/dynamo/archive/0.8.1/index.html)
+- **NGC Collection:** [ai-dynamo](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo)
+### Patch Release: v0.8.1.post1 (Jan 23, 2026)
+> [!Note]
+> **v0.8.1.post1** is a patch release for PyPI wheels and TRT-LLM container only. There is no GitHub release for this version. All other artifacts (vLLM/SGLang containers, Helm charts, Rust crates) remain at v0.8.1.
+| Artifact | Version | Change | Link |
+|----------|---------|--------|------|
+| `ai-dynamo` | `0.8.1.post1` | Updated TRT-LLM to `v1.2.0rc6.post2` | [PyPI](https://pypi.org/project/ai-dynamo/0.8.1.post1/) |
+| `ai-dynamo-runtime` | `0.8.1.post1` | Updated TRT-LLM to `v1.2.0rc6.post2` | [PyPI](https://pypi.org/project/ai-dynamo-runtime/0.8.1.post1/) |
+| `tensorrtllm-runtime` | `0.8.1.post1` | TRT-LLM `v1.2.0rc6.post2` | [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime?version=0.8.1.post1) |
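The split between `0.8.1` and `0.8.1.post1` described above can be sketched as a small lookup. This is a minimal illustration and not part of the commit; `current_version` and the constant names are hypothetical, and only the artifact names and version strings come from the table and note.

```python
# Hypothetical helper: names and structure are illustrative, not a Dynamo API.
BASE_RELEASE = "0.8.1"
PATCH_RELEASE = "0.8.1.post1"

# Artifacts that the patch-release table lists as republished at 0.8.1.post1.
PATCHED_ARTIFACTS = {"ai-dynamo", "ai-dynamo-runtime", "tensorrtllm-runtime"}


def current_version(artifact: str) -> str:
    """Return the newest published version of an artifact in the v0.8.1 line."""
    return PATCH_RELEASE if artifact in PATCHED_ARTIFACTS else BASE_RELEASE


if __name__ == "__main__":
    for name in ("ai-dynamo", "tensorrtllm-runtime", "vllm-runtime", "dynamo-crds"):
        print(f"{name}: {current_version(name)}")
```

Keeping the patched set explicit mirrors the note: anything not named there stays at the base release.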
+### Container Images
+| Image:Tag | Description | Backend | CUDA | Arch | NGC | Notes |
+|-----------|-------------|---------|------|------|-----|-------|
+| `vllm-runtime:0.8.1` | Runtime container for vLLM backend | vLLM `v0.12.0` | `v12.9` | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime?version=0.8.1) | |
+| `vllm-runtime:0.8.1-cuda13` | Runtime container for vLLM backend (CUDA 13) | vLLM `v0.12.0` | `v13.0` | AMD64/ARM64* | — | Fails to launch |
+| `sglang-runtime:0.8.1` | Runtime container for SGLang backend | SGLang `v0.5.6.post2` | `v12.9` | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime?version=0.8.1) | |
+| `sglang-runtime:0.8.1-cuda13` | Runtime container for SGLang backend (CUDA 13) | SGLang `v0.5.6.post2` | `v13.0` | AMD64/ARM64* | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime?version=0.8.1-cuda13) | Experimental |
+| `tensorrtllm-runtime:0.8.1` | Runtime container for TensorRT-LLM backend | TRT-LLM `v1.2.0rc6.post1` | `v13.0` | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime?version=0.8.1) | |
+| `dynamo-frontend:0.8.1` | API gateway with Endpoint Prediction Protocol (EPP) | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend?version=0.8.1) | |
+| `kubernetes-operator:0.8.1` | Kubernetes operator for Dynamo deployments | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator?version=0.8.1) | |
+> [!Note]
+> \* Multimodality is not expected to work on ARM64 for CUDA 13 images (`vllm-runtime:*-cuda13`, `sglang-runtime:*-cuda13`). Multimodal inference works on AMD64 for these images.
+### Python Wheels
+> [!Important]
+> We recommend using the TensorRT-LLM NGC container instead of the `ai-dynamo[trtllm]` wheel. See the [NGC container collection](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) for supported images.
+| Package | Description | Python | Platform | PyPI |
+|---------|-------------|--------|----------|------|
+| `ai-dynamo==0.8.1` | Main package with backend integrations (vLLM, SGLang, TRT-LLM) | `3.10`–`3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/ai-dynamo/0.8.1/) |
+| `ai-dynamo-runtime==0.8.1` | Core Python bindings for Dynamo runtime | `3.10`–`3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/ai-dynamo-runtime/0.8.1/) |
+| `kvbm==0.8.1` | KV Block Manager for disaggregated KV cache | `3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/kvbm/0.8.1/) |
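The Python and glibc constraints in the table above could be checked before installing. The following is a sketch under stated assumptions, not part of the commit: `wheel_supported` and `host_glibc` are hypothetical names, and the ranges are copied from the table (`kvbm` is listed for Python 3.12 only).

```python
import platform
import sys

# Supported Python ranges per package, copied from the Python Wheels table.
PYTHON_RANGE = {
    "ai-dynamo": ((3, 10), (3, 12)),
    "ai-dynamo-runtime": ((3, 10), (3, 12)),
    "kvbm": ((3, 12), (3, 12)),
}
MIN_GLIBC = (2, 28)


def wheel_supported(package, python_version, glibc_version):
    """Return True if the (major, minor) Python and glibc versions fit the table."""
    low, high = PYTHON_RANGE[package]
    python_ok = low <= tuple(python_version[:2]) <= high
    return python_ok and tuple(glibc_version) >= MIN_GLIBC


def host_glibc():
    """Best-effort glibc version of the running interpreter, or None if unknown."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None
    return tuple(int(part) for part in version.split(".")[:2])


if __name__ == "__main__":
    glibc = host_glibc()
    if glibc is None:
        print("glibc not detected; the wheels target Linux with glibc 2.28 or newer")
    else:
        for package in PYTHON_RANGE:
            print(package, wheel_supported(package, sys.version_info, glibc))
```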
+### Helm Charts
+| Chart | Description | NGC |
+|-------|-------------|-----|
+| `dynamo-crds-0.8.1` | Custom Resource Definitions for Dynamo Kubernetes resources | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-0.8.1.tgz) |
+| `dynamo-platform-0.8.1` | Platform services (etcd, NATS) for Dynamo cluster | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-0.8.1.tgz) |
+| `dynamo-graph-0.8.1` | Deployment graph controller for Dynamo workloads | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-graph-0.8.1.tgz) |
+### Rust Crates
+| Crate | Description | MSRV (Rust) | crates.io |
+|-------|-------------|-------------|-----------|
+| `dynamo-runtime@0.8.1` | Core distributed runtime library | `v1.82` | [link](https://crates.io/crates/dynamo-runtime/0.8.1) |
+| `dynamo-llm@0.8.1` | LLM inference engine | `v1.82` | [link](https://crates.io/crates/dynamo-llm/0.8.1) |
+| `dynamo-async-openai@0.8.1` | Async OpenAI-compatible API client | `v1.82` | [link](https://crates.io/crates/dynamo-async-openai/0.8.1) |
+| `dynamo-parsers@0.8.1` | Protocol parsers (SSE, JSON streaming) | `v1.82` | [link](https://crates.io/crates/dynamo-parsers/0.8.1) |
+| `dynamo-memory@0.8.1` | Memory management utilities | `v1.82` | [link](https://crates.io/crates/dynamo-memory/0.8.1) |
+| `dynamo-config@0.8.1` | Configuration management | `v1.82` | [link](https://crates.io/crates/dynamo-config/0.8.1) |
+## Quick Install Commands
+### Container Images (NGC)
+> For detailed run instructions, see the [Container README](../../container/README.md) or backend-specific guides: [vLLM](../backends/vllm/README.md) | [SGLang](../backends/sglang/README.md) | [TensorRT-LLM](../backends/trtllm/README.md)
+```bash
+# Runtime containers
+docker pull nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.8.1
+docker pull nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.8.1
+docker pull nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.8.1.post1
+# CUDA 13 variants (experimental)
+# vLLM CUDA 13 image fails to launch (known issue)
+# docker pull nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.8.1-cuda13
+docker pull nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.8.1-cuda13
+# Infrastructure containers
+docker pull nvcr.io/nvidia/ai-dynamo/dynamo-frontend:0.8.1
+docker pull nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.8.1
+```
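The pull commands above follow one naming pattern, which can be sketched as a helper for scripting. This is an illustration only and not part of the commit; `image_ref` is a hypothetical name, while the registry path and the `-cuda13` tag suffix are taken from the commands.

```python
# Registry path as it appears in the docker pull commands.
REGISTRY = "nvcr.io/nvidia/ai-dynamo"


def image_ref(image: str, version: str, cuda13: bool = False) -> str:
    """Build a full image reference; CUDA 13 variants add a '-cuda13' tag suffix."""
    tag = f"{version}-cuda13" if cuda13 else version
    return f"{REGISTRY}/{image}:{tag}"


if __name__ == "__main__":
    print(image_ref("vllm-runtime", "0.8.1"))
    print(image_ref("tensorrtllm-runtime", "0.8.1.post1"))
    print(image_ref("sglang-runtime", "0.8.1", cuda13=True))
```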
+### Python Wheels (PyPI)
+> For detailed installation instructions, see the [Local Quick Start](https://github.com/ai-dynamo/dynamo#local-quick-start) in the README.
+```bash
+# Install Dynamo with a specific backend (Recommended)
+uv pip install "ai-dynamo[vllm]==0.8.1.post1"
+uv pip install "ai-dynamo[sglang]==0.8.1.post1"
+# TensorRT-LLM requires the NVIDIA PyPI index and pip
+pip install --pre --extra-index-url https://pypi.nvidia.com "ai-dynamo[trtllm]==0.8.1.post1"
+# Install Dynamo core only
+uv pip install ai-dynamo==0.8.1.post1
+# Install standalone KVBM (Python 3.12 only)
+uv pip install kvbm==0.8.1
+```
+### Helm Charts (NGC)
+> For Kubernetes deployment instructions, see the [Kubernetes Installation Guide](../kubernetes/installation_guide.md).
+```bash
+helm install dynamo-crds oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds --version 0.8.1
+helm install dynamo-platform oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform --version 0.8.1
+helm install dynamo-graph oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-graph --version 0.8.1
+```
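The Helm Charts table links to `.tgz` archives, while the commands above use `oci://` references. A minimal sketch of how the two forms relate, not part of the commit: the function names are hypothetical and only the URL patterns come from the document.

```python
# Host and path shared by the chart links and the helm install commands.
CHART_HOST = "helm.ngc.nvidia.com/nvidia/ai-dynamo/charts"


def chart_oci_ref(chart: str) -> str:
    """OCI reference in the form passed to 'helm install'."""
    return f"oci://{CHART_HOST}/{chart}"


def chart_tgz_url(chart: str, version: str) -> str:
    """Direct archive URL in the form used by the Helm Charts table."""
    return f"https://{CHART_HOST}/{chart}-{version}.tgz"


def helm_install_args(chart: str, version: str) -> list:
    """Argument list equivalent to one of the install commands above."""
    return ["helm", "install", chart, chart_oci_ref(chart), "--version", version]


if __name__ == "__main__":
    for chart in ("dynamo-crds", "dynamo-platform", "dynamo-graph"):
        print(" ".join(helm_install_args(chart, "0.8.1")))
```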
+### Rust Crates (crates.io)
+> For API documentation, see each crate on [docs.rs](https://docs.rs/). To build Dynamo from source, see [Building from Source](https://github.com/ai-dynamo/dynamo#building-from-source).
+```bash
+cargo add dynamo-runtime@0.8.1
+cargo add dynamo-llm@0.8.1
+cargo add dynamo-async-openai@0.8.1
+cargo add dynamo-parsers@0.8.1
+cargo add dynamo-memory@0.8.1
+cargo add dynamo-config@0.8.1
+```
+## CUDA and Driver Requirements
+For detailed CUDA toolkit versions and minimum driver requirements for each container image, see the [Support Matrix](support-matrix.md#cuda-and-driver-requirements).
+## Known Issues
+For a complete list of known issues, refer to the release notes for each release:
+- [v0.8.0 Release Notes](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.0)
+- [v0.8.1 Release Notes](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.1)
+### Known Artifact Issues
+| Version | Artifact | Issue | Status |
+|---------|----------|-------|--------|
+| v0.8.1 | `vllm-runtime:0.8.1-cuda13` | Container fails to launch. | Known issue |
+| v0.8.1 | `sglang-runtime:0.8.1-cuda13`, `vllm-runtime:0.8.1-cuda13` | Multimodality not expected to work on ARM64. Works on AMD64. | Known limitation |
+| v0.8.0 | `sglang-runtime:0.8.0-cuda13` | CuDNN installation issue caused PyTorch `v2.9.1` compatibility problems with `nn.Conv3d`, resulting in performance degradation and excessive memory usage in multimodal workloads. | Fixed in v0.8.1 ([#5461](https://github.com/ai-dynamo/dynamo/pull/5461)) |
+---
+## Release History
+- **v0.8.1.post1 Patch**: Updated TRT-LLM to `v1.2.0rc6.post2` (PyPI wheels and TRT-LLM container only)
+- **Standalone Frontend Container**: `dynamo-frontend` added in v0.8.0
+- **CUDA 13 Runtimes**: Experimental CUDA 13 runtime for vLLM and SGLang in v0.8.0
+- **New Rust Crates**: `dynamo-memory` and `dynamo-config` added in v0.8.0
+### GitHub Releases
+| Version | Release Date | GitHub | Docs |
+|---------|--------------|--------|------|
+| `v0.8.1` | Jan 23, 2026 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.1) | [Docs](https://docs.nvidia.com/dynamo/archive/0.8.1/index.html) |
+| `v0.8.0` | Jan 15, 2026 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.0) | [Docs](https://docs.nvidia.com/dynamo/archive/0.8.0/index.html) |
+| `v0.7.1` | Dec 15, 2025 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.7.1) | [Docs](https://docs.nvidia.com/dynamo/archive/0.7.1/index.html) |
+| `v0.7.0` | Nov 26, 2025 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.7.0) | [Docs](https://docs.nvidia.com/dynamo/archive/0.7.0/index.html) |
+| `v0.6.1` | Nov 6, 2025 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.6.1) | [Docs](https://docs.nvidia.com/dynamo/archive/0.6.1/index.html) |
+| `v0.6.0` | Oct 28, 2025 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.6.0) | [Docs](https://docs.nvidia.com/dynamo/archive/0.6.0/index.html) |
+### Container Images
+#### vllm-runtime
+| Image:Tag | vLLM | Arch | CUDA | Notes |
+|-----------|------|------|------|-------|
+| `vllm-runtime:0.8.1` | `v0.12.0` | AMD64/ARM64 | `v12.9` | |
+| `vllm-runtime:0.8.0` | `v0.12.0` | AMD64/ARM64 | `v12.9` | |
+| `vllm-runtime:0.8.0-cuda13` | `v0.12.0` | AMD64/ARM64 | `v13.0` | Experimental |
+| `vllm-runtime:0.7.0.post2` | `v0.11.2` | AMD64/ARM64 | `v12.8` | Patch |
+| `vllm-runtime:0.7.1` | `v0.11.0` | AMD64/ARM64 | `v12.8` | |
+| `vllm-runtime:0.7.0.post1` | `v0.11.0` | AMD64/ARM64 | `v12.8` | Patch |
+| `vllm-runtime:0.7.0` | `v0.11.0` | AMD64/ARM64 | `v12.8` | |
+| `vllm-runtime:0.6.1.post1` | `v0.11.0` | AMD64/ARM64 | `v12.8` | Patch |
+| `vllm-runtime:0.6.1` | `v0.11.0` | AMD64/ARM64 | `v12.8` | |
+| `vllm-runtime:0.6.0` | `v0.11.0` | AMD64 | `v12.8` | |
+#### sglang-runtime
+| Image:Tag | SGLang | Arch | CUDA | Notes |
+|-----------|--------|------|------|-------|
+| `sglang-runtime:0.8.1` | `v0.5.6.post2` | AMD64/ARM64 | `v12.9` | |
+| `sglang-runtime:0.8.1-cuda13` | `v0.5.6.post2` | AMD64/ARM64 | `v13.0` | Experimental |
+| `sglang-runtime:0.8.0` | `v0.5.6.post2` | AMD64/ARM64 | `v12.9` | |
+| `sglang-runtime:0.8.0-cuda13` | `v0.5.6.post2` | AMD64/ARM64 | `v13.0` | Experimental |
+| `sglang-runtime:0.7.1` | `v0.5.4.post3` | AMD64/ARM64 | `v12.9` | |
+| `sglang-runtime:0.7.0.post1` | `v0.5.4.post3` | AMD64/ARM64 | `v12.9` | Patch |
+| `sglang-runtime:0.7.0` | `v0.5.4.post3` | AMD64/ARM64 | `v12.9` | |
+| `sglang-runtime:0.6.1.post1` | `v0.5.3.post2` | AMD64/ARM64 | `v12.9` | Patch |
+| `sglang-runtime:0.6.1` | `v0.5.3.post2` | AMD64/ARM64 | `v12.9` | |
+| `sglang-runtime:0.6.0` | `v0.5.3.post2` | AMD64 | `v12.8` | |
+#### tensorrtllm-runtime
+| Image:Tag | TRT-LLM | Arch | CUDA | Notes |
+|-----------|---------|------|------|-------|
+| `tensorrtllm-runtime:0.8.1.post1` | `v1.2.0rc6.post2` | AMD64/ARM64 | `v13.0` | Patch |
+| `tensorrtllm-runtime:0.8.1` | `v1.2.0rc6.post1` | AMD64/ARM64 | `v13.0` | |
+| `tensorrtllm-runtime:0.8.0` | `v1.2.0rc6.post1` | AMD64/ARM64 | `v13.0` | |
+| `tensorrtllm-runtime:0.7.0.post2` | `v1.2.0rc2` | AMD64/ARM64 | `v13.0` | Patch |
+| `tensorrtllm-runtime:0.7.1` | `v1.2.0rc3` | AMD64/ARM64 | `v13.0` | |
+| `tensorrtllm-runtime:0.7.0.post1` | `v1.2.0rc3` | AMD64/ARM64 | `v13.0` | Patch |
+| `tensorrtllm-runtime:0.7.0` | `v1.2.0rc2` | AMD64/ARM64 | `v13.0` | |
+| `tensorrtllm-runtime:0.6.1-cuda13` | `v1.2.0rc1` | AMD64/ARM64 | `v13.0` | Experimental |
+| `tensorrtllm-runtime:0.6.1.post1` | `v1.1.0rc5` | AMD64/ARM64 | `v12.9` | Patch |
+| `tensorrtllm-runtime:0.6.1` | `v1.1.0rc5` | AMD64/ARM64 | `v12.9` | |
+| `tensorrtllm-runtime:0.6.0` | `v1.1.0rc5` | AMD64/ARM64 | `v12.9` | |
+#### dynamo-frontend
+| Image:Tag | Arch | Notes |
+|-----------|------|-------|
+| `dynamo-frontend:0.8.1` | AMD64/ARM64 | |
+| `dynamo-frontend:0.8.0` | AMD64/ARM64 | Initial |
+#### kubernetes-operator
+| Image:Tag | Arch | Notes |
+|-----------|------|-------|
+| `kubernetes-operator:0.8.1` | AMD64/ARM64 | |
+| `kubernetes-operator:0.8.0` | AMD64/ARM64 | |
+| `kubernetes-operator:0.7.1` | AMD64/ARM64 | |
+| `kubernetes-operator:0.7.0.post1` | AMD64/ARM64 | Patch |
+| `kubernetes-operator:0.7.0` | AMD64/ARM64 | |
+| `kubernetes-operator:0.6.1` | AMD64/ARM64 | |
+| `kubernetes-operator:0.6.0` | AMD64/ARM64 | |
+### Python Wheels
+#### ai-dynamo (wheel)
+| Package | Python | Platform | Notes |
+|---------|--------|----------|-------|
+| `ai-dynamo==0.8.1.post1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | TRT-LLM `v1.2.0rc6.post2` |
+| `ai-dynamo==0.8.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo==0.8.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo==0.7.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo==0.7.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo==0.6.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo==0.6.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+#### ai-dynamo-runtime (wheel)
+| Package | Python | Platform | Notes |
+|---------|--------|----------|-------|
+| `ai-dynamo-runtime==0.8.1.post1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | TRT-LLM `v1.2.0rc6.post2` |
+| `ai-dynamo-runtime==0.8.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo-runtime==0.8.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo-runtime==0.7.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo-runtime==0.7.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo-runtime==0.6.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+| `ai-dynamo-runtime==0.6.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
+#### kvbm (wheel)
+| Package | Python | Platform | Notes |
+|---------|--------|----------|-------|
+| `kvbm==0.8.1` | `3.12` | Linux (glibc `v2.28+`) | |
+| `kvbm==0.8.0` | `3.12` | Linux (glibc `v2.28+`) | |
+| `kvbm==0.7.1` | `3.12` | Linux (glibc `v2.28+`) | |
+| `kvbm==0.7.0` | `3.12` | Linux (glibc `v2.28+`) | Initial |
+### Helm Charts
+#### dynamo-crds (Helm chart)
+| Chart | Notes |
+|-------|-------|
+| `dynamo-crds-0.8.1` | |
+| `dynamo-crds-0.8.0` | |
+| `dynamo-crds-0.7.1` | |
+| `dynamo-crds-0.7.0` | |
+| `dynamo-crds-0.6.1` | |
+| `dynamo-crds-0.6.0` | |
+#### dynamo-platform (Helm chart)
+| Chart | Notes |
+|-------|-------|
+| `dynamo-platform-0.8.1` | |
+| `dynamo-platform-0.8.0` | |
+| `dynamo-platform-0.7.1` | |
+| `dynamo-platform-0.7.0` | |
+| `dynamo-platform-0.6.1` | |
+| `dynamo-platform-0.6.0` | |
+#### dynamo-graph (Helm chart)
+| Chart | Notes |
+|-------|-------|
+| `dynamo-graph-0.8.1` | |
+| `dynamo-graph-0.8.0` | |
+| `dynamo-graph-0.7.1` | |
+| `dynamo-graph-0.7.0` | |
+| `dynamo-graph-0.6.1` | |
+| `dynamo-graph-0.6.0` | |
+### Rust Crates
+#### dynamo-runtime (crate)
+| Crate | MSRV (Rust) | Notes |
+|-------|-------------|-------|
+| `dynamo-runtime@0.8.1` | `v1.82` | |
+| `dynamo-runtime@0.8.0` | `v1.82` | |
+| `dynamo-runtime@0.7.1` | `v1.82` | |
+| `dynamo-runtime@0.7.0` | `v1.82` | |
+| `dynamo-runtime@0.6.1` | `v1.82` | |
+| `dynamo-runtime@0.6.0` | `v1.82` | |
+#### dynamo-llm (crate)
+| Crate | MSRV (Rust) | Notes |
+|-------|-------------|-------|
+| `dynamo-llm@0.8.1` | `v1.82` | |
+| `dynamo-llm@0.8.0` | `v1.82` | |
+| `dynamo-llm@0.7.1` | `v1.82` | |
+| `dynamo-llm@0.7.0` | `v1.82` | |
+| `dynamo-llm@0.6.1` | `v1.82` | |
+| `dynamo-llm@0.6.0` | `v1.82` | |
+#### dynamo-async-openai (crate)
+| Crate | MSRV (Rust) | Notes |
+|-------|-------------|-------|
+| `dynamo-async-openai@0.8.1` | `v1.82` | |
+| `dynamo-async-openai@0.8.0` | `v1.82` | |
+| `dynamo-async-openai@0.7.1` | `v1.82` | |
+| `dynamo-async-openai@0.7.0` | `v1.82` | |
+| `dynamo-async-openai@0.6.1` | `v1.82` | |
+| `dynamo-async-openai@0.6.0` | `v1.82` | |
+#### dynamo-parsers (crate)
+| Crate | MSRV (Rust) | Notes |
+|-------|-------------|-------|
+| `dynamo-parsers@0.8.1` | `v1.82` | |
+| `dynamo-parsers@0.8.0` | `v1.82` | |
+| `dynamo-parsers@0.7.1` | `v1.82` | |
+| `dynamo-parsers@0.7.0` | `v1.82` | |
+| `dynamo-parsers@0.6.1` | `v1.82` | |
+| `dynamo-parsers@0.6.0` | `v1.82` | |
+#### dynamo-memory (crate)
+| Crate | MSRV (Rust) | Notes |
+|-------|-------------|-------|
+| `dynamo-memory@0.8.1` | `v1.82` | |
+| `dynamo-memory@0.8.0` | `v1.82` | Initial |
+#### dynamo-config (crate)
+| Crate | MSRV (Rust) | Notes |
+|-------|-------------|-------|
+| `dynamo-config@0.8.1` | `v1.82` | |
+| `dynamo-config@0.8.0` | `v1.82` | Initial |
--- a/docs/reference/support-matrix.md
+++ b/docs/reference/support-matrix.md
@@ -8,7 +8,37 @@ SPDX-License-Identifier: Apache-2.0
 This document provides the support matrix for Dynamo, including hardware, software and build instructions.
-> **See also:** [Feature Compatibility Matrix](../../feature-matrix.md) for backend-specific feature support (vLLM, TensorRT-LLM, SGLang).
+> [!Note]
+> **See also:** [Release Artifacts](release-artifacts.md) for container images, wheels, Helm charts, and crates | [Feature Matrix](feature-matrix.md) for backend feature support
+## Backend Dependencies
+The following table shows the backend framework versions included with each Dynamo release:
+| **Dependency** | **main (ToT)** | **v0.8.1.post1** | **v0.8.1 (latest)** | **v0.8.0** | **v0.7.1** | **v0.7.0.post1** | **v0.7.0** |
+| :------------- | :------------- | :--------------- | :------------------ | :--------- | :--------- | :--------------- | :--------- |
+| vLLM           | `0.14.0`       | `0.12.0`         | `0.12.0`            | `0.12.0`   | `0.11.0`   | `0.11.0`         | `0.11.0`   |
+| SGLang         | `0.5.7`        | `0.5.6.post2`    | `0.5.6.post2`       | `0.5.6.post2` | `0.5.3.post4` | `0.5.3.post4` | `0.5.3.post4` |
+| TensorRT-LLM   | `1.2.0rc6.post2` | `1.2.0rc6.post2` | `1.2.0rc6.post1`  | `1.2.0rc6.post1` | `1.2.0rc3` | `1.2.0rc3`     | `1.2.0rc2` |
+| NIXL           | `0.9.0`        | `0.8.0`          | `0.8.0`             | `0.8.0`    | `0.8.0`    | `0.8.0`          | `0.8.0`    |
+> [!Note]
+> **main (ToT)** reflects the current development branch. **v0.8.1.post1** is a patch release for PyPI wheels and TRT-LLM container only (no GitHub release).
+> [!Important]
+> TensorRT-LLM does not currently support Python 3.11, so installing the `ai-dynamo[trtllm]` Python wheel fails on Python 3.11.
+| **Dynamo Version** | **SGLang**                | **TensorRT-LLM** | **vLLM**                 |
+| :----------------- | :------------------------ | :--------------- | :----------------------- |
+| **Dynamo 0.8.1**   | CUDA 12.9, CUDA 13.0 (🧪) | CUDA 13.0        | CUDA 12.9, CUDA 13.0 (🧪) |
+| **Dynamo 0.8.0**   | CUDA 12.9, CUDA 13.0 (🧪) | CUDA 13.0        | CUDA 12.9, CUDA 13.0 (🧪) |
+| **Dynamo 0.7.1**   | CUDA 12.8                 | CUDA 13.0        | CUDA 12.9                |
+| **Dynamo 0.7.0**   | CUDA 12.9                 | CUDA 13.0        | CUDA 12.8                |
+> [!Note]
+> Patch versions (e.g., v0.8.1.post1, v0.7.0.post1) have the same CUDA support as their base version.
+For detailed artifact versions and NGC links (including container images, Python wheels, Helm charts, and Rust crates), see the [Release Artifacts](release-artifacts.md) page.
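The rule that patch versions share their base version's CUDA support can be sketched as a lookup. This is a hedged illustration, not part of the commit: `base_version` and `cuda_versions` are hypothetical names, and the data is transcribed from the CUDA table above, with experimental entries included without distinction.

```python
import re

# CUDA versions per Dynamo release and backend, transcribed from the table above.
_CUDA_0_8 = {"SGLang": ["12.9", "13.0"], "TensorRT-LLM": ["13.0"], "vLLM": ["12.9", "13.0"]}
CUDA_SUPPORT = {
    "0.8.1": _CUDA_0_8,
    "0.8.0": _CUDA_0_8,
    "0.7.1": {"SGLang": ["12.8"], "TensorRT-LLM": ["13.0"], "vLLM": ["12.9"]},
    "0.7.0": {"SGLang": ["12.9"], "TensorRT-LLM": ["13.0"], "vLLM": ["12.8"]},
}


def base_version(version: str) -> str:
    """Drop a leading 'v' and a trailing '.postN' suffix, if present."""
    return re.sub(r"\.post\d+$", "", version.lstrip("v"))


def cuda_versions(version: str, backend: str) -> list:
    """CUDA versions for a release; patch releases resolve to their base version."""
    return CUDA_SUPPORT[base_version(version)][backend]


if __name__ == "__main__":
    print(cuda_versions("v0.8.1.post1", "TensorRT-LLM"))
    print(cuda_versions("0.7.0.post1", "vLLM"))
```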
 ## Hardware Compatibility
@@ -17,6 +47,7 @@ This document provides the support matrix for Dynamo, including hardware, softwa
 | **x86_64**           | Supported    |
 | **ARM64**            | Supported    |
+Dynamo provides multi-arch container images supporting both AMD64 (x86_64) and ARM64 architectures. See [Release Artifacts](release-artifacts.md) for available images.
 ### GPU Compatibility
@@ -41,7 +72,7 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
 | **CentOS Stream**    | 9           | x86_64           | Experimental |
 > [!Note]
-> Wheels are built using a manylinux_2_28-compatible environment and they have been validated on CentOS 9 and Ubuntu (22.04, 24.04).
+> Wheels are built using a manylinux_2_28-compatible environment and they have been validated on CentOS Stream 9 and Ubuntu (22.04, 24.04).
 >
 > Compatibility with other Linux distributions is expected but has not been officially verified yet.
@@ -50,39 +81,44 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
 ## Software Compatibility
-### Runtime Dependency
-| **Python Package** | **Version** | glibc version                         | CUDA Version |
-| :----------------- | :---------- | :------------------------------------ | :----------- |
-| ai-dynamo          | 0.8.0       | >=2.28                                |              |
-| ai-dynamo-runtime  | 0.8.0       | >=2.28 (Python 3.12 has known issues) |              |
-| NIXL               | 0.9.0       | >=2.27                                | >=11.8       |
-### Build Dependency
-The following table shows the dependency versions included with each Dynamo release:
-| **Dependency** | **main (ToT)** | **v0.8.0** | **v0.7.1** | **v0.7.0.post1** | **v0.7.0** |
-| :------------- | :------------- | :--------- | :--------- | :--------------- | :--------- |
-| SGLang         | 0.5.7          | 0.5.6.post2 | 0.5.3.post4| 0.5.3.post4      | 0.5.3.post4|
-| TensorRT-LLM   | 1.2.0rc6.post2 | 1.2.0rc6.post1 | 1.2.0rc3   | 1.2.0rc3         | 1.2.0rc2   |
-| vLLM           | 0.14.0         | 0.12.0     | 0.11.0     | 0.11.0           | 0.11.0     |
-| NIXL           | 0.9.0          | 0.8.0      | 0.8.0      | 0.8.0            | 0.8.0      |
+### CUDA and Driver Requirements
+Dynamo container images include CUDA toolkit libraries. The host machine must have a compatible NVIDIA GPU driver installed.
+| Dynamo Version | Backend | CUDA Toolkit | Min Driver (Linux) | Min Driver (Windows) | Notes |
+| :--- | :--- | :--- | :--- | :--- | :--- |
+| **0.8.1** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
+| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
+| | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
+| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
+| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
+| **0.8.0** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
+| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
+| | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
+| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
+| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
+| **0.7.1** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
+| | **SGLang** | 12.8 | 570.xx+ | 571.xx+ | |
+| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
+| **0.7.0** | **vLLM** | 12.8 | 570.xx+ | 571.xx+ | |
+| | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
+| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
 > [!Note]
-> **main (ToT)** reflects the current development branch.
+> Experimental CUDA 13 images are not published for all versions. Check [Release Artifacts](release-artifacts.md) for availability.
-> [!Important]
-> Specific versions of TensorRT-LLM supported by Dynamo are subject to change. Currently TensorRT-LLM does not support Python 3.11 so installation of the ai-dynamo[trtllm] will fail.
-### CUDA Support by Framework
-| **Dynamo Version**   | **SGLang**                        | **TensorRT-LLM**        | **vLLM**                          |
-| :------------------- | :-------------------------------- | :-----------------------| :-------------------------------- |
-| **Dynamo 0.8.0**     | CUDA 12.9, CUDA 13.0 (🧪)         | CUDA 13.0               | CUDA 12.9, CUDA 13.0 (🧪)         |
-| **Dynamo 0.7.1**     | CUDA 12.8                         | CUDA 13.0               | CUDA 12.9                         |
-> 🧪 = Experimental
+#### CUDA Compatibility Resources
+For detailed information on CUDA driver compatibility, forward compatibility, and troubleshooting:
+- [CUDA Compatibility Overview](https://docs.nvidia.com/deploy/cuda-compatibility/)
+- [Why CUDA Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/why-cuda-compatibility.html)
+- [Minor Version Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/minor-version-compatibility.html)
+- [Forward Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/forward-compatibility.html)
+- [FAQ](https://docs.nvidia.com/deploy/cuda-compatibility/frequently-asked-questions.html)
+> [!Tip]
+> For extended driver compatibility beyond the minimum versions listed above, consider using `cuda-compat` packages on the host. See [Forward Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/forward-compatibility.html) for details.
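The minimum Linux driver column in the new CUDA and Driver Requirements table could be applied as a simple host check. The sketch below is not part of the commit: `driver_ok` is a hypothetical name, the mapping is transcribed from the table, and it assumes that "575.xx+" means any driver whose major branch number is 575 or higher. It does not account for `cuda-compat` forward compatibility.

```python
# Minimum Linux driver branch per CUDA toolkit version, from the table's
# "Min Driver (Linux)" column (570.xx+, 575.xx+, 580.xx+).
MIN_LINUX_DRIVER = {"12.8": 570, "12.9": 575, "13.0": 580}


def driver_ok(cuda_toolkit: str, driver_version: str) -> bool:
    """Return True if the driver's major branch meets the table's minimum.

    Assumes a dotted driver string such as "575.57.08"; only the first
    component is compared.
    """
    major = int(driver_version.split(".")[0])
    return major >= MIN_LINUX_DRIVER[cuda_toolkit]


if __name__ == "__main__":
    # Example driver string; replace with the version reported on the host.
    for toolkit in ("12.8", "12.9", "13.0"):
        print(toolkit, driver_ok(toolkit, "575.57.08"))
```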
 ## Cloud Service Provider Compatibility
@@ -90,27 +126,31 @@ The following table shows the dependency versions included with each Dynamo rele
 | **Host Operating System** | **Version** | **Architecture** | **Status** |
 | :------------------------ | :---------- | :--------------- | :--------- |
-| **Amazon Linux**          | 2023        | x86_64           | Supported¹ |
+| **Amazon Linux**          | 2023        | x86_64           | Supported  |
 > [!Caution]
-> There is a known issue with the TensorRT-LLM framework when running the AL2023 container locally with `docker run --network host ...` due to a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid this issue, replace the `--network host` flag with more precise networking configuration by mapping only the necessary ports (e.g., 4222 for nats, 2379/2380 for etcd, 8000 for frontend).
+> **AL2023 TensorRT-LLM Limitation:** There is a known issue with the TensorRT-LLM framework when running the AL2023 container locally with `docker run --network host ...` due to a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid this issue, replace the `--network host` flag with more precise networking configuration by mapping only the necessary ports (e.g., 4222 for nats, 2379/2380 for etcd, 8000 for frontend).
 ## Build Support
+> [!Note]
+> For version-specific artifact details, installation commands, and release history, see [Release Artifacts](release-artifacts.md).
 **Dynamo** currently provides build support in the following ways:
 - **Wheels**: We distribute Python wheels of Dynamo and KV Block Manager:
  - [ai-dynamo](https://pypi.org/project/ai-dynamo/)
  - [ai-dynamo-runtime](https://pypi.org/project/ai-dynamo-runtime/)
-  - **New as of Dynamo v0.7.0:** [kvbm](https://pypi.org/project/kvbm/) as a standalone implementation.
+  - [kvbm](https://pypi.org/project/kvbm/) as a standalone implementation.
-- **Dynamo Runtime Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Runtime for each of the LLM inference frameworks on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
-  - [SGLang](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime)
-  - [TensorRT-LLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime)
-  - [vLLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)
-- **Dynamo Kubernetes Operator Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Operator on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
-  - [kubernetes-operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator) to simplify deployments of Dynamo Graphs.
+- **Dynamo Container Images**: We distribute multi-arch images (x86 & ARM64 compatible) on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
+  - [Dynamo Frontend](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend) *(New in v0.8.0)*
+  - [SGLang Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime)
+  - [SGLang Runtime (CUDA 13)](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime-cu13)
+  - [TensorRT-LLM Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime)
+  - [vLLM Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)
+  - [vLLM Runtime (CUDA 13)](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime-cu13)
+  - [Kubernetes Operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator)
 - **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the helm charts supporting Kubernetes deployments of Dynamo:
  - [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds)
@@ -119,8 +159,10 @@ The following table shows the dependency versions included with each Dynamo rele
 - **Rust Crates**:
  - [dynamo-runtime](https://crates.io/crates/dynamo-runtime/)
+  - [dynamo-llm](https://crates.io/crates/dynamo-llm/)
  - [dynamo-async-openai](https://crates.io/crates/dynamo-async-openai/)
  - [dynamo-parsers](https://crates.io/crates/dynamo-parsers/)
-  - [dynamo-llm](https://crates.io/crates/dynamo-llm/)
+  - [dynamo-config](https://crates.io/crates/dynamo-config/) *(New in v0.8.0)*
+  - [dynamo-memory](https://crates.io/crates/dynamo-memory/) *(New in v0.8.0)*
 Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the [Local Quick Start](https://github.com/ai-dynamo/dynamo/blob/main/README.md#local-quick-start) in the README.