Commit 24807345 authored by dagil-nvidia, committed by GitHub

docs: add release-artifacts.md with comprehensive artifact inventory (#5619)


Signed-off-by: Dan Gil <dagil@nvidia.com>
Signed-off-by: Dmitry Tokarev <dtokarev@nvidia.com>
Signed-off-by: dagil-nvidia <dagil@nvidia.com>
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
parent 11d9cdfb
...@@ -61,7 +61,7 @@ Built in Rust for performance and Python for extensibility, Dynamo is fully open ...@@ -61,7 +61,7 @@ Built in Rust for performance and Python for extensibility, Dynamo is fully open
| [**Multimodal**](docs/multimodal/index.md) | ✅ | ✅ | ✅ | | [**Multimodal**](docs/multimodal/index.md) | ✅ | ✅ | ✅ |
| [**Tool Calling**](docs/agents/tool-calling.md) | ✅ | ✅ | ✅ | | [**Tool Calling**](docs/agents/tool-calling.md) | ✅ | ✅ | ✅ |
> **[Full Feature Matrix →](feature-matrix.md)** — Detailed compatibility including LoRA, Request Migration, Speculative Decoding, and feature interactions. > **[Full Feature Matrix →](docs/reference/feature-matrix.md)** — Detailed compatibility including LoRA, Request Migration, Speculative Decoding, and feature interactions.
## Latest News ## Latest News
......
...@@ -43,6 +43,8 @@ Quickstart ...@@ -43,6 +43,8 @@ Quickstart
Quickstart <self> Quickstart <self>
Installation <_sections/installation> Installation <_sections/installation>
Support Matrix <reference/support-matrix.md> Support Matrix <reference/support-matrix.md>
Feature Matrix <reference/feature-matrix.md>
Release Artifacts <reference/release-artifacts.md>
Examples <_sections/examples> Examples <_sections/examples>
.. toctree:: .. toctree::
......
<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES.
All rights reserved.
SPDX-License-Identifier: Apache-2.0
-->
# Dynamo Release Artifacts
This document inventories all Dynamo release artifacts: container images, Python wheels, Helm charts, and Rust crates.
> **See also:** [Support Matrix](support-matrix.md) for hardware and platform compatibility | [Feature Matrix](feature-matrix.md) for backend feature support
> [!Note]
> Release history in this document begins at v0.6.0, as we expect the majority of users to be on v0.6.0 or later.
## Current Release: Dynamo v0.8.1
- **GitHub Release:** [v0.8.1](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.1)
- **Docs:** [v0.8.1](https://docs.nvidia.com/dynamo/archive/0.8.1/index.html)
- **NGC Collection:** [ai-dynamo](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo)
### Patch Release: v0.8.1.post1 (Jan 23, 2026)
> [!Note]
> **v0.8.1.post1** is a patch release for PyPI wheels and TRT-LLM container only. There is no GitHub release for this version. All other artifacts (vLLM/SGLang containers, Helm charts, Rust crates) remain at v0.8.1.
| Artifact | Version | Change | Link |
|----------|---------|--------|------|
| `ai-dynamo` | `0.8.1.post1` | Updated TRT-LLM to `v1.2.0rc6.post2` | [PyPI](https://pypi.org/project/ai-dynamo/0.8.1.post1/) |
| `ai-dynamo-runtime` | `0.8.1.post1` | Updated TRT-LLM to `v1.2.0rc6.post2` | [PyPI](https://pypi.org/project/ai-dynamo-runtime/0.8.1.post1/) |
| `tensorrtllm-runtime` | `0.8.1.post1` | TRT-LLM `v1.2.0rc6.post2` | [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime?version=0.8.1.post1) |
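Because only some artifacts carry the `.post1` suffix, scripts that pin versions may need to map a patch version back to its base release. A minimal sketch, assuming versions follow the `X.Y.Z.postN` pattern used in this document; the helper name `base_version` is hypothetical and not part of Dynamo:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Dynamo): strip a ".postN" suffix to
# recover the base release that unpatched artifacts still use.
base_version() {
  local version="$1"
  # Remove the shortest trailing match of ".post<anything>", if present.
  printf '%s\n' "${version%.post*}"
}

base_version "0.8.1.post1"   # patched wheel or TRT-LLM container version
base_version "0.8.1"         # already a base version; printed unchanged
```

For example, a deployment script could install `ai-dynamo==0.8.1.post1` while passing `$(base_version 0.8.1.post1)` as the Helm chart version.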
### Container Images
| Image:Tag | Description | Backend | CUDA | Arch | NGC | Notes |
|-----------|-------------|---------|------|------|-----|-------|
| `vllm-runtime:0.8.1` | Runtime container for vLLM backend | vLLM `v0.12.0` | `v12.9` | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime?version=0.8.1) | |
| `vllm-runtime:0.8.1-cuda13` | Runtime container for vLLM backend (CUDA 13) | vLLM `v0.12.0` | `v13.0` | AMD64/ARM64* | — | Fails to launch |
| `sglang-runtime:0.8.1` | Runtime container for SGLang backend | SGLang `v0.5.6.post2` | `v12.9` | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime?version=0.8.1) | |
| `sglang-runtime:0.8.1-cuda13` | Runtime container for SGLang backend (CUDA 13) | SGLang `v0.5.6.post2` | `v13.0` | AMD64/ARM64* | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime?version=0.8.1-cuda13) | Experimental |
| `tensorrtllm-runtime:0.8.1` | Runtime container for TensorRT-LLM backend | TRT-LLM `v1.2.0rc6.post1` | `v13.0` | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime?version=0.8.1) | |
| `dynamo-frontend:0.8.1` | API gateway with Endpoint Prediction Protocol (EPP) | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend?version=0.8.1) | |
| `kubernetes-operator:0.8.1` | Kubernetes operator for Dynamo deployments | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator?version=0.8.1) | |
> [!Note]
> \* Multimodality is not expected to work on ARM64 for CUDA 13 images (`vllm-runtime:*-cuda13`, `sglang-runtime:*-cuda13`). Multimodal inference works on AMD64 for these images.
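The `Image:Tag` values in the table combine with the NGC registry path to form a pullable reference. The sketch below only prints the references; it assumes the `nvcr.io/nvidia/ai-dynamo` prefix used by the pull commands in this document, and `image_ref` is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Registry path taken from the docker pull commands in this document.
NGC_REGISTRY="nvcr.io/nvidia/ai-dynamo"

# Hypothetical helper: build a full image reference from image name and tag.
image_ref() {
  local image="$1" tag="$2"
  printf '%s/%s:%s\n' "$NGC_REGISTRY" "$image" "$tag"
}

# Print the references for the three backend runtime images at 0.8.1.
for image in vllm-runtime sglang-runtime tensorrtllm-runtime; do
  image_ref "$image" "0.8.1"
done
```

Passing the output to `docker pull` fetches the image, for example `docker pull "$(image_ref sglang-runtime 0.8.1-cuda13)"`.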
### Python Wheels
> [!Important]
> We recommend using the TensorRT-LLM NGC container instead of the `ai-dynamo[trtllm]` wheel. See the [NGC container collection](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) for supported images.
| Package | Description | Python | Platform | PyPI |
|---------|-------------|--------|----------|------|
| `ai-dynamo==0.8.1` | Main package with backend integrations (vLLM, SGLang, TRT-LLM) | `3.10`–`3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/ai-dynamo/0.8.1/) |
| `ai-dynamo-runtime==0.8.1` | Core Python bindings for Dynamo runtime | `3.10`–`3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/ai-dynamo-runtime/0.8.1/) |
| `kvbm==0.8.1` | KV Block Manager for disaggregated KV cache | `3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/kvbm/0.8.1/) |
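Before installing a wheel, it can help to confirm that the host meets the Python and glibc requirements in the table. The following is a rough preflight sketch, not an official check: the helper names are hypothetical, it relies on GNU `sort -V`, and it expects `major.minor` version strings.

```bash
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B (GNU sort -V ordering).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# python_ok X.Y: succeeds when X.Y lies in the 3.10 to 3.12 range listed
# for ai-dynamo and ai-dynamo-runtime (kvbm lists 3.12 only).
python_ok() {
  version_ge "$1" "3.10" && version_ge "3.12" "$1"
}

# glibc_ok X.Y: succeeds when glibc is at least 2.28.
glibc_ok() {
  version_ge "$1" "2.28"
}

# Detect host versions; either may be empty if the tool is unavailable.
py="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null || true)"
glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"

if [ -n "$py" ] && python_ok "$py"; then
  echo "Python $py: OK"
else
  echo "Python ${py:-not found}: outside 3.10 to 3.12"
fi
if [ -n "$glibc" ] && glibc_ok "$glibc"; then
  echo "glibc $glibc: OK"
else
  echo "glibc ${glibc:-not detected}: below 2.28 or unknown"
fi
```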
### Helm Charts
| Chart | Description | NGC |
|-------|-------------|-----|
| `dynamo-crds-0.8.1` | Custom Resource Definitions for Dynamo Kubernetes resources | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-0.8.1.tgz) |
| `dynamo-platform-0.8.1` | Platform services (etcd, NATS) for Dynamo cluster | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-0.8.1.tgz) |
| `dynamo-graph-0.8.1` | Deployment graph controller for Dynamo workloads | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-graph-0.8.1.tgz) |
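The chart links above share one URL pattern, so the archive location can be derived from the chart name and version. A small sketch that only prints the URLs; the pattern is inferred from the links in this table, and `chart_url` is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the .tgz URL for a chart, following the
# pattern of the NGC links in the table above.
chart_url() {
  local chart="$1" version="$2"
  printf 'https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/%s-%s.tgz\n' "$chart" "$version"
}

# Print the archive URLs for the three charts at 0.8.1.
for chart in dynamo-crds dynamo-platform dynamo-graph; do
  chart_url "$chart" "0.8.1"
done
```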
### Rust Crates
| Crate | Description | MSRV (Rust) | crates.io |
|-------|-------------|-------------|-----------|
| `dynamo-runtime@0.8.1` | Core distributed runtime library | `v1.82` | [link](https://crates.io/crates/dynamo-runtime/0.8.1) |
| `dynamo-llm@0.8.1` | LLM inference engine | `v1.82` | [link](https://crates.io/crates/dynamo-llm/0.8.1) |
| `dynamo-async-openai@0.8.1` | Async OpenAI-compatible API client | `v1.82` | [link](https://crates.io/crates/dynamo-async-openai/0.8.1) |
| `dynamo-parsers@0.8.1` | Protocol parsers (SSE, JSON streaming) | `v1.82` | [link](https://crates.io/crates/dynamo-parsers/0.8.1) |
| `dynamo-memory@0.8.1` | Memory management utilities | `v1.82` | [link](https://crates.io/crates/dynamo-memory/0.8.1) |
| `dynamo-config@0.8.1` | Configuration management | `v1.82` | [link](https://crates.io/crates/dynamo-config/0.8.1) |
## Quick Install Commands
### Container Images (NGC)
> For detailed run instructions, see the [Container README](../../container/README.md) or backend-specific guides: [vLLM](../backends/vllm/README.md) | [SGLang](../backends/sglang/README.md) | [TensorRT-LLM](../backends/trtllm/README.md)
```bash
# Runtime containers
docker pull nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.8.1
docker pull nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.8.1
docker pull nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.8.1.post1
# CUDA 13 variants (experimental)
# vLLM CUDA 13 image fails to launch (known issue)
# docker pull nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.8.1-cuda13
docker pull nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.8.1-cuda13
# Infrastructure containers
docker pull nvcr.io/nvidia/ai-dynamo/dynamo-frontend:0.8.1
docker pull nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.8.1
```
### Python Wheels (PyPI)
> For detailed installation instructions, see the [Local Quick Start](https://github.com/ai-dynamo/dynamo#local-quick-start) in the README.
```bash
# Install Dynamo with a specific backend (Recommended)
uv pip install "ai-dynamo[vllm]==0.8.1.post1"
uv pip install "ai-dynamo[sglang]==0.8.1.post1"
# TensorRT-LLM requires the NVIDIA PyPI index and pip
pip install --pre --extra-index-url https://pypi.nvidia.com "ai-dynamo[trtllm]==0.8.1.post1"
# Install Dynamo core only
uv pip install ai-dynamo==0.8.1.post1
# Install standalone KVBM (Python 3.12 only)
uv pip install kvbm==0.8.1
```
### Helm Charts (NGC)
> For Kubernetes deployment instructions, see the [Kubernetes Installation Guide](../kubernetes/installation_guide.md).
```bash
helm install dynamo-crds oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds --version 0.8.1
helm install dynamo-platform oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform --version 0.8.1
helm install dynamo-graph oci://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-graph --version 0.8.1
```
### Rust Crates (crates.io)
> For API documentation, see each crate on [docs.rs](https://docs.rs/). To build Dynamo from source, see [Building from Source](https://github.com/ai-dynamo/dynamo#building-from-source).
```bash
cargo add dynamo-runtime@0.8.1
cargo add dynamo-llm@0.8.1
cargo add dynamo-async-openai@0.8.1
cargo add dynamo-parsers@0.8.1
cargo add dynamo-memory@0.8.1
cargo add dynamo-config@0.8.1
```
## CUDA and Driver Requirements
For detailed CUDA toolkit versions and minimum driver requirements for each container image, see the [Support Matrix](support-matrix.md#cuda-and-driver-requirements).
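As a quick host-side check, the installed driver version can be compared against the minimum Linux driver listed in the Support Matrix for the image's CUDA toolkit (570 for CUDA 12.8, 575 for 12.9, 580 for 13.0). This is a sketch under those assumptions: the helper names are hypothetical, only the driver's major version is compared, and the Support Matrix remains the authoritative source.

```bash
#!/usr/bin/env bash
# Minimum Linux driver major version per CUDA toolkit, copied from the
# Support Matrix. Returns non-zero for toolkits not listed there.
min_driver_for_cuda() {
  case "$1" in
    12.8) echo 570 ;;
    12.9) echo 575 ;;
    13.0) echo 580 ;;
    *) return 1 ;;
  esac
}

# driver_ok <driver_version> <cuda_version>: succeeds when the driver's
# major version meets the minimum for that CUDA toolkit.
driver_ok() {
  local major="${1%%.*}" min
  min="$(min_driver_for_cuda "$2")" || return 2
  [ "$major" -ge "$min" ]
}

# Check the first GPU's driver against CUDA 12.9 when nvidia-smi exists.
if command -v nvidia-smi >/dev/null 2>&1; then
  driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  if driver_ok "$driver" 12.9; then
    echo "Driver $driver meets the CUDA 12.9 minimum"
  else
    echo "Driver $driver is below the CUDA 12.9 minimum (575)"
  fi
else
  echo "nvidia-smi not found; skipping driver check"
fi
```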
## Known Issues
For a complete list of known issues, refer to the release notes for each release:
- [v0.8.0 Release Notes](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.0)
- [v0.8.1 Release Notes](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.1)
### Known Artifact Issues
| Version | Artifact | Issue | Status |
|---------|----------|-------|--------|
| v0.8.1 | `vllm-runtime:0.8.1-cuda13` | Container fails to launch. | Known issue |
| v0.8.1 | `sglang-runtime:0.8.1-cuda13`, `vllm-runtime:0.8.1-cuda13` | Multimodality not expected to work on ARM64. Works on AMD64. | Known limitation |
| v0.8.0 | `sglang-runtime:0.8.0-cuda13` | CuDNN installation issue caused PyTorch `v2.9.1` compatibility problems with `nn.Conv3d`, resulting in performance degradation and excessive memory usage in multimodal workloads. | Fixed in v0.8.1 ([#5461](https://github.com/ai-dynamo/dynamo/pull/5461)) |
---
## Release History
- **v0.8.1.post1 Patch**: Updated TRT-LLM to `v1.2.0rc6.post2` (PyPI wheels and TRT-LLM container only)
- **Standalone Frontend Container**: `dynamo-frontend` added in v0.8.0
- **CUDA 13 Runtimes**: Experimental CUDA 13 runtime for vLLM and SGLang in v0.8.0
- **New Rust Crates**: `dynamo-memory` and `dynamo-config` added in v0.8.0
### GitHub Releases
| Version | Release Date | GitHub | Docs |
|---------|--------------|--------|------|
| `v0.8.1` | Jan 23, 2026 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.1) | [Docs](https://docs.nvidia.com/dynamo/archive/0.8.1/index.html) |
| `v0.8.0` | Jan 15, 2026 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.8.0) | [Docs](https://docs.nvidia.com/dynamo/archive/0.8.0/index.html) |
| `v0.7.1` | Dec 15, 2025 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.7.1) | [Docs](https://docs.nvidia.com/dynamo/archive/0.7.1/index.html) |
| `v0.7.0` | Nov 26, 2025 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.7.0) | [Docs](https://docs.nvidia.com/dynamo/archive/0.7.0/index.html) |
| `v0.6.1` | Nov 6, 2025 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.6.1) | [Docs](https://docs.nvidia.com/dynamo/archive/0.6.1/index.html) |
| `v0.6.0` | Oct 28, 2025 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v0.6.0) | [Docs](https://docs.nvidia.com/dynamo/archive/0.6.0/index.html) |
### Container Images
#### vllm-runtime
| Image:Tag | vLLM | Arch | CUDA | Notes |
|-----------|------|------|------|-------|
| `vllm-runtime:0.8.1` | `v0.12.0` | AMD64/ARM64 | `v12.9` | |
| `vllm-runtime:0.8.0` | `v0.12.0` | AMD64/ARM64 | `v12.9` | |
| `vllm-runtime:0.8.0-cuda13` | `v0.12.0` | AMD64/ARM64 | `v13.0` | Experimental |
| `vllm-runtime:0.7.0.post2` | `v0.11.2` | AMD64/ARM64 | `v12.8` | Patch |
| `vllm-runtime:0.7.1` | `v0.11.0` | AMD64/ARM64 | `v12.8` | |
| `vllm-runtime:0.7.0.post1` | `v0.11.0` | AMD64/ARM64 | `v12.8` | Patch |
| `vllm-runtime:0.7.0` | `v0.11.0` | AMD64/ARM64 | `v12.8` | |
| `vllm-runtime:0.6.1.post1` | `v0.11.0` | AMD64/ARM64 | `v12.8` | Patch |
| `vllm-runtime:0.6.1` | `v0.11.0` | AMD64/ARM64 | `v12.8` | |
| `vllm-runtime:0.6.0` | `v0.11.0` | AMD64 | `v12.8` | |
#### sglang-runtime
| Image:Tag | SGLang | Arch | CUDA | Notes |
|-----------|--------|------|------|-------|
| `sglang-runtime:0.8.1` | `v0.5.6.post2` | AMD64/ARM64 | `v12.9` | |
| `sglang-runtime:0.8.1-cuda13` | `v0.5.6.post2` | AMD64/ARM64 | `v13.0` | Experimental |
| `sglang-runtime:0.8.0` | `v0.5.6.post2` | AMD64/ARM64 | `v12.9` | |
| `sglang-runtime:0.8.0-cuda13` | `v0.5.6.post2` | AMD64/ARM64 | `v13.0` | Experimental |
| `sglang-runtime:0.7.1` | `v0.5.4.post3` | AMD64/ARM64 | `v12.9` | |
| `sglang-runtime:0.7.0.post1` | `v0.5.4.post3` | AMD64/ARM64 | `v12.9` | Patch |
| `sglang-runtime:0.7.0` | `v0.5.4.post3` | AMD64/ARM64 | `v12.9` | |
| `sglang-runtime:0.6.1.post1` | `v0.5.3.post2` | AMD64/ARM64 | `v12.9` | Patch |
| `sglang-runtime:0.6.1` | `v0.5.3.post2` | AMD64/ARM64 | `v12.9` | |
| `sglang-runtime:0.6.0` | `v0.5.3.post2` | AMD64 | `v12.8` | |
#### tensorrtllm-runtime
| Image:Tag | TRT-LLM | Arch | CUDA | Notes |
|-----------|---------|------|------|-------|
| `tensorrtllm-runtime:0.8.1.post1` | `v1.2.0rc6.post2` | AMD64/ARM64 | `v13.0` | Patch |
| `tensorrtllm-runtime:0.8.1` | `v1.2.0rc6.post1` | AMD64/ARM64 | `v13.0` | |
| `tensorrtllm-runtime:0.8.0` | `v1.2.0rc6.post1` | AMD64/ARM64 | `v13.0` | |
| `tensorrtllm-runtime:0.7.0.post2` | `v1.2.0rc2` | AMD64/ARM64 | `v13.0` | Patch |
| `tensorrtllm-runtime:0.7.1` | `v1.2.0rc3` | AMD64/ARM64 | `v13.0` | |
| `tensorrtllm-runtime:0.7.0.post1` | `v1.2.0rc3` | AMD64/ARM64 | `v13.0` | Patch |
| `tensorrtllm-runtime:0.7.0` | `v1.2.0rc2` | AMD64/ARM64 | `v13.0` | |
| `tensorrtllm-runtime:0.6.1-cuda13` | `v1.2.0rc1` | AMD64/ARM64 | `v13.0` | Experimental |
| `tensorrtllm-runtime:0.6.1.post1` | `v1.1.0rc5` | AMD64/ARM64 | `v12.9` | Patch |
| `tensorrtllm-runtime:0.6.1` | `v1.1.0rc5` | AMD64/ARM64 | `v12.9` | |
| `tensorrtllm-runtime:0.6.0` | `v1.1.0rc5` | AMD64/ARM64 | `v12.9` | |
#### dynamo-frontend
| Image:Tag | Arch | Notes |
|-----------|------|-------|
| `dynamo-frontend:0.8.1` | AMD64/ARM64 | |
| `dynamo-frontend:0.8.0` | AMD64/ARM64 | Initial |
#### kubernetes-operator
| Image:Tag | Arch | Notes |
|-----------|------|-------|
| `kubernetes-operator:0.8.1` | AMD64/ARM64 | |
| `kubernetes-operator:0.8.0` | AMD64/ARM64 | |
| `kubernetes-operator:0.7.1` | AMD64/ARM64 | |
| `kubernetes-operator:0.7.0.post1` | AMD64/ARM64 | Patch |
| `kubernetes-operator:0.7.0` | AMD64/ARM64 | |
| `kubernetes-operator:0.6.1` | AMD64/ARM64 | |
| `kubernetes-operator:0.6.0` | AMD64/ARM64 | |
### Python Wheels
#### ai-dynamo (wheel)
| Package | Python | Platform | Notes |
|---------|--------|----------|-------|
| `ai-dynamo==0.8.1.post1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | TRT-LLM `v1.2.0rc6.post2` |
| `ai-dynamo==0.8.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo==0.8.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo==0.7.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo==0.7.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo==0.6.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo==0.6.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
#### ai-dynamo-runtime (wheel)
| Package | Python | Platform | Notes |
|---------|--------|----------|-------|
| `ai-dynamo-runtime==0.8.1.post1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | TRT-LLM `v1.2.0rc6.post2` |
| `ai-dynamo-runtime==0.8.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo-runtime==0.8.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo-runtime==0.7.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo-runtime==0.7.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo-runtime==0.6.1` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
| `ai-dynamo-runtime==0.6.0` | `3.10`–`3.12` | Linux (glibc `v2.28+`) | |
#### kvbm (wheel)
| Package | Python | Platform | Notes |
|---------|--------|----------|-------|
| `kvbm==0.8.1` | `3.12` | Linux (glibc `v2.28+`) | |
| `kvbm==0.8.0` | `3.12` | Linux (glibc `v2.28+`) | |
| `kvbm==0.7.1` | `3.12` | Linux (glibc `v2.28+`) | |
| `kvbm==0.7.0` | `3.12` | Linux (glibc `v2.28+`) | Initial |
### Helm Charts
#### dynamo-crds (Helm chart)
| Chart | Notes |
|-------|-------|
| `dynamo-crds-0.8.1` | |
| `dynamo-crds-0.8.0` | |
| `dynamo-crds-0.7.1` | |
| `dynamo-crds-0.7.0` | |
| `dynamo-crds-0.6.1` | |
| `dynamo-crds-0.6.0` | |
#### dynamo-platform (Helm chart)
| Chart | Notes |
|-------|-------|
| `dynamo-platform-0.8.1` | |
| `dynamo-platform-0.8.0` | |
| `dynamo-platform-0.7.1` | |
| `dynamo-platform-0.7.0` | |
| `dynamo-platform-0.6.1` | |
| `dynamo-platform-0.6.0` | |
#### dynamo-graph (Helm chart)
| Chart | Notes |
|-------|-------|
| `dynamo-graph-0.8.1` | |
| `dynamo-graph-0.8.0` | |
| `dynamo-graph-0.7.1` | |
| `dynamo-graph-0.7.0` | |
| `dynamo-graph-0.6.1` | |
| `dynamo-graph-0.6.0` | |
### Rust Crates
#### dynamo-runtime (crate)
| Crate | MSRV (Rust) | Notes |
|-------|-------------|-------|
| `dynamo-runtime@0.8.1` | `v1.82` | |
| `dynamo-runtime@0.8.0` | `v1.82` | |
| `dynamo-runtime@0.7.1` | `v1.82` | |
| `dynamo-runtime@0.7.0` | `v1.82` | |
| `dynamo-runtime@0.6.1` | `v1.82` | |
| `dynamo-runtime@0.6.0` | `v1.82` | |
#### dynamo-llm (crate)
| Crate | MSRV (Rust) | Notes |
|-------|-------------|-------|
| `dynamo-llm@0.8.1` | `v1.82` | |
| `dynamo-llm@0.8.0` | `v1.82` | |
| `dynamo-llm@0.7.1` | `v1.82` | |
| `dynamo-llm@0.7.0` | `v1.82` | |
| `dynamo-llm@0.6.1` | `v1.82` | |
| `dynamo-llm@0.6.0` | `v1.82` | |
#### dynamo-async-openai (crate)
| Crate | MSRV (Rust) | Notes |
|-------|-------------|-------|
| `dynamo-async-openai@0.8.1` | `v1.82` | |
| `dynamo-async-openai@0.8.0` | `v1.82` | |
| `dynamo-async-openai@0.7.1` | `v1.82` | |
| `dynamo-async-openai@0.7.0` | `v1.82` | |
| `dynamo-async-openai@0.6.1` | `v1.82` | |
| `dynamo-async-openai@0.6.0` | `v1.82` | |
#### dynamo-parsers (crate)
| Crate | MSRV (Rust) | Notes |
|-------|-------------|-------|
| `dynamo-parsers@0.8.1` | `v1.82` | |
| `dynamo-parsers@0.8.0` | `v1.82` | |
| `dynamo-parsers@0.7.1` | `v1.82` | |
| `dynamo-parsers@0.7.0` | `v1.82` | |
| `dynamo-parsers@0.6.1` | `v1.82` | |
| `dynamo-parsers@0.6.0` | `v1.82` | |
#### dynamo-memory (crate)
| Crate | MSRV (Rust) | Notes |
|-------|-------------|-------|
| `dynamo-memory@0.8.1` | `v1.82` | |
| `dynamo-memory@0.8.0` | `v1.82` | Initial |
#### dynamo-config (crate)
| Crate | MSRV (Rust) | Notes |
|-------|-------------|-------|
| `dynamo-config@0.8.1` | `v1.82` | |
| `dynamo-config@0.8.0` | `v1.82` | Initial |
...@@ -8,7 +8,37 @@ SPDX-License-Identifier: Apache-2.0 ...@@ -8,7 +8,37 @@ SPDX-License-Identifier: Apache-2.0
This document provides the support matrix for Dynamo, including hardware, software and build instructions. This document provides the support matrix for Dynamo, including hardware, software and build instructions.
> **See also:** [Feature Compatibility Matrix](../../feature-matrix.md) for backend-specific feature support (vLLM, TensorRT-LLM, SGLang). > [!Note]
> **See also:** [Release Artifacts](release-artifacts.md) for container images, wheels, Helm charts, and crates | [Feature Matrix](feature-matrix.md) for backend feature support
## Backend Dependencies
The following table shows the backend framework versions included with each Dynamo release:
| **Dependency** | **main (ToT)** | **v0.8.1.post1** | **v0.8.1 (latest)** | **v0.8.0** | **v0.7.1** | **v0.7.0.post1** | **v0.7.0** |
| :------------- | :------------- | :--------------- | :------------------ | :--------- | :--------- | :--------------- | :--------- |
| vLLM | `0.14.0` | `0.12.0` | `0.12.0` | `0.12.0` | `0.11.0` | `0.11.0` | `0.11.0` |
| SGLang | `0.5.7` | `0.5.6.post2` | `0.5.6.post2` | `0.5.6.post2` | `0.5.3.post4` | `0.5.3.post4` | `0.5.3.post4` |
| TensorRT-LLM | `1.2.0rc6.post2` | `1.2.0rc6.post2` | `1.2.0rc6.post1` | `1.2.0rc6.post1` | `1.2.0rc3` | `1.2.0rc3` | `1.2.0rc2` |
| NIXL | `0.9.0` | `0.8.0` | `0.8.0` | `0.8.0` | `0.8.0` | `0.8.0` | `0.8.0` |
> [!Note]
> **main (ToT)** reflects the current development branch. **v0.8.1.post1** is a patch release for PyPI wheels and TRT-LLM container only (no GitHub release).
> [!Important]
> Currently TensorRT-LLM does not support Python 3.11 so installation of the ai-dynamo[trtllm] Python wheel will fail.
| **Dynamo Version** | **SGLang** | **TensorRT-LLM** | **vLLM** |
| :----------------- | :------------------------ | :--------------- | :----------------------- |
| **Dynamo 0.8.1** | CUDA 12.9, CUDA 13.0 (🧪) | CUDA 13.0 | CUDA 12.9, CUDA 13.0 (🧪) |
| **Dynamo 0.8.0** | CUDA 12.9, CUDA 13.0 (🧪) | CUDA 13.0 | CUDA 12.9, CUDA 13.0 (🧪) |
| **Dynamo 0.7.1** | CUDA 12.8 | CUDA 13.0 | CUDA 12.9 |
| **Dynamo 0.7.0** | CUDA 12.9 | CUDA 13.0 | CUDA 12.8 |
> [!Note]
> Patch versions (e.g., v0.8.1.post1, v0.7.0.post1) have the same CUDA support as their base version.
For detailed artifact versions and NGC links (including container images, Python wheels, Helm charts, and Rust crates), see the [Release Artifacts](release-artifacts.md) page.
## Hardware Compatibility ## Hardware Compatibility
...@@ -17,6 +47,7 @@ This document provides the support matrix for Dynamo, including hardware, softwa ...@@ -17,6 +47,7 @@ This document provides the support matrix for Dynamo, including hardware, softwa
| **x86_64** | Supported | | **x86_64** | Supported |
| **ARM64** | Supported | | **ARM64** | Supported |
Dynamo provides multi-arch container images supporting both AMD64 (x86_64) and ARM64 architectures. See [Release Artifacts](release-artifacts.md) for available images.
### GPU Compatibility ### GPU Compatibility
...@@ -41,7 +72,7 @@ If you are using a **GPU**, the following GPU models and architectures are suppo ...@@ -41,7 +72,7 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
| **CentOS Stream** | 9 | x86_64 | Experimental | | **CentOS Stream** | 9 | x86_64 | Experimental |
> [!Note] > [!Note]
> Wheels are built using a manylinux_2_28-compatible environment and they have been validated on CentOS 9 and Ubuntu (22.04, 24.04). > Wheels are built using a manylinux_2_28-compatible environment and they have been validated on CentOS Stream 9 and Ubuntu (22.04, 24.04).
> >
> Compatibility with other Linux distributions is expected but has not been officially verified yet. > Compatibility with other Linux distributions is expected but has not been officially verified yet.
...@@ -50,39 +81,44 @@ If you are using a **GPU**, the following GPU models and architectures are suppo ...@@ -50,39 +81,44 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
## Software Compatibility ## Software Compatibility
### Runtime Dependency ### CUDA and Driver Requirements
| **Python Package** | **Version** | glibc version | CUDA Version | Dynamo container images include CUDA toolkit libraries. The host machine must have a compatible NVIDIA GPU driver installed.
| :----------------- | :---------- | :------------------------------------ | :----------- |
| ai-dynamo | 0.8.0 | >=2.28 | | | Dynamo Version | Backend | CUDA Toolkit | Min Driver (Linux) | Min Driver (Windows) | Notes |
| ai-dynamo-runtime | 0.8.0 | >=2.28 (Python 3.12 has known issues) | | | :--- | :--- | :--- | :--- | :--- | :--- |
| NIXL | 0.9.0 | >=2.27 | >=11.8 | | **0.8.1** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
### Build Dependency | | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
The following table shows the dependency versions included with each Dynamo release: | | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
| **0.8.0** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
| **Dependency** | **main (ToT)** | **v0.8.0** | **v0.7.1** | **v0.7.0.post1** | **v0.7.0** | | | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
| :------------- | :------------- | :--------- | :--------- | :--------------- | :--------- | | | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
| SGLang | 0.5.7 | 0.5.6.post2 | 0.5.3.post4| 0.5.3.post4 | 0.5.3.post4| | | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
| TensorRT-LLM | 1.2.0rc6.post2 | 1.2.0rc6.post1 | 1.2.0rc3 | 1.2.0rc3 | 1.2.0rc2 | | | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
| vLLM | 0.14.0 | 0.12.0 | 0.11.0 | 0.11.0 | 0.11.0 | | **0.7.1** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
| NIXL | 0.9.0 | 0.8.0 | 0.8.0 | 0.8.0 | 0.8.0 | | | **SGLang** | 12.8 | 570.xx+ | 571.xx+ | |
| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
| **0.7.0** | **vLLM** | 12.8 | 570.xx+ | 571.xx+ | |
| | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
> [!Note] > [!Note]
> **main (ToT)** reflects the current development branch. > Experimental CUDA 13 images are not published for all versions. Check [Release Artifacts](release-artifacts.md) for availability.
#### CUDA Compatibility Resources
> [!Important] For detailed information on CUDA driver compatibility, forward compatibility, and troubleshooting:
> Specific versions of TensorRT-LLM supported by Dynamo are subject to change. Currently TensorRT-LLM does not support Python 3.11 so installation of the ai-dynamo[trtllm] will fail.
### CUDA Support by Framework - [CUDA Compatibility Overview](https://docs.nvidia.com/deploy/cuda-compatibility/)
| **Dynamo Version** | **SGLang** | **TensorRT-LLM** | **vLLM** | - [Why CUDA Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/why-cuda-compatibility.html)
| :------------------- | :-------------------------------- | :-----------------------| :-------------------------------- | - [Minor Version Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/minor-version-compatibility.html)
| **Dynamo 0.8.0** | CUDA 12.9, CUDA 13.0 (🧪) | CUDA 13.0 | CUDA 12.9, CUDA 13.0 (🧪) | - [Forward Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/forward-compatibility.html)
| **Dynamo 0.7.1** | CUDA 12.8 | CUDA 13.0 | CUDA 12.9 | - [FAQ](https://docs.nvidia.com/deploy/cuda-compatibility/frequently-asked-questions.html)
> 🧪 = Experimental > [!Tip]
> For extended driver compatibility beyond the minimum versions listed above, consider using `cuda-compat` packages on the host. See [Forward Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/forward-compatibility.html) for details.
## Cloud Service Provider Compatibility ## Cloud Service Provider Compatibility
...@@ -90,27 +126,31 @@ The following table shows the dependency versions included with each Dynamo rele ...@@ -90,27 +126,31 @@ The following table shows the dependency versions included with each Dynamo rele
| **Host Operating System** | **Version** | **Architecture** | **Status** | | **Host Operating System** | **Version** | **Architecture** | **Status** |
| :------------------------ | :---------- | :--------------- | :--------- | | :------------------------ | :---------- | :--------------- | :--------- |
| **Amazon Linux** | 2023 | x86_64 | Supported¹ | | **Amazon Linux** | 2023 | x86_64 | Supported |
> [!Caution] > [!Caution]
> There is a known issue with the TensorRT-LLM framework when running the AL2023 container locally with `docker run --network host ...` due to a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid this issue, replace the `--network host` flag with more precise networking configuration by mapping only the necessary ports (e.g., 4222 for nats, 2379/2380 for etcd, 8000 for frontend). > **AL2023 TensorRT-LLM Limitation:** There is a known issue with the TensorRT-LLM framework when running the AL2023 container locally with `docker run --network host ...` due to a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid this issue, replace the `--network host` flag with more precise networking configuration by mapping only the necessary ports (e.g., 4222 for nats, 2379/2380 for etcd, 8000 for frontend).
## Build Support ## Build Support
> [!Note]
> For version-specific artifact details, installation commands, and release history, see [Release Artifacts](release-artifacts.md).
**Dynamo** currently provides build support in the following ways: **Dynamo** currently provides build support in the following ways:
- **Wheels**: We distribute Python wheels of Dynamo and KV Block Manager: - **Wheels**: We distribute Python wheels of Dynamo and KV Block Manager:
- [ai-dynamo](https://pypi.org/project/ai-dynamo/) - [ai-dynamo](https://pypi.org/project/ai-dynamo/)
- [ai-dynamo-runtime](https://pypi.org/project/ai-dynamo-runtime/) - [ai-dynamo-runtime](https://pypi.org/project/ai-dynamo-runtime/)
- **New as of Dynamo v0.7.0:** [kvbm](https://pypi.org/project/kvbm/) as a standalone implementation. - [kvbm](https://pypi.org/project/kvbm/) as a standalone implementation.
- **Dynamo Runtime Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Runtime for each of the LLM inference frameworks on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
- [SGLang](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime)
- [TensorRT-LLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime)
- [vLLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)
- **Dynamo Kubernetes Operator Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Operator on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo): - **Dynamo Container Images**: We distribute multi-arch images (x86 & ARM64 compatible) on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
- [kubernetes-operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator) to simplify deployments of Dynamo Graphs. - [Dynamo Frontend](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend) *(New in v0.8.0)*
- [SGLang Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime)
- [SGLang Runtime (CUDA 13)](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime-cu13)
- [TensorRT-LLM Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime)
- [vLLM Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)
- [vLLM Runtime (CUDA 13)](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime-cu13)
- [Kubernetes Operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator)
- **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the helm charts supporting Kubernetes deployments of Dynamo: - **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the helm charts supporting Kubernetes deployments of Dynamo:
- [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds) - [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds)
...@@ -119,8 +159,10 @@ The following table shows the dependency versions included with each Dynamo rele ...@@ -119,8 +159,10 @@ The following table shows the dependency versions included with each Dynamo rele
- **Rust Crates**: - **Rust Crates**:
- [dynamo-runtime](https://crates.io/crates/dynamo-runtime/) - [dynamo-runtime](https://crates.io/crates/dynamo-runtime/)
- [dynamo-llm](https://crates.io/crates/dynamo-llm/)
- [dynamo-async-openai](https://crates.io/crates/dynamo-async-openai/) - [dynamo-async-openai](https://crates.io/crates/dynamo-async-openai/)
- [dynamo-parsers](https://crates.io/crates/dynamo-parsers/) - [dynamo-parsers](https://crates.io/crates/dynamo-parsers/)
- [dynamo-llm](https://crates.io/crates/dynamo-llm/) - [dynamo-config](https://crates.io/crates/dynamo-config/) *(New in v0.8.0)*
- [dynamo-memory](https://crates.io/crates/dynamo-memory/) *(New in v0.8.0)*
Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the [Local Quick Start](https://github.com/ai-dynamo/dynamo/blob/main/README.md#local-quick-start) in the README. Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the [Local Quick Start](https://github.com/ai-dynamo/dynamo/blob/main/README.md#local-quick-start) in the README.