<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
-->

# Dynamo Support Matrix

This document lists the hardware, software, and build configurations that Dynamo supports.

**See also:** [Release Artifacts](release-artifacts.md) for container images, wheels, Helm charts, and crates | [Feature Matrix](feature-matrix.md) for backend feature support

## Backend Dependencies

The following table shows the backend framework versions included with each Dynamo release:

| **Dynamo** | **vLLM** | **SGLang** | **TensorRT-LLM** | **NIXL** |
| :--- | :--- | :--- | :--- | :--- |
| **main (ToT)** | `0.14.1` | `0.5.8` | `1.3.0rc1` | `0.9.0` |
| **v1.0.0** *(planned)* | `0.15.0` | *Latest as of 2/17* | *Latest as of 2/17* | `0.10.0` |
| **v0.9.0** *(in progress)* | `0.14.1` | `0.5.8` | `1.3.0rc1` | `0.9.0` |
| **v0.8.1.post3** *(in progress)* | `0.12.0` | `0.5.6.post2` | `1.2.0rc6.post3` | `0.8.0` |
| **v0.8.1.post2** | `0.12.0` | `0.5.6.post2` | `1.2.0rc6.post2` | `0.8.0` |
| **v0.8.1.post1** | `0.12.0` | `0.5.6.post2` | `1.2.0rc6.post1` | `0.8.0` |
| **v0.8.1** | `0.12.0` | `0.5.6.post2` | `1.2.0rc6.post1` | `0.8.0` |
| **v0.8.0** | `0.12.0` | `0.5.6.post2` | `1.2.0rc6.post1` | `0.8.0` |
| **v0.7.1** | `0.11.0` | `0.5.4.post3` | `1.2.0rc3` | `0.8.0` |
| **v0.7.0.post1** | `0.11.0` | `0.5.4.post3` | `1.2.0rc3` | `0.8.0` |
| **v0.7.0** | `0.11.0` | `0.5.4.post3` | `1.2.0rc2` | `0.8.0` |
| **v0.6.1.post1** | `0.11.0` | `0.5.3.post2` | `1.1.0rc5` | `0.6.0` |
| **v0.6.1** | `0.11.0` | `0.5.3.post2` | `1.1.0rc5` | `0.6.0` |
| **v0.6.0** | `0.11.0` | `0.5.3.post2` | `1.1.0rc5` | `0.6.0` |

### Version Labels

- **main (ToT)** reflects the current development branch.
- Releases marked *(in progress)* or *(planned)* show target versions that may change before final release.

### Version Compatibility

- Backend versions listed are the only versions tested and supported for each release.
- TensorRT-LLM does not support Python 3.11; installation of the `ai-dynamo[trtllm]` wheel will fail on Python 3.11.
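The Python 3.11 restriction can be checked before attempting the install. The following is a minimal sketch: the function name `trtllm_wheel_supported` is hypothetical, and the check encodes only the 3.11 limitation stated above, not a full list of supported interpreters.

```python
import sys


def trtllm_wheel_supported(version_info=None):
    """Return False on Python 3.11, the one interpreter this page rules out.

    `version_info` defaults to the running interpreter; pass a tuple such as
    (3, 12, 0) to check another version. This helper is illustrative only.
    """
    major, minor = (version_info or sys.version_info)[:2]
    return (major, minor) != (3, 11)


if __name__ == "__main__":
    if not trtllm_wheel_supported():
        print("Python 3.11 detected: installing ai-dynamo[trtllm] is expected to fail.")
```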

### CUDA Versions by Backend

| **Dynamo** | **vLLM** | **SGLang** | **TensorRT-LLM** | **Notes** |
| :--- | :--- | :--- | :--- | :--- |
| **v0.8.1** | `12.9`, `13.0` | `12.9`, `13.0` | `13.0` | Experimental vLLM/SGLang CUDA 13 support |
| **v0.8.0** | `12.9`, `13.0` | `12.9`, `13.0` | `13.0` | Experimental vLLM/SGLang CUDA 13 support |
| **v0.7.1** | `12.9` | `12.8` | `13.0` | |
| **v0.7.0** | `12.8` | `12.9` | `13.0` | TensorRT-LLM CUDA 13 support - CUDA 12.9 deprecated |
| **v0.6.1** | `12.8` | `12.9` | `12.9` | |
| **v0.6.0** | `12.8` | `12.8` | `12.9` | |

Patch versions (e.g., v0.8.1.post1, v0.7.0.post1) have the same CUDA support as their base version.
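As an illustration of the patch-version rule, a lookup can strip the `.postN` suffix and then consult the base release. This is a sketch only: the dictionary is transcribed from the table above, and the names `CUDA_BY_RELEASE`, `base_version`, and `cuda_versions` are hypothetical, not part of any Dynamo API.

```python
import re

# CUDA versions per backend, transcribed from the table above.
CUDA_BY_RELEASE = {
    "v0.8.1": {"vllm": ["12.9", "13.0"], "sglang": ["12.9", "13.0"], "trtllm": ["13.0"]},
    "v0.8.0": {"vllm": ["12.9", "13.0"], "sglang": ["12.9", "13.0"], "trtllm": ["13.0"]},
    "v0.7.1": {"vllm": ["12.9"], "sglang": ["12.8"], "trtllm": ["13.0"]},
    "v0.7.0": {"vllm": ["12.8"], "sglang": ["12.9"], "trtllm": ["13.0"]},
    "v0.6.1": {"vllm": ["12.8"], "sglang": ["12.9"], "trtllm": ["12.9"]},
    "v0.6.0": {"vllm": ["12.8"], "sglang": ["12.8"], "trtllm": ["12.9"]},
}


def base_version(release):
    """Drop a trailing '.postN' so patch releases map to their base version."""
    return re.sub(r"\.post\d+$", "", release)


def cuda_versions(release, backend):
    """Look up the CUDA versions for a release (patch or base) and backend."""
    return CUDA_BY_RELEASE[base_version(release)][backend]
```

For example, `cuda_versions("v0.7.0.post1", "trtllm")` resolves through `v0.7.0`.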

For detailed artifact versions and NGC links (including container images, Python wheels, Helm charts, and Rust crates), see the [Release Artifacts](release-artifacts.md) page.

## Hardware Compatibility

| **CPU Architecture** | **Status**   |
| :------------------- | :----------- |
| **x86_64**           | Supported    |
| **ARM64**            | Supported    |

Dynamo provides multi-arch container images supporting both AMD64 (x86_64) and ARM64 architectures. See [Release Artifacts](release-artifacts.md) for available images.

### GPU Compatibility

Dynamo supports the following NVIDIA GPU architectures:

| **GPU Architecture**                 | **Status** |
| :----------------------------------- | :--------- |
| **NVIDIA Blackwell Architecture**    | Supported  |
| **NVIDIA Hopper Architecture**       | Supported  |
| **NVIDIA Ada Lovelace Architecture** | Supported  |
| **NVIDIA Ampere Architecture**       | Supported  |

## Platform Architecture Compatibility

**Dynamo** is compatible with the following platforms:

| **Operating System** | **Version** | **Architecture** | **Status**   |
| :------------------- | :---------- | :--------------- | :----------- |
| **Ubuntu**           | 22.04       | x86_64           | Supported    |
| **Ubuntu**           | 24.04       | x86_64           | Supported    |
| **Ubuntu**           | 24.04       | ARM64            | Supported    |
| **CentOS Stream**    | 9           | x86_64           | Experimental |

Wheels are built using a manylinux_2_28-compatible environment and validated on CentOS Stream 9 and Ubuntu (22.04, 24.04). Compatibility with other Linux distributions is expected but not officially verified.
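On an unverified distribution, one quick sanity check is whether the system glibc meets the manylinux_2_28 baseline (glibc 2.28 or newer). The sketch below uses the standard library's `platform.libc_ver()`; the helper name `meets_manylinux_2_28` is hypothetical, and passing this check is a necessary condition rather than proof that the wheels will work.

```python
import platform


def meets_manylinux_2_28(libc_version):
    """True when a glibc version string such as '2.35' is at least 2.28."""
    major, minor = (int(part) for part in libc_version.split(".")[:2])
    return (major, minor) >= (2, 28)


if __name__ == "__main__":
    # libc_ver() returns a (name, version) pair; both are empty if undetected.
    libc, version = platform.libc_ver()
    if libc == "glibc" and version:
        print(f"glibc {version}: manylinux_2_28 baseline met = {meets_manylinux_2_28(version)}")
    else:
        print("glibc not detected; manylinux wheels target glibc-based systems.")
```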

> [!Caution]
> KV Block Manager is supported only with Python 3.12. Python 3.12 support is currently limited to Ubuntu 24.04.

## Software Compatibility

### CUDA and Driver Requirements

Dynamo container images include CUDA toolkit libraries. The host machine must have a compatible NVIDIA GPU driver installed.

| Dynamo Version | Backend | CUDA Toolkit | Min Driver (Linux) | Min Driver (Windows) | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- |
| **0.8.1** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
| | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
| **0.8.0** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
| | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
| | | 13.0 | 580.xx+ | 581.xx+ | Experimental |
| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
| **0.7.1** | **vLLM** | 12.9 | 575.xx+ | 576.xx+ | |
| | **SGLang** | 12.8 | 570.xx+ | 571.xx+ | |
| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |
| **0.7.0** | **vLLM** | 12.8 | 570.xx+ | 571.xx+ | |
| | **SGLang** | 12.9 | 575.xx+ | 576.xx+ | |
| | **TensorRT-LLM** | 13.0 | 580.xx+ | 581.xx+ | |

Experimental CUDA 13 images are not published for all versions. Check [Release Artifacts](release-artifacts.md) for availability.
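The Linux driver minimums in the table above reduce to a comparison on the driver's major branch number. The following sketch assumes the host driver version has already been read as a string, for example from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. The names `MIN_LINUX_DRIVER` and `driver_ok` are hypothetical, and the check ignores any extension provided by `cuda-compat` packages.

```python
# Minimum Linux driver branch per CUDA toolkit, transcribed from the table above.
MIN_LINUX_DRIVER = {"12.8": 570, "12.9": 575, "13.0": 580}


def driver_ok(driver_version, cuda_toolkit):
    """Compare a driver string like '575.57.08' with the table's minimum branch."""
    major = int(driver_version.split(".")[0])
    return major >= MIN_LINUX_DRIVER[cuda_toolkit]


if __name__ == "__main__":
    # Example values only; substitute the version reported by your host.
    print(driver_ok("575.57.08", "12.9"))
    print(driver_ok("575.57.08", "13.0"))
```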

#### CUDA Compatibility Resources

For detailed information on CUDA driver compatibility, forward compatibility, and troubleshooting:

- [CUDA Compatibility Overview](https://docs.nvidia.com/deploy/cuda-compatibility/)
- [Why CUDA Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/why-cuda-compatibility.html)
- [Minor Version Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/minor-version-compatibility.html)
- [Forward Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/forward-compatibility.html)
- [FAQ](https://docs.nvidia.com/deploy/cuda-compatibility/frequently-asked-questions.html)

For extended driver compatibility beyond the minimum versions listed above, consider using `cuda-compat` packages on the host. See [Forward Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/forward-compatibility.html) for details.

## Cloud Service Provider Compatibility

### AWS

| **Host Operating System** | **Version** | **Architecture** | **Status** |
| :------------------------ | :---------- | :--------------- | :--------- |
| **Amazon Linux**          | 2023        | x86_64           | Supported  |

> [!Caution]
> **AL2023 TensorRT-LLM Limitation:** TensorRT-LLM has a known issue when the AL2023 container runs locally with `docker run --network host ...`, caused by a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid it, replace the `--network host` flag with explicit port mappings for only the ports you need (e.g., 4222 for NATS, 2379/2380 for etcd, 8000 for the frontend).
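The port-mapping workaround can be sketched as a small helper that builds a `docker run` command with one `-p` flag per port instead of `--network host`. The port numbers come from the caution above; the helper name `docker_run_args` and the image name in the usage line are hypothetical placeholders, and a real deployment may need a different set of ports.

```python
# Ports named in the caution above, grouped by service.
PORTS = {"nats": [4222], "etcd": [2379, 2380], "frontend": [8000]}


def docker_run_args(image, ports=PORTS):
    """Build a `docker run` argument list that publishes each port explicitly."""
    args = ["docker", "run"]
    for service_ports in ports.values():
        for port in service_ports:
            # Map the same port number on the host and in the container.
            args += ["-p", f"{port}:{port}"]
    return args + [image]


if __name__ == "__main__":
    # "my-dynamo-image:tag" is a placeholder, not a published image name.
    print(" ".join(docker_run_args("my-dynamo-image:tag")))
```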

## Build Support

For version-specific artifact details, installation commands, and release history, see [Release Artifacts](release-artifacts.md).

**Dynamo** currently provides build support in the following ways:

- **Wheels**: We distribute Python wheels of Dynamo and KV Block Manager:
  - [ai-dynamo](https://pypi.org/project/ai-dynamo/)
  - [ai-dynamo-runtime](https://pypi.org/project/ai-dynamo-runtime/)
  - [kvbm](https://pypi.org/project/kvbm/) as a standalone implementation.

- **Dynamo Container Images**: We distribute multi-arch images (x86_64 and ARM64) on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
  - [Dynamo Frontend](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend) *(New in v0.8.0)*
  - [SGLang Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime)
  - [SGLang Runtime (CUDA 13)](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime-cu13)
  - [TensorRT-LLM Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime)
  - [vLLM Runtime](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)
  - [vLLM Runtime (CUDA 13)](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime-cu13)
  - [Kubernetes Operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator)

- **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the Helm charts for Kubernetes deployments of Dynamo:
  - [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds)
  - [Dynamo Platform](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-platform)
  - [Dynamo Graph](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-graph)

- **Rust Crates**:
  - [dynamo-runtime](https://crates.io/crates/dynamo-runtime/)
  - [dynamo-llm](https://crates.io/crates/dynamo-llm/)
  - [dynamo-async-openai](https://crates.io/crates/dynamo-async-openai/)
  - [dynamo-parsers](https://crates.io/crates/dynamo-parsers/)
  - [dynamo-config](https://crates.io/crates/dynamo-config/) *(New in v0.8.0)*
  - [dynamo-memory](https://crates.io/crates/dynamo-memory/) *(New in v0.8.0)*

Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the [Local Quick Start](https://github.com/ai-dynamo/dynamo/blob/main/README.md#local-quick-start) in the README.