Commit 25fd32ab authored by dagil-nvidia, committed by GitHub

docs: update support matrix for Dynamo 0.7.0 (#4550)


Signed-off-by: Dan Gil <dagil@nvidia.com>
parent e75bcf67
@@ -13,11 +13,8 @@ This document provides the support matrix for Dynamo, including hardware, softwa
| **CPU Architecture** | **Status** |
| :------------------- | :----------- |
| **x86_64** | Supported |
| **ARM64** | Experimental |
| **ARM64** | Supported |
> [!Warning]
> While **x86_64** architecture is supported on systems with a minimum of 32 GB RAM and at least 4 CPU cores,
> the **ARM64** support is experimental and may have limitations.
### GPU Compatibility
@@ -38,11 +35,10 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
| :------------------- | :---------- | :--------------- | :----------- |
| **Ubuntu** | 22.04 | x86_64 | Supported |
| **Ubuntu** | 24.04 | x86_64 | Supported |
| **Ubuntu** | 24.04 | ARM64 | Experimental |
| **Ubuntu** | 24.04 | ARM64 | Supported |
| **CentOS Stream** | 9 | x86_64 | Experimental |
> [!Note]
> For **Linux**, the **ARM64** support is experimental and may have limitations.
> Wheels are built using a manylinux_2_28-compatible environment and they have been validated on CentOS 9 and Ubuntu (22.04, 24.04).
>
> Compatibility with other Linux distributions is expected but has not been officially verified yet.
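
The manylinux_2_28 tag in the note above means the wheels expect glibc 2.28 or newer on the host. On a distribution that is not listed, a quick check along the following lines may help before installing. This is a minimal sketch: the helper names `parse_version` and `meets_manylinux_2_28` are illustrative and not part of Dynamo, and `pip` performs its own compatibility check at install time.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.35' into a tuple of ints: (2, 35)."""
    return tuple(int(part) for part in text.split("."))


def meets_manylinux_2_28(libc_name, libc_version):
    """Return True when the reported C library is glibc 2.28 or newer."""
    if libc_name != "glibc" or not libc_version:
        # Non-glibc systems (for example musl-based ones) are outside manylinux.
        return False
    try:
        return parse_version(libc_version) >= (2, 28)
    except ValueError:
        # Unexpected version format: treat as unverified rather than guess.
        return False


if __name__ == "__main__":
    # platform.libc_ver() reports the C library the interpreter is linked against.
    name, version = platform.libc_ver()
    print(f"libc: {name or 'unknown'} {version or ''}".rstrip())
    print("manylinux_2_28 compatible:", meets_manylinux_2_28(name, version))
```

A `True` result only indicates that the glibc requirement is met; it does not mean the distribution has been validated.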
@@ -62,16 +58,22 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
### Build Dependency
| **Build Dependency** | **Version** |
| **Build Dependency** | **Version as of Dynamo v0.7.0** |
| :------------------- | :------------------------------------------------------------------------------- |
| **TensorRT-LLM** | 1.1.0rc5 |
| **SGLang** | 0.5.3.post4 |
| **TensorRT-LLM** | 1.2.0rc2 |
| **vLLM** | 0.11.0 |
| **NIXL** | 0.7.1 |
| **vLLM** | 0.10.1.1 |
| **SGLang** | 0.5.3rc0 |
> [!Important]
> Specific versions of TensorRT-LLM supported by Dynamo are subject to change. Currently, TensorRT-LLM does not support Python 3.11, so installation of `ai-dynamo[trtllm]` will fail.
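
A setup script could guard against the Python 3.11 limitation described in the note before attempting the install. The sketch below encodes only the single interpreter version named there; the constant `BLOCKED_FOR_TRTLLM` and the function `trtllm_extra_blocked` are hypothetical names, and the set would need updating as the supported versions change.

```python
import sys

# Assumption: only the interpreter version named in the note is treated as blocked.
BLOCKED_FOR_TRTLLM = {(3, 11)}


def trtllm_extra_blocked(major, minor):
    """Return True when the note says ai-dynamo[trtllm] will not install."""
    return (major, minor) in BLOCKED_FOR_TRTLLM


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    if trtllm_extra_blocked(major, minor):
        print(f"Python {major}.{minor}: ai-dynamo[trtllm] is expected to fail to install.")
    else:
        print(f"Python {major}.{minor}: not listed as blocked in the support matrix note.")
```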
### CUDA Support by Framework
| **Dynamo Version** | **SGLang** | **TensorRT-LLM** | **vLLM** |
| :------------------- | :-----------------------| :-----------------------| :-----------------------|
| **Dynamo 0.7.0** | CUDA 12.8 | CUDA 13.0 | CUDA 12.8 |
## Cloud Service Provider Compatibility
### AWS
@@ -81,21 +83,37 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
| **Amazon Linux** | 2023 | x86_64 | Supported¹ |
> [!Caution]
> ¹ There is a known issue with the TensorRT-LLM framework when running the AL2023 container locally with `docker run --network host ...` due to a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid this issue, replace the `--network host` flag with more precise networking configuration by mapping only the necessary ports (e.g., 4222 for nats, 2379/2380 for etcd, 8000 for frontend).
> There is a known issue with the TensorRT-LLM framework when running the AL2023 container locally with `docker run --network host ...` due to a [bug](https://github.com/mpi4py/mpi4py/discussions/491#discussioncomment-12660609) in mpi4py. To avoid this issue, replace the `--network host` flag with a more precise networking configuration that maps only the necessary ports (e.g., 4222 for NATS, 2379/2380 for etcd, 8000 for the frontend).
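
The workaround in the caution can be sketched as a small helper that rewrites a `docker run` command line, swapping `--network host` for explicit `-p` port mappings. The port list comes from the caution itself; the function name `replace_host_network` and the image name `my-al2023-image` are placeholders, and mapping each port to the same host port is an assumption about the desired setup.

```python
import shlex

# Ports named in the caution: NATS, etcd (2379/2380), and the frontend.
PORTS = (4222, 2379, 2380, 8000)


def replace_host_network(argv, ports=PORTS):
    """Return a copy of a docker argv with host networking replaced by -p flags."""
    port_flags = []
    for port in ports:
        # Assumption: publish each container port on the same host port.
        port_flags += ["-p", f"{port}:{port}"]

    result = []
    i = 0
    while i < len(argv):
        if argv[i] == "--network" and i + 1 < len(argv) and argv[i + 1] == "host":
            result += port_flags
            i += 2  # skip both the flag and its value
        elif argv[i] == "--network=host":
            result += port_flags
            i += 1
        else:
            result.append(argv[i])
            i += 1
    return result


if __name__ == "__main__":
    # 'my-al2023-image' is a placeholder, not a real image name.
    original = shlex.split("docker run --network host my-al2023-image")
    print(shlex.join(replace_host_network(original)))
```

Commands that do not use host networking pass through unchanged, so the helper can be applied to any `docker run` invocation.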
## Build Support
**Dynamo** currently provides build support in the following ways:
- **Wheels**: Pre-built Python wheels are only available for **x86_64 Linux**.
No wheels are available for other platforms at this time.
- **Wheels**: We distribute Python wheels of Dynamo and KV Block Manager:
- [ai-dynamo](https://pypi.org/project/ai-dynamo/)
- [ai-dynamo-runtime](https://pypi.org/project/ai-dynamo-runtime/)
- **New as of Dynamo v0.7.0:** [kvbm](https://pypi.org/project/kvbm/) as a standalone implementation.
- **Dynamo Runtime Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Runtime for each of the LLM inference frameworks on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
- [SGLang](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime)
- [TensorRT-LLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime)
- [vLLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)
- **Dynamo Kubernetes Operator Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Operator on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
- [kubernetes-operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator) to simplify deployments of Dynamo Graphs.
- **Runtime Container Images**: We distribute only **AMD64** images of the runtime target on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) for [TensorRT-LLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/tensorrtllm-runtime), [vLLM](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime), and [SGLang](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/sglang-runtime).
Users must build the container image from source if they require an **ARM64** image.
- **Dynamo Frontend Images**: We distribute multi-arch images (x86 & ARM64 compatible) of the Dynamo Frontend on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo):
- **New as of Dynamo v0.7.0:** [dynamo-frontend](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend) as a standalone implementation.
- **Deployment-supportive Images**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the [Dynamo kubernetes-operator](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator) to simplify deployments of Dynamo Graphs.
It is currently provided as an **AMD64** image only.
- **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the helm charts supporting Kubernetes deployments of Dynamo:
- [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds)
- [Dynamo Platform](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-platform)
- [Dynamo Graph](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-graph)
- **Helm Charts**: [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo) hosts the helm charts supporting Kubernetes deployments of Dynamo. [Dynamo CRDs](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-crds), [Dynamo Platform](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-platform), and [Dynamo Graph](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/helm-charts/dynamo-graph) are available.
- **Rust Crates**:
- [dynamo-runtime](https://crates.io/crates/dynamo-runtime/)
- [dynamo-async-openai](https://crates.io/crates/dynamo-async-openai/)
- [dynamo-parsers](https://crates.io/crates/dynamo-parsers/)
- [dynamo-llm](https://crates.io/crates/dynamo-llm/)
Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the instructions in the [Quick Start Guide](https://github.com/ai-dynamo/dynamo/blob/main/README.md#installation).