@@ -97,13 +97,13 @@ Containers have all dependencies pre-installed. No setup required.
```bash
# SGLang
docker run --gpus all --network host --rm -it nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.8.1
docker run --gpus all --network host --rm -it nvcr.io/nvidia/ai-dynamo/sglang-runtime:1.0.0
# TensorRT-LLM
docker run --gpus all --network host --rm -it nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.8.1
docker run --gpus all --network host --rm -it nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:1.0.0
# vLLM
docker run --gpus all --network host --rm -it nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.8.1
docker run --gpus all --network host --rm -it nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0
```
> **Tip:** To run frontend and worker in the same container, either run processes in background with `&` (see below), or open a second terminal and use `docker exec -it <container_id> bash`.
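The second-terminal route from the tip can be sketched as follows. This is a minimal sketch, not an official workflow: `image_ref` and `find_container` are hypothetical helper names, and looking up the container through Docker's `ancestor` filter is an assumption about how you would find the ID. It only works if exactly one container from that image is running.

```shell
# Hypothetical helper: build the NGC image reference for a backend and version.
image_ref() {
  printf 'nvcr.io/nvidia/ai-dynamo/%s-runtime:%s\n' "$1" "$2"
}

# Hypothetical helper: print the ID of the first running container
# started from the given image (requires Docker on the host).
find_container() {
  docker ps --filter "ancestor=$1" --format '{{.ID}}' | head -n 1
}

# Usage from a second terminal (not executed here):
#   IMAGE="$(image_ref vllm 1.0.0)"
#   docker exec -it "$(find_container "$IMAGE")" bash
```

If several containers share the image, pass the container ID explicitly instead of relying on the lookup.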
**v0.9.0.post1** is a Helm-chart-only patch release on NGC (no GitHub release). It fixes the `dynamo-platform` Helm chart, which incorrectly set the operator image tag to `0.7.1` instead of `0.9.0`. Only the `dynamo-platform` chart was patched; all other artifacts remain at v0.9.0. Users upgrading to v0.9.1 do not need this patch.
| `dynamo-frontend:1.0.0` | API gateway with Endpoint Prediction Protocol (EPP) | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend?version=1.0.0) | |
| `kubernetes-operator:1.0.0` | Kubernetes operator for Dynamo deployments | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator?version=1.0.0) | |
| `snapshot-agent:1.0.0` | Snapshot agent for fast GPU worker recovery via CRIU | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/snapshot-agent?version=1.0.0) | Preview |
\* Multimodal inference on CUDA 13 images: works on AMD64 for all backends; works on ARM64 only for TensorRT-LLM (`vllm-runtime:*-cuda13` and `sglang-runtime:*-cuda13` do not support multimodality on ARM64).
...
@@ -47,30 +39,32 @@ We recommend using the TensorRT-LLM NGC container instead of the `ai-dynamo[trtl
| `ai-dynamo==0.9.1` | Main package with backend integrations (vLLM, SGLang, TRT-LLM) | `3.10`–`3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/ai-dynamo/0.9.1/) |
| `ai-dynamo==1.0.0` | Main package with backend integrations (vLLM, SGLang, TRT-LLM) | `3.10`–`3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/ai-dynamo/1.0.0/) |
| `ai-dynamo-runtime==0.9.1` | Core Python bindings for Dynamo runtime | `3.10`–`3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/ai-dynamo-runtime/0.9.1/) |
| `ai-dynamo-runtime==1.0.0` | Core Python bindings for Dynamo runtime | `3.10`–`3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/ai-dynamo-runtime/1.0.0/) |
| `kvbm==0.9.1` | KV Block Manager for disaggregated KV cache | `3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/kvbm/0.9.1/) |
| `kvbm==1.0.0` | KV Block Manager for disaggregated KV cache | `3.12` | Linux (glibc `v2.28+`) | [link](https://pypi.org/project/kvbm/1.0.0/) |
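The Python and glibc requirements in the rows above can be checked before installing. The following is a rough sketch under stated assumptions: `version_ge`, `python_supported`, and `check_host` are hypothetical helper names, the comparison relies on GNU `sort -V`, and the `3.10` to `3.12` range applies to `ai-dynamo` and `ai-dynamo-runtime` only (the table lists `3.12` alone for `kvbm`).

```shell
# Succeeds when dotted version $1 is at least $2 (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds when a Python "major.minor" string lies within 3.10 to 3.12.
python_supported() {
  version_ge "$1" 3.10 && version_ge 3.12 "$1"
}

# Hypothetical host check; defined here but not called.
# Assumes a glibc-based Linux host with python3 on PATH.
check_host() {
  glibc="$(getconf GNU_LIBC_VERSION | awk '{print $2}')"
  py="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  version_ge "$glibc" 2.28 && python_supported "$py"
}
```

A typical use would be `check_host && pip install ai-dynamo==1.0.0`, so the install only runs on a host that meets both requirements.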
### Helm Charts
| Chart | Description | NGC |
|-------|-------------|-----|
| `dynamo-crds-0.9.1` | Custom Resource Definitions for Dynamo Kubernetes resources | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-0.9.1.tgz) |
| `dynamo-platform-1.0.0` | Platform services (etcd, NATS) and Dynamo Operator for Dynamo cluster | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-1.0.0.tgz) |
| `dynamo-platform-0.9.1` | Platform services (etcd, NATS) for Dynamo cluster | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-0.9.1.tgz) |
| `snapshot-1.0.0` | Snapshot DaemonSet for fast GPU worker recovery | [link](https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/snapshot-1.0.0.tgz) |
> **Note:** The `dynamo-graph` Helm chart is deprecated as of v0.9.0. Use the Kubernetes operator for deployment graph management.
> **Note:** The `dynamo-crds` Helm chart is deprecated as of v1.0.0; CRDs are now managed by the Dynamo Operator. The `dynamo-graph` Helm chart is deprecated as of v0.9.0.
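As a rough illustration of how the chart links in the table might be used, the sketch below downloads the platform chart and installs it from the local archive. `chart_url` and `install_platform` are hypothetical helper names, and the release name `dynamo-platform` and the namespace argument are assumptions, not values prescribed by the release notes.

```shell
# Hypothetical helper: build the NGC download URL for a chart name and version.
chart_url() {
  printf 'https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/%s-%s.tgz\n' "$1" "$2"
}

# Hypothetical install wrapper; defined here but not called.
# Requires helm and access to a Kubernetes cluster.
install_platform() {
  version="$1"
  namespace="$2"   # assumption: any namespace you choose
  helm fetch "$(chart_url dynamo-platform "$version")"
  helm install dynamo-platform "dynamo-platform-${version}.tgz" \
    --namespace "$namespace" --create-namespace
}
```

Because the note above says CRDs are managed by the Operator as of v1.0.0, this sketch does not install `dynamo-crds` separately.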
> For API documentation, see each crate on [docs.rs](https://docs.rs/). To build Dynamo from source, see [Building from Source](https://github.com/ai-dynamo/dynamo#building-from-source).
```bash
cargo add dynamo-runtime@0.9.1
cargo add dynamo-runtime@1.0.0
cargo add dynamo-llm@0.9.1
cargo add dynamo-llm@1.0.0
cargo add dynamo-async-openai@0.9.1
cargo add dynamo-async-openai@1.0.0
cargo add dynamo-parsers@0.9.1
cargo add dynamo-parsers@1.0.0
cargo add dynamo-memory@0.9.1
cargo add dynamo-memory@1.0.0
cargo add dynamo-config@0.9.1
cargo add dynamo-config@1.0.0
cargo add dynamo-tokens@0.9.1
cargo add dynamo-tokens@1.0.0
cargo add dynamo-mocker@1.0.0
cargo add dynamo-kv-router@1.0.0
```
**CUDA and Driver Requirements:** For detailed CUDA toolkit versions and minimum driver requirements for each container image, see the [Support Matrix](support-matrix.md#cuda-and-driver-requirements).
...
@@ -139,6 +140,7 @@ cargo add dynamo-tokens@0.9.1
## Known Issues
For a complete list of known issues, refer to the release notes for each version:
@@ -155,11 +157,13 @@ For a complete list of known issues, refer to the release notes for each version
## Release History
- **v1.0.0**: First major release. SGLang `v0.5.9`, TRT-LLM `v1.3.0rc5.post1` (CUDA 13.1), vLLM `v0.16.0`, NIXL `v0.10.1`. New `snapshot-agent` container and `snapshot` Helm chart (Preview). New EFA container variants for vLLM and TRT-LLM (Experimental, AMD64 only). New `dynamo-mocker` and `dynamo-kv-router` Rust crates. Deprecated `dynamo-crds` Helm chart (CRDs now managed by the Operator). `v1alpha1` CRDs deprecated.
- **v0.9.1**: Updated TRT-LLM to `v1.3.0rc3`. All other backend versions unchanged from v0.9.0.
- **v0.9.0**: Updated vLLM to `v0.14.1`, SGLang to `v0.5.8`, TRT-LLM to `v1.3.0rc1`, NIXL to `v0.9.0`. New `dynamo-tokens` Rust crate. Deprecated `dynamo-graph` Helm chart.
- **v0.8.1.post1/.post2/.post3 Patches**: Experimental patch releases updating TRT-LLM only (PyPI wheels and TRT-LLM container). No other artifacts changed.
- **Standalone Frontend Container**: `dynamo-frontend` added in v0.8.0
- **EFA Runtimes**: Experimental AWS EFA variants for vLLM and TRT-LLM (AMD64 only) in v1.0.0
- **CUDA 13 Runtimes**: Experimental CUDA 13 runtime for SGLang and vLLM in v0.8.0
- **New Rust Crates**: `dynamo-memory` and `dynamo-config` added in v0.8.0
...
@@ -167,6 +171,7 @@ For a complete list of known issues, refer to the release notes for each version
| Version | Release Date | GitHub | Docs |
|---------|--------------|--------|------|
| `v1.0.0` | Mar 2026 | [Release](https://github.com/ai-dynamo/dynamo/releases/tag/v1.0.0) | [Docs](https://docs.dynamo.nvidia.com/dynamo) |
- [dynamo-config](https://crates.io/crates/dynamo-config/) *(New in v0.8.0)*
- [dynamo-memory](https://crates.io/crates/dynamo-memory/) *(New in v0.8.0)*
- [dynamo-tokens](https://crates.io/crates/dynamo-tokens/) *(New in v0.9.0)*
- [dynamo-mocker](https://crates.io/crates/dynamo-mocker/) *(New in v1.0.0)*
- [dynamo-kv-router](https://crates.io/crates/dynamo-kv-router/) *(New in v1.0.0)*
Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the [Local Quick Start](https://github.com/ai-dynamo/dynamo/blob/main/README.md#local-quick-start) in the README.