> **v0.9.0 Helm Chart Issue:** The initial v0.9.0 `dynamo-platform` Helm chart sets the operator image to v0.7.1 instead of v0.9.0. Use `RELEASE_VERSION=0.9.0-post1` or add `--set dynamo-operator.controllerManager.manager.image.tag=0.9.0` to your helm install command.
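
The workaround above can be sketched as a small script. This is a minimal illustration, not the official install procedure: the namespace `dynamo-system`, the release name `dynamo-platform`, and the local chart archive name are assumptions made for the example. Only the `--set` key and the `RELEASE_VERSION` values come from the note itself. The script prints the resulting command for review instead of running it.

```bash
# Choose the chart version: "0.9.0" needs the override, "0.9.0-post1" does not.
RELEASE_VERSION="0.9.0"
NAMESPACE="dynamo-system"                        # hypothetical namespace
CHART="dynamo-platform-${RELEASE_VERSION}.tgz"   # assumed local chart archive name

# Only the initial 0.9.0 chart pins the operator image to the wrong tag.
OVERRIDE=""
if [ "$RELEASE_VERSION" = "0.9.0" ]; then
  OVERRIDE="--set dynamo-operator.controllerManager.manager.image.tag=0.9.0"
fi

# $OVERRIDE is intentionally unquoted so it splits into two arguments.
# Remove the leading `echo` to run the command instead of printing it.
echo helm install dynamo-platform "$CHART" --namespace "$NAMESPACE" $OVERRIDE
```

With `RELEASE_VERSION="0.9.0-post1"` the override stays empty, since the patched chart already references the `0.9.0` operator image.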
**For Shared/Multi-Tenant Clusters:**
If your cluster has namespace-restricted Dynamo operators, add this flag to step 2:
> **v0.9.0 Helm Chart Issue:** The initial v0.9.0 `dynamo-platform` Helm chart sets the operator image to v0.7.1 instead of v0.9.0. Use `RELEASE_VERSION=0.9.0-post1` or add `--set dynamo-operator.controllerManager.manager.image.tag=0.9.0` to your helm install command.
**For Shared/Multi-Tenant Clusters:**
If your cluster has namespace-restricted Dynamo operators, you MUST add namespace restriction to your installation:
**v0.8.1.post1** is a patch release for PyPI wheels and TRT-LLM container only (no GitHub release). All other artifacts remain at v0.8.1.
**v0.9.0.post1** is a Helm-chart-only patch release on NGC (no GitHub release). It fixes the `dynamo-platform` Helm chart which incorrectly set the operator image tag to `0.7.1` instead of `0.9.0`. Only the `dynamo-platform` chart was patched; all other artifacts remain at v0.9.0.
| `dynamo-frontend:0.8.1` | API gateway with Endpoint Prediction Protocol (EPP) | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend?version=0.8.1) | |
| `dynamo-frontend:0.9.0` | API gateway with Endpoint Prediction Protocol (EPP) | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/dynamo-frontend?version=0.9.0) | |
| `kubernetes-operator:0.8.1` | Kubernetes operator for Dynamo deployments | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator?version=0.8.1) | |
| `kubernetes-operator:0.9.0` | Kubernetes operator for Dynamo deployments | — | — | AMD64/ARM64 | [link](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/kubernetes-operator?version=0.9.0) | |
\* Multimodal inference on CUDA 13 images: works on AMD64 for all backends; works on ARM64 only for TensorRT-LLM (`vllm-runtime:*-cuda13` and `sglang-runtime:*-cuda13` do not support multimodality on ARM64).
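
As a rough sketch of how the image names in the table translate into pull commands: the registry prefix `nvcr.io/nvidia/ai-dynamo` is an assumption inferred from the `orgs/nvidia/teams/ai-dynamo` segment of the catalog links, so confirm it on the linked NGC pages before relying on it. The loop prints the commands rather than running them.

```bash
# Assumption: catalog entries under orgs/nvidia/teams/ai-dynamo are served
# from the registry path nvcr.io/nvidia/ai-dynamo/<image>:<tag>.
REGISTRY="nvcr.io/nvidia/ai-dynamo"
VERSION="0.9.0"

for image in dynamo-frontend kubernetes-operator; do
  ref="${REGISTRY}/${image}:${VERSION}"
  # Remove `echo` to actually pull the image.
  echo docker pull "$ref"
done
```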
...
@@ -46,49 +47,50 @@ We recommend using the TensorRT-LLM NGC container instead of the `ai-dynamo[trtl
> For detailed run instructions, see the [Container README](https://github.com/ai-dynamo/dynamo/tree/main/container/README.md) or backend-specific guides: [SGLang](../backends/sglang/README.md) | [TensorRT-LLM](../backends/trtllm/README.md) | [vLLM](../backends/vllm/README.md)
> For detailed run instructions, see the [Container README](https://github.com/ai-dynamo/dynamo/tree/main/container/README.md) or backend-specific guides: [vLLM](../backends/vllm/README.md) | [SGLang](../backends/sglang/README.md) | [TensorRT-LLM](../backends/trtllm/README.md)
> For API documentation, see each crate on [docs.rs](https://docs.rs/). To build Dynamo from source, see [Building from Source](https://github.com/ai-dynamo/dynamo#building-from-source).
```bash
cargo add dynamo-runtime@0.8.1
cargo add dynamo-runtime@0.9.0
cargo add dynamo-llm@0.8.1
cargo add dynamo-llm@0.9.0
cargo add dynamo-async-openai@0.8.1
cargo add dynamo-async-openai@0.9.0
cargo add dynamo-parsers@0.8.1
cargo add dynamo-parsers@0.9.0
cargo add dynamo-memory@0.8.1
cargo add dynamo-memory@0.9.0
cargo add dynamo-config@0.8.1
cargo add dynamo-config@0.9.0
cargo add dynamo-tokens@0.9.0
```
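
To keep every Dynamo crate on the same release, the individual commands can be generated from a single version variable. This is a minimal sketch: the crate list mirrors the v0.9.0 commands above, and `DYNAMO_VERSION` is a name introduced only for this example. It prints the commands so they can be reviewed first.

```bash
DYNAMO_VERSION="0.9.0"
# dynamo-tokens is listed only for 0.9.0; drop it when pinning to 0.8.1.
CRATES="dynamo-runtime dynamo-llm dynamo-async-openai dynamo-parsers dynamo-memory dynamo-config dynamo-tokens"

count=0
for crate in $CRATES; do
  # Remove `echo` to run `cargo add` inside an existing Cargo project.
  echo cargo add "${crate}@${DYNAMO_VERSION}"
  count=$((count + 1))
done
```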
## CUDA and Driver Requirements
**CUDA and Driver Requirements:** For detailed CUDA toolkit versions and minimum driver requirements for each container image, see the [Support Matrix](support-matrix.md#cuda-and-driver-requirements).
For detailed CUDA toolkit versions and minimum driver requirements for each container image, see the [Support Matrix](support-matrix.md#cuda-and-driver-requirements).
## Known Issues
For a complete list of known issues, refer to the release notes for each patch:
For a complete list of known issues, refer to the release notes for each version:
| v0.9.0 | `dynamo-platform-0.9.0` | Helm chart sets operator image to `0.7.1` instead of `0.9.0`. | Fixed in v0.9.0.post1 |
| v0.8.1 | `vllm-runtime:0.8.1-cuda13` | Container fails to launch. | Known issue |
| v0.8.1 | `sglang-runtime:0.8.1-cuda13`, `vllm-runtime:0.8.1-cuda13` | Multimodality not expected to work on ARM64. Works on AMD64. | Known limitation |
| v0.8.0 | `sglang-runtime:0.8.0-cuda13` | CuDNN installation issue caused PyTorch `v2.9.1` compatibility problems with `nn.Conv3d`, resulting in performance degradation and excessive memory usage in multimodal workloads. | Fixed in v0.8.1 ([#5461](https://github.com/ai-dynamo/dynamo/pull/5461)) |
...
@@ -154,7 +155,9 @@ For a complete list of known issues, refer to the release notes for each patch:
## Release History
- **v0.8.1.post1 Patch**: Updated TRT-LLM to `v1.2.0rc6.post2` (PyPI wheels and TRT-LLM container only)
- **v0.9.0**: Updated vLLM to `v0.14.1`, SGLang to `v0.5.8`, TRT-LLM to `v1.3.0rc1`, NIXL to `v0.9.0`. New `dynamo-tokens` Rust crate. Deprecated `dynamo-graph` Helm chart.
- **v0.8.1.post1/.post2/.post3 Patches**: Experimental patch releases updating TRT-LLM only (PyPI wheels and TRT-LLM container). No other artifacts changed.
- **Standalone Frontend Container**: `dynamo-frontend` added in v0.8.0
- **CUDA 13 Runtimes**: Experimental CUDA 13 runtime for SGLang and vLLM in v0.8.0
- **New Rust Crates**: `dynamo-memory` and `dynamo-config` added in v0.8.0
...
@@ -163,12 +166,13 @@ For a complete list of known issues, refer to the release notes for each patch: