docs: add message to guide users to the stable version (#1457)

Unverified Commit e32fe675 authored Jun 11, 2025 by richardhuo-nv Committed by GitHub Jun 11, 2025
9 changed files
--- a/docs/examples/llm_deployment.md
+++ b/docs/examples/llm_deployment.md
@@ -19,6 +19,18 @@ limitations under the License.
 This directory contains examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 ## Components
 - workers: Prefill and decode workers handle the actual LLM inference
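
Note on the `git checkout` one-liner added in this hunk: the following is a minimal sketch, assuming a local clone whose tags have already been fetched, that splits the command into its two steps. The function name `latest_release_tag` is hypothetical and not part of the repository. `git rev-list --tags --max-count=1` picks the newest commit reachable from any tag (normally a tagged commit), and `git describe --tags` names that commit by its tag. The result is therefore a tag, checked out as a detached HEAD, and it may differ from the release that GitHub marks as latest.

```shell
#!/usr/bin/env bash
# Sketch only: a step-by-step form of the one-liner from the docs.
# `latest_release_tag` is a hypothetical helper name.
set -euo pipefail

latest_release_tag() {
    local newest_tagged_commit
    # Newest commit (by commit date) reachable from any ref under refs/tags.
    newest_tagged_commit="$(git rev-list --tags --max-count=1)"
    # Name that commit by a tag; --tags also considers lightweight tags.
    git describe --tags "$newest_tagged_commit"
}
```

With the function defined, `git fetch --tags && git checkout "$(latest_release_tag)"` behaves like the documented command, with an explicit fetch so that newly published tags are visible locally.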

--- a/docs/examples/multinode.md
+++ b/docs/examples/multinode.md
@@ -18,6 +18,18 @@ limitations under the License.
 # Multinode Examples
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 ## Single node sized models
 You can deploy Dynamo on multiple nodes via NATS/etcd-based discovery and communication. Here's an example of deploying disaggregated serving on 3 nodes using `nvidia/Llama-3.1-405B-Instruct-FP8`. Each node must be properly configured with InfiniBand and/or RoCE for communication between decode and prefill workers.

--- a/docs/examples/trtllm.md
+++ b/docs/examples/trtllm.md
@@ -19,6 +19,17 @@ limitations under the License.
 This directory contains examples and reference implementations for deploying Large Language Models (LLMs) in various configurations using TensorRT-LLM.
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 ## Deployment Architectures

--- a/examples/llm/README.md
+++ b/examples/llm/README.md
@@ -19,6 +19,18 @@ limitations under the License.
 This directory contains examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 ## Components
 - workers: Prefill and decode workers handle the actual LLM inference

--- a/examples/multimodal/README.md
+++ b/examples/multimodal/README.md
@@ -20,6 +20,18 @@ limitations under the License.
 This directory provides example workflows and reference implementations for deploying a multimodal model using Dynamo.
 The examples are based on the [llava-1.5-7b-hf](https://huggingface.co/llava-hf/llava-1.5-7b-hf) model.
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 ## Multimodal Aggregated Serving
 ### Components

--- a/examples/sglang/README.md
+++ b/examples/sglang/README.md
@@ -19,6 +19,18 @@ limitations under the License.
 This directory contains examples and reference implementations for deploying Large Language Models (LLMs) in various configurations using SGLang. SGLang internally uses ZMQ to communicate between the ingress and the engine processes. For Dynamo, we leverage the runtime to communicate directly with the engine processes and handle ingress and pre/post processing on our end.
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 ## Deployment Architectures
 See [deployment architectures](../llm/README.md#deployment-architectures) to learn about the general idea of the architecture. SGLang currently supports aggregated and disaggregated serving. KV routing support is coming soon!

--- a/examples/tensorrt_llm/README.md
+++ b/examples/tensorrt_llm/README.md
@@ -19,6 +19,17 @@ limitations under the License.
 This directory contains examples and reference implementations for deploying Large Language Models (LLMs) in various configurations using TensorRT-LLM.
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 ## Deployment Architectures

--- a/examples/vllm_v0/README.md
+++ b/examples/vllm_v0/README.md
@@ -19,6 +19,18 @@ limitations under the License.
 This directory contains examples for deploying vLLM (v0) models in both aggregated and disaggregated configurations.
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 > [!NOTE]
 > Unlike `/examples/llm`, this example uses `dynamo-run` to handle (de)tokenization and routing. `dynamo-run` is a Rust-based CLI designed for high-performance pre/post-processing and routing. Read more about `dynamo-run`: [dynamo_run.md](../../docs/guides/dynamo_run.md).

--- a/examples/vllm_v1/README.md
+++ b/examples/vllm_v1/README.md
@@ -19,6 +19,18 @@ limitations under the License.
 This directory contains examples for deploying vLLM models in both aggregated and disaggregated configurations.
+## Use the Latest Release
+We recommend using the latest stable release of Dynamo to avoid breaking changes:
+[![GitHub Release](https://img.shields.io/github/v/release/ai-dynamo/dynamo)](https://github.com/ai-dynamo/dynamo/releases/latest)
+You can find the latest release [here](https://github.com/ai-dynamo/dynamo/releases/latest) and check out the corresponding release tag with:
+```bash
+git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
+```
 ## Prerequisites
 1. Install vLLM: