chore: Mark NIXL as beta in TRTLLM (#3633)

Signed-off-by: Laikh Tewari <ltewari@nvidia.com>
Co-authored-by: Laikh Tewari <ltewari@nvidia.com>

Commit 43d687e8 (unverified), authored Oct 15, 2025 by Tanmay Verma; committed by GitHub on Oct 15, 2025

Showing with 4 additions and 4 deletions

docs/backends/trtllm/kv-cache-transfer.md +4 -4

--- a/docs/backends/trtllm/kv-cache-transfer.md
+++ b/docs/backends/trtllm/kv-cache-transfer.md
@@ -24,10 +24,10 @@ In disaggregated serving architectures, KV cache must be transferred between pre
 ## Default Method: UCX
 By default, TensorRT-LLM uses UCX (Unified Communication X) for KV cache transfer between prefill and decode workers. UCX provides high-performance communication optimized for GPU-to-GPU transfers.
-## Experimental Method: NIXL
+## Beta Method: NIXL
-TensorRT-LLM also provides experimental support for using **NIXL** (NVIDIA Inference Xfer Library) for KV cache transfer. [NIXL](https://github.com/ai-dynamo/nixl) is NVIDIA's high-performance communication library designed for efficient data transfer in distributed GPU environments.
+TensorRT-LLM also supports using **NIXL** (NVIDIA Inference Xfer Library) for KV cache transfer. [NIXL](https://github.com/ai-dynamo/nixl) is NVIDIA's high-performance communication library designed for efficient data transfer in distributed GPU environments.
-**Note:** NIXL support in TensorRT-LLM is experimental and is not suitable for production environments yet.
+**Note:** NIXL support in TensorRT-LLM is currently beta and may have some sharp edges.
 ## Using NIXL for KV Cache Transfer
@@ -61,4 +61,4 @@ To enable NIXL for KV cache transfer in disaggregated serving:
 4. **Send the request:**
   See [client](./README.md#client) section to learn how to send the request to deployment.
 **Important:** Ensure that ETCD and NATS services are running before starting the service.
\ No newline at end of file