[Docs] add Dynamo/aibrix integration and kubeai/aks link (#32767)

Signed-off-by: Paco Xu <paco.xu@daocloud.io>

7493c51c · Paco Xu · GitHub · ac773bbe
Commit 7493c51c (unverified), authored Mar 05, 2026 by Paco Xu; committed by GitHub Mar 05, 2026
5 changed files
--- a/docs/deployment/integrations/aibrix.md
+++ b/docs/deployment/integrations/aibrix.md
+# AIBrix
+
+[AIBrix](https://github.com/vllm-project/aibrix) is a cloud-native control plane that integrates with vLLM to simplify Kubernetes deployment, scaling, routing, and LoRA adapter management for large language model inference.
+
+For installation and usage instructions, please refer to the [AIBrix documentation](https://aibrix.readthedocs.io/).
--- a/docs/deployment/integrations/dynamo.md
+++ b/docs/deployment/integrations/dynamo.md
+# NVIDIA Dynamo
+
+[NVIDIA Dynamo](https://github.com/ai-dynamo/dynamo) is an open-source framework for distributed LLM inference that can run vLLM on Kubernetes with flexible serving architectures (e.g. aggregated/disaggregated, optional router/planner).
+
+For Kubernetes deployment instructions and examples (including vLLM), see the [Deploying Dynamo on Kubernetes](https://github.com/ai-dynamo/dynamo/blob/main/docs/kubernetes/README.md) guide.
+
+Background reading: InfoQ news coverage — [NVIDIA Dynamo simplifies Kubernetes deployment for LLM inference](https://www.infoq.com/news/2025/12/nvidia-dynamo-kubernetes/).
--- a/docs/deployment/integrations/kubeai.md
+++ b/docs/deployment/integrations/kubeai.md
@@ -5,6 +5,7 @@
 Please see the Installation Guides for environment specific instructions:

 - [Any Kubernetes Cluster](https://www.kubeai.org/installation/any/)
+- [AKS](https://www.kubeai.org/installation/aks/)
 - [EKS](https://www.kubeai.org/installation/eks/)
 - [GKE](https://www.kubeai.org/installation/gke/)


--- a/docs/deployment/k8s.md
+++ b/docs/deployment/k8s.md
@@ -11,6 +11,7 @@ Deploying vLLM on Kubernetes is a scalable and efficient way to serve machine le
 Alternatively, you can deploy vLLM to Kubernetes using any of the following:

 - [Helm](frameworks/helm.md)
+- [NVIDIA Dynamo](integrations/dynamo.md)
 - [InftyAI/llmaz](integrations/llmaz.md)
 - [llm-d](integrations/llm-d.md)
 - [KAITO](integrations/kaito.md)
@@ -20,7 +21,7 @@ Alternatively, you can deploy vLLM to Kubernetes using any of the following:
 - [kubernetes-sigs/lws](frameworks/lws.md)
 - [meta-llama/llama-stack](integrations/llamastack.md)
 - [substratusai/kubeai](integrations/kubeai.md)
-- [vllm-project/aibrix](https://github.com/vllm-project/aibrix)
+- [vllm-project/AIBrix](integrations/aibrix.md)
 - [vllm-project/production-stack](integrations/production-stack.md)

 ## Deployment with CPUs

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -177,6 +177,7 @@ Pn = "Pn"
 arange = "arange"
 PARD = "PARD"
 pard = "pard"
+AKS = "AKS"

 [tool.typos.type.py]
 extend-glob = []