Commit 7493c51c (unverified), authored by Paco Xu, committed by GitHub

[Docs] add Dynamo/aibrix integration and kubeai/aks link (#32767)


Signed-off-by: Paco Xu <paco.xu@daocloud.io>
parent ac773bbe
# AIBrix
[AIBrix](https://github.com/vllm-project/aibrix) is a cloud-native control plane that integrates with vLLM to simplify Kubernetes deployment, scaling, routing, and LoRA adapter management for large language model inference.
For installation and usage instructions, please refer to the [AIBrix documentation](https://aibrix.readthedocs.io/).
# NVIDIA Dynamo
[NVIDIA Dynamo](https://github.com/ai-dynamo/dynamo) is an open-source framework for distributed LLM inference that can run vLLM on Kubernetes with flexible serving architectures (e.g. aggregated/disaggregated, optional router/planner).
For Kubernetes deployment instructions and examples (including vLLM), see the [Deploying Dynamo on Kubernetes](https://github.com/ai-dynamo/dynamo/blob/main/docs/kubernetes/README.md) guide.
Background reading: the InfoQ news article [NVIDIA Dynamo simplifies Kubernetes deployment for LLM inference](https://www.infoq.com/news/2025/12/nvidia-dynamo-kubernetes/).
......@@ -5,6 +5,7 @@
Please see the Installation Guides for environment specific instructions:
- [Any Kubernetes Cluster](https://www.kubeai.org/installation/any/)
- [AKS](https://www.kubeai.org/installation/aks/)
- [EKS](https://www.kubeai.org/installation/eks/)
- [GKE](https://www.kubeai.org/installation/gke/)
......
......@@ -11,6 +11,7 @@ Deploying vLLM on Kubernetes is a scalable and efficient way to serve machine le
Alternatively, you can deploy vLLM to Kubernetes using any of the following:
- [Helm](frameworks/helm.md)
- [NVIDIA Dynamo](integrations/dynamo.md)
- [InftyAI/llmaz](integrations/llmaz.md)
- [llm-d](integrations/llm-d.md)
- [KAITO](integrations/kaito.md)
......@@ -20,7 +21,7 @@ Alternatively, you can deploy vLLM to Kubernetes using any of the following:
- [kubernetes-sigs/lws](frameworks/lws.md)
- [meta-llama/llama-stack](integrations/llamastack.md)
- [substratusai/kubeai](integrations/kubeai.md)
- [vllm-project/aibrix](https://github.com/vllm-project/aibrix)
- [vllm-project/AIBrix](integrations/aibrix.md)
- [vllm-project/production-stack](integrations/production-stack.md)
## Deployment with CPUs
......
......@@ -177,6 +177,7 @@ Pn = "Pn"
arange = "arange"
PARD = "PARD"
pard = "pard"
AKS = "AKS"
[tool.typos.type.py]
extend-glob = []
......