Commit 8e7fb5d4 authored by Kante Yin, committed by GitHub

Support to serve vLLM on Kubernetes with LWS (#4829)


Signed-off-by: kerthcet <kerthcet@gmail.com>
parent 9a31a817
.. _deploying_with_lws:

Deploying with LWS
==================

LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads.
A major use case is multi-host/multi-node distributed inference.

vLLM can be deployed with `LWS <https://github.com/kubernetes-sigs/lws>`_ on Kubernetes for distributed model serving.

Please see `this guide <https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm>`_ for more details on
deploying vLLM on Kubernetes using LWS.
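
To give a rough idea of what such a deployment looks like, the following is a minimal sketch of a ``LeaderWorkerSet`` manifest, not the manifest from the linked guide.
It assumes the LWS controller is already installed in the cluster. The resource name, container names, and image are hypothetical placeholders, and the container commands that start vLLM on the leader and join the workers are deliberately omitted; take those from the guide above.

.. code-block:: yaml

    # Illustrative sketch only: names and image are placeholders, not values from the guide.
    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    metadata:
      name: vllm                      # hypothetical resource name
    spec:
      replicas: 1                     # number of leader/worker groups
      leaderWorkerTemplate:
        size: 2                       # pods per group, including the leader
        leaderTemplate:               # pod template for the leader pod of each group
          spec:
            containers:
              - name: vllm-leader     # hypothetical container name
                image: <your-vllm-image>
        workerTemplate:               # pod template for the remaining pods in the group
          spec:
            containers:
              - name: vllm-worker     # hypothetical container name
                image: <your-vllm-image>

Each replica is one group of ``size`` pods that is created and scaled as a unit, which is what makes this API a fit for serving a model that spans several nodes.
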
@@ -8,4 +8,5 @@ Integrations
deploying_with_kserve
deploying_with_triton
deploying_with_bentoml
deploying_with_lws
serving_with_langchain