OpenDAS / vllm_cscc
Commit 8e7fb5d4 (Unverified), authored May 17, 2024 by Kante Yin; committed by GitHub May 16, 2024
Support to serve vLLM on Kubernetes with LWS (#4829)
Signed-off-by: kerthcet <kerthcet@gmail.com>
parent 9a31a817
Showing 2 changed files with 13 additions and 0 deletions:
docs/source/serving/deploying_with_lws.rst (+12 -0)
docs/source/serving/integrations.rst (+1 -0)
docs/source/serving/deploying_with_lws.rst
0 → 100644
.. _deploying_with_lws:

Deploying with LWS
============================

LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads.
A major use case is for multi-host/multi-node distributed inference.

vLLM can be deployed with `LWS <https://github.com/kubernetes-sigs/lws>`_ on Kubernetes for distributed model serving.

Please see `this guide <https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm>`_ for more details on
deploying vLLM on Kubernetes using LWS.
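
To make the leader/worker grouping described above more concrete, here is a minimal sketch that builds a LeaderWorkerSet manifest as a Python dict and prints it as JSON (which ``kubectl apply -f`` accepts). It assumes the ``leaderworkerset.x-k8s.io/v1`` API with ``replicas``, ``leaderWorkerTemplate.size``, ``leaderTemplate`` and ``workerTemplate`` fields, and the ``LWS_LEADER_ADDRESS`` environment variable that LWS is understood to inject into pods. The helper name ``build_lws_manifest``, the image tag, the GPU count and the container commands are placeholders for illustration only; the linked guide is the source for the actual startup commands.

```python
import json


def build_lws_manifest(name, image, groups=1, pods_per_group=2, gpus_per_pod=8):
    """Return a LeaderWorkerSet manifest as a plain dict (hypothetical helper)."""

    def pod_template(role_command):
        # One container per pod; the command is a placeholder, not a real
        # vLLM or Ray startup command.
        return {
            "spec": {
                "containers": [
                    {
                        "name": "vllm",
                        "image": image,
                        "command": ["sh", "-c", role_command],
                        "resources": {
                            "limits": {"nvidia.com/gpu": str(gpus_per_pod)}
                        },
                    }
                ]
            }
        }

    return {
        "apiVersion": "leaderworkerset.x-k8s.io/v1",
        "kind": "LeaderWorkerSet",
        "metadata": {"name": name},
        "spec": {
            # Number of independent leader+worker groups.
            "replicas": groups,
            "leaderWorkerTemplate": {
                # Pods per group, counting the leader.
                "size": pods_per_group,
                "leaderTemplate": pod_template("echo 'start the leader here'"),
                # Assumption: workers locate the leader via LWS_LEADER_ADDRESS.
                "workerTemplate": pod_template(
                    "echo \"join leader at $LWS_LEADER_ADDRESS here\""
                ),
            },
        },
    }


manifest = build_lws_manifest("vllm", "vllm/vllm-openai:latest")
print(json.dumps(manifest, indent=2))
```

Each replica is one group of ``size`` pods that is scheduled and scaled as a unit, which is what lets a single model instance span several nodes.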
docs/source/serving/integrations.rst
@@ -8,4 +8,5 @@ Integrations
    deploying_with_kserve
    deploying_with_triton
    deploying_with_bentoml
+   deploying_with_lws
    serving_with_langchain