OpenDAS / vllm_cscc
Commit ac1fbf7f (Unverified)
Authored May 13, 2024 by Zhuohan Li; committed by GitHub on May 13, 2024
[Doc] Shorten README by removing supported model list (#4796)
parent 33d3914b
Showing 2 changed files with 28 additions and 44 deletions (+28 -44)
README.md: +9 -38
docs/source/models/supported_models.rst: +19 -6
README.md
...
...
@@ -51,41 +51,14 @@ vLLM is flexible and easy to use with:
- (Experimental) Prefix caching support
- (Experimental) Multi-lora support
vLLM seamlessly supports many Hugging Face models, including the following architectures:
- Aquila & Aquila2 (`BAAI/AquilaChat2-7B`, `BAAI/AquilaChat2-34B`, `BAAI/Aquila-7B`, `BAAI/AquilaChat-7B`, etc.)
- Baichuan & Baichuan2 (`baichuan-inc/Baichuan2-13B-Chat`, `baichuan-inc/Baichuan-7B`, etc.)
- BLOOM (`bigscience/bloom`, `bigscience/bloomz`, etc.)
- ChatGLM (`THUDM/chatglm2-6b`, `THUDM/chatglm3-6b`, etc.)
- Command-R (`CohereForAI/c4ai-command-r-v01`, etc.)
- DBRX (`databricks/dbrx-base`, `databricks/dbrx-instruct` etc.)
- DeciLM (`Deci/DeciLM-7B`, `Deci/DeciLM-7B-instruct`, etc.)
- Falcon (`tiiuae/falcon-7b`, `tiiuae/falcon-40b`, `tiiuae/falcon-rw-7b`, etc.)
- Gemma (`google/gemma-2b`, `google/gemma-7b`, etc.)
- GPT-2 (`gpt2`, `gpt2-xl`, etc.)
- GPT BigCode (`bigcode/starcoder`, `bigcode/gpt_bigcode-santacoder`, etc.)
- GPT-J (`EleutherAI/gpt-j-6b`, `nomic-ai/gpt4all-j`, etc.)
- GPT-NeoX (`EleutherAI/gpt-neox-20b`, `databricks/dolly-v2-12b`, `stabilityai/stablelm-tuned-alpha-7b`, etc.)
- InternLM (`internlm/internlm-7b`, `internlm/internlm-chat-7b`, etc.)
- InternLM2 (`internlm/internlm2-7b`, `internlm/internlm2-chat-7b`, etc.)
- Jais (`core42/jais-13b`, `core42/jais-13b-chat`, `core42/jais-30b-v3`, `core42/jais-30b-chat-v3`, etc.)
- LLaMA, Llama 2, and Meta Llama 3 (`meta-llama/Meta-Llama-3-8B-Instruct`, `meta-llama/Meta-Llama-3-70B-Instruct`, `meta-llama/Llama-2-70b-hf`, `lmsys/vicuna-13b-v1.3`, `young-geng/koala`, `openlm-research/open_llama_13b`, etc.)
- MiniCPM (`openbmb/MiniCPM-2B-sft-bf16`, `openbmb/MiniCPM-2B-dpo-bf16`, etc.)
- Mistral (`mistralai/Mistral-7B-v0.1`, `mistralai/Mistral-7B-Instruct-v0.1`, etc.)
- Mixtral (`mistralai/Mixtral-8x7B-v0.1`, `mistralai/Mixtral-8x7B-Instruct-v0.1`, `mistral-community/Mixtral-8x22B-v0.1`, etc.)
- MPT (`mosaicml/mpt-7b`, `mosaicml/mpt-30b`, etc.)
- OLMo (`allenai/OLMo-1B-hf`, `allenai/OLMo-7B-hf`, etc.)
- OPT (`facebook/opt-66b`, `facebook/opt-iml-max-30b`, etc.)
- Orion (`OrionStarAI/Orion-14B-Base`, `OrionStarAI/Orion-14B-Chat`, etc.)
- Phi (`microsoft/phi-1_5`, `microsoft/phi-2`, etc.)
- Phi-3 (`microsoft/Phi-3-mini-4k-instruct`, `microsoft/Phi-3-mini-128k-instruct`, etc.)
- Qwen (`Qwen/Qwen-7B`, `Qwen/Qwen-7B-Chat`, etc.)
- Qwen2 (`Qwen/Qwen1.5-7B`, `Qwen/Qwen1.5-7B-Chat`, etc.)
- Qwen2MoE (`Qwen/Qwen1.5-MoE-A2.7B`, `Qwen/Qwen1.5-MoE-A2.7B-Chat`, etc.)
- StableLM(`stabilityai/stablelm-3b-4e1t`, `stabilityai/stablelm-base-alpha-7b-v2`, etc.)
- Starcoder2(`bigcode/starcoder2-3b`, `bigcode/starcoder2-7b`, `bigcode/starcoder2-15b`, etc.)
- Xverse (`xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc.)
- Yi (`01-ai/Yi-6B`, `01-ai/Yi-34B`, etc.)
vLLM seamlessly supports most popular open-source models on HuggingFace, including:
- Transformer-like LLMs (e.g., Llama)
- Mixture-of-Expert LLMs (e.g., Mixtral)
- Multi-modal LLMs (e.g., LLaVA)

Find the full list of supported models [here](https://docs.vllm.ai/en/latest/models/supported_models.html).
## Getting Started
Install vLLM with pip or [from source](https://vllm.readthedocs.io/en/latest/getting_started/installation.html#build-from-source):
...
...
@@ -93,9 +66,7 @@ Install vLLM with pip or [from source](https://vllm.readthedocs.io/en/latest/get
pip install vllm
```
## Getting Started
Visit our [documentation](https://vllm.readthedocs.io/en/latest/) to get started.
Visit our [documentation](https://vllm.readthedocs.io/en/latest/) to learn more.
- [Installation](https://vllm.readthedocs.io/en/latest/getting_started/installation.html)
- [Quickstart](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html)
- [Supported Models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
...
...
docs/source/models/supported_models.rst
...
...
@@ -16,13 +16,21 @@ Alongside each architecture, we include some popular models that use it.
- Example HuggingFace Models
- :ref:`LoRA <lora>`
* - :code:`AquilaForCausalLM`
- Aquila
- Aquila & Aquila2
- :code:`BAAI/Aquila-7B`, :code:`BAAI/AquilaChat-7B`, etc.
- ✅︎
* - :code:`ArcticForCausalLM`
- Arctic
- :code:`Snowflake/snowflake-arctic-base`, :code:`Snowflake/snowflake-arctic-instruct`, etc.
-
* - :code:`BaiChuanForCausalLM`
- Baichuan
- Baichuan & Baichuan2
- :code:`baichuan-inc/Baichuan2-13B-Chat`, :code:`baichuan-inc/Baichuan-7B`, etc.
- ✅︎
* - :code:`BloomForCausalLM`
- BLOOM, BLOOMZ, BLOOMChat
- :code:`bigscience/bloom`, :code:`bigscience/bloomz`, etc.
-
* - :code:`ChatGLMModel`
- ChatGLM
- :code:`THUDM/chatglm2-6b`, :code:`THUDM/chatglm3-6b`, etc.
...
...
@@ -39,10 +47,6 @@ Alongside each architecture, we include some popular models that use it.
- DeciLM
- :code:`Deci/DeciLM-7B`, :code:`Deci/DeciLM-7B-instruct`, etc.
-
* - :code:`BloomForCausalLM`
- BLOOM, BLOOMZ, BLOOMChat
- :code:`bigscience/bloom`, :code:`bigscience/bloomz`, etc.
-
* - :code:`FalconForCausalLM`
- Falcon
- :code:`tiiuae/falcon-7b`, :code:`tiiuae/falcon-40b`, :code:`tiiuae/falcon-rw-7b`, etc.
...
...
@@ -135,6 +139,15 @@ Alongside each architecture, we include some popular models that use it.
- StableLM
- :code:`stabilityai/stablelm-3b-4e1t/` , :code:`stabilityai/stablelm-base-alpha-7b-v2`, etc.
-
* - :code:`Starcoder2ForCausalLM`
- Starcoder2
- :code:`bigcode/starcoder2-3b`, :code:`bigcode/starcoder2-7b`, :code:`bigcode/starcoder2-15b`, etc.
-
* - :code:`XverseForCausalLM`
- Xverse
- :code:`xverse/XVERSE-7B-Chat`, :code:`xverse/XVERSE-13B-Chat`, :code:`xverse/XVERSE-65B-Chat`, etc.
-
If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
...
...
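The closing note in `supported_models.rst` says a model runs with vLLM if it uses one of the listed architectures. One rough way to check this for a local checkpoint is to compare the `architectures` field of its Hugging Face `config.json` against the class names in the table. The sketch below is illustrative only. `KNOWN_ARCHITECTURES` holds just the class names visible in this diff, not the full list, and the helper names `listed_architectures` and `check_config_file` are hypothetical, not part of vLLM.

```python
import json
from typing import Dict, List

# Architecture class names copied from the table rows visible in this diff.
# Assumption: this is only a subset; the full table in supported_models.rst is longer.
KNOWN_ARCHITECTURES = {
    "AquilaForCausalLM",
    "ArcticForCausalLM",
    "BaiChuanForCausalLM",
    "BloomForCausalLM",
    "ChatGLMModel",
    "FalconForCausalLM",
    "Starcoder2ForCausalLM",
    "XverseForCausalLM",
}


def listed_architectures(config: Dict) -> List[str]:
    """Return the declared architectures that appear in KNOWN_ARCHITECTURES."""
    declared = config.get("architectures") or []
    return [name for name in declared if name in KNOWN_ARCHITECTURES]


def check_config_file(path: str) -> List[str]:
    """Load a Hugging Face-style config.json and filter its architectures."""
    with open(path, encoding="utf-8") as fh:
        return listed_architectures(json.load(fh))


if __name__ == "__main__":
    # Hand-written stand-in for a real config.json, for illustration only.
    example = {"architectures": ["BloomForCausalLM"], "model_type": "bloom"}
    print(listed_architectures(example))
```

An empty result means only that the architecture is absent from this partial set; the full table in the documentation remains the authority.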