-vLLM supports a variety of generative Transformer models in `HuggingFace Transformers <https://huggingface.co/models>`_.
+vLLM supports a variety of generative Transformer models in `HuggingFace (HF) Transformers <https://huggingface.co/models>`_.
 The following is the list of model architectures that are currently supported by vLLM.
 Alongside each architecture, we include some popular models that use it.
...
@@ -19,7 +19,7 @@ Text Generation
   * - Architecture
     - Models
-    - Example HuggingFace Models
+    - Example HF Models
     - :ref:`LoRA <lora>`
     - :ref:`PP <distributed_serving>`
   * - :code:`AquilaForCausalLM`
...
@@ -280,7 +280,7 @@ Text Embedding
   * - Architecture
     - Models
-    - Example HuggingFace Models
+    - Example HF Models
     - :ref:`LoRA <lora>`
     - :ref:`PP <distributed_serving>`
   * - :code:`Gemma2Model`
...
@@ -303,7 +303,7 @@ Reward Modeling
   * - Architecture
     - Models
-    - Example HuggingFace Models
+    - Example HF Models
     - :ref:`LoRA <lora>`
     - :ref:`PP <distributed_serving>`
   * - :code:`Qwen2ForRewardModel`
...
@@ -316,7 +316,14 @@ Reward Modeling
 As an interim measure, these models are supported via Embeddings API. See `this RFC <https://github.com/vllm-project/vllm/issues/8967>`_ for upcoming changes.
 Multimodal Language Models
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+The following modalities are supported depending on the model:
+- **T**\ ext
+- **I**\ mage
+- **V**\ ideo
+- **A**\ udio
 .. _supported_vlms:
...
@@ -324,78 +331,78 @@ Text Generation
 ---------------
 .. list-table::
-  :widths: 25 25 25 25 5 5
+  :widths: 25 25 15 25 5 5
   :header-rows: 1
   * - Architecture
     - Models
-    - Modalities
+    - Inputs
-    - Example HuggingFace Models
+    - Example HF Models
     - :ref:`LoRA <lora>`
     - :ref:`PP <distributed_serving>`
   * - :code:`Blip2ForConditionalGeneration`
     - BLIP-2
-    - Image\ :sup:`E`
+    - T + I\ :sup:`E`
     - :code:`Salesforce/blip2-opt-2.7b`, :code:`Salesforce/blip2-opt-6.7b`, etc.
     -
     - ✅︎
   * - :code:`ChameleonForConditionalGeneration`
     - Chameleon
-    - Image
+    - T + I
     - :code:`facebook/chameleon-7b` etc.
     -
     - ✅︎
   * - :code:`FuyuForCausalLM`
     - Fuyu
-    - Image
+    - T + I
     - :code:`adept/fuyu-8b` etc.
     -
     - ✅︎
   * - :code:`ChatGLMModel`
     - GLM-4V
-    - Image
+    - T + I
     - :code:`THUDM/glm-4v-9b` etc.
     -
     - ✅︎
   * - :code:`InternVLChatModel`
     - InternVL2
-    - Image\ :sup:`E+`
+    - T + I\ :sup:`E+`
     - :code:`OpenGVLab/InternVL2-4B`, :code:`OpenGVLab/InternVL2-8B`, etc.
     -
     - ✅︎
   * - :code:`LlavaForConditionalGeneration`
     - LLaVA-1.5
-    - Image\ :sup:`E+`
+    - T + I\ :sup:`E+`
     - :code:`llava-hf/llava-1.5-7b-hf`, :code:`llava-hf/llava-1.5-13b-hf`, etc.
     -
     - ✅︎
   * - :code:`LlavaNextForConditionalGeneration`
     - LLaVA-NeXT
-    - Image\ :sup:`E+`
+    - T + I\ :sup:`E+`
     - :code:`llava-hf/llava-v1.6-mistral-7b-hf`, :code:`llava-hf/llava-v1.6-vicuna-7b-hf`, etc.