.. _multi_modality:

Multi-Modality
==============

.. currentmodule:: vllm.multimodal
    
vLLM provides experimental support for multi-modal models through the :mod:`vllm.multimodal` package.

Multi-modal inputs can be passed alongside text and token prompts to :ref:`supported models <supported_vlms>`
via the ``multi_modal_data`` field in :class:`vllm.inputs.PromptInputs`.
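
As a minimal sketch of the shape such an input takes, the snippet below builds a text prompt that carries a single image under ``multi_modal_data``. The helper name ``build_image_prompt`` is hypothetical and exists only for illustration; the model name and prompt template in the commented usage are assumptions that depend on the model you serve.

```python
from typing import Any, Dict


def build_image_prompt(prompt: str, image: Any) -> Dict[str, Any]:
    """Return a text prompt dict carrying one image under ``multi_modal_data``.

    Hypothetical helper for illustration; it is not part of vLLM.
    """
    return {
        "prompt": prompt,
        # Built-in image support expects the image under the "image" key.
        "multi_modal_data": {"image": image},
    }


# Sketch of usage (assumes vLLM, Pillow, and a supported vision-language
# model are available; the model name and prompt template are assumptions):
#
#   from PIL import Image
#   from vllm import LLM
#
#   llm = LLM(model="llava-hf/llava-1.5-7b-hf")
#   inputs = build_image_prompt(
#       "USER: <image>\nWhat is shown here?\nASSISTANT:",
#       Image.open("example.jpg"),
#   )
#   outputs = llm.generate(inputs)
#   print(outputs[0].outputs[0].text)
```

The prompt text usually needs a model-specific image placeholder, so check the documentation of the model in question before reusing the template shown in the comments.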

Currently, vLLM only has built-in support for image data. You can extend vLLM to process additional modalities
by following :ref:`this guide <adding_multimodal_plugin>`.

Looking to add your own multi-modal model? Follow :ref:`these instructions <enabling_multimodal_inputs>`.

..
  TODO: Add usage of --limit-mm-per-prompt when multi-image input is officially supported

Guides
++++++

.. toctree::
   :maxdepth: 1

   adding_multimodal_plugin

Module Contents
+++++++++++++++

.. automodule:: vllm.multimodal

Registry
--------

.. autodata:: vllm.multimodal.MULTIMODAL_REGISTRY

.. autoclass:: vllm.multimodal.MultiModalRegistry
    :members:
    :show-inheritance:

Base Classes
------------

.. autodata:: vllm.multimodal.NestedTensors

.. autodata:: vllm.multimodal.BatchedTensorInputs

.. autoclass:: vllm.multimodal.MultiModalDataBuiltins
    :members:
    :show-inheritance:

.. autodata:: vllm.multimodal.MultiModalDataDict

.. autoclass:: vllm.multimodal.MultiModalInputs
    :members:
    :show-inheritance:

.. autoclass:: vllm.multimodal.MultiModalPlugin
    :members:
    :show-inheritance:

Image Classes
-------------

.. automodule:: vllm.multimodal.image
    :members:
    :show-inheritance: