(multi-modality)=

# Multi-Modality

```{eval-rst}
.. currentmodule:: vllm.multimodal
```

vLLM provides experimental support for multi-modal models through the {mod}`vllm.multimodal` package.

Multi-modal inputs can be passed alongside text and token prompts to [supported models](#supported-mm-models)
via the `multi_modal_data` field in {class}`vllm.inputs.PromptType`.
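
As a rough illustration of the paragraph above, the sketch below builds the two prompt shapes (text and token IDs) with an image attached under `multi_modal_data`. The helper names, the placeholder image object, and the prompt string are hypothetical and exist only for this example; the `"<image>"` placeholder syntax in particular is model-specific and is assumed here.

```python
from typing import Any, Dict, List


def text_prompt_with_image(prompt: str, image: Any) -> Dict[str, Any]:
    """Hypothetical helper: a text prompt carrying one image."""
    # "image" is the key for the built-in image modality.
    return {"prompt": prompt, "multi_modal_data": {"image": image}}


def token_prompt_with_image(token_ids: List[int], image: Any) -> Dict[str, Any]:
    """Hypothetical helper: a token prompt carrying one image."""
    return {"prompt_token_ids": list(token_ids), "multi_modal_data": {"image": image}}


# Stand-in for a real image; with vLLM this would typically be a PIL image,
# e.g. PIL.Image.open("example.jpg").
image = object()

# Assumed prompt format; check the model's documentation for the real template.
text_request = text_prompt_with_image(
    "USER: <image>\nWhat is in this picture?\nASSISTANT:", image
)

# Arbitrary token IDs used only to show the shape of the dict.
token_request = token_prompt_with_image([1, 2, 3], image)
```

With vLLM installed and a supported model loaded, either dict would be passed as the prompt argument, for example `llm.generate(text_request)` on an `LLM` instance. Treat this as a sketch of the input shape and not as a complete inference script.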

Currently, vLLM only has built-in support for image data. You can extend vLLM to process additional modalities
by following [this guide](#adding-multimodal-plugin).

Looking to add your own multi-modal model? Follow the instructions for [enabling multi-modal inputs](#enabling-multimodal-inputs).

## Guides

```{toctree}
:maxdepth: 1

adding_multimodal_plugin
```

## Module Contents

```{eval-rst}
.. automodule:: vllm.multimodal
```

### Registry

```{eval-rst}
.. autodata:: vllm.multimodal.MULTIMODAL_REGISTRY
```

```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalRegistry
    :members:
    :show-inheritance:
```

### Base Classes

```{eval-rst}
.. autodata:: vllm.multimodal.NestedTensors
```

```{eval-rst}
.. autodata:: vllm.multimodal.BatchedTensorInputs
```

```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalDataBuiltins
    :members:
    :show-inheritance:
```

```{eval-rst}
.. autodata:: vllm.multimodal.MultiModalDataDict
```

```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalKwargs
    :members:
    :show-inheritance:
```

```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalPlugin
    :members:
    :show-inheritance:
```

### Image Classes

```{eval-rst}
.. automodule:: vllm.multimodal.image
    :members:
    :show-inheritance:
```