# Summary

(configuration)=

## Configuration

API documentation for vLLM's configuration classes.

```{autodoc2-summary}
    vllm.config.ModelConfig
    vllm.config.CacheConfig
    vllm.config.TokenizerPoolConfig
    vllm.config.LoadConfig
    vllm.config.ParallelConfig
    vllm.config.SchedulerConfig
    vllm.config.DeviceConfig
    vllm.config.SpeculativeConfig
    vllm.config.LoRAConfig
    vllm.config.PromptAdapterConfig
    vllm.config.MultiModalConfig
    vllm.config.PoolerConfig
    vllm.config.DecodingConfig
    vllm.config.ObservabilityConfig
    vllm.config.KVTransferConfig
    vllm.config.CompilationConfig
    vllm.config.VllmConfig
```

(offline-inference-api)=

## Offline Inference

The {class}`vllm.LLM` class, used to run offline inference.

```{autodoc2-summary}
    vllm.LLM
```

Input types accepted by the `LLM` class.

```{autodoc2-summary}
    vllm.inputs.PromptType
    vllm.inputs.TextPrompt
    vllm.inputs.TokensPrompt
```
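
As a rough illustration of the input types listed above, the sketch below builds plain dictionaries shaped like a text prompt and a token prompt. It deliberately does not import vLLM. The key names `prompt` and `prompt_token_ids` are assumptions about the `TextPrompt` and `TokensPrompt` schemas, the token IDs are made up, and `describe_prompt` is a hypothetical helper, so check the generated API pages for your installed version.

```python
# Plain dicts shaped like vLLM prompt inputs. The key names are assumptions
# about the TextPrompt / TokensPrompt schemas; verify them for your version.
text_prompt = {"prompt": "Hello, my name is"}
tokens_prompt = {"prompt_token_ids": [1, 15043, 29892]}  # illustrative IDs only


def describe_prompt(prompt):
    """Hypothetical helper: classify a prompt as a raw string, text dict, or token dict."""
    if isinstance(prompt, str):
        return "str"
    if "prompt_token_ids" in prompt:
        return "tokens"
    if "prompt" in prompt:
        return "text"
    raise ValueError("unrecognized prompt shape")


kinds = [describe_prompt(p) for p in ("Hi", text_prompt, tokens_prompt)]
```

The same dictionaries would typically be passed wherever the API expects a `PromptType` value.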

## vLLM Engines

Engine classes for offline and online inference.

```{autodoc2-summary}
    vllm.LLMEngine
    vllm.AsyncLLMEngine
```

## Inference Parameters

Inference parameters for vLLM APIs.

(sampling-params)=
(pooling-params)=

```{autodoc2-summary}
    vllm.SamplingParams
    vllm.PoolingParams
```

(multi-modality)=

## Multi-Modality

vLLM provides experimental support for multi-modal models through the {mod}`vllm.multimodal` package.

Multi-modal inputs can be passed alongside text and token prompts to [supported models](#supported-mm-models)
via the `multi_modal_data` field in {class}`vllm.inputs.PromptType`.
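
A minimal sketch of a prompt that carries multi-modal data is shown here, using a plain dictionary rather than importing vLLM. The `"image"` modality key, the `<image>` placeholder in the prompt text, and the helper `build_image_prompt` are assumptions for illustration; the expected keys and prompt format depend on the model.

```python
def build_image_prompt(text, image):
    """Hypothetical helper: attach an image to a text prompt via `multi_modal_data`."""
    return {
        "prompt": text,
        # Assumed layout: a mapping from modality name to the data for that modality.
        "multi_modal_data": {"image": image},
    }


# Stand-in for real image data (for example, a PIL image loaded by the caller).
placeholder_image = object()

prompt = build_image_prompt(
    "USER: <image>\nWhat is in this picture?\nASSISTANT:",  # model-specific format
    placeholder_image,
)
```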

To add your own multi-modal model, follow [these instructions](#supports-multimodal).

```{autodoc2-summary}
    vllm.multimodal.MULTIMODAL_REGISTRY
```

### Inputs

User-facing inputs.

```{autodoc2-summary}
    vllm.multimodal.inputs.MultiModalDataDict
```

Internal data structures.

```{autodoc2-summary}
    vllm.multimodal.inputs.PlaceholderRange
    vllm.multimodal.inputs.NestedTensors
    vllm.multimodal.inputs.MultiModalFieldElem
    vllm.multimodal.inputs.MultiModalFieldConfig
    vllm.multimodal.inputs.MultiModalKwargsItem
    vllm.multimodal.inputs.MultiModalKwargs
    vllm.multimodal.inputs.MultiModalInputs
```

### Data Parsing

```{autodoc2-summary}
    vllm.multimodal.parse
```

### Data Processing

```{autodoc2-summary}
    vllm.multimodal.processing
```

### Memory Profiling

```{autodoc2-summary}
    vllm.multimodal.profiling
```

### Registry

```{autodoc2-summary}
    vllm.multimodal.registry
```

## Model Development

```{autodoc2-summary}
    vllm.model_executor.models.interfaces_base
    vllm.model_executor.models.interfaces
    vllm.model_executor.models.adapters
```