(compatibility-matrix)=

# Compatibility Matrix

The tables below show which features are mutually exclusive and which features are supported on specific hardware.

The symbols used have the following meanings:

- ✅ = Full compatibility
- 🟠 = Partial compatibility
- ❌ = No compatibility

:::{note}
Follow the link on a ❌ or 🟠 cell to see the tracking issue for that unsupported feature or hardware combination.
:::
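
As a rough illustration of how the three symbols in the legend might be consumed programmatically, the sketch below encodes them as an enum and looks up a pair of features in a small symmetric mapping. Everything here is an assumption made for illustration: the `Compat` enum, the `lookup` helper, and the placeholder names `feature_a`, `feature_b`, and `feature_c` do not come from this project, and the entries do not reflect any real cell in the tables.

```python
from enum import Enum


class Compat(Enum):
    """The three legend symbols used in the tables."""
    FULL = "✅"
    PARTIAL = "🟠"
    NONE = "❌"


def lookup(matrix, a, b):
    """Return the recorded status for a pair, or None if the pair is not listed.

    Keys are frozensets, so the order of the two names does not matter.
    """
    return matrix.get(frozenset((a, b)))


# Placeholder entries only; these are hypothetical names, not real features.
EXAMPLE_MATRIX = {
    frozenset(("feature_a", "feature_b")): Compat.FULL,
    frozenset(("feature_a", "feature_c")): Compat.NONE,
}

status = lookup(EXAMPLE_MATRIX, "feature_b", "feature_a")
print(status.value if status else "not listed")
```

Keying on an unordered pair mirrors the fact that a feature-by-feature table is symmetric, so each combination only needs to be recorded once.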

## Feature x Feature

:::{raw} html
<style>
  /* Make smaller to try to improve readability  */
  td {
    font-size: 0.8rem;
    text-align: center;
  }

  th {
    text-align: center;
    font-size: 0.8rem;
  }
</style>
:::

:::{list-table}
:header-rows: 1
:stub-columns: 1
:widths: auto
:class: vertical-table-header

- * Feature
  * [CP](#chunked-prefill)
  * [APC](#automatic-prefix-caching)
  * [LoRA](#lora-adapter)
  * <abbr title="Prompt Adapter">prmpt adptr</abbr>
  * [SD](#spec-decode)
  * CUDA graph
  * <abbr title="Pooling Models">pooling</abbr>
  * <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  * <abbr title="Logprobs">logP</abbr>
  * <abbr title="Prompt Logprobs">prmpt logP</abbr>
  * <abbr title="Async Output Processing">async output</abbr>
  * multi-step
  * <abbr title="Multimodal Inputs">mm</abbr>
  * best-of
  * beam-search
  * <abbr title="Guided Decoding">guided dec</abbr>
- * [CP](#chunked-prefill)
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * [APC](#automatic-prefix-caching)
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * [LoRA](#lora-adapter)
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Prompt Adapter">prmpt adptr</abbr>
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * [SD](#spec-decode)
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * CUDA graph
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Pooling Models">pooling</abbr>
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  *
  * [](gh-issue:7366)
  *
  *
  * [](gh-issue:7366)
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Logprobs">logP</abbr>
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Prompt Logprobs">prmpt logP</abbr>
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Async Output Processing">async output</abbr>
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * multi-step
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Multimodal Inputs">mm</abbr>
  *
  * [🟠](gh-pr:8348)
  * [🟠](gh-pr:4194)
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * best-of
  *
  *
  *
  *
  * [](gh-issue:6137)
  *
  *
  *
  *
  *
  *
  * [](gh-issue:7968)
  *
  *
  *
  *
- * beam-search
  *
  *
  *
  *
  * [](gh-issue:6137)
  *
  *
  *
  *
  *
  *
  * [](gh-issue:7968)
  *
  *
  *
  *
- * <abbr title="Guided Decoding">guided dec</abbr>
  *
  *
  *
  *
  * [](gh-issue:11484)
  *
  *
  *
  *
  *
  *
  * [](gh-issue:9893)
  *
  *
  *
  *
:::

(feature-x-hardware)=

## Feature x Hardware

:::{list-table}
:header-rows: 1
:stub-columns: 1
:widths: auto

- * Feature
  * Volta
  * Turing
  * Ampere
  * Ada
  * Hopper
  * CPU
  * AMD
- * [CP](#chunked-prefill)
  * [](gh-issue:2729)
  *
  *
  *
  *
  *
  *
- * [APC](#automatic-prefix-caching)
  * [](gh-issue:3687)
  *
  *
  *
  *
  *
  *
- * [LoRA](#lora-adapter)
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Prompt Adapter">prmpt adptr</abbr>
  *
  *
  *
  *
  *
  * [](gh-issue:8475)
  *
- * [SD](#spec-decode)
  *
  *
  *
  *
  *
  *
  *
- * CUDA graph
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Pooling Models">pooling</abbr>
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Multimodal Inputs">mm</abbr>
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Logprobs">logP</abbr>
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Prompt Logprobs">prmpt logP</abbr>
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Async Output Processing">async output</abbr>
  *
  *
  *
  *
  *
  *
  *
- * multi-step
  *
  *
  *
  *
  *
  * [](gh-issue:8477)
  *
- * best-of
  *
  *
  *
  *
  *
  *
  *
- * beam-search
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Guided Decoding">guided dec</abbr>
  *
  *
  *
  *
  *
  *
  *
:::