[Doc] Add usage of implicit text-only mode (#22561)

Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Flora Feng <4florafeng@gmail.com>

Commit 23472ff5, authored Aug 08, 2025 by Roger Wang, committed by GitHub Aug 08, 2025

Showing with 3 additions and 0 deletions

docs/models/supported_models.md +3 -0

--- a/docs/models/supported_models.md
+++ b/docs/models/supported_models.md
@@ -583,6 +583,9 @@ See [this page](../features/multimodal_inputs.md) on how to pass multi-modal inp
    **This is no longer required if you are using vLLM V1.**
+!!! tip
+    For hybrid-only models such as Llama-4, Step3 and Mistral-3, a text-only mode can be enabled by setting all supported multimodal modalities to 0 (e.g., `--limit-mm-per-prompt '{"image":0}'`) so that their multimodal modules are not loaded, freeing up more GPU memory for the KV cache.
 !!! note
    vLLM currently only supports adding LoRA to the language backbone of multimodal models.
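
The tip added in this diff sets every supported modality to 0. As a minimal sketch of how that flag value could be assembled, the helpers below build the JSON mapping and the quoted command-line argument. The function names `text_only_limits` and `limit_mm_flag` are hypothetical and not part of vLLM, and the list of modalities a given model supports is an assumption the caller must supply.

```python
import json
import shlex


def text_only_limits(modalities):
    """Map every given modality name to a per-prompt limit of 0.

    `modalities` is supplied by the caller; which modalities a model
    actually supports is an assumption, not something looked up here.
    """
    return {name: 0 for name in modalities}


def limit_mm_flag(modalities):
    """Render the mapping as a shell-quoted --limit-mm-per-prompt argument."""
    # Compact separators reproduce the spacing used in the tip: {"image":0}
    payload = json.dumps(text_only_limits(modalities), separators=(",", ":"))
    return f"--limit-mm-per-prompt {shlex.quote(payload)}"


if __name__ == "__main__":
    # Single-modality case, matching the example in the tip.
    print(limit_mm_flag(["image"]))
    # A model assumed to accept both images and video.
    print(limit_mm_flag(["image", "video"]))
```

For `["image"]` this prints `--limit-mm-per-prompt '{"image":0}'`, the same form as the example in the tip. Every modality the model supports has to appear in the mapping; leaving one out would not satisfy the "all supported multimodal modalities" condition the tip describes.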