[Misc][Doc] Add missing comment for LLM (#20285)

Signed-off-by: Lifan Shen <lifans@meta.com>

9ec1e306 · Lifans · GitHub · 9dae7d46 · 9ec1e306
Commit 9ec1e306, authored Jul 01, 2025 by Lifans, committed by GitHub Jul 01, 2025

Showing 1 changed file with 20 additions and 12 deletions

vllm/entrypoints/llm.py +20 -12

--- a/vllm/entrypoints/llm.py
+++ b/vllm/entrypoints/llm.py
@@ -132,6 +132,14 @@ class LLM:
        hf_overrides: If a dictionary, contains arguments to be forwarded to the
            HuggingFace config. If a callable, it is called to update the
            HuggingFace config.
+        mm_processor_kwargs: Arguments to be forwarded to the model's processor
+            for multi-modal data, e.g., image processor. Overrides for the
+            multi-modal processor obtained from `AutoProcessor.from_pretrained`.
+            The available overrides depend on the model that is being run.
+            For example, for Phi-3-Vision: `{"num_crops": 4}`.
+        override_pooler_config: Initialize non-default pooling config or
+            override default pooling config for the pooling model.
+            e.g. `PoolerConfig(pooling_type="mean", normalize=False)`.
        compilation_config: Either an integer or a dictionary. If it is an
            integer, it is used as the level of compilation optimization. If it
            is a dictionary, it can specify the full compilation configuration.
@@ -1347,16 +1355,16 @@ class LLM:
        during the sleep period, before `wake_up` is called.
        Args:
-            level: The sleep level. Level 1 sleep will offload the model 
+            level: The sleep level. Level 1 sleep will offload the model
-                weights and discard the kv cache. The content of kv cache 
+                weights and discard the kv cache. The content of kv cache
                is forgotten. Level 1 sleep is good for sleeping and waking
-                up the engine to run the same model again. The model weights 
+                up the engine to run the same model again. The model weights
-                are backed up in CPU memory. Please make sure there's enough 
+                are backed up in CPU memory. Please make sure there's enough
-                CPU memory to store the model weights. Level 2 sleep will 
+                CPU memory to store the model weights. Level 2 sleep will
-                discard both the model weights and the kv cache. The content 
+                discard both the model weights and the kv cache. The content
-                of both the model weights and kv cache is forgotten. Level 2 
+                of both the model weights and kv cache is forgotten. Level 2
                sleep is good for sleeping and waking up the engine to run a
-                different model or update the model, where previous model 
+                different model or update the model, where previous model
                weights are not needed. It reduces CPU memory pressure.
        """
        self.reset_prefix_cache()
@@ -1366,12 +1374,12 @@ class LLM:
        """
        Wake up the engine from sleep mode. See the [sleep][] method
        for more details.
        Args:
-            tags: An optional list of tags to reallocate the engine memory 
+            tags: An optional list of tags to reallocate the engine memory
-                for specific memory allocations. Values must be in 
+                for specific memory allocations. Values must be in
                `("weights", "kv_cache")`. If None, all memory is reallocated.
-                wake_up should be called with all tags (or None) before the 
+                wake_up should be called with all tags (or None) before the
                engine is used again.
        """
        self.llm_engine.wake_up(tags)
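
The `sleep` and `wake_up` docstrings touched above describe a level 2 sleep followed by a staged wake-up: reallocate `"weights"` first, then `"kv_cache"`, with all tags woken before the engine is used again. The following is a minimal sketch of that call order, not code from this commit. The helper `sleep_and_swap_weights`, the `SleepCapable` protocol, and the `load_new_weights` callback are hypothetical names introduced for illustration, and the sketch assumes the engine object was created with sleep mode enabled.

```python
from __future__ import annotations

from typing import Callable, Optional, Protocol


class SleepCapable(Protocol):
    """Hypothetical interface: only the two methods documented in the diff."""

    def sleep(self, level: int = 1) -> None: ...

    def wake_up(self, tags: Optional[list[str]] = None) -> None: ...


def sleep_and_swap_weights(
    llm: SleepCapable,
    load_new_weights: Callable[[], None],
) -> None:
    """Hypothetical helper: level 2 sleep, then wake up in two stages."""
    # Level 2 discards both the model weights and the kv cache, so nothing
    # is backed up in CPU memory.
    llm.sleep(level=2)

    # Stage 1: reallocate memory for the weights only.
    llm.wake_up(tags=["weights"])

    # Assumption: the caller supplies whatever mechanism updates the weights.
    load_new_weights()

    # Stage 2: reallocate the kv cache. After this call every tag has been
    # woken up, which the docstring requires before the engine is used again.
    llm.wake_up(tags=["kv_cache"])
```

For the simpler case of running the same model again, the docstring suggests `sleep(level=1)` followed by `wake_up()` with no tags, which reallocates all memory at once.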