[multimodal][test] Reduce memory utilization for test_siglip to avoid OOM (#29504)

Signed-off-by: zhxchen17 <zhxchen17@fb.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

Unverified commit ad9d656b, authored Dec 01, 2025 by Zhengxu Chen; committed by GitHub on Dec 01, 2025

Showing with 7 additions and 1 deletion

tests/models/multimodal/pooling/test_siglip.py +7 -1

--- a/tests/models/multimodal/pooling/test_siglip.py
+++ b/tests/models/multimodal/pooling/test_siglip.py
@@ -37,7 +37,12 @@ def _run_test(
    dtype: str,
 ) -> None:
    with vllm_runner(
-        model, runner="pooling", dtype=dtype, enforce_eager=True, max_model_len=64
+        model,
+        runner="pooling",
+        dtype=dtype,
+        enforce_eager=True,
+        max_model_len=64,
+        gpu_memory_utilization=0.7,
    ) as vllm_model:
        vllm_outputs = vllm_model.embed(input_texts, images=input_images)

@@ -134,6 +139,7 @@ def test_models_text_image_no_crash(
        dtype=dtype,
        enforce_eager=True,
        max_model_len=64,
+        gpu_memory_utilization=0.7,
    ) as vllm_model:
        with pytest.raises(ValueError, match="not both"):
            vllm_model.embed(texts, images=images)
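
For context on the change above: `gpu_memory_utilization` is the fraction of device memory a vLLM instance is allowed to claim, and vLLM's default is 0.9. Lowering it to 0.7 leaves more headroom for anything else sharing the GPU during the test run. The following is a minimal sketch of that arithmetic only. `memory_budget_bytes` is a hypothetical helper written for illustration, not a vLLM API, and the 80 GiB device size is an assumption.

```python
# Hypothetical helper, not part of vLLM: shows how a utilization fraction
# translates into a per-instance memory budget.
GIB = 1024**3


def memory_budget_bytes(total_bytes: int, gpu_memory_utilization: float) -> int:
    """Return the share of device memory an instance may use under the cap."""
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return int(total_bytes * gpu_memory_utilization)


if __name__ == "__main__":
    total = 80 * GIB  # assumed device size, for illustration only
    for fraction in (0.9, 0.7):
        budget = memory_budget_bytes(total, fraction)
        headroom = total - budget
        print(
            f"utilization={fraction:.1f}: "
            f"budget={budget / GIB:.1f} GiB, headroom={headroom / GIB:.1f} GiB"
        )
```

Under these assumptions the lower cap trades a smaller budget for the engine against more free memory on the device, which is the trade-off this commit makes to avoid the OOM.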