[TPU][V1] Make `--disable_chunked_mm_input` mandatory for serving MM models (#16483)

Signed-off-by: NickLucche <nlucches@redhat.com>

Commit 4d022cbc, authored Apr 11, 2025 by Nicolò Lucchesi, committed by GitHub Apr 11, 2025

Showing 1 changed file with 7 additions and 0 deletions

vllm/platforms/tpu.py +7 -0

--- a/vllm/platforms/tpu.py
+++ b/vllm/platforms/tpu.py
@@ -120,6 +120,13 @@ class TpuPlatform(Platform):
        assert not vllm_config.speculative_config, (
            "Speculative decoding is not yet supported for TPU backend")
+        if scheduler_config.is_multimodal_model and not \
+            scheduler_config.disable_chunked_mm_input:
+            logger.warning("TPU does not support running Multimodal models"\
+            " without setting `--disable_chunked_mm_input`. " \
+            "Forcing --disable_chunked_mm_input.")
+            scheduler_config.disable_chunked_mm_input = True

    @classmethod
    def is_pin_memory_available(cls):
        logger.warning("Pin memory is not supported on TPU.")
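
The effect of the added check can be sketched in isolation. The snippet below is only an illustration, not vLLM code: `SchedulerConfigStub` and `force_disable_chunked_mm_input` are hypothetical names, and the stub carries only the two attributes the diff reads and writes (`is_multimodal_model` and `disable_chunked_mm_input`). It assumes the check behaves exactly as shown in the hunk above: on TPU, a multimodal model without the flag gets a warning and the flag is forced on.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("tpu_platform_sketch")


@dataclass
class SchedulerConfigStub:
    """Hypothetical stand-in for the scheduler config (not the vLLM class).

    Only the two attributes used by the diff are modelled.
    """
    is_multimodal_model: bool = False
    disable_chunked_mm_input: bool = False


def force_disable_chunked_mm_input(cfg: SchedulerConfigStub) -> bool:
    """Mirror the check from the diff; return True if the flag was overridden."""
    if cfg.is_multimodal_model and not cfg.disable_chunked_mm_input:
        logger.warning(
            "TPU does not support running Multimodal models without "
            "setting `--disable_chunked_mm_input`. "
            "Forcing --disable_chunked_mm_input.")
        # Override the user's setting instead of failing at startup.
        cfg.disable_chunked_mm_input = True
        return True
    return False


# Multimodal model, flag not set: the flag is forced on.
mm_cfg = SchedulerConfigStub(is_multimodal_model=True)
overridden = force_disable_chunked_mm_input(mm_cfg)

# Text-only model: the config is left untouched.
text_cfg = SchedulerConfigStub(is_multimodal_model=False)
untouched = not force_disable_chunked_mm_input(text_cfg)
```

In this sketch the flag is overridden with a warning rather than rejected with an error, matching the diff's choice to keep startup working for multimodal models when the user omits `--disable_chunked_mm_input`.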