[Misc] V1 LoRA support CPU offload (#15843)

Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>

58e234a7 · Jee Jee Li · GitHub · e86c414d · 58e234a7
Unverified Commit 58e234a7 authored Apr 02, 2025 by Jee Jee Li Committed by GitHub Apr 02, 2025

Showing 1 changed file with 3 additions and 3 deletions

vllm/config.py +3 -3

--- a/vllm/config.py
+++ b/vllm/config.py
@@ -2434,9 +2434,9 @@ class LoRAConfig:
                f"max_loras ({self.max_loras})")

    def verify_with_cache_config(self, cache_config: CacheConfig):
-        # TODO LoRA supports CPU offload.
-        if cache_config.cpu_offload_gb > 0:
-            raise ValueError("CPU offload is not supported with LoRA yet.")
+        if cache_config.cpu_offload_gb > 0 and not envs.VLLM_USE_V1:
+            raise ValueError(
+                "V0 LoRA does not support CPU offload, please use V1.")

    def verify_with_model_config(self, model_config: ModelConfig):
        if self.lora_dtype in (None, "auto"):
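
Note: the behavior change in this hunk can be illustrated with a small standalone sketch. It does not import vLLM. `CacheConfigSketch`, the `use_v1` parameter (standing in for `envs.VLLM_USE_V1`), and the `check` helper are hypothetical names introduced here for illustration only. Only the condition and the error message are taken from the diff above.

```python
from dataclasses import dataclass


@dataclass
class CacheConfigSketch:
    """Hypothetical stand-in for CacheConfig; only the field the check reads."""
    cpu_offload_gb: float = 0.0


def verify_with_cache_config(cache_config: CacheConfigSketch, use_v1: bool) -> None:
    # Same condition as the patched method, with `use_v1` passed explicitly
    # instead of being read from envs.VLLM_USE_V1 (assumption for the sketch).
    if cache_config.cpu_offload_gb > 0 and not use_v1:
        raise ValueError(
            "V0 LoRA does not support CPU offload, please use V1.")


def check(cpu_offload_gb: float, use_v1: bool) -> str:
    """Report whether a given combination passes the check."""
    try:
        verify_with_cache_config(CacheConfigSketch(cpu_offload_gb), use_v1)
    except ValueError as exc:
        return f"rejected: {exc}"
    return "accepted"


if __name__ == "__main__":
    for offload, v1 in [(0, False), (4, False), (4, True)]:
        print(f"cpu_offload_gb={offload}, V1={v1}: {check(offload, v1)}")
```

Before this commit, any `cpu_offload_gb > 0` was rejected when LoRA was enabled; after it, only the combination of CPU offload and V0 is rejected.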