[bugfix] fix inductor cache on max_position_embeddings (#15436)

Signed-off-by: youkaichao <youkaichao@gmail.com>

Commit d0cfec7a (unverified), authored by youkaichao on Mar 25, 2025; committed by GitHub on Mar 25, 2025

Showing with 3 additions and 0 deletions

vllm/config.py +3 -0

--- a/vllm/config.py
+++ b/vllm/config.py
@@ -221,6 +221,9 @@ class ModelConfig:
        factors.append(self.trust_remote_code)
        factors.append(self.rope_scaling)
        factors.append(self.rope_theta)
+        # rope cos/sin cache depends on the max_position_embeddings
+        factors.append(
+            getattr(self.hf_config, "max_position_embeddings", "None"))
        return hashlib.sha256(str(factors).encode()).hexdigest()
    def __init__(
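
The effect of the three added lines can be checked in isolation. The sketch below is not the vLLM implementation: `compute_hash` is a hypothetical free function that copies only the factors visible in the hunk above, and `SimpleNamespace` objects stand in for `self.hf_config`. It assumes the real method hashes a list of factors exactly as the last context line shows.

```python
import hashlib
from types import SimpleNamespace


def compute_hash(hf_config, rope_theta=10000.0, rope_scaling=None,
                 trust_remote_code=False):
    """Hypothetical stand-in for the hash method patched in the diff."""
    factors = []
    factors.append(trust_remote_code)
    factors.append(rope_scaling)
    factors.append(rope_theta)
    # Same expression as the added lines: the rope cos/sin cache depends on
    # max_position_embeddings, so it becomes part of the cache key.
    # Note the fallback is the string "None", as in the commit.
    factors.append(getattr(hf_config, "max_position_embeddings", "None"))
    return hashlib.sha256(str(factors).encode()).hexdigest()


# Stand-ins for two Hugging Face configs that differ only in context length,
# plus one config that lacks the attribute entirely.
short_ctx = SimpleNamespace(max_position_embeddings=4096)
long_ctx = SimpleNamespace(max_position_embeddings=32768)
no_attr = SimpleNamespace()

hash_short = compute_hash(short_ctx)
hash_long = compute_hash(long_ctx)
hash_missing = compute_hash(no_attr)

# Different context lengths now yield different cache keys.
print(hash_short != hash_long)
```

Before this change, two configurations that differed only in `max_position_embeddings` would have produced the same hash and could therefore share a compiled artifact built for the wrong cos/sin cache size. With the extra factor, each value gets its own key, and a config without the attribute falls into a separate bucket keyed by the `"None"` fallback.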