[BugFix] Align fused MoE-LoRA kernel config with actual weight shapes (#34396)

Signed-off-by: Runkai Tao <rt572@physics.rutgers.edu>

Unverified commit a1f53add, authored Feb 26, 2026 by Runkai Tao, committed by GitHub Feb 26, 2026

Showing with 5 additions and 1 deletion

vllm/lora/layers/fused_moe.py +5 -1

--- a/vllm/lora/layers/fused_moe.py
+++ b/vllm/lora/layers/fused_moe.py
@@ -83,7 +83,11 @@ class FusedMoEWithLoRA(BaseLayerWithLoRA):
    ):
        if envs.VLLM_TUNED_CONFIG_FOLDER:
            hidden_size = layer.hidden_size
-            intermediate_size = layer.intermediate_size_per_partition
+            intermediate_size = (
+                self.w2_lora_a_stacked[0].shape[-1]
+                if op_prefix == "w2"
+                else self.w13_lora_b_stacked[0].shape[-2]
+            )
            shrink_config = get_lora_op_configs(
                op_type=f"fused_moe_lora_{op_prefix}_shrink",
                max_loras=num_loras,
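
The hunk above stops reading `intermediate_size` from `layer.intermediate_size_per_partition` and instead takes it from the stacked LoRA weights. As a minimal sketch of that selection logic, the snippet below reproduces the same conditional on stand-in objects that expose only a `.shape` attribute. The helper name `infer_intermediate_size` and the tensor layouts (`(max_loras, num_experts, intermediate_size, rank)` for `w13_lora_b_stacked[0]`, and `(max_loras, num_experts, rank, intermediate_size)` for `w2_lora_a_stacked[0]`) are assumptions made for illustration; they are consistent with the indices used in the diff but are not confirmed by it.

```python
from types import SimpleNamespace


def infer_intermediate_size(op_prefix, w13_lora_b_stacked, w2_lora_a_stacked):
    """Hypothetical helper mirroring the conditional added in the diff."""
    if op_prefix == "w2":
        # w2 LoRA A: intermediate size assumed to be the last dimension.
        return w2_lora_a_stacked[0].shape[-1]
    # w13 LoRA B: intermediate size assumed to be the second-to-last dimension.
    return w13_lora_b_stacked[0].shape[-2]


# Stand-ins for tensors; only .shape is needed for this illustration.
# All sizes below are made-up example values.
max_loras, num_experts, rank, intermediate = 4, 8, 16, 1408
w13_b = (SimpleNamespace(shape=(max_loras, num_experts, intermediate, rank)),)
w2_a = (SimpleNamespace(shape=(max_loras, num_experts, rank, intermediate)),)

w13_size = infer_intermediate_size("w13", w13_b, w2_a)
w2_size = infer_intermediate_size("w2", w13_b, w2_a)
print(w13_size, w2_size)
```

Under these assumed layouts both branches yield the same value, which is then what the tuned-config lookup would be keyed on instead of the layer attribute.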