[BUGFIX] KeyError 'layers.14.mlp.gate.g_idx' for Qwen3-MoE with GPTQ on ROCm (#22017)

Commit 1e55dfa7, authored Aug 11, 2025 by JartX; committed by GitHub Aug 11, 2025

Showing 1 changed file with 1 addition and 1 deletion

vllm/model_executor/models/qwen3_moe.py +1 -1

--- a/vllm/model_executor/models/qwen3_moe.py
+++ b/vllm/model_executor/models/qwen3_moe.py
@@ -149,7 +149,7 @@ class Qwen3MoeSparseMoeBlock(nn.Module):
         self.gate = ReplicatedLinear(config.hidden_size,
                                      config.num_experts,
                                      bias=False,
-                                     quant_config=None,
+                                     quant_config=quant_config,
                                      prefix=f"{prefix}.gate")
 
     def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
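
One plausible reading of the error in the title is that the GPTQ checkpoint stores quantized tensors for the router gate (including `g_idx`), while a gate built with `quant_config=None` registers only a plain `weight`, so a loader that looks parameters up by name fails on the first quantized tensor. The sketch below illustrates that mismatch in plain Python. It is an assumption about the failure mode, not vLLM's actual loader: `registered_params` and `load_weights` are hypothetical helpers, and the ROCm-specific code path is not modeled.

```python
from typing import Dict, List, Optional

# Tensor names a GPTQ-quantized linear layer typically stores in a checkpoint.
GPTQ_SUFFIXES = ("g_idx", "qweight", "qzeros", "scales")


def registered_params(prefix: str, quantized: bool) -> Dict[str, None]:
    """Hypothetical: names a linear layer registers, keyed by full name."""
    suffixes = GPTQ_SUFFIXES if quantized else ("weight",)
    return {f"{prefix}.{suffix}": None for suffix in suffixes}


def load_weights(params: Dict[str, None], checkpoint: List[str]) -> List[str]:
    """Hypothetical name-keyed loader: raises KeyError on an unknown name."""
    loaded = []
    for name in checkpoint:
        params[name]  # lookup fails if the layer never registered this name
        loaded.append(name)
    return loaded


gate = "layers.14.mlp.gate"
checkpoint = [f"{gate}.{suffix}" for suffix in GPTQ_SUFFIXES]

# Before the fix: the gate is built unquantized, so only `weight` exists.
failure: Optional[str] = None
try:
    load_weights(registered_params(gate, quantized=False), checkpoint)
except KeyError as exc:
    failure = exc.args[0]

# After the fix: the gate receives the quant config and registers GPTQ names.
loaded = load_weights(registered_params(gate, quantized=True), checkpoint)
```

Under these assumptions, `failure` holds the first checkpoint name the unquantized gate cannot resolve, and the quantized gate loads every tensor in the checkpoint.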