Commit 36429096 authored by JartX, committed by GitHub

[BUGFIX] GPTQ quantization compatibility for Qwen3 Next MOE models (AutoGPTQ and AutoRound-GPTQ) (#25268)
Signed-off-by: JartX <sagformas@epdcenter.es>
parent c308501c
@@ -148,9 +148,11 @@ class Qwen3NextSparseMoeBlock(nn.Module):
     def _maybe_ignore_quant_config(self, quant_config: QuantizationConfig):
         # GPTQ configs do not have a list of ignored modules, however AutoGPTQ
-        # seems to avoid gate quantization.
-        # See: https://huggingface.co/Qwen/Qwen3-30B-A3B-GPTQ-Int4
-        if isinstance(quant_config, (GPTQConfig, GPTQMarlinConfig)):
+        # seems to avoid gate quantization while AutoRound does.
+        if isinstance(
+                quant_config,
+            (GPTQConfig,
+             GPTQMarlinConfig)) and not quant_config.autoround_version:
             return None
         return quant_config
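
The behavior change in this hunk can be illustrated with a minimal, self-contained sketch. The classes below are hypothetical stand-ins, not vLLM's real `QuantizationConfig`, `GPTQConfig`, or `GPTQMarlinConfig`; the only detail taken from the diff is that the GPTQ configs expose an `autoround_version` attribute. The sketch assumes that attribute is an empty string for checkpoints not produced by AutoRound, and the version string used in the usage note is an arbitrary placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantizationConfig:
    """Hypothetical stand-in for the quantization config base class."""


@dataclass
class GPTQConfig(QuantizationConfig):
    # Assumption: empty string means "not produced by AutoRound".
    autoround_version: str = ""


@dataclass
class GPTQMarlinConfig(QuantizationConfig):
    # Same assumption as GPTQConfig above.
    autoround_version: str = ""


@dataclass
class OtherQuantConfig(QuantizationConfig):
    """Hypothetical non-GPTQ config; it should pass through unchanged."""


def maybe_ignore_quant_config(
        quant_config: QuantizationConfig) -> Optional[QuantizationConfig]:
    """Return the quant config to use for the MoE gate, mirroring the diff.

    Plain GPTQ checkpoints leave the gate unquantized, so the config is
    dropped (None). AutoRound-exported GPTQ checkpoints do quantize the
    gate, so the config is kept.
    """
    is_gptq = isinstance(quant_config, (GPTQConfig, GPTQMarlinConfig))
    if is_gptq and not quant_config.autoround_version:
        return None
    return quant_config
```

Under these assumptions, `maybe_ignore_quant_config(GPTQConfig())` returns `None`, while `maybe_ignore_quant_config(GPTQConfig(autoround_version="0.0.0"))` returns the config unchanged, as does any non-GPTQ config.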