Commit 06dd0825 authored by Yu Chin Fabian Lim, committed by GitHub

Enforce that TP > 1 is not supported for Mamba2 if Quantization is Enabled. (#14617)


Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
parent 2b22290c
@@ -251,6 +251,9 @@ class MambaMixer2(CustomOp):
"then num_groups must equal 1."
)
assert self.tp_size == 1 or quant_config is None, \
"Tensor parallel currently not supported for quantized models."
self.ssm_state_size = ssm_state_size
self.activation = activation
@@ -331,6 +334,8 @@ class MambaMixer2(CustomOp):
], self.tp_size, tp_rank)
})
if quant_config is None:
# - quant layers do not have a weight loader
delattr(self.in_proj.weight, "weight_loader")
set_weight_attrs(
self.in_proj.weight,
......
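
The two hunks above can be read as two small guards. The following is a minimal standalone sketch of that logic, not the actual `MambaMixer2` code: the helper names `check_mamba2_tp_quant` and `maybe_drop_weight_loader` are hypothetical, and `quant_config` is treated as an opaque object that is `None` when quantization is disabled.

```python
from types import SimpleNamespace
from typing import Optional


def check_mamba2_tp_quant(tp_size: int, quant_config: Optional[object]) -> None:
    # Hypothetical helper mirroring the assertion added in the first hunk:
    # tensor parallelism (tp_size > 1) and quantization cannot be combined.
    assert tp_size == 1 or quant_config is None, \
        "Tensor parallel currently not supported for quantized models."


def maybe_drop_weight_loader(weight: object, quant_config: Optional[object]) -> None:
    # Hypothetical helper mirroring the guard added in the second hunk:
    # per the diff comment, quant layers do not have a weight loader,
    # so the attribute is only removed for unquantized layers.
    if quant_config is None:
        delattr(weight, "weight_loader")


# Stand-in for an unquantized parameter that carries a weight_loader attribute.
weight = SimpleNamespace(weight_loader=lambda *args: None)
check_mamba2_tp_quant(tp_size=4, quant_config=None)   # allowed: TP without quantization
maybe_drop_weight_loader(weight, quant_config=None)   # attribute is removed
```

With any non-`None` `quant_config`, `check_mamba2_tp_quant` passes only when `tp_size == 1`, and `maybe_drop_weight_loader` leaves the parameter untouched.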