Unverified commit 08d2bd78, authored by Chendi.Xue, committed by GitHub

[BUGFIX] deepseek-v2-lite failed due to fused_qkv_a_proj name update (#21414)


Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
parent 4f76a05f
...
@@ -885,13 +885,16 @@ class DeepseekV2ForCausalLM(nn.Module, SupportsPP, MixtureOfExperts):
 # for mlp.experts[0].gate_gate_up_proj, which breaks load.
 if (("mlp.experts." in name) and name not in params_dict):
     continue
-name = name.replace(weight_name, param_name)
+name_mapped = name.replace(weight_name, param_name)
 # QKV fusion is optional, fall back to normal
 # weight loading if it's not enabled
+# if go with fusion option, then update name
 if ((param_name == "fused_qkv_a_proj")
-        and name not in params_dict):
+        and name_mapped not in params_dict):
     continue
+else:
+    name = name_mapped
 # Skip loading extra bias for GPTQ models.
 if name.endswith(".bias") and name not in params_dict:
     continue
...
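The effect of the change can be reproduced outside the model code. The following is a minimal sketch, not the actual loader: `STACKED_PARAMS_MAPPING`, `resolve_name`, the `fixed` flag, and the example weight names are hypothetical stand-ins chosen for illustration. It shows why overwriting `name` before the `fused_qkv_a_proj` check leaves a renamed key behind when fusion is not enabled, and why keeping the candidate in `name_mapped` avoids that.

```python
# Hypothetical, simplified stand-in for the loader's mapping table.
# Each entry is (param_name, weight_name); the weight names are assumptions.
STACKED_PARAMS_MAPPING = [
    ("fused_qkv_a_proj", "q_a_proj"),
    ("fused_qkv_a_proj", "kv_a_proj_with_mqa"),
]


def resolve_name(name, params_dict, fixed=True):
    """Return the parameter name a checkpoint weight would be loaded into.

    fixed=False mimics the pre-commit behaviour (name overwritten early);
    fixed=True mimics the post-commit behaviour (name updated only on success).
    """
    for param_name, weight_name in STACKED_PARAMS_MAPPING:
        if weight_name not in name:
            continue
        name_mapped = name.replace(weight_name, param_name)
        if not fixed:
            # Pre-commit: the original name is lost before the check below.
            name = name_mapped
        if param_name == "fused_qkv_a_proj" and name_mapped not in params_dict:
            # Fusion is not enabled for this model, so skip this mapping.
            continue
        name = name_mapped
        break
    # Falls through to normal (unfused) loading with whatever `name` holds.
    return name


# A model without QKV fusion only exposes the unfused parameter.
unfused = {"model.layers.0.self_attn.kv_a_proj_with_mqa.weight"}
weight = "model.layers.0.self_attn.kv_a_proj_with_mqa.weight"

print(resolve_name(weight, unfused, fixed=True) in unfused)   # True
print(resolve_name(weight, unfused, fixed=False) in unfused)  # False
```

In the pre-commit variant the fallback path receives the fused name, which does not exist in `params_dict`, so a lookup with it would fail. In the post-commit variant the original name survives the skipped mapping.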