Commit b7050ca7, authored by Taemin Lee, committed by GitHub

[BugFix] gemma loading after quantization or LoRA. (#3553)

parent c188ecb0
@@ -340,6 +340,10 @@ class GemmaForCausalLM(nn.Module):
                 weight_loader(param, loaded_weight, shard_id)
                 break
             else:
+                # lm_head is not used in vllm as it is tied with embed_token.
+                # To prevent errors, skip loading lm_head.weight.
+                if "lm_head.weight" in name:
+                    continue
                 # Skip loading extra bias for GPTQ models.
                 if name.endswith(".bias") and name not in params_dict:
                     continue
......
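The effect of the added check can be illustrated with a minimal, standalone sketch. It is not the vLLM implementation: `select_loadable_weights` is a hypothetical helper, the tensor names are made up for illustration, and the surrounding stacked-parameter handling from the real `load_weights` loop is omitted. It only shows which checkpoint entries the two `continue` branches would skip, on the assumption that a checkpoint saved after quantization or LoRA can contain a separate `lm_head.weight` entry that the model has no parameter for.

```python
from typing import Dict, Iterable, List


def select_loadable_weights(
    checkpoint_names: Iterable[str],
    params_dict: Dict[str, object],
) -> List[str]:
    """Return the checkpoint tensor names that would be loaded.

    Hypothetical helper: it mirrors only the two skip conditions in the diff.
    """
    loadable = []
    for name in checkpoint_names:
        # The LM head is tied to the input embedding, so a separately
        # stored lm_head.weight has no parameter of its own to load into.
        if "lm_head.weight" in name:
            continue
        # Skip extra bias tensors (e.g. from GPTQ checkpoints) that the
        # model does not define.
        if name.endswith(".bias") and name not in params_dict:
            continue
        loadable.append(name)
    return loadable


# Illustrative names only, not taken from a real checkpoint.
params = {"model.embed_tokens.weight": None, "model.norm.weight": None}
names = [
    "model.embed_tokens.weight",
    "lm_head.weight",
    "model.norm.weight",
    "model.layers.0.mlp.down_proj.bias",
]
result = select_loadable_weights(names, params)
```

Without the first check, the `lm_head.weight` entry would fall through to a lookup in `params_dict`, where no matching key exists.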