Unverified commit eadb4e86, authored by Artus Krohn-Grimberghe, committed by GitHub

[Bugfix] Avoid duplicate k-proj weight emission in helper (#34142)


Signed-off-by: Artus KG <artuskg@gmail.com>
parent 285bab47
@@ -958,8 +958,8 @@ def _create_fake_bias_for_k_proj(
     So that the bias for k_proj in qkv_proj can be initialized with zeros.
     """
     for name, weight in weights:
+        yield name, weight
         if name.endswith(fake_bias_key_name):
             bias = torch.zeros(weight.size(0))
             bias_name = name.replace("weight", "bias")
-            yield from [(name, weight), (bias_name, bias)]
-        yield name, weight
+            yield bias_name, bias
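To illustrate why the old ordering emitted the k_proj weight twice, here is a minimal, dependency-free sketch of both versions of the loop. It is not the project's code: the function names `fake_bias_before` and `fake_bias_after`, the sample weight names, and the key value `".k_proj.weight"` are assumptions made for this example, and plain Python lists stand in for tensors (so `len(weight)` replaces `weight.size(0)`).

```python
from typing import Iterable, Iterator

# Assumption: a plain list stands in for a torch tensor in this sketch.
Weight = list[float]
Named = tuple[str, Weight]


def fake_bias_before(weights: Iterable[Named], fake_bias_key_name: str) -> Iterator[Named]:
    """Old control flow: the matching weight is yielded inside the branch
    and then again by the unconditional yield at the end of the loop body."""
    for name, weight in weights:
        if name.endswith(fake_bias_key_name):
            bias = [0.0] * len(weight)
            bias_name = name.replace("weight", "bias")
            yield from [(name, weight), (bias_name, bias)]
        yield name, weight


def fake_bias_after(weights: Iterable[Named], fake_bias_key_name: str) -> Iterator[Named]:
    """New control flow: every weight is yielded exactly once, and a zero
    bias follows only for names that match the key."""
    for name, weight in weights:
        yield name, weight
        if name.endswith(fake_bias_key_name):
            bias = [0.0] * len(weight)
            bias_name = name.replace("weight", "bias")
            yield bias_name, bias


# Hypothetical sample input, chosen only for illustration.
weights = [
    ("attn.q_proj.weight", [1.0, 2.0]),
    ("attn.k_proj.weight", [3.0, 4.0]),
]
key = ".k_proj.weight"

before_names = [name for name, _ in fake_bias_before(weights, key)]
after_names = [name for name, _ in fake_bias_after(weights, key)]

print(before_names)
print(after_names)
```

In this sketch the old version lists `attn.k_proj.weight` twice (once before and once after the bias), while the new version lists each weight once, with the zero bias immediately after the k_proj weight.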