Commit 84896fda (unverified), authored by baoqian426, committed by GitHub

[Bugfix] deepseek-V3.2 self.weights_proj has no bias (#30841)


Signed-off-by: baoqian <1354987947@qq.com>
Signed-off-by: baoqian426 <1354987947@qq.com>
parent 4bf6c236
@@ -835,7 +835,11 @@ class Indexer(nn.Module):
         )
         self.k_norm = LayerNorm(self.head_dim, eps=1e-6)
         self.weights_proj = ReplicatedLinear(
-            hidden_size, self.n_head, quant_config=None, prefix=f"{prefix}.weights_proj"
+            hidden_size,
+            self.n_head,
+            bias=False,
+            quant_config=None,
+            prefix=f"{prefix}.weights_proj",
         )
         self.softmax_scale = self.head_dim**-0.5
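The following is a minimal sketch of why an explicit `bias=False` can matter, under two assumptions that the diff implies but does not state: that the layer creates a bias parameter when the argument is omitted, and that the checkpoint ships only a weight tensor for `weights_proj`. `TinyLinear` and its `load` method are hypothetical stand-ins written for illustration; they are not vLLM's `ReplicatedLinear` or its weight loader.

```python
# Hypothetical stand-in for a linear layer; NOT vLLM's ReplicatedLinear.
class TinyLinear:
    def __init__(self, in_features, out_features, bias=True):
        # The weight is always created; the bias only when requested.
        self.params = {
            "weight": [[0.0] * in_features for _ in range(out_features)]
        }
        if bias:
            self.params["bias"] = [0.0] * out_features

    def load(self, checkpoint):
        """Copy matching tensors and return the names the checkpoint lacked."""
        missing = []
        for name in self.params:
            if name in checkpoint:
                self.params[name] = checkpoint[name]
            else:
                missing.append(name)
        return missing


# Assumed checkpoint layout: a weight tensor only, no bias.
checkpoint = {"weight": [[1.0, 2.0], [3.0, 4.0]]}

with_default = TinyLinear(2, 2)              # bias left at its default
without_bias = TinyLinear(2, 2, bias=False)  # mirrors the fix in this commit

missing_default = with_default.load(checkpoint)
missing_fixed = without_bias.load(checkpoint)

print(missing_default)  # ['bias']
print(missing_fixed)    # []
```

In this sketch the layer built with the default carries a parameter that the checkpoint never fills, while the layer built with `bias=False` matches the checkpoint exactly.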