Commit f155ae89 authored by klhhhhh, committed by Hongxin Liu

[shardformer] support layernorm sharding for ChatGLM

parent 00f6ef15
@@ -417,7 +417,7 @@ class SelfAttention(torch.nn.Module):
         )
 =======
         self.dense = nn.Linear(self.projection_size,
-                               self.hidden_size,
+                               config.hidden_size,
                                bias=config.add_bias_linear,
                                device=device,
                                **_config_to_kwargs(config))
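For context, the change reads the output dimension of the attention's dense projection from `config.hidden_size` instead of the module attribute `self.hidden_size`. A plausible reason is that under tensor-parallel sharding, per-module attributes may be rewritten to per-rank sizes, while the output projection must still map back to the model's full hidden size. The sketch below illustrates that idea only; the `DummyConfig` class, the `world_size` value, and the tensor shapes are assumptions for illustration, not part of this commit.

```python
import torch
import torch.nn as nn

# Hypothetical config standing in for the ChatGLM config object.
class DummyConfig:
    hidden_size = 4096
    add_bias_linear = False

config = DummyConfig()
world_size = 2  # assumed tensor-parallel degree

# Under sharding, the per-rank projection size is a slice of the full width.
projection_size = config.hidden_size // world_size

# The dense layer maps the (possibly sharded) projection back to the full
# hidden size, so its output dimension is taken from the config rather than
# from an attribute that sharding may have rewritten to a per-rank value.
dense = nn.Linear(projection_size,
                  config.hidden_size,
                  bias=config.add_bias_linear)

x = torch.randn(1, 8, projection_size)   # (batch, seq, per-rank width)
out = dense(x)
assert out.shape[-1] == config.hidden_size  # full width restored
```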