Unverified Commit 136cd893 authored by JB (Don)'s avatar JB (Don) Committed by GitHub
Browse files

Always initialize tied output_embeddings if it has a bias term (#28947)

Continue to initialize tied output_embeddings if it has a bias term

The bias term is not tied, and so will need to be initialized accordingly.
parent 792819f6
...@@ -3748,10 +3748,12 @@ class PreTrainedModel(nn.Module, ModuleUtilsMixin, GenerationMixin, PushToHubMix ...@@ -3748,10 +3748,12 @@ class PreTrainedModel(nn.Module, ModuleUtilsMixin, GenerationMixin, PushToHubMix
else: else:
_loaded_keys = loaded_keys _loaded_keys = loaded_keys
not_initialized_submodules = set_initialized_submodules(model, _loaded_keys) not_initialized_submodules = set_initialized_submodules(model, _loaded_keys)
# if we're about to tie the output embeds to the input embeds we don't need to init them # If we're about to tie the output embeds to the input embeds we don't need to init them
if hasattr(model.config, "tie_word_embeddings") and model.config.tie_word_embeddings: if hasattr(model.config, "tie_word_embeddings") and model.config.tie_word_embeddings:
output_embeddings = model.get_output_embeddings() output_embeddings = model.get_output_embeddings()
if output_embeddings is not None: if output_embeddings is not None:
# Still need to initialize if there is a bias term since biases are not tied.
if not hasattr(output_embeddings, "bias") or output_embeddings.bias is None:
output_embeddings._is_hf_initialized = True output_embeddings._is_hf_initialized = True
else: else:
not_initialized_submodules = dict(model.named_modules()) not_initialized_submodules = dict(model.named_modules())
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or sign in to comment