Gemma: update activation warning (#29995)

* Gemma: only display act. warning when necessary This is a nit PR, but I was confused. I got the warning even after I had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I was using the "legacy" `gelu_pytorch_tanh`. Another option is to keep the warning but change the message to say something like "`hidden_act` is ignored, please use `hidden_activation` instead. Setting Gemma's activation function to `gelu_pytorch_tanh`". * Change message, and set `config.hidden_activation`

Gemma: update activation warning (#29995)
* Gemma: only display act. warning when necessary This is a nit PR, but I was confused. I got the warning even after I had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I was using the "legacy" `gelu_pytorch_tanh`. Another option is to keep the warning but change the message to say something like "`hidden_act` is ignored, please use `hidden_activation` instead. Setting Gemma's activation function to `gelu_pytorch_tanh`". * Change message, and set `config.hidden_activation`
f4f18afd · Pedro Cuenca · GitHub · bbaa8cef · f4f18afd
Unverified Commit f4f18afd authored May 01, 2024 by Pedro Cuenca Committed by GitHub May 01, 2024
Hide whitespace changes
Inline Side-by-side

Showing with 6 additions and 8 deletions

src/transformers/models/gemma/modeling_gemma.py src/transformers/models/gemma/modeling_gemma.py +6 -8

No files found.
--- a/src/transformers/models/gemma/modeling_gemma.py
+++ b/src/transformers/models/gemma/modeling_gemma.py
@@ -174,15 +174,13 @@ class GemmaMLP(nn.Module):
        self.down_proj = nn.Linear(self.intermediate_size, self.hidden_size, bias=False)
        if config.hidden_activation is None:
            logger.warning_once(
-                "Gemma's activation function should be approximate GeLU and not exact GeLU.\n"
-                "Changing the activation function to `gelu_pytorch_tanh`."
-                f"if you want to use the legacy `{config.hidden_act}`, "
-                f"edit the `model.config` to set `hidden_activation={config.hidden_act}` "
-                "  instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details."
+                "`config.hidden_act` is ignored, you should use `config.hidden_activation` instead.\n"
+                "Gemma's activation function will be set to `gelu_pytorch_tanh`. Please, use\n"
+                "`config.hidden_activation` if you want to override this behaviour.\n"
+                "See https://github.com/huggingface/transformers/pull/29402 for more details."
            )
-            hidden_activation = "gelu_pytorch_tanh"
-        else:
-            hidden_activation = config.hidden_activation
+            config.hidden_activation = "gelu_pytorch_tanh"
+        hidden_activation = config.hidden_activation
        self.act_fn = ACT2FN[hidden_activation]

    def forward(self, x):