Unverified Commit d63bdf78 authored by Jim Briggs, committed by GitHub

Add FP32 cast in ConvNext LayerNorm to prevent rounding errors with FP16 input (#18746)

* Adding cast to fp32 in convnext layernorm to prevent rounding errors in the case of fp16 input

* Trigger CI
parent 532ca050
...
@@ -109,9 +109,12 @@ class ConvNextLayerNorm(nn.Module):
         if self.data_format == "channels_last":
             x = torch.nn.functional.layer_norm(x, self.normalized_shape, self.weight, self.bias, self.eps)
         elif self.data_format == "channels_first":
+            input_dtype = x.dtype
+            x = x.float()
             u = x.mean(1, keepdim=True)
             s = (x - u).pow(2).mean(1, keepdim=True)
             x = (x - u) / torch.sqrt(s + self.eps)
+            x = x.to(dtype=input_dtype)
             x = self.weight[:, None, None] * x + self.bias[:, None, None]
         return x
...
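
For reference, here is a minimal standalone sketch of the patched channels_first path. The class below mirrors the snippet in the diff but drops the data_format handling and the channels_last branch, and the input shape and offset in the quick check are illustrative values, not taken from the model:

import torch
import torch.nn as nn

class ConvNextLayerNorm(nn.Module):
    """Channels-first LayerNorm, matching the patched forward in the diff (sketch only)."""

    def __init__(self, normalized_shape, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(normalized_shape))
        self.bias = nn.Parameter(torch.zeros(normalized_shape))
        self.eps = eps

    def forward(self, x):
        # Compute mean/variance in fp32 so fp16 inputs do not lose precision,
        # then cast back to the caller's dtype before the affine transform.
        input_dtype = x.dtype
        x = x.float()
        u = x.mean(1, keepdim=True)
        s = (x - u).pow(2).mean(1, keepdim=True)
        x = (x - u) / torch.sqrt(s + self.eps)
        x = x.to(dtype=input_dtype)
        return self.weight[:, None, None] * x + self.bias[:, None, None]

if __name__ == "__main__":
    # Illustrative check: activations with a large constant offset are where
    # half-precision statistics tend to round badly.
    torch.manual_seed(0)
    x = torch.randn(2, 96, 14, 14) * 0.01 + 60.0
    norm = ConvNextLayerNorm(96)
    ref = norm(x)            # fp32 reference
    out = norm(x.half())     # fp16 input, statistics computed in fp32
    print((out.float() - ref).abs().max())

With the fp32 cast in place, the fp16 result stays close to the fp32 reference; computing the mean and variance directly in fp16 on such inputs would not.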