XLNet Bug when training with apex 16-bit precision (#6567)

* xlnet fp16 bug fix
* comment cast added
* Update modeling_xlnet.py

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>

Commit 95395837 (unverified), authored Aug 20, 2020 by Ivan Dolgov, committed by GitHub Aug 21, 2020

Showing 1 changed file with 2 additions and 1 deletion

src/transformers/modeling_xlnet.py +2 -1

--- a/src/transformers/modeling_xlnet.py
+++ b/src/transformers/modeling_xlnet.py
@@ -446,7 +446,8 @@ class XLNetRelativeAttention(nn.Module):
            v_head_h = torch.einsum("ibh,hnd->ibnd", cat, self.v)

            # positional heads
-            k_head_r = torch.einsum("ibh,hnd->ibnd", r, self.r)
+            # type casting for fp16 support
+            k_head_r = torch.einsum("ibh,hnd->ibnd", r.type(self.r.dtype), self.r)

            # core attention ops
            attn_vec = self.rel_attn_core(
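
The change above casts the positional input `r` to the dtype of the parameter `self.r` before the einsum contraction, so both operands share a dtype even when the parameters have been converted to half precision (presumably by apex) while `r` was built in float32. The following is a minimal standalone sketch of that pattern, not the library's code: the helper name `project_positional`, the tensor sizes, and the use of float64 as a CPU-friendly stand-in for the fp16 parameter dtype are all assumptions made for illustration.

```python
import torch


def project_positional(r, weight):
    """Contract r (i, b, h) with weight (h, n, d) after matching dtypes.

    Hypothetical helper mirroring the patched line: the input is cast to
    the weight's dtype so the einsum operands agree.
    """
    return torch.einsum("ibh,hnd->ibnd", r.type(weight.dtype), weight)


# Illustrative sizes only (assumed, not taken from the model config).
seq_len, batch, hidden, n_head, d_head = 4, 2, 8, 2, 4

# Positional input created in the default dtype (float32).
r = torch.randn(seq_len, batch, hidden)

# Parameter stored in a different dtype. float64 is a stand-in here so the
# sketch runs on CPU; in the fp16 training scenario this would be torch.float16.
weight = torch.randn(hidden, n_head, d_head, dtype=torch.float64)

k_head_r = project_positional(r, weight)
print(k_head_r.dtype, tuple(k_head_r.shape))
```

Casting toward the parameter's dtype, rather than the other way around, keeps the result in whatever precision the model weights use, which is what the rest of the attention computation expects.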