Fix fp16 masking in PoolerEndLogits

Needed to run XLNet (at least on SQuAD) with `--fp16 --fp16_opt_level="O2"`; without it the loss is `NaN` from the first step and fine-tuning cannot proceed.

Commit ec94f4e0 (unverified), authored Sep 18, 2019 by Simon Layton, committed by GitHub Sep 18, 2019

Showing with 4 additions and 1 deletion

pytorch_transformers/modeling_utils.py +4 -1

--- a/pytorch_transformers/modeling_utils.py
+++ b/pytorch_transformers/modeling_utils.py
@@ -478,6 +478,9 @@ class PoolerEndLogits(nn.Module):
        x = self.dense_1(x).squeeze(-1)

        if p_mask is not None:
+            if next(self.parameters()).dtype == torch.float16:
+                x = x * (1 - p_mask) - 65500 * p_mask
+            else:
-            x = x * (1 - p_mask) - 1e30 * p_mask
+                x = x * (1 - p_mask) - 1e30 * p_mask

        return x
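
The choice of `65500` over `1e30` comes down to the fp16 range. The following is a minimal, stdlib-only sketch of that range limit, not the code path PyTorch or apex actually takes. The helper `to_fp16` is hypothetical and exists only for this illustration. It uses the `struct` module's half-precision format to show that `1e30` has no finite fp16 representation while `65500` does, and that an infinite fill value multiplied by a mask entry of 0 yields `NaN`, which is one plausible way the loss gets poisoned.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(value):
    """Round a Python float to the nearest fp16 value; return None on overflow.

    Hypothetical helper for illustration only, not part of pytorch_transformers.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        return None


# 1e30 is far outside the fp16 range, so it cannot be stored as a finite half.
big_fill = to_fp16(1e30)

# 65500 is in range; it rounds to the nearest representable half, 65504.
small_fill = to_fp16(65500.0)

# Assumption: if an out-of-range fill ends up as infinity, multiplying it by
# a mask entry of 0 gives NaN, which then propagates through the subtraction.
poisoned = float("inf") * 0.0

print(big_fill, small_fill, math.isnan(poisoned))  # None 65504.0 True
```

With a fill value that stays finite in fp16, masked positions still receive a very large negative logit, while unmasked positions (`p_mask == 0`) are left unchanged.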