"vscode:/vscode.git/clone" did not exist on "db78fac5dfcb4643c5f43579879995b6e6a3dfc6"
Commit 43722c5e authored by Naman Goyal, committed by Facebook Github Bot

convert logits to fp32 for calculating loss in masked_lm_loss criterion

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/568

Differential Revision: D15308483

Pulled By: myleott

fbshipit-source-id: 9d898ce523e46e6b6fb444274f478da0b577b603
parent 5dcc855a
@@ -7,6 +7,7 @@
 import math
+import torch
 import torch.nn.functional as F
 from fairseq import utils
@@ -22,8 +23,8 @@ def compute_cross_entropy_loss(logits, targets, ignore_index=-100):
     assert logits.size(0) == targets.size(-1), \
         "Logits and Targets tensor shapes don't match up"
-    loss = F.cross_entropy(
-        logits,
+    loss = F.nll_loss(
+        F.log_softmax(logits, -1, dtype=torch.float32),
         targets,
         reduction="sum",
         ignore_index=ignore_index,
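For context, here is a minimal standalone sketch of the new pattern (the shapes and tensor values are illustrative, not from the commit): under mixed-precision training the model emits fp16 logits, and passing dtype=torch.float32 to F.log_softmax upcasts them before the softmax, so the normalization and the summed loss are computed in full precision. F.nll_loss applied to log-probabilities is then equivalent to F.cross_entropy applied to raw logits.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: 8 token positions over a 1000-word vocabulary.
logits = torch.randn(8, 1000).half()    # fp16 logits, as in mixed-precision training
targets = torch.randint(0, 1000, (8,))  # gold token ids

# dtype=torch.float32 casts the fp16 logits to fp32 *before* the
# log-softmax, so the normalization and summed reduction stay in full
# precision; nll_loss on log-probs matches cross_entropy on logits.
loss = F.nll_loss(
    F.log_softmax(logits, -1, dtype=torch.float32),
    targets,
    reduction="sum",
    ignore_index=-100,
)
print(loss.dtype)  # torch.float32
```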