Commit 0c75c760 authored by Chunting Zhou's avatar Chunting Zhou Committed by Facebook Github Bot


Fix bug (the returned value has a dimension mismatch) in label-smoothed-cross-entropy for MoE (#1037)

Summary:
MoE will encounter a dimension mismatch when using label-smoothed cross entropy as the criterion, which occurs at https://github.com/pytorch/fairseq/blob/master/fairseq/tasks/translation_moe.py#L125. This is a fix for that bug.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1037

Differential Revision: D16892674

Pulled By: myleott

fbshipit-source-id: a73bc03d2280356667d02422d22ad11d968d0c65
parent 732d15a9
```diff
@@ -16,9 +16,9 @@ def label_smoothed_nll_loss(lprobs, target, epsilon, ignore_index=None, reduce=True):
     nll_loss = -lprobs.gather(dim=-1, index=target)
     smooth_loss = -lprobs.sum(dim=-1, keepdim=True)
     if ignore_index is not None:
-        non_pad_mask = target.ne(ignore_index)
-        nll_loss = nll_loss[non_pad_mask]
-        smooth_loss = smooth_loss[non_pad_mask]
+        pad_mask = target.eq(ignore_index)
+        nll_loss[pad_mask] = nll_loss[pad_mask] * 0.
+        smooth_loss[pad_mask] = smooth_loss[pad_mask] * 0.
     else:
         nll_loss = nll_loss.squeeze(-1)
         smooth_loss = smooth_loss.squeeze(-1)
```
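The underlying issue is that indexing with a boolean mask (`nll_loss[non_pad_mask]`) returns a flattened 1-D tensor whose length depends on how many tokens are padded, so downstream code in `translation_moe.py` that expects the original shape breaks. Zeroing the padded entries in place keeps the shape intact while still excluding padding from the loss. A minimal sketch of the difference (toy shapes and a made-up pad index of 0, not taken from the commit):

```python
import torch

# Toy log-probabilities over a vocab of 5 for 4 target positions.
lprobs = torch.log_softmax(torch.randn(4, 5), dim=-1)
target = torch.tensor([[1], [2], [0], [3]])       # assume 0 is the pad index
nll_loss = -lprobs.gather(dim=-1, index=target)   # shape (4, 1)

# Old behavior: boolean indexing drops padded rows and flattens the result,
# so the returned tensor's shape depends on the amount of padding.
non_pad_mask = target.ne(0)
old = nll_loss[non_pad_mask]                      # 1-D tensor of length 3

# New behavior: zero out padded entries in place; shape (4, 1) is preserved.
pad_mask = target.eq(0)
new = nll_loss.clone()
new[pad_mask] = new[pad_mask] * 0.

print(old.shape)   # changed shape
print(new.shape)   # same shape as nll_loss
```

Since the padded entries contribute exactly zero, a subsequent `.sum()` gives the same reduced loss either way; only the unreduced shape changes.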