Merge pull request #251 from Iwontbecreative/active_loss_tok_classif

Only keep the active part of the loss for token classification

Commit bd746326 (fd223374, f3bda235), authored Feb 05, 2019 by Thomas Wolf, committed by GitHub Feb 05, 2019 (unverified)

Showing with 8 additions and 1 deletion

pytorch_pretrained_bert/modeling.py +8 -1

--- a/pytorch_pretrained_bert/modeling.py
+++ b/pytorch_pretrained_bert/modeling.py
@@ -1025,6 +1025,13 @@ class BertForTokenClassification(PreTrainedBertModel):
        if labels is not None:
            loss_fct = CrossEntropyLoss()
+            # Only keep active parts of the loss
+            if attention_mask is not None:
+                active_loss = attention_mask.view(-1) == 1
+                active_logits = logits.view(-1, self.num_labels)[active_loss]
+                active_labels = labels.view(-1)[active_loss]
+                loss = loss_fct(active_logits, active_labels)
+            else:
-            loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
+                loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
            return loss
        else:
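
The effect of the masking in the diff can be checked in isolation. The following is a minimal standalone sketch, not code from the PR: the tensor sizes, the random inputs, and the mask values are made up for illustration, and `num_labels` is a plain variable standing in for `self.num_labels`. It assumes `attention_mask` uses 1 for real tokens and 0 for padding, as the `== 1` comparison in the diff suggests.

```python
import torch
from torch.nn import CrossEntropyLoss

torch.manual_seed(0)

# Illustrative sizes only; these do not come from the PR.
batch_size, seq_len, num_labels = 2, 4, 3

logits = torch.randn(batch_size, seq_len, num_labels)
labels = torch.randint(0, num_labels, (batch_size, seq_len))
# Assumed convention: 1 marks a real token, 0 marks padding.
attention_mask = torch.tensor([[1, 1, 1, 0],
                               [1, 1, 0, 0]])

loss_fct = CrossEntropyLoss()

# Same selection as in the diff: flatten, then keep only active positions.
active_loss = attention_mask.view(-1) == 1
active_logits = logits.view(-1, num_labels)[active_loss]
active_labels = labels.view(-1)[active_loss]
masked_loss = loss_fct(active_logits, active_labels)

# Behaviour before the change: every position, padding included, is averaged.
unmasked_loss = loss_fct(logits.view(-1, num_labels), labels.view(-1))

# Cross-check: the masked loss is the mean of the per-token losses
# taken over the active positions only.
per_token = CrossEntropyLoss(reduction="none")(
    logits.view(-1, num_labels), labels.view(-1)
)
manual_masked_loss = per_token[active_loss].mean()

print("active positions:", int(active_loss.sum()))
print("masked loss:", masked_loss.item())
print("unmasked loss:", unmasked_loss.item())
```

With this mask, 5 of the 8 flattened positions are active, so the masked loss averages over 5 tokens while the unmasked loss averages over all 8, including padding whose labels carry no meaning.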