Update the doc string for T5WithLMHeadModel

T5WithLMHeadModel's doc string claims that label indices of -1 are ignored when computing the cross-entropy loss in the forward pass. In practice, indices of -1 raise an error, and it is indices of -100 that are ignored. This commit updates the doc string to match the class's behavior.
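The masking behavior described above can be illustrated with a small, dependency-free sketch. It is not the code in `modeling_t5.py`. The function name `masked_cross_entropy` and the constant `IGNORE_INDEX` are hypothetical, and the value -100 is assumed to mirror the default `ignore_index` of `torch.nn.CrossEntropyLoss`. Positions labeled -100 are skipped, while any other out-of-range label such as -1 is rejected.

```python
import math

# Assumption: mirrors the default ignore_index of torch.nn.CrossEntropyLoss.
IGNORE_INDEX = -100


def masked_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over positions whose label is not ignore_index.

    logits: list of per-position score lists (each of length vocab_size)
    labels: list of integer labels, one per position
    """
    total, counted = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == ignore_index:
            continue  # masked position: contributes nothing to the loss
        if not 0 <= label < len(scores):
            # Any other out-of-range label (for example -1) is an error.
            raise IndexError(
                f"label {label} is out of range for vocab size {len(scores)}"
            )
        # Numerically stable log-sum-exp of the scores.
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_z - scores[label]  # negative log-probability of the label
        counted += 1
    if counted == 0:
        return float("nan")  # no unmasked positions: the mean is undefined
    return total / counted


logits = [[2.0, 0.5, 0.1], [0.3, 1.7, 0.2], [0.0, 0.0, 0.0]]
loss_all = masked_cross_entropy(logits, [0, 1, 2])        # all three positions count
loss_masked = masked_cross_entropy(logits, [0, 1, -100])  # last position is ignored
```

With the last label set to -100, the result equals the loss computed over the first two positions only. Passing -1 in its place raises `IndexError` in this sketch, which parallels the error reported in the commit message.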

Commit 62f58046 authored Jan 21, 2020 by Nicholas Lourie Committed by Lysandre Debut Jan 24, 2020

Showing 1 changed file with 3 additions and 3 deletions

src/transformers/modeling_t5.py +3 -3

--- a/src/transformers/modeling_t5.py
+++ b/src/transformers/modeling_t5.py
@@ -802,9 +802,9 @@ class T5WithLMHeadModel(T5PreTrainedModel):
    r"""
        **lm_labels**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
            Labels for computing the masked language modeling loss.
-            Indices should be in ``[-1, 0, ..., config.vocab_size]`` (see ``input_ids`` docstring)
+            Indices should either be in ``[0, ..., config.vocab_size]`` or -100 (see ``input_ids`` docstring).
-            Tokens with indices set to ``-1`` are ignored (masked), the loss is only computed for the tokens with labels
+            Tokens with indices set to ``-100`` are ignored (masked), the loss is only computed for the tokens with labels
-            in ``[0, ..., config.vocab_size]``
+            in ``[0, ..., config.vocab_size]``.
    Outputs: `Tuple` comprising various elements depending on the configuration (config) and inputs:
        **loss**: (`optional`, returned when ``lm_labels`` is provided) ``torch.FloatTensor`` of shape ``(1,)``: