Update token_classification.md (#24484)

Add a link to the PyTorch CrossEntropyLoss documentation so that readers understand why '-100' is ignored by the loss function.

c2aa5e17 · condor-cp · GitHub · 3ca02223 · c2aa5e17
Commit c2aa5e17 authored Jun 26, 2023 by condor-cp; committed by GitHub Jun 26, 2023

Showing with 1 addition and 1 deletion

docs/source/en/tasks/token_classification.md +1 -1

--- a/docs/source/en/tasks/token_classification.md
+++ b/docs/source/en/tasks/token_classification.md
@@ -126,7 +126,7 @@ As you saw in the example `tokens` field above, it looks like the input has alre
 However, this adds some special tokens `[CLS]` and `[SEP]` and the subword tokenization creates a mismatch between the input and labels. A single word corresponding to a single label may now be split into two subwords. You'll need to realign the tokens and labels by:
 1. Mapping all tokens to their corresponding word with the [`word_ids`](https://huggingface.co/docs/transformers/main_classes/tokenizer#transformers.BatchEncoding.word_ids) method.
-2. Assigning the label `-100` to the special tokens `[CLS]` and `[SEP]` so they're ignored by the PyTorch loss function.
+2. Assigning the label `-100` to the special tokens `[CLS]` and `[SEP]` so they're ignored by the PyTorch loss function (see [CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html)).
 3. Only labeling the first token of a given word. Assign `-100` to other subtokens from the same word.
 Here is how you can create a function to realign the tokens and labels, and truncate sequences to be no longer than DistilBERT's maximum input length:
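
The three realignment steps in the hunk above, and the reason `-100` drops out of the loss, can be sketched in plain Python. This is an illustration only, not the function from the documentation page: `align_labels` and `masked_cross_entropy` are hypothetical helpers, the `word_ids` list is hand-written in the shape that the tokenizer's `word_ids` method returns (`None` for special tokens), and the loss is a simplified stand-in that mimics the default `ignore_index=-100` of PyTorch's `CrossEntropyLoss`.

```python
import math

# Same value as the default `ignore_index` of torch.nn.CrossEntropyLoss.
IGNORE_INDEX = -100


def align_labels(word_ids, word_labels):
    """Hypothetical helper: expand word-level labels to token-level labels.

    word_ids: one entry per token, None for special tokens such as [CLS]/[SEP].
    word_labels: one integer label per original word.
    """
    aligned = []
    previous_word_id = None
    for word_id in word_ids:
        if word_id is None:
            # Step 2: special tokens get -100 so the loss skips them.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous_word_id:
            # Step 3: only the first subtoken of a word keeps the label.
            aligned.append(word_labels[word_id])
        else:
            # Step 3: remaining subtokens of the same word are ignored.
            aligned.append(IGNORE_INDEX)
        previous_word_id = word_id
    return aligned


def masked_cross_entropy(logits, labels):
    """Simplified stand-in for a mean cross-entropy that skips IGNORE_INDEX."""
    losses = []
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # this position contributes nothing to the loss
        log_norm = math.log(sum(math.exp(x) for x in row))
        losses.append(log_norm - row[label])
    # In this sketch, a batch with no labelled positions yields NaN.
    return sum(losses) / len(losses) if losses else float("nan")


# Illustrative input: [CLS], word 0, word 1 split in two subtokens, word 2, [SEP]
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [3, 0, 7]
print(align_labels(word_ids, word_labels))
# → [-100, 3, 0, -100, 7, -100]
```

Because positions labelled `-100` are skipped before averaging, the logits predicted for `[CLS]`, `[SEP]`, and non-initial subtokens have no effect on the loss value.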