Commit daf53241, authored by Mayank Agarwal, committed by GitHub

Fix word_ids hyperlink (#22765)

* Fix word_ids hyperlink

* Add suggested fix
parent 06e737fb
@@ -121,7 +121,7 @@ As you saw in the example `tokens` field above, it looks like the input has alre
 However, this adds some special tokens `[CLS]` and `[SEP]` and the subword tokenization creates a mismatch between the input and labels. A single word corresponding to a single label may now be split into two subwords. You'll need to realign the tokens and labels by:
-1. Mapping all tokens to their corresponding word with the [`word_ids`](https://huggingface.co/docs/tokenizers/python/latest/api/reference.html#tokenizers.Encoding.word_ids) method.
+1. Mapping all tokens to their corresponding word with the [`word_ids`](https://huggingface.co/docs/transformers/main_classes/tokenizer#transformers.BatchEncoding.word_ids) method.
 2. Assigning the label `-100` to the special tokens `[CLS]` and `[SEP]` so they're ignored by the PyTorch loss function.
 3. Only labeling the first token of a given word. Assign `-100` to other subtokens from the same word.
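
For reviewers who want to see what the three realignment steps in the changed passage amount to, here is a minimal sketch in plain Python. It is not taken from this commit or from the documentation page. The function name `align_labels_with_word_ids`, the constant `IGNORE_INDEX`, and the sample inputs are hypothetical. The sketch assumes `word_ids` is a list shaped like the one the `word_ids` method returns for a single sequence: `None` for special tokens, and otherwise the index of the word each token came from.

```python
from typing import List, Optional

# -100 is the default ignore_index of PyTorch's cross-entropy loss,
# so positions labeled with it do not contribute to the loss.
IGNORE_INDEX = -100


def align_labels_with_word_ids(
    word_ids: List[Optional[int]], word_labels: List[int]
) -> List[int]:
    """Hypothetical helper: expand per-word labels to per-token labels."""
    aligned = []
    previous_word_id = None
    for word_id in word_ids:
        if word_id is None:
            # Step 2: special tokens such as [CLS] and [SEP] get -100.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous_word_id:
            # Step 3: the first token of a word keeps that word's label.
            aligned.append(word_labels[word_id])
        else:
            # Step 3: later subtokens of the same word get -100.
            aligned.append(IGNORE_INDEX)
        previous_word_id = word_id
    return aligned


# Made-up example: three words, the second split into two subword tokens,
# wrapped in two special tokens.
example_word_ids = [None, 0, 1, 1, 2, None]
example_word_labels = [3, 0, 7]
aligned_labels = align_labels_with_word_ids(example_word_ids, example_word_labels)
print(aligned_labels)  # [-100, 3, 0, -100, 7, -100]
```

In real use, the `word_ids` list would come from the tokenizer output (step 1) rather than being written by hand as it is here.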