Unverified Commit c0d97cee authored by Lysandre Debut, committed by GitHub

Adds a note to resize the token embedding matrix when adding special tokens (#11120)

* Adds a note to resize the token embedding matrix when adding special tokens

* Remove superfluous space
parent 02f7c2fe
@@ -825,7 +825,13 @@ class SpecialTokensMixin:
         special tokens are NOT in the vocabulary, they are added to it (indexed starting from the last index of the
         current vocabulary).
 
-        Using : obj:`add_special_tokens` will ensure your special tokens can be used in several ways:
+        .. Note::
+            When adding new tokens to the vocabulary, you should make sure to also resize the token embedding matrix
+            of the model so that its embedding matrix matches the tokenizer.
+
+            In order to do that, please use the :meth:`~transformers.PreTrainedModel.resize_token_embeddings` method.
+
+        Using :obj:`add_special_tokens` will ensure your special tokens can be used in several ways:
 
         - Special tokens are carefully handled by the tokenizer (they are never split).
         - You can easily refer to special tokens using tokenizer class attributes like :obj:`tokenizer.cls_token`. This
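For context, a minimal sketch of the workflow the new note describes, using the public add_special_tokens and resize_token_embeddings APIs; the checkpoint name and the token strings below are illustrative and not part of the commit:

    from transformers import AutoModelForMaskedLM, AutoTokenizer

    # Illustrative checkpoint; the same pattern applies to any model with a token embedding matrix.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    # New special tokens are appended after the last index of the current vocabulary.
    num_added = tokenizer.add_special_tokens({"additional_special_tokens": ["<ent>", "<rel>"]})

    # Resize the model's token embedding matrix so it matches the enlarged tokenizer vocabulary.
    if num_added > 0:
        model.resize_token_embeddings(len(tokenizer))

Without the resize, ids of the newly added tokens fall outside the model's embedding matrix and trigger an index error at lookup time, which is what the added docstring note warns about.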