Commit e73a3e18, authored by Lysandre Debut, committed by GitHub

Add note to resize token embeddings matrix when adding new tokens to voc (#10331)

parent 19e737b9
@@ -971,6 +971,12 @@ class SpecialTokensMixin:
Add a list of new tokens to the tokenizer class. If the new tokens are not in the vocabulary, they are added to
it with indices starting from length of the current vocabulary.
.. Note::
    When adding new tokens to the vocabulary, you should make sure to also resize the token embedding matrix of
    the model so that its embedding matrix matches the tokenizer.

    In order to do that, please use the :meth:`~transformers.PreTrainedModel.resize_token_embeddings` method.
Args:
new_tokens (:obj:`str`, :obj:`tokenizers.AddedToken` or a list of `str` or :obj:`tokenizers.AddedToken`):
Tokens are only added if they are not already in the vocabulary. :obj:`tokenizers.AddedToken` wraps a
......
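
The note added in this commit can be illustrated with a small, self-contained sketch. It does not use the transformers library: ``ToyTokenizer`` and ``ToyModel`` are hypothetical stand-ins, written only to show why an embedding matrix with one row per token id has to grow when new ids are appended to the vocabulary. The method names mirror ``add_tokens`` and ``resize_token_embeddings`` for readability, but the bodies are assumptions and not the library's implementation.

```python
import random


class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer: maps tokens to integer ids."""

    def __init__(self, vocab):
        self.vocab = {tok: i for i, tok in enumerate(vocab)}

    def __len__(self):
        return len(self.vocab)

    def add_tokens(self, new_tokens):
        # Only tokens missing from the vocabulary are added; new ids start
        # at the current vocabulary length, as the docstring describes.
        added = 0
        for tok in new_tokens:
            if tok not in self.vocab:
                self.vocab[tok] = len(self.vocab)
                added += 1
        return added


class ToyModel:
    """Hypothetical stand-in for a model holding one embedding row per id."""

    def __init__(self, vocab_size, dim=4, seed=0):
        self._rng = random.Random(seed)
        self.dim = dim
        self.embeddings = [self._new_row() for _ in range(vocab_size)]

    def _new_row(self):
        return [self._rng.uniform(-0.1, 0.1) for _ in range(self.dim)]

    def resize_token_embeddings(self, new_size):
        # Keep the existing rows and append freshly initialised ones
        # (or truncate if new_size is smaller than the current size).
        kept = self.embeddings[:new_size]
        extra = [self._new_row() for _ in range(new_size - len(kept))]
        self.embeddings = kept + extra

    def embed(self, token_id):
        return self.embeddings[token_id]


tokenizer = ToyTokenizer(["[PAD]", "hello", "world"])
model = ToyModel(vocab_size=len(tokenizer))

# "hello" already exists, so only "<new_tok>" is added.
num_added = tokenizer.add_tokens(["hello", "<new_tok>"])
new_id = tokenizer.vocab["<new_tok>"]

# Before resizing, the new id points past the last embedding row.
try:
    model.embed(new_id)
    lookup_failed_before_resize = False
except IndexError:
    lookup_failed_before_resize = True

# After resizing to the tokenizer length, the lookup succeeds.
model.resize_token_embeddings(len(tokenizer))
vector_after_resize = model.embed(new_id)
```

With the real library, the corresponding pattern is to call ``tokenizer.add_tokens(...)`` and then ``model.resize_token_embeddings(len(tokenizer))``, so that the model has an embedding row for every id the tokenizer can produce.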