Commit 57b980a6 authored by Vladimir Maryasin, committed by GitHub

Fix saving FlaubertTokenizer configs (#14991)

All tokenizer-specific config properties must be passed to the base
class (XLMTokenizer) in order to be saved. This was not the case for
the do_lowercase property, so it was not saved by the save_pretrained()
method, and saving and reloading the tokenizer changed its behaviour.

This commit fixes it.
parent 16f0b7d7
@@ -96,7 +96,7 @@ class FlaubertTokenizer(XLMTokenizer):
     max_model_input_sizes = PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
     def __init__(self, do_lowercase=False, **kwargs):
-        super().__init__(**kwargs)
+        super().__init__(do_lowercase=do_lowercase, **kwargs)
         self.do_lowercase = do_lowercase
         self.do_lowercase_and_remove_accent = False
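
The mechanism described in the commit message can be illustrated with a minimal, self-contained sketch. It does not use the real library: BaseTokenizer, BuggyTokenizer, FixedTokenizer and round_trip are hypothetical stand-ins, and the assumption is simply that the base class persists only the keyword arguments it receives in its constructor.

```python
import json
import os
import tempfile


class BaseTokenizer:
    """Hypothetical stand-in for a base class that persists its init kwargs."""

    def __init__(self, **kwargs):
        # Only keyword arguments that reach the base class are remembered.
        self.init_kwargs = dict(kwargs)

    def save_pretrained(self, directory):
        # Write the remembered kwargs to a JSON config file.
        path = os.path.join(directory, "tokenizer_config.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.init_kwargs, f)
        return path

    @classmethod
    def from_pretrained(cls, directory):
        # Rebuild the tokenizer from the saved kwargs.
        path = os.path.join(directory, "tokenizer_config.json")
        with open(path, encoding="utf-8") as f:
            return cls(**json.load(f))


class BuggyTokenizer(BaseTokenizer):
    def __init__(self, do_lowercase=False, **kwargs):
        super().__init__(**kwargs)  # do_lowercase never reaches the base class
        self.do_lowercase = do_lowercase


class FixedTokenizer(BaseTokenizer):
    def __init__(self, do_lowercase=False, **kwargs):
        super().__init__(do_lowercase=do_lowercase, **kwargs)  # forwarded, so it is saved
        self.do_lowercase = do_lowercase


def round_trip(tokenizer_cls):
    """Save a tokenizer built with do_lowercase=True, reload it, return the flag."""
    with tempfile.TemporaryDirectory() as tmp:
        tokenizer_cls(do_lowercase=True).save_pretrained(tmp)
        return tokenizer_cls.from_pretrained(tmp).do_lowercase


if __name__ == "__main__":
    print(round_trip(BuggyTokenizer))  # False: the setting was lost on reload
    print(round_trip(FixedTokenizer))  # True: the setting survives the round trip
```

In this sketch the buggy variant silently falls back to the default after reloading, which mirrors the behaviour change the commit message reports; forwarding the argument to the base constructor is the only difference between the two subclasses.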