- 21 Jun, 2024 1 commit
Ita Zaporozhets authored
* PR SPLIT: moving original changes for adding user defined symbols
* adding gemma test and generalizing gemma converter
* ruff
* update common test
* update serialization test
* deberta v2 tests updates as rust version adds '.' as a user added token, so a space is not added
* removing commented lines
* applying feedback - use only added_tokens to add and check piece.type instead of trainer_spec for user_defined_symbols
* add comment referencing sentencepiece
-
- 22 May, 2024 1 commit
Arthur authored
* update ruff version
* fix research projects
* Empty
* Fix errors
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
-
- 15 Apr, 2024 1 commit
Sai-Suraj-27 authored
Replace deprecated assertEquals with assertEqual.
-
- 13 Mar, 2024 1 commit
Lysandre Debut authored
* Adds pretrained IDs directly in the tests
* Fix tests
* Fix tests
* Review!
-
- 04 Dec, 2023 1 commit
Nilesh authored
* Added test cases for rembert referring to albert and reformer test_tokenization
* removed CURL_CA_BUNDLE='
* Added flag test_sentencepiece_ignore_case and space_between_special_tokens to True
* Overrode test_added_tokens_serialization
* slow->fast token failed because [MASK] was initialized differently for slow and fast, so the [MASK] token initialization was made uniform between fast and slow
* Added a few more test cases in test_encode_decode_round_trip and modified the slow token (mask_token) to be an AddedToken instance with lstrip=True
* Added a few test cases in test_encoder_decoder round trip and also modified the slow tokenizer of rembert to have mask_token as an AddedToken with lstrip=True
* Cleaned the code and added fmt: skip to avoid line breaks after make style + added comments to indicate the copied test cases
* Corrected a few comments
* Fixed quality issue
* Ran fix-copies
* Fixed a few minor issues, as (make fix-copies) broke a few test cases while stripping the text
* Reverted the changes made by repo-consistency
---------
Co-authored-by: Kokane <kokanen@apac.corpdir.net>
-