1. 04 Dec, 2023 1 commit
      Added test cases for rembert referring to albert and reformer test_tok… (#27637) · 4d4febb7
      Nilesh authored
      
      
      * Added test cases for rembert referring to albert and reformer test_tokenization
      
      * Removed CURL_CA_BUNDLE='
      
      * Added the flags test_sentencepiece_ignore_case and space_between_special_tokens, both set to True
      
      * Overrode test_added_tokens_serialization
      
      * Made the initialization of the [MASK] token uniform between the slow and fast tokenizers, because the slow->fast conversion failed when [MASK] was initialized differently in each
      
      * Added a few more test cases in test_encode_decode_round_trip and modified the slow tokenizer's mask_token to be an AddedToken instance with lstrip=True
      
      * Added a few test cases in test_encoder_decoder round trip and also modified the slow tokenizer of rembert to have mask_token as an AddedToken with lstrip=True
      
      * Cleaned up the code, added fmt: skip to avoid line breaks after make style, and added comments indicating where the copied test cases come from
      
      * Corrected a few comments
      
      * Fixed quality issue
      
      * Ran fix-copies
      
      * Fixed a few minor issues, as make fix-copies broke a few test cases while stripping the text
      
      * Reverted the changes made by repo-consistency
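
For readers unfamiliar with the lstrip=True setting mentioned above: an added token with lstrip=True also matches any whitespace immediately to its left. The snippet below is a toy, standard-library-only illustration of that behaviour. The function split_on_special_token is hypothetical and is not part of transformers or tokenizers; it only sketches why a slow and a fast tokenizer can disagree when [MASK] is initialized with different lstrip values.

```python
import re


def split_on_special_token(text, token, lstrip=False):
    """Toy illustration (NOT the library implementation).

    Splits `text` around every occurrence of `token`. When `lstrip` is
    True, whitespace directly to the left of the token is absorbed into
    the match, mimicking the documented AddedToken lstrip semantics.
    """
    pattern = (r"\s*" if lstrip else "") + re.escape(token)
    pieces = []
    pos = 0
    for match in re.finditer(pattern, text):
        if match.start() > pos:
            pieces.append(text[pos:match.start()])
        pieces.append(token)  # emit the special token itself, without the absorbed spaces
        pos = match.end()
    if pos < len(text):
        pieces.append(text[pos:])
    return pieces


with_lstrip = split_on_special_token("I saw a [MASK].", "[MASK]", lstrip=True)
without_lstrip = split_on_special_token("I saw a [MASK].", "[MASK]", lstrip=False)
print(with_lstrip)
print(without_lstrip)
```

With lstrip=True the space before [MASK] is consumed by the token match, so the preceding text segment no longer ends in a space; with lstrip=False that trailing space stays in the preceding segment.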
      
      ---------
      Co-authored-by: Kokane <kokanen@apac.corpdir.net>