1. 05 Jun, 2024 1 commit
    • Add missing Flaubert tokenizer tests (#30492) · 464d986b
      bastrob authored
      * add flaubert tokenization test, enrich inheritance in FlaubertTokenizer.
      
      * fix quality code ci
      
      * ensure parameter consistency
      
      * fix ci
      
      * fix copyright year and flatten vocab list.
      
      * fix style
  2. 04 Jun, 2024 7 commits
  3. 03 Jun, 2024 6 commits
  4. 31 May, 2024 2 commits
  5. 30 May, 2024 2 commits
  6. 29 May, 2024 2 commits
  7. 28 May, 2024 8 commits
  8. 27 May, 2024 3 commits
  9. 24 May, 2024 8 commits
  10. 23 May, 2024 1 commit
    • Remove deprecated properties in tokenization_nllb.py and tokenization_nllb_fast.py (#29834) · 6d3d5b10
      Yasmin Moslem authored
      * Fix typo in tokenization_nllb.py
      
      Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
      
      * Fix typo in tokenization_nllb_fast.py
      
      Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
      
      * Remove deprecated attributes in tokenization_nllb.py
      
      Remove deprecated attributes: `lang_code_to_id`, `fairseq_tokens_to_ids`, `id_to_lang_code`, and `fairseq_ids_to_tokens`
      
      * Remove deprecated attribute in tokenization_nllb_fast.py
      
      Remove deprecated attribute `lang_code_to_id`
      
      * Remove deprecated properties in tokenization_nllb.py
      
      Remove deprecated properties - fix format
      
      * Remove deprecated properties in tokenization_nllb_fast.py
      
      Remove deprecated properties - fix format
      
      * Update test_tokenization_nllb.py
      
      * update test_tokenization_nllb.py
      
      * Update tokenization_nllb.py
      
      * Update test_tokenization_seamless_m4t.py
      
      * Update test_tokenization_seamless_m4t.py
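      Code that relied on the removed `lang_code_to_id` lookup can probably resolve language codes through the ordinary token-to-id lookup instead. The sketch below is only an illustration: `ToyTokenizer` is a hypothetical stand-in, not the real NLLB tokenizer, the ids are made up, and `added_tokens_decoder` is simplified to map ids to plain strings. It assumes language codes such as `fra_Latn` are stored as regular added tokens, as the reference to `added_tokens_decoder` in this commit suggests.

      ```python
      class ToyTokenizer:
          """Hypothetical stand-in for a tokenizer; not the real NLLB class."""

          def __init__(self, added_tokens_decoder):
              # id -> token, a simplified version of `added_tokens_decoder`
              self.added_tokens_decoder = dict(added_tokens_decoder)
              # Reverse index (token -> id) used for lookups
              self._token_to_id = {
                  tok: idx for idx, tok in self.added_tokens_decoder.items()
              }
              self.unk_token_id = self._token_to_id["<unk>"]

          def convert_tokens_to_ids(self, token):
              # Unknown tokens fall back to the <unk> id
              return self._token_to_id.get(token, self.unk_token_id)


      # Made-up ids, for illustration only
      tokenizer = ToyTokenizer({3: "<unk>", 100: "eng_Latn", 101: "fra_Latn"})

      # Before (removed by this commit): tokenizer.lang_code_to_id["fra_Latn"]
      # After: treat the language code like any other token
      fra_id = tokenizer.convert_tokens_to_ids("fra_Latn")
      ```

      With a real tokenizer, the same `convert_tokens_to_ids` call would be made on the loaded tokenizer object. The resulting id depends on the checkpoint's vocabulary.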