  1. 07 Nov, 2022 7 commits
  2. 04 Nov, 2022 14 commits
  3. 03 Nov, 2022 11 commits
  4. 02 Nov, 2022 8 commits
    • reorganize glossary (#20010) · aa39967b
      Steven Liu authored
      aa39967b
    • Show installed libraries and their versions in CI jobs (#20026) · 305e8718
      Yih-Dar authored
      
      
      * Show versions
      
      * check
      
      * store outputs
      
      * revert
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      305e8718
    • 🚨 🚨 🚨 Fix Issue 15003: SentencePiece Tokenizers Not Adding Special Tokens in... · 9f9ddcc2
      Ben Eyal authored
      🚨 🚨 🚨 Fix Issue 15003: SentencePiece Tokenizers Not Adding Special Tokens in `convert_tokens_to_string` (#15775)
      
      * Add test for SentencePiece not adding special tokens to strings
      
      * Add SentencePieceStringConversionMixin to fix issue 15003
      
      * Fix conversion from tokens to string for most SentencePiece tokenizers
      
      Tokenizers fixed:
      - AlbertTokenizer
      - BarthezTokenizer
      - CamembertTokenizer
      - FNetTokenizer
      - M2M100Tokenizer
      - MBart50Tokenizer
      - PegasusTokenizer
      - Speech2TextTokenizer
      
      * Fix MarianTokenizer, adjust SentencePiece test to accommodate vocab
      
      * Fix DebertaV2Tokenizer
      
      * Ignore LayoutXLMTokenizer in SentencePiece string conversion test
      
      * Run 'make style' and 'make quality'
      
      * Clean convert_tokens_to_string test
      
      Instead of explicitly ignoring LayoutXLMTokenizer in the test,
      override the test in LayoutLMTokenizationTest and do nothing in it.
      
      * Remove commented out code
      
      * Improve robustness of convert_tokens_to_string test
      
      Instead of comparing lengths of re-tokenized text and input_ids,
      check that converting all special tokens to string yields a string
      with all special tokens.
      
      * Inline and remove SentencePieceStringConversionMixin
      
      The convert_tokens_to_string method is now implemented
      in each relevant SentencePiece tokenizer.
      
      * Run 'make style' and 'make quality'
      
      * Revert removal of space in convert_tokens_to_string
      
      * Remove redundant import
      
      * Revert test text to original
      
      * Uncomment the lowercasing of the reverse_text variable
      
      * Mimic Rust tokenizer behavior for tokenizers
      
      - Albert
      - Barthez
      - Camembert
      - MBart50
      - T5
      
      * Fix accidentally skipping test in wrong tokenizer
      
      * Add test for equivalent Rust and slow tokenizer behavior
      
      * Override _decode in BigBirdTokenizer to mimic Rust behavior
      
      * Override _decode in FNetTokenizer to mimic Rust behavior
      
      * Override _decode in XLNetTokenizer to mimic Rust behavior
      
      * Remove unused 're' import
      
      * Update DebertaV2Tokenizer to mimic Rust tokenizer
      
      * Deberta tokenizer now behaves like Albert and its `convert_tokens_to_string` is not tested.
      
      * Ignore problematic tests in Deberta V2
      
      * Add comment on why the Deberta V2 tests are skipped
      9f9ddcc2
    • Fix doctest (#20023) · fb7cbe23
      Yih-Dar authored
      
      
      * Fix doctest
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      fb7cbe23
    • Improve model tester (#19984) · f69eb24b
      Yih-Dar authored
      
      
      * part 1
      
      * part 2
      
      * part 3
      
      * fix
      
      * For CANINE
      
      * For ESMFold
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      f69eb24b
    • [Doctest] Add configuration_deberta_v2.py (#19995) · 74877437
      Saad Mahmud authored
      * Add example docstring for DebertaV2Config
      
      * Add DebertaV2Config to documentation_tests
      
      * Fix mistake with directory name
      74877437
    • amyeroberts's avatar
    • Quality (#20002) · 49b77b89
      Sylvain Gugger authored
      49b77b89
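
For readers following the SentencePiece fix in 9f9ddcc2 (#15775), here is a minimal sketch of the idea the commit message describes: `convert_tokens_to_string` keeps special tokens in the output instead of losing them when the sub-word pieces are decoded. This is not the code from the commit. `decode_pieces` is a hypothetical stand-in for a real SentencePiece decoder, the special tokens `[CLS]` and `[SEP]` are chosen only for illustration, and joining the parts with single spaces is an assumption about the desired output format.

```python
from typing import Callable, Iterable, List, Set


def decode_pieces(pieces: List[str]) -> str:
    """Hypothetical stand-in for a SentencePiece decoder.

    Joins the pieces and turns the U+2581 word-boundary marker into spaces.
    """
    return "".join(pieces).replace("\u2581", " ").strip()


def convert_tokens_to_string(
    tokens: Iterable[str],
    special_tokens: Set[str],
    decode: Callable[[List[str]], str] = decode_pieces,
) -> str:
    """Decode runs of ordinary pieces, but copy special tokens through verbatim."""
    parts: List[str] = []    # decoded text chunks and special tokens, in order
    pending: List[str] = []  # ordinary pieces waiting to be decoded together

    for token in tokens:
        if token in special_tokens:
            # Flush the ordinary pieces seen so far, then keep the special token.
            if pending:
                parts.append(decode(pending))
                pending = []
            parts.append(token)
        else:
            pending.append(token)

    # Decode whatever ordinary pieces remain after the last special token.
    if pending:
        parts.append(decode(pending))

    # Assumption: parts are separated by a single space.
    return " ".join(part for part in parts if part)


specials = {"[CLS]", "[SEP]"}
text = convert_tokens_to_string(
    ["[CLS]", "\u2581hello", "\u2581wor", "ld", "[SEP]"], specials
)
print(text)
```

The robustness check described in the same commit message (converting all special tokens to a string should yield a string containing every one of them) can be expressed against this sketch as `all(tok in convert_tokens_to_string(sorted(specials), specials) for tok in specials)`.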