1. 17 Jun, 2020 1 commit
  2. 15 Jun, 2020 1 commit
    • [HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized... · 36434220
      Anthony MOI authored
      
      [HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510)
      
      * Use tokenizers pre-tokenized pipeline
      
      * failing pretokenized test
      
      * Fix is_pretokenized in python
      
      * add pretokenized tests
      
      * style and quality
      
      * better tests for batched pretokenized inputs
      
      * tokenizers clean up - new padding_strategy - split the files
      
      * [HUGE] refactoring tokenizers - padding - truncation - tests
      
      * style and quality
      
      * bump up required tokenizers version to 0.8.0-rc1
      
      * switched padding/truncation API - simpler better backward compat
      
      * updating tests for custom tokenizers
      
      * style and quality - tests on pad
      
      * fix QA pipeline
      
      * fix backward compatibility for max_length only
      
      * style and quality
      
      * Various clean-ups - add verbose
      
      * fix tests
      
      * update docstrings
      
      * Fix tests
      
      * Docs reformatted
      
      * __call__ method documented
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
  3. 09 Jun, 2020 2 commits
  4. 02 Jun, 2020 2 commits
  5. 07 May, 2020 1 commit
    • BIG Reorganize examples (#4213) · 0ae96ff8
      Julien Chaumond authored
      * Created using Colaboratory
      
      * [examples] reorganize files
      
      * remove run_tpu_glue.py as superseded by TPU support in Trainer
      
      * Bugfix: int, not tuple
      
      * move files around
  6. 29 Apr, 2020 1 commit
    • CDN urls (#4030) · 455c6390
      Julien Chaumond authored
      * [file_utils] use_cdn + documentation
      
      * Move to cdn. urls for weights
      
      * [urls] Hotfix for bert-base-japanese
  7. 28 Apr, 2020 1 commit
  8. 24 Apr, 2020 1 commit
  9. 22 Apr, 2020 1 commit
    • Trainer (#3800) · dd9d483d
      Julien Chaumond authored
      * doc
      
      * [tests] Add sample files for a regression task
      
      * [HUGE] Trainer
      
      * Feedback from @sshleifer
      
      * Feedback from @thomwolf + logging tweak
      
      * [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes
      
      * [glue] Use default max_seq_length of 128 like before
      
      * [glue] move DataTrainingArguments around
      
      * [ner] Change interface of InputExample, and align run_{tf,pl}
      
      * Re-align the pl scripts a little bit
      
      * ner
      
      * [ner] Add integration test
      
      * Fix language_modeling with API tweak
      
      * [ci] Tweak loss target
      
      * Don't break console output
      
      * amp.initialize: model must be on right device before
      
      * [multiple-choice] update for Trainer
      
      * Re-align to 827d6d6e
  10. 20 Apr, 2020 1 commit
  11. 16 Apr, 2020 1 commit
  12. 15 Apr, 2020 1 commit
  13. 14 Apr, 2020 1 commit
  14. 07 Apr, 2020 2 commits
  15. 02 Apr, 2020 1 commit
  16. 31 Mar, 2020 1 commit
  17. 30 Mar, 2020 1 commit
  18. 29 Mar, 2020 1 commit
  19. 27 Mar, 2020 3 commits
  20. 26 Mar, 2020 1 commit
    • Add t5 summarization example (#3411) · e703e923
      Patrick von Platen authored
      * rebase to master
      
      * change tf to pytorch
      
      * change to pytorch
      
      * small fix
      
      * renaming
      
      * add gpu training possibility
      
      * renaming
      
      * improve README
      
      * incorporate collins feedback
      
      * better Readme
      
      * better README.md
  21. 25 Mar, 2020 1 commit
  22. 23 Mar, 2020 1 commit
  23. 20 Mar, 2020 1 commit
  24. 17 Mar, 2020 1 commit
  25. 16 Mar, 2020 1 commit
  26. 13 Mar, 2020 2 commits
  27. 12 Mar, 2020 1 commit
  28. 11 Mar, 2020 1 commit
  29. 09 Mar, 2020 1 commit
  30. 05 Mar, 2020 3 commits
  31. 03 Mar, 2020 1 commit
    • Summarization Examples: add Bart CNN Evaluation (#3082) · 5b396457
      Sam Shleifer authored
      * Rename and improve example
      
      * Add test
      
      * slightly faster test
      
      * style
      
      * This breaks remy prolly
      
      * shorter test string
      
      * no slow
      
      * newdir structure
      
      * New tree
      
      * Style
      
      * shorter
      
      * docs
      
      * clean
      
      * Attempt future import
      
      * more import hax
  32. 20 Feb, 2020 1 commit
    • New BartModel (#2745) · 53ce3854
      Sam Shleifer authored
      * Results same as fairseq
      * Wrote a ton of tests
      * Struggled with api signatures
      * added some docs
      
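The 15 Jun, 2020 tokenizers refactoring (36434220) mentions a new padding strategy and a switched padding/truncation API. As a rough illustration of how such options might interact, here is a minimal pure-Python sketch. The function name `pad_and_truncate`, its parameters, and the strategy strings are assumptions made for this example. It does not reproduce the code from that commit.

```python
from typing import Dict, List, Optional


def pad_and_truncate(
    batch: List[List[int]],
    padding: str = "do_not_pad",  # assumed strategy names: "do_not_pad", "longest", "max_length"
    truncation: bool = False,
    max_length: Optional[int] = None,
    pad_id: int = 0,
) -> Dict[str, List[List[int]]]:
    """Hypothetical helper: truncate first, then pad, and build an attention mask."""
    if truncation:
        if max_length is None:
            raise ValueError("truncation=True requires max_length")
        # Keep only the first max_length token ids of each sequence.
        batch = [ids[:max_length] for ids in batch]

    if padding == "longest":
        # Pad every sequence to the longest one in this batch.
        target = max((len(ids) for ids in batch), default=0)
    elif padding == "max_length":
        if max_length is None:
            raise ValueError("padding='max_length' requires max_length")
        target = max_length
    elif padding == "do_not_pad":
        target = None
    else:
        raise ValueError(f"unknown padding strategy: {padding!r}")

    input_ids, attention_mask = [], []
    for ids in batch:
        n_pad = 0 if target is None else max(target - len(ids), 0)
        input_ids.append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


example = pad_and_truncate(
    [[5, 6, 7, 8, 9], [1, 2]], padding="longest", truncation=True, max_length=4
)
```

Truncating before padding means that with `padding="longest"` the padded length can never exceed `max_length`, which is one plausible reading of why the two options were reworked together.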