  1. 29 Apr, 2020 1 commit
    • CDN urls (#4030) · 455c6390
      Julien Chaumond authored
      * [file_utils] use_cdn + documentation
      
      * Move to cdn. urls for weights
      
      * [urls] Hotfix for bert-base-japanese
  2. 18 Apr, 2020 1 commit
    • Cleanup fast tokenizers integration (#3706) · 827d6d6e
      Thomas Wolf authored
      * First pass on utility classes and python tokenizers
      
      * finishing cleanup pass
      
      * style and quality
      
      * Fix tests
      
      * Updating following @mfuntowicz comment
      
      * style and quality
      
      * Fix Roberta
      
      * fix batch_size/seq_length in BatchEncoding
      
      * add alignment methods + tests
      
      * Fix OpenAI and Transfo-XL tokenizers
      
      * adding trim_offsets=True default for GPT2 and RoBERTa
      
      * style and quality
      
      * fix tests
      
      * add_prefix_space in roberta
      
      * bump up tokenizers to rc7
      
      * style
      
      * unfortunately TensorFlow does not like these - removing shape/seq_len for now
      
      * Update src/transformers/tokenization_utils.py
      Co-Authored-By: Stefan Schweter <stefan@schweter.it>
      
      * Adding doc and docstrings
      
      * making flake8 happy
      Co-authored-by: Stefan Schweter <stefan@schweter.it>
  3. 16 Apr, 2020 1 commit
  4. 08 Apr, 2020 1 commit
  5. 04 Apr, 2020 1 commit
  6. 24 Mar, 2020 1 commit
  7. 02 Mar, 2020 1 commit
  8. 07 Feb, 2020 1 commit
  9. 29 Jan, 2020 4 commits
  10. 15 Jan, 2020 1 commit
  11. 13 Jan, 2020 1 commit
  12. 07 Jan, 2020 1 commit
  13. 06 Jan, 2020 2 commits
  14. 05 Jan, 2020 1 commit
  15. 28 Dec, 2019 1 commit
  16. 23 Dec, 2019 1 commit
  17. 22 Dec, 2019 20 commits