1. 22 Dec, 2020 1 commit
  2. 21 Dec, 2020 3 commits
  3. 20 Dec, 2020 1 commit
  4. 19 Dec, 2020 1 commit
  5. 18 Dec, 2020 7 commits
  6. 17 Dec, 2020 1 commit
  7. 16 Dec, 2020 3 commits
    • Experimental support for fairscale ShardedDDP (#9139) · 9a671853
      Sylvain Gugger authored
      * Experimental support for fairscale ShardedDDP
      
      * Add import error if fairscale not available
      
      * Address review comments
      
      * Fix seq2seq trainer
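The "import error if fairscale not available" step above is a standard optional-dependency guard. A minimal sketch of that pattern follows; the helper names are hypothetical and this is not the actual transformers Trainer code:

```python
# Illustrative optional-dependency guard (hypothetical helper names,
# not the actual transformers Trainer implementation).
import importlib.util


def is_fairscale_available() -> bool:
    # find_spec returns None when the package cannot be imported
    return importlib.util.find_spec("fairscale") is not None


def require_fairscale() -> None:
    # Fail early with an actionable message instead of a bare
    # ModuleNotFoundError deep inside training setup
    if not is_fairscale_available():
        raise ImportError(
            "Sharded DDP training requires fairscale: pip install fairscale"
        )
```

Checking availability once up front lets the trainer fall back to plain DDP, while the explicit `ImportError` tells the user exactly which extra to install.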
    • Sylvain Gugger
    • [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) · 640e6fe1
      Patrick von Platen authored
      
      
      * save intermediate
      
      * save intermediate
      
      * save intermediate
      
      * correct flax bert model file
      
      * new module / model naming
      
      * make style
      
      * almost finish BERT
      
      * finish roberta
      
      * make fix-copies
      
      * delete keys file
      
      * last refactor
      
      * fixes in run_mlm_flax.py
      
      * remove pooled from run_mlm_flax.py
      
      * fix gelu | gelu_new
      
      * remove Module from inits
      
      * splits
      
      * dirty print
      
      * preventing warmup_steps == 0
      
      * smaller splits
      
      * make fix-copies
      
      * dirty print
      
      * dirty print
      
      * initial_evaluation argument
      
      * declaration order fix
      
      * proper model initialization/loading
      
      * proper initialization
      
      * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug
      
      * removed tokenizers warning hack, fixed model re-initialization
      
      * reverted training_args.py changes
      
      * fix flax from pretrained
      
      * improve test in flax
      
      * apply Sylvain's tips
      
      * update init
      
      * make 0.3.0 compatible
      
      * revert Teven's changes
      
      * revert Teven's changes 2
      
      * finalize revert
      
      * fix bug
      
      * add docs
      
      * add pretrained to init
      
      * Update src/transformers/modeling_flax_utils.py
      
      * fix copies
      
      * final improvements
      Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
  8. 15 Dec, 2020 4 commits
  9. 11 Dec, 2020 3 commits
  10. 10 Dec, 2020 1 commit
  11. 09 Dec, 2020 1 commit
  12. 08 Dec, 2020 1 commit
  13. 07 Dec, 2020 3 commits
  14. 05 Dec, 2020 1 commit
    • Don't pass in token_type_ids to BART for GLUE (#8929) · 8dfc8c72
      Ethan Perez authored
      Without this fix, training a `BartForSequenceClassification` model with `run_pl_glue.py` raises `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not use token_type_ids. I've solved this the same way it is solved for the "distilbert" model, and I can now train BART models on SNLI without errors.
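The fix described above amounts to filtering `token_type_ids` out of the tokenized batch for model types whose `forward()` does not accept it. A minimal sketch of that idea, with hypothetical names rather than the actual `run_pl_glue.py` code:

```python
# Hypothetical sketch of the described fix: drop token_type_ids for
# model types (e.g. BART, DistilBERT) whose forward() does not accept
# it. Names are illustrative, not the actual run_pl_glue.py code.

MODELS_WITHOUT_TOKEN_TYPE_IDS = {"bart", "distilbert"}


def build_model_inputs(batch: dict, model_type: str) -> dict:
    """Keep only the keyword arguments this model type's forward() accepts."""
    inputs = {
        "input_ids": batch["input_ids"],
        "attention_mask": batch["attention_mask"],
        "labels": batch["labels"],
    }
    # Only pass token_type_ids to models that define token type embeddings
    if model_type not in MODELS_WITHOUT_TOKEN_TYPE_IDS:
        inputs["token_type_ids"] = batch["token_type_ids"]
    return inputs
```

Keying the filter on the model type string mirrors how the existing "distilbert" special case works, so adding BART is a one-line change to the exclusion set.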
  15. 04 Dec, 2020 2 commits
  16. 01 Dec, 2020 1 commit
  17. 30 Nov, 2020 3 commits
  18. 26 Nov, 2020 3 commits