1. 16 Dec, 2020 1 commit
    • [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) · 640e6fe1
      Patrick von Platen authored
      
      
      * save intermediate
      
      * save intermediate
      
      * save intermediate
      
      * correct flax bert model file
      
      * new module / model naming
      
      * make style
      
      * almost finish BERT
      
      * finish roberta
      
      * make fix-copies
      
      * delete keys file
      
      * last refactor
      
      * fixes in run_mlm_flax.py
      
      * remove pooled from run_mlm_flax.py
      
      * fix gelu | gelu_new
      
      * remove Module from inits
      
      * splits
      
      * dirty print
      
      * preventing warmup_steps == 0
      
      * smaller splits
      
      * make fix-copies
      
      * dirty print
      
      * dirty print
      
      * initial_evaluation argument
      
      * declaration order fix
      
      * proper model initialization/loading
      
      * proper initialization
      
      * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug
      
      * removed tokenizers warning hack, fixed model re-initialization
      
      * reverted training_args.py changes
      
      * fix flax from pretrained
      
      * improve test in flax
      
      * apply Sylvain's tips
      
      * update init
      
      * make 0.3.0 compatible
      
      * revert Teven's changes
      
      * revert Teven's changes 2
      
      * finalize revert
      
      * fix bug
      
      * add docs
      
      * add pretrained to init
      
      * Update src/transformers/modeling_flax_utils.py
      
      * fix copies
      
      * final improvements
      Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
  2. 15 Dec, 2020 4 commits
  3. 11 Dec, 2020 3 commits
  4. 10 Dec, 2020 1 commit
  5. 09 Dec, 2020 1 commit
  6. 08 Dec, 2020 1 commit
  7. 07 Dec, 2020 3 commits
  8. 05 Dec, 2020 1 commit
    • Don't pass in token_type_ids to BART for GLUE (#8929) · 8dfc8c72
      Ethan Perez authored
      Without this fix, training a `BartForSequenceClassification` model with `run_pl_glue.py` raises `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not accept `token_type_ids`. I've solved this the same way it is already solved for the "distilbert" model, and I can now train BART models on SNLI without errors.
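
The fix described above can be illustrated with a minimal sketch. It is not the actual code from `run_pl_glue.py`: the helper name `build_inputs`, the constant `MODELS_WITHOUT_TOKEN_TYPE_IDS`, and the batch layout are assumptions made for illustration only. The idea is to leave `token_type_ids` out of the keyword arguments for model types whose `forward()` does not accept it.

```python
from typing import Any, Dict, Sequence

# Hypothetical constant: model types assumed not to accept token_type_ids.
# "distilbert" and "bart" are the two named in the commit message.
MODELS_WITHOUT_TOKEN_TYPE_IDS = {"distilbert", "bart"}


def build_inputs(model_type: str, batch: Sequence[Any]) -> Dict[str, Any]:
    """Build forward() keyword arguments for one batch.

    Assumed batch layout (illustrative only):
    (input_ids, attention_mask, token_type_ids, labels).
    """
    inputs = {
        "input_ids": batch[0],
        "attention_mask": batch[1],
        "labels": batch[3],
    }
    # Only pass token_type_ids to models that take that argument.
    if model_type not in MODELS_WITHOUT_TOKEN_TYPE_IDS:
        inputs["token_type_ids"] = batch[2]
    return inputs


# Toy batch using plain lists instead of tensors.
batch = ([101, 2023, 102], [1, 1, 1], [0, 0, 0], 1)
bart_inputs = build_inputs("bart", batch)
bert_inputs = build_inputs("bert", batch)
```

With this shape, `bart_inputs` carries no `token_type_ids` key, so unpacking it into a model call would not trigger the `TypeError` quoted above, while `bert_inputs` still includes the segment ids.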
  9. 04 Dec, 2020 2 commits
  10. 01 Dec, 2020 1 commit
  11. 30 Nov, 2020 3 commits
  12. 26 Nov, 2020 4 commits
  13. 24 Nov, 2020 3 commits
  14. 23 Nov, 2020 2 commits
  15. 22 Nov, 2020 1 commit
  16. 20 Nov, 2020 1 commit
    • Fix rag finetuning + add finetuning test (#8585) · 8062fa63
      Quentin Lhoest authored
      * replace init_ddp_connection for index init
      
      * style
      
      * add finetune test
      
      * add test data
      
      * move generate tensors to device
      
      * add test on EM metric
      
      * style
      
      * allow multi process test
      
      * keep gloo process group for retrieval
      
      * add multi-gpu test
      
      * use custom accelerator
      
      * clean test finetune
      
      * minor
      
      * style
      
      * style
      
      * typo
      
      * use python call instead of imported main function
      
      * return_dict fix in modeling_rag
      
      * use float32 in retrieval
      
      * store as float32 as well in the custom knowledge dataset example
      
      * style
      
      * rename to finetune_rag
      
      * style
      
      * update readme
      
      * rename utils and callbacks to utils_rag and callbacks_rag
      
      * fix test
      
      * patrick's comments
      
      * generate dummy data in the finetune test script
      
      * remove dummy data files
      
      * style
  17. 19 Nov, 2020 6 commits
  18. 18 Nov, 2020 2 commits