1. 27 Apr, 2021 1 commit
  2. 26 Apr, 2021 1 commit
  3. 23 Apr, 2021 1 commit
  4. 21 Apr, 2021 1 commit
  5. 13 Apr, 2021 2 commits
  6. 12 Apr, 2021 1 commit
  7. 09 Apr, 2021 1 commit
  8. 08 Apr, 2021 3 commits
  9. 05 Apr, 2021 2 commits
  10. 26 Mar, 2021 1 commit
  11. 16 Mar, 2021 1 commit
  12. 15 Mar, 2021 1 commit
  13. 12 Mar, 2021 2 commits
  14. 10 Mar, 2021 1 commit
  15. 09 Mar, 2021 1 commit
  16. 05 Mar, 2021 1 commit
  17. 03 Mar, 2021 1 commit
  18. 25 Feb, 2021 2 commits
    • Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354) · 9d14be5c
      Sylvain Gugger authored
      
      
      * Add support for ZeRO-2/3 and ZeRO-offload in fairscale
      
      * Quality
      
      * Rework from review comments
      
      * Add doc
      
      * Apply suggestions from code review
      Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
      
      * Address review comments
      Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
      9d14be5c
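      The PR above exposes fairscale's ZeRO-2/ZeRO-3 sharding and CPU offload through the Trainer's sharded_ddp option. A minimal sketch of enabling it from Python, assuming fairscale is installed and the script runs under a distributed launcher; output_dir and the batch size are placeholders:

          from transformers import TrainingArguments

          # Sketch only: "zero_dp_2" shards optimizer state and gradients (ZeRO-2),
          # "zero_dp_3" additionally shards parameters (ZeRO-3); appending "offload"
          # moves tensors to CPU. Requires fairscale and a torch.distributed launch.
          args = TrainingArguments(
              output_dir="out",                 # placeholder
              per_device_train_batch_size=8,
              fp16=True,
              sharded_ddp="zero_dp_2 offload",  # or "zero_dp_3", "zero_dp_3 offload"
          )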
    • [PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324) · cb38ffcc
      Patrick von Platen authored
      
      * push to show
      
      * small improvement
      
      * small improvement
      
      * Update src/transformers/feature_extraction_utils.py
      
      * Update src/transformers/feature_extraction_utils.py
      
      * implement base
      
      * add common tests
      
      * make all tests pass for wav2vec2
      
      * make padding work & add more tests
      
      * finalize feature extractor utils
      
      * add call method to feature extraction
      
      * finalize feature processor
      
      * finish tokenizer
      
      * finish general processor design
      
      * finish tests
      
      * typo
      
      * remove bogus file
      
      * finish docstring
      
      * add docs
      
      * finish docs
      
      * small fix
      
      * correct docs
      
      * save intermediate
      
      * load changes
      
      * apply changes
      
      * apply changes to doc
      
      * change tests
      
      * apply Suraj's recommendations
      
      * final changes
      
      * Apply suggestions from code review
      
      * fix typo
      
      * fix import
      
      * correct docstring
      cb38ffcc
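      A minimal sketch of how the three classes introduced above fit together for speech recognition; the checkpoint name and the zero-filled audio stand-in are illustrative assumptions:

          import numpy as np
          import torch
          from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

          processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
          model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

          # Stand-in for one second of 16 kHz mono speech.
          raw_audio = np.zeros(16000, dtype=np.float32)

          # The processor wraps the feature extractor (audio in) and the
          # tokenizer (CTC decoding out) behind a single interface.
          inputs = processor(raw_audio, sampling_rate=16000, return_tensors="pt", padding=True)
          with torch.no_grad():
              logits = model(inputs.input_values).logits
          predicted_ids = torch.argmax(logits, dim=-1)
          transcription = processor.batch_decode(predicted_ids)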
  19. 22 Feb, 2021 1 commit
  20. 17 Feb, 2021 1 commit
  21. 11 Feb, 2021 1 commit
  22. 10 Feb, 2021 1 commit
  23. 09 Feb, 2021 1 commit
  24. 08 Feb, 2021 1 commit
  25. 02 Feb, 2021 1 commit
  26. 14 Jan, 2021 1 commit
  27. 13 Jan, 2021 1 commit
    • [trainer] deepspeed integration (#9211) · 2df34f4a
      Stas Bekman authored
      
      
      * deepspeed integration
      
      * style
      
      * add test
      
      * ds wants to do its own backward
      
      * fp16 assert
      
      * Update src/transformers/training_args.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * style
      
      * for clarity extract what args are being passed to deepspeed
      
      * introduce the concept of self.wrapped_model
      
      * s/self.wrapped_model/self.model_wrapped/
      
      * complete transition to self.wrapped_model / self.model
      
      * fix
      
      * doc
      
      * give ds its own init
      
      * add custom overrides, handle bs correctly
      
      * fix test
      
      * clean up model_init logic, fix small bug
      
      * complete fix
      
      * collapse --deepspeed_config into --deepspeed
      
      * style
      
      * start adding doc notes
      
      * style
      
      * implement hf2ds optimizer and scheduler configuration remapping
      
      * oops
      
      * call get_num_training_steps only when absolutely needed
      
      * workaround broken auto-formatter
      
      * deepspeed_config arg is no longer needed - fixed in deepspeed master
      
      * use hf's fp16 args in config
      
      * clean
      
      * start on the docs
      
      * rebase cleanup
      
      * finish up --fp16
      
      * clarify the supported stages
      
      * big refactor thanks to discovering deepspeed.init_distributed
      
      * cleanup
      
      * revert fp16 part
      
      * add checkpoint-support
      
      * move ds init into integrations
      
      * extend docs
      
      * cleanup
      
      * unfix docs
      
      * clean up old code
      
      * imports
      
      * move docs
      
      * fix logic
      
      * make it clear which file it's referring to
      
      * document nodes/gpus
      
      * style
      
      * wrong format
      
      * style
      
      * deepspeed handles gradient clipping
      
      * easier to read
      
      * major doc rewrite
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * docs
      
      * switch to AdamW optimizer
      
      * style
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * clarify doc
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      2df34f4a
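      The integration above is driven by a single flag: the deepspeed field of TrainingArguments (the old --deepspeed_config flag was collapsed into it) points at a user-written DeepSpeed JSON config, and the Trainer remaps its own optimizer/scheduler/fp16 settings into that config. A minimal sketch, assuming a config file named ds_config.json and a script started with the deepspeed launcher rather than plain python:

          from transformers import TrainingArguments

          # Sketch only: ds_config.json is a user-supplied DeepSpeed config
          # (e.g. enabling fp16 and a ZeRO stage supported by the integration).
          args = TrainingArguments(
              output_dir="out",            # placeholder
              fp16=True,                   # HF fp16 args are reflected in the ds config
              deepspeed="ds_config.json",  # replaces the old --deepspeed_config flag
          )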
  28. 05 Jan, 2021 1 commit
  29. 23 Dec, 2020 1 commit
    • Add caching mechanism to BERT, RoBERTa (#9183) · 88ef8893
      Suraj Patil authored
      * add past_key_values
      
      * add use_cache option
      
      * make mask before cutting ids
      
      * adjust position_ids according to past_key_values
      
      * flatten past_key_values
      
      * fix positional embeds
      
      * fix _reorder_cache
      
      * set use_cache to false when not decoder, fix attention mask init
      
      * add test for caching
      
      * add past_key_values for Roberta
      
      * fix position embeds
      
      * add caching test for roberta
      
      * add doc
      
      * make style
      
      * doc, fix attention mask, test
      
      * small fixes
      
      * address Patrick's comments
      
      * input_ids shouldn't start with pad token
      
      * use_cache only when decoder
      
      * make consistent with bert
      
      * make copies consistent
      
      * add use_cache to encoder
      
      * add past_key_values to tapas attention
      
      * apply suggestions from code review
      
      * make copies consistent
      
      * add attn mask in tests
      
      * remove copied from longformer
      
      * apply suggestions from code review
      
      * fix bart test
      
      * nit
      
      * simplify model outputs
      
      * fix doc
      
      * fix output ordering
      88ef8893
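      A minimal sketch of the new use_cache/past_key_values flow with BERT in decoder mode; the checkpoint and prompt are placeholders (bert-base-uncased is not trained as a decoder, this only shows the mechanics):

          import torch
          from transformers import BertTokenizer, BertLMHeadModel

          tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
          model = BertLMHeadModel.from_pretrained("bert-base-uncased", is_decoder=True)

          inputs = tokenizer("Hello there", return_tensors="pt")
          # First pass: with use_cache=True the model returns past_key_values.
          out = model(**inputs, use_cache=True)
          past = out.past_key_values

          # Later steps feed only the newest token plus the cache; position_ids
          # are adjusted internally according to past_key_values.
          next_token = torch.argmax(out.logits[:, -1:, :], dim=-1)
          out = model(input_ids=next_token, past_key_values=past, use_cache=True)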
  30. 22 Dec, 2020 1 commit
  31. 16 Dec, 2020 2 commits
    • TableQuestionAnsweringPipeline (#9145) · 1c1a2ffb
      Lysandre Debut authored
      
      
      * AutoModelForTableQuestionAnswering
      
      * TableQuestionAnsweringPipeline
      
      * Apply suggestions from Patrick's code review
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Sylvain and Patrick comments
      
      * Better PyTorch/TF error message
      
      * Add integration tests
      
      * Argument Handler naming
      Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
      
      * Fix docs to appease the documentation gods
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      1c1a2ffb
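      A minimal sketch of the new pipeline; the table content is made up, and the default checkpoint (a TAPAS model, which needs PyTorch and pandas) is downloaded on first use:

          from transformers import pipeline

          table_qa = pipeline("table-question-answering")

          # TAPAS expects every cell value as a string.
          table = {
              "Repository": ["transformers", "datasets"],
              "Stars": ["36000", "4000"],
          }
          answer = table_qa(table=table, query="How many stars does transformers have?")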
    • [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) · 640e6fe1
      Patrick von Platen authored
      
      
      * save intermediate
      
      * save intermediate
      
      * save intermediate
      
      * correct flax bert model file
      
      * new module / model naming
      
      * make style
      
      * almost finish BERT
      
      * finish roberta
      
      * make fix-copies
      
      * delete keys file
      
      * last refactor
      
      * fixes in run_mlm_flax.py
      
      * remove pooled from run_mlm_flax.py
      
      * fix gelu | gelu_new
      
      * remove Module from inits
      
      * splits
      
      * dirty print
      
      * preventing warmup_steps == 0
      
      * smaller splits
      
      * make fix-copies
      
      * dirty print
      
      * dirty print
      
      * initial_evaluation argument
      
      * declaration order fix
      
      * proper model initialization/loading
      
      * proper initialization
      
      * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug
      
      * removed tokenizers warning hack, fixed model re-initialization
      
      * reverted training_args.py changes
      
      * fix flax from pretrained
      
      * improve test in flax
      
      * apply Sylvain's tips
      
      * update init
      
      * make 0.3.0 compatible
      
      * revert Teven's changes
      
      * revert Teven's changes 2
      
      * finalize revert
      
      * fix bug
      
      * add docs
      
      * add pretrained to init
      
      * Update src/transformers/modeling_flax_utils.py
      
      * fix copies
      
      * final improvements
      Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
      640e6fe1
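      With from_pretrained in place, the Flax model loads the same checkpoints as its PyTorch counterpart. A minimal sketch, assuming jax and flax are installed; the checkpoint and sentence are placeholders:

          from transformers import BertTokenizer, FlaxBertForMaskedLM

          tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
          # Add from_pt=True if the hub checkpoint ships no Flax weights.
          model = FlaxBertForMaskedLM.from_pretrained("bert-base-uncased")

          # Flax models consume NumPy arrays directly.
          inputs = tokenizer("Paris is the [MASK] of France.", return_tensors="np")
          logits = model(**inputs)[0]  # masked-LM logits, aligned with BertForMaskedLM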
  32. 10 Dec, 2020 1 commit
  33. 07 Dec, 2020 1 commit