- 22 Feb, 2021 8 commits
-
-
Stas Bekman authored
* make logging and saving built into the Trainer
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Tanmay Garg authored
Enhance the resume_from_checkpoint argument of Trainer.train to also accept a bool. If True is passed, the last checkpoint saved in self.args.output_dir is loaded. (#10280)
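For reference, a minimal sketch of the new call pattern; the trainer object is assumed to exist and its output_dir to contain earlier checkpoints:

```python
from transformers import Trainer

# Assumes `trainer` is an already-constructed Trainer whose TrainingArguments
# point output_dir at a directory containing earlier checkpoint-* folders.
def resume_training(trainer: Trainer):
    # Passing True (instead of an explicit checkpoint path) now means:
    # load the last checkpoint saved under trainer.args.output_dir and continue.
    return trainer.train(resume_from_checkpoint=True)
```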
-
Stas Bekman authored
* implement gradient_accumulation_steps support in DeepSpeed integration * typo * cleanup * cleanup
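A hedged sketch of the setting this touches: gradient_accumulation_steps is now respected when training runs under the DeepSpeed integration. The paths below are placeholders.

```python
from transformers import TrainingArguments

# "out" and "ds_config.json" are placeholder paths for illustration only.
args = TrainingArguments(
    output_dir="out",
    deepspeed="ds_config.json",     # enable the DeepSpeed integration
    gradient_accumulation_steps=4,  # now honored when running under DeepSpeed
    per_device_train_batch_size=8,
)
```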
-
Stas Bekman authored
-
Sylvain Gugger authored
* Deprecate prepare_seq2seq_batch
* Fix last tests
* Apply suggestions from code review
* More review comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
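A rough sketch of the replacement pattern (calling the tokenizer directly instead of prepare_seq2seq_batch); the model name and the as_target_tokenizer context manager are assumptions for illustration, not a statement of the exact migration path:

```python
from transformers import AutoTokenizer

# Model name is illustrative; any seq2seq tokenizer works the same way.
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")

src_texts = ["Hello world"]
tgt_texts = ["Hallo Welt"]

# Instead of tokenizer.prepare_seq2seq_batch(src_texts, tgt_texts, ...):
batch = tokenizer(src_texts, padding=True, truncation=True, return_tensors="pt")
with tokenizer.as_target_tokenizer():
    labels = tokenizer(tgt_texts, padding=True, truncation=True, return_tensors="pt")
batch["labels"] = labels["input_ids"]
```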
-
Lysandre Debut authored
-
Julien Plu authored
* AMP * Add LED * Apply style * Fix longformer
-
Lysandre Debut authored
Co-authored-by: Pengcheng He <penhe@microsoft.com>
-
- 21 Feb, 2021 1 commit
-
-
tagucci authored
* fix typo in conversion script
* style
Co-authored-by: Stas Bekman <stas@stason.org>
-
- 20 Feb, 2021 3 commits
-
-
Stas Bekman authored
-
Sylvain Gugger authored
-
cronoik authored
-
- 19 Feb, 2021 15 commits
-
-
Pengcheng He authored
* Integrate DeBERTa v2 (the 1.5B model surpassed human performance on SuperGLUE); add the DeBERTa v2 900M and 1.5B models
* DeBERTa-v2
* Fix v2 model loading issue (#10129)
* Doc members
* Update src/transformers/models/deberta/modeling_deberta.py
* Address Sylvain's comments
* Address Patrick's comments
* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
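As a quick usage note, the new checkpoints load through the standard auto classes; the checkpoint name below is believed to be one of the released ones but is given here as an assumption:

```python
from transformers import AutoTokenizer, AutoModel

# "microsoft/deberta-v2-xlarge" is assumed to be one of the newly released checkpoints.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = AutoModel.from_pretrained("microsoft/deberta-v2-xlarge")

inputs = tokenizer("DeBERTa v2 integration test", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```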
-
Sylvain Gugger authored
-
Julien Plu authored
-
Joe Davison authored
-
Stas Bekman authored
-
Tanmay Garg authored
Introduce logging_strategy training argument in TrainingArguments and TFTrainingArguments. (#9838)
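A minimal sketch of the new argument; the values shown are illustrative:

```python
from transformers import TrainingArguments

# Log either on a step interval or once per epoch; "out" is a placeholder dir.
args = TrainingArguments(
    output_dir="out",
    logging_strategy="steps",  # or "epoch" / "no"
    logging_steps=100,
)
```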
-
Julien Plu authored
* Fix AMP and XLA * Remove useless var
-
Julien Plu authored
* Fix AMP * Apply style * Remove unused import
-
Julien Plu authored
-
Julien Plu authored
* Fix XLA * Rework cast * Apply style
-
Julien Plu authored
* Fix AMP * Trigger CI * Rework cast
-
Julien Plu authored
* Fix AMP * Rework cast * Apply style
-
Stas Bekman authored
* Propose using Google Colab to reproduce problems
* Update ISSUES.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* implement --fp16_full_eval
* Apply suggestions from code review
* style
* add test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
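A short sketch of the new flag, which runs evaluation fully in fp16 to save GPU memory; the output directory is a placeholder:

```python
from transformers import TrainingArguments

# With fp16_full_eval, evaluation/prediction run with the model cast to fp16,
# roughly halving GPU memory at some risk of metric drift.
args = TrainingArguments(
    output_dir="out",
    fp16_full_eval=True,
)
```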
-
Stas Bekman authored
-
- 18 Feb, 2021 6 commits
-
-
Joe Davison authored
* add zero-shot distillation script
* readme wordsmithing
* clean up code
* add multi-gpu teacher inference plus tidying up more code
* add use_fast_tokenizer arg
* update results in readme
* more readme wordsmithing
* style
* add handle to readme
* fix code block
* add error + docs about distributed & tpu
* add @sgugger format requests
* xla -> tpu
* support fp16 for teacher preds
* no checkpoint by default
* add demo colab link
* add model sharing prompt + model link
* correct resulting acc of example
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Stas Bekman authored
* memory tracker metrics
* go back to eval for some consistency
* handle no-gpu case
* deal with stackable eval calls
* restore callback order
* style
* simplify the API
* add test
* docs
* consistently use eval_ prefix
* improve docs
* Update src/transformers/trainer_utils.py
* rename method
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
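A sketch of how the reported metrics can be inspected; only the eval_ prefix mentioned above is assumed, not the exact metric key names:

```python
# Assumes `trainer` is an existing Trainer instance.
metrics = trainer.evaluate()

# The memory-tracker entries show up alongside the usual metrics, consistently
# under the eval_ prefix; filtering on "mem" isolates just those.
memory_metrics = {k: v for k, v in metrics.items() if "mem" in k}
print(memory_metrics)
```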
-
Tanmay Garg authored
Introduce warmup_ratio training argument in both TrainingArguments and TFTrainingArguments classes (#6673)
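A minimal sketch with illustrative values: warmup_ratio expresses the warmup phase as a fraction of total training steps instead of an absolute warmup_steps count.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",    # placeholder
    warmup_ratio=0.1,    # warm up the learning rate over 10% of total steps
    num_train_epochs=3,
)
```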
-
Julien Plu authored
* rework savedmodel slow test * Improve savedmodel tests * Remove useless content
-
Julien Plu authored
-
Julien Plu authored
* Fix XLA and AMP * Fix AMP and XLA * Apply style * Apply Patrick's comment
-
- 17 Feb, 2021 7 commits
-
-
Stas Bekman authored
-
Stas Bekman authored
* refactor place_model_on_device logic, add deepspeed * doc * style
-
Stas Bekman authored
* fix invalid port * missing requirements
-
Julien Plu authored
* Fix XLA and AMP * Apply style * Remove useless cast
-
Julien Plu authored
* Fix Flaubert and XLM * Remove useless cast * Tiny fix * Tiny fix
-
Julien Plu authored
* Update BART
* Update Blenderbot
* Update BlenderbotSmall
* Update Marian
* Update MBart
* Update MBart
* Update Pegasus
* Update template
* Fix Marian and Pegasus
* Apply style
* Default initializer
* Default initializer
* Default initializer
* Remove int32 casts
* Fix template
* Remove more cast
-
Daniel Stancl authored
* Fix head_mask and decoder_head_mask in TFT5 models
* Enable test_headmasking for both the TFT5 tester and the TFT5EncoderOnly tester
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
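For context, a hedged sketch of passing the now-fixed head masks to a TFT5 model; the checkpoint name and the 6-layer x 8-head shape follow t5-small and are given as assumptions:

```python
import tensorflow as tf
from transformers import T5Tokenizer, TFT5Model

model = TFT5Model.from_pretrained("t5-small")
tokenizer = T5Tokenizer.from_pretrained("t5-small")

inputs = tokenizer("studies have shown that owning a dog is good for you", return_tensors="tf")

# 1.0 keeps an attention head, 0.0 masks it; shape is (num_layers, num_heads).
head_mask = tf.ones((6, 8))
decoder_head_mask = tf.ones((6, 8))

outputs = model(
    input_ids=inputs["input_ids"],
    decoder_input_ids=inputs["input_ids"],
    head_mask=head_mask,
    decoder_head_mask=decoder_head_mask,
)
print(outputs.last_hidden_state.shape)
```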
-