- 15 Jan, 2021 4 commits
-
-
Lysandre Debut authored
* Ignore lm_head decoder bias warning * Revert "Ignore lm_head decoder bias warning" This reverts commit f25177a9da6ca898e351f46c8b1515971de5c670. * predictions -> lm_head
-
Julien Plu authored
* Add warning * Remove unused import * Fix missing call * Fix missing call * Completely remove token_type_ids * Apply style * Remove unused import * Update src/transformers/models/mpnet/modeling_tf_mpnet.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Patrick von Platen authored
* fix tf led * remove loop file
-
Kiyoung Kim authored
This reverts commit 3f40070c.
-
- 14 Jan, 2021 11 commits
-
-
Stas Bekman authored
* [doc] install + 1-gpu deployment * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improvements Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Upstream (and rename) sortish sampler * Use proper sampler * Update src/transformers/trainer_pt_utils.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Kiyoung Kim authored
* gradient accumulation for tftrainer * label naming Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * label naming Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Lysandre authored
-
Lysandre Debut authored
-
Lysandre Debut authored
* conda build -> conda-build * Syntax error * conda build -> conda-build + 4.2.0 * Prepare to merge in `master`
-
Stas Bekman authored
* note on how to get to deps from shell * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix text Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Julien Plu authored
-
Julien Plu authored
* Compliancy with tf-nightly * Add more version + restore min version check
-
Sylvain Gugger authored
* Switch metrics in run_ner to datasets * Add flag to return all metrics * Upstream (and rename) sortish_sampler * Revert "Upstream (and rename) sortish_sampler" This reverts commit e07d0dcf650c2bae36da011dd76c77a8bb4feb0d.
-
Sylvain Gugger authored
* Fix Trainer with a parallel model * More clean up
-
- 13 Jan, 2021 14 commits
-
-
Patrick von Platen authored
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-
Lysandre Debut authored
* Fix conversational pipeline test * LayoutLM * ProphetNet * BART * Blenderbot & small * Marian * mBART * Pegasus * Tapas tokenizer * BERT2BERT test * Style * Example requirements * TF BERT2BERT test
-
Sylvain Gugger authored
* Fix data parallelism in Trainer * Update src/transformers/training_args.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Stas Bekman authored
-
Yusuke Mori authored
* Update run_glue for do_predict with local test data (#9442) * Update run_glue (#9442): fix comments ('files' to 'a file') * Update run_glue (#9442): reflect the code review * Update run_glue (#9442): auto format * Update run_glue (#9442): reflect the code review -
LSinev authored
* make TopKLogitsWarper faster * make TopPLogitsWarper faster
-
Pavel Tarashkevich authored
Co-authored-by:Pavel Tarashkevich <Pavel.Tarashkievich@orange.com>
-
Lysandre Debut authored
-
Julien Chaumond authored
* Update pretrained_models.rst To clarify things cf. this tweet for instance https://twitter.com/RTomMcCoy/status/1349094111505211395 * format
-
Suraj Patil authored
* add model_input_names * fix test
-
Stas Bekman authored
* deepspeed integration * style * add test * ds wants to do its own backward * fp16 assert * Update src/transformers/training_args.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * for clarity extract what args are being passed to deepspeed * introduce the concept of self.wrapped_model * s/self.wrapped_model/self.model_wrapped/ * complete transition to self.wrapped_model / self.model * fix * doc * give ds its own init * add custom overrides, handle bs correctly * fix test * clean up model_init logic, fix small bug * complete fix * collapse --deepspeed_config into --deepspeed * style * start adding doc notes * style * implement hf2ds optimizer and scheduler configuration remapping * oops * call get_num_training_steps absolutely when needed * workaround broken auto-formatter * deepspeed_config arg is no longer needed - fixed in deepspeed master * use hf's fp16 args in config * clean * start on the docs * rebase cleanup * finish up --fp16 * clarify the supported stages * big refactor thanks to discovering deepspeed.init_distributed * cleanup * revert fp16 part * add checkpoint-support * more init ds into integrations * extend docs * cleanup * unfix docs * clean up old code * imports * move docs * fix logic * make it clear which file it's referring to * document nodes/gpus * style * wrong format * style * deepspeed handles gradient clipping * easier to read * major doc rewrite * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * docs * switch to AdamW optimizer * style * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * clarify doc Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 12 Jan, 2021 11 commits
-
-
Sylvain Gugger authored
* Use the right version of tokenizers * Try another way * Try another way * Deps are installed from there... * Deps are installed from there... * Revert last * remove needless comment
-
Sylvain Gugger authored
* Add target contextmanager and rework prepare_seq2seq_batch * Fix tests, treat BART and Barthez * Add last tokenizers * Fix test * Set src token before calling the superclass * Remove special behavior for T5 * Remove needless imports * Remove needless asserts
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Lysandre Debut authored
-
Lysandre Debut authored
-
NielsRogge authored
* Add LayoutLMForSequenceClassification and integration tests Improve docs Add LayoutLM notebook to list of community notebooks * Make style & quality * Address comments by @sgugger, @patrickvonplaten and @LysandreJik * Fix rebase with master * Reformat in one line * Improve code examples as requested by @patrickvonplaten Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Suraj Patil authored
* fix t5 fp16
-
Patrick von Platen authored
-
Lysandre Debut authored
-
Simon Brandeis authored
-