- 14 Jan, 2021 7 commits
-
Lysandre Debut authored
-
Lysandre Debut authored
* conda build -> conda-build * Syntax error * conda build -> conda-build + 4.2.0 * Prepare to merge in `master`
-
Stas Bekman authored
* note on how to get to deps from shell
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix text
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Julien Plu authored
-
Julien Plu authored
* Compliancy with tf-nightly * Add more version + restore min version check
-
Sylvain Gugger authored
* Switch metrics in run_ner to datasets * Add flag to return all metrics * Upstream (and rename) sortish_sampler * Revert "Upstream (and rename) sortish_sampler" This reverts commit e07d0dcf650c2bae36da011dd76c77a8bb4feb0d.
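For context, run_ner now takes its token-classification metrics from the datasets library instead of computing them by hand. A minimal sketch of the underlying call, assuming the seqeval metric (requires the seqeval package) and illustrative label sequences:

```python
from datasets import load_metric

# seqeval computes entity-level precision/recall/F1 from BIO-tagged sequences
metric = load_metric("seqeval")

predictions = [["O", "B-PER", "I-PER", "O"]]  # illustrative model output
references = [["O", "B-PER", "I-PER", "O"]]   # illustrative gold labels

results = metric.compute(predictions=predictions, references=references)
print(results["overall_f1"], results["overall_accuracy"])
```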
-
Sylvain Gugger authored
* Fix Trainer with a parallel model * More clean up
-
- 13 Jan, 2021 14 commits
-
Patrick von Platen authored
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-
Lysandre Debut authored
* Fix conversational pipeline test
* LayoutLM
* ProphetNet
* BART
* Blenderbot & small
* Marian
* mBART
* Pegasus
* Tapas tokenizer
* BERT2BERT test
* Style
* Example requirements
* TF BERT2BERT test
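For reference, a minimal sketch of exercising the conversational pipeline whose test is fixed here, using its default DialoGPT checkpoint (model choice and prompts are illustrative):

```python
from transformers import Conversation, pipeline

# The conversational pipeline tracks past turns via the Conversation object
chatbot = pipeline("conversational", model="microsoft/DialoGPT-medium")

conversation = Conversation("Hi, can you recommend a movie for tonight?")
conversation = chatbot(conversation)
print(conversation.generated_responses[-1])

conversation.add_user_input("Something a bit older, please.")
conversation = chatbot(conversation)
print(conversation.generated_responses[-1])
```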
-
Sylvain Gugger authored
* Fix data parallelism in Trainer
* Update src/transformers/training_args.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
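A minimal sketch of the multi-GPU data-parallel setup this concerns: with more than one visible GPU the Trainer wraps the model in torch.nn.DataParallel itself, and per_device_train_batch_size is the batch size on each GPU, not the total (checkpoint and values illustrative):

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

# With several visible GPUs the Trainer handles the DataParallel wrapping;
# the effective train batch size is per_device_train_batch_size * n_gpu.
args = TrainingArguments(output_dir="out", per_device_train_batch_size=8)
trainer = Trainer(model=model, args=args)
```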
-
Stas Bekman authored
-
Yusuke Mori authored
* Update run_glue for do_predict with local test data (#9442)
* Update run_glue (#9442): fix comments ('files' to 'a file')
* Update run_glue (#9442): reflect the code review
* Update run_glue (#9442): auto format
* Update run_glue (#9442): reflect the code review
-
LSinev authored
* make TopKLogitsWarper faster * make TopPLogitsWarper faster
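These warpers filter the next-token distribution during sampling. A minimal sketch of what they do, with random logits standing in for real model scores (vocabulary size and thresholds illustrative):

```python
import torch
from transformers import TopKLogitsWarper, TopPLogitsWarper

input_ids = torch.tensor([[101]])  # dummy prompt ids
scores = torch.randn(1, 50265)     # fake next-token logits

# Keep only the 50 highest-scoring tokens, then restrict to the top-p nucleus;
# everything else is set to -inf so it can never be sampled.
scores = TopKLogitsWarper(top_k=50)(input_ids, scores)
scores = TopPLogitsWarper(top_p=0.9)(input_ids, scores)

probs = torch.softmax(scores, dim=-1)
next_token = torch.multinomial(probs, num_samples=1)
```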
-
Pavel Tarashkevich authored
Co-authored-by: Pavel Tarashkevich <Pavel.Tarashkievich@orange.com>
-
Lysandre Debut authored
-
Julien Chaumond authored
* Update pretrained_models.rst To clarify things cf. this tweet for instance https://twitter.com/RTomMcCoy/status/1349094111505211395 * format
-
Suraj Patil authored
* add model_input_names * fix test
-
Stas Bekman authored
* deepspeed integration
* style
* add test
* ds wants to do its own backward
* fp16 assert
* Update src/transformers/training_args.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* for clarity extract what args are being passed to deepspeed
* introduce the concept of self.wrapped_model
* s/self.wrapped_model/self.model_wrapped/
* complete transition to self.wrapped_model / self.model
* fix
* doc
* give ds its own init
* add custom overrides, handle bs correctly
* fix test
* clean up model_init logic, fix small bug
* complete fix
* collapse --deepspeed_config into --deepspeed
* style
* start adding doc notes
* style
* implement hf2ds optimizer and scheduler configuration remapping
* oops
* call get_num_training_steps absolutely when needed
* workaround broken auto-formatter
* deepspeed_config arg is no longer needed - fixed in deepspeed master
* use hf's fp16 args in config
* clean
* start on the docs
* rebase cleanup
* finish up --fp16
* clarify the supported stages
* big refactor thanks to discovering deepspeed.init_distributed
* cleanup
* revert fp16 part
* add checkpoint-support
* more init ds into integrations
* extend docs
* cleanup
* unfix docs
* clean up old code
* imports
* move docs
* fix logic
* make it clear which file it's referring to
* document nodes/gpus
* style
* wrong format
* style
* deepspeed handles gradient clipping
* easier to read
* major doc rewrite
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* docs
* switch to AdamW optimizer
* style
* Apply suggestions from code review
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
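For context, the integration lets a DeepSpeed JSON config drive ZeRO optimization and fp16 settings for the Trainer. A rough sketch, assuming the single --deepspeed argument introduced here takes the path to that config and that the script is launched with the deepspeed launcher; all config values are illustrative:

```python
import json
from transformers import TrainingArguments

# Illustrative DeepSpeed config: ZeRO stage 2 with fp16 enabled
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "train_micro_batch_size_per_gpu": 8,
}
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f)

# The Trainer picks the config up through TrainingArguments; the script itself
# is then typically launched via `deepspeed your_script.py --deepspeed ds_config.json`.
args = TrainingArguments(output_dir="out", fp16=True, deepspeed="ds_config.json")
```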
-
- 12 Jan, 2021 13 commits
-
Sylvain Gugger authored
* Use the right version of tokenizers * Try another way * Try another way * Deps are installed from there... * Deps are installed from there... * Revert last * remove needless comment
-
Sylvain Gugger authored
* Add target contextmanager and rework prepare_seq2seq_batch * Fix tests, treat BART and Barthez * Add last tokenizers * Fix test * Set src token before calling the superclass * Remove special behavior for T5 * Remove needless imports * Remove needless asserts
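This reworks how the seq2seq tokenizers encode target text; a rough sketch of the usage, assuming the new context manager is as_target_tokenizer (checkpoint and sentences illustrative):

```python
from transformers import MarianTokenizer

tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-ro")

src_texts = ["UN Chief says there is no military solution in Syria"]
tgt_texts = ["Şeful ONU declară că nu există o soluţie militară în Siria"]

batch = tok(src_texts, padding=True, return_tensors="pt")
# Inside the context manager the tokenizer switches to target-language mode,
# so the labels get the right special tokens and vocabulary.
with tok.as_target_tokenizer():
    labels = tok(tgt_texts, padding=True, return_tensors="pt")
batch["labels"] = labels["input_ids"]
```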
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Lysandre Debut authored
-
Lysandre Debut authored
-
NielsRogge authored
* Add LayoutLMForSequenceClassification and integration tests
  Improve docs
  Add LayoutLM notebook to list of community notebooks
* Make style & quality
* Address comments by @sgugger, @patrickvonplaten and @LysandreJik
* Fix rebase with master
* Reformat in one line
* Improve code examples as requested by @patrickvonplaten
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
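LayoutLM classifies a document from its tokens plus their positions on the page. A rough sketch of the new sequence-classification head, with made-up words, boxes, and label (boxes are normalized to a 0-1000 coordinate space):

```python
import torch
from transformers import LayoutLMForSequenceClassification, LayoutLMTokenizer

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForSequenceClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=2
)

words = ["Invoice", "total:", "1200"]
word_boxes = [[72, 60, 180, 85], [190, 60, 260, 85], [270, 60, 330, 85]]  # illustrative

# Repeat each word's box for every sub-word token, and add boxes for [CLS]/[SEP]
token_boxes = []
for word, box in zip(words, word_boxes):
    token_boxes += [box] * len(tokenizer.tokenize(word))
token_boxes = [[0, 0, 0, 0]] + token_boxes + [[1000, 1000, 1000, 1000]]

encoding = tokenizer(" ".join(words), return_tensors="pt")
outputs = model(
    input_ids=encoding["input_ids"],
    bbox=torch.tensor([token_boxes]),
    attention_mask=encoding["attention_mask"],
    token_type_ids=encoding["token_type_ids"],
    labels=torch.tensor([1]),  # illustrative class label
)
print(outputs.loss, outputs.logits)
```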
-
Suraj Patil authored
* fix t5 fp16
-
Patrick von Platen authored
-
Lysandre Debut authored
-
Simon Brandeis authored
-
Patrick von Platen authored
* fix naming issues * better names
-
Patrick von Platen authored
* make templates ready
* make add_new_model_command_ready
* finish tf bart
* prepare tf mbart
* finish tf bart
* add tf mbart
* add marian
* prep pegasus
* add tf pegasus
* push blenderbot tf
* add blenderbot
* add blenderbot small
* clean-up
* make fix copy
* define blend bot tok
* fix
* up
* make style
* add to docs
* add copy statements
* overwrite changes
* improve
* fix docs
* finish
* fix last slow test
* fix missing git conflict line
* fix blenderbot
* up
* fix blenderbot small
* load changes
* finish copied from
* upload fix
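With the TF versions of the BART family in place, they can be used like their PyTorch counterparts; a minimal sketch with an illustrative summarization checkpoint and input:

```python
from transformers import BartTokenizer, TFBartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = TFBartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = "The tower is 324 metres tall, about the same height as an 81-storey building."
inputs = tokenizer([article], return_tensors="tf", truncation=True)

# Beam-search summary with the TF generate() implementation
summary_ids = model.generate(inputs["input_ids"], max_length=40, num_beams=4)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```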
-
- 11 Jan, 2021 6 commits
-
Stas Bekman authored
After experimenting with different numbers of workers (https://github.com/huggingface/transformers/issues/9496#issuecomment-758145868), 4-5 workers seem to be optimal - let's go with 4, as surely we wouldn't find a CPU with fewer cores these days. Fixes part of https://github.com/huggingface/transformers/issues/9496 @sgugger
-
Stas Bekman authored
* round numbers * style * round only on logging
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Julien Plu authored
-
Stas Bekman authored
* fix bad merge - dropped code
* remove --model_parallel
* Deal with TrainingArguments
* Use a private attr and fix batch sizes
* fix _n_gpu
* add is_parallel helper wrapper
* fix attribute
* introduce a new attribute is_model_parallel
* docs
* docs
* Put back init False and rearrange doc
* Ignore non-init args in HFArgumentParser
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
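This cleans up how the Trainer handles a model that has been split across GPUs with naive model parallelism. A rough sketch of the user side, assuming a parallelizable model such as GPT-2 and at least two GPUs (checkpoint and batch size illustrative):

```python
from transformers import GPT2LMHeadModel, Trainer, TrainingArguments

model = GPT2LMHeadModel.from_pretrained("gpt2")
model.parallelize()  # spreads the transformer blocks across the available GPUs

# The Trainer notices the model is already parallelized (the new is_model_parallel
# attribute mentioned above) and adjusts device placement and batch-size accounting.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=4),
)
```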
-