- 25 May, 2020 9 commits
-
-
Patrick von Platen authored
* fix reformer num buckets * fix * adapt docs * set num buckets in config
-
Elman Mansimov authored
-
Suraj Patil authored
-
Oliver Guhr authored
It looks like the conference has changed the link to the paper.
-
Sho Arora authored
-
Manuel Romero authored
-
Ali Safaya authored
-
Antonis Maronikolakis authored
-
Suraj Patil authored
* added LongformerForQuestionAnswering * add LongformerForQuestionAnswering * fix import for LongformerForMaskedLM * add LongformerForQuestionAnswering * hardcoded sep_token_id * compute attention_mask if not provided * combine global_attention_mask with attention_mask when provided * update example in docstring * add assert error messages, better attention combine * add test for longformerForQuestionAnswering * typo * cast global_attention_mask to long * make style * Update src/transformers/configuration_longformer.py * Update src/transformers/configuration_longformer.py * fix the code quality * Merge branch 'longformer-for-question-answering' of https://github.com/patil-suraj/transformers into longformer-for-question-answering Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
- 23 May, 2020 1 commit
-
-
Bharat Raghunathan authored
-
- 22 May, 2020 15 commits
-
-
Bijay Gurung authored
* Add Type Hints to modeling_utils.py Closes #3911 Add Type Hints to methods in `modeling_utils.py` Note: The coverage isn't 100%. Mostly skipped internal methods. * Reformat according to `black` and `isort` * Use typing.Iterable instead of Sequence * Parameterize Iterable by its generic type * Use typing.Optional when None is the default value * Adhere to style guideline * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Funtowicz Morgan authored
* Warn the user about max_len being on the path to be deprecated. * Ensure better backward compatibility when max_len is provided to a tokenizer. * Make sure to override the parameter and not the actual instance value. * Format & quality
-
Patrick von Platen authored
-
Sam Shleifer authored
* Fix pipelines defaults bug * one liner * style
-
Julien Chaumond authored
As discussed w/ @lysandrejik packaging is maintained by PyPA (the Python Packaging Authority), and should be lightweight and stable
-
Lysandre authored
-
Alexander Measure authored
changed from https://https://arxiv.org/abs/2001.04451.pdf to https://arxiv.org/abs/2001.04451.pdf
-
HUSEIN ZOLKEPLI authored
-
Anthony MOI authored
-
Julien Chaumond authored
cc @sshleifer
-
Patrick von Platen authored
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-
Frankie Liuzzi authored
* added functionality for electra classification head * unneeded dropout * Test ELECTRA for sequence classification * Style Co-authored-by:
Frankie <frankie@frase.io> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
- 21 May, 2020 4 commits
-
-
Lysandre authored
-
Lysandre Debut authored
* TPU hangs when saving optimizer/scheduler * Style * ParallelLoader is not a DataLoader * Style * Addressing @julien-c's comments
-
Zhangyx authored
Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com (#4463) * Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website. * Use Split enum + always output the label name Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Tobias Lee authored
* fix no grad in second pruning and typo * fix prune heads attention mismatch problem * fix * fix * fix * run make style * run make style
-
- 20 May, 2020 11 commits
-
-
Julien Chaumond authored
-
Julien Chaumond authored
-
Cola authored
Remove warning of deprecated overload of addcdiv_ Fix #4451
-
Julien Plu authored
* Better None gradients handling * Apply Style * Apply Style
-
Oliver Åstrand authored
* Exclude LayerNorms from weight decay * Include both formats of layer norm
-
Rens authored
-
Nathan Cooper authored
-
Timo Moeller authored
-
Lysandre Debut authored
* There is one missing key in BERT * Correct device for CamemBERT model * RoBERTa tokenization adding prefix space * Style
-
Manuel Romero authored
-
Manuel Romero authored
-