- 22 Jun, 2020 26 commits
-
bogdankostic authored
-
Adriano Diniz authored
-
Adriano Diniz authored
-
Adriano Diniz authored
* Create README.md * Apply suggestions from code review Co-authored-by:Julien Chaumond <chaumond@gmail.com>
-
Michaël Benesty authored
* Add link to new community notebook (optimization) related to https://github.com/huggingface/transformers/issues/4842#event-3469184635 This notebook is about benchmarking model training with/without dynamic padding optimization. https://github.com/ELS-RD/transformers-notebook Using dynamic padding on MNLI provides a **4.7 times training time reduction**, with max pad length set to 512. The effect is strong because few examples are >> 400 tokens in this dataset. IRL, it will depend on the dataset, but it always brings improvement and, after more than 20 experiments listed in this [article](https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9q-21bf7129db9e?source=friends_link&sk=10a45a0ace94b3255643d81b6475f409 ), it seems not to hurt performance. Following advice from @patrickvonplaten, I opened the PR myself :-) * Update notebooks/README.md Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
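The idea, in a minimal sketch (the collate function and field names below are hypothetical, not taken from the linked notebook): pad each batch only to its own longest example rather than to a fixed maximum length.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def dynamic_padding_collate(batch):
    # Pad each batch only up to its own longest sequence instead of a fixed
    # max length, so batches of short examples waste far less compute.
    input_ids = [torch.tensor(example["input_ids"]) for example in batch]
    labels = torch.tensor([example["label"] for example in batch])
    padded = pad_sequence(input_ids, batch_first=True, padding_value=0)
    attention_mask = (padded != 0).long()
    return {"input_ids": padded, "attention_mask": attention_mask, "labels": labels}

# loader = DataLoader(train_dataset, batch_size=32, collate_fn=dynamic_padding_collate)
```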
-
Lee Haau-Sing authored
* nyu-mll: roberta on smaller datasets * Update README.md * Update README.md Co-authored-by:Alex Warstadt <alexwarstadt@gmail.com>
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Quicktour part 1 * Update * All done * Typos Co-authored-by:
Thomas Wolf <thomwolf@users.noreply.github.com> * Address comments in quick tour * Update docs/source/quicktour.rst Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update from feedback Co-authored-by:
Thomas Wolf <thomwolf@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Thomas Wolf authored
* Cleaner warning when loading pretrained models This makes the logging messages more explicit when using the various `from_pretrained` methods. It also emits these messages with `logging.warning` because they flag a common source of silent mistakes. * Update src/transformers/modeling_utils.py Co-authored-by:
Julien Chaumond <chaumond@gmail.com> * Update src/transformers/modeling_utils.py Co-authored-by:
Julien Chaumond <chaumond@gmail.com> * style and quality Co-authored-by:
Julien Chaumond <chaumond@gmail.com>
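A small usage sketch, relying only on standard Python logging (not a transformers-specific API), of surfacing those warnings when loading a checkpoint whose head is newly initialized:

```python
import logging

# Standard Python logging: the library logs from module-level loggers such as
# "transformers.modeling_utils", so their level controls what a user sees.
logging.basicConfig(level=logging.INFO)
logging.getLogger("transformers.modeling_utils").setLevel(logging.WARNING)

from transformers import BertForSequenceClassification

# The classification head is not part of the pretrained checkpoint, so loading
# now emits an explicit warning about newly initialized weights.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
```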
-
Lysandre Debut authored
* Have documentation fail on warning * Force ci failure * Revert "Force ci failure" This reverts commit f0a4666ec2eb4cd00a4da48af3357defc63324a0.
-
Sylvain Gugger authored
-
Adriano Diniz authored
-
Manuel Romero authored
* Create README.md @julien-c check out that dataset meta tag is right * Fix typo Co-authored-by:Julien Chaumond <chaumond@gmail.com>
-
Manuel Romero authored
-
Patrick von Platen authored
-
Thomas Wolf authored
* fix #5081 and improve backward compatibility (slightly) * add nlp to setup.cfg - style and quality * align default to previous default * remove test that doesn't generalize
-
Malte authored
Fix for https://github.com/huggingface/transformers/issues/3809
-
Iz Beltagy authored
* add support for gradient checkpointing in BERT * fix unit tests * isort * black * workaround for `torch.utils.checkpoint.checkpoint` not accepting bool * Revert "workaround for `torch.utils.checkpoint.checkpoint` not accepting bool" This reverts commit 5eb68bb804f5ffbfc7ba13c45a47717f72d04574. * workaround for `torch.utils.checkpoint.checkpoint` not accepting bool Co-authored-by:Lysandre Debut <lysandre@huggingface.co>
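A hedged usage sketch; the `gradient_checkpointing` config flag is assumed here to be what this PR adds for BERT, so check the released config for the exact attribute name.

```python
from transformers import BertConfig, BertForSequenceClassification

# Assumed flag name from this PR: activations are recomputed during the backward
# pass, trading extra compute for a much smaller memory footprint.
config = BertConfig.from_pretrained("bert-base-uncased", gradient_checkpointing=True)
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", config=config)
```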
-
Joseph Liu authored
* Configure all models to use output_hidden_states as argument passed to forward() * Pass all tests * Remove cast_bool_to_primitive in TF Flaubert model * correct tf xlnet * add pytorch test * add tf test * Fix broken tests * Configure all models to use output_hidden_states as argument passed to forward() * Pass all tests * Remove cast_bool_to_primitive in TF Flaubert model * correct tf xlnet * add pytorch test * add tf test * Fix broken tests * Refactor output_hidden_states for mobilebert * Reset and remerge to master Co-authored-by:
Joseph Liu <joseph.liu@coinflex.com> Co-authored-by:
patrickvonplaten <patrick.v.platen@gmail.com>
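A minimal usage sketch of the new call-time argument (tuple-style outputs assumed, as in this release):

```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello world", return_tensors="pt")
# output_hidden_states is now a forward() argument, not only a config attribute.
outputs = model(**inputs, output_hidden_states=True)
hidden_states = outputs[-1]  # one tensor per layer, plus the embedding output
```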
-
Kevin Canwen Xu authored
* Add model cards for Microsoft's MiniLM * XLMRobertaTokenizer * format * Add thumbnail * finishing up
-
RafaelWO authored
* Fixed resize_token_embeddings for transfo_xl model * Fixed resize_token_embeddings for transfo_xl. Added custom methods to TransfoXLPreTrainedModel for resizing layers of the AdaptiveEmbedding. * Updated docstring * Fixed resizing cutoffs; added check for new size of embedding layer. * Added test for resize_token_embeddings * Fixed code quality * Fixed unchanged cutoffs in model.config * Added feature to move added tokens in tokenizer. * Fixed code quality * Added feature to move added tokens in tokenizer. * Fixed code quality * Fixed docstring, renamed sym to token. Co-authored-by:Rafael Weingartner <rweingartner.its-b2015@fh-salzburg.ac.at>
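A minimal sketch of the fixed path, assuming new tokens are simply appended to the vocabulary:

```python
from transformers import TransfoXLLMHeadModel, TransfoXLTokenizer

tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
model = TransfoXLLMHeadModel.from_pretrained("transfo-xl-wt103")

tokenizer.add_tokens(["<new_token_1>", "<new_token_2>"])
# With this fix, the AdaptiveEmbedding layers and cutoffs are resized as well.
model.resize_token_embeddings(len(tokenizer))
```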
-
Sylvain Gugger authored
* Update glossary * Update docs/source/glossary.rst Co-authored-by:Patrick von Platen <patrick.v.platen@gmail.com>
-
Patrick von Platen authored
* finish benchmark * fix isort * fix setup cfg * retab * fix time measuring of tf graph mode * fix tf cuda * clean code * better error message
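A small usage sketch of the benchmark utilities this touches, assuming the `PyTorchBenchmark` / `PyTorchBenchmarkArguments` entry points:

```python
from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

# Hypothetical minimal run: time inference for one model at one input shape.
args = PyTorchBenchmarkArguments(
    models=["bert-base-uncased"],
    batch_sizes=[8],
    sequence_lengths=[128],
)
benchmark = PyTorchBenchmark(args)
results = benchmark.run()
```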
-
Zihao Fu authored
fix bart doc
-
Mikael Souza authored
-
flozi00 authored
-
- 21 Jun, 2020 1 commit
-
Ilya Boytsov authored
Authored-by:i.boytsov <i.boytsov@MAC867.local>
-
- 20 Jun, 2020 7 commits
-
Tim Suchanek authored
-
Kevin Canwen Xu authored
-
Julien Chaumond authored
* SummarizationPipeline: init required task name * Update src/transformers/pipelines.py Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> * Apply suggestions from code review Co-authored-by:
Sam Shleifer <sshleifer@gmail.com>
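Usage is unchanged; a minimal sketch with the default checkpoint:

```python
from transformers import pipeline

# The pipeline now registers its own task name ("summarization") at init time.
summarizer = pipeline("summarization")
summary = summarizer("Very long article text ...", max_length=60, min_length=10)
print(summary[0]["summary_text"])
```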
-
Kevin Canwen Xu authored
* Add BERT Loses Patience (Patience-based Early Exit) * update model archive * update format * sort import * flake8 * Add results * full results * align the table * refactor to inherit * default per gpu eval = 1 * Formatting * Formatting * isort * modify readme * Add check * Fix format * Fix format * Doc strings * ALBERT & BERT for sequence classification don't inherit from the original anymore * Remove incorrect comments * Remove incorrect comments * Remove incorrect comments * Sync up with new code * Sync up with new code * Add a test * Add a test * Add a test * Add a test * Add a test * Add a test * Finishing up!
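A minimal sketch of the patience rule itself (not the PR's code): exit as soon as `patience` consecutive internal classifiers agree on the prediction.

```python
import torch

def patience_based_exit(per_layer_logits, patience=3):
    """Sketch of Patience-based Early Exit: stop once `patience` consecutive
    internal classifiers produce the same prediction."""
    previous, streak = None, 0
    for layer, logits in enumerate(per_layer_logits):  # one logits tensor per layer
        prediction = logits.argmax(dim=-1)
        streak = streak + 1 if previous is not None and torch.equal(prediction, previous) else 1
        previous = prediction
        if streak >= patience:
            return prediction, layer  # early exit at this layer
    return previous, len(per_layer_logits) - 1  # no early exit: use the last layer
```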
-
Zhu Baohe authored
-
Kevin Canwen Xu authored
-
Lysandre authored
-
- 19 Jun, 2020 5 commits
-
Vasily Shamporov authored
* Add MobileBert * Quality + Conversion script * style * Update src/transformers/modeling_mobilebert.py * Links to S3 * Style * TFMobileBert Slight fixes to the pytorch MobileBert Style * MobileBertForMaskedLM (PT + TF) * MobileBertForNextSentencePrediction (PT + TF) * MobileBertFor{MultipleChoice, TokenClassification} (PT + TF) * Tests + Auto * Doc * Tests * Addressing @sgugger's comments * Addressing @patrickvonplaten's comments * Style * Style * Integration test * style * Model card Co-authored-by:Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
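A hedged usage sketch of the new model; `google/mobilebert-uncased` is assumed as the checkpoint name.

```python
from transformers import MobileBertForMaskedLM, MobileBertTokenizer

tokenizer = MobileBertTokenizer.from_pretrained("google/mobilebert-uncased")
model = MobileBertForMaskedLM.from_pretrained("google/mobilebert-uncased")

inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
logits = model(**inputs)[0]
# Find the masked position and print the top predicted token.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
print(tokenizer.convert_ids_to_tokens(int(logits[0, mask_pos].argmax(-1))))
```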
-
Sam Shleifer authored
-
Erick Rocha Fonseca authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
- 18 Jun, 2020 1 commit
-
Sylvain Gugger authored
-