- 24 Jun, 2020 18 commits
- Setu Shah authored
- Sylvain Gugger authored
- Sai Saketh Aluru authored
  * Add dehatebert-mono-arabic readme card
  * Update dehatebert-mono-arabic model card
  * model cards for Hate-speech-CNERG models
- Lysandre Debut authored
  * Cleaning TensorFlow models
  * Update all classes style
  * Don't average loss
- Sylvain Gugger authored
- Ali Modarressi authored
- Sylvain Gugger authored
  * Try with the same command
  * Try like this
- Sylvain Gugger authored
- Patrick von Platen authored
  * fix use cache
  * add bart use cache
  * fix bart
  * finish bart
- Sylvain Gugger authored
- Patrick von Platen authored
- Patrick von Platen authored
  * add benchmark for all kinds of models (a usage sketch follows below)
  * improved import
  * delete bogus files
  * make style
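A minimal usage sketch of such a benchmark run, assuming the PyTorchBenchmark utilities in transformers; the model name and measurement settings are illustrative:

```python
from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

# Illustrative settings; any model identifier can be substituted.
args = PyTorchBenchmarkArguments(
    models=["bert-base-uncased"],
    batch_sizes=[8],
    sequence_lengths=[128, 512],
)
benchmark = PyTorchBenchmark(args)
results = benchmark.run()  # reports inference speed and memory usage
```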
- Sylvain Gugger authored
- Sylvain Gugger authored
- flozi00 authored
  * Create README.md
  * Update model_cards/a-ware/roberta-large-squad-classification/README.md
  Co-authored-by: Julien Chaumond <chaumond@gmail.com>
- Adriano Diniz authored
  Fix/add information in README.md
- ahotrod authored
  electra_large_discriminator_squad2_512 Question Answering LM
- Kevin Canwen Xu authored
  * Fix PABEE division by zero error (see the sketch below)
  * patience=0 by default
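A hypothetical guard illustrating the division-by-zero pattern being fixed; none of these names come from the actual PABEE code:

```python
def average_exit_layer(exit_layers: list) -> float:
    """Average transformer layer at which examples early-exited.

    With patience=0, early exit is effectively disabled and no exit
    statistics accumulate, so guard the denominator before dividing.
    """
    if not exit_layers:  # avoids ZeroDivisionError when nothing exited early
        return 0.0
    return sum(exit_layers) / len(exit_layers)
```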
- 23 Jun, 2020 11 commits
- Sylvain Gugger authored
  * Only put tensors on a device (see the sketch below)
  * Type hint and unpack list comprehension
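A minimal sketch of the "only put tensors on a device" idea (hypothetical helper, not the exact Trainer code): non-tensor values in a batch dict are passed through untouched, since only torch.Tensor has a .to() method.

```python
import torch

def prepare_inputs(inputs: dict, device: torch.device) -> dict:
    """Move only the tensor values of a batch onto the target device."""
    return {
        k: v.to(device) if isinstance(v, torch.Tensor) else v
        for k, v in inputs.items()
    }

# Usage: batch = prepare_inputs(batch, torch.device("cuda"))
```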
- Sylvain Gugger authored
  * Add version control menu
  * Constify things
  * Apply suggestions from code review
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  Co-authored-by: Julien Chaumond <chaumond@gmail.com>
- Sam Shleifer authored
- Julien Chaumond authored
- Sam Shleifer authored
- Thomas Wolf authored
- Thomas Wolf authored
  * Add return lengths
  * make pad a bit more flexible so it can be used as collate_fn
  * check all kwargs sent to encoding method are known
  * fixing kwargs in encodings
  * New AddedToken class in Python: this class lets you specify specific tokenization behaviors for some special tokens, used in particular for GPT2 and RoBERTa to control how whitespace is stripped around special tokens (see the sketch below)
  * style and quality
  * switched to the huggingface tokenizers library for AddedTokens
  * up to tokenizers 0.8.0-rc3 - update API to use AddedToken state
  * style and quality
  * do not raise an error on additional or unused kwargs for tokenize(), only a warning
  * transfo-xl pretrained model requires torch
  * Update src/transformers/tokenization_utils.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
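A minimal sketch of the AddedToken behavior described above, assuming the tokenizers-library API this commit switches to; the token string is illustrative:

```python
from tokenizers import AddedToken
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# lstrip=True makes the new token absorb whitespace on its left, so
# "hello <special_token>" tokenizes the same as "hello<special_token>".
tokenizer.add_tokens([AddedToken("<special_token>", lstrip=True, rstrip=False)])
print(tokenizer.tokenize("hello <special_token>"))
```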
- Patrick von Platen authored
  * improve mem handling
  * improve mem for pos ax encodings
- Sam Shleifer authored
- Sam Shleifer authored
- Sam Shleifer authored
- 22 Jun, 2020 11 commits
- flozi00 authored
  * [Modelcard] bart-squadv2
  * Update README.md
  * Update README.md
- flozi00 authored
- Fran Martinez authored
  * Create README.md
  * changes in model usage section
  * minor changes in output visualization
  * minor errata in readme
- furunkel authored
  * Create README.md
  * Update README.md
- bogdankostic authored
- Adriano Diniz authored
- Adriano Diniz authored
- Adriano Diniz authored
  * Create README.md
  * Apply suggestions from code review
  Co-authored-by: Julien Chaumond <chaumond@gmail.com>
- Michaël Benesty authored
  * Add link to new community notebook (optimization) related to https://github.com/huggingface/transformers/issues/4842#event-3469184635
    This notebook is about benchmarking model training with and without the dynamic padding optimization (https://github.com/ELS-RD/transformers-notebook). Using dynamic padding on MNLI provides a **4.7x training time reduction**, with the max pad length set to 512. The effect is strong because few examples in this dataset are much longer than 400 tokens. In practice it will depend on the dataset, but it always brings an improvement and, after more than 20 experiments listed in this [article](https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9q-21bf7129db9e?source=friends_link&sk=10a45a0ace94b3255643d81b6475f409), it seems not to hurt performance. Following advice from @patrickvonplaten I do the PR myself :-) (a sketch of the idea follows below)
  * Update notebooks/README.md
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
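A minimal sketch of the dynamic padding idea the notebook benchmarks (a hypothetical collate function, not the notebook's code): each batch is padded only to its own longest sequence instead of a fixed global length such as 512.

```python
import torch

def dynamic_padding_collate(batch, pad_token_id=0):
    """Pad a batch of examples to the length of its longest member."""
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, attention_mask = [], []
    for ex in batch:
        ids = list(ex["input_ids"])
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {
        "input_ids": torch.tensor(input_ids),
        "attention_mask": torch.tensor(attention_mask),
    }

# Usage: DataLoader(dataset, batch_size=32, collate_fn=dynamic_padding_collate)
```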
- Lee Haau-Sing authored
  * nyu-mll: roberta on smaller datasets
  * Update README.md
  * Update README.md
  Co-authored-by: Alex Warstadt <alexwarstadt@gmail.com>
- Sylvain Gugger authored