- 20 Aug, 2020 14 commits
-
-
sgugger authored
-
Sylvain Gugger authored
* Move threshold up for flaky test with Electra * Update above as well
-
Ivan Dolgov authored
* xlnet fp16 bug fix * comment cast added * Update modeling_xlnet.py Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
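The underlying issue is a masking constant that overflows in half precision. A minimal sketch of the usual shape of this kind of fix (illustrative only; the actual patch lives in modeling_xlnet.py and the helper name here is hypothetical):

    import torch

    def mask_attention_scores(attn_score: torch.Tensor, attn_mask: torch.Tensor) -> torch.Tensor:
        # -1e30 overflows in float16 (max finite value is ~65504), so pick the
        # masking constant from the score dtype and cast the mask to match.
        big_neg = 65500.0 if attn_score.dtype == torch.float16 else 1e30
        return attn_score - big_neg * attn_mask.to(attn_score.dtype)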
-
Patrick von Platen authored
* fix distilbert * fix typo
-
Denisa Roberts authored
-
Joe Davison authored
* TFTrainer dataset doc & fix evaluation bug discussed in #6551 * add docstring to test/eval datasets
-
Sylvain Gugger authored
* Add tests to Trainer * Test if removing long breaks everything * Remove ugly hack * Fix distributed test * Use float for number of epochs
-
Joe Davison authored
* add intro to nlp lib + links * unique links...
-
sgugger authored
-
Prajjwal Bhargava authored
* removed redundant arg in prepare_inputs * made same change in prediction_loop
-
Romain Rigaux authored
Tested in a local build of the docs, e.g. just above https://huggingface.co/transformers/task_summary.html#causal-language-modeling.

Copy will copy the full code, e.g.:

    for token in top_5_tokens:
        print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))

Instead of currently only:

    for token in top_5_tokens:

for a docs snippet that reads:

    >>> for token in top_5_tokens:
    ...     print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help reduce our carbon footprint.
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help increase our carbon footprint.
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help decrease our carbon footprint.
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help offset our carbon footprint.
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help improve our carbon footprint.

Docs for the option fix: https://sphinx-copybutton.readthedocs.io/en/latest/
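sphinx-copybutton exposes documented options for stripping interpreter prompts before copying; the fix is presumably along these lines in the docs conf.py (a sketch; the exact regex used in the repo may differ):

    # docs/source/conf.py
    # strip ">>> " and "... " prompts so Copy yields runnable code
    copybutton_prompt_text = r">>> |\.\.\. "
    copybutton_prompt_is_regexp = True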
-
Stas Bekman authored
-
Siddharth Jain authored
-
Oren Amsalem authored
-
- 19 Aug, 2020 7 commits
-
-
Sylvain Gugger authored
-
Suraj Patil authored
-
Patrick von Platen authored
-
Sam Shleifer authored
-
Pradhy729 authored
* Feed forward chunking for Distilbert & Albert * Added ff chunking for many other models * Change model signature * Added chunking for XLM * Cleaned up by removing some variables. * remove test_chunking flag Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
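Feed-forward chunking trades a little speed for memory: instead of pushing the whole sequence through the position-wise feed-forward layer at once, it processes slices and concatenates the results. A minimal PyTorch sketch of the idea (illustrative; not the exact transformers implementation, and chunked_feed_forward is a hypothetical name):

    import torch
    import torch.nn as nn

    def chunked_feed_forward(ff: nn.Module, hidden_states: torch.Tensor, chunk_size: int) -> torch.Tensor:
        # hidden_states: (batch, seq_len, dim); ff is applied independently per
        # position, so splitting along the sequence dimension changes peak
        # memory usage but not the result.
        if chunk_size <= 0:  # chunking disabled
            return ff(hidden_states)
        chunks = hidden_states.split(chunk_size, dim=1)
        return torch.cat([ff(chunk) for chunk in chunks], dim=1)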
-
Patrick von Platen authored
* start adding tie encoder to decoder functionality * finish model tying * make style * Apply suggestions from code review * fix t5 list including cross attention * apply Sam's suggestions * Update src/transformers/modeling_encoder_decoder.py * add max depth break point Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
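A rough sketch of what tying encoder to decoder means (illustrative only; the real implementation in modeling_encoder_decoder.py recursively ties matching submodules, and the tying direction and depth guard here are simplified):

    import torch.nn as nn

    def tie_encoder_decoder_weights(encoder: nn.Module, decoder: nn.Module,
                                    depth: int = 0, max_depth: int = 500) -> None:
        # Recurse into structurally matching children and share their
        # parameters, so the tied layers train as a single set of weights.
        if depth > max_depth:  # the "max depth break point" mentioned above
            return
        encoder_children = dict(encoder.named_children())
        for name, decoder_child in decoder.named_children():
            if name in encoder_children:
                tie_encoder_decoder_weights(encoder_children[name], decoder_child,
                                            depth + 1, max_depth)
        if (getattr(encoder, "weight", None) is not None
                and getattr(decoder, "weight", None) is not None
                and encoder.weight.shape == decoder.weight.shape):
            decoder.weight = encoder.weight  # share the tensor itself, not a copy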
-
Sam Shleifer authored
-
- 18 Aug, 2020 13 commits
-
-
Sam Shleifer authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Patrick von Platen authored
* Bert2GPT2 EncoderDecoder model * Update README.md
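The EncoderDecoderModel API this builds on can warm-start a seq2seq model from any pretrained encoder/decoder pair; a usage sketch (checkpoint names are examples):

    from transformers import EncoderDecoderModel

    # warm-start a seq2seq model from a BERT encoder and a GPT2 decoder
    model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "gpt2")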
-
Suraj Patil authored
-
Suraj Patil authored
Minor typo correction @sshleifer
-
Manuel Romero authored
-
Manuel Romero authored
-
Romain Rigaux authored
-
Romain Rigaux authored
-
Stas Bekman authored
As discussed at https://github.com/huggingface/transformers/issues/6317, codecov currently sends an invalid report when it fails to find a code coverage report for the base it checks against, so this gets fixed by:

- require_base: yes # don't report if there is no base coverage report

Let's add this one too for clarity, as it supposedly is already the default:

- require_head: yes # don't report if there is no head coverage report

And perhaps there is no point reporting on doc changes, as they don't make any difference and just generate noise:

- require_changes: true # only comment if there was a change in coverage
-
Stefan Schweter authored
-
Philip May authored
* Update README.md * Update model_cards/german-nlp-group/electra-base-german-uncased/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Ali Modarressi authored
* fixed label datatype for sts-b * naming update * make style * make style
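STS-B is a regression task, so its labels are continuous similarity scores rather than integer class ids; mixing up the two dtypes is the kind of thing this fixes. A tiny illustration (not the actual patch):

    import torch

    # classification tasks use integer class ids
    cls_labels = torch.tensor([0, 1, 1], dtype=torch.long)
    # sts-b labels are continuous scores, so they must be floats
    stsb_labels = torch.tensor([3.8, 1.2, 4.6], dtype=torch.float)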
-
Sam Shleifer authored
-
- 17 Aug, 2020 6 commits
-
-
Jim Regan authored
-
onepointconsulting authored
* Added first model card * Add metadata Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Ikram Ali authored
* [model_cards] roberta-urdu-small added. * [model_cards] typo fixed. * Tweak license format (yaml expects a simple string) Co-authored-by: Ikram Ali <mrikram1989> Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Jim Regan authored
-
Julien Chaumond authored
-
Suraj Patil authored
* tests
-