- 02 Sep, 2020 5 commits
-
-
Harry Wang authored
-
Patrick von Platen authored
-
Parthe Pandit authored
outptus -> outputs in example of BertForPreTraining
-
David Mark Nemeskey authored
* Create README.md Model card for huBERT.
* Update README.md lowercase h
* Update model_cards/SZTAKI-HLT/hubert-base-cc/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Patrick von Platen authored
-
- 01 Sep, 2020 22 commits
-
-
Julien Chaumond authored
-
Rohan Rajpal authored
-
Rohan Rajpal authored
-
Igli Manaj authored
Fix range of possible score, add inference.
-
Tom Grek authored
-
zolekode authored
Co-authored-by: zolekode <pascal.zoleko@fau.de>
-
hakan authored
-
Manuel Romero authored
Add language meta attribute
-
Manuel Romero authored
Add language meta attribute
-
Abed khooli authored
* Create README.md model card for akhooli/xlm-r-large-arabic-sent
* Update model_cards/akhooli/xlm-r-large-arabic-sent/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Abed khooli authored
-
Patrick von Platen authored
* finish xlm-roberta
* finish docs
* expose XLMRobertaForCausalLM
-
Patrick von Platen authored
* Create README.md
* Update README.md
-
Jin Young (Daniel) Sohn authored
* Add cache_dir to save features in TextDataset. This covers the case where the dataset is on a read-only filesystem, as it is in the tests (GKE TPU tests).
* style
-
Lysandre Debut authored
-
Lysandre authored
-
Lysandre authored
-
Patrick von Platen authored
* fix generate for GPT2 Double Head
* fix gpt2 double head model
* fix bart / t5
* also add for no beam search
* fix no beam search
* fix encoder decoder
* simplify t5
* simplify t5
* fix t5 tests
* fix BART
* fix transfo-xl
* fix conflict
* integrating sylvains and sams comments
* fix tf past_decoder_key_values
* fix enc dec test
-
Funtowicz Morgan authored
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
Sam Shleifer authored
-
Sylvain Gugger authored
* Add logging doc
* Formatting
* Update docs/source/main_classes/logging.rst
* Update src/transformers/utils/logging.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Stas Bekman authored
We had it added for one job; please add it to all pytest jobs. We need the output showing which tests were run to debug the codecov issue. Thank you!
-
- 31 Aug, 2020 13 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Funtowicz Morgan authored
* Update ONNX notebook to include section on quantization.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Addressing ONNX team comments
-
Sylvain Gugger authored
* Split the run_hp_search by backend
* Unused import
-
krfricke authored
* Introduce HPO checkpointing for PBT
* Moved checkpoint saving
* Fixed checkpoint subdir pass
* Fixed style
* Enable/disable checkpointing, check conditions for various tune schedulers incl. PBT
* Adjust number of GPUs to number of jobs
* Avoid mode pickling in ray
* Move hp search to integrations
-
Sam Shleifer authored
-
Jin Young (Daniel) Sohn authored
* Only access loss tensor every logging_steps
* tensor.item() was being called every step. This must not be done for XLA:TPU tensors because it hurts performance by forcing TPU<>CPU communication at each step. On RoBERTa MLM, for example, the change reduces step time by 30%; the gain should be larger for models/tasks with smaller step times.
* Train batch size was not correct in case a user uses the `per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards
* Fix style (#6803)
* t5 model should make decoder_attention_mask (#6800)
* [s2s] Test hub configs in self-scheduled CI (#6809)
* [s2s] round runtime in run_eval (#6798)
* Pegasus finetune script: add --adafactor (#6811)
* [bart] rename self-attention -> attention (#6708)
* [tests] fix typos in inputs (#6818)
* Fixed open in colab link (#6825)
* Add model card for singbert lite. Update widget for singbert and singbert-large. (#6827)
* BR_BERTo model card (#6793)
* clearly indicate shuffle=False (#6312)
* Clarify shuffle
* clarify shuffle
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* [s2s README] Add more dataset download instructions (#6737)
* Style
* Patch logging issue
* Set default logging level to `WARNING` instead of `INFO`
* TF Flaubert w/ pre-norm (#6841)
* Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644)
* add datacollator and dataset for next sentence prediction task
* bug fix (numbers of special tokens & truncate sequences)
* bug fix (+ dict inputs support for data collator)
* add padding for nsp data collator; renamed cached files to avoid conflict.
* add test for nsp data collator
* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Fix in Adafactor docstrings (#6845)
* Fix resuming training for Windows (#6847)
* Only access loss tensor every logging_steps
* tensor.item() was being called every step. This must not be done for XLA:TPU tensors because it hurts performance by forcing TPU<>CPU communication at each step. On RoBERTa MLM, for example, the change reduces step time by 30%; the gain should be larger for models/tasks with smaller step times.
* Train batch size was not correct in case a user uses the `per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards
* comments
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Thomas Ashish Cherian <6967017+PandaWhoCodes@users.noreply.github.com>
Co-authored-by: Zane Lim <zyuanlim@gmail.com>
Co-authored-by: Rodolfo De Nadai <rdenadai@gmail.com>
Co-authored-by: xujiaze13 <37360975+xujiaze13@users.noreply.github.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Huang Lianzhe <hlz@pku.edu.cn>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Huang Lianzhe authored
* add datacollator and dataset for next sentence prediction task
* bug fix (numbers of special tokens & truncate sequences)
* bug fix (+ dict inputs support for data collator)
* add padding for nsp data collator; renamed cached files to avoid conflict.
* add test for nsp data collator
* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
-
Lysandre Debut authored
-