- 01 Sep, 2020 6 commits
-
-
Lysandre authored
-
Patrick von Platen authored
* fix generate for GPT2 Double Head * fix gpt2 double head model * fix bart / t5 * also add for no beam search * fix no beam search * fix encoder decoder * simplify t5 * simplify t5 * fix t5 tests * fix BART * fix transfo-xl * fix conflict * integrating Sylvain's and Sam's comments * fix tf past_decoder_key_values * fix enc dec test
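A minimal sketch of the code path this fix concerns: calling `generate()` on `GPT2DoubleHeadsModel` with caching enabled. The checkpoint name, prompt, and generation length are illustrative and not taken from the commit.

```python
from transformers import GPT2DoubleHeadsModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2DoubleHeadsModel.from_pretrained("gpt2")

# generate() reuses the cached past key/values between decoding steps
input_ids = tokenizer("The weather today", return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_length=20, use_cache=True)
print(tokenizer.decode(output_ids[0]))
```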
-
Funtowicz Morgan authored
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
Sam Shleifer authored
-
Sylvain Gugger authored
* Add logging doc * Formatting * Update docs/source/main_classes/logging.rst * Update src/transformers/utils/logging.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
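A short sketch of the centralized logging utility this commit documents, based on `src/transformers/utils/logging.py`; the chosen verbosity level is only an example.

```python
from transformers.utils import logging

logging.set_verbosity_info()            # raise verbosity above the library default
logger = logging.get_logger(__name__)   # module-level logger that honors that setting
logger.info("This message is now visible at INFO level")
```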
-
Stas Bekman authored
We had it added for one job; please add it to all pytest jobs - we need the output of which tests were run to debug the codecov issue. Thank you!
-
- 31 Aug, 2020 15 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Funtowicz Morgan authored
* Update ONNX notebook to include section on quantization. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Addressing ONNX team comments
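A hedged sketch of the dynamic quantization step that the new notebook section covers; the file paths are placeholders, and the call follows the `onnxruntime.quantization` API.

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    "onnx/bert-base-cased.onnx",            # previously exported ONNX model (placeholder path)
    "onnx/bert-base-cased-quantized.onnx",  # int8-quantized output (placeholder path)
    weight_type=QuantType.QInt8,            # quantize weights to 8-bit integers
)
```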
-
Sylvain Gugger authored
* Split the run_hp_search by backend * Unused import
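A hedged sketch of the entry point whose backend-specific helpers are split here; `trainer` is assumed to be a `Trainer` configured with a `model_init`, and the trial count is illustrative.

```python
# Assumes `trainer` is an existing transformers.Trainer built with a model_init
best_run = trainer.hyperparameter_search(
    direction="minimize",
    backend="optuna",   # or "ray"; each backend now has its own run_hp_search_* helper
    n_trials=10,
)
print(best_run.hyperparameters)
```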
-
krfricke authored
* Introduce HPO checkpointing for PBT * Moved checkpoint saving * Fixed checkpoint subdir pass * Fixed style * Enable/disable checkpointing, check conditions for various tune schedulers incl. PBT * Adjust number of GPUs to number of jobs * Avoid mode pickling in ray * Move hp search to integrations
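A hedged sketch of what PBT checkpointing enables with the Ray backend; extra keyword arguments are forwarded to `ray.tune`, and the scheduler settings and the `trainer` object are assumptions, not taken from the commit.

```python
from ray.tune.schedulers import PopulationBasedTraining

# Assumes `trainer` is an existing transformers.Trainer built with a model_init
pbt = PopulationBasedTraining(
    time_attr="training_iteration",
    metric="objective",
    mode="min",
    hyperparam_mutations={"learning_rate": [1e-5, 3e-5, 5e-5]},
)

trainer.hyperparameter_search(
    backend="ray",
    n_trials=4,
    scheduler=pbt,           # PBT relies on checkpoints to exploit/explore trials
    keep_checkpoints_num=1,  # forwarded to ray.tune; limits checkpoints kept per trial
)
```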
-
Sam Shleifer authored
-
Jin Young (Daniel) Sohn authored
* Only access loss tensor every logging_steps
* tensor.item() was being called every step. This must not be done for XLA:TPU tensors as it's terrible for performance, causing TPU<>CPU communication at each step. On RoBERTa MLM, for example, it reduces step time by 30%; the gain should be larger for models/tasks with smaller step times (see the sketch after this entry).
* Train batch size was not correct in case a user uses the `per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards
* Fix style (#6803)
* t5 model should make decoder_attention_mask (#6800)
* [s2s] Test hub configs in self-scheduled CI (#6809)
* [s2s] round runtime in run_eval (#6798)
* Pegasus finetune script: add --adafactor (#6811)
* [bart] rename self-attention -> attention (#6708)
* [tests] fix typos in inputs (#6818)
* Fixed open in colab link (#6825)
* Add model card for singbert lite. Update widget for singbert and singbert-large. (#6827)
* BR_BERTo model card (#6793)
* clearly indicate shuffle=False (#6312) * Clarify shuffle * clarify shuffle Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* [s2s README] Add more dataset download instructions (#6737)
* Style
* Patch logging issue
* Set default logging level to `WARNING` instead of `INFO`
* TF Flaubert w/ pre-norm (#6841)
* Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644) * add datacollator and dataset for next sentence prediction task * bug fix (numbers of special tokens & truncate sequences) * bug fix (+ dict inputs support for data collator) * add padding for nsp data collator; renamed cached files to avoid conflict. * add test for nsp data collator * Style Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Fix in Adafactor docstrings (#6845)
* Fix resuming training for Windows (#6847)
* Only access loss tensor every logging_steps
* tensor.item() was being called every step. This must not be done for XLA:TPU tensors as it's terrible for performance, causing TPU<>CPU communication at each step. On RoBERTa MLM, for example, it reduces step time by 30%; the gain should be larger for models/tasks with smaller step times.
* Train batch size was not correct in case a user uses the `per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards
* comments
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Thomas Ashish Cherian <6967017+PandaWhoCodes@users.noreply.github.com>
Co-authored-by: Zane Lim <zyuanlim@gmail.com>
Co-authored-by: Rodolfo De Nadai <rdenadai@gmail.com>
Co-authored-by: xujiaze13 <37360975+xujiaze13@users.noreply.github.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Huang Lianzhe <hlz@pku.edu.cn>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
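A self-contained sketch of the logging pattern described in the entry above: keep the running loss as a tensor and only call `.item()` every `logging_steps`, so XLA/TPU runs do not synchronize with the host at each step. The loop and loss values are stand-ins, not Trainer code.

```python
import torch

logging_steps = 50
tr_loss = torch.zeros(1)       # accumulated loss stays a tensor on the device
logging_loss_scalar = 0.0

for global_step in range(1, 201):
    step_loss = torch.rand(1)  # stand-in for the per-step training loss
    tr_loss += step_loss       # tensor accumulation: no device<->host sync here

    if global_step % logging_steps == 0:
        # .item() forces a TPU<>CPU transfer, so it only happens every logging_steps
        tr_loss_scalar = tr_loss.item()
        print({"step": global_step,
               "loss": (tr_loss_scalar - logging_loss_scalar) / logging_steps})
        logging_loss_scalar = tr_loss_scalar
```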
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Huang Lianzhe authored
* add datacollator and dataset for next sentence prediction task * bug fix (numbers of special tokens & truncate sequences) * bug fix (+ dict inputs support for data collator) * add padding for nsp data collator; renamed cached files to avoid conflict. * add test for nsp data collator * Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
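A rough sketch of the dataset/collator pair introduced in this commit; the class names come from the PR, while the tokenizer checkpoint, file path, and block size are placeholders and further constructor arguments are omitted.

```python
from transformers import (
    BertTokenizer,
    DataCollatorForNextSentencePrediction,
    TextDatasetForNextSentencePrediction,
)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Expects one sentence per line, with blank lines separating documents (placeholder path)
dataset = TextDatasetForNextSentencePrediction(
    tokenizer=tokenizer,
    file_path="corpus.txt",
    block_size=128,
)
data_collator = DataCollatorForNextSentencePrediction(tokenizer=tokenizer)
```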
-
Lysandre Debut authored
-
Lysandre authored
-
Lysandre authored
-
- 30 Aug, 2020 6 commits
-
-
Sam Shleifer authored
-
xujiaze13 authored
* Clarify shuffle * clarify shuffle Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
-
Rodolfo De Nadai authored
-
Zane Lim authored
-
Thomas Ashish Cherian authored
-
Stas Bekman authored
-
- 29 Aug, 2020 3 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
- 28 Aug, 2020 9 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
* broken test * batch parity * tests pass * boom boom * boom boom * split out bart tokenizer tests * fix tests * boom boom * Fixed dataset bug * Fix marian * Undo extra * Get marian working * Fix t5 tok tests * Test passing * Cleanup * better assert msg * require torch * Fix mbart tests * undo extra decoder_attn_mask change * Fix import * pegasus tokenizer can ignore src_lang kwargs * unused kwarg test cov * boom boom * add todo for pegasus issue * cover one word translation edge case * Cleanup * doc
-
RafaelWO authored
* Improved tokenization with sacremoses * The TransfoXLTokenizer is now using sacremoses for tokenization * Added tokenization of comma-separated and floating point numbers. * Removed prepare_for_tokenization() from tokenization_transfo_xl.py because punctuation is handled by sacremoses * Added corresponding tests * Removed test comparing TransfoXLTokenizer and TransfoXLTokenizerFast * Added deprecation warning to TransfoXLTokenizerFast * isort change
Co-authored-by: Teven <teven.lescao@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
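A hedged illustration of the sacremoses Moses tokenizer that `TransfoXLTokenizer` now builds on; the extra handling of comma-separated and floating point numbers lives in the transformers tokenizer itself, on top of this step. The input sentence is arbitrary.

```python
from sacremoses import MosesTokenizer

moses = MosesTokenizer(lang="en")
print(moses.tokenize("Hello World! How are you?"))
# expected: ['Hello', 'World', '!', 'How', 'are', 'you', '?']
```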
-
Ahmed Elnaggar authored
-
Stas Bekman authored
`make style` with `black` < 20.8b1 is a no-go (in case some other package forced a lower version), so make the required version explicit to avoid confusion
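A hedged sketch of making the requirement explicit, e.g. in the `setup.py` extras; the extras key and companion packages are assumptions, only the `black >= 20.8b1` pin comes from this commit.

```python
# In setup.py (sketch): pin the formatter so `make style` cannot silently run an older black
extras = {}
extras["quality"] = ["black >= 20.8b1", "isort", "flake8"]
```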
-
Sam Shleifer authored
-
Stas Bekman authored
-
- 27 Aug, 2020 1 commit
-
-
Lysandre authored
-