- 27 Jan, 2021 1 commit
-
Patrick von Platen authored
-
- 26 Jan, 2021 13 commits
-
Yusuke Mori authored
-
Tristan Deleu authored
* Commit the last step on world_process_zero in WandbCallback
* Use the environment variable WANDB_LOG_MODEL as a default value in WandbCallback
-
Derrick Blakely authored
* get cross attns
* add cross-attns doc strings
* fix typo
* line length
* Apply suggestions from code review
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
-
Magdalena Biesialska authored
-
Michael Glass authored
-
Sylvain Gugger authored
* Add a debug print
* Adapt Trainer to use smdistributed if available
* Forgotten parenthesis
* Real check for sagemaker
* Don't forget to define device...
* Woopsie, local_rank is defined differently
* Update since local_rank has the proper value
* Remove debug statement
* More robust check for smdistributed
* Quality
* Deal with key not present error
-
Lysandre authored
-
Andrea Cappelli authored
* Pad to 8x for fp16 multiple choice example (#9752)
* Pad to 8x for fp16 squad trainer example (#9752)
* Pad to 8x for fp16 ner example (#9752)
* Pad to 8x for fp16 swag example (#9752)
* Pad to 8x for fp16 qa beam search example (#9752)
* Pad to 8x for fp16 qa example (#9752)
* Pad to 8x for fp16 seq2seq example (#9752)
* Pad to 8x for fp16 glue example (#9752)
* Pad to 8x for fp16 new ner example (#9752)
* update script template #9752
* Update examples/multiple-choice/run_swag.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa_beam_search.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve code quality #9752
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Nicolas Patry authored
* We most likely don't want special tokens in this output.
* Adding `skip_special_tokens=True` to FillMaskPipeline
  - It's backward incompatible.
  - It makes more sense for pipelines to remove references to special_tokens (all of the other pipelines do that).
  - Keeping special tokens makes it hard for users to actually remove them because all models have different tokens (<s>, <cls>, [CLS], ...)
* Fixing `token_str` in the same vein, and actually fix the tests too!
-
Daniel Stancl authored
* Add head_mask/decoder_head_mask for TF BART models
* Add head_mask and decoder_head_mask input arguments for TF BART-based models as a TF counterpart to the PR #9569
* Add test_headmasking functionality to tests/test_modeling_tf_common.py
* TODO: Add a test to verify that we can get a gradient back for importance score computation
* Remove redundant #TODO note from tests/test_modeling_tf_common.py
* Fix assertions
* Make style
* Fix ...Model input args and adjust one new test
* Add back head_mask and decoder_head_mask to BART-based ...Model after the last commit
* Remove head_mask and decoder_head_mask from input_dict in TF test_train_pipeline_custom_model as these two have different shape than other input args (Necessary for passing this test)
* Revert adding global_rng in test_modeling_tf_common.py
-
Yusuke Mori authored
* Fix broken links in the converting tf ckpt document
* Update docs/source/converting_tensorflow_models.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Reflect the review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Patrick von Platen authored
* fix ci
* fix ci
* renaming
* fix dup line
-
Stas Bekman authored
* normalize, group, sort + add myself for deepspeed
* new structure
* add ray
* typo
* more suggestions
* more suggestions
* white space
* Update .github/ISSUE_TEMPLATE/bug-report.md
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* add bullets
* sync
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* sync
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
- 25 Jan, 2021 7 commits
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Auto-resume training from checkpoint
* Update examples/text-classification/run_glue.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Roll out to other examples
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Lysandre Debut authored
-
Stas Bekman authored
* onnx triu workaround
* style
* working this time
* add test
* more efficient version
-
Sorami Hisamoto authored
`compute_objectie` => `compute_objective`
-
Kai Fricke authored
-
Maria Janina Sarol authored
* Fix TFTrainer prediction output
* Update trainer_tf.py
* Fix TFTrainer prediction output
* Fix evaluation_loss update in TFTrainer
* Fix TFTrainer prediction output
-
- 23 Jan, 2021 2 commits
-
Wilfried L. Bounsi authored
-
Stas Bekman authored
-
- 22 Jan, 2021 5 commits
-
Julien Plu authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Fixes to run_seq2seq and instructions
* Add more defaults for summarization
-
Julien Plu authored
* Fix saved model tests + fix a graph issue in longformer
* Apply style
-
Stefan Schweter authored
-
- 21 Jan, 2021 11 commits
-
Sylvain Gugger authored
* Fix memory regression in Seq2Seq example
* Fix test and properly deal with -100
* Easier condition with device safety
* Patch for MBartTokenizerFast
-
Julien Plu authored
* Fix Seq2Seq models for serving
* Apply style
* Fix longformer
* Fix mBart/Pegasus/Blenderbot
* Apply style
* Add a main intermediate layer
* Apply style
* Remove import
* Apply tf.function to Longformer
* Fix utils check_copy
* Update S2S template
* Fix BART + Blenderbot
* Fix BlenderbotSmall
* Fix BlenderbotSmall
* Fix BlenderbotSmall
* Fix MBart
* Fix Marian
* Fix Pegasus + template
* Apply style
* Fix common attributes test
* Forgot to fix the LED test
* Apply Patrick's comment on LED Decoder
-
Nicolas Patry authored
* Changing model default for TableQuestionAnsweringPipeline.
  - Discussion: https://discuss.huggingface.co/t/table-question-answering-is-not-an-available-task-under-pipeline/3284/6
* Updating slow tests that were out of sync.
-
Julien Plu authored
* Fix Gelu precision
* Fix gelu_fast
* Naming
* Fix usage and apply style
* add TF gelu approximate version
* add TF gelu approximate version
* add TF gelu approximate version
* Apply style
* Fix albert
* Remove the usage of the Activation layer
-
Suraj Patil authored
* fix head mask in model_parallel
* pass correct head mask
-
Patrick von Platen authored
-
Patrick von Platen authored
-
guillaume-be authored
* Moved ProphetNetForCausalLM's parent initialization after config update
* Added unit tests for generation for ProphetNetForCausalLM
-
Lysandre Debut authored
-
Muennighoff authored
* fix typo
Co-authored-by: Suraj Patil <surajp815@gmail.com>
-
Stas Bekman authored
* no --deepspeed and --sharded_ddp together
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 20 Jan, 2021 1 commit
-
Sylvain Gugger authored
-