- 01 Dec, 2020 9 commits
-
-
Sylvain Gugger authored
* Add a `distributed_env` property to TrainingArguments * Change name * Address comment
-
Sylvain Gugger authored
-
Stas Bekman authored
* restore skip * Revert "Remove deprecated `evalutate_during_training` (#8852)" This reverts commit 55302990. * check that pipeline.git.base_revision is defined before proceeding * Revert "Revert "Remove deprecated `evalutate_during_training` (#8852)"" This reverts commit dfec84db3fdce1079f01f1bc8dfaf21db2ccaba1. * check that pipeline.git.base_revision is defined before proceeding * doc only * doc + code * restore * restore * typo
-
Lysandre Debut authored
-
Adam Pocock authored
Prevent BatchEncoding from blindly passing casts down to the tensors it contains. Fixes #6582. (#8860)
* Update src/transformers/tokenization_utils_base.py with review fix
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
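Below is a minimal, hedged sketch of the behavior this fix is about (the checkpoint name is illustrative, not taken from the PR): `BatchEncoding.to()` is used for device placement, so integer tensors such as `input_ids` are not silently cast.

```python
# Hedged sketch: move an encoding to a device without altering tensor dtypes.
# "bert-base-uncased" is just an example checkpoint.
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer("Hello world", return_tensors="pt")

device = "cuda" if torch.cuda.is_available() else "cpu"
batch = batch.to(device)            # moves every tensor in the BatchEncoding
print(batch["input_ids"].dtype)     # stays torch.int64, no silent float cast
```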
-
Sylvain Gugger authored
-
Ratthachat (Jung) authored
* Fix 2 typos: from_encoder_generator_configs --> from_question_encoder_generator_configs
* apply make style
-
Rodolfo Quispe authored
-
elk-cloner authored
* add CTRLForSequenceClassification * pass local test * merge with master * fix modeling test for sequence classification * fix deco * fix assert
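A hedged usage sketch of the new head (the checkpoint name and label count below are illustrative, not part of the commit):

```python
# Illustrative sketch: sequence classification with the new CTRL head.
# "ctrl" is the Salesforce checkpoint name; num_labels=2 is arbitrary here.
from transformers import CTRLTokenizer, CTRLForSequenceClassification

tokenizer = CTRLTokenizer.from_pretrained("ctrl")
model = CTRLForSequenceClassification.from_pretrained("ctrl", num_labels=2)

inputs = tokenizer("Opinion this library is easy to use", return_tensors="pt")
logits = model(**inputs).logits     # shape: (batch_size, num_labels)
```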
-
- 30 Nov, 2020 15 commits
-
-
Stas Bekman authored
* fix DP case on multi-gpu * make executable * test all 3 modes * use the correct check for distributed * dp doesn't need a special case * restore original name * cleanup
-
Nicolas Patry authored
* NerPipeline (TokenClassification) now outputs offsets of words
  - It happens that the offsets are missing, which forces the user to pattern match the "word" from their input, and that is not always feasible. For instance, if a sentence contains the same word twice, there is no way to know which is which.
  - This PR proposes to fix that by outputting 2 new keys in this pipeline's outputs, "start" and "end", which correspond to the string offsets of the word. That means that we should always have the invariant:

```python
input[entity["start"]: entity["end"]] == entity["entity_group"]  # or entity["entity"] if not grouped
```

* Fixing doc style
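A short, hedged example of consuming the new offsets (the pipeline task and input text are illustrative):

```python
# Illustrative sketch: use the new "start"/"end" character offsets to recover
# each entity's surface form directly from the original input string.
from transformers import pipeline

ner = pipeline("ner")               # TokenClassificationPipeline
text = "Hugging Face is based in New York City."
for entity in ner(text):
    span = text[entity["start"]:entity["end"]]
    print(entity["entity"], span)
```
-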
LysandreJik authored
-
Funtowicz Morgan authored
* Slightly increase tolerance between pytorch and flax output
* test_multiple_sentences doesn't require torch
* Simplify parameterization on "jit" to use boolean rather than str
* Use `require_torch` on `test_multiple_sentences` because we pull the weight from the hub.
* Rename "jit" parameter to "use_jit" for (hopefully) making it self-documenting.
* Remove pytest.mark.parametrize which seems to fail in some circumstances
* Fix unused imports.
* Fix style.
* Give default parameters values for traced model.
* Review comment: Change sentences to sequences
* Apply suggestions from code review
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
LysandreJik authored
-
LysandreJik authored
-
Sylvain Gugger authored
* Remove deprecated `evalutate_during_training`
* Update src/transformers/training_args_tf.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Shai Erera authored
* Use model.from_pretrained for DataParallel also
  When training on multiple GPUs, the code wraps a model with torch.nn.DataParallel. However, if the model has custom from_pretrained logic, it does not get applied during load_best_model_at_end. This commit uses the underlying model during load_best_model_at_end and re-wraps the loaded model with DataParallel. If you choose to reject this change, could you please move this logic to a function, e.g. def load_best_model_checkpoint(best_model_checkpoint), so that it can be overridden?
* Fix silly bug
* Address review comments
  Thanks for the feedback. I made the change that you proposed, but I also think we should update L811 to check whether `self.model` is an instance of `PreTrainedModel`, otherwise we would still not get into that `if` section, right?
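A hedged sketch of the idea described above, not the actual Trainer code (the helper name is hypothetical): unwrap `torch.nn.DataParallel`, reload through the underlying model's `from_pretrained`, then re-wrap.

```python
# Illustrative sketch only: reload the best checkpoint via the wrapped model's
# own from_pretrained logic, then restore the DataParallel wrapper.
import torch

def load_best_model_checkpoint(model, best_model_checkpoint):
    is_dp = isinstance(model, torch.nn.DataParallel)
    underlying = model.module if is_dp else model
    reloaded = underlying.from_pretrained(best_model_checkpoint)
    return torch.nn.DataParallel(reloaded) if is_dp else reloaded
```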
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Fraser Greenlee authored
Related issue: https://github.com/huggingface/transformers/issues/8837
-
Stefan Schweter authored
-
Ahmed Elnaggar authored
* Add T5 Encoder class for feature extraction
* fix T5 encoder add_start_docstrings indent
* update init with T5 encoder
* update init with TFT5ModelEncoder
* remove TFT5ModelEncoder
* change T5ModelEncoder order in init
* add T5ModelEncoder to transformers init
* clean T5ModelEncoder
* update init with TFT5ModelEncoder
* add TFModelEncoder for Tensorflow
* update init with TFT5ModelEncoder
* Update src/transformers/models/t5/modeling_t5.py: change output from Seq2SeqModelOutput to BaseModelOutput
* remove encoder_outputs: remove encoder_outputs from the function call, remove the encoder_outputs if statement, remove isinstance from return_dict
* Authorize missing decoder keys
* remove unnecessary input parameters (past_key_values and use_cache)
* remove use_cache from the forward method
* add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING
* change return_dict to dot access
* add T5_ENCODER_INPUTS_DOCSTRING for TF T5
* change TFT5Encoder output type to BaseModelOutput
* remove unnecessary parameters for TFT5Encoder
* remove unnecessary if statement
* add import BaseModelOutput
* fix BaseModelOutput typo to TFBaseModelOutput
* update T5 doc with T5ModelEncoder
* add T5ModelEncoder to tests
* finish pytorch
* finish docs and mt5
* add mt5 to init
* fix init
* remove n_positions
* finish PR
* Update src/transformers/models/mt5/modeling_mt5.py
* Update src/transformers/models/t5/modeling_t5.py
* Update src/transformers/models/t5/modeling_tf_t5.py
* Update src/transformers/models/mt5/modeling_tf_mt5.py
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
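A minimal, hedged sketch of encoder-only feature extraction with the class this PR adds, assuming it is exposed as `T5EncoderModel` (the commit message also calls it T5ModelEncoder); the checkpoint name is illustrative:

```python
# Hedged sketch: extract encoder features without running the T5 decoder.
# Assumes the class is exposed as T5EncoderModel and returns a BaseModelOutput.
import torch
from transformers import T5Tokenizer, T5EncoderModel

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5EncoderModel.from_pretrained("t5-small")

inputs = tokenizer("Studies have shown that owning a dog is good for you.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)                 # BaseModelOutput, encoder only
features = outputs.last_hidden_state          # (batch, seq_len, d_model)
```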
-
Lysandre Debut authored
* Migration guide from v3.x to v4.x
* Better wording
* Apply suggestions from code review
* Sylvain's comments
* Better wording.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 29 Nov, 2020 3 commits
-
-
Stas Bekman authored
* implement job skipping for doc-only PRs * silent grep is crucial * wip * wip * wip * wip * wip * wip * wip * wip * let's add doc * let's add code * revert test commits * restore * Better name * Better name * Better name * some more testing * some more testing * some more testing * finish testing
-
Guy Rosin authored
* Fix minor typos * Additional typos * Style fix Co-authored-by: guyrosin <guyrosin@assist-561.cs.technion.ac.il>
-
Patrick von Platen authored
* refactor * further refactor * fix the rest tomorrow * save intermediate * finish slow tokenizer * make more tests pass * finish refactor * fix comment * clean further * fix name * fix naming * Update src/transformers/models/reformer/tokenization_reformer.py * Apply suggestions from code review * Apply suggestions from code review * refactor * fix init tokenizers * refactor * improve convert * refactor * correct convert slow tokenizer * final fix for Pegasus Tok * remove ipdb * improve links
-
- 28 Nov, 2020 1 commit
-
-
Patrick von Platen authored
-
- 27 Nov, 2020 12 commits
-
-
Lysandre Debut authored
-
LysandreJik authored
-
Stas Bekman authored
-
Max Del authored
* Fix decoder not returning hidden states from the last layer * Resolve conflict * Change the way to gather hidden states * Add decoder hidden states test * Make pytest and black happy * Remove redundant line * remove new line Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Moussa Kamal Eddine authored
* Add init barthez
* Add barthez model, tokenizer and docs. BARThez is a pre-trained French seq2seq model that uses the BART objective.
* Apply suggestions from code review (docs typos)
* Add license
* Change URLs scheme
* Remove barthez model, keep tokenizer
* Fix style
* Fix quality
* Update tokenizer
* Add fast tokenizer
* Add fast tokenizer test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Julien Plu authored
enforce unix newline encoding regardless of OS creating the file
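A minimal sketch of the technique in Python (the file name is illustrative): passing `newline="\n"` to `open()` forces Unix line endings regardless of the OS default.

```python
# Illustrative sketch: write a file with Unix ("\n") line endings even on
# Windows, by disabling the platform's newline translation.
with open("example.txt", "w", encoding="utf-8", newline="\n") as f:
    f.write("first line\n")
    f.write("second line\n")
```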
-
Manuel Romero authored
* Create README.md * Fix model path
-
Giovanni Compagnoni authored
* update configuration_utils.py typing to allow pathlike objects when sensible * update modeling_utils.py typing to allow pathlike objects when sensible * black * update tokenization_utils_base.py typing to allow pathlike objects when sensible * update tokenization_utils_fast.py typing to allow pathlike objects when sensible * update configuration_auto.py typing to allow pathlike objects when sensible * update configuration_auto.py docstring to allow pathlike objects when sensible * update tokenization_auto.py docstring to allow pathlike objects when sensible * black
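A short, hedged illustration of what the typing change enables (the local directory is hypothetical):

```python
# Illustrative sketch: from_pretrained-style calls accepting os.PathLike
# (e.g. pathlib.Path) as well as plain strings after this change.
from pathlib import Path
from transformers import AutoConfig, AutoTokenizer

local_dir = Path("./saved_model")             # hypothetical local directory
config = AutoConfig.from_pretrained(local_dir)
tokenizer = AutoTokenizer.from_pretrained(local_dir)
```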
-
Patrick von Platen authored
* correct dpr test and bert pos fault * fix dpr bert config problem * fix layoutlm * add config to dpr as well
-
Patrick von Platen authored
* try flax fix * same for roberta
-
mdermentzi authored
The tokenizer called at the input_ids of example 2 is currently encoding text_1. I think this should be changed to text_2.
-
Kristian Holsheimer authored
* [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes * [FlaxRoberta] Fix non-broadcastable attention mask * Use jax.numpy instead of ordinary numpy (otherwise not jit-able) * Partially revert "Use jax.numpy ..." * Add tests for batched forward passes * Avoid unnecessary OOMs due to preallocation of GPU memory by XLA * Auto-fix style * Re-enable GPU memory preallocation but with mem fraction < 1/parallelism
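An illustrative sketch, not the model code itself, of the kind of mask broadcasting described above, written with `jax.numpy` so it stays jit-compatible:

```python
# Illustrative sketch: expand a (batch, seq_len) padding mask so it broadcasts
# against (batch, num_heads, seq_len, seq_len) attention scores, using
# jax.numpy rather than ordinary numpy so the function remains jit-able.
import jax
import jax.numpy as jnp

@jax.jit
def attention_bias(attention_mask):
    mask = attention_mask[:, None, None, :]   # (batch, 1, 1, seq_len)
    return (1.0 - mask) * -1e9                # large negative bias on padding

bias = attention_bias(jnp.ones((2, 5), dtype=jnp.float32))
```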
-