- 26 Apr, 2021 12 commits
-
-
Kostas Stathoulopoulos authored
* Improve documentation for is_split_into_words argument * Change description wording
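For context, a minimal sketch of the argument whose documentation was improved (the checkpoint name is only illustrative):

```python
from transformers import AutoTokenizer

# Illustrative example: the text is already split into words, so the tokenizer
# only applies subword tokenization on top of the given word boundaries instead
# of splitting on whitespace itself.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
encoding = tokenizer(["Hello", "world", "!"], is_split_into_words=True)
print(encoding["input_ids"])
```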
-
Sylvain Gugger authored
* Pass along seed to DistributedSampler * Add seed to DistributedLengthGroupedSampler
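A rough sketch of what the change amounts to (the function name and variables are illustrative, not the Trainer's actual code):

```python
from torch.utils.data.distributed import DistributedSampler

# Illustrative only: forward the training seed to the sampler so every process
# shuffles the data in the same, reproducible order across runs.
def build_train_sampler(train_dataset, world_size: int, rank: int, seed: int):
    return DistributedSampler(
        train_dataset,
        num_replicas=world_size,
        rank=rank,
        seed=seed,  # DistributedSampler otherwise falls back to its default seed of 0
    )
```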
-
LSinev authored
-
Amine Abdaoui authored
-
Sylvain Gugger authored
* Add FP16 support for SageMaker MP * Add print debugs * Squeeze * Remove debug statements * Add defensive check * Typo
-
Daniel Stancl authored
TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699) * Add cross_attn_head_mask to BART * Fix cross_attentions in TFBart-like models * This commit enables returning of `cross_attentions` for TFBart-like models * It also fixes attention head masking in the cross-attention module * Update TF model templates * Fix missing , in TF model templates * Fix typo: congig -> config
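A small usage sketch of the new output field (the checkpoint is just an example; pass `from_pt=True` if it only ships PyTorch weights):

```python
from transformers import BartTokenizer, TFBartForConditionalGeneration

# Illustrative: with output_attentions=True, TFBart-like models now also return
# the decoder's cross-attention weights as `cross_attentions`.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = TFBartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="tf")
outputs = model(**inputs, output_attentions=True)

# One tensor per decoder layer, shape (batch, heads, tgt_len, src_len)
print(len(outputs.cross_attentions), outputs.cross_attentions[0].shape)
```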
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
-
Vasudev Gupta authored
-
abiolaTresor authored
-
- 25 Apr, 2021 2 commits
-
-
cronoik authored
* removes the creation of separate config objects and uses the existing ones instead + overwrites resize_token_embeddings from the parent class because it does not work for the EncoderDecoderModel * roll back to the current version of the huggingface master branch * reworked version that ties the encoder and decoder configs to the parent EncoderDecoder instance * the overwrite of resize_token_embeddings now throws an error * review comment suggestion Co-authored-by:
Suraj Patil <surajp815@gmail.com> * implemented a warning in case an EncoderDecoderModel is created with configs that differ between the EncoderDecoderConfig and the decoder config or encoder config * added a test to avoid diverging configs of the wrapper class and the wrapped classes * Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py * make style Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Daniel Stancl authored
* Add head_mask & decoder_head_mask + some corrections * Fix head masking for N-grams * Enable test_headmasking for encoder and decoder * Fix a typo in modeling_prophetnet.py * Enable test_headmasking for ProphetNetStandaloneDecoderModelTest and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py * make style * Fix cross_head_mask * Fix attention head mask naming * `cross_head_mask` -> `cross_attn_head_mask` * `cross_layer_head_mask` -> `cross_attn_layer_head_mask` * Still need to merge #10605 to master to pass the tests
-
- 24 Apr, 2021 2 commits
-
-
Sylvain Gugger authored
-
cronoik authored
The documentation linked to the parent class PreTrainedTokenizerFast, but it should link to the slow tokenizer (#11410)
-
- 23 Apr, 2021 19 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Philip May authored
* enable subword regularization. * fix tokenizer storage * fix docstring formatting * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by:
Stefan Schweter <stefan@schweter.it> * fix docstring formatting * add test for subword regularization tokenizer * improve comments of test * add sp_model_kwargs * reformat docstring to match the style * add some more documentation * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve docstring * empty commit to trigger CI * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix docstring formatting for sphinx Co-authored-by:
Stefan Schweter <stefan@schweter.it> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
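A short sketch of the new `sp_model_kwargs` argument added here (the values are only illustrative):

```python
from transformers import XLMRobertaTokenizer

# Illustrative: sp_model_kwargs is forwarded to the underlying SentencePiece
# processor, here enabling subword regularization (sampling), so the same string
# can tokenize differently from call to call during training.
tokenizer = XLMRobertaTokenizer.from_pretrained(
    "xlm-roberta-base",
    sp_model_kwargs={"enable_sampling": True, "nbest_size": -1, "alpha": 0.1},
)
print(tokenizer.tokenize("subword regularization"))  # output may vary between calls
```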
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Fix cross-attention head mask for Torch BART models * Fix head masking for the cross-attention module for the following models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart, Pegasus * Enable test_headmasking for M2M_100 model * Fix cross_head_mask for FSMT, LED and T5 * This commit fixes `head_mask` for cross-attention modules in the following models: FSMT, LED, T5 * It also contains some smaller doc changes so that it is perfectly clear that the shape of `cross_head_mask` is the same as that of `decoder_head_mask` * Update template * Fix template for BartForCausalLM * Fix cross_head_mask for Speech2Text models * Fix cross_head_mask in templates * Fix args order in BartForCausalLM template * Fix doc in BART templates * Make more explicit naming * `cross_head_mask` -> `cross_attn_head_mask` * `cross_layer_head_mask` -> `cross_attn_layer_head_mask` * Fix doc * make style quality * Fix speech2text docstring
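A minimal sketch of the renamed argument in action (the checkpoint and mask values are illustrative):

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
inputs = tokenizer("Masking cross-attention heads.", return_tensors="pt")

# 1.0 keeps a head, 0.0 masks it; here the first cross-attention head of every
# decoder layer is masked, independently of the decoder self-attention heads.
mask = torch.ones(model.config.decoder_layers, model.config.decoder_attention_heads)
mask[:, 0] = 0.0

outputs = model(**inputs, cross_attn_head_mask=mask, output_attentions=True)
```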
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Nicola De Cao authored
When passing `inputs_embeds` and leaving `input_ids=None`, the generation function fails because `input_ids` is created by the function when it should not be.
-
Kiran R authored
-
Patrick von Platen authored
-
Sylvain Gugger authored
* Initial support for upload to hub * push -> upload * Fixes + examples * Fix torchhub test * Torchhub test I hate you * push_model_to_hub -> push_to_hub * Apply mixin to other pretrained models * Remove ABC inheritance * Add tests * Typo * Run tests * Install git-lfs * Change approach * Add push_to_hub to all * Staging test suite * Typo * Maybe like this? * More deps * Cache * Adapt name * Quality * MOAR tests * Put it in testing_utils * Docs + torchhub last hope * Styling * Wrong method * Typos * Update src/transformers/file_utils.py Co-authored-by:
Julien Chaumond <julien@huggingface.co> * Address review comments * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Julien Chaumond <julien@huggingface.co> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
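A brief sketch of the new upload flow (the repo name is hypothetical; requires being logged in to the hub):

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-cased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# The new mixin adds push_to_hub() to models, tokenizers and configs; it creates
# (or updates) a repo on the hub and uploads the serialized files via git-lfs.
model.push_to_hub("my-finetuned-bert")
tokenizer.push_to_hub("my-finetuned-bert")
```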
-
Teven authored
* Fixed trainer total_flos reloading in distributed mode * logging flos at the end of training
-
Patrick von Platen authored
-
Yoshitomo Matsubara authored
-
Max Del authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* improve flax * refactor * typos * Update src/transformers/modeling_flax_utils.py * Apply suggestions from code review * Update src/transformers/modeling_flax_utils.py * fix typo * improve error tolerance * typo * correct nasty saving bug * fix from pretrained * correct tree map * add note * correct weight tying
-
- 22 Apr, 2021 5 commits
-
-
Sylvain Gugger authored
* Fix Trainer with remove_unused_columns=False * Typo
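For reference, a minimal sketch of the option the fix touches (output_dir is illustrative):

```python
from transformers import TrainingArguments

# Keep every dataset column instead of dropping the ones the model's forward()
# does not accept; the fix makes Trainer work correctly when this is disabled.
args = TrainingArguments(output_dir="out", remove_unused_columns=False)
```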
-
PenutChen authored
-
Matt authored
-
Takuya Makino authored
-
johnson7788 authored
fix typo Co-authored-by: johnson <johnson@github.com>
-