- 27 Apr, 2021 3 commits
-
-
Suraj Patil authored
* fix docs for decoder_input_ids * revert the changes for bart and mbart
-
Hamel Husain authored
* finish quicktour
* fix import
* fix print
* explain config default better
* Update docs/source/quicktour.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Hamel Husain authored
* update docs to reflect model output object
* run `make style`
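A minimal sketch (not taken from the commit itself) of the documented behavior: forward passes return output objects whose fields can be read by name instead of by tuple index; the checkpoint name is only an example.

```python
# Minimal sketch: model outputs are objects with named fields, not plain tuples.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)

print(outputs.logits)     # attribute access on the output object
print(outputs["logits"])  # dict-style access works as well
```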
-
- 26 Apr, 2021 20 commits
-
-
Ashwin Geet D'Sa authored
* removed max_len
* removed max_length from BeamSearchScorer
* correct max length
* finish
* del vim
* finish & add test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
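A minimal sketch of the resulting usage, under the assumption that the maximum length is now controlled from `generate()` rather than configured on `BeamSearchScorer`; the checkpoint name is only an example.

```python
# Minimal sketch: pass max_length to generate() together with num_beams instead of
# configuring it on BeamSearchScorer.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/bart-large-cnn"  # example summarization checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=20)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```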
-
Stas Bekman authored
* adding Z-inf
* revamp config process
* up version requirement
* wip
* massive rewrite
* cleanup
* cleanup
* Apply suggestions from code review
* consistent json commas
* act on suggestions
* leave this feature for 0.3.16
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
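A heavily hedged sketch of what a ZeRO-Infinity style setup can look like on the user side: the keys follow the DeepSpeed documentation for ZeRO stage 3 with NVMe offload, the paths are placeholders, and only the ZeRO section is shown; the actual integration docs should be treated as authoritative.

```python
# Sketch only: a partial DeepSpeed config enabling ZeRO stage 3 with NVMe offload
# (ZeRO-Infinity), written to disk and handed to the Trainer via
# TrainingArguments(deepspeed=...). Paths are placeholders.
import json

ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "nvme", "nvme_path": "/local_nvme"},
        "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
    },
}

with open("ds_config_zero_infinity.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# later: TrainingArguments(output_dir="out", deepspeed="ds_config_zero_infinity.json", ...)
```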
-
Jaimeen Ahn authored
The error came from an inconsistency between the variable holding the number of GPUs in the argument parser ('gpus') and the name actually used in the train.py script ('n_gpu'); making the two names consistent lets the example run.
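An illustrative sketch of the naming mismatch described above; the flag and attribute names are taken from the description, not from the actual example script.

```python
# Illustrative sketch of the bug pattern: the parser registers --gpus, so the script
# must read args.gpus; reading a differently named attribute such as args.n_gpu fails.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--gpus", type=int, default=1, help="number of GPUs to use")
args = parser.parse_args(["--gpus", "2"])

n_gpu = args.gpus  # use the attribute the parser actually defines
print(n_gpu)
```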
-
Bhadresh Savani authored
* added changes for uniformity
* modified files
* corrected typo
* fixed qa scripts
* fix typos
* fixed predict typo in qa no trainer
* fixed test file
* reverted trainer changes
* reverted trainer changes in custom examples
* updated readme
* added changes in deepspeed test
* added changes for predict and eval
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Stas Bekman authored
-
Stas Bekman authored
* fix invalid class name * proper ref * proper ref
-
Kostas Stathoulopoulos authored
* Improve documentation for is_split_into_words argument * Change description wording
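A minimal sketch of the argument being documented: `is_split_into_words=True` tells the tokenizer that the input is already split into words (pre-tokenized), not that it is already tokenized into subwords.

```python
# Minimal sketch: the input is a list of words; the tokenizer still applies its own
# subword tokenization to each word.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # example checkpoint
encoding = tokenizer(["Hello", "world", "!"], is_split_into_words=True)
print(tokenizer.convert_ids_to_tokens(encoding["input_ids"]))
```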
-
Sylvain Gugger authored
* Pass along seed to DistributedSampler * Add seed to DistributedLengthGroupedSampler
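A minimal sketch of the underlying PyTorch API this change feeds into: `DistributedSampler` accepts a `seed` so that the shuffle order is reproducible and identical across processes.

```python
# Minimal sketch with plain PyTorch: passing num_replicas and rank explicitly avoids
# needing an initialized process group for the illustration.
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(100))
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True, seed=42)
sampler.set_epoch(0)  # together with the seed, this fixes the shuffle order
print(list(iter(sampler))[:5])
```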
-
LSinev authored
-
Amine Abdaoui authored
-
Sylvain Gugger authored
* Add FP16 support for SageMaker MP * Add print debugs * Squeeze * Remove debug statements * Add defensive check * Typo
-
Daniel Stancl authored
TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699)
* Add cross_attn_head_mask to BART
* Fix cross_attentions in TFBart-like models
* This commit enables returning `cross_attentions` for TFBart-like models
* It also fixes attention head masking in the cross-attention module
* Update TF model templates
* Fix missing , in TF model templates
* Fix typo: congig -> config
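A minimal sketch (assuming TensorFlow is installed) of the newly returned field: with `output_attentions=True`, TFBart-like models now also expose the decoder's cross-attention weights. The checkpoint is only an example; depending on which weight formats it ships, `from_pt=True` may be needed when loading.

```python
# Minimal sketch: outputs.cross_attentions holds one attention tensor per decoder layer.
from transformers import BartTokenizer, TFBartModel

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")  # example checkpoint
model = TFBartModel.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="tf")
outputs = model(inputs, output_attentions=True)
print(len(outputs.cross_attentions))
```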
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
-
Vasudev Gupta authored
-
abiolaTresor authored
-
- 25 Apr, 2021 2 commits
-
-
cronoik authored
* removes the creation of separate config objects and uses the existing ones instead; overwrites resize_token_embeddings from the parent class because it is not working for the EncoderDecoderModel
* rollback to current version of the huggingface master branch
* reworked version that ties the encoder and decoder configs to the parent EncoderDecoder instance
* overwrite of resize_token_embeddings throws an error now
* review comment suggestion
* implemented a warning in case the EncoderDecoder is created with configs that differ between the EncoderDecoderConfig and the decoder or encoder config
* added test to avoid diverging configs of the wrapper class and the wrapped classes
* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py
* make style
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
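A minimal sketch of the behavior described above, under the assumption that the wrapper now reuses the sub-models' configs and that token embeddings are resized on the wrapped encoder/decoder directly (calling `resize_token_embeddings` on the wrapper itself raises an error after this change).

```python
# Minimal sketch: resize the embeddings on the wrapped models, not on the wrapper.
from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"  # example checkpoints
)

new_vocab_size = model.config.encoder.vocab_size + 2  # e.g. after adding two tokens
model.encoder.resize_token_embeddings(new_vocab_size)
model.decoder.resize_token_embeddings(new_vocab_size)
```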
-
Daniel Stancl authored
* Add head_mask & decoder_head_mask + some corrections
* Fix head masking for N-grams
* Enable test_headmasking for the encoder and decoder
* Fix one typo in modeling_prophetnet.py
* Enable test_headmasking for ProphetNetStandaloneDecoderModelTest and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py
* make style
* Fix cross_head_mask
* Fix attention head mask naming
* `cross_head_mask` -> `cross_attn_head_mask`
* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
* Still need to merge #10605 to master to pass the tests
-
- 24 Apr, 2021 2 commits
-
-
Sylvain Gugger authored
-
cronoik authored
The documentation linked to the parent class PreTrainedTokenizerFast, but it should link to the slow tokenizer (#11410)
-
- 23 Apr, 2021 13 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Philip May authored
* enable subword regularization.
* fix tokenizer storage
* fix docstring formatting
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py
* fix docstring formatting
* add test for subword regularization tokenizer
* improve comments of test
* add sp_model_kwargs
* reformat docstring to match the style
* add some more documentation
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py
* improve docstring
* empty commit to trigger CI
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py
* fix docstring formatting for sphinx
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
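A minimal sketch of the new `sp_model_kwargs` argument: the kwargs are forwarded to the underlying SentencePiece processor, so subword regularization (sampling) can be enabled for the slow tokenizer. The sampling options shown are standard SentencePiece parameters.

```python
# Minimal sketch: enable SentencePiece sampling via sp_model_kwargs; with sampling on,
# repeated tokenizations of the same text may differ.
from transformers import XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained(
    "xlm-roberta-base",
    sp_model_kwargs={"enable_sampling": True, "nbest_size": -1, "alpha": 0.1},
)

print(tokenizer.tokenize("New York is a city."))
print(tokenizer.tokenize("New York is a city."))
```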
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Fix cross-attention head mask for Torch BART models
* Fix head masking for the cross-attention module for the following models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart, Pegasus
* Enable test_headmasking for M2M_100 model
* Fix cross_head_mask for FSMT, LED and T5
* This commit fixes `head_mask` for cross-attention modules in the following models: FSMT, LED, T5
* It also contains some smaller changes in the docs so that it is perfectly clear that the shape of `cross_head_mask` is the same as that of `decoder_head_mask`
* Update template
* Fix template for BartForCausalLM
* Fix cross_head_mask for Speech2Text models
* Fix cross_head_mask in templates
* Fix args order in BartForCausalLM template
* Fix doc in BART templates
* Make more explicit naming
* `cross_head_mask` -> `cross_attn_head_mask`
* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
* Fix doc
* make style quality
* Fix speech2text docstring
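A minimal sketch of the argument after the renaming: `cross_attn_head_mask` has the same shape as `decoder_head_mask`, i.e. (decoder_layers, decoder_attention_heads), with 1 keeping a head and 0 masking it; the checkpoint is only an example.

```python
# Minimal sketch: mask the first cross-attention head of the first decoder layer.
import torch
from transformers import BartModel, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartModel.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="pt")
cfg = model.config
cross_attn_head_mask = torch.ones(cfg.decoder_layers, cfg.decoder_attention_heads)
cross_attn_head_mask[0, 0] = 0.0

outputs = model(**inputs, cross_attn_head_mask=cross_attn_head_mask)
```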
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Nicola De Cao authored
When passing `inputs_embeds` and leaving `input_ids=None`, the generation function fails because `input_ids` is created inside the function when it should not be.
-
Kiran R authored
-
Patrick von Platen authored
-
Sylvain Gugger authored
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
* Address review comments
* Apply suggestions from code review
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
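A minimal sketch of the new API (requires being logged in to the Hugging Face Hub, e.g. via `huggingface-cli login`); the repository name is a placeholder.

```python
# Minimal sketch: models and tokenizers can be uploaded directly with push_to_hub.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

model.push_to_hub("my-finetuned-model")      # placeholder repository name
tokenizer.push_to_hub("my-finetuned-model")
```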
-
Teven authored
* Fixed trainer total_flos reloading in distributed mode * logging flos at the end of training
-