1. 04 Feb, 2021 1 commit
    • BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128) · 00031785
      demSd authored
      
      
      * initialize bart4causalLM
      
      * create BartDecoderWrapper, setters/getters
      
      * delete spaces
      
      * forward and additional methods
      
      * update cache function, loss function, remove ngram* params in data class.
      
      * add bartcausallm, bartdecoder testing
      
      * correct bart for causal lm
      
      * remove at
      
      * add mbart as well
      
      * up
      
      * fix typo
      
      * up
      
      * correct
      
      * add pegasusforcausallm
      
      * add blenderbotforcausallm
      
      * add blenderbotsmallforcausallm
      
      * add marianforcausallm
      
      * add test for MarianForCausalLM
      
      * add Pegasus test
      
      * add BlenderbotSmall test
      
      * add blenderbot test
      
      * fix a fail
      
      * fix an import fail
      
      * a fix
      
      * fix
      
      * Update modeling_pegasus.py
      
      * fix models
      
      * fix inputs_embeds setting getter
      
      * adapt tests
      
      * correct repo utils check
      
      * finish test improvement
      
      * fix tf models as well
      
      * make style
      
      * make fix-copies
      
      * fix copies
      
      * run all tests
      
      * last changes
      
      * fix all tests
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  2. 03 Feb, 2021 6 commits
  3. 02 Feb, 2021 4 commits
    • Add head_mask and decoder_head_mask to PyTorch LED (#9856) · 71bdc076
      Daniel Stancl authored
      * Add {decoder_,}head_mask to LED
      
      * Fix create_custom_forward signature in encoder
      
      * Add head_mask to longformer
      
      * Add head_mask to longformer to fix dependencies
      of LED on Longformer.
      
      * Not working yet
      
      * Add one missing input in longformer_modeling.py
      
      * make fix-copies
    • Wav2Vec2 (#9659) · d6217fb3
      Patrick von Platen authored
      
      
      * add raw scaffold
      
      * implement feat extract layers
      
      * make style
      
      * remove +
      
      * correctly convert weights
      
      * make feat extractor work
      
      * make feature extraction proj work
      
      * run forward pass
      
      * finish forward pass
      
      * Successful decoding example
      
      * remove unused files
      
      * more changes
      
      * add wav2vec tokenizer
      
      * add new structure
      
      * fix run forward
      
      * add other layer norm architecture
      
      * finish 2nd structure
      
      * add model tests
      
      * finish tests for tok and model
      
      * clean-up
      
      * make style
      
      * finish docstring for model and config
      
      * make style
      
      * correct docstring
      
      * correct tests
      
      * change checkpoints to fairseq
      
      * fix examples
      
      * finish wav2vec2
      
      * make style
      
      * apply sylvains suggestions
      
      * apply lysandres suggestions
      
      * change print to log.info
      
      * re-add assert statement
      
      * add input_values as required input name
      
      * finish wav2vec2 tokenizer
      
      * Update tests/test_tokenization_wav2vec2.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * apply sylvains suggestions
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
    • ALBERT Tokenizer integration test (#9943) · 1809de51
      Lysandre Debut authored
      * ALBERT Tokenizer integration test
      
      * Batching
      
      * Style
    • [Tokenizer Utils Base] Make pad function more flexible (#9928) · 538b3b46
      Patrick von Platen authored
      * change tokenizer requirement
      
      * split line
      
      * Correct typo from list to str
      
      * improve style
      
      * make other function pretty as well
      
      * add comment
      
      * correct typo
      
      * add new test
      
      * pass tests for tok without padding token
      
      * Apply suggestions from code review
  4. 01 Feb, 2021 1 commit
    • Add head_mask and decoder_head_mask to FSMT (#9819) · 0c6c0afc
      Daniel Stancl authored
      * Add {decoder_,}head_mask to fsmt_modeling.py
      
      * Enable test_headmasking and some changes to docs
      
      * Remove test_head_masking flag from fsmt test file
      
      Remove test_head_masking flag from test_modeling_fsmt.py
      since test_head_masking is set to be True by default (thus it is redundant to store).
      
      * Merge master and remove test_head_masking = True
      
      * Rebase necessary due to an update of jaxlib
      
      * Remove test_head_masking=True in tests/test_modeling_fsmt.py
      as it is redundant.
  5. 29 Jan, 2021 2 commits
    • Add XLA test (#9848) · fdcde144
      Julien Plu authored
    • Adding a new `return_full_text` parameter to TextGenerationPipeline. (#9852) · c2d0ffec
      Nicolas Patry authored
      * Adding a new `return_full_text` parameter to TextGenerationPipeline.
      
      In text generation, the input is sometimes used only as a prompt.
      In that context, prefixing `generated_text` with the actual input
      forces the caller to take an extra step to remove it.
      
      The proposed change adds a new parameter, `return_full_text` (kept
      optional for backward compatibility), that lets the caller prevent
      the prefix from being added.
      
      * Doc quality.
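The behaviour this commit describes can be sketched without loading a model. The helper below is hypothetical (`format_generation` is not part of `transformers`), and the default of `True` is an assumption drawn from the commit's backward-compatibility remark. It only illustrates how a `return_full_text` flag could decide whether the prompt is prefixed to `generated_text`.

```python
def format_generation(prompt: str, continuation: str, return_full_text: bool = True) -> dict:
    """Hypothetical helper: build a pipeline-style result dict.

    With return_full_text=True (assumed default, matching the old behaviour)
    the prompt is kept as a prefix; with False only the new text is returned.
    """
    text = prompt + continuation if return_full_text else continuation
    return {"generated_text": text}


prompt = "Once upon a time"
continuation = ", there was a model."

# Old behaviour: the caller gets the prompt back and has to strip it.
full = format_generation(prompt, continuation)

# New option: only the continuation is returned.
new_only = format_generation(prompt, continuation, return_full_text=False)
```

Keeping the flag optional means existing callers that rely on the prefixed output would not need to change.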
  6. 28 Jan, 2021 3 commits
  7. 27 Jan, 2021 7 commits
  8. 26 Jan, 2021 3 commits
    • Adding `skip_special_tokens=True` to FillMaskPipeline (#9783) · 781e4b13
      Nicolas Patry authored
      * We most likely don't want special tokens in this output.
      
      * Adding `skip_special_tokens=True` to FillMaskPipeline
      
      - It's backward incompatible.
      - It makes more sense for pipelines to remove references to
        special tokens (all of the other pipelines do that).
      - Keeping special tokens makes it hard for users to actually remove them
        because models use different tokens (<s>, <cls>, [CLS], ...).
      
      * Fixing `token_str` in the same vein, and actually fixing the tests too!
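To illustrate why dropping special tokens at decode time is convenient, here is a minimal sketch with a toy vocabulary. The ids, tokens, and the `decode` function are made up for illustration and are not the pipeline's code. Real tokenizers expose the same idea through the `skip_special_tokens` argument of their `decode` method.

```python
# Toy vocabulary; ids and tokens are invented for this example.
ID_TO_TOKEN = {0: "<s>", 1: "</s>", 2: "Paris", 3: "is", 4: "lovely"}
SPECIAL_IDS = {0, 1}


def decode(ids, skip_special_tokens=False):
    """Join tokens into a string, optionally dropping special tokens."""
    tokens = [
        ID_TO_TOKEN[i]
        for i in ids
        if not (skip_special_tokens and i in SPECIAL_IDS)
    ]
    return " ".join(tokens)


ids = [0, 2, 3, 4, 1]
with_special = decode(ids)                               # markers kept
without_special = decode(ids, skip_special_tokens=True)  # markers dropped
```

Because each model family uses different markers, filtering by id at decode time avoids model-specific string cleanup in user code.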
    • Add head_mask/decoder_head_mask for TF BART models (#9639) · 1867d9a8
      Daniel Stancl authored
      * Add head_mask/decoder_head_mask for TF BART models
      
      * Add head_mask and decoder_head_mask input arguments for TF BART-based
      models as a TF counterpart to the PR #9569
      
      * Add test_headmasking functionality to tests/test_modeling_tf_common.py
      
      * TODO: Add a test to verify that we can get a gradient back for
      importance score computation
      
      * Remove redundant #TODO note
      
      Remove redundant #TODO note from tests/test_modeling_tf_common.py
      
      * Fix assertions
      
      * Make style
      
      * Fix ...Model input args and adjust one new test
      
      * Add back head_mask and decoder_head_mask to BART-based ...Model
      after the last commit
      
      * Remove head_mask and decoder_head_mask from input_dict
      in TF test_train_pipeline_custom_model as these two have different
      shape than other input args (Necessary for passing this test)
      
      * Revert adding global_rng in test_modeling_tf_common.py
    • [Flaky Generation Tests] Make sure that no early stopping is happening for beam search (#9794) · d94cc2f9
      Patrick von Platen authored
      * fix ci
      
      * fix ci
      
      * renaming
      
      * fix dup line
  9. 25 Jan, 2021 1 commit
  10. 22 Jan, 2021 2 commits
  11. 21 Jan, 2021 5 commits
  12. 20 Jan, 2021 2 commits
  13. 19 Jan, 2021 3 commits
    • Add separated decoder_head_mask for T5 Models (#9634) · 2ebbbf55
      Daniel Stancl authored
      * Add decoder_head_mask for PyTorch T5 model
      
      * Add decoder_head_mask args into T5Model and T5ForConditionalGeneration
      
      * Slightly change the order of input args to be in accordance
      with the convention from BART-based models introduced within the PR #9569.
      
      * Make style for modeling_t5.py
      
      * Add decoder_head_mask for TF T5 models
      
      * Separate head_mask and decoder_head_mask args in TF T5 models
      
      * Slightly change the order of input args to follow convention
      of BART-based models updated in PR #9569
      
      * Update test_forward_signature tests/test_modeling_tf_common.py
      w.r.t. the changed order of input args
      
      * Add FutureWarnings for T5 and TFT5 models
      
      * Add FutureWarnings for T5 and TFT5 models warning a user that
      input argument `head_mask` was split into two arguments -
      `head_mask` and `decoder_head_mask`
      
      * Add default behaviour - `decoder_head_mask` is set to copy
      `head_mask`
      
      * Fix T5 modeling and FutureWarning
      
      * Make proper usage of head_mask and decoder_head_mask
      in cross_attention
      
      * Fix conditions for raising FutureWarning
      
      * Reformat FutureWarning in T5 modeling
      
      * Refactor the warning message
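The fallback described in this commit (a `FutureWarning` plus `decoder_head_mask` defaulting to a copy of `head_mask`) might look roughly like the sketch below. `resolve_head_masks` is a hypothetical helper, not the function in `modeling_t5.py`. The exact conditions under which the library raises the warning are not shown in this log and may differ.

```python
import warnings


def resolve_head_masks(head_mask=None, decoder_head_mask=None):
    """Hypothetical helper mirroring the fallback described in the commit.

    If only head_mask is given, decoder_head_mask copies it and a
    FutureWarning tells the caller that the argument was split in two.
    """
    if head_mask is not None and decoder_head_mask is None:
        warnings.warn(
            "`head_mask` was split into `head_mask` and `decoder_head_mask`; "
            "pass `decoder_head_mask` explicitly.",
            FutureWarning,
        )
        decoder_head_mask = head_mask
    return head_mask, decoder_head_mask


# Record warnings so the example can inspect them instead of printing.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    enc_mask, dec_mask = resolve_head_masks(head_mask=[1.0, 0.0])

with warnings.catch_warnings(record=True) as caught_explicit:
    warnings.simplefilter("always")
    resolve_head_masks(head_mask=[1.0, 0.0], decoder_head_mask=[0.0, 1.0])
```

Callers that pass both masks explicitly see no warning, so the deprecation path only affects code written against the old single-argument signature.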
    • New run_seq2seq script (#9605) · e4c06ed6
      Sylvain Gugger authored
      
      
      * New run_seq2seq script
      
      * Add tests
      
      * Mark as slow
      
      * Update examples/seq2seq/run_seq2seq.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/data/data_collator.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update src/transformers/data/data_collator.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Address review comments
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
    • Update `past_key_values` in GPT-2 (#9596) · b020a736
      Yusuke Mori authored
      
      
      * Update past_key_values in gpt2 (#9391)
      
      * Update generation_utils, and rename some items
      
      * Update modeling_gpt2 to avoid an error in gradient_checkpointing
      
      * Remove 'reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2
      
      * Change the location of '_reorder_cache' in modeling files
      
      * Add '_reorder_cache' in modeling_ctrl
      
      * Fix a bug of my last commit in CTRL
      
      * Add '_reorder_cache' to GPT2DoubleHeadsModel
      
      * Manage 'use_cache' in config of test_modeling_gpt2
      
      * Clean up the doc string
      
      * Update src/transformers/models/gpt2/modeling_gpt2.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Fix the doc string (GPT-2, CTRL)
      
      * improve gradient_checkpointing_behavior
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
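The commit moves `_reorder_cache` into the individual model files but the log does not show its body. As a rough sketch of the idea, assuming the cache is a tuple of per-layer states indexed by beam, reordering amounts to selecting each state along the beam dimension. Plain lists stand in for tensors here, and `reorder_cache` is an illustrative name rather than the library function.

```python
def reorder_cache(past, beam_idx):
    """Reorder cached key/value states so they follow the selected beams.

    past: tuple of layers; each layer is a tuple of states (e.g. key, value);
          each state is a list indexed by beam (lists stand in for tensors).
    beam_idx: for each new beam position, the index of the old beam it continues.
    """
    return tuple(
        tuple([state[i] for i in beam_idx] for state in layer)
        for layer in past
    )


# Two layers, three beams; strings make it easy to see what moved where.
past = (
    (["k0_beam0", "k0_beam1", "k0_beam2"], ["v0_beam0", "v0_beam1", "v0_beam2"]),
    (["k1_beam0", "k1_beam1", "k1_beam2"], ["v1_beam0", "v1_beam1", "v1_beam2"]),
)

# Beam search kept two continuations of old beam 2 and one of old beam 0.
reordered = reorder_cache(past, beam_idx=[2, 2, 0])
```

Keeping this logic next to each model, as the commit does, lets architectures with a different cache layout (the log mentions XLNet, TransfoXL, and CTRL) define their own variant.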