1. 31 Aug, 2021 1 commit
  2. 06 Aug, 2021 1 commit
    • [WIP] Disentangle auto modules from other modeling files (#13023) · 9870093f
      Sylvain Gugger authored:
      * Initial work
      
      * All auto models
      
      * All tf auto models
      
      * All flax auto models
      
      * Tokenizers
      
      * Add feature extractors
      
      * Fix typos
      
      * Fix other typo
      
      * Use the right config
      
      * Remove old mapping names and update logic in AutoTokenizer
      
      * Update check_table
      
      * Fix copies and check_repo script
      
      * Fix last test
      
      * Add back name
      
      * clean up
      
      * Update template
      
      * Update template
      
      * Forgot a )
      
      * Use alternative to fixup
      
      * Fix TF model template
      
      * Address review comments
      
      * Address review comments
      
      * Style
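      For context, the auto classes reworked in this PR dispatch from a checkpoint name to a concrete config, tokenizer, or model class at load time. A minimal usage sketch (the checkpoint name is only illustrative):

      ```python
      # A minimal usage sketch; "bert-base-uncased" is only an illustrative checkpoint.
      from transformers import AutoConfig, AutoModel, AutoTokenizer

      config = AutoConfig.from_pretrained("bert-base-uncased")        # resolves to BertConfig
      tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # resolves to BertTokenizer(Fast)
      model = AutoModel.from_pretrained("bert-base-uncased")          # resolves to BertModel

      inputs = tokenizer("Hello world", return_tensors="pt")
      outputs = model(**inputs)
      print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
      ```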
  3. 03 Aug, 2021 1 commit
  4. 21 Jul, 2021 1 commit
  5. 08 Jul, 2021 1 commit
  6. 07 Jul, 2021 1 commit
  7. 23 Jun, 2021 1 commit
  8. 22 Jun, 2021 1 commit
    • Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252) · af6e01c5
      Hamid Shojanazeri authored:
      * registering a buffer for token_type_ids, to fix the error of the device id getting hardcoded when tracing
      
      * style format
      
      * adding a persistent flag to the registered buffers that prevents them from being added to the state_dict and addresses the backward-compatibility issue
      
      * adding a try/except to the fix, as the persistent flag is only available from PT >1.6
      
      * adding version check
      
      * added the condition to only use the token_type_ids buffer when it is autogenerated, not passed by the user
      
      * adding comments and making the condition where token_type_ids is None use the registered buffer
      
      * taking position embedding out of the if block
      
      * adding comments
      
      * handling the case if buffer for position_ids was not registered
      
      * reverted the changes on position_ids, fixed the issue with the size of the token_type_ids buffer, moved the modification for generated token_type_ids to BertModel instead of Embeddings
      
      * reverting the token_type_ids in case of None to the previous version
      
      * reverting changes on position_ids adding back the if block
      
      * changes added by running make fix-copies
      
      * changes added by running make fix-copies and added the import version as it was getting used
      
      * changes added by running make fix-copies
      
      * changes added by running make fix-copies
      
      * fixing the import format
      
      * fixing the import format
      
      * modified to use a temp tensor for the trimmed and expanded token_type_ids buffer
      
      * changes made by fix-copies after temp tensor modifications
      
      * changes made by fix-copies after temp tensor modifications
      
      * changes made by fix-copies after temp tensor modifications
      
      * clean up
      
      * clean up
      
      * clean up
      
      * clean up
      
      * Nit
      
      * Nit
      
      * Nit
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * changes based on latest in master
      
      * Adapt templates
      
      * Add version import
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
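      A condensed sketch of the pattern these commits describe, not the exact upstream diff: register token_type_ids as a buffer so traced models resolve its device correctly, keep it out of the state_dict where PyTorch supports it, and use it only when the caller passes no token_type_ids. All sizes below are illustrative defaults.

      ```python
      import torch
      from packaging import version


      class Embeddings(torch.nn.Module):
          def __init__(self, hidden_size=768, max_position_embeddings=512, type_vocab_size=2):
              super().__init__()
              self.token_type_embeddings = torch.nn.Embedding(type_vocab_size, hidden_size)
              buffer = torch.zeros((1, max_position_embeddings), dtype=torch.long)
              if version.parse(torch.__version__) > version.parse("1.6.0"):
                  # persistent=False keeps the buffer out of the state_dict,
                  # preserving backward compatibility with older checkpoints
                  self.register_buffer("token_type_ids", buffer, persistent=False)
              else:
                  self.register_buffer("token_type_ids", buffer)

          def forward(self, input_ids, token_type_ids=None):
              batch_size, seq_length = input_ids.shape
              if token_type_ids is None:
                  # Trim the buffer to the actual sequence length and expand to
                  # the batch; the buffer lives on the module's device, so traced
                  # models no longer hardcode a device id here.
                  token_type_ids = self.token_type_ids[:, :seq_length].expand(batch_size, seq_length)
              return self.token_type_embeddings(token_type_ids)
      ```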
  9. 18 Jun, 2021 1 commit
  10. 14 Jun, 2021 1 commit
  11. 07 Jun, 2021 2 commits
    • Fixes bug that appears when using QA BERT and distillation (#12026) · f8bd8c6c
      François Lagunas authored:
      * Fixing a bug that appears when using distillation (and potentially other uses).
      During the backward pass PyTorch complains with:
      RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
      This happens because the QA model code modifies the start_positions and end_positions input tensors using the clamp_ function; as a consequence the teacher and the student both modify the inputs, and the backward pass fails.
      
      * Fixing the clamp_ bug in all QA models.
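      A minimal illustration of the failure mode, assuming the fix simply swaps the in-place clamp_ for an out-of-place clamp:

      ```python
      import torch

      # clamp_ mutates the caller's tensor in place, so when a teacher and a
      # student share the same start/end position tensors, autograd sees an
      # in-place modification and the backward pass fails.
      ignored_index = 10  # typically the sequence length, also used as ignore_index

      start_positions = torch.tensor([3, 25])

      # Before (buggy): in-place, mutates the shared input tensor
      # start_positions.clamp_(0, ignored_index)

      # After (fixed): out-of-place, leaves the caller's tensor untouched
      start_positions = start_positions.clamp(0, ignored_index)
      ```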
    • fix docs of past_key_values (#12049) · 185122ef
      Suraj Patil authored
  12. 06 May, 2021 1 commit
  13. 26 Apr, 2021 1 commit
    • TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699) · 38a716cd
      Daniel Stancl authored:
      
      * Add cross_attn_head_mask to BART
      
      * Fix cross_attentions in TFBart-like models
      
      * This commit enables returning `cross_attentions` for TFBart-like models
      
      * It also fixes attention head masking in the cross-attention module
      
      * Update TF model templates
      
      * Fix missing , in TF model templates
      
      * Fix typo: congig -> config
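      A sketch of what the change enables; the checkpoint is illustrative and TF weights are assumed to be available for it:

      ```python
      from transformers import BartTokenizer, TFBartModel

      tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
      model = TFBartModel.from_pretrained("facebook/bart-base")

      inputs = tokenizer("Hello world", return_tensors="tf")
      outputs = model(inputs, output_attentions=True)

      # One tensor per decoder layer, each of shape
      # (batch_size, num_heads, target_seq_len, source_seq_len)
      print(len(outputs.cross_attentions), outputs.cross_attentions[0].shape)
      ```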
  14. 23 Apr, 2021 1 commit
    • Fix cross-attention head mask for Torch encoder-decoder models (#10605) · e3ff165a
      Daniel Stancl authored:
      * Fix cross-attention head mask for Torch BART models
      
      * Fix head masking for the cross-attention module for the following models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart, Pegasus
      
      * Enable test_headmasking for M2M_100 model
      
      * Fix cross_head_mask for FSMT, LED and T5
      
      * This commit fixes `head_mask` for cross-attention modules in the following models: FSMT, LED, T5
      
      * It also contains some smaller doc changes so that it is perfectly clear that the shape of `cross_head_mask` is the same as that of `decoder_head_mask`
      
      * Update template
      
      * Fix template for BartForCausalLM
      
      * Fix cross_head_mask for Speech2Text models
      
      * Fix cross_head_mask in templates
      
      * Fix args order in BartForCausalLM template
      
      * Fix doc in BART templates
      
      * Make more explicit naming
      
      * `cross_head_mask` -> `cross_attn_head_mask`
      
      * `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
      
      * Fix doc
      
      * make style quality
      
      * Fix speech2text docstring
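      A sketch of the renamed argument in use on the PyTorch side; the checkpoint is illustrative, and the shape follows the doc change above, matching `decoder_head_mask`:

      ```python
      import torch
      from transformers import BartModel, BartTokenizer

      tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
      model = BartModel.from_pretrained("facebook/bart-base")

      inputs = tokenizer("Hello world", return_tensors="pt")

      # Shape (decoder_layers, decoder_attention_heads); 0 masks a head,
      # 1 keeps it. Here we mask head 0 of the first decoder layer.
      cross_attn_head_mask = torch.ones(model.config.decoder_layers, model.config.decoder_attention_heads)
      cross_attn_head_mask[0, 0] = 0.0

      outputs = model(**inputs, cross_attn_head_mask=cross_attn_head_mask)
      ```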
  15. 21 Apr, 2021 1 commit
  16. 09 Apr, 2021 1 commit
  17. 07 Apr, 2021 1 commit
  18. 31 Mar, 2021 1 commit
  19. 29 Mar, 2021 1 commit
  20. 10 Mar, 2021 1 commit
  21. 05 Mar, 2021 1 commit
  22. 03 Mar, 2021 1 commit
  23. 25 Feb, 2021 1 commit
    • Bugfix: Removal of padding_idx in BartLearnedPositionalEmbedding (#10200) · 894db670
      mingruimingrui authored:
      * Assumption of padding_idx < 2 might not stand
      
      * Use offset instead of 2
      
      * Fix with black
      
      * Change behavior to warning instead for backward compatibility.
      
      * Fix with black
      
      * Remove warning
      
      * Make padding_idx non-required
      
      * padding_idx fix for blenderbot
      
      * padding_idx fix for blenderbot_small
      
      * padding_idx fix for led
      
      * padding_idx fix for mbart
      
      * Remove extra whitespaces
      
      * padding_idx fix for template
      
      * Fix padding_idx passed to nn.Embedding mistake
      
      * Fixed padding_idx passed to positional embedding in template
      
      * Remove padding_idx from pytorch learned positional embeddings
      
      * Remove accidentally added quotes
      
      * Remove padding_idx from tf learned positional embeddings
      
      * Remove zeroing of weights in __init__
      Co-authored-by: Wang Ming Rui <mingrui.wang@C02CJTUYMD6M.local>
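      A simplified sketch of the offset approach the thread converges on (the real class is BartLearnedPositionalEmbedding; this is not the exact upstream code): rather than assume padding_idx < 2, the table is enlarged by a fixed offset of 2 and every position id is shifted by it.

      ```python
      import torch
      from torch import nn


      class LearnedPositionalEmbedding(nn.Embedding):
          def __init__(self, num_embeddings: int, embedding_dim: int):
              # Bart historically reserves the first two embedding slots, so
              # offset the table instead of abusing padding_idx.
              self.offset = 2
              super().__init__(num_embeddings + self.offset, embedding_dim)

          def forward(self, input_ids_shape: torch.Size, past_key_values_length: int = 0):
              batch_size, seq_len = input_ids_shape[:2]
              positions = torch.arange(
                  past_key_values_length, past_key_values_length + seq_len,
                  dtype=torch.long, device=self.weight.device,
              ).expand(batch_size, -1)
              return super().forward(positions + self.offset)
      ```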
  24. 17 Feb, 2021 1 commit
    • Making TF BART-like models XLA and AMP compliant (#10191) · 83d803ba
      Julien Plu authored:
      * Update BART
      
      * Update Blenderbot
      
      * Update BlenderbotSmall
      
      * Update Marian
      
      * Update MBart
      
      * Update MBart
      
      * Update Pegasus
      
      * Update template
      
      * Fix Marian and Pegasus
      
      * Apply style
      
      * Default initializer
      
      * Default initializer
      
      * Default initializer
      
      * Remove int32 casts
      
      * Fix template
      
      * Remove more cast
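      Roughly what "XLA and AMP compliant" means in practice, as a sketch with a recent TF (tf.function's jit_compile argument needs TF >= 2.5; the checkpoint is illustrative):

      ```python
      import tensorflow as tf
      from transformers import BartTokenizer, TFBartModel

      tf.keras.mixed_precision.set_global_policy("mixed_float16")  # AMP

      tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
      model = TFBartModel.from_pretrained("facebook/bart-base")

      @tf.function(jit_compile=True)  # XLA compilation
      def forward(input_ids, attention_mask):
          return model(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state

      inputs = tokenizer("Hello world", return_tensors="tf")
      print(forward(inputs["input_ids"], inputs["attention_mask"]).shape)
      ```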
  25. 15 Feb, 2021 2 commits
  26. 11 Feb, 2021 2 commits
  27. 09 Feb, 2021 2 commits
  28. 08 Feb, 2021 4 commits
  29. 05 Feb, 2021 1 commit
    • [Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921) · 89be094e
      Patrick von Platen authored:
      * add big bird
      
      * change teacher to mentor
      
      * add proposal template
      
      * adapt template
      
      * delete old template
      
      * correct some links
      
      * finish template
      
      * create big bird from template
      
      * add big bird
      
      * improve boxes
      
      * finish boxes
      
      * add pointers for BigBird
      
      * finish big bird
      
      * up
      
      * up
      
      * up
      
      * up
      
      * apply Lysandre's and Sylvain's suggestions
      
      * delete bogus file
      
      * correct markdown
      
      * try different style
      
      * try different style
      
      * finalize
  30. 04 Feb, 2021 2 commits
    • Fix model templates (#9999) · e89c959a
      Lysandre Debut authored
    • BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128) · 00031785
      demSd authored:
      * initialize BartForCausalLM
      
      * create BartDecoderWrapper, setters/getters
      
      * delete spaces
      
      * forward and additional methods
      
      * update cache function, loss function, remove ngram* params in data class.
      
      * add bartcausallm, bartdecoder testing
      
      * correct bart for causal lm
      
      * remove at
      
      * add mbart as well
      
      * up
      
      * fix typo
      
      * up
      
      * correct
      
      * add pegasusforcausallm
      
      * add blenderbotforcausallm
      
      * add blenderbotsmallforcausallm
      
      * add marianforcausallm
      
      * add test for MarianForCausalLM
      
      * add Pegasus test
      
      * add BlenderbotSmall test
      
      * add blenderbot test
      
      * fix a fail
      
      * fix an import fail
      
      * a fix
      
      * fix
      
      * Update modeling_pegasus.py
      
      * fix models
      
      * fix inputs_embeds setter/getter
      
      * adapt tests
      
      * correct repo utils check
      
      * finish test improvement
      
      * fix tf models as well
      
      * make style
      
      * make fix-copies
      
      * fix copies
      
      * run all tests
      
      * last changes
      
      * fix all tests
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
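      A usage sketch of the class this PR introduces: BART's decoder reused as a standalone causal LM. The checkpoint is illustrative; loading a full seq2seq checkpoint this way warns about (and drops) the unused encoder weights.

      ```python
      from transformers import BartForCausalLM, BartTokenizer

      tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
      model = BartForCausalLM.from_pretrained("facebook/bart-base")
      # The wrapped decoder runs with is_decoder=True, i.e. with a causal mask.

      inputs = tokenizer("Hello world", return_tensors="pt")
      outputs = model(**inputs)
      print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
      ```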
  31. 02 Feb, 2021 2 commits