  1. 01 Sep, 2021 1 commit
    • Add template for adding flax models (#12441) · d160782a
      Jonathan Chang authored
      
      
      * Add option to add flax
      
      * Add flax template for __init__.py
      
      * Add flax template for .rst
      
      * Copy TF modeling template
      
      * Add a missing line in modeling_tf_... template
      
      * Update first half of modeling_flax_..
      
      * Update encoder flax template
      
      * Copy test_modeling_tf... as test_modeling_flax...
      
      * Replace some TF to Flax in test_modeling_flax_...
      
      * Replace tf to np
      
      some functions might not work, like _assert_tensors_equal
      
      * Replace remaining tf to np (might not work)
      
      * Fix cookiecutter
      
      * Add Flax in to_replace_... template
      
      * Update transformers-cli add-new-model
      
      * Save generate_flax in configuration.json
      
      This will be read by transformers-cli
      
      * Fix to_replace_... and cli
      
      * Fix replace cli
      
      * Fix cookiecutter name
      
      * Move docstring earlier to avoid not defined error
      
      * Fix a missing Module
      
      * Add encoder-decoder flax template from bart
      
      * Fix flax test
      
      * Make style
      
      * Fix endif
      
      * Fix replace all "utf-8 -> unp-8"
      
      * Update comment
      
      * Fix flax template (add missing ..._DOCSTRING)
      
      * Use flax_bart imports in template (was t5)
      
      * Fix unp
      
      * Update templates/adding_a_new_model/tests
      
      * Revert "Fix unp"
      
      This reverts commit dc9002a41d902c4f9b07343eab1cb350c8b7fd57.
      
      * Remove one line of copied from to suppress CI error
      
      * Use generate_tensorflow_pytorch_and_flax
      
      * Add a missing part
      
      * fix typo
      
      * fix flax config
      
      * add examples for flax
      
      * small rename
      
      * correct modeling imports
      
      * correct auto loading
      
      * corrects some flax tests
      
      * correct small typo
      
      * correct as type
      
      * finish modif
      
      * correct more templates
      
      * final fixes
      
      * add file testers
      
      * up
      
      * make sure tests match template regex
      
      * correct pytorch
      
      * correct tf
      
      * correct more tf
      
      * correct imports
      
      * minor error
      
      * minor error
      
      * correct init
      
      * more fixes
      
      * correct more flax tests
      
      * correct flax test
      
      * more fixes
      
      * correct docs
      
      * update
      
      * fix
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  2. 31 Aug, 2021 1 commit
  3. 19 Aug, 2021 1 commit
  4. 06 Aug, 2021 1 commit
    • [WIP] Disentangle auto modules from other modeling files (#13023) · 9870093f
      Sylvain Gugger authored
      * Initial work
      
      * All auto models
      
      * All tf auto models
      
      * All flax auto models
      
      * Tokenizers
      
      * Add feature extractors
      
      * Fix typos
      
      * Fix other typo
      
      * Use the right config
      
      * Remove old mapping names and update logic in AutoTokenizer
      
      * Update check_table
      
      * Fix copies and check_repo script
      
      * Fix last test
      
      * Add back name
      
      * clean up
      
      * Update template
      
      * Update template
      
      * Forgot a )
      
      * Use alternative to fixup
      
      * Fix TF model template
      
      * Address review comments
      
      * Address review comments
      
      * Style
  5. 03 Aug, 2021 1 commit
  6. 21 Jul, 2021 1 commit
  7. 08 Jul, 2021 1 commit
  8. 07 Jul, 2021 1 commit
  9. 28 Jun, 2021 1 commit
  10. 26 Jun, 2021 1 commit
  11. 25 Jun, 2021 1 commit
  12. 23 Jun, 2021 1 commit
  13. 22 Jun, 2021 1 commit
    • Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252) · af6e01c5
      Hamid Shojanazeri authored
      
      
      * registering a buffer for token_type_ids, to pass the error of device-id getting hardcoded when tracing
      
      * style format
      
      * adding persistent flag to the registered buffers, which prevents them from being added to the state_dict and addresses the backward compatibility issue
      
      * adding the try catch to the fix as persistent flag is only available from PT >1.6
      
      * adding version check
      
      * added the condition to only use the token_type_ids buffer when it's autogenerated, not passed by the user
      
      * adding comments and making the condition where token_type_ids are None to use the registered buffer
      
      * taking out position-embedding from the if block
      
      * adding comments
      
      * handling the case if buffer for position_ids was not registered
      
      * reverted the changes on position_ids, fix the issue with size of token_type_ids buffer, moved the modification for generated token_type_ids to Bertmodel, instead of Embeddings
      
      * reverting the token_type_ids in case of None to the previous version
      
      * reverting changes on position_ids adding back the if block
      
      * changes added by running make fix-copies
      
      * changes added by running make fix-copies and added the import version as it was getting used
      
      * changes added by running make fix-copies
      
      * changes added by running make fix-copies
      
      * fixing the import format
      
      * fixing the import format
      
      * modified to use temp tensor for trimmed and expanded token_type_ids buffer
      
      * changes made by fix-copies after temp tensor modifications
      
      * changes made by fix-copies after temp tensor modifications
      
      * changes made by fix-copies after temp tensor modifications
      
      * clean up
      
      * clean up
      
      * clean up
      
      * clean up
      
      * Nit
      
      * Nit
      
      * Nit
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * changes based on latest in master
      
      * Adapt templates
      
      * Add version import
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
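
The selection logic described in this commit (use the registered buffer only when the caller did not pass token_type_ids, then trim and expand it) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library code: lists stand in for tensors, `resolve_token_type_ids` is a hypothetical helper, and the buffer length of 512 is an assumed maximum sequence length.

```python
def resolve_token_type_ids(token_type_ids, buffer, batch_size, seq_length):
    """Return the caller's ids if given; otherwise derive them from the buffer.

    Hypothetical helper: `buffer` stands in for a buffer registered on the
    module, so the derived ids would follow the module across devices.
    """
    if token_type_ids is not None:
        # User-supplied ids always win; the buffer is not consulted.
        return token_type_ids
    # Trim the buffer to the current sequence length ...
    trimmed = buffer[:seq_length]
    # ... and expand it to one row per batch element, using fresh copies
    # so the shared buffer itself is never modified.
    return [list(trimmed) for _ in range(batch_size)]


# Assumed maximum sequence length of 512, all positions in segment 0.
buffer = [0] * 512

auto_ids = resolve_token_type_ids(None, buffer, batch_size=2, seq_length=4)
user_ids = resolve_token_type_ids([[0, 0, 1, 1]], buffer, batch_size=1, seq_length=4)
```

Building temporary rows rather than slicing the buffer in place mirrors the "temp tensor" step mentioned in the commit message: the stored buffer keeps its full length for later calls with longer inputs.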
  14. 18 Jun, 2021 1 commit
  15. 14 Jun, 2021 1 commit
  16. 07 Jun, 2021 2 commits
    • Fixes bug that appears when using QA BERT and distillation. (#12026) · f8bd8c6c
      François Lagunas authored
      * Fixing bug that appears when using distillation (and potentially other uses).
      During the backward pass, PyTorch complains with:
      RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
      This happens because the QA model code modifies the start_positions and end_positions input tensors using the clamp_ function: as a consequence, the teacher and the student both modify the inputs, and the backward pass fails.
      
      * Fixing all models QA clamp_ bug.
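
The root cause described in this commit is two consumers clamping the same input in place. The sketch below is a plain-Python analogy under stated assumptions: lists stand in for tensors, `clamp_in_place` and `clamp_copy` are hypothetical helpers mirroring the in-place `clamp_` and out-of-place `clamp` semantics, and the upper bound of 8 is an assumed sequence length. It shows only the shared-input aliasing, not the autograd error itself.

```python
def clamp_in_place(values, lo, hi):
    """Clamp every element of `values` in place (analogous to Tensor.clamp_)."""
    for i, v in enumerate(values):
        values[i] = max(lo, min(hi, v))
    return values


def clamp_copy(values, lo, hi):
    """Return a clamped copy, leaving `values` untouched (analogous to Tensor.clamp)."""
    return [max(lo, min(hi, v)) for v in values]


ignored_index = 8  # assumed sequence length, used as the clamp upper bound

# In-place: the first caller mutates the list that every other caller sees.
start_positions = [3, 12, -1]
teacher_view = clamp_in_place(start_positions, 0, ignored_index)
mutated_shared_input = start_positions == [3, 8, 0]

# Out-of-place: each caller gets its own copy and the input stays intact.
fresh_positions = [3, 12, -1]
student_view = clamp_copy(fresh_positions, 0, ignored_index)
input_preserved = fresh_positions == [3, 12, -1]
```

With the out-of-place variant, a teacher and a student can both clamp the same labels without either one changing what the other (or a saved backward graph) refers to.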
    • fix docs of past_key_values (#12049) · 185122ef
      Suraj Patil authored
  17. 25 May, 2021 1 commit
  18. 06 May, 2021 1 commit
  19. 26 Apr, 2021 2 commits
    • [Examples] Fixes inconsistency around eval vs val and predict vs test (#11380) · 1d30ec95
      Bhadresh Savani authored
      * added changes for uniformity
      
      * modified files
      
      * corrected typo
      
      * fixed qa scripts
      
      * fix typos
      
      * fixed predict typo in qa no trainer
      
      * fixed test file
      
      * reverted trainer changes
      
      * reverted trainer changes in custom examples
      
      * updated readme
      
      * added changes in deepspeed test
      
      * added changes for predict and eval
    • TF BART models - Add `cross_attentions` to model output and fix... · 38a716cd
      Daniel Stancl authored
      TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699)
      
      * Add cross_attn_head_mask to BART
      
      * Fix cross_attentions in TFBart-like models
      
      * This commit enables returning of `cross_attentions`
      for TFBart-like models
      
      * It also fixes attention head masking in cross-attention module
      
      * Update TF model templates
      
      * Fix missing , in TF model templates
      
      * Fix typo: congig -> config
  20. 23 Apr, 2021 1 commit
    • Fix cross-attention head mask for Torch encoder-decoder models (#10605) · e3ff165a
      Daniel Stancl authored
      * Fix cross-attention head mask for Torch BART models
      
      * Fix head masking for cross-attention module for the following
      models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart,
      Pegasus
      
      * Enable test_headmasking for M2M_100 model
      
      * Fix cross_head_mask for FSMT, LED and T5
      
      * This commit fixes `head_mask` for cross-attention modules
      in the following models: FSMT, LED, T5
      
      * It also contains some smaller changes in doc so that
      it is perfectly clear that the shape of `cross_head_mask`
      is the same as that of `decoder_head_mask`
      
      * Update template
      
      * Fix template for BartForCausalLM
      
      * Fix cross_head_mask for Speech2Text models
      
      * Fix cross_head_mask in templates
      
      * Fix args order in BartForCausalLM template
      
      * Fix doc in BART templates
      
      * Make more explicit naming
      
      * `cross_head_mask` -> `cross_attn_head_mask`
      
      * `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
      
      * Fix doc
      
      * make style quality
      
      * Fix speech2text docstring
  21. 21 Apr, 2021 1 commit
  22. 09 Apr, 2021 1 commit
  23. 07 Apr, 2021 1 commit
  24. 31 Mar, 2021 1 commit
  25. 29 Mar, 2021 1 commit
  26. 23 Mar, 2021 2 commits
  27. 10 Mar, 2021 1 commit
  28. 09 Mar, 2021 1 commit
  29. 05 Mar, 2021 1 commit
  30. 03 Mar, 2021 1 commit
  31. 25 Feb, 2021 1 commit
    • Bugfix: Removal of padding_idx in BartLearnedPositionalEmbedding (#10200) · 894db670
      mingruimingrui authored
      
      
      * Assumption of padding_idx <2 might not stand
      
      * Use offset instead of 2
      
      * Fix with black
      
      * Change behavior to warning instead for backward compatibility.
      
      * Fix with black
      
      * Remove warning
      
      * Make padding_idx non-required
      
      * padding_idx fix for blenderbot
      
      * padding_idx fix for blenderbot_small
      
      * padding_idx fix for led
      
      * padding_idx fix for mbart
      
      * Remove extra whitespaces
      
      * padding_idx fix for template
      
      * Fix padding_idx passed to nn.Embedding mistake
      
      * Fixed padding_idx passed to positional embedding in template
      
      * Remove padding_idx from pytorch learned positional embeddings
      
      * Remove accidentally added quotes
      
      * Remove padding_idx from tf learned positional embeddings
      
      * Remove zeroing of weights in __init__
      Co-authored-by: Wang Ming Rui <mingrui.wang@C02CJTUYMD6M.local>
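
The idea behind "Use offset instead of 2" and dropping padding_idx can be sketched as follows. This is a minimal sketch under stated assumptions, not the library implementation: `LearnedPositionIndex` is a hypothetical class, plain integers stand in for embedding lookups, and the default offset of 2 is taken from the value the commit message says was previously hard-coded.

```python
class LearnedPositionIndex:
    """Hypothetical index helper for a learned positional embedding table.

    Position lookups are shifted by a fixed `offset`, independent of any
    padding_idx, so no assumption about padding_idx < 2 is needed.
    """

    def __init__(self, num_positions, offset=2):
        self.offset = offset
        # The table carries `offset` extra rows ahead of the real positions.
        self.table_size = num_positions + offset

    def position_ids(self, seq_length, past_length=0):
        """Row indices for `seq_length` new tokens after `past_length` cached ones."""
        return [past_length + i + self.offset for i in range(seq_length)]


index = LearnedPositionIndex(num_positions=1024)
first_step = index.position_ids(3)
cached_step = index.position_ids(2, past_length=5)
```

Keeping the shift in a single named attribute means the table size and the lookup indices cannot drift apart, which is the practical benefit of replacing the literal 2.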
  32. 17 Feb, 2021 1 commit
    • Making TF BART-like models XLA and AMP compliant (#10191) · 83d803ba
      Julien Plu authored
      * Update BART
      
      * Update Blenderbot
      
      * Update BlenderbotSmall
      
      * Update Marian
      
      * Update MBart
      
      * Update MBart
      
      * Update Pegasus
      
      * Update template
      
      * Fix Marian and Pegasus
      
      * Apply style
      
      * Default initializer
      
      * Default initializer
      
      * Default initializer
      
      * Remove int32 casts
      
      * Fix template
      
      * Remove more cast
  33. 15 Feb, 2021 2 commits
  34. 11 Feb, 2021 2 commits
  35. 09 Feb, 2021 1 commit