1. 03 Jul, 2023 1 commit
    • [`Umt5`] Add google's umt5 to `transformers` (#24477) · 799df10a
      Arthur authored
      
      
      * add tokenization template
      
      * update conversion script
      
      * update modeling code
      
      * update
      
      * update convert checkpoint
      
      * update modeling
      
      * revert changes on convert script
      
      * new conversion script for new format
      
      * correct position bias
      
      * cleaning a bit
      
      * Credit co authors
      Co-authored-by: agemagician <ahmed.elnaggar@tum.de>
      
      Co-authored-by: stefan-it <>
      
      * styling
      
      * Add docs
      
      * fix copies
      
      * add co author
      
      * Other Author
      
      * Merge branch 'main' of https://github.com/huggingface/transformers into add-umt5
      
      * add testing
      
      * nit
      
      * Update docs/source/en/model_doc/umt5.mdx
      Co-authored-by: Stefan Schweter <stefan@schweter.it>
      
      * fix t5
      
      * actual fix?
      
      * revert wrong changes
      
      * remove
      
      * update test
      
      * more fixes
      
      * revert some changes
      
      * add SPIECE_UNDERLINE
      
      * add a common example
      
      * update
      
      * fix copies
      
      * revert changes on t5 conversion script
      
      * revert bytefallback changes since there was no addition yet
      
      * fixup
      
      * fixup
      
      * ignore umt5 custom testing folder
      
      * fix readmes
      
      * revert T5 changes
      
      * same outputs
      
      * fixup
      
      * update example
      
      * Apply suggestions from code review
      
      * style
      
      * draft addition of all new files
      
      * current update
      
      * fix attention and stuff
      
      * finish refactoring
      
      * auto config
      
      * fixup
      
      * more nits
      
      * add umt5 to init
      
      * use md format
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * revert changes on mt5
      
      * revert mt5 changes
      
      * update test
      
      * more fixes
      
      * add to mapping
      
      * fix-copies
      
      * fix copies
      
      * fix retain grad
      
      * fix some tests
      
      * nits
      
      * done
      
      * Update src/transformers/models/umt5/modeling_umt5.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/umt5.md
      
      * Update src/transformers/models/umt5/__init__.py
      
      * Update docs/source/en/model_doc/umt5.md
      Co-authored-by: Stefan Schweter <stefan@schweter.it>
      
      * Update src/transformers/models/umt5/modeling_umt5.py
      
      * update conversion script + use google checkpoints
      
      * nits
      
      * update test and modelling
      
      * stash slow convert
      
      * update fixup
      
      * don't change slow
      
      ---------
      
      Co-authored-by: stefan-it <>
      Co-authored-by: Stefan Schweter <stefan@schweter.it>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
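      For context, a minimal usage sketch of the UMT5 integration this commit adds. The checkpoint name follows the converted Google release referenced above; the prompt and generation settings are illustrative:

      ```python
      from transformers import AutoTokenizer, UMT5ForConditionalGeneration

      tokenizer = AutoTokenizer.from_pretrained("google/umt5-small")
      model = UMT5ForConditionalGeneration.from_pretrained("google/umt5-small")

      # UMT5 checkpoints are pretrained only, so a span-corruption prompt with a
      # sentinel token is the natural smoke test.
      inputs = tokenizer("A <extra_id_0> walks into a bar.", return_tensors="pt")
      outputs = model.generate(**inputs, max_new_tokens=20)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))
      ```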
  2. 27 Jun, 2023 1 commit
  3. 20 Jun, 2023 1 commit
  4. 14 Jun, 2023 1 commit
  5. 04 May, 2023 1 commit
  6. 03 May, 2023 1 commit
  7. 02 May, 2023 1 commit
  8. 24 Mar, 2023 1 commit
    • Add Mega: Moving Average Equipped Gated Attention (#21766) · 57f25f4b
      Mitch Naylor authored
      
      
      * add mega file structure and plain pytorch version of mega source code
      
      * added config class with old naming conventions
      
      * filled in mega documentation
      
      * added config class and embeddings with optional token types
      
      * updated notes
      
      * starting the conversion process, deleted intermediate and added use_cache back to config
      
      * renamed config attributes in modeling_mega.py
      
      * checkpointing before refactoring incremental decoding functions
      
      * removed stateful incremental key/values for EMA and self-attention
      
      * refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask
      
      * MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement
      
      * more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention
      
      * bug fix in attention mask handling in MovingAverageGatedAttention
      
      * removed incremental state from GatedCrossAttention and removed IncrementalState class
      
      * finished gated cross attention and got MegaLayer working
      
      * fixed causal masking in mega decoder
      
      * fixed how padding and causal masks are passed through MegaLayer with and without k/v caching
      
      * finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids
      
      * added optional dense hidden layer for masked and causal LM classes
      
      * docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention
      
      * removed before_attn_fn in Mega class and updated docstrings and comments up to there
      
      * bug fix in MovingAverageGatedAttention masking
      
      * working conversion of MLM checkpoint in scratchpad script -- perfect matches
      
      * moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters
      
      * renamed gamma and beta parameters to avoid HF renaming when loading from checkpoint
      
      * finished checkpoint conversion script
      
      * cleanup old class in mega config script
      
      * removed 'copied from' statements and passing integration tests
      
      * added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing
      
      * fixed tuple output of megamodel
      
      * all common tests passing after fixing issues in decoder, gradient retention, and initialization
      
      * added mega-specific tests, ready for more documentation and style checks
      
      * updated docstrings; checkpoint before style fixes
      
      * style and quality checks, fixed initialization problem in float_tensor, ready for PR
      
      * added mega to toctree
      
      * removed unnecessary arg in megaconfig
      
      * removed unused arg and fixed code samples with leftover roberta models
      
      * Apply suggestions from code review
      
      Applied all suggestions except the one renaming a class, as I'll need to update that throughout
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA
      
      * removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACT2FN, and added sequencenorm to layer norms
      
      * reformatted .forward() docstrings to match style and removed unused mask input in cross-attention
      
      * removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights()
      
      * renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files
      
      * variable names in NFFN
      
      * manual Mega->MEGA changes in docs
      
      * Mega->MEGA in config auto
      
      * style and quality fixes
      
      * Apply suggestions from code review
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments
      
      * commit before dealing with merge conflicts
      
      * made new attention activation functions available in ACT2FN and added generation test from OPT
      
      * style and quality in activations and tests
      
      * documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings
      
      * style and quality fixes after latest updates, before rotary position ids
      
      * causal mask in MegaBlock docstring + added missing device passing
      
      * Apply suggestions from code review
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR
      
      * style and quality fixes + readme updates pointing to main
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
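      A minimal masked-LM sketch against the converted MEGA checkpoint (the checkpoint name follows the model docs; the RoBERTa-style `<mask>` token is assumed):

      ```python
      import torch
      from transformers import AutoTokenizer, MegaForMaskedLM

      tokenizer = AutoTokenizer.from_pretrained("mnaylor/mega-base-wikitext")
      model = MegaForMaskedLM.from_pretrained("mnaylor/mega-base-wikitext")

      inputs = tokenizer("The capital of France is <mask>.", return_tensors="pt")
      with torch.no_grad():
          logits = model(**inputs).logits

      # Decode the highest-scoring token at the mask position.
      mask_pos = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
      print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
      ```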
  9. 27 Feb, 2023 1 commit
  10. 15 Feb, 2023 1 commit
    • Add Ernie-M Model to huggingface (#21349) · 0c9c8472
      Susnato Dhar authored
      * config and tokenization(fast too) changed and ErnieEncoder added
      
      * Slow Tokenization Added
      
      * Tokenizer(slow) is now working and Fast Tokenizer removed
      
      * Added Config code
      
      * Added Base Model and utils
      
      * ErnieMModel is now working
      
      * All added except tests
      
      * All tests passed except ErnieUIEM
      
      * All tests passed
      
      * all fixes done
      
      * all fixes done
      
      * fixed MAP
      
      * fixed check_code_quality
      
      * fixed Build PR Documentation issue
      
      * Added changes (comments) and also updated to the latest upstream/main
      
      * Added fixup
      
      * Added # Copied comments
      
      * Added fixup
      
      * Added more comments and some nits
      
      * Added fixup
      
      * Fixed README_hd.md
      
      * Added more fixes
      
      * ErnieMTokenizer (being sentencepiece) protected and other docs edited
      
      * Added code_quality fix
      
      * Fixed for
      
      * Added more fix
      
      * modified AZ
      
      * ernie-m tokenization test added!
      
      * attention mask part fixed (with 0 -> self.config.pad_token_id)
      
      * applied make fixup
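      A minimal sketch of loading the new Ernie-M model (checkpoint name per the model docs; the input text is illustrative):

      ```python
      import torch
      from transformers import AutoTokenizer, ErnieMModel

      tokenizer = AutoTokenizer.from_pretrained("susnato/ernie-m-base_pytorch")
      model = ErnieMModel.from_pretrained("susnato/ernie-m-base_pytorch")

      inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
      with torch.no_grad():
          outputs = model(**inputs)
      print(outputs.last_hidden_state.shape)  # (batch, sequence, hidden)
      ```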
  11. 14 Feb, 2023 1 commit
  12. 10 Feb, 2023 1 commit
    • Add X-MOD (#20939) · b0d539cc
      Jannis Vamvas authored
      
      
      * Add X-MOD to Readme
      
      * Add documentation for X-MOD
      
      * Implement X-MOD
      
      * Fix formatting of X-MOD docs
      
      * Change signature of X-MOD forward methods to use lang_ids
      
      * Minor changes
      
      * Rebase with main and run make fix-copies
      
      * Make suggested changes to docstrings
      
      * Improve code readability
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      
      * Fix code style
      
      * Conversion script: Remove asserts and type annotations
      
      * Remove _TOKENIZER_FOR_DOC
      
      * XMOD -> Xmod
      
      * Update copyright note
      
      * Fix doctests
      
      * Fix docstring
      
      * Add integration test for FillMaskPipeline
      
      * Revert "Add integration test for FillMaskPipeline"
      
      This reverts commit 4381eb3b1d0f5d85785f89caba83928e6efa6d1f.
      
      * Add end-to-end integration test for mask fill
      
      * make style
      
      * Rebase with main and make fix-copies
      
      ---------
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
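      A minimal sketch of the usage pattern the `lang_ids` signature change enables: the language adapter can be fixed once with `set_default_language` instead of passing `lang_ids` on every forward call (checkpoint and language code per the model docs):

      ```python
      import torch
      from transformers import AutoTokenizer, XmodModel

      tokenizer = AutoTokenizer.from_pretrained("facebook/xmod-base")
      model = XmodModel.from_pretrained("facebook/xmod-base")
      # Selects the language adapter used when no lang_ids is passed to forward().
      model.set_default_language("en_XX")

      inputs = tokenizer("Hello world!", return_tensors="pt")
      with torch.no_grad():
          outputs = model(**inputs)
      print(outputs.last_hidden_state.shape)
      ```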
  13. 02 Feb, 2023 1 commit
  14. 27 Jan, 2023 1 commit
    • Automated compatible models list for task guides (#21338) · 73a2ff69
      Maria Khalusova authored
      * initial commit. added tip placeholders and a script
      
      * removed unused imports, fixed paths
      
      * fixed generated links
      
      * make style
      
      * split language modeling doc into two: causal language modeling and masked language modeling
      
      * added check_task_guides.py to make fix-copies
      
      * review feedback addressed
  15. 21 Nov, 2022 1 commit
    • Add inference section to task guides (#18781) · d896029e
      Steven Liu authored
      * 📝 start adding inference section to task guides
      
      * make style
      
      * 📝 add multiple choice
      
      * add rest of inference sections
      
      * make style
      
      * add compute_metric, push_to_hub, pipeline
      
      * make style
      
      * add updated sequence and token classification
      
      * make style
      
      * make edits in token classification
      
      * add audio classification
      
      * make style
      
      * add asr
      
      * make style
      
      * add image classification
      
      * make style
      
      * add summarization
      
      * make style
      
      * add translation
      
      * make style
      
      * add multiple choice
      
      * add language modeling
      
      * add qa
      
      * make style
      
      * review and edits
      
      * apply reviews
      
      * make style
      
      * fix call to processor
      
      * apply audio reviews
      
      * update to better asr model
      
      * make style
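      The inference sections added here center on `pipeline()`; a representative sketch (the task and checkpoint choice are illustrative, not taken from the commit):

      ```python
      from transformers import pipeline

      # Any checkpoint fine-tuned for the task works; this one is a common default.
      classifier = pipeline(
          "sentiment-analysis",
          model="distilbert-base-uncased-finetuned-sst-2-english",
      )
      print(classifier("These task guides now cover inference end to end."))
      ```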
  16. 07 Sep, 2022 1 commit
  17. 06 Jul, 2022 1 commit
  18. 04 Apr, 2022 1 commit
  19. 25 Mar, 2022 1 commit
  20. 22 Mar, 2022 1 commit
  21. 18 Mar, 2022 1 commit
  22. 15 Mar, 2022 1 commit
  23. 23 Feb, 2022 1 commit