1. 02 Feb, 2024 1 commit
  2. 31 Oct, 2023 1 commit
    • Hz, Ji's avatar
      device agnostic models testing (#27146) · 50378cbf
      Hz, Ji authored
      * device agnostic models testing
      
      * add decorator `require_torch_fp16`
      
      * make style
      
      * apply review suggestion
      
      * Oops, the fp16 decorator was misused
      50378cbf
  3. 05 Sep, 2023 1 commit
  4. 02 Aug, 2023 1 commit
  5. 07 Apr, 2023 1 commit
  6. 06 Apr, 2023 1 commit
  7. 24 Mar, 2023 1 commit
    • Mitch Naylor's avatar
      Add Mega: Moving Average Equipped Gated Attention (#21766) · 57f25f4b
      Mitch Naylor authored
      
      
      * add mega file structure and plain pytorch version of mega source code
      
      * added config class with old naming conventions
      
      * filled in mega documentation
      
      * added config class and embeddings with optional token types
      
      * updated notes
      
      * starting the conversion process, deleted intermediate and added use_cache back to config
      
      * renamed config attributes in modeling_mega.py
      
      * checkpointing before refactoring incremental decoding functions
      
      * removed stateful incremental key/values for EMA and self-attention
      
      * refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask
      
      * MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement
      
      * more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention
      
      * bug fix in attention mask handling in MovingAverageGatedAttention
      
      * removed incremental state from GatedCrossAttention and removed IncrementalState class
      
      * finished gated cross attention and got MegaLayer working
      
      * fixed causal masking in mega decoder
      
      * fixed how padding and causal masks are passed through MegaLayer with and without k/v caching
      
      * finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids
      
      * added optional dense hidden layer for masked and causal LM classes
      
      * docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention
      
      * removed before_attn_fn in Mega class and updated docstrings and comments up to there
      
      * bug fix in MovingAverageGatedAttention masking
      
      * working conversion of MLM checkpoint in scratchpad script -- perfect matches
      
      * moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters
      
      * renamed gamma and beta parameters to avoid HF renaming when loading from checkpoint
      
      * finished checkpoint conversion script
      
      * cleanup old class in mega config script
      
      * removed 'copied from' statements and passing integration tests
      
      * added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing
      
      * fixed tuple output of megamodel
      
      * all common tests passing after fixing issues in decoder, gradient retention, and initialization
      
      * added mega-specific tests, ready for more documentation and style checks
      
      * updated docstrings; checkpoint before style fixes
      
      * style and quality checks, fixed initialization problem in float_tensor, ready for PR
      
      * added mega to toctree
      
      * removed unnecessary arg in megaconfig
      
      * removed unused arg and fixed code samples with leftover roberta models
      
      * Apply suggestions from code review
      
      Applied all suggestions except the one renaming a class, as I'll need to update that througout
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA
      
      * removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACTFN, and added sequencenorm to layer norms
      
      * reformatted .forward() docstrings to match style and removed unused mask input in cross-attention
      
      * removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights()
      
      * renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files
      
      * variable names in NFFN
      
      * manual Mega->MEGA changes in docs
      
      * Mega->MEGA in config auto
      
      * style and quality fixes
      
      * Apply suggestions from code review
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments
      
      * commit before dealing with merge conflicts
      
      * made new attention activation functions available in ACT2FN and added generation test from OPT
      
      * style and quality in activations and tests
      
      * documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings
      
      * style and quality fixes after latest updates, before rotary position ids
      
      * causal mask in MegaBlock docstring + added missing device passing
      
      * Apply suggestions from code review
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update README.md
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR
      
      * style and quality fixes + readme updates pointing to main
      
      ---------
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      57f25f4b
  8. 28 Feb, 2023 1 commit
    • Yih-Dar's avatar
      馃敟Rework pipeline testing by removing `PipelineTestCaseMeta` 馃殌 (#21516) · 871c31a6
      Yih-Dar authored
      
      
      * Add PipelineTesterMixin
      
      * remove class PipelineTestCaseMeta
      
      * move validate_test_components
      
      * Add for ViT
      
      * Add to SPECIAL_MODULE_TO_TEST_MAP
      
      * style and quality
      
      * Add feature-extraction
      
      * update
      
      * raise instead of skip
      
      * add tiny_model_summary.json
      
      * more explicit
      
      * skip tasks not in mapping
      
      * add availability check
      
      * Add Copyright
      
      * A way to diable irrelevant tests
      
      * update with main
      
      * remove disable_irrelevant_tests
      
      * skip tests
      
      * better skip message
      
      * better skip message
      
      * Add all pipeline task tests
      
      * revert
      
      * Import PipelineTesterMixin
      
      * subclass test classes with PipelineTesterMixin
      
      * Add pipieline_model_mapping
      
      * Fix import after adding pipieline_model_mapping
      
      * Fix style and quality after adding pipieline_model_mapping
      
      * Fix one more import after adding pipieline_model_mapping
      
      * Fix style and quality after adding pipieline_model_mapping
      
      * Fix test issues
      
      * Fix import requirements
      
      * Fix mapping for MobileViTModelTest
      
      * Update
      
      * Better skip message
      
      * pipieline_model_mapping could not be None
      
      * Remove some PipelineTesterMixin
      
      * Fix typo
      
      * revert tests_fetcher.py
      
      * update
      
      * rename
      
      * revert
      
      * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests
      
      * style and quality
      
      * test fetcher for all pipeline/model tests
      
      ---------
      Co-authored-by: default avatarydshieh <ydshieh@users.noreply.github.com>
      871c31a6
  9. 06 Feb, 2023 1 commit
    • Sylvain Gugger's avatar
      Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
      6f79d264
  10. 15 Nov, 2022 1 commit
    • Arthur's avatar
      update relative positional embedding (#20203) · f60eec40
      Arthur authored
      * update relative positional embedding
      
      * make fix copies
      
      * add `use_cache` to list of arguments
      
      * fixup
      
      * 1line fucntion
      
      * add `test_decoder_model_past_with_large_inputs_relative_pos_emb`
      
      * add relative pos embedding test for more models
      
      * style
      f60eec40
  11. 09 Nov, 2022 1 commit
  12. 02 Nov, 2022 1 commit
  13. 08 Jun, 2022 1 commit
  14. 03 May, 2022 1 commit
    • Yih-Dar's avatar
      Move test model folders (#17034) · 19420fd9
      Yih-Dar authored
      
      
      * move test model folders (TODO: fix imports and others)
      
      * fix (potentially partially) imports (in model test modules)
      
      * fix (potentially partially) imports (in tokenization test modules)
      
      * fix (potentially partially) imports (in feature extraction test modules)
      
      * fix import utils.test_modeling_tf_core
      
      * fix path ../fixtures/
      
      * fix imports about generation.test_generation_flax_utils
      
      * fix more imports
      
      * fix fixture path
      
      * fix get_test_dir
      
      * update module_to_test_file
      
      * fix get_tests_dir from wrong transformers.utils
      
      * update config.yml (CircleCI)
      
      * fix style
      
      * remove missing imports
      
      * update new model script
      
      * update check_repo
      
      * update SPECIAL_MODULE_TO_TEST_MAP
      
      * fix style
      
      * add __init__
      
      * update self-scheduled
      
      * fix add_new_model scripts
      
      * check one way to get location back
      
      * python setup.py build install
      
      * fix import in test auto
      
      * update self-scheduled.yml
      
      * update slack notification script
      
      * Add comments about artifact names
      
      * fix for yolos
      Co-authored-by: default avatarydshieh <ydshieh@users.noreply.github.com>
      19420fd9
  15. 23 Feb, 2022 1 commit
  16. 07 Feb, 2022 1 commit
    • Michael Benayoun's avatar
      FX tracing improvement (#14321) · 0fe17f37
      Michael Benayoun authored
      * Change the way tracing happens, enabling dynamic axes out of the box
      
      * Update the tests and modeling xlnet
      
      * Add the non recoding of leaf modules to avoid recording more values for the methods to record than what will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).
      
      * Comments and making tracing work for gpt-j and xlnet
      
      * Refactore things related to num_choices (and batch_size, sequence_length)
      
      * Update fx to work on PyTorch 1.10
      
      * Postpone autowrap_function feature usage for later
      
      * Add copyrights
      
      * Remove unnecessary file
      
      * Fix issue with add_new_model_like
      
      * Apply suggestions
      0fe17f37
  17. 06 Jan, 2022 1 commit
  18. 29 Oct, 2021 1 commit
  19. 21 Jul, 2021 1 commit
  20. 01 Jul, 2021 1 commit
  21. 08 Jun, 2021 1 commit
  22. 04 May, 2021 1 commit
  23. 23 Dec, 2020 1 commit
    • Suraj Patil's avatar
      Add caching mechanism to BERT, RoBERTa (#9183) · 88ef8893
      Suraj Patil authored
      * add past_key_values
      
      * add use_cache option
      
      * make mask before cutting ids
      
      * adjust position_ids according to past_key_values
      
      * flatten past_key_values
      
      * fix positional embeds
      
      * fix _reorder_cache
      
      * set use_cache to false when not decoder, fix attention mask init
      
      * add test for caching
      
      * add past_key_values for Roberta
      
      * fix position embeds
      
      * add caching test for roberta
      
      * add doc
      
      * make style
      
      * doc, fix attention mask, test
      
      * small fixes
      
      * adress patrick's comments
      
      * input_ids shouldn't start with pad token
      
      * use_cache only when decoder
      
      * make consistent with bert
      
      * make copies consistent
      
      * add use_cache to encoder
      
      * add past_key_values to tapas attention
      
      * apply suggestions from code review
      
      * make coppies consistent
      
      * add attn mask in tests
      
      * remove copied from longformer
      
      * apply suggestions from code review
      
      * fix bart test
      
      * nit
      
      * simplify model outputs
      
      * fix doc
      
      * fix output ordering
      88ef8893
  24. 07 Dec, 2020 1 commit
  25. 24 Nov, 2020 1 commit
    • zhiheng-huang's avatar
      Support various BERT relative position embeddings (2nd) (#8276) · 2c83b3c3
      zhiheng-huang authored
      
      
      * Support BERT relative position embeddings
      
      * Fix typo in README.md
      
      * Address review comment
      
      * Fix failing tests
      
      * [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
      
      * make fix copies
      
      * fix configs of electra and albert and fix longformer
      
      * remove copy statement from longformer
      
      * fix albert
      
      * fix electra
      
      * Add bert variants forward tests for various position embeddings
      
      * [tiny] Fix style for test_modeling_bert.py
      
      * improve docstring
      
      * [tiny] improve docstring and remove unnecessary dependency
      
      * [tiny] Remove unused import
      
      * re-add to ALBERT
      
      * make embeddings work for ALBERT
      
      * add test for albert
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      2c83b3c3
  26. 17 Nov, 2020 1 commit
    • Sylvain Gugger's avatar
      Reorganize repo (#8580) · c89bdfbe
      Sylvain Gugger authored
      * Put models in subfolders
      
      * Styling
      
      * Fix imports in tests
      
      * More fixes in test imports
      
      * Sneaky hidden imports
      
      * Fix imports in doc files
      
      * More sneaky imports
      
      * Finish fixing tests
      
      * Fix examples
      
      * Fix path for copies
      
      * More fixes for examples
      
      * Fix dummy files
      
      * More fixes for example
      
      * More model import fixes
      
      * Is this why you're unhappy GitHub?
      
      * Fix imports in conver command
      c89bdfbe
  27. 16 Nov, 2020 1 commit
    • Sylvain Gugger's avatar
      Switch `return_dict` to `True` by default. (#8530) · 1073a2bd
      Sylvain Gugger authored
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Run on the real suite
      
      * Fix slow tests
      1073a2bd
  28. 03 Nov, 2020 1 commit
    • Patrick von Platen's avatar
      Refactoring the generate() function (#6949) · a1bbcf3f
      Patrick von Platen authored
      * first draft
      
      * show design proposition for new generate method
      
      * up
      
      * make better readable
      
      * make first version
      
      * gpt2 tests pass
      
      * make beam search for gpt2 work
      
      * add first encoder-decoder code
      
      * delete typo
      
      * make t5 work
      
      * save indermediate
      
      * make bart work with beam search
      
      * finish beam search bart / t5
      
      * add default kwargs
      
      * make more tests pass
      
      * fix no bad words sampler
      
      * some fixes and tests for all distribution processors
      
      * fix test
      
      * fix rag slow tests
      
      * merge to master
      
      * add nograd to generate
      
      * make all slow tests pass
      
      * speed up generate
      
      * fix edge case bug
      
      * small fix
      
      * correct typo
      
      * add type hints and docstrings
      
      * fix typos in tests
      
      * add beam search tests
      
      * add tests for beam scorer
      
      * fix test rag
      
      * finish beam search tests
      
      * move generation tests in seperate file
      
      * fix generation tests
      
      * more tests
      
      * add aggressive generation tests
      
      * fix tests
      
      * add gpt2 sample test
      
      * add more docstring
      
      * add more docs
      
      * finish doc strings
      
      * apply some more of sylvains and sams comments
      
      * fix some typos
      
      * make fix copies
      
      * apply lysandres and sylvains comments
      
      * final corrections on examples
      
      * small fix for reformer
      a1bbcf3f
  29. 30 Oct, 2020 1 commit
    • Lysandre Debut's avatar
      Ci test tf super slow (#8007) · 10f8c636
      Lysandre Debut authored
      * Test TF GPU CI
      
      * Change cache
      
      * Fix missing torch requirement
      
      * Fix some model tests
      
      
      Style
      
      * LXMERT
      
      * MobileBERT
      
      * Longformer skip test
      
      * XLNet
      
      * The rest of the tests
      
      * RAG goes OOM in multi gpu setup
      
      * YAML test files
      
      * Last fixes
      
      * Skip doctests
      
      * Fill mask tests
      
      * Yaml files
      
      * Last test fix
      
      * Style
      
      * Update cache
      
      * Change ONNX tests to slow + use tiny model
      10f8c636
  30. 18 Oct, 2020 1 commit
    • Thomas Wolf's avatar
      [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a
      Thomas Wolf authored
      * splitting fast and slow tokenizers [WIP]
      
      * [WIP] splitting sentencepiece and tokenizers dependencies
      
      * update dummy objects
      
      * add name_or_path to models and tokenizers
      
      * prefix added to file names
      
      * prefix
      
      * styling + quality
      
      * spliting all the tokenizer files - sorting sentencepiece based ones
      
      * update tokenizer version up to 0.9.0
      
      * remove hard dependency on sentencepiece 馃帀
      
      * and removed hard dependency on tokenizers 馃帀
      
      
      
      * update conversion script
      
      * update missing models
      
      * fixing tests
      
      * move test_tokenization_fast to main tokenization tests - fix bugs
      
      * bump up tokenizers
      
      * fix bert_generation
      
      * update ad fix several tokenizers
      
      * keep sentencepiece in deps for now
      
      * fix funnel and deberta tests
      
      * fix fsmt
      
      * fix marian tests
      
      * fix layoutlm
      
      * fix squeezebert and gpt2
      
      * fix T5 tokenization
      
      * fix xlnet tests
      
      * style
      
      * fix mbart
      
      * bump up tokenizers to 0.9.2
      
      * fix model tests
      
      * fix tf models
      
      * fix seq2seq examples
      
      * fix tests without sentencepiece
      
      * fix slow => fast  conversion without sentencepiece
      
      * update auto and bert generation tests
      
      * fix mbart tests
      
      * fix auto and common test without tokenizers
      
      * fix tests without tokenizers
      
      * clean up tests lighten up when tokenizers + sentencepiece are both off
      
      * style quality and tests fixing
      
      * add sentencepiece to doc/examples reqs
      
      * leave sentencepiece on for now
      
      * style quality split hebert and fix pegasus
      
      * WIP Herbert fast
      
      * add sample_text_no_unicode and fix hebert tokenization
      
      * skip FSMT example test for now
      
      * fix style
      
      * fix fsmt in example tests
      
      * update following Lysandre and Sylvain's comments
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      ba8c4d0a
  31. 26 Aug, 2020 1 commit
  32. 24 Aug, 2020 1 commit
  33. 20 Aug, 2020 1 commit
  34. 12 Aug, 2020 1 commit
  35. 04 Aug, 2020 1 commit
  36. 31 Jul, 2020 1 commit
  37. 01 Jul, 2020 1 commit
  38. 23 Jun, 2020 1 commit
  39. 16 Jun, 2020 1 commit
  40. 10 Jun, 2020 1 commit