1. 19 Jun, 2024 1 commit
    • [`GPT2`] Add SDPA support (#31172) · b275a410
      Anton Vlasjuk authored
      * `gpt2` sdpa support
      
      * fix (at least) one test, style, repo consistency
      
      * fix sdpa mask in forward --> fixes generation
      
      * test
      
      * test2
      
      * test3
      
      * test4
      
      * simplify shapes for attn mask creation and small comments
      
      * hub fail test
      
      * benchmarks
      
      * flash attn 2 mask should not be inverted on enc-dec setup
      
      * fix comment
      
      * apply some suggestions from code review
      
      - only save `_attn_implementation` once
      - remove unnecessary comment
      
      * change elif logic
      
      * [run-slow] gpt2
      
      * modify `test_gpt2_sample_max_time` to follow previous assertion patterns
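An editorial aside: SDPA here is PyTorch's `torch.nn.functional.scaled_dot_product_attention`. The causal masking the commits wrestle with ("fix sdpa mask in forward") can be sketched in dependency-free Python — a toy single-head reference of what the fused kernel computes, not the kernel itself:

```python
import math

def causal_sdpa(q, k, v):
    """Toy scaled dot-product attention with a causal mask.

    q, k, v: lists of equal-length vectors, one per position.
    Position i may only attend to positions j <= i (the causal mask).
    """
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        # scores against the *visible* keys only (mask drops j > i)
        scores = [sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        # numerically stable softmax over the visible positions
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        # weighted sum of the visible values
        out.append([sum(w * v[j][t] for j, w in enumerate(weights))
                    for t in range(d)])
    return out
```

With a single position the output is exactly `v[0]`; later positions mix earlier values only, which is the property the mask fix restores for generation.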
  2. 28 Mar, 2024 1 commit
  3. 25 Mar, 2024 1 commit
  4. 16 Feb, 2024 1 commit
  5. 16 Nov, 2023 1 commit
    • [`Styling`] stylify using ruff (#27144) · 651408a0
      Arthur authored
      
      
      * try to stylify using ruff
      
      * might need to remove these changes?
      
      * use `ruff format` and `ruff check`
      
      * use isinstance instead of type comparison
      
      * use # fmt: skip
      
      * use # fmt: skip
      
      * nits
      
      * some styling changes
      
      * update ci job
      
      * nits isinstance
      
      * more files update
      
      * nits
      
      * more nits
      
      * small nits
      
      * check and format
      
      * revert wrong changes
      
      * actually use formatter instead of checker
      
      * nits
      
      * well docbuilder is overwriting this commit
      
      * revert notebook changes
      
      * try to nuke docbuilder
      
      * style
      
      * fix feature extraction test
      
      * remove `indent-width = 4`
      
      * fixup
      
      * more nits
      
      * update the ruff version that we use
      
      * style
      
      * nuke docbuilder styling
      
      * leave the print for detected changes
      
      * nits
      
      * Remove file I/O
      Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
      
      * style
      
      * nits
      
      * revert notebook changes
      
      * Add # fmt skip when possible
      
      * Add # fmt skip when possible
      
      * Fix
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * Nits
      
      * more fixes
      
      * fix tapas
      
      * Another way to skip
      
      * Recommended way
      
      * Fix two more files
      
      * Remove asynch
      
      ---------
      Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
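An aside on the `# fmt: skip` pragma these commits lean on: it tells the formatter (Black, and Ruff's formatter) to leave a single, deliberately formatted statement untouched. A minimal illustration — the variable names are made up for the example:

```python
# Hand-aligned assignments that `ruff format` (or black) would otherwise
# collapse onto normalized spacing; the pragma opts each statement out.
short_name       = "gpt2"     # fmt: skip
much_longer_name = "gpt2-xl"  # fmt: skip
```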
  6. 31 Oct, 2023 1 commit
    • device agnostic models testing (#27146) · 50378cbf
      Hz, Ji authored
      * device agnostic models testing
      
      * add decorator `require_torch_fp16`
      
      * make style
      
      * apply review suggestion
      
      * Oops, the fp16 decorator was misused
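The `require_torch_fp16` decorator this commit adds follows the usual `transformers.testing_utils` skip-decorator pattern. A simplified stand-in — the capability probe is a placeholder here, not the real device check:

```python
import unittest

def require_fp16(test_case):
    """Skip a test unless the current accelerator supports float16.

    Sketch of the `require_torch_fp16` idea: in the real decorator the
    condition probes the torch device instead of being hard-coded.
    """
    fp16_available = False  # placeholder: probe the accelerator here
    return unittest.skipUnless(fp16_available, "test requires fp16 support")(test_case)

@require_fp16
def test_fp16_generation():
    ...  # body never runs when fp16 is unavailable
```

Concentrating the device check in one decorator is what makes the model tests "device agnostic": the test body no longer needs CUDA-specific guards.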
  7. 30 Oct, 2023 1 commit
  8. 02 Aug, 2023 1 commit
  9. 27 Jun, 2023 1 commit
  10. 22 Jun, 2023 1 commit
  11. 21 Jun, 2023 1 commit
  12. 08 Jun, 2023 1 commit
  13. 02 May, 2023 1 commit
  14. 10 Mar, 2023 2 commits
  15. 28 Feb, 2023 1 commit
    • Yih-Dar's avatar
      馃敟Rework pipeline testing by removing `PipelineTestCaseMeta` 馃殌 (#21516) · 871c31a6
      Yih-Dar authored
      
      
      * Add PipelineTesterMixin
      
      * remove class PipelineTestCaseMeta
      
      * move validate_test_components
      
      * Add for ViT
      
      * Add to SPECIAL_MODULE_TO_TEST_MAP
      
      * style and quality
      
      * Add feature-extraction
      
      * update
      
      * raise instead of skip
      
      * add tiny_model_summary.json
      
      * more explicit
      
      * skip tasks not in mapping
      
      * add availability check
      
      * Add Copyright
      
      * A way to disable irrelevant tests
      
      * update with main
      
      * remove disable_irrelevant_tests
      
      * skip tests
      
      * better skip message
      
      * better skip message
      
      * Add all pipeline task tests
      
      * revert
      
      * Import PipelineTesterMixin
      
      * subclass test classes with PipelineTesterMixin
      
      * Add pipeline_model_mapping
      
      * Fix import after adding pipeline_model_mapping
      
      * Fix style and quality after adding pipeline_model_mapping
      
      * Fix one more import after adding pipeline_model_mapping
      
      * Fix style and quality after adding pipeline_model_mapping
      
      * Fix test issues
      
      * Fix import requirements
      
      * Fix mapping for MobileViTModelTest
      
      * Update
      
      * Better skip message
      
      * pipeline_model_mapping could not be None
      
      * Remove some PipelineTesterMixin
      
      * Fix typo
      
      * revert tests_fetcher.py
      
      * update
      
      * rename
      
      * revert
      
      * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests
      
      * style and quality
      
      * test fetcher for all pipeline/model tests
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
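The core idea of the rework — each model test class declares a `pipeline_model_mapping`, and the mixin skips any pipeline task the model does not map to — can be sketched like this (names mirror the PR, logic heavily simplified):

```python
import unittest

class PipelineTesterMixin:
    # Each concrete test class declares which pipeline tasks its model supports.
    pipeline_model_mapping = None

    def run_pipeline_test(self, task):
        # "skip tasks not in mapping": unsupported tasks raise SkipTest
        # instead of failing, with an explicit skip message.
        if not self.pipeline_model_mapping or task not in self.pipeline_model_mapping:
            raise unittest.SkipTest(
                f"{type(self).__name__} has no model class for task {task!r}"
            )
        model_class = self.pipeline_model_mapping[task]
        # ... build a tiny model of `model_class` and exercise the pipeline ...
        return model_class

class GPT2ModelTest(PipelineTesterMixin):
    pipeline_model_mapping = {"text-generation": "GPT2LMHeadModel"}
```

Unlike the old `PipelineTestCaseMeta` metaclass, a plain mixin keeps the task-to-model wiring visible in each test class.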
  16. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
  17. 09 Nov, 2022 1 commit
  18. 01 Nov, 2022 1 commit
  19. 08 Jun, 2022 1 commit
  20. 03 May, 2022 1 commit
    • Move test model folders (#17034) · 19420fd9
      Yih-Dar authored
      
      
      * move test model folders (TODO: fix imports and others)
      
      * fix (potentially partially) imports (in model test modules)
      
      * fix (potentially partially) imports (in tokenization test modules)
      
      * fix (potentially partially) imports (in feature extraction test modules)
      
      * fix import utils.test_modeling_tf_core
      
      * fix path ../fixtures/
      
      * fix imports about generation.test_generation_flax_utils
      
      * fix more imports
      
      * fix fixture path
      
      * fix get_test_dir
      
      * update module_to_test_file
      
      * fix get_tests_dir from wrong transformers.utils
      
      * update config.yml (CircleCI)
      
      * fix style
      
      * remove missing imports
      
      * update new model script
      
      * update check_repo
      
      * update SPECIAL_MODULE_TO_TEST_MAP
      
      * fix style
      
      * add __init__
      
      * update self-scheduled
      
      * fix add_new_model scripts
      
      * check one way to get location back
      
      * python setup.py build install
      
      * fix import in test auto
      
      * update self-scheduled.yml
      
      * update slack notification script
      
      * Add comments about artifact names
      
      * fix for yolos
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  21. 15 Mar, 2022 1 commit
    • TF XLA greedy generation (#15786) · cd4c5c90
      Matt authored
      
      
      * First attempt at TF XLA generation
      
      * Fix comments
      
      * Update XLA greedy generate with direct XLA calls
      
      * Support attention mask, prepare_inputs_for_generation no longer hardcoded for greedy
      
      * Handle position_ids correctly
      
      * make xla generate work for non xla case
      
      * force using xla generate
      
      * refactor
      
      * more fixes
      
      * finish cleaning
      
      * finish
      
      * finish
      
      * clean gpt2 tests
      
      * add gpt2 tests
      
      * correct more cases
      
      * up
      
      * finish
      
      * finish
      
      * more fixes
      
      * flake 8 stuff
      
      * final rag fix
      
      * Update src/transformers/models/rag/modeling_tf_rag.py
      
      * finish t5 as well
      
      * finish
      
      * Update src/transformers/generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  22. 23 Feb, 2022 1 commit
  23. 07 Feb, 2022 1 commit
    • FX tracing improvement (#14321) · 0fe17f37
      Michael Benayoun authored
      * Change the way tracing happens, enabling dynamic axes out of the box
      
      * Update the tests and modeling xlnet
      
      * Stop recording inside leaf modules, so that no more values are recorded for a method than will be seen at tracing time (otherwise the recorded values and the values that must be fed to the proxies during tracing desynchronize, causing errors)
      
      * Comments and making tracing work for gpt-j and xlnet
      
      * Refactor things related to num_choices (and batch_size, sequence_length)
      
      * Update fx to work on PyTorch 1.10
      
      * Postpone autowrap_function feature usage for later
      
      * Add copyrights
      
      * Remove unnecessary file
      
      * Fix issue with add_new_model_like
      
      * Apply suggestions
      0fe17f37
  24. 29 Oct, 2021 1 commit
    • Remove n_ctx from configs (#14165) · 5b45422b
      Thomas Wang authored
      * Remove n_ctx from configs
      
      * Fix GPTJ and OpenAIGPT; both are acceptable breaking changes since no existing configs are affected
      
      * Remove unnecessary n_positions from TFOpenAIGPT
  25. 04 Oct, 2021 1 commit
    • Add Mistral GPT-2 Stability Tweaks (#13573) · 3a8de58c
      Sidd Karamcheti authored
      
      
      * Add layer-wise scaling
      
      * Add reorder & upcasting argument
      
      * Add OpenAI GPT-2 weight initialization scheme
      
      * start `layer_idx` count at zero for consistency
      
      * disentangle attn and reordered and upscaled attn function
      
      * rename `scale_attn_by_layer` to `scale_attn_by_layer_id`
      
      * make autocast from amp compatible with pytorch<1.6
      
      * fix docstring
      
      * style fixes
      
      * Add fixes from PR feedback, style tweaks
      
      * Fix doc whitespace
      
      * Reformat
      
      * First pass scale_attn_by_layer_idx and reorder_and_upcast_attn tests
      
      * Rename scale_attn_by_layer_idx, add tip
      
      * Remove extra newline
      
      * add test for weight initialization
      
      * update code format
      
      * add assert check weights are fp32
      
      * remove assert
      
      * Fix incorrect merge
      
      * Fix shape mismatch in baddbmm
      
      * Add generation test for Mistral flags
      Co-authored-by: leandro <leandro.vonwerra@spoud.io>
      Co-authored-by: Keshav Santhanam <keshav2@stanford.edu>
      Co-authored-by: J38 <jebolton@stanford.edu>
  26. 22 Sep, 2021 1 commit
  27. 17 Sep, 2021 1 commit
  28. 31 Aug, 2021 1 commit
    • Add GPT2ForTokenClassification (#13290) · 41c55941
      tucan9389 authored
      
      
      * Add GPT2ForTokenClassification
      
      * Fix dropout exception for GPT2 NER
      
      * Remove sequence label in test
      
      * Change TokenClassifierOutput to TokenClassifierOutputWithPast
      
      * Fix for black formatter
      
      * Remove dummy
      
      * Update docs for GPT2ForTokenClassification
      
      * Fix check_inits ci fail
      
      * Update dummy_pt_objects after make fix-copies
      
      * Remove TokenClassifierOutputWithPast
      
      * Fix tuple input issue
      Co-authored-by: danielsejong55@gmail.com <danielsejong55@gmail.com>
  29. 21 Jul, 2021 1 commit
  30. 28 May, 2021 1 commit
  31. 20 May, 2021 1 commit
  32. 15 Mar, 2021 1 commit
  33. 12 Mar, 2021 1 commit
    • Adding new parameter to `generate`: `max_time`. (#9846) · 543d0549
      Nicolas Patry authored
      * [WIP] Adding new parameter to `generate`:  `max_time`.
      
      Generation by token count is sometimes a bit clunky, because we don't
      know how many tokens are good enough, or even how many tokens are in
      the payload (for pipeline users, for instance). This leads to
      hard-to-understand behavior.
      
      This PR proposes a new argument `max_time`, a float number of seconds
      that `generate` is allowed to run for.
      Ideally, combinations such as `max_tokens=None`, `max_time=2` could be
      used to generate as many tokens as possible within the time budget.
      
      NB: Another possible approach consists of passing a callback to `generate`
        putting the caller in charge of the actual decision of when to stop
        generating tokens. It opens the door to 'which args should we pass'
        to this callback. It's hard to imagine other use-cases for this
        early stopping behavior than time (that are not already covered by
        parameters of generate)
      
      * Revamp with StoppingCriteria
      
      * Removing deprecated mentions.
      
      * Forgot arguments to stopping criteria.
      
      * Re-adding max_length: it's not just used as a stopping criterion.
      
      * Default value for `stopping_criteria`.
      
      * Address @patrickvonplaten comments.
      
      - More docstrings
      - Actual doc
      - Include in global namespace
      - Remove TF work.
      
      * Put back `max_length` (deprecation different PR).
      
      * Doc quality.
      
      * Fixing old behavior without `stopping_criteria` but with `max_length`.
      
      Making sure we don't break that in the future.
      
      * Adding more tests for possible inconsistencies between `max_length`
        and `stopping_criteria`.
      
      * Fixing the torch imports.
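The mechanism landed as a `StoppingCriteria` (`MaxTimeCriteria` in `transformers`). The idea fits in a few lines of stdlib Python — a sketch of the pattern only, with a dummy token loop standing in for `generate` and a simplified call signature:

```python
import time

class MaxTimeCriteria:
    """Stop generation once the wall-clock budget is spent.

    Sketch: the real transformers class is called with (input_ids, scores);
    here the criterion takes no arguments to keep the example self-contained.
    """

    def __init__(self, max_time: float):
        self.max_time = max_time
        self.start = time.monotonic()

    def __call__(self) -> bool:
        return time.monotonic() - self.start > self.max_time

def toy_generate(stopping_criteria, max_steps=100_000):
    tokens = []
    for step in range(max_steps):
        if stopping_criteria():
            break
        tokens.append(step)  # a real model would sample a token here
    return tokens
```

Combining `max_time=2` with `max_length=None` then yields as many tokens as fit in the budget, exactly the behavior the PR describes.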
  34. 19 Jan, 2021 1 commit
  35. 22 Dec, 2020 1 commit
  36. 07 Dec, 2020 1 commit
  37. 23 Nov, 2020 1 commit
  38. 17 Nov, 2020 1 commit
  39. 16 Nov, 2020 1 commit
    • Switch `return_dict` to `True` by default. (#8530) · 1073a2bd
      Sylvain Gugger authored
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Run on the real suite
      
      * Fix slow tests
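What flipping `return_dict=True` changes: model outputs become `ModelOutput` objects with named attribute access that still unpack like the legacy tuples, so old call sites keep working. The dual behavior can be sketched as (a toy class; the real one lives in `transformers.utils` and does considerably more):

```python
class ToyModelOutput(tuple):
    """Named model outputs that still behave like the legacy tuples."""

    def __new__(cls, **kwargs):
        # Tuple part preserves field order for old-style positional access.
        self = super().__new__(cls, kwargs.values())
        self._fields = dict(kwargs)
        return self

    def __getattr__(self, name):
        # Attribute part enables the new-style `output.logits` access.
        fields = self.__dict__.get("_fields", {})
        if name in fields:
            return fields[name]
        raise AttributeError(name)
```

This is why the commit's work is mostly mechanical: tests and examples that indexed outputs positionally are migrated to the clearer named form.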