1. 28 Mar, 2024 1 commit
  2. 25 Mar, 2024 1 commit
  3. 16 Feb, 2024 1 commit
  4. 01 Feb, 2024 1 commit
      Adding [T5/MT5/UMT5]ForTokenClassification (#28443) · 0d26abdd
      JB (Don) authored
      * Adding [T5/MT5/UMT5]ForTokenClassification
      
      * Add auto mappings for T5ForTokenClassification and variants
      
      * Adding ForTokenClassification to the list of models
      
      * Adding attention_mask param to the T5ForTokenClassification test
      
      * Remove outdated comment in test
      
      * Adding EncoderOnly and Token Classification tests for MT5 and UMT5
      
      * Fix typo in umt5 string
      
      * Add tests for all the existing MT5 models
      
      * Fix wrong comment in dependency_versions_table
      
      * Reverting change to common test for _keys_to_ignore_on_load_missing
      
      The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing.
      
      * Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model
      
      * Add fix-copies to MT5ModelTest
  5. 02 Nov, 2023 1 commit
  6. 01 Nov, 2023 1 commit
  7. 27 Oct, 2023 1 commit
  8. 25 Oct, 2023 1 commit
      [`core`] Refactor of `gradient_checkpointing` (#27020) · 06e782da
      Younes Belkada authored
      * v1
      
      * fix
      
      * remove `create_custom_forward`
      
      * fixup
      
      * fixup
      
      * add test and fix all failing GC tests
      
      * remove all remaining `create_custom_forward` methods
      
      * fix idefics bug
      
      * fixup
      
      * replace with `__call__`
      
      * add comment
      
      * quality
  9. 12 Oct, 2023 1 commit
  10. 11 Oct, 2023 1 commit
      In assisted decoding, pass model_kwargs to model's forward call (fix... · dcc49d8a
      Billy Bradley authored
      In assisted decoding, pass model_kwargs to model's forward call (fix prepare_input_for_generation in all models) (#25242)
      
      * In assisted decoding, pass model_kwargs to model's forward call
      
      Previously, assisted decoding would ignore any additional kwargs
      that it doesn't explicitly handle. This was inconsistent with other
      generation methods, which pass the model_kwargs through
      prepare_inputs_for_generation and forward the returned dict to the
      model's forward call.
      
      The prepare_inputs_for_generation method needs to be amended in all
      models, as previously it only kept the last input ID when a past_key_values
      was passed.
      
      * Improve variable names in _extend_attention_mask
      
      * Refactor extending token_type_ids into a function
      
      * Replace deepcopy with copy to optimize performance
      
      * Update new persimmon model with llama changes for assisted generation
      
      * Update new mistral model for assisted generation with prepare_inputs_for_generation
      
      * Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
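The first item of this commit message says that `prepare_inputs_for_generation` previously kept only the last input ID whenever a `past_key_values` was passed. A minimal sketch of why that matters for assisted decoding, assuming the main model must score several candidate tokens in one forward call: the helper names `keep_last_only` and `trim_to_uncached` are hypothetical, plain Python lists stand in for tensors, and this is not the library's actual code.

```python
def keep_last_only(input_ids, past_length):
    """Old behaviour as described in the commit message: with a cache, keep one token."""
    if past_length == 0:
        return input_ids
    return input_ids[-1:]


def trim_to_uncached(input_ids, past_length):
    """Illustrative alternative: keep every token the cache has not seen yet."""
    if past_length == 0:
        return input_ids
    return input_ids[past_length:]


# Three prompt tokens are already cached; an assistant proposed three candidates.
prompt = [5, 6, 7]
candidates = [8, 9, 10]
input_ids = prompt + candidates

print(keep_last_only(input_ids, past_length=len(prompt)))    # only the last candidate survives
print(trim_to_uncached(input_ids, past_length=len(prompt)))  # all three candidates survive
```

With the first helper the model would see only one of the candidate tokens, so the others could not be checked in the same call; the second keeps every uncached position.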
  11. 28 Sep, 2023 1 commit
      Do not warn about unexpected decoder weights when loading T5EncoderModel and... · 216dff75
      fleance authored
      Do not warn about unexpected decoder weights when loading T5EncoderModel and LongT5EncoderModel (#26211)
      
      Ignore decoder weights when using T5EncoderModel and LongT5EncoderModel
      
      Neither T5EncoderModel nor LongT5EncoderModel has any decoder layers, so
      loading a pretrained model checkpoint such as t5-small will give warnings about
      keys found in the model checkpoint that are not in the model itself.
      
      To prevent this log warning, r"decoder" has been added to _keys_to_ignore_on_load_unexpected for
      both T5EncoderModel and LongT5EncoderModel.
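The effect of adding r"decoder" to `_keys_to_ignore_on_load_unexpected` can be sketched as a regex filter over checkpoint keys. This is a simplified illustration, not the loader's real code: the function `filter_unexpected_keys` and the key names below are hypothetical, and it assumes the patterns are matched with `re.search`.

```python
import re


def filter_unexpected_keys(checkpoint_keys, model_keys, ignore_patterns):
    """Return checkpoint keys missing from the model, minus those matching an ignore pattern."""
    model_keys = set(model_keys)
    unexpected = [key for key in checkpoint_keys if key not in model_keys]
    for pattern in ignore_patterns:
        # Drop every key in which the pattern occurs anywhere.
        unexpected = [key for key in unexpected if re.search(pattern, key) is None]
    return unexpected


# Hypothetical key names for an encoder-decoder checkpoint.
checkpoint_keys = [
    "shared.weight",
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "decoder.block.0.layer.0.SelfAttention.q.weight",
]
# An encoder-only model has no decoder parameters.
encoder_only_keys = [
    "shared.weight",
    "encoder.block.0.layer.0.SelfAttention.q.weight",
]

warned_before = filter_unexpected_keys(checkpoint_keys, encoder_only_keys, [])
warned_after = filter_unexpected_keys(checkpoint_keys, encoder_only_keys, [r"decoder"])
print(warned_before)  # the decoder key would be reported as unexpected
print(warned_after)   # nothing left to warn about
```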
  12. 25 Jul, 2023 1 commit
      [`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726) · 8f36ab3e
      Sebastian Husch Lee authored
      * Initial addition of t5forsequenceclassification
      
      * Adding imports and adding tests
      
      * Formatting
      
      * Running make fix-copies
      
      * Adding mt5forseq
      
      * Formatting
      
      * run make fix-copies
      
      * Adding to docs
      
      * Add model_parallel
      
      * Fix bug
      
      * Fix
      
      * Remove TODO
      
      * Fixing tests for T5ForSequenceClassification
      
      * Undo changes to dependency_versions_table.py
      
      * Change classification head to work with T5Config directly
      
      * Change seq length to let tests pass
      
      * PR comments for formatting
      
      * Formatting
      
      * Initial addition of UMT5ForSequenceClassification
      
      * Adding to inits and formatting
      
      * run make fix-copies
      
      * Add doc for UMT5ForSeqClass
      
      * Update UMT5 config
      
      * Fix docs
      
      * Skip torch fx test for SequenceClassification
      
      * Formatting
      
      * Add skip to UMT5 tests as well
      
      * Fix umt5 tests
      
      * Running make fix-copies
      
      * PR comments
      
      * Fix for change to sentence_representation
      
      * Rename seq_len to hidden_size since that's what it is
      
      * Use base_model to follow format of the rest of the library
      
      * Update docs
      
      * Extract the decoder_input_ids changes and make one liner
      
      * Make one-liner
  13. 10 Jul, 2023 1 commit
  14. 27 Jun, 2023 2 commits
  15. 22 Jun, 2023 1 commit
  16. 21 Jun, 2023 1 commit
  17. 13 Jun, 2023 1 commit
      Tied params cleanup (#24211) · 695928e1
      Sylvain Gugger authored
      * First test
      
      * Add info for all models
      
      * style
      
      * Repo consistency
      
      * Fix last model and cleanup prints
      
      * Repo consistency
      
      * Use consistent function for detecting tied weights
  18. 12 May, 2023 1 commit
  19. 20 Apr, 2023 1 commit
  20. 03 Apr, 2023 1 commit
  21. 15 Mar, 2023 1 commit
  22. 09 Mar, 2023 1 commit
  23. 28 Feb, 2023 1 commit
  24. 27 Feb, 2023 1 commit
  25. 07 Feb, 2023 2 commits
  26. 06 Feb, 2023 1 commit
      Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
  27. 24 Jan, 2023 1 commit
  28. 23 Jan, 2023 1 commit
  29. 08 Jan, 2023 1 commit
      Replace `past` with `past_key_values` (#20944) · f0577df6
      Arthur authored
      * start cleanup
      
      * more updates
      
      * more models are affected
      
      * more updates
      
      * update generation utils
      
      * style
      
      * revert change that removed reorder cache
      
      * update generation utils
      
      * style
      
      * style
      
      * remove reorder cache
  30. 03 Jan, 2023 1 commit
  31. 15 Dec, 2022 1 commit
  32. 13 Dec, 2022 1 commit
  33. 06 Dec, 2022 1 commit
  34. 09 Nov, 2022 1 commit
      Attempting to test automatically the `_keys_to_ignore`. (#20042) · bac2d29a
      Nicolas Patry authored
      
      
      * Attempting to test automatically the `_keys_to_ignore`.
      
      * Style.
      
      * First fix pass.
      
      * Moving test on its own.
      
      * Another batch.
      
      * Second round removing BatchNorm
      
      * Fixing layoutlmv{2,3} + support older Python.
      
      * Disable miss missing warning.
      
      * Removing dodgy additions.
      
      * Big pass.
      
      * mbart.
      
      * More corrections.
      
      * Fixup.
      
      * Updating test_correct_missing_keys
      
      * Add escape hatch for when the head has no extra params so doesn't need the missing keys check.
      
      * Fixing test.
      
      * Greener.
      
      * Green ! (except for weird splinter bug).
      
      * Adding a test about `named_parameters` usage.
      
      * Shorten message.
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * After rebase modifications.
      
      * More explicit condition checking.
      
      * Fixing slow tests issues.
      
      * Remove extra pdb.
      
      * Remove print.
      
      * Attempt to make failure consistent + fixing roc_bert.
      
      * Removing the seed  (all tests passing with it).
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  35. 06 Sep, 2022 2 commits
  36. 27 Jul, 2022 1 commit
  37. 06 Jul, 2022 1 commit