1. 08 Apr, 2024 3 commits
  2. 05 Apr, 2024 2 commits
  3. 04 Apr, 2024 1 commit
    • [`ProcessingIdefics`] Attention mask bug with padding (#29449) · 75b76a5e
      byi8220 authored
      * Defaulted IdeficsProcessor padding to 'longest', removed manual padding
      
      * make fixup
      
      * Defaulted processor call to padding=False
      
      * Add padding to processor call in IdeficsModelIntegrationTest as well
      
      * redefaulted padding=longest again
      
      * fixup/doc
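The commit above concerns the attention mask that accompanies padded inputs. As a minimal sketch of what padding to the longest sequence implies for that mask (this is not the IdeficsProcessor code; the helper name `pad_to_longest`, the pad id `0`, and right-side padding are assumptions made for illustration):

```python
def pad_to_longest(batch, pad_id=0):
    """Pad lists of token ids to the longest length and build a matching mask.

    Hypothetical helper: real tokens get mask value 1, padding gets 0.
    """
    longest = max(len(ids) for ids in batch)
    input_ids, attention_mask = [], []
    for ids in batch:
        n_pad = longest - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return input_ids, attention_mask


ids, mask = pad_to_longest([[5, 6, 7], [8]])
```

The point of the sketch is that the mask must be derived from the same padding step as the ids, so that the two cannot disagree about which positions are padding.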
  4. 03 Apr, 2024 4 commits
  5. 02 Apr, 2024 4 commits
    • Fix `skip_special_tokens` for `Wav2Vec2CTCTokenizer._decode` (#29311) · 15cd6871
      Minsub Lee (Matt) authored
      * Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode
      
      * Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode
      
      * Exclude pad_token filtering since it is used as CTC-blank token
      
      * Add small test for skip_special_tokens
      
      * Update decoding test for added new token
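The note about excluding the pad token from special-token filtering follows from how CTC decoding works: the blank token separates genuine repeated characters, so it has to survive until repeats have been collapsed. A minimal sketch of that ordering, assuming a toy vocabulary and a hypothetical `ctc_decode` helper (not the tokenizer's actual `_decode`):

```python
from itertools import groupby


def ctc_decode(token_ids, id_to_token, pad_token="<pad>",
               special_tokens=("<s>", "</s>", "<unk>"),
               skip_special_tokens=False):
    """Greedy CTC post-processing on a sequence of predicted ids."""
    tokens = [id_to_token[i] for i in token_ids]
    # 1. Collapse consecutive repeats; the blank keeps "l <pad> l" apart.
    collapsed = [tok for tok, _ in groupby(tokens)]
    # 2. Only now drop the blank (here the pad token doubles as CTC blank).
    no_blank = [tok for tok in collapsed if tok != pad_token]
    # 3. Optionally drop the remaining special tokens.
    if skip_special_tokens:
        no_blank = [tok for tok in no_blank if tok not in special_tokens]
    return "".join(no_blank)


vocab = {0: "<pad>", 1: "<s>", 2: "h", 3: "e", 4: "l", 5: "o"}
pred = [1, 2, 2, 3, 4, 0, 4, 5, 5]
text = ctc_decode(pred, vocab, skip_special_tokens=True)
```

If the pad token were filtered out together with the other special tokens before step 1, the two "l" tokens would become adjacent and collapse into one.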
    • Add Flash Attention 2 support to Musicgen and Musicgen Melody (#29939) · 0d04b1e2
      Yoach Lacombe authored
      * add FA2 to o.g Musicgen
      
      * make style
      
      * add FA2 support to Musicgen Melody
      
      * add generation FA2 tests to o.g Musicgen
      
      * make style and fix copies
      
      * add Musicgen to FA2 docs + deprecate list
      
      * add sdpa support to the Musicgen models
      
      * make style and fix copies
      
      * refactor attention implementation arguments
      
      * add Copied from to sdpa tests
      
      * add Copied from in sdpa tests for melody
      
      * add copied for FA2 generation tests
      
      * add FA2 inference copied from
      
      * make style
    • Fix 29807 sinusoidal positional encodings in Flaubert, Informer and XLM (#29904) · 416711c3
      Hovnatan Karapetyan authored
      * Fix sinusoidal_embeddings in FlaubertModel
      
      * Fix for Informer
      
      * Fix for XLM
      
      * Move sinusoidal emb for XLM
      
      * Move sinusoidal emb for Flaubert
      
      * Small cleanup
      
      * Add comments on tests code copied from
      
      * Add with Distilbert->
    • [`generate`] fix breaking change for patch (#29976) · 83b26dd7
      Arthur authored
      * fix bug and add tests
      
      * nit
      
      * another way to get the current length instead of using the attention mask
      
      * more places where this might have been broken
      
      * nit
      
      * oups
      
      * inputs_embeds vs input_embeds
      
      * test generated outputs
      
      * style
      
      * nit
      
      * fix
      
      * skip failing biogpt
  6. 01 Apr, 2024 2 commits
  7. 29 Mar, 2024 1 commit
  8. 28 Mar, 2024 5 commits
  9. 27 Mar, 2024 4 commits
    • MixtralSparseMoeBlock: add gate jitter (#29865) · a25037be
      Lorenzo Verardo authored
      This commit adds gate jitter to MixtralSparseMoeBlock's input data
      before passing it through the MoE layer, if turned on.
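As a rough sketch of what jittering the input of an MoE block can look like, the example below multiplies every input value by a random factor close to 1 before routing. The multiplicative uniform noise, the `jitter_noise` parameter name, and the `apply_gate_jitter` helper are assumptions for illustration, written in plain Python rather than with tensors:

```python
import random


def apply_gate_jitter(hidden_states, jitter_noise, training=True, rng=random):
    """Scale each value by a factor drawn from [1 - jitter, 1 + jitter].

    Hypothetical helper: the input is returned untouched when jitter is
    turned off (jitter_noise <= 0) or when not training.
    """
    if not training or jitter_noise <= 0:
        return hidden_states
    low, high = 1.0 - jitter_noise, 1.0 + jitter_noise
    return [[x * rng.uniform(low, high) for x in row] for row in hidden_states]


states = [[1.0, -2.0], [0.5, 4.0]]
jittered = apply_gate_jitter(states, jitter_noise=0.1, rng=random.Random(0))
```

Because the noise is only applied when explicitly enabled and in training mode, inference results stay deterministic under this sketch.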
    • Fix 29807, sinusoidal positional encodings overwritten by post_init() (#29813) · a81cf9ee
      Hovnatan Karapetyan authored
      * Check for requires_grad when initing weights
      
      * Add unit test
      
      * Move sinusoidal positional encoding generation after post_init()
      
      * Add modules to skip init list
      
      * Move create_sinusoidal_embeddings to _init_weights
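Sinusoidal positional encodings are a fixed, non-trainable table, which is why a generic weight-initialisation pass that runs afterwards can silently replace them with random values. A minimal pure-Python sketch of such a table, assuming the standard interleaved sine/cosine layout (the function name echoes the commit message, but the body is an illustration rather than the library's code):

```python
import math


def create_sinusoidal_embeddings(n_pos, dim):
    """Return an n_pos x dim table: sine at even columns, cosine at odd ones."""
    table = [[0.0] * dim for _ in range(n_pos)]
    for pos in range(n_pos):
        for j in range(dim):
            # Columns 2i and 2i + 1 share the same frequency.
            angle = pos / (10000 ** (2 * (j // 2) / dim))
            table[pos][j] = math.sin(angle) if j % 2 == 0 else math.cos(angle)
    return table


table = create_sinusoidal_embeddings(n_pos=4, dim=6)
```

Generating the table inside the weight-initialisation step, as the last bullet of the commit describes, means it is written at the point where other weights are set rather than before a pass that could overwrite it.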
    • Mamba `slow_forward` gradient fix (#29563) · cefb819f
      Anton Vlasjuk authored
      * FIX: Cached slow forward in mamba
      - additionally added mamba cached test
      - added unused test (mamba causal lm forward and backward)
      - fixed typo: "causl" --> "causal"
      
      * formatting
      
      * fix: use real `slow_forward` call instead of torch module's
      
      * add shape assertion for mixer block test
      
      * adjust shape assertion
    • Add Qwen2MoE (#29377) · 1c39974a
      Bo Zheng authored
      
      
      * add support for qwen2 MoE models
      
      * update docs
      
      * update model name & test
      
      * update readme
      
      * update class names & readme & model_doc of Qwen2MoE.
      
      * update architecture name
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * update modeling_qwen2_moe.py
      
      * fix model architecture
      
      * fix style
      
      * fix test when there are sparse and non sparse layers
      
      * fixup
      
      * Update README.md
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fixup
      
      * fixup
      
      * add archive back
      
      * fix integration test
      
      * fixup
      
      ---------
      Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  10. 25 Mar, 2024 1 commit
  11. 22 Mar, 2024 1 commit
  12. 21 Mar, 2024 1 commit
  13. 20 Mar, 2024 5 commits
  14. 19 Mar, 2024 3 commits
    • Clean-up generation tests after moving methods to private (#29582) · 425ba56c
      Raushan Turganbay authored
      * clean-up tests
      
      * refine comments
      
      * fix musicgen tests
      
      * make style
      
      * remove slow decorator from a test
      
      * more clean-up
      
      * fix other failing tests
    • Implementation of SuperPoint and AutoModelForKeypointDetection (#28966) · 56baa033
      StevenBucaille authored
      
      
      * Added SuperPoint docs
      
      * Added tests
      
      * Removed commented part
      
      * Commit to create and fix add_superpoint branch with a new branch
      
      * Fixed dummy_pt_objects
      
      * Committed missing files
      
      * Fixed README.md
      
      * Apply suggestions from code review
      
      Fixed small changes
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Moved ImagePointDescriptionOutput from modeling_outputs.py to modeling_superpoint.py
      
      * Removed AutoModelForKeypointDetection and related stuff
      
      * Fixed inconsistencies in image_processing_superpoint.py
      
      * Moved infer_on_model logic simply in test_inference
      
      * Fixed bugs, added labels to forward method with checks whether it is properly a None value, also added tests about this logic in test_modeling_superpoint.py
      
      * Added tests to SuperPointImageProcessor to ensure that images are properly converted to grayscale
      
      * Removed remaining mentions of MODEL_FOR_KEYPOINT_DETECTION_MAPPING
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Fixed from (w, h) to (h, w) as input for tests
      
      * Removed unnecessary condition
      
      * Moved last_hidden_state to be the first returned
      
      * Moved last_hidden_state to be the first returned (bis)
      
      * Moved last_hidden_state to be the first returned (ter)
      
      * Switched image_width and image_height in tests to match recent changes
      
      * Added config as first SuperPointConvBlock init argument
      
      * Reordered README's after merge
      
      * Added missing first config argument to SuperPointConvBlock instantiations
      
      * Removed formatting error
      
      * Added SuperPoint to README's de, pt-br, ru, te and vi
      
      * Checked out README_fr.md
      
      * Fixed README_fr.md
      
      * Test fix README_fr.md
      
      * Test fix README_fr.md
      
      * Last make fix-copies !
      
      * Updated checkpoint path
      
      * Removed unused SuperPoint doc
      
      * Added missing image
      
      * Update src/transformers/models/superpoint/modeling_superpoint.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Removed unnecessary import
      
      * Update src/transformers/models/superpoint/modeling_superpoint.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Added SuperPoint to _toctree.yml
      
      ---------
      Co-authored-by: steven <steven.bucaillle@gmail.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
    • [`GemmaConverter`] use user_defined_symbols (#29473) · 2f9a3edb
      Arthur authored
      * use user_defined_symbols
      
      * fixup
      
      * nit
      
      * add a very robust test
      
      * make sure all models are tested with the `pretrained_tokenizer_to_test`
      
      * should we make sure we test all of them?
      
      * merge
      
      * remove the id
      
      * fix test
      
      * update
      
      * ousies
      
      * oups
      
      * fixup
      
      * fix copies check
      
      * remove `pretrained_tokenizer_to_test`
  15. 18 Mar, 2024 1 commit
    • Add MusicGen Melody (#28819) · c43b380e
      Yoach Lacombe authored
      
      
      * first modeling code
      
      * make repository
      
      * still WIP
      
      * update model
      
      * add tests
      
      * add latest change
      
      * clean docstrings and copied from
      
      * update docstrings md and readme
      
      * correct chroma function
      
      * correct copied from and remove unrelated test
      
      * add doc to toctree
      
      * correct imports
      
      * add convert script to notdoctested
      
      * Add suggestion from Sanchit
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * correct get_unconditional_inputs docstrings
      
      * modify README according to SANCHIT feedback
      
      * add chroma to audio utils
      
      * clean librosa and torchaudio hard dependencies
      
      * fix FE
      
      * refactor audio decoder -> audio encoder for consistency with previous musicgen
      
      * refactor conditional -> encoder
      
      * modify sampling rate logics
      
      * modify license at the beginning
      
      * refactor all_self_attns->all_attentions
      
      * remove ignore copy from causallm generate
      
      * add copied from for from_sub_models
      
      * fix make copies
      
      * add warning if audio is truncated
      
      * add copied from where relevant
      
      * remove artefact
      
      * fix convert script
      
      * fix torchaudio and FE
      
      * modify chroma method according to feedback-> better naming
      
      * refactor input_values->input_features
      
      * refactor input_values->input_features and fix import fe
      
      * add input_features to docstrings
      
      * correct inputs_embeds logics
      
      * remove dtype conversion
      
      * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation
      
      * change warning for chroma length
      
      * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * change way to save wav, using soundfile
      
      * correct docs and change to soundfile
      
      * fix import
      
      * fix init proj layers
      
      * remove line breaks from md
      
      * fix issue with docstrings
      
      * add FE suggestions
      
      * improve is in logics and remove useless imports
      
      * remove custom from_pretrained
      
      * simplify docstring code
      
      * add suggestions for modeling tests
      
      * make style
      
      * update converting script with sanity check
      
      * remove encoder attention mask from conditional generation
      
      * replace musicgen melody checkpoints with official orga
      
      * rename ylacombe->facebook in checkpoints
      
      * fix copies
      
      * remove unnecessary warning
      
      * add shape in code docstrings
      
      * add files to slow doc tests
      
      * fix md bug and add md to not_tested
      
      * make fix-copies
      
      * fix hidden states test and batching
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
  16. 15 Mar, 2024 2 commits