1. 24 May, 2024 2 commits
  2. 23 May, 2024 5 commits
    • Yasmin Moslem's avatar
      Remove deprecated properties in tokenization_nllb.py and tokenization_nllb_fast.py (#29834) · 6d3d5b10
      Yasmin Moslem authored
      * Fix typo in tokenization_nllb.py
      
      Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
      
      * Fix typo in tokenization_nllb_fast.py
      
      Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
      
      * Remove deprecated attributes in tokenization_nllb.py
      
      Remove deprecated attributes: `lang_code_to_id`, `fairseq_tokens_to_ids`, `id_to_lang_code`, and `fairseq_ids_to_tokens`
      
      * Remove deprecated attribute in tokenization_nllb_fast.py
      
      Remove deprecated attribute `lang_code_to_id`
      
      * Remove deprecated properties in tokenization_nllb.py
      
      Remove deprecated properties - fix format
      
      * Remove deprecated properties in tokenization_nllb_fast.py
      
      Remove deprecated properties - fix format
      
      * Update test_tokenization_nllb.py
      
      * update test_tokenization_nllb.py
      
      * Update tokenization_nllb.py
      
      * Update test_tokenization_seamless_m4t.py
      
      * Update test_tokenization_seamless_m4t.py
      6d3d5b10
    • Aritra Roy Gosthipaty's avatar
      [Port] TensorFlow implementation of Mistral (#29708) · 965e98dc
      Aritra Roy Gosthipaty authored
      
      
      * chore: initial commit
      
      * chore: adding imports and inits
      
      * chore: adding the causal and classification code
      
      * chore: adding names to the layers
      
      * chore: using single self attn layer
      
      * chore: built the model and layers
      
      * chore: start with testing
      
      * chore: docstring change, transpose fix
      
      * fix: rotary embedding
      
      * chore: adding cache implementation
      
      * remove unused torch
      
      * chore: fixing the indexing issue
      
      * make fix-copies
      
      * Use modeling_tf_utils.keras
      
      * make fixup
      
      * chore: fixing tests
      
      * chore: adding past key value logic
      
      * chore: adding multi label classfication test
      
      * fix: switching on the built parameters in the layers
      
      * fixing repo consistency
      
      * ruff formats
      
      * style changes
      
      * fix: tf and pt equivalence
      
      * removing returns from docstrings
      
      * fix docstrings
      
      * fix docstrings
      
      * removing todos
      
      * fix copies
      
      * fix docstring
      
      * fix docstring
      
      * chore: using easier rotate_half
      
      * adding integration tests
      
      * chore: addressing review related to rotary embedding layer
      
      * review changes
      
      * [run-slow] mistral
      
      * skip: test save load after resize token embedding
      
      * style
      
      ---------
      Co-authored-by: default avatarMatt <rocketknight1@gmail.com>
      965e98dc
    • Yih-Dar's avatar
      Update 4 `MptIntegrationTests` expected outputs (#30989) · 2a89673f
      Yih-Dar authored
      
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * [run-slow] mpt
      
      ---------
      Co-authored-by: default avatarydshieh <ydshieh@users.noreply.github.com>
      2a89673f
    • Fanli Lin's avatar
      [tests] add `torch.use_deterministic_algorithms` for XPU (#30774) · 21339a52
      Fanli Lin authored
      * add xpu check
      
      * add marker
      
      * add documentation
      
      * update doc
      
      * fix ci
      
      * remove from global init
      
      * fix
      21339a52
    • Marc Sun's avatar
      Fix accelerate failing tests (#30836) · 8366b572
      Marc Sun authored
      * Fix accelerate tests
      
      * fix clip
      
      * skip dbrx tests
      
      * fix GPTSan
      
      * fix M2M100Model
      
      * same fix as jamba
      
      * fix mt5
      
      * Fix T5Model
      
      * Fix umt5 model
      
      * fix switch_transformers
      
      * fix whisper
      
      * fix gptsan again
      
      * fix siglip recent test
      
      * skip siglip tests
      
      * wrong place fixed
      8366b572
  3. 22 May, 2024 4 commits
  4. 21 May, 2024 2 commits
  5. 20 May, 2024 5 commits
  6. 17 May, 2024 4 commits
  7. 16 May, 2024 2 commits
    • Yih-Dar's avatar
      Make `Gemma` work with `torch.compile` (#30775) · 1b3dba94
      Yih-Dar authored
      
      
      * fix
      
      * [run-slow] gemma
      
      * add test
      
      * add `test_compile_static_cache`
      
      * fix
      
      * style
      
      * remove subprocess
      
      * use attribute
      
      * fix
      
      * style
      
      * update
      
      * [run-slow] dbrx,gemma,jetmoe,phi3,recurrent_gemma
      
      ---------
      Co-authored-by: default avatarydshieh <ydshieh@users.noreply.github.com>
      1b3dba94
    • hyenal's avatar
      add sdpa to ViT [follow up of #29325] (#30555) · 1c21f48a
      hyenal authored
      
      
      remove blank line (+1 squashed commit)
      Squashed commits:
      [24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
      Squashed commits:
      [08bd27e7a] [run-slow]vit_msn,vision_encoder_decoder
      [ec96a8db3] [run-slow]vit_msn
      [ead817eca] fix vit msn multi gpu
      [d12cdc8fd] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [3fdbfa88f] doc
      [a3ff33e4a] finish implementation
      [e20b7b7fb] Update test_modeling_common.py
      [e290c5810] Update test_modeling_flax_common.py
      [d3af86f46] comment
      [ff7dd32d8] more comments
      [59b137889] suggestion
      [7e2ba6d67] attn_implementation as attribute of the class
      [fe66ab71f] minor
      [38642b568] Apply suggestions from code review
      
      Accept comments
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [22cde7d52] Update tests/test_modeling_common.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [48e137cc6] Update tests/test_modeling_common.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [99f4c679f] Update tests/test_modeling_common.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [96cf20a6d] Update src/transformers/models/vit_msn/modeling_vit_msn.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [c59377d23] Update src/transformers/models/vit_mae/modeling_vit_mae.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [b70a47259] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [00c84d216] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [61f00ebb0] all tests are passing locally
      [e9e0b82b7] vision encoder/decoder
      [4d5076b56] test-vision (+20 squashed commits)
      Squashed commits:
      [d1add8db9] yolo
      [9fde65716] fix flax
      [986566c28] minor
      [ca2f21d1f] vit
      [3333efd7a] easy models change
      [ebfc21402] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [b8b8603ed] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
      [48ecc7e26] all tests are passing locally
      [bff7fc366] minor
      [62f88306f] fix yolo and text_encoder tests
      [121507555] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [1064cae0a] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
      [b7f52ff3a] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [cffaa10dd] fix-copies
      [ef6c511c4] test vit hybrid
      [7d4ba8644] vit hybrid
      [66f919033] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [1fcc0a031] fixes
      [cfde6eb21] fixup
      [e77df1ed3] all except yolo end encoder decoder (+17 squashed commits)
      Squashed commits:
      [602913e22] vit + vit_mae are working
      [547f6c4cc] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/  passes
      [61a97dfa9] it s the complete opposite...
      [aefab37d4] fix more tests
      [71802a1b9] fix all torch tests
      [40b12eb58] encoder - decoder tests
      [941552b69] slow decorator where appropriate
      [14d055d80] has_attentions to yolo and msn
      [3381fa19f] add correct name
      [e261316a7] repo consistency
      [31c6d0c08] fixup
      [9d214276c] minor fix
      [11ed2e1b7] chore
      [eca6644c4] add sdpa to vit-based models
      [cffbf390b] make fix-copies result
      [6468319b0] fix style
      [d324cd02a] add sdpa for vit
      Co-authored-by: default avatarLiubov Yaronskaya <luba.yaronskaya@gmail.com>
      1c21f48a
  8. 15 May, 2024 4 commits
  9. 14 May, 2024 2 commits
  10. 13 May, 2024 4 commits
    • Alazar's avatar
      Port IDEFICS to tensorflow (#26870) · 94306352
      Alazar authored
      
      
      * Initial commit
      
      * Just a copy of modeling_idefics.py that will be ported to TF
      
      * - Prepend TF to the name of all classes
      - Convert pytorch ops to TF (not all operations are converted yet)
      
      * Add TF imports
      
      * Add autotranslated files
      
      * Add TF classes to model_tf_auto.py
      
      * Add the TF classes in model_doc
      
      * include auto-translated code
      
      * Adopted from auto-translated version
      
      * Add a forgotten super().build
      
      * Add test code for TF version.
      
      * Fix indentation and load pytorch weights for now
      
      * Some fixes. Many tests are still failing but some are passing now.
      
      - I have added TODO's for some of the hacks I made to unblock me
        and I will address them soon
      - I have the processing_idefics.py hacked in my view to support TF temporarily
      
      * Add ALL_LAYERNORM_LAYERS to match pytorch
      
      * Revert "Add ALL_LAYERNORM_LAYERS to match pytorch"
      
      This reverts commit 7e0a35119b4d7a6284d04d8c543fba1b29e573c9 as it
      is not needed in the tf implementation.
      
      * Fix freeze_relevant_params()
      
      * Some more fixes
      
      * Fix test_attention_outputs
      
      * Add tf stuff to processing_idefics.py
      
      processing_idefics.py supports both pytorch and tf now.
      
      test_processor_idefics.py for pytorch is passing, so i didn't break anything
      but still some issues with tf. I also need to add tf tests in
      test_processor_idefics.py.
      
      * Pass return_tensors to image processing code and fix test
      
      * Pass return_tensors to the image processor __init__
      
      * Fix several test cases
      
      - Make input to some of the forward pass of type `TFModelInputType`
      - Decorate main layer forward pass with `@unpack_inputs`
      - Decorate main layer with `@keras_serializable`
      - Pass `inputs` to TFIdeficsModel
      
      * Some more fixes forgotten in last commit
      
      * Fix processing code and vision_tf.py
      
      * Fix perceiver bug
      
      * Import from
      
      * Auto-add build() methods + style pass
      
      * Fix build() errors due to `None` being passed as shape to some layers
      
      * Change name in TFIdeficsForVisionText2Text to attribute in IdeficsForVisionText2Text
      
      * Fix pytorch weights load for tf2
      
      There were a lot of `name=` missing in weight initialization code.
      
      * Attempt to fix CI
      
      * Add back accidently removed line
      
      * Remove torch-specific stuff from the TF test file
      
      * make fix-copies, make style, remove autotranslated files
      
      * Fixes to imports/docstrings
      
      * Let's try the from future import in desperation
      
      * Fix the core random_attention_mask fn to match the torch/flax behaviour
      
      * Clean random_attention_mask up correctly
      
      * Remove torch-only test
      
      * Fix loss shape, couple of nits
      
      * make style
      
      * Don't test for OOB embeddings because IDEFICS uses those deliberately
      
      * Fix loss computation to handle masking
      
      * Fix test failures when flattening
      
      * Fix some test failures
      
      - Add cross attention gate which was missing and wasn't being passed arround
      - Fix overwriting of image_attention_mask due to hack I had for dummy inputs
      
      * Add a proper stateless scaled_dot_product_attention
      
      * make style
      
      * Adding missing attribute from the PyTorch version
      
      * Small cleanups to decoupledlinearlayer in case that helps
      
      * Pass epsilon to LayerNormalization
      
      * Attemp to fix pytorch weight cross-loading for TFIdeficsEmbedding
      
      * Fix a bug in TFIdeficsGatedCrossAttentionLayer
      
      * Patching up build() methods
      
      * Constant self.inv_freq
      
      * Constant self.inv_freq
      
      * First working version
      
      The TF implementation works now, there was a bug in the TFIdeficsDecoupledLinear
      where the weights were mis-intialized (in_features,out_features)
      when it should be: (out_features, in_features)
      
      I have tested this so far with tiny-random and idefics-9b-instruct
      and gives correct output.
      
      I also dumped the final outputs for both pytorch and TF
      and they are identical.
      
      * Fix some test failures
      
      * remove print statement
      
      * Fix return_tensors
      
      * Fix CI test failure check_code_quality
      
      * Attempt to fix CI failures by running `make fixup`
      
      The hardcoded IDs in test_modeling_tf_idefics.py are for the integration
      test and makes that file unreadable and should probably be moved to a seperate file.
      
      * Attempt to fix tests_pr_documentation_tests
      
      * Fix a test failure in test_image_processing_idefics.py
      
      * Fix test test_pt_tf_model_equivalence
      
      * Fix a few failures
      
      * Tiny fix
      
      * Some minor fixes
      
      * Remove a duplicate test
      
      * Override a few test failures for IDEFICS
      
      - `test_keras_save_load` is passing now
      - `test_compile_tf_model` is still failing
      
      * Fix processing_idefics.py after rebase
      
      * Guard import keras with is_tf_available
      
      * fix check code quality
      
      * fix check code quality
      
      * Minor fixes
      
      * Skip test_save_load temporarily
      
      This test passed on my local box but fails on the CI, skipping
      for now to see if there are other remaining failures on the CI.
      
      * Run `ruff format tests src utils`
      
      * Fix last failing test, `test_compile_tf_model`
      
      * Add fixes for vision_tf.py
      
      I forgot to add this file in last commit.
      
      * Minor fixes
      
      * Replace "<<<" with "<<" for doc tests
      
      IDEFICS-9B is too big for doctest runner, so don't run it there
      
      * Make code more readable
      
      * Fix bug after code review
      
      I added a layer_norm_eps to IdeficsConfig but I don't even need it
      since the vision config has a layer_norm_eps.
      
      * Fix after code review
      
      Use original code tokenizer.convert_tokens_to_ids
      
      * Keep PyTorch as the default return_tensors
      
      * Fixes to modeling_tf after code review
      
      * Fixes from code review
      
      - Remove all references of `TF_IDEFICS_PRETRAINED_MODEL_ARCHIVE_LIST`
      - Pass 1e-5 to LayerNormalization in perceiver
      
      * Run ruff
      
      * Undo a change
      
      * Refactor processing code after Matt's suggestion
      
      * Remove TODO's that aren't needed anymore
      
      * For pytorch, Use original pytorch processing code from main
      
      Since this PR is a TF port it shouldn't make any modifications
      to pytorch IDEFICS code. This changes undo's the pytorch processing
      modifications I made and uses original code from main.
      
      * Update tests/models/idefics/test_modeling_idefics.py
      
      * Update tests/models/idefics/test_modeling_tf_idefics.py
      
      * Add missing imports for is_pt_tf_cross_test
      
      * [DO NOT MERGE]: This is a commit for debugging and will be reverted
      
      The cross test `test_pt_tf_model_equivalence` passes locally but
      fails when running on the CI. This commit is to help debug that
      and will be reverted.
      
      * Revert "[DO NOT MERGE]: This is a commit for debugging and will be reverted"
      
      This reverts commit 8f0d709ec5bd46685fb0b4259d914ffee794875b.
      
      * [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted
      
      * [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted
      
      * Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"
      
      This reverts commit 998cc38b8c3d313bf5e5eb55a7f5b7b881897b89.
      
      * Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"
      
      This reverts commit 1c695ac4219c4ae4d39b330b01744dc27deb7dd4.
      
      * Don't skip test_save_load
      
      IIRC test_save_load was also failing on the CI but not on my local
      box, it might be easier to debug that on the CI first than the cross tests
      
      * Debugging commit, will be reverted
      
      * Revert "Debugging commit, will be reverted"
      
      This reverts commit 8eafc8e41e20c4e95a3a90834f06a6e9f445e2d5.
      
      * Override `test_save_load` and push model to save
      
      Maybe this will help me repro this weird bug
      
      * pass my repo_id
      
      * add endpoint
      
      * Pass a temp (write) token just for this CI
      
      * Undo last few commits, still pushing to hub for model debugging
      
      The issue seems to be with save_pretrained(),  when I looked at the model saved
      from the CI test failure it is basically empty and has no weights.
      `self.save_weights(..)` seems to be failing in save_pretrained but needs
      more debugging
      
      * Add logging to modeling tf utils, will be reverted just for debugging
      
      * Debugging, will revert
      
      * Revert "Debugging, will revert"
      
      This reverts commit 9d0d3075fb7c82d8cde3a5c76bc8f3876c5c55d3.
      
      * Revert "Add logging to modeling tf utils, will be reverted just for debugging"
      
      This reverts commit 774b6b7b1c17b3ce5d7634ade768f2f686cee617.
      
      * Remove `test_save_load`
      
      The CI failures are gone after my latest rebase, no idea why
      but I was still saving the model to my hub on HF and the tf_model.h5
      file now has everything.
      
      * Run make fix-copies
      
      * Run ruff format tests src utils
      
      * Debugging commit, will be reverted
      
      * Run ruff, also trigger CI run
      
      * Run ruff again
      
      * Undo debugging commit
      
      ---------
      Co-authored-by: default avatarMatt <rocketknight1@gmail.com>
      Co-authored-by: default avatarMatt <Rocketknight1@users.noreply.github.com>
      94306352
    • Poedator's avatar
      Llama: fix custom 4D masks, v2 (#30348) · a0779b9e
      Poedator authored
      
      
      * 4d mask fixes
      
      * Update custom 4D mask logic
      
      * test moved to mixin
      
      * extra tests 4d mask
      
      * upd 4d mask and StaticCache handling
      
      * added Mask4DTestHard to mistral tests
      
      * post-rebase fixes
      
      * test fixes for StaticCache
      
      * make fix-copies
      
      * upd 1 after #30476
      
      * fix common tests
      
      * rm elif attention_mask.dim() == 4:
      
      * tests combined, fixed, mixtral supported
      
      * bigbird style chg reverted
      
      * rm if attention_mask.dim() == 2
      
      * modeling_llama formatting chg
      
      ---------
      Co-authored-by: default avatarJoao Gante <joao@huggingface.co>
      a0779b9e
    • Nilabhra Roy Chowdhury's avatar
      Support for Falcon2-11B (#30771) · e52741f6
      Nilabhra Roy Chowdhury authored
      
      
      * remove unrelated changes
      
      * remove unrelated changes on phi and stable LM
      
      * add: Test for Falcon 10B
      
      * fix: formatting
      
      * fix: loading the falcon 10B in 8 bit precision using bitsanbytes.
      
      * fix: device placement
      
      * fix: broken tests.
      
      * fix: backwards compatibility for falcon 1B architecture.
      
      * chore: updated test.
      
      * chore: test_modeling_falcon.py to use the 11B model.
      
      * chore: minor edit
      
      * chore: formating.
      
      ---------
      Co-authored-by: default avatarPablo Montalvo <39954772+molbap@users.noreply.github.com>
      Co-authored-by: default avatarArthurZucker <arthur.zucker@gmail.com>
      e52741f6
    • Zafir Stojanovski's avatar
      Blip dynamic input resolution (#30722) · f63d8222
      Zafir Stojanovski authored
      * blip with interpolated pos encoding
      
      * feat: Add interpolate_pos_encoding option to other models from `BLIP` family.
      
      * include check for textual generated content in tests
      f63d8222
  11. 09 May, 2024 5 commits
  12. 08 May, 2024 1 commit