1. 17 Jul, 2024 1 commit
  2. 11 Jul, 2024 1 commit
    • Adding hiera (#30356) · c1e139c2
      Naman Garg authored
      
      
      * initialized Structure
      
      * Updated variable names
      
      * Added Config class, basic HF setup, convert_to_hf
      
* Fixed Convert function, added hiera to HF files, Initialized test files
      
      * better naming for x in forward pass
      
      * Moved utils to hiera
      
      * Change hiera -> hiera_model
      
* Fixed integration into transformers
      
      * Fix: Convert Checkpoint
      
      * added documentation for hiera
      
      * added documentation for hiera
      
* added Docstrings to models, Transformers based changes
      
      * make style and quality
      
      * make style and quality
      
      * Integration & Block tests running
      
      * Fixed bugs
      
* Removed timm dependency
      
      * added HieraBlock
      
      * fixed: Model name
      
      * added tests for HieraModel, HieraBlock
      
      * fixed imports
      
      * fixed quality & copies
      
      * Fixes
      
      * Update docs/source/en/model_doc/hiera.md
      
      Fix name
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/hiera.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/hiera.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/configuration_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/configuration_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Fixed formatting
      
      * Code quality & Import differences
      
      * quality and repo-consistency fix
      
      * fixed no torch error
      
      * Docstring fix
      
      * Docstring fix
      
      * doc string fix
      
      * fixed example usage
      
      * Resolved issues in modeling_hiera
      
      * Removed Hiera MAE
      
      * Added test and resolved bug
      
      * fixed doc string
      
      * First commit
      
      * Finished conversion script and model forward working
      
      * Resolved all issues
      
      * nits
      
      * Improving tests
      
      * Nits
      
      * More nits
      
      * Improving HieraForMaskedImageModeling
      
      * More improvements and nits
      
      * Fixed docstrings of outputs
      
      * More fixes
      
* More improvements
      
      * Updated conversion script
      
      * Fixed docstrings
      
      * Improved tests
      
* Fixed attention outputs test
      
      * All tests green
      
      * Removed unnecessary file
      
      * contribution attribution
      
      * Resolved a few issues
      
      * Resolved Comments
      
      * Updated model repo id and fixed bugs
      
      * Removed loss print
      
      * Make tests green
      
      * Updated docstrings
      
      * Fix style
      
      * Fixed num_heads in config
      
      * Removed unnecessary video checkpoint related code in the conversion script
      
      * Fix style
      
      * Changed atol in conversion script
      
      * HieraConfig
      
      * Fix copies
      
      * Fixed typo
      
      * Resolved few issues
      
      * make
      
      * converted conv_nd -> nn.Module
      
      * Removed video complexities
      
      * Removed video complexities
      
      * fix style
      
      * Addressing comments
      
      * Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Fix style
      
      * Fixed tests
      
      * Fixed typo
      
      * Fixed interpolate test
      
      * Made torch fx compatible
      
* Made sure image processor is correct
      
      * Addressed comments
      
      * Noise directly as torch
      
* Remove unnecessary attr
      
* Added return_dict
      
      * Update src/transformers/models/hiera/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Updated checkpoints
      
      * [run_slow] hiera
      
      * Fixed device mismatch
      
      * [run_slow] hiera
      
      * Fixed GPU tests
      
      * [run_slow] hiera
      
      ---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-29-50.us-east-2.compute.internal>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Eduardo Pacheco <eduardo.pach@hotmail.com>
Co-authored-by: Eduardo Pacheco <69953243+EduardoPach@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
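The Hiera PR above ports a hierarchical vision transformer in which spatial tokens are pooled between stages so later stages see a coarser grid with richer features. As a toy illustration only (not code from the PR), a 2x2 max-pool over a token grid shows the stage-to-stage coarsening:

```python
def pool_2x2(grid):
    """Toy illustration of Hiera's hierarchical idea (not the PR's code):
    between stages, spatial tokens are merged by 2x2 pooling, halving each
    spatial dimension."""
    h, w = len(grid), len(grid[0])
    return [[max(grid[i][j], grid[i][j + 1],
                 grid[i + 1][j], grid[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

grid = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
pooled = pool_2x2(grid)  # a 4x4 token grid becomes 2x2
```

The real model pools feature vectors (not scalars) and interleaves pooling with attention inside mask units, but the shrinking-grid structure is the same.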
  3. 10 Jul, 2024 2 commits
  4. 09 Jul, 2024 1 commit
  5. 08 Jul, 2024 2 commits
    • Add FA2 and `sdpa` support for SigLIP (#31499) · a177821b
      Pavel Iakubovskii authored
      * Rebase to main
      
* Fix attention implementation autoset for text and vision configs
      
      * Fixup
      
      * Minor fixes
      
      * Fix copies
      
      * Fix attention_mask for FA2
      
* Add equivalence tests for siglip
      
      * Remove right padding test
      
      * Uncomment flaky
      
      * Fix import
      
      * Add to docs
      
      * Fix test message
      
      * Add sdpa
      
      * Add sdpa equivalence test
      
      * Add siglip sdpa to docs
      
      * Fix typing for attention output
      
      * Add sdpa tests
      
      * Fix signature of FA2
      
      * Autoset attn_implementation in config
      
      * Rename bsz -> batch_size
      
      * Move back autoset attn method
      
      * Mark as flaky
      
      * Correct attention mask padding
      
      * [run-slow] siglip
      
      * Add FA2 and sdpa docs
      
      * Style fix
      
      * Remove flaky for FA2 test
      
      * Change attention implementation set
      
* Change attn_implementation propagation
      
      * Fix typos
      
      * Add modality to assert message
      
      * Add more sdpa backends in test
      
      * [run slow] siglip
      
      * Add math sdpa backend for all options
      
      * [run slow] siglip
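The equivalence tests this PR adds check that the fused backends (FA2, SDPA) produce the same outputs as eager attention. As a reference point, the unfused computation those kernels accelerate can be sketched for a single head in plain Python (illustrative only; the real code operates on batched tensors):

```python
import math

def scaled_dot_product_attention(q, k, v):
    """Naive single-head scaled dot-product attention over lists of row
    vectors (seq_len x head_dim). Fused SDPA/FA2 kernels compute the same
    result, which is what equivalence tests compare against."""
    d = len(q[0])
    # Attention scores: q @ k^T / sqrt(head_dim)
    scores = [[sum(qi * ki for qi, ki in zip(qrow, krow)) / math.sqrt(d)
               for krow in k] for qrow in q]
    # Numerically stable row-wise softmax
    weights = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights.append([e / z for e in exps])
    # Convex combination of value rows
    return [[sum(w * vrow[j] for w, vrow in zip(wrow, v))
             for j in range(len(v[0]))] for wrow in weights]

q = [[1.0, 0.0], [0.0, 1.0]]
out = scaled_dot_product_attention(q, q, [[1.0, 2.0], [3.0, 4.0]])
```

Because the output rows are convex combinations of the value rows, each output coordinate stays within the range spanned by that coordinate across `v`, a cheap sanity check the naive version makes obvious.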
    • Add ZoeDepth (#30136) · 06fd7972
      NielsRogge authored
      
      
      * First draft
      
      * Add docs
      
      * Clean up code
      
      * Convert model
      
      * Add image processor
      
      * Convert Zoe_K
      
      * More improvements
      
      * Improve variable names and docstrings
      
      * Improve variable names
      
      * Improve variable names
      
      * Replace nn.sequential
      
      * More improvements
      
      * Convert ZoeD_NK
      
      * Fix most tests
      
      * Verify pixel values
      
      * Verify pixel values
      
      * Add squeeze
      
      * Update beit to support arbitrary window sizes
      
      * Improve image processor
      
      * Improve docstring
      
      * Improve beit
      
      * Improve model outputs
      
      * Add figure
      
      * Fix beit
      
      * Update checkpoint
      
      * Fix repo id
      
      * Add _keys_to_ignore_on_load_unexpected
      
      * More improvements
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Rename variable name
      
      * Add backbone_hidden_size
      
      * Vectorize
      
      * Vectorize more
      
      * Address comments
      
      * Clarify docstring
      
      * Remove backbone_hidden_size
      
      * Fix image processor
      
      * Remove print statements
      
      * Remove print statement
      
      * Add integration test
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Add requires_backends
      
      * Clean up
      
      * Simplify conversion script
      
      * Simplify more
      
      * Simplify more
      
      * Simplify more
      
      * Clean up
      
      * Make sure beit is loaded correctly
      
      * Address comment
      
      * Address bin_configurations
      
      * Use bin_configurations
      
      * Convert models, add integration tests
      
      * Fix doc test
      
      * Address comments
      
      * Unify regressor classes
      
      * Clarify arguments
      
      * Improve resize_image
      
      * Add num_relative_features
      
      * Address comment
      
      * [run-slow]beit,data2vec,zoedepth
      
      * [run-slow]beit,data2vec,zoedepth
      
      * Address comments
      
      * Address comment
      
      * Address comment
      
      * Replace nn.TransformerEncoderLayer and nn.TransformerEncoder
      
      * Replace nn.MultiheadAttention
      
      * Add attributes for patch transformer to config
      
      * Add tests for ensure_multiple_of
      
      * Update organization
      
      * Add tests
      
      * [run-slow] beit data2vec
      
      * Update ruff
      
      * [run-slow] beit data2vec
      
      * Add comment
      
      * Improve docstrings, add test
      
      * Fix interpolate_pos_encoding
      
      * Fix slow tests
      
      * Add docstring
      
      * Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Improve tests and docstrings
      
      * Use run_common_tests
      
      * Improve docstrings
      
      * Improve docstrings
      
      * Improve tests
      
      * Improve tests
      
      * Remove print statements
      
      ---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
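The ZoeDepth PR adds tests for `ensure_multiple_of`, the resize helper that snaps image dimensions to a multiple of a fixed size before the backbone sees them. A plausible sketch of such a helper (the exact signature and behaviour in `image_processing_zoedepth.py` may differ):

```python
def ensure_multiple_of(value, multiple, min_val=0, max_val=None):
    """Hypothetical sketch: round `value` to the nearest multiple of
    `multiple`, falling back to floor/ceil when plain rounding would
    violate the optional bounds."""
    x = round(value / multiple) * multiple
    if max_val is not None and x > max_val:
        x = (value // multiple) * multiple    # floor instead
    if x < min_val:
        x = -(-value // multiple) * multiple  # ceil instead
    return int(x)
```

Depth backbones like BEiT expect spatial sizes compatible with their patch/window grid, which is why arbitrary input sizes get snapped this way (and why the PR also updates BEiT to support arbitrary window sizes).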
  6. 05 Jul, 2024 2 commits
    • Depth Anything: update conversion script for V2 (#31522) · 1082361a
      Pedro Cuenca authored
      * Depth Anything: update conversion script for V2
      
      * Update docs
      
      * Style
      
      * Revert "Update docs"
      
      This reverts commit be0ca47ea1be4f3cd9aa2113bdd8efcc9959119e.
      
      * Add docs for depth anything v2
      
      * Add depth_anything_v2 to MODEL_NAMES_MAPPING
      
      Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files
      
      * Add tip in original docs
    • Add training support for SigLIP (#31495) · 1d3eaa6f
      Billy Cao authored
      * Add siglip loss function
      
      * Update docs
      
      * Enable training tests
      [experimental] enable GC training tests as it has worked for my own data
      
      * Remove test_training* overrides to enable training tests
      [run_slow] siglip
      
      * Skip training tests for Siglip text model and ImageClassificationModel
      [run_slow] siglip
      
      * Skip GC training tests for SiglipForImageClassification
      
      * Explicitly skip training tests for SiglipVisionModel
      Add skip reason for training tests for SiglipTextModel
      
      * Remove copied from to fix CI
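The loss this PR wires in is SigLIP's pairwise sigmoid loss: every image-text pair in the batch is an independent binary classification (diagonal = match, off-diagonal = mismatch), so no batch-wide softmax normalization is needed. A minimal sketch (illustrative, not the exact `transformers` implementation):

```python
import math

def siglip_loss(logits):
    """Hedged sketch of SigLIP's sigmoid loss. `logits` is an n x n matrix
    of scaled image-text similarities; each entry contributes
    -log sigmoid(z * logit) with z = +1 on the diagonal (positive pairs)
    and -1 off the diagonal (negatives). Averaged per image."""
    n = len(logits)
    total = 0.0
    for i in range(n):
        for j in range(n):
            z = 1.0 if i == j else -1.0
            # -log sigmoid(z * logit), written stably via log1p
            total += math.log1p(math.exp(-z * logits[i][j]))
    return total / n

# Well-separated similarities: large positive on the diagonal,
# large negative off it -> loss near zero.
loss = siglip_loss([[10.0, -10.0], [-10.0, 10.0]])
```

Per-pair independence is also what makes the loss friendly to training setups where a full softmax over the batch would be awkward.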
  7. 02 Jul, 2024 2 commits
  8. 27 Jun, 2024 2 commits
  9. 26 Jun, 2024 3 commits
    • Add LLaVa NeXT Video (#31252) · e71f2863
      Raushan Turganbay authored
      
      
      * squash into single commit
      
      * run diff once more
      
      * docstring
      
      * tests
      
* minor changes and ready to go
      
      * Update src/transformers/models/llava_next_video/processing_llava_next_video.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update tests/models/vipllava/test_modeling_vipllava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * [run-slow] llava-next-video
      
      * [run-slow] llava-next-video
      
      * [run-slow] llava_next_video
      
      * fix two tests
      
      * fix slow tests
      
      * remove logit checks due to numeric errors
      
      * run test once more
      
      * [run-slow] llava_next_video
      
      * final try to pass the test
      
      * [run-slow] llava_next_video
      
      * [run-slow] llava_next_video
      
      * [run-slow] llava_next_video
      
      * style
      
      * fix
      
      * style
      
      ---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Update RT-DETR code snippet (#31631) · ac52084b
      Pavel Iakubovskii authored
      Update code snippet
    • [`GPT-NeoX`] Add SDPA support (#31031) · b07770c5
      Anton Vlasjuk authored
      * starting support for sdpa in `gptneox` models
      
      * small comment on tests
      
      * fix dropout
      
      * documentation and style
      
      * clarify concrete paths for reference
      
      * generalise attn projections and rope application
      
      added head mask check to sdpa mask creation
      
      handle sdpa memory backend bug via own version flag
      
      * update docs and style
      
      * move dtype casting outside of general attn_projection_and_rope function
      
      fix flash_attn_2 stuff
      
      * more generic attn warning if output_attns or head_mask
      
      * simplify head mask check by moving head mask creation to a later point
      
      * remove copied llama artifact
      
      * remove padding_mask from attention function signature
      
      * removing unnecessary comments, only "save" attn implementation once
      
      * [run_slow] gpt_neox
  10. 25 Jun, 2024 1 commit
    • Add video modality for InstructBLIP (#30182) · fc689d75
      Raushan Turganbay authored
      * squash in single commit
      
      * add docs
      
      * dummy obj
      
      * more changes in diff converter
      
      * tiny fix
      
      * make docs happy
      
      * skip test
      
      * repo consistency tests
      
      * update docstring
      
      * style
      
      * fix tests
      
      * change diff imports
      
      * [run-slow] instructblipvideo
      
      * [run-slow] instructblipvideo
      
      * fix tests and remove logit check
      
      * [run-slow] instructblipvideo
  11. 21 Jun, 2024 1 commit
  12. 19 Jun, 2024 2 commits
  13. 11 Jun, 2024 1 commit
    • Fast image processor (#28847) · f53fe35b
      amyeroberts authored
      
      
      * Draft fast image processors
      
      * Draft working fast version
      
      * py3.8 compatible cache
      
      * Enable loading fast image processors through auto
      
      * Tidy up; rescale behaviour based on input type
      
      * Enable tests for fast image processors
      
      * Smarter rescaling
      
      * Don't default to Fast
      
      * Safer imports
      
      * Add necessary Pillow requirement
      
      * Woops
      
      * Add AutoImageProcessor test
      
      * Fix up
      
      * Fix test for imagegpt
      
      * Fix test
      
      * Review comments
      
      * Add warning for TF and JAX input types
      
      * Rearrange
      
      * Return transforms
      
      * NumpyToTensor transformation
      
      * Rebase - include changes from upstream in ImageProcessingMixin
      
      * Safe typing
      
      * Fix up
      
* convert mean/std to tensor to rescale
      
      * Don't store transforms in state
      
      * Fix up
      
      * Update src/transformers/image_processing_utils_fast.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/auto/image_processing_auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/auto/image_processing_auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/auto/image_processing_auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Warn if fast image processor available
      
      * Update src/transformers/models/vit/image_processing_vit_fast.py
      
      * Transpose incoming numpy images to be in CHW format
      
      * Update mapping names based on packages, auto set fast to None
      
      * Fix up
      
      * Fix
      
      * Add AutoImageProcessor.from_pretrained(checkpoint, use_fast=True) test
      
      * Update src/transformers/models/vit/image_processing_vit_fast.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
      
      * Add equivalence and speed tests
      
      * Fix up
      
      ---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
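One recurring theme above ("Tidy up; rescale behaviour based on input type", "Smarter rescaling") is that a processor should only rescale pixel data that actually needs it. A hedged, simplified sketch of that idea (the real fast processors work on torch tensors, not nested lists):

```python
def rescale_pixels(image_rows, scale=1 / 255):
    """Illustrative input-type-aware rescaling: integer pixel data (0-255)
    is mapped to [0, 1]; float input is assumed to be already rescaled and
    is returned unchanged. Names here are hypothetical, not the PR's API."""
    flat = [p for row in image_rows for p in row]
    if all(isinstance(p, int) for p in flat):
        return [[p * scale for p in row] for row in image_rows]
    return image_rows
```

Rescaling a float image a second time would silently squash its dynamic range, which is why keying the behaviour off the input dtype matters.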
  14. 10 Jun, 2024 1 commit
    • Decorators for deprecation and named arguments validation (#30799) · 517df566
      Pavel Iakubovskii authored
      
      
      * Fix do_reduce_labels for maskformer image processor
      
      * Deprecate reduce_labels in favor to do_reduce_labels
      
      * Deprecate reduce_labels in favor to do_reduce_labels (segformer)
      
      * Deprecate reduce_labels in favor to do_reduce_labels (oneformer)
      
      * Deprecate reduce_labels in favor to do_reduce_labels (maskformer)
      
      * Deprecate reduce_labels in favor to do_reduce_labels (mask2former)
      
      * Fix typo
      
      * Update mask2former test
      
      * fixup
      
      * Update segmentation examples
      
      * Update docs
      
      * Fixup
      
      * Imports fixup
      
      * Add deprecation decorator draft
      
      * Add deprecation decorator
      
      * Fixup
      
      * Add deprecate_kwarg decorator
      
      * Validate kwargs decorator
      
      * Kwargs validation (beit)
      
      * fixup
      
      * Kwargs validation (mask2former)
      
      * Kwargs validation (maskformer)
      
      * Kwargs validation (oneformer)
      
      * Kwargs validation (segformer)
      
      * Better message
      
      * Fix oneformer processor save-load test
      
      * Update src/transformers/utils/deprecation.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/utils/deprecation.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/utils/deprecation.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
      
      * Update src/transformers/utils/deprecation.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
      
      * Better handle classmethod warning
      
      * Fix typo, remove warn
      
      * Add header
      
      * Docs and `additional_message`
      
* Move filter decorator to generic
      
      * Proper deprecation for semantic segm scripts
      
      * Add to __init__ and update import
      
      * Basic tests for filter decorator
      
      * Fix doc
      
* Override `to_dict()` to pop deprecated `_max_size`
      
      * Pop unused parameters
      
      * Fix trailing whitespace
      
      * Add test for deprecation
      
      * Add deprecation warning control parameter
      
      * Update generic test
      
      * Fixup deprecation tests
      
      * Introduce init service kwargs
      
      * Revert popping unused params
      
      * Revert oneformer test
      
      * Allow "metadata" to pass
      
      * Better docs
      
      * Fix test
      
* Add note in docstring
      
      * Fix notification for both names
      
      * Add func name to warning message
      
      * Fixup
      
      ---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
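The core pattern in this PR, renaming a keyword (e.g. `reduce_labels` to `do_reduce_labels`) while keeping old callers working, can be captured by a small decorator. A hedged sketch of a `deprecate_kwarg`-style decorator (the actual one in `src/transformers/utils/deprecation.py` supports versions, additional messages, and warning control):

```python
import functools
import warnings

def deprecate_kwarg(old_name, new_name):
    """Illustrative sketch: if a caller passes the deprecated keyword,
    warn once and forward the value under the new keyword."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_name in kwargs:
                warnings.warn(
                    f"`{old_name}` is deprecated, use `{new_name}` instead.",
                    FutureWarning,
                )
                kwargs[new_name] = kwargs.pop(old_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@deprecate_kwarg("reduce_labels", "do_reduce_labels")
def preprocess(do_reduce_labels=False):
    return do_reduce_labels
```

Centralizing the rename in a decorator keeps each image processor's signature clean and makes the deprecation consistent across beit, maskformer, mask2former, oneformer, and segformer.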
  15. 05 Jun, 2024 1 commit
  16. 04 Jun, 2024 1 commit
  17. 31 May, 2024 2 commits
  18. 28 May, 2024 2 commits
    • Deprecate low use models (#30781) · a564d10a
      amyeroberts authored
      * Deprecate models
      - graphormer
      - time_series_transformer
      - xlm_prophetnet
      - qdqbert
      - nat
      - ernie_m
      - tvlt
      - nezha
      - mega
      - jukebox
      - vit_hybrid
      - x_clip
      - deta
      - speech_to_text_2
      - efficientformer
      - realm
      - gptsan_japanese
      
      * Fix up
      
      * Fix speech2text2 imports
      
      * Make sure message isn't indented
      
      * Fix docstrings
      
      * Correctly map for deprecated models from model_type
      
      * Uncomment out
      
      * Add back time series transformer and x-clip
      
      * Import fix and fix-up
      
      * Fix up with updated ruff
    • [SuperPoint, PaliGemma] Update docs (#31025) · 90da0b1c
      NielsRogge authored
      * Update docs
      
      * Add PaliGemma resources
      
      * Address comment
      
      * Update docs
  19. 27 May, 2024 1 commit
  20. 23 May, 2024 1 commit
    • [Port] TensorFlow implementation of Mistral (#29708) · 965e98dc
      Aritra Roy Gosthipaty authored
      
      
      * chore: initial commit
      
      * chore: adding imports and inits
      
      * chore: adding the causal and classification code
      
      * chore: adding names to the layers
      
      * chore: using single self attn layer
      
      * chore: built the model and layers
      
      * chore: start with testing
      
      * chore: docstring change, transpose fix
      
      * fix: rotary embedding
      
      * chore: adding cache implementation
      
      * remove unused torch
      
      * chore: fixing the indexing issue
      
      * make fix-copies
      
      * Use modeling_tf_utils.keras
      
      * make fixup
      
      * chore: fixing tests
      
      * chore: adding past key value logic
      
* chore: adding multi label classification test
      
      * fix: switching on the built parameters in the layers
      
      * fixing repo consistency
      
      * ruff formats
      
      * style changes
      
      * fix: tf and pt equivalence
      
      * removing returns from docstrings
      
      * fix docstrings
      
      * fix docstrings
      
      * removing todos
      
      * fix copies
      
      * fix docstring
      
      * fix docstring
      
      * chore: using easier rotate_half
      
      * adding integration tests
      
      * chore: addressing review related to rotary embedding layer
      
      * review changes
      
      * [run-slow] mistral
      
      * skip: test save load after resize token embedding
      
      * style
      
      ---------
Co-authored-by: Matt <rocketknight1@gmail.com>
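The "using easier rotate_half" commit refers to the standard helper behind rotary position embeddings (RoPE): split the feature vector in half and return `(-x2, x1)` concatenated, so the position rotation becomes `x * cos + rotate_half(x) * sin`. A framework-free sketch of that formulation (the TF port operates on tensors, not lists):

```python
def rotate_half(x):
    """The "easier" rotate_half formulation: for x = [x1 | x2],
    return [-x2 | x1]."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    return [-v for v in x2] + list(x1)

def apply_rope(x, cos, sin):
    # Elementwise: x_rotated = x * cos + rotate_half(x) * sin
    r = rotate_half(x)
    return [xi * c + ri * s for xi, c, ri, s in zip(x, cos, r, sin)]
```

With `cos = 1` and `sin = 0` everywhere (position 0), the embedding is the identity, a handy sanity check when porting RoPE between frameworks.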
  21. 22 May, 2024 2 commits
  22. 21 May, 2024 1 commit
  23. 20 May, 2024 2 commits
  24. 16 May, 2024 3 commits
    • Video-LLaVa: Fix docs (#30855) · 95b3c381
      Raushan Turganbay authored
      fix model id in docs
    • [Idefics2] Improve docs, add resources (#30717) · 17cc71e1
      NielsRogge authored
      
      
      * Add resources
      
      * Address comment
      
      * Address comments
      
      * Update docs/source/en/model_doc/idefics2.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update figure
      
      ---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    • add sdpa to ViT [follow up of #29325] (#30555) · 1c21f48a
      hyenal authored
      
      
      remove blank line (+1 squashed commit)
      Squashed commits:
      [24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
      Squashed commits:
      [08bd27e7a] [run-slow]vit_msn,vision_encoder_decoder
      [ec96a8db3] [run-slow]vit_msn
      [ead817eca] fix vit msn multi gpu
      [d12cdc8fd] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [3fdbfa88f] doc
      [a3ff33e4a] finish implementation
      [e20b7b7fb] Update test_modeling_common.py
      [e290c5810] Update test_modeling_flax_common.py
      [d3af86f46] comment
      [ff7dd32d8] more comments
      [59b137889] suggestion
      [7e2ba6d67] attn_implementation as attribute of the class
      [fe66ab71f] minor
      [38642b568] Apply suggestions from code review
      
      Accept comments
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [22cde7d52] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [48e137cc6] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [99f4c679f] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [96cf20a6d] Update src/transformers/models/vit_msn/modeling_vit_msn.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [c59377d23] Update src/transformers/models/vit_mae/modeling_vit_mae.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [b70a47259] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [00c84d216] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [61f00ebb0] all tests are passing locally
      [e9e0b82b7] vision encoder/decoder
      [4d5076b56] test-vision (+20 squashed commits)
      Squashed commits:
      [d1add8db9] yolo
      [9fde65716] fix flax
      [986566c28] minor
      [ca2f21d1f] vit
      [3333efd7a] easy models change
      [ebfc21402] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [b8b8603ed] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
      [48ecc7e26] all tests are passing locally
      [bff7fc366] minor
      [62f88306f] fix yolo and text_encoder tests
      [121507555] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [1064cae0a] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
      [b7f52ff3a] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [cffaa10dd] fix-copies
      [ef6c511c4] test vit hybrid
      [7d4ba8644] vit hybrid
      [66f919033] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [1fcc0a031] fixes
      [cfde6eb21] fixup
      [e77df1ed3] all except yolo end encoder decoder (+17 squashed commits)
      Squashed commits:
      [602913e22] vit + vit_mae are working
      [547f6c4cc] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/  passes
      [61a97dfa9] it s the complete opposite...
      [aefab37d4] fix more tests
      [71802a1b9] fix all torch tests
      [40b12eb58] encoder - decoder tests
      [941552b69] slow decorator where appropriate
      [14d055d80] has_attentions to yolo and msn
      [3381fa19f] add correct name
      [e261316a7] repo consistency
      [31c6d0c08] fixup
      [9d214276c] minor fix
      [11ed2e1b7] chore
      [eca6644c4] add sdpa to vit-based models
      [cffbf390b] make fix-copies result
      [6468319b0] fix style
      [d324cd02a] add sdpa for vit
Co-authored-by: Liubov Yaronskaya <luba.yaronskaya@gmail.com>
  25. 15 May, 2024 1 commit
  26. 14 May, 2024 1 commit
    • Add PaliGemma (#30814) · 1360801a
      Pablo Montalvo authored
      
      
      * add new model like
      
      * add state dict slicing + new model config
      
      * update palma config and weights, passes vision activations
      
      * fix
      
      * update
      
      * reorder loading/unpacking
      
      * clean up
      
      * add debug statements
      
      * change device
      
      * fix
      
      * debugging
      
      * fix noncausal mask
      
      * fixup sdpa + causal mask
      
      * fix activation function
      
      * remove debug before changing modeling file
      
      * add variants
      
      * debug attention mask in generate
      
      * revert to non-debug sdpa
      
      * revert gemma modifications
      
      * add custom language modeling
      
      * use Processor
      
      * add language modeling file to init
      
      * try thin wrapper around generate
      
      * Update
      
      * update mask
      
      * breakpoints galore
      
      * remove conflict
      
      * switch to left-padding
      
      * add incomplete model doc
      
      * add paligemma global files
      
      * batch rename paligemma
      
      * make generation match outputs and captioning
      
      * style
      
      * style
      
      * remove copied from + doc
      
      * remove more copied from
      
      * remove copy from projector
      
      * minor fix
      
      * update config and style
      
      * add readme - dummy
      
      * CORRECT image captioning
      
      * moving to args
      
      * add siglip proper + fix merging image + text features
      
      * take update_causal_mask from upstream
      
      * remove breakpoint
      
      * leverage AutoModel
      
      * fix input_ids slicing
      
      * make siglip head conditional
      
      * remove encoder_decoder value
      
      * remove unneeded modeling file
      
      * add commented 4d attention mask
      
      * FIXED generation with 4D mask
      
      * Update src/transformers/models/siglip/modeling_siglip.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix left padding detection
      
      * shuffle order of verifications
      
      * fix missing labels for training
      
      * fix
      
      * vectorize merging of features, improve slicing
      
      * improve testing before conversion
      
      * handle merging in processor
      
      * image token index depends on checkpoint
      
      * add variants, save processor too
      
      * save processors, base tokenizer off spm file
      
      * expand model embeddings due to additional image token
      
      * pass image processing args
      
      * add convert rgb to siglip processor
      
      * add \n token separately
      
      * fix tokenizer and prompts
      
      * fix docstrings
      
      * change to camel
      
      * fix casing
      
      * debug pos_ids and sdpa
      
      * pass and use cache_position
      
      * add flag for newline tokenization
      
      * Update src/transformers/models/paligemma/processing_paligemma.py
      Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
      
      * simplify conversion script
      
      * add copied from
      
      * add precision to conversion script
      
      * Update src/transformers/models/paligemma/modeling_paligemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * clean up
      
      * Shift attention mask from `1:`
      
      After discussion with @molbap
      
      * add docs, fix quality
      
      * quality, tied weights inheritance, and logits/label alignment
      
      * fix more tests
      
      * pass attn_implementation to language model correctly
      
      * add SiglipVisionTransformer to no split modules
      
      * skip paligemma test for sdpa dispatch to flash
      
      * skip incompatible tests
      
      * quality
      
      * [broken archive maps]
      
      * Apply suggestions
      
      - remove archive lists
      - style
      - take shape of inputs_embeds for batch
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/utils/dummy_pt_objects.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * simplify conversion script
      
      * add suggestions
      
      * add suggestions
      
      * add copied from
      
      * fix
      
      * move labels out
      
      * revert
      
      * fix
      
      * remove placeholder labels if None
      
      * use cache_position
      
      * fix quality + docstrings
      
      * fix quality
      
      * fix paligemma 4d gemma mask incompatibility
      
      * fix config docstring
      
      * fix query and attn_mask dtype
      
      ---------
      Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      1360801a
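
      Several of the PaliGemma commits ("vectorize merging of features", "fix input_ids slicing", "expand model embeddings due to additional image token") concern one step: splicing the projected image patch embeddings into the text embedding sequence at the positions held by the special image token. A minimal, illustrative sketch of that idea over plain lists (not the actual PaliGemma code, which does this with batched tensor masking; the token id used here is a placeholder for illustration):

      ```python
      IMAGE_TOKEN_ID = 999  # hypothetical placeholder id for the <image> token

      def merge_image_features(input_ids, text_embeds, image_embeds):
          """Replace embeddings at image-token positions with image features, in order."""
          merged, img_iter = [], iter(image_embeds)
          for tok, emb in zip(input_ids, text_embeds):
              merged.append(next(img_iter) if tok == IMAGE_TOKEN_ID else emb)
          return merged

      ids = [IMAGE_TOKEN_ID, IMAGE_TOKEN_ID, 5, 7]          # two image slots, then text
      text = [[0.0], [0.0], [1.0], [2.0]]                    # embeddings for each position
      image = [[9.0], [8.0]]                                 # projected image patch features
      print(merge_image_features(ids, text, image))  # [[9.0], [8.0], [1.0], [2.0]]
      ```

      The resulting sequence is then fed to the language model, with the attention mask and labels adjusted accordingly (the "4D mask" and "shift attention mask from `1:`" commits above).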