"vscode:/vscode.git/clone" did not exist on "280db79ac139eff31962a56006b34a9f42886834"
  1. 08 Jul, 2024 1 commit
  2. 02 Jul, 2024 1 commit
  3. 27 Jun, 2024 1 commit
    • Add gemma 2 (#31659) · 0cf60f13
      Arthur authored
      
      
      * initial commit
      
      * Add doc
      
      * protect?
      
      * fixup stuffs
      
      * update tests
      
      * fix build documentation
      
      * mmmmmmm config attributes
      
      * style
      
      * nit
      
      * update
      
      * nit
      
      * Fix docs
      
      * protect some stuff
      
      ---------
      Co-authored-by: Lysandre <lysandre@huggingface.co>
      0cf60f13
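For orientation, here is a minimal sketch of exercising the Gemma 2 architecture this commit adds, through the standard auto classes. The checkpoint id is an assumption about the released weights, not something taken from the commit:

```python
# Hedged sketch: load the newly added Gemma 2 model via the auto classes.
# The checkpoint id "google/gemma-2-9b" is an assumed released checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```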
  4. 26 Jun, 2024 1 commit
  5. 24 Jun, 2024 1 commit
  6. 07 Jun, 2024 1 commit
  7. 24 May, 2024 1 commit
  8. 23 May, 2024 1 commit
  9. 22 May, 2024 1 commit
  10. 20 May, 2024 1 commit
  11. 16 May, 2024 3 commits
    • Make `Gemma` work with `torch.compile` (#30775) · 1b3dba94
      Yih-Dar authored
      
      
      * fix
      
      * [run-slow] gemma
      
      * add test
      
      * add `test_compile_static_cache`
      
      * fix
      
      * style
      
      * remove subprocess
      
      * use attribute
      
      * fix
      
      * style
      
      * update
      
      * [run-slow] dbrx,gemma,jetmoe,phi3,recurrent_gemma
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      1b3dba94
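A sketch of the workflow this commit unblocks: compiling the decoding step against a fixed-shape static KV cache. The checkpoint id and compile flags are illustrative; the `cache_implementation` / `torch.compile` pattern follows the transformers generation docs rather than this diff:

```python
# Hedged sketch: Gemma decoding with torch.compile over a static KV cache.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"  # illustrative checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)

# A static cache pre-allocates KV tensors at a fixed shape, so torch.compile
# can specialize the forward pass without recompiling at every step.
model.generation_config.cache_implementation = "static"
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tokenizer("Hello,", return_tensors="pt").to(device)
output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```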
    • Cache: add new flag to distinguish models that `Cache` but not static cache (#30800) · 9d889f87
      Joao Gante authored
      * jamba cache
      
      * new flag
      
      * generate exception
      9d889f87
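Reading the bullets ("new flag", "generate exception"): models such as Jamba use a custom `Cache` subclass but cannot run with `StaticCache`, so `generate()` should now fail fast instead of miscomputing. A hedged sketch of the capability check; the attribute name `_supports_static_cache` is my assumption, not read from the diff:

```python
# Hedged sketch: per-model capability flag for static-cache support.
# The flag name below is an assumption about what this PR introduced.
from transformers.models.jamba.modeling_jamba import JambaForCausalLM

# Class-level check; no weights need to be downloaded.
supports_static = getattr(JambaForCausalLM, "_supports_static_cache", False)
print(f"Jamba supports StaticCache: {supports_static}")  # expected: False
```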
    • add sdpa to ViT [follow up of #29325] (#30555) · 1c21f48a
      hyenal authored
      
      
      remove blank line (+1 squashed commit)
      Squashed commits:
      [24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
      Squashed commits:
      [08bd27e7a] [run-slow]vit_msn,vision_encoder_decoder
      [ec96a8db3] [run-slow]vit_msn
      [ead817eca] fix vit msn multi gpu
      [d12cdc8fd] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [3fdbfa88f] doc
      [a3ff33e4a] finish implementation
      [e20b7b7fb] Update test_modeling_common.py
      [e290c5810] Update test_modeling_flax_common.py
      [d3af86f46] comment
      [ff7dd32d8] more comments
      [59b137889] suggestion
      [7e2ba6d67] attn_implementation as attribute of the class
      [fe66ab71f] minor
      [38642b568] Apply suggestions from code review
      
      Accept comments
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [22cde7d52] Update tests/test_modeling_common.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [48e137cc6] Update tests/test_modeling_common.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [99f4c679f] Update tests/test_modeling_common.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [96cf20a6d] Update src/transformers/models/vit_msn/modeling_vit_msn.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [c59377d23] Update src/transformers/models/vit_mae/modeling_vit_mae.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [b70a47259] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      [00c84d216] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [61f00ebb0] all tests are passing locally
      [e9e0b82b7] vision encoder/decoder
      [4d5076b56] test-vision (+20 squashed commits)
      Squashed commits:
      [d1add8db9] yolo
      [9fde65716] fix flax
      [986566c28] minor
      [ca2f21d1f] vit
      [3333efd7a] easy models change
      [ebfc21402] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
      [b8b8603ed] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
      [48ecc7e26] all tests are passing locally
      [bff7fc366] minor
      [62f88306f] fix yolo and text_encoder tests
      [121507555] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [1064cae0a] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
      [b7f52ff3a] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [cffaa10dd] fix-copies
      [ef6c511c4] test vit hybrid
      [7d4ba8644] vit hybrid
      [66f919033] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
      [1fcc0a031] fixes
      [cfde6eb21] fixup
      [e77df1ed3] all except yolo end encoder decoder (+17 squashed commits)
      Squashed commits:
      [602913e22] vit + vit_mae are working
      [547f6c4cc] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/  passes
      [61a97dfa9] it s the complete opposite...
      [aefab37d4] fix more tests
      [71802a1b9] fix all torch tests
      [40b12eb58] encoder - decoder tests
      [941552b69] slow decorator where appropriate
      [14d055d80] has_attentions to yolo and msn
      [3381fa19f] add correct name
      [e261316a7] repo consistency
      [31c6d0c08] fixup
      [9d214276c] minor fix
      [11ed2e1b7] chore
      [eca6644c4] add sdpa to vit-based models
      [cffbf390b] make fix-copies result
      [6468319b0] fix style
      [d324cd02a] add sdpa for vit
      Co-authored-by: Liubov Yaronskaya <luba.yaronskaya@gmail.com>
      1c21f48a
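The user-facing switch this commit enables, sketched with an illustrative checkpoint:

```python
# Sketch: opt a ViT checkpoint into the new SDPA attention path.
from transformers import ViTModel

model = ViTModel.from_pretrained(
    "google/vit-base-patch16-224-in21k",  # illustrative checkpoint
    attn_implementation="sdpa",  # PyTorch scaled_dot_product_attention backend
)
```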
  12. 15 May, 2024 2 commits
  13. 14 May, 2024 1 commit
    • Add PaliGemma (#30814) · 1360801a
      Pablo Montalvo authored
      
      
      * add new model like
      
      * add state dict slicing + new model config
      
      * update palma config and weights, passes vision activations
      
      * fix
      
      * update
      
      * reorder loading/unpacking
      
      * clean up
      
      * add debug statements
      
      * change device
      
      * fix
      
      * debugging
      
      * fix noncausal mask
      
      * fixup sdpa + causal mask
      
      * fix activation function
      
      * remove debug before changing modeling file
      
      * add variants
      
      * debug attention mask in generate
      
      * revert to non-debug sdpa
      
      * revert gemma modifications
      
      * add custom language modeling
      
      * use Processor
      
      * add language modeling file to init
      
      * try thin wrapper around generate
      
      * Update
      
      * update mask
      
      * breakpoints galore
      
      * remove conflict
      
      * switch to left-padding
      
      * add incomplete model doc
      
      * add paligemma global files
      
      * batch rename paligemma
      
      * make generation match outputs and captioning
      
      * style
      
      * style
      
      * remove copied from + doc
      
      * remove more copied from
      
      * remove copy from projector
      
      * minor fix
      
      * update config and style
      
      * add readme - dummy
      
      * CORRECT image captioning
      
      * moving to args
      
      * add siglip proper + fix merging image + text features
      
      * take update_causal_mask from upstream
      
      * remove breakpoint
      
      * leverage AutoModel
      
      * fix input_ids slicing
      
      * make siglip head conditional
      
      * remove encoder_decoder value
      
      * remove unneeded modeling file
      
      * add commented 4d attention mask
      
      * FIXED generation with 4D mask
      
      * Update src/transformers/models/siglip/modeling_siglip.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix left padding detection
      
      * shuffle order of verifications
      
      * fix missing labels for training
      
      * fix
      
      * vectorize merging of features, improve slicing
      
      * improve testing before conversion
      
      * handle merging in processor
      
      * image token index depends on checkpoint
      
      * add variants, save processor too
      
      * save processors, base tokenizer off spm file
      
      * expand model embeddings due to additional image token
      
      * pass image processing args
      
      * add convert rgb to siglip processor
      
      * add \n token separately
      
      * fix tokenizer and prompts
      
      * fix docstrings
      
      * change to camel
      
      * fix casing
      
      * debug pos_ids and sdpa
      
      * pass and use cache_position
      
      * add flag for newline tokenization
      
      * Update src/transformers/models/paligemma/processing_paligemma.py
      Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
      
      * simplify conversion script
      
      * add copied from
      
      * add precision to conversion script
      
      * Update src/transformers/models/paligemma/modeling_paligemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * clean up
      
      * Shift attention mask from `1:`
      
      After discussion with @molbap
      
      * add docs, fix quality
      
      * quality, tied weights inheritance, and logits/label alignment
      
      * fix more tests
      
      * pass attn_implementation to language model correctly
      
      * add SiglipVisionTransformer to no split modules
      
      * skip paligemma test for sdpa dispatch to flash
      
      * skip incompatible tests
      
      * quality
      
      * [broken archive maps]
      
      * Apply suggestions
      
      - remove archive lists
      - style
      - take shape of inputs_embeds for batch
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/utils/dummy_pt_objects.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * simplify conversion script
      
      * add suggestions
      
      * add suggestions
      
      * add copied from
      
      * fix
      
      * move labels out
      
      * revert
      
      * fix
      
      * remove placeholder labels if None
      
      * use cache_position
      
      * fix quality + docstrings
      
      * fix quality
      
      * fix paligemma 4d gemma mask incompatibility
      
      * fix config docstring
      
      * fix query and attn_mask dtype
      
      ---------
      Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      1360801a
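A minimal sketch of the classes this commit introduces. The checkpoint id and the "caption en" task prompt are assumptions based on the model release, not taken from the diff:

```python
# Hedged sketch: image captioning with the new PaliGemma classes.
import requests
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"  # assumed released checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(text="caption en", images=image, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```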
  14. 13 May, 2024 2 commits
    • skip low_cpu_mem_usage tests (#30782) · 539ed75d
      Marc Sun authored
      539ed75d
    • Llama: fix custom 4D masks, v2 (#30348) · a0779b9e
      Poedator authored
      
      
      * 4d mask fixes
      
      * Update custom 4D mask logic
      
      * test moved to mixin
      
      * extra tests 4d mask
      
      * upd 4d mask and StaticCache handling
      
      * added Mask4DTestHard to mistral tests
      
      * post-rebase fixes
      
      * test fixes for StaticCache
      
      * make fix-copies
      
      * upd 1 after #30476
      
      * fix common tests
      
      * rm elif attention_mask.dim() == 4:
      
      * tests combined, fixed, mixtral supported
      
      * bigbird style chg reverted
      
      * rm if attention_mask.dim() == 2
      
      * modeling_llama formatting chg
      
      ---------
      Co-authored-by: Joao Gante <joao@huggingface.co>
      a0779b9e
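The shape of the feature being fixed, as a sketch: instead of the usual 2D padding mask, a caller can pass a pre-built 4D mask of shape (batch, 1, query_len, kv_len). The checkpoint id and the additive-float mask convention below are assumptions based on the tests this PR touches:

```python
# Hedged sketch: custom 4D attention mask on a Llama-architecture model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # illustrative Llama-family checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

input_ids = tokenizer("one two three", return_tensors="pt").input_ids
n = input_ids.shape[1]

# Explicit causal mask as an additive float tensor (batch, 1, q_len, kv_len):
# 0.0 where attention is allowed, dtype-min where it is masked.
mask_4d = torch.full((1, 1, n, n), torch.finfo(model.dtype).min)
mask_4d = mask_4d.masked_fill(torch.tril(torch.ones(n, n, dtype=torch.bool)), 0.0)

out = model(input_ids, attention_mask=mask_4d)
```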
  15. 07 May, 2024 1 commit
  16. 02 May, 2024 1 commit
  17. 01 May, 2024 2 commits
  18. 26 Apr, 2024 1 commit
    • [`BERT`] Add support for sdpa (#28802) · dfa7b580
      JB (Don) authored
      * Adding SDPA support for BERT
      
      * Using the proper input name for testing model input in inference()
      
      * Adding documentation for SDPA in BERT model page
      
      * Use the stable link for the documentation
      
      * Adding a gate to only call .contiguous() for torch < 2.2.0
      
      * Additions and fixes to the documentation
      
      * Minor updates to documentation
      
      * Adding extra requirements needed for the contiguous() bug
      
      * Adding "Adapted from" in plcae of the "Copied from"
      
      * Add benchmark speedup tables to the documentation
      
      * Minor fixes to the documentation
      
      * Use ClapText as a replacement for Bert in the Copied-From
      
      * Some more fixes for the fix-copies references
      
      * Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage
      
      [test all]
      
      * Undo changes to separate test
      
      * Refactored SDPA self attention code for KV projections
      
      * Change use_sdpa to attn_implementation
      
      * Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
      dfa7b580
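The resulting load-time switch, analogous to the other SDPA commits above:

```python
# Sketch: select the new SDPA attention backend for BERT at load time.
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased", attn_implementation="sdpa")
```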
  19. 22 Apr, 2024 1 commit
  20. 19 Apr, 2024 1 commit
    • Enable multi-device for some models (#30207) · 30b45320
      Jacky Lee authored
      
      
      * feat: multidevice for resnet
      
      * feat: yes! resnet
      
      * fix: compare all elements in tuple
      
      * feat: support for regnet
      
      * feat: support for convnextv2
      
      * feat: support for bit
      
      * feat: support for cvt
      
      * feat: add support for focalnet
      
      * feat: support for yolos
      
      * feat: support for glpn
      
      * feat: support for imagegpt
      
      * feat: support for levit
      
      * feat: support for mgp_str
      
      * feat: support for mobilenet_v1
      
      * feat: support for mobilenet_v2
      
      * feat: support for mobilevit
      
      * feat: support for mobilevitv2
      
      * feat: support for poolformer
      
      * fix: copies
      
      * fix: code quality check
      
      * update: upstream changes from main
      
      * fix: consistency check
      
      * feat: support for sam
      
      * feat: support for switchformer
      
      * feat: support for swin
      
      * feat: support for swinv2
      
      * feat: support for timesformer
      
      * feat: support for trocr
      
      * feat: support for upernet
      
      * fix: check copies
      
      * update: rerun CI
      
      * update: rerun again, maybe
      
      * update: one more rerun
      
      ---------
      Co-authored-by: Jacky Lee <jackylee328@gmail.com>
      30b45320
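"Multi-device" here means the listed models gain `_no_split_modules`, so they can be sharded across devices with `device_map="auto"`. A sketch, assuming the `accelerate` package is installed and using an illustrative checkpoint:

```python
# Hedged sketch: shard one of the newly supported vision models across devices.
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained(
    "microsoft/resnet-50",  # illustrative checkpoint
    device_map="auto",      # possible once the model defines _no_split_modules
)
print(model.hf_device_map)  # shows which layers landed on which device
```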
  21. 18 Apr, 2024 3 commits
  22. 17 Apr, 2024 2 commits
  23. 15 Apr, 2024 1 commit
    • Add Idefics2 (#30253) · 6b78360e
      amyeroberts authored
      
      
      * Initial add model additions
      
      * Test
      
      * All weights loading
      
      * Can perform full forward pass
      
      * Local and remote the same
      
      * Matching local and remote
      
      * Fixup
      
      * Idefics2Model importable; fixup docstrings
      
      * Don't skip by default
      
      * Remove deprecated use_resampler arg
      
      * Remove self.config
      
      * DecoupledLinear takes config
      
      * Tidy up
      
      * Enable eager attention and tidy up
      
      * Most tests passing
      
      * Update for batch of processed images
      
      * Add image processor
      
      * Update doc pages
      
      * Update conversion script
      
      * Remove erroneous breakpoint
      
      * Remove accidental spelling change
      
      * Update to reflect changes on hub - make generate work
      
      * Fix up
      
      * Image processor tests
      
      * Update tests
      
      * Add a processor
      
      * Add a processor
      
      * Update convert script
      
      * Update modeling file - remove fixmes
      
      * Bug fix
      
      * Add processing test
      
      * Use processor
      
      * Fix up
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Fix test
      
      * Update config - PR comments and defaults align with checkpoint
      
      * Reviewer comments
      
      * Add copied froms for flash attention
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Remove qk_layer_norm and freeze_layers functionality
      
      * Fix
      
      * Remove freeze_layer options from config
      
      * Sync with upstream main
      
      * Fix attention shapes siglip
      
      * Remove Llava-next refs - TO REBASE
      
      * Use AutoModel for text model
      
      * Add comment to explain vision embeddings
      
      * Fix issue with tie_word_embeddings
      
      * Address review comments
      
      * Fix and fix up
      
      * Chat templates for idefics
      
      * Fix copies
      
      * Fix
      
      * Add layer norms to FA2
      
      * Fix tests
      
      * Apply suggestions from code review
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Fix
      
      * Review comments
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Update inputs merger
      
      * Merge weights in correct order
      
      * Update convert script
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Update template
      
      * Model code examples (fix idefics too)
      
      * More review comments
      
      * Tidy up
      
      * Update processing
      
      * Fix attention mask preparation
      
      * Update inputs_merger inputs
      
      * Vectorize inputs_merger
      
      * Update src/transformers/models/idefics2/__init__.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      
      * Review comments
      
      * saying bye to the `qk_layer_norms`
      
      * Simplify
      
      * Update latents
      
      * Remove erroneous readme changes
      
      * Return images when applying chat template
      
      * Fix bug - prompt images are for a single sample
      
      * Update src/transformers/models/idefics2/modeling_idefics2.py
      
      * image splitting
      
      * fix test
      
      * some more comment
      
      * some comment
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/idefics2/image_processing_idefics2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update processor
      
      * Update model tests
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Don't add BOS in template
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Remove index in examples
      
      * Update tests to reflect #13
      
      * Update src/transformers/models/idefics2/processing_idefics2.py
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * PR comment - consistent typing
      
      * Update readme and model doc
      
      * Update docs
      
      * Update checkpoint references
      
      * Update examples
      
      * Fix and update tests
      
      * Small addition
      
      * Update tests - remove copied from as no ignore placement copy could be found
      
      * Update example
      
      * small fixes
      
      * Update docs/source/en/model_doc/idefics2.md
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Update docs/source/en/model_doc/idefics2.md
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Update README.md
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      
      * Connector model as bridge
      
      * Fix up
      
      * Fix up
      
      * Don't pass model inputs for generation kwargs update
      
      * IDEFICS-2 -> Idefics2
      
      * Remove config archive name
      
      * IDEFICS-2 -> Idefics2
      
      * Add back llava-next
      
      * Update readmes
      
      * Add requirements for processor tester
      
      * Use custom convert_to_rgb to avoid possible BC
      
      * Fix doc example
      
      * Fix doc example
      
      * Skip model doc tests - as model too large
      
      * More doc example - account for image splitting
      
      * Update src/transformers/image_transforms.py
      
      * Fix config doctest
      
      ---------
      Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
      Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
      Co-authored-by: Victor SANH <victorsanh@gmail.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      6b78360e
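A minimal sketch of the new classes this commit adds. The checkpoint id and the chat-template message format are assumptions based on the model release, not taken from the diff:

```python
# Hedged sketch: prompt construction with the new Idefics2 classes.
from transformers import AutoProcessor, Idefics2ForConditionalGeneration

model_id = "HuggingFaceM4/idefics2-8b"  # assumed released checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = Idefics2ForConditionalGeneration.from_pretrained(model_id)

# Images are referenced by placeholder entries in the chat template,
# then passed alongside the rendered prompt.
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
# inputs = processor(text=prompt, images=[image], return_tensors="pt")
# output_ids = model.generate(**inputs, max_new_tokens=32)
```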
  24. 09 Apr, 2024 1 commit
  25. 01 Apr, 2024 1 commit
  26. 28 Mar, 2024 1 commit
  27. 20 Mar, 2024 1 commit
  28. 13 Mar, 2024 1 commit
  29. 12 Mar, 2024 1 commit
  30. 26 Feb, 2024 2 commits
  31. 20 Feb, 2024 1 commit