1. 25 Jun, 2024 1 commit
  2. 20 Jun, 2024 1 commit
  3. 23 May, 2024 1 commit
    • Fix accelerate failing tests (#30836) · 8366b572
      Marc Sun authored
      * Fix accelerate tests
      
      * fix clip
      
      * skip dbrx tests
      
      * fix GPTSan
      
      * fix M2M100Model
      
      * same fix as jamba
      
      * fix mt5
      
      * Fix T5Model
      
      * Fix umt5 model
      
      * fix switch_transformers
      
      * fix whisper
      
      * fix gptsan again
      
      * fix siglip recent test
      
      * skip siglip tests
      
      * wrong place fixed
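      The "skip dbrx tests" / "skip siglip tests" entries above follow the standard `unittest` skip pattern; a minimal self-contained sketch (the class name and skip reason here are invented, not the actual transformers tests):

      ```python
      import unittest

      class DbrxAccelerateTest(unittest.TestCase):
          """Hypothetical stand-in for a model test class."""

          @unittest.skip("accelerate dispatch is not supported for this model yet")
          def test_cpu_offload(self):
              self.fail("never reached")  # a skipped test body does not execute

      suite = unittest.TestLoader().loadTestsFromTestCase(DbrxAccelerateTest)
      result = unittest.TextTestRunner(verbosity=0).run(suite)
      print(len(result.skipped), len(result.failures))  # 1 skipped, 0 failures
      ```

      Skipping records the reason in the test report, which keeps CI green while documenting why the test cannot run.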
  4. 22 May, 2024 1 commit
  5. 14 May, 2024 1 commit
    • Add PaliGemma (#30814) · 1360801a
      Pablo Montalvo authored
      
      
      * add new model like
      
      * add state dict slicing + new model config
      
      * update palma config and weights, passes vision activations
      
      * fix
      
      * update
      
      * reorder loading/unpacking
      
      * clean up
      
      * add debug statements
      
      * change device
      
      * fix
      
      * debugging
      
      * fix noncausal mask
      
      * fixup sdpa + causal mask
      
      * fix activation function
      
      * remove debug before changing modeling file
      
      * add variants
      
      * debug attention mask in generate
      
      * revert to non-debug sdpa
      
      * revert gemma modifications
      
      * add custom language modeling
      
      * use Processor
      
      * add language modeling file to init
      
      * try thin wrapper around generate
      
      * Update
      
      * update mask
      
      * breakpoints galore
      
      * remove conflict
      
      * switch to left-padding
      
      * add incomplete model doc
      
      * add paligemma global files
      
      * batch rename paligemma
      
      * make generation match outputs and captioning
      
      * style
      
      * style
      
      * remove copied from + doc
      
      * remove more copied from
      
      * remove copy from projector
      
      * minor fix
      
      * update config and style
      
      * add readme - dummy
      
      * CORRECT image captioning
      
      * moving to args
      
      * add siglip proper + fix merging image + text features
      
      * take update_causal_mask from upstream
      
      * remove breakpoint
      
      * leverage AutoModel
      
      * fix input_ids slicing
      
      * make siglip head conditional
      
      * remove encoder_decoder value
      
      * remove unneeded modeling file
      
      * add commented 4d attention mask
      
      * FIXED generation with 4D mask
      
      * Update src/transformers/models/siglip/modeling_siglip.py
       Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix left padding detection
      
      * shuffle order of verifications
      
      * fix missing labels for training
      
      * fix
      
      * vectorize merging of features, improve slicing
      
      * improve testing before conversion
      
      * handle merging in processor
      
      * image token index depends on checkpoint
      
      * add variants, save processor too
      
      * save processors, base tokenizer off spm file
      
      * expand model embeddings due to additional image token
      
      * pass image processing args
      
      * add convert rgb to siglip processor
      
      * add \n token separately
      
      * fix tokenizer and prompts
      
      * fix docstrings
      
      * change to camel
      
      * fix casing
      
      * debug pos_ids and sdpa
      
      * pass and use cache_position
      
      * add flag for newline tokenization
      
      * Update src/transformers/models/paligemma/processing_paligemma.py
       Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
      
      * simplify conversion script
      
      * add copied from
      
      * add precision to conversion script
      
      * Update src/transformers/models/paligemma/modeling_paligemma.py
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * clean up
      
      * Shift attention mask from `1:`
      
      After discussion with @molbap
      
      * add docs, fix quality
      
      * quality, tied weights inheritance, and logits/label alignment
      
      * fix more tests
      
      * pass attn_implementation to language model correctly
      
      * add SiglipVisionTransformer to no split modules
      
      * skip paligemma test for sdpa dispatch to flash
      
      * skip incompatible tests
      
      * quality
      
      * [broken archive maps]
      
      * Apply suggestions
      
      - remove archive lists
      - style
      - take shape of inputs_embeds for batch
       Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/utils/dummy_pt_objects.py
       Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * simplify conversion script
      
      * add suggestions
      
      * add suggestions
      
      * add copied from
      
      * fix
      
      * move labels out
      
      * revert
      
      * fix
      
      * remove placeholder labels if None
      
      * use cache_position
      
      * fix quality + docstrings
      
      * fix quality
      
      * fix paligemma 4d gemma mask incompatibility
      
      * fix config docstring
      
      * fix query and attn_mask dtype
      
      ---------
       Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
       Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
       Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
       Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
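      The "switch to left-padding" step above matters for decoder-only generation: the last position of every row in the batch must be a real token so sampling continues from actual content. A minimal pure-Python sketch (pad id 0 is an assumption):

      ```python
      PAD = 0  # assumed pad token id
      batch = [[5, 6, 7], [8, 9]]
      width = max(len(seq) for seq in batch)

      # Left-padding: pads go in front, so real tokens end every row.
      left = [[PAD] * (width - len(seq)) + seq for seq in batch]
      # Right-padding: the shorter row would end in PAD, corrupting generation.
      right = [seq + [PAD] * (width - len(seq)) for seq in batch]

      print(left)   # [[5, 6, 7], [0, 8, 9]]
      print(right)  # [[5, 6, 7], [8, 9, 0]]
      ```

      This is also why the PR adds left-padding detection: a right-padded batch silently produces garbage continuations rather than an error.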
  6. 09 May, 2024 2 commits
  7. 25 Apr, 2024 1 commit
  8. 24 Apr, 2024 1 commit
  9. 16 Apr, 2024 1 commit
  10. 25 Mar, 2024 1 commit
  11. 14 Feb, 2024 1 commit
  12. 24 Jan, 2024 1 commit
    • Improved type hinting for all attention parameters (#28479) · 5d29530e
      nakranivaibhav authored
      * Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor,...]] = None'
      
      * Fixed the ruff formatting issue
      
      * fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None'
      
      * Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py
      
      * test fail update
      
       * fixed type hinting for these 15 scripts modeling_xlnet.py,modeling_tf_xlnet.py,modeling_led.py,modeling_tf_led.py,modeling_rwkv.py,modeling_dpt.py,modeling_tf_cvt.py,modeling_clip.py,modeling_flax_clip.py,modeling_tf_clip.py,modeling_longformer.py,modeling_tf_longformer.py,modeling_siglip.py,modeling_clap.py,modeling_git.py
      
      * Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py
      
      * test fail update
      
      * Removed the myvenv file
      
      * Fixed type hinting for these 8 scripts modeling_tvlt.py,modeling_sam.py,modeling_tf_sam.py,modeling_tvp.py,modeling_rag.py,modeling_tf_rag.py,modeling_tf_xlm.py,modeling_xlm.py
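      The hint adopted above, `Optional[Tuple[torch.FloatTensor, ...]] = None`, is a variadic tuple: any number of tensors, or `None` when the output is not requested. A minimal sketch using plain `typing`, with `float` standing in for `torch.FloatTensor`:

      ```python
      from typing import Optional, Tuple, get_type_hints

      def forward(
          hidden_states: Optional[Tuple[float, ...]] = None,
          attentions: Optional[Tuple[float, ...]] = None,
      ) -> None:
          # Tuple[X, ...] means "a tuple of any length whose items are X";
          # Optional[...] = None covers callers that request neither output.
          pass

      hints = get_type_hints(forward)
      assert hints["hidden_states"] == Optional[Tuple[float, ...]]
      ```

      The `...` in the tuple is essential: `Tuple[X]` would mean a tuple of exactly one element, which is wrong for per-layer output collections.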
  13. 19 Jan, 2024 1 commit
  14. 08 Jan, 2024 1 commit
    • Add SigLIP (#26522) · 3b742ea8
      NielsRogge authored
      
      
      * Add first draft
      
      * Use appropriate gelu function
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Convert checkpoint
      
      * More improvements
      
      * Improve docs, remove print statements
      
      * More improvements
      
      * Add link
      
      * remove unused masking function
      
      * begin tokenizer
      
      * do_lower_case
      
      * debug
      
      * set split_special_tokens=True
      
      * Remove script
      
      * Fix style
      
      * Fix rebase
      
      * Use same design as CLIP
      
      * Add fast tokenizer
      
      * Add SiglipTokenizer to init, remove extra_ids
      
      * Improve conversion script
      
      * Use smaller inputs in conversion script
      
      * Update conversion script
      
      * More improvements
      
      * Add processor to conversion script
      
      * Add tests
      
      * Remove print statements
      
      * Add tokenizer tests
      
      * Fix more tests
      
      * More improvements related to weight initialization
      
      * More improvements
      
      * Make more tests pass
      
      * More improvements
      
      * More improvements
      
      * Add copied from
      
      * Add canonicalize_text
      
      * Enable fast tokenizer tests
      
      * More improvements
      
      * Fix most slow tokenizer tests
      
      * Address comments
      
      * Fix style
      
      * Remove script
      
      * Address some comments
      
      * Add copied from to tests
      
      * Add more copied from
      
      * Add more copied from
      
      * Add more copied from
      
      * Remove is_flax_available
      
      * More updates
      
      * Address comment
      
      * Remove SiglipTokenizerFast for now
      
      * Add caching
      
      * Remove umt5 test
      
      * Add canonicalize_text inside _tokenize, thanks Arthur
      
      * Fix image processor tests
      
      * Skip tests which are not applicable
      
      * Skip test_initialization
      
      * More improvements
      
      * Compare pixel values
      
      * Fix doc tests, add integration test
      
      * Add do_normalize
      
      * Remove causal mask and leverage ignore copy
      
      * Fix attention_mask
      
      * Fix remaining tests
      
      * Fix dummies
      
      * Rename temperature and bias
      
      * Address comments
      
      * Add copied from to tokenizer tests
      
      * Add SiglipVisionModel to auto mapping
      
      * Add copied from to image processor tests
      
      * Improve doc
      
      * Remove SiglipVisionModel from index
      
      * Address comments
      
      * Improve docs
      
      * Simplify config
      
      * Add first draft
      
      * Make it like mistral
      
      * More improvements
      
      * Fix attention_mask
      
      * Fix output_attentions
      
      * Add note in docs
      
      * Convert multilingual model
      
      * Convert large checkpoint
      
      * Convert more checkpoints
      
      * Add pipeline support, correct image_mean and image_std
      
      * Use padding=max_length by default
      
      * Make processor like llava
      
      * Add code snippet
      
      * Convert more checkpoints
      
      * Set keep_punctuation_string=None as in OpenCLIP
      
      * Set normalized=False for special tokens
      
      * Fix doc test
      
      * Update integration test
      
      * Add figure
      
      * Update organization
      
      * Happy new year
      
      * Use AutoModel everywhere
      
      ---------
       Co-authored-by: patil-suraj <surajp815@gmail.com>
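      SigLIP's defining change over CLIP, reflected in the "Rename temperature and bias" step above, is a pairwise sigmoid loss with a learnable temperature and bias instead of a batch-wise softmax. A minimal pure-Python sketch of that loss (an illustration in the spirit of the SigLIP paper, not the transformers implementation):

      ```python
      import math

      def sigmoid_loss(img, txt, t, bias):
          """Every image-text pair is an independent binary classification:
          matching pair (i == j) has label +1, mismatched pair has label -1."""
          n = len(img)
          total = 0.0
          for i in range(n):
              for j in range(n):
                  logit = t * sum(x * y for x, y in zip(img[i], txt[j])) + bias
                  z = 1.0 if i == j else -1.0
                  total += math.log1p(math.exp(-z * logit))  # -log sigmoid(z * logit)
          return total / n

      # Toy normalized embeddings: the diagonal pairs match perfectly.
      img = [[1.0, 0.0], [0.0, 1.0]]
      txt = [[1.0, 0.0], [0.0, 1.0]]
      print(round(sigmoid_loss(img, txt, t=10.0, b=-5.0) if False else sigmoid_loss(img, txt, 10.0, -5.0), 4))  # ≈ 0.0134
      ```

      Because no softmax couples the batch together, the loss needs no gathered global batch, which is part of why SigLIP scales well.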
  15. 05 Dec, 2023 1 commit
  16. 16 Nov, 2023 1 commit
    • [`Styling`] stylify using ruff (#27144) · 651408a0
      Arthur authored
      
      
      * try to stylify using ruff
      
      * might need to remove these changes?
      
       * use ruff format and ruff check
      
       * use isinstance instead of type comparison
      
      * use # fmt: skip
      
      * use # fmt: skip
      
      * nits
      
       * some styling changes
      
      * update ci job
      
      * nits isinstance
      
      * more files update
      
      * nits
      
      * more nits
      
      * small nits
      
      * check and format
      
      * revert wrong changes
      
      * actually use formatter instead of checker
      
      * nits
      
      * well docbuilder is overwriting this commit
      
      * revert notebook changes
      
      * try to nuke docbuilder
      
      * style
      
       * fix feature extraction test
      
       * remove `indent-width = 4`
      
      * fixup
      
      * more nits
      
      * update the ruff version that we use
      
      * style
      
      * nuke docbuilder styling
      
       * leave the print for detected changes
      
      * nits
      
      * Remove file I/O
       Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
      
      * style
      
      * nits
      
      * revert notebook changes
      
      * Add # fmt skip when possible
      
      * Add # fmt skip when possible
      
      * Fix
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
       * Nits
      
      * more fixes
      
      * fix tapas
      
      * Another way to skip
      
      * Recommended way
      
       * Fix two more files
      
       * Remove asynch
      
      ---------
       Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
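      The migration above replaces black, isort, and flake8 with two ruff commands, `ruff format .` and `ruff check .`, with individual lines opting out of formatting via a trailing `# fmt: skip` comment. A hypothetical minimal `pyproject.toml` fragment in that spirit (the specific values here are assumptions, not the actual transformers settings):

      ```toml
      [tool.ruff]
      line-length = 119          # assumed; the commit removed `indent-width = 4`

      [tool.ruff.lint]
      select = ["E", "F", "I"]   # pycodestyle, pyflakes, import sorting
      ignore = ["E501"]          # line length is left to the formatter
      ```

      Consolidating the toolchain this way means CI runs one tool for both checking and formatting, which is why the PR also updates the CI job.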
  17. 27 Oct, 2023 2 commits
  18. 25 Oct, 2023 1 commit
    • [`core`] Refactor of `gradient_checkpointing` (#27020) · 06e782da
      Younes Belkada authored
      * v1
      
      * fix
      
      * remove `create_custom_forward`
      
      * fixup
      
      * fixup
      
      * add test and fix all failing GC tests
      
      * remove all remaining `create_custom_forward` methods
      
      * fix idefics bug
      
      * fixup
      
      * replace with `__call__`
      
      * add comment
      
      * quality
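      The `create_custom_forward` closures removed above existed only to hand `torch.utils.checkpoint.checkpoint` a plain callable; passing the layer's bound `__call__` achieves the same thing with no per-model boilerplate. A torch-free sketch of the before/after shape (the `checkpoint` stand-in below is hypothetical, not the real torch API):

      ```python
      def checkpoint(fn, *args):
          """Stand-in for torch.utils.checkpoint.checkpoint: just calls fn."""
          return fn(*args)

      class DecoderLayer:
          def __call__(self, hidden_states):
              return hidden_states * 2

      # Before the refactor: every model defined this wrapper closure.
      def create_custom_forward(module):
          def custom_forward(*inputs):
              return module(*inputs)
          return custom_forward

      layer = DecoderLayer()
      old = checkpoint(create_custom_forward(layer), 3)  # closure wrapper
      new = checkpoint(layer.__call__, 3)                # bound method, no wrapper
      print(old, new)  # 6 6
      ```

      A bound method is already a first-class callable, so the closure added nothing but indirection, which is why the PR can delete it everywhere at once.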
  19. 08 Aug, 2023 1 commit
  20. 21 Jul, 2023 1 commit
  21. 17 Jul, 2023 1 commit
  22. 13 Jul, 2023 1 commit
  23. 30 Jun, 2023 1 commit
  24. 27 Jun, 2023 1 commit
    • Clean load keys (#24505) · 8e5d1619
      Sylvain Gugger authored
      * Preliminary work on some models
      
      * Fix test load missing and make sure nonpersistent buffers are tested
      
      * Always ignore nonpersistent buffers if in state_dict
      
      * Treat models
      
      * More models
      
      * Treat remaining models
      
      * Fix quality
      
      * Fix tests
      
      * Remove draft
      
      * This test is not needed anymore
      
      * Fix copies
      
      * Fix last test
      
      * Newly added models
      
      * Fix last tests
      
      * Address review comments
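      The non-persistent-buffer handling above hinges on one rule: buffers registered with `persistent=False` never enter `state_dict`, so loading must not report them as missing keys. A pure-Python sketch of that rule (a toy mock, not torch's actual implementation):

      ```python
      class Module:
          """Toy module mimicking torch's buffer bookkeeping."""

          def __init__(self):
              self._buffers = {}  # name -> (value, persistent)

          def register_buffer(self, name, value, persistent=True):
              self._buffers[name] = (value, persistent)

          def state_dict(self):
              # Non-persistent buffers are recomputable, so they are not saved.
              return {n: v for n, (v, p) in self._buffers.items() if p}

      m = Module()
      m.register_buffer("position_ids", [0, 1, 2], persistent=False)
      m.register_buffer("scale", 1.0)
      print(m.state_dict())  # {'scale': 1.0}; position_ids is not serialized
      ```

      This is why the PR's loading code has to treat a non-persistent buffer absent from a checkpoint as expected rather than as a missing weight.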
  25. 22 Jun, 2023 1 commit
  26. 21 Jun, 2023 1 commit
  27. 14 Jun, 2023 1 commit
    • Fix URL in comment for contrastive loss function (#24271) · 6ab045d6
      TAE YOUNGDON authored
      * Update language_modeling.py
      
       In "class TextDatasetForNextSentencePrediction(Dataset)", "self.tokenizer.num_special_tokens_to_add(pair=True)" was being counted twice,
       
       so remove self.block_size and instead pass the block size as a parameter to "def create_examples_from_document", as "class LineByLineWithSOPTextDataset" does
      
      * Update language_modeling.py
      
      * Fix URL in comment for contrastive loss function
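      The double-counting fix above reduces to simple arithmetic: the number of special tokens must be subtracted from the block size exactly once. Illustrative numbers, not taken from the actual code:

      ```python
      block_size = 512
      num_special = 3  # e.g. [CLS] sent_a [SEP] sent_b [SEP] for a sentence pair

      usable = block_size - num_special   # correct: subtract once
      buggy = usable - num_special        # the bug: subtracting a second time
      print(usable, buggy)  # 509 506
      ```

      Subtracting twice silently truncates every example by `num_special` tokens, which is hard to notice because training still runs.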
  28. 01 Jun, 2023 1 commit
  29. 20 Apr, 2023 1 commit
  30. 31 Jan, 2023 1 commit
  31. 23 Jan, 2023 1 commit
  32. 20 Jan, 2023 1 commit
  33. 21 Dec, 2022 1 commit
  34. 19 Dec, 2022 1 commit
  35. 08 Dec, 2022 1 commit
  36. 15 Nov, 2022 1 commit
  37. 12 Oct, 2022 1 commit
  38. 07 Oct, 2022 1 commit