1. 30 Apr, 2024 3 commits
    • Joao Gante's avatar
      75bbfd5b
    • Jacky Lee's avatar
      Enable multi-device for more models (#30409) · 0ae789e0
      Jacky Lee authored
      * feat: support for dinov2
      
      * feat: support for depth_anything
      
      * feat: support for efficientformer
      
      * feat: support for bert (is this right?)
      
      * update: embedding split
      
      * remove: empty string
      
      * feat: support for align
      
      * fix: copies
      
      * fix: QAQBertEmbeddings
      
      * fix: more consistency issues
      
      * revert: support for efficientformer
      
      * feat: support for altclip
      
      * feat: support for blip_text
      
      * support for ChineseCLIP
      
      * feat: support for depth anything
      
      * feat: support for dpt
      
      * feat: support for dpt
      
      * feat: support for git
      
      * feat: support for groupvit
      
      * update: format
      
      * fix: support for clip
      
      * fix: consistency
      
      * feat: support for pvt
      
      * feat: support for vit_msn
      
      * fix: consistency
      
      * fix: other copies
      
      * remove: device transfer
      
      * revert: in-place add
      
      * update: support for align
      
      * update: support for bert
      
      * update: support for Chinese CLIP
      
      * revert: changes to efficientformer
      
      * update: support for dpt
      
      * update: support for efficientformer
      
      * revert: changes to git
      
      * revert: changes to groupvit
      
      * revert: changes to roc_bert
      
      * update: support for vit_msn
      
      * revert: changes to dpt
      
      * remove: extra space
      
      * style: extra space
      0ae789e0
    • Raushan Turganbay's avatar
      Pass `use_cache` in kwargs for GPTNeoX (#30538) · c712d05a
      Raushan Turganbay authored
      pass use_cache in kwargs
      c712d05a
  2. 29 Apr, 2024 4 commits
  3. 26 Apr, 2024 9 commits
    • Eduardo Pacheco's avatar
      [SegGPT] Fix seggpt image processor (#29550) · 6d4cabda
      Eduardo Pacheco authored
      * Fixed SegGptImageProcessor to handle 2D and 3D prompt mask inputs
      
      * Added new test to check prompt mask equivalence
      
      * New proposal
      
      * Better proposal
      
      * Removed unnecessary method
      
      * Updated seggpt docs
      
      * Introduced do_convert_rgb
      
      * nits
      6d4cabda
    • amyeroberts's avatar
      load_image - decode b64encode and encodebytes strings (#30192) · c793b26f
      amyeroberts authored
      * Decode b64encode and encodebytes strings
      
      * Remove conditional encode -- image is always a string
      c793b26f
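The `load_image` change above concerns strings produced by both `base64.b64encode` and `base64.encodebytes`. The difference is that `encodebytes` inserts newlines into its output. Below is a minimal sketch of why one decoding path can serve both forms. The helper name `decode_base64_image_string` is hypothetical and this is not the library's actual code.

```python
import base64


def decode_base64_image_string(image: str) -> bytes:
    """Decode a base64 string, whichever stdlib encoder produced it.

    Hypothetical helper for illustration only. base64.decodebytes skips the
    newlines that base64.encodebytes inserts, so both forms decode alike.
    """
    return base64.decodebytes(image.encode())


# Stand-in payload; a real caller would pass the bytes of an image file.
raw = bytes(range(256)) * 2

single_line = base64.b64encode(raw).decode()   # no line breaks
multi_line = base64.encodebytes(raw).decode()  # newline-separated lines

decoded_single = decode_base64_image_string(single_line)
decoded_multi = decode_base64_image_string(multi_line)
```

Both `decoded_single` and `decoded_multi` hold the original bytes, which could then be wrapped in `io.BytesIO` and handed to an image library.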
    • amyeroberts's avatar
      Fix GroundingDINO, DPR after BERT SDPA update (#30506) · e7d52a10
      amyeroberts authored
      Fix GroundingDINO, DPR after BERT SDPA update
      e7d52a10
    • amyeroberts's avatar
      [`DETR`] Remove timm hardcoded logic in modeling files (#29038) · aafa7ce7
      amyeroberts authored
      
      
      * Enable instantiating model with pretrained backbone weights
      
      * Clarify pretrained import
      
      * Use load_backbone instead
      
      * Add backbone_kwargs to config
      
      * Fix up
      
      * Add tests
      
      * Tidy up
      
      * Enable instantiating model with pretrained backbone weights
      
      * Update tests so backbone checkpoint isn't passed in
      
      * Clarify pretrained import
      
      * Update configs - docs and validation check
      
      * Update src/transformers/utils/backbone_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Clarify exception message
      
      * Update config init in tests
      
      * Add test for when use_timm_backbone=True
      
      * Use load_backbone instead
      
      * Add use_timm_backbone to the model configs
      
      * Add backbone_kwargs to config
      
      * Pass kwargs to constructors
      
      * Draft
      
      * Fix tests
      
      * Add back timm - weight naming
      
      * More tidying up
      
      * Whoops
      
      * Tidy up
      
      * Handle when kwargs are none
      
      * Update tests
      
      * Revert test changes
      
      * Deformable detr test - don't use default
      
      * Don't mutate; correct model attributes
      
      * Add some clarifying comments
      
      * nit - grammar is hard
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      aafa7ce7
    • Zach Mueller's avatar
      Remove skipping logic now that set_epoch exists (#30501) · 77ff304d
      Zach Mueller authored
      * Remove skipping logic now that set_epoch exists
      
      * Working version, clean
      77ff304d
    • JB (Don)'s avatar
      [`BERT`] Add support for sdpa (#28802) · dfa7b580
      JB (Don) authored
      * Adding SDPA support for BERT
      
      * Using the proper input name for testing model input in inference()
      
      * Adding documentation for SDPA in BERT model page
      
      * Use the stable link for the documentation
      
      * Adding a gate to only call .contiguous() for torch < 2.2.0
      
      * Additions and fixes to the documentation
      
      * Minor updates to documentation
      
      * Adding extra requirements needed for the contiguous() bug
      
      * Adding "Adapted from" in place of the "Copied from"
      
      * Add benchmark speedup tables to the documentation
      
      * Minor fixes to the documentation
      
      * Use ClapText as a replacement for Bert in the Copied-From
      
      * Some more fixes for the fix-copies references
      
      * Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage
      
      [test all]
      
      * Undo changes to separate test
      
      * Refactored SDPA self attention code for KV projections
      
      * Change use_sdpa to attn_implementation
      
      * Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
      dfa7b580
    • Michael Goin's avatar
      Update `dtype_byte_size` to handle torch.float8_e4m3fn/float8_e5m2 types (#30488) · 20081c74
      Michael Goin authored
      * Update modeling_utils/dtype_byte_size to handle float8 types
      
      * Add a test for dtype_byte_size
      
      * Format
      
      * Fix bool
      20081c74
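The `dtype_byte_size` change above extends byte-size lookup to float8 dtype names, which carry a suffix after the bit width (`float8_e4m3fn`, `float8_e5m2`). Below is a minimal sketch of one way to read the bit width out of such a name. It works on plain strings so it runs without torch. The function name `dtype_name_byte_size` and the regular expression are assumptions for illustration, not the library's implementation.

```python
import re


def dtype_name_byte_size(dtype_name: str) -> float:
    """Return the size in bytes implied by a dtype name such as 'torch.float16'.

    Hypothetical helper: it parses the first run of digits that follows a
    non-digit and is followed by either the end of the name or an
    underscore-prefixed suffix (as in 'float8_e4m3fn').
    """
    if dtype_name.endswith("bool"):
        # Treated here as one bit per element.
        return 1 / 8
    match = re.search(r"[^\d](\d+)(_.*)?$", dtype_name)
    if match is None:
        raise ValueError(f"Not a recognisable dtype name: {dtype_name!r}")
    bit_size = int(match.group(1))
    return bit_size // 8


sizes = {
    name: dtype_name_byte_size(name)
    for name in (
        "torch.float32",
        "torch.float16",
        "torch.float8_e4m3fn",
        "torch.float8_e5m2",
    )
}
```

The optional `(_.*)?` group is what lets the suffix after the `8` be skipped, so the parsed width is 8 bits rather than a digit taken from the suffix.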
    • kyo's avatar
      Fix the `bitsandbytes` error formatting ("Some modules are dispatched on ...") (#30494) · 59e715f7
      kyo authored
      Fix the `bitsandbytes` error when some modules are not properly offloaded.
      59e715f7
    • Younes Belkada's avatar
      FEAT: PEFT support for EETQ (#30449) · 19cfdf0f
      Younes Belkada authored
      Update quantizer_eetq.py
      19cfdf0f
  4. 25 Apr, 2024 9 commits
    • Younes Belkada's avatar
      Quantization: `HfQuantizer` quant method update (#30484) · 26ddc580
      Younes Belkada authored
      ensure popular quant methods are supported
      26ddc580
    • Xuehai Pan's avatar
    • Raushan Turganbay's avatar
      Fix Llava for 0-embeddings (#30473) · e60491ad
      Raushan Turganbay authored
      e60491ad
    • Zach Mueller's avatar
      Introduce Stateful Callbacks (#29666) · ad697f18
      Zach Mueller authored
      
      
      * Introduce saveable callbacks
      
      * Add note
      
      * Test for non-present and flag
      
      * Support early stopping and refusing to train further
      
      * Update docstring
      
      * More saving
      
      * Import oopsie
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Make it go through TrainerArguments
      
      * Document
      
      * Fix test
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Rework to allow for duplicates
      
      * Clean
      
      * Fix failing tests
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      ad697f18
    • Alexander Visheratin's avatar
      Add WSD scheduler (#30231) · 7b1170b0
      Alexander Visheratin authored
      * Added WSD scheduler.
      
      * Added tests.
      
      * Fixed errors.
      
      * Fix formatting.
      
      * CI fixes.
      7b1170b0
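The commit above adds a WSD (warmup-stable-decay) learning-rate scheduler. Below is a minimal sketch of the shape such a schedule takes, written as a pure function that returns the multiplier applied to the base learning rate. The function name, the parameter names, and the cosine-shaped decay are assumptions chosen for illustration. The commit message does not specify them.

```python
import math


def wsd_lr_multiplier(
    step: int,
    num_warmup_steps: int,
    num_stable_steps: int,
    num_decay_steps: int,
    min_lr_ratio: float = 0.0,
) -> float:
    """Learning-rate multiplier for a warmup-stable-decay schedule (sketch)."""
    if step < num_warmup_steps:
        # Warmup: rise linearly from 0 to 1.
        return step / max(1, num_warmup_steps)
    if step < num_warmup_steps + num_stable_steps:
        # Stable: hold the base learning rate.
        return 1.0
    if step < num_warmup_steps + num_stable_steps + num_decay_steps:
        # Decay: half-cosine from 1 down to min_lr_ratio (assumed shape).
        progress = (step - num_warmup_steps - num_stable_steps) / max(1, num_decay_steps)
        cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
        return min_lr_ratio + (1.0 - min_lr_ratio) * cosine
    # After the decay phase: stay at the floor.
    return min_lr_ratio


# Multipliers over a 10-step warmup, 10 stable steps and a 10-step decay.
schedule = [wsd_lr_multiplier(s, 10, 10, 10) for s in range(31)]
```

A function of this form could be passed, with the step counts bound via `functools.partial`, to a lambda-based scheduler such as `torch.optim.lr_scheduler.LambdaLR`.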
    • Yoach Lacombe's avatar
      🚨 Add training compatibility for Musicgen-like models (#29802) · 90cb55bf
      Yoach Lacombe authored
      
      
      * first modeling code
      
      * make repository
      
      * still WIP
      
      * update model
      
      * add tests
      
      * add latest change
      
      * clean docstrings and copied from
      
      * update docstrings md and readme
      
      * correct chroma function
      
      * correct copied from and remove unrelated test
      
      * add doc to toctree
      
      * correct imports
      
      * add convert script to notdoctested
      
      * Add suggestion from Sanchit
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * correct get_unconditional_inputs docstrings
      
      * modify README according to SANCHIT feedback
      
      * add chroma to audio utils
      
      * clean librosa and torchaudio hard dependencies
      
      * fix FE
      
      * refactor audio decoder -> audio encoder for consistency with previous musicgen
      
      * refactor conditional -> encoder
      
      * modify sampling rate logics
      
      * modify license at the beginning
      
      * refactor all_self_attns->all_attentions
      
      * remove ignore copy from causallm generate
      
      * add copied from for from_sub_models
      
      * fix make copies
      
      * add warning if audio is truncated
      
      * add copied from where relevant
      
      * remove artefact
      
      * fix convert script
      
      * fix torchaudio and FE
      
      * modify chroma method according to feedback-> better naming
      
      * refactor input_values->input_features
      
      * refactor input_values->input_features and fix import fe
      
      * add input_features to docstrings
      
      * correct inputs_embeds logics
      
      * remove dtype conversion
      
      * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation
      
      * change warning for chroma length
      
      * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * change way to save wav, using soundfile
      
      * correct docs and change to soundfile
      
      * fix import
      
      * fix init proj layers
      
      * add draft training
      
      * fix cross entropy
      
      * clean loss computation
      
      * fix labels
      
      * remove line breaks from md
      
      * fix issue with docstrings
      
      * add FE suggestions
      
      * improve is in logics and remove useless imports
      
      * remove custom from_pretrained
      
      * simplify docstring code
      
      * add suggestions for modeling tests
      
      * make style
      
      * update converting script with sanity check
      
      * remove encoder attention mask from conditional generation
      
      * replace musicgen melody checkpoints with official orga
      
      * rename ylacombe->facebook in checkpoints
      
      * fix copies
      
      * remove unnecessary warning
      
      * add shape in code docstrings
      
      * add files to slow doc tests
      
      * fix md bug and add md to not_tested
      
      * make fix-copies
      
      * fix hidden states test and batching
      
      * update training code
      
      * add training tests for melody
      
      * add training for o.g musicgen
      
      * fix copied from
      
      * remove final todos
      
      * make style
      
      * fix style
      
      * add suggestions from review
      
      * add ref to the original loss computation code
      
      * rename method + fix labels in tests
      
      * make style
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      90cb55bf
    • Tom Aarsen's avatar
      Prevent crash with `WandbCallback` with third parties (#30477) · ce5ae5a4
      Tom Aarsen authored
      * Use EAFP principle to prevent crash with third parties
      
      * Remove leftover debugging code
      
      * Add info-level logger message
      ce5ae5a4
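The commit above applies the EAFP principle ("easier to ask forgiveness than permission") so that a callback does not crash on third-party model objects. Below is a minimal sketch of the pattern: attempt the call, and fall back to an info-level log message if the attribute is missing. The class names and the `count_parameters` helper are hypothetical and are not taken from the `WandbCallback` source.

```python
import logging

logger = logging.getLogger(__name__)


class ThirdPartyModel:
    """Hypothetical model wrapper that has no `num_parameters` method."""


class NativeModel:
    """Hypothetical model that does expose `num_parameters`."""

    def num_parameters(self) -> int:
        return 42


def count_parameters(model):
    """Return the parameter count if the model can report it, else None."""
    # EAFP: try the call directly instead of checking with hasattr() first.
    try:
        return model.num_parameters()
    except AttributeError:
        logger.info("Could not determine the number of model parameters.")
        return None


native_count = count_parameters(NativeModel())
third_party_count = count_parameters(ThirdPartyModel())
```

The optional metadata is skipped for objects that cannot supply it, so logging continues instead of raising.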
    • amyeroberts's avatar
      Fix SigLip classification doctest (#30475) · 4fed29e3
      amyeroberts authored
      * Fix SigLip classification doctest
      
      * Remove extra line
      
      * Update src/transformers/models/siglip/modeling_siglip.py
      4fed29e3
    • Arthur's avatar
      [fix codellama conversion] (#30472) · c60749d6
      Arthur authored
      * fix codellama conversion
      
      * nit
      c60749d6
  5. 24 Apr, 2024 11 commits
  6. 23 Apr, 2024 4 commits