1. 26 Apr, 2024 8 commits
    • [examples] update whisper fine-tuning (#29938) · 38b53da3
      Sanchit Gandhi authored
      * [examples] update whisper fine-tuning
      
      * deprecate forced/suppress tokens
      
      * item assignment
      
      * update readme
      
      * final fix
      38b53da3
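      A minimal sketch of the direction this example update points in: instead of zeroing out `forced_decoder_ids`/`suppress_tokens` on the config, the language and task are passed at generation time. The checkpoint, language and dummy input below are placeholders, not taken from the commit:

```python
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Placeholder checkpoint and language, purely for illustration.
processor = WhisperProcessor.from_pretrained(
    "openai/whisper-small", language="hindi", task="transcribe"
)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Older recipes did:
#   model.config.forced_decoder_ids = None
#   model.config.suppress_tokens = []
# With those attributes deprecated, language/task go straight to generate().
dummy_features = torch.zeros(1, 80, 3000)  # (batch, mel bins, frames) placeholder
generated_ids = model.generate(dummy_features, language="hindi", task="transcribe")
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```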
    • [`DETR`] Remove timm hardcoded logic in modeling files (#29038) · aafa7ce7
      amyeroberts authored
      * Enable instantiating model with pretrained backbone weights
      
      * Clarify pretrained import
      
      * Use load_backbone instead
      
      * Add backbone_kwargs to config
      
      * Fix up
      
      * Add tests
      
      * Tidy up
      
      * Enable instantiating model with pretrained backbone weights
      
      * Update tests so backbone checkpoint isn't passed in
      
      * Clarify pretrained import
      
      * Update configs - docs and validation check
      
      * Update src/transformers/utils/backbone_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Clarify exception message
      
      * Update config init in tests
      
      * Add test for when use_timm_backbone=True
      
      * Use load_backbone instead
      
      * Add use_timm_backbone to the model configs
      
      * Add backbone_kwargs to config
      
      * Pass kwargs to constructors
      
      * Draft
      
      * Fix tests
      
      * Add back timm - weight naming
      
      * More tidying up
      
      * Whoops
      
      * Tidy up
      
      * Handle when kwargs are none
      
      * Update tests
      
      * Revert test changes
      
      * Deformable detr test - don't use default
      
      * Don't mutate; correct model attributes
      
      * Add some clarifying comments
      
      * nit - grammar is hard
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      aafa7ce7
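      A hedged sketch of the config surface this touches (`use_timm_backbone`, `use_pretrained_backbone`, `backbone_kwargs`); the backbone name and kwargs are illustrative values only, and timm must be installed:

```python
from transformers import DetrConfig, DetrForObjectDetection

# Illustrative values; backbone_kwargs are forwarded to the backbone constructor.
config = DetrConfig(
    use_timm_backbone=True,         # build the backbone through timm
    backbone="resnet50",            # timm model name
    use_pretrained_backbone=False,  # random init, no checkpoint download
    backbone_kwargs={"out_indices": (1, 2, 3, 4)},
)
model = DetrForObjectDetection(config)
print(f"parameters: {sum(p.numel() for p in model.parameters()):,}")
```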
    • Remove skipping logic now that set_epoch exists (#30501) · 77ff304d
      Zach Mueller authored
      * Remove skipping logic now that set_epoch exists
      
      * Working version, clean
      77ff304d
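      For context, a toy sketch (not the Trainer internals) of the `set_epoch` mechanism that makes manual batch-skipping unnecessary; the dataset and loop are placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()
dataset = TensorDataset(torch.arange(8, dtype=torch.float32))
dataloader = accelerator.prepare(DataLoader(dataset, batch_size=2, shuffle=True))

for epoch in range(3):
    # Re-seed the shuffle for this epoch; resuming mid-training can then
    # reproduce the ordering without replaying and skipping earlier batches.
    if hasattr(dataloader, "set_epoch"):
        dataloader.set_epoch(epoch)
    for batch in dataloader:
        pass  # training step would go here
```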
    • [`BERT`] Add support for sdpa (#28802) · dfa7b580
      JB (Don) authored
      * Adding SDPA support for BERT
      
      * Using the proper input name for testing model input in inference()
      
      * Adding documentation for SDPA in BERT model page
      
      * Use the stable link for the documentation
      
      * Adding a gate to only call .contiguous() for torch < 2.2.0
      
      * Additions and fixes to the documentation
      
      * Minor updates to documentation
      
      * Adding extra requirements needed for the contiguous() bug
      
      * Adding "Adapted from" in plcae of the "Copied from"
      
      * Add benchmark speedup tables to the documentation
      
      * Minor fixes to the documentation
      
      * Use ClapText as a replacement for Bert in the Copied-From
      
      * Some more fixes for the fix-copies references
      
      * Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage
      
      [test all]
      
      * Undo changes to separate test
      
      * Refactored SDPA self attention code for KV projections
      
      * Change use_sdpa to attn_implementation
      
      * Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
      dfa7b580
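      Opting into the new backend is the usual `attn_implementation` switch; the checkpoint below is a placeholder and fp16 is only used when a GPU is available:

```python
import torch
from transformers import AutoTokenizer, BertModel

device = "cuda" if torch.cuda.is_available() else "cpu"
model = BertModel.from_pretrained(
    "google-bert/bert-base-uncased",
    attn_implementation="sdpa",  # attention via torch.nn.functional.scaled_dot_product_attention
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
inputs = tokenizer("SDPA-backed BERT forward pass", return_tensors="pt").to(device)
with torch.no_grad():
    last_hidden_state = model(**inputs).last_hidden_state
```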
    • Use the Keras set_random_seed in tests (#30504) · 2de5cb12
      Matt authored
      Use the Keras set_random_seed to ensure reproducible weight initialization
      2de5cb12
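      For reference, the call seeds Python's `random`, NumPy and the TensorFlow backend in one go, so freshly initialized weights are identical across runs:

```python
from tensorflow import keras

keras.utils.set_random_seed(42)  # seeds random, numpy and TensorFlow together

model = keras.Sequential([keras.Input(shape=(8,)), keras.layers.Dense(4)])
print(model.layers[0].get_weights()[0][0, :2])  # same values on every run
```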
    • Update `dtype_byte_size` to handle torch.float8_e4m3fn/float8_e5m2 types (#30488) · 20081c74
      Michael Goin authored
      * Update modeling_utils/dtype_byte_size to handle float8 types
      
      * Add a test for dtype_byte_size
      
      * Format
      
      * Fix bool
      20081c74
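      A rough sketch of the kind of handling this adds (not the exact library code): the bit width in `torch.float8_e4m3fn`/`torch.float8_e5m2` is followed by a suffix, so a "digits at the end of the name" pattern misses it:

```python
import re
import torch

def dtype_byte_size(dtype: torch.dtype) -> float:
    """Hypothetical helper returning bytes per element for a torch dtype."""
    if dtype == torch.bool:
        return 1 / 8
    # Allow an optional suffix after the bit width, e.g. float8_e4m3fn.
    match = re.search(r"[^\d](\d+)(_.*)?$", str(dtype))
    if match is None:
        raise ValueError(f"`dtype` is not a valid dtype: {dtype}.")
    return int(match.group(1)) / 8

assert dtype_byte_size(torch.float8_e4m3fn) == 1
assert dtype_byte_size(torch.float16) == 2
assert dtype_byte_size(torch.bool) == 0.125
```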
    • Fix the `bitsandbytes` error formatting ("Some modules are dispatched on ...") (#30494) · 59e715f7
      kyo authored
      Fix the `bitsandbytes` error when some modules are not properly offloaded.
      59e715f7
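      The reworked message appears when `device_map="auto"` places some modules on CPU or disk during 8-bit loading; the remedy it points to looks roughly like this (checkpoint is a placeholder, and a CUDA GPU plus bitsandbytes are assumed):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True,  # keep offloaded modules in fp32 on CPU/disk
)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-1b7",  # placeholder checkpoint
    quantization_config=quantization_config,
    device_map="auto",
)
```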
    • FEAT: PEFT support for EETQ (#30449) · 19cfdf0f
      Younes Belkada authored
      Update quantizer_eetq.py
      19cfdf0f
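      A hedged sketch of what this change is meant to enable, namely attaching a LoRA adapter to an EETQ-quantized model; the checkpoint, LoRA settings and target module names are placeholders, and the `eetq` kernels must be installed:

```python
from transformers import AutoModelForCausalLM, EetqConfig
from peft import LoraConfig, get_peft_model

quantization_config = EetqConfig("int8")  # 8-bit weight-only quantization via EETQ
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder checkpoint
    quantization_config=quantization_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```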
  2. 25 Apr, 2024 18 commits
  3. 24 Apr, 2024 14 commits