1. 08 Jan, 2024 2 commits
  2. 07 Jan, 2024 1 commit
  3. 05 Jan, 2024 2 commits
  4. 04 Jan, 2024 2 commits
  5. 03 Jan, 2024 2 commits
    • Remove token_type_ids from model_input_names (like #24788) (#28325) · 45b1dfa3
      Apsod authored
      * remove token_type_ids from model_input_names (like #24788)
      
      * removed test that assumed token_type_ids should be present and updated a model reference so that it points to an available model
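      A rough sketch of what this change means for callers, assuming the usual tokenizer behaviour (the checkpoint below is only an illustration, not the model touched by this PR): a tokenizer only returns token_type_ids by default when that key is listed in its model_input_names.

      ```python
      from transformers import AutoTokenizer

      tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative checkpoint
      print(tok.model_input_names)  # e.g. ['input_ids', 'token_type_ids', 'attention_mask']

      enc = tok("hello world")
      # token_type_ids is only included by default when it appears in model_input_names,
      # so dropping it there also drops it from the default encoding.
      print("token_type_ids" in enc)
      ```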
    • Add FastSpeech2Conformer (#23439) · d83ff5ee
      Connor Henderson authored
      * start - docs, SpeechT5 copy and rename
      
      * add relevant code from FastSpeech2 draft, have tests pass
      
      * make it an actual conformer, demo ex.
      
      * matching inference with original repo, includes debug code
      
      * refactor nn.Sequentials, start using more descriptive variable names
      
      * more renaming
      
      * more renaming
      
      * vocoder scratchwork
      
      * matching vocoder outputs
      
      * hifigan vocoder conversion script
      
      * convert model script, rename some config vars
      
      * replace postnet with speecht5's implementation
      
      * passing common tests, file cleanup
      
      * expand testing, add output hidden states and attention
      
      * tokenizer + passing tokenizer tests
      
      * variety of updates and tests
      
      * g2p_en package setup
      
      * import structure edits
      
      * docstrings and cleanup
      
      * repo consistency
      
      * deps
      
      * small cleanup
      
      * forward signature param order
      
      * address comments except for masks and labels
      
      * address comments on attention_mask and labels
      
      * address second round of comments
      
      * remove old unneeded line
      
      * address comments part 1
      
      * address comments pt 2
      
      * rename auto mapping
      
      * fixes for failing tests
      
      * address comments part 3 (bart-like, train loss)
      
      * make style
      
      * pass config where possible
      
      * add forward method + tests to WithHifiGan model
      
      * make style
      
      * address arg passing and generate_speech comments
      
      * address Arthur comments
      
      * address Arthur comments pt2
      
      * lint changes
      
      * Sanchit comment
      
      * add g2p-en to doctest deps
      
      * move up self.encoder
      
      * onnx compatible tensor method
      
      * fix is symbolic
      
      * fix paper url
      
      * move models to espnet org
      
      * make style
      
      * make fix-copies
      
      * update docstring
      
      * Arthur comments
      
      * update docstring w/ new updates
      
      * add model architecture images
      
      * header size
      
      * md wording update
      
      * make style
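      A minimal text-to-speech sketch with the classes added here; the checkpoint names follow the "move models to espnet org" note above but should be treated as assumptions, and g2p-en must be installed for the tokenizer.

      ```python
      # pip install g2p-en
      import torch
      from transformers import FastSpeech2ConformerTokenizer, FastSpeech2ConformerWithHifiGan

      tokenizer = FastSpeech2ConformerTokenizer.from_pretrained("espnet/fastspeech2_conformer")
      model = FastSpeech2ConformerWithHifiGan.from_pretrained("espnet/fastspeech2_conformer_with_hifigan")

      inputs = tokenizer("Hello, my dog is cute.", return_tensors="pt")
      with torch.no_grad():
          output = model(input_ids=inputs["input_ids"], return_dict=True)

      # The WithHifiGan forward added in this PR returns the synthesized audio directly.
      waveform = output.waveform  # shape: (batch, num_samples)
      ```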
  6. 22 Dec, 2023 4 commits
  7. 21 Dec, 2023 6 commits
  8. 20 Dec, 2023 2 commits
  9. 19 Dec, 2023 1 commit
  10. 18 Dec, 2023 1 commit
    • More TF fixes (#28081) · 71d47f0a
      Matt authored
      * More build_in_name_scope()
      
      * Make sure we set the save spec now that we don't do it with dummies anymore
      
      * make fixup
  11. 15 Dec, 2023 2 commits
  12. 14 Dec, 2023 4 commits
    • Replace build() with build_in_name_scope() for some TF tests (#28046) · 3060899b
      Matt authored
      Replace build() with build_in_name_scope() for some tests
    • Proper build() methods for TF (#27794) · 050e0b44
      Matt authored
      * Add a convenience method for building in your own name scope
      
      * Second attempt at auto layer building
      
      * Revert "Second attempt at auto layer building"
      
      This reverts commit e03a3aaecf9ec41a805582b83cbdfe3290a631be.
      
      * Attempt #3
      
      * Revert "Attempt #3"
      
      This reverts commit b9df7a0857560d29b5abbed6127d9e9eca77cf47.
      
      * Add missing attributes that we're going to need later
      
      * Add some attributes we're going to need later
      
      * A fourth attempt! Feel the power flow through you!
      
      * Revert "A fourth attempt! Feel the power flow through you!"
      
      This reverts commit 6bf4aaf3875d6f28485f50187617a4c616c8aff7.
      
      * Add more values we'll need later
      
      * TF refactor that we'll need later
      
      * Revert "TF refactor that we'll need later"
      
      This reverts commit ca07202fb5b7b7436b893baa8d688b4f348ea7b9.
      
      * Revert "Revert "TF refactor that we'll need later""
      
      This reverts commit 1beb0f39f293ed9c27594575e1c849aadeb15c13.
      
      * make fixup
      
      * Attempt five!
      
      * Revert "Attempt five!"
      
      This reverts commit 3302207958dfd0374b0447a51c06eea51a506044.
      
      * Attempt six - this time don't add empty methods
      
      * Revert "Attempt six - this time don't add empty methods"
      
      This reverts commit 67d60129be75416b6beb8f47c7d38d77b18d79bb.
      
      * Attempt seven - better base model class detection!
      
      * Revert "Attempt seven - better base model class detection!"
      
      This reverts commit 5f14845e92ea0e87c598da933bfbfee10f553bc9.
      
      * Another attribute we'll need later
      
      * Try again with the missing attribute!
      
      * Revert "Try again with the missing attribute!"
      
      This reverts commit 760c6f30c5dffb3e04b0e73c34a77d1882a0fef7.
      
      * This is the attempt that will pierce the heavens!
      
      * Revert "This is the attempt that will pierce the heavens!"
      
      This reverts commit c868bb657de057aca7a5260350a3f831fc4dfee6.
      
      * Attempt seven - snag list is steadily decreasing
      
      * Revert "Attempt seven - snag list is steadily decreasing"
      
      This reverts commit 46fbd975deda64429bfb3e5fac4fc0370c00d316.
      
      * Attempt eight - will an empty snag list do it?
      
      * Revert "Attempt eight - will an empty snag list do it?"
      
      This reverts commit 7c8a3c2b083253649569e9877e02054ae5cec67b.
      
      * Fixes to Hubert issues that cause problems later
      
      * Trying again with Conv1D/SeparableConv fixes
      
      * Revert "Trying again with Conv1D/SeparableConv fixes"
      
      This reverts commit 55092bca952bc0f750aa1ffe246a640bf1e2036e.
      
      * Apply the build shape fixes to Wav2Vec2 as well
      
      * One more attempt!
      
      * Revert "One more attempt!"
      
      This reverts commit 5ac3e4cb01b9458cc93312873725f9444ae7261c.
      
      * Another attempt!
      
      * Revert "Another attempt!"
      
      This reverts commit ea16d890e019d7de8792a3b8e72f3b1c02adae50.
      
      * Let's see how many failures we get without the internal build method
      
      * Fix OpenAI
      
      * Fix MobileBERT
      
      * (Mostly) fix GroupVIT
      
      * Fix BLIP
      
      * One more BLIP fix
      
      * One more BLIP fix!
      
      * Fix Regnet
      
      * Finally fully fix GroupViT
      
      * Fix Data2Vec and add the new AdaptivePool
      
      * Fix Segformer
      
      * Fix Albert
      
      * Fix Deberta/DebertaV2
      
      * Fix XLM
      
      * Actually fix XLM
      
      * Fix Flaubert
      
      * Fix lxmert
      
      * Fix Resnet
      
      * Fix ConvBERT
      
      * Fix ESM
      
      * Fix Convnext / ConvnextV2
      
      * Fix SAM
      
      * Fix Efficientformer
      
      * Fix LayoutLMv3
      
      * Fix speech_to_text
      
      * Fix mpnet and mobilevit
      
      * Fix Swin
      
      * Fix CTRL
      
      * Fix CVT
      
      * Fix DPR
      
      * Fix Wav2Vec2
      
      * Fix T5
      
      * Fix Hubert
      
      * Fix GPT2
      
      * Fix Whisper
      
      * Fix DeiT
      
      * Fix the encoder-decoder / dual-encoder classes
      
      * make fix-copies
      
      * build in name scope
      
      * Fix summarization test
      
      * Fix tied weight names for BART + Blenderbot
      
      * Fix tied weight name building
      
      * Fix to TFESM weight building
      
      * Update TF SAM
      
      * Expand all the shapes out into Big Boy Shapes
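      The pattern the TF models move to here, sketched on a toy layer (the class and shapes are illustrative, not code from this PR): each layer gets an explicit build() that constructs its sublayers with known shapes under their own name scopes, instead of relying on a forward pass with dummy inputs.

      ```python
      import tensorflow as tf

      class ToyAttentionOutput(tf.keras.layers.Layer):
          """Illustrative layer only, not a class from the PR."""

          def __init__(self, hidden_size: int, **kwargs):
              super().__init__(**kwargs)
              self.hidden_size = hidden_size
              self.dense = tf.keras.layers.Dense(hidden_size, name="dense")
              self.layer_norm = tf.keras.layers.LayerNormalization(name="LayerNorm")

          def build(self, input_shape=None):
              # Build each sublayer with an explicit shape under its own name scope,
              # so weight names stay stable without running dummy inputs through call().
              if self.built:
                  return
              self.built = True
              with tf.name_scope(self.dense.name):
                  self.dense.build([None, None, self.hidden_size])
              with tf.name_scope(self.layer_norm.name):
                  self.layer_norm.build([None, None, self.hidden_size])
      ```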
    • Fix languages covered by M4Tv2 (#28019) · bb1d0d0d
      Yoach Lacombe authored
      
      
      * correct language assessment + add tests
      
      * Update src/transformers/models/seamless_m4t_v2/modeling_seamless_m4t_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * make style + simplify and enrich test
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    • Joao Gante
  13. 13 Dec, 2023 3 commits
  14. 11 Dec, 2023 5 commits
  15. 08 Dec, 2023 3 commits
    • F.scaled_dot_product_attention support (#26572) · 80377eb0
      fxmarty authored
      
      
      * add sdpa
      
      * wip
      
      * cleaning
      
      * add ref
      
      * yet more cleaning
      
      * and more :)
      
      * wip llama
      
      * working llama
      
      * add output_attentions=True support
      
      * bigcode sdpa support
      
      * fixes
      
      * gpt-bigcode support, require torch>=2.1.1
      
      * add falcon support
      
      * fix conflicts falcon
      
      * style
      
      * fix attention_mask definition
      
      * remove output_attentions from attnmaskconverter
      
      * support whisper without removing any Copied from statement
      
      * fix mbart default to eager renaming
      
      * fix typo in falcon
      
      * fix is_causal in SDPA
      
      * check is_flash_attn_2_available in the models init as well in case the model is not initialized through from_pretrained
      
      * add warnings when falling back on the manual implementation
      
      * precise doc
      
      * wip replace _flash_attn_enabled by config.attn_implementation
      
      * fix typo
      
      * add tests
      
      * style
      
      * add a copy.deepcopy on the config in from_pretrained, as we do not want to modify it inplace
      
      * obey to config.attn_implementation if a config is passed in from_pretrained
      
      * fix is_torch_sdpa_available when torch is not installed
      
      * remove dead code
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update src/transformers/models/bart/modeling_bart.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * remove duplicate pretraining_tp code
      
      * add dropout in llama
      
      * precise comment on attn_mask
      
      * add fmt: off for _unmask_unattended docstring
      
      * precise num_masks comment
      
      * nuke pretraining_tp in LlamaSDPAAttention following Arthur's suggestion
      
      * cleanup modeling_utils
      
      * backward compatibility
      
      * fix style as requested
      
      * style
      
      * improve documentation
      
      * test pass
      
      * style
      
      * add _unmask_unattended tests
      
      * skip meaningless tests for idefics
      
      * hard_check SDPA requirements when specifically requested
      
      * standardize the use of XXX_ATTENTION_CLASSES
      
      * fix SDPA bug with mem-efficient backend on CUDA when using fp32
      
      * fix test
      
      * rely on SDPA is_causal parameter to handle the causal mask in some cases
      
      * fix FALCON_ATTENTION_CLASSES
      
      * remove _flash_attn_2_enabled occurrences
      
      * fix test
      
      * add OPT to the list of supported flash models
      
      * improve test
      
      * properly test on different SDPA backends, on different dtypes & properly handle separately the pad tokens in the test
      
      * remove remaining _flash_attn_2_enabled occurrence
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update src/transformers/modeling_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update src/transformers/modeling_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

      * Update docs/source/en/perf_infer_gpu_one.md
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * remove use_attn_implementation
      
      * fix docstring & slight bug
      
      * make attn_implementation internal (_attn_implementation)
      
      * typos
      
      * fix tests
      
      * deprecate use_flash_attention_2=True
      
      * fix test
      
      * add back llama that was removed by mistake
      
      * fix tests
      
      * remove _flash_attn_2_enabled occurrences, again
      
      * add check & test that passed attn_implementation is valid
      
      * fix falcon torchscript export
      
      * fix device of mask in tests
      
      * add tip about torch.jit.trace and move bt doc below sdpa
      
      * fix parameterized.expand order
      
      * move tests from test_modeling_attn_mask_utils to test_modeling_utils as a relevant test class is already there
      
      * update sdpaattention class with the new cache
      
      * Update src/transformers/configuration_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/bark/modeling_bark.py
      
      * address review comments
      
      * WIP torch.jit.trace fix. left: test both eager & sdpa
      
      * add test for torch.jit.trace for both eager/sdpa
      
      * fix falcon with torch==2.0 that needs to use sdpa
      
      * fix doc
      
      * hopefully last fix
      
      * fix key_value_length that has no default now in mask converter
      
      * is it flaky?
      
      * fix speculative decoding bug
      
      * tests do pass
      
      * fix following #27907
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
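      A minimal usage sketch of the switch this PR introduces; the checkpoint name and dtype are only examples. attn_implementation="sdpa" requests PyTorch's F.scaled_dot_product_attention, "eager" keeps the original manual attention, and use_flash_attention_2=True is deprecated in favour of attn_implementation="flash_attention_2".

      ```python
      import torch
      from transformers import AutoModelForCausalLM

      model = AutoModelForCausalLM.from_pretrained(
          "meta-llama/Llama-2-7b-hf",   # example checkpoint, not prescribed by the PR
          torch_dtype=torch.float16,
          attn_implementation="sdpa",   # requires a recent torch (>= 2.1.1 per the notes above)
      )
      ```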
    • Yih-Dar
    • Fix remaining issues in beam score calculation (#27808) · b31905d1
      Xin Qiu authored
      * Fix issues in add and is_done for BeamHypotheses
      
      * make newly added arguments optional for better compatibility
      
      * Directly use cur_len as generated_len, add note for retrocompatibility
      
      * update test expectation
      
      * make cur_len represent the length of the entire sequence including the decoder prompt
      
      * remove redundant if/else in testing
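      A simplified sketch of the length-penalized score that BeamHypotheses.add computes: the summed log-probabilities are normalized by a generation length raised to the length penalty, and which tokens count toward that length (with or without the decoder prompt) is exactly what this PR adjusts; the real method also keeps a retrocompatibility path when generated_len is not passed.

      ```python
      def beam_score(sum_logprobs: float, generated_len: int, length_penalty: float = 1.0) -> float:
          # Higher length_penalty favours longer hypotheses; generated_len counts
          # the generated tokens used for normalization.
          return sum_logprobs / (generated_len ** length_penalty)
      ```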