Commits · 09eb11a1bd6e77833fde7020efc57e6a804e4e8a · chenpangpang / transformers

16 Jan, 2024 3 commits

[`SpeechT5Tokenization`] Add copied from and fix the... · fe23256b

Arthur authored Jan 16, 2024

[`SpeechT5Tokenization`] Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme (#28522)

* Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme

* fixup

* add a small test

* style test file

* nits

fe23256b

[`TokenizationRoformerFast`] Fix the save and loading (#28527) · 96d08831

Arthur authored Jan 16, 2024

* cleanup

* add a test

* update the test

* style

* revert part that allows to pickle the tokenizer

96d08831

Fix/speecht5 bug (#28481) · 07ae53e6

Nima Yaqmuri authored Jan 16, 2024

* Fix bug in SpeechT5 speech decoder prenet's forward method

- Removed redundant `repeat` operation on speaker_embeddings in the forward method. This line was erroneously duplicating the embeddings, leading to incorrect input size for concatenation and performance issues.
- Maintained original functionality of the method, ensuring the integrity of the speech decoder prenet's forward pass remains intact.
- This change resolves a critical bug affecting the model's performance in handling speaker embeddings.

* Refactor SpeechT5 text to speech integration tests

- Updated SpeechT5ForTextToSpeechIntegrationTests to accommodate the variability in sequence lengths due to dropout in the speech decoder pre-net. This change ensures that our tests are robust against random variations in generated speech, enhancing the reliability of our test suite.
- Removed hardcoded dimensions in test assertions. Replaced with dynamic checks based on model configuration and seed settings, ensuring tests remain valid across different runs and configurations.
- Added new test cases to thoroughly validate the shapes of generated spectrograms and waveforms. These tests leverage seed settings to ensure consistent and predictable behavior in testing, addressing potential issues in speech generation and vocoder processing.
- Fixed existing test cases where incorrect assumptions about output shapes led to potential errors.

* Fix bug in SpeechT5 speech decoder prenet's forward method

* Refactor SpeechT5 text to speech integration tests

* Enhance handling of speaker embeddings in SpeechT5

- Refined the generate and generate_speech functions in the SpeechT5 class to robustly handle two scenarios for speaker embeddings: matching the batch size (one embedding per sample) and one-to-many (a single embedding for all samples in the batch).
- The update includes logic to repeat the speaker embedding when a single embedding is provided for multiple samples, and a ValueError is raised for any mismatched dimensions.
- Also added corresponding test cases to validate both scenarios, ensuring complete coverage and functionality for diverse speaker embedding situations.

* Improve Test Robustness with Randomized Speaker Embeddings
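
The two speaker-embedding scenarios described in this commit message (one embedding per sample, or a single embedding shared by the whole batch, with a ValueError otherwise) can be sketched roughly as follows. This is only an illustration, not the code from the commit: the helper name `expand_speaker_embeddings` is hypothetical, and plain Python lists stand in for tensors so the sketch runs without any framework installed.

```python
from typing import List, Sequence


def expand_speaker_embeddings(
    speaker_embeddings: Sequence[Sequence[float]], batch_size: int
) -> List[List[float]]:
    """Return one speaker embedding per sample in the batch.

    Hypothetical helper for illustration only; the real model operates on
    tensors rather than nested lists.
    """
    num_embeddings = len(speaker_embeddings)

    # Scenario 1: one embedding per sample, nothing to expand.
    if num_embeddings == batch_size:
        return [list(embedding) for embedding in speaker_embeddings]

    # Scenario 2: a single embedding is repeated for every sample.
    if num_embeddings == 1:
        return [list(speaker_embeddings[0]) for _ in range(batch_size)]

    # Any other count cannot be matched to the batch.
    raise ValueError(
        f"Expected 1 or {batch_size} speaker embeddings, got {num_embeddings}."
    )
```

With a batch of three inputs, passing a single embedding yields three identical rows, while passing two embeddings raises a ValueError.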

07ae53e6

15 Jan, 2024 1 commit

Generate: consolidate output classes (#28494) · 7e0ddf89

Joao Gante authored Jan 15, 2024

7e0ddf89

12 Jan, 2024 1 commit

Update metadata loading for oneformer (#28398) · 666a6f07

amyeroberts authored Jan 12, 2024

* Update metadata loading for oneformer

* Enable loading from a model repo

* Update docstrings

* Fix tests

* Update tests

* Clarify repo_path behaviour

666a6f07

11 Jan, 2024 4 commits

Byebye torch 1.10 (#28207) · 59cd9de3

Yih-Dar authored Jan 11, 2024



* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

59cd9de3

Fix load balancing loss func for mixtral (#28256) · e768616a

liangxuZhang authored Jan 11, 2024



* Correct the implementation of auxiliary loss of mixtral

* correct the implementation of auxiliary loss of mixtral

* Implement a simpler calculation method

---------
Co-authored-by: zhangliangxu3 <zhangliangxu3@jd.com>

e768616a

[Phi] Extend implementation to use GQA/MQA. (#28163) · 55090585

Gustavo de Rosa authored Jan 11, 2024

* chore(phi): Updates configuration_phi with missing keys.

* chore(phi): Adds first draft of combined modeling_phi.

* fix(phi): Fixes according to latest review.

* fix(phi): Removes pad_vocab_size_multiple to prevent inconsistencies.

* fix(phi): Fixes unit and integration tests.

* fix(phi): Ensures that everything works with microsoft/phi-1 for first integration.

* fix(phi): Fixes output of docstring generation.

* fix(phi): Fixes according to latest review.

* fix(phi): Fixes according to latest review.

* fix(tests): Re-enables Phi-1.5 test.

* fix(phi): Fixes attention overflow on PhiAttention (for Phi-2).

* fix(phi): Improves how queries and keys are upcast.

* fix(phi): Small updates on latest changes.

55090585

Optionally preprocess segmentation maps for MobileViT (#28420) · d5606378

Harisankar Babu authored Jan 11, 2024

* optionally preprocess segmentation maps for mobilevit

* changed pretrained model name to that of segmentation model

* removed voc-deeplabv3 from model archive list

* added preprocess_image and preprocess_mask methods for processing images and segmentation masks respectively

* added tests for segmentation masks based on segformer feature extractor

* use crop_size instead of size

* reverting to initial model

d5606378

10 Jan, 2024 2 commits

[Whisper] Fix slow test (#28407) · cbbe3074

Patrick von Platen authored Jan 10, 2024



* [Whisper] Fix slow test

* update

* update

* update

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

cbbe3074

Fix `_merge_input_ids_with_image_features` for llava model (#28333) · 0f2f0c63

Victor SANH authored Jan 10, 2024



* fix `_merge_input_ids_with_image_features` for llava model

* Update src/transformers/models/llava/modeling_llava.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* style and tests

* ooops

* test the backward too

* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update tests/models/vipllava/test_modeling_vipllava.py

* style and quality

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

0f2f0c63

09 Jan, 2024 1 commit

fix auxiliary loss training in DetrSegmentation (#28354) · 357971ec

Sangbum Daniel Choi authored Jan 09, 2024

* fix auxiliary loss training in detrSegmentation

* add auxiliary_loss testing

357971ec

08 Jan, 2024 3 commits

Add SigLIP (#26522) · 3b742ea8

NielsRogge authored Jan 08, 2024



* Add first draft

* Use appropriate gelu function

* More improvements

* More improvements

* More improvements

* Convert checkpoint

* More improvements

* Improve docs, remove print statements

* More improvements

* Add link

* remove unused masking function

* begin tokenizer

* do_lower_case

* debug

* set split_special_tokens=True

* Remove script

* Fix style

* Fix rebase

* Use same design as CLIP

* Add fast tokenizer

* Add SiglipTokenizer to init, remove extra_ids

* Improve conversion script

* Use smaller inputs in conversion script

* Update conversion script

* More improvements

* Add processor to conversion script

* Add tests

* Remove print statements

* Add tokenizer tests

* Fix more tests

* More improvements related to weight initialization

* More improvements

* Make more tests pass

* More improvements

* More improvements

* Add copied from

* Add canonicalize_text

* Enable fast tokenizer tests

* More improvements

* Fix most slow tokenizer tests

* Address comments

* Fix style

* Remove script

* Address some comments

* Add copied from to tests

* Add more copied from

* Add more copied from

* Add more copied from

* Remove is_flax_available

* More updates

* Address comment

* Remove SiglipTokenizerFast for now

* Add caching

* Remove umt5 test

* Add canonicalize_text inside _tokenize, thanks Arthur

* Fix image processor tests

* Skip tests which are not applicable

* Skip test_initialization

* More improvements

* Compare pixel values

* Fix doc tests, add integration test

* Add do_normalize

* Remove causal mask and leverage ignore copy

* Fix attention_mask

* Fix remaining tests

* Fix dummies

* Rename temperature and bias

* Address comments

* Add copied from to tokenizer tests

* Add SiglipVisionModel to auto mapping

* Add copied from to image processor tests

* Improve doc

* Remove SiglipVisionModel from index

* Address comments

* Improve docs

* Simplify config

* Add first draft

* Make it like mistral

* More improvements

* Fix attention_mask

* Fix output_attentions

* Add note in docs

* Convert multilingual model

* Convert large checkpoint

* Convert more checkpoints

* Add pipeline support, correct image_mean and image_std

* Use padding=max_length by default

* Make processor like llava

* Add code snippet

* Convert more checkpoints

* Set keep_punctuation_string=None as in OpenCLIP

* Set normalized=False for special tokens

* Fix doc test

* Update integration test

* Add figure

* Update organization

* Happy new year

* Use AutoModel everywhere

---------
Co-authored-by: patil-suraj <surajp815@gmail.com>

3b742ea8

Add segmentation map processing to SAM Image Processor (#27463) · 73c88012

Rosie Wood authored Jan 08, 2024



* add segmentation map processing to sam image processor

* fixup

* add tests

* reshaped_input_size is shape before padding

* update tests for size/shape outputs

* fixup

* add code snippet to docs

* Update docs/source/en/model_doc/sam.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add missing backticks

* add `segmentation_maps` as arg for SamProcessor.__call__()

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

73c88012

Fix building alibi tensor when num_heads is not a power of 2 (#28380) · 0c2121f9

Mohamed Abu El-Nasr authored Jan 08, 2024

* Fix building alibi tensor when num_heads is not a power of 2

* Remove print function

0c2121f9

07 Jan, 2024 1 commit

[Phi2] Add support for phi2 models (#28211) · 3eddda11

Susnato Dhar authored Jan 07, 2024

* modified script and added test for phi2

* changes

3eddda11

05 Jan, 2024 2 commits

[DETA] Improvement and Sync from DETA especially for training (#27990) · 899d8351

Sangbum Daniel Choi authored Jan 05, 2024



* [DETA] fix freeze/unfreeze function

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add freeze/unfreeze test case in DETA

* fix type

* fix typo 2

* fix : enable aux and enc loss in training pipeline

* Add unsynced variables from original DETA for training

* modification for passing CI test

* make style

* make fix

* manual make fix

* change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking

* remove print

* divide configuration in DetaModel and DetaForObjectDetection

* image smaller size than 224 will give topk error

* pred_boxes and logits should be equivalent to two_stage_num_proposals

* add missing part in DetaConfig

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add docstring in configure and prettify TO DO part

* change distribute related code to accelerate

* Update src/transformers/models/deta/configuration_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/deta/test_modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* protect importing accelerate

* change variable name to specific value

* wrong import

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

899d8351

Fix pos_mask application and update tests accordingly (#27892) · 57e9c832

Fernando Rodriguez Sanchez authored Jan 05, 2024



* Fix pos_mask application and update tests accordingly

* Fix style

* Adding comments

---------
Co-authored-by: Fernando Rodriguez <fernando.rodriguez@nielseniq.com>

57e9c832

04 Jan, 2024 2 commits

Fix error in M4T feature extractor (#28340) · 35e9d2b2

Yoach Lacombe authored Jan 04, 2024

* fix M4T FE error when no attention mask

* modify logic

* add test

* go back to initial test situation + add other tests

35e9d2b2

enable training mask2former and maskformer for transformers trainer (#28277) · 4a66c0d9

Sangbum Daniel Choi authored Jan 04, 2024

* fix get_num_masks output as [int] to int

* fix loss size from torch.Size([1]) to torch.Size([])

4a66c0d9

03 Jan, 2024 2 commits

Remove token_type_ids from model_input_names (like #24788) (#28325) · 45b1dfa3

Apsod authored Jan 03, 2024

* remove token_type_ids from model_input_names (like #24788)

* removed test that assumed token_type_ids should be present and updated a model reference so that it points to an available model

45b1dfa3

Add FastSpeech2Conformer (#23439) · d83ff5ee

Connor Henderson authored Jan 03, 2024

* start - docs, SpeechT5 copy and rename

* add relevant code from FastSpeech2 draft, have tests pass

* make it an actual conformer, demo ex.

* matching inference with original repo, includes debug code

* refactor nn.Sequentials, start more desc. var names

* more renaming

* more renaming

* vocoder scratchwork

* matching vocoder outputs

* hifigan vocoder conversion script

* convert model script, rename some config vars

* replace postnet with speecht5's implementation

* passing common tests, file cleanup

* expand testing, add output hidden states and attention

* tokenizer + passing tokenizer tests

* variety of updates and tests

* g2p_en pckg setup

* import structure edits

* docstrings and cleanup

* repo consistency

* deps

* small cleanup

* forward signature param order

* address comments except for masks and labels

* address comments on attention_mask and labels

* address second round of comments

* remove old unneeded line

* address comments part 1

* address comments pt 2

* rename auto mapping

* fixes for failing tests

* address comments part 3 (bart-like, train loss)

* make style

* pass config where possible

* add forward method + tests to WithHifiGan model

* make style

* address arg passing and generate_speech comments

* address Arthur comments

* address Arthur comments pt2

* lint changes

* Sanchit comment

* add g2p-en to doctest deps

* move up self.encoder

* onnx compatible tensor method

* fix is symbolic

* fix paper url

* move models to espnet org

* make style

* make fix-copies

* update docstring

* Arthur comments

* update docstring w/ new updates

* add model architecture images

* header size

* md wording update

* make style

d83ff5ee

22 Dec, 2023 4 commits

[`Llava`] Fix llava index errors (#28032) · 29e7a1e1

Younes Belkada authored Dec 22, 2023



* fix llava index errors

* forward contrib credits from original implementation and fix

* better fix

* final fixes and fix all tests

* fix

* fix nit

* fix tests

* add regression tests

---------
Co-authored-by: gullalc <gullalc@users.noreply.github.com>

29e7a1e1

[Whisper] Fix word-level timestamps with bs>1 or num_beams>1 (#28114) · 5da3db3f

Yoach Lacombe authored Dec 22, 2023



* fix frames

* use smaller chunk length

* correct beam search + tentative stride

* fix whisper word timestamp in batch

* add test batch generation with return token timestamps

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* clean a test

* make style + correct typo

* write clearer comments

* explain test in comment

---------
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

5da3db3f

Add Swinv2 backbone (#27742) · c9fb250a

NielsRogge authored Dec 22, 2023

* First draft

* More improvements

* More improvements

* Make all tests pass

* Remove script

* Update image processor

* Address comments

* Use new gradient checkpointing method

* Convert checkpoints, add integration test

* Do not keep aspect ratio for now

* Set keep_aspect_ratio=False for beit, add integration test

* Remove print statement

c9fb250a

Fix: [SeamlessM4T - S2TT] Bug in batch loading of audio in torch.Tensor format... · 1ef86c4f

Nicholas Neo authored Dec 22, 2023


Fix: [SeamlessM4T - S2TT] Bug in batch loading of audio in torch.Tensor format in the SeamlessM4TFeatureExtractor class (#27914)

* fixes: code fixes on is_batched condition to also check for batched audio data in torch.Tensor format instead of only just checking for batched audio data in np.ndarray format

* Update src/transformers/models/seamless_m4t/feature_extraction_seamless_m4t.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* refactor: code refactoring to remove torch framework dependency

* docs: updated docstring to add torch tensor compatibility

* test: add test cases to incorporate torch tensor inputs

* test: ran make fix-copies for code conformity

* test: refactor test to separate the test_call into test_call_numpy and test_call_torch

---------
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
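
A batched-input check that accepts both np.ndarray and torch.Tensor audio without importing torch, as the bullets above describe, could look roughly like the sketch below. It is an assumption-laden illustration rather than the code from this commit: the function name `is_batched_audio` is hypothetical, and treating any array with more than one dimension as a batch is a simplification (the real feature extractor may interpret extra dimensions differently, for example as channels).

```python
def is_batched_audio(raw_speech) -> bool:
    """Return True when `raw_speech` looks like a batch of audio clips.

    Hypothetical helper for illustration only.
    """
    # Array-likes such as np.ndarray and torch.Tensor both expose `.shape`,
    # so duck typing avoids a hard dependency on either framework.
    shape = getattr(raw_speech, "shape", None)
    if shape is not None:
        # Assumption: more than one dimension means a batch of clips.
        return len(shape) > 1

    # A non-empty list or tuple is a batch when its first item is itself
    # a sequence or an array-like, rather than a single sample value.
    if isinstance(raw_speech, (list, tuple)) and len(raw_speech) > 0:
        first = raw_speech[0]
        return isinstance(first, (list, tuple)) or hasattr(first, "shape")

    return False
```

Because the check only reads the `shape` attribute, the same function handles NumPy arrays, torch tensors, and nested Python lists.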

1ef86c4f

21 Dec, 2023 6 commits

Update YOLOS slow test values (#28187) · 3657748b

amyeroberts authored Dec 21, 2023

Update test values

3657748b

Fix slow backbone tests - out_indices must match stage name ordering (#28186) · cd1350ce

amyeroberts authored Dec 21, 2023

Indices must match stage name ordering

cd1350ce

Even more TF test fixes (#28146) · 260b9d21

Matt authored Dec 21, 2023

* Fix vision text dual encoder

* Small cleanup for wav2vec2 (not fixed yet)

* Small fix for vision_encoder_decoder

* Fix SAM builds

* Update TFBertTokenizer test with modern exporting + tokenizer

* Fix DeBERTa

* Fix DeBERTav2

* Try RAG fix but it's impossible to test locally

* Actually fix RAG now that I got FAISS working somehow

* Fix Wav2Vec2, add sermon

* Fix Hubert

260b9d21

[`Mixtral` & `Mistral`] Add support for sdpa (#28133) · f9a98c47

Arthur authored Dec 21, 2023



* some nits

* update test

* add support for sdpa

* remove some dummy inputs

* all good

* style

* nits

* fixes

* fix more copies

* nits

* styling

* fix

* Update src/transformers/models/mistral/modeling_mistral.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* add a slow test just to be sure

* fixup

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

f9a98c47

[Whisper] Use torch for stft if available (#26119) · 814619f5

Sanchit Gandhi authored Dec 21, 2023

* [Whisper] Use torch for stft if available

* update docstring

* mock patch decorator

* fit on one line

814619f5

disable test_retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest (#28169) · e268d7e5

Dean Wyatte authored Dec 21, 2023

disable retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest

e268d7e5

20 Dec, 2023 2 commits

Fix yolos resizing (#27663) · 1d777359

amyeroberts authored Dec 20, 2023

* Fix yolos resizing

* Update tests

* Add a test

1d777359

Generate: fix speculative decoding (#28166) · 45b70384

Joao Gante authored Dec 20, 2023

Co-authored-by: Merve Noyan <merveenoyan@gmail.com>

45b70384

19 Dec, 2023 1 commit

[`Mixtral`] Fix loss + nits (#28115) · 4a04b4cc

Arthur authored Dec 19, 2023



* default config should not use sliding window

* update the doc

* nits

* add a proper test

* update

* update

* update expected value

* Update src/transformers/tokenization_utils_fast.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* convert to float

* average then N**2

* comment

* revert nit

* good to go

* fixup

* Update tests/models/mixtral/test_modeling_mixtral.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* revert unrelated change

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>

4a04b4cc

18 Dec, 2023 1 commit

More TF fixes (#28081) · 71d47f0a

Matt authored Dec 18, 2023

* More build_in_name_scope()

* Make sure we set the save spec now we don't do it with dummies anymore

* make fixup

71d47f0a

15 Dec, 2023 2 commits

Update fixtures-image-utils (#28080) · 26ea725b

Quentin Lhoest authored Dec 15, 2023

* fix hf-internal-testing/fixtures_image_utils

* fix test

* comments

26ea725b

Skip M4T `test_retain_grad_hidden_states_attentions` (#28060) · deb72cb6

Yoach Lacombe authored Dec 15, 2023

* skip test from SpeechInput

* refine description of skip

deb72cb6

14 Dec, 2023 2 commits

Replace build() with build_in_name_scope() for some TF tests (#28046) · 3060899b

Matt authored Dec 14, 2023

Replace build() with build_in_name_scope() for some tests

3060899b

Proper build() methods for TF (#27794) · 050e0b44

Matt authored Dec 14, 2023

* Add a convenience method for building in your own name scope

* Second attempt at auto layer building

* Revert "Second attempt at auto layer building"

This reverts commit e03a3aaecf9ec41a805582b83cbdfe3290a631be.

* Attempt #3

* Revert "Attempt #3"

This reverts commit b9df7a0857560d29b5abbed6127d9e9eca77cf47.

* Add missing attributes that we're going to need later

* Add some attributes we're going to need later

* A fourth attempt! Feel the power flow through you!

* Revert "A fourth attempt! Feel the power flow through you!"

This reverts commit 6bf4aaf3875d6f28485f50187617a4c616c8aff7.

* Add more values we'll need later

* TF refactor that we'll need later

* Revert "TF refactor that we'll need later"

This reverts commit ca07202fb5b7b7436b893baa8d688b4f348ea7b9.

* Revert "Revert "TF refactor that we'll need later""

This reverts commit 1beb0f39f293ed9c27594575e1c849aadeb15c13.

* make fixup

* Attempt five!

* Revert "Attempt five!"

This reverts commit 3302207958dfd0374b0447a51c06eea51a506044.

* Attempt six - this time don't add empty methods

* Revert "Attempt six - this time don't add empty methods"

This reverts commit 67d60129be75416b6beb8f47c7d38d77b18d79bb.

* Attempt seven - better base model class detection!

* Revert "Attempt seven - better base model class detection!"

This reverts commit 5f14845e92ea0e87c598da933bfbfee10f553bc9.

* Another attribute we'll need later

* Try again with the missing attribute!

* Revert "Try again with the missing attribute!"

This reverts commit 760c6f30c5dffb3e04b0e73c34a77d1882a0fef7.

* This is the attempt that will pierce the heavens!

* Revert "This is the attempt that will pierce the heavens!"

This reverts commit c868bb657de057aca7a5260350a3f831fc4dfee6.

* Attempt seven - snag list is steadily decreasing

* Revert "Attempt seven - snag list is steadily decreasing"

This reverts commit 46fbd975deda64429bfb3e5fac4fc0370c00d316.

* Attempt eight - will an empty snag list do it?

* Revert "Attempt eight - will an empty snag list do it?"

This reverts commit 7c8a3c2b083253649569e9877e02054ae5cec67b.

* Fixes to Hubert issues that cause problems later

* Trying again with Conv1D/SeparableConv fixes

* Revert "Trying again with Conv1D/SeparableConv fixes"

This reverts commit 55092bca952bc0f750aa1ffe246a640bf1e2036e.

* Apply the build shape fixes to Wav2Vec2 as well

* One more attempt!

* Revert "One more attempt!"

This reverts commit 5ac3e4cb01b9458cc93312873725f9444ae7261c.

* Another attempt!

* Revert "Another attempt!"

This reverts commit ea16d890e019d7de8792a3b8e72f3b1c02adae50.

* Let's see how many failures we get without the internal build method

* Fix OpenAI

* Fix MobileBERT

* (Mostly) fix GroupVIT

* Fix BLIP

* One more BLIP fix

* One more BLIP fix!

* Fix Regnet

* Finally fully fix GroupViT

* Fix Data2Vec and add the new AdaptivePool

* Fix Segformer

* Fix Albert

* Fix Deberta/DebertaV2

* Fix XLM

* Actually fix XLM

* Fix Flaubert

* Fix lxmert

* Fix Resnet

* Fix ConvBERT

* Fix ESM

* Fix Convnext / ConvnextV2

* Fix SAM

* Fix Efficientformer

* Fix LayoutLMv3

* Fix speech_to_text

* Fix mpnet and mobilevit

* Fix Swin

* Fix CTRL

* Fix CVT

* Fix DPR

* Fix Wav2Vec2

* Fix T5

* Fix Hubert

* Fix GPT2

* Fix Whisper

* Fix DeiT

* Fix the encoder-decoder / dual-encoder classes

* make fix-copies

* build in name scope

* Fix summarization test

* Fix tied weight names for BART + Blenderbot

* Fix tied weight name building

* Fix to TFESM weight building

* Update TF SAM

* Expand all the shapes out into Big Boy Shapes

050e0b44