Commits · c4df7c16685a340baa271bf139c1b3d60654f177 · chenpangpang / transformers

22 Dec, 2023 8 commits

Drop `feature_extractor_type` when loading an image processor file (#28195) · c4df7c16
Yih-Dar authored Dec 22, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
c4df7c16

Fix the check of models supporting FA/SDPA not run (#28202) · bb3bd447

Yih-Dar authored Dec 22, 2023



* add check_support_list.py

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

bb3bd447

Bug: `training_args.py` fix missing import with accelerate with version... · e37ab52d

Michael Feil authored Dec 22, 2023

Bug: `training_args.py` fix missing import with accelerate with version `accelerate==0.20.1` (#28171)

* fix-accelerate-version

* updated with exported ACCELERATE_MIN_VERSION

* update string in ACCELERATE_MIN_VERSION

e37ab52d

Add Swinv2 backbone (#27742) · c9fb250a

NielsRogge authored Dec 22, 2023

* First draft

* More improvements

* More improvements

* Make all tests pass

* Remove script

* Update image processor

* Address comments

* Use new gradient checkpointing method

* Convert checkpoints, add integration test

* Do not keep aspect ratio for now

* Set keep_aspect_ratio=False for beit, add integration test

* Remove print statement

c9fb250a

Fix: [SeamlessM4T - S2TT] Bug in batch loading of audio in torch.Tensor format... · 1ef86c4f

Nicholas Neo authored Dec 22, 2023


Fix: [SeamlessM4T - S2TT] Bug in batch loading of audio in torch.Tensor format in the SeamlessM4TFeatureExtractor class (#27914)

* fixes: code fixes on is_batched condition to also check for batched audio data in torch.Tensor format instead of only just checking for batched audio data in np.ndarray format

* Update src/transformers/models/seamless_m4t/feature_extraction_seamless_m4t.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* refactor: code refactoring to remove torch framework dependency

* docs: updated docstring to add torch tensor compatibility

* test: add test cases to incorporate torch tensor inputs

* test: ran make fix-copies for code conformity

* test: refactor test to separate the test_call into test_call_numpy and test_call_torch

---------
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

1ef86c4f
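
The `is_batched` change described above (accepting batched audio as a `torch.Tensor` as well as an `np.ndarray`, without importing torch) can be sketched with duck typing on `.shape`. This is a minimal illustration, not the code in `feature_extraction_seamless_m4t.py`: the helper name `is_batched`, the `FakeArray` stand-in, and the rule "more than one dimension means batched" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FakeArray:
    """Stand-in for np.ndarray / torch.Tensor: it only exposes `.shape`."""
    shape: tuple


def is_batched(raw_speech):
    """Hypothetical check: True when the input holds several audio clips."""
    # Array-like input (NumPy or torch): treat more than one dimension as a batch.
    if hasattr(raw_speech, "shape"):
        return len(raw_speech.shape) > 1
    # Python containers: batched if the first element is itself a sequence or array.
    if isinstance(raw_speech, (list, tuple)) and len(raw_speech) > 0:
        first = raw_speech[0]
        return isinstance(first, (list, tuple)) or hasattr(first, "shape")
    return False
```

Because the check only reads `.shape`, it needs no framework import, which is in the spirit of the "remove torch framework dependency" bullet.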

Fix ONNX export for causal LM sequence classifiers by removing reverse indexing (#28144) · 548a8f61

Dean Wyatte authored Dec 22, 2023

* normalize reverse indexing for causal lm sequence classifiers

* normalize reverse indexing for causal lm sequence classifiers

* normalize reverse indexing for causal lm sequence classifiers

* use modulo instead

* unify modulo-based sequence lengths

548a8f61
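
The "use modulo instead" step can be illustrated in plain Python. The sketch assumes the classifier finds the last non-padding token as "index of the first pad token, minus one"; the function name `last_token_index` and the list-based input are hypothetical and stand in for the tensor code. When a row has no padding, the first-pad search yields 0 and the subtraction gives -1; the modulo maps that to the final position, so no negative (reverse) index is ever used.

```python
def last_token_index(input_ids, pad_token_id):
    """Index of the last non-pad token in a right-padded row (illustrative helper)."""
    seq_len = len(input_ids)
    # Mimic an argmax over an equality mask: position of the first pad token,
    # or 0 when the row contains no padding at all.
    first_pad = next((i for i, tok in enumerate(input_ids) if tok == pad_token_id), 0)
    # Without padding this is -1; the modulo turns it into seq_len - 1.
    return (first_pad - 1) % seq_len


print(last_token_index([5, 6, 7, 0, 0], pad_token_id=0))  # 2
print(last_token_index([5, 6, 7, 8, 9], pad_token_id=0))  # 4
```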

Update `docs/source/en/perf_infer_gpu_one.md` (#28198) · 71f46057
Yih-Dar authored Dec 22, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
71f46057
[`Docs`] Add 4-bit serialization docs (#28182) · 3a8769f6
Younes Belkada authored Dec 22, 2023
```
* add 4-bit serialization docs

* up

* up
```
3a8769f6

21 Dec, 2023 8 commits

Update YOLOS slow test values (#28187) · 3657748b
amyeroberts authored Dec 21, 2023
```
Update test values
```
3657748b
Fix slow backbone tests - out_indices must match stage name ordering (#28186) · cd1350ce
amyeroberts authored Dec 21, 2023
```
Indices must match stage name ordering
```
cd1350ce

Even more TF test fixes (#28146) · 260b9d21

Matt authored Dec 21, 2023

* Fix vision text dual encoder

* Small cleanup for wav2vec2 (not fixed yet)

* Small fix for vision_encoder_decoder

* Fix SAM builds

* Update TFBertTokenizer test with modern exporting + tokenizer

* Fix DeBERTa

* Fix DeBERTav2

* Try RAG fix but it's impossible to test locally

* Actually fix RAG now that I got FAISS working somehow

* Fix Wav2Vec2, add sermon

* Fix Hubert

260b9d21

[`Mixtral` & `Mistral`] Add support for sdpa (#28133) · f9a98c47

Arthur authored Dec 21, 2023



* some nits

* update test

* add support for sdpa

* remove some dummy inputs

* all good

* style

* nits

* fixes

* fix more copies

* nits

* styling

* fix

* Update src/transformers/models/mistral/modeling_mistral.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* add a slow test just to be sure

* fixup

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

f9a98c47

[Whisper] Use torch for stft if available (#26119) · 814619f5

Sanchit Gandhi authored Dec 21, 2023

* [Whisper] Use torch for stft if available

* update docstring

* mock patch decorator

* fit on one line

814619f5

Fix `input_embeds` docstring in encoder-decoder architectures (#28168) · 7e93ce40
Joao Gante authored Dec 21, 2023

7e93ce40

[bnb] Let's make serialization of 4bit models possible (#26037) · 4f7806ef

Poedator authored Dec 21, 2023



* updated bitsandbytes.py

* rm test_raise_* from test_4bit.py

* add test_4bit_serialization.py

* modeling_utils bulk edits

* bnb_ver 0.41.3 in integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* @slow reinstated
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* bnb ver 0.41.3 in  src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* rm bnb version todo in  integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* moved 4b serialization tests to test_4bit

* tests upd for opt

* to torch_device
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* ruff fixes to tests

* rm redundant bnb version check in mod_utils
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* restore _hf_peft_config_loaded  modeling_utils.py::2188
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* restore _hf_peft_config_loaded  test in modeling_utils.py::2199
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* fixed NOT getattr(self, "is_8bit_serializable")
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* setting model.is_4bit_serializable

* rm separate fp16_statistics arg from set_module...

* rm else branch in integrations::bnb::set_module

* bnb 4bit dtype check

* upd comment on 4bit weights

* upd tests for FP4 safe

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

4f7806ef

disable test_retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest (#28169) · e268d7e5
Dean Wyatte authored Dec 21, 2023
```
disable retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest
```
e268d7e5

20 Dec, 2023 11 commits

Fix yolos resizing (#27663) · 1d777359
amyeroberts authored Dec 20, 2023
```
* Fix yolos resizing

* Update tests

* Add a test
```
1d777359
Generate: fix speculative decoding (#28166) · 45b70384
Joao Gante authored Dec 20, 2023
```
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
```
45b70384
[docs] Trainer docs (#28145) · 01c081d1
Steven Liu authored Dec 20, 2023
```
* fsdp, debugging, gpu selection

* fix hfoption

* fix
```
01c081d1

Align backbone stage selection with out_indices & out_features (#27606) · ee298a16

amyeroberts authored Dec 20, 2023

* Iterate over out_features instead of stage_names

* Update for all backbones

* Add tests

* Fix

* Align timm backbone behaviour with other backbones

* Fix tests

* Stricter checks on set out_features and out_indices

* Revert back stage selection logic

* Remove out-of-order logic

* Document restriction in docstrings

ee298a16
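
The restriction named in the bullets above (stricter checks, no out-of-order selection) can be sketched as a small validator. This is an illustration under assumptions, not the library's implementation: the function name `check_backbone_outputs` and its exact error messages are hypothetical.

```python
def check_backbone_outputs(stage_names, out_features, out_indices):
    """Raise ValueError unless out_features/out_indices agree and follow stage order."""
    unknown = [f for f in out_features if f not in stage_names]
    if unknown:
        raise ValueError(f"Unknown stages: {unknown}")
    if len(set(out_features)) != len(out_features):
        raise ValueError("out_features must not contain duplicates")
    # Requested features must appear in the same order as in stage_names.
    ordered = [name for name in stage_names if name in out_features]
    if list(out_features) != ordered:
        raise ValueError(f"out_features must be ordered like stage_names, got {out_features}")
    # Negative indices count from the end, as with ordinary Python indexing.
    by_index = [stage_names[i] for i in out_indices]
    if by_index != list(out_features):
        raise ValueError("out_indices and out_features must select the same stages")
```

With `stage_names = ["stem", "stage1", "stage2"]`, the pair `["stage1", "stage2"]` / `[1, 2]` passes, while `["stage2", "stage1"]` / `[2, 1]` is rejected for being out of order.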

Update FA2 exception msg to point to hub discussions (#28161) · 224ab709
amyeroberts authored Dec 20, 2023
```
* Update FA2 exception msg to point to hub discussions

* Use path for hub url
```
224ab709
Avoid unnecessary warnings when loading `CLIPConfig` (#28108) · 9924df9e
Yih-Dar authored Dec 20, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
9924df9e
Fix weights not properly initialized due to shape mismatch (#28122) · 7938c8c8
Yih-Dar authored Dec 20, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
7938c8c8

move code to Trainer.evaluate to enable use of that function with multiple datasets (#27844) · 769a9542

peter-sk authored Dec 20, 2023



* move code to Trainer.evaluate to enable use of that function with multiple datasets

* test

* update doc string

* and a tip

* forgot the type

---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>

769a9542
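
A rough sketch of what evaluating multiple datasets through a single `evaluate` call can look like, assuming a dict of named datasets is evaluated one by one and each metric key is prefixed with the dataset name. The toy `evaluate` function and the `accuracy` metric below are hypothetical stand-ins, not `Trainer` code.

```python
def evaluate(eval_dataset, compute_metrics, metric_key_prefix="eval"):
    """Toy evaluate(): accepts one dataset or a dict of named datasets."""
    if isinstance(eval_dataset, dict):
        metrics = {}
        for name, dataset in eval_dataset.items():
            # Recurse per dataset, extending the prefix with the dataset name.
            metrics.update(evaluate(dataset, compute_metrics, f"{metric_key_prefix}_{name}"))
        return metrics
    return {f"{metric_key_prefix}_{key}": value for key, value in compute_metrics(eval_dataset).items()}


def accuracy(pairs):
    """Hypothetical metric over (prediction, label) pairs."""
    return {"accuracy": sum(p == y for p, y in pairs) / len(pairs)}


print(evaluate({"news": [(1, 1), (0, 1)], "wiki": [(1, 1)]}, accuracy))
# {'eval_news_accuracy': 0.5, 'eval_wiki_accuracy': 1.0}
```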

[gpt-neox] Add attention_bias config to support model trained without attention biases (#28126) · cd9f9d63
Jong-hun Shin authored Dec 20, 2023
```
* add attention_bias hparam for a model trained without attention biases

* fix argument documentation error
```
cd9f9d63

Fix FA2 integration (#28142) · def581ef

Sourab Mangrulkar authored Dec 20, 2023



* fix fa2

* fix FA2 for popular models

* improve warning and add Younes as co-author
Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix the warning

* Add Tip

* typo fix

* nit

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

def581ef

Remove deprecated CPU dockerfiles (#28149) · b134f685
Abolfazl Shahbazi authored Dec 19, 2023
```
Signed-off-by: Abolfazl Shahbazi <abolfazl.shahbazi@intel.com>
```
b134f685

19 Dec, 2023 6 commits

[docs] Fix mistral link in mixtral.md (#28143) · 38611086
Aaron Jimenez authored Dec 19, 2023
```
Fix mistral link in mixtral.md
```
38611086

Update modeling_utils.py (#28127) · 23f8e4db

Mike Zellinger authored Dec 19, 2023

In docstring for PreTrainedModel.resize_token_embeddings, correct definition of new_num_tokens parameter to read "the new number of tokens" (meaning the new size of the vocab) rather than "the number of new tokens" (number of newly added tokens only).

23f8e4db
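
The distinction in the corrected docstring can be shown with a toy embedding table. The helper `resize_table` below is a hypothetical stand-in written for this note, not `PreTrainedModel.resize_token_embeddings`; it only mirrors the documented meaning of `new_num_tokens` as the new total vocabulary size.

```python
def resize_table(embeddings, new_num_tokens, dim):
    """Toy stand-in: return a table with exactly `new_num_tokens` rows."""
    kept = embeddings[:new_num_tokens]       # truncate if shrinking
    added = new_num_tokens - len(kept)       # rows to append if growing
    return kept + [[0.0] * dim for _ in range(added)]


table = [[1.0, 1.0, 1.0] for _ in range(5)]  # vocabulary of 5 tokens
grown = resize_table(table, 5 + 2, dim=3)    # adding 2 tokens: pass 7, not 2
print(len(grown))  # 7
```

Passing `2` here would shrink the table to two rows, which is the misreading the docstring fix guards against.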

[`Mixtral`] Fix loss + nits (#28115) · 4a04b4cc

Arthur authored Dec 19, 2023



* default config should not use sliding window

* update the doc

* nits

* add a proper test

* update

* update

* update expected value

* Update src/transformers/tokenization_utils_fast.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* convert to float

* average then N**2

* comment

* revert nit

* good to go

* fixup

* Update tests/models/mixtral/test_modeling_mixtral.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* revert unrelated change

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>

4a04b4cc

Generate: speculative decoding (#27979) · ac974199

Joao Gante authored Dec 19, 2023



* speculative decoding

* fix test

* space

* better comments

* remove redundant test

* test nit

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* PR comments

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

ac974199

Update split string in doctest to reflect #28087 (#28135) · bd7a3561
amyeroberts authored Dec 19, 2023

bd7a3561

When saving a model on TPU, make a copy to be moved to CPU (#27993) · 5aec50ec

qihqi authored Dec 19, 2023

* When saving a model, make a copy to be moved to CPU; don't move the original model

* make deepcopy inside of _save_tpu

* Move to tpu without copy

5aec50ec

18 Dec, 2023 7 commits
- [Doc] Fix token link in What 🤗 Transformers can do (#28123) · 4edffda6
  Aaron Jimenez authored Dec 18, 2023
```
Fix token link
```
  4edffda6
- Fix a typo in tokenizer documentation (#28118) · c52b515e
  Mike Salvatore authored Dec 18, 2023
  
  c52b515e
- [docs] General doc fixes (#28087) · a52e180a
  Steven Liu authored Dec 18, 2023
```
* doc fix friday

* deprecated objects

* update not_doctested

* update toctree
```
  a52e180a
- Fix indentation error - semantic_segmentation.md (#28117) · 08a6e7a7
  Rockerz authored Dec 18, 2023
```
Update semantic_segmentation.md
```
  08a6e7a7
- More TF fixes (#28081) · 71d47f0a
  Matt authored Dec 18, 2023
```
* More build_in_name_scope()

* Make sure we set the save spec now we don't do it with dummies anymore

* make fixup
```
  71d47f0a
- Remove warning if `DISABLE_TELEMETRY` is used (#28113) · 0695b242
  Lucain authored Dec 18, 2023
```
remove warning if DISABLE_TELEMETRY is used
```
  0695b242
- Disable jitter noise during evaluation in SwitchTransformers (#28077) · 7c5408da
  Daize Dong authored Dec 18, 2023
```
* Disable jitter noise during evaluation

* Update outdated configuration information

* Formatting

* Add new line
```
  7c5408da
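
The "disable jitter noise during evaluation" change amounts to gating the multiplicative noise on the training flag. A minimal sketch in plain Python, assuming the noise is a per-value factor drawn uniformly from `[1 - jitter_noise, 1 + jitter_noise]`; the function name `apply_jitter` and the list-based input are illustrative, not the SwitchTransformers router code.

```python
import random


def apply_jitter(hidden_states, jitter_noise, training, rng=random):
    """Scale each value by a random factor near 1, but only while training."""
    if not training or jitter_noise <= 0:
        # Evaluation should be deterministic: return the inputs unchanged.
        return list(hidden_states)
    low, high = 1.0 - jitter_noise, 1.0 + jitter_noise
    return [x * rng.uniform(low, high) for x in hidden_states]


print(apply_jitter([1.0, 2.0], jitter_noise=0.01, training=False))  # [1.0, 2.0]
```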