Commits · 502a10a6f89b2919444aba68cd0def51d5ba618c · chenpangpang / transformers

02 Jan, 2024 2 commits
- Fix trainer saving safetensors: metadata is None (#28219) · 502a10a6
  hoshi-hiyouga authored Jan 02, 2024
```
* Update trainer.py

* format
```
- Update docs around mixing hf scheduler with deepspeed optimizer (#28223) · cad9f5c6
  Dean Wyatte authored Jan 02, 2024
```
update docs around mixing hf scheduler with deepspeed optimizer
```
26 Dec, 2023 2 commits
- small typo (#28229) · 3cefac1d
  Stas Bekman authored Dec 26, 2023
```
Update modeling_utils.py
```
- fix FA2 when using quantization (#28203) · 3b7675b2
  Sourab Mangrulkar authored Dec 26, 2023
  
25 Dec, 2023 1 commit
- [`Awq`] Enable the possibility to skip quantization for some target modules (#27950) · fa21ead7
  Younes Belkada authored Dec 25, 2023
```
* v1

* add docstring

* add tests

* add awq 0.1.8

* oops

* fix test
```
22 Dec, 2023 12 commits

[`Llava`] Fix llava index errors (#28032) · 29e7a1e1

Younes Belkada authored Dec 22, 2023



* fix llava index errors

* forward contrib credits from original implementation and fix

* better fix

* final fixes and fix all tests

* fix

* fix nit

* fix tests

* add regression tests

---------
Co-authored-by: gullalc <gullalc@users.noreply.github.com>

update the logger message with accordant weights_file_name (#28181) · 68fa1e85
lin yudong authored Dec 22, 2023
```
Co-authored-by: yudong.lin <yudong.lin@funplus.com>
```

Fixing visualization code for object detection to support both types of bounding box. (#27842) · 74d9d0ce

Anindyadeep authored Dec 22, 2023



* fix: minor enhancement and fix in bounding box visualization example

The example that visualizes bounding boxes did not handle the edge case where the boxes are
un-normalized, so the same code could not produce results on a dataset with un-normalized
bounding boxes. This commit fixes that.

* run make clean

* add an additional note on the scenarios where the box viz code works

---------
Co-authored-by: Anindyadeep <anindya@pop-os.localdomain>
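
The edge case described in this commit can be illustrated with a minimal sketch. It is not the code from the PR: the function name `to_pixel_corners`, the COCO-style `(x, y, width, height)` box layout, and the heuristic that any value above 1.0 means the box is already in pixels are all assumptions made for illustration.

```python
def to_pixel_corners(box, image_width, image_height):
    """Convert an (x, y, w, h) box to integer pixel corners (x1, y1, x2, y2).

    Assumption: a box whose largest value is <= 1.0 is normalized to [0, 1];
    anything else is treated as already being in pixel units.
    """
    x, y, w, h = box
    if max(box) > 1.0:
        # Un-normalized: coordinates are already pixels, no rescaling needed.
        scale_x, scale_y = 1.0, 1.0
    else:
        # Normalized: rescale by the image size.
        scale_x, scale_y = image_width, image_height
    return (
        round(x * scale_x),
        round(y * scale_y),
        round((x + w) * scale_x),
        round((y + h) * scale_y),
    )
```

A very small box given in pixel units (all values at most 1) would be misread as normalized under this heuristic, so a real implementation may prefer an explicit flag.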

[Whisper] Fix word-level timestamps with bs>1 or num_beams>1 (#28114) · 5da3db3f

Yoach Lacombe authored Dec 22, 2023



* fix frames

* use smaller chunk length

* correct beam search + tentative stride

* fix whisper word timestamp in batch

* add test batch generation with return token timestamps

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* clean a test

* make style + correct typo

* write clearer comments

* explain test in comment

---------
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

Drop `feature_extractor_type` when loading an image processor file (#28195) · c4df7c16
Yih-Dar authored Dec 22, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

Fix the check of models supporting FA/SDPA not run (#28202) · bb3bd447

Yih-Dar authored Dec 22, 2023



* add check_support_list.py

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Bug: `training_args.py` fix missing import with accelerate with version `accelerate==0.20.1` (#28171) · e37ab52d

Michael Feil authored Dec 22, 2023

* fix-accelerate-version

* updated with exported ACCELERATE_MIN_VERSION,

* update string in ACCELERATE_MIN_VERSION

Add Swinv2 backbone (#27742) · c9fb250a

NielsRogge authored Dec 22, 2023

* First draft

* More improvements

* More improvements

* Make all tests pass

* Remove script

* Update image processor

* Address comments

* Use new gradient checkpointing method

* Convert checkpoints, add integration test

* Do not keep aspect ratio for now

* Set keep_aspect_ratio=False for beit, add integration test

* Remove print statement

Fix: [SeamlessM4T - S2TT] Bug in batch loading of audio in torch.Tensor format in the SeamlessM4TFeatureExtractor class (#27914) · 1ef86c4f

Nicholas Neo authored Dec 22, 2023

* fixes: make the is_batched condition also check for batched audio data in torch.Tensor format, not only in np.ndarray format

* Update src/transformers/models/seamless_m4t/feature_extraction_seamless_m4t.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* refactor: code refactoring to remove torch framework dependency

* docs: updated docstring to add torch tensor compatibility

* test: add test cases to incorporate torch tensor inputs

* test: ran make fix-copies for code conformity

* test: refactor test to separate the test_call into test_call_numpy and test_call_torch

---------
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
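
As a rough illustration of a batched-input check that covers both array libraries without importing either, here is a minimal sketch. It is not the logic from `feature_extraction_seamless_m4t.py`: the duck-typed `.shape` test and the `FakeTensor` stand-in class are assumptions introduced so the example runs with the standard library alone.

```python
def is_batched(raw_speech):
    """Heuristic: does `raw_speech` hold several audio clips rather than one?"""
    shape = getattr(raw_speech, "shape", None)
    if shape is not None:
        # Array-like input (np.ndarray, torch.Tensor, ...): 2+ dimensions is a batch.
        return len(shape) > 1
    # Plain sequences: a batch is a non-empty list/tuple of per-clip sequences or arrays.
    if isinstance(raw_speech, (list, tuple)) and len(raw_speech) > 0:
        first = raw_speech[0]
        return isinstance(first, (list, tuple)) or hasattr(first, "shape")
    return False


class FakeTensor:
    """Hypothetical stand-in that exposes only `.shape`, so no framework is needed."""

    def __init__(self, shape):
        self.shape = shape
```

Checking for a `.shape` attribute rather than a concrete class is one way to avoid a hard dependency on torch; whether the PR took this exact route is not stated in the commit message.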

Fix ONNX export for causal LM sequence classifiers by removing reverse indexing (#28144) · 548a8f61

Dean Wyatte authored Dec 22, 2023

* normalize reverse indexing for causal lm sequence classifiers

* normalize reverse indexing for causal lm sequence classifiers

* normalize reverse indexing for causal lm sequence classifiers

* use modulo instead

* unify modulo-based sequence lengths
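
The idea of replacing reverse (negative) indexing with a modulo can be sketched in plain Python. This is a simplified illustration, not the model code: the helper name `last_token_index` is hypothetical, right padding is assumed, and a list stands in for a tensor row.

```python
def last_token_index(input_ids, pad_token_id):
    """Return a non-negative index of the last non-padding token.

    Assumes right padding. When no pad token is present, the naive result
    is -1 (a reverse index); the modulo maps it to len(input_ids) - 1.
    """
    seq_len = len(input_ids)
    # Position of the first pad token, or 0 if there is none
    # (mirrors what an argmax over a boolean mask would return).
    first_pad = next(
        (i for i, token in enumerate(input_ids) if token == pad_token_id), 0
    )
    # first_pad - 1 may be -1; Python's % always yields a value in [0, seq_len).
    return (first_pad - 1) % seq_len
```

Keeping the index non-negative avoids relying on negative indexing in the exported graph, which is the motivation the commit title gives for the change. A sequence that starts with a pad token is not handled specially in this sketch.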

Update `docs/source/en/perf_infer_gpu_one.md` (#28198) · 71f46057
Yih-Dar authored Dec 22, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
[`Docs`] Add 4-bit serialization docs (#28182) · 3a8769f6
Younes Belkada authored Dec 22, 2023
```
* add 4-bit serialization docs

* up

* up
```

21 Dec, 2023 8 commits

Update YOLOS slow test values (#28187) · 3657748b
amyeroberts authored Dec 21, 2023
```
Update test values
```
Fix slow backbone tests - out_indices must match stage name ordering (#28186) · cd1350ce
amyeroberts authored Dec 21, 2023
```
Indices must match stage name ordering
```

Even more TF test fixes (#28146) · 260b9d21

Matt authored Dec 21, 2023

* Fix vision text dual encoder

* Small cleanup for wav2vec2 (not fixed yet)

* Small fix for vision_encoder_decoder

* Fix SAM builds

* Update TFBertTokenizer test with modern exporting + tokenizer

* Fix DeBERTa

* Fix DeBERTav2

* Try RAG fix but it's impossible to test locally

* Actually fix RAG now that I got FAISS working somehow

* Fix Wav2Vec2, add sermon

* Fix Hubert

[`Mixtral` & `Mistral`] Add support for sdpa (#28133) · f9a98c47

Arthur authored Dec 21, 2023



* some nits

* update test

* add support for sdpa

* remove some dummy inputs

* all good

* style

* nits

* fixes

* fix more copies

* nits

* styling

* fix

* Update src/transformers/models/mistral/modeling_mistral.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* add a slow test just to be sure

* fixup

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

[Whisper] Use torch for stft if available (#26119) · 814619f5

Sanchit Gandhi authored Dec 21, 2023

* [Whisper] Use torch for stft if available

* update docstring

* mock patch decorator

* fit on one line

Fix `input_embeds` docstring in encoder-decoder architectures (#28168) · 7e93ce40
Joao Gante authored Dec 21, 2023

[bnb] Let's make serialization of 4bit models possible (#26037) · 4f7806ef

Poedator authored Dec 21, 2023



* updated bitsandbytes.py

* rm test_raise_* from test_4bit.py

* add test_4bit_serialization.py

* modeling_utils bulk edits

* bnb_ver 0.41.3 in integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* @slow reinstated
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* bnb ver 0.41.3 in  src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* rm bnb version todo in  integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* moved 4b serialization tests to test_4bit

* tests upd for opt

* to torch_device
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* ruff fixes to tests

* rm redundant bnb version check in mod_utils
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* restore _hf_peft_config_loaded  modeling_utils.py::2188
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* restore _hf_peft_config_loaded  test in modeling_utils.py::2199
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* fixed NOT getattr(self, "is_8bit_serializable")
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* setting model.is_4bit_serializable

* rm separate fp16_statistics arg from set_module...

* rm else branch in integrations::bnb::set_module

* bnb 4bit dtype check

* upd comment on 4bit weights

* upd tests for FP4 safe

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

disable test_retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest (#28169) · e268d7e5
Dean Wyatte authored Dec 21, 2023
```
disable retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest
```

20 Dec, 2023 11 commits

Fix yolos resizing (#27663) · 1d777359
amyeroberts authored Dec 20, 2023
```
* Fix yolos resizing

* Update tests

* Add a test
```
Generate: fix speculative decoding (#28166) · 45b70384
Joao Gante authored Dec 20, 2023
```
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
```
[docs] Trainer docs (#28145) · 01c081d1
Steven Liu authored Dec 20, 2023
```
* fsdp, debugging, gpu selection

* fix hfoption

* fix
```

Align backbone stage selection with out_indices & out_features (#27606) · ee298a16

amyeroberts authored Dec 20, 2023

* Iterate over out_features instead of stage_names

* Update for all backbones

* Add tests

* Fix

* Align timm backbone behaviour with other backbones

* Fix tests

* Stricter checks on set out_features and out_indices

* Revert back stage selection logic

* Remove out-of-order logic

* Document restriction in docstrings

Update FA2 exception msg to point to hub discussions (#28161) · 224ab709
amyeroberts authored Dec 20, 2023
```
* Update FA2 exception msg to point to hub discussions

* Use path for hub url
```
Avoid unnecessary warnings when loading `CLIPConfig` (#28108) · 9924df9e
Yih-Dar authored Dec 20, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
Fix weights not properly initialized due to shape mismatch (#28122) · 7938c8c8
Yih-Dar authored Dec 20, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

move code to Trainer.evaluate to enable use of that function with multiple datasets (#27844) · 769a9542

peter-sk authored Dec 20, 2023



* move code to Trainer.evaluate to enable use of that function with multiple datasets

* test

* update doc string

* and a tip

* forgot the type

---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>

[gpt-neox] Add attention_bias config to support model trained without attention biases (#28126) · cd9f9d63
Jong-hun Shin authored Dec 20, 2023
```
* add attention_bias hparam for a model trained without attention biases

* fix argument documentation error
```

Fix FA2 integration (#28142) · def581ef

Sourab Mangrulkar authored Dec 20, 2023



* fix fa2

* fix FA2 for popular models

* improve warning and add Younes as co-author
Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix the warning

* Add Tip

* typo fix

* nit

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Remove deprecated CPU dockerfiles (#28149) · b134f685
Abolfazl Shahbazi authored Dec 19, 2023
```
Signed-off-by: Abolfazl Shahbazi <abolfazl.shahbazi@intel.com>
```

19 Dec, 2023 4 commits

[docs] Fix mistral link in mixtral.md (#28143) · 38611086
Aaron Jimenez authored Dec 19, 2023
```
Fix mistral link in mixtral.md
```

Update modeling_utils.py (#28127) · 23f8e4db

Mike Zellinger authored Dec 19, 2023

In docstring for PreTrainedModel.resize_token_embeddings, correct definition of new_num_tokens parameter to read "the new number of tokens" (meaning the new size of the vocab) rather than "the number of new tokens" (number of newly added tokens only).
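
The distinction the docstring fix draws can be shown with a toy example. This is a sketch only: `resize_rows` is a hypothetical helper working on nested lists, and zero-filling new rows is an assumption made for simplicity, not a statement about how the library initializes them.

```python
def resize_rows(embeddings, new_num_tokens, fill=0.0):
    """Return a copy of `embeddings` with exactly `new_num_tokens` rows.

    `new_num_tokens` is the new total vocabulary size, not the number of
    rows to append: extra rows are added at the end, surplus rows dropped.
    """
    dim = len(embeddings[0]) if embeddings else 0
    kept = [row[:] for row in embeddings[:new_num_tokens]]
    padding = [[fill] * dim for _ in range(new_num_tokens - len(kept))]
    return kept + padding


old = [[1.0, 1.0] for _ in range(5)]  # vocabulary of 5 tokens
num_added = 2                         # two newly added tokens
resized = resize_rows(old, len(old) + num_added)  # pass the new total: 7
```

Passing `num_added` alone would shrink the toy table to 2 rows, which is the misreading the corrected wording guards against.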

[`Mixtral`] Fix loss + nits (#28115) · 4a04b4cc

Arthur authored Dec 19, 2023



* default config should not use sliding window

* update the doc

* nits

* add a proper test

* update

* update

* update expected value

* Update src/transformers/tokenization_utils_fast.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* convert to float

* average then N**2

* comment

* revert nit

* good to go

* fixup

* Update tests/models/mixtral/test_modeling_mixtral.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* revert unrelated change

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>

Generate: speculative decoding (#27979) · ac974199

Joao Gante authored Dec 19, 2023



* speculative decoding

* fix test

* space

* better comments

* remove redundant test

* test nit

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* PR comments

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
