- 28 Jun, 2024 6 commits
-
-
Jade Choghari authored
* Add French translation of run scripts tutorial
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Jade Choghari <chogharijade@icloud.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Arthur authored
* softcapping
* soft cap before the mask
* style
* ...
* super nit
-
Sangbum Daniel Choi authored
* add gather_use_object arguments
* fix name and pass the CI test for Seq2SeqTrainer
* make style
* make it to functools
* fix typo
* add accelerate version:
* adding warning
* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* make style
* Update src/transformers/training_args.py
* check function move to initial part
* add test for eval_use_gather_object
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
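The `eval_use_gather_object` option above concerns gathering arbitrary Python objects across processes during evaluation, rather than padding and concatenating tensors. A minimal stdlib sketch of the idea (illustrative only, not the Trainer implementation; the function name here is made up):

```python
def gather_objects(per_rank_outputs):
    """Flatten per-process output lists in rank order.

    Tensors can be padded and concatenated across ranks; plain objects
    (dicts, strings, nested lists) cannot, so they are collected as-is.
    """
    gathered = []
    for rank_outputs in per_rank_outputs:
        gathered.extend(rank_outputs)
    return gathered
```

For example, two ranks each returning a list of dicts would be merged into a single flat list in rank order.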
-
Jacky Lee authored
* fix: use return_dict parameter
* fix: type checks
* fix: unused imports
* update: one-line if else
* remove: recursive check
-
hoshi-hiyouga authored
Update modeling_gemma2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Wing Lian authored
* don't zero out the attention_mask when using sliding window with flash attention
* chore: lint
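The fix above is about masking behaviour under a sliding window. As a refresher on what such a mask permits, here is a pure-Python sketch of a sliding-window causal mask (illustrative only; the flash-attention code path does not build an explicit boolean matrix like this):

```python
def sliding_window_causal_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j:
    causal (j <= i) and within the last `window` positions."""
    return [
        [max(0, i - window + 1) <= j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

With `seq_len=4, window=2`, position 3 attends only to positions 2 and 3.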
-
- 27 Jun, 2024 12 commits
-
-
Sanchit Gandhi authored
* fix gemma2
* handle in generate
-
Steven Liu authored
quick usage to top
-
Billy Cao authored
-
Arthur authored
* nit
* toctree issue
* protect gemma2 tests as well
* sdpa supported
-
Lysandre authored
-
Arthur authored
* initial commit
* Add doc
* protect?
* fixup stuff
* update tests
* fix build documentation
* mmmmmmm config attributes
* style
* nit
* update
* nit
* Fix docs
* protect some stuff
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
-
Raushan Turganbay authored
remove
-
Sangbum Daniel Choi authored
* change anchor_image_size None for compatibility
* make fix-copies
-
Billy Cao authored
* Allow dtype str for torch_dtype in from_pretrained
* Update docstring
* Add tests for str torch_dtype
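The change above lets `from_pretrained` accept `torch_dtype="float16"` in place of `torch_dtype=torch.float16`. A hedged stdlib sketch of the resolution idea (not the actual transformers code; names here are illustrative):

```python
def resolve_dtype(value, namespace):
    """Return namespace[value] when value is a dtype string, else value unchanged.

    With torch, namespace would be vars(torch), so "float16" -> torch.float16.
    "auto" is passed through: it is resolved later from the checkpoint config.
    """
    if isinstance(value, str) and value != "auto":
        try:
            return namespace[value]
        except KeyError:
            raise ValueError(f"unknown dtype string: {value!r}")
    return value
```

With logic like this in place, the string and object spellings of a dtype behave identically.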
-
Arthur authored
* fix and simplify the script!
* add co-author
---------
Co-authored-by: crackalamoo <crackalamoo@users.noreply.github.com>
-
Merve Noyan authored
* fixed models
* format with bumped ruff version on my local
* fix copies
* add tracing checks
* format
* Update src/transformers/utils/generic.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* format
* style fix
* Update modeling_mobilevit.py
* add docstring and change name
* Update __init__.py
* Update __init__.py
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Raushan Turganbay authored
* fix
* better
-
- 26 Jun, 2024 12 commits
-
-
amyeroberts authored
* Skip tests properly
* [test_all]
* Add 'reason' as kwarg for skipTest
* [test_all] Fix up
* [test_all]
-
Billy Cao authored
* Fix dtype casting in modeling_swin2sr to allow non-FP32 inference
* Fix formatting
* Fix for swinv2 too
* Update src/transformers/models/swin2sr/modeling_swin2sr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/swinv2/modeling_swinv2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add FP16 tests for swin2sr and swinv2
* [run_slow] swin2sr, swinv2
* [run_slow] swin2sr, swinv2
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Joao Gante authored
-
Pablo Montalvo authored
* fix extended attention mask
* add slow test for detection instance
* [run-slow]paligemma
-
Raushan Turganbay authored
* squash into single commit
* run diff once more
* docstring
* tests
* minor changes and ready to go
* Update src/transformers/models/llava_next_video/processing_llava_next_video.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vipllava/test_modeling_vipllava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* [run-slow] llava-next-video
* [run-slow] llava-next-video
* [run-slow] llava_next_video
* fix two tests
* fix slow tests
* remove logit checks due to numeric errors
* run test once more
* [run-slow] llava_next_video
* final try to pass the test
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* style
* fix
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Pavel Iakubovskii authored
* [run_slow] rt_detr
* Fix positional embeddings and anchors dtypes
* [run slow] rt_detr
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixup
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Younes Belkada authored
* fix llama fsdp
* fixup
* adding FSDP tests for CPU offloading
* fixes
* fix tests
* fix tests
* add it for mixtral
* propagate the changes on other models
* Update src/transformers/models/phi/modeling_phi.py
* Delete utils/testing_scripts/fsdp_cpu_offloading.py
Remove script - FSDP + CPU offloading is tested in the test suite
* Delete utils/testing_scripts/dummy_fsdp_config.yml
* Update + add cache_positions docstring
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Pavel Iakubovskii authored
Update code snippet
-
Marc Sun authored
-
Anton Vlasjuk authored
* starting support for sdpa in `gptneox` models
* small comment on tests
* fix dropout
* documentation and style
* clarify concrete paths for reference
* generalise attn projections and rope application
added head mask check to sdpa mask creation
handle sdpa memory backend bug via own version flag
* update docs and style
* move dtype casting outside of general attn_projection_and_rope function
fix flash_attn_2 stuff
* more generic attn warning if output_attns or head_mask
* simplify head mask check by moving head mask creation to a later point
* remove copied llama artifact
* remove padding_mask from attention function signature
* removing unnecessary comments, only "save" attn implementation once
* [run_slow] gpt_neox
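The SDPA path above dispatches attention to `torch.nn.functional.scaled_dot_product_attention`. As a reminder of what that operation computes, here is a naive pure-Python version over lists of vectors (illustrative only; the real kernel is fused and batched):

```python
import math

def sdpa(q, k, v):
    """Naive scaled dot-product attention: softmax(q @ k^T / sqrt(d)) @ v.

    q, k, v are lists of equal-length vectors (seq_len x dim).
    """
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * vj[t] for w, vj in zip(weights, v))
                    for t in range(len(v[0]))])
    return out
```

With identical keys, the weights are uniform and the output is the mean of the values.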
-
Vladimir Iashin authored
removes unnecessary second projection call
-
Saurav Maheshkar authored
docs: move translations to i18n
-
- 25 Jun, 2024 6 commits
-
-
amyeroberts authored
* Add ViTImageProcessor to tests
* Correct data format
* Review comments
-
Pablo Montalvo authored
improve error message for mismatched code blocks
-
Locke authored
preprocessing_num_workers option to speed up preprocessing
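The flag above parallelises preprocessing across worker processes (in the example scripts it is typically forwarded to `datasets.map` as `num_proc`). A minimal stdlib sketch of the idea, with a stand-in tokenizer:

```python
from multiprocessing import Pool

def tokenize(text):
    # stand-in for a real tokenizer
    return text.lower().split()

def preprocess(texts, num_workers=1):
    """Tokenize serially, or fan out across a worker pool when num_workers > 1."""
    if num_workers > 1:
        with Pool(num_workers) as pool:
            return pool.map(tokenize, texts)
    return [tokenize(t) for t in texts]
```

Note that `Pool` pickles the worker function, so it must be defined at module level.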
-
Raushan Turganbay authored
* squash in single commit
* add docs
* dummy obj
* more changes in diff converter
* tiny fix
* make docs happy
* skip test
* repo consistency tests
* update docstring
* style
* fix tests
* change diff imports
* [run-slow] instructblipvideo
* [run-slow] instructblipvideo
* fix tests and remove logit check
* [run-slow] instructblipvideo
-
jiqing-feng authored
* fix output data type of image classification
* add tests for low-precision pipeline
* add bf16 pipeline tests
* fix bf16 tests
* Update tests/pipelines/test_pipelines_image_classification.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix import
* fix import torch
* fix style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Raushan Turganbay authored
* device-map siglip
* move split modules to PretrainedSigLip
-
- 24 Jun, 2024 4 commits
-
-
René Gentzen authored
* Constrained numpy to <2.0
* Updated dependency_versions_table
---------
Co-authored-by: René Gentzen <rene.gentzen@mittelstand.ai>
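The pin above guards against NumPy 2.0, which broke binary compatibility with extensions compiled against the 1.x ABI. A requirements-style fragment of what such a constraint looks like (the exact lower bound in the merged dependency table may differ):

```text
numpy<2.0
```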
-
amyeroberts authored
-
Quentin Gallouédec authored
-
Matt authored
* Add Jinja as a requirement with the right version cutoff
* Correct package name!
-