Commits · 2aa2a14481dda0243522e6dff018aadab9829efa · chenpangpang / transformers

05 Jul, 2024 3 commits
- Make tensor device correct when ACCELERATE_TORCH_DEVICE is defined (#31751) · 2aa2a144
  Kazuaki Ishizaki authored Jul 05, 2024
```
return correct device when ACCELERATE_TORCH_DEVICE is defined
```
  2aa2a144
- Fix serialization for offloaded model (#31727) · 8c5c180d
  Marc Sun authored Jul 05, 2024
```
* Fix serialization

* style

* add test
```
  8c5c180d
- Fix ClapProcessor to merge feature_extractor output into the returned BatchEncoding (#31767) · eaa5f414
  mxkopy authored Jul 04, 2024
```
* fixed ClapProcessor to merge all values output from the feature extractor into the returned BatchEncoding.

* fixed trailing whitespace
```
  eaa5f414
04 Jul, 2024 3 commits

Add torch_empty_cache_steps to TrainingArguments (#31546) · 43ffb785

aliencaocao authored Jul 05, 2024

* Add torch_empty_cache_steps to TrainingArguments

* Fix formatting

* Add torch_empty_cache_steps to docs on single gpu training

* Remove check for torch_empty_cache_steps <= max_steps

* Capitalize Tip

* Be device agnostic

* Fix linting

43ffb785

Fix Gemma2 types (#31779) · cee768d9
hoshi-hiyouga authored Jul 04, 2024
```
Update __init__.py
```
cee768d9
`pytest_num_workers=4` for some CircleCI jobs (#31764) · 87726a08
Yih-Dar authored Jul 04, 2024
```
pytest_num_workers=4
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
87726a08

03 Jul, 2024 9 commits

Fix RT-DETR weights initialization (#31724) · 048f599f

Pavel Iakubovskii authored Jul 03, 2024

* Fix init for rt-detr heads

* Fixup

* Add separate prior_prob value to config for initialization

* Add bbox init

* Change to 1 / num_labels init

* Adjust weights init test

* Fix style for test

048f599f

Fix RT-DETR cache for generate_anchors (#31671) · b9752161

Pavel Iakubovskii authored Jul 03, 2024

* Fix cache and type conversion

* Add test

* Fixup

* nit

* [run slow] rt_detr

* Fix test

* Fixup

* [run slow] rt_detr

* Update src/transformers/models/rt_detr/modeling_rt_detr.py

b9752161

[fix bug] logits's shape different from label's shape in preprocess_logits_for_metrics (#31447) · 534cbf8a

Willard Sheen authored Jul 03, 2024

* [fix BUG] pad labels before using them in preprocess_logits_for_metrics

* a more readable fix

labels can't go through `gather` before they are passed to `preprocess_logits_for_metrics`, so the logic must be split into 2 if-blocks

* add a comment

* oh code quality check

534cbf8a

Add ignore_errors=True to trainer.py rmtree in _inner_training_loop (#31668) · 65a02cd2
Nate Brake authored Jul 03, 2024
```
Update trainer.py
```
65a02cd2
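
For context, the flag named in the commit above is a standard-library option: `shutil.rmtree(path, ignore_errors=True)` suppresses failures such as the directory already being gone. The following is a minimal sketch of that behavior only. The helper `remove_checkpoint_dir` and the directory name are hypothetical and are not taken from `trainer.py`.

```python
import os
import shutil
import tempfile


def remove_checkpoint_dir(path: str) -> None:
    """Delete a directory tree, tolerating failures (hypothetical helper)."""
    # ignore_errors=True makes rmtree swallow errors, for example when
    # another process has already deleted the directory.
    shutil.rmtree(path, ignore_errors=True)


base = tempfile.mkdtemp()
stale = os.path.join(base, "checkpoint-100")  # illustrative name only
os.makedirs(stale)

remove_checkpoint_dir(stale)  # removes the directory
remove_checkpoint_dir(stale)  # second call does nothing instead of raising
still_exists = os.path.exists(stale)

shutil.rmtree(base, ignore_errors=True)  # clean up the temporary parent
```

Without the flag, the second call would raise `FileNotFoundError`, which is the kind of failure a concurrent cleanup could plausibly trigger.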
Gemma 2: Update slow tests (#31759) · ddfaf119
Joao Gante authored Jul 03, 2024
```
gemma 2 slow tests
```
ddfaf119
handle (processor_class, None) returned by ModelPatterns (#31753) · c1fe1259
Pablo Montalvo authored Jul 03, 2024

c1fe1259

Adds final answer tool for all agents (#31703) · 0fd885b9

Aymeric Roucher authored Jul 03, 2024

* Adds final answer tool for all agents

* Typo

* Add clarification in doc

* Put final_answer tool addition in agent for clarity

0fd885b9

Requires for torch.tensor before casting (#31755) · dc72fd7e
Ella Charlaix authored Jul 03, 2024

dc72fd7e

fix assisted decoding (#31401) · 7f91f168

jiqing-feng authored Jul 03, 2024

* fix assisted decoding

* check None

* fix typo

* fix _prepare_special_tokens

* fix style

* fix lint

* add tests for assisted decoding

* fix style

* fix tests check

7f91f168

02 Jul, 2024 7 commits

Fix documentation for Gemma2. (#31682) · f91c16d2

Jörg Bornschein authored Jul 02, 2024



* Fix documentation for Gemma2. 

Model sizes and Blog post URL are wrong in the documentation.

* Update docs/source/en/model_doc/gemma2.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

f91c16d2

Make tool JSON schemas consistent (#31756) · cd0935dd
Matt authored Jul 02, 2024
```
Make the order of array items consistent using sorted()
```
cd0935dd
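
The idea in the commit body above, sorting array items so the generated schema is stable, can be illustrated with a small sketch. It assumes the unordered items come from a Python `set`. The function `build_schema` and the field names are hypothetical and do not reproduce the library's schema code.

```python
import json


def build_schema(properties: dict, required: set) -> dict:
    """Build a JSON-schema-like dict with a deterministic `required` list."""
    return {
        "type": "object",
        "properties": properties,
        # Sets have no guaranteed iteration order, so sort before
        # turning them into a JSON array.
        "required": sorted(required),
    }


props = {"city": {"type": "string"}, "unit": {"type": "string"}}
schema_a = build_schema(props, {"unit", "city"})
schema_b = build_schema(props, {"city", "unit"})

# Both serializations match regardless of how the set was constructed.
same_output = json.dumps(schema_a) == json.dumps(schema_b)
```

A stable order keeps serialized schemas comparable across runs, which helps when they are checked in tests or embedded in prompts.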
🚨🚨 TextGenerationPipeline: rely on the tokenizer default kwargs (#31747) · 82486e59
Joao Gante authored Jul 02, 2024
```
* rely on the tokenizer default kwargs

* fix a few tests
```
82486e59

[whisper] static kv cache (#31166) · a9701953

Sanchit Gandhi authored Jul 02, 2024



* make work with cache abstraction

* correct for static cache

* hacks for compile

* make fast

* fix

* fix pos ids

* generate

* fix sdpa

* fix sdpa cache pos

* fix fa2

* clean fa2

* integrate cache into generate

* make style

* copies

* more copies

* update eager

* update sdpa

* update fa2

* simplify

* use cache pos

* always compute cross-cache for debug

* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>

* fix fix

* fix fix fix

* more fix

* try encoder-decoder cache (too messy)

* revert encoder-decoder cache

* check cross-attn cache

* use enc-dec dataclass

* use richer enc-dec dataclass

* clean-up

* revert static cache changes

* small fixes

* revert to cpu flag

* fix copies

* add static slow test

* past k/v docstring

* more docstrings

* cache_position docstrings

* add to docs

* add enc-dec cache to docs

* make style

* fix afte...

a9701953

Fix mistral ONNX export (#31696) · 57d7594a
fxmarty authored Jul 02, 2024
```
* use bitwise or

* why is the CI not triggered?
```
57d7594a

Move some test files (`tests/test_xxx_utils.py`) to `tests/utils` (#31730) · 93cd94b7

Yih-Dar authored Jul 02, 2024



* move

* move

* move

* move

* Update tests/utils/test_image_processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

93cd94b7

remove incorrect urls pointing to the llava repository (#31107) · cf85e86e

Krisztián Boros authored Jul 02, 2024

* remove incorrect urls pointing to the llava repository

* remove incorrect urls pointing to the llava repository; removing entire comments

* remove incorrect urls pointing to the llava repository; removing entire comments; ran fix-copies

* ran fixup

cf85e86e

01 Jul, 2024 1 commit
- dependencies: `keras-nlp<0.14` pin (#31684) · 3345ae73
  Joao Gante authored Jul 01, 2024
```
* keras nlp pin

* this should use the new docker images:dev

* dev-ci
```
  3345ae73
28 Jun, 2024 6 commits

Add French version of run scripts tutorial (#31483) · e6550295

Jade Choghari authored Jun 28, 2024



* Add French translation of run scripts tutorial

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: Jade Choghari <chogharijade@icloud.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

e6550295

Gemma capping is a must for big models (#31698) · bbf1e618
Arthur authored Jun 28, 2024
```
* softcapping

* soft cap before the mask

* style

* ...

* super nit
```
bbf1e618
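
The bullets above mention soft-capping and applying it before the mask. As a rough illustration, one common formulation squashes each score with `cap * tanh(score / cap)`, which bounds finite values to the interval (-cap, cap). The sketch below assumes that formulation and uses plain Python lists. The function names and the cap value of 50.0 are illustrative and are not taken from the commit.

```python
import math


def soft_cap(scores, cap):
    """Squash scores into (-cap, cap) with a scaled tanh (assumed formulation)."""
    return [cap * math.tanh(s / cap) for s in scores]


def apply_additive_mask(scores, mask):
    """Add a mask whose entries are 0.0 (keep) or -inf (drop)."""
    return [s + m for s, m in zip(scores, mask)]


scores = [120.0, -3.0, 0.5]
mask = [0.0, 0.0, float("-inf")]

# Cap first, then mask: masked positions stay at -inf.
capped = soft_cap(scores, cap=50.0)
masked = apply_additive_mask(capped, mask)

# Reversed order for comparison: tanh maps -inf to -1, so the masked
# position would come back as a finite -cap and no longer be excluded.
wrong_order = soft_cap(apply_additive_mask(scores, mask), cap=50.0)
```

Under this formulation the order matters: capping after masking turns `-inf` into `-cap`, so a masked position would regain some weight after a softmax.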

add gather_use_object arguments (#31514) · cb298978

Sangbum Daniel Choi authored Jun 28, 2024



* add gather_use_object arguments

* fix name and pass the CI test for Seq2SeqTrainer

* make style

* make it to functools

* fix typo

* add accelerate version:

* adding warning

* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* make style

* Update src/transformers/training_args.py

* check function move to initial part

* add test for eval_use_gather_object

---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

cb298978

Fix return_dict in encodec (#31646) · 82a1fc72

Jacky Lee authored Jun 28, 2024

* fix: use return_dict parameter

* fix: type checks

* fix: unused imports

* update: one-line if else

* remove: recursive check

82a1fc72

Fix Gemma2 4d attention mask (#31674) · 5e89b335

hoshi-hiyouga authored Jun 28, 2024



Update modeling_gemma2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

5e89b335

don't zero out the attention_mask when using sliding window with flash attention (#31670) · 0142aab7
Wing Lian authored Jun 27, 2024
```
* don't zero out the attention_mask when using sliding window with flash attention

* chore: lint
```
0142aab7

27 Jun, 2024 11 commits

[HybridCache] Fix `get_seq_length` method (#31661) · 1c68f2ca
Sanchit Gandhi authored Jun 27, 2024
```
* fix gemma2

* handle in generate
```
1c68f2ca
[docs] Llama3 (#31662) · 464aa746
Steven Liu authored Jun 27, 2024
```
quick usage to top
```
464aa746
Fix float out of range in owlvit and owlv2 when using FP16 or lower precision (#31657) · e44b878c
aliencaocao authored Jun 28, 2024

e44b878c
Fix post gemma merge (#31660) · 75a63198
Arthur authored Jun 27, 2024
```
* nit

* toctree issue

* protect gemma2 tests as well

* sdpa supported
```
75a63198
v4.43.0.dev0 · 727eea4a
Lysandre authored Jun 27, 2024

727eea4a

Add gemma 2 (#31659) · 0cf60f13

Arthur authored Jun 27, 2024



* initial commit

* Add doc

* protect?

* fixup stuffs

* update tests

* fix build documentation

* mmmmmmm config attributes

* style

* nit

* update

* nit

* Fix docs

* protect some stuff

---------
Co-authored-by: Lysandre <lysandre@huggingface.co>

0cf60f13

Remove deprecated config attribute in VLMs (#31655) · 4aa17d00
Raushan Turganbay authored Jun 27, 2024
```
remove
```
4aa17d00
change anchor_image_size None for compatibility (#31640) · be50a033
Sangbum Daniel Choi authored Jun 27, 2024
```
* change anchor_image_size None for compatibility

* make fix-copies
```
be50a033
[QoL] Allow dtype str for torch_dtype arg of from_pretrained (#31590) · 3a028101
aliencaocao authored Jun 27, 2024
```
* Allow dtype str for torch_dtype in from_pretrained

* Update docstring

* Add tests for str torch_dtype
```
3a028101

[`Llama`] Conversion: fix and simplify the script! (#31591) · 11138ca0

Arthur authored Jun 27, 2024



* fix and simplify the script!

* add co-author

---------
Co-authored-by: crackalamoo <crackalamoo@users.noreply.github.com>

11138ca0

Fix ONNX exports for Optimum compatible models (#31311) · c9f191a0

Merve Noyan authored Jun 27, 2024



* fixed models

* format with bumped ruff version on my local

* fix copies

* add tracing checks

* format

* Update src/transformers/utils/generic.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* format

* style fix

* Update modeling_mobilevit.py

* add docstring and change name

* Update __init__.py

* Update __init__.py

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

c9f191a0