Commits · 4879ac2b33c00b4418ff8f6e0afd501567b2339e · chenpangpang / transformers

08 Jul, 2024 9 commits

Avoid failure `TFBlipModelTest::test_pipeline_image_to_text` (#31827) · 4879ac2b
Yih-Dar authored Jul 08, 2024
```
* fix

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

transformers.fx.symbolic_trace supports inputs_embeds (#31574) · ba743700

fxmarty authored Jul 08, 2024



* symbolic trace supports inputs_embeds

* fix test?

* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Fix typos (#31819) · e5ca9b05
omahs authored Jul 08, 2024
```
* fix typo

* fix typo

* fix typos

* fix typo

* fix typos
```

Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/lxmert (#31838) · f4711844

dependabot[bot] authored Jul 08, 2024

Bump certifi in /examples/research_projects/lxmert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Bump transformers from 4.26.1 to 4.38.0 in /examples/tensorflow/language-modeling-tpu (#31837) · 9f3f58c9

dependabot[bot] authored Jul 08, 2024

Bump transformers in /examples/tensorflow/language-modeling-tpu

Bumps [transformers](https://github.com/huggingface/transformers) from 4.26.1 to 4.38.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.26.1...v4.38.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Add FA2 and `sdpa` support for SigLIP (#31499) · a177821b

Pavel Iakubovskii authored Jul 08, 2024

* Rebase to main

* Fix attention implementation autoset for text and vision configs

* Fixup

* Minor fixes

* Fix copies

* Fix attention_mask for FA2

* Add equivalence tests for siglip

* Remove right padding test

* Uncomment flaky

* Fix import

* Add to docs

* Fix test message

* Add sdpa

* Add sdpa equivalence test

* Add siglip sdpa to docs

* Fix typing for attention output

* Add sdpa tests

* Fix signature of FA2

* Autoset attn_implementation in config

* Rename bsz -> batch_size

* Move back autoset attn method

* Mark as flaky

* Correct attention mask padding

* [run-slow] siglip

* Add FA2 and sdpa docs

* Style fix

* Remove flaky for FA2 test

* Change attention implementation set

* Change attn_implementation propagation

* Fix typos

* Add modality to assert message

* Add more sdpa backends in test

* [run slow] siglip

* Add math sdpa backend for all options

* [run slow] siglip


Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/decision_transformer (#31813) · 076e66e4

dependabot[bot] authored Jul 08, 2024

Bump certifi in /examples/research_projects/decision_transformer

Bumps [certifi](https://github.com/certifi/python-certifi) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Fix Seq2SeqTrainer crash when BatchEncoding data is None (#31418) · c1cda0ee
Dingli Yang authored Jul 08, 2024
```
avoiding crash when BatchEncoding data is None
```

Add ZoeDepth (#30136) · 06fd7972

NielsRogge authored Jul 08, 2024



* First draft

* Add docs

* Clean up code

* Convert model

* Add image processor

* Convert Zoe_K

* More improvements

* Improve variable names and docstrings

* Improve variable names

* Improve variable names

* Replace nn.sequential

* More improvements

* Convert ZoeD_NK

* Fix most tests

* Verify pixel values

* Verify pixel values

* Add squeeze

* Update beit to support arbitrary window sizes

* Improve image processor

* Improve docstring

* Improve beit

* Improve model outputs

* Add figure

* Fix beit

* Update checkpoint

* Fix repo id

* Add _keys_to_ignore_on_load_unexpected

* More improvements

* Address comments

* Address comments

* Address comments

* Address comments

* Rename variable name

* Add backbone_hidden_size

* Vectorize

* Vectorize more

* Address comments

* Clarify docstring

* Remove backbone_hidden_size

* Fix image processor

* Remove print statements

* Remove print statement

* Add integration test

* Address comments

* Address comments

* Address comments

* Address comments

* Add requires_backends

* Clean up

* Simplify conversion script

* Simplify more

* Simplify more

* Simplify more

* Clean up

* Make sure beit is loaded correctly

* Address comment

* Address bin_configurations

* Use bin_configurations

* Convert models, add integration tests

* Fix doc test

* Address comments

* Unify regressor classes

* Clarify arguments

* Improve resize_image

* Add num_relative_features

* Address comment

* [run-slow]beit,data2vec,zoedepth

* [run-slow]beit,data2vec,zoedepth

* Address comments

* Address comment

* Address comment

* Replace nn.TransformerEncoderLayer and nn.TransformerEncoder

* Replace nn.MultiheadAttention

* Add attributes for patch transformer to config

* Add tests for ensure_multiple_of

* Update organization

* Add tests

* [run-slow] beit data2vec

* Update ruff

* [run-slow] beit data2vec

* Add comment

* Improve docstrings, add test

* Fix interpolate_pos_encoding

* Fix slow tests

* Add docstring

* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Improve tests and docstrings

* Use run_common_tests

* Improve docstrings

* Improve docstrings

* Improve tests

* Improve tests

* Remove print statements

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


05 Jul, 2024 13 commits

Depth Anything: update conversion script for V2 (#31522) · 1082361a

Pedro Cuenca authored Jul 05, 2024

* Depth Anything: update conversion script for V2

* Update docs

* Style

* Revert "Update docs"

This reverts commit be0ca47ea1be4f3cd9aa2113bdd8efcc9959119e.

* Add docs for depth anything v2

* Add depth_anything_v2 to MODEL_NAMES_MAPPING

Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files

* Add tip in original docs


Fix Wav2Vec2 Fairseq conversion (weight norm state dict keys) (#31714) · a8fa6fbb
Thien Tran authored Jul 06, 2024
```
* handle new weight norm

* fix

* fix trailing space
```

Fix galore lr display with schedulers (#31710) · a01b033c

Anton Vlasjuk authored Jul 05, 2024

* fix galore lr display with lr schedulers

* style

* add some tests to check for displayed lrs

* copy-paste err for warmup steps

* standardize the default lr to be only in the optimizer

* trying out my luck with the reads


Allow FP16 or other precision inference for Pipelines (#31342) · ac262604

Billy Cao authored Jul 06, 2024



* cast image features to model.dtype where needed to support FP16 or other precision in pipelines

* Update src/transformers/pipelines/image_feature_extraction.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Use .to instead

* Add FP16 pipeline support for zeroshot audio classification

* Remove unused torch imports

* Add docs on FP16 pipeline

* Remove unused import

* Add FP16 tests to pipeline mixin

* Add fp16 placeholder for mask_generation pipeline test

* Add FP16 tests for all pipelines

* Fix formatting

* Remove torch_dtype arg from is_pipeline_test_to_skip*

* Fix format

* trigger ci

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Repeating an important warning in the chat template docs (#31796) · e7868444

Matt authored Jul 05, 2024



* Repeating an important warning in the chat template docs

* Update docs/source/en/chat_templating.md
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Reword for clarity

* Reword for clarity

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>


Add training support for SigLIP (#31495) · 1d3eaa6f

Billy Cao authored Jul 05, 2024

* Add siglip loss function

* Update docs

* Enable training tests
[experimental] enable GC training tests as it has worked for my own data

* Remove test_training* overrides to enable training tests
[run_slow] siglip

* Skip training tests for Siglip text model and ImageClassificationModel
[run_slow] siglip

* Skip GC training tests for SiglipForImageClassification

* Explicitly skip training tests for SiglipVisionModel
Add skip reason for training tests for SiglipTextModel

* Remove copied from to fix CI
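
The "siglip loss function" mentioned above is not shown in this log. As a rough illustration only, the pairwise sigmoid loss described for SigLIP can be sketched in plain Python, assuming the input is a square text-by-image logit matrix that already includes any temperature and bias. The function names here are hypothetical and this is not the code from the commit.

```python
import math


def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def siglip_loss(logits: list[list[float]]) -> float:
    """Pairwise sigmoid loss over an n x n logit matrix (hypothetical helper).

    Matching text-image pairs are assumed to sit on the diagonal and are
    labelled +1; every other pair is labelled -1.
    """
    n = len(logits)
    total = 0.0
    for i, row in enumerate(logits):
        for j, logit in enumerate(row):
            sign = 1.0 if i == j else -1.0
            total -= log_sigmoid(sign * logit)
    # Sum over pairs, averaged over the batch dimension.
    return total / n


# Well-separated logits give a loss close to zero.
well_separated = siglip_loss([[10.0, -10.0], [-10.0, 10.0]])
```

Swapping the diagonal and off-diagonal values in the example makes the loss large, which is the behaviour a training test would rely on.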


Code agent: allow function persistence between steps (#31769) · 15560252
Aymeric Roucher authored Jul 05, 2024
```
* Code agent: allow function persistence between steps
```

Fix gemma tests (#31794) · eef0507f

Yih-Dar authored Jul 05, 2024



* skip 3 7b tests

* fix

* fix

* fix

* [run-slow] gemma

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Update CometCallback to allow reusing of the running experiment (#31366) · 9e599d1d

Boris Feld authored Jul 05, 2024

* Update CometCallback to allow reusing of the running experiment

* Fixups

* Remove useless TODO

* Add checks for minimum version of the Comet SDK

* Fix documentation and links.

Also simplify how the Comet Experiment name is passed


Exclude torch.compile time from metrics computation (#31443) · d19b5a90
xiangdong authored Jul 05, 2024
```
* exclude compile time from metrics computation

* fix the quality issue
```
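
To illustrate why excluding compile time matters for throughput metrics, here is a minimal sketch. The function name and parameters are hypothetical and do not reflect the actual change in the commit; the idea is only that one-off compilation time is subtracted from the measured runtime before computing samples per second.

```python
def samples_per_second(
    num_samples: int,
    start_time: float,
    end_time: float,
    compile_time: float = 0.0,
) -> float:
    # Hypothetical helper: remove one-off compilation time from the wall-clock
    # runtime so it does not deflate the reported throughput.
    runtime = (end_time - start_time) - compile_time
    if runtime <= 0:
        raise ValueError("runtime must be positive after removing compile time")
    return num_samples / runtime


# 100 samples in 12 s of wall-clock time, 2 s of which were compilation.
with_compile = samples_per_second(100, 0.0, 12.0)
without_compile = samples_per_second(100, 0.0, 12.0, compile_time=2.0)
```
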
Make tensor device correct when ACCELERATE_TORCH_DEVICE is defined (#31751) · 2aa2a144
Kazuaki Ishizaki authored Jul 05, 2024
```
return correct device when ACCELERATE_TORCH_DEVICE is defined
```
Fix serialization for offloaded model (#31727) · 8c5c180d
Marc Sun authored Jul 05, 2024
```
* Fix serialization

* style

* add test
```

Fix ClapProcessor to merge feature_extractor output into the returned BatchEncoding (#31767) · eaa5f414

mxkopy authored Jul 04, 2024

* fixed ClapProcessor to merge all values output from the feature extractor into the returned BatchEncoding.

* fixed trailing whitespace


04 Jul, 2024 3 commits

Add torch_empty_cache_steps to TrainingArguments (#31546) · 43ffb785

Billy Cao authored Jul 05, 2024

* Add torch_empty_cache_steps to TrainingArguments

* Fix formatting

* Add torch_empty_cache_steps to docs on single gpu training

* Remove check for torch_empty_cache_steps <= max_steps

* Capitalize Tip

* Be device agnostic

* Fix linting
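
A minimal sketch of the cadence that an option like `torch_empty_cache_steps` implies, assuming it means "empty the device cache every N training steps". The loop and the `empty_cache` callback are hypothetical stand-ins so the example runs without PyTorch; they are not the Trainer's implementation.

```python
from typing import Callable, Optional


def run_steps(
    num_steps: int,
    empty_cache_steps: Optional[int],
    empty_cache: Callable[[], None],
) -> None:
    # Hypothetical training loop: the real work of each step is omitted.
    for step in range(1, num_steps + 1):
        # ... forward / backward / optimizer step would go here ...
        if empty_cache_steps is not None and step % empty_cache_steps == 0:
            empty_cache()  # stand-in for a device-specific cache flush


calls = []
run_steps(num_steps=10, empty_cache_steps=4, empty_cache=lambda: calls.append("flush"))
```

With ten steps and a value of 4, the callback fires at steps 4 and 8. In an actual run the value would presumably be passed as `TrainingArguments(..., torch_empty_cache_steps=4)`, the argument named in the commit title.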


Fix Gemma2 types (#31779) · cee768d9
hoshi-hiyouga authored Jul 04, 2024
```
Update __init__.py
```
`pytest_num_workers=4` for some CircleCI jobs (#31764) · 87726a08
Yih-Dar authored Jul 04, 2024
```
pytest_num_workers=4
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

03 Jul, 2024 9 commits

Fix RT-DETR weights initialization (#31724) · 048f599f

Pavel Iakubovskii authored Jul 03, 2024

* Fix init for rt-detr heads

* Fixup

* Add separate prior_prob value to config for initialization

* Add bbox init

* Change to 1 / num_labels init

* Adjust weights init test

* Fix style for test
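
The bullets above mention a `prior_prob` value and a `1 / num_labels` init. The log does not show the code, so the following is only an assumption about the general technique: a common way to initialise a classification-head bias from a prior probability p is b = -log((1 - p) / p), which makes sigmoid(b) equal p at the start of training. The helper names are hypothetical.

```python
import math


def bias_from_prior(prior_prob: float) -> float:
    # Inverse of the sigmoid: sigmoid(bias) == prior_prob.
    return -math.log((1.0 - prior_prob) / prior_prob)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


num_labels = 80                # illustrative value, not taken from the commit
prior_prob = 1.0 / num_labels  # the "1 / num_labels" choice named above
bias = bias_from_prior(prior_prob)
initial_score = sigmoid(bias)  # recovers prior_prob up to rounding
```
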


Fix RT-DETR cache for generate_anchors (#31671) · b9752161

Pavel Iakubovskii authored Jul 03, 2024

* Fix cache and type conversion

* Add test

* Fixup

* nit

* [run slow] rt_detr

* Fix test

* Fixup

* [run slow] rt_detr

* Update src/transformers/models/rt_detr/modeling_rt_detr.py


[fix bug] logits's shape different from label's shape in preprocess_logits_for_metrics (#31447) · 534cbf8a

Willard Sheen authored Jul 03, 2024

* [fix BUG] pad labels before using them in preprocess_logits_for_metrics

* a more readable fix

`gather` can't be applied to labels before they are passed to `preprocess_logits_for_metrics`, so this must be split into 2 if-blocks

* add a comment

* oh code quality check


Add ignore_errors=True to trainer.py rmtree in _inner_training_loop (#31668) · 65a02cd2
Nate Brake authored Jul 03, 2024
```
Update trainer.py
```
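
For context, `shutil.rmtree` raises if the target cannot be removed (for example, because it is already gone) unless `ignore_errors=True` is passed. A minimal standalone sketch of that behaviour follows; the helper name and the directory are hypothetical and unrelated to the paths the Trainer actually removes.

```python
import shutil
import tempfile
from pathlib import Path


def remove_checkpoint_dir(path: Path) -> None:
    # Hypothetical helper. ignore_errors=True makes rmtree swallow failures
    # (such as a missing directory) instead of raising.
    shutil.rmtree(path, ignore_errors=True)


staging = Path(tempfile.mkdtemp()) / "checkpoint-staging"
staging.mkdir()
remove_checkpoint_dir(staging)  # removes the directory
remove_checkpoint_dir(staging)  # directory is already gone; no exception
```
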
Gemma 2: Update slow tests (#31759) · ddfaf119
Joao Gante authored Jul 03, 2024
```
gemma 2 slow tests
```
handle (processor_class, None) returned by ModelPatterns (#31753) · c1fe1259
Pablo Montalvo authored Jul 03, 2024


Adds final answer tool for all agents (#31703) · 0fd885b9

Aymeric Roucher authored Jul 03, 2024

* Adds final answer tool for all agents

* Typo

* Add clarification in doc

* Put final_answer tool addition in agent for clarity


Requires for torch.tensor before casting (#31755) · dc72fd7e
Ella Charlaix authored Jul 03, 2024


fix assisted decoding (#31401) · 7f91f168

jiqing-feng authored Jul 03, 2024

* fix assisted decoding

* check None

* fix typo

* fix _prepare_special_tokens

* fix style

* fix lint

* add tests for assisted decoding

* fix style

* fix tests check


02 Jul, 2024 6 commits

Fix documentation for Gemma2. (#31682) · f91c16d2

Jörg Bornschein authored Jul 02, 2024



* Fix documentation for Gemma2. 

Model sizes and Blog post URL are wrong in the documentation.

* Update docs/source/en/model_doc/gemma2.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Make tool JSON schemas consistent (#31756) · cd0935dd
Matt authored Jul 02, 2024
```
Make the order of array items consistent using sorted()
```
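
A small sketch of why `sorted()` helps here, assuming the array items come from an unordered collection such as a set. The schema shape and function name are illustrative, not the library's actual schema builder.

```python
import json


def build_schema(required_args: set[str]) -> dict:
    # Hypothetical builder: set iteration order is not guaranteed, so sort
    # before emitting the array to keep the JSON output stable.
    return {"type": "object", "required": sorted(required_args)}


schema_a = json.dumps(build_schema({"location", "unit", "date"}))
schema_b = json.dumps(build_schema({"date", "location", "unit"}))
```

Both calls serialise to the same string, so generated schemas can be compared or snapshot-tested reliably.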
🚨🚨 TextGenerationPipeline: rely on the tokenizer default kwargs (#31747) · 82486e59
Joao Gante authored Jul 02, 2024
```
* rely on the tokenizer default kwargs

* fix a few tests
```

[whisper] static kv cache (#31166) · a9701953

Sanchit Gandhi authored Jul 02, 2024



* make work with cache abstraction

* correct for static cache

* hacks for compile

* make fast

* fix

* fix pos ids

* generate

* fix sdpa

* fix sdpa cache pos

* fix fa2

* clean fa2

* integrate cache into generate

* make style

* copies

* more copies

* update eager

* update sdpa

* update fa2

* simplify

* use cache pos

* always compute cross-cache for debug

* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>

* fix fix

* fix fix fix

* more fix

* try encoder-decoder cache (too messy)

* revert encoder-decoder cache

* check cross-attn cache

* use enc-dec dataclass

* use richer enc-dec dataclass

* clean-up

* revert static cache changes

* small fixes

* revert to cpu flag

* fix copies

* add static slow test

* past k/v docstring

* more docstrings

* cache_position docstrings

* add to docs

* add enc-dec cache to docs

* make style

* fix after rebase

* fix beam

* style

* fix generation strategies

* fix most decoder-only tests

* style

* skip test

* more clean up

* small docstrings

* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add todo

* only crop self-attn

* check cache in mixin

* style

* fix re-compile after rebase

* move `is_updated` logic to enc-dec wrapper

* revert back

* revert cache back

* finalise design

* fix

* fix fix

* style

* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* deprecate

* updates

* final updates

* style

* style

---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Fix mistral ONNX export (#31696) · 57d7594a
fxmarty authored Jul 02, 2024
```
* use bitwise or

* why is the CI not triggered?
```

Move some test files (`tests/test_xxx_utils.py`) to `tests/utils` (#31730) · 93cd94b7

Yih-Dar authored Jul 02, 2024



* move

* move

* move

* move

* Update tests/utils/test_image_processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
