- 09 Jul, 2024 10 commits
-
-
Merve Noyan authored
---------
Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Yih-Dar authored
* init
* test
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yung-Sung Chuang authored
Co-authored-by: Joao Gante <joao@huggingface.co>
-
chenk authored
Signed-off-by: chenk <hen.keinan@gmail.com>
-
Joao Gante authored
fix test
-
kallewoof authored
-
hatti authored
remove duplicate words
-
NielsRogge authored
Add model
-
fxmarty authored
only test input_embeds, not decoder_input_embeds
-
Raushan Turganbay authored
* deprecate `vocab_size` in other two VLMs
* Update src/transformers/models/fuyu/configuration_fuyu.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* deprecate until 4.44
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 08 Jul, 2024 11 commits
-
-
Joao Gante authored
* enable strict signature
* this should not have been deleted
* recurrent_gemma too
-
André Storhaug authored
* Fix wrong accelerator device setup when using MPS
* More robust TrainingArguments MPS handling
* Update training_args.py
* Cleanup
-
Yih-Dar authored
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
fxmarty authored
* symbolic trace supports inputs_embeds
* fix test?
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
omahs authored
* fix typo
* fix typo
* fix typos
* fix typo
* fix typos
-
dependabot[bot] authored
Bump certifi in /examples/research_projects/lxmert
Bumps [certifi](https://github.com/certifi/python-certifi) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04)
---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bump transformers in /examples/tensorflow/language-modeling-tpu
Bumps [transformers](https://github.com/huggingface/transformers) from 4.26.1 to 4.38.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.26.1...v4.38.0)
---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Pavel Iakubovskii authored
* Rebase to main
* Fix attention implementation autoset for text and vision configs
* Fixup
* Minor fixes
* Fix copies
* Fix attention_mask for FA2
* Add equivalence tests for siglip
* Remove right padding test
* Uncomment flaky
* Fix import
* Add to docs
* Fix test message
* Add sdpa
* Add sdpa equivalence test
* Add siglip sdpa to docs
* Fix typing for attention output
* Add sdpa tests
* Fix signature of FA2
* Autoset attn_implementation in config
* Rename bsz -> batch_size
* Move back autoset attn method
* Mark as flaky
* Correct attention mask padding
* [run-slow] siglip
* Add FA2 and sdpa docs
* Style fix
* Remove flaky for FA2 test
* Change attention implementation set
* Change attn_implementation propagation
* Fix typos
* Add modality to assert message
* Add more sdpa backends in test
* [run slow] siglip
* Add math sdpa backend for all options
* [run slow] siglip
-
dependabot[bot] authored
Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/decision_transformer (#31813)
Bump certifi in /examples/research_projects/decision_transformer
Bumps [certifi](https://github.com/certifi/python-certifi) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04)
---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Dingli Yang authored
Avoid crash when BatchEncoding data is None
-
NielsRogge authored
* First draft
* Add docs
* Clean up code
* Convert model
* Add image processor
* Convert Zoe_K
* More improvements
* Improve variable names and docstrings
* Improve variable names
* Improve variable names
* Replace nn.sequential
* More improvements
* Convert ZoeD_NK
* Fix most tests
* Verify pixel values
* Verify pixel values
* Add squeeze
* Update beit to support arbitrary window sizes
* Improve image processor
* Improve docstring
* Improve beit
* Improve model outputs
* Add figure
* Fix beit
* Update checkpoint
* Fix repo id
* Add _keys_to_ignore_on_load_unexpected
* More improvements
* Address comments
* Address comments
* Address comments
* Address comments
* Rename variable name
* Add backbone_hidden_size
* Vectorize
* Vectorize more
* Address comments
* Clarify docstring
* Remove backbone_hidden_size
* Fix image processor
* Remove print statements
* Remove print statement
* Add integration test
* Address comments
* Address comments
* Address comments
* Address comments
* Add requires_backends
* Clean up
* Simplify conversion script
* Simplify more
* Simplify more
* Simplify more
* Clean up
* Make sure beit is loaded correctly
* Address comment
* Address bin_configurations
* Use bin_configurations
* Convert models, add integration tests
* Fix doc test
* Address comments
* Unify regressor classes
* Clarify arguments
* Improve resize_image
* Add num_relative_features
* Address comment
* [run-slow]beit,data2vec,zoedepth
* [run-slow]beit,data2vec,zoedepth
* Address comments
* Address comment
* Address comment
* Replace nn.TransformerEncoderLayer and nn.TransformerEncoder
* Replace nn.MultiheadAttention
* Add attributes for patch transformer to config
* Add tests for ensure_multiple_of
* Update organization
* Add tests
* [run-slow] beit data2vec
* Update ruff
* [run-slow] beit data2vec
* Add comment
* Improve docstrings, add test
* Fix interpolate_pos_encoding
* Fix slow tests
* Add docstring
* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Improve tests and docstrings
* Use run_common_tests
* Improve docstrings
* Improve docstrings
* Improve tests
* Improve tests
* Remove print statements
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 05 Jul, 2024 13 commits
-
-
Pedro Cuenca authored
* Depth Anything: update conversion script for V2
* Update docs
* Style
* Revert "Update docs"
This reverts commit be0ca47ea1be4f3cd9aa2113bdd8efcc9959119e.
* Add docs for depth anything v2
* Add depth_anything_v2 to MODEL_NAMES_MAPPING
Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files
* Add tip in original docs
-
Thien Tran authored
* handle new weight norm
* fix
* fix trailing space
-
Anton Vlasjuk authored
* fix galore lr display with lr schedulers
* style
* add some tests to check for displayed lrs
* copy-paste err for warmup steps
* standardize the default lr to be only in the optimizer
* trying out my luck with the reads
-
Billy Cao authored
* cast image features to model.dtype where needed to support FP16 or other precision in pipelines
* Update src/transformers/pipelines/image_feature_extraction.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Use .to instead
* Add FP16 pipeline support for zeroshot audio classification
* Remove unused torch imports
* Add docs on FP16 pipeline
* Remove unused import
* Add FP16 tests to pipeline mixin
* Add fp16 placeholder for mask_generation pipeline test
* Add FP16 tests for all pipelines
* Fix formatting
* Remove torch_dtype arg from is_pipeline_test_to_skip*
* Fix format
* trigger ci
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Repeating an important warning in the chat template docs
* Update docs/source/en/chat_templating.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Reword for clarity
* Reword for clarity
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
-
Billy Cao authored
* Add siglip loss function
* Update docs
* Enable training tests
[experimental] enable GC training tests as it has worked for my own data
* Remove test_training* overrides to enable training tests
[run_slow] siglip
* Skip training tests for Siglip text model and ImageClassificationModel
[run_slow] siglip
* Skip GC training tests for SiglipForImageClassification
* Explicitly skip training tests for SiglipVisionModel
Add skip reason for training tests for SiglipTextModel
* Remove copied from to fix CI
-
Aymeric Roucher authored
* Code agent: allow function persistence between steps
-
Yih-Dar authored
* skip 3 7b tests
* fix
* fix
* fix
* [run-slow] gemma
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Boris Feld authored
* Update CometCallback to allow reusing of the running experiment
* Fixups
* Remove useless TODO
* Add checks for minimum version of the Comet SDK
* Fix documentation and links. Also simplify how the Comet Experiment name is passed
-
xiangdong authored
* exclude compile time from metrics computation
* fix the quality issue
-
Kazuaki Ishizaki authored
return correct device when ACCELERATE_TORCH_DEVICE is defined
-
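The commit above makes the device lookup honor the `ACCELERATE_TORCH_DEVICE` environment variable. A minimal sketch of that lookup behavior (the helper name is hypothetical, not the actual accelerate implementation):

```python
import os

def resolve_torch_device(default: str = "cpu") -> str:
    # Hypothetical helper: if ACCELERATE_TORCH_DEVICE is set, that device
    # string wins; otherwise fall back to the default.
    return os.environ.get("ACCELERATE_TORCH_DEVICE", default)

os.environ["ACCELERATE_TORCH_DEVICE"] = "npu:0"
print(resolve_torch_device())  # npu:0
```

This is only a sketch of the env-var precedence; the real code resolves a torch device object, not a string.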
Marc Sun authored
* Fix serialization
* style
* add test
-
mxkopy authored
* fixed ClapProcessor to merge all values output from the feature extractor into the returned BatchEncoding
* fixed trailing whitespace
-
- 04 Jul, 2024 3 commits
-
-
Billy Cao authored
* Add torch_empty_cache_steps to TrainingArguments
* Fix formatting
* Add torch_empty_cache_steps to docs on single gpu training
* Remove check for torch_empty_cache_steps <= max_steps
* Capitalize Tip
* Be device agnostic
* Fix linting
-
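The `torch_empty_cache_steps` argument above is a step cadence: the device cache is emptied every N training steps. An illustrative sketch of that cadence check (not the Trainer's actual code):

```python
def should_empty_cache(step: int, torch_empty_cache_steps=None) -> bool:
    # Illustrative: fire every N steps when the flag is set; never fire
    # when it is unset or at step 0.
    if torch_empty_cache_steps is None or step == 0:
        return False
    return step % torch_empty_cache_steps == 0

print([s for s in range(1, 10) if should_empty_cache(s, 4)])  # [4, 8]
```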
hoshi-hiyouga authored
Update __init__.py
-
Yih-Dar authored
pytest_num_workers=4
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 03 Jul, 2024 3 commits
-
-
Pavel Iakubovskii authored
* Fix init for rt-detr heads
* Fixup
* Add separate prior_prob value to config for initialization
* Add bbox init
* Change to 1 / num_labels init
* Adjust weights init test
* Fix style for test
-
Pavel Iakubovskii authored
* Fix cache and type conversion
* Add test
* Fixup
* nit
* [run slow] rt_detr
* Fix test
* Fixup
* [run slow] rt_detr
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
-
Willard Sheen authored
* [fix BUG] pad labels before using them in preprocess_logits_for_metrics
* a more readable fix: labels can't use `gather` before being passed to `preprocess_logits_for_metrics`, so the logic must be split into two if-blocks
* add a comment
* oh, code quality check
-
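The fix above rests on the fact that variable-length label sequences must be padded to a common length before a cross-process `gather` can stack them. A pure-Python sketch of that padding step (`-100` is the usual ignore index for labels; the helper is illustrative, not the Trainer's implementation):

```python
def pad_labels(batches, pad_id=-100):
    # Pad every label sequence to the length of the longest one so the
    # batch becomes rectangular and can be gathered as a single tensor.
    max_len = max(len(b) for b in batches)
    return [b + [pad_id] * (max_len - len(b)) for b in batches]

print(pad_labels([[1, 2, 3], [4]]))  # [[1, 2, 3], [4, -100, -100]]
```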