Commits · 06fd7972acbc6a5e9cd75b4d482583c060ac2ed0 · chenpangpang / transformers

08 Jul, 2024 1 commit

NielsRogge authored Jul 08, 2024

* First draft

* Add docs

* Clean up code

* Convert model

* Add image processor

* Convert Zoe_K

* More improvements

* Improve variable names and docstrings

* Improve variable names

* Improve variable names

* Replace nn.sequential

* More improvements

* Convert ZoeD_NK

* Fix most tests

* Verify pixel values

* Verify pixel values

* Add squeeze

* Update beit to support arbitrary window sizes

* Improve image processor

* Improve docstring

* Improve beit

* Improve model outputs

* Add figure

* Fix beit

* Update checkpoint

* Fix repo id

* Add _keys_to_ignore_on_load_unexpected

* More improvements

* Address comments

* Address comments

* Address comments

* Address comments

* Rename variable name

* Add backbone_hidden_size

* Vectorize

* Vectorize more

* Address comments

* Clarify docstring

* Remove backbone_hidden_size

* Fix image processor

* Remove print statements

* Remove print statement

* Add integration test

* Address comments

* Address comments

* Address comments

* Address comments

* Add requires_backends

* Clean up

* Simplify conversion script

* Simplify more

* Simplify more

* Simplify more

* Clean up

* Make sure beit is loaded correctly

* Address comment

* Address bin_configurations

* Use bin_configurations

* Convert models, add integration tests

* Fix doc test

* Address comments

* Unify regressor classes

* Clarify arguments

* Improve resize_image

* Add num_relative_features

* Address comment

* [run-slow]beit,data2vec,zoedepth

* [run-slow]beit,data2vec,zoedepth

* Address comments

* Address comment

* Address comment

* Replace nn.TransformerEncoderLayer and nn.TransformerEncoder

* Replace nn.MultiheadAttention

* Add attributes for patch transformer to config

* Add tests for ensure_multiple_of

* Update organization

* Add tests

* [run-slow] beit data2vec

* Update ruff

* [run-slow] beit data2vec

* Add comment

* Improve docstrings, add test

* Fix interpolate_pos_encoding

* Fix slow tests

* Add docstring

* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Improve tests and docstrings

* Use run_common_tests

* Improve docstrings

* Improve docstrings

* Improve tests

* Improve tests

* Remove print statements

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

06fd7972

05 Jul, 2024 13 commits

Depth Anything: update conversion script for V2 (#31522) · 1082361a

Pedro Cuenca authored Jul 05, 2024

* Depth Anything: update conversion script for V2

* Update docs

* Style

* Revert "Update docs"

This reverts commit be0ca47ea1be4f3cd9aa2113bdd8efcc9959119e.

* Add docs for depth anything v2

* Add depth_anything_v2 to MODEL_NAMES_MAPPING

Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files

* Add tip in original docs

1082361a

Fix Wav2Vec2 Fairseq conversion (weight norm state dict keys) (#31714) · a8fa6fbb

Thien Tran authored Jul 06, 2024

* handle new weight norm

* fix

* fix trailing space

a8fa6fbb

Fix galore lr display with schedulers (#31710) · a01b033c

Anton Vlasjuk authored Jul 05, 2024

* fix galore lr display with lr schedulers

* style

* add some tests to check for displayed lrs

* copy-paste err for warmup steps

* standardize the default lr to be only in the optimizer

* trying out my luck with the reads

a01b033c

Allow FP16 or other precision inference for Pipelines (#31342) · ac262604

aliencaocao authored Jul 06, 2024

* cast image features to model.dtype where needed to support FP16 or other precision in pipelines

* Update src/transformers/pipelines/image_feature_extraction.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Use .to instead

* Add FP16 pipeline support for zeroshot audio classification

* Remove unused torch imports

* Add docs on FP16 pipeline

* Remove unused import

* Add FP16 tests to pipeline mixin

* Add fp16 placeholder for mask_generation pipeline test

* Add FP16 tests for all pipelines

* Fix formatting

* Remove torch_dtype arg from is_pipeline_test_to_skip*

* Fix format

* trigger ci

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

ac262604

Repeating an important warning in the chat template docs (#31796) · e7868444

Matt authored Jul 05, 2024

* Repeating an important warning in the chat template docs

* Update docs/source/en/chat_templating.md
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Reword for clarity

* Reword for clarity

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>

e7868444

Add training support for SigLIP (#31495) · 1d3eaa6f

aliencaocao authored Jul 05, 2024

* Add siglip loss function

* Update docs

* Enable training tests
[experimental] enable GC training tests as it has worked for my own data

* Remove test_training* overrides to enable training tests
[run_slow] siglip

* Skip training tests for Siglip text model and ImageClassificationModel
[run_slow] siglip

* Skip GC training tests for SiglipForImageClassification

* Explicitly skip training tests for SiglipVisionModel
Add skip reason for training tests for SiglipTextModel

* Remove copied from to fix CI

1d3eaa6f

Code agent: allow function persistence between steps (#31769) · 15560252

Aymeric Roucher authored Jul 05, 2024

* Code agent: allow function persistence between steps

15560252

Fix gemma tests (#31794) · eef0507f

Yih-Dar authored Jul 05, 2024

* skip 3 7b tests

* fix

* fix

* fix

* [run-slow] gemma

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

eef0507f

Update CometCallback to allow reusing of the running experiment (#31366) · 9e599d1d

Boris Feld authored Jul 05, 2024

* Update CometCallback to allow reusing of the running experiment

* Fixups

* Remove useless TODO

* Add checks for minimum version of the Comet SDK

* Fix documentation and links.

Also simplify how the Comet Experiment name is passed

9e599d1d

Exclude torch.compile time from metrics computation (#31443) · d19b5a90

xiangdong authored Jul 05, 2024

* exclude compile time from metrics computation

* fix the quality issue

d19b5a90

Make tensor device correct when ACCELERATE_TORCH_DEVICE is defined (#31751) · 2aa2a144

Kazuaki Ishizaki authored Jul 05, 2024

return correct device when ACCELERATE_TORCH_DEVICE is defined

2aa2a144

Fix serialization for offloaded model (#31727) · 8c5c180d

Marc Sun authored Jul 05, 2024

* Fix serialization

* style

* add test

8c5c180d

Fix ClapProcessor to merge feature_extractor output into the returned BatchEncoding (#31767) · eaa5f414

mxkopy authored Jul 04, 2024

* fixed ClapProcessor to merge all values output from the feature extractor into the returned BatchEncoding.

* fixed trailing whitespace

eaa5f414

04 Jul, 2024 3 commits

Add torch_empty_cache_steps to TrainingArguments (#31546) · 43ffb785

aliencaocao authored Jul 05, 2024

* Add torch_empty_cache_steps to TrainingArguments

* Fix formatting

* Add torch_empty_cache_steps to docs on single gpu training

* Remove check for torch_empty_cache_steps <= max_steps

* Capitalize Tip

* Be device agnostic

* Fix linting

43ffb785

Fix Gemma2 types (#31779) · cee768d9

hoshi-hiyouga authored Jul 04, 2024

Update __init__.py

cee768d9

`pytest_num_workers=4` for some CircleCI jobs (#31764) · 87726a08

Yih-Dar authored Jul 04, 2024

pytest_num_workers=4
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

87726a08

03 Jul, 2024 9 commits

Fix RT-DETR weights initialization (#31724) · 048f599f

Pavel Iakubovskii authored Jul 03, 2024

* Fix init for rt-detr heads

* Fixup

* Add separate prior_prob value to config for initialization

* Add bbox init

* Change to 1 / num_labels init

* Adjust weights init test

* Fix style for test

048f599f

Fix RT-DETR cache for generate_anchors (#31671) · b9752161

Pavel Iakubovskii authored Jul 03, 2024

* Fix cache and type conversion

* Add test

* Fixup

* nit

* [run slow] rt_detr

* Fix test

* Fixup

* [run slow] rt_detr

* Update src/transformers/models/rt_detr/modeling_rt_detr.py

b9752161

[fix bug] logits's shape different from label's shape in preprocess_logits_for_metrics (#31447) · 534cbf8a

Willard Sheen authored Jul 03, 2024

* [fix BUG] pad labels before use it in preprocess_logits_for_metrics

* a more readable fix

labels can't use `gather` before being passed to `preprocess_logits_for_metrics`, so the logic must be split into 2 if-blocks

* add a comment

* oh code quality check

534cbf8a

Add ignore_errors=True to trainer.py rmtree in _inner_training_loop (#31668) · 65a02cd2

Nate Brake authored Jul 03, 2024

Update trainer.py

65a02cd2

Gemma 2: Update slow tests (#31759) · ddfaf119

Joao Gante authored Jul 03, 2024

gemma 2 slow tests

ddfaf119

handle (processor_class, None) returned by ModelPatterns (#31753) · c1fe1259

Pablo Montalvo authored Jul 03, 2024

c1fe1259

Adds final answer tool for all agents (#31703) · 0fd885b9

Aymeric Roucher authored Jul 03, 2024

* Adds final answer tool for all agents

* Typo

* Add clarification in doc

* Put final_answer tool addition in agent for clarity

0fd885b9

Requires for torch.tensor before casting (#31755) · dc72fd7e

Ella Charlaix authored Jul 03, 2024

dc72fd7e

fix assisted decoding (#31401) · 7f91f168

jiqing-feng authored Jul 03, 2024

* fix assisted decoding

* check None

* fix typo

* fix _prepare_special_tokens

* fix style

* fix lint

* add tests for assisted decoding

* fix style

* fix tests check

7f91f168

02 Jul, 2024 7 commits

Fix documentation for Gemma2. (#31682) · f91c16d2

Jörg Bornschein authored Jul 02, 2024

* Fix documentation for Gemma2.

Model sizes and Blog post URL are wrong in the documentation.

* Update docs/source/en/model_doc/gemma2.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

f91c16d2

Make tool JSON schemas consistent (#31756) · cd0935dd

Matt authored Jul 02, 2024

Make the order of array items consistent using sorted()

cd0935dd

🚨🚨 TextGenerationPipeline: rely on the tokenizer default kwargs (#31747) · 82486e59

Joao Gante authored Jul 02, 2024

* rely on the tokenizer default kwargs

* fix a few tests

82486e59

[whisper] static kv cache (#31166) · a9701953

Sanchit Gandhi authored Jul 02, 2024

* make work with cache abstraction

* correct for static cache

* hacks for compile

* make fast

* fix

* fix pos ids

* generate

* fix sdpa

* fix sdpa cache pos

* fix fa2

* clean fa2

* integrate cache into generate

* make style

* copies

* more copies

* update eager

* update sdpa

* update fa2

* simplify

* use cache pos

* always compute cross-cache for debug

* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>

* fix fix

* fix fix fix

* more fix

* try encoder-decoder cache (too messy)

* revert encoder-decoder cache

* check cross-attn cache

* use enc-dec dataclass

* use richer enc-dec dataclass

* clean-up

* revert static cache changes

* small fixes

* revert to cpu flag

* fix copies

* add static slow test

* past k/v docstring

* more docstrings

* cache_position docstrings

* add to docs

* add enc-dec cache to docs

* make style

* fix afte...

a9701953

Fix mistral ONNX export (#31696) · 57d7594a

fxmarty authored Jul 02, 2024

* use bitwise or

* why is the CI not triggered?

57d7594a

Move some test files (`tets/test_xxx_utils.py`) to `tests/utils` (#31730) · 93cd94b7

Yih-Dar authored Jul 02, 2024

* move

* move

* move

* move

* Update tests/utils/test_image_processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

93cd94b7

remove incorrect urls pointing to the llava repository (#31107) · cf85e86e

Krisztián Boros authored Jul 02, 2024

* remove incorrect urls pointing to the llava repository

* remove incorrect urls pointing to the llava repository; removing entire comments

* remove incorrect urls pointing to the llava repository; removing entire comments; ran fix-copies

* ran fixup

cf85e86e

01 Jul, 2024 1 commit

dependencies: `keras-nlp<0.14` pin (#31684) · 3345ae73

Joao Gante authored Jul 01, 2024

* keras nlp pin

* this should use the new docker images:dev

* dev-ci

3345ae73

28 Jun, 2024 6 commits

Add French version of run scripts tutorial (#31483) · e6550295

Jade Choghari authored Jun 28, 2024

* Add French translation of run scripts tutorial

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: Jade Choghari <chogharijade@icloud.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

e6550295

Gemma capping is a must for big models (#31698) · bbf1e618

Arthur authored Jun 28, 2024

* softcapping

* soft cap before the mask

* style

* ...

* super nit

bbf1e618

add gather_use_object arguments (#31514) · cb298978

Sangbum Daniel Choi authored Jun 28, 2024

* add gather_use_object arguments

* fix name and pass the CI test for Seq2SeqTrainer

* make style

* make it to functools

* fix typo

* add accelerate version:

* adding warning

* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* make style

* Update src/transformers/training_args.py

* check function move to initial part

* add test for eval_use_gather_object

---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

cb298978

Fix return_dict in encodec (#31646) · 82a1fc72

Jacky Lee authored Jun 28, 2024

* fix: use return_dict parameter

* fix: type checks

* fix: unused imports

* update: one-line if else

* remove: recursive check

82a1fc72

Fix Gemma2 4d attention mask (#31674) · 5e89b335

hoshi-hiyouga authored Jun 28, 2024

Update modeling_gemma2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

5e89b335

don't zero out the attention_mask when using sliding window with flash attention (#31670) · 0142aab7

Wing Lian authored Jun 27, 2024

* don't zero out the attention_mask when using sliding window with flash attention

* chore: lint

0142aab7