- 14 Jul, 2024 2 commits
-
Joao Gante authored
* tmp commit * shorter * nit * explicit kwargs * propagate changes * mass propagation with a few manual touches (let's see how CI behaves) * fix cacheless case * Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
fxmarty authored
use torch.compiler.is_compiling() when possible
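A minimal sketch of the kind of guard this commit describes, assuming a helper similar to the one transformers keeps in its utils; `torch.compiler.is_compiling` only exists on recent torch releases, so older builds fall back to the private `torch._dynamo` hook:

```python
import torch

def is_torchdynamo_compiling() -> bool:
    # Prefer the public API when available (newer torch releases)...
    try:
        return torch.compiler.is_compiling()
    except AttributeError:
        # ...and fall back to the private hook on older versions.
        try:
            return torch._dynamo.is_compiling()
        except Exception:
            return False
```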
-
- 12 Jul, 2024 1 commit
-
Aviv Shamsian authored
* fix prompt strip to support tensors and np arrays * framework agnostic * change logic check before converting prompt into list Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adding _convert_to_list to tokenization_whisper_fast * adding tests for prompt decoding * adding comment Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adding comment Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * revert minor * make style formatting * style formatting after update * Update src/transformers/models/whisper/tokenization_whisper_fast.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixing _strip_prompt to handle _decode_with_timestamps * fix copies --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
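A rough sketch of the framework-agnostic conversion this PR describes (the helper name `_convert_to_list` comes from the message; the exact branching here is an assumption):

```python
import numpy as np

def _convert_to_list(token_ids):
    # Tensors from torch/TF expose .numpy(); move torch tensors off-device first.
    if hasattr(token_ids, "numpy"):
        if "torch" in str(type(token_ids)):
            token_ids = token_ids.cpu().numpy()
        else:
            token_ids = token_ids.numpy()
    # NumPy arrays (including converted tensors) become plain Python lists.
    if isinstance(token_ids, np.ndarray):
        token_ids = token_ids.tolist()
    return token_ids
```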
-
- 11 Jul, 2024 16 commits
-
jiqing-feng authored
* fix qa pipeline * fix tensor to numpy
-
Naman Garg authored
* initialized Structure * Updated variable names * Added Config class, basic HF setup, convert_to_hf * Fixed Convert function, added hiera to HF files, Initialized test files * better naming for x in forward pass * Moved utils to hiera * Change hiera -> hiera_model * Fixed integration into transformers * Fix: Convert Checkpoint * added documentation for hiera * added Docstrings to models, Transformers based changes * make style and quality * Integration & Block tests running * Fixed bugs * Removed timm dependency * added HieraBlock * fixed: Model name * added tests for HieraModel, HieraBlock * fixed imports * fixed quality & copies * Fixes * Update docs/source/en/model_doc/hiera.md Fix name Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/hiera.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/hiera.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/configuration_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/configuration_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fixed formatting * Code quality & Import differences * quality and repo-consistency fix * fixed no torch error * Docstring fix * doc string fix * fixed example usage * Resolved issues in modeling_hiera * Removed Hiera MAE * Added test and resolved bug * fixed doc string * First commit * Finished conversion script and model forward working * Resolved all issues * nits * Improving tests * Nits * More nits * Improving HieraForMaskedImageModeling * More improvements and nits * Fixed docstrings of outputs * More fixes * More improvements * Updated conversion script * Fixed docstrings * Improved tests * Fixed attention outputs test * All tests green * Removed unnecessary file * contribution attribution * Resolved a few issues * Resolved Comments * Updated model repo id and fixed bugs * Removed loss print * Make tests green * Updated docstrings * Fix style * Fixed num_heads in config * Removed unnecessary video checkpoint related code in the conversion script * Fix style * Changed atol in conversion script * HieraConfig * Fix copies * Fixed typo * Resolved a few issues * make * converted conv_nd -> nn.Module * Removed video complexities * fix style * Addressing comments * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix style * Fixed tests * Fixed typo * Fixed interpolate test * Made torch fx compatible * Made sure image processor is correct * Addressed comments * Noise directly as torch * Remove unnecessary attr * Added return_dict * Update src/transformers/models/hiera/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated checkpoints * [run_slow] hiera * Fixed device mismatch * [run_slow] hiera * Fixed GPU tests * [run_slow] hiera --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-29-50.us-east-2.compute.internal> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Eduardo Pacheco <eduardo.pach@hotmail.com> Co-authored-by: Eduardo Pacheco <69953243+EduardoPach@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
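As a quick sanity check for the newly added architecture, a hedged usage sketch; the checkpoint id below is an assumption based on the conversion script and may differ from the published one:

```python
import requests
from PIL import Image
from transformers import AutoImageProcessor, HieraModel

checkpoint = "facebook/hiera-base-224-hf"  # assumed checkpoint id
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = HieraModel.from_pretrained(checkpoint)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```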
-
Apoorv Khandelwal authored
* Change `Trainer.get_optimizer_cls_and_kwargs` to `self.` * Make `get_optimizer_cls_and_kwargs` an instance method * Fixing typo * Revert `get_optimizer_cls_and_kwargs` to staticmethod * restore newline to trainer.py eof
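Since the method ends up as a staticmethod again, it stays callable without constructing a Trainer; a small usage sketch:

```python
from transformers import Trainer, TrainingArguments

args = TrainingArguments(output_dir="out", optim="adamw_torch", learning_rate=3e-4)
# Static method: no Trainer instance needed to inspect the optimizer choice.
optimizer_cls, optimizer_kwargs = Trainer.get_optimizer_cls_and_kwargs(args)
print(optimizer_cls, optimizer_kwargs)
```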
-
t11s authored
fix(SigLip): remove spurious exclusion of first vision output token in classifier
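The gist of the fix as a sketch: SigLIP has no class token, so the classifier head should pool over every vision token rather than dropping the first one (the function and tensor names here are illustrative):

```python
import torch

def pool_vision_tokens(last_hidden_state: torch.Tensor) -> torch.Tensor:
    # Buggy variant skipped token 0: last_hidden_state[:, 1:, :].mean(dim=1).
    # SigLIP has no CLS token, so every vision token belongs in the pool.
    return last_hidden_state.mean(dim=1)
```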
-
Joao Gante authored
fix sliding cache
-
Arthur authored
* dumb commit * nit * update * something like this * unpack in modeling utils * safe import * oups * update * nits * diff convert gemma * update * start propagating * update other modeling code as well * update for sliding window models * nits * more init cleanups * styling * fixup * noice * pass fixup * typo typing_extension -> typing_extensions * torch.nn.functionnal -> torch.nn.functional * add to import structure * unpack * simplify a bit more for this first version * nut * update * update * nit * ease the import of `Unpack` * remove useless `use_sliding_window` * no qua please * protect import? * style * [run-slow] * [run slow] llama,gemma,mistral,mixtral * remove extra kwargs * fix llama * address review comments * apply diff_model_converter to modeling_gemma.py * remove cache_position 1 * remove cache_position 2 * some cleaning * refactor gemma2 as well * apply review comments * rename file to modeling_flash_attention_utils.py * siglip refactor * remove dead code * is the hub down? * still down? * fix siglip * fix gemma2 * fatal: Could not read from remote repository. * fix typo in softcap implem * flaky * Failed: Timeout >120.0s --------- Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
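A hedged sketch of the typed-kwargs pattern this refactor introduces around the shared flash-attention helper; the TypedDict fields below are assumptions, not the exact set used in modeling_flash_attention_utils.py:

```python
from typing import Optional
from typing_extensions import TypedDict, Unpack

class FlashAttentionKwargs(TypedDict, total=False):
    # Illustrative fields only; the real bundle may differ.
    sliding_window: Optional[int]
    softcap: Optional[float]

def _flash_attention_forward(query, key, value, **kwargs: Unpack[FlashAttentionKwargs]):
    # Callers pass the optional knobs as keywords while type checkers
    # still see a closed, explicitly typed set of arguments.
    ...
```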
-
fxmarty authored
* fix tests * [test_all] check * address review comments
-
Omar Salman authored
* Add warning message for and parameters * Fix when the warning is raised * Formatting changes * Improve testing and remove duplicated warning from _fix_key
-
Sangbum Daniel Choi authored
* add gather_use_object arguments * fix name and pass the CI test for Seq2SeqTrainer * make style * make it to functools * fix typo * add accelerate version: * adding warning * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * make style * Update src/transformers/training_args.py * check function move to initial part * add test for eval_use_gather_object * fix minor --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
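A hedged usage sketch of the flag this PR adds (it requires a recent accelerate, per the version check mentioned in the message):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    # Gather arbitrary Python objects (not just tensors) during distributed eval.
    eval_use_gather_object=True,
)
```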
-
Sai-Suraj-27 authored
Fixed the first argument name in a few classmethods.
-
Isotr0py authored
* add missing methods for FuyuForCausalLM * fix a typo * format code * add missing tie_weights * format code
-
Arthur authored
* Support softcapping * strictly greater than * update
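For context, logit softcapping (as used by Gemma2-style models) squashes attention or final logits through tanh so they stay strictly inside (-cap, cap); a minimal sketch:

```python
import torch

def softcap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Rescale, squash with tanh, scale back: output is smoothly
    # bounded by (-cap, cap) instead of growing without limit.
    return cap * torch.tanh(logits / cap)
```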
-
Arthur authored
* preserve the order * oups * oups * nit * trick * fix issues
-
Raushan Turganbay authored
* accept kwargs in processors * return unused kwargs * fix tests * typo * update the other way
-
turboderp authored
* HybridCache: Flip order of alternating global-attn/sliding-attn layers * HybridCache: Read sliding_window argument from cache_kwargs * Gemma2Model: Flip order of alternating global-attn/sliding-attn layers * Code formatting
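A sketch of the alternation rule after the flip, assuming (as in the released Gemma2 modeling code) that even-indexed layers use sliding-window attention and odd-indexed layers use global attention:

```python
def uses_sliding_window(layer_idx: int) -> bool:
    # Assumed convention after the flip: even layers are sliding-window,
    # odd layers are global-attention.
    return layer_idx % 2 == 0
```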
-
Raushan Turganbay authored
* update docs * one more change
-
- 10 Jul, 2024 6 commits
-
haikuoxin authored
fix bug: https://github.com/huggingface/transformers/issues/31852
-
Marc Sun authored
Save sharded checkpoint in Trainer
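The Trainer delegates to the model's sharded saving; a sketch of the underlying call (the shard size here is illustrative):

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
# Weights beyond max_shard_size are split across multiple safetensors files,
# together with an index mapping each parameter to its shard.
model.save_pretrained("checkpoint-500", max_shard_size="5GB", safe_serialization=True)
```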
-
Sai-Suraj-27 authored
Removed duplicate field definitions in classes.
-
Yih-Dar authored
* Revert "Revert "Fix `_init_weights` for `ResNetPreTrainedModel`" (#31868)" This reverts commit b45dd5de . * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
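For reference, a sketch of the kind of `_init_weights` the revert restores for ResNetPreTrainedModel (Kaiming init for convolutions, constant init for norms); treat the exact branches as an assumption:

```python
import torch.nn as nn

def _init_weights(self, module):
    if isinstance(module, nn.Conv2d):
        # He initialization suits the ReLU nonlinearity used throughout ResNet.
        nn.init.kaiming_normal_(module.weight, mode="fan_out", nonlinearity="relu")
    elif isinstance(module, (nn.BatchNorm2d, nn.GroupNorm)):
        nn.init.constant_(module.weight, 1)
        nn.init.constant_(module.bias, 0)
```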
-
yukionfire authored
-
Raushan Turganbay authored
* add conversion for interleave llava * remove debug lines * remove unused imports * Update src/transformers/models/llava/convert_llava_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * small changes + docs --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 09 Jul, 2024 8 commits
-
Yun Dai authored
* add warning when using with FSDP full shard * fix style * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add hybrid shard warn * fix style --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Mauricio Villegas authored
Update modeling_utils.py: add return type annotation to PreTrainedModel.from_pretrained
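The change amounts to annotating the classmethod's return value; a sketch of the shape of the signature (argument list abbreviated):

```python
import os
from typing import Optional, Union

class PreTrainedModel:
    @classmethod
    def from_pretrained(
        cls,
        pretrained_model_name_or_path: Optional[Union[str, os.PathLike]],
        *model_args,
        **kwargs,
    ) -> "PreTrainedModel":
        # Annotated return type, so type checkers and IDEs know
        # from_pretrained yields a model instance.
        ...
```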
-
Yih-Dar authored
* init * test --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yung-Sung Chuang authored
Co-authored-by: Joao Gante <joao@huggingface.co>
-
kallewoof authored
-
NielsRogge authored
Add model
-
Raushan Turganbay authored
* deprecate `vocab_size` in other two VLMs * Update src/transformers/models/fuyu/configuration_fuyu.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * deprecate until 4.44 --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
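A hedged sketch of a deprecation shim of the sort described; the class name, warning wording, and backing attribute here are all assumptions for illustration:

```python
import warnings

class VLMConfig:  # stand-in for the affected configs (e.g. FuyuConfig)
    def __init__(self, vocab_size=262144):
        self._vocab_size = vocab_size

    @property
    def vocab_size(self):
        # Deprecated accessor kept for backward compatibility until v4.44.
        warnings.warn(
            "`vocab_size` is deprecated and will be removed in v4.44.",
            FutureWarning,
        )
        return self._vocab_size
```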
-
- 08 Jul, 2024 6 commits
-
Joao Gante authored
* enable strict signature * this should not have been deleted * recurrent_gemma too
-
André Storhaug authored
* Fix wrong accelerator device setup when using MPS * More robust TrainingArguments MPS handling * Update training_args.py * Cleanup
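A minimal sketch of robust MPS device selection of the kind this fix aims at, assuming the torch.backends.mps API available since torch 1.12:

```python
import torch

# Check both that MPS is usable at runtime and that torch was built with it.
if torch.backends.mps.is_available() and torch.backends.mps.is_built():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")
```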
-
fxmarty authored
* symbolic trace supports inputs_embeds * fix test? * Update tests/test_modeling_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
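A hedged usage sketch of tracing with inputs_embeds via transformers.utils.fx (the model id is illustrative):

```python
from transformers import AutoModel
from transformers.utils.fx import symbolic_trace

model = AutoModel.from_pretrained("bert-base-uncased")
# inputs_embeds can now be named as a trace input instead of input_ids.
traced = symbolic_trace(model, input_names=["inputs_embeds", "attention_mask"])
```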
-
Pavel Iakubovskii authored
* Rebase to main * Fix attention implementation autoset for text and vision configs * Fixup * Minor fixes * Fix copies * Fix attention_mask for FA2 * Add equivalence tests for siglip * Remove right padding test * Uncomment flaky * Fix import * Add to docs * Fix test message * Add sdpa * Add sdpa equivalence test * Add siglip sdpa to docs * Fix typing for attention output * Add sdpa tests * Fix signature of FA2 * Autoset attn_implementation in config * Rename bsz -> batch_size * Move back autoset attn method * Mark as flaky * Correct attention mask padding * [run-slow] siglip * Add FA2 and sdpa docs * Style fix * Remove flaky for FA2 test * Change attention implementation set * Change attn_implementation propagation * Fix typos * Add modality to assert message * Add more sdpa backends in test * [run slow] siglip * Add math sdpa backend for all options * [run slow] siglip
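A usage sketch of the new backends; attn_implementation is the standard loading knob, and the checkpoint id is the public SigLIP base model:

```python
import torch
from transformers import SiglipModel

model = SiglipModel.from_pretrained(
    "google/siglip-base-patch16-224",
    attn_implementation="sdpa",  # or "flash_attention_2" on supported GPUs
    torch_dtype=torch.float16,
)
```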
-
Dingli Yang authored
Avoid a crash when BatchEncoding data is None
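The fix presumably reduces to a None guard; a sketch under that assumption (the class name and default here are illustrative, not the actual patch):

```python
from collections import UserDict

class BatchEncodingSketch(UserDict):
    def __init__(self, data=None):
        # Treat a missing payload as an empty mapping so later key
        # lookups and iteration don't crash on None.
        super().__init__(data if data is not None else {})
```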
-
NielsRogge authored
* First draft * Add docs * Clean up code * Convert model * Add image processor * Convert Zoe_K * More improvements * Improve variable names and docstrings * Improve variable names * Improve variable names * Replace nn.Sequential * More improvements * Convert ZoeD_NK * Fix most tests * Verify pixel values * Verify pixel values * Add squeeze * Update beit to support arbitrary window sizes * Improve image processor * Improve docstring * Improve beit * Improve model outputs * Add figure * Fix beit * Update checkpoint * Fix repo id * Add _keys_to_ignore_on_load_unexpected * More improvements * Address comments * Address comments * Address comments * Address comments * Rename variable name * Add backbone_hidden_size * Vectorize * Vectorize more * Address comments * Clarify docstring * Remove backbone_hidden_size * Fix image processor * Remove print statements * Remove print statement * Add integration test * Address comments * Address comments * Address comments * Address comments * Add requires_backends * Clean up * Simplify conversion script * Simplify more * Simplify more * Simplify more * Clean up * Make sure beit is loaded correctly * Address comment * Address bin_configurations * Use bin_configurations * Convert models, add integration tests * Fix doc test * Address comments * Unify regressor classes * Clarify arguments * Improve resize_image * Add num_relative_features * Address comment * [run-slow]beit,data2vec,zoedepth * [run-slow]beit,data2vec,zoedepth * Address comments * Address comment * Address comment * Replace nn.TransformerEncoderLayer and nn.TransformerEncoder * Replace nn.MultiheadAttention * Add attributes for patch transformer to config * Add tests for ensure_multiple_of * Update organization * Add tests * [run-slow] beit data2vec * Update ruff * [run-slow] beit data2vec * Add comment * Improve docstrings, add test * Fix interpolate_pos_encoding * Fix slow tests * Add docstring * Update src/transformers/models/zoedepth/image_processing_zoedepth.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/zoedepth/image_processing_zoedepth.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Improve tests and docstrings * Use run_common_tests * Improve docstrings * Improve docstrings * Improve tests * Improve tests * Remove print statements --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
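A quick, hedged smoke test of the new model via the pipeline API; the checkpoint id is assumed from the conversion targets mentioned above and may differ:

```python
import requests
from PIL import Image
from transformers import pipeline

pipe = pipeline("depth-estimation", model="Intel/zoedepth-nyu-kitti")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
depth = pipe(image)["depth"]  # a PIL image holding the predicted depth map
```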
-
- 05 Jul, 2024 1 commit
-
Pedro Cuenca authored
* Depth Anything: update conversion script for V2 * Update docs * Style * Revert "Update docs" This reverts commit be0ca47ea1be4f3cd9aa2113bdd8efcc9959119e. * Add docs for depth anything v2 * Add depth_anything_v2 to MODEL_NAMES_MAPPING Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files * Add tip in original docs
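Similarly hedged, a one-liner to try the converted V2 weights; the checkpoint id is an assumption:

```python
from transformers import pipeline

pipe = pipeline(
    "depth-estimation",
    model="depth-anything/Depth-Anything-V2-Small-hf",  # assumed converted checkpoint
)
```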
-