Commits · 75bbfd5b2237b7e35a9265731ecf63022579e7e2 · chenpangpang / transformers

30 Apr, 2024 3 commits

Cache: Static cache as a standalone object (#30476) · 75bbfd5b
Joao Gante authored Apr 30, 2024

75bbfd5b

Enable multi-device for more models (#30409) · 0ae789e0

Jacky Lee authored Apr 30, 2024

* feat: support for dinov2

* feat: support for depth_anything

* feat: support for efficientformer

* feat: support for bert (is this right?)

* update: embedding split

* remove: empty string

* feat: support for align

* fix: copies

* fix: QAQBertEmbeddings

* fix: more consistency issues

* revert: support for efficientformer

* feat: support for altclip

* feat: support for blip_text

* support for ChineseCLIP

* feat: support for depth anything

* feat: support for dpt

* feat: support for dpt

* feat: support for git

* feat: support for groupvit

* update: format

* fix: support for clip

* fix: consistency

* feat: support for pvt

* feat: support for vit_msn

* fix: consistency

* fix: other copies

* remove: device transfer

* revert: in-place add

* update: support for align

* update: support for bert

* update: support for Chinese CLIP

* revert: changes to efficientformer

* update: support for dpt

* update: support for efficientformer

* revert: changes to git

* revert: changes to groupvit

* revert: changes to roc_bert

* update: support for vit_msn

* revert: changes to dpt

* remove: extra space

* style: extra space

0ae789e0

Pass `use_cache` in kwargs for GPTNeoX (#30538) · c712d05a
Raushan Turganbay authored Apr 30, 2024
```
pass use_cache in kwargs
```
c712d05a

29 Apr, 2024 4 commits
- Include safetensors as part of `_load_best_model` (#30553) · a3aabc70
  Zach Mueller authored Apr 29, 2024
```
* Include safetensors

* Cleanup
```
  a3aabc70
- Reenable SDPA's FA2 During Training with torch.compile (#30442) · 9df8b301
  Benjamin Warner authored Apr 29, 2024
```
* Reenable SDPA's FA2 during training with torch.compile

* fix Olmo's SDPA FA2 dispatching too

* update formatting

* improved SDPA comment

* formatting and explanatory comment

* is_causal if statement to one-liner
```
  9df8b301
- Pass attn_implementation when using AutoXXX.from_config (#30507) · e8acb700
  amyeroberts authored Apr 29, 2024
```
* Pass attn_implementation when using AutoXXX.from_config

* Fix
```
  e8acb700
- Allow boolean FSDP options in fsdp_config (#30439) · 80126f98
  Howard Liberty authored Apr 29, 2024
```
* Allow boolean FSDP options in fsdp_config

* Use lower() to be safe
```
  80126f98
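The two bullets above describe accepting real booleans where string values were expected, using `lower()` for a case-insensitive comparison. A minimal sketch of that kind of normalization follows. The helper `as_bool` and the option names are hypothetical, chosen for illustration, and are not taken from the library:

```python
def as_bool(value):
    """Normalize a config option that may be a bool or a string.

    Hypothetical helper for illustration; not part of transformers.
    """
    # str() makes the call safe for real booleans, and lower() makes the
    # comparison case-insensitive ("True", "TRUE", "true" all match).
    return str(value).lower() == "true"


# Hypothetical config dict; the key names are illustrative only.
fsdp_config = {"option_a": True, "option_b": "True", "option_c": "false"}
parsed = {key: as_bool(val) for key, val in fsdp_config.items()}
print(parsed)
```

Calling `.lower()` directly on a `bool` would raise `AttributeError`, which is why the sketch converts to `str` first.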
26 Apr, 2024 9 commits

[SegGPT] Fix seggpt image processor (#29550) · 6d4cabda

Eduardo Pacheco authored Apr 26, 2024

* Fixed SegGptImageProcessor to handle 2D and 3D prompt mask inputs

* Added new test to check prompt mask equivalence

* New proposal

* Better proposal

* Removed unnecessary method

* Updated seggpt docs

* Introduced do_convert_rgb

* nits

6d4cabda

load_image - decode b64encode and encodebytes strings (#30192) · c793b26f
amyeroberts authored Apr 26, 2024
```
* Decode b64encode and encodebytes strings

* Remove conditional encode -- image is always a string
```
c793b26f
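For context on the two encodings named in the title: in Python's standard library, `base64.b64encode` produces a single line, while `base64.encodebytes` inserts a newline after every 76 output characters. A strict decoder rejects those newlines. The sketch below shows one way to accept both forms. The helper name `decode_base64_string` is hypothetical, and this is not necessarily how `load_image` handles it:

```python
import base64
import binascii

payload = bytes(range(100))  # long enough that encodebytes wraps lines

plain = base64.b64encode(payload).decode()      # one line, no newlines
wrapped = base64.encodebytes(payload).decode()  # contains "\n" separators


def decode_base64_string(text):
    """Decode a base64 string whether or not it contains line breaks.

    Hypothetical helper for illustration only.
    """
    # decodebytes skips embedded newlines, so it accepts both forms.
    return base64.decodebytes(text.encode())


# Strict validation rejects the newline-wrapped form.
try:
    base64.b64decode(wrapped, validate=True)
    strict_rejected_wrapped = False
except binascii.Error:
    strict_rejected_wrapped = True

plain_ok = decode_base64_string(plain) == payload
wrapped_ok = decode_base64_string(wrapped) == payload
```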
Fix GroundingDINO, DPR after BERT SDPA update (#30506) · e7d52a10
amyeroberts authored Apr 26, 2024
```
Fix GroundingDINO, DPR after BERT SDPA update
```
e7d52a10

[`DETR`] Remove timm hardcoded logic in modeling files (#29038) · aafa7ce7

amyeroberts authored Apr 26, 2024



* Enable instantiating model with pretrained backbone weights

* Clarify pretrained import

* Use load_backbone instead

* Add backbone_kwargs to config

* Fix up

* Add tests

* Tidy up

* Enable instantiating model with pretrained backbone weights

* Update tests so backbone checkpoint isn't passed in

* Clarify pretrained import

* Update configs - docs and validation check

* Update src/transformers/utils/backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Clarify exception message

* Update config init in tests

* Add test for when use_timm_backbone=True

* Use load_backbone instead

* Add use_timm_backbone to the model configs

* Add backbone_kwargs to config

* Pass kwargs to constructors

* Draft

* Fix tests

* Add back timm - weight naming

* More tidying up

* Whoops

* Tidy up

* Handle when kwargs are none

* Update tests

* Revert test changes

* Deformable detr test - don't use default

* Don't mutate; correct model attributes

* Add some clarifying comments

* nit - grammar is hard

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

aafa7ce7

Remove skipping logic now that set_epoch exists (#30501) · 77ff304d
Zach Mueller authored Apr 26, 2024
```
* Remove skipping logic now that set_epoch exists

* Working version, clean
```
77ff304d

[`BERT`] Add support for sdpa (#28802) · dfa7b580

JB (Don) authored Apr 26, 2024

* Adding SDPA support for BERT

* Using the proper input name for testing model input in inference()

* Adding documentation for SDPA in BERT model page

* Use the stable link for the documentation

* Adding a gate to only call .contiguous() for torch < 2.2.0

* Additions and fixes to the documentation

* Minor updates to documentation

* Adding extra requirements needed for the contiguous() bug

* Adding "Adapted from" in place of the "Copied from"

* Add benchmark speedup tables to the documentation

* Minor fixes to the documentation

* Use ClapText as a replacement for Bert in the Copied-From

* Some more fixes for the fix-copies references

* Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage

[test all]

* Undo changes to separate test

* Refactored SDPA self attention code for KV projections

* Change use_sdpa to attn_implementation

* Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)

dfa7b580

Update `dtype_byte_size` to handle torch.float8_e4m3fn/float8_e5m2 types (#30488) · 20081c74
Michael Goin authored Apr 26, 2024
```
* Update modeling_utils/dtype_byte_size to handle float8 types

* Add a test for dtype_byte_size

* Format

* Fix bool
```
20081c74
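The float8 dtype names carry a suffix after the bit width (`float8_e4m3fn`, `float8_e5m2`), so a size lookup that parses the name has to read the first run of digits rather than the last. The sketch below illustrates that idea on plain strings so it runs without torch. The function `dtype_byte_size_from_name` and its regular expression are assumptions made for illustration, not the library's implementation:

```python
import re


def dtype_byte_size_from_name(dtype_name):
    """Return the size in bytes of one element, parsed from a dtype name.

    Hypothetical helper; operates on strings such as "torch.float16".
    """
    # A boolean is treated as one bit, i.e. an eighth of a byte.
    if dtype_name.endswith("bool"):
        return 1 / 8
    # Take the FIRST run of digits that follows a non-digit character.
    # A pattern anchored at the end of the name would instead read the
    # trailing "2" of "float8_e5m2" and report the wrong width.
    match = re.search(r"[^\d](\d+)", dtype_name)
    if match is None:
        raise ValueError(f"{dtype_name!r} does not contain a bit width")
    return int(match.group(1)) // 8


sizes = {
    name: dtype_byte_size_from_name(name)
    for name in ("torch.float8_e4m3fn", "torch.float8_e5m2", "torch.float16")
}
print(sizes)
```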
Fix the `bitsandbytes` error formatting ("Some modules are dispatched on ...") (#30494) · 59e715f7
kyo authored Apr 26, 2024
```
Fix the `bitsandbytes` error when some modules are not properly offloaded.
```
59e715f7
FEAT: PEFT support for EETQ (#30449) · 19cfdf0f
Younes Belkada authored Apr 26, 2024
```
Update quantizer_eetq.py
```
19cfdf0f

25 Apr, 2024 9 commits

Quantization: `HfQuantizer` quant method update (#30484) · 26ddc580
Younes Belkada authored Apr 25, 2024
```
ensure popular quant methods are supported
```
26ddc580
Do not use deprecated `SourceFileLoader.load_module()` in dynamic module loading (#30370) · bc274a28
Xuehai Pan authored Apr 26, 2024

bc274a28
Fix Llava for 0-embeddings (#30473) · e60491ad
Raushan Turganbay authored Apr 25, 2024

e60491ad

Introduce Stateful Callbacks (#29666) · ad697f18

Zach Mueller authored Apr 25, 2024



* Introduce saveable callbacks

* Add note

* Test for non-present and flag

* Support early stopping and refusing to train further

* Update docstring

* More saving

* Import oopsie

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Make it go through TrainerArguments

* Document

* Fix test

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Rework to allow for duplicates

* Clean

* Fix failing tests

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

ad697f18

Add WSD scheduler (#30231) · 7b1170b0

Alexander Visheratin authored Apr 25, 2024

* Added WSD scheduler.

* Added tests.

* Fixed errors.

* Fix formatting.

* CI fixes.

7b1170b0

🚨 Add training compatibility for Musicgen-like models (#29802) · 90cb55bf

Yoach Lacombe authored Apr 25, 2024



* first modeling code

* make repository

* still WIP

* update model

* add tests

* add latest change

* clean docstrings and copied from

* update docstrings md and readme

* correct chroma function

* correct copied from and remove unrelated test

* add doc to toctree

* correct imports

* add convert script to notdoctested

* Add suggestion from Sanchit
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct get_unconditional_inputs docstrings

* modify README according to SANCHIT feedback

* add chroma to audio utils

* clean librosa and torchaudio hard dependencies

* fix FE

* refactor audio decoder -> audio encoder for consistency with previous musicgen

* refactor conditional -> encoder

* modify sampling rate logics

* modify license at the beginning

* refactor all_self_attns->all_attentions

* remove ignore copy from causallm generate

* add copied from for from_sub_models

* fix make copies

* add warning if audio is truncated

* add copied from where relevant

* remove artefact

* fix convert script

* fix torchaudio and FE

* modify chroma method according to feedback-> better naming

* refactor input_values->input_features

* refactor input_values->input_features and fix import fe

* add input_features to docstrings

* correct inputs_embeds logics

* remove dtype conversion

* refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation

* change warning for chroma length

* Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* change way to save wav, using soundfile

* correct docs and change to soundfile

* fix import

* fix init proj layers

* add draft training

* fix cross entropy

* clean loss computation

* fix labels

* remove line breaks from md

* fix issue with docstrings

* add FE suggestions

* improve is in logics and remove useless imports

* remove custom from_pretrained

* simplify docstring code

* add suggestions for modeling tests

* make style

* update converting script with sanity check

* remove encoder attention mask from conditional generation

* replace musicgen melody checkpoints with official orga

* rename ylacombe->facebook in checkpoints

* fix copies

* remove unnecessary warning

* add shape in code docstrings

* add files to slow doc tests

* fix md bug and add md to not_tested

* make fix-copies

* fix hidden states test and batching

* update training code

* add training tests for melody

* add training for o.g musicgen

* fix copied from

* remove final todos

* make style

* fix style

* add suggestions from review

* add ref to the original loss computation code

* rename method + fix labels in tests

* make style

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

90cb55bf

Prevent crash with `WandbCallback` with third parties (#30477) · ce5ae5a4

Tom Aarsen authored Apr 25, 2024

* Use EAFP principle to prevent crash with third parties

* Remove leftover debugging code

* Add info-level logger message

ce5ae5a4
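EAFP ("easier to ask forgiveness than permission") means attempting an operation and handling the failure, rather than checking beforehand whether the object supports it. A minimal sketch of the pattern with an info-level log message on failure follows. The classes, the function `count_parameters`, and the message text are hypothetical stand-ins, not the callback's actual code:

```python
import logging

logger = logging.getLogger(__name__)


class NativeModel:
    """Stand-in for a model that exposes the expected method."""

    def num_parameters(self):
        return 42


class ThirdPartyModel:
    """Stand-in for a model from another library, without that method."""


def count_parameters(model):
    # EAFP: try the call and handle the failure, instead of guarding it
    # with hasattr() or isinstance() checks up front.
    try:
        return model.num_parameters()
    except AttributeError:
        logger.info("Could not determine the number of model parameters.")
        return None


native_count = count_parameters(NativeModel())
third_party_count = count_parameters(ThirdPartyModel())
```

With this shape, an object that lacks the method is skipped with a log line instead of crashing the caller.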

Fix SigLip classification doctest (#30475) · 4fed29e3

amyeroberts authored Apr 25, 2024

* Fix SigLip classification doctest

* Remove extra line

* Update src/transformers/models/siglip/modeling_siglip.py

4fed29e3

[fix codellama conversion] (#30472) · c60749d6
Arthur authored Apr 25, 2024
```
* fix codellama conversion

* nit
```
c60749d6

24 Apr, 2024 11 commits

Non blocking support to torch DL's (#30465) · 6ad9c8f7
Zach Mueller authored Apr 24, 2024
```
* Non blocking support

* Check for optimization

* Doc
```
6ad9c8f7

Enable fp16 on CPU (#30459) · 5c57463b

Zach Mueller authored Apr 24, 2024

* Check removing flag for torch

* LLM oops

* Getting there...

* More discoveries

* Change

* Clean up and prettify

* Logic check

* Not

5c57463b

Neuron: When save_safetensor=False, no need to move model to CPU (#29703) · d1d94d79

jeffhataws authored Apr 24, 2024

save_safetensor=True has been the default since release 4.35.0, which then
required the TPU hotfix https://github.com/huggingface/transformers/pull/27799
(issue https://github.com/huggingface/transformers/issues/27578).
However, when the save_safetensor flag is set to False (compatibility mode),
moving the model to CPU generates too many graphs
during checkpointing (https://github.com/huggingface/transformers/issues/28438).
This PR disables moving the model to CPU when save_safetensor=False.

d1d94d79
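The rule described in this commit message can be sketched as a single decision: move the model to CPU before saving only when the safetensors path is in use. The function name and the step labels below are hypothetical and only mirror the wording of the message. They are not the trainer's actual code:

```python
def plan_checkpoint_steps(save_safetensor):
    """List the checkpoint steps implied by the commit message.

    Hypothetical sketch; the step labels are illustrative strings.
    """
    steps = []
    # Per the message, the CPU move is only needed for the safetensors
    # path; in compatibility mode it is skipped to avoid the extra graphs.
    if save_safetensor:
        steps.append("move model to CPU")
    steps.append("save checkpoint")
    return steps


default_plan = plan_checkpoint_steps(save_safetensor=True)
compat_plan = plan_checkpoint_steps(save_safetensor=False)
```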

Phi-3 (#30423) · c9693db2

Gustavo de Rosa authored Apr 24, 2024

* chore(root): Initial commit of Phi-3 files.

* fix(root): Fixes Phi-3 missing on readme.

* fix(root): Ensures files are consistent.

* fix(phi3): Fixes unit tests.

* fix(tests): Fixes style of phi-3 test file.

* chore(tests): Adds integration tests for Phi-3.

* fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm.

* fix(phi3): Fixes incorrect docstrings.

* fix(phi3): Fixes docstring typos.

* fix(phi3): Adds support for Su and Yarn embeddings.

* fix(phi3): Improves according to the first batch of reviews.

* fix(phi3): Uses up_states instead of y in Phi3MLP.

* fix(phi3): Uses gemma rotary embedding to support torch.compile.

* fix(phi3): Improves how rotary embedding classes are defined.

* fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.

* fix(phi3): Adds last suggestions to modeling file.

* fix(phi3): Splits inv_freq calculation in two lines.

c9693db2

[SegGPT] Fix loss calculation (#30421) · d26c1413

Eduardo Pacheco authored Apr 24, 2024



* Fixed main train issues

* Added loss test

* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added missing labels arg in SegGptModel forward

* Fixed typo

* Added slow test to test loss calculation

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

d26c1413

fix jamba slow forward for multi-gpu (#30418) · 37fa1f65
Marc Sun authored Apr 24, 2024
```
* fix jamba slow forward for multi-gpu

* remove comm

* oups

* style
```
37fa1f65
fix uncaught init of linear layer in clip's/siglip's for image classification models (#30435) · 5d64ae9d
Anton Vlasjuk authored Apr 24, 2024
```
* fix clip's/siglip's _init_weights to reflect linear layers in "for image classification"

* trigger slow tests
```
5d64ae9d

[`Llava`] + CIs fix red cis and llava integration tests (#30440) · 9a4a119c

Arthur authored Apr 24, 2024



* nit

* nit and fmt skip

* fixup

* Update src/transformers/convert_slow_tokenizer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set to true

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

9a4a119c

Fix YOLOS image processor resizing (#30436) · 767e3518

Pavel Iakubovskii authored Apr 24, 2024

* Add test for square image that fails

* Fix for square images

* Extend test cases

* Fix resizing in tests

* Style fixup

767e3518

Add llama3 (#30334) · 89c510d8

Arthur authored Apr 24, 2024



* nuke

* add co-author

* add co-author

* update card

* fixup and fix copies to please our ci

* nit fixup

* super small nits

* remove tokenizer_path from call to `write_model`

* always safe serialize by default

---------
Co-authored-by: pcuenca <pcuenca@users.noreply.github.com>
Co-authored-by: xenova <xenova@users.noreply.github.com>

89c510d8

Remove add-new-model in favor of add-new-model-like (#30424) · d4e92f1a
Lysandre Debut authored Apr 24, 2024
```
* Remove add-new-model in favor of add-new-model-like

* nits
```
d4e92f1a

23 Apr, 2024 4 commits

[`LlamaTokenizerFast`] Refactor default llama (#28881) · e34da3ee

Arthur authored Apr 23, 2024

* push legacy to fast as well

* super strange

* Update src/transformers/convert_slow_tokenizer.py

* make sure we are BC

* fix Llama test

* nit

* revert

* more test

* style

* update

* small update w.r.t tokenizers

* nit

* don't split

* lol

* add a test for `add_prefix_space=False`

* fix gemma tokenizer as well

* update

* fix gemma

* nicer failures

* fixup

* update

* fix the example for legacy = False

* use `huggyllama/llama-7b` for the PR doctest

* nit

* use from_slow

* fix llama

e34da3ee

Fix use_cache for xla fsdp (#30353) · 12c39e56
Jiewen Tan authored Apr 23, 2024
```
* Fix use_cache for xla fsdp

* Fix linters
```
12c39e56

Fix LayoutLMv2 init issue and doctest (#30278) · 416fdbad

Yih-Dar authored Apr 23, 2024



* fix

* try suggestion

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

416fdbad

Make EosTokenCriteria compatible with mps (#30376) · 4b63d013
Pedro Cuenca authored Apr 23, 2024

4b63d013