Commits · 835de4c8335f72a9c53178f54cc3b4c0688960ec · chenpangpang / transformers


06 May, 2024 6 commits

Respect `resume_download` deprecation (#30620) · 835de4c8

Lucain authored May 06, 2024



* Deprecate resume_download

* remove default resume_download value

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>

835de4c8

Trainer - add cache clearing and the option for batched eval metrics computation (#28769) · df475bf8

Nate Cibik authored May 06, 2024

* Added cache clearing for GPU efficiency.

* Added cache clearing for GPU efficiency.

* Added batch_eval_metrics capability

* Ran make fixup

* Fixed bug

* Fixed whitespace issue

* Fixed outdated condition

* Updated docstrings with instructions for batch_eval_metrics. Updated end of dataloader logic

* Added first version of batch_eval_metrics Trainer test

* Fixed batch_eval_metrics Trainer tests for both eval and predict

* Fixed batch_eval_metrics behavior for new Trainer variables

* Fixed batch_eval_metrics Trainer tests

* Ran fixup

df475bf8
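
The `batch_eval_metrics` entries above are about computing evaluation metrics one batch at a time rather than keeping every prediction in memory until the end. A minimal sketch of that accumulation pattern is shown here in plain Python. The class name `RunningAccuracy` and the way the `compute_result` flag is used are illustrative assumptions, not the Trainer's actual interface; the Trainer docstrings mentioned in the commit are the authority on the real signature.

```python
from typing import Dict, Optional, Sequence


class RunningAccuracy:
    """Accumulates accuracy counts across evaluation batches (hypothetical helper)."""

    def __init__(self) -> None:
        self.correct = 0
        self.total = 0

    def __call__(
        self,
        predictions: Sequence[int],
        labels: Sequence[int],
        compute_result: bool = False,
    ) -> Optional[Dict[str, float]]:
        # Update the running counts using only the current batch.
        self.correct += sum(int(p == l) for p, l in zip(predictions, labels))
        self.total += len(labels)
        if not compute_result:
            return None
        # Last batch: report the aggregate metric, then reset for the next run.
        result = {"accuracy": self.correct / self.total if self.total else 0.0}
        self.correct = 0
        self.total = 0
        return result


metric = RunningAccuracy()
partial = metric([1, 0, 1], [1, 1, 1])                # first batch, nothing returned yet
final = metric([0, 0], [0, 1], compute_result=True)   # last batch triggers aggregation
print(partial, final)
```

Only two integers survive between batches, which is the memory saving the pattern is after.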

Trainer._load_from_checkpoint - support loading multiple Peft adapters (#30505) · e0769530

Clara Pohland authored May 06, 2024



* Trainer: load checkpoint model with multiple adapters

* Trainer._load_from_checkpoint support multiple active adapters

* PeftModel.set_adapter does not support multiple adapters yet

* Trainer._load_from_checkpoint test multiple adapters

---------
Co-authored-by: Clara Luise Pohland <clara-luise.pohland@telekom.de>

e0769530

Fix llava next tie_word_embeddings config (#30640) · aa64f086

Marc Sun authored May 06, 2024



* fix llava next embedding

* add docstring

* Update src/transformers/models/llava_next/configuration_llava_next.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

aa64f086

Check if the current compiled version of pytorch supports MPS (#30664) · 09edd77f
jiaqianjing authored May 06, 2024

09edd77f

[`CI update`] Try to use dockers and no cache (#29202) · 307f632b

Arthur authored May 06, 2024



* change cis

* nits

* update

* minor updates

* [push-ci-image]

* nit [push-ci-image]

* nitsssss

* [build-ci-image]

* [push-ci-image]

* [push-ci-image]

* both

* [push-ci-image]

* this?

* [push-ci-image]

* pypi-kenlm needs g++

* [push-ci-image]

* nit

* more nits [push-ci-image]

* nits [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* add vision

* [push-ci-image]

* [push-ci-image]

* add new dummy file but will need to update them [push-ci-image]

* [push-ci-image]

* show package size as well

* [push-ci-image]

* potentially ignore failures

* workflow updates

* nits [push-ci-image]

* [push-ci-image]

* fix consistency

* clean nvidia triton

* also show big packages [push-ci-image]

* nit

* update

* another one

* line escape?

* add accelerate [push-ci-image]

* updates [push-ci-image]

* nits to run tests, no push-ci

* try to parse skip reason to make sure nothing is skipped that should not be skipped

* nit?

* always show skipped reasons

* nits

* better parsing of the test outputs

* action="store_true",

* failure on failed

* show matched

* debug

* update short summary with skipped, failed and errors

* nits

* nits

* cool updates

* remove docbuilder

* fix

* always run checks

* oups

* nits

* don't error out on library printing

* non zero exit codes

* no warning

* nit

* WAT?

* format nit

* [push-ci-image]

* fail if fail is needed

* [push-ci-image]

* sound file for torch light?

* [push-ci-image]

* order is important [push-ci-image]

* [push-ci-image] reduce even further

* [push-ci-image]

* use pytest rich !

* yes [push-ci-image]

* oupsy

* bring back the full traceback, but pytest rich should help

* nit

* [push-ci-image]

* re run

* nit

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* empty push to trigger

* [push-ci-image]

* nit? [push-ci-image]

* empty

* try to install timm with no deps

* [push-ci-image]

* oups [push-ci-image]

* [push-ci-image]

* [push-ci-image] ?

* [push-ci-image] open ssh client for git checkout fast

* empty for torch light

* updates [push-ci-image]

* nit

* @v4 for checkout

* [push-ci-image]

* [push-ci-image]

* fix fetch tests with parallelism

* [push-ci-image]

* more parallelism

* nit

* more nits

* empty to re-trigger

* empty to re-trigger

* split by timing

* did not work with previous commit

* junit.xml

* no path?

* mmm this?

* junitxml format

* split by timing

* nit

* fix junit family

* now we can test if the xunit1 is compatible!

* this?

* fully list tests

* update

* update

* oups

* finally

* use classname

* remove working directory to make sure the path does not interfere

* okay now junit should have the correct path

* name split?

* sort by classname is what make most sense

* some testing

* naem

* oups

* test something fun

* autodetect

* 18?

* nit

* file size?

* uip

* 4 is best

* update to see versions

* better print

* [push-ci-image]

* [push-ci-image]

* please install the correct keras version

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* uv is fucking me up

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* nits

* [push-ci-image]

* [push-ci-image]

* install issues and pins

* tapas as well

* nits

* more parallelism

* short tb

* soundfile

* soundfile

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* oups

* [push-ci-image]

* fix some things

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* use torch-light for hub

* small git lfs for hub job

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* fix tf tapas

* [push-ci-image]

* nits

* [push-ci-image]

* don't update the test

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* no use them

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* update tf proba

* [push-ci-image]

* [push-ci-image]

* woops

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* test with built dockers

* [push-ci-image]

* skip annoying tests

* revert fix copy

* update test values

* update

* last skip and fixup

* nit

* ALL GOOOD

* quality

* Update tests/models/layoutlmv2/test_image_processing_layoutlmv2.py

* Update docker/quality.dockerfile
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/models/tapas/modeling_tf_tapas.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>

* use torch-speed

* updates

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* fuck ken-lm [push-ci-image]

* [push-ci-image]

* [push-ci-image]

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>

307f632b

03 May, 2024 4 commits

Prevent `TextGenerationPipeline._sanitize_parameters` from overriding previously provided parameters (#30362) · deb7605a

Yen Ting authored May 03, 2024

* Fixed TextGenerationPipeline._sanitize_parameters default params

* removed empty spaces

---------
Co-authored-by: Ng, Yen Ting <yen.ting.ng@intel.com>

deb7605a

HQQ: PEFT support for HQQ (#30632) · d0c72c15
Younes Belkada authored May 03, 2024
```
Update quantizer_hqq.py
```
d0c72c15
Fix W&B run name (#30462) · 66f675eb
Pavel Iakubovskii authored May 03, 2024
```
* Remove comparison to output_dir

* Update docs for `run_name`

* Add warning
```
66f675eb
add mlp bias for llama models (#30031) · 425e1a04
Mayank Mishra authored May 03, 2024
```
* add bias

* fix quality
```
425e1a04

02 May, 2024 13 commits

Fix CI after #30410 (#30612) · a0e77a1f
Raushan Turganbay authored May 03, 2024
```
* Fix CI after #30410

* [run-slow] blenderbot
```
a0e77a1f

Add HQQ quantization support (#29637) · 59952994

mobicham authored May 02, 2024



* update HQQ transformers integration

* push import_utils.py

* add force_hooks check in modeling_utils.py

* fix | with Optional

* force bias as param

* check bias is Tensor

* force forward for multi-gpu

* review fixes pass

* remove torch grad()

* if any key in linear_tags fix

* add cpu/disk check

* isinstance return

* add multigpu test + refactor tests

* clean hqq_utils imports in hqq.py

* clean hqq_utils imports in quantizer_hqq.py

* delete hqq_utils.py

* Delete src/transformers/utils/hqq_utils.py

* ruff init

* remove torch.float16 from __init__ in test

* refactor test

* isinstance -> type in quantizer_hqq.py

* cpu/disk device_map check in quantizer_hqq.py

* remove type(module) nn.linear check in quantizer_hqq.py

* add BaseQuantizeConfig import inside HqqConfig init

* remove hqq import in hqq.py

* remove accelerate import from test_hqq.py

* quant config.py doc update

* add hqqconfig to main_classes doc

* make style

* __init__ fix

* ruff __init__

* skip_modules list

* hqqconfig format fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* test_hqq.py remove mistral comment

* remove self.using_multi_gpu is False

* torch_dtype default val set and logger.info

* hqq.py isinstance fix

* remove torch=None

* torch_device test_hqq

* rename test_hqq

* MODEL_ID in test_hqq

* quantizer_hqq setattr fix

* quantizer_hqq typo fix

* imports quantizer_hqq.py

* isinstance quantizer_hqq

* hqq_layer.bias reformat quantizer_hqq

* Step 2 as comment in quantizer_hqq

* prepare_for_hqq_linear() comment

* keep_in_fp32_modules fix

* HqqHfQuantizer reformat

* quantization.md hqqconfig

* quantization.md model example reformat

* quantization.md # space

* quantization.md space   })

* quantization.md space   })

* quantization_config fix doc
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* axis value check in quantization_config

* format

* dynamic config explanation

* quant config method in quantization.md

* remove shard-level progress

* .cuda fix modeling_utils

* test_hqq fixes

* make fix-copies

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

59952994

Output `None` as attention when layer is skipped (#30597) · 4c940934
Jonghwan Hyeon authored May 03, 2024
```
* Output `None` as attention when layer is skipped

* Add test for output_attentions
```
4c940934
Fix FX tracing issues for Llama (#30619) · 39359e5b
Michael Benayoun authored May 02, 2024

39359e5b
Generate: fix `SinkCache` on Llama models (#30581) · 9719202d
Joao Gante authored May 02, 2024

9719202d

Docs: add missing `StoppingCriteria` autodocs (#30617) · 66abe139

Joao Gante authored May 02, 2024



* add missing docstrings to docs

* Update src/transformers/generation/stopping_criteria.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

66abe139

Docs: fix `generate`-related rendering issues (#30600) · aa55ff44
Joao Gante authored May 02, 2024
```
* does this work?

* like this?

* fix the other generate links

* missing these
```
aa55ff44
Use `contiguous()` in clip checkpoint conversion script (#30613) · f57f0149
Yih-Dar authored May 02, 2024
```
* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
f57f0149
fix:missing `output_router_logits` in SwitchTransformers (#30573) · a65da83d
Zhan Lu authored May 02, 2024
```
* fix:missing `output_router_logits` in SwitchTransformers

* fix whitespace in blank line
```
a65da83d
Fix copies for DBRX - neuron fix (#30610) · 4ad5adaf
amyeroberts authored May 02, 2024

4ad5adaf
🚨 Update image_processing_vitmatte.py (#30566) · f9530258
Richard Brown authored May 02, 2024
```
* Update image_processing_vitmatte.py

* add test

* [run-slow]vitmatte
```
f9530258
Fix for Neuron (#30259) · fbabd674
Michael Benayoun authored May 02, 2024

fbabd674
Fix: failing CI after #30568 (#30599) · 5cf3e6bf
Raushan Turganbay authored May 02, 2024
```
* failing CI

* no, let's keep it until full deprecation in v4.42
```
5cf3e6bf

01 May, 2024 6 commits

Fix llava half precision and autocast issues (#29721) · 5090ea3f

Fraser Mince authored May 01, 2024

* Ensure input_embeds and image_features are the same dtype in autocast

* Fix nans in half precision llava-next and fix autocasting behavior.

* Fix styling issues.

* fix randn newline instantiation

* fix broken slow llava test

* Fix llava next init.

* fix styling issues

* [run-slow]llava,llava_next

* fix styling issues

5090ea3f

Generate: remove deprecated public decoding functions and streamline logic 🧼 (#29956) · d57ffb48
Joao Gante authored May 01, 2024

d57ffb48

Gemma: update activation warning (#29995) · f4f18afd

Pedro Cuenca authored May 01, 2024

* Gemma: only display act. warning when necessary

This is a nit PR, but I was confused. I got the warning even after I
had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I
was using the "legacy" `gelu_pytorch_tanh`.

Another option is to keep the warning but change the message to say
something like "`hidden_act` is ignored, please use `hidden_activation`
instead. Setting Gemma's activation function to `gelu_pytorch_tanh`".

* Change message, and set `config.hidden_activation`

f4f18afd
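
The Gemma entry above describes a warning that fired even when the user had already chosen `gelu_pytorch_tanh`, and proposes warning only when the fallback is actually taken. A minimal sketch of that decision logic follows. The helper `resolve_activation` and the constant `DEFAULT_ACTIVATION` are hypothetical names introduced for illustration; this is not the code from the PR.

```python
import warnings
from typing import Optional

DEFAULT_ACTIVATION = "gelu_pytorch_tanh"  # fallback named in the commit message


def resolve_activation(hidden_act: Optional[str], hidden_activation: Optional[str]) -> str:
    """Pick the activation name, warning only when the fallback is used (hypothetical helper)."""
    if hidden_activation is not None:
        # The user chose explicitly, so there is nothing to warn about.
        return hidden_activation
    warnings.warn(
        f"`hidden_act` ({hidden_act!r}) is ignored, please use `hidden_activation` instead. "
        f"Setting the activation function to {DEFAULT_ACTIVATION!r}.",
        UserWarning,
    )
    return DEFAULT_ACTIVATION


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    explicit = resolve_activation("gelu", "gelu_pytorch_tanh")  # explicit choice, silent
    fallback = resolve_activation("gelu", None)                 # fallback path, warns once

print(explicit, fallback, len(caught))
```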

Refactor default chat template warnings (#30551) · 4b4da18f

Matt authored May 01, 2024

* Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates

* make fixup

* Move the default chat template warning into apply_chat_template itself

* make fixup

4b4da18f

Fix Marian model conversion (#30173) · 4bc9cb36

Raushan Turganbay authored May 01, 2024

* fix marian model conversion

* uncomment that line

* remove unnecessary code

* revert tie_weights, doesn't hurt

4bc9cb36

Encoder-decoder models: move embedding scale to nn.Module (#30410) · 38a4bf79

Raushan Turganbay authored May 01, 2024



* move scaling to nn.Module

* let the test be here for now (need to fix)

* failing tests

* last failing models

* Revert commit 4c14817f38

* clean-up

* oops forgot

* codestyle

* raise NotImplemented when possible

* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* skip tests in respective modeling files

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

38a4bf79

30 Apr, 2024 9 commits

Remove `use_square_size` after loading (#30567) · 78fdd64d

Yih-Dar authored Apr 30, 2024



* fix

* add test

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

78fdd64d

Fix generation doctests (#30263) · b8ac4d03
Raushan Turganbay authored May 01, 2024
```
* fix doctest

* fix torch doctest

* make CI happy

* raise error

* make fixup
```
b8ac4d03

Add chat templating support for KeyDataset in text-generation pipeline (#30558) · 2ecefc39

DarshanDeshpande authored Apr 30, 2024

* added chat templating support for keydataset in generation pipeline

* fixed and improved test

* fix formatting test failures

* Fix tests

* Fix tests

2ecefc39

BlipModel: get_multimodal_features method (#30438) · 0cdb6b3f

Jiarui Xu authored May 01, 2024

* add_blip_get_multimodal_feautres

* Fix docstring error

* reimplement get_multimodal_features

* fix error

* recheck code quality

* add new necessary tests

0cdb6b3f

Fix seq2seq collator padding (#30556) · 9112520b

Anton Vlasjuk authored Apr 30, 2024

* fix seq2seq data collator to respect the given padding strategy

further added tests for the seq2seq data collator in the style of the `data_collator_for_token_classification` (pt, tf, np)

* formatting and change bool equals "==" to "is"

* add missed return types in tests

* update numpy test as it can handle unequal shapes, unlike pt or tf

9112520b
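
The collator entry above is about padding label sequences according to the padding strategy the caller asked for. A rough sketch of what respecting the strategy can mean is given here in plain Python. The function `pad_labels` and its parameters are hypothetical and only loosely modelled on the tokenizer padding options (`True`/`"longest"`, `"max_length"`, `False`/`"do_not_pad"`); the use of `-100` as the pad value is an assumption based on the usual ignore index for label padding.

```python
from typing import List, Optional, Sequence, Union

IGNORE_INDEX = -100  # assumed label pad value, conventionally skipped by the loss


def pad_labels(
    batch: Sequence[Sequence[int]],
    padding: Union[bool, str] = True,
    max_length: Optional[int] = None,
    pad_value: int = IGNORE_INDEX,
) -> List[List[int]]:
    """Right-pad label sequences according to the requested strategy (hypothetical helper)."""
    if padding is False or padding == "do_not_pad":
        # No padding requested: return the sequences unchanged.
        return [list(seq) for seq in batch]
    if padding == "max_length":
        if max_length is None:
            raise ValueError("`max_length` is required when padding='max_length'")
        target = max_length
    else:
        # True or "longest": pad up to the longest sequence in this batch.
        target = max((len(seq) for seq in batch), default=0)
    # Sequences already at or beyond the target are left as they are (no truncation).
    return [list(seq) + [pad_value] * (target - len(seq)) for seq in batch]


labels = [[1, 2, 3], [4]]
longest = pad_labels(labels)
fixed = pad_labels(labels, padding="max_length", max_length=4)
untouched = pad_labels(labels, padding=False)
print(longest, fixed, untouched)
```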

DBRX: make fixup (#30578) · 78a57c5e
Joao Gante authored Apr 30, 2024

78a57c5e
Cache: Static cache as a standalone object (#30476) · 75bbfd5b
Joao Gante authored Apr 30, 2024

75bbfd5b

Enable multi-device for more models (#30409) · 0ae789e0

Jacky Lee authored Apr 30, 2024

* feat: support for dinov2

* feat: support for depth_anything

* feat: support for efficientformer

* feat: support for bert (is this right?)

* update: embedding split

* remove: empty string

* feat: support for align

* fix: copies

* fix: QAQBertEmbeddings

* fix: more consistency issues

* revert: support for efficientformer

* feat: support for altclip

* feat: support for blip_text

* support for ChineseCLIP

* feat: support for depth anything

* feat: support for dpt

* feat: support for dpt

* feat: support for git

* feat: support for groupvit

* update: format

* fix: support for clip

* fix: consistency

* feat: support for pvt

* feat: support for vit_msn

* fix: consistency

* fix: other copies

* remove: device transfer

* revert: in-place add

* update: support for align

* update: support for bert

* update: support for Chinese CLIP

* revert: changes to efficientformer

* update: support for dpt

* update: support for efficientformer

* revert: changes to git

* revert: changes to groupvit

* revert: changes to roc_bert

* update: support for vit_msn

* revert: changes to dpt

* remove: extra space

* style: extra space

0ae789e0

Pass `use_cache` in kwargs for GPTNeoX (#30538) · c712d05a
Raushan Turganbay authored Apr 30, 2024
```
pass use_cache in kwargs
```
c712d05a

29 Apr, 2024 2 commits

Include safetensors as part of `_load_best_model` (#30553) · a3aabc70
Zach Mueller authored Apr 29, 2024
```
* Include safetensors

* Cleanup
```
a3aabc70

Reenable SDPA's FA2 During Training with torch.compile (#30442) · 9df8b301

Benjamin Warner authored Apr 29, 2024

* Reenable SDPA's FA2 during training with torch.compile

* fix Olmo's SDPA FA2 dispatching too

* update formatting

* improved SDPA comment

* formatting and explanatory comment

* is_causal if statement to one-liner

9df8b301