Commits · 7164171212e0ba68845447a5bef4631600b303ff · chenpangpang / transformers

01 May, 2024 11 commits

Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/distillation (#30586) · 71641712

dependabot[bot] authored May 01, 2024

Bump gitpython in /examples/research_projects/distillation

Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41)

---
updated-dependencies:
- dependency-name: gitpython
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

71641712

Bump grpcio from 1.44.0 to 1.53.2 in /examples/research_projects/decision_transformer (#30585) · ff8f6245

dependabot[bot] authored May 01, 2024

Bump grpcio in /examples/research_projects/decision_transformer

Bumps [grpcio](https://github.com/grpc/grpc) from 1.44.0 to 1.53.2.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](https://github.com/grpc/grpc/compare/v1.44.0...v1.53.2)

---
updated-dependencies:
- dependency-name: grpcio
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

ff8f6245

Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/decision_transformer (#30587) · b71f5128

dependabot[bot] authored May 01, 2024

Bump gitpython in /examples/research_projects/decision_transformer

Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41)

---
updated-dependencies:
- dependency-name: gitpython
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

b71f5128

Gemma: update activation warning (#29995) · f4f18afd

Pedro Cuenca authored May 01, 2024

* Gemma: only display act. warning when necessary

This is a nit PR, but I was confused. I got the warning even after I
had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I
was using the "legacy" `gelu_pytorch_tanh`.

Another option is to keep the warning but change the message to say
something like "`hidden_act` is ignored, please use `hidden_activation`
instead. Setting Gemma's activation function to `gelu_pytorch_tanh`".

* Change message, and set `config.hidden_activation`

f4f18afd
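
The behaviour described in the Gemma commit above (warn only when a fallback is actually applied, then record the chosen value on the config) can be sketched roughly as follows. This is an illustration, not the library's code: `ToyConfig` and `resolve_activation` are hypothetical names, and the warning text is adapted from the commit message.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    # Hypothetical stand-in for a model config; not the real Gemma config class.
    hidden_act: str = "gelu"
    hidden_activation: Optional[str] = None


def resolve_activation(config: ToyConfig) -> str:
    """Return the activation name to use, warning only when a fallback is applied."""
    if config.hidden_activation is None:
        warnings.warn(
            "`hidden_act` is ignored, please use `hidden_activation` instead. "
            "Setting the activation function to `gelu_pytorch_tanh`."
        )
        # Record the choice on the config so later calls do not warn again.
        config.hidden_activation = "gelu_pytorch_tanh"
    return config.hidden_activation
```

With this shape, a user who has already set `hidden_activation` explicitly never sees the warning.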

Fix canonical model --model_type in examples (#30480) · bbaa8cef
amyeroberts authored May 01, 2024
```
Fix --model_type in examples
```
bbaa8cef
remove jax example (#30498) · 3c69d81e
Arthur authored May 01, 2024
```
remove example
```
3c69d81e

Fix QA example (#30580) · 1e05671d

Matt authored May 01, 2024

* Handle cases when CLS token is absent

* Use BOS token as a fallback

1e05671d
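
A minimal sketch of the fallback described in the QA example fix above, assuming the example needs a token position to use for unanswerable questions. The function name `find_null_answer_index` and the final fallback to position 0 are assumptions made for illustration only.

```python
from typing import List, Optional


def find_null_answer_index(
    input_ids: List[int],
    cls_token_id: Optional[int],
    bos_token_id: Optional[int],
) -> int:
    """Pick the position used for "no answer" labels."""
    # Prefer the CLS token when the tokenizer defines one and it is present.
    if cls_token_id is not None and cls_token_id in input_ids:
        return input_ids.index(cls_token_id)
    # Otherwise use the BOS token as a fallback.
    if bos_token_id is not None and bos_token_id in input_ids:
        return input_ids.index(bos_token_id)
    # Assumption for this sketch: use position 0 when neither token is found.
    return 0
```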

Refactor default chat template warnings (#30551) · 4b4da18f

Matt authored May 01, 2024

* Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates

* make fixup

* Move the default chat template warning into apply_chat_template itself

* make fixup

4b4da18f

Fix Marian model conversion (#30173) · 4bc9cb36

Raushan Turganbay authored May 01, 2024

* fix marian model conversion

* uncomment that line

* remove unnecessary code

* revert tie_weights, doesn't hurt

4bc9cb36

Encoder-decoder models: move embedding scale to nn.Module (#30410) · 38a4bf79

Raushan Turganbay authored May 01, 2024



* move scaling to nn.Module

* let the test be here for now (need to fix)

* failing tests

* last failing models

* Revert commit 4c14817f38

* clean-up

* oops forgot

* codestyle

* raise NotImplemented when possible

* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* skip tests in respective modeling files

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

38a4bf79

Use text config's vocab size in testing models (#30568) · 9d31b32e
Raushan Turganbay authored May 01, 2024
```
use text config's vocab size
```
9d31b32e

30 Apr, 2024 11 commits

Remove `use_square_size` after loading (#30567) · 78fdd64d

Yih-Dar authored Apr 30, 2024



* fix

* add test

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

78fdd64d

General PR slow CI (#30540) · 87927b24

Yih-Dar authored Apr 30, 2024



* More general PR slow CI

* Update utils/pr_slow_ci_models.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

87927b24

Fix generation doctests (#30263) · b8ac4d03
Raushan Turganbay authored May 01, 2024
```
* fix doctest

* fix torch doctest

* make CI happy

* raise error

* make fixup
```
b8ac4d03

Add chat templating support for KeyDataset in text-generation pipeline (#30558) · 2ecefc39

DarshanDeshpande authored Apr 30, 2024

* added chat templating support for keydataset in generation pipeline

* fixed and improved test

* fix formatting test failures

* Fix tests

* Fix tests

2ecefc39

BlipModel: get_multimodal_features method (#30438) · 0cdb6b3f

Jiarui Xu authored May 01, 2024

* add_blip_get_multimodal_features

* Fix docstring error

* reimplement get_multimodal_features

* fix error

* recheck code quality

* add new necessary tests

0cdb6b3f

Fix seq2seq collator padding (#30556) · 9112520b

Anton Vlasjuk authored Apr 30, 2024

* fix seq2seq data collator to respect the given padding strategy

further added tests for the seq2seq data collator in the style of the `data_collator_for_token_classification` (pt, tf, np)

* formatting and change bool equals "==" to "is"

* add missed return types in tests

* update numpy test as it can handle unequal shapes, not like pt or tf

9112520b
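
To illustrate what "respecting the given padding strategy" means for label sequences in the seq2seq collator fix above, here is a rough pure-Python sketch. The helper `pad_labels` is hypothetical and much simpler than a real collator; it only shows the three strategies (`True`/`"longest"`, `"max_length"`, `False`/`"do_not_pad"`) being honoured, with `-100` as the label padding value.

```python
from typing import List, Optional, Union


def pad_labels(
    labels: List[List[int]],
    padding: Union[bool, str] = True,
    max_length: Optional[int] = None,
    label_pad_token_id: int = -100,
) -> List[List[int]]:
    """Pad label sequences according to the requested padding strategy."""
    # Identity check for the boolean, as in the commit's "==" to "is" change.
    if padding is False or padding == "do_not_pad":
        return [list(seq) for seq in labels]
    if padding == "max_length":
        if max_length is None:
            raise ValueError("max_length must be set when padding='max_length'")
        target = max_length
    else:
        # True or "longest": pad to the longest sequence in the batch.
        target = max(len(seq) for seq in labels)
    # Sequences longer than `target` are left unchanged; this sketch never truncates.
    return [list(seq) + [label_pad_token_id] * (target - len(seq)) for seq in labels]
```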

DBRX: make fixup (#30578) · 78a57c5e
Joao Gante authored Apr 30, 2024

78a57c5e
Generate: update links on LLM tutorial doc (#30550) · 1bff6a0b
Joao Gante authored Apr 30, 2024

1bff6a0b
Cache: Static cache as a standalone object (#30476) · 75bbfd5b
Joao Gante authored Apr 30, 2024

75bbfd5b

Enable multi-device for more models (#30409) · 0ae789e0

Jacky Lee authored Apr 30, 2024

* feat: support for dinov2

* feat: support for depth_anything

* feat: support for efficientformer

* feat: support for bert (is this right?)

* update: embedding split

* remove: empty string

* feat: support for align

* fix: copies

* fix: QAQBertEmbeddings

* fix: more consistency issues

* revert: support for efficientformer

* feat: support for altclip

* feat: support for blip_text

* support for ChineseCLIP

* feat: support for depth anything

* feat: support for dpt

* feat: support for dpt

* feat: support for git

* feat: support for groupvit

* update: format

* fix: support for clip

* fix: consistency

* feat: support for pvt

* feat: support for vit_msn

* fix: consistency

* fix: other copies

* remove: device transfer

* revert: in-place add

* update: support for align

* update: support for bert

* update: support for Chinese CLIP

* revert: changes to efficientformer

* update: support for dpt

* update: support for efficientformer

* revert: changes to git

* revert: changes to groupvit

* revert: changes to roc_bert

* update: support for vit_msn

* revert: changes to dpt

* remove: extra space

* style: extra space

0ae789e0

Pass `use_cache` in kwargs for GPTNeoX (#30538) · c712d05a
Raushan Turganbay authored Apr 30, 2024
```
pass use_cache in kwargs
```
c712d05a

29 Apr, 2024 7 commits
- Include safetensors as part of `_load_best_model` (#30553) · a3aabc70
  Zach Mueller authored Apr 29, 2024
```
* Include safetensors

* Cleanup
```
  a3aabc70
- Reenable SDPA's FA2 During Training with torch.compile (#30442) · 9df8b301
  Benjamin Warner authored Apr 29, 2024
```
* Reenable SDPA's FA2 during training with torch.compile

* fix Olmo's SDPA FA2 dispatching too

* update formatting

* improved SDPA comment

* formatting and explanatory comment

* is_causal if statement to one-liner
```
  9df8b301
- Fix repo. fetch/checkout in PR slow CI job (#30537) · 87be06ca
  Yih-Dar authored Apr 29, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  87be06ca
- Update runner tag for PR slow CI (#30535) · c0242188
  Yih-Dar authored Apr 29, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  c0242188
- Fix broken link to Transformers notebooks (#30512) · bdbe1662
  clinty authored Apr 29, 2024
```
Co-authored-by: Clint Adams <clint@debian.org>
```
  bdbe1662
- Pass attn_implementation when using AutoXXX.from_config (#30507) · e8acb700
  amyeroberts authored Apr 29, 2024
```
* Pass attn_implementation when using AutoXXX.from_config

* Fix
```
  e8acb700
- Allow boolean FSDP options in fsdp_config (#30439) · 80126f98
  Howard Liberty authored Apr 29, 2024
```
* Allow boolean FSDP options in fsdp_config

* Use lower() to be safe
```
  80126f98
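
The idea in the FSDP commit above (accept both booleans and strings, and "use lower() to be safe") could be sketched like this. The helper name `to_bool` and the option key in the usage lines are hypothetical and only serve the illustration.

```python
def to_bool(value) -> bool:
    """Interpret a config value that may be a bool or a string such as "True"."""
    # str() makes real booleans and strings comparable; lower() ignores casing.
    return str(value).lower() == "true"


# Hypothetical config: the same option written in three different ways.
for raw in (True, "True", "true"):
    fsdp_config = {"some_option": raw}
    assert to_bool(fsdp_config["some_option"])
```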
26 Apr, 2024 11 commits

Fix link in dbrx.md (#30509) · 73014b56
Eitan Turok authored Apr 26, 2024

73014b56

[SegGPT] Fix seggpt image processor (#29550) · 6d4cabda

Eduardo Pacheco authored Apr 26, 2024

* Fixed SegGptImageProcessor to handle 2D and 3D prompt mask inputs

* Added new test to check prompt mask equivalence

* New proposal

* Better proposal

* Removed unnecessary method

* Updated seggpt docs

* Introduced do_convert_rgb

* nits

6d4cabda

load_image - decode b64encode and encodebytes strings (#30192) · c793b26f
amyeroberts authored Apr 26, 2024
```
* Decode b64encode and encodebytes strings

* Remove conditional encode -- image is always a string
```
c793b26f
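
As a small illustration of why one decoder can handle both encodings named in the commit above: `base64.encodebytes` inserts newlines into its output while `base64.b64encode` does not, and `base64.decodebytes` accepts both. The helper name below is hypothetical; whether the library uses exactly this call is an assumption.

```python
import base64


def decode_base64_image_string(image: str) -> bytes:
    """Decode a base64 string produced by either b64encode or encodebytes."""
    # The input is always a string here, so it is encoded to bytes unconditionally.
    return base64.decodebytes(image.encode())


raw = bytes(range(256))
single_line = base64.b64encode(raw).decode()   # no newlines
multi_line = base64.encodebytes(raw).decode()  # newline-wrapped output
assert decode_base64_image_string(single_line) == raw
assert decode_base64_image_string(multi_line) == raw
```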
Fix GroundingDINO, DPR after BERT SDPA update (#30506) · e7d52a10
amyeroberts authored Apr 26, 2024
```
Fix GroundingDINO, DPR after BERT SDPA update
```
e7d52a10

[examples] update whisper fine-tuning (#29938) · 38b53da3

Sanchit Gandhi authored Apr 26, 2024

* [examples] update whisper fine-tuning

* deprecate forced/suppress tokens

* item assignment

* update readme

* final fix

38b53da3

[`DETR`] Remove timm hardcoded logic in modeling files (#29038) · aafa7ce7

amyeroberts authored Apr 26, 2024



* Enable instantiating model with pretrained backbone weights

* Clarify pretrained import

* Use load_backbone instead

* Add backbone_kwargs to config

* Fix up

* Add tests

* Tidy up

* Enable instantiating model with pretrained backbone weights

* Update tests so backbone checkpoint isn't passed in

* Clarify pretrained import

* Update configs - docs and validation check

* Update src/transformers/utils/backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Clarify exception message

* Update config init in tests

* Add test for when use_timm_backbone=True

* Use load_backbone instead

* Add use_timm_backbone to the model configs

* Add backbone_kwargs to config

* Pass kwargs to constructors

* Draft

* Fix tests

* Add back timm - weight naming

* More tidying up

* Whoops

* Tidy up

* Handle when kwargs are none

* Update tests

* Revert test changes

* Deformable detr test - don't use default

* Don't mutate; correct model attributes

* Add some clarifying comments

* nit - grammar is hard

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

aafa7ce7

Remove skipping logic now that set_epoch exists (#30501) · 77ff304d
Zach Mueller authored Apr 26, 2024
```
* Remove skipping logic now that set_epoch exists

* Working version, clean
```
77ff304d

[`BERT`] Add support for sdpa (#28802) · dfa7b580

JB (Don) authored Apr 26, 2024

* Adding SDPA support for BERT

* Using the proper input name for testing model input in inference()

* Adding documentation for SDPA in BERT model page

* Use the stable link for the documentation

* Adding a gate to only call .contiguous() for torch < 2.2.0

* Additions and fixes to the documentation

* Minor updates to documentation

* Adding extra requirements needed for the contiguous() bug

* Adding "Adapted from" in place of the "Copied from"

* Add benchmark speedup tables to the documentation

* Minor fixes to the documentation

* Use ClapText as a replacement for Bert in the Copied-From

* Some more fixes for the fix-copies references

* Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage

[test all]

* Undo changes to separate test

* Refactored SDPA self attention code for KV projections

* Change use_sdpa to attn_implementation

* Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)

dfa7b580

Use the Keras set_random_seed in tests (#30504) · 2de5cb12
Matt authored Apr 26, 2024
```
Use the Keras set_random_seed to ensure reproducible weight initialization
```
2de5cb12
Update `dtype_byte_size` to handle torch.float8_e4m3fn/float8_e5m2 types (#30488) · 20081c74
Michael Goin authored Apr 26, 2024
```
* Update modeling_utils/dtype_byte_size to handle float8 types

* Add a test for dtype_byte_size

* Format

* Fix bool
```
20081c74
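
A rough sketch of how a byte size can be derived from a dtype name so that float8 names such as `torch.float8_e4m3fn` and `torch.float8_e5m2` are handled, as in the `dtype_byte_size` commit above. It works on plain strings to stay self-contained; the function name `dtype_byte_size_from_name` and the regular expression are illustrative assumptions, not necessarily what the library does.

```python
import re


def dtype_byte_size_from_name(dtype_name: str) -> float:
    """Return the size in bytes of one element, parsed from a dtype name."""
    # Booleans are treated as one bit in this sketch.
    if dtype_name.endswith("bool"):
        return 1 / 8
    # Take the first run of digits that follows a non-digit; the optional
    # underscore lets names like "float8_e4m3fn" match on the leading "8".
    match = re.search(r"[^\d](\d+)_?", dtype_name)
    if match is None:
        raise ValueError(f"`{dtype_name}` is not a recognised dtype name.")
    return int(match.group(1)) // 8
```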
Fix the `bitsandbytes` error formatting ("Some modules are dispatched on ...") (#30494) · 59e715f7
kyo authored Apr 26, 2024
```
Fix the `bitsandbytes` error when some modules are not properly offloaded.
```
59e715f7