Commits · acbfaf69ccca28efbde397ec55ce9f2c4ee8b509 · chenpangpang / transformers

24 May, 2024 8 commits

Yih-Dar authored May 24, 2024



* allow multi-gpu

* allow multi-gpu

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

acbfaf69

FIX / TST: Fix expected results on Mistral AWQ test (#30971) · ae87f979
Marc Sun authored May 24, 2024
```
fix awq mistral test
```
ae87f979
[tests] make `test_model_parallelism` device-agnostic (#30844) · 04c7c176
Fanli Lin authored May 24, 2024
```
* enable on xpu

* fix style

* add comment and mps
```
04c7c176

Perceiver interpolate position embedding (#30979) · 42d8dd87

Yixiang Gao authored May 24, 2024



* add test that currently fails

* test passed

* all perceiver passed

* fixup, style, quality, repo-consistency, all passed

* Apply suggestions from code review: default to False + compute sqrt once only
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix a minor bracket

* replace dim with self._num_channels

* add arguments to the rest preprocessors

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

42d8dd87

pin `uv==0.1.45` (#31006) · 5855afd1

Yih-Dar authored May 24, 2024



* fix

* [push-ci-image]

* run with latest

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

5855afd1

Do not trigger autoconversion if local_files_only (#31004) · 03935d30
Lucain authored May 24, 2024

03935d30

Fix training speed regression introduced by "optimize VRAM for calculating... · 21e259d8

Kevin Koehncke authored May 24, 2024

Fix training speed regression introduced by "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139)" (#30988)

* Revert "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139)"

This reverts commit a7e0ed82.

* Instead of reverting commit, wrap indexing in torch.no_grad context

* Apply wrapping in LayoutLMv2

* Add comments explaining reason for no_grad

* Fix code format

---------
Co-authored-by: Kevin Koehncke <kevin.koehncke@uipath.com>

21e259d8

add prefix space ignored in llama #29625 (#30964) · 7f6e8741

Ita Zaporozhets authored May 24, 2024



* add prefix space ignored in llama #29625

* adding test with add_prefix_space=False

* ruff

---------
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>

7f6e8741

23 May, 2024 15 commits

Bugfix: WandbCallback uploads initial model checkpoint (#30897) · 6657fb5f

Matthias Gerstgrasser authored May 23, 2024

* fix wandb always uploading initial model

* Update comment.

* Optionally log initial model

* Revert "Optionally log initial model"

This reverts commit 9602cc1fad3feaf218f82a7339a194d3d2fbb946.

6657fb5f

Remove deprecated properties in tokenization_nllb.py and tokenization_nllb_fast.py (#29834) · 6d3d5b10

Yasmin Moslem authored May 23, 2024

* Fix typo in tokenization_nllb.py

Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.

* Fix typo in tokenization_nllb_fast.py

Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.

* Remove deprecated attributes in tokenization_nllb.py

Remove deprecated attributes: `lang_code_to_id`, `fairseq_tokens_to_ids`, `id_to_lang_code`, and `fairseq_ids_to_tokens`

* Remove deprecated attribute in tokenization_nllb_fast.py

Remove deprecated attribute `lang_code_to_id`

* Remove deprecated properties in tokenization_nllb.py

Remove deprecated properties - fix format

* Remove deprecated properties in tokenization_nllb_fast.py

Remove deprecated properties - fix format

* Update test_tokenization_nllb.py

* update test_tokenization_nllb.py

* Update tokenization_nllb.py

* Update test_tokenization_seamless_m4t.py

* Update test_tokenization_seamless_m4t.py

6d3d5b10

[Port] TensorFlow implementation of Mistral (#29708) · 965e98dc

Aritra Roy Gosthipaty authored May 23, 2024



* chore: initial commit

* chore: adding imports and inits

* chore: adding the causal and classification code

* chore: adding names to the layers

* chore: using single self attn layer

* chore: built the model and layers

* chore: start with testing

* chore: docstring change, transpose fix

* fix: rotary embedding

* chore: adding cache implementation

* remove unused torch

* chore: fixing the indexing issue

* make fix-copies

* Use modeling_tf_utils.keras

* make fixup

* chore: fixing tests

* chore: adding past key value logic

* chore: adding multi label classification test

* fix: switching on the built parameters in the layers

* fixing repo consistency

* ruff formats

* style changes

* fix: tf and pt equivalence

* removing returns from docstrings

* fix docstrings

* fix docstrings

* removing todos

* fix copies

* fix docstring

* fix docstring

* chore: using easier rotate_half

* adding integration tests

* chore: addressing review related to rotary embedding layer

* review changes

* [run-slow] mistral

* skip: test save load after resize token embedding

* style

---------
Co-authored-by: Matt <rocketknight1@gmail.com>

965e98dc

Update 4 `MptIntegrationTests` expected outputs (#30989) · 2a89673f

Yih-Dar authored May 23, 2024



* fix

* fix

* fix

* fix

* fix

* [run-slow] mpt

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

2a89673f

Add a check that warmup_steps is either 0 or >= 1 (#30764) · 892b13d3

Yasmin Moslem authored May 23, 2024



* Add a check that warmup_steps is either 0 or >= 1

Update training_args.py to add a check that warmup_steps is either 0 or >= 1. Otherwise, raise an error.

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

892b13d3
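
The check described in #30764 can be sketched in plain Python. This is a minimal illustration, not the code in `training_args.py`: the class name `WarmupConfig` is hypothetical, and the wording of the error message is an assumption.

```python
from dataclasses import dataclass


@dataclass
class WarmupConfig:
    """Hypothetical stand-in for the relevant part of the training arguments."""

    warmup_steps: float = 0

    def __post_init__(self) -> None:
        # Accept exactly 0 (no warmup) or any value >= 1 (a step count).
        # Anything else, e.g. 0.1 or -5, is rejected.
        if not (self.warmup_steps == 0 or self.warmup_steps >= 1):
            raise ValueError(
                f"warmup_steps must be either 0 or >= 1, got {self.warmup_steps}"
            )
```

Under these assumptions, `WarmupConfig(warmup_steps=500)` is accepted, while `WarmupConfig(warmup_steps=0.1)` raises `ValueError`.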

[tests] add `torch.use_deterministic_algorithms` for XPU (#30774) · 21339a52
Fanli Lin authored May 23, 2024
```
* add xpu check

* add marker

* add documentation

* update doc

* fix ci

* remove from global init

* fix
```
21339a52

Fix accelerate failing tests (#30836) · 8366b572

Marc Sun authored May 23, 2024

* Fix accelerate tests

* fix clip

* skip dbrx tests

* fix GPTSan

* fix M2M100Model

* same fix as jamba

* fix mt5

* Fix T5Model

* Fix umt5 model

* fix switch_transformers

* fix whisper

* fix gptsan again

* fix siglip recent test

* skip siglip tests

* wrong place fixed

8366b572

FIX / Docs: Minor changes in quantization docs (#30985) · 5a74ae6d

Younes Belkada authored May 23, 2024



* Change in quantization docs

* Update overview.md

* Update docs/source/en/quantization/overview.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

5a74ae6d

Finish adding support for torch.compile dynamic shapes (#30919) · 046c2ad7
Benjamin Warner authored May 23, 2024
```
add torch.compile dynamic support
```
046c2ad7
test_custom_4d_attention_mask skip with sliding window attn (#30833) · 6739e1d2
Poedator authored May 23, 2024

6739e1d2

Docs / Quantization: refactor quantization documentation (#30942) · 87a35181

Younes Belkada authored May 23, 2024



* refactor quant docs

* delete file

* rename to overview

* fix

* fix table

* fix

* add content

* fix library versions

* fix table

* fix table

* fix table

* fix table

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* replace to quantization_config

* fix aqlm snippet

* add DLAI courses

* fix

* fix table

* fix bullet points

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

87a35181

Quantized KV Cache (#30483) · d583f131

Raushan Turganbay authored May 23, 2024



* clean-up

* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* more suggestions

* mapping if torch available

* run tests & add 'support_quantized' flag

* fix jamba test

* revert, will be fixed by another PR

* codestyle

* HQQ and versatile cache classes

* final update

* typo

* make tests happy

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

d583f131

Bump requests from 2.31.0 to 2.32.2 in /examples/research_projects/visual_bert (#30983) · e05baad8

dependabot[bot] authored May 23, 2024

Bump requests in /examples/research_projects/visual_bert

Bumps [requests](https://github.com/psf/requests) from 2.31.0 to 2.32.2.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.31.0...v2.32.2)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

e05baad8

Push ci image (#30982) · 4ef85fee

Arthur authored May 23, 2024

* [build-ci-image]

* correct branch

* push ci image

* [build-ci-image]

* update scheduled as well

* [push-ci-image]

* [build-ci-image]

* [push-ci-image]

* update deps

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* oups [build-ci-image]

* [push-ci-image]

* fix

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* updated

* [build-ci-image] update tag

* [build-ci-image]

* [build-ci-image]

* fix tag

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* github name

* commit_title?

* fetch

* update

* it not found

* dev

* dev

* [push-ci-image]

* dev

* dev

* update

* dev

* dev print dev commit message dev

* dev ? dev

* dev

* dev

* dev

* dev

* [build-ci-image]

* [build-ci-image]

* [push-ci-image]

* revert unwanted

* revert convert as well

* no you are not important

* [build-ci-image]

* Update .circleci/config.yml

* pin tf probability dev

* [push-ci-image] skip

* [push-ci-image] test

* [push-ci-image]

* fix

* device

4ef85fee

Using assistant in AutomaticSpeechRecognitionPipeline with different encoder size (#30637) · eb1a77bb

Kamil Akesbi authored May 23, 2024



* fix input to generate in pipeline

* fixup

* pass input_features to generate with assistant

* error if model and assistant with different enc size

* fix

* apply review suggestions

* use self.config.is_encoder_decoder

* pass inputs to generate directly

* add slow tests

* Update src/transformers/generation/utils.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* apply review

* Update src/transformers/generation/utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* apply code review

* update attributes encoder_xyz to check

* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add slow test

* solve conflicts

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

eb1a77bb

22 May, 2024 17 commits

Update object detection with latest resize and pad strategies (#30955) · 15585b81

Pavel Iakubovskii authored May 22, 2024

* Update with new resizing and pad strategy

* Return pixel mask param

* Update inference in guide

* Fix empty compose

* Update guide

15585b81

Paligemma causal attention mask (#30967) · a25f7d3c

Pablo Montalvo authored May 22, 2024



* PaliGemma working causal attention

* Formatting

* Style

* Docstrings + remove commented code

* Update docstring for PaliGemma Config

* PaliGemma - add separator ind to model/labels

* Refactor + docstring paligemma processor method

* Style

* return token type ids when tokenizing labels

* use token type ids when building causal mask

* add token type ids to tester

* remove separator from config

* fix style

* don't ignore separator

* add processor documentation

* simplify tokenization

* fix causal mask

* style

* fix label propagation, revert suffix naming

* fix style

* fix labels tokenization

* [run-slow]paligemma

* add eos if suffixes are present

* [run-slow]paligemma

* [run-slow]paligemma

* add missing tokens to fast version

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix style

* [run-slow]paligemma

---------
Co-authored-by: Peter Robicheaux <peter@roboflow.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

a25f7d3c

Fix link in Pipeline documentation (#30948) · d44e1ae0

Jun authored May 22, 2024



fix documentation as suggested by stevhliu
Co-authored-by: Jun <jun@reliant.ai>

d44e1ae0

[Whisper] Strip prompt before finding common subsequence (#27836) · 0948c827
Sanchit Gandhi authored May 22, 2024

0948c827
Generation: get special tokens from model config (#30899) · b1065aa0
Raushan Turganbay authored May 22, 2024
```
* fix

* let's do this way?

* codestyle

* update

* add tests
```
b1065aa0
legacy to init the slow tokenizer when converting from slow was wrong (#30972) · 1d568dfa
Arthur authored May 22, 2024

1d568dfa
Finally fix the missing new model failure CI report (#30968) · 1432f641
Yih-Dar authored May 22, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
1432f641

🚨 out_indices always a list (#30941) · dff54ad2

amyeroberts authored May 22, 2024

* out_indices always a list

* Update src/transformers/utils/backbone_utils.py

* Update src/transformers/utils/backbone_utils.py

* Move type casting

* nit

dff54ad2
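
The idea behind #30941 can be illustrated with a small normalization helper. This is a sketch only: `normalize_out_indices` is a hypothetical name, and leaving `None` untouched is an assumption rather than the confirmed behaviour of `backbone_utils.py`.

```python
from typing import Iterable, List, Optional


def normalize_out_indices(out_indices: Optional[Iterable[int]]) -> Optional[List[int]]:
    """Return out_indices as a list, leaving None untouched (assumed behaviour)."""
    if out_indices is None:
        return None
    # Tuples, ranges and other iterables are all cast to a plain list,
    # so downstream code only ever has to handle one container type.
    return list(out_indices)
```

With this helper, a tuple such as `(0, 2)` and a list `[0, 2]` both come out as the same list, which is the kind of consistency the commit title describes.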

Paligemma - fix slow tests, add bf16 and f16 slow tests (#30851) · 250ae9f7

Pablo Montalvo authored May 22, 2024

* fix slow tests, add bf16 and f16 slow tests

* few fixes

* [run-slow]paligemma

* add gate decorator

* [run-slow]paligemma

* add missing gating

* [run-slow]paligemma

* [run-slow]paligemma

250ae9f7

[whisper] only trigger forced ids warning once (#30966) · ada86f97
Sanchit Gandhi authored May 22, 2024

ada86f97
Avoid extra chunk in speech recognition (#29539) · 15185084
Jonatan Kłosko authored May 22, 2024

15185084
[doc] Add references to the fine-tuning blog and distil-whisper to Whisper. (#30938) · 24d2a5e1
Vaibhav Srivastav authored May 22, 2024
```
[doc] Add references to the fine-tuning blog and distil-whisper to Whisper doc.
```
24d2a5e1
Fix low cpu mem usage tests (#30808) · 5c186003
Marc Sun authored May 22, 2024
```
* Fix tests

* fix udop failing test

* remove skip

* style
```
5c186003

Update video-llava docs (#30935) · 934e1b84

Raushan Turganbay authored May 22, 2024



* update video-llava

* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

934e1b84

Bump requests from 2.31.0 to 2.32.2 in /examples/research_projects/lxmert (#30956) · edb14eba

dependabot[bot] authored May 22, 2024


```yaml
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
```
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

edb14eba

Update build ci image [push-ci-image] (#30933) · 8e8786e5

Arthur authored May 22, 2024

* [build-ci-image]

* correct branch

* push ci image

* [build-ci-image]

* update scheduled as well

* [push-ci-image]

* [build-ci-image]

* [push-ci-image]

* update deps

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* oups [build-ci-image]

* [push-ci-image]

* fix

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* updated

* [build-ci-image] update tag

* [build-ci-image]

* [build-ci-image]

* fix tag

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* github name

* commit_title?

* fetch

* update

* it not found

* dev

* dev

* [push-ci-image]

* dev

* dev

* update

* dev

* dev print dev commit message dev

* dev ? dev

* dev

* dev

* dev

* dev

...

8e8786e5

update ruff version (#30932) · 673440d0

Arthur authored May 22, 2024



* update ruff version

* fix research projects

* Empty

* Fix errors

---------
Co-authored-by: Lysandre <lysandre@huggingface.co>

673440d0