Commits · 15585b81a525e3878321df4b13404c96690a2469 · chenpangpang / transformers

22 May, 2024 17 commits

Update object detection with latest resize and pad strategies (#30955) · 15585b81

Pavel Iakubovskii authored May 22, 2024

* Update with new resizing and pad strategy

* Return pixel mask param

* Update inference in guide

* Fix empty compose

* Update guide

15585b81

Paligemma causal attention mask (#30967) · a25f7d3c

Pablo Montalvo authored May 22, 2024



* PaliGemma working causal attention

* Formatting

* Style

* Docstrings + remove commented code

* Update docstring for PaliGemma Config

* PaliGemma - add separator ind to model/labels

* Refactor + docstring paligemma processor method

* Style

* return token type ids when tokenizing labels

* use token type ids when building causal mask

* add token type ids to tester

* remove separator from config

* fix style

* don't ignore separator

* add processor documentation

* simplify tokenization

* fix causal mask

* style

* fix label propagation, revert suffix naming

* fix style

* fix labels tokenization

* [run-slow]paligemma

* add eos if suffixes are present

* [run-slow]paligemma

* [run-slow]paligemma

* add missing tokens to fast version

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix style

* [run-slow]paligemma

---------
Co-authored-by: Peter Robicheaux <peter@roboflow.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

a25f7d3c

Fix link in Pipeline documentation (#30948) · d44e1ae0

Jun authored May 22, 2024



fix documentation as suggested by stevhliu
Co-authored-by: Jun <jun@reliant.ai>

d44e1ae0

[Whisper] Strip prompt before finding common subsequence (#27836) · 0948c827

Sanchit Gandhi authored May 22, 2024

0948c827

Generation: get special tokens from model config (#30899) · b1065aa0

Raushan Turganbay authored May 22, 2024

* fix

* let's do this way?

* codestyle

* update

* add tests

b1065aa0

legacy to init the slow tokenizer when converting from slow was wrong (#30972) · 1d568dfa

Arthur authored May 22, 2024

1d568dfa

Finally fix the missing new model failure CI report (#30968) · 1432f641

Yih-Dar authored May 22, 2024

fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

1432f641

🚨 out_indices always a list (#30941) · dff54ad2

amyeroberts authored May 22, 2024

* out_indices always a list

* Update src/transformers/utils/backbone_utils.py

* Update src/transformers/utils/backbone_utils.py

* Move type casting

* nit

dff54ad2
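
The commit above states only that `out_indices` is now always a list and that a type cast was moved. As a minimal sketch of what such a normalization might look like, assuming the value could previously arrive as a tuple or other iterable (the commit message does not say), the hypothetical helper `normalize_out_indices` below casts any iterable of indices to a list. It is an illustration, not the code in `backbone_utils.py`.

```python
from typing import Iterable, Optional


def normalize_out_indices(out_indices: Optional[Iterable[int]]) -> Optional[list[int]]:
    """Hypothetical helper: return out_indices as a list, or None if unset."""
    if out_indices is None:
        return None
    # Casting here means downstream code can rely on list semantics
    # (equality with other lists, JSON serialization) whatever was passed in.
    return list(out_indices)


print(normalize_out_indices((1, 2, 3)))  # [1, 2, 3]
print(normalize_out_indices([0, -1]))    # [0, -1]
print(normalize_out_indices(None))       # None
```

One reason to prefer a single normalized type is that `(1, 2) == [1, 2]` is `False` in Python, so comparisons against a config reloaded from JSON (where sequences come back as lists) would otherwise fail.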

Paligemma - fix slow tests, add bf16 and f16 slow tests (#30851) · 250ae9f7

Pablo Montalvo authored May 22, 2024

* fix slow tests, add bf16 and f16 slow tests

* few fixes

* [run-slow]paligemma

* add gate decorator

* [run-slow]paligemma

* add missing gating

* [run-slow]paligemma

* [run-slow]paligemma

250ae9f7

[whisper] only trigger forced ids warning once (#30966) · ada86f97

Sanchit Gandhi authored May 22, 2024

ada86f97

Avoid extra chunk in speech recognition (#29539) · 15185084

Jonatan Kłosko authored May 22, 2024

15185084

[doc] Add references to the fine-tuning blog and distil-whisper to Whisper. (#30938) · 24d2a5e1

Vaibhav Srivastav authored May 22, 2024

[doc] Add references to the fine-tuning blog and distil-whisper to Whisper doc.

24d2a5e1

Fix low cpu mem usage tests (#30808) · 5c186003

Marc Sun authored May 22, 2024

* Fix tests

* fix udop failing test

* remove skip

* style

5c186003

Update video-llava docs (#30935) · 934e1b84

Raushan Turganbay authored May 22, 2024



* update video-llava

* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

934e1b84

Bump requests from 2.31.0 to 2.32.2 in /examples/research_projects/lxmert (#30956) · edb14eba

dependabot[bot] authored May 22, 2024


```yaml
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
```
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

edb14eba

Update build ci image [push-ci-image] (#30933) · 8e8786e5

Arthur authored May 22, 2024

* [build-ci-image]

* correct branch

* push ci image

* [build-ci-image]

* update scheduled as well

* [push-ci-image]

* [build-ci-image]

* [push-ci-image]

* update deps

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* oups [build-ci-image]

* [push-ci-image]

* fix

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* updated

* [build-ci-image] update tag

* [build-ci-image]

* [build-ci-image]

* fix tag

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* [build-ci-image]

* github name

* commit_title?

* fetch

* update

* it not found

* dev

* dev

* [push-ci-image]

* dev

* dev

* update

* dev

* dev print dev commit message dev

* dev ? dev

* dev

* dev

* dev

* dev

* [build-ci-image]

* [build-ci-image]

* [push-ci-image]

* revert unwanted

* revert convert as well

* no you are not important

* [build-ci-image]

* Update .circleci/config.yml

* pin tf probability dev

8e8786e5

update ruff version (#30932) · 673440d0

Arthur authored May 22, 2024



* update ruff version

* fix research projects

* Empty

* Fix errors

---------
Co-authored-by: Lysandre <lysandre@huggingface.co>

673440d0

21 May, 2024 12 commits

🚨 [Idefics2] Update ignore index (#30898) · 60bb571e

NielsRogge authored May 21, 2024

* Update ignore index

* Update docs

* Update docs

60bb571e

Fix inhomogeneous shape error in example (#30434) · 5bf9caa0

Lu Teng authored May 22, 2024

Fix inhomogeneous shape error in example.

5bf9caa0

Fix swin embeddings interpolation (#30936) · d24097e0

amyeroberts authored May 21, 2024

d24097e0

TST / Workflows: Get slack notifications for docker image build (#30891) · eae2b6b8

Younes Belkada authored May 21, 2024

* Get slack notifications for docker image build

* Apply suggestions from code review

* Apply suggestions from code review

eae2b6b8

[Benchmark] Reuse `optimum-benchmark` (#30615) · 64e0573a

Yih-Dar authored May 21, 2024



* benchmark

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

64e0573a

fix: center_crop occasionally outputs off-by-one dimension matrix (#30934) · 3b09d3f0

Matthew Beckers authored May 21, 2024

If required padding for a crop larger than input image is odd-numbered,
the padding would be rounded down instead of rounded up, causing the
output dimension to be one smaller than it should be.

3b09d3f0
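
The rounding problem described in the `center_crop` fix above can be reproduced on a single axis. The sketch below is a simplified model of the arithmetic, assuming the crop is computed by placing the image on a padded canvas and then slicing the crop window out of it; `cropped_length` and its `round_up` flag are hypothetical names for illustration, not the library's API.

```python
import math


def cropped_length(orig: int, crop: int, round_up: bool) -> int:
    """Length of a centered crop along one axis, padding when crop > orig."""
    # Crop window in the original image's coordinates.
    # The start is negative when the crop is larger than the image.
    start = (orig - crop) // 2
    end = start + crop

    # Canvas large enough to hold both the image and the crop window.
    canvas = max(orig, crop)
    diff = canvas - orig

    # Leading padding: the reported bug is rounding this down when diff is odd.
    pad = math.ceil(diff / 2) if round_up else diff // 2

    # Shift the window into canvas coordinates and clamp it to the canvas.
    start += pad
    end += pad
    return min(canvas, end) - max(0, start)


print(cropped_length(4, 7, round_up=False))  # 6 (one short: padding of 3 is odd)
print(cropped_length(4, 7, round_up=True))   # 7
print(cropped_length(4, 8, round_up=False))  # 8 (even padding is unaffected)
```

With an odd padding total, rounding down leaves the shifted window starting at -1, so clamping to the canvas drops one element; rounding up aligns the window with the canvas start.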

Enforce saving at end of training if saving option chosen (#30160) · daf281f4

Zach Mueller authored May 21, 2024

* Enforce saving at end of training

* Fix test

* Rework test

* Fixup tests

* Update comment based on sourab feedback

* Clean

daf281f4

CI: AMD MI300 tests fix (#30797) · 7a4792e6

Mohit Sharma authored May 21, 2024

* add fix

* update import

* updated dicts and comments

* remove prints

* Update testing_utils.py

7a4792e6

PaliGemma - fix processor with no input text (#30916) · a7557455

hoshi-hiyouga authored May 21, 2024

Update processing_paligemma.py

a7557455

Bump requests from 2.31.0 to 2.32.0 in /examples/research_projects/decision_transformer (#30925) · d502bd64

dependabot[bot] authored May 21, 2024


```yaml
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
```
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

d502bd64

FEAT / Trainer: LOMO optimizer support (#30178) · 8871b261

Younes Belkada authored May 21, 2024



* add V1 - adalomo not working yet

* add todo docs + refactor from comments

* adjust LR

* add docs

* add more elaborated test

* Apply suggestions from code review
Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix

* push

* add accelerate check

* fix DDP case

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix

* init kwargs

* safely add attribute

* revert to enum logic

* Update src/transformers/trainer.py

---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

8871b261

FIX / TST: Fix expected results on Mistral slow test (A10) (#30909) · c876d121

Younes Belkada authored May 21, 2024

Update test_modeling_mistral.py

c876d121

20 May, 2024 11 commits

[docs] Spanish translation of model_memory_anatomy.md (#30885) · 0df888ff

Aaron Jimenez authored May 20, 2024

* add model_memory_anatomy to es/_toctree.yml

* copy model_memory_anatomy.md to es/

* translate first section

* translate doc

* change forward activations

* fix sentence and link to Trainer

* fix Trainer link

0df888ff

Add torch.compile for Mistral (#30642) · 616bb11d

Longjie Zheng authored May 20, 2024

* first version

* fix sliding window

* fix style

* add sliding window cache

* fix style

* address comments

* fix test

* fix style

* move sliding window check inside cache init

* revert changes on irrelevant files & add comment on SlidingWindowCache

* address comments & fix style

fix style

* update causal mask

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] llama

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] mistral

* revert CI from a10 to t4

* wrap up

616bb11d

Introduce configured_state arg for accelerator_config (#29781) · 92d1d97c

Zach Mueller authored May 20, 2024



* Introduce configured_state

* Include note on tuning

* Allow for users to have defined a state already

* Include tests

* Add note on hpam tune

* Guard a bit better

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Finish rebase

* Finish rebase

* Guard carefully

* Fixup test

* Refactor

* Fin refactor

* Comment

* Update wrt feedback

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

92d1d97c

`tokenizer_class = "AutoTokenizer"` Llava Family (#30912) · bb48e921

Arthur authored May 20, 2024

propagate changes to more models

bb48e921

Fix a shape annotation and typos in `mamba` slow forward (#30691) · 76e05301

Anton Vlasjuk authored May 20, 2024

* fix typos and one shape comment

* fix `intermediade` typo in jamba

76e05301

Add AutoFeatureExtractor support to Wav2Vec2ProcessorWithLM (#28706) · e6708709

Yoach Lacombe authored May 20, 2024

* Add AutoFeatureExtractor support to Wav2Vec2ProcessorWithLM

* update with a type filter

* add raises error test

* fix added test

e6708709

fix for custom pipeline configuration (#29004) · c11ac785

Hafedh authored May 20, 2024

* fix for custom pipeline configuration

* fix for custom pipelines

* remove extra exception

* added test for custom pipelines extra tag

* format with ruff

* limit extra tag for first time only

* format with ruff

* improve tests for custom pipelines

c11ac785

separate kwargs in processor (similar to #30193) (#30905) · 7b4b4564

Eric2i authored May 20, 2024

* Fix similar bug in processor (related to #30193)

* Reformat processing_git.py to comply with ruff formatting

7b4b4564

Fix num_hidden_layers in initialization of new model in Mamba (#30403) · 18349164

Goncalo Paulo authored May 20, 2024

Fix num_hidden_layers in initialization

Originally, the initialization was using config.num_layers instead of config.num_hidden_layers. This fixes that.

18349164

add return_token_timestamps to WhisperProcessor (#30812) · 1c2bb3ac

Kamil Akesbi authored May 20, 2024



* compute num_frames in WhisperFeatureExtractor

* add return_num_frames in WhisperFeatureProcessor + adapt pipeline

* return_timestamps renaming + pipeline fix

* fix

* fix

* fix

* add tests

* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* apply review changes

* fix

* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update tests/models/whisper/test_modeling_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* apply review

* fix

* review changes

* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make style quality

* EXPECTED_OUTPUT in single line

* small numpy->torch fix

* fix

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

1c2bb3ac

DeformableDETR two stage support bfloat16 (#30907) · 66b0d9ee

Donggeun Yu authored May 20, 2024

Update modeling_deformable_detr.py

66b0d9ee