Commits · 64e0573a8133db833521db9f23d05e36fc06c5f3 · chenpangpang / transformers

21 May, 2024 8 commits

[Benchmark] Reuse `optimum-benchmark` (#30615) · 64e0573a

Yih-Dar authored May 21, 2024



* benchmark

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

64e0573a

fix: center_crop occasionally outputs off-by-one dimension matrix (#30934) · 3b09d3f0

Matthew Beckers authored May 21, 2024

If the padding required for a crop larger than the input image is odd-numbered,
the padding was rounded down instead of up, making the output dimension
one smaller than it should be.

3b09d3f0
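
The rounding problem described in the center_crop fix (#30934) can be modelled along a single dimension. This is a minimal sketch, not the library's code: `cropped_length` and its `round_up` flag are hypothetical names, and the index arithmetic is an assumption about how a pad-then-crop routine might be written.

```python
import math


def cropped_length(orig: int, crop: int, round_up: bool) -> int:
    """Length of a 1-D center crop when the crop may exceed the input.

    Hypothetical model: pad the input up to `crop`, shift the crop window
    by the leading pad, then clamp the window to the padded buffer.
    """
    top = (orig - crop) // 2      # negative when crop > orig (floor division)
    bottom = top + crop
    new_len = max(crop, orig)     # length of the padded buffer
    pad_total = new_len - orig
    # The leading pad is the piece whose rounding direction matters.
    top_pad = math.ceil(pad_total / 2) if round_up else pad_total // 2
    top += top_pad
    bottom += top_pad
    return min(new_len, bottom) - max(0, top)


# Odd padding (7 - 4 = 3): rounding down loses one element.
print(cropped_length(4, 7, round_up=False))  # 6
print(cropped_length(4, 7, round_up=True))   # 7
# Even padding (8 - 4 = 4): both roundings agree.
print(cropped_length(4, 8, round_up=False))  # 8
```

In this model the window start is floored toward negative infinity, so the leading pad has to round the other way to bring the window back to index 0. That is why only odd padding amounts show the off-by-one.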

Enforce saving at end of training if saving option chosen (#30160) · daf281f4

Zach Mueller authored May 21, 2024

* Enforce saving at end of training

* Fix test

* Rework test

* Fixup tests

* Update comment based on sourab feedback

* Clean

daf281f4

CI: AMD MI300 tests fix (#30797) · 7a4792e6

Mohit Sharma authored May 21, 2024

* add fix

* update import

* updated dicts and comments

* remove prints

* Update testing_utils.py

7a4792e6

PaliGemma - fix processor with no input text (#30916) · a7557455
hoshi-hiyouga authored May 21, 2024
```
Update processing_paligemma.py
```
a7557455

Bump requests from 2.31.0 to 2.32.0 in /examples/research_projects/decision_transformer (#30925) · d502bd64

dependabot[bot] authored May 21, 2024


```yaml
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
```
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

d502bd64

FEAT / Trainer: LOMO optimizer support (#30178) · 8871b261

Younes Belkada authored May 21, 2024



* add V1 - adalomo not working yet

* add todo docs + refactor from comments

* adjust LR

* add docs

* add more elaborated test

* Apply suggestions from code review
Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix

* push

* add accelerate check

* fix DDP case

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix

* init kwargs

* safely add attribute

* revert to enum logic

* Update src/transformers/trainer.py

---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

8871b261

FIX / TST: Fix expected results on Mistral slow test (A10) (#30909) · c876d121
Younes Belkada authored May 21, 2024
```
Update test_modeling_mistral.py
```
c876d121

20 May, 2024 15 commits

[docs] Spanish translation of model_memory_anatomy.md (#30885) · 0df888ff

Aaron Jimenez authored May 20, 2024

* add model_memory_anatomy to es/_toctree.yml

* copy model_memory_anatomy.md to es/

* translate first section

* translate doc

* change forward activations

* fix sentence and link to Trainer

* fix Trainer link

0df888ff

Add torch.compile for Mistral (#30642) · 616bb11d

Longjie Zheng authored May 20, 2024

* first version

* fix sliding window

* fix style

* add sliding window cache

* fix style

* address comments

* fix test

* fix style

* move sliding window check inside cache init

* revert changes on irrelevant files & add comment on SlidingWindowCache

* address comments & fix style

fix style

* update causal mask

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] llama

* [run-slow] mistral

* [run-slow] mistral

* [run-slow] mistral

* revert CI from a10 to t4

* wrap up

616bb11d

Introduce configured_state arg for accelerator_config (#29781) · 92d1d97c

Zach Mueller authored May 20, 2024



* Introduce configured_state

* Include note on tuning

* Allow for users to have defined a state already

* Include tests

* Add note on hpam tune

* Guard a bit better

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Finish rebase

* Finish rebase

* Guard carefully

* Fixup test

* Refactor

* Fin refactor

* Comment

* Update wrt feedback

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

92d1d97c

`tokenizer_class = "AutoTokenizer"` Llava Family (#30912) · bb48e921
Arthur authored May 20, 2024
```
propagate changes to more models
```
bb48e921
Fix a shape annotation and typos in `mamba` slow forward (#30691) · 76e05301
Anton Vlasjuk authored May 20, 2024
```
* fix typos and one shape comment

* fix `intermediade` typo in jamba
```
76e05301

Add AutoFeatureExtractor support to Wav2Vec2ProcessorWithLM (#28706) · e6708709

Yoach Lacombe authored May 20, 2024

* Add AutoFeatureExtractor support to Wav2Vec2ProcessorWithLM

* update with a type filter

* add raises error test

* fix added test

e6708709

fix for custom pipeline configuration (#29004) · c11ac785

Hafedh authored May 20, 2024

* fix for custom pipeline configuration

* fix for custom pipelines

* remove extra exception

* added test for custom pipelines extra tag

* format with ruff

* limit extra tag for first time only

* format with ruff

* improve tests for custom pipelines

c11ac785

separate kwargs in processor (similar to #30193) (#30905) · 7b4b4564

Eric2i authored May 20, 2024

* Fix similar bug in processor (related to #30193)

* Reformat processing_git.py to comply with ruff formatting

7b4b4564

Fix num_hidden_layers in initialization of new model in Mamba (#30403) · 18349164

Goncalo Paulo authored May 20, 2024

Fix num_hidden_layers in initialization

Originally, the initialization was using config.num_layers instead of config.num_hidden_layers. This fixes that.

18349164
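
The kind of mismatch described in the Mamba fix (#30403) can be illustrated with a toy config. This is a sketch under stated assumptions: `ToyConfig`, `rescale_weights` and `rescale_weights_wrong_name` are hypothetical names, and scaling by the square root of the layer count is an assumed rule for illustration only, not something the commit message states.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyConfig:
    # Hypothetical stand-in for a model config; only this field is defined.
    num_hidden_layers: int = 4


def rescale_weights(weights: list[float], config: ToyConfig) -> list[float]:
    # Assumed rule for illustration: divide by sqrt(number of layers).
    scale = math.sqrt(config.num_hidden_layers)
    return [w / scale for w in weights]


def rescale_weights_wrong_name(weights: list[float], config: ToyConfig) -> list[float]:
    # Reads `num_layers`, a field the toy config does not define.
    scale = math.sqrt(config.num_layers)
    return [w / scale for w in weights]


config = ToyConfig(num_hidden_layers=4)
print(rescale_weights([1.0, 2.0], config))  # [0.5, 1.0]

try:
    rescale_weights_wrong_name([1.0, 2.0], config)
except AttributeError as err:
    print("wrong field name:", err)
```

With a plain dataclass the wrong name fails loudly. How the library's own config class responds to an unknown attribute may differ, so this shows only the naming mismatch, not the exact failure the fix addressed.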

add return_token_timestamps to WhisperProcessor (#30812) · 1c2bb3ac

Kamil Akesbi authored May 20, 2024



* compute num_frames in WhisperFeatureExtractor

* add return_num_frames in WhisperFeatureProcessor + adapt pipeline

* return_timestamps renaming + pipeline fix

* fix

* fix

* fix

* add tests

* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* apply review changes

* fix

* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update tests/models/whisper/test_modeling_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* apply review

* fix

* review changes

* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make style quality

* EXPECTED_OUTPUT in single line

* small numpy->torch fix

* fix

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

1c2bb3ac

DeformableDETR two stage support bfloat16 (#30907) · 66b0d9ee
Donggeun Yu authored May 20, 2024
```
Update modeling_deformable_detr.py
```
66b0d9ee

LLaVa-Next: Update docs with batched inference (#30857) · 5d0bf59b

Raushan Turganbay authored May 20, 2024



* update docs with batch ex

* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* accept nested list of img

---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

5d0bf59b

Add support for torch.compile dynamic shapes (#30560) · cd6bd0af

Benjamin Warner authored May 20, 2024

* add torch.compile dynamic support

* Add SDPA dynamic shapes compile test & improve SDPA comment

* comment consistency

cd6bd0af

FIX / Quantization: Fix Dockerfile build (#30890) · fce78fd0
Younes Belkada authored May 20, 2024
```
* Update Dockerfile

* Update docker/transformers-quantization-latest-gpu/Dockerfile
```
fce78fd0

Add TokenClassification for Mistral, Mixtral and Qwen2 (#29878) · 07bf2dff

Joseph Enguehard authored May 20, 2024



* Add MistralForTokenClassification

* Add tests and docs

* Add token classification for Mixtral and Qwen2

* Save llama for token classification draft

* Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2

* Formatting

* Add token classification support for Qwen2Moe model

* Add dropout layer to each ForTokenClassification model

* Add copied from in tests

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Propagate suggested changes

* Style

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

07bf2dff

17 May, 2024 10 commits

Enable dynamic resolution input for Swin Transformer and variants (#30656) · 481a9578

Abhiroop Tejomay authored May 17, 2024



* add interpolation of positional encoding support to swin

* add style changes

* use default image processor and make size a dictionary
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove logits testing
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Refactor image size validation logic when interpolation is disabled
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove asserts in modeling
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add dynamic resolution input support to swinv2

* change size to ensure interpolation encoding path is triggered

* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set interpolate_pos_encoding default value to False

* add dynamic resolution input to donut swin

* add dynamic resolution input to maskformer swin

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

481a9578

v4.42.dev.0 · b6eb708b
Arthur Zucker authored May 17, 2024

b6eb708b

Add fixed resize and pad strategy for object detection (#30742) · bf646fbf

Pavel Iakubovskii authored May 17, 2024

* Add resize and pad strategy

* Merge get_size functions

* Add pad_size + tests to object detection models

* Fixup

* Update docstrings

* Fixup

bf646fbf

update release script (#30880) · e9a8041d
Arthur authored May 17, 2024
```
* update release script

* update release script
```
e9a8041d

Support arbitrary processor (#30875) · 0a9300f4

Arthur authored May 17, 2024

* Support arbitrary processor

* fix

* nit

* update

* nit

* nit

* fix and revert

* add a small test

* better check

* fixup

* bug so let's just use class for now

* oups

* .

0a9300f4

[whisper] fix multilingual fine-tuning (#30865) · 57edd84b
Sanchit Gandhi authored May 17, 2024
```
* [whisper] fix multilingual fine-tuning

* config ids as well
```
57edd84b
Fix dependencies for image classification example (#30842) · 977ce58a
Jacky Lee authored May 17, 2024
```
* fix: missing dependencies

* fix: image classification dependencies
```
977ce58a

Enable device map (#30870) · 3802e786

Darshana S authored May 17, 2024

* added _no_split_modules

* added LlavaNextVisionAttention to _no_split_modules

3802e786

Remove deprecated logic and warnings (#30743) · 57c965a8

amyeroberts authored May 17, 2024

* Remove deprecated logic and warnings

* Add back some code that seems to be important...

* Let's just add all the nllb stuff back; removing it is a bit more involved

* Remove kwargs

* Remove more kwargs

57c965a8

TEST: Add llama logits tests (#30835) · 3d7d3a87

Younes Belkada authored May 17, 2024

* add llama logits test

* fix

* fix tests
* fix for a10

* format

* format

* fix

* [run-slow] remove fmt: skip

* Your commit message

* test commit

* Revert "test commit"

This reverts commit b66e01e55f5e31d4c0479cac4bcacc0f123dc9d2.

* [run-slow]llama

* Update tests/models/llama/test_modeling_llama.py

* [run-slow]llama

* empty commit

3d7d3a87

16 May, 2024 7 commits
- Fix VideoLlava imports (#30867) · 15c74a28
  amyeroberts authored May 16, 2024
```
* Fix VideoLlava imports

* Update dummy objects
```
  15c74a28
- TST / Quantization: Reverting to torch==2.2.1 (#30866) · 4e17e7dc
  Younes Belkada authored May 16, 2024
```
Reverting to 2.2.1
```
  4e17e7dc
- Docs: update example with assisted generation + sample (#30853) · f4014e75
  Joao Gante authored May 16, 2024
  
  f4014e75
- Video-LLaVa: Fix docs (#30855) · 95b3c381
  Raushan Turganbay authored May 16, 2024
```
fix model id in docs
```
  95b3c381
- Make `Gemma` work with `torch.compile` (#30775) · 1b3dba94
  Yih-Dar authored May 16, 2024
```
* fix

* [run-slow] gemma

* add test

* add `test_compile_static_cache`

* fix

* style

* remove subprocess

* use attribute

* fix

* style

* update

* [run-slow] dbrx,gemma,jetmoe,phi3,recurrent_gemma

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  1b3dba94
- Disable the FA backend for SDPA on AMD GPUs (#30850) · 0753134f
  Mohit Sharma authored May 16, 2024
```
* disable fa

* disable fa

* update warning

* update warning
```
  0753134f
- Cache: add new flag to distinguish models that support `Cache` but not static cache (#30800) · 9d889f87
  Joao Gante authored May 16, 2024
```
* jamba cache

* new flag

* generate exception
```
  9d889f87