Commits · 692c3c6b73b8d4cb312950f60a05ab8ad37eff04 · chenpangpang / transformers

22 Jan, 2024 4 commits
- Add config tip to custom model docs (#28601) · 692c3c6b
  Matt authored Jan 22, 2024
```
Add tip to custom model docs
```
- Avoid root logger's level being changed (#28638) · d336c56d
  Yih-Dar authored Jan 22, 2024
```
* avoid root logger's level being changed

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Add missing key to TFLayoutLM signature (#28640) · bf674153
  Matt authored Jan 22, 2024
```
Fix missing bbox in LayoutLM signature
```
- Fix id2label assignment in run_classification.py (#28590) · f0acf7b6
  jheitmann authored Jan 22, 2024
  
21 Jan, 2024 1 commit

[`GPTNeoX`] Fix BC issue with 4.36 (#28602) · 83f9196c

Arthur authored Jan 21, 2024

* fix dtype issue

* add a test

* update copied from mentions

* nits

* fixup

* fix copies

* Apply suggestions from code review


19 Jan, 2024 12 commits

Fix auxiliary loss related code in transformers (#28406) · 3f69f415

Sangbum Daniel Choi authored Jan 19, 2024



* [DETA] fix freeze/unfreeze function

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add freeze/unfreeze test case in DETA

* fix type

* fix typo 2

* fix : enable aux and enc loss in training pipeline

* Add unsynced variables from original DETA for training

* modification for passing CI test

* make style

* make fix

* manual make fix

* change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking

* remove print

* divide configuration in DetaModel and DetaForObjectDetection

* image smaller size than 224 will give topk error

* pred_boxes and logits should be equivalent to two_stage_num_proposals

* add missing part in DetaConfig

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add docstring in configure and prettify TO DO part

* change distribute related code to accelerate

* Update src/transformers/models/deta/configuration_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/deta/test_modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* protect importing accelerate

* change variable name to specific value

* wrong import

* fix aux_loss in conditional_detr

* add test aux_loss

* add aux_loss test in deta and table_transformer

* fix yolos since it doesn't have auxiliary function

* fix maskformer auxiliary_loss related code

* make style

* change param 'auxiliary_loss' to 'use_auxiliary_loss'

* change param 'auxiliary_loss' to 'use_auxiliary_loss' in tests

* make style & fix-copies, also revert yolos related parameter

* revert variable name 'use_auxiliary_loss' to 'auxiliary_loss' due to DetrConfig

* revert variable name in yolos

* revert maskformer

* add aux_loss test in maskformer

* make style

* Update src/transformers/models/yolos/configuration_yolos.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


RWKV: raise informative exception when attempting to manipulate `past_key_values` (#28600) · 948ffff4
Joao Gante authored Jan 19, 2024

Fix `_speculative_sampling` implementation (#28508) · 9efec114
Ofir Zafrir authored Jan 19, 2024


Allow add_tokens for ESM (#28535) · d1578159

Matt authored Jan 19, 2024



* Allow non-special tokens to be added

* Add test, fix token adding code

* Revert changes to id_to_token and token_to_id

* Update the ESM tokenizer to be a bit more standardized

* Update src/transformers/models/esm/tokenization_esm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


[`Llava`] Fix convert_llava_weights_to_hf.py script (#28570) · 5b7f4bc6

isaac-vidas authored Jan 19, 2024

* Update convert_llava_weights_to_hf.py

Fix call to `tokenizer.add_tokens`

* Add special_tokens to tokenizer.add_tokens in convert_vipllava_weights_to_hf.py


[SigLIP] Don't pad by default (#28578) · faf03541
NielsRogge authored Jan 19, 2024
```
First draft
```
Fix wrong xpu device in DistributedType.MULTI_XPU mode (#28386) · 8db64367
Fanli Lin authored Jan 19, 2024
```
* remove elif xpu

* remove redundant code
```

[Whisper] Finalize batched SOTA long-form generation (#27658) · 690fe73f

Patrick von Platen authored Jan 19, 2024



* finalize

* make fix copies whisper

* [Tests] Make sure that we don't run tests multiple times

* Update src/transformers/models/whisper/modeling_whisper.py

* [Tests] Make sure that we don't run tests multiple times

* fix more

* improve

* improve

* improve further

* improve more

* improve

* fix more

* git commit and git push

* fix more

* fix more

* fix more

* New try

* Fix more whisper stuff

* Improve

* correct more

* correct more

* correct more

* Fix some tests

* Add more tests

* correct more

* correct more

* correct more

* push

* correct more

* Fix more

* Better

* without dec mask

* correct more

* clean

* save intermediate

* Fix more

* Fix VAD for large-v2

* Save new

* Correct more

* make cleaner

* correct tests

* correct src

* Finish

* Fix more

* Fix more

* finish

* Fix edge cases

* fix return_dict_in_generate

* fix all tests

* make style

* add docstrings

* add docstrings

* Fix logit processor

* make style

* fix pipeline test

* fix more style

* Apply suggestions from code review

* apply feedback Sanchit

* correct more

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct more

* correct more

* correct more

* Fix staticmethod

* correct more

* fix

* fix slow tests

* make style

* fix tokenizer test

* fix tokenizer test

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* finish

* finish

* revert kwargs change

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


feat: Sequential beam search (#26304) · d4fc1eb4
Saibo-creator authored Jan 19, 2024


Add w2v2bert to pipeline (#28585) · 268fc1fd

Yoach Lacombe authored Jan 19, 2024

* generalize asr pipeline to fbank models

* change w2v2 pipeline output

* Update test_pipelines_automatic_speech_recognition.py


v4.38.dev.0 · b2748a6e
Amy Roberts authored Jan 19, 2024


Don't save `processor_config.json` if a processor has no extra attribute (#28584) · db9a7e9d

Yih-Dar authored Jan 19, 2024



* not save if empty

* fix

* fix

* fix

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


18 Jan, 2024 11 commits

Making CTC training example more general (#28582) · 772307be

Yoach Lacombe authored Jan 18, 2024



* add w2v2bert compatibility

* Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


[Whisper] Fix audio classification with weighted layer sum (#28563) · 186aa6be
Sanchit Gandhi authored Jan 18, 2024
```
* fix

* tests

* fix test
```
[Whisper Tok] Move token ids to CPU when computing offsets (#28485) · 619ecfe2
Sanchit Gandhi authored Jan 18, 2024
```
* move token ids to cpu

* check for torch attr
```
[ASR Pipe] Update init to set model type and subsequently call parent init method (#28486) · 0eaa5ea3
Sanchit Gandhi authored Jan 18, 2024
```
* add image processor arg

* super

* rm args
```
Fix the documentation checkpoint for xlm-roberta-xl (#28567) · c662c78c
Jeremy Fowers authored Jan 18, 2024
```
* Fix the documentation checkpoint for xlm-roberta-xl

* Improve docstring consistency
```

Use `LoggingLevel` context manager in 3 tests (#28575) · 0754217c

Yih-Dar authored Jan 18, 2024



* inside with LoggingLevel

* remove is_flaky

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Add new meta w2v2-conformer BERT-like model (#28165) · d2cdefb9

Yoach Lacombe authored Jan 18, 2024



* first commit

* correct default value non causal

* update config and modeling code

* update converting checkpoint

* clean modeling and fix tests

* make style

* add new config parameters to docstring

* fix copied from statements

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* make position_embeddings_type docstrings clearer

* clean converting script

* remove function not used

* clean modeling file

* apply suggestion for test file + add convert script to not_doctested

* modify tests according to review - cleaner logic and more tests

* Apply nit suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add checker of valid position embeddings type

* instantiate new layer norm layer with the right eps

* fix freeze_feature_encoder since it can be None in some cases

* add test same output in convert script

* restore wav2vec2conformer and add new model

* create processor and FE + clean

* add new model code

* fix convert script and set default config parameters

* correct model id paths

* make style

* make fix-copies and cleaning files

* fix copied from statements

* complete .md and fix copies

* clean convert script argument defaults

* fix config parameters docstrings

* fix config docstring

* add copied from and enrich FE tests

* fix copied from and repo-consistency

* add autotokenizer

* make test input length shorter and change docstring code

* fix docstrings and copied from

* add add_adapter to ASR training example

* make testing of adapters more robust

* adapt to multi adapter layers

* refactor input_values->input_features and remove w2v2-bert feature extractor

* remove pretraining model

* remove deprecated features and useless lines

* add copied from and ignore statements to modeling tests

* remove pretraining model #2

* change import in convert script

* change default in convert script

* update readme and remove useless line

* Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* refactor BERT to Bert for consistency

* remove useless ignore copy statement

* add persistent to buffer in rotary

* add eps in LayerNorm init and remove copied from

* add adapter activation parameters and add copied from statements

* Fix copied statements and add unittest.skip reasons

* add copied statement in test_processor

* refactor processor

* make style

* replace numpy random by torch rand

* remove expected output CTC

* improve converting script with processor class

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove gumbel class

* remove tests related to previously deleted class

* Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* correct typos

* remove unused parameters

* update processor to take both text and audio

* update checkpoints

* update expected output and add ctc expected output

* add label_attention_mask

* replace pt with np in processor tests

* fix typo

* revert to behaviour with labels_attention_mask

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


chore: Fix multiple typos (#28574) · 5d8eb93e
hugo-syn authored Jan 18, 2024


[`Core Tokenization`] Support a fix for spm fast models (#26678) · 81899778

Arthur authored Jan 18, 2024

* fix

* last attempt

* current work

* fix forward compatibility

* save all special tokens

* current state

* revert additional changes

* updates

* remove tokenizer.model

* add a test and the fix

* nit

* revert one more break

* fix typefield issue

* quality

* more tests

* fix fields for FC

* more nits?

* new additional changes

* how

* some updates

* the fix

* where do we stand

* nits

* nits

* revert unrelated changes

* nits nits nits

* styling

* don't break llama just yet

* revert llama changes

* safe arg check

* fixup

* Add a test for T5

* Necessary changes

* Tests passing, added tokens need to not be normalized. If the added tokens are normalized, it will trigger the stripping, which seems to be unwanted for normal functioning

* Add even more tests, when normalization is set to True (which does not work 😓 )

* Add even more tests, when normalization is set to True (which does not work 😓 )

* Update to main

* nits

* fmt

* more and more test

* comments

* revert change as tests are failing

* make the test more readable

* nits

* refactor the test

* nit

* updates

* simplify

* style

* style

* style convert slow

* Update src/transformers/convert_slow_tokenizer.py


Use `weights_only` only if torch >= 1.13 (#28506) · a1668cc7

Yih-Dar authored Jan 18, 2024



* fix

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Save `Processor` (#27761) · 3005f965

Yih-Dar authored Jan 18, 2024



* save processor

* Update tests/models/auto/test_processor_auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/test_processing_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


17 Jan, 2024 7 commits

Fix Switch Transformers When sparse_step = 1 (#28564) · 98dda8ed
Ahmed Elnaggar authored Jan 17, 2024
```
Fix sparse_step = 1

In case sparse_step = 1, the current code will not work.
```

Allow to train dinov2 with different dtypes like bf16 (#28504) · fa6d12f7

Lucas Thompson authored Jan 18, 2024

I want to train dinov2 with bf16 but I get the following error in https://github.com/huggingface/transformers/blob/bc72b4e2cdcbc80d5f56731f35dbc9c18b4c8de6/src/transformers/models/dinov2/modeling_dinov2.py#L635:

```
RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
```

Since the input dtype is torch.float32, the parameter dtype has to be torch.float32 as well...

@LZHgrla and I checked the code of the CLIP vision encoder and found that it applies an automatic dtype conversion (https://github.com/huggingface/transformers/blob/bc72b4e2cdcbc80d5f56731f35dbc9c18b4c8de6/src/transformers/models/clip/modeling_clip.py#L181-L182).

So I added a similar automatic dtype conversion to modeling_dinov2.py.
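The dtype conversion described above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: `PatchEmbed`, its constructor arguments, and the tensor sizes are hypothetical stand-ins, and it assumes a PyTorch build with bf16 convolution support on the current device. The idea is to read the dtype from the projection weights and cast the input to it before the convolution.

```python
import torch
from torch import nn


class PatchEmbed(nn.Module):
    """Hypothetical stand-in for a ViT-style patch embedding layer."""

    def __init__(self, in_channels=3, hidden_size=8, patch_size=4):
        super().__init__()
        self.projection = nn.Conv2d(
            in_channels, hidden_size, kernel_size=patch_size, stride=patch_size
        )

    def forward(self, pixel_values):
        # Cast the input to whatever dtype the weights currently use, so a
        # float32 image batch can be fed to a bf16 (or fp16) model.
        target_dtype = self.projection.weight.dtype
        embeddings = self.projection(pixel_values.to(dtype=target_dtype))
        # (batch, hidden, h, w) -> (batch, num_patches, hidden)
        return embeddings.flatten(2).transpose(1, 2)


model = PatchEmbed().to(torch.bfloat16)  # parameters are now bf16
pixel_values = torch.rand(1, 3, 8, 8)    # input is still float32
out = model(pixel_values)
```

Without the `.to(dtype=target_dtype)` cast, the same call would be expected to fail with a dtype mismatch like the `RuntimeError` quoted above.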


Fix SDPA tests (#28552) · 2c1eebc1

fxmarty authored Jan 17, 2024



* skip bf16 test if not supported by device

* fix

* fix bis

* use is_torch_bf16_available_on_device

* use is_torch_fp16_available_on_device

* fix & use public llama

* use 1b model

* fix flaky test

---------
Co-authored-by: Your Name <you@example.com>


Add qwen2 (#28436) · d6ffe74d

Junyang Lin authored Jan 17, 2024



* add config, modeling, and tokenization

* add auto and init

* update readme

* update readme

* update team name

* fixup

* fixup

* update config

* update code style

* update for fixup

* update for fixup

* update for fixup

* update for testing

* update for testing

* fix bug for config and tokenization

* fix bug for bos token

* not doctest

* debug tokenizer

* not doctest

* debug tokenization

* debug init for tokenizer

* fix style

* update init

* delete if in token auto

* add tokenizer doc

* add tokenizer in init

* Update dummy_tokenizers_objects.py

* update

* update

* debug

* Update tokenization_qwen2.py

* debug

* Update convert_slow_tokenizer.py

* add copies

* add copied from and make style

* update files map

* update test

* fix style

* fix merge reading and update tests

* fix tests

* fix tests

* fix style

* debug a variable in readme

* Update src/transformers/models/qwen2/configuration_qwen2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update test and copied from

* fix style

* update qwen2 tokenization  and tests

* Update tokenization_qwen2.py

* delete the copied from after property

* fix style

* update tests

* update tests

* add copied from

* fix bugs

* update doc

* add warning for sliding window attention

* update qwen2 tokenization

* fix style

* Update src/transformers/models/qwen2/modeling_qwen2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix tokenizer fast

---------
Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Fixes default value of `softmax_scale` in `PhiFlashAttention2`. (#28537) · d93ef7d7
Gustavo de Rosa authored Jan 17, 2024
```
* fix(phi): Phi does not use softmax_scale in Flash-Attention.

* chore(docs): Update Phi docs.
```

symbolic_trace: add past_key_values, llama, sdpa support (#28447) · a6adc05e

fxmarty authored Jan 17, 2024

* torch.fx: add pkv, llama, sdpa support

* Update src/transformers/models/opt/modeling_opt.py

* remove spaces

* trigger ci

* use explicit variable names


[Makefile] Exclude research projects from format (#28551) · 09eb11a1
Patrick von Platen authored Jan 17, 2024


16 Jan, 2024 5 commits

Config: warning when saving generation kwargs in the model config (#28514) · f4f57f9d
Joao Gante authored Jan 16, 2024


Add is_model_supported for fx (#28521) · 7142bdfa

inisis authored Jan 17, 2024



* modify check_if_model_is_supported to return bool

* add is_model_supported and have check_if_model_is_supported use that

* Update src/transformers/utils/fx.py

Fantastic
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Clearer error for SDPA when explicitly requested (#28006) · 02f8738e
fxmarty authored Jan 16, 2024
```
* clearer error for sdpa

* better message
```

[`SpeechT5Tokenization`] Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme (#28522) · fe23256b

Arthur authored Jan 16, 2024

* Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme

* fixup

* add a small test

* style test file

* nits


[`TokenizationRoformerFast`] Fix the save and loading (#28527) · 96d08831
Arthur authored Jan 16, 2024
```
* cleanup

* add a test

* update the test

* style

* revert part that allows to pickle the tokenizer
```