23 Jan, 2024 (8 commits)
22 Jan, 2024 (10 commits)
21 Jan, 2024 (1 commit)
19 Jan, 2024 (12 commits)
18 Jan, 2024 (9 commits)
    • Making CTC training example more general (#28582) · 772307be
      Yoach Lacombe authored

      * add w2v2bert compatibility

      * Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
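      A minimal sketch of what this change enables (hedged; the helper below is an assumption, not the PR's actual diff): the CTC example stays model-agnostic by reading the main input name from the feature extractor instead of hard-coding "input_values", which lets the same script drive both Wav2Vec2 and Wav2Vec2-BERT.

      ```python
      # Hypothetical helper, not the repo's code: key batches by whatever
      # input name the loaded feature extractor declares.
      from transformers import AutoFeatureExtractor

      feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
      main_input_name = feature_extractor.model_input_names[0]  # "input_values" for Wav2Vec2

      def prepare_batch(raw_speech, sampling_rate=16_000):
          inputs = feature_extractor(raw_speech, sampling_rate=sampling_rate, return_tensors="pt")
          # "input_features" would be selected instead for a Wav2Vec2-BERT extractor.
          return {main_input_name: inputs[main_input_name]}
      ```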
    • [Whisper] Fix audio classification with weighted layer sum (#28563) · 186aa6be
      Sanchit Gandhi authored
      * fix

      * tests

      * fix test
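      For context, a hedged sketch of the weighted layer sum being fixed (toy shapes, not the repo's exact code): hidden states from every encoder layer are stacked, combined with softmax-normalised learned weights, and then pooled for classification.

      ```python
      import torch

      # Toy dimensions; in the model these come from the Whisper encoder config.
      num_layers, batch, seq_len, hidden = 4, 2, 10, 8
      # One hidden-state tensor per encoder layer, stacked on a new "layer" axis.
      hidden_states = torch.stack(
          [torch.randn(batch, seq_len, hidden) for _ in range(num_layers)], dim=1
      )
      layer_weights = torch.nn.Parameter(torch.ones(num_layers) / num_layers)

      norm_weights = torch.nn.functional.softmax(layer_weights, dim=-1)
      # Weighted sum over layers -> (batch, seq_len, hidden), then mean-pool over time.
      features = (hidden_states * norm_weights.view(-1, 1, 1)).sum(dim=1)
      pooled = features.mean(dim=1)
      ```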
    • [Whisper Tok] Move token ids to CPU when computing offsets (#28485) · 619ecfe2
      Sanchit Gandhi authored
      * move token ids to cpu

      * check for torch attr
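      A hedged sketch of the pattern the two commit messages describe (the helper name is made up): token ids may arrive as a GPU tensor, so the code checks for the torch attribute and moves them to CPU before offsets are computed in plain Python.

      ```python
      def as_cpu_list(token_ids):
          # Duck-type on the torch attribute instead of importing torch up front.
          if hasattr(token_ids, "cpu"):
              token_ids = token_ids.cpu()
          if hasattr(token_ids, "tolist"):
              token_ids = token_ids.tolist()
          return token_ids  # plain Python ints, safe for offset arithmetic
      ```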
    • [ASR Pipe] Update init to set model type and subsequently call parent init method (#28486) · 0eaa5ea3
      Sanchit Gandhi authored
      * add image processor arg

      * super

      * rm args
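      An illustrative sketch of the ordering the title describes (class names here are assumptions, not the pipeline's real code): the subclass sets the attribute the parent initialiser relies on, then delegates the shared setup to `super().__init__()`.

      ```python
      class Pipeline:
          def __init__(self, **kwargs):
              # Shared setup; may consult attributes already set by the subclass.
              self.initialized = True

      class SpeechRecognitionPipeline(Pipeline):
          def __init__(self, model_type, **kwargs):
              self.type = model_type      # set the model type first
              super().__init__(**kwargs)  # then call the parent init method
      ```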
    • Fix the documentation checkpoint for xlm-roberta-xl (#28567) · c662c78c
      Jeremy Fowers authored
      * Fix the documentation checkpoint for xlm-roberta-xl

      * Improve docstring consistency
    • Use `LoggingLevel` context manager in 3 tests (#28575) · 0754217c
      Yih-Dar authored

      * inside with LoggingLevel

      * remove is_flaky

      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
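      A hedged usage sketch of the context manager these tests adopt (the logged message is invented): pinning the verbosity inside the block makes the captured output independent of the ambient logging level, which is what allowed `is_flaky` to be dropped.

      ```python
      from transformers import logging
      from transformers.testing_utils import CaptureLogger, LoggingLevel

      logger = logging.get_logger("transformers")

      # Verbosity is forced to INFO only inside the block, then restored.
      with LoggingLevel(logging.INFO):
          with CaptureLogger(logger) as cl:
              logger.info("a message the test asserts on")
      assert "a message the test asserts on" in cl.out
      ```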
    • Add new meta w2v2-conformer BERT-like model (#28165) · d2cdefb9
      Yoach Lacombe authored
      
      * first commit
      
      * correct default value non causal
      
      * update config and modeling code
      
      * update converting checkpoint
      
      * clean modeling and fix tests
      
      * make style
      
      * add new config parameters to docstring
      
      * fix copied from statements
      
      * Apply suggestions from code review
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * make position_embeddings_type docstrings clearer
      
      * clean converting script
      
      * remove function not used
      
      * clean modeling file
      
      * apply suggestion for test file + add convert script to not_doctested
      
      * modify tests according to review - cleaner logic and more tests
      
      * Apply nit suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add checker of valid position embeddings type
      
      * instantiate new layer norm layer with the right eps
      
      * fix freeze_feature_encoder since it can be None in some cases
      
      * add test same output in convert script
      
      * restore wav2vec2conformer and add new model
      
      * create processor and FE + clean
      
      * add new model code
      
      * fix convert script and set default config parameters
      
      * correct model id paths
      
      * make style
      
      * make fix-copies and cleaning files
      
      * fix copied from statements
      
      * complete .md and fix copies
      
      * clean convert script argument defaults
      
      * fix config parameters docstrings
      
      * fix config docstring
      
      * add copied from and enrich FE tests
      
      * fix copied from and repo-consistency
      
      * add autotokenizer
      
      * make test input length shorter and change docstring code
      
      * fix docstrings and copied from
      
      * add add_adapter to ASR training example
      
      * make testing of adapters more robust
      
      * adapt to multi adapter layers
      
      * refactor input_values->input_features and remove w2v2-bert feature extractor
      
      * remove pretraining model
      
      * remove deprecated features and useless lines
      
      * add copied from and ignore statements to modeling tests
      
      * remove pretraining model #2
      
      * change import in convert script
      
      * change default in convert script
      
      * update readme and remove useless line
      
      * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * refactor BERT to Bert for consistency
      
      * remove useless ignore copy statement
      
      * add persistent to buffer in rotary
      
      * add eps in LayerNorm init and remove copied from
      
      * add adapter activation parameters and add copied from statements
      
      * Fix copied statements and add unittest.skip reasons
      
      * add copied statement in test_processor
      
      * refactor processor
      
      * make style
      
      * replace numpy random by torch rand
      
      * remove expected output CTC
      
      * improve converting script with processor class
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * remove gumbel class
      
      * remove tests related to previously deleted class
      
      * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * correct typos
      
      * remove unused parameters
      
      * update processor to take both text and audio
      
      * update checkpoints
      
      * update expected output and add ctc expected output
      
      * add label_attention_mask
      
      * replace pt with np in processor tests
      
      * fix typo
      
      * revert to behaviour with labels_attention_mask
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
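      A hedged usage sketch of the model this PR adds (the checkpoint id is an assumption; shapes are illustrative). Note the refactor recorded above: the model consumes `input_features` produced by the feature extractor, not raw `input_values`.

      ```python
      import torch
      from transformers import AutoFeatureExtractor, Wav2Vec2BertModel

      feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")
      model = Wav2Vec2BertModel.from_pretrained("facebook/w2v-bert-2.0")

      speech = torch.randn(16_000).numpy()  # 1 s of random "audio" at 16 kHz
      inputs = feature_extractor(speech, sampling_rate=16_000, return_tensors="pt")
      with torch.no_grad():
          hidden_states = model(**inputs).last_hidden_state
      print(hidden_states.shape)  # (1, frames, hidden_size)
      ```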
    • chore: Fix multiple typos (#28574) · 5d8eb93e
      hugo-syn authored
    • [`Core Tokenization`] Support a fix for spm fast models (#26678) · 81899778
      Arthur authored
      * fix
      
      * last attempt
      
      * current work
      
      * fix forward compatibility
      
      * save all special tokens
      
      * current state
      
      * revert additional changes
      
      * updates
      
      * remove tokenizer.model
      
      * add a test and the fix
      
      * nit
      
      * revert one more break
      
      * fix typefield issue
      
      * quality
      
      * more tests
      
      * fix fields for FC
      
      * more nits?
      
      * new additional changes
      
      * how
      
      * some updates
      
      * the fix
      
      * where do we stand
      
      * nits
      
      * nits
      
      * revert unrelated changes
      
      * nits nits nits
      
      * styling
      
      * don't break llama just yet
      
      * revert llama changes
      
      * safe arg check
      
      * fixup
      
      * Add a test for T5
      
      * Necessary changes
      
      * Tests passing; added tokens need to not be normalized. If the added tokens are normalized, normalization triggers the stripping, which is unwanted for normal functioning
      
      * Add even more tests for when normalization is set to True (which does not work)

      * Add even more tests for when normalization is set to True (which does not work)
      
      * Update to main
      
      * nits
      
      * fmt
      
      * more and more test
      
      * comments
      
      * revert change as tests are failing
      
      * make the test more readable
      
      * nits
      
      * refactor the test
      
      * nit
      
      * updates
      
      * simplify
      
      * style
      
      * style
      
      * style convert slow
      
      * Update src/transformers/convert_slow_tokenizer.py
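      A hedged illustration of the behaviour the fix targets (the token string is invented): with sentencepiece-based fast tokenizers, added tokens should keep `normalized=False`; as the commit log notes, normalizing them triggers the stripping behaviour that breaks matching.

      ```python
      from transformers import AddedToken, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("t5-small")
      # Keep the added token un-normalized so normalization cannot strip it.
      tokenizer.add_tokens([AddedToken("<my_token>", normalized=False)])
      print(tokenizer.tokenize("hello <my_token> world"))
      ```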