Commits · f40b87de0ca234df61f76928956c4a2118c0b548 · chenpangpang / transformers

24 Jan, 2024 7 commits

[docs] Fix doc format (#28684) · f40b87de
Steven Liu authored Jan 24, 2024
```
* fix hfoptions

* revert changes to other files

* fix
```

improve efficient training on CPU documentation (#28646) · 8278b153

Fanli Lin authored Jan 25, 2024



* update doc

* revert

* typo fix

* refine

* add dtypes

* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* no comma

* use avx512-vnni

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


Improved type hinting for all attention parameters (#28479) · 5d29530e

nakranivaibhav authored Jan 24, 2024

* Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor,...]] = None'

* Fixed the ruff formatting issue

* fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None'

* Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py

* test fail update

* fixed type hinting for these 15 scripts modeling_xlnet.py,modeling_tf_xlnet.py,modeling_led.py,modeling_tf_led.py,modeling_rwkv.py,modeling_dpt.py,modeling_tf_cvt.py,modeling_clip.py,modeling_flax_clip.py,modeling_tf_clip.py,modeling_longformer.py,modeling_tf_longformer.py,modeling_siglip.py,modeling_clap.py,modeling_git.py

* Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py

* test fail update

* Removed the myvenv file

* Fixed type hinting for these 8 scripts modeling_tvlt.py,modeling_sam.py,modeling_tf_sam.py,modeling_tvp.py,modeling_rag.py,modeling_tf_rag.py,modeling_tf_xlm.py,modeling_xlm.py


[docs] DeepSpeed (#28542) · 738ec75c

Steven Liu authored Jan 24, 2024

* config

* optim

* pre deploy

* deploy

* save weights, memory, troubleshoot, non-Trainer

* done


Add back in generation types (#28681) · bb6aa8bc
amyeroberts authored Jan 24, 2024


Use save_safetensor to disable safe serialization for XLA (#28669) · 0549000c

jeffhataws authored Jan 24, 2024

* Use save_safetensor to disable safe serialization for XLA

https://github.com/huggingface/transformers/issues/28438

* Style fixup


Exclude the load balancing loss of padding tokens in Mixtral-8x7B (#28517) · c5c69096

Khai Mai authored Jan 24, 2024

* fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask

* format code using black and ruff

* skip computing mask if attention_mask=None

* add tests for load balancing loss Mixtral-Moe

* fix assert loss is different in mixtral_test

* fix pad_leng

* use assertNotAlmostEqual and print to debug

* remove print for debug

* minor updates

* reduce rtol and atol


23 Jan, 2024 11 commits

Update README_es.md (#28612) · 5f81266f
Vladimir Pinera authored Jan 23, 2024
```
Fixing grammatical errors in the text
```

fix a hidden bug of `GenerationConfig`, now the `generation_config.json` can be loaded successfully (#28604) · 39c3c0a7

Zhenwei authored Jan 24, 2024

* fix a hidden bug of GenerationConfig

* keep `sort_keys=True` to maintain visibility

* Update src/transformers/generation/configuration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update configuration_utils.py

in case `obj` is a list, check the items in the list

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
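
The list handling mentioned in this commit can be illustrated with a minimal sketch. It assumes the failure came from non-string dictionary keys colliding with `sort_keys=True`, which the commit message does not state; the helper name `convert_keys_to_string` and the sample data are hypothetical and are not taken from `configuration_utils.py`.

```python
import json


def convert_keys_to_string(obj):
    """Recursively turn every dict key into a string (hypothetical helper)."""
    if isinstance(obj, dict):
        return {str(key): convert_keys_to_string(value) for key, value in obj.items()}
    if isinstance(obj, list):
        # A list can hold dicts as well, so each item is checked too.
        return [convert_keys_to_string(item) for item in obj]
    return obj


# Made-up data with mixed int/str keys, including inside a list.
config = {"b": 1, 2: "x", "items": [{3: "y", "a": 0}]}

# json.dumps(config, sort_keys=True) would raise TypeError here, because
# int and str keys cannot be compared while sorting.
serialized = json.dumps(convert_keys_to_string(config), sort_keys=True)
print(serialized)
```

Normalizing the keys first lets `sort_keys=True` stay in place, which matches the commit's stated goal of keeping the sorted output.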


Remove deprecated eager_serving fn (#28665) · ebc8f47b

Matt authored Jan 23, 2024

* Remove deprecated eager_serving fn

* Fix the input_signature docstring while I'm here


Support single token decode for `CodeGenTokenizer` (#28628) · 9a4521dd
cmathw authored Jan 23, 2024
```
convert token id to list in .decode()
```
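
As a rough illustration of what "convert token id to list" means, the sketch below wraps a bare integer id in a list before decoding. It is not the `CodeGenTokenizer` code: `to_id_list`, `decode`, and the toy vocabulary are hypothetical names introduced only for this example.

```python
from typing import Dict, List, Union


def to_id_list(token_ids: Union[int, List[int]]) -> List[int]:
    """Wrap a single token id in a list so later code can always iterate."""
    if isinstance(token_ids, int):
        return [token_ids]
    return list(token_ids)


def decode(token_ids: Union[int, List[int]], id_to_token: Dict[int, str]) -> str:
    """Toy decoder: look up each id and join the pieces."""
    return "".join(id_to_token[i] for i in to_id_list(token_ids))


# Made-up vocabulary, for illustration only.
toy_vocab = {0: "def", 1: " ", 2: "main"}

single = decode(2, toy_vocab)          # a bare int is accepted
batch = decode([0, 1, 2], toy_vocab)   # a list behaves as before
print(single, "|", batch)
```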

add dataloader prefetch factor in training args and trainer (#28498) · 5b5e71dc

Quentin Meeus authored Jan 23, 2024



* add dataloader prefetch factor in training args and trainer

* remove trailing spaces

* prevent dataloader_num_workers == 0 and dataloader_prefetch_factor != None

dataloader_prefetch_factor works only when data is loaded in a different process from the main one. This commit adds the necessary checks to avoid having prefetch_factor set when there is no such process.

* Remove whitespaces in empty line

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
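
The check this commit describes (no `dataloader_prefetch_factor` when `dataloader_num_workers == 0`) could look roughly like the sketch below. Only the two field names come from the commit message; the `LoaderArgs` class, the `dataloader_kwargs` helper, and the error text are hypothetical and do not reproduce `training_args.py` or the trainer.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class LoaderArgs:
    """Hypothetical stand-in for the relevant training arguments."""

    dataloader_num_workers: int = 0
    dataloader_prefetch_factor: Optional[int] = None

    def __post_init__(self) -> None:
        # Prefetching needs worker processes; with zero workers the data is
        # loaded in the main process, so a prefetch factor makes no sense.
        if self.dataloader_num_workers == 0 and self.dataloader_prefetch_factor is not None:
            raise ValueError(
                "dataloader_prefetch_factor can only be set when dataloader_num_workers > 0"
            )


def dataloader_kwargs(args: LoaderArgs) -> Dict[str, Any]:
    """Build keyword arguments, adding prefetch_factor only when it applies."""
    kwargs: Dict[str, Any] = {"num_workers": args.dataloader_num_workers}
    if args.dataloader_prefetch_factor is not None:
        kwargs["prefetch_factor"] = args.dataloader_prefetch_factor
    return kwargs


print(dataloader_kwargs(LoaderArgs(dataloader_num_workers=2, dataloader_prefetch_factor=4)))
print(dataloader_kwargs(LoaderArgs()))
```

Validating at construction time means an invalid combination fails early, before any data loader is built.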


Fix windows err with checkpoint race conditions (#28637) · 582d104b
Zach Mueller authored Jan 23, 2024
```
Fix windows err
```

`tensor_size` - fix copy/paste error msg typo (#28660) · c475eca9
Scruel Tao authored Jan 23, 2024
```
Fix copy/paste error msg typo
```

Enable instantiating model with pretrained backbone weights (#28214) · 27c79a0f

amyeroberts authored Jan 23, 2024



* Enable instantiating model with pretrained backbone weights

* Update tests so backbone checkpoint isn't passed in

* Remove doc updates until changes made in modeling code

* Clarify pretrained import

* Update configs - docs and validation check

* Update src/transformers/utils/backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Clarify exception message

* Update config init in tests

* Add test for when use_timm_backbone=True

* Small test updates

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Enable safetensors conversion from PyTorch to other frameworks without the torch requirement (#27599) · 008a6a22

Lysandre Debut authored Jan 23, 2024

* Initial commit

* Requirements & tests

* Tests

* Tests

* Rogue import

* Rogue torch import

* Cleanup

* Apply suggestions from code review
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

* bfloat16 management

* Sanchit's comments

* Import shield

* apply suggestions from code review

* correct bf16

* rebase

---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>


integrations: fix DVCLiveCallback model logging (#28653) · 03986609
Dave Berenbaum authored Jan 23, 2024


get default device through `PartialState().default_device` as it has been officially released (#27256) · 1fc12960

Huazhong Ji authored Jan 23, 2024


22 Jan, 2024 10 commits
- Fix phi model doc checkpoint (#28581) · e547458c
  amyeroberts authored Jan 22, 2024
```
Co-authored-by: Pashmina Cameron <11311835+pashminacameron@users.noreply.github.com>
```
- [`SigLIP`] Only import tokenizer if sentencepiece available (#28636) · 590be773
  amyeroberts authored Jan 22, 2024
```
Only import class if sp available
```
- Update image_processing_deformable_detr.py (#28561) · a35ea570
  Sounak Dey authored Jan 22, 2024
```
* Update image_processing_deformable_detr.py

* Changes after running make fix-copies
```
- [`GPTNeoX`] Fix GPTNeoX + Flash Attention 2 issue (#28645) · e201864b
  Younes Belkada authored Jan 22, 2024
```
Update modeling_gpt_neox.py
```
- [`Llava`] Update convert_llava_weights_to_hf.py script (#28617) · dafd5951
  isaac-vidas authored Jan 22, 2024
```
* Update convert_llava_weights_to_hf.py script

* Remove config update of adding padding to `vocab_size` and `text_config.vocab_size` which causes `ValueError` exception.
* Remove keys that end with `inv_freq` from the state dict.
* Add examples and instructions for creating `model_state_dict.bin` that can be used by the script.

* Update convert_llava_weights_to_hf.py

* Update convert_vipllava_weights_to_hf.py
```
- Fix lr_scheduler in no_trainer training scripts (#27872) · deb2b590
  bofeng huang authored Jan 22, 2024
```
* Fix lr_scheduler

* Fix lr scheduler
```
- Add config tip to custom model docs (#28601) · 692c3c6b
  Matt authored Jan 22, 2024
```
Add tip to custom model docs
```
- Avoid root logger's level being changed (#28638) · d336c56d
  Yih-Dar authored Jan 22, 2024
```
* avoid root logger's level being changed

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Add missing key to TFLayoutLM signature (#28640) · bf674153
  Matt authored Jan 22, 2024
```
Fix missing bbox in LayoutLM signature
```
- Fix id2label assignment in run_classification.py (#28590) · f0acf7b6
  jheitmann authored Jan 22, 2024
  

21 Jan, 2024 1 commit

[`GPTNeoX`] Fix BC issue with 4.36 (#28602) · 83f9196c

Arthur authored Jan 21, 2024

* fix dtype issue

* add a test

* update copied from mentions

* nits

* fixup

* fix copies

* Apply suggestions from code review


19 Jan, 2024 11 commits

Fix auxiliary loss related code in transformers (#28406) · 3f69f415

Sangbum Daniel Choi authored Jan 19, 2024



* [DETA] fix freeze/unfreeze function

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add freeze/unfreeze test case in DETA

* fix type

* fix typo 2

* fix : enable aux and enc loss in training pipeline

* Add unsynced variables from original DETA for training

* modification for passing CI test

* make style

* make fix

* manual make fix

* change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking

* remove print

* divide configuration in DetaModel and DetaForObjectDetection

* images smaller than 224 will give a topk error

* pred_boxes and logits should be equivalent to two_stage_num_proposals

* add missing part in DetaConfig

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add docstring in configure and prettify TO DO part

* change distribute related code to accelerate

* Update src/transformers/models/deta/configuration_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/deta/test_modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* protect importing accelerate

* change variable name to specific value

* wrong import

* fix aux_loss in conditional_detr

* add test aux_loss

* add aux_loss test in deta and table_transformer

* fix yolos since it doesn't have auxiliary function

* fix maskformer auxiliary_loss related code

* make style

* change param 'auxiliary_loss' to 'use_auxiliary_loss'

* change param 'auxiliary_loss' to 'use_auxiliary_loss' in tests

* make style & fix-copies, also revert yolos related parameter

* revert variable name 'use_auxiliary_loss' to 'auxiliary_loss' due to DetrConfig

* revert variable name in yolos

* revert maskformer

* add aux_loss test in maskformer

* make style

* Update src/transformers/models/yolos/configuration_yolos.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


RWKV: raise informative exception when attempting to manipulate `past_key_values` (#28600) · 948ffff4
Joao Gante authored Jan 19, 2024


Fix `_speculative_sampling` implementation (#28508) · 9efec114
Ofir Zafrir authored Jan 19, 2024


Allow add_tokens for ESM (#28535) · d1578159

Matt authored Jan 19, 2024



* Allow non-special tokens to be added

* Add test, fix token adding code

* Revert changes to id_to_token and token_to_id

* Update the ESM tokenizer to be a bit more standardized

* Update src/transformers/models/esm/tokenization_esm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


[`Llava`] Fix convert_llava_weights_to_hf.py script (#28570) · 5b7f4bc6

isaac-vidas authored Jan 19, 2024

* Update convert_llava_weights_to_hf.py

Fix call to `tokenizer.add_tokens`

* Add special_tokens to tokenizer.add_tokens in convert_vipllava_weights_to_hf.py


[SigLIP] Don't pad by default (#28578) · faf03541
NielsRogge authored Jan 19, 2024
```
First draft
```

Fix wrong xpu device in DistributedType.MULTI_XPU mode (#28386) · 8db64367
Fanli Lin authored Jan 19, 2024
```
* remove elif xpu

* remove redundant code
```

[Whisper] Finalize batched SOTA long-form generation (#27658) · 690fe73f

Patrick von Platen authored Jan 19, 2024



* finalize

* make fix copies whisper

* [Tests] Make sure that we don't run tests multiple times

* Update src/transformers/models/whisper/modeling_whisper.py

* [Tests] Make sure that we don't run tests multiple times

* fix more

* improve

* improve

* improve further

* improve more

* improve

* fix more

* git commit and git push

* fix more

* fix more

* fix more

* New try

* Fix more whisper stuff

* Improve

* correct more

* correct more

* correct more

* Fix some tests

* Add more tests

* correct more

* correct more

* correct more

* push

* correct more

* Fix more

* Better

* without dec mask

* correct more

* clean

* save intermediate

* Fix more

* Fix VAD for large-v2

* Save new

* Correct more

* make cleaner

* correct tests

* correct src

* Finish

* Fix more

* Fix more

* finish

* Fix edge cases

* fix return_dict_in_generate

* fix all tests

* make style

* add docstrings

* add docstrings

* Fix logit processor

* make style

* fix pipeline test

* fix more style

* Apply suggestions from code review

* apply feedback Sanchit

* correct more

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct more

* correct more

* correct more

* Fix staticmethod

* correct more

* fix

* fix slow tests

* make style

* fix tokenizer test

* fix tokenizer test

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* finish

* finish

* revert kwargs change

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


feat: Sequential beam search (#26304) · d4fc1eb4
Saibo-creator authored Jan 19, 2024


Add w2v2bert to pipeline (#28585) · 268fc1fd

Yoach Lacombe authored Jan 19, 2024

* generalize asr pipeline to fbank models

* change w2v2 pipeline output

* Update test_pipelines_automatic_speech_recognition.py


v4.38.dev.0 · b2748a6e
Amy Roberts authored Jan 19, 2024
