- 13 Mar, 2024 3 commits
-
Sanchit Gandhi authored
deprecate old funcs
-
Younes Belkada authored
fix fix copies
-
bytebarde authored
* initial implementation of flash attention for gptj
* modify flash attention and overwrite test_flash_attn_2_generate_padding_right
* update flash attention support list
* remove the copy line in the `CodeGenBlock`
* address copy mechanism
* Update src/transformers/models/gptj/modeling_gptj.py
* Add GPTJ attention classes
* add expected outputs in the gptj test
* Ensure repo consistency with 'make fix-copies'
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 12 Mar, 2024 13 commits
-
Younes Belkada authored
* Update convert_gemma_weights_to_hf.py
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
* fixup
-
Joao Gante authored
check max_position_embeddings
-
Bharat Ramanathan authored
fix: handle logging of scalars in wandb summary fixes: #29430
-
Raushan Turganbay authored
* add tests for batching support
* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py
* Update tests/test_modeling_common.py
* fixes and comments
* use cosine distance for conv models
* skip mra model testing
* Update tests/models/vilt/test_modeling_vilt.py
* finalize and make style
* check model type by input names
* fixed batch size for all testers
* Revert "fixed batch size for all testers" (reverts commit 525f3a0a058f069fbda00352cf202b728d40df99)
* add batch_size for all testers
* dict from model output
* do not skip layoutlm
* bring back some code from git revert
* clean-up
* where did minus go in tolerance
* make whisper happy
* deal with consequences of losing minus
* maskformer needs its own test for happiness
* fix more models
* tag flaky CV models from Amy's approval
* make codestyle
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Furkan Akkurt authored
Update quantization.md
-
Yih-Dar authored
* update
* update
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Matt authored
* Set env var to hold Keras at Keras 2
* Add Amy's update
* make fixup
* Use a warning instead
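For context, `TF_USE_LEGACY_KERAS` is TensorFlow's documented switch for keeping `tf.keras` on the Keras 2 implementation when Keras 3 is also installed. A minimal sketch of that pin (how transformers applies it internally is the commit's detail, not reproduced here):

```python
import os

# Pin tf.keras to the legacy Keras 2 implementation. This must be set
# before `import tensorflow`, or the choice has already been made.
os.environ["TF_USE_LEGACY_KERAS"] = "1"
```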
-
Hilco van der Wilk authored
* Update legacy Repository usage in `examples/pytorch/text-classification/run_glue_no_trainer.py` (marked for deprecation here: https://huggingface.co/docs/huggingface_hub/guides/upload#legacy-upload-files-with-git-lfs)
* Fix import order
* Replace all example usage of deprecated Repository
* Fix remaining repo call and rename args variable
* Revert removing creation of gitignore files and don't change research examples
-
tomigee authored
Implemented add_pooling_layer argument
-
Kola authored
* Fix type (determine)
* ruff
* Update src/transformers/models/mamba/configuration_mamba.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Fix examples to stop passing None to compile(), rework example invocation for run_text_classification.py
* Add Amy's fix
-
Dries Verachtert authored
-
Raushan Turganbay authored
fix fuyu docs
-
- 11 Mar, 2024 12 commits
-
Pedro Cuenca authored
* Experimental loading of MLX files
* Update exception message
* Add test
* Style
* Use model from hf-internal-testing
-
fzyzcjy authored
* Update add_new_model.md
* Update docs/source/en/add_new_model.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Amrit Gupta authored
Fixed broken link for Resources -> Token Classification -> Finetuning BERT for named-entity
-
Klaus Hipp authored
* Add missing localized READMEs to the copies check
* Run check to resolve all inconsistencies
-
yuanzhoulvpi authored
fix error in trainer: TypeError: Object of type Tensor is not JSON serializable
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
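The usual cure for this class of error is converting 0-dim tensors to plain Python scalars before serialization. A generic sketch (my own illustration, not the trainer's actual fix; `FakeScalarTensor` is a hypothetical stand-in for a 0-dim `torch.Tensor`):

```python
import json

def sanitize_for_json(obj):
    """Recursively replace tensor-like scalar values (anything exposing a
    callable .item()) with plain Python numbers so json.dumps succeeds."""
    if hasattr(obj, "item") and callable(obj.item):
        return obj.item()
    if isinstance(obj, dict):
        return {k: sanitize_for_json(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sanitize_for_json(v) for v in obj]
    return obj

# Hypothetical stand-in for a 0-dim torch.Tensor, for illustration only.
class FakeScalarTensor:
    def __init__(self, value):
        self.value = value
    def item(self):
        return self.value

summary = {"train/loss": FakeScalarTensor(0.25), "epoch": 3}
print(json.dumps(sanitize_for_json(summary)))
```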
-
Yih-Dar authored
save ci life
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Klaus Hipp authored
[Docs] Fix FastSpeech2Conformer links
-
Yitong Huang authored
* add USE_TORCH_XLA env
* rename torch_tpu to torch_xla
* better is_torch_xla_available; fix some fsdp and performance issues
* fix format
* fix bug when pjrt_device is cpu
* fix bug
* fix the deprecation handling
Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>
-
Damith Senanayake authored
* Fixing error #29332: the _check_and_enable_flash_attn_2() method receives a check_device_map parameter and fails
* style fixup
-
Tanay Mehta authored
* add: initial script to train clm fim
* fix: if training model from scratch, new tokens will be added and embeddings resized
* fix: fixed attention_mask errors when generating FIM data
* fix: file formatted using black
* add: run_fim_no_trainer.py and fixed some comments in run_fim.py
* add: added fim examples to the README.md and ran code fixup
* fix: little bug in both fim training scripts
* fix: remove comment from notebook and added a note on fim related params
* fix: minor typo in README
* add: suggested minor changes to README and run_fim.py
* add: gradient_accumulation_steps and gradient_checkpointing args
* add: improved model embedding resizing
* add: pad_to_multiple_of and attn_implementation params
* add: requested minor changes
* add: deepspeed zero compatibility
* add: resize embeddings layer with zero3 support for fim model initialization
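For readers unfamiliar with fill-in-the-middle (FIM) training: the core data transform splits a document at two random points and reorders it so the model learns to generate the middle given both sides. A generic sketch under assumed sentinel names (the actual scripts' token names and sampling details may differ):

```python
import random

FIM_PREFIX, FIM_MIDDLE, FIM_SUFFIX = "<fim_prefix>", "<fim_middle>", "<fim_suffix>"

def fim_transform(text, rng, fim_rate=1.0):
    """With probability fim_rate, split `text` at two random points and emit
    it in prefix-suffix-middle (PSM) order, so that at generation time the
    model can fill the middle after seeing prefix and suffix."""
    if rng.random() > fim_rate or len(text) < 2:
        return text  # leave this example as ordinary left-to-right text
    lo, hi = sorted(rng.sample(range(len(text) + 1), 2))
    prefix, middle, suffix = text[:lo], text[lo:hi], text[hi:]
    return FIM_PREFIX + prefix + FIM_SUFFIX + suffix + FIM_MIDDLE + middle
```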
-
j-gc authored
-
Arthur authored
* post merge update
* nit
* oups
-
- 08 Mar, 2024 12 commits
-
Winston H authored
feat: use `warning_advice` instead of tensorflow warning
-
Zach Mueller authored
* Fix eval thread fork bomb
* Keep eval dl persistent and prepare after so free_memory doesn't destroy it
* Add note
* Quality
-
Fanli Lin authored
[tests] use the correct `n_gpu` in `TrainerIntegrationTest::test_train_and_eval_dataloaders` for XPU (#29307)
* fix n_gpu
* fix style
-
Yoach Lacombe authored
fix total silence input with no_speech_threshold
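Background on the heuristic involved: Whisper-style decoding treats a segment as silence only when the model's no-speech probability is high and the decoded text is itself low-confidence. A hypothetical sketch of that check (my own illustration with Whisper's customary default thresholds, not the code touched by this fix):

```python
def is_silent_segment(no_speech_prob, avg_logprob,
                      no_speech_threshold=0.6, logprob_threshold=-1.0):
    """Treat a segment as silence when the model assigns a high no-speech
    probability AND the average token log-probability is low; confident
    transcriptions are kept even if no_speech_prob is high."""
    return no_speech_prob > no_speech_threshold and avg_logprob < logprob_threshold
```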
-
Yun Dai authored
fix FSDP config
-
Jonatan Kłosko authored
* Make sliding window size inclusive in eager attention * Fix tests
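To make the "inclusive" convention concrete, here is a toy mask builder (my own sketch of one plausible convention, not transformers' mask code): position i may attend to position j iff j <= i and i - j < window, so a window of size `window` counts the current token itself.

```python
def sliding_window_mask(seq_len, window):
    """Boolean causal sliding-window mask under an inclusive convention:
    row i is True at column j when j is causal (j <= i) and within the
    last `window` positions counting position i itself."""
    return [[j <= i and i - j < window for j in range(seq_len)]
            for i in range(seq_len)]
```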
-
liangjs authored
* fix stablelm dropout argument type error
* fix docs of _flash_attention_forward
* fix all docs of _flash_attention_forward
* fix docs of _flash_attention_forward in starcoder2
Co-authored-by: oliang <oliang@tencent.com>
-
Fanli Lin authored
* use torch_device
* skip for XPU
* Update tests/generation/test_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Clémentine Fourrier authored
-
Wang, Yi authored
* fix image-to-text batch incorrect output issue
* add ci test
* update ci test
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
-
Fanli Lin authored
* add sacremoses check
* fix style
* for FlaubertTokenizer
* HerbertTokenizer fix
* add typeHint
* Update src/transformers/testing_utils.py
* make less skipped
* make quality
* remove import
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Joao Gante authored
* left-padding test revisited
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-