- 13 Mar, 2024 17 commits
-
-
Yih-Dar authored
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Raushan Turganbay authored
* fix batching tests for new models
* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Refactor TFP call to just sigmoid()
* Make sure we cast to the right dtype
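A minimal sketch of the refactor's shape, assuming the old code went through a tensorflow_probability distribution to recover Bernoulli probabilities (the function name and target dtype here are illustrative):

```python
import tensorflow as tf

def logits_to_probs(logits: tf.Tensor, target_dtype=tf.float32) -> tf.Tensor:
    # The Bernoulli probability of a logit is just its sigmoid; no need to
    # build a tfp distribution. Cast explicitly so the output dtype matches
    # what downstream code expects.
    return tf.cast(tf.sigmoid(logits), target_dtype)
```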
-
Fanli Lin authored
[tests] make `test_trainer_log_level_replica` run on accelerators with more than 2 devices (#29609)
add new arg
-
amyeroberts authored
* Move normalization for numerical stability
* Apply suggestions from code review: remove useless x=x line
* PR comment - normalize later to preserve var name meaning
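The numerical-stability move is the standard max-subtraction trick; a minimal sketch of the idea, not the literal diff:

```python
import torch

def stable_softmax(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Subtracting the row-wise max leaves softmax unchanged mathematically
    # but prevents exp() from overflowing on large logits.
    x = x - x.max(dim=dim, keepdim=True).values
    exp = x.exp()
    return exp / exp.sum(dim=dim, keepdim=True)
```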
-
Sourab Mangrulkar authored
* fsdp+qlora related changes
* fixes
* Update quantization_config.py
* support fsdp+qlora and dsz3+qlora
* Update quantization_config.py
* Update modeling_utils.py (several iterations)
* handle fsdp+qlora and dsz3+qlora correctly while model loading
* fix param count
* quality
* fsdp related changes
* fsdp changes only when using LoRA/QLoRA
* add accelerate version check
* refactor, update min accelerate version and add tests:
  1. Update minimum accelerate version to 0.26.0
  2. Clean the trainer wrt accelerate version checks
  3. FSDP refactor and test for fsdp config
  4. Use `itemsize` instead of a `dtype2bytes` dict (see the sketch below)
* fix test
* Address comments
* fix the conditional flag
* fix conditional flag
* address comments

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Zach Mueller <7831895+muellerzr@users.noreply.github.com>
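On point 4, a sketch of the idea: per-parameter byte counts can be read from the dtype or tensor itself rather than kept in a hand-maintained `dtype2bytes` mapping (function names here are illustrative):

```python
import torch

def bytes_per_param(dtype: torch.dtype) -> int:
    # torch.dtype.itemsize (available in recent PyTorch releases) reports the
    # element size directly, replacing a lookup table like
    # {torch.float32: 4, torch.float16: 2, torch.int8: 1, ...}.
    return dtype.itemsize

def model_size_in_bytes(model: torch.nn.Module) -> int:
    # Tensor.element_size() is the per-tensor equivalent.
    return sum(p.numel() * p.element_size() for p in model.parameters())
```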
-
njackman-2344 authored
* torchscript and trainer md es translation
* corrected md es files and even corrected spelling in en md
* made es corrections to trainer.md
* deleted entrenamiento... title on yml
* placed entrenamiento in right place
* translated es chat_templating.md w/ yml addition
* requested es changes to md and yml
* last es changes to md
-
Jiewen Tan authored
* tmp
* Remove debug step
* Fix a typo
* Move to is_torch_xla_available
-
Joao Gante authored
-
amyeroberts authored
* Use einsum where possible
* Fix
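An illustration of the kind of rewrite "use einsum where possible" refers to (shapes are arbitrary):

```python
import torch

q = torch.randn(2, 8, 16, 64)  # (batch, heads, seq, head_dim)
k = torch.randn(2, 8, 16, 64)

# Chained ops: transpose then batched matmul.
scores_matmul = q @ k.transpose(-1, -2)

# Equivalent einsum: the subscripts document the contraction explicitly.
scores_einsum = torch.einsum("bhqd,bhkd->bhqk", q, k)

assert torch.allclose(scores_matmul, scores_einsum, atol=1e-6)
```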
-
Dries Verachtert authored
-
Sanchit Gandhi authored
* [generate] deprecate forced ids processor
* add todo
* make message clearer
-
Lysandre Debut authored
* Adds pretrained IDs directly in the tests
* Fix tests
* Fix tests
* Review!
-
Lysandre Debut authored
* Warn against remote tool use
* Additional disclaimer
* Update docs/source/en/custom_tools.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Sanchit Gandhi authored
deprecate old funcs
-
Younes Belkada authored
fix
fix copies
-
bytebarde authored
* initial implementation of flash attention for gptj (usage sketch below)
* modify flash attention and overwrite test_flash_attn_2_generate_padding_right
* update flash attention support list
* remove the copy line in the `CodeGenBlock`
* address copy mechanism
* Update src/transformers/models/gptj/modeling_gptj.py
* Add GPTJ attention classes
* add expected outputs in the gptj test
* Ensure repo consistency with 'make fix-copies'

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
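With this commit, GPT-J joins the models that accept `attn_implementation="flash_attention_2"`. A usage sketch (checkpoint name illustrative; requires the `flash-attn` package and a supported GPU):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # flash attention requires fp16/bf16
    attn_implementation="flash_attention_2",
    device_map="auto",
)
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```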
-
- 12 Mar, 2024 13 commits
-
-
Younes Belkada authored
* Update convert_gemma_weights_to_hf.py
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
* fixup
-
Joao Gante authored
check max_position_embeddings
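A sketch of what such a guard typically looks like (names illustrative, not the literal patch):

```python
def check_input_length(input_ids, config):
    # Warn or raise when the prompt already exceeds the model's maximum
    # supported position; generating beyond it would silently degrade.
    max_pos = getattr(config, "max_position_embeddings", None)
    if max_pos is not None and input_ids.shape[-1] > max_pos:
        raise ValueError(
            f"Input length {input_ids.shape[-1]} exceeds "
            f"max_position_embeddings ({max_pos})."
        )
```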
-
Bharat Ramanathan authored
fix: handle logging of scalars in wandb summary
Fixes #29430
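The shape of the fix, hedged: tensor-valued metrics need converting to plain Python scalars before they land in the W&B run summary (helper name illustrative):

```python
import torch

def to_summary_value(value):
    # One-element tensors become Python scalars; everything else passes through.
    if isinstance(value, torch.Tensor) and value.numel() == 1:
        return value.item()
    return value

# Usage, assuming an active run:
# wandb.run.summary["eval/loss"] = to_summary_value(metrics["eval_loss"])
```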
-
Raushan Turganbay authored
* add tests for batching support
* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py
* Update tests/test_modeling_common.py (several iterations)
* fixes and comments
* use cosine distance for conv models (see the sketch below)
* skip mra model testing
* Update tests/models/vilt/test_modeling_vilt.py
* finalize and make style
* check model type by input names
* fixed batch size for all testers
* Revert "fixed batch size for all testers" (reverts commit 525f3a0a058f069fbda00352cf202b728d40df99)
* add batch_size for all testers
* dict from model output
* do not skip layoutlm
* bring back some code from git revert
* clean-up
* where did minus go in tolerance
* make whisper happy
* deal with consequences of losing minus
* deal with consequences of losing minus
* maskformer needs its own test for happiness
* fix more models
* tag flaky CV models from Amy's approval
* make codestyle

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
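On "use cosine distance for conv models": convolution outputs can drift elementwise between batched and single-sample runs, so the tests compare direction rather than raw values. A minimal sketch, not the literal test code:

```python
import torch
import torch.nn.functional as F

def outputs_match(batched_row: torch.Tensor, single: torch.Tensor,
                  threshold: float = 1e-4) -> bool:
    # Cosine distance = 1 - cosine similarity, computed on flattened outputs.
    sim = F.cosine_similarity(batched_row.flatten(), single.flatten(), dim=0)
    return (1.0 - sim).item() < threshold
```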
-
Furkan Akkurt authored
Update quantization.md
-
Yih-Dar authored
* update
* update
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Matt authored
* Set env var to hold Keras at Keras 2
* Add Amy's update
* make fixup
* Use a warning instead
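Background, hedged: with Keras 3 installed, `tf.keras` changes semantics, and the standard pin is an environment variable set before TensorFlow is imported. This sketch assumes the `TF_USE_LEGACY_KERAS` variable and the `tf-keras` package:

```python
import os

# Must run before `import tensorflow`: routes tf.keras to the legacy
# Keras 2 implementation provided by the tf-keras package.
os.environ.setdefault("TF_USE_LEGACY_KERAS", "1")

import tensorflow as tf  # noqa: E402
```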
-
Hilco van der Wilk authored
* Update legacy Repository usage in `examples/pytorch/text-classification/run_glue_no_trainer.py` (marked for deprecation: https://huggingface.co/docs/huggingface_hub/guides/upload#legacy-upload-files-with-git-lfs)
* Fix import order
* Replace all example usage of deprecated Repository
* Fix remaining repo call and rename args variable
* Revert removing creation of gitignore files and don't change research examples
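The replacement pattern, roughly: the git-based `Repository` class gives way to the HTTP upload helpers in `huggingface_hub` (repo id and paths below are illustrative):

```python
from huggingface_hub import HfApi, create_repo

repo_id = "username/my-glue-model"  # illustrative
create_repo(repo_id, exist_ok=True)

HfApi().upload_folder(
    folder_path="output_dir",  # local directory containing the saved model
    repo_id=repo_id,
    commit_message="End of training",
)
```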
-
tomigee authored
Implemented add_pooling_layer argument
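For BERT-style encoders this flag controls whether the pooler head is built at all; a usage sketch of the pattern (model class illustrative, not necessarily the one this commit touched):

```python
from transformers import BertModel

# Skip building pooler weights when only token-level outputs are needed;
# this also avoids "pooler weights newly initialized" warnings.
model = BertModel.from_pretrained("bert-base-uncased", add_pooling_layer=False)
assert model.pooler is None
```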
-
Kola authored
* Fix typo (determine)
* ruff
* Update src/transformers/models/mamba/configuration_mamba.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Fix examples to stop passing None to compile(), rework example invocation for run_text_classification.py
* Add Amy's fix
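Context, hedged: transformers TF models can compute their loss internally, so the examples should omit the `loss` argument rather than pass `None` explicitly:

```python
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification

model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Before (problematic): model.compile(optimizer=..., loss=None)
# After: omit `loss`; the model falls back to its internal loss computation
# when labels are included in the inputs.
model.compile(optimizer=tf.keras.optimizers.Adam(3e-5))
```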
-
Dries Verachtert authored
-
Raushan Turganbay authored
fix fuyu docs
-
- 11 Mar, 2024 10 commits
-
-
Pedro Cuenca authored
* Experimental loading of MLX files
* Update exception message
* Add test
* Style
* Use model from hf-internal-testing
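Background, hedged: MLX saves weights as safetensors with a `format` marker in the header metadata, and the change is about accepting that marker at load time. A sketch of inspecting the metadata (file path illustrative):

```python
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    metadata = f.metadata() or {}
    print(metadata.get("format"))  # e.g. "pt", or "mlx" for MLX-saved weights
```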
-
fzyzcjy authored
* Update add_new_model.md
* Update docs/source/en/add_new_model.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Amrit Gupta authored
Fixed broken link for Resources -> Token Classification -> Finetuning BERT for named-entity recognition
-
Klaus Hipp authored
* Add missing localized READMEs to the copies check
* Run check to resolve all inconsistencies
-
yuanzhoulvpi authored
fix error in trainer: TypeError: Object of type Tensor is not JSON serializable
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
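The error's shape, hedged: `json.dumps` cannot serialize `torch.Tensor`, so values headed for a JSON log need converting first (helper name illustrative):

```python
import json
import torch

def json_safe(obj):
    # Recursively convert tensors to scalars/lists before json.dumps.
    if isinstance(obj, torch.Tensor):
        return obj.item() if obj.numel() == 1 else obj.tolist()
    if isinstance(obj, dict):
        return {k: json_safe(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [json_safe(v) for v in obj]
    return obj

logs = {"loss": torch.tensor(0.42), "step": 100}
print(json.dumps(json_safe(logs)))  # the raw dict would raise TypeError
```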
-
Yih-Dar authored
save CI life
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Klaus Hipp authored
[Docs] Fix FastSpeech2Conformer links
-
Yitong Huang authored
* add USE_TORCH_XLA env (see the sketch below)
* rename torch_tpu to torch_xla
* better is_torch_xla_available; fix some fsdp and performance issues
* fix format
* fix bug when pjrt_device is cpu
* fix bug
* fix the deprecation handling

Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>
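A sketch of the renamed check together with the opt-out variable from the first bullet (it must be set before anything imports torch_xla):

```python
import os

# USE_TORCH_XLA lets you force-disable XLA even when torch_xla is installed.
os.environ.setdefault("USE_TORCH_XLA", "1")  # "0" disables

from transformers.utils import is_torch_xla_available

if is_torch_xla_available():
    import torch_xla.core.xla_model as xm
    device = xm.xla_device()
else:
    device = "cpu"
```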
-
Damith Senanayake authored
* Fix error #29332: the _check_and_enable_flash_attn_2() method received a check_device_map parameter and failed
* style fixup
-
Tanay Mehta authored
* add: initial script to train clm fim (see the sketch of the FIM transform below)
* fix: if training model from scratch, new tokens will be added and embeddings resized
* fix: fixed attention_mask errors when generating FIM data
* fix: file formatted using black
* add: run_fim_no_trainer.py and fixed some comments in run_fim.py
* add: added fim examples to the README.md and ran code fixup
* fix: little bug in both fim training scripts
* fix: remove comment from notebook and added a note on fim related params
* fix: minor typo in README
* add: suggested minor changes to README and run_fim.py
* add: gradient_accumulation_steps and gradient_checkpointing args
* add: improved model embedding resizing
* add: pad_to_multiple_of and attn_implementation params
* add: requested minor changes
* add: deepspeed zero compatibility
* add: resize embeddings layer with zero3 support for fim model initialization
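Background on the FIM (fill-in-the-middle) transform these scripts train with, as a hedged sketch: a document is split into prefix/middle/suffix and re-serialized with sentinel tokens so the model learns to generate the middle from both sides. Sentinel names follow the common FIM convention, not necessarily these scripts':

```python
import random

FIM_PREFIX, FIM_MIDDLE, FIM_SUFFIX = "<fim_prefix>", "<fim_middle>", "<fim_suffix>"

def apply_fim(text: str, rng: random.Random) -> str:
    # Pick two cut points, then emit prefix|suffix|middle (PSM format) so the
    # middle is conditioned on both surrounding contexts. Adding the sentinel
    # tokens is why the scripts resize the model's embedding layer.
    lo, hi = sorted(rng.sample(range(len(text)), 2))
    prefix, middle, suffix = text[:lo], text[lo:hi], text[hi:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"

print(apply_fim("def add(a, b):\n    return a + b\n", random.Random(0)))
```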
-