- 12 Sep, 2023 2 commits
-
-
Younes Belkada authored
import tensorflow inside relevant methods in trainer_utils
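A minimal sketch of the lazy-import pattern (the function name is illustrative, not the actual trainer_utils code): the heavy import moves into the method body, so `import transformers` stays fast for users who never hit the TensorFlow path.

```python
def set_seed_tf(seed: int):
    # Deferred import: the cost of `import tensorflow` is only paid when this
    # method is actually called, not at `import transformers` time.
    import tensorflow as tf

    tf.random.set_seed(seed)
```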
-
Arthur authored
* initial commit * updates * nits * update conversion script * update conversion script * use path to load * add tips etc * some modeling logic * modeling update * more nits * nits * normal layer norm * update config and doc * nits * update doc remove unused * update * fix inits and stuff * fixup * revert wrong changes * updates * more nits * add default config values to the configuration file * fixup happy * update * 2 tests left * update readmes * more nits * slow test and more documentation * update readme * fix licences * styling * use fast if possible when saving tokenizer * remove todo * remove tokenization tests * small last nits * Apply suggestions from code review Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> * nits to skip the timeout doctest * fix integration test * fix test * update eos token * update to allow fast tokenization * styling * fix CodeLlama as well for the update post processor * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more copied from statements * update * doc passes doctest * remove `# final layer norm?` * change docstring prompt * update * Update README.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * don't doctest the conversion script as it requires more packages * don't init a model in the config * oups * fix doctest --------- Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 11 Sep, 2023 2 commits
-
-
Patrick von Platen authored
* improve import time * Update src/transformers/integrations/__init__.py * sort imports
-
Hang authored
only the main process should call _save when using DeepSpeed ZeRO-3
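A simplified sketch of the guard (generic torch.distributed code, not the Trainer's internals): with ZeRO-3 the state dict is consolidated collectively across ranks, but only the main process may write it, otherwise ranks race on the same files.

```python
import os

import torch
import torch.distributed as dist

def save_checkpoint(state_dict, output_dir):
    # Every rank reaches this point (any collective gathering of the sharded
    # weights must run on all ranks), but only rank 0 performs the write.
    if not dist.is_initialized() or dist.get_rank() == 0:
        torch.save(state_dict, os.path.join(output_dir, "pytorch_model.bin"))
```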
-
- 08 Sep, 2023 4 commits
-
-
Arthur authored
* fix `set_infilling_processor` to properly reset * Add docstring! * fixups * more details in the documentation about the tokenization * style
-
Angela Yi authored
* Ignore warning if tracing with dynamo * fix import error * separate to function * add test
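A hedged sketch of the check (the helper is hypothetical; `torch._dynamo.is_compiling` exists on torch >= 2.0): the warning is skipped while torch.jit or dynamo is tracing, since it would otherwise be emitted for every traced call.

```python
import warnings

import torch

def is_tracing() -> bool:
    # Hypothetical helper in the spirit of the fix: true while torch.jit or
    # torch._dynamo is tracing/compiling the current code.
    if torch.jit.is_tracing():
        return True
    try:
        import torch._dynamo
        return torch._dynamo.is_compiling()
    except (ImportError, AttributeError):  # older torch builds without dynamo
        return False

if not is_tracing():  # only warn in plain eager execution
    warnings.warn("illustrative warning that should not fire during export")
```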
-
Thien Tran authored
* add missing doc for activation dropout * fix doc for SEW-D dropout * deprecate hidden_dropout for SEW-D
-
Alexander Krauck authored
This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically: 1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used. This has been corrected to use `attention_dropout` instead of the regular `dropout`. 2. The `activation_dropout` for the activations in the feed-forward layers was missing. Instead, the regular `dropout` was used. This commit adds `activation_dropout` to the feed-forward layers. These changes ensure the dropout implementation matches the original Graphormer and delivers empirically better performance.
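A sketch of the corrected wiring under the names used above (not the exact Graphormer source): the feed-forward path applies `activation_dropout` to the activated hidden states rather than reusing the regular `dropout`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeedForward(nn.Module):
    def __init__(self, hidden_size: int, ffn_size: int, dropout: float, activation_dropout: float):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, ffn_size)
        self.fc2 = nn.Linear(ffn_size, hidden_size)
        self.activation_dropout = nn.Dropout(activation_dropout)  # was plain `dropout`
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.activation_dropout(F.gelu(self.fc1(x)))
        return self.dropout(self.fc2(x))
```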
-
- 07 Sep, 2023 6 commits
-
-
dumpmemory authored
* fix inconsistent loss after resume #25340 * fix typo * clean code * reformatted code * adjust code according to comments * adjust check_dataloader_randomsampler location * return sampler only * handle sampler is None * Update src/transformers/trainer_pt_utils.py thanks @amyeroberts Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
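A hedged sketch of the helper hinted at in the bullets (the shipped signature may differ): after accelerate wraps the dataloader, the RandomSampler ends up nested inside batch-sampler wrappers, and resuming has to dig it out to replay the same shuffle order before skipping the already-consumed batches.

```python
def get_dataloader_sampler(dataloader):
    # Recurse through wrapped batch samplers until a plain `.sampler`
    # (e.g. torch.utils.data.RandomSampler) is exposed.
    if getattr(dataloader, "batch_sampler", None) is not None:
        return get_dataloader_sampler(dataloader.batch_sampler)
    return getattr(dataloader, "sampler", None)
```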
-
MyungHa Kwon authored
fix typo
-
raghavanone authored
* Fix vilt config init parameter to match the ones in documentation * Fix the documentation
-
CokeDong authored
* Add tgs metrics * bugfix and black formatting * workaround for token counting * formatting and bugfix * Fix * Add opt-in for tgs metrics * make style and fix error * Fix doc * fix docbuild * hf-doc-build * fix * test * Update src/transformers/training_args.py renaming Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * Update src/transformers/training_args.py renaming Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * Fix some symbols * test * Update src/transformers/trainer_utils.py match naming patterns Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/trainer.py nice Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix reviews * Fix * Fix black --------- Co-authored-by:
Zach Mueller <muellerzr@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
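Reading `tgs` as tokens per GPU per second (the expansion and the formula are assumed here; the commit itself does not spell them out), the opt-in metric boils down to:

```python
def tokens_per_second_per_gpu(num_tokens: int, runtime_s: float, num_gpus: int) -> float:
    # e.g. 1_000_000 tokens in 125 s on 8 GPUs -> 1000.0 tokens/s/GPU
    return num_tokens / (runtime_s * num_gpus)
```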
-
Kai authored
-
Zach Mueller authored
* Fix err * Use version check
-
- 06 Sep, 2023 3 commits
-
-
Marc Sun authored
* add new arg for gptq * add tests * add min version autogptq * fix order * skip test * fix * Update src/transformers/modeling_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style * change model path --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
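A hedged sketch of a minimum-version gate (the floor version below is assumed for illustration, not taken from the commit):

```python
import importlib.metadata

from packaging import version

def require_min_autogptq(minimum: str = "0.4.2"):  # assumed floor for the example
    installed = version.parse(importlib.metadata.version("auto-gptq"))
    if installed < version.parse(minimum):
        raise ImportError(f"GPTQ support requires auto-gptq >= {minimum}, found {installed}")
```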
-
Matt authored
* stash commit * More OPT updates * Update src/transformers/models/opt/modeling_tf_opt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Lysandre Debut authored
* Fix revision propagation * Cleaner
-
- 05 Sep, 2023 13 commits
-
-
tju_skywalker authored
* fix convert megatron model too large * fix convert megatron model too large
-
Tanay Mehta authored
* add: potential fix for the Mega chunking bug in decoder-only models * add: decoder with chunking test * add: input_mask passed with input_ids
-
Sanchit Gandhi authored
* [Wav2Vec2 Conformer] Fix inference float16 * fix test * fix test more * clean pipe test
-
Sourab Mangrulkar authored
DeepSpeed resume-from-checkpoint fixes, plus support for combining the DeepSpeed optimizer with an HF scheduler (#25863) * Add support for deepspeed optimizer and HF scheduler * fix bug * fix the import * fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario * fix loading of hf scheduler when loading deepspeed checkpoint * fix import of `DeepSpeedSchedulerWrapper` * add tests * add the comment and skip the failing tests * address comment
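A hedged illustration of the newly supported combination (all values made up): the DeepSpeed config supplies the optimizer, the "scheduler" block is omitted, and the Trainer falls back to the HF scheduler selected via `lr_scheduler_type`.

```python
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": 5e-5}},
    # no "scheduler" block: the HF scheduler below is used instead
}

args = TrainingArguments(
    output_dir="out",
    deepspeed=ds_config,         # accepts a dict or a path to a JSON file
    lr_scheduler_type="cosine",  # HF scheduler paired with the DeepSpeed optimizer
)
```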
-
raghavanone authored
* Add TFDebertaV2ForMultipleChoice * Import newer model in main init * Fix import issues * Fix copies * Add doc * Fix tests * Fix copies * Fix docstring
-
andreeahedes authored
* no_split_modules * no_split_modules * inputs_embeds+pos same device * update _no_split_modules * update _no_split_modules
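A sketch of the `inputs_embeds+pos same device` fix (hypothetical forward fragment, not the actual model code): when `device_map` splits the model across devices, the position embeddings must be moved onto the device of `inputs_embeds` before the addition.

```python
# Without the `.to(...)`, a split model can try to add tensors that live on
# different GPUs and fail with a device mismatch error.
position_embeds = self.position_embeddings(position_ids)
hidden_states = inputs_embeds + position_embeds.to(inputs_embeds.device)
```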
-
Abhilash Majumder authored
* patch with accelerate xpu * patch with accelerate xpu * formatting * fix tests * revert ruff unrelated fixes * revert ruff unrelated fixes * revert ruff unrelated fixes * fix test * review fixes * review fixes * black fixed * review commits * review commits * style fix * use pytorch_utils * revert markuplm test
-
Joao Gante authored
-
Sahel Sharify authored
This change iterates through a list of keys rather than dict items while updating the dict elements. Fixes the following error:
File "..../transformers/training_args.py", line 1544, in __post_init__
    for k, v in self.fsdp_config.items():
RuntimeError: dictionary keys changed during iteration
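Minimal reproduction and fix (config values made up): mutating a dict while iterating `.items()` raises the RuntimeError above; iterating a snapshot of the keys is safe.

```python
fsdp_config = {"MIN_NUM_PARAMS": 1_000_000, "xla": False}

# for k, v in fsdp_config.items(): ...  # mutating inside this loop raises
for k in list(fsdp_config.keys()):      # snapshot of the keys: safe to mutate
    fsdp_config[k.lower()] = fsdp_config.pop(k)
```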
-
Yih-Dar authored
* fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Kai authored
rename `doanloading` to `downloading`
-
Huazhong Ji authored
The nn.Identity backward-compatibility shim for PyTorch < 1.1.0 is no longer required, as the minimum PyTorch version we currently support is 1.10.0 (#25974)
-
Susnato Dhar authored
* Update feature_extraction_clap.py * changed all `lenght` to `length`
-
- 04 Sep, 2023 9 commits
-
-
Lysandre authored
-
Younes Belkada authored
* remove SDPA for falcon * revert previous behaviour and add warning * nit * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py --------- Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
Lysandre Debut authored
* Put Falcon back * Update src/transformers/models/auto/configuration_auto.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update test --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
David Reguera authored
* Add missing type hints and consistency to `RegNet` models * Add missing type hints and consistency to `TFSamModel` * Add missing type hints to `TFSegformerDecodeHead` * Add missing type hints and consistency to `TransfoXL` family models * Add missing type hints and consistency to `TFWav2Vec2ForSequenceClassification` * Add type hints to `TFXLMModel` * Fix linter * Revert the `RegNet` type hints to be Python 3.8 compliant * Remove the redundant np.ndarray type hint.
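An illustrative signature (not the real RegNet one) showing what "Python 3.8 compliant" means here: `typing.Optional`/`Tuple` rather than the PEP 604 `X | None` unions that only parse on Python 3.10+.

```python
from typing import Optional, Tuple

import torch

def forward(
    pixel_values: torch.Tensor,
    output_hidden_states: Optional[bool] = None,  # not `bool | None` (3.10+ only)
) -> Tuple[torch.Tensor, ...]:
    ...
```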
-
Yih-Dar authored
* fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Lysandre Debut authored
-
Matt authored
* Add proper Falcon docs and conversion script * Autodetect the decoder architecture instead of using an arg * Update docs now that we can autodetect * Fix doc error * Add doc to toctree * Quick doc update
-
Omar Sanseviero authored
Follow up from #25599
-
Sanchit Gandhi authored
-
- 01 Sep, 2023 1 commit
-
-
Arthur authored
* some bug fixes * updates * Update code_llama.md Co-authored-by:
Omar Sanseviero <osanseviero@users.noreply.github.com> * Add co author Co-authored-by:
pcuenca <pedro@latenitesoft.com> * add a test * fixup * nits * some updates * fix copies * address comments * nits * nits * fix docstring * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update * add int for https://huggingface.co/spaces/hf-accelerate/model-memory-usage --------- Co-authored-by:
Omar Sanseviero <osanseviero@users.noreply.github.com> Co-authored-by:
pcuenca <pedro@latenitesoft.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-