Commits · 0fe44059aed104b1a001b98fbf57332c866bf499 · chenpangpang / transformers

10 Apr, 2024 10 commits

Arthur authored Apr 10, 2024



* Fork.

* RecurrentGemma initial commit.

* Updating __init__.py.

* Minor modification to how we initialize the cache.
Changing how the config specifies the architecture.

* Reformat code to 4 spaces.
Fixed a few typos.

* Fixed the forward pass.
Still unclear on the cache?

* Fixed the RecurrentGemmaForCausalLM

* Minor comment that we might not need attention_mask and output_attention arguments.

* Now cache should work as well.

* Adding a temporary example to check whether the model generation works.

* Adding the tests and updating imports.

* Adding the example file missing in the previous commit.

* First working example.

* Removing .gitignore and reverting parts of __init__.

* Re-add .gitignore.

* Addressing comments for configuration.

* Move mask creation to `_prepare_inputs_for_generation`.

* First try at integration tests:
1. AttributeError: 'GriffinCausalLMOutput' object has no attribute 'attentions'.
2. `cache_position` not passed

* Transferring between machines.

* Running normal tests.

* Minor fix.

* More fixes.

* Addressing more comments.

* Minor fixes.

* first stab at cleanup

* more refactoring

* fix copies and else

* renaming and get init to work

* fix causal mask creation

* update

* nit

* fix a hell lot of things

* updates

* update conversion script

* make all keys importable

* nits

* add auto mappings

* properly convert ffw_up and down

* add scaling

* fix generations

* for recurrent dtype

* update

* fix going beyond window

* fixup

* add missing files

* current updates to remove last einops

* finish modeling refactor

* TADA

* fix compile

* fix most failing tests?

* update tests

* refactor and update

* update

* nits, fixup and update tests

* more fixup

* nits

* fix imports

* test format

* fixups

* nits

* tuple typing

* fix code quality

* add model card

* fix doc

* skip most generation tests

* nits

* style

* doc fixes

* fix pr and check_copies?

* last nit

* oupsy

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>

* update

* Update src/transformers/models/recurrent_gemma/convert_recurrent_gemma_to_hf.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update based on review

* doc nit

* fix quality

* quality

* fix slow test model path

* update default dtype

* ignore attributes that can be safely ignored in check config attributes

* 0lallalala come on

* save nit

* style

* remove to dict update

* make sure we can also run in float16

* style

---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Aleksandar Botev <botev@google.com>
Co-authored-by: Leonard Berrada <lberrada@users.noreply.github.com>
Co-authored-by: anushanf <anushanf@google.com>
Co-authored-by: botev <botevmg@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

0fe44059

Fix typing annotation in hf_argparser (#30156) · 33bca541
Xu Song authored Apr 10, 2024

33bca541

Fix accelerate kwargs for versions <0.28.0 (#30086) · 0f94e3e1

Anton Vlasjuk authored Apr 10, 2024

* fix learning rate display issue in galore optimizer

* fix kwarg in accelerate when using versions < 0.28.0

* this was supposed to be in the other PR whoops

0f94e3e1

[UDOP] Improve docs, add resources (#29571) · 505854f7
NielsRogge authored Apr 10, 2024
```
* Improve docs

* Add more tips
```
505854f7

[UDOP] Fix tests (#29573) · 50c1c19f
NielsRogge authored Apr 10, 2024
```
* Fix tests

* Fix tests

* Remove no_split_modules
```
50c1c19f

Add str to TrainingArguments report_to type hint (#30078) · b7d002bd

Matthew Hoffman authored Apr 10, 2024

* Add str to TrainingArguments report_to type hint

* Swap order in Union

* Merge Optional into Union

https://github.com/huggingface/transformers/pull/30078#issuecomment-2042227546

b7d002bd
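
The three steps listed for this commit (add `str`, swap the order in the `Union`, merge `Optional` into the `Union`) can be illustrated with plain `typing`. This is a minimal sketch, not the code from the PR: the alias names and the `normalize_report_to` helper are hypothetical, and the exact annotation used in `TrainingArguments` is not shown in this log.

```python
from typing import List, Optional, Union

# Hypothetical aliases for two spellings of the same hint.
# The real annotation in TrainingArguments is not reproduced in this log.
BeforeMerge = Optional[Union[str, List[str]]]
AfterMerge = Union[None, str, List[str]]

# `typing` flattens Optional into Union, so both spellings compare equal.
assert BeforeMerge == AfterMerge


def normalize_report_to(report_to: AfterMerge) -> List[str]:
    """Illustrative helper (not part of transformers): coerce to a list."""
    if report_to is None:
        return []
    if isinstance(report_to, str):
        return [report_to]
    return list(report_to)
```

Merging `Optional` into the `Union` changes only how the hint reads; a type checker treats both forms identically.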

[tests] make 2 tests device-agnostic (#30008) · 18546378
Fanli Lin authored Apr 10, 2024
```
add torch device
```
18546378

[CI] Quantization workflow fix (#30158) · bb76f81e

Marc Sun authored Apr 10, 2024



* fix workflow

* call ci

* Update .github/workflows/self-scheduled-caller.yml
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

bb76f81e

Fix and simplify semantic-segmentation example (#30145) · 56d001b2

Pavel Iakubovskii authored Apr 10, 2024

* Remove unused augmentation

* Fix pad_if_smaller() and remove unused augmentation

* Add indentation

* Fix requirements

* Update dataset use instructions

* Replace transforms with albumentations

* Replace identity transform with None

* Fixing formatting

* Fixed comment place

56d001b2

Raushan Turganbay authored Apr 10, 2024



* avoid generation length warning

* add tests

* Update src/transformers/generation/candidate_generator.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add tests and minor fixes

* refine `min_new_tokens`

* Update src/transformers/generation/candidate_generator.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add method to prepare length arguments

* add test for min length

* Update src/transformers/generation/candidate_generator.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* fix variable naming

* empty commit for tests

* trigger tests (empty)

---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

41579763

09 Apr, 2024 11 commits

[CI] Fix setup (#30147) · 6cdbd73e

Marc Sun authored Apr 09, 2024

* [CI] fix setup

* fix

* test

* Revert "test"

This reverts commit 7df416d45074439e2fa1b78afd24eacf37ce072f.

6cdbd73e

[docs] Fix image segmentation guide (#30132) · 21e23ffc
Steven Liu authored Apr 09, 2024
```
fixes
```
21e23ffc

Fix quantization tests (#29914) · 58a939c6

Marc Sun authored Apr 09, 2024

* revert back to torch 2.1.1

* run test

* switch to torch 2.2.1

* update dockerfile

* fix awq tests

* fix test

* run quanto tests

* update tests

* split quantization tests

* fix

* fix again

* final fix

* fix report artifact

* build docker again

* Revert "build docker again"

This reverts commit 399a5f9d9308da071d79034f238c719de0f3532e.

* debug

* revert

* style

* new notification system

* testing notification

* rebuild docker

* fix_prev_ci_results

* typo

* remove warning

* fix typo

* fix artifact name

* debug

* issue fixed

* debug again

* fix

* fix time

* test notif with failing test

* typo

* issues again

* final fix ?

* run all quantization tests again

* remove name to clear space

* revert modification done on workflow

* fix

* build docker

* build only quant docker

* fix quantization ci

* fix

* fix report

* better quantization_matrix

* add print

* revert to the basic one

58a939c6

Send headers when converting safetensors (#30144) · 6487e9b3
Yih-Dar authored Apr 09, 2024
```
Co-authored-by: Wauplin <lucainp@gmail.com>
```
6487e9b3

Fix slow tests for important models to be compatible with A10 runners (#29905) · 08a194fc

Yih-Dar authored Apr 09, 2024



* fix mistral and mixtral

* add pdb

* fix mixtral test

* fix

* fix mistral ?

* add fix gemma

* fix mistral

* fix

* test

* another test

* fix

* fix

* fix mistral tests

* fix them again

* final fixes for mistral

* fix padding right

* fix whisper fa2

* fix

* fix

* fix gemma

* test

* fix llama

* fix

* fix

* fix llama gemma

* add class attribute

* fix CI

* clarify whisper

* compute_capability

* rename names in some comments

* Add   # fmt: skip

* make style

* Update tests/models/mistral/test_modeling_mistral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update

* update

---------
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

08a194fc

[Trainer] Undo #29896 (#30129) · e9c23fa0
NielsRogge authored Apr 09, 2024
```
* Undo

* Use tokenizer

* Undo data collator
```
e9c23fa0

[Trainer] Fix default data collator (#30142) · ba1b24e0
NielsRogge authored Apr 09, 2024
```
* Fix data collator

* Support feature extractors as well
```
ba1b24e0

Revert workaround for TF safetensors loading (#30128) · ec59a421

Matt authored Apr 09, 2024

* See if we can get tests to pass with the fixed weights

* See if we can get tests to pass with the fixed weights

* Replace the revisions now that we don't need them anymore

ec59a421

Fix docs Pop2Piano (#30140) · 841e87ef
Raushan Turganbay authored Apr 09, 2024
```
fix copies
```
841e87ef

Add datasets.Dataset to Trainer's train_dataset and eval_dataset type hints (#30077) · af4c0262

Matthew Hoffman authored Apr 09, 2024

* Add datasets.Dataset to Trainer's train_dataset and eval_dataset type hints

* Add is_datasets_available check for importing datasets under TYPE_CHECKING guard

https://github.com/huggingface/transformers/pull/30077/files#r1555939352

af4c0262
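
The second bullet of this commit describes importing an optional dependency only under a `TYPE_CHECKING` guard. A minimal sketch of that pattern follows. `is_datasets_available` here is a simplified stand-in written for this example, and `TinyTrainer` is a hypothetical class, not the real `Trainer`.

```python
import importlib.util
from typing import TYPE_CHECKING, Optional


def is_datasets_available() -> bool:
    """Simplified stand-in: True if the `datasets` package can be found."""
    return importlib.util.find_spec("datasets") is not None


# TYPE_CHECKING is False at runtime, so this import is only seen by static
# type checkers, and only attempted when the optional package is installed.
if TYPE_CHECKING:
    if is_datasets_available():
        import datasets


class TinyTrainer:
    """Hypothetical class used only to show the annotation style."""

    def __init__(self, train_dataset: Optional["datasets.Dataset"] = None):
        # The string annotation is never evaluated at runtime, so this
        # works even when `datasets` is not installed.
        self.train_dataset = train_dataset
```

Because the hint is a string, the module imports cleanly whether or not `datasets` is installed.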

Fix failing DeepSpeed model zoo tests (#30112) · 4e3490f7

Sourab Mangrulkar authored Apr 09, 2024

* fix sequence length errors

* fix label column name error for vit

* fix the lm_head embedding!=linear layer mismatches for Seq2Seq models

4e3490f7

08 Apr, 2024 17 commits

[`StableLm`] Add QK normalization and Parallel Residual Support (#29745) · 2f12e408

Jonathan Tow authored Apr 08, 2024

* init: add StableLm 2 support

* add integration test for parallel residual and qk layernorm

* update(modeling): match qk norm naming for consistency with phi/persimmon

* fix(tests): run fwd/bwd on random init test model to jitter norm weights off identity

* `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward`

* refactor: rename head states var in `StableLmLayerNormPerHead`

* tests: update test model and add generate check

2f12e408

Adding `mps` as device for `Pipeline` class (#30080) · 8c00b53e

Felix Hirwa Nshuti authored Apr 08, 2024



* adding env variable for mps and is_torch_mps_available for Pipeline

* fix linting errors

* Remove environment override
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

8c00b53e

Fix typo at ImportError (#30090) · 7afade20
DrAnaximandre authored Apr 08, 2024
```
fix typo at ImportError
```
7afade20

Make vitdet jit trace compliant (#30065) · ef38e2a7

fxmarty authored Apr 08, 2024

* remove controlflows

* style

* rename patch_ to padded_ following review comment

* style

ef38e2a7

Trainer / Core : Do not change init signature order (#30126) · a71def02
Younes Belkada authored Apr 08, 2024
```
* Update trainer.py

* fix copies
```
a71def02

Fix falcon with SDPA, alibi but no passed mask (#30123) · 1897874e

fxmarty authored Apr 08, 2024



* fix falcon without attention_mask & alibi

* add test

* Update tests/models/falcon/test_modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

1897874e

fix learning rate display in trainer when using galore optimizer (#30085) · 1773afce
Anton Vlasjuk authored Apr 08, 2024
```
fix learning rate display issue in galore optimizer
```
1773afce

Accept token in trainer.push_to_hub() (#30093) · 08c84433

Nick Doiron authored Apr 08, 2024



* pass token to trainer.push_to_hub

* fmt

* Update src/transformers/trainer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* pass token to create_repo, update_folder

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

08c84433

[#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888) · 0201f642

Utkarsha Gupte authored Apr 08, 2024



* ImportError: Trainer with PyTorch requires accelerate>=0.20.1 Fix

Adding the evaluate and accelerate installs at the beginning of the cell to fix the issue

* ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1

* Import Error Fix

* Update installation.md

* Update quicktour.md

* rollback other lang changes

* Update _config.py

* updates for other languages

* fixing error

* Tutorial Update

* Update tokenization_utils_base.py

* Just use an optimizer string to pass the doctest?

---------
Co-authored-by: Matt <rocketknight1@gmail.com>

0201f642

Patch fix - don't use safetensors for TF models (#30118) · 7f9aff91
amyeroberts authored Apr 08, 2024
```
* Patch fix - don't use safetensors for TF models

* Skip test for TF for now

* Update for another test
```
7f9aff91

fixing issue 30034 - adding data format for run_ner.py (#30088) · f5658732
JINO ROHIT authored Apr 08, 2024

f5658732

[tests] add `require_bitsandbytes` marker (#30116) · d16f0abc
Fanli Lin authored Apr 08, 2024
```
* add bnb flag

* move marker

* add accelerator marker
```
d16f0abc

updated examples/pytorch/language-modeling scripts and requirements.txt to... · 5e673ed2

Haz Sameen Shahgir authored Apr 08, 2024

updated examples/pytorch/language-modeling scripts and requirements.txt to require datasets>=2.14.0 (#30120)

updated requirements.txt and require_version() calls in examples/pytorch/language-modeling to require datasets>=2.14.0

5e673ed2

Make MLFlow version detection more robust and handle mlflow-skinny (#29957) · 836e88ca

Howard Liberty authored Apr 08, 2024

* Make MLFlow version detection more robust and handle mlflow-skinny

* Make function name more clear and refactor the logic

* Further refactor

836e88ca
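
One way to make version detection cope with `mlflow-skinny` is to query the installed distribution metadata under both names. The sketch below assumes that approach; it is not the code from the PR, and `first_installed_version` is a hypothetical helper name.

```python
import importlib.metadata
from typing import Iterable, Optional


def first_installed_version(distributions: Iterable[str]) -> Optional[str]:
    """Return the version of the first installed distribution, else None."""
    for name in distributions:
        try:
            return importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            continue
    return None


# Both distributions provide the `mlflow` import package, so the full
# package is tried first and the lightweight one second.
mlflow_version = first_installed_version(["mlflow", "mlflow-skinny"])
```

Looking up the distribution name rather than importing the module avoids relying on `mlflow.__version__` when only the skinny variant is installed.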

Change log level to warning for num_train_epochs override (#30014) · a907a903
Xu Song authored Apr 08, 2024

a907a903

[Whisper] Computing features on GPU in batch mode for whisper feature extractor. (#29900) · 1ed93be4

vaibhavagg303 authored Apr 08, 2024



* add _torch_extract_fbank_features_batch function in feature_extractor_whisper

* reformat feature_extraction_whisper.py file

* handle batching in single function

* add gpu test & doc

* add batch test & device in each __call__

* add device arg in doc string

---------
Co-authored-by: vaibhav.aggarwal <vaibhav.aggarwal@sprinklr.com>

1ed93be4

doc: Correct spelling mistake (#30107) · 1fc34aa6
Cylis authored Apr 08, 2024

1fc34aa6

05 Apr, 2024 2 commits

Fix whisper kwargs and generation config (#30018) · 76fa17c1
Raushan Turganbay authored Apr 05, 2024
```
* clean-up whisper kwargs

* failing test
```
76fa17c1

Fix auto tests (#30067) · 9b5a6450
Yih-Dar authored Apr 05, 2024
```
* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
9b5a6450