Commits · c8d98405a8f7b0e5d07391b671dcc61bb9d7bad5 · chenpangpang / transformers

23 Feb, 2024 6 commits

Use torch 2.2 for daily CI (model tests) (#29208) · c8d98405

Yih-Dar authored Feb 23, 2024



* Use torch 2.2 for daily CI (model tests)

* update

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

c8d98405

Allow remote code repo names to contain "." (#29175) · 371b572e

Matt authored Feb 23, 2024

* stash commit

* stash commit

* It works!

* Remove unnecessary change

* We don't actually need the cache_dir!

* Update docstring

* Add test

* Add test with custom cache dir too

* Update model repo path

371b572e

[`Doc`] update model doc qwen2 (#29238) · 89c64817

Arthur authored Feb 23, 2024



* update model doc qwen2

* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

89c64817

Improve _update_causal_mask performance (#29210) · 3f60d11a
Alessandro Palla authored Feb 23, 2024
```
* Fix issue 29206

* Fix style
```
3f60d11a

Fix missing translation in README_ru (#29054) · 75ed76ec

Amin authored Feb 23, 2024



* Fix missing translation in README_ru

* Update README_ru.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

---------
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

75ed76ec

fix(mlflow): check mlflow version to use the synchronous flag (#29195) · 45244940
cchen-dialpad authored Feb 23, 2024
```
* fix(mlflow): check mlflow version to use the synchronous flag

* fix indent

* add log_params async and fix quality
```
45244940

22 Feb, 2024 3 commits

Fix `torch.compile` with `fullgraph=True` when `attention_mask` input is used (#29211) · 2cc8cf6c
fxmarty authored Feb 22, 2024
```
* fix torch.export.export for llama

* do not change doc title

* make fix copies
```
2cc8cf6c
[Mistral, Mixtral] Improve docs (#29084) · dabe8556
NielsRogge authored Feb 22, 2024
```
* Improve docs

* Improve chat template
```
dabe8556
[Gemma] Fix eager attention (#29187) · 2a9b1f80
Sanchit Gandhi authored Feb 22, 2024
```
* fix modelling code

* add tests

* fix tests

* add some logit tests

* style

* fix fix
```
2a9b1f80

21 Feb, 2024 8 commits

Add training version check for AQLM quantizer. (#29142) · fc37f389
Andrei Panferov authored Feb 21, 2024
```
* training version check

* warn old aqlm

* aqlm 1.0.2 real

* docs
```
fc37f389
FIX [`Gemma`] Fix bad rebase with transformers main (#29170) · ae49b218
Younes Belkada authored Feb 21, 2024
```
fix bad rebase
```
ae49b218

[ `gemma`] Adds support for Gemma 💎 (#29167) · 594c1277

Arthur authored Feb 21, 2024

* initial commit

* update

* update conversion checkpoint

* update conversion script

* nits

* some fixes

* nits

* merge

* fix permute

* nits

* fix

* nits

* nits

* nits

* fix rope

* fix both rope

* nits

* style

* make sure flax works

* fix flax init code

* fix forward

* nits

* print flax generation out

* current code

* nits

* SIIIIIIIIIIIIIIIIIII

* update

* add new tokenizer

* correct fast tokenizer

* fix conversion

* more comments

* fix modeling and conversion

* nits and nits

* nits testing

* add some tokenization tests

* add some edge cases

* add slow tests and fix them

* fixup

* fix copies for modeling

* fix copies

* add 7B slow tests

* fix

* fix

* fix tests

* make tokenizer cis go green

* styling

* last tokenizer nits

* update jax tests

* fix flax for 7b

* add jit testing 🤗



* cleanups

* isolated nit, inv_freq for rotary_emb.inv_freq

* propagate to jax

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* adjust test

* fix conversion script

* change name

* correct file names

* update conversion script

* Fix bos and eos token ids in the model configuration (#3)

* update modelling

* update conversion script

* add static cache for gemma

* fix sdpa generate

* fix batched

* multiple fixes

* fix FA2

* final fix

* Rename a few missing strings and filenames (#4)

* merge with upstream main

* fix copies

* fix copies

* fix fixup

* fix fixup

* fix

* fix

* final tests

* fix fx gemma tests

* fix fx bf16/fp16 tests

* update slow fx tests

* fx slow tests: one logits, one generation

* move jit test standalone

* Apply suggestions from code review

* nits

* tokenizer updates

* more tokenization updates: custom GemmaSentencepieceExtrator

* style

* Update src/transformers/cache_utils.py

* Update src/transformers/models/gemma/__init__.py

* Update tests/models/gemma/test_modeling_flax_gemma.py

* small nits

* style

* update tokenization test

* fix the rotary embedding

* with style

* fix slow tests

* WARNING this commit might be very important for precisions

* Update tests/models/gemma/test_modeling_flax_gemma.py

* Update src/transformers/models/gemma/configuration_gemma.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/models/gemma/modeling_flax_gemma.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* small nits here and there!

* forgotten nit

* remove on the fly computation of inv_freq

* revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float

* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_flax_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* nit conversion script link

* fix some tests

* add not doctest and pr doctest

* repo consistency

* fix last CIs 🚀



* update all readmes

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Lysandre Debut <hi@lysand.re>

594c1277

[`Maskformer`] safely get backbone config (#29166) · 58245ba6
amyeroberts authored Feb 21, 2024
```
Safe getattr
```
58245ba6

support SDPA Attention in stablelm (#29106) · 1d0ea7ab

Ekaterina Aidova authored Feb 21, 2024



* support SDPA Attention in stablelm

* add integration test

* add fallback for output_attentions

* Update src/transformers/models/stablelm/modeling_stablelm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/stablelm/test_modeling_stablelm.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/stablelm/modeling_stablelm.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* handle non-contiguous states

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

1d0ea7ab

`torch.compile` compatibility with `generate` + static cache (#29114) · cc4a664b

fxmarty authored Feb 21, 2024



* fix compatibility

* working version

* cleanup

* sanity checks

* more sanity

* working version WITH refactor

* working without API change

* cleanup & tests pass

* more cleaning

* fix test

* fix tests

* Update src/transformers/generation/utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* smaller comment

* update comment

* update comment

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

cc4a664b

🚨 Llama: update rope scaling to match static cache changes (#29143) · 3994fa5b
Joao Gante authored Feb 21, 2024

3994fa5b
v4.39.dev.0 · 1a77f07f
Arthur Zucker authored Feb 21, 2024

1a77f07f

20 Feb, 2024 20 commits

[`pipeline`] Add pool option to image feature extraction pipeline (#28985) · e770f031
amyeroberts authored Feb 20, 2024
```
* Add pool option

* PR comments - error message and exact outputs check
```
e770f031
Fix drop path being ignored in DINOv2 (#29147) · c47576ca
Fernando Pérez-García authored Feb 20, 2024
```
Fix drop path not being used
```
c47576ca
Added image_captioning version in es and included in toctree file (#29104) · 3c00b885
Gustavo Isturiz authored Feb 20, 2024
```
added image_captioning version in es and included in toctree file
```
3c00b885
Generate: missing generation config eos token setting in encoder-decoder tests (#29146) · 857fd8ea
Joao Gante authored Feb 20, 2024

857fd8ea

Raise unused kwargs image processor (#29063) · 1c81132e

Pablo Montalvo authored Feb 20, 2024

* draft processor arg capture

* add missing vivit model

* add new common test for image preprocess signature

* fix quality

* fix up

* add back missing validations

* quality

* move info level to warning for unused kwargs

1c81132e

[Phi] Add support for sdpa (#29108) · b8b16475
JB (Don) authored Feb 20, 2024

b8b16475
Save (circleci) cache at the end of a job (#29141) · 7688d8df
Yih-Dar authored Feb 20, 2024
```
nice job
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
7688d8df
Add support for fine-tuning CLIP-like models using contrastive-image-text example (#29070) · ee3af60b
Taylor Jackle Spriggs authored Feb 20, 2024
```
* add support for siglip and chinese-clip model training with contrastive-image-text example

* codebase fixups
```
ee3af60b

Revert low cpu mem tie weights (#29135) · 0996a100

amyeroberts authored Feb 20, 2024

* Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)"

This reverts commit 725f4ad1.

* Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)"

This reverts commit 4156f517.

0996a100

[`Core tokenization`] `add_dummy_prefix_space` option to help with latest issues (#28010) · 15cfe389

Arthur authored Feb 20, 2024

* add add_dummy_prefix_space option to slow

* checking kwargs might be better. Should be there for all spm tokenizer IMO

* nits

* fix copies

* more copied

* nits

* add prefix space

* nit

* nits

* Update src/transformers/convert_slow_tokenizer.py

* fix init

* revert wrong styling

* fix

* nits

* style

* updates

* make sure we use slow tokenizer for conversion instead of looking for the decoder

* support llama as well

* update llama tokenizer fast

* nits

* nits nits nits

* update the doc

* update

* update to fix tests

* skip unrelated tailing test

* Update src/transformers/convert_slow_tokenizer.py

* add proper testing

* test decode as well

* more testing

* format

* fix llama test

* Apply suggestions from code review

15cfe389

FIX [`PEFT` / `Trainer` ] Handle better peft + quantized compiled models (#29055) · efdd4366
Younes Belkada authored Feb 20, 2024
```
* handle peft + compiled models

* add tests

* fixup

* adapt from suggestions

* clarify comment
```
efdd4366

[`cuda kernels`] only compile them when initializing (#29133) · 5e95dcab

Arthur authored Feb 20, 2024

* only compile when needed

* fix mra as well

* fix yoso as well

* update

* remove comment

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

* oops

* Update src/transformers/models/deta/modeling_deta.py

* nit

5e95dcab

Generate: unset GenerationConfig parameters do not raise warning (#29119) · a7755d24
Joao Gante authored Feb 20, 2024

a7755d24
Llama: fix batched generation (#29109) · 7d312ad2
Joao Gante authored Feb 20, 2024

7d312ad2
FIX [`bnb` / `tests`] Propagate the changes from #29092 to 4-bit tests (#29122) · ff76e7c2
Younes Belkada authored Feb 20, 2024
```
* forgot to push the changes for 4bit ..

* trigger CI
```
ff76e7c2

Abstract image processor arg checks. (#28843) · 1c9134f0

Pablo Montalvo authored Feb 20, 2024



* abstract image processor arg checks.

* fix signatures and quality

* add validate_ method to rescale-prone processors

* add more validations

* quality

* quality

* fix formatting
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix formatting
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix formatting
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fix formatting mishap
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix crop_size compatibility

* fix default mutable arg

* fix segmentation map + image arg validity

* remove segmentation check from arg validation

* fix quality

* fix missing segmap

* protect PILImageResampling type

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add back segmentation maps check

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

1c9134f0

FEAT [`Trainer` / `bnb`]: Add RMSProp from `bitsandbytes` to HF `Trainer` (#29082) · f7ef7cec

Younes Belkada authored Feb 20, 2024



* add RMSProp to Trainer

* revert some change

* Update src/transformers/trainer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

f7ef7cec

Move misplaced line (#29117) · a7ff2f23
Erich Schubert authored Feb 20, 2024
```
Move misplaced line, improve code comment
```
a7ff2f23
[`gradient_checkpointing`] default to use it for torch 2.3 (#28538) · 9094abe8
Arthur authored Feb 20, 2024
```
* default to use it

* style
```
9094abe8

Fixed nll with label_smoothing to just nll (#28708) · 49c0b293

Nilesh authored Feb 20, 2024

* Fixed nll with label_smoothing to nll

* Resolved conflict by rebase

* Fixed nll with label_smoothing to nll

* Resolved conflict by rebase

* Added label_smoothing to config file

* Fixed nits

49c0b293

19 Feb, 2024 3 commits

storing & logging gradient norm in trainer (#27326) · 4f09d0fd
Shijie Wu authored Feb 19, 2024
```
* report grad_norm during training

* support getting grad_norm from deepspeed
```
4f09d0fd
Fix two tiny typos in `pipelines/base.py::Pipeline::_sanitize_parameters()`'s docstring (#29102) · a4851d94
Sadra Barikbin authored Feb 19, 2024
```
* Update base.py

* Fix a typo
```
a4851d94

Bnb test fix for different hardwares (#29066) · 5ce90f32

Titus authored Feb 19, 2024



* generated text on A10G

* generated text in CI

* Apply suggestions from code review

add explanatory comments
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

5ce90f32