- 03 Apr, 2024 1 commit
-
-
Yih-Dar authored
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 02 Apr, 2024 11 commits
-
-
Mario Šaško authored
-
Joao Gante authored
* fix norm * fix logits processors doctests
-
Nicolas Patry authored
* Hard error when ignoring tensors. (#27484)
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
  - Find all names we should normally drop (those are in the transformers config)
  - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
  - Clone those disjoint tensors getting rid of the issue
  - Find all identical names (those should be declared in the config but we try to find them all anyway.)
  - For all identical names:
    - If they are in the config, just ignore them, everything is fine
    - If they are not, warn about them.
  - For all remainder tensors which are shared yet neither identical NOR disjoint, raise a hard error.
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
* Add small tests.
* Dead variable.
* Fixup.
* Fixing tied_Weights_keys on generic models.
* Fixup + T5 encoder/decoder tying (with different layers)
* Code quality.
* Dynamic member.
* trigger
* Fixing encoder name for other types of encoder/decoder combos.
* Fix scoping.
* Update .github/workflows/self-scheduled.yml
* Fixing the tied_weights after the call.

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
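The classification this commit walks through (identical vs. disjoint vs. conflicting shared tensors) can be sketched in plain Python. This is an illustrative model only, not the transformers implementation: each tensor view is reduced to a hypothetical `(buffer_id, offset, length)` triple standing in for a torch storage pointer.

```python
from collections import defaultdict

def classify_shared(views):
    """views: dict name -> (buffer_id, offset, length).
    Returns (identical, disjoint, conflicting) groups of names
    whose tensors share an underlying buffer."""
    by_buffer = defaultdict(list)
    for name, (buf, off, length) in views.items():
        by_buffer[buf].append((name, off, length))

    identical, disjoint, conflicting = [], [], []
    for entries in by_buffer.values():
        if len(entries) < 2:
            continue  # not shared at all, nothing to decide
        spans = {(off, length) for _, off, length in entries}
        names = [n for n, _, _ in entries]
        if len(spans) == 1:
            # all views cover exactly the same bytes: true tied weights,
            # fine if declared in the config, warn otherwise
            identical.append(names)
        else:
            # check neighbouring [off, off + length) ranges for overlap
            ordered = sorted(entries, key=lambda e: e[1])
            overlaps = any(
                a[1] + a[2] > b[1]
                for a, b in zip(ordered, ordered[1:])
            )
            # disjoint views can be cloned before saving;
            # partially overlapping ones are the hard-error case
            (conflicting if overlaps else disjoint).append(names)
    return identical, disjoint, conflicting
```

In the commit's terms: identical names should be declared as tied weights, disjoint ones can safely be copied to break the sharing, and anything left over triggers the new hard error.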
-
Minsub Lee (Matt) authored
* Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode
* Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode
* Exclude pad_token filtering since it is used as CTC-blank token
* Add small test for skip_special_tokens
* Update decoding test for added new token
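The pad token must be excluded from special-token filtering because Wav2Vec2 uses it as the CTC blank, which drives repeat-collapsing during decoding. A minimal, simplified sketch of that ordering (not the tokenizer's actual `_decode`):

```python
def ctc_decode(ids, id_to_token, blank_id, special_ids,
               skip_special_tokens=True):
    """Collapse repeats, drop the blank, then filter other specials.
    The blank (pad) token must reach the collapsing step: filtering it
    out early would wrongly merge genuinely repeated characters."""
    out, prev = [], None
    for i in ids:
        if i == prev:
            continue  # collapse consecutive repeats
        prev = i
        if i == blank_id:
            continue  # the blank only separates repeats, never emitted
        if skip_special_tokens and i in special_ids:
            continue  # drop remaining special tokens (e.g. <s>, <unk>)
        out.append(id_to_token[i])
    return "".join(out)
```

With the blank between the two `l` tokens, `"hello"` survives collapsing; filter the blank as an ordinary special token first and the repeated `l` would be lost.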
-
Michael authored
-
Yoach Lacombe authored
* add FA2 to o.g Musicgen
* make style
* add FA2 support to Musicgen Melody
* add generation FA2 tests to o.g Musicgen
* make style and fix copies
* add Musicgen to FA2 docs + deprecate list
* add sdpa supports to Musicgen's
* make style and fix copies
* refactor attention implementation arguments
* add Copied from to sdpa tests
* add copied form in sdpa tests melody
* add copied for FA2 generation tests
* add FA2 inference copied from
* make style
-
théo gigant authored
* fix issue with logit processor in beam search in Flax
* adding FlaxNoRepeatNGramLogitsProcessor class + unit test
* style correction and code verification
* add FlaxNoRepeatNGramLogitsProcessor to the test_processor_list and test_processor_list_jitted tests
* fix an issue where ngrams are banned only if they appear ==1 time + update description of get_previous_ngrams
* replace non-jit compatible masking of ngrams that are not yet generated with jittable version
* Revert "fix issue with logit processor in beam search in Flax"
  This reverts commit 09b70d7e4dc32d0cc4db61af09a835a9cd238b50.
* add FlaxNoRepeatNGramLogitsProcessor to _get_logits_processor
* change the method of casting to boolean of banned tokens indices
* fix code style
* remove some useless operations + significantly faster computation of update indices using jax.lax.fori_loop
* remove useless loop iterations
* set some variables that were calculated and used multiple times
* fix format
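The eager-mode logic behind a no-repeat-ngram processor is easy to state; the Flax version here reworks it into jit-compatible form with `jax.lax.fori_loop`. A purely illustrative plain-Python sketch of the underlying rule:

```python
def banned_ngram_tokens(prev_tokens, ngram_size):
    """Tokens that would complete an n-gram already present in
    prev_tokens (eager sketch of the no-repeat-ngram rule; the real
    Flax processor computes the same masks in jittable form)."""
    if len(prev_tokens) + 1 < ngram_size:
        return set()  # not enough context to form a full n-gram yet
    # map every (n-1)-token prefix seen so far to its completions
    seen = {}
    for i in range(len(prev_tokens) - ngram_size + 1):
        prefix = tuple(prev_tokens[i : i + ngram_size - 1])
        seen.setdefault(prefix, set()).add(prev_tokens[i + ngram_size - 1])
    # the current (n-1)-token suffix decides which tokens are banned
    suffix = tuple(prev_tokens[len(prev_tokens) - ngram_size + 1 :])
    return seen.get(suffix, set())
```

A logits processor would then set the scores of the returned token ids to negative infinity before sampling.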
-
Marc Sun authored
fix bug
-
Hovnatan Karapetyan authored
* Fix sinusoidal_embeddings in FlaubertModel
* Fix for Informer
* Fix for XLM
* Move sinusoidal emb for XLM
* Move sinusoidal emb for Flaubert
* Small cleanup
* Add comments on tests code copied from
* Add with Distilbert->
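For context, the fixed sinusoidal embeddings being moved here follow the classic Transformer formula. A minimal sketch; the exact channel layout differs slightly between models, so treat this as illustrative rather than the XLM/Flaubert code:

```python
import math

def sinusoidal_embeddings(n_pos, dim):
    """Fixed sinusoidal position embeddings: even channels get
    sin(pos / 10000^(i/dim)), the following odd channel gets cos
    of the same angle."""
    out = [[0.0] * dim for _ in range(n_pos)]
    for pos in range(n_pos):
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            out[pos][i] = math.sin(angle)
            if i + 1 < dim:
                out[pos][i + 1] = math.cos(angle)
    return out
```

Because the table is deterministic, it can be rebuilt at load time instead of being stored in the checkpoint, which is why where it lives in the model matters for these fixes.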
-
Arthur authored
* fix bug and add tests
* nit
* otherway to get the cur len instead of attention mask
* more places where this might have been broken
* nit
* oups
* inputs_embeds vs input_embeds
* test generated outptus
* style
* nit
* fix
* skip failing biogpt
-
Steven Liu authored
* update * feedback
-
- 01 Apr, 2024 4 commits
-
-
Joao Gante authored
-
Fanli Lin authored
[tests] fix the wrong output in `ImageToTextPipelineTests.test_conditional_generation_llava` (#29975) bug fix
-
Arthur authored
* fix copies
* nit
* style
* Update utils/check_copies.py
-
Yoach Lacombe authored
* fix FA2 tests * refactor inference test name
-
- 31 Mar, 2024 1 commit
-
-
Zach Mueller authored
* Start rework
* Fix failing test
* Include max
* Update src/transformers/trainer.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 30 Mar, 2024 6 commits
-
-
TechxGenus authored
fix awq quant
-
Bo Zheng authored
* Update qwen2_moe.md
* update link of blogpost.
* fixup

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
-
Gary Wang authored
Fixes #29690
-
Alexander Jipa authored
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
-
Jacky Lee authored
* improve: error message for best model metric * update: raise warning instead of error
-
Jacky Lee authored
fix: rope_theta for open llama
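For context, `rope_theta` is the base of the geometric frequency progression in rotary position embeddings; the fix makes Open-Llama read it from its config. A minimal sketch of the inverse frequencies the parameter controls (illustrative, not the model's actual code):

```python
def rope_inv_freq(dim, rope_theta=10000.0):
    """Inverse frequencies for rotary position embeddings: one value
    per channel pair, decaying geometrically with base rope_theta.
    Larger rope_theta stretches the wavelengths, which is how
    long-context variants extend their positional range."""
    return [1.0 / (rope_theta ** (2 * i / dim)) for i in range(dim // 2)]
```

Each position `p` is then rotated by angles `p * inv_freq[i]` in the corresponding channel pair.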
-
- 29 Mar, 2024 2 commits
-
-
fzyzcjy authored
* with with * style
-
Yih-Dar authored
* fix
* revert for qwen2
* revert for qwen2
* update
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 28 Mar, 2024 15 commits
-
-
MariaHei authored
Trainer with PyTorch now requires accelerate to be installed. Partly resolves huggingface/transformers#29174
-
Arthur authored
* fix
* fix test
* style
* nit
* rather rely on concert token to id
* fix quality
* Update src/transformers/convert_slow_tokenizer.py
-
VINAYAKK GARG authored
Fix doc issue in DebertaV2Config class
Co-authored-by: Vinayakk Garg <vigar@akamai.com>
-
Arthur authored
* fi xbc? * nit
-
Yu Chin Fabian Lim authored
* add gradient_accumulation_kwargs to AcceleratorConfig
* add suggestions from @muellerzr to docstrings, new behavior and tests
* Documentation suggestions from @muellerz
* addressed @muellerzr comments regarding tests and test utils
* moved accelerate version to top of file.
* @muellerzr's variable fix
* address @amyeroberts. fix tests and docstrings
* address @amyeroberts additional suggestions

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
-
Arthur authored
[`TokenizationLlama`] fix the way we convert tokens to strings to keep leading spaces 🚨 breaking fix (#29453)
* nit
* update test and fix test
* fixup
-
Arthur authored
* nit
* update
* oups
* Update src/transformers/models/mamba/modeling_mamba.py

Co-authored-by: Lysandre Debut <hi@lysand.re>
-
Joao Gante authored
* add hard rope scaling test
* make fixup
* quick rope scaling tests
* add copy statements
-
Christopher Keibel authored
* add functions to get number of params which require grad, get optimizer group for parameters and get learning rates of param groups to trainer.py
* add tests and raise ValueError when optimizer is None
* add second layer to test and freeze its weigths
* check if torch is available before running tests
* use decorator to check if torch is available
* fix test indentation

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
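The first of these helpers reduces to summing parameter sizes where gradients are required. A stand-in sketch of that logic; the `Param` dataclass here is a hypothetical substitute for `torch.nn.Parameter`, not the Trainer's actual code:

```python
from dataclasses import dataclass

@dataclass
class Param:
    """Minimal stand-in for torch.nn.Parameter: element count plus
    whether the parameter is trainable (frozen layers set this False)."""
    numel: int
    requires_grad: bool

def get_num_trainable_parameters(params):
    """Count elements over parameters that require gradients,
    skipping frozen ones."""
    return sum(p.numel for p in params if p.requires_grad)
```

The companion helpers follow the same pattern over the optimizer's param groups, raising if no optimizer has been set up yet.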
-
amyeroberts authored
* Safe import of LRScheduler
* Update src/transformers/trainer_pt_utils.py
* Update src/transformers/trainer_pt_utils.py
* Fix up

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Aymeric Roucher authored
-
Joao Gante authored
* replace torch.testing.assert_allclose by torch.testing.assert_close * missing atol rtol
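Both the deprecated and the replacement function test the same criterion, `|actual - expected| <= atol + rtol * |expected|`. A pure-Python sketch of that check; the rtol/atol values shown are the float32 defaults documented for `torch.testing.assert_close`:

```python
def is_close(actual, expected, rtol=1.3e-6, atol=1e-5):
    """Elementwise closeness test matching the documented
    torch.testing.assert_close criterion:
    |actual - expected| <= atol + rtol * |expected|."""
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )
```

The "missing atol rtol" follow-up in the commit matters because `assert_close` requires both tolerances to be given together when overriding either.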
-
Fanli Lin authored
fix typo
-
Eduardo Pacheco authored
* First commit to add flash attention 2 for GPT-2
* more improvements
* Make GPT2 pass tests and fixed Decison Transformers copies
* Fixed missing arg
* fix copies
* Added expected speedup
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Added test
* Fixed attn attribute
* Update docs/source/en/model_doc/gpt2.md
* Update docs/source/en/model_doc/gpt2.md
* Update Decision transformer attentions
* More updates
* Passing tests
* Fix copies
* Fix copies part 2
* Decision transformer updates
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Fix copies
* Decision transformer not supporting flash attn
* Addressed comments
* Addressed comments
* Addressed comments

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Arthur authored
* add doc warning * fix build pr
-