Commits · 7b1170b0faa49a33322415a2f6f8b398fd8fbcc7 · chenpangpang / transformers

25 Apr, 2024 10 commits

Alexander Visheratin authored Apr 25, 2024

* Added WSD scheduler.

* Added tests.

* Fixed errors.

* Fix formatting.

* CI fixes.

7b1170b0

🚨 Add training compatibility for Musicgen-like models (#29802) · 90cb55bf

Yoach Lacombe authored Apr 25, 2024



* first modeling code

* make repository

* still WIP

* update model

* add tests

* add latest change

* clean docstrings and copied from

* update docstrings md and readme

* correct chroma function

* correct copied from and remove unrelated test

* add doc to toctree

* correct imports

* add convert script to notdoctested

* Add suggestion from Sanchit
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct get_uncoditional_inputs docstrings

* modify README according to SANCHIT feedback

* add chroma to audio utils

* clean librosa and torchaudio hard dependencies

* fix FE

* refactor audio decoder -> audio encoder for consistency with previous musicgen

* refactor conditional -> encoder

* modify sampling rate logics

* modify license at the beginning

* refactor all_self_attns->all_attentions

* remove ignore copy from causallm generate

* add copied from for from_sub_models

* fix make copies

* add warning if audio is truncated

* add copied from where relevant

* remove artefact

* fix convert script

* fix torchaudio and FE

* modify chroma method according to feedback-> better naming

* refactor input_values->input_features

* refactor input_values->input_features and fix import fe

* add input_features to docstrings

* correct inputs_embeds logics

* remove dtype conversion

* refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation

* change warning for chroma length

* Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* change way to save wav, using soundfile

* correct docs and change to soundfile

* fix import

* fix init proj layers

* add draft training

* fix cross entropy

* clean loss computation

* fix labels

* remove line breaks from md

* fix issue with docstrings

* add FE suggestions

* improve is in logics and remove useless imports

* remove custom from_pretrained

* simplify docstring code

* add suggestions for modeling tests

* make style

* update converting script with sanity check

* remove encoder attention mask from conditional generation

* replace musicgen melody checkpoints with official orga

* rename ylacombe->facebook in checkpoints

* fix copies

* remove unnecessary warning

* add shape in code docstrings

* add files to slow doc tests

* fix md bug and add md to not_tested

* make fix-copies

* fix hidden states test and batching

* update training code

* add training tests for melody

* add training for o.g musicgen

* fix copied from

* remove final todos

* make style

* fix style

* add suggestions from review

* add ref to the original loss computation code

* rename method + fix labels in tests

* make style

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

90cb55bf

Prevent crash with `WandbCallback` with third parties (#30477) · ce5ae5a4

Tom Aarsen authored Apr 25, 2024

* Use EAFP principle to prevent crash with third parties

* Remove leftover debugging code

* Add info-level logger message

ce5ae5a4
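
The EAFP ("easier to ask forgiveness than permission") change in the `WandbCallback` commit above can be illustrated with a minimal sketch. This is not the code from the commit: the helper `count_parameters`, the two stand-in classes, and the log message are hypothetical names chosen for illustration. It only shows the general pattern of calling a method directly, catching `AttributeError` when a third-party object lacks it, and emitting an info-level log message instead of crashing.

```python
import logging

logger = logging.getLogger(__name__)


def count_parameters(model):
    """Return the model's parameter count, or None if it cannot report one.

    Hypothetical helper: the name and return convention are assumptions
    made for this sketch, not part of the transformers API.
    """
    try:
        # EAFP: call the method directly rather than testing
        # hasattr(model, "num_parameters") beforehand.
        return model.num_parameters()
    except AttributeError:
        # A third-party model may not define num_parameters();
        # log at info level and carry on instead of raising.
        logger.info("Could not determine the number of model parameters.")
        return None


class WithCount:
    """Stand-in for a model that exposes num_parameters()."""

    def num_parameters(self):
        return 42


class ThirdPartyModel:
    """Stand-in for a model that does not expose num_parameters()."""
```

With these stand-ins, `count_parameters(WithCount())` returns the reported count, while `count_parameters(ThirdPartyModel())` logs a message and returns `None` rather than raising.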

Don't run fp16 MusicGen tests on CPU (#30466) · aca4a103
amyeroberts authored Apr 25, 2024

aca4a103

Fix SigLip classification doctest (#30475) · 4fed29e3

amyeroberts authored Apr 25, 2024

* Fix SigLip classification doctest

* Remove extra line

* Update src/transformers/models/siglip/modeling_siglip.py

4fed29e3

Script for finding candidate models for deprecation (#29686) · 30ee508c

amyeroberts authored Apr 25, 2024

* Add utility for finding candidate models for deprecation

* Better model filtering

* Update

* Add warning tip

* Fix up

* Review comments

* Filter requests based on tags

* Add copyright header

30ee508c

[fix codellama conversion] (#30472) · c60749d6
Arthur authored Apr 25, 2024
```
* fix codellama conversion

* nit
```
c60749d6
FIX / Workflow: Fix SSH workflow bug (#30474) · e9b16354
Younes Belkada authored Apr 25, 2024
```
Update ssh-runner.yml
```
e9b16354
FIX / Workflow: Change tailscale trigger condition (#30471) · cd0cd12a
Younes Belkada authored Apr 25, 2024
```
Update push-important-models.yml
```
cd0cd12a

Workflow / ENH: Add SSH into our runners workflow (#30425) · cebb0726

Younes Belkada authored Apr 25, 2024



* add SSH into our runners workflow

* fix

* fix

* fix

* use our previous approaches

* forward contrib credits from discussions

---------
Co-authored-by: Yih-Dar <ydshieh@users.noreply.github.com>

cebb0726

24 Apr, 2024 19 commits

consistent job / pytest report / artifact name correspondence (#30392) · fbb41cd4

Yih-Dar authored Apr 24, 2024



* better names

* run better names

* update

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

fbb41cd4

Non blocking support to torch DL's (#30465) · 6ad9c8f7
Zach Mueller authored Apr 24, 2024
```
* Non blocking support

* Check for optimization

* Doc
```
6ad9c8f7

Enable fp16 on CPU (#30459) · 5c57463b

Zach Mueller authored Apr 24, 2024

* Check removing flag for torch

* LLM oops

* Getting there...

* More discoveries

* Change

* Clean up and prettify

* Logic check

* Not

5c57463b

Neuron: When save_safetensor=False, no need to move model to CPU (#29703) · d1d94d79

jeffhataws authored Apr 24, 2024

save_safetensor=True is default as of release 4.35.0, which then
required TPU hotfix https://github.com/huggingface/transformers/pull/27799
(issue https://github.com/huggingface/transformers/issues/27578).
However, when the flag save_safetensor is set to False (compatibility mode),
moving the model to CPU causes generation of too many graphs
during checkpoint https://github.com/huggingface/transformers/issues/28438.
This PR disables moving the model to CPU when save_safetensor=False.

d1d94d79

[`research_project`] Most of the security issues come from this requirement.txt (#29977) · 661190b4
Arthur authored Apr 24, 2024
```
update most of decision transformers research project
```
661190b4
Fix wrong indent in `utils/check_if_new_model_added.py` (#30456) · d0d430f1
Yih-Dar authored Apr 24, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
d0d430f1

Phi-3 (#30423) · c9693db2

Gustavo de Rosa authored Apr 24, 2024

* chore(root): Initial commit of Phi-3 files.

* fix(root): Fixes Phi-3 missing on readme.

* fix(root): Ensures files are consistent.

* fix(phi3): Fixes unit tests.

* fix(tests): Fixes style of phi-3 test file.

* chore(tests): Adds integration tests for Phi-3.

* fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm.

* fix(phi3): Fixes incorrect docstrings.

* fix(phi3): Fixes docstring typos.

* fix(phi3): Adds support for Su and Yarn embeddings.

* fix(phi3): Improves according to the first batch of reviews.

* fix(phi3): Uses up_states instead of y in Phi3MLP.

* fix(phi3): Uses gemma rotary embedding to support torch.compile.

* fix(phi3): Improves how rotary embedding classes are defined.

* fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.

* fix(phi3): Adds last suggestions to modeling file.

* fix(phi3): Splits inv_freq calculation in two lines.

c9693db2

Add `paths` filter to avoid the chance of being triggered (#30453) · 42fed15c
Yih-Dar authored Apr 24, 2024
```
* trigger

* remove the last job

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
42fed15c

[SegGPT] Fix loss calculation (#30421) · d26c1413

Eduardo Pacheco authored Apr 24, 2024



* Fixed main train issues

* Added loss test

* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added missing labels arg in SegGptModel forward

* Fixed typo

* Added slow test to test loss calculation

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

d26c1413

fix jamba slow forward for multi-gpu (#30418) · 37fa1f65
Marc Sun authored Apr 24, 2024
```
* fix jamba slow forward for multi-gpu

* remove comm

* oups

* style
```
37fa1f65
fix uncaught init of linear layer in clip's/siglip's for image classification models (#30435) · 5d64ae9d
Anton Vlasjuk authored Apr 24, 2024
```
* fix clip's/siglip's _init_weights to reflect linear layers in "for image classification"

* trigger slow tests
```
5d64ae9d
[tests] make test device-agnostic (#30444) · 16c8e176
Fanli Lin authored Apr 24, 2024
```
* make device-agnostic

* clean code
```
16c8e176

[`Llava`] + CIs fix red cis and llava integration tests (#30440) · 9a4a119c

Arthur authored Apr 24, 2024



* nit

* nit and fmt skip

* fixup

* Update src/transformers/convert_slow_tokenizer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set to true

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

9a4a119c

Fix YOLOS image processor resizing (#30436) · 767e3518

Pavel Iakubovskii authored Apr 24, 2024

* Add test for square image that fails

* Fix for square images

* Extend test cases

* Fix resizing in tests

* Style fixup

767e3518

Add llama3 (#30334) · 89c510d8

Arthur authored Apr 24, 2024



* nuke

* add co-author

* add co-author

* update card

* fixup and fix copies to please our ci

* nit fixup

* super small nits

* remove tokenizer_path from call to `write_model`

* always safe serialize by default

---------
Co-authored-by: pcuenca <pcuenca@users.noreply.github.com>
Co-authored-by: xenova <xenova@users.noreply.github.com>

89c510d8

New model PR needs green (slow tests) CI (#30341) · fc34f842

Yih-Dar authored Apr 24, 2024



* You should not pass
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

fc34f842

Remove mentions of models in the READMEs and link to the documentation page in... · c6bba940

Lysandre Debut authored Apr 24, 2024

Remove mentions of models in the READMEs and link to the documentation page in which they are featured. (#30420)

* READMEs

* READMEs v2

c6bba940

Remove add-new-model in favor of add-new-model-like (#30424) · d4e92f1a
Lysandre Debut authored Apr 24, 2024
```
* Remove add-new-model in favor of add-new-model-like

* nits
```
d4e92f1a
Remove task guides auto-update in favor of links towards task pages (#30429) · 0eb8fbcd
Lysandre Debut authored Apr 24, 2024

0eb8fbcd

23 Apr, 2024 11 commits

[`LlamaTokenizerFast`] Refactor default llama (#28881) · e34da3ee

Arthur authored Apr 23, 2024

* push legacy to fast as well

* super strange

* Update src/transformers/convert_slow_tokenizer.py

* make sure we are BC

* fix Llama test

* nit

* revert

* more test

* style

* update

* small update w.r.t tokenizers

* nit

* don't split

* lol

* add a test for `add_prefix_space=False`

* fix gemma tokenizer as well

* update

* fix gemma

* nicer failures

* fixup

* update

* fix the example for legacy = False

* use `huggyllama/llama-7b` for the PR doctest

* nit

* use from_slow

* fix llama

e34da3ee

Fix use_cache for xla fsdp (#30353) · 12c39e56
Jiewen Tan authored Apr 23, 2024
```
* Fix use_cache for xla fsdp

* Fix linters
```
12c39e56
Rename torch.run to torchrun (#30405) · b8b1e442
Steven Basart authored Apr 23, 2024
```
torch.run does not exist anywhere as far as I can tell.
```
b8b1e442

Remove old TF port docs (#30426) · 696ededd

Matt authored Apr 23, 2024

* Remove old TF port guide

* repo-consistency

* Remove some translations as well for consistency

* Remove some translations as well for consistency

696ededd

Fix LayoutLMv2 init issue and doctest (#30278) · 416fdbad

Yih-Dar authored Apr 23, 2024



* fix

* try suggestion

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

416fdbad

FIX: re-add bnb on docker image (#30427) · d179b9dc
Younes Belkada authored Apr 23, 2024
```
Update Dockerfile
```
d179b9dc
Make EosTokenCriteria compatible with mps (#30376) · 4b63d013
Pedro Cuenca authored Apr 23, 2024

4b63d013

fix for itemsize => element_size() for torch backwards compat (#30133) · 57fc00f3

Wing Lian authored Apr 23, 2024



* fix for itemsize => element_size() for torch backwards compat

* improve handling of element counting

* Update src/transformers/modeling_utils.py

* fixup

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

57fc00f3

Fix on "cache position" for assisted generation (#30068) · 77b59dce

Raushan Turganbay authored Apr 23, 2024



* clean commit history I hope

* get kv seq length correctly

* PR suggestions

* Update src/transformers/testing_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add comment

* give gpt bigcode its own overridden method

* remove code

---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

77b59dce

Jax: scipy version pin (#30402) · 31921d8d
Joao Gante authored Apr 23, 2024
```
scipy pin for jax
```
31921d8d
[tests] add `require_torch_sdpa` for test that needs sdpa support (#30408) · 2d61823f
Fanli Lin authored Apr 23, 2024
```
* add cuda flag

* check for sdpa

* add bitsandbytes
```
2d61823f