Commits · f26e4073707189c93915227779a4f6ea3c40d43b · chenpangpang / transformers

08 May, 2024 2 commits

Cache: models return input cache type (#30716) · f26e4073
Joao Gante authored May 08, 2024

f26e4073

Immutability for data collators (#30603) · 71c19850

Anton Vlasjuk authored May 08, 2024

* immutability fix for seq2seq as well as immutability tests for the collators

* ensure we don't act on none labels and formatting

* remove tf/pt in respective tests as they are not required

* more type error fixes tf/np

* remove todo

* apply suggestions from code review

* formatting / style

71c19850
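
To illustrate what immutability means for a collator, here is a minimal sketch. It is not the code from #30603: the helper name `pad_labels`, the `-100` padding value and the feature layout are assumptions chosen for illustration. The idea is that padding builds new lists and leaves the caller's feature dicts untouched.

```python
from typing import Dict, List, Optional


def pad_labels(
    features: List[Dict[str, Optional[List[int]]]], pad_id: int = -100
) -> List[Dict[str, Optional[List[int]]]]:
    """Return padded copies of `features` without modifying the input (hypothetical helper)."""
    # Only consider features that actually carry labels ("don't act on None labels").
    lengths = [len(f["labels"]) for f in features if f.get("labels") is not None]
    max_len = max(lengths, default=0)

    padded = []
    for feature in features:
        new_feature = dict(feature)  # shallow copy, so the caller's dict is not mutated
        labels = feature.get("labels")
        if labels is not None:
            # Build a new list instead of extending the original one in place.
            new_feature["labels"] = list(labels) + [pad_id] * (max_len - len(labels))
        padded.append(new_feature)
    return padded


batch = [{"labels": [1, 2, 3]}, {"labels": [4]}]
padded_batch = pad_labels(batch)
```

Calling the helper twice on the same `batch` gives the same result, which is the property an immutability test would check.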

07 May, 2024 4 commits

Add safetensors to model not found error msg for default use_safetensors value (#30602) · cf7bed98
David Xue authored May 07, 2024
```
* add safetensors to model not found error for default use_safetensors=None case

* format code w/ ruff

* fix assert true typo
```
cf7bed98

Reboot Agents (#30387) · 0ba15ced

Aymeric Roucher authored May 07, 2024



* Create CodeAgent and ReactAgent

* Fix formatting errors

* Update documentation for agents

* Add custom errors, improve logging

* Support variable usage in ReactAgent

* add messages

* Add message passing format

* Create React Code Agent

* Update

* Refactoring

* Fix errors

* Improve python interpreter

* Only non-tensor inputs should be sent to device

* Calculator tool slight refactor

* Improve docstrings

* Refactor

* Fix tests

* Fix more tests

* Fix even more tests

* Fix tests by replacing output and input types

* Fix operand type issue

* two small fixes

* EM TTS

* Fix agent running type errors

* Change text to speech tests to allow changed outputs

* Update doc with new agent types

* Improve code interpreter

* If max iterations reached, provide a real answer instead of an error

* Add edge case in interpreter

* Add safe imports to the interpreter

* Interpreter tweaks: tuples and listcomp

* Make style

* Make quality

* Add dictcomp to interpreter

* Rename ReactJSONAgent to ReactJsonAgent

* Misc changes

* ToolCollection

* Rename agent's logger to self.logger

* Add while loops to interpreter

* Update doc with new tools. still need to mention collections

* Add collections to the doc

* Small fixes on logs and interpreter

* Fix toolbox return type

* Docs + fixup

* Skip doctests

* Correct prompts with improved examples and formatting

* Update prompt

* Remove outdated docs

* Change agent to accept Toolbox object for tools

* Remove calculator tool

* Propagate removal of calculator in doc

* Fix 2 failing workflows

* Simplify additional argument passing

* AgentType audio

* Minor changes: function name, types

* Remove calculator tests

* Fix test

* Fix torch requirement

* Fix final answer tests

* Style fixes

* Fix tests

* Update docstrings with calculator removal

* Small type hint fixes

* Update tests/agents/test_translation.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/agents/test_python_interpreter.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/default_tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/agents/test_agents.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/bert/configuration_bert.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/speech_to_text.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/agents/test_speech_to_text.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/agents/test_tools_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* pygments

* Answer comments

* Cleaning up

* Simplifying init for all agents

* Improving prompts and making code nicer

* Style fixes

* Add multiple comparator test in interpreter

* Style fixes

* Improve BERT example in documentation

* Add examples to doc

* Fix python interpreter quality

* Logging improvements

* Change test flag to agents

* Quality fix

* Add example for HfEngine

* Improve conversation example for HfEngine

* typo fix

* Verify doc

* Update docs/source/en/agents.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/agents.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/prompts.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/python_interpreter.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/agents.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix style issues

* local s2t tool

---------
Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Lysandre <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

0ba15ced

Word-level timestamps broken for short-form audio (#30325) · 9c8979e3

Kamil Akesbi authored May 07, 2024



* force chunk_length_s in AutomaticSpeechRecognitionPipeline

* compute num_frames even when stride is None

* add slow tests

* fix test

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add input validation

* fixup

* small fix

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

9c8979e3

Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024) · 54a2361a

JB (Don) authored May 07, 2024

* Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True

* Testing for the non-safe-tensors case, since the default is safe-tensors already

* Running fixup/fix-copies

* Adding accelerate annotations to tests

54a2361a

06 May, 2024 4 commits

Trainer - add cache clearing and the option for batched eval metrics computation (#28769) · df475bf8

Nate Cibik authored May 06, 2024

* Added cache clearing for GPU efficiency.

* Added cache clearing for GPU efficiency.

* Added batch_eval_metrics capability

* Ran make fixup

* Fixed bug

* Fixed whitespace issue

* Fixed outdated condition

* Updated docstrings with instructions for batch_eval_metrics. Updated end of dataloader logic

* Added first version of batch_eval_metrics Trainer test

* Fixed batch_eval_metrics Trainer tests for both eval and predict

* Fixed batch_eval_metrics behavior for new Trainer variables

* Fixed batch_eval_metrics Trainer tests

* Ran fixup

df475bf8
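
As a rough sketch of what batched metric computation involves (not the Trainer code from #28769): a metric object keeps running statistics per batch and only produces the final value when told the last batch has arrived. The class name `RunningAccuracy` and the `compute_result` flag as used here are assumptions for illustration.

```python
from typing import Dict, Optional, Sequence


class RunningAccuracy:
    """Accumulate accuracy statistics batch by batch (hypothetical example)."""

    def __init__(self) -> None:
        self.correct = 0
        self.total = 0

    def update(
        self,
        predictions: Sequence[int],
        labels: Sequence[int],
        compute_result: bool = False,
    ) -> Optional[Dict[str, float]]:
        # Fold this batch into the running counts instead of storing all predictions.
        self.correct += sum(int(p == l) for p, l in zip(predictions, labels))
        self.total += len(labels)
        if not compute_result:
            return None
        # Last batch: emit the metric and reset the state for the next evaluation.
        result = {"accuracy": self.correct / self.total}
        self.correct, self.total = 0, 0
        return result


metric = RunningAccuracy()
first = metric.update([1, 0], [1, 1])
final = metric.update([1], [1], compute_result=True)
```

Because only two counters are kept between batches, memory use does not grow with the size of the evaluation set.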

Trainer._load_from_checkpoint - support loading multiple Peft adapters (#30505) · e0769530

Clara Pohland authored May 06, 2024



* Trainer: load checkpoint model with multiple adapters

* Trainer._load_from_checkpoint support multiple active adapters

* PeftModel.set_adapter does not support multiple adapters yet

* Trainer._load_from_checkpoint test multiple adapters

---------
Co-authored-by: Clara Luise Pohland <clara-luise.pohland@telekom.de>

e0769530

Quantization / HQQ: Fix HQQ tests on our runner (#30668) · 9c772ac8
Younes Belkada authored May 06, 2024
```
Update test_hqq.py
```
9c772ac8

[`CI update`] Try to use dockers and no cache (#29202) · 307f632b

Arthur authored May 06, 2024



* change cis

* nits

* update

* minor updates

* [push-ci-image]

* nit [push-ci-image]

* nitsssss

* [build-ci-image]

* [push-ci-image]

* [push-ci-image]

* both

* [push-ci-image]

* this?

* [push-ci-image]

* pypi-kenlm needs g++

* [push-ci-image]

* nit

* more nits [push-ci-image]

* nits [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* add vision

* [push-ci-image]

* [push-ci-image]

* add new dummy file but will need to update them [push-ci-image]

* [push-ci-image]

* show package size as well

* [push-ci-image]

* potentially ignore failures

* workflow updates

* nits [push-ci-image]

* [push-ci-image]

* fix consistency

* clean nciida triton

* also show big packages [push-ci-image]

* nit

* update

* another one

* line escape?

* add accelerate [push-ci-image]

* updates [push-ci-image]

* nits to run tests, no push-ci

* try to parse skip reason to make sure nothing is skipped that should not be skipped

* nit?

* always show skipped reasons

* nits

* better parsing of the test outputs

* action="store_true",

* failure on failed

* show matched

* debug

* update short summary with skipped, failed and errors

* nits

* nits

* coolu pdates

* remove docbuilder

* fix

* always run checks

* oups

* nits

* don't error out on library printing

* non zero exit codes

* no warning

* nit

* WAT?

* format nit

* [push-ci-image]

* fail if fail is needed

* [push-ci-image]

* sound file for torch light?

* [push-ci-image]

* order is important [push-ci-image]

* [push-ci-image] reduce even further

* [push-ci-image]

* use pytest rich !

* yes [push-ci-image]

* oupsy

* bring back the full traceback, but pytest rich should help

* nit

* [push-ci-image]

* re run

* nit

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* empty push to trigger

* [push-ci-image]

* nit? [push-ci-image]

* empty

* try to install timm with no deps

* [push-ci-image]

* oups [push-ci-image]

* [push-ci-image]

* [push-ci-image] ?

* [push-ci-image] open ssh client for git checkout fast

* empty for torch light

* updates [push-ci-image]

* nit

* @v4 for checkout

* [push-ci-image]

* [push-ci-image]

* fix fetch tests with parallelism

* [push-ci-image]

* more parallelism

* nit

* more nits

* empty to re-trigger

* empty to re-trigger

* split by timing

* did not work with previous commit

* junit.xml

* no path?

* mmm this?

* junitxml format

* split by timing

* nit

* fix junit family

* now we can test if the xunit1 is compatible!

* this?

* fully list tests

* update

* update

* oups

* finally

* use classname

* remove working directory to make sure the path does not interfere

* okay no juni should have the correct path

* name split?

* sort by classname is what make most sense

* some testing

* naem

* oups

* test something fun

* autodetect

* 18?

* nit

* file size?

* uip

* 4 is best

* update to see versions

* better print

* [push-ci-image]

* [push-ci-image]

* please install the correct keras version

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* uv is fucking me up

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* nits

* [push-ci-image]

* [push-ci-image]

* install issues and pins

* tapas as well

* nits

* more parallelism

* short tb

* soundfile

* soundfile

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* oups

* [push-ci-image]

* fix some things

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* use torch-light for hub

* small git lfs for hub job

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* fix tf tapas

* [push-ci-image]

* nits

* [push-ci-image]

* don't update the test

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* no use them

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* update tf proba

* [push-ci-image]

* [push-ci-image]

* woops

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* test with built dockers

* [push-ci-image]

* skip annoying tests

* revert fix copy

* update test values

* update

* last skip and fixup

* nit

* ALL GOOOD

* quality

* Update tests/models/layoutlmv2/test_image_processing_layoutlmv2.py

* Update docker/quality.dockerfile
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/models/tapas/modeling_tf_tapas.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>

* use torch-speed

* updates

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* fuck ken-lm [push-ci-image]

* [push-ci-image]

* [push-ci-image]

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>

307f632b

02 May, 2024 4 commits

Add HQQ quantization support (#29637) · 59952994

mobicham authored May 02, 2024



* update HQQ transformers integration

* push import_utils.py

* add force_hooks check in modeling_utils.py

* fix | with Optional

* force bias as param

* check bias is Tensor

* force forward for multi-gpu

* review fixes pass

* remove torch grad()

* if any key in linear_tags fix

* add cpu/disk check

* isinstance return

* add multigpu test + refactor tests

* clean hqq_utils imports in hqq.py

* clean hqq_utils imports in quantizer_hqq.py

* delete hqq_utils.py

* Delete src/transformers/utils/hqq_utils.py

* ruff init

* remove torch.float16 from __init__ in test

* refactor test

* isinstance -> type in quantizer_hqq.py

* cpu/disk device_map check in quantizer_hqq.py

* remove type(module) nn.linear check in quantizer_hqq.py

* add BaseQuantizeConfig import inside HqqConfig init

* remove hqq import in hqq.py

* remove accelerate import from test_hqq.py

* quant config.py doc update

* add hqqconfig to main_classes doc

* make style

* __init__ fix

* ruff __init__

* skip_modules list

* hqqconfig format fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* test_hqq.py remove mistral comment

* remove self.using_multi_gpu is False

* torch_dtype default val set and logger.info

* hqq.py isinstance fix

* remove torch=None

* torch_device test_hqq

* rename test_hqq

* MODEL_ID in test_hqq

* quantizer_hqq setattr fix

* quantizer_hqq typo fix

* imports quantizer_hqq.py

* isinstance quantizer_hqq

* hqq_layer.bias reformat quantizer_hqq

* Step 2 as comment in quantizer_hqq

* prepare_for_hqq_linear() comment

* keep_in_fp32_modules fix

* HqqHfQuantizer reformat

* quantization.md hqqconfig

* quantization.md model example reformat

* quantization.md # space

* quantization.md space   })

* quantization.md space   })

* quantization_config fix doc
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* axis value check in quantization_config

* format

* dynamic config explanation

* quant config method in quantization.md

* remove shard-level progress

* .cuda fix modeling_utils

* test_hqq fixes

* make fix-copies

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

59952994

Output `None` as attention when layer is skipped (#30597) · 4c940934
Jonghwan Hyeon authored May 03, 2024
```
* Output `None` as attention when layer is skipped

* Add test for output_attentions
```
4c940934

🚨 Update image_processing_vitmatte.py (#30566) · f9530258
Richard Brown authored May 02, 2024
```
* Update image_processing_vitmatte.py

* add test

* [run-slow]vitmatte
```
f9530258

Fix for Neuron (#30259) · fbabd674
Michael Benayoun authored May 02, 2024

fbabd674

01 May, 2024 3 commits

Fix llava half precision and autocast issues (#29721) · 5090ea3f

Fraser Mince authored May 01, 2024

* Ensure input_embeds and image_features are the same dtype in autocast

* Fix nans in half precision llava-next and fix autocasting behavior.

* Fix styling issues.

* fix randn newline instantiation

* fix broken slow llava test

* Fix llava next init.

* fix styling issues

* [run-slow]llava,llava_next

* fix styling issues

5090ea3f

Encoder-decoder models: move embedding scale to nn.Module (#30410) · 38a4bf79

Raushan Turganbay authored May 01, 2024



* move scaling to nn.Module

* let the test be here for now (need to fix)

* failing tests

* last failing models

* Revert commit 4c14817f38

* clean-up

* oops forgot

* codestyle

* raise NotImplemented when possible

* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* skip tests in respective modeling files

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

38a4bf79

Use text config's vocab size in testing models (#30568) · 9d31b32e
Raushan Turganbay authored May 01, 2024
```
use text config's vocab size
```
9d31b32e

30 Apr, 2024 5 commits

Remove `use_square_size` after loading (#30567) · 78fdd64d

Yih-Dar authored Apr 30, 2024



* fix

* add test

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

78fdd64d

Add chat templating support for KeyDataset in text-generation pipeline (#30558) · 2ecefc39

DarshanDeshpande authored Apr 30, 2024

* added chat templating support for keydataset in generation pipeline

* fixed and improved test

* fix formatting test failures

* Fix tests

* Fix tests

2ecefc39

BlipModel: get_multimodal_features method (#30438) · 0cdb6b3f

Jiarui Xu authored May 01, 2024

* add_blip_get_multimodal_features

* Fix docstring error

* reimplement get_multimodal_features

* fix error

* recheck code quality

* add new necessary tests

0cdb6b3f

Fix seq2seq collator padding (#30556) · 9112520b

Anton Vlasjuk authored Apr 30, 2024

* fix seq2seq data collator to respect the given padding strategy

further added tests for the seq2seq data collator in the style of the `data_collator_for_token_classification` (pt, tf, np)

* formatting and change bool equals "==" to "is"

* add missed return types in tests

* update numpy test as it can handle unequal shapes, not like pt or tf

9112520b

Cache: Static cache as a standalone object (#30476) · 75bbfd5b
Joao Gante authored Apr 30, 2024

75bbfd5b

26 Apr, 2024 6 commits

[SegGPT] Fix seggpt image processor (#29550) · 6d4cabda

Eduardo Pacheco authored Apr 26, 2024

* Fixed SegGptImageProcessor to handle 2D and 3D prompt mask inputs

* Added new test to check prompt mask equivalence

* New proposal

* Better proposal

* Removed unnecessary method

* Updated seggpt docs

* Introduced do_convert_rgb

* nits

6d4cabda

load_image - decode b64encode and encodebytes strings (#30192) · c793b26f
amyeroberts authored Apr 26, 2024
```
* Decode b64encode and encodebytes strings

* Remove conditional encode -- image is always a string
```
c793b26f
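
The difference between the two encodings can be shown with the standard library alone. This is a sketch, not the `load_image` implementation: `base64.b64encode` produces a single line, `base64.encodebytes` inserts newlines, and `base64.b64decode` in its default non-validating mode accepts both. The helper name `decode_b64_image_bytes` is hypothetical.

```python
import base64


def decode_b64_image_bytes(encoded: str) -> bytes:
    """Decode a base64 string produced by either b64encode or encodebytes."""
    # With the default validate=False, characters outside the base64 alphabet
    # (including the newlines that encodebytes inserts) are discarded first.
    return base64.b64decode(encoded)


raw = bytes(range(256)) * 2  # stand-in for image bytes
one_line = base64.b64encode(raw).decode()
multi_line = base64.encodebytes(raw).decode()

decoded_one_line = decode_b64_image_bytes(one_line)
decoded_multi_line = decode_b64_image_bytes(multi_line)
```

Both decoded values equal `raw`, so a loader that always receives a string does not need to know which encoder produced it.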

[`DETR`] Remove timm hardcoded logic in modeling files (#29038) · aafa7ce7

amyeroberts authored Apr 26, 2024



* Enable instantiating model with pretrained backbone weights

* Clarify pretrained import

* Use load_backbone instead

* Add backbone_kwargs to config

* Fix up

* Add tests

* Tidy up

* Enable instantiating model with pretrained backbone weights

* Update tests so backbone checkpoint isn't passed in

* Clarify pretrained import

* Update configs - docs and validation check

* Update src/transformers/utils/backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Clarify exception message

* Update config init in tests

* Add test for when use_timm_backbone=True

* Use load_backbone instead

* Add use_timm_backbone to the model configs

* Add backbone_kwargs to config

* Pass kwargs to constructors

* Draft

* Fix tests

* Add back timm - weight naming

* More tidying up

* Whoops

* Tidy up

* Handle when kwargs are none

* Update tests

* Revert test changes

* Deformable detr test - don't use default

* Don't mutate; correct model attributes

* Add some clarifying comments

* nit - grammar is hard

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

aafa7ce7

[`BERT`] Add support for sdpa (#28802) · dfa7b580

JB (Don) authored Apr 26, 2024

* Adding SDPA support for BERT

* Using the proper input name for testing model input in inference()

* Adding documentation for SDPA in BERT model page

* Use the stable link for the documentation

* Adding a gate to only call .contiguous() for torch < 2.2.0

* Additions and fixes to the documentation

* Minor updates to documentation

* Adding extra requirements needed for the contiguous() bug

* Adding "Adapted from" in place of the "Copied from"

* Add benchmark speedup tables to the documentation

* Minor fixes to the documentation

* Use ClapText as a replacement for Bert in the Copied-From

* Some more fixes for the fix-copies references

* Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage

[test all]

* Undo changes to separate test

* Refactored SDPA self attention code for KV projections

* Change use_sdpa to attn_implementation

* Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)

dfa7b580

Use the Keras set_random_seed in tests (#30504) · 2de5cb12
Matt authored Apr 26, 2024
```
Use the Keras set_random_seed to ensure reproducible weight initialization
```
2de5cb12

Update `dtype_byte_size` to handle torch.float8_e4m3fn/float8_e5m2 types (#30488) · 20081c74
Michael Goin authored Apr 26, 2024
```
* Update modeling_utils/dtype_byte_size to handle float8 types

* Add a test for dtype_byte_size

* Format

* Fix bool
```
20081c74
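
One way to see why float8 names need special care is to derive the byte size from the dtype name. The sketch below is an illustration under stated assumptions, not the function from `modeling_utils`: it works on the name string (so it runs without torch), and the helper name `dtype_byte_size_from_name` is hypothetical. A pattern that expects the bit width at the very end of the name fails on `float8_e4m3fn`, so the pattern allows an optional `_...` suffix after the digits.

```python
import re


def dtype_byte_size_from_name(dtype_name: str) -> float:
    """Return the size in bytes of one element, parsed from a dtype name (hypothetical helper)."""
    if dtype_name.endswith("bool"):
        return 1 / 8  # treat a bool as a single bit
    # Digits preceded by a non-digit, optionally followed by a suffix such as "_e4m3fn".
    match = re.search(r"[^\d](\d+)(_.*)?$", dtype_name)
    if match is None:
        raise ValueError(f"`{dtype_name}` does not contain a bit width.")
    return int(match.group(1)) / 8


sizes = {
    name: dtype_byte_size_from_name(name)
    for name in ("torch.float16", "torch.int64", "torch.float8_e4m3fn", "torch.float8_e5m2")
}
```

The search is anchored on the first run of digits that can reach the end of the name, so the `4` and `3` inside `e4m3fn` are treated as part of the suffix rather than as a bit width.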

25 Apr, 2024 5 commits

Fix Llava for 0-embeddings (#30473) · e60491ad
Raushan Turganbay authored Apr 25, 2024

e60491ad

Introduce Stateful Callbacks (#29666) · ad697f18

Zach Mueller authored Apr 25, 2024



* Introduce saveable callbacks

* Add note

* Test for non-present and flag

* Support early stopping and refusing to train further

* Update docstring

* More saving

* Import oopsie

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Make it go through TrainerArguments

* Document

* Fix test

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Rework to allow for duplicates

* CLean

* Fix failing tests

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

ad697f18

Add WSD scheduler (#30231) · 7b1170b0

Alexander Visheratin authored Apr 25, 2024

* Added WSD scheduler.

* Added tests.

* Fixed errors.

* Fix formatting.

* CI fixes.

7b1170b0
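
WSD is commonly expanded as warmup-stable-decay. The sketch below shows the general shape of such a schedule as a learning-rate multiplier; it is not the scheduler added in #30231. The function name, the parameter names and the choice of a linear decay are assumptions for illustration.

```python
def wsd_lr_multiplier(
    step: int,
    num_warmup_steps: int,
    num_stable_steps: int,
    num_decay_steps: int,
    min_lr_ratio: float = 0.0,
) -> float:
    """Multiplier applied to the base learning rate at `step` (illustrative sketch)."""
    if step < num_warmup_steps:
        # Warmup: ramp linearly from 0 to 1.
        return step / max(1, num_warmup_steps)
    if step < num_warmup_steps + num_stable_steps:
        # Stable: hold the base learning rate.
        return 1.0
    decay_step = step - num_warmup_steps - num_stable_steps
    if decay_step < num_decay_steps:
        # Decay: assumed linear here, from 1 down to min_lr_ratio.
        progress = decay_step / max(1, num_decay_steps)
        return 1.0 - (1.0 - min_lr_ratio) * progress
    return min_lr_ratio


schedule = [wsd_lr_multiplier(s, 10, 20, 10) for s in (0, 5, 15, 35, 40)]
```

A function of this shape could be passed to a lambda-based scheduler, which multiplies the optimizer's base learning rate by the returned value at each step.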

🚨 Add training compatibility for Musicgen-like models (#29802) · 90cb55bf

Yoach Lacombe authored Apr 25, 2024



* first modeling code

* make repository

* still WIP

* update model

* add tests

* add latest change

* clean docstrings and copied from

* update docstrings md and readme

* correct chroma function

* correct copied from and remove unrelated test

* add doc to toctree

* correct imports

* add convert script to notdoctested

* Add suggestion from Sanchit
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct get_unconditional_inputs docstrings

* modify README according to SANCHIT feedback

* add chroma to audio utils

* clean librosa and torchaudio hard dependencies

* fix FE

* refactor audio decoder -> audio encoder for consistency with previous musicgen

* refactor conditional -> encoder

* modify sampling rate logics

* modify license at the beginning

* refactor all_self_attns->all_attentions

* remove ignore copy from causallm generate

* add copied from for from_sub_models

* fix make copies

* add warning if audio is truncated

* add copied from where relevant

* remove artefact

* fix convert script

* fix torchaudio and FE

* modify chroma method according to feedback-> better naming

* refactor input_values->input_features

* refactor input_values->input_features and fix import fe

* add input_features to docstrings

* correct inputs_embeds logics

* remove dtype conversion

* refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation

* change warning for chroma length

* Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* change way to save wav, using soundfile

* correct docs and change to soundfile

* fix import

* fix init proj layers

* add draft training

* fix cross entropy

* clean loss computation

* fix labels

* remove line breaks from md

* fix issue with docstrings

* add FE suggestions

* improve is in logics and remove useless imports

* remove custom from_pretrained

* simplify docstring code

* add suggestions for modeling tests

* make style

* update converting script with sanity check

* remove encoder attention mask from conditional generation

* replace musicgen melody checkpoints with official orga

* rename ylacombe->facebook in checkpoints

* fix copies

* remove unnecessary warning

* add shape in code docstrings

* add files to slow doc tests

* fix md bug and add md to not_tested

* make fix-copies

* fix hidden states test and batching

* update training code

* add training tests for melody

* add training for o.g musicgen

* fix copied from

* remove final todos

* make style

* fix style

* add suggestions from review

* add ref to the original loss computation code

* rename method + fix labels in tests

* make style

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

90cb55bf

Don't run fp16 MusicGen tests on CPU (#30466) · aca4a103
amyeroberts authored Apr 25, 2024

aca4a103

24 Apr, 2024 5 commits

Phi-3 (#30423) · c9693db2

Gustavo de Rosa authored Apr 24, 2024

* chore(root): Initial commit of Phi-3 files.

* fix(root): Fixes Phi-3 missing on readme.

* fix(root): Ensures files are consistent.

* fix(phi3): Fixes unit tests.

* fix(tests): Fixes style of phi-3 test file.

* chore(tests): Adds integration tests for Phi-3.

* fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm.

* fix(phi3): Fixes incorrect docstrings.

* fix(phi3): Fixes docstring typos.

* fix(phi3): Adds support for Su and Yarn embeddings.

* fix(phi3): Improves according to the first batch of reviews.

* fix(phi3): Uses up_states instead of y in Phi3MLP.

* fix(phi3): Uses gemma rotary embedding to support torch.compile.

* fix(phi3): Improves how rotary embedding classes are defined.

* fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.

* fix(phi3): Adds last suggestions to modeling file.

* fix(phi3): Splits inv_freq calculation in two lines.

c9693db2

[SegGPT] Fix loss calculation (#30421) · d26c1413

Eduardo Pacheco authored Apr 24, 2024



* Fixed main train issues

* Added loss test

* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added missing labels arg in SegGptModel forward

* Fixed typo

* Added slow test to test loss calculation

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

d26c1413

[tests] make test device-agnostic (#30444) · 16c8e176
Fanli Lin authored Apr 24, 2024
```
* make device-agnostic

* clean code
```
16c8e176

[`Llava`] + CIs fix red cis and llava integration tests (#30440) · 9a4a119c

Arthur authored Apr 24, 2024



* nit

* nit and fmt skip

* fixup

* Update src/transformers/convert_slow_tokenizer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* set to true

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

9a4a119c

Fix YOLOS image processor resizing (#30436) · 767e3518

Pavel Iakubovskii authored Apr 24, 2024

* Add test for square image that fails

* Fix for square images

* Extend test cases

* Fix resizing in tests

* Style fixup

767e3518

23 Apr, 2024 2 commits

[`LlamaTokenizerFast`] Refactor default llama (#28881) · e34da3ee

Arthur authored Apr 23, 2024

* push legacy to fast as well

* super strange

* Update src/transformers/convert_slow_tokenizer.py

* make sure we are BC

* fix Llama test

* nit

* revert

* more test

* style

* update

* small update w.r.t tokenizers

* nit

* don't split

* lol

* add a test for `add_prefix_space=False`

* fix gemma tokenizer as well

* update

* fix gemma

* nicer failures

* fixup

* update

* fix the example for legacy = False

* use `huggyllama/llama-7b` for the PR doctest

* nit

* use from_slow

* fix llama

e34da3ee

Fix on "cache position" for assisted generation (#30068) · 77b59dce

Raushan Turganbay authored Apr 23, 2024



* clean commit history I hope

* get kv seq length correctly

* PR suggestions

* Update src/transformers/testing_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add comment

* give gpt bigcode its own overridden method

* remove code

---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

77b59dce