- 10 Nov, 2023 4 commits
-
-
Arthur authored
* fix? * actual fix * fixups * add dataclass to the attention mask converter * refine testing suite * make sure there are no overflows * update the test
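A minimal sketch (plain PyTorch, not the exact transformers API) of the kind of overflow-safe conversion an attention-mask converter centralizes: expanding a 2D padding mask to a 4D additive mask using the dtype's minimum value instead of a large literal, which is what "make sure there are no overflows" refers to here.

```python
# Illustrative sketch only: 2D padding mask -> 4D additive mask without fp16 overflow.
import torch

def expand_mask(mask_2d: torch.Tensor, dtype: torch.dtype, tgt_len: int) -> torch.Tensor:
    bsz, src_len = mask_2d.shape
    # [bsz, 1, tgt_len, src_len], 1.0 where attention is allowed
    expanded = mask_2d[:, None, None, :].expand(bsz, 1, tgt_len, src_len).to(dtype)
    inverted = 1.0 - expanded
    # torch.finfo(dtype).min instead of e.g. -1e9 keeps the fill value representable in fp16
    return inverted.masked_fill(inverted.to(torch.bool), torch.finfo(dtype).min)

mask = torch.tensor([[1, 1, 1, 0]])
print(expand_mask(mask, torch.float16, tgt_len=4).shape)  # torch.Size([1, 1, 4, 4])
```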
-
Susnato Dhar authored
* init commit * attention arch done except rotary emb * rotary emb done * text encoder working * outputs matching * arch first pass done * make commands done, tests and docs remaining * all tests passed, only docs remaining * docs done * doc-builder fix * convert script removed (not relevant) * minor comments done * added ckpt conversion script * tokenizer done * very minor fix of index.md 2 * mostly make fixup related * all done except fe and rotary emb * very small change * removed unidecode dependency * style changes * tokenizer removed require_backends * added require_inflect to tokenizer tests * removed VOCAB_FILES in tokenizer test * inflect dependency removed * added rotary pos emb cache and simplified the apply method * style * little doc change * more comments * feature extractor added * added processor * auto-regressive config added * added CLVPConditioningEncoder * comments done except the test one * weights added successfully (NOT tested) * tokenizer fix with numbers * generate outputs matching * almost tests passing, Integ tests not written * Integ tests added * major CUDA error fixed * docs done * rebase and multiple fixes * fixed rebase overwrites * generate code simplified and tests for AutoRegressive model added * minor changes * refactored gpt2 code in clvp file * weights done and all code refactored * mostly done except the fast_tokenizer * doc test fix * config file's doc fixes * more config fix * more comments * tokenizer comments mostly done * modeling file mostly refactored and can load modules * ClvpEncoder tested * ClvpDecoder, ClvpModel and ClvpForCausalLM tested * integration and all tests passed * more fixes * docs almost done * ckpt conversion refactored * style and some failing tests fix * comments * temporary output fix but test_assisted_decoding_matches_greedy_search test fails * majority changes done * use_cache outputs same now! Along with the assisted_greedy_decoding test fix * more comments * more comments * prepare_inputs_for_generation fixed and _prepare_model_inputs added * style fix * clvp.md change * moved clvpconditionalencoder norms * add model to new index * added tokenizer input_ids_with_special_tokens * small fix * config mostly done * added config-tester and changed conversion script * more comments * comments * style fix * some comments * tokenizer changed back to prev state * small comments * added output hidden states for the main model * style fix * comments * small change * revert small change * . * Update clvp.md * Update test_modeling_clvp.py * :) * some minor change * new fixes * remove to_dict from FE
-
Younes Belkada authored
* add str to enum conversion * fixup * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
jiqing-feng authored
* add attention_mask and position_ids in assisted model * fix bug * fix attention mask * fix attention_mask * check assist inputs * check assist input ids length * fix assist model type * set assist attention mask device
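A minimal sketch (plain PyTorch, assuming a left-padded batch) of deriving position_ids from an attention_mask, the kind of inputs the assistant model now receives explicitly during assisted generation:

```python
import torch

attention_mask = torch.tensor([[0, 0, 1, 1, 1],
                               [1, 1, 1, 1, 1]])
position_ids = attention_mask.long().cumsum(-1) - 1
position_ids.masked_fill_(attention_mask == 0, 1)  # padded positions get a dummy index
print(position_ids)
# tensor([[1, 1, 0, 1, 2],
#         [0, 1, 2, 3, 4]])
```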
-
- 09 Nov, 2023 6 commits
-
-
Yih-Dar authored
* fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yoach Lacombe authored
* remove failing tests and clean FE files * remove similar text from tvlt
-
Lucain authored
* Fix RequestCounter to make it more future-proof * code quality
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Dave Berenbaum authored
* dvclive trainer callback * style fixes * dvclive link fixes
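A hedged usage sketch for the new DVCLive Trainer callback. The `"dvclive"` report_to value is an assumption based on this PR's description; the integration also requires the `dvclive` package.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    report_to=["dvclive"],  # assumed integration name registered by this PR
    logging_steps=10,
)
# Passing these args to a Trainer would then log metrics to DVCLive during training.
```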
-
Arthur authored
[`CodeLlamaTokenizer`] Nit, update __init__ to make sure the AddedTokens are not normalized because they are special (#27359) * make sure tokens are properly initialized for codellama slow * add more pretrained models * style * test more tokenizers checkpoints
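A hedged sketch of what "not normalized because they are special" means in practice: declaring the CodeLlama infilling tokens as `AddedToken`s with `normalized=False` so the normalizer never rewrites them. The `special=` keyword requires a recent `tokenizers` release; treat the exact defaults applied by the tokenizer as per this PR.

```python
from tokenizers import AddedToken

# CodeLlama fill-in-the-middle tokens, kept verbatim by the tokenizer
prefix = AddedToken("▁<PRE>", normalized=False, special=True)
middle = AddedToken("▁<MID>", normalized=False, special=True)
suffix = AddedToken("▁<SUF>", normalized=False, special=True)
```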
-
- 08 Nov, 2023 5 commits
-
-
Yoach Lacombe authored
* change handmade attention mask to _prepare_4d_attention_mask * add flashattention2 support in Bark * add flashattention2 tests on BarkSemanticModel * make style * fix flashattention and tests + make style * fix memory leak and allow Bark to pass flash attention to sub-models * make style * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove unnecessary code from tests + justify overriding * Update tests/models/bark/test_modeling_bark.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
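A hedged usage sketch for the Flash Attention 2 support added to Bark here. It requires the `flash-attn` package, a CUDA device, and a half-precision dtype; depending on the transformers version the flag is `use_flash_attention_2=True` or `attn_implementation="flash_attention_2"`.

```python
import torch
from transformers import AutoProcessor, BarkModel

processor = AutoProcessor.from_pretrained("suno/bark-small")
model = BarkModel.from_pretrained(
    "suno/bark-small",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",  # older versions: use_flash_attention_2=True
).to("cuda")

inputs = processor("Hello, my dog is cute").to("cuda")
audio = model.generate(**inputs)
```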
-
Sanchit Gandhi authored
-
Sanchit Gandhi authored
* [MusicGen] Add stereo model * safe serialization * Update src/transformers/models/musicgen/modeling_musicgen.py * split over 2 lines * fix slow tests on cuda
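A hedged sketch of generating with one of the new stereo MusicGen checkpoints. The checkpoint name is an assumption based on the stereo release; the API mirrors the mono models, with the output carrying two audio channels.

```python
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-stereo-small")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-stereo-small")

inputs = processor(text=["lo-fi hip hop with a calm melody"], padding=True, return_tensors="pt")
audio = model.generate(**inputs, max_new_tokens=256)
print(audio.shape)  # (batch, channels, samples) -- two channels for the stereo model
```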
-
Yih-Dar authored
* fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yoach Lacombe authored
* add audio_utils usage in the FE of SpeechToText * clean unnecessary parameters of AudioSpectrogramTransformer FE * add audio_utils usage in AST * add serialization tests and function to FEs * make style * remove use_torchaudio and move to_dict to FE * test audio_utils usage * make style and fix import (remove torchaudio dependency import) * fix torch dependency for jax and tensor tests * fix typo * clean tests with suggestions * add lines to test if is_speech_available is False
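A hedged sketch of the numpy-based `transformers.audio_utils` path these feature extractors now use instead of torchaudio; argument names reflect my reading of that module, and the filter-bank settings are illustrative rather than the exact ones used by these FEs.

```python
import numpy as np
from transformers.audio_utils import mel_filter_bank, spectrogram, window_function

sampling_rate = 16000
waveform = np.zeros(sampling_rate, dtype=np.float32)  # 1 s of silence as a stand-in

mel_filters = mel_filter_bank(
    num_frequency_bins=201,          # n_fft // 2 + 1 for n_fft = 400
    num_mel_filters=80,
    min_frequency=0.0,
    max_frequency=8000.0,
    sampling_rate=sampling_rate,
    norm="slaney",
    mel_scale="slaney",
)
log_mel = spectrogram(
    waveform,
    window_function(400, "hann"),
    frame_length=400,
    hop_length=160,
    power=2.0,
    mel_filters=mel_filters,
    log_mel="log10",
)
print(log_mel.shape)  # (num_mel_filters, num_frames)
```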
-
- 07 Nov, 2023 10 commits
-
-
Plemeur authored
* Allow for scheduler kwargs * Formatting * Arguments checks, passing the tests * Black failed somehow --------- Co-authored-by: Pierre <pierre@avatarin.com>
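A hedged sketch of forwarding extra arguments to the learning-rate scheduler. `lr_scheduler_kwargs` is the TrainingArguments parameter I understand this PR to add; treat the name and the example value as assumptions.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    lr_scheduler_type="cosine_with_restarts",
    lr_scheduler_kwargs={"num_cycles": 3},  # passed through to the scheduler factory
)
```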
-
Yoach Lacombe authored
* fix bark batching * make style * add tests and make style
-
Arthur authored
* `nospeech` instead of `nocaption` for the no speech token * oops
-
Susnato Dhar authored
Update modeling_gpt_bigcode.py
-
Folco Bertini Baldassini authored
Resolve AttributeError by utilizing device calculation at the start of the forward function (#27347) This commit addresses the 'NoneType' object AttributeError within the IdeficsModel forward function. Previously, the 'device' attribute was accessed directly from input_ids, resulting in a potential 'NoneType' error. Now, the device is properly calculated at the beginning of the forward function and utilized consistently throughout, ensuring the 'image_hidden_states' are derived from the correct device. This modification enables smoother processing and compatibility, ensuring the correct device attribution for 'image_encoder_embeddings' in the IdeficsModel forward pass.
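A minimal sketch of the pattern described above: resolve the device once at the top of forward() from whichever input is actually provided, instead of assuming `input_ids` exists.

```python
import torch

def resolve_device(input_ids=None, inputs_embeds=None) -> torch.device:
    # Pick the device up front from whichever input is present.
    if input_ids is not None:
        return input_ids.device
    if inputs_embeds is not None:
        return inputs_embeds.device
    raise ValueError("You have to specify either input_ids or inputs_embeds")

print(resolve_device(inputs_embeds=torch.zeros(1, 4, 8)))  # cpu
# ... tensors such as image_hidden_states are then created on this device.
```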
-
Chi authored
* Removed the redundant SiLUActivation class and now use nn.functional.silu directly. * I apologize for adding torch.functional.silu. I have replaced it with nn.SiLU. * Remove redundant variable in feature_extraction file
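The removal relies on torch's built-in SiLU being equivalent to the old helper class; a quick check:

```python
import torch
import torch.nn.functional as F
from torch import nn

x = torch.randn(4)
assert torch.allclose(nn.SiLU()(x), F.silu(x))
assert torch.allclose(F.silu(x), x * torch.sigmoid(x))  # silu(x) = x * sigmoid(x)
```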
-
Arthur authored
* draft * updates * full conversion taken from `https://gist.github.com/xenova/a452a6474428de0182b17605a98631ee` * push * nits * updates * more nits * Add co author Co-authored-by:
Joshua Lochner <admin@xenova.com> * fixup * cleanup * styling * add proper path * update * nits * don't push the exit * clean * update whisper doc * don't error out if tiktoken is not here * make sure we are BC with conversion * nit * Update docs/source/en/model_doc/whisper.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * merge and update * update markdown * Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> --------- Co-authored-by:
Joshua Lochner <admin@xenova.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Susnato Dhar authored
* added flash attention for gpt-neo * small change Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * readme updated * . * changes * removed padding_mask * Update src/transformers/models/gpt_neo/modeling_gpt_neo.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Xabier de Zuazo authored
* Fix error in convert_openai_to_hf.py: "_download() missing 1 required positional argument: root" * Fix error in convert_openai_to_hf.py: "TypeError: byte indices must be integers or slices, not str" * Fix decoder_attention_heads value in convert_openai_to_hf.py. Correct the assignment for `decoder_attention_heads` in the conversion script for the Whisper model. * Black reformat convert_openai_to_hf.py file. * Fix Whisper model configuration defaults (for Tiny). - Correct encoder/decoder layers and attention heads count. - Update model width (`d_model`) to 384. * Add docstring to the convert_openai_to_hf.py script with a doctest * Add shebang and +x permission to the convert_openai_to_hf.py * convert_openai_to_hf.py: reuse the read model_bytes in the _download() function * Move convert_openai_to_hf.py doctest example to whisper.md * whisper.md: Add an inference example to the Conversion section. * whisper.md: remove `model.config.forced_decoder_ids` from examples (deprecated) * whisper.md: Remove "## Format Conversion" section; not used by users * whisper.md: Use librispeech_asr_dummy dataset and load_dataset()
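A hedged sketch of the kind of inference example referenced above for whisper.md, using `load_dataset()` and the librispeech_asr_dummy dataset; the exact snippet in the docs may differ.

```python
from datasets import load_dataset
from transformers import WhisperForConditionalGeneration, WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = ds[0]["audio"]
inputs = processor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt")

predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```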
-
Sanchit Gandhi authored
* [Whisper] Block language/task args for English-only * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
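A hedged sketch of the new guard: English-only checkpoints now reject the `language`/`task` arguments that only make sense for multilingual Whisper models (the exception type and message are assumptions based on this PR's description).

```python
import torch
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
dummy_features = torch.zeros(1, 80, 3000)  # log-mel input of the expected shape

try:
    model.generate(dummy_features, language="fr", task="translate")
except ValueError as err:  # assumption: the guard raises a ValueError
    print(err)
```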
-
- 06 Nov, 2023 6 commits
-
-
Maria Khalusova authored
* fixed links with 404 * make style
-
Yih-Dar authored
* fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Iker García-Ferrero authored
* Fix dtype error * Fix mean and std dtype * make style
-
Hz, Ji authored
-
Pingzhi Li authored
Remove unexpected argument for FlaxResNetBasicLayerCollection
-
Mayank Mishra authored
* fix tokenizer * fix tokenizer
-
- 03 Nov, 2023 5 commits
-
-
Susnato Dhar authored
* flash attention added for DistilBert * fixes * removed padding_masks * Update modeling_distilbert.py * Update test_modeling_distilbert.py * style fix
-
Shiyu Li authored
* Fix mixed precision error for switch transformer * Fixup
-
Matt authored
* Update the ConversationalPipeline docstring now that we're using chat templates * Direct access to conversation.messages * Explain the string init
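A hedged sketch of the usage the updated docstring describes: the pipeline now runs the model's chat template, and messages can be read directly from the conversation object (the model id is illustrative).

```python
from transformers import Conversation, pipeline

chatbot = pipeline("conversational", model="facebook/blenderbot-400M-distill")
conversation = Conversation("What is the best way to learn Python?")
conversation = chatbot(conversation)
print(conversation.messages)  # list of {"role": ..., "content": ...} dicts
```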
-
Tom Aarsen authored
* Use Llama RoPE implementation for Falcon + Add copy functionalities * Use standard cache format for Falcon * Simplify apply_rotary_pos_emb, copy from Llama * Remove unnecessary cache conversion test We don't need to convert any caches anymore! * Resolve copy complaint
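A hedged, simplified sketch of the Llama-style rotary embedding application that Falcon now copies (cos/sin are assumed to be already gathered for the current positions):

```python
import torch

def rotate_half(x):
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary_pos_emb(q, k, cos, sin):
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed

q = k = torch.randn(1, 4, 8, 64)   # (batch, heads, seq, head_dim)
cos = torch.randn(1, 1, 8, 64)
sin = torch.randn(1, 1, 8, 64)
q_rot, k_rot = apply_rotary_pos_emb(q, k, cos, sin)
```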
-
Lysandre Debut authored
-
- 02 Nov, 2023 4 commits
-
-
Komal Kumar authored
* Fixed base model class name extraction from PeftModels * Changes to first unwrap the model then extract the base model name * Changed base_model to base_model.model to stay consistent with peft model abstractions
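A hedged sketch of the unwrapping order the fix describes: go through `base_model.model` on a PeftModel before reading the underlying class name. The helper name is hypothetical; only the attribute chain comes from the PR description.

```python
def get_base_model_name(model) -> str:
    base = model
    if hasattr(base, "base_model") and hasattr(base.base_model, "model"):
        # PeftModel -> tuner wrapper (e.g. LoraModel) -> underlying transformers model
        base = base.base_model.model
    return base.__class__.__name__
```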
-
Chi authored
* Removed the redundant SiLUActivation class and now use nn.functional.silu directly. * I apologize for adding torch.functional.silu. I have replaced it with nn.SiLU.
-
Lysandre authored
-
Yoach Lacombe authored
* enrich TTS pipeline docstring for clearer forward_params use * change token lengths * update Pipeline parameters * correct docstring and make style * fix tests * make style * change music prompt Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * raise errors if generate_kwargs with forward-only models * make style --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
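A hedged sketch of the `forward_params` usage the enriched docstring covers (model id and parameter values are illustrative; for forward-only models, passing generate_kwargs now raises an error instead).

```python
from transformers import pipeline

tts = pipeline("text-to-speech", model="suno/bark-small")
out = tts(
    "Hey, it's HuggingFace on the phone!",
    forward_params={"do_sample": True},  # forwarded to the model's generate/forward call
)
print(out["sampling_rate"], out["audio"].shape)
```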
-