Commits · f370bebdc352cd7c1bea2f88ae0c140ab694c5fd · chenpangpang / transformers

23 Oct, 2023 10 commits

Bugfix device map detr model (#26849) · f370bebd

Pedro Gabriel Gengo Lourenço authored Oct 23, 2023



* Fixed replace_batch_norm when on meta device

* lint fix

* Adding coauthor
Co-authored-by: Pi Esposito <piero.skywalker@gmail.com>

* Removed tests

* Remove unused deps

* Try to fix copy issue

* try fix copy one more time

* Reverted import changes

---------
Co-authored-by: Pi Esposito <piero.skywalker@gmail.com>

f370bebd

Remove ambiguous `padding_mask` and instead use a 2D->4D Attn Mask Mapper (#26792) · 33f98cfd

Patrick von Platen authored Oct 23, 2023



* [Attn Mask Converter] refactor attn mask

* up

* Apply suggestions from code review
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* improve

* rename

* better cache

* renaming

* improve more

* improve

* fix bug

* finalize

* make style & make fix-copies

* correct more

* start moving attention_mask

* fix llama

* improve falcon

* up

* improve more

* improve more

* Update src/transformers/models/owlv2/modeling_owlv2.py

* make style

* make style

* rename to converter

* Apply suggestions from code review

---------
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

33f98cfd
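As a rough illustration of what a 2D->4D attention mask conversion involves (this is not the code from #26792), the sketch below expands a `[batch, key_len]` padding mask into a `[batch, 1, query_len, key_len]` additive causal mask using plain Python lists. The function name `expand_mask_2d_to_4d`, its parameters, and the use of `-inf` as the masked value are assumptions made for this example.

```python
import math


def expand_mask_2d_to_4d(mask_2d, query_len=None, causal=True):
    """Hypothetical helper: expand a [batch, key_len] mask of 1s (keep) and
    0s (padding) into a [batch, 1, query_len, key_len] additive mask.

    Allowed positions hold 0.0 and masked positions hold -inf, so the result
    can be added to attention scores before a softmax.
    """
    out = []
    for row in mask_2d:
        key_len = len(row)
        q_len = key_len if query_len is None else query_len
        # Queries are assumed to be the last q_len positions of the sequence.
        offset = key_len - q_len
        head = []
        for q in range(q_len):
            line = []
            for k in range(key_len):
                padded = row[k] == 0
                future = causal and k > q + offset
                line.append(-math.inf if padded or future else 0.0)
            head.append(line)
        out.append([head])  # singleton head dimension
    return out


mask = expand_mask_2d_to_4d([[1, 1, 0]])
print(mask[0][0])
```

Keeping the padding information in the 2D mask and deriving the 4D form in one place means the padding does not have to be passed around as a separate argument.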

Remove token_type_ids from default TF GPT-2 signature (#26962) · f7354a3b
Matt authored Oct 23, 2023
```
Remove token_type_ids from default GPT-2 signature
```
f7354a3b

small typos found (#26988) · c0b5ad94
Rafael Padilla authored Oct 23, 2023
```
just very small typos found
```
c0b5ad94

[`SeamlessM4T`] fix copies with NLLB MoE int8 (#27018) · f9f27b0f
Arthur authored Oct 23, 2023
```
fix copies on newly merged model
```
f9f27b0f

[`NLLB-MoE`] Fix NLLB MoE 4bit inference (#27012) · 244a53e0
Younes Belkada authored Oct 23, 2023
```
fix NLLB MoE 4bit
```
244a53e0

Add Seamless M4T model (#25693) · cb45f71c

Yoach Lacombe authored Oct 23, 2023



* first raw commit

* still POC

* tentative convert script

* almost working speech encoder conversion scripts

* intermediate code for encoder/decoders

* add modeling code

* first version of speech encoder

* make style

* add new adapter layer architecture

* add adapter block

* add first tentative config

* add working speech encoder conversion

* base model convert works now

* make style

* remove unnecessary classes

* remove unnecessary functions

* add modeling code speech encoder

* rework logics

* forward pass of sub components work

* add modeling codes

* some config modifs and modeling code modifs

* save WIP

* new edits

* same output speech encoder

* correct attention mask

* correct attention mask

* fix generation

* new generation logics

* erase comments

* make style

* fix typo

* add some descriptions

* new state

* clean imports

* add tests

* make style

* make beam search and num_return_sequences>1 works

* correct edge case issue

* correct SeamlessM4TConformerSamePadLayer copied from

* replace ACT2FN relu by nn.relu

* remove unnecessary return variable

* move back a class

* change name conformer_attention_mask -> conv_attention_mask

* better nit code

* add some Copied from statements

* small nits

* small nit in dict.get

* rename t2u model -> conditionalgeneration

* ongoing refactoring of structure

* update models architecture

* remove SeamlessM4TMultiModal classes

* add tests

* adapt tests

* some non-working code for vocoder

* add seamlessM4T vocoder

* remove buggy line

* fix some hifigan related bugs

* remove hifigan specific config

* change

* add WIP tokenization

* add seamlessM4T working tokenizer

* update tokenization

* add tentative feature extractor

* Update converting script

* update working FE

* refactor input_values -> input_features

* update FE

* changes in generation, tokenizer and modeling

* make style and add t2u_decoder_input_ids

* add intermediate outputs for ToSpeech models

* add vocoder to speech models

* update valueerror

* update FE with languages

* add vocoder convert

* update config docstrings and names

* update generation code and configuration

* remove todos and update config.pad_token_id to generation_config.pad_token_id

* move block vocoder

* remove unnecessary code and uniformize tospeech code

* add feature extractor import

* make style and fix some copies from

* correct consistency + make fix-copies

* add processor code

* remove comments

* add fast tokenizer support

* correct pad_token_id in M4TModel

* correct config

* update tests and codes  + make style

* make some suggested corrections - correct comments and change naming

* rename some attributes

* rename some attributes

* remove unnecessary sequential

* remove option to use dur predictor

* nit

* refactor hifigan

* replace normalize_mean and normalize_var with do_normalize + save lang ids to generation config

* add tests

* change tgt_lang logic

* update generation ToSpeech

* add support import SeamlessM4TProcessor

* fix generate

* make tests

* update integration tests, add option to only return text and update tokenizer fast

* fix wrong function call

* update import and convert script

* update integration tests + update repo id

* correct paths and add first test

* update how new attention masks are computed

* update tests

* take first care of batching in vocoder code

* add batching with the vocoder

* add waveform lengths to model outputs

* make style

* add generate kwargs + forward kwargs of M4TModel

* add docstrings forward methods

* reformat docstrings

* add docstrings t2u model

* add another round of modeling docstrings + reformat speaker_id -> spkr_id

* make style

* fix check_repo

* make style

* add seamlessm4t to toctree

* correct check_config_attributes

* write config docstrings + some modifs

* make style

* add docstrings tokenizer

* add docstrings to processor, fe and tokenizers

* make style

* write first version of model docs

* fix FE + correct FE test

* fix tokenizer + add correct integration tests

* fix most tokenization tests

* make style

* correct most processor test

* add generation tests and fix num_return_sequences > 1

* correct integration tests -still one left

* make style

* correct position embedding

* change numbeams to 1

* refactor some modeling code and correct one test

* make style

* correct typo

* refactor intermediate fnn

* refactor feedforward conformer

* make style

* remove comments

* make style

* fix tokenizer tests

* make style

* correct processor tests

* make style

* correct S2TT integration

* Apply suggestions from Sanchit code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct typo

* replace torch.nn->nn + make style

* change Output naming (waveforms -> waveform) and ordering

* nit renaming and formating

* remove return None when not necessary

* refactor SeamlessM4TConformerFeedForward

* nit typo

* remove almost copied from comments

* add a copied from comment and remove an unnecessary dropout

* remove inputs_embeds from speechencoder

* remove backward compatibility function

* reformat class docstrings for a few components

* remove unnecessary methods

* split over 2 lines smthg hard to read

* make style

* replace two steps offset by one step as suggested

* nice typo

* move warnings

* remove useless lines from processor

* make generation non-standard test more robust

* remove torch.inference_mode from tests

* split integration tests

* enrich md

* rename control_symbol_vocoder_offset->vocoder_offset

* clean convert file

* remove tgt_lang and src_lang from FE

* change generate docstring of ToText models

* update generate docstring of tospeech models

* unify how to deal with text_decoder_input_ids

* add default spkr_id

* unify tgt_lang for t2u_model

* simplify tgt_lang verification

* remove a todo

* change config docstring

* make style

* simplify t2u_tgt_lang_id

* make style

* enrich/correct comments

* enrich .md

* correct typo in docstrings

* add torchaudio dependency

* update tokenizer

* make style and fix copies

* modify SeamlessM4TConverter with new tokenizer behaviour

* make style

* correct small typo docs

* fix import

* update docs and add requirement to tests

* add convert_fairseq2_to_hf in utils/not_doctested.txt

* update FE

* fix imports and make style

* remove torchaudio in FE test

* add seamless_m4t.md to utils/not_doctested.txt

* nits and change the way docstring dataset is loaded

* move checkpoints from ylacombe/ to facebook/ orga

* refactor warning/error to be in the 119 line width limit

* round overly precise floats

* add stereo audio behaviour

* refactor .md and make style

* enrich docs with more precise architecture description

* readd undocumented models

* make fix-copies

* apply some suggestions

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* correct bug from previous commit

* refactor a parameter allowing to clean the code + some small nits

* clean tokenizer

* make style and fix

* make style

* clean tokenizers arguments

* add precisions for some tests

* move docs from not_tested to slow

* modify tokenizer according to last comments

* add copied from statements in tests

* correct convert script

* correct parameter docstring style

* correct tokenization

* correct multi gpus

* make style

* clean modeling code

* make style

* add copied from statements

* add copied statements

* add support with ASR pipeline

* remove file added inadvertently

* fix docstrings seamlessM4TModel

* add seamlessM4TConfig to OBJECTS_TO_IGNORE because of unconventional markdown

* add seamlessm4t to assisted generation ignored models

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

cb45f71c

Change default `max_shard_size` to smaller value (#26942) · 50d0cf4f
Younes Belkada authored Oct 23, 2023
```
* Update modeling_utils.py

* fixup

* let's change it to 5GB

* fix
```
50d0cf4f

python falcon doc-string example typo (#26995) · 45425660
Gema Parreño authored Oct 23, 2023
```
git python falcon typo
```
45425660

Limit to inferior fsspec version (#27010) · 70032949
Lysandre Debut authored Oct 23, 2023
```
Pin fsspec
```
70032949

20 Oct, 2023 2 commits

Fix Fuyu image scaling bug (#26918) · c030fc89

Pedro Cuenca authored Oct 20, 2023

* Fix Fuyu image scaling bug

It could produce negative padding and hence inference errors for certain
image sizes.

* Fix aspect ratio scaling test

c030fc89
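To make the failure mode concrete, here is a minimal sketch (not the actual Fuyu image processor code) of aspect-ratio-preserving scaling followed by a padding computation. The helper name `fit_and_pad` and the target size used in the call are assumptions for illustration. Unlike a real processor, this sketch also upscales images that are smaller than the target.

```python
def fit_and_pad(height, width, target_height, target_width):
    """Hypothetical helper: scale (height, width) to fit inside the target
    while keeping the aspect ratio, then report the padding needed to reach
    the target size exactly.
    """
    # Use the smaller ratio so neither side exceeds the target.
    scale = min(target_height / height, target_width / width)
    # Clamp after rounding so the scaled size can never overshoot the target.
    new_height = min(target_height, max(1, round(height * scale)))
    new_width = min(target_width, max(1, round(width * scale)))
    pad_bottom = target_height - new_height
    pad_right = target_width - new_width
    # Negative padding would make the later pad step fail.
    assert pad_bottom >= 0 and pad_right >= 0
    return (new_height, new_width), (pad_bottom, pad_right)


# A tall, narrow image scaled into an assumed 1080 x 1920 target.
print(fit_and_pad(2000, 500, 1080, 1920))
```

If the scale were derived from only one side, the other side could end up larger than the target, which is one way a negative padding value can arise.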

[docstring] Fix docstring for speech-to-text config (#26883) · 929134bf

Adam Ross authored Oct 20, 2023

* Fix docstring for speech-to-text config

* Refactor doc line len <= 119 char

* Remove Speech2TextConfig from OBJECTS_TO_IGNORE

* Fix Speech2TextConfig doc str

* Fix Speech2TextConfig doc using doc-builder

* Refactor Speech2TextConfig doc

929134bf

19 Oct, 2023 4 commits

[`FA-2` / `Mistral`] Support fa-2 + right padding + forward (#26912) · bc4bbd9f
Younes Belkada authored Oct 19, 2023
```
support fa-2 + right padding + forward
```
bc4bbd9f

Pin Keras for now (#26904) · cbd278f0

Matt authored Oct 19, 2023

* Pin Keras for now out of paranoia

* Add the keras pin to _tests_requirements.txt too

* Make sure the Keras version matches the TF one

* make fixup

cbd278f0

[docstring] Fix docstrings for `CodeGen` (#26821) · ad08137e

Daniil authored Oct 19, 2023



* remove docstrings CodeGen from objects_to_ignore

* autofix codegen docstrings

* fill in the missing types and docstrings

* fixup

* change descriptions to be in a separate line

* apply docstring suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* update n_ctx description in CodeGenConfig

---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

ad08137e

[docstring] Fix docstring for `ChineseCLIP` (#26880) · 816c2237

Sparty authored Oct 19, 2023



* Remove ChineseCLIPImageProcessor, ChineseCLIPTextConfig, ChineseCLIPVisionConfig from check_docstrings

* Run fix_and_overwrite for ChineseCLIPImageProcessor, ChineseCLIPTextConfig, ChineseCLIPVisionConfig

* Replace <fill_type> and <fill_docstring> in configuration_chinese_clip.py, image_processing_chinese_clip.py with type and docstring values

---------
Co-authored-by: vignesh-raghunathan <vignesh_raghunathan@intuit.com>

816c2237

18 Oct, 2023 7 commits

[`FA-2`] Revert suggestion that broke FA2 fine-tuning with quantized models (#26916) · 574a5384
Younes Belkada authored Oct 19, 2023
```
revert
```
574a5384

Add fuyu model (#26911) · caa0ff0b

Pablo Montalvo authored Oct 19, 2023



* initial commit

* add processor, add fuyu naming

* add draft processor

* fix processor

* remove dropout to fix loading of weights

* add image processing fixes from Pedro

* fix

* fix processor

* add basic processing fuyu test

* add documentation and TODO

* address comments, add tests, add doc

* replace assert with torch asserts

* add Mixins and fix tests

* clean imports

* add model tester, clean imports

* fix embedding test

* add updated tests from pre-release model

* Processor: return input_ids used for inference

* separate processing and model tests

* relax test tolerance for embeddings

* add test for logit comparison

* make sure fuyu image processor is imported in the init

* fix formatting

* more formatting issues

* and more

* fixups

* remove some stuff

* nits

* update init

* remove the fuyu file

* Update integration test with release model

* Update conversion script.

The projection is not used, as confirmed by the authors.

* improve generation

* Remove duplicate function

* Trickle down patches to model call

* processing fuyu updates

* remove things

* fix prepare_inputs_for_generation to fix generate()

* remove model_input

* update

* add generation tests

* nits

* draft leverage automodel and autoconfig

* nits

* fix dtype patch

* address comments, update READMEs and doc, include tests

* add working processing test, remove refs to subsequences

* add tests, remove Sequence classification

* processing

* update

* update the conversion script

* more processing cleanup

* safe import

* take out ModelTesterMixin for early release

* more cleanup

* more cleanup

* more cleanup

* and more

* register a buffer

* nits

* add postprocessing of generate output

* nits

* updates

* add one working test

* fix test

* make fixup works

* fixup

* Arthur's updates

* nits

* update

* update

* fix processor

* update tests

* pass more fixups

* fix

* nits

* don't import torch

* skip fuyu config for now

* fixup done

* fixup

* update

* oups

* nits

* Use input embeddings

* no buffer

* update

* styling processing fuyu

* fix test

* update licence

* protect torch import

* fixup and update not doctested

* kwargs should be passed

* updates

* update the imports in the test

* protect import

* protecting imports

* protect imports in type checking

* add testing decorators

* protect top level import structure

* fix typo

* fix check init

* move requires_backend to functions

* Imports

* Protect types

---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre@huggingface.co>

caa0ff0b

[`FA-2`] Final fix for FA2 dtype (#26846) · 5a73316b

Younes Belkada authored Oct 18, 2023



* final fix for FA2 dtype

* try

* oops

* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* apply fix everywhere

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

5a73316b

Add default template warning (#26637) · d933818d

Matt authored Oct 18, 2023

* Add default template warnings

* make fixup

* Move warnings to FutureWarning

* Move warnings to FutureWarning

* fix make fixup

* Remove futurewarning

d933818d

[`Tokenizer`] Fix slow and fast serialization (#26570) · ef7e9369

Arthur authored Oct 18, 2023

* fix

* last attempt

* current work

* fix forward compatibility

* save all special tokens

* current state

* revert additional changes

* updates

* remove tokenizer.model

* add a test and the fix

* nit

* revert one more break

* fix typefield issue

* quality

* more tests

* fix fields for FC

* more nits?

* new additional changes

* how

* some updates

* simplify all

* more nits

* revert some things to original

* nice

* nits

* a small hack

* more nits

* ahhaha

* fixup

* update

* make test run on ci

* use subtesting

* update

* Update .circleci/create_circleci_config.py

* updates

* fixup

* nits

* replace typo

* fix the test

* nits

* update

* None max dif pls

* a partial fix

* had to revert one thing

* test the fast

* updates

* fixup

* and more nits

* more fixes

* update

* Oupsy 👁



* nits

* fix marian

* on our way to heaven

* Update src/transformers/models/t5/tokenization_t5.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* fixup

* Update src/transformers/tokenization_utils_fast.py
Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com>

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com>

* fix phobert

* skip some things, test more

* nits

* fixup

* fix deberta

* update

* update

* more updates

* skip one test

* more updates

* fix camembert

* can't test this one

* more good fixes

* kind of a major update

- separate what is only done in fast in fast init and refactor
- add_token(AddedToken(..., special=True)) ignores it in fast
- better loading

* fixup

* more fixups

* fix pegasus and mpnet

* remove skipped tests

* fix phoneme tokenizer if self.verbose

* fix individual models

* update common tests

* update testing files

* all over again

* nits

* skip test for markup lm

* fixups

* fix order of addition in fast by sorting the added tokens decoder

* proper defaults for deberta

* correct default for fnet

* nits on add tokens, string initialized to special if special

* skip irrelevant herbert tests

* main fixes

* update test added_tokens_serialization

* the fix for bart like models and class instantiating

* update bart

* nit!

* update idefix test

* fix whisper!

* some fixup

* fixups

* revert some of the wrong changes

* fixup

* fixup

* skip marian

* skip the correct tests

* skip for tf and flax as well

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com>

ef7e9369

Fix Seq2seqTrainer decoder attention mask (#26841) · 34678db4
Matt authored Oct 18, 2023
```
Don't drop decoder_input_ids without also dropping decoder_attention_mask
```
34678db4
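The idea in this commit message can be sketched in a few lines. This is an illustration only, not the trainer's code: the helper name `drop_decoder_inputs` is hypothetical, and the conditions under which the trainer drops these inputs are not stated here.

```python
def drop_decoder_inputs(inputs):
    """Hypothetical helper: return a copy of `inputs` without the decoder
    inputs. The mask is removed together with the ids so that no
    `decoder_attention_mask` is left describing ids that are gone.
    """
    dropped = {"decoder_input_ids", "decoder_attention_mask"}
    return {k: v for k, v in inputs.items() if k not in dropped}


batch = {
    "input_ids": [[5, 6, 7]],
    "attention_mask": [[1, 1, 1]],
    "decoder_input_ids": [[0, 8]],
    "decoder_attention_mask": [[1, 1]],
}
print(sorted(drop_decoder_inputs(batch)))
```

Returning a filtered copy leaves the original batch untouched, so the caller can still use the decoder inputs elsewhere.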
Generate: improve docstrings for custom stopping criteria (#26863) · e893b1ef
Joao Gante authored Oct 18, 2023
```
improve docstrings
```
e893b1ef

17 Oct, 2023 7 commits
- Fix TensorFlow package check (#26842) · ef42cb62
  jayfurmanek authored Oct 17, 2023
```
Add tf-nightly-rocm to _is_tf_available check
```
  ef42cb62
- [docstring] Fix docstring for LukeConfig (#26858) · 51042ae8
  louietouie authored Oct 17, 2023
```
* Deleted LukeConfig and ran check_docstrings.py

* Filled docstring information

---------
Co-authored-by: louie <louisparizeau@Chicken.local>
```
  51042ae8
- 🚨 🚨 Raise error when no speaker embeddings in speecht5._generate_speech (#26418) · db611aab
  Yoach Lacombe authored Oct 17, 2023
```
* add warning when no speaker embeddings in speecht5._generate_speech

* modify warning to error

* adapt generation test
```
  db611aab
- [`FA2`] Fix flash attention 2 fine-tuning with Falcon (#26852) · 41c42f85
  Younes Belkada authored Oct 17, 2023
```
fix fa2 + dropout issue
```
  41c42f85
- 🚨🚨 Generate: change order of ops in beam sample to avoid nans (#26843) · 4b423e60
  Joao Gante authored Oct 17, 2023
```
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
```
  4b423e60
- Update logits_process.py docstrings to clarify penalty and reward cases (attempt #2) (#26784) · 0b8604d0
  larekrow authored Oct 17, 2023
```
* Update logits_process.py docstrings + match arg fields to __init__'s

* Ran `make style`
```
  0b8604d0
- fix: when window_size is passed as an array (#26800) · 85e9d644
  Shinji Yamada authored Oct 17, 2023

  85e9d644

16 Oct, 2023 8 commits

🚨 [`Quantization`] Store the original dtype in the config as a private attribute 🚨 (#26761) · fd6a0ade

Younes Belkada authored Oct 16, 2023



* First step

* fix

* add adjustments for gptq

* change to `_pre_quantization_dtype`

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix serialization

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

fd6a0ade

Conversation pipeline fixes (#26795) · 14b04b4b

Matt authored Oct 16, 2023

* Adjust length limits and allow naked conversation list inputs

* Adjust length limits and allow naked conversation list inputs

* Maybe use a slightly more reasonable limit than 1024

* Skip tests for old models that never supported this anyway

* Cleanup input docstrings

* More docstring cleanup + skip failing TF test

* Make fixup

14b04b4b

[docstring] Fix bert generation tokenizer (#26820) · 5c6b83cb

przemL authored Oct 16, 2023

* Remove BertGenerationTokenizer from objects to ignore

The file BertGenerationTokenizer is removed from
objects to ignore as a first step to fix docstring.

* Docstrings fix for BertGenerationTokenizer

Docstring fix is generated for BertGenerationTokenizer
by using check_docstrings.py.

* Fix docstring for BertGenerationTokenizer

Added sep_token type and docstring in BertGenerationTokenizer.

5c6b83cb

Llama tokenizer: remove space in template comment (#26788) · 3ef71345

Pedro Cuenca authored Oct 16, 2023

* Remove space in template comment

I think the space between the eos and bos tokens is not present in the actual template output. I'm using this documentation as a reference for everyone asking about prompting, so would like to clarify whether there's a space or not :)

* Update fast tokenizer too

* Apply to Code Llama

* Link to original code snippet.

3ef71345

fix resume_from_checkpoint bug (#26739) · b91cff5a
Jintao authored Oct 16, 2023
```
* fix resume_from_checkpoint bug

* update code
```
b91cff5a

Make fsdp ram efficient loading optional (#26631) · a5f5568d
Sourab Mangrulkar authored Oct 16, 2023
```
make fsdp ram efficient loading optional
```
a5f5568d

[docstring] Fix docstring for `CodeLlamaTokenizerFast` (#26666) · 5c081e29
Bojun-Feng authored Oct 16, 2023
```
* remove from OBJECTS_TO_IGNORE

* run check_docstrings.py

* fill in information

* ignore CodeLlamaTokenizer
```
5c081e29

[docstring] Fix docstring for `CanineConfig` (#26771) · 0e52af4d

Sparty authored Oct 16, 2023



* Remove CanineConfig from check_docstrings

* Run fix_and_overwrite for CanineConfig

* Replace <fill_type> and <fill_docstring> in configuration_canine.py with type and docstring values

---------
Co-authored-by: vignesh-raghunathan <vignesh_raghunathan@intuit.com>

0e52af4d

13 Oct, 2023 2 commits
- [`Flava`] Fix flava doc (#26789) · 7cc6f822
  Younes Belkada authored Oct 13, 2023
```
* fix flava doctest

* add shape

* adapt
```
  7cc6f822
- Fixed KeyError for Mistral (#26682) · 8e05ad32
  Matteo Raso authored Oct 13, 2023
```
* Fixed KeyError for Mistral

* Removed try block

* Removed whitespace
```
  8e05ad32