Commits · 8fff07ded0a914299a3f4c25580bc5eeb6d0aae6 · chenpangpang / transformers

10 Jun, 2024 1 commit

Yih-Dar authored Jun 10, 2024



* [run-slow] cohere

* [run-slow] cohere

* [run-slow] cohere

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

8fff07de

07 Jun, 2024 1 commit

Remove ConversationalPipeline and Conversation object (#31165) · 065729a6

Matt authored Jun 07, 2024

* Remove ConversationalPipeline and Conversation object, as they have been deprecated for some time and are due for removal

* Update not-doctested.txt

* Fix JA and ZH docs

* Fix JA and ZH docs some more

* Fix JA and ZH docs some more

065729a6

04 Jun, 2024 1 commit

Upload (daily) CI results to Hub (#31168) · 3714f3f8

Yih-Dar authored Jun 04, 2024



* build

* build

* build

* build

* fix

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

3714f3f8

03 Jun, 2024 2 commits
- [`GemmaModel`] fix small typo (#31202) · 1749841a
  Arthur authored Jun 03, 2024
```
* fixes

* fix-copies
```
  1749841a
- Remove copied froms for deprecated models (#31153) · 5b5b48b1
  amyeroberts authored Jun 03, 2024
```
* Remove copied froms for deprecated models

* Remove automatically in script
```
  5b5b48b1
31 May, 2024 2 commits

Diff converter v2 (#30868) · 96eb0628

Arthur authored May 31, 2024

* current working example!

* commit regex and result file

* update

* nit

* push the conversion file

* oups

* roadmap and nits

* attempt diffs for 3 files

* persimmon

* nit

* add diff file that is the same as the modeling_llama.py

* fix rope nits

* updates

* updates with converted versions

* give some breathing space to the code

* delete

* update

* update

* push the actual result

* update regex patterns

* update regex patterns

* fix some issues

* fix some issues

* fix some issues

* updates

* updates

* updates

* updates

* updates

* revert changes done to llama

* updates

* update gemma

* updates

* oups

* current state

* current state

* update

* ouiiii

* nit

* clear diffs

* nit

* fixup

* update

* doc 🚀

* 🔥

* for now use gemma

* deal with comments

* style

* handle funtions

* deal with assigns

* todos

* process inheritage

* keep decorators?

* 🤗

* deal with duplicates

* fixup

* correctly remove duplicate code

* run ruff post script

* ruff deals pretty well with imports, let's leave it to him

* ah maybe not lol

* for now remove all imports from child.

* nit

* conversion of llama

* okay

* convert starcoder2

* synch with main

* update llama diff

* updates

* https://docs.astral.sh/ruff/rules/redefined-while-unused/

 fixes the imports, bit needs later version of ruff

* updates

* okay actual state

* non zero exit

* update!

* revert unrelated

* remove other diff files

* updates

* cleanup

* update

* less diff!

* stash

* current updates

* updates

* No need for call

* finished fining deps

* update

* current changes

* current state

* current state

* new status

* nit

* finally

* fixes

* nits

* order is now expected

* use logger info instead of prints

* fixup

* up

* nit

* update

* nits

* update

* correct merge

* update

* update

* update

* add warning

* update caution message

* update

* better merging strategy

* copy class statements :wink

* fixups

* nits

* update

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* nits

* smaller header

* do cleanup some stuff

* even simpler header?

* fixup

* updates

* ruff

* update examples

* nit

* TODO

* state

* OUUUUUUF

* current state

* nits

* final state

* add a readme

* fixup

* remove diff llama

* fix

* nit

* dummy noy funny

* ruff format tests src utils --check

* everless diffs

* less diffs and fix test

* fixes

* naming nit?

* update converter and add supper example

* nits

* updated for function signatures

* update

* update

* add converted dummies

* autoformat

* single target assign fix

* fixup

* fix some imports

* fixes

* don't push them

* `# noqa: F841`

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

96eb0628

helper (#31152) · f3f640dc

Arthur authored May 31, 2024



* helper

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* updates

* more doc

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

f3f640dc

28 May, 2024 1 commit

Deprecate low use models (#30781) · a564d10a

amyeroberts authored May 28, 2024

* Deprecate models
- graphormer
- time_series_transformer
- xlm_prophetnet
- qdqbert
- nat
- ernie_m
- tvlt
- nezha
- mega
- jukebox
- vit_hybrid
- x_clip
- deta
- speech_to_text_2
- efficientformer
- realm
- gptsan_japanese

* Fix up

* Fix speech2text2 imports

* Make sure message isn't indented

* Fix docstrings

* Correctly map for deprecated models from model_type

* Uncomment out

* Add back time series transformer and x-clip

* Import fix and fix-up

* Fix up with updated ruff

a564d10a

24 May, 2024 1 commit
- save the list of new model failures (#31013) · a3cdff41
  Yih-Dar authored May 24, 2024
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  a3cdff41
22 May, 2024 2 commits

Finally fix the missing new model failure CI report (#30968) · 1432f641
Yih-Dar authored May 22, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
1432f641

update ruff version (#30932) · 673440d0

Arthur authored May 22, 2024



* update ruff version

* fix research projects

* Empty

* Fix errors

---------
Co-authored-by: Lysandre <lysandre@huggingface.co>

673440d0

17 May, 2024 1 commit
- update release script (#30880) · e9a8041d
  Arthur authored May 17, 2024
```
* update release script

* update release script
```
  e9a8041d
16 May, 2024 1 commit

add sdpa to ViT [follow up of #29325] (#30555) · 1c21f48a

hyenal authored May 16, 2024



remove blank line (+1 squashed commit)
Squashed commits:
[24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
Squashed commits:
[08bd27e7a] [run-slow]vit_msn,vision_encoder_decoder
[ec96a8db3] [run-slow]vit_msn
[ead817eca] fix vit msn multi gpu
[d12cdc8fd] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[3fdbfa88f] doc
[a3ff33e4a] finish implementation
[e20b7b7fb] Update test_modeling_common.py
[e290c5810] Update test_modeling_flax_common.py
[d3af86f46] comment
[ff7dd32d8] more comments
[59b137889] suggestion
[7e2ba6d67] attn_implementation as attribute of the class
[fe66ab71f] minor
[38642b568] Apply suggestions from code review

Accept comments
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[22cde7d52] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[48e137cc6] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[99f4c679f] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[96cf20a6d] Update src/transformers/models/vit_msn/modeling_vit_msn.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[c59377d23] Update src/transformers/models/vit_mae/modeling_vit_mae.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[b70a47259] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[00c84d216] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[61f00ebb0] all tests are passing locally
[e9e0b82b7] vision encoder/decoder
[4d5076b56] test-vision (+20 squashed commits)
Squashed commits:
[d1add8db9] yolo
[9fde65716] fix flax
[986566c28] minor
[ca2f21d1f] vit
[3333efd7a] easy models change
[ebfc21402] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[b8b8603ed] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[48ecc7e26] all tests are passing locally
[bff7fc366] minor
[62f88306f] fix yolo and text_encoder tests
[121507555] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1064cae0a] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[b7f52ff3a] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[cffaa10dd] fix-copies
[ef6c511c4] test vit hybrid
[7d4ba8644] vit hybrid
[66f919033] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1fcc0a031] fixes
[cfde6eb21] fixup
[e77df1ed3] all except yolo end encoder decoder (+17 squashed commits)
Squashed commits:
[602913e22] vit + vit_mae are working
[547f6c4cc] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/  passes
[61a97dfa9] it s the complete opposite...
[aefab37d4] fix more tests
[71802a1b9] fix all torch tests
[40b12eb58] encoder - decoder tests
[941552b69] slow decorator where appropriate
[14d055d80] has_attentions to yolo and msn
[3381fa19f] add correct name
[e261316a7] repo consistency
[31c6d0c08] fixup
[9d214276c] minor fix
[11ed2e1b7] chore
[eca6644c4] add sdpa to vit-based models
[cffbf390b] make fix-copies result
[6468319b0] fix style
[d324cd02a] add sdpa for vit
Co-authored-by: Liubov Yaronskaya <luba.yaronskaya@gmail.com>

1c21f48a

15 May, 2024 2 commits

Deprecate models script - correctly set the model name for the doc file (#30785) · 58faa7b8
amyeroberts authored May 15, 2024
```
* Correctly set the moel name for the doc file

* Fix up
```
58faa7b8

Loading GGUF files support (#30391) · a4284495

Lysandre Debut authored May 15, 2024



* Adds support for loading GGUF files
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>

* add q2_k q3_k q5_k support from @99991

* fix tests

* Update doc

* Style

* Docs

* fix CI

* Update docs/source/en/gguf.md

* Update docs/source/en/gguf.md

* Compute merges

* change logic

* add comment for clarity

* add comment for clarity

* Update src/transformers/models/auto/tokenization_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change logic

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_gguf_pytorch_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* put back comment

* add comment about mistral

* comments and added tests

* fix unconsistent type

* more

* fix tokenizer

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* address comments about tests and tokenizer + add added_tokens

* from_gguf -> gguf_file

* replace on docs too

---------
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

a4284495

13 May, 2024 2 commits

Deprecate models script (#30184) · 0f8fefd4

amyeroberts authored May 13, 2024

* Add utility for finding candidate models for deprecation

* Update model init

* Make into configurable script

* Fix path

* Add sorting of base object alphabetically

* Tidy

* Refactor __init__ alpha ordering

* Update script with logging

* fix import

* Fix logger

* Fix logger

* Get config file before moving files

* Take models from CLI

* Split models into lines to make easier to feed to deprecate_models script

* Update

* Use posix path

* Print instead

* Add example in module docstring

* Fix up

* Add clarifying comments; add models to DEPRECATE_MODELS

* Address PR comments

* Don't update relative paths on the same level

0f8fefd4

Save other CI jobs' result (torch/tf pipeline, example, deepspeed etc) (#30699) · 82c1625e

Yih-Dar authored May 13, 2024



* update

* update

* update

* update

* update

* update

* update

* update

* Update utils/notification_service.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

82c1625e

09 May, 2024 1 commit

Update CodeLlama references (#30218) · c99d88e5

Omar Sanseviero authored May 09, 2024

* Update CodeLlama references

* Update slow_documentation_tests.txt

* Update slow_documentation_tests.txt

c99d88e5

07 May, 2024 3 commits

Rename artifact name `prev_ci_results` to `ci_results` (#30697) · 884e3b1c
Yih-Dar authored May 07, 2024
```
* rename

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
884e3b1c
Update `workflow_id` in `utils/get_previous_daily_ci.py` (#30695) · 05ec950c
Yih-Dar authored May 07, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
05ec950c

Reboot Agents (#30387) · 0ba15ced

Aymeric Roucher authored May 07, 2024



* Create CodeAgent and ReactAgent

* Fix formatting errors

* Update documentation for agents

* Add custom errors, improve logging

* Support variable usage in ReactAgent

* add messages

* Add message passing format

* Create React Code Agent

* Update

* Refactoring

* Fix errors

* Improve python interpreter

* Only non-tensor inputs should be sent to device

* Calculator tool slight refactor

* Improve docstrings

* Refactor

* Fix tests

* Fix more tests

* Fix even more tests

* Fix tests by replacing output and input types

* Fix operand type issue

* two small fixes

* EM TTS

* Fix agent running type errors

* Change text to speech tests to allow changed outputs

* Update doc with new agent types

* Improve code interpreter

* If max iterations reached, provide a real answer instead of an error

* Add edge case in interpreter

* Add safe imports to the interpreter

* Interpreter tweaks: tuples and listcomp

* Make style

* Make quality

* Add dictcomp to interpreter

* Rename ReactJSONAgent to ReactJsonAgent

* Misc changes

* ToolCollection

* Rename agent's logger to self.logger

* Add while loops to interpreter

* Update doc with new tools. still need to mention collections

* Add collections to the doc

* Small fixes on logs and interpretor

* Fix toolbox return type

* Docs + fixup

* Skip doctests

* Correct prompts with improved examples and formatting

* Update prompt

* Remove outdated docs

* Change agent to accept Toolbox object for tools

* Remove calculator tool

* Propagate removal of calculator in doc

* Fix 2 failing workflows

* Simplify additional argument passing

* AgentType audio

* Minor changes: function name, types

* Remove calculator tests

* Fix test

* Fix torch requirement

* Fix final answer tests

* Style fixes

* Fix tests

* Update docstrings with calculator removal

* Small type hint fixes

* Update tests/agents/test_translation.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/agents/test_python_interpreter.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/default_tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/agents/test_agents.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/bert/configuration_bert.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/tools.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/speech_to_text.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/agents/test_speech_to_text.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/agents/test_tools_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* pygments

* Answer comments

* Cleaning up

* Simplifying init for all agents

* Improving prompts and making code nicer

* Style fixes

* Add multiple comparator test in interpreter

* Style fixes

* Improve BERT example in documentation

* Add examples to doc

* Fix python interpreter quality

* Logging improvements

* Change test flag to agents

* Quality fix

* Add example for HfEngine

* Improve conversation example for HfEngine

* typo fix

* Verify doc

* Update docs/source/en/agents.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/agents.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/prompts.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/agents/python_interpreter.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/agents.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix style issues

* local s2t tool

---------
Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Lysandre <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

0ba15ced

03 May, 2024 1 commit
- Avoid duplication in PR slow CI model list (#30634) · 91d155ea
  Yih-Dar authored May 03, 2024
```
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  91d155ea
30 Apr, 2024 1 commit

General PR slow CI (#30540) · 87927b24

Yih-Dar authored Apr 30, 2024



* More general PR slow CI

* Update utils/pr_slow_ci_models.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

87927b24

25 Apr, 2024 1 commit

Script for finding candidate models for deprecation (#29686) · 30ee508c

amyeroberts authored Apr 25, 2024

* Add utility for finding candidate models for deprecation

* Better model filtering

* Update

* Add warning tip

* Fix up

* Review comments

* Filter requests based on tags

* Add copyright header

30ee508c

24 Apr, 2024 7 commits

consistent job / pytest report / artifact name correspondence (#30392) · fbb41cd4

Yih-Dar authored Apr 24, 2024



* better names

* run better names

* update

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

fbb41cd4

Fix wrong indent in `utils/check_if_new_model_added.py` (#30456) · d0d430f1
Yih-Dar authored Apr 24, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
d0d430f1

Add llama3 (#30334) · 89c510d8

Arthur authored Apr 24, 2024



* nuke

* add co-author

* add co-author

* update card

* fixup and fix copies to please our ci

* nit fixup

* super small nits

* remove tokenizer_path from call to `write_model`

* always safe serialize by default

---------
Co-authored-by: pcuenca <pcuenca@users.noreply.github.com>
Co-authored-by: xenova <xenova@users.noreply.github.com>

89c510d8

New model PR needs green (slow tests) CI (#30341) · fc34f842

Yih-Dar authored Apr 24, 2024



* You should not pass
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

fc34f842

Remove mentions of models in the READMEs and link to the documentation page in... · c6bba940

Lysandre Debut authored Apr 24, 2024

Remove mentions of models in the READMEs and link to the documentation page in which they are featured. (#30420)

* REAMDEs

* REAMDEs v2

c6bba940

Remove add-new-model in favor of add-new-model-like (#30424) · d4e92f1a
Lysandre Debut authored Apr 24, 2024
```
* Remove add-new-model in favor of add-new-model-like

* nits
```
d4e92f1a
Remove task guides auto-update in favor of links towards task pages (#30429) · 0eb8fbcd
Lysandre Debut authored Apr 24, 2024

0eb8fbcd

23 Apr, 2024 1 commit

Remove old TF port docs (#30426) · 696ededd

Matt authored Apr 23, 2024

* Remove old TF port guide

* repo-consistency

* Remove some translations as well for consistency

* Remove some translations as well for consistency

696ededd

19 Apr, 2024 2 commits

Add TF swiftformer (#23342) · d2cec09b

João David authored Apr 19, 2024



* Duplicate swiftformer

* Convert SwiftFormerPatchEmbedding

* Convert SwiftFormerEmbeddings

* Convert TFSwiftFormerMlp

* Convert TFSwiftFormerConvEncoder

* Convert TFSwiftFormerLocalRepresentation

* convert TFSwiftFormerEncoderBlock

* Convert SwiftFormerStage

* Convert SwiftFormerEncoder

* Add TFSWiftFormerPreTrainedModel

* Convert SwiftFormerForImageClassification

* Add kwargs and start drop path

* Fix syntax

* Change Model class name

* Add TFSwiftFormer to __init__

* Duplicate test_modeling_swiftformer

* First test conversions

* Change require_torch to require_tf

* Add exports to swiftformer __init__

* Add TFSwiftFormerModel wrapper

* Fix __init__ and run black

* Remove docstring from MainLayer, fix padding

* Use keras.layers.Activation on keras.Sequential

* Fix swiftformer exports

* Fix activation layer from config

* Remove post_inits

* Use tf.keras.layers.ZeroPadding2D

* Convert torch normalize

* Change tf test input shape

* Fix softmax and reduce_sum

* Convert expand_dims and repeat

* Add missing reshape and tranpose

* Simplify TFSwiftFormerEncoderBlock.call

* Fix mismatch in patch embeddings

* Fix expected output shape to match channels last

* Fix swiftformer typo

* Disable test_onnx

* Fix TFSwiftFormerForImageClassification call

* Add unpack inputs

* Convert flatten(2).mean(-1)

* Change vision dummy inputs (to be reviewed)

* Change test_forward_signature to use .call

* Fix @unpack_inputs

* Set return_tensors="tf" and rename class

* Rename wrongly named patch_embeddings layer

* Add serving_output and change dummy_input shape

* Make dimensions BCHW and transpose inside embedding layer

* Change SwiftFormerEncoderBlock

* Fix ruff problems

* Add image size to swiftformer config

* Change tranpose to MainLayer and use -1 for reshape

* Remove serving_outputs and dummy_inputs

* Remove test_initialization test from tf model

* Make Sequential component a separate layer

* Fix layers' names

* Tranpose encoder outputs

* Fix tests and check if hidden states is not None

* Fix TFSwiftFormerForImageClassification

* Run make fixup

* Run make fix-copies

* Update modeling_tf_auto

* Update docs

* Fix modeling auto mapping

* Update modelint_tf_swiftformer docs

* Fill image_size doc and type

* Add reduction=None to loss computation

* Update docs

* make style

* Debug: Delete the tip to see if that changes anything

* Re-add tip

* Remove add_code_sample_docstrings

* Remove unused import

* Get the debug to actually tell us the problem it has with the docs

* Try a substitution to match the PyTorch file?

* Add swiftformer to ignore list

* Add build() methods

* Update copyright year
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Remove FIXME comment

* Remove from_pt

* Update copyright year
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Rename one-letter variables

* Remove FIXMEs related to momentum

* Remove old TODO comment

* Remove outstanding FIXME comments

* Get dropout rate from config

* Add specific dropout config for MLP

* Add convencoder dropout to config

* Pass config to SwiftFormerDropPath layer

* Fix drop_path variable name and add Adapted from comment

* Run ruff

* Removed copied from comment

* Run fix copies

* Change drop_path to identity to match pt

* Cleanup build() methods and move to new keras imports

* Update docs/source/en/model_doc/swiftformer.md
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Raise error if drop_path_rate > 0.0

* Apply suggestions from code review

Replace (self.dim), with self.dim,
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove drop_path function

* Add training to TFSwiftFormerEncoder

* Set self.built = True last
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Should have been added to previous commit
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Change default_feature_extractor to default_image_processor
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Import Keras from modeling_tf_utils

* Remove relative import

* Run ruff --fix

* Move import keras to tf_available

* Add copied from comment to test_forward_signature

* Reduce batch size and num_labels

* Extract loss logic to hf_compute_loss

* Run ruff format

---------
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

d2cec09b

Transformers Metadata (#30344) · e67ccf06
Lysandre Debut authored Apr 19, 2024

e67ccf06

18 Apr, 2024 1 commit

Add jamba (#29943) · 3f20877d

tomeras91 authored Apr 18, 2024

* Add jamba arch

* apply "make fix-copies" changes

* fix link to model in JambaConfig docstring

* Add n_ctx in modeling file because repo-consistency wants that

* Add jamba to flash attention and sdpa documentation

* mamba dt_proj quant fix now works for LoRA as well

* override test_left_padding_compatibility and use a more permissive tolerance. left padding numerical difference are accentuated by mamba layers

* add jamba to tokenization auto

* fix comments of shape (PR #24 in the model page: https://huggingface.co/ai21labs/Jamba-v0.1/discussions/24)

* simple PR fixes

* remove unnecessary kwargs from JambaAttentionDecoderLayer and JambaMambaDecoderLayer

* remove the LoRA hack for the mamba dt_proj bias. It was solved in huggingface/peft#1530 (https://github.com/huggingface/peft/pull/1530)

* Add copied comment on JambaMLP (it's the same as MixtralMLP)

* remove padding_mask warnings. It's not supported anymore

* fix docstring. Float instead of int

* A few more minor PR fixes

* (1) lowercase names for mamba layernorms (2) remove _apply_inner_layernorms and do it directly in the forward pass

* Return None attention weights from mamba layers. Append to all attentions only if not None.

* remove some leftover jamba archive lists

* Better separation between expert vs non-expert layers. non-expert layers return None as router_logits, and it is not concatenated to all_router_logits returned from JambaModel

* no need to take router_logits at config.expert_layer_offset anymore. result.router_logits now holds results only for expert layers

* Add Jamba paper on READMEs

* (1) rename n_ctx -> max_position_embeddings (2) don't use it in the modeling file since it's not needed (set it as an exception to check_config_attributes)

* Add copied from comment

* remove the code path for apply_inner_layernorms=False. Jamba always has the inner mamba layernorms

* clearer docstring for _convert_to_standard_cache

* style fixes

* Change calc_logits_for_entire_prompt (bool) to num_logits_to_keep (int). Adapt assisted decoding code tp use it. Also small change in low memory beam search decoding path to support this new int value in model_inputs

* rename test so it still overrides what its meant to override

* draft

* oups

* nit

* remove more complexe logic

* fix names used in config

* fix fix fix

* style

* fix some more failing tests

* generate did not init the cache 🙃



* more small nits

* typo

* config.mamba_expand * config.hidden_size for the intermediate size of the mamba shapes

* fix init of pkv with torch.tensor()

* empty tensor

* fix some init issues

* stupid changes required by generate because it does not even support it's own DynamicCache class

* more fixes

* fix general assisted gen cache_position bug

* tests passing

* Add offsets and periods as SPECIAL_CASES_TO_ALLOW in check_config_attributes.py

* fix reorder_cache to reorder mamba states and override some more functions in HybridMambaAttentionDynamicCache

* no need to override test_past_key_values_format() and _check_past_key_values_for_generate() in tests anymore

* fix docstrings and typehints for past_key_values

* style fixes

* fix docs

* change typehint due to copy from Mixtral

* forgot import

* import order

* Add configuration_jamba and modeling_jamba to not_doctested because the model is too big to download (in docstring of JambaForCausalLM.forward)

* Add integration test with tiny tandom Jamba model on hub

* fix flash attention cache shapes

* bring back forgotten hidden states

* rename HybridMambaAttentionDynamicCache.seqlen_offset to has_previous_state (and make bool) and bugfix - it should be set to True after a finished forward pass of the entire model

* align integration test after modeling fixes

* bugfix - mamba can use precomputed states only of forward pass is on a single token

* bugfix - mamba can use precomputed states only if they match the batch size

* typo

* remove making _prepare_4d_causal_attention_mask a leaf function

* stop using past_seq_len.get_seq_length(). Use cache positions instead. Adjust test (test_decoder_model_past_with_large_inputs) accordingly

---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>

3f20877d

16 Apr, 2024 2 commits
- Fix test fetcher (doctest) + `Idefics2`'s doc example (#30274) · 5fabebdb
  Yih-Dar authored Apr 16, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  5fabebdb
- More fixes for doctest (#30265) · cbc2cc18
  Yih-Dar authored Apr 16, 2024
```
* fix

* update

* update

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  cbc2cc18
15 Apr, 2024 3 commits

Add Idefics2 (#30253) · 6b78360e

amyeroberts authored Apr 15, 2024



* Initial add model additions

* Test

* All weights loading

* Can perform full forward pass

* Local and remote the same

* Matching local and remote

* Fixup

* Idefics2Model importable; fixup docstrings

* Don't skip by default

* Remove deprecated use_resampler arg

* Remove self.config

* DecoupledLinear takes config

* Tidy up

* Enable eager attention and tidy up

* Most tests passing

* Update for batch of processed images

* Add image processor

* Update doc pages

* Update conversion script

* Remove erroneous breakpoint

* Remove accidendtal spelling change

* Update to reflect changes on hub - make generate work

* Fix up

* Image processor tests

* Update tests

* Add a processor

* Add a processor

* Update convert script

* Update modeling file - remove fixmes

* Bug fix

* Add processing test

* Use processor

* Fix up

* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Fix test

* Update config - PR comments and defaults align with checkpoint

* Reviewer comments

* Add copied froms for flahs attention

* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove qk_layer_norm and freeze_layers functionality

* Fix

* Remove freeze_layer options from config

* Sync with upstream main

* Fix attention shapes siglip

* Remove Llava-next refs - TO REBASE

* Use AutoModel for text model

* Add comment to explain vision embeddings

* Fix issue with tie_word_embeddings

* Address review comments

* Fix and fix up

* Chat templates for idefics

* Fix copies

* Fix

* Add layer norms to FA2

* Fix tests

* Apply suggestions from code review
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Fix

* Review comments

* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update inputs merger

* Merge weights in correct order

* Update convert script

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update template

* Model code examples (fix idefics too)

* More review comments

* Tidy up

* Update processing

* Fix attention mask preparation

* Update inputs_merger inputs

* Vectorize inputs_merger

* Update src/transformers/models/idefics2/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/idefics2/modeling_idefics2.py

* Review comments

* saying bye to the `qk_layer_norms`

* Simplify

* Update latents

* Remove erroneuous readme changes

* Return images when applying chat template

* Fix bug - prompt images are for a single sample

* Update src/transformers/models/idefics2/modeling_idefics2.py

* image splitting

* fix test

* some more comment

* some comment

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics2/image_processing_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update processor

* Update model tests

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Don't add BOS in template

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Remove index in examples

* Update tests to reflect #13

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* PR comment - consistent typing

* Update readme and model doc

* Update docs

* Update checkpoint references

* Update examples

* Fix and update tests

* Small addition

* Update tests - remove copied from as no ignore placement copy could be found

* Update example

* small fixes

* Update docs/source/en/model_doc/idefics2.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update docs/source/en/model_doc/idefics2.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update README.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Connector model as bridge

* Fix up

* Fix up

* Don't pass model inputs for generation kwargs update

* IDEFICS-2 -> Idefics2

* Remove config archive name

* IDEFICS-2 -> Idefics2

* Add back llava-next

* Update readmes

* Add requirements for processor tester

* Use custom convert_to_rgb to avoid possible BC

* Fix doc example

* Fix doc example

* Skip model doc tests - as model to large

* More doc example - account for image splitting

* Update src/transformers/image_transforms.py

* Fix config doctest

---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

6b78360e

update github actions packages' version to suppress warnings (#30249) · 440bd3c3
Yih-Dar authored Apr 15, 2024
```
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
440bd3c3
Fix doctest more (for `docs/source/en`) (#30247) · fe2d20d2
Yih-Dar authored Apr 15, 2024
```
* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
fe2d20d2