Commits · 30b453206d224ee5f747afa33ff216671558e6a0 · chenpangpang / transformers

19 Apr, 2024 2 commits

Enable multi-device for some models (#30207) · 30b45320

Jacky Lee authored Apr 19, 2024



* feat: multidevice for resnet

* feat: yes! resnet

* fix: compare all elements in tuple

* feat: support for regnet

* feat: support for convnextv2

* feat: support for bit

* feat: support for cvt

* feat: add support for focalnet

* feat: support for yolos

* feat: support for glpn

* feat: support for imagegpt

* feat: support for levit

* feat: support for mgp_str

* feat: support for mobilenet_v1

* feat: support for mobilenet_v2

* feat: support for mobilevit

* feat: support for mobilevitv2

* feat: support for poolformer

* fix: copies

* fix: code quality check

* update: upstream changes from main

* fix: consistency check

* feat: support for sam

* feat: support for switchformer

* feat: support for swin

* feat: support for swinv2

* feat: support for timesformer

* feat: support for trocr

* feat: support for upernet

* fix: check copies

* update: rerun CI

* update: rerun again, maybe

* update: one more rerun

---------
Co-authored-by: Jacky Lee <jackylee328@gmail.com>

30b45320

[UDOP] Add special tokens to tokenizer (#29594) · ecfe9be7

NielsRogge authored Apr 19, 2024

* Add special tokens

* Add special tokens

* Use fmt

* Uncomment code

* Add test

* Remove scripts

* Address comments

* Improve tests

* Address comment

* Remove flag

ecfe9be7

18 Apr, 2024 16 commits

Fix `AssertionError` in clip conversion script (#30321) · d9850abd

Yih-Dar authored Apr 18, 2024



* fix

* fix

* fix

* update comments

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

d9850abd

Avoid `jnp` import in `utils/generic.py` (#30322) · 01ae3b87
Yih-Dar authored Apr 18, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
01ae3b87
🚨🚨🚨Deprecate `evaluation_strategy` to `eval_strategy`🚨🚨🚨 (#30190) · 60d5f8f9
Zach Mueller authored Apr 18, 2024
```
* Alias

* Note alias

* Tests and src

* Rest

* Clean

* Change typing?

* Fix tests

* Deprecation versions
```
60d5f8f9
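
The rename in the commit above follows a common keyword-deprecation pattern. A minimal sketch of that pattern in plain Python, assuming a hypothetical `Args` dataclass (this is not the actual `TrainingArguments` code): the old name is still accepted, emits a `FutureWarning`, and its value is copied to the new name.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class Args:
    """Hypothetical stand-in for an arguments class; not the transformers implementation."""

    eval_strategy: str = "no"
    # Deprecated alias, kept only so existing callers keep working.
    evaluation_strategy: Optional[str] = None

    def __post_init__(self):
        if self.evaluation_strategy is not None:
            warnings.warn(
                "`evaluation_strategy` is deprecated; use `eval_strategy` instead.",
                FutureWarning,
            )
            # The old name takes effect only when it was explicitly provided.
            self.eval_strategy = self.evaluation_strategy
```

With this shape, `Args(evaluation_strategy="steps")` and `Args(eval_strategy="steps")` end up with the same `eval_strategy`, and only the first one warns.
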
disable use_cache if using gradient checkpointing (#30320) · 57b92bbf
Zizhao Chen authored Apr 18, 2024

57b92bbf
fix Parameter dtype in audio models (#30310) · 68be1d3c
Yoach Lacombe authored Apr 18, 2024

68be1d3c
Fix: remove `pad token id` in pipeline forward arguments (#30285) · 79132145
Raushan Turganbay authored Apr 18, 2024

79132145
Dev version · ce8e64fb
Lysandre authored Apr 18, 2024

ce8e64fb

FIX: Fixes unexpected behaviour for Llava / LLama & AWQ Fused modules + revert... · 5728b5ad

Younes Belkada authored Apr 18, 2024

FIX: Fixes unexpected behaviour for Llava / LLama & AWQ Fused modules + revert #30070 at the same time (#30317)

* Update awq.py

* style

* revert felix PR

* fix

* add felix comments

5728b5ad

Add DBRX Model (#29921) · 005b957f

Abhi Venigalla authored Apr 18, 2024



* wip

* fix __init__.py

* add docs

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments 1

* work on make fixup

* pass configs down

* add sdpa attention

* remove DbrxBlock

* add to configuration_auto

* docstring now passes formatting test

* fix style

* update READMEs

* add dbrx to modeling_auto

* make fix-copies generated this

* add DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP

* config docstring passes formatting test

* rename moe_loss_weight to router_aux_loss_coef

* add to flash-attn documentation

* fix model-path in tests

* Explicitly make `"silu"` the default `ffn_act_fn`
Co-authored-by: Wing Lian <wing.lian@gmail.com>

* default to using router_aux_loss_coef over ffn_config[moe_loss_weight]

* fix _flash_attn_uses_top_left_mask and is_causal

* fix tests path

* don't use token type IDs

* follow Llama and remove token_type_ids from test

* init ConfigTester differently so tests pass

* remove multiple choice test

* remove question + answer test

* remove sequence classification test

* remove token classification test

* copy Llama tests and remove token_type_ids from test inputs

* do not test pruning or headmasking; style code

* add _tied_weights_keys parameter to pass test

* add type hints

* fix type check

* update config tester

* remove masked_lm test

* remove encoder tests

* initialize DbrxModelTester with correct params

* style

* torch_dtype does not rely on torch

* run make fixup, fix-copies

* use https://huggingface.co/v2ray/dbrx-base-fixed/blob/main/modeling_dbrx.py



* add copyright info

* fix imports and DbrxRotaryEmbedding

* update DbrxModel docstring

* use copies

* change model path in docstring

* use config in DbrxFFN

* fix flashattention2, sdpaattention

* input config to DbrxAttention, DbrxNormAttentionNorm

* more fixes

* fix

* fix again!

* add informative comment

* fix ruff?

* remove print statement + style

* change doc-test

* fix doc-test

* fix docstring

* delete commented out text

* make defaults match dbrx-instruct

* replace `router_aux_loss_coef` with `moe_loss_weight`

* is_decoder=True

* remove is_decoder from configtester

* implement sdpa properly

* make is_decoder pass tests

* start on the GenerationTesterMixin tests

* add dbrx to sdpa documentation

* skip weight typing test

* style

* initialize smaller model
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Add DBRX to toctree

* skip test_new_cache_format

* make config defaults smaller again

* add pad_token_id

* remove pad_token_id from config

* Remove all references to DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP

* Update src/transformers/models/dbrx/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/dbrx.md
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/models/dbrx/configuration_dbrx.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/dbrx.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix typo

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update docs, fix configuration_auto.py

* address pr comments

* remove is_decoder flag

* slice

* fix requires grad

* remove grad

* disconnect differently

* remove grad

* enable grads

* patch

* detach expert

* nissan al ghaib

* Update modeling_dbrx.py

* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* replace "Gemma" with "Dbrx"

* remove # type: ignore

* don't hardcode vocab_size

* remove ToDo

* Re-add removed idefics2 line

* Update test to use tiny-random!

* Remove TODO

* Remove one more case of loading the entire dbrx-instruct in the tests

* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* address some comments

* small model

* add dbrx to tokenization_auto

* More docstrings with add_start_docstrings

* Dbrx for now

* add PipelineTesterMixin

* Update src/transformers/models/dbrx/configuration_dbrx.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove flash-attn2 import error

* fix docstring
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add usage example

* put on one line
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix ffn_act_fn
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change "dbrx" to "DBRX" for display purposes.

* fix __init__.py?

* fix __init__.py

* fix README

* return the aux_loss

* remove extra spaces

* fix configuration_auto.py

* fix format in tokenization_auto

* remove new line

* add more usage examples

---------
Co-authored-by: Abhi Venigalla <abhi.venigalla@databricks.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Eitan Turok <eitan.turok@databricks.com>
Co-authored-by: Eitan Turok <150733043+eitanturok@users.noreply.github.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Eitan Turok <eitanturok@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

005b957f

Do not drop mask with SDPA for more cases (#30311) · 63c5e27e
fxmarty authored Apr 18, 2024
```
* overlooked

* style

* cleaner
```
63c5e27e

Revert "Re-enable SDPA's FA2 path (#30070)" (#30314) · acab997b

Arthur authored Apr 18, 2024

* Revert "Re-enable SDPA's FA2 path (#30070)"

This reverts commit 05bdef16.

* Revert "Fix quality Olmo + SDPA (#30302)"

This reverts commit ec92f983.

acab997b

Fix RecurrentGemma device_map (#30273) · 7509a0ad
Marc Sun authored Apr 18, 2024
```
* Switch to non-persistent buffer

* fix device mismatch issue due to cache

* style
```
7509a0ad

Add jamba (#29943) · 3f20877d

tomeras91 authored Apr 18, 2024

* Add jamba arch

* apply "make fix-copies" changes

* fix link to model in JambaConfig docstring

* Add n_ctx in modeling file because repo-consistency wants that

* Add jamba to flash attention and sdpa documentation

* mamba dt_proj quant fix now works for LoRA as well

* override test_left_padding_compatibility and use a more permissive tolerance. Left-padding numerical differences are accentuated by mamba layers

* add jamba to tokenization auto

* fix comments of shape (PR #24 in the model page: https://huggingface.co/ai21labs/Jamba-v0.1/discussions/24)

* simple PR fixes

* remove unnecessary kwargs from JambaAttentionDecoderLayer and JambaMambaDecoderLayer

* remove the LoRA hack for the mamba dt_proj bias. It was solved in huggingface/peft#1530 (https://github.com/huggingface/peft/pull/1530)

* Add copied comment on JambaMLP (it's the same as MixtralMLP)

* remove padding_mask warnings. It's not supported anymore

* fix docstring. Float instead of int

* A few more minor PR fixes

* (1) lowercase names for mamba layernorms (2) remove _apply_inner_layernorms and do it directly in the forward pass

* Return None attention weights from mamba layers. Append to all attentions only if not None.

* remove some leftover jamba archive lists

* Better separation between expert vs non-expert layers. non-expert layers return None as router_logits, and it is not concatenated to all_router_logits returned from JambaModel

* no need to take router_logits at config.expert_layer_offset anymore. result.router_logits now holds results only for expert layers

* Add Jamba paper on READMEs

* (1) rename n_ctx -> max_position_embeddings (2) don't use it in the modeling file since it's not needed (set it as an exception to check_config_attributes)

* Add copied from comment

* remove the code path for apply_inner_layernorms=False. Jamba always has the inner mamba layernorms

* clearer docstring for _convert_to_standard_cache

* style fixes

* Change calc_logits_for_entire_prompt (bool) to num_logits_to_keep (int). Adapt assisted decoding code to use it. Also small change in low memory beam search decoding path to support this new int value in model_inputs

* rename test so it still overrides what it's meant to override

* draft

* oups

* nit

* remove more complex logic

* fix names used in config

* fix fix fix

* style

* fix some more failing tests

* generate did not init the cache 🙃



* more small nits

* typo

* config.mamba_expand * config.hidden_size for the intermediate size of the mamba shapes

* fix init of pkv with torch.tensor()

* empty tensor

* fix some init issues

* stupid changes required by generate because it does not even support its own DynamicCache class

* more fixes

* fix general assisted gen cache_position bug

* tests passing

* Add offsets and periods as SPECIAL_CASES_TO_ALLOW in check_config_attributes.py

* fix reorder_cache to reorder mamba states and override some more functions in HybridMambaAttentionDynamicCache

* no need to override test_past_key_values_format() and _check_past_key_values_for_generate() in tests anymore

* fix docstrings and typehints for past_key_values

* style fixes

* fix docs

* change typehint due to copy from Mixtral

* forgot import

* import order

* Add configuration_jamba and modeling_jamba to not_doctested because the model is too big to download (in docstring of JambaForCausalLM.forward)

* Add integration test with tiny random Jamba model on hub

* fix flash attention cache shapes

* bring back forgotten hidden states

* rename HybridMambaAttentionDynamicCache.seqlen_offset to has_previous_state (and make bool) and bugfix - it should be set to True after a finished forward pass of the entire model

* align integration test after modeling fixes

* bugfix - mamba can use precomputed states only if forward pass is on a single token

* bugfix - mamba can use precomputed states only if they match the batch size

* typo

* remove making _prepare_4d_causal_attention_mask a leaf function

* stop using past_seq_len.get_seq_length(). Use cache positions instead. Adjust test (test_decoder_model_past_with_large_inputs) accordingly

---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>

3f20877d
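
One bullet in the commit above replaces a boolean (`calc_logits_for_entire_prompt`) with an integer (`num_logits_to_keep`). A minimal sketch of why an integer is more flexible, using plain Python lists instead of tensors. The helper name `select_positions` is hypothetical, and treating `None` as "keep every position" is an assumption of this sketch rather than something the commit message states.

```python
from typing import List, Optional, Sequence


def select_positions(hidden_states: Sequence, num_logits_to_keep: Optional[int] = None) -> List:
    """Hypothetical helper: pick the trailing positions that logits would be computed for."""
    if num_logits_to_keep is None:
        # Assumption of this sketch: None means "all positions".
        return list(hidden_states)
    if num_logits_to_keep <= 0:
        raise ValueError("num_logits_to_keep must be a positive integer or None")
    # Keep only the last `num_logits_to_keep` positions.
    return list(hidden_states[-num_logits_to_keep:])
```

A boolean can only express "last token" versus "whole prompt". An integer also covers cases in between, such as a decoding scheme that needs logits for the last few candidate tokens only.
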

Fix all torch pipeline failures except one (#30290) · 28a22834
Yih-Dar authored Apr 18, 2024
```
* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
28a22834
Fix donut token2json multiline (#30300) · 7915a259
Pavel Iakubovskii authored Apr 18, 2024
```
* Fix multiline processing

* Update test for token2json
```
7915a259

Add Flash Attention 2 to M2M100 model (#30256) · b65df514

Alexander Visheratin authored Apr 18, 2024



* Added flash attention 2.

* Fixes.

* Fix inheritance.

* Fixed init.

* Remove stuff.

* Added documentation.

* Add FA2 to M2M100 documentation.

* Add test.

* Fixed documentation.

* Update src/transformers/models/m2m_100/modeling_m2m_100.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fixed variable name.

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

b65df514

17 Apr, 2024 9 commits

Fix quality Olmo + SDPA (#30302) · ec92f983
fxmarty authored Apr 17, 2024
```
fix olmo
```
ec92f983

Re-enable SDPA's FA2 path (#30070) · 05bdef16

fxmarty authored Apr 17, 2024



* tentatively re-enable FA2 + SDPA

* better comment

* _ignore_causal_mask_sdpa as staticmethod

* type hints

* use past_seen_tokens instead

* enable copied from for sdpa

* ruff

* llama simplifications on review

* remove unnecessary self.is_causal check

* fix copies

* cleaning

* precise message

* better doc

* add test

* simplify

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

05bdef16

Add OLMo model family (#29890) · e4ea19b9

Shane A authored Apr 17, 2024

* Add OLMo using add-new-model-like with Llama

* Fix incorrect tokenizer for OLMo

* Copy-paste relevant OLMo methods and their imports

* Add OLMo config

* Modify OLMo config to follow HF conventions

* Remove unneeded Llama code from OLMo model

* Add ability for OLMo model to output attentions

* Add OLMoPreTrainedModel and OLMoModel

* Add OLMoForCausalLM

* Minor fixes to OLMo model for style and missing functions

* Implement OLMo tokenizer

* Implement OLMo to HF conversion script

* Add tests for OLMo model

* Add tests for OLMo fast tokenizer

* Add auto-generated dummy objects

* Remove unimplemented OLMo classes from auto and init classes and re-format

* Add README and associated auto-generated files

* Use OLMo names for common properties

* Run make fixup

* Remove `|` from OLMo typing

* Remove unneeded tokenization_olmo.py

* Revert model, config and converter to add-new-model-like Llama

* Move logic for adding b...

e4ea19b9

Upgrading to tokenizers 0.19.0 (#30289) · 8e5f76f5

Nicolas Patry authored Apr 17, 2024

* [DO NOT MERGE] Testing tokenizers 0.19.0rc0

* Accounting for the breaking change.

* Ruff.

* Upgrading to tokenizers `0.19` (new release with prepend_scheme fixed
and new surface for BPE tiktoken bug).

8e5f76f5

Add strategy to store results in evaluation loop (#30267) · c15aad09

Pavel Iakubovskii authored Apr 17, 2024

* Add evaluation loop container for interm. results

* Add tests for EvalLoopContainer

* Formatting

* Fix padding_index in test and typo

* Move EvalLoopContainer to pr_utils to avoid additional imports

* Fix `eval_do_concat_batches` arg description

* Fix EvalLoopContainer import

c15aad09

Add token type ids to CodeGenTokenizer (#29265) · 8d6b5096

st81 authored Apr 17, 2024

* Add create token type ids to CodeGenTokenizer

* Fix inconsistent length of token type ids

* Format source codes

* Fix inconsistent order of methods

* Update docstring

* add test_tokenizer_integration test

* Format source codes

* Add `copied from` comment to CodeGenTokenizerFast

* Add doc of create_token_type_ids_from_sequences

* Make return_token_type_ids False by default

* Make test_tokenizer_integration as slow test

* Add return_token_type_ids to tokenizer init arg

* Add test for tokenizer's init return_token_type_ids

* Format source codes

8d6b5096

Enable fx tracing for Mistral (#30209) · 304c6a1e
Raushan Turganbay authored Apr 17, 2024
```
* tracing for mistral

* typo

* fix copies
```
304c6a1e
Fix SpeechT5 forward docstrings (#30287) · 41145247
Yoach Lacombe authored Apr 17, 2024

41145247

Fix SDPA sliding window compatibility (#30127) · 40eb6d6c

fxmarty authored Apr 17, 2024



* fix sdpa + sliding window

* give credit
Co-authored-by: ehuaa <ehuamail@163.com>

* remove unnecessary warning

* fix typo

* add test

---------
Co-authored-by: ehuaa <ehuamail@163.com>

40eb6d6c

16 Apr, 2024 7 commits

Fix test fetcher (doctest) + `Idefics2`'s doc example (#30274) · 5fabebdb
Yih-Dar authored Apr 16, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
5fabebdb
fix: Fixed a `raise` statement (#30275) · 37b5946a
Sai-Suraj-27 authored Apr 16, 2024
```
* Fixed a raise statement.

* Minor changes.
```
37b5946a
Raise relevant err when wrong type is passed in as the accelerator_config (#29997) · e27d9308
Zach Mueller authored Apr 16, 2024
```
* Raise relevant err

* Use type instead
```
e27d9308

add `push_to_hub` to pipeline (#29172) · 0eaef0c7

Hafedh authored Apr 16, 2024



* add `push_to_hub` to pipeline

* fix docs

* format with ruff

* update save_pretrained

* update save_pretrained

* remove unnecessary comment

* switch to push_to_hub method in DynamicPipelineTester

* remove unused imports

* update docs for add_new_pipeline

* fix docs for add_new_pipeline

* add comment

* fix Italian docs

* changes to token retrieval for pipelines

* Update src/transformers/pipelines/base.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

0eaef0c7

Allow for str versions of dicts based on typing (#30227) · 487505ff

Zach Mueller authored Apr 16, 2024

* Bookmark, initial implementation. Need to test

* Clean

* Working fully, woop woop

* I think working version now, testing

* Fin!

* rm cast, could keep None

* Fix typing issue

* rm typehint

* Add test

* Add tests and make more rigid

487505ff
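
The commit above lets dict-typed arguments also be given as strings. A minimal sketch of that idea in plain Python, assuming the string is a JSON object. The helper name `coerce_dict_arg` is hypothetical and this is not the code from the commit.

```python
import json
from typing import Any, Dict, Optional, Union


def coerce_dict_arg(value: Union[str, Dict[str, Any], None]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: accept a dict, a JSON-object string, or None."""
    if value is None or isinstance(value, dict):
        # Already in its final form; pass through unchanged.
        return value
    if isinstance(value, str):
        parsed = json.loads(value)  # raises json.JSONDecodeError on invalid JSON
        if not isinstance(parsed, dict):
            raise TypeError(f"expected a JSON object, got {type(parsed).__name__}")
        return parsed
    raise TypeError(f"expected dict or str, got {type(value).__name__}")
```

This is mainly useful for command-line entry points, where every value arrives as a string and a dict cannot be passed directly.
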

FIX: Fix 8-bit serialization tests (#30051) · b86d0f4e

Younes Belkada authored Apr 16, 2024



* fix 8-bit serialization tests

* add more clarification

* Update src/transformers/quantizers/quantizer_bnb_8bit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

b86d0f4e

More fixes for doctest (#30265) · cbc2cc18

Yih-Dar authored Apr 16, 2024



* fix

* update

* update

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

cbc2cc18

15 Apr, 2024 6 commits

Add Idefics2 (#30253) · 6b78360e

amyeroberts authored Apr 15, 2024



* Initial add model additions

* Test

* All weights loading

* Can perform full forward pass

* Local and remote the same

* Matching local and remote

* Fixup

* Idefics2Model importable; fixup docstrings

* Don't skip by default

* Remove deprecated use_resampler arg

* Remove self.config

* DecoupledLinear takes config

* Tidy up

* Enable eager attention and tidy up

* Most tests passing

* Update for batch of processed images

* Add image processor

* Update doc pages

* Update conversion script

* Remove erroneous breakpoint

* Remove accidental spelling change

* Update to reflect changes on hub - make generate work

* Fix up

* Image processor tests

* Update tests

* Add a processor

* Add a processor

* Update convert script

* Update modeling file - remove fixmes

* Bug fix

* Add processing test

* Use processor

* Fix up

* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Fix test

* Update config - PR comments and defaults align with checkpoint

* Reviewer comments

* Add copied froms for flash attention

* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove qk_layer_norm and freeze_layers functionality

* Fix

* Remove freeze_layer options from config

* Sync with upstream main

* Fix attention shapes siglip

* Remove Llava-next refs - TO REBASE

* Use AutoModel for text model

* Add comment to explain vision embeddings

* Fix issue with tie_word_embeddings

* Address review comments

* Fix and fix up

* Chat templates for idefics

* Fix copies

* Fix

* Add layer norms to FA2

* Fix tests

* Apply suggestions from code review
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Fix

* Review comments

* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update inputs merger

* Merge weights in correct order

* Update convert script

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update template

* Model code examples (fix idefics too)

* More review comments

* Tidy up

* Update processing

* Fix attention mask preparation

* Update inputs_merger inputs

* Vectorize inputs_merger

* Update src/transformers/models/idefics2/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/idefics2/modeling_idefics2.py

* Review comments

* saying bye to the `qk_layer_norms`

* Simplify

* Update latents

* Remove erroneous readme changes

* Return images when applying chat template

* Fix bug - prompt images are for a single sample

* Update src/transformers/models/idefics2/modeling_idefics2.py

* image splitting

* fix test

* some more comment

* some comment

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics2/image_processing_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update processor

* Update model tests

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Don't add BOS in template

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Remove index in examples

* Update tests to reflect #13

* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* PR comment - consistent typing

* Update readme and model doc

* Update docs

* Update checkpoint references

* Update examples

* Fix and update tests

* Small addition

* Update tests - remove copied from as no ignore placement copy could be found

* Update example

* small fixes

* Update docs/source/en/model_doc/idefics2.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update docs/source/en/model_doc/idefics2.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Update README.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>

* Connector model as bridge

* Fix up

* Fix up

* Don't pass model inputs for generation kwargs update

* IDEFICS-2 -> Idefics2

* Remove config archive name

* IDEFICS-2 -> Idefics2

* Add back llava-next

* Update readmes

* Add requirements for processor tester

* Use custom convert_to_rgb to avoid possible BC

* Fix doc example

* Fix doc example

* Skip model doc tests - as the model is too large

* More doc example - account for image splitting

* Update src/transformers/image_transforms.py

* Fix config doctest

---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

6b78360e

round epoch only in console (#30237) · 76681015
LZR authored Apr 15, 2024

76681015
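
The title of the commit above suggests rounding `epoch` for console display only, so stored logs keep full precision. A minimal sketch of that separation in plain Python. The helper name `format_for_console` and the two-digit default are assumptions, not taken from the commit.

```python
def format_for_console(logs: dict, ndigits: int = 2) -> dict:
    """Hypothetical helper: return a display copy of `logs` with `epoch` rounded."""
    shown = dict(logs)  # shallow copy so the caller's dict is left untouched
    if "epoch" in shown:
        shown["epoch"] = round(shown["epoch"], ndigits)
    return shown
```

Working on a copy means any downstream consumer of the original `logs` dict (a tracker, a saved state file) still sees the unrounded value.
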
Separate out kwargs in processor (#30193) · ec344b56
amyeroberts authored Apr 15, 2024
```
* Separate out kwargs in processor

* Fix up
```
ec344b56
fix: Fixed `type annotation` for compatibility with python 3.8 (#30243) · fc8eda36
Sai-Suraj-27 authored Apr 15, 2024
```
* Fixed type annotation for compatibility with python 3.8

* Fixed unsorted imports.
```
fc8eda36
fix: Replaced deprecated `typing.Text` with `str` (#30230) · b3595cf0
Sai-Suraj-27 authored Apr 15, 2024
```
typing.Text is deprecated. Use str instead
```
b3595cf0

Add test for parse_json_file and change typing to os.PathLike (#30183) · 8fd2de93

Xu Song authored Apr 15, 2024

* Add test for parse_json_file

* Change Path to PathLike

* Fix `Import block is un-sorted or un-formatted`

* revert parse_json_file

* Fix ruff format

* Add parse_json_file test

8fd2de93