"tests/models/pegasus/test_modeling_pegasus.py" did not exist on "f4432b7e01dc46008fb823096d884bdc2861b49c"
  1. 16 Jul, 2024 1 commit
  2. 09 Jul, 2024 1 commit
  3. 05 Jul, 2024 1 commit
  4. 27 Jun, 2024 1 commit
    • Add gemma 2 (#31659) · 0cf60f13
      Arthur authored
      
      
      * initial commit
      
      * Add doc
      
      * protect?
      
      * fixup stuffs
      
      * update tests
      
      * fix build documentation
      
      * mmmmmmm config attributes
      
      * style
      
      * nit
      
      * update
      
      * nit
      
      * Fix docs
      
      * protect some stuff
      
      ---------
      Co-authored-by: Lysandre <lysandre@huggingface.co>
      0cf60f13
  5. 26 Jun, 2024 1 commit
  6. 24 Jun, 2024 1 commit
  7. 22 May, 2024 1 commit
  8. 21 May, 2024 1 commit
  9. 20 May, 2024 1 commit
  10. 16 May, 2024 1 commit
  11. 08 May, 2024 1 commit
  12. 09 Apr, 2024 1 commit
  13. 01 Apr, 2024 1 commit
  14. 01 Mar, 2024 1 commit
  15. 28 Feb, 2024 1 commit
  16. 22 Feb, 2024 1 commit
  17. 21 Feb, 2024 1 commit
    • [`gemma`] Adds support for Gemma 💎 (#29167) · 594c1277
      Arthur authored
      * initial commit
      
      * update
      
      * update conversion checkpoint
      
      * update conversion script
      
      * nits
      
      * some fixes
      
      * nits
      
      * merge
      
      * fix permute
      
      * nits
      
      * fix
      
      * nits
      
      * nits
      
      * nits
      
      * fix rope
      
      * fix both rope
      
      * nits
      
      * style
      
      * make sure flax works
      
      * fix flax init code
      
      * fix forward
      
      * nits
      
      * print flax generation out
      
      * current code
      
      * nits
      
      * SIIIIIIIIIIIIIIIIIII
      
      * update
      
      * add new tokenizer
      
      * correct fast tokenizer
      
      * fix conversion
      
      * more comments
      
      * fix modeling and conversion
      
      * nits and nits
      
      * nits testing
      
      * add some tokenization tests
      
      * add some edge cases
      
      * add slow tests and fix them
      
      * fixup
      
      * fix copies for modeling
      
      * fix copies
      
      * add 7B slow tests
      
      * fix
      
      * fix
      
      * fix tests
      
      * make tokenizer CIs go green
      
      * styling
      
      * last tokenizer nits
      
      * update jax tests
      
      * fix flax for 7b
      
      * add jit testing 🤗
      
      
      
      * cleanups
      
      * isolated nit, inv_freq for rotary_emb.inv_freq
      
      * propagate to jax
      
      * Apply suggestions from code review
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * adjust test
      
      * fix conversion script
      
      * change name
      
      * correct file names
      
      * update conversion script
      
      * Fix bos and eos token ids in the model configuration (#3)
      
      * update modelling
      
      * update conversion script
      
      * add static cache for gemma
      
      * fix sdpa generate
      
      * fix batched
      
      * multiple fixes
      
      * fix FA2
      
      * final fix
      
      * Rename a few missing strings and filenames (#4)
      
      * merge with upstream main
      
      * fix copies
      
      * fix copies
      
      * fix fixup
      
      * fix fixup
      
      * fix
      
      * fix
      
      * final tests
      
      * fix fx gemma tests
      
      * fix fx bf16/fp16 tests
      
      * update slow fx tests
      
      * fx slow tests: one logits, one generation
      
      * move jit test standalone
      
      * Apply suggestions from code review
      
      * nits
      
      * tokenizer updates
      
      * more tokenization updates: custom GemmaSentencepieceExtrator
      
      * style
      
      * Update src/transformers/cache_utils.py
      
      * Update src/transformers/models/gemma/__init__.py
      
      * Update tests/models/gemma/test_modeling_flax_gemma.py
      
      * small nits
      
      * style
      
      * update tokenization test
      
      * fix the rotary embedding
      
      * with style
      
      * fix slow tests
      
      * WARNING this commit might be very important for precisions
      
      * Update tests/models/gemma/test_modeling_flax_gemma.py
      
      * Update src/transformers/models/gemma/configuration_gemma.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * Update src/transformers/models/gemma/modeling_flax_gemma.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * small nits here and there!
      
      * forgotten nit
      
      * remove on the fly computation of inv_freq
      
      * revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float (see the sketch after this entry)
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_flax_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_tokenization_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_tokenization_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_tokenization_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_tokenization_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * nit conversion script link
      
      * fix some tests
      
      * add not doctest and pr doctest
      
      * repo consistency
      
      * fix last CIs 🚀
      
      
      
      * update all readmes
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      594c1277
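      Several bullets in this entry ("fix rope", "remove on the fly computation of inv_freq", "re-compute freq cis to make sure it's in float") revolve around rotary-embedding precision. A minimal sketch of the underlying idea, not the PR's actual code: keep the RoPE frequency table in float32 even when the model itself runs in half precision.

      ```python
      import torch

      def rotary_inv_freq(dim: int, base: float = 10000.0) -> torch.Tensor:
          # Standard RoPE frequencies: inv_freq[i] = base^(-2i / dim).
          # Kept in float32 on purpose: in fp16/bf16 the small frequencies
          # round badly and positions drift at long sequence lengths.
          return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))

      def freq_cis(seq_len: int, inv_freq: torch.Tensor) -> torch.Tensor:
          # Recompute the position/frequency outer product in float32 each call,
          # mirroring the "re-compute freq cis ... in float" bullet above.
          positions = torch.arange(seq_len, dtype=torch.float32)
          return torch.outer(positions, inv_freq)
      ```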
  18. 17 Jan, 2024 1 commit
    • Add qwen2 (#28436) · d6ffe74d
      Junyang Lin authored
      
      
      * add config, modeling, and tokenization
      
      * add auto and init
      
      * update readme
      
      * update readme
      
      * update team name
      
      * fixup
      
      * fixup
      
      * update config
      
      * update code style
      
      * update for fixup
      
      * update for fixup
      
      * update for fixup
      
      * update for testing
      
      * update for testing
      
      * fix bug for config and tokenization
      
      * fix bug for bos token
      
      * not doctest
      
      * debug tokenizer
      
      * not doctest
      
      * debug tokenization
      
      * debug init for tokenizer
      
      * fix style
      
      * update init
      
      * delete if in token auto
      
      * add tokenizer doc
      
      * add tokenizer in init
      
      * Update dummy_tokenizers_objects.py
      
      * update
      
      * update
      
      * debug
      
      * Update tokenization_qwen2.py
      
      * debug
      
      * Update convert_slow_tokenizer.py
      
      * add copies
      
      * add copied from and make style
      
      * update files map
      
      * update test
      
      * fix style
      
      * fix merge reading and update tests
      
      * fix tests
      
      * fix tests
      
      * fix style
      
      * debug a variable in readme
      
      * Update src/transformers/models/qwen2/configuration_qwen2.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * update test and copied from
      
      * fix style
      
      * update qwen2 tokenization and tests
      
      * Update tokenization_qwen2.py
      
      * delete the copied from after property
      
      * fix style
      
      * update tests
      
      * update tests
      
      * add copied from
      
      * fix bugs
      
      * update doc
      
      * add warning for sliding window attention (see the sketch after this entry)
      
      * update qwen2 tokenization
      
      * fix style
      
      * Update src/transformers/models/qwen2/modeling_qwen2.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix tokenizer fast
      
      ---------
      Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
      Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      d6ffe74d
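      The "add warning for sliding window attention" bullet concerns Qwen2's windowed-attention settings. A hedged sketch of where those knobs live, assuming the released Qwen2Config field names, with deliberately tiny dimensions so the model instantiates quickly:

      ```python
      from transformers import Qwen2Config, Qwen2ForCausalLM

      config = Qwen2Config(
          hidden_size=64,            # tiny values, illustration only
          intermediate_size=128,
          num_hidden_layers=2,
          num_attention_heads=4,
          num_key_value_heads=4,
          use_sliding_window=True,   # the setting the new warning is about
          sliding_window=4096,       # window size in tokens
      )
      model = Qwen2ForCausalLM(config)  # random weights; enough to see the config plumbing
      ```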
  19. 21 Dec, 2023 1 commit
  20. 20 Dec, 2023 1 commit
  21. 08 Dec, 2023 1 commit
    • F.scaled_dot_product_attention support (#26572) · 80377eb0
      fxmarty authored
      
      
      * add sdpa
      
      * wip
      
      * cleaning
      
      * add ref
      
      * yet more cleaning
      
      * and more :)
      
      * wip llama
      
      * working llama
      
      * add output_attentions=True support
      
      * bigcode sdpa support
      
      * fixes
      
      * gpt-bigcode support, require torch>=2.1.1
      
      * add falcon support
      
      * fix conflicts falcon
      
      * style
      
      * fix attention_mask definition
      
      * remove output_attentions from attnmaskconverter
      
      * support whisper without removing any Copied from statement
      
      * fix mbart default to eager renaming
      
      * fix typo in falcon
      
      * fix is_causal in SDPA
      
      * check is_flash_attn_2_available in the models init as well in case the model is not initialized through from_pretrained
      
      * add warnings when falling back on the manual implementation
      
      * precise doc
      
      * wip replace _flash_attn_enabled by config.attn_implementation
      
      * fix typo
      
      * add tests
      
      * style
      
      * add a copy.deepcopy on the config in from_pretrained, as we do not want to modify it inplace
      
      * obey to config.attn_implementation if a config is passed in from_pretrained
      
      * fix is_torch_sdpa_available when torch is not installed
      
      * remove dead code
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/bart/modeling_bart.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * remove duplicate pretraining_tp code
      
      * add dropout in llama
      
      * precise comment on attn_mask
      
      * add fmt: off for _unmask_unattended docstring
      
      * precise num_masks comment
      
      * nuke pretraining_tp in LlamaSDPAAttention following Arthur's suggestion
      
      * cleanup modeling_utils
      
      * backward compatibility
      
      * fix style as requested
      
      * style
      
      * improve documentation
      
      * test pass
      
      * style
      
      * add _unmask_unattended tests
      
      * skip meaningless tests for idefics
      
      * hard_check SDPA requirements when specifically requested
      
      * standardize the use of XXX_ATTENTION_CLASSES
      
      * fix SDPA bug with mem-efficient backend on CUDA when using fp32
      
      * fix test
      
      * rely on SDPA is_causal parameter to handle the causal mask in some cases
      
      * fix FALCON_ATTENTION_CLASSES
      
      * remove _flash_attn_2_enabled occurrences
      
      * fix test
      
      * add OPT to the list of supported flash models
      
      * improve test
      
      * properly test on different SDPA backends, on different dtypes & properly handle separately the pad tokens in the test
      
      * remove remaining _flash_attn_2_enabled occurrence
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update docs/source/en/perf_infer_gpu_one.md
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * remove use_attn_implementation
      
      * fix docstring & slight bug
      
      * make attn_implementation internal (_attn_implementation)
      
      * typos
      
      * fix tests
      
      * deprecate use_flash_attention_2=True (see the sketch after this entry)
      
      * fix test
      
      * add back llama that was removed by mistake
      
      * fix tests
      
      * remove _flash_attn_2_enabled occurrences bis
      
      * add check & test that passed attn_implementation is valid
      
      * fix falcon torchscript export
      
      * fix device of mask in tests
      
      * add tip about torch.jit.trace and move bt doc below sdpa
      
      * fix parameterized.expand order
      
      * move tests from test_modeling_attn_mask_utils to test_modeling_utils as a relevant test class is already there
      
      * update sdpaattention class with the new cache
      
      * Update src/transformers/configuration_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/bark/modeling_bark.py
      
      * address review comments
      
      * WIP torch.jit.trace fix. left: test both eager & sdpa
      
      * add test for torch.jit.trace for both eager/sdpa
      
      * fix falcon with torch==2.0 that needs to use sdpa
      
      * fix doc
      
      * hopefully last fix
      
      * fix key_value_length that has no default now in mask converter
      
      * is it flaky?
      
      * fix speculative decoding bug
      
      * tests do pass
      
      * fix following #27907
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      80377eb0
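      In user-facing terms, this PR lets `from_pretrained` choose the attention backend via `attn_implementation` and deprecates `use_flash_attention_2=True`. A minimal sketch (the checkpoint name is illustrative):

      ```python
      import torch
      from transformers import AutoModelForCausalLM

      model = AutoModelForCausalLM.from_pretrained(
          "meta-llama/Llama-2-7b-hf",   # any SDPA-supported checkpoint
          torch_dtype=torch.float16,
          attn_implementation="sdpa",   # "eager" and "flash_attention_2" are the alternatives
      )
      ```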
  22. 07 Dec, 2023 2 commits
    • Fix device of masks in tests (#27887) · c99f2547
      fxmarty authored
      fix device of mask in tests
      c99f2547
    • Add Llama Flax Implementation (#24587) · 75336c17
      Alex McKinney authored
      * Copies `modeling_flax_gpt_neo.py` to start
      
      * MLP Block. WIP Attention and Block
      
      * Adds Flax implementation of `LlamaMLP`
      Validated with in-file test.
      Some slight numeric differences, but assuming it isn't an issue
      
      * Adds `FlaxLlamaRMSNorm` layer
      `flax.linen` includes `RMSNorm` layer but not necessarily in all
      versions. Hence, we add in-file.
      
      * Adds FlaxLlamaAttention
      Copied from GPT-J as it has efficient caching implementation as well as
      rotary embeddings.
      Notice numerically different, but not by a huge amount. Needs
      investigating
      
      * Adds `FlaxLlamaDecoderLayer`
      numerically inaccurate, debugging..
      
      * debugging rotary mismatch
      gptj uses interleaved whilst llama uses contiguous
      i think they match now but still final result is wrong.
      maybe drop back to just debugging attention layer?
      
      * fixes bug with decoder layer
      still somewhat numerically inaccurate, but close enough for now
      
      * adds markers for what to implement next
      the structure here diverges a lot from the PT version.
      not a big fan of it, but just get something working for now
      
      * implements `FlaxLlamaBlockCollection`]
      tolerance must be higher than expected, kinda disconcerting
      
      * Adds `FlaxLlamaModule`
      equivalent PyTorch model is `LlamaModel`
      yay! a language model 🤗
      
      * adds `FlaxLlamaForCausalLMModule`
      equivalent to `LlamaForCausalLM`
      still missing returning dict or tuple, will add later
      
      * start porting pretrained wrappers
      realised it probably needs return dict as a prereq
      
      * cleanup, quality, style
      
      * readds `return_dict` and model output named tuples
      
      * (tentatively) pretrained wrappers work 🔥
      
      * fixes numerical mismatch in `FlaxLlamaRMSNorm`
      seems `jax.lax.rsqrt` does not match `torch.sqrt`.
      manually computing `1 / jax.numpy.sqrt` results in matching values (see the sketch after this entry).
      
      * [WIP] debugging numerics
      
      * numerical match
      I think issue was accidental change of backend. forcing CPU fixes test.
      We expect some mismatch on GPU.
      
      * adds in model and integration tests for Flax Llama
      summary of failing:
      - mul invalid combination of dimensions
      - one numerical mismatch
      - bf16 conversion (maybe my local backend issue)
      - params are not FrozenDict
      
      * adds missing TYPE_CHECKING import and `make fixup`
      
      * adds back missing docstrings
      needs review on quality of docstrings, not sure what is required.
      Furthermore, need to check if `CHECKPOINT_FOR_DOC` is valid. See TODO
      
      * commenting out equivalence test as can just use common
      
      * debugging
      
      * Fixes bug where mask and pos_ids were swapped in pretrained models
      This results in all tests passing now 🔥
      
      
      
      * cleanup of modeling file
      
      * cleanup of test file
      
      * Resolving simpler review comments
      
      * addresses more minor review comments
      
      * fixing introduced pytest errors from review
      
      * wip additional slow tests
      
      * wip tests
      need to grab a GPU machine to get real logits for comparison
      otherwise, slow tests should be okay
      
      * `make quality`, `make style`
      
      * adds slow integration tests
      - checking logits
      - checking hidden states
      - checking generation outputs
      
      * `make fix-copies`
      
      * fix mangled function following `make fix-copies`
      
      * adds missing type checking imports
      
      * fixes missing parameter checkpoint warning
      
      * more finegrained 'Copied from' tags
      avoids issue of overwriting `LLAMA_INPUTS_DOCSTRING`
      
      * swaps import guards
      ??? how did these get swapped initially?
      
      * removing `inv_freq` again as pytorch version has now removed
      
      * attempting to get CI to pass
      
      * adds doc entries for llama flax models
      
      * fixes typo in __init__.py imports
      
      * adds back special equivalence tests
      these come from the gpt neo flax tests. there is special behaviour for these models that needs to override the common version
      
      * overrides tests with dummy to see if CI passes
      need to fill in these tests later
      
      * adds my contribution to docs
      
      * `make style; make quality`
      
      * replaces random masking with fixed to work with flax version
      
      * `make quality; make style`
      
      * Update src/transformers/models/llama/modeling_flax_llama.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/llama/modeling_flax_llama.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/llama/modeling_flax_llama.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/llama/modeling_flax_llama.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/llama/modeling_flax_llama.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/llama/modeling_flax_llama.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * updates `x`->`tensor` in `rotate_half`
      
      * addresses smaller review comments
      
      * Update docs/source/en/model_doc/llama.md
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * adds integration test class
      
      * adds `dtype` to rotary embedding to cast outputs
      
      * adds type to flax llama rotary layer
      
      * `make style`
      
      * `make fix-copies`
      
      * Apply suggestions from code review
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * applies suggestions from review
      
      * Update modeling_flax_llama.py
      
      * `make fix-copies`
      
      * Update tests/models/llama/test_modeling_llama.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/llama/modeling_flax_llama.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * fixes shape mismatch in FlaxLlamaMLP
      
      * applies some suggestions from reviews
      
      * casts attn output logits to f32 regardless of dtype
      
      * adds attn bias using `LlamaConfig.attention_bias`
      
      * adds Copied From comments to Flax Llama test
      
      * mistral and persimmon test change - copy from llama
      
      * updates docs index
      
      * removes Copied from in tests
      
      it was preventing `make fix-copies` from succeeding
      
      * quality and style
      
      * ignores FlaxLlama input docstring
      
      * adds revision to `_CHECKPOINT_FOR_DOC`
      
      * repo consistency and quality
      
      * removes unused import
      
      * removes copied from from Phi test
      
      now diverges from llama tests following FlaxLlama changes
      
      * adds `_REAL_CHECKPOINT_FOR_DOC`
      
      * removes refs from pr tests
      
      * reformat to make ruff happy
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      75336c17
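      The "fixes numerical mismatch in `FlaxLlamaRMSNorm`" bullet is worth a sketch: the author found `jax.lax.rsqrt` drifted from the PyTorch reference, while an explicit `1 / jnp.sqrt` matched. A minimal RMSNorm written that way (an illustration, not the PR's exact code):

      ```python
      import jax.numpy as jnp

      def rms_norm(x, weight, eps=1e-6):
          # Accumulate the variance in float32, and use an explicit 1/sqrt
          # rather than jax.lax.rsqrt, per the numerics bullet above.
          dtype = x.dtype
          hidden = x.astype(jnp.float32)
          variance = jnp.mean(jnp.square(hidden), axis=-1, keepdims=True)
          hidden = hidden * (1.0 / jnp.sqrt(variance + eps))
          return (weight * hidden).astype(dtype)
      ```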
  23. 27 Nov, 2023 1 commit
  24. 24 Nov, 2023 1 commit
  25. 21 Nov, 2023 2 commits
  26. 16 Nov, 2023 1 commit
    • [`Styling`] stylify using ruff (#27144) · 651408a0
      Arthur authored
      
      
      * try to stylify using ruff
      
      * might need to remove these changes?
      
      * use ruff format and ruff check
      
      * use isinstance instead of type comparison
      
      * use # fmt: skip (see the toy example after this entry)
      
      * use # fmt: skip
      
      * nits
      
      * some styling changes
      
      * update ci job
      
      * nits isinstance
      
      * more files update
      
      * nits
      
      * more nits
      
      * small nits
      
      * check and format
      
      * revert wrong changes
      
      * actually use formatter instead of checker
      
      * nits
      
      * well docbuilder is overwriting this commit
      
      * revert notebook changes
      
      * try to nuke docbuilder
      
      * style
      
      * fix feature extraction test
      
      * remove `indent-width = 4`
      
      * fixup
      
      * more nits
      
      * update the ruff version that we use
      
      * style
      
      * nuke docbuilder styling
      
      * leave the print for detected changes
      
      * nits
      
      * Remove file I/O
      Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
      
      * style
      
      * nits
      
      * revert notebook changes
      
      * Add # fmt skip when possible
      
      * Add # fmt skip when possible
      
      * Fix
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * NIts
      
      * more fixes
      
      * fix tapas
      
      * Another way to skip
      
      * Recommended way
      
      * Fix two more files
      
      * Remove asynch
      
      ---------
      Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
      651408a0
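      For readers unfamiliar with the `# fmt: skip` pragma this commit leans on: it marks a statement the formatter must leave alone. A toy example, assuming ruff-format/black semantics:

      ```python
      # Without the pragma, the formatter would reflow this deliberately aligned literal.
      identity = [1, 0, 0,
                  0, 1, 0,
                  0, 0, 1]  # fmt: skip
      ```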
  27. 09 Nov, 2023 1 commit
  28. 31 Oct, 2023 1 commit
    • device agnostic models testing (#27146) · 50378cbf
      Hz, Ji authored
      * device agnostic models testing
      
      * add decorator `require_torch_fp16` (usage sketch after this entry)
      
      * make style
      
      * apply review suggestion
      
      * Oops, the fp16 decorator was misused
      50378cbf
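      A hedged usage sketch of the `require_torch_fp16` decorator added here; it skips a test when the current accelerator cannot run fp16:

      ```python
      import unittest

      from transformers.testing_utils import require_torch_fp16, torch_device

      class Fp16Test(unittest.TestCase):
          @require_torch_fp16  # skipped on backends without fp16 support
          def test_fp16_smoke(self):
              # A real test would run a model in half precision on torch_device;
              # elided here to keep the sketch self-contained.
              self.assertIsNotNone(torch_device)
      ```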
  29. 16 Oct, 2023 1 commit
  30. 12 Oct, 2023 1 commit
  31. 11 Oct, 2023 1 commit
  32. 06 Oct, 2023 1 commit
  33. 03 Oct, 2023 1 commit
  34. 27 Sep, 2023 1 commit
  35. 12 Sep, 2023 1 commit
  36. 25 Aug, 2023 1 commit
    • [`CodeLlama`] Add support for `CodeLlama` (#25740) · 015f8e11
      Arthur authored
      
      
      * add all
      
      * Revert "Delete .github directory"
      
      This reverts commit 9b0ff7b052e2b20b629a26fb13606b78a42944d1.
      
      * make conversion script backward compatible
      
      * fixup
      
      * more styling
      
      * copy to llama changes
      
      * fix repo consistency
      
      * nits
      
      * document correct classes
      
      * updates
      
      * more fixes
      
      * nits
      
      * update auto mappings
      
      * add readmes
      
      * small updates
      
      * llama-code replace with llama_code
      
      * make fixup
      
      * updates to the testing suite
      
      * fix fast nits
      
      * more small fixes
      
      * fix decode
      
      * fix template processing
      
      * properly reset the normalizer
      
      * nits processor
      
      * tokenization tests pass
      
      * styling
      
      * last tests
      
      * additional nits
      
      * one test is left
      
      * nits
      
      Co-authored-by: faabian <faabian@users.noreply.github.com>
      
      * update failing test
      
      * fixup
      
      * remove decode infilling; users should handle it on their own after generation, padding can be a problem (see the sketch after this entry)
      
      * update
      
      * make test slow and more meaningful
      
      * fixup
      
      * doc update
      
      * fixup
      
      * Apply suggestions from code review
      
      * add kwargs doc
      
      * tokenizer requires `requires_backend`
      
      * type requires_backends
      
      * CodeLlama instead of LlamaCode
      
      * more name changes
      
      * nits
      
      * make doctests happy
      
      * small pipeline nits
      
      * last nit
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * update
      
      * add codellama to toctree
      
      ---------
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      015f8e11
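      The "remove decode infilling" bullet means callers now slice the generated ids themselves after fill-in-the-middle generation. A sketch using the tokenizer's `<FILL_ME>` convention (checkpoint name illustrative; the weights are large):

      ```python
      from transformers import AutoModelForCausalLM, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
      model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")

      # <FILL_ME> marks the span the model should infill.
      prompt = 'def remove_non_ascii(s: str) -> str:\n    """<FILL_ME>"""\n    return result'
      input_ids = tokenizer(prompt, return_tensors="pt").input_ids
      output = model.generate(input_ids, max_new_tokens=64)

      # Per the bullet above, decoding the infill is now the caller's job:
      filling = tokenizer.batch_decode(output[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
      ```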
  37. 02 Aug, 2023 1 commit
  38. 21 Jul, 2023 1 commit