Commits · 005b957fb851607115800f50e594b77662a771ba · chenpangpang / transformers

18 Apr, 2024 9 commits

Abhi Venigalla authored Apr 18, 2024



* wip

* fix __init__.py

* add docs

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments 1

* work on make fixup

* pass configs down

* add sdpa attention

* remove DbrxBlock

* add to configuration_auto

* docstring now passes formatting test

* fix style

* update READMEs

* add dbrx to modeling_auto

* make fix-copies generated this

* add DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP

* config docstring passes formatting test

* rename moe_loss_weight to router_aux_loss_coef

* add to flash-attn documentation

* fix model-path in tests

* Explicitly make `"suli"` the default `ffn_act_fn`
Co-authored-by: Wing Lian <wing.lian@gmail.com>

* default to using router_aux_loss_coef over ffn_config[moe_loss_weight]

* fix _flash_attn_uses_top_left_mask and is_causal

* fix tests path

* don't use token type IDs

* follow Llama and remove token_type_ids from test

* init ConfigTester differently so tests pass

* remove multiple choice test

* remove question + answer test

* remove sequence classification test

* remove token classification test

* copy Llama tests and remove token_type_ids from test inputs

* do not test pruning or headmasking; style code

* add _tied_weights_keys parameter to pass test

* add type hints

* fix type check

* update config tester

* remove masked_lm test

* remove encoder tests

* initialize DbrxModelTester with correct params

* style

* torch_dtype does not rely on torch

* run make fixup, fix-copies

* use https://huggingface.co/v2ray/dbrx-base-fixed/blob/main/modeling_dbrx.py



* add copyright info

* fix imports and DbrxRotaryEmbedding

* update DbrxModel docstring

* use copies

* change model path in docstring

* use config in DbrxFFN

* fix flashattention2, sdpaattention

* input config to DbrXAttention, DbrxNormAttentionNorm

* more fixes

* fix

* fix again!

* add informative comment

* fix ruff?

* remove print statement + style

* change doc-test

* fix doc-test

* fix docstring

* delete commented out text

* make defaults match dbrx-instruct

* replace `router_aux_loss_coef` with `moe_loss_weight`

* is_decoder=True

* remove is_decoder from configtester

* implement sdpa properly

* make is_decoder pass tests

* start on the GenerationTesterMixin tests

* add dbrx to sdpa documentation

* skip weight typing test

* style

* initialize smaller model
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Add DBRX to toctree

* skip test_new_cache_format

* make config defaults smaller again

* add pad_token_id

* remove pad_token_id from config

* Remove all references to DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP

* Update src/transformers/models/dbrx/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/dbrx.md
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/models/dbrx/configuration_dbrx.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/dbrx.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix typo

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update docs, fix configuration_auto.py

* address pr comments

* remove is_decoder flag

* slice

* fix requires grad

* remove grad

* disconnect differently

* remove grad

* enable grads

* patch

* detach expert

* nissan al ghaib

* Update modeling_dbrx.py

* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* replace "Gemma" with "Dbrx"

* remove # type: ignore

* don't hardcode vocab_size

* remove ToDo

* Re-add removed idefics2 line

* Update test to use tiny-random!

* Remove TODO

* Remove one more case of loading the entire dbrx-instruct in the tests

* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* address some comments

* small model

* add dbrx to tokenization_auto

* More docstrings with add_start_docstrings

* Dbrx for now

* add PipelineTesterMixin

* Update src/transformers/models/dbrx/configuration_dbrx.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove flash-attn2 import error

* fix docstring
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add useage example

* put on one line
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix ffn_act_fn
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change "dbrx" to "DBRX" for display purposes.

* fix __init__.py?

* fix __init__.py

* fix README

* return the aux_loss

* remove extra spaces

* fix configuration_auto.py

* fix format in tokenization_auto

* remove new line

* add more useage examples

---------
Co-authored-by: Abhi Venigalla <abhi.venigalla@databricks.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Eitan Turok <eitan.turok@databricks.com>
Co-authored-by: Eitan Turok <150733043+eitanturok@users.noreply.github.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Eitan Turok <eitanturok@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

005b957f

Do not drop mask with SDPA for more cases (#30311) · 63c5e27e
fxmarty authored Apr 18, 2024
```
* overlooked

* style

* cleaner
```
63c5e27e

Revert "Re-enable SDPA's FA2 path (#30070)" (#30314) · acab997b

Arthur authored Apr 18, 2024

* Revert "Re-enable SDPA's FA2 path (#30070)"

This reverts commit 05bdef16.

* Revert "Fix quality Olmo + SDPA (#30302)"

This reverts commit ec92f983.

acab997b

Fix RecurrentGemma device_map (#30273) · 7509a0ad
Marc Sun authored Apr 18, 2024
```
* Switch to non persistant buffer

* fix device mismatch issue due to cache

* style
```
7509a0ad
Add atol for sliding window test (#30303) · 9459efb8
fxmarty authored Apr 18, 2024
```
atol for sliding window test
```
9459efb8

Add jamba (#29943) · 3f20877d

tomeras91 authored Apr 18, 2024

* Add jamba arch

* apply "make fix-copies" changes

* fix link to model in JambaConfig docstring

* Add n_ctx in modeling file because repo-consistency wants that

* Add jamba to flash attention and sdpa documentation

* mamba dt_proj quant fix now works for LoRA as well

* override test_left_padding_compatibility and use a more permissive tolerance. left padding numerical difference are accentuated by mamba layers

* add jamba to tokenization auto

* fix comments of shape (PR #24 in the model page: https://huggingface.co/ai21labs/Jamba-v0.1/discussions/24)

* simple PR fixes

* remove unnecessary kwargs from JambaAttentionDecoderLayer and JambaMambaDecoderLayer

* remove the LoRA hack for the mamba dt_proj bias. It was solved in huggingface/peft#1530 (https://github.com/huggingface/peft/pull/1530)

* Add copied comment on JambaMLP (it's the same as MixtralMLP)

* remove padding_mask warnings. It's not supported anymore

* fix docstring. Float instead of int...

3f20877d

Fix all torch pipeline failures except one (#30290) · 28a22834
Yih-Dar authored Apr 18, 2024
```
* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
28a22834
Fix donut token2json multiline (#30300) · 7915a259
Pavel Iakubovskii authored Apr 18, 2024
```
* Fix multiline processing

* Update test for token2json
```
7915a259

Add Flash Attention 2 to M2M100 model (#30256) · b65df514

Alexander Visheratin authored Apr 18, 2024



* Added flash attention 2.

* Fixes.

* Fix inheritance.

* Fixed init.

* Remove stuff.

* Added documentation.

* Add FA2 to M2M100 documentation.

* Add test.

* Fixed documentation.

* Update src/transformers/models/m2m_100/modeling_m2m_100.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fixed variable name.

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

b65df514

17 Apr, 2024 14 commits

Fix quality Olmo + SDPA (#30302) · ec92f983
fxmarty authored Apr 17, 2024
```
fix olmo
```
ec92f983

Re-enable SDPA's FA2 path (#30070) · 05bdef16

fxmarty authored Apr 17, 2024



* tentatively re-enable FA2 + SDPA

* better comment

* _ignore_causal_mask_sdpa as staticmethod

* type hints

* use past_seen_tokens instead

* enable copied from for sdpa

* ruff

* llama simplifications on review

* remove unnecessary self.is_causal check

* fix copies

* cleaning

* precise message

* better doc

* add test

* simplify

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

05bdef16

Add OLMo model family (#29890) · e4ea19b9

Shane A authored Apr 17, 2024

* Add OLMo using add-new-model-like with Llama

* Fix incorrect tokenizer for OLMo

* Copy-paste relevant OLMo methods and their imports

* Add OLMo config

* Modify OLMo config to follow HF conventions

* Remove unneeded Llama code from OLMo model

* Add ability for OLMo model to output attentions

* Add OLMoPreTrainedModel and OLMoModel

* Add OLMoForCausalLM

* Minor fixes to OLMo model for style and missing functions

* Implement OLMo tokenizer

* Implement OLMo to HF conversion script

* Add tests for OLMo model

* Add tests for OLMo fast tokenizer

* Add auto-generated dummy objects

* Remove unimplemented OLMo classes from auto and init classes and re-format

* Add README and associated auto-generated files

* Use OLMo names for common properties

* Run make fixup

* Remove `|` from OLMo typing

* Remove unneeded tokenization_olmo.py

* Revert model, config and converter to add-new-model-like Llama

* Move logic for adding bos/eos token into GPTNeoxTokenizerFast

* Change OLMoConfig defaults to match OLMo-7B

* Use GPTNeoXToknizerFast in OLMo tokenizer tests

* Modify auto-generated OLMoModelTests to work for OLMo

* Add non-parametric layer norm OLMoLayerNorm

* Update weight conversion script for OLMo

* Fix __init__ and auto structure for OLMo

* Fix errors from make fixup

* Remove OLMoTokenizerFast from documentation

* Add missing 'Copied from' for OLMoModel._update_causal_mask

* Run make fix-copies

* Rearrange string replacements in OLMoForCausalLM Copied from

* Move OLMo and Llama CausalLM.forward example into global constants

* Fix OLMO_GENERATION_EXAMPLE doc string typo

* Add option for qkv clipping to OLMo

* Rearrange OLMoConfig kwargs in convert_olmo_weights_to_hf

* Add clip_qkv to OLMoConfig in convert_olmo_weights_to_hf

* Fix OLMo tokenization bug using conversion script

* Keep model in full precision after conversion

* Do not add eos token automatically

* Update references to OLMo model in HF Hub

* Do not add eos token during encoding by default

* Fix Llama generation example

* Run make fixup

* OLMo 7B integration test fix

* Remove unneeded special case for OLMoConfig

* OLMo 7B Twin 2T integration test fix

* Fix test_model_7b_greedy_generation

* Remove test_compile_static_cache

* Fix OLMo and Llama generation example

* Run make fixup

* Revert "OLMo 7B integration test fix"

This reverts commit 4df56a4b150681bfa559846f40e9b7b7f97d7908.

* Revert "OLMo 7B Twin 2T integration test fix"

This reverts commit 9ff65a4a294ace89ab047b793ca55e623a9ceefc.

* Ungate 7B integration tests and fix greedy generation test

* Add retries for flaky test_eager_matches_sdpa_generate

* Fix output of doc example for OLMoForCausalLM.forward

* Downsize OLMo doc test for OLMoForCausalLM.forward to 1B model

* Try fix incorrect characters in OLMoForCausalLM.forward doct test

* Try fix incorrect characters in OLMoForCausalLM.forward doc test using end quotes

* Remove pretraining_tp from OLMo config and model

* Add missing 'Copied from' instances

* Remove unneeded causal_mask from OLMoModel

* Revert Llama changes

* Ignore copy for OLMoForCausalLM.forward

* Change 'OLMo' to 'Olmo' in classes

* Move minimal OLMo tokenization tests to model tests

* Add missed 'Copied from' for repeat_kv

e4ea19b9

Upgrading to tokenizers 0.19.0 (#30289) · 8e5f76f5

Nicolas Patry authored Apr 17, 2024

* [DO NOT MERGE] Testing tokenizers 0.19.0rc0

* Accounting for the breaking change.

* Ruff.

* Upgrading to tokenizers `0.19` (new release with preprend_scheme fixed
and new surface for BPE tiktoken bug).

8e5f76f5

Add strategy to store results in evaluation loop (#30267) · c15aad09

Pavel Iakubovskii authored Apr 17, 2024

* Add evaluation loop container for interm. results

* Add tests for EvalLoopContainer

* Formatting

* Fix padding_index in test and typo

* Move EvalLoopContainer to pr_utils to avoid additional imports

* Fix `eval_do_concat_batches` arg description

* Fix EvalLoopContainer import

c15aad09

Add token type ids to CodeGenTokenizer (#29265) · 8d6b5096

st81 authored Apr 17, 2024

* Add create token type ids to CodeGenTokenizer

* Fix inconsistent length of token type ids

* Format source codes

* Fix inconsistent order of methods

* Update docstring

* add test_tokenizer_integration test

* Format source codes

* Add `copied from` comment to CodeGenTokenizerFast

* Add doc of create_token_type_ids_from_sequences

* Make return_token_type_ids False by default

* Make test_tokenizer_integration as slow test

* Add return_token_type_ids to tokenizer init arg

* Add test for tokenizer's init return_token_type_ids

* Format source codes

8d6b5096

FIX: Fix push important models CI (#30291) · 812a5de2
Younes Belkada authored Apr 17, 2024
```
Update push-important-models.yml
```
812a5de2
Fix `Fatal Python error: Bus error` in `ZeroShotAudioClassificationPipelineTests` (#30283) · eb75516e
Yih-Dar authored Apr 17, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
eb75516e
Fix test `ExamplesTests::test_run_translation` (#30281) · 05dab4e5
Yih-Dar authored Apr 17, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
05dab4e5
Enable fx tracing for Mistral (#30209) · 304c6a1e
Raushan Turganbay authored Apr 17, 2024
```
* tracing for mistral

* typo

* fix copies
```
304c6a1e

Configuring Translation Pipelines documents update #27753 (#29986) · 98717cb3

Utkarsha Gupte authored Apr 17, 2024

* Configuring Translation Pipelines documents update #27753

Configuring Translation Pipelines documents update

* Language Format Addition

* adding supported list of languages list

98717cb3

FIX / AWQ: Fix failing exllama test (#30288) · 080b7008
Younes Belkada authored Apr 17, 2024
```
fix filing exllama test
```
080b7008
Fix SpeechT5 forward docstrings (#30287) · 41145247
Yoach Lacombe authored Apr 17, 2024

41145247

Fix SDPA sliding window compatibility (#30127) · 40eb6d6c

fxmarty authored Apr 17, 2024



* fix sdpa + sliding window

* give credit
Co-authored-by: ehuaa <ehuamail@163.com>

* remove unnecessary warning

* fix typog

* add test

---------
Co-authored-by: ehuaa <ehuamail@163.com>

40eb6d6c

16 Apr, 2024 10 commits

Fix test fetcher (doctest) + `Idefics2`'s doc example (#30274) · 5fabebdb
Yih-Dar authored Apr 16, 2024
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
5fabebdb
fix: Fixed a `raise` statement (#30275) · 37b5946a
Sai-Suraj-27 authored Apr 16, 2024
```
* Fixed a raise statement.

* Minor changes.
```
37b5946a

BLIP - fix pt-tf equivalence test (#30258) · c63f1589

amyeroberts authored Apr 16, 2024

* BLIP - fix pt-tf equivalence test

* Update tests/models/blip/test_modeling_blip.py

* Update more model tests

c63f1589

Raise relevent err when wrong type is passed in as the accelerator_config (#29997) · e27d9308
Zach Mueller authored Apr 16, 2024
```
* Raise relevent err

* Use type instead
```
e27d9308

add `push_to_hub` to pipeline (#29172) · 0eaef0c7

Hafedh authored Apr 16, 2024



* add `push_to_hub` to pipeline

* fix docs

* format with ruff

* update save_pretrained

* update save_pretrained

* remove unnecessary comment

* switch to push_to_hub method in DynamicPipelineTester

* remove unused imports

* update docs for add_new_pipeline

* fix docs for add_new_pipeline

* add comment

* fix italien docs

* changes to token retrieval for pipelines

* Update src/transformers/pipelines/base.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

0eaef0c7

Workflow: Update tailscale to release version (#30268) · 60dea593
Younes Belkada authored Apr 16, 2024
```
Update tailscale to release version
```
60dea593

Allow for str versions of dicts based on typing (#30227) · 487505ff

Zach Mueller authored Apr 16, 2024

* Bookmark, initial impelemtation. Need to test

* Clean

* Working fully, woop woop

* I think working version now, testing

* Fin!

* rm cast, could keep None

* Fix typing issue

* rm typehint

* Add test

* Add tests and make more rigid

487505ff

FIX: Fix 8-bit serialization tests (#30051) · b86d0f4e

Younes Belkada authored Apr 16, 2024



* fix 8-bit serialization tests

* add more clarification

* Update src/transformers/quantizers/quantizer_bnb_8bit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

b86d0f4e

FIX: Fix corner-case issue with the important models workflow (#30212) · ddf5f258

Younes Belkada authored Apr 16, 2024

* Update push-important-models.yml

* dummy commit

* Update modeling_bark.py

* test

* test

* test

* another test

* another test

* test

* final test

* final test

* test

* another test

* test

* test

* another test

* test llama

* revert everything

* remove echo

ddf5f258

More fixes for doctest (#30265) · cbc2cc18

Yih-Dar authored Apr 16, 2024



* fix

* update

* update

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

cbc2cc18

15 Apr, 2024 7 commits

Update `ko/_toctree.yml` (#30062) · 51bcadc1

Jungnerd authored Apr 16, 2024



* fix: update `ko/_toctree.yml`

* fix: update ko/_toctree.yml

* Update docs/source/ko/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: delete `perf_infer_gpu_many`

* fix: Replace untranslated docs with `in_translation`
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: Replace untraslated docs with `in_translation`

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

51bcadc1

Remove incorrect arg in codellama doctest (#30257) · 5be21302
Matt authored Apr 15, 2024
```
Remove incorrect arg in codellama docstring
```
5be21302
[Docs] Update recurrent_gemma.md for some minor nits (#30238) · 8127f396
Sayak Paul authored Apr 15, 2024
```
Update recurrent_gemma.md
```
8127f396

Add Idefics2 (#30253) · 6b78360e

amyeroberts authored Apr 15, 2024

* Initial add model additions

* Test

* All weights loading

* Can perform full forward pass

* Local and remote the same

* Matching local and remote

* Fixup

* Idefics2Model importable; fixup docstrings

* Don't skip by default

* Remove deprecated use_resampler arg

* Remove self.config

* DecoupledLinear takes config

* Tidy up

* Enable eager attention and tidy up

* Most tests passing

* Update for batch of processed images

* Add image processor

* Update doc pages

* Update conversion script

* Remove erroneous breakpoint

* Remove accidendtal spelling change

* Update to reflect changes on hub - make generate work

* Fix up

* Image processor tests

* Update tests

* Add a processor

* Add a processor

* Update convert script

* Update modeling file - remove fixmes

* Bug fix

* Add processing test

* Use processor

* Fix up

* Update src/transformers/models/idefics2/modeling_idefics2.py

Co-authored-by: V...

6b78360e

[tests] add the missing `require_torch_multi_gpu` flag (#30250) · 667939a2
Fanli Lin authored Apr 15, 2024
```
add gpu flag
```
667939a2
update github actions packages' version to suppress warnings (#30249) · 440bd3c3
Yih-Dar authored Apr 15, 2024
```
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
440bd3c3
round epoch only in console (#30237) · 76681015
LZR authored Apr 15, 2024

76681015