1. 11 Jul, 2023 2 commits
    • [Patch-t5-tokenizer] Patches the changes on T5 to make sure previous behaviour... · b15343de
      Arthur authored
      
      [Patch-t5-tokenizer] Patches the changes on T5 to make sure previous behaviour is still valid for beginning of words (#24622)
      
      * patch `_tokenize` function
      
      * more tests
      
      * properly fix
      
      * fixup
      
      * Update src/transformers/models/t5/tokenization_t5.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * fix without ifs
      
      * update
      
      * protect import
      
      * add python processing
      
      * is_first needed
      
      * add doc and update with legacy
      
      * update
      
      * fix T5 SPM converter
      
      * styling
      
      * fix T5 warning
      
      * add is_seqio_available
      
      * remove is_first
      
      * revert some changes
      
      * more tests and update
      
      * update llama test battery
      
      * fixup
      
      * refactor T5 spm common tests
      
      * draft the llama tests
      
      * update
      
      * update test
      
      * nits
      
      * refine
      
      * name nit
      
      * fix t5 tests
      
      * fix T5
      
      * update
      
      * revert convert slow to fast changes that fail lots of tests
      
      * legacy support
      
      * fixup
      
      * nits: is_first not defined
      
      * don't use legacy behaviour for switch transformers
      
      * style
      
      * My attempt to check.
      
      * nits
      
      * fixes
      
      * update
      
      * fixup
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates
      
      * fixup
      
      * add legacy warning
      
      * fixup
      
      * warning_once nit
      
      * update t5 documentation test
      
      * update llama tok documentation
      
      * add space to warning
      
      * nits
      
      * nit
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * last nits
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
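      For context on the commit above: it adds a `legacy` switch (plus a warning) to the slow
      sentencepiece T5/Llama tokenizers so that the previous handling of pieces at the beginning
      of words, e.g. right after a special token, can still be reproduced. A minimal sketch of
      exercising the flag; the checkpoint name and example string are illustrative, and the exact
      pieces printed depend on the vocabulary:
      
      from transformers import T5Tokenizer  # requires the sentencepiece backend
      
      text = "Hello <extra_id_0> world"
      
      # Default keeps the historical behaviour and logs a warning pointing at `legacy`.
      legacy_tok = T5Tokenizer.from_pretrained("t5-small")
      print(legacy_tok.tokenize(text))
      
      # Opting out changes how the first word after a special token is split.
      fixed_tok = T5Tokenizer.from_pretrained("t5-small", legacy=False)
      print(fixed_tok.tokenize(text))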
    • Falcon port (#24523) · b3ab3fac
      Matt authored
      
      
      * Initial commit
      
      * Update src/transformers/models/falcon/configuration_falcon.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/falcon/configuration_falcon.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Cleanup config docstring
      
      * Update src/transformers/models/falcon/configuration_falcon.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Convert to relative imports
      
      * Remove torch < 1.8 warning
      
      * Restructure cos_sin header
      
      * qkv -> query, key, value
      
      * Refactor attention calculation
      
      * Add a couple of config variables to account for the different checkpoints
      
      * Successful merging of the code paths!
      
      * Fix misplaced line in the non-parallel attention path
      
      * Update config and tests
      
      * Add a pad_token_id when testing
      
      * Support output_attentions when alibi is None
      
      * make fixup
      
      * Skip KV cache shape test
      
      * No more _keys_to_ignore_on_load_missing
      
      * Simplify self attention a bit
      
      * Simplify self attention a bit
      
      * make fixup
      
      * stash commit
      
      * Some more attention mask updates
      
      * Should pass all tests except assisted generation!
      
      * Add big model generation test
      
      * make fixup
      
      * Add temporary workaround for test
      
      * Test overrides for assisted generation
      
      * Update src/transformers/models/falcon/modeling_falcon.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/falcon/modeling_falcon.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/falcon/modeling_falcon.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update tests/models/falcon/test_modeling_falcon.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Test overrides for assisted generation
      
      * Add generation demo
      
      * Update copyright
      
      * Make the docstring model actually small
      
      * Add module-level docstring
      
      * Remove all assertions
      
      * Add copied from bloom
      
      * Reformat the QKV layer
      
      * Add copied from bloom
      
      * Update src/transformers/models/falcon/modeling_falcon.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Remove unused line and reformat
      
      * No single letter variables
      
      * Cleanup return names
      
      * Add copied from line
      
      * Remove the deprecated arguments blocks
      
      * Change the embeddings test to an alibi on/off test
      
      * Remove position_ids from FalconForQA
      
      * Remove old check for token type IDs
      
      * Fix the alibi path when multi_query is False
      
      * Update src/transformers/models/falcon/modeling_falcon.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/falcon/modeling_falcon.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update tests/models/falcon/test_modeling_falcon.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update config naming
      
      * Fix typo for new_decoder_architecture
      
      * Add some comments
      
      * Fix docstring
      
      * Fix docstring
      
      * Create range in the right dtype from the start
      
      * Review comment cleanup
      
      * n_head_kv -> num_kv_heads
      
      * self.alibi -> self.use_alibi
      
      * self.num_kv -> self.num_kv_heads
      
      * Reorder config args
      
      * Made alibi arguments Optional
      
      * Add all model docstrings
      
      * Add extra checkpoints
      
      * Add author info for Falcon
      
      * Stop removing token_type_ids because our checkpoints shouldn't return it anymore
      
      * Add one hopeful comment for the future
      
      * Fix typo
      
      * Update tests, fix cache issue for generation
      
      * Use -1e9 instead of -inf to avoid float overflow
      
      * Recompute the rotary embeddings much less often
      
      * Re-enable disabled tests
      
      * One final fix to attention mask calculation, and update tests
      
      * Cleanup targeting falcon-40b equivalency
      
      * Post-rebase docs update
      
      * Update docstrings, especially in the config
      
      * More descriptive variable names, and comments where we can't rename them
      
      ---------
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
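      To illustrate what the port above enables, a minimal generation sketch using the in-library
      Falcon support through the Auto classes; the checkpoint name, dtype and device settings are
      assumptions for the example, not taken from the commit:
      
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer
      
      model_id = "tiiuae/falcon-7b"  # illustrative checkpoint
      
      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(
          model_id,
          torch_dtype=torch.bfloat16,  # keep memory manageable
          device_map="auto",           # requires accelerate
      )
      
      inputs = tokenizer("Falcon differs from BLOOM in that", return_tensors="pt").to(model.device)
      outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))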
  2. 10 Jul, 2023 4 commits
  3. 07 Jul, 2023 4 commits
  4. 06 Jul, 2023 5 commits
    • Fix integration with Accelerate and failing test (#24691) · fded6f41
      Zach Mueller authored
      Fix integration
    • DeepSpeed/FSDP ckpt saving utils fixes and FSDP training args fixes (#24591) · 66a37842
      Sourab Mangrulkar authored
      * update ds and fsdp ckpt logic
      
      * refactoring
      
      * fix 🐛
      
      * resolve comment
      
      * fix issue with overriding of the fsdp config set by accelerate
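      For reference, the FSDP code path these fixes touch is normally driven through
      `TrainingArguments`; a minimal sketch, assuming the script is launched with
      `accelerate launch` or `torchrun` so an FSDP plugin is actually created (all values
      are illustrative, not taken from the commit):
      
      from transformers import TrainingArguments
      
      training_args = TrainingArguments(
          output_dir="out",
          per_device_train_batch_size=1,
          save_strategy="steps",        # checkpoint saving is the code path fixed above
          save_steps=500,
          fsdp="full_shard auto_wrap",  # shard params/grads/optimizer state, auto-wrap layers
      )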
    • Add dropouts to GPT-NeoX (#24680) · 39274045
      Zhao Tianyu authored
      * add attention dropout, post attention dropout, post mlp dropout to gpt-neox
      
      * fix typo
      
      * add documentation
      
      * fix too long line
      
      * ran the style and repo-consistency scripts; output pasted below:
      Checking/fixing src/transformers/models/gpt_neox/configuration_gpt_neox.py src/transformers/models/gpt_neox/modeling_gpt_neox.py
      python utils/custom_init_isort.py
      python utils/sort_auto_mappings.py
      doc-builder style src/transformers docs/source --max_len 119 --path_to_docs docs/source
      python utils/check_doc_toc.py --fix_and_overwrite
      running deps_table_update
      updating src/transformers/dependency_versions_table.py
      python utils/check_copies.py
      python utils/check_table.py
      python utils/check_dummies.py
      python utils/check_repo.py
      Checking all models are included.
      Checking all models are public.
      Checking all models are properly tested.
      Checking all objects are properly documented.
      Checking all models are in at least one auto class.
      Checking all names in auto name mappings are defined.
      Checking all keys in auto name mappings are defined in `CONFIG_MAPPING_NAMES`.
      Checking all auto mappings could be imported.
      Checking all objects are equally (across frameworks) in the main __init__.
      python utils/check_inits.py
      python utils/check_config_docstrings.py
      python utils/check_config_attributes.py
      python utils/check_doctest_list.py
      python utils/update_metadata.py --check-only
      python utils/check_task_guides.py
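      A small sketch of the dropout knobs added by the commit above; the parameter names
      `attention_dropout` and `hidden_dropout` are assumed from the commit summary (attention
      dropout plus the post-attention / post-MLP dropouts), and the tiny config sizes are only
      for the example:
      
      from transformers import GPTNeoXConfig, GPTNeoXForCausalLM
      
      config = GPTNeoXConfig(
          hidden_size=256,
          num_hidden_layers=4,
          num_attention_heads=4,
          intermediate_size=1024,
          attention_dropout=0.1,  # dropout on the attention probabilities
          hidden_dropout=0.1,     # dropout after the attention output and after the MLP
      )
      model = GPTNeoXForCausalLM(config)
      model.train()  # dropouts are only active in training mode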
    • LlamaTokenizer should be picklable (#24681) · fb3b22c3
      Yuchao Dai authored
      * LlamaTokenizer should be picklable
      
      * make fixup
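      The fix above lets the slow Llama tokenizer survive a pickle round trip, which
      multiprocessing data loaders rely on. A minimal sketch; the checkpoint name is illustrative:
      
      import pickle
      
      from transformers import LlamaTokenizer
      
      tok = LlamaTokenizer.from_pretrained("huggyllama/llama-7b")  # any sentencepiece Llama checkpoint
      restored = pickle.loads(pickle.dumps(tok))
      assert restored.tokenize("Hello world") == tok.tokenize("Hello world")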
  5. 05 Jul, 2023 6 commits
  6. 04 Jul, 2023 7 commits
  7. 03 Jul, 2023 7 commits
  8. 01 Jul, 2023 1 commit
  9. 30 Jun, 2023 4 commits