Commits · e971486d891a7a580ba3f84d8a4b525c0c51850a · chenpangpang / transformers

30 Oct, 2023 15 commits

Fix: typos in README.md (#27154) · e971486d
MD FAIZAN KHAN authored Oct 31, 2023


[`core`/ `GC` / `tests`] Stronger GC tests (#27124) · f7ea959b

Younes Belkada authored Oct 30, 2023



* stronger GC tests

* better tests and skip failing tests

* break down into 3 sub-tests

* break down into 3 sub-tests

* refactor a bit

* more refactor

* fix

* last nit

* credits contrib and suggestions

* credits contrib and suggestions

---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Device agnostic trainer testing (#27131) · 5bbf6712
Hz, Ji authored Oct 31, 2023


Translating `en/main_classes` folder docs to Japanese 🇯🇵 (#26894) · 84724efd

Rockerz authored Oct 30, 2023



* add

* add

* add

* Add deepspeed.md

* Add

* add

* Update docs/source/ja/main_classes/callback.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/output.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/pipelines.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/text_generation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update logging.md

* Update toctree.yml

* Update docs/source/ja/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Add suggestions

* m

* Update docs/source/ja/main_classes/trainer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update toctree.yml

* Update Quantization.md

* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update toctree.yml

* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


🌐 [i18n-ZH] Translate serialization.md into Chinese (#27076) · 9093b19b
Yeyang authored Oct 30, 2023
```
* docs(zh): translate serialization.md

* docs(zh): add space around links
```

Remove some Kosmos-2 `copied from` (#27149) · 3224c0c1

Yih-Dar authored Oct 30, 2023



* fix

* fix

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


make tests of pytorch_example device agnostic (#27081) · cd19b193
Hz, Ji authored Oct 30, 2023

[`tests` / `Quantization`] Fix bnb test (#27145) · 6b466771
Younes Belkada authored Oct 30, 2023
```
* fix bnb test

* link to GH issue
```

Fix some tests using `"common_voice"` (#27147) · 57699496

Yih-Dar authored Oct 30, 2023



* Use mozilla-foundation/common_voice_11_0

* Update expected values

* Update expected values

* For test_word_time_stamp_integration

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Add `Kosmos-2` model (#24709) · 691fd8fd

Yih-Dar authored Oct 30, 2023



* Add KOSMOS-2 model

* update

* update

* update

* address review comment - 001

* address review comment - 002

* address review comment - 003

* style

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix

* address review comment - 004

* address review comment - 005

* address review comment - 006

* address review comment - 007

* address review comment - 008

* address review comment - 009

* address review comment - 010

* address review comment - 011

* update readme

* fix

* fix

* fix

* [skip ci] fix

* revert the change in _decode

* fix docstring

* fix docstring

* Update docs/source/en/model_doc/kosmos-2.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* no more Kosmos2Tokenizer

* style

* remove "returned when being computed by the model"

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* UTM5 Atten

* fix attn mask

* use present_key_value_states instead of next_decoder_cache

* style

* conversion scripts

* conversion scripts

* conversion scripts

* Add _reorder_cache

* fix doctest and copies

* rename 1

* rename 2

* rename 3

* make fixup

* fix table

* fix docstring

* rename 4

* change repo_id

* remove tip

* update md file

* make style

* update md file

* put docs/source/en/model_doc/kosmos-2.md to slow

* update conversion script

* Use CLIPImageProcessor in Kosmos2Processor

* Remove Kosmos2ImageProcessor

* Remove to_dict in Kosmos2Config

* Remove files

* fix import

* Update conversion

* normalized=False

* Not using hardcoded values like <image>

* elt --> element

* Apply suggestion

* Not using hardcoded values like </image>

* No assert

* No nested functions

* Fix md file

* copy

* update doc

* fix docstring

* fix name

* Remove _add_remove_spaces_around_tag_tokens

* Remove dummy docstring of _preprocess_single_example

* Use `BatchEncoding`

* temp

* temp

* temp

* Update

* Update

* Make Kosmos2ProcessorTest a bit pretty

* Update gradient checkpointing

* Fix gradient checkpointing test

* Remove one liner remove_special_fields

* Simplify conversion script

* fix add_eos_token

* update readme

* update tests

* Change to microsoft/kosmos-2-patch14-224

* style

* Fix doc

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


remove the obsolete code related to fairscale FSDP (#26651) · d751dbec
Hz, Ji authored Oct 30, 2023
```
* remove the obsolete code related to fairscale FSDP

* apply review suggestion
```
[`Trainer` / `GC`] Add `gradient_checkpointing_kwargs` in trainer and training arguments (#27068) · 5fbed2d7
Younes Belkada authored Oct 30, 2023
```
* add `gradient_checkpointing_kwargs` in trainer and training arguments

* add comment

* add test - currently failing

* now tests pass
```
Fix data2vec-audio note about attention mask (#27116) · e830495c
Thien Tran authored Oct 30, 2023
```
fix data2vec audio note about attention mask
```
[`FA2`/ `Mistral`] Revert previous behavior with right padding + forward (#27125) · 16043211
Younes Belkada authored Oct 30, 2023
```
Update modeling_mistral.py
```

Fix slack report failing for doctest (#27042) · 211ad4c9

Yih-Dar authored Oct 30, 2023



* fix slack report for doctest

* separate reports

* style

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


29 Oct, 2023 1 commit
[Typo fix] flag config in WANDB (#27130) · 722e9364
Gema Parreño authored Oct 29, 2023
```
typo fix flag config
```
27 Oct, 2023 11 commits

Fix docstring and type hint for resize (#27104) · 9e87618f
Daniil authored Oct 27, 2023
```
fix docstring and type hint for resize
```
translate transformers_agents.md to Chinese (#27046) · ef23b68e
jiaqiw09 authored Oct 27, 2023
```
* update translation

* fix problems mentioned in reviews
```

Added Telugu [te] translation for README.md in main (#27077) · 96f9e78f

Akhil authored Oct 28, 2023



* Create index.md

* Create _toctree.yml

* Updated index.md in telugu

* Update _toctree.yml

* Create quicktour.md

* Update quicktour.md

* Create index.md

* Update quicktour.md

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Delete docs/source/hi/index.md

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update build_documentation.yml

Added telugu [te]

* Update build_pr_documentation.yml

Added Telugu [te]

* Update _toctree.yml

* Create README_te.md

Telugu translation for README.md

* Update README_te.md

Added Telugu translation for Readme.md

* Update README_te.md

* Update README_te.md

* Update README_te.md

* Update README_te.md

* Update README.md

* Update README_es.md

* Update README_es.md

* Update README_hd.md

* Update README_ja.md

* Update README_ko.md

* Update README_pt-br.md

* Update README_ru.md

* Update README_zh-hans.md

* Update README_zh-hant.md

* Update README_te.md

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


[Attention Mask] Refactor all encoder-decoder attention mask (#27086) · ac589375

Patrick von Platen authored Oct 27, 2023



* [FA2 Bart] Add FA2 to all Bart-like

* better

* Refactor attention mask

* remove all customized attention logic

* format

* mass rename

* replace _expand_mask

* replace _expand_mask

* mass rename

* add pt files

* mass replace & rename

* mass replace & rename

* mass replace & rename

* mass replace & rename

* Update src/transformers/models/idefics/modeling_idefics.py

* fix more

* clean more

* fix more

* make style

* fix again

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* Apply suggestions from code review

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* small fix mistral

* finish

* finish

* finish

* finish

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


fix detr device map (#27089) · 29c74f58
Marc Sun authored Oct 27, 2023
```
* fix detr device map

* add comments
```

[`core`/ `gradient_checkpointing`] Refactor GC - part 2 (#27073) · ffff9e70

Younes Belkada authored Oct 27, 2023



* fix

* more fixes

* fix other models

* fix long t5

* use `gradient_checkpointing_func` instead

* fix copies

* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* replace it with `is_gradient_checkpointing_set`

* remove default

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Fix no split modules underlying modules (#27090) · 5be1fb6d

Marc Sun authored Oct 27, 2023



* fix no split

* style

* remove comm

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* rename modules

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Provide alternative when warning on use_auth_token (#27105) · 66b088fa
Lucain authored Oct 27, 2023


Add early stopping for Bark generation via logits processor (#26675) · e2bffcfa

Isaac Chung authored Oct 27, 2023

* add early stopping logits processor

* black formatted

* indent

* follow method signature

* actual logic

* check for None

* address comments on docstrings and method signature

* add unit test under `LogitsProcessorTest` wip

* unit test passing

* black formatted

* condition per sample

* add to BarkModelIntegrationTests

* wip BarkSemanticModelTest

* rename and add to kwargs handling

* not add to BarkSemanticModelTest

* correct logic and assert last outputs tokens different in test

* doc-builder style

* read from kwargs as well

* assert len of with less than that of without

* ruff

* add back seed and test case

* add original impl default suggestion

* doc-builder

* rename and use softmax

* switch back to LogitsProcessor and update docs wording

* camelCase and spelling and saving compute

* assert strictly less than

* assert less than

* expand test_generate_semantic_early_stop instead


Revert "add exllamav2 arg" (#27102) · 90ee9cea
Arthur authored Oct 27, 2023
```
Revert "add exllamav2 arg (#26437)"

This reverts commit 8214d6e7.
```
[`T5Tokenizer`] Fix fast and extra tokens (#27085) · aa4198a2
Arthur authored Oct 27, 2023
```
* v4.35.dev.0

* nit t5fast match t5 slow
```

26 Oct, 2023 13 commits

Added huggingface emoji instead of the markdown format (#27091) · 6f316016
Varshaa Shetty authored Oct 27, 2023
```
Added huggingface emoji instead of the markdown format as it was not displaying the required emoji in that format
```

Save TB logs as part of push_to_hub (#27022) · 34a64064

Zach Mueller authored Oct 26, 2023

* Support runs/

* Upload runs folder as part of push to hub

* Add a test

* Add to test deps

* Update with proposed solution from Slack

* Ensure that repo gets deleted in tests


Correct docstrings and a typo in comments (#27047) · 18925925

L. Yeung authored Oct 26, 2023



* docs(training_args): correct docstrings

Correct docstrings of these methods in `TrainingArguments`:

- `set_save`
- `set_logging`

* docs(training_args): adjust words in docstrings
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(trainer): correct a typo in comments

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


add exllamav2 arg (#26437) · 8214d6e7

Marc Sun authored Oct 26, 2023

* add exllamav2 arg

* add test

* style

* add check

* add doc

* replace by use_exllama_v2

* fix tests

* fix doc

* style

* better condition

* fix logic

* add deprecate msg


[Llama FA2] Re-add _expand_attention_mask and clean a couple things (#27074) · d7cb5e13

Patrick von Platen authored Oct 26, 2023

* clean

* clean llama

* fix more

* make style

* Apply suggestions from code review

* Apply suggestions from code review

* Update src/transformers/models/llama/modeling_llama.py

* Update src/transformers/models/llama/modeling_llama.py

* Apply suggestions from code review

* finish

* make style


Add support for commit description (#26704) · 4864d08d
Arthur authored Oct 26, 2023
```
* fix

* update

* revert

* add docstring

* good to go

* update

* add a test
```
Create SECURITY.md · 15cd0962
Arthur authored Oct 26, 2023

Remove unneeded prints in modeling_gpt_neox.py (#27080) · fe2877ce
Younes Belkada authored Oct 26, 2023

Bump `flash_attn` version to `2.1` (#27079) · efba1a17
Younes Belkada authored Oct 26, 2023
```
* pin FA-2 to `2.1`

* fix on modeling
```

Bring back `set_epoch` for Accelerate-based dataloaders (#26850) · 90412401

Zach Mueller authored Oct 26, 2023



* Working tests!

* Fix sampler

* Fix

* Update src/transformers/trainer.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix check

* Clean

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/lxmert (#26888) · 3c269240

dependabot[bot] authored Oct 26, 2023

Bump urllib3 in /examples/research_projects/lxmert

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Bump werkzeug from 2.2.3 to 3.0.1 in /examples/research_projects/decision_transformer (#27072) · 9c5240af

dependabot[bot] authored Oct 26, 2023

Bump werkzeug in /examples/research_projects/decision_transformer

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.2.3 to 3.0.1.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/2.2.3...3.0.1)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Handle unsharded Llama2 model types in conversion script (#27069) · df2eebf1
corey hu authored Oct 25, 2023
```
Handle all unsharded model types
```