Commits · ac5893756bafcd745d93a442cf36f984545dbad8 · chenpangpang / transformers

27 Oct, 2023 8 commits

[Attention Mask] Refactor all encoder-decoder attention mask (#27086) · ac589375

Patrick von Platen authored Oct 27, 2023



* [FA2 Bart] Add FA2 to all Bart-like

* better

* Refactor attention mask

* remove all customized attention logic

* format

* mass rename

* replace _expand_mask

* replace _expand_mask

* mass rename

* add pt files

* mass replace & rename

* mass replace & rename

* mass replace & rename

* mass replace & rename

* Update src/transformers/models/idefics/modeling_idefics.py

* fix more

* clean more

* fix more

* make style

* fix again

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* Apply suggestions from code review

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* small fix mistral

* finish

* finish

* finish

* finish

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

ac589375

fix detr device map (#27089) · 29c74f58
Marc Sun authored Oct 27, 2023
```
* fix detr device map

* add comments
```
29c74f58

[`core`/ `gradient_checkpointing`] Refactor GC - part 2 (#27073) · ffff9e70

Younes Belkada authored Oct 27, 2023



* fix

* more fixes

* fix other models

* fix long t5

* use `gradient_checkpointing_func` instead

* fix copies

* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* replace it with `is_gradient_checkpointing_set`

* remove default

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

ffff9e70

Fix no split modules underlying modules (#27090) · 5be1fb6d

Marc Sun authored Oct 27, 2023



* fix no split

* style

* remove comm

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* rename modules

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

5be1fb6d

Provide alternative when warning on use_auth_token (#27105) · 66b088fa
Lucain authored Oct 27, 2023

66b088fa

Add early stopping for Bark generation via logits processor (#26675) · e2bffcfa

Isaac Chung authored Oct 27, 2023

* add early stopping logits processor

* black formatted

* indent

* follow method signature

* actual logic

* check for None

* address comments on docstrings and method signature

* add unit test under `LogitsProcessorTest` wip

* unit test passing

* black formatted

* condition per sample

* add to BarkModelIntegrationTests

* wip BarkSemanticModelTest

* rename and add to kwargs handling

* not add to BarkSemanticModelTest

* correct logic and assert last outputs tokens different in test

* doc-builder style

* read from kwargs as well

* assert len of with less than that of without

* ruff

* add back seed and test case

* add original impl default suggestion

* doc-builder

* rename and use softmax

* switch back to LogitsProcessor and update docs wording

* camelCase and spelling and saving compute

* assert strictly less than

* assert less than

* expand test_generate_semantic_early_stop instead

e2bffcfa
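
The bullets above ("condition per sample", "rename and use softmax") describe a logits processor that ends generation early once the end-of-sequence token is likely enough. A minimal pure-Python sketch of that idea follows. The function name `prioritize_eos` and the parameters `eos_token_id` and `min_eos_p` are assumptions made for illustration, not names confirmed by this log, and the real processor operates on tensors rather than lists.

```python
import math


def prioritize_eos(logits, eos_token_id, min_eos_p):
    """Force EOS for every sample whose softmax EOS probability exceeds min_eos_p.

    `logits` is a list of per-sample lists of floats (hypothetical stand-in
    for a batch of scores). Samples below the threshold are returned unchanged.
    """
    processed = []
    for row in logits:
        # Numerically stable softmax for this sample only.
        row_max = max(row)
        exps = [math.exp(x - row_max) for x in row]
        p_eos = exps[eos_token_id] / sum(exps)

        if p_eos > min_eos_p:
            # Mask every token except EOS so that EOS is the only choice.
            forced = [-math.inf] * len(row)
            forced[eos_token_id] = row[eos_token_id]
            processed.append(forced)
        else:
            processed.append(list(row))
    return processed
```

Because the check runs per sample, one sequence in a batch can stop while the others keep generating.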

Revert "add exllamav2 arg" (#27102) · 90ee9cea
Arthur authored Oct 27, 2023
```
Revert "add exllamav2 arg (#26437)"

This reverts commit 8214d6e7.
```
90ee9cea
[`T5Tokenizer`] Fix fast and extra tokens (#27085) · aa4198a2
Arthur authored Oct 27, 2023
```
* v4.35.dev.0

* nit t5fast match t5 slow
```
aa4198a2

26 Oct, 2023 13 commits

Added huggingface emoji instead of the markdown format (#27091) · 6f316016
Varshaa Shetty authored Oct 27, 2023
```
Added huggingface emoji instead of the markdown format as it was not displaying the required emoji in that format
```
6f316016

Save TB logs as part of push_to_hub (#27022) · 34a64064

Zach Mueller authored Oct 26, 2023

* Support runs/

* Upload runs folder as part of push to hub

* Add a test

* Add to test deps

* Update with proposed solution from Slack

* Ensure that repo gets deleted in tests

34a64064

Correct docstrings and a typo in comments (#27047) · 18925925

L. Yeung authored Oct 26, 2023



* docs(training_args): correct docstrings

Correct docstrings of these methods in `TrainingArguments`:

- `set_save`
- `set_logging`

* docs(training_args): adjust words in docstrings
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(trainer): correct a typo in comments

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

18925925

add exllamav2 arg (#26437) · 8214d6e7

Marc Sun authored Oct 26, 2023

* add exllamav2 arg

* add test

* style

* add check

* add doc

* replace by use_exllama_v2

* fix tests

* fix doc

* style

* better condition

* fix logic

* add deprecate msg

8214d6e7

[Llama FA2] Re-add _expand_attention_mask and clean a couple things (#27074) · d7cb5e13

Patrick von Platen authored Oct 26, 2023

* clean

* clean llama

* fix more

* make style

* Apply suggestions from code review

* Apply suggestions from code review

* Update src/transformers/models/llama/modeling_llama.py

* Update src/transformers/models/llama/modeling_llama.py

* Apply suggestions from code review

* finish

* make style

d7cb5e13

Add-support for commit description (#26704) · 4864d08d
Arthur authored Oct 26, 2023
```
* fix

* update

* revert

* add docstring

* good to go

* update

* add a test
```
4864d08d
Create SECURITY.md · 15cd0962
Arthur authored Oct 26, 2023

15cd0962
Remove unneeded prints in modeling_gpt_neox.py (#27080) · fe2877ce
Younes Belkada authored Oct 26, 2023

fe2877ce
Bump `flash_attn` version to `2.1` (#27079) · efba1a17
Younes Belkada authored Oct 26, 2023
```
* pin FA-2 to `2.1`

* fix on modeling
```
efba1a17

Bring back `set_epoch` for Accelerate-based dataloaders (#26850) · 90412401

Zach Mueller authored Oct 26, 2023



* Working tests!

* Fix sampler

* Fix

* Update src/transformers/trainer.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix check

* Clean

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

90412401

Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/lxmert (#26888) · 3c269240

dependabot[bot] authored Oct 26, 2023

Bump urllib3 in /examples/research_projects/lxmert

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

3c269240

Bump werkzeug from 2.2.3 to 3.0.1 in /examples/research_projects/decision_transformer (#27072) · 9c5240af

dependabot[bot] authored Oct 26, 2023

Bump werkzeug in /examples/research_projects/decision_transformer

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.2.3 to 3.0.1.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/2.2.3...3.0.1)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

9c5240af

Handle unsharded Llama2 model types in conversion script (#27069) · df2eebf1
corey hu authored Oct 25, 2023
```
Handle all unsharded model types
```
df2eebf1

25 Oct, 2023 8 commits

Hindi translation of pipeline_tutorial.md (#26837) · a2f55a65

Aarya Balwadkar authored Oct 25, 2023



* hindi translation of pipeline_tutorial.md

* Update pipeline_tutorial.md

* Update build_documentation.yml

* Update build_pr_documentation.yml

* Updated build_documentation.yml

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

a2f55a65

🌐 [i18n-ZH] Translate custom_models.md into Chinese (#27065) · ba5144f7

Yeyang authored Oct 26, 2023



* docs(zh): translate custom_models.md

* minor fix in custom_models
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

ba5144f7

[`docs`] Add `MaskGenerationPipeline` in docs (#27063) · c34c50cd

Younes Belkada authored Oct 25, 2023

* add `MaskGenerationPipeline` in docs

* Update __init__.py

* fix repo consistency and clarify docstring

* add on check docstrings

* actually we do have a tf sam

* oops

c34c50cd

[DOCS] minor fixes in README.md (#27048) · ba073ea9
Akash Kundu authored Oct 25, 2023
```
minor fixes
```
ba073ea9
[docstring] fix incorrect llama docstring: encoder -> decoder (#27071) · a64f8c1f
Jing Hua authored Oct 26, 2023
```
fix incorrect docstring: encoder -> decoder
```
a64f8c1f

Fix TypicalLogitsWarper tensor OOB indexing edge case (#26579) · 0baa9246

Nick Hill authored Oct 25, 2023

* Fix TypicalLogitsWarper tensor OOB indexing edge case

This can be triggered fairly quickly with low precision, e.g. bfloat16 and typical_p = 0.99.

* Shift threshold index by one

* Use explicit named arg for clamp min

0baa9246
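
The out-of-bounds case described above can be illustrated without tensors. The sketch below is not the library's code; `threshold_index` is a hypothetical helper that shows the failure mode under one assumption: the cutoff index is obtained by counting how many cumulative probabilities fall below the target mass. When rounding keeps every cumulative value below that mass, the count equals the vocabulary size, which is one past the last valid index.

```python
def threshold_index(cumulative_probs, mass):
    """Return the index of the last sorted token to keep, clamped into range.

    `cumulative_probs` is the running sum of token probabilities after
    sorting; `mass` is the target probability mass (typical_p).
    """
    # Number of sorted tokens whose cumulative probability is still below mass.
    last_ind = sum(1 for p in cumulative_probs if p < mass)

    # With low-precision arithmetic the final cumulative value can stay below
    # `mass`, making last_ind == len(cumulative_probs). Clamp to a valid index.
    return min(last_ind, len(cumulative_probs) - 1)
```

For example, cumulative probabilities rounded to `[0.5, 0.75, 0.98]` never reach a mass of 0.99, so the unclamped count would be 3 for a list whose last valid index is 2.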

[`core`] Refactor of `gradient_checkpointing` (#27020) · 06e782da

Younes Belkada authored Oct 25, 2023

* v1

* fix

* remove `create_custom_forward`

* fixup

* fixup

* add test and fix all failing GC tests

* remove all remaining `create_custom_forward` methods

* fix idefics bug

* fixup

* replace with `__call__`

* add comment

* quality

06e782da

Skip-test (#27062) · 9286f0ac
Arthur authored Oct 25, 2023
```
* skip plbart test

* nits

* update
```
9286f0ac

24 Oct, 2023 11 commits

Fix RoPE config validation for FalconConfig + various config typos (#26929) · 6cbc1369

Tom Aarsen authored Oct 24, 2023

* Resolve incorrect ValueError in RoPE config for Falcon

* Add broken codeblock tag in Falcon Config

* Fix typo: an float -> a float

* Implement copy functionality for Fuyu and Persimmon for RoPE scaling validation

* Make style

6cbc1369
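
For readers unfamiliar with this kind of check, here is a rough sketch of what a RoPE scaling validator can look like. The field names `type` and `factor`, the allowed types `linear` and `dynamic`, and the function name `validate_rope_scaling` are assumptions modelled on Llama-style configs, not details taken from this log.

```python
def validate_rope_scaling(rope_scaling):
    """Raise ValueError if `rope_scaling` is not None and not a valid mapping."""
    if rope_scaling is None:
        return  # Scaling is optional, so None is always accepted.

    if not isinstance(rope_scaling, dict) or len(rope_scaling) != 2:
        raise ValueError(
            "`rope_scaling` must be a dictionary with two fields, "
            f"`type` and `factor`, got {rope_scaling}"
        )

    scaling_type = rope_scaling.get("type")
    factor = rope_scaling.get("factor")

    # Assumed set of supported strategies.
    if scaling_type not in ("linear", "dynamic"):
        raise ValueError(
            f"`rope_scaling`'s type field must be 'linear' or 'dynamic', got {scaling_type}"
        )

    # The factor must be a float strictly greater than 1.
    if not isinstance(factor, float) or factor <= 1.0:
        raise ValueError(
            f"`rope_scaling`'s factor field must be a float > 1, got {factor}"
        )
```

A validator like this would typically run when the config object is constructed, so an invalid setting fails before any weights are loaded.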

Add a default decoder_attention_mask for EncoderDecoderModel during training (#26752) · a0fd3448

JB (Don) authored Oct 25, 2023

* Add a default decoder_attention_mask for EncoderDecoderModel during training

Since we are already creating the default decoder_input_ids from the labels, we should also
create a default decoder_attention_mask to go with it.

* Fix test constant that relied on manual_seed()

The test was changed to use a decoder_attention_mask that ignores padding instead (which is
the default one created by BERT when attention_mask is None).

* Create the decoder_attention_mask using decoder_input_ids instead of labels

* Fix formatting in test

a0fd3448
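
The change above can be sketched in plain Python: the decoder inputs are derived from the labels, and the default mask is then derived from those decoder inputs so that padding positions are ignored. The helper names and the use of `-100` as the ignored-label value are assumptions for illustration; the actual model code works on tensors.

```python
def shift_tokens_right(labels, pad_token_id, decoder_start_token_id):
    """Build decoder_input_ids by shifting each label row one step right."""
    shifted = []
    for row in labels:
        new_row = [decoder_start_token_id] + row[:-1]
        # Assumption: ignored label positions are marked with -100 and must
        # become padding in the decoder inputs.
        shifted.append([pad_token_id if tok == -100 else tok for tok in new_row])
    return shifted


def default_decoder_attention_mask(decoder_input_ids, pad_token_id):
    """Return 1 for real tokens and 0 for padding, per position."""
    return [
        [0 if tok == pad_token_id else 1 for tok in row]
        for row in decoder_input_ids
    ]
```

Building the mask from `decoder_input_ids` rather than from the labels keeps the two aligned position by position, since the inputs are shifted relative to the labels.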

[docs] Performance docs refactor p.2 (#26791) · 9333bf07

Maria Khalusova authored Oct 24, 2023



* initial edits

* improvements for clarity and flow

* improvements for clarity and flow, removed the repeated section

* removed two docs that had no content

* Revert "removed two docs that had no content"

This reverts commit e98fa2fa0d8e171163f15cb8a04bdada1053543b.

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feedback addressed

* more feedback addressed

* feedback addressed

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

9333bf07

Fix config silent copy in from_pretrained (#27043) · 13ef14e1
Patrick von Platen authored Oct 24, 2023
```
* Fix config modeling utils

* fix more

* fix attn mask bug

* Update src/transformers/modeling_utils.py
```
13ef14e1

Device agnostic testing (#25870) · 9da45171

Alex McKinney authored Oct 24, 2023



* adds agnostic decorators and availability fns

* renaming decorators and fixing imports

* updating some representative example tests
bloom, opt, and reformer for now

* wip device agnostic functions

* lru cache to device checking functions

* adds `TRANSFORMERS_TEST_DEVICE_SPEC`
if present, imports the target file and updates device to function
mappings

* comments `TRANSFORMERS_TEST_DEVICE_SPEC` code

* extra checks on device name

* `make style; make quality`

* updates default functions for agnostic calls

* applies suggestions from review

* adds `is_torch_available` guard

* Add spec file to docs, rename function dispatch names to backend_*

* add backend import to docs example for spec file

* change instances of  to

* Move register backend to before device check as per @statelesshz changes

* make style

* make opt test require fp16 to run

---------
Co-authored-by: arsalanu <arsalanu@graphcore.ai>
Co-authored-by: arsalanu <hzji210@gmail.com>

9da45171
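
As a rough illustration of the mechanism described above (a spec file named by `TRANSFORMERS_TEST_DEVICE_SPEC` is imported and used to update device-to-function mappings), the sketch below shows one way such a dispatch could be wired. Only the environment variable name and the `backend_` prefix come from this log; the table name, the helper names, and the spec attributes `DEVICE_NAME` and `DEVICE_COUNT_FN` are assumptions.

```python
import importlib.util
import os

# Hypothetical mapping from device name to a device-count function.
# The "default" entry is the fallback for devices with no specific entry.
BACKEND_DEVICE_COUNT = {"cpu": lambda: 0, "default": lambda: 1}


def _device_agnostic_dispatch(device, dispatch_table, *args, **kwargs):
    """Call the function registered for `device`, or the default one."""
    fn = dispatch_table.get(device, dispatch_table["default"])
    return fn(*args, **kwargs)


def backend_device_count(device):
    return _device_agnostic_dispatch(device, BACKEND_DEVICE_COUNT)


def load_device_spec(env_var="TRANSFORMERS_TEST_DEVICE_SPEC"):
    """Import the spec file named by `env_var` and register its functions.

    Returns the device name declared by the spec, or None if no spec is set.
    """
    path = os.environ.get(env_var)
    if not path:
        return None

    spec = importlib.util.spec_from_file_location("device_spec", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    device_name = module.DEVICE_NAME  # assumed required attribute
    if hasattr(module, "DEVICE_COUNT_FN"):  # assumed optional attribute
        BACKEND_DEVICE_COUNT[device_name] = module.DEVICE_COUNT_FN
    return device_name
```

With this layout, tests call `backend_device_count(device)` instead of a backend-specific function, and a new accelerator only needs a spec file rather than changes to the tests.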

Add fuyu device map (#26949) · 41496b95
Marc Sun authored Oct 24, 2023
```
* add _no_split_modules

* style

* fix _no_split_modules

* add doc
```
41496b95
add info on TRL docs (#27024) · b18e3140
Leandro von Werra authored Oct 24, 2023
```
* add info on TRL docs

* add TRL link

* tweak text

* tweak text
```
b18e3140
Safe import of rgb_to_id from FE modules (#27037) · cb0c6806
amyeroberts authored Oct 24, 2023
```
Safe import from FE modules
```
cb0c6806
[`TFxxxxForSequenceClassification`] Fix the eager mode after #25085 (#25751) · 7bde5d63
Arthur authored Oct 24, 2023
```
* TODOS

* Switch .shape -> shape_list

---------
Co-authored-by: Matt <rocketknight1@gmail.com>
```
7bde5d63

Normalize only if needed (#26049) · e2d6d5ce

Michal Jamroz authored Oct 24, 2023



* Normalize only if needed

* Update examples/pytorch/image-classification/run_image_classification.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* if else in one line

* within block

* one more place, sorry for mess

* import order

* Update examples/pytorch/image-classification/run_image_classification.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update examples/pytorch/image-classification/run_image_classification_no_trainer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

e2d6d5ce

Add descriptive docstring to WhisperTimeStampLogitsProcessor (#25642) · 576e2823

JP authored Oct 24, 2023



* adding in logit examples for Whisper processor

* adding in updated logits processor for Whisper

* adding in cleaned version of logits processor for Whisper

* adding docstrings for whisper processor

* making sure the formatting is correct

* adding logits after doc builder

* Update src/transformers/generation/logits_process.py

Adding in suggested fix to the LogitProcessor description.
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/logits_process.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/logits_process.py

Removing tip per suggestion.
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/logits_process.py

Removing redundant code per suggestion.
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* adding in revised version

* adding in version with timestamp examples

* Update src/transformers/generation/logits_process.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* enhanced paragraph on behavior of processor

* fixing doc quality issue

* removing the word poem from example

* adding in updated docstring

* adding in new version of file after doc-builder

---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

576e2823