- 28 Mar, 2024 1 commit
Minseo Kang authored
- 25 Mar, 2024 1 commit
Lysandre Debut authored

* [test_all] Remove static pretrained maps from the library's internals
* Deprecate archive maps instead of removing them
* Revert init changes
* [test_all] Deprecate instead of removing
* [test_all] PVT v2 support
* [test_all] Tests should all pass
* [test_all] Style
* Address review comments
* Update src/transformers/models/deprecated/_archive_maps.py
* Update src/transformers/models/deprecated/_archive_maps.py
* [test_all] trigger tests
* [test_all] LLAVA
* [test_all] Bad rebase

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
- 16 Feb, 2024 1 commit
Lysandre Debut authored

* Script & Manual edition
* Update
- 01 Feb, 2024 1 commit
JB (Don) authored

* Adding [T5/MT5/UMT5]ForTokenClassification
* Add auto mappings for T5ForTokenClassification and variants
* Adding ForTokenClassification to the list of models
* Adding attention_mask param to the T5ForTokenClassification test
* Remove outdated comment in test
* Adding EncoderOnly and Token Classification tests for MT5 and UMT5
* Fix typo in umt5 string
* Add tests for all the existing MT5 models
* Fix wrong comment in dependency_versions_table
* Reverting change to common test for _keys_to_ignore_on_load_missing: the test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing
* Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model
* Add fix-copies to MT5ModelTest
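For orientation, a minimal usage sketch of the new head; the checkpoint name, label count, and example sentence are illustrative choices, not part of the commit:

```python
# Hypothetical example: "t5-small" and num_labels=2 are illustrative choices.
from transformers import AutoTokenizer, T5ForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForTokenClassification.from_pretrained("t5-small", num_labels=2)

inputs = tokenizer("T5 now supports token classification", return_tensors="pt")
logits = model(**inputs).logits  # one row of label logits per input token
```

The head is encoder-only, so plain `input_ids` (plus `attention_mask`) are all the forward call needs.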
- 02 Nov, 2023 1 commit
Pietro Lesci authored

* remove redundant code
* update
* add typecasting
* make `attention_mask` float again
- 01 Nov, 2023 1 commit
Lysandre Debut authored

Fix disk offload tests + weight sharing issues
- 27 Oct, 2023 1 commit
Younes Belkada authored

* fix
* more fixes
* fix other models
* fix long t5
* use `gradient_checkpointing_func` instead
* fix copies
* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour
* Update src/transformers/modeling_utils.py
* replace it with `is_gradient_checkpointing_set`
* remove default
* Update src/transformers/modeling_utils.py
* fixup

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
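A hedged sketch of the pattern this refactor converges on; the `TinyEncoder` class is illustrative, not transformers code. The model stores one checkpointing callable and layer calls are routed through it, replacing the per-model `create_custom_forward` closures removed in the 25 Oct commit below:

```python
import torch
from functools import partial
from torch.utils.checkpoint import checkpoint


class TinyEncoder(torch.nn.Module):
    """Illustrative stand-in for a transformer encoder stack."""

    def __init__(self):
        super().__init__()
        self.layers = torch.nn.ModuleList(torch.nn.Linear(8, 8) for _ in range(2))
        self.gradient_checkpointing = False
        self._gradient_checkpointing_func = None

    def gradient_checkpointing_enable(self):
        # The checkpointing callable is stored once on the model ...
        self.gradient_checkpointing = True
        self._gradient_checkpointing_func = partial(checkpoint, use_reentrant=False)

    def forward(self, hidden_states):
        for layer in self.layers:
            if self.gradient_checkpointing and self.training:
                # ... and every layer invocation is routed through it.
                hidden_states = self._gradient_checkpointing_func(layer.__call__, hidden_states)
            else:
                hidden_states = layer(hidden_states)
        return hidden_states
```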
- 25 Oct, 2023 1 commit
Younes Belkada authored

* v1
* fix
* remove `create_custom_forward`
* fixup
* fixup
* add test and fix all failing GC tests
* remove all remaining `create_custom_forward` methods
* fix idefics bug
* fixup
* replace with `__call__`
* add comment
* quality
- 12 Oct, 2023 1 commit
Tom Aarsen authored

Add missing spaces in adjacent strings
- 11 Oct, 2023 1 commit
Billy Bradley authored

In assisted decoding, pass model_kwargs to model's forward call (fix prepare_inputs_for_generation in all models) (#25242)

* In assisted decoding, pass model_kwargs to model's forward call

  Previously, assisted decoding would ignore any additional kwargs that it doesn't explicitly handle. This was inconsistent with other generation methods, which pass the model_kwargs through prepare_inputs_for_generation and forward the returned dict to the model's forward call. The prepare_inputs_for_generation method needs to be amended in all models, as previously it only kept the last input ID when past_key_values was passed.

* Improve variable names in _extend_attention_mask
* Refactor extending token_type_ids into a function
* Replace deepcopy with copy to optimize performance
* Update new persimmon model with llama changes for assisted generation
* Update new mistral model for assisted generation with prepare_inputs_for_generation
* Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
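A simplified sketch, not the actual transformers code, of the convention this fix brings assisted decoding in line with: inputs go through `prepare_inputs_for_generation`, and the returned dict, extra kwargs included, is forwarded wholesale.

```python
# Illustrative signature; real models return model-specific keys here.
def prepare_inputs_for_generation(input_ids, past_key_values=None, **model_kwargs):
    if past_key_values is not None:
        # With a cache, only the not-yet-processed tokens are fed to the model.
        input_ids = input_ids[:, -1:]
    return {"input_ids": input_ids, "past_key_values": past_key_values, **model_kwargs}


# Every generation method then does the equivalent of:
#   model_inputs = prepare_inputs_for_generation(input_ids, **model_kwargs)
#   outputs = model(**model_inputs)
```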
- 28 Sep, 2023 1 commit
fleance authored

Do not warn about unexpected decoder weights when loading T5EncoderModel and LongT5EncoderModel (#26211)

Ignore decoder weights when using T5EncoderModel and LongT5EncoderModel. Both classes have no decoder layers, so loading a pretrained checkpoint such as t5-small warns about keys found in the checkpoint that are not in the model itself. To silence this warning, r"decoder" has been added to _keys_to_ignore_on_load_unexpected for both T5EncoderModel and LongT5EncoderModel.
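The mechanism, sketched with an abbreviated class body (not the full definition): a model class lists regex patterns for checkpoint keys that `from_pretrained` should drop without warning.

```python
from transformers.models.t5.modeling_t5 import T5PreTrainedModel


class T5EncoderModel(T5PreTrainedModel):  # body abbreviated for illustration
    # Checkpoint keys matching these patterns (here: every decoder weight)
    # are skipped silently instead of triggering "unexpected keys" warnings.
    _keys_to_ignore_on_load_unexpected = [r"decoder"]
```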
- 25 Jul, 2023 1 commit
Sebastian Husch Lee authored

* Initial addition of t5forsequenceclassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding mt5forseq
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSeqClass
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix umt5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one liner
* Make one-liner
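A minimal usage sketch for the new head; the checkpoint, label count, and input text are illustrative choices. The sketch assumes that when no `decoder_input_ids` are supplied, the model derives them from `input_ids`, so a plain tokenized sentence is enough:

```python
# Hypothetical example: "t5-small" and num_labels=3 are illustrative choices.
from transformers import AutoTokenizer, T5ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForSequenceClassification.from_pretrained("t5-small", num_labels=3)

inputs = tokenizer("A sentence to classify", return_tensors="pt")
logits = model(**inputs).logits  # shape: (batch_size, num_labels)
```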
- 10 Jul, 2023 1 commit
Sebastian Husch Lee authored

[`T5`] Adding model_parallel = False to `T5ForQuestionAnswering` and `MT5ForQuestionAnswering` (#24684)

Adding model_parallel = False
- 27 Jun, 2023 2 commits
Sylvain Gugger authored

* Preliminary work on some models
* Fix test load missing and make sure nonpersistent buffers are tested
* Always ignore nonpersistent buffers if in state_dict
* Treat models
* More models
* Treat remaining models
* Fix quality
* Fix tests
* Remove draft
* This test is not needed anymore
* Fix copies
* Fix last test
* Newly added models
* Fix last tests
* Address review comments
Sebastian authored
* Adding T5ForQuestionAnswering
* Changed weight initialization that results in better initial loss when fine-tuning
* Update to class variables
* Running make fixup
* Running make fix-copies
* Remove model_parallel
* Adding MT5ForQuestionAnswering
* Adding docs
* Fix wrong doc
* Update src/transformers/models/mt5/modeling_mt5.py
* Update src/transformers/models/t5/modeling_t5.py
* File formatting
* Undoing change

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
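A minimal usage sketch for the new extractive QA head; the checkpoint and question/context pair are illustrative, and the sketch assumes the model builds its own `decoder_input_ids` when none are passed:

```python
# Hypothetical example: the checkpoint choice is illustrative.
from transformers import AutoTokenizer, T5ForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForQuestionAnswering.from_pretrained("t5-small")

question, context = "Who proposed T5?", "T5 was proposed by Raffel et al."
inputs = tokenizer(question, context, return_tensors="pt")
outputs = model(**inputs)
# Extractive head: logits over input positions for the answer span.
start_logits, end_logits = outputs.start_logits, outputs.end_logits
```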
- 22 Jun, 2023 1 commit
Younes Belkada authored

Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)"

This reverts commit 285a4801.
- 21 Jun, 2023 1 commit
Younes Belkada authored

* fix gc bug
* continue PoC on OPT
* fixes
* :exploding_head:
* fix tests
* remove pytest.mark
* fixup
* forward contrib credits from discussions
* forward contrib credits from discussions
* reverting changes on untouched files

Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com>
Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>
- 13 Jun, 2023 1 commit
Sylvain Gugger authored

* First test
* Add info for all models
* style
* Repo consistency
* Fix last model and cleanup prints
* Repo consistency
* Use consistent function for detecting tied weights
- 12 May, 2023 1 commit
Susnato Dhar authored

replaced assert with raise ValueError for t5, switch_transformers, pix2struct, mt5, longt5, gptsan_japanese (#23273)

* replaced assert with raise ValueError
* one liner
* reverse one liner and cache-decoder check
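The pattern behind the change, sketched with an illustrative helper (not a line from the commit): `assert` statements vanish under `python -O`, so library-side validation raises explicit exceptions instead.

```python
def check_is_decoder(is_decoder: bool) -> None:
    # Before: assert is_decoder, "`past_key_values` can only be used by a decoder"
    if not is_decoder:
        raise ValueError("`past_key_values` can only be used by a decoder")
```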
- 20 Apr, 2023 1 commit
Aashiq Muhamed authored
- 03 Apr, 2023 1 commit
Younes Belkada authored

* enable PP for T5
* make fixup
* fix failing tests
- 15 Mar, 2023 1 commit
Prathik Rao authored

* t5 remove data dependency
* make style
* make fix-copies

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
- 09 Mar, 2023 1 commit
Nipun Jindal authored

* [21737][T5]: Fix gradient checkpoint bug
* Update src/transformers/models/mt5/modeling_mt5.py
* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: njindal <njindal@adobe.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
- 28 Feb, 2023 1 commit
Younes Belkada authored

* fix torchquant issue
* add tests
- 27 Feb, 2023 1 commit
Stas Bekman authored

* logger.warning_once
* style
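For context, the API this commit switches to; the message text below is illustrative. `warning_once` emits a given message only the first time it is hit, which matters for warnings that would otherwise fire on every forward pass:

```python
from transformers.utils import logging

logger = logging.get_logger(__name__)

for _ in range(3):
    # Printed once, not three times.
    logger.warning_once("`use_cache=True` is incompatible with gradient checkpointing.")
```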
- 07 Feb, 2023 2 commits
Arthur authored

* fix past renamed to past_key_value
* update more `past` that were skipped
* fixup
* remove changes made to rag
* refactor `_reorder_cache` to use `past_key_values`
* fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache
Sylvain Gugger authored
* Deprecate parallelize API
* Add documentation
* Fix copies
- 06 Feb, 2023 1 commit
Sylvain Gugger authored

* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
- 24 Jan, 2023 1 commit
Younes Belkada authored

* attempts to fix:
  - upcast input for `T5DenseActDense`
  - add the condition `self.wo.weight.dtype != torch.int8`
  - added tests on `test/mixed_int8`
  - `make fixup`
* fix ci test
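A sketch of the guard described above, loosely following `T5DenseActDense.forward` (the helper function itself is illustrative): hidden states are cast back to the output projection's weight dtype, except when that weight is int8-quantized and the cast would be invalid.

```python
import torch


def project_out(wo: torch.nn.Linear, hidden_states: torch.Tensor) -> torch.Tensor:
    # Illustrative helper, not the actual transformers code.
    if (
        isinstance(wo.weight, torch.Tensor)
        and hidden_states.dtype != wo.weight.dtype
        and wo.weight.dtype != torch.int8  # the condition this fix adds
    ):
        hidden_states = hidden_states.to(wo.weight.dtype)
    return wo(hidden_states)
```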
- 23 Jan, 2023 1 commit
Sylvain Gugger authored

* Clean all models
* Style
* Last to remove
* address review comments
* Address review comments
- 08 Jan, 2023 1 commit
Arthur authored

* start cleanup
* more updates
* more models are affected
* more updates
* update generation utils
* style
* revert change that removed reorder cache
* update generation utils
* style
* style
* remove reorder cache
- 03 Jan, 2023 1 commit
ivanllt authored

Fix start_docstring for deparallelize method
- 15 Dec, 2022 1 commit
Lars Mennen authored

* Workaround for #20287: FlanT5-XXL 8bit support
* Make fix-copies
* revert unrelated change
* Don't apply to longt5 and switch transformers
- 13 Dec, 2022 1 commit
Younes Belkada authored

* add `keep_in_fp32_modules` support
* pass it as class attribute
* few modifs
  - make tests `slow`
  - fix logic
* better logic
* fix failing test
* `bfloat16` support
* Update src/transformers/modeling_utils.py
* fix
* simplify tests
* simplify tests
* fix test
* modify message
* more checks
* fix failing tests
* add more conditions
  - add `is_accelerate_available`
  - fixes pipeline tests that failed
* add suggestions
* Update src/transformers/modeling_utils.py
* fix failing `bnb` test
* add last safety checker

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
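The mechanism, sketched with an abbreviated class body: a model class lists module-name fragments that `from_pretrained` keeps in float32 even when the rest of the model loads in half precision. For T5 this covers the `wo` projections, which is what the FlanT5-XXL 8-bit workaround above builds on.

```python
from transformers import PreTrainedModel


class T5PreTrainedModel(PreTrainedModel):  # body abbreviated for illustration
    # Any module whose name contains "wo" stays in torch.float32 even when
    # loading with torch_dtype=torch.float16 or torch.bfloat16.
    _keep_in_fp32_modules = ["wo"]
```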
- 06 Dec, 2022 1 commit
Sourab Mangrulkar authored

* updating T5 and BART models to support Prefix Tuning
* `make fix-copies`
* address comments
* address comments
- 09 Nov, 2022 1 commit
Nicolas Patry authored

* Attempting to test automatically the `_keys_to_ignore`.
* Style.
* First fix pass.
* Moving test on its own.
* Another batch.
* Second round removing BatchNorm
* Fixing layoutlmv{2,3} + support older Python.
* Disable miss missing warning.
* Removing dodgy additions.
* Big pass.
* mbart.
* More corrections.
* Fixup.
* Updating test_correct_missing_keys
* Add escape hatch for when the head has no extra params so doesn't need the missing keys check.
* Fixing test.
* Greener.
* Green! (except for weird splinter bug).
* Adding a test about `named_parameters` usage.
* Shorten message.
* Apply suggestions from code review
* After rebase modifications.
* More explicit condition checking.
* Fixing slow tests issues.
* Remove extra pdb.
* Remove print.
* Attempt to make failure consistent + fixing roc_bert.
* Removing the seed (all tests passing with it).

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
- 06 Sep, 2022 2 commits
Ekagra Ranjan authored

* use tokenizer to output tensor
* add preprocessing for decoder_input_ids for bare T5Model
* add preprocessing to tf and flax
* linting
* linting
* Update src/transformers/models/t5/modeling_flax_t5.py
* Update src/transformers/models/t5/modeling_tf_t5.py
* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Had authored
* add position bias head masking if heads pruned
* fix pruning function in t5 encoder
* make style
* make fix-copies
* Revert added folder

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
- 27 Jul, 2022 1 commit
Yanming Wang authored
- 06 Jul, 2022 1 commit
ADAning authored

* Add ALL_LAYERNORM_LAYERS for LayerNorm
* fix bug of appending layer norm
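For context, a sketch of what the registration enables, assuming current module paths: utilities that special-case layer norms (for example the Trainer's default weight-decay grouping) consult `ALL_LAYERNORM_LAYERS`, so T5's RMS-style `T5LayerNorm` is appended to that list.

```python
from transformers.pytorch_utils import ALL_LAYERNORM_LAYERS
from transformers.models.t5.modeling_t5 import T5LayerNorm

# After this commit, modeling_t5.py does the equivalent of:
#   ALL_LAYERNORM_LAYERS.append(T5LayerNorm)
assert T5LayerNorm in ALL_LAYERNORM_LAYERS
```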