Commits · 0996a10077219de0556281511fc02f3ab68002d5 · chenpangpang / transformers

20 Feb, 2024 1 commit

Revert low cpu mem tie weights (#29135) · 0996a100

amyeroberts authored Feb 20, 2024

* Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)"

This reverts commit 725f4ad1.

* Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)"

This reverts commit 4156f517.

0996a100

16 Feb, 2024 1 commit

Update all references to canonical models (#29001) · f497f564

Lysandre Debut authored Feb 16, 2024

* Script & Manual edition

* Update

f497f564

14 Feb, 2024 1 commit

Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948) · 725f4ad1

JB (Don) authored Feb 15, 2024

* Add tie_weights() to LM heads and set bias in set_output_embeddings()

The biases were not tied correctly in some LM heads; this change should fix that.

* Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin

* Adding _tie_weights() to MPNet and Vilt

* Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device

* Rename the test to save_load to match the convention

725f4ad1

27 Oct, 2023 1 commit

[`core`/ `gradient_checkpointing`] Refactor GC - part 2 (#27073) · ffff9e70

Younes Belkada authored Oct 27, 2023



* fix

* more fixes

* fix other models

* fix long t5

* use `gradient_checkpointing_func` instead

* fix copies

* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* replace it with `is_gradient_checkpointing_set`

* remove default

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

ffff9e70

25 Oct, 2023 1 commit

[`core`] Refactor of `gradient_checkpointing` (#27020) · 06e782da

Younes Belkada authored Oct 25, 2023

* v1

* fix

* remove `create_custom_forward`

* fixup

* fixup

* add test and fix all failing GC tests

* remove all remaining `create_custom_forward` methods

* fix idefics bug

* fixup

* replace with `__call__`

* add comment

* quality

06e782da

11 Oct, 2023 1 commit

In assisted decoding, pass model_kwargs to model's forward call (fix... · dcc49d8a

Billy Bradley authored Oct 11, 2023

In assisted decoding, pass model_kwargs to model's forward call (fix prepare_input_for_generation in all models) (#25242)

* In assisted decoding, pass model_kwargs to model's forward call

Previously, assisted decoding would ignore any additional kwargs
that it doesn't explicitly handle. This was inconsistent with other
generation methods, which pass the model_kwargs through
prepare_inputs_for_generation and forward the returned dict to the
model's forward call.

The prepare_inputs_for_generation method needs to be amended in all
models, as previously it only kept the last input ID when a past_key_values
was passed.

* Improve variable names in _extend_attention_mask

* Refactor extending token_type_ids into a function

* Replace deepcopy with copy to optimize performance

* Update new persimmon model with llama changes for assisted generation

* Update new mistral model for assisted generation with prepare_inputs_for_generation

* Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
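The change to `prepare_inputs_for_generation` described in this commit message can be illustrated with a minimal sketch. It uses plain Python lists instead of tensors, and the helper name `trim_input_ids` and its `past_length` argument are hypothetical stand-ins, not the library's API. The assumption is that assisted decoding may append several candidate tokens before the cache is updated, so keeping only the last input ID would drop tokens the model has not processed yet.

```python
from typing import List


def trim_input_ids(input_ids: List[int], past_length: int) -> List[int]:
    """Return the token IDs that the cache does not cover yet.

    Hypothetical helper: `past_length` stands for the number of positions
    already stored in `past_key_values`.
    """
    if past_length <= 0:
        # No cache yet: the model must see the full sequence.
        return input_ids
    if len(input_ids) > past_length:
        # Drop only the prefix that is already cached.
        return input_ids[past_length:]
    # Cache is at least as long as the input: keep the last ID only.
    return input_ids[-1:]


# Regular decoding: one new token since the last forward pass.
print(trim_input_ids([5, 8, 13, 21], past_length=3))          # [21]
# Assisted decoding: three candidate tokens are still unprocessed.
print(trim_input_ids([5, 8, 13, 21, 34, 55], past_length=3))  # [21, 34, 55]
```

With the older "keep only the last ID" behaviour, the second call would have returned `[55]` and silently skipped two candidate tokens.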

dcc49d8a

14 Sep, 2023 1 commit

Fix beam search when using model parallel (#24969) · 8881f38a

Dong-Yong Lee authored Sep 15, 2023



* Fix GPTNeoX beam search when using parallelize

* Fix beam search idx device when using model parallel

* remove onnx related stuff
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix: move test_beam_search_on_multi_gpu to GenerationTesterMixin

* fix: add right item to _no_split_modules of MegaPreTrainedModel

* fix: add num_beams within parallelized beam_search test
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

8881f38a

08 Aug, 2023 1 commit

Add warning for missing attention mask when pad tokens are detected (#25345) · 5ea2595e

JB (Don) authored Aug 08, 2023

* Add attention mask and pad token warning to many of the models

* Remove changes under examples/research_projects

These files are not maintained by HF.

* Skip the warning check during torch.fx or JIT tracing

* Switch ordering for the warning and input shape assignment

This ordering is a little cleaner for some of the cases.

* Add missing line break in one of the files

5ea2595e

17 Jul, 2023 1 commit

Replace assert statements with exceptions (#24856) · d0154015

Syed Salman Habeeb Quadri authored Jul 18, 2023

* Changed AssertionError to ValueError

The try-except block was using AssertionError in the except statement while the expected error is ValueError. Fixed the same.

* Changed AssertionError to ValueError

The try-except block was using AssertionError in the except statement while the expected error is ValueError. Fixed the same.
Note: args are passed to the ValueError when it is raised, but are added again later while handling the error (see the code snippet)

* Changed AssertionError to ValueError

The try-except block was using AssertionError in the except statement while the expected error is ValueError. Fixed the same.
Note: args are passed to the ValueError when it is raised, but are added again later while handling the error (see the code snippet)

* Changed AssertionError to ValueError

* Changed AssertionError to ValueError

* Changed AssertionError to ValueError

* Changed AssertionError to ValueError

* Changed AssertionError to ValueError

* Changed assert statement to ValueError based

* Changed assert statement to ValueError based

* Changed assert statement to ValueError based

* Changed incorrect error handling from AssertionError to ValueError

* Undid change from AssertionError to ValueError as it is not needed

* Reverted back to using AssertionError as it is not necessary to make it into ValueError

* Fixed erroneous comparison

Changed == to !=

* Fixed erroneous comparison

Changed == to !=

* formatted the code

* Ran make fix-copies
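As a rough illustration of the pattern in this commit, here is a minimal sketch using a hypothetical function `split_heads` (not taken from the repository). Because `assert` statements are stripped when Python runs with `-O`, validation is moved to an explicit `raise`. Converting `assert a == b` into an `if` guard requires inverting the condition, which is presumably what the "Changed == to !=" notes refer to.

```python
def split_heads(hidden_size: int, num_heads: int) -> int:
    """Return the per-head size, validating the arguments explicitly."""
    # Before: assert hidden_size % num_heads == 0, "..."
    # After: the condition is inverted (== becomes !=) and a ValueError is raised.
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) must be divisible by num_heads ({num_heads})"
        )
    return hidden_size // num_heads


try:
    split_heads(768, 5)
except ValueError as err:  # callers now catch ValueError, not AssertionError
    print(f"rejected: {err}")

print(split_heads(768, 12))  # 64
```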

d0154015

30 Jun, 2023 1 commit

Show a warning for missing attention masks when pad_token_id is not None (#24510) · 78a2b19f

JB (Don) authored Jun 30, 2023



* Adding warning messages to BERT for missing attention masks

These warning messages are shown when there are pad tokens within the input ids and
no attention masks are given. The warning message should only show up once.

* Adding warning messages to BERT for missing attention masks

These warning messages are shown when the pad_token_id is not None
and no attention masks are given. The warning message should only
show up once.

* Ran fix copies to copy over the changes to some of the other models

* Add logger.warning_once.cache_clear() to the test

* Shows warning when there are no attention masks and input_ids start/end with pad tokens

* Using warning_once() instead and fix indexing in input_ids check
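The behaviour described in these notes can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: the names `warning_once` and `needs_attention_mask_warning` are defined here for the example, plain lists replace tensors, and the once-only behaviour is assumed to come from `functools.lru_cache`, which the `cache_clear()` call mentioned for the test suggests.

```python
import functools
import logging
from typing import List, Optional

logger = logging.getLogger("attention_mask_sketch")


@functools.lru_cache(None)
def warning_once(message: str) -> None:
    """Emit `message` a single time; repeated calls with the same text are cached."""
    logger.warning(message)


def needs_attention_mask_warning(
    input_ids: List[List[int]],
    attention_mask: Optional[List[List[int]]],
    pad_token_id: Optional[int],
) -> bool:
    """Return True (and warn once) when padding is likely present but no mask was given."""
    if attention_mask is not None or pad_token_id is None:
        return False
    # Only the first and last token of each row are inspected, as in the notes above.
    if any(bool(row) and pad_token_id in (row[0], row[-1]) for row in input_ids):
        warning_once("Pad tokens detected without an attention_mask; pass one explicitly.")
        return True
    return False


batch = [[0, 0, 17, 42], [9, 17, 42, 5]]  # first row is left-padded with ID 0
mask = [[0, 0, 1, 1], [1, 1, 1, 1]]
print(needs_attention_mask_warning(batch, None, pad_token_id=0))  # True
print(needs_attention_mask_warning(batch, mask, pad_token_id=0))  # False
```

Calling `warning_once.cache_clear()` resets the cache, which is why a test that expects the warning to appear needs to clear it first.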

---------
Co-authored-by: JB Lau <hckyn@voyager2.local>

78a2b19f

27 Jun, 2023 1 commit

Clean load keys (#24505) · 8e5d1619

Sylvain Gugger authored Jun 27, 2023

* Preliminary work on some models

* Fix test load missing and make sure nonpersistent buffers are tested

* Always ignore nonpersistent buffers if in state_dict

* Treat models

* More models

* Treat remaining models

* Fix quality

* Fix tests

* Remove draft

* This test is not needed anymore

* Fix copies

* Fix last test

* Newly added models

* Fix last tests

* Address review comments

8e5d1619

22 Jun, 2023 1 commit

Revert "Fix gradient checkpointing + fp16 autocast for most models" (#24420) · 3ce3385c

Younes Belkada authored Jun 22, 2023

Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)"

This reverts commit 285a4801.

3ce3385c

21 Jun, 2023 1 commit

Fix gradient checkpointing + fp16 autocast for most models (#24247) · 285a4801

Younes Belkada authored Jun 21, 2023



* fix gc bug

* continue PoC on OPT

* fixes

* :exploding_head:

* fix tests

* remove pytest.mark

* fixup

* forward contrib credits from discussions

* forward contrib credits from discussions

* reverting changes on untouched files.

---------
Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com>
Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>

285a4801

13 Jun, 2023 1 commit

Tied params cleanup (#24211) · 695928e1

Sylvain Gugger authored Jun 13, 2023

* First test

* Add info for all models

* style

* Repo consistency

* Fix last model and cleanup prints

* Repo consistency

* Use consistent function for detecting tied weights

695928e1

06 Mar, 2023 1 commit

Fix bert issue (#21963) · 934d0b8b

saswatmeher authored Mar 06, 2023

Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>

934d0b8b

27 Feb, 2023 1 commit

introduce `logger.warning_once` and use it for grad checkpointing code (#21804) · c7f3abc2

Stas Bekman authored Feb 27, 2023

* logger.warning_once

* style

c7f3abc2

07 Feb, 2023 1 commit

[CI ] Remove `past` in favor of `past_key_values` (#21443) · 12eb528b

Arthur authored Feb 07, 2023

* fix past renamed to past_key_value

* update more `past` that were skipped

* fixup

* remove changes made to rag

* refactor `_reorder_cache` to use `past_key_values`

* fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache

12eb528b

06 Feb, 2023 1 commit

Update quality tooling for formatting (#21480) · 6f79d264

Sylvain Gugger authored Feb 06, 2023

* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies

6f79d264

23 Jan, 2023 1 commit

Models docstring (#21225) · fd5cdaee

Sylvain Gugger authored Jan 23, 2023

* Clean all models

* Style

* Last to remove

* address review comments

* Address review comments

fd5cdaee

14 Jan, 2023 1 commit

Rework automatic code samples in docstrings (#20757) · c8f35a9c

Sylvain Gugger authored Jan 14, 2023

* Rework automatic code samples in docstrings

* ImageProcessor->AutoImageProcessor

* Add models to fix copies

* Last typos

* A couple more models

* Fix copies

c8f35a9c

08 Jan, 2023 1 commit

Replace `past` with `past_key_values` (#20944) · f0577df6

Arthur authored Jan 08, 2023

* start cleanup

* more updates

* more models are affected

* more updates

* update generation utils

* style

* revert change that removed reorder cache

* update generation utils

* style

* style

* remove reorder cache

f0577df6

15 Nov, 2022 1 commit

update relative positional embedding (#20203) · f60eec40

Arthur authored Nov 15, 2022

* update relative positional embedding

* make fix copies

* add `use_cache` to list of arguments

* fixup

* 1-line function

* add `test_decoder_model_past_with_large_inputs_relative_pos_emb`

* add relative pos embedding test for more models

* style

f60eec40

09 Nov, 2022 1 commit

Attempting to test automatically the `_keys_to_ignore`. (#20042) · bac2d29a

Nicolas Patry authored Nov 09, 2022



* Attempting to test automatically the `_keys_to_ignore`.

* Style.

* First fix pass.

* Moving test on its own.

* Another batch.

* Second round removing BatchNorm

* Fixing layoutlmv{2,3} + support older Python.

* Disable miss missing warning.

* Removing dodgy additions.

* Big pass.

* mbart.

* More corrections.

* Fixup.

* Updating test_correct_missing_keys

* Add escape hatch for when the head has no extra params so doesn't need the missing keys check.

* Fixing test.

* Greener.

* Green ! (except for weird splinter bug).

* Adding a test about `named_parameters` usage.

* Shorten message.

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* After rebase modifications.

* More explicit condition checking.

* Fixing slow tests issues.

* Remove extra pdb.

* Remove print.

* Attempt to make failure consistent + fixing roc_bert.

* Removing the seed  (all tests passing with it).
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

bac2d29a

14 Sep, 2022 1 commit

PyTorch >= 1.7.0 and TensorFlow >= 2.4.0 (#19016) · a2a3afbc

Sylvain Gugger authored Sep 14, 2022

a2a3afbc

03 Aug, 2022 1 commit

Fix torch version comparisons (#18460) · 02b176c4

LSinev authored Aug 03, 2022

Comparisons like
version.parse(torch.__version__) > version.parse("1.6")
are True for torch==1.6.0+cu101 or torch==1.6.0+cpu

version.parse(version.parse(torch.__version__).base_version) is preferred (and available in pytorch_utils.py)
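The pitfall described here can be reproduced without PyTorch installed. The sketch below uses a hard-coded example string in place of `torch.__version__`, and the variable names are chosen for illustration only.

```python
try:
    from packaging import version
except ImportError:  # fallback so the sketch runs where only pip's vendored copy exists
    from pip._vendor.packaging import version

installed = "1.6.0+cu101"  # example of the kind of string torch.__version__ may report

# The local segment (+cu101) sorts after the plain release, so this is True.
naive = version.parse(installed) > version.parse("1.6")

# base_version drops the local segment, leaving "1.6.0", which equals "1.6".
base = version.parse(version.parse(installed).base_version)
strict = base > version.parse("1.6")

print(naive, strict)  # True False
```

The naive comparison therefore treats a CUDA or CPU build of 1.6.0 as "newer than 1.6", while the comparison on `base_version` does not.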

02b176c4

12 May, 2022 1 commit

Black preview (#17217) · afe5d42d

Sylvain Gugger authored May 12, 2022

* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black

afe5d42d

04 May, 2022 1 commit

Type hint complete Albert model file. (#16682) · 9c5ae87f

karthikrangasai authored May 04, 2022



* Type hint complete Albert model file.

* Update typing.

* Update src/transformers/models/albert/modeling_albert.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

9c5ae87f

03 May, 2022 1 commit

Remove device parameter from create_extended_attention_mask_for_decoder (#16894) · 39f8eafc

Pavel Belevich authored May 03, 2022

39f8eafc

12 Apr, 2022 1 commit

Moved functions to pytorch_utils.py (#16625) · a315988b

Anmol Joshi authored Apr 12, 2022

* Moved functions to pytorch_utils.py

* isort formatting

* Reverted tf changes

* isort, make fix-copies

* documentation fix

* Fixed Conv1D import

* Reverted research examples file

* backward compatibility for pytorch_utils

* missing import

* isort fix

a315988b

11 Apr, 2022 1 commit

Add Doc Test for BERT (#16523) · 2831826b

Minh Chien Vu authored Apr 11, 2022



* Add doctest BERT

* make fixup

* fix typo

* change checkpoints

* make fixup

* define doctest output value, update doctest for mobilebert

* solve fix-copies

* update QA target start index and end index

* change checkpoint for docs and reuse defined variable

* Update src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* make fixup
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

2831826b

31 Mar, 2022 1 commit

make tuple annotation more specific to avoid failures during symbolic_trace (#16490) · 99a01423

chenbohua3 authored Mar 31, 2022

* make tuple annotation more specific to avoid failures during symbolic_trace

* make tuple annotation more specific to avoid failures during symbolic_trace

99a01423

25 Mar, 2022 1 commit

Big file_utils cleanup (#16396) · 088c1880

Sylvain Gugger authored Mar 25, 2022

* Big file_utils cleanup

* This one still needs to be treated separately

088c1880

23 Mar, 2022 1 commit

Reorganize file utils (#16264) · 4975002d

Sylvain Gugger authored Mar 23, 2022

* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wrong move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit

4975002d

22 Mar, 2022 1 commit

Add type annotations for Rembert/Splinter and copies (#16338) · ec3aace0

Jacob Dineen authored Mar 22, 2022



* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`
Co-authored-by: matt <rocketknight1@gmail.com>

ec3aace0

11 Mar, 2022 1 commit

Add type annotations for BERT and copies (#16074) · bb69d154

Matt authored Mar 11, 2022

* Add type annotations for BERT and copies

* make fixup

bb69d154

07 Feb, 2022 1 commit

FX tracing improvement (#14321) · 0fe17f37

Michael Benayoun authored Feb 07, 2022

* Change the way tracing happens, enabling dynamic axes out of the box

* Update the tests and modeling xlnet

* Add the non-recording of leaf modules to avoid recording more values for the methods to record than what will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).

* Comments and making tracing work for gpt-j and xlnet

* Refactor things related to num_choices (and batch_size, sequence_length)

* Update fx to work on PyTorch 1.10

* Postpone autowrap_function feature usage for later

* Add copyrights

* Remove unnecessary file

* Fix issue with add_new_model_like

* Apply suggestions

0fe17f37

31 Jan, 2022 1 commit

Fix loss calculation in TFXXXForTokenClassification models (#15294) · 554d333e

Yih-Dar authored Jan 31, 2022



* Fix loss calculation in TFFunnelForTokenClassification

* revert the change in TFFunnelForTokenClassification

* fix FunnelForTokenClassification loss

* fix other TokenClassification loss

* fix more

* fix more

* add num_labels to ElectraForTokenClassification

* revert the change to research projects
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

554d333e

28 Dec, 2021 1 commit

Doc styler examples (#14953) · b5e2b183

Sylvain Gugger authored Dec 27, 2021

* Fix bad examples

* Add black formatting to style_doc

* Use first nonempty line

* Put it at the right place

* Don't add spaces to empty lines

* Better templates

* Deal with triple quotes in docstrings

* Result of style_doc

* Enable mdx treatment and fix code examples in MDXs

* Result of doc styler on doc source files

* Last fixes

* Break copy from

b5e2b183

27 Dec, 2021 1 commit

Doc styler v2 (#14950) · 87e6e4fe

Sylvain Gugger authored Dec 27, 2021

* New doc styler

* Fix issue with args at the start

* Code sample fixes

* Style code examples in MDX

* Fix more patterns

* Typo

* Typo

* More patterns

* Do without black for now

* Get more info in error

* Docstring style

* Re-enable check

* Quality

* Fix add_end_docstring decorator

* Fix docstring

87e6e4fe

21 Dec, 2021 1 commit

Convert docstrings of modeling files (#14850) · 7af80f66

Sylvain Gugger authored Dec 21, 2021

* Convert file_utils docstrings to Markdown

* Test on BERT

* Return block indent

* Temporarily disable doc styler

* Remove from quality checks as well

* Remove doc styler mess

* Remove check from circleCI

* Fix typo

* Let's go on all other model files

* Add templates too

* Styling and quality

7af80f66