Commits · f497f564bb76697edab09184a252fc1b1a326d1e · chenpangpang / transformers

16 Feb, 2024 1 commit
- Update all references to canonical models (#29001) · f497f564
  Lysandre Debut authored Feb 16, 2024
```
* Script & Manual edition

* Update
```
  f497f564
08 Aug, 2023 1 commit

Add warning for missing attention mask when pad tokens are detected (#25345) · 5ea2595e

JB (Don) authored Aug 08, 2023

* Add attention mask and pad token warning to many of the models

* Remove changes under examples/research_projects

These files are not maintained by HG.

* Skip the warning check during torch.fx or JIT tracing

* Switch ordering for the warning and input shape assignment

This ordering is a little cleaner for some of the cases.

* Add missing line break in one of the files

5ea2595e

17 Jul, 2023 1 commit

Replace assert statements with exceptions (#24856) · d0154015

Syed Salman Habeeb Quadri authored Jul 18, 2023

* Changed AssertionError to ValueError

The try-except block was using AssertionError in the except statement while the expected error is ValueError. Fixed the same.

* Changed AssertionError to ValueError

The try-except block was using AssertionError in the except statement while the expected error is ValueError. Fixed the same.
Note: args are passed to the ValueError when it is raised, but are added again later while handling the error (see the code snippet).

* Changed AssertionError to ValueError

The try-except block was using AssertionError in the except statement while the expected error is ValueError. Fixed the same.
Note: args are passed to the ValueError when it is raised, but are added again later while handling the error (see the code snippet).

* Changed AssertionError to ValueError

* Changed AssertionError to ValueError

* Changed AssertionError to ValueError

* Changed AssertionError to ValueError

* Changed AssertionError to ValueError

* Changed assert statement to ValueError based

* Changed assert statement to ValueError based

* Changed assert statement to ValueError based

* Changed incorrect error handling from AssertionError to ValueError

* Undid the change from AssertionError to ValueError as it is not needed

* Reverted back to using AssertionError as it is not necessary to make it into ValueError

* Fixed erroneous comparison

Changed == to !=

* Fixed erroneous comparison

Changed == to !=

* formatted the code

* Ran make fix-copies

d0154015
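
The pattern described in the commit above (raising ValueError instead of relying on assert, and catching the matching exception type) can be sketched as follows. This is a minimal illustration, not code from the repository: the function names `head_size` and `safe_head_size` and the divisibility check are hypothetical.

```python
def head_size(hidden_size: int, num_heads: int) -> int:
    """Return the per-head size, validating the inputs with an exception.

    An `assert` would be stripped when Python runs with -O, so an explicit
    ValueError keeps the check active in all modes.
    """
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) must be divisible by num_heads ({num_heads})"
        )
    return hidden_size // num_heads


def safe_head_size(hidden_size: int, num_heads: int):
    """Hypothetical caller: the except clause must name the raised type."""
    try:
        return head_size(hidden_size, num_heads)
    except ValueError:
        # Catching AssertionError here would let the ValueError propagate.
        return None


valid = safe_head_size(768, 12)
invalid = safe_head_size(768, 10)
```

With these inputs `valid` holds the per-head size and `invalid` is None, because the ValueError is caught by the matching except clause.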

27 Jun, 2023 1 commit

Clean load keys (#24505) · 8e5d1619

Sylvain Gugger authored Jun 27, 2023

* Preliminary work on some models

* Fix test load missing and make sure nonpersistent buffers are tested

* Always ignore nonpersistent buffers if in state_dict

* Treat models

* More models

* Treat remaining models

* Fix quality

* Fix tests

* Remove draft

* This test is not needed anymore

* Fix copies

* Fix last test

* Newly added models

* Fix last tests

* Address review comments

8e5d1619

26 Jun, 2023 1 commit
- Update AlbertModel type annotation (#24450) · 892399c5
  amyeroberts authored Jun 26, 2023
```
Update type annotation
```
  892399c5
22 Jun, 2023 1 commit
- Revert "Fix gradient checkpointing + fp16 autocast for most models" (#24420) · 3ce3385c
  Younes Belkada authored Jun 22, 2023
```
Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)"

This reverts commit 285a4801.
```
  3ce3385c
21 Jun, 2023 1 commit

Fix gradient checkpointing + fp16 autocast for most models (#24247) · 285a4801

Younes Belkada authored Jun 21, 2023



* fix gc bug

* continue PoC on OPT

* fixes

* :exploding_head:

* fix tests

* remove pytest.mark

* fixup

* forward contrib credits from discussions

* forward contrib credits from discussions

* reverting changes on untouched files.

---------
Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com>
Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>

285a4801

13 Jun, 2023 1 commit

Tied params cleanup (#24211) · 695928e1

Sylvain Gugger authored Jun 13, 2023

* First test

* Add info for all models

* style

* Repo consistency

* Fix last model and cleanup prints

* Repo consistency

* Use consistent function for detecting tied weights

695928e1

06 Feb, 2023 1 commit

Update quality tooling for formatting (#21480) · 6f79d264

Sylvain Gugger authored Feb 06, 2023

* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies

6f79d264

23 Jan, 2023 1 commit

Models docstring (#21225) · fd5cdaee

Sylvain Gugger authored Jan 23, 2023

* Clean all models

* Style

* Last to remove

* address review comments

* Address review comments

fd5cdaee

09 Nov, 2022 1 commit

Attempting to test automatically the `_keys_to_ignore`. (#20042) · bac2d29a

Nicolas Patry authored Nov 09, 2022



* Attempting to test automatically the `_keys_to_ignore`.

* Style.

* First fix pass.

* Moving test on its own.

* Another batch.

* Second round removing BatchNorm

* Fixing layoutlmv{2,3} + support older Python.

* Disable miss missing warning.

* Removing dodgy additions.

* Big pass.

* mbart.

* More corrections.

* Fixup.

* Updating test_correct_missing_keys

* Add escape hatch for when the head has no extra params so doesn't need the missing keys check.

* Fixing test.

* Greener.

* Green ! (except for weird splinter bug).

* Adding a test about `named_parameters` usage.

* Shorten message.

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* After rebase modifications.

* More explicit condition checking.

* Fixing slow tests issues.

* Remove extra pdb.

* Remove print.

* Attempt to make failure consistent + fixing roc_bert.

* Removing the seed  (all tests passing with it).
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

bac2d29a

14 Sep, 2022 1 commit
- PyTorch >= 1.7.0 and TensorFlow >= 2.4.0 (#19016) · a2a3afbc
  Sylvain Gugger authored Sep 14, 2022
  
  a2a3afbc
03 Aug, 2022 1 commit

Fix torch version comparisons (#18460) · 02b176c4

LSinev authored Aug 03, 2022

Comparisons like
version.parse(torch.__version__) > version.parse("1.6")
are True for torch==1.6.0+cu101 or torch==1.6.0+cpu

Comparisons against version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py).

02b176c4
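
The pitfall described in the commit above can be reproduced without installing torch. The sketch below assumes the `packaging` library is available and uses a literal version string in place of `torch.__version__`; the variable names are illustrative only.

```python
from packaging import version

# Stand-in for torch.__version__ on a CUDA build (assumed value for illustration).
reported = "1.6.0+cu101"

# Naive comparison: the local segment "+cu101" makes this version sort
# after plain 1.6, so the check is True even though the release is 1.6.0.
naive = version.parse(reported) > version.parse("1.6")

# Comparing on base_version drops the local segment first.
base = version.parse(version.parse(reported).base_version)
robust = base > version.parse("1.6")

print(naive, robust)
```

Stripping the local build tag before comparing keeps a check such as "newer than 1.6" from passing on a 1.6.0 build.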

20 Jun, 2022 1 commit

Not use -1e4 as attn mask (#17306) · d3cb2888

Yih-Dar authored Jun 20, 2022



* Use torch.finfo(self.dtype).min

* for GPTNeoX

* for Albert

* For Splinter

* Update src/transformers/models/data2vec/modeling_data2vec_audio.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix -inf used in Bart-like models

* Fix a few remaining -inf

* more fix

* clean up

* For CLIP

* For FSMT

* clean up

* fix test

* Add dtype argument and use it for LayoutLMv3

* update FlaxLongT5Attention
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

d3cb2888
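
A hard-coded additive mask such as -1e4 only suppresses a position while the real attention scores stay well above -1e4. The pure-Python sketch below illustrates that effect with deliberately exaggerated scores. It uses `-sys.float_info.max` as a stand-in for `torch.finfo(self.dtype).min`; the helper names are hypothetical and this is not the library's implementation.

```python
import math
import sys


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax(scores, keep, mask_value):
    """Add mask_value to every position where keep is False, then softmax."""
    masked = [s if k else s + mask_value for s, k in zip(scores, keep)]
    return softmax(masked)


# Exaggerated scores: the two visible positions are far below -1e4.
scores = [-30000.0, -30000.0, 0.0]
keep = [True, True, False]  # the last position should receive no attention

weak = masked_softmax(scores, keep, -1e4)
strong = masked_softmax(scores, keep, -sys.float_info.max)

print(weak)
print(strong)
```

With the fixed -1e4 mask the masked position ends up with all of the weight, while the most negative finite value leaves the weight split between the two visible positions.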

12 May, 2022 1 commit

Black preview (#17217) · afe5d42d

Sylvain Gugger authored May 12, 2022

* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black

afe5d42d

04 May, 2022 1 commit

Type hint complete Albert model file. (#16682) · 9c5ae87f

karthikrangasai authored May 04, 2022



* Type hint complete Albert model file.

* Update typing.

* Update src/transformers/models/albert/modeling_albert.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

9c5ae87f

22 Apr, 2022 1 commit

Add doc tests for Albert and Bigbird (#16774) · 0d1cff11

Minh Chien Vu authored Apr 23, 2022



* Add doctest BERT

* make fixup

* fix typo

* change checkpoints

* make fixup

* define doctest output value, update doctest for mobilebert

* solve fix-copies

* update QA target start index and end index

* change checkpoint for docs and reuse defined variable

* Update src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* make fixup

* Add Doctest for Albert and Bigbird

* make fixup

* overwrite examples for Albert and Bigbird

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update longer examples for Bigbird

* using examples from squad_v2

* print out example text

* change name token-classification-big-bird checkpoint to random
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

0d1cff11

12 Apr, 2022 1 commit

Moved functions to pytorch_utils.py (#16625) · a315988b

Anmol Joshi authored Apr 12, 2022

* Moved functions to pytorch_utils.py

* isort formatting

* Reverted tf changes

* isort, make fix-copies

* documentation fix

* Fixed Conv1D import

* Reverted research examples file

* backward compatibility for pytorch_utils

* missing import

* isort fix

a315988b

25 Mar, 2022 1 commit
- Big file_utils cleanup (#16396) · 088c1880
  Sylvain Gugger authored Mar 25, 2022
```
* Big file_utils cleanup

* This one still needs to be treated separately
```
  088c1880
23 Mar, 2022 1 commit

Reorganize file utils (#16264) · 4975002d

Sylvain Gugger authored Mar 23, 2022

* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wrong move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit

4975002d

07 Feb, 2022 1 commit

FX tracing improvement (#14321) · 0fe17f37

Michael Benayoun authored Feb 07, 2022

* Change the way tracing happens, enabling dynamic axes out of the box

* Update the tests and modeling xlnet

* Add the non-recording of leaf modules to avoid recording more values for the methods to record than what will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).

* Comments and making tracing work for gpt-j and xlnet

* Refactor things related to num_choices (and batch_size, sequence_length)

* Update fx to work on PyTorch 1.10

* Postpone autowrap_function feature usage for later

* Add copyrights

* Remove unnecessary file

* Fix issue with add_new_model_like

* Apply suggestions

0fe17f37

31 Jan, 2022 1 commit

Fix loss calculation in TFXXXForTokenClassification models (#15294) · 554d333e

Yih-Dar authored Jan 31, 2022



* Fix loss calculation in TFFunnelForTokenClassification

* revert the change in TFFunnelForTokenClassification

* fix FunnelForTokenClassification loss

* fix other TokenClassification loss

* fix more

* fix more

* add num_labels to ElectraForTokenClassification

* revert the change to research projects
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

554d333e

28 Jan, 2022 1 commit

Fix missing eps arg for LayerNorm in ElectraGeneratorPredictions (#15332) · db079567

Yih-Dar authored Jan 29, 2022



* fix missing eps

* Same fix for ConvBertGeneratorPredictions

* Same fix for AlbertMLMHead
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

db079567

28 Dec, 2021 1 commit

Doc styler examples (#14953) · b5e2b183

Sylvain Gugger authored Dec 27, 2021

* Fix bad examples

* Add black formatting to style_doc

* Use first nonempty line

* Put it at the right place

* Don't add spaces to empty lines

* Better templates

* Deal with triple quotes in docstrings

* Result of style_doc

* Enable mdx treatment and fix code examples in MDXs

* Result of doc styler on doc source files

* Last fixes

* Break copy from

b5e2b183

27 Dec, 2021 1 commit

Doc styler v2 (#14950) · 87e6e4fe

Sylvain Gugger authored Dec 27, 2021

* New doc styler

* Fix issue with args at the start

* Code sample fixes

* Style code examples in MDX

* Fix more patterns

* Typo

* Typo

* More patterns

* Do without black for now

* Get more info in error

* Docstring style

* Re-enable check

* Quality

* Fix add_end_docstring decorator

* Fix docstring

87e6e4fe

21 Dec, 2021 1 commit

Convert docstrings of modeling files (#14850) · 7af80f66

Sylvain Gugger authored Dec 21, 2021

* Convert file_utils docstrings to Markdown

* Test on BERT

* Return block indent

* Temporarily disable doc styler

* Remove from quality checks as well

* Remove doc styler mess

* Remove check from circleCI

* Fix typo

* Convert file_utils docstrings to Markdown

* Test on BERT

* Return block indent

* Temporarily disable doc styler

* Remove from quality checks as well

* Remove doc styler mess

* Remove check from circleCI

* Fix typo

* Let's go on all other model files

* Add templates too

* Styling and quality

7af80f66

30 Nov, 2021 1 commit

use functional interface for softmax in attention (#14198) · 6ed9882d

Thomas Viehmann authored Nov 30, 2021

* use functional interface instead of instantiating module and immediately calling it

* fix torch.nn.functional to nn.functional. Thank you Stas!

6ed9882d

18 Nov, 2021 2 commits
- [Bert, et al] fix early device assignment (#14447) · 72a6bf33
  Stas Bekman authored Nov 18, 2021
```
* fix early device assignment

* more models
```
  72a6bf33
- Add a post init method to all models (#14431) · d83b0e0c
  Sylvain Gugger authored Nov 18, 2021
```
* Add a post init method to all models

* Fix tests

* Fix last tests

* Fix templates

* Add comment

* Forgot to save
```
  d83b0e0c
15 Oct, 2021 1 commit
- [Docs] More general docstrings (#14028) · f5af8736
  Patrick von Platen authored Oct 16, 2021
```
* up

* finish

* up

* up

* finish
```
  f5af8736
11 Oct, 2021 1 commit

Replace assert by ValueError of... · 3499728d

Lahfa Samy authored Oct 11, 2021


Replace assert by ValueError of src/transformers/models/electra/modeling_{electra,tf_electra}.py and all other models that had copies (#13955)

* Replace all assert by ValueError in src/transformers/models/electra

* Reformat with black to pass check_code_quality test

* Change some assert to ValueError of modeling_bert & modeling_tf_albert

* Change some assert in multiple models

* Change multiple models assertion to ValueError in order to validate
  check_code_style test and models template test.

* Black reformat

* Change some more asserts in multiple models

* Change assert to ValueError in modeling_layoutlm.py to fix copy error in code_style_check

* Add proper message to ValueError in modeling_tf_albert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/bert/modeling_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add ValueError message to models/convbert/modeling_tf_convbert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add error message for ValueError to modeling_tf_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/tapas/modeling_tapas.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/electra/modeling_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add ValueError message in src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in src/transformers/models/rembert/modeling_rembert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in src/transformers/models/albert/modeling_albert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

3499728d

17 Sep, 2021 1 commit

Optimize Token Classification models for TPU (#13096) · eae7a96b

Ibraheem Moosa authored Sep 17, 2021

* Optimize Token Classification models for TPU

According to the XLA documentation, XLA cannot handle masked indexing well, so the token classification
models for BERT and others use an implementation based on `torch.where`, which
works well on TPU.

The ALBERT token classification model uses masked indexing, which causes performance issues
on TPU. This PR fixes the issue by following the BERT implementation.

* Same fix for ELECTRA

* Same fix for LayoutLM

eae7a96b
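
The difference between the two approaches in the commit above can be illustrated with a pure-Python analog. Masked indexing produces an output whose length depends on the data, whereas a `torch.where`-style select keeps the shape fixed and marks padded positions with an ignore label. The sketch below uses plain lists instead of tensors, and the `IGNORE_INDEX` value and helper name are assumptions for illustration, not the library code.

```python
import math

IGNORE_INDEX = -100  # label value skipped by the loss (assumed for this sketch)


def cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean negative log-likelihood over tokens whose label is not ignored."""
    losses = []
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        losses.append(log_z - row[label])
    return sum(losses) / len(losses)


logits = [[2.0, 0.5], [0.1, 1.2], [0.3, 0.3], [1.5, -0.5]]
labels = [0, 1, 1, 0]
attention_mask = [1, 1, 0, 1]  # third token is padding

# Masked indexing: filter rows, so the result length depends on the mask.
active_logits = [row for row, m in zip(logits, attention_mask) if m == 1]
active_labels = [lab for lab, m in zip(labels, attention_mask) if m == 1]
loss_indexing = cross_entropy(active_logits, active_labels)

# where-style select: same length as the input, padded labels are replaced.
where_labels = [lab if m == 1 else IGNORE_INDEX for lab, m in zip(labels, attention_mask)]
loss_where = cross_entropy(logits, where_labels)

print(loss_indexing, loss_where)
```

Both paths yield the same loss, but only the second keeps every intermediate the same size regardless of how many tokens are padded.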

31 Aug, 2021 1 commit

Set missing seq_length variable when using inputs_embeds with ALBERT & Remove... · ef8d6f2b

Jongheon Kim authored Aug 31, 2021

Set missing seq_length variable when using inputs_embeds with ALBERT & Remove code duplication (#13152)

* Set seq_length variable when using inputs_embeds

* remove code duplication

ef8d6f2b

23 Aug, 2021 1 commit
- Fix load tf alias in Albert. (#13159) · f1bb6f08
  Allan Lin authored Aug 24, 2021
  
  f1bb6f08
12 Aug, 2021 1 commit

Fix classifier dropout in AlbertForMultipleChoice (#13087) · 3f52c685

Ibraheem Moosa authored Aug 12, 2021

The classification head of AlbertForMultipleChoice uses `hidden_dropout_prob` instead of `classifier_dropout_prob`. This
is not desirable, as we cannot change the classifier head dropout probability without changing the dropout probabilities of
the whole model.

3f52c685
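
The issue described in the commit above comes down to which configuration field the head reads. A minimal sketch, using a hypothetical `ToyConfig` dataclass rather than the real model configuration class:

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical stand-in holding only the two dropout fields named above."""
    hidden_dropout_prob: float = 0.0
    classifier_dropout_prob: float = 0.1


def head_dropout_before_fix(config: ToyConfig) -> float:
    # The head is coupled to the dropout used by the rest of the model.
    return config.hidden_dropout_prob


def head_dropout_after_fix(config: ToyConfig) -> float:
    # The head reads its own field, so it can be tuned independently.
    return config.classifier_dropout_prob


config = ToyConfig(hidden_dropout_prob=0.0, classifier_dropout_prob=0.3)
before = head_dropout_before_fix(config)
after = head_dropout_after_fix(config)
print(before, after)
```

Reading the dedicated field lets the classifier dropout change without touching the dropout applied elsewhere in the model.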

06 Aug, 2021 1 commit

Tpu tie weights (#13030) · 7fcee113

Sylvain Gugger authored Aug 06, 2021

* Fix tied weights on TPU

* Manually tie weights in no trainer examples

* Fix for test

* One last missing

* Getting owned by my scripts

* Address review comments

* Fix test

* Fix tests

* Fix reformer tests

7fcee113

26 Jul, 2021 1 commit

add `classifier_dropout` to classification heads (#12794) · 0c1c42c1

Philip May authored Jul 26, 2021



* add classifier_dropout to Electra

* no type annotations yet
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add classifier_dropout to Electra

* add classifier_dropout to Electra ForTokenClass.

* add classifier_dropout to bert

* add classifier_dropout to roberta

* add classifier_dropout to big_bird

* add classifier_dropout to mobilebert

* empty commit to trigger CI

* add classifier_dropout to reformer

* add classifier_dropout to ConvBERT

* add classifier_dropout to Albert

* add classifier_dropout to Albert
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

0c1c42c1

28 Jun, 2021 1 commit
- Remove the need for `einsum` in Albert's attention computation (#12394) · a7d0b288
  Funtowicz Morgan authored Jun 28, 2021
```
* debug albert einsum

* Fix matmul computation

* Let's use torch linear layer.

* Style.
```
  a7d0b288
22 Jun, 2021 1 commit

Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252) · af6e01c5

Hamid Shojanazeri authored Jun 22, 2021



* registering a buffer for token_type_ids, to pass the error of device-id getting hardcoded when tracing

* style format

* adding persistent flag to the registered buffers, which prevents them from being added to the state_dict and addresses the backward compatibility issue

* adding the try catch to the fix as persistent flag is only available from PT >1.6

* adding version check

* added the condition to only use the token_type_ids buffer when it is autogenerated, not passed by the user

* adding comments and making the condition where token_type_ids are None to use the registered buffer

* taking out position-embedding from the if block

* adding comments

* handling the case if buffer for position_ids was not registered

* reverted the changes on position_ids, fix the issue with size of token_type_ids buffer, moved the modification for generated token_type_ids to Bertmodel, instead of Embeddings

* reverting the token_type_ids in case of None to the previous version

* reverting changes on position_ids adding back the if block

* changes added by running make fix-copies

* changes added by running make fix-copies and added the import version as it was getting used

* changes added by running make fix-copies

* changes added by running make fix-copies

* fixing the import format

* fixing the import format

* modified to use temp tensor for trimmed and expanded token_type_ids buffer

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* clean up

* clean up

* clean up

* clean up

* Nit

* Nit

* Nit

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* changes based on latest in master

* Adapt templates

* Add version import
Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

af6e01c5

14 Jun, 2021 1 commit
- [style] consistent nn. and nn.functional (#12124) · 1ed2ebf6
  Stas Bekman authored Jun 14, 2021
```
* consistent nn. and nn.functional

* fix glitch

* fix glitch #2
```
  1ed2ebf6