Commits · 47489a6974574c3b6a550be4da65325525965c45 · chenpangpang / transformers

11 Oct, 2021 1 commit

Replace assert by ValueError of... · 3499728d

Lahfa Samy authored Oct 11, 2021


Replace assert by ValueError of src/transformers/models/electra/modeling_{electra,tf_electra}.py and all other models that had copies (#13955)

* Replace all assert by ValueError in src/transformers/models/electra

* Reformat with black to pass check_code_quality test

* Change some assert to ValueError of modeling_bert & modeling_tf_albert

* Change some asserts in multiple models

* Change assertions in multiple models to ValueError in order to validate
  check_code_style test and models template test.

* Black reformat

* Change some more asserts in multiple models

* Change assert to ValueError in modeling_layoutlm.py to fix copy error in code_style_check

* Add proper message to ValueError in modeling_tf_albert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/bert/modeling_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add ValueError message to models/convbert/modeling_tf_convbert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add error message for ValueError to modeling_tf_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/tapas/modeling_tapas.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/electra/modeling_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add ValueError message in src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in src/transformers/models/rembert/modeling_rembert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in src/transformers/models/albert/modeling_albert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

3499728d
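The motivation for replacing `assert` with `ValueError` in #13955 can be illustrated with a minimal sketch. The function name `check_hidden_size` and the message text below are hypothetical, not taken from the commit; the sketch only shows that an explicit `raise` keeps validating inputs even when Python runs with `-O`, which strips `assert` statements.

```python
def check_hidden_size(hidden_size: int, num_attention_heads: int) -> int:
    """Return the per-head size, or raise if the sizes are incompatible."""
    # Old style (disabled under `python -O`):
    #   assert hidden_size % num_attention_heads == 0, "..."
    # New style: an explicit check that always runs.
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) is not a multiple of "
            f"num_attention_heads ({num_attention_heads})"
        )
    return hidden_size // num_attention_heads
```

Callers can then catch `ValueError` specifically instead of the much broader `AssertionError`.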

17 Sep, 2021 1 commit

Optimize Token Classification models for TPU (#13096) · eae7a96b

Ibraheem Moosa authored Sep 17, 2021

* Optimize Token Classification models for TPU

As per the XLA documentation, XLA cannot handle masked indexing well. So the token classification
models for BERT and others use an implementation based on `torch.where`. This implementation
works well on TPU.

The ALBERT token classification model uses masked indexing, which causes performance issues
on TPU. This PR fixes the issue by following the BERT implementation.

* Same fix for ELECTRA

* Same fix for LayoutLM

eae7a96b
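The two loss formulations contrasted in #13096 can be sketched as follows, assuming PyTorch is installed. The tensor shapes and values are invented for illustration and this is not the exact model code; it only shows that masked indexing produces a mask-dependent shape, while `torch.where` keeps shapes fixed by writing the loss function's `ignore_index` into padded positions.

```python
import torch
from torch import nn

torch.manual_seed(0)
num_labels = 3
logits = torch.randn(2, 4, num_labels)         # (batch, seq_len, num_labels)
labels = torch.randint(0, num_labels, (2, 4))  # (batch, seq_len)
attention_mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])

loss_fct = nn.CrossEntropyLoss()               # ignore_index defaults to -100
active = attention_mask.view(-1) == 1

# Masked indexing: the result shape depends on how many mask entries are 1.
loss_indexed = loss_fct(
    logits.view(-1, num_labels)[active], labels.view(-1)[active]
)

# torch.where: shapes stay static; padded positions are set to ignore_index.
ignore = torch.tensor(loss_fct.ignore_index).type_as(labels)
active_labels = torch.where(active, labels.view(-1), ignore)
loss_where = loss_fct(logits.view(-1, num_labels), active_labels)
```

Both variants average over the same unmasked tokens, so the two losses agree numerically; only the shapes seen by the compiler differ.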

31 Aug, 2021 1 commit

Set missing seq_length variable when using inputs_embeds with ALBERT & Remove... · ef8d6f2b

Jongheon Kim authored Aug 31, 2021

Set missing seq_length variable when using inputs_embeds with ALBERT & Remove code duplication (#13152)

* Set seq_length variable when using inputs_embeds

* remove code duplication

ef8d6f2b

23 Aug, 2021 1 commit

Fix load tf alias in Albert. (#13159) · f1bb6f08

Allan Lin authored Aug 24, 2021

f1bb6f08

12 Aug, 2021 1 commit

Fix classifier dropout in AlbertForMultipleChoice (#13087) · 3f52c685

Ibraheem Moosa authored Aug 12, 2021

The classification head of AlbertForMultipleChoice uses `hidden_dropout_prob` instead of `classifier_dropout_prob`. This
is not desirable, as we cannot change the classifier head dropout probability without changing the dropout probabilities of
the whole model.

3f52c685

06 Aug, 2021 1 commit

Tpu tie weights (#13030) · 7fcee113

Sylvain Gugger authored Aug 06, 2021

* Fix tied weights on TPU

* Manually tie weights in no trainer examples

* Fix for test

* One last missing

* Getting owned by my scripts

* Address review comments

* Fix test

* Fix tests

* Fix reformer tests

7fcee113

26 Jul, 2021 1 commit

add `classifier_dropout` to classification heads (#12794) · 0c1c42c1

Philip May authored Jul 26, 2021



* add classifier_dropout to Electra

* no type annotations yet
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add classifier_dropout to Electra

* add classifier_dropout to Electra ForTokenClass.

* add classifier_dropout to bert

* add classifier_dropout to roberta

* add classifier_dropout to big_bird

* add classifier_dropout to mobilebert

* empty commit to trigger CI

* add classifier_dropout to reformer

* add classifier_dropout to ConvBERT

* add classifier_dropout to Albert

* add classifier_dropout to Albert
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

0c1c42c1
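How a separate `classifier_dropout` option might be resolved (#12794) can be sketched in plain Python. This assumes the option falls back to the model-wide hidden dropout when it is left unset; `ToyConfig` and `resolve_classifier_dropout` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a model configuration."""
    hidden_dropout_prob: float = 0.1
    classifier_dropout: Optional[float] = None


def resolve_classifier_dropout(config: ToyConfig) -> float:
    # Compare against None rather than truthiness, so an explicit 0.0
    # (dropout disabled on the head) is respected.
    if config.classifier_dropout is not None:
        return config.classifier_dropout
    return config.hidden_dropout_prob
```

With this shape, the head's dropout can be tuned without touching the dropout used in the rest of the model.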

28 Jun, 2021 1 commit

Remove the need for `einsum` in Albert's attention computation (#12394) · a7d0b288

Funtowicz Morgan authored Jun 28, 2021

* debug albert einsum

* Fix matmul computation

* Let's use torch linear layer.

* Style.

a7d0b288

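The idea behind dropping `einsum` in #12394 can be sketched as an equivalence check, assuming PyTorch is installed. The dimensions are invented and the einsum subscripts are an assumption about the general pattern, not a quote of the model code: projecting per-head context vectors with a reshaped weight gives the same result as merging the head dimensions and calling a plain linear layer.

```python
import torch
from torch import nn

torch.manual_seed(0)
batch, seq_len, num_heads, head_size = 2, 5, 4, 8
hidden_size = num_heads * head_size

dense = nn.Linear(hidden_size, hidden_size)
context = torch.randn(batch, seq_len, num_heads, head_size)

# einsum form: view the linear weight as (heads, head_size, hidden).
w = dense.weight.t().reshape(num_heads, head_size, hidden_size)
out_einsum = torch.einsum("bfnd,ndh->bfh", context, w) + dense.bias

# Linear form: merge (heads, head_size) into one axis, then apply the layer.
out_linear = dense(context.flatten(2))
```

The linear form avoids the weight reshape and relies on a standard matmul path.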
22 Jun, 2021 1 commit

Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252) · af6e01c5

Hamid Shojanazeri authored Jun 22, 2021



* registering a buffer for token_type_ids, to get past the error of the device-id getting hardcoded when tracing

* style format

* adding the persistent flag to the registered buffers, which prevents them from being added to the state_dict and addresses the backward compatibility issue

* adding the try catch to the fix as the persistent flag is only available from PT >1.6

* adding version check

* added the condition to only use the token_type_ids buffer when it's autogenerated, not passed by the user

* adding comments and making the condition where token_type_ids are None use the registered buffer

* taking out position-embedding from the if block

* adding comments

* handling the case if buffer for position_ids was not registered

* reverted the changes on position_ids, fix the issue with size of token_type_ids buffer, moved the modification for generated token_type_ids to Bertmodel, instead of Embeddings

* reverting the token_type_ids in case of None to the previous version

* reverting changes on position_ids adding back the if block

* changes added by running make fix-copies

* changes added by running make fix-copies and added the import version as it was getting used

* changes added by running make fix-copies

* changes added by running make fix-copies

* fixing the import format

* fixing the import format

* modified to use temp tensor for trimmed and expanded token_type_ids buffer

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* clean up

* clean up

* clean up

* clean up

* Nit

* Nit

* Nit

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* changes based on latest in master

* Adapt templates

* Add version import
Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

af6e01c5
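The buffer approach described in #11252 can be sketched as below, assuming a PyTorch version that supports the `persistent` argument. `ToyEmbeddings` is a hypothetical module, not the library's class: it keeps default `token_type_ids` in a non-persistent buffer, so the tensor follows the module across devices but stays out of the `state_dict`, and it is used only when the caller passes nothing.

```python
import torch
from torch import nn


class ToyEmbeddings(nn.Module):
    """Hypothetical module illustrating a non-persistent default buffer."""

    def __init__(self, max_positions: int = 16):
        super().__init__()
        # persistent=False keeps the buffer out of state_dict, so older
        # checkpoints still load without unexpected or missing keys.
        self.register_buffer(
            "token_type_ids",
            torch.zeros((1, max_positions), dtype=torch.long),
            persistent=False,
        )

    def forward(self, input_ids, token_type_ids=None):
        if token_type_ids is None:
            batch_size, seq_len = input_ids.shape
            # Trim the buffer to the sequence length, then expand over the batch.
            token_type_ids = self.token_type_ids[:, :seq_len].expand(batch_size, seq_len)
        return token_type_ids
```

Because the default comes from a buffer rather than a tensor created inside `forward`, tracing does not bake in the device on which the trace was recorded.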

14 Jun, 2021 1 commit

[style] consistent nn. and nn.functional (#12124) · 1ed2ebf6

Stas Bekman authored Jun 14, 2021

* consistent nn. and nn.functional

* fix glitch

* fix glitch #2

1ed2ebf6

07 Jun, 2021 1 commit

Fixes bug that appears when using QA bert and distillation. (#12026) · f8bd8c6c

François Lagunas authored Jun 07, 2021

* Fixing bug that appears when using distillation (and potentially other uses).
During the backward pass PyTorch complains with:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
This happens because the QA model code modifies the start_positions and end_positions input tensors, using the clamp_ function: as a consequence the teacher and the student both modify the inputs, and the backward pass fails.

* Fixing all models QA clamp_ bug.

f8bd8c6c
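The difference between `clamp_` and `clamp` at the heart of #12026 can be sketched with a few tensors, assuming PyTorch is installed. The values and the name `ignored_index` are invented for illustration: the in-place variant mutates the caller's tensor, which is what breaks a setup where a teacher and a student model share the same inputs, while the out-of-place variant returns a new tensor.

```python
import torch

start_positions = torch.tensor([3, 99, 7])
ignored_index = 10

# In-place: the tensor that was passed in is modified.
shared = start_positions.clone()
shared.clamp_(0, ignored_index)

# Out-of-place: the caller's tensor is left untouched.
clamped = start_positions.clamp(0, ignored_index)
```

After this runs, `start_positions` still holds its original values, whereas `shared` has been overwritten.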

01 Jun, 2021 1 commit

modify qa-trainer (#11872) · 7e73601f

Fan Zhang authored Jun 01, 2021

* modify qa-trainer

* fix flax model

7e73601f

20 May, 2021 1 commit

Fix regression in regression (#11785) · 469384a7

Sylvain Gugger authored May 20, 2021

* Fix regression in regression

* Add test

469384a7

06 May, 2021 1 commit

fix head_mask for albert encoder part(`AlbertTransformer`) (#11596) · c1780ce7

baeseongsu authored May 06, 2021

* fix head mask for albert encoder part

* fix head_mask for albert encoder part

c1780ce7

04 May, 2021 1 commit

Add multi-class, multi-label and regression to transformers (#11012) · c40c7e21

abhishek thakur authored May 04, 2021



* add to bert

* review comments

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* self.config.problem_type

* fix style

* fix

* fin

* fix

* update doc

* fix

* test

* Test more problem types

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix

* remove

* fix

* quality

* make fix-copies

* remove test
Co-authored-by: abhishek thakur <abhishekkrthakur@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

c40c7e21
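The multi-class, multi-label and regression support added in #11012 centres on a `problem_type` setting. A minimal sketch of how such a setting might be inferred when it is not given explicitly is shown below; the function `infer_problem_type`, its `labels_are_integer` flag, and the loss-name mapping are assumptions for illustration and not the library's code.

```python
from typing import Optional


def infer_problem_type(
    num_labels: int,
    labels_are_integer: bool,
    problem_type: Optional[str] = None,
) -> str:
    """Pick a problem type, honouring an explicit setting when present."""
    if problem_type is not None:
        return problem_type
    if num_labels == 1:
        return "regression"
    if labels_are_integer:
        # One integer class id per example.
        return "single_label_classification"
    # Float (multi-hot) targets with several labels.
    return "multi_label_classification"


# Typical loss for each problem type, listed by PyTorch class name.
LOSS_FOR_PROBLEM = {
    "regression": "MSELoss",
    "single_label_classification": "CrossEntropyLoss",
    "multi_label_classification": "BCEWithLogitsLoss",
}
```

Setting `problem_type` explicitly bypasses the inference, which is useful when the label dtype alone is ambiguous.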

26 Apr, 2021 1 commit

make style (#11442) · 32dbb2d9

Patrick von Platen authored Apr 26, 2021

32dbb2d9

31 Mar, 2021 1 commit

Enforce string-formatting with f-strings (#10980) · acc3bd9d

Sylvain Gugger authored Mar 31, 2021



* First third

* Styling and fix mistake

* Quality

* All the rest

* Treat %s and %d

* typo

* Missing )

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

acc3bd9d
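The kind of rewrite enforced by #10980 can be shown with a short comparison. The variable names and message text are invented for illustration; the three expressions produce the same string, and the f-string keeps each value next to the place where it is used.

```python
name, num_layers = "albert", 12

# Older styles: %-formatting and str.format.
old_percent = "Loading %s with %d layers" % (name, num_layers)
old_format = "Loading {} with {} layers".format(name, num_layers)

# f-string: values are interpolated inline.
new_fstring = f"Loading {name} with {num_layers} layers"
```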

05 Mar, 2021 2 commits

Refactoring checkpoint names for multiple models (#10527) · 90ecc296

Daniel Hug authored Mar 05, 2021

* Refactor checkpoint name in ALBERT and ALBERT_tf

* Refactor checkpoint name in BART and BART_tf

* Refactor checkpoint name in BERT generation

* Refactor checkpoint name in Blenderbot_tf

* Refactor checkpoint name in Blenderbot_small_tf

* Refactor checkpoint name in ConvBERT AND CONVBERT_TF

* Refactor checkpoint name in CTRL AND CTRL_TF

* Refactor checkpoint name in DistilBERT AND DistilBERT_TF

* Refactor checkpoint name in DistilBERT redo

* Refactor checkpoint name in Electra and Electra_tf

* Refactor checkpoint name in FlauBERT and FlauBERT_tf

* Refactor checkpoint name in FSMT

* Refactor checkpoint name in GPT2 and GPT2_tf

* Refactor checkpoint name in IBERT

* Refactor checkpoint name in LED and LED_tf

* Refactor checkpoint name in Longformer and Longformer_tf

* Refactor checkpoint name in Lxmert and Lxmert_tf

* Refactor checkpoint name in Marian_tf

* Refactor checkpoint name in MBART and MBART_tf

* Refactor checkpoint name in MobileBERT and MobileBERT_tf

* Refactor checkpoint name in mpnet and mpnet_tf

* Refactor checkpoint name in openai and openai_tf

* Refactor checkpoint name in pegasus_tf

* Refactor checkpoint name in reformer

* Refactor checkpoint name in Roberta and Roberta_tf

* Refactor checkpoint name in SqueezeBert

* Refactor checkpoint name in Transformer_xl and Transformer_xl_tf

* Refactor checkpoint name in XLM and XLM_tf

* Refactor checkpoint name in XLNET and XLNET_tf

* Refactor checkpoint name in BERT_tf

* run make tests, style, quality, fixup

90ecc296

Fix embeddings for PyTorch 1.8 (#10549) · 7da995c0

Sylvain Gugger authored Mar 05, 2021

* Fix embeddings for PyTorch 1.8

* Try with PyTorch 1.8.0

* Fix embeddings init

* Fix copies

* Typo

* More typos

7da995c0

23 Dec, 2020 1 commit

Add caching mechanism to BERT, RoBERTa (#9183) · 88ef8893

Suraj Patil authored Dec 23, 2020

* add past_key_values

* add use_cache option

* make mask before cutting ids

* adjust position_ids according to past_key_values

* flatten past_key_values

* fix positional embeds

* fix _reorder_cache

* set use_cache to false when not decoder, fix attention mask init

* add test for caching

* add past_key_values for Roberta

* fix position embeds

* add caching test for roberta

* add doc

* make style

* doc, fix attention mask, test

* small fixes

* address patrick's comments

* input_ids shouldn't start with pad token

* use_cache only when decoder

* make consistent with bert

* make copies consistent

* add use_cache to encoder

* add past_key_values to tapas attention

* apply suggestions from code review

* make copies consistent

* add attn mask in tests

* remove copied from longformer

* apply suggestions from code review

* fix bart test

* nit

* simplify model outputs

* fix doc

* fix output ordering

88ef8893

02 Dec, 2020 1 commit

[PyTorch] Refactor Resize Token Embeddings (#8880) · 443f67e8

Patrick von Platen authored Dec 02, 2020

* fix resize tokens

* correct mobile_bert

* move embedding fix into modeling_utils.py

* refactor

* fix lm head resize

* refactor

* break lines to make sylvain happy

* add new tests

* fix typo

* improve test

* skip bart-like for now

* check if base_model = get(...) is necessary

* clean files

* improve test

* fix tests

* revert style templates

* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py

443f67e8

27 Nov, 2020 1 commit

Fix dpr<>bart config for RAG (#8808) · a7d46a06

Patrick von Platen authored Nov 27, 2020

* correct dpr test and bert pos fault

* fix dpr bert config problem

* fix layoutlm

* add config to dpr as well

a7d46a06

25 Nov, 2020 1 commit

[XLNet] Fix mems behavior (#8567) · 2a6fbe6a

Patrick von Platen authored Nov 25, 2020

* fix mems in xlnet

* fix use_mems

* fix use_mem_len

* fix use mems

* clean docs

* fix tf typo

* make xlnet tf for generation work

* fix tf test

* refactor use cache

* add use cache for missing models

* correct use_cache in generate

* correct use cache in tf generate

* fix tf

* correct getattr typo

* make sylvain happy

* change in docs as well

* do not apply to cookie cutter statements

* fix tf test

* make pytorch model fully backward compatible

2a6fbe6a

24 Nov, 2020 1 commit

Support various BERT relative position embeddings (2nd) (#8276) · 2c83b3c3

zhiheng-huang authored Nov 24, 2020



* Support BERT relative position embeddings

* Fix typo in README.md

* Address review comment

* Fix failing tests

* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py

* make fix copies

* fix configs of electra and albert and fix longformer

* remove copy statement from longformer

* fix albert

* fix electra

* Add bert variants forward tests for various position embeddings

* [tiny] Fix style for test_modeling_bert.py

* improve docstring

* [tiny] improve docstring and remove unnecessary dependency

* [tiny] Remove unused import

* re-add to ALBERT

* make embeddings work for ALBERT

* add test for albert
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

2c83b3c3

23 Nov, 2020 1 commit

consistent ignore keys + make private (#8737) · e84786aa

Stas Bekman authored Nov 23, 2020

* consistent ignore keys + make private

* style

* - authorized_missing_keys    => _keys_to_ignore_on_load_missing
  - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected

* move public doc of private attributes to private comment

e84786aa
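The renamed attributes in #8737 hold patterns for checkpoint keys that should not be reported when loading. A minimal sketch of that filtering, assuming the patterns are regular expressions matched with `re.search`, is shown below; the helper `drop_ignored` and the example key names are hypothetical.

```python
import re
from typing import Iterable, List

# Example pattern list in the style of the renamed private attribute.
_keys_to_ignore_on_load_missing = [r"position_ids"]


def drop_ignored(keys: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the keys that match none of the ignore patterns."""
    patterns = list(patterns)
    return [k for k in keys if not any(re.search(p, k) for p in patterns)]


missing = ["embeddings.position_ids", "classifier.weight"]
reported = drop_ignored(missing, _keys_to_ignore_on_load_missing)
```

Only `classifier.weight` would be reported here, since the other key matches an ignore pattern.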

17 Nov, 2020 2 commits

Remove deprecated (#8604) · dd52804f

Sylvain Gugger authored Nov 17, 2020



* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

dd52804f

Reorganize repo (#8580) · c89bdfbe

Sylvain Gugger authored Nov 16, 2020

* Put models in subfolders

* Styling

* Fix imports in tests

* More fixes in test imports

* Sneaky hidden imports

* Fix imports in doc files

* More sneaky imports

* Finish fixing tests

* Fix examples

* Fix path for copies

* More fixes for examples

* Fix dummy files

* More fixes for example

* More model import fixes

* Is this why you're unhappy GitHub?

* Fix imports in convert command

c89bdfbe

16 Nov, 2020 1 commit

Switch `return_dict` to `True` by default. (#8530) · 1073a2bd

Sylvain Gugger authored Nov 16, 2020

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests

1073a2bd

30 Oct, 2020 1 commit

Doc fixes and filter warning in wandb (#8189) · 089cc101

Sylvain Gugger authored Oct 30, 2020

089cc101

28 Oct, 2020 1 commit

Rename add_start_docstrings_to_callable (#8120) · 378142af

Sylvain Gugger authored Oct 28, 2020

378142af

26 Oct, 2020 1 commit

Doc styling (#8067) · 08f534d2

Sylvain Gugger authored Oct 26, 2020

* Important files

* Styling them all

* Revert "Styling them all"

This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.

* Styling them for realsies

* Fix syntax error

* Fix benchmark_utils

* More fixes

* Fix modeling auto and script

* Remove new line

* Fixes

* More fixes

* Fix more files

* Style

* Add FSMT

* More fixes

* More fixes

* More fixes

* More fixes

* Fixes

* More fixes

* More fixes

* Last fixes

* Make sphinx happy

08f534d2

12 Oct, 2020 1 commit

Fix typo in all model docs (#7714) · 13c18577

Sylvain Gugger authored Oct 12, 2020

13c18577

25 Sep, 2020 1 commit

[Longformer, Bert, Roberta, ...] Fix multi gpu training (#7272) · e50a931c

Patrick von Platen authored Sep 25, 2020

* fix multi-gpu

* fix longformer

* force to delete unnecessary layers

* fix notifications

* fix warning

* fix roberta

* fix tests

* remove hasattr

* fix tests

* fix roberta

* merge and clean authorized keys

e50a931c

24 Sep, 2020 1 commit

Make PyTorch model files independent from each other (#7352) · 27174bd4

Sylvain Gugger authored Sep 24, 2020

27174bd4

23 Sep, 2020 1 commit

Models doc (#7345) · 3323146e

Sylvain Gugger authored Sep 23, 2020



* Clean up model documentation

* Formatting

* Preparation work

* Long lines

* Main work on rst files

* Cleanup all config files

* Syntax fix

* Clean all tokenizers

* Work on first models

* Models beginning

* FlauBERT

* All PyTorch models

* All models

* Long lines again

* Fixes

* More fixes

* Update docs/source/model_doc/bert.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update docs/source/model_doc/electra.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Last fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

3323146e

04 Sep, 2020 1 commit

[doc] remove the implied defaults to :obj:`None`, s/True/ :obj:`True/, etc. (#6956) · 48ff6d51

Stas Bekman authored Sep 04, 2020

* remove the implied defaults to :obj:`None`

* fix bug in the original

* replace to :obj:`True`, :obj:`False`

48ff6d51

26 Aug, 2020 3 commits

Black 20 release · a75c64d8

Lysandre authored Aug 26, 2020

a75c64d8

Centralize logging (#6434) · 77abd1e7

Lysandre Debut authored Aug 26, 2020



* Logging

* Style

* hf_logging > utils.logging

* Address @thomwolf's comments

* Update test

* Update src/transformers/benchmark/benchmark_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Revert bad change
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

77abd1e7

Add "tie_word_embeddings" config param (#6692) · 925f34bb

Patrick von Platen authored Aug 26, 2020

* add tie_word_embeddings

* correct word embeddings in modeling utils

* make style

* make config param only relevant for torch

* make style

* correct typo

* delete deprecated arg in transfo-xl

925f34bb

25 Aug, 2020 1 commit

add missing keys (#6719) · d17cce22

Patrick von Platen authored Aug 25, 2020

d17cce22