Commits · 02b176c4ce14340d26d42825523f406959c6c202 · chenpangpang / transformers

03 Aug, 2022 1 commit

Fix torch version comparisons (#18460) · 02b176c4

LSinev authored Aug 03, 2022

Comparisons like
version.parse(torch.__version__) > version.parse("1.6")
are True for torch==1.6.0+cu101 or torch==1.6.0+cpu

Comparisons against version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py).
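
A minimal sketch of the pitfall described above. It uses a literal version string as a stand-in for `torch.__version__` (an assumed value, chosen for illustration) so it runs without PyTorch installed; only the `packaging` library is required.

```python
from packaging import version

# Stand-in for torch.__version__ on a CUDA build (assumed value for illustration).
torch_version = "1.6.0+cu101"

# A local version label ("+cu101") sorts after the plain release, so this
# "strictly newer than 1.6" check passes on 1.6.0 itself.
naive = version.parse(torch_version) > version.parse("1.6")

# Comparing on base_version drops the local label before the comparison.
base = version.parse(version.parse(torch_version).base_version)
fixed = base > version.parse("1.6")

print(naive, fixed)
```

With the base version, 1.6.0 compares equal to 1.6, so the strict comparison no longer succeeds by accident.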

02b176c4

23 Jun, 2022 1 commit

Add missing type hints for QDQBertModel (#17783) · d37a68e6

willtai authored Jun 23, 2022

* Feat: add missing type hints for QDQBertModel

* fix: ran black and isort

* feat: Add missing output type for QDQBertModel

* feat: Add type hints for QDQBertLMHeadModel and models starting with QDQBertFor

* fix: add missing return type for QDQBertModel

* fix: remove wrong return type for QDQBertEmbeddings

* fix: readded config argument to load_tf_weights_in_qdqbert

* fix: add BertConfig type to BertEmbeddings config due to check error in CI

* fix: removed config type hints to avoid copy checks

d37a68e6

12 May, 2022 1 commit

Black preview (#17217) · afe5d42d

Sylvain Gugger authored May 12, 2022

* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black

afe5d42d

04 May, 2022 1 commit

Type hint complete Albert model file. (#16682) · 9c5ae87f

karthikrangasai authored May 04, 2022



* Type hint complete Albert model file.

* Update typing.

* Update src/transformers/models/albert/modeling_albert.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

9c5ae87f

03 May, 2022 1 commit
- Remove device parameter from create_extended_attention_mask_for_decoder (#16894) · 39f8eafc
  Pavel Belevich authored May 03, 2022
  
  39f8eafc
12 Apr, 2022 1 commit

Moved functions to pytorch_utils.py (#16625) · a315988b

Anmol Joshi authored Apr 12, 2022

* Moved functions to pytorch_utils.py

* isort formatting

* Reverted tf changes

* isort, make fix-copies

* documentation fix

* Fixed Conv1D import

* Reverted research examples file

* backward compatibility for pytorch_utils

* missing import

* isort fix

a315988b

25 Mar, 2022 1 commit
- Big file_utils cleanup (#16396) · 088c1880
  Sylvain Gugger authored Mar 25, 2022
```
* Big file_utils cleanup

* This one still needs to be treated separately
```
  088c1880
23 Mar, 2022 1 commit

Reorganize file utils (#16264) · 4975002d

Sylvain Gugger authored Mar 23, 2022

* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wrong move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit

4975002d

22 Mar, 2022 1 commit

Add type annotations for Rembert/Splinter and copies (#16338) · ec3aace0

Jacob Dineen authored Mar 22, 2022



* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`
Co-authored-by: matt <rocketknight1@gmail.com>

ec3aace0

31 Jan, 2022 1 commit

Fix loss calculation in TFXXXForTokenClassification models (#15294) · 554d333e

Yih-Dar authored Jan 31, 2022



* Fix loss calculation in TFFunnelForTokenClassification

* revert the change in TFFunnelForTokenClassification

* fix FunnelForTokenClassification loss

* fix other TokenClassification loss

* fix more

* fix more

* add num_labels to ElectraForTokenClassification

* revert the change to research projects
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

554d333e

28 Dec, 2021 1 commit

Doc styler examples (#14953) · b5e2b183

Sylvain Gugger authored Dec 27, 2021

* Fix bad examples

* Add black formatting to style_doc

* Use first nonempty line

* Put it at the right place

* Don't add spaces to empty lines

* Better templates

* Deal with triple quotes in docstrings

* Result of style_doc

* Enable mdx treatment and fix code examples in MDXs

* Result of doc styler on doc source files

* Last fixes

* Break copy from

b5e2b183

27 Dec, 2021 1 commit

Doc styler v2 (#14950) · 87e6e4fe

Sylvain Gugger authored Dec 27, 2021

* New doc styler

* Fix issue with args at the start

* Code sample fixes

* Style code examples in MDX

* Fix more patterns

* Typo

* Typo

* More patterns

* Do without black for now

* Get more info in error

* Docstring style

* Re-enable check

* Quality

* Fix add_end_docstring decorator

* Fix docstring

87e6e4fe

21 Dec, 2021 1 commit

Convert docstrings of modeling files (#14850) · 7af80f66

Sylvain Gugger authored Dec 21, 2021

* Convert file_utils docstrings to Markdown

* Test on BERT

* Return block indent

* Temporarily disable doc styler

* Remove from quality checks as well

* Remove doc styler mess

* Remove check from circleCI

* Fix typo

* Let's go on all other model files

* Add templates too

* Styling and quality

7af80f66

19 Nov, 2021 1 commit

Add QDQBert model and quantization examples of SQUAD task (#14066) · a59e7c1e

Shang Zhang authored Nov 19, 2021



* clean up branch for add-qdqbert-model

* README update for QAT example; update docstrings in modeling_qdqbert.py

* Update qdqbert.rst

* Update README.md

* Update README.md

* calibration data using training set; QAT example runs in fp32

* re-use BERTtokenizer for qdqbert

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove qdqbert tokenizer

* Update qdqbert.rst

* update evaluate-hf-trt-qa.py

* update configuration_qdqbert.py

* update modeling_qdqbert.py: add copied statement; replace assert with ValueError

* update copied from statement

* add is_quantization_available; run make fix-copies

* unittest add require_quantization

* add backend dependency to qdqbert model

* update README; update evaluate script; make style

* lint

* docs qdqbert update

* circleci build_doc add pytorch-quantization for qdqbert

* update README

* update example readme with instructions to upgrade TensorRT to 8.2

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* change quantization to pytorch_quantization for backend requirement

* feed_forward_chunking not supported in QDQBert

* make style

* update model docstrings and comments in testing scripts

* rename example to quantization-qdqbert; rename example scripts from qat to quant

* Update src/transformers/models/qdqbert/modeling_qdqbert.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* rm experimental functions in quant_trainer

* qa cleanup

* make fix-copies for docs index.rst

* fix doctree; use post_init() for qdqbert

* fix early device assignment for qdqbert

* fix CI:Model templates runner
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

a59e7c1e

18 Nov, 2021 2 commits
- [Bert, et al] fix early device assignment (#14447) · 72a6bf33
  Stas Bekman authored Nov 18, 2021
```
* fix early device assignment

* more models
```
  72a6bf33
- Add a post init method to all models (#14431) · d83b0e0c
  Sylvain Gugger authored Nov 18, 2021
```
* Add a post init method to all models

* Fix tests

* Fix last tests

* Fix templates

* Add comment

* Forgot to save
```
  d83b0e0c
09 Nov, 2021 1 commit
- [Bert2Bert] allow bert2bert + relative embeddings (#14324) · e81d8d7f
  Patrick von Platen authored Nov 09, 2021
```
* [Bert2Bert] allow bert2bert + relative embeddings

* up

* Update README_ko.md

* up

* up
```
  e81d8d7f
01 Nov, 2021 1 commit
- Raising exceptions instead of using assertions for few models (#14219) · 33fb9833
  Prabhudatta Das authored Nov 01, 2021
```
* raising exceptions instead of using assertions for few models

* fixed formatting issues

* fixing copy inconsistencies
```
  33fb9833
15 Oct, 2021 1 commit
- [Docs] More general docstrings (#14028) · f5af8736
  Patrick von Platen authored Oct 16, 2021
```
* up

* finish

* up

* up

* finish
```
  f5af8736
11 Oct, 2021 1 commit

Replace assert by ValueError of... · 3499728d

Lahfa Samy authored Oct 11, 2021


Replace assert by ValueError of src/transformers/models/electra/modeling_{electra,tf_electra}.py and all other models that had copies (#13955)

* Replace all assert by ValueError in src/transformers/models/electra

* Reformat with black to pass check_code_quality test

* Change some assert to ValueError of modeling_bert & modeling_tf_albert

* Change some asserts in multiple models

* Change assertions to ValueError in multiple models in order to pass the
  check_code_style test and the model templates test.

* Black reformat

* Change some more asserts in multiple models

* Change assert to ValueError in modeling_layoutlm.py to fix copy error in code_style_check

* Add proper message to ValueError in modeling_tf_albert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/bert/modeling_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add ValueError message to models/convbert/modeling_tf_convbert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add error message for ValueError to modeling_tf_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/tapas/modeling_tapas.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in models/electra/modeling_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add ValueError message in src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in src/transformers/models/rembert/modeling_rembert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Simplify logic in src/transformers/models/albert/modeling_albert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
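
The pattern behind this change can be sketched as follows. The function name and the specific check are hypothetical, chosen only to illustrate swapping an `assert` for an explicit `ValueError` with a message; they are not copied from the modeling files.

```python
def check_head_config(hidden_size: int, num_attention_heads: int) -> int:
    """Return the per-head size, raising ValueError on an invalid configuration.

    Hypothetical helper for illustration only.
    """
    # Before: assert hidden_size % num_attention_heads == 0
    # An assert is skipped under `python -O` and carries no message by default.
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"The hidden size ({hidden_size}) is not a multiple of the number "
            f"of attention heads ({num_attention_heads})"
        )
    return hidden_size // num_attention_heads


print(check_head_config(768, 12))
```

Callers can then catch `ValueError` specifically, and the check stays active even when Python runs with optimizations enabled.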

3499728d

22 Sep, 2021 1 commit

Make gradient_checkpointing a training argument (#13657) · 27d46397

Sylvain Gugger authored Sep 22, 2021



* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

27d46397

31 Aug, 2021 1 commit

Set missing seq_length variable when using inputs_embeds with ALBERT & Remove... · ef8d6f2b

Jongheon Kim authored Aug 31, 2021

Set missing seq_length variable when using inputs_embeds with ALBERT & Remove code duplication (#13152)

* Set seq_length variable when using inputs_embeds

* remove code duplication

ef8d6f2b

16 Aug, 2021 3 commits
- Depend on hidden_dropout_prob · 62ba3b6b
  Lysandre authored Aug 16, 2021
  
  62ba3b6b
- Fix BERT/MobileBERT classifier dropout · 3c6d73bc
  Lysandre authored Aug 16, 2021
  
  3c6d73bc
- Update modeling_bert.py (#13129) · 7d2feb3a
  weierstrass_walker authored Aug 16, 2021
  
  7d2feb3a
26 Jul, 2021 1 commit

add `classifier_dropout` to classification heads (#12794) · 0c1c42c1

Philip May authored Jul 26, 2021



* add classifier_dropout to Electra

* no type annotations yet
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add classifier_dropout to Electra

* add classifier_dropout to Electra ForTokenClass.

* add classifier_dropout to bert

* add classifier_dropout to roberta

* add classifier_dropout to big_bird

* add classifier_dropout to mobilebert

* empty commit to trigger CI

* add classifier_dropout to reformer

* add classifier_dropout to ConvBERT

* add classifier_dropout to Albert

* add classifier_dropout to Albert
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
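
A rough sketch of how an optional `classifier_dropout` setting might be resolved. It assumes the head falls back to `hidden_dropout_prob` when `classifier_dropout` is unset; `ToyConfig` and `resolve_classifier_dropout` are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    # Hypothetical stand-in for a model config, for illustration only.
    hidden_dropout_prob: float = 0.1
    classifier_dropout: Optional[float] = None


def resolve_classifier_dropout(config: ToyConfig) -> float:
    """Pick the dropout rate for the classification head."""
    # Compare against None explicitly so an intentional 0.0 is respected.
    if config.classifier_dropout is not None:
        return config.classifier_dropout
    return config.hidden_dropout_prob


print(resolve_classifier_dropout(ToyConfig()))
print(resolve_classifier_dropout(ToyConfig(classifier_dropout=0.3)))
```

Testing with `is not None` rather than truthiness matters here: a plain `or` fallback would silently replace an explicit rate of 0.0 with the hidden dropout rate.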

0c1c42c1

22 Jun, 2021 1 commit

Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252) · af6e01c5

Hamid Shojanazeri authored Jun 22, 2021



* registering a buffer for token_type_ids, to pass the error of device-id getting hardcoded when tracing

* style format

* adding persistent flag to the registered buffers, which prevents adding them to the state_dict and addresses the backward compatibility issue

* adding the try catch to the fix as persistent flag is only available from PT >1.6

* adding version check

* added the condition to only use the token_type_ids buffer when it is autogenerated, not passed by the user

* adding comments and making the condition where token_type_ids are None use the registered buffer

* taking out position-embedding from the if block

* adding comments

* handling the case if buffer for position_ids was not registered

* reverted the changes on position_ids, fix the issue with size of token_type_ids buffer, moved the modification for generated token_type_ids to Bertmodel, instead of Embeddings

* reverting the token_type_ids in case of None to the previous version

* reverting changes on position_ids adding back the if block

* changes added by running make fix-copies

* changes added by running make fix-copies and added the import version as it was getting used

* changes added by running make fix-copies

* changes added by running make fix-copies

* fixing the import format

* fixing the import format

* modified to use temp tensor for trimmed and expanded token_type_ids buffer

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* clean up

* clean up

* clean up

* clean up

* Nit

* Nit

* Nit

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* changes based on latest in master

* Adapt templates

* Add version import
Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
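
The buffer approach described in the bullets above can be sketched roughly as below. `ToyEmbeddings` is a hypothetical module written for this example, not the code from the pull request; it assumes a PyTorch version in which `register_buffer` accepts `persistent`.

```python
import torch
from torch import nn


class ToyEmbeddings(nn.Module):
    # Hypothetical module for illustration only.
    def __init__(self, max_positions: int = 8):
        super().__init__()
        # A non-persistent buffer follows the module across devices
        # but is left out of state_dict(), so old checkpoints still load.
        self.register_buffer(
            "token_type_ids",
            torch.zeros((1, max_positions), dtype=torch.long),
            persistent=False,
        )

    def forward(self, input_ids, token_type_ids=None):
        if token_type_ids is None:
            batch_size, seq_length = input_ids.shape
            # Trim the registered buffer to the sequence length, then expand
            # it over the batch instead of creating a tensor on a fixed device.
            trimmed = self.token_type_ids[:, :seq_length]
            token_type_ids = trimmed.expand(batch_size, seq_length)
        return token_type_ids


module = ToyEmbeddings()
out = module(torch.ones((2, 5), dtype=torch.long))
print(out.shape, list(module.state_dict().keys()))
```

Because the default comes from a buffer rather than a freshly created tensor, no device is baked in when the default is built.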

af6e01c5

07 Jun, 2021 1 commit

Fixes bug that appears when using QA bert and distillation. (#12026) · f8bd8c6c

François Lagunas authored Jun 07, 2021

* Fixing bug that appears when using distillation (and potentially other uses).
During the backward pass, PyTorch complains with:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
This happens because the QA model code modifies the start_positions and end_positions input tensors using the clamp_ function. As a consequence, the teacher and the student both modify the inputs, and the backward pass fails.

* Fixing all models QA clamp_ bug.
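
A small sketch of the difference between the in-place and out-of-place variants, assuming PyTorch is installed. The tensor values and `ignored_index` are made up for illustration, and the example only shows the mutation of a shared input rather than reproducing the autograd error.

```python
import torch

ignored_index = 512  # assumed value for illustration
start_positions = torch.tensor([3, 700, 12])  # e.g. shared by teacher and student

# Out-of-place: returns a new tensor and leaves the caller's tensor untouched.
clamped = start_positions.clamp(0, ignored_index)

# In-place: overwrites the tensor it is called on. A clone is used here
# so the original can still be inspected afterwards.
shared = start_positions.clone()
shared.clamp_(0, ignored_index)

print(start_positions.tolist(), clamped.tolist(), shared.tolist())
```

When two models receive the same label tensors, the out-of-place call keeps each model from altering what the other sees.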

f8bd8c6c

01 Jun, 2021 1 commit
- modify qa-trainer (#11872) · 7e73601f
  Fan Zhang authored Jun 01, 2021
```
* modify qa-trainer

* fix flax model
```
  7e73601f
20 May, 2021 1 commit
- Fix regression in regression (#11785) · 469384a7
  Sylvain Gugger authored May 20, 2021
```
* Fix regression in regression

* Add test
```
  469384a7
04 May, 2021 1 commit

Add multi-class, multi-label and regression to transformers (#11012) · c40c7e21

abhishek thakur authored May 04, 2021



* add to  bert

* review comments

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* self.config.problem_type

* fix style

* fix

* fin

* fix

* update doc

* fix

* test

* Test more problem types

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix

* remove

* fix

* quality

* make fix-copies

* remove test
Co-authored-by: abhishek thakur <abhishekkrthakur@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
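
One way to picture the `problem_type` setting mentioned above is the plain-Python sketch below. The helper name, the `labels_are_integer` flag, and the loss mapping are assumptions made for illustration; they outline a plausible selection rule and are not the library's code.

```python
from typing import Optional

# Assumed mapping from problem type to a loss name, for illustration.
LOSS_FOR = {
    "regression": "MSELoss",
    "single_label_classification": "CrossEntropyLoss",
    "multi_label_classification": "BCEWithLogitsLoss",
}


def infer_problem_type(
    num_labels: int,
    labels_are_integer: bool,
    problem_type: Optional[str] = None,
) -> str:
    """Hypothetical helper: use an explicit problem_type if set, else infer one."""
    if problem_type is not None:
        return problem_type
    if num_labels == 1:
        return "regression"
    if labels_are_integer:
        return "single_label_classification"
    return "multi_label_classification"


print(LOSS_FOR[infer_problem_type(num_labels=1, labels_are_integer=False)])
print(LOSS_FOR[infer_problem_type(num_labels=3, labels_are_integer=True)])
```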

c40c7e21

26 Apr, 2021 1 commit
- make style (#11442) · 32dbb2d9
  Patrick von Platen authored Apr 26, 2021
  
  32dbb2d9
07 Apr, 2021 1 commit
- fix: The 'warn' method is deprecated (#11105) · c9035e45
  Stas Bekman authored Apr 07, 2021
```
* The 'warn' method is deprecated

* fix test
```
  c9035e45
31 Mar, 2021 1 commit

Enforce string-formatting with f-strings (#10980) · acc3bd9d

Sylvain Gugger authored Mar 31, 2021



* First third

* Styling and fix mistake

* Quality

* All the rest

* Treat %s and %d

* typo

* Missing )

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
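
For reference, the kind of rewrite this commit enforces looks like the following. The message text and variable names are invented for the example.

```python
name, count = "albert", 3

# Older styles: %-formatting (the "%s and %d" case) and str.format.
old_percent = "Loaded %s with %d layers" % (name, count)
old_format = "Loaded {} with {} layers".format(name, count)

# Equivalent f-string: values are interpolated where they appear.
new_fstring = f"Loaded {name} with {count} layers"

print(new_fstring)
```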

acc3bd9d

05 Mar, 2021 1 commit

Fix embeddings for PyTorch 1.8 (#10549) · 7da995c0

Sylvain Gugger authored Mar 05, 2021

* Fix embeddings for PyTorch 1.8

* Try with PyTorch 1.8.0

* Fix embeddings init

* Fix copies

* Typo

* More typos

7da995c0

03 Mar, 2021 1 commit

Refactor checkpoint name in BERT and MobileBERT (#10424) · 801ff969

Sylvain Gugger authored Mar 03, 2021

* Refactor checkpoint name in BERT and MobileBERT

* Add option to check copies

* Add QuestionAnswering

* Add last models

* Make black happy

801ff969

19 Jan, 2021 2 commits

Fix model templates and use less than 119 chars (#9684) · 7e662e6a
Sylvain Gugger authored Jan 19, 2021
```
* Fix model templates and use less than 119 chars

* Missing new line
```
7e662e6a

Update `past_key_values` in GPT-2 (#9596) · b020a736

Yusuke Mori authored Jan 20, 2021



* Update past_key_values in gpt2 (#9391)

* Update generation_utils, and rename some items

* Update modeling_gpt2 to avoid an error in gradient_checkpointing

* Remove 'reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2

* Change the location of '_reorder_cache' in modeling files

* Add '_reorder_cache' in modeling_ctrl

* Fix a bug of my last commit in CTRL

* Add '_reorder_cache' to GPT2DoubleHeadsModel

* Manage 'use_cache' in config of test_modeling_gpt2

* Clean up the doc string

* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix the doc string (GPT-2, CTRL)

* improve gradient_checkpointing_behavior
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

b020a736

06 Jan, 2021 1 commit

[GenerationOutputs] Fix GenerationOutputs Tests (#9443) · b8462b5b

Patrick von Platen authored Jan 06, 2021

* fix generation models

* fix led

* fix docs

* add is_decoder

* fix last docstrings

* make style

* fix t5 cross attentions

* correct t5

b8462b5b

23 Dec, 2020 1 commit

Add caching mechanism to BERT, RoBERTa (#9183) · 88ef8893

Suraj Patil authored Dec 23, 2020

* add past_key_values

* add use_cache option

* make mask before cutting ids

* adjust position_ids according to past_key_values

* flatten past_key_values

* fix positional embeds

* fix _reorder_cache

* set use_cache to false when not decoder, fix attention mask init

* add test for caching

* add past_key_values for Roberta

* fix position embeds

* add caching test for roberta

* add doc

* make style

* doc, fix attention mask, test

* small fixes

* address patrick's comments

* input_ids shouldn't start with pad token

* use_cache only when decoder

* make consistent with bert

* make copies consistent

* add use_cache to encoder

* add past_key_values to tapas attention

* apply suggestions from code review

* make copies consistent

* add attn mask in tests

* remove copied from longformer

* apply suggestions from code review

* fix bart test

* nit

* simplify model outputs

* fix doc

* fix output ordering
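
The "adjust position_ids according to past_key_values" step can be illustrated with a plain-Python sketch. The helper below is hypothetical and assumes positions simply continue from the number of tokens already held in the cache.

```python
def position_ids_for_step(seq_length: int, past_key_values_length: int = 0) -> list:
    """Hypothetical helper: positions for the tokens fed in this forward pass."""
    # Tokens already in the cache occupy positions 0 .. past_key_values_length - 1,
    # so the new tokens start right after them.
    start = past_key_values_length
    return list(range(start, start + seq_length))


# First pass: a 4-token prompt, nothing cached yet.
print(position_ids_for_step(4))
# Next decoding step: one new token, four tokens cached.
print(position_ids_for_step(1, past_key_values_length=4))
```

Without the offset, every cached decoding step would reuse position 0 for its single new token.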

88ef8893