Commits · acb709d55150501698b5b500ca49683b913d4b3d · chenpangpang / transformers

23 Jun, 2022 13 commits

Change no trainer image_classification test (#17635) · acb709d5
Zachary Mueller authored Jun 23, 2022
```
* Adjust test arguments and use a new example test
```
acb709d5

Update modeling_cvt.py (#17846) · e70abdad

Fx039482 authored Jun 23, 2022

As shown in the Colab notebook, I added the missing type hints for `CvtForImageClassification` and `CvtModel`.

e70abdad

Fix broken test for models with batchnorm (#17841) · 1a7ef334

Matt authored Jun 23, 2022

* Fix tests that broke when models used batchnorm

* Initializing the model twice does not actually give you the same weights each time.
I am good at machine learning.

* Fix speed regression

1a7ef334
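The point about initializing twice can be reproduced without any deep-learning framework. The following is a minimal sketch, assuming Python's `random` module as a stand-in for a framework's weight initializer; `init_weights` is a hypothetical helper, not code from the commit.

```python
import random


def init_weights(n, seed=None):
    """Return n pseudo-random 'weights'; results repeat only when a seed is given."""
    rng = random.Random(seed)  # seed=None draws fresh entropy on every call
    return [rng.gauss(0.0, 0.02) for _ in range(n)]


# Two unseeded initializations: the values differ from call to call.
first = init_weights(4)
second = init_weights(4)

# Two initializations with the same explicit seed: the values match.
seeded_a = init_weights(4, seed=0)
seeded_b = init_weights(4, seed=0)

print(first == second)        # almost certainly False
print(seeded_a == seeded_b)   # True
```

A test that compares two freshly built models therefore needs either a fixed seed or an explicit weight copy before comparing outputs.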

BLOOM minor changes on tokenizer (#17823) · 18c263c4

Younes Belkada authored Jun 23, 2022



* few fixes:

- hardcode tokenizer padding side
- remove unused args

* few fixes:

- added new attribute on TokenizerTesterMixin
- added new slow test
- remove unused arg on tokenizer class

* make style

* Update src/transformers/models/bloom/tokenization_bloom_fast.py
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* make quality

* apply changes

- remove new attribute
- redefine test on the class

* add comments
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

18c263c4

Improve performance docs (#17750) · 6f29029b

Leandro von Werra authored Jun 23, 2022



* add skeleton files

* fix cpu inference link

* add hint to make clear that single gpu section contains general info

* add new files to ToC

* update toctree to have subsection for performance

* add "coming soon" to the still empty sections

* fix missing title

* fix typo

* add reference to empty documents

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

6f29029b

Fix an error message in BigBird (#17840) · 5bc779ae
Yih-Dar authored Jun 23, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
5bc779ae
Fix properties of unset special tokens in non verbose mode (#17797) · 3eed5530
Guillaume Klein authored Jun 23, 2022
```
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
```
3eed5530
change message (#17836) · b2fdbacc
SaulLu authored Jun 23, 2022

b2fdbacc

Add missing type hints for QDQBertModel (#17783) · d37a68e6

willtai authored Jun 23, 2022

* Feat: add missing type hints for QDQBertModel

* fix: ran black and isort

* feat: Add missing output type for QDQBertModel

* feat: Add type hints for QDQBertLMHeadModel and models starting with QDQBertFor

* fix: add missing return type for QDQBertModel

* fix: remove wrong return type for QDQBertEmbeddings

* fix: readded config argument to load_tf_weights_in_qdqbert

* fix: add BertConfig type to BertEmbeddings config due to check error in CI

* fix: removed config type hints to avoid copy checks

d37a68e6

Update type hints modeling_yoso.py (#17827) · 4297f44b

Fx039482 authored Jun 23, 2022

* Update modeling_yoso.py

* make fixup

* Update modeling_yoso.py

That should be it copied from previous PR

4297f44b

TF: generate without `tf.TensorArray` (#17801) · 5cce3076
Joao Gante authored Jun 23, 2022

5cce3076

add doctests for DETR (#17786) · ab223fc1

Quentin authored Jun 23, 2022

* add: check labels for detr object detection doctests

* add: check shapes

* add: add detr to documentation_tests.py

* fix: make fixup output

* fix: add a comment

ab223fc1

Fix push CI artifact path (#17788) · 8d634b70
Yih-Dar authored Jun 23, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
8d634b70

22 Jun, 2022 7 commits

Offload fixes (#17810) · df8e6804
Sylvain Gugger authored Jun 22, 2022
```
* Offload fixes

* Add a test
```
df8e6804

CLI: use hub's `create_commit` (#17755) · 0d0c392c

Joao Gante authored Jun 22, 2022

* use create_commit

* better commit message and description

* touch setup.py to trigger cache update

* add hub version gating

0d0c392c

Bump numpy from 1.21.0 to 1.22.0 in /examples/research_projects/lxmert (#17817) · c366ce10

dependabot[bot] authored Jun 22, 2022

Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

c366ce10

Bump numpy in /examples/research_projects/visual_bert (#17816) · af0d21e7

dependabot[bot] authored Jun 22, 2022

Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

af0d21e7

initial commit (#17818) · 56b83cf0
Arthur authored Jun 22, 2022

56b83cf0

Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer`... · 13570381

Eran Hirsch authored Jun 22, 2022

Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` (#17805)

* Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict`

* Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it

* Remove `self._num_beams` from trainer classes

* - Run fixup
- Fix "Constraint" not exposed
- Fix synced_gpus to actually read from param

* Use kwargs

* Copy kwargs before making changes to it

* Fix style issues unused imports

13570381

Flax sharded (#17760) · 16c6eb7c
Arthur authored Jun 22, 2022

16c6eb7c

21 Jun, 2022 16 commits

Fix `top_k_top_p_filtering` having unexpected behavior (#17744) · 3b00b623

unifyh authored Jun 22, 2022

- Fix `top_k_top_p_filtering` not passing `filter_value` to
   `TopPLogitsWarper` causing any top-p filtered logits to be -inf
   instead of specified value

 - Add corresponding test

3b00b623
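The bug described in this commit is a dropped argument: a wrapper accepts `filter_value` but does not forward it, so filtered logits always become `-inf`. The sketch below illustrates the idea in plain Python. `top_p_filter` and `filter_logits` are hypothetical names, and the filtering logic is a simplified nucleus filter, not the library's `TopPLogitsWarper`.

```python
import math


def top_p_filter(logits, top_p, filter_value=-math.inf):
    """Keep the most likely tokens until their probability mass reaches top_p."""
    # Softmax over the logits, shifted by the max for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk tokens from most to least likely until top_p is covered.
    order = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Tokens outside the kept set receive filter_value.
    return [x if i in keep else filter_value for i, x in enumerate(logits)]


def filter_logits(logits, top_p, filter_value=-math.inf):
    # Forwarding filter_value is the essence of the fix; omitting it here
    # would silently fall back to the default of -inf.
    return top_p_filter(logits, top_p, filter_value=filter_value)


print(filter_logits([2.0, 1.0, 0.0, -1.0], top_p=0.8, filter_value=-1e9))
```

With these inputs the two most likely tokens cover more than 0.8 of the probability mass, so the last two logits are replaced by the caller's `filter_value`.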

Remove duplicate code (#17708) · 3ccff0d4
Kyungmin Lee authored Jun 22, 2022

3ccff0d4

Improve error message Union not allowed (#17769) · 26a6a426

Bram Vanroy authored Jun 21, 2022



* Improve error message Union not allowed

* make style

* Update src/transformers/hf_argparser.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

26a6a426
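For context, an argument parser built from type annotations can accept `Optional[X]` but has no single type to parse into for a general `Union`. The following is a rough sketch of such a check, assuming only the standard `typing` helpers; `check_field_type` and the error wording are illustrative, not the code or message from `hf_argparser.py`.

```python
from typing import Optional, Union, get_args, get_origin


def check_field_type(name, annotation):
    """Return the concrete type to parse, rejecting Unions other than Optional[X]."""
    if get_origin(annotation) is Union:
        # Optional[X] is Union[X, None]; strip None and see what is left.
        non_none = [arg for arg in get_args(annotation) if arg is not type(None)]
        if len(non_none) != 1:
            raise ValueError(
                f"Field '{name}' has type {annotation}. Only Optional[X] "
                "(that is, Union[X, None]) is supported, not a general Union."
            )
        return non_none[0]
    return annotation


print(check_field_type("learning_rate", Optional[float]))  # <class 'float'>
```

Naming the offending field and the accepted alternative in the message is what makes this kind of error actionable.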

Add final_layer_norm to OPT model (#17785) · abc400b0

Thomas Wang authored Jun 21, 2022



* Add final_layer_norm to OPT model

* Add JAX and TF version

* Fix Keras name

* Woops

* Allow for non breaking change

* Apply suggestions from code review

* add tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

abc400b0

Properly check for a TPU device (#17802) · 52404cba
Zachary Mueller authored Jun 21, 2022

52404cba
Fix test for BF16 detection (#17803) · ef23fae5
Sylvain Gugger authored Jun 21, 2022

ef23fae5

TF Sharded (#17713) · 7cced021

Arthur authored Jun 21, 2022



* initial commit

* update modeling tf utils

* quality

* clean and update args

* update

* remove potential bug

* code quality

* update

* update max shard

* update tests for sharding from pretrained

* fix remaining test

* make style

* h5py if tf available

* update and fix test

* fix test

* style

* modified push to hub to support shard for TF

* quick fix

* update code

* merge branch main and style

* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update based on reviews

* update doc

* update and style

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update based on reviews

* fix typo

* style
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7cced021
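The "max shard" bullets refer to capping the size of each checkpoint file. As a loose illustration of that idea only, the sketch below greedily groups weights into shards under a size limit. `shard_by_size` is a hypothetical helper and the sizes are made-up numbers; the commit's actual implementation may differ.

```python
def shard_by_size(weight_sizes, max_shard_size):
    """Greedily group weights (name -> size in bytes) into shards under a limit."""
    shards, current, current_size = [], {}, 0
    for name, size in weight_sizes.items():
        # Start a new shard when adding this weight would exceed the limit.
        if current and current_size + size > max_shard_size:
            shards.append(current)
            current, current_size = {}, 0
        current[name] = size
        current_size += size
    if current:
        shards.append(current)
    return shards


sizes = {"embed": 4, "layer_0": 4, "layer_1": 6, "head": 2}
for index, shard in enumerate(shard_by_size(sizes, max_shard_size=8)):
    print(index, shard)
```

In this sketch a single weight larger than the limit still gets a shard of its own rather than being split.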

Use 5e-5 For BigBird PT/Flax equivalence tests (#17780) · f47afefb

Yih-Dar authored Jun 21, 2022



* rename to check_pt_flax_outputs

* update check_pt_flax_outputs

* use 5e-5 for BigBird PT/Flax test
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

f47afefb

Prepare transformers for v0.8.0 huggingface-hub release (#17716) · 6a5272b2

Lysandre Debut authored Jun 21, 2022



* Prepare CI for v0.8.0

* pin hfh (revert before merge)

* Revert "pin hfh (revert before merge)"

This reverts commit a0103140e1c77b810ffcb735192968bc03be3e1f.

* Test rc3

* Test latest rc

* Unpin to the RC
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

6a5272b2

Fix forward reference imports in DeBERTa configs (#17800) · 7bc88c05
Sylvain Gugger authored Jun 21, 2022

7bc88c05

Fix Automatic Download of Pretrained Weights in DETR (#17712) · 27e90738

Anugunj Naman authored Jun 21, 2022



* added use_backbone_pretrained

* style fixes

* update

* Update detr.mdx

* Update detr.mdx

* Update detr.mdx

* update using doc py

* Update detr.mdx

* Update src/transformers/models/detr/configuration_detr.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

27e90738

[ViTMAE] Fix docstrings and variable names (#17710) · b681e12d

NielsRogge authored Jun 21, 2022



* Fix docstrings and variable names

* Rename x to something better

* Improve messages

* Fix docstrings and add test for greyscale images
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

b681e12d

Add link to notebook (#17791) · 3fab17fc
NielsRogge authored Jun 21, 2022
```
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
```
3fab17fc

[CodeParrot] Near-deduplication with jaccard similarity (#17054) · da2bd2ae

Jia LI authored Jun 21, 2022



* deduplication draft

* update style

* update style test

* dummy test main

* rename modules

* rename functions

* return extremes in deduplicate_clusters

* update style

* cast str for gzip

* update doc string

* time processing

* use dataset map to compute minhash

* fill value for short token

* remove da map method

* update style

* use share object to multiprocess

* update style

* use f-string and minor fix
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

* update style

* use module parameters

* change ds_dedup to ds_filter

* save ds_dedup

* mv test to script tests

* make jaccard threshold a parameter of deduplicate_dataset

* update style

* add doc strings

* update style

* add doc string for DuplicationIndex

* save files into data dir

* update readme

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

* make near deduplication optional

* move near deduplication in README

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* use f string
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

da2bd2ae
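The Jaccard similarity named in this commit's title is the size of the intersection of two token sets divided by the size of their union. The sketch below computes it exactly over all pairs, which is only practical for small inputs; the commit's bullets mention MinHash, which approximates the same quantity at scale. The function names, whitespace tokenization, and the 0.85 threshold are assumptions made for illustration.

```python
def jaccard_similarity(tokens_a, tokens_b):
    """Exact Jaccard similarity between two token collections."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 1.0  # treat two empty inputs as identical
    return len(a & b) / len(a | b)


def near_duplicates(snippets, threshold=0.85):
    """Return index pairs whose token sets reach the similarity threshold."""
    token_sets = [set(snippet.split()) for snippet in snippets]
    pairs = []
    for i in range(len(token_sets)):
        for j in range(i + 1, len(token_sets)):
            if jaccard_similarity(token_sets[i], token_sets[j]) >= threshold:
                pairs.append((i, j))
    return pairs


snippets = ["def f(x): return x", "def f(x): return x", "import os"]
print(near_duplicates(snippets))
```

Making the threshold a parameter, as one bullet above describes, lets callers trade recall against the risk of dropping distinct files.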

add onnx support for deberta and debertav2 (#17617) · eb16be41

mrbean authored Jun 21, 2022



* add onnx support for debertav2

* debertav2 -> deberta-v2 in onnx features file

* remove causal lm

* add deberta-v2-xlarge to onnx tests

* use self.type().dtype() in xsoftmax
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* remove hack for deberta

* remove unused imports

* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* use generate dummy inputs

* linter

* add imports

* add support for deberta v1 as well

* deberta does not support multiple choice

* Update src/transformers/models/deberta/configuration_deberta.py
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* one line ordered dict

* fire build
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

eb16be41

Add UL2 (just docs) (#17740) · 8fcbe275

Patrick von Platen authored Jun 21, 2022



* Add UL2
Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com>

* Correct naming

* sort better

* up

* apply sylvains suggestion

8fcbe275

20 Jun, 2022 4 commits

Update modeling_longt5.py (#17777) · da27c4b3

Brad Jascob authored Jun 20, 2022

On line 180, `torch.tensor(-1.0, xxx)` gives the error "TypeError: 'float' object cannot be interpreted as an integer"
This is because the dtype here is `int64`. For `dtype=int64`, this needs to simply be `-1`.
This impacts the long-t5-tglobal-x model. It does not impact the long-t5-local-x version, which does not appear to call this line.

da27c4b3

Not use -1e4 as attn mask (#17306) · d3cb2888

Yih-Dar authored Jun 20, 2022



* Use torch.finfo(self.dtype).min

* for GPTNeoX

* for Albert

* For Splinter

* Update src/transformers/models/data2vec/modeling_data2vec_audio.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix -inf used in Bart-like models

* Fix a few remaining -inf

* more fix

* clean up

* For CLIP

* For FSMT

* clean up

* fix test

* Add dtype argument and use it for LayoutLMv3

* update FlaxLongT5Attention
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

d3cb2888
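The motivation for replacing both `-1e4` and `-inf` can be sketched in plain Python. Here `sys.float_info.max` plays the role that `torch.finfo(dtype).min` plays for a tensor dtype. `masked_softmax` is a hypothetical helper, and the scores are contrived to make the effects visible.

```python
import math
import sys


def masked_softmax(scores, keep, mask_value):
    """Softmax after adding mask_value to every position where keep is False."""
    masked = [s if k else s + mask_value for s, k in zip(scores, keep)]
    peak = max(masked)
    exps = [math.exp(m - peak) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]


FINITE_MIN = -sys.float_info.max  # most negative finite float64

# A fixed -1e4 is too small when a masked score is very large.
leaky = masked_softmax([0.0, 20000.0], [True, False], -1e4)

# The most negative finite value suppresses the masked position.
clean = masked_softmax([0.0, 20000.0], [True, False], FINITE_MIN)

# With -inf, a fully masked row turns into NaN; a finite minimum does not.
nan_row = masked_softmax([1.0, 2.0], [False, False], -math.inf)
finite_row = masked_softmax([1.0, 2.0], [False, False], FINITE_MIN)

print(leaky, clean, nan_row, finite_row)
```

The fully masked row is the case where `-inf` fails: subtracting the row maximum gives `-inf - (-inf)`, which is NaN, whereas a finite minimum yields a uniform distribution.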

Fix cache for GPT-Neo-X (#17764) · fdb12080
Sylvain Gugger authored Jun 20, 2022
```
* Fix cache for GPT-Neo-X

* Add more tests
```
fdb12080
deprecate is_torch_bf16_available (#17738) · a2d34b7c
Stas Bekman authored Jun 20, 2022
```
* deprecate is_torch_bf16_available

* address suggestions
```
a2d34b7c