- 13 Apr, 2022 3 commits
-
-
davidleonfdez authored
* Fix setters of *_token_id properties of SpecialTokensMixin
* Test setters of common token ids
* Move checks of the token id setters to a separate test
* Add independent test for ByT5
* Add Canine test
* Test speech to text
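A small illustration of the property pattern involved (a simplified stand-in, not the actual SpecialTokensMixin code): the `*_token_id` setter resolves the id back to its token string, so reading the id property again round-trips through the vocabulary.

```python
class SpecialTokensSketch:
    """Simplified stand-in for a tokenizer mixin with a pad token property pair."""

    def __init__(self, vocab: dict):
        self._vocab = vocab                           # token -> id
        self._ids = {i: t for t, i in vocab.items()}  # id -> token
        self._pad_token = None

    @property
    def pad_token_id(self):
        return None if self._pad_token is None else self._vocab[self._pad_token]

    @pad_token_id.setter
    def pad_token_id(self, value):
        # Setting the id stores the corresponding token string (None clears it).
        self._pad_token = None if value is None else self._ids[value]
```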
-
Patrick von Platen authored
* [Doctests] Fix all T5 doc tests
* make style
* Update docs/source/en/model_doc/t5.mdx
* Apply Sylvain's comments
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Santiago Castro authored
* Normalize using a logits warper
* Add a flag in `generate` to support the logit renormalization
* Add in RAG
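For illustration, a minimal sketch of the renormalization idea (the class name is mine, not the library's): after processors such as top-k/top-p mask out logits, a final log-softmax turns the scores back into a valid log-distribution.

```python
import torch

class LogitRenormalizationSketch:
    """Illustrative warper applied after all other logits processors have run."""

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # Treat the scores as unnormalized log-probabilities and renormalize them
        # so downstream sampling sees a proper distribution.
        return torch.nn.functional.log_softmax(scores, dim=-1)
```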
-
- 12 Apr, 2022 15 commits
-
-
Joao Gante authored
-
Minh Chien Vu authored
* add Bigbird ONNX config
-
Sanchit Gandhi authored
* [FlaxWav2Vec2Model] Fix bug in attention mask
* more fixes
* add (Flax)SpeechEncoderDecoderModel PT-FX cross-test
-
Sanchit Gandhi authored
* [FlaxSpeechEncoderDecoder] Fix input shape bug in weights init
* make style
-
hiromu authored
* start working
* fix: ReformerForQA doctest
* fix: ReformerModelWithLMHead doctest
* fix: ReformerModelForSC doctest
* fix: ReformerModelForMLM doctest
* add: documentation_tests.txt
* make fixup
* change: ReformerModelForSC doctest
* change: checkpoint
-
Joao Gante authored
-
Anmol Joshi authored
* Moved functions to pytorch_utils.py
* isort formatting
* Reverted tf changes
* isort, make fix-copies
* documentation fix
* Fixed Conv1D import
* Reverted research examples file
* backward compatibility for pytorch_utils
* missing import
* isort fix
-
Sylvain Gugger authored
-
Nicolas Patry authored
* Change the chunk_iter function to handle the subtle case where the last chunk gets ignored because all of its data is already contained in the `left_strided` data; in that case the right striding on the previous item needs to be removed.
* Remove commented line.
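As a rough illustration of the edge case described above (a self-contained sketch, not the pipeline's actual `chunk_iter`): when the final window contributes nothing beyond its left stride, it is dropped and the previous window loses its right stride.

```python
def chunk_iter_sketch(seq, chunk_len, stride_left, stride_right):
    """Return a list of (chunk, (left, right)) windows over `seq` (illustrative only)."""
    step = chunk_len - stride_left - stride_right
    chunks = []
    for start in range(0, len(seq), step):
        is_first = start == 0
        is_last = start + step >= len(seq)
        chunks.append(
            (
                seq[start : start + chunk_len],
                (0 if is_first else stride_left, 0 if is_last else stride_right),
            )
        )
    # The subtle case from the commit: the last window holds no data beyond its
    # left stride, so drop it and remove the right striding on the previous item.
    if len(chunks) > 1 and len(chunks[-1][0]) <= stride_left:
        prev, (left, _) = chunks[-2]
        chunks[-2] = (prev, (left, 0))
        chunks.pop()
    return chunks
```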
-
Anmol Joshi authored
* Updated assertions to exceptions
* bug fixes
* fix-copies
* Update modeling_ctrl.py
* Update src/transformers/models/ctrl/modeling_tf_ctrl.py
* Update src/transformers/models/gpt_neo/modeling_gpt_neo.py
* Update src/transformers/models/gptj/modeling_gptj.py
* Update src/transformers/models/gptj/modeling_tf_gptj.py
* Update modeling_led.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
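The pattern behind "assertions to exceptions", sketched generically (the condition and message below are illustrative, not taken from any particular model file): bare `assert` statements are stripped under `python -O` and carry no message, so they are replaced with explicit exceptions.

```python
# Before: silent under `python -O`, no actionable message
# assert hidden_size % num_heads == 0

# After: always checked, self-explanatory error
def check_heads_divide_hidden_size(hidden_size: int, num_heads: int) -> None:
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"`hidden_size` ({hidden_size}) must be divisible by `num_heads` ({num_heads})."
        )
```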
-
Shang Zhang authored
* add ort-trt benchmark script
* Update README.md
* ort version can be newer
* formatting
* specify ORT version
-
Heerak Son authored
Fix: use `args.config_name` instead of `args.model_name_or_path`
-
smelm authored
This avoids an unnecessary call and prevents problems during the initialization of class hierarchies.
Co-authored-by: Samuel Melm <samuel.melm@stud.uni-heidelberg.de>
-
Michael Chung authored
* First pass, all tests pass
* WIP
* Adding file to documentation tests
* Change the base model for the example in the doc test
* Fix code styling by running make fixup
* Called style
* Reverted to the gpt2 model rather than distilgpt2, then used a token classification model instead of a sequence model for the example
* Fix styling issue
* Hopefully ignores the formatting issue
Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
-
Patrick von Platen authored
-
- 11 Apr, 2022 18 commits
-
-
Zachary Mueller authored
Move declaration of log streams to before tests, so that results won't get compounded on top of each other
-
Yih-Dar authored
* add error message
* Use names in the error message
* allow ModelOutput
* rename to check_pt_tf_outputs and move outside
* fix style
* skip past_key_values in a better way
* Add comments
* improve code for label/loss
* make the logic clear by moving the ignore keys out
* fix _postprocessing_to_ignore
* fix _postprocessing_to_ignore: create new outputs from the remaining fields
* ignore past_key_values in TFGPT2 models for now
* make check_pt_tf_outputs better regarding names
* move check_pt_tf_models outside
* rename methods
* remove test_pt_tf_model_equivalence in TFCLIPModelTest
* Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence
* move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models
* Fix quality
* Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence
* Fix quality
* fix
* fix style
* Clean-up TFLEDModelTest.test_pt_tf_model_equivalence
* Fix quality
* add docstring
* improve comment
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
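As a rough sketch of what a named, recursive PT/TF output comparison can look like (simplified, and not the test suite's actual helper; the ignore list and tolerance are illustrative):

```python
import numpy as np

def check_pt_tf_outputs(tf_out, pt_out, name="outputs", tol=1e-5, ignore=("past_key_values",)):
    """Recursively compare TF and PT outputs, carrying a name for error messages."""
    if isinstance(tf_out, dict):  # covers ModelOutput, which is a dict subclass
        for key in tf_out:
            if key in ignore:
                continue
            check_pt_tf_outputs(tf_out[key], pt_out[key], name=f"{name}.{key}", tol=tol, ignore=ignore)
    elif isinstance(tf_out, (tuple, list)):
        for i, (t, p) in enumerate(zip(tf_out, pt_out)):
            check_pt_tf_outputs(t, p, name=f"{name}[{i}]", tol=tol, ignore=ignore)
    elif tf_out is not None and pt_out is not None:
        t = np.asarray(tf_out)
        p = pt_out.detach().cpu().numpy() if hasattr(pt_out, "detach") else np.asarray(pt_out)
        max_diff = np.amax(np.abs(t - p))
        if max_diff > tol:
            raise AssertionError(f"{name}: max difference {max_diff:.2e} exceeds tolerance {tol}")
```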
-
Yih-Dar authored
* update
* batch_size -> text_batch_size
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Nicholas Broad authored
* private repo argument to trainer
* format
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
-
Zachary Mueller authored
Adds checkpoint prefixes to the gitignore if `push_to_hub` is used along with `checkpointing_steps`
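A minimal sketch of that behaviour under assumed names (the checkpoint prefix and the function are illustrative): when periodic checkpoints are saved and the run pushes to the Hub, their prefix is appended to `.gitignore` so intermediate checkpoints are not uploaded.

```python
import os

def maybe_ignore_checkpoints(output_dir: str, push_to_hub: bool, checkpointing_steps, prefix: str = "step_") -> None:
    """Append `<prefix>*` to the repo's .gitignore when both options are active."""
    if not (push_to_hub and checkpointing_steps is not None):
        return
    gitignore_path = os.path.join(output_dir, ".gitignore")
    pattern = f"{prefix}*"
    existing = ""
    if os.path.exists(gitignore_path):
        with open(gitignore_path) as f:
            existing = f.read()
    if pattern not in existing:
        with open(gitignore_path, "a") as f:
            f.write(pattern + "\n")
```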
-
Yih-Dar authored
* update _create_and_check_torchscript
* Enable test_torchscript
* clear_class_registry
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Suraj Patil authored
-
Ahmed Elnaggar authored
* Fix t5 shard on TPU Pods. The current script doesn't work properly on a TPU pod because the global batch is not divided correctly per host. This pull request fixes the issue by dividing the global batch across hosts before it is sharded on each host.
* fix style
Co-authored-by: ahmed-elnaggar <ahmed.elnaggar@allianz.com>
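A minimal sketch of the idea in JAX (the function and argument names are mine, not the training script's): each host takes only its own slice of the global batch, then reshapes that slice for its local devices.

```python
import jax

def host_local_batch(global_batch: dict, per_host_batch_size: int) -> dict:
    """Slice out this host's portion of the global batch, then reshape it to
    (local_device_count, per_device_batch, ...) so it can be sharded via pmap."""
    start = jax.process_index() * per_host_batch_size
    host_batch = {k: v[start : start + per_host_batch_size] for k, v in global_batch.items()}
    n_local = jax.local_device_count()
    return {k: v.reshape((n_local, -1) + v.shape[1:]) for k, v in host_batch.items()}
```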
-
Minh Chien Vu authored
* Add doctest BERT
* make fixup
* fix typo
* change checkpoints
* make fixup
* define doctest output value, update doctest for mobilebert
* solve fix-copies
* update QA target start index and end index
* change checkpoint for docs and reuse defined variable
* Update src/transformers/models/bert/modeling_tf_bert.py
* Apply suggestions from code review
* Apply suggestions from code review
* make fixup
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Patrick von Platen authored
-
Sadra authored
I create an archive of older checkpoints during training, so a checkpoint can have a name like `f"{checkpoint_prefix}-*.zip"` or `.tar`. Previously, `glob(f"{checkpoint_prefix}-*")` picked up all files and folders starting with the checkpoint prefix, while `shutil.rmtree(checkpoint)` expects a folder name; since it may eventually receive a zip file, training crashes. Adding `if os.path.isdir(x)` keeps only folders in `glob_checkpoints`.
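A small sketch of the resulting pattern (the surrounding rotation logic is illustrative; only the `os.path.isdir` guard comes from the described change):

```python
import os
import shutil
from glob import glob

def rotate_checkpoints(output_dir: str, checkpoint_prefix: str = "checkpoint", keep: int = 2) -> None:
    """Delete old checkpoint *directories*, leaving archives (.zip/.tar) alone."""
    glob_checkpoints = [
        path
        for path in glob(os.path.join(output_dir, f"{checkpoint_prefix}-*"))
        if os.path.isdir(path)  # the added guard: skip archived checkpoints
    ]
    glob_checkpoints.sort(key=os.path.getmtime)
    for checkpoint in glob_checkpoints[:-keep] if keep > 0 else glob_checkpoints:
        shutil.rmtree(checkpoint)
```
-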
Joao Gante authored
* min length must be smaller than max length
* Update min_length in tests
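For illustration, a sketch of such a validation (the exact condition, message, and placement inside `generate` are assumptions):

```python
def validate_length_args(min_length, max_length) -> None:
    # Reject unfeasible length constraints up front instead of letting generation
    # silently ignore them.
    if min_length is not None and max_length is not None and min_length > max_length:
        raise ValueError(
            f"`min_length` ({min_length}) cannot be larger than `max_length` ({max_length})."
        )
```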
-
Jia LI authored
* add simple multi-GPU completion
* add human_eval_multi_gpu
* use copy strategy to distribute across GPUs, to avoid padding
* add docstring
* update code style
* use task id to arrange output
* truncate input to avoid zero pad
* Stop the copy mechanism
* update style
* restore copies to scale better in distributed mode
* update style
* replace human eval
* Apply suggestions from code review: 1. Tokenize all input at the same time 2. use attention_mask to get the input length 3. other small fixes
* correct typo and update docstring
* update code style
* remove num sample division constraint
* remove max len calculation
* use accelerator.gather once to speed up
* use accelerate set_seed; update accelerate version
* correct gather bug
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
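A tiny sketch of the "copy strategy" mentioned above (names are illustrative): each prompt is repeated `n_samples` times and tagged with its task id, so every worker generates for whole prompts without cross-prompt padding, and outputs can be regrouped by task id after a single gather.

```python
from collections import defaultdict

def replicate_prompts(prompts, n_samples):
    """Repeat every prompt n_samples times, keeping its task id for later regrouping."""
    return [
        {"task_id": task_id, "prompt": prompt}
        for task_id, prompt in enumerate(prompts)
        for _ in range(n_samples)
    ]

def regroup_by_task(generations):
    """Collect generated completions back per task id (e.g. after a gather)."""
    grouped = defaultdict(list)
    for item in generations:
        grouped[item["task_id"]].append(item["completion"])
    return grouped
```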
-
Yih-Dar authored
* Fix some doc examples
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
SaulLu authored
* update proto sentencepiece model
* Revert "update proto sentencepiece model" (this reverts commit b07f671747fec35773d0b3d4788b8b15aefa0229)
* add check
* add test
* Revert "Revert "update proto sentencepiece model"" (this reverts commit 46108257b8927b73627ec8f4f3eed53a95fc700d)
* test for log level
* test for log level 2
* warning at the warning level
* clean
* format
* add explanation in docstring
-
- 08 Apr, 2022 4 commits
-
-
Steven Liu authored
* ✨ update audio examples with minds dataset
* 🖍 make style
* 🖍 minor fixes for doctests
-
Stas Bekman authored
* [Trainer] tf32 arg doc
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Laura Hanu authored
-
Zachary Mueller authored
* Fixed some bugs involving saving during epochs
* Added tests mimicking the existing examples tests
* Added in json exporting to all `no_trainer` examples for consistency
-