- 13 Apr, 2022 11 commits
-
-
Tu Vu authored
* Add self-training code for text-classification
* Delete strata
-
Sylvain Gugger authored
* Add defensive check for config num_labels and id2label
* Actually check value...
* Only warn inside init, plus a better error message
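A minimal sketch of a defensive check like the one described above; the function name and the `LABEL_{i}` fallback format are assumptions for illustration, not the actual transformers API.

```python
import warnings

def check_label_config(num_labels, id2label):
    # Illustrative sketch: warn (rather than raise) when the configured
    # num_labels disagrees with the size of the id2label mapping, then fall
    # back to a consistent default mapping. Names here are assumptions.
    if id2label is not None and len(id2label) != num_labels:
        warnings.warn(
            f"You passed num_labels={num_labels} but id2label has "
            f"{len(id2label)} entries; falling back to default labels."
        )
        id2label = {i: f"LABEL_{i}" for i in range(num_labels)}
    return id2label
</antml>```

Warning inside init rather than raising keeps old configs loadable while still surfacing the inconsistency.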
-
Yih-Dar authored
* Make Funnel test less flaky
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Joao Gante authored
* Setup-dependent pip cache * Do not restore from old versions
-
Stas Bekman authored
-
Jeremy Fisher authored
* Improve CTRL doctests
* Fix `CTRLForSequenceClassification` flakiness with inconsistent losses
* Remove unused
* Fixup
* Add CTRL to documentation_tests.txt
* Fix control code not being first
* Add output assertions
* Change from sshleifer/tiny-ctrl -> ctrl
* Run `make fixup`
* Apply `list` to output logits shape for clarity
* Reduce output loss precision to make assertion more robust
* Add assertion of control code being first
* Fix docstyle
* Upper-case sentence following control code
* Weird bug fixes
* Add a better generation example
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Michael Chung authored
* Required the values; GPTJ unfortunately cannot run the model =)
* Added the file to the doc tests
* Run Fixup and Style
* Fixed with the test versions of GPT-J; ran Style and Fixup
* Trigger CI
* A minor change to the license
* Fixed spacing added to benchmark_utils, then refactored tests to const variables
* Removed strings that were included as default parameters anyway
Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
-
Stas Bekman authored
-
davidleonfdez authored
* Fix setters of *_token_id properties of SpecialTokensMixin
* Test setters of common token ids
* Move checks of the token-id setters to a separate test
* Add independent test for ByT5
* Add Canine test
* Test speech to text
-
Patrick von Platen authored
* [Doctests] Fix all T5 doc tests
* make style
* Update docs/source/en/model_doc/t5.mdx
* Apply Sylvain's comments
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Santiago Castro authored
* Normalize using a logits warper
* Add a flag in `generate` to support the logit renormalization
* Add in RAG
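A pure-Python sketch of what such a renormalization warper does, assuming the common pattern where earlier warpers mask filtered tokens with `-inf`; the function name is hypothetical and this is not the transformers implementation.

```python
import math

def renormalize_log_scores(scores):
    # Illustrative stand-in for a logit-renormalization warper: after other
    # warpers (top-k, top-p, ...) mask entries with -inf, re-apply
    # log-softmax so the surviving scores are valid log-probabilities again.
    finite = [s for s in scores if s != float("-inf")]
    m = max(finite)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in finite))
    return [s - log_z if s != float("-inf") else s for s in scores]
</antml>```

After renormalization, exponentiating the finite entries sums to 1, which is what downstream sampling code expects.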
-
- 12 Apr, 2022 15 commits
-
-
Joao Gante authored
-
Minh Chien Vu authored
* add Bigbird ONNX config
-
Sanchit Gandhi authored
* [FlaxWav2Vec2Model] Fix bug in attention mask
* more fixes
* add (Flax)SpeechEncoderDecoderModel PT-FX cross-test
-
Sanchit Gandhi authored
* [FlaxSpeechEncoderDecoder] Fix input shape bug in weights init * make style
-
hiromu authored
* start working
* fix: ReformerForQA doctest
* fix: ReformerModelWithLMHead doctest
* fix: ReformerModelForSC doctest
* fix: ReformerModelForMLM doctest
* add: documentation_tests.txt
* make fixup
* change: ReformerModelForSC doctest
* change: checkpoint
-
Joao Gante authored
-
Anmol Joshi authored
* Moved functions to pytorch_utils.py
* isort formatting
* Reverted tf changes
* isort, make fix-copies
* documentation fix
* Fixed Conv1D import
* Reverted research examples file
* backward compatibility for pytorch_utils
* missing import
* isort fix
-
Sylvain Gugger authored
-
Nicolas Patry authored
* Change the `chunk_iter` function to handle the subtle case where the last chunk would be ignored because all of its data is already contained in the `left_strided` data; the right striding on the previous item needs to be removed.
* Remove commented line.
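The edge case above can be illustrated with a simplified strided-chunking sketch; the helper name and parameters are illustrative, not the actual pipeline code.

```python
def chunk_with_stride(items, chunk_len, stride):
    # Illustrative sketch, not the real chunk_iter: emit windows of
    # `chunk_len` items, each overlapping its neighbour by `stride` items.
    step = chunk_len - stride
    chunks = []
    for start in range(0, len(items), step):
        chunk = items[start:start + chunk_len]
        # Subtle case: a trailing window whose content fits entirely inside
        # the previous window's right stride would only duplicate data.
        if chunks and len(chunk) <= stride:
            break
        chunks.append(chunk)
    return chunks
</antml>```

With 5 items, `chunk_len=4`, `stride=2`, the naive loop would emit a third window containing only item 4, which the second window already covers; the guard drops it.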
-
Anmol Joshi authored
* Updated assertions to exceptions
* updated assertions to exceptions
* bug fixes
* fix-copies
* Update modeling_ctrl.py
* Update src/transformers/models/ctrl/modeling_tf_ctrl.py
* Update src/transformers/models/gpt_neo/modeling_gpt_neo.py
* Update src/transformers/models/gptj/modeling_gptj.py
* Update src/transformers/models/gptj/modeling_tf_gptj.py
* Update modeling_led.py
* Update modeling_led.py
* Update modeling_led.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Shang Zhang authored
* add ort-trt benchmark script
* Update README.md
* ort version can be newer
* formatting
* specify ORT version
-
Heerak Son authored
Fix: use `args.config_name` instead of `args.model_name_or_path`.
-
smelm authored
This avoids an unnecessary call and prevents problems during initialization of class hierarchies.
Co-authored-by: Samuel Melm <samuel.melm@stud.uni-heidelberg.de>
-
Michael Chung authored
* First pass; all tests pass
* WIP
* Adding file to documentation tests
* Change the base model for the example in the doc test
* Fix code styling by running make fixup
* Called Style
* Reverted to the gpt2 model rather than distilgpt2, then used a token-classification model over a sequence model for the example
* Fix styling issue
* Hopefully ignores the formatting issue
Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
-
Patrick von Platen authored
-
- 11 Apr, 2022 14 commits
-
-
Zachary Mueller authored
Move declaration of log streams to before tests, so that results won't get compounded on top of each other
-
Yih-Dar authored
* add error message
* Use names in the error message
* allow ModelOutput
* rename to check_pt_tf_outputs and move outside
* fix style
* skip past_key_values in a better way
* Add comments
* improve code for label/loss
* make the logic clear by moving the ignore keys out
* fix _postprocessing_to_ignore
* fix _postprocessing_to_ignore: create new outputs from the remaining fields
* ignore past_key_values in TFGPT2 models for now
* make check_pt_tf_outputs better regarding names
* move check_pt_tf_models outside
* rename methods
* remove test_pt_tf_model_equivalence in TFCLIPModelTest
* Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence
* move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models
* Fix quality
* Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence
* Fix quality
* fix
* fix style
* Clean-up TFLEDModelTest.test_pt_tf_model_equivalence
* Fix quality
* add docstring
* improve comment
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* update
* batch_size -> text_batch_size
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Nicholas Broad authored
* private repo argument to trainer
* format
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
-
Zachary Mueller authored
Adds checkpoint prefixes to the .gitignore if `push_to_hub` is used along with `checkpointing_steps`.
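A sketch of that behaviour, with an assumed `step_*` checkpoint-folder naming scheme; the function name and layout are illustrative, not the actual library code.

```python
import os
import tempfile

def ignore_checkpoints(repo_dir, checkpoint_prefix="step"):
    # Illustrative sketch: append the checkpoint-folder pattern to the repo's
    # .gitignore so intermediate checkpoints are not pushed to the Hub.
    # The prefix is an assumption for this example.
    gitignore = os.path.join(repo_dir, ".gitignore")
    pattern = f"{checkpoint_prefix}_*"
    existing = ""
    if os.path.exists(gitignore):
        with open(gitignore) as f:
            existing = f.read()
    if pattern not in existing.splitlines():
        with open(gitignore, "a") as f:
            f.write(pattern + "\n")

# demo in a throwaway directory
repo = tempfile.mkdtemp()
ignore_checkpoints(repo)
ignore_checkpoints(repo)  # idempotent: the pattern is only written once
content = open(os.path.join(repo, ".gitignore")).read()
</antml>```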
-
Yih-Dar authored
* update _create_and_check_torchscript
* Enable test_torchscript
* clear_class_registry
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Suraj Patil authored
-
Ahmed Elnaggar authored
* Fix t5 shard on TPU Pods. The current script doesn't work properly on a TPU pod because the global batch is not divided correctly per host. This pull request fixes the issue by dividing the global batch among the hosts before it is sharded on each host.
* fix style
Co-authored-by: ahmed-elnaggar <ahmed.elnaggar@allianz.com>
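The per-host division described above amounts to index arithmetic like the following sketch; the function name is hypothetical, and a real JAX script would obtain the host index and count from `jax.process_index()` / `jax.process_count()`.

```python
def per_host_batch_bounds(global_batch_size, host_count, host_index):
    # Illustrative sketch: each host takes an equal contiguous slice of the
    # global batch, which it then shards further across its local devices.
    assert global_batch_size % host_count == 0, "global batch must divide evenly across hosts"
    per_host = global_batch_size // host_count
    start = host_index * per_host
    return start, start + per_host
</antml>```

Skipping this step makes every host consume the full global batch, which is the bug the fix addresses.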
-
Minh Chien Vu authored
* Add doctest BERT
* make fixup
* fix typo
* change checkpoints
* make fixup
* define doctest output value, update doctest for mobilebert
* solve fix-copies
* update QA target start index and end index
* change checkpoint for docs and reuse defined variable
* Update src/transformers/models/bert/modeling_tf_bert.py
* Apply suggestions from code review
* Apply suggestions from code review
* make fixup
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Patrick von Platen authored
-
Sadra authored
I create an archive of older checkpoints during training; the archive has a name of the form `f"{checkpoint_prefix}-*.zip/.tar"`. Previously, `glob(f"{checkpoint_prefix}-*")` picked up all files and folders starting with the checkpoint prefix, while the later `shutil.rmtree(checkpoint)` expects a folder name; since at some point it may get a zip file, training crashes. Adding the `if os.path.isdir(x)` filter keeps only folders in `glob_checkpoints`.
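A self-contained sketch of the fix described above; the helper name is illustrative, not the exact Trainer code.

```python
import glob
import os
import tempfile

def glob_checkpoint_dirs(output_dir, checkpoint_prefix="checkpoint"):
    # Illustrative sketch of the fix: the glob matches both checkpoint
    # folders and archived checkpoint-*.zip/.tar files, but shutil.rmtree
    # only accepts folders, so keep directories only.
    matches = glob.glob(os.path.join(output_dir, f"{checkpoint_prefix}-*"))
    return sorted(x for x in matches if os.path.isdir(x))

# demo: one real checkpoint folder plus an archived one
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "checkpoint-100"))
open(os.path.join(root, "checkpoint-50.zip"), "w").close()
found = glob_checkpoint_dirs(root)
</antml>```

Only the real folder survives the filter, so a later `shutil.rmtree` over the results cannot hit the zip archive.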
-