Commits · f94a52cd795fae57704367b3a248eeb343384f18 · chenpangpang / transformers

12 Aug, 2020 9 commits

[s2s] add BartTranslationDistiller for distilling mBART (#6363) · f94a52cd
Sam Shleifer authored Aug 12, 2020

f94a52cd

Adding PaddingDataCollator (#6442) · d2370e1b

Sylvain Gugger authored Aug 12, 2020

* Data collator with padding

* Add type annotation

* Support tensors as well

* Add comment

* Fix for labels wrong shape

* Remove changes rendered unnecessary

d2370e1b
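The bullets above describe a collator that pads a batch and keeps labels in the right shape. A minimal pure-Python sketch of that idea follows. It is not the code from #6442: the function name `pad_batch`, the `pad_token_id` default, and the `label` key are assumptions made for illustration.

```python
from typing import Any, Dict, List


def pad_batch(features: List[Dict[str, Any]], pad_token_id: int = 0) -> Dict[str, Any]:
    """Pad every `input_ids` list to the length of the longest one in the batch.

    Hypothetical helper: it only illustrates dynamic padding, not the
    library's actual collator.
    """
    max_len = max(len(f["input_ids"]) for f in features)
    batch: Dict[str, Any] = {"input_ids": [], "attention_mask": []}
    for f in features:
        ids = f["input_ids"]
        n_pad = max_len - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * n_pad)
        # 1 marks a real token, 0 marks a padding position.
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
    # Collect per-example labels into one flat list, one entry per example.
    if all("label" in f for f in features):
        batch["labels"] = [f["label"] for f in features]
    return batch


batch = pad_batch([
    {"input_ids": [5, 6, 7], "label": 1},
    {"input_ids": [8], "label": 0},
])
```

Padding to the longest example in each batch, rather than to a fixed maximum length, keeps short batches small.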

Fix #6428 (#6437) · 96c3329f
Sylvain Gugger authored Aug 12, 2020

96c3329f
Activate check on the CI (#6427) · a8db954c
Sylvain Gugger authored Aug 12, 2020
* Activate check on the CI

* Fix repo inconsistencies

* Don't document too much
a8db954c
Move prediction_loss_only to TrainingArguments (#6426) · 34fabe16
Sylvain Gugger authored Aug 12, 2020

34fabe16

Fixes to make life easier with the nlp library (#6423) · e9c30314

Sylvain Gugger authored Aug 12, 2020



* allow using tokenizer.pad as a collate_fn in pytorch

* Add documentation and tests

* Make attention mask the right shape

* Better test
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

e9c30314

[test] replace capsys with the more refined CaptureStderr/CaptureStdout (#6422) · 87b35943

Stas Bekman authored Aug 12, 2020



* replace capsys with the more refined CaptureStderr/CaptureStdout

* Update examples/seq2seq/test_seq2seq_examples.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

87b35943

Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttent… (#4323) · ac5bcf23

Jared T Nielsen authored Aug 12, 2020

* Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttention into two separate dropout layers.

* Same dropout fixes for PyTorch.

ac5bcf23

Disabled pabee test (#6431) · 4ffea5ce
Lysandre Debut authored Aug 12, 2020

4ffea5ce

11 Aug, 2020 28 commits

[model_card] rohanrajpal/bert-base-codemixed-uncased-sentiment (#6324) · 155288f0

Rohan Rajpal authored Aug 12, 2020



* Create README.md

* Update model_cards/rohanrajpal/bert-base-codemixed-uncased-sentiment/README.md

* Update model_cards/rohanrajpal/bert-base-codemixed-uncased-sentiment/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

155288f0

Create model card T5-base fine-tuned on event2Mind for Intent Prediction (#6412) · 4e6245fc
Manuel Romero authored Aug 12, 2020

4e6245fc
Create README.md (#6381) · 46e3a0a6
Manuel Romero authored Aug 12, 2020

46e3a0a6
Create README.md (#6378) · 31dfde74
Manuel Romero authored Aug 12, 2020

31dfde74
Add metadata to be indexed properly (#6380) · 25e29150
Manuel Romero authored Aug 12, 2020

25e29150
Change metadata to be indexed correctly (#6379) · 471be5f2
Manuel Romero authored Aug 12, 2020

471be5f2

Create README.md (#6346) · 42ee0bc6

Rohan Rajpal authored Aug 12, 2020



* Create README.md

* add results on SAIL dataset

* Update model_cards/rohanrajpal/bert-base-multilingual-codemixed-cased-sentiment/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

42ee0bc6

[examples] add pytest dependency (#6425) · 3f071c4b
Sam Shleifer authored Aug 11, 2020

3f071c4b

lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361) · ece0903e

Stas Bekman authored Aug 11, 2020



* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* [model_cards] electra-base-turkish-cased-ner (#6350)

* for electra-base-turkish-cased-ner

* Add metadata
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Temporarily de-activate TPU CI

* Update modeling_tf_utils.py (#6372)

fix typo: ckeckpoint->checkpoint

* the test now works again (#6371)

* correct pl link in readme (#6364)

* refactor almost identical tests (#6339)

* refactor almost identical tests

* important to add a clear assert error message

* make the assert error even more descriptive than the original bt

* Small docfile fixes (#6328)

* Patch models (#6326)

* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info



* Update check_repo

* CI GitHub caching (#6382)

* Cache GitHub Actions CI

* Remove useless file

* Colab button (#6389)

* Add colab button

* Add colab link for tutorials

* Fix links for open in colab (#6391)

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove dup (leftover from merge)

* convert the test into the new refactored format

* stick to using the current_step as is, without ++
Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Alexander Measure <ameasure@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

ece0903e
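For readers unfamiliar with the schedule named in this commit, here is a hedged pure-Python sketch of a polynomial decay with linear warmup. It returns the learning rate itself rather than a multiplier, and it is an illustration rather than the code merged in #6361. Only the `lr_end=1e-7` default comes from the log above; the function name, the other parameter names, and the input check are assumptions.

```python
def polynomial_decay_lr(step: int,
                        lr_init: float,
                        num_warmup_steps: int,
                        num_training_steps: int,
                        lr_end: float = 1e-7,
                        power: float = 1.0) -> float:
    """Learning rate at `step`: linear warmup, then polynomial decay to `lr_end`."""
    # Assumed sanity check: the schedule only makes sense if it decays downwards.
    if lr_init <= lr_end:
        raise ValueError("lr_end must be smaller than lr_init")
    if step < num_warmup_steps:
        # Linear ramp from 0 up to lr_init.
        return lr_init * step / max(1, num_warmup_steps)
    if step > num_training_steps:
        # Past the end of training the rate stays at its floor.
        return lr_end
    decay_steps = max(1, num_training_steps - num_warmup_steps)
    pct_remaining = 1 - (step - num_warmup_steps) / decay_steps
    return (lr_init - lr_end) * pct_remaining ** power + lr_end


# The step counter is used as-is (no +1 offset), so step 0 yields a rate of 0.
schedule = [polynomial_decay_lr(s, 1e-3, 10, 100) for s in range(0, 101, 10)]
```

With `power=1.0` the decay is linear; larger values drop the rate faster early in training.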

Create README.md (#6386) · 6c87b73d
cedspam authored Aug 11, 2020
* Create README.md

* Update README.md
6c87b73d
[pl] restore lr logging behavior for glue, ner examples (#6314) · 0203d651
Stas Bekman authored Aug 11, 2020

0203d651
rename prepare_translation_batch -> prepare_seq2seq_batch (#6103) · be1520d3
Sam Shleifer authored Aug 11, 2020

be1520d3
PegasusForConditionalGeneration (torch version) (#6340) · 66fa8cea
Sam Shleifer authored Aug 11, 2020
Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>
66fa8cea
[s2s] wmt download script use less ram (#6405) · f6cb0f80
Stas Bekman authored Aug 11, 2020

f6cb0f80
pl version: examples/requirements.txt is single source of truth (#6309) · 7c6a085e
Stas Bekman authored Aug 11, 2020

7c6a085e
Create Model Card File (#6357) · 1d1d5bec
Pranav Vadrevu authored Aug 11, 2020

1d1d5bec

Create README.md (#6413) · 00ce881c

Abed khooli authored Aug 11, 2020

* Create README.md

Model card for https://huggingface.co/akhooli/gpt2-small-arabic



* Update model_cards/akhooli/gpt2-small-arabic/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

00ce881c

switch Hindi-BERT to S3 README (#6396) · 3ae30787
Nick Doiron authored Aug 11, 2020

3ae30787

Create README.md (#6397) · 824e651e

Abed khooli authored Aug 11, 2020



* Create README.md

* Update model_cards/akhooli/gpt2-small-arabic-poetry/README.md

* Update model_cards/akhooli/gpt2-small-arabic-poetry/README.md

* Update model_cards/akhooli/gpt2-small-arabic-poetry/README.md

* Update model_cards/akhooli/gpt2-small-arabic-poetry/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

824e651e

[Performance improvement] "Bad tokens ids" optimization (#6064) · 40478291

guillaume-be authored Aug 11, 2020

* Optimized banned token masking

* Avoid duplicate EOS masking if in bad_words_id

* Updated mask generation to handle empty banned token list

* Addition of unit tests for the updated bad_words_ids masking

* Updated timeout handling in `test_postprocess_next_token_scores_large_bad_words_list` unit test

* Updated timeout handling in `test_postprocess_next_token_scores_large_bad_words_list` unit test (timeout does not work on Windows)

* Moving Marian import to the test context to allow TF only environments to run

* Moving imports to torch_available test

* Updated operations device and test

* Added docstring and comment for in-place scores modification

* Moving test to own test_generation_utils, use of lighter models for testing

* removed unneeded imports in test_modeling_common

* revert formatting change for ModelTesterMixin

* Updated caching, simplified eos token id test, removed unnecessary @require_torch

* formatting compliance

40478291
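The commit above concerns masking banned tokens during generation. The sketch below illustrates only the masking rule, in plain Python lists. It is not the optimized implementation from #6064, and the function names are hypothetical. It assumes every banned sequence is non-empty, and that a sequence bans its last token whenever the tokens generated so far end with the rest of that sequence.

```python
import math
from typing import List


def banned_tokens_for(prev_tokens: List[int], bad_words_ids: List[List[int]]) -> List[int]:
    """Return the token ids that must not be generated next for one sequence."""
    banned = []
    for seq in bad_words_ids:
        prefix, last = seq[:-1], seq[-1]
        if len(prefix) > len(prev_tokens):
            continue  # not enough history to complete this banned sequence
        # A single-token sequence has an empty prefix and is always banned.
        if not prefix or prev_tokens[-len(prefix):] == prefix:
            banned.append(last)
    return banned


def mask_banned(scores: List[List[float]],
                prev_tokens_batch: List[List[int]],
                bad_words_ids: List[List[int]]) -> None:
    """Set the scores of banned tokens to -inf. Modifies `scores` in place."""
    for row, prev_tokens in zip(scores, prev_tokens_batch):
        # The set avoids masking the same token twice; an empty ban list is a no-op.
        for token_id in set(banned_tokens_for(prev_tokens, bad_words_ids)):
            row[token_id] = -math.inf


scores = [[0.0] * 5, [0.0] * 5]
mask_banned(scores, [[1, 2], [3]], [[2, 4], [0]])
```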

Warn if debug requested without TPU fixes (#6308) (#6390) · 87e124c2

David LaPalomento authored Aug 11, 2020



* Warn if debug requested without TPU fixes (#6308)
Check whether a PyTorch-compatible TPU is available before attempting to print TPU metrics after training has completed. This way, users who apply `--debug` without reading the documentation aren't surprised by a stack trace.

* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

87e124c2

Fix tokenizer saving and loading error (#6026) · cdf1f7ed

Junyuan Zheng authored Aug 11, 2020



* fix tokenizer saving and loading bugs when adding AddedToken to additional special tokens

* Add tokenizer test

* Style

* Style 2
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

cdf1f7ed

testing utils: capturing std streams context manager (#6231) · 83984a61

Stas Bekman authored Aug 11, 2020

* testing utils: capturing std streams context manager

* style

* missing import

* add the origin of this code

83984a61
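As a rough illustration of what a context manager for capturing std streams can look like, here is a minimal sketch built only on the standard library. The class name `CaptureStream` and its `out` attribute are hypothetical; this is not the helper added in #6231.

```python
import io
import sys


class CaptureStream:
    """Temporarily replace sys.stdout or sys.stderr and record what is written."""

    def __init__(self, stream_name: str):
        if stream_name not in ("stdout", "stderr"):
            raise ValueError("stream_name must be 'stdout' or 'stderr'")
        self.stream_name = stream_name
        self.out = ""

    def __enter__(self):
        self._original = getattr(sys, self.stream_name)
        self._buffer = io.StringIO()
        setattr(sys, self.stream_name, self._buffer)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always restore the real stream, even if the body raised.
        setattr(sys, self.stream_name, self._original)
        self.out = self._buffer.getvalue()
        return False  # do not swallow exceptions


with CaptureStream("stderr") as cs:
    print("warning: something happened", file=sys.stderr)
assert "warning" in cs.out
```

Unlike a test-framework fixture, a context manager of this kind captures only the code inside the `with` block, so a test can inspect the output of one specific call.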

add pl_glue example test (#6034) · f6c0680d

Stas Bekman authored Aug 11, 2020

* add pl_glue example test

* for now just test that it runs, next validate results of eval or predict?

* complete the run_pl_glue test to validate the actual outcome

* worked on my machine, CI gets less accuracy - trying higher epochs

* match run_pl.sh hparams

* more epochs?

* trying higher lr

* for now just test that the script runs to completion

* correct the comment

* if cuda is available, add --fp16 --gpus=1 to cover more bases

* style

f6c0680d

Feed forward chunking (#6024) · b25cec13

Pradhy729 authored Aug 11, 2020



* Chunked feed forward for Bert

This is an initial implementation to test applying feed forward chunking for BERT.
Will need additional modifications based on output and benchmark results.

* Black and cleanup

* Feed forward chunking in BertLayer class.

* Isort

* add chunking for all models

* fix docs

* Fix typo
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

b25cec13
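The idea behind feed forward chunking can be sketched without any framework: a position-wise feed-forward function treats each sequence position independently, so applying it to slices of the sequence and concatenating the results gives the same output while lowering peak memory. The helper below is a hypothetical illustration, not the code from #6024. Treating `chunk_size == 0` as "no chunking" is an assumption of this sketch.

```python
from typing import Callable, List

Vector = List[float]


def apply_chunked(forward_fn: Callable[[List[Vector]], List[Vector]],
                  hidden_states: List[Vector],
                  chunk_size: int) -> List[Vector]:
    """Apply `forward_fn` to slices of the sequence and concatenate the results."""
    if chunk_size == 0:
        return forward_fn(hidden_states)
    output: List[Vector] = []
    for start in range(0, len(hidden_states), chunk_size):
        # Only `chunk_size` positions are processed at a time.
        output.extend(forward_fn(hidden_states[start:start + chunk_size]))
    return output


def toy_feed_forward(positions: List[Vector]) -> List[Vector]:
    """Stand-in for a feed-forward layer: acts on each position independently."""
    return [[2.0 * x + 1.0 for x in vec] for vec in positions]


sequence = [[float(i), float(i) + 0.5] for i in range(5)]
assert apply_chunked(toy_feed_forward, sequence, 2) == toy_feed_forward(sequence)
```

The equivalence holds only because the function does not mix information across positions; attention layers cannot be chunked this way.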

Add TPU testing once again · 8a3db6b3
Lysandre authored Aug 11, 2020

8a3db6b3
Add missing docker arg for TPU CI. (#6393) · f65ac1fa
zcain117 authored Aug 10, 2020

f65ac1fa
[s2s] Script to save wmt data to disk (#6403) · b9ecd92e
Sam Shleifer authored Aug 10, 2020

b9ecd92e

10 Aug, 2020 3 commits

TF Longformer (#5764) · 00bb0b25

Patrick von Platen authored Aug 10, 2020



* improve names and tests longformer

* more and better tests for longformer

* add first tf test

* finalize tf basic op functions

* fix merge

* tf shape test passes

* narrow down discrepancies

* make longformer local attn tf work

* correct tf longformer

* add first global attn function

* add more global longformer func

* advance tf longformer

* finish global attn

* upload big model

* finish all tests

* correct false any statement

* fix common tests

* make all tests pass except keras save load

* fix some tests

* fix torch test import

* finish tests

* fix test

* fix torch tf tests

* add docs

* finish docs

* Update src/transformers/modeling_longformer.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_longformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply Lysandre's suggestions

* reverse to assert statement because function will fail otherwise

* applying Sylvain's recommendations

* Update src/transformers/modeling_longformer.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update src/transformers/modeling_tf_longformer.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

00bb0b25

[EncoderDecoderModel] add a `add_cross_attention` boolean to config (#6377) · 34259366
Patrick von Platen authored Aug 10, 2020
* correct encoder decoder model

* Apply suggestions from code review

* apply Sylvain's suggestions
34259366
Fix links for open in colab (#6391) · 06bc347c
Sylvain Gugger authored Aug 10, 2020

06bc347c