- 26 May, 2021 4 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * change dataclasses to flax ones * fix typo * fix jitted tests * fix bert & electra
-
talkhaldi authored
* Correcting comments to reflect the correct tuple order In order to match the actual order (lines 513 and 516, and as accessed in line 968), I've changed the order mentioned in comments L962 and L966-967. * Update modeling_t5.py Updating another comment as well * Removing extra space * Fixing style and quality * style & quality * Update src/transformers/models/t5/modeling_t5.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Daniel Stancl authored
* Fix Bart * Fix Blenderbot{,_small} * Fix LED * Fix Marian * Fix MBart * Fix Pegasus * Fix T5 * Add test for generation with head_mask * Add a common TF test * Override a test for the LED model as head masking is not yet properly implemented * Remove all head_masks from input preparation for LED * Drop masking for T5 as it needs a bit of refactoring
-
francescorubbo authored
The feature extractor does not create tensors on the appropriate device, so we call `ensure_tensor_on_device` before feeding the processed inputs to the model.
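A minimal sketch of the device-placement fix described above (not the exact pipeline code): the feature extractor returns CPU tensors, so they are moved to the model's device before the forward pass. The helper name and usage below are illustrative assumptions.

```python
import torch

def move_to_device(inputs: dict, device: torch.device) -> dict:
    """Move every tensor in a feature-extractor output dict to `device`."""
    return {
        name: value.to(device) if isinstance(value, torch.Tensor) else value
        for name, value in inputs.items()
    }

# Hypothetical usage inside a pipeline:
# processed = feature_extractor(raw_inputs, return_tensors="pt")
# processed = move_to_device(dict(processed), model.device)
# outputs = model(**processed)
```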
-
- 25 May, 2021 9 commits
-
-
Ahmet Akkoç authored
-
Stas Bekman authored
* create custom model on the fly * better wording * add update_from_string * cleanup * cleanup * Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more bool options * style * fix logger * add test * add the doc * assert on conflict of options Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
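A usage sketch for the new `update_from_string` helper on configurations, assuming the comma-separated "key=value" format described in the commit (booleans and numbers are parsed from their string form); the specific GPT-2 attributes below are just an example.

```python
from transformers import GPT2Config

config = GPT2Config()
# Override several config fields at once from a single string.
config.update_from_string("n_layer=2,resid_pdrop=0.2,summary_type=cls_index")
print(config.n_layer, config.resid_pdrop, config.summary_type)  # 2 0.2 cls_index
```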
-
Stas Bekman authored
* fix overflow in perplexity calc * use inf * fix
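A minimal sketch of the overflow guard described above: perplexity is exp(mean cross-entropy loss), and `math.exp` raises `OverflowError` for large losses, so the value falls back to infinity instead of crashing.

```python
import math

def perplexity(mean_nll: float) -> float:
    """Perplexity from a mean negative log-likelihood, guarding against overflow."""
    try:
        return math.exp(mean_nll)
    except OverflowError:
        return float("inf")

print(perplexity(2.3))   # ~9.97
print(perplexity(1e4))   # inf
```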
-
Patrick von Platen authored
* first try * finish
-
Sylvain Gugger authored
* Add option to log only once in multinode training * Use an alternate property
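A hedged sketch of the multi-node logging option: assuming it corresponds to the `log_on_each_node` flag of `TrainingArguments`, setting it to False makes only the main node of a multi-node run emit logs.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    log_on_each_node=False,  # log only on the main node instead of every node
)
```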
-
Wang Ran (汪然) authored
-
Shiro T authored
-
Lysandre Debut authored
-
Lysandre Debut authored
-
- 24 May, 2021 7 commits
-
-
Sylvain Gugger authored
* [Trainer] Report both steps and num samples per second * Fix batch number * Update src/transformers/trainer_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Address review comments Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Nick Lane-Smith authored
* typo2 * fix typo
-
Teven authored
* fixing flos bug/typo in non-distributed setting * storing flos every logging_interval
-
Sylvain Gugger authored
* Switch mem metrics flag * Update src/transformers/training_args.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Sylvain Gugger authored
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * change pytorch import to flax import
-
Lysandre Debut authored
-
- 22 May, 2021 1 commit
-
-
ctheodoris authored
get_length_grouped_indices() in LengthGroupedSampler and DistributedLengthGroupedSampler is prohibitively slow for a large number of megabatches (in the test case it takes hours for ~270k megabatches with 100 items each) due to slow list concatenation with sum(megabatches, []). Resolves: #11795 Co-authored-by: ctheodoris <cvtheodo@ds.dfci.harvard.edu>
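A sketch of the quadratic-time pattern described above and one possible linear-time replacement (not necessarily the exact fix merged here): `sum(megabatches, [])` rebuilds the accumulated list on every addition, while `itertools.chain` flattens in a single pass.

```python
from itertools import chain
from typing import List

megabatches: List[List[int]] = [[i] * 100 for i in range(1000)]

slow = sum(megabatches, [])                    # O(n^2) list concatenation
fast = list(chain.from_iterable(megabatches))  # O(n) flattening
assert slow == fast
```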
-
- 21 May, 2021 7 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * add flax glue link
-
Stas Bekman authored
* support zero.Init in from_config * no need for eval test
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * correct best seed for flax fine-tuning Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Sylvain Gugger authored
-
yujun authored
-
Lysandre Debut authored
-
Patrick von Platen authored
* speed up flax glue * remove unnecessary line * remove folder * remove run in loop Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
- 20 May, 2021 6 commits
-
-
Keren Fuentes authored
* add separator for windows * fixes test_is_copy_consistent on Windows * fixing writing encoding issue on extended test (for Windows) * resolving comments
-
Michael Benayoun authored
A cleaner and more scalable implementation of symbolic tracing with torch.fx, adding support for new architectures: - ALBERT - DistilBERT - MobileBERT - MegatronBERT - GPT2 - GPT Neo Co-authored-by: Michael Benayoun <michael@huggingface.co>
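A hedged usage sketch of the torch.fx support described above, assuming the `symbolic_trace` helper in `transformers.utils.fx` accepts a model and the input names to trace with (the exact signature may differ by version).

```python
from transformers import BertConfig, BertForSequenceClassification
from transformers.utils.fx import symbolic_trace

# Randomly initialized model, just to demonstrate tracing.
model = BertForSequenceClassification(BertConfig())
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
print(traced.graph)  # the recorded torch.fx graph of the forward pass
```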
-
Sylvain Gugger authored
* Fix regression in regression * Add test
-
Sylvain Gugger authored
-
yujun authored
* add roformer * Update docs/source/model_doc/roformer.rst Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update docs/source/model_doc/roformer.rst Co-authored-by: Suraj Patil <surajp815@gmail.com> * update * add TFRoFormerSinusoidalPositionalEmbedding and fix TFMarianSinusoidalPositionalEmbedding * update docs * make style and make quality * rollback * unchanged * rm copies from, this is an error in TFMarianSinusoidalPositionalEmbedding * update Copyright year * move # Add modeling imports here to the correct position * max_position_embeddings can be set to 1536 * # Copied from transformers.models.bert.modeling_bert.BertOutput with Bert->RoFormer * # Copied from transformers.models.bert.modeling_bert.BertLayer.__init__ with Bert->RoFormer * update tokenization_roformer * make style * add staticmethod apply_rotary_position_embeddings * add TF staticmethod apply_rotary_position_embeddings * update torch apply_rotary_position_embeddings * fix tf apply_rotary_position_embeddings error * make style * add pytorch RoFormerSelfAttentionRotaryPositionEmbeddingTest * add TF rotary_position_embeddings test * update test_modeling_roformer * Update docs/source/model_doc/roformer.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/convert_roformer_original_tf_checkpoint_to_pytorch.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/modeling_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/modeling_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/modeling_tf_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor roformer tokenizer * add RoFormerTokenizerFast * add RoFormerTokenizationTest * add require_jieba * update Copyright * update tokenizer & add copy from * add option rotary_value * use rust jieba * use rjieba * use rust jieba * fix test_alignement_methods * slice normalized_string is too slow * add config.embedding_size when embedding_size!=hidden_size * fix pickle tokenizer * Update docs/source/model_doc/roformer.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style and make quality Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
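A minimal sketch of rotary position embeddings in the spirit of the `apply_rotary_position_embeddings` helpers added here (illustrative, not the exact library code): each pair of feature dimensions (x_{2i}, x_{2i+1}) is rotated by a position-dependent angle built from sinusoids.

```python
import torch

def rotary_embed(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """x: (batch, seq_len, dim) with even dim; returns the rotated tensor."""
    batch, seq_len, dim = x.shape
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))        # (dim/2,)
    angles = torch.arange(seq_len).float()[:, None] * inv_freq[None, :]       # (seq_len, dim/2)
    sin, cos = angles.sin(), angles.cos()
    x_even, x_odd = x[..., 0::2], x[..., 1::2]                                # feature pairs
    rotated_even = x_even * cos - x_odd * sin
    rotated_odd = x_even * sin + x_odd * cos
    # Re-interleave the rotated pairs back into the original layout.
    return torch.stack([rotated_even, rotated_odd], dim=-1).reshape(batch, seq_len, dim)

q = torch.randn(1, 8, 64)
print(rotary_embed(q).shape)  # torch.Size([1, 8, 64])
```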
-
Lysandre Debut authored
-
- 19 May, 2021 3 commits
-
-
Albert Villanova del Moral authored
-
Patrick von Platen authored
* refactor * update * update * update * refactor run mlm * finalize * refactor more * fix typo * update * finish refactor * modify run mlm * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * small fixes * upload * upload * finish run mlm script Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @
-
- 18 May, 2021 3 commits
-
-
Daniel Stancl authored
* Add missing head masking for generate() function * Add head_mask, decoder_head_mask and cross_attn_head_mask into prepare_inputs_for_generation for generate() function for multiple encoder-decoder models. * Add test_genereate_with_head_masking * [WIP] Update the new test and handle special cases * make style * Omit ProphetNet test so far * make fix-copies
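A hedged usage sketch of head masking during generation, assuming head_mask / decoder_head_mask tensors of shape (num_layers, num_heads) with 1 keeping a head and 0 masking it, forwarded through `generate()` as model kwargs; the T5 checkpoint below is only an example.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

head_mask = torch.ones(model.config.num_layers, model.config.num_heads)
head_mask[0, 0] = 0.0  # mask the first head of the first layer

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
out = model.generate(**inputs, head_mask=head_mask, decoder_head_mask=head_mask)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```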
-
Suraj Patil authored
* flax gpt2 * combine masks * handle shared embeds * add causal LM sample * style * add tests * style * fix imports, docs, quality * don't use cache * add cache * add cache 1st version * make use cache work * start adding test for generation * finish generation loop compilation * rewrite test * finish * update * update * apply Sylvain's suggestions * update * refactor * fix typo Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
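A hedged usage sketch of the new Flax GPT-2 model: assuming Flax weights are available on the Hub for "gpt2" (otherwise `from_pt=True` converts the PyTorch checkpoint), this runs a forward pass and reads the next-token logits.

```python
import jax.numpy as jnp
from transformers import FlaxGPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = FlaxGPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="np")
outputs = model(**inputs)
next_token = jnp.argmax(outputs.logits[0, -1])  # greedy next-token prediction
print(tokenizer.decode([int(next_token)]))
```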
-
Tomy Hsieh authored
-