Commits · 206f06f2ddcb8ccfc52d13612a7d41219e41e932 · chenpangpang / transformers

20 May, 2021 2 commits

Add new model RoFormer (use rotary position embedding) (#11684) · 206f06f2

yujun authored May 20, 2021



* add roformer

* Update docs/source/model_doc/roformer.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/roformer.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* update

* add TFRoFormerSinusoidalPositionalEmbedding and fix TFMarianSinusoidalPositionalEmbedding

* update docs

* make style and make quality

* rollback

* unchanged

* rm copies from; this is an error in TFMarianSinusoidalPositionalEmbedding

* update Copyright year

* move # Add modeling imports here to the correct position

* max_position_embeddings can be set to 1536

* # Copied from transformers.models.bert.modeling_bert.BertOutput with Bert->RoFormer

* # Copied from transformers.models.bert.modeling_bert.BertLayer.__init__ with Bert->RoFormer

* update tokenization_roformer

* make style

* add staticmethod apply_rotary_position_embeddings

* add TF staticmethod apply_rotary_position_em...

206f06f2
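
The RoFormer commit above names rotary position embedding as the model's distinguishing feature. As a rough illustration of the idea only, and not the code added in that commit, the NumPy sketch below rotates each (even, odd) feature pair of a vector by an angle proportional to its position. The function names, the interleaved pairing and the `base=10000.0` default are assumptions made for this example.

```python
import numpy as np


def rotary_angles(seq_len, dim, base=10000.0):
    """Angles position * base**(-2i/dim) for each feature pair i (hypothetical helper)."""
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)   # shape [dim / 2]
    positions = np.arange(seq_len)[:, None]            # shape [seq_len, 1]
    return positions * inv_freq[None, :]               # shape [seq_len, dim / 2]


def apply_rotary(x, angles):
    """Rotate every (even, odd) feature pair of x ([seq_len, dim]) by its angle."""
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out


# Same query/key content at every position, so only the position varies.
rng = np.random.default_rng(0)
q = np.tile(rng.normal(size=8), (8, 1))
k = np.tile(rng.normal(size=8), (8, 1))
angles = rotary_angles(seq_len=8, dim=8)
q_rot, k_rot = apply_rotary(q, angles), apply_rotary(k, angles)

# The attention score depends only on the relative offset (here 2 in both cases).
print(np.isclose(q_rot[1] @ k_rot[3], q_rot[4] @ k_rot[6]))
```

Because each pair is rotated rather than shifted, vector norms are unchanged and the dot product between a rotated query and key depends only on how far apart their positions are.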

Deprecate commands from the transformers-cli that are in the hf-cli (#11779) · 075fdab4
Lysandre Debut authored May 20, 2021

075fdab4

19 May, 2021 3 commits

Add DOI badge to README (#11771) · 2582e59a
Albert Villanova del Moral authored May 19, 2021

2582e59a

[Flax MLM] Refactor run mlm with optax (#11745) · 00440e35

Patrick von Platen authored May 19, 2021



* refactor

* update

* update

* update

* refactor run mlm

* finalize

* refactor more

* fix typo

* update

* finish refactor

* modify run mlm

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* small fixes

* upload

* upload

* finish run mlm script
Co-authored-by: Patrick von Platen <patrick@huggingface.co>

00440e35

[T5 failing CI] Fix generate test (#11770) · 43891be1
Patrick von Platen authored May 19, 2021
```
* fix_torch_device_generate_test

* remove @
```
43891be1

18 May, 2021 10 commits

Fix usage of head masks by PT encoder-decoder models' `generate()` function (#11621) · 680d181c

Daniel Stancl authored May 19, 2021

* Add missing head masking for generate() function

* Add head_mask, decoder_head_mask and cross_attn_head_mask
into prepare_inputs_for_generation for generate() function
for multiple encoder-decoder models.

* Add test_generate_with_head_masking

* [WIP] Update the new test and handle special cases

* make style

* Omit ProphetNet test so far

* make fix-copies

680d181c

FlaxGPT2 (#11556) · ca33278f

Suraj Patil authored May 19, 2021



* flax gpt2

* combine masks

* handle shared embeds

* add causal LM sample

* style

* add tests

* style

* fix imports, docs, quality

* don't use cache

* add cache

* add cache 1st version

* make use cache work

* start adding test for generation

* finish generation loop compilation

* rewrite test

* finish

* update

* update

* apply Sylvain's suggestions

* update

* refactor

* fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

ca33278f

Fix a small error in summarization example (#11762) · eb3e072a
Tomy Hsieh authored May 19, 2021

eb3e072a

Add Flax Examples and Cloud TPU README (#11753) · 77f9bd18

Avital Oliver authored May 18, 2021



* Add Flax Examples README

* Apply suggestions from code review

* Update examples/flax/README.md

* add nice table

* fix

* fix

* apply suggestions

* upload

* finish flax readme.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

77f9bd18

add `dataset_name` to data_args and added accuracy metric (#11760) · 04e25c62

Philipp Schmid authored May 18, 2021

* add `dataset_name` to data_args and added accuracy metric

* added documentation for dataset_name

* spelling correction

04e25c62

Fixed: Better names for nlp variables in pipelines' tests and docs. (#11752) · fd3b12e8
Vyom Pathak authored May 18, 2021
```
* Fixed: Better names for nlp variables in pipelines' tests and docs.

* Fixed: Better variable names
```
fd3b12e8

Add more subsections to main doc (#11758) · cebb96f5
Patrick von Platen authored May 18, 2021
```
* add headers to main doc

* Apply suggestions from code review

* update

* upload
```
cebb96f5

Fix incorrect newline in #11650 (#11757) · da7e73b7
Tommy Chiang authored May 18, 2021

da7e73b7

Fix checkpoint deletion (#11748) · a515caa3
Sylvain Gugger authored May 18, 2021

a515caa3

[TokenClassification] Label realignment for subword aggregation (#11680) · b88e0e01

Nicolas Patry authored May 18, 2021

* [TokenClassification] Label realignment for subword aggregation

Attempt to replace https://github.com/huggingface/transformers/pull/11622/files



- Added `AggregationStrategy`.
- The `ignore_subwords` and `grouped_entities` arguments are now fused
  into `aggregation_strategy`. This makes more sense because
  `ignore_subwords=True` with `grouped_entities=False` had no
  meaning anyway.
- Added two new aggregation strategies: MAX and AVERAGE.
- AVERAGE requires a bit more information than the others; for now this
  case is slightly specific, and we should keep that in mind for future
  changes.
- Testing has been modified to reflect the new argument, and to check the
  correct deprecation and the new `aggregation_strategy`.
- Put the testing argument and testing results for `aggregation_strategy`
  close together, so that readers can understand what is supposed to
  happen.
- `aggregate` is now only tested on a small model, as it does not mean
  anything to test it globally for all models.
- Previous tests are unchanged in desired output.
- Added a new test case that better showcases the difference between the
  FIRST, MAX and AVERAGE strategies.

* Wrong framework.

* Addressing three issues.

1- Tags might not follow the B-, I- convention, so any tag should work now
(assumed to be B-TAG).
2- Fixed an issue with AVERAGE that led to a substantial code change.
3- The testing suite was not checking for the "index" key for the "none"
strategy. This is now fixed.

The issue is that "O" could not be chosen by the AVERAGE strategy because
those tokens were filtered out beforehand, so their relative scores were
not counted in the average. Filtering on ignore_labels now happens at
the very end of the pipeline, which fixes that issue.
It is a bit hard to make sure this stays that way because we do
not have an end-to-end test for that behavior.

* Formatting.

* Adding formatting to code + cleaner handling of B-, I- tags.
Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>

* Typo.
Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>

b88e0e01
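
The commit above describes the FIRST, MAX and AVERAGE strategies and explains why filtering on `ignore_labels` has to happen after aggregation. The plain-Python sketch below walks through that reasoning on one word split into three subwords. It is not the pipeline's code: the `Strategy` enum, the helper functions and the scores are hypothetical, and MAX is assumed here to mean "use the subword with the single most confident prediction".

```python
from enum import Enum


class Strategy(Enum):
    """Hypothetical stand-in for the pipeline's AggregationStrategy."""
    FIRST = "first"
    MAX = "max"
    AVERAGE = "average"


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def aggregate_word(subword_scores, labels, strategy):
    """Choose one label for a word from its per-subword label scores."""
    if strategy is Strategy.FIRST:
        scores = subword_scores[0]                 # trust the first subword only
    elif strategy is Strategy.MAX:
        scores = max(subword_scores, key=max)      # most confident subword wins
    elif strategy is Strategy.AVERAGE:
        n = len(subword_scores)
        scores = [sum(column) / n for column in zip(*subword_scores)]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return labels[argmax(scores)]


def average_with_early_filter(subword_scores, labels, ignore_labels=("O",)):
    """Problematic ordering: drop ignored subwords first, then average the rest."""
    kept = [s for s in subword_scores if labels[argmax(s)] not in ignore_labels]
    return aggregate_word(kept, labels, Strategy.AVERAGE) if kept else None


def average_with_late_filter(subword_scores, labels, ignore_labels=("O",)):
    """Fixed ordering: average over all subwords, then apply ignore_labels."""
    label = aggregate_word(subword_scores, labels, Strategy.AVERAGE)
    return None if label in ignore_labels else label


LABELS = ["O", "B-PER", "B-LOC"]
WORD = [
    [0.3, 0.6, 0.1],  # subword 1 leans B-PER
    [0.1, 0.1, 0.8],  # subword 2 is confidently B-LOC
    [0.7, 0.2, 0.1],  # subword 3 leans O
]

print(aggregate_word(WORD, LABELS, Strategy.FIRST))    # B-PER
print(aggregate_word(WORD, LABELS, Strategy.MAX))      # B-LOC
print(aggregate_word(WORD, LABELS, Strategy.AVERAGE))  # O
print(average_with_early_filter(WORD, LABELS))         # B-LOC
print(average_with_late_filter(WORD, LABELS))          # None
```

With early filtering, the third subword never contributes its "O" score, so the word is reported as an entity even though the full average favours "O". Filtering last lets "O" win and the word is dropped.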

17 May, 2021 7 commits
- push (#11750) · c73e3532
  Patrick von Platen authored May 17, 2021
  
  c73e3532
- Use new evaluation loop in TrainerQA (#11746) · 936b5715
  Sylvain Gugger authored May 17, 2021
  
  936b5715
- [BigBird Pegasus] Make tests faster (#11744) · 73893fc7
  Patrick von Platen authored May 17, 2021
```
* improve tests

* remove bogus file

* make style
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
```
  73893fc7
- fixed shape issue for T5 tracing (#11742) · a0531c8a
  Michael Benayoun authored May 17, 2021
```
Co-authored-by: Michael Benayoun <michael@huggingface.co>
```
  a0531c8a
- Add visual + link to Premium Support webpage (#11740) · 0fc56df5
  Julien Chaumond authored May 17, 2021
```
* Update README.md

* Update index.rst
```
  0fc56df5
- Remove tapas model card (#11739) · 2f88bd9c
  Julien Chaumond authored May 17, 2021
  
  2f88bd9c
- Improvements to Flax finetuning script (#11727) · 726e953d
  Marc van Zee authored May 17, 2021
```
* Add Cloud details to README

* Flax script and readme updates

* Some simplifications of Flax script
```
  726e953d

14 May, 2021 4 commits
- Experimental symbolic tracing feature with torch.fx for BERT, ELECTRA and T5 (#11475) · 86d5fb0b
  Michael Benayoun authored May 14, 2021
```
Symbolic tracing feature for BERT, ELECTRA and T5
Co-authored-by: Michael Benayoun <michael@huggingface.co>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
  86d5fb0b
- Add Cloud details to README (#11706) · 94a23487
  Marc van Zee authored May 14, 2021
```
* Add Cloud details to README

* Flax script and readme updates
```
  94a23487
- correct example script (#11726) · 113eaa75
  Patrick von Platen authored May 14, 2021
  
  113eaa75
- Fix T5 beam search using parallelize (#11717) · bd3b599c
  Oyvind Tafjord authored May 14, 2021
  
  bd3b599c

13 May, 2021 8 commits

Fix loading the best model on the last stage of training (#11718) · 218d552f
Volodymyr Byno authored May 13, 2021

218d552f
Fix v4.6.0 doc · 25208200
Sylvain Gugger authored May 13, 2021

25208200
Fix doc deployment · cbbf49f6
Sylvain Gugger authored May 13, 2021

cbbf49f6

[T5] Add 3D attention mask to T5 model (2) (#9643) (#11197) · 91cf2915

lexhuismans authored May 13, 2021

* Add 3D attention mask to T5 model (#9643)

Added code for 3D attention mask in T5 model, similar to the BERT model.

* Add test for 3D attention mask

Added test for 3D attention mask: test_decoder_model_past_with_3d_attn_mask().
The 3D attention mask has the shape [Batch_size, Seq_length, Seq_length] for
both the attention mask and the decoder attention mask. Test is passing.

91cf2915
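
The commit above adds support for attention masks of shape [Batch_size, Seq_length, Seq_length] alongside the usual [Batch_size, Seq_length] padding mask. As a sketch of what accepting both shapes can look like, and not the T5 code itself, the NumPy function below expands either form to a [batch, 1, query, key] additive bias that broadcasts over attention heads. The function name and the `masked_bias` value are assumptions for this example.

```python
import numpy as np


def extend_attention_mask(mask, masked_bias=-1e9):
    """Turn a 0/1 mask into an additive bias of shape [batch, 1, query, key].

    2D masks ([batch, key]) apply the same pattern to every query position;
    3D masks ([batch, query, key]) can differ per query position.
    """
    mask = np.asarray(mask, dtype=np.float32)
    if mask.ndim == 3:
        extended = mask[:, None, :, :]      # add a broadcastable head axis
    elif mask.ndim == 2:
        extended = mask[:, None, None, :]   # add head and query axes
    else:
        raise ValueError(f"expected a 2D or 3D mask, got {mask.ndim}D")
    # 1 -> 0.0 (keep), 0 -> masked_bias (suppressed after softmax)
    return (1.0 - extended) * masked_bias


padding_mask = np.array([[1, 1, 0]])          # [batch=1, seq=3]
causal_mask = np.tril(np.ones((1, 3, 3)))     # [batch=1, seq=3, seq=3]

print(extend_attention_mask(padding_mask).shape)  # (1, 1, 1, 3)
print(extend_attention_mask(causal_mask).shape)   # (1, 1, 3, 3)
```

The 3D form is what makes per-position patterns such as a causal or block-wise mask expressible, since each query row carries its own set of visible keys.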

add everything (#11651) · 6ee1a4fd
Vasudev Gupta authored May 13, 2021

6ee1a4fd

[Flax] Fix BERT initialization & token_type_ids default (#11695) · 57b6a80d

Patrick von Platen authored May 13, 2021



* fix some stuff

* fix roberta & electra as well

* del run bug
Co-authored-by: Patrick von Platen <patrick@huggingface.co>

57b6a80d

Fix gpt-2 warnings (#11709) · daf0d6a9
Lysandre Debut authored May 13, 2021

daf0d6a9

Enable option for subword regularization in more tokenizers. (#11417) · 37ed3ab7

Philip May authored May 13, 2021

* improve slow class tok usage at xlm rob

* add subword regularization for barthez

* improve barthez tok. test

* fix tokenizer tests

* add subword regularization for camembert

* add subword regularization for deberta v2 tokenizer

* add more doc to deberta v2 tokenizer

* add subword regularization for speech to text tok.

* fix sp_model_kwargs type in speech 2 text tok.

* add subword regularization for M2M100 tok.

* add more concrete type hints

* fix tests for m2m100 and s2t tok.

* add missing Any import

* fix syntax error in m2m100 tok.

* fix unpickle of m2m100 and s2t tok.

* fix test of m2m100 and s2t tok.

* improve unpickle of deberta v2 tok.

* add test for pickle of barthez & camembert

* fix pickle of barthez & camembert

* add test for deberta v2 tok. pickle

* fix m2m100 tok. pickle

* fix s2t tok. pickle

* add subword regularization to albert tok.

* refactor subword reg. test into TokenizerTesterMixin

improve albert tok. test

remove sample argument from albert tok.

check subword reg. using TokenizerTesterMixin

improve tok. tests

improve xlm roberta tok. tests

improve xlm roberta tok. tests

* add subword regularization for big bird t.

* improve xlm roberta tok. test

* add subword regularization for mbart50 tok.

* add subword regularization for pegasus tok.

* add subword regularization for reformer tok.

* add subword regularization for T5 tok.

* fix t5 tok. test formatting

* add subword regularization for xlm_proph. tok.

* add subword regularization for xlnet tok.

* add subword regularization for bert_gen tok.

* add typing to tokenizers

* add typing to xlm rob. tok

* add subword regularization for marian tok.

* add reverse tok. test

* fix marian tok test

* fix marian tok test

* fix casing in tok. tests

* fix style of tok. common test

* fix deberta v2 tok test

* add type annotations to tok. tests

* add type annotations to tok. __init__

* add typing to tokenizer

* add type annotations to tok. __init__

* don't specify the default when it's None

* fix barthez tok. doc

* move sentencepiece tok. tests to TokenizerTesterMixin

* fix unused imports

* fix albert tok. test

* add comment to sentencepiece test options

* fix Any import at big bird tok.

* fix Any import at xlm prophetnet tok.

* empty commit to trigger CI

37ed3ab7

12 May, 2021 6 commits

Vit deit fixes (#11309) · fa84540e

NielsRogge authored May 12, 2021



* Improve docs of DeiT and ViT, add community notebook

* Add gitignore for test_samples

* Add notebook with Trainer
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

fa84540e

Docs for v4.7.0.dev0 · d77eb0cf
Lysandre authored May 12, 2021

d77eb0cf
Release: v4.6.0 · 64e78564
Lysandre authored May 12, 2021

64e78564

[Lazy init] Force fall back to slow init for composite models (#11705) · fd6204b2

Patrick von Platen authored May 12, 2021



* fix encoder-decoder & RAG

* finalize

* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/rag/modeling_rag.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

fd6204b2

fix example in config doc (#11696) · 5c1cda9d
Suraj Patil authored May 12, 2021

5c1cda9d
remove defaults to None if optional (#11703) · 77f4c46b
Philip May authored May 12, 2021

77f4c46b