Commits · f9880f62add09f1adbf01597843a47386855cfe0 · chenpangpang / transformers

25 May, 2021 3 commits
- fixed a small typo in the doc (#11856) · f9880f62
  Shiro T authored May 25, 2021
  
- Enable memory metrics in tests that need it (#11859) · 6da129cb
  Lysandre Debut authored May 25, 2021
  
- Add some tests to the slow suite #11860 · db0b2477
  Lysandre Debut authored May 25, 2021
  
24 May, 2021 7 commits

[Trainer] Report both steps and num samples per second (#11818) · afe479ad

Sylvain Gugger authored May 24, 2021



* [Trainer] Report both steps and num samples per second

* Fix batch number

* Update src/transformers/trainer_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>


Fix two typos in docs (#11852) · eaab9397
Nick Lane-Smith authored May 24, 2021
```
* typo2

* fix typo
```

Fix flos single node (#11844) · 8a2a3a25

Teven authored May 24, 2021

* fixing flos bug/typo in non-distributed setting

* storing flos every logging_interval


Switch mem metrics flag (#11851) · adb785b0

Sylvain Gugger authored May 24, 2021



* Switch mem metrics flag

* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>


Fix reference to XLNet (#11846) · fcdb85e9
Sylvain Gugger authored May 24, 2021

[Flax] Fix PyTorch import error (#11839) · f5806041
Patrick von Platen authored May 24, 2021
```
* fix_torch_device_generate_test

* remove @

* change pytorch import to flax import
```
Replace double occurrences as the last step (#11367) · 0cbddfb1
Lysandre Debut authored May 24, 2021


22 May, 2021 1 commit

Faster list concat for trainer_pt_utils.get_length_grouped_indices() (#11825) · 73fde1de

ctheodoris authored May 22, 2021

get_length_grouped_indices() in LengthGroupedSampler and DistributedLengthGroupedSampler
is prohibitively slow for a large number of megabatches (a test case with ~270k
megabatches of 100 items each takes hours) because sum(megabatches, []) concatenates lists slowly.
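
To illustrate why `sum(megabatches, [])` scales poorly: each `+` copies the whole accumulated list, so total work grows quadratically with the number of megabatches. Below is a minimal sketch of two linear-time alternatives. The helper names are hypothetical, and this is not necessarily the exact change made in the commit.

```python
from itertools import chain


def flatten_with_sum(megabatches):
    # Quadratic: every `+` builds a new list containing all items seen so far.
    return sum(megabatches, [])


def flatten_with_chain(megabatches):
    # Linear: items are streamed once into a single new list.
    return list(chain.from_iterable(megabatches))


def flatten_with_comprehension(megabatches):
    # Linear as well, using only a nested list comprehension.
    return [index for megabatch in megabatches for index in megabatch]


# Small stand-in data; the commit message mentions ~270k megabatches of 100 items.
megabatches = [list(range(start, start + 3)) for start in range(0, 12, 3)]
flat = flatten_with_chain(megabatches)
```

All three helpers return the same flattened list; only their cost differs as the number of megabatches grows.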

Resolves: #11795
Co-authored-by: ctheodoris <cvtheodo@ds.dfci.harvard.edu>


21 May, 2021 7 commits
- Add flax text class colab (#11824) · da22245e
  Patrick von Platen authored May 21, 2021
```
* fix_torch_device_generate_test

* remove @

* add flax glue link
```
- [Deepspeed] support `zero.Init` in `from_config` (#11805) · a26f4d62
  Stas Bekman authored May 21, 2021
```
* support zero.Init in from_config

* no need for eval test
```
- [Flax] Small fixes in `run_flax_glue.py` (#11820) · 82335185
  Patrick von Platen authored May 21, 2021
```
* fix_torch_device_generate_test

* remove @

* correct best seed for flax fine-tuning
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
```
- Avoid TensorFlow import in Trainer · b8697bc6
  Sylvain Gugger authored May 21, 2021
  
- fix roformer config doc (#11813) · e2c1dd09
  yujun authored May 21, 2021
  
- Patch recursive import (#11812) · 1b652295
  Lysandre Debut authored May 21, 2021
  
- [Flax] Align GLUE training script with mlm training script (#11778) · bd987165
  Patrick von Platen authored May 21, 2021
```
* speed up flax glue

* remove unnecessary line

* remove folder

* remove run in loop
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
```
20 May, 2021 6 commits

Fix failing test on Windows Platform (#11589) · 22394387

Keren Fuentes authored May 20, 2021

* add separator for windows

* fixes test_is_copy_consistent on Windows

* fixing writing encoding issue on extended test (for Windows)

* resolving comments


A cleaner and more scalable implementation of symbolic tracing (#11763) · f4a0d6ff

Michael Benayoun authored May 20, 2021



Provides a cleaner and more scalable implementation of symbolic tracing with torch.fx, and adds support for new architectures:
- ALBERT
- DistilBERT
- MobileBERT
- MegatronBERT
- GPT2
- GPT Neo
Co-authored-by: Michael Benayoun <michael@huggingface.co>


Fix regression in regression (#11785) · 469384a7
Sylvain Gugger authored May 20, 2021
```
* Fix regression in regression

* Add test
```
Fix pattern in conf.py (#11784) · 5ad5cc71
Sylvain Gugger authored May 20, 2021


Add new model RoFormer (use rotary position embedding) (#11684) · 206f06f2

yujun authored May 20, 2021



* add roformer

* Update docs/source/model_doc/roformer.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/roformer.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* update

* add TFRoFormerSinusoidalPositionalEmbedding and fix TFMarianSinusoidalPositionalEmbedding

* update docs

* make style and make quality

* roback

* unchanged

* rm copies from; this is an error in TFMarianSinusoidalPositionalEmbedding

* update Copyright year

* move # Add modeling imports here to the correct position

* max_position_embeddings can be set to 1536

* # Copied from transformers.models.bert.modeling_bert.BertOutput with Bert->RoFormer

* # Copied from transformers.models.bert.modeling_bert.BertLayer.__init__ with Bert->RoFormer

* update tokenization_roformer

* make style

* add staticmethod apply_rotary_position_embeddings

* add TF staticmethod apply_rotary_position_embeddings

* update torch apply_rotary_position_embeddings

* fix tf apply_rotary_position_embeddings error

* make style

* add pytorch RoFormerSelfAttentionRotaryPositionEmbeddingTest

* add TF rotary_position_embeddings test

* update test_modeling_rofomer

* Update docs/source/model_doc/roformer.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/roformer/convert_roformer_original_tf_checkpoint_to_pytorch.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/roformer/modeling_roformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/roformer/modeling_roformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/roformer/modeling_tf_roformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refact roformer tokenizer

* add RoFormerTokenizerFast

* add RoFormerTokenizationTest

* add require_jieba

* update Copyright

* update tokenizer & add copy from

* add option rotary_value

* use rust jieba

* use rjieba

* use rust jieba

* fix test_alignement_methods

* slice normalized_string is too slow

* add config.embedding_size when embedding_size!=hidden_size

* fix pickle tokenizer

* Update docs/source/model_doc/roformer.rst
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style and make quality
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Deprecate commands from the transformers-cli that are in the hf-cli (#11779) · 075fdab4
Lysandre Debut authored May 20, 2021


19 May, 2021 3 commits

Add DOI badge to README (#11771) · 2582e59a
Albert Villanova del Moral authored May 19, 2021


[Flax MLM] Refactor run mlm with optax (#11745) · 00440e35

Patrick von Platen authored May 19, 2021



* refactor

* update

* update

* update

* refactor run mlm

* finalize

* refactor more

* fix typo

* update

* finish refactor

* modify run mlm

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* small fixes

* upload

* upload

* finish run mlm script
Co-authored-by: Patrick von Platen <patrick@huggingface.co>


[T5 failing CI] Fix generate test (#11770) · 43891be1
Patrick von Platen authored May 19, 2021
```
* fix_torch_device_generate_test

* remove @
```

18 May, 2021 10 commits

Fix usage of head masks by PT encoder-decoder models' `generate()` function (#11621) · 680d181c

Daniel Stancl authored May 19, 2021

* Add missing head masking for generate() function

* Add head_mask, decoder_head_mask and cross_attn_head_mask
into prepare_inputs_for_generation for generate() function
for multiple encoder-decoder models.

* Add test_genereate_with_head_masking

* [WIP] Update the new test and handle special cases

* make style

* Omit ProphetNet test so far

* make fix-copies


FlaxGPT2 (#11556) · ca33278f

Suraj Patil authored May 19, 2021



* flax gpt2

* combine masks

* handle shared embeds

* add causal LM sample

* style

* add tests

* style

* fix imports, docs, quality

* don't use cache

* add cache

* add cache 1st version

* make use cache work

* start adding test for generation

* finish generation loop compilation

* rewrite test

* finish

* update

* update

* apply sylvains suggestions

* update

* refactor

* fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Fix a small error in summarization example (#11762) · eb3e072a
Tomy Hsieh authored May 19, 2021


Add Flax Examples and Cloud TPU README (#11753) · 77f9bd18

Avital Oliver authored May 18, 2021



* Add Flax Examples README

* Apply suggestions from code review

* Update examples/flax/README.md

* add nice table

* fix

* fix

* apply suggestions

* upload

* finish flax readme.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


add `dataset_name` to data_args and added accuracy metric (#11760) · 04e25c62

Philipp Schmid authored May 18, 2021

* add `dataset_name` to data_args and added accuracy metric

* added documentation for dataset_name

* spelling correction


Fixed: Better names for nlp variables in pipelines' tests and docs. (#11752) · fd3b12e8
Vyom Pathak authored May 18, 2021
```
* Fixed: Better names for nlp variables in pipelines' tests and docs.

* Fixed: Better variable names
```
Add more subsections to main doc (#11758) · cebb96f5
Patrick von Platen authored May 18, 2021
```
* add headers to main doc

* Apply suggestions from code review

* update

* upload
```
Fix incorrect newline in #11650 (#11757) · da7e73b7
Tommy Chiang authored May 18, 2021

Fix checkpoint deletion (#11748) · a515caa3
Sylvain Gugger authored May 18, 2021


[TokenClassification] Label realignment for subword aggregation (#11680) · b88e0e01

Nicolas Patry authored May 18, 2021

* [TokenClassification] Label realignment for subword aggregation

Attempt to replace https://github.com/huggingface/transformers/pull/11622/files



- Added `AggregationStrategy`.
- The `ignore_subwords` and `grouped_entities` arguments are now fused
  into `aggregation_strategy`. This makes more sense because
  `ignore_subwords=True` with `grouped_entities=False` had no
  meaning.
- Added two new aggregation strategies: MAX and AVERAGE.
- AVERAGE requires a bit more information than the others. For now this
  case is slightly specific; we should keep that in mind for future
  changes.
- Testing has been modified to reflect the new argument, and to check both
  the deprecation and the new `aggregation_strategy`.
- Put the testing arguments and expected results for `aggregation_strategy`
  close together, so that readers can understand what is supposed to
  happen.
- `aggregate` is now only tested on a small model, as it does not make
  sense to test it globally for all models.
- The desired output of previous tests is unchanged.
- Added a new test case that better showcases the difference between the
  FIRST, MAX and AVERAGE strategies.
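
As a rough illustration of how the three strategies can disagree on the same word, here is a simplified, self-contained sketch. `Strategy` and `aggregate_word` are hypothetical names invented for this example and are not the pipeline's actual code. It assumes each subword token carries one score per label.

```python
from enum import Enum


class Strategy(Enum):
    # Hypothetical stand-in for the strategies named above.
    FIRST = "first"
    MAX = "max"
    AVERAGE = "average"


def aggregate_word(token_scores, labels, strategy):
    """Pick one label for a word from the per-label scores of its subword tokens.

    token_scores: one list of per-label scores for each subword token.
    """
    if strategy is Strategy.FIRST:
        # Trust only the first subword token.
        scores = token_scores[0]
    elif strategy is Strategy.MAX:
        # Trust the subword token holding the single highest score.
        scores = max(token_scores, key=max)
    else:
        # Average each label's score across all subword tokens.
        scores = [sum(column) / len(token_scores) for column in zip(*token_scores)]
    best = max(range(len(labels)), key=scores.__getitem__)
    return labels[best], scores[best]


labels = ["O", "PER", "ORG"]
# Four subword tokens of one word, each with scores for (O, PER, ORG).
word = [
    [0.5, 0.3, 0.2],
    [0.1, 0.1, 0.8],
    [0.1, 0.7, 0.2],
    [0.1, 0.7, 0.2],
]
first_label, _ = aggregate_word(word, labels, Strategy.FIRST)
max_label, _ = aggregate_word(word, labels, Strategy.MAX)
average_label, _ = aggregate_word(word, labels, Strategy.AVERAGE)
```

With this toy input, the first token favors "O", the single most confident token favors "ORG", and the averaged scores favor "PER", so each strategy yields a different label.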

* Wrong framework.

* Addressing three issues.

1- Tags might not follow the B-, I- convention, so any tag should work now
(assumed as B-TAG)
2- Fixed an issue with average that leads to a substantial code change.
3- The testing suite was not checking for the "index" key for the "none"
strategy. This is now fixed.

The issue is that "O" could not be chosen by the AVERAGE strategy because
those tokens were filtered out beforehand, so their relative scores were
not counted in the average. Filtering on
ignore_labels now happens at the very end of the pipeline, which fixes
that issue.
It's a bit hard to make sure this stays that way because we do
not have an end-to-end test for that behavior.
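
The ordering problem described above can be sketched with toy numbers. This is a simplified illustration under assumed inputs, not the pipeline's code. The helper names are hypothetical, and `ignore_labels` is modeled as a plain set.

```python
def argmax_label(scores, labels):
    # Label with the highest score for a single token (or averaged vector).
    return labels[max(range(len(labels)), key=scores.__getitem__)]


def average_label(token_scores, labels):
    # Average each label's score across the given tokens, then take the best label.
    means = [sum(column) / len(token_scores) for column in zip(*token_scores)]
    return argmax_label(means, labels)


labels = ["O", "PER"]
ignore_labels = {"O"}
# Subword tokens of one word; two of the three individually look like "O".
word = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]

# Old order: drop "O" tokens first, then average whatever is left.
kept = [scores for scores in word if argmax_label(scores, labels) not in ignore_labels]
filtered_first = average_label(kept, labels)

# New order: average over every token, then apply ignore_labels at the very end.
averaged = average_label(word, labels)
filtered_last = None if averaged in ignore_labels else averaged
```

Filtering first leaves only the last token, so the word is reported as "PER". Filtering last lets the "O" scores count in the average, and the word is then dropped as "O".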

* Formatting.

* Adding formatting to code + cleaner handling of B-, I- tags.
Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>

* Typo.
Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>


17 May, 2021 3 commits
- push (#11750) · c73e3532
  Patrick von Platen authored May 17, 2021
  
- Use new evaluation loop in TrainerQA (#11746) · 936b5715
  Sylvain Gugger authored May 17, 2021
  
- [BigBird Pegasus] Make tests faster (#11744) · 73893fc7
  Patrick von Platen authored May 17, 2021
```
* improve tests

* remove bogus file

* make style
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
```