1. 07 Jun, 2021 1 commit
  2. 04 Jun, 2021 1 commit
  3. 02 Jun, 2021 4 commits
  4. 01 Jun, 2021 6 commits
  5. 31 May, 2021 1 commit
  6. 28 May, 2021 2 commits
  7. 27 May, 2021 1 commit
    • Adding new argument `max_new_tokens` for generate. (#11476) · 80d712fa
      Nicolas Patry authored
      * Adding new argument `max_new_tokens` for generate.
      
      This is a proposal to add a new argument `max_new_tokens` to `generate`.
      It includes a `MaxNewTokensCriteria` that lets callers who don't know
      the prompt's token length ahead of time (like pipeline callers) manage
      the length of their generated output more easily.
      
      * Adding a test for the user warning when both `max_length` and
      `max_new_tokens` are used together.
      
      * Removed redundant `no_grad`.
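      A minimal sketch of the new argument in use (model name and prompt are illustrative; `max_new_tokens` caps only the freshly generated tokens, independent of the prompt length):

      ```python
      from transformers import AutoModelForCausalLM, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModelForCausalLM.from_pretrained("gpt2")

      inputs = tokenizer("Hello, my name is", return_tensors="pt")

      # Generate at most 20 *new* tokens, however long the prompt is.
      # Passing `max_length` as well would trigger the user warning tested above.
      output_ids = model.generate(**inputs, max_new_tokens=20)
      print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      ```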
  8. 26 May, 2021 3 commits
    • Flax Generate (#11777) · 996a315e
      Patrick von Platen authored
      
      
      * fix_torch_device_generate_test
      
      * remove @
      
      * add
      
      * indexing
      
      * correct a couple of tests
      
      * fix tests
      
      * add logits processor
      
      * finish top_k, top_p, temp
      
      * add docs
      
      * correct flax prng key default
      
      * improve generate
      
      * add generation docs
      
      * add docs
      
      * make style
      
      * revert model outputs change
      
      * make style
      
      * correct typo
      
      * fix tests
      
      * fix slow test
      
      * add raise
      
      * finish generation
      Co-authored-by: Patrick von Platen <patrick@huggingface.co>
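      A hedged sketch of the sampling path this PR adds for Flax models (model, prompt, and sampling values are illustrative; randomness is passed in explicitly via a PRNG key):

      ```python
      import jax
      from transformers import AutoTokenizer, FlaxGPT2LMHeadModel

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = FlaxGPT2LMHeadModel.from_pretrained("gpt2")

      input_ids = tokenizer("Hello, my name is", return_tensors="np").input_ids

      # top_k / top_p / temperature are applied by the new Flax logits processors.
      output = model.generate(
          input_ids,
          do_sample=True,
          max_length=30,
          top_k=50,
          top_p=0.95,
          temperature=0.7,
          prng_key=jax.random.PRNGKey(0),
      )
      print(tokenizer.decode(output.sequences[0], skip_special_tokens=True))
      ```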
    • [Flax] Allow dataclasses to be jitted (#11886) · d5a72b6e
      Patrick von Platen authored
      * fix_torch_device_generate_test
      
      * remove @
      
      * change dataclasses to flax ones
      
      * fix typo
      
      * fix jitted tests
      
      * fix bert & electra
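      The point of switching to flax-style dataclasses is that `flax.struct.dataclass` registers the class as a JAX pytree, so jitted functions can return it, whereas a plain Python dataclass cannot be flattened by JAX. A minimal sketch (the `Output` class is illustrative, not one of the library's output classes):

      ```python
      import jax
      import jax.numpy as jnp
      from flax import struct

      @struct.dataclass
      class Output:
          logits: jnp.ndarray
          hidden_states: jnp.ndarray

      @jax.jit
      def forward(x):
          # Returning a flax struct dataclass works under jit because it is a pytree;
          # a plain `dataclasses.dataclass` here would fail to flatten.
          return Output(logits=x * 2.0, hidden_states=x + 1.0)

      out = forward(jnp.ones((2, 3)))
      print(out.logits.shape)  # (2, 3)
      ```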
    • Fix usage of head masks by TF encoder-decoder models' `generate()` function (#11775) · 0b933584
      Daniel Stancl authored
      * Fix Bart
      
      * Fix Blenderbot{,_small}
      
      * Fix LED
      
      * Fix Marian
      
      * Fix MBart
      
      * Fix Pegasus
      
      * Fix T5
      
      * Add test for generation with head_mask
      
      * Add a common TF test
      
      * Override a test for the LED model as head masking is not yet properly implemented
      
      * Remove all head_masks from input preparation for LED
      
      * Drop masking for T5 as it needs a bit of refactor
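      A hedged sketch of what the fix enables, assuming the masks are forwarded to the model as extra keyword arguments of `generate()` (model name and mask pattern are illustrative; bart-base uses the same layer/head counts for encoder and decoder):

      ```python
      import tensorflow as tf
      from transformers import AutoTokenizer, TFBartForConditionalGeneration

      tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
      model = TFBartForConditionalGeneration.from_pretrained("facebook/bart-base")

      inputs = tokenizer("UN Chief says there is no military solution in Syria", return_tensors="tf")

      num_layers = model.config.encoder_layers
      num_heads = model.config.encoder_attention_heads

      # One row per layer, one column per head: 0 disables a head, 1 keeps it.
      # Here the first head of every layer is disabled.
      head_mask = tf.concat(
          [tf.zeros((num_layers, 1)), tf.ones((num_layers, num_heads - 1))], axis=-1
      )

      output_ids = model.generate(
          inputs["input_ids"],
          head_mask=head_mask,
          decoder_head_mask=head_mask,
          max_length=20,
      )
      print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      ```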
  9. 25 May, 2021 5 commits
  10. 24 May, 2021 1 commit
  11. 21 May, 2021 2 commits
  12. 20 May, 2021 4 commits
  13. 19 May, 2021 1 commit
  14. 18 May, 2021 5 commits
    • Fix usage of head masks by PT encoder-decoder models' `generate()` function (#11621) · 680d181c
      Daniel Stancl authored
      * Add missing head masking for generate() function
      
      * Add head_mask, decoder_head_mask and cross_attn_head_mask
      to prepare_inputs_for_generation for the generate() function
      of multiple encoder-decoder models.
      
      * Add test_genereate_with_head_masking
      
      * [WIP] Update the new test and handle special cases
      
      * make style
      
      * Omit ProphetNet test so far
      
      * make fix-copies
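      A hedged sketch of the PyTorch side, assuming `head_mask`, `decoder_head_mask` and `cross_attn_head_mask` are passed through `generate()` into `prepare_inputs_for_generation` as this PR describes (model name and masked head are illustrative):

      ```python
      import torch
      from transformers import AutoTokenizer, BartForConditionalGeneration

      tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
      model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

      inputs = tokenizer("UN Chief says there is no military solution in Syria", return_tensors="pt")

      # Shape (num_layers, num_heads); 0.0 disables a head, 1.0 keeps it.
      head_mask = torch.ones(model.config.encoder_layers, model.config.encoder_attention_heads)
      head_mask[0, 0] = 0.0  # prune a single head in the first layer

      output_ids = model.generate(
          inputs["input_ids"],
          head_mask=head_mask,
          decoder_head_mask=head_mask,  # bart-base: decoder has the same shape
          cross_attn_head_mask=head_mask,
          max_length=20,
      )
      print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      ```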
    • FlaxGPT2 (#11556) · ca33278f
      Suraj Patil authored
      
      
      * flax gpt2
      
      * combine masks
      
      * handle shared embeds
      
      * add causal LM sample
      
      * style
      
      * add tests
      
      * style
      
      * fix imports, docs, quality
      
      * don't use cache
      
      * add cache
      
      * add cache 1st version
      
      * make use cache work
      
      * start adding test for generation
      
      * finish generation loop compilation
      
      * rewrite test
      
      * finish
      
      * update
      
      * update
      
      * apply Sylvain's suggestions
      
      * update
      
      * refactor
      
      * fix typo
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
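      A minimal sketch of the new Flax port in use (prompt is illustrative; the forward pass returns a logits tensor over the vocabulary):

      ```python
      from transformers import AutoTokenizer, FlaxGPT2LMHeadModel

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = FlaxGPT2LMHeadModel.from_pretrained("gpt2")

      inputs = tokenizer("Hello, my name is", return_tensors="np")

      # Plain forward pass; the cache added in this PR is used internally by generate().
      outputs = model(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"])
      print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
      ```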
    • Fixed: Better names for nlp variables in pipelines' tests and docs. (#11752) · fd3b12e8
      Vyom Pathak authored
      * Fixed: Better names for nlp variables in pipelines' tests and docs.
      
      * Fixed: Better variable names
    • Fix checkpoint deletion (#11748) · a515caa3
      Sylvain Gugger authored
    • [TokenClassification] Label realignment for subword aggregation (#11680) · b88e0e01
      Nicolas Patry authored
      * [TokenClassification] Label realignment for subword aggregation
      
      Tentative replacement for https://github.com/huggingface/transformers/pull/11622/files
      
      - Added `AggregationStrategy`.
      - The `ignore_subwords` and `grouped_entities` arguments are now fused
        into `aggregation_strategy`. This makes more sense, because
        `ignore_subwords=True` with `grouped_entities=False` had no meaning
        anyway.
      - Added two new aggregation strategies: MAX and AVERAGE.
      - AVERAGE requires a bit more information than the others, so for now
        this case is handled somewhat specially; we should keep that in mind
        for future changes.
      - Testing has been modified to reflect the new argument and to check
        both the correct deprecation and the new aggregation_strategy.
      - The test arguments and expected results for aggregation_strategy are
        placed close together, so that readers can understand what is
        supposed to happen.
      - `aggregate` is now only tested on a small model, as it is not
        meaningful to test it globally for all models.
      - Previous tests are unchanged in desired output.
      - Added a new test case that better showcases the difference between
        the FIRST, MAX and AVERAGE strategies.
      
      * Wrong framework.
      
      * Addressing three issues.
      
      1- Tags might not follow the B-, I- convention, so any tag should work
      now (assumed to be B-TAG).
      2- Fixed an issue with AVERAGE that led to a substantial code change.
      3- The testing suite was not checking the "index" key for the "none"
      strategy. This is now fixed.
      
      The issue is that "O" could never be chosen by the AVERAGE strategy,
      because those tokens were filtered out beforehand and their relative
      scores were not counted in the average. Filtering on ignore_labels now
      happens at the very end of the pipeline, fixing that issue.
      It is a bit hard to make sure this stays like that, because we do not
      have an end-to-end test for that behavior.
      
      * Formatting.
      
      * Adding formatting to code + cleaner handling of B-, I- tags.
      Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
      Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
      
      * Typo.
      Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
      Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
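      A hedged sketch of the fused argument (model name and strategy are illustrative; "none", "simple", "first", "max" and "average" correspond to the `AggregationStrategy` values that replace `grouped_entities` / `ignore_subwords`):

      ```python
      from transformers import pipeline

      ner = pipeline(
          "ner",
          model="dbmdz/bert-large-cased-finetuned-conll03-english",
          aggregation_strategy="average",  # replaces grouped_entities / ignore_subwords
      )
      print(ner("Hugging Face is based in New York City"))
      ```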
  15. 17 May, 2021 1 commit
  16. 14 May, 2021 1 commit
  17. 13 May, 2021 1 commit