- 08 Feb, 2021 6 commits
-
Julien Plu authored
* Refactor BERT
* Restore all the concerned models
* Remove print
* Update template
* Apply Sylvain's and Morgan's comments
* Fix cast
* Put the cast inside call
* Remove cond in embeddings
* Fix funnel
* Restore previous dot product (attention_scores) computation
* Add ConvBERT and BART
* Make all the S2S models ONNX compliant
* Fix test
* Fix check copies
-
Julien Plu authored
* Temporarily disable too-slow tests
* Fix style
* Fix template
-
Nicolas Patry authored
* Cleaning up `ConversationalPipeline` to support more than DialoGPT.

  Currently, `ConversationalPipeline` is heavily biased towards DialoGPT, which is the default model for this pipeline. This PR moves the DialoGPT-specific modifications into tokenizer-specific behavior wherever possible, by creating a `_build_conversation_input_ids` function that takes a conversation as input and returns a list of ints corresponding to the tokens. It feels natural to put this in the tokenizer because all models probably have different strategies to build input_ids from the full conversation, and it is the tokenizer's job to transform strings into tokens (and vice versa). If `_build_conversation_input_ids` is missing, the previous behavior is used, so nothing breaks so far (except for Blenderbot, where it is a fix).

  This PR also contains a fix for too-long inputs. There used to be dead code trying to limit the size of incoming input; the introduced fix limits inputs within `_build_conversation_input_ids` to `tokenizer.model_max_length`. This matches the intent of the removed dead code and is actually better, because it relies on `model_max_length`, which is different from `max_length` (a default parameter of `generate`).
* Removed the `history` logic from `Conversation`, as it is no longer relevant now that the tokenization logic has moved to the tokenizer: the tokenizer cannot save any cache, and the conversation cannot know what is relevant or not. It was also unusable from Blenderbot, because the input_ids are not append-only (the EOS token is always at the end).
* Added an `iter_texts` method on `Conversation`, because the code was littered with variants of this iteration over past inputs and generated responses.
* Removing torch mention in types.
* Adding type checking to `_build_conversation_input_ids`.
* Fixing import in strings.
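A hedged sketch of the tokenizer hook described above (the method body, its truncation strategy, and the separator choice are illustrative, not the exact code from the PR):

```python
from typing import List

from transformers import Conversation  # `Conversation` now exposes `iter_texts`

# Hypothetical override on a concrete tokenizer class: each tokenizer decides
# how a full Conversation becomes input_ids.
def _build_conversation_input_ids(self, conversation: Conversation) -> List[int]:
    input_ids = []
    for is_user, text in conversation.iter_texts():
        # A Blenderbot-style strategy: concatenate every turn, closing each
        # one with the EOS token.
        input_ids.extend(self.encode(text, add_special_tokens=False))
        input_ids.append(self.eos_token_id)
    # Keep only the most recent tokens, per the model_max_length fix above.
    return input_ids[-self.model_max_length:]
```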
-
Patrick von Platen authored
-
Lysandre Debut authored
* Correct cast to device
* Comment back the slow test
-
sandip authored
-
- 04 Feb, 2021 4 commits
-
Nicolas Patry authored
`encoder_no_repeat_ngram_size` from their config.
-
Nicolas Patry authored
Adding new `encoder_no_repeat_ngram_size` to `generate`.

Blenderbot results seemed off compared to the original ParlAI script (https://parl.ai/projects/recipes/): notably, the model seemed to repeat a lot of what was said during the conversation. The actual problem is that ParlAI's `no_repeat_ngram_size` applies to the `encoder_input_ids`, while HF's `no_repeat_ngram_size` applies to the previously generated ids (within the decoder). The conversation history of Blenderbot lives in the `encoder` part, which explains why HF's implementation had the repetitions.

This fix focuses on Blenderbot (*not* the small variant) and adds tests for both, because they are quite different in configuration. This change includes:
- Adding a new EncoderNoRepeatLogitProcessor.
- Adding 1 new arg to `generate` (`encoder_no_repeat_ngram_size`).
- Adding 1 new config parameter (`encoder_no_repeat_ngram_size`).
- Adding 2 tests: a high-level one for the pipeline, with inputs that exhibited the repeat behavior, and a low-level one for EncoderNoRepeatLogitProcessor.
- Factoring NoRepeatLogitProcessor so that the logic can be reused.

Further work:
- The Blenderbot conversational pipeline still does not behave correctly, as the way input is prepared within the pipeline is still incorrect (follow-up PR).
- Blenderbot allows the bot to have personas, done by prepending "your persona: XXXX" to the input; this could be explored too in a follow-up PR.

@patrickvonplaten @LysandreJik

* Update src/transformers/generation_logits_process.py
* Update src/transformers/generation_utils.py
* Update src/transformers/configuration_utils.py
* Doc quality.
* Fixing test.
* Last fixes.
* Fixing to account for batch_size.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
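A minimal usage sketch of the new argument (checkpoint and n-gram size are illustrative):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/blenderbot-400M-distill")

inputs = tokenizer("Hello, how are you doing today?", return_tensors="pt")
# Forbid the decoder from reproducing any 3-gram present in the encoder input
# (i.e. the conversation history), on top of the usual decoder-side filter.
outputs = model.generate(**inputs, encoder_no_repeat_ngram_size=3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```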
-
Daniel Hug authored
-
demSd authored
* initialize Bart for causal LM
* create BartDecoderWrapper, setters/getters
* delete spaces
* forward and additional methods
* update cache function, loss function, remove ngram* params in data class
* add BartForCausalLM and BartDecoder testing
* correct BART for causal LM
* add MBart as well
* fix typo
* add PegasusForCausalLM
* add BlenderbotForCausalLM
* add BlenderbotSmallForCausalLM
* add MarianForCausalLM
* add test for MarianForCausalLM
* add Pegasus, BlenderbotSmall, and Blenderbot tests
* fix a failing test and an import failure
* Update modeling_pegasus.py
* fix models
* fix inputs_embeds setter/getter
* adapt tests
* correct repo utils check
* finish test improvement
* fix TF models as well
* make style, make fix-copies
* run all tests
* last changes, fix all tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
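A hedged usage sketch of the new head: the decoder of a seq2seq checkpoint used as a standalone causal LM (checkpoint name illustrative):

```python
from transformers import AutoTokenizer, BartForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
# Loads only the decoder weights; the wrapper configures it as a decoder-only model.
model = BartForCausalLM.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello, my dog is", return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)
```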
-
- 03 Feb, 2021 6 commits
-
sandip authored
-
sandip authored
-
sandip authored
-
sandip authored
* TF DistilBERT integration test
* Update test_modeling_tf_distilbert.py
-
sandip authored
* TF Albert integration test
* TF Albert integration test added
-
Julien Plu authored
* Fix Longformer and LED
* Add a test for graph execution with inputs_embeds
* Apply style
-
- 02 Feb, 2021 4 commits
-
Daniel Stancl authored
* Add {decoder_,}head_mask to LED
* Fix create_custom_forward signature in encoder
* Add head_mask to Longformer, to fix the dependency of LED on Longformer
* Not working yet
* Add missing input in longformer_modeling.py
* make fix-copies
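A hedged sketch of how the new arguments are consumed (checkpoint and masked head are illustrative; a 0.0 entry prunes the corresponding attention head):

```python
import torch
from transformers import LEDModel, LEDTokenizer

tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
model = LEDModel.from_pretrained("allenai/led-base-16384")

inputs = tokenizer("Mask a head on each side.", return_tensors="pt")
# One entry per (layer, head): 1.0 keeps a head, 0.0 masks it out.
head_mask = torch.ones(model.config.encoder_layers, model.config.encoder_attention_heads)
head_mask[0, 0] = 0.0  # silence head 0 of encoder layer 0
decoder_head_mask = torch.ones(model.config.decoder_layers, model.config.decoder_attention_heads)

outputs = model(**inputs, head_mask=head_mask, decoder_head_mask=decoder_head_mask)
```
-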
Patrick von Platen authored
* add raw scaffold
* implement feature extraction layers
* correctly convert weights
* make feature extractor work
* make feature extraction projection work
* run and finish forward pass
* successful decoding example
* remove unused files
* add wav2vec tokenizer
* add new structure
* fix run forward
* add other layer-norm architecture
* finish 2nd structure
* add model tests
* finish tests for tokenizer and model
* clean up, make style
* finish docstring for model and config
* correct docstring and tests
* change checkpoints to fairseq
* fix examples
* finish wav2vec2
* apply Sylvain's and Lysandre's suggestions
* change print to log.info
* re-add assert statement
* add input_values as required input name
* finish wav2vec2 tokenizer
* Update tests/test_tokenization_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
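A hedged sketch of basic usage as of this commit (checkpoint illustrative; the tokenizer feeds raw waveforms through the required `input_values` input mentioned above):

```python
import torch
from transformers import Wav2Vec2Model, Wav2Vec2Tokenizer

tokenizer = Wav2Vec2Tokenizer.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")

speech = torch.randn(16000).numpy()  # one second of fake 16 kHz audio
inputs = tokenizer(speech, return_tensors="pt")
hidden_states = model(inputs.input_values).last_hidden_state
```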
-
Lysandre Debut authored
* ALBERT Tokenizer integration test
* Batching
* Style
-
Patrick von Platen authored
* change tokenizer requirement
* split line
* correct typo from list to str
* improve style
* make other function pretty as well
* add comment, correct typo
* add new test
* pass tests for tokenizer without padding token
* Apply suggestions from code review
-
- 01 Feb, 2021 1 commit
-
Daniel Stancl authored
* Add {decoder_,}head_mask to fsmt_modeling.py
* Enable test_headmasking and some changes to docs
* Remove the test_head_masking flag from test_modeling_fsmt.py, since test_head_masking is True by default (thus it is redundant to store)
* Merge master and remove test_head_masking = True
* Rebase necessary due to an update of jaxlib
-
- 29 Jan, 2021 2 commits
-
Julien Plu authored
-
Nicolas Patry authored
* Adding a new `return_full_text` parameter to TextGenerationPipeline.

  For text generation, the input is sometimes used as prompting text. In that context, prefixing `generated_text` with the actual input forces the caller to take an extra step to remove it. The proposed change adds a new parameter, `return_full_text` (with the old behavior as the default, for backward compatibility), that enables the caller to prevent prepending the input.
* Doc quality.
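A minimal sketch of the new parameter (model name illustrative):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "In a shocking finding, scientists discovered"

full = generator(prompt)  # default: `generated_text` starts with the prompt
continuation = generator(prompt, return_full_text=False)  # only the new text
```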
-
- 28 Jan, 2021 3 commits
-
Daniel Stancl authored
* Remove redundant test_head_masking = True flags
* Remove all redundant test_head_masking flags in PyTorch test_modeling_* files
* Make test_head_masking = True the default choice in test_modeling_tf_common.py
* Remove all redundant test_head_masking flags in TensorFlow test_modeling_tf_* files
* Put back test_head_masking=False for TFT5 models
-
Sylvain Gugger authored
-
Nicolas Patry authored
-
- 27 Jan, 2021 7 commits
-
Stefan Schweter authored
* tests: add integration tests for the new Bort model
* bort: add conversion script from GluonNLP to Transformers 🚀
* bort: minor cleanup (BORT -> Bort)
* add docs, make fix-copies
* clean and correct the docs a bit
* Update docs/source/model_doc/bort.rst
* correct dialogpt doc and link
* Update docs/source/model_doc/dialogpt.rst
* make style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Stas Bekman authored
* fix --lr_scheduler_type choices
* rewrite to fix for all enum-based command-line args
* cleanup, adjust test, style
* Proposal that should work
* Remove needless code
* Fix test

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
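A hedged sketch of what the fix enables: enum-backed command-line arguments accept their string values (`--lr_scheduler_type` is backed by the SchedulerType enum; the output dir is illustrative):

```python
from transformers import HfArgumentParser, TrainingArguments

parser = HfArgumentParser(TrainingArguments)
(training_args,) = parser.parse_args_into_dataclasses(
    args=["--output_dir", "out", "--lr_scheduler_type", "cosine"]
)
print(training_args.lr_scheduler_type)  # SchedulerType.COSINE
```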
-
Sylvain Gugger authored
* Allow --arg Value for booleans in HfArgumentParser
* Update last test
* Better error message
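A hedged sketch of the change (the dataclass is purely illustrative): boolean fields can now be set with an explicit value, not only as a bare switch:

```python
from dataclasses import dataclass

from transformers import HfArgumentParser

@dataclass
class Args:
    use_fast: bool = False

parser = HfArgumentParser(Args)
(args,) = parser.parse_args_into_dataclasses(args=["--use_fast", "True"])
print(args.use_fast)  # True (previously only the bare `--use_fast` switch worked)
```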
-
Sylvain Gugger authored
* When resuming training from checkpoint, Trainer loads model
* Finish cleaning tests
* Address review comment
* Use global_step from state
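A hedged sketch (model, path, and dataset are illustrative; `resume_from_checkpoint` is the argument name in later releases). With this change, resuming reloads the model weights from the checkpoint, not only the optimizer/scheduler/trainer state:

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
args = TrainingArguments(output_dir="out")
# `train_dataset` is assumed to be an already-encoded torch Dataset.
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train(resume_from_checkpoint="out/checkpoint-500")
```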
-
Nicolas Patry authored
pipeline.
- If the table is empty, the line that contains `answer[0]` will fail.
- This PR adds a check to prevent reaching `answer[0]` in that case.
- Also adds an early check for the presence of `table` and `query`, to prevent late failure and give a better error message.
- Adds a few tests to make sure these errors are correctly raised.
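A minimal usage sketch (default TAPAS model; data illustrative). With the added checks, an empty table or a missing query fails fast with a clear error instead of crashing later on `answer[0]`:

```python
import pandas as pd
from transformers import pipeline

table_qa = pipeline("table-question-answering")
table = pd.DataFrame({"city": ["Paris", "Berlin"], "population": ["2175000", "3645000"]})
print(table_qa(table=table, query="Which city has the larger population?"))

table_qa(table=pd.DataFrame(), query="How many?")  # now raises a clear error early
```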
-
Julien Plu authored
-
abhishek thakur authored
* finalize ConvBERT
* tf image patches
* fix torch model
* tf tests and conversion
* everything aligned
* remove print
* make tf tests pass
* everything works
* fix init
* special treatment for sepconv1d
* style 🙏🏽
* add doc and cleanup
* add electra test again
* fix doc
* Update src/transformers/modeling_tf_pytorch_utils.py
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
* Update docs/source/model_doc/conv_bert.rst
* Update src/transformers/models/auto/configuration_auto.py
* conv_bert -> convbert
* more fixes from review
* add conversion script
* don't use pretrained embeddings
* remove unused config
* suggestions from Julien
* some more fixes
* p -> param
* fix copyright
* Update src/transformers/models/convbert/configuration_convbert.py
* comments from reviews
* fix-copies, fix style
* revert shape_list

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
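A minimal usage sketch for the new model (checkpoint from the ConvBERT authors; illustrative):

```python
from transformers import ConvBertModel, ConvBertTokenizer

tokenizer = ConvBertTokenizer.from_pretrained("YituTech/conv-bert-base")
model = ConvBertModel.from_pretrained("YituTech/conv-bert-base")

outputs = model(**tokenizer("Hello world!", return_tensors="pt"))
print(outputs.last_hidden_state.shape)
```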
-
- 26 Jan, 2021 3 commits
-
Nicolas Patry authored
* We most likely don't want special tokens in this output.
* Adding `skip_special_tokens=True` to FillMaskPipeline.
  - It's backward incompatible.
  - It makes more sense for pipelines to remove references to special tokens (all of the other pipelines do that).
  - Keeping special tokens makes it hard for users to actually remove them, because models all have different tokens (<s>, <cls>, [CLS], ...).
* Fixing `token_str` in the same vein, and actually fixing the tests too!
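A minimal sketch of the changed output (model name illustrative): `sequence` no longer carries special tokens such as [CLS]/[SEP], and `token_str` is the bare predicted token:

```python
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
for prediction in unmasker("Paris is the [MASK] of France."):
    print(prediction["token_str"], "->", prediction["sequence"])
```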
-
Daniel Stancl authored
* Add head_mask/decoder_head_mask for TF BART models
* Add head_mask and decoder_head_mask input arguments for TF BART-based models, as a TF counterpart to PR #9569
* Add test_headmasking functionality to tests/test_modeling_tf_common.py
* TODO: Add a test to verify that we can get a gradient back for importance score computation
* Remove redundant #TODO note from tests/test_modeling_tf_common.py
* Fix assertions, make style
* Fix ...Model input args and adjust one new test
* Add back head_mask and decoder_head_mask to BART-based ...Model after the last commit
* Remove head_mask and decoder_head_mask from input_dict in the TF test_train_pipeline_custom_model, as these two have a different shape than the other input args (necessary for passing this test)
* Revert adding global_rng in test_modeling_tf_common.py
-
Patrick von Platen authored
* fix ci
* fix ci
* renaming
* fix dup line
-
- 25 Jan, 2021 1 commit
-
Stas Bekman authored
* ONNX triu workaround
* style
* working this time
* add test
* more efficient version
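A hedged sketch of the kind of workaround involved: `torch.triu` had no ONNX symbolic at the time, so an equivalent upper-triangular mask can be built from index arithmetic, which exports cleanly (function name and details assumed, not the exact code from the commit):

```python
import torch

def triu_onnx(x: torch.Tensor, diagonal: int = 0) -> torch.Tensor:
    rows = torch.arange(x.shape[-2], device=x.device).unsqueeze(-1)
    cols = torch.arange(x.shape[-1], device=x.device)
    mask = (cols - rows) >= diagonal  # True on and above the chosen diagonal
    return x * mask.to(x.dtype)

assert torch.equal(triu_onnx(torch.ones(4, 4)), torch.triu(torch.ones(4, 4)))
```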
-
- 22 Jan, 2021 2 commits
-
Julien Plu authored
-
Julien Plu authored
* Fix saved model tests + fix a graph issue in Longformer
* Apply style
-
- 21 Jan, 2021 1 commit
-
Sylvain Gugger authored
* Fix memory regression in Seq2Seq example
* Fix test and properly deal with -100
* Easier condition with device safety
* Patch for MBartTokenizerFast
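A hedged sketch of the usual -100 handling this touches (ids illustrative): pad positions in the labels are set to -100 so the loss ignores them, and mapped back before decoding:

```python
import torch

pad_token_id = 1  # e.g. MBart's pad token id
labels = torch.tensor([[250004, 47, 5, pad_token_id, pad_token_id]])
labels = labels.masked_fill(labels == pad_token_id, -100)  # ignored by CrossEntropyLoss

# ... training / evaluation ...

labels = labels.masked_fill(labels == -100, pad_token_id)  # undo before batch_decode
```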
-