- 21 Jan, 2021 1 commit

guillaume-be authored
* Moved ProphetNetForCausalLM's parent initialization after config update
* Added unit tests for generation for ProphetNetForCausalLM
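
For context, a minimal sketch of the generation path these tests cover; the checkpoint and prompt are illustrative, not taken from the commit:

```python
# Usage sketch for ProphetNetForCausalLM generation (illustrative inputs).
from transformers import ProphetNetForCausalLM, ProphetNetTokenizer

tokenizer = ProphetNetTokenizer.from_pretrained("microsoft/prophetnet-large-uncased")
model = ProphetNetForCausalLM.from_pretrained("microsoft/prophetnet-large-uncased")

inputs = tokenizer("Paris is the capital of", return_tensors="pt")
# greedy generation; the commit adds unit tests around this path
generated = model.generate(**inputs, max_length=20)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```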

- 20 Jan, 2021 2 commits

NielsRogge authored
* Add DebertaForMaskedLM, DebertaForTokenClassification, DebertaForQuestionAnswering
* Add docs and fix quality
* Fix Deberta not having a pooler

Julien Plu authored
* Create new embeddings + add to BERT
* Add Albert
* Add DistilBert
* Add Albert + Electra + Funnel
* Add Longformer + Lxmert
* Add last models
* Apply style
* Update the template
* Remove unused imports
* Rename attribute
* Import embeddings in their own model file
* Replace word_embeddings by weight
* Fix naming
* Fix Albert
* Fix Longformer
* Fix Lxmert, Mobilebert and MPNet
* Fix copy
* Fix template
* Update the get weights function
* Update src/transformers/modeling_tf_utils.py
* Update src/transformers/models/electra/modeling_tf_electra.py
* Address Sylvain's comments
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

- 19 Jan, 2021 5 commits

Daniel Stancl authored
* Add decoder_head_mask for PyTorch T5 model
* Add decoder_head_mask args into T5Model and T5ForConditionalGeneration
* Slightly change the order of input args to be in accordance with the convention from BART-based models introduced within PR #9569
* Make style for modeling_t5.py
* Add decoder_head_mask for TF T5 models
* Separate head_mask and decoder_head_mask args in TF T5 models
* Slightly change the order of input args to follow the convention of BART-based models updated in PR #9569
* Update test_forward_signature in tests/test_modeling_tf_common.py w.r.t. the changed order of input args
* Add FutureWarnings for T5 and TFT5 models, warning a user that the input argument `head_mask` was split into two arguments, `head_mask` and `decoder_head_mask`
* Add default behaviour: `decoder_head_mask` is set to copy `head_mask`
* Fix T5 modeling and FutureWarning
* Make proper usage of head_mask and decoder_head_mask in cross_attention
* Fix conditions for raising FutureWarning
* Reformat FutureWarning in T5 modeling
* Refactor the warning message
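
For context, a minimal sketch of the resulting API (checkpoint and inputs are illustrative): `head_mask` prunes encoder heads, `decoder_head_mask` prunes decoder heads, and passing only `head_mask` copies it to the decoder with a FutureWarning.

```python
# Sketch of the split head-mask arguments on T5 (illustrative inputs).
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
labels = tokenizer("Hallo", return_tensors="pt").input_ids

# One row per layer, one column per head; 0.0 disables a head.
head_mask = torch.ones(model.config.num_layers, model.config.num_heads)
head_mask[0, 0] = 0.0  # mask the first head of the first encoder layer
decoder_head_mask = torch.ones(model.config.num_decoder_layers, model.config.num_heads)

outputs = model(**inputs, labels=labels, head_mask=head_mask, decoder_head_mask=decoder_head_mask)
print(outputs.loss)
```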

Sylvain Gugger authored
* New run_seq2seq script
* Add tests
* Mark as slow
* Update examples/seq2seq/run_seq2seq.py
* Update src/transformers/data/data_collator.py
* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

Yusuke Mori authored
* Update past_key_values in gpt2 (#9391)
* Update generation_utils, and rename some items
* Update modeling_gpt2 to avoid an error in gradient_checkpointing
* Remove '_reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2
* Change the location of '_reorder_cache' in modeling files
* Add '_reorder_cache' in modeling_ctrl
* Fix a bug of my last commit in CTRL
* Add '_reorder_cache' to GPT2DoubleHeadsModel
* Manage 'use_cache' in config of test_modeling_gpt2
* Clean up the docstring
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Fix the docstring (GPT-2, CTRL)
* Improve gradient_checkpointing behavior
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
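
For context, the per-model `_reorder_cache` hook follows a simple index-select pattern; this is a sketch mirroring the GPT-2-style cache layout, not the exact code of every model:

```python
import torch

def _reorder_cache(past, beam_idx):
    """Sketch of the per-model hook (defined as a @staticmethod on the model
    class): beam search calls it to realign cached key/value states after
    beams are reshuffled."""
    # `past` is a tuple (one element per layer) of tensors whose first
    # dimension is the (batch * beam) axis; select it in the new beam order.
    return tuple(
        tuple(past_state.index_select(0, beam_idx) for past_state in layer_past)
        for layer_past in past
    )

# Tiny usage example: 2 layers, each caching a (key, value) pair for 3 beams.
past = tuple((torch.randn(3, 2, 5, 4), torch.randn(3, 2, 5, 4)) for _ in range(2))
beam_idx = torch.tensor([2, 0, 1])  # new ordering of the 3 beams
reordered = _reorder_cache(past, beam_idx)
```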

Sergey Mkrtchyan authored
* Fix the attention_mask in DPRReaderTokenizer
* Add an integration test for DPRReader inference
* Run make style

Patrick von Platen authored

- 18 Jan, 2021 1 commit

Daniel Stancl authored
* Add head_mask/decoder_head_mask for BART

  This branch implements head_mask and decoder_head_mask for BART-based models. Full list:
  - BART
  - MBart
  - Blenderbot
  - BlenderbotSmall
  - Marian
  - Pegasus

  Everything is accompanied by updated testing.
* Fix test_headmasking for BART models
* Fix test_headmasking for BART-like models which have only 2 layers in each module. The condition

  ```
  self.assertNotEqual(attentions[1][..., 0, :, :].flatten().sum().item(), 0.0)
  ```

  is therefore invalid for encoder-decoder models, considering the `head_mask`

  ```
  head_mask = torch.ones(
      self.model_tester.num_hidden_layers,
      self.model_tester.num_attention_heads,
      device=torch_device,
  )
  head_mask[0, 0] = 0
  head_mask[-1, :-1] = 0
  ```

  specified in the `test_headmasking` test/function.
* Adjust test_modeling_common.py to reflect T5 input args
* Update tests/test_modeling_common.py
* Apply suggestions from code review
* make style
* make fix-copies
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
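
For context, a minimal sketch of the new arguments on a BART checkpoint (checkpoint and input are illustrative, not taken from the commit):

```python
# Sketch of head_mask/decoder_head_mask on BART-based models: one row per
# layer, one column per attention head; 0.0 disables a head.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("Masking attention heads in BART.", return_tensors="pt")
head_mask = torch.ones(model.config.encoder_layers, model.config.encoder_attention_heads)
head_mask[0, 0] = 0.0  # silence one encoder head
decoder_head_mask = torch.ones(model.config.decoder_layers, model.config.decoder_attention_heads)

outputs = model(**inputs, head_mask=head_mask, decoder_head_mask=decoder_head_mask)
```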

- 14 Jan, 2021 3 commits

Sylvain Gugger authored
* Upstream (and rename) sortish sampler
* Use proper sampler
* Update src/transformers/trainer_pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Lysandre Debut authored

Sylvain Gugger authored
* Fix Trainer with a parallel model
* More clean up

- 13 Jan, 2021 3 commits

Lysandre Debut authored
* Fix conversational pipeline test
* LayoutLM
* ProphetNet
* BART
* Blenderbot & small
* Marian
* mBART
* Pegasus
* Tapas tokenizer
* BERT2BERT test
* Style
* Example requirements
* TF BERT2BERT test

Sylvain Gugger authored
* Fix data parallelism in Trainer
* Update src/transformers/training_args.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Suraj Patil authored
* add model_input_names
* fix test

- 12 Jan, 2021 5 commits

Sylvain Gugger authored
* Add target contextmanager and rework prepare_seq2seq_batch
* Fix tests, treat BART and Barthez
* Add last tokenizers
* Fix test
* Set src token before calling the superclass
* Remove special behavior for T5
* Remove needless imports
* Remove needless asserts
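
For context, a sketch of the target-tokenization pattern this enables, assuming the context manager is exposed as `as_target_tokenizer` (checkpoint and sentences are illustrative):

```python
# Sketch of tokenizing targets via the new context manager instead of
# prepare_seq2seq_batch (illustrative checkpoint and text).
from transformers import MarianTokenizer

tokenizer = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")

inputs = tokenizer("I love reading.", return_tensors="pt")
with tokenizer.as_target_tokenizer():
    # inside the context, text is tokenized with target-language settings
    labels = tokenizer("Ich lese gern.", return_tensors="pt").input_ids
inputs["labels"] = labels
```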

Lysandre Debut authored

Lysandre Debut authored

NielsRogge authored
* Add LayoutLMForSequenceClassification and integration tests
* Improve docs
* Add LayoutLM notebook to list of community notebooks
* Make style & quality
* Address comments by @sgugger, @patrickvonplaten and @LysandreJik
* Fix rebase with master
* Reformat in one line
* Improve code examples as requested by @patrickvonplaten
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Patrick von Platen authored
* make templates ready
* make add_new_model_command ready
* finish tf bart
* prepare tf mbart
* finish tf bart
* add tf mbart
* add marian
* prep pegasus
* add tf pegasus
* push blenderbot tf
* add blenderbot
* add blenderbot small
* clean-up
* make fix copy
* define blend bot tok
* fix
* up
* make style
* add to docs
* add copy statements
* overwrite changes
* improve
* fix docs
* finish
* fix last slow test
* fix missing git conflict line
* fix blenderbot
* up
* fix blenderbot small
* load changes
* finish copied from
* upload fix

- 11 Jan, 2021 4 commits

Nicolas Patry authored
* Enable TruncationStrategy override for pipelines
* Update isort.
* Fixing test
* Fixing text_generation pipeline.
* Using same DummyTok as other PR for easier merge later.
* Some more import guards.
* Remove bogus file.
* Do not pass `generate_kwargs` to `_parse_and_tokenize`. @patrickvonplaten
* Removed DummyTok.
* Doc quality.
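
For context, a sketch of a per-call truncation override, assuming the override is exposed as a `truncation` argument on the pipeline call (model and input are illustrative, not from the commit):

```python
# Sketch of overriding the truncation strategy per pipeline call.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
long_text = "The tower is 324 metres tall, about the height of an 81-storey building. " * 50
# truncation can be overridden at call time instead of being fixed by the pipeline
print(summarizer(long_text, truncation=True, max_length=60))
```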

Patrick von Platen authored

Lysandre Debut authored
* Remove tolerance + drop_rows_to_fit by default
* Remove drop_rows_to_fit

Julien Plu authored
* Start rework resizing
* Rework bias/decoder resizing
* Full resizing rework
* Start to update the models with the new approach
* Finish updating the models
* Update all the tests
* Update the template
* Fix tests
* Test a new approach
* Refactoring
* New rework
* Rework BART
* Rework BERT + Blenderbot
* Rework CTRL
* Rework DistilBert
* Rework DPR
* Rework Electra
* Rework Flaubert
* Rework Funnel
* Rework GPT2
* Rework Longformer
* Rework Lxmert
* Rework Marian + mBART
* Rework MobileBert
* Rework MPNet
* Rework OpenAI GPT
* Rework Pegasus
* Rework Roberta
* Rework T5
* Rework XLM + XLNet
* Rework template
* Fix TFT5EncoderOnly + DPRs
* Restore previous methods
* Fix Funnel
* Fix CTRL and TransfoXL
* Apply style
* Apply Sylvain's comments
* Restore a test in DPR
* Address the comments
* Fix bug
* Remove unused import
* Fix test
* Forgot a method
* Missing test
* Trigger CI
* Naming update
* Rebase
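
For context, a minimal sketch of the embedding-resizing API this rework targets on the TF side (checkpoint and added tokens are illustrative):

```python
# Sketch of resizing TF token embeddings after extending the vocabulary.
from transformers import TFBertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertForMaskedLM.from_pretrained("bert-base-uncased")

# add domain-specific tokens, then grow the model's embedding matrix to match
tokenizer.add_tokens(["[NEW_TOK1]", "[NEW_TOK2]"])
model.resize_token_embeddings(len(tokenizer))
```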

- 10 Jan, 2021 1 commit

Nicolas Patry authored
Trying to keep warning tests for now. Should be discarded if it becomes too hard to maintain.

- 08 Jan, 2021 2 commits

Nicolas Patry authored
* Cleaning up conversation tests.
* Adding tests that don't require downloading models + conversation can be fully created from static state.
* Making tests non-flaky (by fixing generation length)
* Bumping isort version.
* Doc cleanup.
* Remove unused test in this PR.
* Torch import guard for TF.
* Missing torch guard.
* Small mistake in doc.
* Actually use `_history` and `_index` cache + remove dead enumerate + improve warning message.
* Update src/transformers/pipelines/conversational.py
* Adding comments and cleaner code to address history copy.
* Improving pipeline name in tests.
* Change tokenizer to a real one (still created at runtime with no external dependency)
* Simplify DummyTok, reverse changes on tokenization.
* Removing DummyTok.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Julien Plu authored
* Fix input for np.ndarray
* Add a test
* Apply style
* Fix test

- 07 Jan, 2021 3 commits

Patrick von Platen authored
* fix tf flaky
* remove test files

Patrick von Platen authored
* fix common inputs pt flaky led
* fix other tests correspondingly

Julien Plu authored
* Add a serving method
* Add Albert
* Add serving for BERT and BART
* Add more models
* Finish the serving addition
* Temp fix
* Restore DPR
* Fix Funnel attribute
* Fix GPT2 attributes
* Fix OpenAIGPT attribute
* Fix T5 attributes
* Fix Bart attributes
* Fix TransfoXL attributes
* Add versioning
* Better test
* Update template
* Fix Flaubert
* Fix T5
* Apply style
* Remove unused imports
* Deactivate extra parameters
* Remove too-long test + saved_model defaults to False
* Ignore the saved_model test for some models
* Fix some inputs
* Fix mpnet serving
* Trigger CI
* Address all comments
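
For context, a sketch of exporting a TF model with the new serving signature; per the commit, `saved_model` defaults to False, so it is opted into explicitly (checkpoint is illustrative):

```python
# Sketch of exporting a TensorFlow SavedModel with the serving signature.
from transformers import TFBertModel

model = TFBertModel.from_pretrained("bert-base-uncased")
# writes a TensorFlow SavedModel (with the serving signature) alongside
# the usual h5 weights
model.save_pretrained("exported-bert", saved_model=True)
```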

- 06 Jan, 2021 4 commits

Patrick von Platen authored
* fix generation models
* fix led
* fix docs
* add is_decoder
* fix last docstrings
* make style
* fix t5 cross attentions
* correct t5

Sylvain Gugger authored
* Don't import libs to check they are available
* Don't import integrations at init
* Add importlib_metadata to deps
* Remove old vars references
* Avoid syntax error
* Adapt testing utils
* Try to appease torchhub
* Add dependency
* Remove more private variables
* Fix typo
* Another typo
* Refine the tf availability test
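
For context, a sketch of the check-without-import pattern, using the stdlib `importlib.metadata` (the `importlib_metadata` dependency above is its backport for older Pythons; the helper name here is hypothetical):

```python
# Sketch: check availability by querying package metadata instead of
# importing the library, which is slow and can fail at import time.
import importlib.metadata  # Python >= 3.8; `importlib_metadata` backport otherwise

def is_package_available(name: str) -> bool:
    try:
        importlib.metadata.version(name)
        return True
    except importlib.metadata.PackageNotFoundError:
        return False

print(is_package_available("torch"))
```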

Simon Brandeis authored
* Define new output dataclasses for greedy generation
* Add output_[...] flags in greedy generation methods: added output_attentions, output_hidden_states, output_scores flags in the generate and greedy_search methods in GenerationMixin
* [WIP] Implement logic and tests for output flags in generation
* Update GreedySearchOutput classes & docstring
* Implement greedy search output accumulation logic; update greedy_search unit tests; fix generate method return value docstring; properly init flags with the default config
* Update configuration to add output_scores flag
* Fix test_generation_utils; sort imports and fix isinstance tests for GreedySearchOutputs
* Fix typo in generation_utils
* Add return_dict_in_generate for backwards compatibility
* Add return_dict_in_generate flag in config
* Fix typo in configuration
* Fix handling of attentions and hidden_states flags
* Make style & quality
* first attempt attentions
* some corrections
* improve tests
* special models require special tests
* disable xlm test for now
* clean tests
* fix for tf
* isort
* Add output dataclasses for other generation methods
* Add logic to return dict in sample generation
  - Pass output_attentions and output_hidden_states flags to the encoder in encoder-decoder models
  - Fix import statement order in test_generation_utils file
  - Refactor tests to avoid using self.assertTrue, which provides scarce information when a test fails
  - Add tests for the three beam_search methods: vanilla, sample and grouped
* Complete test for sample generation
* Style doc
* Fix copy-paste error in generation tests
* Rename logits to scores and refactor
* Refactor group_beam_search for consistency
* make style
* add sequences_scores
* fix all tests
* add docs
* fix beam search finalize test
* correct docstring
* clean some files
* Made suggested changes to the documentation
* Style doc using the Python util
* Update src/transformers/generation_utils.py
* fix empty lines
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
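
For context, a minimal sketch of the new flags (checkpoint and prompt are illustrative): with `return_dict_in_generate=True`, `generate` returns a structured output carrying `sequences`, per-step `scores`, and optionally attentions and hidden states, instead of a bare tensor.

```python
# Sketch of the generate-output flags introduced by this commit.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
out = model.generate(
    **inputs,
    max_length=15,
    return_dict_in_generate=True,
    output_scores=True,
)
print(out.sequences.shape)  # generated token ids
print(len(out.scores))      # one score tensor per generated step
```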

Stas Bekman authored
* model wrapped + model_unwrap
* cleanup
* Apply suggestions from code review
* style
* deprecation warning
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

- 05 Jan, 2021 3 commits

Patrick von Platen authored
* first try
* remove old template
* finish bart
* finish mbart
* delete unnecessary line
* init pegasus
* save intermediate
* correct pegasus
* finish pegasus
* remove cookiecutter leftover
* add marian
* finish blenderbot
* replace in file
* correctly split blenderbot
* delete "old" folder
* correct "add statement"
* adapt config for tf comp
* correct configs for tf
* remove ipdb
* fix more stuff
* fix mbart
* push pegasus fix
* more fixes
* fix research projects code
* finish docs for bart, mbart, and marian
* delete unnecessary file
* correct attn typo
* correct configs
* remove pegasus for seq class
* correct peg docs
* finish configs
* further improve docs
* add copied from statements to mbart
* fix copied from in mbart
* add copy statements to marian
* add copied from to marian
* add pegasus copied from
* finish pegasus
* finish copied from
* Apply suggestions from code review
* make style
* backward comp blenderbot
* apply Lysandre's and Sylvain's suggestions
* push last fixes
* fix docs
* fix tok tests
* fix imports code style
* fix doc

Patrick von Platen authored
* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookiecutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplu's statements
* correct tf longformer
* apply Lysandre's suggestions
* apply Sylvain's suggestions
* Apply suggestions from code review

Julien Plu authored

- 04 Jan, 2021 1 commit

Stas Bekman authored

- 25 Dec, 2020 1 commit

Patrick von Platen authored
* correct gpt2
* fix gpt2
* fix use_cache ordering
* correct past tolerance
* fix for all cases
* style

- 24 Dec, 2020 1 commit

Ratthachat (Jung) authored
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot (the last commit accidentally deleted these 4 lines, so I recovered them)
* Clean up some comments, add TF input-style docstring
* Make return_dict=False the default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py; the current version already passes all 27 tests, see the test run at https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consistency
* del config_class, load_tf_weights (they should be 'pytorch only')
* add config_class back after removing it (test failed), so only remove "use_tf_weights = None" on Lysandre's suggestion
* newline after .. note::
* import tf, np (necessary for ModelIntegrationTest)
* slow test from_pretrained with from_pt=True; at the moment we don't have TF weights (since we don't have an official TF model), and previously I did not run the slow tests, so I missed this bug
* Add simple TFDPRModelIntegrationTest; note that this only tests that TF and PyTorch give approximately the same output, I could not test against the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
* fix RagSeq generate with context_input_ids
* apply style
* delete unused lines
* Add test_rag_sequence_generate_batch_from_context_input_ids
* Readability improved
* Stylize
* typos
* add check_model_generate_from_context_input_ids
* make style
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
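
For context, a minimal sketch of querying the new TF DPR question encoder (usage assumed, not taken from the commit; the checkpoint is the standard DPR question encoder):

```python
# Sketch of using the new TFDPRQuestionEncoder to embed a question.
from transformers import TFDPRQuestionEncoder, DPRQuestionEncoderTokenizer

tokenizer = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
model = TFDPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")

inputs = tokenizer("What is the capital of France?", return_tensors="tf")
embedding = model(inputs["input_ids"]).pooler_output  # dense question embedding
print(embedding.shape)  # (1, 768)
```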