- 20 May, 2021 1 commit
Michael Benayoun authored
Cleaner and more scalable implementation of symbolic tracing with torch.fx, adding support for new architectures:
- ALBERT
- DistilBERT
- MobileBERT
- MegatronBERT
- GPT2
- GPT Neo
Co-authored-by: Michael Benayoun <michael@huggingface.co>
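For context, a minimal sketch of the resulting tracing API, assuming the helper lives in `transformers.utils.fx` as in releases of this period (the exact signature may vary across versions):

```python
# Minimal sketch: symbolically trace a GPT-2 model with torch.fx.
# Assumes `symbolic_trace` lives in transformers.utils.fx, as in this era.
from transformers import GPT2Config, GPT2LMHeadModel
from transformers.utils.fx import symbolic_trace

model = GPT2LMHeadModel(GPT2Config())  # randomly initialized, no download needed
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
print(traced.graph)  # the captured torch.fx graph of the forward pass
```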
- 15 Mar, 2021 1 commit
Igor Shalyminov authored
* GPT2DoubleHeadsModel made parallelizable
* GPT2DoubleHeadsModel added as parallelizable to the GPT2 test suite
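A usage sketch of the `parallelize()` API this commit extends (it assumes at least two visible CUDA devices, and was later deprecated in favor of device-map-based loading):

```python
import torch
from transformers import GPT2DoubleHeadsModel

model = GPT2DoubleHeadsModel.from_pretrained("gpt2")
model.parallelize()  # spread the transformer blocks across all visible GPUs

input_ids = torch.tensor([[50256]]).to("cuda:0")  # inputs live on the first device
outputs = model(input_ids)

model.deparallelize()  # move the model back to CPU and free GPU memory
```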
- 12 Mar, 2021 1 commit
Nicolas Patry authored
* [WIP] Adding a new parameter to `generate`: `max_time`. Generation bounded by token count is sometimes clunky, because we don't know how many tokens are good enough, or even how many tokens are in the payload (for pipeline users, for instance), which leads to hard-to-understand behavior. This PR proposes a new argument, `max_time`, a float giving the number of seconds `generate` is allowed to run. Ideally, a combination such as `max_tokens=None`, `max_time=2` could be used to generate as many tokens as possible within the time budget. NB: another possible approach is passing a callback to `generate`, putting the caller in charge of deciding when to stop generating; that opens the question of which arguments to pass to the callback, and it is hard to imagine use cases for this early-stopping behavior other than time that are not already covered by `generate`'s existing parameters.
* Revamp with StoppingCriteria
* Remove deprecated mentions
* Add forgotten arguments to stopping criteria
* Re-add `max_length`: it is not just used as a stopping criterion
* Default value for `stopping_criteria`
* Address @patrickvonplaten's comments: more docstrings, actual doc, include in global namespace, remove TF work
* Put back `max_length` (deprecation in a different PR)
* Doc quality
* Fix old behavior with `max_length` but without `stopping_criteria`, making sure we don't break it in the future
* Add more tests for possible inconsistencies between `max_length` and `stopping_criteria`
* Fix the torch imports
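A sketch of the resulting usage: generation stops at whichever of `max_time` (wall-clock seconds) or `max_length` (tokens) is reached first.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The meaning of life is", return_tensors="pt")
# Sample for at most ~2 seconds; max_length still caps the token count.
output = model.generate(**inputs, do_sample=True, max_time=2.0, max_length=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```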
- 19 Jan, 2021 1 commit
Yusuke Mori authored
* Update past_key_values in gpt2 (#9391)
* Update generation_utils, and rename some items
* Update modeling_gpt2 to avoid an error in gradient_checkpointing
* Remove 'reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2
* Change the location of '_reorder_cache' in modeling files
* Add '_reorder_cache' in modeling_ctrl
* Fix a bug from my last commit in CTRL
* Add '_reorder_cache' to GPT2DoubleHeadsModel
* Manage 'use_cache' in config of test_modeling_gpt2
* Clean up the docstring
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Fix the docstring (GPT-2, CTRL)
* Improve gradient_checkpointing behavior
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
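For context, the per-model hook this PR relocates looks roughly like the following for GPT-2 (defined as a static method on the model class): during beam search, each layer's cached key/value tensors are re-indexed along the batch dimension so the cache follows the selected beams.

```python
import torch
from typing import Tuple

def _reorder_cache(past: Tuple[Tuple[torch.Tensor, ...], ...], beam_idx: torch.Tensor):
    # Row i of every cached key/value tensor is replaced by the cache of the
    # beam chosen as beam_idx[i], keeping the cache aligned with the beams.
    return tuple(
        tuple(past_state.index_select(0, beam_idx) for past_state in layer_past)
        for layer_past in past
    )
```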
- 22 Dec, 2020 1 commit
Patrick von Platen authored
* add tests
* make style and fix bart bug
* fix bart past key value edge case
* correct tf bart test
* fix gpt2 tf
* fix t5 test
- 07 Dec, 2020 1 commit
Sylvain Gugger authored
* Add copyright everywhere missing
* Style
- 23 Nov, 2020 1 commit
alexorona authored
* gpt2 and t5 parallel modeling
* model_parallel utils update
* adding missing model_parallel_utils: adds the missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5
* training_args reformat
* style formatting: docstring length on training_args and model_parallel_utils
* style changes: make style && make quality for training_args and model_parallel_utils
* adding tests
* minor change in trainer: reverts loss calculation
* Update training_args.py
* Update training_args.py: added back docstring language for adam_beta1 and adam_beta2
* Update trainer.py
* Update src/transformers/trainer.py
* Fix style & rebase
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
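A sketch of the explicit device-map form of this API; the block-to-GPU split below is hypothetical for the 12-layer `gpt2` checkpoint and assumes two GPUs:

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
# Hypothetical split: blocks 0-5 on GPU 0, blocks 6-11 on GPU 1.
device_map = {
    0: [0, 1, 2, 3, 4, 5],
    1: [6, 7, 8, 9, 10, 11],
}
model.parallelize(device_map)
```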
- 17 Nov, 2020 1 commit
Sylvain Gugger authored
* Remove old deprecated arguments
* Remove needless imports
* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
- 16 Nov, 2020 2 commits
Sylvain Gugger authored
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Run on the real suite
* Fix slow tests
LSinev authored
* Fix passing token_type_ids during GPT2DoubleHeadsModel.generate() if used, and for GPT2LMHeadModel too
* Update tests to check token_type_ids usage in GPT2 models
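A small sketch of the behavior under test: extra model kwargs such as `token_type_ids` passed to `generate()` are forwarded to the model at each step (the segment ids below are placeholders):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Q: Hi! A:", return_tensors="pt")
token_type_ids = torch.zeros_like(inputs["input_ids"])  # placeholder segment ids
output = model.generate(inputs["input_ids"], token_type_ids=token_type_ids, max_length=20)
```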
- 09 Nov, 2020 1 commit
Patrick von Platen authored
* add training tests
* correct longformer
* fix docs
* fix some tests
* fix some more train tests
* remove ipdb
* fix multiple edge case model training
* fix funnel and prophetnet
* clean gpt models
* undo renaming of albert
- 03 Nov, 2020 1 commit
Patrick von Platen authored
* first draft
* show design proposition for new generate method
* up
* make better readable
* make first version
* gpt2 tests pass
* make beam search for gpt2 work
* add first encoder-decoder code
* delete typo
* make t5 work
* save intermediate
* make bart work with beam search
* finish beam search bart / t5
* add default kwargs
* make more tests pass
* fix no bad words sampler
* some fixes and tests for all distribution processors
* fix test
* fix rag slow tests
* merge to master
* add no_grad to generate
* make all slow tests pass
* speed up generate
* fix edge case bug
* small fix
* correct typo
* add type hints and docstrings
* fix typos in tests
* add beam search tests
* add tests for beam scorer
* fix test rag
* finish beam search tests
* move generation tests into a separate file
* fix generation tests
* more tests
* add aggressive generation tests
* fix tests
* add gpt2 sample test
* add more docstrings
* add more docs
* finish docstrings
* apply some more of Sylvain's and Sam's comments
* fix some typos
* make fix copies
* apply Lysandre's and Sylvain's comments
* final corrections on examples
* small fix for reformer
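A usage sketch of the refactored method with standard beam-search arguments of this era:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_ids = tokenizer("In a shocking finding,", return_tensors="pt").input_ids
# Beam search over 5 beams; no_repeat_ngram_size curbs repetition.
beam_output = model.generate(
    input_ids, num_beams=5, no_repeat_ngram_size=2, max_length=50, early_stopping=True
)
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))
```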
- 21 Oct, 2020 1 commit
Patrick von Platen authored
- 14 Oct, 2020 1 commit
Jonathan Chang authored
* Add support for gpt2 batch inferencing
* add test
* remove typo
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
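The recipe this enables looks roughly like the following: pad on the left so every prompt ends at the same position, reuse the EOS token as padding (GPT-2 has none of its own), and pass the attention mask:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.padding_side = "left"            # generation continues from the right edge
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no dedicated pad token

model = GPT2LMHeadModel.from_pretrained("gpt2")
model.config.pad_token_id = model.config.eos_token_id

batch = tokenizer(["Hello, my name is", "The weather today"],
                  return_tensors="pt", padding=True)
outputs = model.generate(batch["input_ids"],
                         attention_mask=batch["attention_mask"], max_length=20)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```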
- 06 Oct, 2020 1 commit
Lysandre Debut authored
* Add GPT2ForSequenceClassification based on DialogRPT
* Better documentation
* Code quality
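A usage sketch (the label count is a placeholder): the head classifies from the last non-padding token's hidden state, so `pad_token_id` must be set for batched inputs.

```python
import torch
from transformers import GPT2ForSequenceClassification, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=2)
model.config.pad_token_id = tokenizer.eos_token_id  # GPT-2 has no pad token

inputs = tokenizer("What a great movie!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1))  # predicted class id (head is untrained here)
```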
- 01 Oct, 2020 1 commit
Patrick von Platen authored
* clean T5
* fix t5 tests
* fix index typo
* fix tf common test
* fix examples
* change positional ordering for Bart and FSMT
* add signature test
* clean docs and add tests
* add docs to encoder decoder
* clean docs
* correct two docstrings
* remove sig test for TF Electra & Funnel
* fix tf t5 slow tests
* fix input_ids to inputs in tf
* Update src/transformers/modeling_bart.py
* implement lysandre results
* make style
* fix encoder decoder typo
* fix tf slow tests
* fix slow tests
* renaming
* remove unused input
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
- 29 Sep, 2020 1 commit
Teven authored
* GPT2 gradient checkpointing
* find_unused_parameters removed if checkpointing
* Update src/transformers/configuration_gpt2.py
* Added a test for generation with checkpointing
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
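At the time, checkpointing was toggled through the config, as sketched below (newer releases use `model.gradient_checkpointing_enable()` instead); `use_cache` is disabled since cached key/values and activation recomputation don't mix:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Trade compute for memory: activations are recomputed during backward.
config = GPT2Config(gradient_checkpointing=True, use_cache=False)
model = GPT2LMHeadModel(config)
```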
- 01 Sep, 2020 1 commit
Patrick von Platen authored
* fix generate for GPT2 Double Head
* fix gpt2 double head model
* fix bart / t5
* also add for no beam search
* fix no beam search
* fix encoder decoder
* simplify t5
* fix t5 tests
* fix BART
* fix transfo-xl
* fix conflict
* integrate Sylvain's and Sam's comments
* fix tf past_decoder_key_values
* fix enc dec test
- 26 Aug, 2020 1 commit
Lysandre authored
- 24 Aug, 2020 1 commit
Sylvain Gugger authored
* Run new isort
* More changes
* Update CI, CONTRIBUTING and benchmarks
- 20 Aug, 2020 1 commit
Patrick von Platen authored
* fix distilbert
* fix typo
- 14 Aug, 2020 1 commit
Patrick von Platen authored
* add cross attention layers for gpt2
* make gpt2 cross attention work
* finish bert2gpt2
* add explicit comments
* remove attention mask since not yet supported
* revert attn mask in pipeline
* Update src/transformers/modeling_gpt2.py
* Update src/transformers/modeling_encoder_decoder.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
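With cross-attention in place, GPT-2 can act as the decoder of an `EncoderDecoderModel`; a sketch using the standard public checkpoints:

```python
from transformers import BertTokenizer, EncoderDecoderModel

# Warm-start a BERT encoder with a GPT-2 decoder; the decoder's new
# cross-attention weights are randomly initialized and need fine-tuning.
model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "gpt2")

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("A short article to summarize.", return_tensors="pt")
# Illustrative forward pass; real use would supply proper decoder inputs/labels.
outputs = model(input_ids=inputs.input_ids, decoder_input_ids=inputs.input_ids)
```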
- 13 Aug, 2020 1 commit
Stas Bekman authored
* cleanup torch unittests: part 2
* remove trailing comma added by isort, which breaks flake
* one more comma
* revert odd balls
* part 3: odd cases
* more ["key"] -> .key refactoring
* .numpy() is not needed
* more unnecessary .numpy() removed
* more simplification
- 04 Aug, 2020 1 commit
Stas Bekman authored
* improve unit tests: this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973, before applying it to the rest
* batch 1
* batch 2
* batch 3
* batch 4
* batch 5
* style
* non-tf template
* last deletion of check_loss_output
- 31 Jul, 2020 1 commit
Sylvain Gugger authored
* Use return_dict=True in all tests
* Formatting
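The pattern being standardized, sketched briefly: with `return_dict=True`, a forward pass returns a model-output dataclass with named fields instead of a plain tuple.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello world", return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"], return_dict=True)
loss, logits = outputs.loss, outputs.logits  # named access, not outputs[0], outputs[1]
```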
- 23 Jul, 2020 1 commit
Sylvain Gugger authored
* Avoid unnecessary warnings when loading pretrained model
* Fix test
* Add other keys to ignore
* keys_to_ignore_at_load -> authorized_missing_keys
- 01 Jul, 2020 1 commit
Sam Shleifer authored
- 24 Jun, 2020 1 commit
Patrick von Platen authored
* fix use cache
* add bart use cache
* fix bart
* finish bart
- 16 Jun, 2020 1 commit
Amil Khare authored
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
- 05 Jun, 2020 1 commit
Sylvain Gugger authored
- 02 Jun, 2020 1 commit
Julien Chaumond authored
* Kill model archive maps
* Fixup
* Also kill model_archive_map for MaskedBertPreTrainedModel
* Unhook config_archive_map
* Tokenizers: align with model id changes
* make style && make quality
* Fix CI
- 27 May, 2020 1 commit
Sam Shleifer authored
- 19 May, 2020 2 commits
Patrick von Platen authored
* fix gpu slow tests in pytorch
* change model to device syntax
Julien Chaumond authored
* Test case for #3936
* multigpu tests pass on pytorch 1.4.0
* Fixup
* multigpu tests pass on pytorch 1.5.0
* Update src/transformers/modeling_utils.py
* rename multigpu to require_multigpu
* more doc
- 01 May, 2020 1 commit
Julien Chaumond authored
There's an inconsistency right now where:
- we load some models into CACHE_DIR
- and some models into the default cache
- and often both, for the same models
When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth. I'd rather always use the default cache.
- 20 Mar, 2020 1 commit
Patrick von Platen authored
* make style
* fix conflicts
- 08 Mar, 2020 3 commits
patrickvonplaten authored
patrickvonplaten authored
patrickvonplaten authored
- 03 Mar, 2020 1 commit
Patrick von Platen authored
* add first copy-paste test to tf 2 generate
* add tf top_k_top_p_filter fn
* add generate function for TF
* implemented generate for all models except transfoXL
* make style
* change permission of test file to correct ones
* delete ipdb
* fix bug and finish simple gpt2 integration test
* clean test file
* change import style
* add decorators
* fix tf ctrl bug: dim => axis in TF
* refactored test file
* take out test_torch_tf_conversion if nothing is defined
* remove useless files
* fix and merge conflicts
* exposed top_k_top_p_filtering fns
* delete weirdly created w! file
* add comment to test tf common modeling
* change tf.tensor.shape to shape_list(tensor)
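The filtering helper exposed here is `tf_top_k_top_p_filtering`, assumed importable from the top-level package in releases of this era; it masks every logit outside the top-k / nucleus top-p set to -inf before sampling:

```python
import tensorflow as tf
from transformers import tf_top_k_top_p_filtering

logits = tf.random.normal((1, 50257))  # one step of vocabulary logits
# Keep the 50 highest-scoring tokens, then the smallest set whose
# probability mass reaches 0.95; everything else becomes -inf.
filtered = tf_top_k_top_p_filtering(logits, top_k=50, top_p=0.95)
next_token = tf.random.categorical(filtered, num_samples=1)
```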