Commits · ce2298fb5f84a8d0d8860c15fb677b7ada07a8ad · chenpangpang / transformers

09 Apr, 2020 1 commit

[T5, generation] Add decoder caching for T5 (#3682) · ce2298fb

Patrick von Platen authored Apr 10, 2020



* initial commit to add decoder caching for T5

* better naming for caching

* finish T5 decoder caching

* correct test

* added extensive past testing for T5

* clean files

* make tests cleaner

* improve docstring

* improve docstring

* better reorder cache

* make style

* Update src/transformers/modeling_t5.py
Co-Authored-By: Yacine Jernite <yjernite@users.noreply.github.com>

* make set output past work for all layers

* improve docstring

* improve docstring
Co-authored-by: Yacine Jernite <yjernite@users.noreply.github.com>

ce2298fb

06 Apr, 2020 1 commit
- [Generate, Test] Split generate test function into beam search, no beam search (#3601) · 2ee41056
  Patrick von Platen authored Apr 06, 2020
```
* split beam search and no beam search test

* fix test

* clean generate tests
```
  2ee41056
31 Mar, 2020 1 commit
- [Generate] Add bad words list argument to the generate function (#3367) · b38d552a
  Patrick von Platen authored Mar 31, 2020
```
* add bad words list

* make style

* add bad_words_tokens

* make style

* better naming

* make style

* fix typo
```
  b38d552a
26 Mar, 2020 1 commit
- [Bart/Memory] don't create lm_head (#3323) · 39371ee4
  Sam Shleifer authored Mar 26, 2020
```
* delete lm_head, skips weight tying
* Fixed s3
```
  39371ee4
19 Mar, 2020 1 commit

Support T5 Generation (#3228) · bbf26c4e

Patrick von Platen authored Mar 19, 2020



* fix conflicts

* update bart max length test

* correct spelling mistakes

* implemented model specific encode function

* fix merge conflicts

* better naming

* save intermediate state -> need to rethink structure a bit

* leave tf problem as it is for now

* current version

* add layers.pop

* remove ipdb

* make style

* clean return cut decoding

* remove ipdbs

* Fix restoring layers in the decoders that don't exist.

* push good intermediate solution for now

* fix conflicts

* always good to refuse to merge conflicts when rebasing

* fix small bug

* improve function calls

* remove unused file

* add correct scope behavior for t5_generate
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>

bbf26c4e

17 Mar, 2020 1 commit

[generate] do_sample default back to False (#3298) · e8f44af5

Patrick von Platen authored Mar 17, 2020

* change do_samples back

* None better default as boolean

* adapt do_sample to True in test example

* make style

e8f44af5

11 Mar, 2020 4 commits
- only do output_past=True for language generation in bart · aceb3fba
  Patrick von Platen authored Mar 05, 2020
  
  aceb3fba
- fix conflicts · ff648221
  Patrick von Platen authored Mar 06, 2020
  
  ff648221
- refactored code a bit and made more generic · c0d9dd3b
  Patrick von Platen authored Mar 05, 2020
  
  c0d9dd3b
- fix conflicts · d8e2b3c5
  Patrick von Platen authored Mar 06, 2020
  
  d8e2b3c5
05 Mar, 2020 1 commit
- Correct missing keys + test (#3143) · 0001d056
  Lysandre Debut authored Mar 05, 2020
  
  0001d056
03 Mar, 2020 1 commit

Add generate() functionality to TF 2.0 (#3063) · 41341003

Patrick von Platen authored Mar 03, 2020

* add first copy past test to tf 2 generate

* add tf top_k_top_p_filter fn

* add generate function for TF

* add generate function for TF

* implemented generate for all models except transfoXL

* implemented generate for all models except transfoXL

* implemented generate for all models except transfoXL

* make style

* change permission of test file to correct ones

* delete ipdb

* delete ipdb

* fix bug and finish simple gpt2 integration test

* clean test file

* clean test file

* make style

* make style

* make style

* make style

* change import style

* change import style

* make style

* make style

* add decorators

* add decorators

* fix tf ctrl bug dim => axis in TF

* make style

* make style

* refactored test file

* refactored test file

* take out test_torch_tf_conversion if nothing is defined

* take out test_torch_tf_conversion if nothing is defined

* remove useless files

* remove useless files

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* solve conflicts

* solve conflicts

* fix conflicts

* fix conflicts

* merge conflicts

* delete ipdb

* exposed top_k_top_p_filtering fns

* delete weirdly created w! file

* add comment to test tf common modeling

* fix conflicts

* fix conflicts

* make style

* merge conflicts

* make style

* change tf.tensor.shape to shape_list(tensor)

41341003

02 Mar, 2020 1 commit
- correct greedy generation when doing beam search (#3078) · 2fdc7f6c
  Patrick von Platen authored Mar 02, 2020
```
* correct greedy generation when doing beam search

* improve comment
```
  2fdc7f6c
26 Feb, 2020 1 commit

Fix (non-slow) tests on GPU (torch) (#3024) · 9cda3620

Julien Chaumond authored Feb 26, 2020



* Fix tests on GPU (torch)

* Fix bart slow tests
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

9cda3620

24 Feb, 2020 1 commit

Add slow generate tests for pretrained lm models (#2909) · 17c45c39

Patrick von Platen authored Feb 24, 2020

* add slow generate lm_model tests

* fix conflicts

* merge conflicts

* fix conflicts

* add slow generate lm_model tests

* make style

* delete unused variable

* fix conflicts

* fix conflicts

* fix conflicts

* delete unused variable

* fix conflicts

* finished hard coded tests

17c45c39

21 Feb, 2020 1 commit

Improve special_token_id logic in run_generation.py and add tests (#2885) · fc38d4c8

Patrick von Platen authored Feb 21, 2020



* improving generation

* finalized special token behaviour for no_beam_search generation

* solved modeling_utils merge conflict

* solve merge conflicts in modeling_utils.py

* add run_generation improvements from PR #2749

* adapted language generation to not use hardcoded -1 if no padding token is available

* remove the -1 removal as hardcoded -1's are not necessary anymore

* add lightweight language generation testing for randomly initialized models - just checking whether no errors are thrown

* add slow language generation tests for pretrained models using hardcoded output with pytorch seed

* delete ipdb

* check that all generated tokens are valid

* renaming

* renaming Generation -> Generate

* make style

* updated so that generate_beam_search has the same token behavior as generate_no_beam_search

* consistent return format for run_generation.py

* deleted pretrain lm generate tests -> will be added in another PR

* cleaning of unused if statements and renaming

* run_generate will always return an iterable

* make style

* consistent renaming

* improve naming, make sure generate function always returns the same tensor, add docstring

* add slow tests for all lmhead models

* make style and improve example comments modeling_utils

* better naming and refactoring in modeling_utils

* improving generation

* finalized special token behaviour for no_beam_search generation

* solved modeling_utils merge conflict

* solve merge conflicts in modeling_utils.py

* add run_generation improvements from PR #2749

* adapted language generation to not use hardcoded -1 if no padding token is available

* remove the -1 removal as hardcoded -1's are not necessary anymore

* add lightweight language generation testing for randomly initialized models - just checking whether no errors are thrown

* add slow language generation tests for pretrained models using hardcoded output with pytorch seed

* delete ipdb

* check that all generated tokens are valid

* renaming

* renaming Generation -> Generate

* make style

* updated so that generate_beam_search has the same token behavior as generate_no_beam_search

* consistent return format for run_generation.py

* deleted pretrain lm generate tests -> will be added in another PR

* cleaning of unused if statements and renaming

* run_generate will always return an iterable

* make style

* consistent renaming

* improve naming, make sure generate function always returns the same tensor, add docstring

* add slow tests for all lmhead models

* make style and improve example comments modeling_utils

* better naming and refactoring in modeling_utils

* changed fast random lm generation testing design to more general one

* delete in old testing design in gpt2

* correct old variable name

* temporary fix for encoder_decoder lm generation tests - has to be updated when t5 is fixed

* adapted all fast random generate tests to new design

* better warning description in modeling_utils

* better comment

* better comment and error message
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

fc38d4c8

20 Feb, 2020 1 commit

New BartModel (#2745) · 53ce3854

Sam Shleifer authored Feb 20, 2020

* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs

53ce3854

04 Feb, 2020 3 commits
- fix default getattr · 9e5b549b
  sshleifer authored Feb 04, 2020
  
  9e5b549b
- double quotes · 25848a60
  sshleifer authored Feb 04, 2020
  
  25848a60
- minor cleanup of test_attention_outputs · cbcb83f2
  sshleifer authored Feb 03, 2020
  
  cbcb83f2
16 Jan, 2020 1 commit
- Fix failing torchscript test for xlnet · d9fa1bad
  Julien Chaumond authored Jan 15, 2020
```
model.parameters() order is apparently not stable (only for xlnet, for some reason)
```
  d9fa1bad
14 Jan, 2020 1 commit

Bias should be resized with the weights · 100e3b6f

Lysandre authored Jan 14, 2020

Created a link between the linear layer bias and the model attribute bias. This does not change anything for the user nor for the conversion scripts, but allows the `resize_token_embeddings` method to resize the bias as well as the weights of the decoder.

Added a test.

100e3b6f

11 Jan, 2020 2 commits
- flake · c6f682c1
  Julien Chaumond authored Jan 11, 2020
  
  c6f682c1
- Convention: name mixins mixins · 2f32dfd3
  Julien Chaumond authored Jan 11, 2020
  
  2f32dfd3
10 Jan, 2020 1 commit
- rm old ConfigTester · 055e80cf
  Julien Chaumond authored Jan 10, 2020
  
  055e80cf
06 Jan, 2020 2 commits
- GPU text generation: Moved the encoded_prompt to correct device · 81d6841b
  alberduris authored Dec 31, 2019
  
  81d6841b
- Moved the encoded_prompts to correct device · dd4df80f
  alberduris authored Dec 31, 2019
  
  dd4df80f
23 Dec, 2019 1 commit
- Remove unused variables in tests. · e6c0019c
  Aymeric Augustin authored Dec 23, 2019
  
  e6c0019c
22 Dec, 2019 10 commits

- Remove sys.version_info[0] == 2 or 3. · 798b3b38
  Aymeric Augustin authored Dec 22, 2019
  
  798b3b38
- Remove __future__ imports. · c824d15a
  Aymeric Augustin authored Dec 22, 2019
  
  c824d15a
- Remove unused GPTModelTester. · daf8bebc
  Aymeric Augustin authored Dec 22, 2019
```
It isn't imported anywhere.
```
  daf8bebc

Replace (TF)CommonTestCases for modeling with a mixin. · 345c23a6

Aymeric Augustin authored Dec 22, 2019

I suspect the wrapper classes were created in order to prevent the
abstract base class (TF)CommonModelTester from being included in test
discovery and running, because that would fail.

I solved this by replacing the abstract base class with a mixin.

Code changes are just de-indenting and automatic reformattings
performed by black to use the extra line space.

345c23a6
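
The mixin approach described in the commit above can be sketched in a few lines. This is only an illustration, not the repository's test code: `CommonModelChecksMixin`, `SmallModelTest`, the `values` attribute and `run_discovered` are hypothetical names. Because the mixin does not inherit from `unittest.TestCase`, the loader never collects it on its own; its checks run only through a concrete class that combines it with `TestCase`.

```python
import unittest


class CommonModelChecksMixin:
    """Shared checks. Not a TestCase, so the loader ignores it by itself."""

    # Hypothetical attribute: each concrete test class supplies its own data.
    values = None

    def test_values_are_sorted(self):
        self.assertEqual(list(self.values), sorted(self.values))


class SmallModelTest(CommonModelChecksMixin, unittest.TestCase):
    """Concrete test class: mixin checks plus TestCase machinery."""

    values = (1, 2, 3)


def run_discovered(candidates):
    """Load and run tests the way module discovery filters classes."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for obj in candidates:
        # Only TestCase subclasses are picked up; the bare mixin is skipped.
        if isinstance(obj, type) and issubclass(obj, unittest.TestCase):
            suite.addTests(loader.loadTestsFromTestCase(obj))
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_discovered([CommonModelChecksMixin, SmallModelTest])
    print(outcome.testsRun, outcome.wasSuccessful())
```

Had the shared class itself derived from `unittest.TestCase`, its test would also have been collected and would fail here, since `values` is unset on the base.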

Remove unittest.main() in test modules. · 7e98e211

Aymeric Augustin authored Dec 22, 2019

This construct isn't used anymore these days.

Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.

Use python -m unittest tests/test_foo.py instead.

7e98e211
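
The path difference mentioned in the commit above can be observed with a small probe. This is a sketch under assumptions: `probe.py` and the temporary project layout are hypothetical, and a plain `-m` invocation stands in for `python -m unittest`. It assumes a recent Python 3 with default path settings. Running a file as a script puts the file's own directory first on `sys.path`, while a `-m` invocation puts the current working directory first.

```python
import os
import subprocess
import sys
import tempfile


def first_path_entries(project_dir):
    """Return sys.path[0] for a script run and for a -m run of the same file."""
    tests_dir = os.path.join(project_dir, "tests")
    os.makedirs(tests_dir, exist_ok=True)
    # Hypothetical probe module: it only prints the first sys.path entry.
    probe = os.path.join(tests_dir, "probe.py")
    with open(probe, "w") as handle:
        handle.write("import sys\nprint(sys.path[0])\n")

    def run(*args):
        done = subprocess.run(
            [sys.executable, *args],
            cwd=project_dir, capture_output=True, text=True, check=True,
        )
        # Resolve symlinks so both results are comparable across platforms.
        return os.path.realpath(done.stdout.strip())

    as_script = run(probe)                # like: python tests/probe.py
    as_module = run("-m", "tests.probe")  # like: python -m tests.probe
    return as_script, as_module


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as project:
        script_entry, module_entry = first_path_entries(project)
        print("script run:", script_entry)
        print("module run:", module_entry)
```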

- Switch test files to the standard test_*.py scheme. · ced0a942
  Aymeric Augustin authored Dec 22, 2019
  
  ced0a942
- Move tests outside of library. · 067395d5
  Aymeric Augustin authored Dec 22, 2019
  
  067395d5
- Fix F401 flake8 warning (x28). · 939148b0
  Aymeric Augustin authored Dec 21, 2019
```
Do manually what autoflake couldn't manage.
```
  939148b0

Fix F401 flake8 warning (x152 / 268). · 80327a13

Aymeric Augustin authored Dec 21, 2019

This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.

80327a13

Sort imports with isort. · 158e82e0

Aymeric Augustin authored Dec 21, 2019

This is the result of:

    $ isort --recursive examples templates transformers utils hubconf.py setup.py

158e82e0

21 Dec, 2019 2 commits

Reformat source code with black. · fa84ae26

Aymeric Augustin authored Dec 21, 2019

This is the result of:

    $ black --line-length 119 examples templates transformers utils hubconf.py setup.py

There are a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.

This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.

fa84ae26

Take advantage of the cache when running tests. · b670c266

Aymeric Augustin authored Dec 20, 2019

Caching models across test cases and across runs of the test suite makes
slow tests somewhat more bearable.

Use gettempdir() instead of /tmp in tests. This makes it easier to
change the location of the cache with semi-standard TMPDIR/TEMP/TMP
environment variables.

Fix #2222.

b670c266
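
The effect of using `gettempdir()` rather than a hard-coded `/tmp` can be illustrated as follows. This is a minimal sketch, not the code from the commit: `cache_dir_under_tmp`, `tmp_root_with_override` and the directory name `example_test_cache` are hypothetical. `tempfile.gettempdir()` consults the `TMPDIR`, `TEMP` and `TMP` environment variables before falling back to platform defaults, and caches its answer in `tempfile.tempdir`, so the sketch clears that cache to show the override taking effect.

```python
import os
import tempfile


def cache_dir_under_tmp(name="example_test_cache"):
    """Return (and create) a cache directory under the temp directory."""
    # `name` is a hypothetical directory name chosen for this sketch.
    path = os.path.join(tempfile.gettempdir(), name)
    os.makedirs(path, exist_ok=True)
    return path


def tmp_root_with_override(custom_dir):
    """Report what gettempdir() returns when TMPDIR points at custom_dir."""
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    os.environ["TMPDIR"] = custom_dir
    tempfile.tempdir = None  # drop the cached value so the environment is re-read
    try:
        return tempfile.gettempdir()
    finally:
        # Restore the previous state so other code is unaffected.
        tempfile.tempdir = old_cached
        if old_env is None:
            del os.environ["TMPDIR"]
        else:
            os.environ["TMPDIR"] = old_env


if __name__ == "__main__":
    custom = tempfile.mkdtemp()
    print(tmp_root_with_override(custom) == custom)
    print(cache_dir_under_tmp())
```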