Commits · a573777901e662ec2e565be312ffaeedef6effec · chenpangpang / transformers

24 Aug, 2020 5 commits
- Update repo to isort v5 (#6686) · a5737779
  Sylvain Gugger authored Aug 24, 2020
```
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
```
  a5737779
- Fixed DataCollatorForLanguageModeling not accepting lists of lists (#6685) · d329c9b0
  Teven authored Aug 24, 2020
```
* Fixed DataCollatorForLanguageModeling + PermutationLanguageModeling not accepting lists of lists

* Update data_collator.py

* black was grumpy
```
  d329c9b0
- Missing commit · 0a850d21
  sgugger authored Aug 24, 2020
  
  0a850d21
- Don't reset the dataset type + plug for rm unused columns (#6683) · b30879fe
  Sylvain Gugger authored Aug 24, 2020
```
* Don't reset the type of the dataset

* Formatting

* Update trainer.py
Co-authored-by: Teven <teven.lescao@gmail.com>
```
  b30879fe
- Specify config filename (#6626) · 1a779ad7
  Jared T Nielsen authored Aug 24, 2020
  
  1a779ad7
21 Aug, 2020 2 commits
- CamembertForCausalLM (#6577) · d0e42a7b
  Suraj Patil authored Aug 21, 2020
```
* added CamembertForCausalLM

* add in __init__ and auto model

* style

* doc
```
  d0e42a7b
- Remove accidental comment (#6629) · bdf7e5de
  josephrocca authored Aug 21, 2020
  
  bdf7e5de
20 Aug, 2020 8 commits

Trainer automatically drops unused columns in nlp datasets (#6449) · e5f45227

Sylvain Gugger authored Aug 20, 2020

* Add a classmethod to easily build a Trainer from nlp dataset and metric

* Fix docstrings

* Split train/eval

* Formatting

* Log dropped columns + docs

* Authorize callable activations

* Poc for auto activation

* Be framework-agnostic

* Formatting

* Remove class method

* Remove unnecessary code

e5f45227

Regression test for pegasus bugfix (#6606) · 5bf4465e
Sam Shleifer authored Aug 20, 2020

5bf4465e

XLNet Bug when training with apex 16-bit precision (#6567) · 95395837

Ivan Dolgov authored Aug 20, 2020

* xlnet fp16 bug fix

* comment cast added

* Update modeling_xlnet.py
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>

95395837

TFTrainer dataset doc & fix evaluation bug (#6618) · f9d280a9

Joe Davison authored Aug 20, 2020

* TFTrainer dataset doc & fix evaluation bug

discussed in #6551

* add docstring to test/eval datasets

f9d280a9

Add tests to Trainer (#6605) · 573bdb0a

Sylvain Gugger authored Aug 20, 2020

* Add tests to Trainer

* Test if removing long breaks everything

* Remove ugly hack

* Fix distributed test

* Use float for number of epochs

573bdb0a

Fix CI · b3e54698
sgugger authored Aug 20, 2020

b3e54698

removed redundant arg in prepare_inputs (#6614) · 33bf4264
Prajjwal Bhargava authored Aug 20, 2020
```
* removed redundant arg in prepare_inputs

* made same change in prediction_loop
```
33bf4264

[cleanup] remove confusing newline (#6603) · 93c5c9a5
Oren Amsalem authored Aug 20, 2020

93c5c9a5

19 Aug, 2020 5 commits

Fix #6575 (#6596) · 18ca0e91
Sylvain Gugger authored Aug 19, 2020

18ca0e91

[BartTokenizerFast] add prepare_seq2seq_batch (#6543) · 7581884d
Suraj Patil authored Aug 19, 2020

7581884d

tf generation utils: remove unused kwargs (#6591) · 9a86321b
Sam Shleifer authored Aug 19, 2020

9a86321b

Feed forward chunking others (#6365) · 2a7402cb

Pradhy729 authored Aug 19, 2020

* Feed forward chunking for Distilbert & Albert

* Added ff chunking for many other models

* Change model signature

* Added chunking for XLM

* Cleaned up by removing some variables.

* remove test_chunking flag
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

2a7402cb
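
A minimal sketch of the idea behind feed forward chunking, not the code from this PR: a position-wise feed-forward layer transforms each sequence position independently, so it can be applied to slices of the sequence and the results concatenated, trading peak memory for more calls. The names `feed_forward` and `chunked_apply`, and the convention that a chunk size of 0 disables chunking, are assumptions made for this illustration.

```python
from typing import Callable, List

Vector = List[float]


def feed_forward(positions: List[Vector]) -> List[Vector]:
    """Toy position-wise layer: every position is transformed independently."""
    return [[max(0.0, 2.0 * x - 1.0) for x in vec] for vec in positions]


def chunked_apply(
    fn: Callable[[List[Vector]], List[Vector]],
    positions: List[Vector],
    chunk_size: int,
) -> List[Vector]:
    """Apply fn to slices of the sequence and concatenate the results."""
    if chunk_size <= 0:
        # Assumption for this sketch: 0 (or less) means "no chunking".
        return fn(positions)
    out: List[Vector] = []
    for start in range(0, len(positions), chunk_size):
        out.extend(fn(positions[start:start + chunk_size]))
    return out


sequence = [[0.0, 1.0], [2.0, -1.0], [0.5, 3.0], [1.5, 0.25], [4.0, 0.0]]
full = feed_forward(sequence)
chunked = chunked_apply(feed_forward, sequence, chunk_size=2)

# Chunking changes how much is processed at once, not the result.
assert chunked == full
```

The equality only holds because the toy layer never mixes information across positions; a layer that does (attention, for instance) could not be split this way.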

[EncoderDecoder] Add functionality to tie encoder decoder weights (#6538) · fe0b85e7

Patrick von Platen authored Aug 19, 2020

* start adding tie encoder to decoder functionality

* finish model tying

* make style

* Apply suggestions from code review

* fix t5 list including cross attention

* apply sams suggestions

* Update src/transformers/modeling_encoder_decoder.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add max depth break point
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

fe0b85e7

18 Aug, 2020 3 commits
- add BartConfig.force_bos_token_to_be_generated (#6526) · 1529bf96
  Sam Shleifer authored Aug 18, 2020
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
  1529bf96
- Fixed label datatype for STS-B (#6492) · 5a81195e
  Ali Modarressi authored Aug 18, 2020
```
* fixed label datatype for sts-b

* naming update

* make style

* make style
```
  5a81195e
- [marian] converter supports models from new Tatoeba project (#6342) · 12d76241
  Sam Shleifer authored Aug 17, 2020
  
  12d76241
17 Aug, 2020 7 commits
- [T5Tokenizer] add prepare_seq2seq_batch method (#6122) · 407da12e
  Suraj Patil authored Aug 17, 2020
```
* tests
```
  407da12e
- [Doc] add more MBart and other doc (#6490) · c9564f53
  Suraj Patil authored Aug 17, 2020
```
* add mbart example

* add Pegasus and MBart in readme

* typo

* add MBart in Pretrained models

* add pre-proc doc

* add DPR in readme

* fix indent

* doc fix
```
  c9564f53
- Fix CI · 7ca6ab67
  sgugger authored Aug 17, 2020
  
  7ca6ab67
- [BartTokenizer] add prepare s2s batch (#6212) · 2a77813d
  Suraj Patil authored Aug 17, 2020
```
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
```
  2a77813d
- [sched] polynomial_decay_schedule use default power=1.0 (#6473) · 39c3b1d9
  Stas Bekman authored Aug 17, 2020
  
  39c3b1d9
- [testing] a new TestCasePlus subclass + get_auto_remove_tmp_dir() (#6494) · 9dbe4094
  Stas Bekman authored Aug 17, 2020
```
* [testing] switch to a new TestCasePlus + get_auto_remove_tmp_dir() for auto-removal of tmp dirs

* respect after=True for tempfile, simplify code

* comments

* comment fix

* put `before` last in args, so can make debug even faster
```
  9dbe4094
- Support additional dictionaries for BERT Japanese tokenizers (#6515) · 48c6c613
  Masatoshi Suzuki authored Aug 17, 2020
```
* Update BERT Japanese tokenizers

* Update CircleCI config to download unidic

* Specify to use the latest dictionary packages
```
  48c6c613
14 Aug, 2020 5 commits

Fix TPU Convergence bug introduced by PR#6151 (#6488) · 24107c2c

Jin Young (Daniel) Sohn authored Aug 14, 2020

Currently with the bug introduced we're taking two optimizer steps per
batch: one global one, where `xm.optimizer_step` injects a CRS between
all cores in training, and one without. This has been affecting training
accuracy (for example, XLNet GLUE on MNLI is not converging, etc.).

24107c2c
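
The effect described in the message above can be imitated with a toy optimizer. This is only a sketch under stated assumptions: `ToySGD` and `train` are hypothetical names, no `torch_xla` call is used, and the second `step` merely stands in for the extra, unsynchronized update the message mentions.

```python
class ToySGD:
    """Minimal SGD on one scalar parameter that counts calls to step()."""

    def __init__(self, param: float, lr: float) -> None:
        self.param = param
        self.lr = lr
        self.steps = 0

    def step(self, grad: float) -> None:
        self.param -= self.lr * grad
        self.steps += 1


def train(num_batches: int, buggy: bool) -> ToySGD:
    opt = ToySGD(param=1.0, lr=0.5)
    for _ in range(num_batches):
        grad = 1.0  # constant gradient keeps the arithmetic easy to follow
        opt.step(grad)  # stands in for the synchronized optimizer step
        if buggy:
            opt.step(grad)  # the extra step taken per batch by mistake
    return opt


fixed = train(num_batches=4, buggy=False)
buggy = train(num_batches=4, buggy=True)

print(fixed.steps, fixed.param)  # 4 -1.0
print(buggy.steps, buggy.param)  # 8 -3.0
```

With two steps per batch the parameter moves twice as far for the same data, which is one way a training run can stop converging at a learning rate that was tuned for a single step.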

Generation doc (#6470) · 895ed8f4

Sylvain Gugger authored Aug 14, 2020

* Generation doc

* MBartForConditionalGeneration (#6441)

* add MBartForConditionalGeneration

* style

* rebase and fixes

* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS

* fix docs

* don't ignore mbart

* doc

* fix mbart fairseq link

* put mbart before bart

* apply doc suggestions

* Use hash to clean the test dirs (#6475)

* Use hash to clean the test dirs

* Use hash to clean the test dirs

* Use hash to clean the test dirs

* fix

* [EncoderDecoder] Add Cross Attention for GPT2 (#6415)

* add cross attention layers for gpt2

* make gpt2 cross attention work

* finish bert2gpt2

* add explicit comments

* remove attention mask since not yet supported

* revert attn mask in pipeline

* Update src/transformers/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_encoder_decoder.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Sort unique_no_split_tokens to make it deterministic (#6461)

* change unique_no_split_tokens's type to set

* use sorted list instead of set

* style

* Import accuracy_score (#6480)

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address comments

* Styling

* Generation doc

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address comments

* Styling
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
Co-authored-by: gijswijnholds <gijswijnholds@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

895ed8f4

Sort unique_no_split_tokens to make it deterministic (#6461) · 9a8c168f
Quentin Lhoest authored Aug 14, 2020
```
* change unique_no_split_tokens's type to set

* use sorted list instead of set

* style
```
9a8c168f
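
A short sketch of why a sorted list is deterministic where a set is not, assuming the tokens are plain strings. In CPython, string hashes are randomized per interpreter run unless `PYTHONHASHSEED` is fixed, so the iteration order of a set of strings can differ between runs, while `sorted(...)` always yields the same order. The function name `unique_sorted_tokens` is hypothetical and is not the code from this PR.

```python
from typing import Iterable, List


def unique_sorted_tokens(tokens: Iterable[str]) -> List[str]:
    """Deduplicate tokens and return them in a reproducible order."""
    return sorted(set(tokens))


tokens = ["[SEP]", "[CLS]", "<mask>", "[SEP]", "[PAD]", "[CLS]"]
result = unique_sorted_tokens(tokens)

# "<" sorts before "[" in code-point order, then C < P < S.
print(result)  # ['<mask>', '[CLS]', '[PAD]', '[SEP]']
```

The output does not depend on the input order or on the hash seed, so anything derived from the list is stable across processes.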

[EncoderDecoder] Add Cross Attention for GPT2 (#6415) · 1d6e71e1

Patrick von Platen authored Aug 14, 2020

* add cross attention layers for gpt2

* make gpt2 cross attention work

* finish bert2gpt2

* add explicit comments

* remove attention mask since not yet supported

* revert attn mask in pipeline

* Update src/transformers/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_encoder_decoder.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

1d6e71e1

MBartForConditionalGeneration (#6441) · 680f1337

Suraj Patil authored Aug 14, 2020

* add MBartForConditionalGeneration

* style

* rebase and fixes

* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS

* fix docs

* don't ignore mbart

* doc

* fix mbart fairseq link

* put mbart before bart

* apply doc suggestions

680f1337

13 Aug, 2020 5 commits
- add BartTokenizerFast in AutoTokenizer (#6464) · f51161e2
  Suraj Patil authored Aug 13, 2020
```
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
```
  f51161e2
- add LongformerTokenizerFast in AutoTokenizer (#6463) · a442f87a
  Suraj Patil authored Aug 13, 2020
  
  a442f87a
- Test model outputs equivalence (#6445) · f7cbc13d
  Lysandre Debut authored Aug 13, 2020
```
* Test model outputs equivalence

* Fix failing tests

* From dict to kwargs

* DistilBERT

* Addressing @sgugger and @patrickvonplaten's comments
```
  f7cbc13d
- typo fix (#6462) · 54c687e9
  Prajjwal Bhargava authored Aug 13, 2020
  
  54c687e9
- Fix docs and bad word tokens generation_utils.py (#6387) · 9d94aecd
  Zhu Baohe authored Aug 13, 2020
```
* fix

* fix2

* fix3
```
  9d94aecd