- 26 Apr, 2021 12 commits
-
-
Kostas Stathoulopoulos authored
* Improve documentation for is_split_into_words argument * Change description wording
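For context, a minimal sketch of the argument whose documentation was improved (the checkpoint name is only illustrative):

```python
from transformers import AutoTokenizer

# Illustrative example: the text is already split into words, so the tokenizer
# only applies subword tokenization on top of the given word boundaries instead
# of splitting on whitespace itself.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
encoding = tokenizer(["Hello", "world", "!"], is_split_into_words=True)
print(encoding["input_ids"])
```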
-
Sylvain Gugger authored
* Pass along seed to DistributedSampler * Add seed to DistributedLengthGroupedSampler
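A rough sketch of what the change amounts to (the function name and variables are illustrative, not the Trainer's actual code):

```python
from torch.utils.data.distributed import DistributedSampler

# Illustrative only: forward the training seed to the sampler so every process
# shuffles the data in the same, reproducible order across runs.
def build_train_sampler(train_dataset, world_size: int, rank: int, seed: int):
    return DistributedSampler(
        train_dataset,
        num_replicas=world_size,
        rank=rank,
        seed=seed,  # DistributedSampler otherwise falls back to its default seed of 0
    )
```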
-
LSinev authored
-
Amine Abdaoui authored
-
Sylvain Gugger authored
* Add FP16 support for SageMaker MP * Add print debugs * Squeeze * Remove debug statements * Add defensive check * Typo
-
Daniel Stancl authored
TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699) * Add cross_attn_head_mask to BART * Fix cross_attentions in TFBart-like models * This commit enables returning of `cross_attentions` for TFBart-like models * It also fixes attention head masking in the cross-attention module * Update TF model templates * Fix missing , in TF model templates * Fix typo: congig -> config
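A small usage sketch of the new output field (the checkpoint is just an example; pass `from_pt=True` if it only ships PyTorch weights):

```python
from transformers import BartTokenizer, TFBartForConditionalGeneration

# Illustrative: with output_attentions=True, TFBart-like models now also return
# the decoder's cross-attention weights as `cross_attentions`.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = TFBartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="tf")
outputs = model(**inputs, output_attentions=True)

# One tensor per decoder layer, shape (batch, heads, tgt_len, src_len)
print(len(outputs.cross_attentions), outputs.cross_attentions[0].shape)
```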
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
-
Vasudev Gupta authored
-
abiolaTresor authored
-
- 25 Apr, 2021 2 commits
-
-
cronoik authored
* removes the creation of separate config objects and uses the existing ones instead + overwrites resize_token_embeddings from the parent class because it does not work for the EncoderDecoderModel * roll back to the current version of the huggingface master branch * reworked version that ties the encoder and decoder configs to the parent EncoderDecoder instance * the overwrite of resize_token_embeddings now throws an error * review comment suggestion Co-authored-by:
Suraj Patil <surajp815@gmail.com> * implemented a warning in case an EncoderDecoderModel is created with configs that differ between the EncoderDecoderConfig and the decoder config or encoder config * added a test to avoid diverging configs of the wrapper class and the wrapped classes * Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py * make style Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Daniel Stancl authored
* Add head_mask & decoder_head_mask + some corrections * Fix head masking for N-grams * Enable test_headmasking for encoder and decoder * Fix a typo in modeling_prophetnet.py * Enable test_headmasking for ProphetNetStandaloneDecoderModelTest and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py * make style * Fix cross_head_mask * Fix attention head mask naming * `cross_head_mask` -> `cross_attn_head_mask` * `cross_layer_head_mask` -> `cross_attn_layer_head_mask` * Still need to merge #10605 to master to pass the tests
-
- 24 Apr, 2021 2 commits
-
-
Sylvain Gugger authored
-
cronoik authored
The documentation linked to the parent class PreTrainedTokenizerFast, but it should link to the slow tokenizer (#11410)
-
- 23 Apr, 2021 19 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Philip May authored
* enable subword regularization. * fix tokenizer storage * fix docstring formatting * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by:
Stefan Schweter <stefan@schweter.it> * fix docstring formatting * add test for subword regularization tokenizer * improve comments of test * add sp_model_kwargs * reformat docstring to match the style * add some more documentation * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve docstring * empty commit to trigger CI * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix docstring formatting for sphinx Co-authored-by:
Stefan Schweter <stefan@schweter.it> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
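A short sketch of the new `sp_model_kwargs` argument added here (the values are only illustrative):

```python
from transformers import XLMRobertaTokenizer

# Illustrative: sp_model_kwargs is forwarded to the underlying SentencePiece
# processor, here enabling subword regularization (sampling), so the same string
# can tokenize differently from call to call during training.
tokenizer = XLMRobertaTokenizer.from_pretrained(
    "xlm-roberta-base",
    sp_model_kwargs={"enable_sampling": True, "nbest_size": -1, "alpha": 0.1},
)
print(tokenizer.tokenize("subword regularization"))  # output may vary between calls
```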
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Fix cross-attention head mask for Torch BART models * Fix head masking for the cross-attention module for the following models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart, Pegasus * Enable test_headmasking for M2M_100 model * Fix cross_head_mask for FSMT, LED and T5 * This commit fixes `head_mask` for cross-attention modules in the following models: FSMT, LED, T5 * It also contains some smaller doc changes so that it is perfectly clear that the shape of `cross_head_mask` is the same as that of `decoder_head_mask` * Update template * Fix template for BartForCausalLM * Fix cross_head_mask for Speech2Text models * Fix cross_head_mask in templates * Fix args order in BartForCausalLM template * Fix doc in BART templates * Make more explicit naming * `cross_head_mask` -> `cross_attn_head_mask` * `cross_layer_head_mask` -> `cross_attn_layer_head_mask` * Fix doc * make style quality * Fix speech2text docstring
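A minimal sketch of the renamed argument in action (the checkpoint and mask values are illustrative):

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
inputs = tokenizer("Masking cross-attention heads.", return_tensors="pt")

# 1.0 keeps a head, 0.0 masks it; here the first cross-attention head of every
# decoder layer is masked, independently of the decoder self-attention heads.
mask = torch.ones(model.config.decoder_layers, model.config.decoder_attention_heads)
mask[:, 0] = 0.0

outputs = model(**inputs, cross_attn_head_mask=mask, output_attentions=True)
```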
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Nicola De Cao authored
When passing `inputs_embeds` and leaving `input_ids=None`, the generation function fails because `input_ids` is created by the function when it should not be.
-
Kiran R authored
-
Patrick von Platen authored
-
Sylvain Gugger authored
* Initial support for upload to hub * push -> upload * Fixes + examples * Fix torchhub test * Torchhub test I hate you * push_model_to_hub -> push_to_hub * Apply mixin to other pretrained models * Remove ABC inheritance * Add tests * Typo * Run tests * Install git-lfs * Change approach * Add push_to_hub to all * Staging test suite * Typo * Maybe like this? * More deps * Cache * Adapt name * Quality * MOAR tests * Put it in testing_utils * Docs + torchhub last hope * Styling * Wrong method * Typos * Update src/transformers/file_utils.py Co-authored-by:
Julien Chaumond <julien@huggingface.co> * Address review comments * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Julien Chaumond <julien@huggingface.co> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
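A brief sketch of the new upload flow (the repo name is hypothetical; requires being logged in to the hub):

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-cased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# The new mixin adds push_to_hub() to models, tokenizers and configs; it creates
# (or updates) a repo on the hub and uploads the serialized files via git-lfs.
model.push_to_hub("my-finetuned-bert")
tokenizer.push_to_hub("my-finetuned-bert")
```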
-
Teven authored
* Fixed trainer total_flos reloading in distributed mode * logging flos at the end of training
-
Patrick von Platen authored
-
Yoshitomo Matsubara authored
-
Max Del authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* improve flax * refactor * typos * Update src/transformers/modeling_flax_utils.py * Apply suggestions from code review * Update src/transformers/modeling_flax_utils.py * fix typo * improve error tolerance * typo * correct nasty saving bug * fix from pretrained * correct tree map * add note * correct weight tying
-
- 22 Apr, 2021 5 commits
-
-
Sylvain Gugger authored
* Fix Trainer with remove_unused_columns=False * Typo
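For reference, a minimal sketch of the option the fix touches (output_dir is illustrative):

```python
from transformers import TrainingArguments

# Keep every dataset column instead of dropping the ones the model's forward()
# does not accept; the fix makes Trainer work correctly when this is disabled.
args = TrainingArguments(output_dir="out", remove_unused_columns=False)
```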
-
PenutChen authored
-
Matt authored
-
Takuya Makino authored
-
johnson7788 authored
fix typo Co-authored-by: johnson <johnson@github.com>
-