- 27 Apr, 2021 3 commits
-
-
Suraj Patil authored
* fix docs for decoder_input_ids * revert the changes for bart and mbart
-
Hamel Husain authored
* finish quicktour
* fix import
* fix print
* explain config default better
* Update docs/source/quicktour.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Hamel Husain authored
* update docs to reflect model output object
* run `make style`
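A minimal sketch (not taken from the commit itself) of the documented behavior: forward passes return output objects whose fields can be read by name instead of by tuple index; the checkpoint name is only an example.

```python
# Minimal sketch: model outputs are objects with named fields, not plain tuples.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)

print(outputs.logits)     # attribute access on the output object
print(outputs["logits"])  # dict-style access works as well
```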
-
- 26 Apr, 2021 20 commits
-
-
Ashwin Geet D'Sa authored
* removed max_len
* removed max_length from BeamSearchScorer
* correct max length
* finish
* del vim
* finish & add test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
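A minimal sketch of the resulting usage, under the assumption that the maximum length is now controlled from `generate()` rather than configured on `BeamSearchScorer`; the checkpoint name is only an example.

```python
# Minimal sketch: pass max_length to generate() together with num_beams instead of
# configuring it on BeamSearchScorer.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/bart-large-cnn"  # example summarization checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=20)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```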
-
Stas Bekman authored
* adding Z-inf
* revamp config process
* up version requirement
* wip
* massive rewrite
* cleanup
* cleanup
* Apply suggestions from code review
* consistent json commas
* act on suggestions
* leave this feature for 0.3.16
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
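A heavily hedged sketch of what a ZeRO-Infinity style setup can look like on the user side: the keys follow the DeepSpeed documentation for ZeRO stage 3 with NVMe offload, the paths are placeholders, and only the ZeRO section is shown; the actual integration docs should be treated as authoritative.

```python
# Sketch only: a partial DeepSpeed config enabling ZeRO stage 3 with NVMe offload
# (ZeRO-Infinity), written to disk and handed to the Trainer via
# TrainingArguments(deepspeed=...). Paths are placeholders.
import json

ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "nvme", "nvme_path": "/local_nvme"},
        "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
    },
}

with open("ds_config_zero_infinity.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# later: TrainingArguments(output_dir="out", deepspeed="ds_config_zero_infinity.json", ...)
```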
-
Jaimeen Ahn authored
The error came from an inconsistency between the variable holding the number of GPUs in the argument parser ('gpus') and the name actually used in the train.py script ('n_gpu'); making the two names consistent lets the example run.
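An illustrative sketch of the naming mismatch described above; the flag and attribute names are taken from the description, not from the actual example script.

```python
# Illustrative sketch of the bug pattern: the parser registers --gpus, so the script
# must read args.gpus; reading a differently named attribute such as args.n_gpu fails.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--gpus", type=int, default=1, help="number of GPUs to use")
args = parser.parse_args(["--gpus", "2"])

n_gpu = args.gpus  # use the attribute the parser actually defines
print(n_gpu)
```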
-
Bhadresh Savani authored
* added changes for uniformity
* modified files
* corrected typo
* fixed qa scripts
* fix typos
* fixed predict typo in qa no trainer
* fixed test file
* reverted trainer changes
* reverted trainer changes in custom examples
* updated readme
* added changes in deepspeed test
* added changes for predict and eval
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Stas Bekman authored
-
Stas Bekman authored
* fix invalid class name * proper ref * proper ref
-
Kostas Stathoulopoulos authored
* Improve documentation for is_split_into_words argument * Change description wording
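A minimal sketch of the argument being documented: `is_split_into_words=True` tells the tokenizer that the input is already split into words (pre-tokenized), not that it is already tokenized into subwords.

```python
# Minimal sketch: the input is a list of words; the tokenizer still applies its own
# subword tokenization to each word.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # example checkpoint
encoding = tokenizer(["Hello", "world", "!"], is_split_into_words=True)
print(tokenizer.convert_ids_to_tokens(encoding["input_ids"]))
```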
-
Sylvain Gugger authored
* Pass along seed to DistributedSampler * Add seed to DistributedLengthGroupedSampler
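A minimal sketch of the underlying PyTorch API this change feeds into: `DistributedSampler` accepts a `seed` so that the shuffle order is reproducible and identical across processes.

```python
# Minimal sketch with plain PyTorch: passing num_replicas and rank explicitly avoids
# needing an initialized process group for the illustration.
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(100))
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True, seed=42)
sampler.set_epoch(0)  # together with the seed, this fixes the shuffle order
print(list(iter(sampler))[:5])
```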
-
LSinev authored
-
Amine Abdaoui authored
-
Sylvain Gugger authored
* Add FP16 support for SageMaker MP * Add print debugs * Squeeze * Remove debug statements * Add defensive check * Typo
-
Daniel Stancl authored
TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699)
* Add cross_attn_head_mask to BART
* Fix cross_attentions in TFBart-like models
* This commit enables returning `cross_attentions` for TFBart-like models
* It also fixes attention head masking in the cross-attention module
* Update TF model templates
* Fix missing , in TF model templates
* Fix typo: congig -> config
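A minimal sketch (assuming TensorFlow is installed) of the newly returned field: with `output_attentions=True`, TFBart-like models now also expose the decoder's cross-attention weights. The checkpoint is only an example; depending on which weight formats it ships, `from_pt=True` may be needed when loading.

```python
# Minimal sketch: outputs.cross_attentions holds one attention tensor per decoder layer.
from transformers import BartTokenizer, TFBartModel

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")  # example checkpoint
model = TFBartModel.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="tf")
outputs = model(inputs, output_attentions=True)
print(len(outputs.cross_attentions))
```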
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
-
Vasudev Gupta authored
-
abiolaTresor authored
-
- 25 Apr, 2021 2 commits
-
-
cronoik authored
* removes the creation of separate config objects and uses the existing ones instead; overwrites resize_token_embeddings from the parent class because it is not working for the EncoderDecoderModel
* rollback to current version of the huggingface master branch
* reworked version that ties the encoder and decoder configs to the parent EncoderDecoder instance
* overwrite of resize_token_embeddings throws an error now
* review comment suggestion
* implemented a warning in case the EncoderDecoder is created with configs that differ between the EncoderDecoderConfig and the decoder or encoder config
* added test to avoid diverging configs of the wrapper class and the wrapped classes
* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py
* make style
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
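A minimal sketch of the behavior described above, under the assumption that the wrapper now reuses the sub-models' configs and that token embeddings are resized on the wrapped encoder/decoder directly (calling `resize_token_embeddings` on the wrapper itself raises an error after this change).

```python
# Minimal sketch: resize the embeddings on the wrapped models, not on the wrapper.
from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"  # example checkpoints
)

new_vocab_size = model.config.encoder.vocab_size + 2  # e.g. after adding two tokens
model.encoder.resize_token_embeddings(new_vocab_size)
model.decoder.resize_token_embeddings(new_vocab_size)
```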
-
Daniel Stancl authored
* Add head_mask & decoder_head_mask + some corrections
* Fix head masking for N-grams
* Enable test_headmasking for the encoder and decoder
* Fix one typo in modeling_prophetnet.py
* Enable test_headmasking for ProphetNetStandaloneDecoderModelTest and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py
* make style
* Fix cross_head_mask
* Fix attention head mask naming
* `cross_head_mask` -> `cross_attn_head_mask`
* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
* Still need to merge #10605 to master to pass the tests
-
- 24 Apr, 2021 2 commits
-
-
Sylvain Gugger authored
-
cronoik authored
The documentation linked to the parent class PreTrainedTokenizerFast, but it should link to the slow tokenizer (#11410)
-
- 23 Apr, 2021 13 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Philip May authored
* enable subword regularization.
* fix tokenizer storage
* fix docstring formatting
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py
* fix docstring formatting
* add test for subword regularization tokenizer
* improve comments of test
* add sp_model_kwargs
* reformat docstring to match the style
* add some more documentation
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py
* improve docstring
* empty commit to trigger CI
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py
* fix docstring formatting for sphinx
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
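A minimal sketch of the new `sp_model_kwargs` argument: the kwargs are forwarded to the underlying SentencePiece processor, so subword regularization (sampling) can be enabled for the slow tokenizer. The sampling options shown are standard SentencePiece parameters.

```python
# Minimal sketch: enable SentencePiece sampling via sp_model_kwargs; with sampling on,
# repeated tokenizations of the same text may differ.
from transformers import XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained(
    "xlm-roberta-base",
    sp_model_kwargs={"enable_sampling": True, "nbest_size": -1, "alpha": 0.1},
)

print(tokenizer.tokenize("New York is a city."))
print(tokenizer.tokenize("New York is a city."))
```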
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Fix cross-attention head mask for Torch BART models
* Fix head masking for the cross-attention module for the following models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart, Pegasus
* Enable test_headmasking for M2M_100 model
* Fix cross_head_mask for FSMT, LED and T5
* This commit fixes `head_mask` for cross-attention modules in the following models: FSMT, LED, T5
* It also contains some smaller changes in the docs so that it is perfectly clear that the shape of `cross_head_mask` is the same as that of `decoder_head_mask`
* Update template
* Fix template for BartForCausalLM
* Fix cross_head_mask for Speech2Text models
* Fix cross_head_mask in templates
* Fix args order in BartForCausalLM template
* Fix doc in BART templates
* Make more explicit naming
* `cross_head_mask` -> `cross_attn_head_mask`
* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
* Fix doc
* make style quality
* Fix speech2text docstring
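A minimal sketch of the argument after the renaming: `cross_attn_head_mask` has the same shape as `decoder_head_mask`, i.e. (decoder_layers, decoder_attention_heads), with 1 keeping a head and 0 masking it; the checkpoint is only an example.

```python
# Minimal sketch: mask the first cross-attention head of the first decoder layer.
import torch
from transformers import BartModel, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartModel.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="pt")
cfg = model.config
cross_attn_head_mask = torch.ones(cfg.decoder_layers, cfg.decoder_attention_heads)
cross_attn_head_mask[0, 0] = 0.0

outputs = model(**inputs, cross_attn_head_mask=cross_attn_head_mask)
```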
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Nicola De Cao authored
When passing `inputs_embeds` and leaving `input_ids=None`, the generation function fails because `input_ids` is created inside the function when it should not be.
-
Kiran R authored
-
Patrick von Platen authored
-
Sylvain Gugger authored
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
* Address review comments
* Apply suggestions from code review
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
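A minimal sketch of the new API (requires being logged in to the Hugging Face Hub, e.g. via `huggingface-cli login`); the repository name is a placeholder.

```python
# Minimal sketch: models and tokenizers can be uploaded directly with push_to_hub.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

model.push_to_hub("my-finetuned-model")      # placeholder repository name
tokenizer.push_to_hub("my-finetuned-model")
```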
-
Teven authored
* Fixed trainer total_flos reloading in distributed mode * logging flos at the end of training
-