"vscode:/vscode.git/clone" did not exist on "1486d2aec2c667aa2beeed5eaac6625c87577093"
- 30 Apr, 2021 10 commits
-
-
Matt authored
Big refactor, fixes and multi-GPU/TPU support
-
bonniehyeon authored
* Fix do_eval default value in training_args.py * Update PULL_REQUEST_TEMPLATE.md
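A minimal sketch of the behavior this fix targets, assuming `do_eval` is now derived from the evaluation strategy rather than hard-coded; the argument values are illustrative:

```python
# Hedged sketch: after this fix, do_eval is expected to follow evaluation_strategy
# instead of defaulting to False independently of it.
from transformers import TrainingArguments

args = TrainingArguments(output_dir="out", evaluation_strategy="steps")
print(args.do_eval)  # expected to be True once a strategy other than "no" is set
```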
-
Takuya Makino authored
-
Shubham Sanghavi authored
-
Nicolas Patry authored
* Adding `AutomaticSpeechRecognitionPipeline`.
  - Because we added everything to enable this pipeline, we probably should add it to `transformers`.
  - This PR tries to limit the scope and focuses only on the pipeline part (what should go in, and what should come out).
  - The tests are very specific to S2T and Wav2Vec2 to make sure both architectures are supported by the pipeline. We don't use the mixin for tests right now, because that requires more work in the `pipeline` function (will be done in a follow-up PR).
  - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of sense from a user perspective and does not add any hard dependency (users can always use their own loading mechanism). Meanwhile, it feels slightly clunky to have so much optional preprocessing.
  - The pipeline does not support streaming audio right now.
  Future work:
  - Add `automatic-speech-recognition` as a `task`, and add the FeatureExtractor.from_pretrained within the `pipeline` function.
  - Add small models within tests.
  - Add the Mixin to tests.
  - Make the logic between ForCTC vs ForConditionalGeneration better.
* Update tests/test_pipelines_automatic_speech_recognition.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Adding docs + main import + type checking + LICENSE.
* Doc style.
* Fixing TYPE_HINT.
* Specifying waveform shape in the docs.
* Adding asserts + specifying in the documentation the shape of the input np.ndarray.
* Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Adding require to tests + moving the `feature_extractor` doc.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
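A hedged sketch of how the new pipeline is meant to be used; the checkpoint name and the exact constructor arguments are assumptions based on the description above, and the `automatic-speech-recognition` task alias is explicitly left for a follow-up PR:

```python
# Hedged sketch of the new AutomaticSpeechRecognitionPipeline (checkpoint assumed).
import numpy as np
from transformers import (
    AutomaticSpeechRecognitionPipeline,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2ForCTC,
    Wav2Vec2Tokenizer,
)

model_id = "facebook/wav2vec2-base-960h"  # illustrative CTC checkpoint
asr = AutomaticSpeechRecognitionPipeline(
    model=Wav2Vec2ForCTC.from_pretrained(model_id),
    tokenizer=Wav2Vec2Tokenizer.from_pretrained(model_id),
    feature_extractor=Wav2Vec2FeatureExtractor.from_pretrained(model_id),
)

# The pipeline expects a mono waveform as a 1-D np.ndarray (file paths/bytes go
# through the optional ffmpeg_read helper instead).
waveform = np.zeros(16000, dtype=np.float32)  # one second of silence at 16 kHz
print(asr(waveform))  # -> {"text": "..."}
```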
-
CeShine Lee authored
* Implement gradient checkpointing for T5Stack
* A bit more robust type checking
* Add `gradient_checkpointing` to T5Config
* Formatting
* Set requires_grad only when training
* None return value will only cause problems when training
* Change the output tuple according to `use_cache`
* Enable gradient checkpointing for the decoder. Squashed commit of the following:
  commit 658bdd0bd1215353a8770f558bda2ea69a0ad0c7 (Ceshine Lee <shuanck@gmail.com>, Sat Apr 24 14:08:17 2021 +0800): Only set `require_grad` for gradient checkpointing
  commit acaeee6b2e675045fb28ce2176444c1d63e908bd (Ceshine Lee <shuanck@gmail.com>, Sat Apr 24 13:59:35 2021 +0800): Make gradient checkpointing work with the decoder
* Formatting
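A hedged sketch of how the new flag is enabled through `T5Config`; the checkpoint name is illustrative:

```python
# Hedged sketch: enabling the new gradient_checkpointing flag via T5Config.
from transformers import T5Config, T5ForConditionalGeneration

config = T5Config.from_pretrained("t5-small", gradient_checkpointing=True, use_cache=False)
model = T5ForConditionalGeneration.from_pretrained("t5-small", config=config)
model.train()  # per the commit, checkpointing only takes effect while training
```
-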
Patrick von Platen authored
-
Philip May authored
* add test for pickle
* simplify test
* fix test
* code style
* add missing pickle import
* fix test
* fix test
* fix test
-
Frederik Bode authored
Co-authored-by: Frederik Bode <frederik@paperbox.ai>
-
Lysandre Debut authored
-
- 29 Apr, 2021 4 commits
-
-
Sylvain Gugger authored
* Split checkpoint from model_name_or_path in examples * Address review comments * Address review comments
-
Michael Benayoun authored
Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Sylvain Gugger authored
* Reformat to make code clearer * Reformat to make code clearer
-
Patrick von Platen authored
* add attentions & hidden states
* add model outputs + docs
* finish docs
* finish tests
* finish impl
* del @
* finish
* finish
* correct test
* apply sylvains suggestions
* Update src/transformers/models/bert/modeling_flax_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* simplify more
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
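A hedged sketch of the new outputs on the Flax side; the checkpoint is illustrative and assumed to ship Flax weights:

```python
# Hedged sketch: Flax BERT now returns attentions and hidden states on request.
from transformers import BertTokenizerFast, FlaxBertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = FlaxBertModel.from_pretrained("bert-base-uncased")  # add from_pt=True if only PyTorch weights exist

inputs = tokenizer("Hello world", return_tensors="np")
outputs = model(**inputs, output_attentions=True, output_hidden_states=True)
print(len(outputs.attentions), len(outputs.hidden_states))  # per-layer tuples
```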
-
- 28 Apr, 2021 1 commit
-
-
Hamel Husain authored
* Update tokenization_utils_base.py
* add assertion
* check batch len
* Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add error message
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
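A hedged sketch of the check this change introduces; the tokenizer checkpoint is illustrative:

```python
# Hedged sketch: batched text and text_pair must now have the same length,
# otherwise the tokenizer raises a descriptive error instead of failing obscurely.
from transformers import BertTokenizerFast

tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
texts = ["first premise", "second premise"]
pairs = ["only one hypothesis"]  # deliberate length mismatch
try:
    tok(texts, pairs, padding=True)
except (AssertionError, ValueError) as err:
    print(err)
```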
-
- 27 Apr, 2021 1 commit
-
-
Suraj Patil authored
* fix docs for decoder_input_ids * revert the changes for bart and mbart
-
- 26 Apr, 2021 14 commits
-
-
Ashwin Geet D'Sa authored
* removed max_len
* removed max_length from BeamSearchScorer
* correct max length
* finish
* del vim
* finish & add test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
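A hedged sketch of the user-facing call after this change: the maximum length is handled inside `generate` rather than by `BeamSearchScorer`; the model choice is illustrative:

```python
# Hedged sketch: beam-search length is controlled through generate(max_length=...).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_ids = tok("The future of NLP", return_tensors="pt").input_ids
beams = model.generate(input_ids, num_beams=4, max_length=20, early_stopping=True)
print(tok.decode(beams[0], skip_special_tokens=True))
```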
-
Stas Bekman authored
* adding Z-inf
* revamp config process
* up version requirement
* wip
* massive rewrite
* cleanup
* cleanup
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* consistent json commas
* act on suggestions
* leave this feature for 0.3.16
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
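A hedged sketch of how the revamped integration is driven from the Trainer side; the config file name and its contents are assumptions, not taken from the commit:

```python
# Hedged sketch: DeepSpeed is configured by pointing TrainingArguments at a JSON file.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    fp16=True,
    deepspeed="ds_config_zero3.json",  # hypothetical ZeRO config consumed by the Trainer
)
```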
-
Sylvain Gugger authored
-
Stas Bekman authored
* fix invalid class name * proper ref * proper ref
-
Kostas Stathoulopoulos authored
* Improve documentation for is_split_into_words argument * Change description wording
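A hedged sketch of what the clarified argument means; the checkpoint is illustrative:

```python
# Hedged sketch: with is_split_into_words=True the input is a list of words,
# each of which may still be split into several subword tokens.
from transformers import BertTokenizerFast

tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
encoding = tok(["Hugging", "Face", "tokenizers"], is_split_into_words=True)
print(tok.convert_ids_to_tokens(encoding["input_ids"]))
```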
-
Sylvain Gugger authored
* Pass along seed to DistributedSampler * Add seed to DistributedLengthGroupedSampler
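A hedged sketch of what forwarding the seed amounts to; the dataset, replica count and rank are placeholders:

```python
# Hedged sketch: the Trainer now hands its seed to the DistributedSampler it builds,
# which is equivalent to constructing the sampler like this.
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(100))
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, seed=42)
print(list(sampler)[:5])  # reproducible shuffling for a fixed seed and epoch
```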
-
LSinev authored
-
Sylvain Gugger authored
* Add FP16 support for SageMaker MP * Add print debugs * Squeeze * Remove debug statements * Add defensive check * Typo
-
Daniel Stancl authored
TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699)
* Add cross_attn_head_mask to BART
* Fix cross_attentions in TFBart-like models
* This commit enables returning of `cross_attentions` for TFBart-like models
* It also fixes attention head masking in the cross-attention module
* Update TF model templates
* Fix missing , in TF model templates
* Fix typo: congig -> config
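A hedged sketch of the new output field; the checkpoint is illustrative and assumed to provide TF weights:

```python
# Hedged sketch: TFBart-like models now expose cross_attentions when requested.
from transformers import BartTokenizer, TFBartForConditionalGeneration

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = TFBartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tok("Hello world", return_tensors="tf")
outputs = model(
    inputs["input_ids"],
    decoder_input_ids=inputs["input_ids"],
    output_attentions=True,
)
print(len(outputs.cross_attentions))  # one tensor per decoder layer
```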
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
-
Vasudev Gupta authored
-
abiolaTresor authored
-
- 25 Apr, 2021 2 commits
-
-
cronoik authored
* removes the creation of separate config objects and uses the existing ones instead + overwrites resize_token_embeddings from the parent class because it is not working for the EncoderDecoderModel
* rollback to current version of the huggingface master branch
* reworked version that ties the encoder and decoder configs to the parent EncoderDecoder instance
* overwrite of resize_token_embeddings throws an error now
* review comment suggestion Co-authored-by: Suraj Patil <surajp815@gmail.com>
* implemented warning in case an EncoderDecoderModel is created with an EncoderDecoderConfig whose encoder or decoder config differs from the wrapped models' configs
* added test to avoid diverging configs of wrapper class and wrapped classes
* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py
* make style
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
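A hedged sketch of the tying described above; the checkpoints are illustrative and the identity checks reflect the intended behavior, not a verified guarantee:

```python
# Hedged sketch: the wrapper's config is expected to reference the same config
# objects as the wrapped encoder and decoder, so edits stay in sync.
from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
print(model.config.encoder is model.encoder.config)  # expected True after this change
print(model.config.decoder is model.decoder.config)  # expected True after this change
```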
-
Daniel Stancl authored
* Add head_mask & decoder_head_mask + some corrections
* Fix head masking for N-grams
* Enable test_headmasking for encoder and decoder
* Fix one typo in modeling_prophetnet.py
* Enable test_headmasking for ProphetNetStandaloneDecoderModelTest and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py
* make style
* Fix cross_head_mask
* Fix attention head mask naming
* `cross_head_mask` -> `cross_attn_head_mask`
* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
* Still need to merge #10605 to master to pass the tests
-
- 24 Apr, 2021 2 commits
-
-
Sylvain Gugger authored
-
cronoik authored
The documentation linked to the parent class PreTrainedTokenizerFast, but it should link to the slow tokenizer (#11410)
-
- 23 Apr, 2021 6 commits
-
-
Philip May authored
* enable subword regularization.
* fix tokenizer storage
* fix docstring formatting
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Stefan Schweter <stefan@schweter.it>
* fix docstring formatting
* add test for subword regularization tokenizer
* improve comments of test
* add sp_model_kwargs
* reformat docstring to match the style
* add some more documentation
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve docstring
* empty commit to trigger CI
* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix docstring formatting for sphinx
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
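A hedged sketch of the new `sp_model_kwargs` argument described above; the checkpoint is illustrative:

```python
# Hedged sketch: sp_model_kwargs forwards SentencePiece options; enabling sampling
# turns on subword regularization, so repeated tokenizations may differ.
from transformers import XLMRobertaTokenizer

tok = XLMRobertaTokenizer.from_pretrained(
    "xlm-roberta-base",
    sp_model_kwargs={"enable_sampling": True, "nbest_size": -1, "alpha": 0.1},
)
print(tok.tokenize("subword regularization"))
print(tok.tokenize("subword regularization"))  # may differ when sampling is enabled
```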
-
Daniel Stancl authored
* Fix cross-attention head mask for Torch BART models
* Fix head masking for the cross-attention module for the following models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart, Pegasus
* Enable test_headmasking for the M2M_100 model
* Fix cross_head_mask for FSMT, LED and T5
* This commit fixes `head_mask` for cross-attention modules in the following models: FSMT, LED, T5
* It also contains some smaller doc changes so that it is perfectly clear that the shape of `cross_head_mask` is the same as that of `decoder_head_mask`
* Update template
* Fix template for BartForCausalLM
* Fix cross_head_mask for Speech2Text models
* Fix cross_head_mask in templates
* Fix args order in BartForCausalLM template
* Fix doc in BART templates
* Make naming more explicit
* `cross_head_mask` -> `cross_attn_head_mask`
* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
* Fix doc
* make style quality
* Fix speech2text docstring
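A hedged sketch of the renamed argument on the PyTorch side; the checkpoint is illustrative:

```python
# Hedged sketch: cross-attention head masks are now passed as cross_attn_head_mask,
# with the same (num_layers, num_heads) shape as decoder_head_mask.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tok("Hello world", return_tensors="pt")
mask = torch.ones(model.config.decoder_layers, model.config.decoder_attention_heads)
mask[0, 0] = 0.0  # drop the first head of the first decoder layer's cross-attention
outputs = model(
    **inputs,
    decoder_input_ids=inputs["input_ids"],
    cross_attn_head_mask=mask,
    output_attentions=True,
)
print(len(outputs.cross_attentions))
```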
-
Sylvain Gugger authored
-
Nicola De Cao authored
When passing `inputs_embeds` and leaving `input_ids=None`, the generation function fails because `input_ids` is created by the function when it should not be.
-
Kiran R authored
-
Patrick von Platen authored
-