- 03 May, 2021 4 commits
-
Lysandre Debut authored
-
Patrick von Platen authored
* push * small change * correct other typo
-
Suraj Patil authored
* small fixes * style
-
lewtun authored
Replaces `tok` with `tokenizer` so examples can run with copy-paste
-
- 02 May, 2021 1 commit
-
jingyihe authored
* Fixed the doc for the shape of the returned `scores` tuples in generation_utils.py. * Fix the output shape of `scores` for `DecoderOnlyOutput`. * style fix
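A minimal sketch of the shape being documented (checkpoint name is illustrative, not taken from the commit): for a decoder-only model, `scores` is a tuple with one entry per generated token, each of shape `(batch_size * num_return_sequences, vocab_size)`.

```python
# Hedged sketch: inspect the `scores` tuple returned by generate().
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=20,
    return_dict_in_generate=True,
    output_scores=True,
)
# One entry per generated token, each of shape (batch_size, vocab_size) here.
print(len(outputs.scores), outputs.scores[0].shape)
```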
-
- 30 Apr, 2021 19 commits
-
Stas Bekman authored
* prep for deepspeed==0.3.16
* new version
* too soon
* support and test fp32 mode
* troubleshooting doc start
* workaround no longer needed
* add fp32 doc
* style
* cleanup, add tf32 note
* clarify
* release was made
-
Stas Bekman authored
* sync
* add activation overflow debug utility
* cleanup
* document detect_overflow
* import torch
* add deprecation warning
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* convert to rst, add note
* add class
* fix docs
* improve the doc
* rework to dump a lot more info about each frame
* complete expansion
* cleanup
* format
* cleanup
* doesn't have to be transformers
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* wrap long line
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
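A hedged sketch of attaching the new utility (the model name is illustrative): it registers forward hooks and reports the frames surrounding the first inf/nan it detects in activations or weights.

```python
# Hedged sketch: attach the overflow detector before running forward/backward
# passes; if an inf/nan appears, a report of the surrounding frames is printed.
from transformers import AutoModelForSequenceClassification
from transformers.debug_utils import DebugUnderflowOverflow

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
debug_overflow = DebugUnderflowOverflow(model)  # installs the forward hooks

# ...train or run inference as usual; the hooks handle the reporting.
```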
-
Hamel Husain authored
* fix task summary docs
* refactor to use model.config.id2label instead of list
* fix nit
* Update docs/source/task_summary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
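A small sketch of the pattern the task summary now uses (model name is illustrative): predicted class ids are looked up in `model.config.id2label` rather than a hand-maintained label list.

```python
# Hedged sketch: map the predicted class id to a label via model.config.id2label.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("Transformers keeps getting better", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_class_id])  # e.g. "POSITIVE"
```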
-
Sylvain Gugger authored
-
Bhadresh Savani authored
* added support for test-file
* fixed typo
* added suggested changes
* reformatted code
* modified files
* fix post processing error
* Trigger CI
* removed extra lines
-
Lysandre Debut authored
-
Suraj Patil authored
-
Matt authored
Big refactor, fixes and multi-GPU/TPU support
-
bonniehyeon authored
* Fix do_eval default value in training_args.py * Update PULL_REQUEST_TEMPLATE.md
-
Takuya Makino authored
-
Shubham Sanghavi authored
-
Nicolas Patry authored
* Adding `AutomaticSpeechRecognitionPipeline`.
  - Because we added everything needed to enable this pipeline, we probably should add it to `transformers`.
  - This PR tries to limit the scope and focuses only on the pipeline part (what should go in, and out).
  - The tests are very specific to S2T and Wav2vec2 to make sure both architectures are supported by the pipeline. We don't use the mixin for tests right now, because that requires more work in the `pipeline` function (will be done in a follow-up PR).
  - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of sense from a user perspective and does not add any hard dependency (users can always use their own loading mechanism), but it feels slightly clunky to have so much optional preprocessing.
  - The pipeline does not support streaming audio right now.
  Future work:
  - Add `automatic-speech-recognition` as a `task`, and add the FeatureExtractor.from_pretrained call within the `pipeline` function.
  - Add small models within tests.
  - Add the mixin to tests.
  - Make the logic between ForCTC vs ForConditionalGeneration better.
* Update tests/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Adding docs + main import + type checking + LICENSE.
* Doc style.
* Fixing TYPE_HINT.
* Specifying waveform shape in the docs.
* Adding asserts + specifying in the documentation the shape of the input np.ndarray.
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Adding require to tests + move the `feature_extractor` doc.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
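A usage sketch under stated assumptions (the checkpoint, the file name, and the keyword-argument construction are illustrative; the `automatic-speech-recognition` task alias is listed above as follow-up work, so the pipeline is built directly here):

```python
# Hedged sketch: build the pipeline from a CTC model plus its feature extractor
# and tokenizer, then pass an audio file path (decoded via ffmpeg) or a 1-D
# float waveform.
from transformers import (
    AutomaticSpeechRecognitionPipeline,
    Wav2Vec2ForCTC,
    Wav2Vec2Processor,
)

name = "facebook/wav2vec2-base-960h"
processor = Wav2Vec2Processor.from_pretrained(name)
model = Wav2Vec2ForCTC.from_pretrained(name)

asr = AutomaticSpeechRecognitionPipeline(
    model=model,
    feature_extractor=processor.feature_extractor,
    tokenizer=processor.tokenizer,
)
print(asr("sample.flac"))  # -> {"text": "..."}
```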
-
CeShine Lee authored
* Implement gradient checkpointing for T5Stack
* A bit more robust type checking
* Add `gradient_checkpointing` to T5Config
* Formatting
* Set requires_grad only when training
* None return value will only cause problems when training
* Change the output tuple according to `use_cache`
* Enable gradient checkpointing for the decoder

  Squashed commit of the following:

  commit 658bdd0bd1215353a8770f558bda2ea69a0ad0c7
  Author: Ceshine Lee <shuanck@gmail.com>
  Date: Sat Apr 24 14:08:17 2021 +0800
  Only set `require_grad` for gradient checkpointing

  commit acaeee6b2e675045fb28ce2176444c1d63e908bd
  Author: Ceshine Lee <shuanck@gmail.com>
  Date: Sat Apr 24 13:59:35 2021 +0800
  Make gradient checkpointing work with the decoder

* Formatting
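A minimal sketch of enabling the new flag, assuming it is set through `T5Config` as described above (the checkpoint name is illustrative):

```python
# Hedged sketch: turn on the new gradient_checkpointing config flag for T5;
# use_cache is disabled since cached decoding and checkpointing don't mix
# during training.
from transformers import T5Config, T5ForConditionalGeneration

config = T5Config.from_pretrained("t5-small", gradient_checkpointing=True, use_cache=False)
model = T5ForConditionalGeneration.from_pretrained("t5-small", config=config)
model.train()  # checkpointing only takes effect while training
```
-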
Manuel Romero authored
Add link to code
-
Patrick von Platen authored
-
Philip May authored
add test for pickle
simplify test
fix test
code style
add missing pickle import
fix test
fix test
fix test
-
Frederik Bode authored
Co-authored-by: Frederik Bode <frederik@paperbox.ai>
-
Lysandre Debut authored
-
Lysandre authored
-
- 29 Apr, 2021 4 commits
-
Sylvain Gugger authored
* Split checkpoint from model_name_or_path in examples * Address review comments * Address review comments
-
Michael Benayoun authored
Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Sylvain Gugger authored
* Reformat to make code clearer * Reformat to make code clearer
-
Patrick von Platen authored
* add attentions & hidden states
* add model outputs + docs
* finish docs
* finish tests
* finish impl
* del @
* finish
* finish
* correct test
* apply sylvains suggestions
* Update src/transformers/models/bert/modeling_flax_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* simplify more
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
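A hedged sketch of the behaviour added above, assuming the checkpoint provides Flax weights (the name is illustrative): the Flax BERT model can now return attentions and hidden states as a structured model output.

```python
# Hedged sketch: request attentions and hidden states from the Flax BERT model.
from transformers import BertTokenizerFast, FlaxBertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = FlaxBertModel.from_pretrained("bert-base-cased")

inputs = tokenizer("Flax outputs now mirror the PyTorch ones", return_tensors="np")
outputs = model(**inputs, output_attentions=True, output_hidden_states=True)
# hidden_states: embeddings + one entry per layer; attentions: one per layer.
print(len(outputs.hidden_states), len(outputs.attentions))
```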
-
- 28 Apr, 2021 3 commits
-
Hamel Husain authored
-
Hamel Husain authored
* Update tokenization_utils_base.py
* add assertion
* check batch len
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add error message
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Update min versions in README and add Flax * Adapt index
-
- 27 Apr, 2021 3 commits
-
Suraj Patil authored
* fix docs for decoder_input_ids * revert the changes for bart and mbart
-
Hamel Husain authored
* finish quicktour
* fix import
* fix print
* explain config default better
* Update docs/source/quicktour.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Hamel Husain authored
* update docs to reflect the model output object * run `make style`
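A hedged illustration of what the updated docs describe (checkpoint name is illustrative): forward passes return output objects whose fields can be read by name, with tuple-style indexing kept for backwards compatibility.

```python
# Hedged sketch: named attribute access vs. tuple-style indexing on a model output.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Model outputs are self-describing", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.keys())                                       # named fields
print(outputs.last_hidden_state.shape)                      # attribute access
print(torch.equal(outputs[0], outputs.last_hidden_state))   # indexing still works
```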
-
- 26 Apr, 2021 6 commits
-
Ashwin Geet D'Sa authored
* removed max_len
* removed max_length from BeamSearchScorer
* correct max length
* finish
* del vim
* finish & add test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Stas Bekman authored
* adding Z-inf
* revamp config process
* up version requirement
* wip
* massive rewrite
* cleanup
* cleanup
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* consistent json commas
* act on suggestions
* leave this feature for 0.3.16
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
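A hedged illustration of the kind of ZeRO config this integration consumes; the keys follow DeepSpeed's documented ZeRO stage-3 / ZeRO-Infinity options, and the file name, paths, and launcher line are placeholders rather than values taken from this change.

```python
# Illustrative only: a minimal ZeRO stage-3 config with NVMe offload (the
# ZeRO-Infinity feature), written out as JSON for the --deepspeed flag.
import json

ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "nvme", "nvme_path": "/local_nvme"},
        "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
    },
}

with open("ds_config_zero3.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# Then reference the file from the launcher, e.g.:
#   deepspeed your_training_script.py --deepspeed ds_config_zero3.json ...
```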
-
Jaimeen Ahn authored
The error comes from an inconsistency between the parser argument holding the number of GPUs ('gpus') and the name actually used in the train.py script ('n_gpu'); aligning the two makes the example work.
-
Bhadresh Savani authored
* added changes for uniformity
* modified files
* corrected typo
* fixed qa scripts
* fix typos
* fixed predict typo in qa no trainer
* fixed test file
* reverted trainer changes
* reverted trainer changes in custom examples
* updated readme
* added changes in deepspeed test
* added changes for predict and eval
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-