Commits · c1780ce7a487162f44d74cb705b46ff42e7dfe0c · chenpangpang / transformers


06 May, 2021 1 commit
fix head_mask for albert encoder part (`AlbertTransformer`) (#11596) · c1780ce7
baeseongsu authored May 06, 2021
```
* fix head mask for albert encoder part

* fix head_mask for albert encoder part
```
05 May, 2021 5 commits

Accept tensorflow-rocm package when checking TF availability (#11595) · 864c1dfe
Mats Sjöberg authored May 05, 2021


Pytorch - Lazy initialization of models (#11471) · 3e3e41ae

Patrick von Platen authored May 05, 2021



* lazy_init_weights

* remove ipdb

* save int

* add necessary code

* remove unnecessary utils

* Update src/transformers/models/t5/modeling_t5.py

* clean

* add tests

* correct

* finish tests

* finish tests

* fix some more tests

* fix xlnet & transfo-xl

* fix more tests

* make sure tests are independent

* fix tests more

* finish tests

* final touches

* Update src/transformers/modeling_utils.py

* Apply suggestions from code review

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* clean tests

* give arg positive name

* add more mock weights to xlnet
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>


Skip Funnel test · 8fa8e194
Lysandre authored May 05, 2021


add importlib_metadata and huggingface_hub as dependency in the conda recipe (#11591) · 83e59d8e

Deepali authored May 05, 2021



* add importlib_metadata as dependency (#11490)
Co-authored-by: Deepali Chourasia <deepch23@us.ibm.com>

* add huggingface_hub dependency
Co-authored-by: Deepali Chourasia <deepch23@us.ibm.com>


copies need to be fixed too (#11585) · bf0dfa98
Stas Bekman authored May 05, 2021


04 May, 2021 8 commits

[trainer] document resume randomness (#11588) · c065025c
Stas Bekman authored May 04, 2021
```
* document resume randomness

* fix link

* reword

* fix

* reword

* style
```

Reproducible checkpoint (#11582) · 6b241e0e

Sylvain Gugger authored May 04, 2021

* Set generator in dataloader

* Use generator in all random samplers

* Checkpoint all RNG states

* Final version

* Quality

* Test

* Address review comments

* Quality

* Remove debug util

* Add python and numpy RNGs

* Split states in different files in distributed

* Quality

* local_rank for TPUs

* Only use generator when accepted

* Add test

* Set seed to avoid flakiness

* Make test less flaky

* Quality
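
As a rough illustration of the "Checkpoint all RNG states" step listed above, the sketch below saves and restores only the state of Python's built-in `random` module. The function names are hypothetical and this is not the Trainer's code; the commit also covers NumPy and other generators, which this sketch omits.

```python
import pickle
import random


def save_rng_state() -> bytes:
    """Serialize the state of Python's global random generator (hypothetical helper)."""
    return pickle.dumps({"python": random.getstate()})


def load_rng_state(blob: bytes) -> None:
    """Restore the generator state captured by save_rng_state()."""
    random.setstate(pickle.loads(blob)["python"])


random.seed(42)
_ = [random.random() for _ in range(3)]          # consume some randomness, as training would
checkpoint = save_rng_state()                     # checkpoint in the middle of the run
expected = [random.random() for _ in range(3)]   # what an uninterrupted run draws next

load_rng_state(checkpoint)                        # resume from the checkpoint
resumed = [random.random() for _ in range(3)]    # draws after resuming
print(resumed == expected)
```

Restoring the saved state makes the resumed run draw the same values as the uninterrupted one, which is the property a reproducible checkpoint needs.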


[Flax] Add Electra models (#11426) · 0afe4a90

Patrick Fernandes authored May 04, 2021



* add electra model to flax

* Remove Electra Next Sentence Prediction model added by mistake

* fix parameter sharing and loosen equality threshold

* fix styling issues

* add mistakenly removed imports

* fix electra table

* Add FlaxElectra to automodels and fix docs

* fix issues pointed out in the PR

* fix flax electra to comply with latest changes

* remove stale class

* add copied from
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Removes SageMakerTrainer code but keeps class as wrapper (#11587) · 226e74b6
Philipp Schmid authored May 04, 2021
```
* removed all old code

* make quality
```

[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py (#11470) · 084a187d

Patrick von Platen authored May 04, 2021



* add flax roberta

* make style

* correct initialization

* modify model to save weights

* fix copied from

* fix copied from

* correct some more code

* add more roberta models

* Apply suggestions from code review

* merge from master

* finish

* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>


Make quality scripts work when one backend is missing. (#11573) · 2ce0fb84

Sylvain Gugger authored May 04, 2021

* Make quality scripts work when one backend is missing.

* Check env variable is properly set

* Add default

* With print statements

* Fix typo

* Set env variable

* Remove debug code


Enable added tokens (#11325) · 09b0bcfe

Lysandre Debut authored May 04, 2021

* Fix tests

* Reorganize

* Update tests/test_modeling_mobilebert.py

* Remove unnecessary addition


Add multi-class, multi-label and regression to transformers (#11012) · c40c7e21

abhishek thakur authored May 04, 2021



* add to bert

* review comments

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* self.config.problem_type

* fix style

* fix

* fin

* fix

* update doc

* fix

* test

* Test more problem types

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix

* remove

* fix

* quality

* make fix-copies

* remove test
Co-authored-by: abhishek thakur <abhishekkrthakur@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
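
The commit above introduces `self.config.problem_type` to distinguish regression, single-label and multi-label classification. The sketch below shows one plausible way such a setting could be resolved and mapped to a loss. The helper name, the fallback rules and the loss descriptions are assumptions for illustration and are not taken from the commit itself.

```python
from typing import Optional

# Assumed mapping from problem type to the kind of loss typically used for it.
LOSS_BY_PROBLEM_TYPE = {
    "regression": "mean squared error",
    "single_label_classification": "cross-entropy",
    "multi_label_classification": "binary cross-entropy with logits",
}


def infer_problem_type(
    num_labels: int,
    labels_are_integer: bool,
    problem_type: Optional[str] = None,
) -> str:
    """Hypothetical helper: use an explicit problem_type, else guess from the labels."""
    if problem_type is not None:
        return problem_type                    # an explicit setting always wins
    if num_labels == 1:
        return "regression"                    # a single output is treated as regression
    if labels_are_integer:
        return "single_label_classification"   # one class index per example
    return "multi_label_classification"        # float multi-hot targets


for args in [(1, False), (3, True), (3, False)]:
    kind = infer_problem_type(*args)
    print(args, kind, LOSS_BY_PROBLEM_TYPE[kind])
```

Keeping the explicit setting ahead of the heuristics lets a user override the guess, for example to force multi-label behaviour when targets happen to be stored as integers.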


03 May, 2021 12 commits

fix resize_token_embeddings (#11572) · 7c622482
Stas Bekman authored May 03, 2021


Update training tutorial (#11533) · fe82b1bf

Sylvain Gugger authored May 03, 2021



* Update training tutorial

* Apply suggestions from code review
Co-authored-by: Hamel Husain <hamelsmu@github.com>

* Address review comments

* Update docs/source/training.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* More review comments

* Last review comments
Co-authored-by: Hamel Husain <hamelsmu@github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


Accumulate opt state dict on do_rank 0 (#11481) · f4c9a7e6
Sylvain Gugger authored May 03, 2021

Fixes a useless warning. (#11566) · 1e8e0686
Nicolas Patry authored May 03, 2021
```
Fixes #11525
```
Fix metric computation in `run_glue_no_trainer` (#11569) · 87dd1a00
Sylvain Gugger authored May 03, 2021


[Wav2vec2] Fixed tokenization mistakes while adding single-char tokens to tokenizer (#11538) · a721a5ee

Muktan authored May 03, 2021



* Fixed tokenization mistakes while adding single-char tokens to tokenizer

* Added tests and Removed unnecessary comments.

* finalize wav2vec2 tok

* add more aggressive tests

* Apply suggestions from code review

* fix useless import
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Add LUKE (#11223) · f3cf8ae7

NielsRogge authored May 03, 2021



* Rebase with master

* Minor bug fix in docs

* Copy files from adding_luke_v2 and improve docs

* change the default value of use_entity_aware_attention to True

* remove word_hidden_states

* fix head models

* fix tests

* fix the conversion script

* add integration tests for the pretrained large model

* improve docstring

* Improve docs, make style

* fix _init_weights for pytorch 1.8

* improve docs

* fix tokenizer to construct entity sequence with [MASK] entity when entities=None

* Make fix-copies

* Make style & quality

* Bug fixes

* Add LukeTokenizer to init

* Address most comments by @patil-suraj and @LysandreJik

* rename _compute_extended_attention_mask to get_extended_attention_mask

* add comments to LukeSelfAttention

* fix the documentation of the tokenizer

* address comments by @patil-suraj, @LysandreJik, and @sgugger

* improve docs

* Make style, quality and fix-copies

* Improve docs

* fix docs

* add "entity_span_classification" task

* update example code for LukeForEntitySpanClassification

* improve docs

* improve docs

* improve the code example in luke.rst

* rename the classification layer in LukeForEntityClassification from typing to classifier

* add bias to the classifier in LukeForEntitySpanClassification

* update docs to use fine-tuned hub models in code examples of the head models

* update the example sentences

* Make style & quality

* Add require_torch to tokenizer tests

* Add require_torch to tokenizer tests

* Address comments by @sgugger and add community notebooks

* Make fix-copies
Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>


fix the mlm longformer example by changing [MASK] to <mask> (#11559) · 6a11e4c2
Frederik Bode authored May 03, 2021

Remove `datasets` submodule. (#11563) · 1c86157d
Lysandre Debut authored May 03, 2021

[Wav2Vec2] Fix convert (#11562) · c448c01f
Patrick von Platen authored May 03, 2021
```
* push

* small change

* correct other typo
```
[Flax BERT/Roberta] few small fixes (#11558) · 623281aa
Suraj Patil authored May 03, 2021
```
* small fixes

* style
```
Fix examples in M2M100 docstrings (#11540) · a5d2967b
lewtun authored May 03, 2021
```
Replaces `tok` with `tokenizer` so examples can run with copy-paste
```

02 May, 2021 1 commit

Fixed docs for the shape of `scores` in `generate()` (#10057) · 98020865

jingyihe authored May 02, 2021

* Fixed the doc for the shape of return scores tuples in generation_utils.py.

* Fix the output shape of `scores` for `DecoderOnlyOutput`.

* style fix


30 Apr, 2021 13 commits

[DeepSpeed] fp32 support (#11499) · 4e7bf94e

Stas Bekman authored Apr 30, 2021

* prep for deepspeed==0.3.16

* new version

* too soon

* support and test fp32 mode

* troubleshooting doc start

* workaround no longer needed

* add fp32 doc

* style

* cleanup, add tf32 note

* clarify

* release was made


[debug utils] activation/weights underflow/overflow detector (#11274) · 282f3ac3

Stas Bekman authored Apr 30, 2021



* sync

* add activation overflow debug utility

* cleanup

* document detect_overflow

* import torch

* add deprecation warning

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* convert to rst, add note

* add class

* fix docs

* improve the doc

* rework to dump a lot more info about each frame

* complete expansion

* cleanup

* format

* cleanup

* doesn't have to be transformers

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* wrap long line

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


Improve task summary docs (#11513) · 804c2974

Hamel Husain authored Apr 30, 2021



* fix task summary docs

* refactor to use model.config.id2label instead of list

* fix nit

* Update docs/source/task_summary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


Add Stas and Suraj as authors (#11526) · bc80f8bc
Sylvain Gugger authored Apr 30, 2021


[Examples] Added support for test-file in QA examples with no trainer (#11510) · 84326a28

Bhadresh Savani authored Apr 30, 2021

* added support for test-file

* fixed typo

* added suggested changes

* reformatted code

* modified files

* fix post processing error

* Trigger CI

* removed extra lines


Run model templates on master (#11527) · af0692a2
Lysandre Debut authored Apr 30, 2021

resize token embeds (#11524) · 57c8e822
Suraj Patil authored Apr 30, 2021

Update TF text classification example (#11496) · 20d6931e
Matt authored Apr 30, 2021
```
Big refactor, fixes and multi-GPU/TPU support
```
Fix do_eval default value in training_args.py (#11511) · 8b945ef0
bonniehyeon authored Apr 30, 2021
```
* Fix do_eval default value in training_args.py

* Update PULL_REQUEST_TEMPLATE.md
```
Accepts BatchEncoding in LengthSampler (#11431) · c2cd02ac
Takuya Makino authored Apr 30, 2021

Implement Fast Tokenization for Deberta (#11387) · 30ede899
Shubham Sanghavi authored Apr 30, 2021


Adding `AutomaticSpeechRecognitionPipeline`. (#11337) · db9dd09c

Nicolas Patry authored Apr 30, 2021



* Adding `AutomaticSpeechRecognitionPipeline`.

- Because we added everything to enable this pipeline, we probably
should add it to `transformers`.
- This PR tries to limit the scope and focuses only on the pipeline part
(what should go in, and out).
- The tests are very specific for S2T and Wav2vec2 to make sure both
architectures are supported by the pipeline. We don't use the mixin for
tests right now, because that requires more work in the `pipeline`
function (will be done in a follow up PR).
- Unsure about the "helper" function `ffmpeg_read`. It makes a lot of
sense from a user perspective, it does not add any additional
dependencies (as in hard dependency, because users can always use their
own load mechanism). Meanwhile, it feels slightly clunky to have so much
optional preprocessing.
- The pipeline is not done to support streaming audio right now.

Future work:

- Add `automatic-speech-recognition` as a `task`. And add the
FeatureExtractor.from_pretrained within `pipeline` function.
- Add small models within tests
- Add the Mixin to tests.
- Make the logic between ForCTC vs ForConditionalGeneration better.

* Update tests/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Adding docs + main import + type checking + LICENSE.

* Doc style !.

* Fixing TYPE_HINT.

* Specifying waveform shape in the docs.

* Adding asserts + specify in the documentation the shape of the input
np.ndarray.

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Adding require to tests + move the `feature_extractor` doc.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


T5 Gradient Checkpointing (#11353) · 76116f47

CeShine Lee authored Apr 30, 2021

* Implement gradient checkpointing for T5Stack

* A bit more robust type checking

* Add `gradient_checkpointing` to T5Config

* Formatting

* Set requires_grad only when training

* None return value will only cause problems when training

* Change the output tuple according to `use_cache`

* Enable gradient checkpointing for the decoder

Squashed commit of the following:

commit 658bdd0bd1215353a8770f558bda2ea69a0ad0c7
Author: Ceshine Lee <shuanck@gmail.com>
Date:   Sat Apr 24 14:08:17 2021 +0800

    Only set `require_grad` for gradient checkpointing

commit acaeee6b2e675045fb28ce2176444c1d63e908bd
Author: Ceshine Lee <shuanck@gmail.com>
Date:   Sat Apr 24 13:59:35 2021 +0800

    Make gradient checkpointing work with the decoder

* Formatting
