Commits · fa322474060beb3673cf5a3e39ccd3c8ad57ecd3 · chenpangpang / transformers

26 Apr, 2022 2 commits
- apply torch int div to layoutlmv2 (#15457) · fa322474
  Manuel authored Apr 26, 2022
```
* apply torch int div

* black linting fixup

* update path to torch_int_div

* clarify imports
```
- Limit the use of PreTrainedModel.device (#16935) · 344b9fb0
  Sylvain Gugger authored Apr 25, 2022
```
* Limit the use of PreTrainedModel.device

* Fix
```
25 Apr, 2022 11 commits

Fix issue probably-meant-fstring found at https://codereview.doctor (#16913) · 65687520
code-review-doctor authored Apr 25, 2022

Replace deprecated logger.warn with warning (#16876) · fea94d67
Sanchit Gandhi authored Apr 25, 2022

TF: XLA stable softmax (#16892) · e03966e4
Joao Gante authored Apr 25, 2022
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
added deit onnx config (#16887) · 8246caf3
Rushi Chaudhari authored Apr 25, 2022
```
* added deit onnx config
```
TF: XLA Logits Warpers (#16899) · 9331b379
Joao Gante authored Apr 25, 2022
```
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
```
TF: XLA logits processors - minimum length, forced eos, and forced bos (#16912) · 809dac48
Joao Gante authored Apr 25, 2022
```
* XLA min len, forced eos, and forced bos
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
```
Fix RemBertTokenizerFast (#16933) · f6210c49
Yih-Dar authored Apr 25, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

Fix PyTorch RAG tests GPU OOM (#16881) · 32adbb26
Yih-Dar authored Apr 25, 2022

* add torch.cuda.empty_cache in some PT RAG tests

* torch.cuda.empty_cache in tearDownModule()

* tearDown()

* add gc.collect()
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Add missing ckpt in config docs (#16900) · 3e47d19c
Yih-Dar authored Apr 25, 2022

* add missing ckpt in config docs

* add more missing ckpt in config docs

* fix wrong ckpts

* fix realm ckpt

* fix s2t2

* fix xlm_roberta ckpt

* Fix for deberta v2

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* use only one checkpoint for DPR

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>


Fix doc test quicktour dataset (#16929) · 3a71e94a
Patrick von Platen authored Apr 25, 2022
```
* fix doc test

* fix doc test
Co-authored-by: Patrick <patrick@pop-os.localdomain>
```
add bigbird typo fixes (#16897) · 508baf19
Thomas Chaigneau authored Apr 25, 2022
```
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
```

23 Apr, 2022 1 commit
- [DocTests] Fix some doc tests (#16889) · 72728be3
  Patrick von Platen authored Apr 23, 2022
```
* [DocTests] Fix some doc tests

* hacky fix

* correct
```
22 Apr, 2022 7 commits

Changes in create_optimizer to support tensor parallelism with SMP (#16880) · 22fc93c4
cavdard authored Apr 22, 2022

* changes in create optimizer to support tensor parallelism with SMP

* Update src/transformers/trainer.py

Convert if check to one line.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


TF: XLA repetition penalty (#16879) · 99c8226b
Joao Gante authored Apr 22, 2022

Add OnnxConfig for ConvBERT (#16859) · ec81c11a
Thomas Chaigneau authored Apr 22, 2022
```
* add OnnxConfig for ConvBert
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
```

Add doc tests for Albert and Bigbird (#16774) · 0d1cff11
Minh Chien Vu authored Apr 23, 2022

* Add doctest BERT

* make fixup

* fix typo

* change checkpoints

* make fixup

* define doctest output value, update doctest for mobilebert

* solve fix-copies

* update QA target start index and end index

* change checkpoint for docs and reuse defined variable

* Update src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* make fixup

* Add Doctest for Albert and Bigbird

* make fixup

* overwrite examples for Albert and Bigbird

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update longer examples for Bigbird

* using examples from squad_v2

* print out example text

* change name token-classification-big-bird checkpoint to random
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Minor fixes/improvements in `convert_file_size_to_int` (#16891) · 9fa88172
Mario Šaško authored Apr 22, 2022
```
* Minor improvements to `convert_file_size_to_int`

* Add <unit>bit version to kilos and megas

* Minor fix
```
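To make the bit-unit bullet above concrete, here is a minimal sketch of a size-string converter. It is not the library's `convert_file_size_to_int`: the name `size_to_bytes`, the unit table, and the convention that a lowercase trailing `b` denotes bits (divided by 8 to get bytes) are all assumptions made for this illustration.

```python
import re

# Assumed unit table: decimal (K/M/G) and binary (Ki/Mi/Gi) multipliers.
_MULTIPLIERS = {
    "K": 10**3, "M": 10**6, "G": 10**9,
    "KI": 2**10, "MI": 2**20, "GI": 2**30,
}

# <digits><prefix><B or b>, e.g. "5MB", "5Mb", "1GiB".
_SIZE_RE = re.compile(r"^(\d+)\s*([KMG]I?)([Bb])$", re.IGNORECASE)


def size_to_bytes(size):
    """Hypothetical helper: convert an int or a size string to bytes.

    Assumption: an uppercase trailing "B" means bytes and a lowercase
    trailing "b" means bits, so bit sizes are integer-divided by 8.
    """
    if isinstance(size, int):
        return size
    match = _SIZE_RE.match(size.strip())
    if match is None:
        raise ValueError(f"unrecognized size string: {size!r}")
    number, prefix, unit = match.groups()
    total = int(number) * _MULTIPLIERS[prefix.upper()]
    return total // 8 if unit == "b" else total


print(size_to_bytes("5MB"))  # 5000000
print(size_to_bytes("5Mb"))  # 625000
```

Keeping the trailing letter case-sensitive while treating the prefix case-insensitively is one possible design; a stricter parser could reject mixed-case prefixes instead.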
TF: rework XLA generate tests (#16866) · 6d90d76f
Joao Gante authored Apr 22, 2022


Add missing entries in mappings (#16857) · 3b1bbefc
Yih-Dar authored Apr 22, 2022

* add missing entries in some mappings
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


21 Apr, 2022 9 commits

New features for CodeParrot training script (#16851) · d9184131
Loubna Ben Allal authored Apr 21, 2022

* add tflops logging and fix grad accumulation

* add accelerate tracking and checkpointing

* scale loss of last batch correctly

* fix typo

* compress loss computation
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add resume from checkpoint argument

* add load_state accelerate from checkpoint, register lr scheduler and add tflops function

* reformat code

* reformat code

* add condition on path for resume checkpoint

* combine if conditions
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add source for tflops formula
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>


Fix doctest list (#16878) · eef2422e
Yih-Dar authored Apr 21, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

Fix GPT-J onnx conversion (#16780) · 0b1e0fcf
Thomas Chaigneau authored Apr 21, 2022

* add gptj to TOKENIZER_MAPPING_NAMES

* fix int32 to float to avoid problem in onnx

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>


Use ACT2FN to fetch ReLU activation (#16874) · bae9b645

Eldar Kurtic authored Apr 21, 2022

- all activations should be fetched through ACT2FN
- it returns ReLU as `nn.Module`, which allows attaching hooks on the activation function and prints it to stdout when `print(model)`
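
The rationale in these two bullets can be illustrated with plain PyTorch. The sketch below is not the code from the PR: `TinyBlock` and the local `ACTIVATIONS` dict are hypothetical stand-ins for a model and for `ACT2FN`, assuming the real mapping likewise yields `nn.Module` instances. It shows that an activation stored as a module accepts a forward hook and shows up in the printed model.

```python
import torch
from torch import nn

# Hypothetical stand-in for the ACT2FN mapping (assumption: the real mapping
# also returns nn.Module instances for names such as "relu").
ACTIVATIONS = {"relu": nn.ReLU()}


class TinyBlock(nn.Module):
    """Hypothetical block used only for this illustration."""

    def __init__(self, hidden_act: str = "relu"):
        super().__init__()
        self.dense = nn.Linear(4, 4)
        # Assigning an nn.Module attribute registers it as a submodule.
        self.act = ACTIVATIONS[hidden_act]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.dense(x))


model = TinyBlock()
captured = []

# Forward hooks are an nn.Module feature, so they need a module-based activation.
handle = model.act.register_forward_hook(
    lambda module, inputs, output: captured.append(output.detach())
)

out = model(torch.randn(2, 4))
handle.remove()

# The activation is listed among the submodules in the printed representation.
print(model)
```

Had `forward` called a functional ReLU directly, there would be no submodule to hook and nothing for `print(model)` to display for the activation.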


Return input_ids in ImageGPT feature extractor (#16872) · cb555af2
Sylvain Gugger authored Apr 21, 2022


Adding support for `array` key in raw dictionaries in ASR pipeline. (#16827) · e789418e
Nicolas Patry authored Apr 21, 2022

* Adding support for `array` key in raw dictionaries in ASR pipeline.

* ES .

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Making it work by not popping `array` first.

* Black 22.3
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


tiny tweak to allow BatchEncoding.token_to_char when token doesn't correspond to chars (#15901) · daf520b0
ghlai9665 authored Apr 21, 2022

* tweak to allow BatchEncoding.char_to_token(0)

* update docstring

* remove trailing whitespace

* make fixup

* make value checking for span_indices explicit
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


t5: add conversion script for T5X to FLAX (#16853) · cb7e1664

Stefan Schweter authored Apr 21, 2022

* t5: add conversion script for T5X to FLAX

* t5: make flake happy

* t5: add copyright message to t5x conversion script

* t5: fix lm head for v1.0 checkpoints


Long QuestionAnsweringPipeline fix. (#16778) · 6620f60c

Nicolas Patry authored Apr 21, 2022

* Temporary commit with the long QA fix.

* Adding slow tests covering this fix.

* Removing fast test as it doesn't fail anyway.


20 Apr, 2022 6 commits
- Fix multiproc metrics in no_trainer examples (#16865) · 705d6536
  Zachary Mueller authored Apr 20, 2022
  
- Fix custom init sorting script (#16864) · 175da8d1
  Sylvain Gugger authored Apr 20, 2022
  
- [docs] fix url (#16860) · 67ed0e43
  Stas Bekman authored Apr 20, 2022
  
- [modeling_utils] use less cpu memory with sharded checkpoint loading (#16844) · afa1ef09
  Stas Bekman authored Apr 20, 2022
```
* less cpu memory with sharded checkpoint loading

* Trigger CI

* Trigger CI
```
- Fixing return type tensor with `num_return_sequences>1`. (#16828) · e13a91fe
  Nicolas Patry authored Apr 20, 2022
```
* Fixing return type tensor with `num_return_sequences>1`.

* Nit.
```
- add DebertaV2 fast tokenizer (#15529) · ff06b177
  Yang Ming authored Apr 20, 2022
```
Co-authored-by: alcinos <carion.nicolas@gmail.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
19 Apr, 2022 4 commits

[Typo] Fix typo in modeling utils (#16840) · e1c153cb
Patrick von Platen authored Apr 19, 2022


Add support for bitsandbytes (#15622) · 3104036e
Manuel R. Ciosici authored Apr 19, 2022

* Add initial BNB integration

* fixup! Add initial BNB integration

* Add bnb test decorator

* Update Adamw8bit option name

* Use the full bnb package name

* Override bnb for all embedding layers

* Fix package name

* Formatting

* Remove unnecessary import

* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Rename AdamwBNB optimizer option

* Add training test checking that bnb memory utilization is lower

* fix merge

* fix merge; fix + extend new test

* cleanup

* expand bnb

* move all require_* candidates to testing_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>


Improve test_pt_tf_model_equivalence on PT side (#16731) · e6d23a4b
Yih-Dar authored Apr 19, 2022

* Update test_pt_tf_model_equivalence on PT side
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


Type hints added to Speech to Text (#16506) · 3dd57b15
Dahlbomii authored Apr 19, 2022

* Type hints added

* return hints added

* Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
