Commits · 22fc93c4d9608fa9cd171b4f3044f8c756f86773 · chenpangpang / transformers

22 Apr, 2022 7 commits

Changes in create_optimizer to support tensor parallelism with SMP (#16880) · 22fc93c4

cavdard authored Apr 22, 2022



* changes in create optimizer to support tensor parallelism with SMP

* Update src/transformers/trainer.py

Convert if check to one line.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

22fc93c4

TF: XLA repetition penalty (#16879) · 99c8226b
Joao Gante authored Apr 22, 2022

99c8226b
Add OnnxConfig for ConvBERT (#16859) · ec81c11a
Thomas Chaigneau authored Apr 22, 2022
```
* add OnnxConfig for ConvBert
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
```
ec81c11a

Add doc tests for Albert and Bigbird (#16774) · 0d1cff11

Minh Chien Vu authored Apr 23, 2022



* Add doctest BERT

* make fixup

* fix typo

* change checkpoints

* make fixup

* define doctest output value, update doctest for mobilebert

* solve fix-copies

* update QA target start index and end index

* change checkpoint for docs and reuse defined variable

* Update src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* make fixup

* Add Doctest for Albert and Bigbird

* make fixup

* overwrite examples for Albert and Bigbird

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update longer examples for Bigbird

* using examples from squad_v2

* print out example text

* change name token-classification-big-bird checkpoint to random
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

0d1cff11

Minor fixes/improvements in `convert_file_size_to_int` (#16891) · 9fa88172
Mario Šaško authored Apr 22, 2022
```
* Minor improvements to `convert_file_size_to_int`

* Add <unit>bit version to kilos and megas

* Minor fix
```
9fa88172
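
For readers unfamiliar with `convert_file_size_to_int`, it turns a human-readable size string into a byte count. The sketch below is not the library's implementation. `parse_size` and its unit table are hypothetical names, and it assumes that `KiB`/`MiB`/`GiB` are powers of 1024, that `KB`/`MB`/`GB` are powers of 1000, and that a trailing lowercase `b` (the "<unit>bit version") means bits, so the value is divided by 8.

```python
from typing import Union

# Hypothetical unit table: binary units use powers of 1024, decimal units powers of 1000.
_UNITS = {
    "GIB": 2**30, "MIB": 2**20, "KIB": 2**10,
    "GB": 10**9, "MB": 10**6, "KB": 10**3,
}


def parse_size(size: Union[int, str]) -> int:
    """Convert a size such as "5MB" or "1Gb" to a number of bytes (illustrative only)."""
    if isinstance(size, int):
        return size
    upper = size.upper()
    # Binary suffixes are listed first, so they are checked before the decimal ones.
    for suffix, factor in _UNITS.items():
        if upper.endswith(suffix):
            value = int(size[: -len(suffix)]) * factor
            # Assumption: a trailing lowercase "b" denotes bits, so divide by 8.
            return value // 8 if size.endswith("b") else value
    raise ValueError(f"unrecognized size: {size!r}")


print(parse_size("5MB"), parse_size("1Gb"), parse_size("1KiB"))
```

Under these assumptions, `"5MB"` maps to 5,000,000 bytes, `"1Gb"` to 125,000,000 bytes, and `"1KiB"` to 1,024 bytes.
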
TF: rework XLA generate tests (#16866) · 6d90d76f
Joao Gante authored Apr 22, 2022

6d90d76f

Add missing entries in mappings (#16857) · 3b1bbefc

Yih-Dar authored Apr 22, 2022



* add missing entries in some mappings
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

3b1bbefc

21 Apr, 2022 9 commits

New features for CodeParrot training script (#16851) · d9184131

Loubna Ben Allal authored Apr 21, 2022



* add tflops logging and fix grad accumulation

* add accelerate tracking and checkpointing

* scale loss of last batch correctly

* fix typo

* compress loss computation
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add resume from checkpoint argument

* add load_state accelerate from checkpoint, register lr scheduler and add tflops function

* reformat code

* reformat code

* add condition on path for resume checkpoint

* combine if conditions
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add source for tflops formula
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
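
One item above reads "scale loss of last batch correctly". A plausible reading, assumed here rather than taken from the commit's diff, is that under gradient accumulation the final group of micro-batches can be shorter than the configured number of accumulation steps, so dividing every loss by the full step count would under-weight the last update. The sketch below uses a hypothetical helper, `accumulation_divisors`, and plain Python integers instead of tensors.

```python
from typing import List


def accumulation_divisors(num_batches: int, accumulation_steps: int) -> List[int]:
    """Return, for each micro-batch, the number its loss should be divided by.

    Hypothetical helper: full groups divide by `accumulation_steps`, while a
    shorter trailing group divides by its actual size.
    """
    remainder = num_batches % accumulation_steps
    first_tail_index = num_batches - remainder  # where the short final group starts
    divisors = []
    for step in range(num_batches):
        in_tail = remainder != 0 and step >= first_tail_index
        divisors.append(remainder if in_tail else accumulation_steps)
    return divisors


# 10 micro-batches with 4 accumulation steps: groups of 4, 4 and 2.
divisors = accumulation_divisors(10, 4)
print(divisors)
```

With these divisors the weights `1 / divisor` sum to 1 inside every group, including the short last one.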

d9184131

Fix doctest list (#16878) · eef2422e
Yih-Dar authored Apr 21, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
eef2422e

Fix GPT-J onnx conversion (#16780) · 0b1e0fcf

Thomas Chaigneau authored Apr 21, 2022



* add gptj to TOKENIZER_MAPPING_NAMES

* fix int32 to float to avoid problem in onnx

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

0b1e0fcf

Use ACT2FN to fetch ReLU activation (#16874) · bae9b645

Eldar Kurtic authored Apr 21, 2022

- all activations should be fetched through ACT2FN
- it returns ReLU as an `nn.Module`, which allows attaching hooks to the activation function and makes it appear in the output of `print(model)`

bae9b645

Return input_ids in ImageGPT feature extractor (#16872) · cb555af2
Sylvain Gugger authored Apr 21, 2022

cb555af2

Adding support for `array` key in raw dictionaries in ASR pipeline. (#16827) · e789418e

Nicolas Patry authored Apr 21, 2022



* Adding support for `array` key in raw dictionaries in ASR pipeline.

* ES .

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Making it work by not popping `array` first.

* Black 22.3
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

e789418e

tiny tweak to allow BatchEncoding.token_to_char when token doesn't correspond to chars (#15901) · daf520b0

ghlai9665 authored Apr 21, 2022



* tweak to allow BatchEncoding.char_to_token(0)

* update docstring

* remove trailing whitespace

* make fixup

* make value checking for span_indices explicit
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

daf520b0

t5: add conversion script for T5X to FLAX (#16853) · cb7e1664

Stefan Schweter authored Apr 21, 2022

* t5: add conversion script for T5X to FLAX

* t5: make flake happy

* t5: add copyright message to t5x conversion script

* t5: fix lm head for v1.0 checkpoints

cb7e1664

Long QuestionAnsweringPipeline fix. (#16778) · 6620f60c

Nicolas Patry authored Apr 21, 2022

* Temporary commit with the long QA fix.

* Adding slow tests covering this fix.

* Removing fast test as it doesn't fail anyway.

6620f60c

20 Apr, 2022 6 commits

Fix multiproc metrics in no_trainer examples (#16865) · 705d6536
Zachary Mueller authored Apr 20, 2022

705d6536

Fix custom init sorting script (#16864) · 175da8d1
Sylvain Gugger authored Apr 20, 2022

175da8d1

[docs] fix url (#16860) · 67ed0e43
Stas Bekman authored Apr 20, 2022

67ed0e43

[modeling_utils] use less cpu memory with sharded checkpoint loading (#16844) · afa1ef09
Stas Bekman authored Apr 20, 2022
```
* less cpu memory with sharded checkpoint loading

* Trigger CI

* Trigger CI
```
afa1ef09

Fixing return type tensor with `num_return_sequences>1`. (#16828) · e13a91fe
Nicolas Patry authored Apr 20, 2022
```
* Fixing return type tensor with `num_return_sequences>1`.

* Nit.
```
e13a91fe

add DebertaV2 fast tokenizer (#15529) · ff06b177
Yang Ming authored Apr 20, 2022
```
Co-authored-by: alcinos <carion.nicolas@gmail.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
ff06b177

19 Apr, 2022 18 commits

[Typo] Fix typo in modeling utils (#16840) · e1c153cb
Patrick von Platen authored Apr 19, 2022

e1c153cb

Add support for bitsandbytes (#15622) · 3104036e

Manuel R. Ciosici authored Apr 19, 2022



* Add initial BNB integration

* fixup! Add initial BNB integration

* Add bnb test decorator

* Update Adamw8bit option name

* Use the full bnb package name

* Override bnb for all embedding layers

* Fix package name

* Formatting

* Remove unnecessary import

* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Rename AdamwBNB optimizer option

* Add training test checking that bnb memory utilization is lower

* fix merge

* fix merge; fix + extend new test

* cleanup

* expand bnb

* move all require_* candidates to testing_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

3104036e

Improve test_pt_tf_model_equivalence on PT side (#16731) · e6d23a4b

Yih-Dar authored Apr 19, 2022



* Update test_pt_tf_model_equivalence on PT side
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

e6d23a4b

Type hints added to Speech to Text (#16506) · 3dd57b15

Dahlbomii authored Apr 19, 2022



* Type hints added

* return hints added

* Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

3dd57b15

replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in some docstrings (#16835) · 1efca4e6
SaulLu authored Apr 19, 2022
```
* replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in docstring

* quality
```
1efca4e6

Correct Logging of Eval metric to Tensorboard (#16825) · b5c6a63e

Jeevesh Juneja authored Apr 19, 2022

* Correct Logging of Eval metric to Tensorboard

An empty dictionary ``eval_metrics`` was being logged; it is replaced by ``eval_metric``, the output dictionary of ``metric.compute()``.

* Remove unused variable

b5c6a63e

TF: Add sigmoid activation function (#16819) · f09c45e0
Joao Gante authored Apr 19, 2022

f09c45e0

Add doc about `attention_mask` on gpt2 (#16829) · 74814574

wiio12 authored Apr 19, 2022

* Add doc about `attention_mask` on gpt2

Add a simple sentence describing how `attention_mask` needs to be constructed when `past_key_values` is used.

* Add doc about attention_mask on gpt2_tf

* clean up style

* remove empty line white spaces

* remove whitespace in empty line
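
For context, when cached keys and values are passed back in, the attention mask has to cover the cached positions as well as the new tokens, so its length is the cached length plus the number of new input tokens. The sketch below illustrates that bookkeeping with plain Python lists. `extend_attention_mask` is a hypothetical helper, not a function of the library.

```python
from typing import List


def extend_attention_mask(past_mask: List[int], num_new_tokens: int) -> List[int]:
    """Append a 1 for each new token to the mask used for the cached positions."""
    return past_mask + [1] * num_new_tokens


# The first forward pass saw 4 positions, the first of which was padding.
past_mask = [0, 1, 1, 1]
# The next step feeds a single new token together with the cache.
mask = extend_attention_mask(past_mask, 1)
print(mask, len(mask) == len(past_mask) + 1)
```

The padding entry from the first pass stays in the mask, which is what keeps the masking consistent with the cached keys and values.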

74814574

Add image classification script, no trainer (#16727) · b96e82c8

NielsRogge authored Apr 19, 2022

* Add first draft

* Improve README and run fixup

* Make script aligned with other scripts, improve README

* Improve script and add test

* Remove print statement

* Apply suggestions from code review

* Add num_labels to make test pass

* Improve README

b96e82c8

[ASR Pipeline] Correct init docs (#16833) · db9f1891
Patrick von Platen authored Apr 19, 2022
```
* correct

* up
```
db9f1891
Add onnx export of models with a multiple choice classification head (#16758) · 77de8d6c
Ella Charlaix authored Apr 19, 2022
```
* Add export of models with a multiple-choice classification head
```
77de8d6c
fix `run_clm.py` seeking text column name twice (#16624) · b74a9553
Wonjae Kim authored Apr 19, 2022

b74a9553

Type hints added for TFMobileBert (#16505) · 3663fca4

Dahlbomii authored Apr 19, 2022



* Type hints added

* make style

* Return type hints added

* fixed typo
Co-authored-by: matt <rocketknight1@gmail.com>

3663fca4

Some tests misusing assertTrue for comparisons fix (#16771) · a2392415

code-review-doctor authored Apr 19, 2022

* Fix issue avoid-misusing-assert-true found at https://codereview.doctor



* fix tests

* fix tf
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
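
The pattern being fixed is easy to miss: `assertTrue(a, b)` does not compare `a` with `b`. The second argument is the failure message, so the check passes whenever `a` is truthy. A minimal sketch with the standard `unittest` module follows. The test class, the `run_examples` wrapper and the values are made up for illustration.

```python
import unittest


def run_examples() -> unittest.TestResult:
    """Run a misused and a correct assertion and return the collected result."""

    class ComparisonExample(unittest.TestCase):
        def test_misused_assert_true(self):
            # Passes although 3 != 4: 4 is treated as the failure message.
            self.assertTrue(3, 4)

        def test_assert_equal(self):
            # The intended comparison: this one fails because 3 != 4.
            self.assertEqual(3, 4)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ComparisonExample)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_examples()
print(result.testsRun, len(result.failures))
```

Both tests run, but only the `assertEqual` version reports a failure. The misused `assertTrue` silently passes.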

a2392415

[Flax] improve large model init and loading (#16148) · d3bd9ac7

Suraj Patil authored Apr 19, 2022



* begin do_init

* add params_shape_tree

* raise error if params are accessed when do_init is False

* don't allow do_init=False when keys are missing

* make shape tree a property

* assign self._params at the end

* add test for do_init

* add do_init arg to all flax models

* fix param setting

* disable do_init for composite models

* update test

* add do_init in FlaxBigBirdForMultipleChoice

* better names and errors

* improve test

* style

* add a warning when do_init=False

* remove extra if

* set params after _required_params

* add test for from_pretrained

* do_init => _do_init

* change warning to info

* fix typo

* add params in init_weights

* add params to gpt neo init

* add params to init_weights

* update do_init test

* Trigger CI

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update template

* trigger CI

* style

* style

* fix template
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
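
Several items above describe one behaviour: a model can be created without allocating its parameters, its shape tree stays available, and accessing the parameters in that state raises an error. The sketch below shows that pattern in plain Python. `LazyInitModel` is a hypothetical class that borrows the names `_do_init`, `params_shape_tree` and `init_weights` from the commit message, and it does not reproduce the Flax implementation.

```python
from typing import Dict, List, Optional


class LazyInitModel:
    """Hypothetical model whose parameters may be left unallocated."""

    def __init__(self, shapes: Dict[str, int], _do_init: bool = True):
        self._do_init = _do_init
        self._shapes = dict(shapes)
        # Allocate parameters only when initialization is requested.
        self._params: Optional[Dict[str, List[float]]] = (
            self.init_weights() if _do_init else None
        )

    def init_weights(self) -> Dict[str, List[float]]:
        # Placeholder initialization: zeros of the declared size.
        return {name: [0.0] * size for name, size in self._shapes.items()}

    @property
    def params_shape_tree(self) -> Dict[str, int]:
        # Shapes are always available, even without allocated parameters.
        return dict(self._shapes)

    @property
    def params(self) -> Dict[str, List[float]]:
        if not self._do_init:
            raise ValueError("params are unavailable: model created with _do_init=False")
        return self._params


lazy = LazyInitModel({"dense": 3}, _do_init=False)
print(lazy.params_shape_tree)
```

Reading `lazy.params` here raises `ValueError`, while the same class created with the default `_do_init=True` returns the allocated values.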

d3bd9ac7

Wav2 vec2 phoneme ctc tokenizer optimisation (#16817) · 6de4ee61

Arthur authored Apr 19, 2022



* Solved href rendering issue in heading

Markdown references in headings such as '####' don't render well.
Replaced it with <h4>...<a></a></h4> banners.

* PhonemeTokenizer optimization using phonemizer lib

The backend should only be initialized once, otherwise it is reloaded.
Added `init_backend` function, which initializes a backend attribute.
Phonemize re-uses self.backend.
Should give ~10 times faster phonemization.

* formatted file with make style

* Documentation suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update /tokenization_wav2vec2_phoneme.py based on PR suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update CONTRIBUTING.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
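
The optimization described in the second item is ordinary initialize-once caching: build the expensive object a single time, keep it on the instance, and reuse it on every call. The sketch below shows the pattern with a stand-in backend. `FakeBackend`, `PhonemeTokenizerSketch` and the instance counter are hypothetical and do not reproduce the phonemizer API.

```python
class FakeBackend:
    """Stand-in for an expensive-to-construct phonemization backend."""

    instances_created = 0

    def __init__(self, language: str):
        FakeBackend.instances_created += 1
        self.language = language

    def phonemize(self, text: str) -> str:
        # Placeholder transformation; a real backend would return phonemes.
        return text.lower()


class PhonemeTokenizerSketch:
    def __init__(self, language: str = "en-us"):
        self.init_backend(language)

    def init_backend(self, language: str) -> None:
        # Build the backend once and keep it as an attribute.
        self.backend = FakeBackend(language)

    def phonemize(self, text: str) -> str:
        # Reuse the stored backend instead of constructing a new one per call.
        return self.backend.phonemize(text)


tokenizer = PhonemeTokenizerSketch()
outputs = [tokenizer.phonemize(word) for word in ["Hello", "World", "Again"]]
print(outputs, FakeBackend.instances_created)
```

Three calls to `phonemize` construct the backend only once, which is the source of the speed-up the commit message reports.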

6de4ee61

Fix `LayoutLMv2` tokenization docstrings (#16187) · 306c9ee9
Li-Huai (Allan) Lin authored Apr 19, 2022
```
* Fix docstrings

* Fix up

* Fix
```
306c9ee9

Add semantic script no trainer, v2 (#16788) · 7db7aab4

NielsRogge authored Apr 19, 2022

* Add first draft from previous PR

* First draft

* Improve README and remove num_labels

* Make script more aligned with other scripts

* Improve README and apply suggestion from code review

7db7aab4