Commits · 31ee80d55673f32c0f5d50936f371e661b74b21a · chenpangpang / transformers

24 May, 2022 1 commit

NielsRogge authored May 24, 2022



* Make forward pass work

* More improvements

* Remove unused imports

* Remove timm dependency

* Improve loss calculation of token classifier

* Fix most tests

* Add docs

* Add model integration test

* Make all tests pass

* Add LayoutLMv3FeatureExtractor

* Improve integration test + make fixup

* Add example script

* Fix style

* Add LayoutLMv3Processor

* Fix style

* Add option to add visual labels

* Make more tokenizer tests pass

* Fix more tests

* Make more tests pass

* Fix bug and improve docs

* Fix import of processors

* Improve docstrings

* Fix toctree and improve docs

* Fix auto tokenizer

* Move tests to model folder

* Move tests to model folder

* change default behavior add_prefix_space

* add prefix space for fast

* add_prefix_space set to True for Fast

* no space before `unique_no_split` token

* add test to highlight special treatment of added tokens

* fix `test_batch_encode_dynamic_overflowing` by building a long enough example

* fix `test_full_tokenizer` with add_prefix_token

* Fix tokenizer integration test

* Make the code more readable

* Add tests for LayoutLMv3Processor

* Fix style

* Add model to README and update init

* Apply suggestions from code review

* Replace asserts by value errors

* Add suggestion by @ducviet00

* Add model to doc tests

* Simplify script

* Improve README

* a step ahead to fix

* Update pair_input_test

* Make all tokenizer tests pass - phew

* Make style

* Add LayoutLMv3 to CI job

* Fix auto mapping

* Fix CI job name

* Make all processor tests pass

* Make tests of LayoutLMv2 and LayoutXLM consistent

* Add copied from statements to fast tokenizer

* Add copied from statements to slow tokenizer

* Remove add_visual_labels attribute

* Fix tests

* Add link to notebooks

* Improve docs of LayoutLMv3Processor

* Fix reference to section
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

31ee80d5

23 May, 2022 8 commits

Add support for `device_map="auto"` to OPT (#17382) · 13541b4a
Sylvain Gugger authored May 23, 2022

OPTForCausalLM lm_head input size should be config.word_embed_proj_dim (#17225) · 71cced8a
vfbd authored May 23, 2022

Use Accelerate in `from_pretrained` for big model inference (#17341) · 56f50590

Sylvain Gugger authored May 23, 2022



* Initial work

* More or less finished with first draft

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix randomly initialized weights

* Update src/transformers/modeling_utils.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Address review comments

* Rename DeepSpeed folder to temporarily fix the test issue?

* Revert to try if Accelerate fix works

* Use latest Accelerate release

* Quality and fixes

* Style

* Quality

* Add doc

* Test + fix

* More blocks
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Traced models serialization and torchscripting fix (#17206) · 2e7e4280

Michael Benayoun authored May 23, 2022

* Fix torch.jit.script and pickling issues

* Fix get_attr issues

* Fix import in function

* Fix GPT-J and T5 tracing for torch=1.11

* Gate graph surgery on torch version

* Modeling minor changes to enable TorchScripting

* Model serialization / deserialization test

* Remove _assert_is_none users

Fix Comet ML integration (#17381) · 1cd01b0a

Maximilian Schmidt authored May 23, 2022

Callback function `on_train_end` crashed if Comet ML integration was
used but `COMET_MODE` set to `DISABLE`
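
The crash described above is the kind that a simple guard avoids. The following is a minimal sketch, not the code from the PR: `CometCallbackSketch`, its attributes, and the way the experiment is represented are all hypothetical, and it assumes that no experiment object exists when `COMET_MODE` disables logging.

```python
import os


class CometCallbackSketch:
    """Hypothetical stand-in for a Comet ML trainer callback (not the real class)."""

    def __init__(self):
        self.experiment = None
        self.finished = False

    def setup(self):
        # Assumption: no experiment object is created when COMET_MODE disables logging.
        mode = os.getenv("COMET_MODE", "ONLINE").upper()
        if not mode.startswith("DISABLE"):
            self.experiment = {"mode": mode, "assets": []}

    def on_train_end(self, output_dir):
        # Guard: without this check a disabled run would use None as an experiment.
        if self.experiment is None:
            return
        self.experiment["assets"].append(output_dir)
        self.finished = True
```

With the guard in place, calling `on_train_end` in a disabled run simply returns instead of raising.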

Fix cvt docstrings (#17367) · c86aad61
Anugunj Naman authored May 23, 2022

Correct & Improve Doctests for LayoutLMv2 (#17168) · 7b8cb269

ghlai9665 authored May 23, 2022



* add inference example to LayoutLMv2ForQuestionAnswering, passing doctest

* add loss example to LayoutLMv2ForQuestionAnswering, passing doctest

* Add correct doctest for LayoutLMv2ForTokenClassification, passing doctest

* add correct doctest for LayoutLMv2ForSequenceClassification, passing test

* add correct doctest for LayoutLMv2Model, passing test

* make fixup

* fix to address review comments

* make style

* fix doctest line break issue, add to documentation_tests.txt, address review comments

* move comment about layoutlmv2 dependencies to the doc page

* format doc page as suggested
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* delete extraneous backtick
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Fix CodeParrot training script (#17291) · b48ac1a0

Loubna Ben Allal authored May 23, 2022



* average loss over batches and accumulated steps for tracking

* fix layernorm weight decay

* use AdamW from Pytorch instead of Transformers

* add shuffling of sequences inside the batches

* add shuffling of sequences inside the batches

* add logging dir and reformat code

* fix lr tracking

* remove Mistral scaling

* keep Mistral scaling

* reformat code

* fix error

* fix error

* use shuffling function from Pytorch

* remove argument for shuffling batch sequences as it isn't optional

* update package versions and install accelerate from source

* remove unused package

* Update loss average over accumulated steps
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update loss average over accumulated steps
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* use one shuffle buffer argument

* compute avg_loss in one line
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
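
One bullet above mentions fixing LayerNorm weight decay. A common convention, assumed here rather than taken from the training script, is to exclude bias and layer-norm parameters from weight decay by matching on parameter names. The sketch below uses a hypothetical helper `group_params` and hypothetical name fragments (`"bias"`, `"ln_"`); it works on plain `(name, parameter)` pairs so that it runs without PyTorch.

```python
def group_params(named_params, weight_decay, no_decay=("bias", "ln_")):
    """Split (name, param) pairs into a decayed and a non-decayed group.

    The name fragments in `no_decay` are illustrative assumptions; real
    models may name their layer-norm parameters differently.
    """
    with_wd, without_wd = [], []
    for name, param in named_params:
        if any(key in name for key in no_decay):
            without_wd.append(param)
        else:
            with_wd.append(param)
    # Same shape as the per-parameter option groups accepted by PyTorch optimizers.
    return [
        {"params": with_wd, "weight_decay": weight_decay},
        {"params": without_wd, "weight_decay": 0.0},
    ]
```

With a real model, the pairs would typically come from `model.named_parameters()` and the returned list would be passed to the optimizer constructor.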

20 May, 2022 2 commits

Fix a typo relative_postion_if_large -> relative_position_if_large (#17366) · b9bb4173
Daniel Stancl authored May 20, 2022

Pin dill to fix examples (#17368) · 3fd7de49

Sylvain Gugger authored May 20, 2022

* Pin dill for now

* Try this version?

* force install

* Actually use dep in testing

* Try a larger pin

19 May, 2022 7 commits

[Test OPT] Add batch generation test opt (#17359) · 54192058
Patrick von Platen authored May 19, 2022

* up

* up

Fix bug in Wav2Vec2 pretrain example (#17326) · 48c22691
ddobokki authored May 20, 2022

fix for 17292 (#17293) · 5d6feecf
Nathan Dahlberg authored May 19, 2022

[Generation] Fix Transition probs (#17311) · 518bd02c
Patrick von Platen authored May 19, 2022

* [Draft] fix transition probs

* up

* up

* up

* make it work

* fix

* finish

* update

[OPT] Run test in lower precision on GPU (#17353) · e8714c03

Patrick von Platen authored May 19, 2022

* [OPT] Run test only in half precision

* up

* up

* up

* up

* finish

* fix on GPU

* Update tests/models/opt/test_modeling_opt.py

Adding `batch_size` test to QA pipeline. (#17330) · 2b282296
Nicolas Patry authored May 19, 2022

[BC] Fixing usage of text pairs (#17324) · a4386d7e

Nicolas Patry authored May 19, 2022



* [BC] Fixing usage of text pairs

The BC is actually preventing users from misusing the pipeline since
users could have been willing to send text pairs and the pipeline would
instead understand the thing as a batch returning bogus results.

The correct usage of text pairs is preserved in this PR even when that
makes the code clunky.

Adds support for {"text":..,, "text_pair": ...} inputs for both dataset
iteration and more explicit usage to pairs.

* Updating the doc.

* Update src/transformers/pipelines/text_classification.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/text_classification.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_text_classification.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* quality.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
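
The ambiguity described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: `normalize_inputs` is a hypothetical helper that treats a plain list as a batch of single texts and accepts the explicit `{"text": ..., "text_pair": ...}` form for pairs.

```python
def normalize_inputs(inputs):
    """Hypothetical helper: map user input to a list of (text, text_pair) tuples."""
    if isinstance(inputs, str):
        return [(inputs, None)]
    if isinstance(inputs, dict):
        # Explicit pair form named in the commit message.
        return [(inputs["text"], inputs.get("text_pair"))]
    if isinstance(inputs, list):
        pairs = []
        for item in inputs:
            if isinstance(item, list):
                # Refuse to guess whether a nested list is a pair or a batch.
                raise ValueError(
                    "Nested lists are ambiguous; use a dict with 'text' and 'text_pair'."
                )
            pairs.extend(normalize_inputs(item))
        return pairs
    raise ValueError(f"Unsupported input type: {type(inputs).__name__}")
```

In this sketch a two-element list of strings is read as two independent texts, which is why a caller who means a pair has to say so explicitly with the dictionary form.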

18 May, 2022 17 commits

[tests] fix copy-n-paste error (#17312) · 3601aa8f
Stas Bekman authored May 18, 2022

* [tests] fix copy-n-paste error

* fix

Fix ci_url might be None (#17332) · 1b20c970

Yih-Dar authored May 18, 2022



* fix

* Update utils/notification_service.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

fix (#17337) · 6aad3872

Yih-Dar authored May 18, 2022


Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fix metric calculation in examples and setup tests to run on multi-gpu for no_trainer scripts (#17331) · 1762ded3

Zachary Mueller authored May 18, 2022

* Fix length in no_trainer examples

* Add setup and teardown

* Use new accelerator config generator to automatically make tests able to run based on environment

docs for typical decoding (#17186) · 6e195eb9
Jader Martins authored May 18, 2022

Co-authored-by: Jader Martins <jadermcs94@gmail.com>

Not send successful report (#17329) · 060fe61d

Yih-Dar authored May 18, 2022



* send report only if there is any failure
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fix test_t5_decoder_model_past_large_inputs (#17320) · b3b9f99e
Yih-Dar authored May 18, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Add onnx export cuda support (#17183) · 6da76b9c

Jingya HUANG authored May 18, 2022


Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Add CvT (#17299) · adc0ff25

NielsRogge authored May 18, 2022



* Adding cvt files

* Adding cvt files

* changes in init file

* Adding cvt files

* changes in init file

* Style fixes

* Address comments from code review

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Format lists in docstring

* Fix copies

* Apply suggestion from code review
Co-authored-by: AnugunjNaman <anugunjjha@gmail.com>
Co-authored-by: Ayushman Singh <singhayushman13@protonmail.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Fix style · 47107028
Sylvain Gugger authored May 18, 2022

Add Information Gain Filtration algorithm (#16953) · 5fdb54ec

mraunak authored May 18, 2022



* Add information gain filtration algorithm

* Complying with black requirements

* Added author

* Fixed import order

* flake8 corrections
Co-authored-by: Javier Turek <javier.turek@intel.com>

Fix typo (#17328) · 91ede485
Kamal Raj authored May 18, 2022

remove (#17325) · fe28eb94

Yih-Dar authored May 18, 2022


Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Accepting real pytorch device as arguments. (#17318) · 2cb2ea3f
Nicolas Patry authored May 18, 2022

* Accepting real pytorch device as arguments.

* is_torch_available.

Updating the docs for `max_seq_len` in QA pipeline (#17316) · 1c9d1f4c
Nicolas Patry authored May 18, 2022

[T5] Fix init in TF and Flax for pretraining (#17294) · 60ad7344

Patrick von Platen authored May 18, 2022



* fix init

* Apply suggestions from code review

* fix

* finish

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Add type hints for ProphetNet (Pytorch) (#17223) · 7ba1d4e5

Joaq authored May 18, 2022



* added type hints to prophetnet

* reformatted with black

* fix bc black misformatted some parts

* fix imports

* fix imports

* Update src/transformers/models/prophetnet/configuration_prophetnet.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* update OPTIONAL type hint and docstring
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

17 May, 2022 5 commits

Add trajectory transformer (#17141) · d6b8e9ce

Carl authored May 18, 2022



* Add trajectory transformer


Fix model init


Fix end of lines for .mdx files

Add trajectory transformer model to toctree

Add forward input docs

Fix docs, remove prints, simplify prediction test

Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update docs, more descriptive comments

Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update readme

Small comment update and add conversion script

Rebase and reformat

Fix copies

Fix rebase, remove duplicates

Fix rebase, remove duplicates

* Remove tapex

* Remove tapex

* Remove tapex

fix (#17310) · c3526400
Patrick von Platen authored May 18, 2022

[LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112) · d9050dc7

Cesare Campagnano authored May 17, 2022

* [LED] fixed global_attention_mask not passed for generation + docs clarification for gradient checkpointing

* LED docs clarification
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] gradient_checkpointing=True should be passed to TrainingArguments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] docs: remove wrong word
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] docs fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Add support for pretraining recurring span selection to Splinter (#17247) · bad35839

Jean Vancoppenolle authored May 17, 2022



* Add SplinterForSpanSelection for pre-training recurring span selection.

* Formatting.

* Rename SplinterForSpanSelection to SplinterForPreTraining.

* Ensure repo consistency

* Fixup changes

* Address SplinterForPreTraining PR comments

* Incorporate feedback and derive multiple question tokens per example.

* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Jean Vancoppenole <jean.vancoppenolle@retresco.de>
Co-authored-by: Tobias Günther <tobias.guenther@retresco.de>
Co-authored-by: Tobias Günther <github@tobigue.de>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Add PR author in CI report + merged by info (#17298) · 05113055

Yih-Dar authored May 17, 2022



* Add author info to CI report

* Add merged by info

* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
