1. 17 Mar, 2022 3 commits
  2. 16 Mar, 2022 5 commits
  3. 15 Mar, 2022 3 commits
  4. 14 Mar, 2022 4 commits
    • Francesco Saverio Zuppichini's avatar
      [WIP] Resnet (#15770) · e3008c67
      Francesco Saverio Zuppichini authored
      
      
      * first commit
      
      * ResNet model correctly implemented.
      
      basic modeling + weights conversion is done
      
      removed unused doc
      
      mdx file
      
      doc and conversion script
      
      added feature_extractor to auto
      
      test
      
      minor changes + style + quality
      
      doc
      
      test
      
      Delete process.yml
      
      A leftover from my attempt at running CircleCI locally
      
      * minor changes
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * new test format
      
      * minor changes from conversations
      
      * minor changes from conversations
      
      * make style + quality
      
      * re-added the tests
      
      * test + README
      
      * minor changes from conversations
      
      * error in README
      
      * make fix-copies
      
      * removed regression for classification head
      
      * make quality
      
      * fixed loss control flow
      
      * fixed loss control flow
      
      * resolved conversations
      
      * Apply suggestions from code review
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * READMEs
      
      * index.mdx
      
      * minor changes
      
      * updated tests and models
      
      * unused import
      
      * outputs
      
      * Update docs/source/model_doc/resnet.mdx
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * added embeddings_size
      
      * Apply suggestions from code review
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * conversation
      
      * added push to hub
      
      * test
      
      * embedding_size
      
      * make fix-copies
      
      * resolved conversations
      
      * CI
      
      * changed organization
      
      * minor changes
      
      * CI
      
      * minor changes
      
      * conversations
      
      * conversation
      
      * doc
      
      * tests
      
      * removed unused docstring
      
      * conversation
      
      * removed unused outputs
      
      * CI
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      e3008c67
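      The core idea behind the ResNet blocks this PR implements is the residual (shortcut) connection. A minimal pure-Python sketch of that idea follows; it is illustrative only and not the library's actual modeling code (the helper names are made up for this example):

      ```python
      def relu(xs):
          return [max(0.0, x) for x in xs]

      def linear(xs, weight, bias):
          # weight: list of rows, one row per output unit
          return [sum(w * x for w, x in zip(row, xs)) + b for row, b in zip(weight, bias)]

      def residual_block(xs, weight, bias):
          # output = relu(F(x) + x): the shortcut lets the input bypass F entirely
          transformed = linear(xs, weight, bias)
          return relu([t + x for t, x in zip(transformed, xs)])

      # With identity weights and zero bias, the block reduces to relu(2 * x)
      x = [1.0, -2.0, 3.0]
      identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
      print(residual_block(x, identity, [0.0, 0.0, 0.0]))  # [2.0, 0.0, 6.0]
      ```

      The shortcut is what makes very deep stacks of such blocks trainable, which is the design the commits above port into the library's modeling and conversion scripts.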
    • Yih-Dar's avatar
      Make TF pt-tf equivalence test more aggressive (#15839) · 923c35b5
      Yih-Dar authored
      
      
      * Make TF pt-tf equivalence test more aggressive
      
      * Fix for TFConvNextModelTest and TFTransfoXLModelTest
      
      * fix kwargs for outputs
      
      * clean-up
      
      * Add docstring for check_outputs()
      
      * remove: need to rename encoder-decoder
      
      * clean-up
      
      * send PyTorch things to the correct device
      
      * Add back the accidentally removed test case in test_pt_tf_model_equivalence()
      
      * Fix: change to tuple before calling check_outputs()
      
      * Fix: tfo could be a list
      
      * use to_tuple()
      
      * allow tfo only to be tuple or tensor
      
      * allow tfo to be list or tuple for now + style change
      
      * minor fix
      
      * remove np.copy and update comments
      
      * tfo -> tf_output, same for pt
      
      * Add more detailed comment
      
      * remove the incorrect comment
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      923c35b5
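      The check_outputs() logic this PR discusses recursively walks the (possibly nested) PyTorch and TensorFlow outputs after converting them with to_tuple(), and compares them element-wise. A minimal sketch of that comparison, using plain Python floats instead of real tensors (not the actual test code):

      ```python
      def max_abs_diff(pt_output, tf_output):
          # Recursively descend nested tuples/lists (what to_tuple() produces)
          # and return the largest element-wise absolute difference found.
          if isinstance(pt_output, (tuple, list)):
              return max(max_abs_diff(p, t) for p, t in zip(pt_output, tf_output))
          return abs(pt_output - tf_output)

      pt = (1.0, (2.0, [3.0, 4.0]))
      tf = (1.0, (2.1, [3.0, 3.8]))
      diff = max_abs_diff(pt, tf)
      print(diff)
      ```

      The real equivalence test asserts this maximum difference stays under a tight tolerance for every output of every model, which is what makes the test "more aggressive" than only checking a top-level tensor.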
    • Sanchit Gandhi's avatar
      Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained... · 2de99e6c
      Sanchit Gandhi authored
      Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained Encoder-Decoder Checkpoints (#16056)
      
      * Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained Encoder-Decoder Checkpoints
      
      * change wording
      2de99e6c
    • lewtun's avatar
      Add TFCamembertForCausalLM and ONNX integration test (#16073) · 6e1e88fd
      lewtun authored
      * Make Camembert great again!
      
      * Add Camembert to TensorFlow ONNX tests
      6e1e88fd
  5. 12 Mar, 2022 1 commit
    • Stas Bekman's avatar
      [Deepspeed] add support for bf16 mode (#14569) · 580dd87c
      Stas Bekman authored
      
      
      * [WIP] add support for bf16 mode
      
      * prep for bf16
      
      * prep for bf16
      
      * fix; zero2/bf16 is ok
      
      * check bf16 is available
      
      * test fixes
      
      * enable zero3_bf16
      
      * config files
      
      * docs
      
      * split stage_dtype; merge back to non-dtype-specific config file
      
      * fix doc
      
      * cleanup
      
      * cleanup
      
      * bfloat16 => bf16 to match the PR changes
      
      * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
      
      * test fixes/skipping
      
      * move
      
      * fix
      
      * Update docs/source/main_classes/deepspeed.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * backticks
      
      * cleanup
      
      * cleanup
      
      * cleanup
      
      * new version
      
      * add note about grad accum in bf16
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      580dd87c
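      The bf16 mode this PR adds is driven by the DeepSpeed config JSON, where a "bf16" section sits alongside the existing "fp16" one. A sketch of such a config, built in Python with illustrative values (not one of the PR's actual config files):

      ```python
      import json

      # Illustrative DeepSpeed config enabling bf16 with ZeRO stage 3
      # (the zero3_bf16 path mentioned above); values are examples only.
      ds_config = {
          "bf16": {"enabled": True},          # use bf16 instead of the "fp16" section
          "zero_optimization": {"stage": 3},
          "train_micro_batch_size_per_gpu": "auto",
      }
      print(json.dumps(ds_config, indent=2))
      ```

      Note the PR also renames the weight-gathering option from zero_gather_fp16_weights_on_model_save to zero_gather_16bit_weights_on_model_save, since the saved weights are no longer necessarily fp16.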
  6. 11 Mar, 2022 2 commits
    • Kevin Bondzio's avatar
      Add soft length regulation for sequence generation (#15245) · 9442b3ce
      Kevin Bondzio authored
      
      
      * add possibility to softly regulate length when using sampling method in model.generate() function
      
      * fix test config, fix formatting
      
      * fix rag integration, fix docstyling
      
      * fix wrong docstring
      
      * change param to tuple, add test
      
      * fix old param in rag_model, remove unused import
      
      * change test according to new param
      
      * fix formatting
      
      * fix test case
      
      * fix doc style
      
      * move start_length calculation to LogitsProcessor
      
      * remove unused import
      
      * fix small errors
      
      * fix test
      
      * Update src/transformers/generation_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/generation_utils.py
      
      * Update src/transformers/generation_utils.py
      
      * fix docstring, add type in rag model
      
      * fix docstrings
      
      * introduce seq_length variable for cleaner code
      
      * fix black formatting
      
      * add input_ids_seq_length to modeling_rag
      
      * add input_ids_seq_length to test
      
      * retrigger checks
      
      * retrigger checks
      Co-authored-by: Kevin Bondzio <kev@AIM-LAP-02.local>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Kevin Bondzio <kev@AIM-LAP-02.fritz.box>
      9442b3ce
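      The soft length regulation above is implemented as a logits processor that, once generation passes a chosen start index, scales the EOS token's score by an exponentially growing factor, so sampling tends to end "softly" instead of being cut off at max_length. A sketch of the idea, assuming a positive EOS score (the helper name is made up; it is not the library's class):

      ```python
      def soft_length_penalty(eos_score, cur_len, start_index, decay_factor):
          # Before start_index the score is untouched; after it, the EOS score
          # grows exponentially with the number of tokens past start_index.
          if cur_len <= start_index:
              return eos_score
          return eos_score * decay_factor ** (cur_len - start_index)

      # EOS becomes increasingly likely as the sequence grows past start_index=10
      print(soft_length_penalty(0.5, 10, 10, 1.5))  # 0.5 (penalty not yet active)
      print(soft_length_penalty(0.5, 14, 10, 1.5))  # 0.5 * 1.5**4 = 2.53125
      ```

      This matches the tuple parameter the PR settles on: a (start_index, decay_factor) pair passed to model.generate().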
    • Yih-Dar's avatar
      Fix a TF test name (LayoutLMModelTest) (#16061) · b6bdb943
      Yih-Dar authored
      
      
      * fix name
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      b6bdb943
  7. 10 Mar, 2022 6 commits
  8. 09 Mar, 2022 6 commits
  9. 08 Mar, 2022 5 commits
  10. 07 Mar, 2022 3 commits
  11. 04 Mar, 2022 2 commits
    • Francesco Saverio Zuppichini's avatar
      9932ee4b
    • Chan Woo Kim's avatar
      Constrained Beam Search [*With* Disjunctive Decoding] (#15761) · 5c6f57ee
      Chan Woo Kim authored
      
      
      * added classes to get started with constrained beam search
      
      * in progress, think i can directly force tokens now but not yet with the round robin
      
      * think now i have total control, now need to code the bank selection
      
      * technically works as desired, need to optimize and fix design choices leading to undesirable outputs
      
      * complete PR #1 without disjunctive decoding
      
      * removed incorrect tests
      
      * Delete k.txt
      
      * Delete test.py
      
      * Delete test.sh
      
      * revert changes to test scripts
      
      * genutils
      
      * full implementation with testing, no disjunctive yet
      
      * shifted docs
      
      * passing all tests realistically ran locally
      
      * removing accidentally included print statements
      
      * fixed source of error in initial PR test
      
      * fixing the get_device() vs device trap
      
      * fixed documentation docstrings about constrained_beam_search
      
      * fixed tests failing for Speech2TextModel's floating point inputs
      
      * fix cuda long tensor
      
      * added examples and testing for them, and found & fixed a bug in beam_search and constrained_beam_search
      
      * deleted accidentally added test halting code with assert False
      
      * code reformat
      
      * Update tests/test_generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_generation_utils.py
      
      * fixing based on comments on PR
      
      * took out the testing code that should work but fails without the beam search modification; style changes
      
      * fixing comments issues
      
      * docstrings for ConstraintListState
      
      * typo in PhrasalConstraint docstring
      
      * docstrings improvements
      
      * finished adding what is sort of an opinionated implementation of disjunctive generation, but it revealed errors in inner beam search logic during testing.
      
      * fixed bug found in constrained beam search that used beam_idx that were not global across all the batches
      
      * disjunctive constraint working 100% correctly
      
      * passing all tests
      
      * Accidentally included mlruns
      
      * Update src/transformers/generation_beam_constraints.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/generation_beam_constraints.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * complete overhaul of type complexities and other nits
      
      * strict type checks in generate()
      
      * fixing second round of feedback by narsil
      
      * fixed failing generation test because of type check overhaul
      
      * generation test fail fix
      
      * fixing test fails
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      5c6f57ee
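      The disjunctive decoding this PR adds means a constraint is satisfied when ANY one of several candidate token sequences appears in the generated ids (e.g. forcing either "rain" or "raining"). A minimal after-the-fact sketch of that satisfaction check in pure Python; the library's DisjunctiveConstraint instead tracks progress incrementally during beam search, so treat the helpers below as illustrative only:

      ```python
      def contains_subsequence(ids, phrase):
          # True if the token list `phrase` occurs contiguously inside `ids`
          return any(ids[i:i + len(phrase)] == phrase for i in range(len(ids) - len(phrase) + 1))

      def disjunctive_satisfied(ids, alternatives):
          # Satisfied as soon as any one alternative phrase is present
          return any(contains_subsequence(ids, alt) for alt in alternatives)

      # e.g. alternatives could be the token ids for "rain" and for "raining"
      print(disjunctive_satisfied([2, 5, 9, 4], [[5, 9], [7]]))  # True
      print(disjunctive_satisfied([2, 4, 3], [[5, 9], [7]]))     # False
      ```

      During actual constrained beam search the beams are grouped into "banks" by how many constraint tokens they have already fulfilled, which is the round-robin bank selection the early commits above describe.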