Commits · 8a96b0f10a89c1a26d37385d81fa7e3a3ea0a601 · chenpangpang / transformers

"...gpu/git@developer.sourcefind.cn:gaoqiong/migraphx.git" did not exist on "cc098f4daba858cd00e40f575413b801f129836b"

17 Mar, 2022 1 commit

[Generate Docs] Correct docs (#16133) · 8a96b0f1

Patrick von Platen authored Mar 17, 2022



* [Generate Docs] Correct docs

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

8a96b0f1

16 Mar, 2022 1 commit

Swin support for any input size (#15986) · 667b823b

Francesco Saverio Zuppichini authored Mar 16, 2022



* padding done

* correctly return one attention per layer

* almost correct, attentions are not flatten one tuple per stage

* tests green

* doc

* conversations

* reshaping hidden_states

* view in the test

* reshape_hidden_states in Encoder and Model

* new outputs with reshaped_hidden_states

* conversations

* doc

* Update docs/source/model_doc/swin.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* fix tests

* minor changes

* resolved conversations

* attentions one per stage

* typo

* typos

* typos

* function signature

* CI

* clean up tests
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

667b823b

15 Mar, 2022 4 commits

Framework split (#16030) · 4f4e5ddb
Sylvain Gugger authored Mar 15, 2022
```
* First files

* More files

* Last files

* Style
```
4f4e5ddb

[Fix doc example] Fix first example for the custom_datasets tutorial (#16087) · bcaf5660

Markus Sagen authored Mar 15, 2022

* Fix inconsistent example variable naming

- Example code for a sequence classification in Tensorflow had spelling mistakes and incorrect and inconsistent naming
- Changed variable naming to be consistent with the two other TF examples

* Fix incorrect incorrect training examples

bcaf5660

Added spanish translation of quicktour.mdx (#16158) · daa49447

Daniel Espejel authored Mar 15, 2022



* Added spanish translation of quicktour.mdx

* Suggestions applied in the revision of the translation
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

daa49447

Visual Attention Network (VAN) (#16027) · 0a057201

Francesco Saverio Zuppichini authored Mar 15, 2022



* encoder works

* addded files

* norm in stage

* convertion script

* tests

* fix copies

* make fix-copies

* fixed __init__

* make fix-copies

* fix

* shapiro test needed

* make fix-copie

* minor changes

* make style + quality

* minor refactor conversion script

* rebase + tests

* removed unused variables

* updated doc

* toctree

* CI

* doc

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolved conversations

* make fixup

* config passed to modules

* config passed to modules

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* conversations

* copyrights

* normal test

* tests
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

0a057201

14 Mar, 2022 4 commits

[WIP] Resnet (#15770) · e3008c67

Francesco Saverio Zuppichini authored Mar 14, 2022



* first commit

* ResNet model correctly implemented.

basic modeling + weights conversion is done

removed unused doc

mdx file

doc and conversion script

added feature_extractor to auto

test

minor changes + style + quality

doc

test

Delete process.yml

A left over from my attempt of running circleci locally

* minor changes

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* new test format

* minor changes from conversations

* minor changes from conversations

* make style + quality

* readded the tests

* test + README

* minor changes from conversations

* error in README

* make fix-copies

* removed regression for classification head

* make quality

* fixed loss control flow

* fixed loss control flow

* resolved conversations

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* READMEs

* index.mdx

* minor changes

* updated tests and models

* unused import

* outputs

* Update docs/source/model_doc/resnet.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added embeddings_size

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversation

* added push to hub

* test

* embedding_size

* make fix-copies

* resolved conversations

* CI

* changed organization

* minor changes

* CI

* minor changes

* conversations

* conversation

* doc

* tests

* removed unused docstring

* conversation

* removed unused outputs

* CI
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

e3008c67

Spanish translation of the file training.mdx (#16047) · efd6e9a8

Yhary Arias authored Mar 14, 2022



* Spanish translation of the file training.mdx

* Settings - Spanish translation of the file training.mdx

* Latest changes to the Spanish translation of the training.mdx file

* Delete Hugging.mdx

* Last changes to the training fil Espanish version

* Latest modifications

* Latest changes, document ready for PR

* Nits
Co-authored-by: Yhary Arias <yharystefa@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

efd6e9a8

Fix and document Zero Shot Image Classification (#16079) · 802984ad
Omar Sanseviero authored Mar 14, 2022

802984ad
Add TFCamembertForCausalLM and ONNX integration test (#16073) · 6e1e88fd
lewtun authored Mar 14, 2022
```
* Make Camembert great again!

* Add Camembert to TensorFlow ONNX tests
```
6e1e88fd

12 Mar, 2022 1 commit

[Deepspeed] add support for bf16 mode (#14569) · 580dd87c

Stas Bekman authored Mar 11, 2022



* [WIP] add support for bf16 mode

* prep for bf16

* prep for bf16

* fix; zero2/bf16 is ok

* check bf16 is available

* test fixes

* enable zero3_bf16

* config files

* docs

* split stage_dtype; merge back to non-dtype-specific config file

* fix doc

* cleanup

* cleanup

* bfloat16 => bf16 to match the PR changes

* s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/

* test fixes/skipping

* move

* fix

* Update docs/source/main_classes/deepspeed.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* backticks

* cleanup

* cleanup

* cleanup

* new version

* add note about grad accum in bf16
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

580dd87c

11 Mar, 2022 3 commits

Audio/vision task guides (#15808) · ae2dd42b

Steven Liu authored Mar 11, 2022

* 📝 first draft of audio/vision guides

* ✨ make fixup

* 🖍 fix typo

* 🖍 close parentheses

* 🖍 apply feedback

* 🖍 apply feedback, make fixup

* 🖍 more fixup for perceiver

* 🖍 apply feedback

* ✨ make fixup

* 🖍 fix data collator

ae2dd42b

Update troubleshoot guide (#16001) · 5b4c97d0
Steven Liu authored Mar 11, 2022
```
* 📝 first draft

* 🖍 apply feedback

* 🖍 apply feedback
```
5b4c97d0
Force default brnahc name via the config · f7708e1b
Sylvain Gugger authored Mar 11, 2022

f7708e1b

10 Mar, 2022 3 commits

updating fine-tune classifier documentation (#16063) · 96ac7549
David S. Batista authored Mar 10, 2022

96ac7549

[Docs] Improve PyTorch, Flax generate API (#15988) · 6ce11c2c

Patrick von Platen authored Mar 10, 2022

* Move generate docs

* up

* Update docs/source/_toctree.yml

* correct

* correct some stuff

* correct tests

* more fixes

* finish generate

* add to doc stest

* finish

* finalize

* add warning to generate method

6ce11c2c

Add Document Image Transformer (DiT) (#15984) · 0835119b

NielsRogge authored Mar 10, 2022



* Add conversion script

* Improve script

* Fix bug

* Add option to push to hub

* Add support for classification models

* Update model name

* Upload feature extractor files first

* Remove hash checking

* Fix config

* Add id2label

* Add import

* Fix id2label file name

* Fix expected shape

* Add model to README

* Improve docs

* Add integration test and fix CI

* Fix code style

* Add missing init

* Add model to SPECIAL_MODULE_TO_TEST_MAP
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

0835119b

09 Mar, 2022 3 commits

Add FlaxBartForCausalLM (#15995) · b256f351

Sanchit Gandhi authored Mar 09, 2022

* add causal lm

* add CausalLM tests

* Add FlaxBartForCausalLM

* Add EncoderDecoder model tests

* change docstring

* make repo-consistency

* suggested changes

* remove jax ops

* correction

* rename pre-trained decoder model

b256f351

Add ONNX export for ViT (#15658) · 50dd314d

lewtun authored Mar 09, 2022



* Add ONNX support for ViT

* Refactor to use generic preprocessor

* Add vision dep to tests

* Extend ONNX slow tests to ViT

* Add dummy image generator

* Use model_type to determine modality

* Add deprecation warnings for tokenizer argument

* Add warning when overwriting the preprocessor

* Add optional args to docstrings

* Add minimum PyTorch version to OnnxConfig

* Refactor OnnxConfig class variables from CONSTANT_NAME to snake_case

* Add reasonable value for default atol
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

50dd314d

[Doctests] Move doctests to new GPU & Fix bugs (#15969) · c1aaa439

Patrick von Platen authored Mar 09, 2022



* test

* up

* up

* Empty test commit

* up

* update tests

* up

* fix some vision models

* correct

* correct docs

* Trigger notification

* finalize

* check

* correct quicktour

* Apply suggestions from code review

* improve doctests

* Trigger Build

* next try

* next try

* and again

* Output current clone information

* Output current clone information

* Correct path

* add tf round again

* revert to daily job
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

c1aaa439

07 Mar, 2022 1 commit

Update training scripts docs (#15931) · 38cc3506

Steven Liu authored Mar 07, 2022

* 📝 first draft

* 🖍 apply feedback

* 🖍 remove examples from toctree

* 🗑 remove examples from docs/source

38cc3506

04 Mar, 2022 3 commits

Constrained Beam Search [*With* Disjunctive Decoding] (#15761) · 5c6f57ee

Chan Woo Kim authored Mar 05, 2022



* added classes to get started with constrained beam search

* in progress, think i can directly force tokens now but not yet with the round robin

* think now i have total control, now need to code the bank selection

* technically works as desired, need to optimize and fix design choices leading to undersirable outputs

* complete PR #1 without disjunctive decoding

* removed incorrect tests

* Delete k.txt

* Delete test.py

* Delete test.sh

* revert changes to test scripts

* genutils

* full implementation with testing, no disjunctive yet

* shifted docs

* passing all tests realistically ran locally

* removing accidentally included print statements

* fixed source of error in initial PR test

* fixing the get_device() vs device trap

* fixed documentation docstrings about constrained_beam_search

* fixed tests having failing for Speech2TextModel's floating point inputs

* fix cuda long tensor

* added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search

* deleted accidentally added test halting code with assert False

* code reformat

* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

* fixing based on comments on PR

* took out the testing code that should but work fails without the beam search moditification ; style changes

* fixing comments issues

* docstrings for ConstraintListState

* typo in PhrsalConstraint docstring

* docstrings improvements

* finished adding what is sort of an opinionated implementation of disjunctive generation, but it revealed errors in inner beam search logic during testing.

* fixed bug found in constrained beam search that used beam_idx that were not global across all the batches

* disjunctive constraint working 100% correctly

* passing all tests

* Accidentally included mlruns

* Update src/transformers/generation_beam_constraints.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/generation_beam_constraints.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* complete overhaul of type complexities and other nits

* strict type checks in generate()

* fixing second round of feedback by narsil

* fixed failing generation test because of type check overhaul

* generation test fail fix

* fixing test fails
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

5c6f57ee

Add missing support for Flax XLM-RoBERTa (#15900) · 01485cee

Javier de la Rosa authored Mar 04, 2022



* Adding Flax XLM-RoBERTa

* Add Flax to __init__

* Adding doc and dummy objects

* Add tests

* Add Flax XLM-R models autodoc

* Fix tests

* Add Flask XLM-RoBERTa to TEST_FILES_WITH_NO_COMMON_TESTS

* Update src/transformers/models/xlm_roberta/modeling_flax_xlm_roberta.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Remove test on large Flask XLM-RoBERTa

* Add tokenizer to the test
Co-authored-by: Suraj Patil <surajp815@gmail.com>

01485cee

Making MaskFormerForInstanceSegmentation. (#15934) · 89c7d9cf

Nicolas Patry authored Mar 04, 2022

Small adjustments.

Adding in type hint.

Last fix ?

Only include the default dict thing, not the pipelines.

89c7d9cf

03 Mar, 2022 3 commits
- Update README.md (#15926) · a7df656f
  Patrick von Platen authored Mar 04, 2022
  
  a7df656f
- Fix and improve REALM fine-tuning (#15297) · 7b3bd1f2
  Li-Huai (Allan) Lin authored Mar 03, 2022
```
* Draft

* Add test

* Update src/transformers/models/realm/modeling_realm.py

* Apply suggestion

* Add block_mask

* Update

* Update

* Add block_embedding_to

* Remove no_grad

* Use AutoTokenizer

* Remove model.to overridding
```
  7b3bd1f2
- [Fix link in pipeline doc] (#15906) · 439de3f7
  Patrick von Platen authored Mar 03, 2022
  
  439de3f7
02 Mar, 2022 4 commits

TF generate refactor - Sample (#15793) · baab5e7c

Joao Gante authored Mar 02, 2022



* Add TF logits wrappers 

* Add sample method

* add tests for TF logit wrappers

* TF generate sample tests now run on CPU
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

baab5e7c

Maskformer (#15682) · d83d22f5

Francesco Saverio Zuppichini authored Mar 02, 2022

* maskformer

* conflicts

* minor fixes

* feature extractor test fix

refactor MaskFormerLoss following conversation

MaskFormer related types should not trigger a module time import error

missed one

removed all the types that are not used

update config mapping

minor updates in the doc

resolved conversation that doesn't need a discussion

minor changes

resolved conversations

fixed DetrDecoder

* minor changes

minor changes

fixed mdx file

test feature_extractor return types

functional losses -> classes

removed the return type test for the feature extractor

minor changes + style + quality

* conflicts?

* rebase master

* readme

* added missing files

* deleded poolformers test that where in the wrong palce

* CI

* minor changes

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* resolved conversations

* minor changes

* conversations

[Unispeech] Fix slow tests (#15818)

* remove soundfile old way of loading audio

* Adapt slow test

[Barthez Tokenizer] Fix saving (#15815)

[TFXLNet] Correct tf xlnet generate (#15822)

* [TFXLNet] Correct tf xlnet

* adapt test comment

Fix the push run (#15807)

Fix semantic segmentation pipeline test (#15826)

Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776)

Add model specific output classes to PoolFormer model docs (#15746)

* Added model specific output classes to poolformer docs

* Fixed Segformer typo in Poolformer docs

Adding the option to return_timestamps on pure CTC ASR models. (#15792)

* Adding the option to return_timestamps on pure CTC ASR models.

* Remove `math.prod` which was introduced in Python 3.8

* int are not floats.

* Reworking the PR to support "char" vs "word" output.

* Fixup!

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Quality.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824)

Fix tf.concatenate + test past_key_values for TF models (#15774)

* fix wrong method name tf.concatenate

* add tests related to causal LM / decoder

* make style and quality

* clean-up

* Fix TFBertModel's extended_attention_mask when past_key_values is provided

* Fix tests

* fix copies

* More tf.int8 -> tf.int32 in TF test template

* clean-up

* Update TF test template

* revert the previous commit + update the TF test template

* Fix TF template extended_attention_mask when past_key_values is provided

* Fix some styles manually

* clean-up

* Fix ValueError: too many values to unpack in the test

* Fix more: too many values to unpack in the test

* Add a comment for extended_attention_mask when there is past_key_values

* Fix TFElectra extended_attention_mask when past_key_values is provided

* Add tests to other TF models

* Fix for TF Electra test: add prepare_config_and_inputs_for_decoder

* Fix not passing training arg to lm_head in TFRobertaForCausalLM

* Fix tests (with past) for TF Roberta

* add testing for pask_key_values for TFElectra model
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

[examples/summarization and translation] fix readme (#15833)

Add ONNX Runtime quantization for text classification notebook (#15817)

Re-enable doctests for the quicktour (#15828)

* Re-enable doctests for the quicktour

* Re-enable doctests for task_summary (#15830)

* Remove &

Framework split model report (#15825)

Add TFConvNextModel (#15750)

* feat: initial implementation of convnext in tensorflow.

* fix: sample code for the classification model.

* chore: added checked for from the classification model.

* chore: set bias initializer in the classification head.

* chore: updated license terms.

* chore: removed ununsed imports

* feat: enabled argument during using drop_path.

* chore: replaced tf.identity with layers.Activation(linear).

* chore: edited default checkpoint.

* fix: minor bugs in the initializations.

* partial-fix: tf model errors for loading pretrained pt weights.

* partial-fix: call method updated

* partial-fix: cross loading of weights (4x3 variables to be matched)

* chore: removed unneeded comment.

* removed playground.py

* rebasing

* rebasing and removing playground.py.

* fix: renaming TFConvNextStage conv and layer norm layers

* chore: added initializers and other minor additions.

* add: tests for convnext.

* fix: integration tester class.

* fix: issues mentioned in pr feedback (round 1).

* fix: how output_hidden_states arg is propoagated inside the network.

* feat: handling of arg for pure cnn models.

* chore: added a note on equal contribution in model docs.

* rebasing

* rebasing and removing playground.py.

* feat: encapsulation for the convnext trunk.

* Fix variable naming; Test-related corrections; Run make fixup

* chore: added Joao as a contributor to convnext.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: corrected copyright year and added comment on NHWC.

* chore: fixed the black version and ran formatting.

* chore: ran make style.

* chore: removed from_pt argument from test, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* fix: tests in the convnext subclass, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: moved convnext test to the correct location

* fix: locations for the test file of convnext.

* fix: convnext tests.

* chore: applied sgugger's suggestion for dealing w/ output_attentions.

* chore: added comments.

* chore: applied updated quality enviornment style.

* chore: applied formatting with quality enviornment.

* chore: revert to the previous tests/test_modeling_common.py.

* chore: revert to the original test_modeling_common.py

* chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py

* fix: tests for convnext.

* chore: removed output_attentions argument from convnext config.

* chore: revert to the earlier tf utils.

* fix: output shapes of the hidden states

* chore: removed unnecessary comment

* chore: reverting to the right test_modeling_tf_common.py.

* Styling nits
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

* minor changes

* doc fix in feature extractor

* doc

* typose

* removed detr logic from config

* removed num_labels

* small fix in the config

* auxilary -> auxiliary

* make style

* some test is failing

* fix a weird char in config prevending doc-builder

* retry to fix the doc-builder issue

* make style

* new try to fix the doc builder

* CI

* change weights to facebook
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

d83d22f5

[Bart] Fix implementation note doc (#15879) · 40040727
Patrick von Platen authored Mar 02, 2022

40040727

M2M100 support for ONNX export (#15193) · 4bfe75bd

Michael Benayoun authored Mar 02, 2022

* Add M2M100 support for ONNX export

* Delete useless imports

* Add M2M100 to tests

* Fix protobuf issue

4bfe75bd

01 Mar, 2022 4 commits

Inference for multilingual models (#15836) · 6ccfa217
Steven Liu authored Mar 01, 2022
```
* 📝 first draft for multilingual models

* 🖍 make style
```
6ccfa217
Add link to notebooks (#15791) · c008afea
NielsRogge authored Mar 01, 2022
```
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
```
c008afea
[Benchmark tools] Deprecate all (#15848) · 9863f7d2
Patrick von Platen authored Mar 01, 2022
```
* [Benchmark tools] Deprecate all

* up
```
9863f7d2

Add Data2Vec (#15507) · df5a4094

Eduardo Gonzalez Ponferrada authored Mar 01, 2022



* Add data2vec model cloned from roberta

* Add checkpoint conversion script

* Fix copies

* Update docs

* Add checkpoint conversion script

* Remove fairseq data2vec_text script and fix format

* Add comment on where to get data2vec_text.py

* Remove mock implementation cheat.py and fix style

* Fix copies

* Remove TF and Flax classes from init

* Add back copy from fairseq data2vec_text.py and fix style

* Update model name in docs/source/index.mdx to be CamelCase

* Revert model name in table to lower-case to get check_table test to pass

* Update src/transformers/models/data2vec/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/model_doc/data2vec.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/data2vec.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update documentation

* Copy-paste Data2VecConfig from BertConfig

* Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency

* Update config special tokens to match RoBERTa

* Split multiple assertions and add individual error messages

* Rename Data2VecModel to Data2VecForTextModel

* Add Data2Vec to _toctree.yml

* Rename Data2VecEmbeddings to Data2VecForTextEmbeddings

* Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding).

* finish audio model

* finish audio file

* Update names and fix style, quality and repo consistency

* Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files.

* add inputs to logits to data2vec'

* correct autio models

* correct config auto

* correct tok auto

* Update utils/tests_fetcher.py

* delete unnecessary files

* delete unnecessary files

* further renaming

* make all tests pass

* finish

* remove useless test file

* Update tests/test_modeling_common.py

* Update utils/check_repo.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec_text.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix copies

* Update docs

* Remove fairseq data2vec_text script and fix format

* Add comment on where to get data2vec_text.py

* Remove mock implementation cheat.py and fix style

* Fix copies

* Remove TF and Flax classes from init

* Add back copy from fairseq data2vec_text.py and fix style

* Update model name in docs/source/index.mdx to be CamelCase

* Revert model name in table to lower-case to get check_table test to pass

* Update documentation

* Update src/transformers/models/data2vec/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/configuration_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/data2vec/modeling_data2vec.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Copy-paste Data2VecConfig from BertConfig

* Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency

* Update config special tokens to match RoBERTa

* Split multiple assertions and add individual error messages

* Rename Data2VecModel to Data2VecForTextModel

* Add Data2Vec to _toctree.yml

* Rename Data2VecEmbeddings to Data2VecForTextEmbeddings

* Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding).

* finish audio model

* finish audio file

* add inputs to logits to data2vec'

* Update names and fix style, quality and repo consistency

* Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files.

* correct autio models

* correct config auto

* correct tok auto

* delete unnecessary files

* delete unnecessary files

* Update utils/tests_fetcher.py

* further renaming

* make all tests pass

* finish

* remove useless test file

* Update tests/test_modeling_common.py

* Update utils/check_repo.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/data2vec/modeling_data2vec_text.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Move data2vec tests to new structure

* Fix test imports for text tests

* Remove fairseq files

* Change paper link to arxiv

* Modify Data2Vec documentation to reflect that the encoder is not shared across the audio and text models in the current implementation.

* Update text model checkpoint to be facebook/data2vec-text-base

* Add 'Copy from' statements and update paper links and docs

* fix copy from statements

* improve copied from

* correct more copied from statements

* finish copied from stuff

* make style

* add model to README

* add to master
Co-authored-by: Eduardo Gonzalez Ponferrada <eduardo@ferrumhealth.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

df5a4094

28 Feb, 2022 1 commit

Flax Speech-Encoder-Decoder Model (#15613) · e3342edc

Sanchit Gandhi authored Feb 28, 2022



* rebase

* Delete shift tokens func

* downsample decoder input seq len for init

* correct attention mask

* add tests

* pt flax cross test

* make fixup

* init file for import

* change pt-flax cross test threshold

* pt-flax test logits only

* move tests

* make repo-consistency

* consistent indentation
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

e3342edc

25 Feb, 2022 3 commits

Add TFConvNextModel (#15750) · 84eaa6ac

Sayak Paul authored Feb 25, 2022

* feat: initial implementation of convnext in tensorflow.

* fix: sample code for the classification model.

* chore: added checked for from the classification model.

* chore: set bias initializer in the classification head.

* chore: updated license terms.

* chore: removed ununsed imports

* feat: enabled argument during using drop_path.

* chore: replaced tf.identity with layers.Activation(linear).

* chore: edited default checkpoint.

* fix: minor bugs in the initializations.

* partial-fix: tf model errors for loading pretrained pt weights.

* partial-fix: call method updated

* partial-fix: cross loading of weights (4x3 variables to be matched)

* chore: removed unneeded comment.

* removed playground.py

* rebasing

* rebasing and removing playground.py.

* fix: renaming TFConvNextStage conv and layer norm layers

* chore: added initializers and other minor additions.

* add: tests for convnext.

* fix: integration tester class.

* fix: issues mentioned in pr feedback (round 1).

* fix: how output_hidden_states arg is propoagated inside the network.

* feat: handling of arg for pure cnn models.

* chore: added a note on equal contribution in model docs.

* rebasing

* rebasing and removing playground.py.

* feat: encapsulation for the convnext trunk.

* Fix variable naming; Test-related corrections; Run make fixup

* chore: added Joao as a contributor to convnext.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: corrected copyright year and added comment on NHWC.

* chore: fixed the black version and ran formatting.

* chore: ran make style.

* chore: removed from_pt argument from test, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* fix: tests in the convnext subclass, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: moved convnext test to the correct location

* fix: locations for the test file of convnext.

* fix: convnext tests.

* chore: applied sgugger's suggestion for dealing w/ output_attentions.

* chore: added comments.

* chore: applied updated quality enviornment style.

* chore: applied formatting with quality enviornment.

* chore: revert to the previous tests/test_modeling_common.py.

* chore: revert to the original test_modeling_common.py

* chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py

* fix: tests for convnext.

* chore: removed output_attentions argument from convnext config.

* chore: revert to the earlier tf utils.

* fix: output shapes of the hidden states

* chore: removed unnecessary comment

* chore: reverting to the right test_modeling_tf_common.py.

* Styling nits
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

84eaa6ac

Re-enable doctests for the quicktour (#15828) · 0118c4f6

Sylvain Gugger authored Feb 25, 2022

* Re-enable doctests for the quicktour

* Re-enable doctests for task_summary (#15830)

* Remove &

0118c4f6

Add model specific output classes to PoolFormer model docs (#15746) · 7566734d
Tanay Mehta authored Feb 25, 2022
```
* Added model specific output classes to poolformer docs

* Fixed Segformer typo in Poolformer docs
```
7566734d

23 Feb, 2022 1 commit

🧼 NLP task guides (#15731) · fecb08c2

Steven Liu authored Feb 23, 2022

* clean commit of changes to NLP tasks

* 🖍 apply feedback

* 📝

 move tf data collator in multiple choice
Co-authored-by: Steven <stevhliu@gmail.com>

fecb08c2