Commits · aebca696afb511013d71fb6da5c162d945fb31d4 · chenpangpang / transformers

29 Mar, 2022 1 commit

Fix missing output_attentions in PT/Flax equivalence test (#16271) · aebca696

Yih-Dar authored Mar 29, 2022



* fix - set output_attentions to True

* Update tests/test_modeling_flax_common.py

* update for has_attentions

* overwrite check_outputs in FlaxBigBirdModelTest
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

aebca696

28 Mar, 2022 4 commits

Add DPT (#15991) · 979b039c

NielsRogge authored Mar 28, 2022



* First draft

* More improvements

* Add fusion blocks

* Make conversion script work for dpt_large

* Make conversion script work

* Improve implementation

* Improve conversion script

* Add DPTForSemanticSegmentation

* Make conversion work for semantic segmentation

* Add tests

* Remove print statements

* First draft

* Redesign neck

* Improve tests

* Improve implementation some more

* Make neck output list of tensors

* Improve neck and feature extractor

* Fix integration tests

* Make more tests pass

* Make all tests pass

* Add missing config archive map

* Add in_index attribute to make heads accept list of tensors

* Apply suggestions from code review

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply some more suggestions

* Add copied from statements

* Remove assert

* Apply suggestions from code review

* Apply suggestions from code review

* Remove DPTInterpolate in favor of nn.Upsample

* Add comments

* Apply suggestions from code review

* Apply suggestions from code review

* Add proposed design

* Update design

* Add DPTReassembleLayer

* Add DPTFeatureFusionStage

* Apply more suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Fix rebase

* Update in_index and out_indices

* Fix conversion script

* Fix code quality

* Add model to toctree and use DepthEstimatorOutput

* Fix rebase

* Fix code examples

* Improve code

* Fix copied from statements

* Apply suggestions from code review

* Remove compute_loss method

* Apply suggestions from code review

* Fix documentation tests file

* Remove test.py file

* Improve doc example
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>

979b039c

[FlaxSpeechEncoderDecoderModel] Ensure Input and Output Word Embeddings Are **Not** Tied (#16444) · 7ca46335
Sanchit Gandhi authored Mar 28, 2022
```
* [FlaxSpeechEncoderDecoderModel] Ensure Input and Output Word Embeddings Are **Not** Tied

* rebase
```
7ca46335
Fix PerceiverMLP and test (#16405) · e0ac72b7
Jaesun Park authored Mar 28, 2022
```
Co-authored-by: Jaesun Park <jaesun.park1@navercorp.com>
```
e0ac72b7
[Flax] Improve Robustness of Back-Prop Tests (#16418) · 925fc57b
Sanchit Gandhi authored Mar 28, 2022
```
* [Flax] Improve Robustness of Back-Prop Tests

* check equality of logits/outputs

* make fixup
```
925fc57b

25 Mar, 2022 4 commits

Add TF implementation of GPT-J (#15623) · ed2ee373

Daniel Stancl authored Mar 25, 2022

* Initial commit

* Add TFGPTJModel

* Fix a forward pass

* Add TFGPTJCausalLM

* Add TFGPTJForSequenceClassification

* Add TFGPTJForQuestionAnswering

* Fix docs

* Deal with TF dynamic shapes

* Add Loss parents to models

* Adjust split and merge heads to handle 4 and 5-dim tensors

* Update outputs for @tooslow tests

ed2ee373

[FlaxSpeechEncoderDecoder] Fix feature extractor gradient test (#16407) · e231c729
Sanchit Gandhi authored Mar 25, 2022

e231c729

Add ONNX support for Blenderbot and BlenderbotSmall (#15875) · a97f3150

lewtun authored Mar 25, 2022

* Add ONNX support for Blenderbot

* Add BlenderbotSmall ONNX configuration

* Update serialization table

a97f3150

Checkpoint sharding (#16343) · b473617d

Sylvain Gugger authored Mar 25, 2022



* Sharded checkpoint support

* Handle distant sharded checkpoints

* Add tests

* TODO is done

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix docstring

* Add example and format

* Address review comments

* More review comments

* End of merge

* Revert unintentional change

* VsCode what did you do?

* Style

* Changes

* Address final comments

* Quality

* Moar tests

* Move import beneath is_pt_available
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

b473617d

24 Mar, 2022 3 commits

Adapt import to new structure · cae394c8
Sylvain Gugger authored Mar 24, 2022

cae394c8

Update PT Flax equivalence tests in PT test file (#16280) · f571dc20

Yih-Dar authored Mar 24, 2022



* update PT/Flax equivalence tests on PT side

* overwrite check_outputs in BigBirdModelTest
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

f571dc20

Fix BigBirdModelTester (#16310) · 2a27c800

Yih-Dar authored Mar 24, 2022



* fix

* update the expected value in test_fast_integration
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

2a27c800

23 Mar, 2022 5 commits

Decision transformer gym (#15845) · aff9bc40

Edward Beeching authored Mar 23, 2022



* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Add type hints for Pegasus (#16324)

* Funnel type hints (#16323)

* add pt funnel type hints

* add tf funnel type hints

* Add type hints for ProphetNet PyTorch (#16272)

* [GLPN] Improve docs (#16331)

* Add link to notebook

* Add link

* Fix bug
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Added type hints for Pytorch Marian calls (#16200)

* Added type hinting for forward functions in pytorch marian

* typo correction

* Removed type hints on functions from BART per Suraj Patil request

* fix import pb

* fix typo

* corrected tuple call

* ran black

* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List

* Fixing copies to roformer and pegasus
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* done (#16340)

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add type annotations for Rembert/Splinter and copies (#16338)

* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`
Co-authored-by: matt <rocketknight1@gmail.com>

* [Bug template] Shift responsibilities for long-range (#16344)

* Fix code repetition in serialization guide (#16346)

* Adopt framework-specific blocks for content (#16342)

* ✨ refactor code samples with framework-specific blocks

* ✨ update training.mdx

* 🖍

 apply feedback

* Updates the default branch from master to main (#16326)

* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Updated copies, config auto, and readme files.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
Co-authored-by: Adam Montgomerie <adam@avanssion.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

aff9bc40

Make Transformers use cache files when hf.co is down (#16362) · c595b6e6

Sylvain Gugger authored Mar 23, 2022

* Make Transformers use cache files when hf.co is down

* Fix tests

* Was there a random circleCI failure?

* Isolate patches

* Style

* Comment out the failure since it doesn't fail anymore

* Better comment

c595b6e6

TF - Fix interchangeable past/past_key_values and revert output variable name in GPT2 (#16332) · 9e8c37dc
Joao Gante authored Mar 23, 2022
```
* revert tf gpt2

* add test for unpack_inputs and fix test case

* add changes to vision encoder decoder
```
9e8c37dc

Reorganize file utils (#16264) · 4975002d

Sylvain Gugger authored Mar 23, 2022

* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wront move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit

4975002d

Updates the default branch from master to main (#16326) · eca77f47

Lysandre Debut authored Mar 23, 2022



* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

eca77f47

22 Mar, 2022 1 commit

Add GLPN (#16199) · 0c55d47c

NielsRogge authored Mar 22, 2022



* First draft

* Fix logits calculation

* Improve tests

* Add copied from statements

* Fix base_model_prefix

* Improve implementation, upload new models

* Update design

* Fix integration test

* Add model to README and toctree

* Add document image

* Apply suggestions from code review

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add decoder_hidden_size attribute

* Update design of decoder

* Add DepthEstimatorOutput class

* Rename in_index to head_in_index and add feature extractor tests

* Apply suggestions from code review

* Apply suggestions from code review

* Update pretrained model name and add to doc tests

* Remove test.py script

* Update copied from statements and clean up
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

0c55d47c

19 Mar, 2022 1 commit
- Add has_attentions to TFModelTesterMixin as done on PyTorch side (#16259) · f4669364
  Yih-Dar authored Mar 19, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  f4669364
18 Mar, 2022 5 commits

Aggressive PT/TF equivalence test on PT side (#16250) · 75c666b4

Yih-Dar authored Mar 18, 2022



* Aggressive PT/TF equivalence test on PT side

* Ugly fix for `TFTapasForQuestionAnswering`

* apply review suggestions
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

75c666b4

Make Flax pt-flax equivalence test more aggressive (#15841) · d481b641

Yih-Dar authored Mar 18, 2022



* Make test_equivalence_pt_to_flax more aggressive

* Make test_equivalence_flax_to_pt more aggressive

* don't use to_tuple

* clean-up

* fix missing test cases + testing on GPU

* fix conversion

* fix `ValueError: assignment destination is read-only`

* Add type checking

* commit to revert later

* Fix

* fix

* fix device

* better naming

* clean-up
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

d481b641

update jax version and re-enable some tests (#16254) · b25b92ac
Suraj Patil authored Mar 18, 2022

b25b92ac

Attention mask is important in the case of batching... (#16222) · ecb4662d

Nicolas Patry authored Mar 18, 2022

* Attention mask is important in the case of batching...

* Improve the fix.

* Making the sentence different enough that they exhibit different
predictions.

ecb4662d

Update expected slices for pillow > 9 (#16117) · ec4e421b

NielsRogge authored Mar 18, 2022



* Update expected slices for pillow > 9

* Add expected slices depending on pillow version

* Add different slices depending on pillow version for other models
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

ec4e421b

17 Mar, 2022 6 commits

[FlaxSpeechEncoderDecoderModel] Skip from_encoder_decoder_pretrained (#16236) · 632ff3c3
Suraj Patil authored Mar 17, 2022
```
* skip the test

* fix

* fix skip
```
632ff3c3

Support PEP 563 for HfArgumentParser (#15795) · 81643edd

罗崚骁(LUO Lingxiao) authored Mar 18, 2022



* Support PEP 563 for HfArgumentParser

* Fix issues for Python 3.6

* Add test for string literal annotation for HfArgumentParser

* Remove wrong comment

* Fix typo

* Improve code readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Use `isinstance` to compare types to pass quality check

* Fix style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

81643edd

Skip equivalence test for TransfoXL (#16224) · 5a6b3ccd
Lysandre Debut authored Mar 17, 2022
```
* Skip test for TransfoXL

* Single list
```
5a6b3ccd
update test (#16219) · d9b8d1a9
Francesco Saverio Zuppichini authored Mar 17, 2022

d9b8d1a9

[Tests] Fix DiT test (#16218) · 03c14a51

NielsRogge authored Mar 17, 2022



* Fix device

* Clean up
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

03c14a51

Fixes Loss for TransfoXL when using Trainer API v2 (#16140) · 73f0a5d1

Lysandre Debut authored Mar 17, 2022



* fix(transfo_xl): Fixes TransfoXL support when using Trainer.

* fix(tests): Uses losses_1 and losses_2 pattern with TransfoXL test.

* fix(transfo_xl): Adds requested changes to allow for backward compatibility.

fix(transfo_xl): Adds requested changes to allow for backward compatibility.

fix(transfo_xl): Fixes code styling.

* Backward compatibility

* Update src/transformers/models/transfo_xl/modeling_transfo_xl.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Gustavo de Rosa <gth.rosa@uol.com.br>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

73f0a5d1

16 Mar, 2022 5 commits

Fix generation min length (#16206) · 2410d0f8
Patrick von Platen authored Mar 16, 2022
```
* up

* fix min lengths
```
2410d0f8

Swin support for any input size (#15986) · 667b823b

Francesco Saverio Zuppichini authored Mar 16, 2022



* padding done

* correctly return one attention per layer

* almost correct, attentions are not flatten one tuple per stage

* tests green

* doc

* conversations

* reshaping hidden_states

* view in the test

* reshape_hidden_states in Encoder and Model

* new outputs with reshaped_hidden_states

* conversations

* doc

* Update docs/source/model_doc/swin.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* fix tests

* minor changes

* resolved conversations

* attentions one per stage

* typo

* typos

* typos

* function signature

* CI

* clean up tests
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

667b823b

TF: add beam search tests (#16202) · 204c54d4
Joao Gante authored Mar 16, 2022

204c54d4
Fix loading CLIPVisionConfig and CLIPTextConfig (#16198) · 19099457
Suraj Patil authored Mar 16, 2022
```
* override from_pretrained

* add tests

* remove docstrings

* fix typo

* Trigger CI
```
19099457
Replace all deprecated `jax.ops` operations with jnp's `at` (#16078) · ee27b3d7
Sanchit Gandhi authored Mar 16, 2022
```
* Replace all deprecated `jax.ops` operations with jnp's `at`

* np to jnp scores

* suggested changes
```
ee27b3d7

15 Mar, 2022 3 commits

TF XLA greedy generation (#15786) · cd4c5c90

Matt authored Mar 15, 2022



* First attempt at TF XLA generation

* Fix comments

* Update XLA greedy generate with direct XLA calls

* Support attention mask, prepare_inputs_for_generation no longer hardcoded for greedy

* Handle position_ids correctly

* make xla generate work for non xla case

* force using xla generate

* refactor

* more fixes

* finish cleaning

* finish

* finish

* clean gpt2 tests

* add gpt2 tests

* correct more cases

* up

* finish

* finish

* more fixes

* flake 8 stuff

* final rag fix

* Update src/transformers/models/rag/modeling_tf_rag.py

* finish t5 as well

* finish

* Update src/transformers/generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

cd4c5c90

Improve Swin for VisionEncoderDecoder (#16070) · a7aca42f

NielsRogge authored Mar 15, 2022



* Add Swin2Bart test

* Fix swin tests
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

a7aca42f

Visual Attention Network (VAN) (#16027) · 0a057201

Francesco Saverio Zuppichini authored Mar 15, 2022



* encoder works

* addded files

* norm in stage

* convertion script

* tests

* fix copies

* make fix-copies

* fixed __init__

* make fix-copies

* fix

* shapiro test needed

* make fix-copie

* minor changes

* make style + quality

* minor refactor conversion script

* rebase + tests

* removed unused variables

* updated doc

* toctree

* CI

* doc

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolved conversations

* make fixup

* config passed to modules

* config passed to modules

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversations

* conversations

* copyrights

* normal test

* tests
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

0a057201

14 Mar, 2022 2 commits

[WIP] Resnet (#15770) · e3008c67

Francesco Saverio Zuppichini authored Mar 14, 2022



* first commit

* ResNet model correctly implemented.

basic modeling + weights conversion is done

removed unused doc

mdx file

doc and conversion script

added feature_extractor to auto

test

minor changes + style + quality

doc

test

Delete process.yml

A left over from my attempt of running circleci locally

* minor changes

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* new test format

* minor changes from conversations

* minor changes from conversations

* make style + quality

* readded the tests

* test + README

* minor changes from conversations

* error in README

* make fix-copies

* removed regression for classification head

* make quality

* fixed loss control flow

* fixed loss control flow

* resolved conversations

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* READMEs

* index.mdx

* minor changes

* updated tests and models

* unused import

* outputs

* Update docs/source/model_doc/resnet.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added embeddings_size

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* conversation

* added push to hub

* test

* embedding_size

* make fix-copies

* resolved conversations

* CI

* changed organization

* minor changes

* CI

* minor changes

* conversations

* conversation

* doc

* tests

* removed unused docstring

* conversation

* removed unused outputs

* CI
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

e3008c67

Make TF pt-tf equivalence test more aggressive (#15839) · 923c35b5

Yih-Dar authored Mar 14, 2022



* Make TF pt-tf equivalence test more aggressive

* Fix for TFConvNextModelTest and TFTransfoXLModelTest

* fix kwargs for outputs

* clean-up

* Add docstring for check_outputs()

* remove: need to rename encoder-decoder

* clean-up

* send PyTorch things to the correct device

* Add back the accidentally removed test case in test_pt_tf_model_equivalence()

* Fix: change to tuple before calling check_outputs()

* Fix: tfo could be a list

* use to_tuple()

* allow tfo only to be tuple or tensor

* allow tfo to be list or tuple for now + style change

* minor fix

* remove np.copy and update comments

* tfo -> tf_output, same for pt

* Add more detailed comment

* remove the incorrect comment
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

923c35b5