Commits · 132402d752044301b37e54405832738b16f49df6 · chenpangpang / transformers

"vscode:/vscode.git/clone" did not exist on "a279d3561a37d0d7ae714b979776b3c8f77bcd9d"

20 Jun, 2022 1 commit
- TF: BART compatible with XLA generation (#17479) · 132402d7
  Joao Gante authored Jun 20, 2022
```
* Also propagate changes to blenderbot, blenderbot_small, marian, mbart, and pegasus
```
  132402d7
09 Jun, 2022 2 commits
- Use shape_list to safely get shapes for Swin (#17591) · 9fc34235
  amyeroberts authored Jun 09, 2022
```
* Use shape_list to safely get shapes

* Add relevant test

* Tidy and add metrics

* Resolve dynamic shaping issues and move test

* Tidy up and all samples in batch

* Formatting
```
  9fc34235
- has_attentions - consistent test skipping logic and tf tests (#17495) · dfc76b25
  amyeroberts authored Jun 09, 2022
  
  dfc76b25
06 Jun, 2022 1 commit

Add magic method to our TF models to convert datasets with column inference (#17160) · 19a8a303

Matt authored Jun 06, 2022



* Add method to call to_tf_dataset() with column inference

* Add test for dataset creation

* Add a default arg for data collator

* Fix test

* Fix call with non-dev version of datasets

* Test correct column removal too

* make fixup

* More tests to make sure we remove unwanted columns

* Fix test to avoid predicting on unbuilt models

* Fix test to avoid predicting on unbuilt models

* Fix test to remove unwanted head mask columns from inputs

* Stop pushing your debug breakpoints to the main repo of the $2bn company you work for

* Skip the test in convnext because no grouped conv support

* Drop bools from the dataset dict

* Make style

* Skip the training test for models whose input dicts don't give us labels

* Skip transformerXL in the test because it doesn't return a simple loss

* Skip TFTapas because of some odd NaN losses

* make style

* make fixup

* Add docstring

* fixup

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove breakpoint from tests

* Fix assert, add requires_backends

* Protect tokenizer import with if TYPE_CHECKING

* make fixup

* Add noqa, more fixup

* More rearranging for ~* aesthetics *~

* Adding defaults for shuffle and batch_size to match to_tf_dataset()

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

19a8a303

17 May, 2022 1 commit

Rewrite TensorFlow train_step and test_step (#17057) · 349f1c85

Matt authored May 17, 2022

* Initial commit

* Better label renaming

* Remove breakpoint before pushing (this is your job)

* Test a lot more in the Keras fit() test

* make fixup

* Clarify the case where we flatten y dicts into tensors

* Clarify the case where we flatten y dicts into tensors

* Extract label name remapping to a method

349f1c85

12 May, 2022 2 commits

Black preview (#17217) · afe5d42d

Sylvain Gugger authored May 12, 2022

* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black

afe5d42d

Add test to ensure models can take int64 inputs (#17210) · f04257fd

Matt authored May 12, 2022

* Add test to ensure models can take int64 inputs

* is_integer is an attribute, not a method

* Fix test when some inputs aren't tensors

* Add casts to blenderbot and blenderbot-small

* Add casts to the other failing models

f04257fd

25 Apr, 2022 1 commit
- TF: XLA stable softmax (#16892) · e03966e4
  Joao Gante authored Apr 25, 2022
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
  e03966e4
19 Apr, 2022 1 commit

Improve test_pt_tf_model_equivalence on PT side (#16731) · e6d23a4b

Yih-Dar authored Apr 19, 2022



* Update test_pt_tf_model_equivalence on PT side
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

e6d23a4b

11 Apr, 2022 1 commit

Improve PT/TF equivalence test (#16557) · dce33f21

Yih-Dar authored Apr 11, 2022



* add error message

* Use names in the error message

* allow ModelOutput

* rename to check_pt_tf_outputs and move outside

* fix style

* skip past_key_values in a better way

* Add comments

* improve code for label/loss

* make the logic clear by moving the ignore keys out

* fix _postprocessing_to_ignore

* fix _postprocessing_to_ignore: create new outputs from the remaining fields

* ignore past_key_values in TFGPT2 models for now

* make check_pt_tf_outputs better regarding names

* move check_pt_tf_models outside

* rename methods

* remove test_pt_tf_model_equivalence in TFCLIPModelTest

* Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence

* move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models

* Fix quality

* Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence

* Fix quality

* fix

* fix style

* Clean-up TFLEDModelTest.test_pt_tf_model_equivalence

* Fix quality

* add docstring

* improve comment
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

dce33f21

06 Apr, 2022 1 commit

TF generate refactor - Beam Search (#16374) · 3f43d824

Joao Gante authored Apr 06, 2022

* refactor TF beam search

* refactored generate can now properly use attention masks

* add force bos/eos logit processors

3f43d824

05 Apr, 2022 1 commit

Adding new train_step logic to make things less confusing for users (#15994) · 43540052

Matt authored Apr 05, 2022



* Adding new train_step logic to make things less confusing for users

* DO NOT ASK WHY WE NEED THAT SUBCLASS

* Metrics now working, at least for single-output models with type annotations!

* Updates and TODOs for the new train_step

* Make fixup

* Temporary test workaround until T5 has types

* Temporary test workaround until T5 has types

* I think this actually works! Needs a lot of tests though

* MAke style/quality

* Revert changes to T5 tests

* Deleting the aforementioned unmentionable subclass

* Deleting the aforementioned unmentionable subclass

* Adding a Keras API test

* Style fixes

* Removing unneeded TODO and comments

* Update test_step too

* Stop trying to compute metrics with the dummy_loss, patch up test

* Make style

* make fixup

* Docstring cleanup

* make fixup

* make fixup

* Stop expanding 1D input tensors when using dummy loss

* Adjust T5 test given the new compile()

* make fixup

* Skipping test for convnext

* Removing old T5-specific Keras test now that we have a common one

* make fixup

* make fixup

* Only skip convnext test on CPU

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Avoiding TF import issues

* make fixup

* Update compile() to support TF 2.3

* Skipping model.fit() on template classes for now

* Skipping model.fit() on template class tests for now

* Replace ad-hoc solution with find_labels

* make fixup
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

43540052

04 Apr, 2022 1 commit
- TF: Finalize `unpack_inputs`-related changes (#16499) · dad5ca83
  Joao Gante authored Apr 04, 2022
```
* Add unpack_inputs to remaining models

* removed kwargs to `call()` in TF models

* fix TF T5 tests
```
  dad5ca83
01 Apr, 2022 1 commit

Use random_attention_mask for TF tests (#16517) · 2199382d

Yih-Dar authored Apr 01, 2022



* use random_attention_mask for TF tests

* Fix for TFCLIP test (for now).
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

2199382d

23 Mar, 2022 2 commits

Make Transformers use cache files when hf.co is down (#16362) · c595b6e6

Sylvain Gugger authored Mar 23, 2022

* Make Transformers use cache files when hf.co is down

* Fix tests

* Was there a random circleCI failure?

* Isolate patches

* Style

* Comment out the failure since it doesn't fail anymore

* Better comment

c595b6e6

TF - Fix interchangeable past/past_key_values and revert output variable name in GPT2 (#16332) · 9e8c37dc
Joao Gante authored Mar 23, 2022
```
* revert tf gpt2

* add test for unpack_inputs and fix test case

* add changes to vision encoder decoder
```
9e8c37dc

19 Mar, 2022 1 commit
- Add has_attentions to TFModelTesterMixin as done on PyTorch side (#16259) · f4669364
  Yih-Dar authored Mar 19, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  f4669364
17 Mar, 2022 1 commit
- Skip equivalence test for TransfoXL (#16224) · 5a6b3ccd
  Lysandre Debut authored Mar 17, 2022
```
* Skip test for TransfoXL

* Single list
```
  5a6b3ccd
14 Mar, 2022 1 commit

Make TF pt-tf equivalence test more aggressive (#15839) · 923c35b5

Yih-Dar authored Mar 14, 2022



* Make TF pt-tf equivalence test more aggressive

* Fix for TFConvNextModelTest and TFTransfoXLModelTest

* fix kwargs for outputs

* clean-up

* Add docstring for check_outputs()

* remove: need to rename encoder-decoder

* clean-up

* send PyTorch things to the correct device

* Add back the accidentally removed test case in test_pt_tf_model_equivalence()

* Fix: change to tuple before calling check_outputs()

* Fix: tfo could be a list

* use to_tuple()

* allow tfo only to be tuple or tensor

* allow tfo to be list or tuple for now + style change

* minor fix

* remove np.copy and update comments

* tfo -> tf_output, same for pt

* Add more detailed comment

* remove the incorrect comment
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

923c35b5

02 Mar, 2022 1 commit

TF generate refactor - Sample (#15793) · baab5e7c

Joao Gante authored Mar 02, 2022



* Add TF logits wrappers 

* Add sample method

* add tests for TF logit wrappers

* TF generate sample tests now run on CPU
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

baab5e7c

25 Feb, 2022 1 commit

Add TFConvNextModel (#15750) · 84eaa6ac

Sayak Paul authored Feb 25, 2022

* feat: initial implementation of convnext in tensorflow.

* fix: sample code for the classification model.

* chore: added checked for from the classification model.

* chore: set bias initializer in the classification head.

* chore: updated license terms.

* chore: removed ununsed imports

* feat: enabled argument during using drop_path.

* chore: replaced tf.identity with layers.Activation(linear).

* chore: edited default checkpoint.

* fix: minor bugs in the initializations.

* partial-fix: tf model errors for loading pretrained pt weights.

* partial-fix: call method updated

* partial-fix: cross loading of weights (4x3 variables to be matched)

* chore: removed unneeded comment.

* removed playground.py

* rebasing

* rebasing and removing playground.py.

* fix: renaming TFConvNextStage conv and layer norm layers

* chore: added initializers and other minor additions.

* add: tests for convnext.

* fix: integration tester class.

* fix: issues mentioned in pr feedback (round 1).

* fix: how output_hidden_states arg is propoagated inside the network.

* feat: handling of arg for pure cnn models.

* chore: added a note on equal contribution in model docs.

* rebasing

* rebasing and removing playground.py.

* feat: encapsulation for the convnext trunk.

* Fix variable naming; Test-related corrections; Run make fixup

* chore: added Joao as a contributor to convnext.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: corrected copyright year and added comment on NHWC.

* chore: fixed the black version and ran formatting.

* chore: ran make style.

* chore: removed from_pt argument from test, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* fix: tests in the convnext subclass, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: moved convnext test to the correct location

* fix: locations for the test file of convnext.

* fix: convnext tests.

* chore: applied sgugger's suggestion for dealing w/ output_attentions.

* chore: added comments.

* chore: applied updated quality enviornment style.

* chore: applied formatting with quality enviornment.

* chore: revert to the previous tests/test_modeling_common.py.

* chore: revert to the original test_modeling_common.py

* chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py

* fix: tests for convnext.

* chore: removed output_attentions argument from convnext config.

* chore: revert to the earlier tf utils.

* fix: output shapes of the hidden states

* chore: removed unnecessary comment

* chore: reverting to the right test_modeling_tf_common.py.

* Styling nits
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

84eaa6ac

15 Feb, 2022 1 commit

TF generate refactor - Greedy Search (#15562) · 2e12b907

Patrick von Platen authored Feb 15, 2022



* TF generate start refactor

* Add tf tests for sample generate

* re-organize

* boom boom

* Apply suggestions from code review

* re-add

* add all code

* make random greedy pass

* make encoder-decoder random work

* further improvements

* delete bogus file

* make gpt2 and t5 tests work

* finish logits tests

* correct logits processors

* correct past / encoder_outputs drama

* refactor some methods

* another fix

* refactor shape_list

* fix more shape list

* import shape
_list

* finish docs

* fix imports

* make style

* correct tf utils

* Fix TFRag as well

* Apply Lysandre's and Sylvais suggestions

* Update tests/test_generation_tf_logits_process.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/tf_utils.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* remove cpu according to gante

* correct logit processor
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

2e12b907

08 Feb, 2022 1 commit

Add TFSpeech2Text (#15113) · 8406fa6d

Joao Gante authored Feb 08, 2022

* Add wrapper classes

* convert inner layers to tf

* Add TF Encoder and Decoder layers

* TFSpeech2Text models

* Loadable model

* TF model with same outputs as PT model

* test skeleton

* correct tests and run the fixup

* correct attention expansion

* TFSpeech2Text pask_key_values with TF format

8406fa6d

01 Feb, 2022 1 commit
- remove "inputs" in tf common test script (no longer required) (#15262) · af5c3329
  Yih-Dar authored Feb 01, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  af5c3329
19 Jan, 2022 1 commit

Rename compute_loss in TF models (#15207) · 2708bfa1

Matt authored Jan 19, 2022

* Rename compute_loss to hf_compute_loss to avoid conflicts with the new Keras method

* make style

* Adding deprecation warning to `compute_loss`

* Fix sneaky reference to compute_loss

* Replace logger.warning with warnings.warn

* Clarifying warning and deprecation timeline

2708bfa1

18 Jan, 2022 2 commits
- Revert previous change - that was meant to be in a branch! · 1a354d53
  matt authored Jan 18, 2022
  
  1a354d53
- Fix a sneaky reference to compute_loss in the tests · 2085f209
  matt authored Jan 18, 2022
  
  2085f209
14 Jan, 2022 2 commits
- update from keras2onnx to tf2onnx (#15162) · ebc4edfe
  Joao Gante authored Jan 14, 2022
  
  ebc4edfe
- TF Bert inference - support `np.ndarray` optional arguments (#15074) · 7d9a33fb
  Joao Gante authored Jan 14, 2022
```
* TF Bert inference - support np.ndarray optional arguments

* apply np input tests to all TF architectures
```
  7d9a33fb
23 Dec, 2021 1 commit

Add TFCLIPModel (#13967) · 8f2cc1c3

Yih-Dar authored Dec 23, 2021



* Start the work for TFCLIPModel

* Convert to TF code (TODO: loss + doc)

* Clean up

* Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd

* assert -> raise error

* Expose TFCLIPModel

* Deal with dummy_inputs

* Add tests

* Fix all tests. TODO: manual check weight loading + add more comments

* Fix pt tf equivalence test

* fixes

* update TFCLIPVisionEmbeddings's Conv2D

* Fix loss + overwrite test_pt_tf_model_equivalence from common

* Add a comment about the change about MainLayer in test_keras_save_load

* Set return_loss=True in TFCLIPModelTester + make tests pass

* overwrite test_pt_tf_model_equivalence from tf common

* fix base_model_prefix

* Fix examples

* remove unused

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply review suggestions

* change self.pre_layrnorm to self.pre_layernorm

* apply more review suggestions

* return attention probs before dropout (to align with PT)

* fix weight init

* fix

* build doc

* fix missing doc

* fix for test
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

8f2cc1c3

20 Dec, 2021 1 commit

Add a main_input_name attribute to all models (#14803) · 33f36c86

Sylvain Gugger authored Dec 20, 2021



* Add a main_input_name attribute to all models

* Fix tests

* Wtf Vs Code?

* Update src/transformers/models/imagegpt/modeling_imagegpt.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Style

* Fix copies
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

33f36c86

15 Dec, 2021 1 commit

TF model cards (#14720) · 48d48276

Matt authored Dec 15, 2021

* Initial commit for Keras model cards

* Revert accidental change

* make style

* make style

* make style

* Fix PR comments

* Move repo creation to __init__

* Fixes to README.md creation

* Partial progress for proper card creation on `push_to_hub`

* Proper card creation from `push_to_hub` plus fixes for malformed model cards

* Fixes for model card creation outside the callback

* Adding a model card creation test

* Putting the model card creation test in the right file.
Good job, Matt.

* make style

* Fix model card test temp dir usage

* Fix model card creation when no optimizer present

* Fixes for when training history not present

* Fix accidental edit to test_modeling_common

48d48276

17 Nov, 2021 1 commit

[WIP] Ensure TF model configs can be converted to proper JSON (#14415) · 1991da07

N authored Nov 17, 2021



* test: make sure model configs are jsonifiable

* fix: return python dict instead of config object

* fix: accept pretrained config and use correct class

* Re-enabling slow tests and applying them to core models only

* Re-enabling slow tests and applying them to core models only

* Add new test file to fetcher

* Remove tooslow tests from test_modeling_tf_common.py

* make style

* Style fixes

* Style fixes

* Style fixes

* Style fixes

* Adding core tests to GPT2 and BART

* Removing unused imports
Co-authored-by: niklas.fruehauf <niklas.fruehauf@sovanta.com>
Co-authored-by: matt <rocketknight1@gmail.com>

1991da07

11 Nov, 2021 1 commit

Experimenting with adding proper get_config() and from_config() methods (#14361) · 4c35c8d8

Matt authored Nov 11, 2021

* Experimenting with adding proper get_config() and from_config() methods

* Adding a test for get/from config

* Fix test for get/from config

4c35c8d8

09 Nov, 2021 1 commit

Add TFViTModel (#13778) · be4a6c64

Yih-Dar authored Nov 09, 2021



* Start the work for TFViTModel

* Convert to TF code - need to check in the follow up commits

* Clean up model code

* Expose TFViTModel

* make style

* make quality

* Add test

* make style & quality

* Fix some imports

* fix wrong usage - *kwargs => ** kwargs

* Fix Conv2D weight loading (PT->TF) issue

* Add tests for images with different sizes + fix model

* Fix some common tests for TFViTModel

* Use inputs instead of input_ids in test_compile_tf_model

* Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name

* Avoid transpose in TFViT call

* Fix Conv2D issue in load_tf2_weights_in_pytorch_model

* Use tf.keras.layers.Conv2D instead of tf.nn.conv2d

* Using simpler heuristic to detect Conv2D layer

* Change convert_tf_weight_name_to_pt_weight_name to return TransposeType

* Check tf_weight_shape is not None before using it

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix missing comma

* fix input dtype
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

be4a6c64

02 Nov, 2021 1 commit
- Update Transformers to huggingface_hub >= 0.1.0 (#14251) · 558f8543
  Sylvain Gugger authored Nov 02, 2021
```
* Update Transformers to huggingface_hub >= 0.1.0

* Forgot to save...

* Style

* Fix test
```
  558f8543
25 Oct, 2021 1 commit

Add TF<>PT and Flax<>PT everywhere (#14047) · 0c3174c7

Patrick von Platen authored Oct 25, 2021

* up

* up

* up

* up

* up

* up

* up

* add clip

* fix clip PyTorch

* fix clip PyTorch

* up

* up

* up

* up

* up

* up

* up

0c3174c7

21 Oct, 2021 1 commit

Fix ignore_mismatched_sizes (#14085) · 234cfefb

Li-Huai (Allan) Lin authored Oct 22, 2021

* Fix

* Style

* Name

* Fix tests

* Style

* Remove embed sizes checking

* Disable some tests

* Fix

* Apply suggestion

234cfefb

12 Oct, 2021 1 commit

Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06

Yih-Dar authored Oct 13, 2021



* Add cross attentions to TFGPT2Model

* Add TFEncoderDecoderModel

* Add TFBaseModelOutputWithPoolingAndCrossAttentions

* Add cross attentions to TFBertModel

* Fix past or past_key_values argument issue

* Fix generation

* Fix save and load

* Add some checks and comments

* Clean the code that deals with past keys/values

* Add kwargs to processing_inputs

* Add serving_output to TFEncoderDecoderModel

* Some cleaning + fix use_cache value issue

* Fix tests + add bert2bert/bert2gpt2 tests

* Fix more tests

* Ignore crossattention.bias when loading GPT2 weights into TFGPT2

* Fix return_dict_in_generate in tf generation

* Fix is_token_logit_eos_token bug in tf generation

* Finalize the tests after fixing some bugs

* Fix another is_token_logit_eos_token bug in tf generation

* Add/Update docs

* Add TFBertEncoderDecoderModelTest

* Clean test script

* Add TFEncoderDecoderModel to the library

* Add cross attentions to TFRobertaModel

* Add TFRobertaEncoderDecoderModelTest

* make style

* Change the way of position_ids computation

* bug fix

* Fix copies in tf_albert

* Remove some copied from and apply some fix-copies

* Remove some copied

* Add cross attentions to some other TF models

* Remove encoder_hidden_states from TFLayoutLMModel.call for now

* Make style

* Fix TFRemBertForCausalLM

* Revert the change to longformer + Remove copies

* Revert the change to albert and convbert + Remove copies

* make quality

* make style

* Add TFRembertEncoderDecoderModelTest

* make quality and fix-copies

* test TFRobertaForCausalLM

* Fixes for failed tests

* Fixes for failed tests

* fix more tests

* Fixes for failed tests

* Fix Auto mapping order

* Fix TFRemBertEncoder return value

* fix tf_rembert

* Check copies are OK

* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined

* Add TFEncoderDecoderModelSaveLoadTests

* fix tf weight loading

* check the change of use_cache

* Revert the change

* Add missing test_for_causal_lm for TFRobertaModelTest

* Try cleaning past

* fix _reorder_cache

* Revert some files to original versions

* Keep as many copies as possible

* Apply suggested changes - Use raise ValueError instead of assert

* Move import to top

* Fix wrong require_torch

* Replace more assert by raise ValueError

* Add test_pt_tf_model_equivalence (the test won't pass for now)

* add test for loading/saving

* finish

* finish

* Remove test_pt_tf_model_equivalence

* Update tf modeling template

* Remove pooling, added in the prev. commit, from MainLayer

* Update tf modeling test template

* Move inputs["use_cache"] = False to modeling_tf_utils.py

* Fix torch.Tensor in the comment

* fix use_cache

* Fix missing use_cache in ElectraConfig

* Add a note to from_pretrained

* Fix style

* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt

* Fix TFMLP (in TFGPT2) activation issue

* Fix None past_key_values value in serving_output

* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub

* Apply review suggestions - style for cross_attns in serving_output

* Apply review suggestions - change assert + docstrings

* break the error message to respect the char limit

* deprecate the argument past

* fix docstring style

* Update the encoder-decoder rst file

* fix Unknown interpreted text role "method"

* fix typo
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

8b240a06

13 Jul, 2021 1 commit

Add option to load a pretrained model with mismatched shapes (#12664) · 90178b0c

Sylvain Gugger authored Jul 13, 2021



* Add option to load a pretrained model with mismatched shapes

* Fail at loading when mismatched shapes in Flax

* Fix tests

* Update src/transformers/modeling_flax_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

90178b0c