1. 11 Apr, 2022 1 commit
    • Improve PT/TF equivalence test (#16557) · dce33f21
      Yih-Dar authored
      
      
      * add error message
      
      * Use names in the error message
      
      * allow ModelOutput
      
      * rename to check_pt_tf_outputs and move outside
      
      * fix style
      
      * skip past_key_values in a better way
      
      * Add comments
      
      * improve code for label/loss
      
      * make the logic clear by moving the ignore keys out
      
      * fix _postprocessing_to_ignore
      
      * fix _postprocessing_to_ignore: create new outputs from the remaining fields
      
      * ignore past_key_values in TFGPT2 models for now
      
      * make check_pt_tf_outputs better regarding names
      
      * move check_pt_tf_models outside
      
      * rename methods
      
      * remove test_pt_tf_model_equivalence in TFCLIPModelTest
      
      * Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence
      
      * move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models
      
      * Fix quality
      
      * Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence
      
      * Fix quality
      
      * fix
      
      * fix style
      
      * Clean-up TFLEDModelTest.test_pt_tf_model_equivalence
      
      * Fix quality
      
      * add docstring
      
      * improve comment
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
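      The check this PR factors out boils down to a recursive, name-aware comparison of PT and TF outputs. A minimal sketch, assuming CPU torch tensors and an illustrative tolerance (the helper name matches the PR; the body is not the exact test code):

      import numpy as np

      def check_pt_tf_outputs(tf_out, pt_out, name="outputs", atol=1e-5):
          # Recurse through tuples/lists so a failing field is reported by name
          if isinstance(tf_out, (tuple, list)):
              assert len(tf_out) == len(pt_out), f"{name}: length mismatch"
              for i, (t, p) in enumerate(zip(tf_out, pt_out)):
                  check_pt_tf_outputs(t, p, name=f"{name}[{i}]", atol=atol)
          else:
              t, p = np.asarray(tf_out), pt_out.detach().numpy()
              diff = np.amax(np.abs(t - p))
              assert diff <= atol, f"{name}: max difference {diff} exceeds {atol}"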
  2. 06 Apr, 2022 1 commit
  3. 05 Apr, 2022 1 commit
    • Adding new train_step logic to make things less confusing for users (#15994) · 43540052
      Matt authored
      
      
      * Adding new train_step logic to make things less confusing for users
      
      * DO NOT ASK WHY WE NEED THAT SUBCLASS
      
      * Metrics now working, at least for single-output models with type annotations!
      
      * Updates and TODOs for the new train_step
      
      * Make fixup
      
      * Temporary test workaround until T5 has types
      
      * Temporary test workaround until T5 has types
      
      * I think this actually works! Needs a lot of tests though
      
      * Make style/quality
      
      * Revert changes to T5 tests
      
      * Deleting the aforementioned unmentionable subclass
      
      * Deleting the aforementioned unmentionable subclass
      
      * Adding a Keras API test
      
      * Style fixes
      
      * Removing unneeded TODO and comments
      
      * Update test_step too
      
      * Stop trying to compute metrics with the dummy_loss, patch up test
      
      * Make style
      
      * make fixup
      
      * Docstring cleanup
      
      * make fixup
      
      * make fixup
      
      * Stop expanding 1D input tensors when using dummy loss
      
      * Adjust T5 test given the new compile()
      
      * make fixup
      
      * Skipping test for convnext
      
      * Removing old T5-specific Keras test now that we have a common one
      
      * make fixup
      
      * make fixup
      
      * Only skip convnext test on CPU
      
      * Update src/transformers/modeling_tf_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/modeling_tf_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Avoiding TF import issues
      
      * make fixup
      
      * Update compile() to support TF 2.3
      
      * Skipping model.fit() on template classes for now
      
      * Skipping model.fit() on template class tests for now
      
      * Replace ad-hoc solution with find_labels
      
      * make fixup
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
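      In spirit, the new train_step pops HF-style labels out of the input dict (the PR locates label keys via `find_labels`) and falls back to the model's internal loss when Keras was compiled without one. A hedged sketch — class name and details are illustrative, not the shipped implementation:

      import tensorflow as tf

      class SketchTFModel(tf.keras.Model):
          def train_step(self, data):
              x = data[0] if isinstance(data, tuple) else data
              y = x.pop("labels", None)  # HF models pack labels into the input dict
              with tf.GradientTape() as tape:
                  outputs = self(x, training=True)
                  # prefer the model's own loss over a user-compiled one
                  loss = (outputs.loss if getattr(outputs, "loss", None) is not None
                          else self.compiled_loss(y, outputs.logits))
              grads = tape.gradient(loss, self.trainable_variables)
              self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
              self.compiled_metrics.update_state(y, outputs.logits)
              return {m.name: m.result() for m in self.metrics}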
  4. 04 Apr, 2022 1 commit
  5. 01 Apr, 2022 1 commit
  6. 23 Mar, 2022 2 commits
  7. 19 Mar, 2022 1 commit
  8. 17 Mar, 2022 1 commit
  9. 14 Mar, 2022 1 commit
    • Make TF pt-tf equivalence test more aggressive (#15839) · 923c35b5
      Yih-Dar authored
      
      
      * Make TF pt-tf equivalence test more aggressive
      
      * Fix for TFConvNextModelTest and TFTransfoXLModelTest
      
      * fix kwargs for outputs
      
      * clean-up
      
      * Add docstring for check_outputs()
      
      * remove: need to rename encoder-decoder
      
      * clean-up
      
      * send PyTorch things to the correct device
      
      * Add back the accidentally removed test case in test_pt_tf_model_equivalence()
      
      * Fix: change to tuple before calling check_outputs()
      
      * Fix: tfo could be a list
      
      * use to_tuple()
      
      * allow tfo only to be tuple or tensor
      
      * allow tfo to be list or tuple for now + style change
      
      * minor fix
      
      * remove np.copy and update comments
      
      * tfo -> tf_output, same for pt
      
      * Add more detailed comment
      
      * remove the incorrect comment
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
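      Much of the tuple juggling above exists because outputs may arrive as ModelOutput, tuple, or list. `to_tuple()` is the real ModelOutput method; this small wrapper is an illustrative sketch:

      def as_tuple(output):
          # ModelOutput.to_tuple() drops None fields, so PT and TF outputs
          # can be compared positionally; bare tuples/lists pass through
          return output.to_tuple() if hasattr(output, "to_tuple") else tuple(output)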
  10. 02 Mar, 2022 1 commit
  11. 25 Feb, 2022 1 commit
    • Add TFConvNextModel (#15750) · 84eaa6ac
      Sayak Paul authored
      
      
      * feat: initial implementation of convnext in tensorflow.
      
      * fix: sample code for the classification model.
      
      * chore: added check for  from the classification model.
      
      * chore: set bias initializer in the classification head.
      
      * chore: updated license terms.
      
      * chore: removed unused imports
      
      * feat: enabled  argument when using drop_path.
      
      * chore: replaced tf.identity with layers.Activation(linear).
      
      * chore: edited default checkpoint.
      
      * fix: minor bugs in the initializations.
      
      * partial-fix: tf model errors for loading pretrained pt weights.
      
      * partial-fix: call method updated
      
      * partial-fix: cross loading of weights (4x3 variables to be matched)
      
      * chore: removed unneeded comment.
      
      * removed playground.py
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * fix: renaming TFConvNextStage conv and layer norm layers
      
      * chore: added initializers and other minor additions.
      
      * chore: added initializers and other minor additions.
      
      * add: tests for convnext.
      
      * fix: integration tester class.
      
      * fix: issues mentioned in pr feedback (round 1).
      
      * fix: how output_hidden_states arg is propagated inside the network.
      
      * feat: handling of  arg for pure cnn models.
      
      * chore: added a note on equal contribution in model docs.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * feat: encapsulation for the convnext trunk.
      
      * Fix variable naming; Test-related corrections; Run make fixup
      
      * chore: added Joao as a contributor to convnext.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * chore: corrected copyright year and added comment on NHWC.
      
      * chore: fixed the black version and ran formatting.
      
      * chore: ran make style.
      
      * chore: removed from_pt argument from test, ran make style.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * fix: tests in the convnext subclass, ran make style.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * chore: moved convnext test to the correct location
      
      * fix: locations for the test file of convnext.
      
      * fix: convnext tests.
      
      * chore: applied  sgugger's suggestion for dealing w/ output_attentions.
      
      * chore: added comments.
      
      * chore: applied updated quality environment style.

      * chore: applied formatting with quality environment.
      
      * chore: revert to the previous tests/test_modeling_common.py.
      
      * chore: revert to the original test_modeling_common.py
      
      * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py
      
      * fix: tests for convnext.
      
      * chore: removed output_attentions argument from convnext config.
      
      * chore: revert to the earlier tf utils.
      
      * fix: output shapes of the hidden states
      
      * chore: removed unnecessary comment
      
      * chore: reverting to the right test_modeling_tf_common.py.
      
      * Styling nits
      Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
      Co-authored-by: Joao Gante <joao@huggingface.co>
      Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
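      The drop_path mentioned several times above implements stochastic depth: a whole residual branch is randomly zeroed per sample at training time. A minimal Keras sketch of the idea (not ConvNeXt's exact layer):

      import tensorflow as tf

      class TFDropPath(tf.keras.layers.Layer):
          def __init__(self, drop_prob=0.0, **kwargs):
              super().__init__(**kwargs)
              self.drop_prob = drop_prob

          def call(self, x, training=False):
              if not training or self.drop_prob == 0.0:
                  return x
              keep_prob = 1.0 - self.drop_prob
              # one random 0/1 decision per sample, broadcast over remaining dims
              shape = (tf.shape(x)[0],) + (1,) * (len(x.shape) - 1)
              mask = tf.floor(keep_prob + tf.random.uniform(shape, 0, 1))
              return (x / keep_prob) * mask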
  12. 15 Feb, 2022 1 commit
    • TF generate refactor - Greedy Search (#15562) · 2e12b907
      Patrick von Platen authored
      
      
      * TF generate start refactor
      
      * Add tf tests for sample generate
      
      * re-organize
      
      * boom boom
      
      * Apply suggestions from code review
      
      * re-add
      
      * add all code
      
      * make random greedy pass
      
      * make encoder-decoder random work
      
      * further improvements
      
      * delete bogus file
      
      * make gpt2 and t5 tests work
      
      * finish logits tests
      
      * correct logits processors
      
      * correct past / encoder_outputs drama
      
      * refactor some methods
      
      * another fix
      
      * refactor shape_list
      
      * fix more shape list
      
      * import shape_list
      
      * finish docs
      
      * fix imports
      
      * make style
      
      * correct tf utils
      
      * Fix TFRag as well
      
      * Apply Lysandre's and Sylvain's suggestions
      
      * Update tests/test_generation_tf_logits_process.py
      Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
      
      * Update src/transformers/tf_utils.py
      Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
      
      * remove cpu according to gante
      
      * correct logit processor
      Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
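      Stripped of caching and logits processors, the greedy search at the heart of this refactor is a short loop. A hedged sketch using public APIs (the real `generate` is considerably more involved):

      import tensorflow as tf
      from transformers import AutoTokenizer, TFAutoModelForCausalLM

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = TFAutoModelForCausalLM.from_pretrained("gpt2")
      input_ids = tokenizer("TF generate", return_tensors="tf").input_ids

      for _ in range(20):  # no past_key_values caching, for clarity
          logits = model(input_ids).logits[:, -1, :]  # next-token logits
          next_token = tf.argmax(logits, axis=-1, output_type=tf.int32)
          input_ids = tf.concat([input_ids, next_token[:, None]], axis=-1)
          if int(next_token[0]) == tokenizer.eos_token_id:
              break
      print(tokenizer.decode(input_ids[0]))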
  13. 08 Feb, 2022 1 commit
    • Add TFSpeech2Text (#15113) · 8406fa6d
      Joao Gante authored
      * Add wrapper classes
      
      * convert inner layers to tf
      
      * Add TF Encoder and Decoder layers
      
      * TFSpeech2Text models
      
      * Loadable model
      
      * TF model with same outputs as PT model
      
      * test skeleton
      
      * correct tests and run the fixup
      
      * correct attention expansion
      
      * TFSpeech2Text past_key_values with TF format
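      A hedged usage sketch; the checkpoint name is assumed, `from_pt=True` may be needed if no TF weights exist on the Hub, and the model consumes log-mel `input_features` rather than token IDs:

      import numpy as np
      from transformers import Speech2TextProcessor, TFSpeech2TextForConditionalGeneration

      processor = Speech2TextProcessor.from_pretrained("facebook/s2t-small-librispeech-asr")
      model = TFSpeech2TextForConditionalGeneration.from_pretrained(
          "facebook/s2t-small-librispeech-asr", from_pt=True
      )
      waveform = np.zeros(16000, dtype=np.float32)  # one second of silence at 16 kHz
      inputs = processor(waveform, sampling_rate=16000, return_tensors="tf")
      generated_ids = model.generate(inputs["input_features"], attention_mask=inputs["attention_mask"])
      print(processor.batch_decode(generated_ids, skip_special_tokens=True))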
  14. 01 Feb, 2022 1 commit
  15. 19 Jan, 2022 1 commit
    • Rename compute_loss in TF models (#15207) · 2708bfa1
      Matt authored
      * Rename compute_loss to hf_compute_loss to avoid conflicts with the new Keras method
      
      * make style
      
      * Adding deprecation warning to `compute_loss`
      
      * Fix sneaky reference to compute_loss
      
      * Replace logger.warning with warnings.warn
      
      * Clarifying warning and deprecation timeline
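      Keras 2.8 introduced its own `Model.compute_loss`, which is why the HF method moved to `hf_compute_loss`. The deprecated alias works roughly like this sketch (not the exact shim):

      import warnings

      class SketchTFPreTrainedModel:
          def hf_compute_loss(self, labels, logits):
              ...  # the old loss computation lives here now

          def compute_loss(self, *args, **kwargs):
              # kept as a deprecated alias for a transition period
              warnings.warn(
                  "compute_loss is deprecated; use hf_compute_loss instead.", FutureWarning
              )
              return self.hf_compute_loss(*args, **kwargs)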
  16. 18 Jan, 2022 2 commits
  17. 14 Jan, 2022 2 commits
  18. 23 Dec, 2021 1 commit
    • Add TFCLIPModel (#13967) · 8f2cc1c3
      Yih-Dar authored
      
      
      * Start the work for TFCLIPModel
      
      * Convert to TF code (TODO: loss + doc)
      
      * Clean up
      
      * Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd
      
      * assert -> raise error
      
      * Expose TFCLIPModel
      
      * Deal with dummy_inputs
      
      * Add tests
      
      * Fix all tests. TODO: manual check weight loading + add more comments
      
      * Fix pt tf equivalence test
      
      * fixes
      
      * update TFCLIPVisionEmbeddings's Conv2D
      
      * Fix loss + overwrite test_pt_tf_model_equivalence from common
      
      * Add a comment about the change about MainLayer in test_keras_save_load
      
      * Set return_loss=True in TFCLIPModelTester + make tests pass
      
      * overwrite test_pt_tf_model_equivalence from tf common
      
      * fix base_model_prefix
      
      * Fix examples
      
      * remove unused
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply review suggestions
      
      * change self.pre_layrnorm to self.pre_layernorm
      
      * apply more review suggestions
      
      * return attention probs before dropout (to align with PT)
      
      * fix weight init
      
      * fix
      
      * build doc
      
      * fix missing doc
      
      * fix for test
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
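      The `tf.gather_nd` fix for pooled_output selects each sequence's EOS-position hidden state. A self-contained sketch with toy shapes:

      import tensorflow as tf

      batch, seq_len, dim = 2, 5, 4
      hidden_states = tf.random.normal((batch, seq_len, dim))
      eos_positions = tf.constant([3, 1])  # index of the EOS token in each sequence

      # build (batch, 2) indices [[0, 3], [1, 1]] and gather one vector per sequence
      indices = tf.stack([tf.range(batch), eos_positions], axis=1)
      pooled_output = tf.gather_nd(hidden_states, indices)  # shape (batch, dim)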
  19. 20 Dec, 2021 1 commit
  20. 15 Dec, 2021 1 commit
    • TF model cards (#14720) · 48d48276
      Matt authored
      * Initial commit for Keras model cards
      
      * Revert accidental change
      
      * make style
      
      * make style
      
      * make style
      
      * Fix PR comments
      
      * Move repo creation to __init__
      
      * Fixes to README.md creation
      
      * Partial progress for proper card creation on `push_to_hub`
      
      * Proper card creation from `push_to_hub` plus fixes for malformed model cards
      
      * Fixes for model card creation outside the callback
      
      * Adding a model card creation test
      
      * Putting the model card creation test in the right file.
      Good job, Matt.
      
      * make style
      
      * Fix model card test temp dir usage
      
      * Fix model card creation when no optimizer present
      
      * Fixes for when training history not present
      
      * Fix accidental edit to test_modeling_common
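      A hedged usage sketch of pushing from Keras with automatic model-card generation; the callback and its parameters exist in the library, while the surrounding names (model, tokenizer, train_dataset) are assumed from context:

      from transformers.keras_callbacks import PushToHubCallback

      # with this PR, pushing from Keras also writes an auto-generated README.md
      # (model card) built from the training history and hyperparameters
      callback = PushToHubCallback(
          output_dir="./model_checkpoints",
          tokenizer=tokenizer,
          hub_model_id="your-username/my-tf-model",
      )
      model.fit(train_dataset, epochs=3, callbacks=[callback])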
  21. 17 Nov, 2021 1 commit
    • [WIP] Ensure TF model configs can be converted to proper JSON (#14415) · 1991da07
      N authored
      
      
      * test: make sure model configs are jsonifiable
      
      * fix: return python dict instead of config object
      
      * fix: accept pretrained config and use correct class
      
      * Re-enabling slow tests and applying them to core models only
      
      * Re-enabling slow tests and applying them to core models only
      
      * Add new test file to fetcher
      
      * Remove tooslow tests from test_modeling_tf_common.py
      
      * make style
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Adding core tests to GPT2 and BART
      
      * Removing unused imports
      Co-authored-by: niklas.fruehauf <niklas.fruehauf@sovanta.com>
      Co-authored-by: matt <rocketknight1@gmail.com>
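      The core assertion the test adds is simple: whatever Keras `get_config()` returns must survive `json.dumps`. A hedged sketch:

      import json
      from transformers import TFBertModel

      model = TFBertModel.from_pretrained("bert-base-cased")
      config = model.get_config()  # a plain Python dict after this PR
      json.dumps(config)           # raises TypeError if anything is not JSON-safe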
  22. 11 Nov, 2021 1 commit
  23. 09 Nov, 2021 1 commit
    • Add TFViTModel (#13778) · be4a6c64
      Yih-Dar authored
      
      
      * Start the work for TFViTModel
      
      * Convert to TF code - need to check in the follow up commits
      
      * Clean up model code
      
      * Expose TFViTModel
      
      * make style
      
      * make quality
      
      * Add test
      
      * make style & quality
      
      * Fix some imports
      
      * fix wrong usage - *kwargs => ** kwargs
      
      * Fix Conv2D weight loading (PT->TF) issue
      
      * Add tests for images with different sizes + fix model
      
      * Fix some common tests for TFViTModel
      
      * Use inputs instead of input_ids in test_compile_tf_model
      
      * Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name
      
      * Avoid transpose in TFViT call
      
      * Fix Conv2D issue in load_tf2_weights_in_pytorch_model
      
      * Use tf.keras.layers.Conv2D instead of tf.nn.conv2d
      
      * Using simpler heuristic to detect Conv2D layer
      
      * Change convert_tf_weight_name_to_pt_weight_name to return TransposeType
      
      * Check tf_weight_shape is not None before using it
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix missing comma
      
      * fix input dtype
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
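      The Conv2D loading fix comes down to kernel layout: PyTorch stores `(out_ch, in_ch, kh, kw)` while Keras stores `(kh, kw, in_ch, out_ch)`, hence the TransposeType bookkeeping. Illustrated with ViT's patch-embedding shape:

      import numpy as np

      pt_kernel = np.random.randn(768, 3, 16, 16)        # ViT patch embedding, PT layout
      tf_kernel = np.transpose(pt_kernel, (2, 3, 1, 0))  # -> (16, 16, 3, 768), TF layout
      assert tf_kernel.shape == (16, 16, 3, 768)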
  24. 02 Nov, 2021 1 commit
  25. 25 Oct, 2021 1 commit
  26. 21 Oct, 2021 1 commit
  27. 12 Oct, 2021 1 commit
    • Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06
      Yih-Dar authored
      
      
      * Add cross attentions to TFGPT2Model
      
      * Add TFEncoderDecoderModel
      
      * Add TFBaseModelOutputWithPoolingAndCrossAttentions
      
      * Add cross attentions to TFBertModel
      
      * Fix past or past_key_values argument issue
      
      * Fix generation
      
      * Fix save and load
      
      * Add some checks and comments
      
      * Clean the code that deals with past keys/values
      
      * Add kwargs to processing_inputs
      
      * Add serving_output to TFEncoderDecoderModel
      
      * Some cleaning + fix use_cache value issue
      
      * Fix tests + add bert2bert/bert2gpt2 tests
      
      * Fix more tests
      
      * Ignore crossattention.bias when loading GPT2 weights into TFGPT2
      
      * Fix return_dict_in_generate in tf generation
      
      * Fix is_token_logit_eos_token bug in tf generation
      
      * Finalize the tests after fixing some bugs
      
      * Fix another is_token_logit_eos_token bug in tf generation
      
      * Add/Update docs
      
      * Add TFBertEncoderDecoderModelTest
      
      * Clean test script
      
      * Add TFEncoderDecoderModel to the library
      
      * Add cross attentions to TFRobertaModel
      
      * Add TFRobertaEncoderDecoderModelTest
      
      * make style
      
      * Change the way of position_ids computation
      
      * bug fix
      
      * Fix copies in tf_albert
      
      * Remove some copied from and apply some fix-copies
      
      * Remove some copied
      
      * Add cross attentions to some other TF models
      
      * Remove encoder_hidden_states from TFLayoutLMModel.call for now
      
      * Make style
      
      * Fix TFRemBertForCausalLM
      
      * Revert the change to longformer + Remove copies
      
      * Revert the change to albert and convbert + Remove copies
      
      * make quality
      
      * make style
      
      * Add TFRembertEncoderDecoderModelTest
      
      * make quality and fix-copies
      
      * test TFRobertaForCausalLM
      
      * Fixes for failed tests
      
      * Fixes for failed tests
      
      * fix more tests
      
      * Fixes for failed tests
      
      * Fix Auto mapping order
      
      * Fix TFRemBertEncoder return value
      
      * fix tf_rembert
      
      * Check copies are OK
      
      * Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined
      
      * Add TFEncoderDecoderModelSaveLoadTests
      
      * fix tf weight loading
      
      * check the change of use_cache
      
      * Revert the change
      
      * Add missing test_for_causal_lm for TFRobertaModelTest
      
      * Try cleaning past
      
      * fix _reorder_cache
      
      * Revert some files to original versions
      
      * Keep as many copies as possible
      
      * Apply suggested changes - Use raise ValueError instead of assert
      
      * Move import to top
      
      * Fix wrong require_torch
      
      * Replace more assert by raise ValueError
      
      * Add test_pt_tf_model_equivalence (the test won't pass for now)
      
      * add test for loading/saving
      
      * finish
      
      * finish
      
      * Remove test_pt_tf_model_equivalence
      
      * Update tf modeling template
      
      * Remove pooling, added in the prev. commit, from MainLayer
      
      * Update tf modeling test template
      
      * Move inputs["use_cache"] = False to modeling_tf_utils.py
      
      * Fix torch.Tensor in the comment
      
      * fix use_cache
      
      * Fix missing use_cache in ElectraConfig
      
      * Add a note to from_pretrained
      
      * Fix style
      
      * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
      
      * Fix TFMLP (in TFGPT2) activation issue
      
      * Fix None past_key_values value in serving_output
      
      * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
      
      * Apply review suggestions - style for cross_attns in serving_output
      
      * Apply review suggestions - change assert + docstrings
      
      * break the error message to respect the char limit
      
      * deprecate the argument past
      
      * fix docstring style
      
      * Update the encoder-decoder rst file
      
      * fix Unknown interpreted text role "method"
      
      * fix typo
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
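      A hedged usage sketch of the new class; `from_encoder_decoder_pretrained` mirrors the PyTorch EncoderDecoderModel API:

      from transformers import AutoTokenizer, TFEncoderDecoderModel

      # warm-start a bert2gpt2 model from two pretrained checkpoints
      model = TFEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "gpt2")
      tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

      input_ids = tokenizer("cross-attention ties these together", return_tensors="tf").input_ids
      outputs = model(input_ids=input_ids, decoder_input_ids=input_ids)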
  28. 13 Jul, 2021 1 commit
  29. 08 Jul, 2021 1 commit
    • [RFC] Laying down building stone for more flexible ONNX export capabilities (#11786) · 2aa3cd93
      Funtowicz Morgan authored
      
      
      * Laying down building stone for more flexible ONNX export capabilities
      
      * Ability to provide a map of config key to override before exporting.
      
      * Makes it possible to export BART with/without past keys.
      
      * Supports simple mathematical syntax for OnnxVariable.repeated
      
      * Effectively apply value override from onnx config for model
      
      * Supports export with additional features such as with-past for seq2seq
      
      * Store the output path directly in the args for uniform usage across.
      
      * Make BART_ONNX_CONFIG_* constants and fix imports.
      
      * Support BERT model.
      
      * Use tokenizer for more flexibility in defining the inputs of a model.
      
      * Add TODO as reminder to provide the batch/sequence_length as CLI args
      
      * Enable optimizations to be done on the model.
      
      * Enable GPT2 + past
      
      * Improve model validation with outputs containing nested structures
      
      * Enable Roberta
      
      * Enable Albert
      
      * Albert requires opset >= 12
      
      * BERT-like models requires opset >= 12
      
      * Remove double printing.
      
      * Enable XLM-Roberta
      
      * Enable DistilBERT
      
      * Disable optimization by default
      
      * Fix missing setattr when applying optimizer_features
      
      * Add value field to OnnxVariable to define constant input (not from tokenizers)
      
      * Add T5 support.
      
      * Simplify model type retrieval
      
      * Example exporting token_classification pipeline for DistilBERT.
      
      * Refactoring to package `transformers.onnx`
      
      * Solve circular dependency & __main__
      
      * Remove unnecessary imports in `__init__`
      
      * Licences
      
      * Use @Narsil's suggestion to forward the model's configuration to the ONNXConfig to avoid interpolation.
      
      * Onnx export v2 fixes (#12388)
      
      * Tiny fixes
      Remove `convert_pytorch` from onnxruntime-less runtimes
      Correct reference to model
      
      * Style
      
      * Fix Copied from
      
      * LongFormer ONNX config.
      
      * Removed optimizations
      
      * Remove bad merge replicas.
      
      * Remove unused constants.
      
      * Remove some deleted constants from imports.
      
      * Fix unittest to remove usage of PyTorch model for onnx.utils.
      
      * Fix distilbert export
      
      * Enable ONNX export test for supported model.
      
      * Style.
      
      * Fix lint.
      
      * Enable all supported default models.
      
      * GPT2 only has one output
      
      * Fix bad property name when overriding config.
      
      * Added unittests and docstrings.
      
      * Disable with_past tests for now.
      
      * Enable outputs validation for default export.
      
      * Remove graph opt lvls.
      
      * Last commit with on-going past commented.
      
      * Style.
      
      * Disabled `with_past` for now
      
      * Remove unused imports.
      
      * Remove framework argument
      
      * Remove TFPreTrainedModel reference
      
      * Add documentation
      
      * Add onnxruntime tests to CircleCI
      
      * Add test
      
      * Rename `convert_pytorch` to `export`
      
      * Use OrderedDict for dummy inputs
      
      * WIP Wav2Vec2
      
      * Revert "WIP Wav2Vec2"
      
      This reverts commit f665efb04c92525c3530e589029f0ae7afdf603e.
      
      * Style
      
      * Use OrderedDict for I/O
      
      * Style.
      
      * Specify OrderedDict documentation.
      
      * Style :)
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
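      After the refactor into the `transformers.onnx` package, exporting is a one-line CLI call (python -m transformers.onnx --model=distilbert-base-uncased onnx/). A hedged sketch of loading the result with onnxruntime:

      import onnxruntime as ort
      from transformers import AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
      session = ort.InferenceSession("onnx/model.onnx")  # path produced by the CLI above

      inputs = tokenizer("Hello, ONNX!", return_tensors="np")
      outputs = session.run(None, dict(inputs))  # validate the exported graph runs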
  30. 23 Jun, 2021 2 commits
    • Clean push to hub API (#12187) · 53c60bab
      Sylvain Gugger authored
      
      
      * Clean push to hub API
      
      * Create working dir if it does not exist
      
      * Different tweak
      
      * New API + all models + test Flax
      
      * Adds the Trainer clean up
      
      * Update src/transformers/file_utils.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Address review comments
      
      * (nit) output types
      
      * No need to set clone_from when folder exists
      
      * Update src/transformers/trainer.py
      Co-authored-by: Julien Chaumond <julien@huggingface.co>
      
      * Add generated_from_trainer tag
      
      * Update to new version
      
      * Fixes
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Julien Chaumond <julien@huggingface.co>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
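      With the cleaned-up API, a single call creates (or reuses) the Hub repo and uploads the files. A hedged sketch; the repo name is illustrative:

      from transformers import AutoTokenizer, TFBertModel

      model = TFBertModel.from_pretrained("bert-base-cased")
      tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

      model.push_to_hub("my-fine-tuned-bert")      # pushes to <username>/my-fine-tuned-bert
      tokenizer.push_to_hub("my-fine-tuned-bert")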
    • Add output in a dictionary for TF `generate` method (#12139) · 26a2e365
      Daniel Stancl authored
      * Add output args to greedy search
      
      * Fix critical typo + make style quality
      
      * Handle generate_beam_search
      
      * Add dict_specific tests and fix the placement of encoder outputs
      
      * Add  specific outputs
      
      * Update doc
      
      * Fix typo
      
      * Adjust handling encoder_outputs + Fix generating for T5
      
      * Fix generate for RAG
      
      * Fix handling of output_attentions when target_mapping is not None

      Take care of situations when target_mapping is provided,
      as there is then a 2-tuple of attentions per layer
      
      Change from:
      if inputs["output_attentions"]:
          attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions)

      to:
      if inputs["output_attentions"]:
          if inputs["target_mapping"] is not None:
              # when target_mapping is provided, there are 2-tuples of attentions
              attentions = tuple(
                  tuple(tf.transpose(attn_stream, perm=(2, 3, 0, 1)) for attn_stream in t) for t in attentions
              )
          else:
              attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions)
      
      * Rename kwargs to model_kwargs
      
      * make style quality
      
      * Move imports in test_modeling_tf_common.py
      
      Move ModelOutput-related imports in test_modeling_tf_common.py
      into the `is_tf_available():` statement.
      
      * Rewrite nested if-statements
      
      * Fix added tests
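      A hedged usage sketch of the new dict-style output:

      from transformers import AutoTokenizer, TFAutoModelForCausalLM

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = TFAutoModelForCausalLM.from_pretrained("gpt2")
      input_ids = tokenizer("Dict outputs", return_tensors="tf").input_ids

      out = model.generate(
          input_ids,
          max_length=20,
          return_dict_in_generate=True,  # the new dict-style output
          output_scores=True,
      )
      print(out.sequences.shape, len(out.scores))  # scores: one tensor per generated step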
  31. 14 Jun, 2021 1 commit
    • Adding TFWav2Vec2Model (#11617) · d438eee0
      Will Rice authored
      
      
      * [WIP] Add TFWav2Vec2Model
      
      Work in progress for adding a tensorflow version of Wav2Vec2
      
      * feedback changes
      
      * small fix
      
      * Test Feedback Round 1
      
      * Add SpecAugment and CTC Loss
      
      * correct spec augment mask creation
      
      * docstring and correct copyright
      
      * correct bugs
      
      * remove bogus file
      
      * finish tests correction
      
      * del unnecessary layers
      
      * Update src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * make style
      
      * correct final bug
      
      * Feedback Changes
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
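      The SpecAugment mask creation mentioned above masks random time spans of the extracted features during training (the real model substitutes a learned mask embedding rather than zeros). A simplified numpy sketch with assumed parameters:

      import numpy as np

      def spec_augment_time_mask(features, mask_prob=0.05, mask_length=10):
          # features: (seq_len, hidden); zero out random spans along the time axis
          seq_len = features.shape[0]
          num_masks = max(1, int(mask_prob * seq_len / mask_length))
          for _ in range(num_masks):
              start = np.random.randint(0, max(1, seq_len - mask_length))
              features[start : start + mask_length] = 0.0
          return features

      masked = spec_augment_time_mask(np.random.randn(100, 32))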
  32. 26 May, 2021 1 commit
    • Fix usage of head masks by TF encoder-decoder models' `generate()` function (#11775) · 0b933584
      Daniel Stancl authored
      * Fix Bart
      
      * Fix Blenderbot{,_small}
      
      * Fix LED
      
      * Fix Marian
      
      * Fix MBart
      
      * Fix Pegasus
      
      * Fix T5
      
      * Add test for generation with head_mask
      
      * Add a common TF test
      
      * Override a test for the LED model as head masking is not yet properly implemented
      
      * Remove all head_masks from input preparation for LED
      
      * Drop masking for T5 as it needs a bit of refactor
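      A hedged sketch of what the fix enables: passing a head_mask through generate() for a TF seq2seq model. The config attribute names are BART's (and note the PR drops T5 masking pending a refactor):

      import tensorflow as tf
      from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

      tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
      model = TFAutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
      input_ids = tokenizer("mask some heads", return_tensors="tf").input_ids

      # shape (num_layers, num_heads); 1.0 keeps a head, 0.0 silences it
      head_mask = tf.ones((model.config.encoder_layers, model.config.encoder_attention_heads))
      outputs = model.generate(input_ids, head_mask=head_mask)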
  33. 26 Apr, 2021 2 commits
  34. 23 Apr, 2021 1 commit
  35. 08 Apr, 2021 1 commit