    Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06
    Yih-Dar authored
    
    
    * Add cross attentions to TFGPT2Model
    
    * Add TFEncoderDecoderModel
    
    * Add TFBaseModelOutputWithPoolingAndCrossAttentions
    
    * Add cross attentions to TFBertModel
    
    * Fix past or past_key_values argument issue
    
    * Fix generation
    
    * Fix save and load
    
    * Add some checks and comments
    
    * Clean the code that deals with past keys/values
    
    * Add kwargs to processing_inputs
    
    * Add serving_output to TFEncoderDecoderModel
    
    * Some cleaning + fix use_cache value issue
    
    * Fix tests + add bert2bert/bert2gpt2 tests
    
    * Fix more tests
    
    * Ignore crossattention.bias when loading GPT2 weights into TFGPT2
    
    * Fix return_dict_in_generate in tf generation
    
    * Fix is_token_logit_eos_token bug in tf generation
    
    * Finalize the tests after fixing some bugs
    
    * Fix another is_token_logit_eos_token bug in tf generation
    
    * Add/Update docs
    
    * Add TFBertEncoderDecoderModelTest
    
    * Clean test script
    
    * Add TFEncoderDecoderModel to the library
    
    * Add cross attentions to TFRobertaModel
    
    * Add TFRobertaEncoderDecoderModelTest
    
    * make style
    
    * Change the way of position_ids computation
    
    * bug fix
    
    * Fix copies in tf_albert
    
    * Remove some copied from and apply some fix-copies
    
    * Remove some copied
    
    * Add cross attentions to some other TF models
    
    * Remove encoder_hidden_states from TFLayoutLMModel.call for now
    
    * Make style
    
    * Fix TFRemBertForCausalLM
    
    * Revert the change to longformer + Remove copies
    
    * Revert the change to albert and convbert + Remove copies
    
    * make quality
    
    * make style
    
    * Add TFRembertEncoderDecoderModelTest
    
    * make quality and fix-copies
    
    * test TFRobertaForCausalLM
    
    * Fixes for failed tests
    
    * Fixes for failed tests
    
    * fix more tests
    
    * Fixes for failed tests
    
    * Fix Auto mapping order
    
    * Fix TFRemBertEncoder return value
    
    * fix tf_rembert
    
    * Check copies are OK
    
    * Fix "TFBaseModelOutputWithPastAndCrossAttentions is not defined" error
    
    * Add TFEncoderDecoderModelSaveLoadTests
    
    * fix tf weight loading
    
    * check the change of use_cache
    
    * Revert the change
    
    * Add missing test_for_causal_lm for TFRobertaModelTest
    
    * Try cleaning past
    
    * fix _reorder_cache
    
    * Revert some files to original versions
    
    * Keep as many copies as possible
    
    * Apply suggested changes - Use raise ValueError instead of assert
    
    * Move import to top
    
    * Fix wrong require_torch
    
    * Replace more assert by raise ValueError
    
    * Add test_pt_tf_model_equivalence (the test won't pass for now)
    
    * add test for loading/saving
    
    * finish
    
    * finish
    
    * Remove test_pt_tf_model_equivalence
    
    * Update tf modeling template
    
    * Remove the pooling layer (added in the previous commit) from MainLayer
    
    * Update tf modeling test template
    
    * Move inputs["use_cache"] = False to modeling_tf_utils.py
    
    * Fix torch.Tensor in the comment
    
    * fix use_cache
    
    * Fix missing use_cache in ElectraConfig
    
    * Add a note to from_pretrained
    
    * Fix style
    
    * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
    
    * Fix TFMLP (in TFGPT2) activation issue
    
    * Fix None past_key_values value in serving_output
    
    * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
    
    * Apply review suggestions - style for cross_attns in serving_output
    
    * Apply review suggestions - change assert + docstrings
    
    * break the error message to respect the char limit
    
    * deprecate the argument past
    
    * fix docstring style
    
    * Update the encoder-decoder rst file
    
    * fix Unknown interpreted text role "method"
    
    * fix typo
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
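    For readers unfamiliar with the class this PR introduces, a minimal sketch of wiring a BERT encoder to a GPT-2 decoder with `TFEncoderDecoderModel` (the bert2gpt2 combination exercised by the tests above) might look like the following. The tiny config sizes and dummy token ids are illustrative assumptions, not values from the PR, and the weights are randomly initialized rather than loaded from a pretrained checkpoint:

    ```python
    import tensorflow as tf
    from transformers import (
        BertConfig,
        GPT2Config,
        EncoderDecoderConfig,
        TFEncoderDecoderModel,
    )

    # Tiny, randomly initialized configs -- sizes chosen for illustration only.
    encoder_config = BertConfig(
        vocab_size=100, hidden_size=32, num_hidden_layers=2,
        num_attention_heads=2, intermediate_size=64, max_position_embeddings=64,
    )
    decoder_config = GPT2Config(
        vocab_size=100, n_embd=32, n_layer=2, n_head=2, n_positions=64,
    )

    # from_encoder_decoder_configs marks the decoder as a decoder and enables
    # its cross-attention layers (the layers this PR adds to the TF models).
    config = EncoderDecoderConfig.from_encoder_decoder_configs(
        encoder_config, decoder_config
    )
    model = TFEncoderDecoderModel(config=config)

    input_ids = tf.constant([[1, 2, 3, 4]])       # encoder input
    decoder_input_ids = tf.constant([[1, 2]])     # decoder input
    outputs = model(
        input_ids=input_ids,
        decoder_input_ids=decoder_input_ids,
        output_attentions=True,
    )
    print(outputs.logits.shape)           # (batch, decoder_seq_len, vocab_size)
    print(len(outputs.cross_attentions))  # one tensor per decoder layer
    ```

    With `output_attentions=True`, the returned seq2seq output also carries the decoder's cross-attention weights over the encoder hidden states, which is the mechanism the cross-attention commits above enable for the TF models.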