    Use cross_attention_hidden_size in Encoder-Decoder models (#14378) · 4cdb67ca
    Yih-Dar authored
    
    
    * add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax)
    
    * for TFEncoderDecoderModel
    
    * add equivalence test for TFEncoderDecoderModel
    
    * fix
    
    * fix failed equivalence tests
    
    * remove unused import
    
    * add detailed comment
    
    * Fix check_equivalence_tf_to_pt by using encoder/decoder
    
    * cleaning
    
    * Use cross_attention_hidden_size in speech-to-text
    
    * clean fast init logging msg in encoder decoder models
    
    * increase tol from 1e-5 to 1e-3 for tf test
    
    * style
    
    * style
    
    * make sure projection layer can run
    
    * remove type conversion + add check
    
    * fix conflict (config.output_hidden_size)
    
    * Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
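
    Several of the commits above (adding `cross_attention_hidden_size`, "make sure projection layer can run", the `config.output_hidden_size` conflict fix) revolve around one idea: when the decoder's cross-attention width differs from the encoder's hidden size, the encoder outputs must be projected before cross-attention. The sketch below illustrates that conditional projection with plain numpy; the function name, the dict-style config, and the weight arguments are illustrative, not the library's actual API.

    ```python
    import numpy as np

    def maybe_project_encoder_states(encoder_hidden_states, decoder_config,
                                     proj_weight=None, proj_bias=None):
        """Project encoder hidden states to the decoder's cross-attention width
        when the two sizes differ; otherwise pass them through unchanged.

        `decoder_config` is assumed (for this sketch) to be a dict that may
        carry a `cross_attention_hidden_size` entry, mirroring the config
        attribute named in the commits above.
        """
        enc_dim = encoder_hidden_states.shape[-1]
        # A missing or None cross_attention_hidden_size means "same as encoder".
        target_dim = decoder_config.get("cross_attention_hidden_size") or enc_dim
        if target_dim == enc_dim:
            return encoder_hidden_states  # no projection layer needed
        # When sizes differ, a learned linear projection bridges them.
        assert proj_weight is not None and proj_weight.shape == (enc_dim, target_dim), \
            "projection weight of shape (enc_dim, target_dim) required"
        out = encoder_hidden_states @ proj_weight
        if proj_bias is not None:
            out = out + proj_bias
        return out

    # Usage: encoder width 8, decoder cross-attention width 4 -> projection applied.
    h = np.ones((2, 5, 8))                      # (batch, seq_len, enc_hidden)
    w = np.ones((8, 4))
    projected = maybe_project_encoder_states(
        h, {"cross_attention_hidden_size": 4}, proj_weight=w)
    ```

    The "remove type conversion + add check" commit suggests the real implementation validates shapes up front rather than silently casting, which the assertion above mimics.
    
    
    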
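
    Another thread in the commit list is cross-framework equivalence testing (the `TFEncoderDecoderModel` equivalence test, the failed-test fixes, and the tolerance bump from 1e-5 to 1e-3). The core of such a check is an elementwise comparison of PT and TF outputs against an absolute tolerance; the helper below is a minimal sketch of that pattern, with an illustrative name rather than the test suite's actual helper.

    ```python
    import numpy as np

    def check_equivalence(pt_outputs, tf_outputs, tol=1e-3):
        """Return (passed, max_abs_diff) for two framework outputs.

        tol=1e-3 mirrors the relaxed tolerance mentioned in the commits;
        float32 kernels can legitimately diverge by more than 1e-5 across
        frameworks, so a too-tight tolerance yields flaky failures.
        """
        pt = np.asarray(pt_outputs, dtype=np.float64)
        tf = np.asarray(tf_outputs, dtype=np.float64)
        max_diff = float(np.max(np.abs(pt - tf)))
        return max_diff <= tol, max_diff

    # Usage: a small numerical discrepancy passes at 1e-3 but would fail at 1e-5.
    passed, diff = check_equivalence([1.0, 2.0], [1.0, 2.0005])
    ```
    
    
    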
test_modeling_flax_encoder_decoder.py