    Fix usage of head masks by PT encoder-decoder models' `generate()` function (#11621) · 680d181c
    Daniel Stancl authored
    * Add missing head masking for generate() function
    
    * Add head_mask, decoder_head_mask and cross_attn_head_mask
    as arguments of prepare_inputs_for_generation so that generate()
    forwards them in multiple encoder-decoder models.
    
    * Add test_generate_with_head_masking
    
    * [WIP] Update the new test and handle special cases
    
    * make style
    
    * Omit ProphetNet test for now
    
    * make fix-copies
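
The change described in the bullets above can be illustrated with a minimal, dependency-free sketch. It is not the code from this commit: the function below is a simplified stand-in that uses plain lists instead of tensors, and the toy mask values and the `"cached"` placeholder are assumptions made for illustration. The idea it shows is that the three head masks are accepted as keyword arguments and returned in the model inputs, so they are not dropped during generation.

```python
def prepare_inputs_for_generation(
    decoder_input_ids,
    past=None,
    attention_mask=None,
    head_mask=None,
    decoder_head_mask=None,
    cross_attn_head_mask=None,
    use_cache=None,
    encoder_outputs=None,
    **kwargs,
):
    """Simplified stand-in (assumption): builds the inputs for one decoding step."""
    # When cached key/values are available, only the newest token is needed.
    if past is not None:
        decoder_input_ids = [row[-1:] for row in decoder_input_ids]

    return {
        "input_ids": None,  # the encoder has already run; its outputs are reused
        "encoder_outputs": encoder_outputs,
        "past_key_values": past,
        "decoder_input_ids": decoder_input_ids,
        "attention_mask": attention_mask,
        # Forward the masks so every decoding step sees them.
        "head_mask": head_mask,
        "decoder_head_mask": decoder_head_mask,
        "cross_attn_head_mask": cross_attn_head_mask,
        "use_cache": use_cache,
    }


# Toy mask (assumption): 2 layers x 4 heads; 0.0 disables a head, 1.0 keeps it.
head_mask = [[1.0, 0.0, 1.0, 1.0], [1.0, 1.0, 1.0, 0.0]]

model_inputs = prepare_inputs_for_generation(
    decoder_input_ids=[[0, 5, 7]],
    past=("cached",),  # placeholder for cached key/values
    head_mask=head_mask,
)
```

In this sketch, a mask that is not supplied stays `None` in the returned dictionary, which leaves the corresponding attention heads unmasked.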
test_modeling_prophetnet.py 51.2 KB