1. 23 Apr, 2021 1 commit
    • Fix cross-attention head mask for Torch encoder-decoder models (#10605) · e3ff165a
      Daniel Stancl authored
      * Fix cross-attention head mask for Torch BART models
      
      * Fix head masking for cross-attention module for the following
      models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart,
      Pegasus
      
      * Enable test_headmasking for M2M_100 model
      
      * Fix cross_head_mask for FSMT, LED and T5
      
      * This commit fixes `head_mask` for cross-attention modules
      in the following models: FSMT, LED, T5
      
      * It also contains some smaller doc changes so that
      it is perfectly clear that the shape of `cross_head_mask`
      is the same as that of `decoder_head_mask`
      
      * Update template
      
      * Fix template for BartForCausalLM
      
      * Fix cross_head_mask for Speech2Text models
      
      * Fix cross_head_mask in templates
      
      * Fix args order in BartForCausalLM template
      
      * Fix doc in BART templates
      
      * Make more explicit naming
      
      * `cross_head_mask` -> `cross_attn_head_mask`
      
      * `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
      
      * Fix doc
      
      * make style quality
      
      * Fix speech2text docstring
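
The shape relationship described in this commit message can be illustrated with a small, library-free sketch. It assumes that both masks are laid out as (decoder layers, attention heads), with 1.0 keeping a head and 0.0 masking it. The helper names `mask_shape` and `apply_layer_head_mask` are hypothetical and are not the library's implementation; only `cross_attn_head_mask`, `cross_attn_layer_head_mask`, and `decoder_head_mask` are names taken from the commit message.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # one head's attention probabilities: (tgt_len, src_len)


def mask_shape(mask: List[List[float]]) -> Tuple[int, int]:
    """Return (num_layers, num_heads) for a nested-list head mask."""
    return len(mask), len(mask[0])


def apply_layer_head_mask(attn_probs: List[Matrix], layer_head_mask: List[float]) -> List[Matrix]:
    """Scale every head's attention probabilities by that head's mask entry."""
    if len(attn_probs) != len(layer_head_mask):
        raise ValueError("the layer mask needs exactly one entry per attention head")
    return [
        [[p * keep for p in row] for row in head]
        for head, keep in zip(attn_probs, layer_head_mask)
    ]


# Assumed toy setup: 2 decoder layers with 2 heads each.
decoder_head_mask = [[1.0, 1.0], [1.0, 1.0]]
cross_attn_head_mask = [[1.0, 0.0], [1.0, 1.0]]  # mask head 1 in layer 0

# Cross-attention probabilities for layer 0: 2 heads, tgt_len=1, src_len=2.
layer0_probs = [[[0.25, 0.75]], [[0.5, 0.5]]]

# Each decoder layer receives its own row of the full mask.
cross_attn_layer_head_mask = cross_attn_head_mask[0]
masked = apply_layer_head_mask(layer0_probs, cross_attn_layer_head_mask)
```

In this sketch the second head of layer 0 contributes nothing after masking, while the first head is left unchanged.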
  2. 09 Mar, 2021 1 commit
  3. 08 Mar, 2021 1 commit
  4. 06 Mar, 2021 1 commit
    • Add m2m100 (#10236) · f6e74a63
      Suraj Patil authored
      * m2m_100
      
      * no layernorm_embedding
      
      * sinusoidal positional embeddings
      
      * update pos embeddings
      
      * add default config values
      
      * tokenizer
      
      * add conversion script
      
      * fix config
      
      * fix pos embed
      
      * remove _float_tensor
      
      * update tokenizer
      
      * update lang codes
      
      * handle lang codes
      
      * fix pos embeds
      
      * fix spm key
      
      * put embedding weights on device
      
      * remove qa and seq classification heads
      
      * fix convert script
      
      * lang codes on one line
      
      * fix embeds
      
      * fix tokenizer
      
      * fix tokenizer
      
      * add fast tokenizer
      
      * style
      
      * M2M100MT => M2M100
      
      * fix copyright, style
      
      * tokenizer converter
      
      * vocab file
      
      * remove fast tokenizer
      
      * fix embeds
      
      * fix tokenizer
      
      * fix tests
      
      * add tokenizer tests
      
      * add integration test
      
      * quality
      
      * fix model name
      
      * fix test
      
      * doc
      
      * doc
      
      * fix doc
      
      * add copied from statements
      
      * fix tokenizer tests
      
      * apply review suggestions
      
      * fix urls
      
      * fix shift_tokens_right
      
      * apply review suggestions
      
      * fix
      
      * fix doc
      
      * add lang code to id
      
      * remove unused function
      
      * update checkpoint names
      
      * fix copy
      
      * fix tokenizer
      
      * fix checkpoint names
      
      * fix merge issue
      
      * style