1. 23 Apr, 2021 1 commit
    • Daniel Stancl's avatar
      Fix cross-attention head mask for Torch encoder-decoder models (#10605) · e3ff165a
      Daniel Stancl authored
      * Fix cross-attention head mask for Torch BART models
      
      * Fix head masking for cross-attention module for the following
      models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart,
      Pegasus
      
      * Enable test_headmasking for M2M_100 model
      
      * Fix cross_head_mask for FSMT, LED and T5
      
      * This commit fixes `head_mask` for cross-attention modules
      in the following models: FSMT, LED, T5
      
      * It also contains some smaller changes in doc so that
      it is be perfectly clear the shape of `cross_head_mask`
      is the same as of `decoder_head_mask`
      
      * Update template
      
      * Fix template for BartForCausalLM
      
      * Fix cross_head_mask for Speech2Text models
      
      * Fix cross_head_mask in templates
      
      * Fix args order in BartForCausalLM template
      
      * Fix doc in BART templates
      
      * Make more explicit naming
      
      * `cross_head_mask` -> `cross_attn_head_mask`
      
      * `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
      
      * Fix doc
      
      * make style quality
      
      * Fix speech2text docstring
      e3ff165a
  2. 08 Apr, 2021 1 commit
  3. 02 Feb, 2021 1 commit
    • Daniel Stancl's avatar
      Add head_mask and decoder_head_mask to PyTorch LED (#9856) · 71bdc076
      Daniel Stancl authored
      * Add {decoder_,}head_mask to LED
      
      * Fix create_custom_forward signatue in encoder
      
      * Add head_mask to longformer
      
      * Add head_mask to longformer to fix dependencies
      of LED on Longformer.
      
      * Not working yet
      
      * Add mising one input in longofrmer_modeling.py
      
      * make fix-copies
      71bdc076
  4. 21 Jan, 2021 1 commit
  5. 07 Jan, 2021 1 commit
  6. 06 Jan, 2021 1 commit
  7. 05 Jan, 2021 1 commit
    • Patrick von Platen's avatar
      LED (#9278) · 189387e9
      Patrick von Platen authored
      * create model
      
      * add integration
      
      * save current state
      
      * make integration tests pass
      
      * add one more test
      
      * add explanation to tests
      
      * remove from bart
      
      * add padding
      
      * remove unnecessary test
      
      * make all tests pass
      
      * re-add cookie cutter tests
      
      * finish PyTorch
      
      * fix attention test
      
      * Update tests/test_modeling_common.py
      
      * revert change
      
      * remove unused file
      
      * add string to doc
      
      * save intermediate
      
      * make tf integration tests pass
      
      * finish tf
      
      * fix doc
      
      * fix docs again
      
      * add led to doctree
      
      * add to auto tokenizer
      
      * added tips for led
      
      * make style
      
      * apply jplus statements
      
      * correct tf longformer
      
      * apply lysandres suggestions
      
      * apply sylvains suggestions
      
      * Apply suggestions from code review
      189387e9