1. 26 May, 2021 5 commits
  2. 25 May, 2021 9 commits
  3. 24 May, 2021 7 commits
  4. 22 May, 2021 1 commit
  5. 21 May, 2021 7 commits
  6. 20 May, 2021 6 commits
  7. 19 May, 2021 3 commits
  8. 18 May, 2021 2 commits
    • Fix usage of head masks by PT encoder-decoder models' `generate()` function (#11621) · 680d181c
      Daniel Stancl authored
      * Add missing head masking for generate() function
      
      * Add head_mask, decoder_head_mask and cross_attn_head_mask
      into prepare_inputs_for_generation for generate() function
      for multiple encoder-decoder models.
      
      * Add test_generate_with_head_masking
      
      * [WIP] Update the new test and handle special cases
      
      * make style
      
      * Omit ProphetNet test so far
      
      * make fix-copies
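      The change described in the commit above can be pictured with a small standalone sketch. It is a simplified stand-in written with plain Python lists, not the library's actual code: the function body, the `Mask` alias, and the returned dictionary layout are assumptions for illustration. Only the three argument names `head_mask`, `decoder_head_mask` and `cross_attn_head_mask` and the function name `prepare_inputs_for_generation` come from the commit message. The idea is that the masks must be forwarded at every decoding step, otherwise `generate()` would silently ignore them.

```python
from typing import Any, Dict, List, Optional

# Assumed layout: one row per layer, one entry per attention head.
# 1.0 keeps a head, 0.0 masks it out.
Mask = List[List[float]]


def prepare_inputs_for_generation(
    decoder_input_ids: List[List[int]],
    past: Optional[Any] = None,
    attention_mask: Optional[List[List[int]]] = None,
    head_mask: Optional[Mask] = None,
    decoder_head_mask: Optional[Mask] = None,
    cross_attn_head_mask: Optional[Mask] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical sketch: build the per-step model inputs for generation."""
    # When cached key/value states exist, only the newest token is needed.
    if past is not None:
        decoder_input_ids = [row[-1:] for row in decoder_input_ids]

    # Forward all three head masks so each decoding step honours them.
    return {
        "decoder_input_ids": decoder_input_ids,
        "past_key_values": past,
        "attention_mask": attention_mask,
        "head_mask": head_mask,
        "decoder_head_mask": decoder_head_mask,
        "cross_attn_head_mask": cross_attn_head_mask,
    }


if __name__ == "__main__":
    step_inputs = prepare_inputs_for_generation(
        [[0, 5, 7]],
        past=object(),  # placeholder for cached states
        head_mask=[[1.0, 0.0]],  # one layer, second head masked
    )
    print(step_inputs["decoder_input_ids"], step_inputs["head_mask"])
```

      Masks that are not supplied stay `None` in this sketch, which is meant to stand for "no head masking".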
    • FlaxGPT2 (#11556) · ca33278f
      Suraj Patil authored
      
      
      * flax gpt2
      
      * combine masks
      
      * handle shared embeds
      
      * add causal LM sample
      
      * style
      
      * add tests
      
      * style
      
      * fix imports, docs, quality
      
      * don't use cache
      
      * add cache
      
      * add cache 1st version
      
      * make use cache work
      
      * start adding test for generation
      
      * finish generation loop compilation
      
      * rewrite test
      
      * finish
      
      * update
      
      * update
      
      * apply Sylvain's suggestions
      
      * update
      
      * refactor
      
      * fix typo
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>