1. 23 Jul, 2024 1 commit
  2. 14 Jul, 2024 1 commit
  3. 11 Jul, 2024 1 commit
    • Refactor flash attention implementation in transformers (#31446) · e3143952
      Arthur authored
      
      
      * dumb commit
      
      * nit
      
      * update
      
      * something like this
      
      * unpack in modeling utils
      
      * safe import
      
      * oups
      
      * update
      
      * nits
      
      * diff convert gemma
      
      * update
      
      * start propagating
      
      * udpate other modeling code as well
      
      * update for sliding window models
      
      * nits
      
      * more init cleanups
      
      * styling
      
      * fixup
      
      * noice
      
      * pass fixup
      
      * typo typing_extension -> typing_extensions
      
      * torch.nn.functionnal -> torch.nn.functional
      
      * add to import structure
      
      * unpack
      
      * simplify a bit more for this first version
      
      * nut
      
      * update
      
      * update
      
      * nit
      
      * ease the import of `Unpack`
      
      * remove useless `use_sliding_window`
      
      * no qua please
      
      * protect import?
      
      * style
      
      * [run-slow]
      
      * [run slow] llama,gemma,mistral,mixtral
      
      * remove extra kwargs
      
      * fix llama
      
      * address review comments
      
      * apply diff_model_converter to modeling_gemma.py
      
      * remove cache_position 1
      
      * remove cache_position 2
      
      * some cleaning
      
      * refactor gemma2 as well
      
      * apply review comments
      
      * rename file to modeling_flash_attention_utils.py
      
      * siglip refactor
      
      * remove dead code
      
      * is the hub down?
      
      * still down?
      
      * fix siglip
      
      * fix gemma2
      
      * fatal: Could not read from remote repository.
      
      * fix typo in softcap implem
      
      * flacky
      
      * Failed: Timeout >120.0s
      
      ---------
      Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
  4. 07 Jun, 2024 1 commit
  5. 23 May, 2024 1 commit
  6. 16 May, 2024 1 commit
  7. 15 May, 2024 1 commit
  8. 14 May, 2024 1 commit