  1. 29 Jul, 2024 1 commit
    • Generate: end-to-end compilation (#30788) · 7ffe25f2
      Joao Gante authored
      * mvp
      
      * added test (a few models need fixes)
      
      * fix a few test cases
      
      * test nits
      
      * harder test 😈
      
      * revert changes in stablelm
      
      * test with improved condition
      
      * add todo
      
      * tmp commit
      
      * merged with main
      
      * nits
      
      * add todo
      
      * final corrections
      
      * add docs for generation compilation
      
      * docs nits
      
      * add  tip
      
      * PR suggestions
      
      * add more details to the compilation docs
      
      * fix cache positions
      
      * cache is now init in generate; update docs
      
      * tag test as flaky
      
      * docs
      
      * post rebase make fixup and other nits
      
      * remove unintended changes
      
      * whisper (encoder-decoder) not supported
      
      * move token default updates to ; add tests for token defaults
      
      * push changes
      
      * manual rebase
      
      * chameleon doesn't support this
      
      * fix test_static_cache_mha_mqa_gqa (broken in another PR)
      
      * docs: dynamic is better with end-to-end compilation
  2. 20 May, 2024 1 commit
    • Add torch.compile for Mistral (#30642) · 616bb11d
      Longjie Zheng authored
      * first version
      
      * fix sliding window
      
      * fix style
      
      * add sliding window cache
      
      * fix style
      
      * address comments
      
      * fix test
      
      * fix style
      
      * move sliding window check inside cache init
      
      * revert changes on irrelevant files & add comment on SlidingWindowCache
      
      * address comments & fix style
      
      * fix style
      
      * update causal mask
      
      * [run-slow] mistral
      
      * [run-slow] mistral
      
      * [run-slow] mistral
      
      * [run-slow] mistral
      
      * [run-slow] mistral
      
      * [run-slow] llama
      
      * [run-slow] mistral
      
      * [run-slow] mistral
      
      * [run-slow] mistral
      
      * revert CI from a10 to t4
      
      * wrap up
  3. 30 Apr, 2024 1 commit
  4. 22 Apr, 2024 1 commit