  1. 07 Aug, 2024 7 commits
  2. 06 Aug, 2024 7 commits
  3. 05 Aug, 2024 1 commit
  4. 02 Aug, 2024 1 commit
  5. 01 Aug, 2024 2 commits
    • Offloaded KV Cache (#31325) · ca59d6f7
      Nikos Karampatziakis authored
      * Initial implementation of OffloadedCache
      
      * enable usage via cache_implementation
      
      * Address feedback, add tests, remove legacy methods.
      
      * Remove flash-attn, discover synchronization bugs, fix bugs
      
      * Prevent usage in CPU only mode
      
      * Add a section about offloaded KV cache to the docs
      
      * Fix typos in docs
      
      * Clarifications and better explanation of streams
    • [whisper] compile compatibility with long-form decoding (#31772) · e234061c
      Sanchit Gandhi authored
      * [whisper] compile compatibility with long-form decoding
      
      * clarify comment
      
      * fix after rebase
      
      * finalise
      
      * fix bsz
      
      * fix cache split
      
      * remove contiguous
      
      * style
      
      * finish
      
      * update doc
      
      * prevent cuda graph trace
  6. 30 Jul, 2024 2 commits
  7. 29 Jul, 2024 3 commits
    • Add stream messages from agent run for gradio chatbot (#32142) · a24a9a66
      Aymeric Roucher authored
      * Add stream_to_gradio method for running agent in gradio demo
    • Generate: end-to-end compilation (#30788) · 7ffe25f2
      Joao Gante authored
      * mvp
      
      * added test (a few models need fixes)
      
      * fix a few test cases
      
      * test nits
      
      * harder test 😈
      
      * revert changes in stablelm
      
      * test with improved condition
      
      * add todo
      
      * tmp commit
      
      * merged with main
      
      * nits
      
      * add todo
      
      * final corrections
      
      * add docs for generation compilation
      
      * docs nits
      
      * add tip
      
      * PR suggestions
      
      * add more details to the compilation docs
      
      * fix cache positions
      
      * cache is now init in generate; update docs
      
      * tag test as flaky
      
      * docs
      
      * post rebase make fixup and other nits
      
      * remove unintended changes
      
      * whisper (encoder-decoder) not supported
      
      * move token default updates to ; add tests for token defaults
      
      * push changes
      
      * manual rebase
      
      * chameleon doesn't support this
      
      * fix test_static_cache_mha_mqa_gqa (broken in another PR)
      
      * docs: dynamic is better with end-to-end compilation
    • fix(docs): Fixed a link in docs (#32274) · 49928892
      Sai-Suraj-27 authored
      Fixed a link in docs.
  8. 25 Jul, 2024 2 commits
  9. 24 Jul, 2024 2 commits
    • 🚨 No more default chat templates (#31733) · edd68f4e
      Matt authored
      * No more default chat templates
      
      * Add the template to the GPT-SW3 tests since it's not available by default now
      
      * Fix GPT2 test
      
      * Fix Bloom test
      
      * Fix Bloom test
      
      * Remove default templates again
    • Update qwen2.md (#32108) · 5f4ee98a
      Dr. Artificial曾小健 authored
      * Update qwen2.md
      
      outdated description
      
      * Update qwen2.md
      
      amended
      
      * Update qwen2.md
      
      Update
      
      * Update qwen2.md
      
      fix wrong version code, now good to go
  10. 23 Jul, 2024 4 commits
  11. 22 Jul, 2024 4 commits
  12. 19 Jul, 2024 5 commits