1. 11 Dec, 2025 1 commit
  2. 09 Dec, 2025 4 commits
  3. 08 Dec, 2025 1 commit
    • refactor rope · 603ceefa
      Michael Yang authored
      change to a flatter directory structure and group the options with the
      function
      
      update models to call rope in one place
      603ceefa
  4. 02 Dec, 2025 1 commit
  5. 20 Nov, 2025 2 commits
  6. 19 Nov, 2025 4 commits
  7. 18 Nov, 2025 2 commits
  8. 13 Nov, 2025 1 commit
  9. 06 Nov, 2025 1 commit
  10. 03 Nov, 2025 1 commit
  11. 30 Oct, 2025 2 commits
  12. 29 Oct, 2025 2 commits
  13. 28 Oct, 2025 2 commits
  14. 20 Oct, 2025 1 commit
  15. 18 Oct, 2025 1 commit
  16. 16 Oct, 2025 2 commits
    • renderers: add global flag for setting [img] tags (#12669) · 65fb3ff4
      Jeffrey Morgan authored
      Adds a temporary global flag to renderers that causes renderers to always
      render images as [img]. In a follow-up change, we will consider making this
      the default, and this flag could eventually be removed.
      65fb3ff4
    • Grace/qwen3 thinking (#12647) · e2a0b244
      Grace authored
      * changing initial status to take into consideration prefill
      
      * Add separate strings for content and thinking builder
      
      * thinking tests
      
      * remove white space from string before closing think tag
      e2a0b244
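The bullets in the commit above (prefill-aware initial state, separate builders for content and thinking, trimming whitespace before the closing think tag) can be illustrated with a minimal, non-streaming sketch. Everything here is an assumption made for illustration: the function name `splitThinking`, the `prefilled` parameter, and the literal `<think>`/`</think>` tags are hypothetical and are not taken from the actual parser.

```go
package main

import (
	"strings"
	"unicode"
)

// Hypothetical tag literals; the real parser may use different ones.
const (
	openTag  = "<think>"
	closeTag = "</think>"
)

// splitThinking is an illustrative helper, not the project's parser.
// It splits a complete model output into thinking and content.
// If prefilled is true, the prompt is assumed to already end with the
// opening tag, so the output starts in the thinking state.
func splitThinking(out string, prefilled bool) (thinking, content string) {
	// Separate builders keep thinking text and content text apart.
	var thinkingSB, contentSB strings.Builder

	rest := out
	if !prefilled {
		before, after, found := strings.Cut(rest, openTag)
		if !found {
			// No thinking section at all: everything is content.
			return "", out
		}
		contentSB.WriteString(before)
		rest = after
	}

	body, after, found := strings.Cut(rest, closeTag)
	// Drop whitespace that sits directly before the closing tag.
	thinkingSB.WriteString(strings.TrimRightFunc(body, unicode.IsSpace))
	if found {
		// Drop whitespace between the closing tag and the content.
		contentSB.WriteString(strings.TrimLeftFunc(after, unicode.IsSpace))
	}
	return thinkingSB.String(), contentSB.String()
}
```

A streaming parser would need to buffer partial tags across chunks; this sketch deliberately ignores that and only shows how the initial state and the trimming interact.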
  17. 14 Oct, 2025 2 commits
  18. 13 Oct, 2025 2 commits
    • Qwen3VL Cloud Parser and Renderer (#12526) · 05982a95
      Grace authored
      
      
      * working for tool calls and tools (other than tool calls being in the incorrect order)
      
      * Tests work, other than image tags (tests do not go through server) and tools (not in the correct order, but contents are the same)
      
      * testing for qwen3vl parser - toolparser is working
      
      * made changes to JSON tool parser, wraps the ToolCallFunction with a ToolCall object
      
      * Working parser for thinking models - assumes state of thinking, emits unambiguous content in thinking, does not call tool call in thinking
      
      * changed the parser to start with collecting content
      
      * thinking prefill
      
      * add hasThinkingSupport parameter to parser
      
      * qwen3-vl -> qwen3-vl-instruct for renderer/parser
      
      * Add hasThinkingSupport=false to QwenVLParser
      
      ---------
      Co-authored-by: Devon Rifkin <drifkin@drifkin.net>
      05982a95
    • fix(qwen3): deepseek distill · 6c833d5f
      Michael Yang authored
      deepseek's qwen3 distill uses a different rope scheme so support both
      6c833d5f
  19. 10 Oct, 2025 1 commit
  20. 09 Oct, 2025 2 commits
  21. 03 Oct, 2025 1 commit
  22. 30 Sep, 2025 1 commit
  23. 25 Sep, 2025 1 commit
    • parsers: fix unicode handling for qwen3-coder · 05ba4ca1
      Devon Rifkin authored
      When trimming whitespace at the end of every chunk, we were iterating
      backwards over the string byte-by-byte instead of rune-by-rune.
      
      As an example of how this can cause corruption, suppose we have the
      multi-byte character ✅ (`"\u2705"`), which is represented in utf-8 as
      the three bytes `0xE2 0x9C 0x85`. It happens that `0x85` is NEL, which
      passes `unicode.IsSpace()`. Because we were iterating byte-by-byte, this
      caused us to mistakenly slice in the middle of the rune, removing `0x85`
      and leaving `0xE2 0x9C`, which beyond being the incorrect place to
      slice, is not even a valid utf-8 character.
      
      `trailingWhitespaceLen()` was modified to count from the end in a
      rune-aware way. Tests with various multibyte unicode characters were
      also added.
      
      
      Fixes: #12414
      05ba4ca1
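The failure described in the commit above can be reproduced with a small sketch. The two helpers below are hypothetical stand-ins written for illustration; they are not the project's `trailingWhitespaceLen()`, whose actual body is not shown here. The first walks backwards byte-by-byte (the buggy approach), the second walks backwards rune-by-rune using `utf8.DecodeLastRuneInString`.

```go
package main

import (
	"unicode"
	"unicode/utf8"
)

// byteWiseTrailingLen (hypothetical) counts trailing "whitespace" one
// byte at a time. A continuation byte such as 0x85 is misread as the
// code point U+0085 (NEL), which unicode.IsSpace reports as whitespace.
func byteWiseTrailingLen(s string) int {
	n := 0
	for i := len(s) - 1; i >= 0; i-- {
		if !unicode.IsSpace(rune(s[i])) {
			break
		}
		n++
	}
	return n
}

// runeWiseTrailingLen (hypothetical) counts trailing whitespace in
// bytes, but decodes whole runes from the end, so it never stops in
// the middle of a multi-byte character.
func runeWiseTrailingLen(s string) int {
	rest := s
	for len(rest) > 0 {
		r, size := utf8.DecodeLastRuneInString(rest)
		if !unicode.IsSpace(r) {
			break
		}
		rest = rest[:len(rest)-size]
	}
	return len(s) - len(rest)
}

// trimTrailing removes trailing whitespace using the rune-aware count.
func trimTrailing(s string) string {
	return s[:len(s)-runeWiseTrailingLen(s)]
}
```

For the input `"ok\u2705"`, the byte-wise helper reports one trailing whitespace byte (the `0x85`), so slicing by that length would leave the invalid sequence `0xE2 0x9C`; the rune-wise helper reports zero and leaves the string intact.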
  24. 24 Sep, 2025 2 commits
    • Grace/deepseek v3 migration (#12385) · fbd82ba5
      Grace authored
      
      
      * init deepseek model file
      
      * temp removal of flash attention implementation
      
      * shapes and proper, can make a pass
      
      * query, key, value have good cosine similarity, but the max diff is a bit high
      
      * Attention block is working! ** with eager for now, have not added the mask line
      
      * Attention block is working! ** with eager for now, have not added the mask line
      
      * working MoE at around 0.95 cosine sim
      
      * added cosine similarity function
      
      * Starting end to end structure
      
      * Trying (and failing) to get rope to work, going to test full thing on tater
      
      * running on tater36... just not the right outputs
      
      * we have the right values for rope... but it's still not working?
      
      * change Extrapolation Factor to 1
      
      * removed adding residuals twice, removed normalization from shared expert, refactored Norms (Attention, MLP) to be outside the (Attention, MLP) blocks and in the Transformer block instead, add cache setLayer
      
      * Temporary modelfiles for cpu
      
      * change kpass intermediate step to kv, two layer outputs [0,1] look fine
      
      * this calls for 16 chicken nuggets
      
      * whoops
      
      * cleaning up code
      
      * delete stuff we don't need
      
      * getting rid of debug statements for llama cpp
      
      * working with long contexts
      
      * fix long context view error
      
      * reverting some changes I made for files that are not a part of the PR
      
      * Added proper tokenizer for deepseek3
      
      * clean up model and go test
      
      * remove Modelfile
      
      * not passing the tests
      
      * whoops
      
      * how to pass the ci tests
      
      * resolving some of the comments
      
      * rename
      
      * linted and renamed deepseek3 -> deepseek2
      
      * remove name go
      
      * addressed changes - main change was adopting qwen3 naming scheme
      
      * I cannot with linters
      
      * clean up logs
      
      * clean up logs
      
      ---------
      Co-authored-by: Grace Guo <graceguo@Graces-MBP.localdomain>
      Co-authored-by: Grace Guo <graceguo@Graces-MacBook-Pro.local>
      Co-authored-by: graceguo <graceguo@tater36.localdomain>
      fbd82ba5
    • fix: leaf alt name (#12390) · e1979c57
      Michael Yang authored
      a leaf node with an alternative name gets all its alternative names
      added into the same branch rather than creating branches of their own
      e1979c57