- 11 Dec, 2025 1 commit
  - Jeffrey Morgan authored
- 09 Dec, 2025 4 commits
  - nicole pardal authored
  - Parth Sareen authored
  - Parth Sareen authored
  - Jeffrey Morgan authored
- 08 Dec, 2025 1 commit
  - Michael Yang authored
    Change to a flatter directory structure and group the options with the function; update models to call rope in one place.
- 02 Dec, 2025 1 commit
  - Patrick Devine authored
    This change:
    * fixes rope scaling in the mistral converter
    * updates ministral to include llama4 scaling
    * includes a new ministral parser for parsing reasoning and tool calling
    Co-authored-by: jmorganca <jmorganca@gmail.com>
- 20 Nov, 2025 2 commits
  - Grace authored
  - Michael Yang authored
    The check for MLA omits v3 and r1, which should not return unsupported. Instead, check the tokenizer for compatibility.
- 19 Nov, 2025 4 commits
  - Patrick Devine authored
  - Grace authored
  - nicole pardal authored
  - Michael Yang authored
- 18 Nov, 2025 2 commits
  - Michael Yang authored
  - Grace authored
    * Add mla for flash attention
    * Revert to using chunks
- 13 Nov, 2025 1 commit
  - Michael Yang authored
    * use slice/chunks
    * bert
    * llama4
    * gemma3n
    * gptoss
    * mistral3
    * qwen3vl
    * qwen25vl
    * deepseek2
    * remove unused ops
- 06 Nov, 2025 1 commit
  - Daniel Hiltgen authored
- 03 Nov, 2025 1 commit
  - Michael Yang authored
- 30 Oct, 2025 2 commits
  - Michael Yang authored
    * ml(ggml): mrope
    * interleave mrope
  - Michael Yang authored
    This change fixes images with an alpha channel by overlaying the image onto a white background.
- 29 Oct, 2025 2 commits
  - Grace authored
    Trims extra whitespace at the beginning and end of content.
  - Michael Yang authored
- 28 Oct, 2025 2 commits
  - Michael Yang authored
  - Michael Yang authored
- 20 Oct, 2025 1 commit
  - Jeffrey Morgan authored
- 18 Oct, 2025 1 commit
  - Daniel Hiltgen authored
    Co-authored-by: Michael Yang <git@mxy.ng>
- 16 Oct, 2025 2 commits
  - Jeffrey Morgan authored
    Adds a temporary global flag to renderers that causes them to always render images as [img]. In a follow-up change, we will consider making this the default, at which point this flag could eventually be removed.
  - Grace authored
    * changing initial status to take prefill into consideration
    * add separate strings for content and thinking builder
    * thinking tests
    * remove whitespace from string before closing think tag
- 14 Oct, 2025 2 commits
  - Devon Rifkin authored
  - Devon Rifkin authored
- 13 Oct, 2025 2 commits
  - Grace authored
    * working (other than tool calls being in the incorrect order) for tool calls and tools
    * tests work, other than image tags (tests do not go through the server) and tools (not in the correct order, but contents are the same)
    * testing for qwen3vl parser - tool parser is working
    * made changes to the JSON tool parser, wrapping the ToolCallFunction with a ToolCall object
    * working parser for thinking models - assumes a state of thinking, emits unambiguous content in thinking, does not call tools while thinking
    * changed the parser to start by collecting content
    * thinking prefill
    * add hasThinkingSupport parameter to parser
    * qwen3-vl -> qwen3-vl-instruct for renderer/parser
    * add hasThinkingSupport=false to QwenVLParser
    Co-authored-by: Devon Rifkin <drifkin@drifkin.net>
  - Michael Yang authored
    DeepSeek's qwen3 distill uses a different rope scheme, so support both.
- 10 Oct, 2025 1 commit
  - yajianggroup authored
    Signed-off-by: yajianggroup <yajianggroup@outlook.com>
- 09 Oct, 2025 2 commits
  - shengxinjing authored
  - shengxinjing authored
- 03 Oct, 2025 1 commit
  - Grace authored
- 30 Sep, 2025 1 commit
  - Devon Rifkin authored
- 25 Sep, 2025 1 commit
  - Devon Rifkin authored
    When trimming whitespace at the end of every chunk, we were iterating backwards over the string byte-by-byte instead of rune-by-rune. As an example of how this can cause corruption, consider the multi-byte character ✅ (`"\u2705"`), which is represented in UTF-8 as the three bytes `0xE2 0x9C 0x85`. It happens that `0x85` is NEL, which passes `unicode.IsSpace()`. Because we were iterating byte-by-byte, we could mistakenly slice in the middle of the rune, removing `0x85` and leaving `0xE2 0x9C` - beyond being the incorrect place to slice, this is not even valid UTF-8. `trailingWhitespaceLen()` was modified to count from the end in a rune-aware way. Tests with various multibyte unicode characters were also added.
    Fixes: #12414
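The rune-aware fix can be sketched as follows. This is an illustrative reimplementation of a `trailingWhitespaceLen`-style helper, not the repository's exact code:

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// trailingWhitespaceLen counts the bytes of trailing whitespace in s by
// decoding whole runes from the end, so a multi-byte character such as
// ✅ (0xE2 0x9C 0x85) is never split, even though its final byte 0x85 is
// NEL, which passes unicode.IsSpace when examined on its own.
func trailingWhitespaceLen(s string) int {
	n := 0
	for len(s) > 0 {
		r, size := utf8.DecodeLastRuneInString(s)
		if !unicode.IsSpace(r) {
			break
		}
		n += size
		s = s[:len(s)-size]
	}
	return n
}

func main() {
	fmt.Println(trailingWhitespaceLen("ok✅"))    // 0: the rune is not whitespace
	fmt.Println(trailingWhitespaceLen("ok \t\n")) // 3: three single-byte space characters
}
```

Using `utf8.DecodeLastRuneInString` is what makes the loop rune-aware: it steps back by one whole encoded character per iteration, so the slice boundary always falls between runes.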
- 24 Sep, 2025 2 commits
  - Grace authored
    * init deepseek model file
    * temp removal of flash attention implementation
    * shapes are proper, can make a pass
    * query, key, value have good cosine similarity, but the max diff is a bit high
    * attention block is working (with eager for now; have not added the mask line)
    * working MoE at around 0.95 cosine sim
    * added cosine similarity function
    * starting end-to-end structure
    * trying (and failing) to get rope to work; going to test the full thing on tater
    * running on tater36... just not the right outputs
    * we have the right values for rope... but it's still not working?
    * change Extrapolation Factor to 1
    * removed adding residuals twice, removed normalization from the shared expert, refactored norms (Attention, MLP) to be outside the (Attention, MLP) blocks and in the Transformer block instead, added cache setLayer
    * temporary modelfiles for cpu
    * change kpass intermediate step to kv; two layer outputs [0,1] look fine
    * this calls for 16 chicken nuggets
    * whoops
    * cleaning up code
    * delete stuff we don't need
    * getting rid of debug statements for llama cpp
    * working with long contexts
    * fix long context view error
    * reverting some changes I made to files that are not a part of this pr
    * added proper tokenizer for deepseek3
    * clean up model and go test
    * remove Modelfile
    * not passing the tests
    * how to pass the ci tests
    * resolving some of the comments
    * rename
    * linted and renamed deepseek3 -> deepseek2
    * remove name go
    * addressed changes - the main change was adopting the qwen3 naming scheme
    * I cannot with linters
    * clean up logs
    Co-authored-by: Grace Guo <graceguo@Graces-MBP.localdomain>
    Co-authored-by: Grace Guo <graceguo@Graces-MacBook-Pro.local>
    Co-authored-by: graceguo <graceguo@tater36.localdomain>
  - Michael Yang authored
    A leaf node with an alternative name gets all of its alternative names added into the same branch, rather than those names creating branches themselves.