- 14 Jan, 2025 6 commits
-
-
Jeffrey Morgan authored
-
Bruce MacDonald authored
Add native support for converting Qwen2 family models (including Qwen2.5) from safetensors to GGUF format so they can be run.
-
Steve Berdy authored
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Patrick Devine authored
-
- 13 Jan, 2025 2 commits
-
-
Parth Sareen authored
-
Jeffrey Morgan authored
-
- 11 Jan, 2025 1 commit
-
-
Patrick Devine authored
-
- 10 Jan, 2025 2 commits
-
-
Michael Yang authored
chore: upgrade to gods v2
-
Jeffrey Morgan authored
-
- 09 Jan, 2025 1 commit
-
-
Patrick Devine authored
-
- 08 Jan, 2025 3 commits
-
-
isamu arimoto authored
-
Jeffrey Morgan authored
-
Michael authored
-
- 06 Jan, 2025 1 commit
-
-
frob authored
* Add CUSTOM_CPU_FLAGS.
* Fix golangci-lint error.
Co-authored-by: Richard Lyons <rick@frob.com.au>
-
- 04 Jan, 2025 1 commit
-
-
Ubaldo Porcheddu authored
-
- 03 Jan, 2025 2 commits
-
-
Bruce MacDonald authored
-
Bruce MacDonald authored
These fields are deprecated, and specifying them has no effect. Remove them: the other deprecated fields still work, but these do not, so they don't match our existing pattern.
-
- 01 Jan, 2025 1 commit
-
-
Patrick Devine authored
Changes `POST /api/create` to accept JSON instead of a Modelfile. This is a breaking change.
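As a rough illustration of what this change means for a client, the sketch below builds a JSON body for `POST /api/create` in place of Modelfile text. The field names (`model`, `from`, `system`, `parameters`) are assumptions based on the public API documentation and should be checked against the server version in use. The helper `build_create_request` and the model names are hypothetical.

```python
import json


def build_create_request(model, base, system=None, parameters=None):
    """Build a JSON body for POST /api/create.

    Field names are assumptions, not confirmed by this commit log.
    """
    body = {"model": model, "from": base}
    if system is not None:
        body["system"] = system
    if parameters:
        # Copy so the caller's dict is not shared with the request body.
        body["parameters"] = dict(parameters)
    return json.dumps(body)


# Hypothetical model names, for illustration only.
payload = build_create_request(
    "my-assistant",
    "llama3.2",
    system="You are concise.",
    parameters={"temperature": 0.2},
)
```

The sketch only constructs the body; sending it is left to whichever HTTP client the caller already uses.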
-
- 29 Dec, 2024 4 commits
-
-
Jeffrey Morgan authored
-
Simon Schampijer authored
-
Anas Khan authored
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Jeffrey Morgan authored
-
- 28 Dec, 2024 1 commit
-
-
Emilien Lancelot authored
-
- 27 Dec, 2024 2 commits
-
-
CIIDMike authored
-
Adarsh Mishra authored
-
- 25 Dec, 2024 2 commits
-
-
Jared Donnell authored
-
aritra saha authored
-
- 23 Dec, 2024 3 commits
-
-
Emanuil Rusev authored
-
湛露先生 authored
-
ItzCrazyKns authored
-
- 22 Dec, 2024 1 commit
-
-
Patrick Devine authored
-
- 21 Dec, 2024 1 commit
-
-
Michael Yang authored
gods v2 uses Go generics rather than interfaces, which simplifies the code considerably.
-
- 20 Dec, 2024 2 commits
-
-
Patrick Devine authored
-
Squishedmac authored
-
- 19 Dec, 2024 1 commit
-
-
Parth Sareen authored
This change adds a test to catch a regression in schema_to_grammar where the order of keys in the JSON schema is not preserved in the generated grammar, which is critical for step-by-step reasoning.
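The property being tested can be sketched in a few lines. This is not the real `schema_to_grammar` implementation: `object_rule` is a hypothetical, heavily simplified converter that emits a GBNF-like rule for a flat object, used only to show what "key order is preserved" means and how a regression check might look.

```python
import json


def object_rule(schema):
    """Return a GBNF-like rule for a flat object schema.

    Keys are emitted in the order they appear in the schema, relying on
    Python dicts (and json.loads) preserving insertion order.
    """
    parts = []
    for name in schema["properties"]:
        parts.append(f'"\\"{name}\\":" value')
    return 'root ::= "{" ' + ' "," '.join(parts) + ' "}"'


# "reasoning" is declared before "answer", so the grammar should force the
# model to produce its reasoning first.
schema = json.loads(
    '{"type": "object", "properties": {'
    '"reasoning": {"type": "string"}, "answer": {"type": "string"}}}'
)
rule = object_rule(schema)
```

A converter that sorted keys alphabetically would place `answer` before `reasoning`, which is the kind of regression the commit message describes.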
-
- 18 Dec, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 17 Dec, 2024 2 commits
-
-
Jesse Gross authored
Sometimes the KV cache requires defragmentation even without triggering the threshold heuristic. In this case, decoding will not be able to find a KV cache slot. This is particularly difficult for the caller to handle if it happens in between ubatches. To avoid this, we should immediately trigger a defrag.

In addition, a heavily fragmented cache can require more than max_moves to defragment. Currently, we stop when we hit the limit, but this can leave a cache that still does not have adequate space even after defragmentation is triggered. Instead, we should do multiple batches of processing until everything is complete.

Fixes #7949
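The second point, repeating bounded passes until the cache is compact, can be sketched with a toy model. This is not the actual KV cache code: the cache is modelled as a plain list where `None` marks a free cell, and `defrag_pass` and `defrag_fully` are hypothetical names chosen for illustration.

```python
def defrag_pass(cells, max_moves):
    """Move at most max_moves used cells from the tail into holes near the head.

    Returns the number of moves performed in this pass.
    """
    moves = 0
    lo, hi = 0, len(cells) - 1
    while moves < max_moves:
        # Find the first hole from the front.
        while lo < hi and cells[lo] is not None:
            lo += 1
        # Find the last used cell from the back.
        while lo < hi and cells[hi] is None:
            hi -= 1
        if lo >= hi:
            break  # nothing left to compact
        cells[lo], cells[hi] = cells[hi], None
        moves += 1
    return moves


def defrag_fully(cells, max_moves):
    """Repeat bounded passes until a pass makes no moves.

    Returns the number of passes that moved at least one cell.
    """
    passes = 0
    while defrag_pass(cells, max_moves) > 0:
        passes += 1
    return passes


cache = ["a", None, "b", None, None, "c", None, "d"]
passes = defrag_fully(cache, max_moves=1)
```

Stopping after a single `defrag_pass` with a small `max_moves` would leave holes between used cells, which corresponds to the situation the commit message describes, where the cache still lacks adequate contiguous space after defragmentation.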
-
Blake Mizerany authored
This fixes another regression in the previous commit that fixed other known bugs.
-