- 31 Jul, 2024 2 commits
-
-
Jeffrey Morgan authored
Better example for multi-modal input
-
jmorganca authored
-
- 30 Jul, 2024 4 commits
-
-
royjhan authored
* add prompt tokens to embed response
* rm slog
* metrics
* types
* prompt n
* clean up
* reset submodule
* update tests
* test name
* list metrics
-
Daniel Hiltgen authored
Prevent partial loading on mixed GPU brands
-
Daniel Hiltgen authored
In multi-brand GPU setups, if we couldn't fully load the model we would fall through the scheduler and mistakenly try to load across a mix of brands. This makes sure we find the set of GPU(s) that best fits the partial load.
-
Kim Hallberg authored
* Update example models
* Remove unused README.md
-
- 29 Jul, 2024 11 commits
-
-
Daniel Hiltgen authored
Better explain multi-gpu behavior
-
Daniel Hiltgen authored
Ensure amd gpu nodes are numerically sorted
-
Daniel Hiltgen authored
Report better error on cuda unsupported os/arch
-
royjhan authored
* hot fix
* backend stream support
* clean up
* finish reason
* move to openai
-
Daniel Hiltgen authored
Explain font problems on windows 10
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Ikko Eltociear Ashimine authored
HuggingFace -> Hugging Face
-
Michael Yang authored
fix: model save
-
Veit Heller authored
-
Jeffrey Morgan authored
-
- 28 Jul, 2024 1 commit
-
-
Michael authored
-
- 27 Jul, 2024 1 commit
-
-
Tibor Schmidt authored
-
- 26 Jul, 2024 9 commits
-
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
Enable windows error dialog for subprocess
-
Michael Yang authored
fix nil deref in auth.go
-
Blake Mizerany authored
This fixes various data races scattered throughout the download/pull client where the client was accessing the download state concurrently. This commit is mostly a hot-fix and will be replaced by a new client one day soon. Also, remove the unnecessary opts argument from downloadChunk.
-
Michael Yang authored
-
Michael Yang authored
autodetect stop parameters from template
-
Michael Yang authored
stop parameter is saved as a slice which is incompatible with modelfile parsing
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 25 Jul, 2024 7 commits
-
-
Michael Yang authored
docs
-
Michael Yang authored
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Blake Mizerany authored
This changes the registry client to reuse the original download URL it gets on the first redirect response for all subsequent requests, preventing thundering herd issues when hot new LLMs are released.
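The idea can be sketched roughly as follows: resolve the redirect once, remember the target, and send later requests straight to it. This is an illustration under assumptions, not the registry client's actual code. `cachedLocation`, `resolve`, and `firstRedirect` are hypothetical names, and the URL in `main` is a placeholder.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// cachedLocation (hypothetical) remembers the redirect target obtained
// by the first lookup so later requests can skip the redirecting server.
// Note: sync.Once also caches a failed lookup; a real client would
// likely want retry and expiry handling.
type cachedLocation struct {
	once sync.Once
	url  string
	err  error
}

// resolve runs lookup at most once and returns the cached result afterwards.
func (c *cachedLocation) resolve(lookup func() (string, error)) (string, error) {
	c.once.Do(func() { c.url, c.err = lookup() })
	return c.url, c.err
}

// firstRedirect issues a GET without following redirects and returns
// the Location header of the response.
func firstRedirect(rawURL string) (string, error) {
	client := &http.Client{
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse // stop at the first redirect
		},
	}
	resp, err := client.Get(rawURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	loc, err := resp.Location()
	if err != nil {
		return "", err
	}
	return loc.String(), nil
}

func main() {
	var cache cachedLocation
	lookups := 0
	// Stand-in for firstRedirect(registryURL) so the example runs offline.
	lookup := func() (string, error) {
		lookups++
		return "https://cdn.example.invalid/blob", nil
	}
	for i := 0; i < 3; i++ {
		u, _ := cache.resolve(lookup)
		fmt.Println(u)
	}
	fmt.Println("lookups:", lookups) // prints "lookups: 1"
}
```

With this shape, only the first chunk request hits the redirecting endpoint; every later request reuses the cached target.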
-
Jeffrey Morgan authored
-
royjhan authored
-
Jeffrey Morgan authored
This reverts commit bb46bbcf.
-
Daniel Hiltgen authored
If we detect an NVIDIA GPU but NVIDIA doesn't support the OS/arch, this reports a better error to the user and points them to docs for self-installing the drivers if possible.
-
- 24 Jul, 2024 4 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
For systems that enumerate over 10 CPUs the default lexicographical sort order interleaves CPUs and GPUs.
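To see why lexicographic order misbehaves here, consider node directories named by index: as strings, "10" sorts before "2". The sketch below contrasts the two orderings. The `nodes/N` paths and the `sortNumerically` helper are assumptions for illustration, not the discovery code from this commit.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strconv"
)

// nodeIndex parses the last path element as an integer index.
func nodeIndex(path string) (int, bool) {
	n, err := strconv.Atoi(filepath.Base(path))
	return n, err == nil
}

// sortNumerically (hypothetical helper) orders paths by numeric index.
// Paths without a numeric name sort after the numeric ones.
func sortNumerically(paths []string) {
	sort.SliceStable(paths, func(i, j int) bool {
		a, okA := nodeIndex(paths[i])
		b, okB := nodeIndex(paths[j])
		switch {
		case okA && okB:
			return a < b
		case okA != okB:
			return okA
		default:
			return paths[i] < paths[j]
		}
	})
}

func main() {
	lexical := []string{"nodes/10", "nodes/2", "nodes/0", "nodes/11", "nodes/1"}
	numeric := append([]string(nil), lexical...)

	sort.Strings(lexical)
	fmt.Println(lexical) // [nodes/0 nodes/1 nodes/10 nodes/11 nodes/2]

	sortNumerically(numeric)
	fmt.Println(numeric) // [nodes/0 nodes/1 nodes/2 nodes/10 nodes/11]
}
```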
-
Michael Yang authored
-
royjhan authored
* float cmp
* increase tolerance
-
- 23 Jul, 2024 1 commit
-
-
Daniel Hiltgen authored
-