- 02 Apr, 2024 4 commits
-
-
Daniel Hiltgen authored
Bump llama.cpp to b2581
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Switch back to subprocessing for llama.cpp
-
- 01 Apr, 2024 17 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Leaving the cudart library loaded kept ~30 MB of memory pinned in the GPU in the main process. This change ensures we don't hold GPU resources when idle.
-
Daniel Hiltgen authored
We may have users that run into problems with our current payload model, so this gives us an escape valve.
-
Daniel Hiltgen authored
"cudart init failure: 35" isn't particularly helpful in the logs.
-
Daniel Hiltgen authored
Cleaner shutdown logic, a bit of response hardening
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.
-
Patrick Devine authored
-
Michael Yang authored
update memory estimations for gpu offloading
-
Michael Yang authored
refactor model parsing
-
Michael Yang authored
fix generate output
-
Michael Yang authored
-
Michael Yang authored
count each layer independently when deciding gpu offloading
-
Michael Yang authored
-
Philipp Gillé authored
-
Saifeddine ALOUI authored
-
Jesse Zhang authored
Corrective Retrieval Augmented Generation Demo, powered by Langgraph and Streamlit
🤗 Support: Ollama, OpenAI APIs
-
- 31 Mar, 2024 2 commits
-
-
Yaroslav authored
Plugins list updated
-
sugarforever authored
* Community Integration: ChatOllama
* fixed typo
-
- 29 Mar, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Patrick Devine authored
Co-authored-by: Michael Yang <mxyng@pm.me>
-
- 28 Mar, 2024 9 commits
-
-
Daniel Hiltgen authored
CI automation for tagging latest images
-
Daniel Hiltgen authored
Bump ROCm to 6.0.2 patch release
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
CI windows gpu builds
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
If we're doing generate, test Windows CUDA and ROCm as well
-
Michael Yang authored
fix: trim quotes on OLLAMA_ORIGINS
-
Michael Yang authored
-
Michael Yang authored
-
- 27 Mar, 2024 6 commits
-
-
Michael Yang authored
fix: workflows
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
only generate on changes to llm subdirectory
-
Michael Yang authored
-
Michael Yang authored
-