- 08 Apr, 2024 2 commits
-
-
Michael Yang authored
-
writinwaters authored
RAGFlow now supports integration with Ollama.
-
- 07 Apr, 2024 1 commit
-
-
Jeffrey Morgan authored
update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528)
-
- 06 Apr, 2024 3 commits
-
-
Thomas Vitale authored
Fixes gh-3514 Signed-off-by: Thomas Vitale <ThomasVitale@users.noreply.github.com>
-
Michael Yang authored
-
Michael Yang authored
-
- 05 Apr, 2024 1 commit
-
-
Michael Yang authored
add command-r graph estimate
-
- 04 Apr, 2024 13 commits
-
-
Daniel Hiltgen authored
Add test case for context exhaustion
-
Daniel Hiltgen authored
fix dll compress in windows building
-
Michael Yang authored
-
Daniel Hiltgen authored
Fail fast if mingw missing on windows
-
Daniel Hiltgen authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
Confirmed this fails on 0.1.30 with known regression but passes on main
-
Daniel Hiltgen authored
CI missing archive
-
Daniel Hiltgen authored
-
mofanke authored
-
Daniel Hiltgen authored
CI subprocess path fix
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Fix CI release glitches
-
- 03 Apr, 2024 8 commits
-
-
Daniel Hiltgen authored
The subprocess change moved the build directory. arm64 builds weren't setting cross-compilation flags when building on x86.
-
Michael Yang authored
update graph size estimate
-
Michael Yang authored
-
Jeffrey Morgan authored
-
Michael Yang authored
default head_kv to 1
-
Blake Mizerany authored
This also moves the checkServerHeartbeat call out of the "RunE" Cobra stuff (that's the only word I have for that) to on-site, after the check for OLLAMA_MODELS, which allows the helpful error message to be printed before the server heartbeat check. This also arguably makes the code more readable without the magic/superfluous "pre" function caller.
-
Daniel Hiltgen authored
Fix numgpu opt miscomparison
-
Pier Francesco Contino authored
Co-authored-by: Pier Francesco Contino <pfcontino@gmail.com>
-
- 02 Apr, 2024 8 commits
-
-
Daniel Hiltgen authored
-
Michael Yang authored
-
Michael Yang authored
fix metal gpu
-
Michael Yang authored
-
Daniel Hiltgen authored
Bump llama.cpp to b2581
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Switch back to subprocessing for llama.cpp
-
- 01 Apr, 2024 4 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Leaving the cudart library loaded kept ~30m of memory pinned in the GPU in the main process. This change ensures we don't hold GPU resources when idle.
-
Daniel Hiltgen authored
We may have users that run into problems with our current payload model, so this gives us an escape valve.
-
Daniel Hiltgen authored
"cudart init failure: 35" isn't particularly helpful in the logs.
-