- 25 Jan, 2024 7 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
* Fix clearing kv cache between requests with the same prompt
* fix powershell script
-
Patrick Devine authored
-
Michael Yang authored
stub generate outputs for lint
-
Michael Yang authored
refactor tensor read
-
Jeffrey Morgan authored
-
Michael Yang authored
-
- 24 Jan, 2024 5 commits
-
-
Daniel Hiltgen authored
More logging for gpu management
-
Michael Yang authored
-
Daniel Hiltgen authored
Fix an ordering glitch of dlerr/dlclose and add more logging to help root cause some crashes users are hitting. This also refines the function pointer names to use the underlying function names instead of simplified names for readability.
-
Daniel Hiltgen authored
Report more information about GPUs in verbose mode
-
Jeffrey Morgan authored
-
- 23 Jan, 2024 8 commits
-
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
This adds additional calls to both CUDA and ROCm management libraries to discover additional attributes about the GPU(s) detected in the system, and wires up runtime verbosity selection. When users hit problems with GPUs we can ask them to run with `OLLAMA_DEBUG=1 ollama serve` and share the results.
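The runtime verbosity selection described above can be sketched in Go roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `levelFromEnv` is hypothetical, and the sketch assumes that any non-empty `OLLAMA_DEBUG` value enables debug output and that logs go to stderr through the standard `log/slog` package.

```go
package main

import (
	"log/slog"
	"os"
)

// levelFromEnv is a hypothetical helper. It returns slog.LevelDebug when
// OLLAMA_DEBUG is set to any non-empty value (an assumption made for this
// sketch) and slog.LevelInfo otherwise. The lookup function is injected so
// the logic can be exercised without touching the real environment.
func levelFromEnv(lookup func(string) string) slog.Level {
	if lookup("OLLAMA_DEBUG") != "" {
		return slog.LevelDebug
	}
	return slog.LevelInfo
}

func main() {
	level := levelFromEnv(os.Getenv)

	// Send all log records to stderr, filtered by the selected level.
	handler := slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: level})
	logger := slog.New(handler)

	logger.Info("starting up")
	// Emitted only when debug verbosity is selected.
	logger.Debug("extra GPU details would be logged here")
}
```

Run as `OLLAMA_DEBUG=1 go run .` to see the debug record; without the variable only the info record appears.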
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
Set a default version using git describe
-
Daniel Hiltgen authored
If a VERSION is not specified, this will generate a version string that represents the state of the repo. For example, `0.1.21-12-gffaf52e1-dirty` means 12 commits past the 0.1.21 tag, on commit gffaf52e1, with a dirty tree.
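To make the structure of such a version string concrete, here is a small Go sketch that splits it into its parts. It assumes the usual `git describe` layout of `<tag>-<count>-g<hash>` with an optional `-dirty` suffix. In that layout the leading `g` is a prefix and the abbreviated hash follows it. The type `DescribeInfo` and the function `ParseDescribe` are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// DescribeInfo holds the parts of a git-describe style version string.
// The type and its field names are hypothetical.
type DescribeInfo struct {
	Tag     string // nearest tag, e.g. "0.1.21"
	Commits int    // number of commits since that tag (0 if exactly on the tag)
	Hash    string // abbreviated commit hash without the leading "g"
	Dirty   bool   // true if the string ends in "-dirty"
}

// The tag is matched lazily so that the optional "-<count>-g<hash>" and
// "-dirty" suffixes are split off rather than absorbed into the tag.
var describeRe = regexp.MustCompile(`^(.+?)(?:-(\d+)-g([0-9a-f]+))?(-dirty)?$`)

// ParseDescribe splits a version string into its components.
func ParseDescribe(s string) (DescribeInfo, error) {
	m := describeRe.FindStringSubmatch(s)
	if m == nil {
		return DescribeInfo{}, fmt.Errorf("unrecognized version string %q", s)
	}
	info := DescribeInfo{Tag: m[1], Hash: m[3], Dirty: m[4] != ""}
	if m[2] != "" {
		n, err := strconv.Atoi(m[2])
		if err != nil {
			return DescribeInfo{}, err
		}
		info.Commits = n
	}
	return info, nil
}

func main() {
	info, err := ParseDescribe("0.1.21-12-gffaf52e1-dirty")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", info)
}
```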
-
Daniel Hiltgen authored
Refine Accelerate usage on mac
-
Daniel Hiltgen authored
On older Macs, Accelerate seems to cause crashes, but on AVX2-capable Macs it does not.
-
- 22 Jan, 2024 12 commits
-
-
Jeffrey Morgan authored
-
Michael Yang authored
faq: update to use launchctl setenv
-
Daniel Hiltgen authored
Refine debug logging for llm
-
Michael Yang authored
-
Daniel Hiltgen authored
Debug logging on init failure
-
Daniel Hiltgen authored
This wires up logging in llama.cpp to always go to stderr, and also turns up logging if OLLAMA_DEBUG is set.
-
Daniel Hiltgen authored
-
Jeffrey Morgan authored
-
Michael Yang authored
fix: remove overwritten model layers
-
Meng Zhuo authored
-
Daniel Hiltgen authored
Make CPU builds parallel and customizable AMD GPUs
-
Daniel Hiltgen authored
Probe GPUs before backend init
-
- 21 Jan, 2024 5 commits
-
-
Daniel Hiltgen authored
Detect potential error scenarios so we can fall back to CPU mode without hitting asserts.
-
Daniel Hiltgen authored
The Linux build now supports parallel CPU builds to speed things up. This also exposes AMD GPU targets as an optional setting for advanced users who want to alter our default set.
-
Daniel Hiltgen authored
Combine the 2 Dockerfiles and add ROCm
-
Daniel Hiltgen authored
This renames Dockerfile.build to Dockerfile and adds new stages to support two modes of building: the build_linux.sh script uses intermediate stages to extract the artifacts for ./dist, and the default build generates a container image usable by both CUDA and ROCm cards. This required transitioning the x86 base to the ROCm image to avoid layer bloat.
-
Jeffrey Morgan authored
-
- 20 Jan, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 19 Jan, 2024 1 commit
-
-
Michael Yang authored
if create overrides a manifest, first add the older manifest's layers to the delete map so they can be cleaned up
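The cleanup idea in this commit message can be sketched in Go as below. This is an illustration under stated assumptions, not the actual implementation: `Manifest` is a hypothetical, simplified type holding only layer digests, `layersToDelete` is a hypothetical helper, and the step that drops layers still referenced by the new manifest is an assumption about how a delete map would typically be used.

```go
package main

import (
	"fmt"
	"sort"
)

// Manifest is a hypothetical, simplified stand-in for a model manifest:
// it records only the digests of the layers it references.
type Manifest struct {
	Layers []string
}

// layersToDelete seeds a delete map with every layer of the previous
// manifest, then removes the layers that the replacement manifest still
// references (an assumption of this sketch). What remains is the set of
// layers that are candidates for cleanup.
func layersToDelete(prev, next Manifest) map[string]struct{} {
	deleteMap := make(map[string]struct{})
	for _, digest := range prev.Layers {
		deleteMap[digest] = struct{}{}
	}
	for _, digest := range next.Layers {
		delete(deleteMap, digest)
	}
	return deleteMap
}

func main() {
	prev := Manifest{Layers: []string{"sha256:aaa", "sha256:bbb", "sha256:ccc"}}
	next := Manifest{Layers: []string{"sha256:bbb", "sha256:ddd"}}

	// Sort the digests so the printed order is stable across runs.
	var stale []string
	for digest := range layersToDelete(prev, next) {
		stale = append(stale, digest)
	}
	sort.Strings(stale)
	fmt.Println(stale)
}
```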
-