- 13 Oct, 2025 1 commit
Gabe Goodhart authored
Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552)

* feat: Bump llama.cpp to df1b612
* fix(mtmd): Correctly encode text chunks during mtmd tokenization. For some models, text chunks that contain template delimiter tokens can appear interspersed with the image embeddings; these need to be correctly translated to text tokens (see the sketch below).
* tests: Use MtmdChunk in image_test
* style: Fix unnecessary conversion linting
* fix(ggml): Revert changes to ggml_hip.cpp. These changes were done largely by our code assistant and are likely wrong.
* fix: Revert changes in mem_nvml.cpp
* feat: Update sync point to 1deee0. This brings in several more optimization commits and model support for EmbeddingGemma.
* feat: Update patches for 1deee0
* feat: sync for bump to 1deee0
* fix: Bad patch updates with errant `+`
* feat: Bump llama.cpp/ggml to 7049736
* fix: format-patches after latest bump

Branch: LlamaCPPBump-GraniteDocling
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
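For context on the mtmd fix above: during multimodal tokenization the prompt is split into interleaved text and image chunks, and the text chunks (which may carry template delimiter tokens between images) have to be appended as ordinary text tokens rather than dropped. Below is a minimal Go sketch of that iteration, assuming hypothetical `Chunk`, `ChunkText`, and `ChunkImage` types; these are illustrative stand-ins, not ollama's or llama.cpp's actual mtmd bindings.

```go
package main

import "fmt"

// Hypothetical chunk types standing in for mtmd's tokenization output:
// a prompt is split into interleaved text and image chunks.
type ChunkKind int

const (
	ChunkText ChunkKind = iota
	ChunkImage
)

type Chunk struct {
	Kind   ChunkKind
	Tokens []int     // text tokens, only set for text chunks
	Embed  []float32 // image embedding, only set for image chunks
}

// encodeChunks walks the tokenized chunks in order. Text chunks (which may
// carry template delimiter tokens between images) are appended as regular
// text tokens; image chunks are passed through as embeddings.
func encodeChunks(chunks []Chunk) (textTokens []int, images [][]float32) {
	for _, c := range chunks {
		switch c.Kind {
		case ChunkText:
			textTokens = append(textTokens, c.Tokens...)
		case ChunkImage:
			images = append(images, c.Embed)
		}
	}
	return textTokens, images
}

func main() {
	chunks := []Chunk{
		{Kind: ChunkText, Tokens: []int{1, 2}},         // e.g. an image-start delimiter
		{Kind: ChunkImage, Embed: []float32{0.1, 0.2}}, // image embedding
		{Kind: ChunkText, Tokens: []int{3}},            // e.g. an image-end delimiter
	}
	toks, imgs := encodeChunks(chunks)
	fmt.Println(toks, len(imgs))
}
```

The point of the fix, as described in the commit message, is simply that the text branch of this loop must run for every text chunk, including the ones that only hold template delimiters.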
- 02 Oct, 2025 1 commit
Daniel Hiltgen authored
Notable EOLs with this change:
- macOS v12 and v13 are no longer supported (v14+ required)
- AMD gfx900 and gfx906 are no longer supported
- 14 Aug, 2025 1 commit
Michael Yang authored
* TEMPORARY: Update the llama.cpp upstream to my fork's Granite Four branch. This will be redone once my branch is merged upstream in llama.cpp.
* feat: Update all patches. There are a number that are no longer needed at all:
  - 0003-embeddings: Embeddings entirely overhauled on master
  - 0008-ensure-KV-cache-is-fully-defragmented: KV caching entirely overhauled on master
  - 0019-metal-add-mean-kernel-14267: Merged upstream
  - 0020-CUDA-add-mean-operation-14313: Merged upstream
* feat: Sync llama.cpp and ggml
* fix: Update rsync-filter for all moved/new/removed files (filter matching is sketched below)
* fix: Add files missing from sync
* fix: Update ggml rsync-filter for new ggml-cpu/arch subdirs
* fix: Add ggml files missing from sync
* fix: Narrow llama.cpp rsync-filter to not include mtmd main tool cpp files
* fix: Remove mtmd main cpp files
* fix: Add missing include in sampling_ext.cpp
* fix: Update llama.go to use mtmd instead of clip/llava
* fix: Add patch for mtmd_input_text
* ...
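As background for the rsync-filter changes listed above: the filter decides which upstream llama.cpp/ggml files get vendored by matching paths against an ordered list of include/exclude rules, first match wins. The Go sketch below illustrates that kind of matching; the rule set, the `shouldSync` helper, and the example paths are hypothetical, not the repository's actual sync tooling or filter file.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// rule is an ordered include (true) or exclude (false) pattern, loosely
// modelled on rsync filter syntax; the first matching rule decides whether
// a file is synced.
type rule struct {
	include bool
	pattern string
}

// shouldSync reports whether the path is kept by the filter list.
// Patterns are matched against the full path, the base name, or as a
// directory prefix; real rsync filters are richer, this is only an illustration.
func shouldSync(rules []rule, p string) bool {
	for _, r := range rules {
		if ok, _ := path.Match(r.pattern, p); ok {
			return r.include
		}
		if ok, _ := path.Match(r.pattern, path.Base(p)); ok {
			return r.include
		}
		if strings.HasSuffix(r.pattern, "/") && strings.HasPrefix(p, r.pattern) {
			return r.include
		}
	}
	return false // default: not vendored
}

func main() {
	// Hypothetical rules: keep mtmd library sources but not its main tool .cpp file.
	rules := []rule{
		{false, "tools/mtmd/mtmd-cli.cpp"},
		{true, "tools/mtmd/"},
		{true, "src/*.cpp"},
	}
	for _, f := range []string{"tools/mtmd/mtmd.cpp", "tools/mtmd/mtmd-cli.cpp", "src/llama.cpp"} {
		fmt.Println(f, shouldSync(rules, f))
	}
}
```

Narrowing the filter, as several of the fixes above do, amounts to adding earlier exclude rules so files such as the mtmd main tool sources stop matching the broader includes.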
- 14 May, 2025 1 commit
Michael Yang authored
- 12 May, 2025 1 commit
Jeffrey Morgan authored
- 02 May, 2025 1 commit
Jeffrey Morgan authored
- 25 Apr, 2025 1 commit
Jeffrey Morgan authored
- 16 Apr, 2025 1 commit
Jeffrey Morgan authored
- 27 Feb, 2025 1 commit
Jeffrey Morgan authored
- 29 Jan, 2025 1 commit
Michael Yang authored
* add build to .dockerignore
* test: only build one arch
* add build to .gitignore
* fix ccache path
* filter amdgpu targets
* only filter if autodetecting
* Don't clobber gpu list for default runner. This ensures the GPU specific environment variables are set properly.
* explicitly set CXX compiler for HIP
* Update build_windows.ps1. This isn't complete, but is close. Dependencies are missing, and it only builds the "default" preset.
* build: add ollama subdir
* add .git to .dockerignore
* docs: update development.md
* update build_darwin.sh
* remove unused scripts
* llm: add cwd and build/lib/ollama to library paths
* default DYLD_LIBRARY_PATH to LD_LIBRARY_PATH in runner on macOS
* add additional cmake output vars for msvc
* interim edits to make server detection logic work with dll directories like lib/ollama/cuda_v12
* remove unnecessary filepath.Dir, cleanup
* add hardware-specific directory to path
* use absolute server path
* build: linux arm
* cmake install targets
* remove unused files
* ml: visit each library path once (see the sketch below)
* build: skip cpu variants on arm
* build: install cpu targets
* build: fix workflow
* shorter names
* fix rocblas install
* docs: clean up development.md
* consistent build dir removal in development.md
* silence -Wimplicit-function-declaration build warnings in ggml-cpu
* update readme
* update development readme
* llm: update library lookup logic now that there is one runner (#8587)
* tweak development.md
* update docs
* add windows cuda/rocm tests

Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
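Two of the runner changes above (visiting each library path once, and defaulting DYLD_LIBRARY_PATH to LD_LIBRARY_PATH on macOS) are straightforward to sketch. The snippet below is an illustrative Go version under assumed names (`dedupPaths`, `libraryPathEnv`); it is not the code that actually landed in the PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"strings"
)

// dedupPaths visits each library path only once, preserving order and
// normalizing entries to absolute paths.
func dedupPaths(paths []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, p := range paths {
		if abs, err := filepath.Abs(p); err == nil {
			p = abs
		}
		if !seen[p] {
			seen[p] = true
			out = append(out, p)
		}
	}
	return out
}

// libraryPathEnv builds the environment entries a runner process might use:
// LD_LIBRARY_PATH everywhere, with DYLD_LIBRARY_PATH defaulted to the same
// value on macOS when it is not already set.
func libraryPathEnv(paths []string) []string {
	value := strings.Join(dedupPaths(paths), string(os.PathListSeparator))
	env := []string{"LD_LIBRARY_PATH=" + value}
	if runtime.GOOS == "darwin" && os.Getenv("DYLD_LIBRARY_PATH") == "" {
		env = append(env, "DYLD_LIBRARY_PATH="+value)
	}
	return env
}

func main() {
	// The duplicate entry is collapsed; relative paths become absolute.
	fmt.Println(libraryPathEnv([]string{"build/lib/ollama", ".", "build/lib/ollama"}))
}
```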
- 08 Jan, 2025 1 commit
Jeffrey Morgan authored
- 14 Dec, 2024 1 commit
Jeffrey Morgan authored
- 11 Dec, 2024 1 commit
Jeffrey Morgan authored
- 26 Oct, 2024 1 commit
Daniel Hiltgen authored
On Windows, when compiled with gcc, the C++ regex library failed to handle the characters