- 09 Jan, 2026 1 commit
-
-
Daniel Hiltgen authored
* WIP - MLX backend with gemma3
* MLX: add cmake and go tag build toggles
  To build the new MLX backend code:
    cmake --preset MLX
    cmake --build --preset MLX --parallel
    cmake --install build --component MLX
    go build -tags mlx .
  Note: the main.go entrypoint for the MLX engine will change in a follow up commit.
* add experimental image generation runtime
* add experimental image generation runtime
* MLX: wire up cuda build for linux
* MLX: get dependencies correct and dedup
  This is still too large for a unified github artifact, but is now "correct" for the mlx_cuda_v13 directory.
* fix relative link bug in dedup
* Add darwin build and readme
* add go build tag for mlx dependent code and wire up build_darwin.sh
* lint cleanup
* macos: build mlx for x86
  This will be CPU only.
* cuda build instructions and fix drift from mlx bump
* stale comment
* Delete agent helper doc
* Clean up readme.md
* Revise README for tokenizer clarity and details
  Updated README to clarify tokenizer functionality and removed correctness section.
---------
Co-authored-by: jmorganca <jmorganca@gmail.com>
-
- 16 Dec, 2025 1 commit
-
-
Bruce MacDonald authored
Moved the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package. Updated all references to use model.ConfigV2 and model.RootFS. This allows these types to be used in other projects without having to compile the C code in the llama package.
-
- 27 Oct, 2025 1 commit
-
-
Devon Rifkin authored
On main, the `RENDERER` and `PARSER` fields from the `Modelfile` don't get propagated to a new model created with a `req.From` parameter. This is easily triggered via `ollama run qwen3-coder`, then running a save command like `/save qwen3-coder-custom`. Added a regression test for this, and then opened the config for the "from" model in order to use its renderer/parser as a default for the new model. This fixes both CLI and API-based creates. Fixes: https://github.com/ollama/ollama/issues/12792
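The defaulting behaviour described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: the `Config` struct and the `inheritDefaults` helper are hypothetical names, and only the two relevant fields are modelled.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for the model config; only the
// renderer and parser fields discussed in the commit are shown.
type Config struct {
	Renderer string
	Parser   string
}

// inheritDefaults returns cfg with Renderer and Parser filled in from
// base wherever cfg leaves them empty. Explicit values in cfg win.
func inheritDefaults(cfg, base Config) Config {
	if cfg.Renderer == "" {
		cfg.Renderer = base.Renderer
	}
	if cfg.Parser == "" {
		cfg.Parser = base.Parser
	}
	return cfg
}

func main() {
	// The "from" model carries a renderer and parser; the new model
	// sets neither, so both are inherited.
	base := Config{Renderer: "qwen3-coder", Parser: "qwen3-coder"}
	derived := inheritDefaults(Config{}, base)
	fmt.Println(derived.Renderer, derived.Parser)
}
```

Values set explicitly on the new model take precedence, so a custom `PARSER` would survive while the missing `RENDERER` is still inherited.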
-
- 20 Oct, 2025 1 commit
-
-
Michael Yang authored
-
- 17 Sep, 2025 1 commit
-
-
Patrick Devine authored
-
- 06 May, 2025 1 commit
-
-
Daniel Hiltgen authored
* Move quantization logic to GGML via new backend
  This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.
* Remove "add model quantizations"
  This is no longer needed now that quantization is implemented in Go+GGML code directly.
-
- 14 Feb, 2025 1 commit
-
-
Michael Yang authored
feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`
- `ml.Tensor` defines the interface for a tensor and tensor operations

This is the first implementation of the new engine. Follow up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
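How the three interfaces relate can be sketched roughly as below. This is a toy illustration under stated assumptions: the method sets and signatures are simplified and hypothetical, and do not reproduce the actual definitions in `model/model.go` or `ml/backend.go`. The `sliceTensor`, `mapBackend`, and `biasModel` types are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Tensor is a simplified stand-in for ml.Tensor (hypothetical methods).
type Tensor interface {
	Add(other Tensor) Tensor
	Floats() []float32
}

// Backend is a simplified stand-in for ml.Backend: it gives models
// access to loaded tensors by name.
type Backend interface {
	Get(name string) Tensor
}

// Model is a simplified stand-in for model.Model: the architecture's
// forward propagation lives in Forward.
type Model interface {
	Forward(input Tensor) (Tensor, error)
}

// sliceTensor is a toy CPU tensor backed by a Go slice.
type sliceTensor []float32

func (t sliceTensor) Floats() []float32 { return t }

// Add returns the element-wise sum; it assumes equal lengths.
func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.Floats()
	out := make(sliceTensor, len(t))
	for i := range t {
		out[i] = t[i] + o[i]
	}
	return out
}

// mapBackend is a toy backend holding "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) Tensor { return b[name] }

// biasModel is a toy architecture whose forward pass adds one bias tensor.
type biasModel struct{ bias Tensor }

// newBiasModel looks up the tensor it needs from the backend.
func newBiasModel(b Backend) (*biasModel, error) {
	bias := b.Get("bias")
	if bias == nil {
		return nil, errors.New("tensor \"bias\" not found in backend")
	}
	return &biasModel{bias: bias}, nil
}

func (m *biasModel) Forward(input Tensor) (Tensor, error) {
	return input.Add(m.bias), nil
}

func main() {
	var backend Backend = mapBackend{"bias": sliceTensor{1, 2}}
	m, err := newBiasModel(backend)
	if err != nil {
		panic(err)
	}
	var model Model = m
	out, err := model.Forward(sliceTensor{10, 20})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Floats())
}
```

The point of the split is that a model only talks to the `Backend` and `Tensor` interfaces, so the same architecture code could in principle run on a different tensor library.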
-
- 15 Jan, 2025 1 commit
-
-
Patrick Devine authored
-
- 01 Jan, 2025 1 commit
-
-
Patrick Devine authored
Changes `POST /api/create` to accept JSON instead of a Modelfile. This is a breaking change.
-
- 27 Aug, 2024 2 commits
-
-
Michael Yang authored
-
Jeffrey Morgan authored
-
- 02 Aug, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 31 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 22 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 19 Jul, 2024 1 commit
-
-
Josh authored
add template validation to modelfile
-
- 16 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 12 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 11 Jul, 2024 2 commits
-
-
Michael Yang authored
This reverts commit 19753c18 for compatibility. Messages will be added at a later date.
-
Michael Yang authored
-
- 05 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 13 Jun, 2024 1 commit
-
-
Patrick Devine authored
-
- 12 Jun, 2024 1 commit
-
-
Michael Yang authored
multiple templates may appear in a model if a model is created from another model that 1) has an autodetected template and 2) defines a custom template
-
- 07 Jun, 2024 1 commit
-
-
Michael Yang authored
-
- 04 Jun, 2024 1 commit
-
-
Michael Yang authored
-
- 14 May, 2024 1 commit
-
-
Michael Yang authored
-