- 20 Mar, 2025 1 commit
-
-
Patrick Devine authored
This change allows the gemma3 template to be autodetected during `ollama create`.
-
- 14 Feb, 2025 1 commit
-
-
Michael Yang authored
feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`
- `ml.Tensor` defines the interface for a tensor and tensor operations

This is the first implementation of the new engine. Follow up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
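The way the three interfaces described in this commit message relate to each other can be sketched in Go as follows. This is a toy illustration only: the method sets, `sliceTensor`, `mapBackend`, and `biasModel` are all hypothetical and far smaller than whatever `model/model.go` and `ml/backend.go` actually define.

```go
package main

import "fmt"

// Tensor is a hypothetical, heavily reduced tensor interface.
type Tensor interface {
	Data() []float32
	Add(other Tensor) Tensor
}

// Backend is a hypothetical stand-in for a tensor library that
// exposes loaded tensors by name.
type Backend interface {
	Get(name string) Tensor
}

// Model is a hypothetical stand-in for a model architecture whose
// forward pass lives in Forward.
type Model interface {
	Forward(input Tensor) Tensor
}

// sliceTensor is a toy Tensor backed by a Go slice.
type sliceTensor []float32

func (t sliceTensor) Data() []float32 { return t }

// Add returns the element-wise sum; it panics on a length mismatch.
func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.Data()
	if len(o) != len(t) {
		panic("tensor length mismatch")
	}
	out := make(sliceTensor, len(t))
	for i := range t {
		out[i] = t[i] + o[i]
	}
	return out
}

// mapBackend is a toy Backend that keeps tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) Tensor { return b[name] }

// biasModel is a toy Model: its forward pass adds a "bias" tensor
// fetched from the backend to the input.
type biasModel struct {
	backend Backend
}

func (m biasModel) Forward(input Tensor) Tensor {
	return input.Add(m.backend.Get("bias"))
}

func main() {
	var m Model = biasModel{backend: mapBackend{"bias": sliceTensor{1, 2}}}
	fmt.Println(m.Forward(sliceTensor{10, 20}).Data())
}
```

The point of the split is that a model only talks to `Tensor` and `Backend`, so the tensor library underneath could in principle be swapped without touching the forward pass.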
-
- 01 Jan, 2025 1 commit
-
-
Patrick Devine authored
Replaces `POST /api/create` to use JSON instead of a Modelfile. This is a breaking change.
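A rough Go sketch of what a JSON-based create request might look like is given here. The field names `model`, `from`, and `system`, the model names, and the `http://localhost:11434` base URL are assumptions for illustration; the commit message itself does not list the request fields, so check the API documentation before relying on them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// createRequest is a hypothetical, reduced request body for
// POST /api/create. Field names are assumptions.
type createRequest struct {
	Model  string `json:"model"`
	From   string `json:"from,omitempty"`
	System string `json:"system,omitempty"`
}

// newCreateRequest builds (but does not send) the HTTP request and
// also returns the encoded JSON body for inspection.
func newCreateRequest(baseURL string, cr createRequest) (*http.Request, []byte, error) {
	body, err := json.Marshal(cr)
	if err != nil {
		return nil, nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/create", bytes.NewReader(body))
	if err != nil {
		return nil, nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, body, nil
}

func main() {
	req, body, err := newCreateRequest("http://localhost:11434", createRequest{
		Model:  "mario",
		From:   "llama3.2",
		System: "You are Mario.",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(string(body))
}
```

Clients that previously sent Modelfile text in the request body would need to move to structured fields along these lines, which is why the change is breaking.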
-
- 18 Oct, 2024 1 commit
-
-
Patrick Devine authored
Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>
-
- 18 Sep, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 23 Aug, 2024 1 commit
-
-
Patrick Devine authored
-
- 12 Aug, 2024 3 commits
- 08 Aug, 2024 1 commit
-
-
Jesse Gross authored
Commit 1829fb61 ("manifest: Fix crash on startup when trying to clean up unused files (#5840)") changed the config layer stored in manifests from a pointer to a value. This was done in order to avoid potential nil pointer dereferences after it is deserialized from JSON in the event that the field is missing. This changes the Layers slice to also be stored by value. This enables consistency in handling across the two objects.
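The difference between pointer and value fields after JSON decoding can be illustrated with a minimal Go sketch. `Layer`, `ptrManifest`, and `valManifest` are hypothetical stand-ins, not the actual manifest types in the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Layer is a hypothetical stand-in for a manifest layer.
type Layer struct {
	Digest string `json:"digest"`
}

// ptrManifest stores its config and layers as pointers.
type ptrManifest struct {
	Config *Layer   `json:"config"`
	Layers []*Layer `json:"layers"`
}

// valManifest stores its config and layers by value.
type valManifest struct {
	Config Layer   `json:"config"`
	Layers []Layer `json:"layers"`
}

func decodePtr(data []byte) (ptrManifest, error) {
	var m ptrManifest
	err := json.Unmarshal(data, &m)
	return m, err
}

func decodeVal(data []byte) (valManifest, error) {
	var m valManifest
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	// The "config" field is deliberately missing from this JSON.
	data := []byte(`{"layers":[{"digest":"sha256:abc"}]}`)

	p, err := decodePtr(data)
	if err != nil {
		panic(err)
	}
	// With a pointer field, a missing key leaves Config nil, so
	// p.Config.Digest would panic without a nil check.
	fmt.Println("pointer config is nil:", p.Config == nil)

	v, err := decodeVal(data)
	if err != nil {
		panic(err)
	}
	// With a value field, a missing key leaves a zero-value Layer,
	// which is always safe to read.
	fmt.Println("value config digest is empty:", v.Config.Digest == "")
	fmt.Println("first layer digest:", v.Layers[0].Digest)
}
```

Storing both the config and the layers by value means callers can treat the two the same way, without nil checks on one and not the other.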
-
- 31 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 22 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 18 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 17 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 16 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 15 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 12 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 01 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 27 Jun, 2024 1 commit
-
-
Michael Yang authored
-
- 25 Jun, 2024 1 commit
-
-
Blake Mizerany authored
Previously, some costly things were causing the loading of GGUF files and their metadata and tensor information to be VERY slow:

* Too many allocations when decoding strings
* Hitting disk for each read of each key and value, resulting in a not-okay amount of syscalls/disk I/O.

The show API is now down to 33ms from 800ms+ for llama3 on a macbook pro m3.

This commit also prevents collecting large arrays of values when decoding GGUFs (if desired). When such keys are encountered, their values are null, and are encoded as such in JSON.

Also, this fixes a broken test that was not encoding valid GGUF.
-
- 12 Jun, 2024 1 commit
-
-
Michael Yang authored
multiple templates may appear in a model if a model is created from another model that 1) has an autodetected template and 2) defines a custom template
-
- 04 Jun, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 20 May, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
particularly useful for zipfiles and f16s
-
- 06 May, 2024 5 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
-