Commits · 83f0ec8269eeaaef993af5b61916919db34e8cb7 · OpenDAS / ollama


11 Mar, 2025 4 commits
- all: address linter errors · 83f0ec82
  jmorganca authored Mar 11, 2025
  
- fix conversion · c62861f4
  Patrick Devine authored Mar 07, 2025
  
- add gemma vision encoder · 4b037a97
  Michael Yang authored Mar 06, 2025
  
- gemma2 impl · 5f74d1fd
  Patrick Devine authored Feb 07, 2025
  
14 Feb, 2025 1 commit

feat: add new Ollama engine using ggml through cgo

Michael Yang authored Feb 14, 2025

This change introduces a new way to run pretrained models. It adds three high-level interfaces and several smaller helper interfaces to support them.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, implement the model's forward propagation in the `Forward` method, which is called to generate completions. This interface can be found in `model/model.go`.
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc.) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`.
- `ml.Tensor` defines the interface for a tensor and tensor operations.
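
The relationship between the three interfaces can be sketched in a few lines of Go. This is a minimal illustration only, not the code from `model/model.go` or `ml/backend.go`: the method sets, the `sliceTensor`, `mapBackend`, and `biasModel` types, and the `"bias"` tensor name are all hypothetical stand-ins, and a plain slice takes the place of a ggml tensor.

```go
package main

import (
	"errors"
	"fmt"
)

// Tensor is a hypothetical, heavily reduced stand-in for ml.Tensor.
type Tensor interface {
	Shape() []int
	Add(other Tensor) (Tensor, error)
}

// Backend is a hypothetical stand-in for ml.Backend: it owns the loaded
// tensors and hands them out by name.
type Backend interface {
	Get(name string) (Tensor, bool)
}

// Model is a hypothetical stand-in for model.Model: an architecture
// expresses its forward pass in Forward.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// sliceTensor is a toy 1-D tensor backed by a Go slice.
type sliceTensor struct{ data []float32 }

func (t sliceTensor) Shape() []int { return []int{len(t.data)} }

// Add returns the element-wise sum of two sliceTensors of equal length.
func (t sliceTensor) Add(other Tensor) (Tensor, error) {
	o, ok := other.(sliceTensor)
	if !ok || len(o.data) != len(t.data) {
		return nil, errors.New("incompatible tensors")
	}
	out := make([]float32, len(t.data))
	for i := range t.data {
		out[i] = t.data[i] + o.data[i]
	}
	return sliceTensor{out}, nil
}

// mapBackend is a toy backend that keeps "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) (Tensor, bool) {
	t, ok := b[name]
	return t, ok
}

// biasModel is a toy architecture whose forward pass adds a bias tensor
// fetched from the backend to the input.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias, ok := b.Get("bias")
	if !ok {
		return nil, errors.New("missing tensor: bias")
	}
	return input.Add(bias)
}

func main() {
	backend := mapBackend{"bias": sliceTensor{[]float32{1, 2, 3}}}
	var m Model = biasModel{}
	out, err := m.Forward(backend, sliceTensor{[]float32{10, 20, 30}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.(sliceTensor).data) // prints [11 22 33]
}
```

The point of the split is that `biasModel` never touches storage directly: it only asks the `Backend` for tensors and combines them through `Tensor` operations, so a different backend could be swapped in without changing the model.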

This is the first implementation of the new engine. Follow-up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080), with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

58245413

16 Jan, 2025 1 commit
- convert: import support for command-r models from safetensors (#6063) · 93a8daf2
  Josh authored Jan 15, 2025
  
  Co-authored-by: Patrick Devine <patrick@infrahq.com>
14 Jan, 2025 1 commit

convert: qwen2 from safetensors (#8408) · f6f37130

Bruce MacDonald authored Jan 14, 2025

Add native support for converting Qwen2 family models (including Qwen2.5)
from safetensors to GGUF format so that they can be run.


10 Sep, 2024 1 commit
- catch when model vocab size is set correctly (#6714) · 84b84ce2
  Patrick Devine authored Sep 09, 2024
  
23 Aug, 2024 1 commit
- convert safetensor adapters into GGUF (#6327) · 0c819e16
  Patrick Devine authored Aug 23, 2024
  
21 Aug, 2024 2 commits
- convert gemma2 · 3546bbd0
  Michael Yang authored Jun 28, 2024
  
- bert · 5a28b9cf
  Michael Yang authored Jun 06, 2024
  
12 Aug, 2024 1 commit
- add conversion for microsoft phi 3 mini/medium 4k, 128 · 6ffb5cb0
  Michael Yang authored Jun 03, 2024
  
31 Jul, 2024 3 commits
- convert: only extract large files · eafc607a
  Michael Yang authored Jun 29, 2024
  
- comments · df993fa3
  Michael Yang authored Jul 08, 2024
  
- refactor convert · 5e9db9fb
  Michael Yang authored May 31, 2024
  
04 Jun, 2024 1 commit
- lint · e40145a3
  Michael Yang authored May 21, 2024
  
20 May, 2024 5 commits
- cleanup · bbbd9f20
  Michael Yang authored May 15, 2024
  
- bpe pretokenizer · 547132e8
  Michael Yang authored May 15, 2024
  
- add fixes for llama · d355d202
  Patrick Devine authored May 08, 2024
  
- llama3 conversion · c8cf0d94
  Patrick Devine authored Apr 28, 2024
  
- some changes for llama3 · d88582df
  Patrick Devine authored Apr 18, 2024
  
06 May, 2024 1 commit

quantize any fp16/fp32 model · 9685c345

Michael Yang authored Apr 12, 2024

- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}


24 Apr, 2024 1 commit
- add mixtral 8x7b model conversion (#3859) · ce8ce825
  Patrick Devine authored Apr 23, 2024
  
15 Apr, 2024 1 commit
- Add llama2 / torch models for `ollama create` (#3607) · 9f8691c6
  Patrick Devine authored Apr 15, 2024
  
06 Apr, 2024 1 commit
- no rope parameters · be517e49
  Michael Yang authored Apr 05, 2024
  
01 Apr, 2024 1 commit
- Simplify model conversion (#3422) · 3b6a9154
  Patrick Devine authored Apr 01, 2024
  
29 Mar, 2024 1 commit
- Add gemma safetensors conversion (#3250) · 5a5efee4
  Patrick Devine authored Mar 28, 2024
  
  Co-authored-by: Michael Yang <mxyng@pm.me>
26 Mar, 2024 1 commit
- change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347) · 1b272d5b
  Patrick Devine authored Mar 26, 2024
  
11 Mar, 2024 1 commit
- convert: fix shape · 9ea492f1
  Michael Yang authored Mar 10, 2024
  
08 Mar, 2024 2 commits
- decode ggla · 76bdebba
  Michael Yang authored Mar 08, 2024
  
- convert: fix default shape · 18979ad4
  Michael Yang authored Mar 08, 2024
  
07 Mar, 2024 1 commit
- Convert Safetensors to an Ollama model (#2824) · 2c017ca4
  Patrick Devine authored Mar 06, 2024
  