Commits · 58245413f4df1c56d7e5f03ab6ff20dfafb8daa1 · OpenDAS / ollama

14 Feb, 2025 1 commit

Michael Yang authored Feb 14, 2025

feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It adds three high-level interfaces, plus several smaller helper interfaces, to support this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, implement the model's forward propagation in the `Forward` method, which is called to generate completions. This interface can be found in `model/model.go`.
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc.) and for providing an interface through which Models access loaded tensors. This interface can be found in `ml/backend.go`.
- `ml.Tensor` defines the interface for a tensor and its operations.
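
To make the division of responsibilities concrete, here is a minimal Go sketch of how the three interfaces might relate. Only the names `Model`, `Backend`, `Tensor`, and `Forward` come from the description above. The method signatures, the `Get`, `Add`, and `Floats` methods, and the toy in-memory types are assumptions made for illustration and are not the definitions found in `model/model.go` or `ml/backend.go`.

```go
package main

import "fmt"

// Tensor stands in for ml.Tensor. Both methods are hypothetical.
type Tensor interface {
	Add(other Tensor) Tensor // element-wise sum; assumes equal lengths
	Floats() []float32
}

// Backend stands in for ml.Backend: it owns the loaded tensors and
// lets a model look them up by name. Get is a hypothetical method.
type Backend interface {
	Get(name string) Tensor
}

// Model stands in for model.Model. The Forward signature is assumed.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// sliceTensor is a toy CPU tensor backed by a Go slice.
type sliceTensor []float32

func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.Floats()
	out := make(sliceTensor, len(t))
	for i := range t {
		out[i] = t[i] + o[i]
	}
	return out
}

func (t sliceTensor) Floats() []float32 { return t }

// mapBackend is a toy backend that keeps "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) Tensor { return b[name] }

// biasModel is a toy architecture: its forward pass adds a bias tensor.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias := b.Get("bias")
	if bias == nil {
		return nil, fmt.Errorf("tensor %q not loaded", "bias")
	}
	return input.Add(bias), nil
}

func main() {
	backend := mapBackend{"bias": sliceTensor{1, 1, 1}}
	var m Model = biasModel{}
	out, err := m.Forward(backend, sliceTensor{1, 2, 3})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Floats()) // prints [2 3 4]
}
```

In this sketch the model never touches hardware or storage directly: it asks the backend for tensors and composes tensor operations, which matches the separation the three interfaces are described as providing.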

This is the first implementation of the new engine. Follow-up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080), with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

58245413

13 Feb, 2025 4 commits
- docs: add ollamazing to the README.md (#9075) · 8cf16063
  Bùi Đức Nhật authored Feb 14, 2025
  
  8cf16063
- docs: add H200 as supported device. (#9076) · 3a4449e2
  frob authored Feb 13, 2025
```
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
```
  3a4449e2
- openai: finish_reason as tool_calls for streaming with tools (#7963) · 10d59d5f
  Anuraag (Rag) Agrawal authored Feb 14, 2025
  
  10d59d5f
- build: add -DGGML_CUDA_NO_PEER_COPY=ON for rocm builds on windows (#9060) · a4f69a01
  Jeffrey Morgan authored Feb 13, 2025
  
  a4f69a01

12 Feb, 2025 3 commits
- readme: add Homebrew to package managers section (#9052) · 82658c3e
  Clinton authored Feb 12, 2025
  
  82658c3e
- docs: fix nix package link (#9045) · 378d6e1e
  bloominstrong authored Feb 13, 2025
```
Remove the channel tag from the URL so that the link always points to the current stable channel.
```
  378d6e1e
- doc: fix link for Abso (#9043) · afa55bc7
  Hugues Chocart authored Feb 12, 2025
  
  afa55bc7

11 Feb, 2025 2 commits
- fix: harden backend loading (#9024) · 49df03da
  Michael Yang authored Feb 11, 2025
```
* wrap ggml_backend_load_best in try/catch
* ignore non-ollama paths
```
  49df03da
- readme: add Abso SDK to community integrations (#8973) · 0189bdd0
  Hugues Chocart authored Feb 11, 2025
  
  0189bdd0

10 Feb, 2025 2 commits
- ml/backend/ggml: fix crash on dlopen for non-AVX systems (#8976) · f4711da7
  Jeffrey Morgan authored Feb 10, 2025
  
  f4711da7
- readme: add Lunary to observability community integrations (#8975) · 38117fba
  Hugues Chocart authored Feb 10, 2025
  
  38117fba

08 Feb, 2025 4 commits
- ci: use windows-2022 to sign and bundle (#8941) · 1f766c36
  Michael Yang authored Feb 08, 2025
```
Ollama requires vcruntime140_1.dll, which isn't found on 2019. Previously
the job used the windows runner (2019), but it explicitly installs
2022 to build the app. Since the sign job doesn't actually build
anything, it can use the windows-2022 runner instead.
```
  1f766c36
- docs: add LocalLLM app to community integrations (#8953) · 484a99e4
  Qusai Ismael authored Feb 08, 2025
  
  484a99e4
- docs: ollama zig community lib (#8688) · ec6121c3
  DravenK authored Feb 09, 2025
  
  ec6121c3
- docs: link directly to latest release page for tdm-gcc (#8939) · b86c0a15
  Jeffrey Morgan authored Feb 08, 2025
  
  b86c0a15

07 Feb, 2025 6 commits
- readme: add deepseek to supported models · 7e402ebb
  Guddu Kumar authored Feb 08, 2025
  
  7e402ebb
- docs: improve syntax highlighting in code blocks (#8854) · b901a712
  Azis Alvriyanto authored Feb 08, 2025
  
  b901a712
- add gfx instinct gpus (#8933) · abb8dd57
  Michael Yang authored Feb 07, 2025
  
  abb8dd57
- docs: include port in faq.md OLLAMA_HOST examples (#8905) · a400df48
  Leisure Linux authored Feb 07, 2025
  
  a400df48
- readme: add React Native client to community integrations (#8877) · 6ab4ba4c
  annilq authored Feb 07, 2025
  
  6ab4ba4c
- readme: add ChibiChat to community integrations (#8883) · e8d4eb3e
  CosmicEventHorizon authored Feb 06, 2025
  
  e8d4eb3e

06 Feb, 2025 11 commits
- build(rocm): add numa, elf (#8900) · ae7e368f
  Michael Yang authored Feb 06, 2025
  
  ae7e368f
- readme: add Ollama Chat WebUI for Docker to community integrations (#8084) · 31acd1eb
  oslook authored Feb 07, 2025
  
  31acd1eb
- build(rocm): add tinfo (#8899) · 9a4757ae
  Michael Yang authored Feb 06, 2025
  
  9a4757ae
- docs: add step for removing libraries in linux.md (#8897) · 78140197
  Abhinav Pant authored Feb 07, 2025
  
  78140197
- build: add missing dependencies (#8896) · b698f9a0
  Michael Yang authored Feb 06, 2025
  
  b698f9a0
- format: rename test file from byte_test.go to bytes_test.go (#8865) · 32285a6d
  Azis Alvriyanto authored Feb 07, 2025
  
  32285a6d
- ci: fix linux archive (#8862) · 1c198977
  Michael Yang authored Feb 05, 2025
```
The find returns intermediate directories, which pulls in the parent
directories. It also omits files under lib/ollama.

Switch back to globbing.
```
  1c198977
- readme: add simple-discord-ai to community integrations (#8659) · 330b6c50
  zyphixor authored Feb 05, 2025
  
  330b6c50
- runner: avoid buffer overwrite when generating multiple embeddings (#8714) · 928911bc
  Diego Pereira authored Feb 05, 2025
```
Shield the code processing the embedding result
from subsequent calls that may overwrite the same
buffer to process a second input when retrieving
model embeddings.
```
  928911bc
- chore: update gitattributes (#8860) · 5b446cc8
  Michael Yang authored Feb 05, 2025
```
* chore: update gitattributes
* chore: add build info source
```
  5b446cc8
- readme: add MLflow Tracing as an observability integration (#8811) · 451c1596
  Daniel Lok authored Feb 06, 2025
  
  451c1596
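
The embedding fix in this group (#8714) addresses a general hazard: a result that aliases a reused buffer can be silently overwritten by the next call. One common way to guard against this is to copy the result before the buffer is reused. The Go sketch below shows that pattern under stated assumptions. The `fakeRunner` type and its `embed` and `embedCopy` methods are hypothetical and are not the runner's actual code, and the commit itself may protect the buffer differently.

```go
package main

import "fmt"

// fakeRunner is a hypothetical component that reuses one output
// buffer across calls, the situation the commit message describes.
type fakeRunner struct {
	buf []float32
}

// embed fills the shared buffer with a fake "embedding" derived from
// the input length and returns a slice that aliases that buffer.
func (r *fakeRunner) embed(input string) []float32 {
	for i := range r.buf {
		r.buf[i] = float32(len(input) + i)
	}
	return r.buf
}

// embedCopy returns a private copy, so later calls cannot change it.
func (r *fakeRunner) embedCopy(input string) []float32 {
	out := make([]float32, len(r.buf))
	copy(out, r.embed(input))
	return out
}

func main() {
	r := &fakeRunner{buf: make([]float32, 2)}

	aliased := r.embed("a")  // shares storage with r.buf
	safe := r.embedCopy("a") // independent storage

	r.embed("longer") // a second input overwrites the shared buffer

	fmt.Println(aliased, safe) // prints [6 7] [1 2]
}
```

The aliased slice ends up holding the second input's values, while the copied slice still holds the first result.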

05 Feb, 2025 7 commits
- chore: add optional field for server logs · 932bded1
  Michael Yang authored Feb 05, 2025
  
  932bded1
- ci: fix linux archive · 070ad913
  Michael Yang authored Feb 05, 2025
  
  070ad913
- format: byte formatting test coverage (#8692) · 8d8b9f83
  Azis Alvriyanto authored Feb 06, 2025
```
Removed redundant checks and streamlined the switch-case structure.
Added test cases for both HumanBytes and HumanBytes2 to cover a wide range of scenarios.
```
  8d8b9f83
- docs: add section in development.md on library detection (#8855) · f00d359a
  Jeffrey Morgan authored Feb 05, 2025
  
  f00d359a
- server: increase timeout in stall detection from 5s to 30s (#8831) · 291def6a
  Yashwanth A authored Feb 05, 2025
```
In some cases, downloads slow down due to disk I/O or other factors,
causing the download to restart a part. This makes the download
appear to "reverse" in percent completion. Increasing the timeout to 30s
should make this happen less frequently.
```
  291def6a
- llama: use dynamic backend loading for mllama and clip (#8835) · cd3fbf1c
  Jeffrey Morgan authored Feb 05, 2025
  
  cd3fbf1c
- server: always print upload/download part info (#8832) · c852b8e0
  Jeffrey Morgan authored Feb 04, 2025
  
  c852b8e0