Commits · 90ca84172c2a98ecfd76eb7e05cd3e33e1dde507 · OpenDAS / ollama

22 Aug, 2024 1 commit

Fix embeddings memory corruption (#6467) · 90ca8417

Daniel Hiltgen authored Aug 22, 2024

* Fix embeddings memory corruption

The patch was leading to a buffer overrun corruption.  Once removed though, parallelism
in server.cpp led to hitting an assert due to slot/seq IDs being >= token count.  To
work around this, only use slot 0 for embeddings.

* Fix embed integration test assumption

The token eval count has changed with recent llama.cpp bumps (0.3.5+)


21 Aug, 2024 8 commits
- Merge pull request #6064 from ollama/mxyng/convert-llama3 · 6bd8a4b0
  Michael Yang authored Aug 21, 2024
```
convert: update llama conversion for llama3.1
```
- llama3.1 · 77903ab8
  Michael Yang authored Jul 29, 2024
  
- Merge pull request #5365 from ollama/mxyng/convert-gemma2 · e22286c9
  Michael Yang authored Aug 21, 2024
```
convert gemma2
```
- Merge pull request #4917 from ollama/mxyng/convert-bert · 107f6959
  Michael Yang authored Aug 21, 2024
```
convert bert model from safetensors
```
- Merge pull request #6386 from zwwhdls/fix-new-layer · 4ecc70d3
  Michael Yang authored Aug 21, 2024
```
fix: chmod new layer to 0o644 when creating it
```
- convert gemma2 · 3546bbd0
  Michael Yang authored Jun 28, 2024
  
- create bert models from cli · beb49eef
  Michael Yang authored Jun 07, 2024
  
- bert · 5a28b9cf
  Michael Yang authored Jun 06, 2024
  
20 Aug, 2024 1 commit

Split rocm back out of bundle (#6432) · a017cf2f

Daniel Hiltgen authored Aug 20, 2024

We're over budget for GitHub's maximum release artifact size with rocm + 2 cuda
versions. This splits rocm back out as a discrete artifact, but keeps the layout so it can
be extracted into the same location as the main bundle.


19 Aug, 2024 17 commits
- CI: remove directories from dist dir before upload step (#6429) · 19e5a890
  Daniel Hiltgen authored Aug 19, 2024
  
- CI: handle directories during checksum (#6427) · f91c9e37
  Daniel Hiltgen authored Aug 19, 2024
  
- Merge pull request #6424 from dhiltgen/cuda_v12 · 2df6905e
  Daniel Hiltgen authored Aug 19, 2024
```
Fix overlapping artifact name on CI
```
- Fix overlapping artifact name on CI · d8be22e4
  Daniel Hiltgen authored Aug 19, 2024
  
- Merge pull request #5049 from dhiltgen/cuda_v12 · 652c273f
  Daniel Hiltgen authored Aug 19, 2024
```
Cuda v12
```
- Merge pull request #6402 from rick-github/numParallel · 88e77050
  Daniel Hiltgen authored Aug 19, 2024
```
Override numParallel in pickBestPartialFitByLibrary() only if unset.
```
- Review comments · f9e31da9
  Daniel Hiltgen authored Aug 15, 2024
  
- Adjust layout to bin+lib/ollama · 88bb9e33
  Daniel Hiltgen authored Aug 14, 2024
  
- Remove Jetpack · 3b19cdba
  Daniel Hiltgen authored Aug 13, 2024
  
- Add windows cuda v12 + v11 support · 927d98a6
  Daniel Hiltgen authored Jul 12, 2024
  
- Enable cuda v12 flags · f6c811b3
  Daniel Hiltgen authored Jul 12, 2024
  
- Add cuda v12 variant and selection logic · 4fe3a556
  Daniel Hiltgen authored Jun 13, 2024
```
Based on compute capability and driver version, pick
v12 or v11 cuda variants.
```
- Report GPU variant in log · fc3b4cda
  Daniel Hiltgen authored Jun 19, 2024
  
- Add Jetson cuda variants for arm · d470ebe7
  Daniel Hiltgen authored May 30, 2024
```
This adds new variants for arm64 specific to Jetson platforms
```
- Wire up ccache and pigz in the docker based build · c7bcb003
  Daniel Hiltgen authored Aug 09, 2024
```
This should help speed things up a little
```
- Refactor linux packaging · 74d45f01
  Daniel Hiltgen authored Jul 08, 2024
```
This adjusts linux to follow a similar model to windows with a discrete archive
(zip/tgz) to carry the primary executable and dependent libraries. Runners are
still carried as payloads inside the main binary.

Darwin retains the payload model where the Go binary is fully self-contained.
```
- server: limit upload parts to 16 (#6411) · 9fddef37
  Jeffrey Morgan authored Aug 19, 2024
  
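
The "Add cuda v12 variant and selection logic" commit in the list above says the v12 or v11 CUDA variant is picked from compute capability and driver version. A minimal sketch of that kind of decision follows. The `gpuInfo` type, the function name, and both thresholds are assumptions for illustration; the commit message does not state the actual cut-off values.

```go
package main

import "fmt"

// gpuInfo is a hypothetical, simplified GPU description.
type gpuInfo struct {
	ComputeMajor int // major part of the CUDA compute capability
	DriverMajor  int // major part of the CUDA driver version
}

// Assumed thresholds, chosen only to make the example concrete.
const (
	minComputeMajorForV12 = 6
	minDriverMajorForV12  = 12
)

// cudaVariant returns "v12" only when both the GPU and the driver meet the
// assumed minimums; otherwise it falls back to "v11".
func cudaVariant(g gpuInfo) string {
	if g.ComputeMajor < minComputeMajorForV12 || g.DriverMajor < minDriverMajorForV12 {
		return "v11"
	}
	return "v12"
}

func main() {
	gpus := []gpuInfo{
		{ComputeMajor: 5, DriverMajor: 12},
		{ComputeMajor: 8, DriverMajor: 11},
		{ComputeMajor: 8, DriverMajor: 12},
	}
	for _, g := range gpus {
		fmt.Printf("%+v -> %s\n", g, cudaVariant(g))
	}
}
```

Falling back to the older variant whenever either check fails keeps the choice conservative: a newer runtime is selected only when both hardware and driver are known to support it.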
18 Aug, 2024 2 commits
- Fix white space. · 885cf450
  Richard Lyons authored Aug 18, 2024
  
- Reset NumCtx. · 9352eeb7
  Richard Lyons authored Aug 18, 2024
  
17 Aug, 2024 1 commit
- Override numParallel only if unset. · 0ad0e738
  Richard Lyons authored Aug 18, 2024
  
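
"Override numParallel only if unset" describes a common defaulting pattern: a fallback is applied only when the caller has not already chosen a value. The sketch below assumes that "unset" means a value of zero or less and that the fallback is 1; both are illustrative assumptions, and `resolveNumParallel` is a hypothetical name.

```go
package main

import "fmt"

// defaultParallel is an assumed fallback value, for illustration only.
const defaultParallel = 1

// resolveNumParallel treats values <= 0 as "unset" and replaces them with
// the fallback. An explicitly chosen positive value is left untouched.
func resolveNumParallel(numParallel *int) {
	if *numParallel <= 0 {
		*numParallel = defaultParallel
	}
}

func main() {
	unset, explicit := 0, 4
	resolveNumParallel(&unset)
	resolveNumParallel(&explicit)
	fmt.Println(unset, explicit)
}
```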
16 Aug, 2024 1 commit
- fix: chmod new layer to 0o644 when creating it · bdc4308a
  zwwhdls authored Aug 16, 2024
```
Signed-off-by: zwwhdls <zww@hdls.me>
```
15 Aug, 2024 4 commits
- Merge pull request #6381 from eust-w/main · d29cd4c2
  Daniel Hiltgen authored Aug 15, 2024
```
fix: Add tooltip to system tray icon
```
- fix: Add tooltip to system tray icon · a84c05cf
  eust-w authored Aug 16, 2024
```
- Updated setIcon method to include tooltip text for the system tray icon.
- Added NIF_TIP flag and set the tooltip text using UTF16 encoding.

Resolves: #6372
```
- Merge pull request #6363 from ollama/mxyng/fix-noprune · e3d7f32a
  Michael Yang authored Aug 15, 2024
```
fix: noprune on pull
```
- only skip invalid json manifests · 3a75e74e
  Michael Yang authored Aug 15, 2024
  
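
One way to read "only skip invalid json manifests" is that a manifest which fails to parse as JSON is skipped, while any other error still aborts the operation. The sketch below shows that distinction using `errors.As` with `*json.SyntaxError`. The `manifest` type and `parseManifests` function are hypothetical and greatly simplified; they are not the repository's code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// manifest is a hypothetical, minimal manifest shape.
type manifest struct {
	SchemaVersion int `json:"schemaVersion"`
}

// parseManifests decodes each entry. Entries that are not valid JSON are
// skipped; any other decoding error is returned to the caller.
func parseManifests(files map[string][]byte) (map[string]manifest, error) {
	out := make(map[string]manifest)
	for name, data := range files {
		var m manifest
		if err := json.Unmarshal(data, &m); err != nil {
			var syntaxErr *json.SyntaxError
			if errors.As(err, &syntaxErr) {
				// Malformed JSON: skip this file, keep the rest.
				continue
			}
			return nil, fmt.Errorf("%s: %w", name, err)
		}
		out[name] = m
	}
	return out, nil
}

func main() {
	files := map[string][]byte{
		"good": []byte(`{"schemaVersion": 2}`),
		"bad":  []byte(`{not json`),
	}
	manifests, err := parseManifests(files)
	fmt.Println(len(manifests), err)
}
```

Limiting the skip to syntax errors means that well-formed JSON with unexpected field types still surfaces as an error instead of being silently ignored.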
14 Aug, 2024 4 commits

- skip invalid manifest files · 237dccba
  Michael Yang authored Aug 14, 2024
- fix noprune · b3f75fc8
  Michael Yang authored Aug 14, 2024
- add `CONTRIBUTING.md` (#6349) · 8200c371
  Jeffrey Morgan authored Aug 14, 2024

Fix typo and improve readability (#5964) · 0a8d6ea8

longtao authored Aug 14, 2024



* Fix typo and improve readability

Summary:
* Rename updatAvailableMenuID to updateAvailableMenuID
* Replace unused cmd parameter with _ in RunServer function
* Fix typos in comments

(cherry picked from commit 5b8715f0b04773369e8eb1f9e6737995a0ab3ba7)

* Update api/client.go
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>


13 Aug, 2024 1 commit

server: reduce max connections used in download (#6347) · 8e1050f3

Blake Mizerany authored Aug 13, 2024

The previous value of 64 was WAY too high and unnecessary. It reached
diminishing returns and blew past it. This is a more reasonable number
for _most_ normal cases. For users on cloud servers with excellent
network quality, this will keep screaming for them, without hitting our
CDN limits. For users with relatively poor network quality, this will
keep them from saturating their network and causing other issues.
