Commits · 397cae79620e09220f4207beadd523ce1ae7cbd5 · OpenDAS / ollama

27 Aug, 2024 4 commits
- llm: fix typo in comment (#6530) · 397cae79
  Sean Khatiri authored Aug 27, 2024
- adjust image sizes · 1c70a00f
  Patrick Devine authored Aug 27, 2024
- update the import docs (#6104) · ac80010d
  Patrick Devine authored Aug 26, 2024
- server: clean up route names for consistency (#6524) · 47fa0839
  Jeffrey Morgan authored Aug 26, 2024
25 Aug, 2024 1 commit

Only enable numa on CPUs (#6484) · 0f92b19b

Daniel Hiltgen authored Aug 24, 2024

The numa flag may be having a performance impact on multi-socket systems with GPU loads

23 Aug, 2024 6 commits
- gpu: Group GPU Library sets by variant (#6483) · 69be940b
  Daniel Hiltgen authored Aug 23, 2024
```
The recent cuda variant changes uncovered a bug in ByLibrary
which failed to group by common variant for GPU types.
```
- Merge pull request #5446 from ollama/mxyng/faq · 9638c24c
  Michael Yang authored Aug 23, 2024
```
update faq
```
- update faq · bb362caf
  Michael Yang authored Jul 02, 2024
- convert safetensor adapters into GGUF (#6327) · 0c819e16
  Patrick Devine authored Aug 23, 2024
- gpu: Ensure driver version set before variant (#6480) · 7a1e1c1c
  Daniel Hiltgen authored Aug 23, 2024
```
During rebasing, the ordering was inverted, causing the cuda version
selection logic to break: the driver version was evaluated as zero,
incorrectly causing a downgrade to v11.
```
- llm: Align cmake define for cuda no peer copy (#6455) · 0b03b9c3
  Daniel Hiltgen authored Aug 23, 2024
```
The define was renamed recently, and this use of the old name slipped
through the cracks.
```
22 Aug, 2024 1 commit

Fix embeddings memory corruption (#6467) · 90ca8417

Daniel Hiltgen authored Aug 22, 2024

* Fix embeddings memory corruption

The patch was leading to a buffer overrun corruption.  Once removed though, parallelism
in server.cpp led to hitting an assert due to slot/seq IDs being >= token count.  To
work around this, only use slot 0 for embeddings.

* Fix embed integration test assumption

The token eval count has changed with recent llama.cpp bumps (0.3.5+)

21 Aug, 2024 8 commits
- Merge pull request #6064 from ollama/mxyng/convert-llama3 · 6bd8a4b0
  Michael Yang authored Aug 21, 2024
```
convert: update llama conversion for llama3.1
```
- llama3.1 · 77903ab8
  Michael Yang authored Jul 29, 2024
- Merge pull request #5365 from ollama/mxyng/convert-gemma2 · e22286c9
  Michael Yang authored Aug 21, 2024
```
convert gemma2
```
- Merge pull request #4917 from ollama/mxyng/convert-bert · 107f6959
  Michael Yang authored Aug 21, 2024
```
convert bert model from safetensors
```
- Merge pull request #6386 from zwwhdls/fix-new-layer · 4ecc70d3
  Michael Yang authored Aug 21, 2024
```
fix: chmod new layer to 0o644 when creating it
```
- convert gemma2 · 3546bbd0
  Michael Yang authored Jun 28, 2024
- create bert models from cli · beb49eef
  Michael Yang authored Jun 07, 2024
- bert · 5a28b9cf
  Michael Yang authored Jun 06, 2024
20 Aug, 2024 1 commit

Split rocm back out of bundle (#6432) · a017cf2f

Daniel Hiltgen authored Aug 20, 2024

We're over budget for GitHub's maximum release artifact size with rocm + 2 cuda
versions. This splits rocm back out as a discrete artifact, but keeps the layout so it can
be extracted into the same location as the main bundle.

19 Aug, 2024 17 commits
- CI: remove directories from dist dir before upload step (#6429) · 19e5a890
  Daniel Hiltgen authored Aug 19, 2024
- CI: handle directories during checksum (#6427) · f91c9e37
  Daniel Hiltgen authored Aug 19, 2024
- Merge pull request #6424 from dhiltgen/cuda_v12 · 2df6905e
  Daniel Hiltgen authored Aug 19, 2024
```
Fix overlapping artifact name on CI
```
- Fix overlapping artifact name on CI · d8be22e4
  Daniel Hiltgen authored Aug 19, 2024
- Merge pull request #5049 from dhiltgen/cuda_v12 · 652c273f
  Daniel Hiltgen authored Aug 19, 2024
```
Cuda v12
```
- Merge pull request #6402 from rick-github/numParallel · 88e77050
  Daniel Hiltgen authored Aug 19, 2024
```
Override numParallel in pickBestPartialFitByLibrary() only if unset.
```
- Review comments · f9e31da9
  Daniel Hiltgen authored Aug 15, 2024
- Adjust layout to bin+lib/ollama · 88bb9e33
  Daniel Hiltgen authored Aug 14, 2024
- Remove Jetpack · 3b19cdba
  Daniel Hiltgen authored Aug 13, 2024
- Add windows cuda v12 + v11 support · 927d98a6
  Daniel Hiltgen authored Jul 12, 2024
- Enable cuda v12 flags · f6c811b3
  Daniel Hiltgen authored Jul 12, 2024
- Add cuda v12 variant and selection logic · 4fe3a556
  Daniel Hiltgen authored Jun 13, 2024
```
Based on compute capability and driver version, pick
v12 or v11 cuda variants.
```
- Report GPU variant in log · fc3b4cda
  Daniel Hiltgen authored Jun 19, 2024
- Add Jetson cuda variants for arm · d470ebe7
  Daniel Hiltgen authored May 30, 2024
```
This adds new variants for arm64 specific to Jetson platforms
```
- Wire up ccache and pigz in the docker based build · c7bcb003
  Daniel Hiltgen authored Aug 09, 2024
```
This should help speed things up a little
```
- Refactor linux packaging · 74d45f01
  Daniel Hiltgen authored Jul 08, 2024
```
This adjusts linux to follow a similar model to windows with a discrete archive
(zip/tgz) to carry the primary executable and dependent libraries. Runners are
still carried as payloads inside the main binary.

Darwin retains the payload model where the go binary is fully self contained.
```
- server: limit upload parts to 16 (#6411) · 9fddef37
  Jeffrey Morgan authored Aug 19, 2024
18 Aug, 2024 2 commits
- Fix white space. · 885cf450
  Richard Lyons authored Aug 18, 2024
- Reset NumCtx. · 9352eeb7
  Richard Lyons authored Aug 18, 2024