Commits · 47c2b947a954292fbe6f9d7470d7d2ce75bd838c · OpenDAS / ollama

28 Aug, 2024 8 commits
- Merge pull request #6546 from ollama/mxyng/fix-test · 47c2b947
  Michael Yang authored Aug 28, 2024
```
fix(test): do not clobber models directory
```
- Merge pull request #6539 from ollama/mxyng/validate-modelpath · 5eb77bf9
  Michael Yang authored Aug 28, 2024
```
fix: validate modelpath
```
- fix(test): do not clobber models directory · e4d0a9c3
  Michael Yang authored Aug 28, 2024
  
- add llama3.1 chat template (#6545) · 7416ced7
  Patrick Devine authored Aug 28, 2024
  
- Merge pull request #6522 from ollama/mxyng/detect-chat · 9cfd2dd3
  Michael Yang authored Aug 28, 2024
```
detect chat template from configs that contain lists
```
- update deprecated warnings · 8e6da3cb
  Michael Yang authored Aug 27, 2024
  
- validate model path · d9d50c43
  Michael Yang authored Aug 27, 2024
  
- throw an error when encountering unsupported tensor sizes (#6538) · 6c1c1ad6
  Patrick Devine authored Aug 27, 2024
  
27 Aug, 2024 11 commits
- Move ollama executable out of bin dir (#6535) · 93ea9240
  Daniel Hiltgen authored Aug 27, 2024
  
- more tokenizer tests · 60e47573
  Michael Yang authored Aug 27, 2024
  
- add safetensors to the modelfile docs (#6532) · d13c3daa
  Patrick Devine authored Aug 27, 2024
  
- Fix import image width (#6528) · 1713eddc
  Patrick Devine authored Aug 27, 2024
  
- Update manual instructions with discrete ROCm bundle (#6445) · 4e1c4f6e
  Daniel Hiltgen authored Aug 27, 2024
  
- llm: fix typo in comment (#6530) · 397cae79
  Sean Khatiri authored Aug 27, 2024
  
- adjust image sizes · 1c70a00f
  Patrick Devine authored Aug 27, 2024
  
- clean up convert tokenizer · eae3af68
  Michael Yang authored Aug 27, 2024
  
- detect chat template from configs that contain lists · 3eb08377
  Michael Yang authored Aug 26, 2024
  
- update the import docs (#6104) · ac80010d
  Patrick Devine authored Aug 26, 2024
  
- server: clean up route names for consistency (#6524) · 47fa0839
  Jeffrey Morgan authored Aug 26, 2024
  
25 Aug, 2024 1 commit

Only enable numa on CPUs (#6484) · 0f92b19b

Daniel Hiltgen authored Aug 24, 2024

The numa flag may be having a performance impact on multi-socket systems with GPU loads


23 Aug, 2024 6 commits
- gpu: Group GPU Library sets by variant (#6483) · 69be940b
  Daniel Hiltgen authored Aug 23, 2024
```
The recent cuda variant changes uncovered a bug in ByLibrary
which failed to group by common variant for GPU types.
```
- Merge pull request #5446 from ollama/mxyng/faq · 9638c24c
  Michael Yang authored Aug 23, 2024
```
update faq
```
- update faq · bb362caf
  Michael Yang authored Jul 02, 2024
  
- convert safetensor adapters into GGUF (#6327) · 0c819e16
  Patrick Devine authored Aug 23, 2024
  
- gpu: Ensure driver version set before variant (#6480) · 7a1e1c1c
  Daniel Hiltgen authored Aug 23, 2024
```
During rebasing, the ordering was inverted causing the cuda version
selection logic to break, with driver version being evaluated as zero
incorrectly causing a downgrade to v11.
```
- llm: Align cmake define for cuda no peer copy (#6455) · 0b03b9c3
  Daniel Hiltgen authored Aug 23, 2024
```
Define changed recently and this slipped through the cracks with the old
name.
```
22 Aug, 2024 1 commit

Fix embeddings memory corruption (#6467) · 90ca8417

Daniel Hiltgen authored Aug 22, 2024

* Fix embeddings memory corruption

The patch was leading to a buffer overrun corruption.  Once removed, though, parallelism
in server.cpp led to hitting an assert due to slot/seq IDs being >= token count.  To
work around this, only use slot 0 for embeddings.

* Fix embed integration test assumption

The token eval count has changed with recent llama.cpp bumps (0.3.5+)


21 Aug, 2024 8 commits
- Merge pull request #6064 from ollama/mxyng/convert-llama3 · 6bd8a4b0
  Michael Yang authored Aug 21, 2024
```
convert: update llama conversion for llama3.1
```
- llama3.1 · 77903ab8
  Michael Yang authored Jul 29, 2024
  
- Merge pull request #5365 from ollama/mxyng/convert-gemma2 · e22286c9
  Michael Yang authored Aug 21, 2024
```
convert gemma2
```
- Merge pull request #4917 from ollama/mxyng/convert-bert · 107f6959
  Michael Yang authored Aug 21, 2024
```
convert bert model from safetensors
```
- Merge pull request #6386 from zwwhdls/fix-new-layer · 4ecc70d3
  Michael Yang authored Aug 21, 2024
```
fix: chmod new layer to 0o644 when creating it
```
- convert gemma2 · 3546bbd0
  Michael Yang authored Jun 28, 2024
  
- create bert models from cli · beb49eef
  Michael Yang authored Jun 07, 2024
  
- bert · 5a28b9cf
  Michael Yang authored Jun 06, 2024
  
20 Aug, 2024 1 commit

Split rocm back out of bundle (#6432) · a017cf2f

Daniel Hiltgen authored Aug 20, 2024

We're over budget for GitHub's maximum release artifact size with rocm + 2 cuda
versions. This splits rocm back out as a discrete artifact, but keeps the layout so it can
be extracted into the same location as the main bundle.


19 Aug, 2024 4 commits
- CI: remove directories from dist dir before upload step (#6429) · 19e5a890
  Daniel Hiltgen authored Aug 19, 2024
  
- CI: handle directories during checksum (#6427) · f91c9e37
  Daniel Hiltgen authored Aug 19, 2024
  
- Merge pull request #6424 from dhiltgen/cuda_v12 · 2df6905e
  Daniel Hiltgen authored Aug 19, 2024
```
Fix overlapping artifact name on CI
```
- Fix overlapping artifact name on CI · d8be22e4
  Daniel Hiltgen authored Aug 19, 2024
  