Commits · e9216ea459f3fabaef81f376ddbf0ba6ef292b37 · OpenDAS / ollama

26 Nov, 2023 1 commit
- update submodule commit · 9e4a3164
  Jeffrey Morgan authored Nov 26, 2023
  
  9e4a3164
24 Nov, 2023 2 commits

Jing Zhang authored Nov 24, 2023

* Support CUDA build on Windows
* Enable dynamic NumGPU allocation for Windows

82b9b329

Disable CUDA peer access as a workaround for multi-gpu inference bug (#1261) · 12e8c12d

Jongwook Choi authored Nov 24, 2023

When CUDA peer access is enabled, multi-gpu inference will produce
garbage output. This is a known bug in llama.cpp (or NVIDIA). Until the
upstream bug is fixed, we can disable CUDA peer access temporarily
to ensure correct output.

See #961.

12e8c12d

22 Nov, 2023 2 commits
- consistent cpu instructions on macos and linux · d77dde12
  Jeffrey Morgan authored Nov 22, 2023
  
  d77dde12
- fix: gguf int type · 199941cd
  Michael Yang authored Nov 22, 2023
  
  199941cd
21 Nov, 2023 2 commits
- update llama.cpp · a00fac4e
  Michael Yang authored Nov 21, 2023
  
  a00fac4e
- only set `main_gpu` if value > 0 is provided · a3fcecf9
  Jeffrey Morgan authored Nov 20, 2023
  
  a3fcecf9
20 Nov, 2023 3 commits
- recent llama.cpp update added kernels for fp32, q5_0, and q5_1 · 19b7a4d7
  Michael Yang authored Nov 20, 2023
  
  19b7a4d7
- main-gpu argument is not getting passed to llamacpp, fixed. (#1192) · be61a817
  Purinda Gunasekara authored Nov 21, 2023
  
  be61a817
- enable cpu instructions on intel macs · 13ba6df5
  Jeffrey Morgan authored Nov 19, 2023
  
  13ba6df5
19 Nov, 2023 2 commits
- Update llm/llama.go · 36a3bbf6
  Jeffrey Morgan authored Nov 18, 2023
  
  36a3bbf6
- fix potentially inaccurate error message · 43a72614
  Bruce MacDonald authored Nov 17, 2023
  
  43a72614
17 Nov, 2023 1 commit
- build intel mac with correct binary and compile flags · 41434a7c
  Jeffrey Morgan authored Nov 16, 2023
  
  41434a7c
10 Nov, 2023 1 commit

JSON mode: add `"format"` as an API parameter (#1051) · 5cba29b9

Jeffrey Morgan authored Nov 09, 2023



* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

5cba29b9

09 Nov, 2023 2 commits
- skip gpu if less than 2GB VRAM are available (#1059) · 1ae84bc2
  Bruce MacDonald authored Nov 09, 2023
  
  1ae84bc2
- instead of static number of parameters for each model family, get the real... · c5e1bbab
  Michael Yang authored Nov 08, 2023
```
instead of static number of parameters for each model family, get the real number from the tensors (#1022)

* parse tensor info

* refactor decoder

* return actual parameter count

* explicit rounding

* s/Human/HumanNumber/
```
  c5e1bbab
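
The idea in c5e1bbab, deriving the parameter count from tensor shapes rather than a per-family constant, can be sketched as follows. The `tensorInfo` type, the `parameterCount` helper, and this `humanNumber` formatting are assumptions for illustration; the log does not show the actual decoder or the real `HumanNumber` implementation.

```go
package main

import (
	"fmt"
	"math"
)

// tensorInfo is a hypothetical stand-in for parsed tensor metadata.
type tensorInfo struct {
	Name  string
	Shape []uint64
}

// parameterCount sums the element counts of all tensors, where each
// tensor's element count is the product of its dimensions.
func parameterCount(tensors []tensorInfo) uint64 {
	var total uint64
	for _, t := range tensors {
		n := uint64(1)
		for _, d := range t.Shape {
			n *= d
		}
		total += n
	}
	return total
}

// humanNumber formats a count with explicit rounding to the nearest
// million or billion. The thresholds and suffixes are assumptions.
func humanNumber(n uint64) string {
	const million, billion = 1_000_000, 1_000_000_000
	switch {
	case n >= billion:
		return fmt.Sprintf("%dB", uint64(math.Round(float64(n)/billion)))
	case n >= million:
		return fmt.Sprintf("%dM", uint64(math.Round(float64(n)/million)))
	default:
		return fmt.Sprintf("%d", n)
	}
}

func main() {
	tensors := []tensorInfo{
		{Name: "example.weight", Shape: []uint64{4096, 32000}},
		{Name: "example.norm", Shape: []uint64{4096}},
	}
	count := parameterCount(tensors)
	fmt.Println(count, humanNumber(count))
}
```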
04 Nov, 2023 1 commit
- remove unused `fmt.Println` · c44b6194
  Jeffrey Morgan authored Nov 03, 2023
  
  c44b6194
03 Nov, 2023 1 commit
- Restore system prompt on requests and default `num_keep` to `0` · 17678b72
  Jeffrey Morgan authored Nov 03, 2023
  
  17678b72
02 Nov, 2023 1 commit
- default rope params to 0 for new models (#968) · 2e537046
  Jeffrey Morgan authored Nov 02, 2023
  
  2e537046
31 Oct, 2023 1 commit
- append LD_LIBRARY_PATH · 642128b7
  Michael Yang authored Oct 31, 2023
  
  642128b7
27 Oct, 2023 3 commits
- restore building runner with `AVX` on by default (#900) · 3a1ed9ff
  Jeffrey Morgan authored Oct 27, 2023
  
  3a1ed9ff
- catch insufficient permissions nvidia err (#934) · 6d283882
  Bruce MacDonald authored Oct 27, 2023
  
  6d283882
- offload 75% of available vram to improve stability (#921) · 2665f3c2
  Bruce MacDonald authored Oct 26, 2023
  
  2665f3c2
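
As a rough sketch of what "offload 75% of available vram" (2665f3c2) could mean in practice: budget three quarters of the free VRAM and offload as many layers as fit. The `layersToOffload` function, its inputs, and the assumption that every layer is the same size are hypothetical; the log does not show the real calculation.

```go
package main

import "fmt"

// layersToOffload returns how many layers fit into 75% of the free VRAM,
// assuming (for illustration) that all layers are the same size.
func layersToOffload(freeVRAM, modelBytes uint64, numLayers int) int {
	if numLayers <= 0 || modelBytes == 0 {
		return 0
	}
	budget := freeVRAM * 3 / 4 // keep 25% of VRAM as headroom
	bytesPerLayer := modelBytes / uint64(numLayers)
	if bytesPerLayer == 0 {
		return numLayers
	}
	n := int(budget / bytesPerLayer)
	if n > numLayers {
		n = numLayers // never offload more layers than the model has
	}
	return n
}

func main() {
	const gib = 1 << 30
	// Hypothetical figures: a 4 GiB model with 32 layers.
	fmt.Println(layersToOffload(8*gib, 4*gib, 32)) // plenty of VRAM: all layers
	fmt.Println(layersToOffload(2*gib, 4*gib, 32)) // limited VRAM: partial offload
}
```

Leaving headroom trades some speed for stability, since other allocations on the GPU are not counted in the per-layer estimate.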
24 Oct, 2023 3 commits
- fix metal assertion errors · b0c9cd0f
  Jeffrey Morgan authored Oct 24, 2023
  
  b0c9cd0f
- update submodule commit · 77f61c63
  Jeffrey Morgan authored Oct 24, 2023
  
  77f61c63
- update submodule commit · f3604534
  Jeffrey Morgan authored Oct 23, 2023
  
  f3604534
23 Oct, 2023 3 commits

bump submodules · 0c7a00a2

Michael Yang authored Oct 23, 2023

pin to 9e70cc03229df19ca2d28ce23cc817198f897278 for now since
438c2ca83045a00ef244093d27e9ed41a8cb4ea9 is breaking

0c7a00a2

update default log target · c9167494
Michael Yang authored Oct 23, 2023

c9167494

ggufv3 · 125d0a01

Michael Yang authored Oct 23, 2023

ggufv3 adds support for big endianness, mainly for s390x architecture.
while that's not currently supported for ollama, the change is simple.

loosen version check to be more forward compatible. unless specified,
gguf versions other than v1 will be decoded into v2.

125d0a01

19 Oct, 2023 2 commits
- simpler check for model loading compatibility errors · 7ed5a39b
  Jeffrey Morgan authored Oct 19, 2023
  
  7ed5a39b
- add error for `falcon` and `starcoder` vocab compatibility (#844) · a7dad24d
  Jeffrey Morgan authored Oct 19, 2023
```
add error for falcon and starcoder vocab compatibility
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
```
  a7dad24d
18 Oct, 2023 6 commits
- use TrimPrefix instead of TrimLeft · 730996e5
  Arne Müller authored Oct 18, 2023
  
  730996e5
- removed redundant strings.CutPrefix from Decode · ce6197a8
  Arne Müller authored Oct 18, 2023
  
  ce6197a8
- use strings.TrimLeft to remove spaces · 46b9953f
  Arne Müller authored Oct 18, 2023
  
  46b9953f
- relay CUDA errors to the client (#825) · 565648f3
  Bruce MacDonald authored Oct 18, 2023
  
  565648f3
- moved removal of leading space into Predict · 90c49bed
  Arne Müller authored Oct 18, 2023
  
  90c49bed
- fix whitespace removal · 5dc0cff4
  Arne Müller authored Oct 18, 2023
  
  5dc0cff4
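
The commits in this group switch between `strings.TrimLeft`, `strings.CutPrefix`, and `strings.TrimPrefix` for whitespace removal. The sketch below shows how the three standard-library functions differ on a string with two leading spaces. The wrapper function names are hypothetical, and the log does not show the project's actual call sites.

```go
package main

import (
	"fmt"
	"strings"
)

// stripOneLeadingSpace removes at most one leading space:
// TrimPrefix matches its second argument as a literal prefix.
func stripOneLeadingSpace(s string) string {
	return strings.TrimPrefix(s, " ")
}

// stripAllLeadingSpaces removes every leading space:
// TrimLeft treats its second argument as a set of characters to strip.
func stripAllLeadingSpaces(s string) string {
	return strings.TrimLeft(s, " ")
}

func main() {
	token := "  hello" // two leading spaces

	fmt.Printf("%q\n", stripOneLeadingSpace(token))
	fmt.Printf("%q\n", stripAllLeadingSpaces(token))

	// CutPrefix (Go 1.20+) behaves like TrimPrefix but also reports
	// whether the prefix was present.
	rest, found := strings.CutPrefix(token, " ")
	fmt.Printf("%q %v\n", rest, found)
}
```

The distinction matters when only a single separator space should be dropped and any further leading whitespace is meaningful.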
17 Oct, 2023 3 commits
- use cut prefix · b36b0b71
  Michael Yang authored Oct 16, 2023
  
  b36b0b71
- remove unused struct · 094df375
  Michael Yang authored Oct 16, 2023
  
  094df375
- Update llama.cpp gguf to latest (#710) · f3648fd2
  Bruce MacDonald authored Oct 17, 2023
  
  f3648fd2