Commits · 08f1e18965c15648504fc5ec367134898e92ec6d · OpenDAS / ollama

08 Jan, 2024 3 commits

Offload layers to GPU based on new model size estimates (#1850) · 08f1e189

Jeffrey Morgan authored Jan 08, 2024



* select layers based on estimated model memory usage

* always account for scratch vram

* don't load +1 layers

* better estimation for graph alloc

* Update gpu/gpu_darwin.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

* Update llm/llm.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

* Update llm/llm.go

* add overhead for cuda memory

* Update llm/llm.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

* fix build error on linux

* address comments

---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

08f1e189

remove ggml automatic re-pull (#1856) · 7e8f7c83
Bruce MacDonald authored Jan 08, 2024

7e8f7c83
document response in modelfile template variables (#1428) · 3f3eb19a
Bruce MacDonald authored Jan 08, 2024

3f3eb19a

07 Jan, 2024 6 commits
- Merge pull request #1834 from dhiltgen/old_cuda · 059ae458
  Daniel Hiltgen authored Jan 07, 2024
```
Detect very old CUDA GPUs and fall back to CPU
```
  059ae458
- Merge pull request #1828 from dhiltgen/fix_llava · 6347f501
  Daniel Hiltgen authored Jan 07, 2024
```
Accept windows paths for image processing
```
  6347f501
- don't use `-Wall` in static build (#1833) · 5feec959
  Jeffrey Morgan authored Jan 07, 2024
  
  5feec959
- add `-DCMAKE_SYSTEM_NAME=Darwin` cmake flag (#1832) · dbdd50b2
  Jeffrey Morgan authored Jan 07, 2024
  
  dbdd50b2
- Detect very old CUDA GPUs and fall back to CPU · d74ce6bd
  Daniel Hiltgen authored Jan 06, 2024
```
If we try to load the CUDA library on an old GPU, it panics and crashes
the server.  This checks the compute capability before we load the
library so we can gracefully fall back to CPU mode.
```
  d74ce6bd
- Update README.md - Community Integrations - Ollama for Ruby (#1830) · 57942b46
  Guilherme Baptista authored Jan 07, 2024
  
  57942b46
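Commit d74ce6bd above checks the compute capability before loading the CUDA library so the server can fall back to CPU mode instead of crashing. The sketch below shows one way such a gate could look. The type `gpuInfo`, the function `chooseBackend`, and the threshold `minComputeMajor = 5` are assumptions made for illustration; the commit message does not state the actual cutoff.

```go
package main

import "fmt"

// minComputeMajor is an assumed threshold, not a value taken from the commit.
const minComputeMajor = 5

// gpuInfo is a hypothetical summary of what a probe reports about the GPU.
type gpuInfo struct {
	Present      bool
	ComputeMajor int
	ComputeMinor int
}

// chooseBackend decides which backend to use before any CUDA library is
// loaded, so an unsupported GPU never reaches the code path that would panic.
func chooseBackend(g gpuInfo) string {
	if !g.Present {
		return "cpu"
	}
	if g.ComputeMajor < minComputeMajor {
		return "cpu" // GPU too old: fall back gracefully
	}
	return "cuda"
}

func main() {
	fmt.Println(chooseBackend(gpuInfo{Present: true, ComputeMajor: 3, ComputeMinor: 0}))
	fmt.Println(chooseBackend(gpuInfo{Present: true, ComputeMajor: 8, ComputeMinor: 6}))
	fmt.Println(chooseBackend(gpuInfo{}))
}
```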
06 Jan, 2024 4 commits

Accept windows paths for image processing · e0d05b0f

Daniel Hiltgen authored Jan 06, 2024

This enhances our regex to support Windows-style paths.  The regex will
match invalid path specifications, but we'll still validate file
existence and filter out mismatches.

e0d05b0f

Merge pull request #1697 from dhiltgen/win_docs · 2d9dd14f
Daniel Hiltgen authored Jan 05, 2024
```
Add windows native build instructions
```
2d9dd14f
add cuda lib path for nvidia container toolkit · 1caa5612
Jeffrey Morgan authored Jan 05, 2024

1caa5612

Merge pull request #1797 from... · 0101e76d

Michael Yang authored Jan 05, 2024

Merge pull request #1797 from sublimator/nd-allow-extension-origins-still-needs-explicit-listing-2024-01-05

fix: allow extension origins (still needs explicit listing), fixes #1686

0101e76d

05 Jan, 2024 13 commits
- only pull gguf model if already exists (#1817) · 3a9f4471
  Bruce MacDonald authored Jan 05, 2024
  
  3a9f4471
- switch api for ShowRequest to use the name field (#1816) · 9c2941e6
  Patrick Devine authored Jan 05, 2024
  
  9c2941e6
- Add unit tests for Parser (#1815) · 238ac5e7
  Patrick Devine authored Jan 05, 2024
  
  238ac5e7
- simplify ggml update logic (#1814) · 4f4980b6
  Bruce MacDonald authored Jan 05, 2024
```
- additional information is now available in show response, use this to pull gguf before running
- make gguf updates cancellable
```
  4f4980b6
- add show info command and fix the modelfile · 22e93efa
  Patrick Devine authored Jan 04, 2024
  
  22e93efa
- split up interactive generation · 2909dce8
  Patrick Devine authored Jan 04, 2024
  
  2909dce8
- gpu: read memory info from all cuda devices (#1802) · df325373
  Jeffrey Morgan authored Jan 05, 2024
```
* gpu: read memory info from all cuda devices

* add `LOOKUP_SIZE` constant

* better constant name

* address comments
```
  df325373
- remove unused generate patches (#1810) · 3367b5f3
  Bruce MacDonald authored Jan 05, 2024
  
  3367b5f3
- Merge pull request #1801 from jmorganca/mattw/correctdockerlink · 46edbbc5
  Matt Williams authored Jan 04, 2024
  
  46edbbc5
- Merge pull request #1791 from jmorganca/mxyng/update-build · d2ff18cd
  Michael Yang authored Jan 04, 2024
```
update Dockerfile.build
```
  d2ff18cd
- fix docker doc to point to hub · df086d3c
  Matt Williams authored Jan 04, 2024
```
Signed-off-by: Matt Williams <m@technovangelist.com>
```
  df086d3c
- Allow extension origins (still needs explicit listing), fixes #1686 · 8baaaa39
  Nicholas Dudfield authored Jan 05, 2024
  
  8baaaa39
- update build · f9961c70
  Michael Yang authored Jan 04, 2024
  
  f9961c70
04 Jan, 2024 14 commits
- Merge pull request #1790 from dhiltgen/llm_code_shuffle · cd8fad33
  Daniel Hiltgen authored Jan 04, 2024
```
Clean up stale submodule
```
  cd8fad33
- Clean up stale submodule · 9983fa5f
  Daniel Hiltgen authored Jan 04, 2024
```
If the tree has a stale submodule, make sure we clean it up first
```
  9983fa5f
- Merge pull request #1788 from dhiltgen/llm_code_shuffle · dfda91c2
  Daniel Hiltgen authored Jan 04, 2024
```
Revamp code layout for the llm directory and llama.cpp submodule
```
  dfda91c2
- Init submodule with new path · fac9060d
  Daniel Hiltgen authored Jan 04, 2024
  
  fac9060d
- remove old llama.cpp submodule path · a554616f
  Daniel Hiltgen authored Jan 04, 2024
  
  a554616f
- Code shuffle to clean up the llm dir · 77d96da9
  Daniel Hiltgen authored Jan 04, 2024
  
  77d96da9
- Add embeddings to API (#1773) · 0d6e3565
  Brian Murray authored Jan 04, 2024
  
  0d6e3565
- Merge pull request #1785 from dhiltgen/win_native_cli · b5939008
  Daniel Hiltgen authored Jan 04, 2024
```
Load dynamic cpu lib on windows
```
  b5939008
- Load dynamic cpu lib on windows · e9ce91e9
  Daniel Hiltgen authored Jan 04, 2024
```
On Linux, we link the CPU library into the Go app and fall back to it
when no GPU match is found. On Windows we do not link in the CPU library
so that we can better control our dependencies for the CLI.  This fixes
the logic so we correctly fall back to the dynamic CPU library
on Windows.
```
  e9ce91e9
- fix: pull either original model or from model on create (#1774) · 4ad6c9b1
  Bruce MacDonald authored Jan 04, 2024
  
  4ad6c9b1
- tweak memory requirements error text · c0285158
  Jeffrey Morgan authored Jan 03, 2024
  
  c0285158
- add macOS memory check for 47B models · 77a66df7
  Jeffrey Morgan authored Jan 03, 2024
  
  77a66df7
- remove unused filetype check · 5b4837f8
  Jeffrey Morgan authored Jan 03, 2024
  
  5b4837f8
- update cmake flags for `amd64` macOS (#1780) · 29340c2e
  Jeffrey Morgan authored Jan 03, 2024
```
* update cmake flags for intel macOS

* remove `LLAMA_K_QUANTS`

* put back `CMAKE_OSX_DEPLOYMENT_TARGET` and disable `LLAMA_F16C`
```
  29340c2e