Commits · c5c451ca3bde83e75a2a98ed9fd4e63a56bb02a9 · OpenDAS / ollama

09 Apr, 2024 1 commit
- Handle very slow model loads · c5ff443b
  Daniel Hiltgen authored Apr 09, 2024
```
During testing, we're seeing some models take over 3 minutes.
```
  c5ff443b
06 Apr, 2024 1 commit
- no rope parameters · be517e49
  Michael Yang authored Apr 05, 2024
  
  be517e49
03 Apr, 2024 1 commit
- update graph size estimate · 12e923e1
  Michael Yang authored Apr 02, 2024
  
  12e923e1
02 Apr, 2024 2 commits
- Revert options as a ref in the server · 6589eb8a
  Daniel Hiltgen authored Apr 02, 2024
  
  6589eb8a
- fix metal gpu · 80163ebc
  Michael Yang authored Apr 02, 2024
  
  80163ebc
01 Apr, 2024 1 commit

Switch back to subprocessing for llama.cpp · 58d95cc9

Daniel Hiltgen authored Mar 14, 2024

This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shutdown when idle, and
gracefully restart if it has problems. This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.

58d95cc9