- 04 Apr, 2024 5 commits
Daniel Hiltgen authored
CI missing archive

Daniel Hiltgen authored

Daniel Hiltgen authored
CI subprocess path fix

Daniel Hiltgen authored

Daniel Hiltgen authored
Fix CI release glitches
- 03 Apr, 2024 8 commits
Daniel Hiltgen authored
The subprocess change moved the build directory; arm64 builds weren't setting cross-compilation flags when building on x86.

Michael Yang authored
update graph size estimate

Michael Yang authored

Jeffrey Morgan authored

Michael Yang authored
default head_kv to 1
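A minimal sketch of what such a default looks like, assuming (this is an inference, not stated in the commit) that downstream size estimates divide by the KV head count and must never see a zero; the key name and map shape are hypothetical, not Ollama's actual metadata API:

```go
package llm

// headCountKV falls back to 1 when the model metadata omits the KV head
// count. A missing entry would otherwise read as 0, and any estimate
// that divides by it (e.g. a grouped-query attention ratio) would
// divide by zero.
func headCountKV(kv map[string]uint64) uint64 {
	if n := kv["head_count_kv"]; n > 0 {
		return n
	}
	return 1 // safe default when the metadata omits the key
}
```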
Blake Mizerany authored
This also moves the checkServerHeartbeat call out of the "RunE" Cobra machinery to the call site, where it runs after the check for OLLAMA_MODELS. That lets the helpful error message print before the server heartbeat check, and arguably makes the code more readable without the magic/superfluous "pre" function caller.
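In Cobra terms the change looks roughly like the sketch below; checkServerHeartbeat comes from the commit message, while the command name and the other helpers are hypothetical stand-ins, not the actual Ollama code:

```go
package cmd

import "github.com/spf13/cobra"

// Hypothetical stubs standing in for the real functions.
func checkModelsDir() error                               { return nil }
func checkServerHeartbeat(*cobra.Command, []string) error { return nil }
func run(*cobra.Command, []string) error                  { return nil }

func newRunCmd() *cobra.Command {
	return &cobra.Command{
		Use: "run MODEL",
		// Before: PreRunE: checkServerHeartbeat, which fired ahead of any
		// other validation and hid the friendlier OLLAMA_MODELS error.
		RunE: func(cmd *cobra.Command, args []string) error {
			if err := checkModelsDir(); err != nil {
				return err // helpful OLLAMA_MODELS error prints first
			}
			if err := checkServerHeartbeat(cmd, args); err != nil {
				return err
			}
			return run(cmd, args)
		},
	}
}
```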
Daniel Hiltgen authored
Fix numgpu opt miscomparison

Pier Francesco Contino authored
Co-authored-by: Pier Francesco Contino <pfcontino@gmail.com>
- 02 Apr, 2024 8 commits
Daniel Hiltgen authored

Michael Yang authored

Michael Yang authored
fix metal gpu

Michael Yang authored

Daniel Hiltgen authored
Bump llama.cpp to b2581

Daniel Hiltgen authored

Daniel Hiltgen authored

Daniel Hiltgen authored
Switch back to subprocessing for llama.cpp
- 01 Apr, 2024 17 commits
Daniel Hiltgen authored

Daniel Hiltgen authored
Leaving the cudart library loaded kept ~30 MB of memory pinned in the GPU in the main process. This change ensures we don't hold GPU resources when idle.
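The pattern is roughly the one sketched below with plain dlopen/dlclose through cgo; the library name and the elided query are assumptions, not Ollama's actual discovery code. The point is that the handle is released before returning, so nothing stays pinned between queries:

```go
package gpu

/*
#cgo LDFLAGS: -ldl
#include <dlfcn.h>
#include <stdlib.h>
*/
import "C"

import (
	"errors"
	"unsafe"
)

// queryVRAM loads libcudart just long enough to ask a question, then
// dlcloses it so no GPU allocation stays pinned in the main process.
func queryVRAM() error {
	name := C.CString("libcudart.so")
	defer C.free(unsafe.Pointer(name))

	handle := C.dlopen(name, C.RTLD_LAZY)
	if handle == nil {
		return errors.New("unable to load cudart")
	}
	defer C.dlclose(handle) // releasing the handle frees the pinned memory

	// ... dlsym the query functions and read free/total VRAM here ...
	return nil
}
```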
Daniel Hiltgen authored
We may have users who run into problems with our current payload model, so this gives us an escape valve.
Daniel Hiltgen authored
"cudart init failure: 35" isn't particularly helpful in the logs.
Daniel Hiltgen authored
Cleaner shutdown logic, a bit of response hardening
Daniel Hiltgen authored

Daniel Hiltgen authored
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.
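The runner model implied here, as a rough sketch; the binary name, arguments, and backoff are assumptions, not the actual implementation. The server supervises llama.cpp as a child process: a crash is isolated from the main process and answered with a restart, and stopping the loop when idle releases everything the child held:

```go
package main

import (
	"log"
	"os/exec"
	"time"
)

// superviseRunner keeps a llama.cpp child process alive, restarting it
// on crash; the done channel signals idle shutdown, which stops the
// loop before the next (re)start.
func superviseRunner(done <-chan struct{}) {
	for {
		select {
		case <-done: // server went idle; stop restarting the runner
			return
		default:
		}
		cmd := exec.Command("./llama-runner", "--port", "50051") // made-up path/args
		if err := cmd.Start(); err != nil {
			log.Printf("runner failed to start: %v", err)
			return
		}
		if err := cmd.Wait(); err != nil {
			log.Printf("runner exited: %v; restarting", err)
		}
		time.Sleep(time.Second) // simple backoff before the restart
	}
}
```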
Patrick Devine authored

Michael Yang authored
update memory estimations for gpu offloading

Michael Yang authored
refactor model parsing

Michael Yang authored
fix generate output

Michael Yang authored

Michael Yang authored
count each layer independently when deciding gpu offloading
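A minimal sketch of per-layer counting, with hypothetical sizes and VRAM budget: rather than assuming uniformly sized layers, walk the actual layer sizes and offload as many whole layers as fit in the remaining free VRAM:

```go
package main

import "fmt"

// layersThatFit counts, layer by layer, how many whole layers fit in
// the available VRAM. Counting each layer independently matters because
// layers are not uniformly sized, so a total-divided-by-layer-count
// average can over- or under-commit the GPU.
func layersThatFit(layerSizes []uint64, freeVRAM uint64) int {
	var used uint64
	for i, size := range layerSizes {
		if used+size > freeVRAM {
			return i // the first i layers fit
		}
		used += size
	}
	return len(layerSizes) // everything fits; fully offload
}

func main() {
	// Hypothetical per-layer sizes in bytes and a 2 GiB budget.
	sizes := []uint64{512 << 20, 256 << 20, 256 << 20, 1024 << 20}
	fmt.Println(layersThatFit(sizes, 2<<30)) // prints 3
}
```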
Michael Yang authored

Philipp Gillé authored

Saifeddine ALOUI authored

Jesse Zhang authored
Corrective Retrieval Augmented Generation Demo, powered by LangGraph and Streamlit
🤗 Supports Ollama and OpenAI APIs
- 31 Mar, 2024 2 commits
Yaroslav authored
Plugins list updated

sugarforever authored
* Community Integration: ChatOllama
* fixed typo