Commits · a57818d93e251317d32df40d9bb8d7df8a4075bf · OpenDAS / ollama

02 Apr, 2024 4 commits
- Merge pull request #3343 from dhiltgen/bump_more2 · a57818d9
  Daniel Hiltgen authored Apr 02, 2024
```
Bump llama.cpp to b2581
```
- Fix windows lint CI flakiness · 841adda1
  Daniel Hiltgen authored Apr 02, 2024
- Bump to b2581 · 0035e31a
  Daniel Hiltgen authored Mar 25, 2024
- Merge pull request #3218 from dhiltgen/subprocess · c863c6a9
  Daniel Hiltgen authored Apr 02, 2024
```
Switch back to subprocessing for llama.cpp
```
01 Apr, 2024 17 commits
- Refined min memory from testing · 1f11b525
  Daniel Hiltgen authored Apr 01, 2024
- Release gpu discovery library after use · 526d4eb2
  Daniel Hiltgen authored Mar 30, 2024
```
Leaving the cudart library loaded kept ~30m of memory
pinned in the GPU in the main process.  This change ensures
we don't hold GPU resources when idle.
```
- Safeguard for noexec · 0a74cb31
  Daniel Hiltgen authored Mar 28, 2024
```
We may have users that run into problems with our current
payload model, so this gives us an escape valve.
```
- Detect too-old cuda driver · 10ed1b62
  Daniel Hiltgen authored Mar 28, 2024
```
"cudart init failure: 35" isn't particularly helpful in the logs.
```
- Integration test improvements · 4fec5816
  Daniel Hiltgen authored Mar 27, 2024
```
Cleaner shutdown logic, a bit of response hardening
```
- Apply 01-cache.diff · 0a0e9f3e
  Daniel Hiltgen authored Mar 19, 2024
- Switch back to subprocessing for llama.cpp · 58d95cc9
  Daniel Hiltgen authored Mar 14, 2024
```
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shutdown when idle, and
gracefully restart if it has problems.  This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.
```
- Simplify model conversion (#3422) · 3b6a9154
  Patrick Devine authored Apr 01, 2024
- Merge pull request #3241 from ollama/mxyng/mem · d6dd2ff8
  Michael Yang authored Apr 01, 2024
```
update memory estimations for gpu offloading
```
- Merge pull request #2926 from ollama/mxyng/decode-ggml-v2 · e57a6ba8
  Michael Yang authored Apr 01, 2024
```
refactor model parsing
```
- Merge pull request #3442 from ollama/mxyng/generate-output · 12ec2346
  Michael Yang authored Apr 01, 2024
```
fix generate output
```
- fix generate output · 1ec0df10
  Michael Yang authored Apr 01, 2024
- update memory calculations · 91b3e4d2
  Michael Yang authored Mar 18, 2024
```
count each layer independently when deciding gpu offloading
```
- refactor model parsing · d338d704
  Michael Yang authored Mar 13, 2024
- Add chromem-go to community integrations (#3437) · 011bb673
  Philipp Gillé authored Apr 01, 2024
- Update README.md (#3436) · d1246272
  Saifeddine ALOUI authored Apr 01, 2024
- Community Integration: CRAG Ollama Chat (#3423) · b0a8246a
  Jesse Zhang authored Apr 01, 2024
```
Corrective Retrieval Augmented Generation Demo, powered by Langgraph and Streamlit 🤗

Support: 
- Ollama
- OpenAI APIs
```
31 Mar, 2024 2 commits
- Update README.md (#3378) · e6fb39c1
  Yaroslav authored Mar 31, 2024
```
Plugins list updated
```
- Community Integration: ChatOllama (#3400) · e1f1c374
  sugarforever authored Mar 31, 2024
```
* Community Integration: ChatOllama

* fixed typo
```
29 Mar, 2024 2 commits
- Update 90_bug_report.yml · 06a1508b
  Jeffrey Morgan authored Mar 29, 2024
- Add gemma safetensors conversion (#3250) · 5a5efee4
  Patrick Devine authored Mar 28, 2024
```
Co-authored-by: Michael Yang <mxyng@pm.me>
```
28 Mar, 2024 9 commits
- Merge pull request #3398 from dhiltgen/release_latest · 97ae517f
  Daniel Hiltgen authored Mar 28, 2024
```
CI automation for tagging latest images
```
- Merge pull request #3377 from dhiltgen/rocm_v6_bump · 44b813e4
  Daniel Hiltgen authored Mar 28, 2024
```
Bump ROCm to 6.0.2 patch release
```
- CI automation for tagging latest images · 539043f5
  Daniel Hiltgen authored Mar 28, 2024
- Merge pull request #3392 from dhiltgen/ci_build_win_cuda · dbcace68
  Daniel Hiltgen authored Mar 28, 2024
```
CI windows gpu builds
```
- Bump ROCm to 6.0.2 patch release · c91a4ebc
  Daniel Hiltgen authored Mar 27, 2024
- CI windows gpu builds · b79c7e45
  Daniel Hiltgen authored Mar 28, 2024
```
If we're doing generate, test windows cuda and rocm as well
```
- Merge pull request #3379 from ollama/mxyng/origins · 035b274b
  Michael Yang authored Mar 28, 2024
```
fix: trim quotes on OLLAMA_ORIGINS
```
- Merge pull request #3391 from ollama/mxyng-patch-1 · 9c6a2549
  Michael Yang authored Mar 28, 2024
- Update troubleshooting link · f31f2bed
  Michael Yang authored Mar 28, 2024
27 Mar, 2024 6 commits
- Merge pull request #3380 from ollama/mxyng/conditional-generate · 756c2575
  Michael Yang authored Mar 28, 2024
```
fix: workflows
```
- fix: workflows · 5255d0af
  Michael Yang authored Mar 27, 2024
- fix: trim quotes on OLLAMA_ORIGINS · af8a8a6b
  Michael Yang authored Mar 27, 2024
- Merge pull request #3376 from ollama/mxyng/conditional-generate · 461ad250
  Michael Yang authored Mar 27, 2024
```
only generate on changes to llm subdirectory
```
- stub stub · 8838ae78
  Michael Yang authored Mar 27, 2024
- mangle arch · db75402a
  Michael Yang authored Mar 27, 2024