Commits · cd5c8f6471abf32965289f0226016a78f0c5c938 · OpenDAS / ollama

12 Sep, 2024 1 commit

Optimize container images for startup (#6547) · cd5c8f64

Daniel Hiltgen authored Sep 12, 2024

* Optimize container images for startup

This change adjusts how to handle runner payloads to support
container builds where we keep them extracted in the filesystem.
This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
size, and should result in faster startup times for container images.

* Refactor payload logic and add buildx support for faster builds

* Move payloads around

* Review comments

* Converge to buildx based helper scripts

* Use docker buildx action for release


01 May, 2024 1 commit
- ignore debug bin files · 34a4a94f
  Mark Ward authored Apr 29, 2024
01 Apr, 2024 1 commit

Switch back to subprocessing for llama.cpp · 58d95cc9

Daniel Hiltgen authored Mar 14, 2024

This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process, shut it down when idle, and
gracefully restart it if it has problems. This also serves as a first step
toward running multiple copies to support multiple models concurrently.


15 Feb, 2024 1 commit

Implement new Go based Desktop app · 29e90cc1

Daniel Hiltgen authored Dec 26, 2023

This focuses on Windows first, but could be used for Mac
and possibly Linux in the future.


19 Dec, 2023 1 commit

Add cgo implementation for llama.cpp · d4cd6957

Daniel Hiltgen authored Nov 13, 2023

Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.


27 Nov, 2023 1 commit
- ignore jetbrain ides (#1287) · 3d620f94
  Jason Jacobs authored Nov 27, 2023
24 Nov, 2023 1 commit

windows CUDA support (#1262) · 82b9b329

Jing Zhang authored Nov 24, 2023

* Support cuda build in Windows
* Enable dynamic NumGPU allocation for Windows


18 Nov, 2023 1 commit
- cache docker builds · 85e4441c
  Jeffrey Morgan authored Nov 18, 2023
30 Aug, 2023 2 commits

update docs for subprocess · a82eb275
Jeffrey Morgan authored Aug 30, 2023


subprocess llama.cpp server (#401) · 42998d79

Bruce MacDonald authored Aug 30, 2023

* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm


28 Jul, 2023 1 commit
- add `ggml-metal.metal` to `.gitignore` · 67b6f8ba
  Jeffrey Morgan authored Jul 28, 2023
22 Jul, 2023 1 commit
- Update .gitignore · e6c427ce
  jk1jk authored Jul 22, 2023
12 Jul, 2023 1 commit
- fix compilation issue in Dockerfile, remove from `README.md` until ready · 7c71c10d
  Jeffrey Morgan authored Jul 11, 2023
11 Jul, 2023 2 commits
- vendor llama.cpp · 442dec1c
  Michael Yang authored Jul 11, 2023
- call llama.cpp directly from go · fd4792ec
  Michael Yang authored Jul 07, 2023
06 Jul, 2023 2 commits
- use `Makefile` for dependency building instead of `go generate` · 9fe01867
  Jeffrey Morgan authored Jul 06, 2023
- add binary to .gitignore · b0e986fb
  Jeffrey Morgan authored Jul 05, 2023
26 Jun, 2023 1 commit
- add templates to prompt command · d34985b9
  Bruce MacDonald authored Jun 26, 2023
25 Jun, 2023 2 commits
- reorganize directories · b361fa72
  Jeffrey Morgan authored Jun 25, 2023
- build server into desktop app · d3709f85
  Jeffrey Morgan authored Jun 25, 2023
23 Jun, 2023 3 commits
- package server with client · c5bafaff
  Bruce MacDonald authored Jun 23, 2023
- build server executable · f0eee3fa
  Bruce MacDonald authored Jun 23, 2023
- Update .gitignore · db81d81b
  Bruce MacDonald authored Jun 23, 2023
22 Jun, 2023 1 commit
- initial commit · 8fa91332
  Jeffrey Morgan authored Jun 22, 2023