1. 19 Dec, 2023 3 commits
    • Use build tags to generate accelerated binaries for CUDA and ROCm on Linux. · f8ef4439
      65a authored
      The rocm or cuda build tag must be passed to both go generate and go build.
      ROCm builds need ROCM_PATH set (with the ROCm SDK present), CLBlast installed
      (for GGML), and CLBlast_DIR set in the environment to the CLBlast cmake
      directory (likely /usr/lib/cmake/CLBlast). Build tags also switch VRAM
      detection between the cuda and rocm implementations, using added
      "accelerator_foo.go" files that contain architecture-specific functions
      and variables. accelerator_none is used when no tags are set, and the helper
      function addRunner ignores it if it is the chosen accelerator. Fix go
      generate commands; thanks @deadmeu for testing.
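As a rough illustration of the accelerator_none behaviour described in this commit message, the sketch below shows one way a helper could skip the placeholder accelerator. Only the name addRunner comes from the message; its signature, the runner type, the acceleratorName variable, and the paths are assumptions made for this example, not the commit's actual code. In a real tree the variable would be set by whichever accelerator_<name>.go file the build tags select; here it is hard-coded to "none" to mimic a build with no tags.

```go
package main

import "fmt"

// acceleratorName stands in for the value that a tag-selected
// accelerator_<name>.go file would provide (hypothetical variable).
// "none" mimics a build with neither the rocm nor the cuda tag set.
var acceleratorName = "none"

// runner is a hypothetical record describing one runner binary.
type runner struct {
	Accelerator string
	Path        string
}

// addRunner appends a runner for the given accelerator and
// leaves the list unchanged when the accelerator is "none".
// The signature is an assumption for this sketch.
func addRunner(runners []runner, accelerator, path string) []runner {
	if accelerator == "none" {
		return runners
	}
	return append(runners, runner{Accelerator: accelerator, Path: path})
}

func main() {
	// Start with a CPU runner, then try to add the accelerated one.
	runners := []runner{{Accelerator: "cpu", Path: "build/cpu"}}
	runners = addRunner(runners, acceleratorName, "build/"+acceleratorName)
	for _, r := range runners {
		fmt.Println(r.Accelerator, r.Path)
	}
}
```

With a tagged build (for example `go generate -tags rocm ./...` followed by `go build -tags rocm`), the selected file would supply a different accelerator name and the second runner would be kept.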
    • Add cgo implementation for llama.cpp · d4cd6957
      Daniel Hiltgen authored
      Run server.cpp directly inside the Go runtime via cgo
      while retaining the LLM Go abstractions.
    • deprecate ggml · 811b1f03
      Bruce MacDonald authored
      - remove ggml runner
      - automatically pull gguf models when ggml detected
      - tell users to update to gguf in the case automatic pull fails
      Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
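The "when ggml detected" step could plausibly be based on the file's leading magic number. The sketch below assumes that approach; the commit message does not say how detection is done. The function name detectFormat and the returned labels are hypothetical. The magic values are the commonly documented little-endian constants for GGUF and the legacy GGML-family containers.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Magic numbers as read from the first four bytes in little-endian order.
const (
	magicGGML uint32 = 0x67676d6c // legacy container
	magicGGMF uint32 = 0x67676d66 // legacy container
	magicGGJT uint32 = 0x67676a74 // legacy container
	magicGGUF uint32 = 0x46554747 // bytes "GGUF"
)

// detectFormat classifies a model file from its first bytes.
// It returns "gguf", "ggml" (any legacy container), or "unknown".
func detectFormat(header []byte) string {
	if len(header) < 4 {
		return "unknown"
	}
	switch binary.LittleEndian.Uint32(header[:4]) {
	case magicGGUF:
		return "gguf"
	case magicGGML, magicGGMF, magicGGJT:
		return "ggml"
	default:
		return "unknown"
	}
}

func main() {
	headers := [][]byte{[]byte("GGUF"), []byte("lmgg"), []byte("abcd")}
	for _, h := range headers {
		fmt.Printf("%q -> %s\n", h, detectFormat(h))
	}
}
```

A caller could use the "ggml" result as the trigger for pulling a gguf replacement, and fall back to an "update to gguf" message if that pull fails, as the bullets above describe.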
  2. 18 Dec, 2023 3 commits
  3. 13 Dec, 2023 1 commit
  4. 04 Dec, 2023 1 commit
  5. 26 Nov, 2023 2 commits
  6. 24 Nov, 2023 2 commits
  7. 22 Nov, 2023 1 commit
  8. 21 Nov, 2023 1 commit
  9. 20 Nov, 2023 1 commit
  10. 17 Nov, 2023 1 commit
  11. 27 Oct, 2023 1 commit
  12. 24 Oct, 2023 3 commits
  13. 23 Oct, 2023 2 commits
  14. 17 Oct, 2023 1 commit
  15. 06 Oct, 2023 2 commits
  16. 21 Sep, 2023 2 commits
  17. 20 Sep, 2023 6 commits
  18. 18 Sep, 2023 1 commit
    • subprocess improvements (#524) · 66003e1d
      Bruce MacDonald authored
      * subprocess improvements
      
      - increase start-up timeout
      - when a runner fails to start, fail immediately rather than timing out
      - try runners in order rather than choosing a single runner
      - embed metal runner in metal dir rather than gpu
      - refactor logging and error messages
      
      * Update llama.go
      
      * Update llama.go
      
      * simplify by using glob
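The "try runners in order" and "fail rather than timing out" items can be pictured with a small fallback loop. This is a minimal sketch under assumed names: the runner type, startFirst, and the two helper constructors are hypothetical and are not taken from llama.go.

```go
package main

import (
	"errors"
	"fmt"
)

// runner is a hypothetical candidate with a start function that
// reports failure immediately instead of waiting for a timeout.
type runner struct {
	Name  string
	Start func() error
}

// failing returns a start function that always fails with msg.
func failing(msg string) func() error {
	return func() error { return errors.New(msg) }
}

// working is a start function that always succeeds.
func working() error { return nil }

// startFirst tries each runner in order and returns the name of the
// first one that starts. If none start, it returns all collected errors.
func startFirst(runners []runner) (string, error) {
	var errs []error
	for _, r := range runners {
		if err := r.Start(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", r.Name, err))
			continue
		}
		return r.Name, nil
	}
	if len(errs) == 0 {
		return "", errors.New("no runners available")
	}
	return "", errors.Join(errs...)
}

func main() {
	name, err := startFirst([]runner{
		{Name: "gpu", Start: failing("no device found")},
		{Name: "cpu", Start: working},
	})
	fmt.Println(name, err)
}
```

Collecting every failure, rather than only the last one, keeps the final error message useful when all candidates fail to start.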
  19. 14 Sep, 2023 1 commit
  20. 12 Sep, 2023 2 commits
  21. 07 Sep, 2023 1 commit
  22. 06 Sep, 2023 2 commits