1. 19 Dec, 2023 3 commits
    • Use build tags to generate accelerated binaries for CUDA and ROCm on Linux. · f8ef4439
      65a authored
      The build tag rocm or cuda must be passed to both go generate and go build.
      ROCm builds require ROCM_PATH to be set (with the ROCm SDK present), CLBlast
      installed (for GGML), and CLBlast_DIR set in the environment to the CLBlast
      cmake directory (likely /usr/lib/cmake/CLBlast). Build tags are also used to
      switch VRAM detection between the cuda and rocm implementations, using added
      "accelerator_foo.go" files which contain architecture-specific functions and
      variables. accelerator_none is used when no tags are set, and a helper
      function addRunner will ignore it if it is the chosen accelerator. Fix go
      generate commands, thanks @deadmeu for testing.
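The tag-based selection described in this commit message can be sketched in a single file. This is a minimal illustration, not the commit's actual code: the `accelerator` struct, the `current` variable, the `RunnerPath` field, and the signature of `addRunner` are all assumptions made for the example. In a real tree each variant would live in its own file guarded by a build constraint, and only one would be compiled.

```go
package main

import "fmt"

// accelerator describes the backend selected at compile time.
// In a tag-based layout, one of several files would define `current`:
//   accelerator_cuda.go, guarded by the build constraint "cuda"
//   accelerator_rocm.go, guarded by the build constraint "rocm"
//   accelerator_none.go, guarded by "neither cuda nor rocm"
// This single-file sketch hard-codes the "none" variant so it builds untagged.
type accelerator struct {
	Name       string // "cuda", "rocm", or "none"
	RunnerPath string // hypothetical location of the accelerated runner
}

// current stands in for the variable the tag-selected file would provide.
var current = accelerator{Name: "none"}

// addRunner appends the accelerator's runner to the list, unless the
// chosen accelerator is "none", in which case the list is returned unchanged.
func addRunner(runners []string, acc accelerator) []string {
	if acc.Name == "none" {
		return runners
	}
	return append(runners, acc.RunnerPath)
}

func main() {
	runners := []string{"cpu"}
	runners = addRunner(runners, current) // ignored: accelerator is "none"
	runners = addRunner(runners, accelerator{Name: "rocm", RunnerPath: "rocm/server"})
	fmt.Println(runners)
}
```

With a real multi-file layout, the same tag would be given to both steps, for example `go generate -tags rocm ./...` followed by `go build -tags rocm .` (the package paths here are placeholders).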
    • Add cgo implementation for llama.cpp · d4cd6957
      Daniel Hiltgen authored
      Run server.cpp directly inside the Go process via cgo
      while retaining the Go LLM abstractions.
    • deprecate ggml · 811b1f03
      Bruce MacDonald authored
      - remove ggml runner
      - automatically pull gguf models when ggml detected
      - tell users to update to gguf in the case automatic pull fails
      Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
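One plausible way to tell a legacy ggml file from a gguf file is to inspect the first four bytes. The sketch below is an assumption about how such detection could work, not the code from this commit: the function name `containerFormat` is hypothetical, and the magic values are the ones commonly associated with these formats (read as a little-endian 32-bit integer) and should be checked against the format specifications before being relied on.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Magic numbers as commonly documented for these container formats.
// Treat them as assumptions and verify against the upstream specifications.
const (
	magicGGML uint32 = 0x67676d6c // legacy, unversioned
	magicGGMF uint32 = 0x67676d66 // legacy, versioned
	magicGGJT uint32 = 0x67676a74 // legacy, mmap-friendly
	magicGGUF uint32 = 0x46554747 // the bytes "GGUF" read little-endian
)

// containerFormat classifies a model file from its leading bytes.
// It returns "gguf", "ggml" (any legacy variant), or "unknown".
func containerFormat(header []byte) string {
	if len(header) < 4 {
		return "unknown"
	}
	switch binary.LittleEndian.Uint32(header[:4]) {
	case magicGGUF:
		return "gguf"
	case magicGGML, magicGGMF, magicGGJT:
		return "ggml"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(containerFormat([]byte("GGUF")))
	fmt.Println(containerFormat([]byte{0x6c, 0x6d, 0x67, 0x67}))
}
```

A caller could use the "ggml" result to trigger a re-pull of the model in gguf form, and fall back to an update prompt if that pull fails, matching the behaviour listed in the commit message.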
  2. 18 Dec, 2023 3 commits
  3. 14 Dec, 2023 1 commit
  4. 13 Dec, 2023 1 commit
  5. 12 Dec, 2023 2 commits
  6. 11 Dec, 2023 2 commits
  7. 10 Dec, 2023 4 commits
  8. 09 Dec, 2023 1 commit
  9. 05 Dec, 2023 7 commits
  10. 04 Dec, 2023 2 commits
    • chat api (#991) · 7a0899d6
      Bruce MacDonald authored
      - update chat docs
      - add messages chat endpoint
      - remove deprecated context and template generate parameters from docs
      - context and template are still supported for the time being and will continue to work as expected
      - add partial response to chat history
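To make the shape of a messages-based chat request concrete, here is a minimal sketch. The type names, the JSON field names (`model`, `messages`, `role`, `content`), and the helper `withPartial` are assumptions for illustration; the commit's own docs are the authority on the real request format. The helper reflects one reading of "add partial response to chat history": an interrupted assistant reply is kept so the next turn has context.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is one turn in a conversation (field names are assumed).
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatRequest carries the whole conversation instead of an opaque context.
type ChatRequest struct {
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`
}

// encodeChat serialises a request body as JSON.
func encodeChat(req ChatRequest) (string, error) {
	b, err := json.Marshal(req)
	return string(b), err
}

// withPartial appends an interrupted assistant reply to the history.
// An empty partial reply leaves the history unchanged.
func withPartial(history []Message, partial string) []Message {
	if partial == "" {
		return history
	}
	return append(history, Message{Role: "assistant", Content: partial})
}

func main() {
	history := []Message{{Role: "user", Content: "Why is the sky blue?"}}
	history = withPartial(history, "Because of Rayleigh")
	body, err := encodeChat(ChatRequest{Model: "llama2", Messages: history})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

The model name `llama2` is a placeholder; any locally available model name would take its place.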
    • update for qwen · 6deebf24
      Michael Yang authored
  11. 26 Nov, 2023 2 commits
  12. 24 Nov, 2023 2 commits
  13. 22 Nov, 2023 2 commits
  14. 21 Nov, 2023 2 commits
  15. 20 Nov, 2023 3 commits
  16. 19 Nov, 2023 2 commits
  17. 17 Nov, 2023 1 commit