1. 21 Sep, 2024 1 commit
  2. 20 Sep, 2024 1 commit
    • Add Windows arm64 support to official builds (#5712) · d632e23f
      Daniel Hiltgen authored
      * Unified arm/x86 windows installer
      
      This adjusts the installer payloads to be architecture aware so we can carry
      both amd64 and arm64 binaries in the installer, and install only the applicable
      architecture at install time.
      
      * Include arm64 in official windows build
      
      * Harden schedule test for slow windows timers
      
      This test seems to be a bit flaky on Windows, so give it more time to converge.
  3. 18 Sep, 2024 1 commit
  4. 17 Sep, 2024 1 commit
    • make patches git am-able · 7bd7b027
      Michael Yang authored
      Raw diffs can be applied with `git apply` but not with `git am`. Git
      patches, e.g. those produced by `git format-patch`, are both apply-able and am-able.
  5. 13 Sep, 2024 1 commit
  6. 12 Sep, 2024 2 commits
    • Use GOARCH for build dirs (#6779) · fda0d3be
      Daniel Hiltgen authored
      Corrects x86_64 vs amd64 discrepancy
    • Optimize container images for startup (#6547) · cd5c8f64
      Daniel Hiltgen authored
      * Optimize container images for startup
      
      This change adjusts how to handle runner payloads to support
      container builds where we keep them extracted in the filesystem.
      This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
      size, and should result in faster startup times for container images.
      
      * Refactor payload logic and add buildx support for faster builds
      
      * Move payloads around
      
      * Review comments
      
      * Converge to buildx based helper scripts
      
      * Use docker buildx action for release
  7. 11 Sep, 2024 1 commit
    • runner: Flush pending responses before returning · 93ac3760
      Jesse Gross authored
      If there are any pending responses (such as from potential stop
      tokens) then we should send them back before ending the sequence.
      Otherwise, we can be missing tokens at the end of a response.
      
      Fixes #6707
  8. 10 Sep, 2024 1 commit
  9. 06 Sep, 2024 1 commit
  10. 05 Sep, 2024 2 commits
  11. 04 Sep, 2024 2 commits
  12. 03 Sep, 2024 2 commits
    • Log system memory at info (#6617) · 037a4d10
      Daniel Hiltgen authored
      On systems with low system memory, we can hit allocation failures that are difficult to diagnose
      without debug logs. Logging system memory at info level makes these easier to spot.
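As a rough illustration of what logging memory at info level can look like with Go's standard `log/slog` package, here is a minimal sketch. The message text, the field names, the `humanBytes` helper, and the hard-coded values are all assumptions for this example; the actual log line in the project may differ.

```go
package main

import (
	"fmt"
	"log/slog"
)

// humanBytes formats a byte count using binary units (KiB, MiB, ...).
func humanBytes(b uint64) string {
	const unit = 1024
	if b < unit {
		return fmt.Sprintf("%d B", b)
	}
	div, exp := uint64(unit), 0
	for n := b / unit; n >= unit; n /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(b)/float64(div), "KMGTPE"[exp])
}

func main() {
	// Hypothetical values; a real program would query the operating system.
	var total, free uint64 = 8 << 30, 1536 << 20

	// Info is enabled by slog's default handler, so this line appears
	// without turning on debug logging.
	slog.Info("system memory", "total", humanBytes(total), "free", humanBytes(free))
}
```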
    • Fix sprintf to snprintf (#5664) · 94fff580
      FellowTraveler authored
      /Users/au/src/ollama/llm/ext_server/server.cpp:289:9: warning: 'sprintf' is deprecated: This function is provided for compatibility reasons only. Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead.
  13. 29 Aug, 2024 1 commit
  14. 27 Aug, 2024 1 commit
  15. 25 Aug, 2024 1 commit
  16. 23 Aug, 2024 2 commits
  17. 22 Aug, 2024 1 commit
    • Fix embeddings memory corruption (#6467) · 90ca8417
      Daniel Hiltgen authored
      * Fix embeddings memory corruption
      
      The patch was causing a buffer overrun. Once it was removed, though, parallelism
      in server.cpp led to hitting an assert due to slot/seq IDs being >= token count. To
      work around this, only use slot 0 for embeddings.
      
      * Fix embed integration test assumption
      
      The token eval count has changed with recent llama.cpp bumps (0.3.5+)
  18. 21 Aug, 2024 1 commit
  19. 20 Aug, 2024 1 commit
    • Split rocm back out of bundle (#6432) · a017cf2f
      Daniel Hiltgen authored
      We're over budget for GitHub's maximum release artifact size with rocm + 2 cuda
      versions. This splits rocm back out as a discrete artifact, but keeps the layout so it can
      be extracted into the same location as the main bundle.
  20. 19 Aug, 2024 6 commits
  21. 12 Aug, 2024 1 commit
  22. 11 Aug, 2024 2 commits
  23. 08 Aug, 2024 1 commit
  24. 07 Aug, 2024 1 commit
  25. 06 Aug, 2024 1 commit
  26. 05 Aug, 2024 4 commits