  1. 26 Sep, 2024 1 commit
    • server: close response body on error (#6986) · 03608cb4
      Blake Mizerany authored
      This change closes the response body when an error occurs in
      makeRequestWithRetry. Previously, the first, non-200 response body was
      not closed before reattempting the request. This change ensures that
      the response body is closed in all cases where an error occurs,
      preventing leaks of file descriptors.
      
      Fixes #6974
  2. 20 Sep, 2024 1 commit
    • Add Windows arm64 support to official builds (#5712) · d632e23f
      Daniel Hiltgen authored
      * Unified arm/x86 windows installer
      
      This adjusts the installer payloads to be architecture aware so we can carry
      both amd64 and arm64 binaries in the installer, and install only the applicable
      architecture at install time.
      
      * Include arm64 in official windows build
      
      * Harden schedule test for slow windows timers
      
      This test seems to be a bit flaky on windows, so give it more time to converge
  3. 18 Sep, 2024 1 commit
  4. 12 Sep, 2024 1 commit
    • Optimize container images for startup (#6547) · cd5c8f64
      Daniel Hiltgen authored
      * Optimize container images for startup
      
      This change adjusts how to handle runner payloads to support
      container builds where we keep them extracted in the filesystem.
      This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
      size, and should result in faster startup times for container images.
      
      * Refactor payload logic and add buildx support for faster builds
      
      * Move payloads around
      
      * Review comments
      
      * Converge to buildx based helper scripts
      
      * Use docker buildx action for release
  5. 11 Sep, 2024 1 commit
  6. 05 Sep, 2024 3 commits
  7. 28 Aug, 2024 2 commits
  8. 27 Aug, 2024 2 commits
  9. 23 Aug, 2024 1 commit
  10. 22 Aug, 2024 1 commit
    • Fix embeddings memory corruption (#6467) · 90ca8417
      Daniel Hiltgen authored
      * Fix embeddings memory corruption
      
      The patch was leading to a buffer overrun corruption.  Once removed though, parallelism
      in server.cpp led to hitting an assert due to slot/seq IDs being >= token count.  To
      work around this, only use slot 0 for embeddings.
      
      * Fix embed integration test assumption
      
      The token eval count has changed with recent llama.cpp bumps (0.3.5+)
  11. 21 Aug, 2024 1 commit
  12. 19 Aug, 2024 1 commit
  13. 18 Aug, 2024 2 commits
  14. 17 Aug, 2024 1 commit
  15. 16 Aug, 2024 1 commit
  16. 15 Aug, 2024 1 commit
  17. 14 Aug, 2024 2 commits
  18. 13 Aug, 2024 3 commits
    • server: reduce max connections used in download (#6347) · 8e1050f3
      Blake Mizerany authored
      The previous value of 64 was WAY too high and unnecessary. It reached
      diminishing returns and blew past it. This is a more reasonable number
      for _most_ normal cases. For users on cloud servers with excellent
      network quality, this will keep screaming for them, without hitting our
      CDN limits. For users with relatively poor network quality, this will
      keep them from saturating their network and causing other issues.
    • lint · 2697d7f5
      Michael Yang authored
      - fixes printf: non-constant format string in call to fmt.Printf
      - fixes SA1032: arguments have the wrong order
      - disables testifylint
    • Load Embedding Model on Empty Input (#6325) · 8b00a415
      royjhan authored
      * load on empty input
      
      * no load on invalid input
  19. 12 Aug, 2024 3 commits
  20. 11 Aug, 2024 1 commit
  21. 09 Aug, 2024 1 commit
  22. 08 Aug, 2024 2 commits
  23. 07 Aug, 2024 3 commits
    • image: Clarify argument to WriteManifest is config · 97ec8cfd
      Jesse Gross authored
      When creating a model the config layer is appended to the list of
      layers and then the last layer is used as the config when writing the
      manifest. This change directly uses the config layer to write the
      manifest. There is no behavior change but it is less error prone.
    • manifest: Fix crash on startup when trying to clean up unused files (#5840) · 1829fb61
      Jesse Gross authored
      Currently if the config field is missing in the manifest file (or
      corrupted), Ollama will crash when it tries to read it. This can
      happen at startup or when pulling new models.
      
      This data is mostly just used for showing model information so we
      can be tolerant of it not being present - it is not required to
      run the models. Besides avoiding crashing, this also gives us the
      ability to restructure the config in the future by pulling it
      into the main manifest file.
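A minimal sketch of the tolerant behaviour described above, assuming a simplified manifest shape. The `Manifest` and `Layer` types and the `activeDigests` function are hypothetical stand-ins, not Ollama's actual types; the idea shown is only that a missing or null config is skipped rather than dereferenced.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Layer and Manifest are simplified, hypothetical shapes for illustration.
type Layer struct {
	Digest string `json:"digest"`
}

type Manifest struct {
	Config *Layer  `json:"config"` // pointer so an absent config decodes to nil
	Layers []Layer `json:"layers"`
}

// activeDigests (hypothetical) lists every digest a manifest references.
// A missing or empty config is tolerated instead of causing a crash.
func activeDigests(raw []byte) ([]string, error) {
	var m Manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	var digests []string
	for _, l := range m.Layers {
		digests = append(digests, l.Digest)
	}
	if m.Config != nil && m.Config.Digest != "" {
		digests = append(digests, m.Config.Digest)
	}
	return digests, nil
}

func main() {
	withConfig := []byte(`{"config":{"digest":"sha256:cfg"},"layers":[{"digest":"sha256:aaa"}]}`)
	withoutConfig := []byte(`{"layers":[{"digest":"sha256:aaa"}]}`)

	for _, raw := range [][]byte{withConfig, withoutConfig} {
		digests, err := activeDigests(raw)
		fmt.Println(digests, err)
	}
}
```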
    • manifest: Don't prune layers if we can't open a manifest file · 685a5353
      Jesse Gross authored
      If there is an error when opening a manifest file (corrupted, permission denied, etc.)
      then the referenced layers will not be included in the list of active
      layers. This causes them to be deleted when pruning happens at startup
      or a model is pulled.
      
      In such a situation, we should prefer to preserve data in the hopes that
      it can be recovered rather than being aggressive about deletion.
  24. 06 Aug, 2024 1 commit
    • Ensure sparse files on windows during download · fc85f50a
      Daniel Hiltgen authored
      The file.Truncate call on windows will write the whole file
      unless you set the sparse flag, leading to heavy I/O at the
      beginning of download.  This should improve our
      I/O behavior on windows and put less stress on the user's disk.
  25. 02 Aug, 2024 2 commits
  26. 01 Aug, 2024 1 commit