1. 13 Jul, 2024 1 commit
  2. 05 Jul, 2024 3 commits
  3. 03 Jul, 2024 1 commit
    • Only set default keep_alive on initial model load · 955f2a4e
      Daniel Hiltgen authored
      This change fixes the handling of keep_alive so that, if a client
      request omits the setting, the default is applied only on initial
      load. Once the model is loaded, if new requests leave it unset, we
      keep whatever keep_alive was already in effect.
  4. 02 Jul, 2024 3 commits
  5. 01 Jul, 2024 2 commits
  6. 25 Jun, 2024 1 commit
    • llm: speed up gguf decoding by a lot (#5246) · cb42e607
      Blake Mizerany authored
      Previously, several costly operations made loading GGUF files, along
      with their metadata and tensor information, very slow:
      
        * Too many allocations when decoding strings
        * Hitting disk for every read of every key and value, resulting in
          an excessive number of syscalls and disk I/O operations.
      
      The show API now takes 33ms, down from 800ms+, for llama3 on a MacBook
      Pro M3.
      
      This commit also makes it possible to skip collecting large arrays of
      values when decoding GGUFs. When such keys are encountered, their
      values are null and are encoded as such in JSON.
      
      Also, this fixes a broken test that was not encoding valid GGUF.
  7. 21 Jun, 2024 1 commit
  8. 19 Jun, 2024 1 commit
    • Extend api/show and ollama show to return more model info (#4881) · fedf7163
      royjhan authored
      
      
      * API Show Extended
      
      * Initial Draft of Information
      Co-Authored-By: Patrick Devine <pdevine@sonic.net>
      
      * Clean Up
      
      * Descriptive arg error messages and other fixes
      
      * Second Draft of Show with Projectors Included
      
      * Remove Chat Template
      
      * Touches
      
      * Prevent wrapping from files
      
      * Verbose functionality
      
      * Docs
      
      * Address Feedback
      
      * Lint
      
      * Resolve Conflicts
      
      * Function Name
      
      * Tests for api/show model info
      
      * Show Test File
      
      * Add Projector Test
      
      * Clean routes
      
      * Projector Check
      
      * Move Show Test
      
      * Touches
      
      * Doc update
      
      ---------
      Co-authored-by: Patrick Devine <pdevine@sonic.net>
  9. 16 Jun, 2024 1 commit
  10. 06 Jun, 2024 2 commits
  11. 04 Jun, 2024 3 commits
  12. 24 May, 2024 1 commit
  13. 20 May, 2024 3 commits
  14. 16 May, 2024 1 commit
  15. 15 May, 2024 1 commit
  16. 14 May, 2024 7 commits
  17. 10 May, 2024 2 commits
  18. 09 May, 2024 5 commits
  19. 08 May, 2024 1 commit
    • Add preflight OPTIONS handling and update CORS config (#4086) · cef45fea
      Bruce MacDonald authored
      * Add preflight OPTIONS handling and update CORS config
      
      - Implement early return with HTTP 204 (No Content) for OPTIONS requests in allowedHostsMiddleware to optimize preflight handling.
      
      - Extend CORS configuration to explicitly allow the 'Authorization' header and the 'OPTIONS' method when the OLLAMA_ORIGINS environment variable is set.
      
      * allow auth, content-type, and user-agent headers
      
      * Update routes.go