1. 07 Aug, 2024 1 commit
    • manifest: Don't prune layers if we can't open a manifest file · 685a5353
      Jesse Gross authored
      If there is an error when opening a manifest file (corrupted, permission denied, etc.)
      then the referenced layers will not be included in the list of active
      layers. This causes them to be deleted when pruning happens at startup
      or a model is pulled.
      
      In such a situation, we should prefer to preserve data in the hope that
      it can be recovered rather than being aggressive about deletion.
  2. 06 Aug, 2024 1 commit
    • Ensure sparse files on windows during download · fc85f50a
      Daniel Hiltgen authored
      The file.Truncate call on Windows will write the whole file
      unless you set the sparse flag, leading to heavy I/O at the
      beginning of the download. This should improve our
      I/O behavior on Windows and put less stress on the user's disk.
  3. 02 Aug, 2024 2 commits
  4. 01 Aug, 2024 5 commits
  5. 31 Jul, 2024 5 commits
  6. 30 Jul, 2024 2 commits
    • Add Metrics to `api/embed` response (#5709) · 1b44d873
      royjhan authored
      * add prompt tokens to embed response
      
      * rm slog
      
      * metrics
      
      * types
      
      * prompt n
      
      * clean up
      
      * reset submodule
      
      * update tests
      
      * test name
      
      * list metrics
    • Prevent partial loading on mixed GPU brands · 34542099
      Daniel Hiltgen authored
      In multi-brand GPU setups, if we couldn't fully load the model we
      would fall through the scheduler and mistakenly try to load across
      a mix of brands. This makes sure we find the set of GPU(s) that
      best fits the partial load.
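One way to picture the fix described above is to group GPUs by brand before choosing where a partial load goes. The sketch below is illustrative only: the `GPU` type, its fields, and `pickPartialLoadGPUs` are hypothetical, and "best fit" is approximated here as the single-brand group with the most total free memory, which is an assumption rather than the scheduler's real heuristic.

```go
package main

import "sort"

// GPU is a hypothetical, simplified device description.
type GPU struct {
	ID      string
	Library string // e.g. "cuda" or "rocm"; stands in for the GPU brand
	FreeMem uint64 // free memory in bytes
}

// pickPartialLoadGPUs groups GPUs by library and returns the single-brand
// group with the most total free memory. It never mixes brands.
// "Most total free memory" is an assumed stand-in for "best fit".
func pickPartialLoadGPUs(gpus []GPU) []GPU {
	groups := make(map[string][]GPU)
	for _, g := range gpus {
		groups[g.Library] = append(groups[g.Library], g)
	}

	// Sort library names so ties resolve deterministically.
	libs := make([]string, 0, len(groups))
	for lib := range groups {
		libs = append(libs, lib)
	}
	sort.Strings(libs)

	var best []GPU
	var bestFree uint64
	for _, lib := range libs {
		var free uint64
		for _, g := range groups[lib] {
			free += g.FreeMem
		}
		if best == nil || free > bestFree {
			best, bestFree = groups[lib], free
		}
	}
	return best
}
```

With one 8 GiB "cuda" device and two 6 GiB "rocm" devices, this sketch returns the two "rocm" devices, since together they offer more free memory than the single "cuda" device and the result stays within one brand.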
  7. 26 Jul, 2024 3 commits
  8. 25 Jul, 2024 1 commit
  9. 22 Jul, 2024 10 commits
  10. 21 Jul, 2024 1 commit
  11. 20 Jul, 2024 1 commit
  12. 19 Jul, 2024 1 commit
  13. 18 Jul, 2024 3 commits
  14. 17 Jul, 2024 2 commits
  15. 16 Jul, 2024 2 commits