1. 29 Aug, 2024 2 commits
  2. 28 Aug, 2024 8 commits
  3. 27 Aug, 2024 11 commits
  4. 25 Aug, 2024 1 commit
  5. 23 Aug, 2024 6 commits
  6. 22 Aug, 2024 1 commit
    • Fix embeddings memory corruption (#6467) · 90ca8417
      Daniel Hiltgen authored
      * Fix embeddings memory corruption
      
      The patch was causing buffer-overrun corruption.  Once it was removed, though, parallelism
      in server.cpp led to hitting an assert because slot/seq IDs were >= the token count.  To
      work around this, only use slot 0 for embeddings.
      
      * Fix embed integration test assumption
      
      The token eval count has changed with recent llama.cpp bumps (0.3.5+)
  7. 21 Aug, 2024 8 commits
  8. 20 Aug, 2024 1 commit
    • Split rocm back out of bundle (#6432) · a017cf2f
      Daniel Hiltgen authored
      We're over budget for GitHub's maximum release artifact size with rocm + 2 cuda
      versions.  This splits rocm back out as a discrete artifact, but keeps the layout so it can
      be extracted into the same location as the main bundle.
  9. 19 Aug, 2024 2 commits