1. 23 Aug, 2024 3 commits
  2. 22 Aug, 2024 1 commit
    • Daniel Hiltgen's avatar
      Fix embeddings memory corruption (#6467) · 90ca8417
      Daniel Hiltgen authored
      * Fix embeddings memory corruption
      
      The patch was leading to a buffer overrun corruption.  Once removed though, parallism
      in server.cpp lead to hitting an assert due to slot/seq IDs being >= token count.  To
      work around this, only use slot 0 for embeddings.
      
      * Fix embed integration test assumption
      
      The token eval count has changed with recent llama.cpp bumps (0.3.5+)
      90ca8417
  3. 21 Aug, 2024 8 commits
  4. 20 Aug, 2024 1 commit
    • Daniel Hiltgen's avatar
      Split rocm back out of bundle (#6432) · a017cf2f
      Daniel Hiltgen authored
      We're over budget for github's maximum release artifact size with rocm + 2 cuda
      versions.  This splits rocm back out as a discrete artifact, but keeps the layout so it can
      be extracted into the same location as the main bundle.
      a017cf2f
  5. 19 Aug, 2024 17 commits
  6. 18 Aug, 2024 2 commits
  7. 17 Aug, 2024 1 commit
  8. 16 Aug, 2024 1 commit
  9. 15 Aug, 2024 4 commits
  10. 14 Aug, 2024 2 commits