1. 22 Aug, 2024 1 commit
      Fix embeddings memory corruption (#6467) · 90ca8417
      Daniel Hiltgen authored
      * Fix embeddings memory corruption
      
      The patch was causing a buffer overrun. Once it was removed, however, parallelism
      in server.cpp led to hitting an assert because slot/seq IDs could end up >= the token
      count. To work around this, only slot 0 is used for embeddings (see the sketch after
      this entry).
      
      * Fix embed integration test assumption
      
      The expected token eval count has changed with recent llama.cpp bumps (0.3.5+).
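      The workaround above pins every embedding request to a single slot. The following is a
      minimal sketch of that idea only; the server_slot struct and select_slot() helper are
      illustrative stand-ins, not the actual server.cpp symbols.

      ```cpp
      #include <vector>

      // Hypothetical stand-in for a server.cpp processing slot.
      struct server_slot {
          int  id      = -1;
          bool is_busy = false;
      };

      // Pick a slot for an incoming request. Completion requests may take any
      // free slot, but embedding requests are pinned to slot 0 so their
      // slot/seq ID can never reach or exceed the (small) token count of an
      // embedding prompt.
      server_slot *select_slot(std::vector<server_slot> &slots, bool is_embedding) {
          if (is_embedding) {
              return slots[0].is_busy ? nullptr : &slots[0];  // caller queues if slot 0 is busy
          }
          for (auto &slot : slots) {
              if (!slot.is_busy) {
                  return &slot;
              }
          }
          return nullptr;  // no free slot; caller queues the request
      }

      int main() {
          std::vector<server_slot> slots(4);
          for (int i = 0; i < 4; ++i) slots[i].id = i;

          // An embedding request always lands on slot 0, regardless of parallelism.
          server_slot *s = select_slot(slots, /*is_embedding=*/true);
          return (s != nullptr && s->id == 0) ? 0 : 1;
      }
      ```

      The trade-off is that embeddings are serialized through one slot; the commit accepts that
      in exchange for avoiding the slot/seq assert.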
  2. 06 Aug, 2024 1 commit
  3. 31 Jul, 2024 2 commits
  4. 29 Jul, 2024 1 commit
  5. 26 Jul, 2024 1 commit
  6. 25 Jul, 2024 1 commit
  7. 24 Jul, 2024 1 commit
  8. 22 Jul, 2024 1 commit
  9. 21 Jul, 2024 1 commit
  10. 20 Jul, 2024 1 commit
  11. 07 Jul, 2024 1 commit
  12. 05 Jul, 2024 2 commits
  13. 03 Jul, 2024 1 commit
  14. 27 Jun, 2024 1 commit
  15. 17 Jun, 2024 1 commit
  16. 07 Jun, 2024 1 commit
  17. 30 May, 2024 1 commit
  18. 23 May, 2024 2 commits
  19. 16 May, 2024 1 commit
  20. 06 May, 2024 1 commit
  21. 26 Apr, 2024 1 commit
  22. 25 Apr, 2024 1 commit
  23. 02 Apr, 2024 1 commit
  24. 23 Mar, 2024 1 commit
  25. 14 Mar, 2024 1 commit
  26. 13 Mar, 2024 1 commit
  27. 11 Mar, 2024 2 commits
  28. 10 Mar, 2024 2 commits
  29. 09 Mar, 2024 1 commit
  30. 08 Mar, 2024 1 commit
  31. 01 Mar, 2024 1 commit
  32. 20 Feb, 2024 1 commit
  33. 19 Feb, 2024 1 commit
      Fix CUDA leaks · fc39a6cd
      Daniel Hiltgen authored
      This should resolve the problem where we don't fully unload from the GPU
      when we go idle.
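      The commit text above does not include code, but the idle-unload idea can be sketched
      against the llama.cpp C API of that era, where llama_free() releases the context (KV cache
      and compute buffers) and llama_free_model() releases the model weights, including VRAM held
      for offloaded layers. The runner_state struct and unload_on_idle() name are hypothetical.

      ```cpp
      #include "llama.h"

      // Hypothetical holder for the loaded model and its context.
      struct runner_state {
          llama_model   *model = nullptr;
          llama_context *ctx   = nullptr;
      };

      // Called once the server decides the model has been idle long enough.
      // Freeing both the context and the model (rather than only dropping
      // host-side references) is what actually returns GPU memory to the driver.
      void unload_on_idle(runner_state &st) {
          if (st.ctx != nullptr) {
              llama_free(st.ctx);          // releases KV cache and compute buffers
              st.ctx = nullptr;
          }
          if (st.model != nullptr) {
              llama_free_model(st.model);  // releases model weights, including offloaded VRAM
              st.model = nullptr;
          }
      }
      ```

      A later request then reloads the model from scratch, trading reload latency for predictable
      VRAM usage while idle.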
  34. 12 Feb, 2024 1 commit
  35. 06 Feb, 2024 1 commit