    Fix embeddings memory corruption (#6467) · 90ca8417
    Daniel Hiltgen authored
    * Fix embeddings memory corruption
    
    The patch was causing a buffer overrun.  Once it was removed, though, parallelism
    in server.cpp led to hitting an assert because slot/seq IDs were >= the token count.
    To work around this, only use slot 0 for embeddings.
    
    * Fix embed integration test assumption
    
    The token eval count has changed with recent llama.cpp bumps (0.3.5+)
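
The slot 0 workaround from the first item can be illustrated with a small Go sketch. This is not the server.cpp change itself: the `slot` type and `pickSlot` function are hypothetical names invented here, and the sketch assumes a slot's ID is also used as the sequence ID of the tokens it submits. Under that assumption, pinning embedding requests to slot 0 keeps the sequence ID at 0, which is below the token count of any non-empty batch.

```go
package main

import "fmt"

// slot is a hypothetical stand-in for a server processing slot. The sketch
// assumes the slot ID is also the sequence ID of the tokens it submits.
type slot struct {
	id   int
	busy bool
}

// pickSlot returns the ID of the slot a request should use, or -1 if the
// request has to wait. Embedding requests are pinned to slot 0; all other
// requests take the first free slot.
func pickSlot(slots []slot, embedding bool) int {
	if len(slots) == 0 {
		return -1
	}
	if embedding {
		if slots[0].busy {
			return -1
		}
		return slots[0].id
	}
	for _, s := range slots {
		if !s.busy {
			return s.id
		}
	}
	return -1
}

func main() {
	slots := []slot{{id: 0, busy: true}, {id: 1}, {id: 2}}
	fmt.Println(pickSlot(slots, false)) // completion request: first free slot
	fmt.Println(pickSlot(slots, true))  // embedding request: must wait for slot 0
}
```

The trade-off in this sketch is that embedding requests are serialized on one slot, while other requests still spread across the free slots.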
embed_test.go 4.86 KB