    llm: Don't always evict models on CPU-only systems · 5317202c
    Jesse Gross authored
    Model eviction happens when at least one other model is loaded
    and we are unable to load all layers into VRAM. However, on
    CPU-only systems we can never load layers into VRAM, so that
    condition was always met and eviction was triggered constantly.
    
    Fixes #13227
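
The eviction condition described in the message can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the code in server.go: the `loadState` type, its fields, and `shouldEvict` are hypothetical names invented for this example, and the actual change may express the CPU-only check differently.

```go
package main

import "fmt"

// loadState is a hypothetical summary of what a scheduler knows when
// deciding whether to evict. It is not a type from the real code base.
type loadState struct {
	otherModelsLoaded int // models already resident besides the one being loaded
	gpuCount          int // 0 on a CPU-only system (assumed representation)
	layersTotal       int // layers in the model being loaded
	layersInVRAM      int // layers that would fit in VRAM
}

// shouldEvict reports whether evicting another model could help the
// new model fit. Hypothetical helper illustrating the described rule.
func shouldEvict(s loadState) bool {
	// With nothing else loaded there is nothing to evict.
	if s.otherModelsLoaded == 0 {
		return false
	}
	// On a CPU-only system no layer can ever be placed in VRAM, so a
	// partial offload says nothing about memory pressure from other models.
	if s.gpuCount == 0 {
		return false
	}
	// On GPU systems, evict when not every layer fits in VRAM.
	return s.layersInVRAM < s.layersTotal
}

func main() {
	cpuOnly := loadState{otherModelsLoaded: 1, gpuCount: 0, layersTotal: 32, layersInVRAM: 0}
	gpuPartial := loadState{otherModelsLoaded: 1, gpuCount: 1, layersTotal: 32, layersInVRAM: 20}

	fmt.Println("cpu-only evicts:", shouldEvict(cpuOnly))
	fmt.Println("gpu partial offload evicts:", shouldEvict(gpuPartial))
}
```

Without the `gpuCount == 0` guard, the CPU-only case would fall through to the final comparison, where `layersInVRAM` is always below `layersTotal`, which matches the constant eviction the message describes.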
server.go 53.3 KB