    kvcache: Log contents of cache when unable to find a slot · 0d38b665
    Jesse Gross authored
    Sliding window attention has a bug where we run out of KV cache
    slots, likely because entries are not all removed as they slide
    out of range. Log the contents of the cache when this occurs to
    help track down the source.
    
    Bug #10127
causal.go 19.7 KB