    kvcache: Account for source tensors in defrag operation count · d3e9ca3e
    Jesse Gross authored
    Defragging the KV cache can generate a lot of operations, so we
    need to be careful not to overflow the number of nodes that the
    graph can support. We currently account for all of the nodes that
    we add to the graph for each move, but we also need to include the
    original cache tensors.
    
    Fixes #9904
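The accounting described above can be illustrated with a minimal sketch. It is not the code from causal.go: the helper name `maxDefragMoves`, its parameters, and the figures used (6 nodes per move per layer, 2 source tensors per layer, 32 layers) are assumptions chosen only to show why the source tensors must be subtracted from the node budget before dividing it among moves.

```go
package main

import "fmt"

// maxDefragMoves is a hypothetical helper (not part of the original commit).
// It returns how many moves fit in a graph limited to maxGraphNodes nodes,
// assuming each move adds nodesPerMove nodes per layer and each layer also
// contributes sourceTensorsPerLayer pre-existing cache tensors to the graph.
func maxDefragMoves(maxGraphNodes, layers, nodesPerMove, sourceTensorsPerLayer int) int {
	if layers <= 0 || nodesPerMove <= 0 {
		return 0
	}
	// Reserve room for the source tensors first; they count against the limit
	// no matter how many moves are scheduled.
	budget := maxGraphNodes - sourceTensorsPerLayer*layers
	if budget <= 0 {
		return 0
	}
	return budget / (nodesPerMove * layers)
}

func main() {
	const maxNodes, layers, perMove = 1152, 32, 6

	// Ignoring the source tensors allows 6 moves, which would need
	// 6*6*32 + 2*32 = 1216 nodes and exceed the 1152-node limit.
	fmt.Println(maxDefragMoves(maxNodes, layers, perMove, 0)) // prints 6

	// Accounting for them allows 5 moves: 5*6*32 + 2*32 = 1024 nodes.
	fmt.Println(maxDefragMoves(maxNodes, layers, perMove, 2)) // prints 5
}
```

With these assumed figures, the uncorrected count overshoots the limit by one move, which is the kind of overflow the commit message describes.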
causal.go 15.2 KB