kvcache: Support non-causal attention
Models can disable causality for all or part of their processing while continuing to store data in the KV cache.