"vscode:/vscode.git/clone" did not exist on "a23f0158bb1aeb2c4078a032647c51f03c03a166"
Simplify the `attention` function (#2609)
* Simplify the `attention` function

- Use one definition rather than multiple.
- Add `key`/`value` arguments, so that we don't need the
  `PREFILL_IN_KVCACHE` constant.
- Make it kwargs-only (to avoid mixing up the various `Tensor` args).

* Fixup flashinfer support
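
To illustrate the kwargs-only point, here is a minimal sketch of a keyword-only attention signature. It is not the code from this commit: the parameter name `softmax_scale`, the `Matrix` alias, and the plain-list arithmetic are assumptions made for illustration, and the real function operates on `Tensor` arguments and backend kernels.

```python
import math
from typing import List

Matrix = List[List[float]]  # hypothetical stand-in for a Tensor


def attention(
    *,  # everything after the bare star must be passed by keyword
    query: Matrix,
    key: Matrix,
    value: Matrix,
    softmax_scale: float,
) -> Matrix:
    """Toy single-head scaled dot-product attention over plain lists."""
    out = []
    for q in query:
        # Scaled dot product of this query row with every key row.
        scores = [softmax_scale * sum(a * b for a, b in zip(q, k)) for k in key]
        # Numerically stable softmax weights (normalized below).
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted average of the value rows.
        out.append(
            [
                sum(w * v[j] for w, v in zip(weights, value)) / total
                for j in range(len(value[0]))
            ]
        )
    return out
```

Because the arguments are keyword-only, a call such as `attention(q, k, v, 1.0)` raises `TypeError`, so same-shaped arguments cannot be swapped silently by position.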