vllm/attention/backends/flashmla.py · 469e903b252ee79054dad613b0112ef646787f51 · OpenDAS / vllm_cscc · GitLab

[V1][Bugfix] Standardize quantized kv cache rejection for attention backends (#14221) · 6832707e
Michael Goin authored Mar 06, 2025
```
Signed-off-by: mgoin <mgoin64@gmail.com>
```

flashmla.py 8.81 KB