vllm/v1/worker/gpu_model_runner.py · e397bd659226785b9131eb9bf18f22e37cb68349 · OpenDAS / vllm_cscc · GitLab

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532) · 3e41992f
Lucas Wilkinson authored Dec 12, 2025
```
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
```

gpu_model_runner.py 236 KB