vllm/model_executor/layers/quantization/fp8.py · 500b93c8dc84182776f17c3c31053aaba9865e8b · OpenDAS / vllm_cscc · GitLab

[Misc] Support FP8 kv cache scales from compressed-tensors (#6528) · 9e0b558a
Michael Goin authored Jul 23, 2024

fp8.py 17.3 KB