[CK_TILE] naive attn support FP8 KVCache quant (#1747)
* quant
* fix bug
* simple smoothquant after softmax
* update kv-quant
* update stride
* fix fp8-pertoken-kvcache
* update int8/fp8 quant support
---------
Co-authored-by: so <a.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>