[Bugfix] Fix KV scales inconsistency in fp8 MLA & FlashInfer kv_cache_dtype "auto" leading to gibberish (#37054)
Signed-off-by: Andy Lo <andy@mistral.ai>