[Fix] Load kv-cache dtype from hf_quant_config.json automatically (fix for reverted PR) (#30785)
Signed-off-by: <>
Co-authored-by: root <root@gpu-937.slurm-workers-slurm.slurm.svc.cluster.local>