Merge branch 'v0.15.1-dev-lxh-glm5-fix' into 'v0.15.1-dev'
Fix: correct the mla_attention layout for the GLM-5 quantized model and add fp8 support for sparse_attn
See merge request dcutoolkit/deeplearing/vllm!495