vllm/model_executor/models/deepseek_v2.py · b66c8e4b22302c944d6b68732792ecaee81a3768 · OpenDAS / vllm_cscc

Synchronize the modifications from the 12th to the 17th: · b66c8e4b

zhuwenwen authored Dec 17, 2025

Fix the w4a16 conflict in CompressedTensorsLinearMethod
feat(moe): add Marlin W16A16 fused MoE behind VLLM_USE_MARLIN_W16A16_MOE
Replace the fp8_mqa_logits and fp8_paged_mqa_logits interfaces from deepgemm with mqa_logits and paged_mqa_logits from lightop

deepseek_v2.py 69 KB
