Add slilu_mul_quant fusion to the forward_CRQ branch logic of the deepseek_v2_w4a8 model. See merge request dcutoolkit/deeplearing/vllm!261
Support custom-rms-quant fusion for deepseekv2-w4a8. See merge request dcutoolkit/deeplearing/vllm!259
[fix] Resolve DeepSeek-V3.2 mtp startup failure. See merge request dcutoolkit/deeplearing/vllm!253
Support blaslt w8a8 GEMM op. See merge request dcutoolkit/deeplearing/vllm!238
[fix] Fix a typo in mtp. See merge request dcutoolkit/deeplearing/vllm!249
Resolve OOM when marlin is enabled for w8a8 with pp16. See merge request dcutoolkit/deeplearing/vllm!248
[fix] Resolve corruption of tensor data allocated in mla when GPU memory runs out in extreme cases with mtp enabled. See merge request dcutoolkit/deeplearing/vllm!247