- 12 Mar, 2026 14 commits
-
-
wujl5 authored
-
wangmin6 authored
perf: add rmsQuant to the MoE part of the DS V2 model. See merge request dcutoolkit/deeplearing/vllm!492
-
wujl5 authored
-
wangmin6 authored
fix: fix bug http://hpczentao.sugon.com/bug-view-118388.html See merge request dcutoolkit/deeplearing/vllm!491
-
-
wangmin6 authored
Modify mrope_1d. See merge request dcutoolkit/deeplearing/vllm!490
-
wangmin6 authored
perf: add rmsQuant to MLA in the DS V2 model. See merge request dcutoolkit/deeplearing/vllm!487
-
wujl5 authored
-
wangmin6 authored
mla: opt-cat concatenation routing for prefill and decode. See merge request dcutoolkit/deeplearing/vllm!483
-
wangmin6 authored
perf: add DTBMM fusion to DS v2, disabled by default. See merge request dcutoolkit/deeplearing/vllm!480
-
wujl5 authored
-
guanyu1 authored
-
guanyu1 authored
-
guanyu1 authored
-
- 11 Mar, 2026 9 commits
-
-
zhangqha authored
Adapt dense loading for the mtp layer of dpsk_v32. See merge request dcutoolkit/deeplearing/vllm!479
-
zhangqha authored
feat: fix the dsa mqa interface for compatibility with glm5. See merge request dcutoolkit/deeplearing/vllm!478
-
yangql authored
-
liuchy5 authored
-
laibao authored
-
zhangqha authored
Fix: add the parameters missing when calling the Triton MoE gemm and align the interface. See merge request dcutoolkit/deeplearing/vllm!476
-
zhangqha authored
# Conflicts: # vllm/model_executor/layers/fused_moe/fused_moe.py
-
zhangqha authored
feat: support shared expert fusion. See merge request dcutoolkit/deeplearing/vllm!469
-
lixh6 authored
Fix: Extend MAX_VPT from 32 to 256 to accommodate large-scale MoE models (e.g., GLM-5-quantized model).
-
- 10 Mar, 2026 4 commits
- 09 Mar, 2026 3 commits
- 07 Mar, 2026 1 commit
-
-
wanglong3 authored
-
- 06 Mar, 2026 8 commits
-
-
zhangqha authored
perf: add rmsQuant and siluMulQuant fusion to the Deepseek v2 model. See merge request dcutoolkit/deeplearing/vllm!468
-
wujl5 authored
-
zhangqha authored
support qwen3-asr See merge request dcutoolkit/deeplearing/vllm!466
-
zhangqha authored
[perf] Adapt the glm4_moe model to the rmsquant and silu_quant fused operators. See merge request dcutoolkit/deeplearing/vllm!467
-
王敏 authored
-
zhangqha authored
[perf] Add Module support for the fused split qkv+rmsnorm+rope+kvcache operator; GLM4_MOE adaptation complete. See merge request dcutoolkit/deeplearing/vllm!465
-
王敏 authored
-
zhangqha authored
Fix the dsa workspace bug and add an environment variable to disable DSA: VLLM_DISABLE_DSA=1. See merge request dcutoolkit/deeplearing/vllm!463
-
- 05 Mar, 2026 1 commit
-
-
yangql authored
-