- 06 Mar, 2026 6 commits
-
-
zhangqha authored
support qwen3-asr See merge request dcutoolkit/deeplearing/vllm!466
-
zhangqha authored
[perf] Adapt the glm4_moe model to the rmsquant and silu_quant fused operators See merge request dcutoolkit/deeplearing/vllm!467
-
王敏 authored
-
zhangqha authored
[perf] Add a Module that supports the fused split qkv+rmsnorm+rope+kvcache operator; GLM4_MOE has been adapted See merge request dcutoolkit/deeplearing/vllm!465
-
王敏 authored
-
zhangqha authored
Fix a bug in the dsa workspace and add an environment variable to disable DSA: VLLM_DISABLE_DSA=1 See merge request dcutoolkit/deeplearing/vllm!463
-
- 05 Mar, 2026 8 commits
-
-
yangql authored
-
zhangqha authored
Fix a bug in reading block_shape for channel-int8 See merge request dcutoolkit/deeplearing/vllm!462
-
lixh6 authored
-
weishb authored
-
SAC_fanth authored
-
zhangqha authored
feat:fix dsa See merge request dcutoolkit/deeplearing/vllm!457
-
zhangqha authored
Fix custom allreduce errors raised by new models on K100AI See merge request dcutoolkit/deeplearing/vllm!459
-
xiabo authored
-
- 04 Mar, 2026 5 commits
- 03 Mar, 2026 4 commits
- 02 Mar, 2026 6 commits
- 28 Feb, 2026 2 commits
- 27 Feb, 2026 1 commit
-
-
zhuwenwen authored
feat(sampler): add a reduced topk+topp sampling fast path to lower the cost of full-vocabulary softmax See merge request dcutoolkit/deeplearing/vllm!447
-
- 26 Feb, 2026 1 commit
-
-
laibao authored
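Add the VLLM_V1_USE_REDUCED_TOPK_TOPP_SAMPLER switch and document its applicable scenarios. Precompute max_top_k/has_any_no_top_k in the V1 GPU input batch; the native sampler takes the reduced fast path when the conditions are met and falls back automatically on errors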
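The idea behind a reduced topk+topp path can be sketched for a single request as follows. This is only an illustration under stated assumptions, not the code from this commit: the function names `reduced_topk_topp_probs` and `sample` are hypothetical, the sketch is pure Python rather than batched GPU code, and it assumes top-k is applied before top-p so that softmax only has to run over the k surviving logits. When no usable top_k is set, it falls back to the full vocabulary.

```python
import heapq
import math
import random


def reduced_topk_topp_probs(logits, top_k, top_p):
    """Return {token_id: prob} after top-k then top-p filtering.

    Hypothetical helper: softmax is computed over the top_k candidates
    only, never over the whole vocabulary.
    """
    # Indices of the top_k largest logits, in descending logit order.
    candidates = heapq.nlargest(top_k, range(len(logits)), key=lambda i: logits[i])
    # Numerically stable softmax restricted to the reduced candidate set.
    max_logit = logits[candidates[0]]
    weights = [math.exp(logits[i] - max_logit) for i in candidates]
    total = sum(weights)
    # Top-p: keep the shortest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for token, weight in zip(candidates, weights):
        prob = weight / total
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    # Renormalize the surviving tokens so they sum to 1.
    norm = sum(prob for _, prob in kept)
    return {token: prob / norm for token, prob in kept}


def sample(logits, top_k, top_p, rng=random):
    """Sample one token id; use the full vocabulary if top_k is unusable."""
    if top_k is None or top_k <= 0 or top_k >= len(logits):
        top_k = len(logits)  # assumption: no top_k means no reduction
    dist = reduced_topk_topp_probs(logits, top_k, top_p)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
```

In a batched setting, a precomputed max_top_k would presumably bound the candidate set for every row, and a flag such as has_any_no_top_k would signal that at least one row needs the full-vocabulary path; that mapping is an interpretation of the commit message, not something it states in detail.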
-
- 25 Feb, 2026 3 commits
- 24 Feb, 2026 4 commits