- 10 Feb, 2026 6 commits
- 09 Feb, 2026 5 commits
-
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
[feat] Relaxed MTP supports setting parameters such as temp and top-p. See merge request dcutoolkit/deeplearing/vllm!420
-
zhuwenwen authored
[feat] Support separate scheduling of prefill and decode. See merge request dcutoolkit/deeplearing/vllm!419
-
zhuwenwen authored
Adapt to w8a8 deepep and integrate the lightop version of deepgemm. See merge request dcutoolkit/deeplearing/vllm!418
-
- 08 Feb, 2026 4 commits
- 06 Feb, 2026 13 commits
-
-
zhuwenwen authored
Set fp8_e4m3 as supported only on nmz and support q & kvcache fp8; set VLLM_PCIE_USE_CUSTOM_ALLREDUCE=1
-
zhuwenwen authored
[feat] Support relaxed MTP. See merge request dcutoolkit/deeplearing/vllm!414
-
王敏 authored
-
王敏 authored
-
王敏 authored
# Conflicts:
#   vllm/model_executor/layers/fused_moe/modular_kernel.py
-
zhuwenwen authored
-
王敏 authored
# Conflicts:
#   vllm/model_executor/layers/fused_moe/config.py
#   vllm/model_executor/layers/fused_moe/layer.py
#   vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_marlin.py
-
王敏 authored
-
zhuwenwen authored
-
zhuwenwen authored
-
王敏 authored
-
zhuwenwen authored
-
zhuwenwen authored
-
- 05 Feb, 2026 5 commits
- 04 Feb, 2026 7 commits