1. 13 Nov, 2025 2 commits
    • feat: add output placeholder support to optimize scheduling · 613edd7d
      zhuwenwen authored
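      - Introduce the `VLLM_SCHED_ENABLE_MINIMAL_INJECTION` environment variable to control minimal injection for pipeline-parallel scheduling.
      - Adjust the Scheduler logic to use the new minimal-injection feature.
      - Update the scheduling logic to use output placeholders so that decoding avoids 0-token stalls.
      - Enhance the Scheduler to manage minimal-progress injection based on the batch queue state.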
    • [feat] adapt w4a8 and w8a8 to deepep low latency · 92761bde
      王敏 authored
  2. 03 Nov, 2025 1 commit
  3. 01 Nov, 2025 1 commit
  4. 27 Oct, 2025 1 commit
  5. 24 Oct, 2025 1 commit
    • add VLLM_USE_LIGHTOP_MOE_SUM_MUL_ADD · c2e6f453
      zhuwenwen authored
      support prefix cache on kme
      fix the error in test_moe caused by moe align not supporting 511 and 211
      multi-modal switching to torch implementation on z100l&k100
  6. 17 Oct, 2025 1 commit
  7. 15 Oct, 2025 1 commit
  8. 13 Oct, 2025 3 commits
  9. 10 Oct, 2025 1 commit
  10. 09 Oct, 2025 1 commit
  11. 30 Sep, 2025 1 commit
  12. 24 Sep, 2025 3 commits
  13. 22 Sep, 2025 1 commit
  14. 18 Sep, 2025 1 commit
  15. 17 Sep, 2025 1 commit
  16. 14 Sep, 2025 1 commit
  17. 13 Sep, 2025 3 commits
  18. 09 Sep, 2025 1 commit
  19. 07 Sep, 2025 1 commit
  20. 04 Sep, 2025 1 commit
  21. 01 Sep, 2025 1 commit
  22. 19 Aug, 2025 1 commit
  23. 08 Aug, 2025 1 commit
  24. 06 Aug, 2025 1 commit
  25. 04 Aug, 2025 1 commit
  26. 25 Jul, 2025 1 commit
  27. 17 Jul, 2025 1 commit
  28. 10 Jul, 2025 1 commit
  29. 03 Jul, 2025 2 commits
  30. 02 Jul, 2025 2 commits
  31. 01 Jul, 2025 1 commit