1. 14 Apr, 2026 3 commits
    • Revert "feat: add step3.5-mtp3 feature" · 94823af1
      laibao authored
      This reverts commit a1f4d869.
    • feat: add step3.5-mtp3 feature · a1f4d869
      laibao authored
    • [BUGFIX] Fix Step3p5 MTP parameter loading and EAGLE lm_head sharing logic · 7bf17aa2
      laibao authored
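      fix:
      
      - Fix how Step3p5 MTP identifies optional scalar parameters when loading a checkpoint: the q/k/v zero_point parameters are now part of the optional parameter set, so parameter validation and loading stay consistent.
      
      revert:
      
      - Revert the EAGLE logic that forced the MTP shared_head.head to reuse the target lm_head, avoiding conflicts with the current Step3p5 MTP weight structure.
      
      Purpose:
      
      - Reduce compatibility problems for Step3p5 MTP during weight loading and cut down on abnormal behavior caused by inconsistent lm_head sharing paths, which makes later debugging and collaboration easier.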
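The optional-parameter handling described in commit 7bf17aa2 can be illustrated with a minimal sketch. This is not the code from the commit: the suffix names, `is_optional_param`, and `find_missing_params` are hypothetical, chosen only to show how treating q/k/v zero_point as optional keeps validation and loading in agreement.

```python
# Hypothetical suffixes for optional scalar parameters; the real names in the
# Step3p5 MTP loader may differ.
OPTIONAL_SCALAR_SUFFIXES = ("q_zero_point", "k_zero_point", "v_zero_point")


def is_optional_param(name: str) -> bool:
    """Return True if a parameter may legitimately be absent from a checkpoint."""
    return name.endswith(OPTIONAL_SCALAR_SUFFIXES)


def find_missing_params(expected: set[str], loaded: set[str]) -> list[str]:
    """Return required parameter names that the checkpoint did not provide.

    Optional scalars are skipped, so the validation step applies the same
    rule as the loading step.
    """
    return sorted(
        name
        for name in expected
        if name not in loaded and not is_optional_param(name)
    )
```

With this rule, a checkpoint that omits `q_zero_point` passes validation, while one that omits a required weight is still reported.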
  2. 10 Apr, 2026 1 commit
  3. 03 Apr, 2026 2 commits
  4. 02 Apr, 2026 1 commit
  5. 01 Apr, 2026 1 commit
  6. 26 Mar, 2026 2 commits
  7. 24 Mar, 2026 2 commits
  8. 23 Mar, 2026 1 commit
  9. 21 Mar, 2026 4 commits
  10. 19 Mar, 2026 1 commit
  11. 18 Mar, 2026 1 commit
  12. 17 Mar, 2026 2 commits
  13. 16 Mar, 2026 3 commits
  14. 15 Mar, 2026 1 commit
    • Add FA Unified Attention 2D · eb35ba1b
      fanwl authored
      - Add the VLLM_V1_USE_FA_UNIFIED_ATTN_2D environment variable
      - 0: Triton attention, 1: FA unified attention
  15. 12 Mar, 2026 5 commits
  16. 11 Mar, 2026 2 commits
  17. 09 Mar, 2026 1 commit
  18. 04 Mar, 2026 2 commits
  19. 03 Mar, 2026 1 commit
  20. 02 Mar, 2026 1 commit
  21. 26 Feb, 2026 1 commit
  22. 24 Feb, 2026 1 commit
    • perf(v1): add an optional fast token-id copy path · d3a95d54
      laibao authored
        - Add the environment variable `VLLM_V1_FAST_TOKEN_ID_COPY` (disabled by default)
        - Cache the prompt token ids as an int32 numpy array in `CachedRequestState`
        - When enabled, use `np.copyto` in `InputBatch` to copy prompt/output token ids
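The idea behind commit d3a95d54 can be sketched as follows. This is an illustration under assumptions, not the vLLM implementation: `CachedPromptState` is a hypothetical stand-in for `CachedRequestState`, and `copy_token_ids` and its `fast` argument stand in for whatever `InputBatch` does when `VLLM_V1_FAST_TOKEN_ID_COPY` is enabled.

```python
import numpy as np


class CachedPromptState:
    """Hypothetical stand-in for CachedRequestState."""

    def __init__(self, prompt_token_ids: list[int]):
        self.prompt_token_ids = prompt_token_ids
        # Convert once and cache, so later copies skip list-to-array conversion.
        self.prompt_token_ids_np = np.asarray(prompt_token_ids, dtype=np.int32)


def copy_token_ids(
    token_ids_cpu: np.ndarray,
    row: int,
    state: CachedPromptState,
    output_token_ids: list[int],
    fast: bool,
) -> int:
    """Write prompt and output token ids into one row; return the filled length."""
    n_prompt = len(state.prompt_token_ids)
    end = n_prompt + len(output_token_ids)
    if fast:
        # Fast path: array-to-array copies into views of the destination row.
        np.copyto(token_ids_cpu[row, :n_prompt], state.prompt_token_ids_np)
        np.copyto(
            token_ids_cpu[row, n_prompt:end],
            np.asarray(output_token_ids, dtype=np.int32),
        )
    else:
        # Default path: slice assignment from Python lists.
        token_ids_cpu[row, :n_prompt] = state.prompt_token_ids
        token_ids_cpu[row, n_prompt:end] = output_token_ids
    return end
```

Both paths produce the same buffer contents. In this sketch the fast path differs only in reusing the cached int32 array for the prompt instead of converting the Python list on every copy.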
  23. 16 Feb, 2026 1 commit