1. 10 Apr, 2026 1 commit
  2. 03 Apr, 2026 2 commits
  3. 02 Apr, 2026 1 commit
  4. 01 Apr, 2026 1 commit
  5. 26 Mar, 2026 2 commits
  6. 24 Mar, 2026 2 commits
  7. 23 Mar, 2026 1 commit
  8. 21 Mar, 2026 4 commits
  9. 19 Mar, 2026 1 commit
  10. 18 Mar, 2026 1 commit
  11. 17 Mar, 2026 2 commits
  12. 16 Mar, 2026 3 commits
  13. 15 Mar, 2026 1 commit
    • Add FA Unified Attention 2D · eb35ba1b
      fanwl authored
      - Add the VLLM_V1_USE_FA_UNIFIED_ATTN_2D environment variable
      - 0: Triton attention, 1: FA unified attention
  14. 12 Mar, 2026 5 commits
  15. 11 Mar, 2026 2 commits
  16. 09 Mar, 2026 1 commit
  17. 04 Mar, 2026 2 commits
  18. 03 Mar, 2026 1 commit
  19. 02 Mar, 2026 1 commit
  20. 26 Feb, 2026 1 commit
  21. 24 Feb, 2026 1 commit
    • perf(v1): add an optional fast token-id copy path · d3a95d54
      laibao authored
        - Add the environment variable `VLLM_V1_FAST_TOKEN_ID_COPY` (disabled by default)
        - Cache the int32 prompt token ids (as a numpy array) in `CachedRequestState`
        - When enabled, use `np.copyto` in `InputBatch` to copy prompt/output token ids
  22. 16 Feb, 2026 1 commit
  23. 09 Feb, 2026 3 commits
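
The fast token-id copy path described in commit d3a95d54 can be illustrated roughly as follows. This is only a minimal sketch, not the code from the commit: the helpers `env_flag` and `copy_token_ids` are hypothetical names, the flat int32 row stands in for whatever buffer `InputBatch` actually uses, and reading the flag as the string "1" is an assumption. Only the variable name `VLLM_V1_FAST_TOKEN_ID_COPY`, the cached int32 numpy array, and the use of `np.copyto` come from the commit message.

```python
import os

import numpy as np


def env_flag(name: str, default: str = "0") -> bool:
    # Hypothetical helper: treat "1" as enabled (assumed convention).
    return os.environ.get(name, default) == "1"


def copy_token_ids(dst_row: np.ndarray, prompt_ids, output_ids, fast: bool) -> int:
    # Hypothetical stand-in for the copy step; returns the number of tokens written.
    n_prompt = len(prompt_ids)
    n_total = n_prompt + len(output_ids)
    if fast:
        # Fast path: prompt_ids is assumed to be a cached int32 numpy array,
        # so np.copyto writes it into the destination without a list conversion.
        np.copyto(dst_row[:n_prompt], prompt_ids)
        np.copyto(dst_row[n_prompt:n_total], np.asarray(output_ids, dtype=np.int32))
    else:
        # Default path: plain slice assignment from Python lists.
        dst_row[:n_prompt] = prompt_ids
        dst_row[n_prompt:n_total] = output_ids
    return n_total


# The flag is off unless the variable is set, matching "disabled by default".
use_fast = env_flag("VLLM_V1_FAST_TOKEN_ID_COPY")
row = np.zeros(8, dtype=np.int32)
cached_prompt = np.asarray([5, 6, 7], dtype=np.int32)
written = copy_token_ids(row, cached_prompt, [9, 10], fast=use_fast)
```

Both branches are meant to leave the same contents in the destination row; the difference is that the fast path converts the prompt to an array once and reuses it across steps.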