1. 31 Jan, 2026 1 commit
  2. 26 Jan, 2026 1 commit
  3. 22 Jan, 2026 1 commit
  4. 15 Jan, 2026 3 commits
  5. 13 Jan, 2026 2 commits
  6. 08 Jan, 2026 1 commit
  7. 06 Jan, 2026 1 commit
  8. 02 Jan, 2026 1 commit
  9. 30 Dec, 2025 1 commit
  10. 26 Dec, 2025 1 commit
  11. 23 Dec, 2025 2 commits
  12. 19 Dec, 2025 2 commits
  13. 18 Dec, 2025 1 commit
  14. 10 Dec, 2025 1 commit
  15. 09 Dec, 2025 1 commit
  16. 07 Dec, 2025 2 commits
  17. 05 Dec, 2025 1 commit
    • [BugFix] Eagerly abort cancelled final-step requests (#29987) · dc264bce
      Nick Hill authored
      
      
      Currently, when a request is cancelled while executing its final
      step, its "completion" is handled by normal stop processing (e.g.
      length or stop token), so the abort has no effect. This is typically
      not a problem, but when a KV connector is involved, the connector
      sees the request as completed successfully rather than aborted.
      
      This is problematic for disaggregated prefill, which frees KV
      cache blocks if the request was aborted but not if it completed
      successfully. Since the cancelled request will never be sent to
      the decode side, its KV cache blocks remain pinned until the
      fall-back timeout expires. The problem is exacerbated when many
      requests are cancelled and/or there are large prefills whose forward
      pass takes a long time (since the window in which a cancellation
      can land is bigger).
      
      This PR fixes the problem by processing pending aborts
      immediately prior to processing model output each step; we process
      only aborts, not new requests, since it's preferable for latency to
      process model outputs before new incoming requests.
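
The ordering described above can be illustrated with a toy step loop. This is a minimal sketch, not the code from the PR: `ToyEngine`, `Request`, `MsgType`, `Status`, and `demo` are hypothetical names invented for illustration, and KV block freeing is represented only by the final status a connector would observe.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum, auto


class MsgType(Enum):
    ADD = auto()
    ABORT = auto()


class Status(Enum):
    RUNNING = auto()
    FINISHED_STOPPED = auto()   # normal stop (length or stop token)
    FINISHED_ABORTED = auto()   # cancelled by the client


@dataclass
class Request:
    req_id: str
    max_tokens: int
    tokens: list = field(default_factory=list)
    status: Status = Status.RUNNING


class ToyEngine:
    """Hypothetical engine: an inbox of (MsgType, payload) plus live requests."""

    def __init__(self):
        self.inbox = deque()
        self.requests = {}

    def _process_pending_aborts(self):
        # Drain only ABORT messages for known requests. ADD messages (and
        # aborts for requests not yet added) stay queued in their order.
        kept = deque()
        while self.inbox:
            kind, payload = self.inbox.popleft()
            req = self.requests.get(payload) if kind is MsgType.ABORT else None
            if req is not None:
                if req.status is Status.RUNNING:
                    req.status = Status.FINISHED_ABORTED
            else:
                kept.append((kind, payload))
        self.inbox = kept

    def process_model_output(self, new_tokens, eager_aborts=True):
        if eager_aborts:
            self._process_pending_aborts()
        for req_id, token in new_tokens.items():
            req = self.requests.get(req_id)
            if req is None or req.status is not Status.RUNNING:
                continue  # an aborted request must not be marked completed
            req.tokens.append(token)
            if len(req.tokens) >= req.max_tokens:
                req.status = Status.FINISHED_STOPPED


def demo(eager_aborts):
    engine = ToyEngine()
    # Request "a" is on its final step (one token left to generate).
    engine.requests["a"] = Request("a", max_tokens=1)
    # While the forward pass runs, a cancel and a new request arrive.
    engine.inbox.append((MsgType.ABORT, "a"))
    engine.inbox.append((MsgType.ADD, Request("b", max_tokens=4)))
    engine.process_model_output({"a": 42}, eager_aborts=eager_aborts)
    return engine


print(demo(eager_aborts=False).requests["a"].status)
print(demo(eager_aborts=True).requests["a"].status)
```

Without the eager pass, the final-step request ends as `FINISHED_STOPPED` and the pending abort is ignored. With it, the request ends as `FINISHED_ABORTED` while the new-request message remains queued until after the model output has been handled.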
      
      Fixes #26400.
      Signed-off-by: Nick Hill <nhill@redhat.com>
  18. 03 Dec, 2025 1 commit
  19. 02 Dec, 2025 1 commit
  20. 29 Nov, 2025 1 commit
  21. 28 Nov, 2025 1 commit
  22. 14 Nov, 2025 1 commit
  23. 13 Nov, 2025 2 commits
  24. 12 Nov, 2025 1 commit
  25. 11 Nov, 2025 1 commit
  26. 10 Nov, 2025 1 commit
  27. 07 Nov, 2025 1 commit
  28. 05 Nov, 2025 2 commits
  29. 01 Nov, 2025 1 commit
  30. 26 Oct, 2025 1 commit
  31. 21 Oct, 2025 1 commit
  32. 18 Oct, 2025 1 commit