"docs/pages/backends/vllm/README.md" did not exist on "2c3066bd5ddedfcb871fd8663d50fe1533f327fb"
- 23 Mar, 2026 1 commit
Biswa Panda authored
- 02 Mar, 2026 1 commit
Ryan McCormick authored
- 02 Jan, 2026 1 commit
Tushar Sharma authored
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
- 10 Nov, 2025 1 commit
Graham King authored
Signed-off-by: Graham King <grahamk@nvidia.com>
- 29 Sep, 2025 1 commit
akshaver authored
- 16 Sep, 2025 1 commit
Graham King authored
Signed-off-by: Graham King <grahamk@nvidia.com>
- 22 Aug, 2025 1 commit
Graham King authored
- 22 May, 2025 1 commit
jmswen authored
- 29 Apr, 2025 1 commit
Graham King authored
In a distributed system we don't know if the remote workers need pre-processing done ingress-side or not. Previously Client required us to decide this before discovering the remote endpoints, which was fine because pre-processing was worker-side. As part of moving pre-processing back to ingress-side we need to split this into two steps:
  - Client discovers the endpoints, and (later PR) will fetch their Model Deployment Card.
  - PushRouter will use the Model Deployment Card to decide if they need pre-processing or not, which affects the types of the generic parameters.
Part of #743
- 18 Apr, 2025 1 commit
Hongkuan Zhou authored
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
- 25 Feb, 2025 1 commit
Neelay Shah authored
Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
- 13 Feb, 2025 1 commit
Ryan Olson authored
- 12 Feb, 2025 1 commit
Ryan Olson authored
Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
- 11 Feb, 2025 1 commit
Graham King authored
- 10 Feb, 2025 1 commit
Graham King authored
- 05 Feb, 2025 1 commit
J Wyman authored
- 04 Feb, 2025 1 commit
Ryan Olson authored
the journey begins