- 02 Mar, 2026 1 commit
-
-
MatejKosec authored
feat: Full Anthropic Messages API cache_control support (top-level, per-block, system block arrays) (#6629)
Signed-off-by: Matej Kosec <mkosec@nvidia.com>
-
- 07 Jan, 2026 1 commit
-
-
KrishnanPrash authored
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
-
- 02 Jan, 2026 1 commit
-
-
Tushar Sharma authored
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
-
- 31 Dec, 2025 1 commit
-
-
Neelay Shah authored
Co-authored-by: Claude <noreply@anthropic.com>
-
- 22 Dec, 2025 1 commit
-
-
smatta-star authored
Signed-off-by: Satvik Matta <smatta@nvidia.com>
-
- 19 Dec, 2025 1 commit
-
-
milesial authored
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
-
- 09 Dec, 2025 1 commit
-
-
Vladislav Nosivskoy authored
Signed-off-by: Vladislav Nosivskoy <vladnosiv@gmail.com>
-
- 08 Nov, 2025 1 commit
-
-
Ryan McCormick authored
feat: Add support for skip_special_tokens parameter in v1/completions and v1/chat/completions endpoints (#4175)
-
- 03 Nov, 2025 1 commit
-
-
KrishnanPrash authored
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
-
- 27 Oct, 2025 1 commit
-
-
zhongdaor-nv authored
Signed-off-by: zhongdaor <zhongdaor@nvidia.com>
-
- 10 Oct, 2025 1 commit
-
-
ryan-lempka authored
-
- 29 Sep, 2025 1 commit
-
-
nv-nedelman-1 authored
Signed-off-by: Nicholas Edelman <nedelman@nvidia.com>
-
- 26 Sep, 2025 1 commit
-
-
Ayush Agarwal authored
Signed-off-by: ayushag <ayushag@nvidia.com>
-
- 23 Sep, 2025 1 commit
-
-
Ryan Olson authored
Signed-off-by: ayushag <ayushag@nvidia.com>
-
- 17 Sep, 2025 2 commits
-
-
Greg Clark authored
Signed-off-by: Greg Clark <grclark@nvidia.com>
-
Chi McIsaac authored
Signed-off-by: Chi McIsaac <chixie.mcisaac@gmail.com>
-
- 16 Sep, 2025 2 commits
-
-
Graham King authored
Signed-off-by: Graham King <grahamk@nvidia.com>
-
Ayush Agarwal authored
Signed-off-by: ayushag <ayushag@nvidia.com>
-
- 29 Aug, 2025 2 commits
-
-
Ayush Agarwal authored
-
ryan-lempka authored
Signed-off-by: Ryan Lempka <rlempka@nvidia.com>
-
- 28 Aug, 2025 1 commit
-
-
ryan-lempka authored
Signed-off-by: Ryan Lempka <rlempka@nvidia.com>
-
- 22 Aug, 2025 1 commit
-
-
Graham King authored
-
- 20 Aug, 2025 1 commit
-
-
nachiketb-nvidia authored
Change the chat completions response objects from structs to dynamo_async_openai types. Implement aggregator traits for the chat completion structs. Add reasoning_content under message and delta message in lib/async-openai.
-
- 19 Aug, 2025 1 commit
-
-
nachiketb-nvidia authored
Co-authored-by: Graham King <grahamk@nvidia.com>
-
- 14 Aug, 2025 1 commit
-
-
Greg Clark authored
Signed-off-by: Greg Clark <grclark@nvidia.com>
-
- 12 Aug, 2025 1 commit
-
-
KrishnanPrash authored
feat: Add frontend support for `min_tokens` and `ignore_eos` (outside of `nvext`) and Structured Output / Guided Decoding (#2380)
Signed-off-by: KrishnanPrash <140860868+KrishnanPrash@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Ayush Agarwal <ayushag@nvidia.com>
-
- 01 Jul, 2025 1 commit
-
-
Nathan Barry authored
-
- 26 Jun, 2025 1 commit
-
-
Paul Hendricks authored
-
- 17 Mar, 2025 1 commit
-
-
Graham King authored
Previously, several parts of the stack ensured that max tokens (for a single request) was set. Now only text input sets it (to 8k); everything else leaves it as is, potentially unset. The engines themselves have very small defaults: 16 for vllm and 128 for sglang. Also fix the dynamo-run CUDA startup message to print only if we're using an engine that would benefit from it (mistralrs, llamacpp).
-
- 08 Mar, 2025 1 commit
-
-
Neelay Shah authored
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
-
- 05 Mar, 2025 1 commit
-
-
Neelay Shah authored
Co-authored-by: Graham King <grahamk@nvidia.com>
-
- 27 Feb, 2025 5 commits
-
-
Paul Hendricks authored
-
Paul Hendricks authored
-
Paul Hendricks authored
-
Paul Hendricks authored
-
Paul Hendricks authored
-
- 26 Feb, 2025 1 commit
-
-
Paul Hendricks authored
Co-authored-by: Graham King <grahamk@nvidia.com>
-
- 25 Feb, 2025 1 commit
-
-
Neelay Shah authored
Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
- 24 Feb, 2025 1 commit
-
-
Biswa Panda authored
-
- 10 Feb, 2025 1 commit
-
-
Ryan Olson authored
Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
-