- 15 Sep, 2025 1 commit
-
-
Elyas Mehtabuddin authored
Signed-off-by:
ayushag <ayushag@nvidia.com> Signed-off-by:
Biswa Panda <biswa.panda@gmail.com> Co-authored-by:
ayushag <ayushag@nvidia.com> Co-authored-by:
Biswa Panda <biswa.panda@gmail.com>
-
- 10 Sep, 2025 1 commit
-
-
Michael Feil authored
Signed-off-by:michaelfeil <me@michaelfeil.eu>
-
- 27 Aug, 2025 1 commit
-
-
GuanLuo authored
-
- 26 Aug, 2025 1 commit
-
-
Chi McIsaac authored
-
- 22 Aug, 2025 2 commits
-
-
Graham King authored
-
Ayush Agarwal authored
-
- 21 Aug, 2025 1 commit
-
-
Michael Feil authored
Signed-off-by:
Michael Feil <63565275+michaelfeil@users.noreply.github.com> Co-authored-by:
Ryan McCormick <mccormick.codes@gmail.com>
-
- 20 Aug, 2025 1 commit
-
-
nachiketb-nvidia authored
Changing the chat completions response objects from structs to types of dynamo_async_openai Implement aggregator traits for them chat completion structs add reasoning_content under message and delta message in lib/async-openai
-
- 19 Aug, 2025 2 commits
-
-
nachiketb-nvidia authored
Co-authored-by:Graham King <grahamk@nvidia.com>
-
Yan Ru Pei authored
-
- 13 Aug, 2025 1 commit
-
-
Hongkuan Zhou authored
-
- 12 Aug, 2025 1 commit
-
-
KrishnanPrash authored
feat: Add frontend support for `min_tokens` and `ignore_eos` (outside of `nvext`) and Structured Output / Guided Decoding (#2380) Signed-off-by:
KrishnanPrash <140860868+KrishnanPrash@users.noreply.github.com> Co-authored-by:
Ryan McCormick <rmccormick@nvidia.com> Co-authored-by:
Ayush Agarwal <ayushag@nvidia.com>
-
- 07 Aug, 2025 1 commit
-
-
Neelay Shah authored
Signed-off-by:
Neelay Shah <neelays@nvidia.com> Co-authored-by:
coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by:
Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
-
- 23 Jul, 2025 1 commit
-
-
heisenberglit authored
-
- 18 Jul, 2025 1 commit
-
-
Ryan Olson authored
-
- 09 Jul, 2025 1 commit
-
-
Paul Hendricks authored
-
- 01 Jul, 2025 1 commit
-
-
Paul Hendricks authored
-
- 26 Jun, 2025 1 commit
-
-
Paul Hendricks authored
-
- 13 Jun, 2025 1 commit
-
-
Hongkuan Zhou authored
Signed-off-by:Hongkuan Zhou <tedzhouhk@gmail.com>
-
- 11 Jun, 2025 1 commit
-
-
Hongkuan Zhou authored
-
- 04 Jun, 2025 2 commits
-
-
Paul Hendricks authored
-
Tom O'Brien authored
-
- 03 Jun, 2025 1 commit
-
-
Hongkuan Zhou authored
Signed-off-by:
Hongkuan Zhou <tedzhouhk@gmail.com> Co-authored-by:
jothomson <jwillthomson19@gmail.com> Co-authored-by:
Ryan McCormick <rmccormick@nvidia.com>
-
- 21 May, 2025 2 commits
-
-
Graham King authored
-
Graham King authored
- Stop advertising a model when it's last instance stops. Previously was when any instance stops. - Faster locks on model manager. - Move discovery code out of http, as it is used by all inputs.
-
- 19 May, 2025 1 commit
-
-
Tom O'Brien authored
Implements OpenAI embeddings (interface only). - Adds ModelType::Embedding - Adds OpenAI embedding request/response structs - Adds support for embedding model discovery
-
- 29 Apr, 2025 1 commit
-
-
Abrar Shivani authored
Adds support for specifying default request parameters through a json template file that can be applied across all inference requests. This enables consistent parameter settings while still allowing per-request overrides. Changes: - Add --request-template CLI flag to specify template file path - Integrate template support in HTTP, batch and text input modes - Template values can be overridden by individual request parameters - Example template.json: ``` { "model": "Qwen2.5-3B-Instruct", "temperature": 0.7, "max_completion_tokens": 4096 } ```
-
- 21 Apr, 2025 1 commit
-
-
Graham King authored
"echo_core" is an engine that echoes the post-processed request back to you so you can see the template. Good for testing. It needed an extra flag set to work correctly.
-
- 26 Mar, 2025 1 commit
-
-
Ryan Olson authored
-
- 08 Mar, 2025 1 commit
-
-
Neelay Shah authored
Co-authored-by:Biswa Panda <biswa.panda@gmail.com>
-
- 05 Mar, 2025 1 commit
-
-
Neelay Shah authored
Co-authored-by:Graham King <grahamk@nvidia.com>
-
- 28 Feb, 2025 1 commit
-
-
Paul Hendricks authored
-
- 27 Feb, 2025 2 commits
-
-
Paul Hendricks authored
-
Paul Hendricks authored
-
- 26 Feb, 2025 1 commit
-
-
Paul Hendricks authored
Co-authored-by:Graham King <grahamk@nvidia.com>
-
- 25 Feb, 2025 1 commit
-
-
Neelay Shah authored
Signed-off-by:
Neelay Shah <neelays@nvidia.com> Co-authored-by:
Ryan McCormick <rmccormick@nvidia.com>
-
- 10 Feb, 2025 1 commit
-
-
Ryan Olson authored
Signed-off-by:
Ryan Olson <ryanolson@users.noreply.github.com> Co-authored-by:
Ryan McCormick <rmccormick@nvidia.com> Co-authored-by:
Neelay Shah <neelays@nvidia.com>
-