- 22 Aug, 2025 2 commits
-
-
Graham King authored
-
Graham King authored
-
- 21 Aug, 2025 2 commits
-
-
nachiketb-nvidia authored
-
Graham King authored
-
- 20 Aug, 2025 2 commits
-
-
Ayush Agarwal authored
-
nachiketb-nvidia authored
Change the chat completions response objects from structs to dynamo_async_openai types. Implement aggregator traits for the chat completion structs. Add reasoning_content under message and delta message in lib/async-openai.
-
- 19 Aug, 2025 2 commits
-
-
nachiketb-nvidia authored
Co-authored-by: Graham King <grahamk@nvidia.com>
-
atchernych authored
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
-
- 18 Aug, 2025 1 commit
-
-
Ayush Agarwal authored
-
- 15 Aug, 2025 1 commit
-
-
Ayush Agarwal authored
-
- 14 Aug, 2025 1 commit
-
-
Greg Clark authored
Signed-off-by: Greg Clark <grclark@nvidia.com>
-
- 13 Aug, 2025 1 commit
-
-
Elyas Mehtabuddin authored
-
- 12 Aug, 2025 1 commit
-
-
KrishnanPrash authored
feat: Add frontend support for `min_tokens` and `ignore_eos` (outside of `nvext`) and Structured Output / Guided Decoding (#2380)
Signed-off-by: KrishnanPrash <140860868+KrishnanPrash@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Ayush Agarwal <ayushag@nvidia.com>
-
- 11 Aug, 2025 1 commit
-
-
Graham King authored
-
- 07 Aug, 2025 1 commit
-
-
Ayush Agarwal authored
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
- 18 Jul, 2025 2 commits
-
-
Ryan Olson authored
-
Jacky authored
-
- 17 Jul, 2025 1 commit
-
-
Ryan Olson authored
-
- 09 Jul, 2025 1 commit
-
-
Paul Hendricks authored
-
- 07 Jul, 2025 1 commit
-
-
Jacky authored
-
- 03 Jul, 2025 1 commit
-
-
Tom O'Brien authored
-
- 01 Jul, 2025 2 commits
-
-
Nathan Barry authored
-
Paul Hendricks authored
-
- 26 Jun, 2025 4 commits
-
-
Paul Hendricks authored
-
Paul Hendricks authored
-
Paul Hendricks authored
-
Paul Hendricks authored
-
- 25 Jun, 2025 2 commits
-
-
Zhongdongming Dai authored
-
ishandhanani authored
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
- 24 Jun, 2025 2 commits
-
-
Paul Hendricks authored
-
Paul Hendricks authored
-
- 11 Jun, 2025 1 commit
-
-
Hongkuan Zhou authored
-
- 04 Jun, 2025 2 commits
-
-
Paul Hendricks authored
-
Tom O'Brien authored
-
- 03 Jun, 2025 1 commit
-
-
Hongkuan Zhou authored
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: jothomson <jwillthomson19@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
- 02 Jun, 2025 1 commit
-
-
Graham King authored
It was confusing to have two names for one type. This tidy-up, started in #1064, is now complete.
-
- 29 May, 2025 1 commit
-
-
Hongkuan Zhou authored
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
-
- 23 May, 2025 1 commit
-
-
Yan Ru Pei authored
-
- 19 May, 2025 1 commit
-
-
Tom O'Brien authored
Implements OpenAI embeddings (interface only).
  - Adds ModelType::Embedding
  - Adds OpenAI embedding request/response structs
  - Adds support for embedding model discovery
-
- 14 May, 2025 1 commit
-
-
Graham King authored
Router:
```
dynamo-run in=http out=dyn://dynamo.endpoint.generate --router-mode kv
```
Worker (* N):
```
dynamo-run in=dyn://dynamo.endpoint.generate out=vllm /data/llms/Qwen/Qwen3-4B
```
You need patched vllm and the C bindings `.so`. Full docs in the updated guide: `docs/guides/dynamo_run.md`.

This gives us a pure-Rust ingress node: OpenAI compliant HTTP server + Pre-processor + KV-aware router.
-