Commits · 79a9d69d2bd3f89af0b08221ce153f6fd26c7e01 · OpenDAS / dynamo

28 Aug, 2025 2 commits
- feat: Prevent double-tokenization when EPP picks worker (#2559) · 7d13b6e3
  atchernych authored Aug 28, 2025
  
- chore: deprecate duplicate params in nvext (#2754) · e3619ce0
  ryan-lempka authored Aug 27, 2025
```
Signed-off-by: Ryan Lempka <rlempka@nvidia.com>
```
26 Aug, 2025 2 commits
- feat: align OpenAI response IDs with distributed trace IDs (#2496) · a485ab78
  Chi McIsaac authored Aug 26, 2025
  
- feat: parse normal text along with tool calls (#2709) · 889d6529
  Ayush Agarwal authored Aug 26, 2025
  
25 Aug, 2025 2 commits
- feat: enable --dyn-reasoning-parser flag to set reasoning parser for vllm deployments (#2700) · f5a41004
  nachiketb-nvidia authored Aug 25, 2025
  
- feat: add gpt oss reasoning parser through harmony (#2656) · 3036e60b
  nachiketb-nvidia authored Aug 25, 2025
```
- couple of refactors
- added a new dependency, openai-harmony
- implemented the gpt oss parser
```
22 Aug, 2025 2 commits
- chore: Rust to 1.89 and edition 2024 (#2659) · bce74588
  Graham King authored Aug 22, 2025
  
- feat: [vLLM] implement cli args for tool and reasoning parsers (#2619) · cbe854fc
  Ayush Agarwal authored Aug 22, 2025
  
21 Aug, 2025 2 commits
- feat: enable basic reasoning parsing of <think> </think> tokens (#2555) · 8e8152a1
  nachiketb-nvidia authored Aug 21, 2025
  
- chore: Remove Clone / Sync from DeltaGenerator (#2598) · 0c50a233
  Graham King authored Aug 21, 2025
  
20 Aug, 2025 2 commits

- feat: added parsers lib (#2542) · 526b02f1
  Ayush Agarwal authored Aug 20, 2025

- chore: remove flatten for chat response types, add reasoning_content (#2543) · c12fe501
  nachiketb-nvidia authored Aug 19, 2025

  Change the chat completions response objects from structs to types of dynamo_async_openai.
  Implement aggregator traits for the chat completion structs.
  Add reasoning_content under message and delta message in lib/async-openai.

19 Aug, 2025 2 commits
- chore: Bring async-openai into repo as request starter (#2520) · 199b9a30
  nachiketb-nvidia authored Aug 19, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
- feat: skip router when worker id is pre-determined (#2450) · 6bc6d400
  atchernych authored Aug 19, 2025
```
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
```
18 Aug, 2025 1 commit
- chore: enable tool call array parsing (#2466) · 41f095cf
  Ayush Agarwal authored Aug 18, 2025
  
15 Aug, 2025 1 commit
- chore: Tool call parsers incremental improvements + Model Specific Parsers (#2457) · a7184bec
  Ayush Agarwal authored Aug 15, 2025
  
14 Aug, 2025 1 commit
- feat: logprob handling (#2426) · f476fd74
  Greg Clark authored Aug 14, 2025
```
Signed-off-by: Greg Clark <grclark@nvidia.com>
```
13 Aug, 2025 1 commit
- chore: Refactor tool calling for wider support in the future (#2393) · 086ea4f0
  Elyas Mehtabuddin authored Aug 13, 2025
  
12 Aug, 2025 1 commit

- feat: Add frontend support for `min_tokens` and `ignore_eos` (outside of `nvext`) and Structured Output / Guided Decoding (#2380) · 18bb779e
  KrishnanPrash authored Aug 12, 2025

  Signed-off-by: KrishnanPrash <140860868+KrishnanPrash@users.noreply.github.com>
  Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
  Co-authored-by: Ayush Agarwal <ayushag@nvidia.com>

07 Aug, 2025 1 commit
- chore: guided decoding support for nvext (#2339) · b165ec4a
  Ayush Agarwal authored Aug 07, 2025
```
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
```
18 Jul, 2025 1 commit
- feat: http disconnects (#2014) · 343a4814
  Ryan Olson authored Jul 18, 2025
  
17 Jul, 2025 1 commit
- feat: record + analyze logprobs (#1957) · 49b7a0d9
  Ryan Olson authored Jul 17, 2025
  
09 Jul, 2025 1 commit
- feat: Support for unary tool use in ChatCompletions API (#1800) · 5e2f29f5
  Paul Hendricks authored Jul 09, 2025
  
01 Jul, 2025 2 commits
- feat: Validation engine for validating OpenAI api request data (#1674) · ee86bad3
  Nathan Barry authored Jul 01, 2025
  
- feat: Support for Responses API (#1694) · dfbd741d
  Paul Hendricks authored Jul 01, 2025
  
26 Jun, 2025 4 commits
- refactor: remove dead protocols code and organize imports idiomatically (#1669) · 9d7c5df5
  Paul Hendricks authored Jun 26, 2025
  
- refactor: removing unsized integer conversions (#1668) · 8a2d6529
  Paul Hendricks authored Jun 26, 2025
  
- refactor: refactored using CompletionResponse (#1658) · e3f1bd5d
  Paul Hendricks authored Jun 26, 2025
  
- refactor: refactored using Choice and CompletionFinishReason (#1635) · 7b7b6a6d
  Paul Hendricks authored Jun 26, 2025
  
25 Jun, 2025 2 commits
- fix: fix usage.total_tokens count for OpenAI endpoints (#1649) · 6032c82f
  Zhongdongming Dai authored Jun 25, 2025
  
- feat: support batch `/completions` (#1626) · fc16a79b
  ishandhanani authored Jun 25, 2025
```
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
```
24 Jun, 2025 2 commits
- refactor: using async_openai::types::Logprobs (#1625) · 0edc886f
  Paul Hendricks authored Jun 24, 2025
  
- refactor: refactoring to use async_openai::types::CompletionUsage (#1397) · 0c9ae4dd
  Paul Hendricks authored Jun 24, 2025
  
11 Jun, 2025 1 commit
- refactor: use comment filed in annotated to pass metric-related information (#1385) · 227a0e71
  Hongkuan Zhou authored Jun 11, 2025
  
04 Jun, 2025 2 commits
- refactor: Rename CompletionRequest to NvCreateCompletionRequest (#1383) · c103d56a
  Paul Hendricks authored Jun 04, 2025
  
- feat: add implementation for embeddings (#1290) · e83009a6
  Tom O'Brien authored Jun 04, 2025
  
03 Jun, 2025 1 commit

- feat: add more metrics to rust frontend (#1315) · 98d4abbb
  Hongkuan Zhou authored Jun 03, 2025

  Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
  Co-authored-by: jothomson <jwillthomson19@gmail.com>
  Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

19 May, 2025 1 commit

- feat: Add OpenAI Embeddings interface in rust lib (#1110) · 73fdfb8a
  Tom O'Brien authored May 19, 2025

  Implements OpenAI embeddings (interface only).

  - Adds ModelType::Embedding
  - Adds OpenAI embedding request/response structs
  - Adds support for embedding model discovery

17 Mar, 2025 1 commit

- fix(vllm,sglang): Let the engine enforce max tokens (#216) · 05765cd4
  Graham King authored Mar 17, 2025

  Previously, several parts of the stack ensured that max tokens (for this single request) was set.

  Now only text input sets it (to 8k). Everything else leaves it as is, potentially blank. The engines themselves have very small defaults: 16 for vllm and 128 for sglang.

  Also fix the dynamo-run CUDA startup message to print only if we're using an engine that would benefit from it (mistralrs, llamacpp).
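
The max-tokens behaviour described in #216 can be illustrated with a minimal sketch. This is not the code from that commit: `InputKind`, `resolve_max_tokens`, and `effective_limit` are hypothetical names, and reading "8k" as 8192 is an assumption (the message does not say whether it means 8192 or 8000). The engine defaults of 16 and 128 are the ones quoted in the commit message.

```rust
/// Hypothetical classification of where a request entered the stack.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputKind {
    /// Interactive text input.
    Text,
    /// Any other input (for example an HTTP request).
    Other,
}

/// Assumed reading of "8k" from the commit message.
pub const TEXT_INPUT_MAX_TOKENS: u32 = 8192;
/// Engine defaults as quoted in the commit message.
pub const VLLM_DEFAULT_MAX_TOKENS: u32 = 16;
pub const SGLANG_DEFAULT_MAX_TOKENS: u32 = 128;

/// Decide which max_tokens value, if any, is forwarded to the engine.
/// Only text input fills in a value; every other input passes the
/// request's value through unchanged, possibly as `None`.
pub fn resolve_max_tokens(input: InputKind, requested: Option<u32>) -> Option<u32> {
    match input {
        // Assumption: an explicitly requested value still wins for text input.
        InputKind::Text => Some(requested.unwrap_or(TEXT_INPUT_MAX_TOKENS)),
        InputKind::Other => requested,
    }
}

/// What the engine ends up enforcing: the forwarded value,
/// or the engine's own default when nothing was forwarded.
pub fn effective_limit(forwarded: Option<u32>, engine_default: u32) -> u32 {
    forwarded.unwrap_or(engine_default)
}
```

Under these assumptions, a non-text request that leaves max tokens blank is capped by the engine's own default, which is why a response may stop after only 16 tokens on vllm unless the client sets the field explicitly.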

14 Mar, 2025 1 commit
- fix: Fix cargo doc warnings for lib/llm (#151) · dac63127
  Ryan McCormick authored Mar 14, 2025
  