Commits · b658ba6139b8a6d7c796cee97e810bf270a9e893 · OpenDAS / dynamo

22 Aug, 2025 2 commits
- fix: Tests now pass with RUST_BACKTRACE set (#2647) · 819bb62f
  Graham King authored Aug 22, 2025
  
  819bb62f
- chore(llm): Rename protocols::Endpoint to EndpointId (#2615) · 6a358f7c
  Graham King authored Aug 22, 2025
  
  6a358f7c
21 Aug, 2025 2 commits
- feat: enable basic reasoning parsing of <think> </think> tokens (#2555) · 8e8152a1
  nachiketb-nvidia authored Aug 21, 2025
  
  8e8152a1
- chore: Remove Clone / Sync from DeltaGenerator (#2598) · 0c50a233
  Graham King authored Aug 21, 2025
  
  0c50a233
20 Aug, 2025 2 commits

- feat: added parsers lib (#2542) · 526b02f1
  Ayush Agarwal authored Aug 20, 2025
  
  526b02f1
- chore: remove flatten for chat response types, add reasoning_content (#2543) · c12fe501
  nachiketb-nvidia authored Aug 19, 2025
  
  Change the chat completions response objects from structs to types of dynamo_async_openai.
  Implement aggregator traits for the chat completion structs.
  Add reasoning_content under message and delta message in lib/async-openai.
  
  c12fe501
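
Note on #2555 and #2543: the sketch below is not the parser from this repository. It only illustrates one way text inside `<think> </think>` could be separated into a `reasoning_content` value. The names `ParsedOutput` and `split_reasoning` are hypothetical, and the sketch assumes the whole model output is available as a single string rather than as streamed deltas.

```rust
/// Hypothetical result type: reasoning text kept apart from the visible answer.
#[derive(Debug, PartialEq)]
pub struct ParsedOutput {
    /// Text found between `<think>` and `</think>`, if a think block exists.
    pub reasoning_content: Option<String>,
    /// Everything outside the think block.
    pub content: String,
}

/// Splits the first `<think>...</think>` block out of `text`.
/// Assumption of this sketch: an unclosed `<think>` means the rest is reasoning.
pub fn split_reasoning(text: &str) -> ParsedOutput {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    let start = match text.find(OPEN) {
        Some(i) => i,
        // No think block: return the text unchanged as ordinary content.
        None => {
            return ParsedOutput {
                reasoning_content: None,
                content: text.to_string(),
            }
        }
    };

    let before = &text[..start];
    let rest = &text[start + OPEN.len()..];
    let (reasoning, after) = match rest.find(CLOSE) {
        Some(end) => (&rest[..end], &rest[end + CLOSE.len()..]),
        None => (rest, ""),
    };

    ParsedOutput {
        reasoning_content: Some(reasoning.trim().to_string()),
        content: format!("{}{}", before, after).trim().to_string(),
    }
}
```

Calling `split_reasoning("<think>add 2 and 2</think>The answer is 4.")` yields the reasoning text and the answer as separate fields. A streaming parser would additionally need to handle tags split across chunks, which this sketch leaves out.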

19 Aug, 2025 2 commits
- chore: Bring async-openai into repo as request starter (#2520) · 199b9a30
  nachiketb-nvidia authored Aug 19, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  199b9a30
- feat: skip router when worker id is pre-determined (#2450) · 6bc6d400
  atchernych authored Aug 19, 2025
```
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
```
  6bc6d400
18 Aug, 2025 1 commit
- chore: enable tool call array parsing (#2466) · 41f095cf
  Ayush Agarwal authored Aug 18, 2025
  
  41f095cf
15 Aug, 2025 1 commit
- chore: Tool call parsers incremental improvements + Model Specific Parsers (#2457) · a7184bec
  Ayush Agarwal authored Aug 15, 2025
  
  a7184bec
14 Aug, 2025 1 commit
- feat: logprob handling (#2426) · f476fd74
  Greg Clark authored Aug 14, 2025
```
Signed-off-by: Greg Clark <grclark@nvidia.com>
```
  f476fd74
13 Aug, 2025 1 commit
- chore: Refactor tool calling for wider support in the future (#2393) · 086ea4f0
  Elyas Mehtabuddin authored Aug 13, 2025
  
  086ea4f0
12 Aug, 2025 1 commit

- feat: Add frontend support for `min_tokens` and `ignore_eos` (outside of `nvext`) and Structured Output / Guided Decoding (#2380) · 18bb779e
  KrishnanPrash authored Aug 12, 2025
  
  Signed-off-by: KrishnanPrash <140860868+KrishnanPrash@users.noreply.github.com>
  Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
  Co-authored-by: Ayush Agarwal <ayushag@nvidia.com>
  
  18bb779e

11 Aug, 2025 1 commit
- fix(preprocessor): Populate model ID in PreprocessedRequest (#2397) · c443528f
  Graham King authored Aug 11, 2025
  
  c443528f
07 Aug, 2025 1 commit
- chore: guided decoding support for nvext (#2339) · b165ec4a
  Ayush Agarwal authored Aug 07, 2025
```
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
```
  b165ec4a
18 Jul, 2025 2 commits
- feat: http disconnects (#2014) · 343a4814
  Ryan Olson authored Jul 18, 2025
  
  343a4814
- feat: Add migration to LLM requests (#1930) · 1f07dab7
  Jacky authored Jul 18, 2025
  
  1f07dab7
17 Jul, 2025 1 commit
- feat: record + analyze logprobs (#1957) · 49b7a0d9
  Ryan Olson authored Jul 17, 2025
  
  49b7a0d9
09 Jul, 2025 1 commit
- feat: Support for unary tool use in ChatCompletions API (#1800) · 5e2f29f5
  Paul Hendricks authored Jul 09, 2025
  
  5e2f29f5
07 Jul, 2025 1 commit
- feat: Failure Detection while Responses are returning (#1671) · b4ddca99
  Jacky authored Jul 07, 2025
  
  b4ddca99
03 Jul, 2025 1 commit
- feat: Implement frontend tokenization for embedding requests (#1494) · 47e7fde7
  Tom O'Brien authored Jul 03, 2025
  
  47e7fde7
01 Jul, 2025 2 commits
- feat: Validation engine for validating OpenAI api request data (#1674) · ee86bad3
  Nathan Barry authored Jul 01, 2025
  
  ee86bad3
- feat: Support for Responses API (#1694) · dfbd741d
  Paul Hendricks authored Jul 01, 2025
  
  dfbd741d
26 Jun, 2025 4 commits
- refactor: remove dead protocols code and organize imports idiomatically (#1669) · 9d7c5df5
  Paul Hendricks authored Jun 26, 2025
  
  9d7c5df5
- refactor: removing unsized integer conversions (#1668) · 8a2d6529
  Paul Hendricks authored Jun 26, 2025
  
  8a2d6529
- refactor: refactored using CompletionResponse (#1658) · e3f1bd5d
  Paul Hendricks authored Jun 26, 2025
  
  e3f1bd5d
- refactor: refactored using Choice and CompletionFinishReason (#1635) · 7b7b6a6d
  Paul Hendricks authored Jun 26, 2025
  
  7b7b6a6d
25 Jun, 2025 2 commits
- fix: fix usage.total_tokens count for OpenAI endpoints (#1649) · 6032c82f
  Zhongdongming Dai authored Jun 25, 2025
  
  6032c82f
- feat: support batch `/completions` (#1626) · fc16a79b
  ishandhanani authored Jun 25, 2025
```
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
```
  fc16a79b
24 Jun, 2025 2 commits
- refactor: using async_openai::types::Logprobs (#1625) · 0edc886f
  Paul Hendricks authored Jun 24, 2025
  
  0edc886f
- refactor: refactoring to use async_openai::types::CompletionUsage (#1397) · 0c9ae4dd
  Paul Hendricks authored Jun 24, 2025
  
  0c9ae4dd
11 Jun, 2025 1 commit
- refactor: use comment field in annotated to pass metric-related information (#1385) · 227a0e71
  Hongkuan Zhou authored Jun 11, 2025
  
  227a0e71
04 Jun, 2025 2 commits
- refactor: Rename CompletionRequest to NvCreateCompletionRequest (#1383) · c103d56a
  Paul Hendricks authored Jun 04, 2025
  
  c103d56a
- feat: add implementation for embeddings (#1290) · e83009a6
  Tom O'Brien authored Jun 04, 2025
  
  e83009a6
03 Jun, 2025 1 commit

- feat: add more metrics to rust frontend (#1315) · 98d4abbb
  Hongkuan Zhou authored Jun 03, 2025
  
  Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
  Co-authored-by: jothomson <jwillthomson19@gmail.com>
  Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
  
  98d4abbb

02 Jun, 2025 1 commit
- chore: Remove PreprocessedRequest alias BackendInput (#1307) · 3f6a7472
  Graham King authored Jun 02, 2025
```
It was confusing to have two names for one type.

This tidy-up, started in #1064, is now complete.
```
  3f6a7472
29 May, 2025 1 commit

- feat: expose estimated kv cache hit in dynamo-run (#1246) · c9eb6a83
  Hongkuan Zhou authored May 29, 2025
  
  Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  
  c9eb6a83

23 May, 2025 1 commit
- chore: rm duplicate fwd pass metric (#1190) · 9d944c27
  Yan Ru Pei authored May 23, 2025
  
  9d944c27
19 May, 2025 1 commit

- feat: Add OpenAI Embeddings interface in rust lib (#1110) · 73fdfb8a
  Tom O'Brien authored May 19, 2025
  
  Implements OpenAI embeddings (interface only).
  
  - Adds ModelType::Embedding
  - Adds OpenAI embedding request/response structs
  - Adds support for embedding model discovery
  
  73fdfb8a

14 May, 2025 1 commit

- feat(dynamo-run): KV-aware routing (#1064) · 29813508
  Graham King authored May 14, 2025
  
  Router:
```
dynamo-run in=http out=dyn://dynamo.endpoint.generate --router-mode kv
```
  
  Worker (* N):
```
dynamo-run in=dyn://dynamo.endpoint.generate out=vllm /data/llms/Qwen/Qwen3-4B
```
  
  You need patched vllm and the C bindings `.so`. Full docs in the updated guide: `docs/guides/dynamo_run.md`.
  
  This gives us a pure-Rust ingress node: OpenAI compliant HTTP server + Pre-processor + KV-aware router.
  
  29813508
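
Note on #1064: the commit message does not describe the routing policy itself. As a rough illustration of what "KV-aware" could mean, the sketch below picks the worker whose cache already holds the longest leading run of the request's blocks. Everything in it is an assumption made for illustration: the `Worker` type, the use of `u64` block hashes, and the tie-breaking rule are hypothetical and not taken from the dynamo code.

```rust
use std::collections::HashSet;

/// Hypothetical view of one worker: the hashes of the KV blocks it has cached.
pub struct Worker {
    pub id: u64,
    pub cached_blocks: HashSet<u64>,
}

/// Counts how many leading blocks of the request are cached on `worker`.
/// Assumption of this sketch: only a contiguous prefix is reusable,
/// so counting stops at the first miss.
pub fn prefix_hits(request_blocks: &[u64], worker: &Worker) -> usize {
    request_blocks
        .iter()
        .take_while(|b| worker.cached_blocks.contains(*b))
        .count()
}

/// Picks the worker with the longest cached prefix.
/// Ties go to the earliest worker in the slice; an empty slice gives `None`.
pub fn pick_worker(request_blocks: &[u64], workers: &[Worker]) -> Option<u64> {
    let mut best: Option<(usize, u64)> = None;
    for w in workers {
        let hits = prefix_hits(request_blocks, w);
        // Replace the current best only on a strictly better score.
        if best.map_or(true, |(h, _)| hits > h) {
            best = Some((hits, w.id));
        }
    }
    best.map(|(_, id)| id)
}
```

A production router would likely also weigh worker load, not just cache overlap; this sketch ignores that to keep the prefix-matching idea visible.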