Commits · 980727bba1afdce37fb1aaf8085bdc3b8917d1b9 · OpenDAS / dynamo

24 Sep, 2025 2 commits
- chore: bump versions ahead of 0.5.1 release (#3209) · 980727bb
  Harrison Saturley-Hall authored Sep 24, 2025
```
Signed-off-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
```
  980727bb
- feat: modelexpress dynamo integration (#3191) · da3b1dbd
  Hyunjae Woo authored Sep 24, 2025
  
  da3b1dbd
18 Sep, 2025 2 commits
- feat: enhance GPT OSS frontend with improved harmony tool calling parser and... · 6675bfc8
  zhongdaor-nv authored Sep 18, 2025
```
feat: enhance GPT OSS frontend with improved harmony tool calling parser and reasoning parser (#2999)
Signed-off-by: zhongdaor <zhongdaor@nvidia.com>
```
  6675bfc8
- fix: crates.io allows at most 5 keywords (#3122) · ef6734d0
  Harrison Saturley-Hall authored Sep 18, 2025
```
Signed-off-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
```
  ef6734d0
02 Sep, 2025 1 commit
- chore: bump version numbers ahead of 0.5.0 release (#2812) · 561ecb98
  Harrison Saturley-Hall authored Sep 02, 2025
```
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
```
  561ecb98
25 Aug, 2025 1 commit
- feat: python bindings for the entire KvPushRouter + per-request router configs (#2658) · f08729ae
  Yan Ru Pei authored Aug 25, 2025
  
  f08729ae
22 Aug, 2025 1 commit
- chore: Rust to 1.89 and edition 2024 (#2659) · bce74588
  Graham King authored Aug 22, 2025
  
  bce74588
20 Aug, 2025 3 commits
- chore: Remove async-openai-macros (#2554) · 49958435
  Graham King authored Aug 20, 2025
  
  49958435
- feat: added parsers lib (#2542) · 526b02f1
  Ayush Agarwal authored Aug 20, 2025
  
  526b02f1
- chore: Bumped Dynamo version to 0.4.1 (#2545) · 9a021885
  Dmitry Tokarev authored Aug 19, 2025
  
  9a021885
19 Aug, 2025 2 commits
- chore: Bring async-openai into repo as request starter (#2520) · 199b9a30
  nachiketb-nvidia authored Aug 19, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  199b9a30
- feat: task scheduler (#2406) · a33033b7
  Ryan Olson authored Aug 19, 2025
```
Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
```
  a33033b7
15 Aug, 2025 1 commit
- fix: remove kvmanager feature from python 3.12 ai-dynamo-runtime wheel (#2456) · ffae72b7
  Harrison Saturley-Hall authored Aug 15, 2025
  
  ffae72b7
07 Aug, 2025 1 commit

feat: cross process instrumentation (#2243) · bd4fe1a7

Neelay Shah authored Aug 07, 2025

Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>

bd4fe1a7

06 Aug, 2025 1 commit
- fix: upgrade axum to 0.8 and etcd-client to 0.16 (#2317) · b2aa504b
  Dan Aloni authored Aug 06, 2025
```
Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
```
  b2aa504b
30 Jul, 2025 1 commit
- chore: Version bump to 0.4.0 (#2179) · 4c90b1b9
  Dmitry Tokarev authored Jul 30, 2025
  
  4c90b1b9
23 Jul, 2025 1 commit
- fix: updates versions and adds ahashmap to BPE (#2072) · 66b7d2c7
  Paul Hendricks authored Jul 23, 2025
  
  66b7d2c7
22 Jul, 2025 1 commit

feat: add a hierarchical Prometheus MetricsRegistry trait for... · e5a8628f

Keiven C authored Jul 22, 2025

feat: add a hierarchical Prometheus MetricsRegistry trait for DistributedRuntime, Namespace, Components, and Endpoint (#2008)
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Ryan Olson <rolson@nvidia.com>

e5a8628f

17 Jul, 2025 1 commit
- feat(runtime): Support tokio-console (#1986) · 1eadc013
  Graham King authored Jul 17, 2025
  
  1eadc013
15 Jul, 2025 2 commits
- feat: adding http clients and recorded response stream (#1919) · a9e0891c
  Ryan Olson authored Jul 15, 2025
  
  a9e0891c
- fix: Remove OpenSSL dependency, use Rust TLS (#1945) · 4da078b8
  Graham King authored Jul 15, 2025
  
  4da078b8
14 Jul, 2025 1 commit
- feat: Shrink the ai-dynamo wheel by 35 MiB (#1918) · ad8ad66b
  Graham King authored Jul 14, 2025
```
Remove http and llmctl binaries. They have been unused for a while.
```
  ad8ad66b
08 Jul, 2025 1 commit
- feat: Build DistributedRuntime-level HTTP server with /health /metrics (#1656) · ece76a62
  ZichengMa authored Jul 08, 2025
  
  ece76a62
07 Jul, 2025 1 commit
- chore: update versions for 0.3.2 release (#1793) · c4935b34
  Anant Sharma authored Jul 07, 2025
  
  c4935b34
03 Jul, 2025 1 commit
- chore(engines): Upgrade mistralrs to 0.6.0 (#1767) · 4ab47617
  Graham King authored Jul 03, 2025
  
  4ab47617
01 Jul, 2025 1 commit
- fix: Prometheus to pull from dcgm-exporter:9400 instead of 9401 (#1707) · 54c21168
  Keiven C authored Jul 01, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  54c21168
30 Jun, 2025 1 commit
- refactor: Upgrade async-openai (#1693) · 82eae1fd
  Paul Hendricks authored Jun 30, 2025
  
  82eae1fd
13 Jun, 2025 1 commit
- chore: update dynamo and nixl versions for 0.3.1 (#1517) · 99e67e60
  Anant Sharma authored Jun 13, 2025
  
  99e67e60
29 May, 2025 1 commit
- chore: update dynamo and nixl versions for 0.3.0 (#1240) · 9d9a1d9b
  Anant Sharma authored May 29, 2025
  
  9d9a1d9b
09 May, 2025 3 commits
- feat: kv block manager (#965) · 4564a387
  Ryan Olson authored May 09, 2025
  
  4564a387
- chore: bump versions and NIXL dependencies for 0.2.1 (#1012) · e9cb035a
  Harrison Saturley-Hall authored May 09, 2025
  
  e9cb035a
- feat: allow adding auth to etcd (#980) · b2e401bc
  wxsm authored May 09, 2025
```
Allow either password or TLS auth; if neither is provided, fall back to no auth.

Closes #657
```
  b2e401bc
06 May, 2025 1 commit

feat(dynamo-run): vllm and sglang subprocess engines (#954) · 28fd481c

Graham King authored May 06, 2025

New vllm and sglang engines that run in a sub-process. Will hopefully replace the existing embedded python engines.
    
Why?
    
  - Pure Python, does not require knowing Rust to work on it. Much simpler to maintain.
  - No embedded Python interpreter, which avoids linking libpython and avoids the macOS virtualenv issues.
  - Should have better performance as it's "native" vllm / sglang.
  - Works with any version of vllm (including v1!) and sglang. Less upgrade struggle.
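
The sub-process approach described above can be sketched as follows. This is a minimal illustration, not the actual dynamo-run implementation: the commit message does not say how the parent and the engine process communicate, so the JSON-lines protocol over stdin/stdout, the `WORKER_SOURCE` stand-in, and the helper names `start_worker`, `ask`, and `stop_worker` are all assumptions made for the example.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for an engine worker. A real worker would import
# vllm or sglang here; this one just upper-cases the prompt.
WORKER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "text": request["prompt"].upper()}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def start_worker() -> subprocess.Popen:
    """Launch the worker in its own Python process (no embedded interpreter)."""
    return subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def ask(worker: subprocess.Popen, request_id: int, prompt: str) -> dict:
    """Send one JSON request line and read back one JSON reply line."""
    worker.stdin.write(json.dumps({"id": request_id, "prompt": prompt}) + "\n")
    worker.stdin.flush()
    return json.loads(worker.stdout.readline())


def stop_worker(worker: subprocess.Popen) -> None:
    """Close stdin so the worker's read loop ends, then wait for it to exit."""
    worker.stdin.close()
    worker.wait(timeout=10)
    worker.stdout.close()
```

Because the parent only depends on the pipe protocol, the worker side stays pure Python and can be swapped for a different engine version without rebuilding the parent.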

28fd481c

01 May, 2025 1 commit
- chore(dynamo-llm): Move the pre-processor to ingress side (#903) · 2d2a1027
  Graham King authored May 01, 2025
```
Part of https://github.com/ai-dynamo/dynamo/issues/743
```
  2d2a1027
29 Apr, 2025 1 commit

chore: Split PushRouter from Client (#817) · a1a10365

Graham King authored Apr 29, 2025

In a distributed system we don't know if the remote workers need pre-processing done ingress-side or not. Previously Client required us to decide this before discovering the remote endpoints, which was fine because pre-processing was worker-side.

As part of moving pre-processing back to ingress-side we need to split this into two steps:
- Client discovers the endpoints, and (later PR) will fetch their Model Deployment Card.
- PushRouter will use the Model Deployment Card to decide if they need pre-processing or not, which affects the types of the generic parameters.
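
The two steps above can be sketched in Python. This is an illustration under stated assumptions, not the crate's API: the real code is Rust and the decision affects generic type parameters, whereas here it only selects a `prepare` function. The field `worker_expects_tokens`, the toy `tokenize` function, the `discover` and `from_client` constructors, and the round-robin endpoint choice are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ModelDeploymentCard:
    # Hypothetical field: True when workers expect already-tokenized input,
    # meaning pre-processing must happen on the ingress side.
    worker_expects_tokens: bool


@dataclass
class Client:
    """Step 1: only knows which endpoints exist."""
    endpoints: List[str]

    @classmethod
    def discover(cls, registry: List[str]) -> "Client":
        # Stand-in for real endpoint discovery.
        return cls(endpoints=list(registry))


def tokenize(text: str) -> List[int]:
    # Toy tokenizer for illustration: one "token" per word (its length).
    return [len(word) for word in text.split()]


@dataclass
class PushRouter:
    """Step 2: built after discovery, once the card is available."""
    client: Client
    prepare: Callable[[str], object]
    next_index: int = 0

    @classmethod
    def from_client(cls, client: Client, card: ModelDeploymentCard) -> "PushRouter":
        # The card decides whether requests are pre-processed before routing.
        prepare = tokenize if card.worker_expects_tokens else (lambda text: text)
        return cls(client=client, prepare=prepare)

    def route(self, text: str):
        # Round-robin is an assumption made only to keep the sketch small.
        endpoint = self.client.endpoints[self.next_index % len(self.client.endpoints)]
        self.next_index += 1
        return endpoint, self.prepare(text)
```

The point of the split is visible in the constructors: `Client.discover` needs no knowledge of pre-processing, and `PushRouter.from_client` makes that decision later from the card.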

Part of #743

a1a10365

26 Apr, 2025 1 commit

feat: local planner for 0.2.0 release (#398) · 7d5d6f8c

Hongkuan Zhou authored Apr 25, 2025

Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
Co-authored-by: ishandhanani <ishandhanani@gmail.com>
Co-authored-by: Ubuntu <ubuntu@dev-inst-2w1vokvyuts83rzn4n1k7mnzew9.us-central1-a.c.brevdevprod.internal>
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
Co-authored-by: Anant Sharma <anants@nvidia.com>

7d5d6f8c

25 Apr, 2025 3 commits

- chore: bump NIXL version and package versions (#836) · 0715d469
  Harrison Saturley-Hall authored Apr 25, 2025
```
Signed-off-by: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com>
```
  0715d469
- build: update cudarc dependency to crate version (#815) · 448e79a6
  Anant Sharma authored Apr 25, 2025
  
  448e79a6

chore: Publish Model Deployment Card to NATS (#799) · d346782c

Graham King authored Apr 25, 2025

This will allow an ingress-side pre-processor to see it without needing a model checkout.

Currently pre-processing is done in the worker, which has access to the model deployment card ("MDC") files (`config.json`, `tokenizer.json` and `tokenizer_config.json`) locally. We want to move the pre-processor to the ingress side to support KV routing. That requires the ingress side (i.e. the HTTP server), running on a different machine from the worker, to be able to see those three files.

To support that, this PR makes the worker upload the contents of those files to the NATS object store and publish the MDC, with those NATS URLs, to the key-value store.

The key-value store has an interface so any store (NATS, etcd, Redis, etc.) can be supported. Implementations for memory and NATS are provided.
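
A minimal sketch of the publish flow and the store interface described above, in Python rather than the repository's Rust. Everything beyond the three file names is an assumption: the `KeyValueStore` method names, the `MemoryStore` class, the `publish_card` helper, the `mdc/` key prefix, and the `object://` URL scheme are hypothetical, and an in-memory store stands in for the NATS object store.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict, Optional

# The three MDC files named in the commit message.
MDC_FILES = ("config.json", "tokenizer.json", "tokenizer_config.json")


class KeyValueStore(ABC):
    """Hypothetical store interface; any backend can implement it."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]: ...


class MemoryStore(KeyValueStore):
    """In-memory implementation, useful for tests."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


def publish_card(model: str, files: Dict[str, bytes],
                 object_store: KeyValueStore, card_store: KeyValueStore) -> dict:
    """Upload each MDC file, then publish a card that points at the uploads."""
    card = {"model": model, "files": {}}
    for name in MDC_FILES:
        object_key = f"{model}/{name}"
        object_store.put(object_key, files[name])
        # Hypothetical URL scheme; the real card would carry NATS URLs.
        card["files"][name] = f"object://{object_key}"
    card_store.put(f"mdc/{model}", json.dumps(card).encode())
    return card
```

With this shape, an ingress-side reader only needs `card_store.get(...)` and the URLs inside the card, not a local model checkout.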

Fetching the MDC from the store, doing pre-processing ingress-side, and publishing a card backed by a GGUF are all left for a later commit.

Part of #743

d346782c

17 Apr, 2025 1 commit
- feat: adding dynamo-tokens crate (#718) · 99b76ba4
  Ryan Olson authored Apr 17, 2025
  
  99b76ba4