Commits · 0c66b2d2905fc5a3a852ffab642b2ca5b6f1eee1 · OpenDAS / dynamo


07 Nov, 2025 1 commit
- chore: better error logging for "failed to join reader and writer tasks" #3910 (#3913) · 0c66b2d2
  Yan Ru Pei authored Nov 07, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
24 Oct, 2025 1 commit

refactor: redesign the metrics API from Trait to composition to make the code cleaner and easier to understand (#3687) · cbe0b177

Keiven C authored Oct 24, 2025

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>


23 Oct, 2025 1 commit
- chore: Use KeyValueStoreManager instead of etcd::Client (#3822) · 7731b024
  Graham King authored Oct 23, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
21 Oct, 2025 1 commit
- refactor(runtime): Replace std::sync::Mutex with parking_lot::Mutex (#3740) · 9ae98ed7
  Graham King authored Oct 21, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
17 Oct, 2025 1 commit
- refactor: Make `nats_client` optional internally (#3705) · 66fd6f84
  Graham King authored Oct 17, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
16 Oct, 2025 1 commit
- chore: move worker_monitor to the llm crate (#3667) · 7aa8e0e6
  Yan Ru Pei authored Oct 16, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
13 Oct, 2025 1 commit
- feat: OTEL Exporter and Tempo Visualization (#3307) · 1f92dd54
  mohammedabdulwahhab authored Oct 13, 2025
```
Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
```
29 Sep, 2025 1 commit
- chore: add loopback as default address for network (#3250) · 1208f017
  akshaver authored Sep 29, 2025
  
19 Sep, 2025 1 commit
- feat: Request Cancellation unary request support (#3004) · a8fd1271
  Jacky authored Sep 18, 2025
```
Signed-off-by: Jacky <18255193+kthui@users.noreply.github.com>
```
17 Sep, 2025 1 commit
- feat: Canary Health Check. (#2903) · 08cb08c1
  Tzu-Ling Kan authored Sep 16, 2025
```
Signed-off-by: tzulingk@nvidia.com <tzulingk@nvidia.com>
```
16 Sep, 2025 1 commit
- chore(runtime): Shorten the license header (#3059) · 02a22cbc
  Graham King authored Sep 16, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
03 Sep, 2025 1 commit
- chore: many bug fixes and improvements when testing planner (#2776) · 7da510cf
  Hongkuan Zhou authored Sep 02, 2025
```
Signed-off-by: hongkuanz <hongkuanz@nvidia.com>
Signed-off-by: hongkuan <hongkuanz@nvidia.com>
```
28 Aug, 2025 1 commit
- refactor: centralize Prometheus metrics naming and sanitization DIS-554 (#2733) · 84c9890b
  Keiven C authored Aug 28, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
23 Aug, 2025 1 commit
- fix: Skip checksum tests in release mode since they're not computed (#2669) · 05913af5
  Ryan McCormick authored Aug 22, 2025
  
22 Aug, 2025 1 commit
- chore: Rust to 1.89 and edition 2024 (#2659) · bce74588
  Graham King authored Aug 22, 2025
  
21 Aug, 2025 2 commits
- fix: guard inflight_requests and request_duration from early returns. (#2576) · 105436c3
  Tzu-Ling Kan authored Aug 21, 2025
  
- feat: Add model label for vllm backend metrics (#2474) · 57728909
  Tzu-Ling Kan authored Aug 21, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
19 Aug, 2025 3 commits

feat: Rename dynamo_component_concurrent_requests (#2515) · 6f7f6b12
Tzu-Ling Kan authored Aug 19, 2025
```
Signed-off-by: Tzu-Ling Kan <tzulingk@nvidia.com>
```

feat: kvbm + connector (#2258) · 07cfc3a1

Ryan Olson authored Aug 19, 2025


Signed-off-by: Ryan Olson <rolson@nvidia.com>
Co-authored-by: Olga Andreeva <oandreeva@nvidia.com>
Co-authored-by: Ziqi Fan <ziqif@nvidia.com>
Co-authored-by: John Thompson <jothomson@nvidia.com>
Co-authored-by: Richard Huo <rihuo@nvidia.com>
Co-authored-by: Zicheng Ma <zichengm@nvidia.com>


feat: router-level request rejection (#2465) · 85d83108
Yan Ru Pei authored Aug 19, 2025


18 Aug, 2025 1 commit
- fix: small build warning fix (#2504) · 8cde945e
  nachiketb-nvidia authored Aug 18, 2025
```
Signed-off-by: nachiketb-nvidia <nachiketb@nvidia.com>
```
14 Aug, 2025 1 commit
- perf: Only compute checksums on debug builds (#2446) · 9ddb3efd
  jthomson04 authored Aug 14, 2025
```
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
```
07 Aug, 2025 2 commits

chore(metrics): Remove the Arc (#2357) · a3f7a39f
Graham King authored Aug 07, 2025


feat: cross process instrumentation (#2243) · bd4fe1a7

Neelay Shah authored Aug 07, 2025

Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>


05 Aug, 2025 4 commits
- ci: Improve caching on pre-merge-rust (#2253) · e95f8758
  Ryan McCormick authored Aug 05, 2025
  
- feat: migrate requests when planner shutdown decode engine (vllm) (#2280) · 36c4ef5e
  Hongkuan Zhou authored Aug 05, 2025
```
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
```
- feat: Allow Python Engine to end stream before final (#2270) · 347620a1
  Jacky authored Aug 05, 2025
  
- feat: Parameterize health and live HTTP endpoint paths (#2230) · 7c8f8fdc
  Yingge He authored Aug 05, 2025
  
28 Jul, 2025 1 commit
- feat: Base metrics: add generic ingress handler metrics (#2090) · 615580d8
  Keiven C authored Jul 28, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
23 Jul, 2025 1 commit
- feat: health check changes based on endpoint served (#1996) · b127d95f
  Neelay Shah authored Jul 22, 2025
  
18 Jul, 2025 1 commit
- feat: Add migration to LLM requests (#1930) · 1f07dab7
  Jacky authored Jul 18, 2025
  
16 Jul, 2025 1 commit
- perf(router): Remove lock from router hot path (#1963) · aba60996
  Graham King authored Jul 16, 2025
  
07 Jul, 2025 1 commit
- feat: Failure Detection while Responses are returning (#1671) · b4ddca99
  Jacky authored Jul 07, 2025
  
24 Jun, 2025 1 commit
- fix: rename create_response_steam to create_response_stream (#1615) · 68e4d2c1
  zxyy-bys authored Jun 24, 2025
  
13 Jun, 2025 1 commit
- feat: FT downed worker instance tracking and skipping (#1424) · a09ca3ec
  Jacky authored Jun 13, 2025
  
23 May, 2025 1 commit

fix: etcd.rs - linear increasing watch with number of requests (#1081) · 3f9c3ffe

Yan Ru Pei authored May 23, 2025

Signed-off-by: Michael Feil <63565275+michaelfeil@users.noreply.github.com>
Co-authored-by: Michael Feil <63565275+michaelfeil@users.noreply.github.com>
Co-authored-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: Ryan Olson <ryanolson@users.noreply.github.com>


22 May, 2025 2 commits

feat(dynamo-run): Allow setting context-length (#1157) · 6d5da821

Graham King authored May 22, 2025

Llama 4 has a very large context length (aka n_ctx, model_max_length, max_model_len), and vllm won't start unless it can allocate enough KV cache for the entire context.

Allow passing `--context-length <N>` to `dynamo-run` to limit it so long-context models will fit.

Future todo:
- Restrict every request's `max_tokens` to below the context length. Our pre-processor should do this by setting stop_conditions.max_tokens. mistralrs engine wrapper must do it itself because it does not use the pre-processor.
- mistralrs and llamacpp currently have a hard-coded max context length if one is not provided on the command line. Change those to be the model's built-in max, read from the GGUF or tokenizer_config.json.
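
The first todo item could be sketched roughly as below. This is a minimal illustration, not the pre-processor's actual code: `clamp_max_tokens` and its parameters are hypothetical names, and subtracting the prompt length from the context length is an assumption about how "below the context length" might be enforced in practice.

```rust
/// Hypothetical helper (not part of dynamo): cap a request's `max_tokens`
/// so that prompt tokens plus generated tokens fit in the context window.
///
/// Returns `None` when the prompt alone already fills or exceeds the
/// context, i.e. there is no room left to generate anything.
fn clamp_max_tokens(
    requested: Option<u32>,
    prompt_tokens: u32,
    context_length: u32,
) -> Option<u32> {
    // Tokens left for generation once the prompt is accounted for.
    // `checked_sub` yields `None` if the prompt is longer than the context.
    let remaining = context_length.checked_sub(prompt_tokens)?;
    if remaining == 0 {
        return None;
    }
    Some(match requested {
        // The caller asked for a limit: keep it unless it is too large.
        Some(n) => n.min(remaining),
        // No limit given: assume the request may use whatever is left.
        None => remaining,
    })
}
```

Under these assumptions, with `--context-length 50` and a 10-token prompt, a request asking for 100 tokens would be capped at 40.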


fix: Enable Dynamo HTTP servers to run on IPv6-only hosts (#1166) · 27e92701
jmswen authored May 21, 2025


19 May, 2025 1 commit

feat: Support multiple models on single ingress node (#1127) · aeb79e62

Graham King authored May 19, 2025

We can now do this:

- Node 1:

```
dynamo-run in=http out=dyn
```

- Node 2 and 3, two instances of component 'backend' in the nemotron_ultra pipeline:

```
dynamo-run in=dyn://nemotron_ultra.backend.generate out=vllm /data/models/NemotronUltra
```

- Node 4 and 5, two instances of the 'backend' component in nemotron_super pipeline:

```
dynamo-run in=dyn://nemotron_super.backend.generate out=vllm /data/models/NemotronSuper
```

The ingress node will discover all four instances and route correctly. We have been planning for this for a long time now.

As part of this, auto-discovery is now always `out=dyn`, with no extra URL parts. Previously the ingress node could only route to a single pipeline.

Also:
- Refactor endpoint / instance naming now that I understand them
- Fix removing models when their instance stops.
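
A rough sketch of the grouping an ingress node would need for the discovery described above. It assumes a `dyn://` address splits into three dot-separated parts, read here as pipeline, component, and endpoint; `EndpointId`, `parse_dyn_url`, and `group_by_pipeline` are hypothetical names for illustration, not the dynamo API.

```rust
use std::collections::HashMap;

/// Hypothetical parsed form of `dyn://<pipeline>.<component>.<endpoint>`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct EndpointId {
    pipeline: String,
    component: String,
    endpoint: String,
}

/// Parse an address such as `dyn://nemotron_ultra.backend.generate`.
/// Returns `None` unless there are exactly three non-empty parts.
fn parse_dyn_url(url: &str) -> Option<EndpointId> {
    let rest = url.strip_prefix("dyn://")?;
    let parts: Vec<&str> = rest.split('.').collect();
    match parts.as_slice() {
        [p, c, e] if !p.is_empty() && !c.is_empty() && !e.is_empty() => Some(EndpointId {
            pipeline: p.to_string(),
            component: c.to_string(),
            endpoint: e.to_string(),
        }),
        _ => None,
    }
}

/// Group registered instances by pipeline so that a request can be routed
/// to any instance serving the requested model.
/// Each registration is an (address, node name) pair; bad addresses are skipped.
fn group_by_pipeline(registrations: &[(&str, &str)]) -> HashMap<String, Vec<String>> {
    let mut groups: HashMap<String, Vec<String>> = HashMap::new();
    for (url, node) in registrations {
        if let Some(id) = parse_dyn_url(url) {
            groups.entry(id.pipeline).or_default().push(node.to_string());
        }
    }
    groups
}
```

Fed the four worker registrations from the example (nodes 2 to 5), this yields two groups of two instances each, one per pipeline.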


15 May, 2025 1 commit
- chore: Update default router mode from random to round-robin (#1097) · 770c230c
  Ryan McCormick authored May 15, 2025
  