Commits · 381c428c110e6a33fd69b5d3e124f84d276b2e9a · OpenDAS / dynamo

13 Nov, 2025 1 commit
- feat: kv router should route to available instances (#4225) · 8379b0cd
  Yan Ru Pei authored Nov 12, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  8379b0cd
11 Nov, 2025 1 commit
- chore: Remove static mode (#4235) · e1af3af6
  Graham King authored Nov 11, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  e1af3af6
10 Nov, 2025 1 commit
- refactor: Make the Runtime and DistributedRuntime fields private (#4193) · cf630bf7
  Graham King authored Nov 10, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  cf630bf7
08 Nov, 2025 1 commit
- fix: refactor to use service discovery (#4092) · 09b26bf6
  mohammedabdulwahhab authored Nov 08, 2025
```
Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
```
  09b26bf6
07 Nov, 2025 1 commit
- feat(keyvalue): Filesystem backed KeyValueStore (#4138) · 794c0a44
  Graham King authored Nov 07, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  794c0a44
06 Nov, 2025 1 commit

feat: ETCD high availability client failover - lease watch resilience (#3950) · 6e2b22ea

Jacky authored Nov 05, 2025


Signed-off-by: Jacky <18255193+kthui@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

6e2b22ea

28 Oct, 2025 1 commit
- chore(runtime): Do not expose etcd lease ID (#3915) · c78b5901
  Graham King authored Oct 28, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  c78b5901
24 Oct, 2025 1 commit

refactor: redesign the metrics API from Trait to composition to make the code... · cbe0b177

Keiven C authored Oct 24, 2025


refactor: redesign the metrics API from Trait to composition to make the code cleaner and easier to understand (#3687)
Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>

cbe0b177

23 Oct, 2025 1 commit
- chore: Use KeyValueStoreManager instead of etcd::Client (#3822) · 7731b024
  Graham King authored Oct 23, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  7731b024
21 Oct, 2025 1 commit
- refactor(runtime): Replace std::sync::Mutex with parking_lot::Mutex (#3740) · 9ae98ed7
  Graham King authored Oct 21, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  9ae98ed7
20 Oct, 2025 1 commit
- chore: Replace ServiceConfigBuilder with add_stats_service (#3736) · f6ed01b1
  Graham King authored Oct 20, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  f6ed01b1
17 Oct, 2025 1 commit
- refactor: Make `nats_client` optional internally (#3705) · 66fd6f84
  Graham King authored Oct 17, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  66fd6f84
13 Oct, 2025 1 commit
- chore: update for error messages (#3549) · 6337afec
  Neelay Shah authored Oct 13, 2025
  
  6337afec
06 Oct, 2025 1 commit
- feat: Set instance endpoint status and endpoint health status (#3411) · 332482a9
  Tzu-Ling Kan authored Oct 06, 2025
```
Signed-off-by: tzulingk@nvidia.com <tzulingk@nvidia.com>
```
  332482a9
17 Sep, 2025 1 commit
- feat: Canary Health Check. (#2903) · 08cb08c1
  Tzu-Ling Kan authored Sep 16, 2025
```
Signed-off-by: tzulingk@nvidia.com <tzulingk@nvidia.com>
```
  08cb08c1
16 Sep, 2025 1 commit
- chore(runtime): Shorten the license header (#3059) · 02a22cbc
  Graham King authored Sep 16, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  02a22cbc
03 Sep, 2025 1 commit
- chore: many bug fixes and improvements when testing planner (#2776) · 7da510cf
  Hongkuan Zhou authored Sep 02, 2025
```
Signed-off-by: hongkuanz <hongkuanz@nvidia.com>
Signed-off-by: hongkuan <hongkuanz@nvidia.com>
```
  7da510cf
22 Aug, 2025 2 commits
- fix: move metrics registration to service creation (#2664) · 92151e3e
  Keiven C authored Aug 22, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  92151e3e
- chore: Rust to 1.89 and edition 2024 (#2659) · bce74588
  Graham King authored Aug 22, 2025
  
  bce74588
21 Aug, 2025 1 commit
- feat: Add model label for vllm backend metrics (#2474) · 57728909
  Tzu-Ling Kan authored Aug 21, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  57728909
19 Aug, 2025 1 commit
- feat: router-level request rejection (#2465) · 85d83108
  Yan Ru Pei authored Aug 19, 2025
  
  85d83108
15 Aug, 2025 1 commit
- feat(metrics): add NATS client metrics to prometheus_metrics_fmt (#2292) · acbdabc4
  Keiven C authored Aug 14, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  acbdabc4
14 Aug, 2025 1 commit
- feat: Add a "model" label to Component metrics (#2389) · 3a3f5bf2
  Tzu-Ling Kan authored Aug 14, 2025
  
  3a3f5bf2
13 Aug, 2025 1 commit
- feat: Allow an endpoint to serve multiple models (#2418) · 72ec5f5c
  Graham King authored Aug 13, 2025
  
  72ec5f5c
07 Aug, 2025 1 commit

feat: cross process instrumentation (#2243) · bd4fe1a7

Neelay Shah authored Aug 07, 2025

Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>

bd4fe1a7

05 Aug, 2025 1 commit

feat: migrate requests when planner shutdown decode engine (vllm) (#2280) · 36c4ef5e

Hongkuan Zhou authored Aug 05, 2025

Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>

36c4ef5e

31 Jul, 2025 1 commit
- fix: Integration tests fixes (#2161) · f10e44ca
  Keiven C authored Jul 31, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  f10e44ca
28 Jul, 2025 1 commit
- feat: Base metrics: add generic ingress handler metrics (#2090) · 615580d8
  Keiven C authored Jul 28, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  615580d8
23 Jul, 2025 1 commit
- feat: health check changes based on endpoint served (#1996) · b127d95f
  Neelay Shah authored Jul 22, 2025
  
  b127d95f
22 Jul, 2025 1 commit

feat: add a hierarchical Prometheus MetricsRegistry trait for... · e5a8628f

Keiven C authored Jul 22, 2025

feat: add a hierarchical Prometheus MetricsRegistry trait for DistributedRuntime, Namespace, Components, and Endpoint (#2008)
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Ryan Olson <rolson@nvidia.com>

e5a8628f

18 Jul, 2025 1 commit
- feat: Add migration to LLM requests (#1930) · 1f07dab7
  Jacky authored Jul 18, 2025
  
  1f07dab7
16 Jul, 2025 1 commit
- perf(router): Remove lock from router hot path (#1963) · aba60996
  Graham King authored Jul 16, 2025
  
  aba60996
17 Jun, 2025 1 commit
- refactor: Update inhibited instance removal logic (#1548) · 4abab20f
  Jacky authored Jun 17, 2025
  
  4abab20f
13 Jun, 2025 1 commit
- feat: FT downed worker instance tracking and skipping (#1424) · a09ca3ec
  Jacky authored Jun 13, 2025
  
  a09ca3ec
11 Jun, 2025 1 commit
- refactor: move kv store to runtime (#1459) · 08355da6
  Ryan Olson authored Jun 11, 2025
  
  08355da6
23 May, 2025 1 commit

fix: etcd.rs - linear increasing watch with number of requests (#1081) · 3f9c3ffe

Yan Ru Pei authored May 23, 2025

Signed-off-by: Michael Feil <63565275+michaelfeil@users.noreply.github.com>
Co-authored-by: Michael Feil <63565275+michaelfeil@users.noreply.github.com>
Co-authored-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: Ryan Olson <ryanolson@users.noreply.github.com>

3f9c3ffe

21 May, 2025 1 commit

chore: Fix model removal on instance stop, refactor discovery (#1142) · b520bf44

Graham King authored May 21, 2025

- Stop advertising a model when its last instance stops. Previously this happened when any instance stopped.
- Faster locks on model manager.
- Move discovery code out of http, as it is used by all inputs.
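
The first point above amounts to reference counting: a model stays advertised until the count of live instances drops to zero. A minimal sketch, assuming a hypothetical `ModelTracker` type (the name and methods are illustrative and are not taken from the dynamo codebase):

```rust
use std::collections::HashMap;

/// Hypothetical tracker: counts live instances per model name.
#[derive(Default)]
pub struct ModelTracker {
    instances: HashMap<String, usize>,
}

impl ModelTracker {
    /// Record a started instance.
    /// Returns true when this is the first instance, i.e. the model
    /// should start being advertised.
    pub fn instance_started(&mut self, model: &str) -> bool {
        let count = self.instances.entry(model.to_string()).or_insert(0);
        *count += 1;
        *count == 1
    }

    /// Record a stopped instance.
    /// Returns true only when the last instance is gone, i.e. the model
    /// should stop being advertised. Unknown models return false.
    pub fn instance_stopped(&mut self, model: &str) -> bool {
        let Some(count) = self.instances.get_mut(model) else {
            return false;
        };
        *count -= 1;
        if *count == 0 {
            self.instances.remove(model);
            true
        } else {
            false
        }
    }

    /// Whether at least one instance of the model is still alive.
    pub fn is_advertised(&self, model: &str) -> bool {
        self.instances.contains_key(model)
    }
}
```

With two instances of a model registered, the first call to `instance_stopped` returns `false` and the model stays advertised; only the second call returns `true`.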

b520bf44

19 May, 2025 1 commit

feat: Support multiple models on single ingress node (#1127) · aeb79e62

Graham King authored May 19, 2025

We can now do this:

- Node 1:

```
dynamo-run in=http out=dyn
```

- Node 2 and 3, two instances of the 'backend' component in the nemotron_ultra pipeline:

```
dynamo-run in=dyn://nemotron_ultra.backend.generate out=vllm /data/models/NemotronUltra
```

- Node 4 and 5, two instances of the 'backend' component in the nemotron_super pipeline:

```
dynamo-run in=dyn://nemotron_super.backend.generate out=vllm /data/models/NemotronSuper
```

The ingress node will discover all four instances and route correctly. We have been planning for this for a long time now.

As part of this, auto-discovery is now always `out=dyn`, with no extra URL parts. Previously it could only route to a single pipeline.

Also:
- Refactor endpoint / instance naming now that I understand them
- Fix removing models when their instance stops.
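
To illustrate how an ingress node might tell the four instances apart, here is a rough sketch that splits a `dyn://` URL into three dot-separated parts and counts instances per endpoint. The three-part layout (pipeline, component, endpoint) is inferred from the example commands above, and `EndpointId`, `parse_dyn_url` and `count_instances` are hypothetical names, not the actual dynamo API:

```rust
use std::collections::HashMap;

/// Hypothetical identifier for one endpoint, e.g.
/// dyn://nemotron_ultra.backend.generate
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct EndpointId {
    pub pipeline: String,
    pub component: String,
    pub endpoint: String,
}

/// Parse `dyn://<pipeline>.<component>.<endpoint>`.
/// Returns None for any other scheme, wrong part count, or empty part.
pub fn parse_dyn_url(url: &str) -> Option<EndpointId> {
    let path = url.strip_prefix("dyn://")?;
    let mut parts = path.split('.');
    let pipeline = parts.next()?;
    let component = parts.next()?;
    let endpoint = parts.next()?;
    if parts.next().is_some()
        || pipeline.is_empty()
        || component.is_empty()
        || endpoint.is_empty()
    {
        return None;
    }
    Some(EndpointId {
        pipeline: pipeline.to_string(),
        component: component.to_string(),
        endpoint: endpoint.to_string(),
    })
}

/// Count how many instances registered under each endpoint.
/// URLs that fail to parse are skipped.
pub fn count_instances(urls: &[&str]) -> HashMap<EndpointId, usize> {
    let mut counts = HashMap::new();
    for id in urls.iter().filter_map(|u| parse_dyn_url(u)) {
        *counts.entry(id).or_insert(0) += 1;
    }
    counts
}
```

Feeding it the four worker URLs from the example would yield two entries, one per pipeline, each with a count of two.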

aeb79e62

29 Apr, 2025 1 commit

chore: Split PushRouter from Client (#817) · a1a10365

Graham King authored Apr 29, 2025

In a distributed system we don't know if the remote workers need pre-processing done ingress-side or not. Previously Client required us to decide this before discovering the remote endpoints, which was fine because pre-processing was worker-side.

As part of moving pre-processing back to ingress-side we need to split this into two steps:
- Client discovers the endpoints, and (later PR) will fetch their Model Deployment Card.
- PushRouter will use the Model Deployment Card to decide if they need pre-processing or not, which affects the types of the generic parameters.
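
The two steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the real types: the field `needs_ingress_preprocessing`, the `AnyRouter` enum and the `build_router` function are invented here, and the real `Client` and `PushRouter` have different definitions.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a Model Deployment Card.
pub struct ModelDeploymentCard {
    /// Assumed flag: true if the workers expect pre-processed input.
    pub needs_ingress_preprocessing: bool,
}

/// Step 1: the client only knows what discovery told it.
pub struct Client {
    pub instances: Vec<String>,
    pub card: ModelDeploymentCard,
}

/// Step 2: the router is generic over the request type it pushes.
pub struct PushRouter<Req> {
    pub instances: Vec<String>,
    _req: PhantomData<Req>,
}

impl<Req> PushRouter<Req> {
    pub fn from_client(client: Client) -> Self {
        PushRouter {
            instances: client.instances,
            _req: PhantomData,
        }
    }
}

/// The request type is chosen only after discovery, so the two
/// possible instantiations are wrapped in an enum.
pub enum AnyRouter {
    /// Workers pre-process themselves: forward the raw text.
    Text(PushRouter<String>),
    /// Ingress pre-processes: forward token ids.
    Tokens(PushRouter<Vec<u32>>),
}

pub fn build_router(client: Client) -> AnyRouter {
    if client.card.needs_ingress_preprocessing {
        AnyRouter::Tokens(PushRouter::from_client(client))
    } else {
        AnyRouter::Text(PushRouter::from_client(client))
    }
}
```

The point of the split is visible in `build_router`: the generic parameter is fixed after the card has been read, rather than when the client is created.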

Part of #743

a1a10365

07 Apr, 2025 1 commit

feat(dynamo-run): Basic routing choice (#524) · ec2e7307

Graham King authored Apr 07, 2025

As a first step towards KV routing:
- Introduce a `--router-mode` flag in dynamo-run that only does random and round-robin right now. Not that interesting yet.
- Make the vllm engine publish the KV events received from our patched vllm.

Now we "just" need to connect the two. Easy, right?
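
For readers unfamiliar with the two modes, here is a small sketch of random and round-robin instance selection. `RouterMode` and `Router` are hypothetical names for illustration, and the random mode uses a tiny xorshift generator as a stand-in so the example needs no external crates; the real implementation may differ.

```rust
/// Hypothetical routing modes matching the two described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RouterMode {
    Random,
    RoundRobin,
}

pub struct Router {
    mode: RouterMode,
    /// Next position for round-robin.
    counter: usize,
    /// State for the stand-in xorshift64 generator.
    rng_state: u64,
}

impl Router {
    pub fn new(mode: RouterMode, seed: u64) -> Self {
        // xorshift requires a non-zero state.
        Router { mode, counter: 0, rng_state: seed.max(1) }
    }

    /// Pick the index of the instance that should receive the next
    /// request, or None if there are no instances.
    pub fn pick(&mut self, num_instances: usize) -> Option<usize> {
        if num_instances == 0 {
            return None;
        }
        let index = match self.mode {
            RouterMode::RoundRobin => {
                let i = self.counter % num_instances;
                self.counter = self.counter.wrapping_add(1);
                i
            }
            RouterMode::Random => {
                // One xorshift64 step.
                let mut x = self.rng_state;
                x ^= x << 13;
                x ^= x >> 7;
                x ^= x << 17;
                self.rng_state = x;
                (x % num_instances as u64) as usize
            }
        };
        Some(index)
    }
}
```

Neither mode looks at worker state, which is why KV-aware routing needs the published KV events as an additional input.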

ec2e7307